Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding

Project Final Report (PDF, 1.47 MB)

Applying algorithms to free text in electronic health records can identify criteria for autism spectrum disorder (ASD), supporting earlier detection and treatment as well as research with large-scale data.

Project Details - Completed

Grant Number: R21 HS024988

Funding Mechanism(s): Exploratory and Developmental Grant to Improve Health Care Quality through Health Information Technology (IT) (R21)

AHRQ Funded Amount: $292,404

Principal Investigator(s): Leroy, Gondy

Organization: University of Arizona

Location: Tucson, Arizona

Project Dates: 09/01/2017 - 08/31/2020

Technology: Artificial Intelligence, Machine Learning, Natural Language Processing System

Medical Condition: Mental/Behavioral Health

Population: Children, Researcher

Health Care Theme: Preventive Medicine

The prevalence of autism spectrum disorders (ASD) has increased dramatically in the last 2 decades. While this increase is not well understood, hypotheses range from changing diagnostic criteria to environmental factors. With new research focusing on neural, genetic, and environmental causes, there is a need to extract new types of data from patient records. Much of this data, when it does exist, is contained in free-text notes and is not readily available for research unless manually extracted. Natural language processing (NLP) can transform unstructured information into computable discrete data elements. NLP algorithms designed specifically for the ASD population can make data analysis and integration with other sources possible.

This research study developed and evaluated NLP algorithms to identify ASD behaviors within free text in an EHR, labeling them with the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic criteria for ASD. In addition, machine learning (ML) algorithms were used to label a child’s clinical record as either ASD or not. Finally, the researchers developed a prototype user interface that highlights clinicians’ free-text sentences containing ASD DSM criteria.
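To make the idea of sentence-level annotation concrete, the following is a minimal sketch of a pattern-based annotator. It is not the project's algorithm: the two patterns, the label names (`repetitive_behavior`, `eye_contact`), and the sample note are hypothetical placeholders chosen for illustration, and they do not correspond to actual DSM criterion codes.

```python
import re

# Hypothetical patterns and labels, for illustration only.
# They are NOT the study's rules and NOT real DSM criterion codes.
PATTERNS = {
    "repetitive_behavior": re.compile(
        r"\b(hand[- ]flapping|rocks? back and forth)\b", re.IGNORECASE
    ),
    "eye_contact": re.compile(
        r"\b(poor|limited|no) eye contact\b", re.IGNORECASE
    ),
}


def annotate(note):
    """Split a free-text note into sentences and attach matching labels.

    Returns a list of (sentence, labels) pairs; labels is empty when
    no pattern matches the sentence.
    """
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]
    annotated = []
    for sentence in sentences:
        labels = [name for name, pattern in PATTERNS.items() if pattern.search(sentence)]
        annotated.append((sentence, labels))
    return annotated


# Made-up example note
note = "Child makes poor eye contact. He enjoys puzzles. Frequent hand flapping when excited."
annotated = annotate(note)
for sentence, labels in annotated:
    print(labels, sentence)
```

A user interface of the kind described above could highlight every sentence whose label list is not empty. A real system would need far richer patterns or learned models, as well as clinical validation.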

The specific aims of the research were as follows:

Design NLP algorithms to create human-interpretable models that automatically annotate free text in electronic health records and match it to criteria in the DSM for ASD.
Demonstrate the feasibility and usefulness of the models for new research projects.

Data from the Centers for Disease Control and Prevention’s Autism and Developmental Disabilities Monitoring Network (ADDM) were used and matched against data from any of the four existing clinical sources. The ADDM monitors ASD in 4- to 8-year-olds. Records were manually annotated by experts who marked sentences containing DSM criteria. The NLP and ML algorithms were then applied to the records. Both precision and recall were measured. In this context, precision was the correct labeling of phenotypical expression of ASD behavior with the correct DSM diagnostic criterion. Recall was the ability of the system to identify the sentences that the experts had annotated. At the annotation level, precision was 74 percent, while recall was 42 percent. At the sentence level, average precision was 76 percent, with average recall being 43 percent.
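As an illustration of how such figures are typically computed, the sketch below calculates precision and recall from a set of system annotations and a set of expert (gold-standard) annotations. It uses the standard definitions over (sentence ID, criterion label) pairs. The project's exact matching rules are not described here, so the data layout, the function name `precision_recall`, and the sample values are assumptions.

```python
def precision_recall(predicted, gold):
    """Compute precision and recall for two sets of annotations.

    Both arguments are sets of (sentence_id, criterion_label) pairs.
    A prediction counts as correct only if the same pair appears in gold.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example: experts marked four annotations, the system produced two,
# one of which agrees with the experts.
gold = {(1, "A"), (2, "B"), (3, "A"), (4, "C")}
predicted = {(1, "A"), (2, "C")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision with lower recall, as reported above, means that most labels the system assigns are correct, but it misses many of the sentences the experts annotated.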

The study addressed a gap in electronic health record use in mental health, where behaviors that meet DSM criteria are frequently buried in free text. Given that children with ASD demonstrate drastically variable behaviors that qualify for the same DSM criteria, diagnosing these children is complex and may be delayed. The algorithms can be integrated in a user-friendly interface, which can help clinicians with limited expertise diagnose children. This work has the potential to improve earlier diagnosis and treatment of children with ASD and enhance research efforts for ASD.

Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding - Final Report

Citation: Leroy G. Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding - Final Report. (Prepared by the University Of Arizona under Grant No. R21 HS024988). Rockville, MD: Agency for Healthcare Research and Quality, 2020.

Link: PDF (1.47 MB)

The findings and conclusions in this document are those of the author(s), who are responsible for its content, and do not necessarily represent the views of AHRQ. No statement in this report should be construed as an official position of AHRQ or of the U.S. Department of Health and Human Services. (Persons using assistive technology may not be able to fully access information in this report. For assistance, please contact Corey Mackison).

Principal Investigator: Leroy, Gondy

Project Name: Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding

Document Type: Report

Research Method: Algorithm, Case Study

Technology: Artificial Intelligence, Machine Learning, Natural Language Processing System

Population: Children, Researcher

Optimizing corpus creation for training word embedding in low resource domains: a case study in Autism Spectrum Disorder (ASD).

Citation: Gu Y, Leroy G, Pettygrove S, Galindo MK, Kurzius-Spencer M. Optimizing corpus creation for training word embedding in low resource domains: a case study in Autism Spectrum Disorder (ASD). AMIA Annu Symp Proc. 2018 Dec 5;2018:508-517. PMID: 30815091.

Link: https://pubmed.ncbi.nlm.nih.gov/30815091/

Principal Investigator: Leroy, Gondy

Project Name: Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding

Document Type: Conference Proceeding

Research Method: Mixed Methods

Technology: Artificial Intelligence, Electronic Health Record/Electronic Medical Record, Machine Learning, Natural Language Processing System

Population: Children

Automated extraction of diagnostic criteria from electronic health records for autism spectrum disorders: development, evaluation, and application.

Citation: Leroy G, Gu Y, Pettygrove S, et al. Automated extraction of diagnostic criteria from electronic health records for autism spectrum disorders: development, evaluation, and application. J Med Internet Res. 2018 Nov 7;20(11):e10497. PMID: 30404767.

Link: https://www.ncbi.nlm.nih.gov/pubmed/30404767

Principal Investigator: Leroy, Gondy

Project Name: Enabling Large-Scale Research on Autism Spectrum Disorders Through Automated Processing of EHR Using Natural Language Understanding

Document Type: Journal Publication

Research Method: Quantitative

Technology: Artificial Intelligence, Natural Language Processing System

Population: Children

Using Natural Language Processing to Improve Autism Spectrum Disorder Research and Care

Applying algorithms to free text in electronic health records can identify criteria for autism spectrum disorder, supporting earlier detection and treatment as well as research with large-scale data.

Difficulty of accessing unstructured data for decision making

The use of electronic health records (EHRs) and other digital healthcare tools has generated a large volume of data, but it is often difficult to access and use for decision making. In healthcare, data are critical to providers in diagnosing and making informed treatment decisions. While structured health data, including data coded with a standardized code system such as SNOMED or LOINC, can more readily support analysis and decision making, unstructured data in the form of free text and narratives are not easily extractable for use in care delivery. Natural language processing and other machine learning techniques can convert unstructured text into structured, codified content in an automated manner for larger-scale use and for integration with other data.

How can we use that valuable information in free text notes?

Dr. Gondy Leroy of the University of Arizona decided to focus on autism spectrum disorders (ASDs) to show how extracting and coding information from free text in EHRs can lead to new insights and treatments. While the prevalence of ASD has increased dramatically in the last two decades, the causes are not well understood, with hypotheses ranging from changing diagnostic criteria to environmental factors. With new research focusing on neural, genetic, and environmental causes, there is a need to extract new types of data from patient records. Much of these data, when they do exist, are contained in free-text notes and are not readily available unless manually extracted. Dr. Leroy and her team sought to create methods and tools for leveraging existing and detailed ASD patient information in EHRs to improve ASD research and, ultimately, to improve earlier diagnosis, treatments, and cures.

“The importance of this research is that the earlier you identify ASD, the earlier you can provide treatments and services. If you identify ASD at 5 years old, compared to 3-1/2 years old, it's a big difference. By catching it earlier, you can start treatment and therapy with that child sooner.”

- Dr. Leroy

Using natural language processing to improve ASD research

The team developed and evaluated natural language processing algorithms to identify ASD behaviors within free text in EHRs, labeling them with the Diagnostic and Statistical Manual of Mental Disorders (DSM) diagnostic criteria for ASD. In addition, machine learning algorithms were used to label a child’s clinical record as either ASD or not. The researchers then developed a prototype user interface that highlights clinicians’ free-text sentences containing ASD DSM criteria. This study addressed a gap in EHR use in mental health, where behaviors that meet DSM criteria are frequently buried in free text. Given that children with ASD demonstrate drastically variable behaviors that qualify for the same DSM criteria, diagnosing these children is complex and may be delayed. The algorithms can be integrated in a user-friendly interface, which can help clinicians with limited expertise diagnose children. This work has the potential to improve earlier diagnosis and treatment of children with ASD and enhance research efforts for ASD. Findings from this research led to a recently awarded $1.5 million grant from the National Institute of Mental Health to expand the technology to support non-expert clinicians in identifying children at risk for autism spectrum disorder.
