NLP to Improve Accuracy and Quality of Dictated Medical Documents
Project Final Report (PDF, 389.03 KB)
The use of natural language processing shows promise for automatically detecting errors in electronic patient notes created with speech recognition, with the potential to improve the accuracy, completeness, legibility, and accessibility of medical documents and thereby enhance patient safety and health care delivery.
Project Details
- Status: Completed
- Grant Number: R01 HS024264
- AHRQ Funded Amount: $749,989
- Location: Boston, Massachusetts
- Project Dates: 09/30/2015 - 09/29/2019
In addition to typing, dictating, and template-based documentation, speech recognition (SR) software integrated into electronic health records allows users to create patient notes that document patient care. While easy to use and efficient, SR is prone to errors, including spelling errors and "real-word" errors, where a correctly spelled word is incorrect in the context of the note. Spell-check functionality catches spelling errors, but real-word errors are more difficult to detect and correct automatically. As such, clinicians must proofread and edit SR-generated notes, a step that may be skipped due to time constraints. Errors that are missed become part of the permanent medical record, making those documents inaccurate and potentially compromising future patient care and safety.

This research used natural language processing (NLP) to improve the accuracy of SR notes by automatically detecting and identifying potential errors. Conducted at two large integrated healthcare systems, Partners HealthCare in Boston, Massachusetts, and the University of Colorado Health in Aurora, Colorado, the research also surveyed physicians about their perceptions of SR errors and the value of SR in creating patient notes. The researchers analyzed SR errors in documents and developed guidelines for identifying and classifying them.
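Real-word errors pass a spell check because each word is valid in isolation; contextual methods such as the NLP models developed in this study instead flag words that are improbable given their neighbors. A minimal sketch of that idea using toy bigram counts as a stand-in for a statistical language model (the vocabulary, counts, and threshold here are invented for illustration, not taken from the study):

```python
from collections import Counter

# Toy bigram counts standing in for a language model trained on
# clinical text (all values invented for illustration).
BIGRAM_COUNTS = Counter({
    ("prescribed", "metformin"): 120,
    ("prescribed", "met"): 1,
    ("denies", "chest"): 90,
    ("chest", "pain"): 200,
})

def flag_unlikely_words(words, threshold=5):
    """Flag words whose bigram with the preceding word is rare --
    a crude proxy for the 'real-word' errors a spell check misses."""
    flagged = []
    for prev, cur in zip(words, words[1:]):
        if BIGRAM_COUNTS[(prev, cur)] < threshold:
            flagged.append(cur)
    return flagged

# "met" is a correctly spelled word, so a spell check accepts it,
# but it is unlikely after "prescribed" and gets flagged.
print(flag_unlikely_words(["prescribed", "met"]))  # ['met']
```

A production detector, like the models described below, would use far richer context than a single preceding word, but the principle of scoring words against their context is the same.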
The specific aims were as follows:
- Build a large corpus of clinical documents dictated via SR across different healthcare institutions and clinical settings.
- Conduct error analysis to estimate the prevalence and severity of SR errors.
- Develop automated, robust methods to detect SR errors in medical documents.
- Evaluate the performance of the proposed methods and tool.
- Distribute our methods and tools.
An annotation schema was developed that included 12 general error types, such as insertion and deletion; 14 semantic types, such as medication and general English; and clinical significance, classified as either direct (errors that could influence clinical decision making) or indirect (errors that could, for example, result in billing mistakes). In evaluating SR notes, an error rate of 7.4 percent was observed in pre-edited notes; this dropped to 0.4 percent after editing by a professional transcriptionist, and further to 0.3 percent after the dictating physician's review. Under the schema, deletions were the most prevalent general error type, general English was the most frequent semantic type, medication was the most common clinical semantic type in original SR transcriptions, and diagnosis was the most common in the transcriptionist-edited, clinician-reviewed versions.
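Error rates like those above are typically computed as erroneous words over total words. A minimal sketch of that arithmetic, assuming a simple count of annotated error tokens (the study's exact denominator and counting rules may differ, and the note sizes here are hypothetical):

```python
def word_error_rate(num_error_words: int, total_words: int) -> float:
    """Fraction of words flagged as erroneous in a note."""
    if total_words == 0:
        raise ValueError("empty note")
    return num_error_words / total_words

# Hypothetical note: 7 flagged words out of 95 dictated words,
# roughly matching the 7.4 percent pre-edit rate reported above.
rate = word_error_rate(7, 95)
print(f"{rate:.1%}")  # 7.4%
```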
Several error-detection models were developed using NLP and evaluated with F1 scores. The F1 score is a measure of a test's accuracy, with 100 percent indicating perfect accuracy. A model based on a statistical language model achieved an F1 score of 81 percent, a recurrent neural network-based model scored 77 percent, and a topic model-based classifier scored 24 percent.
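The F1 score is the harmonic mean of precision (the fraction of flagged words that are truly errors) and recall (the fraction of true errors that get flagged). A minimal sketch of the computation, using illustrative precision/recall values since the report gives only the resulting F1 scores:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 1.0 is perfect."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative values only -- not the study's actual precision/recall.
print(round(f1_score(0.85, 0.77), 2))  # 0.81
```

The harmonic mean penalizes imbalance: a detector that flags everything (perfect recall, poor precision) or almost nothing (perfect precision, poor recall) still scores low.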
All participants interviewed agreed that SR increases the efficiency and accuracy of documentation. User estimates of SR error rates varied widely, from 1 percent to more than 50 percent, and estimated time spent editing and correcting errors was 1 to 3 minutes per patient. The researchers concluded that using NLP for error detection in SR-generated patient notes is promising, but that further research is needed to refine these models.