NLP to Improve Accuracy and Quality of Dictated Medical Documents
Project Final Report (PDF, 389.03 KB)
The use of natural language processing shows promise for automatically detecting errors in electronic patient notes created with speech recognition, with the potential to improve the accuracy, completeness, legibility, and accessibility of medical documents and thereby enhance patient safety and health care delivery.
Project Details
- Status: Completed
- Grant Number: R01 HS024264
- AHRQ Funded Amount: $749,989
- Location: Boston, Massachusetts
- Project Dates: 09/30/2015 - 09/29/2019
In addition to typing, dictating, and template-based documentation, speech recognition (SR) software integrated into electronic health records allows users to create patient notes that document patient care. While easy to use and efficient, SR is prone to errors, including spelling errors and "real-word" errors, where a correctly spelled word is incorrect in the context of the note. Spell-check functionality catches spelling errors, but real-word errors are more difficult to detect and correct automatically. As such, clinicians must proofread and edit SR-generated notes, a step that may be skipped due to time constraints. Errors that are missed become part of the permanent medical record, making those documents inaccurate and potentially compromising future patient care and safety.

This research used natural language processing (NLP) to improve the accuracy of SR notes by automatically detecting and identifying potential errors. Conducted at two large integrated healthcare systems, Partners HealthCare in Boston, Massachusetts, and the University of Colorado Health in Aurora, Colorado, the research also surveyed physicians about their perceptions of SR errors and the value of SR in creating patient notes. The researchers analyzed SR errors in documents and developed guidelines for identifying and classifying them.
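Real-word errors pass a spell check because each word is valid in isolation; contextual methods such as the NLP models developed in this study instead flag words that are improbable given their neighbors. A minimal sketch of that idea using toy bigram counts as a stand-in for a statistical language model (the vocabulary, counts, and threshold here are invented for illustration, not taken from the study):

```python
from collections import Counter

# Toy bigram counts standing in for a language model trained on
# clinical text (all values invented for illustration).
BIGRAM_COUNTS = Counter({
    ("prescribed", "metformin"): 120,
    ("prescribed", "met"): 1,
    ("denies", "chest"): 90,
    ("chest", "pain"): 200,
})

def flag_unlikely_words(words, threshold=5):
    """Flag words whose bigram with the preceding word is rare --
    a crude proxy for the 'real-word' errors a spell check misses."""
    flagged = []
    for prev, cur in zip(words, words[1:]):
        if BIGRAM_COUNTS[(prev, cur)] < threshold:
            flagged.append(cur)
    return flagged

# "met" is a correctly spelled word, so a spell check accepts it,
# but it is unlikely after "prescribed" and gets flagged.
print(flag_unlikely_words(["prescribed", "met"]))  # ['met']
```

A production detector, like the models described below, would use far richer context than a single preceding word, but the principle of scoring words against their context is the same.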
The specific aims were as follows:
- Build a large corpus of clinical documents dictated via SR across different healthcare institutions and clinical settings.
- Conduct error analysis to estimate the prevalence and severity of SR errors.
- Develop automated, robust methods to detect SR errors in medical documents.
- Evaluate the performance of the proposed methods and tool.
- Distribute our methods and tools.
An annotation schema was developed that included 12 general error types, such as insertion and deletion; 14 semantic types, such as medication and general English; and clinical significance, classified as either direct (errors that could influence clinical decision making) or indirect (errors that could, for example, result in billing mistakes). In evaluating SR notes, an error rate of 7.4 percent was observed in pre-edited notes; this dropped to 0.4 percent after editing by a professional transcriptionist, and further to 0.3 percent after the dictating physician's review. Under the schema, deletions were the most prevalent general error type, general English was the most frequent semantic type, medication was the most common clinical semantic type in original SR transcriptions, and diagnosis was the most common in the transcriptionist-edited, clinician-reviewed versions.
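Error rates like those above are typically computed as erroneous words over total words. A minimal sketch of that arithmetic, assuming a simple count of annotated error tokens (the study's exact denominator and counting rules may differ, and the note sizes here are hypothetical):

```python
def word_error_rate(num_error_words: int, total_words: int) -> float:
    """Fraction of words flagged as erroneous in a note."""
    if total_words == 0:
        raise ValueError("empty note")
    return num_error_words / total_words

# Hypothetical note: 7 flagged words out of 95 dictated words,
# roughly matching the 7.4 percent pre-edit rate reported above.
rate = word_error_rate(7, 95)
print(f"{rate:.1%}")  # 7.4%
```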
Several error-detection models were developed using NLP and evaluated with F1 scores. The F1 score is a measure of a test's accuracy, with 100 percent indicating perfect accuracy. A model based on a statistical language model achieved an F1 score of 81 percent, a recurrent neural network-based model scored 77 percent, and a topic model-based classifier scored 24 percent.
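The F1 score is the harmonic mean of precision (the fraction of flagged words that are truly errors) and recall (the fraction of true errors that get flagged). A minimal sketch of the computation, using illustrative precision/recall values since the report gives only the resulting F1 scores:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 1.0 is perfect."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative values only -- not the study's actual precision/recall.
print(round(f1_score(0.85, 0.77), 2))  # 0.81
```

The harmonic mean penalizes imbalance: a detector that flags everything (perfect recall, poor precision) or almost nothing (perfect precision, poor recall) still scores low.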
All participants interviewed agreed that SR increases the efficiency and accuracy of documentation. User estimates of SR error rates varied widely, from 1 percent to more than 50 percent, and estimated time spent editing and correcting errors was 1 to 3 minutes per patient. The researchers concluded that using NLP for error detection in SR-generated patient notes is promising, but that further research is needed to refine these models.