NLP to Improve Accuracy and Quality of Dictated Medical Documents (Massachusetts)

Project Details - Ongoing

High-quality and accurate medical documents are critical for effective inter-provider communication and patient care. Electronic health records (EHRs) have evolved to offer clinicians a range of documentation methods, including traditional dictation, typed free-text documents, and template-based, structured documents. Physician use of speech recognition (SR) technology has risen in recent years because of its ease of use and efficiency at the point of care. However, error rates of 10 to 23 percent have been observed in SR-generated medical documents. To avoid SR errors, physicians must carefully proofread and edit their reports, which is time-consuming for busy clinicians. As a result, an increasing number of errors enter the permanent medical record through this technology, potentially jeopardizing the quality and accuracy of medical documents and, ultimately, patient care.

One solution to this problem is to improve accuracy through automated error detection using natural language processing (NLP). The goals of this project are to study the nature of SR-generated errors in clinical documents and to develop and evaluate innovative methods for automatic error detection and correction. The research team will investigate statistical, machine-learning, and knowledge-based approaches. The statistical methods will include the traditional noisy channel model, language models, and word-context co-occurrence patterns. Machine-learning methods will be used to predict whether the recognition output is correct or contains errors. Rule-based methods that draw on medical domain knowledge also will be applied. NLP will be used to process SR documents and extract predictive features (e.g., contextual information and the semantic class of words and phrases) for the above methods.
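As a rough illustration of how a noisy channel model and a language model can work together, the Python sketch below scores candidate corrections for each recognized word. It is not the project's method: the tiny corpus, the confusion sets, the channel probabilities (0.8 and 0.2), the smoothing constant, and all function names are hypothetical stand-ins chosen only to make the idea concrete.

```python
import math
from collections import Counter

# Hypothetical reference text standing in for a corpus of corrected clinical notes.
CORPUS = [
    "patient denies chest pain",
    "patient reports chest pain",
    "patient denies shortness of breath",
    "no acute distress noted",
]

# Hypothetical confusion sets: recognized word -> words the speaker may have intended.
CONFUSIONS = {"chess": ["chest"], "denise": ["denies"]}

K = 0.1  # add-k smoothing constant (assumed value)


def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def lm_logprob(prev, word, unigrams, bigrams):
    """Language model: add-k smoothed log P(word | prev)."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + K) / (unigrams[prev] + K * vocab_size))


def channel_logprob(observed, intended):
    """Channel model: log P(observed | intended), using made-up probabilities."""
    return math.log(0.8 if observed == intended else 0.2)


def correct_sentence(sentence, unigrams, bigrams):
    """Greedy left-to-right decoding: pick the best-scoring candidate per word."""
    prev, corrected = "<s>", []
    for observed in sentence.split():
        candidates = [observed] + CONFUSIONS.get(observed, [])
        best = max(
            candidates,
            key=lambda c: channel_logprob(observed, c)
            + lm_logprob(prev, c, unigrams, bigrams),
        )
        corrected.append(best)
        prev = best
    return " ".join(corrected)


def flag_errors(sentence, unigrams, bigrams):
    """Return positions where the suggested word differs from the recognized word."""
    suggested = correct_sentence(sentence, unigrams, bigrams).split()
    return [i for i, (o, s) in enumerate(zip(sentence.split(), suggested)) if o != s]


if __name__ == "__main__":
    uni, bi = train_bigrams(CORPUS)
    print(correct_sentence("patient denise chess pain", uni, bi))
    print(flag_errors("patient denise chess pain", uni, bi))
```

A real system would presumably estimate the channel probabilities from observed SR errors and use a far larger language model; the greedy decoding here is simply the shortest way to show the two scores being combined.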

The specific aims of this project are as follows:

  • Build a large corpus of clinical documents dictated via SR across different health care institutions and clinical settings. 
  • Conduct error analysis to estimate the prevalence and severity of SR errors. 
  • Develop automated, robust methods to detect SR errors in medical documents. 
  • Evaluate the performance of the proposed methods and tool. 
  • Distribute the methods and tool. 
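The error analysis and evaluation aims above could be quantified in several ways; the project description does not say which metrics will be used. The Python sketch below assumes two common choices: word error rate (word-level edit distance divided by reference length) for measuring SR errors, and precision and recall for judging an error detector. The function names and example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def precision_recall(flagged, actual):
    """Compare flagged error positions against annotated error positions."""
    flagged, actual = set(flagged), set(actual)
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Two of four reference words were misrecognized.
    print(word_error_rate("patient denies chest pain", "patient denise chess pain"))
    # The detector flagged positions 1, 2, 5; annotators marked 1, 2, 3.
    print(precision_recall([1, 2, 5], [1, 2, 3]))
```

Word error rate treats every word equally, so a severity estimate of the kind named in the second aim would need an additional judgment, for example a clinician rating of whether each error changes clinical meaning.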

This project has the potential to improve the accuracy, completeness, legibility, and accessibility of medical documents to enhance patient safety and health care delivery.
