Joeky Senders

Comparing NLP methods

Introduction

In recent years, the volume and complexity of patient-generated health data have been increasing exponentially. Although this data has the potential to propel clinical research, its utilization is impeded because most of it comes in an unstructured format, namely free-text clinical reports. Manual chart review therefore remains inevitable to identify patients and extract features of interest; however, as data sets grow in size and granularity, manual chart review becomes increasingly inefficient and prone to error. Natural language processing (NLP) is a branch of artificial intelligence that focuses on enabling computers to process human language. This technique could therefore facilitate clinical research in this patient population by accelerating the throughput of free-text clinical reports.1 A variety of NLP approaches has emerged, ranging from statistical to deep learning-based models; however, the optimal approach for automating the analysis of free-text medical documents remains to be determined.

The aim of this study was to provide a head-to-head comparison of NLP techniques for biomedical text analysis. To this end, we trained, evaluated, and compared various NLP techniques on their ability to process brain magnetic resonance imaging (MRI) reports of patients diagnosed with brain metastasis and to quantify the number of metastases present. Although the current study focuses on radiology reports and brain metastasis patients, it provides a framework for the development of NLP models for automated medical text analysis.

Methods

Participants

The Research Patient Data Registry (RPDR), a centralized clinical data registry maintained across the Partners Healthcare hospitals (Brigham and Women’s Hospital and Massachusetts General Hospital), was queried for patients with known cerebral metastases using the International Classification of Diseases, Ninth Revision (ICD-9) code 198.3.
Patients were included if they had a radiological diagnosis of cerebral metastases and a complete free-text radiology report of the initial brain MRI examination. Follow-up reports were not used because the number of lesions documented in them might have been distorted by treatment effects. This study was approved by the Institutional Review Board of Partners Healthcare, which waived the need for informed consent due to the retrospective nature of the study.
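
The inclusion criteria above can be illustrated with a minimal Python sketch. It is not the study's actual extraction pipeline: the `Patient` and `MriReport` record types, their field names, and the `select_initial_reports` helper are hypothetical, and the sketch assumes that a "complete" report is one with non-empty text and that the "initial" examination is the one with the earliest date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Optional, Set

ICD9_BRAIN_METASTASIS = "198.3"  # ICD-9 code named in the text


@dataclass
class Patient:  # hypothetical record type
    patient_id: str
    icd9_codes: Set[str]
    radiological_diagnosis: bool  # assumed flag for a radiological diagnosis


@dataclass
class MriReport:  # hypothetical record type
    patient_id: str
    exam_date: date
    text: Optional[str]  # free-text report body; None or blank if incomplete


def select_initial_reports(
    patients: Iterable[Patient], reports: Iterable[MriReport]
) -> Dict[str, MriReport]:
    """Return the initial brain MRI report for each eligible patient."""
    # Criterion 1: ICD-9 code 198.3 plus a radiological diagnosis.
    eligible = {
        p.patient_id
        for p in patients
        if ICD9_BRAIN_METASTASIS in p.icd9_codes and p.radiological_diagnosis
    }

    # Criterion 2: keep only the earliest report; follow-up reports are ignored.
    initial: Dict[str, MriReport] = {}
    for report in reports:
        if report.patient_id not in eligible:
            continue
        current = initial.get(report.patient_id)
        if current is None or report.exam_date < current.exam_date:
            initial[report.patient_id] = report

    # Criterion 3: the initial report must be complete (assumed: non-empty text).
    # A patient whose initial report is incomplete is excluded rather than
    # replaced by a later report.
    return {
        pid: report
        for pid, report in initial.items()
        if report.text and report.text.strip()
    }
```

Under these assumptions, a patient is excluded outright when the initial report is missing or blank, because substituting a later report would reintroduce the treatment-effect distortion described above.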
