Joeky Senders

Chapter 7

Ground truth

The selected reports were randomized into six blocks. Each block was manually reviewed and annotated by two independent medical students (AK, DC, NL, AD) for the number of metastases present, using a binary classification (single metastasis versus two or more metastases). Each student reviewer was blinded to the label generated by the other reviewer, and no clinical information beyond the text of the radiology report was provided. Conflicts in labeling were resolved by a final reviewer (JS, IM). Consensus in student classification was used not only to provide accurate labels for the training and test data, but also to replicate the way chart reviews are performed in clinical research. Although clinicians are commonly considered the most appropriate personnel for collecting clinical data, recent studies suggest that the reliability of data collected by research assistants is not inferior, especially for information with low clinical complexity [2,3].

Development of a natural language processing model

The goal of this project was to compare various NLP approaches on their ability to classify MRI brain reports into those that describe a single metastasis versus those that describe multiple metastases. The approaches and algorithms used for this purpose can be classified into two broad categories: a bag-of-words approach and a sequence-based approach. The bag-of-words approach considers the relative frequency of words in a document but ignores the order of these words. Accordingly, the algorithms trained under the bag-of-words approach in this project (logistic regression, least absolute shrinkage and selection operator [LASSO] regression, and multilayer perceptron) also ignore word order. Due to the rapid developments in the artificial neural network field, deep learning architectures have emerged that can model spatial or temporal configurations of the input features, which allows for a sequence-based NLP approach.
These algorithms consider, for example, whether words are close to or far from each other in the document. In this study, the algorithms trained and evaluated according to a sequence-based approach were 1D convolutional neural networks, long short-term memory (LSTM) networks, and gated recurrent unit (GRU) networks.
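
The difference between the two input representations can be illustrated with a minimal sketch. It is not the preprocessing pipeline used in this study: the whitespace tokenizer, the helper names (`tokenize`, `build_vocabulary`, `bag_of_words`, `token_sequence`), and the two report snippets are assumptions made for illustration only. The sketch shows that two texts containing the same words in a different order receive identical bag-of-words vectors but different token sequences.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Map each distinct token to an integer index, in alphabetical order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocab):
    """Relative frequency of each vocabulary word; word order is discarded."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    return [counts[tok] / total for tok in vocab]


def token_sequence(text, vocab):
    """Token indices in their original order, as a sequence model would read them.

    Assumes every token is in the vocabulary; unknown tokens raise KeyError.
    """
    return [vocab[tok] for tok in tokenize(text)]


# Hypothetical snippets invented for this example, not taken from the study data.
report_a = "single lesion no other lesion"
report_b = "no single lesion other lesion"

vocab = build_vocabulary([report_a, report_b])
bow_a, bow_b = bag_of_words(report_a, vocab), bag_of_words(report_b, vocab)
seq_a, seq_b = token_sequence(report_a, vocab), token_sequence(report_b, vocab)

print("bag-of-words identical:", bow_a == bow_b)
print("sequences identical:", seq_a == seq_b)
```

A bag-of-words classifier would receive the same feature vector for both snippets, whereas a sequence-based model receives two distinct inputs and can therefore, in principle, exploit the position of words relative to one another.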