Joeky Senders
165 Summary solitary versus multiple metastases. The bag-of-words approach combined with a LASSO regression algorithm still demonstrated to outperform competing algorithms in terms of model discrimination (AUC = 0.92) and calibration. In Chapter 8 , we examined the learning curves of various algorithms in determining the histopathological diagnosis of brain tumor patients based on free-text pathology reports. In this study, we developed a modified deep learning model (ClinicalTextMiner) equipped with stronger methods of regularization. In addition to modeling only the relative frequency of words (like LASSO regression does), the resultant model was able to model the semantic complexity of text documents as well, without overfitting to statistical noise. The number of required training samples to reach the predetermined performance thresholds (AUC of 0.95 and 0.98) was two to eight times lower for ClinicalTextMiner compared to regression and conventional deep learning-based architectures. All custom computer code and software developed throughout this thesis have been made publicly-available to facilitate the transparency and reproducibility of this work. The associated URL-links can be found in the Open-source code and software appendix .
Made with FlippingBook
RkJQdWJsaXNoZXIy ODAyMDc0