Milea Timbergen

Classification by radiologists

To compare the models with clinical practice, the tumours were classified by two musculoskeletal radiologists (5 and 4 years of experience), who had access to all available MRI sequences, age, and sex. They were specifically instructed to distinguish between STS and DTF. Classification was made on a ten-point scale to indicate the radiologists’ certainty. As only extremity STS were selected for the non-DTF group, a location-matched cohort was used, which included all extremity DTF and the same number of non-DTF. Agreement between the radiologists was evaluated using Cohen’s kappa. The radiomics models were also evaluated in this cohort. In each cross-validation iteration, these models were trained on 80% of the full dataset, but tested only on patients from the location-matched cohort in the other 20% of the dataset. The DeLong test was used to compare the AUCs [31].

Results

Study selection and population

The dataset included 203 patients; see Table 1 for the clinical characteristics. The differential diagnosis cohort consisted of 64 fibromyxosarcomas, 31 leiomyosarcomas, 36 myxoid liposarcomas, and 72 DTFs (65 primary, 7 recurrent), of which 61 were suitable for the mutation analysis. The dataset originated from 68 scanners, resulting in large heterogeneity in the acquisition protocols (see Table 2).

Among the 72 patients in the DTF cohort, the available MRI scans were 30 T1w post-contrast (42%), 49 T1w post-contrast FatSat (68%), 34 T2w (47%), 33 T2w FatSat (46%), 3 proton density (PD) (4%), 18 DCE (25%), and 3 DWI (4%). Because the PD, DCE, and DWI sequences were rarely available, only the T1w post-contrast and T2w sequences (with or without FatSat) were analysed in addition to the T1w-MRI.

On the subset of 30 DTF that was segmented by both observers, the mean DSC was 0.77 (standard deviation of 0.20), indicating good agreement. An example of the image registration results is depicted in Figure 2.
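
The two agreement measures used in this section, the DSC between the observers' segmentations and Cohen's kappa between the radiologists, can be illustrated with a minimal Python sketch. It is not the implementation used in the study. The function names `dice_similarity` and `cohens_kappa` are hypothetical, the masks are assumed to be flattened binary arrays, and the kappa example assumes that each radiologist's ten-point score has already been reduced to a categorical label, which the text does not specify.

```python
from collections import Counter
from typing import Hashable, Sequence


def dice_similarity(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice similarity coefficient (DSC) of two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


def cohens_kappa(ratings_a: Sequence[Hashable],
                 ratings_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and 1.0 is returned here as an assumed convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(dice_similarity([1, 1, 0, 0], [1, 0, 0, 0]))
    print(cohens_kappa(["DTF", "DTF", "STS", "STS"],
                       ["DTF", "STS", "STS", "STS"]))
```

A DSC of 1 means the two segmentations overlap completely, and 0 means they do not overlap at all. Kappa corrects the raw agreement rate for the agreement expected by chance, so it is lower than the simple fraction of matching labels whenever chance agreement is non-zero.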
