Timo Soeterik

MRI T-stage for LNI Risk Prediction

The ePLND template included removal of nodes overlying the external iliac vessels, internal iliac artery, and the nodes located within the obturator fossa. 16 All resected nodal tissue was submitted for pathologic evaluation, performed by experienced uro-pathologists. The total number of lymph nodes found in the tissue and the number of nodes containing prostate cancer metastasis were assessed. Histopathological evaluation was performed in accordance with the ISUP consensus statement on handling and staging of radical prostatectomy specimens. 17

Statistical analysis

The risk of LNI was estimated four times per patient: using the MSKCC 2018 and the Briganti 2012 nomograms, each with DRE T-stage and with mpMRI T-stage. Other covariates used for LNI risk calculation were the most recent preoperative serum PSA level, the highest ISUP grade found on either systematic or target biopsy, and the number of positive cores and the total number of cores taken on systematic biopsy.

Model discrimination was quantified using the area under the receiver operating characteristic curve (AUC), which is the probability that a randomly selected patient with histologically proven LNI (pN1) has a higher predicted risk than a randomly selected patient without histologically proven LNI (pN0). 18 Classification plots showing the true and false positive rates per risk threshold were used to visualize discriminatory ability. 19

Model calibration, which refers to the agreement between observed and predicted LNI, was assessed by plotting calibration curves and by determining calibration-in-the-large and calibration slopes. 18 Calibration-in-the-large indicates whether predicted probabilities are systematically too low or too high. Perfect calibration is characterized by an intercept of 0 and a calibration slope of 1. 18

The scaled Brier score, which is the average squared difference between the actual outcomes (i.e. LNI) and predicted probabilities, was also determined.
A scaled Brier score close to 1 shows overall poor predictive ability, whereas a scaled Brier score of 0 corresponds with perfect risk prediction of the model. 18

Decision curve analysis was performed to determine the net benefit of the models over multiple clinically relevant thresholds. The calculated net benefit of the models was compared with the scenarios of treating either all or no patients. 20 A systematic analysis was performed to determine the number of patients (with or without LNI) in whom ePLND would be advised, for LNI risk thresholds between 1% and 15%.

Missing data were handled using multiple imputation by chained equations. 21 A total of 10 imputed datasets were created. Model performance measures were estimated by bootstrapping each imputed dataset 500 times. To select the best performing approach, the approaches were compared head-to-head by counting the number of bootstrap samples in which each approach yielded the highest pooled AUC. Statistical analysis was performed using R v3.6.3 (R Project for Statistical Computing, www.r-project.org).
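As an illustration of two of the measures described above, the following Python sketch computes the AUC as a concordance probability and the net benefit at a single risk threshold, together with the treat-all reference strategy (treating no patients has a net benefit of 0 by definition). It is not the authors' R code: the function names and the eight-patient toy data are hypothetical, and the sketch leaves out the multiple imputation and bootstrapping steps.

```python
def auc(risks, outcomes):
    """Probability that a random pN1 patient has a higher predicted risk
    than a random pN0 patient; tied pairs count as one half."""
    pos = [r for r, y in zip(risks, outcomes) if y == 1]
    neg = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def eplnd_advised(risks, outcomes, threshold):
    """Patients at or above the threshold, split into (with LNI, without LNI)."""
    with_lni = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    without_lni = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return with_lni, without_lni


def net_benefit(risks, outcomes, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at risk threshold pt."""
    n = len(outcomes)
    tp, fp = eplnd_advised(risks, outcomes, threshold)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit when every patient undergoes ePLND."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy data: predicted LNI risk and pathological outcome (1 = pN1).
risks = [0.02, 0.05, 0.10, 0.20, 0.40, 0.03, 0.08, 0.30]
outcomes = [0, 0, 0, 1, 1, 0, 1, 0]

print("AUC:", auc(risks, outcomes))
print("ePLND advised at 5% (pN1, pN0):", eplnd_advised(risks, outcomes, 0.05))
print("Net benefit, model at 5%:", net_benefit(risks, outcomes, 0.05))
print("Net benefit, treat all at 5%:", net_benefit_treat_all(outcomes, 0.05))
```

Repeating the last three calls over thresholds from 0.01 to 0.15 would trace a decision curve and the corresponding counts of patients in whom ePLND would be advised.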