Tjitske van Engelen

Chapter 8

patients, which was set against the hours needed by students and residents to classify patients. We also evaluated the validity of the classifications by the medical students and residents, defined as agreement between their classification and the classification by the expert panel. Possible outcomes were total agreement (agreement on all diagnostic labels), partial agreement (agreement on at least one, but not all, diagnostic labels), or total disagreement. Cases were counted as agreement when the disagreement was based on procedural errors or on discordance on labels from the additional diagnostic categories only. An additional goal was to obtain a qualitative impression of the differences between classifications by students and residents and those by the expert panel; the partial agreement and total disagreement cases were therefore studied in detail. In addition, we evaluated the reasons for referral of a case to the expert panel, the consistency of the diagnostic handbook (defined as overall inter-observer agreement between the students), reasons for disagreement between the students, inter-observer agreement between students for specific diagnostic labels, and classification by and inter-observer agreement between members of the expert panel.

Statistical analysis

Percentages were calculated with their 95% confidence interval (CI) [6]. An agreement percentage of ≥ 80% was regarded as acceptable inter-observer agreement. To correct for chance agreement on diagnostic labels, Cohen's kappa (κ) statistic was calculated with a 95% CI [7]. We categorised κ agreement as very good (0.81–1.00), good (0.61–0.80), moderate (0.41–0.60), fair (0.21–0.40), or poor (<0.20) [8, 9]. Data were analysed using SPSS 24 (2019, IBM Software, Armonk, NY, United States).
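For illustration, the two agreement measures described above can be sketched in a few lines of Python. This is not the SPSS procedure used in the study: the Wilson score interval is assumed here as one common choice for a proportion's 95% CI (the exact CI method of reference [6] is not stated in this excerpt), and the kappa is the standard Cohen's formula.

```python
from collections import Counter
import math

def agreement_and_kappa(rater_a, rater_b):
    """Percent agreement and Cohen's kappa for two raters' label lists.

    Assumes equal-length lists and expected chance agreement < 1.
    """
    n = len(rater_a)
    # observed agreement: fraction of cases with identical labels
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # expected chance agreement from each rater's marginal label frequencies
    ca, cb = Counter(rater_a), Counter(rater_b)
    pe = sum(ca[label] * cb.get(label, 0) for label in ca) / n ** 2
    kappa = (po - pe) / (1 - pe)
    return po, kappa

def wilson_ci(p, n, z=1.96):
    """95% Wilson score interval for a proportion (illustrative choice)."""
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half
```

For example, an agreement percentage of 80% observed over the 240 sampled cases would get a Wilson 95% CI of roughly 74% to 85%, computed as `wilson_ci(0.80, 240)`.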
Results

Demographics

The 240 cases randomly selected from the 2,418 patients enrolled in the OPTIMACT trial did not differ from the non-selected cases in age, gender, or comorbidities, based on a preliminary analysis prior to data cleaning (data not shown). Follow-up at 28 days was complete for all cases.

Efficiency of the method

One hundred and seven of the 240 participants (45%) could be assigned a diagnosis by the students without additional evaluation, and 76 more (32%) by students and residents after a consensus meeting (Figure 1). Thus, only 57 of 240 cases (24%) had to be referred to the experts. There were four cases with disagreement on extrathoracic pathology; as this study concerned thoracic pathology, these were counted as agreement. Of the 108 cases discussed in the consensus meeting between students and a resident, 32 cases were referred to the expert panel. The reasons for referral of a case to the expert panel were the predefined rules of the diagnostic handbook (n = 12), specific questions for the experts (n = 10), case complexity (n = 9), or failure to reach consensus (n = 1).
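As a quick arithmetic check, the case flow reported above can be reproduced in a short script (numbers copied from the text; this is an illustrative consistency check, not part of the published analysis):

```python
# Case flow of the 240 sampled OPTIMACT cases, as reported in the text.
total = 240
by_students = 107   # diagnosis assigned by students without additional evaluation
by_consensus = 76   # resolved at the student-resident consensus meeting
to_experts = 57     # referred to the expert panel

def pct(n, denominator=total):
    """Share of the sample, rounded to a whole percent."""
    return round(100 * n / denominator)

# The three routes account for every sampled case: 107 + 76 + 57 == 240.
assert by_students + by_consensus + to_experts == total

shares = [pct(by_students), pct(by_consensus), pct(to_experts)]
print(shares)  # → [45, 32, 24], matching the reported percentages
```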
