Mylène Jansen

Chapter 11

Data collection

Since the first KIDA publication in 2008, all KIDA evaluations have been performed by the same observer. The radiographs from the 3 previously performed studies (described under Patients) that are evaluated in the current analysis were all analyzed for the first time between 2013 and 2015. More recently (2017–2018), all radiographs at baseline and at 1 and 2 years for these 3 studies were reanalyzed by the same observer. As a result, almost all radiographs had duplicate readings, which could be used to determine the intra-observer variability and thereby to evaluate the measurement properties and performance of KIDA in these patients with severe OA. Since multiple years passed between the first and second analysis, which might influence the results, 100 of the radiographs were randomly selected to be evaluated twice more, with a maximum of 1 month between the 2 readings. The selection was made randomly to ensure that the subset was generalizable to the full set of radiographs. These 200 images (100 radiographs analyzed twice) were randomly ordered and divided into 4 batches of 50; each week, 1 batch was analyzed by the same observer (MM), who was completely blinded to patient characteristics. This data set was additionally used for a comparison with the data set from the original KIDA publication, which consisted of mild OA patients with duplicate readings and a limited time span between the 2 readings.[5] Moreover, the relevance of the time between readings, months versus years, could be evaluated.

Statistical analysis

The intra-observer variability was calculated separately for the 2 groups of severe OA radiographs: the total group, with radiographs analyzed with a longer and varying period (years) between the 2 observations, and the 100 radiographs analyzed within 1 month.
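The agreement statistics computed for each group of duplicate readings (the mean and SD of the paired differences, the SDD as 1.96 times that SD, and a single-measure ICC from a 2-way random model with absolute agreement) can be illustrated with a minimal sketch in Python. This is not the software used in the study: the function names `difference_stats` and `icc_single_absolute` and the example values are hypothetical, and the ICC is computed with the standard mean-square formula for this model, which is assumed to match the analysis described here.

```python
import numpy as np


def difference_stats(first, second):
    """Mean and SD of the paired differences, plus the SDD (1.96 * SD)."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    diff = first - second                 # y-axis of a Bland-Altman plot
    pair_mean = (first + second) / 2.0    # x-axis of a Bland-Altman plot
    sd_diff = diff.std(ddof=1)            # sample SD of the differences
    return {
        "pair_mean": pair_mean,
        "diff": diff,
        "mean_diff": diff.mean(),
        "sd_diff": sd_diff,
        "sdd": 1.96 * sd_diff,
    }


def icc_single_absolute(first, second):
    """Single-measure ICC, 2-way random model, absolute agreement."""
    data = np.column_stack([first, second]).astype(float)
    n, k = data.shape                     # n radiographs, k readings each
    grand = data.mean()
    ss_rows = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_error = ((data - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)           # between-radiograph mean square
    ms_cols = ss_cols / (k - 1)           # between-reading mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical duplicate readings of one parameter (illustrative values only).
reading_1 = [10.0, 12.0, 14.0, 9.0, 11.0]
reading_2 = [11.0, 11.0, 15.0, 9.5, 10.0]
stats = difference_stats(reading_1, reading_2)
print(stats["mean_diff"], stats["sd_diff"], stats["sdd"])
print(icc_single_absolute(reading_1, reading_2))
```

Because absolute agreement is requested, a systematic shift between the first and second reading lowers the ICC even when the ranking of the radiographs is unchanged; the same shift leaves the SD of the differences, and therefore the SDD, unaffected.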
The intra-observer variation was displayed, for each KIDA parameter separately, with Bland-Altman plots in which the difference between the first and second result was plotted against the mean of the 2 observations.[18] In accordance with the original publication, the mean and standard deviation (SD) of all measurements were calculated, as were the mean, SD, and 95% confidence interval (95%CI) of the differences between the duplicate readings; the SDD was defined as 1.96 times the SD of the differences. The intraclass correlation coefficient (ICC) was calculated for single measures using a 2-way random model with absolute agreement. ICCs were interpreted according to the definitions of Koo and Li: an ICC below 0.50 was considered poor, between 0.50 and 0.75 moderate, between 0.75 and 0.90 good, and above 0.90 excellent.[19] The mean, SD (of the difference), and SDD were compared between the 3 groups of radiographs: the total group of severe OA radiographs analyzed with a longer period between the 2 observations, the 100 severe OA radiographs analyzed within 1 month, and the results from the mild OA patients from the original publication. Since the SD and with that the SDD may
