Chapter 2

APPENDIX 1
Detailed description of statistical methods

This document provides a detailed description of the statistical methods used with regard to 1) dependent or independent comparisons of proportions, 2) adjustment for multiple testing, 3) the pooling of the results of the two sets of CTGs, and 4) the sample size estimations.

Re 1: Dependent and independent comparisons. Whether the four professional groups differed in their proportions of intraobserver and interobserver agreement was tested with the independent-samples t-test for differences in proportions. At first sight, because the same dataset is involved, one would think that a dependent-samples t-test is indicated. However, in our opinion, the independent t-test should be used: we are comparing the proportion of agreement within one professional group with the proportion of agreement within another professional group (i.e., the proportions of agreement presented on the diagonal of Table 2); we are not comparing the actual ratings of the members of the professional groups in this analysis. An example may help: if the primary care midwives reach 100% agreement, with (the same) 3 out of 10 CTGs scored as non-reassuring, and the obstetricians reach 100% agreement, with (the same) 2 out of 10 CTGs scored as non-reassuring, both groups have an agreement of 100%. If the scores of the CTGs are compared between the professional groups, however, they differ on some CTGs. Comparisons of the actual ratings of the different professional groups (i.e., the interobserver agreement between the professional groups) are presented in the off-diagonal cells of Table 2 (in the manuscript).

Re 2: Adjustment for multiple testing. Different views exist on correction for multiple testing. In our opinion, corrections for multiple testing are not necessary if a P-value is interpreted adequately, i.e., as a 1/20 chance of a false-positive result.
However, corrections for multiple testing are indicated if the results of multiple tests are used for a final decision, and in case there are no prior hypotheses.1,2

Re 3: Pooling of the results. The formula we used for pooling the proportions x1/n1 and x2/n2 was (x1 + x2) / (n1 + n2). As n1 and n2 had the same size (this holds for both the set size (n=10) and the number of observers in each professional group (n=5)), we simply took the average of the proportions found. Both the selected samples to be rated and the characteristics of the raters within a professional group were comparable.
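The equivalence between the pooled proportion and the simple average when n1 = n2 can be checked with a short Python sketch. The function names and the counts used below are illustrative assumptions and are not taken from the study data.

```python
from fractions import Fraction


def pooled_proportion(x1, n1, x2, n2):
    """Pool two proportions x1/n1 and x2/n2 as (x1 + x2) / (n1 + n2)."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive")
    return Fraction(x1 + x2, n1 + n2)


def mean_proportion(x1, n1, x2, n2):
    """Unweighted average of the two proportions x1/n1 and x2/n2."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive")
    return (Fraction(x1, n1) + Fraction(x2, n2)) / 2


# Hypothetical counts: 7 of 10 agreements in one set, 9 of 10 in the other.
equal_pooled = pooled_proportion(7, 10, 9, 10)
equal_mean = mean_proportion(7, 10, 9, 10)

# With unequal denominators the two quantities generally differ.
unequal_pooled = pooled_proportion(1, 10, 10, 20)
unequal_mean = mean_proportion(1, 10, 10, 20)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding. With equal denominators both functions return the same value; with unequal denominators the pooled proportion weights each set by its size, so the shortcut of averaging would no longer apply.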
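As an illustration of the independent comparison of two proportions described under Re 1, the sketch below implements a standard two-proportion z-test with a pooled standard error. This is one common way to carry out such a comparison; whether it matches the exact procedure used in the analysis is an assumption, and the function name and counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions.

    Assumption: a pooled standard error under the null hypothesis p1 == p2.
    Returns the z statistic and the two-sided P-value.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("n1 and n2 must be positive")
    p1 = x1 / n1
    p2 = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        # Happens when both proportions are 0 or both are 1.
        raise ValueError("test is undefined when the pooled proportion is 0 or 1")
    z = (p1 - p2) / se
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts of agreements out of the number of comparisons per group.
z_same, p_same = two_proportion_z_test(5, 10, 5, 10)
z_diff, p_diff = two_proportion_z_test(9, 10, 1, 10)
```

The test only uses the agreement counts of each group, not the underlying ratings, which mirrors the argument above that the proportions of agreement are compared rather than the scores themselves. When both groups reach 100% agreement, as in the example under Re 1, the standard error is zero and the sketch raises an error instead of returning a statistic.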