A lack of transparency in combination with the use of state-of-the-art methods was also described by Arevalo-Rodriguez and colleagues, who studied the methods and reports of 191 rapid reviews of medical tests [34]. In the majority of those reviews, the study selection method was not reported. Although almost 20% of the reviews claimed to have applied the GRADE approach, few actually reported the data extraction and quality appraisal methods. This finding is consistent with a recent report on the application of GRADE in U.S. guidelines [35]. Although guideline developers indicated that they used the GRADE approach, only 10% of the included CPGs reported on all 8 criteria for assessing the certainty in the evidence (e.g. indirectness and dose-response gradient), and around half of these included an evidence profile or summary of findings table.

Gopalakrishna et al. studied barriers in the development of recommendations about medical tests in a qualitative study among European CPG developers [36]. They also reported challenges in the development of recommendations about medical tests, e.g. in defining key questions, in choosing the types of evidence and outcomes to include in the CPG, and in synthesizing and appraising the evidence. Awareness and education were reported as the most important ways to solve these challenges.

Our study emphasises the need for more knowledge and expertise among CPG developers when evaluating diagnostic tests. Currently available competency-based frameworks for CPG developers do not include a special focus on diagnostic test evaluation [37, 38]. This also applies to current training programs for CPG panel members, e.g. INGUIDE [39]. Facilitating the implementation of GRADE for diagnosis by defining competencies and training needs may improve the quality of CPGs about diagnostic tests.

Strengths and limitations

This study evaluated the supporting evidence of recommendations in CPGs on three medical tests.
The selection of only three topics is a limitation of this study. However, we chose three diagnostic tests with divergent characteristics (e.g. invasiveness, possible burden of the test, disease of interest, costs), which allowed comparison of many CPGs. The homogeneous results across all three clusters of CPGs strengthen the external validity of our findings. Additionally, we found large variance in the methodological quality of the included CPGs. However, CPGs that scored high on the AGREE II methodology domain did not show a better or more transparent underpinning of their recommendations than lower-scoring CPGs.