Aernoud Fiolet

and myocardial infarction, and low positive predictive values for all endpoints. For the use of clinical endpoints, an adjudication committee could help to increase specificity in such circumstances.

Endpoint identification using EHR text-mining has three consequences for scientific research, and for clinical trials in particular. First, using EHR-based endpoint information instead of investigator-reported endpoints will usually lower the power of a clinical trial, provided misclassification of endpoints is independent of treatment arm. As the sensitivity and specificity of the EHR-based endpoints depart from the ideal value of 1, the power of the trial decreases. To maintain a pre-specified power, sample sizes would therefore typically need to increase when EHR-based endpoints are used instead of investigator-reported endpoints. In particular, low-cost pragmatic investigations, such as trials in low-income countries or head-to-head comparisons of on-market treatments, might benefit from EHR data collection methods. These studies meet clinical needs, but are often difficult to fund. 25,26

Second, missed endpoints may introduce bias in assessing treatments. For example, a treatment with more adverse effects than control might lead to a higher number of hospital visits, and therefore to more opportunities to register endpoints than in the control arm, spuriously increasing the observed risk in the treatment arm. However, such bias also applies, to a lesser extent, to regular endpoint collection, as trial endpoints are also identified in routine physician-patient interactions.

Third, EHR data collection is limited to information that is regularly collected. Data that are not regularly ascertained in routine care, such as rare adverse events, or that are registered ambiguously will be more difficult to identify accurately through EHR data retrieval.

Limitations

We acknowledge several limitations.
First, the results in this study were obtained using a closed-source, commercial software package. As a result, we could not directly validate the underlying algorithms used in the software. To mitigate this limitation, we assessed multiple EHR vendors. Second, the medical background and training of the operator validating the results will affect sensitivity. We could not isolate the magnitude of this effect on our estimates. Lastly, the limited sample size in this analysis and the low incidence rate of some endpoints in the trial affected the precision of the sensitivity estimates.
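
As an illustration of the first consequence discussed above (loss of power when EHR-based endpoints are misclassified independently of treatment arm), the following minimal Python sketch shows how imperfect sensitivity and specificity shrink the observed risk difference and raise the required sample size. It is not the method used in this study. It assumes a standard normal-approximation sample size formula for comparing two proportions, and the function names and all numeric inputs (event risks, sensitivity, specificity) are hypothetical values chosen for illustration only.

```python
from math import ceil
from statistics import NormalDist


def observed_risk(true_risk, sensitivity, specificity):
    """Event risk as it would be observed with an imperfect endpoint classifier.

    True events are detected with probability `sensitivity`; non-events are
    falsely flagged with probability `1 - specificity`.
    """
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def n_per_arm(p_control, p_treat, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided test of two proportions
    (normal approximation, equal allocation)."""
    z = NormalDist().inv_cdf
    z_total = z(1.0 - alpha / 2.0) + z(power)
    variance = p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat)
    return ceil(z_total ** 2 * variance / (p_control - p_treat) ** 2)


# Hypothetical inputs, not taken from the study.
TRUE_CONTROL, TRUE_TREAT = 0.10, 0.07   # assumed true event risks per arm
SENSITIVITY, SPECIFICITY = 0.80, 0.98   # assumed EHR endpoint accuracy

# Reference case: endpoints classified without error.
n_ideal = n_per_arm(TRUE_CONTROL, TRUE_TREAT)

# Same misclassification in both arms (non-differential).
n_ehr = n_per_arm(
    observed_risk(TRUE_CONTROL, SENSITIVITY, SPECIFICITY),
    observed_risk(TRUE_TREAT, SENSITIVITY, SPECIFICITY),
)

print(f"per-arm sample size, error-free endpoints: {n_ideal}")
print(f"per-arm sample size, EHR-based endpoints:  {n_ehr}")
```

With sensitivity and specificity both equal to 1, `observed_risk` returns the true risk and the two sample sizes coincide. As either measure falls below 1, the observed risks in the two arms move closer together, so more participants are needed to reach the same power.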
