Aernoud Fiolet

324 Chapter 13

as positive also enroll in the trial) would maximize efficacy improvements. 27 Our study indicates that automated EHR screening has the potential to identify large numbers of eligible participants in a time-efficient and cost-efficient manner (data not shown).

Data collection accuracy

Since baseline characteristics are, in general, not always included in the final outcome analysis, small errors in these data can be acceptable when counterbalanced by improved efficiency. Incorporating baseline characteristics measured with error into the analyses would only affect research validity when accuracy is not randomly distributed across intervention groups. If the errors are randomly distributed, they could still affect the precision of effect estimates after adjustment. 12

Accuracy of automated EHR data collection depends on the amount of missing data and measurement errors. First, variables can be missing because they were not recorded or were not extracted from the data. Physicians often measure and register only what they consider relevant for delivering care. Consequently, (ordinary) characteristics that are desired in clinical research are not registered. 28 Whether this will lead to problems in identifying patients eligible for trial participation differs per variable and context. Missing data on smoking, for example, will be less informative than missing data on coronary revascularization, since clinicians will not always ask about smoking but may be expected to document coronary interventions. 29 These factors make it harder to extract data due to ensuing variability in how characteristics are reported. Substantive knowledge of the subject matter of the data to be extracted is therefore still essential. Second, EHR data could contain more measurement errors because they were not collected and measured in a standardized format, as is generally done in conventional trial data collection.
EHR data can, for example, be hampered in their currency (i.e., stored variables are out of date) due to irregular patient visits. These remain challenges in the use of EHR data that should be addressed in future research. Third, relevant information contained in the EHR can still be missed due to interindividual differences in reporting or reporting errors (abbreviations, misspellings, synonyms). Improved intelligent text-pattern recognition systems might reduce the risk of missing data.
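To illustrate the kind of text-pattern recognition referred to above, the following is a minimal sketch in Python and not the system used in this study. It checks a free-text note for a characteristic while tolerating synonyms, abbreviations, and small misspellings. The term list, the function names `normalize` and `mentions_term`, and the similarity threshold of 0.85 are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym and abbreviation list; not taken from the study.
REVASCULARIZATION_TERMS = [
    "coronary revascularization",
    "percutaneous coronary intervention",
    "pci",
    "coronary artery bypass grafting",
    "cabg",
]


def normalize(text):
    """Lowercase and replace runs of non-alphanumerics with a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def mentions_term(note, terms=REVASCULARIZATION_TERMS, threshold=0.85):
    """Return True if the note contains a term exactly or with a close spelling.

    The threshold is an assumed value and would need tuning on real notes.
    """
    words = normalize(note).split()
    for term in terms:
        term_words = normalize(term).split()
        n = len(term_words)
        target = " ".join(term_words)
        # Slide a window of n words over the note and compare it to the term.
        for i in range(len(words) - n + 1):
            window = " ".join(words[i:i + n])
            if window == target:
                return True
            # Fuzzy matching only for longer terms; short abbreviations
            # such as "pci" must match exactly to limit false positives.
            if len(target) > 5:
                similarity = SequenceMatcher(None, window, target).ratio()
                if similarity >= threshold:
                    return True
    return False
```

A call such as `mentions_term("Hx: CABG 2015")` matches on the abbreviation, and a spelling variant such as "coronary revascularisation" is caught by the fuzzy comparison. The sketch does not handle negation ("no prior PCI") or context, so substantive knowledge of the data remains necessary when such rules are defined and validated.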
