be weighted by sample size, we deemed this method superior to a narrative review that describes results study by study. Standardised mean differences (i.e., Cohen's d effect sizes with their 95% confidence intervals) at the post-test measurement were calculated by dividing the difference between two means by the pooled standard deviation. The calculation of the pooled standard deviation was adjusted with weights for the sample sizes (Lenhard & Lenhard, 2016; calculator 2). The rationale for the specific outcome measures used in the current review is discussed further in the Results sections for the respective subgroups of (sub)clinical and healthy participant samples. If higher values on the outcome measure indicated better performance or improvement, the mean value was multiplied by –1, such that a positive effect size indicated a beneficial effect of the experimental group compared to the control group. Effect sizes in the follow-up period were not calculated because not all studies included a follow-up measurement and the follow-up periods differed greatly among the included studies. Effect sizes of 0.2 indicate a small effect, 0.5 a moderate effect, and 0.8 a large effect (Cohen, 1988).

If studies reported standard errors (SE) instead of standard deviations (SD), these were transformed with the formula SD = SE × √n (Barde & Barde, 2012; Higgins, Li et al., 2022). For studies that reported medians and (inter)quartile ranges, skewness was checked prior to transformation (Shi et al., 2020). If the data were not skewed, medians and (inter)quartile ranges were transformed to means and standard deviations (Luo et al., 2018; Wan et al., 2014).

For crossover trials, effect sizes were based on data from the first period of the trial, essentially representing a parallel-group comparison. In crossover trials, each participant is randomised to an ordering of interventions and thus receives all interventions in sequence. These types of trials are suitable for evaluating interventions with a temporary effect in the treatment of stable conditions (Higgins, Eldridge et al., 2022). Because skills learned in treatment may not 'wash out' before participants enter the second phase of the trial (i.e., a carry-over effect is very likely), we deemed it inappropriate to calculate effect sizes on data combined over both periods.

Cluster-randomised trials involve randomising groups of participants (e.g., school classes) to different interventions. A key implication of a cluster design is that participants within a cluster tend to respond in a similar manner, violating the assumption of independent observations. Statistical analyses should therefore account for the clustering to prevent artificially narrow confidence intervals and false-positive conclusions (Higgins, Eldridge et al., 2022).
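To make the effect-size computations described above concrete, the sketch below shows one way to implement them in Python. It is an illustration under our own naming, not code from the review: `se_to_sd` applies the SD = SE × √n conversion, and `cohens_d` pools the standard deviations with (n − 1) weights, flips the sign when higher scores mean better outcomes, and uses a standard large-sample approximation for the standard error of d to form the 95% confidence interval.

```python
import math

def se_to_sd(se: float, n: int) -> float:
    """Convert a standard error to a standard deviation: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)

def cohens_d(mean_e, sd_e, n_e, mean_c, sd_c, n_c, higher_is_better=False):
    """Cohen's d at post-test with a sample-size-weighted pooled SD,
    plus an approximate 95% confidence interval."""
    # Pooled SD with (n - 1) weights for the two groups
    sd_pooled = math.sqrt(
        ((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / (n_e + n_c - 2)
    )
    d = (mean_c - mean_e) / sd_pooled
    if higher_is_better:
        # Equivalent to multiplying the means by -1: a positive d should
        # always denote a beneficial effect of the experimental group
        d = -d
    # Large-sample approximation of the standard error of d
    se_d = math.sqrt((n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c)))
    return d, (d - 1.96 * se_d, d + 1.96 * se_d)
```

For example, with symptom scores (lower is better), `cohens_d(10.2, 4.1, 35, 13.8, 4.5, 33)` yields d ≈ 0.84, a large effect by the benchmarks above.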
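For studies reporting medians and quartiles, the conversions cited above have closed-form approximations. The sketch below gives our reading of them, assuming roughly normal (i.e., non-skewed) data, which is why skewness is screened first: the mean estimator follows Luo et al. (2018), a sample-size-dependent weighting of the mid-quartile range and the median, and the standard-deviation estimator follows Wan et al. (2014); the function names are ours.

```python
from scipy.stats import norm

def mean_from_quartiles(q1: float, median: float, q3: float, n: int) -> float:
    # Luo et al. (2018): weight the mid-quartile range against the median,
    # with the weight depending on the sample size
    w = 0.7 + 0.39 / n
    return w * (q1 + q3) / 2 + (1 - w) * median

def sd_from_quartiles(q1: float, q3: float, n: int) -> float:
    # Wan et al. (2014): divide the IQR by the expected distance between
    # the sample quartiles of a standard normal; for large n this
    # approaches the familiar (q3 - q1) / 1.35 rule of thumb
    return (q3 - q1) / (2 * norm.ppf((0.75 * n - 0.125) / (n + 0.25)))
```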
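Finally, for cluster-randomised trials whose primary analyses ignored the clustering, the Cochrane Handbook (Higgins, Eldridge et al., 2022) describes an approximate correction: deflate the sample sizes by the 'design effect', which depends on the average cluster size and an intracluster correlation coefficient (ICC) that usually has to be borrowed from external sources. A minimal sketch, with the ICC value assumed purely for illustration:

```python
def effective_sample_size(n: int, avg_cluster_size: float, icc: float) -> float:
    """Deflate a sample size by the design effect 1 + (m - 1) * ICC."""
    design_effect = 1 + (avg_cluster_size - 1) * icc
    return n / design_effect

# E.g., 200 pupils randomised in classes of 25, assumed ICC = 0.05:
# design effect = 1 + 24 * 0.05 = 2.2, leaving ~91 'effective' participants
print(effective_sample_size(200, 25, 0.05))  # 90.9...
```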