study, we felt we needed to clarify why respondents would consider deposits if there was no matching. One may argue that these added explanations, which were included only for deposit-based incentives and in the nudged assignment, may be considered a form of framing. We are unable to disentangle framing from the effects of mode of assignment or incentive scheme; doing so requires a different type of design (e.g., de Buisonjé et al.60). As such, future research should avoid such strong framing, and preferably include a theoretical model of incentive design,37, 61 which may allow designing optimal incentives for respondents given their time and risk preferences. Such work could also explore alternative modes of assignment, e.g., measuring respondents' characteristics beforehand so that they can be assigned optimal incentive schemes (without opt-out). Second, our study used a relatively small sample, which suggests that the statistical power for the contrasts included may be questioned.(5) Third, the economic preferences (i.e., delay discounting and loss aversion) were elicited for hypothetical rewards. Typically, it is preferred for risk and time preferences to be elicited with incentive-compatible procedures,63 i.e., with procedures that translate to real payments. Fourth, in this study economic preferences, including demand for commitment, were measured before offering respondents the choice of reward- or deposit-based incentives. As such, respondents may have felt a need to act consistently with their hypothetical demand for commitment, which would not have been observed had this question not been asked. Hence, it is possible that the current order biases take-up of deposit-based incentives, and explains why demand for commitment was associated with take-up of deposit-based incentives. Future work should consider randomising the order of the demand for commitment question (and other economic preferences), to be able to estimate the size of this bias.
Fifth, the measures applied for estimating discounting and loss aversion have limitations of their own, which also apply here. For example, the MCQ used to measure discounting uses relatively small monetary amounts, and it is well known that discounting is larger for smaller amounts.64 Furthermore, the MCQ is not able to

(5) To identify whether statistical power is an issue in our study, we used the pwr package in R to determine the minimum detectable effect given our sample. We assume a test power of 0.8 and explore the minimum detectable effects for the first two contrasts (two-sample t-test): nudged (n = 81) vs. random assignment (n = 90) and deposit-based (n = 71) vs. reward-based incentives (n = 100). The minimum detectable effects are Cohen's d = 0.43 and d = 0.44, i.e., moderate effects. To see whether that is 'reasonable', we consider the standard deviation of the number of sliders completed across all participants, which is 159. The minimum detectable difference in means when treating 159 as the pooled standard deviation would be 0.43 (0.44) × 159 = 68 (70). We feel that this is a considerable difference, but not unreasonably large. Note that these are ex-post analyses and should be treated with caution; see Hoenig & Heisey, 2001.
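The minimum detectable effects in footnote 5 can be approximated by hand. The sketch below is not the pwr call used in the study: it uses the large-sample normal approximation d ≈ (z_{1-α/2} + z_{power}) · sqrt(1/n1 + 1/n2) and assumes a two-sided significance level of 0.05, which the footnote does not state. The function name and the contrast labels are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_d(n1, n2, alpha=0.05, power=0.8):
    """Approximate minimum detectable Cohen's d for a two-sample test.

    Normal approximation to the two-sample t-test; alpha = 0.05 (two-sided)
    is an assumption, not a value taken from the footnote.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


# Standard deviation of sliders completed, treated as the pooled SD (footnote 5).
POOLED_SD = 159

# Group sizes as reported in footnote 5.
contrasts = {
    "nudged vs. random assignment": (81, 90),
    "deposit-based vs. reward-based": (71, 100),
}

for label, (n1, n2) in contrasts.items():
    d = minimum_detectable_d(n1, n2)
    print(f"{label}: d = {d:.3f}, difference in means = {d * POOLED_SD:.1f} sliders")
```

Under these assumptions both contrasts come out at roughly d = 0.43, marginally below the reported 0.43 and 0.44. A small gap in that direction is expected, because an exact calculation based on the noncentral t distribution, as in pwr, yields slightly larger values than the normal approximation.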