Vincent de Leijster

Chapter 6

homoscedasticity and normal distribution. If the model failed to meet the assumptions or showed a poor fit, we considered an ordinary linear regression (OLR) or a generalized linear model (GLM), which were also evaluated against their assumptions. We tested whether the slope differed significantly from zero using a t-test and evaluated the model fit to the data using an F-test. As an indication of model fit we calculated R² for OLR and McFadden's pseudo R² for GLM (Kattwinkel et al., 2011). Finally, we used the Akaike Information Criterion (AIC), an estimate of the information content of the model in explaining the process at hand, to select the best model or set of models.

For the costs and (potential) benefits, we tested whether they varied with ‘time since transition to agroforestry’ using either OLR or GLM, as in the analysis of economic performance. We expected this relationship to follow a saturation curve for potential timber revenues and potential carbon revenues; we therefore tested for this using non-linear sigmoid and asymptotic models (De Leijster et al., 2021). For the non-linear models we calculated a pseudo R² by regressing the predicted values on the observed values; for OLR and GLM we used model parameters similar to those described for the economic performance analysis. We analyzed the data in R version 3.6.1 (R Core Team, 2019) using the packages ‘lme4’ and ‘nlstools’.

6.2.5.2 Identification of drivers of economic performance

To identify which factors best explained economic performance, we tested the effect of the following ‘factor groups’: farm characteristics (size, elevation, time since pruning of coffee plants), management characteristics (intensity of pest control, fertilization, weed control), vegetation characteristics (tree species richness, tree density, tree spatial arrangement), and supply chain (intermediary, certified).
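As a rough illustration of the fit statistics used in the model evaluation described above, the sketch below computes AIC, McFadden's pseudo R², and a pseudo R² obtained by regressing predicted on observed values. It is written in Python for illustration only (the analysis itself was run in R), and all function names are hypothetical. For a simple linear regression, R² equals the squared Pearson correlation, which is what the last function relies on.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R2: 1 - ln(L_model) / ln(L_null).

    loglik_null is the log-likelihood of an intercept-only model.
    """
    return 1.0 - loglik_model / loglik_null


def pseudo_r2_from_predictions(observed, predicted):
    """Pseudo R2 for a non-linear model (hypothetical helper).

    Equals the R2 of a simple linear regression between predicted and
    observed values, i.e. their squared Pearson correlation.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)
```

Candidate models fitted to the same response data could then be ranked by `aic`, with the pseudo R² values reported alongside as a descriptive measure of fit.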
We constructed an index for the management intensity variables by applying min-max normalization to labor and quantity of chemicals (Patro and Sahu, 2015). First, we tested the direct relationships between the factors and economic performance using linear regressions. We visually assessed whether the model residuals met the assumptions of homoscedasticity and normal distribution by inspecting density plots and plots of residuals vs. fitted values; if they did not, we applied log transformations to both explanatory and response variables. For the categorical factors (tree spatial arrangement, intermediary choice and certification), we used Tukey’s test to determine which pairs of groups differed. We tested for collinearity among the factors using pairwise Spearman correlation analysis. We then used multiple regression analyses to identify the factors that best explained economic performance. To avoid