Joyce Molenaar

Predicting population-level vulnerability among pregnant women

PHM-2016 variables (comprising 33 variables); and 4) baseline combined with all PHM-2016 variables, representing a potential optimum (42 variables). Comparing the average F1-measures for each combination helped identify which PHM-2016 variables enhanced model performance.

To identify which variables were most important in model predictions (objective 3), we assessed variable importance in the final RF-model with and without PHM-2016 data. Variable importance was measured using out-of-bag (OOB) observations, as explained in Appendix 1. This process yields a ranking of variable importance (32). As sensitivity analyses, we examined the permutation importance and Partial Dependence Plots (PDPs), also explained in Appendix 1.

Ethics approval

The Clinical Expertise Centre of the National Institute for Public Health and the Environment confirmed that our study was not subject to the Dutch Medical Research Involving Human Subjects Act (WMO) (reference number: VPZ-574).

RESULTS

Study population

The study population comprised 4172 women (Appendix 2). Approximately 42.1% of these women were nulliparous, 4.6% had a low income and 6.0% had a low educational level. Compared with all women with unique pregnancies between 2017 and 2021 (n = 807,904), the distribution of most variables was comparable, but differences were found for variables such as income, educational level and ethnicity. Among the 4172 women, the incidence of the risk factors was generally lower.

Predictions with routinely collected data

The RF-model that included the routinely collected data obtained an average AUC of 0.98 (see Table 1). Such a high AUC implies that the model sufficiently distinguishes between women with and without multidimensional vulnerability. The F1-measure averaged 0.70, indicating that the model is able to correctly predict cases of multidimensional vulnerability, but also that some cases are missed and some women are incorrectly assigned to the vulnerability class. Appendix 2 presents the selected hyperparameters and thresholds and the results of the separate folds. Results were consistent with those of XGBoost and Lasso (Appendix 2).
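
A high AUC alongside a more moderate F1-measure can be illustrated with a minimal sketch. The example below is not the study's code or data: the labels, scores, the 0.5 threshold and the function names `f1_from_predictions` and `auc_from_scores` are all hypothetical and chosen only for illustration. It shows that AUC summarises how well the scores rank vulnerable above non-vulnerable women, whereas the F1-measure depends on the missed cases and false alarms that remain once a threshold is applied.

```python
from typing import Dict, Sequence


def f1_from_predictions(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 for the positive (here: vulnerable) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # incorrectly assigned
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # missed cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def auc_from_scores(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as 0.5. This pairwise form is O(n_pos * n_neg) and meant for small examples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 2 vulnerable women out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05, 0.05]
threshold = 0.5  # assumed cut-off, for illustration only
y_pred = [1 if s >= threshold else 0 for s in scores]

print("AUC:", auc_from_scores(y_true, scores))        # 0.9375
print("F1 :", f1_from_predictions(y_true, y_pred)["f1"])  # 0.5
```

In this toy case the ranking is nearly perfect, yet one missed case and one false positive are enough to halve the F1-measure, because the positive class is small. Which threshold is appropriate depends on the relative cost of missed cases and false alarms.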
