Cindy Boer

246 | Chapter 5.2

analysis of variance (PERMANOVA) to inspect the global effect of WOMAC pain score and KLsum score on the overall microbiome profiles, using the adonis PERMANOVA function in VEGAN. PERMANOVA models included age, sex, TimeInMail, DNA isolation batch, BMI, and WOMAC pain score or KLsum, in that order. Results were visualized using a PCA plot. The CLR method and PCA plot were based on Gloor et al.[33]. The intra-individual microbial composition metrics (α-diversity) used were the Shannon Index and the Inverse Simpson Index. Association of α-diversity with WOMAC pain score was examined with a Poisson regression model adjusted for age, sex, batch, and TimeInMail.

To identify microbial taxa associated with the investigated phenotypes, we used multivariate linear regression as implemented in the R package MaAsLin[60]. MaAsLin is a specialized statistical R package for the analysis of microbial community abundance data and clinical/phenotypic metadata. All unknown and unclassified taxonomies were excluded from this analysis (n=63). In MaAsLin, we used adjusted default settings: a linear model, quality control (QC) and outlier exclusion based on the Grubbs test applied to the microbiome data only, and arcsine-square-root transformation of the per-taxon relative abundance table to normalize the microbiome profiles. After MaAsLin QC, 256 taxonomies remained for analysis (2 domains, 12 phyla, 18 classes, 25 orders, 41 families, and 158 genera). Missing values were not imputed, and neither automatic QC of the metadata (WOMAC pain score and covariates) nor boosting (which excludes metadata from the association analyses) was performed. For each analysis we forced the following cofactors: age (years), sex (0/1), and the technical covariates DNA isolation batch (0/1) and TimeInMail (days).
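As an illustration of the transformations and diversity indices named above, the following is a minimal Python sketch. It is not the code used in this study, where the analyses were run in R. The function names, the example counts, and the pseudocount used to handle zero counts in the CLR transform are assumptions made for illustration only.

```python
import math


def relative_abundance(counts):
    """Convert one sample's raw taxon counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample.

    The pseudocount is an illustrative assumption for handling zeros;
    the source does not state how zeros were treated.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def arcsine_sqrt(proportions):
    """Arcsine-square-root transform of relative abundances."""
    return [math.asin(math.sqrt(p)) for p in proportions]


def shannon_index(counts):
    """Shannon index (natural log); zero-abundance taxa are skipped."""
    props = [p for p in relative_abundance(counts) if p > 0]
    return -sum(p * math.log(p) for p in props)


def inverse_simpson_index(counts):
    """Inverse Simpson index: 1 / sum of squared proportions."""
    props = relative_abundance(counts)
    return 1.0 / sum(p * p for p in props)


# Hypothetical counts for a single sample with four taxa.
sample = [10, 20, 0, 5]
clr_values = clr(sample)
transformed = arcsine_sqrt(relative_abundance(sample))
```

By construction the CLR values of a sample sum to zero, which is why CLR-transformed data are suited to PCA on compositional data.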
Depending on the model, we additionally forced the following cofactors: BMI (body mass index), smoking (current smoker, y/n), daily alcohol consumption (glasses/day), PPI use (y/n), and NSAID use (y/n). Statistical significance was defined as a Benjamini-Hochberg false discovery rate (FDR) <0.05 after multiple-testing correction.

Possible collinearity in the microbiome data was removed by isometric log-ratio (ILR) transformation of the counts of the directly taxonomically classified reads. The full dataset, including unknown and unclassified taxonomies, was used. A multivariate linear regression model adjusted for age, sex, and cohort-specific technical covariates was fitted to the ILR-transformed data.

Replication analysis in LifeLines-DEEP (LLD) used MaAsLin with similar settings, except for the automatic QC in MaAsLin: in LLD, QC of the metadata was done manually. All analyses were also adjusted for age, sex, and cohort-specific technical covariates. Meta-analysis of RS and Lifelines was performed using inverse-variance weighting in METAL[61].

Correlation between Streptococcus spp. abundance from 16S sequencing and Streptococcus spp. abundance from qPCR was assessed by Spearman correlation in R. Association of knee WOMAC pain with qPCR data was examined with Poisson regression models adjusted for age, sex, and qPCR technical covariates (plate number). All figures and graphs were made in R and adapted in Adobe Illustrator.
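The Benjamini-Hochberg correction and the inverse-variance weighted meta-analysis mentioned above can be sketched as follows. This is a hedged Python illustration of the standard formulas, not the R or METAL code used in this study. The function names and all input values are hypothetical, and the meta-analysis shown is the plain fixed-effect estimator.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance weighted meta-analysis.

    Each study is weighted by 1 / SE^2. Returns the pooled effect,
    its standard error, and the corresponding z-score.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-taxon p-values and per-cohort effect estimates.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in fdr]
beta, se, z = inverse_variance_meta([0.2, 0.4], [0.1, 0.1])
```

With equal standard errors in both cohorts, the pooled effect is the simple mean of the two estimates, and the pooled standard error shrinks by a factor of √2.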
