Maider Junkal Echeveste Medrano

Unraveling nitrogen, sulfur and carbon microbial bioreactor responses to stress

to assembled contigs using BBMap (BBTools v38.75) (Bushnell, 2014). Sequence mapping files were handled and converted using SAMtools v1.10. Contigs of at least 1000 bp were used for binning with CONCOCT v1.1.0 (Alneberg et al., 2014), MaxBin2 v2.2.7 (Wu et al., 2015), and MetaBAT2 v2.15 (Kang et al., 2019). The resulting metagenome-assembled genomes (MAGs) were dereplicated with DAS Tool v1.1.2 (Sieber et al., 2018) and taxonomically classified with GTDB-Tk v1.3.0 (Chaumeil et al., 2019), release 95. MAG completeness and contamination were estimated with CheckM v1.1.2 (Parks et al., 2015). MAGs were annotated with DRAM v1.0 (Shaffer et al., 2020) with default options, except -min_contig_size 1000 bp, and genes of interest were searched in the annotation files as well as via BLASTP and HMM analyses. Only high- and medium-quality MAGs (>50% complete and <10% contaminated) were included in genome-centric analyses, whereas the entire dataset (binned and unbinned contigs) was considered in gene-centric analyses. For phylogenetic trees, sequences were aligned with MUSCLE v3.8.31 (Edgar, 2004), alignment columns were stripped with trimAl v1.4.rev22 (Capella-Gutiérrez et al., 2009) using the option -gappyout, and trees were built with FastTree v2.1.10 (Price et al., 2010) or UBCG v3.0 (Na et al., 2018) and visualized with iTOL v6 (Letunic and Bork, 2021). To calculate average amino acid identity (AAI) between selected MAGs, genes were called with Prodigal v2.6.3 (Hyatt et al., 2010), and the amino acid FASTA files were used as input to the Kostas Lab online tool (http://enve-omics.ce.gatech.edu/g-matrix/index) (Rodriguez-R and Konstantinidis, 2016). The Genome-to-Genome Distance Calculator v3.0 was used online (https://ggdc.dsmz.de/ggdc.php#) (Meier-Kolthoff et al., 2013). Heat maps were constructed with the heatmap.2 function of the gplots package in RStudio v1.3.959 (R v4.0.4).
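The MAG quality cut-off described above (>50% complete and <10% contaminated) can be illustrated with a minimal Python sketch. This is not the script used in the study: the `Mag` record, the function names, and the example values are hypothetical, and the completeness and contamination figures are assumed to be percentages such as those reported by CheckM.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical record for one MAG; values are percentages."""
    name: str
    completeness: float
    contamination: float


def passes_quality(mag, min_completeness=50.0, max_contamination=10.0):
    """Return True if the MAG is >50% complete and <10% contaminated."""
    return (mag.completeness > min_completeness
            and mag.contamination < max_contamination)


def filter_mags(mags):
    """Keep only MAGs that meet the medium/high-quality thresholds."""
    return [m for m in mags if passes_quality(m)]


# Made-up example values, for illustration only.
example_mags = [
    Mag("bin_1", 95.2, 1.3),   # kept
    Mag("bin_2", 48.0, 2.0),   # dropped: not complete enough
    Mag("bin_3", 72.5, 12.4),  # dropped: too contaminated
    Mag("bin_4", 50.0, 9.9),   # dropped: threshold is strict (>50%)
]

kept_names = [m.name for m in filter_mags(example_mags)]
print(kept_names)
```

Both comparisons are strict, following the ">" and "<" signs in the text, so a MAG at exactly 50% completeness is excluded in this sketch.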
Metatranscriptomic reads were quality trimmed with Sickle v1.33 (Joshi and Fass, 2011) using the sickle se (single-end) or pe (paired-end) options for Sanger sequencing (-t sanger). Trimmed transcripts were mapped against the annotated metagenome with Bowtie2 (Langmead and Salzberg, 2012) (bowtie2 -D 10 -R 2 -N 1 -L 22 -i S,0,2.50 -q -a -p 30), allowing only one mismatch. Index stats files were imported into RStudio to calculate Transcripts per Million (TPM) values according to the formula TPM = (number of reads / (gene length / 10^3)) / 10^6, which were used as
