Milea Timbergen

fragments of 32 base pairs (bp) by cutting 16 bp downstream from the methylated CpG sites, which allows specific focus on the methylated regions. The MeD-seq analyses were essentially carried out as previously described 28, 29. In brief, DNA samples were digested with LpnPI (New England Biolabs, Ipswich, MA, USA), resulting in fragments of 32 bp around a fully methylated recognition site that contains a CpG. These short DNA fragments were further processed using a ThruPlex DNA-seq 96D kit (cat#R400407, Rubicon Genomics, Ann Arbor, MI, USA) and a Pippin system. Stem-loop adapters were blunt-end ligated to repaired input DNA and amplified with a high-fidelity polymerase to include dual-indexed barcodes, generating an indexed Illumina NGS library. The amplified end product was purified on a Pippin HT system with 3% agarose gel cassettes (Sage Science, Beverly, MA, USA). Multiplexed samples were sequenced on Illumina HiSeq2500 systems as 50 bp single reads according to the manufacturer's instructions. Dual-indexed samples were demultiplexed using bcl2fastq software (Illumina, San Diego, CA, USA).

MeD-seq data analysis

Data processing was carried out using custom scripts written in Python. The proprietary Python script is used in the context of an exclusive license from the Erasmus Medical Center with Methylomics BV. Raw fastq files were subjected to Illumina adaptor trimming, and reads were filtered for the presence of an LpnPI restriction site 13-17 bp from either the 5' or the 3' end of the read. Reads that passed the filter were mapped to hg38 using bowtie2. Genome-wide individual LpnPI site scores were used to generate read count scores for the following annotated regions: transcription start sites (TSS; 1 kb before and 1 kb after), CpG-islands and gene bodies (from 1 kb after the TSS to the transcription end site (TES)). Gene and CpG-island annotations were downloaded from ENSEMBL (www.ensembl.org).
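The read filter described above can be illustrated with a minimal sketch. This is not the proprietary script used in the study. The function name `passes_site_filter` is hypothetical, a plain CpG dinucleotide ("CG") stands in for the full LpnPI recognition sequence, and the distance is assumed to be the number of bases between the motif and the respective read end.

```python
def passes_site_filter(read, motif="CG", lo=13, hi=17):
    """Return True if `motif` lies lo..hi bp from either end of `read`.

    Assumptions (not taken from the original pipeline):
    - `motif` is a plain string; "CG" is a simplified stand-in for the
      LpnPI recognition site.
    - The distance to the 5' end is the number of bases before the motif;
      the distance to the 3' end is the number of bases after it.
    """
    read = read.upper()
    n, k = len(read), len(motif)
    start = read.find(motif)
    while start != -1:
        dist_5p = start            # bases preceding the motif
        dist_3p = n - (start + k)  # bases following the motif
        if lo <= dist_5p <= hi or lo <= dist_3p <= hi:
            return True
        # continue searching one position further (allows overlapping hits)
        start = read.find(motif, start + 1)
    return False


# Toy reads: only those with a site in the expected window are kept.
reads = [
    "A" * 15 + "CG" + "A" * 15,  # site 15 bp from both ends
    "CG" + "A" * 30,             # site at the very start of the read
]
kept = [r for r in reads if passes_site_filter(r)]
```

Because LpnPI cuts at a fixed distance from its recognition site, a genuine digestion product should carry the site near the middle of the fragment. Reads without a site in this window are more likely to be background, and a filter of this kind discards them before mapping.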
Differentially methylated regions (DMRs) were detected between two datasets containing the regions of interest (TSS, gene body or CpG-islands) using the Chi-square test on read counts. Significance was called at a p-value < 0.05 after multiple-testing correction, using either the Bonferroni correction or the false discovery rate (FDR) according to the Benjamini-Hochberg procedure. In addition, a genome-wide sliding window was used to detect sequentially differentially methylated LpnPI sites. Statistical significance was called between LpnPI sites in predetermined groups using the Chi-square test. Neighbouring significantly called LpnPI sites were binned and reported. The overlap of genome-wide detected DMRs with TSS, CpG-island and gene body regions was annotated and reported. DMR thresholds were based on LpnPI site count. The fold changes of read counts applied before hierarchical clustering are given in the figure legends. The differentially methylated datasets generated and analysed during the current study have been deposited to the Sequence Read Archive (SRA)
