Aster Harder

GENERAL DISCUSSION 225 10 Therefore, powerful mathematical approaches are necessary to couple biological knowledge with biological omics data. There are two main statistical difficulties with integrating data: (1) the dimensionality, were the number of variables is greater than the number of samples and (2) the development of algorithms that are able to integrate and analyse biological information based on the most up-to-date information.119 Currently there is an enormous number of pathway databases that store pathway topology information and that can be used for data integration, focussing on different networks such as databases on metabolic pathways, signalling pathways, transcription factor targets, gene regulatory networks, genetic interactions, protein-compound interactions, and protein-protein interactions.156 Some well-known pathway databases are KEGG, Gene Ontology and REACTOME. Important to note is that not all database update their information as regularly as others. Pathway analysis software help to combine the information from the omics platforms with the pathway databases and are able to perform all statistical and mathematical computations. Although all software uses different statistical methods they fundamentally test the same thing, the possibility that any given pathway is represented by the high throughput data.156 Choosing the best software platform depends on the hypothesis that needs answering and the user skills. In this thesis we have used DEPICT, FOCUS, TWAS, FUMA and LDSC-SEG for the identification of relevant tissues, cell types and pathways.103, 153-155, 157, 158 Multi-omics Integrated approaches combine single-layer omics data to understand the interplay of molecules and help bridge the gap from genotype to phenotype. Integrative approaches can be more or less stringent on the types of omics considered as input, some methods are designed for a specific combination of datasets, while others are more general. In addition tools can differ in respect to sequential and simultaneous analysis of multiple layers. In addition different types of mathematics are used in current integration tools.159 The most challenging part of the field are the mathematical methodologies and interpretation of data, because of the complexity of biological systems, the technological limits, the many biological variables and the relatively low number of biological samples.159, 160 In addition, data quality remains crucial for each omics layer, hence the community should also focus on standardization of sample quality, sample analysis pipelines, data analysis pipelines and data formats. Given the vast amount of data on genomics of cluster headache and migraine in previous studies, and in this thesis (Chapters 6, 7 and 8), together with the ever growing field of metabolites and metabolomics (Chapters 2 - 5) a promising future research field would be integration of omics data.

RkJQdWJsaXNoZXIy MTk4NDMw