Multiblock and Multilevel Analysis in Metabolomics
Johan Trygg, Professor, Umea University
A central task in metabolomics and systems biology is the integrative analysis of multiple ‘omics’ datasets to predict gene function, identify new targets and characterize the systematic interactions in biological processes. Many obstacles remain before this can succeed. Extracting and integrating useful information from these large, complex data sets is a nontrivial task that requires parallel development in data integration and visualization. The overwhelming size and complexity of the data produced by ‘omics’ technologies have therefore driven biology towards the adoption of multivariate and chemometric methods.
Methods capable of analysing several different sets of variables (possibly massive in number) measured on the same set of samples are called multi-block methods. Within the orthogonal projections to latent structures (OPLS) family, the O2PLS and OnPLS methods in particular have been shown to provide a suitable model structure for integrating and analysing multiple ‘omics’ data sets.
Multi-level analysis is used for modelling time series from biological systems, which is essential for understanding their dynamic responses to perturbations. In metabolomics, time-dependent, dynamic data are becoming increasingly common. One problem is that metabolomics data from human studies are often characterized by large variation between subjects. Such inter-subject variation must be handled explicitly so that the analysis can focus on the often smaller treatment or perturbation effects that are of primary interest.
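As an illustration only, one common way to separate inter-subject variation from the within-subject (treatment or time) effect is to subtract each subject's own mean profile before any multivariate modelling. The sketch below assumes repeated measurements per subject; the function name split_between_within and the toy data are hypothetical and are not taken from the talk.

```python
import numpy as np

def split_between_within(X, subjects):
    """Split X (samples x variables) into grand mean, between-subject
    and within-subject parts, so that X = grand_mean + between + within."""
    X = np.asarray(X, dtype=float)
    subjects = np.asarray(subjects)
    grand_mean = X.mean(axis=0)
    Xc = X - grand_mean                      # remove the overall offset
    between = np.zeros_like(Xc)
    for s in np.unique(subjects):
        rows = subjects == s
        between[rows] = Xc[rows].mean(axis=0)  # subject-specific mean profile
    within = Xc - between                    # what changes inside each subject
    return grand_mean, between, within

# Hypothetical toy data: 3 subjects, 2 time points each, 2 metabolites.
subjects = np.array([0, 0, 1, 1, 2, 2])
X = np.array([
    [10.0, 5.0], [11.0, 5.5],
    [20.0, 8.0], [21.0, 8.5],
    [30.0, 2.0], [31.0, 2.5],
])
grand_mean, between, within = split_between_within(X, subjects)
```

In this toy example the subjects differ strongly in baseline level, yet the within part shows the same small shift between time points for every subject. A multivariate model fitted to the within part would therefore concentrate on that shared response rather than on the baseline differences.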