Building Comprehensive Statistical Workflows for the Large-scale Analysis of Human Cohorts: Application to the Study of Urine Metabolome Physiological Variations with Age, Body Mass Index and Gender
Etienne Thevenot, Research Scientist, CEA
Biomarker discovery through metabolomics profiling in urine by liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) has shown promising results in many kidney, liver, and bladder diseases, including cancers. To prevent potential confounding effects, characterizing the physiological variations of the metabolome is of high interest, but such information is currently scarce in databases. Robust feature selection, in turn, requires the development of comprehensive statistical workflows. We combined univariate testing and orthogonal partial least-squares (OPLS) modeling to study the variations with age, body mass index, and gender of 170 annotated metabolites in urine samples from 183 volunteers. We identified 108 metabolites whose concentrations vary with at least one of the physiological factors. Correlation analysis revealed three clusters of metabolites (enriched in hormone steroids, acyl-carnitines, amino acids, acyl-glycines, or xenobiotics), which allowed stratification of the cohort. The full data analysis pipeline (signal drift correction, outlier detection, univariate testing, and OPLS modeling) was integrated into the Workflow4Metabolomics.org (W4M; [1]) open platform for computational metabolomics, which provides experimenters with high-performance and user-friendly functionalities to analyze large and complex datasets. This first large-scale, untargeted analysis of physiological variations in urine thus provides new biological and statistical insights into the variations of the metabolome and the discovery of biomarkers.
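As a rough illustration of what the univariate testing step can look like when many metabolites are tested against one factor such as age, the sketch below runs a permutation test on the Pearson correlation for one metabolite and then adjusts a set of p-values for multiple testing. This is not the W4M implementation: the choice of a permutation test and of the Benjamini-Hochberg correction are assumptions made for the example, and the function names (`pearson`, `permutation_pvalue`, `benjamini_hochberg`) and the data are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any association between x and y
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Synthetic example (made-up numbers, not data from the study).
age = list(range(20, 40))
metabolite = [2.0 * a + 1.0 for a in age]  # intensity rising with age
p_value = permutation_pvalue(age, metabolite)
adjusted = benjamini_hochberg([p_value, 0.04, 0.03, 0.5])
```

In a real analysis, one p-value would be computed per metabolite and per factor before adjustment, and metabolites whose adjusted p-value falls below a chosen threshold would be retained for further modeling.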