Keywords: linear mixed model; genetic correlation
Authors: M. Pirinen1, C. Benner1, T. Lehtimäki2, J. G. Eriksson3,4,5, O. T. Raitakari6,7, M. Järvelin8,9,10, V. Salomaa3, S. Ripatti1,11,12; 1Institute for Molecular Medicine Finland, Helsinki, Finland, 2Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, 3Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Finland, 4Department of General Practice and Primary Health Care, University of Helsinki, Helsinki, Finland, 5Unit of General Practice, Helsinki University Central Hospital, Helsinki, Finland, 6Department of Clinical Physiology and Nuclear Medicine, University of Turku and Turku University Hospital, Turku, Finland, 7Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland, 8Department of Epidemiology and Biostatistics, MRC Health Protection Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College London, London, United Kingdom, 9Institute of Health Sciences, University of Oulu, Oulu, Finland, 10Biocenter Oulu, University of Oulu, Oulu, Finland, 11Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom, 12Hjelt Institute, University of Helsinki, Helsinki, Finland.
Abstract: Several modern technologies, such as nuclear magnetic resonance and mass spectrometry platforms in metabolomics, produce high-dimensional phenotype data on individuals. A first step towards utilising high-dimensional phenotypes in genetic studies is to understand how their genetic components are related.
Recent algorithmic advances in multivariate linear mixed models have enabled variance component estimation for pairs of traits using population samples of individuals and genome-wide panels of SNPs. However, current methods have not been tailored for situations where hundreds of traits are available on the same set of individuals. For such settings, we introduce an algorithm that efficiently decomposes pairwise phenotypic correlations into genetic and environmental components.
We illustrate our approach with an application to 105 pairs of metabolic and anthropometric traits measured on up to 14,000 Finnish individuals. For example, we estimate that the observed phenotypic correlation (-0.41) between triglycerides (TG) and HDL cholesterol decomposes into an additive genetic correlation (-0.59, s.e. 0.06) and an environmental correlation (-0.36, s.e. 0.02).
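As an illustration of what such a decomposition means, the following minimal sketch applies the standard quantitative-genetics identity r_P = sqrt(h2_x * h2_y) * r_G + sqrt((1 - h2_x) * (1 - h2_y)) * r_E, which weights the genetic and environmental correlations by the heritabilities of the two traits. It is not the estimation algorithm described in this abstract. The function name and the heritability values are hypothetical, since the abstract does not report heritabilities; only the two correlation values are taken from the text.

```python
import math


def phenotypic_correlation(h2_x, h2_y, r_g, r_e):
    """Combine genetic and environmental correlations into a phenotypic one.

    h2_x, h2_y: heritabilities of the two traits (each in [0, 1]).
    r_g, r_e:   additive genetic and environmental correlations.
    Returns (total, genetic_part, environmental_part).
    """
    for h2 in (h2_x, h2_y):
        if not 0.0 <= h2 <= 1.0:
            raise ValueError("heritability must lie in [0, 1]")
    # Genetic contribution: weighted by the geometric mean of the heritabilities.
    genetic_part = math.sqrt(h2_x * h2_y) * r_g
    # Environmental contribution: weighted by the non-heritable fractions.
    environmental_part = math.sqrt((1.0 - h2_x) * (1.0 - h2_y)) * r_e
    return genetic_part + environmental_part, genetic_part, environmental_part


# ASSUMPTION: heritabilities of 0.25 are placeholders, not values from the study.
h2_tg, h2_hdl = 0.25, 0.25
# Correlation estimates quoted in the abstract for TG and HDL cholesterol.
r_p, g_part, e_part = phenotypic_correlation(h2_tg, h2_hdl, r_g=-0.59, r_e=-0.36)
print(f"genetic part: {g_part:.4f}, environmental part: {e_part:.4f}, total: {r_p:.4f}")
```

Under this identity, a strong genetic correlation can still contribute only a modest share of the phenotypic correlation when heritabilities are low, because its weight is the geometric mean of the two heritabilities.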
We discuss the interpretation of genetic correlations as correlations between locus-wise genetic effects and characterise settings where prior information about genetic correlation increases statistical power to identify pleiotropic loci, i.e. loci that contribute to multiple traits.