Large-scale genome-wide association meta-analyses of physical activity
Physical activity is associated with a substantially lower risk of metabolic disease1 and cardiovascular disease2. Compared with the least active study participants, highly active participants have a median 30-40% lower risk of coronary heart disease2. Physical activity is also associated with many other health outcomes, including lower cancer risk, improved bone health, improved cognition, lower dementia risk, improved sleep quality, decreased risk of depression and anxiety, and greater overall quality of life and lifespan1. In 2008, the US issued its first national physical activity guidelines. However, fewer than a quarter of the US population meet the current guidelines3. Assuming that the associations described above are causal, this lack of adherence is estimated to result in $117 billion in annual health care costs and a 10% increase in premature mortality1.
Recently, the UK Biobank (UKB) has enabled well-powered genome-wide association studies (GWAS) of physical activity. However, the GWAS published thus far have made suboptimal use of the available data: they focused either on the small subset of participants with accelerometer data4 or, for questionnaire data, exclusively on single-question variables related to exercise behaviors or moderate and vigorous activities5. This is unfortunate, as occupation, commuting, housework, and walking may be important sources of physical activity, especially for individuals who do not otherwise “exercise” during leisure time. Moreover, no meta-analysis with other large samples, such as 23andMe and the Million Veteran Program, has yet been performed. Thus, it is unclear whether the currently available GWAS results are externally valid and whether they are sufficiently powered for genetically informed follow-up analyses.
In this project, we propose to leverage the UKB questionnaire data by using all available questions on physical activity to derive a data-driven composite measure of overall physical activity. Meta-analyzing Lifelines and the Million Veteran Program with the UKB (N = 438k) and 23andMe (N = 272k) would result in a combined sample size of over one million individuals, making this one of the largest GWAS of any trait performed to date. The GWAS would provide a well-powered basis for causal analyses (e.g., Mendelian randomization6–9 or GIV regression10) and other types of genetically informed follow-up analyses, such as gene-by-environment interaction analyses. Taken together, this GWAS could substantially enhance our understanding of the causal effects of physical activity on physical and mental health.
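
As a rough illustration of the meta-analysis step, the sketch below combines per-cohort association Z-scores for a single variant using sample-size weighting (weights proportional to the square root of N), a scheme often used when a phenotype is measured on different scales across cohorts. This is an assumption made for illustration only: the proposal does not specify a weighting scheme, the function names are hypothetical, and the Z-scores are made-up placeholders. Only the two sample sizes are taken from the text.

```python
import math


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-cohort Z-scores for one variant, weighting each by sqrt(N)."""
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need exactly one sample size per Z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    # Dividing by sqrt(sum of squared weights) keeps the combined statistic
    # standard normal under the null hypothesis.
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a Z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical per-cohort Z-scores for a single variant (NOT real results).
cohort_z = [3.0, 2.5]
# Sample sizes quoted in the text for the UKB and 23andMe.
cohort_n = [438_000, 272_000]

z_meta = sample_size_weighted_z(cohort_z, cohort_n)
p_meta = two_sided_p(z_meta)
print(f"meta-analysis Z = {z_meta:.3f}, two-sided p = {p_meta:.2e}")
```

When the cohorts agree in direction, the combined Z-score exceeds either cohort's own Z-score, which is the gain in power that motivates pooling the samples. If effect sizes and standard errors were comparable across cohorts, an inverse-variance weighted scheme would be an alternative.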