Exploiting omics to measure genome-by-environment interactions
The objective of this study is to identify the extent to which the impact of causal genes on obesity risk is influenced by individuals’ lifestyle choices, e.g., diet or level of physical activity. Obesity is a severe health problem associated with a large number of conditions, including cardiovascular disease, diabetes, and several types of cancer. Obesity and obesity-related traits such as body mass index (BMI, an index derived from a person’s height and weight that categorises individuals as underweight, normal weight, overweight, or obese) have a large genetic component. Although estimates of the heritability of BMI (i.e., the proportion of the phenotypic variation that is driven by genetic variation) range between 30 and 60% (XIA et al. 2016), most individual genetic variants associated with BMI (detected through genome-wide association studies, GWAS) have very small effects (YENGO et al. 2018). Genetic variation therefore clearly plays a major role in BMI variation within a population, but we can only pinpoint a small proportion of the genetic effects responsible. The observed increase in obesity prevalence in the western world, the so-called obesity epidemic, is generally attributed to “environmental” causes, including lack of physical exercise and high-calorie diets (GORTMAKER et al. 2011; SWINBURN et al. 2011). Geographic differences in health outcomes seem to be driven by these lifestyle and socioeconomic differences, rather than by genetic differences (AMADOR et al. 2015; AMADOR et al. 2017).
On top of the independent effects that genetics and lifestyle have on obesity-related traits, several studies have revealed that specific genetic variants can have different effects on health outcomes depending on lifestyle (GRAFF et al. 2017; TYRRELL et al. 2017). In other words, obesity-related traits might be under the influence of gene-by-environment interactions, where the combined effect of genes and lifestyles is not just the sum of their independent effects. Challenges to understanding gene-by-environment interactions include their predicted small effect sizes, which make them difficult to detect in cohorts with small numbers of participants (VISSCHER et al. 2012), and the lack of consistent and objective measurements of relevant lifestyle variables, such as dietary behaviours, in study questionnaires.
Our analyses in cohorts like UK Biobank and Generation Scotland have shown that we can estimate the contributions of interactions between levels of self-reported environmental variables and individuals’ genetic backgrounds using an appropriate statistical analysis (technically, a mixed linear model). This approach analyses the interactions at a genome-wide level by measuring both genomic and lifestyle similarity between individuals and combining the two to capture these “genome-by-environment” interactions. The method thus measures the effect of sharing both genes and lifestyles, over and above their separate effects. We have previously analysed the impact of genome-by-environment interactions using factors like smoking and alcohol intake from self-reported questionnaires and have shown that these have substantial effects. However, inherent limitations in the way environmental data are collected may reduce analytical power. Differences between individuals in these types of environment are likely to lie on a continuous scale, whereas most cohort information about them is recorded as categorical, which restricts the number of groups to which individuals can belong. A way to overcome this limitation would be to include other sources of information as objective proxies for the environments of interest. We have explored this in Generation Scotland cohort data by using DNA methylation as a proxy for smoking. Preliminary results showed that interactions between an individual’s genome and their smoking status can be modelled using methylation sites associated with smoking. In the same model, we estimated the heritability of BMI to be ~50% and self-reported smoking status to explain 2% of BMI variation.
However, a much larger share, 22% of BMI variation (on top of the 50% explained by genomic variation), was estimated to be driven by smoking-associated methylation. This suggests that DNA methylation can provide an objective and quantitative measurement of smoking that improves our models compared with a self-reported measurement, which may miss relevant information such as passive smoking. Genome-by-smoking interactions explained an extra 10% of BMI variation (in addition to the independent direct effects of genetics and the environment, i.e., including smoking), whether modelled using self-reported status or smoking-associated methylation (Figure 1). These results show that omics data can be used as a proxy for environmental variation.
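The similarity-based model described above can be sketched in code. This is a minimal illustration, not the pipeline used in our analyses. It assumes that genomic similarity (G) and environmental similarity (E, here from smoking-associated methylation sites) are each computed as a cross-product of standardised features, and that the genome-by-environment similarity is their element-wise product. This is one common formulation, and a real analysis would likely rescale the interaction matrix and estimate the variance components by restricted maximum likelihood with dedicated software. All function names, the simulated data and the residual share are hypothetical.

```python
import numpy as np


def standardise(features):
    """Centre each column (marker or methylation site) and scale it to unit variance."""
    x = np.asarray(features, dtype=float)
    sd = x.std(axis=0)
    if np.any(sd == 0):
        raise ValueError("remove constant columns before standardising")
    return (x - x.mean(axis=0)) / sd


def similarity_matrix(features):
    """Individuals-by-individuals similarity: Z Z' divided by the number of columns."""
    z = standardise(features)
    return z @ z.T / z.shape[1]


def interaction_matrix(G, E):
    """Genome-by-environment similarity as the element-wise (Hadamard) product."""
    return G * E


def phenotypic_covariance(G, E, var_g, var_e, var_gxe, var_res):
    """Covariance implied by the model: genetic + environment + interaction + residual."""
    n = G.shape[0]
    return (var_g * G + var_e * E
            + var_gxe * interaction_matrix(G, E)
            + var_res * np.eye(n))


# Simulated stand-ins for real data (assumption: 50 individuals,
# 200 SNP genotypes coded 0/1/2, 30 smoking-associated methylation sites).
rng = np.random.default_rng(0)
n = 50
genotypes = rng.binomial(2, 0.3, size=(n, 200)).astype(float)
genotypes = genotypes[:, genotypes.std(axis=0) > 0]  # drop monomorphic markers
methylation = rng.normal(size=(n, 30))

G = similarity_matrix(genotypes)
E = similarity_matrix(methylation)

# Illustrative variance shares only: 0.50, 0.22 and 0.10 echo the proportions
# quoted in the text; the residual 0.18 is an assumption so the shares sum to one.
V = phenotypic_covariance(G, E, var_g=0.50, var_e=0.22, var_gxe=0.10, var_res=0.18)
print(V.shape)
```

In this sketch the interaction matrix is large only for pairs of individuals who resemble each other both genetically and in their methylation profile, which is how the model captures the effect of sharing genes and lifestyle over and above their separate effects.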
Here we propose to exploit the association between different omics (methylomics, gut microbiomics and metabolomics) and lifestyle to estimate the impact of genome-by-environment interactions on obesity-related traits. We would like to replicate the smoking-associated methylation results observed in Generation Scotland and perform equivalent analyses using microbiomics and metabolomics as proxies for dietary behaviours. Several studies have shown associations between gut microbial communities and health outcomes, in particular obesity-related traits (TURNBAUGH et al. 2006; BENSON 2016). The gut microbiota is modulated by an individual’s diet and responds rapidly to dietary changes (e.g., increasing or decreasing fat content, fibre intake, etc.) (SONNENBURG AND BÄCKHED 2016). Several associations have been identified between dietary and microbiome features (e.g., relative abundance of different taxa or metabolic pathways) (ZHERNAKOVA et al. 2016; SANNA et al. 2019), and some of these features have been shown to influence the efficiency of food utilisation in model organisms (DELGADO et al. 2019). Similarly, studies have uncovered associations of metabolites with nutrients and dietary patterns. Although sample sizes are still small, these studies show that metabolomics can play a relevant role in nutritional epidemiology by improving the assessment of dietary intake (GUASCH-FERRÉ et al. 2020). We hypothesise that these omics can serve as proxies for individuals’ diets, allowing us to estimate the contribution of the environment, and of its interactions with genetic markers, to obesity variation. We believe Lifelines is the ideal cohort in which to replicate and extend these analyses, which will broaden our knowledge of obesity, because it is richly characterised with omics data and has an ideal population structure.