Kaikki aineistot
Lisää
Abstract Optimizing research on the developmental origins of health and disease (DOHaD) involves implementing initiatives maximizing the use of the available cohort study data; achieving sufficient statistical power to support subgroup analysis; and using participant data presenting adequate follow-up and exposure heterogeneity. It also involves being able to undertake comparison, cross-validation, or replication across data sets. To answer these requirements, cohort study data need to be findable, accessible, interoperable, and reusable (FAIR), and more particularly, it often needs to be harmonized. Harmonization is required to achieve or improve comparability of the putatively equivalent measures collected by different studies on different individuals. Although the characteristics of the research initiatives generating and using harmonized data vary extensively, all are confronted by similar issues. Having to collate, understand, process, host, and co-analyze data from individual cohort studies is particularly challenging. The scientific success and timely management of projects can be facilitated by an ensemble of factors. The current document provides an overview of the ‘life course’ of research projects requiring harmonization of existing data and highlights key elements to be considered from the inception to the end of the project.
Abstract Early life is an important window of opportunity to improve health across the full lifecycle. An accumulating body of evidence suggests that exposure to adverse stressors during early life leads to developmental adaptations, which subsequently affect disease risk in later life. Also, geographical, socio-economic, and ethnic differences are related to health inequalities from early life onwards. To address these important public health challenges, many European pregnancy and childhood cohorts have been established over the last 30 years. The enormous wealth of data of these cohorts has led to important new biological insights and important impact for health from early life onwards. The impact of these cohorts and their data could be further increased by combining data from different cohorts. Combining data will lead to the possibility of identifying smaller effect estimates, and the opportunity to better identify risk groups and risk factors leading to disease across the lifecycle across countries. Also, it enables research on better causal understanding and modelling of life course health trajectories. The EU Child Cohort Network, established by the Horizon2020-funded LifeCycle Project, brings together nineteen pregnancy and childhood cohorts, together including more than 250,000 children and their parents. A large set of variables has been harmonised and standardized across these cohorts. The harmonized data are kept within each institution and can be accessed by external researchers through a shared federated data analysis platform using the R-based platform DataSHIELD, which takes relevant national and international data regulations into account. The EU Child Cohort Network has an open character. All protocols for data harmonization and setting up the data analysis platform are available online. The EU Child Cohort Network creates great opportunities for researchers to use data from different cohorts, during and beyond the LifeCycle Project duration. It also provides a novel model for collaborative research in large research infrastructures with individual-level data. The LifeCycle Project will translate results from research using the EU Child Cohort Network into recommendations for targeted prevention strategies to improve health trajectories for current and future generations by optimizing their earliest phases of life.
Abstract The Horizon2020 LifeCycle Project is a cross-cohort collaboration which brings together data from multiple birth cohorts from across Europe and Australia to facilitate studies on the influence of early-life exposures on later health outcomes. A major product of this collaboration has been the establishment of a FAIR (findable, accessible, interoperable and reusable) data resource known as the EU Child Cohort Network. Here we focus on the EU Child Cohort Network’s core variables. These are a set of basic variables, derivable by the majority of participating cohorts and frequently used as covariates or exposures in lifecourse research. First, we describe the process by which the list of core variables was established. Second, we explain the protocol according to which these variables were harmonised in order to make them interoperable. Third, we describe the catalogue developed to ensure that the network’s data are findable and reusable. Finally, we describe the core data, including the proportion of variables harmonised by each cohort and the number of children for whom harmonised core data are available. EU Child Cohort Network data will be analysed using a federated analysis platform, removing the need to physically transfer data and thus making the data more accessible to researchers. The network will add value to participating cohorts by increasing statistical power and exposure heterogeneity, as well as facilitating cross-cohort comparisons, cross-validation and replication. Our aim is to motivate other cohorts to join the network and encourage the use of the EU Child Cohort Network by the wider research community.
Abstract The current epidemics of cardiovascular and metabolic noncommunicable diseases have emerged alongside dramatic modifications in lifestyle and living environments. These correspond to changes in our ”modern” postwar societies globally characterized by rural-to-urban migration, modernization of agricultural practices, and transportation, climate change, and aging. Evidence suggests that these changes are related to each other, although the social and biological mechanisms as well as their interactions have yet to be uncovered. LongITools, as one of the 9 projects included in the European Human Exposome Network, will tackle this environmental health equation linking multidimensional environmental exposures to the occurrence of cardiovascular and metabolic noncommunicable diseases.
Abstract Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene–trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.
Abstract Few genome-wide association studies (GWAS) account for environmental exposures, like smoking, potentially impacting the overall trait variance when investigating the genetic contribution to obesity-related traits. Here, we use GWAS data from 51,080 current smokers and 190,178 nonsmokers (87% European descent) to identify loci influencing BMI and central adiposity, measured as waist circumference and waist-to-hip ratio both adjusted for BMI. We identify 23 novel genetic loci, and 9 loci with convincing evidence of gene-smoking interaction (GxSMK) on obesity-related traits. We show consistent direction of effect for all identified loci and significance for 18 novel and for 5 interaction loci in an independent study sample. These loci highlight novel biological functions, including response to oxidative stress, addictive behaviour, and regulatory functions emphasizing the importance of accounting for environment in genetic analyses. Our results suggest that tobacco smoking may alter the genetic susceptibility to overall adiposity and body fat distribution.
Abstract Large consortia have revealed hundreds of genetic loci associated with anthropometric traits, one trait at a time. We examined whether genetic variants affect body shape as a composite phenotype that is represented by a combination of anthropometric traits. We developed an approach that calculates averaged PCs (AvPCs) representing body shape derived from six anthropometric traits (body mass index, height, weight, waist and hip circumference, waist-to-hip ratio). The first four AvPCs explain >99% of the variability, are heritable, and associate with cardiometabolic outcomes. We performed genome-wide association analyses for each body shape composite phenotype across 65 studies and meta-analysed summary statistics. We identify six novel loci: LEMD2 and CD47 for AvPC1, RPS6KA5/C14orf159 and GANAB for AvPC3, and ARL15 and ANP32 for AvPC4. Our findings highlight the value of using multiple traits to define complex phenotypes for discovery, which are not captured by single-trait analyses, and may shed light onto new pathways.
Abstract In many species, the offspring of related parents suffer reduced reproductive success, a phenomenon known as inbreeding depression. In humans, the importance of this effect has remained unclear, partly because reproduction between close relatives is both rare and frequently associated with confounding social factors. Here, using genomic inbreeding coefficients (FROH) for >1.4 million individuals, we show that FROH is significantly associated (p < 0.0005) with apparently deleterious changes in 32 out of 100 traits analysed. These changes are associated with runs of homozygosity (ROH), but not with common variant homozygosity, suggesting that genetic variants associated with inbreeding depression are predominantly rare. The effect on fertility is striking: FROH equivalent to the offspring of first cousins is associated with a 55% decrease [95% CI 44–66%] in the odds of having children. Finally, the effects of FROH are confirmed within full-sibling pairs, where the variation in FROH is independent of all environmental confounding.
Abstract Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.