Browse the corpus

Walk the Even Hospital Database by book and chapter — the raw source passages that ground Ask, DDx, and the rest.

96 passages

abstractpubmed· Abstract· item 41851348

An atlas of exposome-phenome associations in health and disease risk. Nongenetic exposures comprising the 'exposome', including diet, lifestyle, infections and pollutants, shape many clinical phenotypes yet the evidence remains fragmented. Here we conducted an exposome-wide association study incorporating 619 exposure indicators and 305 quantitative phenotypes across ten independent waves of the US Centers for Disease Control and Prevention National Health and Nutrition Examination Survey. Replicable and stable signals were most concentrated in cardiometabolic and anthropometric phenotypes, linking objective nutrient biomarkers and lipophilic pollutants with body mass index, glycated hemoglobin and lipid profiles. Triglycerides, an important marker for cardiovascular risk, emerged as the phenotype most strongly associated with multidomain exposures, notably trans fatty acids, persistent pollutants and vitamin E isoforms. In pulmonary traits, tobacco-specific and carcinogen biomarkers were more prominently associated with reduced lung function than short-lived nicotine metabolites, refining exposomic links to forced expiratory volume in 1 s. Whereas individual exposures showed modest effects, aggregate 'poly-exposomic' models explained phenotypic variation comparable to genome-wide polygenic scores. Exposome globes further reveal an interconnected architecture where exposures rarely act in isolation, complicating causal attribution while providing a more holistic view of environmental risk. Our findings highlight which exposures are most likely to add value to disease risk assessment, population surveillance as well as further exposure prioritization and next-generation longitudinal exposomics.

fulltextpubmed· Main· item 41851348

Clinically relevant phenotypes are influenced by both genetics and environmental exposures1–3. Despite this, the structural relationship between the exposome (defined as the totality of environmental exposures in broad physical, chemical and psychosocial domains4,5) and human health remains obscure, characterized by a lack of systematic mapping across its broad domains. Until now, interrogating exposome–phenome relationships has been limited to studies that target a few candidate exposures and phenotypes. These candidate studies are presented selectively in millions of papers on claimed associations, yielding fragmented and often biased snapshots of the exposome–phenotype maze6. While candidate approaches have been successful in identifying factors with large effects, such as smoking7, these millions of studies have so far not yielded robust associations8; moreover, many reported results might be false positives9. For example, in disciplines such as nutritional epidemiology, numerous reported associations of single dietary factors and patterns with disease outcomes have been nonrobust10,11. Analogous debates have arisen in fields studying other domains of the exposome, such as environmental epidemiology12,13. Previous criteria to gauge causality, such as those developed by Bradford Hill7, may not be readily applicable in new exposome epidemiology scenarios, for example, if most of the true associations to be discovered have small effect sizes and lack readily discernible biological plausibility14, analogy, coherence and specificity, and if there is no possibility of validation in experimental studies. Consequently, the opportunity to integrate environmental data into precision medicine remains underrealized.

fulltextpubmed· Main· item 41851348

Precision medicine approaches, however, are dominated by genetic factors. Which exposures, if measured, would meaningfully improve risk stratification or refine prognosis, and how large are those effects relative to demographics and genetics? Many phenotypes routinely used for care, diagnosis, staging and risk prediction, such as lipids (for example, triglycerides), hemoglobin A1C% (A1C%) and fasting glucose, estimated glomerular filtration rate (eGFR)/creatinine, inflammatory markers (for example, C-reactive protein (CRP)), and spirometry (forced expiratory volume in 1 s (FEV₁)), may be partially driven by modifiable exposures. Prioritizing clinical phenotypes by the magnitude and replicability of associations, contextualizing connections between smoking and nutrient biomarkers and quantifying variance explained to gauge utility for risk equations are needed for evaluating the role of the exposome in precision medicine.

fulltextpubmed· Main· item 41851348

Here, we hypothesize that the exposome exhibits a replicable associational architecture where aggregate factors explain clinically relevant phenotypic variance and disease risk. To evaluate this, we systematically quantify these relationships, executing an ‘exposome-wide association study’, establishing the data-driven foundation required to integrate the exposome into precision medicine15.

fulltextpubmed· Results· item 41851348

In brief, we developed an analytic pipeline (Fig. 1 and Extended Data Fig. 1) to analyze data from participants of the US National Health and Nutrition Examination Survey (NHANES)16 in ten serial cross-sectional surveys that were sampled in years 1999–2000, 2001–2002, 2003–2004, 2005–2006, 2007–2008, 2009–2010, 2011–2012, 2013–2014, 2015–2016 and 2017–2018. We cataloged a total of 374 real-valued continuous phenotypes and 810 biomarkers or self-report questionnaire responses that measure pollutant, dietary, infectious or smoking-related exposures across all ten surveys. Supplementary Fig. 1 shows the distribution of demographic characteristics (sex, age, ethnicity, education and income) for each association. The median age was 40 (interquartile range (IQR), 34 to 42) years and the median income-to-poverty ratio was 2.9 (IQR, 2.8 to 2.9). All associations are presented in Supplementary Table 1. The exposure and phenotype catalogs are presented in Supplementary Tables 2 and 3. Examples of exposures and phenotypes are presented in Supplementary Tables 4 and 5.

Fig. 1. Schematic of conducting systematic P-ExWAS. Top left: Samples of the phenotypic domain comprising 305 phenotypes. Top right: Samples of the exposomic domain comprising 619 exposures. These data are harmonized across eight cohort samples of the NHANES 1999–2018. Bottom: Resources to describe the architecture of phenome–exposome associations, including exposome globes, the Exposome–Phenome Atlas and digital resources for conducting P-ExWAS (database and software). Figure created in BioRender; Patel, C. https://biorender.com/d4u5v1p (2026).

fulltextpubmed· Results· item 41851348

We conducted a ‘phenotype-by-exposome-wide association study’ (P-ExWAS), whereby each exposure is related with each phenotype15. We used survey-weighted regression to associate phenotypes with all exposures under nine different modeling scenarios that adjust for demographic and social attributes: (1) the main reported model, which consists of age, age², sex, income (household income index divided by the poverty level), ethnicity (five groups), education (three groups: above high school, high school equivalent and below high school) and survey year (for example, 1999–2000, 2001–2002, 2003–2004, 2005–2006, 2007–2008, 2009–2010, 2011–2012, 2013–2014, 2015–2016 and 2017–2018 as a categorical variable); (2) base model, with no adjustments; (3) sex and survey year; (4) age, age² and survey year; (5) sex, age, age² and survey year; (6) ethnicity and survey year; (7) income, education and survey year; (8) age, age², sex, ethnicity and survey year; and (9) age, age², sex, income, education and survey year.
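As a rough illustration of the modeling scenarios listed above, the sketch below fits a weighted least-squares model and returns the coefficient for a standardized exposure. It is a minimal sketch, not the authors' pipeline: the function name, the column labels in `SCENARIOS` and the use of plain weighted least squares are assumptions, and a real survey-weighted analysis would also need design-based standard errors (strata and sampling units), which are omitted here.

```python
import numpy as np

# Adjustment sets for the nine scenarios described in the text.
# Column labels are hypothetical; "age2" stands for age squared.
SCENARIOS = {
    1: ["age", "age2", "sex", "income", "ethnicity", "education", "survey_year"],
    2: [],
    3: ["sex", "survey_year"],
    4: ["age", "age2", "survey_year"],
    5: ["sex", "age", "age2", "survey_year"],
    6: ["ethnicity", "survey_year"],
    7: ["income", "education", "survey_year"],
    8: ["age", "age2", "sex", "ethnicity", "survey_year"],
    9: ["age", "age2", "sex", "income", "education", "survey_year"],
}

def weighted_standardized_beta(y, x, covariates, weights):
    """Weighted least squares of y on x plus covariates.

    y and x are scaled by their weighted s.d., so the returned coefficient
    is the change in y (in s.d.) per 1-s.d. change in x. Point estimate only.
    """
    w = np.asarray(weights, dtype=float)

    def scale(v):
        v = np.asarray(v, dtype=float)
        mean = np.average(v, weights=w)
        sd = np.sqrt(np.average((v - mean) ** 2, weights=w))
        return (v - mean) / sd

    ys, xs = scale(y), scale(x)
    design = np.column_stack([np.ones_like(xs), xs, covariates])
    sw = np.sqrt(w)  # multiplying rows by sqrt(w) turns OLS into WLS
    coef, *_ = np.linalg.lstsq(design * sw[:, None], ys * sw, rcond=None)
    return coef[1]
```

Categorical covariates such as ethnicity or survey year would need to be expanded into indicator columns before being passed as `covariates`.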

fulltextpubmed· Results· item 41851348

We scaled all continuous exposures and phenotypes by their standard deviation (s.d.; Methods) and ran regression models to obtain standardized β-coefficients, P values and R² (Figs. 2–4). Categorical exposures were compared with predefined reference groups. Statistical significance was defined by a Bonferroni threshold (α ≈ 4 × 10⁻⁷) and the Benjamini–Yekutieli false discovery rate (FDR).

Fig. 2. Associational architecture of the exposome on the phenome. a, The two-sided log₁₀(P value) versus the exposure type. P values are not shown corrected for multiple hypotheses. Red, associations below Bonferroni (4 × 10⁻⁷); green, associations below the FDR (5 × 10⁻⁴) but greater than Bonferroni; blue, associations greater than the FDR. b, Number of significant phenotype–exposure associations per phenotype category (total E associations shown in text above bar). c, Number of significant phenotype–exposure associations per exposome category (total phenotype associations shown in text).

Fig. 3. Variance explained by the exposome across phenotype groups. a, R² for exposure across exposure categories and phenome categories. b, Cumulative distribution of R². The median R² for each phenotypic category is annotated. c, Cumulative distribution of R² across 119k phenotype–exposure associations. The median R² for each exposome category is annotated. Colors for a to c: red, associations below Bonferroni (4 × 10⁻⁷); green, associations below the FDR (5 × 10⁻⁴) but greater than Bonferroni; blue, associations greater than the FDR. d, R² attributable to the exposure versus only to demographics (age, age², ethnicity, income, education and sex). Red, R² attributable to multiple simultaneous exposures (up to ten). Demo., demographics (age, age², ethnicity, income, education and sex).

Fig. 4. Phenome–Exposome Atlas. A total of 305 phenotypes across 18 categories are depicted in the columns and 625 exposures across 18 categories are depicted in the rows. Each entry in the matrix is the linear association (‘adjusted beta’) between exposure and phenotype.
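The two significance thresholds used in the Results text can be sketched as follows. This is an illustrative sketch only: it assumes the Bonferroni threshold is 0.05 divided by the 123,774 tested pairs reported elsewhere in the text (which reproduces the stated value of about 4 × 10⁻⁷), and it implements the textbook Benjamini–Yekutieli step-up rule rather than any code from the study. Function names are hypothetical.

```python
import numpy as np

def bonferroni_threshold(alpha, n_tests):
    """Per-test P value threshold under Bonferroni correction."""
    return alpha / n_tests

def benjamini_yekutieli_cutoff(pvalues, q=0.05):
    """Largest P value rejected by the BY step-up procedure, or None."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    m = p.size
    ranks = np.arange(1, m + 1)
    c_m = np.sum(1.0 / ranks)          # harmonic correction for dependence
    limits = q * ranks / (m * c_m)     # BY critical value for each rank
    passed = np.nonzero(p <= limits)[0]
    return None if passed.size == 0 else float(p[passed[-1]])

# Assumed number of tests: the 123,774 pairs quoted in the text.
print(bonferroni_threshold(0.05, 123_774))  # about 4.04e-07
```

The BY cutoff depends on the full set of observed P values, so the reported value of 5.1 × 10⁻⁴ cannot be reproduced without the underlying data.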

fulltextpubmed· Results· item 41851348

Fig. 4 (continued). Gray shading denotes associations that could not be estimated owing to pairwise missingness or a total sample size lower than 500.

fulltextpubmed· Results· item 41851348

The number of associations across all phenotype–exposure pairs that passed the Bonferroni threshold was 5,674 (5% of 123,774) (Fig. 2a, blue shaded points, and Supplementary Fig. 2). The P value corresponding to an FDR of 5% was 5.1 × 10⁻⁴ (Fig. 2a, blue and red shaded points). The total number of associations that met the FDR of 5% was 15,386 (12%).

fulltextpubmed· Results· item 41851348

The total number of tests conducted per phenotype (n = 305) ranged from 16 to 654 (median of 397). The average percentage of associations (per phenotype) that were Bonferroni significant was 5% (range of 0.25–20%). The most associations were found for serum bilirubin, waist circumference and body mass index (BMI; 20% of tests for these phenotypes were significant for over 640 total tests). We observed a large range of identified associations by exposome or phenome category (Fig. 2b,c). For example, the anthropometric phenome category saw the highest number of associations (13% of phenotypes in this category had a Bonferroni significant association; Fig. 2b and Supplementary Fig. 2). Of the exposome variables, smoking and dietary/nutrient biomarkers were implicated in the most phenotype–exposure associations: ~15% and 13%, respectively (Fig. 2c and Supplementary Fig. 2).

fulltextpubmed· Results· item 41851348

For the reporting of main results, we combined cohort samplings across survey waves to maximize power. We can also examine associations within each of the survey samples to estimate an approximate ‘replication rate’, which we define as the proportion of associations that were nominally significant at a P value threshold of 0.05 in more than one survey sampling. Of the 5,674 associations that were Bonferroni significant, the replication rate was 41% (n = 2,321). By contrast, for those that achieved neither FDR nor Bonferroni significance, the replication rate was 0.8% (n = 867).

fulltextpubmed· Results· item 41851348

Across the atlas of associations, we found that an association (at a P value threshold of 0.05) occurred in two surveys 5% of the time. By contrast, if a phenotype–exposure association achieved FDR significance across all surveys (Fig. 2), P value significance was achieved in greater than two surveys at least 20% of the time (Extended Data Fig. 2). However, phenotype–exposure replication rates vary depending on the number of surveys available for a phenotype–exposure association. Specifically, FDR-significant phenotype–exposure associations assessed in only two surveys were found in both surveys 39% of the time at a P value less than 0.05 (Extended Data Fig. 2). For associations that were at least FDR significant, the median I² was 0, 5, 0, 26, 6, 14, 14, 20 and 18% for associations in 2, 3, 4, 5, 6, 7, 8, 9 and 10 surveys. We also assessed the percentage that were nominally significant in multiple survey waves. For the 1,211 associations estimated in ten survey waves (that is, tested in each of the ten surveys), there were 76%, 11%, 4%, 2.5%, 1%, 1%, 1%, 0.5%, 1% and 1% of associations that were nominally significant in exactly 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 surveys. In other words, the replication rate among 1,211 associations estimated in ten different cohort samples was 13% (n = 161/1,211).
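The replication rate and the I² heterogeneity statistic discussed above can be sketched as below. This is a hedged illustration: the replication rate follows the definition in the text (nominal significance at P < 0.05 in more than one survey), while the I² function uses the standard Cochran's Q formulation with inverse-variance weights, which is an assumption because the text does not state how I² was computed. Function names and the input layout are hypothetical.

```python
import numpy as np

def replication_rate(pvals_by_survey, alpha=0.05):
    """Share of associations with P < alpha in more than one survey.

    pvals_by_survey: 2-D array, rows = associations, columns = surveys.
    NaN marks a survey in which the association was not tested.
    """
    p = np.asarray(pvals_by_survey, dtype=float)
    hits = (p < alpha).sum(axis=1)  # NaN compares as False
    return float(np.mean(hits > 1))

def i_squared(betas, standard_errors):
    """I² (%) from Cochran's Q using inverse-variance weights."""
    b = np.asarray(betas, dtype=float)
    w = 1.0 / np.asarray(standard_errors, dtype=float) ** 2
    pooled = np.sum(w * b) / np.sum(w)
    q = np.sum(w * (b - pooled) ** 2)
    df = b.size - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Reported figure for associations tested in all ten surveys.
print(round(161 / 1211 * 100))  # 13
```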

fulltextpubmed· Results· item 41851348

See Supplementary Fig. 3 for heterogeneity of association across survey samples. We estimated the variance explained (R²) attributable to the exposure variable (after subtracting the potential role of demographic attributes; Fig. 3b–d and Methods). Demographic factors, including age, age², ethnicity, income, education and sex, explained a large range of overall phenotypic variation, from ~0% to 80% (Fig. 3d, x axis). In comparison, single exposures added a median of 0.14% (Fig. 3a,d).
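One common way to obtain the variance attributable to an exposure after accounting for demographics is the difference in R² between nested models. The sketch below assumes that reading; the exact procedure is in the Methods, which are not reproduced here, and the function names are hypothetical. It uses unweighted ordinary least squares for brevity.

```python
import numpy as np

def r_squared(predictors, y):
    """R² of an ordinary least-squares fit with an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()

def exposure_r2(exposure, demographics, y):
    """R² of (demographics + exposure) minus R² of demographics alone."""
    full = r_squared(np.column_stack([demographics, exposure]), y)
    base = r_squared(demographics, y)
    return full - base
```

Because the models are nested, the increment is non-negative up to floating-point error, so a small value such as the reported median of 0.14% reflects a genuinely small added contribution rather than a sign ambiguity.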

fulltextpubmed· Results· item 41851348

The median R² for all associations that were Bonferroni significant was 0.6% (IQR, 0.3% to 1%; 5th to 95th percentile, 0.1% to 3.6%; Fig. 3b,c). The median R² was 0.02% for nonsignificant associations. We observed a range of R² by exposure associations across domains of the phenome and exposome (Fig. 3b,c). For example, phenotypes in the inflammation category had exposures that explained 3% of variance on average across all phenotype–exposure associations (Fig. 3b) that were Bonferroni significant. For exposures, pollutant factors explained on average 0–3% of variation across all phenotypes. Organochlorine exposures explained ~3% of variance on average across all phenotypes. Dietary biomarkers accounted for 1% of variation on average; however, dietary factors measured through an interview explained on average 0.5% of variation in phenotype.

fulltextpubmed· Results· item 41851348

Next, we estimated the R² owing to the additive contribution of 20 exposures simultaneously. For phenotypes that had greater than 20 exposures associated at an FDR level of significance, we imputed exposure data where missing (Methods). When considering 20 exposome factors simultaneously in 119 phenotypes, the median variance explained is 3.5% (IQR, 1.8% to 7.9%), greater than the median R² for single exposures (Supplementary Table 6). When considering all phenotypes with ≤20 exposures in the model, the median R² was 1.6% (IQR, 0.7% to 3.5%; Fig. 3d, red points).

fulltextpubmed· Results· item 41851348

The maximum multiple exposure R² estimated was 43% for triglycerides (Fig. 3d, red points). Triglycerides are an important clinical phenotype used to screen for cardiovascular disease. We found that the aggregate exposome, particularly lipophilic dietary and pollutant-related exposures, explained a large share of the variance in levels of triglycerides in the USA (R² of 43%, the largest of any phenotype) (Supplementary Tables 6 and 7), even after adjusting for total cholesterol. Of the 20 variables in the model (Supplementary Table 8), a trans fatty acid (trans,trans-9,12-octadecadienoic acid), alpha-tocopherol and gamma-tocopherol independently contributed the most to the variance and were positively associated with triglycerides (Supplementary Table 8). Association sizes correspond to the change in the outcome per 1-s.d. increase in exposure for log-transformed continuous exposures or relative to the reference group for categorical variables (Fig. 4, ‘adjusted beta’). For associations between Bonferroni-significant exposome factors and phenotypes (association sizes for a 1-s.d. change in exposome factor), the 5th to 95th percentile range was −0.17 to 0.19 (0.03 to 0.24 in absolute values).

fulltextpubmed· Results· item 41851348

Exposures exhibit a dense correlation web (Fig. 5a,b). The median correlation between exposures was 0.01, and the median absolute value correlation was 0.05 across all correlations. For exposure–exposure correlations that passed the Bonferroni threshold (P < 0.05/201,265), the median for Bonferroni-corrected correlations (alpha threshold of 2 × 10⁻⁷) was 0.19 and the median absolute value correlation was 0.21 (Fig. 5c); the IQR of Bonferroni-corrected correlations was 0.08 to 0.37 (Fig. 5c) (0.11–0.38 for the absolute values). The 95th percentile reached a correlation of 0.69 (0.69 for the absolute values).

Fig. 5. Exposome–exposome correlational globe and distribution of exposure–exposure correlations across the exposome. a, An example exposome globe depicts exposure factors and their correlations chosen randomly (thresholded for absolute values of correlations above 0.25). b, Exposome correlation globe for exposures associated with hemoglobin A1C or BMI. Node colors include: pollutants (red), infection (yellow) and nutrients (red). c, Distribution of exposure–exposure correlations. Gray color denotes randomly selected correlations. Blue line denotes exposome correlations for exposures associated with BMI or hemoglobin A1C. Black denotes correlations that achieved Bonferroni significance. Lines depict correlations at −0.25 and 0.25.

fulltextpubmed· Results· item 41851348

The dense correlational web is described for a sampling of exposures in a correlation globe (Fig. 5a,b and Methods). Figure 5a shows 50 randomly selected correlations sampled from all 201,265 correlations. Figure 5b depicts exposures identified in their association with BMI and A1C%. Correlations are only drawn between nodes (or exposures) whose absolute value of correlation was greater than 0.25; these correlations are among the top 15% of the distribution of correlations (Fig. 5c). Correlations for exposures associated with BMI and A1C% overall have larger correlations than a randomly selected subset (Fig. 5c, blue).
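The edge rule for the correlation globe (draw a link only when the absolute correlation exceeds 0.25) can be illustrated with a short sketch. The function name and input layout are hypothetical, and Pearson correlation on complete data is assumed; the study's handling of missing values and survey weights is not shown.

```python
import numpy as np

def globe_edges(names, data, threshold=0.25):
    """Return (name_i, name_j, r) for exposure pairs with |r| > threshold.

    data: 2-D array with one column per exposure, in the order of `names`.
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                edges.append((names[i], names[j], float(corr[i, j])))
    return edges
```

The resulting edge list can be handed to any graph-drawing tool; the circular layout itself is a presentation choice and is not part of this sketch.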

fulltextpubmed· Results· item 41851348

Next, we computed the difference between adjusted associations (shown in Fig. 4) subtracted from univariate estimates to assess the impact of adjustment. The average difference between a minimally adjusted and a fully adjusted model and each scenario was 0.01 (Extended Data Fig. 3a), demonstrating some bias due to demographic adjustment. The s.d. sizes were the largest for the fully adjusted minus univariate model (s.d. of 0.1) and fully adjusted minus ethnicity model (s.d. of 0.1). The s.d. sizes were the smallest for the fully adjusted minus age, sex and ethnicity models and the fully adjusted minus age, sex and income/education models (Extended Data Fig. 3a).

fulltextpubmed· Results· item 41851348

For some associations, the sign of the association was the opposite depending on the demographic correction scenario. Of the significant findings, 932 out of 5,194 (15% of total significant Bonferroni-identified pairs) exhibited a switch of coefficient sign between the univariate model (a model with no demographic or social factor adjustment) and the adjusted model. For example, BMI and blood cadmium had positive associations (for example, for an increase in exposure, there is a linear increase in BMI); however, when controlling or adjusting for factors in the ‘main’ model, the association becomes stronger (for example, the standard errors are reduced) and opposite in direction (Extended Data Fig. 3b). The difference between adjusted associations also differed per exposome domain (Supplementary Fig. 4). Self-reported dietary nutrients and variables dominate nutrient exposure assessment in epidemiological studies. Overall, we found that 1,452 phenotypes had Bonferroni-significant associations with nutrient variables derived from self-report questionnaires; however, these associations exhibited a median R² of only 0.2%.
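The sign-switch comparison between the univariate and adjusted models described above amounts to counting pairs whose coefficients have opposite signs. A minimal sketch, assuming the two sets of coefficients are aligned arrays (the function name is hypothetical):

```python
import numpy as np

def sign_flips(beta_univariate, beta_adjusted):
    """Count and fraction of pairs whose coefficient sign differs."""
    bu = np.asarray(beta_univariate, dtype=float)
    ba = np.asarray(beta_adjusted, dtype=float)
    flipped = np.sign(bu) * np.sign(ba) < 0  # exact zeros are not counted
    return int(flipped.sum()), float(flipped.mean())
```

For instance, `sign_flips([0.1, -0.2, 0.3], [-0.05, -0.1, 0.2])` flags only the first pair.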

fulltextpubmed· Results· item 41851348

Next, we hypothesized that a dilution of correlation size was due to measurement noise. Self-reported dietary nutrients were assessed on 2 days. The median correlation across 69 dietary nutrient recalls on day 1 versus day 2 was 0.36 (IQR, 0.28 to 0.43). We estimated the correlations of associations (for example, the beta carotene–P association on day 1 versus the beta carotene–P association on day 2) (Extended Data Fig. 4a). Across all levels of significance, we observed a 0.84 correlation between day 1 self-report versus day 2 (Extended Data Fig. 4a). Dietary biomarkers, on the other hand, had a larger median R² of 1% across the 1,101 phenotypes with Bonferroni significance, five times larger than their self-reported counterparts. We also estimated the correlation between biomarkers and their self-reported counterparts (for example, day 1/day 2 average of trans-beta carotene versus serum trans-beta carotene) (Extended Data Fig. 4b). The correlations between the self-reported values and biomarkers were smaller, and we observed a Pearson correlation of 0.52. For those blood nutrient variables that were Bonferroni significant, we observed a correlation of 0.60.

fulltextpubmed· Results· item 41851348

Blood and urine pollutant biomarkers reflect the biological relationship between exposure and excretion. We observed a positive and strong correlation between associations. For example, for blood versus urinary biomarkers, we observed a 0.72 Pearson correlation. When only considering blood biomarkers that were Bonferroni significant, the correlation of phenotype–exposure associations between blood and urine biomarkers was 0.78 for cadmium, 0.96 for cotinine and 0.71 for mercury (Extended Data Fig. 4c).

fulltextpubmed· Results· item 41851348

Smoking is a strong risk factor for reduction of lung function, such as the amount of air expelled (for example, FEV₁). Smoking-related biomarkers such as 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL) and serum cotinine indeed show negative associations with FEV₁, consistent with prior evidence linking tobacco exposure to reduced lung function. Urinary NNAL, a metabolite of a tobacco-specific nitrosamine, showed a stronger negative association with lung function (FEV₁; –0.06 per s.d., R² = 0.2%) compared with serum cotinine (–0.03 per s.d., R² = 0.08%). The difference in association between cotinine and NNAL is consistent with their biological properties: cotinine, a short half-life metabolite of nicotine, primarily captures recent exposure and is subject to greater day-to-day variability, whereas NNAL, a metabolite of the tobacco-specific nitrosamine with a half-life of 10–16 days, provides a more stable marker of cumulative exposure. Nevertheless, numerous densely correlated exposures emerged associated with FEV₁ (Extended Data Fig. 5).

fulltextpubmed· Results· item 41851348

recent exposure and is subject to greater day-to-day variability, whereas NNAL, a metabolite of the tobacco-specific nitrosamine with a half-life of 10–16 days, provides a more stable marker of cumulative exposure. Nevertheless, numerous densely correlated exposures emerged associated with FEV1 (Extended Data Fig. 5). We deployed our ExWAS procedure across timely biomarkers of aging, including epigenetic age (for example, Horvath’s biological age predictor and GrimAge (which includes smoking status)17,18). We also interrogated the indicators of cognitive recall (digit substitution test), as they are used in the clinic to stage cognitive decline in older adults (Supplementary Fig. 5). Volatile organic compounds (VOCs), smoking indicators (cotinine) and physical activity had the strongest associations with cognitive function (Supplementary Fig. 5a,b). We found shared exposure associations (or, ‘shared architecture’) between better cognitive function and other phenotypes interrogated in the population, including higher exhaled nitrous oxide (shared correlation of 0.35) and with urinary creatinine (Supplementary Fig. 5c). The strongest signals for accelerated epigenetic age (such as GrimAgeMort) was associated with smoking, heavy metals and physical activity behavior. Among these domains, physical activity explained the most variance (less than 1% in R2); however, in terms of aggregate total exposomic risk, ten exposures explained 10% of the variance in GrimAgeMort (Supplementary Table 6).

fulltextpubmed· Results· item 41851348

such as GrimAgeMort) was associated with smoking, heavy metals and physical activity behavior. Among these domains, physical activity explained the most variance (less than 1% in R2); however, in terms of aggregate total exposomic risk, ten exposures explained 10% of the variance in GrimAgeMort (Supplementary Table 6). Recognizing that the effect of an exposure may vary depending on an individual’s age at the time of exposure, we modified the ExWAS procedure to consider age by exposure interactions (for example, does the association size differ for individuals at different ages?; Methods). In summary, the inclusion of exposure-by-age interactions resulted in only marginal improvements in the variance explained (R2) for most phenotypes (Extended Data Fig. 6a–c). Most of the additional variance for models that incorporate an interaction term are limited, although there are exceptions. For a pair of exposures, shared ‘associational architecture’ measures the correlation or the similarity of their correlations across all phenotypes (Extended Data Fig. 7a,b). For example, the correlation between associations for blood trans- versus cis-beta carotene was 0.98; in other words, the association coefficients across the phenome were very similar for those two exposures. The associational architecture between serum cotinine and 3-fluorene had 0.90 correlation.

fulltextpubmed· Results· item 41851348

ded Data Fig. 7a,b). For example, the correlation between associations for blood trans- versus cis-beta carotene was 0.98; in other words, the association coefficients across the phenome were very similar for those two exposures. The associational architecture between serum cotinine and 3-fluorene had 0.90 correlation. Exposures from the same category (for example, dietary interview, organochlorine, VOCs and dietary biomarkers) tended to have similar phenotypic associations; the degree of shared associational architecture is larger within categories than across categories. For all correlations within exposure variables that were dietary biomarkers, the median absolute value of shared architecture was 0.2 (IQR, 0.01 to 0.35). Similar shared associational architecture was observed within smoking biomarkers (0.2, IQR, 0.08 to 0.35). The shared associational architecture between dietary biomarkers and self-reported dietary nutrients had a median absolute value correlation of 0.24. We examined the degree of shared associational architecture between phenotypes. The shared associational architecture between BMI and body weight was 0.98. BMI and cardiorespiratory fitness had opposite associational architecture (the sign of the correlations were opposite between BMI and fitness): a correlation of −0.83. A1C% had an opposite architecture compared with high-density lipoprotein cholesterol (correlation of −0.54).

fulltextpubmed· Results· item 41851348

cture between BMI and body weight was 0.98. BMI and cardiorespiratory fitness had opposite associational architecture (the sign of the correlations were opposite between BMI and fitness): a correlation of −0.83. A1C% had an opposite architecture compared with high-density lipoprotein cholesterol (correlation of −0.54). We used data from participants of the UK Biobank to compare genetic versus exposomic predictions. We compared the variance explained due to ~1 M imputed genotypes from genome-wide association studies (GWAS) performed on 29 of the phenotypes19 examined here (Extended Data Fig. 8 and Supplementary Table 7). Across the 29 phenotypes, the median incremental R2 due to genetics was 7.9% (IQR, 2.8% to 9.3%; maximum 21%) and the median incremental R2 due to exposome (20 exposome variables across 39 phenotypes) was 7.9% (IQR, 3.1% to –12%; maximum 57%). We found that the multiple exposome factors, when modeled simultaneously, had explained variance comparable to the entire genomic array across the phenotypes (Extended Data Fig. 8). Specifically, 55% (n = 16) of phenotypes had higher exposomic versus genetic R2 (Extended Data Fig. 8). For example, the total genetic (1 M common single nucleotide polymorphisms) and exposomic (20 factors) variance explained for BMI was similar, at ~10% for both. We benchmarked the atlas against three exposure-wide analyses and found that our findings were concordant (directionally consistent and robust P values) with previous published data found in refs. 20–22.

fulltextpubmed· Exposure-wide associations across the phenome· item 41851348

We conducted a ‘phenotype-by-exposome-wide association study’ (P-ExWAS), in which each exposure is related to each phenotype15. We used survey-weighted regression to associate phenotypes with all exposures under nine modeling scenarios that adjust for demographic and social attributes: (1) the main reported model, which adjusts for age, age2, sex, income (household income index divided by the poverty level), ethnicity (five groups), education (three groups: above high school, high school equivalent and below high school) and survey year (1999–2000, 2001–2002, 2003–2004, 2005–2006, 2007–2008, 2009–2010, 2011–2012, 2013–2014, 2015–2016 and 2017–2018, as a categorical variable); (2) the base model, with no adjustments; (3) sex and survey year; (4) age, age2 and survey year; (5) sex, age, age2 and survey year; (6) ethnicity and survey year; (7) income, education and survey year; (8) age, age2, sex, ethnicity and survey year; and (9) age, age2, sex, income, education and survey year.
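The nine adjustment scenarios can be written down as covariate sets, one regression formula per scenario. The following is a minimal sketch only: the variable names (`bmi`, `log_blood_cadmium`, `survey_year` and so on) are hypothetical placeholders rather than NHANES field names, and the survey-weighted fitting step itself is not shown.

```python
# Covariate sets for the nine adjustment scenarios listed above.
# All variable names are hypothetical placeholders, not NHANES field names.
SCENARIOS = {
    "main": ["age", "age2", "sex", "income", "ethnicity", "education", "survey_year"],
    "base": [],  # no adjustments
    "sex": ["sex", "survey_year"],
    "age": ["age", "age2", "survey_year"],
    "sex_age": ["sex", "age", "age2", "survey_year"],
    "ethnicity": ["ethnicity", "survey_year"],
    "income_education": ["income", "education", "survey_year"],
    "age_sex_ethnicity": ["age", "age2", "sex", "ethnicity", "survey_year"],
    "age_sex_income_education": ["age", "age2", "sex", "income", "education", "survey_year"],
}


def build_formula(phenotype, exposure, covariates):
    """Return an R-style formula string: phenotype ~ exposure + covariates."""
    terms = [exposure] + list(covariates)
    return f"{phenotype} ~ " + " + ".join(terms)


# One formula per scenario for a single (hypothetical) phenotype-exposure pair.
formulas = {
    name: build_formula("bmi", "log_blood_cadmium", covariates)
    for name, covariates in SCENARIOS.items()
}

for name, formula in formulas.items():
    print(f"{name}: {formula}")
```

Keeping the scenarios in one mapping makes it straightforward to fit the same exposure–phenotype pair under every adjustment set and compare the resulting coefficients.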

fulltextpubmed· Phenotype–exposure associations replicate across cohorts· item 41851348

For the reporting of main results, we combined cohort samplings across survey waves to maximize power. We can also examine associations within each survey sample to estimate an approximate ‘replication rate’, which we define as the proportion of associations that were nominally significant (P < 0.05) in more than one survey sampling. Of the 5,674 associations that were Bonferroni significant, the replication rate was 41% (n = 2,321). By contrast, for those that achieved neither FDR nor Bonferroni significance, the replication rate was 0.8% (n = 867).
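As an illustration of the replication-rate definition above, the sketch below counts an association as replicated when it is nominally significant (P < 0.05) in more than one survey wave. The function names and the toy P values are assumptions made for this example and are not taken from the study.

```python
def is_replicated(wave_p_values, alpha=0.05, min_waves=2):
    """True if the association is nominally significant in at least min_waves waves."""
    hits = sum(1 for p in wave_p_values if p < alpha)
    return hits >= min_waves


def replication_rate(associations):
    """Fraction of associations replicated across waves.

    associations maps an (exposure, phenotype) pair to its per-wave P values.
    """
    if not associations:
        return 0.0
    replicated = sum(1 for p_values in associations.values() if is_replicated(p_values))
    return replicated / len(associations)


# Toy per-wave P values (invented for illustration only).
toy = {
    ("cotinine", "FEV1"): [0.001, 0.03, 0.20],          # significant in 2 waves
    ("cadmium", "BMI"): [0.04, 0.30, 0.60],             # significant in 1 wave
    ("vitamin_e", "triglycerides"): [0.01, 0.02, 0.04], # significant in 3 waves
    ("lead", "height"): [0.50, 0.70, 0.90],             # significant in 0 waves
}

rate = replication_rate(toy)
print(f"replication rate: {rate:.0%}")
```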

fulltextpubmed· Variance explained of the exposome· item 41851348

We estimated the variance explained (R2) attributable to the exposure variable (after subtracting the potential role of demographic attributes; Fig. 3b–d and Methods). Demographic factors, including age, age2, ethnicity, income, education and sex, explained a wide range of overall phenotypic variation, from ~0% to 80% (Fig. 3d, x axis). In comparison, single exposures added a median of 0.14% (Fig. 3a,d). The median R2 for all associations that were Bonferroni significant was 0.6% (IQR, 0.3% to 1%; 5th to 95th percentile, 0.1% to 3.6%; Fig. 3b,c). The median R2 was 0.02% for nonsignificant associations. The R2 of exposure associations varied across domains of the phenome and exposome (Fig. 3b,c). For example, in the inflammation category, exposures explained 3% of variance on average across all Bonferroni-significant phenotype–exposure associations (Fig. 3b). Among exposures, pollutant factors explained on average 0–3% of variation across all phenotypes, and organochlorine exposures explained ~3% of variance on average. Dietary biomarkers accounted for 1% of variation on average, whereas dietary factors measured through an interview explained 0.5% of variation in phenotype on average.
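The idea of exposure-attributable variance, that is, the R2 of a model with the exposure minus the R2 of a demographics-only model, can be illustrated with a deliberately small sketch. It assumes a single demographic covariate and ordinary (unweighted) least squares, for which the two-predictor R2 has a closed form in terms of pairwise Pearson correlations. This is a simplification of the survey-weighted, multi-covariate procedure described above, and the toy data are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def incremental_r2(y, exposure, demographic):
    """R2(y ~ demographic + exposure) minus R2(y ~ demographic).

    Uses the closed-form R2 for a two-predictor linear regression.
    Assumes the exposure and the demographic covariate are not perfectly correlated.
    """
    r_yx = pearson(y, exposure)
    r_yz = pearson(y, demographic)
    r_xz = pearson(exposure, demographic)
    r2_base = r_yz ** 2
    r2_full = (r_yx ** 2 + r_yz ** 2 - 2 * r_yx * r_yz * r_xz) / (1 - r_xz ** 2)
    return r2_full - r2_base


# Toy data: the outcome depends strongly on the demographic covariate
# and weakly on an exposure that is uncorrelated with it.
demographic = [-1.0, -1.0, 1.0, 1.0]
exposure = [-1.0, 1.0, -1.0, 1.0]
outcome = [z + 0.5 * x for z, x in zip(demographic, exposure)]

delta = incremental_r2(outcome, exposure, demographic)
print(f"incremental R2 from the exposure: {delta:.3f}")
```

In this toy case the demographic covariate alone explains 80% of the variance and the exposure adds the remaining 20%. The single-exposure increments reported above are far smaller.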

fulltextpubmed· An atlas across the exposome and phenome· item 41851348

Association sizes correspond to the change in the outcome per 1-s.d. increase in exposure for log-transformed continuous exposures or relative to the reference group for categorical variables (Fig. 4, ‘adjusted beta’). For associations between Bonferroni-significant exposome factors and phenotypes (association sizes for a 1-s.d. change in exposome factor), the 5th to 95th percentile range was −0.17 to 0.19 (0.03 to 0.24 in absolute values).
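To make the per-s.d. scaling concrete, the sketch below log-transforms a continuous exposure and standardizes it, so that a regression coefficient on the scaled variable reads as the change in outcome per 1-s.d. increase in log exposure. The function name and the example values are assumptions for illustration, and the sample s.d. is used here, which may differ from the study's exact choice.

```python
import math
import statistics


def per_sd_scale(exposure_values):
    """Log-transform strictly positive exposure values, then z-score them."""
    logged = [math.log(v) for v in exposure_values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation
    return [(v - mean) / sd for v in logged]


# Toy exposure concentrations (invented); each step doubles the previous value.
scaled = per_sd_scale([0.5, 1.0, 2.0, 4.0, 8.0])
print([round(v, 3) for v in scaled])
```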

fulltextpubmed· Dense correlational web of the exposome· item 41851348

Exposures exhibit a dense correlation web (Fig. 5a,b). The median correlation between exposures was 0.01, and the median absolute value correlation was 0.05 across all correlations. For exposure–exposure correlations that passed the Bonferroni threshold (P < 0.05/201,265, an alpha threshold of 2 × 10^−7), the median correlation was 0.19 and the median absolute value correlation was 0.21 (Fig. 5c); the IQR of these correlations was 0.08 to 0.37 (Fig. 5c) (0.11 to 0.38 for the absolute values). The 95th percentile reached a correlation of 0.69 (0.69 for the absolute values).

Fig. 5: Exposome–exposome correlational globe and distribution of exposure–exposure correlations across the exposome. a, An example exposome globe depicting randomly chosen exposure factors and their correlations (thresholded for absolute values of correlations above 0.25). b, Exposome correlation globe for exposures associated with hemoglobin A1C or BMI. Node colors include: pollutants (red), infection (yellow) and nutrients (red). c, Distribution of exposure–exposure correlations. Gray color denotes randomly selected correlations. Blue line denotes exposome correlations for exposures associated with BMI or hemoglobin A1C. Black denotes correlations that achieved Bonferroni significance. Lines depict correlations at −0.25 and 0.25.
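The Bonferroni threshold quoted for the exposure–exposure correlations is the nominal alpha divided by the number of tests. A short sketch of that arithmetic, with illustrative function names and toy P values, is:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def passes_bonferroni(p_values, alpha, n_tests):
    """Flag each P value as passing (True) or failing (False) the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


# 0.05 divided by the 201,265 exposure-exposure correlations tested.
threshold = bonferroni_threshold(0.05, 201_265)
print(f"threshold = {threshold:.2e}")

# Toy P values (invented) checked against that threshold.
flags = passes_bonferroni([1e-9, 1e-6, 0.04], alpha=0.05, n_tests=201_265)
```

The computed value is roughly 2.5 × 10^−7, which the text rounds to 2 × 10^−7.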

fulltextpubmed· Demographic adjustment influences association sizes· item 41851348

Next, to assess the impact of adjustment, we computed the difference between the adjusted associations (shown in Fig. 4) and the univariate estimates. The average difference between the fully adjusted model and each less-adjusted scenario was 0.01 (Extended Data Fig. 3a), demonstrating some bias due to demographic adjustment. The s.d. of the differences was largest for the fully adjusted minus univariate model (s.d. of 0.1) and the fully adjusted minus ethnicity model (s.d. of 0.1). The s.d. was smallest for the fully adjusted minus age, sex and ethnicity models and the fully adjusted minus age, sex and income/education models (Extended Data Fig. 3a). For some associations, the sign of the association reversed depending on the demographic correction scenario. Of the significant findings, 932 out of 5,194 (15% of total significant Bonferroni-identified pairs) exhibited a switch of coefficient sign between the univariate model (a model with no demographic or social factor adjustment) and the adjusted model. For example, BMI and blood cadmium had a positive association in the univariate model (that is, BMI increases linearly with exposure); however, when adjusting for the factors in the ‘main’ model, the association becomes stronger (for example, the standard errors are reduced) and opposite in direction (Extended Data Fig. 3b). The difference between adjusted associations also differed per exposome domain (Supplementary Fig. 4).
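Counting sign switches between the univariate and adjusted models reduces to comparing the signs of paired coefficients. The sketch below shows one way to do this. The coefficient values are invented, and treating a zero coefficient as having no sign is an assumption of this example.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def count_sign_flips(coefficient_pairs):
    """Count pairs whose univariate and adjusted coefficients have opposite signs."""
    return sum(
        1
        for univariate, adjusted in coefficient_pairs
        if sign(univariate) * sign(adjusted) == -1
    )


# Toy (univariate, adjusted) coefficients, invented for illustration.
toy_pairs = [
    (0.10, -0.05),  # flips from positive to negative
    (0.20, 0.15),   # same sign
    (-0.03, 0.02),  # flips from negative to positive
    (0.00, 0.04),   # zero has no sign, so this is not counted as a flip
]

flips = count_sign_flips(toy_pairs)
share = flips / len(toy_pairs)
print(f"{flips} of {len(toy_pairs)} pairs switch sign ({share:.0%})")
```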

fulltextpubmed· Consistency of associations across exposure categories· item 41851348

Self-reported dietary nutrients and variables dominate nutrient exposure assessment in epidemiological studies. Overall, we found that 1,452 phenotypes had Bonferroni-significant associations with nutrient variables derived from self-report questionnaires; however, these associations exhibited a median R2 of only 0.2%. We hypothesized that this dilution of correlation size was due to measurement noise. Self-reported dietary nutrients were assessed on 2 days. The median correlation across 69 dietary nutrient recalls on day 1 versus day 2 was 0.36 (IQR, 0.28 to 0.43). We estimated the correlations of associations (for example, the beta carotene–phenotype (P) association on day 1 versus the beta carotene–P association on day 2) (Extended Data Fig. 4a). Across all levels of significance, we observed a 0.84 correlation between day 1 and day 2 self-report associations (Extended Data Fig. 4a). Dietary biomarkers, on the other hand, had a larger median R2 of 1% across the 1,101 phenotypes with Bonferroni significance, five times larger than their self-reported counterparts. We also estimated the correlation between biomarkers and their self-reported counterparts (for example, day 1/day 2 average of trans-beta carotene versus serum trans-beta carotene) (Extended Data Fig. 4b). The correlations between the self-reported values and biomarkers were smaller, with a Pearson correlation of 0.52. For those blood nutrient variables that were Bonferroni significant, we observed a correlation of 0.60.

fulltextpubmed· Consistency of exposome associations for lung function· item 41851348

Smoking is a strong risk factor for reduction of lung function, such as the amount of air expelled (for example, FEV1). Smoking-related biomarkers such as 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL) and serum cotinine indeed show negative associations with FEV1, consistent with prior evidence linking tobacco exposure to reduced lung function. Urinary NNAL, a metabolite of a tobacco-specific nitrosamine, showed a stronger negative association with FEV1 (–0.06 per s.d., R2 = 0.2%) than serum cotinine (–0.03 per s.d., R2 = 0.08%). The difference in association between cotinine and NNAL is consistent with their biological properties: cotinine, a short half-life metabolite of nicotine, primarily captures recent exposure and is subject to greater day-to-day variability, whereas NNAL, with a half-life of 10–16 days, provides a more stable marker of cumulative exposure. Nevertheless, numerous densely correlated exposures emerged associated with FEV1 (Extended Data Fig. 5).

fulltextpubmed· Exposome correlates of methylation and cognitive aging· item 41851348

We deployed our ExWAS procedure across timely biomarkers of aging, including epigenetic age (for example, Horvath’s biological age predictor and GrimAge (which includes smoking status)17,18). We also interrogated the indicators of cognitive recall (digit substitution test), as they are used in the clinic to stage cognitive decline in older adults (Supplementary Fig. 5). Volatile organic compounds (VOCs), smoking indicators (cotinine) and physical activity had the strongest associations with cognitive function (Supplementary Fig. 5a,b). We found shared exposure associations (or ‘shared architecture’) between better cognitive function and other phenotypes interrogated in the population, including higher exhaled nitric oxide (shared correlation of 0.35) and urinary creatinine (Supplementary Fig. 5c). The strongest signals for accelerated epigenetic age (such as GrimAgeMort) were associated with smoking, heavy metals and physical activity behavior. Among these domains, physical activity explained the most variance (less than 1% in R2); however, in terms of aggregate total exposomic risk, ten exposures explained 10% of the variance in GrimAgeMort (Supplementary Table 6).

fulltextpubmed· Age by exposome interactions· item 41851348

Recognizing that the effect of an exposure may vary depending on an individual’s age at the time of exposure, we modified the ExWAS procedure to consider age-by-exposure interactions (for example, does the association size differ for individuals at different ages?; Methods). In summary, the inclusion of exposure-by-age interactions resulted in only marginal improvements in the variance explained (R2) for most phenotypes (Extended Data Fig. 6a–c). The additional variance explained by models that incorporate an interaction term is mostly limited, although there are exceptions.

fulltextpubmed· Shared associational architecture between exposures· item 41851348

For a pair of exposures, shared ‘associational architecture’ measures the correlation, or similarity, of their associations across all phenotypes (Extended Data Fig. 7a,b). For example, the correlation between associations for blood trans- versus cis-beta carotene was 0.98; in other words, the association coefficients across the phenome were very similar for those two exposures. The associational architecture between serum cotinine and 3-fluorene had a correlation of 0.90. Exposures from the same category (for example, dietary interview, organochlorine, VOCs and dietary biomarkers) tended to have similar phenotypic associations; the degree of shared associational architecture is larger within categories than across categories. For all correlations within exposure variables that were dietary biomarkers, the median absolute value of shared architecture was 0.2 (IQR, 0.01 to 0.35). Similar shared associational architecture was observed within smoking biomarkers (0.2; IQR, 0.08 to 0.35). The shared associational architecture between dietary biomarkers and self-reported dietary nutrients had a median absolute value correlation of 0.24. We also examined the degree of shared associational architecture between phenotypes. The shared associational architecture between BMI and body weight was 0.98. BMI and cardiorespiratory fitness had opposite associational architecture (the signs of their associations were opposite), with a correlation of −0.83. A1C% had an opposite architecture compared with high-density lipoprotein cholesterol (correlation of −0.54).
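Shared associational architecture between two exposures can be sketched as the Pearson correlation between their vectors of association coefficients, taken over the phenotypes they have in common. The coefficients and names below are invented for illustration, and the plain Pearson correlation is an assumption about the exact statistic used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def shared_architecture(exposure_a, exposure_b, betas):
    """Correlate two exposures' association coefficients across shared phenotypes."""
    common = sorted(set(betas[exposure_a]) & set(betas[exposure_b]))
    return pearson(
        [betas[exposure_a][p] for p in common],
        [betas[exposure_b][p] for p in common],
    )


# Toy association coefficients per phenotype (invented for illustration).
betas = {
    "trans_beta_carotene": {"bmi": -0.12, "hdl": 0.08, "a1c": -0.05, "triglycerides": -0.10},
    "cis_beta_carotene": {"bmi": -0.06, "hdl": 0.04, "a1c": -0.025, "triglycerides": -0.05},
    "cotinine": {"bmi": 0.05, "hdl": -0.06, "a1c": 0.02, "triglycerides": 0.07},
}

same_family = shared_architecture("trans_beta_carotene", "cis_beta_carotene", betas)
opposite = shared_architecture("trans_beta_carotene", "cotinine", betas)
print(f"carotene isomers: {same_family:.2f}, carotene vs cotinine: {opposite:.2f}")
```

The same function applies to phenotypes if the mapping is transposed, so that each phenotype carries a vector of coefficients across exposures.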

fulltextpubmed· Comparison with GWAS· item 41851348

We used data from participants of the UK Biobank to compare genetic versus exposomic predictions. We compared the variance explained due to ~1 M imputed genotypes from genome-wide association studies (GWAS) performed on 29 of the phenotypes19 examined here (Extended Data Fig. 8 and Supplementary Table 7). Across the 29 phenotypes, the median incremental R2 due to genetics was 7.9% (IQR, 2.8% to 9.3%; maximum 21%) and the median incremental R2 due to exposome (20 exposome variables across 29 phenotypes) was 7.9% (IQR, 3.1% to 12%; maximum 57%). We found that multiple exposome factors, when modeled simultaneously, explained variance comparable to the entire genomic array across the phenotypes (Extended Data Fig. 8). Specifically, 55% (n = 16) of phenotypes had higher exposomic than genetic R2 (Extended Data Fig. 8). For example, the total genetic (1 M common single nucleotide polymorphisms) and exposomic (20 factors) variance explained for BMI was similar, at ~10% for both. We benchmarked the atlas against three exposure-wide analyses and found that our findings were concordant (directionally consistent and robust P values) with previously published data in refs. 20–22.

fulltextpubmed· Discussion· item 41851348

Our systematic mapping of the exposome onto the phenome reveals three insights with direct clinical and biomedical implications. First, robust environmental signals are highly concentrated in cardiometabolic and pulmonary phenotypes used to stage and gauge care, establishing lipids (specifically triglycerides), glycemic markers and lung function (FEV1) as the highest-yield targets for our data-driven environmental risk assessment. Objective nutrient biomarkers and lipophilic pollutants are reproducible correlates of BMI, glycated hemoglobin and lipid profiles. Triglycerides stood out as the phenotype most strongly linked to multidomain exposure patterns, with trans fatty acids, banned persistent pollutants (for example, polychlorinated compounds) and vitamin E isoforms among the most informative contributors, suggesting that lipid risk assessment, which is important for staging cardiovascular disease, may be sensitive to integrated dietary and pollutant chemical contexts. In pulmonary traits, tobacco-specific biomarkers showed stronger and more stable associations with reduced lung function than short-lived nicotine metabolites, supporting the clinical utility of longer half-life biomarkers when refining smoking-related risk for FEV1 and related outcomes (for example, chronic pulmonary disorder). Second, we demonstrate that while single exposures have modest association sizes, aggregate ‘poly-exposomic’ profiles explain phenotypic variance comparable to genome-wide polygenic scores; this suggests that multifactor environmental exposome integration is required to meaningfully improve precision risk models beyond age and sex. Third, we identify a critical reliance on objective measurement: biomarkers (for example, serum nutrients and urinary tobacco metabolites) consistently revealed biomedical associations that self-reported history failed to capture.
Collectively, these findings move the field beyond fragmented and nonobjective associations, defining the specific clinical domains and measurement modalities necessary to operationalize the exposome in biomedical research to evaluate medical decision-making.

fulltextpubmed· Discussion· item 41851348

Our findings have several implications. Estimating association sizes and their replication rate across exposure domains helps to prioritize which exposures are most likely to yield clinically meaningful signals and can guide study design and power planning for ExWAS in new cohorts (Extended Data Table 1). Most of the exposome tabulated here adds little incremental clinically relevant predictive value for many phenotypes, but a smaller set, especially in cardiometabolic and smoking-related domains, appears more promising for refining risk equations. We note that cancer-related phenotypes are underrepresented in our atlas and are ripe for further research. Our data suggest that clinically useful environmental risk stratification will more often require integrated, multiexposure models (for example, ref. 23) rather than isolated markers. Demographics remain essential for risk adjustment, and modest exposome signal strength probably reflects measurement limitations and the cross-sectional nature of many exposure assessments.

fulltextpubmed· Discussion· item 41851348

Most exposures show broad, nonspecific associations across phenotypes and are often correlated with other exposures24–26, complicating causal interpretation and attribution and underscoring the need to view high-priority signals in the context of exposure ‘mixtures’, globes and bundles. This is exemplified by smoking, where biomarker indicators of the behavior with different half-lives may capture distinct time windows of exposure relevant to lung function; however, the exposures are all related to one another and make a complex globe. Inferred exposure–phenotype relationships are sensitive to analytical choices and confounding control27, especially for age, sex and socioeconomic factors, reinforcing the need for transparent modeling and framing. Future work should test whether top signals persist under alternative adjustment strategies and in longitudinal settings, and the exposome field may need to configure model specifications per exposome or phenotype domain and explore mediation and interaction28.

fulltextpubmed· Discussion· item 41851348

Objective biomarkers appear more consistent and informative than self-reported measures, with strong concordance across blood and urine heavy metal indicators and far weaker signals for dietary recalls compared with nutrient biomarkers. This supports prioritizing standardized biospecimen assays when the goal is clinical translation for precision medicine or robust population surveillance.

fulltextpubmed· Discussion· item 41851348

We have several technical recommendations for implementation of ExWAS (Supplementary Table 9). Exposomics, at present, is associational discovery. To move toward causal attribution, we recommend a triangulated strategy (for example, ref. 29). First, prioritize top exposure–phenotype pairs, ranked by replication rate, effect size, P values and vibration of effects30, for targeted follow-up. Second, establish temporality in disease-specific longitudinal cohorts by measuring exposures at baseline and relating them to time-to-event or longitudinal trajectories; one can also apply instrumental-variable approaches, including Mendelian randomization31, where genetic variants serve as stand-ins for exposures. Third, aim for ‘functional exposomics’ and measure more granularly and precisely, using proteomics, metabolomics and methylomics to map exposure–responsive pathways and test mediation (for example, refs. 32,33). For behavioral exposure bundles (for example, diet), investigators should move to randomized interventions.

fulltextpubmed· Discussion· item 41851348

Our study has limitations, with many directions to evaluate next. Despite cataloging hundreds of factors, we capture only a fraction of the total exposome; characterizing complex exposure–exposure and gene–environment interactions34 will require larger sample sizes and broader, high-resolution chemical profiling. While our reliance on objective biomarkers yields stronger signals than the self-reported or geospatial proxies often used in other biobanks, direct cross-cohort comparison is complicated by differences in sampling frames (for example, volunteer bias in cohorts such as UK Biobank35). Furthermore, our systematic evaluation suggests that many associations reported in previous candidate–exposure literature may be false positives9. The cross-sectional design limits causal inference and the capture of cumulative lifetime exposures. Although we attempted to model participant age as modifying exposure–phenotype relationships, delineating critical windows of susceptibility or nonlinear temporal dynamics ultimately requires longitudinal designs to distinguish chronic accumulation from acute reverse-causal effects. The exposomic architectures measured here are chronic but they may also be acute, such as glucose response to diet and physical activity36,37. Frontier studies will incorporate dynamic and personalized exposome and phenotype measurements such as continuous glucose monitoring devices to ascertain heterogeneity or person-specific responses to the exposome38.

fulltextpubmed· Discussion· item 41851348

Our atlas will complement existing global efforts to document exposure–phenotype relationships. For example, there are numerous biobanks and cohorts with phenotype measures and, to some extent, exposure measures, such as those documented in the Human Health and Exposure Analysis Resource (https://hhearprogram.org/data-center) (Extended Data Table 1). A recent serum-only exposome mapping in a Chinese cohort (You et al.)39 prioritized breadth and population representativeness, interrogating 267 blood chemicals in 5,700 volunteers. We view our atlas as a complement to these efforts and expect it to help devise standards by which to catalog associations40 to enhance longer-term reproducibility.

fulltextpubmed· Discussion· item 41851348

Emerging efforts are expanding the molecular exposome with increasingly precise high-resolution assays, including targeted panels and high-resolution mass spectrometry (for example, ref. 41), but the next major advance will be enhanced measurement paired with expanded study designs. In particular, repeated and personalized exposure profiling that spans chronic burdens and acute signals will be essential for moving to causal implication of the exposome in disease. Such measurement-rich longitudinal studies will further prioritize the most promising exposure domains for follow-up, clarify temporality, explain personal heterogeneity and identify modifiable drivers of clinical phenotypes. As these studies mature, the field will be positioned to define causally attributable exposures that can be targeted through behavioral, pharmaceutical, environmental or policy interventions and/or incorporated into predictive models (for example, ref. 42) for individual risk assessment in the clinic. In this way, our ExWAS serves as a prerequisite for systematic, large-scale integration of exposomic information at the point of care.