Colocalization of GWAS and eQTL signals at loci with multiple signals identifies additional candidate genes for body fat distribution

Integration of genome-wide association study (GWAS) signals with expression quantitative trait loci (eQTL) studies enables identification of candidate genes. However, evaluating whether nearby signals may share causal variants, termed colocalization, is affected by the presence of allelic heterogeneity, different variants at the same locus impacting the same phenotype. We previously identified eQTL in subcutaneous adipose tissue from 770 participants in the Metabolic Syndrome in Men (METSIM) study and detected 15 eQTL signals that colocalized with GWAS signals for waist–hip ratio adjusted for body mass index (WHRadjBMI) from the Genetic Investigation of Anthropometric Traits consortium. Here, we reevaluated evidence of colocalization using two approaches, conditional analysis and the Bayesian test COLOC, and show that providing COLOC with approximate conditional summary statistics at multi-signal GWAS loci can reconcile disagreements in colocalization classification between the two tests. Next, we performed conditional analysis on the METSIM subcutaneous adipose tissue data to identify conditionally distinct or secondary eQTL signals. We used the two approaches to test for colocalization with WHRadjBMI GWAS signals and evaluated the differences in colocalization classification between the two tests. Through these analyses, we identified four GWAS signals colocalized with secondary eQTL signals for FAM13A , SSR3 , GRB14 and FMO1 . Thus, at loci with multiple eQTL and/or GWAS signals, analyzing each signal independently enabled additional candidate genes to be identified.


Introduction
Genome-wide association studies (GWAS) have discovered thousands of loci associated with hundreds of complex diseases and related traits (www.ebi.ac.uk/gwas), yet the underlying genes that influence disease susceptibility often remain unknown. One approach to identify causal genes is to map expression quantitative trait loci (eQTL) that contribute to variation in gene expression level (1)(2)(3), and then assess evidence of colocalization between overlapping GWAS and eQTL signals-that is, we test whether the same variant(s) is(are) likely responsible for the signals in both studies. Several studies have interrogated available databases/resources for eQTL, and identifying GWAScolocalized eQTL has enabled identification and interpretation of likely functional genes and potential biological pathways underlying the disease/trait associations (4)(5)(6)(7)(8)(9)(10). Previously, we analyzed subcutaneous adipose tissue gene expression using microarrays in 770 participants in the Metabolic Syndrome in Men (METSIM) study and identified novel eQTL colocalized with 109 GWAS loci for cardiometabolic diseases and traits, suggesting new candidate genes mediating the variant associations with cardiometabolic disorders (11).
Allelic heterogeneity, in which more than one genetic variant at the same locus influences the same phenotype, is a common characteristic of complex traits (12). Fine-mapping at GWAS loci routinely identifies many loci with multiple conditionally distinct association signals (defined as signals that remain or become significantly associated with the outcome after modeling the effect of other nearby signals) that increase the proportion of phenotypic variance explained by genetic variation at the locus (13)(14)(15)(16). Fine-mapping at eQTL loci also has identified eQTL with multiple conditionally distinct signals (7); (17); (18). Identifying multi-signal loci is becoming more common as sample sizes of eQTL studies increase (14); (19), and testing for colocalization at each signal within a locus will help identify additional candidate genes that contribute to a trait. For example, two eQTL signals were identified in peripheral blood for the gene FAM117B at the total cholesterol locus FAM117B. After accounting for linkage disequilibrium (LD), only the secondary eQTL signal was colocalized with the total cholesterol GWAS signal (18), emphasizing the utility of conditional analysis.
Body fat distribution is a heritable trait related to cardiometabolic risk (20); (21). One GWAS by the Genetic Investigation of Anthropometric Traits (GIANT) consortium reported 68 conditionally distinct signals at 49 loci (49 primary signals and 19 additional signals after accounting for primary signals) associated with waist-to-hip ratio adjusted for body mass index (WHRadjBMI), a measure of body fat distribution (10). Based on available eQTL resources, the consortium reported that the lead GWAS variants at 21 of these 49 loci were in strong LD (r 2 > 0.8) with variants associated with transcript levels in subcutaneous adipose tissue, omental adipose tissue, liver and/or blood (10). Using our subcutaneous adipose eQTL data from METSIM, we reported 15 WHRadjBMI-eQTL colocalized signals, including seven GWAS-colocalized eQTL at six loci that had not been detected by the GIANT consortium (11). However, both analyses were limited to primary eQTL signals.
Here, we extended our analysis of initial, primary subcutaneous adipose tissue eQTL in the METSIM study to identify secondary eQTL signals. We evaluate colocalization of primary and secondary eQTL signals with primary and secondary GWAS signals for WHRadjBMI (10). We identify colocalization by pairwise LD and conditional analysis on the lead GWAS and lead eQTL variants (10); (11), and compare our findings to those obtained using COLOC, a Bayesian test of colocalization (22). The results demonstrate the value of separating signals at eQTL and GWAS loci to identify additional candidate cis-regulated genes that may influence disease etiology.

Results
In the METSIM microarray study of subcutaneous adipose tissue, we previously identified primary cis-eQTL that showed association between a genetic variant and expression level of at least one gene (FDR < 1%, equivalent to P < 2.4 × 10 −4 in genomescale eQTL mapping) (11). Here, we focused on the lead variants for each of 68 conditionally distinct GWAS signals identified previously at 49 WHRadjBMI loci (10). Of these 68 variants, 40 were associated with expression level of at least one gene within 1 Mb (FDR < 1%), while 28 were not associated with expression level of any gene. Some variants were associated with expression level of more than one gene; the 40 variants were associated with expression level of a total of 71 genes. For each of the 71 genes, we also identified the variant that exhibited the strongest association with expression level, which we denote as the lead eSNP. We further define a 'signal pair' as a lead GWAS variant that is associated with a gene's expression level, and the lead eSNP for that gene. Details on the 71 signal pairs are in Supplementary Material, Table S1.
We then used two strategies to determine whether the GWAS signals were colocalized with the primary eQTL signals for those genes. First, we assessed colocalization through two criteria of LD and conditional analysis (11), requiring both high pairwise LD (r 2 > 0.8) between the lead GWAS variant and the lead eSNP, as well as attenuation of the eQTL signal in conditional analysis (conditional P > 2.4 × 10 −4 , see Materials and Methods). Then, we assessed colocalization using the Bayesian test COLOC, defining colocalization based on a high posterior probability that a single shared variant is responsible for both signals (PP4 > 0.8) (22).

Colocalization of primary eQTL signals with WHRadjBMI GWAS signals
Among the 71 pairs of GWAS and primary eQTL signals, 20 were classified as colocalized by LD and conditional analysis (Table 1). Only 15 of the 20 signal pairs were reported in our previous study (11) due to differences in software to identify eQTL. Similarly, based on LD and subcutaneous adipose tissue eQTL data from other studies (10), the GIANT consortium had described 9 of the 20 signal pairs as colocalized. New colocalized The effect size and P-value of the association between the lead GWAS variant and the gene expression level. GWAS-eQTL signals identified here include a GWAS signal near VEGFA colocalized with an eQTL for VEGFA, a GWAS signal at HOXC4-HOXC6 colocalized with an eQTL for HOXC4, and a GWAS signal near PBRM1 colocalized with eQTL for both GNL3 and NEK4 (Table 1). Of the 20 GWAS-eQTL signal pairs classified as colocalized by LD and conditional analysis, 15 were also classified as colocalized by COLOC (PP4 ≥ 0.8), but 5 were not (PP4: 0-0.76, Table 1). COLOC did not identify any additional colocalizations that LD and conditional analysis did not also classify as colocalizations. Two of the five signals COLOC did not classify as colocalized had marginal PP4 values (0.76 at the NKX3 GWAS locus for an STC1 eQTL and 0.66 for a secondary GWAS signal at the NT5DC locus for a C3orf78 eQTL). Since prior probabilities can play an important role in the posterior expectations in COLOC (23) and our priors were conservative, we carried out sensitivity analysis to address whether altering the priors could lead to different posterior probabilities. When we increased the prior probability that a shared causal variant influences both WHRadjBMI and gene expression level from 1 × 10 −6 to 5 × 10 −6 , the PP4 posterior probabilities increased from 0.66 to 0.91 for C3orf78 and from 0.76 to 0.94 for STC1, respectively (Supplementary Material, Table S2). As expected, the 15 colocalized signals discovered with the conservative priors showed stronger Bayesian evidence of colocalization as the priors became less stringent.
Three remaining putative colocalizations (based on LD and conditional analysis) had low PP4 values (PP4 < 0.8) even with more lenient priors. These colocalizations were found at two GWAS loci that each consists of more than one distinct GWAS signal, CCDC92-ZNF664 and NFE2L3-SNX10 (Supplementary Material, Table S3) (10); (24). Since the presence of more than one causal signal within a GWAS locus is expected to limit COLOC's power to detect colocalizations (23), we used a tool for Genome-wide Complex Trait Analysis (GCTA) conditional analysis to estimate residual GWAS association statistics for each signal, conditioning on the effect of the other nearby signals (see Materials and Methods) (24). We then provided COLOC the GCTA residual summary statistics, which should mitigate the impact of multiple significant signals in the region. Using GCTA's approximate conditional summary statistics of the GWAS data at these three loci, COLOC identified the same three colocalizations with eQTL signals detected by conditional analysis (Table 2). At the CCDC92-ZNF664 locus, the secondary (rs863750) GWAS signal was colocalized with the primary eQTL for ZNF664 (PP4 = 0.98, Table 2, Fig. 1). At the NFE2L3-SNX10 locus, the secondary (rs1534696) GWAS signal was colocalized with the primary eQTL for both SNX10 (PP4 = 0.99) and CBX3 (PP4 = 0.99, Table 2). Overall, COLOC and conditional analysis had high agreement in colocalization classification, with some differences that could be due to the assigned priors in the Bayesian test and/or thresholds to define colocalization.
Given that METSIM study participants are all men, we examined whether six colocalizations between METSIM eQTL signals and WHRadjBMI GWAS signals that had exhibited sex heterogeneity can be detected in other eQTL data from men-only and women-only analyses. We compared evidence of colocalization of the six signals with the lead adipose eQTL signals from men and women in the Diabetes Epidemiology: Collaborative analysis of Diagnostic criteria in Europe (deCODE) study and the female-only TwinsUK Multiple Tissue Human Expression Resource (MuTHER) study (10); (25). As shown in Supplementary Material, Table S4, four of the signal pairs colocalized with the eQTL in women-only data [the lead eSNPs for all four are in strong LD (r 2 ≥ 0.91) with the lead GWAS variants], showing The effect size and P-value of the association between the GWAS variant and the gene expression level.  that the primary eQTL signals for TNFAIP8, ADAMTS9, SNX10 and CBX3 are not sex-specific. Our results are consistent with existing literature: previously, WHRadjBMI signals have shown largely similar evidence of colocalization with adipose eQTL signals from men and women, even for GWAS loci with significant evidence of sex heterogeneity (10).

Colocalization of secondary eQTL signals with WHRadjBMI GWAS signals
We next tested for secondary eQTL signals at the 71 genes that were associated with WHRadjBMI GWAS signals. We performed association analyses with gene expression level while including the lead eSNP as a covariate in the regression model, and defined the new variant that exhibited the strongest association in the conditional analysis as the secondary eSNP. We restricted the analysis to secondary signals, rather than testing for further signals (that is, tertiary and beyond), due to limited power to detect these smaller effects.
After adjusting for the lead eSNP, lead variants for 37 conditionally distinct GWAS signals were associated with expression level of 51 genes (FDR < 1%, P < 2.4 × 10 −4 ) (Supplementary Material, Table S5). Conditional analysis classified four GWAS signals as colocalized with a secondary eQTL signal, but not the primary eQTL signal for the following four genes: the FAM13A GWAS signal was colocalized with the secondary eQTL for FAM13A (r 2 = 1.00, conditional P ≥ 0.37), the GRB14-COBLL1 GWAS signal was colocalized with the secondary eQTL for GRB14 (r 2 = 0.93, conditional P ≥ 0.39), the LEKR1 GWAS signal was colocalized with the secondary eQTL for SSR3 (r 2 = 0.94, conditional P ≥ 0.43) and the GORAB GWAS signal was colocalized with the secondary eQTL for FMO1 (r 2 = 1.00, conditional P = 1.00) ( Table 3). As with colocalization of primary eQTL signals, COLOC did not identify additional signals as colocalized that were not also discovered by LD and conditional analysis. COLOC did classify the first three GWAS signals as colocalized with the secondary eQTL (PP4 ≥ 0.92); at the GORAB locus, the PP4 value very narrowly missed the colocalization classification threshold (PP4 = 0.79). We next describe these four loci in further detail.
At the FAM13A WHRadjBMI locus, the lead GWAS variant rs9991328 was associated with the expression level of FAM13A (P = 1.0 × 10 −32 in primary eQTL analysis, Table 3 and Fig. 2). While COLOC found suggestive evidence for colocalization between the lead GWAS variant and the lead eSNP, rs10024506 (PP4 = 0.73), the two variants were in very weak LD (r 2 = 0.02). After controlling for the effect of the FAM13A lead eSNP (rs10024506), we identified a secondary eQTL signal represented by rs2290782 (eQTL P uncond = 1.0 × 10 −32 , P cond = 2.6 × 10 −29 when adjusting for lead eSNP rs10024506). In contrast to the primary eSNP, the secondary eSNP was in strong LD with the lead GWAS variant (r 2 = 1.00), and both COLOC and conditional analysis provided strong evidence of colocalization (COLOC PP4 = 1.00, both conditional P > 0.37). Our findings suggest that FAM13A might be a functional gene mediating the genetic influence of this GWAS locus on body fat distribution. This conclusion is bolstered by recent studies that linked the FAM13A and Fam13a genes to adipose morphology and adipose tissue function in human and mice (26).
At the GRB14-COBLL1 WHRadjBMI locus, the lead GWAS variant rs10195252 was associated with the expression level of GRB14 (P = 3.8 × 10 −6 in primary eQTL analysis), but neither LD nor COLOC supported colocalization with the primary eQTL signal rs1474249 (PP4 = 0.25, LD r 2 = 0.00) ( Table 1; Supplementary Material, Fig. S1). After controlling for the effect of the primary eSNP,   we observed a secondary eQTL signal for GRB14 at rs1128249 (P cond = 4.7 × 10 −6 , Table 3). The secondary eSNP was in high LD with the lead GWAS variant (r 2 = 0.93), and both COLOC and conditional analysis provided strong evidence of colocalization (COLOC PP4 = 0.92, conditional P = 0.72). To further explore the relationship between GRB14 and WHRadjBMI, we tested for the association between cardiometabolic traits and gene expression level in METSIM. While higher expression level of GRB14 was not significantly associated with WHRadjBMI (P = 0.07), it was significantly associated with several related traits, including higher fasting plasma insulin (P = 5.9 × 10 −6 ), higher HOMA-β (P = 1.1 × 10 −5 ), lower Matsuda index (P = 1.4 × 10 −4 ) and higher fasting plasma proinsulin (P = 1.5 × 10 −4 ) (Supplementary Material, Table S6). These findings are consistent with previous observations of improved glucose homeostasis and enhanced insulin signaling in Grb14-deficient mice (27), and prioritize GRB14 as a candidate gene potentially mediating the WHRadjBMI association at this locus. The third and fourth examples of WHRadjBMI GWAS signals that colocalized with secondary eQTL signals identified different genes than those that colocalized with primary eQTL signals. The GWAS signal at LEKR1 (lead variant rs17451107) colocalized with the primary eQTL signal for TIPARP (Table 1; Supplementary Material, Fig. S2), as well as the secondary eQTL signal for SSR3, located ∼500 kb away (PP4 = 0.94; LD r 2 = 0.98; Table 3; Supplementary Material, Fig. S2). Similarly, the GWAS signal at GORAB (lead variant rs10919388) colocalized with a primary eQTL at PRRX1 (Table 1; Supplementary Material, Fig. S3) and with the secondary eQTL for FMO1 (PP4 = 0.79; LD r 2 = 1.00; Table 3; Supplementary Material, Fig. S3). These results suggest that these GWAS loci may be mediated through altered expression levels of either or both genes.

Discussion
We show examples of how colocalization between GWAS and eQTL signals can be influenced by the presence of multiple GWAS signals at a locus or multiple eQTL signals for the same gene. In our study of 49 GWAS loci for WHRadjBMI and primary eQTL from subcutaneous adipose tissue from the METSIM study, we describe 20 colocalized signal pairs. At two loci with multiple GWAS signals, COLOC initially was unable to identify signal pairs as colocalized, despite complete LD between the secondary lead GWAS variant and lead eSNP (r 2 = 1.00). Because the presence of multiple signals violates COLOC's assumptions and likely reduces power to detect a true colocalization, we provided the program with estimated residual GWAS association summary statistics, conditioning on the neighboring signal. Using approximate conditional summary statistics of the GWAS data, COLOC identified the signal pairs as colocalized. In addition, analyses of secondary eQTL signals in the METSIM study identified four colocalized eQTL that were not detected in analyses of primary eQTL signals. At loci with multiple eQTL and/or GWAS signals, comparing the signals separately after conditional analysis led to more robust evidence of colocalization. Dissecting the allelic heterogeneity provided insight into how GWAS loci might influence WHRadjBMI through gene expression.
At least three of the genes detected using secondary eQTL signals, FAM13A, GRB14 and FMO1, have other evidence suggesting that they may influence WHRadjBMI. FAM13A expression level increases during adipocyte differentiation and is associated with adipocyte hyperplasia, consistent with alleles associated with both higher WHRadjBMI and higher FAM13A (26). Grb14-deficient mice showed improved glucose homeostasis and enhanced insulin signaling (27), and Fmo1-deficient mice were leaner and stored fewer triglycerides in white adipose tissue than wild-type mice (28), both consistent with our observations that alleles associated with higher expression level are associated with higher WHRadjBMI (Table 3). The secondary eQTL signal for FMO1 is colocalized with the same GWAS signal as a primary eQTL signal for PRRX1, which has been shown to inhibit adipogenesis (29), suggesting that the GWAS signal may act through both genes to influence WHRadjBMI. The allele associated with higher WHRadjBMI was associated with lower PRRX1, corresponding to more adipogenesis and higher FMO1, corresponding to more storage of triglyceride in adipose (28), (29). The fourth gene detected using a secondary eQTL signal, SSR3, contributes to the formation of a vascular network in murine placenta (30), and may reflect the blood component of adipose tissue; this GWAS signal colocalized with a primary eQTL for TIPARP, which positively regulates liver X receptor, which can impair adipose expansion (31). While colocalized GWAS and eQTL signals do not provide causal mechanisms, they can suggest candidate genes for further investigation.
Although the results of COLOC and the LD and conditional analysis approach were largely concordant, conditional analysis did identify five GWAS/eQTL pairs as colocalized that COLOC initially did not (Table 1). Two GWAS/eQTL variant pairs that were classified as colocalized only by conditional analysis had marginal COLOC PP4 probabilities (PP4 0.66 for C3orf78 and 0.76 for STC1). However, we expect that the priors we selected for COLOC will be conservative; when we increased the prior probability that a variant is causal to both GWAS and eQTL to COLOC's default, PP4 posterior probabilities increased sufficiently for COLOC to also classify the pairs as colocalized (PP4 0.95 and 0.97, respectively). The remaining three pairs of GWAS/eQTL signals with inconsistent results between COLOC and conditional analysis were secondary GWAS signals at multisignal loci. Since COLOC assumes that the trait is associated with at most one causal variant per locus (22); (23), the presence of multiple association signals could lead to missed colocalized signals. After accounting for multiple signals by using GCTA, COLOC also classified the three GWAS/eQTL variant pairs as colocalized. While COLOC's conclusions in this study ultimately align with LD and conditional analysis, our findings do highlight the care needed to properly implement COLOC such that important colocalizations are not missed. Overall, COLOC and conditional analysis had high agreement in colocalization classification; differences can be attributed to be unaccounted for multiple signals per locus or can be reconciled through changes to assigned priors.
Our ability to colocalize signals might have been affected by the limitations of this study. First, the GWAS loci were identified by GIANT based on HapMap-imputed genotypes (10). If the METSIM lead eSNP or its LD proxies imputed from the higher density Haplotype Reference Consortium (HRC) reference panel better represented an underlying signal, we might fail to capture the colocalized signals. Compared to the HapMap Project, more recent studies have expanded the coverage of human genetic variation (32) and enhanced the ability of GWAS to fine-map complex traits (33). Second, we expect to have missed some colocalized signals due to the statistical power of our eQTL analysis. Although the METSIM eQTL study (n = 770) had a reasonable sample size to identify initial eQTL signals (11), larger studies will better differentiate between variants in moderate LD with each other (34) and better detect allelic heterogeneity at eQTL. With the increasing availability of eQTL studies, the integration of GWAS data with eQTL results from larger studies or meta-analyses of multiple eQTL datasets (35); (36) will increase the opportunity to detect additional GWAS-relevant eQTL. Third, we were unable to assess potential sexual dimorphism on gene expression level and could have missed some colocalizations because all samples analyzed in METSIM were from men. While the GIANT data demonstrated that 19 of the 49 WHRadjBMI loci had stronger genetic effects in women, 5 of these 19 loci showed evidence of colocalized eQTL in men in METSIM, and none of the remaining loci exhibited evidence of cis-eQTL in the MuTHER eQTL study of women (36), suggesting that these loci do not display strong sex-specific effects on gene expression levels. The credibility of the colocalizations reported at these 5 GWAS loci is bolstered by the observation that many variants which exhibit sex heterogeneity in WHRadjBMI, including effects observed exclusively in women, have a similar effect on body fat percentage in men as well as women (37). Fourth, the effect of an eQTL can vary across tissue and time (35); (38). Identification of colocalizing signals may be dependent on measuring expression at the appropriate time and in a trait-relevant tissue. Fifth, signal colocalization is dependent on haplotype structure. Haplotypes can differ by ancestry, even between individuals of broad European descent and specifically from Finland, and may result in false negative and false positive colocalizations.
Identifying additional signals in eQTL data has the potential to reveal previously undiscovered colocalizations with GWAS loci. However, as we demonstrated, care must be taken when either the GWAS or the eQTL study have multiple signals within a locus. Conditional analyses appear to separate multi-signal loci well, although additional assessments using simulated data are warranted. Other analytic methods, such as COLOC, might fail to detect a colocalization when more than one signal is present in the region. However, providing COLOC with residual statistics that account for the effects of other signals within a locus might be a solution. As eQTL studies grow in size, the ability to detect allelic heterogeneity will increase, thus complicating tests of colocalization. Testing for colocalization at every distinct signal, and selection of colocalization analytic pipelines that are robust to the presence of multiple signals can reveal colocalizations we otherwise would miss.

GIANT consortium data for WHRadjBMI
We obtained GIANT consortium 2015 GWAS results for WHRad-jBMI (10) from www.broadinstitute.org/collaboration/giant/index. php/GIANT_consortium_data_files. The downloaded files included dbSNP name, effect/non-effect alleles (an effect allele is the WHRadjBMI-increasing allele in the sex-combined analysis), effect allele frequency, beta, standard error, P-value and sample size for each variant. At the 29 loci with no evidence of sexual dimorphism, we used GIANT association statistics from the sexcombined meta-analysis. For the locus GDF5, which showed a male-specific effect on WHRadjBMI, we used the male-only GWAS results. At the WHRadjBMI loci (PLXND1, NMU, FAM13A, MAP3K1, HMGA1, NKX2-6, SFXN2, MACROD1-VEGFB, CMIP, BCL2, SNX10, LYPLAL1, GRB14-COBLL1, PPARG, ADAMTS9, TNFAIP8, VEGFA and RSPO3) with significantly larger genetic estimated effects on trait variation in women than men, we used the association results from the female-only meta-analysis. We used the results from the European-ancestry meta-analysis for all loci except the locus SNX10, for which the all-ancestry metaanalysis data were used because SNX10 achieved genome-wide significance only in the all-ancestry analysis, with no evidence of heterogeneity across ancestries (10).

METSIM subcutaneous adipose eQTL data
In Civelek et al. (11), we described in detail the subcutaneous adipose eQTL data from the METSIM study. Briefly, METSIM is a population-based study of 10 197 men, aged from 45 to 73 years at time of enrollment, randomly selected from the population register of Kuopio, Eastern Finland, and examined in 2005-2010 (39). The University of Kuopio and Kuopio University Hospital ethics committee approved the study, and all study participants provided informed consent. We used the Illumina HumanOmniExpress BeadChip and the Illumina HumanCore-Exome array to obtain genotypes, and imputed based on haplotypes from the HRC (version 1) (www.haplotype-referenceconsortium.org/) (40) by using Minimac3 on the Michigan Imputation Server (imputationserver.sph.umich.edu/index.html). A total of 7.8 million imputed variants passed the eQTL mapping inclusion criteria of MAF ≥ 0.01 and imputation quality r 2 > 0.3.
As previously described, we isolated total ribonucleic acid (RNA) from subcutaneous adipose tissue from 770 METSIM participants, and used a Robust Multi-Array Average (RMA) approach to normalize expression data measured using the Affymetrix Human Genome U219 Array for 43 145 probesets corresponding to 18 250 unique genes (11). We applied probabilistic estimation of expression residuals (PEER) (41) to account for complex non-genetic factors influencing gene expression levels. Based on METSIM adipose eQTL PEER factor observations described previously (11), and to better match the previous results, we adjusted for 35 PEER-inferred confounding factors. We then inverse normal transformed the PEER-processed residuals.
We define a cis-eQTL as an eQTL located within 1 Mb of a gene transcript. Association tests for cis-eQTL were carried out for variant-probeset pairs with a distance between the variant and the closer boundary of the gene <1 Mb using EPACTS-multi, in which EMMAX accounted for family relatedness (genome.sph.umich.edu/wiki/EPACTS) (42). We selected an FDR < 0.01 (equivalent P < 2.4 × 10 −4 ) as the significance threshold for a cis-eQTL and defined the lead eSNP as the variant for which the association with gene expression level resulted in the smallest P-value for that gene.

Conditional analyses on GWAS and eQTL data
For METSIM adipose eQTL data, we carried out conditional analyses by including the lead eSNP in the linear regression model then testing for evidence that other variants are associated with that gene's expression level.
For GIANT WHRadjBMI loci identified as containing more than one association signal by the GCTA joint model (10); (24), we used GCTA to run approximate conditional analyses on the GIANT GWAS summary statistics, using LD data from HapMapimputed METSIM genotype data on 10 070 Finnish men (24); (43). This method estimates residual association statistics after conditioning on the lead GWAS variant in the region, allowing for identification and effect size estimation of secondary signals. Approximate analysis based on summary statistics was required because individual-level data were not available.

Colocalization of GWAS and eQTL signals
First, we assessed the relationship between gene expression level and the lead GWAS variants associated with WHRadjBMI provided by the GIANT consortium (10). Next, for GWAS variants that were associated with gene expression level (at FDR < 1%, P < 2.4 × 10 −4 ), we tested whether the GWAS variant was colocalized with the lead eSNP. To do so, we extracted the summary statistics for variants located within 1 Mb of the lead GWAS variants from the GIANT 2015 GWAS results for WHRadjBMI and the METSIM subcutaneous adipose eQTL dataset.
We first tested for colocalization using a conditional analysis approach similar to that implemented in Civelek et al. (11) Specifically, we calculated pairwise LD r 2 between the lead GWAS and the lead eSNP that had the strongest evidence of association with the corresponding probesets. LD estimates were calculated from the HRC-imputed genotypes of the 770 METSIM individuals. For variant pairs with LD r 2 > 0.8, we examined the changes of the eQTL association for the lead eSNP when conditioned on the lead GWAS variant. Following Civelek et al. (11), we applied two criteria and defined GWAS-colocalized eQTL by requiring lead variant pairwise r 2 > 0.8 and change in the eSNP P-value to be no longer significant after conditional analysis (P > 2.4 × 10 −4 corresponding to FDR > 0.01) for the lead eSNP. Of note, the LD criterion helps prevent signals from being erroneously defined as colocalized based on small variation around the threshold.
We compared results to those obtained using a Bayesian test for colocalization, COLOC (22); (23). We applied COLOC using the Approximate Bayes Factor computations on the intersection of variants available in both the GIANT WHRadjBMI and METSIM eQTL datasets. We used default priors that a random variant in the region is associated with either GWAS or eQTL individually (prior probabilities = 1 × 10 −4 ), and set the prior probability that the random variant is causal to both GWAS and eQTL (prior probability = 1 × 10 −6 ). We selected a more conservative prior probability than COLOC's default for this last scenario (default prior = 1 × 10 −5 ) because we treat COLOC as a confirmatory test of results discovered by conditional analysis. However, in sensitivity analyses, we raised the prior probability that a random variant is causal to both GWAS and eQTL from 1 × 10 −6 to 5 × 10 −6 and 1 × 10 −5 . As recommended by the authors of the method, we defined the variants as colocalized when the posterior probability of a colocalized signal (PP4) was >0.8. Bayesian colocalization analyses were conducted by using the R package 'coloc' (cran.rproject.org/web/packages/coloc) (22).

Association of expression level with cardiometabolic traits
We conducted regression analyses to evaluate the association for gene expression level with 16 cardiometabolic traits as follows: waist-hip ratio, Matsuda index, insulin, BMI, HOMA-β, proinsulin, triglycerides, total fatty acids, waist circumference, fat-free mass, free fatty acids, total cholesterol, glucose, LDL-C, HDL-C and adiponectin in up to 770 METSIM participants. Of the 770 participants, 27 had type 2 diabetes at their baseline visit; diabetics were included in all regression analyses. The RMA-normalized expression levels were inverse normal transformed after accounting for 35 PEER-inferred factors. All cardiometabolic-related traits were adjusted for age and BMI before inverse normal transformation except BMI, which was only adjusted for age. Traits were adjusted for BMI to be comparable to recent GWAS analyses of cardiometabolic traits (10); (44)- (46).

Supplementary Material. Supplementary
Material is available at HMG online.