Study design and sample
DHS participants were recruited from outpatient internal medicine and endocrinology clinics and from the community from 1998 through 2005 in western North Carolina. Siblings concordant for T2D without advanced renal insufficiency were recruited, with additional non-diabetic siblings enrolled whenever possible. Recruitment was based upon family structure, and there were no inclusions/exclusions based on evidence of prevalent CVD at the time of recruitment. Ascertainment and recruitment have been described in detail previously [9–12]. T2D was defined as diabetes developing after the age of 35 years treated with insulin and/or oral agents, in the absence of historical evidence of ketoacidosis. Diabetes diagnosis was confirmed by measurement of fasting glucose and glycated hemoglobin (HbA1C) at the exam visit. Analyses completed for the current investigation included 983 self-described European American individuals with T2D from 466 DHS families.
Study protocols were approved by the Institutional Review Board at Wake Forest School of Medicine, and all participants provided written informed consent. Participant examinations were conducted in the General Clinical Research Center of the Wake Forest Baptist Medical Center. Examinations included interviews for medical history and health behaviors, anthropometric measures, resting blood pressure, electrocardiography, fasting blood sampling for laboratory analyses, and spot urine collection. Individuals were considered hypertensive if they were prescribed anti-hypertensive medication or had blood pressure measurements exceeding 140 mmHg (systolic) or 90 mmHg (diastolic). Standard laboratory analyses included fasting glucose, HbA1C, total cholesterol, HDL, and triglycerides. Low-density lipoprotein cholesterol (LDL) concentration was calculated using the Friedewald equation, and LDL concentrations were considered valid for subjects whose triglycerides were less than 796 mg/dL. CAC was measured using fast-gated helical computed tomography (CT) scanners, and calcium scores were calculated as previously described and reported as an Agatston score [13, 14].
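The LDL calculation and the hypertension definition above can be illustrated with a minimal sketch. It assumes all lipid values are in mg/dL, where the Friedewald equation estimates LDL as total cholesterol minus HDL minus triglycerides divided by 5. The function and parameter names are hypothetical and are not taken from the study's analysis code; the 796 mg/dL validity limit and the blood pressure cut-offs are those stated in the text.

```python
from typing import Optional

# Validity limit for the Friedewald estimate, as stated in the text (mg/dL).
TG_VALIDITY_LIMIT_MG_DL = 796.0


def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float) -> Optional[float]:
    """Estimate LDL cholesterol (mg/dL); return None when triglycerides are too high."""
    if triglycerides >= TG_VALIDITY_LIMIT_MG_DL:
        return None  # estimate not considered valid
    return total_chol - hdl - triglycerides / 5.0


def is_hypertensive(systolic: float, diastolic: float, on_bp_medication: bool) -> bool:
    """Hypertensive if medicated, or if either pressure exceeds its cut-off (mmHg)."""
    return on_bp_medication or systolic > 140 or diastolic > 90
```

For example, a participant with total cholesterol 200, HDL 50, and triglycerides 150 mg/dL would have an estimated LDL of 120 mg/dL.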
Vital status was determined for all subjects from the National Social Security Death Index maintained by the United States Social Security Administration. For participants confirmed as deceased, length of follow-up was determined from the date of initial study visit to date of death. For all other participants, the length of follow-up was determined from the date of the initial study visit to the end of 2011. For deceased participants, copies of death certificates were obtained from relevant county Vital Records Offices to determine cause of death. Cause of death was categorized based on information contained in death certificates as CVD-mortality (MI, congestive heart failure, cardiac arrhythmia, sudden cardiac death, peripheral vascular disease, and stroke) or as cancer, infection, end-stage renal disease, accidental death, or other (including obstructive pulmonary disease, pulmonary fibrosis, liver failure, and Alzheimer’s dementia). Association with mortality was assessed for both CVD-mortality and all-cause mortality, i.e. death from any cause.
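The follow-up rule described above can be sketched as follows. This is an illustration only: it assumes that "the end of 2011" means 31 December 2011 and that follow-up is expressed in years of 365.25 days, neither of which is specified in the text. The names `STUDY_END` and `follow_up` are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

# Assumed censoring date for participants not confirmed as deceased.
STUDY_END = date(2011, 12, 31)


def follow_up(visit_date: date, death_date: Optional[date] = None) -> Tuple[float, bool]:
    """Return (follow-up time in years, whether death was observed)."""
    # Deceased participants are followed to the date of death, all others to study end.
    end = death_date if death_date is not None else STUDY_END
    years = (end - visit_date).days / 365.25
    return years, death_date is not None
```

The returned pair corresponds to the time and event indicator that a survival model expects as input.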
Total genomic DNA was purified from whole blood samples using the PUREGENE DNA isolation kit (Gentra Inc., Minneapolis, MN). DNA concentration was quantified using standardized fluorometric readings on a Hoefer DyNA Quant 200 fluorometer (Hoefer Pharmacia Biotech Inc., San Francisco, CA). Genotype data for specific SNPs was derived from: (i) the MassARRAY SNP Genotyping System (Sequenom Inc., San Diego, CA) (n = 4 SNPs), (ii) a genome wide association study (GWAS) using the Affymetrix® Genome-Wide Human SNP Array 5.0 (Affymetrix® Inc., Santa Clara, CA) (n = 2 SNPs), (iii) Illumina® HumanExome BeadChips (Illumina® Inc., San Diego, CA) (n = 18 SNPs), and (iv) GWAS imputed data (n = 4 SNPs).
Genotyping using the MassARRAY SNP Genotyping System was completed as described previously. Primers for PCR amplification and extension reactions were designed using the MassARRAY Assay Design Software (Sequenom). Samples were diluted to a final concentration of 5 ng/μl, and single-base extension reaction products were separated and scored using a matrix-assisted laser desorption ionization/time of flight mass spectrometer. To evaluate genotyping accuracy, 39 quality control samples were included as blind duplicates. The concordance rate for these blind duplicates was 100%.
For the DHS GWAS data, genotype calling was completed using the BRLLM-P algorithm in Genotyping Console v4.0 (Affymetrix). Samples failing to meet an intensity quality control threshold (n = 4) were not included for genotype calling and those failing to meet a minimum acceptable call rate of 95% (n = 3) were excluded from further analyses. An additional 39 samples were included as blind duplicates within the genotyping set to serve as quality controls; the concordance rate for these blind duplicates was 99.0 ± 0.72% (mean ± standard deviation (SD)).
For the DHS Exome Chip data, genotype calling was completed using Genome Studio Software v1.9.4 (Illumina). Samples failing to meet a minimum acceptable call rate of 98% (n = 3) were excluded from further analyses. An additional 58 samples were included as blind duplicates within the genotyping set to serve as quality controls; the concordance rate for blind duplicates was 99.9 ± 0.0001% (mean ± SD). Additional quality control of GWAS and Exome Chip data sets was completed to exclude samples with poor quality genotype calls, gender errors, or unclear/unexpected sibling relationships.
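The two quality control metrics used across the genotyping platforms, call rate and blind-duplicate concordance, can be sketched as below. This is a simplified illustration, not the vendor software's calculation: genotypes are assumed to be coded as allele counts (0, 1, 2) with `None` for a missing call, and concordance is assumed to be computed only over sites called in both duplicates. Function names are hypothetical.

```python
from typing import List, Optional, Sequence

Call = Optional[int]  # 0, 1, or 2 copies of an allele; None = no call


def call_rate(genotypes: Sequence[Call]) -> float:
    """Fraction of non-missing genotype calls for one sample (or one SNP)."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def concordance(sample: Sequence[Call], duplicate: Sequence[Call]) -> float:
    """Fraction of identical calls among sites called in both duplicates."""
    pairs: List[tuple] = [
        (a, b) for a, b in zip(sample, duplicate) if a is not None and b is not None
    ]
    if not pairs:
        raise ValueError("no sites were called in both samples")
    return sum(a == b for a, b in pairs) / len(pairs)
```

Under these assumptions, a sample would be excluded when `call_rate` falls below the platform threshold (95% for the GWAS array, 98% for the Exome Chip).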
For SNPs where direct genotyping data was not available, genotype data was obtained from GWAS imputed data. Imputation of 1,000 Genomes Project SNPs was completed using the program IMPUTE2 and the Phase I v2, cosmopolitan (integrated) reference panel, build 37 [16, 17]. SNPs that were used for imputation were required to have low missingness and show no significant departure from Hardy-Weinberg expectations (p > 1 × 10⁻⁴). To maximize the quality of imputation, the samples were not pre-phased. Only imputed SNPs with a confidence score > 0.90 and information score > 0.50 were used. A total of ~4.5 million SNPs passed imputation quality control.
For all SNPs used to derive the GRS, the minimum acceptable call rate was 95%; the average SNP call rate was 99.4% ± 1.2% (mean ± SD), and the average sample call rate was 99.4% ± 1.4%. Allele and genotype frequencies were calculated from unrelated individuals and tested for departures from Hardy-Weinberg equilibrium. No SNPs showed significant departure from Hardy-Weinberg equilibrium (p > 0.05). One SNP (rs386000) included by Voight et al. in their HDL GRS failed genotyping and was not included in the current analysis.
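The text does not name the specific test used for Hardy-Weinberg equilibrium. As one common possibility, the sketch below assumes a Pearson chi-square test with one degree of freedom on genotype counts from unrelated individuals; an exact test may have been used instead. The function name is hypothetical.

```python
import math


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic SNP) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A SNP would be flagged for departure from equilibrium when the returned p-value falls below the chosen threshold (0.05 for the GRS SNPs).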
Both unweighted GRS and GRS weighted by SNP effect size were derived for two sets of SNPs previously reported to be associated with HDL. One set of 14 SNPs had documented effects on HDL concentrations only and was used by Voight et al. for construction of a GRS. We created GRS from 13 of these SNPs with good quality genotyping data (rs386000 excluded) (Score 1; 1a = unweighted, 1b = weighted).
Voight et al. also reported an additional set of 15 SNPs, including a coding variant in LIPG, as associated with HDL concentrations, with some of these SNPs also reported to have pleiotropic effects on LDL cholesterol and triglyceride concentrations. All SNPs were primarily selected for their impact on HDL. Weighted and unweighted GRS were derived from this additional set of 15 SNPs (Score 2; 2a = unweighted, 2b = weighted). SNPs (n = 26) from both sets were also combined to derive unweighted and weighted combined GRS; two pairs of SNPs (rs2338104 and rs7134594; rs2271293 and rs16942887) are in strong linkage disequilibrium (r² > 0.90), and as such two SNPs (rs7134594 and rs2271293) were excluded from the combined scores. SNPs in the combined GRS were weighted by their effect sizes in mmol/L. The effect sizes used were drawn from the Voight et al. paper or from the Global Lipids Genetics Consortium GWAS of lipids that Voight et al. cited [4, 18]. All derived GRS (1a, 1b, 2a, 2b, Combined Unweighted, Combined Weighted) were tested for association with HDL, LDL and triglycerides to evaluate whether the GRS were a measure of genetic contributions to either HDL only or to global lipid levels.
For all GRS, the effect allele was assigned as the allele associated with an increase in plasma HDL concentrations, i.e., an increase in GRS can be interpreted as an increase in genetic predisposition for elevated plasma HDL, as seen in the GRS used by Voight et al. Unweighted scores were derived by summing the number of effect alleles across all SNPs. The SNPs were also weighted by their previously reported effect sizes. For the weighted scores, the number of effect alleles possessed by an individual at a particular SNP locus was multiplied by a weight derived from that SNP’s effect size contribution to the total effect size for all SNPs included in the GRS. For individuals missing genotype data for a particular SNP, the mean genotype calculated in the DHS for that given SNP was assigned.
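The score construction described above can be sketched as follows. This is one reading of the description rather than the study's actual code: each weight is assumed to be the SNP's effect size divided by the sum of effect sizes across all SNPs in the score, genotypes are assumed to be coded as the count of HDL-raising alleles (0, 1, 2) with `None` for a missing call, and missing calls are replaced by the cohort mean for that SNP. All function names are hypothetical.

```python
from typing import Dict, List, Optional

# SNP identifier -> number of HDL-raising (effect) alleles, or None if missing.
Genotypes = Dict[str, Optional[int]]


def snp_means(cohort: List[Genotypes], snps: List[str]) -> Dict[str, float]:
    """Mean effect-allele count per SNP among individuals with a call."""
    means = {}
    for snp in snps:
        observed = [person[snp] for person in cohort if person.get(snp) is not None]
        means[snp] = sum(observed) / len(observed)
    return means


def dosage(person: Genotypes, snp: str, means: Dict[str, float]) -> float:
    """Effect-allele count, falling back to the cohort mean when missing."""
    count = person.get(snp)
    return means[snp] if count is None else float(count)


def unweighted_grs(person: Genotypes, snps: List[str], means: Dict[str, float]) -> float:
    """Sum of effect-allele counts across all SNPs in the score."""
    return sum(dosage(person, snp, means) for snp in snps)


def weighted_grs(person: Genotypes, effect_sizes: Dict[str, float],
                 means: Dict[str, float]) -> float:
    """Sum of effect-allele counts, each scaled by its share of the total effect size."""
    total = sum(effect_sizes.values())
    return sum((beta / total) * dosage(person, snp, means)
               for snp, beta in effect_sizes.items())
```

With this normalization the weights sum to 1, so the weighted score ranges from 0 to 2, whereas the unweighted score ranges from 0 to twice the number of SNPs.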
For statistical analyses, continuous variables were transformed as necessary to approximate normality. Single SNP association analyses were performed using variance components methods implemented in Sequential Oligogenic Linkage Analysis Routines (SOLAR) version 6.4.1 (Texas Biomedical Research Institute, San Antonio, TX) to account for relatedness between subjects. Association was examined assuming an additive model of inheritance. Age and sex were included as covariates in single SNP association analyses for HDL.
GRS were considered as both ordinal (three tertiles: T1, T2, T3 derived from increasing tertile ranges) and continuous variables. Relationships between the GRS and HDL, LDL, triglycerides, CAC, and prior history of CVD and MI were examined using marginal models with generalized estimating equations. The models account for familial correlation using a sandwich estimator of the variance under exchangeable correlation. Relationships between GRS and both all-cause and CVD-mortality were examined using Cox proportional hazards models with sandwich-based variance estimation due to the inclusion of related individuals in this study. Associations were adjusted for covariates including age, sex, BMI, smoking status (history of current or prior smoking), hypertension, cholesterol medication use, prior CVD, oral T2D medication use, and insulin use as indicated. All analyses were performed in SAS 9.3 (SAS Institute, Cary, NC). Statistical significance was accepted at p < 0.05.
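The ordinal coding of the GRS can be illustrated with a short sketch. The analyses themselves were run in SAS; this Python example only shows the idea of splitting a continuous score into tertiles. It assumes cut points from `statistics.quantiles` with its default interpolation method, which may differ slightly from the cut points SAS would produce, and the function name is hypothetical.

```python
import statistics
from typing import List, Sequence


def tertile_labels(scores: Sequence[float]) -> List[str]:
    """Assign each score to T1 (lowest), T2, or T3 (highest) by cohort tertile."""
    # Two cut points divide the score distribution into three groups.
    lower, upper = statistics.quantiles(scores, n=3)
    labels = []
    for score in scores:
        if score <= lower:
            labels.append("T1")
        elif score <= upper:
            labels.append("T2")
        else:
            labels.append("T3")
    return labels
```

Because cut points are derived from the observed distribution, ties at a boundary can make the three groups unequal in size.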