To analyze the association of lysine, 2-AAA or pipecolic acid with CVD or T2D, we used two sets of case-cohort studies (Figs. 1, 2) nested within the PREDIMED trial (Trial registration: ISRCTN35739639; registration date: 05/10/2005; recruitment start date 01/10/2003; http://www.predimed.es). PREDIMED was a primary cardiovascular prevention trial testing Mediterranean diets, as described elsewhere [8, 9]. Briefly, 7447 participants (men aged 55–80 years and women aged 60–80 years), initially free of CVD but at high cardiovascular risk, were allocated to 3 dietary interventions: (1) a MedDiet supplemented with extra-virgin olive oil (MedDiet + EVOO); (2) a MedDiet supplemented with mixed nuts (MedDiet + nuts); or (3) a control diet (low-fat diet).
In the first nested case-cohort study (Fig. 1), cases were all primary CVD events with available blood samples, and the subcohort was a random sample of all trial participants. We included a random sample of ~10% of PREDIMED participants at baseline (the subcohort) and 233 incident cases of CVD with available blood samples that occurred during a median follow-up of 4.8 years (55 of the 288 incident cases of the PREDIMED trial had no available plasma samples). We also excluded 73 participants (63 non-cases and 10 cases) because of unavailable 2-AAA or lysine metabolite data (Fig. 1). Finally, 917 participants were included in our analysis: 221 incident cases and 730 participants in the subcohort (including 34 overlapping cases). In addition, 809 participants (687 participants in the subcohort and 147 cases, including 25 overlapping cases) had available plasma samples after 1 year of follow-up and were included in the analyses of metabolite changes at 1 year (Fig. 1).
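As a quick arithmetic check of the counts reported above, the number of unique participants in a case-cohort sample is the subcohort size plus the number of cases, minus the cases that were also drawn into the subcohort. The short sketch below uses only the figures stated in the text; the function name is illustrative and not part of the study's code.

```python
def case_cohort_size(subcohort: int, cases: int, overlap: int) -> int:
    """Unique participants when `overlap` cases also belong to the subcohort."""
    if overlap > min(subcohort, cases):
        raise ValueError("overlap cannot exceed either group size")
    return subcohort + cases - overlap


# Figures taken from the text for the CVD case-cohort study.
baseline_n = case_cohort_size(subcohort=730, cases=221, overlap=34)
one_year_n = case_cohort_size(subcohort=687, cases=147, overlap=25)
print(baseline_n, one_year_n)  # 917 809
```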
In the second case-cohort study (Fig. 2), only the 3541 participants in the full PREDIMED cohort who did not have T2D at baseline were eligible, as they were candidates to develop new-onset T2D. Among them, 273 incident cases of T2D were observed during follow-up. The present study comprised a random selection of 694 participants (approximately 20%) from the roster of all eligible PREDIMED participants (subcohort), together with all incident cases of T2D with an available plasma sample that occurred during a median follow-up of 3.8 years of intervention (cases). Baseline measurements for the metabolites were available for 610 non-cases and 243 incident cases (51 of the cases overlapped with the subcohort). In addition, 669 participants had available 1-year follow-up samples: 516 non-cases and 153 cases that occurred after 1 year of follow-up (35 of these cases overlapped with the subcohort). Among them, 609 participants had 1-year lysine, 2-AAA and pipecolic acid measurements and were finally included in the 1-year change analyses (Fig. 2).
The Research Ethics Committees for each of the recruitment centres approved the study protocol and all participants provided written informed consent.
Ascertainment of CVD and T2D cases
The primary endpoint of the PREDIMED trial was a composite outcome of non-fatal acute myocardial infarction, non-fatal stroke and cardiovascular death. In each recruitment center, physicians blinded to the intervention reviewed all participants’ medical charts yearly to assess any incident CVD outcome. Other sources of information (also blinded with respect to the intervention), such as consultation of the National Death Index, were used to ascertain incident cases. Anonymized information was then sent from each recruitment center to a blinded central Event Ascertainment Committee, which adjudicated the events.
The PREDIMED protocol included T2D as a pre-specified secondary endpoint of the trial among participants initially free of diabetes. At baseline, prevalent T2D was identified by clinical diagnosis and/or use of antidiabetic medication. The diagnosis of incident T2D during follow-up has been described elsewhere [9, 10] and followed the American Diabetes Association criteria [11]. Blinded study physicians collected information on the outcomes in the yearly ad hoc reviews of the participants’ medical charts. Also, information on incident cases of T2D was collected from continuous contact with participants and primary health care physicians, and annual follow-up visits. Blinded to the intervention assignment, the Clinical End-Point Committee adjudicated the events according to standard criteria.
Covariate assessment
At baseline and at yearly follow-up visits, participants completed a questionnaire collecting lifestyle information, educational achievement, history of illnesses, medication use, and family history of disease. Physical activity was assessed using the validated Spanish version of the Minnesota Leisure-Time Physical Activity questionnaire [12].
Study samples and metabolite profiling
Fasting blood samples were collected at baseline and after 1 year of follow-up. After centrifugation, EDTA plasma was collected, and aliquots were coded and kept refrigerated until they were stored at −80 °C. Pairs of samples (baseline and first-year visits from each participant) were randomly ordered and shipped on dry ice to the Broad Institute for the metabolomics analyses.
Liquid chromatography tandem mass spectrometry (LC–MS) was used to measure polar plasma metabolites. Negative ion mode, targeted MS analyses of 2-AAA were conducted as described previously [5]. Briefly, LC–MS samples were prepared from plasma (30 µL) via protein precipitation with the addition of four volumes of 80% methanol containing inosine-15N4, thymine-d4 and glycocholate-d4 internal standards (Cambridge Isotope Laboratories; Andover, MA). The samples were centrifuged (10 min, 9000×g, 4 °C) and the supernatants were analyzed using an ACQUITY UPLC (Waters; Milford, MA) coupled to a 5500 QTRAP triple quadrupole mass spectrometer (AB SCIEX; Framingham, MA). Extracts (10 µL) were injected directly onto a 150 × 2.0 mm Luna NH2 column (Phenomenex; Torrance, CA). The column was eluted at a flow rate of 400 µL/min with initial conditions of 10% mobile phase A (20 mM ammonium acetate and 20 mM ammonium hydroxide in water) and 90% mobile phase B (10 mM ammonium hydroxide in 75:25 v/v acetonitrile/methanol) followed by a 10 min linear gradient to 100% mobile phase A. MS data were acquired using multiple reaction monitoring scans tuned using authentic reference standards. The ion spray voltage was −4.5 kV and the source temperature was 500 °C. Raw data were processed using MultiQuant 2.1 software (SCIEX; Framingham, MA).
High resolution, positive ion mode analyses of lysine and pipecolic acid were conducted using a hydrophilic interaction liquid chromatography (HILIC) LC–MS method as described previously [13]. Briefly, data were acquired using a Nexera X2 U-HPLC system (Shimadzu Scientific Instruments; Marlborough, MA) coupled to a Q Exactive orbitrap mass spectrometer (Thermo Fisher Scientific; Waltham, MA). Metabolites were extracted from plasma (10 µL) using 90 µL of 74.9:24.9:0.2 v/v/v acetonitrile/methanol/formic acid containing stable isotope-labeled internal standards (valine-d8, Isotec; and phenylalanine-d8, Cambridge Isotope Laboratories; Andover, MA). The extracts were centrifuged (10 min, 9000×g, 4 °C), and the supernatants were injected onto a 150 × 2 mm Atlantis HILIC column (Waters; Milford, MA). The column was eluted isocratically at a flow rate of 250 µL/min with 5% mobile phase A (10 mM ammonium formate and 0.1% formic acid in water) for 1 min followed by a linear gradient to 40% mobile phase B (acetonitrile with 0.1% formic acid) over 10 min. Polar metabolite MS analyses were carried out using electrospray ionization in the positive ion mode using full scan analysis over m/z 70–800 at 70,000 resolution and 3 Hz data acquisition rate. Additional MS settings were: ion spray voltage, 3.5 kV; capillary temperature, 350 °C; probe heater temperature, 300 °C; sheath gas, 40; auxiliary gas, 15; and S-lens RF level 40. Raw data were processed using Progenesis QI software (NonLinear Dynamics) for feature alignment, nontargeted signal detection, and signal integration.
Compound identities were confirmed using reference standards and reference samples. Internal standard peak areas were monitored for quality control and to ensure system performance throughout analyses. Pooled plasma reference samples were analyzed at intervals of approximately 20 samples as an additional quality control and to determine analytical reproducibility [14]. Coefficients of variation (CV) for 2-AAA were 42.4% and 41.5% in the CVD (n = 126 pooled samples) and T2D (n = 124 pooled samples) datasets, respectively. Lysine CVs were 6.8% and 1.9% and pipecolic acid CVs were 10.6% and 2.3% in the CVD (n = 100 pooled samples) and T2D (n = 92 pooled samples) datasets, respectively.
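For readers unfamiliar with the reproducibility metric, the sketch below shows how a coefficient of variation could be computed from repeated measurements of a pooled reference sample. It assumes the usual definition (sample SD divided by the mean, expressed as a percentage); the text does not state which SD estimator or preprocessing was applied, and the peak areas shown are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample SD / mean * 100 (assumed definition)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical peak areas for one metabolite in repeated pooled-plasma injections.
pooled_qc_areas = [100.0, 110.0, 90.0, 105.0, 95.0]
cv_percent = coefficient_of_variation(pooled_qc_areas)
print(f"CV = {cv_percent:.1f}%")
```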
Statistical analysis
The analyses in both case-cohort studies were run in parallel and followed the same scheme, so we describe the common methods together. The only differences, which concern the adjustments in the multivariable models, are summarized below.
Baseline analyses
Baseline lysine, 2-AAA and pipecolic acid values were normalized and scaled in multiples of 1 SD with Blom’s inverse normal transformation [15]. We fitted weighted Cox regression models using Barlow weights to account for the over-representation of cases, as recommended for case-cohort designs [16]. We calculated hazard ratios (HR) and their 95% confidence intervals (95% CI) for CVD or T2D by quartiles of baseline lysine, 2-AAA or pipecolic acid. Quartile cut-off points were based on the distribution of each metabolite in the subcohort. We conducted tests of linear trend by examining an ordinal score based on the median value in each quartile of lysine, 2-AAA or pipecolic acid in the multivariable models. Follow-up time was calculated from the date of enrollment to the date of diagnosis of CVD/T2D for cases, and to the date of the last visit or the end of the follow-up period (December 1, 2010) for non-cases. Progressively adjusted models were fitted. The first model was adjusted for age, sex and intervention group. The second, multivariable model was additionally adjusted for BMI (kg/m2), smoking (never/current/former), leisure-time physical activity (metabolic equivalent tasks [METs]-min/day), educational level (primary vs secondary or higher), hypertension and dyslipidemia for the association with T2D, and also for baseline T2D and family history of CVD for the association with CVD. A further model for the association with T2D was additionally adjusted for baseline plasma glucose (both linear and quadratic terms).
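To illustrate how Barlow weighting is commonly set up for a case-cohort Cox model, the sketch below expands each participant into start-stop records: subcohort person-time is weighted by the inverse of the sampling fraction, while each case contributes a short interval ending at the event time with weight 1. This is a minimal sketch under stated assumptions, not the study's own code; the `Interval` record, the function name, the small offset `eps` and the example sampling fraction of 0.10 (the text reports "~10%") are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    pid: str       # participant identifier (hypothetical)
    start: float   # interval start, years since enrollment
    stop: float    # interval end
    event: int     # 1 if the interval ends in the event
    weight: float  # analysis weight


def barlow_rows(pid, time, is_case, in_subcohort, sampling_fraction, eps=1e-4):
    """Expand one participant into weighted start-stop rows (Barlow scheme)."""
    if time <= eps:
        raise ValueError("follow-up time must exceed eps")
    rows = []
    if in_subcohort:
        # Time at risk as a sampled cohort member is weighted by 1/alpha.
        stop = time - eps if is_case else time
        rows.append(Interval(pid, 0.0, stop, 0, 1.0 / sampling_fraction))
    if is_case:
        # The failure itself has weight 1, whether or not the case was sampled.
        rows.append(Interval(pid, time - eps, time, 1, 1.0))
    return rows  # non-cases outside the subcohort contribute no rows


subcohort_case = barlow_rows("A", 3.2, True, True, sampling_fraction=0.10)
outside_case = barlow_rows("B", 2.0, True, False, sampling_fraction=0.10)
subcohort_noncase = barlow_rows("C", 4.8, False, True, sampling_fraction=0.10)
```

The resulting rows would then be passed to any Cox routine that accepts start-stop data and case weights, with a robust variance estimator.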
To assess the potential relationship between this pathway and other factors underlying the development of T2D, we analyzed the Pearson correlations between 2-AAA and baseline glucose, insulin, LDL-c and HDL-c.
As an additional analysis, we examined the potential modifying role of prevalent T2D on the associations of baseline lysine, 2-AAA or pipecolic acid with CVD. A new variable combining baseline diabetic status and the level of each metabolite (below/above the median) was introduced into the models. Moreover, separate models were fitted for participants with and without diabetes.
To assess if the intervention with MedDiet was an effect modifier of the associations between baseline values of the metabolites and the risk of T2D or CVD, fully adjusted models for the case-cohorts of both CVD and T2D were fitted and multiplicative independent product-terms were used to assess the potential interactions. Potential multiplicative interactions between the intervention group (MedDiet + EVOO, MedDiet + nuts or control) and the dichotomous variable for lysine, 2-AAA and pipecolic acid (defined by the values below/above its respective median) were tested with the likelihood ratio test. Moreover, three new independent variables combining the intervention groups (MedDiet + EVOO, MedDiet + nuts, or control) and the dichotomous variable defined for each metabolite by its median (below or above) were created and introduced separately in fully adjusted models to analyze the effects of each metabolite stratified by intervention group. The reference category was the group of subjects allocated to the control group of the trial and the group of participants below the median of each metabolite. Additionally, adjustment for propensity scores that used 30 baseline variables to estimate the probability of assignment to each of the intervention groups and robust variance estimators were used to take into account that a small percentage of participants were non-individually randomized to the intervention groups and minor imbalances in baseline covariates existed in the trial [9].
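As an illustration of the likelihood ratio test for interaction, the sketch below compares a model without product terms to one with them. With three intervention groups and a dichotomous metabolite variable, the interaction adds two product terms, so the statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper tail has the closed form exp(-x/2). The log-likelihood values are hypothetical, the function name is illustrative, and the sketch assumes both log-likelihoods come from nested models fitted to the same data.

```python
import math


def lr_test_df2(loglik_reduced: float, loglik_full: float):
    """Likelihood ratio test for two added parameters (chi-square, 2 df)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods: main effects only vs. main effects plus
# the two intervention-by-metabolite product terms.
stat, p_value = lr_test_df2(loglik_reduced=-1000.0, loglik_full=-997.0)
print(f"LR statistic = {stat:.2f}, p = {p_value:.3f}")
```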
One-year changes
To assess the associations between 1-year changes in lysine, 2-AAA and pipecolic acid and subsequent risk of CVD or T2D, only the cases that occurred after 1-year follow-up were used.
First, we calculated the change between the 1-year and baseline measurements for each metabolite and then normalized the differences using Blom’s inverse normal transformation. We used the same Cox regression models as for the baseline analyses, but with the 1-year change in each metabolite as the independent variable and with adjustment for its respective baseline value.
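The normalization step can be sketched as follows. Blom's transformation replaces each value by the standard normal quantile of (r − 3/8)/(n + 1/4), where r is the rank of the value among n observations. The handling of ties (average ranks here) and the example values are assumptions made for illustration; the text does not specify them.

```python
from statistics import NormalDist


def blom_transform(values):
    """Rank-based inverse normal transformation using Blom's constants."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in ranks]


# Hypothetical 1-year changes (1-year value minus baseline) for one metabolite.
changes = [0.3, -1.2, 0.0, 2.5, -0.4]
z_scores = blom_transform(changes)
```

The transformed values preserve the ordering of the original changes and are approximately standard normal, so hazard ratios can be read per 1-SD increment.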
We also assessed the combined effects of intervention and 1-year changes for the three metabolites and tested the interactions between intervention groups and 1-year changes. As we did at baseline, propensity scores adjustment and robust variance estimators were used in these models.
To analyze the effects of the intervention on metabolite changes, we only included participants in the selected subcohorts, which represent a random sample of the full roster of the trial (CVD and T2D cases not included in the subcohorts were excluded). We used linear regression models with the intervention as the main independent variable and the 1-year change in lysine, 2-AAA or pipecolic acid (the residual change obtained after regressing the 1-year metabolite value on the baseline value) as the dependent variable, adjusted for the following covariates: age, sex, BMI, smoking, leisure-time physical activity, hypertension and dyslipidemia. Again, propensity score adjustment and robust variance estimators were used in these models.
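The residual change used as the dependent variable can be sketched as below: the 1-year value is regressed on the baseline value by ordinary least squares, and each participant's residual is the part of the 1-year value not predicted by baseline. The function name and the example values are hypothetical, and the sketch assumes a simple linear regression with an intercept.

```python
from statistics import fmean


def residual_change(baseline, one_year):
    """Residuals from an OLS regression of 1-year values on baseline values."""
    if len(baseline) != len(one_year):
        raise ValueError("baseline and 1-year series must have equal length")
    mean_x, mean_y = fmean(baseline), fmean(one_year)
    sxx = sum((x - mean_x) ** 2 for x in baseline)
    if sxx == 0:
        raise ValueError("baseline values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(baseline, one_year))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(baseline, one_year)]


# Hypothetical baseline and 1-year values for one metabolite.
baseline_values = [1.0, 2.0, 3.0, 4.0, 5.0]
one_year_values = [1.2, 1.9, 3.4, 3.8, 5.3]
residuals = residual_change(baseline_values, one_year_values)
```

By construction these residuals average zero and are uncorrelated with the baseline value, which is why they are a convenient outcome for comparing intervention groups.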