A novel 6-metabolite signature for prediction of clinical outcomes in type 2 diabetic patients undergoing percutaneous coronary intervention
Cardiovascular Diabetology volume 21, Article number: 126 (2022)
Outcome prediction tools for patients with type 2 diabetes mellitus (T2DM) undergoing percutaneous coronary intervention (PCI) are lacking. Here, we developed a machine learning-based metabolite classifier for predicting 1-year major adverse cardiovascular events (MACEs) after PCI among patients with T2DM.
Serum metabolomic profiling was performed in a nested case–control study of 108 matched pairs of patients with T2DM who did or did not experience MACEs within 1 year after PCI; the matched pairs were then assigned 1:1 to the discovery and internal validation sets. External validation was conducted using targeted metabolite analyses in an independent prospective cohort of 301 patients with T2DM receiving PCI. The function of candidate metabolites was explored in high glucose-cultured human aortic smooth muscle cells (HASMCs).
Overall, serum metabolome profiles differed between diabetic patients with and without 1-year MACEs after PCI. Through VSURF, a machine learning approach for feature selection, we identified the 6 most important metabolic predictors, which mainly targeted the nicotinamide adenine dinucleotide (NAD+) metabolism. The 6-metabolite model based on random forest and XGBoost algorithms yielded an area under the curve (AUC) of ≥ 0.90 for predicting MACEs in both discovery and internal validation sets. External validation of the 6-metabolite classifier also showed good accuracy in predicting MACEs (AUC 0.94, 95% CI 0.91–0.97) and target lesion failure (AUC 0.89, 95% CI 0.83–0.95). In vitro, there were significant impacts of altering NAD+ biosynthesis on bioenergetic profiles, inflammation and proliferation of HASMCs.
The 6-metabolite model may help in the noninvasive prediction of 1-year MACEs following PCI among patients with T2DM.
Patients with type 2 diabetes mellitus (T2DM) account for more than a quarter of all coronary artery disease (CAD) patients receiving percutaneous coronary intervention (PCI). Despite great advances in stent technologies, T2DM remains a strong indicator of major adverse cardiovascular events (MACEs) after PCI [2, 3]. Thus, identifying biomarkers for noninvasive prediction of post-PCI outcomes among type 2 diabetic patients has substantial clinical implications.
Metabolomics, which provides untargeted measurements of the multiparametric metabolic response of living systems to pathophysiological stimuli, has the potential to both discover new biomarkers and reveal key metabolic pathways intrinsic to the disease pathogenesis. Although such metabolomic approaches have been increasingly explored in cardiovascular biomarker discovery, most previous studies have concentrated on the screening of metabolic biomarkers for the discrimination of established CAD from non-CAD controls [7, 8]. Yet it remains unclear how systemic metabolic alterations impact clinical outcomes after PCI, especially for patients with T2DM. Moreover, the majority of metabolic biomarker candidates for cardiovascular disease were identified using the classical generalized linear method of regression [9, 10]. Modern machine learning approaches, which are better able to incorporate high-order nonlinear associations between predictors to gain predictive performance, have rarely been applied to outcome predictions for type 2 diabetic patients receiving PCI.
Hence, in a nested case–control study of 216 patients with T2DM who underwent PCI due to obstructive CAD, we first assessed the prospective associations of serum metabolic profiles at baseline with the risk of incident MACEs at 1 year after PCI. Then, a 6-metabolite model, mainly targeting the pathway of nicotinamide adenine dinucleotide (NAD+) metabolism, was developed and internally validated for the prediction of 1-year MACEs following PCI based on a set of machine learning algorithms. Next, we externally verified the 6-metabolite model using a targeted metabolite analysis in an independent prospective cohort of 301 diabetic patients who received PCI. Finally, we explored the biological relevance of altering NAD+ biosynthesis to abnormal phenotypes of human aortic smooth muscle cells (HASMCs) under high glucose (HG) conditions.
Study design and participants
An overview of the study design is depicted in Fig. 1. In brief, we first conducted a nested case–control study within a prospective cohort of 702 patients with T2DM who underwent primary PCI from Sep 2017 to Jan 2019 in the First Affiliated Hospital of Zhengzhou University. As previously described [12, 13], the prospective cohort excluded patients who had systemic diseases including cancer, serious infection, chronic liver disease, and type 1 diabetes. T2DM was diagnosed based on the 2014 American Diabetes Association criteria. All participants were hospitalized for angiographically confirmed obstructive CAD and underwent PCI at baseline, and then completed a follow-up of 1 year to track MACEs [composite of all-cause death, myocardial infarction (MI), stroke, and repeat revascularization] as the primary outcome. Clinical, angiographic, and procedural data were collected at baseline (Table 1), and outcome data were obtained from medical records and telephone interviews with participants at 30 days and 6, 9, 12 months after PCI. During a median follow-up of 12.5 months (interquartile range [IQR]: 11.9–12.6 months), 108 (15.4%) patients experienced MACEs (all-cause death, n = 46; repeat revascularization, n = 54; MI, n = 13; stroke, n = 7). Of the remaining 594 participants without MACEs, 108 individuals, matched for baseline characteristics using propensity scores, were selected as the controls. Then, the matched case–control pairs were randomly assigned (1:1) to a discovery set or an internal validation set. Both sets provided a > 90% power to detect a fold change of > 4/3 or < 3/4 for differential metabolites at a false discovery rate (FDR) of < 5%.
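The 1:1 random assignment of matched pairs can be illustrated with a short sketch. This is not the authors' procedure: the function name, the pair identifiers 1 to 108, and the fixed seed are assumptions made only for illustration. The point is that whole pairs, not individual patients, are shuffled, so each set keeps its case–control matching.

```python
import random


def assign_pairs(pair_ids, seed=2022):
    """Randomly split matched case-control pairs 1:1 into two sets.

    Pairs are moved as units, so matching is preserved within each set.
    The seed is a hypothetical choice, used here only for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(pair_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return sorted(shuffled[:half]), sorted(shuffled[half:])


# 108 matched pairs, labelled 1..108 (hypothetical identifiers)
discovery, internal_validation = assign_pairs(range(1, 109))
print(len(discovery), len(internal_validation))  # 54 54
```

With 108 pairs, each set receives 54 pairs, that is, 54 cases and 54 matched controls.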
The external validation set was a prospective study of 301 patients with T2DM who were treated with PCI at the Zhongnan Hospital of Wuhan University between May 2016 and Jun 2017. The exclusion criteria were the same as in the above nested case–control study. For all participants, clinical follow-up was performed at 30 days and 6, 9, 12 months after PCI, and angiographic follow-up was conducted at 12 months after PCI. The primary outcome remained MACEs, while the secondary outcome was target lesion failure (TLF), a device-oriented composite endpoint of cardiac death, target vessel MI, and target lesion revascularization. During a median follow-up of 12.4 months (IQR: 11.7–12.6 months), a total of 47 (15.6%) MACEs (all-cause death, n = 17; repeat revascularization, n = 22; MI, n = 9; stroke, n = 3) and 30 (10.0%) TLF were documented. This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement. More details on study design, baseline characteristics, and outcome definitions are summarized in Additional file 1: Supplementary Methods. The study protocol was approved by local institutional review boards, and written informed consent was obtained from all participants.
Untargeted metabolic profiling by liquid chromatography-mass spectrometry (LC–MS)
For all participants, serum samples were isolated from whole blood by centrifugation within 30 days before PCI, and stored at −80 ℃ until use. To assess the potential impact of storage conditions on metabolite stability, we analyzed the level of methionine, a metabolite that could be extensively degraded if the frozen serum samples were stored too long or improperly. The relative abundance of methionine in every sample was above the threshold of 3 standard deviations below the mean (Additional file 1: Fig. S1), indicating that the serum samples were properly stored for metabolite detection.
Details of LC–MS procedures are described in Additional file 1: Supplementary Methods. Briefly, serum samples were first treated with acetonitrile/methanol (1:1, v/v) and isotope-labeled internal standard mixtures for metabolite extraction. Then, metabolite extracts were separated using a UPLC BEH Amide column (2.1 mm × 100 mm, 1.7 µm) on a Vanquish UHPLC system (Thermo, Waltham, USA). The column eluent was further detected for the acquisition of MS/MS spectra on a Q Exactive Orbitrap mass spectrometer (Thermo, Waltham, USA) operating in the positive and negative ion modes. The acquired MS raw data were analyzed on the XCMS Online platform (https://xcmsonline.scripps.edu) for peak detection and metabolite annotation. The best-matched internal standard (B-MIS) normalization method was used to normalize peak areas and yield relative abundance of metabolites. To ensure data quality, the quality control (QC) samples were prepared by mixing an equal aliquot of all samples, and were run at the beginning of the sample queue for column conditioning and after every 10 samples thereafter. Metabolic peaks with relative standard deviations (RSD) of > 30% across QC samples or present in < 80% of QC samples were removed from further analysis.
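The QC filter described above (drop peaks with RSD > 30% across QC samples, or detected in < 80% of QC samples) can be sketched as follows. The function name, the use of `None` for a non-detected peak, and the example abundances are all hypothetical; this is a minimal illustration of the stated thresholds, not the authors' pipeline.

```python
from statistics import mean, stdev


def passes_qc(qc_values, max_rsd=30.0, min_detect_rate=0.80):
    """Return True if a peak is kept, given its abundances across QC runs.

    qc_values: abundances per QC injection; None marks a run in which the
    peak was not detected (an assumed encoding).
    """
    detected = [v for v in qc_values if v is not None]
    detect_rate = len(detected) / len(qc_values)
    if detect_rate < min_detect_rate or len(detected) < 2:
        return False
    avg = mean(detected)
    if avg <= 0:
        return False
    rsd = 100.0 * stdev(detected) / avg  # relative standard deviation, in %
    return rsd <= max_rsd


# Hypothetical peaks measured in 10 QC injections
peaks = {
    "stable_peak": [100, 104, 98, 101, 97, 102, 99, 103, 100, 96],
    "noisy_peak": [100, 180, 40, 150, 60, 170, 30, 120, 90, 200],
    "sparse_peak": [100, 101, None, None, None, 99, 100, None, 102, 98],
}
kept = [name for name, values in peaks.items() if passes_qc(values)]
print(kept)  # ['stable_peak']
```

In this toy input, the noisy peak fails on RSD (about 53%) and the sparse peak fails on detection rate (60%).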
Targeted metabolite analysis in the external validation set
External validation of 6 metabolic biomarkers selected from untargeted metabolomic profiling was conducted by targeted metabolite analyses on a 20AD UPLC system (Shimadzu, Kyoto, Japan) coupled with a QTrap 5500 mass spectrometer (SCIEX, Framingham, USA) operating in the multiple reaction monitoring mode (Additional file 1: Supplementary Methods). The absolute concentration (µmol/L) of each metabolite in serum was determined using a 7-point calibration curve, created by calculating the peak area ratio of each calibrator (Sigma, St. Louis, USA) versus its concentration. All the calibration curves showed a good linearity, with R2 values of > 0.990, intra- and inter-batch precision values (as RSD) of < 15%, and accuracy values (as relative error) ranging from − 9.1 to 5.3% (Additional file 1: Table S1).
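The calibration step above can be illustrated with an ordinary least-squares line fitted to 7 calibrator points and then inverted to read off an unknown concentration. The calibrator concentrations and peak area ratios below are invented, and unweighted regression is an assumption (the source does not state the weighting scheme).

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical 7-point calibration: concentration (umol/L) vs peak area ratio
conc = [0.5, 1, 2, 5, 10, 20, 50]
ratio = [0.026, 0.049, 0.101, 0.252, 0.497, 1.004, 2.498]

slope, intercept, r2 = fit_line(conc, ratio)

# Back-calculate the concentration of a sample from its measured ratio
sample_ratio = 0.75
sample_conc = (sample_ratio - intercept) / slope
print(round(r2, 4), round(sample_conc, 2))
```

A curve would meet the linearity criterion reported above when `r2` exceeds 0.990.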
In vitro experiments in HG-cultured HASMCs
HASMCs (LONZA) were grown in Dulbecco’s Modified Eagle’s medium (LONZA) supplemented with 25 mM D-glucose, 4 mM L-glutamine, 10% fetal bovine serum, and 1% penicillin/streptomycin in a humidified atmosphere at 37 ℃ and 5% CO2. After 3–5 passages, HG-cultured HASMCs were treated with 10 μM FK866 (an inhibitor of nicotinamide phosphoribosyltransferase, Sigma) for 20 h to block the salvage biosynthetic pathway of NAD+, or 200 μM 1-methyl-L-tryptophan (1MT, an inhibitor of indoleamine 2,3-dioxygenase, Sigma) for 20 h to inhibit de novo synthesis of NAD+, or 10 μM β-nicotinamide mononucleotide (NMN, an NAD+ precursor, Sigma) for 20 h to sustain NAD+ levels. For each group of HASMCs, NAD+ levels were detected by LC–MS; the activity of mitochondrial respiratory chain complexes (I–V) was measured using spectrophotometric assays. The bioenergetic profile of HASMCs was determined by an XF24 Extracellular Flux Analyzer (Agilent, Santa Clara, USA) to calculate the parameters of mitochondrial respiration and glycolysis. The mRNA expression and protein secretion of proinflammatory cytokines in HASMCs were examined by reverse transcription quantitative PCR and cytometric bead array (BD-Pharmingen), respectively. A transwell migration assay was performed to evaluate the ability of HASMCs to recruit THP1 monocytes. Proliferation of HASMCs was assessed using a methylene blue dye assay, as described in our previous report. All in vitro experiments were repeated 3 times. More details are provided in Additional file 1: Supplementary Methods and Additional file 1: Table S2.
In both discovery and internal validation sets, the global metabolic differences between participants with and without MACEs were assessed by a supervised model of orthogonal partial least-squares discriminant analysis (OPLS-DA). To assess the robustness of OPLS-DA, we performed 200 permutations of the metabolomic datasets, and for each permutation, the values of Q2 and R2Y were calculated by a seven-fold cross validation to reflect the goodness of prediction and the goodness of fit, respectively. The values of variable importance in the projection (VIP) were also calculated to reflect the contribution of each metabolite to the group discrimination in the OPLS-DA model. Differential comparisons of single metabolites between groups were examined using the Mann–Whitney U test, followed by the Benjamini–Hochberg FDR-controlling method to adjust for multiple comparisons. Metabolites with VIP values of > 1.0, FDRs of < 0.05, and fold changes of > 4/3 or < 3/4 were considered as the differential metabolites [23, 29], which were further mapped into the KEGG database (https://www.kegg.jp/) for pathway enrichment analyses. Then, an optimal set of metabolic features for predicting MACEs was selected from the differential metabolites using the R package VSURF (Variable Selection Using Random Forests), in which a recommended stepwise random forest (RF) procedure was executed to identify the best combination of discriminant variables for classification prediction modeling on the basis of predictive accuracy (as the amount of out-of-bag error) and parsimony (as the number of selected variables).
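The three-part rule for calling a metabolite differential (VIP > 1.0, FDR < 0.05, fold change > 4/3 or < 3/4) can be sketched as below. The Benjamini–Hochberg adjustment is the standard step-up procedure; the metabolite names, p-values, VIP values, and fold changes are invented, and the p-values are assumed to have been computed beforehand with the Mann–Whitney U test.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_differential(vip, fdr, fold_change):
    """Apply the thresholds stated in the text."""
    return vip > 1.0 and fdr < 0.05 and (fold_change > 4 / 3 or fold_change < 3 / 4)


# Hypothetical inputs for five metabolites
names = ["M1", "M2", "M3", "M4", "M5"]
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
vips = [1.8, 1.2, 1.5, 0.7, 0.4]
fold_changes = [1.9, 0.6, 1.6, 1.4, 1.02]

fdrs = benjamini_hochberg(pvals)
selected = [
    name
    for name, vip, fdr, fc in zip(names, vips, fdrs, fold_changes)
    if is_differential(vip, fdr, fc)
]
print(selected)  # ['M1', 'M2']
```

Note that M3 has a raw p-value below 0.05 but an adjusted value of about 0.051, so it is not selected; this is the effect of the multiple-comparison correction.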
In the discovery set, we integrated the metabolic features selected by VSURF to develop the reference model of logistic regression and 4 machine learning models for prediction of MACEs. The 4 machine learning algorithms were: (1) RF, an ensemble approach that produces multiple decision trees for classification, (2) extreme gradient boosting (XGBoost), another ensemble method that uses gradient boosting to build an additive model of decision trees, (3) nonlinear Support Vector Machines (SVM) with a polynomial kernel, and (4) deep multilayer neural network (DNN) with the adaptive moment estimation optimizer. Details of model parameters are listed in the Additional file 1: Supplementary Methods. The importance of each feature to the prediction was assessed by Gini index in RF and by relative importance values in XGBoost. The predictive performance of the 4 machine learning models was compared with that of the reference model (i.e., logistic regression) by measuring: (1) discrimination statistics including area under the receiver-operating characteristic curve (AUC), sensitivity, specificity, positive predictive value, and negative predictive value; (2) continuous net reclassification index (NRI); (3) calibration statistics (P-value of the Hosmer–Lemeshow test, calibration slope, and calibration plots); and (4) net clinical benefit through decision curve analyses. To adjust for potential over-fitting and over-optimism, a seven-fold cross validation was also performed to obtain a bias-corrected AUC for each model.
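The discrimination statistics listed above can be computed from predicted risks and observed outcomes as in the following sketch. The AUC is calculated through its rank interpretation (the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control). The labels, predicted risks, and the 0.5 cut-off are hypothetical and do not come from the study.

```python
def auc_score(labels, scores):
    """AUC as P(case score > control score), counting ties as 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def confusion_metrics(labels, scores, cutoff):
    """Sensitivity, specificity, PPV and NPV at a given risk cut-off."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)

    def ratio(a, b):
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical outcomes (1 = MACE) and predicted risks
labels = [1, 1, 1, 1, 0, 0, 0, 0]
risks = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = auc_score(labels, risks)
metrics = confusion_metrics(labels, risks, cutoff=0.5)
print(auc)      # 0.9375
print(metrics)  # all four metrics equal 0.75 for this toy input
```

The pairwise loop is quadratic in sample size, which is acceptable for cohorts of a few hundred patients; a rank-based formula would be preferable for large datasets.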
We used Kaplan–Meier curves and Cox regression to assess the prognostic values of the derived models. The proportional hazards assumption of Cox models was verified by visual inspection of log-minus-log plots and calculation of Schoenfeld residuals. Sensitivity analyses were also conducted to validate model performance after stratification by initial clinical presentations and different components of MACEs. For in vitro experiments, intergroup differences were compared using one-way ANOVA with LSD post hoc tests. All statistical analyses were conducted with R (version 3.5.3) and SIMCA-P (version 16.0.2). A P-value < 0.05 was considered significant.
As described in Table 1, after propensity score matching and random assignment, the baseline characteristics in both discovery and internal validation sets were similar between patients who experienced MACEs (n = 54 for each set) and matched controls (n = 54 for each set). In the external validation set, MACEs (n = 47) were more likely to occur in patients who were current smokers or had left ventricular ejection fractions of < 50%, multivessel CAD, or higher levels of glycated hemoglobin at baseline (Table 1).
Global metabolic alterations in patients with T2DM occurring MACEs after PCI
We first assessed the reliability of the LC–MS analysis using QC samples. As presented in Additional file 1: Figure S2, the Pearson correlation coefficients of metabolomic data among QC samples were greater than 0.99, indicating a good reproducibility of the LC–MS analysis. After data quality control and peak alignment, serum metabolome analyses annotated a total of 778 metabolites (discovery set: 743; internal validation set: 674; identified in both sets: 639), which were distributed across 10 ontology classes (Additional file 1: Table S3).
As illustrated in the OPLS-DA models, the metabolomic profile of the MACEs group was significantly distinct from that of the matched control group in both discovery (R2Y = 0.81, Q2 = 0.65, Fig. 2A) and internal validation sets (R2Y = 0.83, Q2 = 0.67, Fig. 2B). After 200 permutations, the intercepts of goodness-of-prediction (Q2) and goodness-of-fit (R2) values were within ± 0.5 (Fig. 2C and D), indicating that the OPLS-DA models were well explained and not overfitted. Based on the conditions of VIP values > 1.0, FDR < 0.05, and fold changes > 4/3 or < 3/4, the volcano plots depicted 69 differential metabolites (37 upregulated, 32 downregulated, Fig. 2E and Additional file 1: Table S4) in the discovery set and 89 differential metabolites (56 upregulated, 33 downregulated, Fig. 2F and Additional file 1: Table S5) in the internal validation set. Pathway enrichment analyses of the differential metabolites found 5 metabolic pathways significantly perturbed in patients with incident MACEs: nicotinate and nicotinamide metabolism, tryptophan metabolism, glycerophospholipid metabolism, pentose phosphate pathway, and glycolysis (Fig. 2G). Of all the differential metabolites, 35 were identified in both discovery and internal validation sets (20 upregulated, 15 downregulated, Fig. 2H), with the potential to better distinguish MACEs from matched controls (Additional file 1: Figure S3).
Development and internal validation of a 6-metabolite signature to predict MACEs
From the 35 metabolites differentially expressed in both data sets (Fig. 3A), we sought to select an optimal set of discriminators for MACEs by considering the balance between classification accuracy and parsimony. For this purpose, the normalized data of the 35 metabolites from the discovery set were entered into a tree-based VSURF algorithm, in which a total of 29 metabolites were gradually excluded due to low importance or high redundancy (Fig. 3B and C), finally leaving a subset of 6 metabolites with the lowest prediction error for multivariate modeling (Fig. 3D). Of particular note, among the 6 metabolites, 4 were involved in biosynthesis (nicotinamide [NAM] and L-tryptophan), consumption (adenosine diphosphate ribose [ADPR]), or excretion (1-methylnicotinamide [1-MNAM]) of NAD+, implying a crucial role of NAD+ metabolism in the occurrence of MACEs among diabetic patients who received PCI.
As depicted in the OPLS-DA models (Fig. 3E and F), the combination of these 6 metabolites could clearly separate patients with MACEs from matched controls in both discovery and internal validation sets. The logistic regression model composed of these 6 metabolites yielded an AUC of 0.89 [95% confidence interval (CI) 0.81–0.94] in the discovery set and 0.85 (95% CI 0.76–0.91) in the internal validation set for predicting MACEs (Fig. 3G). When the 2 datasets were divided into high-risk (> 62%) and low-risk (≤ 62%) groups based on the risk probability predicted by the 6-metabolite model (cut-off value derived from the ROC analysis), Kaplan–Meier estimates of the rates of MACEs significantly differed between high- and low-risk groups (P < 0.001, Additional file 1: Figure S4). After multivariable adjustment for potential confounding factors, the 6-metabolite model remained a powerful and independent prognostic predictor for MACEs [discovery set: hazard ratio (HR) = 8.92; internal validation set: HR = 6.17, both P < 0.001, Additional file 1: Figure S4].
Improving performance of the 6-metabolite model using machine learning
To evaluate whether the predictive performance of the 6-metabolite panel could be improved by the application of machine learning, we developed the 6-metabolite prediction models by incorporating the data of the discovery set into 4 machine learning algorithms: RF, XGBoost, SVM, and DNN. As summarized in Table 2, all of the machine learning models, except for DNN, yielded a significantly higher AUC (0.93–0.99, Pdifference < 0.05) than the reference model of logistic regression (AUC = 0.89). Likewise, the reclassification ability of 4 machine learning models was also improved, with continuous NRIs ranging from 0.96 to 1.93.
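For readers less familiar with the continuous (category-free) NRI reported here, the sketch below shows how it is computed when a new model is compared with a reference model. It is the sum of the net proportion of events whose predicted risk moves up and the net proportion of non-events whose predicted risk moves down, so it ranges from −2 to 2. The outcome labels and predicted risks are invented for illustration.

```python
def continuous_nri(labels, p_old, p_new):
    """Category-free NRI of a new model versus a reference model.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    """
    ev_up = ev_down = ne_up = ne_down = 0
    n_events = n_nonevents = 0
    for y, old, new in zip(labels, p_old, p_new):
        if y == 1:
            n_events += 1
            ev_up += new > old
            ev_down += new < old
        else:
            n_nonevents += 1
            ne_up += new > old
            ne_down += new < old
    return (ev_up - ev_down) / n_events + (ne_down - ne_up) / n_nonevents


# Hypothetical outcomes (1 = MACE) and predicted risks from two models
labels = [1, 1, 1, 1, 0, 0, 0, 0]
p_reference = [0.6, 0.5, 0.7, 0.4, 0.5, 0.4, 0.3, 0.6]
p_new_model = [0.8, 0.7, 0.6, 0.9, 0.2, 0.3, 0.4, 0.1]

nri = continuous_nri(labels, p_reference, p_new_model)
print(nri)  # 1.0
```

In this toy input, 3 of 4 events move up and 1 moves down (net 0.5), and 3 of 4 non-events move down and 1 moves up (net 0.5), giving an NRI of 1.0.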
When the machine learning models were applied in the internal validation set, the top 2 best-performing models were the RF and XGBoost models, which both showed significant improvements in discrimination and reclassification abilities to predict MACEs compared with the logistic regression model. Specifically, sensitivities for the RF and XGBoost models increased to ~ 85% compared with 72% for the logistic regression model, meaning that about 13% (7/54) of patients who developed MACEs after PCI would be correctly identified using the RF and XGBoost models but would be missed when the logistic regression model was applied (Additional file 1: Figure S5). Interestingly, all 4 metabolites related to NAD+ metabolism were highlighted as the top 4 important variables to the predictive outcomes in both RF and XGBoost models (Additional file 1: Figure S6). By contrast, the SVM and DNN models did not improve predictive performance relative to the regression model in the internal validation set.
For all prediction models, the average AUCs calculated by cross validation remained largely unchanged (Table 2); calibration plots showed a good agreement between predicted and observed outcomes (calibration slope around 1, Additional file 1: Figure S7).
External validation using a targeted metabolite analysis
When determining the absolute concentrations of the 6 metabolites using a targeted metabolite analysis in the external validation cohort, we found that 3 of the 6 metabolites were associated with a higher risk of MACEs and the other 3 were associated with a lower risk of MACEs (Fig. 4A–F), which was consistent with the results from the untargeted metabolomics mentioned above. After entering the normalized data (again using the B-MIS method) of the 6 metabolites into the established machine learning models, the AUCs for predicting MACEs reached 0.92 (95% CI 0.88–0.95, Pdifference = 0.005, Fig. 4G) in RF and 0.94 (95% CI 0.91–0.97, Pdifference < 0.001, Fig. 4G) in XGBoost, compared with 0.85 (95% CI 0.81–0.89) in logistic regression. Decision curve analyses also showed that both RF and XGBoost models had larger net benefits (i.e., a greater number of appropriate triage decisions) across the range of risk thresholds compared with the logistic regression model (Fig. 4H). When categorizing participants into high-risk and low-risk groups based on the prediction of the 6-metabolite classifier, the adjusted HR for MACEs was 7.68 (95% CI 3.57–16.55, Additional file 1: Figure S8) for the comparison of high-risk versus low-risk groups. Likewise, the 6-metabolite classifier achieved a smaller but still good AUC (0.83–0.89) to predict the device-oriented endpoint of TLF (Fig. 4I).
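The net benefit plotted in a decision curve analysis follows a simple formula: at a risk threshold pt, net benefit = TP/n − (FP/n) × pt/(1 − pt), where TP and FP are the numbers of true and false positives among n patients. The sketch below applies it to invented outcomes and predicted risks, and compares the model with the "treat all" strategy; the function name and data are assumptions for illustration only.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of classifying patients with risk >= threshold as high risk."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = MACE) and model-predicted risks for 10 patients
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1, 0.1, 0.1]

threshold = 0.5
nb_model = net_benefit(labels, probs, threshold)
# "Treat all" is equivalent to every patient having a predicted risk of 1.0
nb_treat_all = net_benefit(labels, [1.0] * len(labels), threshold)
print(round(nb_model, 3), round(nb_treat_all, 3))  # 0.2 -0.2
```

A decision curve repeats this calculation over a range of thresholds; a model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" value).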
Recently, the FREEDOM trial derived a personalized clinical risk model for MACE prediction in diabetic patients undergoing revascularization. Here, we further assessed the additional value of our 6-metabolite model beyond the FREEDOM tool. As shown in Additional file 1: Table S6, adding the RF-based 6-metabolite model to the FREEDOM tool substantially increased the C-index to 0.87 (95% CI 0.81–0.93) for predicting MACEs. The classification performance was also improved after addition of the 6-metabolite panel, with a categorical NRI of 0.60 (95% CI 0.45–0.75, P < 0.001) and an IDI of 0.27 (95% CI 0.22–0.32, P < 0.001).
Internal and external validation by sensitivity analyses
We first performed sensitivity analyses for assessing the performance of the 6-metabolite model in predicting different components of MACEs. As shown in Fig. 5, the 6-metabolite model consistently yielded high AUCs (≥ 0.90) for either predicting the combined end point of death, MI, and stroke or predicting repeat revascularization in both internal and external validation sets.
Then, considering that the prognosis following PCI significantly differed between patients initially presenting with acute coronary syndrome (ACS) and stable CAD (SCAD), sensitivity analyses were further conducted after stratification by initial clinical presentations. We observed that the AUCs of the 6-metabolite model for predicting MACEs were generally higher (0.93–0.96) in patients presenting with ACS, and slightly lower (0.87–0.90) in patients presenting with SCAD, but the differences were not statistically significant (Fig. 5).
Effects of altering NAD+ biosynthesis in HG-cultured HASMCs
Considering the importance of NAD+ metabolites in gaining predictive performance of our prediction model, we investigated the effects of altering NAD+ biosynthesis on the phenotypes of HASMCs under HG conditions. As shown in Fig. 6A and B, pharmacological inhibition of NAD+ biosynthesis by FK866 or 1MT led to a substantial reduction in basal NAD+ levels, accompanied by a marked deficit in mitochondrial complex I activity, which requires NADH, the reduced form of NAD+, for mitochondrial electron transfer. Consistent with abnormal changes in activities of mitochondrial complexes, significant reductions in parameters of mitochondrial respiration, including mitochondrial basal respiration, ATP-linked respiratory capacity, and maximal respiration (Fig. 6C and D), were observed along with increases in glycolytic flux after pharmacological blockade of NAD+ biosynthesis in HASMCs (Fig. 6E and F). Conversely, supplementation of NMN, an NAD+ precursor, increased basal NAD+ levels and enhanced mitochondrial respiratory capacities while decreasing glycolysis in HG-cultured HASMCs (Fig. 6A–F).
Given the potential link of aerobic glycolysis to inflammatory activation, we elected to further explore the impact of interfering with NAD+ biosynthesis on the HG-induced expression of an array of proinflammatory factors with documented roles in cardiovascular disease. Inhibition of NAD+ biosynthesis by FK866 or 1MT in HASMCs was found to significantly increase the production of a series of chemokines (MCP1, CCL3, CCL4, etc.) and interleukins (IL6, IL8, IL-1β, etc.), both in terms of mRNA expression and protein secretion (Fig. 6G and H). In parallel, exposure of HASMCs to FK866 or 1MT resulted in a more than 80% increase in chemotaxis of THP1 monocytes toward HASMCs (Fig. 6I), along with increased proliferation of HASMCs (Fig. 6J). In contrast, NMN supplementation normalized the production of proinflammatory factors, attenuated the ability of HASMCs to recruit THP1 monocytes, and inhibited HASMC proliferation (Fig. 6G–J).
Diabetes is regarded as one of the most significant prognostic factors for adverse outcomes following PCI, with hazard ratios ranging from 1.9 to 2.5. Mounting evidence also indicates that diabetes causes increased proliferation of vascular smooth muscle cells, more extensive neointimal hyperplasia, and consequently more severe restenosis after stent implantation, highlighting the view that the mechanism underlying the adverse outcomes after PCI in diabetes is probably different from that in non-diabetes. Specific to metabolomic studies, there has been epidemiological evidence showing that diabetic patients with macrovascular complications have distinct circulating metabolic profiles compared with those without [46, 47]. Recently, Cui and colleagues constructed a metabolite panel of phospholipids and sphingolipids with high accuracy (AUC > 0.90) for diagnosis of stent restenosis in non-diabetic patients. However, this metabolite panel only achieved a modest AUC of < 0.70 (data not shown) for predicting MACEs in our cohorts of type 2 diabetic patients receiving PCI, implying that the metabolite fingerprint associated with the occurrence of adverse outcomes after PCI may also differ between diabetic and non-diabetic status.
Hence, for the first time, the present study focused on the identification of differential metabolic patterns at baseline to predict the incidence of MACEs at 1 year after PCI for patients with T2DM. By applying both untargeted and targeted metabolomic approaches, we first found a significant difference in serum metabolome profiles between type 2 diabetic patients with and without incident MACEs following PCI. Then, the Venn diagram depicted 35 metabolites differentially expressed in both discovery and internal validation sets, which had the potential to better discriminate patients with the occurrence of MACEs from matched controls. For constructing the parsimonious model that would be more achievable in the clinical setting, our next step was to select an optimal set of predictors from the 35 metabolites.
Considering that the 35 metabolites belong to several metabolic pathways with possible interconnections, we did not use the traditional feature selection methods like logistic regression, because fitting a logistic regression model is, in fact, algebraically difficult when there are too many variables with complex interactions between each other. Instead, the 35 metabolic features were filtered by the VSURF algorithm, which is a recommended tree-based method to identify the most important variables for classification after accounting for the complex, nonlinear relations in the dataset [11, 31]. As a result, a total of 6 metabolic features, including 4 NAD+ metabolites, 1 phosphatidylcholine lipid, and 1 sugar phosphate, were finally selected for multivariate modeling.
Then, we went on to consider how the 6 metabolites could best be incorporated to increase predictive accuracy. For this purpose, we trained models on the 6 metabolites using a series of powerful machine learning algorithms, including ensemble methods like RF and XGBoost, a nonlinear method like SVM, and a multilayer neural network. The testing results showed that the 6-metabolite models based on RF and XGBoost were the best-performing models, with high discriminative accuracy (AUC ≥ 0.90) in both internal and external validation cohorts. Of note, the two ensemble models could detect an additional 13% of patients in whom the occurrence of MACEs following PCI would not be identified when using the traditional logistic regression model. If these findings can be verified, our 6-metabolite classifier may serve as a helpful tool for identifying patients at high risk for MACEs and guiding early prevention among patients with T2DM undergoing PCI.
An interesting finding of the current study is that our prediction model includes 4 key metabolites involved in biosynthesis (NAM, L-tryptophan), consumption (ADPR), or excretion (1-MNAM) of NAD+, suggesting a possible link of NAD+ metabolism to adverse outcomes following PCI. Physiologically, NAD+ can be synthesized de novo starting with tryptophan, or via the salvage pathway starting with NAD+ precursors like NAM derived from cellular NAD+ metabolism or dietary supply. Under critical conditions (e.g., acute DNA injury and cell death), the synthesized NAD+ can be excessively consumed by hyperactivated poly(ADP-ribose) polymerases to produce ADPR and NAM as byproducts, in which ADPR continues to form poly(ADP-ribose) chains with pivotal effects on posttranslational modification of target proteins, whereas NAM can be methylated to form 1-MNAM that is mainly excreted in urine. Our previous bidirectional Mendelian randomization study has shown that a higher extent of leukocyte poly(ADP-ribose), as a hallmark of massive NAD+ consumption, is causally associated with the incidence of 1-year MACEs after PCI. The present study extends our previous work by providing experimental evidence that pharmacologically blocking the salvage or de novo biosynthetic pathways of NAD+ causes abnormal changes in bioenergetic profiles, upregulated expression of proinflammatory factors, increased chemotaxis of monocytes to HASMCs, and enhanced proliferation of HASMCs. In contrast, sustaining NAD+ levels via NMN supplementation may inhibit HG-induced glycolysis, pro-inflammation, and proliferation of HASMCs, all of which are aberrant phenotypes related to incident MACEs after PCI [42, 51]. These findings are in line with recent evidence that the intrinsic NAD+ fueling system is essential to protect against DNA damage, premature senescence, and chaotic migration of smooth muscle cells [52, 53], supporting the close link between NAD+ biosynthesis and phenotypic switching of HASMCs.
Collectively, our in vitro data may provide an experimental foundation for the significant effects of NAD+ metabolites on post-PCI outcomes.
Our study is the first to develop a machine learning-based metabolite classifier to predict incident MACEs at 1 year after PCI for type 2 diabetic patients, with rigorous steps for model specification and evaluation of model performance (i.e., discrimination, calibration, and clinical usefulness). Nevertheless, our study also has limitations. First, owing to a relatively short follow-up period, we could not evaluate the predictive performance of our 6-metabolite model for long-term outcomes after PCI. Second, although we validated the 6-metabolite model using both internal and external datasets, its predictive utility should be confirmed in larger independent cohorts, especially in other geographic populations. Third, sensitivity analyses showed that the predictive accuracy of the 6-metabolite model might decrease slightly in patients initially presenting with SCAD. We therefore cannot rule out the possibility that simultaneously enrolling patients with different clinical presentations confounded the performance of our prediction model. Fourth, other angiographic parameters, such as lesion type, vessel tortuosity, and presence of thrombus, might provide additional information on disease progression, but were not available in the current study. Fifth, although we computed variable importance to identify the predictors that most affected classification, the prediction models based on RF and XGBoost may still be harder to interpret than a regression model that simply weights predictors by fixed coefficients. Finally, more experimental research is needed to better understand the exact mechanism underlying the predictive value of NAD+ metabolites.
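The performance dimensions named above (discrimination, calibration, and clinical usefulness) are usually quantified with different statistics. As a hedged illustration only, the sketch below computes a Brier score, a simple overall measure that reflects calibration, and the net benefit used in decision curve analysis, one common way to express clinical usefulness. Whether these exact statistics were used in the study is an assumption; the function names, the threshold, and the data are hypothetical.

```python
from typing import Sequence


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equally long")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def net_benefit(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    Net benefit = TP/n - FP/n * (threshold / (1 - threshold)),
    so false positives are penalised more heavily at higher thresholds.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data (hypothetical, not from the study): 1 = MACE, 0 = no MACE.
toy_labels = [1, 1, 0, 0]
toy_probs = [0.75, 0.5, 0.25, 0.5]

print(brier_score(toy_labels, toy_probs))
print(net_benefit(toy_labels, toy_probs, threshold=0.5))
```

A lower Brier score indicates predicted risks closer to observed outcomes, while a model is considered clinically useful at a given threshold when its net benefit exceeds that of treating all or no patients.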
Using an array of machine learning algorithms, we developed a 6-metabolite signature with high accuracy for predicting incident MACEs at 1 year after PCI in patients with T2DM. A diagnostic test based on this metabolite model is clinically achievable because of the small number of metabolites it includes, and may be useful for early identification of type 2 diabetic patients at high risk for adverse post-PCI outcomes. Our study also provides the first evidence for a critical role of abnormal changes in NAD+ metabolites in the occurrence of adverse outcomes after PCI under diabetic conditions.
Availability of data and materials
All data and methods supporting the findings of this study are available from the corresponding author upon reasonable request.
Abbreviations
T2DM: Type 2 diabetes mellitus
CAD: Coronary artery disease
PCI: Percutaneous coronary intervention
MACEs: Major adverse cardiovascular events
NAD+: Nicotinamide adenine dinucleotide
HASMCs: Human aortic smooth muscle cells
FDR: False discovery rate
TLF: Target lesion failure
LC-MS: Liquid chromatography-mass spectrometry
OPLS-DA: Orthogonal partial least-squares discriminant analysis
VIP: Variable importance in the projection
VSURF: Variable selection using random forests
XGBoost: Extreme gradient boosting
SVM: Support vector machines
DNN: Deep neural network
AUC: Area under the receiver-operating characteristic curve
ADPR: Adenosine diphosphate ribose
ACS: Acute coronary syndrome
Chichareon P, Modolo R, Kogame N, Takahashi K, Chang CC, Tomaniak M, et al. Association of diabetes with outcomes in patients undergoing contemporary percutaneous coronary intervention: Pre-specified subgroup analysis from the randomized GLOBAL LEADERS study. Atherosclerosis. 2020;295:45–53.
Kedhi E, Généreux P, Palmerini T, McAndrew TC, Parise H, Mehran R, et al. Impact of coronary lesion complexity on drug-eluting stent outcomes in patients with and without diabetes mellitus: analysis from 18 pooled randomized trials. J Am Coll Cardiol. 2014;63(20):2111–8.
Godoy LC, Lawler PR, Farkouh ME, Hersen B, Nicolau JC, Rao V. Urgent revascularization strategies in patients with diabetes mellitus and acute coronary syndrome. Can J Cardiol. 2019;35(8):993–1001.
Ma X, Dong L, Shao Q, Cheng Y, Lv S, Sun Y, et al. Triglyceride glucose index for predicting cardiovascular outcomes after percutaneous coronary intervention in patients with type 2 diabetes mellitus and acute coronary syndrome. Cardiovasc Diabetol. 2020;19(1):31.
Cui L, Lu H, Lee YH. Challenges and emergent solutions for LC-MS/MS based untargeted metabolomics in diseases. Mass Spectrom Rev. 2018;37(6):772–92.
Johnson CH, Ivanisevic J, Siuzdak G. Metabolomics: beyond biomarkers and towards mechanisms. Nat Rev Mol Cell Biol. 2016;17(7):451–9.
Ruiz-Canela M, Hruby A, Clish CB, Liang L, Martínez-González MA, Hu FB. Comprehensive metabolomic profiling and incident cardiovascular disease: a systematic review. J Am Heart Assoc. 2017. https://doi.org/10.1161/JAHA.117.005705.
Khan A, Choi Y, Back JH, Lee S, Jee SH, Park YH. High-resolution metabolomics study revealing l-homocysteine sulfinic acid, cysteic acid, and carnitine as novel biomarkers for high acute myocardial infarction risk. Metabolism. 2020;104: 154051.
Tzoulaki I, Castagné R, Boulangé CL, Karaman I, Chekmeneva E, Evangelou E, et al. Serum metabolic signatures of coronary and carotid atherosclerosis and subsequent cardiovascular disease. Eur Heart J. 2019;40(34):2883–96.
Wang Z, Zhu C, Nambi V, Morrison AC, Folsom AR, Ballantyne CM, et al. Metabolomic pattern predicts incident coronary heart disease. Arterioscler Thromb Vasc Biol. 2019;39(7):1475–82.
Johnson KW, Torres Soto J, Glicksberg BS, Shameer K, Miotto R, Ali M, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668–79.
Cui NH, Yang JM, Liu X, Wang XB. Poly(ADP-Ribose) polymerase activity and coronary artery disease in type 2 diabetes mellitus: an observational and bidirectional mendelian randomization study. Arterioscler Thromb Vasc Biol. 2020;40(10):2516–26.
Wang XB, Cui NH, Liu X, Liu X. Joint effects of mitochondrial DNA4977 deletion and serum folate deficiency on coronary artery disease in type 2 diabetes mellitus. Clin Nutr. 2020;39(12):3771–8.
American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care. 2014;37(Suppl 1):S81-90.
Ouellette ML, Löffler AI, Beller GA, Workman VK, Holland E, Bourque JM. Clinical characteristics, sex differences, and outcomes in patients with normal or near-normal coronary arteries, non-obstructive or obstructive coronary artery disease. J Am Heart Assoc. 2018. https://doi.org/10.1161/JAHA.117.007965.
Elze MC, Gregson J, Baber U, Williamson E, Sartori S, Mehran R, et al. Comparison of propensity score methods and covariate adjustment: evaluation in 4 cardiovascular studies. J Am Coll Cardiol. 2017;69(3):345–57.
Wang XB, Cui NH, Liu X, Liu X. Mitochondrial 8-hydroxy-2’-deoxyguanosine and coronary artery disease in patients with type 2 diabetes mellitus. Cardiovasc Diabetol. 2020;19(1):22.
Xu B, Yang Y, Han Y, Huo Y, Wang L, Qi X, et al. Comparison of everolimus-eluting bioresorbable vascular scaffolds and metallic stents: three-year clinical outcomes from the ABSORB China randomised trial. EuroIntervention. 2018;14(5):e554–61.
Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350: g7594.
Hustad S, Eussen S, Midttun Ø, Ulvik A, van de Kant PM, Mørkrid L, et al. Kinetic modeling of storage effects on biomarkers related to B vitamin status and one-carbon metabolism. Clin Chem. 2012;58(2):402–10.
Forsberg EM, Huan T, Rinehart D, Benton HP, Warth B, Hilmers B, et al. Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online. Nat Protoc. 2018;13(4):633–51.
Boysen AK, Heal KR, Carlson LT, Ingalls AE. Best-matched internal standard normalization in liquid chromatography-mass spectrometry metabolomics applied to environmental samples. Anal Chem. 2018;90(2):1363–9.
Shen X, Wang C, Liang N, Liu Z, Li X, Zhu Z-J, et al. Serum metabolomics identifies dysregulated pathways and potential metabolic biomarkers for hyperuricemia and gout. Arthritis Rheumatol. 2021;73(9):1738–48.
Minhas PS, Liu L, Moon PK, Joshi AU, Dove C, Mhatre S, et al. Macrophage de novo NAD synthesis specifies immune function in aging and inflammation. Nat Immunol. 2019;20(1):50–63.
Rodenburg RJ. Biochemical diagnosis of mitochondrial disorders. J Inherit Metab Dis. 2011;34(2):283–92.
Alesutan I, Moritz F, Haider T, Shouxuan S, Gollmann-Tepeköylü C, Holfeld J, et al. Impact of β-glycerophosphate on the bioenergetic profile of vascular smooth muscle cells. J Mol Med. 2020;98(7):985–97.
Gardner SE, Humphry M, Bennett MR, Clarke MC. Senescent vascular smooth muscle cells drive inflammation through an interleukin-1α-dependent senescence-associated secretory phenotype. Arterioscler Thromb Vasc Biol. 2015;35(9):1963–74.
Glickman ME, Rao SR, Schultz MR. False discovery rate control is a recommended alternative to Bonferroni-type adjustments in health studies. J Clin Epidemiol. 2014;67(8):850–7.
Liu J, Geng W, Sun H, Liu C, Huang F, Cao J, et al. Integrative metabolomic characterisation identifies altered portal vein serum metabolome contributing to human hepatocellular carcinoma. Gut. 2021. https://doi.org/10.1136/gutjnl-2021-325189.
Genuer R, Poggi J-M, Tuleau-Malot C. VSURF: An R package for variable selection using random forests. R J. 2016;7(2):19–33.
Speiser JL, Miller ME, Tooze J, Ip E. A comparison of random forest variable selection methods for classification prediction modeling. Expert Syst Appl. 2019;134:93–101.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Tang X, Tang R, Sun X, Yan X, Huang G, Zhou H, et al. A clinical diagnostic model based on an eXtreme Gradient Boosting algorithm to distinguish type 1 diabetes. Ann Transl Med. 2021;9(5):409.
Huang S, Cai N, Pacheco PP, Narrandes S, Wang Y, Xu W. Applications of Support Vector Machine (SVM) learning in cancer genomics. Cancer Genom Proteom. 2018;15(1):41–51.
Liu X, Pan Z, Yang H, Zhou X, Bai W, Niu X. An adaptive moment estimation method for online AUC maximization. PLoS ONE. 2019;14(4): e0215426.
Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925–31.
Covarrubias AJ, Perrone R, Grozio A, Verdin E. NAD metabolism and its roles in cellular processes during ageing. Nat Rev Mol Cell Biol. 2021;22(2):119–41.
Qintar M, Humphries KH, Park JE, Arnold S, Spertus JA. Individualizing revascularization strategy for diabetic patients with multivessel coronary disease. J Am Coll Cardiol. 2019;74(16):2074–84.
Alcock RF, Yong AS, Ng AC, Chow V, Cheruvu C, Aliprandi-Costa B, et al. Acute coronary syndrome and stable coronary artery disease: are they so different? Long-term outcomes in a contemporary PCI cohort. Int J Cardiol. 2013;167(4):1343–6.
Fiedorczuk K, Sazanov LA. Mammalian mitochondrial complex I structure and disease-causing mutations. Trends Cell Biol. 2018;28(10):835–67.
Wen H, Ting JP, O’Neill LA. A role for the NLRP3 inflammasome in metabolic diseases–did Warburg miss inflammation? Nat Immunol. 2012;13(4):352–7.
Lexis CP, Rahel BM, Meeder JG, Zijlstra F, van der Horst IC. The role of glucose lowering agents on restenosis after percutaneous coronary intervention in patients with diabetes mellitus. Cardiovasc Diabetol. 2009;8:41.
Xi G, Shen X, Wai C, White MF, Clemmons DR. Hyperglycemia induces vascular smooth muscle cell dedifferentiation by suppressing insulin receptor substrate-1-mediated p53/KLF4 complex stabilization. J Biol Chem. 2019;294(7):2407–21.
Tanaka N, Terashima M, Rathore S, Itoh T, Habara M, Nasu K, et al. Different patterns of vascular response between patients with or without diabetes mellitus after drug-eluting stent implantation: optical coherence tomographic analysis. JACC Cardiovasc Interv. 2010;3(10):1074–9.
Fröbert O, Lagerqvist B, Carlsson J, Lindbäck J, Stenestrand U, James SK. Differences in restenosis rate with different drug-eluting stents in patients with and without diabetes mellitus: a report from the SCAAR (Swedish Angiography and Angioplasty Registry). J Am Coll Cardiol. 2009;53(18):1660–7.
Ottosson F, Smith E, Fernandez C, Melander O. Plasma metabolites associate with all-cause mortality in individuals with type 2 diabetes. Metabolites. 2020;10(8):315.
Welsh P, Rankin N, Li Q, Mark PB, Würtz P, Ala-Korpela M, et al. Circulating amino acids and the risk of macrovascular, microvascular and mortality outcomes in individuals with type 2 diabetes: results from the ADVANCE trial. Diabetologia. 2018;61(7):1581–91.
Cui S, Li K, Ang L, Liu J, Cui L, Song X, et al. Plasma phospholipids and sphingolipids identify stent restenosis after percutaneous coronary intervention. JACC Cardiovasc Interv. 2017;10(13):1307–16.
Tannous C, Booz GW, Altara R, Muhieddine DH, Mericskay M, Refaat MM, et al. Nicotinamide adenine dinucleotide: biosynthesis, consumption and therapeutic role in cardiac diseases. Acta Physiol. 2021;231(3): e13551.
Bürkle A, Virág L. Poly(ADP-ribose): PARadigms and PARadoxes. Mol Aspects Med. 2013;34(6):1046–65.
Hytönen J, Leppänen O, Braesen JH, Schunck WH, Mueller D, Jung F, et al. Activation of peroxisome proliferator-activated receptor-δ as novel therapeutic strategy to prevent in-stent restenosis and stent thrombosis. Arterioscler Thromb Vasc Biol. 2016;36(8):1534–48.
Watson A, Nong Z, Yin H, O’Neil C, Fox S, Balint B, et al. Nicotinamide phosphoribosyltransferase in smooth muscle cells maintains genome integrity, resists aortic medial degeneration, and is suppressed in human thoracic aortic aneurysm disease. Circ Res. 2017;120(12):1889–902.
Yin H, van der Veer E, Frontini MJ, Thibert V, O’Neil C, Watson A, et al. Intrinsic directionality of migrating vascular smooth muscle cells is regulated by NAD(+) biosynthesis. J Cell Sci. 2012;125(Pt 23):5770–80.
Krittanawong C, Johnson KW, Rosenson RS, Wang Z, Aydar M, Baber U, et al. Deep learning for cardiovascular medicine: a practical primer. Eur Heart J. 2019;40(25):2058–73.
This study was supported by the National Basic Research Program of China (82170343 and 81800317 to XW) and the Union Program of the Key Scientific and Technological Project of Henan Province (LHGJ20190968 to NC).
Ethics approval and consent to participate
The study protocol was approved by the Ethics Committees of the participating centers. Written informed consent was obtained from all participants.
Consent for publication
All authors approved the manuscript for publication.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figure S1. The Z distribution of methionine abundance in both discovery and internal validation sets. The Z scores of methionine across all samples were greater than -3 (-3SD), indicating that the serum samples were properly stored for metabolite detection.
Figure S2. The Pearson correlation between metabolic data of quality control samples assessing the reliability of the LC-MS analysis in the discovery (A) and internal validation sets (B).
Figure S3. The OPLS-DA analysis assessing the performance of 35 differential metabolites for discrimination of MACEs from matched controls in the discovery (A) and internal validation sets (B).
Figure S4. The differences in cumulative rates of MACEs between participants with high-risk and low-risk scores of the 6-metabolite signature. P values were derived from Cox regression with adjustment for age, sex, smoking status, obesity (BMI > 25 kg/m2), hypertension, HbA1c, LVEF < 50%, clinical presentations, multivessel CAD, SYNTAX score, and stent types.
Figure S5. Scatter plots comparing the predictive performance of the random forest (A) and XGBoost (B) models with the logistic regression model of the 6-metabolite panel. The models were generated using the discovery dataset and are presented here in the internal validation set. Red lines indicate the cut-offs of the random forest and XGBoost models; black lines indicate the cut-off of the logistic regression model. Black circles label MACEs that would be identified using the random forest and XGBoost models but would be missed when the logistic regression model is applied.
Figure S6. The importance of each predictor of the 6-metabolite classifier constructed by random forest (A) and XGBoost (B). Abbreviations: NAM, nicotinamide; ADPR, adenosine diphosphate ribose; 1-MNAM, 1-methylnicotinamide; PC, phosphatidylcholine.
Figure S7. Calibration plots for the logistic regression and 4 machine learning models of the 6-metabolite classifier in the discovery (A) and internal validation sets (B). Abbreviations: RF, random forest; XGBoost, extreme gradient boosting; SVM, support vector machines; DNN, deep neural network.
Figure S8. Backward stepwise Cox regression analyses of the association between the 6-metabolite classifier and MACEs in the external validation set. (A) The log-minus-log plot for graphically testing the proportional hazards assumption. (B) Kaplan-Meier curve for assessing the performance of the 6-metabolite classifier to predict MACEs. In the backward stepwise Cox regression analyses, variables including age, sex, smoking status, obesity (BMI > 25 kg/m2), hypertension, HbA1c, LVEF < 50%, clinical presentations, multivessel CAD, SYNTAX score, stent types, and the 6-metabolite classifier were first entered one at a time. Then, 4 variables with P < 0.10 (i.e., HbA1c, LVEF < 50%, multivessel CAD, and the 6-metabolite classifier) in the stepwise procedure were retained to fit the final model. The HR and P value were calculated accordingly.
Table S1. Calibration data for 6 metabolites detected by targeted metabolite analyses.
Table S2. List of primers used in RT-qPCR.
Table S3. Ontology classes of metabolites in the discovery and internal validation sets.
Table S4. 69 differential metabolites in the discovery set.
Table S5. 89 differential metabolites in the internal validation set.
Table S6. The additional values of the RF-based 6-metabolite model beyond the FREEDOM clinical risk score in the external validation set.
About this article
Cite this article
Wang, Xb., Cui, Nh. & Liu, X. A novel 6-metabolite signature for prediction of clinical outcomes in type 2 diabetic patients undergoing percutaneous coronary intervention. Cardiovasc Diabetol 21, 126 (2022). https://doi.org/10.1186/s12933-022-01561-1