The lasso method for variable selection in the Cox model.
Tibshirani R
Statistics in medicine
I propose a new method for variable selection and shrinkage in Cox's proportional hazards model. My proposal minimizes the negative log partial likelihood subject to the sum of the absolute values of the parameters being bounded by a constant. Because of the nature of this constraint, it shrinks coefficients and produces some coefficients that are exactly zero. As a result it reduces the estimation variance while providing an interpretable final model. The method is a variation of the 'lasso' proposal of Tibshirani, designed for the linear regression context. Simulations indicate that the lasso can be more accurate than stepwise selection in this setting.
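The constrained objective described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the function names, the bound `s`, and the multiplier `lam` are assumptions introduced here, ties are handled Breslow-style (every subject still at risk at an event time enters the risk set), and no optimizer is included.

```python
import math


def neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood (Breslow-style risk sets)."""
    # Linear predictor x_i' beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone whose follow-up time is at least t_i.
        risk = [eta[j] for j, t_j in enumerate(time) if t_j >= t_i]
        m = max(risk)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(r - m) for r in risk))
        total += eta[i] - log_sum
    return -total


def l1_norm(beta):
    """Sum of the absolute values of the coefficients."""
    return sum(abs(b) for b in beta)


def within_bound(beta, s):
    """Constraint form: the L1 norm must not exceed the constant s."""
    return l1_norm(beta) <= s


def lasso_objective(beta, X, time, event, lam):
    """Lagrangian form of the same problem (lam is a hypothetical multiplier)."""
    return neg_log_partial_likelihood(beta, X, time, event) + lam * l1_norm(beta)
```

Minimizing `lasso_objective` over `beta` for a fixed `lam` corresponds to the bounded problem for some value of `s`; larger `lam` (smaller `s`) drives more coefficients to exactly zero.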
Prediction of Outcome in Patients With Chronic Lymphocytic Leukemia Treated With Ibrutinib: Development and Validation of a Four-Factor Prognostic Model.
Ahn Inhye E,Tian Xin,Ipe David,Cheng Mei,Albitar Maher,Tsao L Claire,Zhang Lei,Ma Wanlong,Herman Sarah E M,Gaglione Erika M,Soto Susan,Dean James P,Wiestner Adrian
Journal of clinical oncology : official journal of the American Society of Clinical Oncology
PURPOSE:Randomized trials established the superiority of ibrutinib-based therapy over chemoimmunotherapy in chronic lymphocytic leukemia. Durability of progression-free survival (PFS) with ibrutinib can vary by patient subgroup. Clinical tools for prognostication and risk-stratification are needed. PATIENTS AND METHODS:Patients treated with ibrutinib in phase II and III trials provided the discovery data set and were subdivided into discovery and internal validation cohorts. An external validation cohort included 84 patients enrolled in our investigator-initiated phase II trial. Univariable analysis of 18 pretreatment parameters was performed using PFS and overall survival (OS) end-points. Multivariable analysis and machine-learning algorithms identified four factors for a prognostic model that was validated in internal and external cohorts. RESULTS:Factors independently associated with inferior PFS and OS were as follows: aberration, prior treatment, β-2 microglobulin ≥ 5 mg/L, and lactate dehydrogenase > 250 U/L. Each of these four factors contributed one point to a prognostic model that stratified patients into three risk groups: three to four points, high risk; two points, intermediate risk; zero to one point, low risk. The 3-year PFS rates for all 804 patients combined were 47%, 74%, and 87% for the high-, the intermediate-, and the low-risk group, respectively (P < .0001). The 3-year OS rates were 63%, 83%, and 93%, respectively (P < .0001). The model remained significant when applied to treatment-naïve and relapsed/refractory cohorts individually. For 84 patients in the external cohort, and mutations were tested cross-sectionally and at progression. The cumulative incidences of mutations were strongly correlated with the model. In the external cohort, Richter's transformation occurred in 17% of the high-risk group, and in no patient in the low-risk group.
CONCLUSION:Patients at increased risk of ibrutinib failure can be identified at treatment initiation and considered for clinical trials.
10.1200/JCO.20.00979
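The point-based model in this abstract is simple enough to write out. The sketch below follows the thresholds and group boundaries stated in the abstract; the function and parameter names are hypothetical, and the first factor is kept as a generic `aberration` flag because the abstract as reproduced here does not name it.

```python
def four_factor_score(aberration, prior_treatment, b2m_mg_per_l, ldh_u_per_l):
    """Return 0-4 points, one per adverse factor listed in the abstract."""
    return (
        int(bool(aberration))          # aberration present (not named in this text)
        + int(bool(prior_treatment))   # previously treated
        + int(b2m_mg_per_l >= 5.0)     # beta-2 microglobulin >= 5 mg/L
        + int(ldh_u_per_l > 250.0)     # lactate dehydrogenase > 250 U/L
    )


def risk_group(score):
    """Map points to the three risk groups defined in the abstract."""
    if score >= 3:
        return "high"
    if score == 2:
        return "intermediate"
    return "low"
```

For example, a previously treated patient with the aberration, β-2 microglobulin of exactly 5 mg/L, and lactate dehydrogenase of exactly 250 U/L scores three points: the first threshold is inclusive and the second is strict.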
Rituximab versus tocilizumab in rheumatoid arthritis: synovial biopsy-based biomarker analysis of the phase 4 R4RA randomized trial.
Nature medicine
Patients with rheumatoid arthritis (RA) receive highly targeted biologic therapies without previous knowledge of target expression levels in the diseased tissue. Approximately 40% of patients do not respond to individual biologic therapies and 5-20% are refractory to all. In a biopsy-based, precision-medicine, randomized clinical trial in RA (R4RA; n = 164), patients with low/absent synovial B cell molecular signature had a lower response to rituximab (anti-CD20 monoclonal antibody) compared with that to tocilizumab (anti-IL6R monoclonal antibody) although the exact mechanisms of response/nonresponse remain to be established. Here, in-depth histological/molecular analyses of R4RA synovial biopsies identify humoral immune response gene signatures associated with response to rituximab and tocilizumab, and a stromal/fibroblast signature in patients refractory to all medications. Post-treatment changes in synovial gene expression and cell infiltration highlighted divergent effects of rituximab and tocilizumab relating to differing response/nonresponse mechanisms. Using ten-by-tenfold nested cross-validation, we developed machine learning algorithms predictive of response to rituximab (area under the curve (AUC) = 0.74), tocilizumab (AUC = 0.68) and, notably, multidrug resistance (AUC = 0.69). This study supports the notion that disease endotypes, driven by diverse molecular pathology pathways in the diseased tissue, determine diverse clinical and treatment-response phenotypes. It also highlights the importance of integration of molecular pathology signatures into clinical algorithms to optimize the future use of existing medications and inform the development of new drugs for refractory patients.
10.1038/s41591-022-01789-0
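Nested cross-validation, as mentioned in this abstract, separates hyperparameter tuning (inner folds) from performance estimation (outer folds). The sketch below shows only the loop structure. It reads "ten-by-tenfold" as ten outer folds with ten inner folds each, which is an assumption, and `fit`, `score`, and `param_grid` are hypothetical placeholders for whatever model and metric are used.

```python
import random


def kfold(indices, k, rng):
    """Shuffle the indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv(n, fit, score, param_grid, outer_k=10, inner_k=10, seed=0):
    """Return one held-out score per outer fold.

    fit(train_idx, param) -> model; score(model, test_idx) -> float,
    where higher is better (for example an AUC).
    """
    rng = random.Random(seed)
    outer_folds = kfold(range(n), outer_k, rng)
    outer_scores = []
    for held_out in outer_folds:
        # Outer training set: every sample outside the held-out fold.
        train = [i for fold in outer_folds if fold is not held_out for i in fold]
        inner_folds = kfold(train, inner_k, rng)

        def inner_score(param):
            # Mean validation score of one candidate over the inner folds.
            total = 0.0
            for val in inner_folds:
                fit_idx = [i for fold in inner_folds if fold is not val for i in fold]
                total += score(fit(fit_idx, param), val)
            return total / inner_k

        # Tune on inner folds only, then refit on the full outer training set.
        best = max(param_grid, key=inner_score)
        outer_scores.append(score(fit(train, best), held_out))
    return outer_scores
```

Because the held-out fold never enters the inner loop, the outer scores are not biased by the tuning step.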
Precision Phenotyping of Dilated Cardiomyopathy Using Multidimensional Data.
Journal of the American College of Cardiology
BACKGROUND:Dilated cardiomyopathy (DCM) is a final common manifestation of heterogenous etiologies. Adverse outcomes highlight the need for disease stratification beyond ejection fraction. OBJECTIVES:The purpose of this study was to identify novel, reproducible subphenotypes of DCM using multiparametric data for improved patient stratification. METHODS:Longitudinal, observational UK-derivation (n = 426; median age 54 years; 67% men) and Dutch-validation (n = 239; median age 56 years; 64% men) cohorts of DCM patients (enrolled 2009-2016) with clinical, genetic, cardiovascular magnetic resonance, and proteomic assessments. Machine learning with profile regression identified novel disease subtypes. Penalized multinomial logistic regression was used for validation. Nested Cox models compared novel groupings to conventional risk measures. Primary composite outcome was cardiovascular death, heart failure, or arrhythmia events (median follow-up 4 years). RESULTS:In total, 3 novel DCM subtypes were identified: profibrotic metabolic, mild nonfibrotic, and biventricular impairment. Prognosis differed between subtypes in both the derivation (P < 0.0001) and validation cohorts. The novel profibrotic metabolic subtype had more diabetes, universal myocardial fibrosis, preserved right ventricular function, and elevated creatinine. For clinical application, 5 variables were sufficient for classification (left and right ventricular end-systolic volumes, left atrial volume, myocardial fibrosis, and creatinine). Adding the novel DCM subtype improved the C-statistic from 0.60 to 0.76. Interleukin-4 receptor-alpha was identified as a novel prognostic biomarker in derivation (HR: 3.6; 95% CI: 1.9-6.5; P = 0.00002) and validation cohorts (HR: 1.94; 95% CI: 1.3-2.8; P = 0.00005). CONCLUSIONS:Three reproducible, mechanistically distinct DCM subtypes were identified using widely available clinical and biological data, adding prognostic value to traditional risk models. 
They may improve patient selection for novel interventions, thereby enabling precision medicine.
10.1016/j.jacc.2022.03.375
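The C-statistic quoted in this abstract measures how well a risk score orders patients by outcome. A minimal sketch of one common version, Harrell's concordance index, is given below. The abstract does not say which estimator was used, so the choice is an assumption, and the function name is hypothetical.

```python
def harrell_c(time, event, risk):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's follow-up time. It is concordant when
    the earlier-failing subject has the higher risk score; tied risk
    scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported change from 0.60 to 0.76 should be read.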
Machine learning for ECG diagnosis and risk stratification of occlusion myocardial infarction.
Nature medicine
Patients with occlusion myocardial infarction (OMI) and no ST-elevation on presenting electrocardiogram (ECG) are increasing in numbers. These patients have a poor prognosis and would benefit from immediate reperfusion therapy, but, currently, there are no accurate tools to identify them during initial triage. Here we report, to our knowledge, the first observational cohort study to develop machine learning models for the ECG diagnosis of OMI. Using 7,313 consecutive patients from multiple clinical sites, we derived and externally validated an intelligent model that outperformed practicing clinicians and other widely used commercial interpretation systems, substantially boosting both precision and sensitivity. Our derived OMI risk score provided enhanced rule-in and rule-out accuracy relevant to routine care, and, when combined with the clinical judgment of trained emergency personnel, it helped correctly reclassify one in three patients with chest pain. ECG features driving our models were validated by clinical experts, providing plausible mechanistic links to myocardial injury.
10.1038/s41591-023-02396-3
Gut microbiome modulates the effects of a personalised postprandial-targeting (PPT) diet on cardiometabolic markers: a diet intervention in pre-diabetes.
Gut
OBJECTIVE:To explore the interplay between dietary modifications, microbiome composition and host metabolic responses in a dietary intervention setting of a personalised postprandial-targeting (PPT) diet versus a Mediterranean (MED) diet in pre-diabetes. DESIGN:In a 6-month dietary intervention, adults with pre-diabetes were randomly assigned to follow an MED or PPT diet (based on a machine-learning algorithm for predicting postprandial glucose responses). Data collected at baseline and 6 months from 200 participants who completed the intervention included: dietary data from self-recorded logging using a smartphone application, gut microbiome data from shotgun metagenomics sequencing of faecal samples, and clinical data from continuous glucose monitoring, blood biomarkers and anthropometrics. RESULTS:PPT diet induced more prominent changes to the gut microbiome composition, compared with MED diet, consistent with overall greater dietary modifications observed. Particularly, microbiome alpha-diversity increased significantly in PPT (p=0.007) but not in MED arm (p=0.18). Post hoc analysis of changes in multiple dietary features, including food-categories, nutrients and PPT-adherence score across the cohort, demonstrated significant associations between specific dietary changes and species-level changes in microbiome composition. Furthermore, using causal mediation analysis we detect nine microbial species that partially mediate the association between specific dietary changes and clinical outcomes, including three species (from , , orders) that mediate the association between PPT-adherence score and clinical outcomes of hemoglobin A1c (HbA1c), high-density lipoprotein cholesterol (HDL-C) and triglycerides. 
Finally, using machine-learning models trained on dietary changes and baseline clinical data, we predict personalised metabolic responses to dietary modifications and assess feature importance for clinical improvement in cardiometabolic markers of blood lipids, glycaemic control and body weight. CONCLUSIONS:Our findings support the role of gut microbiome in modulating the effects of dietary modifications on cardiometabolic outcomes, and advance the concept of precision nutrition strategies for reducing comorbidities in pre-diabetes. TRIAL REGISTRATION NUMBER:NCT03222791.
10.1136/gutjnl-2022-329201
Predicting Malignant Ventricular Arrhythmias Using Real-Time Remote Monitoring.
Journal of the American College of Cardiology
BACKGROUND:Although implantable cardioverter-defibrillator (ICD) therapies are associated with increased morbidity and mortality, the prediction of malignant ventricular arrhythmias has remained elusive. OBJECTIVES:The purpose of this study was to evaluate whether daily remote-monitoring data may predict appropriate ICD therapies for ventricular tachycardia or ventricular fibrillation. METHODS:This was a post hoc analysis of IMPACT (Randomized trial of atrial arrhythmia monitoring to guide anticoagulation in patients with implanted defibrillator and cardiac resynchronization devices), a multicenter, randomized, controlled trial of 2,718 patients evaluating atrial tachyarrhythmias and anticoagulation for patients with heart failure and ICD or cardiac resynchronization therapy with defibrillator devices. All device therapies were adjudicated as either appropriate (to treat ventricular tachycardia or ventricular fibrillation) or inappropriate (all others). Remote monitoring data in the 30 days before device therapy were utilized to develop separate multivariable logistic regression and neural network models to predict appropriate device therapies. RESULTS:A total of 59,807 device transmissions were available for 2,413 patients (age 64 ± 11 years, 26% women, 64% ICD). Appropriate device therapies (141 shocks, 10 antitachycardia pacing) were delivered to 151 patients. Logistic regression identified shock lead impedance and ventricular ectopy as significantly associated with increased risk of appropriate device therapy (sensitivity 39%, specificity 91%, AUC: 0.72). Neural network modeling yielded significantly better (P < 0.01 for comparison) predictive performance (sensitivity 54%, specificity 96%, AUC: 0.90), and also identified patterns of change in atrial lead impedance, mean heart rate, and patient activity as predictors of appropriate therapies. 
CONCLUSIONS:Daily remote monitoring data may be utilized to predict malignant ventricular arrhythmias in the 30 days before device therapies. Neural networks complement and enhance conventional approaches to risk stratification.
10.1016/j.jacc.2022.12.024
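The sensitivity and specificity figures in this abstract come from comparing binary predictions with adjudicated outcomes. A minimal sketch of that calculation follows; the function name and the 0/1 label encoding (1 = appropriate device therapy) are assumptions made for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

Both quantities depend on the probability threshold used to turn model output into a yes/no prediction, whereas the AUC summarizes performance across all thresholds.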
Multimodal brain age estimates relate to Alzheimer disease biomarkers and cognition in early stages: a cross-sectional observational study.
eLife
Background:Estimates of 'brain-predicted age' quantify apparent brain age compared to normative trajectories of neuroimaging features. The brain age gap (BAG) between predicted and chronological age is elevated in symptomatic Alzheimer disease (AD) but has not been well explored in presymptomatic AD. Prior studies have typically modeled BAG with structural MRI, but more recently other modalities, including functional connectivity (FC) and multimodal MRI, have been explored. Methods:We trained three models to predict age from FC, structural (S), or multimodal MRI (S+FC) in 390 amyloid-negative cognitively normal (CN/A-) participants (18-89 years old). In independent samples of 144 CN/A-, 154 CN/A+, and 154 cognitively impaired (CI; CDR > 0) participants, we tested relationships between BAG and AD biomarkers of amyloid and tau, as well as a global cognitive composite. Results:All models predicted age in the control training set, with the multimodal model outperforming the unimodal models. All three BAG estimates were significantly elevated in CI compared to controls. FC-BAG was significantly reduced in CN/A+ participants compared to CN/A-. In CI participants only, elevated S-BAG and S+FC BAG were associated with more advanced AD pathology and lower cognitive performance. Conclusions:Both FC-BAG and S-BAG are elevated in CI participants. However, FC and structural MRI also capture complementary signals. Specifically, FC-BAG may capture a unique biphasic response to presymptomatic AD pathology, while S-BAG may capture pathological progression and cognitive decline in the symptomatic stage. A multimodal age-prediction model improves sensitivity to healthy age differences. Funding:This work was supported by the National Institutes of Health (P01-AG026276, P01-AG03991, P30-AG066444, 5-R01-AG052550, 5-R01-AG057680, 1-R01-AG067505, 1S10RR022984-01A1, and U19-AG032438), the BrightFocus Foundation (A2022014F), and the Alzheimer's Association (SG-20-690363-DIAN).
10.7554/eLife.81869
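The brain age gap used in this abstract is a simple difference once a model has produced an age prediction. The sketch below assumes the usual sign convention (predicted minus chronological, so a positive gap means an older-appearing brain); the function names are hypothetical and the age-prediction model itself is not shown.

```python
def brain_age_gap(predicted_age, chronological_age):
    """Brain age gap in years: positive when the brain appears older."""
    return predicted_age - chronological_age


def mean_gap(predicted, chronological):
    """Average gap over a group, e.g. to compare cohorts."""
    gaps = [brain_age_gap(p, c) for p, c in zip(predicted, chronological)]
    return sum(gaps) / len(gaps)
```

Group comparisons such as those reported in the abstract would then contrast these gaps between cohorts.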