A machine learning model for colorectal liver metastasis post-hepatectomy prognostications: several strategies for the model evaluation
Letter to the Editor


Guang-Yao Li1, Lu-Lu Zhai2

1Department of General Surgery, The Second People’s Hospital of Wuhu, Wuhu, China; 2Department of General Surgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China

Correspondence to: Lu-Lu Zhai, MD. Department of General Surgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, 17 Lujiang Road, Hefei 230001, China. Email: jackyzhai@ustc.edu.cn.

Comment on: Lam CSN, Bharwani AA, Chan EHY, et al. A machine learning model for colorectal liver metastasis post-hepatectomy prognostications. Hepatobiliary Surg Nutr 2023;12:495-506.


Keywords: Colorectal liver metastasis (CRLM); hepatectomy; survival; prediction model


Submitted Jan 29, 2024. Accepted for publication May 18, 2024. Published online Jun 25, 2024.

doi: 10.21037/hbsn-24-54


With great interest, we read the article by Lam et al. (1) entitled “A machine learning model for colorectal liver metastasis post-hepatectomy prognostications”. In this study, the authors included colorectal liver metastasis (CRLM) patients from four hospitals in Hong Kong who underwent hepatic resection, and developed a survival prediction model based on the patients’ demographic, oncologic, clinicopathologic, and therapeutic characteristics using machine learning. Through Cox proportional hazards and least absolute shrinkage and selection operator (LASSO) regression analyses, the authors successfully developed a predictive model consisting of eight predictors that could accurately predict overall survival (OS) and recurrence-free survival (RFS) after hepatectomy in patients with CRLM. This is an intriguing study with significant clinical value, and the authors deserve to be commended for their efforts. However, there are still several issues that need to be addressed in this study.

First, the authors employed only Cox proportional hazards and LASSO regression to construct the predictive models. We recommend that the authors develop a nomogram based on the results of the multivariate Cox proportional hazards analysis to visualize the predictive contribution of each predictor to OS and RFS. Unlike a categorical risk level, a nomogram outputs the specific probability of an event occurring. It integrates the predictors of the regression model by drawing scaled line segments on a common plane, displaying both the interrelationships among the variables in the prediction model and their predictive probabilities for the outcome events. By converting the regression coefficients of the predictors into point scores, and the total score into an outcome probability through a functional transformation, the nomogram turns a complex regression equation into a visual graph. This makes the results of prediction models more readable and easier to use, and nomograms have therefore been widely applied in the construction of various prediction models (2,3).

Second, the authors assessed the predictive ability of the prediction model using Harrell's concordance index (C-index), but they did not provide a calibration curve. The C-index measures the discrimination of a predictive model, whereas the calibration curve measures the agreement between predicted and observed probabilities, i.e., calibration (4,5). Discrimination and calibration are the two most common and crucial metrics for evaluating the performance of a predictive model. Therefore, when evaluating predictive performance, the authors should consider both.
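To illustrate the discrimination metric discussed above, Harrell's C-index can be computed directly from follow-up times, event indicators, and model risk scores: it is the fraction of usable (comparable) patient pairs in which the patient with the higher predicted risk experiences the event earlier. The following pure-Python sketch uses invented toy data; the function and variable names are ours, not the authors':

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index.
    times: observed follow-up times; events: 1 = event, 0 = censored;
    risk_scores: higher score = predicted worse prognosis.
    A pair (i, j) is usable only when the earlier time belongs to an
    observed event (otherwise ordering is ambiguous under censoring)."""
    concordant, tied, usable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable

# Toy cohort of five patients, two of them censored
t = [2, 4, 6, 8, 10]
e = [1, 1, 0, 1, 0]
r = [0.9, 0.5, 0.7, 0.6, 0.1]
print(round(harrell_c_index(t, e, r), 3))  # -> 0.75
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination; in practice, censoring-aware library implementations would be used rather than this quadratic-time sketch.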
Furthermore, in addition to the C-index, we recommend that the authors supplement their analysis with receiver operating characteristic (ROC) curves, which not only evaluate the discriminative ability of the predictive model but also yield its sensitivity and specificity.

Third, although the C-index, ROC curves, and calibration curves assess the discrimination and calibration of a predictive model, they do not adequately reflect its clinical utility. In clinical practice, a predictive model is never fully accurate, and false positives and false negatives are unavoidable. Sometimes patients benefit more from avoiding false positives; at other times, it is preferable to avoid false negatives. Decision curve analysis (DCA) was developed to identify the strategy that maximizes net benefit (6). Its advantage is that it integrates patient or decision-maker preferences into the analysis, which matches the practical requirements of clinical decision-making (6-8). Given that the predictive model developed by the authors is ultimately intended for clinical practice, we recommend that the authors use DCA to quantify its net benefit and thereby assess its clinical utility.

Finally, the patients recruited for this study were all from Hong Kong, China, which limits the geographic generalizability of the findings. We recommend that future studies recruit CRLM populations from healthcare organizations in other international regions as external validation cohorts. This would help verify the model's predictive performance in patients from different regions and, by broadening the geographic range of the validation cohort, support its generalization to diverse populations independent of the clinical setting.
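The net-benefit calculation underlying DCA can likewise be sketched in a few lines. At a chosen risk threshold p_t, net benefit is TP/n − FP/n × p_t/(1 − p_t), and the model's curve is compared against the "treat all" and "treat none" (net benefit zero) reference strategies (6). The cohort and predicted risks below are hypothetical, purely for illustration:

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of a model at risk threshold p_t:
    NB = TP/n - FP/n * p_t / (1 - p_t)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy cohort: 3 events among 10 patients, with hypothetical predicted risks
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p = [0.8, 0.6, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

# Evaluating over a range of thresholds traces out the decision curve
for pt in (0.2, 0.4, 0.6):
    print(pt, round(net_benefit(y, p, pt), 3),
          round(net_benefit_treat_all(y, pt), 3))
```

The model is clinically useful over the threshold range where its net benefit exceeds both reference strategies; plotting these values across thresholds yields the decision curve.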

In summary, we would like to thank the authors for their significant contribution to developing a survival prediction model for patients with CRLM. Regarding the evaluation of such models, we propose several strategies that may strengthen this study and future similar studies.


Acknowledgments

Funding: This study was supported by the Key University Natural Science Research Project of Anhui Province (No. 2023AH053416), and partly supported by the Open Funds of the Guangxi Key Laboratory of Tumor Immunology and Microenvironmental Regulation (No. 2023KF012), and the Anhui Provincial Postdoctoral Scientific Foundation (No. 2023A660).


Footnote

Provenance and Peer Review: This article was a standard submission to the journal. The article did not undergo external peer review.

Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://hbsn.amegroups.com/article/view/10.21037/hbsn-24-54/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Lam CSN, Bharwani AA, Chan EHY, et al. A machine learning model for colorectal liver metastasis post-hepatectomy prognostications. Hepatobiliary Surg Nutr 2023;12:495-506. [Crossref] [PubMed]
  2. Iasonos A, Schrag D, Raj GV, et al. How to build and interpret a nomogram for cancer prognosis. J Clin Oncol 2008;26:1364-70. [Crossref] [PubMed]
  3. Balachandran VP, Gonen M, Smith JJ, et al. Nomograms in oncology: more than meets the eye. Lancet Oncol 2015;16:e173-80. [Crossref] [PubMed]
  4. Alba AC, Agoritsas T, Walsh M, et al. Discrimination and Calibration of Clinical Prediction Models: Users' Guides to the Medical Literature. JAMA 2017;318:1377-84. [Crossref] [PubMed]
  5. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 2010;21:128-38. [Crossref] [PubMed]
  6. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006;26:565-74. [Crossref] [PubMed]
  7. Kerr KF, Brown MD, Zhu K, et al. Assessing the Clinical Impact of Risk Prediction Models With Decision Curves: Guidance for Correct Interpretation and Appropriate Use. J Clin Oncol 2016;34:2534-40. [Crossref] [PubMed]
  8. Wang Y, Li J, Xia Y, et al. Prognostic nomogram for intrahepatic cholangiocarcinoma after partial hepatectomy. J Clin Oncol 2013;31:1188-95. [Crossref] [PubMed]
Cite this article as: Li GY, Zhai LL. A machine learning model for colorectal liver metastasis post-hepatectomy prognostications: several strategies for the model evaluation. HepatoBiliary Surg Nutr 2024;13(4):752-754. doi: 10.21037/hbsn-24-54
