KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

Implementation of Forest-Based Predictive and Causal Machine Learning Techniques for Identifying the most Important Predictors of Mortality and Estimating Radiotherapy Treatment Effects in Breast Cancer Patients

Vol. 20, No. 1, January 31, 2026
DOI: 10.3837/tiis.2026.01.004

Abstract

Randomized controlled trials (RCTs) are the benchmark for unbiased treatment evaluation, but face ethical, logistical, and recruitment challenges. Observational patient data, while abundant, comes with inherent issues such as confounding and bias, which complicate the analysis. In this study, we employed advanced machine learning techniques—Random Survival Forests (RSF), Causal Survival Forests (CSF), and Shapley Additive Explanations (SHAP values)—to enhance the analysis of observational clinical data from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) breast cancer dataset. RSF was utilized to identify key predictors of patient overall survival (OS), and a focused Cox model was constructed using these predictors for clear interpretability. CSF was applied to quantify the causal effects of radiation therapy on OS, adjusting for potential confounders. SHAP values were crucial in interpreting how individual covariates influenced these causal effects, providing insights into which factors most significantly affect patient outcomes. This approach not only clarifies the impact of radiation therapy on survival, but also demonstrates how modern computational tools can extract meaningful clinical insights from complex observational data, potentially guiding more personalized treatment strategies.
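
The confounding problem that motivates the causal analysis can be illustrated with a minimal, self-contained simulation. This is only a sketch under stated assumptions and is not the authors' method: it uses a made-up cohort rather than METABRIC, a binary survival indicator rather than censored survival times, and simple stratification on a single confounder rather than a Causal Survival Forest. All names and probabilities below (`simulate_cohort`, the 0.8/0.2 treatment rates, the +0.10 true effect) are hypothetical choices for illustration.

```python
import random


def simulate_cohort(n, seed=0):
    """Simulate (z, t, y) records for a hypothetical observational cohort.

    z: confounder (1 = high-risk disease), t: treatment received,
    y: survival indicator. All probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        # High-risk patients are more likely to be treated (confounding).
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0
        # Assumed true treatment effect on survival probability: +0.10.
        p_survive = 0.8 - 0.4 * z + 0.1 * t
        y = 1 if rng.random() < p_survive else 0
        cohort.append((z, t, y))
    return cohort


def mean_outcome(rows):
    """Fraction of survivors among the given records."""
    return sum(y for _, _, y in rows) / len(rows)


def naive_effect(rows):
    """Unadjusted difference in survival: treated minus untreated."""
    treated = [r for r in rows if r[1] == 1]
    control = [r for r in rows if r[1] == 0]
    return mean_outcome(treated) - mean_outcome(control)


def stratified_effect(rows):
    """Difference in survival within each level of z, weighted by stratum size."""
    effect = 0.0
    for level in (0, 1):
        stratum = [r for r in rows if r[0] == level]
        effect += (len(stratum) / len(rows)) * naive_effect(stratum)
    return effect


cohort = simulate_cohort(20000)
naive = naive_effect(cohort)
adjusted = stratified_effect(cohort)
print(f"naive difference:    {naive:+.3f}")
print(f"adjusted difference: {adjusted:+.3f}")
```

In this toy setting the unadjusted comparison makes the treatment look harmful, because treated patients are disproportionately high-risk, while the stratified estimate lands near the assumed +0.10 effect. Forest-based causal methods address the same problem when there are many covariates and censored outcomes, where simple stratification is no longer practical.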


Cite this article

[IEEE Style]
H. J. Lee, I. Shuryak, E. Wang, K. Lee, "Implementation of Forest-Based Predictive and Causal Machine Learning Techniques for Identifying the most Important Predictors of Mortality and Estimating Radiotherapy Treatment Effects in Breast Cancer Patients," KSII Transactions on Internet and Information Systems, vol. 20, no. 1, pp. 60-79, 2026. DOI: 10.3837/tiis.2026.01.004.

[ACM Style]
Heejeong Jasmine Lee, Igor Shuryak, Eric Wang, and Kang-Yoon Lee. 2026. Implementation of Forest-Based Predictive and Causal Machine Learning Techniques for Identifying the most Important Predictors of Mortality and Estimating Radiotherapy Treatment Effects in Breast Cancer Patients. KSII Transactions on Internet and Information Systems, 20, 1, (2026), 60-79. DOI: 10.3837/tiis.2026.01.004.

[BibTeX Style]
@article{tiis:105649, title="Implementation of Forest-Based Predictive and Causal Machine Learning Techniques for Identifying the most Important Predictors of Mortality and Estimating Radiotherapy Treatment Effects in Breast Cancer Patients", author="Heejeong Jasmine Lee and Igor Shuryak and Eric Wang and Kang-Yoon Lee", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2026.01.004}, volume={20}, number={1}, year="2026", month={January}, pages={60-79}}