Predicting health-related quality of life two years post-diagnosis across seven cancer types: Using machine learning to identify vulnerable patients

Oudijk,W.F.
de Rooij,B.H.
van Benthem,K.J.
Etienne,R.S.
Oerlemans,S.
Verkooijen,H.M.
Aben,K.K.H.
Vink,G.R.
May,A.M.
Mols,F.
Katsimpokis,D.
Ezendam,N.P.M.
Abstract
Purpose: Cancer survivors often experience long-term consequences that affect their health-related quality of life (HRQoL). Sociodemographic factors, clinical characteristics, and health-related behaviours influence HRQoL, making some individuals vulnerable to adverse HRQoL. This study develops linear regression and machine learning models to predict HRQoL two years post-diagnosis and to identify key vulnerability factors.

Methods: This longitudinal study included data from survivors of seven cancer types. Nineteen predictor variables were derived from questionnaires completed within three months post-diagnosis (baseline) from the Netherlands Cancer Registry. Linear regression, random forest, XGBoost, neural network, and Support Vector Machine (SVM) regressors were employed to predict the EORTC QLQ-C30 summary score 1.5–2.5 years post-diagnosis. Permutation testing assessed vulnerability factors.

Results: The analyses included 4,538 individuals. All models achieved similar R² (0.3) and RMSE (9) scores. Linear regression, random forest, XGBoost, and SVM models identified lower physical, cognitive, and emotional functioning at diagnosis, along with more comorbidities, cancer type (especially endometrial), and higher BMI as the top vulnerability factors. Treatment, age, and education were not associated with vulnerability. All models tended to overestimate low HRQoL, which might be due to the limited number of observations with low HRQoL values.

Conclusions: The predictors used in this analysis explained only 30% of the variation in long-term HRQoL. As in previous studies predicting HRQoL in cancer, these predictors miss crucial information. Baseline functioning, comorbidities, cancer type, and BMI appeared to be the key vulnerability factors. Future studies should prioritize accurate prediction of low HRQoL scores.
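Illustrative note: the abstract does not specify how the permutation testing was carried out. One common reading is permutation-based feature importance, where a predictor's values are shuffled and the resulting loss of accuracy is measured. The Python sketch below shows that idea together with the R² and RMSE metrics reported above. Everything in it is an assumption made for illustration: the data are synthetic, the variable names are hypothetical, and the model is a one-predictor least-squares fit rather than any of the study's models.

```python
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def r_squared(y_true, y_pred):
    """Share of outcome variance explained by the predictions."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y_true, column, n_repeats=20, seed=0):
    """Mean increase in RMSE after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline_error = rmse(y_true, [predict(r) for r in rows])
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in the chosen column.
        permuted = [r[:column] + (v,) + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(rmse(y_true, [predict(r) for r in permuted]) - baseline_error)
    return statistics.fmean(increases)


# Synthetic data (hypothetical, NOT from the study):
# column 0 mimics a baseline functioning score, column 1 is an unrelated variable.
data_rng = random.Random(42)
rows = [(data_rng.gauss(70, 20), data_rng.gauss(0, 1)) for _ in range(500)]
outcome = [50 + 0.3 * functioning + data_rng.gauss(0, 9) for functioning, _ in rows]

# Ordinary least squares on column 0 only; column 1 is deliberately unused.
xs = [r[0] for r in rows]
mean_x, mean_y = statistics.fmean(xs), statistics.fmean(outcome)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, outcome)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x


def predict(row):
    """Predict the outcome from the first predictor only."""
    return intercept + slope * row[0]


r2 = r_squared(outcome, [predict(r) for r in rows])
importance_functioning = permutation_importance(predict, rows, outcome, column=0)
importance_unused = permutation_importance(predict, rows, outcome, column=1)

print(f"R2: {r2:.2f}")
print(f"RMSE increase, functioning shuffled: {importance_functioning:.2f}")
print(f"RMSE increase, unused predictor shuffled: {importance_unused:.2f}")
```

In this sketch, shuffling the predictor the model relies on raises the error, while shuffling a predictor the model ignores leaves the error unchanged. Under this reading, a predictor that is not associated with the outcome would show an importance near zero.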
Date
2026
Keywords
SDG 3 - Good Health and Well-being
Citation
Oudijk, W F, de Rooij, B H, van Benthem, K J, Etienne, R S, Oerlemans, S, Verkooijen, H M, Aben, K K H, Vink, G R, May, A M, Mols, F, Katsimpokis, D & Ezendam, N P M 2026, 'Predicting health-related quality of life two years post-diagnosis across seven cancer types: Using machine learning to identify vulnerable patients', Quality of Life Research, vol. 35, no. 67. https://doi.org/10.1007/s11136-026-04165-4
License
info:eu-repo/semantics/restrictedAccess