The tool was developed using the electronic health records (EHRs) of the full spectrum of cancer patients treated at a tertiary cancer center, together with novel machine learning algorithms constructed by the coauthors. Delivered as a web- or mobile-based application, it estimates the probability of mortality for a particular patient under a particular envisioned cancer treatment. The tool has the following characteristics:
The tool takes as inputs the EHR of a particular patient, the cancer type, and an envisioned cancer treatment, and outputs the probability of mortality adjusted for these patient characteristics.
Because the predictions are based on decision trees, a physician or even a patient can easily follow the reasoning behind each estimate. The model also identifies key predictors of mortality, such as change in weight.
The tool was informed by EHRs of more than 23,000 patients at a large national cancer hospital. We included 401 predictors, among them demographics, medical and treatment history, laboratory tests, and genomic results.
The clinician can compare different envisioned treatments for a particular cancer patient with respect to the probability of mortality and make decisions that are informed by these estimates.
We report the out-of-sample accuracy and the area under the curve (AUC) on unseen patient data from 2012-2014, with very encouraging results relative to competing approaches.
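For readers less familiar with the two evaluation metrics named above, the following minimal sketch shows how accuracy and AUC could be computed from held-out labels and predicted mortality probabilities. It is an illustration only, not the evaluation code used in this work: the function names, the 0.5 classification threshold, and the toy data are all assumptions, and AUC is taken here to mean the area under the ROC curve, computed as the fraction of (deceased, surviving) patient pairs that the model ranks correctly.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of patients whose thresholded prediction matches the label.

    The 0.5 default threshold is an assumption made for illustration.
    """
    preds = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(p == y) for p, y in zip(preds, y_true))
    return correct / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case receives the higher predicted probability; ties count as half.
    """
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy held-out set (hypothetical values): 1 = died, 0 = survived.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
print(accuracy(labels, probs), auc(labels, probs))
```

The pairwise form is quadratic in the number of patients and is chosen here for readability; a rank-based computation gives the same value more efficiently on large cohorts.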
The methodology of this paper is based on two novel algorithms developed by coauthors of the paper: a) the predictive decision tree algorithm was developed by Bertsimas and Dunn using optimization ideas, and b) the algorithm for missing data imputation was developed by Bertsimas, Pawlowski, and Zhuo.
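To make the interpretability and treatment-comparison characteristics concrete, the sketch below hand-writes a toy decision tree and evaluates it for two candidate treatments. It is not the algorithm of Bertsimas and Dunn and not a fitted model: the split thresholds, leaf probabilities, the `lab_value` feature, and the treatment names `regimen_a` and `regimen_b` are hypothetical placeholders chosen only to show how a tree-structured prediction can be read off path by path. Only change in weight is taken from the text as an example predictor.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Patient:
    weight_change_pct: float  # change in weight, in percent (predictor named in the text)
    lab_value: float          # hypothetical laboratory measurement


def toy_mortality_probability(patient: Patient, treatment: str) -> float:
    """Walk a hand-written tree; every threshold and leaf value is made up."""
    if patient.weight_change_pct <= -5.0:
        # Branch for patients with substantial weight loss.
        if treatment == "regimen_a":
            return 0.40
        return 0.30
    if patient.lab_value < 2.0:
        return 0.20
    return 0.05


def compare_treatments(patient: Patient,
                       treatments: Sequence[str]) -> Dict[str, float]:
    """Estimated mortality probability for each envisioned treatment."""
    return {t: toy_mortality_probability(patient, t) for t in treatments}


patient = Patient(weight_change_pct=-8.0, lab_value=4.0)
print(compare_treatments(patient, ["regimen_a", "regimen_b"]))
```

Each prediction corresponds to a single root-to-leaf path, so the reasoning behind an estimate can be stated as a short sequence of conditions on the patient's record.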