Date of Award

December 2016

Degree Type


Degree Name

Master of Science


Computer Science

First Advisor

Rohit J. Kate

Committee Members

Hossein Hosseini, Jun Zhang


Cancer Survivability, Data Mining, Machine Learning, Predictive Models, SEER Dataset


Cancer survivability depends strongly on the stage of the cancer. In most previous work, machine learning models that predict survivability for a particular cancer were trained and evaluated on all stages of that cancer together. In this work, we trained and evaluated survivability prediction models for five major cancers, both on all stages together and separately for each stage. We call these joint models and stage-specific models, respectively. For the cancers we investigated, the results show that the best model for predicting survivability at a specific stage is the one built specifically for that stage. We also found that the most important features for predicting survivability differed from stage to stage. Evaluating the models separately on each stage showed that their performance varied across stages. Finally, we found that evaluating the models on all stages together, as was done in the past, is misleading because it overestimates performance.
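
The point about pooled evaluation overestimating performance can be illustrated with a small toy example. The sketch below is not the thesis's code or data. It assumes AUC as the evaluation metric (the abstract does not name one), uses a made-up cohort of 20 patients in two stages, and uses a hypothetical "model" that scores patients by stage alone. The names `auc`, `auc_by_stage`, and `stage_score` are illustrative only.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_stage(stages, labels, scores):
    """Compute AUC separately within each stage."""
    groups = defaultdict(lambda: ([], []))
    for stage, y, s in zip(stages, labels, scores):
        groups[stage][0].append(y)
        groups[stage][1].append(s)
    return {stage: auc(ys, ss) for stage, (ys, ss) in groups.items()}


# Made-up cohort (assumption): (stage, survived) pairs.
# Stage 1: 9 of 10 survive. Stage 4: 1 of 10 survives.
records = [(1, 1)] * 9 + [(1, 0)] + [(4, 1)] + [(4, 0)] * 9
stages = [stage for stage, _ in records]
labels = [survived for _, survived in records]

# Hypothetical model that looks only at the stage and nothing else.
stage_score = {1: 0.9, 4: 0.1}
scores = [stage_score[stage] for stage in stages]

pooled = auc(labels, scores)
per_stage = auc_by_stage(stages, labels, scores)

print("pooled AUC:", pooled)
print("per-stage AUC:", per_stage)
```

In this toy setting the pooled score looks strong only because stage separates most survivors from non-survivors, while within each stage the same model cannot rank patients at all. Evaluating stage by stage exposes that gap.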