Date of Award

December 2016

Degree Type

Thesis

Degree Name

Master of Science

Department

Computer Science

First Advisor

Rohit J. Kate

Committee Members

Hossein Hosseini, Jun Zhang

Keywords

Cancer Survivability, Data Mining, Machine Learning, Predictive Models, SEER Dataset

Abstract

Survivability of cancer strongly depends on the stage of cancer. In most previous work, machine learning models for predicting the survivability of a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, both together on all stages and separately for every stage. We call these joint and stage-specific models, respectively. The results for the cancers we investigated show that the best model for predicting survivability at a specific stage is the model built specifically for that stage. Additionally, we found that the most important features for predicting survivability differed from stage to stage. By evaluating the models separately on each stage, we found that their performance varied across stages. We also found that evaluating the models together on all stages, as was done in the past, is misleading because it overestimates performance.
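
The last point, that pooled evaluation can overestimate performance, can be illustrated with a small sketch. The abstract does not name the evaluation metric, so the use of AUC here is an assumption, and the records below are synthetic toy values made up for illustration, not data or results from the thesis. The hypothetical model assigns a score that depends only on stage, so it cannot rank patients within a stage at all, yet it looks useful when all stages are scored together.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy records (assumption, not thesis data):
# (stage, survived, predicted survival score).
# The hypothetical model's score depends only on the stage.
records = [
    (1, 1, 0.9), (1, 1, 0.9), (1, 1, 0.9), (1, 0, 0.9),
    (4, 1, 0.1), (4, 0, 0.1), (4, 0, 0.1), (4, 0, 0.1),
]


def pooled_auc(records):
    """Evaluate on all stages together."""
    return auc([y for _, y, _ in records], [s for _, _, s in records])


def per_stage_auc(records):
    """Evaluate separately on each stage."""
    by_stage = defaultdict(list)
    for stage, y, s in records:
        by_stage[stage].append((y, s))
    return {
        stage: auc([y for y, _ in rows], [s for _, s in rows])
        for stage, rows in by_stage.items()
    }


print("pooled:", pooled_auc(records))
print("per stage:", per_stage_auc(records))
```

On this toy data the pooled AUC is 0.75, while the AUC within each stage is 0.5 (chance level), because the pooled figure rewards the model merely for knowing the stage.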
