Date of Award

December 2021

Degree Type

Thesis

Degree Name

Master of Science

Department

Computer Science

First Advisor

Jake Luo

Second Advisor

Mukul Goyal

Committee Members

Jake Luo, Mukul Goyal, Rohit J Kate

Keywords

data science, disease prediction, disease prediction platform, disease prognosis, machine learning application

Abstract

Disease prediction is an important aspect of early disease detection and preventive care, with a wide range of applications in the healthcare domain. Previous studies have used image processing techniques, statistical models, and machine learning models to predict diseases. Prediction accuracy varies with the data type and the prediction target. Often, the data is run through models under different data conditions to identify what works best for a given scenario. This requires modifying code and running multiple iterations, which makes these methods usable only by people with technical skills. This thesis presents an interactive platform that hides the technical details and allows users to change options such as the target disease for prognosis, the feature selection method, the sample size, and the machine learning (ML) algorithm. With it, multiple approaches can be tried and compared to find a combination of options that yields an efficient outcome. A colon cancer case study is used to test the platform, with two feature selection algorithms and three ML models. Both selection methods identified the same features as significant for colon cancer prediction, although they ranked those features differently by score. As a result, the machine learning algorithms performed similarly with both selection methods. Random Forest, Logistic Regression, and Decision Tree achieved accuracies of 87%, 86%, and 83%, respectively.
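The comparison workflow described in the abstract, trying each combination of feature selection method, sample size, and ML algorithm and then ranking the outcomes, can be illustrated with a minimal sketch. This is not the platform's actual code: the function name `compare_configurations` and the caller-supplied `evaluate` callback are hypothetical, and the sketch assumes that `evaluate` trains and scores one configuration and returns its accuracy.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

# One configuration: (feature selection method, sample size, model name).
Config = Tuple[str, int, str]


def compare_configurations(
    selectors: Iterable[str],
    sample_sizes: Iterable[int],
    models: Iterable[str],
    evaluate: Callable[[str, int, str], float],
) -> List[Tuple[Config, float]]:
    """Score every combination of options and return them best-first.

    `evaluate` is a hypothetical callback supplied by the caller; it is
    assumed to train and test one configuration and return its accuracy.
    """
    results = [
        ((selector, size, model), evaluate(selector, size, model))
        for selector, size, model in product(selectors, sample_sizes, models)
    ]
    # Highest score first, so the best combination is results[0].
    return sorted(results, key=lambda item: item[1], reverse=True)
```

With two selection methods and three models, as in the colon cancer case study, each sample size yields six configurations to compare.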
