Date of Award

August 2012

Degree Type


Degree Name

Master of Science



First Advisor

Istvan Lauko

Committee Members

Istvan Lauko, Bruce Wade, Peter Tonellato, Jugal Ghorai


Bayesian Networks, Breast Cancer Risk Prediction, Knowledge Discovery, Large Data Sets, Tetrad


Statistics from the National Cancer Institute indicate that 1 in 8 women will develop breast cancer in their lifetime. Researchers have developed numerous statistical models to predict breast cancer risk; however, physicians are hesitant to use these models because of disparities in the predictions they produce. In an effort to reduce these disparities, we use Bayesian networks to capture the joint distribution of risk factors and to simulate artificial patient populations (clinical avatars) for interrogating the existing risk prediction models. The challenge in this effort has been to produce a Bayesian network whose dependencies agree with the literature and which gives a good estimate of the joint distribution of risk factors. In this work, we propose a methodology for learning Bayesian networks that uses prior knowledge to guide a collection of search algorithms in identifying an optimal structure. Using data from the Breast Cancer Surveillance Consortium, we show that our methodology produces a Bayesian network with consistent dependencies and a better estimate of the distribution of risk factors than existing methods.
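
The idea of simulating clinical avatars from a Bayesian network can be illustrated with a minimal sketch of ancestral sampling: each variable is drawn in topological order, conditioned on the values already drawn for its parents. The two-node network below, its variable names (age_group, menopause), and every probability in it are hypothetical placeholders chosen for illustration only; they are not taken from this work, from Tetrad, or from the Breast Cancer Surveillance Consortium data.

```python
import random

# Hypothetical toy network: age_group -> menopause.
# All names and probabilities are illustrative placeholders,
# not values from the thesis or from any real data set.
P_AGE = {"<50": 0.4, ">=50": 0.6}
P_MENO_GIVEN_AGE = {
    "<50": {"pre": 0.9, "post": 0.1},
    ">=50": {"pre": 0.2, "post": 0.8},
}


def draw(dist, rng):
    """Draw one value from a {value: probability} table."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_avatar(rng):
    """Ancestral sampling: draw the parent first, then the child given the parent."""
    age = draw(P_AGE, rng)
    meno = draw(P_MENO_GIVEN_AGE[age], rng)
    return {"age_group": age, "menopause": meno}


def simulate(n, seed=0):
    """Generate n artificial patients with a reproducible random stream."""
    rng = random.Random(seed)
    return [sample_avatar(rng) for _ in range(n)]


avatars = simulate(1000)
```

A population generated this way follows the joint distribution encoded by the network's structure and conditional probability tables, so in principle it could be passed to an existing risk prediction model to compare its outputs across the simulated patients.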