Date of Award

August 2022

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Engineering

First Advisor

Zeyun ZY Yu

Committee Members

Sandeep SG Gopalakrishnan, Yi YH Hu, Jun JZ Zhang, Mahsa MD Dabagh

Abstract

Approximately 84% of hospitals in the United States are adopting electronic medical records (EMR). EMR is a vital resource that helps clinicians diagnose the onset of a specific disease or predict its future course. With advances in machine learning, many research projects attempt to extract medically relevant and actionable data from massive EMR databases. However, collecting patients' prognosis factors from EMR is challenging because of privacy, sensitivity, and confidentiality concerns. In this study, we developed medical generative adversarial networks (GANs) to generate synthetic EMR prognosis factors using minimal information collected during routine care in specialized healthcare facilities. The generated prognosis variables were used to develop predictive models for (1) chronic wound healing in patients diagnosed with venous leg ulcers (VLUs) and (2) antibiotic resistance in patients diagnosed with skin and soft tissue infections (SSTIs). Our proposed medical GANs, EMR-TCWGAN and DermaGAN, can produce both continuous and categorical features from EMR. We used conditional training strategies to improve training and to generate class-labeled data: healing vs. non-healing in EMR-TCWGAN and susceptibility vs. resistance in DermGAN. The ability of the proposed GAN models to generate realistic EMR data was evaluated by TSTR (train on synthetic, test on real), discriminative accuracy, and visualization. We also analyzed how practical synthetic data augmentation is for improving the wound healing prognostic model and the antibiotic resistance classifier. By including the synthetic samples generated by the GANs in the training process, we achieved an area under the curve (AUC) of 0.875 in the wound healing prognosis model and an average AUC of 0.830 in the antibiotic resistance classifier. These results suggest that GANs can be considered a data augmentation method for generating realistic EMR data.
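The TSTR evaluation mentioned in the abstract is commonly read as "train on synthetic, test on real": a classifier is fitted only on generated samples and then scored on held-out real samples. The sketch below illustrates that idea and is not the dissertation's code. It assumes a toy nearest-centroid classifier, two Gaussian features per record, and a hypothetical `make_data` helper that stands in for both the real EMR records and the GAN output. The AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_centroids(X, y):
    """Toy classifier (an assumption): store the mean feature vector of each class."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def score(cents, x):
    """Higher score means x is closer to the class-1 centroid than to class 0."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, cents[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, cents[1]))
    return d0 - d1

def make_data(n, rng, shift):
    """Hypothetical stand-in data: two Gaussian features, class 1 shifted by `shift`."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(shift * label, 1.0), rng.gauss(shift * label, 1.0)])
        y.append(label)
    return X, y

def tstr_auc(X_syn, y_syn, X_real, y_real):
    """Train on synthetic records only, then report AUC on real records."""
    cents = fit_centroids(X_syn, y_syn)
    return auc([score(cents, x) for x in X_real], y_real)

rng = random.Random(0)
X_real, y_real = make_data(200, rng, shift=2.0)
X_syn, y_syn = make_data(200, rng, shift=2.0)  # placeholder for GAN-generated samples
result = tstr_auc(X_syn, y_syn, X_real, y_real)
print(round(result, 3))
```

If the synthetic samples resemble the real distribution, the TSTR score should land close to the score obtained when training on real data. A large gap between the two suggests the generator missed structure that matters for the prediction task.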
