Date of Award
May 2023
Degree Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Engineering
First Advisor
Rohit Kate
Committee Members
Rohit Kate, Jake Luo, Tian Zhao, Jun Zhang, Zeyun Yu
Keywords
Bidirectional Encoder Representations from Transformers (BERT), Clinical Ontology, Medical Ontology, Ontology Embeddings, SNOMED CT, Word Embeddings
Abstract
Leveraging Biomedical Ontological Knowledge to Improve Clinical Term Embeddings
by Fuad Abu Zahra
The University of Wisconsin-Milwaukee, 2023
Under the Supervision of Dr. Rohit J. Kate
This research concerns obtaining and using word embeddings for natural language processing tasks in the biomedical domain. Word embeddings are vector representations of words, commonly obtained from large text corpora. This research leverages the biomedical ontology SNOMED CT as an alternative source of embeddings for clinical terms. Existing graph-based methods yield embeddings only for the concepts (i.e., the nodes of the graph) of an ontology; hence, we developed a novel method to obtain embeddings for clinical words and terms from their concept embeddings. These embeddings were evaluated on benchmark datasets of clinical term similarity and on the clinical term normalization task, and were found to work better than corpus-based embeddings. However, unlike corpus-based embeddings, the embeddings obtained from SNOMED CT do not incorporate linguistic knowledge, because the method was not trained on text data. Therefore, we also developed two new methods to combine the two sources of embeddings: generating a synthetic corpus from the SNOMED CT ontology and using it for additional training with corpus-based methods, and fine-tuning a corpus-based system on SNOMED CT concept embeddings. The evaluation showed that the combined embeddings obtained using these methods perform better than either type of embedding alone.
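The abstract does not say how word and term embeddings are derived from concept embeddings, so the following is only an illustrative sketch under a simple assumption: a word's vector is the average of the vectors of every concept that has a term containing that word, and a term's vector is the average of its word vectors. The concept identifiers, terms, vectors, and function names are all hypothetical toy values, not SNOMED CT data and not the dissertation's method. Term similarity is then scored with cosine similarity, as is common for term-similarity benchmarks.

```python
import math

# Hypothetical toy concept embeddings (NOT real SNOMED CT data).
CONCEPT_VECTORS = {
    "C1": [1.0, 0.0, 0.0],
    "C2": [0.0, 1.0, 0.0],
    "C3": [0.0, 0.0, 1.0],
}

# Hypothetical mapping from concept id to the terms that describe it.
CONCEPT_TERMS = {
    "C1": ["heart attack", "myocardial infarction"],
    "C2": ["heart failure"],
    "C3": ["kidney failure"],
}


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def word_embedding(word):
    """Assumed rule: average the vectors of all concepts with a term containing `word`."""
    hits = [
        CONCEPT_VECTORS[cid]
        for cid, terms in CONCEPT_TERMS.items()
        if any(word in term.split() for term in terms)
    ]
    return mean_vector(hits) if hits else None


def term_embedding(term):
    """Assumed rule: average the embeddings of the term's known words."""
    vectors = [word_embedding(w) for w in term.lower().split()]
    vectors = [v for v in vectors if v is not None]
    return mean_vector(vectors) if vectors else None


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    a = term_embedding("heart failure")
    b = term_embedding("kidney failure")
    print(cosine(a, b))
```

In this toy setup, "heart failure" and "kidney failure" receive a non-zero similarity only because they share the word "failure". That shows how word-level vectors derived from concepts can relate terms that map to different concepts.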
Recommended Citation
Abuzahra, Fuad Hatem, "Leveraging Biomedical Ontological Knowledge to Improve Clinical Term Embeddings" (2023). Theses and Dissertations. 3116.
https://dc.uwm.edu/etd/3116