A Plug and Play Approach to Vocabulary Mapping and Document Annotation

Document Type


Publication Date



Keywords

Ontologies, mapping, document annotation, COVID-19, natural language processing, machine learning


This study investigates the utility of open-source data analytics, reporting, and integration tools for mapping knowledge organization systems (KOS) dedicated to COVID-19. A workflow was created with the KNIME Analytics Platform that applies natural language processing and machine learning methods, including string-based, sense-based, and rule-based algorithms, to term mapping and document annotation tasks. The mapped terms were combined into an integrated dictionary, which was then used to annotate clinical trials. Results suggest strong support for mapping vocabularies within similar domains and for appropriate tagging of unstructured data with concepts from the integrated dictionary. They also suggest that researchers could select and align vocabularies of interest and then use the mapped terms to annotate documents of interest. The study thus demonstrates a shareable, easily adaptable workflow for KOS mapping and for annotating documents focused on COVID-19 research.
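As an illustration of the string-based mapping and dictionary-based annotation steps named in the abstract, the following is a minimal Python sketch. It is not the KNIME workflow from the study: the function names, the 0.85 similarity threshold, the use of Python's standard `difflib` for string similarity, and the sample vocabularies are all assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(term):
    """Lowercase a term and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def similarity(a, b):
    """String similarity in [0, 1] between two normalized terms."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def map_vocabularies(source_terms, target_terms, threshold=0.85):
    """Map each source term to its most similar target term.

    A pair is kept only if its similarity reaches the threshold
    (0.85 is an arbitrary value assumed for this sketch).
    """
    if not target_terms:
        return {}
    mappings = {}
    for source in source_terms:
        best = max(target_terms, key=lambda target: similarity(source, target))
        if similarity(source, best) >= threshold:
            mappings[source] = best
    return mappings


def annotate(text, dictionary):
    """Return the dictionary terms that occur as whole words or phrases in the text."""
    padded = " " + normalize(text) + " "
    return sorted(
        term for term in dictionary if " " + normalize(term) + " " in padded
    )


# Hypothetical vocabularies and document, for illustration only.
vocabulary_a = ["COVID-19", "Fever", "Mechanical ventilation"]
vocabulary_b = ["covid 19", "fever", "ventilation, mechanical", "cough"]

mapped = map_vocabularies(vocabulary_a, vocabulary_b)
print(mapped)

document = "Patients with COVID-19 and fever were enrolled."
print(annotate(document, set(mapped.values())))
```

In this sketch, pure character-level matching does not align "Mechanical ventilation" with "ventilation, mechanical" because the word order differs. Cases of this kind are one reason a workflow might combine string-based matching with sense-based and rule-based methods, as the abstract describes.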