Date of Award
Doctor of Philosophy
Nadine Kozak, Xiangming Mu, Iris Xie, Timothy Patrick
family health, social media, temporal analysis, theme discovery, topic evolution pattern, Wikipedia
New concepts, terms, and topics emerge constantly, and the meanings of existing terms and topics keep changing. These phenomena occur more frequently on social media than on conventional media because social media allows a huge number of users to generate information online. Retrieving relevant results for a fast-changing topic across different time periods has become one of the most difficult challenges in information retrieval. Among the numerous topics discussed on social media, health-related topics form a major category that attracts increasing attention from the general public.
This study explored the evolution patterns of family-health-related topics on Wikipedia. Three family-health-related topics (Child Maltreatment, Family Planning, and Women’s Health) were selected from the World Health Organization website, and their associated entries were retrieved from Wikipedia. Historical numeric and text data for the entries from 2010 to 2017 were collected from a Wikipedia data dump and from Wikipedia web pages. Four periods were defined: 2010 to 2011, 2012 to 2013, 2014 to 2015, and 2016 to 2017. Coding, subject analysis, descriptive and inferential statistical analysis, a self-organizing map (SOM) approach, and an n-gram approach were employed to explore the evolution of the topics’ internal characteristics and external popularity.
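As a rough illustration of how text data might be grouped into the four periods and summarized with n-grams, the following minimal Python sketch may help. It is not the study's actual procedure: the input shape (pairs of year and text), the whitespace tokenization, the choice of bigrams, and the function names `period_of`, `ngrams`, and `ngram_counts_by_period` are all assumptions made for this example. Only the period boundaries come from the abstract.

```python
from collections import Counter

# Period boundaries as stated in the abstract (inclusive years).
PERIODS = [(2010, 2011), (2012, 2013), (2014, 2015), (2016, 2017)]


def period_of(year):
    """Return a label such as '2010-2011', or None if the year is out of range."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None


def ngrams(text, n=2):
    """Lowercase the text, split on whitespace, and return its word n-grams.

    Whitespace tokenization is a simplifying assumption for this sketch.
    """
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts_by_period(revisions, n=2):
    """Count n-grams per period.

    `revisions` is an iterable of (year, text) pairs; this input shape is
    hypothetical and chosen only to keep the example small.
    """
    counts = {}
    for year, text in revisions:
        label = period_of(year)
        if label is None:
            continue  # skip text outside 2010-2017
        counts.setdefault(label, Counter()).update(ngrams(text, n))
    return counts
```

Comparing the most frequent n-grams of one period with those of the next is one simple way to see which terms appear, disappear, or persist over time.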
The findings show that the external popularity of the family-health-related topics declined from 2010 to 2017, although their content on Wikipedia kept growing. The newly emerged entries had three features: specialization, summarization, and internationalization. The subjects derived from the entries became increasingly diverse over the investigated periods, and their development trajectories varied from one subject to another. Based on these trajectories, the subjects were grouped into three categories: growing, diminishing, and fluctuating. The popularity of the topics was consistent among Wikipedia viewers but not among editors. For each topic, the popularity trends among editors and among viewers were inconsistent with each other. Child Maltreatment was the most popular of the three topics, Women’s Health the second most popular, and Family Planning the least popular.
The implications of this study include: (1) helping health professionals and general users gain a more comprehensive understanding of the investigated topics; (2) contributing to the development of health ontologies and consumer health vocabularies; (3) assisting website designers in organizing online health information and identifying popular family-health-related topics; (4) providing a new approach for query recommendation in information retrieval systems; (5) supporting temporal information retrieval by presenting the temporal changes of family-health-related topics; and (6) providing researchers with a new combination of data collection and analysis methods.
Wang, Yanyan, "Analysis of Family-Health-Related Topics on Wikipedia" (2018). Theses and Dissertations. 1945.