An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses
Document Type
Article
Publication Date
5-7-2021
Keywords
COVID-19
Abstract
Background
Mathematical approaches have been used for decades to probe the structure of DNA sequences, which has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: coronaviruses and influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses are H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968.
Methods
The mathematical method used is slow feature analysis (SFA), a relatively new but promising method for delineating complex structure in DNA sequences.
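For readers unfamiliar with SFA, the following is a minimal sketch of the linear variant in Python with NumPy. It is illustrative only: the abstract does not state how the sequences are encoded numerically or which SFA variant the authors use, so the function name `linear_sfa`, the synthetic test signals, and the mixing coefficients are all assumptions made for this example. The idea is to whiten the input signal and then find the directions in which the whitened signal changes most slowly from one step to the next.

```python
import numpy as np

def linear_sfa(x, n_features=1):
    """Linear slow feature analysis (illustrative sketch, not the paper's code).

    x: array of shape (T, n), one row per position or time step.
    Returns the n_features slowest output signals and their slowness values.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)                       # center each input channel

    # Whiten: rotate and rescale so the input has identity covariance.
    cov = x.T @ x / (len(x) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    keep = eigvals > 1e-10 * eigvals.max()       # drop degenerate directions
    whitener = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    z = x @ whitener

    # Covariance of the step-to-step differences of the whitened signal.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)

    # eigh returns eigenvalues in ascending order, so the first
    # eigenvectors are the directions of slowest variation.
    dvals, dvecs = np.linalg.eigh(dcov)
    w = whitener @ dvecs[:, :n_features]
    return x @ w, dvals[:n_features]

# Synthetic demonstration (assumed data): a slow and a fast sine, linearly mixed.
t = np.linspace(0, 2 * np.pi, 2000)
slow = np.sin(t)
fast = np.sin(37 * t)
mixed = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])

y, deltas = linear_sfa(mixed, n_features=1)
# The sign of an SFA output is arbitrary, so compare by absolute correlation.
corr = abs(np.corrcoef(y[:, 0], slow)[0, 1])
```

In this toy setting the slowest extracted feature should track the slow sine up to sign. Applying the same idea to a genetic sequence would first require mapping nucleotides to a numeric signal, a step that the abstract leaves unspecified.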
Results
The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to that of complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality.
Conclusions
The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.
Recommended Citation
Tsonis, A., Wang, G., Zhang, L., Lu, W., Kayafas, A., & Del Rio-Tsonis, K. (2021). An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses. Human Genomics, 15(1), 26–26. https://doi.org/10.1186/s40246-021-00327-2