Date of Award

May 2016

Degree Type


Degree Name

Doctor of Philosophy


Freshwater Sciences

First Advisor

Sandra L McLellan

Committee Members

Rebecca D Klaper, Ryan J Newton, Gyaneshwar Prasad, Elizabeth Alm


Keywords

beach, E. coli, genomics, microcosm, molecular biology, phylotype


Quantification of Escherichia coli (E. coli) is the most common method used to detect recent fecal pollution in recreational water, as this species is highly abundant in fecal matter and is assumed to be host-associated. However, some strains are capable of long-term survival and potential propagation in non-host environments, such as beach sand. These long-term environmental survivors are host-independent and are not associated with the same health risks as E. coli from recent fecal pollution. Nevertheless, they have been shown to affect how water quality is perceived, because they are reintroduced into the water column by wave action and are counted in monitoring efforts. Current monitoring methods are unable to differentiate long-term surviving populations from populations originating from recent fecal pollution. Despite this known discrepancy, E. coli enumeration is still relied upon to estimate levels of fecal pollution and to determine the need for beach closures. The aim of this work was to identify genetic indicators of long-term survival that can be used to develop tools to improve beach monitoring. E. coli capable of long-term environmental survival were identified through a series of microcosm experiments in which populations from sand, sewage, and gull waste (n=198 each) were seeded into sand treatments (unaltered native sand, nutrient-limited baked sand, and nutrient-abundant autoclaved sand) and buried 0.5 m deep in the backshore of Lake Michigan for 6-8 weeks. The populations were monitored over the course of the study, and those capable of environmental survival increased in frequency by the end of the experiment. Survival-associated genes were identified through a novel population genetics approach in which composite samples from each source and timepoint were shotgun sequenced and mapped to a scaffold of E. coli accessory genes from 21 genomes.
Genes with >25% higher depth of coverage in output populations than in input populations were considered enriched in long-term surviving populations. E. coli from each source tested were capable of long-term survival in beach sand, with the ability to survive varying by phylotype association and accessory gene ownership. Clermont phylotyping showed that members of phylogroups A and B1 increased in frequency by the end of the experiment, suggesting that members of these groups may be better suited for survival in secondary environments. In total, 198 survival-associated functions were shared among the sand, sewage, and gull surviving populations; these were largely associated with metabolic enzymes and transport proteins. Several pathway modules were identified in these surviving populations, including the betaine biosynthesis pathway, which allows the production of compatible solutes that prevent dehydration, and the GABA biosynthesis and GABA shunt modules, which are associated with flexibility in nutrient utilization. The distribution of these survival-related functions varied, with some being more widely distributed (i.e., among non-clade members) and others more narrowly distributed among members of select phylogroups (A/B1/cryptic clades), demonstrating that survivability varies with accessory gene ownership and phylotype association.
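
The enrichment criterion described above (more than 25% higher depth of coverage in output than in input populations) can be illustrated with a minimal sketch. This is not the dissertation's actual pipeline: the function name `enriched_genes`, the gene labels, and the depth values are hypothetical, and the sketch assumes the per-gene depths have already been made comparable between samples (for example, normalized for sequencing effort), which the abstract does not specify.

```python
def enriched_genes(input_depth, output_depth, threshold=0.25):
    """Return genes whose output depth exceeds input depth by more than `threshold`.

    input_depth / output_depth: dicts mapping a gene label to its mean depth
    of coverage in the input and output population (hypothetical data layout).
    threshold: fractional increase required (0.25 corresponds to the >25% criterion).
    """
    enriched = []
    for gene, in_cov in input_depth.items():
        if in_cov <= 0:
            # Relative change is undefined without input coverage; skip the gene.
            continue
        out_cov = output_depth.get(gene, 0.0)
        relative_increase = (out_cov - in_cov) / in_cov
        if relative_increase > threshold:
            enriched.append(gene)
    return sorted(enriched)


# Made-up depths for three hypothetical accessory genes.
input_depth = {"geneA": 10.0, "geneB": 20.0, "geneC": 8.0}
output_depth = {"geneA": 13.0, "geneB": 24.0, "geneC": 8.0}

result = enriched_genes(input_depth, output_depth)
print(result)
```

Here only `geneA` (a 30% increase) passes the cutoff; `geneB` (20%) and `geneC` (no change) do not. Genes absent from the input are skipped in this sketch because a relative increase cannot be computed for them; a real analysis would need an explicit rule for that case.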