Introduction
An estimated 40-80% of all bacteria and archaea on Earth exist in microbial biofilms. These biofilms can be roughly divided into two parts: cells and the Extracellular Polymeric Substances (EPS) that surround them. The EPS matrix is mainly composed of exoPolySaccharides (exoPS), proteins and nucleic acids. This matrix is a core attribute of biofilms and provides the microbial community with many benefits, including adhesion, water retention, toxin adsorption and nutrient capture. EPS is also used by microbial community members as a shared carbon and energy reservoir.
Because biofilms (and by extension EPS) are ubiquitous, their roles in natural and built environments are too numerous to list here. Suffice it to say that any research into microbial communities would likely benefit from knowing which EPS the community produces.
The genes encoding the biosynthetic pathways responsible for exoPS production cluster together in the microbial genome as Biosynthetic Gene Clusters (BGCs). epsSMASH exploits this clustering and is currently the most comprehensive tool for detecting known and novel exoPS BGCs. It is built on the antiSMASH framework and uses custom-made Hidden Markov Models (HMMs) for genes that are specific to certain types of gene clusters. In future versions we aim to include detection modules for the remaining EPS components (proteins and nucleic acids).
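To make the idea of a gene cluster concrete, the following Python sketch groups genes that matched a profile HMM into candidate clusters based on their proximity in the genome. It is a simplified illustration only and is not the detection logic used by epsSMASH or antiSMASH: the `GeneHit` type, the `group_hits` function, the profile labels, and the `max_gap` and `min_genes` thresholds are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneHit:
    """A gene that matched a profile HMM (all fields are illustrative)."""
    gene_id: str
    start: int    # start coordinate on the contig, in bp
    end: int      # end coordinate on the contig, in bp
    profile: str  # hypothetical label of the matching HMM


def group_hits(hits, max_gap=10_000, min_genes=2):
    """Group hits that lie within max_gap bp of the preceding group.

    Groups with fewer than min_genes members are discarded.
    Both thresholds are assumptions made for this sketch.
    """
    clusters = []
    current = []
    for hit in sorted(hits, key=lambda h: h.start):
        # Start a new group when the gap to the current group is too large.
        if current and hit.start - max(h.end for h in current) > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)
            current = []
        current.append(hit)
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters


# Made-up example data: three neighbouring hits and one distant hit.
hits = [
    GeneHit("g1", 1_000, 2_200, "glycosyltransferase"),
    GeneHit("g2", 2_500, 3_900, "polymerase"),
    GeneHit("g3", 4_100, 5_000, "exporter"),
    GeneHit("g4", 80_000, 81_000, "glycosyltransferase"),
]

candidate_clusters = group_hits(hits)
for cluster in candidate_clusters:
    print([h.gene_id for h in cluster])
```

In this toy input the three neighbouring genes form one candidate cluster, while the isolated hit is dropped because it does not meet the minimum gene count. A real detection rule would also consider which profiles co-occur, not only how close the genes are.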