Department of Life Sciences and Chemistry
On the feasibility to engage
heterogeneous communities in data
gathering, sharing and enrichment
by
Julia Schnetzer, MSc.
A thesis submitted in partial fulfillment
of requirements for the degree of
DOCTOR OF PHILOSOPHY
in Marine Microbiology
Approved Thesis Committee:
Prof. Dr. Frank Oliver Glöckner (chair)
Max Planck Institute for Marine Microbiology
Jacobs University Bremen
Prof. Dr. Matthias Ullrich
Jacobs University Bremen
Dr. Julia A. Busch
Carl von Ossietzky University Oldenburg
Dr. Renzo Kottmann
Max Planck Institute for Marine Microbiology
Date of Defense: 08.12.2015
Unlike the originally submitted thesis, this digital version contains the publisher's versions of the articles.
Statutory Declaration
(on Authorship of a Dissertation)
I, Julia Schnetzer, hereby declare that I have written this PhD thesis
independently, except where clearly stated otherwise. I have used only
the sources, the data and the support that I have clearly mentioned.
This PhD thesis has not been submitted for conferral of degree elsewhere.
I confirm that no rights of third parties will be infringed by the publication
of this thesis.
Signature Date
...by the mean nature of a world that will not stand still long enough for them to see it clear
as a whole.
What lured Hemingway to Ketchum?
Hunter S. Thompson (1937-2005)
Thesis Abstract
Marine microbes play critical roles in the well-being of planet Earth and all its
inhabitants. They influence not only chemical cycles and the marine food chain but also the
whole atmosphere and climate of our planet. However, the field of marine microbiology is
still in its infancy and much remains to be explored. Here, I present a new
approach to investigate global marine microbial diversity and function on a single day of the
year, the 21st of June, in 2014 and 2015: the Ocean Sampling Day (OSD). The collection of a
simultaneous, global dataset required marine researchers worldwide to be connected. The
aim was not only to create a snapshot of marine microbial diversity fixed in time, but
also to raise awareness amongst the general public of the important role these tiny
organisms play in our daily lives. Therefore, professional scientists as well as the
non-scientific public were invited to join the corresponding citizen science project, MyOSD. They
supported OSD by providing oceanographic measurements and even microbial samples.
Data collected by citizen scientists were validated and show that citizen science can
contribute valuable data to marine research. A special focus was set on additional
environmental measurements such as water temperature. Such contextual data are important
for the interpretation of microbial diversity in any given sample; however, it is still not
common practice in marine microbial research to measure or report them. OSD
aims to make scientists more aware of this problem.
Extracting contextual data after a dataset or article has been published is onerous work.
Hence, I present two new tools to extract environmental information and geographic
locations from scientific literature. The text mining tool, ENVIRONMENTS, automatically
annotates scientific text with terms from the Environmental Ontology (EnvO). The PubMap
application utilizes the power of the crowd to enable the creation of a manually curated
database of georeferenced scientific publications.
Overall, this thesis shows that enabling collaboration within the scientific community as well
as with the non-scientific public leads to achievements not only in gathering new datasets, but
also in enhancing present and historic scientific literature.
Table of Contents
Chapter 1 Introduction
1.1 Marine Microbes: The Grey Eminences of the Ocean
1.2 Uncovering the Secrets of Marine Microbes
1.3 Context is Everything
1.4 Exploring the Microbial Ocean
1.5 Microbes Gaining Importance in Citizen Science
1.6 Working Together to Refurbish Scientific Data
1.7 Research aims
Chapter 2 Results and Discussion
2.1 Overview
2.2 The Ocean Sampling Day Consortium
2.3 Between Ignorance and Concern - Interdisciplinary Approaches to Raising Awareness on Marine Environments
2.4 MyOSD 2014: Evaluating Oceanographic Measurements Contributed by Citizen Scientists in Support of Ocean Sampling Day
2.5 Understanding Marine Microbes, the Driving Engines of the Ocean
2.6 MyOSD 2015: Marine Microbiology Meets Citizen Science
2.7 ENVIRONMENTS and EOL: Identification of Environment Ontology Terms in Text and the Annotation of the Encyclopedia of Life
2.8 PubMap: A Crowdsourcing Application for Georeferenced Annotation of Scientific Publications
Chapter 3 Summary and Conclusion
3.1 The Ocean Sampling Day: Bringing Marine Researchers Closer Together
3.2 MyOSD: Engaging the Public in Marine Microbiology
3.3 Refurbishing Data for the Scientific Community
Chapter 4 Outlook
Additional Scientific Publications
Appendix
Acknowledgements
Bibliography
Table of Figures
Figure 1: The development of sequencing technologies over the past 30 years.
Figure 2: Growth of DNA sequencing capacity.
Figure 3: Google Trends search on citizen science.
Chapter 1
Introduction
1.1 Marine Microbes: The Grey Eminences of the Ocean
The ocean, which covers 70% of Earth's surface, represents the largest habitat for living
organisms. These organisms are mainly marine microbes, which include not only prokaryotes
such as bacteria and archaea but also viruses and microscopic eukaryotes such as
single-celled algae, protists or fungi (Fuhrman 2009). As the name microbe implies, they
share a common feature: their size lies on the microscopic scale. For example, the
smallest marine microorganisms include viruses of the Parvoviridae family, with a
diameter of only 20 nm (Munn 2011), and the archaeon Thermodiscus, which is only 200 nm in
diameter (Schulz and Jørgensen 2001). But the microbial world also has some "giants",
such as the bacterium Thiomargarita namibiensis. Its diameter of about 750 µm makes it
visible even to the naked eye (Schulz 1999). Regardless of their size range, microbes are
very diverse: they can live in aerobic as well as anaerobic conditions and use various organic
and inorganic chemicals, such as hydrogen sulfide, as energy sources (Madigan and Brock
2012). Microbes are able to inhabit every known niche and can even be found in extreme
environments such as hydrothermal vents in the deep ocean (Jannasch and Mottl 1985) or
hypersaline waters such as the Dead Sea (Pundak and Eisenberg 1981). Despite their small
size, marine microbes are not only the most abundant entities in the ocean (a total
of 1.2 × 10^29 microbial cells is estimated in the open ocean alone) but also comprise about
90% of the total marine biomass (Whitman et al. 1998). So it comes as no surprise that they
play major roles in the biogeochemical cycles of marine as well as terrestrial ecosystems. In
the upper 200 meters of the ocean, photosynthetic microbes (phytoplankton) are
responsible for half of the primary production on Earth (Field 1998), thus representing the
foundation of the marine food chain. At the same time they produce around 50% of the
oxygen in the planet's atmosphere, providing this vital component of life to all breathing
organisms on Earth (Walker 1980). Additionally, through photosynthesis they
assimilate CO2, building up the biological carbon pump, which is a crucial scientific and
political topic in these days of climate change driven by increasing anthropogenic CO2 emissions
(Passow and Carlson 2012). They also form the foundation of the nitrogen cycle
and are responsible for the fixation of atmospheric nitrogen (N2), nitrification, denitrification
and nitrate reduction (Arrigo 2005), making this limiting nutrient available to other
organisms. Moreover, they are key players in the biogeochemical cycles of phosphate,
sulfur, iron and manganese (Kirchman 2008). For these reasons marine microbes have a
crucial impact on our everyday lives, and marine microbiology has thus become a flourishing field
of study in recent decades. Considering all the discoveries accomplished to date, we
might appear to be quite knowledgeable, but even seemingly basic questions such as "Who
is out there?", "What are they doing?" and "How do they change and influence their surrounding
environment and vice versa?" are far from being fully answered. There is still much
more we need to explore for a deeper understanding of marine microbial
biogeochemistry, ecology and biodiversity.
1.2 Uncovering the Secrets of Marine Microbes
Marine microbes are difficult to study not only because of their small size but also because only
around 1-5% of them can be cultivated and studied in the laboratory (Amann et al. 1995).
This is mainly due to the complexity involved in artificially reproducing their specific living
conditions. To successfully grow a species in pure culture, microbiologists have to identify the
required nutrients and physicochemical conditions and supply them at exactly the right levels. This is highly
time-consuming and tedious work, and extreme environments are especially hard to mimic (Baker
et al. 2003; Alain and Querellou 2009). Therefore, only massive advances in
culture-independent research technologies in the late 20th century made it possible to shed light on
microbial diversity and the important role of microbes in the world's biogeochemical cycles. A first
significant advance was the use of the 16S ribosomal RNA (16S rRNA) from the 30S
small subunit (Spirin 1999) of the prokaryotic ribosome to analyze the phylogenetic diversity
of prokaryotes. The 16S rRNA gene has an essential function in the translation of messenger
RNA (mRNA) into proteins and is unique to prokaryotes. Due to its critical function, part of
its sequence is highly conserved and the mutation rate is relatively low, while other parts are
more variable. The conserved part at the beginning of the gene is usually used as a primer
target, while the variable part serves to analyze interspecific polymorphism. Eukaryotes do
not possess a 16S rRNA gene but instead an 18S rRNA gene, which serves the same function
(Clarridge 2004). By using 16S rRNA as a phylogenetic marker, Woese and colleagues
pioneered comparative 16S rRNA analysis and revolutionized the understanding of
prokaryotic evolution and taxonomy (Fox et al. 1980; Woese 1987). Furthermore, the