Integration of Orthoptera collection data within a
Virtual Museum:
the German Orthoptera Collections Database
K. Riede1, S. Ingrisch1 and C. Dietrich2
1 Zoologisches Forschungsinstitut und Museum Alexander Koenig (ZFMK), Adenauerallee 150-164,
D-53113 Bonn, Germany
2 Department of Neural Information Processing, University Ulm, D-89069 Ulm, Germany
E-mail: sigfrid.ingrisch@planet-interkom.de; k.riede.zfmk@uni-bonn.de; christian.dietrich@neuro.informatik.uni-ulm.de
Abstract.
The wealth of information contained within museum collections can only be tapped by digitising collection
data, and making them available on-line. The Global Biodiversity Information Facility (GBIF: www.gbif.org)
has been established to provide an interoperable network of biodiversity databases and the necessary
information technology tools. The German Ministry of Science and Education is funding the EDIS-project
(Entomological Data and Information System: www.insects-online.de) to digitise and harmonise the rich,
but scattered entomological collections housed at various German institutions. The core of the Orthoptera
subproject is a specimen-based database of important Orthoptera collections in Germany, accessible by
an internet-based user interface (Virtual Museum; German Orthoptera collections database: DORSA, see
Poster 2).
Key words:
biodiversity databases, geographical information systems, bio-acoustics, automatized song classification.
Sample data sets from the respective tables are shown in Fig. 2.
Note that the tables can be joined by their IDs into one large table containing all the information. The
relational data model saves storage space and time by storing repetitive information only once, in separate tables.
Orthopterists are in the privileged situation of having the Orthoptera Species File (OSF: Otte and
Naskrecki 1997) as a taxonomic backbone; it is among the few global species registers already available
on the world-wide web (http://viceroy.eeb.uconn.edu/Orthoptera).
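The idea of linking tables by their IDs can be illustrated with a minimal sketch. It is not the DORSA schema: the table and column names (`specimen`, `determination`, `det_id`, `locality`) and the sample rows are hypothetical, chosen only to show how a taxon name stored once can be joined to many specimen records.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical, simplified tables."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE determination (
            det_id INTEGER PRIMARY KEY,
            taxon  TEXT NOT NULL
        );
        CREATE TABLE specimen (
            specimen_id INTEGER PRIMARY KEY,
            det_id      INTEGER NOT NULL REFERENCES determination(det_id),
            locality    TEXT
        );
        """
    )
    # The taxon name is stored once ...
    con.execute("INSERT INTO determination VALUES (1, 'Galidacris sp.')")
    # ... and referenced by ID from every specimen (placeholder localities).
    con.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?)",
        [(101, 1, "locality A"), (102, 1, "locality A"), (103, 1, "locality B")],
    )
    return con


def flat_view(con):
    """Join both tables by ID into the 'one large table' view."""
    return con.execute(
        """
        SELECT s.specimen_id, d.taxon, s.locality
        FROM specimen AS s
        JOIN determination AS d ON d.det_id = s.det_id
        ORDER BY s.specimen_id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in flat_view(build_demo_db()):
        print(row)
```

Correcting a determination then means changing a single row in one table, rather than every specimen record that carries the name.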
Fig. 2: Sample tables (simplified) for individuals of Galidacris sp.: specimen table and determination table.
Storing Geo-Information
For all specimens with reliable locality information, collection sites will be geo-referenced by
latitude/longitude co-ordinates, which can be mapped by any geographical information system (GIS) and
intersected with environmental data. A first prototype for a Java-based graphical user interface can be found at
www.groms.de. This interface allows geographic queries, retrieval and mapping of species data.
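A geographic query of this kind can be sketched as a simple bounding-box filter over geo-referenced records. This is not the prototype's code: the `Record` type, its fields and the example co-ordinates are hypothetical, and the sketch assumes decimal degrees and a box that does not cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    specimen_id: int
    taxon: str
    lat: float  # decimal degrees, north positive
    lon: float  # decimal degrees, east positive


def in_bbox(rec, south, west, north, east):
    """True if the record lies inside the box (box must not cross 180 degrees)."""
    return south <= rec.lat <= north and west <= rec.lon <= east


def query_bbox(records, south, west, north, east):
    """Return all records whose collection site falls inside the box."""
    return [r for r in records if in_bbox(r, south, west, north, east)]


if __name__ == "__main__":
    # Hypothetical records with made-up co-ordinates, for illustration only.
    records = [
        Record(1, "taxon A", 1.5, 110.3),
        Record(2, "taxon B", -3.1, -60.0),
        Record(3, "taxon A", 2.0, 111.0),
    ]
    hits = query_bbox(records, south=0.0, west=109.0, north=5.0, east=115.0)
    print([r.specimen_id for r in hits])
```

The resulting list of records could then be passed on to a mapping component or intersected with environmental layers.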
Georeferencing - the geographic bottleneck
Computer-aided visualisation of locality data needs co-ordinates (latitude and longitude). Assigning
co-ordinates to localities as typed on specimen labels is called geo-referencing.
Today, collectors already geo-reference their material reliably with the help of GPS. But if we want to tap
the rich geographic information stored on the labels of specimens already in collections, we have to look up
co-ordinates manually, using atlases or gazetteers (online: Alexandria Project).
Given the huge number of specimens in museums (an estimated 5000 type specimens in Berlin alone),
this sounds like an impossible task! However, the task becomes feasible if we think in terms of collectors
and develop a data model to geo-reference collection trips (itineraries).
The itinerary model - a solution?
Note that one collector usually collects thousands of specimens. Historically, collections included distinct
groups of organisms, from insects to plants, and the material was distributed among different institutions. This
means that today taxonomists at different institutions might each be busy geo-referencing the localities of the
famous Sarawak expedition by Mjoeberg: inter alia, type specimens of a frog and a cricket were collected there
(Leptobrachella mjoebergi, Itara mjobergi CHOPARD, 1930).
It is therefore much more efficient to geo-reference Mjoeberg's itinerary once, and make these data available in
digital format.
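One possible shape for such an itinerary register is sketched below, as an illustration rather than the project's data model. The class and field names (`ItineraryRegister`, `Stop`) are hypothetical, as are the collector, locality names and co-ordinates in the example; matching label text to a stop is reduced to a case-insensitive comparison, which real label data would not satisfy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    locality: str  # locality name as it appears on labels
    lat: float     # decimal degrees
    lon: float     # decimal degrees


class ItineraryRegister:
    """Central store: each stop of a collecting trip is geo-referenced once."""

    def __init__(self):
        self._stops = {}

    @staticmethod
    def _key(collector, locality):
        # Naive normalisation; real labels need far more tolerant matching.
        return (collector.strip().lower(), locality.strip().lower())

    def add_stop(self, collector, stop):
        self._stops[self._key(collector, stop.locality)] = stop

    def lookup(self, collector, locality):
        """Return the shared Stop for a label, or None if not yet registered."""
        return self._stops.get(self._key(collector, locality))


if __name__ == "__main__":
    register = ItineraryRegister()
    # Hypothetical collector, locality and co-ordinates.
    register.add_stop("Collector X", Stop("Camp 1", 1.0, 110.0))
    # Two institutions holding material from the same stop get the same answer.
    print(register.lookup("collector x", " Camp 1 "))
    print(register.lookup("Collector X", "Camp 2"))  # None: not registered yet
```

Every institution holding material from the same trip would query the register instead of repeating the gazetteer search.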
This approach is also useful today: there are certain research stations and localities where huge numbers
of Orthoptera have been collected and distributed to various institutions (Fig. 3).
Fig. 3: This picture shows a realistic scenario for biological collections: 3800 specimens have been collected
by one collector. They are distributed among 8 institutions, and eventually among different sections (depending
on the diversity of the sample). If localities are georeferenced in each case, they must be georeferenced
3 x 8 = 24 times (probably more often, if more sections are involved). With a central register of itineraries,
localities would be georeferenced only 3 times. This will be the only possibility to geo-reference museum
collections within a realistic time frame.
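The arithmetic of this scenario can be reproduced with a short sketch. It assumes that the factor 3 stands for three distinct localities and that every one of the 8 institutions holds material from each of them; the institution and locality names are placeholders.

```python
def georeferencing_effort(holdings):
    """Count look-ups with and without a central itinerary register.

    holdings: iterable of (institution, locality) pairs, one per specimen lot.
    Returns (look-ups if every institution works alone,
             look-ups if each locality is geo-referenced once centrally).
    """
    pairs = set(holdings)
    localities = {locality for _, locality in pairs}
    return len(pairs), len(localities)


if __name__ == "__main__":
    # Placeholder names: 8 institutions, each holding material from 3 localities.
    institutions = [f"institution {i}" for i in range(1, 9)]
    localities = ["locality A", "locality B", "locality C"]
    holdings = [(inst, loc) for inst in institutions for loc in localities]

    separate, central = georeferencing_effort(holdings)
    print(separate, central)  # 24 3
```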
Bioacoustics and Neuroinformatics
DORSA is a network project, connecting expertise in data-basing, collection management, systematics,
geographical information systems, bio-acoustics and neuroinformatics. The species-specific songs are used as a
knowledge base for song recognition algorithms based on neural networks. First results indicate that reliable
automatized classification is possible for songs of Grylloidea from South East Asia and Amazonia.
Fig. 4: Acoustic analysis of cricket songs for feature extraction. The analysis tools were programmed with
MatLab (by C. Dietrich).
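As an illustration of what a song feature can look like, the sketch below estimates a pulse rate from an amplitude envelope by counting rising threshold crossings. This is a deliberately simple stand-in written in Python, not the MatLab tools mentioned above; the choice of feature, the threshold and the synthetic test signal are all assumptions.

```python
def count_pulses(envelope, threshold):
    """Count rising crossings of `threshold` in an amplitude envelope."""
    pulses = 0
    above = False
    for value in envelope:
        is_above = value >= threshold
        if is_above and not above:
            pulses += 1  # a new pulse starts here
        above = is_above
    return pulses


def pulse_rate(envelope, sample_rate_hz, threshold=0.5):
    """Pulses per second: pulse count divided by the duration of the envelope."""
    return count_pulses(envelope, threshold) * sample_rate_hz / len(envelope)


def synthetic_envelope(n_pulses, pulse_len, gap_len):
    """Hypothetical test signal: rectangular pulses separated by silence."""
    return ([1.0] * pulse_len + [0.0] * gap_len) * n_pulses


if __name__ == "__main__":
    # 5 pulses in 100 samples at 1000 samples per second, i.e. within 0.1 s.
    env = synthetic_envelope(n_pulses=5, pulse_len=10, gap_len=10)
    print(pulse_rate(env, sample_rate_hz=1000))  # 50.0
```

Real recordings would first need filtering and envelope extraction, which this sketch leaves out.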
Fig. 5: Neural networks are used for cricket song classification. Many songs of several individuals from one
species are necessary to allow reliable feature extraction. The songs are administered by the DORSA database.
The neural networks are trained with a subset of all songs (at present, 215 songs from 137 individuals and
30 species).
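The principle of training a classifier on labelled songs can be shown with the simplest possible case: a single artificial neuron (perceptron) separating two classes. This is far simpler than the networks used in the project and is not their implementation. The two features, their scaling and all values are hypothetical; they are chosen to be linearly separable, which is what guarantees that training terminates.

```python
def train_perceptron(samples, max_epochs=1000):
    """Train one neuron on samples of the form ((f1, f2), label), label in {-1, +1}."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), label in samples:
            activation = w[0] * x1 + w[1] * x2 + b
            if label * activation <= 0:  # misclassified: apply the perceptron update
                w[0] += label * x1
                w[1] += label * x2
                b += label
                errors += 1
        if errors == 0:  # every training song is classified correctly
            break
    return w, b


def predict(w, b, features):
    """Return +1 or -1 for a feature pair."""
    x1, x2 = features
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1


# Hypothetical, scaled song features (e.g. pulse rate, carrier frequency);
# label -1 stands for one species, +1 for another.
TRAINING_SONGS = [
    ((0.20, 0.40), -1), ((0.25, 0.45), -1), ((0.22, 0.42), -1),
    ((0.60, 0.70), +1), ((0.65, 0.75), +1), ((0.62, 0.68), +1),
]

if __name__ == "__main__":
    w, b = train_perceptron(TRAINING_SONGS)
    print(all(predict(w, b, x) == y for x, y in TRAINING_SONGS))
```

With 30 species and several songs per individual, a multi-class network and a held-out test set would be needed; the sketch only shows the training loop in miniature.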