Schuh ecn2013 tcn_data_structure

The structure of insect—plant host data
as derived from museum collections:
An analysis based on data from the
NSF-funded Tritrophic Database —
Thematic Collections Network
(TTD-TCN)
Randall T. Schuh
Katja Seltmann
Christine A. Johnson
American Museum of Natural History

TTD-TCN Rationale
“The data captured via ADBC funding will
dramatically improve our understanding of the
relationships among the more than 11,000
species of North American Hemiptera (scale
insects, aphids, leafhoppers, true bugs, and
relatives), their food plants, and the wasps that
parasitize the hemipterans.”

The data we will evaluate today were captured
through a Web-based application developed with
NSF Planetary Biodiversity Inventory funding and
used by the TTD-TCN. This application,
known as Arthropod Easy Capture (AEC), is built on
open-source code and is being implemented as an
appliance by the ADBC-funded Home Uniting Biocollections (HUB, iDigBio). Through that
implementation it will be installable with a
“one-click” installer. Server code is online at SourceForge:
http://sourceforge.net/projects/arthropodeasy/

Specimen Count by Project
(1,144,240)

Sources of Insect—Plant Host Data

Data on insect-plant relationships are available
primarily from labels on insect specimens, as
opposed to labels on plant specimens.
Substantial amounts of data were captured for
the family Miridae on a world basis under NSF
Planetary Biodiversity Inventory funding
between 2003 and 2011.
The TTD-TCN is a collaboration among 17 US
entomological institutions. The institutional
contributions from these two projects, as
represented by numbers of specimen records,
are seen in the following graph.
The TTD-TCN is defining the field structure for
host data as used by iDigBio and by other
Web aggregators such as DiscoverLife.org.

In order to evaluate the nature of insect-host plant data
derived from collections, we need to look at groups
that offer large data sets. Necessary attributes are:
1. Large numbers of specimen records with host
information
2. Large numbers of collecting events
3. Substantial diversity of host taxa
At the present time the following taxa in our database
meet those criteria:

Hemiptera
  Sternorrhyncha
    Aphididae (4,400 species worldwide)
  Auchenorrhyncha
    Membracidae (3,200 species worldwide)
  Heteroptera
    Miridae (11,000 species worldwide)
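
The three attributes listed above can be tallied directly from specimen records. The following is a minimal sketch, not the TTD-TCN's own code: the record layout and the field names `family`, `event_id`, and `host_genus` are assumptions made for illustration, and the sample records are invented.

```python
from collections import defaultdict


def summarize_taxa(records):
    """Tally, per insect family, the three attributes named in the list.

    Each record is a dict with the hypothetical keys "family",
    "event_id", and "host_genus" (None or "" when no host was recorded).
    """
    with_host = defaultdict(int)   # specimen records carrying host information
    events = defaultdict(set)      # distinct collecting events
    hosts = defaultdict(set)       # distinct host taxa

    for rec in records:
        family = rec["family"]
        events[family].add(rec["event_id"])
        host = rec.get("host_genus")
        if host:
            with_host[family] += 1
            hosts[family].add(host)

    return {
        family: {
            "records_with_host": with_host[family],
            "collecting_events": len(events[family]),
            "host_taxa": len(hosts[family]),
        }
        for family in events
    }


# Invented sample records, for illustration only.
sample = [
    {"family": "Miridae", "event_id": "e1", "host_genus": "Larrea"},
    {"family": "Miridae", "event_id": "e1", "host_genus": "Larrea"},
    {"family": "Miridae", "event_id": "e2", "host_genus": None},
    {"family": "Aphididae", "event_id": "e3", "host_genus": "Ambrosia"},
]
summary = summarize_taxa(sample)
```

What counts as a "large" number for each attribute is not stated here, so the sketch reports the counts and leaves any cutoff to the reader.
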
Raw data for each taxon are distributed as seen in
the following four graphs.

[Figure: Collection Events by Year Specimens Collected. Panels: Miridae; Aphididae; Membracidae; Combined data]

[Figure: Host Records as a Proportion of Collecting Events. Categories: Hosts unique; Hosts non-unique; Without hosts. Panels: Aphididae; Miridae; Membracidae]

Algorithmic Assessment of
Data Quality

COLLECTING EVENT DATA:
The occurrence of an insect species on a plant genus

Compute frequency of occurrence on a particular plant genus

Compare with all insect collecting events on any plant
x = y′ + y

ANALYSIS: evaluate insect/plant associations with different scores

Scores: High, Medium, or Low confidence in insect-plant association
High: f(y) ≥ 15.00% and y ≥ 5
Medium: (f(y) ≥ 2.00% and y ≥ 3) ∨ (f(y) ≥ 15.00% and y ≥ 2)
Low: not high or medium

HEURISTIC DATA (x = 1):
Larvae present?
Multiple specimens?
Voucher specimen available?

Modify algorithm to improve fit of model to data based on results
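
The thresholds on this slide (f(y) ≥ 15.00% with y ≥ 5; f(y) ≥ 2.00% with y ≥ 3, or f(y) ≥ 15.00% with y ≥ 2; otherwise "not high or medium") can be read as a three-way scoring rule. The sketch below is one such reading and not the authors' implementation. It assumes that y is the number of collecting events of an insect species on a given plant genus, that y′ is the number of its events on any other plant, that f(y) = y / x with x = y′ + y, and that the first threshold pair gives High and the second gives Medium. The function name `confidence_score` is hypothetical.

```python
def confidence_score(y, y_other):
    """Score one insect species / plant genus association.

    y        -- collecting events of the insect on this plant genus
    y_other  -- collecting events of the insect on any other plant (y')
    """
    x = y + y_other              # x = y' + y, all events for the insect
    if x == 0:
        raise ValueError("no collecting events for this insect species")

    f = y / x                    # assumed definition: f(y) = y / x

    if f >= 0.15 and y >= 5:
        return "High"
    if (f >= 0.02 and y >= 3) or (f >= 0.15 and y >= 2):
        return "Medium"
    return "Low"                 # not high or medium
```

Under these assumptions an insect taken on one plant genus in 5 of 10 events scores High, one taken in 3 of 100 events scores Medium, and a single collecting event (x = 1) always scores Low, which is where the heuristic questions come in.
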

Analysis

Using Larrea (creosote bush) as an
example host

Miridae/Larrea Association Network

Miridae/Larrea Association Network with High Confidence

Reasons for Low Host Scores and
Methods for Improving Data Quality

Reasons for Low Scores
1. Actual low host specificity: Indicated when a large number of
collecting events are distributed across many plant taxa.

2. Movement of adult specimens to alternative food sources:
Algorithm points out apparent vagility when there are multiple
hosts and little or no host repetition across collecting events.

3. Commingling of specimens in the field: Algorithm points out
problem when insect specimen numbers are low for a host
taxon and when there is lack of repetition of host occurrence.

4. Mislabeling of insects for hosts from a collecting event: Difficult
to distinguish from actual polyphagy in cases where all
specimens from an event are mislabeled. Often seen as a
unique host for a given insect taxon. More fieldwork needed.


5. Single collecting events: Indistinguishable from absolute host
fidelity based on multiple events, except no confidence limit can
be assessed. Heuristics such as presence of larvae and large
numbers of specimens give credence to presumed association.
Resolved only by further fieldwork.
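
Several of the patterns in this list can be recognized from per-host event counts alone. The sketch below is illustrative only and is not the algorithm used in the analysis: the function name, the flag wording, and the `many_hosts` cutoff of 5 are all assumptions. It takes, for one insect species, a mapping from plant genus to number of collecting events.

```python
def low_score_patterns(host_events, many_hosts=5):
    """Flag patterns that may explain a low host score.

    host_events -- dict mapping plant genus to number of collecting
                   events for one insect species
    many_hosts  -- hypothetical cutoff for "many plant taxa"
    """
    x = sum(host_events.values())    # all collecting events with a host
    n_hosts = len(host_events)
    flags = []

    if x == 1:
        # Reason 5: a single event, so no confidence limit can be assessed.
        flags.append("single collecting event")
    elif n_hosts > 1 and max(host_events.values()) == 1:
        # Reasons 2 and 3: several hosts, none of them repeated.
        flags.append("multiple hosts, no host repetition")

    if n_hosts >= many_hosts:
        # Reason 1: events distributed across many plant taxa.
        flags.append("events spread across many plant taxa")

    return flags
```

Counts alone cannot separate mislabeling from real polyphagy, so the flags only point to cases that need further fieldwork.
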

1. Insect collections offer substantial data on host
relationships even though a majority of the specimens
lack such information.
2. Our algorithm demonstrates a method for assessing
data quality on a large scale. Our initial analyses show
that:
- We can have confidence in a significant proportion
of the available information.
- The data demonstrate a substantial degree of host
specificity in our three target groups.

3. Degree of host specificity requires a scoring method
that takes into account biological attributes, collecting
techniques, and approaches to data capture in the field.

Acknowledgments
• Participating TCN and PBI Institutions
• iDigBio
• AMNH Database Data-entry Personnel
• Participating TCN Data-entry Personnel
• Michael D. Schwartz
• National Science Foundation
