

Chuck Henry and Christa Williford, DLF Forum, November 2011


Lessons from the Digging Into Data Challenge

What Information Professionals Should Know about Computationally Intensive Research in the Humanities and Social Sciences



For the past two years, the Council on Library and Information Resources (CLIR) has partnered with the National Endowment for the Humanities Office of Digital Humanities (NEH‐ODH) in an intensive assessment of the inaugural year of the Digging Into Data grant program. Launched in 2009, this unprecedented international initiative involved four funding agencies in three countries and supported eight international collaborative research projects in the social sciences and humanities, all of which bring innovative applications of computer technology to bear on the collection, mining, and interpretation of large data corpora. Here is a sampling of what CLIR has learned:



Lesson 1: Computationally intensive research requires open sharing of resources among participants. Essential resources include hardware, software, data corpora, and communication tools. Information professionals can facilitate open sharing by helping researchers forge partnership agreements based upon trust and transparency.



Example: To support the project “Digging Into Data to Answer Authorship Related Questions,” participants drafted a Memorandum of Understanding that made clear how shared resources would be funded and established a plan for project communication and credit sharing. See: Michael Simeone, Jennifer Guiliano, Rob Kooper, and Peter Bajcsy, "Digging into Data Using New Collaborative Infrastructures Supporting Humanities‐based Computer Science Research." First Monday 16.5 (2 May 2011):
http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/3372/2950



Lesson 2: Computationally intensive research projects rely upon diverse kinds of expertise: domain (or subject) expertise, analytical expertise, data management expertise, and project management expertise. Information professionals can offer and/or develop skills and knowledge in each of these areas, enabling them to participate actively as research partners.



Example: For the project “Digging Into the Enlightenment: Mapping the Republic of Letters,” Stanford University provided resources and project management support to its international partners through “embedded” information professional Nicole Coleman, who is based at the Stanford Humanities Center. As Academic Technology Specialist, Coleman focuses on finding new research opportunities and supporting the production of new knowledge, and she has developed expertise in the kinds of infrastructure and management practices that contribute to successful research collaborations. For more information about this project, see:
http://enlightenment.humanitiesnetwork.org/



Lesson 3: When it comes to analytical tools, one size does not fit all. As their questions evolve throughout their projects, researchers want the flexibility to alternate between looking closely at select data and performing “distant” readings of entire corpora. Information professionals can educate researchers to help them refine their questions, select appropriate tools, and use their tools effectively.



Example: While both close and distant readings of evidence characterized most of the Digging Into Data project methodologies, Richard Healey, co‐principal investigator of “Railroads and the Making of Modern America,” has an interesting take on why humanities and social science data requires the continual adaptation and evolution of analytical tools. He hypothesizes many “different levels of data‐related operations,” each of which determines the research outcomes that are possible at that level. He writes:



    The levels relate to the degree of scholarly input involved and I see them…as a data ‘hierarchy’:
        • Level 0 ‐ Data so riddled with error it should come with a serious intellectual health warning! (We have much more of this than most people seem willing to admit and much of the Google data from scanned railroad reports admirably fits into this category).
        • Level 1 ‐ Raw datasets…corrected for obvious errors.





        • Level 2 ‐ Value‐added datasets: those that have been standardised/coded etc. in a consistent fashion according to some recognised scheme or procedure, which may require significant domain expertise [to produce]…
        • Level 3 ‐ Integrated data resources: These will contain value‐added datasets but…explicit linkages have been made between multiple related datasets (or they have been coded/tagged in such a way that the linkages can be made by software). Hence, these are not just 'data' because so much additional research time has been invested in them, which is why I prefer the word ‘resource’…. Many GIS resources are of this kind, because they require linkage of spatial and non‐spatial data.
        • Level 4 ‐ 'Digging Enabler' or 'Digging Key' data/classificatory resources: These require extensive domain expertise, and use of/analysis of multiple sources/relevant literature to create. They facilitate extensive additional types of digging activity to be undertaken on substantive projects beyond those of the investigators who created them, i.e. they become 'authority files' for the wider research community. Gazetteers, structured occupational coding systems, data cross‐classifiers etc. fit into this category.



Lesson 4: Big data isn’t just for scientists anymore. Not only do humanists and social scientists work with big data, their research can also produce large data corpora. Some scholars engaged in computationally intensive research see the new data they create as their most significant research outcomes. Researchers risk losing their valuable data unless they take steps to protect and sustain it. As practices for publishing research data evolve, information professionals can curate this data, working with scholars to appraise, normalize, validate, provide access to and, ultimately, preserve research data for the long term.



Example: In the final white paper for “Mining a Year of Speech,” John Coleman draws a compelling comparison between the sizes of data sets with which current major science and humanities projects are engaged (see below). This paper is available at:
http://www.phon.ox.ac.uk/files/pdfs/MiningaYearofSpeechWhitePaper.pdf







                                                                                                                  



Di d dlf_handout

  • 1. Chuck
Henry
and
Christa
Williford,
DLF
Forum,
November
2011
 Lessons
from
the
Digging
Into
Data
Challenge
 What
Information
Professionals
Should
Know
about
Computationally
Intensive
Research
in
the
Humanities
and
 Social
Sciences
 
 For
the
past
two
years,
the
Council
on
Library
and
Information
Resources
(CLIR)
has
partnered
with
the
National
 Endowment
for
Humanities
Office
of
Digital
Humanities
(NEH‐ODH)
in
an
intensive
assessment
of
the
inaugural
 year
of
the
Digging
Into
Data
grant
program.
Launched
in
2009,
this
unprecedented
international
initiative
involved
 four
funding
agencies
in
three
countries
and
supported
eight
international
collaborative
research
projects
in
the
 social
sciences
and
humanities,
all
of
which
bring
innovative
applications
of
computer
technology
to
bear
on
the
 collection,
mining,
and
interpretation
of
large
data
corpora.
Here
is
a
sampling
of
what
CLIR
has
learned:
 
 Lesson 1: Computationally intensive research requires open sharing of resources among participants. Essential resources include hardware, software, data corpora, and communication tools. Information professionals can facilitate open sharing by helping researchers forge partnership agreements based upon trust and transparency.
 
 Example: To support the project “Digging Into Data to Answer Authorship Related Questions,” participants drafted a Memorandum of Understanding that made clear how shared resources would be funded and established a plan for project communication and credit sharing. See: Michael Simeone, Jennifer Guiliano, Rob Kooper and Peter Bajcsy, "Digging into Data Using New Collaborative Infrastructures Supporting Humanities‐based Computer Science Research." First Monday 16.5 (2 May 2011):
 http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/3372/2950
 
 Lesson 2: Computationally intensive research projects rely upon diverse kinds of expertise: domain (or subject) expertise, analytical expertise, data management expertise, and project management expertise. Information professionals can offer and/or develop skills and knowledge in each of these areas, enabling them to participate actively as research partners.
 
 Example: For their project, “Digging Into the Enlightenment: Mapping the Republic of Letters,” Stanford University provided resources and project management support to their international partners through “embedded” information professional Nicole Coleman, who is based at the Stanford Humanities Center. As Academic Technology Specialist, Nicole’s focus is on finding new research opportunities and supporting the production of new knowledge, and she has developed expertise in the kinds of infrastructure and management practices that contribute to successful research collaborations. For more information about this project, see:
 http://enlightenment.humanitiesnetwork.org/
 
 Lesson 3: When it comes to analytical tools, one size does not fit all. As their questions evolve throughout their projects, researchers want the flexibility to alternate between looking closely at select data and performing “distant” readings of entire corpora. Information professionals can educate researchers to help them refine their questions, select appropriate tools, and use their tools effectively.
 
 Example: While both close and distant readings of evidence characterized most of the Digging Into Data project methodologies, Richard Healey, co‐principal investigator of “Railroads and the Making of Modern America,” has an interesting take on why humanities and social science data requires the continual adaptation and evolution of analytical tools. He hypothesizes many “different levels of data‐related operations,” and these levels determine the research outcomes that are possible at each level. He writes:
 
 The levels relate to the degree of scholarly input involved and I see them…as a data ‘hierarchy’:
 • Level 0 ‐ Data so riddled with error it should come with a serious intellectual health warning! (We have much more of this than most people seem willing to admit and much of the Google data from scanned railroad reports admirably fits into this category).
 • Level 1 ‐ Raw datasets…corrected for obvious errors.
 • Level 2 ‐ Value‐added datasets: those that have been standardised/coded etc. in a consistent fashion according to some recognised scheme or procedure, which may require significant domain expertise [to produce]…)
 • Level 3 ‐ Integrated data resources: These will contain value‐added datasets but…explicit linkages have been made between multiple related datasets (or they have been coded/tagged in such a way that the linkages can be made by software). Hence, these are not just 'data' because so much additional research time has been invested in them, which is why I prefer the word ‘resource’…. Many GIS resources are of this kind, because they require linkage of spatial and non‐spatial data.
 • Level 4 ‐ 'Digging Enabler' or 'Digging Key' data/classificatory resources: These require extensive domain expertise, and use of/analysis of multiple sources/relevant literature to create. They facilitate extensive additional types of digging activity to be undertaken on substantive projects beyond those of the investigators who created them, i.e. they become 'authority files' for the wider research community. Gazetteers, structured occupational coding systems, data cross‐classifiers etc. fit into this category.
 
 Lesson 4: Big data isn’t just for scientists anymore. Not only do humanists and social scientists work with big data, their research can also produce large data corpora. Some scholars engaged in computationally intensive research see the new data they create as their most significant research outcomes. Researchers risk losing their valuable data unless they take steps to protect and sustain them. As practices for publishing research data evolve, information professionals can curate this data, working with scholars to appraise, normalize, validate, provide access to and, ultimately, preserve research data for the long term.
 
 Example: In the final white paper for “Mining a Year of Speech,” John Coleman draws a compelling comparison between the sizes of data sets with which current major science and humanities projects are engaged (see below). This paper is available at:
 http://www.phon.ox.ac.uk/files/pdfs/MiningaYearofSpeechWhitePaper.pdf