The much-heralded Semantic Web depends on the ability of machines to process webpages and certain data intelligently and to perform tasks better on behalf of end users. Material is linked together through machine-readable statements of relationships among ideas, people, events, and places. Linked data examples are beginning to abound in the scholarly information environment, appearing from both publishers and libraries. This webinar will showcase several such examples. Presenters will describe their motivations for investment in such projects and discuss interfaces and other early outcomes.
NISO Webinar: Return on Investment (ROI) in Linking the Semantic Web
1. Linked Data for Smart Content
Ellen Hays, Elsevier Labs
e.hays@elsevier.com
Presented at: NISO Webinar on Semantic Web Linking, 28 September 2011
2. Why Smart Content?
Elsevier’s readers want more than text and images, that is, more than simply an online rendition of what we print. They want:
• Semantically enhanced content, such as mashups that combine information from diverse sources and in diverse media
• The ability to do semantically-motivated search
• Source data, and the tools to mine it effectively for more information
• I.e., information presented in ways that make it straightforward to use and understand
3. The challenge
How to do semantic enhancement at scale for STM publishing?
• In harmony with our culture and legacy
• Across the breadth of our content
• Within an ecosystem of authors, institutions, publishers, content suppliers, and funding agencies
4. Smarter Content
Elsevier content: Text, Tables, Images
Related Elsevier content and data
Linked data from partners and the Web
Concepts: Metadata, Entities, Relationships
Applied Smart Content
Better discovery
• Faceted search & browse
• Ontology-driven navigation
• Task-specific results
• Personalized/localized results
• Question answering
Better understanding
• Tag clouds
• Heatmaps
• Streamgraphs
• Scatterplots
• Time series
• Animations
Actionable, persuasive knowledge
• Topic pages
• Social network maps
• Geolocation maps
• Data mashups
• Text mining reports
6. Guiding principles
• Leverage our existing content production workflow and infrastructure
• Acknowledge a deep dependence on subject matter expertise, third parties and the Web for content enhancement and knowledge organization systems
• Deliver benefits across the complementary use cases of researcher and practitioner
7. Current approach
• Embrace linked data principles
• Reuse Web-standard vocabularies, taxonomies, ontologies and entity resources where possible
• Start with a focus on standards and infrastructure
• Leverage partners and acquisitions for content enhancement algorithms/capabilities
• Build out linked data design patterns for application development
• Explore new product opportunities around linked data
8. Linked data principles
1. Use URIs to name things
2. Use HTTP URIs so they can be looked up
3. Return useful data when things are looked up
4. Include links to other things in the returned data
“Linked data is just a term for how to publish data on the web while working with the web. And the web is the best architecture we know for publishing information in a hugely diverse and distributed environment, in a gradual and sustainable way.”
Tennison J, 2010. Why Linked Data for data.gov.uk? http://www.jenitennison.com/blog/node/140
Shotton D, Portwin K, Klyne G, Miles A, 2009. Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article. PLoS Comput Biol 5(4): e1000361. doi:10.1371/journal.pcbi.1000361
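The four linked data principles can be illustrated with a small, self-contained sketch. It simulates the Web with an in-memory dictionary instead of real HTTP requests. Every URI under example.org, the property names, and the helpers `lookup` and `follow_links` are hypothetical and do not come from the presentation.

```python
# A toy "web": each HTTP URI (principles 1 and 2) maps to the data returned
# when it is looked up (principle 3), given as (property, value) pairs.
# Some values are themselves URIs, i.e. links to other things (principle 4).
# All URIs and property names below are invented for illustration.
TOY_WEB = {
    "http://example.org/article/1": [
        ("title", "A sample article"),
        ("author", "http://example.org/person/a"),
        ("mentions", "http://example.org/concept/gene-x"),
    ],
    "http://example.org/person/a": [
        ("name", "A. Author"),
        ("affiliation", "http://example.org/org/lab"),
    ],
    "http://example.org/concept/gene-x": [("label", "Gene X")],
    "http://example.org/org/lab": [("label", "Example Lab")],
}


def lookup(uri):
    """Stand-in for an HTTP GET: return the statements about `uri`."""
    return TOY_WEB.get(uri, [])


def follow_links(start_uri):
    """Return every resource reachable from `start_uri`, in breadth-first order."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _prop, value in lookup(uri):
            # Treat any value that looks like an HTTP URI as a link to follow.
            if value.startswith("http://"):
                queue.append(value)
    return seen
```

Calling `follow_links("http://example.org/article/1")` walks from the article to its author, the concept it mentions, and the author's organization, which is the kind of traversal that linked identifiers make possible.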
9. Standards: Content satellites
Content satellites are XML documents containing RDF statements; for example:
• Tags from a taxonomy for a given document
• Document sections relevant to a given concept
• Document sections providing answers to a given question
• Learning objects compliant with a given state educational standard
• Genes mentioned in a given document
• Documents supporting or disputing conclusions of a given document
• Concepts that are in the areas of expertise for a given author
Goal is to balance expressivity and manageability for semantic enhancement
• Constrain the RDF serialization to allow existing XML-centric staff, tools, and workflows to accommodate RDF modeling for specific application use cases
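As a rough illustration of the first example (tags from a taxonomy for a given document), the sketch below builds a small RDF/XML document with Python's standard XML library. The slides do not show the actual satellite format or its serialization constraints, so the choice of the Dublin Core `dcterms:subject` property, the example.org URIs, and the function name `build_tag_satellite` are all assumptions.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT_NS = "http://purl.org/dc/terms/"  # Dublin Core terms; an assumed vocabulary choice


def build_tag_satellite(document_uri, concept_uris):
    """Return RDF/XML stating that `document_uri` has each concept as a subject.

    Hypothetical helper: one rdf:Description element per satellite, with one
    dct:subject child per taxonomy tag, so XML tools can process it easily.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dct", DCT_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": document_uri}
    )
    for concept_uri in concept_uris:
        ET.SubElement(
            description, f"{{{DCT_NS}}}subject", {f"{{{RDF_NS}}}resource": concept_uri}
        )
    return ET.tostring(root, encoding="unicode")
```

Keeping every satellite to a single, predictable element shape like this is one way an XML-centric workflow could handle RDF without a full RDF toolchain, although the real constraints used at Elsevier may differ.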
10. Infrastructure: Linked Data Repository
• Allows Elsevier platforms and applications to retrieve and store content enhancements
  • About Elsevier content
  • About third party content
• Allows third parties to store content enhancements
  • About primary and secondary content
• Provides a REST API for
  • CRUD operations on satellites as RDF named graphs
  • Simple, low-expressivity queries across stored named graphs
    • For <subject>, give me all objects for <property>
    • Give me all subjects that have <object> for <property>
    • These can be for sets of subjects and objects
• Supports content negotiation
• Optimized for high-volume read-write of RDF named graphs
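The operations listed for the repository can be modeled in memory to show what "simple, low-expressivity queries" over named graphs might look like. This is only a sketch: the real LDR is a REST service whose endpoints and data model are not described in the slides, and the class `NamedGraphStore` and its method names are hypothetical.

```python
class NamedGraphStore:
    """Toy in-memory stand-in for a repository of RDF named graphs.

    Each satellite is stored as a set of (subject, property, object) triples
    under its own graph URI. Class and method names are illustrative only.
    """

    def __init__(self):
        self._graphs = {}

    # --- CRUD on whole named graphs ---
    def put(self, graph_uri, triples):
        """Create or replace the graph stored under `graph_uri`."""
        self._graphs[graph_uri] = set(triples)

    def get(self, graph_uri):
        """Return a copy of the graph, or an empty set if it does not exist."""
        return set(self._graphs.get(graph_uri, set()))

    def delete(self, graph_uri):
        """Remove the graph if present."""
        self._graphs.pop(graph_uri, None)

    def _all_triples(self):
        for triples in self._graphs.values():
            yield from triples

    # --- The two query shapes from the slide, over sets of subjects/objects ---
    def objects(self, subjects, prop):
        """For the given subjects, return all objects of `prop`."""
        subjects = set(subjects)
        return {o for s, p, o in self._all_triples() if s in subjects and p == prop}

    def subjects(self, prop, objects):
        """Return all subjects that have one of `objects` for `prop`."""
        objects = set(objects)
        return {s for s, p, o in self._all_triples() if p == prop and o in objects}
```

Restricting queries to these two lookups keeps the interface far simpler than a general graph query language, which fits the slide's emphasis on high-volume reads and writes of whole graphs.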
11. Benefits of the LDR
• Unprecedented access to Elsevier content
• Key enabler for providing advanced semantic search across products
• Provides links to other data sources to provide further contextual enrichment
• Allow others to discover and integrate with Elsevier content
• Link content across domains
• Data can be pulled out of large amounts of text and organized for review and action
• Information mining for compliance and research
• Create mashups from multiple data sources
• Present information with enhanced visualization
12. Mining text for semantic data
Building the databases that support content enrichment includes extracting from unstructured text:
― mentions of concepts
― mentions of relations between concepts
― other semantic information, such as document metadata and context indicators
http://www.ifs.tuwien.ac.at/dm/
13. Mining text for semantic data
• We’re exploring a range of tools and techniques to do text mining, including:
  • Rule-based information extraction
  • Statistical information extraction
  • Mapping terms in text to thesauri (Ei Thesaurus, EMTREE) or other sources of lexical/semantic information
• Working with GATE and UIMA components to design and implement language processing pipelines, and with a number of text mining vendors
• Because Elsevier publishes in a broad range of subject areas, content types, and languages, no one approach is appropriate for all uses
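Of the techniques listed, mapping terms in text to a thesaurus is the easiest to sketch. The example below does a plain dictionary lookup with a regular expression. It is not how GATE, UIMA, or any vendor tool works, and the thesaurus entries and concept identifiers are invented for illustration rather than taken from the Ei Thesaurus or EMTREE.

```python
import re

# Hypothetical mini-thesaurus: lowercase surface form -> preferred concept ID.
# Two synonyms map to the same concept, as a real thesaurus would allow.
TOY_THESAURUS = {
    "heart attack": "concept:myocardial-infarction",
    "myocardial infarction": "concept:myocardial-infarction",
    "aspirin": "concept:acetylsalicylic-acid",
}


def find_concept_mentions(text, thesaurus):
    """Return (start, end, matched_text, concept_id) for each term found in `text`."""
    # Try longer terms first so multi-word entries win over shorter overlaps.
    terms = sorted(thesaurus, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), thesaurus[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]
```

A lookup like this ignores inflection, abbreviations, and ambiguity, which is part of why the slide notes that no one approach is appropriate for all uses.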
14. Semantic and lexical models
Supporting our text mining efforts is an increased focus on acquiring, building, and maintaining vocabularies and semantic models, including:
• Dictionaries/thesauri
• Taxonomies
• Ontologies
We reuse Web-standard semantic and lexical resources wherever possible, but also create application-specific domain models, sometimes by hand, for narrow domains
These semantic resources are also stored in the LDR, which links semantic data to documents, to non-text content, and to other resources, to create a web of meaningful and re-usable information
15. Smart Content design patterns
• Linked data browser: Link-following navigation over linked graph of RDF resources
• Mashup: Integrated presentation of content and data across multiple sources
• Semantic search: Free text/faceted search over document/data sets
• Semantic query: Relational query over aggregated/federated sets of RDF statements
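The semantic query pattern, a relational query over RDF statements, can be sketched as matching a list of triple patterns against a set of triples and joining on shared variables. This is a deliberately simplified stand-in for a query language such as SPARQL. The function names, the `?name` variable convention, and the sample data are assumptions, not anything shown in the presentation.

```python
def match_pattern(pattern, triple, bindings):
    """Try to extend `bindings` so that `pattern` matches `triple`.

    Terms starting with "?" are variables; other terms must match exactly.
    Returns the extended bindings, or None if the triple does not match.
    """
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None
    return extended


def query(triples, patterns):
    """Return every variable binding that satisfies all patterns (a join)."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Invented sample statements, e.g. aggregated from several satellites.
SAMPLE_TRIPLES = [
    ("doc1", "mentions", "geneX"),
    ("doc2", "mentions", "geneX"),
    ("doc1", "author", "alice"),
    ("doc2", "author", "bob"),
]
```

For example, `query(SAMPLE_TRIPLES, [("?doc", "mentions", "geneX"), ("?doc", "author", "?who")])` asks who wrote the documents that mention geneX, joining the two patterns on `?doc`.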