- 1. When Metadata is the Content
From Articles to Knowledge
SSP 2009 Annual Meeting
Chris Beguel – Director of Sales – TEMIS
Baltimore, MD – May 09
- 2. Where are we? Semantic Age!
Copyright © 2009 TEMIS – All rights reserved
- 3. From Words to Meaning…
Trimilax 500 mg makes me feel dizzy after ingestion
Term: Prop. | Num. | Abbrev. | Verb/3rd | Pron. | Verb | Adj. | Prep. | Noun
Entity: Product | Dosing | Action | Target | State | Event | Action
Fact: Drug | Symptom | Condition
Knowledge: Potential Adverse Effect
Drug = Trimilax
Dosing = 500 mg
Symptom = Dizziness
When = After administration
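The step from words to a knowledge record can be sketched in a few lines of Python. This is only an illustrative, rule-based approximation and not how TEMIS products work: the lexicons `DRUGS` and `SYMPTOMS`, the regular expressions, and the function name `extract_adverse_effect` are all assumptions made for this example.

```python
import re

# Hypothetical mini-lexicons; a real system would rely on curated terminologies.
DRUGS = {"trimilax"}
SYMPTOMS = {"dizzy": "dizziness", "tired": "tiredness", "nauseous": "nausea"}

# Assumed surface patterns for dosing and timing expressions.
DOSE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|g|ml)\b", re.IGNORECASE)
WHEN_RE = re.compile(r"\b(after|before|during)\s+(\w+)", re.IGNORECASE)


def extract_adverse_effect(sentence: str) -> dict:
    """Return a flat 'Potential Adverse Effect' record for one sentence."""
    record = {"drug": None, "dosing": None, "symptom": None, "when": None}
    # Word level: look each alphabetic token up in the lexicons.
    for token in re.findall(r"[A-Za-z]+", sentence):
        low = token.lower()
        if low in DRUGS and record["drug"] is None:
            record["drug"] = token
        if low in SYMPTOMS and record["symptom"] is None:
            record["symptom"] = SYMPTOMS[low]
    # Pattern level: dosing and timing are matched on the raw sentence.
    dose = DOSE_RE.search(sentence)
    if dose:
        record["dosing"] = f"{dose.group(1)} {dose.group(2).lower()}"
    when = WHEN_RE.search(sentence)
    if when:
        record["when"] = f"{when.group(1).lower()} {when.group(2).lower()}"
    return record


record = extract_adverse_effect("Trimilax 500 mg makes me feel dizzy after ingestion")
print(record)
```

A production pipeline would add part-of-speech tagging and normalization (for example mapping "ingestion" to "administration"); the sketch only shows how separate layers of evidence end up in one record.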
- 4. Metadata? Understand!
Metadata
Title: Google gives drivers a hand at the gas pumps
Source: InformationWeek
Author: Antone Gonsalves
Date: November 7, 2007
Entities
Facts
- 5. Metadata? Understand!
Metadata
Entities
Companies: Gilbarco Veeder-Root, Gilbarco, Google, InformationWeek, T-Mobile, HTC, Qualcomm, Motorola
Persons: Lucy Sackett, Sackett
Locations: Atlanta, United States
Organizations: National Association of Conveni…
Technologies: Internet, Linux, Open-source, …
Product: New Service, Google Service
Facts
- 6. Metadata? Understand!
Metadata
Entities
Facts
Announcement: Who: Gilbarco; Whom: unknown; What: New Service; When: unknown
Launch: Who: Gilbarco; What: Google Service; When: early next week
Announcement: Who: Sackett; Whom: InformationWeek; What: unknown; When: unknown
Partnership: Who: Gilbarco; With whom: Google; When: unknown; State: Negative
Function: Who: Sackett; Company: Gilbarco; Function: spokeswoman
Alliance: Who: Google; With whom: T-Mobile, HTC, Qualcomm, Motorola; When: unknown
(Diagram: the same facts drawn as a graph linking each entity to its fact.)
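Facts of this kind map naturally onto small typed records. The following Python sketch is one possible representation, not a TEMIS data model: the `Fact` class, its field names, and the `facts_about` helper are assumptions, and the pairing of fields to fact types is read off the slide's diagram.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """One extracted fact; None stands for the slide's 'unknown'."""
    kind: str
    who: str
    what: Optional[str] = None
    whom: Optional[str] = None
    with_whom: Tuple[str, ...] = ()
    when: Optional[str] = None


# Facts transcribed from the slide (field pairing is an interpretation).
facts = [
    Fact("Announcement", who="Gilbarco", what="New Service"),
    Fact("Launch", who="Gilbarco", what="Google Service", when="early next week"),
    Fact("Announcement", who="Sackett", whom="InformationWeek"),
    Fact("Partnership", who="Gilbarco", with_whom=("Google",)),
    Fact("Alliance", who="Google",
         with_whom=("T-Mobile", "HTC", "Qualcomm", "Motorola")),
]


def facts_about(entity: str, facts: list) -> list:
    """Return every fact in which the entity takes part in any role."""
    return [f for f in facts
            if entity in (f.who, f.whom) or entity in f.with_whom]


google_facts = [f.kind for f in facts_about("Google", facts)]
print(google_facts)
```

Keeping "unknown" as `None` rather than as a string makes it easy to filter for facts that are complete enough to publish.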
- 8. What is Text Mining?
❖ Text Mining is an information access technology…
❖ Text Mining generates Knowledge
❖ Text Mining serves information consumers & producers
Diagram: Text Mining Back-End, Data Repository, Text Mining Front-End (Text Analytics)
- 9. 1. Enhanced Search Experience
From standard keyword search…
Simple recognition of words…
- 10. 1. Enhanced Search Experience
… to Entity & Fact search!
End-User Benefits
• Make searches comprehensive and precise
• Get more relevant documents
• Find what you don't know!
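The difference between keyword search and entity search can be illustrated with a toy index. Everything below is a hypothetical sketch: the two documents, their entity annotations, and the function names are invented for the example and assume that entity extraction has already been run.

```python
# Hypothetical corpus: each document carries its text plus the typed
# entities an upstream extraction step is assumed to have produced.
docs = {
    1: {"text": "Gilbarco announces a Google service for gas pumps",
        "entities": {("Company", "Gilbarco"), ("Company", "Google")}},
    2: {"text": "How to google a recipe in seconds",
        "entities": set()},
}


def keyword_search(term: str) -> list:
    """Plain word match: finds the string regardless of its meaning."""
    return sorted(doc_id for doc_id, doc in docs.items()
                  if term.lower() in doc["text"].lower().split())


def entity_search(entity_type: str, name: str) -> list:
    """Typed match: finds only documents where the name is that entity."""
    return sorted(doc_id for doc_id, doc in docs.items()
                  if (entity_type, name) in doc["entities"])


print(keyword_search("google"))            # both documents contain the word
print(entity_search("Company", "Google"))  # only the one about the company
```

The keyword query returns both documents, while the typed query drops the one that merely uses "google" as a verb, which is the precision gain the slide refers to.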
- 12. 2. Faceted Navigation
… to multi-dimensional faceted navigation
Point & click filtering
Ability to combine several filters at once (and/or)
Self-adjusting filters to refine the search
End-User Benefits
• Get a quick overview of document content
• Navigate within context-relevant information
• Rapidly focus on targeted documents
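Combining filters with and/or and recomputing the remaining facet values can be sketched as follows. The documents, facet names, and the helpers `apply_filters` and `facet_counts` are assumptions made for illustration; the slide does not describe an implementation.

```python
from collections import Counter

# Hypothetical annotated documents: facet name -> set of extracted values.
docs = [
    {"id": 1, "Company": {"Google", "Gilbarco"}, "Location": {"Atlanta"}},
    {"id": 2, "Company": {"Google", "HTC"}, "Location": {"United States"}},
    {"id": 3, "Company": {"Motorola"}, "Location": {"United States"}},
]


def apply_filters(docs, filters, mode="and"):
    """Keep documents matching all ('and') or any ('or') (facet, value) pairs."""
    if not filters:
        return list(docs)
    combine = all if mode == "and" else any
    return [d for d in docs
            if combine(value in d.get(facet, set()) for facet, value in filters)]


def facet_counts(docs, facet):
    """Recount facet values on the current result set (self-adjusting filters)."""
    return Counter(value for d in docs for value in d.get(facet, set()))


filters = [("Company", "Google"), ("Location", "United States")]
selected = apply_filters(docs, filters, mode="and")
print([d["id"] for d in selected])
print(facet_counts(selected, "Company"))
```

Recomputing `facet_counts` on the filtered set is what makes the filters self-adjusting: values that no longer occur in the results disappear from the navigation.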
- 13. 3. Data Analysis and Reporting
From bug's-eye view…
- 14. 3. Data Analysis and Reporting
… to bird's-eye view!
End-User Benefits
• Visualize key Entities & Facts (pie/bar charts)
• Detect dependencies between Entities & Facts (matrix charts)
• Zoom in & out by drilling anywhere
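The data behind a matrix chart of entity dependencies is typically a co-occurrence count. The sketch below assumes a simple definition (two entities depend on each other when they appear in the same document); the sample entity sets and the function name `cooccurrence` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical per-document entity sets produced by an extraction step.
doc_entities = [
    {"Google", "Gilbarco"},
    {"Google", "HTC", "Motorola"},
    {"Google", "HTC"},
]


def cooccurrence(doc_entities):
    """Count how many documents mention each unordered pair of entities."""
    counts = Counter()
    for entities in doc_entities:
        # Sorting gives every pair one canonical (a, b) order.
        for a, b in combinations(sorted(entities), 2):
            counts[(a, b)] += 1
    return counts


matrix = cooccurrence(doc_entities)
for pair, count in sorted(matrix.items()):
    print(pair, count)
```

Each cell of the chart is then one entry of `matrix`; drilling into a cell amounts to listing the documents that contributed to its count.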
- 16. 4. Information Discovery
… to information network
Diagram labels: Discovery Tools, Search Panel, Entities, Proofs, Facts
End-User Benefits
• Search in knowledge, not in documents
• Get a graphical representation of knowledge
• Discover information by navigating within Facts
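Navigating within facts amounts to walking a graph whose nodes are entities and whose edges are facts. The following sketch assumes facts are available as (subject, fact type, object) triples; the triples reuse entity names from earlier slides, and `build_graph` and `shortest_path` are hypothetical helpers.

```python
from collections import defaultdict, deque

# Hypothetical fact triples: (subject entity, fact type, object entity).
triples = [
    ("Sackett", "Function", "Gilbarco"),
    ("Gilbarco", "Partnership", "Google"),
    ("Google", "Alliance", "T-Mobile"),
    ("Google", "Alliance", "Motorola"),
]


def build_graph(triples):
    """Undirected adjacency sets: a fact links both of its entities."""
    graph = defaultdict(set)
    for subject, _fact_type, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of facts between entities."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(triples)
print(shortest_path(graph, "Sackett", "Motorola"))
```

The returned chain shows how two entities that never appear in the same fact are still connected through intermediate facts.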
- 17. Semantic Enrichment at the Core
Diagram: Text Mining Content Enrichment at the core, surrounded by:
Capabilities: Automatic Categorization, Entity & Facts Extraction, Taxonomy Management, Related Topics Extraction, Similarity Detection, Smart Linking, Trends Analysis & Charting, Sentiment Analysis, Content Annotation, Metadata Extraction
Audiences: Content Editors, Editorial & Content Management, Web Content Management, Product Management, Visitors & customers
Original Content: Journal Scans, Expert Interviews, Event Reports
- 18. Benefits to Information Producers
Increase stickiness of website to maximize ad revenue or subscription utilization!
❖ Create more engaging, longer-lasting user visits
• Richer user experience with context-sensitive information
• More page views per visit
• Exposing the “long tail” through suggestions and linking
• Integrate more content at a fraction of the cost
❖ Establish your web properties as a community gateway
• “70% of all searches do NOT start on Google/MSN/Yahoo,” says Sue Feldman at IDC Research
• Smart search and navigation are critical to the user’s experience
- 19. Re-Packaging Content – Elsevier
❖ Objective
• Develop a revolutionary database indexing the last 28 years of chemistry patents
• Provide an exceptional user experience by using “smart content”
❖ Results
• ~20 million chemistry patent documents
• Searchable by chemical reactions, solvents, and reactants extracted directly from the documents
• Released by Elsevier-MDL in Nov. 2004
❖ Currently
• TEMIS distributes the Chemical Entities Relationships Annotator in partnership with Elsevier
- 20. Exposing the Long Tail – Springer
❖ Objective
• Map meaningful words and phrases in journal articles to encyclopedia entries
• Identify related documents in a pool of over three million journal articles
❖ Solution
• Index incoming journal articles to link each article with the related encyclopedia entry
• Create a semantic fingerprint for each journal article so the search engine can calculate the degree of relationship
• Integrate with Springer’s search engine
❖ Benefits
• Increased product sales through improved content linking
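The slide does not say how a semantic fingerprint is computed. As a stand-in, the sketch below uses a plain bag-of-words term vector and cosine similarity, a common baseline for scoring the degree of relationship between two texts; the function names and sample sentences are assumptions.

```python
import math
from collections import Counter


def fingerprint(text: str) -> Counter:
    """Assumed fingerprint: counts of lowercased words longer than 3 letters."""
    return Counter(word for word in text.lower().split() if len(word) > 3)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two term-count vectors, in the range 0.0 to 1.0."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


article = fingerprint("protein folding dynamics in yeast cells")
related = fingerprint("yeast protein folding under heat stress")
unrelated = fingerprint("medieval trade routes across europe")

print(round(cosine(article, related), 2))
print(cosine(article, unrelated))
```

A search engine can store one fingerprint per article and rank candidates by this score; a real system would likely weight terms (for example by rarity) and use extracted concepts rather than raw words.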
- 21. Answering Burning Questions – EFL
❖ Objectives
• Extract numerical data from case law to enhance information access for lawyers
❖ Solution
• Luxid® with custom annotators (address, activity, compensation, age, turnover…)
• Export numerical data as metadata to a search engine
❖ Benefits
• Productivity gains in extracting and validating metadata
• Ability to process a huge volume of case law
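A custom annotator for numerical data can be approximated with regular expressions. This sketch is not Luxid® code and does not use its API: the patterns, the `ANNOTATORS` table, the `annotate` function, and the sample sentence are all hypothetical and cover only two of the annotator types named on the slide.

```python
import re

# Hypothetical annotators: metadata field name -> pattern with one numeric group.
ANNOTATORS = {
    "age": re.compile(r"\b(\d{1,3})\s+years?\s+old\b", re.IGNORECASE),
    "compensation": re.compile(r"compensation of\s+(?:EUR|€)\s?([\d,]+)",
                               re.IGNORECASE),
}


def annotate(text: str) -> dict:
    """Extract the first match of each annotator as an integer metadata value."""
    metadata = {}
    for name, pattern in ANNOTATORS.items():
        match = pattern.search(text)
        if match:
            # Strip thousands separators before converting to a number.
            metadata[name] = int(match.group(1).replace(",", ""))
    return metadata


sample = "The claimant, 54 years old, was awarded compensation of EUR 12,500."
metadata = annotate(sample)
print(metadata)
```

The resulting dictionary is the kind of structure that can be exported as metadata fields to a search engine, so that lawyers can filter case law by numeric ranges instead of reading each decision.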