Taxonomy Fundamentals Workshop 2013
2. Taxonomy Fundamentals Workshop
10:15 a.m. - 12:00 p.m.
Marjorie M.K. Hlava, President & Chairman, Access Innovations, Inc., creator of
Data Harmony software. My blog is TaxoDiary.com
This interactive session starts by building a solid conceptual foundation for
taxonomy creation and reinforces those concepts through audience participation.
Starting with the basics, Hlava quickly advances to where and how to leverage
taxonomies. This gives beginning and intermediate practitioners a good overview
of the foundational knowledge for the more advanced sessions throughout the
conference. Leveraging the taxonomy standards for the key components of a
thesaurus, Hlava explores how those elements support the information needs of
users from multiple perspectives and examines illustrative sites and behind-the-scenes solutions to see how a well-constructed taxonomy with a rich interplay of
terms and synonyms leads to better information access. The workshop discusses
developing a taxonomy that serves users, respecting their needs for specialized
vocabularies. With hands-on activities, attendees gain insight into how a subject
area can be viewed, described, and structured. This learn-by-doing session
provides basic knowledge to create a taxonomy that suits your needs.
© Access Innovations, Inc. All Rights Reserved.
3. In our 1:45 hours together
Conceptual Framework - The Basics
Where and How to Leverage Taxonomies
Better Information Access (Search)
“Card Sort”
“Taxonomatch”
A Quick Look at Standards
“A Taxing Situation” – “Taxopoly”
4. Conceptual Framework –
The Basics
What is a taxonomy?
What are the parts of a taxonomy?
How do you build one?
Guidelines for the terms
Subject matter experts (SMEs)
5. What is a Taxonomy?
ANSI/NISO Z39.19-2005 (R2010):
“A collection of controlled vocabulary terms organized into a hierarchical structure.”
Controlled and hierarchical: Yes!
Missing: equivalence, associative relationships, and notes
6. The Semantic Roadmap:
Knowledge Organization Systems
Top of the scale: complex, high value (linked entities, contextual specificity)
Semantic network
Ontology
Thesaurus
Taxonomy
Controlled vocabulary
Synonym set/ring
Name authority file
Uncontrolled list
Bottom of the scale: simple, low value (unrelated entities, ambiguity)
Uncontrolled list: highest cost over time!
7. Basic features –
The term record
Main Term (MT) = subject term, heading, node, category, descriptor, class
Top Term (TT)
Broader Terms (BT)
Narrower Terms (NT)
Related Terms (RT)
Non-Preferred Term (NP)
See also (SA)
Used for (UF), See (S)
Synonyms
Scope Note (SN)
History (H)
Diagram labels: TAXONOMY, THESAURUS, ONTOLOGY
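The term record can be pictured as a small data structure. The sketch below is only an illustration in Python: the class name `TermRecord` and its field names are hypothetical, chosen to mirror the abbreviations on the slide, and do not represent the record layout of Data Harmony or any other product. The sample values assume the "Communications equipment" hierarchy shown later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One term record; field names are illustrative, not a product schema."""
    main_term: str                                       # MT
    top_term: str = ""                                   # TT
    broader_terms: list = field(default_factory=list)    # BT
    narrower_terms: list = field(default_factory=list)   # NT
    related_terms: list = field(default_factory=list)    # RT
    used_for: list = field(default_factory=list)         # UF (non-preferred terms)
    scope_note: str = ""                                 # SN
    history: str = ""                                    # H

    def is_top_term(self):
        # A term with no broader term sits at the top of its hierarchy.
        return not self.broader_terms


# Hypothetical sample record.
TELEPHONES = TermRecord(
    main_term="Telephones",
    top_term="Communications equipment",
    broader_terms=["Communications equipment"],
    narrower_terms=["Smartphones", "Radio phones", "Analog phones", "Speaker phones"],
)
```

A plain taxonomy would typically fill only the hierarchy fields; a thesaurus would also use the related-term, used-for, and scope-note fields.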
8. Taxonomy? Thesaurus?
Often used interchangeably
Thesaurus is a taxonomy with extras
Related Terms
Non-preferred Terms (USE/Used for)
Scope Notes
More
Taxonomies often have the actual
information object at the final node.
CMS and SharePoint tend to the
hierarchical view only, definition, and USE
10. How do you build a taxonomy?
• Define subject field
• Collect terms
• Organize terms
• Fill in gaps
• Flesh out and interrelate terms
• Apply to your data
You’re done!
11. Define subject field
Review representative collection of content
Determine:
Core areas
Sociology
Peripheral topics
Psychology
Education
Law
Scope can be modified later
12. Build, buy, augment?
Survey existing thesaurus/taxonomy resources for your domain
Test for
• Scope
• Depth
• Make-or-break terms
• Cost
Adoption of existing taxonomies
Term registries
TaxoBank
Taxonomy Warehouse
Other resources
Don’t reinvent the wheel!
13. Foundations
Start with what is known
Build from there
Use the literature, your data
Use the lists you already have internally
Build in continuous review throughout the
process, and beyond
Who is involved?
Taxonomists
Subject matter experts
Project management
Users
14. Collect terms
Your documents and databases
Departmental terminology
Textbooks and their indexes
Book tables of contents and indexes
Journal quarterly indexes
Encyclopedias
Lexicons, glossaries on the topic
Web resources
Users and experts
Search logs
15. Gather terms from search logs
Top ~100 search terms from search logs
Terms used more than 50 times
Match to web site with appropriate
answer
Basis for favorites or best bets, presented
at the top of results list
Behavior-based taxonomy
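The selection rule on this slide (roughly the top 100 queries, keeping those used more than 50 times) can be sketched in a few lines of Python. The function name `candidate_terms` and the assumption that the search log is already available as a plain list of query strings are illustrative; real logs usually need parsing first.

```python
from collections import Counter


def candidate_terms(queries, min_count=50, top_n=100):
    """Return (query, count) pairs for frequent searches, most frequent first.

    Queries are lowercased and whitespace-normalized so that trivial
    variants are counted together.
    """
    counts = Counter(" ".join(q.lower().split()) for q in queries if q.strip())
    # Keep the top_n most common queries, then drop any used min_count times or fewer.
    return [(term, n) for term, n in counts.most_common(top_n) if n > min_count]


# Hypothetical log extract.
SAMPLE_LOG = ["Taxonomy"] * 60 + ["thesaurus "] * 51 + ["ontology"] * 50
FREQUENT = candidate_terms(SAMPLE_LOG)
```

Each surviving query is a candidate term, or a candidate non-preferred term pointing to an existing term, and a candidate for a best-bet result.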
16. Extract the terms – N-grams
Mine the full text for terms
Decide term length
Up to four perhaps
Sort into a frequency list
Leave full strings and just n-grams
Auto match to other lists
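As a rough illustration of the n-gram step, the following Python sketch counts every word sequence of one to four words in a text and returns a frequency table. The tokenizer (a simple regular expression) and the function name `ngram_frequencies` are assumptions made for the example; a production extractor would also respect sentence boundaries and filter stop words.

```python
import re
from collections import Counter


def ngram_frequencies(text, max_len=4):
    """Count all n-grams of 1..max_len words in the text."""
    # Words may contain internal hyphens or apostrophes (e.g. "whole-part").
    tokens = re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for start in range(len(tokens) - n + 1):
            counts[" ".join(tokens[start:start + n])] += 1
    return counts


SAMPLE = "Controlled vocabulary terms and controlled vocabulary standards"
FREQUENCIES = ngram_frequencies(SAMPLE)
# FREQUENCIES.most_common() gives the frequency-sorted list for review.
```

The resulting list can then be matched against existing term lists, with the unmatched high-frequency strings reviewed as candidate terms.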
18. How do you choose terms?
Importance in the subject area
Use in the literature, by the organization
or community
Necessary degree of specificity or detail
Relationship with other controlled
vocabularies
Single concept = single term
19. One term / one concept
Terms represent simple or unitary concept
A unit of thought
May be a single-word term
May be a multi-word term if required to represent the concept
Three main categories
– Concrete entities
– Abstract concepts
– Proper nouns
“A unit of thought, formed by mentally combining some or all of the characteristics of a concrete or abstract, real or imaginary object. Concepts exist in the mind as abstract entities independent of terms used to express them.”
20. How big should it be?
Depends on use and your content
What do the users need?
If search logs show precise detailed requests
Support them
Retail sites – less deep, more “facets”
Scholarly publishers – deep and specific
21. The levels
7 – 22 top terms
Cognitive width supports this range
3 levels for e-commerce
Roll up if you have more levels
Index / tag to the most specific level
More for specific and precise data
Smaller taxos are tougher to maintain
22 x 22 = 484 x 22 = 10,648
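The arithmetic on the last line can be checked with a short sketch. It assumes a perfectly uniform hierarchy in which every term has the same number of narrower terms, which real taxonomies rarely do; the function name `terms_per_level` is made up for the example.

```python
def terms_per_level(width, levels):
    """Maximum number of terms at each level when every node has `width` children."""
    return [width ** level for level in range(1, levels + 1)]


PER_LEVEL = terms_per_level(22, 3)  # [22, 484, 10648]
```

With 22 top terms and 22 narrower terms under each node, the third level alone can hold 10,648 terms, which is why three levels are usually enough for browsing.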
23. Sample vocabulary sizes
Retail
Barnes & Noble
LL Bean
16 TT 30 level 2 = 480
Home Depot
10 TT 3 levels 10 / level = 1000 terms
Amazon
14 TT 30 level 2 = 420 terms
14 TT – 3 levels – 8 per level = 896 terms
For more information on product taxos http://gilbaneboston.com/12/presentations/T11_Hedden.pdf
24. Concrete entities as terms
• Things and their physical parts
– Birds
• Feathers
– Buildings
• Floors
• Materials
– Cement
– Wood
– Lead
Cards and chips
25. Abstract concepts as terms
• Actions and events
– evolution, skating, management, ceremonies
• Abstract entities
– law, theory
• Properties of things, materials, and actions
– strength, efficiency
• Disciplines and sciences
– physics, meteorology, mathematics
• Units of measurement
– pounds, kilograms, miles, meters, nanoseconds
26. Proper nouns as terms
Individual entities – “classes of one” –
expressed as proper nouns
San Francisco, Lake Michigan
Thesaurus standards exclude proper names, persons, and trade names; these go in authority files.
Taxonomies include them as final nodes.
27. Organize terms – roughly
Sort terms into several major categories –
logical groups of similar concepts as Top
Terms
Identify core areas and peripheral topics
10 – 20 to start
Consider moving proper names to authority files
Result: loose collection of terms under
several main headings
Rough and tentative – see how it fits as you go
Initial gap analysis
Add / modify / delete as needed
28. How do terms relate?
Hierarchical relationships
-- Parents and their children
Equivalence relationships
-- Aliases
Associative relationships
-- Related terms
-- Cousins
-- See Also’s
TAXONOMY
THESAURUS
29. Hierarchical relationships
Broader Term represents the class,
whole, or genus
Narrower Term is a member, part, or
species
Generic relationship
Whole-part relationship
Instance relationship
NTs inherit all the BT characteristics
BTs/NTs have a reciprocal relationship
30. Broader to Narrower Terms
Communications equipment
Telephones
Smartphones
Radio phones
Analog phones
Speaker phones
31. Hierarchy –
Whole-part relationship
Four general types
– Body systems and organs
• Ear > Middle ear
– Geographical locations
• Bernalillo County > Albuquerque
– Fields of study
• Geology > Physical geology
– Hierarchical social structures
• Ontario > Manitoulin District
32. Hierarchy –
Instance relationship
General category (common noun) as BT, with individual example (proper noun) as NTI (Narrower Term Instance)
Seas
Baltic Sea
Caspian Sea
Mediterranean Sea
French cathedrals
Chartres Cathedral
Rheims Cathedral
Rouen Cathedral
Essentially identical to “final node” in some taxonomies
33. Polyhierarchical relationship
• Term can logically fit under more than one Broader Term – can have Multiple Broader Terms (MBT)
• Part of ISO and ANSI/NISO standards
Nurses > Nurse administrators
Health administrators > Nurse administrators
Finance > Accounting
Careers > Accounting
34. Generic relationship test – 1
• Both terms in same fundamental category
• “All-and-some” test
Rodents / Squirrels: SOME rodents are squirrels, ALL squirrels are rodents
Pests / Squirrels: SOME pests are squirrels, NOT ALL squirrels are pests
Inheritance or inclusion – what’s true of the parent (BT) is true for all children (NTs)
35. Generic relationship test – 2
Rodents
Squirrels
Pests
ALL squirrels are rodents
x NOT ALL squirrels are pests
x NOT ALL pests are rodents
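The all-and-some test can be expressed with sets of instances. The sketch below is a toy illustration in Python: the member lists are invented, and in practice a taxonomist applies the test by judgment rather than by enumerating every member of a class.

```python
def all_and_some(narrower_members, broader_members):
    """Return (some, all) for a proposed NT/BT pair.

    some: at least one member of the broader class belongs to the narrower class
    all:  every member of the narrower class belongs to the broader class
    """
    narrower, broader = set(narrower_members), set(broader_members)
    some = bool(narrower & broader)
    every = bool(narrower) and narrower <= broader
    return some, every


# Invented instance sets, for illustration only.
SQUIRRELS = {"gray squirrel", "red squirrel"}
RODENTS = SQUIRRELS | {"house mouse", "beaver"}
PESTS = {"house mouse", "gray squirrel", "cockroach"}

SQUIRRELS_UNDER_RODENTS = all_and_some(SQUIRRELS, RODENTS)  # passes both parts
SQUIRRELS_UNDER_PESTS = all_and_some(SQUIRRELS, PESTS)      # fails the "all" part
```

Only a pair that passes both parts qualifies as a generic BT/NT relationship; a pair that passes "some" but not "all" is at most a related-term candidate.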
36. Equivalence relationship
• Preferred Term
– Thesaurus term and valid for indexing
– Thesaurus notation: USE
• Non-Preferred Term
– Not valid for indexing
– An alias
– Entry point, directs user to Preferred Term
– Thesaurus notation: UF or NPT
Spiders
UF Arachnids
Plant pathology
USE Phytopathology
37. Equivalence – when to use
Synonyms, slang, quasi-synonyms
Scientific and trade names
Ibuprofen
UF Motrin
Lexical variants
Fiber optics
UF Fibre optics
Mouse
UF Mice
Upward posting of narrow concepts not specified in taxonomy or thesaurus
Social class
UF Elite, Middle class, Working class
Get equivalent terms from search logs, brainstorming…
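In software, the equivalence relationship usually becomes a lookup from any entry term to its preferred term. The Python sketch below uses the USE/UF pairs from these slides; the names `USED_FOR`, `build_use_index`, and `resolve` are hypothetical, and case-insensitive matching is an assumption rather than something the standards require.

```python
# Preferred term -> its non-preferred (UF) terms, taken from the slide examples.
USED_FOR = {
    "Spiders": ["Arachnids"],
    "Phytopathology": ["Plant pathology"],
    "Ibuprofen": ["Motrin"],
    "Fiber optics": ["Fibre optics"],
    "Mouse": ["Mice"],
    "Social class": ["Elite", "Middle class", "Working class"],
}


def build_use_index(used_for):
    """Invert the UF lists into a USE lookup keyed by case-folded entry term."""
    index = {}
    for preferred, aliases in used_for.items():
        index[preferred.casefold()] = preferred  # a preferred term resolves to itself
        for alias in aliases:
            index[alias.casefold()] = preferred
    return index


USE_INDEX = build_use_index(USED_FOR)


def resolve(entry_term):
    """Return the preferred term for an entry term, or None if it is unknown."""
    return USE_INDEX.get(" ".join(entry_term.split()).casefold())
```

A search front end could call `resolve` on each query so that a user who types "Fibre optics" retrieves content indexed under "Fiber optics".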
38. Associative relationship
Related Terms (RTs) – cousins
“…terms related conceptually but not
hierarchically, and are not part of an equivalence
set” (i.e. not synonyms)
Both valid for indexing
Reciprocal relationship with each other
Expands user’s awareness, reflects
thesaurus coverage of unanticipated areas
Main basis for the ontology
14 main options offered in Z39.19
39. Scope Notes (SN)
Indicate meaning of the term in the context of
this thesaurus, for this audience
Stress – Metal, Psychological, Physiological
Could be the definition or glossary
Indicate any restriction in meaning
Indicate range of topics covered
Provide direction for indexers; for terms often
confused, may suggest an alternative term
Use as needed – may not be for every term
Use a style guide
Be concise
40. Stating the terms
• Term format
• Grammatical issues
• Singular and plural forms
• Spelling
• Abbreviations and acronyms
• Capitalization
• Other punctuation
• Consistency
41. Term format
KISS – Keep it short and simple
• 1-2-3 words
• Effect on search
• Pre- and post-coordination
Establish a policy
• E.g., follow Chicago Manual of Style
Grammatical issues
• Nouns and noun phrases
• Verbs > Gerunds
• Adjectives - no
• Adverbs - no
• Initial articles – no
42. Compound Terms
“Terms in a thesaurus should represent
simple or unitary concepts…” (ISO standard)
“Compound terms should be factored
(split) into simple elements…” (ANSI/NISO
standard)
Term phrases are okay (bigrams)
Adjective Noun
American history
Two concepts combined are not
Aromatherapy for bloating
43. Pre- and Post-Coordination
Pre-coordination – multiple concepts
Subject headings – Library of Congress
American history – Civil War
Back of the book
Put together in advance by the publisher
Post-coordination
Taxonomy terms
Single concept
Put together by the user / searcher
44. So far you’ve got
• Hierarchy
– Broader and Narrower Terms
– Polyhierarchies when needed
• Equivalence relationships
– Preferred/Non-Preferred Terms
• Associative relationships
– Related Terms
• Complete term records
– Scope Notes
– Correct term format
46. Does it work?
Test on your data
Index 500+ documents (more for variable writing style; fewer
for strict style)
No un-indexed articles allowed
Consider deleting unused terms
Review
Users
Expert reviewers
Consider automated / assisted indexing software
47. Subject Matter Experts
Work first from the literature
Establish literary warrant for terms
Have someone else do the clerical work
Differentiate the lexicography work
From the subject matter expert work
Let SMEs do the review and tailoring
Expert review ensures the proper term
use and application
49. Review, edit, test, edit,
use, edit, and maintain, i.e. edit
Monitor search logs
Allow indexers to suggest candidate terms
Edit and maintain
Add term
Change existing term
Change term status
Delete term
Add term relationship
Delete term relationship
Add/modify scope note
Change overall structure
50. Card Sort
Groups of three
Organize terms into the “proper”
hierarchical order
Use as many levels as needed
Use polyhierarchy as needed
Write your top terms on the flip chart
sheet to show the group
15 minutes
51. Where and how to leverage
taxonomies
Implementation and applications
Adding the terms to the information
objects
Search and other applications
Taxonomy use cases – implementation
Opportunities and obstacles
52. Parts of the puzzle
The taxonomy
Applications
Search, Web site, CMS, SharePoint, Publishing system,
Author submission, Peer reviewer ID, Recommendation
engines, etc.
Implementation / actions
The words to use
In the order you want the users to browse
Making the links
Adding terms to information objects
Mash-ups
Most people confuse the parts, but they act very
differently
53. Fully integrated with MOSS
The Workflow
[Workflow diagram. Steps: gather source data (client data, full text); tag and create metadata; put in database with tags; build search inverted index; search presentation layer (HTML, PDF, data feeds, etc.); create user interface. Components: Machine Aided Indexer (M.A.I.™), inline tagging, metadata and entity extractor, automatic summarization, client taxonomy in Thesaurus Master, database repository, search software. Search features: browse by subject, auto-completion, broader terms, narrower terms, related terms. Callout: increases accuracy.]
54. Adding terms to
information objects
Part of the record
A relational table pointing the terms to a
record ID number (Secondary key)
Adding data to the HTML
XML
MARC
META NAME KEYWORD Element
Many other options
55. Part of the record –
XML
Added as an element in the XML record
Need an element to put the data in
Element = Field = table value = <TaxonomyTerm>
<TaxonomyTerm>Roadworks</TaxonomyTerm>
Capture the terms when creating the
records
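Adding a term to an XML record can be sketched with Python's standard `xml.etree.ElementTree` module. The record structure, the element name `TaxonomyTerm` (XML element names cannot contain spaces), and the helper `add_taxonomy_terms` are assumptions made for this example; a real schema would dictate where the term element belongs.

```python
import xml.etree.ElementTree as ET


def add_taxonomy_terms(record_xml, terms, tag="TaxonomyTerm"):
    """Append one element per taxonomy term to the record's root element."""
    root = ET.fromstring(record_xml)
    for term in terms:
        element = ET.SubElement(root, tag)
        element.text = term
    return ET.tostring(root, encoding="unicode")


# Hypothetical record.
RECORD = "<Record><Title>Highway repair schedule</Title></Record>"
TAGGED = add_taxonomy_terms(RECORD, ["Roadworks"])
```

Using one element per term, rather than a single delimited string, keeps each term individually searchable.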
56. Part of the record MARC
Added as an element in the MARC record
Need an element to put the data in
654 Roadworks
345 Roadworks
Wherever you decide it works best for
your OPAC
Add the terms when creating the record
57. Editorial Workflow Integration
Author Submission Module
The author fills in the data to the document template,
attaching images and graphs as necessary.
An API calls Data Harmony and generates a list of indexing
terms based on the content.
58. Editorial Workflow Integration
Author Submission Module
Authors review the
indexing and may
change it.
Content is stored
into a data
repository as
HTML, XML, etc.
59. Editorial Workflow Integration
Contributor Role Tagging
A popup list of contributor role options appears for the author to choose from
Contributor Information
Contributor Role (mouse over for explanation)
Study conception
Methodology
Formal analysis
Computation
Investigation
Resources
Data Curation
Publication
Supervision
Project administration
Funding acquisition
Example explanation, Formal Analysis: Application of statistical, mathematical, or other formal techniques to analyze study data
60. In the HTML record
Makes it crawlable for the internet
Used in CMS applications
Add to the HTML
Content management systems
Manually
In Dreamweaver
In your CMS system (Drupal, WordPress, etc.)
Author Submissions Example
62. In Relational Database
Table
Primary key – for the record
Secondary key all the metadata
Used in Oracle, SQL, etc.
Like taxonomy terms
Like author
Like publication date
Need field to put the taxonomy data in
Supports “Faceted Search”
each item in a separate field or element or table
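A minimal sketch of this arrangement, using Python's built-in `sqlite3` module: one table holds the records, and a second table points taxonomy terms at record IDs. The table and column names are hypothetical, and the "secondary key" of the slide is modeled here as a foreign key reference.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a records table and a term link table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE records (
            record_id INTEGER PRIMARY KEY,
            title TEXT,
            author TEXT,
            pub_date TEXT
        );
        CREATE TABLE record_terms (
            record_id INTEGER REFERENCES records(record_id),
            term TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?, ?)",
        [
            (1, "Highway repair schedule", "A. Author", "2013-01-15"),
            (2, "Bridge inspection report", "B. Author", "2013-02-01"),
        ],
    )
    conn.executemany(
        "INSERT INTO record_terms VALUES (?, ?)",
        [(1, "Roadworks"), (2, "Roadworks"), (2, "Bridges")],
    )
    return conn


def records_for_term(conn, term):
    """Titles of all records tagged with the given taxonomy term."""
    rows = conn.execute(
        "SELECT r.title FROM records r "
        "JOIN record_terms t ON t.record_id = r.record_id "
        "WHERE t.term = ? ORDER BY r.record_id",
        (term,),
    )
    return [title for (title,) in rows]


def facet_counts(conn):
    """Number of records per term, the raw material for a faceted display."""
    rows = conn.execute(
        "SELECT term, COUNT(*) FROM record_terms GROUP BY term ORDER BY term"
    )
    return dict(rows.fetchall())
```

Because each term sits in its own row, a faceted interface can count and filter on terms independently of author or publication date.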
64. User uploads a document
to SharePoint space
Adding terms to SharePoint
Before uploading to SharePoint server, the EventHandler sends the document to Data Harmony.
Data Harmony (M.A.I., TaxoTerm Server) automatically attaches indexing terms before uploading to MOSS.
Returns subject metadata to Microsoft SharePoint Server 2010.
65. SharePoint 2010 shows only
10 lines of the taxonomy
This add-on makes it all viewable
66. Taxonomies added in
search example
[Architecture diagram, “Core Architectural Components”. Content sources: email and groupware, databases, files and documents, web content. Connectors: email connector, database connector, file traverser, web crawler, custom connector, content push. Core: content API, document processor pipeline, index DB, search server, query processor pipeline, query API, filter server, alerts, agent DB. Front ends (query and results): vertical applications, portals, custom front-ends, custom applications, mobile devices. Management: administrator’s dashboard, FAST management API. Data Harmony pieces: MAIstro, Data Harmony Governance API, Search Harmony. Callout: “Use taxonomy terms here”.]
72. Inline Tagging
Shows the exact point where the
concept is mentioned
Mouse-over to view the term record
Statistical summary, showing the
number of times each term is
mentioned in the article
73. Integrate taxonomy
to enhance findability
Browsable categories of a directory
Browsable faceted navigation
Smart search for term equivalents
Taxonomy terms (original or modified) as labels
Navigation aids incorporate taxonomy terms
and relationships
74. More Taxonomy
Enrichment
Spelling alternatives and correction
Related concepts
Statistical information about the metadata
Navigation or drill-downs
Search refinement
Recursive sets
Concept linking
Dictionary lookup (in taxonomy glossary)
75. Parts of Search
Search software
Inverted index
Search algorithms
Presentation layer
Search box
Auto-completion
Related and narrower terms
Hierarchical display
76. Database Plus Search Workflow
[Workflow diagram. Source data: SQL for ecommerce, raw full-text data feeds, printed source materials, data crawls on data sources. Steps: XIS creation; XIS repository; clean and enhance data; add metadata; load to search; display search data. Components: MAI Concept Extractor, MAI Rule Base, taxonomy terms, taxonomy in Thesaurus Master, Search Harmony.]
77. Why does search fail?
Most large organizations have 5 different
search software applications
All disappointing and on the shelf
Inconsistent results
Unclear path to results
Lack of single unified clear and consistent
vocabulary
Not tied to data governance
Taxonomy
Other metadata
78. Sample DOCUMENT
Creating an Inverted File Index
Outline of Presentation
1 Define key terminology
2 Thesaurus tools
Features
Functions
3 Costs
Thesaurus construction
Thesaurus tools
4 Why & when?
79. Simple inverted file index
The terms from the “outline”
&
1
2
3
4
construction
costs
define
features
functions
key
of
outline
presentation
terminology
thesaurus
tools
when
why
80. Complex inverted file index
Placement location
& - Stop
1 - Stop
2 - Stop
3 - Stop
4 - Stop
construction - L7, P2, SH
costs - L6, P1, H
define - L2, P1, H
features - L4, P1, SH
functions - L5, P1, SH
key - L2, P2, H
of - Stop
outline - L1, P1, T
presentation - L1, P3, T
terminology - L2, P3, H
thesaurus - (1) - L3, P1, H
(2) - L7, P1, SH
(3) - L8, P1, SH
tools - (1) - L3, P2, H
(2) - L8, P2, SH
when - L9, P3, H
why - L9, P1, H
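The placement entries above (line, position in line, and zone) can be produced by a small indexing routine. The Python sketch below rebuilds the sample outline as nine lines and omits the outline numbers 1 to 4, since the slide treats them as stop words and does not count them in word positions; that reading of the slide, the zone codes (T = title, H = heading, SH = subheading), and the function name `build_inverted_index` are assumptions for the example.

```python
import re
from collections import defaultdict

STOP_WORDS = {"&", "of"}

# (text, zone) for each line of the sample outline.
DOCUMENT = [
    ("Outline of Presentation", "T"),
    ("Define key terminology", "H"),
    ("Thesaurus tools", "H"),
    ("Features", "SH"),
    ("Functions", "SH"),
    ("Costs", "H"),
    ("Thesaurus construction", "SH"),
    ("Thesaurus tools", "SH"),
    ("Why & when?", "H"),
]


def build_inverted_index(lines):
    """Map each non-stop word to a list of (line, position, zone) placements."""
    index = defaultdict(list)
    for line_no, (text, zone) in enumerate(lines, start=1):
        tokens = re.findall(r"[a-z]+|&", text.lower())
        for position, token in enumerate(tokens, start=1):
            if token in STOP_WORDS:
                continue  # stop words occupy a position but are not indexed
            index[token].append((line_no, position, zone))
    return dict(index)


INDEX = build_inverted_index(DOCUMENT)
```

Keeping the position lets search software answer phrase and proximity queries, and keeping the zone lets it weight a title match above a subheading match.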
81. Access Innovations –
Complex Farm
with Perfect Search
[Architecture diagram. Components: query, query servers, Search Harmony presentation layer, federators, cleanup etc., deploy hub, Repository XIS (cache), cache builders, index builders, source data.]
82. Measuring accuracy in search
Relevance
Recall
Precision
Hits, misses, noise
Ranking
Linguistics
Query processing
Results processing
Display
Search refinement
Usability
Business rules
83. Relevance
How well a set of returned documents answers
the information need
“Accuracy”
Related to objective of search
Different user communities
Information resources
Tension of user needs and context available
A confidence “guesstimate”
84. The formulas
Recall = (Number of relevant items retrieved) / (Number of relevant items in the collection)
Precision = (Number of relevant items retrieved) / (Number of items retrieved)
Relevance = Germane (Precision)
Pertinent (Recall)
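The two formulas can be computed directly from sets of document identifiers. In the Python sketch below, the function name `search_accuracy` is hypothetical, and the reading of "hits, misses, noise" as relevant-and-retrieved, relevant-but-not-retrieved, and retrieved-but-not-relevant is an assumption about the earlier slide.

```python
def search_accuracy(retrieved, relevant):
    """Compute precision, recall, and the hit/miss/noise sets for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant      # relevant items that were retrieved
    misses = relevant - retrieved    # relevant items the search failed to return
    noise = retrieved - relevant     # returned items that are not relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return {
        "hits": hits,
        "misses": misses,
        "noise": noise,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical query: 4 documents returned, 5 relevant documents in the collection.
RESULT = search_accuracy(retrieved={1, 2, 3, 4}, relevant={2, 4, 6, 8, 10})
```

In this example two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).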
89. Link to Society Resources
Cancer Epidemiology Biomarkers & Prevention
Vol. 12, 161-164,
February 2003
© 2003 American Association for Cancer Research
Short Communications
Alcohol, Folate, Methionine, and Risk of Incident Breast
Cancer in the American Cancer Society Cancer Prevention
Study II Nutrition Cohort
Heather Spencer Feigelson, Carolyn R. Jonas, Andreas S.
Robertson, Marjorie L. McCullough, Michael J. Thun and
Eugenia E. Calle
Department of Epidemiology and Surveillance
Research, American Cancer Society, National Home Office,
Atlanta, Georgia 30329-4251
Recent studies suggest that the increased risk of breast cancer
associated with alcohol consumption may be reduced by
adequate folate intake. We examined this question among
66,561 postmenopausal women in the American Cancer Society
Cancer Prevention Study II Nutrition Cohort.
Related Working Groups
•Finance
•Charter
•Molecular Epidemiology
Related Think Tank Report Content
Related Webcasts
Related Awards
•AACR-GlaxoSmithKline Clinical Cancer Research
Scholar Awards
•ACS Award
•Weinstein Distinguished Lecture
Related Press Releases
•How What and How Much We Eat (And Drink) Affects Our
Risk of Cancer
•Novel COX-2 Combination Treatment May Reduce Colon
Cancer Risk Combination Regimen of COX-2 Inhibitor and
Fish Oil Causes Cell Death
•COX-2 Levels Are Elevated in Smokers
Related AACR Workshops and Conferences
•Frontiers in Cancer Prevention Research
•Continuing Medical Education (CME)
•Molecular Targets and Cancer Therapeutics
Related Meeting Abstracts
•Association between dietary folate intake, alcohol intake, and
methylenetetrahydrofolate reductase C677T and A1298C
polymorphisms and subsequent breast
•Folate, folate cofactor, and alcohol intakes and risk for
colorectal adenoma
•Dietary folate intake and risk of prostate cancer in a large
prospective cohort study
Related Education Book Content
Oral Contraceptives, Postmenopausal Hormones,
and Breast Cancer
Physical Activity and Cancer
Hormonal Interventions: From Adjuvant Therapy to
Breast Cancer Prevention
After Helen Atkins
90. Linked Data
Other Journal
Articles on
Topic A
CME Activity
on Topic A
Journal
Article on
Topic A
Grant Available for
Researchers
Working on Topic A
Upcoming
Conference
on Topic A
Job Posting
for Expert
on Topic A
Podcast Interview
with Researcher
Working on Topic A
Author Networks
Social Networking
After Helen Atkins
91. Authors at a Place
92. Member Profile Tagging
User pastes or uploads CV
Button to auto-extract taxonomy attributes
93.
Designed to enhance understanding and retention of the
vocabulary concepts necessary for creating a taxonomy,
ontology, thesaurus, or controlled vocabulary.
Game supplies:
1 Deck of Orange Question and Challenge Cards
1 Deck of Green Answer Cards
Game setup:
Shuffle the deck of Green Answer cards.
Deal the entire deck to the players.
Shuffle the deck of Orange Question and Challenge cards.
Place them face down in a pile in the middle of the table so that all
players can reach the pile.
Reinforce what you just heard!
Have fun!
94.
1. Play moves to the left of the dealer.
2. Draw a card from the top of the Orange cards. Read it aloud to all of the players.
3. The player who read the card says out loud what they think the answer is.
4. Each player looks at the Green Answer cards in their hand.
   1. If they have the correct answer to the Question or Challenge, they show their card to everyone at the table.
   2. If everyone agrees that the answer is correct, the player holding the correct answer card gives it to the player who read the Question or Challenge card.
5. The player places their associated pair of cards – one Orange Question and Challenge card and one Green Answer card – face up on the table in front of them.
6. Play passes to the person who held the correct Green Answer card in their hand. Play continues as in step 2 above.
7. Discussion among the players to arrive at the correct answer is permissible and encouraged!
8. If players do not arrive at a consensus regarding the correct answer, the Orange Question and Challenge card may be returned to the bottom of the pile, and play passes to the person to the left of the player who drew the previous card.
9. When all of the Orange Question and Challenge cards have been drawn, read aloud, and matched with their Green Answer cards, the game ends.
10. If there are any Orange Question and Challenge cards remaining to which players cannot agree on an answer, players may consult their notes or ask the session speaker.
95. Using taxonomies in
applications
• Improve search
• Subject browsing
• Mobile intelligence
• Targeted resources based on subject or user role
• Link to society resources
• Author submission module
• Author authority database
• Expert reviewer identification
• Member profiles
• Data visualization
• More like this
• In “indexing” or categorizing, as subject metadata
• In content management systems
• In SharePoint
• In mash-ups
• In social networking sites
• In author tagging
• In filtering data – e.g., spam filters and RSS feeds
• In web crawlers
• Social media - community
96. More Innovations!
Link topic to article to author to event
Make visual links within domain
Enable authors to submit and categorize conference
submissions
Create author authority database linking to co-authors,
topics, locations, etc.
Create expert reviewer database
Create member profiles with alternate names,
publications, tagged by topic
Visualize data and domain distribution
Display interest connections in social network
Deliver accurate targeted information through mobile
applications
97. Visualize your tagged data
This is a radial
graph of
“plosthes”. The
number of
records for which
each index term
occurs is
reflected by
circle sizes.
102. Load to a visualization program
such as Prefuse
104. Taxonomy standards
Z39.19 (2005; reaffirmed 2010) Controlled
Vocabularies
BS 8723 Parts 1 – 5
ISO 25964 Parts 1 and 2
TAG 37 and 46 standards
SKOS - Simple Knowledge Organization
System
OWL - Web Ontology Language
AND more!
105. Taxonomies don’t exist in a
vacuum
They are part of metadata
They are used to tag information objects
They are used
On Web sites
In search
To profile people
To link resources
So we have to know a little about those
standards as well
106. More on ISO 25964
Part 2 Interoperability
and RDA
at 2:15 PM
107. W3C
HTML 5
Linked Data
Ontologies (OWL) and SKOS
Simple Knowledge Organization System
Cascading Style Sheets (CSS)
Adding style to Web content
Widgets
Widget Packaging and XML Configuration, Widget Interface
API to metadata and persistently storing data
XML Digital Signatures for Widgets
108. Big Library Followings
DCMI – Dublin Core Metadata Initiative
Functional requirements
Library of Congress
110. Library of Congress
MARC 21 formats and MARCXML
VRA Core -- them
METS (Metadata Encoding & Transmission
Standard)
MIX (NISO Metadata for Images in XML)
PREMIS (Preservation Metadata)
TextMD (Technical Metadata for Text)
ALTO - Technical Metadata for Optical
Character Recognition
Extended Date/Time Format (EDTF)
111. Thesaurus related
NISO Z39.19 2010 www.niso.org
ISO 2788 - Monolingual (1986) (withdrawn)
ISO 5964 - Multilingual (1985) (withdrawn)
ISO 5127, Information and documentation
Vocabulary
BS 8723 (withdrawn) (basis for revised ISO)
ISO 25964 / Part 1 – Controlled Vocabularies
ISO 25964 / Part 2 – Taxonomy Interoperability
Dublin Core DCMI Functional requirements
SKOS – the W3C thesaurus standard
OWL from W3C
112. Thesaurus and Indexing Standards – ANSI/NISO
NISO Z39.19-2005 (R2010) Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
NISO TR02-1997 Guidelines for Indexes and Related Information Retrieval Devices, by James D. Anderson
113. New ISO Taxonomy Standard
ISO 25964, Thesauri and interoperability with other vocabularies
Part 1: Thesauri for information retrieval
Part 2: Interoperability with other vocabularies
Stella Dextre Clarke, principal author
114. W3C
OWL – Web Ontology Language
RDF – Resource Description Framework
Topic Maps
SKOS - Simple Knowledge Organization System
SKOS 2 DCMI
TURTLE
Which community to serve?
117. Other Relevant ISO & W3C Standards
Metadata standards overview
http://www.slis.kent.edu/~mzeng/metadatabasics/completelist.htm
Review of SKOS / DCMI / Taxonomy Standards
http://nkos.slis.kent.edu/
118. SKOS
SKOS 1 – no synonyms, no polyhierarchies
SKOS 2 – added the above
Allows other fields (elements) on request
OWL Crosswalk
NISO Z39.19, BS 8723, and ISO 25964
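To make the synonym and polyhierarchy points concrete, here is a rough sketch of one concept written out as SKOS in Turtle syntax. `skos:prefLabel`, `skos:altLabel`, and `skos:broader` are real SKOS properties; the `ex:` namespace, the concept identifiers, and the `concept_to_turtle` helper are assumptions made up for this example, and the helper does not escape quotation marks inside labels.

```python
# SKOS namespace is the published W3C one; the ex: namespace is a placeholder.
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/taxonomy/> .\n"
)


def concept_to_turtle(concept_id, pref_label, alt_labels=(), broader=()):
    """Serialize one concept as a Turtle snippet (labels are not escaped)."""
    pairs = [("a", "skos:Concept"), ("skos:prefLabel", f'"{pref_label}"@en')]
    # Synonyms become alternative labels.
    pairs += [("skos:altLabel", f'"{alt}"@en') for alt in alt_labels]
    # More than one broader concept gives a polyhierarchy.
    pairs += [("skos:broader", f"ex:{parent}") for parent in broader]
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in pairs)
    return f"ex:{concept_id}\n{body} ."


snippet = concept_to_turtle(
    "bicycles",
    "Bicycles",
    alt_labels=["bikes", "cycles"],
    broader=["vehicles", "sportsEquipment"],
)
print(PREFIXES + "\n" + snippet)
```

The two `skos:broader` statements place the same concept under two parents, which is what a polyhierarchy looks like in SKOS.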
119. Who supports SKOS? Everyone
Data Harmony Thesaurus Master
Synaptica
SmartLogic
WordMap
PoolParty
TopQuadrant
Protégé
Etc.
120. Standards and pragmatism
Use Standards
Lead to a richer, more informative product
Promote interoperability -- Allow you to adopt or adapt other controlled vocabularies
Promote predictability
Allow repurposing within your organization and by other organizations
Follow thesaurus standards for taxonomy
Incorporate authority files / final nodes as needed
Your taxonomy or thesaurus must meet your needs
121. The Problem – KEEPING UP
Many players we know and don’t know
Controlled vocabulary standards
Groups developing guidelines and standards
W3C with SKOS and OWL
Governments worldwide developing and mandating taxonomies
Communities
Increase reuse
Mapping interoperability between controlled vocabularies
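One simple way to start a mapping between two controlled vocabularies is to propose candidate matches wherever the vocabularies share a label. The sketch below assumes each vocabulary is a dictionary from concept ID to its labels (preferred term plus synonyms); the IDs, the labels, and the `propose_exact_matches` helper are hypothetical, and a human editor would still need to review every proposed pair.

```python
def normalize(label):
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def propose_exact_matches(vocab_a, vocab_b):
    """Return sorted (a_id, b_id) pairs that share at least one normalized label."""
    # Index vocabulary B by normalized label for quick lookup.
    index_b = {}
    for b_id, labels in vocab_b.items():
        for label in labels:
            index_b.setdefault(normalize(label), set()).add(b_id)

    matches = set()
    for a_id, labels in vocab_a.items():
        for label in labels:
            for b_id in index_b.get(normalize(label), ()):
                matches.add((a_id, b_id))
    return sorted(matches)


# Illustrative data only.
vocab_a = {"a1": ["Automobiles", "Cars"], "a2": ["Bicycles"]}
vocab_b = {"b7": ["cars"], "b9": ["Trains"]}

print(propose_exact_matches(vocab_a, vocab_b))
```

Reviewed pairs could then be recorded with a SKOS mapping property such as `skos:exactMatch` or `skos:closeMatch`.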
122. Places to watch
Other W3C and ISO areas
Support groups
Blogs
Communities of Practice
WSDL – Web Services Description Language
DCMI
NKOS
ISKO
Linked Data
123. The New Board Game
Applications
Implementation
The taxonomy
124. Where do I learn more?
Online resources
Taxonomy books
Those standards
Organizations
SLA Taxonomy Division
ISKO
NKOS
131. Lists of Taxonomy Resources
Registry? NKOS KOS of KOS
SKOS participants – W3C
KOS typology – Tudhope
TaxoBank.org
Kent.edu site – Marcia Zeng
Taxonomy Warehouse – Synaptica
UMLS - Unified Medical Language System - NIH
133. IT is often Fire, Ready, Aim!
Choose the hardware
Choose the software
Decide on the format
Convert the data
Fix the data
Tack on a taxonomy
Ignore the standards
134. Change to Ready, Aim, Fire!
Follow the data
Look at the data, format, and content
Design taxonomy for data
Leverage the standards
Use taxonomy to tag data
Choose search and repository software for data
Load the data into the system
Keep your eye on the target
135. For copies of “The Games”
136. Summary
We covered the basics
We talked about the implementation
Application of the terms to your content
Search
Standards
We reinforced the learning with activities
You drank from the fire hose
Now go hear the case studies of the next two days!
137. Marjorie M.K. Hlava
Thank you for your attention!
President
Access Innovations, Inc.
Data Harmony
mhlava@accessinn.com
505-998-0800
www.taxodiary.com - the taxonomy news blog
mmkhlava = Twitter
mhlava = Facebook, LinkedIn, eAcademy, Plaxo
Editor's note: Thanks to Helen Atkins of AACR for this illustration. The real power of this is that the links can go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web”.