Albuquerque, NM 87110
               www.accessinn.com
           www.dataharmony.com
                    505-998-0800
             Marjorie M.K. Hlava
     President and Chief Scientist
         Access Innovations, Inc.




Taxonomy 101
Overview of the
            Presentation

   Why build a taxonomy?
   What is a taxonomy?
   What are the standards?
   Where are taxonomies used?
   What are the parts of a taxonomy?
   How do you build one?
   How do you implement one?
   Review
Why build a taxonomy?
   Leverage your data
   Search, precision, and recall
   Websites
   Discoverability and findability
   Data mashups
   Data trends and visualization
   Repackaging and repurposing data
   Author and entity disambiguation
Heart of the Big Data
production process: Taxonomy
   From the production side to the website display,
    carry the taxonomy descriptors for use in
    precision search
What is a Taxonomy?
               ANSI/NISO Z39.19-2005

“A collection of controlled vocabulary terms
   organized into a hierarchical structure.”

Yes: controlled, hierarchical
Missing:
equivalence, homographic, and associative relationships
and notes
                     Copyright © 2009 - Access Innovations, Inc.
A taxonomy is a
             knowledge organization system

   Uncontrolled list                                            Not complex

   Name authority file
   Synonym set/ring
   Controlled vocabulary
   Taxonomy
   Thesaurus
   Ontology
   Semantic network                                            Highly complex


A Thesaurus is a
              Knowledge Organization System (KOS)
   Controlled vocabulary
   Focus on conceptual classes, not specifics
   Hierarchy – implicit if not displayed
       Parent-child relationships
   Various display formats may be available
   Network of relationships between terms guides
    user to find information
       Cousins, friends, aliases
   Scope notes, term history
   More elaborate and informative
   Long established standards
Thesaurus defined –
              ANSI/NISO Z39.19-2005
“A controlled vocabulary arranged in a known order
    and structured so that the various
    [equivalence, homographic, hierarchical, and
    associative] relationships among terms are displayed
    clearly and identified by standardized relationship
    indicators. Relationship indicators should be employed
    reciprocally.
“Its purpose is to promote consistency in the indexing of
    content objects, especially for postcoordinated
    information storage and retrieval systems, and to
    facilitate browsing and searching by linking entry
    terms with terms. Thesauri may also facilitate the
    retrieval of content objects in free text searching.”
Structure of
                  controlled vocabularies

Lists   Synonyms      Taxonomy                  Thesaurus              Ontology

                  INCREASING COMPLEXITY and CONTROL




Ambiguity Ambiguity                                Ambiguity
          Synonym                                  Synonym             Synonym
                      Hierarchy                    Hierarchy           Hierarchy
                                                   Relationships       Additional kinds of
                                                                       relationships


Taxonomy? Thesaurus?

   Often used interchangeably
   Thesaurus is a taxonomy with extras
       Related Terms
       Non-preferred Terms (USE/Used for)
       Scope Notes
       more
   Use the word your audience understands
       Avoid confusion with Roget’s Thesaurus

Where can I get
                 taxonomy standards?

   www.niso.org
       Z39.19 (2010) Controlled Vocabularies
   www.iso.org
       ISO 25964 parts 1 and 2
   www.bsigroup.com
   www.w3.org
Web Ontology Language
                   OWL


   W3C Recommendation 10 February 2004
   http://www.w3.org/TR/2004/REC-owl-guide-20040210/

   http://www.w3.org/TR/2004/REC-owl-ref-20040210/

   http://www.w3.org/TR/2004/REC-webont-req-20040210/

   Continuing updates


[Screenshots: Taxonomy view; Thesaurus Term Record view]
Where are taxonomies used?

•   In “indexing” or categorizing, as subject metadata
•   In search
•   In content management systems
•   In SharePoint
•   In mashups
•   In social networking sites
•   In author tagging
•   In filtering data – e.g., spam filters and RSS feeds
•   In web crawlers
Why the excitement?

   Makes information findable!
        Cut search time by 50% - The Weather Channel
   Organizes web sites
   Provides better online help
        Customer support 30x more costly than web self-service
        (Forrester Research "Tier Zero Customer Support" 1999)
Taxonomies in business
“The High Cost of Not Finding Information”
 Time wasted searching
 Confusion about same information by different name
 Similar/overlapping activities, products, uses
                  (Susan Feldman, KMWorld, March 2004)

With a unified taxonomy and consistent indexing:
 Better searching or browsing to locate information
 More efficient content management
 Focused content collection through web spidering
 Personalized content delivery

Good Search
            must have…metadata

   Searchable Index: Inverted File Index
   Hierarchical Display: Taxonomy, Thesaurus
Metadata

   The fields
   The elements
       Class codes
       Title
       Author
       Plaintiff
       Product
       subject / topic
   Meta Name Keywords in HTML

Basic taxonomy
                / thesaurus features

•   Hierarchy structure
    –   Broader Terms = more general concepts
    –   Narrower Terms = more specific concepts
•   Related Terms = conceptual cousins
•   Term equivalents = synonyms
•   Classification options
•   Scope notes
•   Other elements as needed
What are the parts? The term
                  record

   Term = subject term, heading, node,
    category, descriptor, class

   Main Term (MT)
   Top Term (TT)                               TAXONOMY
   Broader Terms (BT)
   Narrower Terms (NT)                                   ONTOLOGY
   Related Terms (RT)                                THESAURUS
       See also (SA)
   Non-Preferred Term (NP)
       Used for (UF), See (S)
   Scope Note (SN)
   History (H)
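
The fields of a term record can be pictured as a small data structure. The following is a minimal sketch, not a vendor format: the class name `TermRecord` and its field names are hypothetical, chosen to mirror the abbreviations on the slide, and the sample values reuse the Politics example from this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TermRecord:
    """Hypothetical term record; comments give the slide's abbreviations."""
    main_term: str                                      # MT
    top_term: str = ""                                  # TT
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT / See also
    used_for: List[str] = field(default_factory=list)   # UF (non-preferred terms)
    scope_note: str = ""                                # SN
    history: str = ""                                   # H


record = TermRecord(
    main_term="Elections",
    top_term="Politics",
    broader=["Politics"],
    narrower=[
        "Presidential elections",
        "Gubernatorial elections",
        "Mayoral elections",
    ],
)
```

A plain taxonomy would typically fill only the MT, TT, BT, and NT fields; a thesaurus record also uses the remaining ones.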

How do you build one?

   From scratch?
   Adoption of existing
       Term registries
       Taxonomy Warehouse
       Other resources
   Combination
Define subject field
    Review representative collection of content
    Determine:
        Core areas
        Peripheral topics
    Scope can be modified later
    [Diagram of subject fields: Education, Psychology, Sociology, Law]
Before you go on: Build or
                  buy?

•   Survey existing thesaurus/taxonomy
    resources for your domain
•   Test for
    –   Scope
    –   Depth
        •   Make-or-break terms
    –   Cost

                Don’t reinvent the wheel!
Build a taxonomy – simple
                steps

•   Get paper and pencil
    –   Sharpen pencil
•   Define subject field
•   Collect terms
•   Organize terms
•   Fill in gaps
•   Flesh out and interrelate terms
                 You’re done!
Your taxonomy /
                   thesaurus end product
•   Reflects
    –   scope of your concern
    –   degree of precision you need
•   Facilitates
    –   data storage and retrieval by vocabulary control
    –   discovery of ideas
•   Promotes learning
    –   preferred terminology
    –   relationships among concepts
    –   organized guide to your field
How do you choose terms?

   Importance in the subject area
   Use in the literature, by the organization
    or community
   Necessary degree of specificity or detail
   Relationship with other controlled
    vocabularies


Vocabulary control – why?
“Eliminating ambiguity and compensating for
   synonymy through vocabulary control
   assures that each term has only one
   meaning and that only one term can be
   used to represent a given concept or
   entity. … Ambiguity occurs in natural
   language when a word or phrase (a
   homograph or polyseme) has more than
   one meaning. ” (ANSI/NISO Z39.19-2005)
One term / one concept
   “Terms in a thesaurus should represent
    simple or unitary concepts…”
    (ISO standard)
   “Each term included in a controlled
    vocabulary should represent a single
    concept (or unit of thought). A single concept
    is frequently expressed by a single-word
    term but in many cases a multiword term is
    required to represent the concept.”
    (ANSI/NISO Z39.19-2005)
Vocabulary control – how?

   Use unambiguous terms, clear to the user
    group
   Distinguish between terms that appear
    similar
   Use Scope Notes when necessary
   Use terms as elements that can be
    coordinated in a flexible manner
   Create compound terms if necessary
A “term” synonym ring
   Term = Descriptor = Node = Category = Subject heading
So what’s a concept?

•   “A unit of thought, formed by mentally
    combining some or all of the
    characteristics of a concrete or
    abstract, real or imaginary object.
    Concepts exist in the mind as abstract
    entities independent of terms used to
    express them.”
•   Three main categories
    –   Abstract concepts
    –   Concrete entities
    –   Proper nouns
Concrete entities as terms

•   Things and their physical parts
    –   primates
        •   head
    –   buildings
        •   floors
•   Materials
    –   cement
    –   wood
    –   lead
Abstract concepts as terms

•   Actions and events
    –   evolution, skating, management, ceremonies
•   Abstract entities
    –   law, theory
•   Properties of things, materials, and
    actions
    –   strength, efficiency
•   Disciplines and sciences
    –   physics, meteorology, mathematics
•   Units of measurement
    –   pounds, kilograms, miles, meters, nanoseconds
Proper nouns as terms

   Individual entities – “classes of one” –
    expressed as proper nouns
       San Francisco, Lake Michigan

         Thesaurus standards exclude proper names,
          persons, and trade names → authority files.
         Taxonomies include them as final nodes.


Collect terms

   Your documents and databases
   Departmental terminology
   Text books and their indexes
   Book tables of contents and indexes
   Journal quarterly indexes
   Encyclopedias
   Lexicons, glossaries on the topic
   Web resources
   Users and experts
   Search logs
Gather terms from search
                logs
“Beyond the Spider: The Accidental
  Thesaurus”
          (Richard Wiggins in Information Today, Oct 2002)

Top ~100 search terms from search logs
Match to web site with appropriate answer
Basis for favorites or best bets, presented at the top of
   results list.
(AKA behavior-based taxonomy)

Not a thesaurus or taxonomy,
      but still a useful source of terms.
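
Pulling the top queries out of a search log can be sketched in a few lines. This is an illustration only: it assumes the log has already been reduced to a list of raw query strings, the function name `top_search_terms` is hypothetical, and the sample queries are invented.

```python
from collections import Counter


def top_search_terms(queries, n=100):
    """Return the n most frequent queries, ignoring case and extra spaces."""
    normalized = (" ".join(q.lower().split()) for q in queries)
    counts = Counter(q for q in normalized if q)
    return counts.most_common(n)


# Invented sample data standing in for a real search log.
log = ["Campus map", "campus  map", "tuition", "Tuition", "tuition", "parking"]
top = top_search_terms(log, n=2)
```

The resulting list is a set of candidate terms and best-bet targets to review by hand, not a finished vocabulary.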
Organize terms – roughly
   Sort terms into several major categories –
    logical groups of similar concepts as Top
    Terms
       Identify core areas and peripheral topics
       10 – 20 to start
       Consider moving proper names to authority files
   Result: loose collection of terms under
    several main headings
       Rough and tentative – see how it fits as you go
       Initial gap analysis
       Add / modify / delete as needed
Usefulness of a term –
                           the “duh” factor

•   Some terms are so basic for a domain that
    they have little or no value
    –   “Sports” in Sports Illustrated
    –   “Technology” in Technology Review
    –   “Golf” in Golf Magazine
    –   “Information science” and “Information technology”
•   How useful will the term be for indexing?
    –   Does the term apply to everything in the domain?
    –   Does the term distinguish important concepts?
    –   If term is needed, specify limited use conditions in
        Scope Note
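
One way to spot a "duh" term is to measure how much of the collection it would be applied to. The sketch below assumes each document is represented by the set of terms assigned to it; the function name, the 0.9 cutoff, and the sample documents are all illustrative assumptions.

```python
def coverage(term, indexed_docs):
    """Fraction of documents whose assigned terms include `term`."""
    if not indexed_docs:
        return 0.0
    hits = sum(1 for terms in indexed_docs if term in terms)
    return hits / len(indexed_docs)


# Invented sample: index terms assigned to four golf-magazine articles.
docs = [
    {"Golf", "Putting"},
    {"Golf", "Golf courses"},
    {"Golf", "Tournaments"},
    {"Golf", "Putting", "Tournaments"},
]

THRESHOLD = 0.9  # assumed cutoff; tune for the collection
all_terms = set().union(*docs)
too_broad = sorted(t for t in all_terms if coverage(t, docs) >= THRESHOLD)
```

A term flagged this way distinguishes nothing within the domain, so it is a candidate for removal or for a restrictive Scope Note.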

Hierarchy structures –
                       variations on a theme

•   Not pre-determined
    –   Subcategorize wines first by type, variety, region,
        then cost? Or first by cost and then type?
•   Varies by user group and needs
    –   May have multiple views of same content
    –   Standard alpha view or customized notation
•   Affects information architecture, i.e., how
    web site functions

How do terms relate?

   Hierarchical relationships
      -- Parents and their children                             TAXONOMY

   Equivalence relationships
                                                                THESAURUS
      -- Aliases
   Associative relationships
      -- Cousins


Hierarchical relationships

   Broader Term represents the class,
    whole, or genus
   Narrower Term is a member, part, or
    species
       Generic relationship
       Whole-part relationship
       Instance relationship
   BTs/NTs have a reciprocal relationship
   Hyponym - Hypernym
Broader to Narrower Terms
Politics
    Elections
        Presidential elections
        Gubernatorial elections
        Mayoral elections
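
The reciprocal nature of BT/NT links can be enforced in code by always recording both directions together. This is a minimal sketch using the Politics example; the `Hierarchy` class and its method names are hypothetical, not taken from any thesaurus software.

```python
from collections import defaultdict


class Hierarchy:
    """Hypothetical store that keeps BT and NT links reciprocal."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its Broader Terms
        self.narrower = defaultdict(set)  # term -> its Narrower Terms

    def add(self, broader_term, narrower_term):
        # Recording one direction always records the inverse as well.
        self.narrower[broader_term].add(narrower_term)
        self.broader[narrower_term].add(broader_term)

    def ancestors(self, term):
        """All broader terms reachable from `term`, nearest first."""
        found, queue = [], [term]
        while queue:
            current = queue.pop(0)
            for bt in sorted(self.broader[current]):
                if bt not in found:
                    found.append(bt)
                    queue.append(bt)
        return found


h = Hierarchy()
h.add("Politics", "Elections")
for nt in ("Presidential elections", "Gubernatorial elections", "Mayoral elections"):
    h.add("Elections", nt)
```

With the links stored this way, `h.ancestors("Mayoral elections")` walks up through Elections to Politics.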



Hierarchy – Generic
                 (genus-species) relationship

   Inheritance or inclusion – what’s true of
    the parent (BT) is true for all children
    (NTs)
   Applies to
    entities, actions, properties, agents – not
    just biological taxonomies
Value                Thinking                 Heat treatment
  Cultural value       Contemplation            Annealing
  Economic value       Divergent thinking       Decarburization
  Moral value          Lateral thinking         Hardening
  Social value         Reasoning                Tempering
Generic relationship test – 1

•   Both terms in same fundamental category
•   “All-and-some” test
    –   Rodents / Squirrels: SOME rodents are squirrels; ALL squirrels are rodents
    –   Pests / Squirrels: SOME pests are squirrels; NOT ALL squirrels are pests
•   Consider concepts of marketing and advertising
Generic relationship test – 2


   [Venn diagram: Rodents, Squirrels, Pests]

   ALL squirrels are rodents (passes)
   NOT ALL squirrels are pests (fails)
   NOT ALL pests are rodents (fails)
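
The all-and-some test can be expressed with sets: the candidate Narrower Term passes when all of its members belong to the candidate Broader Term and they are only some of that term's members. The sketch below treats this as a proper-subset check; the function name and the member lists are invented for illustration.

```python
def all_and_some(narrower_members, broader_members):
    """True when ALL narrower members are in the broader class
    and they make up only SOME of it (a proper subset)."""
    return narrower_members < broader_members


# Invented sample members for each class.
rodents = {"gray squirrel", "red squirrel", "house mouse", "beaver"}
squirrels = {"gray squirrel", "red squirrel"}
pests = {"gray squirrel", "house mouse", "cockroach"}

squirrels_under_rodents = all_and_some(squirrels, rodents)
squirrels_under_pests = all_and_some(squirrels, pests)
pests_under_rodents = all_and_some(pests, rodents)
```

Only the first pairing passes, so Squirrels can be a Narrower Term of Rodents but not of Pests.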
Hierarchy –
                       Whole-part relationship

•   Also known as meronymy or partonymy
•   Four types allowed in thesaurus standards
    –   Body systems and organs
        •   Ear → Middle ear
    –   Geographical locations
        •   Bernalillo County → Albuquerque
    –   Fields of study
        •   Geology → Physical geology
    –   Hierarchical social structures
        •   Ontario → Manitoulin District
Hierarchy –
                     Instance relationship

   General category (common noun) as BT,
    with individual example (proper noun) as
    NT
Seas                            French cathedrals
    Baltic Sea                          Chartres Cathedral
    Caspian Sea                         Rheims Cathedral
    Mediterranean Sea                   Rouen Cathedral

Essentially identical to “final node” in taxonomies
Polyhierarchical relationship

•   Term can logically fit under more than one
    Broader Term – can have Multiple Broader
    Terms (MBT)
•   Part of ISO standards, new to ANSI/NISO
     Spoons                              Forks
       Sporks                              Sporks
     Nurses                              Health administrators
       Nurse administrators                Nurse administrators
     Finance                             Careers
       Accounting                          Accounting
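
A polyhierarchy simply means that one term appears in more than one Narrower Term list. A minimal sketch, assuming the hierarchy is stored as a plain dictionary of NT lists; the function names `broader_terms` and `render` are hypothetical.

```python
# NT lists from the slide; "Sporks" is listed under two parents on purpose.
NARROWER = {
    "Spoons": ["Sporks"],
    "Forks": ["Sporks"],
    "Nurses": ["Nurse administrators"],
    "Health administrators": ["Nurse administrators"],
}


def broader_terms(term, narrower=NARROWER):
    """Derive every BT of `term` from the NT lists (there may be several)."""
    return sorted(bt for bt, nts in narrower.items() if term in nts)


def render(term, narrower=NARROWER, depth=0):
    """Indented display of one branch; a shared NT shows under each parent."""
    lines = ["  " * depth + term]
    for nt in sorted(narrower.get(term, [])):
        lines.extend(render(nt, narrower, depth + 1))
    return lines
```

Storing the term once and deriving its parents keeps the record for Sporks identical wherever it is displayed.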
Equivalence relationship

•   Preferred Term
    –   Thesaurus term and valid for indexing
    –   Thesaurus notation: USE

•   Non-Preferred Term
    –   Not valid for indexing
    –   An alias or imposter
    –   Entry point, directs user to Preferred Term
    –   Thesaurus notation: UF or NPT
    Spiders                                Plant pathology
      UF Arachnids                            USE Phytopathology

Equivalence – when to use
   Synonyms, slang, quasi-synonyms
   Scientific and trade names
       Ibuprofen      UF Motrin™
   Lexical variants
       Fiber optics   UF Fibre optics
       Mouse          UF Mice
   Upward posting of narrow concepts not specified
    in taxonomy or thesaurus
       Social class   UF Elite, Middle class, Working class

Get equivalent terms from search logs, brainstorming…
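
The USE / UF pairing works like a lookup table from entry terms to Preferred Terms. The following is a minimal sketch built from the examples on these slides; the names `USE`, `resolve`, and `used_for` are hypothetical, and matching is assumed to ignore case and extra spaces.

```python
# Entry term (non-preferred, lowercased) -> Preferred Term.
USE = {
    "arachnids": "Spiders",
    "plant pathology": "Phytopathology",
    "motrin": "Ibuprofen",
    "fibre optics": "Fiber optics",
    "mice": "Mouse",
    "elite": "Social class",
    "middle class": "Social class",
    "working class": "Social class",
}
PREFERRED = {term.lower(): term for term in USE.values()}


def resolve(entry):
    """Return the Preferred Term for an entry term, or None if unknown."""
    key = " ".join(entry.lower().split())
    if key in PREFERRED:
        return PREFERRED[key]
    return USE.get(key)


def used_for(preferred):
    """Reciprocal view: the non-preferred terms that point to `preferred`."""
    return sorted(np for np, target in USE.items() if target == preferred)
```

The last three entries show upward posting: narrow concepts that are not terms in their own right all lead to the broader Social class.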
Associative relationship

   Related Terms (RTs) – cousins
   “…terms related conceptually but not
    hierarchically, and are not part of an equivalence
    set” (i.e. not synonyms)
   Both terms are valid thesaurus terms for
    indexing, and have reciprocal relationship
   Expands user’s awareness, reflects thesaurus
    coverage of unanticipated areas
   Standards describe specific types (see Appendix)

Sibling rivalry and facets
   Format and sense of sibling Narrower Terms
    should be consistent
   If siblings don’t coexist well, separate them
   Subdivide large groups of terms into facets,
    mutually exclusive subcategories
   Growing demand with faceted navigation
   Facet examples
       Properties, Materials, Agents, Actions, Influence
       Objects, Styles and periods, Color, Shape
        (Art & Architecture Thesaurus)
Scope Notes (SN)

   Indicate meaning of the term in the context of
    this thesaurus, for this audience
     Stress – Metal, Psychological, Physiological
   Indicate any restriction in meaning
   Indicate range of topics covered
   Provide direction for indexers; for terms often
    confused, may suggest an alternative term
   Use only as needed – not for every term
   Establish and stick with consistent format
   Be concise
Talk about terms

•   Term format
•   Grammatical issues
•   Singular and plural forms
•   Spelling
•   Abbreviations and acronyms
•   Capitalization
•   Other punctuation
•   Consistency
Term format

•   KISS – Keep it short and simple
    –   1-2-3 words
        •   Effect on search
        •   Factoring, Postcoordination (coming)
•   Grammatical issues
    –   Nouns and noun phrases
    –   Verbish things
    –   Adjectives
    –   Adverbs
    –   Initial articles
Most terms are nouns

   Nouns or simple noun phrases
       Adj + Noun – Art history (ANSI/NISO standard)
           Noun + Prep + Noun – History of art (ISO standard)
       Exceptions – Burden of proof, Coats of arms,
        Prisoners of war, Birds of prey, etc.




Compound and Factored Terms

   “Terms in a thesaurus should represent
    simple or unitary concepts…” (ISO standard)
   “Compound terms should be factored
    (split) into simple elements…” (ANSI/NISO
    standard)


                Nice in theory…
                often unworkable
Compound terms
                        are precoordinated

   Elements are put together to specify a concept at
    the indexing stage
   Can’t change the parts

       Water pollution
       Library science
       Television influence on preschoolers
       Chicken dinner with turnips and rutabagas –
       no substitutions of menu items!


Precoordination positives

   User expectations – Rapid transit
       Occurs commonly in data, splitting would be odd
       Reflects a single concept for the audience
   Better accuracy – captures specific
    concepts precisely
   Fewer false drops
   Term information is retained
    (Related Terms, NonPreferred Terms, Scope Notes, etc.)
Precoordination negatives

   Poorer total recall
   Term proliferation
       Combinations and permutations increase
        thesaurus size
   Higher cost
   Limited flexibility in expressing new
    concepts

Postcoordination
                      pros and cons

   Higher recall
   Lower cost
   Greater flexibility – enables expression of new
    concepts through novel combinations
   Lower accuracy, some false drops
       Library science        NOT = Library + Science
       Art museums            NOT = Art + Museums
   Postcoordination is implicit in most searches
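
The trade-off between the two approaches shows up in a toy search. The sketch below assumes each document is indexed with a set of terms and that a search requires every requested term; the document names, the term "Libraries", and the helper `search_all` are invented for illustration.

```python
def search_all(index, *terms):
    """Documents whose assigned terms include every requested term (AND)."""
    wanted = set(terms)
    return sorted(doc for doc, assigned in index.items() if wanted <= assigned)


# doc_a is about library science; doc_b is about science books in libraries.
precoordinated = {
    "doc_a": {"Library science"},
    "doc_b": {"Libraries", "Science"},
}
factored = {
    "doc_a": {"Libraries", "Science"},
    "doc_b": {"Libraries", "Science"},
}

precise = search_all(precoordinated, "Library science")
broad = search_all(factored, "Libraries", "Science")
```

The factored index returns both documents, so doc_b is a false drop for someone who wanted library science; the precoordinated term retrieves only doc_a.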

About “and”

    Avoid “and” in terms – not a single concept

        Instead of: Children and television

    Factor and postcoordinate

        USE Media influence + Television + Children

    “And” is not in the standard
    In real life, need for granularity may dictate your choice
So far you’ve got
•   Hierarchy
•   Complete term records
    –   Broader and Narrower Terms
        •   Polyhierarchies when needed
    –   Preferred/Non-Preferred Terms
        (equivalence relationships)
    –   Related Terms (associative relationships)
    –   Scope Notes
    –   Correct term format
    –   Compound terms when needed
Notation

•   Symbols (numerals, letters, hyphens, colons, etc.)
    –   1: Apples
        •   1.1: Granny Smith
        •   1.2: Winesap
•   Adjunct to verbal expression of term
•   May represent another kind of ordering of sibling
    terms (non-alphabetic)
    –   Chronological, positional, numeric sequence, or other
        logical sequence for user group
    –   Same terms presented differently for different user groups,
        different purposes
•   Secondary to verbal concept organization
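
Dotted notation of the kind shown for Apples can be generated from the hierarchy itself. This is a minimal sketch, assuming the tree is given as nested (term, children) pairs in the order siblings should be displayed; the function name `assign_notation` is hypothetical.

```python
def assign_notation(tree, prefix=""):
    """Yield (notation, term) pairs; siblings are numbered in listed order."""
    for i, (term, children) in enumerate(tree, start=1):
        notation = f"{prefix}.{i}" if prefix else str(i)
        yield notation, term
        yield from assign_notation(children, notation)


tree = [
    ("Apples", [
        ("Granny Smith", []),
        ("Winesap", []),
    ]),
]
numbered = list(assign_notation(tree))
```

Because the numbering follows the listed order, reordering siblings (chronologically, for example) changes the notation without touching the terms.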
Review, edit, test, edit,
             use, edit, and maintain, i.e. edit

    Review
        Users
        Expert reviewers
    Test
        Index 500+ documents
         (more for variable writing style; fewer for strict style)
        Monitor search log
    Edit and maintain
        Add term
        Change existing term
        Change term status
        Delete term
        Add term relationship
        Delete term relationship
        Add/modify Scope Note
        Change overall structure

Consider automated / assisted indexing software
When do you add more
                terms?

   On demand
       When usage changes
       Stewardess – flight attendant
   As the field evolves
       8 changes to 64 colors
   In use
       Don’t freeze waiting for perfection

Methodology and Workflow

   Source Material
       Vocabularies
       Content: full text, HTML, PDF, data feeds, search logs, etc.
   Build and maintain Thesaurus (Software and Services)
       Choose the terms
       Style and spelling
       Craft hierarchical structure
       Non-hierarchical relationships: synonyms, related terms
   Refine the Rulebase
   Enrich Content via Data Harmony M.A.I.
       Documents: articles, proceedings, web pages, conference abstracts, etc.
   Enhanced Documents: Tagged XML
       Subjects, people, places, etc.
   Additional Databases: authors, experts, etc.
   Evaluate and Manage Thesaurus
How do I implement a
            taxonomy?

   In search
   In a web site
   In indexing
   In other ways
Taxonomy and System Integrations

   Source: full text, HTML, PDF, data feeds
   Document repository (CMS): Documentum, SharePoint, Oracle, MarkLogic
   Search: Perfect Search, Lucene, MarkLogic, SQL, etc.
   Search presentation layer: website
   Inline tagging
   Metadata extractor
   M.A.I. Rule Base
   Thesaurus Master
   Client taxonomy
Parts of Search

   Search software
       Inverted Index
       Search algorithms
   Presentation layer
       Search box
       Autocompletion
       Related and narrower terms
       Hierarchical display
Creating an Inverted File Index (sample text)

Outline of Presentation

   1     Define key terminology
   2     Thesaurus tools
             Features
             Functions
   3     Costs
             Thesaurus construction
             Thesaurus tools
   4     Why & when?
Simple Inverted File Index

&                                                   key
1                                                   of
2                                                   outline
3                                                   presentation
4                                                   terminology
construction                                        thesaurus
costs                                               tools
define                                              when
features                                            why
functions
Complex Inverted File Index
                      Example 1

& - Stop                              key - L2, P2, H
1 - Stop                              of - Stop
2 - Stop                              outline - L1, P1, T
3 - Stop                              presentation - L1, P3, T
4 - Stop                              terminology - L2, P3, H
construction - L7, P2, SH             thesaurus - (1) - L3, P1, H
costs - L6, P1, H                                 (2) - L7, P1, SH
define - L2, P1, H                                (3) - L8, P1, SH
features - L4, P1, SH                 tools - (1) - L3, P2, H
functions - L5, P1, SH                        (2) - L8, P2, SH
                                      when - L9, P3, H
                                      why - L9, P1, H
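
An index like this one could be built as follows. This is a minimal sketch under stated assumptions: the outline lines are supplied without their list numerals (so that "Define" sits at position 1, as on the slide), "&" and "of" form the stop list, and L and P are read as line and word position. The T / H / SH codes, presumably title, heading, and subheading, are left out, and the function name is hypothetical.

```python
from collections import defaultdict

STOP_WORDS = {"&", "of"}  # assumed stop list, taken from the "Stop" entries


def build_inverted_index(lines):
    """Map each non-stop word to its (line, position) postings, both 1-based."""
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        for pos, raw in enumerate(line.split(), start=1):
            word = raw.strip("?.,;:!").lower()
            if word and word not in STOP_WORDS:
                index[word].append((line_no, pos))
    return dict(index)


outline = [
    "Outline of Presentation",
    "Define key terminology",
    "Thesaurus tools",
    "Features",
    "Functions",
    "Costs",
    "Thesaurus construction",
    "Thesaurus tools",
    "Why & when?",
]
index = build_inverted_index(outline)
```

Stop words still count toward position, which is why "presentation" lands at position 3 of line 1 and "when" at position 3 of line 9.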
The Portal View -
                 MediaSleuth

   Use all options for search
   Traditional Search
   Taxonomy
   Rule Base
NavTree View




               MAIQuery
Taxonomy view    Thesaurus Term Record view
Search Presentation Layer




                 Automatic completion
                    And type ahead
                    from Thesaurus
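
Type-ahead from a thesaurus can be sketched as a prefix lookup over the sorted term list. The example below is an illustration under stated assumptions, not how MAIQuery or any other product works. The function names and the sample terms are hypothetical.

```python
import bisect


def build_lookup(terms):
    """Sort terms case-insensitively, keeping the display form with each key."""
    return sorted((term.lower(), term) for term in terms)


def complete(prefix, lookup, limit=5):
    """Return up to `limit` thesaurus terms that start with `prefix`."""
    key_prefix = prefix.lower()
    # Binary search for the first entry whose key is >= the prefix.
    start = bisect.bisect_left(lookup, (key_prefix,))
    matches = []
    for key, term in lookup[start:]:
        if not key.startswith(key_prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches


# Hypothetical sample terms.
LOOKUP = build_lookup([
    "Taxonomy",
    "Terminology",
    "Thesaurus",
    "Thesaurus construction",
    "Thesaurus tools",
])

suggestions = complete("thes", LOOKUP)
```

Because suggestions come only from the controlled vocabulary, every completion the user accepts is a term that the content has actually been indexed with.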
Search Presentation Layer




                     Related



                     Narrower
Search Presentation Layer




         The hierarchical view of the thesaurus is
         also a browsable view of the content.

         The numbers show the number of hits
             1. For the term
             2. For the branch
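
The two numbers can be computed from the hierarchy and the indexing data. The sketch below assumes that a branch count means the number of distinct documents indexed with the term or any of its narrower terms, so a document tagged twice within a branch is counted once. The data structures, names, and sample values are hypothetical.

```python
def branch_documents(children, postings, term):
    """Collect the documents indexed with a term or any narrower term."""
    docs = set(postings.get(term, ()))
    for child in children.get(term, ()):
        docs |= branch_documents(children, postings, child)
    return docs


def hit_counts(children, postings, term):
    """Return (hits for the term itself, hits for the whole branch)."""
    term_hits = len(postings.get(term, ()))
    branch_hits = len(branch_documents(children, postings, term))
    return term_hits, branch_hits


# Hypothetical hierarchy: broader term -> narrower terms.
CHILDREN = {"Cancer": ["Breast cancer", "Colon cancer"]}

# Hypothetical indexing: term -> IDs of documents tagged with it.
POSTINGS = {
    "Cancer": {1},
    "Breast cancer": {1, 2, 3},
    "Colon cancer": {3, 4},
}

cancer_counts = hit_counts(CHILDREN, POSTINGS, "Cancer")
```

Using sets of document IDs rather than adding up per-term totals avoids inflating the branch number when one document carries several terms from the same branch.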
Taxonomy view    Thesaurus Term Record view
Web Taxonomies –
                  Changing faces

   ….and how the information is delivered
   From current site
   To new version
       Depends on TAXONOMY
   Personalization
   Feeding ads
   Consistent information
Use the taxonomy here


          HTML Headers
    META NAME KEYWORD
To personalize or profile
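
Writing taxonomy terms into the HTML header can be sketched as follows. The function name and the sample terms are hypothetical. The only fixed part is the standard `meta` element with `name="keywords"`.

```python
from html import escape


def keywords_meta(terms):
    """Build a META NAME KEYWORDS tag from a page's taxonomy terms."""
    content = ", ".join(terms)
    # Escape quotes and ampersands so the attribute value stays valid HTML.
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


tag = keywords_meta(["Breast cancer", "Folate", "Alcohol consumption"])
```

The same term list can also feed a user profile for personalization, since both uses start from the terms assigned to the page.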
Improve Search:         www.mediasleuth.com




Autocompletion Using the
Taxonomy




           Guide the User

           Navigate the full Taxonomy “Tree”

           “Indispensable for anyone trying to identify
           instructional media for teaching.” – CHOICE Magazine
Link to Society Resources

   Journal Article on Topic A

   Linked resources:
       CME Activity on Topic A
       Upcoming Conference on Topic A
       Other Journal Articles on Topic A
       Job Posting for Expert on Topic A
       Grant Available for Researchers Working on Topic A
       Podcast Interview with Researcher Working on Topic A



                                                 CONFIDENTIAL
Author Submission Module

 The author pastes the data into the document template,
 attaching images and graphs as necessary:
Author connections
Authors at a place
MASHUP locations to a
GPS grid of an area
Watch Crime in action
More Like This - Recommender
Cancer Epidemiology Biomarkers & Prevention
Vol. 12, 161-164, February 2003
© 2003 American Association for Cancer Research

Short Communications

Alcohol, Folate, Methionine, and Risk of Incident Breast Cancer in the
American Cancer Society Cancer Prevention Study II Nutrition Cohort
Heather Spencer Feigelson 1, Carolyn R. Jonas, Andreas S.
Robertson, Marjorie L. McCullough, Michael J. Thun and Eugenia E. Calle
Department of Epidemiology and Surveillance Research, American Cancer
Society, National Home Office, Atlanta, Georgia 30329-4251

Recent studies suggest that the increased risk of breast cancer associated
with alcohol consumption may be reduced by adequate folate intake. We
examined this question among 66,561 postmenopausal women in the
American Cancer Society Cancer Prevention Study II Nutrition Cohort.

Related Press Releases
•How What and How Much We Eat (And Drink) Affects Our Risk of Cancer
•Novel COX-2 Combination Treatment May Reduce Colon Cancer Risk Combination Regimen of COX-2 Inhibitor and Fish Oil Causes Cell Death
•COX-2 Levels Are Elevated in Smokers

Related AACR Workshops and Conferences
•Frontiers in Cancer Prevention Research
•Continuing Medical Education (CME)
•Molecular Targets and Cancer Therapeutics

Related Meeting Abstracts
•Association between dietary folate intake, alcohol intake, and methylenetetrahydrofolate reductase C677T and A1298C polymorphisms and subsequent breast
•Folate, folate cofactor, and alcohol intakes and risk for colorectal adenoma
•Dietary folate intake and risk of prostate cancer in a large prospective cohort study

Related Working Groups
•Finance
•Charter
•Molecular Epidemiology

Think Tank Report
Related Think Tank Report Content

Related Education Book Content
Oral Contraceptives, Postmenopausal Hormones, and Breast Cancer
Physical Activity and Cancer
Hormonal Interventions: From Adjuvant Therapy to Breast Cancer Prevention

Webcasts
Related Webcasts

Related Awards
•AACR-GlaxoSmithKline Clinical Cancer Research Scholar Awards
•ACS Award
•Weinstein Distinguished Lecture
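
A "more like this" panel of this kind can be driven by the taxonomy terms that items share. The sketch below ranks items by Jaccard overlap of their term sets. That measure is a stand-in chosen for illustration, not necessarily what the site shown here uses. The item IDs and term assignments are hypothetical.

```python
def more_like_this(target_id, tagged, limit=3):
    """Rank other items by how many taxonomy terms they share with the target."""
    target_terms = tagged[target_id]
    scored = []
    for item_id, terms in tagged.items():
        if item_id == target_id:
            continue
        union = target_terms | terms
        # Jaccard overlap: shared terms divided by all terms on either item.
        score = len(target_terms & terms) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, item_id))
    # Highest score first; ties broken by item ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for _, item_id in scored[:limit]]


# Hypothetical items, each tagged with terms from one shared taxonomy.
TAGGED = {
    "article": {"Breast cancer", "Folate", "Alcohol"},
    "abstract-1": {"Folate", "Alcohol", "Colorectal adenoma"},
    "abstract-2": {"Folate", "Prostate cancer"},
    "award": {"Research awards"},
}

related = more_like_this("article", TAGGED)
```

This only works across content types (articles, abstracts, press releases, webcasts) when all of them are indexed with the same taxonomy.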
Thesaurus Resources
•   American Society for Information Science and Technology
    –   www.asis.org/
•   ANSI/NISO Standard Z39.19-1993
    –   www.niso.org
•   Australian Society of Indexers
    –   www.aussi.org/
•   Data Harmony
    –   www.dataharmony.com
•   International Society for Knowledge Organization
    –   http://www.iskouk.org/index.htm
•   Networked Knowledge Organization Systems
    –   http://nkos.slis.kent.edu/
•   SLA Taxonomy Division
    –   SLA Taxonomy and Metadata (wiki.sla.org/display/SLATAX/Home)
•   Taxonomy Community of Practice
Readings –
                     Thesaurus Construction
   Thesaurus Construction and Use: A Practical Manual, by Jean
    Aitchison, Alan Gilchrist, and David Bawden. The fourth
    edition has taxonomy information.
    http://www.alibris.com/search//search/search.cfm?wauth=Aitchison%2C%20Jean%20Gilchrist%2C
   NISO Z39.19 (2005) standard, NOT the 2003
   http://www.asindexing.org/site/thesbuild.shtml American Society
    for Indexing - a good practical approach
   Books about the process also include the ones listed here:
    http://www.asindexing.org/site/bibliog.shtml
   There is also a series of white papers and other information on
    the web site at www.dataharmony.com
   SLA Taxonomy Division

Review

   Why build a taxonomy?
   What is a taxonomy?
   What are the standards?
   Where are taxonomies used?
   What are the parts of a taxonomy?
   How do you build one?
   How do you implement one?
Talks upcoming

   “Data Visualization” workshop at
    Computers in Libraries – April 11
   SLA Taxonomy Division workshop “How
    to build a taxonomy” June 8
   Session “How to Apply Your Taxonomy to
    Your Content” Monday June 10
Thank you!
                    Marjorie M.K. Hlava*
                    President and Chief Scientist
                    Access Innovations, Inc.
                    mhlava@accessinn.com
                    505-998-0800
* Our team of 37 has built over 200 taxonomies and implemented more than
600 for enterprises, governments, and not-for-profits. We have also built tools
to do the work and are glad to share them with you for your projects.

Taxonomy 101

  • 1. Albuquerque, NM 87110 www.accessinn.com www.dataharmony.com 505-998-0800 Marjorie M.K. Hlava President and Chief Scientist Access Innovations, Inc. Taxonomy 101
  • 2. Overview of the Presentation  Why build a taxonomy?  What is a taxonomy?  What are the standards?  Where are taxonomies used?  What are the parts of a taxonomy?  How do you build one?  How do you implement one?  Review
  • 3. Why build a taxonomy?  Leverage your data  Search, precision, and recall  Websites  Discoverability and findability  Data mashups  Data trends and visualization  Repackaging and repurposing data  Author and entity disambiguation
  • 4. Heart of the Big Data production process
  • 6. From the production side to the website display carry the taxonomy descriptors for use in precision search
  • 7. What is a Taxonomy? ANSI/NISO Z39.19-2005 controlled “A collection of controlled vocabulary terms organized into a Yes! hierarchical structure.” Missing: equivalence, homographic, and associative relationships and notes Copyright © 2009 - Access Innovations, Inc.
  • 8. A taxonomy is a knowledge organization system  Uncontrolled list Not complex  Name authority file  Synonym set/ring  Controlled vocabulary  Taxonomy  Thesaurus  Ontology  Semantic network Highly complex Copyright © 2009 - Access Innovations, Inc.
  • 9. A Thesaurus is a Knowledge Organization System  Controlled vocabulary KOS  Focus on conceptual classes, not specifics  Hierarchy – implicit if not displayed  Parent-child relationships  Various display formats may be available  Network of relationships between terms guides user to find information Long  Cousins, friends, aliases  Scope notes, term history established  More elaborate and informative standards Copyright © 2009 - Access Innovations, Inc.
  • 10. Thesaurus defined – ANSI/NISO Z39.19-2005 “A controlled vocabulary arranged in a known order and structured so that the various [equivalence, homographic, hierarchical, and associative] relationships among terms are displayed clearly and identified by standardized relationship indicators. Relationship indicators should be employed reciprocally. “Its purpose is to promote consistency in the indexing of content objects, especially for postcoordinated information storage and retrieval systems, and to facilitate browsing and searching by linking entry terms with terms. Thesauri may also facilitate the retrieval of content objects in free text searching.” Copyright © 2009 - Access Innovations, Inc.
  • 11. Structure of controlled vocabularies Lists Synonyms Taxonomy Thesaurus Ontology INCREASING COMPLEXITY and CONTROL Ambiguity Ambiguity Ambiguity Synonym Synonym Synonym Hierarchy Hierarchy Hierarchy Relationships Additional kinds of relationships Copyright © 2009 - Access Innovations, Inc.
  • 12. Taxonomy? Thesaurus?  Often used interchangeably  Thesaurus is a taxonomy with extras  Related Terms  Non-preferred Terms (USE/Used for)  Scope Notes  more  Use the word your audience understands  Avoid confusion with Roget’s Thesaurus Copyright © 2009 - Access Innovations, Inc.
  • 13. Where can I get taxonomy standards?  www.niso.org  Z39.19 (2010) Controlled Vocabularies  www.ISO.ce  ISO 25964 parts 1 and 2  www.bsi.uk.co  www.w3c.org
  • 14. Web Ontology Language OWL  W3C Recommendation 10 February 2004  http://www.w3.org/TR/2004/Rec-owl-guide-20040210/  http://www.w3.org/TR/2004/Rec-owl-ref-20040210/  http://www.w3.org/TR/2004/Rec-webont-req-20040210/  Continuing updates Copyright © 2009 - Access Innovations, Inc.
  • 15. Taxonomy Thesaurus view Term Record view Copyright © 2005 - Access Innovations, Inc.
  • 16. Where are taxonomies used? • In “indexing” or categorizing, as subject metadata • In search • In content management systems • In SharePoint • In mashups • In social networking sites • In author tagging • In filtering data – e.g., spam filters and RSS feeds • In web crawlers
  • 17. Why the excitement?  Makes information findable!  Cut search time by 50% - The Weather Channel  Organizes web sites  Provides better online help  Customer support 30x more costly than web self-service (Forrester Research "Tier Zero Customer Support" 1999)
  • 18. Taxonomies in business “The High Cost of Not Finding Information”  Time wasted searching  Confusion about same information by different name  Similar/overlapping activities, products, uses (Susan Feldman, KMWorld, March 2004) With a unified taxonomy and consistent indexing:  Better searching or browsing to locate information  More efficient content management  Focused content collection through web spidering  Personalized content delivery
  • 19. Good Search must have…metadata [Diagram labels: Inverted File Index, Searchable Index, Taxonomy, Hierarchical Display, Thesaurus]
  • 20. Metadata  The fields  The elements  Class codes  Title  Author  Plaintiff  Product  subject / topic  Meta Name Keywords in HTML
  • 21. Copyright © 2005 - Access Innovations, Inc.
  • 22. Basic taxonomy / thesaurus features • Hierarchy structure – Broader Terms = more general concepts – Narrower Terms = more specific concepts • Related Terms = conceptual cousins • Term equivalents = synonyms • Classification options • Scope notes • Other elements as needed
  • 23. What are the parts? The term record = subject term, heading, node, category, descriptor, class  Main Term (MT)  Top Term (TT) TAXONOMY  Broader Terms (BT)  Narrower Terms (NT) ONTOLOGY  Related Terms (RT) THESAURUS  See also (SA)  Non-Preferred Term (NP)  Used for (UF), See (S)  Scope Note (SN)  History (H)
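The parts of a term record listed above map naturally onto a small data structure. The following is only a sketch under stated assumptions: the class name `TermRecord`, its field names, and the flat display layout are illustrative choices, not Data Harmony's format, and the sample values borrow the elections example from the Broader to Narrower Terms slide.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One term record; field names are hypothetical, abbreviations follow the slide."""
    term: str                                      # Main Term (MT)
    broader: list = field(default_factory=list)    # Broader Terms (BT)
    narrower: list = field(default_factory=list)   # Narrower Terms (NT)
    related: list = field(default_factory=list)    # Related Terms (RT)
    used_for: list = field(default_factory=list)   # Used for (UF): non-preferred terms
    scope_note: str = ""                           # Scope Note (SN)
    history: str = ""                              # History (H)

    def display(self) -> str:
        """Render the record as a flat listing, one relationship per line."""
        lines = [self.term]
        if self.scope_note:
            lines.append(f"  SN {self.scope_note}")
        for label, values in (("UF", self.used_for), ("BT", self.broader),
                              ("NT", self.narrower), ("RT", self.related)):
            lines.extend(f"  {label} {value}" for value in values)
        return "\n".join(lines)


record = TermRecord(
    term="Elections",
    broader=["Politics"],
    narrower=["Presidential elections", "Gubernatorial elections", "Mayoral elections"],
)
print(record.display())
```

Keeping each relationship type in its own list makes it easy to start with a plain taxonomy (BT/NT only) and add thesaurus features such as RT, UF, and SN later.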
  • 24. How do you build one?  From scratch?  Adoption of existing  Term registries  Taxonomy Warehouse  Other resources  Combination
  • 25. Define subject field  Review representative collection of content  Determine:  Core areas  Peripheral topics Sociology Psychology Education Law  Scope can be modified later
  • 26. Before you go on: Build or buy? • Survey existing thesaurus/taxonomy resources for your domain • Test for – Scope – Depth • Make-or-break terms – Cost Don’t reinvent the wheel!
  • 27. Build a taxonomy – simple steps • Get paper and pencil – Sharpen pencil • Define subject field • Collect terms • Organize terms • Fill in gaps • Flesh out and interrelate terms You’re done!
  • 28. Your taxonomy / thesaurus end product • Reflects – scope of your concern – degree of precision you need • Facilitates – data storage and retrieval by vocabulary control – discovery of ideas • Promotes learning – preferred terminology – relationships among concepts – organized guide to your field
  • 29. How do you choose terms?  Importance in the subject area  Use in the literature, by the organization or community  Necessary degree of specificity or detail  Relationship with other controlled vocabularies
  • 30. Vocabulary control – why? “Eliminating ambiguity and compensating for synonymy through vocabulary control assures that each term has only one meaning and that only one term can be used to represent a given concept or entity. … Ambiguity occurs in natural language when a word or phrase (a homograph or polyseme) has more than one meaning. ” (ANSI/NISO Z39.19-2005)
  • 31. One term / one concept  “Terms in a thesaurus should represent simple or unitary concepts…” (ISO standard)  “Each term included in a controlled vocabulary should represent a single concept (or unit of thought). A single concept is frequently expressed by a single-word term but in many cases a multiword term is required to represent the concept.” (ANSI/NISO Z39.19-2005)
  • 32. Vocabulary control – how?  Use unambiguous terms, clear to the user group  Distinguish between terms that appear similar  Use Scope Notes when necessary  Use terms as elements that can be coordinated in a flexible manner  Create compound terms if necessary
  • 33. A “term” synonym ring Term Descriptor Node Category Subject heading
  • 34. So what’s a concept? • “A unit of thought, formed by mentally combining some or all of the characteristics of a concrete or abstract, real or imaginary object. Concepts exist in the mind as abstract entities independent of terms used to express them.” • Three main categories – Abstract concepts – Concrete entities – Proper nouns
  • 35. Concrete entities as terms • Things and their physical parts – primates • head – buildings • floors • Materials – cement – wood – lead
  • 36. Abstract concepts as terms • Actions and events – evolution, skating, management, ceremonies • Abstract entities – law, theory • Properties of things, materials, and actions – strength, efficiency • Disciplines and sciences – physics, meteorology, mathematics • Units of measurement – pounds, kilograms, miles, meters, nanoseconds
  • 37. Proper nouns as terms  Individual entities – “classes of one” – expressed as proper nouns  San Francisco, Lake Michigan Thesaurus standards exclude proper names, persons, and trade names; those go in authority files. Taxonomies include them as final nodes.
  • 38. Collect terms  Your documents and databases  Departmental terminology  Text books and their indexes  Book tables of contents and indexes  Journal quarterly indexes  Encyclopedias  Lexicons, glossaries on the topic  Web resources  Users and experts  Search logs
  • 39. Gather terms from search logs “Beyond the Spider: The Accidental Thesaurus” (Richard Wiggins in Information Today, Oct 2002) Top ~100 search terms from search logs Match to web site with appropriate answer Basis for favorites or best bets, presented at the top of results list. (AKA behavior-based taxonomy) Not a thesaurus or taxonomy, but still a useful source of terms.
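Pulling the top search terms from a log can be as simple as counting normalized queries. This is a minimal sketch, assuming the log has already been reduced to a list of query strings; the function name `top_search_terms` and the sample queries are made up for illustration.

```python
from collections import Counter


def top_search_terms(queries, n=100):
    """Return the n most frequent queries after lowercasing and collapsing whitespace."""
    normalized = (" ".join(query.lower().split()) for query in queries)
    counts = Counter(query for query in normalized if query)
    return counts.most_common(n)


# Hypothetical log entries, already stripped of timestamps and user data.
log = ["Parking permits", "parking  permits", "tuition", "Tuition", "tuition", "library hours"]
top = top_search_terms(log, n=2)
print(top)
```

The resulting list is a candidate-term source and a starting point for best bets; deciding which entries become preferred terms or synonyms remains an editorial job.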
  • 40. Organize terms – roughly  Sort terms into several major categories – logical groups of similar concepts as Top Terms  Identify core areas and peripheral topics  10 – 20 to start  Consider moving proper names to authority files  Result: loose collection of terms under several main headings  Rough and tentative – see how it fits as you go  Initial gap analysis  Add / modify / delete as needed
  • 41. Usefulness of a term – the “duh” factor • Some terms are so basic for a domain that they have little or no value – “Sports” in Sports Illustrated – “Technology” in Technology Review – “Golf” in Golf Magazine – “Information science” and “Information technology” • How useful will the term be for indexing? – Does the term apply to everything in the domain? – Does the term distinguish important concepts? – If term is needed, specify limited use conditions in Scope Note
  • 42. Hierarchy structures – variations on a theme • Not pre-determined – Subcategorize wines first by type, variety, region, then cost? Or first by cost and then type? • Varies by user group and needs – May have multiple views of same content – Standard alpha view or customized notation • Affects information architecture, i.e., how web site functions Copyright © 2005 - Access Innovations, Inc.
  • 43. How do terms relate?  Hierarchical relationships -- Parents and their children TAXONOMY  Equivalence relationships THESAURUS -- Aliases  Associative relationships -- Cousins
  • 44. Hierarchical relationships  Broader Term represents the class, whole, or genus  Narrower Term is a member, part, or species  Generic relationship  Whole-part relationship  Instance relationship  BTs/NTs have a reciprocal relationship  Hyponym - Hypernym
  • 45. Broader to Narrower Terms Politics Elections Presidential elections Gubernatorial elections Mayoral elections
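The reciprocal BT/NT rule can be enforced by never storing one direction without the other. The sketch below assumes the slide's hierarchy is Politics > Elections > the three election types (the indentation is lost in this transcript); the `Hierarchy` class and its method names are hypothetical.

```python
from collections import defaultdict


class Hierarchy:
    """Stores BT/NT links so that every link is recorded in both directions."""

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its Broader Terms
        self.narrower = defaultdict(set)  # term -> its Narrower Terms

    def add(self, broader_term, narrower_term):
        # One call writes both halves of the reciprocal relationship.
        self.narrower[broader_term].add(narrower_term)
        self.broader[narrower_term].add(broader_term)

    def descendants(self, term):
        """All terms below `term`, at any depth."""
        found, stack = set(), [term]
        while stack:
            for child in self.narrower.get(stack.pop(), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


h = Hierarchy()
h.add("Politics", "Elections")
for kind in ("Presidential elections", "Gubernatorial elections", "Mayoral elections"):
    h.add("Elections", kind)

print(sorted(h.broader["Mayoral elections"]))
print(sorted(h.descendants("Politics")))
```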
  • 46. Hierarchy – Generic (genus-species) relationship  Inheritance or inclusion – what’s true of the parent (BT) is true for all children (NTs)  Applies to entities, actions, properties, agents – not just biological taxonomies  Value: Cultural value, Economic value, Moral value, Social value  Thinking: Contemplation, Divergent thinking, Lateral thinking, Reasoning  Heat treatment: Annealing, Decarburization, Hardening, Tempering
  • 47. Generic relationship test – 1 • Both terms in same fundamental category • “All-and-some” test – Rodents / Squirrels: SOME rodents are squirrels, ALL squirrels are rodents – Pests / Squirrels: SOME pests are squirrels, NOT ALL squirrels are pests • Consider concepts of marketing and advertising
  • 48. Generic relationship test – 2 (Rodents, Squirrels, Pests)  ✓ ALL squirrels are rodents  ✗ NOT ALL squirrels are pests  ✗ NOT ALL pests are rodents
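The all-and-some test behaves like a proper-subset check. The sketch below models each class as a set of example members purely for illustration; in practice the judgment is made about concepts, not by enumerating instances, and the member names here are invented.

```python
def passes_all_and_some(narrower_members, broader_members):
    """True when ALL narrower members are in the broader class and only SOME of the broader class is narrower."""
    return narrower_members < broader_members  # proper subset


# Toy members, chosen only to make the three cases from the slide visible.
squirrels = {"grey squirrel", "red squirrel"}
rodents = squirrels | {"house mouse", "brown rat"}
pests = {"grey squirrel", "house mouse", "cockroach"}

print(passes_all_and_some(squirrels, rodents))  # squirrels under rodents
print(passes_all_and_some(squirrels, pests))    # squirrels under pests
print(passes_all_and_some(pests, rodents))      # pests under rodents
```

Only the first pairing qualifies as a generic BT/NT relationship; the other two are better handled as Related Terms, if at all.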
  • 49. Hierarchy – Whole-part relationship • Also known as meronymy or partonymy • Four types allowed in thesaurus standards – Body systems and organs • Ear → Middle ear – Geographical locations • Bernalillo County → Albuquerque – Fields of study • Geology → Physical geology – Hierarchical social structures • Ontario → Manitoulin District
  • 50. Hierarchy – Instance relationship  General category (common noun) as BT, with individual example (proper noun) as NT  Seas: Baltic Sea, Caspian Sea, Mediterranean Sea  French cathedrals: Chartres Cathedral, Rheims Cathedral, Rouen Cathedral Essentially identical to “final node” in taxonomies
  • 51. Polyhierarchical relationship • Term can logically fit under more than one Broader Term – can have Multiple Broader Terms (MBT) • Part of ISO standards, new to ANSI/NISO • Sporks: BT Spoons, BT Forks • Nurse administrators: BT Nurses, BT Health administrators • Accounting: BT Finance, BT Careers
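A polyhierarchy needs nothing more than letting a term hold a set of Broader Terms instead of a single parent. A minimal sketch using the slide's examples follows; the mapping layout and the function name `polyhierarchical_terms` are assumptions.

```python
# term -> set of Broader Terms; a single-parent term is included for contrast.
broader_terms = {
    "Sporks": {"Spoons", "Forks"},
    "Nurse administrators": {"Nurses", "Health administrators"},
    "Accounting": {"Finance", "Careers"},
    "Mayoral elections": {"Elections"},
}


def polyhierarchical_terms(bt_map):
    """List the terms that sit under more than one Broader Term (MBT)."""
    return sorted(term for term, parents in bt_map.items() if len(parents) > 1)


print(polyhierarchical_terms(broader_terms))
```

A report like this is handy during review, since every MBT term shows up in more than one branch of a hierarchical display.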
  • 52. Equivalence relationship • Preferred Term – Thesaurus term and valid for indexing – Thesaurus notation: USE • Non-Preferred Term – Not valid for indexing – An alias or imposter – Entry point, directs user to Preferred Term – Thesaurus notation: UF or NPT • Examples: Spiders UF Arachnids; Plant pathology USE Phytopathology
  • 53. Equivalence – when to use  Synonyms, slang, quasi-synonyms  Scientific and trade names  Ibuprofen UF Motrin™  Lexical variants  Fiber optics UF Fibre optics  Mouse UF Mice  Upward posting of narrow concepts not specified in taxonomy or thesaurus  Social class UF Elite, Middle class, Working class Get equivalent terms from search logs, brainstorming…
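USE and UF are two views of the same link, so a lookup table built from the UF side is enough to steer any entry term to its Preferred Term. This is a sketch only: the dictionary layout and the `preferred` helper are illustrative, and matching is limited to case-insensitive exact strings.

```python
# Preferred Term -> its Non-Preferred Terms (the UF side), taken from the slides.
used_for = {
    "Spiders": ["Arachnids"],
    "Phytopathology": ["Plant pathology"],
    "Fiber optics": ["Fibre optics"],
    "Mouse": ["Mice"],
    "Social class": ["Elite", "Middle class", "Working class"],
}

# Invert to get the USE side: non-preferred entry term -> Preferred Term.
use = {alias.casefold(): pref for pref, aliases in used_for.items() for alias in aliases}
preferred_by_key = {pref.casefold(): pref for pref in used_for}


def preferred(term):
    """Return the Preferred Term for any entry term, or None if the term is unknown."""
    key = term.casefold()
    return preferred_by_key.get(key) or use.get(key)


print(preferred("fibre optics"))
print(preferred("Working class"))
print(preferred("Unicorns"))
```

The "Working class" lookup shows upward posting at work: a narrow concept without its own record is indexed under the broader preferred term.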
  • 54. Associative relationship  Related Terms (RTs) – cousins  “…terms related conceptually but not hierarchically, and are not part of an equivalence set” (i.e. not synonyms)  Both terms are valid thesaurus terms for indexing, and have reciprocal relationship  Expands user’s awareness, reflects thesaurus coverage of unanticipated areas  Standards describe specific types (see Appendix)
  • 55. Sibling rivalry and facets  Format and sense of sibling Narrower Terms should be consistent  If siblings don’t coexist well, separate them  Subdivide large groups of terms into facets, mutually exclusive subcategories  Growing demand with faceted navigation  Facet examples  Properties, Materials, Agents, Actions, Influence  Objects, Styles and periods, Color, Shape (Art & Architecture Thesaurus)
  • 56. Scope Notes (SN)  Indicate meaning of the term in the context of this thesaurus, for this audience  Stress – Metal, Psychological, Physiological  Indicate any restriction in meaning  Indicate range of topics covered  Provide direction for indexers; for terms often confused, may suggest an alternative term  Use only as needed – not for every term  Establish and stick with consistent format  Be concise
  • 57. Talk about terms • Term format • Grammatical issues • Singular and plural forms • Spelling • Abbreviations and acronyms • Capitalization • Other punctuation • Consistency
  • 58. Term format • KISS – Keep it short and simple – 1-2-3 words • Effect on search • Factoring, Postcoordination (coming) • Grammatical issues – Nouns and noun phrases – Verbish things – Adjectives – Adverbs – Initial articles
  • 59. Most terms are nouns  Nouns or simple noun phrases  Adj + Noun – Art history (ANSI/NISO standard)  Noun + Prep + Noun – History of art (ISO standard)  Exceptions – Burden of proof, Coats of arms, Prisoners of war, Birds of prey, etc.
  • 60. Compound and Factored Terms  “Terms in a thesaurus should represent simple or unitary concepts…” (ISO standard)  “Compound terms should be factored (split) into simple elements…” (ANSI/NISO standard) Nice in theory… often unworkable
  • 61. Compound terms are precoordinated  Elements are put together to specify a concept at the indexing stage  Can’t change the parts  Examples: Water pollution; Library science; Television influence on preschoolers; Chicken dinner with turnips and rutabagas – no substitutions of menu items!
  • 62. Precoordination positives  User expectations – Rapid transit  Occurs commonly in data, splitting would be odd  Reflects a single concept for the audience  Better accuracy – captures specific concepts precisely  Fewer false drops  Term information is retained (Related Terms, NonPreferred Terms, Scope Notes, etc.)
  • 63. Precoordination negatives  Poorer total recall  Term proliferation  Combinations and permutations increase thesaurus size  Higher cost  Limited flexibility in expressing new concepts
  • 64. Postcoordination pros and cons  Higher recall  Lower cost  Greater flexibility – enables expression of new concepts through novel combinations  Lower accuracy, some false drops  Library science NOT = Library + Science  Art museums NOT = Art + Museums  Postcoordination is implicit in most searches
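The recall-versus-accuracy trade-off can be seen with a few sets of document ids. In this sketch the ids and postings are invented: documents 1 and 2 are about library science, while document 3 merely mentions a library and science together.

```python
def postcoordinate(postings, *keys):
    """Combine single concepts at search time with a Boolean AND."""
    return set.intersection(*(postings[key] for key in keys))


# Hypothetical word-level postings, as a search engine might hold them.
word_postings = {
    "library": {1, 2, 3},      # doc 3 is about a science museum's library
    "science": {1, 2, 3, 4},
}
# The precoordinated descriptor was assigned by an indexer to docs 1 and 2 only.
descriptor_postings = {"Library science": {1, 2}}

combined = postcoordinate(word_postings, "library", "science")
false_drops = combined - descriptor_postings["Library science"]
print(sorted(combined), sorted(false_drops))
```

Postcoordination retrieves everything the descriptor would, plus document 3: a false drop that the precoordinated term avoids.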
  • 65. About “and”  Avoid “and” in terms – not a single concept  Instead of: Children and television  Factor and postcoordinate: USE Media influence + Television + Children  “And” is not in the standard  In real life, the need for granularity may dictate your choice
  • 66. So far you’ve got • Hierarchy • Complete term records – Broader and Narrower Terms • Polyhierarchies when needed – Preferred/Non-Preferred Terms (equivalence relationships) – Related Terms (associative relationships) – Scope Notes – Correct term format – Compound terms when needed
  • 67. Notation • Symbols (numerals, letters, hyphens, colons, etc.) – 1: Apples • 1.1: Granny Smith • 1.2: Winesap • Adjunct to verbal expression of term • May represent another kind of ordering of sibling terms (non-alphabetic) – Chronological, positional, numeric sequence, or other logical sequence for user group – Same terms presented differently for different user groups, different purposes • Secondary to verbal concept organization
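Notation of the 1, 1.1, 1.2 kind can be derived from the hierarchy once the sibling order is fixed. A minimal sketch, assuming the tree is given as nested (term, children) pairs already in the desired order; `assign_notation` is a made-up name.

```python
def assign_notation(tree, prefix=""):
    """Map dotted notation codes to terms, following the given sibling order."""
    codes = {}
    for position, (term, children) in enumerate(tree, start=1):
        code = f"{prefix}.{position}" if prefix else str(position)
        codes[code] = term
        codes.update(assign_notation(children, code))
    return codes


tree = [("Apples", [("Granny Smith", []), ("Winesap", [])])]
notation = assign_notation(tree)
print(notation)
```

Because the codes follow the list order rather than the alphabet, reordering siblings chronologically or by position for a particular user group only requires reordering the input.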
  • 68. Review, edit, test, edit, use, edit, and maintain, i.e. edit  Review  Users  Expert reviewers  Test  Index 500+ documents (more for variable writing style; fewer for strict style)  Monitor search log  Edit and maintain  Add term  Change existing term  Change term status  Delete term  Add term relationship  Delete term relationship  Add/modify Scope Note  Change overall structure Consider automated / assisted indexing software
  • 69. When do you add more terms?  On demand  When usage changes  Stewardess – flight attendant  As the field evolves  8 changes to 64 colors  In Use  Don’t freeze waiting for perfection
  • 70. Methodology and Workflow [Workflow diagram, Data Harmony (Software and Services). Recoverable labels: Choose the Source Material (documents, databases, full text, HTML, PDF, data feeds, web pages, search logs, articles, conference proceedings, abstracts, experts, etc.); Choose the Terms (subjects, people, places, etc.); Craft Hierarchical Structure; Additional Non-hierarchical Relationships: Synonyms, Related Terms; Build and maintain Thesaurus; Refine the Thesaurus; Evaluate and Manage Rulebase; Enrich Content via M.A.I.; Enhanced Documents: Tagged XML]
  • 71. How do I implement a taxonomy?  In search  In a web site  In indexing  In other ways
  • 72. Taxonomy and System Integrations [Integration diagram. Recoverable labels: Document repository (full text, HTML, PDF, data feeds); CMS (Documentum, SharePoint, SQL, Oracle, MarkLogic, etc.); SEARCH (Perfect Search, Lucene, MarkLogic); Search Presentation Layer (Website); Metadata Extractor; Client Taxonomy; Inline Tagging; M.A.I. Rule Base; Thesaurus Master]
  • 73. Parts of Search  Search software  Inverted Index  Search algorithms  Presentation layer  Search box  Autocompletion  Related and narrower terms  Hierarchical display
  • 74. Creating an Inverted File Index. Sample text: Outline of Presentation 1 Define key terminology 2 Thesaurus tools  Features  Functions 3 Costs  Thesaurus construction  Thesaurus tools 4 Why & when?
  • 75. Simple Inverted File Index: &, 1, 2, 3, 4, construction, costs, define, features, functions, key, of, outline, presentation, terminology, thesaurus, tools, when, why
  • 76. Complex Inverted File Index Example 1: & - Stop; 1 - Stop; 2 - Stop; 3 - Stop; 4 - Stop; construction - L7, P2, SH; costs - L6, P1, H; define - L2, P1, H; features - L4, P1, SH; functions - L5, P1, SH; key - L2, P2, H; of - Stop; outline - L1, P1, T; presentation - L1, P3, T; terminology - L2, P3, H; thesaurus - (1) L3, P1, H, (2) L7, P1, SH, (3) L8, P1, SH; tools - (1) L3, P2, H, (2) L8, P2, SH; when - L9, P3, H; why - L9, P1, H
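The complex index can be rebuilt from the sample outline in a few lines. This sketch makes several assumptions: L and P are read as line number and word position, the codes T, H, and SH as title, heading, and subheading, and the outline numerals are treated as list markers that are left out of the text. The function name `build_index` is hypothetical.

```python
from collections import defaultdict

STOP_WORDS = {"of", "&"}

# (field code, text) for each line of the sample outline, without the list numerals.
lines = [
    ("T", "Outline of Presentation"),
    ("H", "Define key terminology"),
    ("H", "Thesaurus tools"),
    ("SH", "Features"),
    ("SH", "Functions"),
    ("H", "Costs"),
    ("SH", "Thesaurus construction"),
    ("SH", "Thesaurus tools"),
    ("H", "Why & when"),
]


def build_index(lines, stop_words):
    """Map each non-stop word to its (line, position, field) occurrences."""
    index = defaultdict(list)
    for line_no, (field_code, text) in enumerate(lines, start=1):
        for position, word in enumerate(text.lower().split(), start=1):
            if word not in stop_words:  # stop words still consume a position
                index[word].append((line_no, position, field_code))
    return dict(index)


index = build_index(lines, STOP_WORDS)
print(index["thesaurus"])
print(index["when"])
```

Keeping the position and field with each posting is what later allows phrase searching and field-weighted ranking; the simple index only answers whether a word occurs at all.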
  • 77. The Portal View - MediaSleuth  Use all options for search  Traditional Search  Taxonomy  Rule Base
  • 78. NavTree View MAIQuery
  • 79. Taxonomy Thesaurus view Term Record view
  • 80. Search Presentation Layer Automatic completion And type ahead from Thesaurus
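Type-ahead from a thesaurus amounts to a prefix lookup over the term list. The sketch below uses a sorted list and binary search; the `Autocompleter` class is illustrative, and a production system would usually also offer non-preferred terms as entry points.

```python
import bisect


class Autocompleter:
    """Case-insensitive prefix suggestions over a fixed list of thesaurus terms."""

    def __init__(self, terms):
        # Sort by folded key so that all terms sharing a prefix are adjacent.
        self._entries = sorted((term.casefold(), term) for term in terms)

    def suggest(self, prefix, limit=5):
        key = prefix.casefold()
        start = bisect.bisect_left(self._entries, (key,))
        suggestions = []
        for folded, term in self._entries[start:]:
            if not folded.startswith(key) or len(suggestions) == limit:
                break
            suggestions.append(term)
        return suggestions


completer = Autocompleter([
    "Presidential elections", "Politics", "Mayoral elections",
    "Elections", "Gubernatorial elections",
])
print(completer.suggest("p"))
print(completer.suggest("El"))
```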
  • 81. Search Presentation Layer Related Narrower
  • 82. Search Presentation Layer The Hierarchical view of the thesaurus is also a browsable view of the content. The numbers include the number of hits 1. For the term 2. For the branch
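The two numbers shown next to each term can be computed from the hierarchy and the indexing data. In this sketch the document ids are invented, and the branch count is assumed to mean distinct documents indexed with the term or any of its narrower terms; the slide does not say whether the real display deduplicates.

```python
# Hypothetical data: NT links and the documents indexed with each term.
narrower = {
    "Politics": ["Elections"],
    "Elections": ["Presidential elections", "Mayoral elections"],
}
indexed = {
    "Politics": {1},
    "Elections": {1, 2},
    "Presidential elections": {3, 4},
    "Mayoral elections": {4, 5},
}


def branch_documents(term):
    """Documents indexed with the term or anything below it (assumes no cycles)."""
    documents = set(indexed.get(term, ()))
    for child in narrower.get(term, ()):
        documents |= branch_documents(child)
    return documents


def hit_counts(term):
    """Return (hits for the term itself, hits for its whole branch)."""
    return len(indexed.get(term, ())), len(branch_documents(term))


for term in ("Politics", "Elections", "Mayoral elections"):
    print(term, hit_counts(term))
```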
  • 83. Taxonomy Thesaurus view Term Record view
  • 84. Web Taxonomies – Changing faces  ….and how the information is delivered  From current site  To new version  Depends on TAXONOMY  Personalization  Feeding ads  Consistent information
  • 87. Use the taxonomy here HTML Headers META NAME KEYWORD
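One way to carry taxonomy descriptors into the page header is to generate the keywords META tag from the terms assigned to the page. A minimal sketch; the function name `keywords_meta` and the sample terms are illustrative.

```python
from html import escape


def keywords_meta(terms):
    """Build an HTML keywords META tag from a page's assigned taxonomy terms."""
    content = ", ".join(terms)
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


print(keywords_meta(["Elections", "Mayoral elections"]))
print(keywords_meta(["Research & development"]))
```

Escaping matters because terms can legitimately contain ampersands or quotation marks.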
  • 88. To personalize or profile
  • 89. Improve Search: www.mediasleuth.com Autocompletion Using the Taxonomy Guide the User Navigate the full Taxonomy “Tree” “Indispensable for anyone trying to identify instructional media for teaching.” – CHOICE Magazine
  • 90. Link to Society Resources [Diagram of linked resources: CME Activity on Topic A; Upcoming Conference on Topic A; Other Journal Articles on Topic A; Job Posting for Expert on Topic A; Journal Article on Topic A; Grant Available for Researchers Working on Topic A; Podcast Interview with Researcher Working on Topic A] CONFIDENTIAL
  • 91. Author Submission Module The author pastes the data to the document template, attaching images, graphs, as necessary:
  • 93. Authors at a place MASHUP locations to a GPS grid of an area
  • 94. Watch Crime in action
  • 98. More Like This - Recommender [Screenshot: an article page from Cancer Epidemiology Biomarkers & Prevention (Vol. 12, 161-164, February 2003), “Alcohol, Folate, Methionine, and Risk of Incident Breast Cancer in the American Cancer Society Cancer Prevention Study II Nutrition Cohort,” shown alongside panels of related content: Related Press Releases, Related AACR Workshops and Conferences, Related Meeting Abstracts, Related Working Groups, Related Think Tank Report, Related Education Book Content, Related Webcasts, Related Awards]
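A "more like this" panel can be driven by shared descriptors: items indexed with overlapping taxonomy terms are offered as related. The sketch below ranks by Jaccard overlap, which is only one possible scoring; the item names and descriptor sets are invented for illustration and are not the actual AACR indexing.

```python
def more_like_this(target, catalog, limit=3):
    """Rank other items by the share of taxonomy descriptors they have in common with the target."""
    target_terms = catalog[target]
    scored = []
    for item, terms in catalog.items():
        shared = target_terms & terms
        if item != target and shared:
            score = len(shared) / len(target_terms | terms)  # Jaccard overlap
            scored.append((score, item))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:limit]]


# Hypothetical items and descriptors.
catalog = {
    "article: alcohol, folate, breast cancer": {"Alcohol", "Folate", "Breast cancer"},
    "abstract: folate, alcohol, colorectal adenoma": {"Alcohol", "Folate", "Colorectal adenoma"},
    "abstract: dietary folate, prostate cancer": {"Folate", "Prostate cancer"},
    "press release: COX-2 in smokers": {"COX-2", "Smoking"},
}
print(more_like_this("article: alcohol, folate, breast cancer", catalog))
```

Because every content type is indexed with the same vocabulary, the same function can link articles to abstracts, press releases, webcasts, or awards.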
  • 99. Thesaurus Resources • American Society for Information Science and Technology – www.asis.org/ • ANSI/NISO Standard Z39.19-1993 – www.niso.com • Australian Society of Indexers – www.aussi.org/ • Data Harmony – www.dataharmony.com  International Society for Knowledge Organization  http://www.iskouk.org/index.htm  Networked Knowledge Organization Systems  http://nkos.slis.kent.edu/ • SLA Taxonomy Division – SLA Taxonomy and Metadata (wiki.sla.org/display/SLATAX/Home) • Taxonomy Community of Practice
  • 100. Readings – Thesaurus Construction  Thesaurus Construction and Use: A Practical Manual. Fourth edition has taxonomy information http://www.alibris.com/search//search/search.cfm?wauth=Aitchison%2C%20Jean%20Gilchrist%2C Aitchison, Jean - Gilchrist, Alan - Bawden, David  NISO Z39.19 (2005) standard NOT the 2003  http://www.asindexing.org/site/thesbuild.shtml American Society for Indexers - a good practical approach  Books about the process also include the ones listed here http://www.asindexing.org/site/bibliog.shtml  There is also a series of white papers and other information on the web site at www.dataharmony.com  SLA Taxonomy Division
  • 101. Review  Why build a taxonomy?  What is a taxonomy?  What are the standards?  Where are taxonomies used?  What are the parts of a taxonomy?  How do you build one?  How do you implement one?
  • 102. Talks upcoming  “Data Visualization” workshop at Computers in Libraries – April 11  SLA Taxonomy Division workshop “How to build a taxonomy” June 8  Session “How to Apply Your Taxonomy to Your Content” Monday June 10
  • 103. Thank you! Marjorie M.K. Hlava* President and Chief Scientist Access Innovations, Inc. mhlava@accessinn.com 505-998-0800 * Our team of 37 has built over 200 taxonomies and implemented more than 600 for enterprises, governments and not for profits. We built tools to do the work as well and are glad to share them with you for your projects

Editor's Notes

  1. Thanks to Helen Atkins of AACR for this illustration. The real power of this is that the links can all go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web.”