www.ontopedia.net
O N T O P E D I A
The Identity of Everything
Computing and Linguistics
A Cognitive Approach
or, Computing “As We May Think”
Steve Pepper
pepper.steve@gmail.com
University of Oslo, 2009-04-21
Today’s “research questions”
 How can linguistics – and in particular cognitive
linguistics – inform our work with Topic Maps?
 Can Topic Maps contribute in any way to the
cognitive linguistics project?
 Plan of action
– I tell you about Topic Maps (conceptual model)
– I draw some parallels with natural language
– You correct me, elaborate and suggest new directions
Relevance to you as linguists
 As users of the technology
– organizing data collected in your research
 As consultants to users of the technology
– e.g. universities, government agencies, private enterprise
 As contributors to the standard
– clarify some of the cognitive issues, establish best
practices, help extend the standard
 As lobbyists to the University of Oslo
– if you think the new UiO web site should be based on
Topic Maps, please make your views known to the project
group: http://www.admin.uio.no/prosjekter/nyuioweb/
Relevance in general
1. We need to organize information in a new way
– The summation of human experience is being
expanded at a prodigious rate, and the means we
use for threading through the consequent maze
to the momentarily important item is the same
as was used in the days of square-rigged ships.
(Vannevar Bush, As We May Think, 1945)
2. We need new ways of managing knowledge
– In today’s global knowledge economy, knowledge is
the key asset in many organizations...
 Topic Maps makes major contributions in both areas
– See the use cases presented at recent Topic Maps conferences
http://www.topicmaps.com
What is Topic Maps?
 An ISO standard for computer-based information
and knowledge management
– “Provides the ability to control infoglut and share knowledge
by connecting any kind of information from any kind of source
based on its meaning”
 A “semantic technology”
– Cf. Semantic Web (RDF, OWL)
– A form of knowledge representation (primitive perhaps, but useful)
 Widely used for web-based delivery of information
– Plus: Information Integration, eLearning, Business Process
Modeling, Product Configuration, Business Rules Management,
Asset Management, Knowledge Management, …
The problem with computing...
 ...is that it’s inside-out!
 People used to think the sun
revolved around the earth
– Copernicus’ heliocentric theory
turned this idea inside out and
revolutionized our understanding
of the universe
 Today we face a similar
situation in computing
– Our computing universe has
computers, applications and
documents at the centre
– The concepts that our information is
about are somewhere in outer space
where they can’t be found
A subject-centric revolution
 This is wrong, because it does
not reflect how humans think
– We think in terms of interrelated
concepts (or subjects)
– Subjects are what interest us, not
documents or applications
– And so subjects must be given
centre stage
 We need a subject-centric
revolution
– This has ramifications for every
aspect of human-computer
interaction, including user interfaces,
operating systems, file systems, etc.
– Consider the typical user desktop...
Today our desktops are application-centric and document-centric
Icons represent applications and documents
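[Figure: a subject-centric desktop with icons for subjects such as topic maps, tm2008, bantu, semantics, LING 2110, INF 2820, rana, keynote, OOXML, K185, gambia, opera, janacek, bayreuth and håkon. The context menu for the TM2008 icon offers: Topic page, Emails, Documents, Web pages, Copy PSI]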
 Why can’t they be subject-centric, with icons that
represent the subjects we are interested in?
 With links between related icons?
 And with context menus that allow us to find
everything related to a particular subject?
Computing “As We May Think”
 Bush’s solution to information overload:
– Organize information “As We May Think”, i.e. associatively
 His vision spawned the hypertext movement
– Doug Engelbart, Ted Nelson, Bill Atkinson, Tim Berners-Lee, ...
– The World Wide Web is its greatest triumph to date
 But hypertext does not correspond to how we think
– Our heads are not full of millions of interlinked documents
– They are full of “interlinked” concepts (or subjects)
 Topic Maps provides a close approximation to this
– It is a technology that is based on cognitive principles
Background to Topic Maps
 Emerged from the SGML community in the 1990s
– Use case: How to merge (digital) back-of-book indexes
– Some input from library science
– No input from linguists
– Precious little input from computer scientists before 2001
– Most of the SGML community came from the humanities
 ISO 13250 first published in 2000 (recently revised)
– A model for representing knowledge organization structures
(indexes, glossaries, thesauri, encyclopedias)
– Plus interchange syntax, query language, constraint language, ...
 Widely adopted in Norway (esp. public sector)
– And gaining ground elsewhere
The TAO of Topic Maps
 The core concepts are derived
from the back-of-book index
 Extended and generalized for
use with digital information
 Consider a two-layer model
consisting of
– a set of information resources (below)
– a “knowledge map” (above)
 This is like the division of a
book into content and index
[Figure: a knowledge layer (INDEX) above an information layer (CONTENT), illustrated by the following sample index]
Callas, Maria …………………… 42
Cavalleria Rusticana … 71, 203-204
Mascagni, Pietro
Cavalleria Rusticana . 71, 203-204
Pavarotti, Luciano ……………… 45
Puccini, Giacomo ………. 23, 26-31
Tosca ………………. 65, 201-202
Rustic Chivalry, see Cavalleria
Rusticana
singers ………………………. 39-52
baritone ………………………. 46
bass ……………………….. 46-47
soprano ……………… 41-42, 337
tenor ………………………. 44-45
see also Callas, Pavarotti
Tosca ………………… 65, 201-202
(1) The information layer
 The lower layer contains the content
– usually digital, but need not be
– can be in any format or notation or location
– can be text, graphics, video, audio – whatever
 This is like the content of the book to which the
back-of-book index belongs
[Figure: the information layer (CONTENT)]
(2) The knowledge layer
 The upper layer consists of (typed) topics and
associations
– Topics represent the subjects that the information is about
 Like the list of topics that forms a back-of-book index
– Associations represent relationships between those subjects
 Like “see also” relationships in a back-of-book index
[Figure: the knowledge layer (INDEX) for the domain Italian opera, with the topics Puccini, Tosca, Lucca and Madame Butterfly linked by “composed by” and “born in” associations]
Occurrences link the layers
 Occurrences represent
relationships between
information resources and
the subjects that they are
“about”
 The links (or locators) are
like page numbers in a
back-of-book index
 Occurrences can
also be typed (e.g.
bio, map, synopsis)
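[Figure: occurrences link the topics Puccini, Tosca, Lucca and Madame Butterfly in the knowledge layer, themselves connected by “composed by” and “born in” associations, to resources in the information layer]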
Summary of core concepts
A pool of information or data (the information layer), and
a knowledge layer consisting of
Topics
– a set of topics representing the key subjects of the domain in question
Associations
– representing relationships between subjects
Occurrences
– links to information that is somehow relevant to a given subject
= The TAO of Topic Maps
Plus: topic types, association types, occurrence types, each of which
is represented by a topic...
[Figure: the topics Puccini, Tosca, Lucca and Madame Butterfly, linked by “born in” and “composed by” associations]
Let’s look at some TAOs in the Omnigator…
About the Omnigator
 A free topic map browser from Ontopia
– Download from http://www.ontopia.net (part of “OKS Samplers”)
– Java-based, runs on any computer
 Completely generic
– Not optimized for any particular ontology
– Display and navigate any conforming topic map
 A teaching aid
– Not designed for end-users (no attempt to hide technical jargon)
– Also used for prototyping and debugging
 Not to be used for most real world applications!
– These require custom interfaces based on a specific ontology
– (see http://www.topicmaps.com for a good example)
Omnigator interface
[Figure (demo): a typical topic page in the Omnigator, showing the current topic, its identifier(s), multiple (typed) names, topic type(s), typed occurrences (internal and external) and typed associations]
Typing topics revisited
 Basic building blocks of the TAO model are
– Topics: e.g. “Puccini”, “Lucca”, “Tosca”
– Associations: e.g. “Puccini was born in Lucca”
– Occurrences: e.g. “http://www.opera.net/puccini/bio.html
is a biography of Puccini”
 Each of these constructs can be typed
– Topic types: “composer”, “city”, “opera”
– Association types: “born in”, “composed by”
– Occurrence types: “biography”, “street map”, “synopsis”
 All such types are also topics
– The set of typing topics constitutes an ontology
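The building blocks listed above can be sketched as a small data structure. This is only an illustrative sketch in plain Python, not an implementation of ISO 13250: the class names, field names and the helper `typing_topics` are assumptions made for this example, and the role types (person, place, composer, work) are borrowed from the Italian Opera examples later in the talk.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # topics are compared by identity, not by field values
class Topic:
    name: str
    types: "list[Topic]" = field(default_factory=list)
    occurrences: "list[Occurrence]" = field(default_factory=list)


@dataclass(eq=False)
class Occurrence:
    type: Topic    # occurrence type, e.g. "biography"
    locator: str   # address of the information resource


@dataclass(eq=False)
class Association:
    type: Topic                         # association type, e.g. "born in"
    roles: "list[tuple[Topic, Topic]]"  # (role type, role-playing topic)


# Typing topics (hypothetical variable names).
composer, city, opera = Topic("composer"), Topic("city"), Topic("opera")
born_in, composed_by = Topic("born in"), Topic("composed by")
biography = Topic("biography")
person, place, work = Topic("person"), Topic("place"), Topic("work")

# Instance topics.
puccini = Topic("Puccini", types=[composer])
lucca = Topic("Lucca", types=[city])
tosca = Topic("Tosca", types=[opera])

puccini.occurrences.append(
    Occurrence(biography, "http://www.opera.net/puccini/bio.html"))

associations = [
    Association(born_in, [(person, puccini), (place, lucca)]),
    Association(composed_by, [(composer, puccini), (work, tosca)]),
]


def typing_topics(topics, associations):
    """Collect the names of all topics that are used as types."""
    found = set()
    for topic in topics:
        found.update(t.name for t in topic.types)
        found.update(o.type.name for o in topic.occurrences)
    for assoc in associations:
        found.add(assoc.type.name)
        found.update(role.name for role, _ in assoc.roles)
    return found


print(sorted(typing_topics([puccini, lucca, tosca], associations)))
```

Because every type is itself a `Topic`, the set returned by `typing_topics` is a rough stand-in for what the slide calls the ontology.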
Capabilities of the TAO model (1)
 Represent subjects explicitly
– Topics represent the “things” users are interested in
 Capture relationships between subjects
– Associations provide user-friendly navigation paths to information
(navigation “as we may think”)
– Associations also promote serendipitous knowledge discovery
through browsing
 Make information findable
– Topics provide a “one-stop-shop” for everything that is known
about a subject (collocation of information and knowledge)
– Occurrences allow information about a common subject to be
aggregated across multiple systems, irrespective of location
Capabilities of the TAO model (2)
 Represent taxonomies and thesauri
– Associations can (also) represent hierarchical relationships
– With Topic Maps you can have multiple, interlinked hierarchies
and faceted classification
 Transcend simple hierarchies
– Rich associative structures capture the complexity of
knowledge and reflect the way people think
 Manage knowledge
– The topic map is the embodiment of “organizational memory”
– Provides a structured way to capture people’s knowledge of
things, events, relationships, etc.
Beyond the TAO
 Formal data model
– Topic maps can be queried, e.g.
– Give me all composers that composed operas that were
based on plays that were written by Shakespeare
 Interchange syntax
– Topic maps can be interchanged
– Increased reuse = added value
 Robust identity model
– Topic maps can be merged
– Potential to federate knowledge
 Scope
– Topic maps can capture context
 Reification
– Topic maps can express different levels of detail
– Similar to scaling in cartography
For more details,
see Pepper 2009
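The sample query on this slide (all composers that composed operas based on plays written by Shakespeare) can be sketched as a chain of three lookups. The sketch below is plain Python over a handful of illustrative facts; it does not use the standard's query language, and the triple layout and the function name are assumptions made for the example. Real topic map associations use roles rather than a fixed subject/object order.

```python
# Illustrative facts as (subject, association type, object) triples.
FACTS = [
    ("Otello", "composed by", "Verdi"),
    ("Falstaff", "composed by", "Verdi"),
    ("Tosca", "composed by", "Puccini"),
    ("Otello", "based on", "Othello"),
    ("Falstaff", "based on", "The Merry Wives of Windsor"),
    ("Tosca", "based on", "La Tosca"),
    ("Othello", "written by", "Shakespeare"),
    ("The Merry Wives of Windsor", "written by", "Shakespeare"),
    ("La Tosca", "written by", "Sardou"),
]


def composers_of_operas_based_on_plays_by(author):
    """Follow written by -> based on -> composed by, starting from the author."""
    plays = {s for s, a, o in FACTS if a == "written by" and o == author}
    operas = {s for s, a, o in FACTS if a == "based on" and o in plays}
    return {o for s, a, o in FACTS if a == "composed by" and s in operas}


print(sorted(composers_of_operas_based_on_plays_by("Shakespeare")))  # ['Verdi']
```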
Break – any questions so far?
After the break:
Topic Maps and natural
language – towards a
linguistic perspective
Parallels with natural language
 Basic grammatical classes ⇒ TAO model
 Nouns and verbs ⇒ Topics and associations
 Nominals and nouns ⇒ Topics and their types
 Clauses and verbs ⇒ Associations and their types
 Valency ⇒ Arity
 Semantic roles ⇒ Association roles
 Categories and schemas ⇒ Typing topics
 Hyponymy ⇒ Type hierarchies
 Synonymy and homonymy ⇒ Naming
 Nominalization ⇒ Reification
 Grounding / co-reference ⇒ Subject identity / collocation
 Information structure ⇒ Navigation
Basic principles, basic classes
 In elementary school, I was taught that a noun is the name of a person,
place, or thing. In college, I was taught the basic linguistic doctrine that a
noun can only be defined in terms of grammatical behavior, conceptual
definitions of grammatical classes being impossible. Here, several
decades later, I demonstrate the inexorable progress of grammatical
theory by claiming that a noun is the name of a thing.
(Langacker 2008)
 The basic grammatical classes are nouns and verbs
– They prototypically profile things and relationships
 They correspond to topics and associations
Grounding
 “Grounding is characteristic of the structure referred to in CG
as nominals and finite clauses. More specifically, a nominal
or a finite clause profiles a grounded instance of a thing or
process type.”
 “A noun designates a type of thing, and a verb a type of
process.”
 “A nominal or a finite clause profiles a grounded instance of
a thing or process type.”
 Nominal grounding (determiners and quantifiers)
– the, this, that, some, a, each, every, no, any
 Clausal grounding (mood and tense)
– -s, -ed, may, will, should
Langacker 2008: 259ff (esp. 264)
Nouns and nominals
 Topic types represent classes of topics
– Conceptual “groupings of things”, e.g. composer, opera, city, ...
– They correspond to Langacker’s nouns (“types of thing”)
However, topics can have multiple names
– (This is how we handle synonymy and multilingualism)
– In one sense it is topic names that correspond to nouns
 Topic instances represent individual subjects
– They correspond to Langacker’s nominals (“instances of types”)
– Their names are typically proper nouns, e.g. Puccini, Tosca, Lucca
Verbs and clauses
 Association types represent classes of relationships
– They correspond to Langacker’s verbs (“types of process”)
– (Often named accordingly, e.g. born in, composed by, killed by, ...)
 Individual associations represent specific relationships
– They correspond to Langacker’s clauses (“instances of processes”)
– e.g. Puccini was born in Lucca; Tosca was composed by Puccini
Langacker distinguishes processes (temporal) and non-processual
relationships (non-temporal). The latter are (prototypically) profiled
by adjectives, adverbs, prepositions, and participles. This distinction
is not made explicitly in Topic Maps.
 Note: There are two predefined association types
– type-instance (the relationship between a topic and its type)
– supertype-subtype (a relationship between types, see Hyponymy)
Valency
 Associations can involve one, two or more topics
– Binary associations, e.g. Puccini composed Tosca, are most
common and correspond to transitive verbs
– Ternary associations, e.g. Tosca killed Scarpia with a knife, can
correspond to ditransitive verbs
– Unary associations, e.g. Turandot was unfinished, correspond
(sort of) to intransitive verbs (or binary properties)
 The arity of an association
– Corresponds to the valency of a verb
Semantic roles
 An association does not have “directionality”
 Instead of direction, Topic Maps uses roles
– Roles are classified by type
– Role types specify the nature of each topic’s involvement
in the relationship. They correspond to semantic roles.
– (Role types are also topics)
 Role types are different from topic types...
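[Figure: the relationship between Puccini and Tosca in RDF (labelled “composed” or “composed by”) and in Topic Maps (Puccini plays the role composer, Tosca plays the role work)]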
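A minimal sketch of the role-based idea, in plain Python: the association stores no direction, only which topic plays which role. The class layout and the helper names `players` and `other_players` are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    type: str
    roles: tuple  # (role type, role player) pairs; their order carries no meaning


composed = Association("composed", (("composer", "Puccini"), ("work", "Tosca")))


def players(assoc, role_type):
    """Return the topics that play the given role in the association."""
    return [player for role, player in assoc.roles if role == role_type]


def other_players(assoc, current):
    """Seen from the current topic, list the other participants with their roles."""
    return [(role, player) for role, player in assoc.roles if player != current]


print(players(composed, "composer"))     # ['Puccini']
print(other_players(composed, "Tosca"))  # [('composer', 'Puccini')]
```

The same association can be read from either end: starting at Tosca we find its composer, starting at Puccini we find his work, without storing two directed statements.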
Roles and types
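[Figure: the association “composed” (A) with two roles (R). The topic (T) Puccini, of topic type composer, plays the role composer; the topic (T) Tosca, of topic type opera, plays the role work. The association type, role types and topic types are themselves topics (T)]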
The role type can be
– the same as the role playing topic’s topic type (composer = composer)
– a supertype of the topic type (work > opera)
– a subtype of the topic type (teacher < person)
– a subtype of the topic type’s supertype (source < work)
Association roles Semantic roles
 Italian Opera Topic Map
– composed: composer, work
– born in: person, place
– appears in: character, work
– based on: source, result
– revision of: source, result
– part of: part, whole
– exponent of: person, style
– located in: container, containee
– pupil of: teacher, pupil
 Association roles tend to be much more specific
– Variable practice – as yet no established conventions
– Might (cognitive) linguists have something to offer here?
 (Frawley 1992)
– logical actors: agent, author, instrument
– logical recipients: patient, experiencer, benefactive
– spatial roles: theme, source, goal
– non-participant roles: locative, reason, purpose
Naming of associations
 Intuitive naming requires
flexibility
– i.e. multiple AT names that change
depending on the “direction” of the
association
 Puccini was born in Lucca
 Lucca was the birthplace of Puccini
 Alternative CG view
– Naming should be based on whether
the agent or the theme is in focus
 The focus becomes the trajector
– Point of focus = Current topic
 Some strategies...
 Voice-based
– Active / passive forms of the verb
 composed (V, active) / composed (V, passive) by
– Works well in SVO languages.
Less satisfactory with SOV.
 Role-based
 teacher (N) of / pupil (N) of
 Nominalization
 composition
– Tends to be used by Japanese,
Koreans (and Germans??)
 Combinations
 born (V) in / birthplace (N) of
 part (N) of / consists (V) of
Categories and prototypes
 Topic types define categories of things
– But are they Aristotelian or prototypical categories?
 Aristotelian
– Category membership is binary
– All instances are equally representative. No standard notion of “similarity”.
 Prototypical
– Not defined by “necessary and sufficient conditions” (cf. OWL)
 The decision is up to the conceptualizer (a.k.a. topic map author)
– A topic can have more than one type
 Boïto is a composer and a librettist
– The same topic can be a topic type and a role type
 e.g. Puccini is a composer; Puccini plays the role of composer in …
– Should we establish conventions for goodness of example?
 Could be useful in automated classification
Schemas and constraints
 Other types can also be said to define categories
– association types, (occurrence types, name types, role types)
 But these are more schematic (in the CG sense)
– Schemas are “abstract templates obtained by reinforcing the
commonality inherent in a set of instances”
(Langacker 2008, p.23, in the context of grammatical rules)
 Rules can be defined as templates and constraints:
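[Figure: template for the “composed” association (A) with two roles (R): a topic (T) of type composer plays the role composer, and a topic (T) of type opera plays the role work. The two role players are the elaboration sites, labelled AGENT and THEME]
“Puccini composed Tosca”
The composer Puccini plays the role of composer in the “composition” relationship in which the role of work is played by the opera Tosca.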
Hyponymy
 Topic Maps has two predefined association types:
– type-instance (relationship between a topic and its type)
– supertype-subtype (relationship between the denotations of a
hyponym and its hyperonym)
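[Figure: a type hierarchy. Mammal has the subtypes Primate and Canine; Primate has the subtypes Human and Chimp; Canine has the subtypes Wolf and Dog; Steve and Ron are instances. Legend: types, instances, supertype-subtype, type-instance]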
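The two predefined association types can be sketched with two lookup tables. The hierarchy is the one from the slide; the assumption that Steve is an instance of Human is mine (the extracted figure does not say which type each instance belongs to), and the function names are hypothetical. The sketch follows the usual reading that supertype-subtype is transitive and that an instance of a subtype also counts as an instance of its supertypes.

```python
# supertype-subtype: each subtype points to its supertype.
SUPERTYPE = {
    "Primate": "Mammal", "Canine": "Mammal",
    "Human": "Primate", "Chimp": "Primate",
    "Wolf": "Canine", "Dog": "Canine",
}

# type-instance: each instance points to its direct type (assumed for illustration).
TYPE_OF = {"Steve": "Human"}


def supertypes(type_name):
    """Walk up the hierarchy and return all supertypes, nearest first."""
    chain = []
    while type_name in SUPERTYPE:
        type_name = SUPERTYPE[type_name]
        chain.append(type_name)
    return chain


def is_instance_of(topic, type_name):
    """True if the topic's direct type is type_name or one of its subtypes."""
    direct = TYPE_OF.get(topic)
    if direct is None:
        return False
    return direct == type_name or type_name in supertypes(direct)


print(supertypes("Human"))                # ['Primate', 'Mammal']
print(is_instance_of("Steve", "Mammal"))  # True
print(is_instance_of("Steve", "Canine"))  # False
```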
Synonymy and homonymy
 Synonyms
– One subject, multiple names
– In thesauri: USE and USED FOR
 TMs are subject-centric
– A topic can have multiple names
– Names can be typed
 Typical name types:
– nickname, synonym, alternate name
– Context can be expressed using scope
 Typically names in different natural languages
– composer, komponist, 작곡가 , ...
– Names can also have “variants”
 Often used to capture orthographic variation:
– Tchaikovsky, Чайко́вский, Tsjajkovskij,
Tschaikowski
 Also useful for sort names, pronunciation, etc.
 Homonyms
– One name, multiple subjects
– In thesauri: problematic
 TMs are based on identifiers
– Same name can be used by more than
one topic
– Disambiguation in UI is left to the
application
– Two main disambiguation strategies
 Default: qualify by type, e.g.
– Tosca (opera) vs. Tosca (character)
 Fallback: qualify by some other
relationship, e.g.
– Paris (France) vs. Paris (Texas)
– La Bohème (Puccini) vs. La Bohème
(Leoncavallo)
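The default disambiguation strategy (qualify by type) might look like this in an application's user interface layer. This is a sketch only: the dictionary layout, the topic ids and the function name `display_names` are assumptions for the example, not part of any Topic Maps API.

```python
from collections import Counter

# Hypothetical topics; ids stand in for real identifiers.
topics = [
    {"id": "tosca-opera", "name": "Tosca", "type": "opera"},
    {"id": "tosca-character", "name": "Tosca", "type": "character"},
    {"id": "lucca", "name": "Lucca", "type": "city"},
]


def display_names(topics):
    """Qualify a name by its topic type only when the name is ambiguous."""
    counts = Counter(t["name"] for t in topics)
    labels = {}
    for t in topics:
        if counts[t["name"]] > 1:
            labels[t["id"]] = f'{t["name"]} ({t["type"]})'
        else:
            labels[t["id"]] = t["name"]
    return labels


for topic_id, label in display_names(topics).items():
    print(topic_id, "->", label)
```

When two homonymous topics also share a type (the two operas called La Bohème), this sketch would give both the same label, which is where the fallback strategy of qualifying by some other relationship comes in.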
Nominalization
 (A topic map consists of assertions about subjects)
 Assertions are made using statements:
– names, e.g. a certain subject has the name “Tosca”
– associations, e.g. “Tosca is set in Rome”
– occurrences, e.g. “http://en.wikipedia.org/wiki/Rome is a web
page about Rome”
 Any statement can be reified
– Reification results in a topic that has the same referent as the
reified statement
– e.g. Tosca is set in Rome (association) ⇒ The setting of Tosca in Rome (topic)
– The (new) reifying topic can have names and occurrences,
and it can play roles in associations
Nominalization: derivation of nouns from other words, including verbs,
adjectives etc., e.g. meet (V) ⇒ meeting (N)
Subjects and topics
 Topics represent subjects
– the topic is the representation
– the subject is the referent
 Or, in Saussure’s terms
– signifiant and signifié
 A subject can be anything:
A subject is any “thing” whatsoever,
whether or not it exists or has any other
specific characteristics, about which
anything whatsoever may be asserted
by any means whatsoever.
 Is the topic/subject pairing a
symbolic assembly?
[Figure: a subject in the real world, represented by a topic (T) in the computer domain]
Co-reference and collocation
 Grounding singles out referents and enables co-reference
– between speaker and listener
– across a sequence of utterances
 In Topic Maps the central objective is collocation
– By definition, each topic represents a single subject (one subject per topic)
– A topic is intended to be a point of collocation for everything that is known
about a particular subject
– Therefore the goal is to have only one topic per subject
 To achieve that we need to know which subject a topic
represents
– (This is sometimes referred to as the “intentionality” of the relation
between a symbol and its referent.)
– We call it subject identity.
Subject identity
 The identity of a subject is expressed using globally
unique identifiers called subject identifiers
– If two topics share a subject identifier, they are deemed to
represent the same subject and must be merged
[Figure: the TOPICS Madame Butterfly, Tosca, Lucca and Puccini, each linked to the SUBJECT it represents]
Subject identifiers
The subject is identified by a URL
• The URL is called a subject identifier
[Figure: the topic “Giacomo Puccini” with the subject identifier http://psi.ontopedia.net/Giacomo_Puccini]
The URL is the address of a web page
• The web page describes the subject such that a human can know what
subject is referred to
• This web page is called a subject descriptor
[Figure: the subject descriptor at http://psi.ontopedia.net/Giacomo_Puccini, which reads: “Giacomo Puccini. Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is one of the most popular and well-known.”]
Humans use the descriptor
By inspecting the web page the person responsible for assigning the
identifier can be sure that it does not refer to, say, Giacomo’s
grandfather Domenico (who was also a composer of operas)
Machines use the identifier
The link is not resolved. Instead simple lexical comparison is used.
If the strings are identical, the subject is deemed to be the same and
the topics are merged.
Is the subject identifier/subject descriptor pairing a symbolic assembly?
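The merging rule described here (compare identifier strings, never resolve them) can be sketched as follows. This is a simplified illustration, not the merging procedure of the standard: topics are reduced to a set of names and a set of subject identifiers, the function name is an assumption, and the identifier used for Domenico Puccini is a made-up example.org address.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject identifier string."""
    merged = []
    for topic in topics:
        ids = set(topic["subject_identifiers"])
        names = set(topic["names"])
        kept = []
        for other in merged:
            # Plain string equality: the URL is never fetched.
            if other["subject_identifiers"] & ids:
                ids |= other["subject_identifiers"]
                names |= other["names"]
            else:
                kept.append(other)
        kept.append({"names": names, "subject_identifiers": ids})
        merged = kept
    return merged


topics = [
    {"names": {"Giacomo Puccini"},
     "subject_identifiers": {"http://psi.ontopedia.net/Giacomo_Puccini"}},
    {"names": {"Puccini"},
     "subject_identifiers": {"http://psi.ontopedia.net/Giacomo_Puccini"}},
    {"names": {"Domenico Puccini"},  # hypothetical identifier below
     "subject_identifiers": {"http://example.org/psi/Domenico_Puccini"}},
]

for topic in merge_topics(topics):
    print(sorted(topic["names"]))
```

The two topics that share the Giacomo Puccini identifier collapse into one topic carrying both names, while the grandfather stays separate even though he shares a surname.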
Information structure
 Intuitive navigation is a key feature of Topic Maps
 But what is its cognitive basis?
– I claim that it corresponds to the way we think (i.e., associatively)
– Can linguistics back up this claim?
 topic vs. comment in linguistics (Bussmann, 487)
– “Analysis of sentences according to communicative criteria into the
topic (what is being talked about) and the comment (what is being
said about the topic)”
– “Analysis of utterances according to the communicative criteria of
given/known information vs. new information”
– Cf. theme vs. rheme in Halliday’s functional grammar
 Consider our earlier tour of Italian opera...
Navigation as narrative
Giacomo Puccini was a composer. He was born in Lucca in 1858.
Lucca is a city, located in Italy. It was the birthplace of
Puccini and Catalani.
Catalani was a composer who composed 5 operas. He died in Milan.
Milan is the home of La Scala, which was the venue for many
première performances, including that of Madame Butterfly.
Madame Butterfly is set in Nagasaki, which is located in Japan.
Japan is (also) the setting for Iris, [which is] an opera [which
was] composed by Mascagni, who was a pupil of Ponchielli who was
(also) the teacher of Puccini...
[The passage is repeated on the slide with its parts marked as THEME (new theme or continuing theme) and RHEME (predicate with potential new theme)]
Discussion
[Cartoon caption: “Now! … That should clear up a few things around here!”]
 Questions, comments, corrections?
– What have I missed? Where else should I look?
 What might linguists contribute?
– A better understanding of the nature of roles?
– Approaches to representing temporal knowledge?
– ...
 Can Topic Maps inform linguistics?
– After all, it is a technology that captures (some
degree of) (some form of) knowledge
– It seems to have a reasonable cognitive basis
– It emerged through usage (librarians, indexers, etc.)
– And last but not least, it works!
References
 Bussmann, H. Routledge Dictionary of Language and Linguistics
(London 1996)
 Frawley, W. Linguistic Semantics (Hillsdale 1992)
 Langacker, R. Cognitive Grammar (Oxford 2008)
 Pepper, S. Italian Opera Topic Map
– http://www.ontopedia.net/ItalianOpera
 Pepper, S. “Topic Maps” in Bates, M.J. and Maack, M.N. (eds)
Encyclopedia of Library and Information Sciences (CRC Press,
forthcoming 2009)
– http://www.ontopedia.net/pepper/papers/ELIS-TopicMaps.pdf

Weitere ähnliche Inhalte

Andere mochten auch

11. Creazione collettiva
11. Creazione collettiva11. Creazione collettiva
11. Creazione collettivaRoberto Polillo
 
Subtle patterns of learner language: 13 topics for further research
Subtle patterns of learner language: 13 topics for further researchSubtle patterns of learner language: 13 topics for further research
Subtle patterns of learner language: 13 topics for further researchSteve Pepper
 
No A14 Toll Tax on Suffolk Leaflet
No A14 Toll Tax on Suffolk LeafletNo A14 Toll Tax on Suffolk Leaflet
No A14 Toll Tax on Suffolk LeafletTim Meadows-Smith
 
Language processing patterns
Language processing patternsLanguage processing patterns
Language processing patternsRalf Laemmel
 
02 question-paper-maths-x
02 question-paper-maths-x02 question-paper-maths-x
02 question-paper-maths-xdinesh reddy
 
Topic Maps for the Three Kingdoms: The Many Applications of Topic Maps
Topic Maps for the Three Kingdoms: The Many Applications of Topic MapsTopic Maps for the Three Kingdoms: The Many Applications of Topic Maps
Topic Maps for the Three Kingdoms: The Many Applications of Topic MapsSteve Pepper
 

Andere mochten auch (7)

Dinesh
DineshDinesh
Dinesh
 
11. Creazione collettiva
11. Creazione collettiva11. Creazione collettiva
11. Creazione collettiva
 
Subtle patterns of learner language: 13 topics for further research
Subtle patterns of learner language: 13 topics for further researchSubtle patterns of learner language: 13 topics for further research
Subtle patterns of learner language: 13 topics for further research
 
No A14 Toll Tax on Suffolk Leaflet
No A14 Toll Tax on Suffolk LeafletNo A14 Toll Tax on Suffolk Leaflet
No A14 Toll Tax on Suffolk Leaflet
 
Language processing patterns
Language processing patternsLanguage processing patterns
Language processing patterns
 
02 question-paper-maths-x
02 question-paper-maths-x02 question-paper-maths-x
02 question-paper-maths-x
 
Topic Maps for the Three Kingdoms: The Many Applications of Topic Maps
Topic Maps for the Three Kingdoms: The Many Applications of Topic MapsTopic Maps for the Three Kingdoms: The Many Applications of Topic Maps
Computing and Linguistics: A cognitive approach

  • 1. www.ontopedia.net O N T O P E D I A The Identity of Everything Computing and Linguistics A Cognitive Approach or, Computing “As We May Think” Steve Pepper pepper.steve@gmail.com University of Oslo, 2009-04-21
  • 2. www.ontopedia.net O N T O P E D I A The Identity of Everything Today’s “research questions”  How can linguistics – and in particular cognitive linguistics – inform our work with Topic Maps?  Can Topic Maps contribute in any way to the cognitive linguistics project?  Plan of action – I tell you about Topic Maps (conceptual model) – I draw some parallels with natural language – You correct me, elaborate and suggest new directions
  • 3. www.ontopedia.net O N T O P E D I A The Identity of Everything Relevance to you as linguists  As users of the technology – organizing data collected in your research  As consultants to users of the technology – e.g. universities, government agencies, private enterprise  As contributors to the standard – clarify some of the cognitive issues, establish best practices, help extend the standard  As lobbyists to the University of Oslo – if you think the new UiO web site should be based on Topic Maps, please make your views known to the project group: http://www.admin.uio.no/prosjekter/nyuioweb/
  • 4. www.ontopedia.net O N T O P E D I A The Identity of Everything Relevance in general 1. We need to organize information in a new way – The summation of human experience is being expanded at a prodigious rate, and the means we use for threading through the consequent maze to the momentarily important item is the same as was used in the days of square-rigged ships. (Vannevar Bush, As We May Think, 1945) 2. We need new ways of managing knowledge – In today’s global knowledge economy, knowledge is the key asset in many organizations...  Topic Maps makes major contributions in both areas – See the use cases presented at recent Topic Maps conferences http://www.topicmaps.com
  • 5. www.ontopedia.net O N T O P E D I A The Identity of Everything What is Topic Maps?  An ISO standard for computer-based information and knowledge management – “Provides the ability to control infoglut and share knowledge by connecting any kind of information from any kind of source based on its meaning”  A “semantic technology” – Cf. Semantic Web (RDF, OWL) – A form of knowledge representation (primitive perhaps, but useful)  Widely used for web-based delivery of information – Plus: Information Integration, eLearning, Business Process Modeling, Product Configuration, Business Rules Management, Asset Management, Knowledge Management, …
  • 6. www.ontopedia.net O N T O P E D I A The Identity of Everything The problem with computing...  ...is that it’s inside-out!  People used to think the sun revolved around the earth – Copernicus’ heliocentric theory turned this idea inside out and revolutionized our understanding of the universe  Today we face a similar situation in computing – Our computing universe has computers, applications and documents at the centre – The concepts that our information is about are somewhere in outer space where they can’t be found
  • 7. www.ontopedia.net O N T O P E D I A The Identity of Everything A subject-centric revolution  This is wrong, because it does not reflect how humans think – We think in terms of interrelated concepts (or subjects) – Subjects are what interest us, not documents or applications – And so subjects must be given centre stage  We need a subject-centric revolution – This has ramifications for every aspect of human-computer interaction, including user interfaces, operating systems, file systems, etc. – Consider the typical user desktop...
  • 8. Today our desktops are application-centric and document-centric. Icons represent applications and documents.
  • 9. Why can’t they be subject-centric, with icons that represent the subjects we are interested in? With links between related icons? And with context menus that allow us to find everything related to a particular subject?
      [Figure: a desktop whose icons are subjects (topic maps, tm2008, bantu, semantics, LING 2110, INF 2820, rana, keynote, OOXML, K185, gambia, opera, janacek, bayreuth, håkon); the context menu for TM2008 offers Topic page, Emails, Documents, Web pages, Copy PSI]
  • 10. www.ontopedia.net O N T O P E D I A The Identity of Everything Computing “As We May Think”  Bush’s solution to information overload: – Organize information “As We May Think”, i.e. associatively  His vision spawned the hypertext movement – Doug Engelbart, Ted Nelson, Bill Atkinson, Tim Berners-Lee, ... – The World Wide Web is its greatest triumph to date  But hypertext does not correspond to how we think – Our heads are not full of millions of interlinked documents – They are full of “interlinked” concepts (or subjects)  Topic Maps provides a close approximation to this – It is a technology that is based on cognitive principles
  • 11. www.ontopedia.net O N T O P E D I A The Identity of Everything Background to Topic Maps  Emerged from the SGML community in 1990’s – Use case: How to merge (digital) back-of-book indexes – Some input from library science – No input from linguists – Precious little input from computer scientists before 2001 – Most of the SGML community came from the humanities  ISO 13250 first published in 2000 (recently revised) – A model for representing knowledge organization structures (indexes, glossaries, thesauri, encyclopedias) – Plus interchange syntax, query language, constraint language, ...  Widely adopted in Norway (esp. public sector) – And gaining ground elsewhere
  • 12. The TAO of Topic Maps
       The core concepts are derived from the back-of-book index
       Extended and generalized for use with digital information
       Consider a two-layer model consisting of
        – a set of information resources (below): the information layer (CONTENT)
        – a “knowledge map” (above): the knowledge layer (INDEX)
       This is like the division of a book into content and index
      Example index:
        Callas, Maria ........ 42
        Cavalleria Rusticana ........ 71, 203-204
        Mascagni, Pietro
          Cavalleria Rusticana ........ 71, 203-204
        Pavarotti, Luciano ........ 45
        Puccini, Giacomo ........ 23, 26-31
          Tosca ........ 65, 201-202
        Rustic Chivalry, see Cavalleria Rusticana
        singers ........ 39-52
          baritone ........ 46
          bass ........ 46-47
          soprano ........ 41-42, 337
          tenor ........ 44-45
          see also Callas, Pavarotti
        Tosca ........ 65, 201-202
  • 13. www.ontopedia.net O N T O P E D I A The Identity of Everything (1) The information layer  The lower layer contains the content – usually digital, but need not be – can be in any format or notation or location – can be text, graphics, video, audio – whatever  This is like the content of the book to which the back-of-book index belongs information layer(CONTENT)
  • 14. (2) The knowledge layer
       The upper layer consists of (typed) topics and associations
        – Topics represent the subjects that the information is about
           Like the list of topics that forms a back-of-book index
        – Associations represent relationships between those subjects
           Like “see also” relationships in a back-of-book index
      [Figure: the knowledge layer (INDEX) for the domain Italian opera, with the topics Puccini, Tosca, Lucca and Madame Butterfly linked by “composed by” and “born in” associations]
  • 15. Occurrences link the layers
       Occurrences represent relationships between information resources and the subjects that they are “about”
       The links (or locators) are like page numbers in a back-of-book index
       Occurrences can also be typed (e.g. bio, map, synopsis)
      [Figure: the knowledge layer (Puccini, Tosca, Lucca, Madame Butterfly, with “composed by” and “born in” associations) linked by occurrences to the information layer below]
  • 16. Summary of core concepts
       A pool of information or data (the information layer), and a knowledge layer consisting of
        – Topics: a set of topics representing the key subjects of the domain in question (e.g. Puccini, Tosca, Lucca, Madame Butterfly)
        – Associations: representing relationships between subjects (e.g. composed by, born in)
        – Occurrences: links to information that is somehow relevant to a given subject
       = The TAO of Topic Maps
       Plus: topic types, association types, occurrence types, each of which is represented by a topic...
       Let’s look at some TAOs in the Omnigator…
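The three core constructs can be sketched in a few lines of Python. This is only an illustrative model, not the ISO 13250 data model or any Ontopia API: the class names, the `related` and `resources` helpers, and the restriction to binary associations are assumptions made for brevity. The topic names and the biography URL are taken from the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Topic:
    """Stands for one subject, e.g. Puccini or Tosca."""
    name: str

@dataclass(frozen=True)
class Association:
    """A typed relationship between subjects (binary here, for brevity)."""
    type: str
    members: tuple  # the topics involved

@dataclass(frozen=True)
class Occurrence:
    """A typed link from a topic to an information resource."""
    topic: Topic
    type: str
    locator: str  # plays the part of a page number in a book index

puccini, tosca, lucca = Topic("Puccini"), Topic("Tosca"), Topic("Lucca")
butterfly = Topic("Madame Butterfly")

associations = [
    Association("composed by", (tosca, puccini)),
    Association("composed by", (butterfly, puccini)),
    Association("born in", (puccini, lucca)),
]
occurrences = [
    Occurrence(puccini, "biography", "http://www.opera.net/puccini/bio.html"),
]

def related(topic):
    """Names of the topics reachable from `topic` through any association."""
    return {m.name for a in associations if topic in a.members
            for m in a.members if m != topic}

def resources(topic):
    """Locators of all information resources attached to `topic`."""
    return [o.locator for o in occurrences if o.topic == topic]

print(sorted(related(puccini)))
print(resources(puccini))
```

Asking for everything related to Puccini collects both his works and his birthplace, which is the "one-stop-shop" behaviour the topic page is meant to provide.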
  • 17. www.ontopedia.net O N T O P E D I A The Identity of Everything About the Omnigator  A free topic map browser from Ontopia – Download from http://www.ontopia.net (part of “OKS Samplers”) – Java-based, runs on any computer  Completely generic – Not optimized for any particular ontology – Display and navigate any conforming topic map  A teaching aid – Not designed for end-users (no attempt to hide technical jargon) – Also used for prototyping and debugging  Not to be used for most real world applications! – These require custom interfaces based on a specific ontology – (see http://www.topicmaps.com for a good example)
  • 18. Omnigator interface (demo). A typical topic page shows the current topic, its identifier(s), its topic type(s), multiple (typed) names, typed occurrences (internal and external) and typed associations.
  • 19. www.ontopedia.net O N T O P E D I A The Identity of Everything Typing topics revisited  Basic building blocks of the TAO model are – Topics: e.g. “Puccini”, “Lucca”, “Tosca” – Associations: e.g. “Puccini was born in Lucca” – Occurrences: e.g. “http://www.opera.net/puccini/bio.html is a biography of Puccini”  Each of these constructs can be typed – Topic types: “composer”, “city”, “opera” – Association types: “born in”, “composed by” – Occurrence types: “biography”, “street map”, “synopsis”  All such types are also topics – The set of typing topics constitutes an ontology
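The point that all types are themselves topics, and that together they form the ontology, can be made concrete with plain sets. This is a minimal sketch using the example values from the slide; the variable names are assumptions.

```python
# Instance topics and the type each one belongs to (values from the slide).
topic_type = {"Puccini": "composer", "Lucca": "city", "Tosca": "opera"}
association_types = {"born in", "composed by"}
occurrence_types = {"biography", "street map", "synopsis"}

# Every type is itself a topic, so it joins the same pool as the instances.
typing_topics = set(topic_type.values()) | association_types | occurrence_types
topics = set(topic_type) | typing_topics

# The set of typing topics constitutes the ontology.
ontology = typing_topics
print(sorted(ontology))
```

Because a type is an ordinary topic, it can in turn carry names, occurrences and associations of its own.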
  • 20. www.ontopedia.net O N T O P E D I A The Identity of Everything Capabilities of the TAO model (1)  Represent subjects explicitly – Topics represent the “things” users are interested in  Capture relationships between subjects – Associations provide user-friendly navigation paths to information (navigation “as we may think”) – Associations also promote serendipitous knowledge discovery through browsing  Make information findable – Topics provide a “one-stop-shop” for everything that is known about a subject (collocation of information and knowledge) – Occurrences allow information about a common subject to be aggregated across multiple systems, irrespective of location
  • 21. www.ontopedia.net O N T O P E D I A The Identity of Everything Capabilities of the TAO model (2)  Represent taxonomies and thesauri – Associations can (also) represent hierarchical relationships – With Topic Maps you can have multiple, interlinked hierarchies and faceted classification  Transcend simple hierarchies – Rich associative structures capture the complexity of knowledge and reflect the way people think  Manage knowledge – The topic map is the embodiment of “organizational memory” – Provides a structured way to capture people’s knowledge of things, events, relationships, etc.
  • 22. www.ontopedia.net O N T O P E D I A The Identity of Everything Beyond the TAO  Formal data model – Topic maps can be queried, e.g. – Give me all composers that composed operas that were based on plays that were written by Shakespeare  Interchange syntax – Topic maps can be interchanged – Increased reuse = added value  Robust identity model – Topic maps can be merged – Potential to federate knowledge  Scope – Topic maps can capture context  Reification – Topic maps can express different levels of detail – Similar to scaling in cartography For more details, see Pepper 2009
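The example query can be sketched over a toy fact base. This is not a Topic Maps query language; it is a hedged illustration in plain Python. The facts are stored as directed (subject, association type, object) triples purely for brevity (real associations use roles rather than direction), topic types are ignored, and the data about Verdi and Sardou is illustrative and does not come from the slides.

```python
# Illustrative facts as (subject, association type, object) triples.
facts = [
    ("Otello", "composed by", "Verdi"),
    ("Macbeth (opera)", "composed by", "Verdi"),
    ("Tosca", "composed by", "Puccini"),
    ("Otello", "based on", "Othello"),
    ("Macbeth (opera)", "based on", "Macbeth (play)"),
    ("Tosca", "based on", "La Tosca"),
    ("Othello", "written by", "Shakespeare"),
    ("Macbeth (play)", "written by", "Shakespeare"),
    ("La Tosca", "written by", "Sardou"),
]

def subjects(assoc_type, obj):
    """All subjects linked to `obj` by an association of the given type."""
    return {s for s, t, o in facts if t == assoc_type and o == obj}

def objects(subj, assoc_type):
    """All objects linked from `subj` by an association of the given type."""
    return {o for s, t, o in facts if s == subj and t == assoc_type}

def composers_of_operas_based_on_plays_by(author):
    """Follow written by -> based on -> composed by, as in the slide's query."""
    plays = subjects("written by", author)
    operas = {opera for play in plays for opera in subjects("based on", play)}
    return {c for opera in operas for c in objects(opera, "composed by")}

print(composers_of_operas_based_on_plays_by("Shakespeare"))
```

The query is just a chain of three association traversals, which is why a formal data model makes it answerable without any knowledge of the underlying documents.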
  • 23. www.ontopedia.net O N T O P E D I A The Identity of Everything Break – any questions so far? After the break: Topic Maps and natural language – towards a linguistic perspective
  • 24. Parallels with natural language
       Basic grammatical classes ⇒ TAO model
       Nouns and verbs ⇒ Topics and associations
       Nominals and nouns ⇒ Topics and their types
       Clauses and verbs ⇒ Associations and their types
       Valency ⇒ Arity
       Semantic roles ⇒ Association roles
       Categories and schemas ⇒ Typing topics
       Hyponymy ⇒ Type hierarchies
       Synonymy and homonymy ⇒ Naming
       Nominalization ⇒ Reification
       Grounding / co-reference ⇒ Subject identity / collocation
       Information structure ⇒ Navigation
  • 25. www.ontopedia.net O N T O P E D I A The Identity of Everything Basic principles, basic classes  In elementary school, I was taught that a noun is the name of a person, place, or thing. In college, I was taught the basic linguistic doctrine that a noun can only be defined in terms of grammatical behavior, conceptual definitions of grammatical classes being impossible. Here, several decades later, I demonstrate the inexorable progress of grammatical theory by claiming that a noun is the name of a thing. (Langacker 2008)  The basic grammatical classes are nouns and verbs – They prototypically profile things and relationships  They correspond to topics and associations
  • 26. www.ontopedia.net O N T O P E D I A The Identity of Everything Grounding  “Grounding is characteristic of the structure referred to in CG as nominals and finite clauses. More specifically, a nominal or a finite clause profiles a grounded instance of a thing or process type.”  “A noun designates a type of thing, and a verb a type of process.”  “A nominal or a finite clause profiles a grounded instance of a thing or process type.”  Nominal grounding (determiners and quantifiers) – the, this, that, some, a, each, every, no, any  Clausal grounding (mood and tense) – -s, -ed, may, will, should Langacker 2008: 259ff (esp. 264)
  • 27. www.ontopedia.net O N T O P E D I A The Identity of Everything Nouns and nominals  Topic types represent classes of topics – Conceptual “groupings of things”, e.g. composer, opera, city, ... – They correspond to Langacker’s nouns (“types of thing”) However, topics can have multiple names – (This is how we handle synonymy and multilingualism) – In one sense it is topic names that correspond to nouns  Topic instances represent individual subjects – They correspond to Langacker’s nominals (“instances of types”) – Their names are typically proper nouns, e.g. Puccini, Tosca, Lucca
  • 28. www.ontopedia.net O N T O P E D I A The Identity of Everything Verbs and clauses  Association types represent classes of relationships – They correspond to Langacker’s verbs (“types of process”) – (Often named accordingly, e.g. born in, composed by, killed by, ...)  Individual associations represent specific relationships – They correspond to Langacker’s clauses (“instances of processes”) – e.g. Puccini was born in Lucca; Tosca was composed by Puccini Langacker distinguishes processes (temporal) and non-processual relationships (non-temporal). The latter are (prototypically) profiled by adjectives, adverbs, prepositions, and participles. This distinction is not made explicitly in Topic Maps.  Note: There are two predefined association types – type-instance (the relationship between a topic and its type) – supertype-subtype (a relationship between types, see Hyponymy)
  • 29. www.ontopedia.net O N T O P E D I A The Identity of Everything Valency  Associations can involve one, two or more topics – Binary associations, e.g. Puccini composed Tosca, are most common and correspond to transitive verbs – Ternary associations, e.g. Tosca killed Scarpia with a knife, can correspond to ditransitive verbs – Unary associations, e.g. Turandot was unfinished, correspond (sort of) to intransitive verbs (or binary properties)  The arity of an association – Corresponds to the valency of a verb
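The correspondence between arity and valency can be illustrated by representing each association as a list of (role type, player) pairs and counting them. This is a sketch only: the role names `killer`, `victim`, `instrument` and the association type names are hypothetical labels chosen for the example, not established conventions.

```python
# Each association is a type plus a list of (role type, player) pairs.
associations = [
    {"type": "composed",
     "roles": [("composer", "Puccini"), ("work", "Tosca")]},
    {"type": "killed",  # hypothetical role names
     "roles": [("killer", "Tosca"), ("victim", "Scarpia"), ("instrument", "knife")]},
    {"type": "unfinished",
     "roles": [("work", "Turandot")]},
]

def arity(association):
    """Number of role players, the counterpart of a verb's valency."""
    return len(association["roles"])

for a in associations:
    print(a["type"], arity(a))
```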
  • 30. Semantic roles
       An association does not have “directionality”
       Instead of direction, Topic Maps uses roles
        – Roles are classified by type
        – Role types specify the nature of each topic’s involvement in the relationship. They correspond to semantic roles.
        – (Role types are also topics)
       Role types are different from topic types...
      [Figure: the relationship between Puccini and Tosca in RDF (directed “composed” / “composed by”) and in Topic Maps (a “composed” association with the roles composer and work)]
  • 31. Roles and types
      [Figure: the association (A) “composed” with the roles (R) composer and work, played by the topics (T) Puccini, of type composer, and Tosca, of type opera]
       The role type can be
        – the same as the role playing topic’s topic type (composer = composer)
        – a supertype of the topic type (work > opera)
        – a subtype of the topic type (teacher < person)
        – a subtype of the topic type’s supertype (source < work)
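The four possible relationships between a role type and the player's topic type can be checked mechanically once the supertype-subtype links are known. The sketch below is an assumption-laden illustration: the `supertype` table and the `relation` function are invented for the example, and the link `play < work` is assumed in order to illustrate the fourth case (the slide only gives `source < work`).

```python
# Child type -> parent type. "play" < "work" is an assumption for the example.
supertype = {"opera": "work", "play": "work", "source": "work", "teacher": "person"}

def ancestors(t):
    """All supertypes of `t`, nearest first."""
    out = []
    while t in supertype:
        t = supertype[t]
        out.append(t)
    return out

def relation(role_type, topic_type):
    """Classify how a role type relates to the role player's topic type."""
    if role_type == topic_type:
        return "same"
    if role_type in ancestors(topic_type):
        return "supertype of the topic type"
    if topic_type in ancestors(role_type):
        return "subtype of the topic type"
    if set(ancestors(role_type)) & set(ancestors(topic_type)):
        return "subtype of the topic type's supertype"
    return "unrelated"

print(relation("composer", "composer"))
print(relation("work", "opera"))
print(relation("teacher", "person"))
print(relation("source", "play"))
```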
  • 32. Association roles and semantic roles
       Association roles (Italian Opera Topic Map)
        – composed: composer, work
        – born in: person, place
        – appears in: character, work
        – based on: source, result
        – revision of: source, result
        – part of: part, whole
        – exponent of: person, style
        – located in: container, containee
        – pupil of: teacher, pupil
       Semantic roles (Frawley 1992)
        – (logical actors) agent, author, instrument
        – (logical recipients) patient, experiencer, benefactive
        – (spatial roles) theme, source, goal
        – (non-participant roles) locative, reason, purpose
       Association roles tend to be much more specific
        – Variable practice – as yet no established conventions
        – Might (cognitive) linguists have something to offer here?
  • 33. Naming of associations
       Intuitive naming requires flexibility
        – i.e. multiple association type (AT) names that change depending on the “direction” of the association
           Puccini was born in Lucca
           Lucca was the birthplace of Puccini
       Alternative CG view
        – Naming should be based on whether the agent or the theme is in focus
           The focus becomes the trajector
        – Point of focus = Current topic
       Some strategies...
        – Voice-based: active / passive forms of the verb, e.g. composed (V, active) / composed by (V, passive). Works well in SVO languages. Less satisfactory with SOV.
        – Role-based, e.g. teacher (N) of / pupil (N) of
        – Nominalization, e.g. composition. Tends to be used by Japanese, Koreans (and Germans??)
        – Combinations, e.g. born (V) in / birthplace (N) of; part (N) of / consists (V) of
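One way to realize direction-dependent naming is to store one name per role type and pick the name that matches the role played by the current topic. The following is a minimal sketch for binary associations; the `NAMES` table and the `describe` function are assumptions, while the role types (person, place, composer, work) follow the Italian Opera Topic Map examples.

```python
# Display name of each association type, keyed by the role the current topic plays.
NAMES = {
    "born in": {"person": "was born in", "place": "was the birthplace of"},
    "composed": {"composer": "composed", "work": "was composed by"},
}

def describe(association, current):
    """Render a binary association as seen from the topic `current`."""
    roles = association["roles"]  # role type -> player
    (own_role,) = [r for r, p in roles.items() if p == current]
    (other,) = [p for r, p in roles.items() if p != current]
    return f"{current} {NAMES[association['type']][own_role]} {other}"

birth = {"type": "born in", "roles": {"person": "Puccini", "place": "Lucca"}}
composition = {"type": "composed", "roles": {"composer": "Puccini", "work": "Tosca"}}

print(describe(birth, "Puccini"))      # Puccini was born in Lucca
print(describe(birth, "Lucca"))        # Lucca was the birthplace of Puccini
print(describe(composition, "Tosca"))  # Tosca was composed by Puccini
```

The current topic acts as the point of focus, so the same association yields a different sentence on each participant's topic page.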
  • 34. www.ontopedia.net O N T O P E D I A The Identity of Everything Categories and prototypes  Topic types define categories of things – But are they Aristotelian or prototypical categories?  Aristotelian – Category membership is binary – All instances are equally representative. No standard notion of “similarity”.  Prototypical – Not defined by “necessary and sufficient conditions” (cf. OWL)  The decision is up to the conceptualizer (a.k.a. topic map author) – A topic can have more than one type  Boïto is a composer and a librettist – The same topic can be a topic type and a role type  e.g. Puccini is a composer; Puccini plays the role of composer in … – Should we establish conventions for goodness of example?  Could be useful in automated classification
  • 35. Schemas and constraints
       Other types can also be said to define categories
        – association types, (occurrence types, name types, role types)
       But these are more schematic (in the CG sense)
        – Schemas are “abstract templates obtained by reinforcing the commonality inherent in a set of instances” (Langacker 2008, p. 23, in the context of grammatical rules)
       Rules can be defined as templates and constraints
        – “Puccini composed Tosca”: The composer Puccini plays the role of composer in the “composition” relationship in which the role of work is played by the opera Tosca.
      [Figure: the association (A) “composed” with the roles (R) composer and work, played by the topics (T) Puccini, of type composer, and Tosca, of type opera; AGENT and THEME are marked as elaboration sites]
  • 36. Hyponymy
       Topic Maps has two predefined association types:
        – type-instance (relationship between a topic and its type)
        – supertype-subtype (relationship between the denotations of a hyponym and its hyperonym)
      [Figure: a type hierarchy with the types Mammal, Primate, Canine, Human, Chimp, Wolf and Dog and the instances Steve and Ron. Legend: types, instances, supertype-subtype, type-instance]
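The interplay of the two predefined association types can be sketched as follows. The arrows of the original figure are not recoverable from the transcript, so the supertype links below follow ordinary biological usage, and the types assigned to Steve and Ron are pure assumptions for the sake of the example.

```python
# supertype-subtype links (subtype -> supertype); assumed from common usage.
supertype = {
    "Primate": "Mammal", "Canine": "Mammal",
    "Human": "Primate", "Chimp": "Primate",
    "Wolf": "Canine", "Dog": "Canine",
}
# type-instance links (instance -> type); both assignments are hypothetical.
instance_of = {"Steve": "Human", "Ron": "Dog"}

def types_of(instance):
    """The direct type of `instance` followed by all of its supertypes."""
    t = instance_of[instance]
    chain = [t]
    while t in supertype:
        t = supertype[t]
        chain.append(t)
    return chain

def is_a(instance, type_name):
    """True if `instance` belongs to `type_name` directly or by inheritance."""
    return type_name in types_of(instance)

print(types_of("Steve"))
print(is_a("Ron", "Primate"))
```

Following supertype links transitively is what lets a query for all mammals return Steve without that fact ever being stated explicitly.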
  • 37. Synonymy and homonymy
      – Synonyms
          – One subject, multiple names
          – In thesauri: USE and USED FOR
      – Topic Maps (TMs) are subject-centric
          – A topic can have multiple names
          – Names can be typed
              – Typical name types: nickname, synonym, alternate name
          – Context can be expressed using scope
              – Typically names in different natural languages: composer, komponist, 작곡가, ...
          – Names can also have “variants”
              – Often used to capture orthographic variation: Tchaikovsky, Чайко́вский, Tsjajkovskij, Tschaikowski
              – Also useful for sort names, pronunciation, etc.
      – Homonyms
          – One name, multiple subjects
          – In thesauri: problematic
      – TMs are based on identifiers
          – Same name can be used by more than one topic
          – Disambiguation in UI is left to the application
          – Two main disambiguation strategies
              – Default: qualify by type, e.g. Tosca (opera) vs. Tosca (character)
              – Fallback: qualify by some other relationship, e.g. Paris (France) vs. Paris (Texas); La Bohème (Puccini) vs. La Bohème (Leoncavallo)
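The default disambiguation strategy (qualify a homonym by its type) could look roughly like this in an application. This is a sketch only: the topic records, their `id` values and the `display_names` function are hypothetical, and the fallback strategy (qualifying by some other relationship) is not implemented here.

```python
from collections import Counter

# Hypothetical topics: distinct identifiers, possibly identical names.
TOPICS = [
    {"id": "tosca-opera", "name": "Tosca", "type": "opera"},
    {"id": "tosca-character", "name": "Tosca", "type": "character"},
    {"id": "puccini", "name": "Puccini", "type": "composer"},
]


def display_names(topics):
    """Map each topic id to a display name, adding the type only for homonyms."""
    counts = Counter(topic["name"] for topic in topics)
    result = {}
    for topic in topics:
        if counts[topic["name"]] > 1:
            result[topic["id"]] = f'{topic["name"]} ({topic["type"]})'
        else:
            result[topic["id"]] = topic["name"]
    return result


for topic_id, label in display_names(TOPICS).items():
    print(topic_id, "->", label)
```

Because topics are told apart by identifier rather than by name, the qualification is purely a display concern.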
  • 38. Nominalization
      – Nominalization: derivation of nouns from other words, including verbs, adjectives etc., e.g. meet (V) ⇒ meeting (N)
      – (A topic map consists of assertions about subjects)
      – Assertions are made using statements:
          – names, e.g. a certain subject has the name “Tosca”
          – associations, e.g. “Tosca is set in Rome”
          – occurrences, e.g. “http://en.wikipedia.org/wiki/Rome is a web page about Rome”
      – Any statement can be reified
          – Reification results in a topic that has the same referent as the reified statement
          – e.g. Tosca is set in Rome (association) ⇒ The setting of Tosca in Rome (topic)
          – The (new) reifying topic can have names and occurrences, and it can play roles in associations
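Reification can be sketched with two small classes. This is an illustrative model, not the API of any Topic Maps engine: the `Topic` and `Association` classes, their fields and the `reify` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity, since topic and statement refer to each other
class Topic:
    names: list = field(default_factory=list)
    reifies: object = None  # the statement this topic reifies, if any


@dataclass(eq=False)
class Association:
    type: str
    roles: dict
    reifier: Topic = None  # the topic that reifies this statement, if any


def reify(statement, name):
    """Create a topic that stands for the statement itself and link the two."""
    topic = Topic(names=[name], reifies=statement)
    statement.reifier = topic
    return topic


set_in = Association(type="set-in", roles={"work": "Tosca", "place": "Rome"})
setting = reify(set_in, "The setting of Tosca in Rome")
print(setting.names[0])
```

Once the association has a reifying topic, that topic can itself be named, documented or used as a role player, much as a nominalized verb can serve as the subject of a new sentence.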
  • 39. Subjects and topics
      – Topics represent subjects
          – the topic is the representation
          – the subject is the referent
      – Or, in Saussure’s terms
          – signifiant and signifié
      – A subject can be anything:
          – A subject is any “thing” whatsoever, whether or not it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever.
      – Is the topic/subject pairing a symbolic assembly?
      – [Diagram: a subject in the real world; a topic (T) in the computer domain]
  • 40. Co-reference and collocation
      – Grounding singles out referents and enables co-reference
          – between speaker and listener
          – across a sequence of utterances
      – In Topic Maps the central objective is collocation
          – By definition, each topic represents a single subject (one subject per topic)
          – A topic is intended to be a point of collocation for everything that is known about a particular subject
          – Therefore the goal is to have only one topic per subject
      – To achieve that we need to know which subject a topic represents
          – (This is sometimes referred to as the “intentionality” of the relation between a symbol and its referent.)
          – We call it subject identity.
  • 41. Subject identity
      – The identity of a subject is expressed using globally unique identifiers called subject identifiers
          – If two topics share a subject identifier, they are deemed to represent the same subject and must be merged
      – [Diagram: SUBJECTS and TOPICS, with the examples Madame Butterfly, Tosca, Lucca and Puccini]
  • 42. Subject identifiers
      – The subject is identified by a URL
          – The URL is called a subject identifier
          – Example: the topic Giacomo Puccini has the subject identifier http://psi.ontopedia.net/Giacomo_Puccini
      – The URL is the address of a web page
          – The web page describes the subject such that a human can know what subject is referred to
          – This web page is called a subject descriptor
          – Example subject descriptor at http://psi.ontopedia.net/Giacomo_Puccini: “Giacomo Puccini. Italian composer, b. Lucca 22nd Dec 1858, d. Brussels, 29th Nov 1924. Best known for his operas, of which Tosca is one of the most popular and well-known.”
      – Humans use the descriptor
          – By inspecting the web page the person responsible for assigning the identifier can be sure that it does not refer to, say, Giacomo’s grandfather Domenico (who was also a composer of operas)
      – Machines use the identifier
          – The link is not resolved. Instead simple lexical comparison is used. If the strings are identical, the subject is deemed to be the same and the topics are merged.
      – Is the subject identifier/subject descriptor pairing a symbolic assembly?
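The machine side of subject identity (exact string comparison of identifiers, followed by merging) can be sketched as below. This is a simplified illustration: the dictionary-based topic records and the `merge_topics` function are hypothetical, only names are carried over during a merge, and the Lucca identifier is an assumed example URL. No URL is ever resolved.

```python
def merge_topics(topics):
    """Merge topics that share at least one subject identifier.

    Identifiers are compared as plain strings; nothing is dereferenced.
    """
    merged = []
    for topic in topics:
        current = {
            "identifiers": set(topic["identifiers"]),
            "names": set(topic["names"]),
        }
        untouched = []
        for existing in merged:
            if existing["identifiers"] & current["identifiers"]:
                # Same subject: collect everything known about it in one topic.
                current["identifiers"] |= existing["identifiers"]
                current["names"] |= existing["names"]
            else:
                untouched.append(existing)
        merged = untouched + [current]
    return merged


PUCCINI = "http://psi.ontopedia.net/Giacomo_Puccini"
LUCCA = "http://psi.ontopedia.net/Lucca"  # assumed identifier, for illustration only

topics = [
    {"identifiers": {PUCCINI}, "names": {"Puccini"}},
    {"identifiers": {LUCCA}, "names": {"Lucca"}},
    {"identifiers": {PUCCINI}, "names": {"Giacomo Puccini"}},
]

for topic in merge_topics(topics):
    print(sorted(topic["identifiers"]), sorted(topic["names"]))
```

The merged topic becomes the single point of collocation for both names, which is the goal stated on the previous slides.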
  • 43. Information structure
      – Intuitive navigation is a key feature of Topic Maps
      – But what is its cognitive basis?
          – I claim that it corresponds to the way we think (i.e., associatively)
          – Can linguistics back up this claim?
      – Topic vs. comment in linguistics (Bussmann, 487)
          – “Analysis of sentences according to communicative criteria into the topic (what is being talked about) and the comment (what is being said about the topic)”
          – “Analysis of utterances according to the communicative criteria of given/known information vs. new information”
          – Cf. theme vs. rheme in Halliday’s functional grammar
      – Consider our earlier tour of Italian opera...
  • 44. Navigation as narrative
      – Giacomo Puccini was a composer. He was born in Lucca in 1858. Lucca is a city, located in Italy. It was the birthplace of Puccini and Catalani. Catalani was a composer who composed 5 operas. He died in Milan. Milan is the home of La Scala, which was the venue for many première performances, including that of Madame Butterfly. Madame Butterfly is set in Nagasaki, which is located in Japan. Japan is (also) the setting for Iris, [which is] an opera [which was] composed by Mascagni, who was a pupil of Ponchielli who was (also) the teacher of Puccini...
      – [The slide repeats the same text with highlighting. Legend: THEME: new theme, continuing theme; RHEME: predicate with potential new theme]
  • 45. Discussion
      – “Now! …. That should clear up a few things around here!”
      – Questions, comments, corrections?
          – What have I missed? Where else should I look?
      – What might linguists contribute?
          – A better understanding of the nature of roles?
          – Approaches to representing temporal knowledge?
          – ...
      – Can Topic Maps inform linguistics?
          – After all, it is a technology that captures (some degree of) (some form of) knowledge
          – It seems to have a reasonable cognitive basis
          – It emerged through usage (librarians, indexers, etc.)
          – And last but not least, it works!
  • 46. References
      – Bussmann, H. Routledge Dictionary of Language and Linguistics (London 1996)
      – Frawley, W. Linguistic Semantics (Hillsdale 1992)
      – Langacker, R. Cognitive Grammar (Oxford 2008)
      – Pepper, S. Italian Opera Topic Map. http://www.ontopedia.net/ItalianOpera
      – Pepper, S. “Topic Maps” in Bates, M.J. and Maack, M.N. (eds) Encyclopedia of Library and Information Sciences (CRC Press, forthcoming 2009). http://www.ontopedia.net/pepper/papers/ELIS-TopicMaps.pdf

Editor's notes

  1. Korean: chak-gok-ga