Illuminating Chaos Using Semantics to Harness the Web
1. Illuminating Chaos
Using Semantics to Harness the
Web
Dagobert Soergel
Department of Library and Information Studies,
University at Buffalo
1
AAT Workshop
Academia Sinica, Taipei
June 7,2010
2. Outline
• Overview of issues
• Semantics for whom and for what
• Representation to assist with query formulation
• Representation for comprehension
• Systems of representation
• Support for finding: Indexing
• Building KOS
• How can it all get done
• Zeroing in on the conceptual foundation
• Issues in the realm of AAT Taiwan
2
4. Semantics for whom?
• Semantics for computer systems
inference
answers and solutions instead of lots of Web pages
• Semantics for people
assist users in creating meaning and making sense
structure for learning
4
5. Semantics for what
• Finding
• Comprehending
• To know what to look for, a user (a person or a
system) must first comprehend something – a cycle
• Both finding and comprehending require navigating in
an information space – need meaningful structure
5
7. Problem clarification for
search
JG prevention approach
JG10 . individual-level prevention
JG10.2 . . individual- vs. family-focused prevention
JG10.2.2 . . . individual-focused prevention
JG10.2.4 . . . family-focused prevention
JG10.4 . . prevention through information and education
JG10.4.2 . . . social marketing prevention approach
JG10.4.4 . . . prevention through information dissemination
JG10.4.6 . . . prevention through education
JG10.4.8 . . . peer prevention
JG10.8 . . prevention through spirituality and religion
JG10.10 . . prevention through public commitment
JG12 . environmental-level prevention
JG12.4 . . social policy prevention approach
JG14 . multi-level prevention
7
8. Problem clarification for
search
churches (buildings)
. <church buildings by function>
. . chapels of ease (buildings)
. . fortified churches
. . pilgrimage churches (buildings)
. . procathedrals (buildings)
. <church buildings by location or context>
. . abbey churches
. . cathedrals (buildings)
. . cave churches
. . collegiate churches
. . . . .
. <churches by form>
. . double churches
. . hall churches
. . rock-cut churches
. . stave churches
8
9. Browse structure for search
• Make a table of contents for the entire Wikipedia
using UDC
• Make a classified (hierarchically structured) index for
an art textbook using the Art and Architecture
Thesaurus
• Make a classified index for the collection of an art
museum using the Art and Architecture Thesaurus
9
10. 10
Facet structure to guide search
A Area of ability combines with B Degree of ability
A1 psychomotor ability
A2 senses
A2.1 . vision
A2.1.1 . . night vision
A2.2 . hearing
A3 intelligence
A4 artistic ability
B1 low degree of ability, disabled
B2 average degree of ability
B3 above average degree of ability
B3.1 . very high degree of ability
Examples A2.1B1 visually impaired
A2.2B1 hearing impaired
A3B1 mentally handicapped
A3B3 intellectually gifted
11. Provide front-ends to assist
users
• Elicit a query with a facet-based interfaces,
then the system creates a free-text query
• Create a structure that normalizes terms assigned
through social tagging and arranges them in a
meaningful structure.
The user can than browse and select concepts
The system maps to all appropriate tags
11
12. Problem space for diseases
Used by people or computer systems
for search and arranging search output
Pathologic process
Body system affected
12
Pathologic process
Body system affected
Cause (condition, organism, chemical substance,
environmental factors)
Treatment
13. Representation for
comprehension
A question of information representation (knowledge
representation)
• For computer systems: formal representation
• For people: Text, images, graphical representation,
visualization
• Transformations between representations, such as
• from text to formal: information extraction
• from text to a map showing the text structure
• from a conventional thesaurus display to a concept map
13
14. Two representations
Text (for people)
High blood pressure is a serious disease often caused
by being overweight. In kids 4 – 12 it can be treated
highly effectively with Nystatin.
Formal representation (for computer system)
Causation (HighBloodPressure, Obesity)
Treatment (HighBloodPressure, {Human, [Age, 4-12y]},
Nystatin, [Effectiveness, 4])
14
16. Two representations
Text
Kids begin grazing independently from their mothers at
three months
Formal representation
Separation (Mother, Child, {Goat, [Age, 3m]})
16
17. Information extraction
• Information extraction produces representations
needed for the semantic Web
• Also useful for people if formal expressions are
transformed into sentences that state the findings of
a document as individual "bullets"
• Could arrange statements from one or more
documents in UDC order as a kind of summary
• Information extraction needs rich KOS
17
19. Meaningful arrangement of
terms
in document representations
19
• Terms assigned in social tagging
• Terms assigned from controlled vocabulary, e.g., AAT
24. Comparison
• By similarity:
Metaphor / analogy
• Christ’s sacrifice
and crucifixion
{Christ metaphor}
Cause / Effect
• Reaction or feeling
• Intensity
• Effect / Outcome
• Pulls the viewer into the scene
Context
• Biographic info: Artist
• Jusepe de Ribera
• Biographic info: Time / period
• 1634
• 17th century
24
Comparison, cause/effect, context
25. Tags arranged
by how they relate to the
image
with descriptors from the
Art and Architecture
Thesaurus
25
26. • Image theme AAT
• martyrdom sacrifice
• mystical experience mysticism
• Biblical biblical stories
• Religious religion and religious concepts
• Image content: Focal
• Reference
• nude body nudes (representations)
• old man elderly
• Saint Bartholomew saints
• Executioner executioners
• Knife knives
• Elaboration (Adj.)
• Bearded
• physical anguish pain (sensation)
• profound emotion {emotional}
• Luminous shine
Matching topic (Direct)
26
28. Comparison
• By similarity:
Metaphor / analogy
• Christ’s sacrifice
and crucifixion
{Christ metaphor}
Cause / Effect
• Reaction or feeling
• Intensity
• Effect / Outcome
• Pulls the viewer into the scene
Context
• Biographic info: Artist
• Jusepe de Ribera
• Biographic info: Time / period
• 1634
• 17th century
28
Comparison, cause/effect, context
No AAT terms
30. Example
mysticism
Note: Refers in a general sense to a spiritual quest for hidden
truth, the goal of which is to be united with the divine. It also
refers more specifically to a belief in the existence of important
realities beyond perceptual or intellectual understanding that are
accessible by subjective experience, such as by intuition or
meditation. Forms of mysticism are found in all major religions
as well as in secular experience.
30
32. Comprehension "in the
large"
• Learning and sense making require comprehension
across multiple sources
• Requires structure – can be supplied by KOS
• Requires tools for the manipulation of external
structures the learner / sensemaker builds, such as
concept maps
32
34. Representations need rules
• Formal representations need logical formalisms, such
as full first-order logic or subsets (for ease of
processing) or extensions (to be more expressive)
• Text needs rules of syntax and broader document
structure
• Graphical representations need rules of design
34
35. Representations
need names for entities
• Names for (abstract) concepts – classification
• Names for many different types of other entities, such
as persons, places, buildings, events, currencies, …
(named entities)
• Systems of such names – Knowledge Organization
Systems, authority lists of personal names
• Mappings between such systems
35
36. Representations need
relationships
• Relationships are used to connect entities,
thus forming statements
obesity <causes> high blood pressure
• Need system of relationships
Many such systems exist (a type of KOS)
Problem of mapping
36
37. Rhetorical relationships
• To map text structure
• To discern how a retrieved document, paragraph,
statement, or image relates to the topic of a search
37
39. Matching topic (Direct)
. Manifestation
. Image content
. Image theme
Evidence (Indirect)
Context
. Scope
. Framework
. Environmental setting
. Social background
. Time & sequence
. Assumption / expectation
. Biographic information
Condition
. Helping or hindering factor
. Unconditional
. Exceptional condition
Purpose / Motivation
Cause / Effect
. Cause
. Effect / Outcome
. Explanation (causal)
. Prediction
Comparison
. By similarity (analogy) /
By difference (contrast)
. By factor that is different
Method / Solution
. Method / Approach
. Instrument
. Technique / Style
Evaluation
. Significance
. Limitation
. Criterion / Standard
. Comparative evaluation
39
RST+ Functional Role
40. Functional role: Comparison
Comparison
. By similarity vs. By difference (Contrast)
. . By similarity
. . . Analogy & metaphor
. . By difference (Contrast)
. By factor that is different
. . Different external factor
. . . Different time
. . . Different place
. . Different participant
. . . Different actor
. . . Different subject acted upon
. . Different act or experience
. . . Different act
. . . Different experience 40
41. Support for finding: Indexing
• Finding based on text:
Knowledge-based expansion of query
Front-end as discussed earlier
• Finding based on indexing:
Semantically enriched documents
41
42. A semantically enriched document
Reis et al. (2008)
Impact of Environment and Social Gradient on Leptospira infection in Urban
Slums (doi:10.1371/journal.pntd.0000228).
Infectious disease studied: Leptospirosis
Pathogen (causative agent of disease): Leptospira spirochete
Vector of disease pathogen: Rat (Rattus norvegicus)
Pathogen host subjected to study: Human (Homo sapiens)
Number of subject individuals in study: 3,171
. . .
Purpose of study: Quantify risk factors for leptospirosis . . .
Principal finding 1: Prevalence of Leptospira antibodies . . .
Principal finding 2: Disease risk . . .open sewers . . .
42
(http://dx.doi.org/10.1371/journal.pntd.0000228.x002)
43. A semantically enriched document
43
Tag Trees of Individual Semantic
Classes of Highlighted Terms
disease
infectious diseases
diarrheal disease
childhood diarrhea
dengue
leptospirosis
human leptospirosis
meningococcal disease
pulmonary hemorrhage syndrome
visceral leishmaniasis
Weil's disease
occupational disease
zoonotic disease
ID = Infectious Disease Ontology
GO = Gene Ontology term used in ID
ID:0000012 immunity
ID:0000017 mortality
ID:0000023 zoonotic
ID:0000025 pathogenicity
ID:0000034 endemic
ID:0000038 parasite
ID:0000056 host
ID:0000057 carrier
ID:0000063 vector
ID:0000064 pathogen
ID:0000066 infectious agent
ID:0000069 primary pathogen
ID:0000104 infection
44. 44
ID = Infectious Disease Ontology GO = Gene Ontology
IDO:0000000 ! process
IDO:0000083 transmission
IDO:0000231 horizontal transmission (GO:0000031)
IDO:0000104 infection
IDO:0000084 pathogenesis
IDO:0000221 ! infectious disease progression
IDO:0000100 ! pathogen evasion of host immune response
IDO:0000111 antigenic variation
IDO:0000115 genetic diversification
IDO:0000226 pathogen life cycle (GO:0000026)
IDO:0000001 ! role
IDO:0000036 ! colonizer
IDO:0000038 parasite
IDO:0000048 symptom
IDO:0000056 host
IDO:0000057 carrier
IDO:0000059 reservoir
IDO:0000063 vector
IDO:0000064 pathogen
IDO:0000066 infectious agent
IDO:0000069 primary pathogen
IDO:0000200 mode of transmission (GO:0000000)
IDO:0000002 ! quality
IDO:0000215 ! quality of host population
IDO:0000098 infectious disease
IDO:0000210 ! quality of host
IDO:0000012 immunity
45. Semantically enriched documents
• Semantic enrichment supports semantic retrieval
• Broad area of its own
• Many different forms
• Explicit document structure
• Concept and named entity tagging and identification
• Assigning additional concepts or named entities
• Assigning extracted propositions
• Closely linked with information extraction
• IE produces elements of semantic enrichment
45
46. Need KOS
Needed for all this
• Large Knowledge Organization Systems
• Large knowledge bases with mappings
• Methods and procedures for developing KOS
46
47. How to get all this work
done?
The forces that created the problem
also support the solution
• Use automation
• Automated information extraction gets better every day and also
provides input to building KOS
• Automated classification could be used for the UDC Wikipedia project
• Use Web-enabled collaborative work ("crowdsourcing")
• Use computer systems to assist people
• Use Web-based systems to collect and integrate results
• Bootstrap: The more knowledge is in formal systems, the more
information extraction and structuring tasks can be automated
47
48. Example: Guided tagging
• Use facet structure to get taggers think a bit more
out of the box
For example, could ask
What does this image remind you of
• Could assign some terms automatically, for example,
extracting terms from text assigned to an image
48
52. 52
Hub
Water transport
Inland water transport
Ocean transport
Traffic station Water transport⊓
Traffic station Inland water tr.⊓
Traffic station Ocean transport⊓
Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports
LCSH
Shipping
Inland water transport
Merchant marine
Harbors
German
Hafen
Mapping through a Hub
53. Outline
• Objective: Interoperability Plus
• KOS concept hub
• Method: Knowledge-based, computer-assisted
creation of canonical representations of concepts
• Resulting knowledge base and applications
53
54. Objective
Improve semantic-based search
across multiple collections in multiple languages.
• Interoperability between any two participating KOS
(Knowledge Organization Systems)
• Support for search, esp. facet-based search
• for any collection indexed by a participating KOS
• for search based on free-text or free-form social tagging
• Assistance in cataloging (metadata creation)
by catalogers or users (social tagging)
• Long-range goal: Web service where a KOS can be uploaded
and mappings to specified target KOS are returned
54
55. KOS Concept Hub
• Interoperability is achieved by
expressing concepts from all participating KOS
as a canonical representation,
such as a description logic formula
using atomic concepts and relationships
• The backbone of the proposed system is a
faceted core classification of atomic concepts
together with a set of relationships
• Mapping from KOS to KOS is achieved by reasoning
over these canonical representations
55
56. 56
Hub
Water transport
Inland water transport
Ocean transport
Traffic station Water transport⊓
Traffic station Inland water tr.⊓
Traffic station Ocean transport⊓
Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports
LCSH
Shipping
Inland water transport
Merchant marine
Harbors
German
Hafen
Mapping through a Hub
57. 57
Hub
Traffic station
Vehicle parking
Terminal facilities
Water transport
Inland water transport
Ocean transport
Traffic station Water transport⊓
By type of water transport
Traffic station Inland water tr.⊓
Traffic station Ocean transport⊓
By component of traffic station
Vehicle parking Water transport⊓
Terminal facilities Water transport⊓
Dewey
387 Water, air, space transportation
386 Inland waterway & ferry transportation
387.5 Ocean transportation
386.8 Inland waterway tr. > Ports
387.1 Ports
LCSH/AAT
Shipping
water transport
Inland water transport
Merchant marine
Harbors
ports
harbors
Mapping through a Hub
58. Method: How to get DL formulas
Key: Efficient creation of canonical representations (DL formulas)
• Apply existing knowledge:
Large knowledge base ▬► less effort for processing a new KOS
• Use knowledge of KOS structure for hierarchical inheritance
• Use linguistic analysis of terms and captions
• Eliminate redundant atomic concepts
• Check or produce mapping results from assignment of concepts to
the same records
• Get human editors’ input and verification where needed through a
user-friendly interface
• KOS “owners” may verify and edit data pertaining to their KOS
58
59. Knowledge base
Requires an ever larger classification and lexical
knowledge base containing many kinds of data:
1. A faceted classification of atomic concepts
Seeded from sources with well-developed facets such as
UDC
the Alcohol and Other Drug (AOD) Thesaurus
the Harvard Business Thesaurus
the Art and Architecture Thesaurus
various systems called ontologies
59
60. Knowledge base 2
Requires an ever larger classification and lexical
knowledge base containing many kinds of data:
2. Linguistic knowledge bases such as WordNet and mono-,bi-, and
multi-lingual dictionaries and thesauri
3. Many KOS (Knowledge Organization Systems), such as LCC, UDC,
DDC, DMOZ directory, LCSH, Gene Ontology, Schlagwortnormdatei
4. These will over time be fused into one large multilingual knowledge
base with many terminological and translation relationships and
relationships linking terms to concepts,
with an increasing number of concepts semantically represented by a
DL formula.
60
62. L00 Transportation and traffic
L10 Traffic system components
L13 Traffic facilities
L15Traffic stations
L17 Vehicles
L30 Modes of transportation
L33 Air transport
L37 Water transport
P00 Buildings, construction
P23 Buildings
P27 Architecture
P43 Construction
R00 Engineering
R30 Acoustics
R37 Soundproofing
T70 Military vs. civilian
T73 Military
T77 Civilian
62
Underlying faceted classification
63. HE Transportation
HE550-560 Ports, harbors,
docks, wharves, etc.
L00 Transportation and traffic T77 Civilian⊓
Inherited:
L00 Transportation and traffic T77 Civilian⊓
Added by editor:
L15 Traffic stations L37 Water transport⊓
Resolved to:
L15 Traffic stations L37 Water transport⊓ ⊓
T77 Civilian
63
Method: Assigning atomic concepts 1
64. NA6300-6307 Airport buildings From database already established:
Airport =
L15 Traffic stations L33 Air transport⊓
Buildings = P23 Buildings
Added by editor T77 Civilian
Resolved to
L15 Traffic stations L33 Air transport⊓ ⊓
P23 Buildings T77 Civilian⊓
64
Method: Assigning atomic concepts 2
65. TL681.S6 Airplanes. Soundproofing From database already established:
Airplane =
L17 Vehicles L33 Air transport⊓
Soundproofing = R37 Soundproofing
Added by editor: Nothing
Resolved to
L17 Vehicles L33 Air transport⊓ ⊓
R37 Soundproofing
65
Method: Assigning atomic concepts 3
66. Aeroplanes-Soundproofing From database already established:
Aeroplanes = Airplane [Spelling variant]
Therefore
Term is recognized as same as
Airplanes. Soundproofing
Resolved to
L17 Vehicles L33 Air transport⊓ ⊓
R37 Soundproofing
66
Method: Assigning atomic concepts 4
67. Any class formed by geographical
subdivision
Such as
NA6300-6307 Airport buildings
NA6305.E3 Egypt
Recognized using a dictionary of
geographical names
Inherits from subject class above it;
simply add the country
L15 Traffic stations L33 Air transport⊓
P23 Buildings T77 Civilian⊓ ⊓ Egypt⊓
No editor checking needed
67
Method: Assigning atomic concepts 5
71. 71
Hub
L17 Vehicles L33 Air transport⊓ ⊓
R37 Soundproofing
L17 Vehicles ⊓ L37 Water
transport ⊓ R37 Soundproofing
L17 Vehicles ⊓ L37 Water
transport ⊓
R37 Soundproofing T73⊓
Military⊓
Underwater
LCC
TL681.S6 Airplanes. Soundproofing
VM367.S6 Submarines.
Soundproofing
LCSH
Aeroplanes-
Soundproofing
Ships-Soundproofing
Mapping through a Hub
72. 72
Hub
Canonical form of
query
(DL formula)
User query
Free text
Combination of elemental
concepts through facets
(guided query formulation)
Controlled term(s) from a
KOS, possibly found
through browsing a KOS
Final query
(Enriched) free
text query
Query in terms
of a KOS
Mapping user queries
73. TL681.S6 Airplanes. Soundproofing
VM367.S6 Submarines. Soundproofing
Aeroplanes-Soundproofing
Ships-Soundproofing
[L17 Vehicles L33 Air transport⊓ ⊓
R37 Soundproofing]
[L17 Vehicles L37 Water transport⊓ ⊓
R37 Soundproofing Military]⊓
[L17 Vehicles L33 Air transport⊓ ⊓
R37 Soundproofing]
[L17 Vehicles L37 Water transport⊓ ⊓
R37 Soundproofing]
73
Query:
L17 Vehicles AND R37
Soundproofing
74. Examples
from NALT and LCSH
• NALT National Agricultural Library Thesaurus
• LCSH Library of Congress Subject Headings
74
75. Air pollution laws
LCSH term
Air – Pollution – Laws and regulations
[isa] Legal rule [appliedTo] {[isa] Condition [isConditionOf] Air [causedBy]
Pollutant [property] Undesirable}
NALT terms
Air pollution
[isa] Condition [isConditionOf] Air [causedBy] Pollutant [prop.] Undesirable
Laws and regulations
[isa] Legal rule
Mapping LCSH ▬► NALT
Air – Pollution – Laws and regulations ▬► Air pollution AND
Laws and regulations
Interpretation for indexing and searching in both directions
75
76. Soil moisture vs. Soil water
LCSH term
Soil moisture
[isa] Water [containedIn] Soil
NALT term
Soil water
[isa] Water [containedIn] Soil
Mapping LCSH ▬► NALT
Soil moisture ▬► Soil water
76
77. Greenhouse gardening
LCSH term
Greenhouse gardening
[isa] Gardening [inEnvironment] Greenhouse [inEnvironment] Home
NALT terms
Home gardening
[isa] Gardening [inEnvironment] Home
Greenhouse
[isa] Greenhouse
Mapping LCSH ▬► NALT
Greenhouse gardening ▬► Home gardening AND
Greenhouse
77
78. Salad greens
LCSH term
Salad greens
[isa] Green leafy vegetable [usedFor] Salad
NALT term
Green leafy vegetables
[isa] Green leafy vegetable
Mapping LCSH ▬► NALT
Salad greens ▬► BT Green leafy vegetables
78
80. Distributed implementation
• A KOS on the Web could assign DL formulas to its
concepts − let's call this a
semantically enhanced KOS or SEKOS
• Could use any of a number of faceted core
classifications or even several (using a unique URI
for each elemental concept)
• Core classifications could be mapped to each other
• It is now a simple matter to map from any SEKOS
to any other (somewhat dependent on the core
classifications used)
80
81. Examples
from the realm of AAT Taiwan
AAT Art and Architecture Thesaurus (Getty)
AAT Taiwan TELDAP, Institute for Information Science
Academia Sinica
TGM Thesaurus of Graphic Materials,
Library of Congress
E-HowNet A Lexical Knowledge Base for Semantic
Composition, Academia Sinica
81
84. E-HowNet ontology 廣義知識知識本體
• Building | 建築物
Facilities |設施
Chinese Word: 廟
English: Temple
Conceptual expression: {facilities |設施 : domain = {religion |宗教 }}
Chinese Word: 禪寺
English: Buddhist temple
Conceptual expression: {facilities |設施 : domain = {Buddhist |佛教 }}
Chinese Word: 道觀
English: Taoist temple/ Taoist quan
Conceptual expression: {facilities |設施 : domain = {Taoism |道教 }}
84
85. Mapping to Chinese
• Use E-HowNet formal semantic expressions
• Use terms that already exist in E-HowNet
• Add terms using computer-assisted derivation of
semantic expressions as described above for English
85
87. Analysis
Since English has only one word stitching,
AAT does not distinguishbetween the two specific concepts
even though the AAT scope note describes the two concepts
Solution
AAT AAT Taiwan
stitching 縫 (feng)
stitching (needlework) 縫合 (feng he)
stitching (bookbinding) 縫訂 (feng ding)
87
88. Principle
The classification should include all concepts that are lexicalized in
any language participating in a cross-language mapping system
If a language does not have a term for a concept, a term must be
invented.
This also happens when a concept is found through conceptual
analysis
88
89. Shades of meaning
Example
The AAT defines temple as
Buildings housing places devoted to the worship of a deity or
deities
But in Chinese culture, a temple (Miao( 廟 ) is devoted to
worshiping or honoring or communing with ancestors or spirits.
There are a number of further terms in Chinese for buildings
devoted to worshiping/ commemorating saints, or some famous
scholars, poets, or people with great achievement.
89
90. Shades of meaning
Thus in the concept structure we need
Temple (broad definition)
Building housing places devoted to the worshiping, communing
with, or honoring or commemorating a deity or deities or ancestors
or spirits or saints, or some famous scholars, poets, people with
great achievement.
Temple (narrow AAT defintion)
Miao( 廟 )
Other Chinese terms
90
91. The importance of good defintions
AAT Taiwan must make sure that all readers, English
and Chinese, understand all terms, English and
Chinese, and the often subtle differences.
The table on the next slide illustrates that
91
92. Uses of AAT Taiwan
92
Searching
Western art
Searching
Chinese art
Western user Understands English
terms
Needs to understand
Chinese terms
Chinese user Needs to understand
English terms
Understands Chinese
terms
All users need a good conceptual structure
96. E-HowNet ontology 廣義知識知識本
體• Building | 建築物
Facilities |設施
Chinese Word: 廟
English: Temple
Conceptual expression: {facilities |設施 : domain = {religion |宗教 }}
Chinese Word: 禪寺
English: Buddhist temple
Conceptual expression: {facilities |設施 : domain = {Buddhist |佛教 }}
Chinese Word: 道觀
English: Taoist temple/ Taoist quan
Conceptual expression: {facilities |設施 : domain = {Taoism |道教 }}96
97. 9797
Mapping Issues- 1Mapping Issues- 1
Terms related to Chinese religious concept
The word “temples” is frequently considered as an equivalent term “ 廟 miao” in Chinese.
However, due to different purposes of the building and the spirit that it worships, names of religious
buildings in Taiwan are varied.
Temples (buildings) (religious buildings, <religious structures>, ... Built Environment (Hierarchy
Name))
Note: Buildings housing places devoted to the worship of a deity or deities. In the strictest sense, it refers to
the dwelling place of a deity, and thus often houses a cult image. In modern usage a temple is generally a
structure, but it was originally derived from the Latin "templum" and historically has referred to an uncovered
place affording a view of the surrounding region. For Christian or Islamic religious buildings the terms
"churches" or "mosques" are generally used, but an exception is that "temples" is used for Protestant, as
opposed to Roman Catholic, places of worship in France and some French-speaking regions.
Q1. The mapping team has found that “temple” in AAT is broader than the concept in Chinese.
Therefore it is necessary to distinguish the differences in each Chinese terms before mapping.
98. 9898
Mapping Issues-Mapping Issues- 1
Terms related to Chinese religious concept
Despite the similar appearance, each of them has slight
difference from the others.
Miao( 廟 ): In the past, it was a place to worship ancestors.
Since Han dynasty, it had been used as a place both
worship ancestor and the spirits.
•ci ( 祠 ): It is built for the purpose to worship/
commemorate saints, or some famous scholars, poets,
people with great achievement. Sometimes also refers to
those places that worship ancestors.
• si ( 寺 ): Generally refers to a place that worship the
Buddhist spirits. Sometimes it also refers to the place where
Buddhist monk live.
• an ( 庵 ): used to refers to scholars’ study place ( 書齋 ).
Nowadays it refers to where Buddhist nuns live.
• guan( 觀 ): only refers to Taoist building
• yan( 巖 ): refers to those miaos( 廟 ) established nearby or
at mountain.
99. 9999
Mapping Issues- 2Mapping Issues- 2
A Chinese set term stands for broader meaning
文玩 (Wenwan)
• A word combined with two words “ 文物 cultural object” and “ 古玩 antique curio”.
( 文玩兼有文物與古玩的特點 )
• It specifically refers to those objects used in the educated people’s reading room,
including those writing equipments, small tools and decorations.
( 特指文人書齋中的書寫設備、小工具和擺飾 )
• It represents the culture of reading room, by combining the practical function of
educated people’s study equipments and art crafts for people’s appreciation.
( 文玩是種書齋文化,結合了文人書生的實用器物與具觀賞價值的藝術品 )
• Common objects including: ink stones, seals, washing vessels, fine sculptured
decoration…etc. “Elegant” and “exquisite” are its essential characters.
( 文玩為以下器物的泛稱 : 古硯、印章、洗器、牙雕…等,“雅” 與“巧”是其基本特徵 )
• It is produced in a highly artistic manner. Nowadays it has become popular collection that
values more as an artifact than equipment.
( 以高藝術性的方式製造,現今多為賞而勿用的文房珍玩 )
100. 100100
Mapping Issues- 2Mapping Issues- 2
A Chinese set term stands for broader meaning
• lotus pod shaped vessel for
injecting water 雙蓮房水注
• banana leaf shaped
wooden plate 癭木蕉葉盤
• olive stone boat sculpture
果核小舟
• blue snuff bottle
藍地金星套料鼻煙
壺
• lotus leaf shaped washing
vessel 白玉荷葉式洗
• seal 鴛錦雲章循
連環田黃石印
• ivory desk tidy 象牙
雕山水人物筆筒
101. 101101
Mapping Issues- 2Mapping Issues- 2
A Chinese set term stands for broader meaning
Q2. The mapping team has found the meaning of Wenwan is boarder than the
term “desk sets”, while some part of them are equal. Therefore, the 2 terms
are inexact equivalent relations.
Is it more suitable to create a new term “Wenwan” in the structure, or it should
be referred as desk sets?
desk sets (sets (groups), <object groupings by general context>, ... Object Groupings and Systems)
Note: Sets of matching articles intended to be used on a desk including such articles as inkstands,
pen trays, and stamp boxes.
102. When English terms have broader meanings (1/2)
EX1:
• ID: 300053660 Record Type: concept
stitching (<processes and techniques by specific type>, <processes and techniques>, Processes and Techniques)
Note: Refers to the process of fastening, joining, closing, uniting, mending, or creating ornamentation by stitches,
which are the portions of thread left in fabric or another material by the in and out movement of a threaded needle
through the thickness or surface of the material, or the loops of thread created on a needle in knitting or other
needlework. In the context of textiles and needleworking, its meaning overlaps with "sewing." In the context of
bookbinding, it refers to the fastening together a number of leaves or gatherings by passing the thread or wire through
all of the sheets at once; it is distinct from "sewing," which, in the context of bookbinding, is used for the joining of
leaves or gatherings together one by one by drawing thread or wire backwards and forwards through the back fold of
each sheet to attach it to the cords.
縫綴 / 縫訂 (< 依特定種類區分之過程與技術 >, < 過程與技術 >, 過程與技術 )
範圍註:意指藉由針線進出穿過材料或其表面的動作,將針腳留在布料或其他材料上,或是在編織或針織時形成針目,
以固定、結合、閉合、合併、修補或製作裝飾的過程。若指涉的是紡織品與手工繡品方面,則其意義與「縫紉
(sewing) 」一詞重疊。若指涉的是書籍裝幀方面,則意指將若干頁面或疊層,用線或金屬線一次穿過所有紙張固定在一
起。而「線訂( sewing )」在書籍裝幀方面,是指用針線或金屬線,在一疊書頁的摺縫處上下穿梭,使其與裝訂線固定
的方法。
In different contexts (bookbinding vs. needleworking), the meaning of stitching may change accordingly. In
AAT, two kinds of meanings are explained in the same record, but when translating the term into Chinese,
there will be two ways of translation, 縫合 (feng he) for needleworking and 縫訂 (feng ding) for
bookbinding. The same problem occurs in the record of sewing (ID: 300053658).
Stiching in needleworkingStiching in bookbinding
103. When English terms have broader meanings (2/2)
EX2:
300004184 Record Type: concept
patios (<uncovered spaces>, <rooms and spaces by form>, ... Components (Hierarchy Name))
Note: Paved recreation areas adjoining contemporary houses and the paved interior courts of
Spanish or Spanish-style buildings.
The term refers to two types of open spaces, so the translations could be 屋外休憩區 or ( 西班牙 ) 內
院 .
Spanish patioPatio adjoining a house
104. When English terms have broader meanings (2/2)
EX3:
• 300266238
Record Type: concept
maculatures (<prints by process or technique>, prints (visual works), ... Visual and Verbal
Communication)
Note: Prints made by taking a second impression without reinking the plate, often
used for cleaning the plate. May also refer to blotting paper. Also used for scrap
paper that can reinforce fabric in Medieval embroidery.
The term maculatures could be used in three different contexts (prints, blotting paper, and
scrap paper) , and there are three kinds of translations ( 吸墨紙版畫、吸墨紙、固定刺繡布
料的紙片 ).
Q3: In this case, since the record contains multiple meanings, it’s not a problem
of which one being the preferred term, so how should the Chinese translations be
displayed?
Editor's Notes
For the medical data, one important finding is the multi-level topical structure, by that it means the typology can be applied to multiple levels. Here is an example. The answer to the question of “What is the most effective treatment for ADHD in children?” is analyzed; the Figure shows the top levels of the topical map from the coding.
The central topic is ADHD in Children, we have inattention and hyperactivity or comorbid conditions as symptoms directly matching the topic; we have stimulant medication therapy as medical treatment for the relevance category of method and solution. Further on, the answer provides information about the significance of the therapy, medical trial of the therapy, the side effect of the therapy, and other treatment methods in comparison to the therapy. We have poor patient and parent education as hindering factor to the medical treatment.
The general idea here is that some information presented in the answer relates to the central topic only through “steps” of connection. For example, the poor patient and parent education is not the hindering factor to the disease, but a hindering condition of the medical treatment to the disease.
This coupling structure is very interesting and important, it indicates that the same set of topical relevance relationships can be applied on many levels; although the level can vary, the relationship types remain stable on each level.
the presented information relates to the central topic only through “steps” of connection. For example, “A large random trial” does not directly connect to the central topic of “ADHD in children”: It is not the “Evaluation” of “ADHD in children”; instead, it is the “Evaluation” of “Stimulant medication therapy”, which in turn is the “Medical treatment” of “ADHD in children”.
The typology also applies to image tagging
The tags are organized first by the functional relevance categories, then by the presentation type. This slide shows all the tags directly matching topic of the image; such as saint bartholomew, executioner, knife referring to the focal image content; physical anguish, profound emotion as adjectival elaboration; expressive hands, gestures, confronts as adverbial elaboration; martyrdom is the theme of image;
The tags are organized first by the functional relevance categories, then by the presentation type. This slide shows all the tags directly matching topic of the image; such as saint bartholomew, executioner, knife referring to the focal image content; physical anguish, profound emotion as adjectival elaboration; expressive hands, gestures, confronts as adverbial elaboration; martyrdom is the theme of image;
The tag, Christ’s sacrifice and crucifixion is analogy to the image; biographic information of artist and creation time period as context.
The tags are organized first by the functional relevance categories, then by the presentation type. This slide shows all the tags directly matching topic of the image; such as saint bartholomew, executioner, knife referring to the focal image content; physical anguish, profound emotion as adjectival elaboration; expressive hands, gestures, confronts as adverbial elaboration; martyrdom is the theme of image;
The tags are organized first by the functional relevance categories, then by the presentation type. This slide shows all the tags directly matching topic of the image; such as saint bartholomew, executioner, knife referring to the focal image content; physical anguish, profound emotion as adjectival elaboration; expressive hands, gestures, confronts as adverbial elaboration; martyrdom is the theme of image;
The tag, Christ’s sacrifice and crucifixion is analogy to the image; biographic information of artist and creation time period as context.
The study focuses on the rhetorical functional roles. It includes evidence, context, comparison, evaluation, method, etc. Let’s zoom in on this facet.
Since the typology has many levels and it’s easy to get lost. This is just an overview, I will discuss some of the categories in greater detail as we move along.
I will present the findings from literature analysis and also from MALACH data analysis together, in this way, you get to see how literature analysis contributes to developing the typology, you also get to see how the typology is exemplified by the MALACH data.
The typology has two facets, 1, functional role and 2, mode of reasoning. These two facets are diagonal but equally important to characterize types of topical relevance.
Functional role is concerned about …I’d like to focus the talk on this perspective
This slide gives you a second-level detail of the function-based facet.
RST stands for Rhetorical structural theory, it provides a comprehensive framework for investigating relationships based on functional role. It was developed by Mann & Thompson in 1980s for the purpose of guiding natural language generation. Later, it is widely applied in discourse analyses in various domains.
[RST looks at the relationships hold between text parts in a coherent discourse by identifying the functional role of each text part in the discourse.]
[You may ask how discourse relations relate to topical relevance? Well, in most cases, a coherent discourse is organized around a topic, different text parts play different roles and work together to contribute to the reader’s understanding of the topic. In information search, this is not much different. we also have a topic, and we gather and organize different pieces of relevant information to improve the user’s understanding on the topic. In terms of contributing to the receiver&apos;s understanding of a topic, the functional roles played by different parts of text and those by different pieces of relevant information are much the same.]
From this framework, close relations can be drawn to the MALACH relevance types, direct, indirect, context, comparison, also it supplements other element, such as method, solution, evaluation and so on.
Now let’s zoom into Direct relevance.
Comparison is based on perceived similarities, [but what really make it interesting are the pieces that are different.]
Under comparison, we have two sub-facets, first, comparison by similarity or comparison by difference, Both similar and contrasting cases are considered relevant. Second, by factor that is different. These two sub-facets are coupled, for example, by varying the external factors or participants, we often get similar cases happening in a different place, at a different time, or with a different person; by fixing these factors and varying the act itself, we often get contrasting cases happening in the same time-space or with the same participant.
Varying values of the first two topical facets, we get the same or comparable event/ experience/ phenomenon happening in a different place, at a different time, in a different situation, or with a different person; varying values of the last facet, we get an opposite event/ experience/ phenomenon happening in the same time-space or involving the same participant(s). The three major topical facets define three specific types of comparative evidence:
This is the detailed typology for comparison relevance
By similarity and contrast
By factor that is different