Topic Maps are a means of representing sophisticated indexes of arbitrary information collections for the purpose of semantic information integration. The creation of Topic Maps is based on a theoretical foundation, which is introduced in this paper. The paper also discusses the Observation Principle, which results from a close investigation of the Subject Equality Decision Chain, and the Semantic Talk System, which generates sophisticated conceptual indexes of speech streams in real time. This paper describes how these indexes are created, how they are represented as Topic Maps, and how they can be used for integration purposes.
Real-time Generation of Topic Maps from Speech Streams
1.
2.
3.
4. Subject Equality Decision Chain
   1. World without any sensory system: Are there any elk?
   2. Sensory systems come on stage, catching Subject Stages:
      From the child's perspective ("Elk are sweet."): I always caught the same Subject, an elk.
      From the ranger's perspective ("Bernd needs a cow."): I caught Lisa, Ud (fighting), and Bernd (in summer, in winter, and as a calf).
      From the zoologist's perspective ("Elk are loners."): I caught two deer and three elk.
   Subject Identity: Deciding about Subject Identity is a perspective-dependent process under uncertainty as to whether Subject Stages caught on different occasions belong to the same Subject.
5. Subject Equality Decision Chain
   1. World without any sensory system: Are there any elk?
   2. Sensory systems come on stage, catching Subject Stages.
   3. Documenting the impressions (from the ranger's perspective):
      (1) Subjectness: I am only interested in Lisa, Ud, and Bernd, not in snow or trees.
      (2) Create Subject Proxies for the current Subject Stages of Lisa, Ud, and Bernd.
      (3) Try to document the decision about the Subject Identity of the current Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary. The Subject Identity of Subject Stages is mapped to the Subject Indication of the Subject Proxy.
      (4) Document all further information observed about the Subject Stage. (Documenting = modelling = losing information)
   4. Subject Equality is decided according to the governing SMD.
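The perspective dependence of the Subject Equality decision can be illustrated with a minimal Python sketch. It is not taken from the slides: the field names, the `PERSPECTIVES` table, and the helper functions are hypothetical, and the assignment of Lisa and Ud to "deer" is an assumption made only so that the zoologist's count (two deer, three elk) works out.

```python
from collections import defaultdict

# Five Subject Stages as caught by the sensory system. Which individuals the
# zoologist classifies as deer is NOT stated on the slide; it is assumed here.
STAGES = [
    {"individual": "Lisa", "species": "deer", "occasion": "first sighting"},
    {"individual": "Ud", "species": "deer", "occasion": "fighting"},
    {"individual": "Bernd", "species": "elk", "occasion": "in summer"},
    {"individual": "Bernd", "species": "elk", "occasion": "in winter"},
    {"individual": "Bernd", "species": "elk", "occasion": "as calf"},
]

# Each perspective is modelled as a rule that maps a Subject Stage to the
# identity it documents for that stage (hypothetical stand-in for an SMD).
PERSPECTIVES = {
    "child": lambda stage: "elk",                  # everything is "an elk"
    "ranger": lambda stage: stage["individual"],   # individuals matter
    "zoologist": lambda stage: stage["species"],   # species matter
}


def group_subjects(stages, identity_rule):
    """Group Subject Stages that the rule considers to be the same Subject."""
    subjects = defaultdict(list)
    for stage in stages:
        subjects[identity_rule(stage)].append(stage["occasion"])
    return dict(subjects)


def count_subjects(perspective):
    """Number of distinct Subjects seen from the given perspective."""
    return len(group_subjects(STAGES, PERSPECTIVES[perspective]))
```

Under these assumptions the same five Subject Stages yield one Subject for the child, three for the ranger, and two for the zoologist, which is the point of the decision chain: equality is decided by the governing rule, not by the stages themselves.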
6.
7. The Observation Principle ... or how to create Topic Maps from digital domains?
   (1.) Observe the information collections of interest (texts, video streams, etc.) and detect Subject Stages of Subjects of interest from the current perspective.
   (2.) Decide about the Subject Identity of the observed Subject Stages.
   (3.) Create a Subject Proxy for each Subject Stage of interest.
   (4.) Document the decision about the Subject Identity of the current Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary ( ... and with respect to all expected Subject Equality Decision Approaches applied later to this Subject Proxy).
   (5.) Document all further information observed about the Subject Stage by the given means of the governing SMD ontology, TMV ontology, and TMV vocabulary.
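The five steps of the Observation Principle can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the `SubjectProxy` class, the `BASE` namespace, and the rule that the observed term alone decides Subject Identity are hypothetical and stand in for whatever the governing SMD and TMV would prescribe.

```python
from dataclasses import dataclass, field

# Hypothetical namespace for the documented subject indicators.
BASE = "http://example.org/observed#"


@dataclass
class SubjectProxy:
    """Minimal stand-in for a Subject Proxy (structure is an assumption)."""
    subject_indicators: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def observe(tokens, terms_of_interest):
    """Step 1: detect a Subject Stage for every token of interest."""
    for position, token in enumerate(tokens):
        if token in terms_of_interest:
            yield {"term": token, "position": position}


def decide_identity(stage):
    """Step 2: from this perspective the term alone decides the identity."""
    return BASE + stage["term"]


def create_proxies(tokens, terms_of_interest):
    """Steps 3 to 5: one proxy per stage, with identity and further details."""
    proxies = []
    for stage in observe(tokens, terms_of_interest):
        proxy = SubjectProxy()                                 # step 3
        proxy.subject_indicators.add(decide_identity(stage))   # step 4
        proxy.properties["position"] = stage["position"]       # step 5
        proxies.append(proxy)
    return proxies
```

For the token sequence `"Fisichella overtakes and Fisichella wins".split()` and the term set `{"Fisichella"}`, this yields two proxies that carry the same subject indicator, so a later Subject Equality decision can treat them as the same Subject.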
8.
9. SemantikTalk: Speech recognition and text mining
   Sliders for configuration parameters (zooms)
   Local context window
   Overview window (bird's-eye view)
   Window for additional information (documents, pictures)
10. The Semantic Talk System
   Speech recognition 1 (VoicePro) ... Speech recognition n
   Integration and Serialization
   Topic & Association Extraction
   Background Knowledge with Semantic Relations
   Visualization component
11.
12. From RDF-output to LTM

   <st:node rdf:ID="node_Fisichella">
     <st:ID>160615</st:ID>
     <st:label>Fisichella</st:label>
     <st:nodelevel>1</st:nodelevel>
     <st:ref_wort_nr rdf:resource="http://www.tt.de/dtd/st/pap#node_160615"/>
     <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
   </st:node>

   ST observed a noticeable usage of the term "Fisichella" in the speech stream ...
   Semantic mapping between the RDF output and the Topic Map using the Omnigator ...

   [id7406 : id7276 = "Fisichella"
      @"http://www.texttech.de/dtd/st/pap#node_160615"
      @"http://www.texttech.de/dtd/st/pap#node_Fisichella"]
   {id7406, id3670, [[1]]}
   {id7406, id7650, [[160615]]}
   id7549( id7406 : id463, id464 : id2195 )
   [id464]
   {id464, id1636, [[0.31766722453166335]]}
   {id464, id4378, [[3]]}
   {id464, id787, [[4]]}

   ... and this 'noticeable usage of the term Fisichella' becomes the Subject in the Topic Map. (Subject Identity => the same algorithm observes the 'noticeable usage' twice)
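The step from the RDF node to an LTM topic can be approximated with a short Python sketch. It is a hand-written substitute, not the Omnigator mapping used on the slide. Assumptions: the `st` namespace URI is not declared in the fragment and is guessed here from the `rdf:resource` value, the `rdf:RDF` wrapper is added to make the fragment parseable, and the function `node_to_ltm` and the topic id `t0` are hypothetical. The sketch emits only the topic line, without the type, occurrences, and association shown on the slide, and it does not reconcile the differing host names (`www.tt.de` in the RDF, `www.texttech.de` in the LTM).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the slide does not declare the st namespace; this URI is guessed.
ST_NS = "http://www.tt.de/dtd/st/pap#"

# The node from the slide, wrapped in an rdf:RDF element so it can be parsed.
DOCUMENT = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:st="{ST_NS}">
  <st:node rdf:ID="node_Fisichella">
    <st:ID>160615</st:ID>
    <st:label>Fisichella</st:label>
    <st:nodelevel>1</st:nodelevel>
    <st:ref_wort_nr rdf:resource="http://www.tt.de/dtd/st/pap#node_160615"/>
    <st:variant st:index="3" st:type="4" st:weight="0.3176"/>
  </st:node>
</rdf:RDF>"""


def node_to_ltm(node, topic_id, base=ST_NS):
    """Render one st:node as an LTM topic with two subject indicators."""
    label = node.findtext(f"{{{ST_NS}}}label")
    ref = node.find(f"{{{ST_NS}}}ref_wort_nr").get(f"{{{RDF_NS}}}resource")
    own = base + node.get(f"{{{RDF_NS}}}ID")
    return f'[{topic_id} = "{label}" @"{ref}" @"{own}"]'


root = ET.fromstring(DOCUMENT)
ltm_topics = [
    node_to_ltm(node, f"t{index}")
    for index, node in enumerate(root.findall(f"{{{ST_NS}}}node"))
]
```

Both URIs end up as subject indicators of the same topic, which mirrors the slide's point that the documented 'noticeable usage' is what later equality decisions can rely on.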
13. Integration with other Topic Maps
   ... Starting point: integration with another Topic Map created by the Observation Principle (for example, a motor-sport Topic Map)
   - A mapping Topic Map is needed (which should be created under the Observation Principle, too):
     [id @"http://www.formula1-fansite.org/Fisichella" @"http://www.texttech.de/dtd/st/pap#node_Fisichella"]
   From the mapping perspective, the same Subject is caught
   - if Semantic Talk observes a noticeable usage of the term 'Fisichella'
   - if the motor-sport Topic Map caught a person with the same name.
   ... To allow more accurate mapping decisions, it seems necessary that the creation process of a Topic Map is documented, too.
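How a mapping topic brings the two Topic Maps together can be sketched in Python. The sketch assumes the common Topic Maps merging rule that topics sharing at least one subject indicator represent the same Subject. The dictionary layout and the function `merge_by_indicators` are hypothetical, and the motor-sport topic is reduced to a name and one indicator.

```python
ST_URI = "http://www.texttech.de/dtd/st/pap#node_Fisichella"
F1_URI = "http://www.formula1-fansite.org/Fisichella"

# Topic from the Semantic Talk Topic Map (noticeable usage of the term).
SEMANTIC_TALK = {"names": {"Fisichella"}, "indicators": {ST_URI}}
# Topic from the motor-sport Topic Map (a person with the same name).
MOTOR_SPORT = {"names": {"Fisichella"}, "indicators": {F1_URI}}
# Mapping topic from the slide: it carries both subject indicators.
MAPPING = {"names": set(), "indicators": {ST_URI, F1_URI}}


def merge_by_indicators(topics):
    """Merge all topics that share at least one subject indicator."""
    merged = []  # groups with pairwise disjoint indicator sets
    for topic in topics:
        current = {
            "names": set(topic["names"]),
            "indicators": set(topic["indicators"]),
        }
        remaining = []
        for other in merged:
            if other["indicators"] & current["indicators"]:
                # Shared indicator: absorb the existing group.
                current["names"] |= other["names"]
                current["indicators"] |= other["indicators"]
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged
```

Without the mapping topic, the two topics stay separate because their indicators differ. Adding `MAPPING` to the input collapses all three into one topic carrying both indicators, which is exactly the mapping decision that the slide suggests should itself be documented.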