Application of Topic Segmentation in
Audiovisual Information Retrieval
Petra Galuščáková
galuscakova@ufal.mff.cuni.cz
Information Retrieval
● Finding material (usually documents) of an unstructured nature
(usually text) that satisfies an information need from within large
collections (usually stored on computers) [21]
● Audiovisual Information Retrieval
- Documents to retrieve in audiovisual format
- Harder navigation
● Dependency on segmentation
- We want to minimize the user's effort and retrieve the
exact start point
- Especially for audio and audiovisual data
-> we need precise segmentation
- Eskevich and Jones [6] report significantly better IR results with
the TextTiling segmentation algorithm than with the C99
segmentation algorithm
Topic Segmentation
● Segment
● Coherent part of data
● Definition depends on the application – e.g. a news
story or paragraphs in text
● Hierarchical/linear structure
● Audiovisual recordings
● No given text structure
● Needs to be segmented into sentences first
Topic Segmentation in Text
● Automatic Speech Recognition (ASR) transforms the audio track into text
● Errors in transcripts could influence segmentation
● Malioutov et al. [20] show that the evaluation of segmentation algorithms differs
depending on whether manual or automatic transcripts are used
● Hsueh and Moore [12] show that, despite the word recognition errors (WER equal
to 39.1%), their segmentation systems did not perform significantly worse on ASR
transcripts than on reference transcripts.
– An ASR system is likely to misrecognize different occurrences of a word in the
same way
– Using more features than the ASR output alone could reduce the impact of
recognition errors
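The WER figure quoted above is conventionally the word-level edit distance between the reference and the ASR hypothesis, divided by the number of reference words. A minimal sketch of that standard definition follows; the function name and the whitespace tokenization are assumptions made for illustration, not taken from [12].

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, substitution))
        prev = cur
    return prev[-1] / len(ref)
```

With the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", one substitution and one deletion over six reference words give a WER of 2/6.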
Systems for Topic
Segmentation
● Lexical Cohesion Based
● TextTiling [10], C99 [3], LCSeg [8], MinCut [19], Dotplot [16], IClustSeg [26],
TextLec [29], DivSeg [31], NM09 [24], U00 [33], JSeg [1], Transeg [17], LCP [15],
LSITiling [A9], TopSeg [11]
● Features Based
● [12], [7], PLSA [14] – Decision trees [25, 32], Maximum Entropy, SVM [14]
● Generative Models
● HMM [13, 32], BayesSeg, U00 [4, 33]
Lexical Cohesion
● Cohesion
- The sentences "stick together" to function as a whole [23]
- Achieved through back-reference, conjunction, and semantic word relations
● Division according to Halliday and Hasan [9]:
● Reiteration:
– Reiteration with identity of reference:
1. Mary bit into a peach. 2. Unfortunately the peach wasn't ripe.
– Reiteration without identity of reference:
1. Mary ate some peaches. 2. She likes peaches very much.
– Reiteration by means of a superordinate (or a subordinate or a synonym):
1. Mary ate a peach. 2. She likes fruit.
● Collocation:
– Systematic semantic relation (systematically classifiable):
1. Mary likes green apples. 2. She does not like red ones.
– Nonsystematic semantic relation (not systematically classifiable):
1. Mary spent three hours in the garden yesterday. 2. She was digging
potatoes.
Systems for Topic
Segmentation - C99
● C99 [3]
● Based on the cosine measure of sentence pairs
– Similarity between sentences x and y: $\mathrm{sim}(x, y) = \frac{\sum_j f_{x,j} f_{y,j}}{\sqrt{\sum_j f_{x,j}^2 \sum_j f_{y,j}^2}}$, where $f_{i,j}$ denotes the frequency of word j in
sentence i
– Similarity values are used to build the similarity matrix [17]
– Then the rank matrix is built from the similarity matrix
● Each value in the similarity matrix is replaced by its rank in the local
region. The rank is the number of neighbouring elements with a lower
similarity value [3]
– Finally, clustering is applied
● Iteratively searches for the split that maximizes the density of submatrices in the rank matrix
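The first two steps (cosine similarity matrix, then local ranking) can be sketched as follows. This is a simplified illustration, not the reference C99 implementation: whitespace tokenization, the `radius` parameter and the function names are assumptions, and the final clustering step is omitted.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_matrix(sentences):
    """Pairwise cosine similarity of all sentences."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(x, y) for y in vectors] for x in vectors]


def rank_matrix(sim, radius=1):
    """Replace each value by the number of neighbours (within `radius`)
    that have a strictly lower similarity value."""
    n = len(sim)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < n and 0 <= nj < n
                    if (di or dj) and inside and sim[ni][nj] < sim[i][j]:
                        ranks[i][j] += 1
    return ranks
```

For the sentences "the cat sat", "the cat ran" and "stocks fell sharply", the first two share two of three words (similarity 2/3), while the third shares none with either (similarity 0).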
Systems for Topic
Segmentation - TextTiling
● Based on lexical repetition
● Uses the cosine measure
● A window of fixed length is gradually slid through the text, and information
about word overlap between the left and right parts of the window is converted into
a digital signal [10]
● The signal is then smoothed
● The shape of the post-processed signal is used to determine segment breaks.
● High similarity values, implying that the adjacent blocks cohere well, tend to form
peaks, whereas low similarity values, indicating a potential boundary between tiles,
create valleys. [10]
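The sliding-window idea can be sketched in a few lines. This is a simplified illustration rather than the algorithm of [10] as published: the block size, the moving-average smoothing and the fixed `threshold` on valley depth are all assumed values chosen for the toy input.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gap_scores(tokens, block=4):
    """Similarity of the `block` tokens left and right of every gap."""
    return [
        cosine(Counter(tokens[g - block:g]), Counter(tokens[g:g + block]))
        for g in range(block, len(tokens) - block + 1)
    ]


def smooth(values, width=1):
    """Moving average over a window of 2 * width + 1 points."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width):i + width + 1]
        out.append(sum(window) / len(window))
    return out


def depth_scores(values):
    """Depth of each point relative to the nearest peak on each side."""
    depths = []
    for i, v in enumerate(values):
        left = right = v
        for x in reversed(values[:i]):      # climb to the left peak
            if x < left:
                break
            left = x
        for x in values[i + 1:]:            # climb to the right peak
            if x < right:
                break
            right = x
        depths.append((left - v) + (right - v))
    return depths


def boundaries(tokens, block=4, width=1, threshold=1.0):
    """Token indices at which a sufficiently deep valley occurs."""
    depths = depth_scores(smooth(gap_scores(tokens, block), width))
    return [i + block for i, d in enumerate(depths) if d > threshold]
```

On a toy sequence of eight tokens about one topic followed by eight about another, the only valley deep enough to pass the threshold lies at the topic change.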
Systems for Topic
Segmentation – Features
Based
● Text
● Lexical features
- Cue words and n-grams (now, okay, let’s, um, so, good night, ...) [12, 28]
- Distribution of nouns [7]
● Contextual Features:
- Dialogue act type [12]
- Speaker role (e.g., project manager, marketing expert)
- Tense, aspect [24]
● Vocabulary
- Word groups (months, days, country names, named entities, ...)
- POS tags
- Pronoun (Does the sentence contain a pronoun?), Numbers (segment of a
specific length), Is this sentence part of a conversation, i.e. does this sentence
contain “direct speech”? [12]
- Interlocutors mention agenda items (e.g., presentation, meeting) or content words
more often when initiating a new discussion. [12]
Systems for Topic
Segmentation – Features
Based
● Text
● According to Hsueh [12] interlocutors do the following more often than usual at
segment boundaries: start speaking before they are ready, give information, elicit
an assessment of what has been said so far, or act to smooth social functioning
and make the group happier
● Lexical Chains [2, 14]
- Does the word appear in the next few sentences?
- Does the word appear in the next few words?
- Does the word appear in the previous few sentences?
- Does the word appear in the previous few words?
- Does the word appear in the previous few sentences but not in the next few
sentences?
- Does the word begin the preceding sentence?
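A few of the sentence-level questions above can be turned into boolean features as in the sketch below. The function name, the `window` size and the whitespace tokenization are illustrative assumptions; the word-level variants from the list are left out for brevity.

```python
def lexical_chain_features(sentences, sent_idx, word, window=2):
    """Boolean lexical-chain features for `word` in sentence `sent_idx`."""
    tokenized = [s.lower().split() for s in sentences]
    prev_sents = tokenized[max(0, sent_idx - window):sent_idx]
    next_sents = tokenized[sent_idx + 1:sent_idx + 1 + window]
    in_prev = any(word in s for s in prev_sents)
    in_next = any(word in s for s in next_sents)
    begins_prev = (
        sent_idx > 0
        and bool(tokenized[sent_idx - 1])
        and tokenized[sent_idx - 1][0] == word
    )
    return {
        "in_next_sentences": in_next,
        "in_previous_sentences": in_prev,
        "in_previous_not_next": in_prev and not in_next,
        "begins_preceding_sentence": begins_prev,
    }
```

A word that occurs in the preceding sentences but not in the following ones suggests that a lexical chain ends here, which is the kind of cue a feature-based segmenter can learn from.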
Systems for Topic
Segmentation – Features
Based
● Audio:
● Conversational Features [12]
- Amount of overlapping speech
- Speaker activity change [24]
● Prosodic Features [12]
- Fundamental frequency F0 – maximum, mean F0, patterns across the
boundary [32]
- Energy, energy at multiple points (e.g., the first and last 100 and 200 ms, the
first and last quarter, the first and second half)
- Pitch contour (relative to the speaker’s baseline [32]) – pitch is less robust [30]
- Rate of speech (number of words and the number of syllables spoken per
second)
- Silence [1]
- Duration of pauses [30], vowels [1], final vowels and final rhymes [32]
Segmentation Using Audio
Information
● Segment is likely to start with higher pitched sounds and a lower rate of speech
● Tendency of speakers to reset pitch at the start of a new major unit - final fall in pitch
associated with the ends of such units [30]
● Slowing down toward the ends of units [30]
● Topic shifts often occur after a pause of relatively long duration [12]
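Two of the audio cues, rate of speech and pause duration, can be derived from word-level timestamps such as those produced by many ASR systems. The sketch below assumes a hypothetical input format of `(word, start_seconds, end_seconds)` tuples and an arbitrary `min_pause` threshold.

```python
def speech_rate(words):
    """Words per second over the span covered by `words`."""
    duration = words[-1][2] - words[0][1]
    return len(words) / duration if duration > 0 else 0.0


def long_pauses(words, min_pause=1.0):
    """(start time of next word, pause length) for every pause >= min_pause."""
    pauses = []
    for current, following in zip(words, words[1:]):
        gap = following[1] - current[2]
        if gap >= min_pause:
            pauses.append((following[1], gap))
    return pauses
```

The positions returned by `long_pauses` are only candidate boundaries; in a feature-based system the pause length would be one input among several.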
Systems for Topic
Segmentation – Features
Based
● Video:
● Color similarity
– Based on histogram
● Motion similarity
– Pixel comparison
– Especially frontal shots, hand movements [12]
– Gestural features (eye gaze behaviour) [5], face similarity
● Bag of Visual Words
● Interlocutors do not move around a lot when a new discussion is brought up [12]
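Histogram-based color similarity between two frames can be sketched with histogram intersection, one common choice; the slides do not say which measure is intended. Frames are assumed to be non-empty lists of `(r, g, b)` tuples with values 0 to 255, and the number of bins is an arbitrary illustrative value.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with `bins` bins per channel."""
    if not pixels:
        raise ValueError("frame must contain at least one pixel")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins-1.
        rb, gb, bb = r * bins // 256, g * bins // 256, b * bins // 256
        hist[rb * bins * bins + gb * bins + bb] += 1
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """1.0 for identical color distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

A sharp drop in this similarity between consecutive frames is a typical signal for a shot change.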
Systems for Topic
Segmentation – Features
Based
● Hsu et al. [11] create new features as combinations of other features
● They show that the most useful features are the anchor face and pauses
● According to Hsueh and Moore [12], lexical features must be combined with other features, in
particular conversational features (i.e., lexical cohesion, overlap, pause, speaker
change)
Fusion
● Llinas [18] defines fusion as an information process that associates, correlates and
combines data and information from single or multiple sensors or sources to achieve
refined estimates of parameters, characteristics, events and behaviors
● Given many sources of information and context, how do we best "interpret"
the data? [22]
● Levels of fusion
● Early fusion strategy
- All modalities are "concatenated into one"
- Only one decision is taken over the concatenated input
● Intermediate fusion strategy
- E.g. creating several feature vectors, which are finally processed by an HMM
● Late fusion strategy
- Each source is processed individually by a specific recognizer
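The difference between early and late fusion can be illustrated with two small helpers. This is only a sketch: the function names are hypothetical, and combining per-modality scores by a weighted average is an assumed choice, since the slides do not specify how the individual decisions are merged.

```python
def early_fusion(text_feats, audio_feats, video_feats):
    """Early fusion: concatenate all modalities into one feature vector,
    which a single classifier would then consume."""
    return list(text_feats) + list(audio_feats) + list(video_feats)


def late_fusion(scores, weights=None):
    """Late fusion: each modality has already produced its own boundary
    score; combine them by a weighted average (assumed combination rule)."""
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total
```

For example, `late_fusion({"text": 0.8, "audio": 0.4}, {"text": 3, "audio": 1})` weights the text-based recognizer three times as heavily as the audio-based one.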
Our Approach - Objectives
● Segments should be further processed by an IR system
● Usable on several systems – MediaEval Competition Data and Dialogy corpus
● Applicable to various types of recordings: news data and dialogues
● Language independent – should work at least with English and Czech data
● Small amount of training data for the given type of recordings
● Training data exists for other types of recordings (e.g. TDT corpus – available in LDC,
Malach)
● Possible to integrate user feedback (in the Dialogy corpus)
Our Approach - Solution
● Should be feature based – one of the features could be
the output of a cohesion-based algorithm (TextTiling)
● Should incorporate all types of information (textual, audio
and visual)
● Should use fusion for mixing these different sources
● In visual track - shot detection should be used
● Active learning could help to incorporate user feedback
References
● [1] Katarina Bartkova: How far can prosodic cues help in word segmentation? In Proceedings of the 3rd International Conference on
Speech Prosody SP2006, 2006
● [2] Doug Beeferman, Adam Berger, John Lafferty: Statistical models for text segmentation, Journal Machine Learning - Special issue on
natural language learning archive Volume 34 Issue 1-3, Feb. 1999, Pages 177 – 210, 1999
● [3] Freddy Y. Y. Choi : Advances in domain independent linear text segmentation, Proceedings of the 1st Meeting of the North American
Chapter of the Association for Computational Linguistics (ANLP-NAACL-00). pp. 26–33, 2000
● [4] Jacob Eisenstein, Regina Barzilay: Bayesian Unsupervised Topic Segmentation, Proceeding EMNLP '08 Proceedings of the
Conference on Empirical Methods in Natural Language Processing, Pages 334-343, 2008
● [5] Jacob Eisenstein, Regina Barzilay, Randall Davis: Gestural Cohesion for Topic Segmentation, ACL 2008: 852-860, 2008
● [6] Maria Eskevich, Gareth J. F. Jones: DCU at MediaEval 2011: Rich Speech Retrieval. MediaEval 2011
● [7] Martin Franz , Bhuvana Ramabhadran , Todd Ward , Michael Picheny: Automated Transcription and Topic Segmentation of Large
Spoken Archives, In Proceedings of Eurospeech, 2003
● [8] Michel Galley , Kathleen Mckeown : Discourse Segmentation of Multi-Party Conversation, in 41st Annual Meeting of ACL, 2003
● [9] M. A. K. Halliday, Ruqaiya Hasan: Cohesion in English, 1976
● [10] Marti A. Hearst: TextTiling: A Quantitative Approach to Discourse Segmentation, Technical Report, 1993
● [11] Winston Hsu, Shih-fu Chang, Chih-wei Huang, Lyndon Kennedy, Ching-yung Lin, Giridharan Iyengar: Discovery and Fusion of Salient
Multi-modal Features towards News Story Segmentation, In IS&T/SPIE Electronic Imaging, 2004
● [12] Pei-yun Hsueh, Johanna D. Moore: Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives. ACL
2007, 2007.
● [13] Minwoo Jeong, Ivan Titov:Multi-document Topic Segmentation, Proceeding CIKM '10 Proceedings of the 19th ACM international
conference on Information and knowledge management, Pages 1119-1128, 2010
● [14] David Kauchak, Francine Chen: Feature-Based Segmentation of Narrative Documents, Proceeding FeatureEng '05 Proceedings of
the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing, Pages 32-39, 2005
● [15] Hideki Kozima: Text Segmentation Based On Similarity Between Words, Proceeding ACL '93 Proceedings of the 31st annual
meeting on Association for Computational Linguistics, Pages 286-288, 1993
● [16] Niraj Kumar, Piyush Rai, Chandrika Pulla, C.V. Jawahar: Video Scene Segmentation with a Semantic Similarity, Proceedings of
5th Indian International Conference on Artificial Intelligence (IICAI 2011),14-16 December, 2011, Bangalore, India, 2011.
● [17] Alexandre Labadié, Violaine Prince: Lexical and semantic methods in inner text topic segmentation: A comparison between c99
and Transeg, Proceeding NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems:
Applications of Natural Language to Information Systems, Pages 347 – 349, 2008
● [18] James Llinas, Christopher Bowman, Galina Rogova, Alan Steinberg, and Frank White: Revisiting the JDL Data Fusion Model II, In
P. Svensson and J. Schubert Eds., Proceedings of the Seventh International Conference on Information Fusion FUSION 2004, 2004
● [19] Igor Malioutov, Regina Barzilay: Minimum Cut Model for Spoken Lecture Segmentation, Proceeding ACL-44 Proceedings of the
21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational
Linguistics, Pages 25-32, 2006
● [20] Igor Malioutov, Alex Park, Regina Barzilay, James Glass : Making Sense of Sound: Unsupervised Topic Segmentation over
Acoustic Input, In Proceedings, ACL, 2007
● [21] Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to Information Retrieval, 2008
● [22] Stéphane Marchand-Maillet: Multimedia Information Retrieval, Promise Winter School, 2012
● [23] Jane Morris, Graeme Hirst: Lexical Cohesion Computed by Thesaural Relations as an Indicator of the Structure
● [24] John Niekrasz, Johanna Moore: Participant Subjectivity and Involvement as a Basis for Discourse Segmentation, Proceeding
SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and
Dialogue, Pages 54-61, 2009
● [25] Rebecca J. Passonneau, Diane J. Litman: Discourse Segmentation by Human and Automated Means, Journal Computational
Linguistics Volume 23 Issue 1, March 1997, Pages 103-139, 1997
● [26] Raúl Abella Pérez, José Eladio Medina Pagola: An Incremental Text Segmentation by Clustering Cohesion, Proceeding CIARP'10
Proceedings of the 15th Iberoamerican congress conference on Progress in pattern recognition, image analysis, computer vision, and
applications, Pages 261-268, 2010
● [27] Lev Pevzner, Marti A. Hearst: A Critique and Improvement of an Evaluation Metric for Text Segmentation, Journal Computational
Linguistics, Volume 28 Issue 1, March 2002, Pages 19-36, 2002
● [28] Jay M. Ponte , W. Bruce Croft : Text Segmentation by Topic, In Proceedings of the First European Conference on Research and
Advanced Technology for Digital Libraries, 1997
● [29] Laritza Hernández Rojas, José E. Medina Pagola: A Novel Method of Segmentation by Topic Using Lower Windows and Lexical
Cohesion, Proceeding CIARP'07 Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in
pattern recognition, image analysis and applications Pages 724-733, 2007
● [30] Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, Gökhan Tür: Prosody-Based Automatic Segmentation of Speech into
Sentences and Topics, Journal Speech Communication - Special issue on accessing information in spoken audio archive Volume 32
Issue 1-2, Sept. 2000, Pages 127 – 154, 2000
● [31] Fei Song, William M. Darling, Adnan Duric, Fred W. Kroon: An Iterative Approach to Text Segmentation, Proceeding ECIR'11
Proceedings of the 33rd European conference on Advances in information retrieval, Pages 629-640, 2011
● [32] Gökhan Tür, Andreas Stolcke, Dilek H. Tür, Elizabeth Shriberg: Integrating Prosodic and Lexical Cues for Automatic Topic
Segmentation, Comput. Linguist., Vol. 27, No. 1. pp. 31-57, 2001
● [33] Masao Utiyama, Hitoshi Isahara: A Statistical Model for Domain-Independent Text Segmentation, In Proceedings of the 9th
Conference of the European Chapter of the Association for Computational Linguistics, 2001
Thank you

 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Application of Topic Segmentation in Audiovisual Information Retrieval

  • 1. Application of Topic Segmentation in Audiovisual Information Retrieval Petra Galuščáková galuscakova@ufal.mff.cuni.cz
  • 2. Information Retrieval ● Finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers) [Manning, 21] ● Audiovisual Information Retrieval - Documents to retrieve are in audiovisual format - Harder navigation ● Dependency on segmentation - We want to minimize the user's effort and retrieve the exact start point - Especially for audio and audiovisual data -> we need precise segmentation - Eskevich [6] reports significantly better IR results with the TextTiling segmentation algorithm than with the C99 segmentation algorithm
  • 3. Topic Segmentation ● Segment ● Coherent part of data ● Definition depends on the application – e.g. news stories, paragraphs in text ● Hierarchical/linear structure ● Audiovisual recordings ● No given text structure ● Need to be segmented into sentences first
  • 4. Topic Segmentation in Text ● Automatic Speech Recognition (ASR) transforms the audio track into text ● Errors in transcripts could influence segmentation ● Malioutov et al. [20] show that the evaluation of segmentation algorithms differs depending on whether manual or automatic transcripts are used ● Hsueh and Moore [12] show that, despite a word error rate (WER) of 39.1%, their segmentation systems did not perform significantly worse on ASR transcripts than on reference transcripts. – An ASR system is likely to mis-recognize different occurrences of a word in the same way – Using more features than the ASR output alone could reduce the impact of recognition errors
  • 5. Systems for Topic Segmentation ● Lexical Cohesion Based ● TextTiling [10], C99 [3], LCSeg [8], MinCut [19], Dotplot [16], IClustSeg [26], TextLec [29], DivSeg [31], NM09 [24], U00 [33], JSeg [1], Transeg [17], LCP [15], LSITiling [A9], TopSeg [11] ● Features Based ● [12], [7] PLSA [14] – Decision trees [25, 32], Maximum Entropy, SVM [14] ● Generative Models ● HMM [13, 32], BayesSeg, U00 [4, 33]
  • 6. Lexical Cohesion ● Cohesion - The sentences "stick together" to function as a whole [23] - Achieved through back-reference, conjunction, and semantic word relations ● Division according to Halliday and Hasan [9]: ● Reiteration: – Reiteration with identity of reference: 1. Mary bit into a peach. 2. Unfortunately the peach wasn't ripe. – Reiteration without identity of reference: 1. Mary ate some peaches. 2. She likes peaches very much. – Reiteration by means of superordinate (subordinate, and synonyms): 1. Mary ate a peach. 2. She likes fruit. ● Collocation: – Systematic semantic relation (systematically classifiable): 1. Mary likes green apples. 2. She does not like red ones. – Nonsystematic semantic relation (not systematically classifiable): 1. Mary spent three hours in the garden yesterday. 2. She was digging potatoes.
  • 7. Systems for Topic Segmentation - C99 ● C99 [3] ● Based on the cosine measure of sentence pairs – Similarity between sentences x and y: $\mathrm{sim}(x, y) = \frac{\sum_j f_{x,j} f_{y,j}}{\sqrt{\sum_j f_{x,j}^2 \sum_j f_{y,j}^2}}$, where $f_{i,j}$ denotes the frequency of word j in sentence i – Similarity values are used to build the similarity matrix [17] – Then the rank matrix is built from the similarity matrix ● Each value in the similarity matrix is replaced by its rank in the local region. The rank is the number of neighbouring elements with a lower similarity value [3] – Finally, clustering is applied ● Iteratively searching for the maximum density of submatrices in the rank matrix
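The first two C99 steps described above (cosine similarity matrix, then local ranking) can be illustrated with a minimal sketch. This is not Choi's implementation: whitespace tokenization, the function names, and the small `radius` neighbourhood are assumptions made for illustration, and the normalization of ranks and the final clustering step are left out.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(sentences):
    """Pairwise cosine similarity of sentences (naive whitespace tokens)."""
    counts = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(x, y) for y in counts] for x in counts]


def rank_matrix(sim, radius=1):
    """Replace each value by the number of neighbours with lower similarity.

    `radius` is an illustrative choice; the neighbourhood is the
    (2 * radius + 1) x (2 * radius + 1) window around each cell.
    """
    n = len(sim)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if (di or dj) and 0 <= ni < n and 0 <= nj < n:
                        if sim[ni][nj] < sim[i][j]:
                            ranks[i][j] += 1
    return ranks


sentences = [
    "mary ate a peach",
    "the peach was ripe",
    "stocks fell sharply",
    "markets and stocks fell",
]
sim = similarity_matrix(sentences)
ranks = rank_matrix(sim)
```

Ranking makes the matrix depend only on the local ordering of similarity values, which is the motivation given in [3] for this step.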
  • 8. Systems for Topic Segmentation - TextTiling ● Based on lexical repetition ● Uses the cosine measure ● A window of fixed length is gradually slid through the text, and information about word overlap between the left and right part of the window is converted into a digital signal [10] ● The resulting graph is then smoothed ● The shape of the post-processed signal is used to determine segment breaks. ● High similarity values, implying that the adjacent blocks cohere well, tend to form peaks, whereas low similarity values, indicating a potential boundary between tiles, create valleys. [10]
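The sliding-window idea above can be sketched as follows. This is a simplified illustration, not the TextTiling algorithm of [10]: the block size, the moving-average smoothing, and the rule "a boundary is any strict local minimum" are assumptions, and the depth scoring of the original method is omitted.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def gap_scores(tokens, block_size=4):
    """Similarity of the blocks to the left and right of every gap."""
    scores = []
    for gap in range(block_size, len(tokens) - block_size + 1):
        left = Counter(tokens[gap - block_size:gap])
        right = Counter(tokens[gap:gap + block_size])
        scores.append((gap, cosine(left, right)))
    return scores


def smooth(values, width=1):
    """Moving average over a window of 2 * width + 1 values."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width):i + width + 1]
        out.append(sum(window) / len(window))
    return out


def valleys(scores):
    """Token positions whose smoothed score is a strict local minimum."""
    gaps = [g for g, _ in scores]
    vals = smooth([s for _, s in scores])
    return [
        gaps[i]
        for i in range(1, len(vals) - 1)
        if vals[i] < vals[i - 1] and vals[i] < vals[i + 1]
    ]


tokens = ("peach fruit " * 4 + "stock market " * 4).split()
boundaries = valleys(gap_scores(tokens))
```

On this toy input the only valley falls at token position 8, where the vocabulary changes from fruit words to market words.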
  • 9. Systems for Topic Segmentation – Features Based ● Text ● Lexical features - Cue words and n-grams (now, okay, let’s, um, so, good night, ...) [12, 28] - Distribution of nouns [7] ● Contextual Features: - Dialogue act type [12] - Speaker role (e.g., project manager, marketing expert) - Tense, aspect [24] ● Vocabulary - Word groups (months, days, country names, named entities, ...) - POS tags - Pronoun (Does the sentence contain a pronoun?), Numbers (segment of a specific length), Is this sentence part of a conversation, i.e. does this sentence contain “direct speech”? [12] - Interlocutors mention agenda items (e.g., presentation, meeting) or content words more often when initiating a new discussion. [12]
  • 10. Systems for Topic Segmentation – Features Based ● Text ● According to Hsueh [12], interlocutors do the following more often than usual at segment boundaries: start speaking before they are ready, give information, elicit an assessment of what has been said so far, or act to smooth social functioning and make the group happier ● Lexical Chains [2, 14] - Does the word appear in the next few sentences? - Does the word appear in the next few words? - Does the word appear in the previous few sentences? - Does the word appear in the previous few words? - Does the word appear in the previous few sentences but not in the next few sentences? - Does the word begin the preceding sentence?
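The sentence-level lexical chain questions above map naturally onto binary features. The sketch below is a hypothetical illustration: the function name, the feature names, and the `span` of two sentences standing in for "few" are assumptions, not definitions taken from [2] or [14].

```python
def chain_features(sentences, idx, word, span=2):
    """Binary lexical-chain features for `word` in sentence `idx`.

    `sentences` is a list of token lists; `span` (an assumed value)
    is how many neighbouring sentences count as "few".
    """
    prev = sentences[max(0, idx - span):idx]
    nxt = sentences[idx + 1:idx + 1 + span]
    in_prev = any(word in s for s in prev)
    in_next = any(word in s for s in nxt)
    begins_prev = (
        idx > 0 and bool(sentences[idx - 1]) and sentences[idx - 1][0] == word
    )
    return {
        "in_next_sentences": in_next,
        "in_prev_sentences": in_prev,
        "in_prev_not_next": in_prev and not in_next,
        "begins_prev_sentence": begins_prev,
    }


sentences = [
    ["peaches", "are", "sweet"],
    ["mary", "likes", "peaches"],
    ["she", "also", "likes", "apples"],
    ["stocks", "fell", "today"],
]
features = chain_features(sentences, 1, "peaches")
```

A word that occurred recently but does not recur soon (`in_prev_not_next`) suggests a chain ending, which is the kind of evidence a feature-based segmenter could use near a boundary.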
  • 11. Systems for Topic Segmentation – Features Based ● Audio: ● Conversational Features [12] - Amount of overlapping speech - Speaker activity change [24] ● Prosodic Features [12] - Fundamental frequency F0 – maximum, mean F0, patterns across the boundary [32] - Energy, energy at multiple points (e.g., the first and last 100 and 200 ms, the first and last quarter, the first and second half) - Pitch contour (relative to the speaker’s baseline [32]) – pitch is less robust [30] - Rate of speech (number of words and the number of syllables spoken per second) - Silence [1] - Duration of pauses [30], vowels [1], final vowels and final rhymes [32]
  • 12. Segmentation Using Audio Information ● A segment is likely to start with higher-pitched sounds and a lower rate of speech ● Speakers tend to reset pitch at the start of a new major unit - a final fall in pitch is associated with the ends of such units [30] ● Speakers slow down toward the ends of units [30] ● Topic shifts often occur after a pause of relatively long duration [12]
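The pause cue in the last point can be turned into boundary candidates when word-level timings are available, for example from an ASR system. The sketch below is illustrative only: the `(token, start, end)` input format and the one-second threshold are assumptions, not values reported in [12] or [30].

```python
def long_pauses(words, min_pause=1.0):
    """Return (time, duration) for every pause of at least `min_pause` seconds.

    `words` is a list of (token, start, end) tuples sorted by start time.
    """
    candidates = []
    for (_, _, prev_end), (_, next_start, _) in zip(words, words[1:]):
        pause = next_start - prev_end
        if pause >= min_pause:
            candidates.append((prev_end, pause))
    return candidates


words = [
    ("so", 0.0, 0.3),
    ("okay", 0.4, 0.8),
    ("next", 2.5, 2.9),
    ("topic", 3.0, 3.4),
]
pauses = long_pauses(words)
```

Here only the silence between "okay" and "next" exceeds the threshold, so it is the single candidate. In a feature-based system the pause duration would more likely be passed on as a feature than used as a hard decision.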
  • 13. Systems for Topic Segmentation – Features Based ● Video: ● Color similarity – Based on histogram ● Motion similarity – Pixel comparison – Especially frontal shots, hand movements [12] – Gestural features (eye gaze behaviour) [5], face similarity ● Bag of Visual Words ● Interlocutors do not move around a lot when a new discussion is brought up [12]
  • 14. Systems for Topic Segmentation – Features Based ● Hsu et al. [11] create new features as combinations of other features ● They show that the most useful features are the anchor face and pauses ● According to Hsueh [12], lexical features must be combined with other features, in particular conversational features (i.e., lexical cohesion, overlap, pause, speaker change)
  • 15. Fusion ● Llinas [18] defines fusion as an information process that associates, correlates and combines data and information from single or multiple sensors or sources to achieve refined estimates of parameters, characteristics, events and behaviors ● Given many sources of information and context, how do we do our best to “interpret” the data? [22] ● Levels of fusion ● Early fusion strategy - All modalities are "concatenated into one" - Only one decision is taken over the concatenated input ● Intermediate fusion strategy - E.g. creating various feature vectors, which are finally processed by an HMM ● Late fusion strategy - Each source is processed individually by a specific recognizer
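The contrast between early and late fusion can be shown with a toy sketch. Everything here is hypothetical: the linear scorer stands in for whatever classifier is actually trained, and the feature values, weights, and the 0.5 decision threshold are invented for illustration.

```python
def early_fusion(text_feats, audio_feats, video_feats, weights, bias=0.0):
    """Concatenate all modalities, then take one decision over the result."""
    x = list(text_feats) + list(audio_feats) + list(video_feats)
    if len(x) != len(weights):
        raise ValueError("one weight per concatenated feature is required")
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return score > 0


def late_fusion(modality_scores, modality_weights=None):
    """Combine per-modality boundary probabilities by a weighted average.

    Each modality is assumed to have been processed by its own recognizer,
    which produced a probability in [0, 1].
    """
    if modality_weights is None:
        modality_weights = {name: 1.0 for name in modality_scores}
    total = sum(modality_weights[name] for name in modality_scores)
    combined = sum(
        modality_weights[name] * p for name, p in modality_scores.items()
    ) / total
    return combined >= 0.5


# Early fusion: one decision over [text | audio | video] features.
is_boundary_early = early_fusion([1.0], [0.0], [1.0], weights=[0.5, 0.5, -0.2])

# Late fusion: each recognizer votes with its own probability.
is_boundary_late = late_fusion({"text": 0.9, "audio": 0.6, "video": 0.3})
```

Early fusion lets a single model exploit interactions between modalities, while late fusion keeps the recognizers independent, so a modality can be added or dropped without retraining the others.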
  • 16. Our Approach - Objectives ● Segments should be further processed by an IR system ● Usable on several systems – MediaEval Competition Data and Dialogy corpus ● Applicable to various types of recordings: news data and dialogs ● Language independent – should work at least with English and Czech data ● Small amount of training data for a given type of recording ● Training data exists for other types of recordings (e.g. TDT corpus – available in LDC, Malach) ● Possible to integrate user feedback (in Dialogy corpus)
  • 17. Our Approach - Solution ● Should be feature based – one of the features could be the output of a cohesion-based algorithm (TextTiling) ● Should incorporate all types of information (textual, audio and visual) ● Should use fusion for mixing these different sources ● In the visual track, shot detection should be used ● Active learning could help to incorporate user feedback
  • 18. References ● [1] Katarina Bartkova: How far can prosodic cues help in word segmentation? In Proceedings of the 3rd International Conference on Speech Prosody SP2006, 2006 ● [2] Doug Beeferman, Adam Berger, John Lafferty: Statistical models for text segmentation, Machine Learning, Special issue on natural language learning, Volume 34, Issue 1-3, Pages 177–210, 1999 ● [3] Freddy Y. Y. Choi: Advances in domain independent linear text segmentation, Proceedings of the 1st Meeting of the North American Chapter of the Association for Computational Linguistics (ANLP-NAACL-00), Pages 26–33, 2000 ● [4] Jacob Eisenstein, Regina Barzilay: Bayesian Unsupervised Topic Segmentation, EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing, Pages 334–343, 2008 ● [5] Jacob Eisenstein, Regina Barzilay, Randall Davis: Gestural Cohesion for Topic Segmentation, ACL 2008, Pages 852–860, 2008 ● [6] Maria Eskevich, Gareth J. F. Jones: DCU at MediaEval 2011: Rich Speech Retrieval, MediaEval 2011 ● [7] Martin Franz, Bhuvana Ramabhadran, Todd Ward, Michael Picheny: Automated Transcription and Topic Segmentation of Large Spoken Archives, In Proceedings of Eurospeech, 2003 ● [8] Michel Galley, Kathleen McKeown: Discourse Segmentation of Multi-Party Conversation, in 41st Annual Meeting of ACL, 2003 ● [9] M. A. K. Halliday, Ruqaiya Hasan: Cohesion in English, 1976 ● [10] Marti A. Hearst: TextTiling: A Quantitative Approach to Discourse Segmentation, Technical Report, 1993 ● [11] Winston Hsu, Shih-fu Chang, Chih-wei Huang, Lyndon Kennedy, Ching-yung Lin, Giridharan Iyengar: Discovery and Fusion of Salient Multi-modal Features towards News Story Segmentation, In IS&T/SPIE Electronic Imaging, 2004 ● [12] Pei-yun Hsueh, Johanna D. Moore: Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives, ACL 2007, 2007
  • 19. References ● [13] Minwoo Jeong, Ivan Titov: Multi-document Topic Segmentation, CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management, Pages 1119–1128, 2010 ● [14] David Kauchak, Francine Chen: Feature-Based Segmentation of Narrative Documents, FeatureEng '05 Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing, Pages 32–39, 2005 ● [15] Hideki Kozima: Text Segmentation Based On Similarity Between Words, ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics, Pages 286–288, 1993 ● [16] Niraj Kumar, Piyush Rai, Chandrika Pulla, C. V. Jawahar: Video Scene Segmentation with a Semantic Similarity, Proceedings of 5th Indian International Conference on Artificial Intelligence (IICAI 2011), 14-16 December, 2011, Bangalore, India, 2011 ● [17] Alexandre Labadié, Violaine Prince: Lexical and semantic methods in inner text topic segmentation: A comparison between c99 and Transeg, NLDB '08 Proceedings of the 13th international conference on Natural Language and Information Systems: Applications of Natural Language to Information Systems, Pages 347–349, 2008 ● [18] James Llinas, Christopher Bowman, Galina Rogova, Alan Steinberg, Frank White: Revisiting the JDL Data Fusion Model II, In P. Svensson and J. Schubert Eds., Proceedings of the Seventh International Conference on Information Fusion FUSION 2004, 2004 ● [19] Igor Malioutov, Regina Barzilay: Minimum Cut Model for Spoken Lecture Segmentation, ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, Pages 25–32, 2006 ● [20] Igor Malioutov, Alex Park, Regina Barzilay, James Glass: Making Sense of Sound: Unsupervised Topic Segmentation over Acoustic Input, In Proceedings, ACL, 2007 ● [21] Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to Information Retrieval, 2008 ● [22] Stéphane Marchand-Maillet: Multimedia Information Retrieval, PROMISE Winter School, 2012 ● [23] Jane Morris, Graeme Hirst: Lexical Cohesion Computed by Thesaural Relations as an Indicator of the Structure
  • 20. References ● [24] John Niekrasz, Johanna Moore: Participant Subjectivity and Involvement as a Basis for Discourse Segmentation, SIGDIAL '09 Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Pages 54–61, 2009 ● [25] Rebecca J. Passonneau, Diane J. Litman: Discourse Segmentation by Human and Automated Means, Computational Linguistics, Volume 23, Issue 1, Pages 103–139, 1997 ● [26] Raúl Abella Pérez, José Eladio Medina Pagola: An Incremental Text Segmentation by Clustering Cohesion, CIARP'10 Proceedings of the 15th Iberoamerican congress conference on Progress in pattern recognition, image analysis, computer vision, and applications, Pages 261–268, 2010 ● [27] Lev Pevzner, Marti A. Hearst: A Critique and Improvement of an Evaluation Metric for Text Segmentation, Computational Linguistics, Volume 28, Issue 1, Pages 19–36, 2002 ● [28] Jay M. Ponte, W. Bruce Croft: Text Segmentation by Topic, In Proceedings of the First European Conference on Research and Advanced Technology for Digital Libraries, 1997 ● [29] Laritza Hernández Rojas, José E. Medina Pagola: A Novel Method of Segmentation by Topic Using Lower Windows and Lexical Cohesion, CIARP'07 Proceedings of the Congress on pattern recognition 12th Iberoamerican conference on Progress in pattern recognition, image analysis and applications, Pages 724–733, 2007 ● [30] Elizabeth Shriberg, Andreas Stolcke, Dilek Hakkani-Tür, Gökhan Tür: Prosody-Based Automatic Segmentation of Speech into Sentences and Topics, Speech Communication, Special issue on accessing information in spoken audio, Volume 32, Issue 1-2, Pages 127–154, 2000 ● [31] Fei Song, William M. Darling, Adnan Duric, Fred W. Kroon: An Iterative Approach to Text Segmentation, ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval, Pages 629–640, 2011 ● [32] Gökhan Tür, Andreas Stolcke, Dilek Hakkani-Tür, Elizabeth Shriberg: Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation, Computational Linguistics, Volume 27, Issue 1, Pages 31–57, 2001 ● [33] Masao Utiyama, Hitoshi Isahara: A Statistical Model for Domain-Independent Text Segmentation, In Proceedings of the 9th Conference of the European Chapter of the Association for Computational Linguistics, 2001