SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Presented By:
Patel Prashant(CE 137)
The Web, databases, and other digitized information storehouses contain a
growing volume of audio content.
 For example newscasts, sporting events, telephone conversations,
recordings of meetings, Webcasts, documentary archives etc.
 Users want to make the most of this material by searching and indexing
the digitized audio content.
 In the past, companies had to create and manually analyze written
transcripts of audio content because using computers to recognize, interpret,
and analyze digitized speech was difficult.
 However, the development of faster microprocessors, larger storage
capacities, and better speech-recognition algorithms has made audio mining
easier.
Audio mining, also called audio searching, takes a text-based query
and locates the search term or phrase in an audio file.
 This helps users by, for example, letting them quickly get to specific
places in a recorded conversation or determine when a company is
mentioned in a newscast.
 Audio indexing uses speech recognition to analyze an entire file and
produce a searchable index of content bearing words and their
locations.
This is critical because audio content is in a binary format that is
otherwise not readily searchable.
Indexing audio content thus enables searching.
 There are two main approaches to audio mining :
1.Text-based indexing :
oIt converts speech to text and then identifies words in a dictionary
that can contain several hundred thousand entries. If a word or name is
not in the dictionary, the system will choose the most similar word it
can find.
oThe system uses language understanding to create a confidence level
for its findings. For findings with less than a 100 percent confidence
level, the system offers other possible word matches.
2. Phoneme-based indexing:
o It doesn’t convert speech to text but instead works only with sounds.
o The system first analyzes and identifies sounds in a piece of audio
content to create a phonetic-based index. It then uses a dictionary of
several dozen phonemes to convert a user’s search term to the correct
phoneme string.
o Phonemes are the smallest units of speech in a language, all words
are set of phonemes.
oFinally, the system looks for the search terms in the index.
 A phonetic system requires a more efficient search tool because it
must phoneticize the query term, then try to match it with the existing
phonetic string output .This is considerably more complex than using
one of the many existing text-based search tools.
Phoneme-based searches can result in more false matches than the
text-based approach, particularly for short search terms, because many
words sound alike or sound like parts of other words.
However, phonetic indexing can still be useful if the analyzed
material contains important words that are likely to be missing from a
text system’s dictionary, such as foreign terms and names of people and
places
Text- and phoneme-based systems operate in much the same way, except
that the former uses a text-based dictionary and the latter uses a phonetic
dictionary.
A speech recognizer converts the observed acoustic signal into the
corresponding written representation of the spoken words.
Speech recognition software contains acoustic models of the way in which
all phonemes are represented.
Also, there is a statistical language model that indicates how likely words are
to follow each other in a specific Language.
By using these capabilities, as well as complex probability analysis, the
technology can take a speech signal of unknown content and convert it to a
series of words .
Figure: ScanSoft Audio Mining System
 Since music is often
described by
genre, we would
like to annotate
our music data with genre.
Classification by genre
is useful for
music search and
retrieval and also for
playlist generation
Linear and non-linear neural networks
Gaussian Classification
Gaussian mixture models
Hidden Markov model
Let us consider a simple weather forecast problem and try to
emulate a model that can predict tomorrow's weather based on
today’s condition. In this example we have three stationary all day
weather, which could be sunny (S), cloudy (C), or Rainy (R). From
the
history of the weather of the town under investigation we have the
following table
•We refer to the weather conditions by state q that are
sampled at instant t and the problem is to find the
probability of weather condition of tomorrow given today's
condition
P(qt+1 /qt).
•An acceptable approximation for n instants history is :
•P(qt+1/qt , qt-1 , qt-2 , ….. , qt-n ) » P(qt+1 /qt).
•This is the first order Markov chain as the history is
considered to be one instant only.
Let us now ask this question: Given today as sunny (S) what is
the probability that the next following five days are S , C , C , R
and S, having the above model?
The answer resides in the following formula using first order
Markov chain:
P(q1 = S, q2=S, q3=C, q4=C, q5=R, q6=S) =
P(S).P(q2=S/q1=S). P(q3=C/q2=S). P(q4=C/q3=C).
P(q5=R/q4=C). P(q6=S/q5=R)
= 1 x 0.7 x 0.2 x 0.8 x 0.15 x 0.15
= 0.00252
The initial probability P(S) = 1, as it is assumed that today is
sunny.
The model is completely defined by these three sets of parameters a,
b, and p and the model of N states and M observations can be
referred to by :
λ = (A , B , p )
where A = {aij}, B = {bj(wk)} 1 <=i , j <=N and 1< =k <= M.
aij represents the probability of state transition (probability of being
in state Sj given state Si )
aij = P(qt+1=Sj / qt=Si) .
bj(wk) is the probability distribution in a state Sj
w is the alphabet and k is the number of symbols in this alphabet.
π = {1 0 0 } is the initial state probability distribution.
Each phoneme(unit of sound) could be represented by a
state of different and varying duration.
Accordingly, the transition between different phonemes to
form a word can be represented by A = {aij}.
 The observations in this case are the sounds produced in
each position and due to the variations in the evolution of
each sound
this can be also represented by a probabilistic function
B = {bj(wk)}.
A major challenge for speech recognition tools has been recognizing the
speech of different users in different environments.
With this in mind, BBN, IBM, Fast-Talk, and ScanSoft have designed their
audio mining technology to be largely speaker independent.
For example, Fast-Talk’s acoustic models are trained to recognize numerous
speakers via exposure to audio data from speakers representing various ages,
dialects, and speaking styles.
Some audio mining technology uses acoustic models tuned to understand
speech from different environments— such as telephony, TV, or radio.
Designing filters to reduce the background noise that can interfere
with accurate speech recognition.
Creating efficient data structures for representing content.
Developing algorithms that quickly work through the data structures
during indexing and searching.
 Precision is improving but it is still a key issue impeding the
technology’s widespread adoption, particularly in such accuracy-
critical applications as court reporting and medical dictation.
Processing conversational speech can be particularly difficult
because of such factors as overlapping words and background noise.
Breakthroughs in natural language understanding will eventually
lead to big improvements, but until then audio mining will get
better only incrementally.
It’s currently seen as a ‘nice to have,’ not quite yet a ‘need to have’
technology.
Companies could use audio mining to analyze customer-service and
helpdesk conversations or even voice mail.
 Law enforcement and intelligence organizations could use the
technology to analyze intercepted phone conversations.
Broadcast companies like CNN and Radio Free Asia are already
using
audio mining to quickly retrieve important background information
from previous broadcasts when new stories break.
A US prison is using ScanSoft’s audio mining product to analyze
recordings of prisoners’ phone calls to identify illegal activity.
 Musical audio mining relates to the identification of perceptually
important characteristics of a piece of music such as melodic,
harmonic or rhythmic structure.
 Searches can then be carried out to find pieces of music that are
similar in terms of their melodic, harmonic and/or rhythmic
characteristics.
This type of analysis can also be used in music to determine
characteristics like beats per minute (BPM), musical key, and musical
structure, information that is employed to classify music.
Music downloading sites that categorize music by genre uses audio
mining to organize the music.
 Research paper: “Lets Hear It For Audio Mining” by Neal Leavitt.
Research paper: “Tendencies, Perspectives, and Opportunities of
Musical Audio-Mining” by Ghent University, Belgium.
wikipedia.org/wiki/Audio_mining
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
 
Models of Distributed System
Models of Distributed SystemModels of Distributed System
Models of Distributed SystemAshish KC
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysisDataminingTools Inc
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
Semantic nets in artificial intelligence
Semantic nets in artificial intelligenceSemantic nets in artificial intelligence
Semantic nets in artificial intelligenceharshita virwani
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 
CS6010 Social Network Analysis Unit V
CS6010 Social Network Analysis Unit VCS6010 Social Network Analysis Unit V
CS6010 Social Network Analysis Unit Vpkaviya
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining Sulman Ahmed
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introductionnimmyjans4
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data miningkavitha muneeshwaran
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.pptneelamoberoi1030
 
4.5 mining the worldwideweb
4.5 mining the worldwideweb4.5 mining the worldwideweb
4.5 mining the worldwidewebKrish_ver2
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text miningKrish_ver2
 

Was ist angesagt? (20)

Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
 
Spatial databases
Spatial databasesSpatial databases
Spatial databases
 
Web content mining
Web content miningWeb content mining
Web content mining
 
web mining
web miningweb mining
web mining
 
Temporal data mining
Temporal data miningTemporal data mining
Temporal data mining
 
Models of Distributed System
Models of Distributed SystemModels of Distributed System
Models of Distributed System
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 
Data science unit1
Data science unit1Data science unit1
Data science unit1
 
Big Data Ecosystem
Big Data EcosystemBig Data Ecosystem
Big Data Ecosystem
 
Inverted index
Inverted indexInverted index
Inverted index
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Semantic nets in artificial intelligence
Semantic nets in artificial intelligenceSemantic nets in artificial intelligence
Semantic nets in artificial intelligence
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
CS6010 Social Network Analysis Unit V
CS6010 Social Network Analysis Unit VCS6010 Social Network Analysis Unit V
CS6010 Social Network Analysis Unit V
 
Classification in data mining
Classification in data mining Classification in data mining
Classification in data mining
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
 
Data Integration and Transformation in Data mining
Data Integration and Transformation in Data miningData Integration and Transformation in Data mining
Data Integration and Transformation in Data mining
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
4.5 mining the worldwideweb
4.5 mining the worldwideweb4.5 mining the worldwideweb
4.5 mining the worldwideweb
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text mining
 

Andere mochten auch

speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data miningJimit Rupani
 
Multimedia Data Mining using Deep Learning
Multimedia Data Mining using Deep LearningMultimedia Data Mining using Deep Learning
Multimedia Data Mining using Deep LearningBhagyashree Barde
 
Perfectessay.net research paper sample #1 apa style
Perfectessay.net research paper sample #1 apa stylePerfectessay.net research paper sample #1 apa style
Perfectessay.net research paper sample #1 apa styleDavid Smith
 
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...Professor Priscilla Harries
 
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...Polta Siam
 
Educatonal Research Methodology Framework
Educatonal Research Methodology FrameworkEducatonal Research Methodology Framework
Educatonal Research Methodology FrameworkSt. John's University
 
Writing The Research Paper
Writing The Research PaperWriting The Research Paper
Writing The Research PaperProf S
 
Business Process Re-Engineering Essay
Business Process Re-Engineering EssayBusiness Process Re-Engineering Essay
Business Process Re-Engineering EssayPrince Bertrand
 
Experimental/Research Reports
Experimental/Research ReportsExperimental/Research Reports
Experimental/Research ReportstheLecturette
 
A research paper on leadership
A research paper on leadershipA research paper on leadership
A research paper on leadershipPriyanka Singh
 
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...Pujan Kumar Saha
 
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...Prashant Maharshi
 
Workplace motivation paper
Workplace motivation paperWorkplace motivation paper
Workplace motivation paperPhillip Woodard
 
Eco tourism project paper
Eco tourism project paperEco tourism project paper
Eco tourism project paperSkillet Tony
 
Rural marketing research new
Rural marketing research newRural marketing research new
Rural marketing research newRamendra Tripathi
 
UGC NET PREPARATION
UGC NET PREPARATIONUGC NET PREPARATION
UGC NET PREPARATIONAloke Kumar
 
Serial Killers Psychology Presentation
Serial Killers Psychology PresentationSerial Killers Psychology Presentation
Serial Killers Psychology PresentationPietro Solda
 

Andere mochten auch (20)

Visual Data Mining
Visual Data MiningVisual Data Mining
Visual Data Mining
 
speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data mining
 
Multimedia Data Mining using Deep Learning
Multimedia Data Mining using Deep LearningMultimedia Data Mining using Deep Learning
Multimedia Data Mining using Deep Learning
 
Perfectessay.net research paper sample #1 apa style
Perfectessay.net research paper sample #1 apa stylePerfectessay.net research paper sample #1 apa style
Perfectessay.net research paper sample #1 apa style
 
Prime Essay Writings Egyptian art coursework
Prime Essay Writings Egyptian art courseworkPrime Essay Writings Egyptian art coursework
Prime Essay Writings Egyptian art coursework
 
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...
Brunel NIHR event 26 March 2015 - College of Health and Life Sciences researc...
 
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...
Siam term paper finalIdentifying the Marketing Strategy in Existing mobile Co...
 
Educatonal Research Methodology Framework
Educatonal Research Methodology FrameworkEducatonal Research Methodology Framework
Educatonal Research Methodology Framework
 
Writing The Research Paper
Writing The Research PaperWriting The Research Paper
Writing The Research Paper
 
Business Process Re-Engineering Essay
Business Process Re-Engineering EssayBusiness Process Re-Engineering Essay
Business Process Re-Engineering Essay
 
Experimental/Research Reports
Experimental/Research ReportsExperimental/Research Reports
Experimental/Research Reports
 
A research paper on leadership
A research paper on leadershipA research paper on leadership
A research paper on leadership
 
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...
Term paper on grameen phone telecom...Download: http://studyassignment.blogsp...
 
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...
A study on new ULIP guidelines and comparative analysis on new ULIP vis-a-vis...
 
Workplace motivation paper
Workplace motivation paperWorkplace motivation paper
Workplace motivation paper
 
Eco tourism project paper
Eco tourism project paperEco tourism project paper
Eco tourism project paper
 
Rural marketing research new
Rural marketing research newRural marketing research new
Rural marketing research new
 
UGC NET PREPARATION
UGC NET PREPARATIONUGC NET PREPARATION
UGC NET PREPARATION
 
Serial Killers Psychology Presentation
Serial Killers Psychology PresentationSerial Killers Psychology Presentation
Serial Killers Psychology Presentation
 
Data mining
Data miningData mining
Data mining
 

Ähnlich wie Audio Mining Techniques and Applications

Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert Systemcsandit
 
Voice Recognition System using Template Matching
Voice Recognition System using Template MatchingVoice Recognition System using Template Matching
Voice Recognition System using Template MatchingIJORCS
 
Robust Speech Recognition Technique using Mat lab
Robust Speech Recognition Technique using Mat labRobust Speech Recognition Technique using Mat lab
Robust Speech Recognition Technique using Mat labIRJET Journal
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsR Systems International
 
A Recorded Debating Dataset
A Recorded Debating DatasetA Recorded Debating Dataset
A Recorded Debating DatasetScott Faria
 
Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemIJERA Editor
 
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...aciijournal
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...aciijournal
 
Mood classification of songs based on lyrics
Mood classification of songs based on lyricsMood classification of songs based on lyrics
Mood classification of songs based on lyricsFrancesco Cucari
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...aciijournal
 
Statistical Named Entity Recognition for Hungarian – analysis ...
Statistical Named Entity Recognition for Hungarian – analysis ...Statistical Named Entity Recognition for Hungarian – analysis ...
Statistical Named Entity Recognition for Hungarian – analysis ...butest
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNIJCSEA Journal
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNIJCSEA Journal
 
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...TELKOMNIKA JOURNAL
 
Annotating Soundscapes.pdf
Annotating Soundscapes.pdfAnnotating Soundscapes.pdf
Annotating Soundscapes.pdfMichelle Shaw
 

Ähnlich wie Audio Mining Techniques and Applications (20)

Modeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert SystemModeling of Speech Synthesis of Standard Arabic Using an Expert System
Modeling of Speech Synthesis of Standard Arabic Using an Expert System
 
Voice Recognition System using Template Matching
Voice Recognition System using Template MatchingVoice Recognition System using Template Matching
Voice Recognition System using Template Matching
 
Robust Speech Recognition Technique using Mat lab
Robust Speech Recognition Technique using Mat labRobust Speech Recognition Technique using Mat lab
Robust Speech Recognition Technique using Mat lab
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analytics
 
A Recorded Debating Dataset
A Recorded Debating DatasetA Recorded Debating Dataset
A Recorded Debating Dataset
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
 
Voice
VoiceVoice
Voice
 
Tutorial - Speech Synthesis System
Tutorial - Speech Synthesis SystemTutorial - Speech Synthesis System
Tutorial - Speech Synthesis System
 
Speech Recognition
Speech RecognitionSpeech Recognition
Speech Recognition
 
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
AUDIO SIGNAL IDENTIFICATION AND SEARCH APPROACH FOR MINIMIZING THE SEARCH TIM...
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
 
Mood classification of songs based on lyrics
Mood classification of songs based on lyricsMood classification of songs based on lyrics
Mood classification of songs based on lyrics
 
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...Audio Signal Identification and Search Approach for Minimizing the Search Tim...
Audio Signal Identification and Search Approach for Minimizing the Search Tim...
 
Statistical Named Entity Recognition for Hungarian – analysis ...
Statistical Named Entity Recognition for Hungarian – analysis ...Statistical Named Entity Recognition for Hungarian – analysis ...
Statistical Named Entity Recognition for Hungarian – analysis ...
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
 
F334047
F334047F334047
F334047
 
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
Streaming Audio Using MPEG–7 Audio Spectrum Envelope to Enable Self-similarit...
 
An Introduction To Speech Recognition
An Introduction To Speech RecognitionAn Introduction To Speech Recognition
An Introduction To Speech Recognition
 
Annotating Soundscapes.pdf
Annotating Soundscapes.pdfAnnotating Soundscapes.pdf
Annotating Soundscapes.pdf
 

Kürzlich hochgeladen

Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Association for Project Management
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17Celine George
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxAnupam32727
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDhatriParmar
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvRicaMaeCastro1
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptxmary850239
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...DhatriParmar
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research DiscourseAnita GoswamiGiri
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Celine George
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 

Kürzlich hochgeladen (20)

Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptxCLASSIFICATION OF ANTI - CANCER DRUGS.pptx
CLASSIFICATION OF ANTI - CANCER DRUGS.pptx
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnvESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
ESP 4-EDITED.pdfmmcncncncmcmmnmnmncnmncmnnjvnnv
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
Blowin' in the Wind of Caste_ Bob Dylan's Song as a Catalyst for Social Justi...
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Scientific Writing :Research Discourse
Scientific  Writing :Research  DiscourseScientific  Writing :Research  Discourse
Scientific Writing :Research Discourse
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 

Audio Mining Techniques and Applications

  • 2. The Web, databases, and other digitized information storehouses contain a growing volume of audio content.  For example newscasts, sporting events, telephone conversations, recordings of meetings, Webcasts, documentary archives etc.  Users want to make the most of this material by searching and indexing the digitized audio content.  In the past, companies had to create and manually analyze written transcripts of audio content because using computers to recognize, interpret, and analyze digitized speech was difficult.  However, the development of faster microprocessors, larger storage capacities, and better speech-recognition algorithms has made audio mining easier.
  • 3. Audio mining, also called audio searching, takes a text-based query and locates the search term or phrase in an audio file.  This helps users by, for example, letting them quickly get to specific places in a recorded conversation or determine when a company is mentioned in a newscast.  Audio indexing uses speech recognition to analyze an entire file and produce a searchable index of content bearing words and their locations. This is critical because audio content is in a binary format that is otherwise not readily searchable. Indexing audio content thus enables searching.
  • 4.  There are two main approaches to audio mining : 1.Text-based indexing : oIt converts speech to text and then identifies words in a dictionary that can contain several hundred thousand entries. If a word or name is not in the dictionary, the system will choose the most similar word it can find. oThe system uses language understanding to create a confidence level for its findings. For findings with less than a 100 percent confidence level, the system offers other possible word matches.
  • 5. 2. Phoneme-based indexing: o It doesn’t convert speech to text but instead works only with sounds. o The system first analyzes and identifies sounds in a piece of audio content to create a phonetic-based index. It then uses a dictionary of several dozen phonemes to convert a user’s search term to the correct phoneme string. o Phonemes are the smallest units of speech in a language, all words are set of phonemes. oFinally, the system looks for the search terms in the index.
  • 6.  A phonetic system requires a more efficient search tool because it must phoneticize the query term, then try to match it with the existing phonetic string output .This is considerably more complex than using one of the many existing text-based search tools. Phoneme-based searches can result in more false matches than the text-based approach, particularly for short search terms, because many words sound alike or sound like parts of other words. However, phonetic indexing can still be useful if the analyzed material contains important words that are likely to be missing from a text system’s dictionary, such as foreign terms and names of people and places
  • 7. Text- and phoneme-based systems operate in much the same way, except that the former uses a text-based dictionary and the latter uses a phonetic dictionary. A speech recognizer converts the observed acoustic signal into the corresponding written representation of the spoken words. Speech recognition software contains acoustic models of the way in which all phonemes are represented. Also, there is a statistical language model that indicates how likely words are to follow each other in a specific Language. By using these capabilities, as well as complex probability analysis, the technology can take a speech signal of unknown content and convert it to a series of words .
  • 8. Figure: ScanSoft Audio Mining System
  • 9.
  • 10.  Since music is often described by genre, we would like to annotate our music data with genre. Classification by genre is useful for music search and retrieval and also for playlist generation
  • 11. Linear and non-linear neural networks Gaussian Classification Gaussian mixture models Hidden Markov model
  • 12. Let us consider a simple weather forecast problem and try to emulate a model that can predict tomorrow's weather based on today’s condition. In this example we have three stationary all day weather, which could be sunny (S), cloudy (C), or Rainy (R). From the history of the weather of the town under investigation we have the following table
  • 13. •We refer to the weather conditions by state q that are sampled at instant t and the problem is to find the probability of weather condition of tomorrow given today's condition P(qt+1 /qt). •An acceptable approximation for n instants history is : •P(qt+1/qt , qt-1 , qt-2 , ….. , qt-n ) » P(qt+1 /qt). •This is the first order Markov chain as the history is considered to be one instant only.
  • 14. Let us now ask this question: Given today as sunny (S) what is the probability that the next following five days are S , C , C , R and S, having the above model? The answer resides in the following formula using first order Markov chain: P(q1 = S, q2=S, q3=C, q4=C, q5=R, q6=S) = P(S).P(q2=S/q1=S). P(q3=C/q2=S). P(q4=C/q3=C). P(q5=R/q4=C). P(q6=S/q5=R) = 1 x 0.7 x 0.2 x 0.8 x 0.15 x 0.15 = 0.00252 The initial probability P(S) = 1, as it is assumed that today is sunny.
  • 15.
  • 16. The model is completely defined by these three sets of parameters a, b, and p and the model of N states and M observations can be referred to by : λ = (A , B , p ) where A = {aij}, B = {bj(wk)} 1 <=i , j <=N and 1< =k <= M. aij represents the probability of state transition (probability of being in state Sj given state Si ) aij = P(qt+1=Sj / qt=Si) . bj(wk) is the probability distribution in a state Sj w is the alphabet and k is the number of symbols in this alphabet. π = {1 0 0 } is the initial state probability distribution.
  • 17. Each phoneme(unit of sound) could be represented by a state of different and varying duration. Accordingly, the transition between different phonemes to form a word can be represented by A = {aij}.  The observations in this case are the sounds produced in each position and due to the variations in the evolution of each sound this can be also represented by a probabilistic function B = {bj(wk)}.
  • 18. A major challenge for speech recognition tools has been recognizing the speech of different users in different environments. With this in mind, BBN, IBM, Fast-Talk, and ScanSoft have designed their audio mining technology to be largely speaker independent. For example, Fast-Talk’s acoustic models are trained to recognize numerous speakers via exposure to audio data from speakers representing various ages, dialects, and speaking styles. Some audio mining technology uses acoustic models tuned to understand speech from different environments— such as telephony, TV, or radio.
  • 19. Designing filters to reduce the background noise that can interfere with accurate speech recognition. Creating efficient data structures for representing content. Developing algorithms that quickly work through the data structures during indexing and searching.
  • 20.  Precision is improving but it is still a key issue impeding the technology’s widespread adoption, particularly in such accuracy- critical applications as court reporting and medical dictation. Processing conversational speech can be particularly difficult because of such factors as overlapping words and background noise. Breakthroughs in natural language understanding will eventually lead to big improvements, but until then audio mining will get better only incrementally. It’s currently seen as a ‘nice to have,’ not quite yet a ‘need to have’ technology.
  • 21. Companies could use audio mining to analyze customer-service and helpdesk conversations or even voice mail.  Law enforcement and intelligence organizations could use the technology to analyze intercepted phone conversations. Broadcast companies like CNN and Radio Free Asia are already using audio mining to quickly retrieve important background information from previous broadcasts when new stories break. A US prison is using ScanSoft’s audio mining product to analyze recordings of prisoners’ phone calls to identify illegal activity.
  • 22.  Musical audio mining relates to the identification of perceptually important characteristics of a piece of music such as melodic, harmonic or rhythmic structure.  Searches can then be carried out to find pieces of music that are similar in terms of their melodic, harmonic and/or rhythmic characteristics. This type of analysis can also be used in music to determine characteristics like beats per minute (BPM), musical key, and musical structure, information that is employed to classify music. Music downloading sites that categorize music by genre uses audio mining to organize the music.
  • 23.  Research paper: “Lets Hear It For Audio Mining” by Neal Leavitt. Research paper: “Tendencies, Perspectives, and Opportunities of Musical Audio-Mining” by Ghent University, Belgium. wikipedia.org/wiki/Audio_mining