SlideShare ist ein Scribd-Unternehmen logo
1 von 4
Downloaden Sie, um offline zu lesen
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 427
AUDIO DATA SUMMARIZATION SYSTEM USING NATURAL LANGUAGE
PROCESSING
Pravin Khandare1, Sanket Gaikwad2, Aditya Kukade3, Rohit Panicker4, Swaraj Thamke5
1Assistant Professor, Dept. of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra
(India)
2,3,4,5UG Student, Vishwakarma Institute of Technology, Pune, Maharashtra(India)
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - This paper presents techniques for converting
speech audio file to text file and text summarization on the
text file. For the former case, we have used Python modules to
convert the audio files to text format. For the latter case,
Natural Language Processing’s modules are used for text
summarization. A Python toolkit named SpaCy is used for the
English data functions. Summarization method involves
important sentence obtained when the extraction is
investigated. Weights are assigned to words according to the
number of occurrences of each word in the text file. This
technique is used for producing summaries from the main
audio file.
Key Words: NLTK (Natural Language Toolkit), Stop words,
Frequency Table, Frequency Distribution, Tokenization,
Stemming, Lemmatization, SpaCy.
1. INTRODUCTION
The most natural and effective mode of communication
between human beings is available in audio.
Although, it is an easier way to perceive the information,
there are some disadvantages associated with this type of
communication. In case of audios that are available today,
like podcasts, broadcasting of news on radio channels,
speeches, etc. all the audio files data that are obtained from
these sources are not effectively useful means of gathering
information every time. In this case, it is always effective if
the data is obtained from the condensed contenti.e.focusing
more on important points in the content rather than the
entire content. To proceed with this technique, an audio file
is taken as input and data summarization processes are
implemented to get condensed output.
This paper particularly focuses on a two-level procedure for
getting summarized output from input audio file. Namely,
Speech to Text and then text summarization. The efficiency
of the output though depends on the clarity of the audio file
that is given as input to the former stage. Human speech
recording has proved to be a clear and processable input,
whereas pre-recorded podcasts have some amount of
deviation in the original text and the text generated in the
first phase.
There is an optional output available for the users that are
getting an audio summary of the recording. This has been
kept as a user preference whether they want the audio-
based summary or not. All the summarized outputs are
available in the form of files. Also, the original audio and text
in the form of output files are maintained, in case the user
wants to manually add a sentence from the text to the
summary.
2. PHASE 1: SPEECH TO TEXT
2.1 Obtaining input
The input audio forprocessing isobtainedbyrecordingusing
the device’s microphone. Therecording process is facilitated
by GUI based buttons. The recording starts when the button
“Start Recording” is pressed and stops on the “Stop
recording” button click event. The python modules used for
GUI creation is tkinter. The audio recording and processing
are done with the help of pyaudio module. Audio is recorded
by continuously appending frames of audio. The final output
of this phase is in the form of an audio file of the “wav”
format. The format wav is chosen since it is suitable for
further processing.
2.2 Need of partition
The audio file recorded needs to be converted into text. This
process uses python module SpeechRecognition. This
module requires a working internet connection to function
as it uses Google’s web-based speechtotext engine.Thetime
duration for which the engine accepts input is a maximum of
1 minute. So, if the duration of the audio file increases
beyond 1 minute then we need to divide it into chunks of
smaller duration.
The process of dividing the audio file in chunks of
processable duration is carried out by using mathematical
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 428
formulae by calculating the size of the audio file by
considering frame rate of audio file, the original duration of
the audio file and the bit rate and then dividing the size of
the audio file by the file split size. We then get the total
number of numbered chunks and in wav format as well. The
python module requiredfor gettingtheaudiofileparameters
is pydub.
The mathematical formulae used for the same are:
wav_file_size = (sample_rate * bit_rate * channel_count *
duration_in_sec) / 8 ……………………. (1)
file_split_size = 1000……………………………... (2)
total_chunks = wav_file_size // file_split_size…... (3)
chunk_length_in_sec = math.ceil((duration_in_sec *
10000000) /wav_file_size) …………………………………(4)
chunk_length_ms=chunk_length_in_sec*1000…... (5)
chunks = make_chunks(myaudio, chunk_length_ms)
……. (6)
The individual chunks of files are then processed further for
getting the final text output of the original audio file. This
text will then be processed for getting summary.
2.3 Getting the text output
The function of generating text from the audio file recorded
is aided by the python module Speech Recognition. The
reason for selecting this specific module is due to its
simplicity of usage and better efficiency than offline
modules. The audio file or chunk of audio file is passed to an
instance of the module that takes this wav file as source. The
audio file is scanned and rectified for noise.Themoduleuses
Google’s engine to recognize the audio and convert it into
text file. This file stores the original text of the speech and is
stored for operational purposes. This file is then processed
under phase two to get the extractive summary of the text
recorded in this file.
3. PHASE 2: TEXT SUMMARIZATION
This phase focuses on generating a text summary as an
output for the input text that is generated in phase 1. This
phase mentions the major steps taken in the algorithm of
summarization. It uses SpaCy which is an open-source
software library for advanced Natural Language Processing,
written in the programming languages Python and Cython.
3.1 Tokenization of words and word frequency
generation
At this point, the data has to be tokenized by breaking it
down into words to generate the frequency of each word
occurring in the provided data. After the list of tokens is
generated, iterate through the list and check if the
corresponding word is not present in the stop wordslistand
if not, increase its frequency by 1.
3.2 Weighted Frequency and Frequency Table
Generation
At this point having generated the frequencies of each word
in the given data, normalize the frequencies byextractingthe
maximum of all frequencies which was generated in the
earlier step and divide frequency of each word by the
maximum frequency.
3.3 Sentence Tokenization
Now there is a need to generate scores of each sentence in
the data to generate an optimal summary of the given data.
For that first tokenize each sentence in the data.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 429
3.4 Sentence scoring
Now, having generated a list of sentences and also a list of
words. So, generate a score for each sentence by adding the
weighted frequencies of the word that occur in a particular
sentence. As it is not interested in a summary, it has to score
only those sentences with less than 30 words.
3.5 Sentence Selection
In the earlier step, output is generated as dictionary of
sentences with their scores. Now, select the top N sentences
by passing our sentence scores dictionary along with its
values and the required N sentencestothen-largestfunction
in the NLTK library and it turns into a list with sentences
according to their sentence scores.
3.6 Joining the sentences
Further, convert the sentences to strings and then Finally
Join them to generate the finalized summary.
3.7 Getting audio output
For this part of the program, use of gTTS module is done. It
stands for Google Text to Speech. It isa web-basedengine for
processing an input text file and generating output in audio
format. The language needs to be specified for the purpose.
In this case, English language is chosen and theinputtextfile
is the previously generated summarized text output file.
After processing on the given parameters, the gTTS module
gives output in the form of an mp3 file. This file can be
played on a simple button click event. Hence the user
satisfaction is taken care of in the end.
3.8 Final Set of Outputs
Hence the final set of outputs include:
a. Main recorded audio file.
b. Chunks of the above file, if needed.
c. Text file of recorded audio.
d. Summarized text output file.
e. Summarized audio file.
4. CONCLUSION
In the process of generating a summary as discussed in this
paper, the primary input taken forprocessingisanaudiofile.
The audio file is generated by recording the human speech
which is being spoken or is already recorded. The duration
of the recording is completely controlled by the user in the
form of a buttoned GUI. The audio file in wav format is then
converted into a text file which is then used as input for text
summarizer processing. The final output of the projectisthe
summarized text file of the contents of the primary
recording.
FLOWCHART
Start
Start Recording Audio
Stop Recording
Get Text
Generate Audio File
Generate Main File Text
Stop
Get Summary Text File
Get Summary
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 430
REFERENCES
1. Partha Mukherjee, Soumen Santra, Subhajit
Bhowmick, Development of GUI for Text-to-Speech
Recognition using Natural Language Processing
2. Sadaoki Furui, Fellow, IEEE, Tomonori Kikuchi,
Yousuke Shinnaka, and Chiori Hori, Member, IEEE,
Speech-to-Text and Speech-to-Speech
Summarization of Spontaneous Speech
3. Rajesh S.Prasad, Dr. U.V.Kulkarni,MachineLearning
in Evolving Connectionist Text Summarizer
4. M. Saadeq Rafieee, Somayeh Jafari,”Considerations
to Spoken Language RecognitionforText-to-Speech
Applications”
5. Guangbing Yang, Erkki Sutinen,”Chunking and
Extracting Text Content for Mobile Learning: A
Query-focused Summarizer Based on Relevance
Language Model”
6. Yihong Gong ,Xin Liu,”Creating generic text
summaries”
7. Yogita H. Ghadage, Sushama D. Shelke,” Speech to
text conversion for multilingual languages”
8. Abhijit V. Bapat, Lalit K. Nagalkar, “Phonetic Speech
Analysis for Speech to Text Conversion”
9. R. Carlson, G. Fant, C. Gobl, B. Granstrom,I.Karlsson,
Q.-G. Lin, “Voice source rules for text-to-speech
synthesis”
10. F. Violaro, O. Boeffard, “A hybrid model for text-to-
speech synthesis”

Weitere ähnliche Inhalte

Was ist angesagt?

SMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk SystemSMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk SystemCSCJournals
 
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...IJECEIAES
 
IRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET Journal
 
Human Emotion Recognition From Speech
Human Emotion Recognition From SpeechHuman Emotion Recognition From Speech
Human Emotion Recognition From SpeechIJERA Editor
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languageiosrjce
 
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYDATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYcsandit
 
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITS
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITSHIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITS
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITSIJCNCJournal
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...IOSR Journals
 
IRJET- Vocal Code
IRJET- Vocal CodeIRJET- Vocal Code
IRJET- Vocal CodeIRJET Journal
 
Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundWavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundTELKOMNIKA JOURNAL
 
Ijrdtvlis11 140006
Ijrdtvlis11 140006Ijrdtvlis11 140006
Ijrdtvlis11 140006Ijrdt Journal
 
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...ijceronline
 
IRJET- Voice based Billing System
IRJET-  	  Voice based Billing SystemIRJET-  	  Voice based Billing System
IRJET- Voice based Billing SystemIRJET Journal
 
Comparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewComparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewIJAEMSJORNAL
 
Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition Systemijtsrd
 

Was ist angesagt? (20)

SMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk SystemSMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk System
 
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...
A Statistical Approach to Adaptive Playout Scheduling in Voice Over Internet ...
 
IRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind PeopleIRJET-Raspberry Pi based Reader for Blind People
IRJET-Raspberry Pi based Reader for Blind People
 
Human Emotion Recognition From Speech
Human Emotion Recognition From SpeechHuman Emotion Recognition From Speech
Human Emotion Recognition From Speech
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
 
Performance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi languagePerformance Calculation of Speech Synthesis Methods for Hindi language
Performance Calculation of Speech Synthesis Methods for Hindi language
 
Innovative Improvement of Data Storage Using Error Correction Codes
Innovative Improvement of Data Storage Using Error Correction CodesInnovative Improvement of Data Storage Using Error Correction Codes
Innovative Improvement of Data Storage Using Error Correction Codes
 
Audio steganography using r prime rsa and ga based lsb algorithm to enhance s...
Audio steganography using r prime rsa and ga based lsb algorithm to enhance s...Audio steganography using r prime rsa and ga based lsb algorithm to enhance s...
Audio steganography using r prime rsa and ga based lsb algorithm to enhance s...
 
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITYDATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
DATA HIDING IN AUDIO SIGNALS USING WAVELET TRANSFORM WITH ENHANCED SECURITY
 
H42045359
H42045359H42045359
H42045359
 
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITS
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITSHIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITS
HIDING A MESSAGE IN MP3 USING LSB WITH 1, 2, 3 AND 4 BITS
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 
IRJET- Vocal Code
IRJET- Vocal CodeIRJET- Vocal Code
IRJET- Vocal Code
 
Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundWavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
 
Ijrdtvlis11 140006
Ijrdtvlis11 140006Ijrdtvlis11 140006
Ijrdtvlis11 140006
 
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas...
 
IRJET- Voice based Billing System
IRJET-  	  Voice based Billing SystemIRJET-  	  Voice based Billing System
IRJET- Voice based Billing System
 
Comparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewComparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: Review
 
19 ijcse-01227
19 ijcse-0122719 ijcse-01227
19 ijcse-01227
 
Classification of Language Speech Recognition System
Classification of Language Speech Recognition SystemClassification of Language Speech Recognition System
Classification of Language Speech Recognition System
 

Ă„hnlich wie IRJET- Audio Data Summarization System using Natural Language Processing

Automatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosAutomatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosIRJET Journal
 
Automatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAutomatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAsia Smith
 
IRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET Journal
 
Real Time Direct Speech-to-Speech Translation
Real Time Direct Speech-to-Speech TranslationReal Time Direct Speech-to-Speech Translation
Real Time Direct Speech-to-Speech TranslationIRJET Journal
 
IRJET- Audio Genre Classification using Neural Networks
IRJET-  	  Audio Genre Classification using Neural NetworksIRJET-  	  Audio Genre Classification using Neural Networks
IRJET- Audio Genre Classification using Neural NetworksIRJET Journal
 
Automatic subtitle generation
Automatic subtitle generationAutomatic subtitle generation
Automatic subtitle generationtanyasaxena1611
 
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEMULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEIRJET Journal
 
Video Summarization
Video SummarizationVideo Summarization
Video SummarizationIRJET Journal
 
VIDEO TO TEXT SUMMARIZER USING AI.pdf
VIDEO TO TEXT SUMMARIZER USING AI.pdfVIDEO TO TEXT SUMMARIZER USING AI.pdf
VIDEO TO TEXT SUMMARIZER USING AI.pdfFreeFire293813
 
Voice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIVoice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIIRJET Journal
 
Extract the Audio from Video by using python
Extract the Audio from Video by using pythonExtract the Audio from Video by using python
Extract the Audio from Video by using pythonIRJET Journal
 
IRJET- Speech to Speech Translation System
IRJET- Speech to Speech Translation SystemIRJET- Speech to Speech Translation System
IRJET- Speech to Speech Translation SystemIRJET Journal
 
IRJET- Voice based Gender Recognition
IRJET- Voice based Gender RecognitionIRJET- Voice based Gender Recognition
IRJET- Voice based Gender RecognitionIRJET Journal
 
Autotuned voice cloning enabling multilingualism
Autotuned voice cloning enabling multilingualismAutotuned voice cloning enabling multilingualism
Autotuned voice cloning enabling multilingualismIRJET Journal
 
Playing sound files in labview using audacity toolkit
Playing sound files in labview using audacity toolkitPlaying sound files in labview using audacity toolkit
Playing sound files in labview using audacity toolkiteSAT Journals
 
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGS
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGSAI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGS
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGSIRJET Journal
 
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)IRJET Journal
 
The first FOSD-tacotron-2-based text-to-speech application for Vietnamese
The first FOSD-tacotron-2-based text-to-speech application for VietnameseThe first FOSD-tacotron-2-based text-to-speech application for Vietnamese
The first FOSD-tacotron-2-based text-to-speech application for VietnamesejournalBEEI
 
Automatic document clustering
Automatic document clusteringAutomatic document clustering
Automatic document clusteringIAEME Publication
 
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET Journal
 

Ă„hnlich wie IRJET- Audio Data Summarization System using Natural Language Processing (20)

Automatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosAutomatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in Videos
 
Automatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAutomatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In Videos
 
IRJET- Transcription of Conferences
IRJET- Transcription of ConferencesIRJET- Transcription of Conferences
IRJET- Transcription of Conferences
 
Real Time Direct Speech-to-Speech Translation
Real Time Direct Speech-to-Speech TranslationReal Time Direct Speech-to-Speech Translation
Real Time Direct Speech-to-Speech Translation
 
IRJET- Audio Genre Classification using Neural Networks
IRJET-  	  Audio Genre Classification using Neural NetworksIRJET-  	  Audio Genre Classification using Neural Networks
IRJET- Audio Genre Classification using Neural Networks
 
Automatic subtitle generation
Automatic subtitle generationAutomatic subtitle generation
Automatic subtitle generation
 
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLEMULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
MULTILINGUAL SPEECH TO TEXT CONVERSION USING HUGGING FACE FOR DEAF PEOPLE
 
Video Summarization
Video SummarizationVideo Summarization
Video Summarization
 
VIDEO TO TEXT SUMMARIZER USING AI.pdf
VIDEO TO TEXT SUMMARIZER USING AI.pdfVIDEO TO TEXT SUMMARIZER USING AI.pdf
VIDEO TO TEXT SUMMARIZER USING AI.pdf
 
Voice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AIVoice Based Email System For Visual Impaired Using AI
Voice Based Email System For Visual Impaired Using AI
 
Extract the Audio from Video by using python
Extract the Audio from Video by using pythonExtract the Audio from Video by using python
Extract the Audio from Video by using python
 
IRJET- Speech to Speech Translation System
IRJET- Speech to Speech Translation SystemIRJET- Speech to Speech Translation System
IRJET- Speech to Speech Translation System
 
IRJET- Voice based Gender Recognition
IRJET- Voice based Gender RecognitionIRJET- Voice based Gender Recognition
IRJET- Voice based Gender Recognition
 
Autotuned voice cloning enabling multilingualism
Autotuned voice cloning enabling multilingualismAutotuned voice cloning enabling multilingualism
Autotuned voice cloning enabling multilingualism
 
Playing sound files in labview using audacity toolkit
Playing sound files in labview using audacity toolkitPlaying sound files in labview using audacity toolkit
Playing sound files in labview using audacity toolkit
 
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGS
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGSAI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGS
AI POWERED TRANSCRIBER TO RECORD THE CONFERENCE PROCEEDINGS
 
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
 
The first FOSD-tacotron-2-based text-to-speech application for Vietnamese
The first FOSD-tacotron-2-based text-to-speech application for VietnameseThe first FOSD-tacotron-2-based text-to-speech application for Vietnamese
The first FOSD-tacotron-2-based text-to-speech application for Vietnamese
 
Automatic document clustering
Automatic document clusteringAutomatic document clustering
Automatic document clustering
 
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
IRJET- Designing and Creating Punjabi Speech Synthesis System using Hidden Ma...
 

Mehr von IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

Mehr von IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

KĂĽrzlich hochgeladen

Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxnuruddin69
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdfKamal Acharya
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersMairaAshraf6
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X79953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksMagic Marks
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsArindam Chakraborty, Ph.D., P.E. (CA, TX)
 
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...Health
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfsmsksolar
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadhamedmustafa094
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stageAbc194748
 

KĂĽrzlich hochgeladen (20)

Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptx
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
+97470301568>> buy weed in qatar,buy thc oil qatar,buy weed and vape oil in d...
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 

IRJET- Audio Data Summarization System using Natural Language Processing

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 427 AUDIO DATA SUMMARIZATION SYSTEM USING NATURAL LANGUAGE PROCESSING Pravin Khandare1, Sanket Gaikwad2, Aditya Kukade3, Rohit Panicker4, Swaraj Thamke5 1Assistant Professor, Dept. of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra (India) 2,3,4,5UG Student, Vishwakarma Institute of Technology, Pune, Maharashtra(India) ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - This paper presents techniques for converting speech audio file to text file and text summarization on the text file. For the former case, we have used Python modules to convert the audio files to text format. For the latter case, Natural Language Processing’s modules are used for text summarization. A Python toolkit named SpaCy is used for the English data functions. Summarization method involves important sentence obtained when the extraction is investigated. Weights are assigned to words according to the number of occurrences of each word in the text file. This technique is used for producing summaries from the main audio file. Key Words: NLTK (Natural Language Toolkit), Stop words, Frequency Table, Frequency Distribution, Tokenization, Stemming, Lemmatization, SpaCy. 1. INTRODUCTION The most natural and effective mode of communication between human beings is available in audio. Although, it is an easier way to perceive the information, there are some disadvantages associated with this type of communication. In case of audios that are available today, like podcasts, broadcasting of news on radio channels, speeches, etc. all the audio files data that are obtained from these sources are not effectively useful means of gathering information every time. In this case, it is always effective if the data is obtained from the condensed contenti.e.focusing more on important points in the content rather than the entire content. To proceed with this technique, an audio file is taken as input and data summarization processes are implemented to get condensed output. This paper particularly focuses on a two-level procedure for getting summarized output from input audio file. Namely, Speech to Text and then text summarization. The efficiency of the output though depends on the clarity of the audio file that is given as input to the former stage. Human speech recording has proved to be a clear and processable input, whereas pre-recorded podcasts have some amount of deviation in the original text and the text generated in the first phase. There is an optional output available for the users that are getting an audio summary of the recording. This has been kept as a user preference whether they want the audio- based summary or not. All the summarized outputs are available in the form of files. Also, the original audio and text in the form of output files are maintained, in case the user wants to manually add a sentence from the text to the summary. 2. PHASE 1: SPEECH TO TEXT 2.1 Obtaining input The input audio forprocessing isobtainedbyrecordingusing the device’s microphone. Therecording process is facilitated by GUI based buttons. The recording starts when the button “Start Recording” is pressed and stops on the “Stop recording” button click event. The python modules used for GUI creation is tkinter. The audio recording and processing are done with the help of pyaudio module. Audio is recorded by continuously appending frames of audio. The final output of this phase is in the form of an audio file of the “wav” format. The format wav is chosen since it is suitable for further processing. 2.2 Need of partition The audio file recorded needs to be converted into text. This process uses python module SpeechRecognition. This module requires a working internet connection to function as it uses Google’s web-based speechtotext engine.Thetime duration for which the engine accepts input is a maximum of 1 minute. So, if the duration of the audio file increases beyond 1 minute then we need to divide it into chunks of smaller duration. The process of dividing the audio file in chunks of processable duration is carried out by using mathematical
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 428 formulae by calculating the size of the audio file by considering frame rate of audio file, the original duration of the audio file and the bit rate and then dividing the size of the audio file by the file split size. We then get the total number of numbered chunks and in wav format as well. The python module requiredfor gettingtheaudiofileparameters is pydub. The mathematical formulae used for the same are: wav_file_size = (sample_rate * bit_rate * channel_count * duration_in_sec) / 8 ……………………. (1) file_split_size = 1000……………………………... (2) total_chunks = wav_file_size // file_split_size…... (3) chunk_length_in_sec = math.ceil((duration_in_sec * 10000000) /wav_file_size) …………………………………(4) chunk_length_ms=chunk_length_in_sec*1000…... (5) chunks = make_chunks(myaudio, chunk_length_ms) ……. (6) The individual chunks of files are then processed further for getting the final text output of the original audio file. This text will then be processed for getting summary. 2.3 Getting the text output The function of generating text from the audio file recorded is aided by the python module Speech Recognition. The reason for selecting this specific module is due to its simplicity of usage and better efficiency than offline modules. The audio file or chunk of audio file is passed to an instance of the module that takes this wav file as source. The audio file is scanned and rectified for noise.Themoduleuses Google’s engine to recognize the audio and convert it into text file. This file stores the original text of the speech and is stored for operational purposes. This file is then processed under phase two to get the extractive summary of the text recorded in this file. 3. PHASE 2: TEXT SUMMARIZATION This phase focuses on generating a text summary as an output for the input text that is generated in phase 1. This phase mentions the major steps taken in the algorithm of summarization. It uses SpaCy which is an open-source software library for advanced Natural Language Processing, written in the programming languages Python and Cython. 3.1 Tokenization of words and word frequency generation At this point, the data has to be tokenized by breaking it down into words to generate the frequency of each word occurring in the provided data. After the list of tokens is generated, iterate through the list and check if the corresponding word is not present in the stop wordslistand if not, increase its frequency by 1. 3.2 Weighted Frequency and Frequency Table Generation At this point having generated the frequencies of each word in the given data, normalize the frequencies byextractingthe maximum of all frequencies which was generated in the earlier step and divide frequency of each word by the maximum frequency. 3.3 Sentence Tokenization Now there is a need to generate scores of each sentence in the data to generate an optimal summary of the given data. For that first tokenize each sentence in the data.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 429 3.4 Sentence scoring Now, having generated a list of sentences and also a list of words. So, generate a score for each sentence by adding the weighted frequencies of the word that occur in a particular sentence. As it is not interested in a summary, it has to score only those sentences with less than 30 words. 3.5 Sentence Selection In the earlier step, output is generated as dictionary of sentences with their scores. Now, select the top N sentences by passing our sentence scores dictionary along with its values and the required N sentencestothen-largestfunction in the NLTK library and it turns into a list with sentences according to their sentence scores. 3.6 Joining the sentences Further, convert the sentences to strings and then Finally Join them to generate the finalized summary. 3.7 Getting audio output For this part of the program, use of gTTS module is done. It stands for Google Text to Speech. It isa web-basedengine for processing an input text file and generating output in audio format. The language needs to be specified for the purpose. In this case, English language is chosen and theinputtextfile is the previously generated summarized text output file. After processing on the given parameters, the gTTS module gives output in the form of an mp3 file. This file can be played on a simple button click event. Hence the user satisfaction is taken care of in the end. 3.8 Final Set of Outputs Hence the final set of outputs include: a. Main recorded audio file. b. Chunks of the above file, if needed. c. Text file of recorded audio. d. Summarized text output file. e. Summarized audio file. 4. CONCLUSION In the process of generating a summary as discussed in this paper, the primary input taken forprocessingisanaudiofile. The audio file is generated by recording the human speech which is being spoken or is already recorded. The duration of the recording is completely controlled by the user in the form of a buttoned GUI. The audio file in wav format is then converted into a text file which is then used as input for text summarizer processing. The final output of the projectisthe summarized text file of the contents of the primary recording. FLOWCHART Start Start Recording Audio Stop Recording Get Text Generate Audio File Generate Main File Text Stop Get Summary Text File Get Summary
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 09 | Sep 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 430 REFERENCES 1. Partha Mukherjee, Soumen Santra, Subhajit Bhowmick, Development of GUI for Text-to-Speech Recognition using Natural Language Processing 2. Sadaoki Furui, Fellow, IEEE, Tomonori Kikuchi, Yousuke Shinnaka, and Chiori Hori, Member, IEEE, Speech-to-Text and Speech-to-Speech Summarization of Spontaneous Speech 3. Rajesh S.Prasad, Dr. U.V.Kulkarni,MachineLearning in Evolving Connectionist Text Summarizer 4. M. Saadeq Rafieee, Somayeh Jafari,”Considerations to Spoken Language RecognitionforText-to-Speech Applications” 5. Guangbing Yang, Erkki Sutinen,”Chunking and Extracting Text Content for Mobile Learning: A Query-focused Summarizer Based on Relevance Language Model” 6. Yihong Gong ,Xin Liu,”Creating generic text summaries” 7. Yogita H. Ghadage, Sushama D. Shelke,” Speech to text conversion for multilingual languages” 8. Abhijit V. Bapat, Lalit K. Nagalkar, “Phonetic Speech Analysis for Speech to Text Conversion” 9. R. Carlson, G. Fant, C. Gobl, B. Granstrom,I.Karlsson, Q.-G. Lin, “Voice source rules for text-to-speech synthesis” 10. F. Violaro, O. Boeffard, “A hybrid model for text-to- speech synthesis”