Following are the questions I tried to answer in this presentation:
What is text summarization?
What is automatic text summarization?
How has it evolved over time?
What are the different methods?
How is deep learning used for text summarization?
What are the business applications?
The first few slides explain extractive summarization with its pros and cons; the next section explains abstractive summarization.
The last section highlights the business applications of each.
2. Analytical methods at a glance
Extractive Methods
Selecting a set of sentences from the source text, then arranging them to form a summary
Methods: Luhn, Edmundson, TextRank, LexRank, SumBasic, LSA
Abstractive Methods
Using natural language generation techniques to write novel sentences
Methods: Sequence to Sequence, Sequence to Sequence with Attention, Pointer-Generator Network, Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting
3. Extractive – Luhn Method
What is the Luhn method?
• Frequency of the word in the text
• Relative position of the word in the sentence
• Significance factor
• Bracketing of significant words
Pros
• Can highlight the important topics of the document
Limitations
• Very few features of the text are taken into consideration, so it might not give accurate results
• Gives more weight to sentences at the beginning of the document or a paragraph
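A minimal from-scratch sketch of Luhn-style sentence scoring; the stop-word handling, the top_n cutoff for significant words and the single-cluster simplification are illustrative assumptions, not the original implementation:

from collections import Counter
import re

def luhn_scores(sentences, stopwords, top_n=10):
    """Score each sentence with a simplified Luhn significance factor."""
    words = [w for s in sentences for w in re.findall(r"\w+", s.lower())
             if w not in stopwords]
    # Significant words = the most frequent non-stop words
    significant = {w for w, _ in Counter(words).most_common(top_n)}
    scores = []
    for s in sentences:
        tokens = re.findall(r"\w+", s.lower())
        hits = [i for i, w in enumerate(tokens) if w in significant]
        if not hits:
            scores.append(0.0)
            continue
        # Bracket the span between the first and last significant word
        span = hits[-1] - hits[0] + 1
        scores.append(len(hits) ** 2 / span)  # significance factor
    return scores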
4. Extractive – Edmundson Method
What is the Edmundson method?
Cue Method
• The cue dictionary comprises Stigma, Bonus and Null words; a final cue weight is generated for each sentence
• A concordance program provides frequency, dispersion and selection ratio
Key Method
• Non-cue words are selected as key words based on their frequency, and positive weights are assigned accordingly
• The final key weight of a sentence is the sum of its key words' weights
Title Method
• A title glossary (the non-null words of the title, subtitle and headings) is formed
• Positive weights are assigned to the title glossary words and the sentence weight is calculated
Location Method
• Probable relevance is calculated based on 2 hypotheses: sentences following headings are more relevant, and topic sentences occur in the initial or last lines of a paragraph
• Weights are assigned based on the hypothesis test results
The final sentence score is a weighted combination of the 4 methods' scores (see the sketch below)
Pros
• Takes into consideration user-defined features of the sentences
Limitations
• Designed for well-formatted documents
• Performance declines for disorganized documents
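A minimal sketch of combining the four Edmundson scores with user-defined weights; the cue word lists, the per-feature heuristics and the default weights below are illustrative assumptions:

from collections import Counter
import re

BONUS = {"significant", "important", "conclusion"}  # illustrative bonus cue words
STIGMA = {"hardly", "impossible"}                   # illustrative stigma cue words

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def edmundson_scores(sentences, title, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted combination of cue, key, title and location scores per sentence."""
    a, b, c, d = weights
    freq = Counter(w for s in sentences for w in tokenize(s))
    title_words = set(tokenize(title))
    scores = []
    for i, s in enumerate(sentences):
        toks = tokenize(s)
        cue = sum(w in BONUS for w in toks) - sum(w in STIGMA for w in toks)
        key = sum(freq[w] for w in toks)                    # frequency-based key weight
        ttl = sum(w in title_words for w in toks)           # title glossary overlap
        loc = 1.0 if i in (0, len(sentences) - 1) else 0.0  # simple location bonus
        scores.append(a * cue + b * key + c * ttl + d * loc)
    return scores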
5. Extractive – Latent Semantic Analysis (LSA)
What is the LSA method?
Pros
• Captures salient and recurring word combination patterns
• Suitable for big data
Limitations
• LSA is very sensitive to the stop list and the lemmatization process; performance differs a lot across languages
• A term-by-sentence matrix is created by applying TF-IDF (Term Frequency – Inverse Document Frequency) to the document
• Singular Value Decomposition (SVD) is applied, which decomposes this matrix into 3 different matrices
• These represent the different interrelations of words and topics, and their importance
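A minimal sketch of the TF-IDF plus SVD pipeline using scikit-learn and NumPy; picking sentences by their weight on the strongest latent topic is a simplifying assumption (full LSA summarizers use more elaborate selection):

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def lsa_summary(sentences, n_sentences=2):
    # Term-by-sentence matrix via TF-IDF
    X = TfidfVectorizer(stop_words="english").fit_transform(sentences).T.toarray()
    # SVD decomposes it into term-topic, topic-strength and topic-sentence matrices
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Rank sentences by their weight on the strongest latent topic
    ranking = np.argsort(-np.abs(Vt[0]))[:n_sentences]
    return [sentences[i] for i in sorted(ranking)]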
6. Extractive – TextRank Method
What is the TextRank method?
Pros
• Takes into consideration both the number of links and their weightage
Limitations
• Can only be used for single-document summarization
• Similar to Google's famous PageRank method (graph-based)
• The data is preprocessed to remove irrelevant content
• Words are vectorised and a cosine similarity score is calculated
• A similarity matrix is created, giving the similarity between any two sentences
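A minimal sketch of the TextRank flow using TF-IDF sentence vectors, cosine similarity and PageRank via networkx; the library choices are assumptions, not the original implementation:

import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def textrank_summary(sentences, n_sentences=2):
    # Vectorise sentences and build the similarity matrix
    X = TfidfVectorizer().fit_transform(sentences)
    sim = cosine_similarity(X)            # similarity between any two sentences
    # Run PageRank on the weighted similarity graph
    scores = nx.pagerank(nx.from_numpy_array(sim))
    top = sorted(scores, key=scores.get, reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]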
7. Extractive – LexRank Method
What is the LexRank method?
Pros
• Can be used for multi-document summarization
• Takes into consideration both the number of links and their weightage
Limitations
• Documents might have conflicting ideas, which might lead to an incorrect summary
• Based on Google's PageRank method
• Connectivity is based on cosine similarity
• The concept of eigenvector centrality in a graph is used to set sentence importance
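A minimal sketch of eigenvector centrality over the cosine-similarity graph via power iteration; the similarity thresholding and damping used in the original LexRank paper are omitted:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lexrank_scores(sentences, iters=50):
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    # Row-normalise so the matrix is stochastic, then power-iterate
    P = sim / sim.sum(axis=1, keepdims=True)
    p = np.ones(len(sentences)) / len(sentences)
    for _ in range(iters):
        p = p @ P    # converges to the eigenvector-centrality scores
    return p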
8. Extractive – SumBasic, KLSum
What is the SumBasic method?
Pros
• Gives the summarizer sensitivity to context, depending on what has already been included
• Natural way to deal with redundancy
Limitations
• Uses information about word frequency, but does not capture all the topics accurately
• SumBasic is a way of computing sentence scores over a multi-document dataset
• It computes the probability distribution over the words in the input
• It assigns weights to sentences as the average probability of the words in them
• It picks the best-scoring sentence
• It recalculates the probabilities if the summary length has not been reached (see the sketch below)
What is the KLSum method?
KLSum improves on SumBasic by minimizing the Kullback-Leibler (KL) divergence between the probability distributions of the summary and the document
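A minimal sketch of the SumBasic loop described above (stop-word removal and other preprocessing are omitted for brevity):

from collections import Counter
import re

def sumbasic(sentences, max_sentences=3):
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    total = sum(len(t) for t in tokenized)
    # Probability distribution over the words in the input
    prob = {w: c / total for w, c in Counter(w for t in tokenized for w in t).items()}
    summary, used = [], set()
    while len(summary) < max_sentences and len(used) < len(sentences):
        # Sentence weight = average probability of its words
        scores = [(sum(prob[w] for w in t) / len(t), i)
                  for i, t in enumerate(tokenized) if i not in used and t]
        if not scores:
            break
        _, best = max(scores)
        summary.append(sentences[best])
        used.add(best)
        # Down-weight already-covered words to reduce redundancy
        for w in tokenized[best]:
            prob[w] **= 2
    return summary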
9. Abstractive – RNN and Seq2Seq
What is an RNN?
• An RNN (recurrent neural network) is a multi-layer neural network that works on the principle of predicting the current event based on the recent and also the long-term past
• RNN states hold information about sequential events
• LSTM (Long Short-Term Memory) is a variant of RNN that improves the RNN's predictive ability for very long sequences
• LSTMs have a selective memory, so they can 'forget' and 'update' information, which improves the state passed along at every LSTM cell
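A minimal PyTorch sketch of an LSTM carrying state over a token sequence; the vocabulary size and dimensions are illustrative assumptions:

import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim = 5000, 128, 256
embed = nn.Embedding(vocab_size, emb_dim)
lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)

tokens = torch.randint(0, vocab_size, (1, 20))   # a batch of one 20-token sequence
outputs, (h_n, c_n) = lstm(embed(tokens))        # h_n and c_n hold the sequence's selective memory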
10. Abstractive – Sequence to Sequence Method
What is the Sequence to Sequence method?
Pros
• Works in an abstractive sense, similar to how a human summarizes
• Based on machine learning, so accuracy can be improved through optimization
• The same network can be trained for any other language or for translation
Limitations
• Training time increases as the size of the input data (number of encoder steps) increases
• The summary is generated by lookup from the vocabulary (limited abstraction)
• An <UNK> tag is generated for names, places, etc. that are not in the vocabulary
• No coverage of what has already been decoded
• Text is fed to the encoder units and the intermediate form (hidden state) is fed to the decoder
• Stacked RNN/LSTM units are used for encoding and decoding
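A minimal PyTorch sketch of an LSTM encoder-decoder of the kind described above; layer sizes are illustrative, and training code, teacher forcing and beam search are omitted:

import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=5000, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src, tgt):
        # The encoder's final hidden state is the intermediate form fed to the decoder
        _, state = self.encoder(self.embed(src))
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)   # per-step scores over the fixed vocabulary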
11. Abstractive – Seq2Seq with Attention Method
What is the Seq2Seq with Attention method?
Pros
• The decoder can freely generate words in any order
Limitations
• May represent factual information incorrectly
• The summary sometimes repeats itself
• An <UNK> tag is generated for names, places, etc. that are not in the vocabulary
• No coverage of what has already been decoded
• The attention distribution is used to prepare a weighted sum of the encoder hidden states, known as the context vector
• The context vector can be regarded as "what has been read from the source text" at this step of the decoding
• The context vector and the decoder hidden state are used to generate the vocabulary distribution
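A minimal PyTorch sketch of one attention step producing the context vector; dot-product scoring is used here for simplicity (the referenced models use a learned additive score):

import torch
import torch.nn.functional as F

def attention_step(decoder_hidden, encoder_states):
    """decoder_hidden: (batch, hidden); encoder_states: (batch, src_len, hidden)."""
    # Attention distribution over source positions
    scores = torch.bmm(encoder_states, decoder_hidden.unsqueeze(2)).squeeze(2)
    attn = F.softmax(scores, dim=1)
    # Context vector = attention-weighted sum of encoder hidden states
    context = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)
    return attn, context   # context feeds the vocabulary distribution together with the decoder state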
12. Abstractive – Pointer-Generator Network Method
What is the Pointer-Generator Network method?
Pros
• No repetition of words
• Words outside the source text can still be generated (from the vocabulary)
• Rare words (names/factual information) are represented correctly
Limitations
• Higher-level abstraction is missing; wording is usually close to the original text
• Incorrect composition of sentences
• A hybrid network that can copy words from the source via pointing, while retaining the ability to generate words from the fixed vocabulary
• The generation probability is used to determine whether to copy the word from the source or to generate it from the vocabulary
• The generation probability weighs the attention distribution and the vocabulary distribution to produce the final distribution
• A coverage mechanism is used to avoid repetition; it tracks what has been decoded so far
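A minimal PyTorch sketch of how the generation probability mixes the vocabulary distribution with the copy (attention) distribution; tensor shapes and the extended-vocabulary handling are illustrative assumptions:

import torch

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, extended_vocab_size):
    """p_gen: (batch, 1); vocab_dist: (batch, vocab); attn_dist: (batch, src_len);
    src_ids: (batch, src_len) LongTensor of source token ids in the extended vocab."""
    batch, vocab = vocab_dist.size()
    final = torch.zeros(batch, extended_vocab_size)
    # Generator part: weighted vocabulary distribution
    final[:, :vocab] = p_gen * vocab_dist
    # Copy part: attention mass scattered onto the source-token ids
    final.scatter_add_(1, src_ids, (1 - p_gen) * attn_dist)
    return final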
13. Abstractive – Reinforce-Selected Sentence Rewriting (fast_abs_rl)
What is the fast_abs_rl method?
Pros
• 4x improvement in training speed and 10-20x improvement in inference speed
• The whole text is considered for abstractive summarization
Limitations
• A high level of abstraction is still missing, as the vocabulary size is limited
• A fast summarization method that first selects salient sentences and then rewrites them as an abstractive summary
• A pipeline of an extractive and an abstractive summarizer is used, optimized with reinforcement learning
• The extractor uses a convolutional encoder
• The abstractor is a Seq2Seq pipeline with reranking
• Based on the decoder output, the extractor parameters are optimized using a POMDP formulation
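A hypothetical structural sketch of the extract-then-rewrite pipeline shape only; the extractor and abstractor are stand-in callables, and the reinforcement-learning optimisation of the extractor is omitted entirely:

def extract_then_rewrite(sentences, extractor, abstractor, k=3):
    """extractor(sentences) -> per-sentence salience scores; abstractor(sentence) -> rewritten sentence."""
    # Stage 1 (extractive): pick the k most salient sentences
    scores = extractor(sentences)
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Stage 2 (abstractive): rewrite each selected sentence
    return [abstractor(sentences[i]) for i in sorted(chosen)]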
14. Benchmark metrics
Three major metrics were considered: ROUGE (Lin 2004), METEOR and BLEU, all string-matching metrics
Brief:
• ROUGE-N measures the overlap of N-grams between the system and reference summaries
• ROUGE-L is based on the longest common subsequence and takes sentence-level similarity into account
• ROUGE-S is the skip-bigram variant
• The METEOR score matches unigrams between the system and reference summaries, with explicit attention to word ordering
Summarization method ROUGE-L score METEOR score
TextRank 0.500 -
LexRank 0.469 -
SumBasic 0.484 -
KLSum - -
LSA 0.432 -
Seq2Seq 31.2 -
Seq2Seq attention 33.8 -
Pointer generator 36.38 18.72
Fast_abs_rl 38.54 20.38
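A minimal sketch of ROUGE-N recall as clipped n-gram overlap between one system summary and one reference (multi-reference handling and stemming are omitted):

from collections import Counter

def rouge_n(system, reference, n=2):
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    sys_ng, ref_ng = ngrams(system), ngrams(reference)
    overlap = sum((sys_ng & ref_ng).values())      # clipped n-gram matches
    return overlap / max(sum(ref_ng.values()), 1)  # recall over reference n-grams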
15. References
Method Source
Luhn https://ieeexplore.ieee.org/document/5392672/
Edmundson http://courses.ischool.berkeley.edu/i256/f06/papers/edmonson69.pdf
LexRank https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume22/erkan04a-html/erkan04a.html
TextRank https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf
LSA www.kiv.zcu.cz/~jstein/publikace/isim2004.pdf
SumBasic http://www.cis.upenn.edu/~nenkova/papers/ipm.pdf
KLSum http://www.aclweb.org/anthology/N09-1041
Reduction https://github.com/adamfabish/Reduction/blob/master/Source/reduction.py
Seq to Seq https://arxiv.org/pdf/1409.3215.pdf
Seq to Seq with Attention http://www.aclweb.org/anthology/N16-1012
Seq to Seq with Attention and Pointer Generation Get To The Point: Summarization with Pointer-Generator Networks
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting https://arxiv.org/pdf/1805.11080.pdf
16. Use Case: Planning and Procurements Influenced by Trends
Fashion Articles: news and magazine articles showing the latest fashion trends and top-selling designs
Text Summarizer: the articles are passed through the summarizer, which is trained to pick up specific design elements
Tagging: the various design elements are tagged according to clothing category
Design: the inputs are provided to the planning team to help them arrive at better decisions
Summarizing and tagging the current trends for any business can provide valuable insights and make the planning and procurement process better
Examples:
• Fashion articles can augment the Design Success Probability project (illustrated above)
• Latest auto news can support Novelis R&D and customer negotiations
• Shutdowns in factories which supply raw material can help in taking pre-emptive measures and protecting against increased raw material prices
17. Use Case: Competition Benchmarking
Analysis of news articles can help businesses consolidate all competition-related information and also plan reactive actions to competition news
Examples:
• Jio launching a new campaign or extending services to new circles
• Aditya Birla Capital could leverage the information provided in advertisements and news articles regarding competitors' products
Inputs: news and announcements, ads on products and sales, industry reports
1. Text Summarization
2. Tagging
Output: a repository of all competition data
18. Use Case: Creation and Updation of Legal Matrix
The government publishes amendments to existing laws in gazettes, which contain information about multiple industries
Each plant has (or should have) a legal matrix which contains information about mandatory checks, the concerned authorities and the last date to complete them
Text Summarizer flow:
1. Relevant text extraction is carried out
2. Text tagging is carried out
3. Rules comparison and updation
General laws captured in the Gazette
A few of the major acts covered in the Gazette:
1. Customs Act
2. Central Goods and Services Tax Act
3. Major Port Trust Act
4. Mines and Minerals Development Regulation Act
5. Bureau of Indian Standards Act (necessary for lab equipment)
6. Special Economic Zones Act
7. Industrial Boilers and Pressure Vessels Act
8. Labour laws
19. Use Case: Increasing customer care efficiency
Current customer journey while calling customer care: time is spent filling in information via the keypad
The in-call option is not great on mobile, as callers have to remove the handset from their ears and then input details, which further irritates them
Proposed flow: speech is converted to text, the text is passed through the Text Summarizer, and the information is auto-filled
Cons:
1. Added complexity of the speech-to-text converter
2. Indian dialects and local languages are extremely hard to train for
Pros:
1. More interactive
2. Better consumer experience
3. Multiple verifications can be done earlier
20. Text Summarizer – Business applications
• Summarize noisy, unstructured, ungrammatical, huge volumes of data for Pantaloons & MFL
• Storylines of events: identify and summarize events for Idea that lead up to the event of interest
• Sentence compression from news articles & stock market reports for Aditya Birla Capital
• Summarizing internal sales reports at various levels for UltraTech Cement
21. Text Summarizer – Function applications
• Legal and employee document screening and summarization for the HR department
• Capturing customer care voice calls of Aditya Birla Capital
• Summarization of bill, contract and order details using vendor documents for ABMCPL
• Summarizing news articles covering regulations and economic conditions for the central economic cell