"Image and Video Summarization," a Presentation from the University of Washington

Image and Video Summarization
Jeffrey A. Bilmes
Professor
Departments of Electrical Engineering
& Computer Science and Engineering
University of Washington, Seattle
http://melodi.ee.washington.edu/~bilmes
and
Visiting Scientist, Google Research
Wednesday, Dec 7th, 2016
J. Bilmes Image and Video Summarization — 12/7/2016 page 1 / 35

Outline
1 Bigness
2 Information In Data
3 Image Summarization
4 Video Summarization
5 End

Summarization
What is summarization?
Why do we need summarization?

Bigness

Water
H2O molecules
small (n-body)
medium (fluid dynamics, viscosity, compressibility),
large (global weather systems, meteorology).
Same underlying molecular collision events!

Neurons
Neurons
small (neural spike trains, population coding)
medium (intelligence, consciousness, psychology)
large (society, social choice, wisdom of the crowd)
Same underlying electrical and chemical impulses.

Bigger is Different
“More is Different”, P.W. Anderson, 1972 (Nobel laureate): “The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe.”
“. . . alterations of being . . . are not only the transition of one magnitude into another, but a transition from quantity into quality,” Hegel, The Science of Logic, 1816

Big Data is Different Data
Hypothesis: extremely large data sets offer qualitatively different
capabilities than small data sets.
Evidence: Image Completion (Hays & Efros, 2007)
“our initial experiments . . . on a dataset of ten thousand images
were very discouraging. However, increasing the image collection
to two million yielded a qualitative leap in performance”

Modern Times
Big Data is Big and Getting Even Bigger.
[Figure: volume of data over time (a few years ago, now, a few years from now), broken down by source: sensors & devices, social media, VoIP, enterprise data.]
2.5 quintillion bytes (2.5 million terabytes) of data/day (IBM)
>90% of the world’s data has been created in the last two years.

Big Data in Machine Learning
Statistics, Machine Learning, and Artificial Intelligence (AI)
“There’s no data like more data.”
Computational Consequences:
Massive computational resource demands!
Research opportunities to address new computational challenges:
1 Systems programming, parallel and distributed computing, network topologies, efficient databases, GPUs.
2 Examples: MapReduce, Hadoop, GraphLab, HaLoop, Greenplum, Asterix, Spark, SystemML, MLBase, Myria, etc.

The Two Foundations of Big Data?
[Figure: diagram with the labels "Data", "Large Data", "Statistical Significance", "Parallel Computing Systems", and a question-mark placeholder ("??????????????").]

Goal: Data to Information
Data is:
Streaming
Torrential
Relentless
Multi-modal
Mostly Unstructured
Sensors/Actuators
Redundant
High Dimensional
Distributed

Big Data is Different Data: A Proposition
Hypothesis: extremely large data sets offer
qualitatively different capabilities than small
data sets.
Problem: Big data sets are big, unwieldy,
computationally challenging, and often highly
redundant.
Research Quest: Can statistical predictions and actions be made cost-effectively using the right data management strategies?
Yes, by reducing redundancy.

How to identify and measure redundancy?

Measuring Information in Data
What is information?
Information Theory (entropy, mutual information) ⇔ probability
distributions.
Kolmogorov Complexity ⇔ algorithms & models of computation.
Information measures over non-random data samples (e.g., images).
[Illustration: f applied to sets of example images (images not reproduced): f(·) = f(·, ·) < f(·, ·) < f(·, ·).]

Measuring Information in Data
Diminishing returns:
The more you have, the less valuable is anything you don’t have.
$f(A \cup \{v\}) - f(A) \ge f(B \cup \{v\}) - f(B)$ for any sets $A \subseteq B$ and any item $v$ not in $B$.

Example: Number of Colors of Balls in Urns
Consider an urn containing colored balls. Given a set S of balls,
f (S) counts the number of distinct colors.
Smaller urn: initial value 2 (colors in urn); new value with added blue ball: 3, a gain of 1.
Larger urn: initial value 3 (colors in urn, blue already among them); new value with added blue ball: 3, a gain of 0.
Submodularity: the incremental value of an object diminishes in a larger context (diminishing returns). Thus, f is submodular.
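The urn example can be replayed in a few lines. This is a minimal sketch, assuming balls are represented as color strings in a list; the particular urn contents and the helper names `num_colors` and `gain` are illustrative and do not come from the slides.

```python
def num_colors(balls):
    """f(S): the number of distinct colors among the balls in S."""
    return len(set(balls))


def gain(ball, urn):
    """Incremental value f(urn + ball) - f(urn) of adding one ball."""
    return num_colors(list(urn) + [ball]) - num_colors(urn)


# Hypothetical urn contents: the larger urn contains the smaller one.
small_urn = ["red", "green", "red"]          # 2 distinct colors
large_urn = small_urn + ["blue", "green"]    # 3 distinct colors

gain_small = gain("blue", small_urn)  # blue is a new color here
gain_large = gain("blue", large_urn)  # blue is already present
print(gain_small, gain_large)
```

The same blue ball is worth more in the smaller urn than in the larger one, which is the diminishing-returns pattern the slide describes.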

As the data set size grows . . .
There is no data like more data ⇒ more data is like no more data.
From Andrew Ng’s Stanford
machine learning class, 2011

As the data set sizes grow . . .
[Figure: collage of learning curves from several domains. Panels and sources:
Banko & Brill, 2001: test accuracy (0.70 to 1.00) vs. millions of words (0.1 to 1000) for Memory-Based, Winnow, Perceptron, and Naïve Bayes learners.
Riccardi & Hakkani-Tür, 2005 (Speech Recognition).
Callison-Burch & Bloodgood, 2010 (Machine Translation).
Tong & Koller, 2001.
Soon, Ng, Lim, 2001 (Coreference Resolution).
Kadie, 1995 (Generic Classification): accuracy (0 to 1) vs. number of training examples (50 to 250).
Sentiment Tutorial, Chris Potts, Stanford Ling., 2011.
Kavzoglu & Colkesen, 2012 (Image Classification): overall accuracy (78% to 92%) vs. training set size (700 to 21,000) for SVMs and DTs.
Lumosity: speed, memory, attention, and problem solving when playing Lumosity games, http://www.lumosity.com/blog/how-much-and-how-often-should-i-train/]

Submodularity and Learning Curves
Proposition
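Let $V = \{1, 2, \dots, n\}$ be a finite ground set, and let $f : 2^V \to \mathbb{R}$ be a set function. Write $f(v \mid S) = f(S \cup \{v\}) - f(S)$ for the gain of adding $v$ to $S$. If for all permutations $\sigma$ of $V$, we have that for all $i \le j$:
$f(\sigma_j \mid S_{i-1}) \ge f(\sigma_j \mid S_{j-1})$ (1)
with $S_i = \{\sigma_1, \sigma_2, \dots, \sigma_i\}$, then $f$ is submodular.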
Learning curves might not be exactly submodular, but
submodularity seems a reasonable model.
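For a very small ground set, condition (1) can be checked by brute force and compared with the diminishing-returns definition. The following is a minimal sketch, not part of the talk: the two test functions (`coverage` and `squared_size`), the `covers` data, and all helper names are assumptions chosen for illustration, and the enumeration is exponential, so it only suits toy examples.

```python
from itertools import combinations, permutations


def gain(f, v, S):
    """f(v | S) = f(S + v) - f(S)."""
    return f(S | {v}) - f(S)


def chain_condition_holds(f, V):
    """Condition (1): for every ordering sigma and all i <= j, the gain of
    sigma_j given the prefix S_{i-1} is at least its gain given S_{j-1}."""
    for sigma in permutations(V):
        for j in range(len(sigma)):            # 0-based position of sigma_j
            later_prefix = frozenset(sigma[:j])
            for i in range(j + 1):             # earlier (or equal) prefix
                earlier_prefix = frozenset(sigma[:i])
                if gain(f, sigma[j], earlier_prefix) < gain(f, sigma[j], later_prefix):
                    return False
    return True


def subsets(V):
    items = list(V)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_submodular(f, V):
    """Diminishing returns: gain at A >= gain at B for all A <= B, v not in B."""
    V = frozenset(V)
    for B in subsets(V):
        for A in subsets(B):
            for v in V - B:
                if gain(f, v, A) < gain(f, v, B):
                    return False
    return True


# Hypothetical toy functions on V = {1, 2, 3, 4}.
V = [1, 2, 3, 4]
covers = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}, 4: {"a", "d"}}


def coverage(S):
    """Number of distinct items covered by the chosen elements."""
    return len(set().union(*(covers[v] for v in S)))


def squared_size(S):
    """|S|^2 has increasing gains, so it should fail both checks."""
    return len(S) ** 2


print(chain_condition_holds(coverage, V), is_submodular(coverage, V))
print(chain_condition_holds(squared_size, V), is_submodular(squared_size, V))
```

On these toy functions the two checks agree: the coverage function passes both, and the squared-size function fails both.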

What is Summarization?
1 Start with a massive data set (images, videos, etc.), set V .
2 Identify a good (submodular) information function f (by hand or
by machine learning) that represents information in V .
3 Find a subset A ⊆ V of a given size that retains as much
information as possible.
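4 Luckily, although exact maximization takes exponential time in general, a simple greedy algorithm finds a near-optimal subset efficiently when f is monotone submodular (within a factor of 1 − 1/e of the best possible)!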
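The four steps above can be sketched with a greedy selection loop. This is an illustrative sketch only: the slides do not say which information function was used, so the facility-location function, the Gaussian similarity, and the one-dimensional toy `positions` below are all assumptions, as are the function names.

```python
import math


def facility_location(sim, A):
    """f(A) = sum over all items i of the best similarity between i and
    some chosen item in A; f(empty set) = 0. Assumes sim >= 0."""
    if not A:
        return 0.0
    return sum(max(sim[i][j] for j in A) for i in range(len(sim)))


def greedy_summary(sim, k):
    """Pick up to k items, each time adding the item with the largest gain."""
    n = len(sim)
    chosen = []
    cover = [0.0] * n  # best similarity of each item to the current summary
    for _ in range(min(k, n)):
        best_j, best_gain = None, -1.0
        for j in range(n):
            if j in chosen:
                continue
            # Gain of adding j: total improvement in coverage.
            g = sum(max(sim[i][j] - cover[i], 0.0) for i in range(n))
            if g > best_gain:
                best_j, best_gain = j, g
        chosen.append(best_j)
        cover = [max(cover[i], sim[i][best_j]) for i in range(n)]
    return chosen


# Hypothetical data: two well-separated groups of near-duplicate items.
positions = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
sim = [[math.exp(-(a - b) ** 2) for b in positions] for a in positions]

summary = greedy_summary(sim, 2)
print(summary, facility_location(sim, set(summary)))
```

With two groups of near-duplicates and a budget of two, the greedy loop picks one representative per group, since a second item from an already covered group adds almost nothing.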

Modern Image Collections
Many images together also have a higher-level gestalt than just a few.

Image Summarization
Task: Summarize a collection of images by a representative subset of the images.
Applications:
Summarizing your holiday pictures
Summarizing image search results
Efficient browsing of image collections
Video frame summarization
Difficulties:
No high level
⇓

Image Summarization - Data Collection
Data Statistics
14 image collections with 100 pictures each
About 400 human summaries for every image collection, collected via Amazon Mechanical Turk, about 5500 summaries total!
Example collections:

Image Summarization
[Figure: a whole collection shown alongside its 3 best summaries, 3 medium summaries, and 3 worst summaries.]

Image Summarization
Typical Results - Learnt mixture using Max-Margin
[Figure: score scale normalized so that f(∅) = 0 and f(V) = 1, with markers for Greedy Min, Average Pruned Random, Average Pruned Human, Max of Learned Mixture, and Greedy Max.]

Real-Time Running Online Video Summarization
Live Video Feed
Most recent representative video snippet (repeating)
Next most recent snippet (repeating)
Third most recent snippet (repeating)
Fourth most recent snippet (repeating)
Least recent snippet (repeating)
...
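The display described above (a live feed next to a short list of representative snippets, newest first) can be mimicked with a small running buffer. This is a rough sketch under loud assumptions: the slides do not describe the selection rule, so the novelty test based on cosine similarity, the threshold value, the feature vectors, and the `RunningSummary` class are all hypothetical stand-ins rather than the system shown in the talk.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


class RunningSummary:
    """Keeps the most recent representative snippets, newest first."""

    def __init__(self, max_snippets=5, similarity_threshold=0.9):
        # A bounded deque drops the least recent snippet when it is full.
        self.snippets = deque(maxlen=max_snippets)
        self.similarity_threshold = similarity_threshold

    def offer(self, snippet_id, feature):
        """Keep the snippet only if it is not redundant with any kept one."""
        for _, kept_feature in self.snippets:
            if cosine(feature, kept_feature) >= self.similarity_threshold:
                return False  # redundant: adds little new information
        self.snippets.appendleft((snippet_id, feature))
        return True

    def display_order(self):
        """Snippet ids from most recent to least recent."""
        return [snippet_id for snippet_id, _ in self.snippets]


# Hypothetical stream of (snippet id, feature vector) pairs.
stream = [
    ("t0", [1.0, 0.0]),
    ("t1", [0.99, 0.05]),  # nearly identical to t0
    ("t2", [0.0, 1.0]),
    ("t3", [0.1, 1.0]),    # nearly identical to t2
    ("t4", [1.0, 1.0]),
]

running = RunningSummary(max_snippets=5)
for snippet_id, feature in stream:
    running.offer(snippet_id, feature)
print(running.display_order())
```

A submodular gain could replace the cosine test in `offer`; the threshold rule is used here only because it keeps the sketch short.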

The End
The End: Thank you!
$f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$
where $f(A) + f(B) = f(A_r) + 2f(C) + f(B_r)$, $f(A \cup B) = f(A_r) + f(C) + f(B_r)$, and $f(A \cap B) = f(A \cap B)$.
