Estimating Video Authenticity via
the Analysis of Visual Quality and
Video Structure
画質と映像構造の解析に基づく
映像の信頼性推定に関する研究
2015/07/31
Laboratory of Media Dynamics
Graduate School of Information Science and Technology
Michael Penkov
Introduction :: Background
Increase of video sharing:
• Cheaper phones/cameras
• Faster networks
• Free sharing services
• Many independent uploaders
• Upload rate: 300 h/min [*]
• No screening of content
• Much video is duplicated
Parent video → (editing, upload) → Edited video (1st gen.) → (editing, reupload) → Edited video (2nd gen.) → …
Authenticity of video: from (High) to (Low) along this chain.
For an important event (e.g. news), the authentic video is the most objective, most reliable, closest to the truth, and least edited.
Need to distinguish between the parent and edited videos: how similar is the edited video to the parent video?
Applications: search result summarization, content tracking, content aggregation.
[*] http://www.youtube.com/t/press_statistics/ (accessed 2015/06/29)
Introduction :: Related Research
Existing methods can…
• Estimate visual quality [1]
• Quantify video similarity [2]
• Estimate hierarchical relationships between videos [3]
• Detect copy-paste forgeries through inconsistencies in visual quality [4]
• Estimate parent video from edited videos [5]
Research areas: Video Similarity, Digital Forensics, Visual Quality Assessment
Related work: No-reference VQA [1], Robust video hash [2], Video Phylogeny [3], Forgery Detection [4], Parent Video Estimation [5]
[Figure: related work and Our Research placed on a timeline, 2010 to 2015]
Our Research: estimate authenticity of edited videos
• Visual quality ∝ Authenticity: low visual quality → low authenticity
• Video structure → estimate deleted shots: many deleted shots → low authenticity
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
[2] B. Coskun, B. Sankur, and N. Memon, “Spatio-Temporal Transform Based Video Hashing,” IEEE Transactions on Multimedia, vol. 8, no. 6, pp. 1190–1208, Dec. 2006.
[3] Z. Dias, A. Rocha, and S. Goldenstein, “Video Phylogeny: Recovering near-duplicate video relationships,” in 2011 IEEE International Workshop on Information Forensics and Security. IEEE, Nov. 2011, pp. 1–6.
[4] F. Battisti, M. Carli, and A. Neri, “Image forgery detection by means of no-reference quality metrics,” in SPIE Vol. 8303, 2012.
[5] S. Lameri, P. Bestagini, A. Melloni, S. Milani, A. Rocha, M. Tagliasacchi, and S. Tubaro, “Who is my parent? Reconstructing video sequences from partially matching shots,” in IEEE International Conference on Image Processing (ICIP), 2014.
Introduction :: Research Map
[Figure: research map covering Digital Forensics, Shot Segmentation, Video Similarity and Visual Quality Assessment, with Our Research and references [4], [5] and [6] placed on it]
Contribution: bridging Visual Quality Assessment and Digital Forensics
[4] F. Battisti, M. Carli, and A. Neri, “Image forgery detection by means of no-reference quality metrics,” in SPIE Vol. 8303, 2012.
[5] S. Lameri, P. Bestagini, A. Melloni, S. Milani, A. Rocha, M. Tagliasacchi, and S. Tubaro, “Who is my parent? Reconstructing video sequences from partially matching shots,” in IEEE International Conference on Image Processing (ICIP), 2014.
[6] M. Penkov, T. Ogawa, and M. Haseyama, “Fidelity estimation of online video based on video quality measurement and Web information,” 26th Signal Processing Symposium, vol. A3-5, pp. 70–74, 2011.
Introduction :: Our Contribution
Authenticity degree: the proportion of information retained by an edited video.
• A relative scale for ranking edited videos.
• Bridge digital forensics and visual quality assessment.
Information: the message contained by the video; the reason why people watch the video.
[Figure: authenticity degree scale from 1 to 0, with the parent (not available) and three edited videos placed along it; higher visual quality and few deleted shots at one end, lower visual quality and many deleted shots at the other]
Introduction :: Thesis Contents
• Chapter 1: Introduction
• Chapter 2: Visual Quality Assessment
• Our contribution: Visual quality ∝ Authenticity
• Reviews conventional algorithms
• Enables the proposed method to quantify information loss
• Chapter 3: Shot Identification
• Utilizes the structure of videos to
1. Enable the reconstruction of the parent video when it is not available
2. Enable detecting deleted shots
3. Enable applications of conventional visual quality assessment algorithms from Chapter 2
• Chapter 4: The Video Authenticity Degree
• The proposed method for estimating video authenticity
• Chapter 5: Conclusion
Chapter 2 :: Visual Quality Assessment
An Overview
• Subjective evaluation [8]
• Full-reference algorithms [9]
• Reduced-reference algorithms [7]
• No-reference algorithms [1]
[Figure labels: target image, reference image, extracted features, human subjects, compression]
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
[7] Z. Wang and A. C. Bovik, “Modern Image Quality Assessment,” Synthesis Lectures on Image, Video, and Multimedia Processing, vol. 2, no. 1, pp. 1–156, Jan. 2006.
[8] ITU-R Recommendation BT.500, “Methodology for the subjective assessment of the quality of television pictures.”
[9] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, Apr. 2004.
Chapter 2 :: Visual Quality Assessment
Known Problems and Limitations of No-reference Algorithms
Example: blurring strength [1] per shot, and mean blurring strength for the entire video (low → good quality, high → poor quality).
[Figure: per-shot and mean blurring strength for 𝑉0 (parent video) and 𝑉1 (edited video, one shot deleted); one video is marked "Better quality (!)" and the other "Worse quality (!)"]
Problem: algorithms do not consider deleted shots.
Problem: algorithm output is relative to the visual content.
Our solution: shot identifiers (Chapter 3) and a shot-based penalty model (Chapter 4).
• Enable detection of deleted shots
• Group visually similar shots together
• Normalize algorithm outputs
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
Chapter 3 :: Shot Identification :: Summary
Aim:
1. Enable the reconstruction of the parent video
2. Enable detecting deleted shots
3. Enable applications of algorithms from Chapter 2
Method: represent each unique shot as a unique integer.
[Figure: the shots of videos 𝑉1, 𝑉2 and 𝑉3, each labelled with a shot ID from 1 to 4]
Chapter 3 :: Shot Identification :: Details
Pipeline:
1. Shot segmentation [10]
2. Visual similarity calculation [2]
3. Connected components computation [11]
4. Shot ID assignment
Visually similar shots → equal shot identifiers.
[Figure: the shots of videos 𝑉1, 𝑉2 and 𝑉3 passing through the pipeline]
[2] B. Coskun, B. Sankur, and N. Memon, “Spatio-Temporal Transform Based Video Hashing,” IEEE Transactions on Multimedia, vol. 8, no. 6, pp. 1190–1208, Dec. 2006.
[10] A. Nagasaka and Y. Tanaka, “Automatic Video Indexing and Full-Video Search for Object Appearances,” North Holland Publishing Co., 1992.
[11] J. Hopcroft and R. Tarjan, “Efficient algorithms for graph manipulation,” Communications of the ACM, vol. 16, no. 6, pp. 372–378, 1973.
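The connected-components step can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes the visual similarity calculation has already produced a list of shot pairs judged similar, and it uses a union-find structure in place of the depth-first search of [11]. The function name `assign_shot_ids` and the example pairs are hypothetical.

```python
def assign_shot_ids(num_shots, similar_pairs):
    """Give every shot an integer ID; shots in one connected component share an ID."""
    parent = list(range(num_shots))  # union-find forest, one node per shot

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair judged visually similar.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number the components 1, 2, 3, ... in order of first appearance.
    ids = {}
    result = []
    for i in range(num_shots):
        root = find(i)
        if root not in ids:
            ids[root] = len(ids) + 1
        result.append(ids[root])
    return result


# Hypothetical input: six shots (indices 0..5) and three similar pairs.
pairs = [(0, 2), (2, 4), (1, 5)]
print(assign_shot_ids(6, pairs))  # [1, 2, 1, 3, 1, 2]
```

Similarity is treated as transitive here: shots 0 and 4 receive the same ID because both are similar to shot 2, even though they were never compared directly.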
Chapter 4 :: The Video Authenticity Degree
Our Strategy
Problem: the parent video 𝑉0 is usually unavailable.
Solution: estimate the parent video 𝑉0 from the available edited videos.
Proposed method:
1. Calculate shot identifiers
2. Detect information loss
3. Calculate penalties
4. Aggregate penalties into the authenticity degree of edited video 𝑉𝑗
Editing 𝑉0 (parent video) into 𝑉𝑗 (edited video) loses information through:
1. Shot removal (full loss)
2. Recompression (partial loss)
Information: the message contained by the video; the reason why people watch the video.
Chapter 4 :: The Video Authenticity Degree
An Example
Steps, applied to a set of edited videos:
1. Calculate shot identifiers
2. Estimate parent video
3. Detect removed shots
4. Calculate penalties
5. Aggregate penalties into an authenticity degree for each edited video
Shot IDs (X = removed shot):
𝑉0 (estimate of parent): 1 2 3 4
𝑉1: 1 2 3 X
𝑉2: X 2 3 4
𝑉3: 1 X X 4
𝑉4: X 2 3 4 5 6 (shots 5 and 6 are not in the parent: no penalties)
Penalties: 1.0 for each removed shot; between 0.1 and 0.3 for each retained shot.
Authenticity degrees (as listed on the slide): 0.65, 0.60, 0.40, 0.63
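The penalty and aggregation steps of the example can be sketched as below. The slides do not state the aggregation formula, so this sketch assumes the simplest choice: the authenticity degree is one minus the unweighted mean penalty over the shots of the estimated parent. The parameter 𝛾 that appears in the experiments is not modelled. The function name and the penalty values are illustrative only.

```python
MAX_PENALTY = 1.0  # a removed shot is modelled as complete information loss


def authenticity_degree(parent_shots, quality_penalties):
    """Aggregate per-shot penalties into a single score in [0, 1].

    parent_shots: shot IDs of the estimated parent video.
    quality_penalties: {shot_id: penalty in [0, 1]} for the shots that the
        edited video still contains. Shots missing from this mapping count
        as removed; shots that are not in the parent are ignored.
    """
    penalties = [quality_penalties.get(s, MAX_PENALTY) for s in parent_shots]
    # Assumption: unweighted mean; the thesis may weight shots differently.
    return 1.0 - sum(penalties) / len(penalties)


parent = [1, 2, 3, 4]
edited = {1: 0.2, 4: 0.2}  # shots 2 and 3 were removed (illustrative values)
print(round(authenticity_degree(parent, edited), 2))  # 0.4
```

Iterating over the parent's shots only is what gives "no penalties" for shots that were added by the editor.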
Chapter 4 :: Estimating the Parent Video 𝑉0
Problem: how can we estimate 𝑉0 from the available edited videos?
Solution: examine the frequently-occurring shot identifiers.
Edited videos (shot IDs):
𝑉1: 1 2 3
𝑉2: 2 3 4
𝑉3: 1 4
𝑉4: 2 3 4 5 6
[Figure: shot ID histogram (x-axis: Shot ID, 1 to 6; y-axis: Frequency, 0 to 4) with a threshold line]
Estimated result, 𝑉0 (estimate of parent video): 1 2 3 4
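The histogram-and-threshold idea can be sketched in a few lines. This is a hedged illustration: the threshold value, the use of "greater than or equal", and the ordering of the result by shot ID are all assumptions, since the slides only show a threshold line on the histogram. A real implementation would need temporal information to order the shots.

```python
from collections import Counter


def estimate_parent(edited_videos, threshold):
    """Keep the shot IDs that occur in at least `threshold` edited videos."""
    # set(video) counts each shot ID at most once per video.
    counts = Counter(shot for video in edited_videos for shot in set(video))
    return sorted(shot for shot, count in counts.items() if count >= threshold)


# Shot IDs of the four edited videos from the slide.
videos = [[1, 2, 3], [2, 3, 4], [1, 4], [2, 3, 4, 5, 6]]
print(estimate_parent(videos, threshold=2))  # [1, 2, 3, 4]
```

With these inputs, shot IDs 1 to 4 occur in two or three videos each, while 5 and 6 occur once and are treated as shots added by an editor.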
Chapter 4 :: Penalizing Information Loss (1)
Solving the Problem of Relativity to Visual Content
Problem: algorithm output is relative to the visual content.
Solution: normalize visual quality for each unique shot ID individually.
Information loss is proportional to visual quality loss.
Utilize visual quality algorithms to estimate information loss.
Shot | 𝑉2, shot 1 | 𝑉3, shot 1 | 𝑉4, shot 1 | 𝑉1, shot 2 | 𝑉2, shot 2
Shot ID | 1 | 1 | 1 | 2 | 2
Direct [1] | 1.5 | 1.0 | 2.0 | 4.0 | 5.0
Normalized | 0.0 | -1.2 | 1.2 | -1.0 | 1.0
Penalties | 0.2 | 0.0 | 0.5 | 0.0 | 0.5
Shot ID = 1: 𝜇 = 1.5, 𝜎 = 0.4
Shot ID = 2: 𝜇 = 4.5, 𝜎 = 0.5
Normalization: $z = \frac{x - \mu}{\sigma}$, where 𝑥 is the direct output and 𝑧 is the normalized output.
[Figure: penalty calculation function]
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
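The per-shot normalization can be sketched as below, using the direct outputs from the slide. The sketch assumes the population standard deviation, which is consistent with the 𝜎 values shown (0.4 and 0.5); the function name and the handling of a zero standard deviation are assumptions. The penalty calculation function itself is not reproduced, since the slides do not define it.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_shot(scores):
    """Z-score each direct output against the other outputs with the same shot ID.

    scores: list of (shot_id, direct_output) pairs.
    Returns the normalized outputs in the same order.
    """
    groups = defaultdict(list)
    for shot_id, x in scores:
        groups[shot_id].append(x)

    # Mean and population standard deviation for each shot ID.
    stats = {sid: (mean(xs), pstdev(xs)) for sid, xs in groups.items()}

    result = []
    for shot_id, x in scores:
        mu, sigma = stats[shot_id]
        # Assumption: if all outputs for a shot ID are equal, use z = 0.
        result.append(0.0 if sigma == 0 else (x - mu) / sigma)
    return result


scores = [(1, 1.5), (1, 1.0), (1, 2.0), (2, 4.0), (2, 5.0)]
print([round(z, 2) for z in normalize_per_shot(scores)])
# [0.0, -1.22, 1.22, -1.0, 1.0]
```

After normalization, the outputs for shot ID 2 are no longer larger than those for shot ID 1 merely because that shot's content yields higher raw values.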
Chapter 4 :: Penalizing Information Loss (2)
A Model for Penalizing Shot Removal
Problem: algorithms do not consider deleted shots.
Solution: model shot removal as complete information loss.
Information loss is proportional to visual quality loss.
Utilize visual quality algorithms to estimate information loss.
Shot IDs (X = deleted shot):
𝑉0 (estimate of parent): 1 2 3 4
𝑉2: X 2 3 4 (penalty 1.0 for the deleted shot)
𝑉3: 1 X X 4 (penalty 1.0 for each deleted shot)
Each deleted shot receives the maximum penalty (1.0).
Chapter 4 :: Experiments :: Summary
Exp. | Purpose | Videos | Data type | Editing operations
1 | Demonstrate that the method correctly detects editing operations | 10 | Artificial | Scaling; recompression; remove whole shots; remove parts of shots; reverse shot order; add logo
2 | Demonstrate that the method can correctly estimate the parent video for a large variety of videos | 272 | Artificial | Scaling; recompression; remove whole shots
3 | Demonstrate the effectiveness of the proposed method in a real-life situation | 175 | Real | Unknown
Chapter 4 :: Experiments :: Evaluation Method
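Given 𝑛 videos, the output of a proposed/comparative method: $x = [x_1, x_2, \ldots, x_n]$
Ground truth: $y = [y_1, y_2, \ldots, y_n]$
Sample correlation coefficient:
$$r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
Rank-order correlation coefficient: $\rho(x, y) = r(X, Y)$, where $X$ and $Y$ hold the ranks of the elements of $x$ and $y$.
Example: $x = [2.4, 3.9, 4.1, 3.5, 4.0, 5.9, 6.3]$ has ranks $X = [1, 3, 5, 2, 4, 6, 7]$.
High correlation coefficient corresponds to a good result.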
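Both evaluation criteria can be computed with a short script. This is a minimal sketch in plain Python: the function names are hypothetical, the input values are made up for illustration, and ties in the ranking are not handled.

```python
from math import sqrt


def sample_correlation(x, y):
    """Sample (Pearson) correlation coefficient r(x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = (sqrt(sum((a - mean_x) ** 2 for a in x))
                   * sqrt(sum((b - mean_y) ** 2 for b in y)))
    return numerator / denominator


def ranks(values):
    """1-based ranks in ascending order (ties are not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_correlation(x, y):
    """Rank-order correlation: the sample correlation of the ranks."""
    return sample_correlation(ranks(x), ranks(y))


# Made-up method outputs and ground-truth scores for four videos.
x = [0.9, 0.4, 0.7, 0.1]
y = [5, 2, 4, 3]
print(round(rank_correlation(x, y), 2))  # 0.8
```

The rank-order coefficient depends only on the ordering of the outputs, which suits a method intended to rank edited videos on a relative scale.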
Chapter 4 :: Experiment 1 :: Overview
Data (1 dataset, 10 videos):
Video | Comments | Ground truth
𝑉0 | Parent video (consists of 4 shots) | 1
𝑉1 | Reuploaded 𝑉0 to YouTube | 2
𝑉2 | Removed 10 frames from each shot of 𝑉1 | 3
𝑉3 | Reversed order of shots of 𝑉1 | 4
𝑉4 | Added a shot to 𝑉1 | 5
𝑉5 | Added a logo to 𝑉1 | 6
𝑉6 | Downsampled 𝑉1 to 720p | 7
𝑉7 | Removed one shot from 𝑉0 | 8
𝑉8 | Removed two shots from 𝑉0 | 9
𝑉9 | Removed 60 frames from each shot of 𝑉0 | 10
Evaluation criteria: sample correlation coefficient between the ranks of the output of the proposed method and the ground truth.
High correlation coefficient corresponds to a good result.
Chapter 4 :: Experiment 1 :: Results
Each cell gives the output of the proposed method, with its rank in parentheses.
Video | Comments | ER | 𝛾 = 0.13 | 𝛾 = 0.25 | 𝛾 = 0.50 | 𝛾 = 0.75
𝑉0 | Parent video | 1 | 0.99 (1) | 0.97 (1) | 0.94 (1) | 0.92 (1)
𝑉1 | Reuploaded 𝑉0 to YouTube | 2 | 0.97 (3) | 0.93 (3) | 0.87 (3) | 0.80 (3)
𝑉2 | Removed 10 frames from each shot of 𝑉1 | 3 | 0.96 (4) | 0.92 (4) | 0.84 (4) | 0.75 (4)
𝑉3 | Reversed order of shots of 𝑉1 | 4 | 0.94 (6) | 0.88 (6) | 0.77 (6) | 0.65 (6)
𝑉4 | Added a shot to 𝑉1 | 5 | 0.95 (5) | 0.89 (5) | 0.78 (5) | 0.67 (5)
𝑉5 | Added a logo to 𝑉1 | 6 | 0.97 (2) | 0.94 (2) | 0.88 (2) | 0.81 (2)
𝑉6 | Downsampled 𝑉1 to 720p | 7 | 0.88 (7) | 0.75 (7) | 0.50 (8) | 0.25 (9)
𝑉7 | Removed one shot from 𝑉0 | 8 | 0.72 (8) | 0.68 (8) | 0.61 (7) | 0.55 (7)
𝑉8 | Removed two shots from 𝑉0 | 9 | 0.48 (9) | 0.45 (9) | 0.40 (9) | 0.36 (8)
𝑉9 | Removed 60 frames from each shot of 𝑉0 | 10 | 0.00 (10) | 0.00 (10) | 0.00 (10) | 0.00 (10)
𝑟 | | | 0.88 | 0.88 | 0.87 | 0.85
Different values for 𝛾 influence the output values, but have little effect on their rank (only 𝑉6 to 𝑉8 change rank).
Proposed method estimates the authenticity of this dataset effectively.
Proposed method does not penalize partial shot removal or changes in shot order.
Chapter 4 :: Experiment 2 :: Overview
Data (16 datasets, 272 videos):
• 16 parent videos from #PopularOnYouTube
• Genres: movie trailers, documentaries, comedy, sports, etc.
• Each parent video edited to create 17 edited videos
Editing operation | Parameter type | Parameter values
Downsampling | Resolution | 720p, 480p, 360p
H.264 recompression | CRF | 18, 26, 34, 40
Shot removal | Percentage | 10%, 20%, …, 90%
Ground truth (𝑦): subjective evaluation by 12 individuals (12 subjects)
Comparative methods (𝑥): no-reference visual quality assessment algorithm [1]
Criteria:
• Sample correlation coefficient: 𝑟(𝑥, 𝑦)
• Rank-order correlation coefficient: 𝜌(𝑥, 𝑦)
High correlation coefficient corresponds to a good result.
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
Chapter 4 :: Experiment 2
Obtaining Subjective Evaluation Scores
Problem: many videos and parameters → objective evaluation is difficult.
Solution: obtain ground truth through subjective evaluations.
For each experiment subject:
    For each video:
        1. Score visual quality (1 = worst, 5 = best)
        2. Score removed shots (1 = most, 5 = least)
        3. Score authenticity (1 = lowest, 5 = highest)
For each video:
    Ground truth score ← mean of (3) across all subjects.
Chapter 4 :: Experiment 2
Subjective Evaluation Interface
Demo available
Chapter 4 :: Experiment 2 :: Results
Proposed method is more effective than the comparative method for most datasets.
Comparative method is not sensitive to editing other than recompression & resampling.
[Figure: two bar charts, one per criterion, showing the sample correlation coefficient (𝑟) and the rank-order correlation coefficient (𝜌) on a 0 to 1 axis, including the comparative method (Comp.)]
Chapter 4 :: Experiment 3 :: Overview
Data (5 datasets, 175 videos):
• 5 search queries
• 8 ~ 76 videos downloaded from YouTube for each query
Name | Videos | Total duration | Shots | Unique IDs
Bolt | 68 | 4 h 42 min | 1933 | 275
Kerry | 5 | 0 h 47 min | 103 | 24
Klaus | 76 | 1 h 16 min | 253 | 61
Lagos | 8 | 0 h 6 min | 17 | 17
Russell | 18 | 2 h 50 min | 1748 | 103
Total | 175 | 9 h 41 min | 4116 | 480
Ground truth (𝑦): subjective evaluation by 20 individuals (20 subjects)
Comparative methods (𝑥):
(1) View count
(2) Upload timestamp
(3) No-reference visual quality assessment algorithm [1]
Criteria:
• Sample correlation coefficient: 𝑟(𝑥, 𝑦)
• Rank-order correlation coefficient: 𝜌(𝑥, 𝑦)
High correlation coefficient corresponds to a good result.
[1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004.
Chapter 4 :: Experiment 3 :: Results
Estimating authenticity for real data is a difficult task, even for humans.
Proposed method outperforms the comparative methods for most datasets.
[Figure: two bar charts showing the sample correlation coefficient (𝑟) and the rank-order correlation coefficient (𝜌) on a 0 to 1 axis; series: View #, Time, Edge W., Prop., Ideal]
Chapter 4 :: Demo :: Summary
Video | Editing operations | Authenticity Degree
Parent video | None | 1.00
Edited video 1 | H.264 recompression (CRF = 40) | 0.70
Edited video 2 | Removed shots (60% of all shots removed) | 0.43
Parent video available at: http://youtu.be/xAsjRRMMg_Q (July 21)
Chapter 4 :: Demo :: Parent Video Screenshot
Authenticity Degree (AD) = 1.00
Chapter 4 :: Demo :: Edited Video 1 Screenshot
Authenticity Degree (AD) = 0.70
Chapter 4 :: Demo :: Zoomed Comparison
Parent video (AD = 1.00) Edited video 1 (AD = 0.70)
Chapter 4 :: Demo :: Parent Video Shots
Authenticity Degree (AD) = 1.00
Chapter 4 :: Demo :: Edited Video 2 Shots
Authenticity Degree (AD) = 0.43
Full videos
available
Conclusion and Future Work
Many applications require a method for determining video authenticity:
• Search result summarization
• Content tracking
• Content aggregation
Information loss is proportional to visual quality loss.
Utilize visual quality algorithms to estimate information loss.
Future work:
‒ Consider shot order
‒ Consider inter-frame differences
‒ Detect partial shot removal
‒ Focus on the audio signal as well

Weitere ähnliche Inhalte

Ähnlich wie phd-mark4

Recent advances in content based video copy detection (IEEE)
Recent advances in content based video copy detection (IEEE)Recent advances in content based video copy detection (IEEE)
Recent advances in content based video copy detection (IEEE)PACE 2.0
 
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...ijma
 
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...ijma
 
Video Summarization for Sports
Video Summarization for SportsVideo Summarization for Sports
Video Summarization for SportsIRJET Journal
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A SurveyIRJET Journal
 
IRJET-Feature Extraction from Video Data for Indexing and Retrieval
IRJET-Feature Extraction from Video Data for Indexing and Retrieval IRJET-Feature Extraction from Video Data for Indexing and Retrieval
IRJET-Feature Extraction from Video Data for Indexing and Retrieval IRJET Journal
 
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...Content Based Video Retrieval Using Integrated Feature Extraction and Persona...
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...IJERD Editor
 
Parking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationParking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationIRJET Journal
 
Inside prototype 16-113
Inside prototype   16-113Inside prototype   16-113
Inside prototype 16-113Kasun Udayanga
 
Semantic Summarization of videos, Semantic Summarization of videos
Semantic Summarization of videos, Semantic Summarization of videosSemantic Summarization of videos, Semantic Summarization of videos
Semantic Summarization of videos, Semantic Summarization of videosdarsh228313
 
Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotationIAEME Publication
 
Deep neural networks for Youtube recommendations
Deep neural networks for Youtube recommendationsDeep neural networks for Youtube recommendations
Deep neural networks for Youtube recommendationsAryan Khandal
 
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...IRJET Journal
 
Real-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big DataReal-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big DataIRJET Journal
 
Video Content Identification using Video Signature: Survey
Video Content Identification using Video Signature: SurveyVideo Content Identification using Video Signature: Survey
Video Content Identification using Video Signature: SurveyIRJET Journal
 
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...Rupali Bhatnagar
 
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...SWAMI06
 

Ähnlich wie phd-mark4 (20)

Recent advances in content based video copy detection (IEEE)
Recent advances in content based video copy detection (IEEE)Recent advances in content based video copy detection (IEEE)
Recent advances in content based video copy detection (IEEE)
 
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...
 
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre...
 
Video Summarization for Sports
Video Summarization for SportsVideo Summarization for Sports
Video Summarization for Sports
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A Survey
 
IRJET-Feature Extraction from Video Data for Indexing and Retrieval
IRJET-Feature Extraction from Video Data for Indexing and Retrieval IRJET-Feature Extraction from Video Data for Indexing and Retrieval
IRJET-Feature Extraction from Video Data for Indexing and Retrieval
 
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...Content Based Video Retrieval Using Integrated Feature Extraction and Persona...
Content Based Video Retrieval Using Integrated Feature Extraction and Persona...
 
Parking Surveillance Footage Summarization
Parking Surveillance Footage SummarizationParking Surveillance Footage Summarization
Parking Surveillance Footage Summarization
 
Inside prototype 16-113
Inside prototype   16-113Inside prototype   16-113
Inside prototype 16-113
 
Semantic Summarization of videos, Semantic Summarization of videos
Semantic Summarization of videos, Semantic Summarization of videosSemantic Summarization of videos, Semantic Summarization of videos
Semantic Summarization of videos, Semantic Summarization of videos
 
AcademicProject
AcademicProjectAcademicProject
AcademicProject
 
Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotation
 
Deep neural networks for Youtube recommendations
Deep neural networks for Youtube recommendationsDeep neural networks for Youtube recommendations
Deep neural networks for Youtube recommendations
 
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
A survey on Measurement of Objective Video Quality in Social Cloud using Mach...
 
Real-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big DataReal-Time Video Copy Detection in Big Data
Real-Time Video Copy Detection in Big Data
 
Video Content Identification using Video Signature: Survey
Video Content Identification using Video Signature: SurveyVideo Content Identification using Video Signature: Survey
Video Content Identification using Video Signature: Survey
 
L0956974
L0956974L0956974
L0956974
 
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...
Human Activity Recognition (HAR) using HMM based Intermediate matching kernel...
 
K1803027074
K1803027074K1803027074
K1803027074
 
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
A Segmentation based Sequential Pattern Matching for Efficient Video Copy De...
 

phd-mark4

  • 1. Estimating Video Authenticity via the Analysis of Visual Quality and Video Structure 画質と映像構造の解析に基づく 映像の信頼性推定に関する研究 2015/07/31 Laboratory of Media Dynamics Graduate School of Information Science and Technology Michael Penkov
  • 2. 1 [*] http://www.youtube.com/t/press_statistics/ (accessed 2015/06/29) Need to distinguish between the parent and edited videos Introduction :: Background Estimating Video Authenticity via the Analysis of Visual Quality and Video Structure Many independent uploaders Upload rate: 300 h/min [*] Much video is duplicated No screening of content Parent video Edited video (1st gen.) Edited video (2nd gen.) Authenticity of video important event (e.g. news) most objective most reliable closest to the truth least edited Search result summarization Content tracking Content aggregation Cheaper phones/cameras Faster networks Increase of video sharing Free sharing services (Low)(High) Editing Editing Upload Reupload … … How similar is the edited video to the parent video?
  • 3. [1] P. Marziliano et al, “Perceptual blur and ringing metrics: Application to JPEG2000,” in Elsevier Signal Processing: Image Communication, vol. 19, pp. 163–172, 2004. [2] B. Coskun, B. Sankur, and N. Memon, “Spatio-Temporal Transform Based Video Hashing,” IEEE Transactions on Multimedia, vol. 8, no. 6, pp. 1190–1208, Dec. 2006. [3] Z. Dias, A. Rocha, and S. Goldenstein, “Video Phylogeny: Recovering near- duplicate video relationships,” in 2011 IEEE International Workshop on Information Forensics and Security. IEEE, Nov. 2011, pp. 1–6. [4] F. Battisti, M. Carli, and A. Neri, “Image forgery detection by means of no- reference quality metrics,” in SPIE Vol. 8303, 2012. [5] S. Lameri, P. Bestagini, A. Melloni, S. Milani, A. Rocha, M. Tagliasacchi, and S. Tubaro, “Who is my parent? Reconstructing video sequences from partially matching shots,” in IEEE International Conference on Image Processing (ICIP), 2014. Video Phylogeny [3] Forgery Detection [4] 2Introduction :: Related Research Estimating Video Authenticity via the Analysis of Visual Quality and Video Structure Our Research Parent Video Estimation [5] Existing methods can… • Estimate visual quality [1] • Quantify video similarity [2] • Estimate hierarchical relationships between videos [3] • Detect copy-paste forgeries through inconsistencies in visual quality [4] • Estimate parent video from edited videos [5] 2010 2015 Visual quality ∝ Authenticity Video structure  estimate deleted shots Video Similarity Digital Forensics Visual Quality Assessment No-reference VQA [1] Robust video hash [2] Estimate authenticity of edited videos Visual quality: low visual quality  low authenticity Video structure: many deleted shots  low authenticity
  • 4. Introduction :: Research Map. Figure: a map of four fields (Digital Forensics, Shot Segmentation, Video Similarity, Visual Quality Assessment) with our research placed among them, next to the related works [4], [5], and [6]. Contribution: bridging Visual Quality Assessment and Digital Forensics.
      [4] F. Battisti, M. Carli, and A. Neri, “Image forgery detection by means of no-reference quality metrics,” in SPIE Vol. 8303, 2012.
      [5] S. Lameri, P. Bestagini, A. Melloni, S. Milani, A. Rocha, M. Tagliasacchi, and S. Tubaro, “Who is my parent? Reconstructing video sequences from partially matching shots,” in IEEE International Conference on Image Processing (ICIP), 2014.
      [6] M. Penkov, T. Ogawa, and M. Haseyama, “Fidelity estimation of online video based on video quality measurement and Web information,” 26th Signal Processing Symposium, vol. A3-5, pp. 70–74, 2011.
  • 5. Introduction :: Our Contribution. Authenticity degree: the proportion of information retained by an edited video, on a scale from 1 to 0. It is a relative scale for ranking edited videos, and it still applies when the parent video is not available. Information: the message contained by the video; the reason why people watch the video. Higher authenticity degree: higher visual quality, few deleted shots. Lower authenticity degree: lower visual quality, many deleted shots. Contribution: bridge digital forensics and visual quality assessment.
  • 6. Introduction :: Thesis Contents • Chapter 1: Introduction • Chapter 2: Visual Quality Assessment • Our contribution: Visual quality ∝ Authenticity • Reviews conventional algorithms • Enables the proposed method to quantify information loss • Chapter 3: Shot Identification • Utilizes the structure of videos to 1. Enable the reconstruction of the parent video when it is not available 2. Enable detecting deleted shots 3. Enable applications of conventional visual quality assessment algorithms from Chapter 2 • Chapter 4: The Video Authenticity Degree • The proposed method for estimating video authenticity • Chapter 5: Conclusion
  • 8. Chapter 2 :: Visual Quality Assessment :: An Overview. Subjective evaluation [8]: human subjects score the target image. Full-reference algorithms [9]: compare the target image with the reference image. Reduced-reference algorithms [7]: compare the target image with features extracted from the reference image. No-reference algorithms [1]: use the target image only. In the figure, the target image is obtained from the reference image by compression.
      [1] P. Marziliano et al., “Perceptual blur and ringing metrics: Application to JPEG2000,” Signal Processing: Image Communication (Elsevier), vol. 19, pp. 163–172, 2004.
      [7] Z. Wang and A. C. Bovik, “Modern Image Quality Assessment,” Synthesis Lectures on Image, Video, and Multimedia Processing, vol. 2, no. 1, pp. 1–156, Jan. 2006.
      [8] ITU-R Recommendation BT.500, “Methodology for the subjective assessment of the quality of television pictures.”
      [9] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, Apr. 2004.
  • 9. Chapter 2 :: Visual Quality Assessment :: Known Problems and Limitations of No-reference Algorithms. Example metric: blurring strength [1] (low → good quality, high → poor quality), averaged into a mean blurring strength for the entire video. Problem: algorithm output is relative to the visual content. Problem: algorithms do not consider deleted shots. Figure: per-shot blurring strengths and whole-video means for $V_0$ (parent video) and $V_1$ (edited video, with one shot deleted); values shown: 1.0, 1.5, 0.5, 1.4, 1.2, 1.0, X (deleted), 1.2. As a result, the edited video $V_1$ can be scored as better quality (!) and the parent video $V_0$ as worse quality (!). Our solution: shot identifiers (Chapter 3) and a shot-based penalty model (Chapter 4), which enable detection of deleted shots, group visually similar shots together, and normalize algorithm outputs. [1] P. Marziliano et al., “Perceptual blur and ringing metrics: Application to JPEG2000,” Signal Processing: Image Communication (Elsevier), vol. 19, pp. 163–172, 2004.
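The deleted-shot problem on this slide can be illustrated with a few lines of Python. This is only a sketch: the per-shot blur scores below are hypothetical (the transcript does not preserve which value belongs to which shot), and the whole-video score is assumed to be a plain mean of per-shot scores.

```python
from statistics import fmean

# Hypothetical per-shot blur scores (higher = blurrier = worse quality).
parent_blur = [1.0, 1.5, 0.5, 1.4]

# The edited video deletes the blurriest shot and is otherwise unchanged.
edited_blur = [1.0, 0.5, 1.4]

parent_mean = fmean(parent_blur)
edited_mean = fmean(edited_blur)

# A lower mean blur is read as "better quality", so the edited video
# outranks its own parent even though it contains less information.
print(f"parent: {parent_mean:.2f}, edited: {edited_mean:.2f}")
```

A whole-video mean has no way to notice that a shot is missing, which is why the deck introduces shot identifiers before penalizing anything.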
  • 11. Chapter 3 :: Shot Identification :: Summary. Aim: 1. Enable the reconstruction of the parent video 2. Enable detecting deleted shots 3. Enable applications of algorithms from Chapter 2. Method: represent each unique shot as a unique integer. Figure: the shots of videos $V_1$, $V_2$, $V_3$ before and after labelling with shot identifiers 1 to 4.
  • 12. Chapter 3 :: Shot Identification :: Details. Pipeline: shot segmentation [10] → visual similarity calculation [2] → connected components computation [11] → shot ID assignment. Figure: the shots of videos $V_1$, $V_2$, $V_3$ are grouped into connected components labelled ID = 1, ID = 2, ID = 3, ID = 4. Visually similar shots → equal shot identifiers.
      [2] B. Coskun, B. Sankur, and N. Memon, “Spatio-Temporal Transform Based Video Hashing,” IEEE Transactions on Multimedia, vol. 8, no. 6, pp. 1190–1208, Dec. 2006.
      [10] A. Nagasaka and Y. Tanaka, “Automatic Video Indexing and Full-Video Search for Object Appearances,” North Holland Publishing Co., 1992.
      [11] J. Hopcroft and R. Tarjan, “Efficient algorithms for graph manipulation,” Communications of the ACM, vol. 16, no. 6, pp. 372–378, 1973.
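The last two pipeline steps (connected components, then shot ID assignment) can be sketched as follows. This is an illustration, not the thesis implementation: it uses a union-find structure in place of the depth-first search of [11], and the `is_similar` predicate and the integer "hashes" standing in for shots are hypothetical placeholders for the video hash comparison of [2].

```python
from itertools import combinations

def assign_shot_ids(shots, is_similar):
    """Give every shot an integer ID; visually similar shots share an ID."""
    parent = list(range(len(shots)))  # union-find forest over shot indices

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair of similar shots.
    for a, b in combinations(range(len(shots)), 2):
        if is_similar(shots[a], shots[b]):
            parent[find(a)] = find(b)

    # Number the components 1, 2, 3, ... in order of first appearance.
    ids, result = {}, []
    for i in range(len(shots)):
        root = find(i)
        ids.setdefault(root, len(ids) + 1)
        result.append(ids[root])
    return result

# Hypothetical one-number "hashes"; shots within distance 2 count as similar.
hashes = [10, 11, 50, 12, 51, 90]
shot_ids = assign_shot_ids(hashes, lambda a, b: abs(a - b) <= 2)
print(shot_ids)  # [1, 1, 2, 1, 2, 3]
```

Using connected components rather than direct pairwise matches means similarity is treated as transitive: two shots that are not directly similar still share an ID if a chain of similar shots links them.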
  • 14. Chapter 4 :: The Video Authenticity Degree :: Our Strategy. Information: the message contained by the video; the reason why people watch the video. Editing the parent video $V_0$ into an edited video $V_j$ loses information through 1. Shot removal (full loss) 2. Recompression (partial loss). Proposed method: detect information loss → calculate penalties → aggregate penalties → authenticity degree of edited video $V_j$. Problem: the parent video $V_0$ is usually unavailable. Solution: estimate the parent video $V_0$ from the available edited videos.
  • 15. Chapter 4 :: The Video Authenticity Degree :: An Example. Pipeline: a set of edited videos → calculate shot identifiers → estimate parent video → detect removed shots → calculate penalties → aggregate penalties → authenticity degree for each edited video.
      Shot IDs (X = removed shot):
      $V_0$ (estimate of parent): 1 2 3 4
      $V_1$: 1 2 3 X
      $V_2$: X 2 3 4
      $V_3$: 1 X X 4
      $V_4$: X 2 3 4 5 6 (shots 5 and 6 are not in the parent: no penalties)
      Penalties shown in the figure: 1.0 for each removed shot (five in total); the remaining shots receive 0.1, 0.2, 0.1, 0.2, 0.1, 0.3, 0.2, 0.2, 0.1, 0.2, 0.2.
      Authenticity degrees: 0.65, 0.60, 0.40, 0.63.
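The slide does not say how the penalties are aggregated. One rule that is consistent with the numbers shown is "one minus the mean penalty over the parent's shots"; the sketch below assumes that rule, assumes the four degrees are listed in the order V1 to V4, and uses illustrative per-shot penalties, because the transcript does not preserve which penalty belongs to which shot. The thesis may well use a different formula (Experiment 1 refers to a parameter γ that does not appear here).

```python
def authenticity_degree(parent_shots, video_shots, penalties, max_penalty=1.0):
    """1 - mean penalty over the shots of the (estimated) parent video.

    Assumed aggregation rule, not confirmed by the slides.
    penalties maps shot ID -> quality penalty for shots present in the video.
    """
    present = set(video_shots)
    total = 0.0
    for shot_id in parent_shots:
        if shot_id in present:
            total += penalties.get(shot_id, 0.0)  # partial loss
        else:
            total += max_penalty                  # removed shot: full loss
    return 1.0 - total / len(parent_shots)

parent = [1, 2, 3, 4]

# Illustrative penalties chosen to be compatible with the slide's totals.
v1 = authenticity_degree(parent, [1, 2, 3], {1: 0.1, 2: 0.2, 3: 0.1})
v3 = authenticity_degree(parent, [1, 4], {1: 0.2, 4: 0.2})
# Shots 5 and 6 are not in the parent, so any penalty on them is ignored.
v4 = authenticity_degree(parent, [2, 3, 4, 5, 6],
                         {2: 0.1, 3: 0.2, 4: 0.2, 5: 0.9})

print(round(v1, 2), round(v3, 2), round(v4, 3))
```

Under these assumptions V1 comes out at 0.65 and V3 at 0.40, matching the figure, and V4 comes out at 0.625, which the slide would show as 0.63.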
  • 17. Chapter 4 :: Estimating the Parent Video $V_0$. Problem: how can we estimate $V_0$ from the available edited videos? Solution: examine the frequently-occurring shot identifiers. Edited videos (shot IDs): $V_1$: 1 2 3; $V_2$: 2 3 4; $V_3$: 1 4; $V_4$: 2 3 4 5 6. Figure: shot ID histogram (x-axis: shot ID, 1 to 6; y-axis: frequency, 0 to 4) with a threshold line. Estimated result: $V_0$ (estimate of parent video) = 1 2 3 4.
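The histogram-and-threshold step can be sketched as below, using the four edited videos from this slide. The threshold value (`min_count=2`) is an assumption, since the slide only shows a threshold line, and returning the shots in ascending ID order is a simplification; the slide does not state how the order of shots in the parent is recovered.

```python
from collections import Counter

def estimate_parent(videos, min_count=2):
    """Keep the shot IDs that occur in at least min_count edited videos."""
    # set(shots) counts each shot ID at most once per video.
    counts = Counter(shot for shots in videos for shot in set(shots))
    return sorted(shot for shot, n in counts.items() if n >= min_count)

edited = {
    "V1": [1, 2, 3],
    "V2": [2, 3, 4],
    "V3": [1, 4],
    "V4": [2, 3, 4, 5, 6],
}
parent_estimate = estimate_parent(edited.values())
print(parent_estimate)  # [1, 2, 3, 4]
```

Shots 5 and 6 each occur in only one video, so they fall below the threshold and are treated as additions rather than as part of the parent.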
  • 19. Chapter 4 :: Penalizing Information Loss (1) :: Solving the Problem of Relativity to Visual Content. Information loss is proportional to visual quality loss, so we utilize visual quality algorithms to estimate information loss. Problem: algorithm output is relative to the visual content. Solution: normalize visual quality for each unique shot ID individually, $z = \frac{x - \mu}{\sigma}$, where $x$ is the direct output, $z$ is the normalized output, and $\mu$, $\sigma$ are the mean and standard deviation of the direct outputs for that shot ID. The normalized output is then passed to the penalty calculation function.
      Shot ID | Direct output [1] | $\mu$, $\sigma$ | Normalized output | Penalties
      1 (three shots) | 1.5, 1.0, 2.0 | 1.5, 0.4 | 0.0, -1.2, 1.2 | 0.2, 0.0, 0.5
      2 (two shots) | 4.0, 5.0 | 4.5, 0.5 | -1.0, 1.0 | 0.0, 0.5
      [1] P. Marziliano et al., “Perceptual blur and ringing metrics: Application to JPEG2000,” Signal Processing: Image Communication (Elsevier), vol. 19, pp. 163–172, 2004.
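The per-shot-ID normalization can be reproduced with the standard library. This sketch assumes σ is the population standard deviation, which is what reproduces the values on the slide; the penalty calculation function is not specified in the deck and is deliberately left out.

```python
from statistics import fmean, pstdev

def normalize_by_shot(scores_by_shot):
    """z-score the direct outputs separately for each shot ID."""
    normalized = {}
    for shot_id, scores in scores_by_shot.items():
        mu = fmean(scores)
        sigma = pstdev(scores)  # population standard deviation (assumption)
        # If all copies of a shot score the same, define z as 0 for each.
        normalized[shot_id] = [
            (x - mu) / sigma if sigma > 0 else 0.0 for x in scores
        ]
    return normalized

# Direct outputs from the slide, grouped by shot ID.
direct = {1: [1.5, 1.0, 2.0], 2: [4.0, 5.0]}
z = normalize_by_shot(direct)
for shot_id, values in z.items():
    print(shot_id, [round(v, 1) for v in values])
```

After normalization a blurry-by-nature shot (ID 2, raw scores around 4.5) and a sharp one (ID 1, around 1.5) are on the same scale, so only the degradation relative to other copies of the same shot remains.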
  • 20. Chapter 4 :: Penalizing Information Loss (2) :: A Model for Penalizing Shot Removal. Information loss is proportional to visual quality loss, so we utilize visual quality algorithms to estimate information loss. Problem: algorithms do not consider deleted shots. Solution: model shot removal as complete information loss, so that the penalty calculation function assigns the maximum penalty (1.0) to each deleted shot. Example (shot IDs, X = deleted shot): $V_0$ (estimate of parent): 1 2 3 4; $V_2$: X 2 3 4 (one penalty of 1.0); $V_3$: 1 X X 4 (two penalties of 1.0).
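Detecting deleted shots reduces to a set difference once shot IDs are available. A minimal sketch, assuming shots are compared by ID only; the function name and the `max_penalty` parameter are illustrative.

```python
def removal_penalties(parent_shots, video_shots, max_penalty=1.0):
    """Assign the maximum penalty to every parent shot missing from the video."""
    present = set(video_shots)
    return {s: max_penalty for s in parent_shots if s not in present}

parent = [1, 2, 3, 4]
print(removal_penalties(parent, [2, 3, 4]))  # {1: 1.0}
print(removal_penalties(parent, [1, 4]))     # {2: 1.0, 3: 1.0}
```

Because the comparison is by ID only, this model cannot see a shot that was shortened rather than removed, which is consistent with the deck listing partial shot removal as future work.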
  • 21. Chapter 4 :: Experiments :: Summary
      Exp. | Purpose | Videos | Data type | Editing operations
      1 | Demonstrate that the method correctly detects editing operations | 10 | Artificial | Scaling; recompression; remove whole shots; remove parts of shots; reverse shot order; add logo
      2 | Demonstrate that the method can correctly estimate the parent video for a large variety of videos | 272 | Artificial | Scaling; recompression; remove whole shots
      3 | Demonstrate the effectiveness of the proposed method in a real-life situation | 175 | Real | Unknown
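  • 22. Chapter 4 :: Experiments :: Evaluation Method. Given $n$ videos, the output of a proposed/comparative method is $x = [x_1, x_2, \ldots, x_n]$ and the ground truth is $y = [y_1, y_2, \ldots, y_n]$. Sample correlation coefficient: $r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$, where $\bar{x}$ and $\bar{y}$ are the sample means. Rank-order correlation coefficient: $\rho(x, y) = r(X, Y)$, where $X$ and $Y$ are the ranks of $x$ and $y$; for example, $x = [2.4, 3.9, 4.1, 3.5, 4.0, 5.9, 6.3]$ has ranks $X = [1, 3, 5, 2, 4, 6, 7]$. A high correlation coefficient corresponds to a good result.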
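Both coefficients are short to write out directly. The sketch below uses the example vector from this slide; the ground-truth vector `y` is hypothetical, and the ranking helper assumes there are no ties (tied values would need averaged ranks).

```python
from math import sqrt

def pearson(x, y):
    """Sample correlation coefficient r(x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [2.4, 3.9, 4.1, 3.5, 4.0, 5.9, 6.3]  # method output (from the slide)
y = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 7.0]  # hypothetical ground truth
print(ranks(x))
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

The rank-order coefficient only depends on the ordering of the scores, so it is the more relevant of the two when the authenticity degree is used purely to rank edited videos.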
  • 24. Chapter 4 :: Experiment 1 :: Overview. Data: 1 dataset, 10 videos.
      Video | Comments | Ground truth
      $V_0$ | Parent video (consists of 4 shots) | 1
      $V_1$ | Reuploaded $V_0$ to YouTube | 2
      $V_2$ | Removed 10 frames from each shot of $V_1$ | 3
      $V_3$ | Reversed order of shots of $V_1$ | 4
      $V_4$ | Added a shot to $V_1$ | 5
      $V_5$ | Added a logo to $V_1$ | 6
      $V_6$ | Downsampled $V_1$ to 720p | 7
      $V_7$ | Removed one shot from $V_0$ | 8
      $V_8$ | Removed two shots from $V_0$ | 9
      $V_9$ | Removed 60 frames from each shot of $V_0$ | 10
      Evaluation criteria: sample correlation coefficient between the ranks of the output of the proposed method and the ground truth. A high correlation coefficient corresponds to a good result.
  • 25. Chapter 4 :: Experiment 1 :: Results. Each cell gives the output of the proposed method with its rank in parentheses.
      Video | Comments | ER | γ = 0.13 | γ = 0.25 | γ = 0.50 | γ = 0.75
      $V_0$ | Parent video | 1 | 0.99 (1) | 0.97 (1) | 0.94 (1) | 0.92 (1)
      $V_1$ | Reuploaded $V_0$ to YouTube | 2 | 0.97 (3) | 0.93 (3) | 0.87 (3) | 0.80 (3)
      $V_2$ | Removed 10 frames from each shot of $V_1$ | 3 | 0.96 (4) | 0.92 (4) | 0.84 (4) | 0.75 (4)
      $V_3$ | Reversed order of shots of $V_1$ | 4 | 0.94 (6) | 0.88 (6) | 0.77 (6) | 0.65 (6)
      $V_4$ | Added a shot to $V_1$ | 5 | 0.95 (5) | 0.89 (5) | 0.78 (5) | 0.67 (5)
      $V_5$ | Added a logo to $V_1$ | 6 | 0.97 (2) | 0.94 (2) | 0.88 (2) | 0.81 (2)
      $V_6$ | Downsampled $V_1$ to 720p | 7 | 0.88 (7) | 0.75 (7) | 0.50 (8) | 0.25 (9)
      $V_7$ | Removed one shot from $V_0$ | 8 | 0.72 (8) | 0.68 (8) | 0.61 (7) | 0.55 (7)
      $V_8$ | Removed two shots from $V_0$ | 9 | 0.48 (9) | 0.45 (9) | 0.40 (9) | 0.36 (8)
      $V_9$ | Removed 60 frames from each shot of $V_0$ | 10 | 0.00 (10) | 0.00 (10) | 0.00 (10) | 0.00 (10)
      $r$ | | | 0.88 | 0.88 | 0.87 | 0.85
      Different values for γ influence the output values but change the ranks only slightly (only $V_6$, $V_7$, and $V_8$ shift for γ ≥ 0.50). Proposed method estimates the authenticity of this dataset effectively. Proposed method does not penalize partial shot removal or changes in shot order.
  • 27. Chapter 4 :: Experiment 2 :: Overview. Data: 16 datasets, 272 videos, 12 subjects. 16 parent videos from #PopularOnYouTube. Genres: movie trailers, documentaries, comedy, sports, etc. Each parent video edited to create 17 edited videos. Ground truth ($y$): subjective evaluation by 12 individuals.
      Editing operation | Parameter type | Parameter values
      Downsampling | Resolution | 720p, 480p, 360p
      H.264 recompression | CRF | 18, 26, 34, 40
      Shot removal | Percentage | 10%, 20%, …, 90%
      Comparative method ($x$): no-reference visual quality assessment algorithm [1]. Criteria: sample correlation coefficient $r(x, y)$ and rank-order correlation coefficient $\rho(x, y)$. A high correlation coefficient corresponds to a good result. [1] P. Marziliano et al., “Perceptual blur and ringing metrics: Application to JPEG2000,” Signal Processing: Image Communication (Elsevier), vol. 19, pp. 163–172, 2004.
  • 28. Chapter 4 :: Experiment 2 :: Obtaining Subjective Evaluation Scores. Problem: many videos and parameters → objective evaluation is difficult. Solution: obtain ground truth through subjective evaluations. For each experiment subject, for each video: 1. Score visual quality (1 = worst, 5 = best) 2. Score removed shots (1 = most, 5 = least) 3. Score authenticity (1 = lowest, 5 = highest). For each video: ground truth score = mean of (3) across all subjects.
  • 29. Chapter 4 :: Experiment 2 :: Subjective Evaluation Interface. Demo available.
  • 30. Chapter 4 :: Experiment 2 :: Results. Proposed method is more effective than the comparative method for most datasets. Comparative method is not sensitive to editing other than recompression & resampling. Figure: two bar charts comparing the proposed method with the comparative method (Comp.), one for the sample correlation coefficient ($r$) and one for the rank-order correlation coefficient ($\rho$); vertical axis from 0 to 1.
  • 32. Chapter 4 :: Experiment 3 :: Overview. Data: 5 datasets, 175 videos, 20 subjects. 5 search queries; 8 to 76 videos downloaded from YouTube for each query. Ground truth ($y$): subjective evaluation by 20 individuals.
      Name | Videos | Total duration | Shots | Unique IDs
      Bolt | 68 | 4 h 42 min | 1933 | 275
      Kerry | 5 | 0 h 47 min | 103 | 24
      Klaus | 76 | 1 h 16 min | 253 | 61
      Lagos | 8 | 0 h 6 min | 17 | 17
      Russell | 18 | 2 h 50 min | 1748 | 103
      Total | 175 | 9 h 41 min | 4116 | 480
      Comparative methods ($x$): (1) view count, (2) upload timestamp, (3) no-reference visual quality assessment algorithm [1]. Criteria: sample correlation coefficient $r(x, y)$ and rank-order correlation coefficient $\rho(x, y)$. A high correlation coefficient corresponds to a good result. [1] P. Marziliano et al., “Perceptual blur and ringing metrics: Application to JPEG2000,” Signal Processing: Image Communication (Elsevier), vol. 19, pp. 163–172, 2004.
  • 33. Chapter 4 :: Experiment 3 :: Results. Estimating authenticity for real data is a difficult task, even for humans. Proposed method outperforms the comparative methods for most datasets. Figure: two bar charts, one for the sample correlation coefficient ($r$) and one for the rank-order correlation coefficient ($\rho$), comparing View #, Time, Edge W., Prop., and Ideal; vertical axis from 0 to 1.
  • 34. Chapter 4 :: Demo :: Summary
      Video | Editing operations | Authenticity degree
      Parent video | None | 1.00
      Edited video 1 | H.264 recompression (CRF = 40) | 0.70
      Edited video 2 | Removed shots (60% of all shots removed) | 0.43
      Parent video available at: http://youtu.be/xAsjRRMMg_Q (July 21)
  • 36. Chapter 4 :: Demo :: Parent Video Screenshot Authenticity Degree (AD) = 1.00
  • 37. Chapter 4 :: Demo :: Edited Video 1 Screenshot Authenticity Degree (AD) = 0.70
  • 38. Chapter 4 :: Demo :: Zoomed Comparison Parent video (AD = 1.00) Edited video 1 (AD = 0.70)
  • 40. Chapter 4 :: Demo :: Parent Video Shots Authenticity Degree (AD) = 1.00
  • 41. Chapter 4 :: Demo :: Edited Video 2 Shots Authenticity Degree (AD) = 0.43 Full videos available
  • 42. Conclusion and Future Work. Many applications require a method for determining video authenticity: search result summarization, content tracking, content aggregation. Information loss is proportional to visual quality loss, so we utilize visual quality algorithms to estimate information loss. Future work: consider shot order; consider inter-frame differences; detect partial shot removal; focus on the audio signal as well.

Editor's notes

  1. Define “authenticity” here Similarity to the original? How well does the video convey the message of the parent video?
  2. Explain meaning of blue line. [4] is different direction, but similar in general.
  3. Change values to edge width
  4. Change values to edge width
  5. How do you aggregate the penalties?
  6. How do you aggregate the penalties?
  7. How do you aggregate the penalties?
  8. Penalty is simple and sufficient
  9. Introduce gamma here
  10. Make this a graph?
  11. Make it a graph
  12. Very wordy
  13. Transition to next slide.