Activity Prediction
Using a Space-Time CNN and Bayesian Framework
Hirokatsu KATAOKA, Yoshimitsu AOKI†, Kenji IWATA, Yutaka SATOH
National Institute of Advanced Industrial Science and Technology (AIST)
† Keio University
http://www.hirokatsukataoka.net/
Background
•  Computer vision for human sensing
–  Detection, tracking, trajectory analysis
–  Posture estimation, action analysis
–  Action recognition can extend human sensing applications
[Figure: human sensing overview. Labels: Detection, Tracking, Trajectory extraction, Face Recognition, Posture Estimation, Gaze Estimation, Action Recognition, Action Analysis; Mental state, Body Situation, Attention; examples: "shaking hands", "look at people"]
Related work 1: Action Recognition
•  Action is a low-level primitive with semantic meaning
–  e.g. walking, running, sitting
[Figure: "This image contains a man walking". Action recognition as classification (location is given); output label: Walking]
Is action recognition enough?
[Figure: two time-series diagrams. Event detection (action tag: A_i) is post-detection; event prediction (prediction tag: A_j) is pre-estimation]
Related work 2: Early Action Recognition
•  Prediction from the early part of an action
–  Integral bag-of-words
–  Accumulating likelihood through the time sequence
M. S. Ryoo, “Human Activity Prediction: Early Recognition of Ongoing Activities from Streaming Videos”, International Conference on Computer Vision (ICCV), pp. 1036-1043, 2011.
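The idea of an integral bag-of-words can be sketched as a cumulative histogram: at each time step, the histogram counts every visual word seen so far, so a partially observed action already has a comparable representation. This is a minimal illustration, not Ryoo's implementation; the function name and the toy word stream are assumptions.

```python
def integral_histograms(word_stream, vocab_size):
    """Return one cumulative visual-word histogram per time step.

    word_stream[t] is the list of visual-word ids observed at step t
    (hypothetical input; a real system would quantize video features).
    """
    counts = [0] * vocab_size
    history = []
    for words in word_stream:
        for w in words:
            counts[w] += 1
        history.append(list(counts))  # snapshot of everything seen so far
    return history


# Toy stream: three time steps over a vocabulary of three visual words.
stream = [[0, 2], [2], [1, 2, 2]]
hists = integral_histograms(stream, vocab_size=3)
```

Each entry of `hists` can then be compared against a model histogram for the same fraction of the action, which is how a likelihood could be accumulated over time.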
Proposal
•  Action prediction with an ST-CNN and a Bayesian framework
–  Action recognition
–  Database analysis
[Figure: time series of attributes. Given: Time Zone = Daytime (x_timezone), Previous Action = Walking (x_previous), Current Action = Sitting (x_current). Not given: Next Action = ???, with θ = “Using a PC”]
Problem settings
•  Three different tasks in action analysis
–  Action recognition
•  Recognizing $A_t$ given frames $1, \ldots, t$
–  Early action recognition
•  Recognizing $A_t$ given frames $1, \ldots, t-L$
–  Action prediction
•  Recognizing $A_{t+L}$ given frames $1, \ldots, t$
Approach: Setting
Action Recognition: $f(F^{A}_{1 \ldots t}) \to A_t$
Early Action Recognition: $f(F^{A}_{1 \ldots t-L}) \to A_t$
Action Prediction: $f(F^{A}_{1 \ldots t}) \to A_{t+L}$
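The three settings differ only in which frames are observed and which label is the target. The sketch below makes that explicit on a toy sequence; the function name, the setting strings, and the example labels are illustrative assumptions, and t is 1-indexed as on the slide.

```python
def make_example(frames, labels, t, L, setting):
    """Return (observed_frames, target_label) for one of the three settings.

    frames and labels are per-frame lists; t is 1-indexed as on the slide.
    """
    if setting == "recognition":          # f(F_{1..t}) -> A_t
        return frames[:t], labels[t - 1]
    if setting == "early_recognition":    # f(F_{1..t-L}) -> A_t
        return frames[:t - L], labels[t - 1]
    if setting == "prediction":           # f(F_{1..t}) -> A_{t+L}
        return frames[:t], labels[t + L - 1]
    raise ValueError(f"unknown setting: {setting}")


frames = list(range(1, 11))  # frame ids 1..10
labels = ["walk"] * 4 + ["sit"] * 3 + ["use a PC"] * 3

recognition = make_example(frames, labels, t=6, L=2, setting="recognition")
early = make_example(frames, labels, t=6, L=2, setting="early_recognition")
prediction = make_example(frames, labels, t=6, L=2, setting="prediction")
```

With t = 6 and L = 2, recognition and prediction see the same six frames, but prediction targets the label two frames ahead.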
Process flow
•  Consists of (i) action recognition and (ii) action prediction
1.  Action recognition
1.1 Improved dense trajectories (IDT)
1.2 Space-time convolutional neural networks (ST-CNN)
2.  Action prediction
2.1 Bayesian framework
2.2 Database
[Figure: processing pipeline. IDT branch: pedestrian detection, trajectories (in t + L frames), feature extraction (HOG, HOF, MBH, Traj.), bag-of-words (BoW). ST-CNN branch: Oxford VGG architecture (VGGNet), drawn as Input, five Conv-Conv-Pool blocks, FC]
Action Recognition (1/2)
•  Improved Dense Trajectories (IDT) [Wang+, ICCV2013]
–  Pyramidal image sequences and flow tracking
–  Feature descriptors on trajectories
–  Feature representation with bag-of-words (BoW)
[Figure: examples labeled "sitting" and "walking"]
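The bag-of-words step can be illustrated as follows: each trajectory descriptor is assigned to its nearest codeword, and the assignments are pooled into a normalized histogram. This is a minimal sketch under assumptions; the toy 2-D codebook, the squared Euclidean distance, and the L1 normalization are illustrative choices, not details taken from the slides.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """L1-normalized histogram of codeword assignments for one video clip."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]          # toy 2-D codewords
descriptors = [(0.1, 0.2), (0.9, 1.2), (1.1, 0.8), (3.5, 0.1)]
hist = bow_histogram(descriptors, codebook)
```

In practice the codebook would be learned (for example by clustering training descriptors), and one histogram would be built per descriptor type.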
Action Recognition (1/2)
•  IDT + Co-occurrence HOG [Kataoka+, ACCV2014]
CoHOG: counts edge pairs into the corresponding histogram position
Extended CoHOG (ECoHOG): accumulates edge magnitudes
–  PCA dimensionality reduction: from $10^3$ to $10^4$ dimensions down to $10^1$ to $10^2$, making classes easier to separate in feature space
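The difference between counting edge pairs and accumulating edge magnitudes can be sketched on a tiny grid of quantized orientations. This is an illustration only: the single pixel offset, the two orientation bins, and the choice to add the sum of both magnitudes in the ECoHOG-style variant are assumptions, not the published definition.

```python
def cooccurrence_histogram(orient, mag, offset, n_bins, use_magnitude=False):
    """Co-occurrence histogram of quantized edge orientations for one offset.

    orient[y][x] is an orientation bin in [0, n_bins); mag[y][x] is the
    gradient magnitude. With use_magnitude=False each pixel pair adds 1
    (CoHOG-style counting); with True it adds the sum of the two magnitudes
    (an ECoHOG-style accumulation; the exact weighting is an assumption).
    """
    dy, dx = offset
    h, w = len(orient), len(orient[0])
    hist = [[0.0] * n_bins for _ in range(n_bins)]
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                weight = mag[y][x] + mag[y2][x2] if use_magnitude else 1.0
                hist[orient[y][x]][orient[y2][x2]] += weight
    return hist


orient = [[0, 1], [1, 0]]          # toy 2x2 orientation bins
mag = [[1.0, 2.0], [3.0, 4.0]]     # toy gradient magnitudes
cohog = cooccurrence_histogram(orient, mag, offset=(0, 1), n_bins=2)
ecohog = cooccurrence_histogram(orient, mag, offset=(0, 1), n_bins=2,
                                use_magnitude=True)
```

Both variants fill the same histogram positions; only the weight added per pair changes.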
Action Recognition (2/2)
•  Space-time Convolutional Neural Networks (ST-CNN)
–  Based on VGG 16-layer architecture (VGGNet) [Simonyan+, ICLR2015]
–  Spatio-temporal feature concatenation (around 10 frames)
[Figure: space-time CNN (ST-CNN) feature. CNN architecture with VGGNet, drawn as Input, five Conv-Conv-Pool blocks, FC, FC, FC, Softmax]
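Spatio-temporal concatenation can be pictured as stacking per-frame feature vectors over a sliding window of about 10 frames. The sketch below uses toy 2-D features; which network layer supplies the per-frame features and how the window is stepped are not stated on the slide, so both are assumptions here.

```python
def space_time_features(frame_features, window=10):
    """Concatenate per-frame feature vectors over a sliding window.

    frame_features[i] is the feature vector of frame i (e.g. a CNN layer
    output; which layer is used is not stated on the slide).
    Returns one concatenated vector per window position.
    """
    out = []
    for start in range(len(frame_features) - window + 1):
        vec = []
        for f in frame_features[start:start + window]:
            vec.extend(f)
        out.append(vec)
    return out


# 12 frames with toy 2-D features give 3 windows of length 10 * 2 = 20.
feats = [[float(i), float(i) + 0.5] for i in range(12)]
st = space_time_features(feats, window=10)
```

The concatenated vector grows linearly with the window length, which is one reason a dimensionality reduction step may be applied afterwards.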
Action Prediction (1/2)
•  Prediction model
- Action sequence
Predicting “Using a PC” after “Walk” => “Sit”
- Time zone (supplemental info.)
Daytime
[Figure: time series of attributes. Given: Time Zone = Daytime (x_timezone), Previous Activity = Walking (x_previous), Current Activity = Sitting (x_current). Not given: Next Activity = ???, with θ = “Using a PC”]
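One way to read this model, assuming a naive Bayes factorization (the slide says only "Bayesian framework", so the conditional independence of the three attributes given θ is an assumption), is as a maximum a posteriori choice of the next activity θ:

```latex
\begin{align}
\hat{\theta}
  &= \arg\max_{\theta}\; P(\theta \mid x_{\mathrm{timezone}}, x_{\mathrm{previous}}, x_{\mathrm{current}}) \\
  &= \arg\max_{\theta}\; P(\theta)\,
     P(x_{\mathrm{timezone}} \mid \theta)\,
     P(x_{\mathrm{previous}} \mid \theta)\,
     P(x_{\mathrm{current}} \mid \theta)
\end{align}
```

The second line follows from Bayes' rule, dropping the evidence term that does not depend on θ, together with the independence assumption.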
Action Prediction (2/2)
•  Database: ST-action tags + attributes
–  Time zone
•  “morning”, “day time”, “night”
–  Previous & current action
•  “walk”, “bend”, “stand”, “sit”…
–  Next action (objective)
•  “use a PC”, “read”, “meal”…
[Figure: Action History DB, example entry: Daytime, Walking, Sitting, Using a PC]
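A database of this shape is enough to estimate a simple probabilistic predictor by counting. The sketch below assumes a naive Bayes model with Laplace smoothing; the model choice, the smoothing constant `alpha`, the per-attribute value counts in `n_values`, and the five toy history rows are all illustrative assumptions rather than the authors' data or method.

```python
from collections import Counter, defaultdict


def train(history):
    """Count next-action priors and per-attribute likelihoods.

    history rows are (timezone, previous, current, next_action).
    """
    prior = Counter()
    likelihood = defaultdict(Counter)   # (attr_index, next_action) -> value counts
    for *attrs, nxt in history:
        prior[nxt] += 1
        for i, value in enumerate(attrs):
            likelihood[(i, nxt)][value] += 1
    return prior, likelihood


def predict(prior, likelihood, attrs, n_values=(3, 4, 4), alpha=1.0):
    """Posterior over next actions with Laplace smoothing (alpha is an assumption)."""
    total = sum(prior.values())
    scores = {}
    for nxt, n in prior.items():
        p = n / total
        for i, value in enumerate(attrs):
            p *= (likelihood[(i, nxt)][value] + alpha) / (n + alpha * n_values[i])
        scores[nxt] = p
    z = sum(scores.values())
    return {k: v / z for k, v in scores.items()}


# Toy action history (made up for illustration).
history = [
    ("day time", "walk", "sit", "use a PC"),
    ("day time", "walk", "sit", "use a PC"),
    ("day time", "stand", "sit", "read"),
    ("night", "walk", "sit", "read"),
    ("day time", "walk", "sit", "meal"),
]
prior, likelihood = train(history)
posterior = predict(prior, likelihood, ("day time", "walk", "sit"))
best = max(posterior, key=posterior.get)
```

On this toy history, the query (day time, walk, sit) ranks "use a PC" highest, mirroring the example on the slides.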
Experiments on the Daily Living Data
–  Total 20h of video
–  3 different scenes
–  640x480, 30fps
Results
•  Action recognition
–  IDT (HOG, HOF, MBH, CoHOG, ECoHOG, All)
–  Per-frame CNN
–  ST-CNN
–  Combined vector
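A combined vector can be formed by concatenating the IDT and ST-CNN descriptors. The sketch below L2-normalizes each part first so that neither dominates by scale; the normalization step and the toy values are assumptions, since the slides do not say how the vectors are combined beyond concatenation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def combined_vector(idt_bow, st_cnn):
    """Concatenate the two descriptors after per-descriptor L2 normalization."""
    return l2_normalize(idt_bow) + l2_normalize(st_cnn)


# Toy descriptors: a 2-D "IDT" vector and a 3-D "ST-CNN" vector.
combined = combined_vector([3.0, 4.0], [0.0, 5.0, 0.0])
```

The resulting vector has the summed dimensionality of both inputs and can be passed to any standard classifier.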
Results
•  Action prediction
[Figure: action prediction examples. Labels: Time Attributes, Estimated Intention, Action. Scores shown: PC (0.82), Read (0.11); Predicted activity: Read (1.00), PC (0.00)]
Conclusion
•  Action prediction approach combining recognition and database analysis
–  Concatenated vector of IDT and ST-CNN
–  Bayesian framework
–  Database

【VISAPP2016】Activity Prediction Using a Space-Time CNN and Bayesian Framework
