Siamese Spatio-temporal convolutional neural network for stroke classification in Table Tennis games

•Als PPTX, PDF herunterladen•

0 gefällt mir•29 views

Paper: http://ceur-ws.org/Vol-2670/MediaEval_19_paper_58.pdf Youtube: https://www.youtube.com/watch?v=KgTUlU_gALI Pierre-Etienne Martin, Jenny Benois-Pineau, Boris Mansencal, Renaud Peteri and Julien Morlier , Siamese Spatio-temporal convolutional neural network for stroke classification in Table Tennis games. Proc. of MediaEval 2019, 27-29 October 2019, Sophia Antipolis, France. Abstract: This work presents a Table Tennis stroke classification approach through a siamese spatio-temporal convolutional neural network - SSTCNN. The videos are recorded at 120 frames per second with players performing in natural conditions. The frames are extracted, resized and processed to compute the optical flow. From the optical flow, a region of interest - ROI - is inferred. The SSTCNN is then fed by RGB and optical flow ROIs stream to give a probabilistic classification over all the table tennis strokes. Presented by Pierre-Etienne Martin

Technologie

Benchmarking Initiative for Multimedia Evaluation
MediaEval
MediaEval 2019
10th Anniversary Workshop
27-29 October 2019
EURECOM, Sophia Antipolis, France

ediaEval 2019
by:
Pierre-Etienne Martin
Jenny Benois-Pineau
Boris Mansencal
Renaud Péteri
Julien Morlier
2
Siamese Spatio-Temporal
Convolutional Neural
Network for stroke
classification in Table
Tennis games

Runs submitted
3
Accuracies in %
Rank Team Run1 Run2 Run3 Run4
1
Univ. Bordeaux -
LaBRI
19.2 17.2 17.8 22.9

Data construction
4
➢ RGB extraction
➢ Optical Flow computation
○ DIS
○ Deep Flow
➢ ROI calculation from Flow
Dis * Foreground Deep Flow
[1] P.-E. Martin et al., “Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks,”
in IEEE ICIP, 2019.

Data Normalization
5
➢ Normalization which takes into account the distribution of the maximum flow values in each
direction of each frame among the training set
[1] P.-E. Martin et al., “Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks,”
in IEEE ICIP, 2019.
Max Normalization Ours ‘Normal’

Data augmentation
6
➢ Spatial augmentation centered in the ROI center
○ translation
○ rotations
○ zooms
○ flip
➢ RGB channel swap
➢ Temporal augmentation with probabilistic random extraction of 100 successive frames within the
boundaries of the stroke

Model
7
➢ Cuboides of size (W x H x T) = (120 x 120 x 100)
➢ Trained from scratch with 250 epochs on split train dataset
➢ Trained with full dataset using number of epochs with best performances
SSTCNN Architecture
[2] P.-E. Martin et al., “Sport action recognition with siamese spatio-temporal cnns: Application to table tennis,” in IEEE CBMI, 2018.

Runs submitted
8
Accuracies in %
Rank Team Run1 Run2 Run3 Run4
1
Univ. Bordeaux -
LaBRI
19.2 17.2 17.8 22.9
?

Runs submitted
9
Accuracies in %
Train dataset Split Full
Flow DIS DeepFlow DIS DeepFlow
Runs 19.2 17.2 17.8 22.9

Evaluation
11
➢ Hand
○ Forehand
○ Backhand

Evaluation
12
➢ Type
○ Service
○ Offensive
○ Defensive

Conclusion
14
➢ Need negative samples
➢ Multi label or/and cascade
○ Forehand / Backhand
○ Offensive / Defensive / Services
○ Type of stroke
[3] P.-E. Martin et al., “Fine-Grained Action Detection and Classification in Table Tennis with Siamese Spatio-Temporal Convolutional
Neural Network,” in IEEE ICIP, 2019.

LaBRI - Building A30
351 cours de la Libération
33405, Talence
pierre-etienne.martin@u-bordeaux.fr
https://www.labri.fr/projet/AIV/CRISP_presentation.php
+33 5 40 00 38 80
www.linkedin.com/in/p-e-martin
@P_eMartin
Merci

Empfohlen

AI in AerospaceObject Automation

Photogrametry_3D_Modelling[1]Joachim Nkendeys

Esa pres-nisser-2020-11-20Advanced-Concepts-Team

Fyp presentation for kaydriveMohammadHamzaAmjad

Field Geometry, auto steering and servicesCAPIGI

Tianjin Case StudyVisionMap

Empfohlen

AI in AerospaceObject Automation

Photogrametry_3D_Modelling[1]Joachim Nkendeys

Esa pres-nisser-2020-11-20Advanced-Concepts-Team

Fyp presentation for kaydriveMohammadHamzaAmjad

Field Geometry, auto steering and servicesCAPIGI

Tianjin Case StudyVisionMap

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal...multimediaeval

MediaEval 2018: Fine grained sport action recognition: Application to table t...multimediaeval

Sports Video Annotation: Detection of Strokes in Table Tennis Taskmultimediaeval

NVIDIA GTC 2018 PresentationTomasz Bednarz

Analysis by semantic segmentation of Multispectral satellite imagery using de...Yogesh S Awate

Wearable Accelerometer Optimal Positions for Human Motion Recognition(LifeTec...sugiuralab

MediaEval 2018: Ensembled Convolutional Neural Network Models for Retrieving ...multimediaeval

Artificial Neural Network Based Graphical User Interface for Estimation of Fa...ijsrd.com

Deep Learning TomographyAmir Adler

The study on mining temporal patterns and related applications in dynamic soc...Thanh Hieu

BPSO&1-NN algorithm-based variable selection for power system stability ident...IJAEMSJORNAL

Development of 3D convolutional neural network to recognize human activities ...journalBEEI

"An adaptive modular approach to the mining of sensor network ...butest

Selective and incremental re-computation in reaction to changes: an exercise ...Paolo Missier

IBM Cloud Paris Meetup 20180517 - Deep Learning ChallengesIBM France Lab

CROP PROTECTION AGAINST BIRDS USING DEEP LEARNING AND IOTIRJET Journal

A multi-sensor based uncut crop edge detection method for head-feeding combin...Institute of Agricultural Machinery, NARO

Model-counting Approaches For Nonlinear Numerical ConstraintsQuoc-Sang Phan

Fractional step discriminant pruningVasileiosMezaris

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table...multimediaeval

Sports Video Classification: Classification of Strokes in Table Tennis for Me...multimediaeval

Weitere ähnliche Inhalte

Ähnlich wie Siamese Spatio-temporal convolutional neural network for stroke classification in Table Tennis games

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal...multimediaeval

MediaEval 2018: Fine grained sport action recognition: Application to table t...multimediaeval

Sports Video Annotation: Detection of Strokes in Table Tennis Taskmultimediaeval

NVIDIA GTC 2018 PresentationTomasz Bednarz

Analysis by semantic segmentation of Multispectral satellite imagery using de...Yogesh S Awate

Wearable Accelerometer Optimal Positions for Human Motion Recognition(LifeTec...sugiuralab

MediaEval 2018: Ensembled Convolutional Neural Network Models for Retrieving ...multimediaeval

Artificial Neural Network Based Graphical User Interface for Estimation of Fa...ijsrd.com

Deep Learning TomographyAmir Adler

The study on mining temporal patterns and related applications in dynamic soc...Thanh Hieu

BPSO&1-NN algorithm-based variable selection for power system stability ident...IJAEMSJORNAL

Development of 3D convolutional neural network to recognize human activities ...journalBEEI

"An adaptive modular approach to the mining of sensor network ...butest

Selective and incremental re-computation in reaction to changes: an exercise ...Paolo Missier

IBM Cloud Paris Meetup 20180517 - Deep Learning ChallengesIBM France Lab

CROP PROTECTION AGAINST BIRDS USING DEEP LEARNING AND IOTIRJET Journal

A multi-sensor based uncut crop edge detection method for head-feeding combin...Institute of Agricultural Machinery, NARO

Model-counting Approaches For Nonlinear Numerical ConstraintsQuoc-Sang Phan

Fractional step discriminant pruningVasileiosMezaris

Ähnlich wie Siamese Spatio-temporal convolutional neural network for stroke classification in Table Tennis games (20)

Classification of Strokes in Table Tennis with a Three Stream Spatio-Temporal...

MediaEval 2018: Fine grained sport action recognition: Application to table t...

Sports Video Annotation: Detection of Strokes in Table Tennis Task

NVIDIA GTC 2018 Presentation

Analysis by semantic segmentation of Multispectral satellite imagery using de...

Wearable Accelerometer Optimal Positions for Human Motion Recognition(LifeTec...

MediaEval 2018: Ensembled Convolutional Neural Network Models for Retrieving ...

Artificial Neural Network Based Graphical User Interface for Estimation of Fa...

Deep Learning Tomography

The study on mining temporal patterns and related applications in dynamic soc...

BPSO&1-NN algorithm-based variable selection for power system stability ident...

Development of 3D convolutional neural network to recognize human activities ...

"An adaptive modular approach to the mining of sensor network ...

Selective and incremental re-computation in reaction to changes: an exercise ...

IBM Cloud Paris Meetup 20180517 - Deep Learning Challenges

CROP PROTECTION AGAINST BIRDS USING DEEP LEARNING AND IOT

A multi-sensor based uncut crop edge detection method for head-feeding combin...

Model-counting Approaches For Nonlinear Numerical Constraints

Fractional step discriminant pruning

Mehr von multimediaeval

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table...multimediaeval

Sports Video Classification: Classification of Strokes in Table Tennis for Me...multimediaeval

Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention...multimediaeval

Essex-NLIP at MediaEval Predicting Media Memorability 2020 Taskmultimediaeval

Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a V...multimediaeval

Fooling an Automatic Image Quality Estimatormultimediaeval

Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable C...multimediaeval

Pixel Privacy: Quality Camouflage for Social Imagesmultimediaeval

HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matchingmultimediaeval

Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attentio...multimediaeval

HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ ...multimediaeval

Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Int...multimediaeval

Deep Conditional Adversarial learning for polyp Segmentationmultimediaeval

A Temporal-Spatial Attention Model for Medical Image Detectionmultimediaeval

HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Netw...multimediaeval

Fine-tuning for Polyp Segmentation with Attentionmultimediaeval

Bigger Networks are not Always Better: Deep Convolutional Neural Networks for...multimediaeval

Insights for wellbeing: Predicting Personal Air Quality Index using Regressio...multimediaeval

Use Visual Features From Surrounding Scenes to Improve Personal Air Quality ...multimediaeval

Personal Air Quality Index Prediction Using Inverse Distance Weighting Methodmultimediaeval

Mehr von multimediaeval (20)

HCMUS at MediaEval 2020: Ensembles of Temporal Deep Neural Networks for Table...

Sports Video Classification: Classification of Strokes in Table Tennis for Me...

Predicting Media Memorability from a Multimodal Late Fusion of Self-Attention...

Essex-NLIP at MediaEval Predicting Media Memorability 2020 Task

Overview of MediaEval 2020 Predicting Media Memorability task: What Makes a V...

Fooling an Automatic Image Quality Estimator

Fooling Blind Image Quality Assessment by Optimizing a Human-Understandable C...

Pixel Privacy: Quality Camouflage for Social Images

HCMUS at MediaEval 2020:Image-Text Fusion for Automatic News-Images Re-Matching

Efficient Supervision Net: Polyp Segmentation using EfficientNet and Attentio...

HCMUS at Medico Automatic Polyp Segmentation Task 2020: PraNet and ResUnet++ ...

Depth-wise Separable Atrous Convolution for Polyps Segmentation in Gastro-Int...

Deep Conditional Adversarial learning for polyp Segmentation

A Temporal-Spatial Attention Model for Medical Image Detection

HCMUS-Juniors 2020 at Medico Task in MediaEval 2020: Refined Deep Neural Netw...

Fine-tuning for Polyp Segmentation with Attention

Bigger Networks are not Always Better: Deep Convolutional Neural Networks for...

Insights for wellbeing: Predicting Personal Air Quality Index using Regressio...

Use Visual Features From Surrounding Scenes to Improve Personal Air Quality ...

Personal Air Quality Index Prediction Using Inverse Distance Weighting Method

Kürzlich hochgeladen

Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix

GenCyber Cyber Security Day PresentationMichael W. Hawkins

IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge

How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software

08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls

Histor y of HAM Radio presentation slidevu2urc

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung

My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar

How to convert PDF to text with Nanonetsnaman860154

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los

Boost PC performance: How more available memory can improve productivityPrincipled Technologies

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia

[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745

Salesforce Community Group Quito, Salesforce 101Paola De la Torre

A Call to Action for Generative AI in 2024Results

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh

Presentation on how to chat with PDF using ChatGPT code interpreternaman860154

Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK

Kürzlich hochgeladen (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...

GenCyber Cyber Security Day Presentation

IAC 2024 - IA Fast Track to Search Focused AI Solutions

How to Troubleshoot Apps for the Modern Connected Worker

From Event to Action: Accelerate Your Decision Making with Real-Time Automation

08448380779 Call Girls In Friends Colony Women Seeking Men

Histor y of HAM Radio presentation slide

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

My Hashitalk Indonesia April 2024 Presentation

How to convert PDF to text with Nanonets

The 7 Things I Know About Cyber Security After 25 Years | April 2024

Boost PC performance: How more available memory can improve productivity

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...

[2024]Digital Global Overview Report 2024 Meltwater.pdf

Salesforce Community Group Quito, Salesforce 101

A Call to Action for Generative AI in 2024

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi

Presentation on how to chat with PDF using ChatGPT code interpreter

Unblocking The Main Thread Solving ANRs and Frozen Frames

Siamese Spatio-temporal convolutional neural network for stroke classification in Table Tennis games

1. Benchmarking Initiative for Multimedia Evaluation MediaEval MediaEval 2019 10th Anniversary Workshop 27-29 October 2019 EURECOM, Sophia Antipolis, France

2. ediaEval 2019 by: Pierre-Etienne Martin Jenny Benois-Pineau Boris Mansencal Renaud Péteri Julien Morlier 2 Siamese Spatio-Temporal Convolutional Neural Network for stroke classification in Table Tennis games

3. Runs submitted 3 Accuracies in % Rank Team Run1 Run2 Run3 Run4 1 Univ. Bordeaux - LaBRI 19.2 17.2 17.8 22.9

4. Data construction 4 ➢ RGB extraction ➢ Optical Flow computation ○ DIS ○ Deep Flow ➢ ROI calculation from Flow Dis * Foreground Deep Flow [1] P.-E. Martin et al., “Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks,” in IEEE ICIP, 2019.

5. Data Normalization 5 ➢ Normalization which takes into account the distribution of the maximum flow values in each direction of each frame among the training set [1] P.-E. Martin et al., “Optimal choice of motion estimation methods for fine-grained action classification with 3D convolutional networks,” in IEEE ICIP, 2019. Max Normalization Ours ‘Normal’

6. Data augmentation 6 ➢ Spatial augmentation centered in the ROI center ○ translation ○ rotations ○ zooms ○ flip ➢ RGB channel swap ➢ Temporal augmentation with probabilistic random extraction of 100 successive frames within the boundaries of the stroke

7. Model 7 ➢ Cuboides of size (W x H x T) = (120 x 120 x 100) ➢ Trained from scratch with 250 epochs on split train dataset ➢ Trained with full dataset using number of epochs with best performances SSTCNN Architecture [2] P.-E. Martin et al., “Sport action recognition with siamese spatio-temporal cnns: Application to table tennis,” in IEEE CBMI, 2018.

8. Runs submitted 8 Accuracies in % Rank Team Run1 Run2 Run3 Run4 1 Univ. Bordeaux - LaBRI 19.2 17.2 17.8 22.9 ?

9. Runs submitted 9 Accuracies in % Train dataset Split Full Flow DIS DeepFlow DIS DeepFlow Runs 19.2 17.2 17.8 22.9

10. Best run 10 ➢ 20 strokes

11. Evaluation 11 ➢ Hand ○ Forehand ○ Backhand

12. Evaluation 12 ➢ Type ○ Service ○ Offensive ○ Defensive

13. Evaluation 13 ➢ Hand + Type

14. Conclusion 14 ➢ Need negative samples ➢ Multi label or/and cascade ○ Forehand / Backhand ○ Offensive / Defensive / Services ○ Type of stroke [3] P.-E. Martin et al., “Fine-Grained Action Detection and Classification in Table Tennis with Siamese Spatio-Temporal Convolutional Neural Network,” in IEEE ICIP, 2019.

15. LaBRI - Building A30 351 cours de la Libération 33405, Talence pierre-etienne.martin@u-bordeaux.fr https://www.labri.fr/projet/AIV/CRISP_presentation.php +33 5 40 00 38 80 www.linkedin.com/in/p-e-martin @P_eMartin Merci

Hinweis der Redaktion

This is the point CLEF, where CLEF starts to interweave with MediaEval, the benchmarking initiative for multimedia evaluation. This is the first joint session.