Regularizing Class-wise Predictions via Self-knowledge Distillation (CVPR 2020)
1. Regularizing Class-wise Predictions via Self-knowledge Distillation
Sukmin Yun1, Jongjin Park1, Kimin Lee2, Jinwoo Shin1
1 Korea Advanced Institute of Science and Technology (KAIST)
2 University of California, Berkeley
CVPR 2020
2. Introduction
• We propose a new output regularizer that utilizes dark knowledge, i.e., the information in a model's predictions on non-target labels
[Figure: a DNN's softmax prediction; the probability mass on non-target labels is the "dark knowledge"]
• Supervision on dark knowledge leads to better generalization!
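A small illustration of the idea, with hypothetical logit values: raising the softmax temperature makes the probability structure over non-target labels (the dark knowledge) more visible.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for a 4-class problem; class 0 is the target
logits = torch.tensor([[8.0, 4.5, 4.0, 0.5]])
for T in (1.0, 4.0):
    p = F.softmax(logits / T, dim=1)
    print(f"T={T}: {[round(v, 3) for v in p.squeeze().tolist()]}")
# At T=1 nearly all mass sits on the target class; at T=4 the relative
# probabilities of the non-target labels become apparent.
```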
3. Class-wise Self-knowledge Distillation (CS-KD)
• Self-supervision by penalizing the difference between the predictions on similar samples
• $P(y \mid x'; \tilde{\theta})$ is the self-supervision of $P(y \mid x; \theta)$, where $x$ and $x'$ are similar samples (same class) and $\tilde{\theta}$ is a fixed copy of the parameters $\theta$
[Figure: two samples of the same class are fed to the network; the prediction on one supervises the other]
4. Class-wise Self-knowledge Distillation (CS-KD)
• The total training loss is defined as follows:
$$\mathcal{L}_{\text{tot}}(x, x'; \theta, T) = \mathcal{L}_{\text{CE}}(x, y; \theta) + \lambda_{\text{cls}} \cdot T^2 \cdot \mathrm{KL}\big(P(y \mid x'; \tilde{\theta}, T) \,\|\, P(y \mid x; \theta, T)\big)$$
where $\mathrm{KL}$ denotes the Kullback-Leibler divergence, $\mathcal{L}_{\text{CE}}$ denotes the cross-entropy loss, $T$ is the softmax temperature, and $\lambda_{\text{cls}}$ is a balancing weight
[Figure: similar samples (same class) paired for the class-wise regularization term]
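A minimal PyTorch sketch of this loss, assuming `model` is any classifier returning logits and that `x_prime[i]` shares the label `y[i]` with `x[i]`; the values T = 4.0 and lambda_cls = 1.0 are illustrative defaults, not the paper's tuned settings.

```python
import torch
import torch.nn.functional as F

def cs_kd_loss(model, x, x_prime, y, T=4.0, lambda_cls=1.0):
    """Cross-entropy on x plus class-wise self-distillation from x_prime."""
    logits = model(x)                      # predictions being trained (theta)
    with torch.no_grad():                  # \tilde{theta}: fixed copy, no gradient
        logits_prime = model(x_prime)

    ce = F.cross_entropy(logits, y)        # standard supervised term
    # KL( P(y|x'; \tilde{theta}, T) || P(y|x; theta, T) ), scaled by T^2
    kl = F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(logits_prime / T, dim=1),
        reduction="batchmean",
    )
    return ce + lambda_cls * (T ** 2) * kl
```

Note that `F.kl_div` takes the log-probabilities of the trained prediction as its first argument and the (detached) soft target as its second, which matches the KL direction in the formula above.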
5. Class-wise Self-knowledge Distillation (CS-KD)
• CS-KD achieves two desirable goals simultaneously (contrasted with label smoothing in the sketch after this slide):
1. Preventing overconfident predictions
• The goal of entropy regularization methods [1]
• CS-KD utilizes the model's predictions on other samples as soft labels
2. Reducing the intra-class variations
• The goal of margin-based methods [2]
• CS-KD minimizes the distance between the logits of two samples of the same class

[1] When does label smoothing help? In NeurIPS, 2019.
[2] AdaCos: Adaptively scaling cosine logits for effectively learning deep face representations. In CVPR, 2019.
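An illustrative contrast of the two kinds of soft target, with assumed values (5 classes, smoothing eps = 0.1, temperature T = 4.0): label smoothing [1] uses a fixed, uniform soft target, while CS-KD's soft target is the model's own prediction on another same-class sample, so it carries sample-dependent dark knowledge.

```python
import torch
import torch.nn.functional as F

num_classes, eps, T = 5, 0.1, 4.0
y = 2  # true class of the current sample

# Label smoothing [1]: (1 - eps) on the true class, eps spread uniformly
ls_target = torch.full((1, num_classes), eps / num_classes)
ls_target[0, y] += 1.0 - eps

# CS-KD: the soft target is the (temperature-scaled) prediction on another
# sample x' of the same class -- sample-dependent, not uniform
logits_prime = torch.randn(1, num_classes)        # stand-in for model(x')
cskd_target = F.softmax(logits_prime / T, dim=1)
```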
6. Our Contributions
• We demonstrate the effectiveness of CS-KD in four respects:
• Improving the generalization ability
• Reducing the intra-class variations
• Relaxing the overconfident predictions
• Enhancing model calibration [3] (see the ECE sketch after this slide)

[3] Predicting good probabilities with supervised learning. In ICML, 2005.
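A minimal sketch of expected calibration error (ECE), the standard metric for the calibration claim above; the choice of 15 equal-width confidence bins is an assumption, as the bin count varies across papers.

```python
import torch

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """ECE = sum_b (|B_b| / N) * |acc(B_b) - conf(B_b)| over confidence bins."""
    ece = torch.zeros(1)
    bin_edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = (predictions[in_bin] == labels[in_bin]).float().mean()
            conf = confidences[in_bin].mean()
            ece += in_bin.float().mean() * (acc - conf).abs()
    return ece.item()

# Usage on random stand-in data
confidences, predictions = torch.max(torch.softmax(torch.randn(100, 10), dim=1), dim=1)
labels = torch.randint(0, 10, (100,))
print(expected_calibration_error(confidences, predictions, labels))
```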
7. Conclusion
• The proposed CS-KD is arguably the simplest way to achieve two goals via a single mechanism:
• Preventing overconfident predictions (measured by the expected calibration error)
• Reducing the intra-class variations (measured by the generalization error)
• We believe the proposed method could find broader use in other applications.

Thank you for your attention