[ICCV 21] Influence-Balanced Loss for Imbalanced Visual Classification
Influence-Balanced Loss for
Imbalanced Visual Classification
Seulki Park, Jongin Lim, Younghan Jeon, Jin Young Choi
ASRI, Dept. of Electrical and Computer Engineering, Seoul National University
2. Imbalanced Visual Classification
Many real-world datasets exhibit a long-tailed distribution.
- A model trained on such imbalanced data tends to overfit the majority classes.
- That is, the model performs poorly on the minority classes.
Problem definition:
- Input: long-tailed (imbalanced) training data and uniformly distributed (balanced) test data.
- Goal: a robust model that generalizes well to the balanced test data.
Influence-Balanced Loss for Imbalanced Visual Classification (ICCV 2021)
[Figure: examples of long-tailed data — Faces (Zhang et al., 2017), Places (Wang et al., 2017), Species (Van Horn et al., 2018), Actions (Zhang et al., 2019). Images by authors.]
3. Previous Methods
1. Data-level approach
- Directly balances the training data distribution by re-sampling or by generating synthetic samples (Chawla '02, Mullick '19, Hulse '07).
- Under-sampling the majority classes can lose valuable information.
- Over-sampling or data generation is susceptible to overfitting to certain repetitive samples.
2. Meta-learning approach
- Recent meta-learning methods have shown promising results (Shu '19, Liu '20, Ren '20).
- However, these methods are difficult to apply in practice:
  - additional unbiased data are required (Shu '19), and the meta-sampler is computationally expensive (Ren '20).
3. Re-weighting approach
- Assigns a different weight to each sample according to its importance.
- However, existing methods focus only on the global class-level distribution and assign the same weight to all samples belonging to the same class (CB Loss [Cui '19], LDAM [Cao '19]).
→ "Not all samples in a dataset play an equal role in determining the model parameters." (Cook, 1982)
4. Motivation
Q. How can we appropriately re-weight each training sample while preventing the model from overfitting to the majority classes?
- Focal Loss [Lin '17] assigns larger weights to hard examples, but most hard examples come to belong to the majority classes as training progresses.
A. Let's re-weight samples by their direct influence on the overfitted model!
→ To measure the influence of a sample, we build on the influence function (Cook, 1982).
5. Motivation
Key observation:
- The influence of the majority classes is much greater than that of the minority classes!
Key idea:
- Down-weight the samples that have a large influence on the overfitted decision boundary, yielding a smoother decision boundary!
[Figure: (a) comparison of influences between a balanced and an imbalanced dataset; (b) key concept of our approach.]
6. Proposed Method
0. Recap: Influence Functions on DNNs (Koh and Liang, ICML 2017)
Let f(x, \theta) be a model and L(y, f(x, \theta)) the loss for a training point (x, y).
By definition, the influence of (x, y) on the model parameters \theta is given by

    \mathcal{I}(x; \theta) = -H^{-1} \nabla_\theta L(y, f(x, \theta)),    (1)

where H = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 L(y_i, f(x_i, \theta)) is the Hessian of the empirical loss.

1. Influence-balanced (IB) weighting factor
From \mathcal{I}(x; \theta), we design the IB weighting factor as

    \mathrm{IB}(x; \theta) = \| \nabla_\theta L(y, f(x, \theta)) \|_1.    (2)

(The factor H^{-1} is shared by all samples, so the relative influence is captured by the gradient norm alone.)
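To make Eq. (1) concrete, here is a minimal NumPy sketch (our own toy example, not the paper's code): for 1-D least squares the Hessian and per-sample gradients have closed forms, and the influence of a point predicts how the minimizer moves when that point is slightly up-weighted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y_t = 2.0 * x + 0.1 * rng.normal(size=n)

# 1-D least squares: per-sample loss L_i(theta) = (theta * x_i - y_i)^2
theta = (x @ y_t) / (x @ x)              # empirical risk minimizer

H = 2.0 * np.mean(x**2)                  # Hessian of the mean loss
grad = 2.0 * x * (theta * x - y_t)       # per-sample gradients at theta
influence = -grad / H                    # Eq. (1): -H^{-1} grad, per sample

# Sanity check: up-weighting sample i by a fraction eps in the mean loss
# moves the minimizer by approximately eps * influence[i].
i, eps = 0, 1e-4
w = np.ones(n)
w[i] += eps * n                          # extra weight eps * L_i in the mean loss
theta_pert = np.sum(w * x * y_t) / np.sum(w * x * x)
```

The check illustrates why the gradient magnitude governs a sample's pull on the parameters, which motivates Eq. (2).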
7. Proposed Method
2. Influence-balanced (IB) Loss
When using the softmax cross-entropy loss, the IB weighting factor can be further simplified:

    \mathrm{IB}(x; \theta) = \| \nabla_\theta L(y, f(x, \theta)) \|_1 = \| f(x, \theta) - y \|_1 \, \| h \|_1,    (3)

where h is the hidden feature vector.
Finally, the influence-balanced loss is given by

    L_{IB}(y, f(x, \theta)) = \frac{L(y, f(x, \theta))}{\| f(x, \theta) - y \|_1 \, \| h \|_1}.    (4)

→ The proposed influence-balanced term constrains the decision boundary so that it does not overfit to influential majority samples.
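The identity in Eq. (3) can be checked numerically for the final linear layer. A minimal NumPy sketch (our own construction; bias omitted, variable names assumed): the entrywise L1 norm of the cross-entropy gradient (f(x, Ξ) − y) h^T factors into ‖f(x, Ξ) − y‖₁ · ‖h‖₁, which then divides the loss as in Eq. (4).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
K, D = 5, 8                      # number of classes, hidden-feature dimension
W = rng.normal(size=(K, D))      # final-layer weights (bias omitted for simplicity)
h = rng.normal(size=D)           # hidden feature vector of one sample
y = np.eye(K)[2]                 # one-hot label

p = softmax(W @ h)               # model output f(x, theta)
ce_loss = -np.log(p[2])          # softmax cross-entropy

# Closed-form IB weighting factor, Eq. (3): ||f(x,theta) - y||_1 * ||h||_1
ib_factor = np.abs(p - y).sum() * np.abs(h).sum()

# Check against the definition, Eq. (2): dL/dW = (p - y) h^T, entrywise L1 norm
grad_W = np.outer(p - y, h)
assert np.isclose(np.abs(grad_W).sum(), ib_factor)

# IB loss, Eq. (4): the cross-entropy loss divided by the IB factor
ib_loss = ce_loss / ib_factor
```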
8. Proposed Method
3. Influence-balanced (IB) Class-wise Re-weighting
Finally, we add a class-wise re-weighting term \lambda_k to the IB loss in Eq. (4):

    L_{IB}(\theta) = \frac{1}{m} \sum_{(x, y) \in D_m} \lambda_k \, \frac{L(y, f(x, \theta))}{\| f(x, \theta) - y \|_1 \, \| h \|_1},    (5)

where \lambda_k = \alpha \, n_k^{-1} / \sum_{k'=1}^{K} n_{k'}^{-1} and n_k is the number of samples in the k-th class.
→ The class-wise re-weighting can further control the influences depending on the class.
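The class-wise term is a normalized inverse class frequency. A small sketch (function name and example counts are ours): with α = 1 the weights sum to one and the rarest class receives the largest weight.

```python
import numpy as np

def class_weights(counts, alpha=1.0):
    """Class-wise re-weighting term of Eq. (5): lambda_k = alpha * n_k^-1 / sum_k' n_k'^-1."""
    inv = 1.0 / np.asarray(counts, dtype=float)
    return alpha * inv / inv.sum()

counts = [5000, 2000, 500, 50]     # hypothetical per-class sample counts
lam = class_weights(counts)        # largest weight goes to the 50-sample class
```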
9. Proposed Method
4. Influence-balanced Training Scheme
The influence-balanced training process comprises two phases:
1) normal training and 2) fine-tuning for balance.
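The two-phase scheme can be sketched on a toy problem (entirely our own construction: logistic regression on 2-D points, with the raw input standing in for the hidden feature h, whereas the paper fine-tunes a deep network):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced binary problem: 200 majority vs 20 minority samples.
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(2.5, 1.0, (20, 2))])
y = np.hstack([np.zeros(200), np.ones(20)])
counts = np.array([200, 20])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = np.zeros(2), 0.0, 0.1

def step(sample_weights):
    """One gradient step on the weighted cross-entropy loss."""
    global w, b
    p = sigmoid(X @ w + b)
    g = sample_weights * (p - y)
    w -= lr * (X.T @ g) / len(y)
    b -= lr * g.mean()

# Phase 1: normal training with the plain cross-entropy loss.
for _ in range(200):
    step(np.ones(len(y)))

# Phase 2: fine-tune with the IB loss: each sample's loss is divided by
# ||f(x)-y||_1 * ||h||_1 and scaled by the class-wise term lambda_k.
lam = (1.0 / counts) / (1.0 / counts).sum()
for _ in range(200):
    p = sigmoid(X @ w + b)
    ib = np.abs(p - y) * np.abs(X).sum(axis=1) + 1e-3   # IB factor, small eps for stability
    step(lam[y.astype(int)] / ib)
```

The epsilon in the IB factor is our addition for numerical stability on this toy problem, not a detail from the slides.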
10. Experimental Results
1. Datasets
- Synthetic data:
  - CIFAR-10/100 (10/100 classes), Tiny ImageNet (200 classes)
  - Long-tailed imbalance: the number of samples in the k-th class is set to n\mu^k, with \mu \in (0, 1).
  - Step imbalance: the classes are divided into two groups (majority and minority).
- Real-world data:
  - iNaturalist 2018: 437,513 images from 8,142 classes (imbalance factor: 500)
※ Imbalance factor: the ratio between the sizes of the most frequent and least frequent classes.
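The long-tailed counts n·Ό^k can be generated as below; picking ÎŒ from a target imbalance factor is our own convenience, not stated on the slide.

```python
import numpy as np

def long_tailed_counts(n_max, mu, num_classes):
    """Per-class sample counts n * mu**k for the long-tailed setting, mu in (0, 1)."""
    return np.array([int(round(n_max * mu**k)) for k in range(num_classes)])

# e.g. long-tailed CIFAR-10 with 5000 samples in the largest class and
# imbalance factor 100: choose mu so that n_0 / n_{K-1} = mu^-(K-1) = 100.
K = 10
mu = 0.01 ** (1.0 / (K - 1))
counts = long_tailed_counts(5000, mu, K)   # decays from 5000 down to 50
```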
2. Baselines
We compare our method with the following cost-sensitive loss methods:
- CE: standard cross-entropy loss.
- Focal Loss [Lin et al., ICCV '17]: down-weights well-classified samples and up-weights hard samples.
- CB Loss [Cui et al., CVPR '19]: re-weights the loss inversely proportionally to the effective number of samples.
- LDAM [Cao et al., NeurIPS '19]: regularizes the minority classes to have larger margins.
11. Comparison with the State of the Art
Our method achieves state-of-the-art results on the benchmark datasets across various imbalance factors.
[Tables: classification accuracy (%) of ResNet-32 on imbalanced CIFAR-10/100, ResNet-18 on imbalanced Tiny ImageNet, and ResNet-50 on iNaturalist 2018.]
12. Class-wise classification accuracy
The significant performance improvement of our method comes from the minority classes, not from the majority classes!
13. Conclusion
- Discovered that existing loss-based methods can lead the decision boundary of a DNN to eventually overfit to the majority classes.
- Designed a novel influence-balanced loss function that re-weights samples more effectively so as to alleviate overfitting of the decision boundary.
- Experimentally demonstrated that the IB loss improves generalization performance on imbalanced data.
- Our method is easy to implement and to integrate into existing methods.
Contact: seulki.park@snu.ac.kr
Code: https://github.com/pseulki/IB-Loss