The document discusses various pooling operations used in image processing and convolutional neural networks (CNNs). It provides an overview of common pooling methods like max pooling, average pooling, and spatial pyramid pooling. It also discusses more advanced and trainable pooling techniques like stochastic pooling, mixed/gated pooling, fractional pooling, local importance pooling, and global feature guided local pooling. The document analyzes the tradeoffs of different pooling methods and how they can balance preserving details versus achieving invariance to changes in position or lighting. It references several influential papers that analyzed properties of pooling operations.
4. What is Pooling/Subsampling for?
Objective: transform the "joint feature representation" into a new, more usable one
Constraint:
- Discard irrelevant detail
=> Invariance to changes in position/lighting, robustness to clutter, compactness of representation...
A Theoretical Analysis of Feature Pooling in Visual Recognition (2010)
5. This is the point
Constraint
- Discard irrelevant detail
Pros
- Increased receptive field
- Smaller model size
- Noise removal
- Translation invariance
Cons
- Loss of detail
- Lack of (or no) equivariance
6. For the TL;DR
Two axes: Static vs. Learnable, and Universal vs. Particular
- Max Pooling
- Avg Pooling
- Fractional Pooling
- LP Pooling
- Wavelet Pooling
- Softmax Pooling
- Stochastic Pooling
- Blur Pooling
- Orderable Pooling
- Global Average Pooling
- Strided Convolution
- Mixed, Gated, Tree Pooling
- Local Importance Pooling
- Detail-Preserving Pooling
- Global Feature Guided Local Pooling
- Spatial Pyramid Pooling
(A third axis, response Sensitive vs. Insensitive, is on the next slide)
7. For the TL;DR
Response Sensitivity
(some methods use ordering/ranking/attention over the responses)
Sensitive
- Max Pooling
- Global Max Pooling
- Softmax Pooling
- Orderable Pooling
- Mixed, Gated, Tree Pooling
- Local Importance Pooling
- Detail-Preserving Pooling
- Spatial Pyramid Pooling
- Fractional Pooling
- Stochastic Pooling
Insensitive
- Strided Convolution
- Avg Pooling
- Wavelet Pooling
- LP Pooling
- Global Feature Guided Local Pooling
- Blur Pooling
8. TMI before we start
CNN - cat neuron response experiment (1962)
Average Pooling - LeNet (1998): subsampling
Max Pooling - monkey neuron response experiment (1999) -> DBN (2006)
-> Hand gesture recognition with CNN (2010)
9. Main Pooling Papers
Average Pooling
- LeNet's subsampling [2]
Max Pooling
- Closer to a neuron's (at least a monkey's) response than avg [1]
LP Pooling
- Interpolates between average and max pooling: with a hyperparameter we can get avg/weighted/max pooling
Spatial Pyramid Pooling
- Overcomes CNNs' fixed output size (like GAP); diverse receptive fields
Strided Convolution
- Trainable pooling
10. Average Pooling vs Max Pooling
Average Pooling (response insensitive)
- Feature: uses all the information via a sum
- Pros: backpropagates to every response
- Cons: not robust to noise
- Suitable for dense information, e.g. NLP [below 1]
Max Pooling (response sensitive)
- Feature: uses only the highest value
- Pros: robust to noise
- Cons: details are removed
- Suitable for sparse information, e.g. image classification
Let's start with the '90s.
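A minimal toy contrast (PyTorch; the tensor values are illustrative assumptions, not from the slides): max pooling keeps a strong, sparse response that average pooling dilutes.

```python
import torch
import torch.nn.functional as F

# A 4x4 map with two strong, sparse responses amid near-zero noise
x = torch.tensor([[[[0.0, 0.1, 0.0, 0.2],
                    [0.0, 9.0, 0.0, 0.0],
                    [0.3, 0.0, 0.1, 0.0],
                    [0.0, 0.0, 0.0, 8.5]]]])  # shape: N x C x H x W

print(F.max_pool2d(x, 2))  # [[9.0, 0.2], [0.3, 8.5]] - strong responses survive
print(F.avg_pool2d(x, 2))  # [[2.275, 0.05], [0.075, 2.15]] - they get diluted
```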
11. Average Pooling / Max Pooling Balancing
- Transformable (2012)
Convolutional Neural Networks Applied to House Numbers Digit Classification (2012)
LP Pool
- Via the hyperparameter 'p', LP pooling is controllable from avg (p=1) to max (p=inf)
- Experiments on a LeNet base
* SS: single stage, MS: multi-stage (depth)
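A minimal sketch, assuming the usual definition f = (mean of x^p over the window)^(1/p) and non-negative (post-ReLU) inputs:

```python
import torch
import torch.nn.functional as F

def lp_pool2d(x, p, kernel_size):
    """LP pooling: (mean of x^p over each window)^(1/p); assumes x >= 0."""
    return F.avg_pool2d(x.pow(p), kernel_size).pow(1.0 / p)

x = torch.rand(1, 1, 4, 4)
print(torch.allclose(lp_pool2d(x, p=1, kernel_size=2), F.avg_pool2d(x, 2)))  # True
print(lp_pool2d(x, p=64, kernel_size=2))  # already very close to F.max_pool2d(x, 2)
```

(Note that PyTorch's built-in nn.LPPool2d sums rather than averages inside the window, so its p=1 case is sum pooling, not average pooling.)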
12. Average Pooling / Max Pooling Balancing
- Transformable (2017)
DCASE 2017 Submission: Multiple Instance Learning for Sound Event Detection (2017)
Adaptive Pooling Operators for Weakly Labeled Sound Event Detection (2018)
Softmax Pool (Auto Pool)
- Via the hyperparameter 'a', controllable from avg (a=0) through softmax (a=1) to max (a=inf)
* RAP: Restricted Auto Pool (adds a regularizer keeping 'a' near 0); CAP: Constrained Auto Pool (constrains the value of 'a' near 0)
* Strong: time-varying labels / Weak: no time labels
- CNN-based voice-spectrogram classification
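A minimal sketch of the auto-pool operator, a softmax(a*x)-weighted average (the scalar 'a' would be an nn.Parameter when learned; values below are illustrative):

```python
import torch

def auto_pool(x, alpha, dim=-1):
    """Softmax(alpha*x)-weighted average: alpha=0 -> mean, 1 -> softmax, inf -> max."""
    w = torch.softmax(alpha * x, dim=dim)
    return (w * x).sum(dim=dim)

x = torch.tensor([0.1, 0.5, 0.9])
print(auto_pool(x, alpha=0.0))    # tensor(0.5000): plain average
print(auto_pool(x, alpha=1.0))    # between avg and max: softmax pooling
print(auto_pool(x, alpha=100.0))  # tensor(0.9000): effectively max
```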
13. Average Pooling / Max Pooling Balancing
- Hybrid (2015)
Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree (2015)
Mixed/Gated Pool
- Experiments presumably on an AlexNet base (pool 2)
* Baseline: max pool
+ Better results & more robust to translation
- The parameter 'a' mixes the avg and max responses
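A minimal sketch of the mixed variant (the sigmoid squashing of the learned coefficient is my assumption to keep the blend in [0, 1]; the gated variant would instead compute 'a' per window from the responses):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedPool2d(nn.Module):
    """Blend of max and avg pooling with one learned mixing coefficient."""
    def __init__(self, kernel_size=2):
        super().__init__()
        self.kernel_size = kernel_size
        self.mix = nn.Parameter(torch.zeros(1))  # logit of the blend weight

    def forward(self, x):
        a = torch.sigmoid(self.mix)  # a in (0, 1): 1 -> max, 0 -> avg
        return (a * F.max_pool2d(x, self.kernel_size)
                + (1 - a) * F.avg_pool2d(x, self.kernel_size))
```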
14. Average Pooling / Max Pooling Balancing
- Transformable + Hybrid (2019)
Global Feature Guided Local Pooling (2019)
Global Feature Guided Local Pooling
- Softmax pooling, but the parameters are adjusted for each input via GAP-FC!
- With more depth, which of avg/max the pooling activates changes
* Trainable
- lambda = 0, rho = 0 -> Avg
- lambda = inf, rho = 0 -> Max
- lambda = -inf, rho = 0 -> Min
[Figure: response vs. position]
- Pooling activation differs by class (ImageNet)
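A minimal sketch of the mechanism (my simplification: a single GAP -> FC path predicting only the per-channel lambda; the rho term and the paper's exact parameterization are omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GFGPool2d(nn.Module):
    """Global context (GAP -> FC) predicts a per-channel, per-input lambda that
    steers a softmax-weighted local pool: lambda=0 -> avg, lambda->inf -> max,
    lambda->-inf -> min."""
    def __init__(self, channels, kernel_size=2):
        super().__init__()
        self.k = kernel_size
        self.fc = nn.Linear(channels, channels)

    def forward(self, x):
        n, c, _, _ = x.shape
        lam = self.fc(x.mean(dim=(2, 3))).view(n, c, 1, 1)  # per-channel lambda
        w = torch.exp(lam * x)                              # unnormalized softmax weights
        return F.avg_pool2d(x * w, self.k) / F.avg_pool2d(w, self.k)
```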
15. (Max Pooling) Only Response Sensitive!
- Use the Nth max responses (2016-2019)
Rank-based Pooling for Deep Convolutional Neural Networks (2016)
Ordinal Pooling (2019)
Ordinal Pooling
- Weights responses by their rank (value order)
Notable
- At the first pooling layer the learned weights are close to avg; at the last, close to max
* Baseline: modified LeNet with Avg/GAP, on MNIST
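A minimal sketch of ordinal pooling (the softmax normalization of the rank weights is my assumption; the paper's exact constraint may differ). Each window is sorted and combined with learned per-rank weights: uniform weights give avg pooling, near-one-hot weights approach max.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrdinalPool2d(nn.Module):
    def __init__(self, kernel_size=2):
        super().__init__()
        self.k = kernel_size
        # per-rank weight logits; zeros -> uniform weights, i.e. starts as avg pool
        self.rank_w = nn.Parameter(torch.zeros(kernel_size ** 2))

    def forward(self, x):
        n, c, h, w = x.shape  # assumes h, w divisible by the kernel size
        patches = F.unfold(x, self.k, stride=self.k).view(n, c, self.k ** 2, -1)
        ranked, _ = patches.sort(dim=2, descending=True)   # order within each window
        weights = torch.softmax(self.rank_w, dim=0).view(1, 1, -1, 1)
        return (ranked * weights).sum(dim=2).view(n, c, h // self.k, w // self.k)
```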
16. Detail, Please
- Detail reinforcement (2018)
Rapid, Detail-Preserving Image Downscaling (2016)
Detail-Preserving Pooling in Deep Networks (2018)
18. Detail, Please
- Detail reinforcement (2019)
LIP: Local Importance-based Pooling (2019)
Local Importance Pooling
- Calculate per-pixel importance (a gate)
- Reinforce the response by the gate
[Figure: CAM comparison of LIP / AVG / strided variants on a koala image]
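A minimal sketch of the LIP mechanism (the single 1x1 conv as the logit module G is my simplification; the paper explores richer gate networks):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalImportancePool2d(nn.Module):
    """exp(G(x))-weighted average over each window, with G = learned importance logits."""
    def __init__(self, channels, kernel_size=2):
        super().__init__()
        self.k = kernel_size
        self.logit = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        gate = torch.exp(self.logit(x))  # per-pixel importance, always > 0
        return F.avg_pool2d(x * gate, self.k) / F.avg_pool2d(gate, self.k)
```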
19. Hmm… Multiple Poolings
- Diverse bags
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories (2006)
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition (2014)
Spatial Pyramid Pooling
GAP
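A minimal sketch with max pooling at three illustrative pyramid levels (the 1x1 level is just a global pool, cf. the GAP note above):

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(x, levels=(1, 2, 4)):
    """Pool to several fixed grids and concatenate: the output length is the
    same for any input resolution (here C * (1 + 4 + 16) values)."""
    n = x.size(0)
    return torch.cat([F.adaptive_max_pool2d(x, g).view(n, -1) for g in levels], dim=1)

print(spatial_pyramid_pool(torch.rand(1, 8, 13, 9)).shape)   # torch.Size([1, 168])
print(spatial_pyramid_pool(torch.rand(1, 8, 32, 32)).shape)  # torch.Size([1, 168])
```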
20. Training, Training, Training
- NO Pooling! (2014)
Striving for Simplicity: The All Convolutional Net (2014)
Strided Convolution
* BUT! FishNet reports strided conv is worse than max pool
* Strided convs can't approximate max pool (ordinal)
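A minimal sketch of the swap (the 64 channels and 3x3 kernel are illustrative assumptions):

```python
import torch.nn as nn

# Downsampling with a fixed pooling op...
pool = nn.MaxPool2d(kernel_size=2, stride=2)

# ...versus the all-convolutional way: a strided conv, so the network
# learns its own downsampling filter instead.
strided = nn.Conv2d(in_channels=64, out_channels=64,
                    kernel_size=3, stride=2, padding=1)
```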
21. Jackpot plz
- Stochastic (2013)
Stochastic Pooling for Regularization of Deep Convolutional Neural Networks (2013)
Stochastic Pooling
- Train time: pool stochastically (sample a value with probability proportional to its activation)
- Test time: probability-weighted pooling
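A minimal sketch (assumes non-negative post-ReLU activations; the epsilon is my addition so all-zero windows become a valid uniform distribution):

```python
import torch
import torch.nn.functional as F

def stochastic_pool2d(x, k=2, training=True, eps=1e-12):
    """Train: sample one value per window, P(value) ~ its activation.
    Test: deterministic probability-weighted average of each window."""
    n, c, h, w = x.shape
    patches = F.unfold(x, k, stride=k).view(n, c, k * k, -1)   # (N, C, k*k, L)
    probs = (patches + eps) / (patches + eps).sum(dim=2, keepdim=True)
    if training:
        flat = probs.permute(0, 1, 3, 2).reshape(-1, k * k)    # one row per window
        idx = torch.multinomial(flat, 1)                       # sample an index
        vals = patches.permute(0, 1, 3, 2).reshape(-1, k * k)
        out = vals.gather(1, idx).view(n, c, -1)
    else:
        out = (probs * patches).sum(dim=2)
    return out.view(n, c, h // k, w // k)
```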
22. Why only integer division?
- Rebellion of the root (2015)
Fractional Max-Pooling (2015)
Fractional Max-Pooling
- Randomized pooling regions for max pooling, to achieve a root-sized (≈ sqrt 2) downsample
* In = 25, out = 18: sqrt(2) ≈ 1.414 vs. 25/18 ≈ 1.389
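PyTorch ships this op directly; a minimal usage sketch reproducing the slide's 25 -> 18 example:

```python
import torch
import torch.nn as nn

# 2x2 max-pool windows placed with pseudo-random steps of 1 or 2, giving a
# non-integer effective stride (~sqrt(2) here: 25/18 ~ 1.39).
pool = nn.FractionalMaxPool2d(kernel_size=2, output_size=18)
x = torch.rand(1, 1, 25, 25)
print(pool(x).shape)  # torch.Size([1, 1, 18, 18])
```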
23. Pooling has a WEAKNESS
- Avoiding Kryptonite (2019)
Making Convolutional Networks Shift-Invariant Again (2019)
https://richzhang.github.io/antialiased-cnns/
* Data augmentation was removed
Blur Pool
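A minimal sketch of the BlurPool recipe (the 3x3 binomial [1, 2, 1] filter is one of the paper's options): evaluate max pooling densely at stride 1, then low-pass filter before subsampling so the downsample aliases less under small shifts.

```python
import torch
import torch.nn.functional as F

def blur_pool2d(x, stride=2):
    """MaxPool(stride 1) -> fixed binomial blur -> strided subsample, per channel."""
    c = x.size(1)
    b = torch.tensor([1.0, 2.0, 1.0])
    kernel = b[:, None] * b[None, :]                       # 3x3 binomial filter
    kernel = (kernel / kernel.sum()).view(1, 1, 3, 3).repeat(c, 1, 1, 1).to(x)
    x = F.max_pool2d(x, kernel_size=2, stride=1)           # dense (unstrided) max
    return F.conv2d(x, kernel, stride=stride, padding=1, groups=c)  # anti-aliased subsample
```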
24. Pooling has a WEAKNESS
- Failed KING (2017)
CapsuleNet (2017)
* Notable: "Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition" (ICANN 2010)
- Its experiments show overlapping pooling drops accuracy
* By contrast, AlexNet (2012) reports that overlapping pooling helps prevent overfitting
25. Family Tree
Max Pooling / Avg Pooling
Spatial Pyramid Pool
- "I'll look at more varied scales"
LP/Softmax Pool
- "I can go back and forth (avg <-> max)!"
(No lineage)
Stochastic Pooling
- "I'll pick one at random!"
Wavelet Pooling
- "Edges are always important!"
Ordinal Pool
- "Flex: I look at several"
Mixed Pool
- "I'll look at both at once"
Strided Convolution
- "These days it's the learning era"
Rank Pool
- "I'll look at several"
Blur Pool
- "Blurring makes it better!"
Fractional Pooling
- "I'll randomize the kernel size"
Global Pool
- "Forget the rest, keep just one!"
Detail Preserve Pool
- "Where did the detail go?"
Local Importance Pool
- "More trainable"
Robust Attention Pool
- "Make GMP trainable too"
Global Feature Guided Local Pool
- "Adjust avg/max depending on the input" (trainable LP)
26.
27. Pooling Analysis
Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition (2010)
- Overlapping pooling drops accuracy
A Theoretical Analysis of Feature Pooling in Visual Recognition (2010)
- Proves that if responses are sparse, max pooling is better than avg pooling
Pooling Is Neither Necessary nor Sufficient for Appropriate Deformation Stability in CNNs (2018)
Ask the Locals: Multi-Way Local Pooling for Image Recognition (2011)
Signal Recovery from Pooling Representations (2014)
stats385 (the Stanford Theories of Deep Learning course)
Emergence of Invariance and Disentanglement in Deep Representations (2018)
Quantifying Translation-Invariance in Convolutional Neural Networks (2017)
A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction (2017)
Learning to Linearize Under Uncertainty (2015)