Do Wide and Deep Networks Learn the Same Things? Uncovering
How Neural Network Representations Vary with Width and Depth
Hwang seung hyun
Google Research | arXiv preprint
2020.11.04
Contents
01 Introduction
02 Methods and Experiments
03 Conclusion
Depth & Width
Introduction – Background
• Key factor in the success of Deep Neural Nets
→ Scaling models by varying “Depth” and “Width”
• Limited understanding of how varying these
properties affects the model beyond its
performance.
• Investigating this question is critical especially
with continually increasing computing resources.
Introduction / Related Work / Methods and Experiments / Conclusion
Depth & Width
Introduction – Questions
1. How do Depth & Width affect the final learned representations?
2. Do these different model architectures also learn different hidden layer features?
3. Are there discernible differences in the outputs?
Depth & Width
Introduction – Contribution
• Apply CKA (centered kernel alignment) to measure the similarity of the hidden
representations of different NNs, finding that representations in wide or deep models
exhibit a characteristic structure, “Block Structure”.
• Block Structure corresponds to hidden representations having a single principal
component that explains most of the variance in the representation → Possible Pruning
• Block structures are unique to each model, whereas representations outside the block
structure remain similar across different networks.
• Found that wide and deep models make systematically different mistakes at the level of
individual examples. (Wide networks better at scenes, Deep networks better at objects)
Methods and Experiments
Experimental Settings
• Models: Family of ResNets
• Datasets: CIFAR-10, CIFAR-100, ImageNet
• Representational Similarity Measures:
Linear centered kernel alignment (CKA)
→ Compute CKA as a function of average HSIC scores
(computed over k mini-batches) [1]
(Figure label: Num of Channels × 2)
[1] Kornblith, Simon, et al. "Similarity of neural network representations revisited." ICML (2019)
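The similarity measure above can be sketched in NumPy. This is a minimal illustration, not the authors' code: `linear_cka` follows the linear CKA definition from [1], and `minibatch_cka` assumes the per-batch HSIC scores come from the unbiased HSIC estimator of Song et al. (2012). All function names are hypothetical.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between activations x (n, p1) and y (n, p2) for the same n examples."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def unbiased_hsic(k, l):
    """Unbiased HSIC estimate for two (n, n) Gram matrices; requires n > 3."""
    n = k.shape[0]
    k = k - np.diag(np.diag(k))  # zero the diagonals
    l = l - np.diag(np.diag(l))
    ones = np.ones(n)
    term1 = np.trace(k @ l)
    term2 = (ones @ k @ ones) * (ones @ l @ ones) / ((n - 1) * (n - 2))
    term3 = 2.0 / (n - 2) * (ones @ k @ l @ ones)
    return (term1 + term2 - term3) / (n * (n - 3))


def minibatch_cka(x_batches, y_batches):
    """CKA computed from HSIC scores averaged over k mini-batches."""
    hsic_xy, hsic_xx, hsic_yy = [], [], []
    for x, y in zip(x_batches, y_batches):
        k, l = x @ x.T, y @ y.T  # linear-kernel Gram matrices
        hsic_xy.append(unbiased_hsic(k, l))
        hsic_xx.append(unbiased_hsic(k, k))
        hsic_yy.append(unbiased_hsic(l, l))
    return np.mean(hsic_xy) / np.sqrt(np.mean(hsic_xx) * np.mean(hsic_yy))
```

Evaluating either function for every pair of layers yields the layer-by-layer similarity heatmaps discussed on the following slides. Averaging HSIC over mini-batches keeps memory use independent of the dataset size, since only batch-sized Gram matrices are ever formed.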
Methods and Experiments
Emergence of the block structure with increasing width or depth
Yellow square on the heatmap mostly appears in the later layers of the network.
Methods and Experiments
Emergence of the block structure with increasing width or depth
CNN with no residual connections
Block structure varies across random initializations.
Methods and Experiments
Block structure in narrower networks with less data
Block structure in the internal representations arises in models that are heavily
overparameterized relative to the training dataset.
Methods and Experiments
Block structure and the first principal component
Block structure arises from preserving and propagating the first principal component
across its constituent layers.
(Figure panels: Deep Model, Wide Model)
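One way to check whether a layer's representation is dominated by a single principal component is to compute the fraction of variance that component explains. A minimal sketch, assuming activations are stored as an (examples, features) array; the function name is hypothetical.

```python
import numpy as np


def first_pc_variance_fraction(acts):
    """Fraction of total variance explained by the first principal component.

    acts: (n_examples, n_features) array of one layer's activations.
    """
    centered = acts - acts.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional to the
    # variances along the principal components.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances[0] / variances.sum()
```

A value close to 1 for the layers inside a block would be consistent with the slide's claim, while layers outside the block would be expected to spread variance over more components.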
Methods and Experiments
Linear probe accuracy
In models with the block structure, linear probe accuracy shows little improvement
inside the block structure.
Residual connections play an important role in preserving representations in the
block structure.
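A linear probe fits a linear classifier on the frozen activations of one layer and reports its test accuracy. The slides do not say how the probes were trained, so the sketch below makes an assumption for brevity: it uses ridge-regularized least squares onto one-hot labels as a stand-in classifier. The function name and the `l2` parameter are hypothetical.

```python
import numpy as np


def linear_probe_accuracy(train_x, train_y, test_x, test_y, num_classes, l2=1e-3):
    """Test accuracy of a linear classifier fit on frozen layer activations.

    Assumption: ridge regression onto one-hot targets stands in for whatever
    probe classifier was actually used.
    """
    def add_bias(x):
        return np.hstack([x, np.ones((x.shape[0], 1))])

    a = add_bias(train_x)
    targets = np.eye(num_classes)[train_y]  # one-hot labels
    w = np.linalg.solve(a.T @ a + l2 * np.eye(a.shape[1]), a.T @ targets)
    preds = (add_bias(test_x) @ w).argmax(axis=1)
    return float((preds == test_y).mean())
```

Running this on each layer in turn gives an accuracy-versus-depth curve; a flat stretch of that curve is what "little improvement inside the block structure" refers to.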
Methods and Experiments
Effect of deleting blocks on accuracy for models with or without block structure
Block structure could be an indication of redundant modules in model design.
Similarity of its constituent layer representations could be leveraged for model
compression.
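Deleting a block is well defined in a residual network because the shortcut connection still carries the input forward. The toy sketch below illustrates the mechanism only: it assumes each block computes `h + relu(h @ w)` with a square weight matrix, which is a simplification of a real ResNet block, and all names are hypothetical.

```python
import numpy as np


def forward(x, blocks, skip=()):
    """Run x through toy residual blocks, bypassing the block indices in `skip`.

    blocks: list of (d, d) weight matrices; each block computes h + relu(h @ w).
    """
    h = x
    for i, w in enumerate(blocks):
        if i in skip:
            continue  # deleted block: only the identity shortcut remains
        h = h + np.maximum(h @ w, 0.0)
    return h
```

Comparing task accuracy for `forward(x, blocks)` and `forward(x, blocks, skip={i})` across block indices `i` mirrors the deletion experiment described on this slide.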
Methods and Experiments
Per-example performance differences between Wide and Deep models
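Per-example differences can be measured by training several models per architecture and recording, for each test example, the fraction of models that classify it correctly. A minimal sketch, assuming predictions are stacked as a (models, examples) array; the function names are hypothetical.

```python
import numpy as np


def per_example_accuracy(preds, labels):
    """Fraction of models that predict each example correctly.

    preds: (num_models, num_examples) predicted labels; labels: (num_examples,).
    """
    return (preds == labels[None, :]).mean(axis=0)


def per_example_gap(wide_preds, deep_preds, labels):
    """Positive entries mark examples the wide models get right more often."""
    return per_example_accuracy(wide_preds, labels) - per_example_accuracy(deep_preds, labels)
```

Comparing this gap against the gap between two groups of models with the same architecture but different seeds indicates whether the wide-versus-deep differences exceed what initialization noise alone would produce.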
Methods and Experiments
Per-class performance differences between Wide and Deep models
Deep architecture: consumer goods
Wide architecture: scenes
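The class-level comparison can be sketched the same way, by averaging correctness over all models and all examples of each class. This assumes predictions are stacked as a (models, examples) array and that every class appears at least once in the labels; the function name is hypothetical.

```python
import numpy as np


def per_class_accuracy(preds, labels, num_classes):
    """Accuracy per class, averaged over models and over examples of that class.

    preds: (num_models, num_examples) predicted labels; labels: (num_examples,).
    Assumes every class in range(num_classes) occurs in labels.
    """
    correct = preds == labels[None, :]
    return np.array([correct[:, labels == c].mean() for c in range(num_classes)])
```

Subtracting the deep models' per-class accuracy from the wide models' and sorting the result surfaces the classes each architecture family handles better.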
Conclusion
• Studied the effects of width and depth on neural network
representations.
• Emergence of a characteristic “block structure” that reflects the
similarity of a dominant first principal component, propagated across
many network hidden layers.
• While block structure is unique to each model, other learned features
are shared across different initializations and architectures.
• Width and Depth have different effects on network predictions at the
example and class levels.
Conclusion
Future Work
• How does block structure arise through training?
• How can depth and width be controlled to optimize task-specific
model design?
• How should depth and width be adjusted in the medical domain?

How to Troubleshoot Apps for the Modern Connected Worker
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Do wide and deep networks learn the same things? Uncovering how neural network representations vary with width and depth

  • 1. Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth. Presenter: Hwang seung hyun. Paper: Google Research | arXiv preprint. 2020.11.04
  • 2. Contents: 01 Introduction, 02 Methods and Experiments, 03 Conclusion
  • 3. Introduction – Background (Depth & Width)
      • A key factor in the success of deep neural networks is scaling models by varying depth and width.
      • There is limited understanding of how varying these properties affects a model beyond its performance.
      • Investigating this question is critical, especially as computing resources continue to increase.
  • 4. Introduction – Questions (Depth & Width)
      1. How do depth and width affect the final learned representations?
      2. Do these different model architectures also learn different hidden layer features?
      3. Are there discernible differences in the outputs?
  • 5. Introduction – Contribution (Depth & Width)
      • Applies CKA (centered kernel alignment) to measure the similarity of the hidden representations of different neural networks, finding that representations in wide or deep models exhibit a characteristic "block structure".
      • The block structure corresponds to hidden representations having a single principal component that explains most of the variance in the representation, which suggests possible pruning.
      • Block structures are unique to each model, whereas representations outside the block structure remain similar across different networks.
      • Wide and deep models make systematically different mistakes at the level of individual examples (wide networks are better at scenes, deep networks are better at objects).
  • 6. Methods and Experiments – Experimental Settings
      • Models: family of ResNets
      • Datasets: CIFAR-10, CIFAR-100, ImageNet
      • Representational similarity measure: linear centered kernel alignment (CKA) [1], computed as a function of average HSIC scores over k mini-batches
      • [Figure labels: "Num of Channels x 2" (shown twice)]
      • [1] Kornblith, Simon, et al. "Similarity of neural network representations revisited." ICML (2019)
  • 7. Methods and Experiments – Emergence of the block structure with increasing width or depth. The yellow square on the heatmap (the block structure) appears mostly in the later layers of the network.
  • 8. Methods and Experiments – Emergence of the block structure with increasing width or depth (continued). Panel labels: CNN with no residual connections; block structure varies across random initializations.
  • 9. Methods and Experiments – Block structure in narrower networks with less data. Block structure in the internal representations arises in models that are heavily overparameterized relative to the training dataset.
  • 10. Methods and Experiments – Block structure and the first principal component. Block structure arises from preserving and propagating the first principal component across its constituent layers. Panel labels: Deep Model, Wide Model.
  • 11. Methods and Experiments – Linear probe accuracy. In models with the block structure, linear probe accuracy shows little improvement inside the block structure. Residual connections play an important role in preserving representations in the block structure.
  • 12. Methods and Experiments – Effect of deleting blocks on accuracy for models with or without block structure. Block structure could be an indication of redundant modules in model design. The similarity of its constituent layer representations could be leveraged for model compression.
  • 13. Methods and Experiments – Per-example performance differences between wide and deep models
  • 14. Methods and Experiments – Per-class performance differences between wide and deep models. Deep architectures do better on consumer goods; wide architectures do better on scenes.
  • 15. Conclusion
      • Studied the effects of width and depth on neural network representations.
      • Emergence of a characteristic "block structure" that reflects the similarity of a dominant first principal component, propagated across many network hidden layers.
      • While block structure is unique to each model, other learned features are shared across different initializations and architectures.
      • Width and depth have different effects on network predictions at the example and class levels.
  • 16. Conclusion – Future Work
      • How does block structure arise through training?
      • How can depth and width be controlled to optimize task-specific model design?
      • How should depth and width be adjusted in the medical domain?
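
Slide 6 describes linear CKA computed "as a function of average HSIC scores" over k mini-batches. A minimal NumPy sketch of that idea follows. It assumes the unbiased HSIC estimator of Song et al. (2012), which the slides do not spell out, and the function names `unbiased_hsic` and `minibatch_linear_cka` are hypothetical, chosen for illustration only. It is not the authors' implementation.

```python
import numpy as np


def unbiased_hsic(K, L):
    """Unbiased HSIC estimate from two n x n Gram matrices (assumed estimator)."""
    n = K.shape[0]
    if n < 4:
        raise ValueError("the unbiased estimator needs at least 4 examples")
    # Work on copies so the caller's Gram matrices are left untouched.
    K = K.copy()
    L = L.copy()
    np.fill_diagonal(K, 0.0)
    np.fill_diagonal(L, 0.0)
    ones = np.ones(n)
    term1 = np.trace(K @ L)
    term2 = (ones @ K @ ones) * (ones @ L @ ones) / ((n - 1) * (n - 2))
    term3 = 2.0 / (n - 2) * (ones @ K @ L @ ones)
    return (term1 + term2 - term3) / (n * (n - 3))


def minibatch_linear_cka(batches_x, batches_y):
    """Linear CKA from HSIC scores averaged over paired mini-batches.

    batches_x[i] and batches_y[i] hold activations of two layers for the
    same examples, with shapes (n, d1) and (n, d2).
    """
    hsic_xy, hsic_xx, hsic_yy = [], [], []
    for X, Y in zip(batches_x, batches_y):
        K = X @ X.T  # linear-kernel Gram matrix of the first layer
        L = Y @ Y.T  # linear-kernel Gram matrix of the second layer
        hsic_xy.append(unbiased_hsic(K, L))
        hsic_xx.append(unbiased_hsic(K, K))
        hsic_yy.append(unbiased_hsic(L, L))
    # Average each HSIC score over the k mini-batches, then normalize.
    return np.mean(hsic_xy) / np.sqrt(np.mean(hsic_xx) * np.mean(hsic_yy))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_a = [rng.standard_normal((64, 32)) for _ in range(4)]
    layer_b = [rng.standard_normal((64, 16)) for _ in range(4)]
    print("CKA(a, a):", minibatch_linear_cka(layer_a, layer_a))
    print("CKA(a, b):", minibatch_linear_cka(layer_a, layer_b))
```

Evaluating this function for every pair of layers in a network would give a layer-by-layer similarity heatmap of the kind the slides refer to, where a block structure shows up as a large square of high similarity. Averaging HSIC per mini-batch avoids building a Gram matrix over the whole dataset.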