3. Bibliographic Information
• Title: MetaFormer is Actually What You Need for Vision [1]
• Authors: Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan (a research team based in Singapore)
• URL: https://arxiv.org/abs/2111.11418
※ All figures and tables in this document without a stated source are taken from the paper above.
30. References
1. Yu, Weihao, et al. "MetaFormer is actually what you need for vision." arXiv preprint arXiv:2111.11418 (2021).
2. Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." arXiv preprint arXiv:2010.11929 (2020).
3. Tolstikhin, Ilya, et al. "MLP-Mixer: An all-MLP architecture for vision." arXiv preprint arXiv:2105.01601 (2021).
4. Touvron, Hugo, et al. "Training data-efficient image transformers & distillation through attention." International Conference on Machine Learning. PMLR, 2021.
5. Wang, Wenhai, et al. "Pyramid vision transformer: A versatile backbone for dense prediction without convolutions." arXiv preprint arXiv:2102.12122 (2021).
6. Wightman, Ross, Hugo Touvron, and Hervé Jégou. "ResNet strikes back: An improved training procedure in timm." arXiv preprint arXiv:2110.00476 (2021).
7. Touvron, Hugo, et al. "ResMLP: Feedforward networks for image classification with data-efficient training." arXiv preprint arXiv:2105.03404 (2021).
8. Liu, Ze, et al. "Swin transformer: Hierarchical vision transformer using shifted windows." arXiv preprint arXiv:2103.14030 (2021).
9. Liu, Hanxiao, et al. "Pay attention to MLPs." arXiv preprint arXiv:2105.08050 (2021).
10. He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
11. Lin, Tsung-Yi, et al. "Focal loss for dense object detection." Proceedings of the IEEE International Conference on Computer Vision. 2017.
12. He, Kaiming, et al. "Mask R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2017.
13. Kirillov, Alexander, et al. "Panoptic feature pyramid networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
14. Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.