2. Background and Purpose
Information
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swayamdipta, S. (1), Schwartz, R. (2), Lourie, N. (1), Wang, Y. (3), Hajishirzi, H. (3), Smith, N. A. (3), & Choi, Y. (3)
1: Allen Institute for Artificial Intelligence, Seattle
2: The Hebrew University of Jerusalem, Israel
3: Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle
https://arxiv.org/pdf/2009.10795.pdf
https://github.com/allenai/cartography
Allen AI is well known for its open-source software (OSS) activities.
18. Related Work
Related Work 2
◆ Joshi et al. [10]: AL-uncertainty
➢ Focus: uncertainty in SVMs (a margin-based model)
➢ Analysis: utility for active learning
◆ Sener and Savarese [11]: AL-greedyK
➢ Focus: k centers (≈ clusters) within the dataset and the influence of each center on the dataset as a whole
➢ Analysis: subsets that are effective for active learning
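From a similar viewpoint, this line of work is reportedly also related to stabilizing training and improving accuracy in adversarial scenarios (where the data contains errors).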
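The two active-learning baselines named on this slide can be illustrated with a small sketch. It is not the implementation from [10] or [11]. It assumes plain Python sequences of class scores and feature vectors, Euclidean distance, and two hypothetical helper names, `margin_uncertainty` and `k_center_greedy`, that do not come from either paper.

```python
import math
from typing import List, Sequence


def margin_uncertainty(scores: Sequence[float]) -> float:
    """Gap between the two highest class scores of one example.

    Assumption: a smaller gap means the model is less certain,
    so such examples are the ones an uncertainty-based selector prefers.
    """
    best, second = sorted(scores, reverse=True)[:2]
    return best - second


def k_center_greedy(
    points: Sequence[Sequence[float]], k: int, first: int = 0
) -> List[int]:
    """Greedy k-center selection (hypothetical helper).

    Starting from the index `first`, repeatedly add the point that is
    farthest from every center chosen so far.
    """
    selected = [first]
    # Distance from each point to its nearest selected center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < min(k, len(points)):
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [
            min(d, math.dist(p, points[nxt])) for d, p in zip(min_dist, points)
        ]
    return selected


# Toy data, invented for illustration only.
scores = [[0.5, 0.4, 0.1], [0.9, 0.05, 0.05], [0.34, 0.33, 0.33]]
most_uncertain = min(
    range(len(scores)), key=lambda i: margin_uncertainty(scores[i])
)

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.0, 5.0)]
centers = k_center_greedy(points, 3)
```

On this toy data the third score vector has the smallest margin, and the greedy selection skips the near-duplicate point at (0.1, 0.0) in favor of the two distant points. That is the intuition behind choosing a subset that covers the dataset rather than one that only targets uncertain examples.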
19. References
References 1
[1] Kailas Vodrahalli, Ke Li, and Jitendra Malik. 2018. Are all training examples created equal? An empirical study. arXiv:1811.12569.
[2] Dan Hendrycks, Xiaoyuan Liu, Eric Wallace, Adam Dziedzic, Rishabh Krishnan, and Dawn Song. 2020. Pretrained transformers improve out-of-distribution robustness. arXiv:2004.06100.
[3] Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar S. Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke S. Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. arXiv:1907.11692.
[4] Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. 2015. A large annotated corpus for learning natural language inference. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 632–642, Lisbon, Portugal. Association for Computational Linguistics.
[5] Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, and Yejin Choi. 2020. WinoGrande: An adversarial Winograd schema challenge at scale. In AAAI.
20. References
References 2
[6] Wei Hu, Zhiyuan Li, and Dingli Yu. 2020. Simple and effective regularization methods for training on noisily labeled data with generalization guarantee. In ICLR. OpenReview.net.
[7] Chen Xing, Devansh Arpit, Christos Tsirigotis, and Yoshua Bengio. 2018. A walk with SGD.
[8] Mariya Toneva, Alessandro Sordoni, Remi Tachet des Combes, Adam Trischler, Yoshua Bengio, and Geoffrey J. Gordon. 2018. An empirical study of example forgetting during deep neural network learning. In ICLR.
[9] Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, and Yejin Choi. 2020. Adversarial filters of dataset biases. In ICML.
[10] Ajay J. Joshi, Fatih Porikli, and Nikolaos Papanikolopoulos. 2009. Multi-class active learning for image classification. In CVPR, pages 2372–2379. IEEE.
[11] Ozan Sener and Silvio Savarese. 2018. Active learning for convolutional neural networks: A core-set approach. In ICLR.