Multimodal intelligence integrating diverse data with deep learning
[Diagram labels: Computer Vision, Natural Language Processing, Multimodal, Machine Learning, Deep Learning]
Robust/few-shot learning theory: ACCV’18, ICIP’19
ConvNet architectures: BMVC’13, ICIP’19
Fine-grained recognition: ICME’13, CLEF’13
Large-scale image tagging: CVPR’10, ECCV’10, ICPR’16
Medical image analysis: ISBI’18, CIKM’19, Neurocomputing’19
Visual aesthetics analysis: ACMMM’18
Scene text erasing: WACV’20
Visual relationship detection: ISM’17, ICIP’20
Word representation learning: IJCNLP’17, ICLR’18
Machine translation: MT’17, ACL’18, ACL’19, AAAI’20
Cross-lingual retrieval: EMNLP’15
Image/video caption generation: COLING’16, LREC’18 (example caption: “a cat is trying to eat the food”)
Visual phrase grounding: LREC’20
Multimodal machine translation: CICLING’19
Unsupervised discourse parsing: SIGDIAL’18, TACL’20
Neural input method: NAACL’19
• Discriminative initialization of Convolutional Neural Networks (CNNs) [BMVC’13]
◦ Closed-form initialization using Fisher Discriminant Analysis
• Frequency-domain CNNs [ICIP’19, MMAsia’19]
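As a rough illustration of what a closed-form, discriminative initialization can look like, the sketch below computes a two-class Fisher discriminant direction, w = S_w^{-1}(mu_b - mu_a), for 2-D features in plain Python. It is not the BMVC’13 procedure itself: the function name `fisher_direction`, the two-class restriction, and the toy data in the usage note are assumptions made for illustration.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant direction for 2-D points (illustrative only)."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, mu):
        # Sum of outer products of deviations from the class mean: (sxx, sxy, syy).
        sxx = sum((p[0] - mu[0]) ** 2 for p in points)
        sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
        syy = sum((p[1] - mu[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mu_a, mu_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    # Within-class scatter matrix S_w = S_a + S_b (symmetric 2x2).
    sxx, sxy, syy = sa[0] + sb[0], sa[1] + sb[1], sa[2] + sb[2]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter is singular")
    dx, dy = mu_b[0] - mu_a[0], mu_b[1] - mu_a[1]
    # w = S_w^{-1} (mu_b - mu_a), using the closed-form inverse of a 2x2 matrix.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    return wx, wy
```

For two square clusters separated along the x-axis, for example `[(0, 0), (1, 0), (0, 1), (1, 1)]` and `[(4, 0), (5, 0), (4, 1), (5, 1)]`, the returned direction points along x. No iterative training is involved, which is the sense in which such an initialization is "closed-form".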
• Control the number of output words using a recurrent neural network
Jin et al., Annotation Order Matters: Recurrent Image Annotator for Arbitrary Length Image Tagging, In Proc. ICPR, 2016.
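The idea of letting a recurrent model decide how many tags to emit can be sketched as a decoding loop that runs until the model predicts a stop token. The `step` callable, the `<EOS>` stop symbol, the `max_tags` cap, and the toy step function are hypothetical stand-ins; the actual model of Jin et al. is not reproduced here.

```python
def decode_tags(step, initial_state, stop_token="<EOS>", max_tags=20):
    """Greedily emit tags until the recurrent step predicts the stop token."""
    tags, state, prev = [], initial_state, None
    for _ in range(max_tags):
        # `step` maps (state, previous tag) to (tag scores, next state).
        scores, state = step(state, prev)
        prev = max(scores, key=scores.get)  # highest-scoring tag
        if prev == stop_token:
            break
        tags.append(prev)
    return tags


def toy_step(state, prev_tag):
    """Toy stand-in for an RNN cell: walks through a fixed script of tags."""
    script = ["sky", "sea", "<EOS>"]
    scores = {tag: 0.0 for tag in ["sky", "sea", "boat", "<EOS>"]}
    scores[script[min(state, len(script) - 1)]] = 1.0
    return scores, state + 1
```

Because the stop token is just another output of the recurrent step, the number of tags is decided per image rather than fixed in advance; `max_tags` only guards against a model that never stops.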
• Car type identification
• Plant species identification
◦ ImageCLEF 2013 Plant Identification Challenge (1st place)
• Character recognition
◦ ICDAR Script Identification Challenge (3rd place)
[Figure: example car classes: Acura RL, Mitsubishi Lancer, Toyota Camry, Audi S4, Honda Accord, Mercedes-Benz C-Class]
• Stochastically switch between the categorical cross-entropy loss (CCE) and the mean absolute error loss (MAE)
Hataya et al., LOL: Learning to Optimize Loss Switching Under Label Noise, 2018.
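A minimal sketch of stochastic loss switching for a single example is given below. How the switching probability is chosen or learned in Hataya et al. is not shown on the slide, so the fixed probability `p_mae` and all function names here are assumptions for illustration.

```python
import math
import random


def cce_loss(probs, label):
    """Categorical cross-entropy for one example: -log p[label]."""
    return -math.log(max(probs[label], 1e-12))


def mae_loss(probs, label):
    """Mean absolute error against the one-hot target, averaged over classes."""
    errors = [abs(p - (1.0 if i == label else 0.0)) for i, p in enumerate(probs)]
    return sum(errors) / len(errors)


def switched_loss(probs, label, p_mae, rng=random):
    """Use MAE with probability p_mae, otherwise CCE (p_mae is a hypothetical knob)."""
    if rng.random() < p_mae:
        return mae_loss(probs, label)
    return cce_loss(probs, label)
```

With `p_mae=0.0` the function always returns the cross-entropy and with `p_mae=1.0` always the MAE; values in between mix the two losses across training steps.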
• Some “easy” examples are classified correctly from the very beginning of training
• “Hard” examples matter more
Kishida et al., Empirical Study of Easy and Hard Examples in CNN Training, ICONIP 2019.
• Co-segmentation: extract the objects common to multiple images
Chen et al., Semantic Aware Attention Based Deep Object Co-segmentation, In Proc. ACCV, 2018.
Han et al., "Learning More with Less: Conditional PGGAN-based Data Augmentation for Brain Metastases Detection Using Highly-Rough Annotation on MR Images", In Proc. of CIKM, 2019.
Han et al., "Combining Noise-to-Image and Image-to-Image GANs: Brain MR Image Augmentation for Tumor Detection", IEEE Access, Vol. 7, pp. 156966-156977, 2019.
• Erasing text in general images [WACV’20]
• Erasing general objects [Lazarski, 2018]
https://www.youtube.com/watch?v=JvTvyOeAGbU
• Diversification of decoding [ACL’19]
• Resource-efficient MT
◦ Compression of word vectors (99% size reduction) [ICLR’18]
◦ Rapid decoding [ACL’18, AAAI’20]
[Figure labels: Input, Beam Search, Proposed, Syntactic Diversity]
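For reference, the beam search baseline named in the figure can be sketched as follows. This is standard beam search, not the proposed diversification method from the ACL’19 paper; the `next_log_probs` callable and the toy next-token table are hypothetical.

```python
import math


def beam_search(next_log_probs, beam_size, max_len, bos="<BOS>", eos="<EOS>"):
    """Return up to `beam_size` finished (sequence, log-probability) pairs, best first."""
    beams = [([bos], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            # Expand every live hypothesis by every possible next token.
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep the top candidates; those ending in <EOS> leave the beam.
        for seq, score in candidates[:beam_size]:
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


def toy_model(seq):
    """Hypothetical next-token distribution that depends only on the last token."""
    table = {
        "<BOS>": {"a": 0.6, "the": 0.4},
        "a": {"cat": 0.5, "dog": 0.5},
        "the": {"cat": 0.9, "dog": 0.1},
        "cat": {"<EOS>": 1.0},
        "dog": {"<EOS>": 1.0},
    }
    return {tok: math.log(p) for tok, p in table[seq[-1]].items()}
```

On this toy table, `beam_search(toy_model, beam_size=2, max_len=4)` ranks "the cat" first, while `beam_size=1` (greedy decoding) commits to "a" and ends with "a cat". The hypotheses kept in a beam tend to differ only slightly from each other, which is the kind of limitation that decoding diversification targets.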
• Obtain syntactic word features
[Figure labels: Permutation Matrix → update steps]
• Incorporate Quantum Walk for graph representation learning
“Translation” from video to text!
Input (frame sequence) → Output (word sequence)
Example captions: “a woman is slicing some vegetables”, “a cat is trying to eat the food”, “a dog is swimming in the pool”
[Figure labels: context vector; <BOS> a woman is cooking in the kitchen <EOS>]
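The slide labels a "context vector" between the frame sequence and the word sequence but does not say how it is computed. One common choice, assumed here purely for illustration, is dot-product attention: a softmax-weighted average of per-frame features, with weights given by their similarity to the current decoder state. The function and argument names are hypothetical.

```python
import math


def context_vector(frame_features, decoder_state):
    """Softmax-weighted average of frame features (dot-product attention)."""
    scores = [
        sum(f * s for f, s in zip(frame, decoder_state))
        for frame in frame_features
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return context, weights
```

With an all-zero decoder state every frame gets the same weight, so the context is the plain average of the frames; a state aligned with one frame's features shifts almost all of the weight onto that frame.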
• Multimodal machine translation [CICLING’19]
◦ Improve translation with the help of vision
• Phrase localization [LREC’20]
◦ Identify the image region for a given phrase
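• Presentations at prestigious conferences/journals
◦ ACL, AAAI, WACV, TACL, ICIP (2020)
◦ ACL, CIKM, Neurocomputing, 3DV, ICIP ×3 (2019)
◦ ACL, ICLR, ACCV, SIGDIAL, LREC (2018)
◦ IJCNLP, ICDAR (2017)
• Awards
◦ Annual Meeting of the Association for Natural Language Processing: Excellent Paper Award, Young Researcher Encouragement Award (2020)
◦ CVIM Research Group Encouragement Award (2020)
◦ Dean's Award, Graduate School of Information Science and Technology (2020)
◦ Meeting on Image Recognition and Understanding (MIRU): Student Encouragement Award (2019)
◦ IEICE Medical Imaging Technical Committee Encouragement Award (2019)
◦ Annual Meeting of the Association for Natural Language Processing: Best Paper Award (2018)
◦ NMT@ACL Outstanding Paper Award (2017)
◦ Annual Conference of the Japanese Society for Artificial Intelligence: Excellence Award, Student Encouragement Award ×2 (2017)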
• Faculty: 1 (Nakayama)
• PhD students: 10
• Master’s students: 12 (4 to 5 per year)
• Secretary: 1
• Monday: Group meeting (2 to 3 hours)
◦ Short progress report by everyone, discussion, study session
◦ Mainly organized by PhD students
• Wednesday: Main meeting (2 to 3 hours)
◦ Progress reports (3 to 4 students)
◦ Presentation practice, etc.
• Others
◦ One-on-one meetings
◦ Project meetings
• No other mandatory hours
• Workstation (2 GPUs) for each student
• Shared machines
◦ 4 GPUs × 4
◦ 8 GPUs × 2
• Cloud computers
◦ University cloud system
◦ ABCI

FY2020 Nakayama Lab, The University of Tokyo: Lab Introduction
