Overview
1. Use Case at deecoob
2. Current mono-lingual approaches
3. International Use Case
4. Transformer Models
5. Multi-language capabilities
1. Use Case at deecoob
2. Current mono-lingual approaches
● pre-labeled text corpus
● TF-IDF as text features
● Naïve Bayes as classifier (pipeline sketch below)
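As a concrete reference point, here is a minimal sketch of such a mono-lingual pipeline with scikit-learn; the tiny corpus, labels, and parameters are illustrative placeholders, not deecoob's production setup.

```python
# Minimal mono-lingual pipeline: TF-IDF features + Naïve Bayes classifier.
# Corpus and labels below are placeholders for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "Live concert with the band on Saturday night",
    "Quarterly financial report and shareholder meeting",
    "DJ set and open-air festival in the city park",
    "New tax regulations announced by the ministry",
]
labels = ["event", "other", "event", "other"]  # pre-labeled corpus

clf = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),  # TF-IDF text features
    MultinomialNB(),                                      # Naïve Bayes classifier
)
clf.fit(texts, labels)

print(clf.predict(["Open stage and live music tonight"]))  # e.g. ['event']
```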
3. International Use Case
● Music event detection world-wide
○ different languages
○ texts containing more than one language
○ different character sets (umlauts, accents, Cyrillic, Hebrew; see the sketch below)
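Not from the slides, but a small illustration of why the character-set bullet matters: ASCII folding, a common mono-lingual normalization step, tames umlauts and accents yet silently deletes Cyrillic and Hebrew text.

```python
# Why character sets complicate a single bag-of-words pipeline:
# naive ASCII folding works for umlauts/accents but erases other scripts.
import unicodedata

def ascii_fold(text: str) -> str:
    """Strip diacritics via NFKD, then drop anything outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(ascii_fold("Münchner Straßenfest"))  # 'Munchner Straenfest' (ü folded, ß lost)
print(ascii_fold("fête de la musique"))    # 'fete de la musique'
print(ascii_fold("концерт"))               # '' -- Cyrillic vanishes entirely
print(ascii_fold("הופעה חיה"))             # '' -- Hebrew vanishes entirely
```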
3. International Use Case: Naïve Approach
3. International Use Case: BERT Approach
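A hedged sketch of what a BERT-based classifier could look like with the Hugging Face transformers library; the checkpoint name matches the multilingual model described later in the deck, while the two-label event/other setup is an assumption, and the classification head must still be fine-tuned on labeled data before the predictions mean anything.

```python
# Sketch of the BERT approach: one multilingual model handles text in any
# of its training languages. Label set ("event"/"other") is an assumption.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2  # assumed: event vs. other
)

texts = ["Konzert am Samstagabend", "концерт в субботу", "concert samedi soir"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**batch).logits    # head is freshly initialized: fine-tune first
print(logits.argmax(dim=-1))          # per-text class index after fine-tuning
```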
4. Transformer Models
● use deep neural network encoder-decoder
○ can process multiple languages at once
○ learns language-independent model
● input entire sequence at once
● make heavy use of attention (sketch below)
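To make "input entire sequence at once" concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation the bullets refer to; every token attends to every other token in one matrix operation, and all weights and inputs here are random placeholders.

```python
# Minimal scaled dot-product self-attention (the core Transformer operation).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # all-pairs token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                        # attention-weighted mixture per token

rng = np.random.default_rng(0)
seq_len, d = 5, 8                             # 5 tokens, 8-dim embeddings (toy sizes)
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (5, 8): one output per token
```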
4. Transformer - Google BERT
Encoder-Decoder Stack
Figure: http://jalammar.github.io/illustrated-transformer/
4. Transformer - Google BERT
model token dependencies through multi-head self-attention (sketch below)
Figure: https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/
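A minimal sketch of multi-head self-attention using PyTorch's built-in module; the 768-dim hidden size and 12 heads mirror the BERT-Base numbers on the next slide, and the random input stands in for token embeddings.

```python
# Multi-head self-attention in one call: each head learns its own pattern of
# token dependencies. Sizes mirror BERT-Base; the input here is random.
import torch

attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
tokens = torch.randn(1, 10, 768)              # (batch, seq_len, hidden)
out, weights = attn(tokens, tokens, tokens)   # self-attention: query = key = value
print(out.shape, weights.shape)               # (1, 10, 768) and (1, 10, 10)
```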
4. Transformer - Google BERT
Step 1: training on masked words
● randomly mask 15% of words (toy sketch below)
○ masked words "do not see themselves" in training
● trained on a Wikipedia corpus covering 104 languages; 12 layers, 768 hidden units, 12 attention heads, 110M parameters
https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
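A toy sketch of the masking idea; BERT's actual recipe is slightly more involved (of the selected 15%, most positions become [MASK] while some are replaced by random tokens or left unchanged).

```python
# Hide ~15% of tokens so the model must predict them from both left and
# right context ("words do not see themselves" during training).
import random

MASK, RATE = "[MASK]", 0.15

def mask_tokens(tokens, rate=RATE, seed=None):
    rng = random.Random(seed)
    return [MASK if rng.random() < rate else tok for tok in tokens]

sentence = "the band plays a live concert in the park tonight".split()
print(mask_tokens(sentence, seed=3))  # some words replaced by [MASK]
```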
4. Transformer - Google BERT
Step 2: two-sentence training (next-sentence prediction; sketch below)
https://www.kdnuggets.com/2018/12/bert-sota-nlp-model-explained.html
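A small sketch of how a sentence pair is packed for this step, using the Hugging Face tokenizer for the multilingual checkpoint; the example sentences are made up. The model is then trained to predict whether sentence B really follows sentence A.

```python
# Two-sentence training input: [CLS] sentence A [SEP] sentence B [SEP],
# with token_type_ids marking which segment each token belongs to.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
pair = tok("The band enters the stage.", "The crowd starts to cheer.")

print(tok.decode(pair["input_ids"]))
# -> [CLS] The band enters the stage. [SEP] The crowd starts to cheer. [SEP]
print(pair["token_type_ids"])  # 0s mark sentence A, 1s mark sentence B
```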
5. Multi-language capabilities
Results
Data - Information - Insight
deecoob Technology GmbH
+49 (0) 351 410 470
www.deecoob.com
info@deecoob.com