26 Sep 2019
Slides of the paper "Towards a Higher Accuracy of Optical Character Recognition of Chinese Rare Books in Making Use of Text Model" by Hsiang-An Wang and Pin-Ting Liu, presented at the 3rd edition of the DATeCH 2019 International Conference.


- 1. Towards a Higher Accuracy of Optical Character Recognition of Chinese Rare Books in Making Use of Text Model. Hsiang-An Wang, Academia Sinica Center for Digital Cultures
- 2. Ink Bleed and Poor Quality
- 3. Limitation (Missing and Extra Words): side-by-side comparison of OCR output and the original text
- 4. Experiment: Data Collection • Training dataset: 187 ancient medicine books from the Scripta Sinica Database (about 40 million words) • Testing dataset: 1 relevant ancient medicine book named “ ” with a total of 185,000 words • The OCR results contain about 180,000 correct words and about 5,000 incorrect words, i.e. an accuracy of about 97.3%
- 5. Experiment: Building an N-gram Model • Built on word sequences in the training dataset; the most frequent continuation is chosen as the output • " " – 2-gram: input to predict " " – 3-gram: input to predict " " – 4-gram: input to predict " " – ...
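A frequency-based N-gram predictor of this kind can be sketched in a few lines of Python. The toy corpus, context lengths, and function names below are illustrative, not from the paper, which trains on the 40-million-word Scripta Sinica corpus:

```python
from collections import Counter, defaultdict

def train_ngram(text, n):
    """Count how often each (n-1)-character context is followed by each character."""
    model = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        model[context][nxt] += 1
    return model

def predict(model, context):
    """Return the most frequent continuation of the context, or None if unseen."""
    if context not in model:
        return None
    return model[context].most_common(1)[0][0]

# Tiny English stand-in for the ancient-Chinese training corpus.
corpus = "the cat sat on the mat. the cat ate the rat."
trigram = train_ngram(corpus, 3)   # 3-gram: 2 characters of context
print(predict(trigram, "th"))      # most frequent character after "th"
```

A 2-gram, 3-gram, or 4-gram model differs only in the length of the context window, which is why the slides can sweep the order N and compare correction rates.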
- 6. Experiment: Building a Backward and Forward N-gram Model • Built on both backward and forward word sequences in the training dataset; the most frequent output is chosen • Since the backward and forward N-grams form two separate models, a correction can be made when both predict the same word • " " – Backward 4-gram: input to predict " " – Forward 4-gram: input to predict " "
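The two-directional lookup can be sketched by training one frequency table on the text and one on its reversal, and only emitting a prediction when the two agree. This is a minimal sketch under that reading of the slide; the corpus and names are illustrative:

```python
from collections import Counter, defaultdict

def train(text, n, backward=False):
    """Frequency table mapping an (n-1)-character context to the next character;
    with backward=True the text is reversed, so 'next' means 'preceding'."""
    seq = text[::-1] if backward else text
    model = defaultdict(Counter)
    for i in range(len(seq) - n + 1):
        model[seq[i:i + n - 1]][seq[i + n - 1]] += 1
    return model

def top(model, context):
    return model[context].most_common(1)[0][0] if context in model else None

def agree(fwd, bwd, left, right):
    """Predict a character only when the forward and backward models give the same answer."""
    f = top(fwd, left)            # context preceding the doubtful character
    b = top(bwd, right[::-1])     # context following it, reversed for the backward model
    return f if f is not None and f == b else None

corpus = "the cat sat on the mat. the cat ate the rat."
fwd = train(corpus, 4)                 # forward 4-gram: 3 characters of left context
bwd = train(corpus, 4, backward=True)  # backward 4-gram: 3 characters of right context
print(agree(fwd, bwd, "he ", "at "))
```

Requiring agreement between the two directions is what drives the very low wrong-modification rate the later slides report for the BF 4-gram model.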
- 7. Experiment: Building an LSTM Model • Used Word2vec to project the text into a 200-dimensional vector space • Used an LSTM with three neural network layers • Picked the word with the highest softmax score as the prediction • " " – LSTM 2-gram: input to predict " " – LSTM 3-gram: input to predict " " – LSTM 4-gram: input to predict " "
- 8. Correction Rate of the N-gram Model • The 7-gram achieves the best correction rate
- 9. Correction Rate of the Backward and Forward N-gram Model • The Backward and Forward 4-gram achieves the best correction rate
- 10. Correction Rate of the LSTM Model • The LSTM 6-gram achieves the best correction rate
- 11. Comparison of 7-gram, LSTM 6-gram and BF 4-gram Text Models

  | Model | Correct OCR result changed to wrong | Incorrect OCR result changed to right | Accuracy (OCR + text model) |
  | --- | --- | --- | --- |
  | OCR | X | X | 97.30% |
  | 7-gram | 0.35% | 13.06% | 97.49% |
  | LSTM 6-gram | 0.1% | 7.33% | 97.5% |
  | BF 4-gram | 0.08% | 9.54% | 97.57% |

  • The Backward and Forward 4-gram performs best, with the lowest rate of wrong modifications and the highest resulting accuracy
- 12. Three Text Models with OCR Top-5 Candidate Words • The OCR software we use is a convolutional neural network model that computes classification probabilities through a softmax function • When the probability of the OCR Top-1 candidate is lower than 95%, the word is flagged as possibly wrong and the mixed model is applied • Among the OCR Top-5 candidate words, the one with the highest text-model score is picked
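The mixing rule on this slide can be sketched as follows. The threshold matches the 95% on the slide; the candidate words, scores, and function names are hypothetical, and the text-model scores would come from one of the three models above:

```python
def correct_word(ocr_candidates, text_model_scores, threshold=0.95):
    """ocr_candidates: list of (word, probability) sorted by OCR confidence, Top-1 first.
    text_model_scores: word -> score from the text model (n-gram frequency or LSTM softmax).
    Keep the OCR Top-1 word when the OCR is confident; otherwise pick the
    candidate the text model scores highest."""
    top1_word, top1_prob = ocr_candidates[0]
    if top1_prob >= threshold:        # OCR is confident: leave the word unchanged
        return top1_word
    # Re-rank the OCR Top-5 by the text model; unseen words score 0.
    best = max(ocr_candidates, key=lambda wp: text_model_scores.get(wp[0], 0.0))
    return best[0]

# Hypothetical case: OCR is unsure, the text model prefers the 2nd candidate.
candidates = [("日", 0.60), ("曰", 0.25), ("目", 0.10), ("白", 0.03), ("旦", 0.02)]
scores = {"曰": 0.8, "日": 0.3}
print(correct_word(candidates, scores))
```

Restricting the text model to the OCR's own Top-5 list is what keeps the wrong-modification rates in the next table so low: the language model can only choose among words the OCR already considers visually plausible.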
- 13. Comparison of the Three Text Models Mixed with the OCR Probability

  | Model | Correct OCR result changed to wrong | Incorrect OCR result changed to right | Accuracy (OCR + text model) |
  | --- | --- | --- | --- |
  | OCR | X | X | 97.30% |
  | 7-gram | 0.012% | 9% | 97.63% |
  | LSTM 6-gram | 0.13% | 16% | 97.71% |
  | BF 4-gram | 0.009% | 5.92% | 97.55% |

  • The LSTM 6-gram mixed with the OCR probability performs best
- 14. Conclusion: Using Text Models • An N-gram, backward and forward N-gram, or LSTM N-gram text model can increase the accuracy of OCR • The Backward and Forward 4-gram model has the lowest rate of wrong modifications and the highest rate of correct ones
- 15. Conclusion: Mixing Text Models with the OCR Probability • Combining the OCR Top-5 candidate words and the Top-1 probability with a text model achieves better results than using a text model alone • Mixing the LSTM 6-gram with the OCR probability gives the highest accuracy
- 16. Thank you for listening