PurePos – an open source morphological disambiguator

György Orosz, Attila Novák
{oroszgy, novak.attila}@itk.ppke.hu

Pázmány Péter Catholic University, Faculty of Information Technology
MTA-PPKE Language Technology Research Group


This work was partially supported by TÁMOP: 4.2.2/B – 10/1–2010–0014
Outline

  PurePos
    – Full morphological disambiguation (tag + lemma)
    – Integrated morphological analyzer (MA)




1) Need for a tagger with an integrated MA
2) Implementation, Contribution
3) Evaluation
Problems with agglutinating languages
• Small word coverage of the corpus
• Even 1000+ possible forms of a word
• Possibly huge tagset
  – absent tags
  – absent tag sequences
• Standalone lemmatization is not a good solution
Less-resourced languages
• Morphologically complex
• Lack of annotated corpora



Building an annotated corpus:
  1) Manually disambiguate/correct
  2) Train the tagger
  3) Tag some text
Web service scenario
• Need for a high-precision tagging tool
• Noisy and unseen data
• Incremental training
What do we need?
• Full morphological disambiguation
    – Including lemmatization
• Integrated morphological analyzer
• Incremental training
• Unicode support
• Fast to train
• Open source
• Easy to use
Where to start?
• From scratch?
• Modifying an existing tool?
  – TriTagger
  – IceMorphy
  – Apertium tagger
  – HunPos
  – OpenNLP
  – ...
HunPos
Pros:
  – Trigram tagger (TnT)
  – Beam search
  – Clever tricks
  – Contains a suffix guesser
  – Employing a morphological table
  – Fast to train and decode
Cons:
  – Only POS tagging (no lemmatization)
  – Implemented in OCaml
  – No support for Unicode
  – No real MA
Using the analyzer



• Reducing the search space
• Generating lemma candidates
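
The two roles of the analyzer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the tag labels, and the function names are hypothetical and are not taken from PurePos itself.

```python
# Hypothetical toy analyzer: maps a word form to its (lemma, tag) analyses.
# The entries and tag labels are invented for this example.
TOY_ANALYSES = {
    "házak": [("ház", "N.PL.NOM")],
    "vár": [("vár", "N.SG.NOM"), ("vár", "V.PRS.3SG")],
}

# Hypothetical full tagset the tagger would otherwise have to consider.
ALL_TAGS = ["N.SG.NOM", "N.PL.NOM", "V.PRS.3SG", "ADJ"]


def analyze(word):
    """Return the (lemma, tag) analyses of a word, or [] if it is unknown."""
    return TOY_ANALYSES.get(word, [])


def candidate_tags(word):
    """Reduce the search space: only tags licensed by the analyzer are scored."""
    analyses = analyze(word)
    if not analyses:
        # Unknown to the analyzer: fall back to the whole tagset.
        return list(ALL_TAGS)
    # Deduplicate while keeping the analyzer's order.
    return list(dict.fromkeys(tag for _, tag in analyses))


def lemma_candidates(word, tag):
    """Generate lemma candidates: lemmas of the analyses that carry this tag."""
    return [lemma for lemma, t in analyze(word) if t == tag]
```

For a word the analyzer knows, the tagger only has to choose among the listed tags; once a tag is chosen, the matching analyses supply the lemma.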
Lemmatization

Morphological guesser
  1) Generate candidates
  2) Filter by POS tag
  3) Select the most probable one

E.g.: Facebookjukba
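
The three guesser steps can be sketched as follows. This is a minimal illustration under stated assumptions: the suffix table, its counts, and the tag labels are hypothetical, and raw counts stand in for whatever probability model the real guesser uses.

```python
from collections import Counter

# Hypothetical suffix statistics: (suffix to strip, tag) -> count.
# In a real guesser these would be learned from training data.
SUFFIX_COUNTS = Counter({
    ("jukba", "N.POSS.ILL"): 8,
    ("ukba", "N.POSS.ILL"): 3,
    ("ba", "N.ILL"): 5,
    ("", "N.NOM"): 1,
})


def guess(word):
    """Step 1: generate (lemma, tag, count) candidates by stripping known suffixes."""
    candidates = []
    for (suffix, tag), count in SUFFIX_COUNTS.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            lemma = word[: len(word) - len(suffix)]
            candidates.append((lemma, tag, count))
    return candidates


def lemmatize(word, tag):
    """Steps 2 and 3: keep candidates with the chosen tag, return the most frequent."""
    matching = [(lemma, count) for lemma, t, count in guess(word) if t == tag]
    if not matching:
        # No candidate agrees with the tag: fall back to the surface form.
        return word
    return max(matching, key=lambda pair: pair[1])[0]
```

With this toy table, `lemmatize("Facebookjukba", "N.POSS.ILL")` picks `Facebook` over `Facebookj`, because the `jukba` suffix has the higher count for that tag.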
Incremental training
Training:
  1) Train the tagger
  2) Save the model
  3) Load the model
  4) Add training data to the model
  5) Save the model
Tagging:
  1) Load the model
  2) Compile the model
  3) Use the model for tagging
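
One way to read the workflow above: the saved model keeps raw counts, so new data can be added later, and probabilities are only derived ("compiled") at tagging time. The sketch below assumes exactly that; the class, its methods, and the pickle-based file format are hypothetical and do not describe the actual PurePos model format.

```python
import pickle
from collections import Counter


class CountModel:
    """Toy stand-in for a tagger model: raw counts that can be updated later."""

    def __init__(self):
        self.emissions = Counter()    # (word, tag) -> count
        self.transitions = Counter()  # (previous tag, tag) -> count

    def train(self, sentences):
        """Add counts from sentences given as lists of (word, tag) pairs."""
        for sentence in sentences:
            prev = "<s>"
            for word, tag in sentence:
                self.emissions[(word, tag)] += 1
                self.transitions[(prev, tag)] += 1
                prev = tag

    def save(self, path):
        """Store only the counts, so training can resume after loading."""
        data = {"emissions": self.emissions, "transitions": self.transitions}
        with open(path, "wb") as f:
            pickle.dump(data, f)

    @classmethod
    def load(cls, path):
        with open(path, "rb") as f:
            data = pickle.load(f)
        model = cls()
        model.emissions = data["emissions"]
        model.transitions = data["transitions"]
        return model

    def compile(self):
        """Derive emission probabilities P(word | tag) from the current counts."""
        tag_totals = Counter()
        for (_, tag), n in self.emissions.items():
            tag_totals[tag] += n
        return {key: n / tag_totals[key[1]] for key, n in self.emissions.items()}
```

Because only counts are stored, calling `train` on a loaded model and saving it again is equivalent to having trained on all the data at once.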
Evaluation

POS tagging accuracy

                       Accuracy
OpenNLP (perceptron)   97.16%
OpenNLP (maxent)       96.45%
PurePos (without MA)   98.14%
PurePos (with MA)      98.99%

Full disambiguation accuracy of PurePos

               Accuracy
Guesser        89.79%
Guesser + MT   90.35%
Guesser + MA   98.35%
Evaluation

[Figure: POS tagging accuracy]
Evaluation

[Figure: Full disambiguation accuracy]
Evaluation

Performance as a web service

               Lemmatization   Tagging   Combined
Baseline       90.58%          98.14%    89.79%
MT-10k         90.58%          98.14%    89.79%
MT-30k         90.58%          98.17%    89.81%
MT-100k        90.64%          98.30%    89.90%
MT-100k*       90.72%          98.39%    89.97%
PurePos        99.07%          98.99%    98.35%
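
The slides do not define the three measures. A plausible reading, assumed here, is token-level accuracy, with "Combined" counting a token as correct only when both its lemma and its tag are correct. Under that assumption the columns could be computed as in the sketch below; the function name and the data layout are hypothetical.

```python
def accuracies(gold, predicted):
    """Return (lemmatization, tagging, combined) token-level accuracy.

    gold and predicted are equal-length lists of (lemma, tag) pairs,
    one pair per token.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    n = len(gold)
    lemma_ok = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    tag_ok = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    # Combined: lemma and tag must both match for the token to count.
    both_ok = sum(g == p for g, p in zip(gold, predicted))
    return lemma_ok / n, tag_ok / n, both_ok / n
```

Under this definition the combined score can never exceed either of the other two, which is consistent with every row of the table.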
PurePos
• Reimplementation of HunPos
• Deeply integrated MA
• Full disambiguation
• State-of-the-art accuracy
• Full Unicode support
• Incremental training
• Open source
• Easily extensible
Thank you!

http://nlpg.itk.ppke.hu/software/purepos
