Part-of-Speech tagging of Northern Sotho: Disambiguating polysemous function words
Gertrud Faaß, Ulrich Heid, Elsabé Taljard, DJ Prinsloo
This Talk
Nine Official Bantu Languages of SA
Noun class system

Cl.No     CP            Example
1 / 2     mo- / ba-     mosadi ‘woman’ / basadi ‘women’
1a / 2b   Ø- / bo-      malome ‘uncle’ / bomalome ‘uncle & co’
3         mo-           monwana ‘finger’
4         me-           menwana ‘fingers’
5         le-           lebone ‘light’
6         ma-           mabone ‘lights’
7         se-           selepe ‘axe’
8         di-           dilepe ‘axes’
9         N- / Ø-       mpša ‘dog’ / hlogo ‘head’
10        diN- / di-    dimpša ‘dogs’ / dihlogo ‘heads’
14        bo-           bodulo ‘residence’
(6)       ma-           madulo ‘residences’
15        go-           go ruta ‘to learn’
16        fa-           fase ‘below’
17        go-           godimo ‘above’
18        mo-           morago ‘behind’
N-        N- / Ø-       ga-ntle ‘outside’ / pele ‘in front’
(24)      ga-           gare ‘middle’
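The class prefixes in the noun class table can be captured as a small lookup structure. The following is a minimal Python sketch and not part of the talk: the names `NOUN_CLASS_PREFIXES` and `candidate_classes` are hypothetical, only the overt prefixes of classes 1 to 18 from the table are included, and plain string-prefix matching is an assumption that ignores the nasal (N-) and zero (Ø-) prefixes. It mainly illustrates that a prefix alone often leaves several classes open.

```python
# Overt class prefixes copied from the noun class table.
# Classes marked only by a nasal (N-) or zero (Ø-) prefix are left out,
# because simple string matching cannot detect them (an assumption of this sketch).
NOUN_CLASS_PREFIXES = {
    "1": "mo", "2": "ba", "2b": "bo",
    "3": "mo", "4": "me", "5": "le", "6": "ma",
    "7": "se", "8": "di", "10": "di", "14": "bo",
    "15": "go", "16": "fa", "17": "go", "18": "mo",
}


def candidate_classes(word: str) -> list[str]:
    """Return every class number whose prefix matches the start of `word`."""
    lowered = word.lower()
    return [
        class_no
        for class_no, prefix in NOUN_CLASS_PREFIXES.items()
        if lowered.startswith(prefix)
    ]


if __name__ == "__main__":
    for example in ("mosadi", "selepe", "dilepe"):
        print(example, candidate_classes(example))
```

In this sketch, a word beginning with mo- matches classes 1, 3 and 18, so surface form alone does not settle the class.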
Concordial agreement – Northern Sotho
Challenges for tagging
Challenges: unknown words
Examples of suffixes and combinations for a single verb
Solution for unknown verbs and nouns
Objectives
State of the art for tagging: Sotho languages
Descriptive State of the Art: tagsets and tools

Authors                          No. of tags   Noun class (yes/no)   Tool?
Van Rooy and Pretorius (2003)    106           no                    no
De Schryver and De Pauw (2007)   56            no                    yes
Kotzé (several, e.g. 2008)       partial       no                    yes
Taljard et al. (2008)            141/262       yes                   no
This paper                       25/141        yes                   yes
Descriptive State of the Art for tagging: Sotho languages
Descriptive state of the art for tagging: Sotho languages
The ambiguity problem
The ambiguity problem: possible solutions
Tagset (cf. Handout)
RF tagger technology (cf. Schmid and Laws (2008))
Training corpus
Comparing taggers on manually annotated data
Effects of size of training corpus: adding more training data is not necessary
Effects of highly polysemous function words
Alternative proposal: hybrid taggers (Spoustová et al. (2007))
Results: 10-fold cross validation
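For readers unfamiliar with the evaluation scheme named in this slide title, 10-fold cross validation can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' setup: the most-frequent-tag baseline is a stand-in for the taggers compared in the talk, the function names are hypothetical, and the toy corpus uses placeholder tokens (`a`, `b`) and tags (`X`, `Y`) rather than real Northern Sotho data.

```python
from collections import Counter, defaultdict


def k_fold_splits(sentences, k=10):
    """Yield (train, test) pairs; fold i tests on every k-th sentence starting at i."""
    for i in range(k):
        test = sentences[i::k]
        train = [s for j, s in enumerate(sentences) if j % k != i]
        yield train, test


def train_baseline(train):
    """Learn the most frequent tag per token (stand-in tagger, an assumption)."""
    per_token = defaultdict(Counter)
    all_tags = Counter()
    for sentence in train:
        for token, tag in sentence:
            per_token[token][tag] += 1
            all_tags[tag] += 1
    default_tag = all_tags.most_common(1)[0][0]  # used for unknown tokens
    lexicon = {tok: c.most_common(1)[0][0] for tok, c in per_token.items()}
    return lexicon, default_tag


def cross_validate(sentences, k=10):
    """Return the mean token-level accuracy over k folds."""
    accuracies = []
    for train, test in k_fold_splits(sentences, k):
        lexicon, default_tag = train_baseline(train)
        gold = [(tok, tag) for sentence in test for tok, tag in sentence]
        correct = sum(lexicon.get(tok, default_tag) == tag for tok, tag in gold)
        accuracies.append(correct / len(gold))
    return sum(accuracies) / len(accuracies)


if __name__ == "__main__":
    # Toy corpus: token "a" is ambiguous (mostly X, occasionally Y).
    toy = [[("a", "X"), ("b", "Y")] for _ in range(18)]
    toy += [[("a", "Y"), ("b", "Y")] for _ in range(2)]
    print(f"mean accuracy: {cross_validate(toy):.2f}")
```

Each sentence is used for testing exactly once, and the reported figure is the average over the ten held-out folds.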
Conclusions
Future work
