CLTL Software and Web Services
Rubén Izquierdo Beviá
Rubén Izquierdo Beviá
About me
• 5-year degree in Computer Science (University of Alicante, Alicante, Spain)
• National NLP projects and 1 European project (QALLME) (University of Alicante, Alicante, Spain)
• Thesis on NLP & Word Sense Disambiguation (University of Alicante, Alicante, Spain, Sept 2010)
• Postdoc position at the DutchSemCor project (University of Tilburg, Tilburg, Sept 2011-Sept 2012)
• Postdoc position at the OpeNER project (Vrije University, Amsterdam, Sept 2012-)
CLTL software
• In general, a common input/output format
  - KAF
  - NAF, as an extension of KAF
• Single components performing single tasks
• Integration of existing modules
  - Adaptation of input/output formats
• Development of new ones
KAF
Kyoto Annotation Format
• Stand-off, layered, XML-based representation format
• Different types of information are stored in different layers
• Layers are linked by means of references
• Suitable for creating pipelines based on this format
• Layers:
  - Text → tokens
  - Term → lemmas, part-of-speech, term sentiment, word senses
  - Entities, chunks, opinions…
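
The idea of layers linked by references can be sketched in a few lines of Python. This is a minimal illustration, assuming a heavily simplified KAF-like document: the sample below is hand-written for this example and is not a complete or validated KAF file, and the helper name `terms_with_tokens` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written KAF-like sample (an assumption for illustration,
# not a complete KAF document).
KAF_SAMPLE = """
<KAF>
  <text>
    <wf wid="w1" sent="1">Hotels</wf>
    <wf wid="w2" sent="1">were</wf>
    <wf wid="w3" sent="1">clean</wf>
  </text>
  <terms>
    <term tid="t1" lemma="hotel" pos="N"><span><target id="w1"/></span></term>
    <term tid="t2" lemma="be" pos="V"><span><target id="w2"/></span></term>
    <term tid="t3" lemma="clean" pos="G"><span><target id="w3"/></span></term>
  </terms>
</KAF>
"""


def terms_with_tokens(kaf_xml):
    """Resolve each term's references back to the tokens of the text layer."""
    root = ET.fromstring(kaf_xml.strip())
    # Text layer: token id -> surface form.
    tokens = {wf.get("wid"): wf.text for wf in root.iter("wf")}
    resolved = []
    for term in root.iter("term"):
        # The term layer stores only references (target ids), not the text itself.
        target_ids = [target.get("id") for target in term.iter("target")]
        resolved.append(
            (term.get("tid"), term.get("lemma"), term.get("pos"),
             [tokens[i] for i in target_ids])
        )
    return resolved


if __name__ == "__main__":
    for entry in terms_with_tokens(KAF_SAMPLE):
        print(entry)
```

Because the term layer only points at token identifiers, a later module can add a new layer (entities, opinions…) without touching the layers already in the document.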
NAF
NewsReader Annotation Format
• Extension of KAF
• Allows cross-document processing
  - Event coreference
• IDs are converted into valid URIs
• Can store the same type of information provided by different tools
  - E.g. the results of two different POS taggers
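
Two of these points can be illustrated with a small Python sketch: turning document-local identifiers into URIs, and keeping the output of several tools for the same element side by side. Everything here is an assumption made for illustration: the `doc_uri#local_id` scheme, the `AnnotationStore` class and the tool names are hypothetical and are not taken from the NAF specification.

```python
from collections import defaultdict


def to_uri(doc_uri, local_id):
    """Build a globally unique identifier from a document URI and a local id.

    The '#'-based scheme is an assumption for this sketch.
    """
    return f"{doc_uri}#{local_id}"


class AnnotationStore:
    """Keeps annotations of the same type from different tools, keyed by URI."""

    def __init__(self, doc_uri):
        self.doc_uri = doc_uri
        self._by_target = defaultdict(dict)

    def add(self, local_id, tool, value):
        # Each tool gets its own slot, so results never overwrite each other.
        self._by_target[to_uri(self.doc_uri, local_id)][tool] = value

    def get(self, local_id):
        return dict(self._by_target[to_uri(self.doc_uri, local_id)])


if __name__ == "__main__":
    store = AnnotationStore("http://example.org/doc1")
    store.add("t1", "tagger-a", "N")
    store.add("t1", "tagger-b", "NN")
    print(store.get("t1"))
```

With identifiers that are unique across documents, a term in one document can be referenced from another, which is what cross-document tasks such as event coreference need.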
How the software is provided I
• All modules are publicly available on GitHub
  - CLTL GitHub
    - http://github.com/cltl
  - NewsReader GitHub
    - http://github.com/newsreader
  - OpeNER GitHub
    - http://github.com/opener-project/
How the software is provided II
• Some are available as web services
  - Exposed as REST web services
  - Accept an input stream (KAF/NAF)
  - Generate an output stream (KAF/NAF)
  - Easy to call from the command line with curl
  - Easy to create module pipelines, in the same way you create a pipeline of Linux commands
• http://wordpress.let.vupr.nl/web-services/
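
The same stream-in, stream-out idea can be sketched in Python. This is a minimal sketch under stated assumptions: it assumes each service accepts the KAF/NAF document as the raw body of an HTTP POST and returns the enriched document in the response body, which the slides do not confirm. The helper names `call_service`, `service` and `pipeline` are hypothetical, and no real endpoint URLs are given here.

```python
from functools import reduce
from urllib import request


def call_service(url, document, timeout=60):
    """POST a KAF/NAF document (bytes) to a service and return the response body.

    Assumption: the service reads the document from the raw request body.
    """
    req = request.Request(
        url,
        data=document,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as response:
        return response.read()


def service(url):
    """Wrap a service URL as a bytes -> bytes pipeline stage."""
    return lambda document: call_service(url, document)


def pipeline(*stages):
    """Chain stages so that each stage's output is the next stage's input."""
    return lambda document: reduce(lambda doc, stage: stage(doc), stages, document)


if __name__ == "__main__":
    # Local stand-ins for real services, so the sketch runs without a network.
    tokenize = lambda doc: doc + b"<text/>"
    tag = lambda doc: doc + b"<terms/>"
    print(pipeline(tokenize, tag)(b"<KAF>"))
```

A real pipeline would replace the stand-ins with `service(...)` stages pointing at the deployed endpoints; the composition itself mirrors a shell pipe, where every module only needs to agree on the shared format.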
Our software I
• General modules (integrated)
  - Tokenizers: whitespace-based, OpenNLP-trained...
  - Sentence splitters: rule-based, OpenNLP
  - POS taggers: TreeTagger, OpenNLP POS taggers
  - Chunker: trained on Alpino data with OpenNLP
  - Parsers: Alpino (nl), Stanford (en)
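
As an illustration of the simplest module in this list, a whitespace-based tokenizer can be sketched as follows. This is not the CLTL implementation; the function name and the output fields (`wid`, `offset`, `length`) are assumptions chosen to resemble what a text layer typically records.

```python
import re


def whitespace_tokenize(text):
    """Split text on whitespace, keeping each token's character offset and length."""
    return [
        {
            "wid": f"w{index}",
            "text": match.group(),
            "offset": match.start(),
            "length": len(match.group()),
        }
        for index, match in enumerate(re.finditer(r"\S+", text), start=1)
    ]


if __name__ == "__main__":
    for token in whitespace_tokenize("The rooms were clean."):
        print(token)
```

Keeping offsets is what makes the annotation stand-off: later layers can refer to tokens by identifier while the original text stays untouched. Note that a purely whitespace-based split leaves punctuation attached to the preceding word.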
Our software II
• General modules (developed by us)
  - WordNet tools
    - Functions to use a WordNet in LMF format
  - Word Sense Disambiguation systems
    - UKB: unsupervised
    - SVM: supervised (for nl, derived from DutchSemCor)
  - Multiword tagger
    - Tags multiword sequences of terms according to WordNet
  - OntoTagger
    - Inserts (semantic) labels into the KAF representation on the basis of lemma or WordNet synset representations of the text
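
One common way to tag multiword sequences against a lexicon is greedy longest-match over the lemma sequence. The sketch below shows that technique only; it is not claimed to be how the CLTL multiword tagger works. The lexicon contents and the identifier `mw-1` are made up for the example.

```python
def tag_multiwords(lemmas, multiwords, max_len=4):
    """Return (start, end, entry_id) spans for multiwords found in `lemmas`.

    `multiwords` maps tuples of lemmas to an entry id; `end` is exclusive.
    At each position the longest matching sequence wins.
    """
    spans = []
    i = 0
    while i < len(lemmas):
        match = None
        # Try the longest candidate first; single lemmas are not multiwords.
        for n in range(min(max_len, len(lemmas) - i), 1, -1):
            candidate = tuple(lemmas[i:i + n])
            if candidate in multiwords:
                match = (i, i + n, multiwords[candidate])
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched multiword
        else:
            i += 1
    return spans


if __name__ == "__main__":
    lexicon = {("kick", "the", "bucket"): "mw-1"}  # hypothetical entry
    print(tag_multiwords(["he", "kick", "the", "bucket", "yesterday"], lexicon))
```

Matching on lemmas rather than surface forms lets one lexicon entry cover inflected variants, which is why such a module would sit after lemmatization in a pipeline.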
Our software III
• General modules (developed by us)
  - Named Entity Recognizer
    - Detects dates and locations using specific resources + GeoNames
  - KyBot
    - Extracts tuples and relations from a set of profiles formulated using semantic and structural properties
Our software IV
• OpeNER related (developed by us)
  - Hotel property tagger
    - Detects aspects related to cleanliness, staff, breakfast, rooms…
  - Term polarity tagger
    - Positive/negative terms, intensifiers, negators…
  - Opinion miner
    - Detects opinions: target + holder + expression
    - 2 rule-based versions, 1 machine-learning version
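
How polarity terms, intensifiers and negators can interact is easiest to see in a toy example. The sketch below is a generic lexicon-based scorer, not the OpeNER term polarity tagger: the tiny lexicons, the numeric weights and the rule that a modifier applies to the next polarity term are all assumptions for illustration.

```python
# Toy lexicons; entries and weights are invented for this sketch.
POLARITY = {"good": 1.0, "clean": 1.0, "bad": -1.0, "dirty": -1.0}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}
NEGATORS = {"not", "never"}


def score_terms(lemmas):
    """Score polarity terms, applying directly preceding negators/intensifiers."""
    scored = []
    factor = 1.0
    for lemma in lemmas:
        if lemma in NEGATORS:
            factor *= -1.0  # flip the polarity of the next polarity term
        elif lemma in INTENSIFIERS:
            factor *= INTENSIFIERS[lemma]  # strengthen or weaken it
        elif lemma in POLARITY:
            scored.append((lemma, POLARITY[lemma] * factor))
            factor = 1.0
        else:
            factor = 1.0  # any other word breaks the modifier chain
    return scored


if __name__ == "__main__":
    print(score_terms(["the", "room", "be", "not", "very", "clean"]))
```

An opinion miner would build on such term-level tags, grouping them with a target and a holder to form a full opinion.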
Our software V
• NewsReader related (developed by us)
  - Discourse module
    - Splits incoming texts into headers and paragraphs
  - Factuality classifier
    - Classifies whether a statement is factual/probable/possible or not
  - Event coreference
    - Compares descriptions of events within and across documents to decide whether they refer to the same events
