SlideShare ist ein Scribd-Unternehmen logo
1 von 59
Downloaden Sie, um offline zu lesen
University of Sheffield, NLP
Text Analysis with GATE
Diana Maynard
University of Sheffield
Big Social Data Workshop
Reading, UK, April 2015
Š The University of Sheffield, 2015
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike Licence
University of Sheffield, NLP
Outline of the Tutorial
• Introduction to NLP and Information Extraction
• Introduction to GATE
• Social media analysis
• Example applications
• Semantic Search, Nesta, Decarbonet.
University of Sheffield, NLP
We are all connected to each other...
● Information,
thoughts and
opinions are
shared prolifically
on the social web
these days
● 72% of online
adults use social
networking sites
University of Sheffield, NLP
Your grandmother is three times as likely to
use a social networking site now as in 2009
University of Sheffield, NLP
There are hundreds of tools for social media
analytics
● Most of them are commercial and not freely available
● The research tools tend to focus on specific topics and
scenarios, and aren't easily adaptable
● The analysis they do often doesn't go much beyond number
crunching, e.g.
– look at number of tweets, retweets, favourites
– filter by hashtag or keyword for topic categorisation
– use off-the-shelf sentiment tools
– use counts of word length, POS categories etc
– very little semantics, don't deal with variation, ambiguity,
slang, sarcasm etc.
University of Sheffield, NLP
Analysing Social Media is harder than it sounds
There are lots
of things to
think about!
University of Sheffield, NLP
Analysing language in social media is hard
● Grundman:politics makes #climatechange scientific issue,people
don’t like knowitall rational voice tellin em wat 2do
● @adambation Try reading this article , it looks like it would be
really helpful and not obvious at all. http://t.co/mo3vODoX
● Want to solve the problem of #ClimateChange? Just #vote for a
#politician! Poof! Problem gone! #sarcasm #TVP #99%
● Human Caused #ClimateChange is a Monumental Scam!
http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!!
Lying to us like MOFO's Tax The Air We Breath! F**k Them!
University of Sheffield, NLP
Let's search for keywords like “Arctic”
Oops!
University of Sheffield, NLP
Seems like we need something to help!
How about NLP?
University of Sheffield, NLP
It is difficult to access unstructured information
efficiently
Information extraction tools can help you:
● Save time and money on management of text and data from
multiple sources
● Find hidden links scattered across huge volumes of diverse
information
● Integrate structured data from variety of sources
● Interlink text and data
● Collect information and extract new facts
University of Sheffield, NLP
What is Entity Recognition?
●
Entity Recognition is about recogising and classifying key Named
Entities and terms in the text
●
A Named Entity is a Person, Location, Organisation, Date etc.
●
A term is a key concept or phrase that is representative of the text
●
Entities and terms may be described in different ways but refer to
the same thing. We call this co-reference.
Mitt Romney, the favorite to win the Republican nomination for president in 2012
DatePerson Term
The GOP tweeted that they had knocked on 75,000 doors in Ohio the day prior.
Organisation
co-reference
Location
University of Sheffield, NLP
What is Event Recognition?
●
An event is an action or situation relevant to the domain
expressed by some relation between entities or terms.
●
It is always grounded in time, e.g. the performance of a
band, an election, the death of a person
Mitt Romney, the favorite to win the Republican nomination for president in 2012
Event DatePerson
Relation Relation
University of Sheffield, NLP
Why are Entities and Events Useful?
● They can help answer the “Big 5” journalism questions
(who, what, when, where, why)
● They can be used to categorise the texts in different ways
– look at all texts about Obama.
● They can be used as targets for opinion mining
– find out what people think about President Obama
● When linked to an ontology and/or combined with other
information, they can be used for reasoning about things
not explicit in the text
– seeing how opinions about different American
presidents have changed over the years
University of Sheffield, NLP
Approaches to Information Extraction
Knowledge Engineering

rule based

developed by
experienced language
engineers

make use of human
intuition

easier to understand
results

development could be
very time consuming

some changes may be
hard to accommodate
Learning Systems

use statistics or other
machine learning

developers do not need
LE expertise

requires large amounts of
annotated training data

some changes may
require re-annotation of
the entire training corpus
University of Sheffield, NLP
Seems like we need a tool to do this clever
stuff for us.
How about GATE?
University of Sheffield, NLP
What is GATE?
GATE is an NLP toolkit developed at the University of Sheffield
over the last 20 years.
It's open source and freely available. http://gate.ac.uk
• components for language processing, e.g. parsers, machine
learning tools, stemmers, IR tools, IE components for various
languages...
• tools for visualising and manipulating text, annotations,
ontologies, parse trees, etc.
• various information extraction tools
• evaluation and benchmarking tools
University of Sheffield, NLP
GATE components
● Language Resources (LRs), e.g. lexicons, corpora,
ontologies
● Processing Resources (PRs), e.g. parsers, generators,
taggers
●
Visual Resources (VRs), i.e. visualisation and editing
components
● Algorithms are separated from the data, which means:
– the two can be developed independently by users with
different expertise.
– alternative resources of one type can be used without
affecting the other, e.g. a different visual resource can be
used with the same language resource
University of Sheffield, NLP
ANNIE
• ANNIE is GATE's rule-based IE system
• It uses the language engineering approach (though we also
have tools in GATE for ML)
• Distributed as part of GATE
• Uses a finite-state pattern-action rule language, JAPE
• ANNIE contains a reusable and easily extendable set of
components:
– generic preprocessing components for tokenisation,
sentence splitting etc
– components for performing NE on general open domain text
University of Sheffield, NLP
ANNIE Modules
University of Sheffield, NLP
20
Document with Tokens
University of Sheffield, NLP
21
Document with Sentences
University of Sheffield, NLP
Gazetteer editor
definition file
entries entries for selected list
University of Sheffield, NLP
Named Entity Grammars
• Hand-coded rules written in JAPE applied to annotations to
identify NEs
• Phases run sequentially and constitute a cascade of FSTs over
annotations
• Annotations from format analysis, tokeniser. splitter, POS tagger,
morphological analysis, gazetteer etc.
• Because phases are sequential, annotations can be built up over
a period of phases, as new information is gleaned
• Standard named entities: persons, locations, organisations, dates,
addresses, money
• Basic NE grammars can be adapted for new applications, domains
and languages
University of Sheffield, NLP
Document with NEs
University of Sheffield, NLP
Coreference
University of Sheffield, NLP
Hands-on: Running ANNIE on news texts
●
Start GATE by double clicking on the icon
●
Because ANNIE is a ready-made application, we can just load it
directly from the menu
●
Right click on Applications, select “Ready-made applications”
and then ANNIE
●
Create a new corpus (right click on Language Resources →
New → GATE corpus and click “OK)
●
Populate the corpus (right click on the corpus, and use the file
chooser in Directory URL to select the hands-on/corpora/news-
texts directory. Click “Open” and then “OK”)
●
Double click on ANNIE and select “Run this application”
●
This will now run the application on all the documents in your
corpus
University of Sheffield, NLP
Hands-on: Looking at the results
● Now you can inspect your annotated data: double click on a
document to open it
● Select “Annotation Sets” and “Annotation List” from the tabs
● Click on the top arrow above “Original markups” in the right
pane
● You should see a mixture of Named Entity annotations
(Person, Location etc) and some other linguistic annotations
(Token, Sentence etc) in the Default annotation set
● Click on the name of an annotation in the right hand pane that
you want to view (you can select multiple annotations)
● Hover over the annotation in the text to get more info
● Try the Annotation Stack for a different view
University of Sheffield, NLP
3. Analysing Social Media
University of Sheffield, NLP
Analysing language in social media is hard
● Grundman:politics makes #climatechange scientific issue,people
don’t like knowitall rational voice tellin em wat 2do
● @adambation Try reading this article , it looks like it would be
really helpful and not obvious at all. http://t.co/mo3vODoX
● Want to solve the problem of #ClimateChange? Just #vote for a
#politician! Poof! Problem gone! #sarcasm #TVP #99%
● Human Caused #ClimateChange is a Monumental Scam!
http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!!
Lying to us like MOFO's Tax The Air We Breath! F**k Them!
University of Sheffield, NLP
Challenges for NLP
● Noisy language: unusual punctuation, capitalisation, spelling, use
of slang, sarcasm etc.
● Terse nature of microposts such as tweets
● Use of hashtags, @mentions etc causes problems for
tokenisation #thisistricky
● Lack of context gives rise to ambiguities
● NER performs poorly on microposts, mainly because of linguistic
pre-processing failure
– Performance of standard IE tools decreases from ~90% to
~40% when run on tweets rather than news articles
University of Sheffield, NLP
Lack of context causes ambiguity
Branching out from Lincoln park after dark ... Hello Russian
Navy, it's like the same thing but with glitter!
??
University of Sheffield, NLP
Getting the NEs right is crucial
Branching out from Lincoln park after dark ... Hello Russian
Navy, it's like the same thing but with glitter!
University of Sheffield, NLP
Hands-on with TwitIE
● First we need to load the TWITIE plugin
● Click on the green jigsaw-shaped icon at the top to load the
plugins manager (this takes a few seconds to load)
● Scroll down to the Twitter plugin and select “Load now”.
● Click “Apply All” and then “Close”.
● Now right-click on “Applications”, select “Ready-made
applications” and “TwitIE”
● Create another new corpus, name it “Tweets” and populate
with texts from hands-on/corpora/tweets
● Once loaded, double click on TwitIE to open it, and then select
“Run this application” (make sure the tweets corpus is chosen)
● Use the Annotation Stack view to look at Tokens in hashtags
University of Sheffield, NLP
Semantic Search with GATE
University of Sheffield, NLP
GATE MĂ­mir
• can be used to index and search over text, annotations,
semantic metadata (concepts and instances)
• allows queries that arbitrarily mix full-text, structural, linguistic
and semantic annotations
• is open source
University of Sheffield, NLP
What can GATE MĂ­mir do that Google can't?
Show me:
• all documents mentioning a temperature between 30 and 90
degrees F (expressed in any unit)
• all abstracts written in French on patent documents from the last
6 months which mention any form of the word “transistor” in the
English abstract
• the names of the patent inventors of those abstracts
• all documents mentioning steel industries in the UK, along with
their location
University of Sheffield, NLP
Search News Articles for
Politicians born in Sheffield
http://demos.gate.ac.uk/mimir/gpd/search/gus
University of Sheffield, NLP
MIMIR: Searching Text Mining Results
●
Searching and managing text annotations, semantic
information, and full text documents in one search engine
●
Queries over annotation graphs
●
Regular expressions, Kleene operators
●
Designed to be integrated as a web service in custom end-user
systems with bespoke interfaces
●
Demos at http://services.gate.ac.uk/mimir/
University of Sheffield, NLP
Hands-on with Semantic Search
Try these queries on the BBC News demo:
http://services.gate.ac.uk/mimir/gpd/search/index
● Gordon Brown
● Gordon Brown said
●
Gordon Brown [0..3] root:say
●
{Person} [0..3] root:say
●
{Person.gender=female}[0..3] root:say
●
Make sure you type the queries EXACTLY or they probably won't
work!
●
Try making up some of your own queries.
University of Sheffield, NLP
Try your hand with some SPARQL
(for the more adventurous!)
●
{Person inst ="http://dbpedia.org/resource/Gordon_Brown"} [0..3]
root:say
●
{Person sparql="SELECT ?inst WHERE { ?inst a :Politician }"}
[0..3] root:say
●
{Person sparql = "SELECT ?inst WHERE { ?inst :party
<http://dbpedia.org/resource/Labour_Party_%28UK%29> }" }
[0..3] root:say
●
{Person sparql = "SELECT ?inst WHERE {
?inst :party <http://dbpedia.org/resource/Labour_Party_%28UK
%29> .
?inst :almaMater
<http://dbpedia.org/resource/University_of_Edinburgh> }"
} [0..3] root:say
Can you work out what these do?
University of Sheffield, NLP
We don't just have to look at politicians saying and
measuring things
● If we first process the text with other NLP tools such as
sentiment analysis, we can also search for positive or negative
documents
● Or positive/negative comments about certain
people/topics/things/companies
● In the DecarboNet project, we're looking at people's opinions
about topics relating to climate change, e.g. fracking
● We could index on the sentiment annotations too
● Other people are using the combination of opinion mining and
MIMIR to look at e.g. customer feedback about products /
companies on a huge corpus
University of Sheffield, NLP
A positive tweet
University of Sheffield, NLP
A negative tweet
University of Sheffield, NLP
A Sarcastic Tweet
University of Sheffield, NLP
What diseases are in these documents?
University of Sheffield, NLP
What Pathogens?
University of Sheffield, NLP
Disease vs Disease Co-ocurrences
University of Sheffield, NLP
Diseases vs Pathogens
University of Sheffield, NLP
The Political Futures Tracker
● Example of using the technology on a real scenario - analysing
political tweets in the run-up to the UK elections
● Project funded by Nesta http://www.nesta.org.uk/
● Series of blog posts about the project, leading up to the
election
http://www.nesta.org.uk/news/political-futures-tracker
● Some of our own more technical blog posts about the
underlying tools:
http://gate.ac.uk/projects/pft/
University of Sheffield, NLP
Recognition of MPs / Candidates
University of Sheffield, NLP
Topic recognition
This term is found by first
recognising the head
word “job” from a list
under the theme
“employment” and
matching against its root
form in the text, i.e. “job”.
It is then extended to
include the adjectival
modifier “British”, which
is not present in a list
anywhere.
University of Sheffield, NLP
Positive opinion about science and innovation
University of Sheffield, NLP
What do the different parties talk about?
Conservative vs Labour
Transport, Europe and employment are
mentioned more by Conservatives
NHS is mentioned more by Labour
University of Sheffield, NLP
Where did people tweet about the economy?
University of Sheffield, NLP
Terms that co-occur with immigration topic
University of Sheffield, NLP
Which tweets express sentiment?
University of Sheffield, NLP
Summary
● Text mining is a very useful pre-requisite for doing all kinds of
more interesting things, like semantic search
● Semantic web search allows you to do much more interesting
kinds of search than a standard text-based search
● Text mining is hard, so it won't always be correct, however
● This is especially true on lower quality text such as social
media
● On the other hand, social media has some of the most
interesting material to look at
● Still plenty of work to be done, but there are lots of tools that
you can use now and get useful results
● We run annual GATE training courses where you can spend a
whole week learning all this and more!
University of Sheffield, NLP
Acknowledgements and further information
● Research partially supported by the European Union/EU under the Information
and Communication Technologies (ICT) theme of the 7th Framework Programme
for R&D (FP7) DecarboNet (610829) http://www.decarbonet.eu and Nesta
http://nesta.org.uk
● Annual GATE training course in June in
Sheffield:https://gate.ac.uk/family/training.html
● Download gate: http://gate.ac.uk/download
University of Sheffield, NLP
Relevant publications
● K Bontcheva, V Tablan, H Cunningham. Semantic Search over Documents
and Ontologies (2014) Bridging Between Information Retrieval and
Databases, 31-53
● V. Tablan, K. Bontcheva, I. Roberts, and H. Cunningham. Mímir: An open-
source semantic search framework for interactive information seeking and
discovery. Journal of Web Semantics: Science, Services and Agents on the
World Wide Web, 2014
● A. Dietzel and D. Maynard. Climate Change: A Chance for Political Re-
Engagement? In Proc. of the Political Studies Association 65th Annual
International Conference, April 2015, Sheffield, UK.
● Diana Maynard. Challenges in Analysing Social Media. In Adrian Duşa,
Dietrich Nelle, GĂźnter Stock and Gert G. Wagner (eds.) (2014): Facing the
Future: European Research Infrastructures for the Humanities and Social
Sciences. SCIVERO Verlag, Berlin, 2014.
● These and other relevant papers on the GATE website:
http://gate.ac.uk/gate/doc/papers.html

Weitere ähnliche Inhalte

Was ist angesagt?

WTM-2023-Looker Studio.pdf
WTM-2023-Looker Studio.pdfWTM-2023-Looker Studio.pdf
WTM-2023-Looker Studio.pdfPyariKumaran
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceRonan Soares
 
Social media analytics powered by data science
Social media analytics powered by data scienceSocial media analytics powered by data science
Social media analytics powered by data scienceNavin Manaswi
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
Generative AI 101 A Beginners Guide.pdf
Generative AI 101 A Beginners Guide.pdfGenerative AI 101 A Beginners Guide.pdf
Generative AI 101 A Beginners Guide.pdfSoluLab1231
 
Big data case study collection
Big data   case study collectionBig data   case study collection
Big data case study collectionLuis Miguel Salgado
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?Windzoon Technologies
 
Data science
Data scienceData science
Data scienceSreejith c
 
What is big data?
What is big data?What is big data?
What is big data?David Wellman
 
The Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxThe Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxMark Carrigan
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingHealth Catalyst
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersIvo Andreev
 
KNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To DeploymentKNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To DeploymentKNIMESlides
 
leewayhertz.com-Generative AI for enterprises The architecture its implementa...
leewayhertz.com-Generative AI for enterprises The architecture its implementa...leewayhertz.com-Generative AI for enterprises The architecture its implementa...
leewayhertz.com-Generative AI for enterprises The architecture its implementa...robertsamuel23
 

Was ist angesagt? (20)

Social media with big data analytics
Social media with big data analyticsSocial media with big data analytics
Social media with big data analytics
 
WTM-2023-Looker Studio.pdf
WTM-2023-Looker Studio.pdfWTM-2023-Looker Studio.pdf
WTM-2023-Looker Studio.pdf
 
200109-Open AI Chat GPT.pptx
200109-Open AI Chat GPT.pptx200109-Open AI Chat GPT.pptx
200109-Open AI Chat GPT.pptx
 
Data science
Data scienceData science
Data science
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Social media analytics powered by data science
Social media analytics powered by data scienceSocial media analytics powered by data science
Social media analytics powered by data science
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Generative AI 101 A Beginners Guide.pdf
Generative AI 101 A Beginners Guide.pdfGenerative AI 101 A Beginners Guide.pdf
Generative AI 101 A Beginners Guide.pdf
 
Big data
Big dataBig data
Big data
 
Big data case study collection
Big data   case study collectionBig data   case study collection
Big data case study collection
 
What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?What Are the Problems Associated with ChatGPT?
What Are the Problems Associated with ChatGPT?
 
Data science
Data scienceData science
Data science
 
What is big data?
What is big data?What is big data?
What is big data?
 
The Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptxThe Digital Sociology of Generative AI (1).pptx
The Digital Sociology of Generative AI (1).pptx
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s GoingBig Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
Big Data in Healthcare Made Simple: Where It Stands Today and Where It’s Going
 
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for DevelopersHow do OpenAI GPT Models Work - Misconceptions and Tips for Developers
How do OpenAI GPT Models Work - Misconceptions and Tips for Developers
 
KNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To DeploymentKNIME Data Science Learnathon: From Raw Data To Deployment
KNIME Data Science Learnathon: From Raw Data To Deployment
 
leewayhertz.com-Generative AI for enterprises The architecture its implementa...
leewayhertz.com-Generative AI for enterprises The architecture its implementa...leewayhertz.com-Generative AI for enterprises The architecture its implementa...
leewayhertz.com-Generative AI for enterprises The architecture its implementa...
 
Business intelligence
Business intelligenceBusiness intelligence
Business intelligence
 

Andere mochten auch

Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEDiana Maynard
 
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog TextTwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog TextLeon Derczynski
 
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...Nicolas Kourtellis
 
Text Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATEText Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATEDiana Maynard
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics ArchitectureArvind Sathi
 
Big Data for Big Results in Chinese Social Media
Big Data for Big Results in Chinese Social MediaBig Data for Big Results in Chinese Social Media
Big Data for Big Results in Chinese Social MediaReach China Holdings Limited
 
Tools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media AnalysisTools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media AnalysisDiana Maynard
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA DATASCIENCE
 
Rd 60
Rd 60Rd 60
Rd 60gerobag
 
Erba investor presentation
Erba investor presentationErba investor presentation
Erba investor presentationAmy Ostler
 
Apache UIMA Introduction
Apache UIMA IntroductionApache UIMA Introduction
Apache UIMA IntroductionTommaso Teofili
 
Semantic Search Engines
Semantic Search EnginesSemantic Search Engines
Semantic Search EnginesAtul Shridhar
 
Intriduction to Ontotext's KIM platform
Intriduction to Ontotext's KIM platformIntriduction to Ontotext's KIM platform
Intriduction to Ontotext's KIM platformtoncho11
 
Adding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to DeliveryAdding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to DeliveryOntotext
 
Ontological approach for improving semantic web search results
Ontological approach for improving semantic web search resultsOntological approach for improving semantic web search results
Ontological approach for improving semantic web search resultseSAT Journals
 
In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?Irfan Ullah
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalMauro Dragoni
 
Semantics And Search
Semantics And SearchSemantics And Search
Semantics And SearchVestforsk.no
 

Andere mochten auch (20)

Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATE
 
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog TextTwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text
 
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...
SAMOA: A Platform for Mining Big Data Streams (Apache BigData North America 2...
 
NLP from scratch
NLP from scratch NLP from scratch
NLP from scratch
 
Text Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATEText Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATE
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Big Data for Big Results in Chinese Social Media
Big Data for Big Results in Chinese Social MediaBig Data for Big Results in Chinese Social Media
Big Data for Big Results in Chinese Social Media
 
Tools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media AnalysisTools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media Analysis
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2
 
Rd 60
Rd 60Rd 60
Rd 60
 
Erba investor presentation
Erba investor presentationErba investor presentation
Erba investor presentation
 
Apache UIMA Introduction
Apache UIMA IntroductionApache UIMA Introduction
Apache UIMA Introduction
 
Semantic Search Engines
Semantic Search EnginesSemantic Search Engines
Semantic Search Engines
 
Intriduction to Ontotext's KIM platform
Intriduction to Ontotext's KIM platformIntriduction to Ontotext's KIM platform
Intriduction to Ontotext's KIM platform
 
Adding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to DeliveryAdding Semantic Edge to Your Content – From Authoring to Delivery
Adding Semantic Edge to Your Content – From Authoring to Delivery
 
Ontological approach for improving semantic web search results
Ontological approach for improving semantic web search resultsOntological approach for improving semantic web search results
Ontological approach for improving semantic web search results
 
A Taxonomy of Semantic Web data Retrieval Techniques
A Taxonomy of Semantic Web data Retrieval TechniquesA Taxonomy of Semantic Web data Retrieval Techniques
A Taxonomy of Semantic Web data Retrieval Techniques
 
In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?In Search of a Semantic Book Search Engine: Are We There Yet?
In Search of a Semantic Book Search Engine: Are We There Yet?
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
 
Semantics And Search
Semantics And SearchSemantics And Search
Semantics And Search
 

Ähnlich wie Analyzing Social Media

Text analysis-semantic-search
Text analysis-semantic-searchText analysis-semantic-search
Text analysis-semantic-searchDiana Maynard
 
Scale2014
Scale2014Scale2014
Scale2014shaunagm
 
Arcomem training entities-and-events_advanced
Arcomem training entities-and-events_advancedArcomem training entities-and-events_advanced
Arcomem training entities-and-events_advancedarcomem
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Diana Maynard
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introductionananth
 
Arcomem training opinions_advanced
Arcomem training opinions_advancedArcomem training opinions_advanced
Arcomem training opinions_advancedarcomem
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using Rsantoshi mangalgi
 
A tailor-made one-size-fits-all approach to sentiment analysis
A tailor-made one-size-fits-all approach to sentiment analysisA tailor-made one-size-fits-all approach to sentiment analysis
A tailor-made one-size-fits-all approach to sentiment analysisDiana Maynard
 
DMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesDMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesPaige Morgan
 
NATURAL LANGUAGE PROCESSING.pptx
NATURAL LANGUAGE PROCESSING.pptxNATURAL LANGUAGE PROCESSING.pptx
NATURAL LANGUAGE PROCESSING.pptxsaivinay93
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translationMarcis Pinnis
 
Addis Ababa University.pptx
Addis Ababa University.pptxAddis Ababa University.pptx
Addis Ababa University.pptxBelay Alemayehu
 
ChatGPT in academic settings H2.de
ChatGPT in academic settings H2.deChatGPT in academic settings H2.de
ChatGPT in academic settings H2.deDavid DĂśring
 
Impact the UX of Your Website with Contextual Inquiry
Impact the UX of Your Website with Contextual InquiryImpact the UX of Your Website with Contextual Inquiry
Impact the UX of Your Website with Contextual InquiryRachel Vacek
 
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...Plain Talk 2015
 
Growing Your Freelance Business (Olga Melnikova)
Growing Your Freelance Business (Olga Melnikova)Growing Your Freelance Business (Olga Melnikova)
Growing Your Freelance Business (Olga Melnikova)Olga Melnikova
 
Introduction to NLP_1.pptx
Introduction to NLP_1.pptxIntroduction to NLP_1.pptx
Introduction to NLP_1.pptxjkamble
 
Introduction to NLP.pptx
Introduction to NLP.pptxIntroduction to NLP.pptx
Introduction to NLP.pptxjkamble
 

Ähnlich wie Analyzing Social Media (20)

Text analysis-semantic-search
Text analysis-semantic-searchText analysis-semantic-search
Text analysis-semantic-search
 
Scale2014
Scale2014Scale2014
Scale2014
 
Arcomem training entities-and-events_advanced
Arcomem training entities-and-events_advancedArcomem training entities-and-events_advanced
Arcomem training entities-and-events_advanced
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 
Arcomem training opinions_advanced
Arcomem training opinions_advancedArcomem training opinions_advanced
Arcomem training opinions_advanced
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using R
 
A tailor-made one-size-fits-all approach to sentiment analysis
A tailor-made one-size-fits-all approach to sentiment analysisA tailor-made one-size-fits-all approach to sentiment analysis
A tailor-made one-size-fits-all approach to sentiment analysis
 
call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...
 
Unit 5f.pptx
Unit 5f.pptxUnit 5f.pptx
Unit 5f.pptx
 
DMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slidesDMDS Winter 2015 Workshop 1 slides
DMDS Winter 2015 Workshop 1 slides
 
NATURAL LANGUAGE PROCESSING.pptx
NATURAL LANGUAGE PROCESSING.pptxNATURAL LANGUAGE PROCESSING.pptx
NATURAL LANGUAGE PROCESSING.pptx
 
NLP pipeline in machine translation
NLP pipeline in machine translationNLP pipeline in machine translation
NLP pipeline in machine translation
 
Addis Ababa University.pptx
Addis Ababa University.pptxAddis Ababa University.pptx
Addis Ababa University.pptx
 
ChatGPT in academic settings H2.de
ChatGPT in academic settings H2.deChatGPT in academic settings H2.de
ChatGPT in academic settings H2.de
 
Impact the UX of Your Website with Contextual Inquiry
Impact the UX of Your Website with Contextual InquiryImpact the UX of Your Website with Contextual Inquiry
Impact the UX of Your Website with Contextual Inquiry
 
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...
Jon Rubin & Katherine Spivey - User-Useful Government Websites: Intersection ...
 
Growing Your Freelance Business (Olga Melnikova)
Growing Your Freelance Business (Olga Melnikova)Growing Your Freelance Business (Olga Melnikova)
Growing Your Freelance Business (Olga Melnikova)
 
Introduction to NLP_1.pptx
Introduction to NLP_1.pptxIntroduction to NLP_1.pptx
Introduction to NLP_1.pptx
 
Introduction to NLP.pptx
Introduction to NLP.pptxIntroduction to NLP.pptx
Introduction to NLP.pptx
 

Mehr von Diana Maynard

Filth and lies: analysing social media
Filth and lies: analysing social mediaFilth and lies: analysing social media
Filth and lies: analysing social mediaDiana Maynard
 
Adding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long wayAdding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long wayDiana Maynard
 
Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...Diana Maynard
 
Getting the-most-out-of-conferences
Getting the-most-out-of-conferencesGetting the-most-out-of-conferences
Getting the-most-out-of-conferencesDiana Maynard
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Diana Maynard
 
The language of social media
The language of social mediaThe language of social media
The language of social mediaDiana Maynard
 
Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...Diana Maynard
 
Adapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutionsAdapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutionsDiana Maynard
 
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...Diana Maynard
 
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...Diana Maynard
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real worldDiana Maynard
 
Social media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATESocial media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATEDiana Maynard
 
Cls8 decarbonet
Cls8 decarbonetCls8 decarbonet
Cls8 decarbonetDiana Maynard
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Diana Maynard
 
Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?Diana Maynard
 
Disability and Adventure Travel: the Double-Edged Sword
Disability and Adventure Travel: the Double-Edged SwordDisability and Adventure Travel: the Double-Edged Sword
Disability and Adventure Travel: the Double-Edged SwordDiana Maynard
 
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...Diana Maynard
 
Multimodal opinion mining from social media
Multimodal opinion mining from social mediaMultimodal opinion mining from social media
Multimodal opinion mining from social mediaDiana Maynard
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaDiana Maynard
 
What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...Diana Maynard
 

Mehr von Diana Maynard (20)

Filth and lies: analysing social media
Filth and lies: analysing social mediaFilth and lies: analysing social media
Filth and lies: analysing social media
 
Adding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long wayAdding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long way
 
Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...
 
Getting the-most-out-of-conferences
Getting the-most-out-of-conferencesGetting the-most-out-of-conferences
Getting the-most-out-of-conferences
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...
 
The language of social media
The language of social mediaThe language of social media
The language of social media
 
Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...
 
Adapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutionsAdapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutions
 
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
 
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real world
 
Social media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATESocial media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATE
 
Cls8 decarbonet
Cls8 decarbonetCls8 decarbonet
Cls8 decarbonet
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
 
Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?
 
Disability and Adventure Travel: the Double-Edged Sword
Disability and Adventure Travel: the Double-Edged SwordDisability and Adventure Travel: the Double-Edged Sword
Disability and Adventure Travel: the Double-Edged Sword
 
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...
Who cares about sarcastic tweets? Investigating the impact of sarcasm on sent...
 
Multimodal opinion mining from social media
Multimodal opinion mining from social mediaMultimodal opinion mining from social media
Multimodal opinion mining from social media
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
 
What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...
 

KĂźrzlich hochgeladen

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĂşjo
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

KĂźrzlich hochgeladen (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

Analyzing Social Media

  • 1. University of Sheffield, NLP Text Analysis with GATE Diana Maynard University of Sheffield Big Social Data Workshop Reading, UK, April 2015 Š The University of Sheffield, 2015 This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike Licence
  • 2. University of Sheffield, NLP Outline of the Tutorial • Introduction to NLP and Information Extraction • Introduction to GATE • Social media analysis • Example applications • Semantic Search, Nesta, Decarbonet.
  • 3. University of Sheffield, NLP We are all connected to each other... ● Information, thoughts and opinions are shared prolifically on the social web these days ● 72% of online adults use social networking sites
  • 4. University of Sheffield, NLP Your grandmother is three times as likely to use a social networking site now as in 2009
  • 5. University of Sheffield, NLP There are hundreds of tools for social media analytics ● Most of them are commercial and not freely available ● The research tools tend to focus on specific topics and scenarios, and aren't easily adaptable ● The analysis they do often doesn't go much beyond number crunching, e.g. – look at number of tweets, retweets, favourites – filter by hashtag or keyword for topic categorisation – use off-the-shelf sentiment tools – use counts of word length, POS categories etc – very little semantics, don't deal with variation, ambiguity, slang, sarcasm etc.
  • 6. University of Sheffield, NLP Analysing Social Media is harder than it sounds There are lots of things to think about!
  • 7. University of Sheffield, NLP Analysing language in social media is hard ● Grundman:politics makes #climatechange scientific issue,people don’t like knowitall rational voice tellin em wat 2do ● @adambation Try reading this article , it looks like it would be really helpful and not obvious at all. http://t.co/mo3vODoX ● Want to solve the problem of #ClimateChange? Just #vote for a #politician! Poof! Problem gone! #sarcasm #TVP #99% ● Human Caused #ClimateChange is a Monumental Scam! http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!! Lying to us like MOFO's Tax The Air We Breath! F**k Them!
  • 8. University of Sheffield, NLP Let's search for keywords like “Arctic” Oops!
  • 9. University of Sheffield, NLP Seems like we need something to help! How about NLP?
  • 10. University of Sheffield, NLP It is difficult to access unstructured information efficiently Information extraction tools can help you: ● Save time and money on management of text and data from multiple sources ● Find hidden links scattered across huge volumes of diverse information ● Integrate structured data from variety of sources ● Interlink text and data ● Collect information and extract new facts
  • 11. University of Sheffield, NLP What is Entity Recognition? ● Entity Recognition is about recogising and classifying key Named Entities and terms in the text ● A Named Entity is a Person, Location, Organisation, Date etc. ● A term is a key concept or phrase that is representative of the text ● Entities and terms may be described in different ways but refer to the same thing. We call this co-reference. Mitt Romney, the favorite to win the Republican nomination for president in 2012 DatePerson Term The GOP tweeted that they had knocked on 75,000 doors in Ohio the day prior. Organisation co-reference Location
  • 12. University of Sheffield, NLP What is Event Recognition? ● An event is an action or situation relevant to the domain expressed by some relation between entities or terms. ● It is always grounded in time, e.g. the performance of a band, an election, the death of a person Mitt Romney, the favorite to win the Republican nomination for president in 2012 Event DatePerson Relation Relation
  • 13. University of Sheffield, NLP Why are Entities and Events Useful? ● They can help answer the “Big 5” journalism questions (who, what, when, where, why) ● They can be used to categorise the texts in different ways – look at all texts about Obama. ● They can be used as targets for opinion mining – find out what people think about President Obama ● When linked to an ontology and/or combined with other information, they can be used for reasoning about things not explicit in the text – seeing how opinions about different American presidents have changed over the years
  • 14. University of Sheffield, NLP Approaches to Information Extraction Knowledge Engineering  rule based  developed by experienced language engineers  make use of human intuition  easier to understand results  development could be very time consuming  some changes may be hard to accommodate Learning Systems  use statistics or other machine learning  developers do not need LE expertise  requires large amounts of annotated training data  some changes may require re-annotation of the entire training corpus
  • 15. University of Sheffield, NLP Seems like we need a tool to do this clever stuff for us. How about GATE?
  • 16. University of Sheffield, NLP What is GATE? GATE is an NLP toolkit developed at the University of Sheffield over the last 20 years. It's open source and freely available. http://gate.ac.uk • components for language processing, e.g. parsers, machine learning tools, stemmers, IR tools, IE components for various languages... • tools for visualising and manipulating text, annotations, ontologies, parse trees, etc. • various information extraction tools • evaluation and benchmarking tools
  • 17. University of Sheffield, NLP GATE components ● Language Resources (LRs), e.g. lexicons, corpora, ontologies ● Processing Resources (PRs), e.g. parsers, generators, taggers ● Visual Resources (VRs), i.e. visualisation and editing components ● Algorithms are separated from the data, which means: – the two can be developed independently by users with different expertise. – alternative resources of one type can be used without affecting the other, e.g. a different visual resource can be used with the same language resource
  • 18. University of Sheffield, NLP ANNIE • ANNIE is GATE's rule-based IE system • It uses the language engineering approach (though we also have tools in GATE for ML) • Distributed as part of GATE • Uses a finite-state pattern-action rule language, JAPE • ANNIE contains a reusable and easily extendable set of components: – generic preprocessing components for tokenisation, sentence splitting etc – components for performing NE on general open domain text
  • 19. University of Sheffield, NLP ANNIE Modules
  • 20. University of Sheffield, NLP 20 Document with Tokens
  • 21. University of Sheffield, NLP 21 Document with Sentences
  • 22. University of Sheffield, NLP Gazetteer editor definition file entries entries for selected list
  • 23. University of Sheffield, NLP Named Entity Grammars • Hand-coded rules written in JAPE applied to annotations to identify NEs • Phases run sequentially and constitute a cascade of FSTs over annotations • Annotations from format analysis, tokeniser. splitter, POS tagger, morphological analysis, gazetteer etc. • Because phases are sequential, annotations can be built up over a period of phases, as new information is gleaned • Standard named entities: persons, locations, organisations, dates, addresses, money • Basic NE grammars can be adapted for new applications, domains and languages
  • 24. University of Sheffield, NLP Document with NEs
  • 25. University of Sheffield, NLP Coreference
  • 26. University of Sheffield, NLP Hands-on: Running ANNIE on news texts ● Start GATE by double clicking on the icon ● Because ANNIE is a ready-made application, we can just load it directly from the menu ● Right click on Applications, select “Ready-made applications” and then ANNIE ● Create a new corpus (right click on Language Resources → New → GATE corpus and click “OK) ● Populate the corpus (right click on the corpus, and use the file chooser in Directory URL to select the hands-on/corpora/news- texts directory. Click “Open” and then “OK”) ● Double click on ANNIE and select “Run this application” ● This will now run the application on all the documents in your corpus
  • 27. University of Sheffield, NLP Hands-on: Looking at the results ● Now you can inspect your annotated data: double click on a document to open it ● Select “Annotation Sets” and “Annotation List” from the tabs ● Click on the top arrow above “Original markups” in the right pane ● You should see a mixture of Named Entity annotations (Person, Location etc) and some other linguistic annotations (Token, Sentence etc) in the Default annotation set ● Click on the name of an annotation in the right hand pane that you want to view (you can select multiple annotations) ● Hover over the annotation in the text to get more info ● Try the Annotation Stack for a different view
  • 28. University of Sheffield, NLP 3. Analysing Social Media
  • 29. University of Sheffield, NLP Analysing language in social media is hard ● Grundman:politics makes #climatechange scientific issue,people don’t like knowitall rational voice tellin em wat 2do ● @adambation Try reading this article , it looks like it would be really helpful and not obvious at all. http://t.co/mo3vODoX ● Want to solve the problem of #ClimateChange? Just #vote for a #politician! Poof! Problem gone! #sarcasm #TVP #99% ● Human Caused #ClimateChange is a Monumental Scam! http://www.youtube.com/watch?v=LiX792kNQeE … F**k yes!! Lying to us like MOFO's Tax The Air We Breath! F**k Them!
  • 30. University of Sheffield, NLP Challenges for NLP ● Noisy language: unusual punctuation, capitalisation, spelling, use of slang, sarcasm etc. ● Terse nature of microposts such as tweets ● Use of hashtags, @mentions etc causes problems for tokenisation #thisistricky ● Lack of context gives rise to ambiguities ● NER performs poorly on microposts, mainly because of linguistic pre-processing failure – Performance of standard IE tools decreases from ~90% to ~40% when run on tweets rather than news articles
  • 31. University of Sheffield, NLP Lack of context causes ambiguity Branching out from Lincoln park after dark ... Hello Russian Navy, it's like the same thing but with glitter! ??
  • 32. University of Sheffield, NLP Getting the NEs right is crucial Branching out from Lincoln park after dark ... Hello Russian Navy, it's like the same thing but with glitter!
  • 33. University of Sheffield, NLP Hands-on with TwitIE ● First we need to load the TWITIE plugin ● Click on the green jigsaw-shaped icon at the top to load the plugins manager (this takes a few seconds to load) ● Scroll down to the Twitter plugin and select “Load now”. ● Click “Apply All” and then “Close”. ● Now right-click on “Applications”, select “Ready-made applications” and “TwitIE” ● Create another new corpus, name it “Tweets” and populate with texts from hands-on/corpora/tweets ● Once loaded, double click on TwitIE to open it, and then select “Run this application” (make sure the tweets corpus is chosen) ● Use the Annotation Stack view to look at Tokens in hashtags
  • 34. University of Sheffield, NLP Semantic Search with GATE
  • 35. University of Sheffield, NLP GATE MĂ­mir • can be used to index and search over text, annotations, semantic metadata (concepts and instances) • allows queries that arbitrarily mix full-text, structural, linguistic and semantic annotations • is open source
  • 36. University of Sheffield, NLP What can GATE MĂ­mir do that Google can't? Show me: • all documents mentioning a temperature between 30 and 90 degrees F (expressed in any unit) • all abstracts written in French on patent documents from the last 6 months which mention any form of the word “transistor” in the English abstract • the names of the patent inventors of those abstracts • all documents mentioning steel industries in the UK, along with their location
  • 37. University of Sheffield, NLP Search News Articles for Politicians born in Sheffield http://demos.gate.ac.uk/mimir/gpd/search/gus
  • 38. University of Sheffield, NLP MIMIR: Searching Text Mining Results ● Searching and managing text annotations, semantic information, and full text documents in one search engine ● Queries over annotation graphs ● Regular expressions, Kleene operators ● Designed to be integrated as a web service in custom end-user systems with bespoke interfaces ● Demos at http://services.gate.ac.uk/mimir/
  • 39. University of Sheffield, NLP Hands-on with Semantic Search Try these queries on the BBC News demo: http://services.gate.ac.uk/mimir/gpd/search/index ● Gordon Brown ● Gordon Brown said ● Gordon Brown [0..3] root:say ● {Person} [0..3] root:say ● {Person.gender=female}[0..3] root:say ● Make sure you type the queries EXACTLY or they probably won't work! ● Try making up some of your own queries.
  • 40. University of Sheffield, NLP Try your hand with some SPARQL (for the more adventurous!) ● {Person inst ="http://dbpedia.org/resource/Gordon_Brown"} [0..3] root:say ● {Person sparql="SELECT ?inst WHERE { ?inst a :Politician }"} [0..3] root:say ● {Person sparql = "SELECT ?inst WHERE { ?inst :party <http://dbpedia.org/resource/Labour_Party_%28UK%29> }" } [0..3] root:say ● {Person sparql = "SELECT ?inst WHERE { ?inst :party <http://dbpedia.org/resource/Labour_Party_%28UK %29> . ?inst :almaMater <http://dbpedia.org/resource/University_of_Edinburgh> }" } [0..3] root:say Can you work out what these do?
  • 41. University of Sheffield, NLP We don't just have to look at politicians saying and measuring things ● If we first process the text with other NLP tools such as sentiment analysis, we can also search for positive or negative documents ● Or positive/negative comments about certain people/topics/things/companies ● In the DecarboNet project, we're looking at people's opinions about topics relating to climate change, e.g. fracking ● We could index on the sentiment annotations too ● Other people are using the combination of opinion mining and MIMIR to look at e.g. customer feedback about products / companies on a huge corpus
  • 42. University of Sheffield, NLP A positive tweet
  • 43. University of Sheffield, NLP A negative tweet
  • 44. University of Sheffield, NLP A Sarcastic Tweet
  • 45. University of Sheffield, NLP What diseases are in these documents?
  • 46. University of Sheffield, NLP What Pathogens?
  • 47. University of Sheffield, NLP Disease vs Disease Co-ocurrences
  • 48. University of Sheffield, NLP Diseases vs Pathogens
  • 49. University of Sheffield, NLP The Political Futures Tracker ● Example of using the technology on a real scenario - analysing political tweets in the run-up to the UK elections ● Project funded by Nesta http://www.nesta.org.uk/ ● Series of blog posts about the project, leading up to the election http://www.nesta.org.uk/news/political-futures-tracker ● Some of our own more technical blog posts about the underlying tools: http://gate.ac.uk/projects/pft/
  • 50. University of Sheffield, NLP Recognition of MPs / Candidates
  • 51. University of Sheffield, NLP Topic recognition This term is found by first recognising the head word “job” from a list under the theme “employment” and matching against its root form in the text, i.e. “job”. It is then extended to include the adjectival modifier “British”, which is not present in a list anywhere.
  • 52. University of Sheffield, NLP Positive opinion about science and innovation
  • 53. University of Sheffield, NLP What do the different parties talk about? Conservative vs Labour Transport, Europe and employment are mentioned more by Conservatives NHS is mentioned more by Labour
  • 54. University of Sheffield, NLP Where did people tweet about the economy?
  • 55. University of Sheffield, NLP Terms that co-occur with immigration topic
  • 56. University of Sheffield, NLP Which tweets express sentiment?
  • 57. University of Sheffield, NLP Summary ● Text mining is a very useful pre-requisite for doing all kinds of more interesting things, like semantic search ● Semantic web search allows you to do much more interesting kinds of search than a standard text-based search ● Text mining is hard, so it won't always be correct, however ● This is especially true on lower quality text such as social media ● On the other hand, social media has some of the most interesting material to look at ● Still plenty of work to be done, but there are lots of tools that you can use now and get useful results ● We run annual GATE training courses where you can spend a whole week learning all this and more!
  • 58. University of Sheffield, NLP Acknowledgements and further information ● Research partially supported by the European Union/EU under the Information and Communication Technologies (ICT) theme of the 7th Framework Programme for R&D (FP7) DecarboNet (610829) http://www.decarbonet.eu and Nesta http://nesta.org.uk ● Annual GATE training course in June in Sheffield:https://gate.ac.uk/family/training.html ● Download gate: http://gate.ac.uk/download
  • 59. University of Sheffield, NLP Relevant publications ● K Bontcheva, V Tablan, H Cunningham. Semantic Search over Documents and Ontologies (2014) Bridging Between Information Retrieval and Databases, 31-53 ● V. Tablan, K. Bontcheva, I. Roberts, and H. Cunningham. MĂ­mir: An open- source semantic search framework for interactive information seeking and discovery. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, 2014 ● A. Dietzel and D. Maynard. Climate Change: A Chance for Political Re- Engagement? In Proc. of the Political Studies Association 65th Annual International Conference, April 2015, Sheffield, UK. ● Diana Maynard. Challenges in Analysing Social Media. In Adrian Duşa, Dietrich Nelle, GĂźnter Stock and Gert G. Wagner (eds.) (2014): Facing the Future: European Research Infrastructures for the Humanities and Social Sciences. SCIVERO Verlag, Berlin, 2014. ● These and other relevant papers on the GATE website: http://gate.ac.uk/gate/doc/papers.html