Auto-GSR
Parang Saraf
PhD Candidate
Discovery Analytics Center
Department of Computer Science
Virginia Tech
Email: parang@cs.vt.edu
Web: http://people.cs.vt.edu/parang/
June 15th, 2016
Introduction
•  AutoGSR is a part of the EMBERS project
•  EMBERS is a fully automated, 24x7, cloud-hosted system that mines
massive streams of open source data (Twitter, Facebook, news, blogs,
etc.) to generate forecasts of future civil unrest events
•  EMBERS has been funded by IARPA’s OSI program since 2012
•  Forecasts of civil unrest events generated by EMBERS are
evaluated against ground truth reported in news articles. This
ground truth is generated manually by MITRE using a team of
analysts. However, this manual approach does not scale.
Goal
AutoGSR aims to generate comprehensive ground
truth data
•  by extracting events of type:
“Who protested where, when and why”
•  from news articles in:
Spanish, Portuguese, English and Arabic
•  While minimizing the manual effort required
In the OSI program, the ground truth data, which comprises records
of civil unrest events reported in Latin American newspapers, is referred
to as the Gold Standard Report (GSR). Since we are automating the
process of ground truth generation, we named our system AutoGSR.
Sub-Goals
1.  Minimize the Manual Effort required to generate
the Ground Truth Civil Unrest Data
2.  Generate a “comprehensive” dataset
•  For the OSI project, IARPA generates the GSR with the help of
MITRE.
•  MITRE’s GSR generation process is purely manual, which makes it
costly.
•  The basic idea behind AutoGSR is to make GSR generation
economically feasible.
•  Why emphasize the word “comprehensive”?
o  Because automated event extractors have poor recall
•  Almost all civil unrest events need to be identified
•  This is crucial for OSI evaluations
•  The dataset is also used to train the EMBERS forecasting models
Why Do Automated Extractors
Have Poor Recall?
•  Because most extraction methods are based on patterns,
e.g. <student w/2 protest>
–  While patterns work well on semi-structured data
such as medical reports and calendar notifications, they work
poorly on unstructured data such as news and blogs
–  Free-flowing text can express a given piece of information in a
wide variety of ways
•  Spread across multiple sentences
•  Requiring co-reference resolution
•  Negation, etc.
Precision-Recall Tradeoff
•  Rigid patterns (high precision, low recall)
–  <student w/2 protest>
–  Matches true events
–  Misses several other real events (e.g. a labor strike)
–  ICEWS
•  Loose patterns (low precision, high recall)
–  <Noun w/2 protest/alt>
–  Identifies almost all real events
–  Matches several false events (e.g. a players’ strike)
–  GDELT
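The tradeoff between the two pattern styles can be sketched in a few lines of Python. This is only an illustration: it assumes "w/2" means "within two tokens in either direction", and the word lists (including the tiny stand-in for a noun tagger) are hypothetical rather than taken from ICEWS, GDELT or AutoGSR.

```python
import re

def within_window(text, left_terms, right_terms, window=2):
    """True if any left term occurs at most `window` tokens from any right term."""
    tokens = re.findall(r"\w+", text.lower())
    left_pos = [i for i, tok in enumerate(tokens) if tok in left_terms]
    right_pos = [i for i, tok in enumerate(tokens) if tok in right_terms]
    return any(abs(i - j) <= window for i in left_pos for j in right_pos)

# Hypothetical vocabularies, chosen only for this example
STUDENT = {"student", "students"}
PROTEST = {"protest", "protests", "protested"}
NOUNS = STUDENT | {"workers", "teachers", "players"}   # stand-in for "any noun"
PROTEST_ALT = PROTEST | {"strike", "strikes", "march", "marched"}

def rigid_match(text):
    """Rigid pattern: <student w/2 protest>."""
    return within_window(text, STUDENT, PROTEST)

def loose_match(text):
    """Loose pattern: <Noun w/2 protest/alt>."""
    return within_window(text, NOUNS, PROTEST_ALT)

for headline in (
    "Students protest tuition hikes",
    "Workers strike over unpaid wages",         # real event missed by the rigid pattern
    "Players strike delays the season opener",  # false event matched by the loose pattern
):
    print(headline, "| rigid:", rigid_match(headline), "| loose:", loose_match(headline))
```

The rigid pattern only fires on the first headline, while the loose pattern fires on all three, including the sports story.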
Preferred
What ratio of the articles are
truly protest events?
[Bar chart: AutoGSR Articles Count for April 2016 (10 Latin American countries). Google Search: 17,633; Processed: 9,868; Protest: 2,976 (16.8%)]
Auto-GSR v1.0
Auto-GSR Interface v1.0
Baseline Version
•  This baseline version automates the GSR production
process:
–  Runs keyword-based Google search queries and downloads the resulting links
–  Extracts the article text from these links and looks for protest keywords
–  Loads into the interface only those articles that contain protest keywords
•  Also translates articles into English
•  Loads the image associated with the article
•  Highlights protest keywords
•  Identifies city names in the article text and pre-populates the location dropdown for
faster encoding
–  The interface lets the user encode articles by clicking a few buttons
–  The interface also lets the user review and resolve conflicts
•  The encoding process still remains manual:
–  Does not perform any classification or filtering of articles
–  Does not provide any encoding recommendations
Auto-GSR v2.0
The “intelligent” version
•  This version introduces several machine learning
models for:
–  discovery and classification of news articles
–  Encoding recommendations:
•  Recommendations for Individual encoding elements.
•  Recommendations for the whole encoding tuple
•  The architecture has a very flexible design:
–  It is easy to plug third-party models into the system
•  New Interface
–  Similar news stories are clustered together in real-time
–  Shows Non-Protest articles separately from the Protest
articles
Models Ecosystem
Filtering-Based Models
•  Rule-based models that classify incoming news articles as
protest or non-protest with 0-or-1 certainty
•  Models:
–  Sub-domain based filtering model
–  URL based filtering model
–  Negative keyword based filtering model
•  Approach: All articles are passed through each of these models.
If any of these models classifies an article as non-protest,
the article is labeled as a non-protest article in the interface
Probability-Based Models
•  These models assign a probability score to an incoming article to
indicate whether the article is reporting a protest or not
•  Models:
–  Naïve-Bayes Document Classifier
–  Image based Classifier
–  SEO Meta Tags based Classifier
–  Deep Learning Classifier
•  Approach: Each of these models assigns its own probability to an
incoming article. The article’s final probability is calculated using a
‘model ensemble’ approach. In the interface the user can specify a
cutoff probability score. Articles with a probability greater than the
cutoff appear as protest articles in the interface
Recommendations-Based Models
•  These models assume that the incoming article is a protest article
and try to recommend complete or partial encoding(s) for the article
•  Models:
–  Clustering based Model for full-encoding recommendation
–  Geo-location Model for location recommendation
–  Key sentence(s) recommendation
–  SEO Meta Tags based recommendations
–  National or Statewide protest recommendation
•  Approach: These recommendations appear in the interface for each
article. The recommendations are clickable, allowing users to
select an encoding with just one click.
Sub-Domain Based Filtering
•  Many sub-domains are tagged as
irrelevant for protest articles.
– Sports, Entertainment, Editorial, etc.
•  An article that appears in any of these
sub-domains is classified as a non-protest
article
•  Filtering-Based Model
URL-Based Filtering
•  Even from the relevant sub-domains, there might be
several URL structures that are irrelevant. For example:
–  URLs summarizing top stories of the day
Ex: http://www.clarin.com/politica/
–  URLs summarizing stories by topics
Ex: http://www.clarin.com/tema/manifestaciones.html
–  URLs corresponding to search terms
Ex: http://www.clarin.com/buscador?q=protesta
•  Filtering-Based Model
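A rule of this kind can be sketched with Python's standard `urllib.parse`. The rules below are assumptions made up to cover the three clarin.com examples above (section landing pages, `/tema/` topic pages, `/buscador` search pages); they are not AutoGSR's actual rule set, and the article URL at the end is hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical first path segments that mark topic or search pages
NON_ARTICLE_PREFIXES = {"tema", "buscador"}

def is_irrelevant_url(url):
    """Return True if the URL looks like a listing page rather than a single article."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return True                      # site home page
    if segments[0] in NON_ARTICLE_PREFIXES:
        return True                      # topic summary or search results
    if len(segments) == 1 and "." not in segments[0]:
        return True                      # section landing page such as /politica/
    return False

for url in (
    "http://www.clarin.com/politica/",
    "http://www.clarin.com/tema/manifestaciones.html",
    "http://www.clarin.com/buscador?q=protesta",
    "http://www.clarin.com/politica/ejemplo-de-nota.html",  # hypothetical article URL
):
    print(url, "->", "filtered" if is_irrelevant_url(url) else "kept")
```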
Negative Keyword Based
Filtering Model
•  For many protest keywords there are phrases
(negative keywords) in which the protest keyword takes on a
different, non-protest meaning. For example:
Protest Keyword   Negative Keyword Phrase   Meaning
marcha            poner en marcha           to start; to set in motion
protesta          tomar protesta            to swear in (public official)
protesta          rendir protesta           to swear in (public official)
•  Filtering-Based Model
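One simple way to apply such a table, shown here as a sketch rather than as AutoGSR's implementation, is to mask out the negative phrases and then check whether the protest keyword still occurs. The function name is hypothetical, and the sketch only handles the infinitive forms listed in the table, not conjugated variants.

```python
import re

# Protest keyword -> phrases in which it does not mean "protest"
NEGATIVE_PHRASES = {
    "marcha": ["poner en marcha"],
    "protesta": ["tomar protesta", "rendir protesta"],
}

def protest_keywords_in(text):
    """Return the protest keywords that remain after negative phrases are masked out."""
    lowered = text.lower()
    found = []
    for keyword, phrases in NEGATIVE_PHRASES.items():
        masked = lowered
        for phrase in phrases:
            masked = masked.replace(phrase, " ")
        if re.search(rf"\b{re.escape(keyword)}\b", masked):
            found.append(keyword)
    return found

print(protest_keywords_in("El alcalde va a rendir protesta mañana"))
print(protest_keywords_in("Miles se sumaron a la marcha y a la protesta"))
```

An article for which the function returns an empty list would be labeled non-protest by this filter.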
Naïve-Bayes Document
Classifier
1.  For each article in the training set, extract named
entities: people, locations and organizations
2.  For each country, for every person, location,
organization and protest keyword mentioned in the training set,
estimate the probability that an article mentioning it is a protest article
3.  For an incoming article, combine the probabilities of the
people, locations, organizations and protest keywords it mentions
into a Naïve-Bayes probability of the article being a
protest article
•  Probability-Based Model
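The three steps above can be sketched as a small Naïve-Bayes classifier over mention features. Everything here is illustrative: the feature strings, the training rows and the add-one smoothing are assumptions, and the per-country split from step 2 would simply mean training one such model per country.

```python
import math
from collections import Counter

class MentionNaiveBayes:
    """Naïve Bayes over mentions (people, locations, organizations, protest keywords)."""

    def __init__(self):
        self.doc_counts = Counter()                       # label -> number of articles
        self.feature_counts = {True: Counter(), False: Counter()}
        self.vocabulary = set()

    def fit(self, labeled_articles):
        for features, is_protest in labeled_articles:
            self.doc_counts[is_protest] += 1
            self.feature_counts[is_protest].update(features)
            self.vocabulary.update(features)

    def protest_probability(self, features):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in (True, False):
            # Add-one smoothing on both the prior and the feature likelihoods
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            denom = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            for feature in features:
                if feature in self.vocabulary:            # ignore unseen mentions
                    score += math.log((self.feature_counts[label][feature] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {label: math.exp(s - top) for label, s in log_scores.items()}
        return exp[True] / (exp[True] + exp[False])

# Hypothetical training data: (mentions in the article, is it a protest article?)
training = [
    (["ORG:teachers_union", "LOC:bogota", "KW:marcha"], True),
    (["ORG:teachers_union", "KW:huelga"], True),
    (["PER:team_coach", "LOC:bogota", "KW:marcha"], False),
    (["PER:team_coach", "ORG:football_club"], False),
]
model = MentionNaiveBayes()
model.fit(training)
print(model.protest_probability(["ORG:teachers_union", "KW:huelga"]))
```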
Image Based Classifier
•  A picture is worth 1,000 words
•  An image classification model learns from the
images in the training set and classifies incoming
images as protest images or not
•  Cases where the article image is a standard
image such as a newspaper logo, or where there is no associated
image, are excluded.
•  Probability-Based Model
SEO Meta Tags based
Classification and Suggestions
•  Almost every news site uses SEO meta tags that make it
easy for search engine crawlers to index its content
•  These tags carry succinct information
about the article that can be used to our advantage, such as
summary, abstract, description, keywords and publish date.
•  The tags are generated specifically for each article to
improve its presence on the web.
•  Probability-Based and Suggestion-Based Model
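Reading these tags needs nothing beyond Python's standard `html.parser`. The sketch below collects `<meta>` tags keyed by their `name` or `property` attribute; the sample page and its tag values are invented for illustration, and how AutoGSR turns the extracted fields into a score or a suggestion is not shown.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collects <meta name=... content=...> and <meta property=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content:
            self.meta[key.lower()] = content

def extract_meta(html):
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta

# Hypothetical article page
sample_html = """<html><head>
<meta name="description" content="Docentes marchan en Bogotá por mejores salarios">
<meta name="keywords" content="marcha, docentes, Bogotá">
<meta property="article:published_time" content="2016-04-12T09:30:00-05:00">
</head><body></body></html>"""

print(extract_meta(sample_html))
```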
Deep Learning Classifier
•  Uses Neural Network based Deep Learning
techniques like word2vec, doc2vec to
classify incoming articles into protest and
non-protest.
•  Probability-Based Model
Model Ensemble
•  The goal of the model ensemble is to combine the probabilities from
the probability-based models into one final probability score for the
article.
•  Takes into account how well each model has performed in the
past
•  Also handles cases where one or more models cannot
generate a probability score (e.g. when the image is not
present)
•  The interface shows a single combined probability for each
article and lets the user specify a cutoff probability
score. Any article with a combined probability score greater than the
cutoff is shown as a protest article in the interface
•  Part of Probability-Based-Models
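The slides do not say how the probabilities are combined, so the sketch below assumes one plausible scheme: a weighted average in which each model's weight reflects its past performance, and models that return no score are left out. The model names, weights and scores are hypothetical.

```python
def ensemble_probability(scores, weights):
    """Weighted average of the available model probabilities (None means no score)."""
    available = {name: p for name, p in scores.items() if p is not None}
    if not available:
        return None
    total_weight = sum(weights[name] for name in available)
    return sum(weights[name] * p for name, p in available.items()) / total_weight

def shown_as_protest(scores, weights, cutoff):
    """Apply the user-defined cutoff to the combined probability."""
    combined = ensemble_probability(scores, weights)
    return combined is not None and combined > cutoff

# Hypothetical weights, e.g. each model's accuracy on previously encoded articles
weights = {"naive_bayes": 0.8, "image": 0.6, "seo": 0.7, "deep_learning": 0.9}
# The image model returns None because the article has no usable image
scores = {"naive_bayes": 0.9, "image": None, "seo": 0.6, "deep_learning": 0.8}

print(ensemble_probability(scores, weights))
print(shown_as_protest(scores, weights, cutoff=0.7))
```

Renormalizing over the models that did respond keeps a missing image from dragging the combined score toward zero.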
Clustering-Based Full Encoding
Recommendation
•  Articles referring to the same topic are clustered together in real
time in the interface
–  Uses a third-party search results clustering algorithm named Lingo3G
•  If any article in the cluster has already been encoded, the
system recommends the same encoding for the other articles in
the cluster
•  If a cluster contains multiple articles with different encodings,
the recommendation is based on the most
used encoding tuple
•  Recommendations are clickable and allow a user to encode the
article with just one click
•  Recommendation-Based Model
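The "most used encoding tuple" rule can be sketched with `collections.Counter`. The tuple layout (who, where, when, why) follows the event definition used earlier in the deck; reporting the share of encoded articles as a confidence value is an assumption of this sketch, and the sample encodings are invented.

```python
from collections import Counter

def recommend_encoding(cluster_encodings):
    """Recommend the most used encoding in a cluster; None entries are unencoded articles."""
    encoded = [enc for enc in cluster_encodings if enc is not None]
    if not encoded:
        return None                      # nothing in the cluster has been encoded yet
    encoding, count = Counter(encoded).most_common(1)[0]
    return encoding, count / len(encoded)

# Hypothetical cluster: three encoded articles and two unencoded ones
cluster = [
    ("Education", "Bogotá", "2016-04-12", "Wages"),
    ("Education", "Bogotá", "2016-04-12", "Wages"),
    ("Labor", "Bogotá", "2016-04-12", "Wages"),
    None,
    None,
]
print(recommend_encoding(cluster))
```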
Geo-Location Model
•  This model uses location named entities extracted
from the article text and an extended version of a world
gazetteer to recommend the location that the article is
talking about
•  Also handles cases where the article reports landmarks
instead of city names
•  Recommendation-Based Model
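A toy version of the lookup is sketched below. The three-entry gazetteer, including the landmark entry that resolves to its city, and the "most frequently resolved location wins" rule are assumptions for illustration only; the actual model's gazetteer and ranking are not described in the slides.

```python
from collections import Counter

# Hypothetical extended gazetteer: place names and landmarks -> (city, country)
GAZETTEER = {
    "buenos aires": ("Buenos Aires", "Argentina"),
    "plaza de mayo": ("Buenos Aires", "Argentina"),   # landmark resolved to its city
    "córdoba": ("Córdoba", "Argentina"),
}

def recommend_location(location_entities):
    """Resolve extracted location entities and return the most frequent match."""
    resolved = [
        GAZETTEER[entity.lower()]
        for entity in location_entities
        if entity.lower() in GAZETTEER
    ]
    if not resolved:
        return None
    return Counter(resolved).most_common(1)[0][0]

print(recommend_location(["Plaza de Mayo", "Córdoba", "Buenos Aires"]))
```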
Key Sentence(s) Suggestion
•  This is a Neural-Network based model that identifies key sentences
in the article:
–  Sentences reporting protest
–  Sentences reporting reasons for protest, or participating population
–  Sentences providing contextual information
•  On the interface the user can toggle the “reading view” to show:
–  Just the highlighted sentences of the articles
–  Full Article
•  Recommendation-Based Model
National / Statewide Protest
Suggestion
•  Simple keyword-based model that looks for variants of
the words “national” or “statewide” in the article text and
recommends that the protest may be a
nationwide or statewide protest
•  Used more as a cautionary model to alert users that an
article might need to be encoded as a nationwide or
statewide protest article instead of a city-level protest
article
•  Recommendation-Based Model
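A keyword check like this fits in a few lines. The variant lists below (English plus a couple of Spanish forms) are guesses made for the example, not the model's real keyword list.

```python
import re

# Hypothetical keyword variants for each scope
SCOPE_PATTERNS = {
    "national": re.compile(r"\b(nationwide|national|nacional)\b", re.IGNORECASE),
    "statewide": re.compile(r"\b(statewide|state-wide|estatal)\b", re.IGNORECASE),
}

def scope_warnings(text):
    """Return the scopes ('national', 'statewide') hinted at by the article text."""
    return [scope for scope, pattern in SCOPE_PATTERNS.items() if pattern.search(text)]

print(scope_warnings("Unions called a nationwide strike for Friday"))
print(scope_warnings("Teachers marched through downtown Lima"))
```

A non-empty result would only raise a caution in the interface; the user still decides how to encode the article.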
Adding a New Model
•  The system has a flexible architecture that allows new
models to be added as long as they fall into one of the
three categories: filtering-based, probability-based or
recommendation-based models.
•  The system treats the models as black boxes and calls them through a
standard interface:
–  Based on the model type, the system expects a standard
response
–  For example, it is very easy to integrate BBN SERIF into the new
version. SERIF will receive an article through an API and will
return the extracted event (full or partial), which will then be
automatically shown as a suggestion in the interface.
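A black-box plug-in contract of this kind might look like the sketch below. The class names, method names and response shapes are all hypothetical, since the slides do not specify the actual interface, and `ThirdPartyExtractor` is a stub rather than a wrapper around any real system's API.

```python
from abc import ABC, abstractmethod
from typing import Optional

class FilterModel(ABC):
    @abstractmethod
    def is_protest(self, article: dict) -> bool:
        """Return False to label the article as non-protest."""

class ProbabilityModel(ABC):
    @abstractmethod
    def probability(self, article: dict) -> Optional[float]:
        """Return a protest probability, or None if no score can be produced."""

class RecommendationModel(ABC):
    @abstractmethod
    def recommend(self, article: dict) -> dict:
        """Return a full or partial encoding, e.g. {"where": "Lima"}."""

def run_models(article, models):
    """Call every registered model through the interface that matches its category."""
    result = {"non_protest": False, "probabilities": {}, "recommendations": []}
    for model in models:
        name = type(model).__name__
        if isinstance(model, FilterModel):
            result["non_protest"] = result["non_protest"] or not model.is_protest(article)
        elif isinstance(model, ProbabilityModel):
            result["probabilities"][name] = model.probability(article)
        elif isinstance(model, RecommendationModel):
            result["recommendations"].append(model.recommend(article))
        else:
            raise TypeError(f"{name} does not fit any of the three model categories")
    return result

class ThirdPartyExtractor(RecommendationModel):
    """Hypothetical stand-in for an external event extractor."""

    def recommend(self, article):
        # A real wrapper would send the article to the external system here
        return {"where": article.get("city")}

print(run_models({"city": "Lima"}, [ThirdPartyExtractor()]))
```

Because the dispatcher only depends on the three base classes, adding a model means writing one subclass and registering an instance.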
New “Intelligent” Interface
•  New Intelligent Interface:
–  User-defined criteria for classifying protest / non-protest articles
–  Similar articles appear in clusters, thereby reducing redundancy
–  Shows full-event encoding suggestions (event extraction) for the article.
There are two ways to show these full-event suggestions:
•  Clustering-based suggestions: assuming that articles in a cluster are similar,
encodings from the encoded articles are used to make suggestions for the
unencoded articles
•  Ensembled recommendation suggestions: full-tuple encoding suggestions are
generated from the partial suggestions made by the recommendation-based
models
–  Individual suggestions are shown in the encoding form itself. These
suggestions are generated by the recommendation models
–  Shows the output of all the classification models along with their
comments as easy-to-read, well-constructed English statements.
–  Key sentences are highlighted, with the ability to tag sentences and switch
between two reading views: “Full Article” and “Highlighted Text”.
Auto-GSR
Interface Walk-Through
AutoGSR Interface
•  The user chooses the criteria for separating protest from non-protest
articles, including a cutoff confidence probability above which an
article is classified as a protest article.
•  The returned articles are clustered on the fly so that similar articles
appear in the same cluster. The system also generates cluster labels.
•  Clicking on a cluster shows all the articles in it, color-coded to
differentiate encoded articles from unencoded articles.
•  Full encoding suggestions, along with confidence scores, are generated
from the encodings of the other articles in the cluster.
•  Encoding suggestions for individual components are shown in the
encoding form itself. These suggestions are generated by the
recommendation models.
•  The output of all the classification models is shown along with their
comments as easy-to-read, well-constructed English statements.
•  The original text and the translated text are shown along with the
associated image.
•  Based on the output of the key-sentence recommendation model,
sentences deemed to contain the information required for event
extraction are highlighted. A user who disagrees with the
system-generated recommendations can also click a particular sentence
and record the type of information it provides.
AutoGSR Evaluation
Month          Quality Score (Out of 4)   Precision   Recall
October’15     3.561                      0.8         0.94
November’15    3.622                      0.82        0.78
December’15    3.53                       0.88        0.83
January’16     3.54                       0.92        0.84

February’16    Quality Score (Out of 4)   Precision   Recall
Egypt          4                          1           0.315
Jordan         3.56                       1           0.94
Time Reduction
3972% Reduction
Thank You
Slides: Safeguarding Abila: Spatio-Temporal Activity ModelingParang Saraf
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksParang Saraf
 
EMBERS AutoGSR: Automated Coding of Civil Unrest Events
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsEMBERS AutoGSR: Automated Coding of Civil Unrest Events
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsParang Saraf
 
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...Parang Saraf
 
Slides: Forex-Foreteller: Currency Trend Modeling using News Articles
Slides: Forex-Foreteller: Currency Trend Modeling using News ArticlesSlides: Forex-Foreteller: Currency Trend Modeling using News Articles
Slides: Forex-Foreteller: Currency Trend Modeling using News ArticlesParang Saraf
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Parang Saraf
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
 
Bayesian Model Fusion for Forecasting Civil Unrest
Bayesian Model Fusion for Forecasting Civil UnrestBayesian Model Fusion for Forecasting Civil Unrest
Bayesian Model Fusion for Forecasting Civil UnrestParang Saraf
 
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...Parang Saraf
 
Safeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesSafeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesParang Saraf
 
Safeguarding Abila: Real-time Streaming Analysis
Safeguarding Abila: Real-time Streaming AnalysisSafeguarding Abila: Real-time Streaming Analysis
Safeguarding Abila: Real-time Streaming AnalysisParang Saraf
 
Safeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingSafeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingParang Saraf
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksParang Saraf
 
Forex-Foreteller: Currency Trend Modeling using News Articles
Forex-Foreteller: Currency Trend Modeling using News ArticlesForex-Foreteller: Currency Trend Modeling using News Articles
Forex-Foreteller: Currency Trend Modeling using News ArticlesParang Saraf
 
Merseyside Crime Analysis
Merseyside Crime AnalysisMerseyside Crime Analysis
Merseyside Crime AnalysisParang Saraf
 

Mehr von Parang Saraf (20)

Email and Network Analyzer
Email and Network AnalyzerEmail and Network Analyzer
Email and Network Analyzer
 
Slides: Safeguarding Abila through Multiple Data Perspectives
Slides: Safeguarding Abila through Multiple Data PerspectivesSlides: Safeguarding Abila through Multiple Data Perspectives
Slides: Safeguarding Abila through Multiple Data Perspectives
 
Slides: Safeguarding Abila: Real-time Streaming Analysis
Slides: Safeguarding Abila: Real-time Streaming AnalysisSlides: Safeguarding Abila: Real-time Streaming Analysis
Slides: Safeguarding Abila: Real-time Streaming Analysis
 
Slides: Safeguarding Abila: Spatio-Temporal Activity Modeling
Slides: Safeguarding Abila: Spatio-Temporal Activity ModelingSlides: Safeguarding Abila: Spatio-Temporal Activity Modeling
Slides: Safeguarding Abila: Spatio-Temporal Activity Modeling
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist Networks
 
News Analyzer
News AnalyzerNews Analyzer
News Analyzer
 
EMBERS AutoGSR: Automated Coding of Civil Unrest Events
EMBERS AutoGSR: Automated Coding of Civil Unrest EventsEMBERS AutoGSR: Automated Coding of Civil Unrest Events
EMBERS AutoGSR: Automated Coding of Civil Unrest Events
 
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...
EMBERS at 4 years: Experiences operating an Open Source Indicators Forecastin...
 
Slides: Forex-Foreteller: Currency Trend Modeling using News Articles
Slides: Forex-Foreteller: Currency Trend Modeling using News ArticlesSlides: Forex-Foreteller: Currency Trend Modeling using News Articles
Slides: Forex-Foreteller: Currency Trend Modeling using News Articles
 
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
Slides: Concurrent Inference of Topic Models and Distributed Vector Represent...
 
EMBERS Posters
EMBERS PostersEMBERS Posters
EMBERS Posters
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
 
Bayesian Model Fusion for Forecasting Civil Unrest
Bayesian Model Fusion for Forecasting Civil UnrestBayesian Model Fusion for Forecasting Civil Unrest
Bayesian Model Fusion for Forecasting Civil Unrest
 
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...
‘Beating the News’ with EMBERS: Forecasting Civil Unrest using Open Source In...
 
Safeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data PerspectivesSafeguarding Abila through Multiple Data Perspectives
Safeguarding Abila through Multiple Data Perspectives
 
Safeguarding Abila: Real-time Streaming Analysis
Safeguarding Abila: Real-time Streaming AnalysisSafeguarding Abila: Real-time Streaming Analysis
Safeguarding Abila: Real-time Streaming Analysis
 
Safeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity ModelingSafeguarding Abila: Spatio-Temporal Activity Modeling
Safeguarding Abila: Spatio-Temporal Activity Modeling
 
Safeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist NetworksSafeguarding Abila: Discovering Evolving Activist Networks
Safeguarding Abila: Discovering Evolving Activist Networks
 
Forex-Foreteller: Currency Trend Modeling using News Articles
Forex-Foreteller: Currency Trend Modeling using News ArticlesForex-Foreteller: Currency Trend Modeling using News Articles
Forex-Foreteller: Currency Trend Modeling using News Articles
 
Merseyside Crime Analysis
Merseyside Crime AnalysisMerseyside Crime Analysis
Merseyside Crime Analysis
 

Kürzlich hochgeladen

怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制vexqp
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样wsppdmt
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxVivek487417
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareGraham Ware
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格q6pzkpark
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabiaahmedjiabur940
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制vexqp
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATIONLakpaYanziSherpa
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...Health
 

Kürzlich hochgeladen (20)

怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
一比一原版(UCD毕业证书)加州大学戴维斯分校毕业证成绩单原件一模一样
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
一比一原版(曼大毕业证书)曼尼托巴大学毕业证成绩单留信学历认证一手价格
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 

EMBERS AutoGSR: Automated Coding of Civil Unrest Events

  • 1. Auto-GSR
    Parang Saraf, PhD Candidate
    Discovery Analytics Center, Department of Computer Science, Virginia Tech
    Email: parang@cs.vt.edu
    Web: http://people.cs.vt.edu/parang/
    June 15th, 2016
  • 2. Introduction
    •  AutoGSR is part of the EMBERS project.
    •  EMBERS is a fully automated, 24x7, cloud-hosted system that mines massive streams of open source data (Twitter, Facebook, news, blogs, etc.) to generate forecasts of future civil unrest events.
    •  EMBERS has been funded by IARPA’s OSI program since 2012.
    •  Forecasts generated by EMBERS are evaluated against ground truth reported in news articles. MITRE generates this ground truth manually using a team of analysts, an approach that is not scalable.
  • 3. Goal
    AutoGSR aims to generate comprehensive ground truth data
    •  by extracting events of the type “Who protested where, when and why”
    •  from news articles in Spanish, Portuguese, English and Arabic
    •  while minimizing the manual effort required
    In the OSI program, the ground truth data, which comprises records of civil unrest events reported in Latin American newspapers, is referred to as the Gold Standard Report (GSR). Since we are automating the process of ground truth generation, we named our system AutoGSR.
  • 4. Sub-Goals
    1.  Minimize the manual effort required to generate the ground truth civil unrest data
    2.  Generate a “comprehensive” dataset
    •  For the OSI project, IARPA generates the GSR with the help of MITRE.
    •  MITRE’s GSR generation process is purely manual, which makes it costly.
    •  The basic idea behind AutoGSR is to make GSR generation economically feasible.
    •  Why emphasize the word “comprehensive”?
       o  Because automated event extractors have poor recall
    •  Almost all civil unrest events need to be identified
    •  This is crucial for OSI evaluations
    •  The dataset is also used to train the EMBERS forecasting models
  • 5. Why do automated extractors have poor recall?
    •  Because most extraction methods are based on patterns, e.g. <student w/2 protest>
       –  Patterns work well on semi-structured data such as medical reports and calendar notifications, but poorly on unstructured data such as news and blogs.
       –  Free-flowing text can express the same information in a wide variety of ways:
          •  Spread across multiple sentences
          •  Co-references that need to be resolved
          •  Negation, etc.
  • 6. Precision-Recall tradeoff
    •  Rigid patterns (high precision, low recall)
       –  <student w/2 protest>
       –  Matches true events
       –  Loses out on several other real events (e.g. a labor strike)
       –  ICEWS
    •  Loose patterns (low precision, high recall): preferred
       –  <Noun w/2 protest/alt>
       –  Identifies almost all real events
       –  Matches several false events (e.g. a player strike)
       –  GDELT
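The tradeoff between the two pattern styles can be illustrated with a small sketch. It reads `w/2` as "within two words", uses made-up keyword lists, and approximates `Noun` as "any other word" because real part-of-speech tagging would need an external library. None of these choices come from the slides.

```python
import re

# Hypothetical keyword lists; the slides do not give the actual vocabularies.
RIGID_ACTORS = {"student", "students"}
RIGID_KEYWORDS = {"protest", "protests"}
PROTEST_ALT = {"protest", "protests", "strike", "strikes", "march", "marches"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rigid_match(text, k=2):
    """<student w/2 protest>: an actor token within k tokens of a protest token."""
    tokens = tokenize(text)
    actors = [i for i, t in enumerate(tokens) if t in RIGID_ACTORS]
    keywords = [i for i, t in enumerate(tokens) if t in RIGID_KEYWORDS]
    return any(abs(i - j) <= k for i in actors for j in keywords)


def loose_match(text, k=2):
    """<Noun w/2 protest/alt>, with 'Noun' approximated as any non-keyword token."""
    tokens = tokenize(text)
    for i, t in enumerate(tokens):
        if t in PROTEST_ALT:
            window = tokens[max(0, i - k):i] + tokens[i + 1:i + k + 1]
            if any(w not in PROTEST_ALT for w in window):
                return True
    return False
```

With these lists, a sentence such as "Factory workers went on strike" is missed by the rigid pattern but caught by the loose one, while "Players strike delays the season" is a false positive for the loose pattern only.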
  • 7. What ratio of the articles are truly protest events?
    [Bar chart: AutoGSR Articles Count for April 2016 (10 LA Countries). Google Search: 17,633; Processed: 9,868; Protest: 2,976 (annotated 16.8%).]
  • 10. Baseline Version
    •  This baseline version automates the GSR production process:
       –  Performs keyword-based Google search queries and downloads the resulting links
       –  Extracts the “article text” from these links and looks for protest keywords
       –  Loads into the interface only those articles that contain protest keywords
          •  Also translates articles into English
          •  Loads the image associated with the article
          •  Highlights protest keywords
          •  Identifies city names in the article text and pre-populates the location dropdown for faster encoding
       –  The interface lets the user encode articles by clicking a few buttons
       –  The interface also lets users review and resolve conflicts
    •  The encoding process itself remains manual:
       –  Does not perform any classification or filtering of articles
       –  Does not provide any encoding recommendations
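A minimal sketch of the keyword check and highlighting steps described for the baseline. The keyword list and the `**...**` highlight markers are assumptions made for illustration; the slides do not show the real list or how the interface renders highlights.

```python
import re

# Hypothetical Spanish protest keywords; the real list is not given in the slides.
PROTEST_KEYWORDS = ["protesta", "marcha", "huelga", "manifestación"]

_KEYWORD_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in PROTEST_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def has_protest_keyword(text):
    """True if the article text contains at least one protest keyword."""
    return _KEYWORD_RE.search(text) is not None


def highlight(text):
    """Wrap every protest keyword in ** markers (a stand-in for UI highlighting)."""
    return _KEYWORD_RE.sub(lambda m: "**" + m.group(0) + "**", text)
```

Only articles for which `has_protest_keyword` returns `True` would be loaded; note that exact-word matching like this misses inflected forms such as plurals.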
  • 12. The “intelligent” version
    •  This version introduces several machine learning models for:
       –  Discovery and classification of news articles
       –  Encoding recommendations:
          •  Recommendations for individual encoding elements
          •  Recommendations for the whole encoding tuple
    •  The architecture has a very flexible design:
       –  It is easy to plug third-party models into the system
    •  New interface
       –  Similar news stories are clustered together in real time
       –  Non-protest articles are shown separately from protest articles
  • 13. Models Ecosystem
    Filtering-Based Models
       These are rule-based models that classify incoming news articles as protest or non-protest with 0 or 1 certainty.
       1.  Sub-domain based filtering model
       2.  URL based filtering model
       3.  Negative keyword based filtering model
       Approach: All articles are passed through each of these models. If any of them classifies the article as non-protest, the article is labeled as a non-protest article in the interface.
    Probability-Based Models
       These models assign a probability score to an incoming article to indicate whether or not the article is reporting a protest.
       1.  Naïve-Bayes Document Classifier
       2.  Image based Classifier
       3.  SEO Meta Tags based Classifier
       4.  Deep Learning Classifier
       Approach: Each of these models assigns its own probability to an incoming article. The article’s final probability is calculated using a ‘model ensemble’ approach. In the interface, the user can specify a cutoff probability score. Articles with a probability greater than the cutoff appear as protest articles in the interface.
    Recommendations-Based Models
       These models assume that the incoming article is a protest article and try to recommend complete or partial encoding(s) for it.
       1.  Clustering based model for full-encoding recommendation
       2.  Geo-location model for location recommendation
       3.  Key sentence(s) recommendation
       4.  SEO Meta Tags based recommendations
       5.  National or statewide protest recommendation
       Approach: These recommendations appear in the interface for each article. The recommendations are clickable, allowing users to select an encoding with just one click.
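The way the filtering and probability stages interact can be sketched as a small routing function. The names `route`, `filters` and `combined_probability` are hypothetical, and the sketch assumes each filter returns `True` when it judges an article to be non-protest.

```python
def route(article, filters, combined_probability, cutoff):
    """Label an article as 'protest' or 'non-protest'.

    filters: callables returning True when the article is judged non-protest.
    combined_probability: callable returning the ensemble score in [0, 1].
    cutoff: user-chosen threshold from the interface.
    """
    # Any single filtering model can veto the article.
    if any(f(article) for f in filters):
        return "non-protest"
    # Otherwise the ensemble probability is compared with the cutoff.
    if combined_probability(article) > cutoff:
        return "protest"
    return "non-protest"
```

For example, `route(article, [is_sports_section], ensemble_score, 0.5)` would label a sports article as non-protest regardless of its probability score (both callables here are placeholders).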
  • 15. Sub-Domain Based Filtering
    •  Many sub-domains are tagged as not relevant for protest articles.
       –  Sports, Entertainment, Editorial, etc.
    •  If an article appears in any of these sub-domains, it is classified as a non-protest article.
    •  Filtering-Based Model
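A sketch of such a filter, under the assumption that a "sub-domain" may show up either as a host label (`deportes.example.com`) or as the first path segment (`example.com/deportes/`). The tag set and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical set of section names tagged as not relevant.
NON_RELEVANT = {"deportes", "sports", "espectaculos", "entertainment",
                "opinion", "editorial"}


def in_non_relevant_subdomain(url):
    """True if a host label or the first path segment is a non-relevant section."""
    parts = urlsplit(url)
    host_labels = (parts.hostname or "").lower().split(".")
    path_segments = [s for s in parts.path.lower().split("/") if s]
    candidates = host_labels + path_segments[:1]
    return any(c in NON_RELEVANT for c in candidates)
```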
  • 16. URL-Based Filtering
    •  Even within the relevant sub-domains, several URL structures may be irrelevant. For example:
       –  URLs summarizing the top stories of the day, e.g. http://www.clarin.com/politica/
       –  URLs summarizing stories by topic, e.g. http://www.clarin.com/tema/manifestaciones.html
       –  URLs corresponding to search terms, e.g. http://www.clarin.com/buscador?q=protesta
    •  Filtering-Based Model
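One way to express these URL structures is as a list of path patterns. The three regular expressions below are assumptions derived only from the example URLs above; a real deployment would presumably keep a list per news site.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns inferred from the three example URLs.
IRRELEVANT_PATH_PATTERNS = [
    re.compile(r"^/[^/]+/?$"),   # section index page, e.g. /politica/
    re.compile(r"^/tema/"),      # topic summary pages
    re.compile(r"^/buscador"),   # search result pages
]


def is_irrelevant_url(url):
    """True if the URL path matches any known non-article structure."""
    path = urlsplit(url).path
    return any(p.search(path) for p in IRRELEVANT_PATH_PATTERNS)
```

Matching on the path only means the query string (`?q=protesta`) is ignored, so a search page is rejected whatever its search term is.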
  • 17. Negative Keyword Based Filtering Model
    •  For many protest keywords, there are words (negative keywords) that alter the meaning when used together with the protest keyword. For example:

       Protest Keyword    Negative Keyword Phrase    Meaning
       marcha             poner en marcha            to start; to set in motion
       protesta           tomar protesta             to swear in (public official)
       protesta           rendir protesta            to swear in (public official)

    •  Filtering-Based Model
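A sketch of how the phrase table could drive a filter: negative phrases are blanked out first, and an article counts as a protest candidate only if a protest keyword survives. The function names are hypothetical, and the sketch matches only the exact phrases listed, so conjugated forms such as "puso en marcha" would need extra handling.

```python
import re

# Phrase table from the slide, keyed by protest keyword.
NEGATIVE_PHRASES = {
    "marcha": ["poner en marcha"],
    "protesta": ["tomar protesta", "rendir protesta"],
}


def keyword_hits(text):
    """Return the protest keywords that occur outside any negative phrase."""
    lowered = text.lower()
    hits = []
    for keyword, phrases in NEGATIVE_PHRASES.items():
        remaining = lowered
        for phrase in phrases:
            remaining = remaining.replace(phrase, " ")
        if re.search(r"\b" + re.escape(keyword) + r"\b", remaining):
            hits.append(keyword)
    return hits


def is_non_protest(text):
    """True when no protest keyword remains after removing negative phrases."""
    return not keyword_hits(text)
```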
  • 19. Naïve-Bayes Document Classifier
    1.  For each article in the training set, extract named entities: people, locations and organizations.
    2.  For each country, for every mention of a person, location, organization or protest keyword in the training set, estimate the probability of the article being a protest article.
    3.  For an incoming article, assign a naive Bayes probability of it being a protest article, based on the people, locations, organizations and protest keywords mentioned in it.
    •  Probability-Based Model
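The three steps above can be sketched as a small presence-based naive Bayes model. This is an illustration under stated assumptions, not the project's implementation: entity extraction is assumed to have happened already (each article is a list of mention strings), Laplace smoothing is an added choice, and only mentions that are present in the article contribute to the score. The class name `MentionNaiveBayes` is hypothetical.

```python
import math
from collections import Counter


class MentionNaiveBayes:
    """Naive Bayes over the set of mentions (entities, keywords) in an article."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.doc_counts = Counter()             # label -> number of articles
        self.mention_counts = {True: Counter(), False: Counter()}
        self.vocab = set()

    def fit(self, articles):
        """articles: iterable of (mentions, is_protest); both labels must occur."""
        for mentions, is_protest in articles:
            self.doc_counts[is_protest] += 1
            for m in set(mentions):
                self.mention_counts[is_protest][m] += 1
                self.vocab.add(m)
        return self

    def _log_score(self, mentions, label):
        n = self.doc_counts[label]
        score = math.log(n / sum(self.doc_counts.values()))   # log prior
        for m in set(mentions) & self.vocab:
            # Smoothed P(mention present | label).
            p = (self.mention_counts[label][m] + self.alpha) / (n + 2 * self.alpha)
            score += math.log(p)
        return score

    def protest_probability(self, mentions):
        lp = self._log_score(mentions, True)
        ln = self._log_score(mentions, False)
        top = max(lp, ln)                       # subtract max for stability
        ep, en = math.exp(lp - top), math.exp(ln - top)
        return ep / (ep + en)
```

Since the slide describes estimating probabilities per country, one such model could be kept per country, for example in a dictionary keyed by country name.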
  • 20. Image Based Classifier
    •  A picture is worth 1,000 words
    •  An image classification model learns from the images in the training set and classifies incoming images as protest images or not
    •  Excludes cases where the article image is a standard image, such as a newspaper logo, or where there is no associated image
    •  Probability-Based Model
  • 21. SEO Meta Tags based Classification and Suggestions
    •  Almost every news site uses SEO meta tags, which make it easy for search engine crawlers to index its content.
    •  These tags provide very succinct information about the article that can be used to our advantage: summary, abstract, description, keywords, publish date, etc.
    •  The tags are generated specifically for each article to give it a better presence on the web.
    •  Probability-Based and Suggestion-Based Model
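A minimal sketch of how such tags could be collected from a page using Python's standard `html.parser`. Which tag names a given news site actually emits is an assumption; the helper simply gathers every `<meta>` element that has a `name` or `property` attribute together with a `content` attribute.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collects <meta name=... content=...> and <meta property=... content=...>."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content:
            self.meta[key.lower()] = content


def extract_meta(html):
    """Return a dict mapping meta tag names to their content strings."""
    parser = MetaTagExtractor()
    parser.feed(html)
    return parser.meta
```

The resulting dictionary (description, keywords, publish date and so on) could then feed both a classifier and the encoding suggestions.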
  • 22. SEO Meta Tags based Classification and Suggestions
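As a rough illustration of how the SEO meta tags of an article page could be collected before being passed to a classifier or used as suggestions, the sketch below uses only the Python standard library. The function name `extract_meta` and the sample tag names are assumptions; the slides do not say which tags or parser AutoGSR uses.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name=...> and <meta property=...> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Sites use either name="..." or property="..." for the tag key.
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content is not None:
            self.meta[key.lower()] = content


def extract_meta(html):
    """Return a {tag key: content} dict for the meta tags found in html."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta
```

The resulting dictionary (for example its `description` and `keywords` entries) could then serve both as classifier input and as a source of encoding suggestions such as the publish date.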
  • 23. Deep Learning Classifier •  Uses neural-network-based deep learning techniques, such as word2vec and doc2vec, to classify incoming articles as protest or non-protest •  Probability-Based Model
  • 24. Model Ensemble •  The goal of the model ensemble is to combine the probabilities from each of the probability-based models into one final probability score for the article •  Takes into account how good each of the models has been in the past •  Also handles cases where one or more of the models cannot generate a probability score (for example, when the image is not present) •  The interface shows a single combined probability for each article and allows the user to specify a cutoff probability score. Any article with a combined probability score greater than the cutoff is shown as a protest article in the interface •  Part of Probability-Based Models
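The slide does not state the combination rule, so the following is only one plausible sketch: a weighted average in which each model's weight reflects its past performance and models that return no score are skipped. The model names, the weights and the function name `ensemble_probability` are hypothetical.

```python
def ensemble_probability(scores, weights):
    """Combine per-model probabilities into one score.

    scores:  {model name: probability, or None if the model could not score}
    weights: {model name: weight reflecting past performance} (assumed)
    Returns None when no model produced a score.
    """
    available = {name: p for name, p in scores.items() if p is not None}
    total_weight = sum(weights[name] for name in available)
    if not available or total_weight == 0:
        return None
    weighted_sum = sum(weights[name] * p for name, p in available.items())
    return weighted_sum / total_weight


def is_protest(combined, cutoff):
    """An article is shown as protest when its score exceeds the user cutoff."""
    return combined is not None and combined > cutoff
```

Renormalizing by the weights of the models that actually responded keeps the combined score on the same 0 to 1 scale when, for instance, an article has no image.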
  • 26. Clustering-Based Full Encoding Recommendation •  Articles referring to the same topic are clustered together in real time in the interface –  Uses a third-party search results clustering algorithm named Lingo3G •  If any article in the cluster has already been encoded, the system starts to recommend the same encoding for the other articles in the cluster •  If a cluster contains multiple articles with different encodings, the recommendations are based on the most used encoding tuple •  Recommendations are clickable and allow a user to encode the article with just 1 click •  Recommendation-Based Model
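The "most used encoding tuple" rule can be sketched as a simple frequency count over the encodings already present in a cluster. The tuple layout and the use of the vote share as a confidence score are assumptions made for illustration; clustering itself (done by Lingo3G in the slides) is taken as given.

```python
from collections import Counter


def recommend_encodings(cluster_encodings):
    """Rank the encodings already used in a cluster by how often they occur.

    cluster_encodings: one entry per article in the cluster, either an
    encoding tuple (hypothetical layout) or None for an unencoded article.
    Returns a list of (encoding, share of encoded articles), most used first.
    """
    encoded = [enc for enc in cluster_encodings if enc is not None]
    if not encoded:
        return []
    counts = Counter(encoded)
    return [(enc, n / len(encoded)) for enc, n in counts.most_common()]
```

The first entry of the returned list would be offered as the 1-click recommendation for the unencoded articles in the same cluster.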
  • 27. Geo-Location Model •  This model uses the location named entities extracted from the article text, together with an extended version of the World Gazetteer, to recommend the location that the article is talking about •  Also handles cases where the article reports landmarks instead of city names •  Recommendation-Based Model
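A very small sketch of the idea, assuming the gazetteer extension is a landmark-to-city lookup and that the most frequently mentioned resolvable place is recommended. The slides do not describe the actual resolution logic; the two dictionaries below are hypothetical miniature stand-ins for the extended gazetteer.

```python
from collections import Counter

# Hypothetical miniature gazetteer: normalized place name -> (city, state, country).
GAZETTEER = {
    "buenos aires": ("Buenos Aires", "Ciudad Autónoma de Buenos Aires", "Argentina"),
    "córdoba": ("Córdoba", "Córdoba", "Argentina"),
}

# Hypothetical landmark extension: landmark name -> gazetteer key.
LANDMARKS = {"plaza de mayo": "buenos aires"}


def recommend_location(location_entities):
    """Recommend the place that the extracted location mentions point to most often."""
    votes = Counter()
    for mention in location_entities:
        key = mention.strip().lower()
        key = LANDMARKS.get(key, key)  # resolve landmarks to their city
        if key in GAZETTEER:
            votes[key] += 1
    if not votes:
        return None
    best, _ = votes.most_common(1)[0]
    return GAZETTEER[best]
```

Here a mention of "Plaza de Mayo" counts as a vote for Buenos Aires, which is how a landmark-only article could still receive a city-level recommendation.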
  • 28. Key Sentence(s) Suggestion •  This is a neural-network-based model that identifies key sentences in the article: –  Sentences reporting a protest –  Sentences reporting the reasons for the protest or the participating population –  Sentences providing contextual information •  In the interface, the user can toggle the “reading view” to show: –  Just the highlighted sentences of the article –  Full Article •  Recommendation-Based Model
  • 29. National / Statewide Protest Suggestion •  Simple keyword-based model that looks for variants of the words “national” or “statewide” in the article text and recommends that the protest may be a nationwide or statewide protest •  Used more as a cautionary model to alert users that the article might need to be encoded as a nationwide / statewide protest article instead of a city-level protest article •  Recommendation-Based Model
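A keyword model of this kind can be sketched with two regular expressions. The keyword variants below (a few English, Spanish and Portuguese forms) are illustrative assumptions, not the list used by AutoGSR.

```python
import re

# Hypothetical keyword variants; the real model's lists are not given in the slides.
SCOPE_PATTERNS = {
    "national": re.compile(
        r"\b(?:national|nationwide|nacional|nacionales)\b", re.IGNORECASE
    ),
    "statewide": re.compile(
        r"\b(?:state-?wide|estatal|estadual)\b", re.IGNORECASE
    ),
}


def suggest_scope(text):
    """Return the scopes ("national", "statewide") hinted at by the article text."""
    return [scope for scope, pattern in SCOPE_PATTERNS.items() if pattern.search(text)]
```

Because a word such as "nacional" also appears in names of institutions, the output is best treated as a prompt for the analyst to double-check the scope, which matches the cautionary role described on the slide.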
  • 30. Adding a New Model •  The system has a very flexible architecture that allows new models to be added as long as they fall into one of the three categories: filtering-, probability- or suggestion-based models •  The system treats the models as black boxes and uses a standard interface for calling them: –  Based on the model type, the system expects a standard response –  For example: It is very easy to integrate BBN SERIF into the new version. SERIF will receive an article through an API and will return the extracted event (full or partial), which will then be automatically shown as a suggestion in the interface
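One way to picture the black-box arrangement is a registry that calls every model through the same function signature and checks that the response matches what its category is expected to return. Everything here is a hypothetical sketch: the class `ModelRegistry`, the category names and the response types are assumptions, and no real SERIF API is shown.

```python
from typing import Callable, Dict, List, Tuple

# Assumed standard response type for each model category.
EXPECTED_RESPONSE = {
    "filter": bool,                      # True means "non-protest, filter out"
    "probability": (float, type(None)),  # None when the model cannot score
    "suggestion": dict,                  # full or partial encoding fields
}


class ModelRegistry:
    """Calls registered models as black boxes and validates their responses."""

    def __init__(self) -> None:
        self._models: List[Tuple[str, str, Callable]] = []

    def register(self, name: str, kind: str, model: Callable) -> None:
        if kind not in EXPECTED_RESPONSE:
            raise ValueError(f"unknown model kind: {kind}")
        self._models.append((name, kind, model))

    def run(self, article_text: str) -> Dict[str, object]:
        results: Dict[str, object] = {}
        for name, kind, model in self._models:
            response = model(article_text)
            if not isinstance(response, EXPECTED_RESPONSE[kind]):
                raise TypeError(f"{name} returned {type(response).__name__}")
            results[name] = response
        return results
```

Under this arrangement, an external event extractor would be wrapped in a function that takes the article text and returns a dictionary of encoding fields, then registered with the `"suggestion"` kind.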
  • 32. New “Intelligent” Interface •  New Intelligent Interface: –  User-defined criteria for classifying protest / non-protest articles –  Similar articles appear in clusters, thereby reducing redundancy –  Shows full-event encoding suggestions (event extraction) for the article. There are two ways to show these full-event suggestions: •  Clustering-based suggestions: Assuming that articles in the cluster are similar, encodings from the encoded articles are used to make suggestions for the unencoded articles •  Ensembled recommendation suggestions: Full-tuple encoding suggestions are generated from the partial suggestions made by the recommendation-based models –  Individual suggestions are shown in the encoding form itself. These suggestions are generated by the recommendation models –  Shows the output from all the classification models along with their comments as easy-to-read, well-constructed English statements –  Key sentences are highlighted, with the ability to tag sentences and switch between two reading views: “Full Article” and “Highlighted Text”
  • 35. AutoGSR Interface: Allows the user to choose the criteria for selecting protest / non-protest articles. The user can define a cutoff confidence probability for classifying an article as a protest article.
  • 36. AutoGSR Interface: The returned articles are clustered on the fly so that similar articles appear in the same cluster. The system also generates cluster labels.
  • 37. AutoGSR Interface: Clicking on a cluster shows all the articles in the cluster, with color-coding to differentiate encoded articles from unencoded articles.
  • 38. AutoGSR Interface: Full encoding suggestions, along with confidence scores, are generated based on the encodings of the other articles in the cluster.
  • 39. AutoGSR Interface: Encoding suggestions for individual components are shown in the encoding form itself. These suggestions are generated by the recommendation models.
  • 40. AutoGSR Interface: Shows the output from all the classification models along with their comments as easy-to-read, well-constructed English statements.
  • 41. AutoGSR Interface: Shows the original text and the translated text along with the associated image.
  • 42. AutoGSR Interface: Based on the output of the key-sentence recommendation model, the sentences deemed to contain the information required for event extraction are highlighted. A user who disagrees with the system-generated recommendations can also click a particular sentence and record the type of information it provides.
  • 43. AutoGSR Evaluation
    Month          Quality Score (Out of 4)   Precision   Recall
    October’15     3.561                      0.8         0.94
    November’15    3.622                      0.82        0.78
    December’15    3.53                       0.88        0.83
    January’15     3.54                       0.92        0.84

    February’16    Quality Score (Out of 4)   Precision   Recall
    Egypt          4                          1           0.315
    Jordan         3.56                       1           0.94