This document discusses natural language processing and text analytics techniques for analyzing text documents. It describes early work by H.P. Luhn at IBM on automatic text summarization in the 1950s. It also discusses how techniques such as part-of-speech tagging, semantic networks, sentiment analysis, and intent analysis can be used to extract entities, relationships, and sentiment from text. Challenges with analyzing human language are noted, and it suggests expanding text analysis to audio, images, video and integrating with social media and user behavior data.
4. Natural Language Processing
By H.P. Luhn, in IBM Journal, April 1958.
http://altaplana.com/ibm-luhn58-LiteratureAbstracts.pdf
5. Modelling Text
“Statistical information derived from word frequency and distribution is
used by the machine to compute a relative measure of significance, first
for individual words and then for sentences. Sentences scoring highest in
significance are extracted and printed out to become the auto-abstract.”
-- H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958.
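Luhn's frequency-and-extraction idea can be sketched in a few lines. This is an illustrative toy, not Luhn's actual algorithm (which also weighted word distribution within sentences); the stopword list here is a made-up sample.

```python
import re
from collections import Counter

# Toy stopword list; a real system would use a fuller lexicon.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "was", "for", "by"}

def luhn_summary(text, n_sentences=1):
    """Score each sentence by the corpus frequency of its significant
    words and return the top-scoring sentences as the auto-abstract."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if w not in STOPWORDS)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens if t not in STOPWORDS)

    return sorted(sentences, key=score, reverse=True)[:n_sentences]
```

Sentences whose words recur most often across the document score highest, exactly the "relative measure of significance" the quote describes.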
Figure: Luhn’s analysis of “Messengers of the Nervous System,” a Scientific American article.
Figure: http://wordle.net applied to the NY Times article.
13. Text Analytics
Lexical, syntactic, and semantic analysis discern features, including relationships, in source materials.
Features = entities, measure-value pairs, concepts, topics, events, sentiment, and more.
Text analytics may draw on:
• Lexicons & taxonomies.
• Statistics.
• Patterns.
• Linguistics.
• Machine learning.
15. From POS to Relationships
Understand parts of speech (POS), e.g. <subject> <verb> <object>, to discern facts and relationships.
Semantic networks such as WordNet are a disambiguation asset.
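A minimal sketch of turning POS tags into fact triples, assuming the input has already been tagged (the Penn-style tag sets below are illustrative; a real pipeline would use a tagger and parser rather than a linear scan):

```python
def extract_svo(tagged_tokens):
    """Scan pre-tagged (word, tag) pairs for the first
    noun -> verb -> noun pattern and return it as a
    (subject, verb, object) fact triple, or None."""
    nouns = {"NN", "NNS", "NNP", "NNPS"}
    verbs = {"VB", "VBD", "VBZ", "VBP"}
    subj = verb = None
    for word, tag in tagged_tokens:
        if subj is None and tag in nouns:
            subj = word                      # first noun = candidate subject
        elif subj is not None and verb is None and tag in verbs:
            verb = word                      # first verb after the subject
        elif verb is not None and tag in nouns:
            return (subj, verb, word)        # first noun after the verb = object
    return None
```

For example, the tagged sentence IBM/NNP acquired/VBD the/DT company/NN yields the triple ("IBM", "acquired", "company").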
17. The Back End
Platforms and ecosystems. APIs and services.
Text and content analytics: discerns and extracts features, including relationships, from source materials.
Features = entities, key-value pairs, concepts, topics, events, sentiment, etc.
Provides BI on content-sourced data.
Data integration, record linkage, data fusion.
21. Sentiment Analysis
“Sentiment analysis is the task of identifying positive
and negative opinions, emotions, and evaluations.”
-- Wilson, Wiebe & Hoffman, 2005, “Recognizing Contextual Polarity in
Phrase-Level Sentiment Analysis”
“Sentiment analysis or opinion mining is the
computational study of opinions, sentiments and
emotions expressed in text… An opinion on a feature f is
a positive or negative view, attitude, emotion or
appraisal on f from an opinion holder.”
-- Bing Liu, 2010, “Sentiment Analysis and Subjectivity,” in Handbook of
Natural Language Processing
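The "positive or negative view on a feature f" framing can be illustrated with a minimal lexicon-based polarity scorer. The word lists here are made-up samples, not a real sentiment lexicon, and the negation handling is deliberately naive:

```python
# Illustrative mini-lexicons; real systems use lexicons with
# thousands of entries plus context-sensitive (contextual polarity) rules.
POSITIVE = {"good", "great", "excellent", "love", "improved"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "worsen"}
NEGATORS = {"not", "never", "no"}

def polarity(text):
    """Signed score: positive words add 1, negative words subtract 1,
    and a preceding negator flips the sign of the next sentiment word."""
    score, flip = 0, 1
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATORS:
            flip = -1
        elif word in POSITIVE:
            score += flip
            flip = 1
        elif word in NEGATIVE:
            score -= flip
            flip = 1
    return score
```

"Not good" thus scores negative even though "good" is a positive word, which is the phrase-level contextual polarity problem Wilson, Wiebe & Hoffman address.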
25. Complications
Sentiment may be of interest at multiple levels:
• Corpus / data space, i.e., across multiple sources.
• Document.
• Statement / sentence.
• Entity / topic / concept.
Human language is noisy and chaotic! Jargon, slang, irony, ambiguity, anaphora, polysemy, synonymy, etc.
Context is key. Discourse analysis comes into play.
Must distinguish the sentiment holder from the object:
“Geithner said the recession may worsen.”
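The holder-vs-object distinction in the Geithner example can be sketched with a toy reported-speech pattern. Real systems need parsing and coreference resolution; this regex handles only the simplest "X said (that) Y" form:

```python
import re

def holder_and_claim(sentence):
    """Split 'X said (that) Y' into the opinion holder X and the
    claim Y; the sentiment in Y is about its own target (the
    recession), not about the holder (Geithner)."""
    m = re.match(r"(?P<holder>\w+) said (?:that )?(?P<claim>.+)", sentence)
    if m:
        return m.group("holder"), m.group("claim")
    return None, sentence  # no reported-speech pattern found
```

Applied to the slide's example, the holder is "Geithner" while the negative sentiment ("worsen") attaches to the claim's subject, the recession.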
27. Sensemaking
“It is convenient to divide the entire
information access process into two
main components: information retrieval
through searching and browsing, and
analysis and synthesis of results. This
broader process is often referred to in
the literature as sensemaking.
Sensemaking refers to an iterative
process of formulating a conceptual
representation from a large volume
of information. Search plays only one
part in this process.”
-- Marti Hearst, 2009 http://searchuserinterfaces.com/
28. Suggestions
Apply new tech to old needs, e.g., automated coding.
Select from and use all available data.
Marry social to profiles and surveys.
Factor in behaviors.
Interpret according to context and needs.
Understand intent to create situational predictive models.
Explore; experiment.