How people use words, in speech and in writing, can reveal a lot about a person. In a research context, computational text analysis may be applied to extract linguistic patterns from people’s language-based expressions. This presentation demonstrates the application of LIWC2015 for psychometric insights in a research context: to surface insights about individuals, shed light on hypotheses, compare text-based expressions, and otherwise engage corpora through psychological measures.
4. Underlying software assumptions
Human personality and its expression (in all things)
• Humans express themselves in all they do
• Unconscious
• Subconscious
• Conscious
• Humans are not in full control
• Observable leakage and revelatory aspects
Language
• Held in balance
• Empirics
• Acculturated into
• Also a small degree of uniqueness
5. Tool origination
• LIWC was created in the early- to mid-1990s to enable more consistent
analysis of people’s essays about emotional upheavals (Tausczik &
Pennebaker, 2010)
• It was created by a psychologist and his graduate student team
6. Tool validation
• The respective manuals (particularly for LIWC2015) offer insights on the
software tool’s validation for constructs and other measures
7. Customization affordances
• LIWC2015 enables the making of custom dictionaries for different types of
coding (tailored to particular research)
• The building of such .dic files is fairly simple, and the testing is also fairly
simple (Hai-Jew, Dec. 2016)
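To make the idea of a custom dictionary concrete, here is a minimal Python sketch. It assumes the percent-delimited, tab-separated layout commonly used for LIWC .dic files (a category header between two "%" lines, then one word or stem per line followed by its category numbers, with a trailing "*" as a wildcard). The category names, the word entries, and the helper functions parse_dic and count_categories are hypothetical illustrations, not part of LIWC2015; the operator's manual remains the authority on the exact format.

```python
from collections import Counter
import re

# Hypothetical mini-dictionary in the assumed LIWC-style .dic layout.
# Categories and words are invented for illustration only.
DIC_TEXT = """%
1\tleave
2\thome
%
depart*\t1
renounc*\t1
exit\t1
homeland\t2
citizen*\t2
"""

def parse_dic(text):
    """Return (categories, entries): id -> name, and word pattern -> list of ids."""
    categories, entries = {}, {}
    section = 0  # 0 = before header, 1 = category header, 2 = word entries
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "%":
            section += 1
            continue
        parts = line.split("\t")
        if section == 1:
            categories[parts[0]] = parts[1]
        elif section == 2:
            entries[parts[0].lower()] = parts[1:]
    return categories, entries

def count_categories(text, categories, entries):
    """Count tokens per category; a trailing * matches any word with that stem."""
    counts = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        for pattern, cat_ids in entries.items():
            if pattern.endswith("*"):
                hit = token.startswith(pattern[:-1])
            else:
                hit = token == pattern
            if hit:
                for cid in cat_ids:
                    counts[categories[cid]] += 1
    return counts

categories, entries = parse_dic(DIC_TEXT)
sample = "Citizens departed the homeland after renouncing citizenship."
result = count_categories(sample, categories, entries)
print(dict(result))
```

Testing a custom dictionary in this spirit means running it over a few sentences whose expected matches are known in advance and checking that the counts agree.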
9. Curated collections of text expression
Voice to text
• Speeches
• Songs
• Interviews, oral histories
• Transcripts
• and others
Genre based writing
• Essays
• Short stories
• Novels
• Plays
• Monologues, dialogues
• and others
10. Curated collections of text expression (cont.)
Mass media
• Mass media articles
• Academic articles
• And others
Social media
• Tweets / microblogging
• Posts
• And others
11. Curated collections of text expression (cont.)
Documents, byproducts of work
• Internal corporate documents
• Documents (legal, court, policy, and others)
• and others
Personal writing
• Letters
• Notes
• And others
12. About the text corpus
• What is included? Why?
• What is excluded? Why?
• What is not accessible / available and so has been left out?
• How is the data handled and cleaned?
• What language elements would be excluded from an analysis based on the
tool?
• How can workarounds be created to more fully engage the corpus?
13. About the text corpus (cont.)
• What materials may not be publicly available (but may be legally
accessed and used for the particular research)?
• What new potential insights may be available?
14. Sectioning the corpus
• What are ways to split the text corpus for different research insights?
• How will segmentation affect the data analyses? The available queries? Other aspects?
• What are ways to combine the text corpus for different research insights?
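As one way to think about the splitting and combining questions above, the following Python sketch cuts a text into fixed-size word segments and joins several texts into one. The function names split_into_segments and combine_documents and the segment size are hypothetical choices for illustration; they are not LIWC2015 features.

```python
def split_into_segments(text, words_per_segment):
    """Split text into consecutive segments of at most words_per_segment words."""
    if words_per_segment <= 0:
        raise ValueError("words_per_segment must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_segment])
        for i in range(0, len(words), words_per_segment)
    ]

def combine_documents(documents):
    """Join several texts into one combined text, separated by blank lines."""
    return "\n\n".join(doc.strip() for doc in documents)

chapter = "one two three four five six seven eight nine ten"
segments = split_into_segments(chapter, 4)
combined = combine_documents(["first text ", " second text"])
print(segments)
print(repr(combined))
```

Segment size matters: measures computed on very short segments rest on few words and may fluctuate more than the same measures computed on a whole document.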
15. Some Uses of LIWC in Prior Research (in a general sense)
16. Profiling / Description
• Individual speakers and writers (or combined textual expression)
• Groups
• Cultural entities
• Genres
• Topics
• Demographic group responses to informational elicitations (incl. age, race, ethnicity, class, geographical location, and others)
17. Predictivity
• Use language as a classifier of various groups and attitudes
• Use language as an anticipatory measure of what may occur in the near term,
mid-term, and long term
• Suicidality (among particular groups like poets) is one such application in the extant
literature
18. Problem Solving
• Ways to improve negotiations (by better understanding the
thinking of the other side)
• Ways to improve self-awareness in one’s own speaking and writing (and
correcting for less conscious speaking and writing “tells”)
• Ways to improve awareness of attitudes and consciousness of various
individuals and groups in various contexts
19. Theorizing / Hypothesizing
• Exploring writing of all types: speeches/dialogues, plays, songs, poetry,
interviews, tweetstreams, poststreams, and other works to theorize or
hypothesize (varies across domains and cases)
26. Some basic linguistic counts
• Function words: pronoun, ppron (personal pronouns), i, we, you, shehe,
they, ipron (impersonal pronouns), article, prep, auxverb, adverb, conj,
negate, verb, adj, compare, interrog, number, quant (quantifiers)
• Punctuation
• Informalisms: swearing, netspeak, assent, nonfluencies, filler words
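Counts like these are usually expressed as a share of the total words in a text. The Python sketch below shows that calculation with deliberately tiny, hypothetical word lists; TOY_CATEGORIES and category_percentages are illustrative names, and the real LIWC2015 dictionary is far larger and is not reproduced here.

```python
import re

# Hypothetical, deliberately tiny word lists for three function-word
# categories; these are NOT the LIWC2015 dictionary entries.
TOY_CATEGORIES = {
    "article": {"a", "an", "the"},
    "prep": {"in", "of", "on", "to", "with"},
    "negate": {"no", "not", "never"},
}

def category_percentages(text, categories=TOY_CATEGORIES):
    """Return each category's share of total words, as a percentage."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(tok in words for tok in tokens) / total
        for name, words in categories.items()
    }

sample = "The author of the chapter did not write to the editor."
scores = category_percentages(sample)
for name, pct in scores.items():
    print(f"{name}: {pct:.1f}%")
```

Reporting percentages rather than raw counts makes texts of different lengths comparable.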
27. Relevance of basic linguistic counts?
• Such linguistic counts may show a distinctive author hand and other
patterning
• They may be indicative of power relationships
• They may be indicative of patterns in certain genres
29. Convenience corpus used
• The corpus used is a “convenience” collection of seven recently completed
chapters (as an academic genre) by one author.
30. Exploratory plan
• Do the four summary score patterns (Analytic, Clout, Authentic, Tone) show a consistent hand? What do these show about the
writing style (for the particular genre)?
• Do the function word patterns across the seven works show one author hand? How
much variance is there?
• What do the psychometrics show about the focuses of the respective works (in
comparison to each other)?
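The "how much variance" question can be approached by exporting each work's category percentages and summarizing their spread. The Python sketch below computes the mean, population standard deviation, and coefficient of variation per category. The numbers are placeholders for three hypothetical documents, not the values from the seven works in this presentation, and spread_by_category is a hypothetical helper.

```python
from statistics import mean, pstdev

# Placeholder percentages for three hypothetical documents; these are NOT
# the values from the seven works discussed in this presentation.
scores_by_doc = {
    "doc_a": {"prep": 14.0, "article": 8.0},
    "doc_b": {"prep": 16.0, "article": 9.0},
    "doc_c": {"prep": 15.0, "article": 10.0},
}

def spread_by_category(scores):
    """Return mean, population standard deviation, and CV per category."""
    categories = sorted({cat for doc in scores.values() for cat in doc})
    summary = {}
    for cat in categories:
        values = [doc[cat] for doc in scores.values()]
        avg = mean(values)
        sd = pstdev(values)
        # Coefficient of variation: spread relative to the mean.
        cv = sd / avg if avg else float("nan")
        summary[cat] = {"mean": avg, "sd": sd, "cv": cv}
    return summary

summary = spread_by_category(scores_by_doc)
for cat, stats in summary.items():
    print(cat, {k: round(v, 3) for k, v in stats.items()})
```

A low coefficient of variation for a category across works would be consistent with a stable author hand on that measure, though with only a handful of documents any such reading is tentative.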
32. Universal scores across the seven academic articles
• High in the use of analytic observations and analyses (Analytic)
• Fairly high levels of Clout or language that focuses on power and influence
• Low levels of personal warmth and connecting through language
(Authentic), as is typical with academic writing
• One article in the positive sentiment range but the other six in the negative
sentiment range (Tone)
33. Patterns in Function Words across Seven Academic Research Articles
[Figure: line graph, value axis 0 to 45. Series: DemocracyAdvocacy, TransnationalFauxRomance, TransnationalPlasticsManagement, GlobalWhistleblowing, SARSCoV2, PrivacyMovement, PhysicalEffigiesSocialImagery]
34. Patterns in function words across seven academic research articles
• The line graph suggests some patterning across the function words.
• Prepositions seem popular with this author, along with verbs, numbers, and
conjunctions
• There are lower usages of adverbs, negations, interrogatives, and quantifiers
(comparatively speaking)
• The genre of writing may inform some of the selections of language, but
the patterns of writing (subconscious, unconscious) in function words may
carry over to other writings by the author
35. Punctuation Usage across the Seven Academic Research Articles
[Figure: chart, value axis 0 to 40. Categories: AllPunc, Period, Comma, Colon, SemiC, QMark, Exclam, Dash, Quote, Apostro, Parenth, OtherP. Series: DemocracyAdvocacy, TransnationalFauxRomance, TransnationalPlasticsManagement, GlobalWhistleblowing, SARSCoV2, PrivacyMovement, PhysicalEffigiesSocialImagery]
48. Relativity Language Indicators across Seven Academic Research Articles
[Figure: chart, value axis 0 to 14. Measures: time, space, motion, relativ. Articles: DemocracyAdvocacy, TransnationalFauxRomance, TransnationalPlasticsManagement, GlobalWhistleblowing, SARSCoV2, PrivacyMovement, PhysicalEffigiesSocialImagery]
52. References
• Hai-Jew, S. (2016). See Ya! Creating a custom spatial-based linguistic analysis dictionary from social media data sets to explore American renunciation of citizenship. SlideShare.
• Tausczik, Y.R., & Pennebaker, J.W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54.
53. Reference Manuals
• Pennebaker, J.W., Booth, R.J., Boyd, R.L., & Francis, M.E. (2015). Linguistic Inquiry and Word Count: LIWC2015. Operator’s Manual. Retrieved from https://s3-us-west-2.amazonaws.com/downloads.liwc.net/LIWC2015_OperatorManual.pdf.
• Pennebaker, J.W., Chung, C.K., Ireland, M., Gonzales, A., & Booth, R.J. (2007). The Development and Psychometric Properties of LIWC2007. Retrieved from https://www.liwc.net/LIWC2007LanguageManual.pdf.
54. Additional Research
• The academic research literature has thousands of articles, which show
various approaches using LIWC, various types of hypotheses, analyses, text
corpora, and applied domains.
• On Google Scholar, the disambiguated “LIWC” has 15,900 results.
55. Contact Information
• Dr. Shalin Hai-Jew
• Kansas State University
• ITS
• shalin@ksu.edu
• 785-532-5262