Mining and Analyzing Social Media
        HICSS 45 Tutorial – Part 2
                            Dave King
                       January 4, 2012
Agenda: This is how the slides are
organized
• Part 1
  –   Introduction – Bio, Resources, Social Media
  –   Data Mining – Processes and Example
  –   Text Mining – General Processes and Example
  –   Predicting the Future – The Portmanteaus
• Part 2
  – Sentiment Analysis
  – Social Network Analysis - Introduction

                   Copyright 2011 JDA Software Group, Inc. - CONFIDENTIAL
Sentiment Analysis:
What are your customers thinking?




Every hour of every day they share their opinions, issues, thoughts and
sentiments about brands, products, services and companies online.
Sentiment Analysis:
   Some Survey Data




Cone Communications:
http://www.coneinc.com/2011coneonlineinfluencetrendtracker

Sentiment Analysis:
Some Payoffs

 Marketing: Message
 Service: Response
 Products: Issues and Focus

   A form of Automated Text Categorization (ATC)



Sentiment Analysis:
   Some Examples
                          Cycling Community Responds
                                 @BicyclingMag
                                 @BikePortland
                                  @clevercycle
                                @cyclingreporter




GM runs Ad on 10/17/11




Sentiment Analysis:
Some Examples
                                                    Key Areas of Concern:

                                                         • Break in online link to
                                                           Mint.com
                                                         • Actionable Service
                                                           Breaks
                                                         • Outrage over “$50
                                                           limit on debit card
                                                           transactions”

Sentiment Analysis:
Defined
Text mining to classify subjective opinions in text into
categories like “positive” or “negative”, extracting various forms
of attitudinal information: sentiment, opinion, mood, and
emotion. Also called Voice of the Customer (VOC) or Opinion
Mining.




Sentiment Analysis:
Sample Software Vendors


Alterian          Etuma                            Lymbix                         Quantivo
Attensity         Evolve24                         Medallia                       Radian6 (SalesForce.com)
Brandwatch        General Sentiment                Meltwater                      SAS
Buzzdetector      IBM Cognos                       Meshlabs                       Sentiment Metrics
Clarabridge       IBM SPSS                         Netbase Solutions              SentMetrix
Crimson Hexagon   InfiniGraph                      OpenAmplify                    Traackr
Digimind          Kontagent                        Overtone                       Visible Technologies
DigitalPebble     Lexalytics                       PostRank (Google)              Wise Window
EffectCheck       Lithium




Sentiment Analysis:
Types
             • Sentiment Classification – document
               level, classified as positive or negative
             • Feature-based opinion – sentence
               level, determines which aspects of an
               object people like or dislike
             • Comparative sentence and
               relationship mining – sentence level
               comparisons of one object against
               another (to determine which is better
               than the other)
Sentiment Analysis:
Types
      • From one type to the next (classification, features,
        comparisons), it becomes more complex to identify
        and extract the information.

      • Once extracted, standard text mining techniques
        can be used to classify and compare the opinions

      • Simple techniques (like naïve Bayesian) often
        produce strong results (e.g. 80+% accuracy)

Sentiment Analysis:
Assumption
       An Opinion Lexicon that Expresses State




       Polar, Opinion-Bearing, and Sentiment Words and Phrases

Sentiment Analysis:
How do you know if it is “+” or “-”?
        plot : two teen couples go to a church party , drink and then drive .
        they get into an accident .
        one of the guys dies , but his girlfriend continues to see him in her life , and has nightmares .
        what's the deal ?
        watch the movie and " sorta " find out . . .
        critique : a mind-xxx movie for the teen generation that touches on a very cool idea , but
        presents it in a very bad package .
        which is what makes this review an even harder one to write , since i generally applaud films
        which attempt to break the mold , mess with your head and such ( lost highway & memento ) ,
        but there are good and bad ways of making all types of films , and these folks just didn't
        snag this one correctly .
        they seem to have taken this pretty neat concept , but executed it terribly .
        so what are the problems with the movie ?
        well , its main problem is that it's simply too jumbled .

        having not seen , " who framed roger rabbit " in over 10 years , and not remembering much besides
        that i liked it then , i decided to rent it recently .
        watching it i was struck by just how brilliant a film it is .
        aside from the fact that it's a milestone in animation in movies ( it's the first film to combine real
        actors and cartoon characters , have them interact , and make it convincingly real ) and a great
        entertainment it's also quite an effective comedy/mystery .
        while the plot may be somewhat familiar the characters are original , especially baby herman , and
        watching them together is a lot of fun .
        …
        `who framed roger rabbit' is a rare film .
        one that not only presented a great challenge to the filmmakers but one that can be enjoyed
        by the whole family ( although some very young viewers may be a little scared by judge doom ) .
        do yourself a favor and rent it , `p-p-p-p-please . "


Sentiment Analysis:
Other interests in Sentiment




Sentiment Analysis:
 Doing Simple Sentiment Analysis
             General Problem
Collection of Text Documents
   -> Automated Process for Classifying (???)
   -> Small Set of Predetermined categories: 1, 2, 3, …, n




Sentiment Analysis:
 Doing Simple Sentiment Analysis
                    General Answer
Collection of Text Documents
   -> Automated Process for Classifying
   -> Small Set of Predetermined categories: 1, 2, 3, …, n




             Inductive, supervised machine learning
               classification process and algorithm
Sentiment Analysis:
Doing Simple Sentiment Analysis
Real-World Text Data
   -> Document Consolidation
   -> Establish the Corpus
   -> Corpus Refinement (Token, Stem, Stop…)
   -> Feature Selection & Weighting
   -> Term-Doc-Matrix*

Training Process
   Documents with known Classification (Train, Test, Validate)
   -> Classification Algorithm
   -> Predetermined Categories: 1, 2, 3, …, n
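
The corpus refinement and term-document matrix steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the tutorial: the stop list, the `crude_stem` helper and the sample documents are assumptions made for the example, and a real pipeline would normally use a fuller stop list and a proper stemmer such as Porter's.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption, not taken from the slides).
STOP_WORDS = {"a", "an", "the", "is", "to", "be", "i", "of", "and"}

def tokenize(text):
    """Lowercase the text and keep runs of letters only."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Strip one common suffix; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            return token[: -len(suffix)]
    return token

def refine(text):
    """Tokenize, drop stop words, then stem what is left."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]

def term_document_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts term j in document i."""
    counts = [Counter(refine(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows

docs = ["I feel great now", "I feel sick today", "Today is going to be good"]
vocab, rows = term_document_matrix(docs)
for doc, row in zip(docs, rows):
    print(row, doc)
```

Each row of the matrix is one document and each column one refined term, which is the shape the classification algorithms on the training side expect.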
Sentiment Analysis:
Doing Simple Sentiment Analysis

            Classification Algorithms
            •      Naïve Bayes
            •      Decision Trees
            •      Nearest Neighbor (k-NN)
            •      Support Vector Machine
            •      Neural Nets (e.g. SOM)

Sentiment Analysis:
Doing Simple Sentiment Analysis

                                  Twitter Statistics
                                   •     ~200M registered users.
                                   •     ~50M users login every day
                                   •     Over 400K new users per day.
                                   •     400 million unique visitors per month.
                                   •     55% use their phone to tweet.
                                   •     Average 200 million tweets a day.
                                   •     600 million search queries per day
                                   •     75% of traffic from 3rd Party Apps
                                   •     60% of tweets from 3rd Party Apps


Sentiment Analysis:
Doing Simple Sentiment Analysis

                                                         Problem Features
                                                     •      Each tweet <= 140 characters
                                                            (avg. 10-15 words/message)
                                                     •      Heavy presence of non-alpha
                                                            symbols, abbrevs,
                                                            misspellings and slang
                                                     •      Tweets often include retweets
                                                            (original tweet repeated)
                                                     •      In spite of this – Tweets have
                                                            proven to be an interesting text
                                                            mining source (warts and all)
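
One way to cope with these features is to normalize each tweet before tokenizing it. The sketch below is an assumption about a reasonable cleanup, not the procedure used in the tutorial: `normalize_tweet` is a hypothetical helper that drops the repeated original after an old-style "RT @user:" marker, removes URLs and mentions, and squeezes long character runs.

```python
import re

def normalize_tweet(text):
    """Reduce a raw tweet to lowercase word tokens."""
    # Keep only the part before an old-style retweet marker, so the
    # repeated original tweet is not counted a second time.
    text = re.split(r"\bRT @\w+:", text, maxsplit=1)[0]
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)            # remove user mentions
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # "sooooo" -> "soo"
    return re.findall(r"[a-z']+", text.lower())

sample = "Lol i feel ya!!RT @Sweet_Sun_Shine: @joshaustin13 everything's up"
print(normalize_tweet(sample))
```

Whether to discard the retweeted text or keep it is a design choice; discarding it avoids double counting when the original tweet is also in the collection.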



Sentiment Analysis:
Doing Simple Sentiment Analysis
•   Zambonini, D. “Self-Improving Bayesian Sentiment Analysis for Twitter.” August 27,
    2010. danzambonini.com/self-improving-bayesian-sentiment-analysis-for-twitter.
•   Kalafatis, T. “The Sentiment on US Economy from Twitter.” October, 2009.
    lifeanalytics.blogspot.com/2009/10/sentiment-on-us-economy-from-twitter.html.
•   Pak, A. and P. Paroubek. “Twitter as a Corpus for Sentiment Analysis and Opinion
    Mining.” In Proceedings of the Seventh International Conference on Language
    Resources and Evaluation. May, 2010.
    lrec-conf.org/proceedings/lrec2010/slides/385.pdf
•   Sood, S. and L. Vasserman. “ESSE: Exploring Mood on the Web.” August 2009.
    lcs.pomona.edu/people/files/SoodCV.pdf.
•   Go, A. et al. “Twitter Sentiment Classification using Distant Supervision.”
    2009. stanford.edu/~alecmgo/papers/TwitterDistantSupervision09.pdf
•   Agarwal, A. et al. “Sentiment Analysis of Twitter Data.” 2011.
    www1.ccls.columbia.edu/~beck/pubs/lsm2011_full.pdf
Sentiment Analysis:
Doing Simple Sentiment Analysis
• Twitter used to get a total of 3 billion requests a day
  via its API
• API Calls for Public Tweets
   – http://search.twitter.com/search.json?q=%3A)+feel+feeling&rpp=100&page=1
   – http://api.twitter.com/1/trends/current.json?exclude=hashtags
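
The query string in the search call above can be built programmatically instead of by hand. The sketch below only constructs and decodes the URL and makes no network request. The endpoint and the `q`, `rpp` and `page` parameters are copied from the slide, which dates from 2011, so it should not be assumed that the endpoint is still available. `build_search_url` and `decode_query` are hypothetical helper names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as shown on the slide (2011); availability today is not assumed.
SEARCH_ENDPOINT = "http://search.twitter.com/search.json"

def build_search_url(query, results_per_page=100, page=1):
    """Return the search URL with a percent-encoded query string."""
    params = {"q": query, "rpp": results_per_page, "page": page}
    return SEARCH_ENDPOINT + "?" + urlencode(params)

def decode_query(url):
    """Recover the parameters from a URL, for checking what was encoded."""
    return parse_qs(urlsplit(url).query)

url = build_search_url(":) feel feeling")
print(url)
print(decode_query(url))
```

`urlencode` also escapes the closing parenthesis of the emoticon, so its output differs slightly from the hand-written URL on the slide while decoding to the same query.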



Sentiment Analysis:
Doing Simple Sentiment Analysis
Sample Tweet from API call:

{u'iso_language_code': u'en',
 u'to_user_name': None,
 u'to_user_id_str': None,
 u'from_user_id_str': u'59862385',
 u'text': u"Lol i feel ya!!RT @Sweet_Sun_Shine: @joshaustin13 everything's up, its the weekend baby!!!! :) and I plan on enjoying; how are you feeling?",
 u'from_user_name': u'B.Resilient\ue50c\ue50c\ue50c',
 u'profile_image_url': u'http://a3.twimg.com/profile_images/1650184586/joshaustin13_normal.jpg',
 u'id': 145274459127955456L,
 u'to_user': None,
 u'source': u'&lt;a href=&quot;http://www.echofon.com/&quot; rel=&quot;nofollow&quot;&gt;Echofon&lt;/a&gt;',
 u'id_str': u'145274459127955456',
 u'from_user': u'joshaustin13',
 u'from_user_id': 59862385,
 u'to_user_id': None,
 u'geo': None,
 u'created_at': u'Fri, 09 Dec 2011 22:51:44 +0000',
 u'metadata': {u'result_type': u'recent'}
}


Sentiment Analysis:
Doing Simple Sentiment Analysis
                                                 “Twitter Sentiment Classification
                                                 using Distant Supervision” (2009)
                                                 • Utilizes presence of emoticons “:)” and
                                                    “:(” to serve as surrogates for
                                                    classification as positive and
                                                    negative sentiment statements
                                                 • To construct the term-document
                                                    matrix relies on a list of positive and
                                                    negative key words from Twittratr,
                                                    counting number of key words that
                                                    appear in each tweet.
                                                 • 180K tweets collected for training
                                                    purposes between April and June
                                                    2009
                                                 • 80%+ accuracy in classification
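
The emoticon-as-label idea in the first bullet can be sketched as follows. This is an illustration under assumptions, not the paper's code: `emoticon_label` is a hypothetical helper, it uses only the two emoticons named above, and it skips tweets that contain both or neither. Removing the emoticon from the text is a choice made here so that a classifier has to learn from the remaining words.

```python
POSITIVE = (":)",)
NEGATIVE = (":(",)

def emoticon_label(text):
    """Return (label, text without emoticons), or None if the tweet is unusable."""
    has_pos = any(e in text for e in POSITIVE)
    has_neg = any(e in text for e in NEGATIVE)
    if has_pos == has_neg:          # no emoticon, or conflicting emoticons
        return None
    label = "positive" if has_pos else "negative"
    for e in POSITIVE + NEGATIVE:
        text = text.replace(e, " ")
    return label, " ".join(text.split())

tweets = [
    "its the weekend baby!!!! :) and I plan on enjoying",
    "stuck at work again :(",
    "just a plain status update",
]
training = [pair for pair in map(emoticon_label, tweets) if pair is not None]
print(training)
```

The tuples `POSITIVE` and `NEGATIVE` can be extended with other emoticon variants without changing the function.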

Sentiment Analysis:
Doing Simple Sentiment Analysis
  Counts     Type      List      Set
  Words      HF        8354     2169
             SF        7702     1996
             Total    16056     3469
  Alpha      HF        5917     1094
             SF        5433     1055
             Total    11350     1169
  Stop       HF        3425      992
             SF        3325      953
             Total     6750     1563
  Stem       HF        3425      895
             SF        3325      850
             Total     6750     1375
  Stem w/o   HF        2618      894
             SF        2516      849
             Total     5134     1374

  HF = Happy Face :)    SF = Sad Face :(


Sentiment Analysis:
Doing Simple Sentiment Analysis




      Happy Face :)                                                          Sad Face :(

Sentiment Analysis:
Doing Simple Sentiment Analysis

                P(H/D) = P(D/H) * P(H)/P(D)
                 H is the hypothesis and D is the data

                 P(H) is the prior probability of H: the probability that H is
                 correct before the data D are seen.

                 P(D/H) is the conditional probability of seeing the data D
                 given that the hypothesis H is true. This conditional
                 probability is called the likelihood.

                 P(D) is the marginal probability of D.

                 P(H/D) is the posterior probability: the probability that
                 the hypothesis is true, given the data and the previous
                 state of belief about the hypothesis.

                 [Image: Thomas Bayes]


Sentiment Analysis:
 Doing Simple Sentiment Analysis
                               Training Set
     Message                                   Category
     Love is great                             Positive
     I feel great now                          Positive
     I feel sick today                         Negative
     Great, today sucks                        Negative
     Today is going to be good                 Positive

P(Positive | Tweet) compared to P(Negative | Tweet), for a tweet whose
only word W1 is "great". P(Tweet) is the same for both categories, so it
can be dropped from the comparison.

     P(Pos | Tweet) = P(Pos) * P(W1/Pos) / P(Tweet)
     P(Pos | Tweet) ∝ P(Pos) * P(great/Pos)
     P(Pos | Tweet) ∝ (3/5) * (2/3) = .4

     P(Neg | Tweet) = P(Neg) * P(W1/Neg) / P(Tweet)
     P(Neg | Tweet) ∝ P(Neg) * P(great/Neg)
     P(Neg | Tweet) ∝ (2/5) * (1/2) = .2



Sentiment Analysis:
 Naïve Bayesian Classifier
                               Training Set
     Message                                   Category
     Love is great                             Positive
     I feel great now                          Positive
     I feel sick today                         Negative
     Great, today sucks                        Negative
     Today is going to be good                 Positive

P(Positive | Tweet) compared to P(Negative | Tweet)

     P(Neg | Tweet) ∝ P(Neg) * P(W1/Neg) * P(W2/Neg) * ...
     P(Neg | Tweet) ∝ P(Neg) * P(today/Neg) * P(sucks/Neg)
     P(Neg | Tweet) ∝ .4 * 1 * .5 = .2

     P(Pos | Tweet) ∝ P(Pos) * P(W1/Pos) * P(W2/Pos) * ...
     P(Pos | Tweet) ∝ P(Pos) * P(today/Pos) * P(sucks/Pos)
     P(Pos | Tweet) ∝ .6 * .33 * 0 = 0
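
The arithmetic above can be reproduced with a short script. As in the worked example, this sketch estimates P(word/category) as the fraction of training messages in the category that contain the word, and it multiplies only over the words present in the tweet. The `words` and `score` helpers are hypothetical names, and exact fractions are used so the results match the slide without rounding.

```python
import re
from fractions import Fraction

TRAINING = [
    ("Love is great", "Positive"),
    ("I feel great now", "Positive"),
    ("I feel sick today", "Negative"),
    ("Great, today sucks", "Negative"),
    ("Today is going to be good", "Positive"),
]

def words(text):
    """Set of lowercase words in a message."""
    return set(re.findall(r"[a-z]+", text.lower()))

def score(tweet, category):
    """P(category) times the product of P(word/category), unnormalized."""
    docs = [words(msg) for msg, cat in TRAINING if cat == category]
    result = Fraction(len(docs), len(TRAINING))          # prior
    for w in words(tweet):
        containing = sum(1 for d in docs if w in d)
        result *= Fraction(containing, len(docs))        # P(word/category)
    return result

for tweet in ("great", "today sucks"):
    print(tweet, {c: float(score(tweet, c)) for c in ("Positive", "Negative")})
```

The zero score for "today sucks" under Positive comes entirely from "sucks" never appearing in a positive training message, which is why practical classifiers usually smooth these estimates.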




Sentiment Analysis:
     Naïve Bayesian Classifier
              Training Set
            Token1 Token2 Token3 Token4 …   Class
Tweet1        1       0       0       1     Happy
Tweet2        1       0       1       0     Sad
Tweet3        0       0       0       1     Happy
Tweet4        0       0       1       0     Sad
…             …       …       …       …   …    …

              New Tweet
            Token1 Token2 Token3 Token4 …
Tweet         0       0       1       0

Training Set -> Naïve Bayesian Classifier -> Estimated Probabilities
New Tweet + Estimated Probabilities -> Decision Rule

Decision Rule: is ln[ P(H|Tweet) / P(S|Tweet) ] > 0 ??

ln[ P(H|Tweet) / P(S|Tweet) ] = ln[ P(H) / P(S) ] + Ʃ ln[ P(Wi|H) / P(Wi|S) ]

Proof left to reader
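
The log-odds rule follows from dividing the two naïve Bayes products, which cancels P(Tweet), and taking logarithms, which turns the products into sums. A minimal sketch using only the four token columns visible in the table is given below. One addition here is not on the slide: add-one (Laplace) smoothing with parameter `alpha`, included because Token3 never occurs in a Happy tweet and ln(0) is undefined. The sum runs over the tokens present in the new tweet, which is also an assumption about how the slide's sum is meant.

```python
import math

# The four visible token columns of the training table; elided columns are ignored.
TRAIN = [
    ([1, 0, 0, 1], "Happy"),
    ([1, 0, 1, 0], "Sad"),
    ([0, 0, 0, 1], "Happy"),
    ([0, 0, 1, 0], "Sad"),
]

def log_odds(tweet, train=TRAIN, alpha=1.0):
    """ln[P(H|Tweet)/P(S|Tweet)] with smoothed token probabilities."""
    happy = [row for row, cls in train if cls == "Happy"]
    sad = [row for row, cls in train if cls == "Sad"]
    total = math.log(len(happy) / len(sad))              # ln[P(H)/P(S)]
    for i, present in enumerate(tweet):
        if not present:
            continue
        p_h = (sum(r[i] for r in happy) + alpha) / (len(happy) + 2 * alpha)
        p_s = (sum(r[i] for r in sad) + alpha) / (len(sad) + 2 * alpha)
        total += math.log(p_h / p_s)                     # ln[P(Wi|H)/P(Wi|S)]
    return total

def classify(tweet):
    """Decision rule: Happy if the log-odds are positive, otherwise Sad."""
    return "Happy" if log_odds(tweet) > 0 else "Sad"

new_tweet = [0, 0, 1, 0]
print(log_odds(new_tweet), classify(new_tweet))
```

Working in logs also avoids numerical underflow when many small probabilities are multiplied.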

What is this number?




          4.74
Does this help?
     Frigyes Karinthy                                          Stanley Milgram




                                        6


       John Guare                                                 Duncan Watts
Six Degrees of Separation

 A fascinating game grew out of this discussion. One of us suggested performing
 the following experiment to prove that the population of the Earth is closer
 together now than they have ever been before. We should select any person from
 the 1.5 billion inhabitants of the Earth—anyone, anywhere at all. He bet us that,
 using no more than five individuals, one of whom is a personal acquaintance, he
 could contact the selected individual using nothing except the network of
 personal acquaintances.

                                                                    Frigyes Karinthy, Chains, 1929


  A             1            2                     3                      4         5




Sample Metric
From Social Network Analysis




          4.74
    Average Distance between
       Facebook Members
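
Average distance is the mean length of the shortest path between pairs of members. For a small unweighted network it can be computed exactly with a breadth-first search from every node, as in the sketch below. The `chain` graph is a hypothetical four-person example, not Facebook data, and a network with hundreds of millions of members would need approximate methods instead of this all-pairs approach.

```python
from collections import deque

def average_distance(graph):
    """Mean shortest-path length over all connected pairs of distinct nodes."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:                       # breadth-first search from source
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1             # reachable nodes other than source
    return total / pairs if pairs else 0.0

# Hypothetical chain of acquaintances: A - 1 - 2 - 3
chain = {"A": ["1"], "1": ["A", "2"], "2": ["1", "3"], "3": ["2"]}
print(average_distance(chain))
```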
Social Network Analysis:
Another Type of Analysis
                        Which Blogs are Similar?
        Term1   Term2    Term3    …        TermM                                 Blog1      Blog2   Blog3   …   BlogN
Blog1     1       0        0      …           1                   Blog1            -          1       0     …     1
Blog2     0       0        1      …           0                   Blog2            0          -       1     …     0
Blog3     0       1        0      …           1                   Blog3            1          1       -     …     0
…         …       …        …      …          …                    …                …          …       …     -     …
BlogN     0       0        0      …           1                   BlogN            1          0       1     …      -


          Cluster Analysis                                                                Graph/Network
           (e.g. K-Means)                                                                    Analysis
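
To move from the blog-by-term table to the blog-by-blog table, some rule has to decide when two blogs are linked. The sketch below uses one possible rule, Jaccard similarity of the blogs' term sets with a threshold; both the measure and the threshold are assumptions, as are the sample blogs. This rule gives a symmetric (undirected) network, whereas the blog-by-blog table above is not symmetric and may represent directed links.

```python
def jaccard(a, b):
    """Shared terms divided by all terms used by either blog."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_graph(blog_terms, threshold=0.5):
    """Link two blogs when their Jaccard similarity reaches the threshold."""
    names = sorted(blog_terms)
    return {
        x: [y for y in names
            if y != x and jaccard(blog_terms[x], blog_terms[y]) >= threshold]
        for x in names
    }

# Hypothetical blogs and terms, for illustration only.
blogs = {
    "Blog1": {"bike", "commute", "gear"},
    "Blog2": {"bike", "gear", "race"},
    "Blog3": {"recipe", "baking"},
}
graph = similarity_graph(blogs)
print(graph)
```

The resulting adjacency lists can be passed to graph or network analysis, while the original term sets remain available for cluster analysis.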




Social Network Analysis:
Another Type of Analysis
                        Which Blogs are Similar?
        Word1   Word2   Word3    …        WordM
Blog1    1       0       0       …          1
Blog2    0       0       1       …          0
Blog3    0       1       0       …          1
…        …       …       …       …         …
BlogN    0       0       0       …          1


          Cluster Analysis
           (e.g. K-Means)

For a detailed description:
http://www.slideshare.net/daveking63/text-mining-and-analytics-v6-p2




Social Network Analysis:
Definitions
Network – Collection of things and their
linkages to one another.

Social Network – Collection of humans, roles,
groups, and/or institutions and their social
relationships with one another.

Social Network Analysis (SNA) – Application of
Graph Theory or Network Science to the study of
social relationships and connections.

Social Network Analysis:
Early Efforts




        Jacob Moreno: Sociometry and the Sociogram

Social Network Analysis:
Definitions
                                                “Ten years ago, the field of
                                                Social Network Analysis
                                                was a scientific backwater.
                                                We were the misfits,
                                                rejected from both
                                                mainstream sociology and
                                                mainstream computer
                                                science.”

Social Network Analysis:
Exploding Commercial Interest




Social Network Analysis:
So What Happened?


 Small Data (manually collected)            ->  Flat-files / in-memory computation
 Medium Data (data snapshots from APIs)     ->  SQL Databases
 Big Data (real-time social media data)     ->  Big Data Approaches




Social Network Analysis:
When things were simpler …
      2005: example networks with N=26 and N=80




Social Network Analysis:
and then …
• Growth in Social Media
• Access to SM Network Data
• Availability of Open Source Tools

[Figures: example networks from 2011, N~1400, N~3.5K and N~90K]



Social Network Analysis:
and now …
[Figures: example networks with N=80K, N=20M and N=721M]




Social Network Analysis:
 Key Elements
Graph or Network
The set of vertices/nodes, edges/links and the relationship/function
connecting them: Graph [V, E, f].

Vertices or Nodes
The “things”

Edges or Links
The “relationships”

[Figure: example graph with vertices (nodes) A, B, C and D joined by edges (links)]
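
A minimal sketch of the definition in code, assuming an undirected graph. The node labels A to D follow the slide's figure, but the specific edges and the helper name `build_adjacency` are hypothetical and chosen only for illustration.

```python
# Vertices ("things") and edges ("relationships").
# The four edges below are assumed; the slide does not list them.
vertices = {"A", "B", "C", "D"}
edges = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}


def build_adjacency(vertices, edges):
    """Return a dict mapping each vertex to the set of its neighbours."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)  # undirected: record the link in both directions
    return adjacency


adjacency = build_adjacency(vertices, edges)
```

An adjacency dictionary like this is one common in-memory form; it makes "who is linked to whom" a single lookup.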

Social Network Analysis:
Types of Edges or Links
Undirected, Unweighted: Facebook Friends
Directed, Unweighted: Twitter Followers
Undirected, Weighted: Facebook Friends
Directed, Weighted: Email Network (edge weights in the figure: 100, 60, 5, 70, 20, 10)

[Figure: each edge type drawn on nodes A, B and C]
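
The four edge types can be sketched with one small helper, assuming a dict-of-dicts store where `graph[u][v]` holds the weight. The helper `add_edge` is hypothetical, and which pair of nodes carries the weights 100 and 60 is assumed, since the figure's layout is not recoverable here.

```python
def add_edge(graph, u, v, weight=1, directed=False):
    """Store an edge as graph[u][v] = weight; mirror it when undirected."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})
    if not directed:
        graph[v][u] = weight


# Undirected, unweighted: a friendship holds in both directions.
friends = {}
add_edge(friends, "A", "B")

# Directed, unweighted: A follows B, but B need not follow A.
followers = {}
add_edge(followers, "A", "B", directed=True)

# Directed, weighted: message counts can differ by direction (assumed values).
email = {}
add_edge(email, "A", "B", weight=100, directed=True)
add_edge(email, "B", "A", weight=60, directed=True)
```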

Social Network Analysis:
Types of Networks

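Unimodal: nodes of a single type (P1, P2, P3)
Bimodal: nodes of two types (E1 linked to P1 and P2)
Multiplex: several kinds of links among the same nodes (P1, P2, P3), e.g. Follows, Replies To, Mentions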




Social Network Analysis:
Types of Network Analysis
“Whole” Network: all nodes and the links among them (P1 to P6)
“Ego-Centric” Network: a focal node (Ego) and the nodes linked to it (Alters: P2 to P6)
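
As a rough illustration of the difference, the sketch below extracts an ego-centric network from a whole network. The six-node adjacency and the function name `ego_network` are assumptions made for the example, not the slide's exact figure.

```python
# Assumed whole network on P1..P6 (undirected adjacency sets).
whole = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P1", "P2"},
    "P4": {"P2", "P5", "P6"},
    "P5": {"P4"},
    "P6": {"P4"},
}


def ego_network(adjacency, ego, include_alter_ties=True):
    """Return the sub-network made of ego and its alters.

    With include_alter_ties=True, links among the alters are kept as well;
    otherwise only the ego-alter links are kept.
    """
    nodes = set(adjacency[ego]) | {ego}
    sub = {}
    for node in nodes:
        if node == ego or include_alter_ties:
            sub[node] = adjacency[node] & nodes
        else:
            sub[node] = {ego}
    return sub


ego_p2 = ego_network(whole, "P2")
star_p2 = ego_network(whole, "P2", include_alter_ties=False)
```

Keeping the ties among alters matters for measures such as clustering, which ask how many of ego's contacts also know each other.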

Social Network Analysis:
   Node Metrics (Centrality)
Degree
  Definition: Number of edges or links. In-degree: links in. Out-degree: links out.
  Interpretation: How connected is a node? How many people can this person reach directly?
  Reasoning: Higher probability of receiving and transmitting information flows in the network. Such nodes are considered to have influence over a larger number of nodes and/or to be capable of communicating quickly with the nodes in their neighborhood.

Betweenness
  Definition: Number of times a node or vertex lies on the shortest path between 2 nodes, divided by the number of all the shortest paths.
  Interpretation: How important is a node in terms of connecting other nodes? How likely is this person to be the most direct route between two people in the network?
  Reasoning: Degree to which a node controls the flow of information in the network. Those with high betweenness function as brokers. Useful for showing where a network is vulnerable.

Closeness
  Definition: 1 over the average distance between a node and every other node in the network.
  Interpretation: How easily can a node reach other nodes? How fast can this person reach everyone in the network?
  Reasoning: Measure of reach. Importance is based on how close a node is located with respect to every other node in the network. Such nodes can reach, or be reached by, most other nodes in the network through geodesic paths.

Eigenvector
  Definition: Proportional to the sum of the eigenvector centralities of all the nodes directly connected to it.
  Interpretation: How important, central, or influential are a node’s neighbors? How well is this person connected to other well-connected people?
  Reasoning: Evaluates a player's popularity. Identifies centers of large cliques. A node with more connections to higher-scoring nodes is more important.
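
The four measures can be sketched in plain Python on a small undirected graph. The example graph (a triangle A-B-C with a tail C-D) is assumed for illustration and is not the slide's network; betweenness is returned as a raw count of shortest paths through each node (Brandes' algorithm, not normalized), and the eigenvector routine uses a shifted power iteration, which is one of several ways to compute it.

```python
from collections import deque

# Assumed example: triangle A-B-C plus a tail C-D.
graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}


def degree(adj):
    """Number of links attached to each node."""
    return {v: len(adj[v]) for v in adj}


def closeness(adj):
    """1 over the average shortest-path distance (connected graph assumed)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives unweighted distances
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        scores[source] = (len(adj) - 1) / sum(dist.values())
    return scores


def betweenness(adj):
    """Shortest paths passing through each node, split evenly across ties."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back toward s
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # undirected: pairs counted twice


def eigenvector(adj, iterations=200):
    """Power iteration on x + A x; the shift avoids oscillation on bipartite graphs."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x
```

On this graph C comes out on top for every measure, since it has the most links, sits on every path to D, and is one step from everyone.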




Social Network Analysis:
      Centrality – Who is most important?
[Figure: example network of 19 nodes, A to S, with callouts labeled Eigen, Betw, Close and Deg]

Node  Degree  Normed Degree  Betweenness  Closeness  Eigenvector
A     3       0.17           0.00         0.29       0.29
B     4       0.22           0.01         0.30       0.36
C     2       0.11           0.03         0.35       0.18
D     6       0.33           0.04         0.31       0.46
E     3       0.17           0.00         0.29       0.30
F     4       0.22           0.11         0.36       0.35
G     5       0.28           0.19         0.37       0.43
H     5       0.28           0.58         0.45       0.28
I     4       0.22           0.53         0.46       0.13
J     7       0.39           0.43         0.43       0.12
K     3       0.17           0.00         0.32       0.06
L     3       0.17           0.01         0.33       0.05
M     3       0.17           0.21         0.33       0.04
N     3       0.17           0.03         0.38       0.07
O     2       0.11           0.00         0.31       0.05
P     3       0.17           0.03         0.38       0.08
Q     2       0.11           0.11         0.26       0.01
R     1       0.06           0.00         0.32       0.07
S     1       0.06           0.00         0.21       0.00

Social Network Analysis:
    Cohesion – How well connected?
Density
  Definition: Ratio of the number of edges in the network over the total number of possible edges between all pairs of nodes.
  Interpretation: How well connected is the overall network?
  Reasoning: A perfectly connected network is called a "clique" and has a density of 1.

Average Degree
  Definition: Average number of links each node or vertex has.
  Interpretation: How well connected are the nodes on average?
  Reasoning: The higher the average, the better connected the members are.

Average Path Length (Distance)
  Definition: Average number of edges or links between any two nodes (along the shortest path).
  Interpretation: On average, how far apart are any two nodes?
  Reasoning: This is synonymous with the "degrees of separation" in a network.

Diameter
  Definition: Longest shortest path between any two nodes.
  Interpretation: At most, how long will it take to reach any node in the network? Sparse networks usually have greater diameters.
  Reasoning: Measure of the reach of the network.

Clustering
  Definition: A node's clustering coefficient is the density of its 1.5 degree egocentric network (ratio of connections among ego's alters). For the entire network it is the average of all the coefficients for the individual nodes.
  Interpretation: What proportion of ego's alters are connected? More technically, how many nodes form triangular subgraphs with their adjacent nodes?
  Reasoning: Measures certain aspects of "cliquishness": the proportion of your friends that are also friends with each other. Another way to measure it (in an undirected graph) is the ratio of the number of times that two links emanating from the same node are also linked.
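
These cohesion measures can be sketched as follows, assuming a connected, undirected graph stored as adjacency sets. The example graph (a triangle A-B-C with a tail C-D) and all function names are illustrative assumptions; returning `None` for nodes with fewer than two neighbours is one convention for the undefined case.

```python
from collections import deque
from itertools import combinations

# Assumed example: triangle A-B-C plus a tail C-D.
graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}


def bfs_distances(adj, source):
    """Unweighted shortest-path distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def density(adj):
    """Edges present divided by edges possible among all pairs of nodes."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2  # each edge is stored twice
    return 2 * m / (n * (n - 1))


def average_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)


def all_distances(adj):
    """Shortest-path lengths for every ordered pair of distinct nodes."""
    lengths = []
    for source in adj:
        dist = bfs_distances(adj, source)
        lengths.extend(d for node, d in dist.items() if node != source)
    return lengths


def average_path_length(adj):
    lengths = all_distances(adj)
    return sum(lengths) / len(lengths)


def diameter(adj):
    """Longest of the shortest paths."""
    return max(all_distances(adj))


def clustering(adj, v):
    """Share of v's neighbour pairs that are themselves linked (None if undefined)."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return None
    possible = len(nbrs) * (len(nbrs) - 1) / 2
    actual = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return actual / possible
```

Here C has three neighbours but only one linked pair among them (A and B), so its coefficient is 1/3, while A and B each score 1.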




Social Network Analysis:
Network Metrics (Centralization)
        Measure                                  Definition
        Degree Centralization                    Variation in the degrees of vertices divided by the
                                                 maximum degree variation that is possible in a
                                                 network of the same size
        Betweenness Centralization               Variation in the betweenness centrality of vertices
                                                 divided by the maximum variation in betweenness
                                                 centrality scores possible in a network of the same
                                                 size
        Closeness Centralization                 Variation in the closeness centrality of vertices
                                                 divided by the maximum variation in closeness
                                                 centrality scores possible in a network of the same
                                                 size
        Eigenvector Centralization               Variation in the eigenvector centrality of vertices
                                                 divided by the maximum variation in eigenvector
                                                 centrality scores possible in a network of the same
                                                 size

   1.   Variation is the summed absolute differences between centrality scores of the
        vertices and the maximum centrality score among them.
   2.   Network is more centralized if the vertices vary more with respect to their
        centrality.
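
For the degree case, the definition can be sketched as below. The denominator (n - 1)(n - 2) is the variation of a star on n nodes, the most centralized undirected simple graph of that size; the function name and the two test graphs are assumptions made for illustration.

```python
def degree_centralization(adj):
    """Summed gaps to the maximum degree, divided by the largest possible sum.

    Assumes an undirected simple graph with at least three nodes.
    """
    n = len(adj)
    degrees = [len(nb) for nb in adj.values()]
    variation = sum(max(degrees) - d for d in degrees)
    max_variation = (n - 1) * (n - 2)  # reached by a star on n nodes
    return variation / max_variation


# A star: one hub linked to four leaves.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

# A ring: every node has exactly two neighbours.
ring = {"a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
        "d": {"c", "e"}, "e": {"d", "a"}}
```

The star scores 1 and the ring scores 0, the two extremes of the measure.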

Social Network Analysis:
      Cohesion – How well connected?
[Figure: example network of 19 nodes, A to S]

Measure                      Value
Average Degree               3.37
Density                      0.19
Average Distance             3.06
Diameter                     8
Degree Centralization        0.22
Betweenness Centralization   0.48
Closeness Centralization     0.27
Eigenvector Centralization   0.56
Clustering Coefficient       0.43

Node  Clustering
A     0.67
B     0.67
C     0.00
D     0.40
E     1.00
F     0.50
G     0.50
H     0.10
I     0.33
J     0.29
K     0.67
L     0.67
M     0.33
N     0.67
O     1.00
P     0.67
Q     0.00
R     NA
S     NA

Social Network Analysis:
Some General Tendencies
                      • Small diameters and small average path
                        lengths
                      • High clustering coefficients relative to
                        random processes
                      • Rate of clustering among the higher-
                        degree nodes decreases with degree
                      • Fat tailed degree distributions relative to
                        random processes
                      • Hard to find networks that actually follow
                        a strict power law
                      • Positive assortativity and high degrees of
                        homophily at least in social networks
Social Network Analysis:
   Is it really a small world?




http://www.touchgraph.com/navigator
Social Network Analysis:
Is it really a small world?

      Relation      Steps    50 friends each      100 friends each
      Ego                                  1                     1
      Friends         1                   50                   100
      FoF             2                2,500                10,000
      FoFoF           3              125,000             1,000,000
      FoFoFoF         4            6,250,000           100,000,000
      FoFoFoFoF       5          312,500,000        10,000,000,000
      FoFoFoFoFoF     6       15,625,000,000     1,000,000,000,000



                  The naïve view
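
The arithmetic behind the table is plain exponentiation, sketched below with a hypothetical helper `naive_reach`. It is naïve because it assumes every person has the same number of friends and that no two friend lists overlap.

```python
def naive_reach(friends_per_person, steps):
    """People reachable at exactly `steps` hops if no friend lists overlap."""
    return friends_per_person ** steps


# Reproduce both columns of the table for steps 1 to 6.
reach_50 = [naive_reach(50, k) for k in range(1, 7)]
reach_100 = [naive_reach(100, k) for k in range(1, 7)]
```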

Social Network Analysis:
Is it really a small world?




Social Network Analysis:
Is it really a small world?
      The emergence of online social networking services
      over the past decade has revolutionized how social
      scientists study the structure of human relationships [1].
      As individuals bring their social relations online, the
      focal point of the internet is evolving from being a
      network of documents to being a network of people,
      and previously invisible social structures are being
      captured at tremendous scale and with
      unprecedented detail.




Social Network Analysis:
Study Population

         Active*       Global                                US
         Members         721M                                 149M
         Friends         68.7B                                15.9B
         Aver. Friends    190                                   214
         Total Pop        6.9B                                260M

     * Active: accessed within 28 days of May ’11, at least one friend,
       over 13 years of age

Social Network Analysis:
Degree Distribution

                                                                       N = 721M
                                                                       F = 69B



             Encouraged
             Up to 20
                                      Median
                                      ~ 99




Social Network Analysis:
Distance Distribution
         Average                                           Average
         4.7                                               4.3




                                            World 92% 99.6%
                                            US    96% 99.7%




Social Network Analysis:
Connected Components



                                         2000                      99.91% of
                                        Members                    Members




Connected component – a set of individuals in which every pair is
connected by at least one path through the network
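
Finding connected components can be sketched with a breadth-first search over an undirected adjacency dictionary. The six-node example and the function name are assumptions for illustration only.

```python
from collections import deque


def connected_components(adj):
    """Return the components as sets of nodes, largest first."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:  # expand outward until no new node is reachable
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Assumed example: a chain of three, a pair, and one isolated member.
members = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"},
           "D": {"E"}, "E": {"D"}, "F": set()}
components = connected_components(members)
largest_share = len(components[0]) / len(members)
```

The share of members in the largest component is the kind of figure the slide reports.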

Social Network Analysis:
Cohesion



         14% for 100                                                   100 friends –
                                                                       28K unique FoFs;
                                                                       40K non-unique FoFs




You’re friends with a significant                     Feld: your friends have more friends
fraction of your friends’ friends                     than you (same with sex partners)
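
Feld's observation can be checked on a toy network by comparing the average degree with the average degree of a friend. The star network and both function names below are assumptions chosen to make the gap easy to see.

```python
def mean_degree(adj):
    """Average number of friends per person."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


def mean_friend_degree(adj):
    """Average degree of a friend, taken over every (person, friend) pair."""
    friend_degrees = [len(adj[f]) for person in adj for f in adj[person]]
    return sum(friend_degrees) / len(friend_degrees)


# Assumed example: one popular hub and four people who only know the hub.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
```

Popular people appear in many friend lists, so they are counted more often when averaging over friends; that is why the second number is the larger one.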



Social Network Analysis:
Correlation and Assortativity




Social Network Analysis:
Birds of a Feather - Homophily




         84% in the same country
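
A share like this can be computed as the fraction of links whose two endpoints have the same attribute value. The sketch below uses made-up nodes and country codes; `same_attribute_share` is a hypothetical helper, not the method of the study being summarized.

```python
def same_attribute_share(edges, attribute):
    """Fraction of edges whose endpoints share the same attribute value."""
    same = sum(1 for u, v in edges if attribute[u] == attribute[v])
    return same / len(edges)


# Assumed toy data: five friendships and each person's country.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("c", "d")]
country = {"a": "US", "b": "US", "c": "US", "d": "EG", "e": "EG"}
share = same_attribute_share(edges, country)
```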
Social Network Analysis:
“Revolution 2.0 will not be Televised”




             Tweet Rate – Feb. 24-25, 2011, Tahrir Square




It will be Tweeted & Retweeted
Social Network Analysis:
“Revolution 2.0 will not be Televised”
            1% Feed from the two day period




             Nodes = 25178 Links = 32471


It will be Tweeted & Retweeted
Social Network Analysis:
Network Features
    Average Degree = 2.58                                                            Average Distance = 5.40
    Min Degree = 0                                                                   Min Distance = 1
    Max Degree = 729                                                                 Diameter = 22




                            Measure                                    Value
                            Density                                    0.0001
                            Degree Centralization                       0.029
                            Betweenness Centralization                  0.076
                            Closeness Centralization                     WC
                            Eigenvector Centralization                  0.724
                            Clustering Coefficient                     0.0045
                            Number of Components                        3122
                            Size of Largest Component                   17762
                            % in Largest Component                     70.50%
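
Several of these figures follow directly from the node and link counts. The sketch below assumes the links are treated as undirected, which is consistent with the reported values but is not stated on the slide.

```python
nodes = 25178  # reported node count
links = 32471  # reported link count
max_degree = 729  # reported maximum degree

# Each undirected link adds one to the degree of two nodes.
average_degree = 2 * links / nodes

# Links present divided by links possible among all pairs of nodes.
density = 2 * links / (nodes * (nodes - 1))

# A node's degree divided by the most neighbours it could have.
normalized_max_degree = max_degree / (nodes - 1)
```

Rounded, these give 2.58, 0.0001 and 0.029, matching the average degree and density shown here and the normalized degree listed for the highest-degree node.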


                   It will be Tweeted & Retweeted
Social Network Analysis:
Node and Network Metrics
       Node       Degree NormDeg Betw     Close     Eigen     Clust
     Ghonim        729     0.029 0.076    0.214     0.509     0.001
  Dima_Khatib      506     0.020 0.050    0.208     0.216     0.001
  ShababLibya      493     0.020 0.048    0.206     0.219     0.003
 monaeltahawy      436     0.017 0.038    0.204     0.198     0.002
     AJEnglish     359     0.014 0.030    0.195     0.085     0.001
      bencnn       306     0.012 0.021    0.193     0.090     0.001
      AJELive      283     0.011 0.017    0.191     0.065     0.001
    3arabawy       273     0.011 0.033    0.200     0.092     0.002
      cnnbrk       256     0.010 0.015    0.182     0.031     0.000
     AJArabic      238     0.009 0.018    0.192     0.050     0.002
  Sandmonkey       227     0.009 0.020    0.198     0.096     0.003
SultanAlQassemi    216     0.009 0.014    0.189     0.045     0.001
       alaa        204     0.008 0.020    0.202     0.088     0.007
   alarabiya_ar    169     0.007 0.009    0.180     0.035     0.001
  yoanisanchez     161     0.006 0.012    0.149     0.001     0.001
     AymanM        160     0.006 0.009    0.190     0.050     0.003
      acarvin      159     0.006 0.014    0.200     0.092     0.008
iyad_elbaghdadi    146     0.006 0.008    0.182     0.043     0.002
    monasosh       140     0.006 0.009    0.192     0.050     0.004
 ChangeInLibya     134     0.005 0.010    0.192     0.063     0.011



          It will be Tweeted & Retweeted
Social Network Analysis:
Node and Network Metrics




   It will be Tweeted & Retweeted
Social Network Analysis:
Egocentric Analysis
             Ghonim            Measure                        Bieber
               730             Vertices                         14
               970              Edges                           13
              0.004            Density                         0.140
              2.660        Average Degree                      1.860
              1.990       Average Distance                     1.860
              2.000           Diameter                         2.000
              0.999     Degree Centralization                  1.000
              0.995 Betweenness Centralization                 1.000
              0.999   Closeness Centralization                 1.000
              0.990  EigenVector Centralization                1.740
              0.003      Cluster Coefficient                   0.000
                1     Number of Components                       1
               730   Size of Largest Component                  14
              100%    % in Largest Component                   100%




  It will be Tweeted & Retweeted
Social Network Analysis:
Subcommunities of US Political Blogs
                               • Single day snapshot of a
                                 snowball sample of
                                 political blogs (N=1490)
                               • Manually assigned as
                                 Liberal or Conservative
                               • Focus on blogrolls and
                                 front page citations
                               • A primary question:
                                 Cyberbalkanization?
Social Network Analysis:
Subcommunities of Political Blogs




Social Network Analysis:
Political Blogs
                                         N=1490
                                      Edges = 16715




          N=758                                                                     N=732
       Edges = 7301                                                              Edges = 7839




         Liberals                                                              Conservatives



Social Network Analysis:
        Political Blogs – Cyberbalkanization?
Measure      Liberal   Conservative
N                758            732
Out Links        74%            84%
In Links         67%            82%

Viewpoint                           Lib In Links Cons In Links Total In Links %Lib         %Cons
dailykos.com                                 292            46            338        86%       14%
www.talkingpointsmemo.com                    242            22            264        92%         8%
atrios.blogspot.com                          230            39            269        86%       14%
www.washingtonmonthly.com                    165            36            201        82%       18%
www.wonkette.com                              83            30            113        73%       27%
www.juancole.com                             149            16            165        90%       10%
yglesias.typepad.com/matthew                 104            24            128        81%       19%
www.crookedtimber.org                         81            19            100        81%       19%
www.mydd.com                                 107             8            115        93%         7%
www.oliverwillis.com                          97            20            117        83%       17%
blog.johnkerry.com                            21             2             23        91%         9%
www.pandagon.net                             118             5            123        96%         4%
www.talkleft.com                             126            15            141        89%       11%
digbysblog.blogspot.com                      115             3            118        97%         3%
www.politicalwire.com                         87            16            103        84%       16%
www.j-bradford-delong.net                     98            11            109        90%       10%
www.prospect.org/weblog                      102            11            113        90%       10%
americablog.blogspot.com                      64             5             69        93%         7%
www.theleftcoaster.com                        78             4             82        95%         5%
www.jameswolcott.com                          74             6             80        93%         8%
Total Liberal                               2433           338           2771        88%       12%
www.powerlineblog.com                         26           195            221        12%       88%
instapundit.com                               43           234            277        16%       84%
www.littlegreenfootballs.com/weblog           10           171            181         6%       94%
www.hughhewitt.com                            11           146            157         7%       93%
www.andrewsullivan.com/index.php              59            86            145        41%       59%
www.captainsquartersblog.com/mt                5           117            122         4%       96%
www.wizbangblog.com                           14           125            139        10%       90%
www.indcjournal.com                            6            60             66         9%       91%
www.michellemalkin.com                        10           191            201         5%       95%
blogsforbush.com                               4           208            212         2%       98%
www.allahpundit.com                            2            37             39         5%       95%
belmontclub.blogspot.com                       3            93             96         3%       97%
realclearpolitics.com                         13           104            117        11%       89%
volokh.com                                    27            80            107        25%       75%
timblair.spleenville.com                       7            80             87         8%       92%
windsofchange.net                             16            65             81        20%       80%
www.vodkapundit.com                            9            97            106         8%       92%
www.rogerlsimon.com                            6            74             80         8%       93%
www.deanesmay.com                              8            79             87         9%       91%
mypetjawa.mu.nu                                0            51             51         0%      100%
Total Conservative                           279          2293           2572        11%       89%
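
The %Lib and %Cons columns are each blog's in-links from one side divided by its total in-links. A minimal sketch of that arithmetic, using three rows copied from the table (the function name in_link_shares is illustrative, not from the slides):

```python
def in_link_shares(lib_in, cons_in):
    """Return (%Lib, %Cons) as whole-number percentages of total in-links."""
    total = lib_in + cons_in
    if total == 0:
        return (0, 0)
    return (round(100 * lib_in / total), round(100 * cons_in / total))

# (Lib In Links, Cons In Links) for three blogs, copied from the table above.
rows = {
    "dailykos.com": (292, 46),
    "instapundit.com": (43, 234),
    "mypetjawa.mu.nu": (0, 51),
}

shares = {blog: in_link_shares(*counts) for blog, counts in rows.items()}
for blog, (pct_lib, pct_cons) in shares.items():
    print(f"{blog:20s} %Lib={pct_lib:3d}  %Cons={pct_cons:3d}")
```

A blog whose in-links come mostly from its own side scores high on one column and low on the other, which is the pattern the slide title asks about.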



Social Networks:
  Political Blogs - Metrics

Measure                       Liberal   Conservative   Total
Density                         0.02        0.03        0.01
Number of Components             188         107         268
Size of Largest Component        569         569        1222
% in Largest Component         75.10       84.97       82.01
Minimum Degree                     0           0           0
Maximum Degree                   305         296         351
Average Degree                 19.26       21.42       22.44
Degree Centralization           0.38        0.38        0.22
Diameter                           6           7           8
Average Distance                2.51        2.51        2.74
Betweenness Centralization      0.10        0.16        0.06
EigenVector Centralization      0.23        0.26        0.22
Cluster Coefficient             0.31        0.20        0.22
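
Several of these metrics follow directly from node and edge counts. The sketch below is a rough illustration, assuming the network is treated as undirected (which is consistent with the average-degree row); the function names and the small toy graph are made up for the example and are not the blog data.

```python
from collections import deque

def average_degree(n_nodes, n_edges):
    """Mean degree of an undirected graph: each edge adds 2 to the degree sum."""
    return 2 * n_edges / n_nodes

def density(n_nodes, n_edges):
    """Share of the n*(n-1)/2 possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

def components(adj):
    """Connected components of an adjacency dict, found by breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            v = queue.popleft()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps

# Node and edge counts quoted on the earlier Political Blogs slide.
blog_counts = {"Liberal": (758, 7301), "Conservative": (732, 7839), "Total": (1490, 16715)}
aver_deg = {k: round(average_degree(n, e), 2) for k, (n, e) in blog_counts.items()}
total_density = density(1490, 16715)

# Toy graph (not the blog data): a triangle with one pendant node, plus one isolate.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
toy_comps = components(toy)
largest_share = max(len(c) for c in toy_comps) / len(toy)
```

With the slide's counts, 2 * Edges / N reproduces the average-degree row for all three columns.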




Social Network Analysis:
 Political Blogs – Empirical Comparisons

                                          Blog        Citation                                   Twitter
Measure                                   Political   Biology     Economics   Math      Physics   UK Journalists   Egypt Twitter
Number of Nodes                           1,490       1,520,521   81,217      253,339   52,909    523              25,178
Average Degree                            22.4        15.5        1.7         3.9       9.3       88               3
Average Path Length                       2.74        4.9         9.5         7.6       6.2       1.88             5.4
Diameter of the Largest Component         8           24          29          27        20        4                22
Overall Clustering                        0.22        0.09        0.16        0.15      0.45      0.41             0.004
Fraction of Nodes in Largest Component    0.82        0.92        0.41        0.82      0.85      0.99             0.7




Social Network Analysis:
  Political Blogs – Model Comparisons
                                         Political   Bernoulli        Deg Conditional   Small World      Pref Attachment
Measure                                  Blogs       2.5%    97.5%    2.5%    97.5%     2.5%    97.5%    2.5%    97.5%
Number of Components                     268         1       1        1       1         1       1        96      134
Fraction of Nodes in Largest Component   0.82        1.00    1.00     1.00    1.00      1.00    1.00     0.91    0.94
Diameter of the Largest Component        8           4       4        7       9         4       5        7       9
Average Path Length                      2.74        2.61    2.63     3.29    3.36      2.98    3.01     3.07    3.14
Overall Clustering                       0.226       0.017   0.018    0.029   0.031     0.355   0.372    0.095   0.109
Betweenness                              0.065       0.002   0.003    0.010   0.021     0.003   0.004    0.038   0.064
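
The 2.5% and 97.5% columns read as percentile bounds over repeated random draws from each model. The slides do not say how the figures were produced, so the sketch below is only one plausible procedure: it draws many Bernoulli graphs and takes the 2.5th and 97.5th percentiles of one statistic (overall clustering, computed here as transitivity). The graph size (60 nodes), line probability (0.1), number of draws (200), seed and function names are all assumptions chosen to keep the example fast.

```python
import random
import statistics
from itertools import combinations

def bernoulli_graph(n, p, rng):
    """Include each of the n*(n-1)/2 possible lines independently with probability p."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def transitivity(adj):
    """Closed connected triples divided by all connected triples."""
    closed = triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0

def simulated_interval(n, p, reps, seed=0):
    """2.5th and 97.5th percentiles of transitivity over `reps` random graphs."""
    rng = random.Random(seed)
    values = [transitivity(bernoulli_graph(n, p, rng)) for _ in range(reps)]
    cuts = statistics.quantiles(values, n=40)  # 39 cut points in steps of 2.5%
    return cuts[0], cuts[-1]

low, high = simulated_interval(n=60, p=0.1, reps=200)
print(f"2.5%: {low:.3f}   97.5%: {high:.3f}")
```

An observed value that falls well outside such bounds, as the blog network's clustering of 0.226 does for the Bernoulli columns, suggests the model does not reproduce that feature.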




Social Network Analysis:
    Bernoulli

•    Fixed number of nodes and lines
      –   Sets density and average degree = density * (n-1)
•    Assigns lines to each pair of nodes
     independently with fixed probabilities
      –   Each line a random binary variable
•    Produces Poisson degree distribution
•    Small diameter
      –   ln(#nodes)/ln(aver.degree)
•    Low clustering
      –   Average degree/(#nodes-1)
•    Large component
      –   E.g. at aver.degree of 1.5, ~50% in largest
          component

Measure                       Bernoulli
Vertices                         1000
Edges                           11149
Density                         0.022
Average Degree                 22.900
Average Distance                2.570
Diameter                            4
Degree Centralization           0.016
Betweenness Centralization      0.003
Closeness Centralization        0.066
EigenVector Centralization      0.034
Cluster Coefficient             0.021
Number of Components                1
Size of Largest Component        1000
% in Largest Component           100%
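
The rules of thumb on this slide can be checked against the table with a few lines of arithmetic. This is a rough check, assuming an undirected graph and using only the Vertices and Edges rows as inputs; the variable names are illustrative.

```python
import math

n, edges = 1000, 11149   # Vertices and Edges rows of the table

density = 2 * edges / (n * (n - 1))                     # share of possible lines present
aver_degree = density * (n - 1)                         # "density * (n-1)"
approx_diameter = math.log(n) / math.log(aver_degree)   # "ln(#nodes)/ln(aver.degree)"
approx_clustering = aver_degree / (n - 1)               # "Average degree/(#nodes-1)"

print(f"density           {density:.3f}")
print(f"average degree    {aver_degree:.3f}")
print(f"ln(n)/ln(degree)  {approx_diameter:.2f}")
print(f"degree/(n-1)      {approx_clustering:.3f}")
```

The density matches the table's 0.022 and the clustering estimate (about 0.022) is close to the listed 0.021. The computed average degree is about 22.3 rather than the listed 22.900, and ln(n)/ln(degree) comes out near 2.2, which is the same order as the listed Average Distance of 2.570 and below the Diameter of 4.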
Social Network Analysis:
    Small World
•   Fixed number of nodes and the number (r) of
    nearby neighbors to which each vertex is
    linked
     –   Implies strong transitive ties
     –   Ensures higher clustering ~ (3r-3)/(4r-2)
•   Probabilistically rewires selected lines from
    one vertex to another (each line and vertex has
    equal probability of being selected).
     –   Ensures low average path length

Measure                       Small World
Vertices                         1000
Edges                           11000
Density                         0.022
Average Degree                 22.000
Average Distance                3.070
Diameter                        5.000
Degree Centralization           0.007
Betweenness Centralization      0.007
Closeness Centralization        0.079
EigenVector Centralization      0.031
Cluster Coefficient             0.514
Number of Components                1
Size of Largest Component        1000
% in Largest Component           100%
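
The two steps above (build a regular ring of neighbors, then rewire some lines at random) can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the table: r is taken to count neighbors on each side of a vertex (so r = 11 gives the table's average degree of 22 and 11000 edges), the rewiring probability of 0.1 is picked arbitrarily, and all function names are hypothetical.

```python
import random
from itertools import combinations

def ring_lattice(n, r):
    """Link each vertex to its r nearest neighbors on each side of a ring."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, r + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, p, rng):
    """With probability p, move one end of each line to a uniformly chosen vertex."""
    n = len(adj)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < p:
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if candidates:
                w = rng.choice(candidates)
                adj[u].discard(v)
                adj[v].discard(u)
                adj[u].add(w)
                adj[w].add(u)
    return adj

def average_clustering(adj):
    """Mean over vertices of (links among neighbors) / (possible links among neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

n, r = 1000, 11
predicted = (3 * r - 3) / (4 * r - 2)              # formula from the slide
lattice_clustering = average_clustering(ring_lattice(n, r))
rewired = rewire(ring_lattice(n, r), p=0.1, rng=random.Random(0))
rewired_clustering = average_clustering(rewired)
n_edges = sum(len(nbrs) for nbrs in rewired.values()) // 2
```

Before rewiring, the measured clustering equals (3r-3)/(4r-2), about 0.714 for r = 11. Rewiring lowers it, so a value such as the table's 0.514 is what one would expect after some share of the lines has been moved.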

Social Network Analysis:
Preferential Attachment
•   Vast majority of new nodes link to nodes
    with proportionately higher degrees
     –   “Rich get richer”
•   Probabilistic part involves the selection of
    vertices for new lines (e.g. the probability of
    choosing an end vertex for a new line is
    proportional to the degree of that vertex)
•   Tend to have a small world structure
•   Exhibit long-tailed degree distributions
     –   Right-hand tail of the distribution follows a
         “scale-free” power-law distribution
     –   Log-log graph is a straight line
     –   Ensures low average path length

Measure                       Preferential
Vertices                         1000
Edges                           11242
Density                         0.019
Average Degree                 18.770
Average Distance                2.690
Diameter                            7
Degree Centralization           0.154
Betweenness Centralization      0.037
Closeness Centralization            -
EigenVector Centralization      0.188
Cluster Coefficient             0.192
Number of Components               92
Size of Largest Component         908
% in Largest Component            91%
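
The "rich get richer" selection can be sketched with a simple growth process in which each new vertex picks the other end of its lines with probability proportional to current degree. This is a simplified variant written for illustration, not the generator behind the table: it always produces a single connected component (the table shows 92), and the values n = 1000, m = 11 and the function name are assumptions.

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a graph to n vertices; each new vertex adds m lines whose other
    end is chosen with probability proportional to current degree."""
    adj = {v: set() for v in range(n)}
    # `ends` holds one entry per line end, so a uniform draw from it is
    # a degree-weighted draw over vertices.
    ends = []
    # Seed: a complete graph on m + 1 vertices so every early vertex has degree m.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            ends += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # m distinct, degree-weighted targets
            targets.add(rng.choice(ends))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            ends += [new, t]
    return adj

n, m = 1000, 11
graph = preferential_attachment(n, m, random.Random(0))
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
n_edges = sum(degrees) // 2
aver_degree = 2 * n_edges / n
print("edges:", n_edges, " average degree:", round(aver_degree, 2))
print("five largest degrees:", degrees[:5], " smallest degree:", degrees[-1])
```

The few vertices added earliest end up with degrees far above the average, which is the long right-hand tail the slide describes.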

TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Recently uploaded (20)

Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Mining and analyzing social media hicss 45 tutorial – part 2

  • 1. Mining and Analyzing Social Media HICSS 45 Tutorial – Part 2 Dave King January 4, 2012
  • 2. Agenda: This is how the slides are organized • Part 1 – Introduction – Bio, Resources, Social Media – Data Mining – Processes and Example – Text Mining – General Processes and Example – Predicting the Future – The Portmanteaus • Part 2 – Sentiment Analysis – Social Network Analysis - Introduction Copyright 2011 JDA Software Group, Inc. - CONFIDENTIAL
  • 3. Sentiment Analysis: What are your customers thinking? Every hour of every day they share their opinions, issues, thoughts and sentiments about brands, products, services and companies (online).
  • 4. Sentiment Analysis: Some Survey Data Cone Communications: http://www.coneinc.com/2011coneonlineinfluencetrendtracker
  • 5. Sentiment Analysis: Some Payoffs • Marketing – Message • Service – Response • Products – Issues and Focus • A form of Automated Text Categorization (ATC)
  • 6. Sentiment Analysis: Some Examples
  • 7. Sentiment Analysis: Some Examples Cycling Community Responds: @BicyclingMag, @BikePortland, @clevercycle, @cyclingreporter. GM runs Ad on 10/17/11.
  • 8. Sentiment Analysis: Some Examples Key Areas of Concern: • Break in online link to Mint.com • Actionable Service Breaks • Outrage over “$50 limit on debit card transactions”
  • 9. Sentiment Analysis: Defined Text mining to classify subjective opinions in text into categories like “positive” or “negative”, extracting various forms of attitudinal information: sentiment, opinion, mood, and emotion. Also called Voice of the Customer (VOC) or Opinion Mining.
  • 10. Sentiment Analysis: Sample Software Vendors Alterian, Attensity, Brandwatch, Buzzdetector, Clarabridge, Crimson Hexagon, Digimind, DigitalPebble, EffectCheck, Etuma, Evolve24, General Sentiment, IBM Cognos, IBM SPSS, InfiniGraph, Kontagent, Lexalytics, Lithium, Lymbix, Medallia, Meltwater, Meshlabs, Netbase Solutions, OpenAmplify, Overtone, PostRank (Google), Quantivo, Radian6 (SalesForce.com), SAS, Sentiment Metrics, SentMetrix, Traackr, Visible Technologies, Wise Window
  • 11. Sentiment Analysis: Types • Sentiment Classification – document level, classified as positive or negative • Feature-based opinion – sentence level, determines which aspects of an object people like or dislike • Comparative sentence and relationship mining – sentence level comparisons of one object against another (to determine which is better than the other)
  • 12. Sentiment Analysis: Types • From one type to the next (classification, features, comparisons), it becomes more complex to identify and extract the information. • Once extracted, standard text mining techniques can be used to classify and compare the opinions. • Simple techniques (like naïve Bayesian) often produce strong results (e.g. 80+% accuracy).
  • 13. Sentiment Analysis: Assumption An Opinion Lexicon that Expresses State: Polar, Opinion-Bearing, and Sentiment Words and Phrases
  • 14. Sentiment Analysis: How do you know if it is “+” or “-”?
      – plot : two teen couples go to a church party , drink and then drive . they get into an accident . one of the guys dies , but his girlfriend continues to see him in her life , and has nightmares . what's the deal ? watch the movie and " sorta " find out . . . critique : a mind-xxx movie for the teen generation that touches on a very cool idea , but presents it in a very bad package . which is what makes this review an even harder one to write , since i generally applaud films which attempt to break the mold , mess with your head and such ( lost highway & memento ) , but there are good and bad ways of making all types of films , and these folks just didn't snag this one correctly . they seem to have taken this pretty neat concept , but executed it terribly . so what are the problems with the movie ? well , its main problem is that it's simply too jumbled .
      – having not seen , " who framed roger rabbit " in over 10 years , and not remembering much besides that i liked it then , i decided to rent it recently . watching it i was struck by just how brilliant a film it is . aside from the fact that it's a milestone in animation in movies ( it's the first film to combine real actors and cartoon characters , have them interact , and make it convincingly real ) and a great entertainment it's also quite an effective comedy/mystery . while the plot may be somewhat familiar the characters are original , especially baby herman , and watching them together is a lot of fun . … `who framed roger rabbit' is a rare film . one that not only presented a great challenge to the filmmakers but one that can be enjoyed by the whole family ( although some very young viewers may be a little scared by judge doom ) . do yourself a favor and rent it , `p-p-p-p-please . "
  • 15. Sentiment Analysis: Other interests in Sentiment
  • 16. Sentiment Analysis: Other interests in Sentiment
  • 17. Sentiment Analysis: Doing Simple Sentiment Analysis General Problem: Collection of Text Documents → Automated Process for Classifying (???) → Small Set of Predetermined categories (1, 2, 3, …, n)
  • 18. Sentiment Analysis: Doing Simple Sentiment Analysis General Answer: Collection of Text Documents → Automated Process for Classifying → Small Set of Predetermined categories (1, 2, 3, …, n). The automated process is an inductive, supervised machine learning classification process and algorithm.
  • 19. Sentiment Analysis: Doing Simple Sentiment Analysis Training Process: Real-World Text Data (documents with known classification) → Document Consolidation → Establish the Corpus → Corpus Refinement (Token, Stem, Stop…) → Feature Selection & Weighting → Term-Doc-Matrix* → Classification Algorithm (Train, Test, Validate) → Predetermined Categories (1, 2, 3, …, n)
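The corpus-refinement and term-document-matrix steps of the training process can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the tutorial: the stop-word list, the crude suffix-stripping `stem` function, and the two sample documents are all assumptions chosen to keep the example small.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"i", "is", "the", "a", "to", "be", "now", "today"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def stem(token):
    """Very crude stemmer (an assumption): strip a trailing 'ing' or 's'."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def refine(text):
    """Token -> Stop -> Stem, mirroring the corpus-refinement step."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def term_document_matrix(docs):
    """Return (vocabulary, rows) where rows[d][t] counts term t in document d."""
    refined = [refine(d) for d in docs]
    vocab = sorted({t for doc in refined for t in doc})
    rows = [[doc.count(term) for term in vocab] for doc in refined]
    return vocab, rows


vocab, rows = term_document_matrix(["I feel great now", "Feeling sick today"])
print(vocab)
print(rows)
```

Each row of the resulting matrix, paired with its known category, is what a supervised classifier would be trained on.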
  • 20. Sentiment Analysis: Doing Simple Sentiment Analysis Classification Algorithms • Naïve Bayes • Decision Trees • Nearest Neighbor (k-NN) • Support Vector Machine • Neural Nets (e.g. SOM)
  • 21. Sentiment Analysis: Doing Simple Sentiment Analysis Twitter Statistics • ~200M registered users. • ~50M users log in every day. • Over 400K new users per day. • 400 million unique visitors per month. • 55% use their phone to tweet. • Average 200 million tweets a day. • 600 million search queries per day. • 75% of traffic from 3rd Party Apps. • 60% of tweets from 3rd Party Apps.
  • 22. Sentiment Analysis: Doing Simple Sentiment Analysis Problem Features • Each tweet <= 140 characters (avg. 10-15 words/message) • Heavy presence of non-alpha symbols, abbreviations, misspellings and slang • Tweets often include retweets (original tweet repeated) • In spite of this, tweets have proven to be an interesting text mining source (warts and all)
  • 23. Sentiment Analysis: Doing Simple Sentiment Analysis • Zambonini, D. “Self-Improving Bayesian Sentiment Analysis for Twitter.” August 27, 2010. danzambonini.com/self-improving-bayesian-sentiment-analysis-for-twitter. • Kalafatis, T. “The Sentiment on US Economy from Twitter.” October 2009. lifeanalytics.blogspot.com/2009/10/sentiment-on-us-economy-from-twitter.html. • Pak, A. and P. Paroubek. “Twitter as a Corpus for Sentiment Analysis and Opinion Mining.” In Proceedings of the Seventh International Conference on Language Resources and Evaluation. May 2010. lrec-conf.org/proceedings/lrec2010/slides/385.pdf. • Sood, S. and L. Vasserman. “ESSE: Exploring Mood on the Web.” August 2009. lcs.pomona.edu/people/files/SoodCV.pdf. • Go, A. et al. “Twitter Sentiment Classification using Distant Supervision.” 2009. stanford.edu/~alecmgo/papers/TwitterDistantSupervision09.pdf. • Agarwal, A. et al. “Sentiment Analysis of Twitter Data.” 2011. www1.ccls.columbia.edu/~beck/pubs/lsm2011_full.pdf.
  • 24. Sentiment Analysis: Doing Simple Sentiment Analysis
  • 25. Sentiment Analysis: Doing Simple Sentiment Analysis • Twitter used to get a total of 3 billion requests a day via its API • API Calls for Public Tweets – http://search.twitter.com/search.json?q=%3A)+feel+feeling&rpp=100&page=1 – http://api.twitter.com/1/trends/current.json?exclude=hashtags
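As a small illustration of how the first query string is put together, the sketch below rebuilds it with Python's standard `urllib.parse.urlencode`. The endpoint and parameter names (`q`, `rpp`, `page`) are taken from the slide as they stood in 2011; the function name `build_search_url` is hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Endpoint as quoted on the slide (the 2011-era search API).
SEARCH_ENDPOINT = "http://search.twitter.com/search.json"


def build_search_url(query, results_per_page=100, page=1):
    """Compose the search URL; ')' is left unescaped to match the slide's form."""
    params = {"q": query, "rpp": results_per_page, "page": page}
    return SEARCH_ENDPOINT + "?" + urlencode(params, safe=")")


url = build_search_url(":) feel feeling")
print(url)
```

The query `:) feel feeling` asks for tweets that contain a smiley together with the words "feel" and "feeling", which is how emoticon-labeled training tweets can be gathered.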
  • 26. Sentiment Analysis: Doing Simple Sentiment Analysis Sample Tweet from API call: {u'iso_language_code': u'en', u'to_user_name': None, u'to_user_id_str': None, u'from_user_id_str': u'59862385', u'text': u"Lol i feel ya!!RT @Sweet_Sun_Shine: @joshaustin13 everything's up, its the weekend baby!!!! :) and I plan on enjoying; how are you feeling?", u'from_user_name': u'B.Resilient\ue50c\ue50c\ue50c', u'profile_image_url': u'http://a3.twimg.com/profile_images/1650184586/joshaustin13_normal.jpg', u'id': 145274459127955456L, u'to_user': None, u'source': u'&lt;a href=&quot;http://www.echofon.com/&quot; rel=&quot;nofollow&quot;&gt;Echofon&lt;/a&gt;', u'id_str': u'145274459127955456', u'from_user': u'joshaustin13', u'from_user_id': 59862385, u'to_user_id': None, u'geo': None, u'created_at': u'Fri, 09 Dec 2011 22:51:44 +0000', u'metadata': {u'result_type': u'recent'} }
  • 27. Sentiment Analysis: Doing Simple Sentiment Analysis “Twitter Sentiment Classification using Distant Supervision” (2009) • Utilizes the presence of the emoticons “:)” and “:(” as surrogates for classifying statements as positive or negative sentiment • To construct the term-document matrix, relies on a list of positive and negative key words from Twittratr, counting the number of key words that appear in each tweet • 180K tweets collected for training purposes between April and June 2009 • 80%+ accuracy in classification
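The emoticon-as-label idea can be sketched as follows. This is only an illustration of the first bullet, under the assumption that a tweet carrying both emoticons (or neither) is discarded; the function names are hypothetical and the keyword-counting step is not shown.

```python
def label_by_emoticon(text):
    """Return 'positive', 'negative', or None when the emoticon signal is absent or mixed."""
    has_happy = ":)" in text
    has_sad = ":(" in text
    if has_happy == has_sad:  # neither emoticon, or both: no usable label
        return None
    return "positive" if has_happy else "negative"


def strip_emoticons(text):
    """Remove the emoticons so the classifier cannot simply memorize the label."""
    return text.replace(":)", "").replace(":(", "").strip()


sample = ("Lol i feel ya!!RT @Sweet_Sun_Shine: @joshaustin13 everything's up, "
          "its the weekend baby!!!! :) and I plan on enjoying; how are you feeling?")
print(label_by_emoticon(sample))
```

Applied to the sample tweet from the API call, the smiley yields a "positive" training label without any manual annotation.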
  • 28. Sentiment Analysis: Doing Simple Sentiment Analysis Counts by processing step (HF = Happy Face, SF = Sad Face):
      Counts     Type    List     Set
      Words      HF      8354     2169
      Words      SF      7702     1996
      Words      Total   16056    3469
      Alpha      HF      5917     1094
      Alpha      SF      5433     1055
      Alpha      Total   11350    1169
      Stop       HF      3425     992
      Stop       SF      3325     953
      Stop       Total   6750     1563
      Stem       HF      3425     895
      Stem       SF      3325     850
      Stem       Total   6750     1375
      Stem w/o   HF      2618     894
      Stem w/o   SF      2516     849
      Stem w/o   Total   5134     1374
  • 29. Sentiment Analysis: Doing Simple Sentiment Analysis Happy Face vs. Sad Face
  • 30. Sentiment Analysis: Doing Simple Sentiment Analysis Bayes' theorem (Thomas Bayes): P(H|D) = P(D|H) * P(H) / P(D), where H is the hypothesis and D is the data. P(H) is the prior probability of H: the probability that H is correct before the data D are seen. P(D|H) is the conditional probability of seeing the data D given that the hypothesis H is true. This conditional probability is called the likelihood. P(D) is the marginal probability of D. P(H|D) is the posterior probability: the probability that the hypothesis is true, given the data and the previous state of belief about the hypothesis.
  • 31. Sentiment Analysis: Doing Simple Sentiment Analysis Training Set (Message – Category): Love is great – Positive; I feel great now – Positive; I feel sick today – Negative; Great, today sucks – Negative; Today is going to be good – Positive. Compare P(Positive | Tweet) to P(Negative | Tweet) for a tweet whose only word is “great”. P(Pos | Tweet) = P(Pos) * P(W1|Pos) / P(Tweet); the denominator P(Tweet) is the same for both categories and can be dropped, so P(Pos | Tweet) ∝ P(Pos) * P(great|Pos) = (3/5) * (2/3) = .4. Likewise P(Neg | Tweet) = P(Neg) * P(W1|Neg) / P(Tweet), so P(Neg | Tweet) ∝ P(Neg) * P(great|Neg) = (2/5) * (1/2) = .2.
  • 32. Spam Detection: Naïve Bayesian Classifier Training Set (Message – Category): Love is great – Positive; I feel great now – Positive; I feel sick today – Negative; Great, today sucks – Negative; Today is going to be good – Positive. Compare P(Positive | Tweet) to P(Negative | Tweet) for a tweet with the words “today” and “sucks”. P(Neg | Tweet) ∝ P(Neg) * P(W1|Neg) * P(W2|Neg) * ... = P(Neg) * P(today|Neg) * P(sucks|Neg) = .4 * 1 * .5 = .2. P(Pos | Tweet) ∝ P(Pos) * P(W1|Pos) * P(W2|Pos) * ... = P(Pos) * P(today|Pos) * P(sucks|Pos) = .6 * .33 * 0 = 0.
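The arithmetic on the two naïve Bayes slides can be reproduced with a short Python sketch. It estimates P(word | category) as the share of that category's training messages containing the word, which is what the slide's fractions imply; the function names and the use of `Fraction` for exact arithmetic are choices made for this illustration, and the scores are the unnormalized products (P(Tweet) is omitted).

```python
import re
from fractions import Fraction

# Training set copied from the slides.
TRAINING = [
    ("Love is great", "Positive"),
    ("I feel great now", "Positive"),
    ("I feel sick today", "Negative"),
    ("Great, today sucks", "Negative"),
    ("Today is going to be good", "Positive"),
]


def tokens(text):
    """Lowercased set of words in a message."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(tweet, category):
    """Unnormalized P(category | tweet) = P(category) * product of P(word | category)."""
    docs = [tokens(msg) for msg, cat in TRAINING if cat == category]
    result = Fraction(len(docs), len(TRAINING))  # prior P(category)
    for word in tokens(tweet):
        containing = sum(1 for doc in docs if word in doc)
        result *= Fraction(containing, len(docs))  # P(word | category)
    return result


for tweet in ("great", "today sucks"):
    print(tweet, {cat: float(score(tweet, cat)) for cat in ("Positive", "Negative")})
```

The zero score for "today sucks" under Positive shows why practical classifiers usually smooth the word probabilities: a single unseen word wipes out the whole product.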
  • 33. Spam Detection: Naïve Bayesian Classifier Training Set → Naïve Bayesian Classifier → Estimated Probabilities P(H), P(S), P(Wi|H), P(Wi|S):
      Tweet     Token1   Token2   Token3   Token4   …   Class
      Tweet1    1        0        0        1        …   Happy
      Tweet2    1        0        1        0        …   Sad
      Tweet3    0        0        0        1        …   Happy
      Tweet4    0        0        1        0        …   Sad
      …         …        …        …        …        …   …
    New Tweet:
      Tweet     0        0        1        0        …   ?
    Decision Rule: classify the new tweet as Happy if $\ln \frac{P(H \mid \text{Tweet})}{P(S \mid \text{Tweet})} = \ln \frac{P(H)}{P(S)} + \sum_i \ln \frac{P(W_i \mid H)}{P(W_i \mid S)} > 0$, otherwise as Sad. Proof left to reader.
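A minimal sketch of the log-odds decision rule, applied to the four-tweet matrix from the slide, might look like this. One assumption goes beyond the slide: add-one (Laplace) smoothing is used when estimating P(Wi | class), because the raw counts contain zeros and ln 0 is undefined. The sum runs over the tokens present in the new tweet.

```python
import math

# Token-presence rows and classes copied from the slide's training set.
TRAINING = [
    ([1, 0, 0, 1], "Happy"),
    ([1, 0, 1, 0], "Sad"),
    ([0, 0, 0, 1], "Happy"),
    ([0, 0, 1, 0], "Sad"),
]


def prior(label):
    """P(class): share of training tweets with this class."""
    return sum(1 for _, cls in TRAINING if cls == label) / len(TRAINING)


def likelihood(token_index, label):
    """P(Wi | class) with add-one smoothing (an assumption, to avoid ln 0)."""
    rows = [row for row, cls in TRAINING if cls == label]
    present = sum(row[token_index] for row in rows)
    return (present + 1) / (len(rows) + 2)


def log_odds(tweet_row):
    """ln P(H|Tweet)/P(S|Tweet) = ln P(H)/P(S) + sum over present tokens of ln P(Wi|H)/P(Wi|S)."""
    total = math.log(prior("Happy") / prior("Sad"))
    for i, present in enumerate(tweet_row):
        if present:
            total += math.log(likelihood(i, "Happy") / likelihood(i, "Sad"))
    return total


def classify(tweet_row):
    return "Happy" if log_odds(tweet_row) > 0 else "Sad"


new_tweet = [0, 0, 1, 0]
print(log_odds(new_tweet), classify(new_tweet))
```

Working in logarithms turns the product of many small probabilities into a sum, which avoids numerical underflow on longer messages.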
  • 34. What is this number? 4.74
  • 35. Does this help? 6: Frigyes Karinthy, Stanley Milgram, John Guare, Duncan Watts
  • 36. Six Degrees of Separation “A fascinating game grew out of this discussion. One of us suggested performing the following experiment to prove that the population of the Earth is closer together now than they have ever been before. We should select any person from the 1.5 billion inhabitants of the Earth—anyone, anywhere at all. He bet us that, using no more than five individuals, one of whom is a personal acquaintance, he could contact the selected individual using nothing except the network of personal acquaintances.” Frigyes Karinthy, Chains, 1929
  • 37. Sample Metric From Social Network Analysis 4.74: Average Distance between Facebook Members
  • 38. Social Network Analysis: Another Type of Analysis Which Blogs are Similar?
    Blog-by-term matrix, used for Cluster Analysis (e.g. K-Means):
               Term1   Term2   Term3   …   TermM
      Blog1    1       0       0       …   1
      Blog2    0       0       1       …   0
      Blog3    0       1       0       …   1
      …        …       …       …       …   …
      BlogN    0       0       0       …   1
    Blog-by-blog matrix, used for Graph/Network Analysis:
               Blog1   Blog2   Blog3   …   BlogN
      Blog1    -       1       0       …   1
      Blog2    0       -       1       …   0
      Blog3    1       1       -       …   0
      …        …       …       …       -   …
      BlogN    1       0       1       …   -
  • 39. Social Network Analysis: Another Type of Analysis Which Blogs are Similar?
    Blog-by-word matrix, used for Cluster Analysis (e.g. K-Means):
               Word1   Word2   Word3   …   WordM
      Blog1    1       0       0       …   1
      Blog2    0       0       1       …   0
      Blog3    0       1       0       …   1
      …        …       …       …       …   …
      BlogN    0       0       0       …   1
    For a detailed description: http://www.slideshare.net/daveking63/text-mining-and-analytics-v6-p2
  • 40. Social Network Analysis: Definitions Network – Collection of things and their linkages to one another. Social Network – Collection of humans, roles, groups, and/or institutions and their social relationships with one another. Social Network Analysis (SNA) – Application of Graph Theory or Network Science to the study of social relationships and connections.
  • 41. Social Network Analysis: Early Efforts Jacob Moreno: Sociometry and the Sociogram
  • 42. Social Network Analysis: Definitions “Ten years ago, the field of Social Network Analysis was a scientific backwater. We were the misfits, rejected from both mainstream sociology and mainstream computer science.”
  • 43. Social Network Analysis: Exploding Commercial Interest
  • 44. Social Network Analysis: So What Happened? • Small Data – manually collected; flat files / in-memory computation • Medium Data – data snapshots from APIs; SQL databases • Big Data – real-time social media data; Big Data approaches
  • 45. Social Network Analysis: When things were simpler … 2005: N=26, N=80
  • 46. Social Network Analysis: and then … 2011: N~1400, N~3.5K, N~90K • Growth in Social Media • Access to SM Network Data • Availability of Open Source Tools
  • 47. Social Network Analysis: and now … N=20M, N=80K, N=721M
  • 48. Social Network Analysis: Key Elements • Graph or Network – the set [V, E, f] of vertices/nodes, edges/links and the relationship/function connecting them • Vertices or Nodes – the “things” • Edges or Links – the “relationships” (Diagram: example graph with nodes A, B, C, D, with one vertex (node) and one edge (link) labeled.)
  • 49. Social Network Analysis: Types of Edges or Links • Undirected, Unweighted – Facebook Friends • Directed, Unweighted – Twitter Followers • Undirected, Weighted – Facebook Friends • Directed, Weighted – Email Network (Diagrams: three-node graphs A, B, C; the weighted examples carry edge weights 100, 70, 20, 60, 5 and 10.)
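One simple way to hold all four edge types in code is a nested dictionary mapping each node to its neighbors and edge weights. The sketch below is an illustration only: the `add_edge` helper is hypothetical, and the particular edges and the placement of the weights (60, 5, 10) on them are assumptions, since the slide's diagrams are not recoverable from the transcript.

```python
def add_edge(graph, a, b, weight=1, directed=False):
    """Store edge a -> b (and b -> a when undirected) in a dict-of-dicts graph."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})
    if not directed:
        graph[b][a] = weight


# Undirected, unweighted: friendship is mutual, every weight defaults to 1.
friends = {}
add_edge(friends, "A", "B")
add_edge(friends, "A", "C")

# Directed, weighted: e.g. number of emails sent from one person to another.
email = {}
add_edge(email, "A", "B", weight=60, directed=True)
add_edge(email, "B", "A", weight=5, directed=True)
add_edge(email, "C", "A", weight=10, directed=True)

print(friends)
print(email)
```

In the directed case `email["A"]["B"]` and `email["B"]["A"]` are independent entries, which is what distinguishes a follower or email network from a friendship network.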
  • 50. Social Network Analysis: Types of Networks Unimodal, Bimodal, Multiplex (Diagrams: nodes P1, P2, P3 and E1; multiplex edge types Follows, Replies To, Mentions.)
  • 51. Social Network Analysis: Types of Network Analysis “Whole” Network (nodes P1 to P6) and “Ego-Centric” Network (Ego with Alters P2 to P6)
  • 52. Social Network Analysis: Node Metrics (Centrality)
    Degree – Definition: Number of edges or links. In-degree: links in; out-degree: links out. Interpretation: How connected is a node? How many people can this person reach directly? Reasoning: Higher probability of receiving and transmitting information flows in the network. Nodes are considered to have influence over a larger number of nodes and/or are capable of communicating quickly with the nodes in their neighborhood.
    Betweenness – Definition: Number of times a node or vertex lies on the shortest path between 2 nodes, divided by the number of all the shortest paths. Interpretation: How important is a node in terms of connecting other nodes? How likely is this person to be the most direct route between two people in the network? Reasoning: Degree to which a node controls the flow of information in the network. Those with high betweenness function as brokers. Useful where a network is vulnerable.
    Closeness – Definition: 1 over the average distance between a node and every other node in the network. Interpretation: How easily can a node reach other nodes? How fast can this person reach everyone in the network? Reasoning: Measure of reach. Importance is based on how close a node is located with respect to every other node in the network. Nodes are able to reach, or be reached by, most other nodes in the network through geodesic paths.
    Eigenvector – Definition: Proportional to the sum of the eigenvector centralities of all the nodes directly connected to it. Interpretation: How important, central, or influential are a node’s neighbors? How well is this person connected to other well-connected people? Reasoning: Evaluates a player's popularity. Identifies centers of large cliques. A node with more connections to higher-scoring nodes is more important.
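The first three definitions can be sketched directly in Python for a small undirected, connected graph. This is an illustrative implementation, not the software used for the slides: the four-node path graph is made up, betweenness is left unnormalized, and eigenvector centrality is omitted to keep the example short.

```python
from collections import deque
from itertools import combinations


def bfs(graph, source):
    """Return (distance, number of shortest paths) from source to every reachable node."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                paths[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:
                paths[nb] += paths[node]
    return dist, paths


def degree(graph, v):
    """Number of links attached to v."""
    return len(graph[v])


def closeness(graph, v):
    """1 over the average distance from v to every other node (graph assumed connected)."""
    dist, _ = bfs(graph, v)
    others = [d for node, d in dist.items() if node != v]
    return len(others) / sum(others)


def betweenness(graph, v):
    """Sum over node pairs (s, t) of the share of shortest s-t paths passing through v."""
    info = {node: bfs(graph, node) for node in graph}
    dist_v, paths_v = info[v]
    total = 0.0
    for s, t in combinations([n for n in graph if n != v], 2):
        dist_s, paths_s = info[s]
        if v in dist_s and t in dist_s and t in dist_v:
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += paths_s[v] * paths_v[t] / paths_s[t]
    return total


# Hypothetical path graph A - B - C - D.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
for node in graph:
    print(node, degree(graph, node), closeness(graph, node), betweenness(graph, node))
```

On the path graph the two interior nodes score highest on all three measures, matching the "broker" interpretation given for betweenness.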
  • 53. Social Network Analysis: Centrality – Who is most important? (Network diagram with nodes A to S; the highest-scoring nodes are marked: Deg = J, Betw = H, Close = I, Eigen = D.)
      Node   Degree   Normed Degree   Betweenness   Closeness   Eigen Vector
      A      3        0.17            0.00          0.29        0.29
      B      4        0.22            0.01          0.30        0.36
      C      2        0.11            0.03          0.35        0.18
      D      6        0.33            0.04          0.31        0.46
      E      3        0.17            0.00          0.29        0.30
      F      4        0.22            0.11          0.36        0.35
      G      5        0.28            0.19          0.37        0.43
      H      5        0.28            0.58          0.45        0.28
      I      4        0.22            0.53          0.46        0.13
      J      7        0.39            0.43          0.43        0.12
      K      3        0.17            0.00          0.32        0.06
      L      3        0.17            0.01          0.33        0.05
      M      3        0.17            0.21          0.33        0.04
      N      3        0.17            0.03          0.38        0.07
      O      2        0.11            0.00          0.31        0.05
      P      3        0.17            0.03          0.38        0.08
      Q      2        0.11            0.11          0.26        0.01
      R      1        0.06            0.00          0.32        0.07
      S      1        0.06            0.00          0.21        0.00
  • 54. Social Network Analysis: Cohesion – How well connected?
    Density – Definition: Ratio of the number of edges in the network over the total number of possible edges between all pairs of nodes. Interpretation: How well connected is the overall network? Reasoning: A perfectly connected network is called a "clique" and has a density of 1.
    Average Degree – Definition: Average number of links each node or vertex has. Interpretation: How well connected are the nodes on average? Reasoning: The higher the average, the better connected the members are.
    Average Path Length (Distance) – Definition: Average number of edges or links between any two nodes (along the shortest path). Interpretation: On average, how far apart are any two nodes? Reasoning: This is synonymous with the "degrees of separation" in a network.
    Diameter – Definition: Longest shortest path between any two nodes. Interpretation: At most, how long will it take to reach any node in the network? Reasoning: Measure of the reach of the network. Sparse networks usually have greater diameters.
    Clustering – Definition: A node's clustering coefficient is the density of its 1.5-degree egocentric network (the ratio of connections among ego's alters). For the entire network it is the average of all the coefficients for the individual nodes. Interpretation: What proportion of ego's alters are connected? More technically, how many nodes form triangular subgraphs with their adjacent nodes? Reasoning: Measures certain aspects of "cliquishness": the proportion of your friends that are also friends with each other. Another way to measure it is to determine (in an undirected graph) the ratio of the number of times that two links emanating from the same node are also linked.
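The cohesion measures defined above can be computed for a small undirected, connected graph with a few lines of Python. This sketch uses a made-up four-node graph (a triangle A, B, C with D attached to C); returning `None` for the clustering coefficient of a node with fewer than two neighbors is an assumption, chosen to mirror the "NA" entries that appear for degree-1 nodes in the worked example.

```python
from collections import deque
from itertools import combinations


def distances_from(graph, source):
    """Breadth-first search: shortest-path length from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def density(graph):
    """Edges present divided by edges possible among all pairs of nodes."""
    n = len(graph)
    edges = sum(len(nbs) for nbs in graph.values()) / 2
    return edges / (n * (n - 1) / 2)


def average_degree(graph):
    return sum(len(nbs) for nbs in graph.values()) / len(graph)


def pairwise_distances(graph):
    """Shortest-path length for every unordered pair (graph assumed connected)."""
    result = []
    for s, t in combinations(sorted(graph), 2):
        result.append(distances_from(graph, s)[t])
    return result


def average_path_length(graph):
    d = pairwise_distances(graph)
    return sum(d) / len(d)


def diameter(graph):
    return max(pairwise_distances(graph))


def clustering(graph, node):
    """Share of pairs of node's neighbors that are themselves linked; None if undefined."""
    nbs = graph[node]
    k = len(nbs)
    if k < 2:
        return None
    linked = sum(1 for a, b in combinations(nbs, 2) if b in graph[a])
    return linked / (k * (k - 1) / 2)


# Hypothetical graph: triangle A-B-C with D attached to C.
graph = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(density(graph), average_degree(graph), average_path_length(graph), diameter(graph))
print({node: clustering(graph, node) for node in graph})
```

Node C illustrates the clustering definition: it has three alters, but only one of the three possible pairs among them (A and B) is linked.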
  • 55. Social Network Analysis: Network Metrics (Centralization) Measure Definition Degree Centralization Variation in the degrees of vertices divided by the maximum degree variation that is possible in a network of the same size Betweenness Centralization Variation in the betweenness centrality of vertices divided by the maximum variation in betweenness centrality scores possible in a network of the same size Closeness Centralization Variation in the closeness centrality of vertices divided by the maximum variation in closeness centrality scores possible in a network of the same size Eigenvector Centralization Variation in the eigenvector centrality of vertices divided by the maximum variation in eigenvector centrality scores possible in a network of the same size 1. Variation is the summed absolute differences between centrality scores of the vertices and the maximum centrality score among them. 2. Network is more centralized if the vertices vary more with respect to their centrality. 55 Copyright 2011 JDA Software Group, Inc. - CONFIDENTIAL
  • 56. Social Network Analysis: Cohesion – How well connected?
    [Figure: example network with nodes A through S]
    Measure: Value
    – Average Degree: 3.37
    – Density: 0.19
    – Average Distance: 3.06
    – Diameter: 8
    – Degree Centralization: 0.22
    – Betweenness Centralization: 0.48
    – Closeness Centralization: 0.27
    – Eigenvector Centralization: 0.56
    – Clustering Coefficient: 0.43
    Node: Clustering
    – A 0.67; B 0.67; C 0.00; D 0.40; E 1.00; F 0.50; G 0.50; H 0.10; I 0.33; J 0.29
    – K 0.67; L 0.67; M 0.33; N 0.67; O 1.00; P 0.67; Q 0.00; R NA; S NA
  • 57. Social Network Analysis: Some General Tendencies • Small diameters and small average path lengths • High clustering coefficients relative to random processes • Rate of clustering among the higher-degree nodes decreases with degree • Fat-tailed degree distributions relative to random processes • Hard to find networks that actually follow a strict power law • Positive assortativity and high degrees of homophily, at least in social networks
  • 58. Social Network Analysis: Is it really a small world? http://www.touchgraph.com/navigator
  • 59. Social Network Analysis: Is it really a small world?
    The naïve view (people reached at each step, shown for 50 and for 100 friends per person)
    – Ego: 1; 1
    – Friends (1 step): 50; 100
    – FoF (2 steps): 2,500; 10,000
    – FoFoF (3 steps): 125,000; 1,000,000
    – FoFoFoF (4 steps): 6,250,000; 100,000,000
    – FoFoFoFoF (5 steps): 312,500,000; 10,000,000,000
    – FoFoFoFoFoF (6 steps): 15,625,000,000; 1,000,000,000,000
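The arithmetic behind the naïve view is simply a power: if every person has the same number of friends and no two people share a friend, the number of people reached at a given step is friends^steps. A minimal sketch (the function name `naive_reach` is hypothetical):

```python
def naive_reach(friends_per_person, steps):
    """People reached at exactly `steps` hops, assuming no shared friends."""
    return friends_per_person ** steps


# Reproduce the two columns of the table for 0 to 6 steps.
for steps in range(7):
    print(steps, f"{naive_reach(50, steps):,}", f"{naive_reach(100, steps):,}")
```

The count ignores overlap between friend circles, which is why it is naïve: real networks are clustered, so many friends of friends are counted more than once.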
  • 60. Social Network Analysis: Is it really a small world?
  • 61. Social Network Analysis: Is it really a small world? The emergence of online social networking services over the past decade has revolutionized how social scientists study the structure of human relationships [1]. As individuals bring their social relations online, the focal point of the internet is evolving from being a network of documents to being a network of people, and previously invisible social structures are being captured at tremendous scale and with unprecedented detail.
  • 62. Social Network Analysis: Study Population
    Active*: Global; US
    – Members: 721M; 149M
    – Friends: 68.7B; 15.9B
    – Aver. Friends: 190; 214
    – Total Pop: 6.9B; 260M
    * Accessed within 28 days of May ’11; at least one friend; over 13 years of age
  • 63. Social Network Analysis: Degree Distribution [Figure annotations: N = 721M; F = 69B; Encouraged up to 20; Median ~ 99]
  • 64. Social Network Analysis: Distance Distribution
    – World: average 4.7; 92%; 99.6%
    – US: average 4.3; 96%; 99.7%
  • 65. Social Network Analysis: Connected Components
    [Figure annotations: 2000 Members; 99.91% of Members]
    Connected component: a set of individuals in which each pair of individuals is connected by at least one path through the network
  • 66. Social Network Analysis: Cohesion
    – 14% for 100
    – 100 friends – 28K unique FoFs; 40K non-unique FoFs
    – You’re friends with a significant fraction of your friends’ friends
    – Feld: your friends have more friends than you (same with sex partners)
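Feld's observation that your friends have more friends than you can be checked on any graph by comparing the mean degree with the mean degree of a friend, where each person is counted once for every friendship they are part of. The sketch below is illustrative only: the function names and the small star-shaped example are hypothetical and are not taken from the study.

```python
def mean_degree(adj):
    """Average number of friends per person."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def mean_friend_degree(adj):
    """Average number of friends that a friend has.

    Each (person, friend) pair contributes the friend's degree, so
    well-connected people are counted once per friendship.
    """
    total = 0
    pairs = 0
    for nbrs in adj.values():
        for friend in nbrs:
            total += len(adj[friend])
            pairs += 1
    return total / pairs


# Hypothetical example: one popular person with four friends who know
# nobody else.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
print(mean_degree(star), mean_friend_degree(star))
```

The two averages are equal only when everyone has the same number of friends; any variation in degree pushes the friend average above the plain average.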
  • 67. Social Network Analysis: Correlation and Assortativity
  • 68. Social Network Analysis: Birds of a Feather - Homophily 84% in the same country
  • 69. Social Network Analysis: “Revolution 2.0 will not be Televised” Tweet Rate – Feb. 24-25, 2011, Tahrir Square It will be Tweeted & Retweeted
  • 70. Social Network Analysis: “Revolution 2.0 will not be Televised” 1% feed from the two-day period; Nodes = 25178; Links = 32471. It will be Tweeted & Retweeted
  • 71. Social Network Analysis: Network Features
    – Average Degree = 2.58; Min Degree = 0; Max Degree = 729
    – Average Distance = 5.40; Min Distance = 1; Diameter = 22
    Measure: Value
    – Density: 0.0001
    – Degree Centralization: 0.029
    – Betweenness Centralization: 0.076
    – Closeness Centralization: WC
    – Eigenvector Centralization: 0.724
    – Clustering Coefficient: 0.0045
    – Number of Components: 3122
    – Size of Largest Component: 17762
    – % in Largest Component: 70.50%
    It will be Tweeted & Retweeted
  • 72. Social Network Analysis: Node and Network Metrics
    Node              Degree  NormDeg  Betw   Close  Eigen  Clust
    Ghonim            729     0.029    0.076  0.214  0.509  0.001
    Dima_Khatib       506     0.020    0.050  0.208  0.216  0.001
    ShababLibya       493     0.020    0.048  0.206  0.219  0.003
    monaeltahawy      436     0.017    0.038  0.204  0.198  0.002
    AJEnglish         359     0.014    0.030  0.195  0.085  0.001
    bencnn            306     0.012    0.021  0.193  0.090  0.001
    AJELive           283     0.011    0.017  0.191  0.065  0.001
    3arabawy          273     0.011    0.033  0.200  0.092  0.002
    cnnbrk            256     0.010    0.015  0.182  0.031  0.000
    AJArabic          238     0.009    0.018  0.192  0.050  0.002
    Sandmonkey        227     0.009    0.020  0.198  0.096  0.003
    SultanAlQassemi   216     0.009    0.014  0.189  0.045  0.001
    alaa              204     0.008    0.020  0.202  0.088  0.007
    alarabiya_ar      169     0.007    0.009  0.180  0.035  0.001
    yoanisanchez      161     0.006    0.012  0.149  0.001  0.001
    AymanM            160     0.006    0.009  0.190  0.050  0.003
    acarvin           159     0.006    0.014  0.200  0.092  0.008
    iyad_elbaghdadi   146     0.006    0.008  0.182  0.043  0.002
    monasosh          140     0.006    0.009  0.192  0.050  0.004
    ChangeInLibya     134     0.005    0.010  0.192  0.063  0.011
    It will be Tweeted & Retweeted
  • 73. Social Network Analysis: Node and Network Metrics It will be Tweeted & Retweeted
  • 74. Social Network Analysis: Egocentric Analysis
    Measure: Ghonim; Bieber
    – Vertices: 730; 14
    – Edges: 970; 13
    – Density: 0.004; 0.140
    – Average Degree: 2.660; 1.860
    – Average Distance: 1.990; 1.860
    – Diameter: 2.000; 2.000
    – Degree Centralization: 0.999; 1.000
    – Betweenness Centralization: 0.995; 1.000
    – Closeness Centralization: 0.999; 1.000
    – EigenVector Centralization: 0.990; 1.740
    – Cluster Coefficient: 0.003; 0.000
    – Number of Components: 1; 1
    – Size of Largest Component: 730; 14
    – % in Largest Component: 100%; 100%
    It will be Tweeted & Retweeted
  • 75. Social Network Analysis: Subcommunities of US Political Blogs • Single day snapshot of a snowball sample of political blogs (N=1490) • Manually assigned as Liberal or Conservative • Focus on blogrolls and front page citations • A primary question: Cyberbalkanization?
  • 76. Social Network Analysis: Subcommunities of Political Blogs
  • 77. Social Network Analysis: Political Blogs
    – All blogs: N=1490; Edges = 16715
    – Liberals: N=758; Edges = 7301
    – Conservatives: N=732; Edges = 7839
  • 78. Social Network Analysis: Political Blogs – Cyberbalkanization?
    Measure: Liberal; Conservative
    – N: 758; 732
    – Out Links: 74%; 84%
    – In Links: 67%; 82%
    Viewpoint                            Lib In Links  Cons In Links  Total In Links  %Lib  %Cons
    dailykos.com                         292           46             338             86%   14%
    www.talkingpointsmemo.com            242           22             264             92%   8%
    atrios.blogspot.com                  230           39             269             86%   14%
    www.washingtonmonthly.com            165           36             201             82%   18%
    www.wonkette.com                     83            30             113             73%   27%
    www.juancole.com                     149           16             165             90%   10%
    yglesias.typepad.com/matthew         104           24             128             81%   19%
    www.crookedtimber.org                81            19             100             81%   19%
    www.mydd.com                         107           8              115             93%   7%
    www.oliverwillis.com                 97            20             117             83%   17%
    blog.johnkerry.com                   21            2              23              91%   9%
    www.pandagon.net                     118           5              123             96%   4%
    www.talkleft.com                     126           15             141             89%   11%
    digbysblog.blogspot.com              115           3              118             97%   3%
    www.politicalwire.com                87            16             103             84%   16%
    www.j-bradford-delong.net            98            11             109             90%   10%
    www.prospect.org/weblog              102           11             113             90%   10%
    americablog.blogspot.com             64            5              69              93%   7%
    www.theleftcoaster.com               78            4              82              95%   5%
    www.jameswolcott.com                 74            6              80              93%   8%
    Total Liberal                        2433          338            2771            88%   12%
    www.powerlineblog.com                26            195            221             12%   88%
    instapundit.com                      43            234            277             16%   84%
    www.littlegreenfootballs.com/weblog  10            171            181             6%    94%
    www.hughhewitt.com                   11            146            157             7%    93%
    www.andrewsullivan.com/index.php     59            86             145             41%   59%
    www.captainsquartersblog.com/mt      5             117            122             4%    96%
    www.wizbangblog.com                  14            125            139             10%   90%
    www.indcjournal.com                  6             60             66              9%    91%
    www.michellemalkin.com               10            191            201             5%    95%
    blogsforbush.com                     4             208            212             2%    98%
    www.allahpundit.com                  2             37             39              5%    95%
    belmontclub.blogspot.com             3             93             96              3%    97%
    realclearpolitics.com                13            104            117             11%   89%
    volokh.com                           27            80             107             25%   75%
    timblair.spleenville.com             7             80             87              8%    92%
    windsofchange.net                    16            65             81              20%   80%
    www.vodkapundit.com                  9             97             106             8%    92%
    www.rogerlsimon.com                  6             74             80              8%    93%
    www.deanesmay.com                    8             79             87              9%    91%
    mypetjawa.mu.nu                      0             51             51              0%    100%
    Total Conservative                   279           2293           2572            11%   89%
  • 79. Social Networks: Political Blogs - Metrics
    Measure: Liberal; Conservative; Total
    – Density: 0.02; 0.03; 0.01
    – No Components: 188; 107; 268
    – Largest Comp: 569; 569; 1222
    – Largest Comp%: 75.10; 84.97; 82.01
    – Min Deg: 0; 0; 0
    – Max Deg: 305; 296; 351
    – Aver Deg: 19.26; 21.42; 22.44
    – Deg Central: 0.38; 0.38; 0.22
    – Diameter: 6; 7; 8
    – Aver Dist: 2.51; 2.51; 2.74
    – Betw Cent: 0.10; 0.16; 0.06
    – EigVect Cent: 0.23; 0.26; 0.22
    – Clust Coeff: 0.31; 0.20; 0.22
  • 80. Social Network Analysis: Political Blogs – Empirical Comparisons
    Network (type): Number of Nodes; Average Degree; Average Path Length; Diameter of the Largest Component; Overall Clustering; Fraction of Nodes in Largest Component
    – Political (Blog): 1,490; 22.4; 2.74; 8; 0.22; 0.82
    – Biology (Citation): 1,520,521; 15.5; 4.9; 24; 0.09; 0.92
    – Economics (Citation): 81,217; 1.7; 9.5; 29; 0.16; 0.41
    – Math (Citation): 253,339; 3.9; 7.6; 27; 0.15; 0.82
    – Physics (Citation): 52,909; 9.3; 6.2; 20; 0.45; 0.85
    – UK Journalists (Twitter): 523; 88; 1.88; 4; 0.41; 0.99
    – Egypt Twitter (Twitter): 25,178; 3; 5.4; 22; 0.004; 0.7
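  • 81. Social Network Analysis: Political Blogs – Model Comparisons
    Measure: Political Blogs; then the 2.5% and 97.5% values for each model (Bernoulli; Deg Conditional; Small World; Pref Attachment)
    – Number of Components: 268; Bernoulli 1, 1; Deg Conditional 1, 1; Small World 1, 1; Pref Attachment 96, 134
    – Fraction of Nodes in Largest Component: 0.82; Bernoulli 100, 100; Deg Conditional 100, 100; Small World 100, 100; Pref Attachment 0.91, 0.94
    – Diameter of the Largest Component: 8; Bernoulli 4, 4; Deg Conditional 7, 9; Small World 4, 5; Pref Attachment 7, 9
    – Average Path Length: 2.74; Bernoulli 2.61, 2.63; Deg Conditional 3.29, 3.36; Small World 2.98, 3.01; Pref Attachment 3.07, 3.14
    – Overall Clustering: 0.226; Bernoulli 0.017, 0.018; Deg Conditional 0.029, 0.031; Small World 0.355, 0.372; Pref Attachment 0.095, 0.109
    – Betweenness: 0.065; Bernoulli 0.002, 0.003; Deg Conditional 0.010, 0.021; Small World 0.003, 0.004; Pref Attachment 0.038, 0.064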
  • 82. Social Network Analysis: Bernoulli
    • Fixed number of nodes and lines
      – Sets density and average degree = density * (n-1)
    • Assigns lines to each pair of nodes independently with fixed probabilities
      – Each line a random binary variable
    • Produces Poisson degree distribution
    • Small diameter
      – ln(#nodes)/ln(aver.degree)
    • Low clustering
      – Average degree/(#nodes-1)
    • Large component
      – E.g. at aver.degree of 1.5, ~50% in largest component
    Measure: Bernoulli
    – Vertices: 1000
    – Edges: 11149
    – Density: 0.022
    – Average Degree: 22.900
    – Average Distance: 2.570
    – Diameter: 4
    – Degree Centralization: 0.016
    – Betweenness Centralization: 0.003
    – Closeness Centralization: 0.066
    – EigenVector Centralization: 0.034
    – Cluster Coefficient: 0.021
    – Number of Components: 1
    – Size of Largest Component: 1000
    – % in Largest Component: 100%
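A Bernoulli random graph of the kind described above can be sketched with the standard library alone. This is an illustrative G(n, p) generator under stated assumptions, not the software used for the table: the function name `bernoulli_graph`, the seed, and the choice p = 0.022 (taken from the reported density) are all assumptions, and a different random draw will give slightly different numbers.

```python
import math
import random
from itertools import combinations


def bernoulli_graph(n, p, seed=None):
    """Undirected graph in which each pair of nodes is linked with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:  # each line is an independent binary variable
            adj[u].add(v)
            adj[v].add(u)
    return adj


n, p = 1000, 0.022
graph = bernoulli_graph(n, p, seed=45)

edges = sum(len(nbrs) for nbrs in graph.values()) // 2
density = 2 * edges / (n * (n - 1))
average_degree = 2 * edges / n

print("edges:", edges)
print("density:", round(density, 4))
print("average degree:", round(average_degree, 2))
# Identity from the slide: average degree = density * (n - 1)
print("density * (n - 1):", round(density * (n - 1), 2))
# Rule-of-thumb estimates from the slide
print("ln(#nodes)/ln(aver.degree):", round(math.log(n) / math.log(average_degree), 2))
print("expected clustering:", round(average_degree / (n - 1), 4))
```

The expected clustering equals the density because, with independent lines, two neighbors of a node are linked with the same probability as any other pair.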
  • 83. Social Network Analysis: Small World
    • Fixed number of nodes and the number (r) of nearby neighbors to which each vertex is linked
      – Implies strong transitive ties
      – Ensures higher clustering ~ (3r-3)/(4r-2)
    • Probabilistically rewires selected lines from one vertex to another (each line and vertex has equal probability of being selected).
      – Ensures low average path length
    Measure: Small World
    – Vertices: 1000
    – Edges: 11000
    – Density: 0.022
    – Average Degree: 22.000
    – Average Distance: 3.070
    – Diameter: 5.000
    – Degree Centralization: 0.007
    – Betweenness Centralization: 0.007
    – Closeness Centralization: 0.079
    – EigenVector Centralization: 0.031
    – Cluster Coefficient: 0.514
    – Number of Components: 1
    – Size of Largest Component: 1000
    – % in Largest Component: 100%
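The two steps of the small-world model can be sketched as a ring lattice followed by random rewiring. This is a Watts-Strogatz-style illustration, not the generator behind the table. It assumes r counts the neighbors on each side of a vertex (so each vertex starts with 2r links), which is the reading under which the lattice clustering equals (3r-3)/(4r-2). The function names and the rewiring probability are hypothetical.

```python
import random
from itertools import combinations


def ring_lattice(n, r):
    """Ring of n nodes, each linked to its r nearest neighbors on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, r + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering; nodes with fewer than two neighbors count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)


def rewire(adj, p, seed=None):
    """Move one end of each original line to a random vertex with probability p."""
    rng = random.Random(seed)
    n = len(adj)
    new = {u: set(nbrs) for u, nbrs in adj.items()}
    for u in range(n):
        for v in sorted(adj[u]):
            if u < v and rng.random() < p:
                # Avoid self-loops and duplicate lines.
                candidates = [w for w in range(n) if w != u and w not in new[u]]
                if not candidates:
                    continue
                w = rng.choice(candidates)
                new[u].discard(v)
                new[v].discard(u)
                new[u].add(w)
                new[w].add(u)
    return new


n, r = 100, 3
lattice = ring_lattice(n, r)
small_world = rewire(lattice, p=0.1, seed=45)
print("formula:", (3 * r - 3) / (4 * r - 2))
print("lattice clustering:", average_clustering(lattice))
print("rewired clustering:", average_clustering(small_world))
```

Rewiring keeps the number of lines fixed, so density and average degree are unchanged while clustering falls and shortcuts across the ring shorten the average path.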
  • 84. Social Network Analysis: Preferential Attachment
    • Vast majority of new nodes link to nodes with proportionately higher degrees
      – “Rich get richer”
    • Probabilistic part involves the selection of vertices for new lines (e.g. the probability of choosing an end vertex for a new line is proportional to the degree of that vertex)
    • Tend to have a small world structure
    • Exhibit long-tailed degree distributions
      – Right-hand tail of the distribution follows a “scale free” power-law distribution
      – Log-log graph is a straight line
      – Ensures low average path length
    Measure: Preferential
    – Vertices: 1000
    – Edges: 11242
    – Density: 0.019
    – Average Degree: 18.770
    – Average Distance: 2.690
    – Diameter: 7
    – Degree Centralization: 0.154
    – Betweenness Centralization: 0.037
    – Closeness Centralization: -
    – EigenVector Centralization: 0.188
    – Cluster Coefficient: 0.192
    – Number of Components: 92
    – Size of Largest Component: 908
    – % in Largest Component: 91%
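The "rich get richer" rule can be sketched with a Barabási-Albert-style growth process. This is a swapped-in textbook variant for illustration: the slide does not say which generator produced the table, and this variant always yields a single connected component, unlike the 92 components reported there. The function name `preferential_attachment` and the parameter m (lines added per new node) are hypothetical.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow an undirected graph where new nodes prefer high-degree targets.

    Starts from a clique of m + 1 nodes; every later node adds m lines whose
    end vertices are drawn with probability proportional to current degree.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Each node appears in `endpoints` once per incident line, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = []
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            adj[new].add(target)
            adj[target].add(new)
            endpoints += [new, target]
    return adj


graph = preferential_attachment(1000, 11, seed=45)
degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
print("edges:", sum(degrees) // 2)
print("average degree:", sum(degrees) / len(degrees))
print("five largest degrees:", degrees[:5])
```

The few very large degrees next to a modest average are the long right-hand tail that the slide describes.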