Social Media Monitoring & Insights


          Semeon White Paper
            Alkis Papadopoullos, DTI



                   May 2011




                © Semeon 2011




Can Social Media Monitoring achieve the
                        ambitious results promoted by vendors today?
                        The answer depends greatly on the degree to which existing
                        solutions or systems are weighted towards analysis and insights
                        versus monitoring.



                        A Social Mining Primer
As evidenced by the reports published regularly by the Altimeter
Group, Forrester’s Listening Platform research reports and others,
and by the rapidly growing number of Social Media Monitoring
companies, it is safe to assume that we have passed the stage of
asking whether monitoring conversations on the web is a
nice-to-have or an outright necessity. Basic usage figures (2010)
illustrate how prevalent the web, and social media in particular,
have become:

      Wikipedia – more than 3.5 million subjects in English
      YouTube – more than 100 million videos
      Blogs – more than 150 million
      Twitter – more than 175 million users
      Websites – more than 250 million
      Facebook – more than 500 million users
      Internet users – approximately 2 billion

An organization wishing to engage more directly with its
customers, communities, members, constituents or political
groups can succeed by extracting key concepts and thoughts
contained in their dialogs and contributions in multiple web
media. This can be done through the following process:

First, gather data that already exists on the web and sift the poor
or irrelevant content from that which actually carries useful
information. Then extract key concepts and thoughts from that
content and determine as efficiently as possible what call to
action, if any, is required. Finally, route the call to action to the
appropriate department, division or resource in a timely fashion.
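
The process above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Post` type, the spam markers, the routing table and the department names are all hypothetical, and the keyword lookup is a deliberately simplistic stand-in for real concept extraction.

```python
from dataclasses import dataclass


@dataclass
class Post:
    source: str  # where the post was gathered from (hypothetical label)
    text: str


# Hypothetical routing table: concept -> department that should act on it.
ROUTES = {"billing": "Finance", "delivery": "Logistics", "bug": "Engineering"}
# Hypothetical spam markers used to sift out poor content.
SPAM_MARKERS = ("buy now", "click here")


def is_relevant(post):
    """Sift step: drop very short posts and obvious spam."""
    text = post.text.lower()
    return len(text.split()) >= 3 and not any(m in text for m in SPAM_MARKERS)


def extract_concepts(post):
    """Extraction step: a toy keyword lookup, not semantic analysis."""
    words = {w.strip(".,!?").lower() for w in post.text.split()}
    return sorted(words.intersection(ROUTES))


def route(posts):
    """Routing step: return (department, concept, source) calls to action."""
    actions = []
    for post in posts:
        if not is_relevant(post):
            continue
        for concept in extract_concepts(post):
            actions.append((ROUTES[concept], concept, post.source))
    return actions


posts = [
    Post("forum", "The delivery was late again"),
    Post("blog", "Buy now click here billing"),
    Post("tweet", "ok"),
]
print(route(posts))
```

Only the first post survives the sift step here; the second is treated as spam and the third is too short to carry useful information.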




In this regard, market expectations have matured as more social
                          media monitoring products have appeared.

The typical features expected from such products are:

      Central, comprehensive management of brands,
       competitors and topic tags.

      Restriction by languages and countries.

      Comprehensive sources (many tools do not scan all
       sources and eliminate tweets).

      Dashboards with Key Success Indicators (KSIs) as well as
       filters and comparison options (mentions, reach, share of
       voice, etc.).

      Sentiment analysis, demographic information where
       possible, identification of influencers and important topics.

      Filterable lists of contributions with information on range,
       user profiles etc.

      Historic data for retrospective analyses.

      Workflows and pragmatic interfaces for existing CRM
       systems.

      Direct engagement options.

      E-mail alerts (particularly in the context of crisis
       management).


Beyond the above requirements is the ability to successfully
dissect the glut of feedback in order to quickly gain real insight
into the nature of problems or feedback reported, and to automate
relevant calls to action specific to each problem discovered.
Successfully mining this vast set of data is key, but approaches to
extracting information from this data vary significantly. Particular
attention is required when proceeding with what ultimately
remains qualitative analysis of open sources that are subject to
spam, offensive material and outright gaming.




Vetting of sources

                          Identification of key domains or websites where a company or
                          organization believes there is useful interaction between people
                          will help to ensure that content that is being analyzed is truly
                          useful. This can include customer review sites, online communities
                          or forums that are hosted by the company itself, mainstream news
                          and blog sites, technical forums (where applicable) and of course
                          prevalent social networks and micro-blogs, though these must be
                          subjected to spam and offensive content filtering prior to analysis.


                          Content relevance

                          It is all about the search results. No matter how one slices the
                          problem, in order to retrieve the data you are going to be
                          analyzing, you will be writing a query or series of queries in order
                          to get at the relevant data. Ultimately, your analysis can only be as
                          good as the data you are analyzing. If the underlying Search
                          Engine and relevance assessment models are not perfect, your
                          analysis will suffer significantly, and reliable insights and truly
                          actionable information will be difficult to obtain at best.


                          Sentiment analysis reliability

Truly accurate sentiment analysis continues to be elusive for
several very good reasons. As a Natural Language Processing
exercise, sentiment analysis must stand upon the foundation of a
number of prerequisite steps, none of which is simple to achieve
without some error rate. As we move through each step, these
error rates compound. Key steps are:

      Identify the language of the text to be analyzed.

      Remove boilerplate and unwanted noise, and extract the
       structure of actual user posts, articles or blogs.

      Identify (when applicable) within this structure,
       paragraphs and sentences.

      Separate sentences into their individual concepts.

      Identify the grammatical categories and sentence
       components such as subject group, noun-phrases,
       adjectives, verbs, adverbs, etc.

      Identify stylistic features such as punctuation.

      Extract and tag key concepts (semantic analysis as
       opposed to simpler key word extraction).

      Identify sentiment or emotion markers.

      Categorize the text being analyzed into categories that
       correspond to the most important and prevalent semantic
       tags.

      Classify sentences according to sentiment.
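
To make the compounding of error rates concrete, here is a small worked example. It assumes, purely for illustration, that each of the ten steps above is correct 95% of the time and that errors at different steps are independent; neither figure comes from measurement.

```python
import math


def pipeline_accuracy(step_accuracies):
    """Overall accuracy when every step must succeed and errors are independent."""
    return math.prod(step_accuracies)


# Illustrative assumption: ten steps, each 95% accurate.
steps = [0.95] * 10
overall = pipeline_accuracy(steps)
print(f"{overall:.3f}")  # prints 0.599
```

Under these assumptions, a chain of individually respectable steps delivers a correct end result only about 60% of the time, which is why the quality of each prerequisite step matters so much.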



                           Semeon’s Social Media Monitoring & Insights
Contrary to the bulk of vendors in the Social Media Monitoring
space, Semeon has taken a natural language processing approach.
This combines semantic analysis based on industry-specific
dictionaries with machine learning algorithms that attempt to
identify meaning and present results in context. We believe this
is the only way to enable our customers to reliably mine data for
key concepts and thus serve a wide range of opinion mining needs.
More importantly, it helps reduce the time that must be spent
analyzing data in order to draw reliable conclusions and obtain
actionable information. Many opinion mining and social media
monitoring solutions available today do a good job of collecting
data and monitoring for occurrences of key words (brand names,
product names, etc.), but they leave the end-user with the task of
manually figuring out how to proceed with analysis and, from this
flood of data, attempting to understand what the appropriate calls
to action are within their organization.


Productivity and process

Semeon’s Social Media Monitoring and Insights has been designed
and developed not only to ensure that users can collect the data,
but also to help shave costly hours off the more complex data
analysis phase. This solution can thus help users gain actionable
insight from analysis of feedback about their products and/or
services. It also enables them to monitor the impact of recent
public relations and marketing campaigns, and to pursue targeted
intervention tactics in order to redress perceptions if and when
they or their products are subject to criticism online. In order to
do so, Semeon’s Social Media Monitoring and Insights completes a
series of steps that make it possible to get to the most important
concepts found in individual users’ feedback. Equating individual
user feedback with a document, it is necessary to:

      Accurately identify the language of feedback if data is from
       open sources.

      Remove boilerplate for web content and effectively extract
       only the actual text written up by a user.

      Identify document structure, so as to separate out
       individual paragraphs and, more importantly, determine
       sentence boundaries within paragraphs.

      Tokenize sentences (separation of sentences into their
       individual words).

      Proceed with partial syntactic analysis and part-of-speech
       identification (identification of grammatical categories
       such as subject group, noun-phrases, adjectives, verbs,
       adverbs, etc.).

      Identify stylistic features (titles, text formatting,
       accentuation through punctuation, etc.).

      Determine the semantic descriptor of each document.

      Determine the sentiment associated with each sentence.

      Identify important links between semantic metadata and
       sentiment metadata.

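
A few of the steps listed above (sentence boundary detection, tokenization and per-sentence sentiment) can be sketched in a toy form. This is not Semeon's method, which is described as proprietary; the regular expressions and the tiny `POSITIVE` and `NEGATIVE` word lists are hypothetical stand-ins chosen only to show how the steps chain together.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "hate"}


def split_sentences(text):
    """Naive boundary rule: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Separate a sentence into lowercase word tokens."""
    return re.findall(r"[A-Za-z']+", sentence.lower())


def sentiment(tokens):
    """Count lexicon hits; real systems need far more than word lists."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(text):
    """Return (sentence, sentiment) pairs for one piece of user feedback."""
    return [(s, sentiment(tokenize(s))) for s in split_sentences(text)]


for sentence, label in analyze("I love the new app. Support is slow! It arrived Monday."):
    print(label, "|", sentence)
```

Even this toy version shows where errors creep in: an abbreviation such as "etc." would trigger a false sentence boundary, and every later step would inherit that mistake.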
The end result of these steps is a clean set of sentences that are
tagged as belonging to a given instance of user feedback, with
relevant concepts, entities and sentiment extracted for further
analysis by our Social Media Monitoring and Insights product.
Where available, additional metadata associated with user
feedback is stored along with the above, specifically:

      Language.

      Date and time stamp.

      Source of the user feedback (a URL for example).

      User identification (for example, a user’s nickname on a
       blog or forum).

      User demographics (when from Enterprise Content
       sources only and not from the web where such
       information is very sketchy at best).
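
One way to picture the stored result is as a record per sentence with optional metadata attached. The sketch below is an assumption about shape only: the class and field names are hypothetical and do not describe Semeon's actual storage format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class FeedbackMetadata:
    """Optional metadata; every field may be missing for open web sources."""
    language: Optional[str] = None
    timestamp: Optional[datetime] = None
    source: Optional[str] = None        # e.g. a URL
    user_id: Optional[str] = None       # e.g. a nickname on a blog or forum
    demographics: Optional[dict] = None  # enterprise content sources only


@dataclass
class TaggedSentence:
    """One clean sentence tagged as belonging to an instance of user feedback."""
    feedback_id: str
    text: str
    concepts: list = field(default_factory=list)
    entities: list = field(default_factory=list)
    sentiment: str = "neutral"
    metadata: FeedbackMetadata = field(default_factory=FeedbackMetadata)


record = TaggedSentence(
    feedback_id="fb-001",
    text="The battery drains too quickly.",
    concepts=["battery"],
    sentiment="negative",
    metadata=FeedbackMetadata(language="en", user_id="example_user"),
)
print(record.sentiment, record.metadata.language)
```

Keeping every metadata field optional reflects the "where available" qualification above: a record remains usable even when the source exposes nothing beyond the text itself.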

The identification of Key Concepts, Named Entities and Emotion
Markers uses a combination of proprietary unsupervised and
semi-supervised machine-learning techniques in order to classify
sentences into categories that are as precise as possible. The
subsequent assignment of a sentiment flag to each sentence also
relies on proprietary machine learning algorithms. Successive
pruning of concepts based on relevance feedback from the
customers themselves, as well as filtering for topics specific to
the industry or sub-industry of interest based on industry-specific
dictionaries, helps to ensure that only the most salient topics
emerge, accompanied by an assessment of how emotional the
people who have written about these topics are.



                           The value is in insights, not mining

Why go to these lengths, when we could more simply limit our
efforts to identifying keyword patterns in social media feeds? We
believe the steps outlined above make Semeon’s relevance
assessment model the best suited to text-based information, and
they ensure that clients can get to the most interesting nuggets of
information from user feedback.

When clients set up feedback studies, the system uses our
relevance assessment model to help them pre-analyze user
feedback for Key Concepts, Named Entities and Sentiment, to
ensure that they are retrieving the most useful and relevant
information about the topic or topics they have chosen to study.
They can then obtain as complete and relevant a collection of user
feedback as possible for the topic at hand, and subsequent
insights will be that much more interesting and useful.


Categories, Concepts and Entities

In retrieving information, the system proceeds with a number of
pre-computations that attempt to establish links between Key
Concepts, People names, Organization names, and corresponding
sentiment. The most relevant and prevalent of the above are
labeled as categories, and these help to guide customers through
the analysis steps. We then cross-reference categories with other
categories to establish the context. This helps individual users
gain insight into:

       Who the main contributors are

       Where their expertise lies

       What the sentiment is, so as to determine whether the
        feedback is a complaint or praise for the topic at hand.

 Figure 1 shows how the most prevalent categories are presented
with the terms or concepts that describe why they are positive or
negative and what specifically is being praised or criticized. Figure
2 illustrates the breakdown of users per category, giving clear
indication of the main ideas and concepts evoked by each user, the
frequency of their posts or write-ups and recent distribution of
sentiment.
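
The kind of cross-referencing described here can be sketched as a nested tally of category, user and sentiment. The function names, the input tuple layout and the sample records are hypothetical; the sketch only illustrates the idea of breaking down users per category with their sentiment distribution.

```python
from collections import Counter, defaultdict


def cross_reference(records):
    """Tally sentiment per user within each category.

    records: iterable of (user, category, sentiment) tuples.
    """
    table = defaultdict(lambda: defaultdict(Counter))
    for user, category, sentiment in records:
        table[category][user][sentiment] += 1
    return table


def main_contributors(table, category, top=3):
    """Users with the most posts in a category, most active first."""
    totals = Counter({u: sum(c.values()) for u, c in table[category].items()})
    return [user for user, _ in totals.most_common(top)]


# Hypothetical sample data.
records = [
    ("ana", "battery", "negative"),
    ("ana", "battery", "negative"),
    ("bo", "battery", "positive"),
    ("bo", "screen", "positive"),
]
table = cross_reference(records)
print(main_contributors(table, "battery"))
print(dict(table["battery"]["ana"]))
```

From such a table one can read off who contributes most to a category and whether their posts lean toward complaint or praise.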




Figure 1: Categories within negative or positive context




Figure 2: Categories cross-referenced with influencers

                         Conclusion: do not settle for monitoring

This ability to show the context of the topics evoked is critical. It
depends on the detailed natural language processing that
Semeon’s Social Media Monitoring & Insights brings to the table.
This makes it feasible for customers to draw actionable
conclusions then and there, rather than spend time sifting
through data that is difficult to read and interpret because it is
often limited to context-less keyword tracking. With the approach
developed at Semeon, we are striving to help customers reduce the
most time-consuming part of social media monitoring projects,
namely the analysis step. Our proprietary algorithms, rooted in
robust semantic relevance assessment and deriving the context of
all categories (Key Concepts, People, Organizations, etc.), are
designed to help obtain answers to questions, gain insight where
insight is elusive and ultimately save you time and money. Why
settle for monitoring only when, with Semeon, you can also get to
actionable insights from all user feedback?




                                                                                            9

Weitere ähnliche Inhalte

Empfohlen

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Empfohlen (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Social media-monitoring-insights-white-paper-semeon 2013-03_12_04

  • 1. Social Media Monitoring & Insights Semeon White Paper Alkis Papadopoullos, DTI May 2011 © Semeon 2011 1
  • 2. Can Social Media Monitoring achieve the ambitious results promoted by vendors today? The answer depends greatly on the degree to which existing solutions or systems are weighted towards analysis and insights versus monitoring. A Social Mining Primer As evidenced by reports published regularly by the Altimeter Group, Forrester’s Listening Platform, research reports, etc., and the rapidly growing number of Social Media Monitoring companies, it is safe to assume that we have passed the stage of asking whether or not monitoring conversations on the web are a nice-to-have or an outright necessity. Basic usage figures (2010) illustrate eloquently how prevalent the web and social media in particular have become:  Wikipedia – more than 3.5 million subjects in English  YouTube – more than 100 million videos  Blogs – more than 150 million …we have passed the  Twitter – more than 175 million users stage of asking  Websites – more than 250 million whether or not  Facebook – more than 500 million users monitoring  Internet users – approximately 2 billion conversations on the web are a nice-to- An organization wishing to engage more directly with its have or an outright customers, communities, members, constituents or political groups can succeed by extracting key concepts and thoughts necessity. contained in their dialogs and contributions in multiple web media. This can be done through the following process: First gather data that already exists on the web, sift the poor or irrelevant content from that which actually carries useful information. Then successfully extract key concepts and thoughts from said content and determine as efficiently as possible what call to action, if any, is required. Finally, route the call to action to the appropriate department, division or resource in a timely fashion. 2
  • 3. In this regard, market expectations have matured as more social media monitoring products have appeared. The typical features expected from such products are:  Central, comprehensive management of brands, competitors and topic tags. Beyond [...]  Restriction by languages and countries. requirements, is the  Comprehensive sources (many tools do not scan all need to provide true sources and eliminate tweets). ability to dissect the  Dashboards with Key Success Indicators (KSIs) as well as glut of feedback, in filters and comparison options (mentions, reach, share of order to rapidly gain voice, etc.). real insight into the  Sentiment analysis, demographic information where nature of problems or possible, identification of influencers and important topics. feedback reported.  Filterable lists of contributions with information on range, user profiles etc.  Historic data for retrospective analyses.  Workflows and pragmatic interfaces for existing CRM systems.  Direct engagement options.  E-mail alerts (particularly in the context of crisis management). Beyond the above requirements, is the ability to successfully dissect the glut of feedback, in order to quickly gain real insight into the nature of problems or feedback reported, and automate relevant calls to action specific for each problem discovered. Successfully mining this vast set of data is key, but approaches to extracting information from this data vary significantly. Particular attention is required when proceeding with what ultimately remains qualitative analysis from open sources that are subject to spam, offensive material and outright gaming. 3
  • 4. Vetting of sources

    Identifying the key domains or websites where a company or organization believes there is useful interaction between people helps to ensure that the content being analyzed is truly useful. This can include customer review sites, online communities or forums hosted by the company itself, mainstream news and blog sites, technical forums (where applicable) and of course prevalent social networks and micro-blogs, though these must be subjected to spam and offensive content filtering prior to analysis.

    Content relevance

    It is all about the search results. No matter how one slices the problem, retrieving the data you are going to analyze means writing a query or a series of queries to get at the relevant data. Ultimately, your analysis can only be as good as the data you are analyzing. If the underlying search engine and relevance assessment models fall short, your analysis will suffer significantly, and reliable insights and truly actionable information will be difficult to obtain at best.

    Sentiment analysis reliability

    Truly accurate sentiment analysis continues to be elusive for several very good reasons. As a Natural Language Processing exercise, sentiment analysis must stand upon the foundation of a number of prerequisite steps, each of which carries its own error rate. As we move through each step, these error rates compound. As a purely illustrative figure, if each of the ten steps below were 95% accurate and their errors were independent, only about 60% of sentences would come through all ten steps correctly (0.95^10 ≈ 0.60). Key steps are:

      • Identify the language of the text to be analyzed.
      • Remove boilerplate and unwanted noise, and extract the structure of actual user posts, articles or blogs.
      • Identify, when applicable, the paragraphs and sentences within this structure.
      • Separate sentences into their individual concepts.
      • Identify the grammatical categories and sentence components such as subject group, noun-phrases, adjectives, verbs, adverbs, etc.
      • Identify stylistic features such as punctuation.
      • Extract and tag key concepts (semantic analysis as opposed to simpler key word extraction).
      • Identify sentiment or emotion markers.
      • Categorize the text being analyzed into categories that correspond to the most important and prevalent semantic tags.
      • Classify sentences according to sentiment.

  • 5. Semeon’s Social Media Monitoring & Insights

    Contrary to the bulk of vendors in the Social Media Monitoring space, Semeon has taken a natural language processing approach. This combines semantic analysis based on industry-specific dictionaries with machine learning algorithms that attempt to identify meaning and present results in context. We believe this is the only way to enable our customers to reliably mine data for key concepts and thus serve a wide range of opinion mining needs and, much more importantly, to reduce the time that must be spent analyzing data in order to draw reliable conclusions and obtain actionable information.

    Many opinion mining and social media monitoring solutions available today do a good job of collecting data and monitoring for occurrences of key words (brand names, product names, etc.), but they leave the end user with the task of manually working out how to proceed with the analysis and of determining, from this flood of data, what the appropriate calls to action are within their organization.

    Productivity and process

    Semeon’s Social Media Monitoring & Insights has been designed and developed not only to ensure that users can collect the data, but also to assist in shaving costly hours off the more complex data analysis phase. This solution can thus help users gain actionable insight from analysis of feedback about their products and/or services. It also enables them to monitor the impact of recent public relations and marketing campaigns, and to pursue targeted intervention tactics in order to redress perceptions if and when they or their products are subject to criticism online.

  • 6. In order to do so, Semeon’s Social Media Monitoring & Insights completes a series of steps that make it possible to get to the most important concepts found in individual users’ feedback. Equating individual user feedback with a document, it is necessary to:

      • Accurately identify the language of the feedback if the data is from open sources.
      • Remove boilerplate from web content and extract only the text actually written by a user.
      • Identify document structure, so as to separate out individual paragraphs and, more importantly, determine sentence boundaries within paragraphs.
      • Tokenize sentences (separate each sentence into its individual words).
      • Proceed with partial syntactic analysis and part-of-speech identification (identification of grammatical categories such as subject group, noun-phrases, adjectives, verbs, adverbs, etc.).
      • Identify stylistic features (titles, text formatting, accentuation through punctuation, etc.).
      • Determine the semantic descriptor of each document.
      • Determine the sentiment associated with each sentence.
      • Identify important links between semantic metadata and sentiment metadata.

    The end result of these steps is a clean set of sentences that are tagged as belonging to a given instance of user feedback, with relevant concepts, entities and sentiment extracted for further analysis by our Social Media Monitoring & Insights product. Where available, additional metadata associated with user feedback is stored along with the above, specifically:

      • Language.
      • Date and time stamp.
      • Source of the user feedback (a URL, for example).
      • User identification (for example, a user’s nickname on a blog or forum).
      • User demographics (only when the feedback comes from Enterprise Content sources, not from the web, where such information is very sketchy at best).

  • 7. The identification of Key Concepts, Named Entities and Emotion Markers uses a combination of proprietary unsupervised and semi-supervised machine-learning techniques in order to classify sentences into categories that are as precise as possible. The subsequent assignment of a sentiment flag to each sentence also relies on proprietary machine learning algorithms. Successive pruning of concepts based on relevance feedback from the customers themselves, as well as filtering for topics specific to the industry or sub-industry of interest based on industry-specific dictionaries, helps to ensure that only the most salient topics emerge, accompanied by an assessment of how emotional the people who have written about these topics are.

    The value is in insights, not mining

    Why go to these lengths when we could more simply limit our efforts to identifying keyword patterns in social media feeds? The steps outlined above are what make Semeon’s relevance assessment model, in our view, the best applicable to text-based information, and they ensure that clients can get to the most interesting nuggets of information from user feedback.

    When clients set up feedback studies, the system uses this relevance assessment model to help them pre-analyze user feedback for Key Concepts, Named Entities and Sentiment, to ensure that they are retrieving the most useful and relevant information about the topic or topics they have chosen to study. They can then obtain as complete and relevant a collection of user feedback as possible for the topic at hand, and subsequent insights will be that much more interesting and useful.

    Categories, Concepts and Entities

    In retrieving information, the system proceeds with a number of pre-computations that attempt to establish links between Key Concepts, People names, Organization names and the corresponding sentiment. The most relevant and prevalent of these are labeled as categories, and these help to guide customers through the analysis steps.

  • 8. We then cross-reference categories with other categories to establish the context. This helps individual users gain insight into:

      • Who the main contributors are.
      • Where their expertise lies.
      • What the sentiment is, so as to determine whether the feedback is a complaint about the topic at hand or praise for it.

    Figure 1 shows how the most prevalent categories are presented with the terms or concepts that describe why they are positive or negative and what specifically is being praised or criticized. Figure 2 illustrates the breakdown of users per category, giving a clear indication of the main ideas and concepts evoked by each user, the frequency of their posts or write-ups and the recent distribution of sentiment.

    Figure 1: Categories within negative or positive context
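To make the cross-referencing described above more tangible, here is a minimal sketch of how tagged sentences and their stored metadata might be tabulated by category, sentiment and contributor. It is not Semeon's proprietary method: the `TaggedSentence` type, its field names, the sentiment labels and the sample data are all assumptions for illustration, and the categories and sentiment flags are taken as already assigned by earlier steps.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaggedSentence:
    """One analyzed sentence plus metadata (field names are assumed)."""
    text: str
    category: str               # most relevant concept or entity label
    sentiment: str              # "positive", "negative" or "neutral"
    language: str
    timestamp: str              # date and time stamp, ISO 8601 here
    source: str                 # e.g. a URL
    user: Optional[str] = None  # nickname, when available


def sentiment_by_category(sentences):
    """Cross-reference categories with sentiment: category -> label counts."""
    table = defaultdict(Counter)
    for s in sentences:
        table[s.category][s.sentiment] += 1
    return table


def contributors_by_category(sentences):
    """Cross-reference categories with contributors: category -> user counts."""
    table = defaultdict(Counter)
    for s in sentences:
        if s.user is not None:
            table[s.category][s.user] += 1
    return table


# Hypothetical, hand-tagged sample data.
SAMPLE = [
    TaggedSentence("Battery dies in two hours.", "battery", "negative",
                   "en", "2011-05-02T10:15:00", "http://example.com/f/1", "ana"),
    TaggedSentence("Still no fix for the battery drain.", "battery", "negative",
                   "en", "2011-05-03T08:40:00", "http://example.com/f/2", "ana"),
    TaggedSentence("Battery life is fine for me.", "battery", "positive",
                   "en", "2011-05-03T09:05:00", "http://example.com/f/3", "li"),
    TaggedSentence("The screen is gorgeous.", "screen", "positive",
                   "en", "2011-05-04T12:00:00", "http://example.com/b/9", None),
]

if __name__ == "__main__":
    for category, counts in sentiment_by_category(SAMPLE).items():
        print(category, dict(counts))
    for category, users in contributors_by_category(SAMPLE).items():
        print(category, users.most_common(1))
```

The first table corresponds loosely to the view of categories in a negative or positive context, and the second to the breakdown of users per category; anonymous feedback still counts towards sentiment but cannot be attributed to a contributor.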
  • 9. Figure 2: Categories cross-referenced with influencers

    Conclusion: do not settle for monitoring

    This ability to show the context of the topics evoked is critical. It depends on the detailed natural language processing that Semeon’s Social Media Monitoring & Insights brings to the table. This makes it feasible for customers to draw actionable conclusions then and there, rather than spend time sifting through data that is difficult to read and interpret because it is often limited to context-less keyword tracking.

    With the approach developed at Semeon, we are striving to help customers shorten the most time-consuming part of social media monitoring projects, namely the analysis step. Our proprietary algorithms, rooted in robust semantic relevance assessment and deriving the context of all categories (Key Concepts, People, Organizations, etc.), are designed to help obtain answers to questions, gain insight where insight is elusive and ultimately save you time and money.

    Why settle for monitoring only when, with Semeon, you can also get to actionable insights from all user feedback?