I D C               A N A L Y S T                         C O N N E C T I O N




                         David Schubmehl
                         Research Manager




Diving Deep Outside the Firewall for Market
Research Insights
October 2012

For many enterprises, Big Data is now a mainstream concern, as evidenced by changes in
organizational structure and budgets to focus on this area. However, most enterprises have yet to tap
into the vast resource of data outside the firewall to incorporate Web-based Big Data in real time. The
Web provides a lot of data that can be useful to market research efforts, particularly if organizations
go beyond analyzing quantitative data such as statistics or demographics and look at customer
sentiment as revealed in comments on product reviews as well as posts on social networks.

The following questions were posed by Connotate to David Schubmehl, research manager at IDC, on
behalf of Connotate's customers.

Q.         How are enterprises missing out by failing to tap into the Web?

A.         The Web has become a global repository that contains over 8 billion pages of unstructured
           information ranging from news and social media to research and philosophical treatises. The
           Web is a tremendous source of information about an enterprise's prospects, customers, and
           competitors, which is why leading organizations are making heavy use of the Web as a
           research tool. Survey research indicates that global CEOs are looking to Big Data on the
           Web to understand their customers and build engagement models with their existing
           customers and prospective customers.

           Enterprises are missing out by failing to tap into the tremendous amount of social
           media information on the Web. Many organizations are beginning to understand that their
           customers are out there talking about them on the Web and on social media sites, yet they
           don't have a very good handle on how to collect all of that information. As a result, many
           companies are missing opportunities because they aren't aware of or don't understand the
           conversations — both good and bad — that are going on about them, particularly in focused
           blogs and online user group communities. By tapping into these specialized online sources
           (not just Twitter and Facebook), companies can better understand what their customers are
           saying, thinking, or looking for regarding specific products and services. Just think of all the
           product reviews that are posted on the Web. Companies can gain a lot of insight about
           customer sentiment by tapping into this information.




IDC 1390
On a similar note, organizations can make use of the wealth of competitive information on
     the Web. Competitor product data, prices, reviews, and even comparisons can be found
     on the Web. In the same manner that organizations can tap into the "voice of the customer,"
     they can also tap into their competitors' data to understand and compete more effectively.

     These are just a few examples of valuable data that is out there waiting for organizations that
     are willing to go find it and collect it.

Q.   What are the benefits — and challenges — of using Web-based data to fuel customer
     sentiment analysis in market research?

A.   The benefits and challenges of Web-based data revolve around three factors: timeliness,
     legitimacy, and aggregation. Data collected from social media sites, product review sites, and
     other sources can be very current and can even provide up-to-the-minute feedback. Still, it can
     be a challenge to collect that information as close to real time as possible and to determine
     which kinds of feedback will evolve over time and therefore be more valuable for trend
     analysis. For many organizations, trend analysis is extremely valuable and can provide
     long-term benefits.

     Legitimacy is also a major factor. Are the review and the sentiment real? Is someone posting
     something because he or she wants to share true feelings about a product, or is it a
     competitor looking to sabotage reviews? Perhaps a reviewer is being paid to say something
     positive, which could skew results, so how can an organization identify the unpaid reviews?
     All of these factors can be challenging to quantify. Finally, a wide variety of customer reviews
     and feelings need to be collected in order to accurately gauge customer sentiment, especially
     if the collection is being done automatically. Small samples can skew results and analysis.
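
The effect of sample size can be illustrated with a short sketch. It assumes each review has already been labeled positive or not, and it uses the Wilson score interval, a standard statistical technique that the interview itself does not name. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(positive: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the share of positive reviews."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = positive / total
    denom = 1 + z * z / total
    # Center of the interval is pulled toward 0.5 when the sample is small.
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half
```

With the same 80 percent share of positive reviews, a sample of 10 gives a far wider interval than a sample of 1,000, which is one way to quantify how much a small sample can mislead.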

     Most organizations would like to collect as many comments or as much information from as
     many relevant sites as possible. The problem is that the number of sites that may have
     valuable content is expanding at a tremendous rate. It's a challenge for an organization that's
     trying to collect all this information and pull it together in a way that is useful. That's why
     aggregation into a single structure is important. It's relatively easy to pull things from a Twitter
     stream or a Facebook feed, but organizations often have to contend with all of the other sites
     that are out there, and this is often where this type of data collection can become complicated.

     The fragility and the rate of change of content within the Web pose an additional challenge. Web
     sites change constantly, pages are moved or modified, and content is added or deleted on a
     regular basis. Less robust approaches to collecting Web data will "break" and cease to return
     valid output when a change is made to the target Web page. A fragile system delivers only a
     fragment of the value when Web content changes and doesn't allow for time series analytics. A
     more robust solution features resiliency to change and, in the long run, delivers higher value.
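
One simple way to add some resiliency is to try several extraction rules in order and to report a miss explicitly instead of returning invalid output. The following is a minimal sketch under stated assumptions: the HTML patterns are hypothetical examples for a single product page, not rules from any real site or product, and a production system would likely use a proper HTML parser.

```python
import re
from typing import Optional

# Hypothetical patterns, ordered from most to least specific.
PRICE_PATTERNS = [
    re.compile(r'<span class="price">\s*\$([\d,]+\.\d{2})\s*</span>'),
    re.compile(r'itemprop="price"\s+content="([\d.]+)"'),
    re.compile(r'\$([\d,]+\.\d{2})'),
]


def extract_price(html: str) -> Optional[float]:
    """Return the first price found by any pattern, or None if every pattern fails."""
    for pattern in PRICE_PATTERNS:
        match = pattern.search(html)
        if match:
            return float(match.group(1).replace(",", ""))
    # None signals that the page layout changed and the rules need attention.
    return None
```

Returning `None` rather than a guess lets the collection process log the break, so a gap in a time series is visible instead of being filled with bad values.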

Q.   What is "deep" Web data, and why is it more valuable than "surface" Web data?

A.   Deep Web data, which builds on IDC's traditional definition of Web data, is typically data that
     can't be crawled or accessed at all except through some kind of authentication process.
     A typical place for such data is in a document management system that is available via the
     Web, but only through authentication. However, many organizations now view the deep Web
     as the layers below the surface of a typical Web site. For example, the comments section of a
     Web-based ecommerce site might be buried 30 or 40 levels deep within the organization's Web
     site; some types of crawlers and aggregators wouldn't easily be able to find this type of
     information. Organizations often want to look at this information because there can be real
     value in it. What is hidden deep within the system can often reveal more insights than data at
     the surface level. The ability to ferret out all of the information contained in the deep Web will
     be more valuable to organizations than just looking at what is easily crawled at a surface level.

Q.      How can enterprises tap into deep Web data, and what are the stumbling blocks to
        doing this? What complementary technologies should they consider, and/or how can
        they simplify this process?

A.      The barriers to accessing deep Web data typically involve the inability to obtain that data
        through a standard RSS feed or a standard Twitter API screen feed. Organizations may
        collect information at this surface level, but extra processing is required, such as in the case
        of shortened URLs. There are different shortening techniques for compressing Web site
        locators into the 140-character maximum length of a Twitter or RSS stream. One approach is
        to use technology that can resolve the shortened URLs and then follow them down 20, 30, or 40
        levels, however many it takes to get at the relevant information. Technologies are
        available today that can help automate this process, and they are worthy of consideration for
        extracting value from deep Web data.

        There are also technologies that include an authentication method when a site requires
        a user ID and a password. The actual crawling is then automated in the system, as if an end
        user were pulling up the information and manipulating and extracting it. Then the data can be
        handed off in some fashion to another system for something like sentiment analysis or
        content analytics to actually understand what's being said on that page or in that set of
        comments.

        Once you have identified relevant data sources and the technologies required to access
        them, the next step is to identify technologies needed for extracting the valuable information
        — such as product numbers, prices, descriptions, comments, and other fields — normalize
        that information, and then place the information into some kind of structured repository such
        as a database or search system. These tools often have to be tailored to the kinds of Web
        data being collected, but they are absolutely essential to the process of deep Web
        data collection.
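
The normalize-and-store step described above might look like the following minimal sketch. It assumes extraction has already produced one dictionary per product. The field names, the `products` table, and the choice of SQLite are all illustrative assumptions, not details from the interview.

```python
import sqlite3
from decimal import Decimal


def normalize(record: dict) -> tuple:
    """Clean text fields and convert a price such as '$1,299.00' to integer cents."""
    price = Decimal(record["price"].replace("$", "").replace(",", "").strip())
    return (
        record["product_number"].strip().upper(),
        " ".join(record["description"].split()),  # collapse stray whitespace
        int(price * 100),
        record["source"].strip().lower(),
    )


def store(conn: sqlite3.Connection, records: list[dict]) -> None:
    """Write normalized records into a single structured table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        "product_number TEXT, description TEXT, price_cents INTEGER, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?, ?)",
        [normalize(r) for r in records],
    )
    conn.commit()
```

Storing prices as integer cents avoids floating-point rounding when values from many sites are later compared or averaged.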

Q.      What are some specific use cases and vertical market applications for deep Web data?

A.      From a market research standpoint, there are many different applications where deep Web
        data can be used to gain insights. Manufacturers of 35in. large-screen TVs, for example,
        could use deep Web extraction technology to pull the pricing information from other Web
        sites or from Web-based catalogs. This software can collect product and pricing information
        from vendors such as Wal-Mart, Target, Best Buy, Amazon.com, and many others in an
        automatic fashion. These types of applications collect all of the relevant information, extract
        it, aggregate it, and then place the data in one or more relational database tables. A TV
        manufacturer using this type of system could then find out what the current prices are for TVs
        and could also go back to previous months or even years to determine pricing trends.

        Another potentially interesting application is in the pharmaceuticals industry. A pharmaceutical
        manufacturer can see what prices are charged for its products on targeted Web sites anywhere
        in the world. If products are being sold below market value in one part of the world, this can
        indicate a black market– or potentially white market–type sales activity. A manufacturer can look
        at these sites and look at the data aggregation to try to understand why some locales are selling
        products at prices that may seem to be below market level.
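
Once prices are aggregated, flagging locales that sell below market level can be sketched in a few lines. This example assumes all prices have already been converted to one currency, and it uses the median across locales as a stand-in for "market level". Both the 80 percent threshold and the function name are hypothetical choices.

```python
from statistics import median


def flag_below_market(prices_by_locale: dict[str, float], threshold: float = 0.8) -> list[str]:
    """Return locales whose price is below `threshold` times the median price."""
    benchmark = median(prices_by_locale.values())
    return sorted(
        locale
        for locale, price in prices_by_locale.items()
        if price < threshold * benchmark
    )
```

A flagged locale is only a starting point for investigation, since taxes, regulation, or exchange-rate movements can also explain a low price.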

        Appliances are another common use case where deep Web data can be very useful.
        Perhaps consumers are looking at reviews for washing machines in an effort to determine
        reliability versus price for when they need to make a purchase. It would certainly be helpful
        for the manufacturers to understand what the consumers are saying about their washing
        machines and what potential buyers might see if they went to these sites. Manufacturers can
         collect this deep Web information from all of these different sites — whether retail sites,
         repair sites, review sites, or competitor sites — to find out what people are saying about
         washers with regard to reliability, price, and even ease of use. Many similar use cases fall
         into this category.

         Another market research use is trying to understand future buying patterns by conducting
         trend analysis. What's trending in terms of hot new smartphones, best-selling books, or video
         games? What are people talking about on Twitter and on social media Web sites? Are they
         talking about the latest weight loss medication approved by the FDA? Who is spending
         money and where? IDC is seeing a lot of companies starting to think about trend analysis.
         Data supporting trend analysis can be used to design future products. A manufacturer can
         look to the Web to find out what features of a new phone are being discussed or what
         features are being disparaged. This type of information is valuable to designers and
         engineers because it provides a view into what customers are actually thinking about when
         they use a product.

         Deep Web data has many uses in market research, and IDC expects that more and more
         organizations will have a deep Web data collection and use strategy as part of their ongoing
         market research efforts.



  A B O U T     T H I S    A N A L Y S T

  Dave Schubmehl is research manager for IDC's search, content analytics, and discovery research. His research covers
  information access technologies including content analytics, search systems, unstructured information representation,
  unified access to structured and unstructured information, Big Data, visualization, and rich media search. This research
  analyzes the trends and dynamics of the content analytics, search, and discovery software markets and the costs, benefits,
  and workflow impacts of solutions that use these technologies.




A B O U T    T H I S   P U B L I C A T I O N

This publication was produced by IDC Go-to-Market Services. The opinion, analysis, and research results presented herein
are drawn from more detailed research and analysis independently conducted and published by IDC, unless specific vendor
sponsorship is noted. IDC Go-to-Market Services makes IDC content available in a wide range of formats for distribution by
various companies. A license to distribute IDC content does not imply endorsement of or opinion about the licensee.

C O P Y R I G H T      A N D   R E S T R I C T I O N S

Any IDC information or reference to IDC that is to be used in advertising, press releases, or promotional materials requires
prior written approval from IDC. For permission requests, contact the GMS information line at 508-988-7610 or gms@idc.com.
Translation and/or localization of this document requires an additional license from IDC.

For more information on IDC, visit www.idc.com. For more information on IDC GMS, visit www.idc.com/gms.

Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com




©2012 IDC

IDC Analyst Connection Connotate Diving Deep Outside the Firewall for Market Research Insights

  • 1. I D C A N A L Y S T C O N N E C T I O N David Schubmehl Research Manager Diving Deep Outside the Firew all for Market Research Insights October 2012 For many enterprises, Big Data is now a mainstream concern, as evidenced by changes in organizational structure and budgets to focus on this area. However, most enterprises have yet to tap into the vast resource of data outside the firewall to incorporate Web-based Big Data in real time. The Web provides a lot of data that can be useful to market research efforts, particularly if organizations go beyond analyzing quantitative data such as statistics or demographics and look at customer sentiment as revealed in comments on product reviews as well as posts on social networks. The following questions were posed by Connotate to David Schubmehl, research manager at IDC, on behalf of Connotate's customers. Q. How are enterprises missing out by failing to tap into the Web? A. The Web has become a global repository that contains over 8 billion pages of unstructured information ranging from news and social media to research and philosophical treatises. The Web is a tremendous source of information about an enterprise's prospects, customers, and competitors, which is why leading organizations are making heavy use of the Web as a research tool. Survey research indicates that global CEOs are looking to Big Data on the Web to understand their customers and build engagement models with their existing customers and prospective customers. Where enterprises are missing out is by failing to tap into the tremendous amount of social media information on the Web. Many organizations are beginning to understand that their customers are out there talking about them on the Web and on social media sites, yet they don't have a very good handle on how to collect all of that information. 
As a result, many companies are missing opportunities because they aren't aware of or don't understand the conversations — both good and bad — that are going on about them, particularly in focused blogs and online user group communities. By tapping into these specialized online sources (not just Twitter and Facebook), companies can better understand what their customers are saying, thinking, or looking for regarding specific products and services. Just think of all the product reviews that are posted on the Web. Companies can gain a lot of insight about customer sentiment by tapping into this information. IDC 1390
  • 2. On a similar note, organizations can make use of the wealth of competitive information on the Web. Competitor product data, prices, reviews, and even comparisons can be found on the Web. In the same manner that organizations can tap into the "voice of the customer," they can also tap into their competitors' data to understand and compete more effectively. These are just a few examples of valuable data that is out there waiting for organizations that are willing to go find it and collect it. Q. What are the benefits — and challenges — of using Web-based data to fuel customer sentiment analysis in market research? A. The benefits of Web-based data revolve around three factors: timeliness, legitimacy, and aggregation. Typically, collecting data from social media sites, product review sites, and other sources can be very current and even provide up-to-the-minute feedback. Still, it can be a challenge to figure out how to collect that information in a manner that is as close to real time as possible and also to determine what kind of feedback that can be collected is going to evolve — and therefore be more valuable for trend analysis — over time. For many organizations, trend analysis actually is extremely valuable and can provide long-term benefits. Legitimacy is also a major factor. Are the review and the sentiment real? Is someone posting something because he or she wants to share true feelings about a product, or is it a competitor looking to sabotage reviews? Perhaps a reviewer is being paid to say something positive, which could skew results, so how can an organization identify the unpaid reviews? All of these factors can be challenging to quantify. Finally, a wide variety of customer reviews and feelings need to be collected in order to accurately gauge customer sentiment, especially if the collection is being done automatically. Small samples can skew results and analysis. 
Most organizations would like to collect as many comments or as much information from as many relevant sites as possible. The problem is that the number of sites that may have valuable content is expanding at a tremendous rate. It's a challenge for an organization that's trying to collect all this information and pull it together in a way that is useful. That's why aggregation into a single structure is important. It's relatively easy to pull things from a Twitter stream or a Facebook feed, but organizations often have to contend with all of the other sites that are out there, and this is often where this type of data collection becomes complicated.

The fragility and the rate of change of content within the Web pose an additional challenge. Web sites change constantly, pages are moved or modified, and content is added or deleted on a regular basis. Less robust approaches to collecting Web data will "break" and cease to return valid output when a change is made to the target Web page. A fragile system delivers only a fragment of the value when Web content changes and doesn't allow for time series analytics. A more robust solution features resiliency to change and, in the long run, delivers higher value.

Q. What is "deep" Web data, and why is it more valuable than "surface" Web data?

A. Deep Web data, which builds on IDC's traditional definition of Web data, is typically data that can't be crawled or accessed at all except through some kind of authentication process. A typical place for such data is in a document management system that is available via the Web, but only through authentication. However, many organizations now view the deep Web as the layers below the surface of a typical Web site. For example, the comments section of a Web-based ecommerce site might be buried 30 or 40 levels deep within the organization's Web site; some types of crawlers and aggregators wouldn't easily be able to find this type of information.
Organizations often want to look at this information because there can be real value in it. What is hidden deep within the system can often reveal more insights than data at the surface level. The ability to ferret out all of the information contained in the deep Web will be more valuable to organizations than just looking at what is easily crawled at a surface level.

Q. How can enterprises tap into deep Web data, and what are the stumbling blocks to doing this? What complementary technologies should they consider, and/or how can they simplify this process?

A. The barriers to accessing deep Web data typically involve the inability to obtain that data through a standard RSS feed or a standard Twitter API screen feed. Organizations may collect information at this surface level, but extra processing is required, such as in the case of shortened URLs. There are different shortening techniques for compressing Web site locators into the 140-character maximum length of a Twitter or RSS stream. One approach is to use technology that can resolve the shortened URLs and then follow them down 20, 30, or 40 levels, or however many levels it takes to get at the relevant information. Technologies are available today that can help automate this process, and they are worthy of consideration for extracting value from deep Web data.

There are also technologies that include an authentication method if a user ID and a password are required. The actual crawling is then automated in the system, as if an end user were pulling up the information and manipulating and extracting it. The data can then be handed off to another system for something like sentiment analysis or content analytics to understand what's being said on that page or in that set of comments.
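As a rough illustration of the extra processing that shortened URLs require, the sketch below follows a chain of redirects until it reaches a final address. It is a minimal sketch under stated assumptions, not a description of any vendor's product: the function name `expand_url`, the `next_hop` callback, and the `example` domains are all hypothetical, and the redirect table stands in for real network lookups.

```python
from typing import Callable, Optional


def expand_url(url: str,
               next_hop: Callable[[str], Optional[str]],
               max_lookups: int = 10) -> str:
    """Follow redirects from a shortened URL until no further hop exists.

    `next_hop` is a hypothetical callback that returns the address a URL
    redirects to, or None when the URL is already the final destination.
    """
    seen = set()
    current = url
    for _ in range(max_lookups):
        if current in seen:
            # Two links pointing at each other would otherwise loop forever.
            raise ValueError(f"redirect loop at {current}")
        seen.add(current)
        target = next_hop(current)
        if target is None:
            return current
        current = target
    raise ValueError(f"gave up after {max_lookups} lookups")


# Hypothetical redirect table standing in for live lookups.
REDIRECTS = {
    "http://short.example/abc": "http://tracker.example/r?id=7",
    "http://tracker.example/r?id=7": "http://shop.example/tv/reviews?page=3",
}

# dict.get returns None for unknown keys, which marks the final address.
print(expand_url("http://short.example/abc", REDIRECTS.get))
```

In a live collector, the callback would presumably issue an HTTP request and read the redirect target from the response. Keeping it as a parameter lets the loop detection and lookup limit be checked without network access.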
Once you have identified relevant data sources and the technologies required to access them, the next step is to identify the technologies needed to extract the valuable information (such as product numbers, prices, descriptions, comments, and other fields), normalize that information, and then place it into some kind of structured repository such as a database or search system. These tools often have to be tailored to the kinds of Web data that are being collected, but they are absolutely essential to the process of deep Web data collection.

Q. What are some specific use cases and vertical market applications for deep Web data?

A. From a market research standpoint, there are many different applications where deep Web data can be used to gain insights. Manufacturers of 35-inch large-screen TVs, for example, could use deep Web extraction technology to pull the pricing information from other Web sites or from Web-based catalogs. This software can collect product and pricing information from vendors such as Wal-Mart, Target, Best Buy, Amazon.com, and many others in an automatic fashion. These types of applications collect all of the relevant information, extract it, aggregate it, and then place the data in one or more relational database tables. A TV manufacturer using this type of system could then find out what the current prices are for TVs and could also go back to previous months or even years to determine pricing trends.

Another potentially interesting application is in the pharmaceuticals industry. A pharmaceutical manufacturer can see what prices are charged for its products on targeted Web sites anywhere in the world. If products are being sold below market value in one part of the world, this can indicate black-market or potentially white-market sales activity. A manufacturer can look at these sites and at the aggregated data to try to understand why some locales are selling products at prices that may seem to be below market level.
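The extract, normalize, and store steps behind the TV pricing example can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table layout, the function names, the retailer labels, the product code, and the sample prices are all invented for illustration and do not come from any real site or product.

```python
import sqlite3


def normalize_price(raw: str) -> float:
    """Turn a scraped price string such as '$1,200.00' into a number."""
    return float(raw.replace("$", "").replace(",", "").strip())


def store_prices(conn: sqlite3.Connection, rows) -> None:
    """Normalize scraped rows and place them in one relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "retailer TEXT, product TEXT, observed_on TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, ?)",
        [(retailer, product, day, normalize_price(raw))
         for retailer, product, day, raw in rows],
    )
    conn.commit()


def monthly_average(conn: sqlite3.Connection, product: str):
    """Return (month, average price) pairs, oldest month first."""
    cur = conn.execute(
        "SELECT substr(observed_on, 1, 7) AS month, AVG(price) "
        "FROM prices WHERE product = ? GROUP BY month ORDER BY month",
        (product,),
    )
    return cur.fetchall()


# Hypothetical scraped observations: retailer, product, date, raw price text.
SAMPLE = [
    ("Retailer A", "TV-35", "2012-08-15", "$1,200.00"),
    ("Retailer B", "TV-35", "2012-08-20", "$1,100.00"),
    ("Retailer A", "TV-35", "2012-09-15", "$1,000.00"),
    ("Retailer B", "TV-35", "2012-09-18", "$1,050.00"),
]

conn = sqlite3.connect(":memory:")
store_prices(conn, SAMPLE)
for month, avg_price in monthly_average(conn, "TV-35"):
    print(month, avg_price)
```

Storing the observation date with each price is what makes the look-back over previous months possible. A collector that overwrote the current price would lose the trend.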
Appliances are another common use case where deep Web data can be very useful. Perhaps consumers are looking at reviews for washing machines in an effort to weigh reliability against price before they make a purchase. It would certainly be helpful for the manufacturers to understand what the consumers are saying about their washing machines and what potential buyers might see if they went to these sites. Manufacturers can collect this deep Web information from all of these different sites (whether retail sites, repair sites, review sites, or competitor sites) to find out what people are saying about washers with regard to reliability, price, and even ease of use. Many similar use cases fall into this category.

Another market research use is trying to understand future buying patterns by conducting trend analysis. What's trending in terms of hot new smartphones, best-selling books, or video games? What are people talking about on Twitter and on social media Web sites? Are they talking about the latest weight loss medication approved by the FDA? Who is spending money and where? IDC is seeing a lot of companies starting to think about trend analysis.

Data supporting trend analysis can be used to design future products. A manufacturer can look to the Web to find out what features of a new phone are being discussed or what features are being disparaged. This type of information is valuable to designers and engineers because it provides a view into what customers are actually thinking about when they use a product. Deep Web data has many uses in market research, and IDC expects that more and more organizations will have a deep Web data collection and use strategy as part of their ongoing market research efforts.

A B O U T   T H I S   A N A L Y S T

Dave Schubmehl is research manager for IDC's search, content analytics, and discovery research. His research covers information access technologies including content analytics, search systems, unstructured information representation, unified access to structured and unstructured information, Big Data, visualization, and rich media search.
This research analyzes the trends and dynamics of the content analytics, search, and discovery software markets and the costs, benefits, and workflow impacts of solutions that use these technologies.

A B O U T   T H I S   P U B L I C A T I O N

This publication was produced by IDC Go-to-Market Services. The opinion, analysis, and research results presented herein are drawn from more detailed research and analysis independently conducted and published by IDC, unless specific vendor sponsorship is noted. IDC Go-to-Market Services makes IDC content available in a wide range of formats for distribution by various companies. A license to distribute IDC content does not imply endorsement of or opinion about the licensee.

C O P Y R I G H T   A N D   R E S T R I C T I O N S

Any IDC information or reference to IDC that is to be used in advertising, press releases, or promotional materials requires prior written approval from IDC. For permission requests, contact the GMS information line at 508-988-7610 or gms@idc.com. Translation and/or localization of this document requires an additional license from IDC. For more information on IDC, visit www.idc.com. For more information on IDC GMS, visit www.idc.com/gms.

Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com

©2012 IDC