Brad Hubbard
Product Manager, Developer Relations
DataSift
Five Things You Didn’t Know DataSift Can Do
#DSwebinar
HUMAN DATA INTELLIGENCE
FILTER TAG • ENRICH
STORE
Stream products will be covered today
To see PYLON (our aggregated, anonymized Facebook topic data), join our next live demo:
http://lp.datasift.com/20150701-Live-SE-Demo-Registration
DataSift is of Two Minds:
Indexed Data & Streaming
VEDO
Launched: 2011
1K customers across 40 countries
4 global offices:
• San Francisco
• New York
• London
• Reading, UK
2B items processed per day
(These don’t count toward the 5 things)
Brave New Data World
of all digital data created by consumers
emails a day
of US adults’ location is known
increase in global data by 2020
[Word cloud: Thoughts, Emotions, Likes, Dislikes, Intentions, Ideas, Current Events, GEO, Occupation, Age, Topics, Gender]
Sources of Human-Generated Data
SOCIAL NETWORKS
BLOGS & NEWS
INSIDE YOUR BUSINESS
The Complexity of Human Data
VOLUME
VARIETY
VELOCITY
Billions of users
Noisy
Generated in real time
per second
Post vs blog vs like
Terabytes per day
Ambiguous
Big spikes
Unstructured
Turn Human Data into Meaning
Unify Human Data
We apply structure to the chaotic world of human data
Facebook, Tencent Weibo, Sina Weibo, Google+, YouTube, Instagram, LexisNexis, Wikipedia, Wordpress, Tumblr, Intense Debate, Disqus, NewsCred, Reddit, Topix, Jive, Twitter, EDGAR, News, Video, IMDB, Yammer
Unifying data from across the web
Filtering Human Data
with CSDL
Filter: CSDL Data Processing Language
WRITE ONCE • USE MANY
Filter against generic objects, or get source-specific
INFINITE COMPLEXITY
Rules can contain millions of tag and filter criteria; no need to limit yourself
Enrich Human Data
LINKS AUGMENTATION
Identifies links in social posts and fetches header data, allowing you to filter against link content
LANGUAGE DETECTION
Write filters on a per-language basis, or limit yourself to only certain languages
Location either disclosed by user or listed in profile
GENDER DETECTION
USING PROFILES AND NAME + LANGUAGE
SENTIMENT AND TOPICS
Likely positive • Neutral • Likely negative
Topic detection (looking for nouns and disambiguating them)
Categorization, Scoring and Tagging
VEDO enables automatic classification of Human Data based on its meaning
Apply Data Science
OFF THE SHELF CLASSIFIERS
Enable automatic scoring and classification
CUSTOM TAXONOMIES
Hierarchical rules to match your business
CUSTOM SCORING SYSTEM
To expose meaning hidden deep within unstructured, text-rich data
Delivery
Use Everywhere
CONSUME A JSON STREAM DIRECTLY
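
As a rough illustration of what consuming a JSON stream can look like on the client side, here is a minimal Python sketch. It assumes the stream arrives as one JSON object per line, which the slide does not state, and it uses an in-memory sample in place of a live connection. The field names `interaction`, `type`, and `content` are illustrative only and are not taken from the deck.

```python
import io
import json


def read_interactions(stream):
    """Yield one parsed object per non-empty line of a line-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between objects
            yield json.loads(line)


# Stand-in for a live connection; the field names here are hypothetical.
sample = io.StringIO(
    '{"interaction": {"type": "twitter", "content": "Android rocks"}}\n'
    '\n'
    '{"interaction": {"type": "disqus", "content": "Nice post"}}\n'
)
items = list(read_interactions(sample))
print(len(items), items[0]["interaction"]["type"])
```

Because `read_interactions` is a generator, objects can be handled one at a time as they arrive instead of being buffered.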
Send your data to any of these pre-built connectors
We handle the infrastructure and send you the data you need
THANK YOU

Editor's Notes

  1. First, a little about DataSift
  2. Human data is a particular challenge: not only is there a lot of it – but it’s complex, highly varied, and comes at you fast. It can also have
  3. We bring in data from a ton of different places. You’ve probably heard of most of these, and we’d be happy to go into more detail on any of them later if you’re curious. You can also find more on our website.
  4. A Facebook post looks different from a Disqus comment, but you might want to search for your company or product anywhere. Because we’ve already normalized the data, you can write one simple filter that works across sources. You can write against generic targets, like “the main body text contains android”, or against more specific, nuanced targets, such as “the author’s account is at least 90 days old”.
  5. Once we have the data in a standardized format we enrich it with a lot of really useful stuff. Just like the raw content and other information can be filtered on, so can all the enhanced data we add.
  6. “This is cool! http://bit.ly/AsdFa” Shortened URLs and tracking URLs are incredibly common in social data. We not only traverse these redirects to their final destination, but also fetch the page header information and metadata and append it to the source object. This means you can filter not only on posts that contain “Android”, but also on posts with links that contain “Android” in the title, description, or keywords. We do this at line speed, across every social post on the planet, as it happens. This is an extremely powerful tool: so much of the social landscape is dominated by discussion of a shared link that, without the link content, you can miss the conversation entirely.
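
The ideas in notes 4 and 6 can be sketched together in Python. This is only an illustration of the concepts (a normalized object, a generic keyword filter, an account-age condition, and link metadata appended to the object), not DataSift's implementation and not CSDL. The `Interaction` class, its field names, the redirect table, and the `example.com` page are all hypothetical stand-ins; a real system would follow HTTP redirects and fetch pages over the network.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from html.parser import HTMLParser


@dataclass
class Interaction:
    """Hypothetical normalized object; field names are illustrative only."""
    source: str
    content: str
    author_created_at: datetime
    links: list = field(default_factory=list)  # augmented link metadata


class HeadMetadataParser(HTMLParser):
    """Collects <title> plus the description and keywords <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name in ("description", "keywords"):
            self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] += data


def resolve(url, redirects, max_hops=10):
    """Follow a redirect table (stand-in for real HTTP redirects) to the final URL."""
    for _ in range(max_hops):
        if url not in redirects:
            break
        url = redirects[url]
    return url


def augment_links(interaction, redirects, pages):
    """Append the final URL and header metadata for each link in the content."""
    for word in interaction.content.split():
        if word.startswith(("http://", "https://")):
            final_url = resolve(word, redirects)
            parser = HeadMetadataParser()
            parser.feed(pages.get(final_url, ""))
            interaction.links.append({"url": final_url, **parser.meta})
    return interaction


def matches(interaction, keyword, min_account_age_days=0, now=None):
    """Keyword in the body text or any augmented link field, plus an account-age check."""
    now = now or datetime.now(timezone.utc)
    needle = keyword.lower()
    texts = [interaction.content]
    for link in interaction.links:
        texts += [link["title"], link["description"], link["keywords"]]
    age = now - interaction.author_created_at
    old_enough = age >= timedelta(days=min_account_age_days)
    return old_enough and any(needle in t.lower() for t in texts)


# Hypothetical data: the short URL is from note 6, everything else is made up.
redirects = {"http://bit.ly/AsdFa": "http://example.com/review"}
pages = {
    "http://example.com/review": "<html><head><title>Android phone review</title>"
    '<meta name="description" content="Hands-on"></head></html>'
}
post = Interaction("facebook", "This is cool! http://bit.ly/AsdFa",
                   datetime(2015, 1, 1, tzinfo=timezone.utc))
augment_links(post, redirects, pages)
print(matches(post, "android", min_account_age_days=90,
              now=datetime(2015, 7, 1, tzinfo=timezone.utc)))
```

The post text never mentions Android, yet the filter matches because the linked page's title does, which is the point note 6 makes about shared links.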