11. Kinds of Data and Algorithms
Public social media (Twitter, Facebook): 250M+ documents per day
Programming info for 200+ U.S. networks
Video signal for 65+ U.S. networks
Brand conversation & ad tracking for thousands of brands
Realtime semantic analysis of comments
Demographic & behavioral analysis of authors
Advertising context & effect of advertising on brand dynamics
Overlap between audiences and comparative analysis
Bluefin Labs Proprietary and Confidential
12. Realtime & Historical Data
2M show telecasts
1.5M ad airings / month
50M links between social media users and TV shows / month
10B links between social media users and TV ads / month
End-to-end latency in minutes - visible & searchable in realtime
Historical data visible & searchable through various UIs/tools
Searchable text index of all social media comments in our archive &
methods for large-scale analysis jobs (including MR)
13. Kinds of Questions
We often work at the intersection of multiple data streams, or of data &
algorithms
How much chatter about a show (realtime)? (Social media +
programming info + semantic analysis)
What ads are airing (near realtime)? (Video signals + programming
info + computer vision/audio fingerprinting)
Which brands does the audience of a show talk most about? Which
shows do brand engaged authors talk most about? (Social media +
programming info + brand data + semantic analysis + audience
overlap analysis)
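The audience overlap question above can be sketched as a set computation. This is a minimal illustration under stated assumptions, not the method used in production: the author IDs, the `brand_authors` mapping, and the choice of "share of the show's audience that also mentions the brand" as the ranking score are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def brand_affinity(
    show_audience: Set[str],
    brand_authors: Dict[str, Set[str]],
) -> List[Tuple[str, float]]:
    """Rank brands by the share of a show's audience that also mentions each brand."""
    if not show_audience:
        return []
    scores = [
        (brand, len(show_audience & authors) / len(show_audience))
        for brand, authors in brand_authors.items()
    ]
    # Highest share first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy data: IDs of authors who commented on a show or a brand.
show_audience = {"a1", "a2", "a3", "a4"}
brand_authors = {
    "BrandX": {"a1", "a2", "a9"},
    "BrandY": {"a3"},
    "BrandZ": {"a7", "a8"},
}

ranking = brand_affinity(show_audience, brand_authors)
```

Swapping the roles (a brand's authors against each show's audience) answers the reverse question of which shows brand-engaged authors talk about most.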
17. More Data
“More data” can mean new streams, broader streams, or more
granular data
“More data” powers better algorithms & aids in creating better data
Capturing color, texture, & audio features from the TV video stream
improved our ad detection
Tapping into full author history permitted better age classification
Analyzing closed captions gave us another dimension of semantic
analysis and avenues to explore social/mass media engagement
22. Better Data
“Better data” achieved through human-machine collaboration, with a
view to continual improvement
“Better data” makes for better algorithms & makes big data more useful
Both realtime and large scale review & curation
Systematic monitoring, statistical QA, & estimation models
High quality data supports in-domain benchmarking (How is a show
or network doing vs. competitors? How is a brand doing within its sector?)
High quality and consistent data permits richer trend analysis (e.g.
season-over-season or ad campaign-to-ad campaign comparison)
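The statistical QA point above can be illustrated with a small sketch: draw a random sample of machine-produced labels for human review, then estimate overall accuracy with a confidence interval. The Wilson score interval, the function names, and the review counts here are assumptions chosen for illustration; the slides do not say which estimators were used.

```python
import math
import random
from typing import List, Tuple


def sample_for_review(item_ids: List[str], k: int, seed: int = 0) -> List[str]:
    """Draw a reproducible random sample of items for human review."""
    rng = random.Random(seed)
    return rng.sample(item_ids, min(k, len(item_ids)))


def wilson_interval(correct: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence)."""
    if n == 0:
        raise ValueError("need at least one reviewed item")
    p = correct / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical numbers: 200 sampled labels reviewed, 184 judged correct.
to_review = sample_for_review([f"doc{i}" for i in range(10_000)], k=200)
low, high = wilson_interval(correct=184, n=200)
```

Tracking an interval like this over time is one way to make "continual improvement" measurable: a curation or model change counts as an improvement only if the estimate moves by more than the sampling noise.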
26. Better Algorithms
“Better algorithms” include both new analytics & improvements to
existing ones
“Better algorithm” approaches can be taken with more & better data
Focus areas of NLP/machine learning, computer vision, & statistical
analysis; key to “better” is having a way to measure “goodness”
The ad discovery methods that were possible changed once we shifted to
a broader approach
Higher quality show telecast engagement data permits more precise
audience analysis across domains - e.g. shows & networks to brands
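One common way to measure "goodness" for a detector is to score it against a hand-labeled ground truth. The sketch below uses precision, recall, and F1; these metrics, the airing IDs, and the two detector outputs are assumptions for illustration rather than the metrics named in the slides.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], truth: Set[str]) -> Tuple[float, float, float]:
    """Score a set of detections against a hand-labeled ground-truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ad airing IDs: human-labeled truth vs. two detector versions.
truth = {"ad1", "ad2", "ad3", "ad4"}
old_detector = {"ad1", "ad2", "x1", "x2"}
new_detector = {"ad1", "ad2", "ad3", "x1"}

old_scores = precision_recall_f1(old_detector, truth)
new_scores = precision_recall_f1(new_detector, truth)
```

With a fixed labeled set like this, a change to the algorithm or its input features can be compared directly against the previous version.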
27. All of the Above
More data helps build better data & algorithms
Better data improves algorithms & makes large data more useful
Better algorithms get leverage out of more & better data
You should care about all three