6. 1. High-level overview of the data
product management lifecycle.
– “I’m thinking about creating a data product.
What are some key concepts and considerations
that I should understand?”
2. Intro to the breadth/depth of
Chicago’s data product management
firms and talent
3. Great networking
4. Fun (including t-shirt prizes!)
Product Development and Management Association
7.
1. What's a big data product and how does it differ from
“traditional” digital and physical products?
2. Designing a data product to fit a real need (Identifying
needs, segmenting, knowing customer requirements)
3. Getting your data, Part 1: How to source existing databases?
4. Getting your data, Part 2: How to manufacture new
data? (Gathering, housing, analytics, structuring)
5. Legal and ethical constraints of data products: regulatory
compliance, privacy and corporate trade secrets
6. Packaging your data and pricing it
7. Successfully Marketing and Selling Your Data
8. Winning elements of a big data product team
21. DESIGNING A DATA PRODUCT
TO FIT A REAL NEED
Kamal Tahir, Experian
Identifying Needs, Segmenting, Knowing
Customer Requirements
22. Using data, technology, analytics and strategy, I help drive profit, volume & share
across digital, social and traditional channels by improving acquisition, conversion,
retention and engagement
• 500 million vehicles
• Registration, accident, emissions, odometer
• States, dealers, OEMs, insurance, auction
• Sales performance
• Predictive purchase models
• 235 million consumers
• 113 million households
• Behavioral, attitudinal
• 3K+ elements
• Plus Web search data
• Automated profiling and targeting solutions
• Digital effectiveness
• Global commercialization of Nielsen Answers BI platform
• Global lead for data and analytical asset delivery platform: $1.5B, 35K users, 33 countries, 12 languages
• EDI based volume data for 500+ national agricultural pesticides wholesalers to drive marketing plans
• First Global data solution for environmental compliance
• Product-assembly-component-base material
23. THE KEYS – OCDix™
Owners Capability Delivery Inspire Value to you
Objectives Competence Devices Improvise Value to user
Outcomes Capacity Data Implement Value <> $
24. Put data in context of needs to build a roadmap to
solution
Who
• is the audience? More than one?
• will you design for?
• will you not design for?
What
• is the need?
• problems to be solved?
• decisions to be made?
• questions to be answered
• other questions may come up
HOW CAN I HELP
25. How will it be used
User Type
• Internal or external
• Tech vs. non tech
• Onsite/Remote/Mobile
Usage Style
• Summary rollups
• Alerts and signals
• Ad-hoc analysis
• Interactive
Delivery & Devices
• Website
• FTP
• Integrations
• Tapes (yes)
• Tablet, phone, custom devices
Usage Type
• Single use
• Subscription
• Ad-hoc
CAN I HELP YOU
26. Success - Ability to solve, deliver, use - for You & User
YOU
• Competency & Competition: core competency for you?
• Capability & Capacity: Can you address it? What else is on your plate? Can you deliver if it is built?
• Complexity & Constraints: size, usage, frequency, reliability, ROI, regulatory?
• Opportunity Cost
User
• Competency: Can they use the new information?
• Capability & Capacity: How soon will user start using it? Are other pieces to execute available?
• Complexity & Constraints: How much advisory & consulting needed?
CAN IT BE BUILT? SHOULD I BUILD IT?
28. Big data for big challenges?
• Big, small, medium, petite, grande, venti, Big and tall... look beyond the label
• Big problems = big investment + complexity & constraints = longer duration for ROI.
• Solve incremental issues along the way for quicker ROI
• Fund future initiatives and get evolutionary gains along the way to revolutionary gains
29. SUMMARY - building a winning product
• Really know your users & their goals
• Call out all limitations, capacity, complexity etc.
• Product variance by user type
• No/Low value - Walk away
• Don’t Overbuild
• Think Incremental gains
• Use the force
Owners Capability Delivery Inspire Value to you
Objectives Competence Devices Improvise Value to user
Outcomes Capacity Data Implement Value <> $
33. Mark’s Experience & Company
Formal Education: Undergrad: Art; Grad: Business (Marketing)
Informal Education: WWW, Events, Books, Tutorials, Friends, Family, Music, Art,
Movies, Reflection, Life Experiences, Successes, and Failures.
Early Career: Developer & Designer of “Web 1.0” Sites, Portals, CMS,
E-Commerce, Advertising, and Loyalty Systems
Mid Career: Transition to Product & Team Leadership 2004
Past 5 years @ Navteq & Nokia: Technology Research, Mentorship, Product
Prototyping, Service Design, Invention, and Portfolio Management
Business Owner of Allstate Enterprise Analytic Ecosystem
A Data Scientist’s Paradise!
BI, Descriptive Analytics, NLP, Predictive Analytics, Prescriptive Analytics. Using
Hadoop, Exadata, Vertica, et al.
34. Mark’s Product Responsibilities
People
– Analysts, Actuaries, Analytics Engineers, Developers, Testers, Statisticians,
Mathematicians, and more!
– Train, Mentor, Manage, Collaborate, Lead, Partner
Process
– Research (Economic, Fraud, Pricing, Marketing)
– Operations (Menlo Park, Northbrook, Belfast N. Ireland)
– Go Agile Methodology!!
Technology
– Hardware (Big Box, Hadoop, GPUs, VMs, Cloud, Legacy, ESB)
– Software (Open Source, Commercial, Custom, and Secret Sauces : )
New ideas and approaches percolate just about every day.
35. Focus Topic: Sourcing Internal Data
Identify Your Sources:
Any data can be big. You’ve heard about the 3 Vs + C? (Frequently cited: volume, variety, velocity, and complexity)
• Customer
– Broad (purchases, returns, credit, age, gender)
– Narrow (mouse movements, eye tracking, voice monitoring)
• Transactional (customers, vendors, marketplace, ESB, and ??)
• Employee & Employee Generated
• Operational & Logistics
• Sensor
• Location (one of my favorites)
• Public Domain
• Semantic Linkages & Relationships
• Audio & Video
• Unexplored digital areas
• and more…
Remember: if you don’t have it, you can always start gathering it.
36. Focus Topic: Sourcing Internal Data
Co-mingling Tactics:
• Blending, Joining, Fuzzy-Joining, Inferencing
• Character Sets, Language, Transliteration, Localization, Regional Dialects
• Format & Structure (raw text, structured text, images, spatial, video,
audio, XML, CSV)
• Transition with ease (avoid flattening, respect schema)
• Nurture your taxonomies & ontology, hire an MLS
Iterate, Document, Test, Automate, Be Smart, Be Inquisitive
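As a minimal sketch of the fuzzy-joining tactic listed above, the following uses Python's standard-library difflib to pair records whose names differ slightly. The record fields, the sample company names, and the 0.6 similarity cutoff are hypothetical and chosen only for illustration; real matching usually needs more normalization and a review step for borderline matches.

```python
import difflib

def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())

def fuzzy_join(left, right, key, cutoff=0.6):
    """Join two lists of dicts on an approximately matching string key.

    Returns (left_row, right_row) pairs; right_row is None when no
    candidate reaches the similarity cutoff.
    """
    # Index right-hand rows by their normalized key.
    index = {normalize(row[key]): row for row in right}
    candidates = list(index)
    pairs = []
    for row in left:
        matches = difflib.get_close_matches(
            normalize(row[key]), candidates, n=1, cutoff=cutoff
        )
        pairs.append((row, index[matches[0]] if matches else None))
    return pairs

# Hypothetical records from two internal systems.
crm = [{"name": "Acme Corporation"}, {"name": "Zenith Labs"}]
billing = [{"name": "ACME Corp.", "balance": 120}, {"name": "Globex", "balance": 75}]
joined = fuzzy_join(crm, billing, "name")
```

Unmatched rows are kept with None rather than dropped, so the gaps stay visible when the join is documented and tested.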
37. Focus Topic: Sourcing Internal Data
Sourcing Advice:
• Get Permission to use data
• Be careful, outsiders can model your data and spy on you (srsly)
• Standardize Source Data Analysis
– Better Yet, Automate it
– Even Better, Run it all the time, Obsess over quality
• Source with your customers in mind
• Source with your competition in mind
• Understand both signal & noise
The “Dollars Per Gigabyte” model died with the DVD -- Value comes
from how fast and well you assimilate, process, and distribute data
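One way to picture "standardize source data analysis, then automate it" is a small profiling routine that runs on every new batch. This is only a sketch under assumptions: the field names, the sample rows, and the 0.9 fill-rate threshold are hypothetical, and a real pipeline would track many more quality measures.

```python
from collections import defaultdict

def profile(rows):
    """Return per-field fill rate and distinct-value count for a list of dicts."""
    total = len(rows)
    filled = defaultdict(int)
    distinct = defaultdict(set)
    for row in rows:
        for field, value in row.items():
            # Treat None and empty strings as missing values.
            if value not in (None, ""):
                filled[field] += 1
                distinct[field].add(value)
    fields = {field for row in rows for field in row}
    return {
        field: {
            "fill_rate": filled[field] / total if total else 0.0,
            "distinct": len(distinct[field]),
        }
        for field in fields
    }

def check_quality(report, min_fill_rate=0.9):
    """List fields whose fill rate falls below the (hypothetical) threshold."""
    return sorted(f for f, stats in report.items() if stats["fill_rate"] < min_fill_rate)

# Hypothetical batch from a source system.
sample = [
    {"customer_id": 1, "zip": "60601"},
    {"customer_id": 2, "zip": ""},
    {"customer_id": 3, "zip": None},
    {"customer_id": 4, "zip": "60614"},
]
report = profile(sample)
flagged = check_quality(report)
```

Running the same profile on every load and comparing reports over time is one simple way to notice when a source quietly degrades.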
38. “Interchangeable” Key Take-Aways
• Rookie: Exciting Times
– Data and the tools we use to interact with it are hyper-evolving; this
will be a wild and fun ride! Learn something every day.
• Manager: Stay Focused
– Embrace both Quantitative Metrics & Qualitative Metrics
• Director: Ask The Tough Questions
– Data is always half as good as it appears to be
• Business Unit Manager: Build Smart Organizations
– Go watch the “I Love Lucy” Chocolate Factory video
…that’s big data
Thanks for listening!!
Time for the next speaker
40. Getting Your Data,
Part 2:
Manufacturing New
Data Sources
Perspectives from a research
organization
41. What is NORC?
• Survey research organization established in 1941
• Affiliated with the University of Chicago
• Reputation for producing high-quality,
foundational data sources
• General Social Survey (GSS)
• National Longitudinal Survey of Youth
• National Immunization Survey
• National Social Life, Health, and Aging Project
• National Survey of Children’s Health
• Survey of Consumer Finances
• Work in the public interest
42. Characteristics of High-Quality,
Primary Data Collection
• Research objectives are carefully conceived and
very clear
• Design questionnaire items and rigorously test
them for comprehension, validity and reliability
• Information collected directly from respondent
• Robust statistical dimension
• Sample design that ensures the data represent the
population
• Identifying and managing potential for bias in the
sample that might skew the truth
• Cleaning, preparing and weighting data
43. Characteristics, continued
• Respondent Right to Consent
• Institutional Review Board approval
• Transparency and Credibility
• Methods are documented and published
• Data must withstand the scrutiny of the
government and the research community
• Use in peer-reviewed publications
• Slow, steady, precise approach
• Can be costly, time-consuming
44. How Do We Do It?
• Determine the best sample for the research need
• Random Digit Dial
• Area probability sampling
• List Samples
• Census
• Design your instrument and decide the best way
(mode) to ask your questions
• Telephone interview
• Face-to-face interview
• Web survey
• Fancier ways (cameras, diaries, sensors, drones…)
45. How Do We Do It, continued
• Lots of quality checks:
• Instrument development and testing
• Consistent training and certification of interviewers
• Real-time data review and consistency checks to
make sure instrument (and interviewers!) are working
properly
• Data cleaning and preparation steps
• Statistical weighting to offset any bias in the
sample
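The weighting step above can be illustrated with post-stratification, one common weighting technique; the slides do not say which method is used, so treat this as a sketch only. The groups, answers, and 50/50 population shares are hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight each respondent so group shares in the sample match the population.

    weight = population share of the group / sample share of the group
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]

def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample: group "F" is over-represented (6 of 10)
# relative to an assumed 50/50 population.
groups = ["F"] * 6 + ["M"] * 4
answers = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]  # 1 = "yes"
weights = poststratification_weights(groups, {"F": 0.5, "M": 0.5})

unweighted = sum(answers) / len(answers)
weighted = weighted_mean(answers, weights)
```

In this toy sample the over-represented group says "yes" more often, so the weighted estimate comes out lower than the raw average.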
46. Is All This Necessary?
• Different data needs demand different degrees of
statistical rigor
• Statistical underpinnings provide confidence that
the data represent the population
• All data have some degree of error, but we know
exactly what that error is
• Pew Study (2013) on public opinion surveys vs.
Twitter
• www.pewresearch.org/2013/03/04/twitter-reaction-to-events-often-at-odds-with-overall-public-opinion/
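To make "we know exactly what that error is" concrete, here is the textbook margin-of-error formula for a proportion, assuming a simple random sample. The 50% estimate and the sample size of 1,000 are hypothetical and not taken from the Pew study; complex sample designs with weighting or clustering need additional adjustments.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p: estimated proportion (0..1), n: sample size,
    z: critical value (1.96 corresponds to roughly 95% confidence).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 50% answer "yes" among 1,000 respondents.
moe = margin_of_error(0.5, 1000)
```

Under these assumptions the margin is about three percentage points. A stream of tweets has no comparable figure, because nothing says how its authors were selected.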
47. How Do These Data Sources
Help Me?
• Taming the Wild West of Big Data
• These “primary” data sources provide a
foundation for testing the validity and viability of
new data sources
• You need a gold standard against which to introduce a
new currency
• Recent assessments of Google and Twitter flu data
50. Legal and Ethical Constraints on Data
Products:
Managing to Regulatory Compliance, Consumer Privacy
and Corporate Trade Secrets
Jackie Beaubaire, Director, Content Licensing & Governance
March 19, 2013
51. Let's Talk About Me
Background:
Degree in Health Information Management
Rush Presbyterian St. Luke's Medical Center
North Shore University Health System
HealthStar PPO
Deloitte Consulting
Truven Health Analytics (FKA Sachs Group, Solucient, Thomson,
Thomson Reuters, etc.)
52. Truven Health Analytics
• In the data/analytics business since the '80s, but under
different names
• Clients include:
– hospitals
– health plans
– employers
– pharmaceutical
– federal and state government
• Our solutions support marketing, planning, clinical
analysis and claims analysis to improve outcomes and
decrease costs
• Approx $600M in annual revenues
• We use client-supplied data and purchased
intellectual property from 3rd party vendors
53. Me, Continued
• Director, Content Licensing and Governance
– Acquire content from 3rd parties
• Data and Methodologies
– State and federal data
– Reference Data
– Other large data vendors
• Sometimes we negotiate complex multi-year deals and sometimes we
just sign on the dotted line
• Data costs range from free to $1M per year
– Govern the use/release of the content
• Ensure that the release rules and obligations are woven into the fabric
of the business
54. Lots and Lots of Data with lots and lots of rules
• Regardless of where you get the data, there are usually rules to
follow.
• Some are specific to Healthcare and some are not
– HIPAA – Privacy and Security
– SOX
– DOJ
– Other rules around use of SS#, claims data and marketing
– Contractual obligations
• You need to understand the rules that impact your industry and
data type
• Misuse of data can lead to fines, public announcements,
potential jail time, reputation issues and loss of the data
stream, all of which can impact revenue
• Some contracts have incident notification clauses and some
don’t. There is an ethical line that you don’t want to cross
55. Tips For Using Client Supplied Data
• If you are using client supplied data:
– Client contracts must support your use/release
• “XYZ company retains the worldwide rights to use your data as long
as we….”
• Sometimes this requires reading all of your client agreements to
ensure the use rights are there.
– Make sure that the client is authorized to provide this data to you
– Sometimes you give a small part of the product away for the wider
use of the data
– You need to understand the client's security, privacy, confidentiality,
ethical and other concerns and then support them. They do not
want to give you their data only to have you misuse it
– Misuse of data can lead to fines, reputation issues and loss of the
data stream, all of which can impact revenue
56. Tips For Using Vendor Data
• You are purchasing someone else's intellectual property. This is
how they make their money and you should respect that.
• Some data can be found and other data have only one source.
This dramatically changes the relationship and negotiation
• Vendors will outline your use rights and obligations in the
contract
• Sometimes you can negotiate and other times you can’t
• Obligations can include client data use agreements,
aggregation, cell suppression, royalties, citations, market sales
limitations, etc.
• Misuse of data can lead to fines, reputation issues and loss of
the data stream, all of which can impact revenue
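Cell suppression, one of the obligations listed above, means withholding small counts before release so individuals cannot be singled out. The sketch below is illustrative only: the threshold of 11, the "*" marker, and the sample table are hypothetical, and the actual rule comes from the vendor contract or applicable regulation. It also ignores complementary suppression, where other cells must be hidden so a suppressed value cannot be derived from totals.

```python
def suppress_small_cells(table, threshold=11, marker="*"):
    """Replace counts below the threshold with a marker before release."""
    return {
        key: (count if count >= threshold else marker)
        for key, count in table.items()
    }

# Hypothetical aggregated counts keyed by (zip code, condition).
counts = {
    ("60601", "asthma"): 42,
    ("60601", "rare condition"): 3,
    ("60614", "asthma"): 11,
}
released = suppress_small_cells(counts)
```

Putting a rule like this in the release path, rather than relying on analysts to remember it, is one way to build contractual obligations into the product.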
57. Data Governance
• If you are a data company, data is your most important asset.
It is a good idea to protect it
• It does not have to be large, but you do need a presence
• Ensure that your products and services are compliant BEFORE
launch or contract signature
• Examples:
– My team is at gate meetings and can stop a product from releasing
– I work with legal and the sales team on new/unique deals to ensure
that we can sell what we want to sell. Shutting a deal down right
before contract signature is not fun
60. PDMA - Monetizing Big Data Panel:
Packaging & Pricing Your Data
Mike Jakob – President & COO
March 2013
61. Sportvision Company Highlights
• Leading provider of sports media and data solutions
• 10,000+ live events
• 100M+ viewers annually
• 18 Olympic, Pro and College sports
• History of cutting-edge new product innovation
• 10 Emmy Awards, Invented Iconic sports products
• Fast Company “The World’s 50 Most Innovative Companies”
• Sports Business Journal Technology of the Year
• Positioned to benefit from growing market for sports data
• Fans want interactive content across devices
• Data becoming critical for teams, leagues and broadcasters
• YouTube video link about Sportvision
• http://www.youtube.com/watch?v=lxDHYKXZa6w
Proprietary and Confidential
63. Version 2.0: Proprietary Sports Data & Multi-Platform Capabilities
64. Sportvision is Collecting Big Data
Sport, Live Event Presence, Data Collected:
Baseball
• Live event presence: MLB, MiLB, WBC, KBO
• Data collected: Speed, location, and trajectory of every pitch, hit, player, throw
Football
Motorsports
• Live event presence: NASCAR: Cup, Nationwide, Truck
• Data collected: Car speed, location, acceleration, time behind leader, RPM, brake, throttle percentage, pit stop data
Hockey
Sailing
• Live event presence: All AC Series races
• Data collected: Boat speed, location, acceleration, time behind leader, infractions, course boundaries
65. Packaging the Data: Vertically Integrated or Data Provider?
• What are the potential markets for my Data? Which are the
most valuable segments & who accrues the most value?
• Do I have the skills, expertise, credibility and capital for each
addressable market? Can I acquire more through
partnerships?
• Can I play in multiple markets at once?
66. Pricing the Data: How much is it worth?
Tim Lincecum’s August 2010 “Slump”
The release slots of all of his pitches were higher than average. Shown here are the
differences between his cut fastball and slider.
67. Pricing the Data
• Tim Lincecum’s ERA drops from 7.82 in August 2010 to 1.94
in September 2010
– Picks up 5 post-season wins in October, Giants win first World Series
since 1954
– Lincecum signs a new two-year deal after the 2011 season worth
$40.5m
• What’s this Data worth to the Giants? To Lincecum?
• How much did we get paid for it?
68. A few Takeaway Lessons
• Proprietary Data is valuable and often creates a barrier to
entry for competitors
• Much of the value often goes to the “last mile” in the value
chain…so do more than just collect it
• Even if you are not able to charge what the data is worth…if
you create value for your customers they will keep coming
back for more
81. BACKGROUND
Direct Marketing Executive through the emerging Digital Data
Evolution
Coolsavings – original digital coupon, redemption and modeled
emailer
HR Competencies – amassing SMEs to define successful
competencies
Vente – Experian Unit – selling consumer data attributes for
marketing services
Dotomi – Personalized advertising that uses big data and dynamic
creative
82. COMPETING WORLD VIEWS DRIVE NEED
Traditional: Datasets; Lists; Attributes; Implied Benefits
Future: Solutions; Prediction; Machine integration; Micro to macro
83. HARMONIOUS CONFLICT STRETCHES A
TEAM
Sales – expand data
Quality – narrow data
Operations – streamline, mechanize
Analytics – insight, artisan, new innovation
84. MBA’S VS. PH.D’S
ANALYSTS VS. SCIENTISTS
“We have the answers” vs. “The data has the answer”
85. KEYS AND INTEGRATION
Is data responsible for
Obama winning the
election?
Integration
Predictability
Application
86. UNLOCKING HIDDEN MEANING
Breaking down the
details for new truths
Seeing patterns
Crowd-sourcing
OED:
- Details
- Rules based
- Crowd sourced
87. FINDING TALENT AND EXPERTISE
Leaders:
Outside of data; Customer Centric; Inspiring
Data Operations:
Large retailers and cataloguers
PhDs:
Political campaigns; Financial Services
Sales:
Many data service companies and Media companies
Quality:
Manufacturing – garbage in / garbage out
88. SUMMARY OF WINNING ELEMENTS
Establish your vision – and be aware of long term
“machination”
Leadership to manage through the table-stakes resources
The new age of the scientist
You need to lock into your target environment
A role for crowd-sourcing and getting to elemental patterns