Making Sense of Data: Lessons for Start-ups. Mike Driscoll, Co-founder + CTO, Metamarkets, @medriscoll
Data is sensory input. If it is unmanaged, you will be blind to weaknesses, deaf to new opportunities, and dumb to your customers.
Your technology stack is your nervous system. Data is the sensory input that moves through it.
Create feedback loops. Collecting customer data is a way to “get out of the building.”
customers
Make ETL a priority. Complexity lies at the boundaries between systems.
Sync data latencies with decision loops: real-time, daily, weekly.
Don’t agonize over data schemas. All data models are wrong. Some data models are useful.
Hadoop isn’t enough. Hadoop is a processing layer. You also need a query layer.
There is no ‘One True Database’. Embrace a polyglot architecture of formats and data stores.
Separate query & storage layers. A RESTful query layer will reduce the pain of migration.
Make data easy. Reduce the barriers to accessing data across systems.
Make data fast. “Human-time” means that queries return in seconds.
Fully instrument your customers. Human activity is small in size.
Selectively instrument your machines. Machine-generated data can quickly overwhelm.
Architect around business questions. Work backwards from business questions. Don’t let data architecture drive business needs.
Hire a data scientist: someone who can munge, model, & visualize data.
Working code beats theoretical models. Engineers with a thin grasp of statistics beat statisticians with a thin grasp of engineering.
Create an analytics sandbox, isolated from production systems. Analytics are a different constituency with different needs.
Obsess about dashboard design, both internal & external.
Extract value from your data assets, either by directly monetizing them or by enhancing customer experience.
Your technology stack is your nervous system. Your data is your sensory input.
Questions?
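
The "separate query & storage layers" slide can be illustrated with a minimal sketch. It is not Metamarkets' implementation, and it leaves out the RESTful HTTP part entirely; it only shows the boundary that makes a storage migration less painful. All names here (Storage, RowStorage, ColumnStorage, QueryLayer, the "sales" table) are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Protocol


class Storage(Protocol):
    """Anything that can return the rows of a named table."""

    def scan(self, table: str) -> List[Dict[str, float]]: ...


class RowStorage:
    """Hypothetical backend that keeps each table as a list of row dicts."""

    def __init__(self, tables: Dict[str, List[Dict[str, float]]]) -> None:
        self._tables = tables

    def scan(self, table: str) -> List[Dict[str, float]]:
        return list(self._tables.get(table, []))


class ColumnStorage:
    """Hypothetical backend that keeps each table as a dict of columns."""

    def __init__(self, tables: Dict[str, Dict[str, List[float]]]) -> None:
        self._tables = tables

    def scan(self, table: str) -> List[Dict[str, float]]:
        columns = self._tables.get(table, {})
        names = list(columns)
        # Re-assemble rows so callers never see the physical layout.
        return [dict(zip(names, values)) for values in zip(*columns.values())]


class QueryLayer:
    """The only interface applications talk to; the backend can be swapped."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def total(self, table: str, field: str) -> float:
        return sum(row[field] for row in self._storage.scan(table))


row_store = RowStorage({"sales": [{"revenue": 10.0}, {"revenue": 7.5}]})
col_store = ColumnStorage({"sales": {"revenue": [10.0, 7.5]}})

# The same query runs unchanged against either backend.
print(QueryLayer(row_store).total("sales", "revenue"))
print(QueryLayer(col_store).total("sales", "revenue"))
```

Because callers depend only on QueryLayer, replacing one backend with another touches a single constructor argument rather than every query in the codebase.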

Making Sense of Data

Editor's notes

  1. Feedback loops.
  2. Over the next set of slides, I’m going to discuss some lessons as data moves through a start-up’s organization...
  3. So this is how we frame our technology stack at my start-up, Metamarkets. It’s a four-tiered stack. I believe that many start-ups have similar stacks when they think about how data moves through them. But there’s something important missing here: your technology stack doesn’t exist in a vacuum.
  4. Over the next set of slides, I’m going to discuss some lessons as data moves through a start-up’s organization...
  5. To be successful, we’ve got to incorporate feedback, both from customers and from the larger world. Feedback is critical. Steve Blank and Eric Ries have talked about not iterating in a vacuum. The feedback you can achieve by managing your data can be incredibly important.
  6. Which begins at ingestion, and ends at the top with products.
  7. ETL often gets a bad rap. Nothing could be more important to your company than moving data between systems. That is what ETL does. It should be a first-class piece of your architecture, and you should put one of your top engineers at this layer of the stack. (At Metamarkets, we have a former VP of BlackRock working on ETL, and he’s been outstanding.) When our ETL breaks down, the data stops flowing, and our business stops moving.
  8. Don’t invest in real-time data if you’re making weekly decisions. Moving away from batch systems is hard work. Alternatively, some systems, such as those required for monitoring, may need sub-millisecond response times. But as a general rule, reducing latency in systems creates value in unexpected ways.
  9. Don’t get bogged down in discussions of the perfect data format for your company. There is no such thing. “All models are wrong, some models are useful.”
  10. Which begins at ingestion, and ends at the top with products.
  11. You will likely end up using a variety of data stores in your organization. So don’t agonize over your data store choices.
  12. As you scale and grow, you will have to change storage layers. We went through three different versions, first Postgres, then Greenplum, then HBase, before developing our own version.
  13. Embrace standards. Use simple, flat formats wherever possible (XML is the clamshell packaging of data). We recently onboarded a client who gave us JSON data. It’s a beautiful thing. Everyone knows SQL: Cloudera found that Hadoop cluster use went up 10x when Hive was installed.
  14. But Hive isn’t going to cut it for getting quick insights into your data. No one wants to wait 15 minutes for answers. Put in ETL flows that summarize data, and keep a core set of key business metrics in a “hot” database, one that can be queried in real time.
  15. Feedback loops.
  16. Requirements for systems should be driven by their business needs.
  17. Which begins at ingestion, and ends at the top with products.
  18. but remember...
  19. Foursquare’s Explore, LinkedIn’s PYMK, Kaggle winners: all written by individuals who were engineers first, statisticians second. When hiring folks to do your analytics, you want those who can roll up their sleeves and actually code the models themselves.
  20. Don’t make your analytics team compete for resources or jeopardize production systems. They will only get burned and then cut out. Set up systems where analytics folks can play with data, safely. Analytics often falls into the class of problems that are important, but not urgent. Don’t let this happen to your organization.
  21. Which begins at ingestion, and ends at the top with products.
  22. Data represents the totality of a start-up’s sensory experiences. Absent a well-developed digital nervous system to respond to these inputs, you are blind to your deficiencies, deaf to your customers, and dumb to your opportunities.
  23. Either externally, as Klout, Flightcaster, and BillGuard have done. 4SQ’s Explore and LinkedIn’s PYMK have both improved user experience. Having strong analytical talent in your organization is critical to success here.
  24. Feedback loops.
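
Note 14's advice (ETL flows that summarize data, with key business metrics kept in a "hot" database) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the pipeline described in the talk: the event tuples, the daily_metrics table, and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever hot store you actually use.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw events: (day, customer, revenue). In practice these would
# come from logs or a batch store; the values here are made up.
RAW_EVENTS = [
    ("2012-05-01", "acme", 10.0),
    ("2012-05-01", "acme", 5.0),
    ("2012-05-01", "globex", 7.5),
    ("2012-05-02", "acme", 2.5),
]


def summarize(events):
    """ETL step: roll raw events up to one row per (day, customer)."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [event_count, revenue]
    for day, customer, revenue in events:
        totals[(day, customer)][0] += 1
        totals[(day, customer)][1] += revenue
    return [(d, c, n, r) for (d, c), (n, r) in sorted(totals.items())]


def load_hot_store(rows):
    """Load the summary into in-memory SQLite (a stand-in for the hot database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_metrics "
        "(day TEXT, customer TEXT, events INTEGER, revenue REAL)"
    )
    conn.executemany("INSERT INTO daily_metrics VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def revenue_by_day(conn):
    """A 'human-time' query: it reads the small summary, not the raw events."""
    cur = conn.execute(
        "SELECT day, SUM(revenue) FROM daily_metrics GROUP BY day ORDER BY day"
    )
    return cur.fetchall()


hot = load_hot_store(summarize(RAW_EVENTS))
print(revenue_by_day(hot))
```

The design choice is that interactive queries only ever touch the pre-aggregated table, so their cost scales with the number of metric rows rather than with the raw event volume.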