Small Data Sets to Scale:
  Planning for the Evolution of Data

      Poornima Vijayashanker
     CEO & Founder BizeeBee
     poornima@bizeebee.com
           @poornima
       www.femgineer.com
AGENDA
I. Stealth Mode - “pre-data” phase
II. Launch
III. Compute Growth Rate
IV. Optimizations
V. Data Storage
Pre-Data
Stealth Mode - “pre-data” phase
Small initial data set
  Easy storage
  Storage solutions like Heroku, RackSpace
  Design features around it

Simplicity of Storage v. Complexity of Design
  e.g. Mint - 3 months of financial data; FB - social graph limited to universities
0 to 100k to 1M
0 - 100k easiest schema design
  Single DB - with user & static data
  Single instance of app accessing the db

100k - 1M+ time to re-design db and app
  Break up databases - user & static
  Multiple instances of the app
Growth Rate
What is your user growth rate?
  Basic unit e.g. Mint - transaction
  User generated content
  Size of unit e.g. FB - photo

Storage capacity v. Seek v. Size
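
A rough way to turn the questions on this slide into a number is to multiply projected users by units per user and by unit size. The sketch below is only an illustration: the function name, its parameters, and the compound monthly growth model are assumptions, not something the slides specify.

```python
def project_storage_bytes(users, monthly_growth, units_per_user, unit_bytes, months):
    """Estimate total storage after `months` of compound user growth.

    Assumes (hypothetically) that every user holds `units_per_user` units
    of `unit_bytes` each, e.g. transactions for Mint or photos for FB.
    """
    projected_users = users * (1 + monthly_growth) ** months
    return projected_users * units_per_user * unit_bytes


# Example: 1,000 users doubling every month for 3 months,
# 10 units per user, 100 bytes per unit.
estimate = project_storage_bytes(1000, 1.0, 10, 100, 3)
```

Comparing such an estimate against current capacity gives an early signal of when the 100k - 1M redesign becomes necessary.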
Optimizations
Capacity - throw hardware at it
Seek - throw software at it
  Cache data

Size - design around it
  Limit usage size e.g. 4MB picture
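
The seek and size points above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `load_profile_from_db` function standing in for a slow database read, and assuming "4MB" means 4 * 1024 * 1024 bytes; an in-process `functools.lru_cache` is used here only because it is the simplest cache to show.

```python
from functools import lru_cache

MAX_UPLOAD_BYTES = 4 * 1024 * 1024  # assumed meaning of the slide's "4MB"
DB_READS = {"count": 0}  # counts how often the slow path runs


def load_profile_from_db(user_id):
    """Hypothetical stand-in for a slow database read."""
    DB_READS["count"] += 1
    return ("user-%d" % user_id, user_id)


@lru_cache(maxsize=1024)
def get_profile(user_id):
    """Serve repeated lookups from memory instead of hitting the db."""
    return load_profile_from_db(user_id)


def accept_upload(payload: bytes) -> bool:
    """Reject uploads larger than the size cap."""
    return len(payload) <= MAX_UPLOAD_BYTES
```

Calling `get_profile(7)` twice reads the database once; the second call is answered from the cache.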
Optimizations Cont’d
Code Level
  Processes - Computation v. Retrieval
  DB Techniques - Index, De-Normalize
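
As an illustration of the indexing technique, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `transactions` table, its columns, and the index name `idx_tx_user` are hypothetical; the slides do not name a database or schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions (user_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)
# Index the column used for lookups so reads avoid a full table scan.
conn.execute("CREATE INDEX idx_tx_user ON transactions (user_id)")


def query_plan(sql, params=()):
    """Return SQLite's query plan description as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)


plan = query_plan("SELECT amount FROM transactions WHERE user_id = ?", (42,))
```

The returned plan should mention `idx_tx_user`, showing the lookup goes through the index. De-normalizing follows the same motive from the other direction: store a copy of frequently joined data alongside the row that needs it, trading extra writes for cheaper reads.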

Data Level
  Partitioning: Siloed v. Interconnected
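
Siloed data, where each user's records are independent of everyone else's, is the easier case to partition. One common approach is to hash a user key to pick a partition; the sketch below assumes that approach and a hypothetical `shard_for` helper, neither of which comes from the slides.

```python
import zlib


def shard_for(user_key: str, shard_count: int) -> int:
    """Map a user key to a partition number in [0, shard_count).

    crc32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    return zlib.crc32(user_key.encode("utf-8")) % shard_count
```

Interconnected data, such as a social graph, does not split this cleanly because a single request may need rows that live on several partitions.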
Data Storage
Single User’s Data v. Aggregated Data
  Single user’s data v. data aggregated across users
  e.g. Mint - Spending Trends
  Scheme to compute, store, and retrieve aggregated data
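
A compute, store, and retrieve scheme for aggregated data might look like the sketch below. Everything here is an assumption made for illustration: the `(user_id, category, amount)` tuple layout, the function names, and the in-memory dictionary standing in for a real aggregate store.

```python
from collections import defaultdict

AGGREGATES = {}  # stand-in for a table or cache holding precomputed results


def compute_category_totals(transactions):
    """Sum spending per category across all users."""
    totals = defaultdict(float)
    for _user_id, category, amount in transactions:
        totals[category] += amount
    return dict(totals)


def refresh_aggregates(transactions):
    """Compute once (e.g. in a periodic job) and store the result."""
    AGGREGATES["category_totals"] = compute_category_totals(transactions)


def get_category_total(category):
    """Retrieve a precomputed value instead of scanning raw data."""
    return AGGREGATES.get("category_totals", {}).get(category, 0.0)


refresh_aggregates([(1, "food", 10.0), (2, "food", 5.0), (1, "rent", 100.0)])
```

Reads then touch only the small aggregate store, while the expensive pass over every user's data happens off the request path.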
Conclusion
  Start small - provide enough value to user
  Monitor & project growth rate of data
  Break data apart
  Simple optimizations - indexing, de-normalizing, caching
  Large data sets - warehousing, partitioning db
  Hiring designer & engineer for BizeeBee :)

P. Vijayashanker From Small Datasets to Scale: Planning for the Evolution of Data Social Developer Summit