A Journey In The Public Clouds
       With Datadog

    Alexis Lê-Quôc (Product Guy) at Datadog
             IASA New York Chapter
                 June 28th, 2011
What I’m going to talk about
 ‣What we do and for whom
 ‣The kind of data we deal with
 ‣Our architecture
 ‣Our architecture in a public cloud (AWS)
 ‣What we learned
 ‣Q+A
What we do?
SaaS Platform for
Aggregation, Correlation, Collaboration
For Dev & Ops
The Mess
[Diagram: Dev team and Ops team surrounded by disconnected tools and data sources, exchanging metrics, billing, insight, changes, choices, events, advice, and feedback]
Sources shown: Servers and Devices, IAAS / PAAS, Applications, Monitoring, Hosting, CDNs, Usage Analytics, Issue Resolution, Cap. Planning, SDLC support, Asset Mgmt

 ‣Too many data streams, too many silos
 ‣Too many choices to make, too often
 ‣Only getting worse as SaaS silos multiply
 ‣Separate Dev and Ops teams, looking at separate data streams

Data-driven decision making in IT is rarely happening.
Too slow, too expensive, requires too much discipline.
We Simplify
Datadog to the rescue
Inputs: system metrics, quality metrics, SaaS data, capacity metrics, usage analytics, cloud billing, code metrics, config changes, IaaS pricing, perf. data, vendors info, curated metadata

Stages: Aggregation, Correlation, Collaboration

Outputs:
 ‣key metrics to Alice Dev
 ‣recommendations to Bob Ops
 ‣business metrics to Charlie CEO
 ‣visibility
Concretely
Aggregation
https://app.datad0g.com/dash/dash/1000#/date_range/1308057152698-1308143552698
Correlation
Collaboration
What Architecture For
 What Kind Of Data?
Events          Metrics
User comments   Unique visitors
Alert           Load
Build           Transaction duration
Batch job       etc.
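
One way to picture the two kinds of data is as two small record types. This is only an illustrative sketch: the class and field names (`Event`, `MetricPoint`, `source`, `tags`) and the metric name are assumptions made for the example, not Datadog's actual schema.

```python
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class Event:
    """A discrete occurrence: a user comment, an alert, a build, a batch job."""
    timestamp: float          # seconds since the epoch
    source: str               # hypothetical: which system emitted the event
    title: str
    text: str = ""
    tags: tuple = ()


@dataclass(frozen=True)
class MetricPoint:
    """One numeric sample: unique visitors, load, transaction duration."""
    timestamp: float
    name: str                 # hypothetical naming scheme
    value: float
    tags: tuple = ()


now = time.time()
build = Event(now, "ci", "Build finished", tags=("env:prod",))
load = MetricPoint(now, "system.load.1", 0.75, tags=("env:prod",))

# Events carry free text; metric points carry a number that can be aggregated.
print(build.title, load.value)
```

Keeping both types keyed by timestamp and tags is what would let events be overlaid on metric graphs later on.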
Taxonomy
CLASSICS

ACID                                         BASE
Atomicity                                    Basically
Consistency                                  Available
Isolation                                    Soft-state
Durability                                   Eventual
                                             consistency
e.g. SQL DBs                                 e.g. DNS

http://en.wikipedia.org/wiki/Eventual_consistency
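
To make "eventual consistency" concrete, here is a toy sketch of two replicas that accept writes independently and converge once they exchange state. The last-write-wins rule and the `Replica` class are assumptions chosen for brevity; real BASE stores use a range of reconciliation strategies.

```python
class Replica:
    """Toy key-value replica using last-write-wins reconciliation."""

    def __init__(self):
        # key -> (timestamp, value)
        self.data = {}

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        """Pull the other replica's entries, keeping the newest per key."""
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


a, b = Replica(), Replica()
a.write("www", "10.0.0.1", timestamp=1)
b.write("www", "10.0.0.2", timestamp=2)   # a later write lands on b only

# Before synchronization the replicas disagree (soft state).
stale, fresh = a.read("www"), b.read("www")

# After exchanging state in both directions they converge.
a.sync_from(b)
b.sync_from(a)
converged = a.read("www") == b.read("www") == "10.0.0.2"
print(stale, fresh, converged)
```

A reader hitting replica `a` before the sync sees the old value, which is the trade an ACID store would not make.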
NEWCOMER

DIRT
Data
Intensive
Real
Time

e.g. real-time web

Bryan Cantrill: http://dtrace.org/resources/bmc/DIRT.pdf
Aggregation (BASE + DIRT)
Constant data influx
Large data sets

              Correlation (BASE)
              On-demand visualization
              Background data analysis

                             Collaboration (DIRT)
                             Real-time updates
                             On-the-fly data analysis
  Datadog = DIRT + BASE + a tiny bit of ACID
How It All Fits Together
    http://www.flickr.com/photos/tom-margie/1253798184/
Architecture
Simplified
[Diagram: simplified architecture, with components stamped BASE, DIRT, and ACID]
The Environment
4 Dimensions
Compute
Storage
Network
Management
ON-PREMISE TRAITS
http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/

Compute                    Network
Fast                       Fast
Inelastic                  Localized

Storage                    Management
Fast                       People-based
Centralized                Full access
Redundant
CLOUD TRAITS

Compute                    Network
Slow                       “Fast”
Elastic                    Geo-distributed

Storage                    Management
Slow                       No bare-metal
Jittery                    “Magic” API
Maybe durable
Low memory
What We Have
   Found
Network
Layer 2: Virtual Domain
Layer 3: Crude Edge Filtering
Layer 7: Crude Load Balancing
DNS
CDN

It Works!
Storage
[Chart: latency (vertical axis) vs. capacity (horizontal axis)]

Stores plotted, from low latency and low capacity to high latency and high capacity:
DIRT: Redis
ACID: PostgreSQL
BASE: Apache Cassandra
BASE: Amazon S3

Constraints overlaid on the chart, in the same order: low memory; jittery, limited throughput; latency
Low Memory
 http://aws.amazon.com/ec2/#instance
Jittery, Limited Throughput
Network Block Storage (EBS)

https://app.datad0g.com/dash/dash/1032#/date_range/1308608717016-1309213517016

Limited Throughput In Numbers
RAID 0 EBS Volumes, m1.large instances

Time          DEV        tps      rd_sec/s   wr_sec/s  avgrq-sz  avgqu-sz  await    svctm  %util
03:35:02 PM   dev8-80    375.95   23614.08   5.70      62.83     47.21     125.58   1.26   47.34
03:35:02 PM   dev8-96    373.63   23749.65   5.64      63.58     45.55     121.91   1.22   45.72
03:35:02 PM   dev8-112   375.28   23693.47   5.52      63.15     45.52     121.22   1.23   46.31
03:35:02 PM   dev8-128   375.31   23721.57   7.19      63.22     56.00     148.96   1.34   50.35

rd_sec/s: read throughput in sectors/s (total: 368Mb/s)
await: average wait in ms
svctm: average service time in ms
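
The total on the slide can be sanity-checked from the four rd_sec/s readings in the table. The sketch below assumes 512-byte sectors, which is the convention sysstat uses; the slide does not say which megabit convention it applied, so both are shown.

```python
SECTOR_BYTES = 512  # assumption: sar/iostat sectors are 512 bytes

# rd_sec/s readings for the four EBS volumes in the table
rd_sec_per_s = {
    "dev8-80": 23614.08,
    "dev8-96": 23749.65,
    "dev8-112": 23693.47,
    "dev8-128": 23721.57,
}

total_sectors = sum(rd_sec_per_s.values())
total_bits = total_sectors * SECTOR_BYTES * 8

mbit_decimal = total_bits / 1e6      # 1 Mbit = 10**6 bits
mbit_binary = total_bits / 2**20     # 1 Mibit = 2**20 bits

print(f"{total_sectors:.2f} sectors/s")
print(f"{mbit_decimal:.1f} Mbit/s (decimal), {mbit_binary:.1f} Mibit/s (binary)")
```

Under that assumption this single sample works out to roughly 370 to 390 Mbit/s depending on the unit convention, in line with the 368Mb/s quoted on the slide.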
Some Tricks

Software RAID           Limited by slowest
RAID 0                  volume
Offsite backups

Ephemeral volumes
And Offsite backups     Complexity
                        Recovery Time Objective
Streaming replication   Recovery Point Objective
S3 backups

Database Service        Trust
MySQL/Oracle RDS        RDS Outage 2 months ago
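
Why a stripe set is "limited by slowest volume" can be shown with a deliberately simple model. The function name and the per-volume numbers below are made up for illustration, and the model assumes evenly striped sequential I/O, which real workloads only approximate.

```python
def striped_throughput(volume_mb_per_s):
    """Estimate RAID 0 throughput for evenly striped sequential I/O.

    Simplifying assumption: every volume serves an equal share of each
    stripe, so the array moves at n times the pace of its slowest member.
    """
    if not volume_mb_per_s:
        raise ValueError("need at least one volume")
    return len(volume_mb_per_s) * min(volume_mb_per_s)


# Hypothetical per-volume throughputs in MB/s; one volume is having a bad day.
healthy = [12.0, 12.0, 12.0, 12.0]
degraded = [12.0, 12.0, 12.0, 4.0]

print(striped_throughput(healthy))    # 48.0
print(striped_throughput(degraded))   # 16.0
```

In this model a single jittery volume drags the whole array down to its own pace, however fast the other members are.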
Network Block Storage
Is The Dark Side

Bait For Enterprise
Customers

Hard Problem For
Cloud Providers
Some Do’s And Don’ts

Don’t rely on networked block storage
 ‣Small data sets only, if you have to

Don’t trust data-at-rest
 ‣Copy, replicate, back up

Do use S3 if you can
 ‣Object semantics are a limitation
 ‣Slow but durable
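
The "copy, replicate, back up" advice only helps if the copies are checked. Here is a minimal local sketch of that habit using temporary directories; `replicate_and_verify` and the file names are hypothetical, and a real setup would target separate machines or an object store such as S3 rather than sibling folders.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate_and_verify(source, destinations):
    """Copy source into each destination directory and check every copy."""
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        target = Path(directory) / Path(source).name
        shutil.copy2(source, target)
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies


with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    source = root / "dump.sql"
    source.write_bytes(b"-- pretend database dump\n")
    replicas = [root / "replica_a", root / "replica_b"]
    for directory in replicas:
        directory.mkdir()
    copies = replicate_and_verify(source, replicas)
    verified = all(sha256_of(copy) == sha256_of(source) for copy in copies)
    print(len(copies), verified)
```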
Compute
[Chart: “Performance” (vertical axis) vs. number of nodes (horizontal axis)]
ACID nodes: scale up, then shard
BASE nodes, DIRT nodes: add more
Some Do’s And Don’ts

Don’t rely on scale-ups
 ‣Low memory is a hard limit for DBs
 ‣Noisy neighbors
 ‣Individual performance is poor and jittery

Scale out
 ‣First scale up
 ‣Then shard
 ‣Parallelize across machines
 ‣Vector-processing via GPUs
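
Sharding boils down to a stable rule that maps each key to one node. The sketch below uses plain hash-modulo routing; the node names and key format are hypothetical, and the talk does not say which sharding scheme Datadog used.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash() because the latter is
    randomized per process for strings and would route keys inconsistently.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


SHARDS = ["db-0", "db-1", "db-2", "db-3"]   # hypothetical node names

for key in ("customer:17", "customer:18", "customer:19"):
    print(key, "->", SHARDS[shard_for(key, len(SHARDS))])
```

Modulo routing is the simplest option, but changing the shard count remaps most keys; schemes such as consistent hashing exist to soften that.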
Management
An API for everything
Compute
Storage
Network
Management
Some Do’s And Don’ts

Do use the AWS APIs
 ‣Almost like magic
 ‣Rich libraries
 ‣Ever expanding

Do use tools
 ‣e.g. Chef, Puppet, cfengine, etc.
 ‣Datadog

Do kill and respawn
 ‣Low-level debugging impossible
 ‣Instance creation is cheap
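
The kill-and-respawn habit can be sketched as a small reconciliation pass. `FakeCloud` below is an in-memory stand-in invented for this example so the code runs without credentials; it is not the AWS API, and a real version would call the provider's launch, terminate, and health-check operations instead.

```python
import itertools


class FakeCloud:
    """In-memory stand-in for a cloud provider API (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}          # instance id -> healthy?

    def launch(self):
        instance_id = f"i-{next(self._ids):04d}"
        self.instances[instance_id] = True
        return instance_id

    def terminate(self, instance_id):
        del self.instances[instance_id]

    def is_healthy(self, instance_id):
        return self.instances[instance_id]


def kill_and_respawn(cloud):
    """Replace every unhealthy instance with a freshly launched one."""
    replaced = {}
    for instance_id in list(cloud.instances):
        if not cloud.is_healthy(instance_id):
            cloud.terminate(instance_id)
            replaced[instance_id] = cloud.launch()
    return replaced


cloud = FakeCloud()
fleet = [cloud.launch() for _ in range(3)]
cloud.instances[fleet[1]] = False        # simulate a misbehaving instance

replaced = kill_and_respawn(cloud)
print(replaced)
print(sorted(cloud.instances))
```

No attempt is made to diagnose the sick instance: since low-level debugging is off the table and new instances are cheap, replacement is the whole remedy.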
New Rules
New Tools
New Playbook

Same Fundamentals
Questions!

http://datadoghq.com
      twitter: @alq

5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 

A journey in the public clouds

  • 1. A Journey In The Public Clouds With Datadog Alexis Lê-Quôc (Product Guy) at Datadog IASA New York Chapter June 28th, 2011
  • 2. What I’m going to talk about ‣What we do and for whom ‣The kind of data we deal with ‣Our architecture ‣Our architecture in a public cloud (AWS) ‣What we learned ‣Q+A
  • 3. SaaS Platform for Aggregation, Correlation, Collaboration For Dev & Ops What we do?
  • 4. The Mess. Too many data streams, too many silos. Too many choices to make, too often. Only getting worse as SaaS silos multiply. Separate Dev and Ops teams, looking at separate data streams. Data-driven decision making in IT is rarely happening: too slow, too expensive, requires too much discipline. [Diagram: Dev team and Ops team exchanging metrics, billing, insight, events, changes, choices, advice and feedback with Usage Analytics, IaaS / PaaS, Issue Resolution, Servers and Devices, Applications, Cap. Planning, SDLC support, Monitoring, Hosting, Asset Mgmt and CDNs.]
  • 5. We Simplify: Datadog to the rescue. Data: system metrics, quality metrics, SaaS data, capacity metrics, usage analytics, cloud billing, code metrics, config changes, IaaS pricing, business metrics, perf. data, vendors info, curated metadata. Delivered as key metrics, visibility and recommendations to Alice (Dev), Bob (Ops) and Charlie (CEO). Aggregation, Correlation, Collaboration.
  • 7. etc. Aggregation
  • 8. Aggregation
  • 11. What Architecture For What Kind Of Data?
  • 12. Events: user comments, alert, build, batch job. Metrics: unique visitors, load, transaction duration, etc.
  • 14. CLASSICS. ACID: Atomicity, Consistency, Isolation, Durability (e.g. SQL DBs). http://en.wikipedia.org/wiki/Eventual_consistency
  • 15. CLASSICS. ACID: Atomicity, Consistency, Isolation, Durability (e.g. SQL DBs). BASE: Basically Available, Soft-state, Eventual consistency (e.g. DNS). http://en.wikipedia.org/wiki/Eventual_consistency
  • 16. NEWCOMER. DIRT: Data Intensive Real Time (e.g. real-time web). Bryan Cantrill: http://dtrace.org/resources/bmc/DIRT.pdf
  • 17. Aggregation: constant data influx, large data sets. Correlation: on-demand visualization, background data analysis. Collaboration: real-time updates, on-the-fly data analysis.
  • 18. Aggregation (BASE): constant data influx, large data sets. Correlation: on-demand visualization, background data analysis. Collaboration: real-time updates, on-the-fly data analysis.
  • 19. Aggregation (BASE, DIRT): constant data influx, large data sets. Correlation: on-demand visualization, background data analysis. Collaboration: real-time updates, on-the-fly data analysis.
  • 20. Aggregation (BASE, DIRT): constant data influx, large data sets. Correlation (BASE): on-demand visualization, background data analysis. Collaboration: real-time updates, on-the-fly data analysis.
  • 21. Aggregation (BASE, DIRT): constant data influx, large data sets. Correlation (BASE): on-demand visualization, background data analysis. Collaboration (DIRT): real-time updates, on-the-fly data analysis.
  • 22. Aggregation (BASE, DIRT): constant data influx, large data sets. Correlation (BASE): on-demand visualization, background data analysis. Collaboration (DIRT): real-time updates, on-the-fly data analysis. Datadog = DIRT + BASE + a tiny bit of ACID
  • 23. How It All Fits Together http://www.flickr.com/photos/tom-margie/1253798184/
  • 24. Architecture Simplified
  • 25. Architecture Simplified: BASE
  • 26. Architecture Simplified: BASE, DIRT
  • 27. Architecture Simplified: BASE, ACID, DIRT
  • 31. ON-PREMISE TRAITS. Compute: fast, inelastic. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 32. ON-PREMISE TRAITS. Compute: fast, inelastic. Storage: fast, centralized, redundant. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 33. ON-PREMISE TRAITS. Compute: fast, inelastic. Network: fast, localized. Storage: fast, centralized, redundant. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 34. ON-PREMISE TRAITS. Compute: fast, inelastic. Network: fast, localized. Storage: fast, centralized, redundant. Management: people-based, full access. http://www.flickr.com/photos/theplanetdotcom/4879419788/sizes/l/in/photostream/
  • 36. CLOUD TRAITS. Compute: slow, elastic.
  • 38. CLOUD TRAITS. Compute: slow, elastic. Network: “fast”, geo-distributed. Storage: slow, jittery, maybe durable, low memory.
  • 39. CLOUD TRAITS. Compute: slow, elastic. Network: “fast”, geo-distributed. Storage: slow, jittery, maybe durable, low memory. Management: no bare-metal, “magic” API.
  • 40. What We Have Found
  • 42. Network. Layer 2: Virtual Domain. Layer 3: Crude Edge Filtering. Layer 7: Crude Load Balancing. DNS. CDN.
  • 43. Network. Layer 2: Virtual Domain. Layer 3: Crude Edge Filtering. Layer 7: Crude Load Balancing. DNS. CDN. It Works!
  • 45. Storage. [Chart: latency versus capacity. Amazon S3 (BASE), Apache Cassandra (BASE), PostgreSQL (ACID), Redis (DIRT).]
  • 46. Storage. [Chart: latency versus capacity. Amazon S3 (BASE), Apache Cassandra (BASE), PostgreSQL (ACID), Redis (DIRT). Annotations: latency, limited throughput, jittery, low memory.]
  • 48. Jittery, Limited Throughput Network Block Storage (EBS) https://app.datad0g.com/dash/dash/1032#/date_range/1308608717016-1309213517016
  • 49. Limited Throughput In Numbers. RAID 0 EBS volumes, m1.large instances.
        Time         DEV       tps     rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz  await   svctm  %util
        03:35:02 PM  dev8-80   375.95  23614.08  5.70      62.83     47.21     125.58  1.26   47.34
        03:35:02 PM  dev8-96   373.63  23749.65  5.64      63.58     45.55     121.91  1.22   45.72
        03:35:02 PM  dev8-112  375.28  23693.47  5.52      63.15     45.52     121.22  1.23   46.31
        03:35:02 PM  dev8-128  375.31  23721.57  7.19      63.22     56.00     148.96  1.34   50.35
        Notes: rd_sec/s is read throughput in sectors/s (total: 368Mb/s); await is average wait in ms; svctm is average service time in ms.
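The 368Mb/s total on the slide above can be roughly cross-checked from the four rd_sec/s values. A minimal sketch, assuming the sar convention of 512-byte sectors and reading "Mb" as megabits; the variable names are illustrative and not from the talk:

```python
# Read throughput per EBS volume, in sectors/s (rd_sec/s column of the slide).
rd_sec_per_s = {
    "dev8-80": 23614.08,
    "dev8-96": 23749.65,
    "dev8-112": 23693.47,
    "dev8-128": 23721.57,
}

SECTOR_BYTES = 512  # assumption: sar reports 512-byte sectors

total_sectors = sum(rd_sec_per_s.values())
total_bytes = total_sectors * SECTOR_BYTES
total_mibit = total_bytes * 8 / 2**20  # binary megabits per second
total_mbit = total_bytes * 8 / 10**6   # decimal megabits per second

print(f"{total_sectors:.2f} sectors/s")
print(f"{total_mibit:.0f} Mibit/s, {total_mbit:.0f} Mbit/s")
```

Both readings land near the slide's figure (about 370 binary or 388 decimal megabits per second), which is under 50 megabytes per second across four striped volumes.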
  • 51. Some Tricks. Software RAID (RAID 0). Offsite backups.
  • 52. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups.
  • 53. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups: streaming replication, S3 backups.
  • 54. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups. Ephemeral volumes and offsite backups: streaming replication, S3 backups.
  • 55. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups. Ephemeral volumes and offsite backups (streaming replication, S3 backups): complexity, Recovery Time Objective, Recovery Point Objective.
  • 56. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups. Ephemeral volumes and offsite backups (streaming replication, S3 backups): complexity, Recovery Time Objective, Recovery Point Objective. Database service: MySQL/Oracle RDS.
  • 57. Some Tricks. Software RAID (RAID 0): limited by slowest volume. Offsite backups. Ephemeral volumes and offsite backups (streaming replication, S3 backups): complexity, Recovery Time Objective, Recovery Point Objective. Database service (MySQL/Oracle RDS): trust, RDS outage 2 months ago.
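The "limited by slowest volume" caveat on RAID 0 can be illustrated with a toy model: a request striped across all volumes completes only when the slowest member has finished, so aggregate throughput is roughly the volume count times the slowest volume's throughput. A sketch under that simplifying assumption (large requests touching every volume), with made-up per-volume figures:

```python
def raid0_throughput(volume_mb_per_s):
    """Toy model: full-stripe I/O proceeds at the pace of the slowest volume."""
    if not volume_mb_per_s:
        raise ValueError("need at least one volume")
    return len(volume_mb_per_s) * min(volume_mb_per_s)


# Hypothetical per-volume throughputs in MB/s; one volume is having a bad day.
healthy = [12.0, 12.0, 12.0, 12.0]
one_slow = [12.0, 12.0, 12.0, 4.0]

print(raid0_throughput(healthy))   # 48.0
print(raid0_throughput(one_slow))  # 16.0
```

In this model a single jittery volume drags the whole array down, so widening the stripe adds exposure as well as bandwidth.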
  • 58. Network Block Storage Is The Dark Side
  • 59. Network Block Storage Is The Dark Side Bait For Enterprise Customers
  • 60. Network Block Storage Is The Dark Side Bait For Enterprise Customers Hard Problem For Cloud Providers
  • 61. Some Do’s And Don’ts. Don’t rely on networked block storage: small data sets only, if you have to. Don’t trust data-at-rest: copy, replicate, back up. Do use S3 if you can: object semantics a limitation; slow but durable.
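The "copy, replicate, back up" advice pairs naturally with verifying each copy rather than trusting it. A minimal sketch using only the Python standard library; the function names are hypothetical, and a local temporary directory stands in for the offsite or S3 target:

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_verified(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir and fail loudly if the copy does not match."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)
    if sha256_of(src) != sha256_of(dest):
        raise IOError(f"checksum mismatch for {dest}")
    return dest


# Demo: the "offsite" directory is only a stand-in for a remote target.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "metrics.db"
    source.write_bytes(b"example payload")
    copy = backup_verified(source, Path(tmp) / "offsite")
    print(copy.name, sha256_of(copy) == sha256_of(source))
```

With a real object store the same pattern applies: record the checksum at write time and compare it again when the backup is read back.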
  • 63. Compute. [Chart: “performance” versus number of nodes. Scale up; shard ACID nodes; add more BASE and DIRT nodes.]
  • 64. Some Do’s And Don’ts. Don’t rely on scale-ups: low memory a hard limit for DBs; noisy neighbors; individual performance poor and jittery. Scale out: first scale up, then shard; parallelize across machines; vector-processing via GPUs.
  • 66. An API for everything: Compute, Storage, Network, Management.
  • 67. Some Do’s And Don’ts. Do use the AWS APIs: almost like magic; rich libraries; ever expanding. Do use tools: e.g. Chef, Puppet, cfengine, etc.; Datadog. Do kill and respawn: low-level debugging impossible; instance creation is cheap.
  • 68. New Rules. New Tools. New Playbook. Same Fundamentals.
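The "then shard" step can be sketched as a stable mapping from a key to a node. This is a generic illustration, not the scheme used in the talk; the key names and shard count are made up:

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical metric keys spread over four nodes.
keys = ["host1.cpu", "host1.load", "host2.cpu", "host3.disk"]
placement = {k: shard_for(k, 4) for k in keys}
print(placement)
```

Plain modulo sharding moves most keys when the shard count changes; consistent hashing is the usual refinement when nodes are added often.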
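"Kill and respawn" amounts to a small control loop: find unhealthy instances, terminate them, launch replacements. A sketch against an in-memory stand-in; `FakeCloud` and all of its methods are hypothetical and do not correspond to any real AWS API:

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCloud:
    """In-memory stand-in for a provider API; every method here is hypothetical."""
    instances: dict = field(default_factory=dict)  # instance id -> healthy?
    _ids: count = field(default_factory=lambda: count(1))

    def launch(self) -> str:
        instance_id = f"i-{next(self._ids):04d}"
        self.instances[instance_id] = True
        return instance_id

    def terminate(self, instance_id: str) -> None:
        del self.instances[instance_id]

    def is_healthy(self, instance_id: str) -> bool:
        return self.instances[instance_id]


def kill_and_respawn(cloud: FakeCloud) -> list:
    """Replace every unhealthy instance with a fresh one; return the new ids."""
    replaced = []
    for instance_id in list(cloud.instances):  # snapshot: we mutate while looping
        if not cloud.is_healthy(instance_id):
            cloud.terminate(instance_id)
            replaced.append(cloud.launch())
    return replaced


cloud = FakeCloud()
a, b = cloud.launch(), cloud.launch()
cloud.instances[b] = False       # simulate a node going bad
new_ids = kill_and_respawn(cloud)
print(new_ids)
print(sorted(cloud.instances))
```

The loop never tries to repair a bad node, which matches the slide's reasoning: low-level debugging is impossible and instance creation is cheap.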