SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Processing
Firefox Crash
Reports with
Python

laura@ mozilla.com
@lxt
Overview



• The basics
• The numbers
• Work process and tools
The basics
Socorro




          Very Large Array at Socorro, New Mexico, USA. Photo taken by Hajor, 08.Aug.2004. Released under cc.by.sa
          and/or GFDL. Source: http://en.wikipedia.org/wiki/File:USA.NM.VeryLargeArray.02.jpg
“Socorro has a lot of moving parts”
...
“I prefer to think of them as dancing parts”
Basic architecture (simplified)

              Collector   Crashmover




              cronjobs     HBase        Monitor




             Middleware   PostgreSQL   Processor




              Webapp
Lifetime of a crash


 • Raw crash submitted by user via POST (metadata json +
    minidump)

 • Collected to disk by collector (web.py WSGI app)
 • Moved to HBase by crashmover
 • Noticed in queue by monitor and assigned for processing
Processing


• Processor spins off minidumpstackwalk (MDSW)
• MDSW re-unites raw crash with symbols to generate a stack
• Processor generates a signature and pulls out other data
• Processor writes processed crash back to HBase and to
  PostgreSQL
Back end processing
 Large number of cron jobs, e.g.:

  • Calculate aggregates: Top crashers by signature, URL,
     domain

  • Process incoming builds from ftp server
  • Match known crashes to bugzilla bugs
  • Duplicate detection
  • Match up pairs of dumps (OOPP, content crashes, etc)
  • Generate extracts (CSV) for engineers to analyze
Middleware


• Moving all data access to be through REST API (by end of
   year)

• (Still some queries in webapp)
• Enable other front ends to data and us to rewrite webapp
   using Django in 2012

• In upcoming version (late 2011, 2012) each component will
   have its own API for status and health checks
Webapp


• Hard part here: how to visualize some of this data
• Example: nightly builds, moving to reporting in build time
   rather than clock time

• Code a bit crufty: rewrite in 2012
• Currently KohanaPHP, will be (likely) Django
Implementation details


 • Python 2.6 mostly (except PHP for the webapp)
 • PostgreSQL9.1, some stored procs in pgpl/sql
 • memcache for the webapp
 • Thrift for HBase access
 • HBase we use CDH3
Scale
A different type of scaling:
• Typical webapp: scale to millions of users without
   degradation of response time
• Socorro: less than a hundred users, terabytes of data.
Basic law of scale still applies:
The bigger you get, the more spectacularly you fail
Some numbers



• At peak we receive 2300 crashes per minute
• 2.5 million per day
• Median crash size 150k, max size 20MB (reject bigger)
• ~110TB stored in HDFS (3x replication, ~40TB of HBase data)
What can we do?

• Does betaN have more (null signature) crashes than other
   betas?
• Analyze differences between Flash versions x and y crashes
• Detect duplicate crashes
• Detect explosive crashes
• Find “frankeninstalls”
• Email victims of a malware-related crash
Implementation scale


• > 115 physical boxes (not cloud)
• Now up to 8 developers + sysadmins + QA + Hadoop ops/
  analysts

  • (Yes, hiring.   mozilla.org/careers)

• Deploy approximately weekly but could do continuous if
  needed
Managing complexity
Development process


• Fork
• Hard to install: use a VM (more in a moment)
• Pull request with bugfix/feature
• Code review
• Lands on master
Development process -2

• Jenkins polls github master, picks up changes
• Jenkins runs tests, builds a package
• Package automatically picked up and pushed to dev
• Wanted changes merged to release branch
• Jenkins builds release branch, manual push to stage
• QA runs acceptance on stage (Selenium/Jenkins + manual)
• Manual push same build to production
Note “Manual” deployment =

• Run a single script with the build as
   parameter

• Pushes it out to all machines and restarts
   where needed
ABSOLUTELY CRITICAL

All the machinery for continuous deployment
even if you don’t want to deploy continuously
Configuration management


• Some releases involve a configuration change
• These are controlled and managed through Puppet
• Again, a single line to change config the same way every
   time

• Config controlled the same way on dev and stage; tested the
   same way; deployed the same way
Virtualization


 • You don’t want to install HBase
 • Use Vagrant to set up a virtual machine
 • Use Jenkins to build a new Vagrant VM with each code build
 • Use Puppet to configure the VM the same way as production
 • Tricky part is still getting the right data
Virtualization - 2



• VM work at
       github.com/rhelmer/socorro-vagrant

• Also pulled in as a submodule to socorro
Upcoming


• ElasticSearch implemented for better search including
   faceted search, waiting on hardware to pref it on

• More analytics: automatic detection of explosive crashes,
   malware, etc

• Better queueing: looking at Sagrada queue
• Grand Unified Configuration System
Everything is open (source)


  Fork: https://github.com/mozilla/socorro
  Read/file/fix bugs: https://bugzilla.mozilla.org/
  Docs: http://www.readthedocs.org/docs/socorro
  Mailing list: https://lists.mozilla.org/listinfo/tools-socorro
  Join us in IRC: irc.mozilla.org #breakpad
Questions?




• Ask me (almost) anything, now or later
• laura@mozilla.com

Weitere ähnliche Inhalte

Was ist angesagt?

Easily create dashboards to manage your databases with OVH
Easily create dashboards to manage your databases with OVH Easily create dashboards to manage your databases with OVH
Easily create dashboards to manage your databases with OVH OVHcloud
 
Performance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedPerformance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedTim Callaghan
 
PyCon India 2012: Celery Talk
PyCon India 2012: Celery TalkPyCon India 2012: Celery Talk
PyCon India 2012: Celery TalkPiyush Kumar
 
Introduction to Systems Management with SaltStack
Introduction to Systems Management with SaltStackIntroduction to Systems Management with SaltStack
Introduction to Systems Management with SaltStackCraig Sebenik
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 Peopleconfluent
 
London devops logging
London devops loggingLondon devops logging
London devops loggingTomas Doran
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Docker, Inc.
 
Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019Gerard Klijs
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningAlbert Chen
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the ApplicationAsh Winter
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...HostedbyConfluent
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profilingJon Haddad
 
Hacking on WildFly 9
Hacking on WildFly 9Hacking on WildFly 9
Hacking on WildFly 9JBUG London
 
Rust kafka-5-2019-unskip
Rust kafka-5-2019-unskipRust kafka-5-2019-unskip
Rust kafka-5-2019-unskipGerard Klijs
 
Streaming and Messaging
Streaming and MessagingStreaming and Messaging
Streaming and MessagingXin Wang
 
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache KafkaKafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafkaconfluent
 
Log aggregation and analysis
Log aggregation and analysisLog aggregation and analysis
Log aggregation and analysisDhaval Mehta
 
Zabbix 3.2 presentation June 2017
Zabbix 3.2 presentation June 2017Zabbix 3.2 presentation June 2017
Zabbix 3.2 presentation June 2017Amirhossein Saberi
 

Was ist angesagt? (20)

Easily create dashboards to manage your databases with OVH
Easily create dashboards to manage your databases with OVH Easily create dashboards to manage your databases with OVH
Easily create dashboards to manage your databases with OVH
 
Performance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedPerformance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons Learned
 
PyCon India 2012: Celery Talk
PyCon India 2012: Celery TalkPyCon India 2012: Celery Talk
PyCon India 2012: Celery Talk
 
Introduction to Systems Management with SaltStack
Introduction to Systems Management with SaltStackIntroduction to Systems Management with SaltStack
Introduction to Systems Management with SaltStack
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus Monitoring, the Prometheus Way - Julius Voltz, Prometheus
Monitoring, the Prometheus Way - Julius Voltz, Prometheus
 
Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019
 
High Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance TuningHigh Concurrency Architecture and Laravel Performance Tuning
High Concurrency Architecture and Laravel Performance Tuning
 
Testing Below the Application
Testing Below the ApplicationTesting Below the Application
Testing Below the Application
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
 
Apache pulsar
Apache pulsarApache pulsar
Apache pulsar
 
Hacking on WildFly 9
Hacking on WildFly 9Hacking on WildFly 9
Hacking on WildFly 9
 
Rust kafka-5-2019-unskip
Rust kafka-5-2019-unskipRust kafka-5-2019-unskip
Rust kafka-5-2019-unskip
 
London HUG 12/4
London HUG 12/4London HUG 12/4
London HUG 12/4
 
Streaming and Messaging
Streaming and MessagingStreaming and Messaging
Streaming and Messaging
 
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache KafkaKafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
Kafka Summit NYC 2017 - Introducing Exactly Once Semantics in Apache Kafka
 
Log aggregation and analysis
Log aggregation and analysisLog aggregation and analysis
Log aggregation and analysis
 
Zabbix 3.2 presentation June 2017
Zabbix 3.2 presentation June 2017Zabbix 3.2 presentation June 2017
Zabbix 3.2 presentation June 2017
 

Andere mochten auch

Everything you ever wanted to know about deployment but were afraid to ask
Everything you ever wanted to know about deployment but were afraid to askEverything you ever wanted to know about deployment but were afraid to ask
Everything you ever wanted to know about deployment but were afraid to asklauraxthomson
 
Classroom Blog Math Conf 08
Classroom Blog Math Conf 08Classroom Blog Math Conf 08
Classroom Blog Math Conf 08Darren Mosley
 
Credit Crisis in 30 slides
Credit Crisis in 30 slidesCredit Crisis in 30 slides
Credit Crisis in 30 slidesChrisJCook
 
Introducing the PetroTrust
Introducing the PetroTrustIntroducing the PetroTrust
Introducing the PetroTrustChrisJCook
 
Partnership Finance October 2010
Partnership Finance October 2010Partnership Finance October 2010
Partnership Finance October 2010ChrisJCook
 
Presentation to Combined Heat and Power Roadshow 30 05_13
Presentation to Combined Heat and Power Roadshow 30 05_13Presentation to Combined Heat and Power Roadshow 30 05_13
Presentation to Combined Heat and Power Roadshow 30 05_13ChrisJCook
 
St Margarets Partnership June 2010
St Margarets Partnership June 2010St Margarets Partnership June 2010
St Margarets Partnership June 2010ChrisJCook
 
Talk proposal get_accepted
Talk proposal get_acceptedTalk proposal get_accepted
Talk proposal get_acceptedlauraxthomson
 
Rewrite or refactor: When to declare technical bankruptcy
Rewrite or refactor: When to declare technical bankruptcyRewrite or refactor: When to declare technical bankruptcy
Rewrite or refactor: When to declare technical bankruptcylauraxthomson
 
Migrating big data
Migrating big dataMigrating big data
Migrating big datalauraxthomson
 
Energy Pools - Scottish Energy Institute 11 11 2009
Energy Pools - Scottish Energy Institute 11 11 2009Energy Pools - Scottish Energy Institute 11 11 2009
Energy Pools - Scottish Energy Institute 11 11 2009ChrisJCook
 
No Fracase En Su Negocio
No Fracase En Su NegocioNo Fracase En Su Negocio
No Fracase En Su NegocioNegocios Negocios
 
Money 3.0
Money 3.0Money 3.0
Money 3.0ChrisJCook
 
Energy Pool 20 05 2009
Energy Pool 20 05 2009Energy Pool 20 05 2009
Energy Pool 20 05 2009ChrisJCook
 

Andere mochten auch (14)

Everything you ever wanted to know about deployment but were afraid to ask
Everything you ever wanted to know about deployment but were afraid to askEverything you ever wanted to know about deployment but were afraid to ask
Everything you ever wanted to know about deployment but were afraid to ask
 
Classroom Blog Math Conf 08
Classroom Blog Math Conf 08Classroom Blog Math Conf 08
Classroom Blog Math Conf 08
 
Credit Crisis in 30 slides
Credit Crisis in 30 slidesCredit Crisis in 30 slides
Credit Crisis in 30 slides
 
Introducing the PetroTrust
Introducing the PetroTrustIntroducing the PetroTrust
Introducing the PetroTrust
 
Partnership Finance October 2010
Partnership Finance October 2010Partnership Finance October 2010
Partnership Finance October 2010
 
Presentation to Combined Heat and Power Roadshow 30 05_13
Presentation to Combined Heat and Power Roadshow 30 05_13Presentation to Combined Heat and Power Roadshow 30 05_13
Presentation to Combined Heat and Power Roadshow 30 05_13
 
St Margarets Partnership June 2010
St Margarets Partnership June 2010St Margarets Partnership June 2010
St Margarets Partnership June 2010
 
Talk proposal get_accepted
Talk proposal get_acceptedTalk proposal get_accepted
Talk proposal get_accepted
 
Rewrite or refactor: When to declare technical bankruptcy
Rewrite or refactor: When to declare technical bankruptcyRewrite or refactor: When to declare technical bankruptcy
Rewrite or refactor: When to declare technical bankruptcy
 
Migrating big data
Migrating big dataMigrating big data
Migrating big data
 
Energy Pools - Scottish Energy Institute 11 11 2009
Energy Pools - Scottish Energy Institute 11 11 2009Energy Pools - Scottish Energy Institute 11 11 2009
Energy Pools - Scottish Energy Institute 11 11 2009
 
No Fracase En Su Negocio
No Fracase En Su NegocioNo Fracase En Su Negocio
No Fracase En Su Negocio
 
Money 3.0
Money 3.0Money 3.0
Money 3.0
 
Energy Pool 20 05 2009
Energy Pool 20 05 2009Energy Pool 20 05 2009
Energy Pool 20 05 2009
 

Ähnlich wie Crash reports pycodeconf

Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
 
Flexible compute
Flexible computeFlexible compute
Flexible computePeter Clapham
 
JavaScript Event Loop
JavaScript Event LoopJavaScript Event Loop
JavaScript Event LoopThomas Hunter II
 
Tuenti Release Workflow v1.1
Tuenti Release Workflow v1.1Tuenti Release Workflow v1.1
Tuenti Release Workflow v1.1Tuenti
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitterRoger Xia
 
Fixing_Twitter
Fixing_TwitterFixing_Twitter
Fixing_Twitterliujianrong
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...smallerror
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...xlight
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopEvans Ye
 
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing Environment
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing EnvironmentDCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing Environment
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing EnvironmentDocker, Inc.
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging振东 刘
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQLKonstantin Gredeskoul
 
introduction to node.js
introduction to node.jsintroduction to node.js
introduction to node.jsorkaplan
 
Midwest PHP - Scaling Magento
Midwest PHP - Scaling MagentoMidwest PHP - Scaling Magento
Midwest PHP - Scaling MagentoMathew Beane
 
Spot Trading - A case study in continuous delivery for mission critical finan...
Spot Trading - A case study in continuous delivery for mission critical finan...Spot Trading - A case study in continuous delivery for mission critical finan...
Spot Trading - A case study in continuous delivery for mission critical finan...SaltStack
 
MySQL Infrastructure Testing Automation at GitHub
MySQL Infrastructure Testing Automation at GitHubMySQL Infrastructure Testing Automation at GitHub
MySQL Infrastructure Testing Automation at GitHubIke Walker
 
A Tale of 2 Systems
A Tale of 2 SystemsA Tale of 2 Systems
A Tale of 2 SystemsDavid Newman
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)Panagiotis Kanavos
 
KACE Agent Architecture and Troubleshooting Overview
KACE Agent Architecture and Troubleshooting OverviewKACE Agent Architecture and Troubleshooting Overview
KACE Agent Architecture and Troubleshooting OverviewDell World
 

Ähnlich wie Crash reports pycodeconf (20)

Sanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticiansSanger, upcoming Openstack for Bio-informaticians
Sanger, upcoming Openstack for Bio-informaticians
 
Flexible compute
Flexible computeFlexible compute
Flexible compute
 
JavaScript Event Loop
JavaScript Event LoopJavaScript Event Loop
JavaScript Event Loop
 
Tuenti Release Workflow v1.1
Tuenti Release Workflow v1.1Tuenti Release Workflow v1.1
Tuenti Release Workflow v1.1
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitter
 
Fixing_Twitter
Fixing_TwitterFixing_Twitter
Fixing_Twitter
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing Environment
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing EnvironmentDCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing Environment
DCSF19 Transforming a 15+ Year Old Semiconductor Manufacturing Environment
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL12-Step Program for Scaling Web Applications on PostgreSQL
12-Step Program for Scaling Web Applications on PostgreSQL
 
introduction to node.js
introduction to node.jsintroduction to node.js
introduction to node.js
 
Midwest PHP - Scaling Magento
Midwest PHP - Scaling MagentoMidwest PHP - Scaling Magento
Midwest PHP - Scaling Magento
 
Spot Trading - A case study in continuous delivery for mission critical finan...
Spot Trading - A case study in continuous delivery for mission critical finan...Spot Trading - A case study in continuous delivery for mission critical finan...
Spot Trading - A case study in continuous delivery for mission critical finan...
 
MySQL Infrastructure Testing Automation at GitHub
MySQL Infrastructure Testing Automation at GitHubMySQL Infrastructure Testing Automation at GitHub
MySQL Infrastructure Testing Automation at GitHub
 
A Tale of 2 Systems
A Tale of 2 SystemsA Tale of 2 Systems
A Tale of 2 Systems
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
 
KACE Agent Architecture and Troubleshooting Overview
KACE Agent Architecture and Troubleshooting OverviewKACE Agent Architecture and Troubleshooting Overview
KACE Agent Architecture and Troubleshooting Overview
 

KĂźrzlich hochgeladen

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel AraĂşjo
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vĂĄzquez
 

KĂźrzlich hochgeladen (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Crash reports pycodeconf

  • 2. Overview • The basics • The numbers • Work process and tools
  • 4. Socorro Very Large Array at Socorro, New Mexico, USA. Photo taken by Hajor, 08.Aug.2004. Released under cc.by.sa and/or GFDL. Source: http://en.wikipedia.org/wiki/File:USA.NM.VeryLargeArray.02.jpg
  • 5.
  • 6.
  • 7.
  • 8.
  • 9. “Socorro has a lot of moving parts” ... “I prefer to think of them as dancing parts”
  • 10. Basic architecture (simplified) Collector Crashmover cronjobs HBase Monitor Middleware PostgreSQL Processor Webapp
  • 11. Lifetime of a crash • Raw crash submitted by user via POST (metadata json + minidump) • Collected to disk by collector (web.py WSGI app) • Moved to HBase by crashmover • Noticed in queue by monitor and assigned for processing
  • 12. Processing • Processor spins off minidumpstackwalk (MDSW) • MDSW re-unites raw crash with symbols to generate a stack • Processor generates a signature and pulls out other data • Processor writes processed crash back to HBase and to PostgreSQL
  • 13. Back end processing Large number of cron jobs, e.g.: • Calculate aggregates: Top crashers by signature, URL, domain • Process incoming builds from ftp server • Match known crashes to bugzilla bugs • Duplicate detection • Match up pairs of dumps (OOPP, content crashes, etc) • Generate extracts (CSV) for engineers to analyze
  • 14. Middleware • Moving all data access to be through REST API (by end of year) • (Still some queries in webapp) • Enable other front ends to data and us to rewrite webapp using Django in 2012 • In upcoming version (late 2011, 2012) each component will have its own API for status and health checks
  • 15. Webapp • Hard part here: how to visualize some of this data • Example: nightly builds, moving to reporting in build time rather than clock time • Code a bit crufty: rewrite in 2012 • Currently KohanaPHP, will be (likely) Django
  • 16. Implementation details • Python 2.6 mostly (except PHP for the webapp) • PostgreSQL9.1, some stored procs in pgpl/sql • memcache for the webapp • Thrift for HBase access • HBase we use CDH3
  • 17. Scale
  • 18. A different type of scaling: • Typical webapp: scale to millions of users without degradation of response time • Socorro: less than a hundred users, terabytes of data.
  • 19. Basic law of scale still applies: The bigger you get, the more spectacularly you fail
  • 20. Some numbers • At peak we receive 2300 crashes per minute • 2.5 million per day • Median crash size 150k, max size 20MB (reject bigger) • ~110TB stored in HDFS (3x replication, ~40TB of HBase data)
  • 21. What can we do? • Does betaN have more (null signature) crashes than other betas? • Analyze differences between Flash versions x and y crashes • Detect duplicate crashes • Detect explosive crashes • Find “frankeninstalls” • Email victims of a malware-related crash
  • 22. Implementation scale • > 115 physical boxes (not cloud) • Now up to 8 developers + sysadmins + QA + Hadoop ops/ analysts • (Yes, hiring. mozilla.org/careers) • Deploy approximately weekly but could do continuous if needed
  • 24. Development process • Fork • Hard to install: use a VM (more in a moment) • Pull request with bugfix/feature • Code review • Lands on master
  • 25. Development process -2 • Jenkins polls github master, picks up changes • Jenkins runs tests, builds a package • Package automatically picked up and pushed to dev • Wanted changes merged to release branch • Jenkins builds release branch, manual push to stage • QA runs acceptance on stage (Selenium/Jenkins + manual) • Manual push same build to production
  • 26. Note “Manual” deployment = • Run a single script with the build as parameter • Pushes it out to all machines and restarts where needed
  • 27. ABSOLUTELY CRITICAL All the machinery for continuous deployment even if you don’t want to deploy continuously
  • 28. Configuration management • Some releases involve a configuration change • These are controlled and managed through Puppet • Again, a single line to change config the same way every time • Config controlled the same way on dev and stage; tested the same way; deployed the same way
  • 29. Virtualization • You don’t want to install HBase • Use Vagrant to set up a virtual machine • Use Jenkins to build a new Vagrant VM with each code build • Use Puppet to configure the VM the same way as production • Tricky part is still getting the right data
  • 30. Virtualization - 2 • VM work at github.com/rhelmer/socorro-vagrant • Also pulled in as a submodule to socorro
  • 31. Upcoming • ElasticSearch implemented for better search including faceted search, waiting on hardware to pref it on • More analytics: automatic detection of explosive crashes, malware, etc • Better queueing: looking at Sagrada queue • Grand Unified Configuration System
  • 32. Everything is open (source) Fork: https://github.com/mozilla/socorro Read/file/fix bugs: https://bugzilla.mozilla.org/ Docs: http://www.readthedocs.org/docs/socorro Mailing list: https://lists.mozilla.org/listinfo/tools-socorro Join us in IRC: irc.mozilla.org #breakpad
  • 33. Questions? • Ask me (almost) anything, now or later • laura@mozilla.com

Hinweis der Redaktion

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n
  29. \n
  30. \n
  31. \n
  32. \n
  33. \n