Agenda
• About the Project
• Lessons Learned
• Rethinking the tester’s career
Today’s Technique Trends
• Continuous Integration
• Continuous Delivery
• Live Site First
• DevOps
• Bigdata/Hadoop
• Testing in Production
• Real Time Analysis
Please visit infoq.com
Topics covered by this talk
• Monitoring in Production
• Data Driven Quality
• Data Pipeline
• Alert
About Me
Has been on the SQL Server team for 8 years
Lucky to have always reported to great managers
Now mainly focused on Windows Azure SQL Database
I will share the lessons learned from monitoring our service through telemetry.
Blog: http://blogs.msdn.com/b/qingsongyao/
Read my test career blog
What is Windows Azure SQL Database
Windows Azure SQL Database, formerly SQL Azure, is a fully
managed relational database service that delivers flexible
manageability, includes built-in high availability, offers predictable
performance, and supports massive scale-out.
In other words, you create the server and database, and we manage them for
you to achieve HA and reliable performance at low cost.
What we had before
• A flexible tool to get service status in real time through a rich API
• Rich telemetry data exposed in different ways:
o PerfStore stores all perf counters
o OpStore stores all operation records
o Cluster Manager contains state for machines, watchdogs and alerts
The Problem
o Hard to correlate data from different sources
o No separation of telemetry data from customer data
o Very hard to write and deploy new telemetry and alerts
Project Overview
• Problem Statement
We have a lot of data but lack easy ways to retrieve and present it.
• What is the project about?
We want a central place to display real-time information about all clusters.
• What is the goal of this project?
o Help people get insight into the service
o Effectively detect production issues and help people solve them quickly
o Help analyze issues in depth and understand the root cause
Architecture
[Diagram. Labels: SQL Azure Clusters (GPM, LPM, MSDB, PerfStore, OpStore, …); Command Gateway; Data Collector (multi-threaded PowerShell data collection agent); DW; Dashboard; Dashboard Report; Incident Response Team (raise alerts); SQL Azure and IASS.]
Business Value
Trend Analysis
• Using the dashboard for live site incidents
• Drive repair items and feature planning
Internal Monitoring Alert
• From reactive to proactive
• Reduce issue detection and mitigation time
Monitoring Testing Clusters
• All A1 clusters are monitored
Monitoring production
On 3/8 at 7 PM UTC, a couple of machines went down and 150 DBs were impacted.
We started using the dashboard to monitor the recovery progress.
Availability Trend
Bug # | Assert | Count
1229076 | "Assert Assert Failed: Stack: at System.Environment.GetStackTrace(Exception | 1
        | "Assert Assert Failed: ClientId: 00000000-0000-0000-0000-000000000000 NodeInfo: | 66
1229073 | "Assert Assert Failed: Incoming epoch 0-130072965305395635-6f6f103456a59b4f4a44d | 71
1192590 | "Assert Assert Failed: PartitionId <App>dbo</App><TG>UserDb</TG><Lo>0x8000000000 | 85806
1228173 | "FabricUnhandledException System.ArgumentException: Illegal characters in path. | 2
1229079 | "FabricUnhandledException System.ComponentModel.Win32Exception (0x80004005) | 11
1229087 | "FabricUnhandledException System.Data.Fabric.Common.AsyncCallbackException: | 9
1224236 | "FabricUnhandledException System.InsufficientMemoryException: Insufficient winso | 818
1229089 | "FabricUnhandledException System.IO.FileNotFoundException: Could not find file ' | 5
1226404 | "FabricUnhandledException System.IO.IOException: The process cannot access the f | 9
1228178 | "FabricUnhandledException System.NullReferenceException: Object reference not se | 2
1229081 | "FabricUnhandledException System.ObjectDisposedException: Cannot access a dispos | 1
1229084 | "FabricUnhandledException System.Runtime.CallbackException: Async Callback threw | 3
When one cluster has an outage, we scan all exceptions and measure the
potential impact on other clusters.
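The cross-cluster exception scan could be sketched roughly as follows. This is only an illustration, not the dashboard's actual implementation: the function names, the (cluster, message) record shape, and the normalization rules are all assumptions. The idea is to collapse each message into a coarse signature (GUIDs and numbers masked) and count how often each signature appears per cluster.

```python
import re
from collections import Counter, defaultdict

# Assumed normalization rules: mask GUIDs first, then hex and decimal literals,
# so that different instances of the same failure share one signature.
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
NUMBER_RE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def signature(message, max_len=60):
    """Reduce an exception message to a coarse, truncated signature."""
    masked = GUID_RE.sub("<guid>", message)
    masked = NUMBER_RE.sub("<n>", masked)
    return masked[:max_len]


def impact_by_cluster(records):
    """records: iterable of (cluster, message) pairs (hypothetical shape).

    Returns {signature: Counter({cluster: count})}, which shows which other
    clusters are hitting the same failure as the cluster with the outage.
    """
    result = defaultdict(Counter)
    for cluster, message in records:
        result[signature(message)][cluster] += 1
    return result
```

Sorting the result by total count per signature would give a view similar to the bug/assert/count table above, with one extra dimension for the cluster.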
Incident
SE Repl LCK_M_X Hit Per Cluster
Trend Analysis and Prediction
Alert
New Alert based on dashboard
The original goal was to collect real-time cluster information in a
dashboard.
It quickly turned into a very important way to alert on and
resolve live site issues.
Highlights
• Data lag is usually less than 10 minutes
• Data is aggregated at a central DW
• Writing and deploying a new alert takes hours
• We can always watch and tune an alert at any time
Alert
From Passive to Reactive and to Predictive
What happened yesterday:
• Customers notified us that we had an outage.
• Every day we only looked at issues that had happened in the past.
What happens today with the assistance of the dashboard:
• You always know what is happening in a cluster now.
• You notice a live site issue as soon as it happens.
• You have enough information to troubleshoot.
Long Term Alert Process
Monitoring data generators
• SAWA
• Autopilot
• MDS
• Internal customers
• Real-time log parsing
• (No alerts fire at this stage.)
Automatic data aggregation
• Filter noisy data
• Align data by time series
• Enable cross-domain/cross-dimension analysis
Automatic issue detection
• Based on the cluster health model
• Built-in knowledge of issue diagnostics (replaces TSG)
• Heuristic and statistical models
Fast and accurate solutions for issues
• Greatly reduce false failures
• Root causes are correctly identified
• Issues are routed to the right team
• Auto-health support will be built into the system
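As a rough illustration of the aggregation and detection stages, the sketch below aligns samples from different sources onto a shared time axis and applies a simple statistical check. Everything here is an assumption made for the example: the 5-minute bucket, the (timestamp, source, value) sample shape, and the mean-plus-k-standard-deviations rule stand in for whatever health model the real system uses.

```python
from collections import defaultdict
from statistics import mean, pstdev

BUCKET_SECONDS = 300  # assumed 5-minute alignment window


def align(samples, bucket_seconds=BUCKET_SECONDS):
    """samples: iterable of (epoch_seconds, source, value) tuples (hypothetical).

    Returns {bucket_start: {source: mean value}} so that data from different
    sources can be compared bucket by bucket.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for ts, source, value in samples:
        bucket = int(ts) - int(ts) % bucket_seconds
        grouped[bucket][source].append(value)
    return {
        bucket: {source: mean(values) for source, values in by_source.items()}
        for bucket, by_source in sorted(grouped.items())
    }


def is_anomalous(history, current, k=3.0, min_points=5):
    """Flag `current` if it exceeds the history mean by more than k std devs."""
    if len(history) < min_points:
        # Too little data to judge: stay quiet rather than raise a noisy alert.
        return False
    mu = mean(history)
    sigma = pstdev(history)
    return current > mu + k * max(sigma, 1e-9)
```

Averaging within a bucket is one simple way to damp noisy samples; a median or a percentile would be a reasonable alternative if outliers dominate.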
Lessons learned from
building a data pipeline
Choosing the right technique is important
o You don’t necessarily need Hadoop to process a large
amount of data.
o Latency matters: the faster you can get the data,
the more valuable it is.
o Allow others to quickly author and consume your
data.
Build resilience into your data pipeline
o The flow of one kind of data does not impact any
other flows
o Build retry logic into your data flow
o Always assume that your data flow can
fail, and allow the same flow to be reprocessed
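The three resilience points can be sketched together as below. This is a minimal illustration under stated assumptions, not the talk's actual collector: `run_with_retry`, `run_flows`, the dict-based `store`, and the keyed-row shape are all hypothetical. Each flow is retried with exponential backoff, a failing flow is recorded without blocking the others, and rows are upserted by key so the same flow can be reprocessed safely.

```python
import time


def run_with_retry(step, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call step() up to `attempts` times with exponential backoff in between."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            sleep(base_delay * 2 ** (attempt - 1))


def run_flows(flows, store, attempts=3, base_delay=1.0):
    """flows: {name: callable returning {key: row}} (hypothetical shape).

    Returns {name: exception} for flows that still failed after retrying.
    """
    failed = {}
    for name, flow in flows.items():
        try:
            rows = run_with_retry(flow, attempts=attempts, base_delay=base_delay)
        except Exception as exc:
            failed[name] = exc  # isolate the failure; other flows still run
            continue
        # Upsert by key: reprocessing the same flow overwrites, never duplicates.
        store.setdefault(name, {}).update(rows)
    return failed
```

Keying rows by a stable identifier is what makes reprocessing harmless here; with an append-only store, a rerun would need a separate deduplication step.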
Monitor your pipeline
• Data processing time
• Data processing error frequency
• Performance of your database
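One way to capture the first two measurements is a small tracker wrapped around each flow run. The sketch below is illustrative only; the `FlowStats` class and its fields are assumptions, and database performance would need to be measured separately on the database side.

```python
import time
from contextlib import contextmanager


class FlowStats:
    """Hypothetical per-flow tracker for processing time and error frequency."""

    def __init__(self):
        self.runs = 0
        self.errors = 0
        self.durations = []  # seconds per run, including failed runs

    @contextmanager
    def track(self, clock=time.monotonic):
        start = clock()
        self.runs += 1
        try:
            yield
        except Exception:
            self.errors += 1
            raise  # the tracker observes failures, it does not swallow them
        finally:
            self.durations.append(clock() - start)

    @property
    def error_rate(self):
        return self.errors / self.runs if self.runs else 0.0

    @property
    def max_duration(self):
        return max(self.durations, default=0.0)
```

A run would be wrapped as `with stats.track(): ...`, and an alert could then fire when `error_rate` or `max_duration` crosses a threshold chosen for that flow.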
How we run the Cluster Dashboard
• DevOps model:
o New changes need to pass unit tests
o Deployed to the test cluster dashboard for a couple of hours
o Xcopy deployment to production on demand
• HA and monitoring built in
o Backup collector and DW machines
o DW has daily full backups and hourly incremental backups
o Measure, monitor, and alert on both collector and DW machines
• Data size and performance
o Key tables and queries are extensively tuned for better performance
o A data retention policy is applied to several tables
Rethinking the tester’s career
What am I doing every day?
• 0% writing tests
• 0% sign-off
• 0% test planning
• 0% on test lab
• 60% monitoring production
• 40% learning and thinking
Key Takeaway
• Data visualization is needed for people to understand
the data.
• Your telemetry/big data project should drive actions
instead of only providing data.
• It takes time and resources to build a data pipeline,
and building one is a fun learning process.
• Alerts have a life cycle as well.

Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
 
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting  High Prof...
VIP Model Call Girls Hadapsar ( Pune ) Call ON 9905417584 Starting High Prof...
 
( Pune ) VIP Pimpri Chinchwad Call Girls 🎗️ 9352988975 Sizzling | Escorts | G...
( Pune ) VIP Pimpri Chinchwad Call Girls 🎗️ 9352988975 Sizzling | Escorts | G...( Pune ) VIP Pimpri Chinchwad Call Girls 🎗️ 9352988975 Sizzling | Escorts | G...
( Pune ) VIP Pimpri Chinchwad Call Girls 🎗️ 9352988975 Sizzling | Escorts | G...
 
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
WhatsApp 📞 8448380779 ✅Call Girls In Mamura Sector 66 ( Noida)
 
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls DubaiDubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
 
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Himatnagar 7001035870 Whatsapp Number, 24/07 Booking
 

Sql azure cluster dashboard public.ppt

  • 1. Agenda • About the Project • Lessons Learned • Rethinking the Tester’s Career
  • 2. Today’s Technology Trends • Continuous Integration • Continuous Delivery • Live Site First • DevOps • Big Data/Hadoop • Testing in Production • Real-Time Analysis (please visit infoq.com) Topics covered by this talk • Monitoring in Production • Data-Driven Quality • Data Pipeline • Alerts
  • 3. About Me: I have been on the SQL Server team for 8 years and have been lucky to always report to great managers. I now focus mainly on Windows Azure SQL Database, and I will share the lessons learned from monitoring our service through telemetry. Blog: http://blogs.msdn.com/b/qingsongyao/ (read my test career blog)
  • 4. What is Windows Azure SQL Database? Windows Azure SQL Database, formerly SQL Azure, is a fully managed relational database service that delivers flexible manageability, includes built-in high availability, offers predictable performance, and supports massive scale-out. In other words, you create the server and database, and we manage them for you to achieve HA and reliable performance at low cost.
  • 5. What we had before • Flexible tool to get service status in real time through a rich API • Rich telemetry data exposed in different ways: o PerfStore stores all perf counters o OpStore stores all operation records o Cluster Manager contains state for machines, watchdogs and alerts The Problem o Hard to correlate data from different sources o No separation of telemetry data from customer data o Very hard to write and deploy new telemetry and alerts
  • 6. Project Overview • Problem Statement: We have a lot of data but lack an easy way to retrieve and present it. • What is the project about? We want a central place to display real-time information about all clusters. • What is the goal of this project? o Help people get service insight o Effectively detect production issues and help people solve them quickly o Help analyze issues in depth and understand the root cause
  • 7. Architecture (diagram; labels only): SQL Azure Clusters; Data Collector; Command Gateway; Dashboard DW; Incident Response Team; raise alerts; Dashboard Report; GPM; LPM; MSDB; PerfStore; OpStore; …; Multi-Thread PowerShell Data Collection Agent; SQL Azure and IaaS
  • 8. Business Value. Trend Analysis: • Using the dashboard for live-site incidents • Driving repair items and feature planning. Internal Monitoring Alerts: • From reactive to proactive • Reduce issue detection and migration time. Monitoring Test Clusters: • All A1 clusters are monitored
  • 10. Availability Trend: On 3/8 at 7 PM UTC, a couple of machines went down and 150 DBs were impacted; we started using the dashboard to monitor the recovery progress.
  • 11. Incident: When we have an outage in one cluster, we scan all exceptions and measure the potential impact on other clusters.
      Bug # | Assert | Count
      1229076 | "Assert Assert Failed: Stack: at System.Environment.GetStackTrace(Exception | 1
      (blank) | "Assert Assert Failed: ClientId: 00000000-0000-0000-0000-000000000000 NodeInfo: | 66
      1229073 | "Assert Assert Failed: Incoming epoch 0-130072965305395635-6f6f103456a59b4f4a44d | 71
      1192590 | "Assert Assert Failed: PartitionId <App>dbo</App><TG>UserDb</TG><Lo>0x8000000000 | 85806
      1228173 | "FabricUnhandledException System.ArgumentException: Illegal characters in path. | 2
      1229079 | "FabricUnhandledException System.ComponentModel.Win32Exception (0x80004005) | 11
      1229087 | "FabricUnhandledException System.Data.Fabric.Common.AsyncCallbackException: | 9
      1224236 | "FabricUnhandledException System.InsufficientMemoryException: Insufficient winso | 818
      1229089 | "FabricUnhandledException System.IO.FileNotFoundException: Could not find file ' | 5
      1226404 | "FabricUnhandledException System.IO.IOException: The process cannot access the f | 9
      1228178 | "FabricUnhandledException System.NullReferenceException: Object reference not se | 2
      1229081 | "FabricUnhandledException System.ObjectDisposedException: Cannot access a dispos | 1
      1229084 | "FabricUnhandledException System.Runtime.CallbackException: Async Callback threw | 3
  • 12. Trend Analysis and Prediction (chart: SE Repl LCK_M_X hits per cluster)
  • 13. Alert
  • 14. New alerts based on the dashboard. The original goal was to collect real-time cluster information in a dashboard. It quickly turned into a very important way to alert on and resolve live-site issues. Highlights • Data lag is usually less than 10 minutes • Data is aggregated in a central DW • Writing and deploying a new alert takes hours • We can always watch and tune an alert at any time
  • 15. Alert
  • 16. From Passive to Reactive and to Predictive. What happened yesterday: • Customers notified us that we had an outage. • Every day we only looked at issues that had happened in the past. What happens today with the assistance of the dashboard: • You always know what is happening in a cluster now. • You notice a live-site issue as soon as it happens. • You have enough information to troubleshoot.
  • 17. Long-Term Alert Process. Monitoring data generators: • SAWA • Autopilot • MDS • Internal customers • Real-time log parsing • (No alerts fire at this stage.) Automatic data aggregation: • Filter noisy data • Align data by time series • Enable cross-domain/dimension analysis. Automatic issue detection: • Based on a cluster health model • Built-in knowledge of issue diagnostics (replaces TSG) • Heuristic and statistical models. Fast and accurate solutions for issues: • Greatly reduce false failures • Root causes are correctly identified • Route to the right team • Auto-health support will be built into the system
  • 18. Lessons Learned from Building a Data Pipeline
  • 19. Choosing the right technology is important o You don’t necessarily need Hadoop to process large amounts of data. o Latency matters: the faster you can get the data, the more valuable it is. o Allow others to quickly author and consume your data.
  • 20. Build resilience into your data pipeline o The flow of one kind of data should not impact any other flow o Build retry logic into your data flow o Always assume that your data flow can fail, and allow the same flow to be reprocessed
  • 21. Monitor your pipeline • Data processing time • Data processing error frequency • Performance of your database
  • 22. How we run the Cluster Dashboard • DevOps model: o Each new change needs to pass unit tests o It is deployed to the test cluster dashboard for a couple of hours o Xcopy deployment to production on demand. • HA and monitoring built in o Backup collector and DW machines o The DW has daily full backups and hourly incremental backups. o Measure, monitor and alert on both collector and DW machines • Data size and performance o Key tables and queries are extensively tuned for better performance o A data retention policy is applied to several tables.
  • 24. What am I doing every day? • 0% writing tests • 0% sign-off • 0% test planning • 0% on test lab • 60% monitoring production • 40% learning and thinking
  • 25. Key Takeaways • Data visualization is needed for people to understand the data. • Your telemetry/big data project should drive actions instead of only providing data. • Building a data pipeline takes time and resources, but it is a fun learning process. • Alerts have a life cycle as well.
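
The resilience points on slide 20 (isolated flows, built-in retry, safe reprocessing) can be illustrated with a minimal Python sketch. This is not the dashboard's actual implementation, which the slides describe only as a multi-threaded PowerShell collection agent feeding a DW. Every name here (`run_with_retry`, `IdempotentSink`, `run_flows`) is hypothetical, and the in-memory dictionary merely stands in for a warehouse table.

```python
import time


def run_with_retry(step, batch_id, attempts=3, delay_seconds=0.0):
    """Call step(batch_id), retrying on any exception up to `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return step(batch_id)
        except Exception as error:  # a real agent would catch narrower errors
            last_error = error
            time.sleep(delay_seconds * attempt)  # simple linear backoff
    raise last_error


class IdempotentSink:
    """Hypothetical stand-in for a warehouse table keyed by (flow, batch_id)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, flow, batch_id, payload):
        # Writing the same (flow, batch_id) twice overwrites the row,
        # so reprocessing a batch never creates duplicates.
        self.rows[(flow, batch_id)] = payload


def run_flows(flows, batch_id, sink):
    """Run each flow on its own so one failing flow does not block the others."""
    failures = {}
    for name, step in flows.items():
        try:
            payload = run_with_retry(step, batch_id)
            sink.upsert(name, batch_id, payload)
        except Exception as error:
            failures[name] = str(error)  # record the failure and keep going
    return failures
```

Keying the sink by flow and batch is the design choice that makes reprocessing safe: a failed or suspect batch can simply be run again, and the returned `failures` mapping gives the pipeline monitoring from slide 21 something concrete to count.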
