2. Intro
Scaling the database layer
Understanding Sharding Basics
Demo
Performance
Limitations
Conclusions
3. Scale OUT - Hardware
Scale UP
Master / Slave
Partitioned views
Table Partitioning
Windows Azure SQL Database
Scaling the database layer
4. Range Partitioning
List Partitioning
Hash Partitioning
Sharding Basics – Types of sharding
5. Problems it can address
Current Performance Issues
Physical hardware constraints
Logical constraints
Security (Separation of data)
Planning for future growth
Start Small
Grow on demand
Cater for high volume periods
Fewer surprises
Complex to Maintain
Schema maintenance
Monitoring of growth
Manual splitting of Shards - downtime
Sharding Basics
6. Editions
Web: 100 MB – 5 GB
Business: 10 GB – 150 GB
Premium – dedicated memory / CPU / IO
Developer Tools
Azure Console
Visual Studio
SSMS
Azure – SQL Database
10. Merge Operations
Fan-out Queries
Schema Management
Policy based auto-repartitioning
Multi column federation keys
Data Sync Services
No Backup/Restore Operation
Current Limitations
11. Costs
Own Server + OS + SQL Ent (R75000 P/M)
Azure VM + OS + SQL Ent (R36000 P/M)
Azure SQL Database (R27000 P/M)
Growth
Linear Scalability (Size & Performance)
Maturity
Been available for 2 years already
Continues to improve
Enterprise Ready?
Yes… But
Conclusions
Database architect at MiX Telematics, specialising in solution architecture across their OLTP and DW databases. With 15 years of in-the-trenches experience with SQL Server, I have been providing solutions using SQL Server since version 4.2, and have previously done work for insurance, ICT, marketing and mining companies.

Creating a federated SQL database in Azure can allow your data to scale out as it grows. The session will primarily be demos covering setting up and configuring a federated SQL database in Azure, how to monitor the growth of federations, and how to split federations. It will also cover some limitations and disadvantages that need to be taken into consideration when deciding whether a federated SQL database is suitable for your business.
Scale Out – more servers / create a controller database / change connections / licensing costs higher / development costs higher.
Scale Up – buy bigger, more powerful hardware; get new toys. 2–3 years later everyone is bemoaning the purchase and how bad a buy it was (ES7000?). License costs came down.
Master / Slave (replication) – separation of read / write tasks.
Partitioned views – Standard edition; across databases and servers.
Table partitioning – reduced IO if accessed correctly; caters for very large volumes of data.
WASD – a progression of the existing technologies, incorporating a bit of everything.
A major difficulty with sharding is determining where to write data. There are several approaches, but they can be broken down into three categories: range partitioning, list partitioning, and hash partitioning.

Range partitioning splits data across servers using a range of key values: rows 1 to 100 go to database A, rows 101 to 200 to database B, and so on. While logical, this has some problems. It creates write hot spots – all new data is written to the server owning the newest range, and sending all writes to a single server doesn't help us scale out. Range partitioning also doesn't guarantee an even distribution of data or activity patterns.

List partitioning is similar to range partitioning, but instead of defining a range of values, we assign a row to a database based on some intrinsic data property – department, geographic region, or customer id. This can be an effective way to shard data: members of each list grouping are likely to use the same features and have similar data growth patterns. The downsides are that it may require domain knowledge to create effective lists, and the lists are unlikely to experience even growth.

The last approach is hash partitioning. Instead of partitioning by some property of the data, we assign data to a database pseudo-randomly, by applying a hashing function to some property of the data. Hashing takes an input of any length and produces an identifiable output of a fixed length – we're mapping arbitrarily sized strings to numbers in a known range.

A naive approach would be to split data into multiple buckets (three in this example) based on the output of the hashing function modulo the number of servers. If the business grows and we decide to scale out to additional servers, nearly all of the data will need to be re-written. A more sophisticated approach uses something called consistent hashing.
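The three routing strategies above can be sketched in a few lines. This is an illustrative sketch, not part of the session: the key ranges, region list, and shard names are assumptions chosen for the example.

```python
import hashlib

# Illustrative shard maps – the ranges and regions here are assumptions
# for the sketch, not values from the session.
RANGES = ((1, 100, "A"), (101, 200, "B"), (201, 300, "C"))
REGIONS = {"EMEA": "A", "APAC": "B", "AMER": "C"}

def range_shard(row_id):
    """Range partitioning: route by which key range the id falls into."""
    for low, high, shard in RANGES:
        if low <= row_id <= high:
            return shard
    raise ValueError(f"no shard covers id {row_id}")

def list_shard(region):
    """List partitioning: route by an intrinsic property (here, region)."""
    return REGIONS[region]

def hash_shard(key, n_shards=3):
    """Naive hash partitioning: hash the key, then take it modulo the
    shard count. Adding a shard changes almost every assignment."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % n_shards
```

The last function demonstrates the re-write problem: changing `n_shards` from 3 to 4 moves roughly three quarters of all keys to a different shard, which is what consistent hashing avoids.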
Consistent hashing distributes keys along a continuum – think of it as a ring. Each of our sharded databases is responsible for a portion of the data. If we want to add another server, we just add it into the ring and it takes over a portion of the hashed values. With consistent hashing, only a small portion of the data needs to be re-written, unlike the naive hashing example, where all data needs to be redistributed.
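A minimal consistent-hash ring can be built with a sorted list and binary search. This is a sketch of the general technique, not anything Azure-specific; the use of MD5 and 100 virtual nodes per server are arbitrary assumptions.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: each server owns arcs of a hash
    circle, so adding a server remaps only the keys on its new arcs."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual nodes per server, for balance
        self.ring = []             # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}:{i}"), node))

    def get(self, key):
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]
```

Adding a fourth server to a three-server ring moves only the keys that fall on the new server's arcs (roughly a quarter of them), while every other key keeps its existing assignment.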
Constantly having to increase CPU / memory / disk, or having to replace hardware with more powerful, larger hardware.
Concurrent connections, IO contention, locking etc.
Storing multiple clients' data in one database – row-level security to ensure they can't access someone else's data.
Start off small (low cost) and increase as growth occurs. Less likely to be surprised by sudden unusual growth.
Ensuring the schema is consistent across all shards.
Web edition – the 100 MB option only through the Console; server admin only.
150 databases per server (22 TB federation).
At MAXSIZE you will receive error code 40544 – up to a 15-minute delay to add new data.
180 concurrent worker threads.
Sessions – internal (< 2000).
1 million locks per session.
5 GB / 2 GB tempdb per session.
Memory wait > 20 sec – sessions using more than 16 MB for longer than 20 seconds are terminated, from highest usage to lowest.
Transaction duration: 24 hours / 2 sec if locking a system task.
Idle connection timeout – 30 minutes.
P1 – 1 CPU / 200 workers / 2000 sessions / 150 IOPS / 8 GB memory.
P2 – 2 CPU / 400 workers / 4000 sessions / 300 IOPS / 16 GB memory.
3,000 – 5,000 queries per second. Each SQL Database computer is equipped with 32 GB RAM, 8 CPU cores and 12 hard drives. To prevent SQL Database computers from being overloaded and jeopardizing any computer's overall health, workload is monitored by the Engine Throttling component. The Engine Throttling component will block connections of subscribers that use excessive resources to the detriment of a SQL Database computer's health. The degree to which a subscriber's connections are blocked correlates to the SQL Database throttling mode employed, and ranges from blocking inserts and updates only to completely blocking all connectivity. When a subscriber's connection is blocked, attempts to retry the blocked connection will return error 40501 and a reason code. The reason code is a decimal value which specifies both the throttling mode and throttling type, as described in the "Understanding Windows Azure SQL Database Reason Codes" section of that article.
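Because error 40501 means "busy, try again later", clients are expected to retry with backoff rather than fail outright. A minimal sketch of that pattern, with a stand-in exception class rather than a real database driver (the class name, delays, and attempt count are all assumptions for illustration):

```python
import time

class ThrottledError(Exception):
    """Stand-in for SQL Database error 40501 (service busy: retry later)."""
    def __init__(self, reason_code=0):
        super().__init__(f"throttled, reason code {reason_code}")
        self.reason_code = reason_code

def with_retry(op, attempts=5, base_delay=0.01):
    """Call `op`; on throttling, back off exponentially and retry.
    Re-raises the error if still throttled after the final attempt."""
    for attempt in range(attempts):
        try:
            return op()
        except ThrottledError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

In real code the exception would be raised by the database driver, and the reason code could be inspected to distinguish soft throttling (writes blocked) from hard throttling (all connectivity blocked).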
Ticketing Company Scales to Sell 150,000 Tickets in 10 Seconds by Moving to Cloud Computing Solution
Merge federation – no data loss.
Fan-out queries – allow a single query that can process results across a large number of federation members.
Schema management – allow multi-version schema deployment across federation members.
Policy-based auto-repartitioning – manage the split / merge process based on some policy (query response time / db size).
Multi-column federation keys – federate on CustomerId + AccountId.
Data Sync Services – replicate reference data between federations / copy federated data to another database.
No backup/restore operation – manually export data to be able to recover in the event of accidental data loss.
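Until fan-out queries exist server-side, the application has to run the same query against every federation member and merge the results itself. A client-side sketch of that pattern, using in-memory SQLite connections as stand-ins for federation members (the table and connection setup are assumptions for the example):

```python
import sqlite3

def fan_out(shard_conns, sql, params=()):
    """Run the same query on every shard and merge the rows client-side.
    Real federation members would be separate SQL Database connections;
    in-memory SQLite databases stand in for them here."""
    rows = []
    for conn in shard_conns:
        rows.extend(conn.execute(sql, params).fetchall())
    return rows
```

Note that anything beyond a plain union – ordering, TOP/LIMIT, or aggregates like AVG – needs a second client-side pass over the merged rows, which is a big part of why the lack of server-side fan-out hurts.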
Costs – Own server: Dell, 8 CPU, 64 GB RAM, 3 TB storage (RAID 5). Azure VM: 8 CPU, 56 GB RAM, 3 TB storage + backup. SQL DB: 20 × 100 GB federations + 1 TB storage.
Growth – 32787 × 150 GB federations – 4.7 petabytes.
Security concerns. Once you have chosen a provider it's very difficult to move to another, or back to on-premise, if large volumes of data are involved. Ideal for a new application – start small and grow. Migrating an existing application will be more complicated.