Microsoft Database Options
1. Data Options in the Cloud Joe Shirey joe.shirey@microsoft.com
2. Database Choices Axes: Resources (Dedicated, Hosted, Shared) against “Friction”/Control (Low to High). On-premise (dedicated): SQL Server or other s/w on-premise; resource governance @ machine; security @ DB Server/OS. Value props: full h/w control – size/scale; 100% of API surface area; roll-your-own HA/DR/scale. Hosted: hosted SQL Server; resource governance @ VM; security @ DB Server/OS. Value props: 100% of API surface area; roll-your-own HA/DR/scale. SQL Azure Database (shared): virtual DB server; resource governance @ DB; security @ DB/Virtual Server. Value props: auto HA, fault tolerance; friction-free scale; self-provisioning; subset of API surface area. SQL Azure V1 targets scenarios that live in the lower left quadrant
3. SQL Server 2008 R2 Focus Areas Improvements in Scalability and Performance Enhanced Manageability Self Service Business Intelligence Master Data Management
4. Master Data Management Single source of truth for Master Data Thin client management Safeguards of data integrity Data versioning Import/Export capabilities
7. Storage in the Cloud [Diagram: ON-PREMISES (LOB applications, composite applications) and WEB & CLOUDS (third party cloud, web applications), with labels for developer experience (use existing skills and tools), management, connectivity, access control, compute, relational data, and storage]
8. Why Data in the Cloud Capital Costs DBA (or lack thereof) High Availability Performance Scalability
9. Windows Azure Storage The goal is to allow users and services Anywhere at anytime access Store data for any length of time Scale to store any amount of data Be confident that the data will not be lost Pay for only what they use/store
10. Windows Azure Storage Storage Durable Scalable (capacity and throughput) Highly available Rich storage concepts Large user data items: blobs Service state: tables Service communication: queues Simple and familiar programming interfaces REST (HTTP and HTTPS) .NET accessible
11. Fundamental data abstractions Blobs – provide a simple interface for storing named files along with metadata for the file Tables – provide structured storage; a table is a set of entities, which contain a set of properties Queues – provide reliable storage and delivery of messages for an application
13. Blob Features and Functions Store large objects (up to 200GB) Associate metadata with blob metadata is <name, value> pairs, up to 8KB per blob set/get with or separate from blob data bits Standard REST Interface PutBlob Inserts a new blob or overwrites the existing blob GetBlob Get whole blob or a specific range DeleteBlob
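The three blob operations above can be illustrated with a toy in-memory model. This is only a sketch, not the REST interface itself: the class name `BlobStore`, its method names, and the way metadata size is counted (sum of UTF-8 name and value lengths) are assumptions made for illustration.

```python
class BlobStore:
    """Toy in-memory stand-in for a blob container (not the real service)."""

    MAX_METADATA_BYTES = 8 * 1024  # slide: up to 8KB of metadata per blob

    def __init__(self):
        self._blobs = {}  # blob name -> (data bytes, metadata dict)

    def put_blob(self, name, data, metadata=None):
        """Insert a new blob or overwrite the existing blob."""
        metadata = dict(metadata or {})
        # Assumption: size is the sum of encoded name and value lengths.
        size = sum(len(k.encode()) + len(v.encode()) for k, v in metadata.items())
        if size > self.MAX_METADATA_BYTES:
            raise ValueError("metadata exceeds 8KB")
        self._blobs[name] = (bytes(data), metadata)

    def get_blob(self, name, start=None, end=None):
        """Return the whole blob, or the inclusive byte range [start, end]."""
        data, _ = self._blobs[name]
        if start is None:
            return data
        stop = len(data) if end is None else end + 1
        return data[start:stop]

    def get_metadata(self, name):
        """Read metadata separately from the blob data bits."""
        return dict(self._blobs[name][1])

    def delete_blob(self, name):
        del self._blobs[name]
```

For example, after `put_blob("photo", b"hello world", {"owner": "sally"})`, calling `get_blob("photo", 0, 4)` returns only the first five bytes, while `get_metadata("photo")` returns the name/value pairs without touching the data.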
14. Windows Azure Tables Provides structured storage Massively scalable tables Billions of entities (rows) and TBs of data Automatically scales across servers as traffic grows Highly available Anywhere at anytime access to your data Durable Data is replicated at least 3 times Familiar and easy to use programming interfaces ADO.NET data services – .NET 3.5 SP1 .NET classes and LINQ REST - with any platform or language
15. Table Storage Concepts entity table account Name =… Email = … users Name =… Email = … sally Photo ID =… Date =… photo index Photo ID =… Date =…
16. Table Data Model Table A storage account can create many tables Table name is scoped by account Data is stored in tables A table is a set of entities (rows) An entity is a set of properties (columns) Entity Two “key” properties that together are the unique ID of the entity in the table PartitionKey – enables scalability RowKey – uniquely identifies the entity within the partition
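A minimal sketch of this data model, assuming a hypothetical in-memory `Table` class (not the ADO.NET Data Services API): entities are grouped by PartitionKey, and the (PartitionKey, RowKey) pair is the unique ID.

```python
class Table:
    """Toy table: a set of entities addressed by (PartitionKey, RowKey)."""

    def __init__(self, name):
        self.name = name          # table name, scoped by the owning account
        self._partitions = {}     # PartitionKey -> {RowKey -> properties}

    def insert(self, partition_key, row_key, **properties):
        rows = self._partitions.setdefault(partition_key, {})
        if row_key in rows:
            # The key pair must be unique within the table.
            raise KeyError(f"entity ({partition_key}, {row_key}) already exists")
        rows[row_key] = dict(properties)

    def get(self, partition_key, row_key):
        return self._partitions[partition_key][row_key]

    def partition(self, partition_key):
        """All entities that share one partition key value."""
        return dict(self._partitions.get(partition_key, {}))
```

Note that the same RowKey may appear in two different partitions; only the combination has to be unique.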
17. Partition Key and Partitions Every table has a partition key It is the first property (column) of your table Used to group entities in the table into partitions A table partition All entities in a table with the same partition key value Partition key is exposed in the programming model Allows application to control the granularity of the partitions and enable scalability
18. Partition Example Table partition – all entities in table with same partition key value Application controls granularity of partition [Diagram: Partition 1, Partition 2]
19. Purpose of the Partition Key Entity locality Entities in the same partition will be stored together Efficient querying and cache locality Entity group transactions Atomically perform multiple insert/update/delete over entities in same partition in a single transaction Table scalability We monitor the usage patterns of partitions Automatically load balance partitions Each partition can be served by a different storage node Scale to meet the traffic needs of your table
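The entity group transaction idea can be sketched as an all-or-nothing batch that is only accepted when every operation targets the same partition. The class `PartitionedTable` and the tuple format `(op, partition_key, row_key, properties)` are hypothetical, chosen for this sketch; the real service exposes this through its own batch interface.

```python
import copy


class PartitionedTable:
    """Toy table supporting atomic batches within one partition."""

    def __init__(self):
        self.partitions = {}  # PartitionKey -> {RowKey -> properties}

    def batch(self, operations):
        """Apply (op, partition_key, row_key, properties) tuples atomically."""
        keys = {pk for _, pk, _, _ in operations}
        if len(keys) != 1:
            raise ValueError("a batch must target exactly one partition")
        (pk,) = keys
        # Work on a copy so a failure leaves the stored partition untouched.
        working = copy.deepcopy(self.partitions.get(pk, {}))
        for op, _, rk, props in operations:
            if op == "insert":
                if rk in working:
                    raise KeyError(f"{rk} already exists")
                working[rk] = dict(props)
            elif op == "update":
                if rk not in working:
                    raise KeyError(f"{rk} does not exist")
                working[rk].update(props)
            elif op == "delete":
                del working[rk]  # raises KeyError if the entity is missing
            else:
                raise ValueError(f"unknown operation {op!r}")
        self.partitions[pk] = working  # commit only if every step succeeded
```

If any operation in the batch fails, none of the earlier operations in that batch are visible afterwards.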
20. Choosing a Partition Key Granularity of entity group transactions Make the partition key only as big as you need it for entity group transactions Spread out load across partitions More partitions – makes it easier to automatically balance load Currently have one primary index Important to use a partition key that is common in your queries If partition key is part of query Fast access to retrieve entities within a single partition If partition key is not specified in a query Then every partition has to be scanned
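The cost difference between the two query shapes can be made concrete with a small sketch. The function `query` and the nested-dict layout are assumptions for illustration; the function returns how many partitions it had to look at alongside the matches.

```python
def query(partitions, predicate, partition_key=None):
    """Return (matches, partitions_scanned) for a toy partitioned table.

    partitions: {PartitionKey: {RowKey: properties}}
    predicate:  function(properties) -> bool
    """
    if partition_key is not None:
        # Partition key given: only that one partition is touched.
        candidates = {partition_key: partitions.get(partition_key, {})}
    else:
        # No partition key: every partition has to be scanned.
        candidates = partitions
    matches = [
        (pk, rk, props)
        for pk, rows in candidates.items()
        for rk, props in rows.items()
        if predicate(props)
    ]
    return matches, len(candidates)


photos = {
    "sally": {"p1": {"year": 2009}, "p2": {"year": 2010}},
    "bob": {"p1": {"year": 2010}},
    "ann": {"p1": {"year": 2008}},
}
```

With this data, a query for `year == 2010` restricted to `"sally"` scans one partition, while the same query without a partition key scans all three.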
21. Table Entities and Properties Each entity can have up to 255 properties Mandatory properties for every entity in table Partition key Row key All entities have a system maintained version No fixed schema for rest of properties Each property is stored as a <name, typed value> pair No schema stored for a table 2 entities within the same table can have different properties Properties can be the standard .NET types String, binary, bool, DateTime, GUID, int, int64, and double
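A sketch of what these rules imply for a client-side check, with several assumptions stated up front: the function `validate_entity` is hypothetical, the mapping from the listed .NET types to Python types is approximate, and counting the two key properties toward the 255 limit is a guess rather than something the slide states.

```python
import datetime
import uuid

MAX_PROPERTIES = 255  # assumption: PartitionKey and RowKey count toward this
# Approximate Python stand-ins for String, binary, bool, DateTime, GUID,
# int/int64, and double.
ALLOWED_TYPES = (str, bytes, bool, datetime.datetime, uuid.UUID, int, float)
INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def validate_entity(entity):
    """Check one entity (a dict of property name -> typed value)."""
    for key in ("PartitionKey", "RowKey"):
        if key not in entity:
            raise ValueError(f"missing mandatory property {key}")
    if len(entity) > MAX_PROPERTIES:
        raise ValueError("an entity can have at most 255 properties")
    for name, value in entity.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"{name}: unsupported type {type(value).__name__}")
        is_plain_int = isinstance(value, int) and not isinstance(value, bool)
        if is_plain_int and not INT64_MIN <= value <= INT64_MAX:
            raise ValueError(f"{name}: integer does not fit in int64")
    return True
```

Because no schema is stored for the table, two entities with completely different property sets both pass the same check.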
22. Table Programming Model Provide familiar and easy to use interfaces Leverage your .NET expertise Table entities are accessed as objects via ADO.NET Data Services – .NET 3.5 SP1 LINQ – language Integrated query RESTful access to table and entities Insert/update/delete entities over the table Query over tables Get back a list of structured entities
23. Web + Worker Role Pattern Web role Web farm that handles requests from the internet Push work items onto storage queue Worker role Process work items off storage queue [Diagram: public internet, load balancer, n web roles, queue (Q), m worker roles, cloud storage (tables, blobs, queues)]
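The pattern can be sketched locally with threads standing in for role instances and Python's `queue.Queue` standing in for the storage queue. Everything here (`run_pipeline`, the upper-casing "work", the `None` stop marker) is hypothetical scaffolding for illustration, not Windows Azure code.

```python
import queue
import threading


def run_pipeline(requests, worker_count=2):
    """Web role enqueues work items; worker roles drain the queue."""
    work_queue = queue.Queue()   # stand-in for the storage queue
    results = []
    results_lock = threading.Lock()

    def web_role(request):
        # The web role does no heavy work itself; it only pushes an item.
        work_queue.put(request)

    def worker_role():
        while True:
            item = work_queue.get()
            if item is None:     # stop marker, one per worker
                break
            processed = item.upper()  # placeholder for real processing
            with results_lock:
                results.append(processed)

    workers = [threading.Thread(target=worker_role) for _ in range(worker_count)]
    for worker in workers:
        worker.start()
    for request in requests:
        web_role(request)
    for _ in workers:
        work_queue.put(None)
    for worker in workers:
        worker.join()
    return results
```

The number of web roles and worker roles can vary independently because the only thing they share is the queue.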
24. Windows Azure Queues Provide reliable message delivery Simple, asynchronous work dispatch Programming semantics ensure that a message can be processed at least once Queues are highly available, durable and performance efficient Access is provided via REST
25. Queue Storage Concepts Message Queue Account 128x128, http://… thumbnail jobs 256x256, http://… sally http://… photo processing jobs http://…
26. Account, Queues and Messages An account can create many queues Queue name is scoped by the account A queue contains messages No limit on number of messages stored in a queue A message is stored for at most a week in a queue http://<Account>.queue.core.windows.net/<QueueName> Messages Message size <= 8 KB To store larger data, store data in blob/entity storage, and the blob/entity name in the message
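The advice for larger data can be sketched as follows: small payloads travel inside the message, and anything over the 8 KB limit is written to blob storage with only its name placed in the message. The helper names, the `("inline" | "blob", body)` message shape, and the use of a plain list and dict as queue and blob store are all assumptions for this sketch.

```python
import uuid

MAX_MESSAGE_BYTES = 8 * 1024  # slide: message size <= 8 KB


def enqueue_payload(queue_messages, blob_store, payload):
    """Put payload on the queue, spilling large payloads to blob storage."""
    if len(payload) <= MAX_MESSAGE_BYTES:
        queue_messages.append(("inline", payload))
    else:
        blob_name = f"payload-{uuid.uuid4()}"  # hypothetical naming scheme
        blob_store[blob_name] = payload
        # The message carries only the blob name, which is always small.
        queue_messages.append(("blob", blob_name.encode()))


def read_payload(message, blob_store):
    """Resolve a message back to its full payload."""
    kind, body = message
    if kind == "inline":
        return body
    return blob_store[body.decode()]
```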
27. Queue programming API Queues: create/delete/clear queues; inspect queue length Messages: enqueue(queueName, message); dequeue(queueName, invisibility time T) returns the message with a messageID and makes the message invisible for time T; delete(queueName, messageID)
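The dequeue, invisibility, and delete semantics above can be sketched with a toy in-memory queue. The class `SketchQueue` and its injectable `clock` parameter are assumptions for illustration (the real service is accessed via REST); the point is that a dequeued message reappears unless it is deleted before time T elapses.

```python
import itertools
import time


class SketchQueue:
    """Toy queue with invisibility timeouts (not the real REST service)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock               # injectable so tests can fake time
        self._messages = {}               # message ID -> [body, visible_at]
        self._ids = itertools.count(1)

    def enqueue(self, body):
        message_id = next(self._ids)
        self._messages[message_id] = [body, self._clock()]
        return message_id

    def dequeue(self, invisibility_time):
        """Return (message_id, body) for a visible message, or None."""
        now = self._clock()
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message; it stays stored until delete() is called.
                entry[1] = now + invisibility_time
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        self._messages.pop(message_id, None)

    def length(self):
        return len(self._messages)

    def clear(self):
        self._messages.clear()
```

A worker that crashes after `dequeue` never calls `delete`, so the message becomes visible again after the timeout and another worker picks it up. That is why a message may be processed more than once.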
28. Queue Best Practices Make message processing idempotent Need to deal with failures No fixed order for dequeue messages Invisible messages result in out of order processing Use the queue length to scale your workers
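Two of these practices lend themselves to a short sketch: an idempotent handler, and a rule that turns queue length into a worker count. The names `ThumbnailStore` and `workers_needed`, the per-worker throughput figure, and the worker limits are all illustrative assumptions, not recommended values.

```python
import math


def workers_needed(queue_length, messages_per_worker=50,
                   min_workers=1, max_workers=10):
    """Pick a worker count from the current queue length."""
    wanted = math.ceil(queue_length / messages_per_worker)
    return max(min_workers, min(max_workers, wanted))


class ThumbnailStore:
    """Idempotent handler: reprocessing a job leaves the same final state."""

    def __init__(self):
        self.thumbnails = {}  # (photo_url, size) -> thumbnail name
        self.renders = 0      # counts how often real work was done

    def process(self, photo_url, size):
        key = (photo_url, size)
        if key in self.thumbnails:
            # Duplicate delivery: return the earlier result, do no new work.
            return self.thumbnails[key]
        self.renders += 1
        self.thumbnails[key] = f"{photo_url}?size={size}"  # placeholder output
        return self.thumbnails[key]
```

Keying the result on the job's content rather than on the delivery means a message that is delivered twice, or out of order, does not change the outcome.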
30. SQL Azure SQL Server Data Services false start Familiar relational model Uses existing APIs & tools Friction free provisioning and reduced management Built for the cloud with availability and scale Clear feedback: “I want a SQL database in the cloud”
31. Service Provisioning Model Each account has zero or more servers Azure wide, provisioned in a common portal Billing instrument Each server has one or more databases Contains metadata about the databases and usage Unit of authentication Unit of geo-location Generated DNS based name Each database has standard SQL objects Unit of consistency Unit of multi-tenancy Contains users, tables, views, indices, etc. Most granular unit of billing SKUs Web edition – 1 GB Business edition – 10 GB [Diagram: account, server, database]
32. Architecture Shared infrastructure at SQL database and below Request routing, security and isolation Scalable HA technology provides the glue Automatic replication and failover Provisioning, metering and billing infrastructure [Diagram: Machine 4, Machine 5 and Machine 6 each run a SQL Instance with a SQL DB holding UserDB1, UserDB2, UserDB3 and UserDB4; service layers are SDS Provisioning (databases, accounts, roles, …), Metering, and Billing, and Scalability and Availability: Fabric, Failover, Replication, and Load balancing]
39. Programming Model Small data sets Use a single database Same model as on-premise SQL Server Large data sets Partition data across many databases Use parallel fan-out queries to fetch the data Application code must be partition aware
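The large-data-set case can be sketched with several in-memory SQLite databases standing in for SQL Azure databases. This is an assumption-heavy illustration: the `orders` table, the CRC32-based `shard_for` routing, and the helper names are all hypothetical, and SQLite is used only so the example runs locally.

```python
import sqlite3
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_shards(count):
    """Create `count` independent databases, each with the same schema."""
    shards = []
    for _ in range(count):
        # check_same_thread=False lets a pool thread use the connection.
        conn = sqlite3.connect(":memory:", check_same_thread=False)
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        shards.append(conn)
    return shards


def shard_for(customer, shards):
    """Partition-aware routing: a stable hash of the key picks the database."""
    return shards[zlib.crc32(customer.encode()) % len(shards)]


def insert_order(shards, customer, amount):
    conn = shard_for(customer, shards)
    conn.execute("INSERT INTO orders VALUES (?, ?)", (customer, amount))
    conn.commit()


def fan_out_total(shards):
    """Run the same query on every shard in parallel and merge the results."""
    def shard_total(conn):
        row = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()
        return row[0]

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(shard_total, shards))
```

Single-customer lookups go to one database via `shard_for`, while aggregate questions need the fan-out. The application, not the database, carries the knowledge of how data is partitioned.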
43. Customer Learnings from TAP program Use SQL Azure to store metadata and BLOB storage for large files Handling highly elastic load patterns in a cost effective way is an industry challenge Combination of different SKUs
44. Sync Framework [Diagram labels: SQL Azure, Sync Process, Local SQL DB, Local Computer] MSDN - http://msdn.microsoft.com/en-us/sync/default.aspx
46. Pricing Windows Azure Storage: bandwidth $0.10 in / $0.15 out per GB; $0.15/GB stored/month. SQL Azure: bandwidth $0.10 in / $0.15 out per GB; Web edition (1GB) $9.99/month; Business edition (10GB) $99.99/month
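As a worked example of how these rates combine, the sketch below computes a monthly bill from the prices on the slide. The function names and the usage figures in the note are hypothetical; `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

# Rates as listed on the slide.
BANDWIDTH_IN_PER_GB = Decimal("0.10")
BANDWIDTH_OUT_PER_GB = Decimal("0.15")
STORAGE_PER_GB_MONTH = Decimal("0.15")
SQL_AZURE_EDITIONS = {"web": Decimal("9.99"), "business": Decimal("99.99")}


def _gb(value):
    """Convert a number of gigabytes to Decimal without float noise."""
    return Decimal(str(value))


def bandwidth_cost(gb_in, gb_out):
    return _gb(gb_in) * BANDWIDTH_IN_PER_GB + _gb(gb_out) * BANDWIDTH_OUT_PER_GB


def storage_monthly_cost(stored_gb, gb_in, gb_out):
    """Windows Azure Storage: stored data plus bandwidth."""
    return _gb(stored_gb) * STORAGE_PER_GB_MONTH + bandwidth_cost(gb_in, gb_out)


def sql_azure_monthly_cost(edition, gb_in, gb_out):
    """SQL Azure: flat edition price for one database plus bandwidth."""
    return SQL_AZURE_EDITIONS[edition] + bandwidth_cost(gb_in, gb_out)
```

Storing 50 GB with 10 GB in and 20 GB out comes to 7.50 + 1.00 + 3.00 = $11.50 per month; one Web edition database with 1 GB in and 2 GB out comes to 9.99 + 0.10 + 0.30 = $10.39.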
47. Data Options in the Cloud joe.shirey@microsoft.com http://www.joeshirey.com