SQL Server 2008 R2 Parallel Data Warehouse
Scott Adams, Principal Consultant
scott.adams@imgroup.com
IMGROUP
• Introduction
• Microsoft data warehousing
• PDW key features
• Fast Track DW & PDW partners
• PDW high level architecture
• Case study
• Summary & suggested next steps
SQL Server 2008 R2 PDW
Data Warehousing Appliances with Microsoft
• Principal Consultant, Business Intelligence and Data Warehousing, IMGROUP
• Senior Architect, Business Intelligence, Microsoft Enterprise Group
• Head of Data Warehouse Products, Business Objects/SAP
• Co-Founder & VP Engineering, Appsmart Software
Scott Adams
• SQL Server 2008 R2 Parallel Data Warehouse (PDW)
• Formerly project codenamed “Madison”.
• Integration of technology acquired from DATAllegro (acquisition announced in July 2008 and closed in Sept 2008).
• Enables 100s of TB data warehouses using industry standard hardware and SQL Server 2008.
• PDW is an MPP (massively parallel processing) system, consisting of shared-nothing nodes.
• PDW enables both centralised data warehouses and hub and spoke architectures through connected units.
• Hub & Spoke can contain both MPP and SMP (symmetric multi-processing) nodes, including SQL Server Analysis Services (OLAP), and enables PB volumes of data.
Introduction
• Bill Inmon is widely credited as the “father” of the data warehouse
• Bill is part of the rollout of PDW
• Actively involved and supporting Microsoft with training of Microsoft staff and partners alike on Data Warehousing 2.0
• Comments:
• “architecture is sound”
• “very well thought out technology”
• “a compelling offering in the marketplace”
Bill Inmon: “father” of the data warehouse
Introduction: Timeline
Microsoft Confidential—Preliminary Information Subject to
Change
2008, 2009, 2010, Beyond
Parallel Data Warehouse
MTP Program Launched
Circa 10 Customers Provided with early Madison
Benchmark
Madison Named as SQL Server Parallel DW
List Price at $58K per proc
Microsoft Announce Intention to Acquire
DATAllegro (July)
Acquisition Closes (Sept)
150TB demo of DATAllegro on SQL Server
run at BI Conference (Oct)
Hardware Architectures Identified
Early whitepapers / guidance
Launch date estimated Summer 2010
Project “Madison” MTP 2 Program to Launch (fully functional,
fully performant)
TAP Program (on client site)
RTM in October 2010
Parallel Data Warehouse
PDW vNext
Focus on continually lowering the
costs of high end DW, while
increasing performance
Additional Hardware Partners
Additional functionality
Further integration with MS stack
1. SQL Server 2008 R2
• Scalable enterprise database: compression, MERGE, resource governor, star join optimisation
• Many multi-terabyte (TB) implementations
2. Fast Track
• Enterprise DW reference architectures from HP, IBM, Bull, Dell and EMC
• Accelerate DW deployment
• Reduce hardware testing and tuning
• Scale from 4 to 48 TB
3. PDW
• 10s to 100s of TB
• Industry standard hardware
• Appliance model from HP, Dell or Bull
4. PDW with Hub & Spoke
• PDW as an enterprise “hub”, publishing data to “spokes”
• Integrates with existing SQL Server 2008
• Hub and spokes interconnected via dedicated high-speed network (500 GB per minute)
• Simplified ETL/ELT with a data publishing model
Microsoft Data Warehousing
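To put the quoted hub-to-spoke rate of 500 GB per minute in context, the following is a minimal back-of-the-envelope sketch rather than a benchmark. It assumes decimal units (1 TB = 1000 GB) and a sustained rate with no load or protocol overhead; neither assumption comes from the slides, and the function name `transfer_minutes` is purely illustrative.

```python
# Back-of-the-envelope transfer times at the quoted 500 GB/minute rate.
# Assumptions (not from the slides): decimal units (1 TB = 1000 GB) and
# a sustained rate with no protocol or load overhead.

QUOTED_RATE_GB_PER_MIN = 500.0


def transfer_minutes(size_tb: float, rate_gb_per_min: float = QUOTED_RATE_GB_PER_MIN) -> float:
    """Return the minutes needed to move size_tb terabytes at the given rate."""
    if rate_gb_per_min <= 0:
        raise ValueError("rate must be positive")
    return size_tb * 1000.0 / rate_gb_per_min


if __name__ == "__main__":
    # Sizes chosen to span the Fast Track ceiling (48 TB) and a 100 TB PDW.
    for size_tb in (1, 10, 48, 100):
        print(f"{size_tb:>4} TB -> {transfer_minutes(size_tb):6.1f} min")
```

Under these assumptions, publishing a full 48 TB data mart from hub to spoke would take roughly an hour and a half, which is why the slides can describe spoke refreshes as a publishing step rather than a separate ETL project.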
• Data warehouse scalability from 10s to 100s of terabytes.
• Low cost of ownership through industry-standard hardware.
• Simplified deployment and maintenance through the appliance model.
• Integration with existing SQL Server 2008 data warehouses via hub-and-spoke architecture.
• Greater ROI from BI investments through integration with SQL Server 2008.
• Reduced risk through use of redundant, industry-standard hardware.
• Predictable performance delivery through balanced reference architectures.
• Better agility and business alignment through hub-and-spoke architecture.
PDW Key Features
• Avoid performance problems due to conflicts between queries from different business units or IT management functions at peak times.
• Provide a dedicated, high-speed network interconnecting all hub and spoke databases.
• Support a centralized data model for metadata management and data governance without the limitations of single-platform centralization.
• Allow business units to scale their dedicated data mart platforms separately from the hub system.
• Allow central IT to scale the hub appliance to meet overall enterprise requirements without having to scale every dependent data mart appliance at the same time.
• Physically separate the processes associated with data management and development from those associated with the consumption of information by end users.
Microsoft Data Warehousing
Hub & Spoke Architecture
• Enterprise reference architecture options available from all suppliers for Fast Track Data Warehouse.
• SQL Server 2008 R2 Parallel Data Warehouse appliances available from all suppliers.
Fast Track DW & PDW Partners
An Appliance Experience
All hardware from a single partner
Options:
• Hardware Vendor – Choose Vendor: HP, Dell, IBM, Bull
• Choose Drive Size (300 GB, 450 GB, 1 TB) – racks from 30 to 100 TB
Orderable in preconfigured blocks - racks
Hardware Vendor will:
• Assemble appliances
• Image appliances with OS, SQL Server and PDW software
Appliance installed in less than a day
Support:
• Microsoft provides first call support – 24x7x365 triage
• Hardware partner provides onsite break / fix support
Fast Track DW & PDW Partners
HP
HP SQL Server Fast Track Data Warehouse:
• Basic: 6–12 TB, DL38x w/ MSA2000
• Mainstream: 12–24 TB, DL585 G6 w/ MSA2000
• Mainstream: 16–32 TB, DL580 G5 w/ MSA2000 G2
• Premium: 24–48 TB, DL785 G6 w/ MSA2000 G2
HP SQL Server 2008 R2 PDW:
• Control Rack, Data Rack
Parallel Data Warehouse Architecture
A Node:
Database Server Storage Node
PDW Architecture - HP
Control Rack
Data Rack/s
“PDW…a highly scalable data warehouse appliance that delivers performance at low
cost through a massively parallel processing (MPP) architecture.”
Backup
• Integrated backup hardware – internal high-speed InfiniBand network access
• Parallelised high-speed backup solution
• Full or incremental backups
• Backup compression
• Connect to corporate backup solution, e.g. via HBA
Landing Zone
• Provides a high-capacity staging server for data files from ETL processes
• SQL Server Integration Services available
• Connected to high-speed internal network and external GigE
• Available as a sandbox for other applications and scripts that run on the internal network.
Source → Landing Zone Files → Data Loader → Compute Nodes
PDW Architecture
Date Dim: D_DATE_SK, D_DATE_ID, D_DATE, D_MONTH, …
Item: I_ITEM_SK, I_ITEM_ID, I_REC_START_DATE, I_ITEM_DESC, …
Store Sales: SS_SOLD_DATE_SK, SS_ITEM_SK, SS_CUSTOMER_SK, SS_CDEMO_SK, SS_STORE_SK, SS_PROMO_SK, SS_QUANTITY, …
Promotion: P_PROMO_SK, P_PROMO_ID, P_START_DATE_SK, P_END_DATE_SK, …
Store: S_STORE_SK, S_STORE_ID, S_REC_START_DATE, S_REC_END_DATE, S_STORE_NAME, …
Customer: C_CUSTOMER_SK, C_CUSTOMER_ID, C_CURRENT_ADDR, …
Customer Demographics: CD_DEMO_SK, CD_GENDER, CD_MARITAL_STATUS, CD_EDUCATION, …
Database Distributed & Replicated Tables
Data Distribution with Replication
C, I, D, CD, S, P, SS[1]
C, I, D, CD, S, P, SS[2]
C, I, D, CD, S, P, SS[3]
C, I, D, CD, S, P, SS[4]
SS[1], SS[2], SS[3], SS[4]
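The layout on this slide, where the dimension tables (C, I, D, CD, S, P) appear in every group while Store Sales is split into slices SS[1] to SS[4], can be sketched in a few lines of Python. This is an illustration only: the four-node count is read off the diagram, while the modulo hash, the choice of `ss_item_sk` as distribution column, and the names `node_for` and `place_tables` are assumptions made for the example. The slides do not describe PDW's actual hash function or storage layout.

```python
# Sketch of "distribute the fact table, replicate the dimensions".
# Assumptions: 4 compute nodes, a simple modulo hash on the distribution
# column, and small in-memory tables. PDW's real hash function and
# storage layout are not described in the slides.

NUM_NODES = 4


def node_for(key: int, num_nodes: int = NUM_NODES) -> int:
    """Map a distribution-column value to a node index (0-based)."""
    return key % num_nodes


def place_tables(fact_rows, dimension_tables, distribution_column, num_nodes=NUM_NODES):
    """Return one dict per node holding its fact slice plus full dimension copies."""
    nodes = [
        {"fact": [], "dims": {name: list(rows) for name, rows in dimension_tables.items()}}
        for _ in range(num_nodes)
    ]
    # Each fact row lands on exactly one node, chosen by its distribution key.
    for row in fact_rows:
        nodes[node_for(row[distribution_column], num_nodes)]["fact"].append(row)
    return nodes


if __name__ == "__main__":
    store_sales = [{"ss_item_sk": i, "ss_quantity": i * 2} for i in range(1, 9)]
    dims = {"item": [{"i_item_sk": i} for i in range(1, 9)]}
    for n, node in enumerate(place_tables(store_sales, dims, "ss_item_sk"), start=1):
        keys = [r["ss_item_sk"] for r in node["fact"]]
        print(f"node {n}: SS[{n}] item keys {keys}, item rows {len(node['dims']['item'])}")
```

The point of the design is that every node holds a complete copy of each dimension, so a join between a Store Sales slice and a dimension can be answered locally without moving rows between nodes; only the large fact table pays the cost of being partitioned.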
Case Study: First Premier Bankcard
PDW Highlights
Summary
Parallel DW
• Available today
• Hub-and-spoke federation for multiple SMP & MPP DMs/DWs
• Scale SQL Server 2008 from 10s of TB to 1 PB
• Set a new bar in appliance pricing and performance
Fast Track Data Warehouse
• Available today
• Leverage SQL Server 2008 Enterprise DW enhancements
• Scale up to 48 TB
SQL Server 2008
• Available today
• Enterprise Edition significant DW enhancements
• Multiple TB, Compression, Resource Gov., Policy-based Admin
• IMGROUP Data Warehouse Vision and Foundation service, including:
• Business requirements, objectives and goals
• Data Warehouse Readiness and Roadmap
• Relational and Dimensional Data Modelling
• Extract, Transform and Load (ETL)
• Online Analytical Processing (OLAP)
• Reporting and Analytics, including Data Mining
• Dashboards and Scorecards
Suggested Next Steps
For more information, contact:
Scott Adams, scott.adams@imgroup.com
Steve Prokopiou, steve.prokopiou@imgroup.com



Editor's notes

  1. ©2009 Microsoft Corporation