SQL Server 2008 R2 Parallel Data Warehouse
1. SQL Server 2008 R2 Parallel Data Warehouse
Scott Adams, Principal Consultant
scott.adams@imgroup.com
IMGROUP
2. Introduction
Microsoft data warehousing
PDW key features
Fast Track DW & PDW partners
PDW high level architecture
Case study
Summary & suggested next steps
SQL Server 2008 R2 PDW
Data Warehousing Appliances with Microsoft
3. Principal Consultant,
Business Intelligence and
Data Warehousing,
IMGROUP
Senior Architect, Business
Intelligence, Microsoft
Enterprise Group
Head of Data Warehouse
Products, Business
Objects/SAP
Co-Founder & VP
Engineering, Appsmart
Software
Scott Adams
4. SQL Server 2008 R2 Parallel Data Warehouse (PDW)
Formerly project codenamed “Madison”.
Integration of technology acquired from DATAllegro
(acquisition announced in Sept 2008).
Enables 100s of TB data warehouses using industry
standard hardware and SQL Server 2008.
PDW is an MPP (massively parallel processing) system,
consisting of shared-nothing nodes.
PDW enables both centralised data warehouses, and
hub and spoke architecture through connected units.
Hub & Spoke can contain both MPP and SMP (symmetric
multi-processing) nodes, including SQL Server Analysis
Services (OLAP) and enables PB volumes of data.
Introduction
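To make the shared-nothing MPP idea concrete, here is a minimal sketch in Python, not PDW code. It assumes a round-robin split of rows across three nodes and a simple sum per store; the row data, node count, and function names are all hypothetical. Each node aggregates only the rows it owns, and a coordinating step merges the partial results.

```python
from collections import Counter

# Hypothetical rows: (store_id, quantity). Illustrative data only.
rows = [(1, 5), (2, 3), (1, 2), (3, 7), (2, 1), (3, 4)]
NODE_COUNT = 3  # assumed node count, not a PDW configuration value


def scatter(rows, node_count):
    """Give each node its own private slice of the rows (shared nothing)."""
    nodes = [[] for _ in range(node_count)]
    for i, row in enumerate(rows):
        nodes[i % node_count].append(row)  # round-robin placement
    return nodes


def local_totals(node_rows):
    """Each node aggregates only the rows it owns."""
    totals = Counter()
    for store_id, quantity in node_rows:
        totals[store_id] += quantity
    return totals


def gather(partials):
    """A coordinating step merges the per-node partial results."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


nodes = scatter(rows, NODE_COUNT)
result = gather(local_totals(n) for n in nodes)
print(result)
```

Because no node reads another node's rows, adding nodes spreads the scan and aggregation work; only the small partial totals travel to the merge step.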
5. Bill Inmon is widely credited as the “father”
of the data warehouse
Bill is part of the rollout of PDW
Actively involved and supporting Microsoft
with training of Microsoft staff and partners
alike on Data Warehousing 2.0
Comments:
“architecture is sound”
“very well thought out technology”
“a compelling offering in the marketplace”
Bill Inmon: “father” of the data warehouse
6. Introduction: Timeline
Microsoft Confidential—Preliminary Information Subject to
Change
2008 2009 2010 Beyond
Parallel Data Warehouse
MTP Program Launched
Circa 10 Customers Provided with early Madison
Benchmark
Madison Named as SQL Server Parallel DW
List Price at $58K per proc
Microsoft Announce Intention to Acquire
DATAllegro (July)
Acquisition Closes (Sept)
150TB demo of DATAllegro on SQL Server
run at BI Conference (Oct)
Hardware Architectures Identified
Early whitepapers / guidance
Launch date estimated Summer 2010
Project “Madison” MTP 2 Program to Launch (fully functional,
fully performant)
TAP Program (on client site)
RTM in October 2010
Parallel Data Warehouse
PDW vNext
Focus on continually lowering the
costs of high end DW, while
increasing performance
Additional Hardware Partners
Additional functionality
Further integration with MS stack
7. 1. SQL Server 2008 R2
Scalable enterprise database: compression,
MERGE, resource governor, star optimisation
Many multi-terabyte (TB) implementations
2. Fast Track
Enterprise DW reference architectures from
HP, IBM, Bull, Dell and EMC
Accelerate DW deployment
Reduce hardware testing and tuning
Scale from 4 to 48 TB
3. PDW
10s to 100s of TB
Industry standard hardware
Appliance model from HP, Dell or Bull
4. PDW with Hub & Spoke
PDW as an enterprise “hub”, publishing data
to “spokes”
Integrates with existing SQL Server 2008
Hub and spokes interconnected via dedicated
high-speed network (500GB per minute)
Simplified ETL/ELT with a data publishing
model
Microsoft Data Warehousing
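As a rough worked example of the 500 GB per minute hub-to-spoke figure, the sketch below estimates how long publishing a given volume would take. It assumes the quoted rate is sustained end to end, that 1 TB = 1000 GB, and that there is no protocol or load overhead; the function name is hypothetical.

```python
RATE_GB_PER_MIN = 500  # figure quoted on the slide


def publish_minutes(volume_tb, rate_gb_per_min=RATE_GB_PER_MIN):
    """Minutes to move volume_tb at a sustained rate, assuming 1 TB = 1000 GB."""
    return volume_tb * 1000 / rate_gb_per_min


for tb in (1, 10, 48):
    print(f"{tb} TB -> {publish_minutes(tb):.0f} min")
```

Under these assumptions a 10 TB data mart refresh would take about 20 minutes of transfer time; real figures would depend on the workload and appliance configuration.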
8. Data warehouse scalability from 10s to 100s of
Terabytes.
Low cost of ownership through industry standard
hardware.
Simplified deployment and maintenance of
appliance model.
Integration with existing SQL Server 2008 data
warehouses via hub-and-spoke architecture.
Greater ROI from BI investments through
integration with SQL Server 2008.
Reduced risk through use of redundant, industry-
standard hardware.
Predictable performance delivery through
balanced reference architectures.
Better agility and business alignment through hub-
and-spoke architecture.
PDW Key Features
9. Avoid performance problems due to conflicts
between queries from different business units
or IT management functions at peak times.
Provide a dedicated, high-speed network
interconnecting all hub and spoke databases.
Support a centralized data model for metadata
management and data governance without the
limitations of single-platform centralization.
Allow business units to scale their dedicated
data mart platforms separately from the hub
system.
Allow central IT to scale the hub appliance to
meet overall enterprise requirements without
having to scale every dependent data mart
appliance at the same time.
Physically separate the processes associated
with data management and development from
those associated with the consumption of
information by end users.
Microsoft Data Warehousing
Hub & Spoke Architecture
10. Enterprise reference
architecture options available
from all suppliers for Fast Track
Data Warehouse.
SQL Server 2008 R2 Parallel Data
Warehouse appliances available
from all suppliers.
Fast Track DW & PDW Partners
11. An Appliance Experience
All hardware from a single partner
Options:
• Hardware Vendor – Choose Vendor: HP, Dell, IBM, Bull
• Choose Drive Size (300 GB, 450 GB, 1 TB): racks from 30 to 100 TB
Orderable in preconfigured blocks (racks)
Hardware Vendor will:
• Assemble appliances
• Image appliances with OS, SQL Server and PDW software
Appliance installed in less than a day
Support:
• Microsoft provides first call support: 24x7x365 triage
• Hardware partner provides onsite break / fix support
12. Fast Track DW & PDW Partners
HP SQL Server Fast Track Data Warehouse
Basic: 6 – 12 TB, DL38x w/ MSA2000
Mainstream: 12 – 24 TB, DL585 G6 w/ MSA2000
Mainstream: 16 – 32 TB, DL580 G5 w/ MSA2000 G2
Premium: 24 – 48 TB, DL785 G6 w/ MSA2000 G2
HP SQL Server 2008 R2 PDW
Control Rack, Data Rack
14. PDW Architecture - HP
Control Rack
Data Rack/s
“PDW…a highly scalable data warehouse appliance that delivers performance at low
cost through a massively parallel processing (MPP) architecture.”
15. Backup
Integrated backup hardware – internal high speed InfiniBand network access
Parallelised high-speed backup solution
Full or incremental backups
Backup compression
Connect to corporate backup solution, e.g. via HBA
Landing Zone
Provides high capacity staging server for data files from ETL processes
SQL Server Integration Services available
Connected to high speed internal network and external GigE
Available as sandbox for other applications and scripts that run on internal
network.
Source Landing Zone Files Data Loader Compute Nodes
PDW Architecture
16. Date Dim
D_DATE_SK
D_DATE_ID
D_DATE
D_MONTH
…
Item
I_ITEM_SK
I_ITEM_ID
I_REC_START_DATE
I_ITEM_DESC
…
Store Sales
Ss_sold_date_sk
Ss_item_sk
Ss_customer_sk
Ss_cdemo_sk
Ss_store_sk
Ss_promo_sk
Ss_quantity
…
Promotion
P_PROMO_SK
P_PROMO_ID
P_START_DATE_SK
P_END_DATE_SK
…
Store
S_STORE_SK
S_STORE_ID
S_REC_START_DATE
S_REC_END_DATE
S_STORE_NAME
…
Customer
C_CUSTOMER_SK
C_CUSTOMER_ID
C_CURRENT_ADDR
…
Customer Demographics
CD_DEMO_SK
CD_GENDER
CD_MARITAL_STATUS
CD_EDUCATION
…
Database Distributed & Replicated Tables
Data Distribution with Replication
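Node 1: C, I, D, CD, S, P (replicated); SS[1] (distributed)
Node 2: C, I, D, CD, S, P (replicated); SS[2] (distributed)
Node 3: C, I, D, CD, S, P (replicated); SS[3] (distributed)
Node 4: C, I, D, CD, S, P (replicated); SS[4] (distributed)
Store Sales distributions: SS[1], SS[2], SS[3], SS[4]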
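The layout above, with small dimension tables copied to every node and the Store Sales fact table split across nodes, can be sketched as follows. This is an illustration in Python, not PDW syntax. The choice of ss_customer_sk as the distribution key, the modulo stand-in for a hash function, and all sample data are assumptions made for the example.

```python
NODE_COUNT = 4  # matches the four nodes on the slide

# Small dimension table, copied in full to every node (replicated).
item_dim = {101: "widget", 102: "gadget", 103: "gizmo"}

# Fact rows: (ss_item_sk, ss_customer_sk, ss_quantity). Illustrative data.
store_sales = [
    (101, 1, 2), (102, 2, 1), (103, 3, 5),
    (101, 4, 1), (102, 5, 4), (103, 6, 2),
]


def node_for(key, node_count=NODE_COUNT):
    """Map a distribution key to a node. Modulo stands in for a real hash."""
    return key % node_count


# Build each node: full copy of the dimension, one slice of the fact table.
nodes = [{"item": dict(item_dim), "ss": []} for _ in range(NODE_COUNT)]
for row in store_sales:
    nodes[node_for(row[1])]["ss"].append(row)  # assumed key: ss_customer_sk


def local_join(node):
    """Join fact rows to the replicated dimension without leaving the node."""
    totals = {}
    for item_sk, _customer_sk, qty in node["ss"]:
        name = node["item"][item_sk]
        totals[name] = totals.get(name, 0) + qty
    return totals


# Merge the per-node join results into the final answer.
quantity_by_item = {}
for node in nodes:
    for name, qty in local_join(node).items():
        quantity_by_item[name] = quantity_by_item.get(name, 0) + qty

print(quantity_by_item)
```

Replicating the dimensions means every fact-to-dimension join can be resolved on the node that holds the fact rows, so only the small per-node totals need to be combined.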
18. Summary
Parallel DW
Available today
Hub-and-spoke federation for multiple SMP & MPP DMs/DWs
Scale SQL Server 2008 from 10s of TB to 1 PB
Set a new bar in appliance pricing and performance
Fast Track Data Warehouse
Available today
Leverage SQL Server 2008 Enterprise DW enhancements
Scale up to 48 TB
SQL Server 2008
Available today
Enterprise Edition significant DW enhancements
Multiple TB, Compression, Resource Gov., Policy-based Admin
19. IMGROUP Data Warehouse Vision and
Foundation service, including:
Business requirements, objectives and goals
Data Warehouse Readiness and Roadmap
Relational and Dimensional Data Modelling
Extract Transform and Load (ETL)
Online Analytical Processing (OLAP)
Reporting and Analytics, including Data Mining
Dashboards and Scorecards
Scott Adams scott.adams@imgroup.com
Steve Prokopiou steve.prokopiou@imgroup.com
Suggested Next Steps