Databases for Storage Engineers

Databases
For storage People

Thomas Kejser
thomas@kejser.org
http://blog.kejser.org
@thomaskejser

Agenda

• The Microsoft Database Stack
• Hard problems the database solves
• File layout and I/O pattern
• Data and Log Files
• Analysis Services Files
• TempDb and other system databases

• Installation of SQL
• Q&A

Product Portfolio
• SQL Server (aka: Core Engine)
• SQL Server Analysis Services (SSAS)
• Tabular
• Multi Dimensional
• SQL Server Service Broker (SSB)
• SQL Server Integration Services (SSIS)
• SQL Server Reporting Services (SSRS)
• SQL Server Data Quality Tools
• SQL Server Master Data Services
• SQL Server Parallel Data Warehouse
• .NET stuff…
• Various Excel plug-ins

• A “full” stack!

What Type of Workload?

Data Returned (rows) against Data Touched (columns):

                    | Small Data Touched | Big Data Touched
Big Data Returned   | Simulation         | ETL
Small Data Returned | OLTP               | BI/DW

A Template OLTP System

“App” tier: .NET, .NET, .NET, .NET (Web Server, Windows License)

Database Tier: Core (Web/Core Licensing, 2 or 4 sockets)

A Template Data Warehouse

(Diagram components: SSIS, Core, SSAS, SSRS)

Tier     | Integration Tier | “Enterprise” Warehouse Tier | BI / Presentation / Cubes
Hardware | Blades           | Large machines              | Medium Servers
CPU      | CPU Intensive    | VERY CPU greedy             |
I/O      | low IOPS         | VERY I/O greedy (GB/sec)    | Can be IOPS greedy

A Template MPP Warehouse

(Diagram components: SSIS, SSAS, Core)

Data Marts (the “spokes”)

Enterprise Warehouse Tier: Appliance (the “hub”)

Management Tools you Need to Know
Pre 2012                                    | 2012
Management Studio (AKA: Enterprise Manager) | Management Studio
Project Data Dude                           | Data Tools
Configuration Manager                       | Configuration Manager
SQL Server Profiler                         | XEvent Tracing
Reporting Services Config Manager           | Reporting Services Config Manager
sp_configure                                | sp_configure / ALTER SERVER

Hard problems databases help you solve

Query Plan Generation

Find all parts bought by
Thomas Kejser

Express the problem, get the solution automatically

To do this well, we need Statistics

SQL Did it

I did it 

THIS is not accurate and it will never be!

… and we Need Indexes

B+ Tree

95% of all database problems* are caused by:

A) Poor indexing

B) Wrong Statistics

C) Badly written queries

D) All of the above

* Low estimate, trying to be nice to humanity

And most of the time,
there is nothing you can do about that*

… which is where storage comes into the picture

* AKA: “Craplications”, technical term

Two types of bad Queries
• The CPU Bound
• Have to help rewrite
• Better storage does not help
• But DBAs may still believe it is I/O
(Diagram: CPU with L2 and L3 caches)

• The I/O bound
• Can throw NAND at it
• I will show you how to diagnose

• DBA people like to talk about this like…

Response time = Service Time + Wait Time

Service Time: Algorithms and Data Structures
Wait Time: “Bottlenecks”

When Speaking about Service Time

• We normally end up talking about bad join plans

• Joins come in three flavours
• Merge
• Hash
• Loop

Merge Join

(Diagram: an m row result and an n row result, both sorted on the join key, are read side by side and matching keys are paired.)

Complexity: O(m + n)
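The merge join above can be sketched in a few lines of Python. This is only an illustration of the technique, not SQL Server's implementation; the function name `merge_join` and the (key, value) row shape are assumptions made for the example.

```python
def merge_join(left, right):
    """Join two lists of (key, value) rows that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1          # left side is behind, advance it
        elif lkey > rkey:
            j += 1          # right side is behind, advance it
        else:
            # Pair this left row with the whole run of equal keys on the right.
            j_run = j
            while j_run < len(right) and right[j_run][0] == lkey:
                out.append((lkey, left[i][1], right[j_run][1]))
                j_run += 1
            i += 1
    return out


left = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
right = [(1, "x"), (2, "y"), (3, "z"), (43, "w")]
print(merge_join(left, right))  # [(1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z')]
```

With unique keys each input is walked exactly once, which is where O(m + n) comes from. The catch is that both inputs must already be sorted.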

Hash Join

(Diagram: the n row join table is loaded into an n row hash table; each row of the m row result is hashed, e.g. Hash(1), and probed against it.)

Complexity: O(m + 2n)
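A minimal Python sketch of the hash join, assuming rows are (key, value) tuples. The name `hash_join` is invented for the example, and a Python dict stands in for the engine's hash table.

```python
from collections import defaultdict


def hash_join(probe_rows, build_rows):
    """Build a hash table on build_rows, then probe it once per probe row."""
    table = defaultdict(list)
    for key, value in build_rows:        # build phase: touch all n rows
        table[key].append(value)
    out = []
    for key, value in probe_rows:        # probe phase: one lookup per row
        for match in table.get(key, ()):
            out.append((key, value, match))
    return out


probe = [(1, "a"), (43, "b"), (13, "c"), (3, "d")]
build = [(3, "x"), (7, "y"), (1, "z")]
print(hash_join(probe, build))  # [(1, 'a', 'z'), (3, 'd', 'x')]
```

Neither input needs to be sorted, but the whole build side has to fit in memory for this to stay cheap.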

Loop Join

(Diagram: for each row of the m row result, the n row B-tree is searched, costing Log(n) reads per row.)

Complexity: O(m * log(n))
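The loop join can be sketched in Python as follows. A sorted list searched with `bisect` stands in for the B-tree here, which is an assumption of this example: it gives the same log(n) seek cost but is not how an index is stored. The name `loop_join` is also invented.

```python
from bisect import bisect_left


def loop_join(outer_rows, index_keys, index_values):
    """For each outer row, seek into a sorted key list (the B-tree stand-in)."""
    out = []
    for key, value in outer_rows:
        pos = bisect_left(index_keys, key)           # O(log n) seek
        while pos < len(index_keys) and index_keys[pos] == key:
            out.append((key, value, index_values[pos]))
            pos += 1
    return out


outer = [(1, "a"), (43, "b"), (13, "c"), (3, "d")]
keys = [1, 3, 7]                  # sorted index keys
values = ["x", "y", "z"]          # row stored under each key
print(loop_join(outer, keys, values))  # [(1, 'a', 'x'), (3, 'd', 'y')]
```

Each outer row pays for its own seek, so this is cheap for a handful of rows and expensive when m is large.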

When Hash Joins hurt you

(Chart: Runtime in seconds, 0 to 30, against Hash Memory in MB, 400 down to 0. One region of the chart is marked “Spill Zone!”.)

Join Hints

B probed, lower table in join
(second table in join statement)

A probed, upper table in join
(first table in join statement)
Just the way it is …

Why is it so hard to get joins right?
(Chart: Time against the input sizes m and n, with one curve each for Loop Join, Merge Join and Hash Join.)

No-one has been able to get joins consistently right!
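A toy cost model shows why the choice is fragile. It plugs m and n into the three complexity formulas from the join slides and picks the smallest. Everything else is an assumption of this sketch: the function names, the `inputs_sorted` flag, and the idea that merge join is simply unavailable on unsorted inputs. A real optimizer weighs far more than this.

```python
import math


def join_costs(m, n, inputs_sorted):
    """Rough cost of each join flavour, using the formulas from the slides."""
    costs = {
        "hash": m + 2 * n,
        "loop": m * math.log2(max(n, 2)),
    }
    if inputs_sorted:
        costs["merge"] = m + n   # only possible when both inputs are sorted
    return costs


def cheapest_join(m, n, inputs_sorted):
    costs = join_costs(m, n, inputs_sorted)
    return min(costs, key=costs.get)


print(cheapest_join(10, 1_000_000, inputs_sorted=True))          # loop
print(cheapest_join(1_000_000, 1_000_000, inputs_sorted=True))   # merge
print(cheapest_join(1_000_000, 1_000_000, inputs_sorted=False))  # hash
```

The winner flips as m and n change, and the optimizer only has estimates of m and n from statistics. A wrong estimate picks the wrong join.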

P = NP ?

Getting I/O right…

Language Processing (Parse/Bind)

Statement/Batch Execution

Query Optimization (Plan Generation, View Matching, Statistics, Costing)

Query Execution (Query Operators, Memory Grants, Parallelism)

Plan Cache Management

Storage Engine (Access Methods, Database Page Cache, Locking, Transactions, …)

SQL-OS (Schedulers, Buffer Pool, Memory Management, Synchronization Primitives, …)

The Storage Engine Makes I/O Transparent!

Rest of engine
only sees the API

Storage Engine

RAM Storage

Two Different Philosophies

Primitive          | SQL Server                                  | Analysis Services
Scheduling         | Voluntary Yield, User mode                  | Kernel mode, Preemptive
I/O Engine         | Dedicated I/O stack                         | Windows Buffered I/O
Waiting / Spinning | SQLOS Primitives                            | Windows
Memory Management  | SQLOS / Storage Engine                      | Windows Paging
Serialisation      | TDS special purpose                         | XML
Network            | Fully optimizable, async, affinitized engine | Windows primitives, blocking

SQL Server is different

• Primitives are a different beast than Windows
• Scale issues are generally specific to the core, not Windows
• Exposes own “belly of the beast” profiling
• SQL Team build their own primitives, often better than Windows core
• Highest throughput app on Windows, drives all the scale stuff there

Analysis Services is “just another App”

• Analysis Services relies fully on Windows primitives
• You can profile it by looking at how Windows behaves
• Upgrades to Windows are more likely to help it
• No TPC style benchmarks…

A is for Atomic

(Diagram, three panels: table LINEITEM with columns ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY and table ORDER with columns ORDER_KEY, CUSTOMER_KEY.)

C is for Consistency

(Diagram, three panels: table LINEITEM with columns ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY and table ORDER with columns ORDER_KEY, CUSTOMER_KEY. Annotations: LINEITEM.ORDER_KEY = 42 while ORDER.ORDER_KEY != 42; COMMITDATE = 2012-02-30.)

I is for Isolation

Session 1 and Session 2 run the same statements at the same time:

SELECT @LastTransaction_ID = LastTransaction_ID
FROM ATM
WHERE ATM_ID = 13

(both sessions read @LastTransaction_ID = 42)

SET @ID = @LastTransaction_ID + 1

UPDATE ATM
SET LastTransaction_ID = @ID
WHERE ATM_ID = 13
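The race between the two sessions can be replayed deterministically in Python. This is a sketch only: a dict stands in for the ATM table, the function names are invented, and the interleaving is hard-coded rather than produced by real concurrency.

```python
def run_interleaved():
    """Both sessions read before either one writes (no isolation)."""
    atm = {13: 42}                 # ATM_ID -> LastTransaction_ID
    a_last = atm[13]               # session A reads 42
    b_last = atm[13]               # session B reads 42
    a_id = a_last + 1
    b_id = b_last + 1
    atm[13] = a_id                 # session A writes 43
    atm[13] = b_id                 # session B writes 43 again
    return a_id, b_id, atm[13]


def run_serialized():
    """Session B starts only after session A has finished."""
    atm = {13: 42}
    a_id = atm[13] + 1
    atm[13] = a_id
    b_id = atm[13] + 1
    atm[13] = b_id
    return a_id, b_id, atm[13]


print(run_interleaved())  # (43, 43, 43): two transactions share one ID
print(run_serialized())   # (43, 44, 44)
```

Isolation is the guarantee that the database behaves like the second function even when the sessions overlap in time.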

D is for Durability

(Diagram: a stream of “Do Transactions” requests, each answered with an “Ack”.)
Summary – Databases Help You

• Do complex operations in optimal time
• …at high parallelism
• Optimise I/O pattern
• Be ACID compliant
• Store stuff safely…

• noSQL/Big Data systems trade off >0 of
these to get more of the others

System Databases

• Server won’t start without:
• master
• mssqlsystemresource
• System CAN start, but won’t work well
• model
• msdb
• System will start under special conditions
• tempdb

Master and mssqlsystemresource

• Together, contain all system information
• mssqlsystemresource
• Lives under: MSSQL\Binn
• Contains all system code
• Hidden by default
• Master
• Lives under: MSSQL\DATA

• You should move these to a safe location

Disaster: Master or mssqlsystemresource

• You lost:
• All passwords and server logins
• All system wide certificates (You may be unable to decrypt!)
• All System procedures you created
• You are not 100% screwed, but you are in for a long night
• Both can be rebuilt (empty) during server start
• …Or restored from backup
• if you remembered to take one
• Need /f and /T3608 to get back up

Database: model

• Every newly created database is cloned from this
• Loss is not catastrophic
• Copy from healthy machine
• Tempdb can’t boot without it
• Lives with master

Database tempdb

• Database “swap file”

• Does not survive
restarts

• No Durability
guarantees here

• Fast I/O helps

Loss of Tempdb…is…Temporary

• Will rebuild itself after instance restart

• Configuration is stored in master

• Clones from model

• Nearly every installation must change
defaults

• If tempdb cannot be created, server will
only start from command line

User Databases and Failure

• A database consists of
• At least one Transaction Log File
• The PRIMARY filegroup
• At least one data file in PRIMARY
• If any of these are lost, the database is
dead
• You can in some cases bring a database
without a transaction log back alive
• But typically with data loss…
• Lesson: carefully protect all of the above

What is in the Files?

PRIMARY, Primary File: Headers, GAM / SGAM, PFS Map, Metadata (system objects), User Data

Transaction Log: VLF, VLF, VLF

Data Files

• Regular files in NTFS
• Secured
• Files can Auto Grow as needed
• Risky
• File Imbalance

How are Database Files Created?

• ALTER or CREATE DATABASE
• Transaction log file always zeroed out
• This looks super cool on Fusion-io by the way
• Data files MAY be zeroed out
• Depends on privileges
• May use instant file init

Filegroups

• Filegroups (one word) are containers of files
• Used to group similar data together
• Oracle people know this concept as tablespaces
• Files inside FG are accessed/allocated round-robin

(Diagram: filegroups PRIMARY and DATA, each containing several User Data files.)
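Round-robin allocation across the files of a filegroup can be sketched like this. The function name and the file names are hypothetical. The sketch is plain round-robin as the slide describes; the real engine also weights allocation by the free space in each file (proportional fill), which is ignored here.

```python
from itertools import cycle


def allocate_round_robin(files, extent_count):
    """Hand out extents 1..extent_count to the files in turn."""
    placement = {name: [] for name in files}
    targets = cycle(files)
    for extent in range(1, extent_count + 1):
        placement[next(targets)].append(extent)
    return placement


files = ["file_1", "file_2", "file_3", "file_4"]   # hypothetical file names
print(allocate_round_robin(files, 8))
# {'file_1': [1, 5], 'file_2': [2, 6], 'file_3': [3, 7], 'file_4': [4, 8]}
```

When each file sits on its own LUN, this spreads a large scan evenly over all of them.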

Reclaiming/Moving Space in Files

• DBCC SHRINKFILE

• REBUILD data

DBCC SHRINKFILE

(Diagram: numbered blocks 1 to 8 laid out across LUN 1, LUN 2, LUN 3 and LUN 4.)

How to reclaim space the right way…

(Diagram: numbered blocks 1 to 8 alongside a copy of the same blocks in a New Filegroup, across LUN 1, LUN 2, LUN 3 and LUN 4.)

ALTER INDEX Foo ON <table> REBUILD WITH (SORT_IN_TEMPDB = ON)

PFS Contention

• Too few PFS maps can lead to latch contention
• Diagnosed in: sys.dm_os_waiting_tasks
• Look for PAGELATCH_UP

(Diagram: a File laid out as PFS Map, User Data (8000 pages), PFS Map, User Data (8000 pages).)
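To see how few PFS maps a file has, here is a small Python sketch. It assumes the usual figures of 8 KB pages, a PFS interval of 8088 pages (which the diagram rounds to 8000) and the first PFS map on page 1. The function names are invented for the example.

```python
PFS_INTERVAL = 8088   # pages tracked by one PFS map (assumed, see lead-in)
PAGE_KB = 8           # SQL Server page size in KB


def pfs_page_for(page_id):
    """Page number of the PFS map that tracks the given page."""
    if page_id < PFS_INTERVAL:
        return 1
    return (page_id // PFS_INTERVAL) * PFS_INTERVAL


def pfs_page_count(file_size_mb):
    """How many PFS maps a data file of this size contains."""
    pages = file_size_mb * 1024 // PAGE_KB
    return max(1, -(-pages // PFS_INTERVAL))   # ceiling division


print(pfs_page_for(8087))    # 1
print(pfs_page_for(20000))   # 16176
print(pfs_page_count(1024))  # 17
```

Under these assumptions, every allocation in the same 8088-page range updates the same PFS map, which is why many sessions allocating in one small file queue up on a single latch.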

I/O DBA people worry about

• DBAs typically diagnose issues with wait stats
• Issues they look for:
• WRITELOG/LOGBUFFER waits
• PAGEIOLATCH_<X> waits
• BACKUPIO waits
• IO_COMPLETION/ASYNC_IO_COMPLETION

Places you need to know about

• Diagnosing resource waits:
• sys.dm_os_wait_stats
• Post 2008R2 – can use XEvents (harder)
• More detail in:
• sys.dm_io_virtual_file_stats(NULL, NULL)
• Confirm waits here!
• SQL Server errors in log file:
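Returning to the sys.dm_io_virtual_file_stats check: the usual way to confirm an I/O wait is to turn the cumulative stall counters into an average latency per file. The Python sketch below assumes the rows have already been fetched into dicts keyed by the DMV column names (num_of_reads, io_stall_read_ms, num_of_writes, io_stall_write_ms). The "file" label and all sample numbers are invented for illustration.

```python
def average_latency_ms(stall_ms, io_count):
    """Average milliseconds per I/O; 0.0 when the file saw no I/O."""
    return stall_ms / io_count if io_count else 0.0


def file_latencies(rows):
    """Per-file average read and write latency from file-stats rows."""
    report = []
    for row in rows:
        report.append({
            "file": row["file"],
            "read_ms": average_latency_ms(row["io_stall_read_ms"], row["num_of_reads"]),
            "write_ms": average_latency_ms(row["io_stall_write_ms"], row["num_of_writes"]),
        })
    return report


# Invented sample rows, not real measurements.
sample = [
    {"file": "data_1", "num_of_reads": 1000, "io_stall_read_ms": 5000,
     "num_of_writes": 200, "io_stall_write_ms": 400},
    {"file": "log_1", "num_of_reads": 0, "io_stall_read_ms": 0,
     "num_of_writes": 500, "io_stall_write_ms": 250},
]
for line in file_latencies(sample):
    print(line)
```

The counters are cumulative since the instance started, so for a current picture take two snapshots and apply the same arithmetic to the differences.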
