5. Put your energy into the constraint
Top 5 constraints in IT
1. Dev environments setup
2. QA setup
3. Code Architecture
4. Development
5. Product management
- Gene Kim surveyed:
• 14,000 companies
• 100s of CIOs
7. Automation: Jenkins, TeamCity, Travis
Configuration: Chef, Puppet, Ansible, Vagrant
Compute Virtualization: VMware, OpenStack, Docker
Data Virtualization: ?
10. Data Management is not Agile
• 20% of SDLC time lost waiting for data
• 60% of dev/QA time consumed by data-related tasks
• Data management does not scale to Agile
- Infosys & Compuware
11. Data is the Constraint
• CIO Magazine survey: 60% of projects over schedule; 85% delayed waiting for data; only getting worse
• Gartner: "Data Doomsday": by 2017, 1/3 of IT in crisis
45. QA: Virtual Data
• Fast
• Parallel
• A/B testing
46. Physical Data: late-stage bugs
[Chart: cost to correct a bug by stage (Dev, QA, UAT, Production) rises steeply toward Production. Source: Barry Boehm, "Software Engineering Economics" (1981)]
[Chart: bugs discovered by stage (Dev, Testing, UAT, Production), legacy process]
47. Physical Data: find bugs fast
[Chart: the cost-to-correct curve by stage (Dev, QA, UAT, Production): bugs found early cost far less to fix]
48. The Impact: Shift Left in Quality
[Chart: bugs discovered by stage (Dev, Testing, UAT, Production), legacy vs. with Delphix: bug discovery shifts left, toward Dev and Testing]
51. Virtual Data: Fast Refresh
[Timeline: the legacy cycle alternates 8-hour data refreshes with 20-minute tests; with virtual data, 20-minute tests run back to back]
• Fast
• Full
• Fresh
• Efficient
67. Virtual Data: Audit (1/27/2016)
Production time flow: Prod → DVA instance → Live Archive
Live Archive data for years:
• Archive EBS R11 before upgrade to R12
• Sarbanes-Oxley
• Dodd-Frank
• Financial stress tests
72. Virtual Data: Federated
“I looked like a hero”
- Tony Young, CIO, Informatica
73. Use Case Summary
1. Development & QA: dev throughput increased by 2x
2. Production support: 30 days of history in the size of the source
3. Business continuity: 24x7 ETL & federated cloning
74. Automation: Jenkins, TeamCity, Travis
Configuration: Chef, Puppet, Ansible
Compute Virtualization: VMware, OpenStack, Docker
Data Virtualization: ?
75. Automation: Jenkins, TeamCity, Travis
Configuration: Chef, Puppet, Ansible
Compute Virtualization: VMware, OpenStack, Docker
Data Virtualization: Delphix
76. Virtual Data Quotes
• Projects: “12 months to 6 months.”
- New York Life
• Insurance product: “about 50 days ... to about 23 days”
- Presbyterian Health
• “Can't imagine working without it”
- State of California
77. Summary
• Problem: the data constraint
• Solution: data virtualization
Talking mainly about Delphix.
Which IT tasks have the most impact on company performance?
If you look at what's really impeding flow from development to operations to the customer, it's typically IT operations. Operations can never deliver environments on demand; you have to wait months or quarters to get a test environment. When that happens, terrible things happen: people actually hoard environments. They invite people onto their teams because they know they have a reputation for having a cluster of test environments, so people end up testing on environments that are years old, which doesn't actually achieve the goal.
"One of the most powerful things that organizations can do is to enable development and testing to get the environments they need when they need them."
One of the best predictors of DevOps performance is that IT operations can make environments available on demand to development and test, so that they can build and test the application in an environment that is synchronized with production.
Eliyahu Goldratt
IT bottlenecks
Setting Priorities
Company Goals
Defining Metrics
Fast Iterations
IT version of "The Goal" by E. Goldratt
"One of the most powerful things that organizations can do is to enable development and testing to get the environments they need when they need them."
Not enough resources
Contention on shared environments
Lack of enough environments
Late stage bug discovery
Faulty Data leading to bugs
Subsets
Synthetic data
Old data
Slow environment builds
Delays
Developers waiting
QA slow and expensive
Get the right data
To the right people
At the right time
Not sure if you've run into this, but I have personally experienced the following.
When I was talking to one group at eBay, that development group shared a single copy of the production database between the developers on the team. What this sharing of a single copy of production meant is that whenever a developer wanted to modify that database, they had to submit their changes to code review, and that code review took 1 to 2 weeks.
I don't know about you, but that kind of delay would stifle my motivation, and I have direct experience with the kind of disgruntlement it can cause. When I was last a DBA, all schema changes went through me. It took me about half a day to process schema changes. That delay was too much, so the developers unilaterally decided to go to an EAV (entity-attribute-value) schema, which meant they could add new fields without consulting me and without stepping on each other's feet. It also meant the SQL code was unreadable and performance was atrocious.
Besides creating developer frustration, sharing a database also makes refreshing the data difficult: it takes a while to refresh the full copy, and it takes even longer to coordinate a time when everyone stops using the copy so the refresh can happen. All this means the copy rarely gets refreshed and the data gets old and unreliable.
KLA-Tencor
Stateado
To circumvent the problems of sharing a single copy of production, many shops we talk to create subsets.
One company we talked to spends 50% of its time copying databases. They have to subset because there isn't enough storage, and the subsetting process constantly needs fixing and modification.
Now what happens when developers use subsets?
We talked to Presbyterian Healthcare, and they told us they spend 96% of their QA cycle time building the QA environment and only 4% actually running the QA suite. This happens for every QA run, meaning for every dollar spent on QA there is only 4 cents of actual QA value; the other 96% is infrastructure time and overhead.
What happens now in the industry? Typically the application development life cycle looks like this: we have a production database with production applications running on top of it, and we have developers either customizing that application or writing new functionality for it. We need copies of that data to make sure our code runs correctly when it gets to production. We have teams of people (DBAs, sysadmins, storage admins, etc.) making these copies. It's slow, tedious work to copy all this data, and all the while developers and QA testers are waiting for these copies.
Internet vs browser
Automate or die – the revolution will be automated
The worst enemy of companies today is thinking that they have the best processes that exist, that their IT organizations are using the latest and greatest technology, and that nothing better exists in the field. This mentality will undermine many companies.
http://www.kylehailey.com/automate-or-die-the-revolution-will-be-automated/
Data IS the constraint
Business skeptics are saying to themselves that data processes are just a rounding error in most of their project timelines, and that they are sure their IT has developed processes to fix that. That's the fundamental mistake. The very large and often hidden data tax lies in all the ways that we've optimized our software, data protection, and decision systems around the expectation that data is simply not virtual. The belief that there is no agility problem is part of the problem.
http://www.kylehailey.com/data-is-the-constraint/
Due to the constraints of building clone database environments, one ends up in the "culture of no", where developers stop asking for a copy of a production database because the answer is always "no". If developers need to debug an anomaly seen in production, or need to write a custom module that requires a copy of production, they know not to even ask and just give up.
Everyone stand up. Sit down if your QA data sets are less than:
• a week old
• a month old
• 6 months old
• a year old
• 2 years old
How long does it take a developer to get a copy of a database?
Time: how long to get or make a DB copy?
Dev? QA? DBA?
Old: how old is the data?
BI, DW
QA, Dev
Storage: how much storage is used?
Analysts: batch job windows, lockout periods?
Audits: can you support “?
The fastest query is the query never run.
Delphix radically changes this paradigm.
Delphix is software that we provide as a virtual machine OVA file that you spin up on any commodity Intel hardware. You give us any storage, and Delphix maps its own proprietary file system onto that storage.
Through the web UI you point us at any database or data source: Oracle, SQL Server, Sybase, PostgreSQL, flat files, etc. At link time we take one full copy; we only do it once and never again. We compress the data, so if the data is 3 TB on the source it will be about 1 TB on Delphix.
From then on we just pull in the changed blocks. With the changed blocks, Delphix builds up a timeline of data versions. The default window is 2 weeks, but you can configure it to be 2 months or 2 years. You can spin up a copy of the data, down to the second, at any point in the time window.
Now, with a few clicks of a mouse and in a few minutes, we can spin up copies on developer machines, QA machines, UAT, etc. When we make copies, no data is being moved; we just point the copies to data that already exists on Delphix. There is no data on the target machines; all the data is on Delphix, which looks like a NAS or NFS file server to the target machines. We give them a read-writable point-in-time snapshot of the data.
We also track all the block changes on the virtual databases. With block change tracking on the virtual databases we can do cool things like roll them back, branch them, version them, share them, and bookmark the data.
All this is super simple to run; Delphix can generally be run by a junior DBA in a quarter of their time.
The coolest thing, especially for a DevOps process, is the self-service interface for developers and testers, where they can refresh data from production, roll back changes, and bookmark and share data between dev and QA.
We can treat data the way we treat code.
In the physical database world, 3 clones take up 3x the storage. In the virtual world, 3 clones take up 1/3 the storage thanks to block sharing and compression.
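The block-sharing arithmetic above can be sketched with a toy copy-on-write model. This is purely illustrative (the class names and block counts are made up, and this is not Delphix's actual engine): each clone shares the baseline's blocks and stores only the blocks it overwrites, so N clones cost far less than N physical copies.

```python
# Toy copy-on-write model (illustrative only, not Delphix's engine):
# clones share the baseline's blocks and store only blocks they overwrite.

class Baseline:
    def __init__(self, blocks):
        self.blocks = dict(blocks)        # block_id -> data, stored once

class Clone:
    def __init__(self, baseline):
        self.baseline = baseline
        self.delta = {}                   # only changed blocks live here

    def read(self, block_id):
        # A changed block wins; otherwise fall through to the shared baseline.
        return self.delta.get(block_id, self.baseline.blocks.get(block_id))

    def write(self, block_id, data):
        self.delta[block_id] = data       # copy-on-write: baseline untouched

base = Baseline({i: f"block-{i}" for i in range(1000)})   # the one full copy
clones = [Clone(base) for _ in range(3)]                  # 3 "virtual" copies
clones[0].write(7, "patched")                             # one changed block

physical_blocks = len(base.blocks) + sum(len(c.delta) for c in clones)
print(physical_blocks)   # 1001 blocks back 3 clones, instead of 3000
```

The same sharing is why refreshes and rollbacks are cheap in this model: a rollback just discards a clone's delta, and reads fall back to the shared baseline.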
For example, StubHub went from 5 copies of production in development to 120, giving each developer their own copy. StubHub estimated a 20% reduction in bugs that made it to production.
Slowdowns mean bottlenecks.
Physically independent but logically correlated.
Cloning multiple source databases to the same point in time can be a daunting task. One example from our customers is Informatica, who had a project to integrate 6 databases into one central database. The project was estimated at 12 months, with much of that coming from trying to orchestrate getting copies of the 6 databases at the same point in time.
Like herding cats.
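Provisioning several sources to one consistent instant boils down to finding a timestamp that falls inside every source's retention window. A minimal sketch of that selection (the database names and windows here are hypothetical):

```python
# Hypothetical sketch: pick the latest instant covered by every source
# database's retention window, so all clones can be provisioned to the
# same point in time (physically independent, logically correlated).
from datetime import datetime

# Illustrative retention windows (oldest, newest) per source database
windows = {
    "orders":  (datetime(2016, 1, 1), datetime(2016, 1, 27)),
    "billing": (datetime(2016, 1, 5), datetime(2016, 1, 26)),
    "crm":     (datetime(2016, 1, 3), datetime(2016, 1, 25)),
}

def common_provision_point(windows):
    """Latest moment available in every window, or None if they don't overlap."""
    start = max(oldest for oldest, _ in windows.values())
    end = min(newest for _, newest in windows.values())
    return end if start <= end else None

point = common_provision_point(windows)
print(point)   # 2016-01-25 00:00:00 -> every source can be cloned to this instant
```

With physical copies, hitting one common instant across six databases means coordinating six restore jobs; with a timeline of versions per source, it is just this intersection plus six fast clones.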
Walmart.com
Informatica had a 12-month project to integrate 6 databases. After installing Delphix they did it in 6 months.
"I delivered this early. I generated more revenue. I freed up money and put it into innovation."
They won an award from Ventana Research for this project.
Unshackle yourself from massive infrastructure drag and bureaucratic quagmires, and put a jetpack on your IT organization and application development projects.
Moving the data IS the big gorilla. Eliminating the data tax is crucial to the success of your company. And, if huge databases can be ready at target data centers in minutes, the rest of the excuses are flimsy.
Virtual data uses a small footprint. A truly virtual data platform can deliver full-size datasets cheaper than subsets, can move the time or location pointer on its data very rapidly, and can store any version that's needed in a library at an unbelievably low cost. It can also massively improve application quality by making it reliable and dead simple to return one or many databases to a common baseline in a very short amount of time. Applications delivered with agile data can afford many more full-size virtual copies, eliminating the wait time, extra work, and side effects caused by sharing. With the cost of data falling so dramatically, businesses can radically increase their utilization of existing hardware and storage, delivering much more rapidly without any additional cost. An agile data platform presents data so rapidly and reliably that the data becomes commoditized, and servers that sit idle because it would take too long to rebuild them can now switch roles on demand.