This document discusses using data virtualization to accelerate application projects by 50%. It outlines some common problems with physical data copies, such as bottlenecks, bugs due to old data, difficulty creating subsets, and delays. The document then introduces the concept of using a data virtualization appliance to take snapshots of production data and create thin clones for development and testing environments. This allows for fast, full-sized, self-service clones that can be refreshed quickly. Use cases discussed include improved development and testing workflows, faster production support like recovery and migration, and enabling continuous business intelligence functions.
4. Who is Kyle Hailey
• 1990 Oracle
  - 90 support
  - 92 Ported v6
  - 93 France
  - 95 Benchmarking
  - 98 ST Real World Performance
• 2000 Dot.Com
• 2001 Quest
• 2002 Oracle OEM 10g
  - Success! First successful OEM design
• 2005 Embarcadero
  - DB Optimizer
• 2010 Delphix
When not being a geek
  - Have a little 6-year-old boy and a new baby who take up all my time
7. Automation: Jenkins, Team City, Travis
Data Virtualization: ?
Configuration: Chef, Puppet, Ansible
Compute Virtualization: VMware, OpenStack, Docker
8. Put your energy into the constraint
Top 5 constraints in IT
1. Dev environments setup
2. QA setup
3. Code Architecture
4. Development
5. Product management
- Gene Kim, who surveyed
• 14,000 companies
• 100s of CIOs
10. Data is the constraint
60% of projects over schedule
85% delayed waiting for data
CIO Magazine survey: only getting worse
Gartner: Data Doomsday, by 2017 1/3rd of IT in crisis
12. Application Development Problems
• Not enough resources
• Contention on shared environments
• Lack of enough environments
• Late stage bug discovery
• Faulty data leading to bugs
• Subsets
• Synthetic data
• Old data
• Slow environment builds
• Delays
• Developers waiting
• QA slow and expensive
19. Physical Data : late stage bugs
[Chart: # of bugs found across Dev, QA, and UAT, and Cost To Correct (0 to 70) against Delay in Fixing the bug (1 to 7). Source: Software Engineering Economics, Barry Boehm (1981)]
20. Physical Data : Expensive Refresh
[Diagram: a row of 20 MIN TEST runs, each paired with an 8 Hrs refresh]
22. Copies require People & Time
• Hardware
  - storage, systems, network
  - rack space, power, cooling
• People
  - 1000s of hours per year just for DBAs
  - DBAs
  - Sys Admin
  - Storage Admin
  - Backup Admin
  - Network Admin
• $10s of millions for data center modernizations
36. Technology Core : file system snapshots
• EMC Symmetrix
• NetApp & EMC VNX
• Solaris ZFS
Also check out new SSD storage such as Pure Storage and EMC XtremIO
37. Fuel not equal car
Challenges
1. Technical
2. Bureaucracy
38. 1. Bureaucracy
Developer asks for DB → Manager approves → DBA (request system, set up DB) → System Admin (request storage, set up machine) → Storage Admin (allocate storage, take snapshot) → Developer gets access
39. Why are hand offs so expensive?
1 hour
1 day
9 days
1. Bureaucracy
43. Install Delphix on Intel hardware
© 2015 Delphix. All Rights Reserved. Private & Confidential.
• Data
• Binaries
• Application Stacks
• EBS
• SAP
• Flat files
44. Allocate Any Storage to Delphix
Allocate storage: any type
Pure Storage + Delphix: better performance for 1/10 the cost
45. One time backup of source database
Data is compressed, typically to 1/3 size
Production: 3 TB → 1 TB
46. Incremental forever change collection
Two week time flow
Production
47. Clones: Fast, Free, Full
Production
Two week time flow
NFS
49. Before Virtual Data
[Diagram: Production, Dev/QA/UAT, Reporting, and Backup each have their own Instance on top of a separate Database and File system copy: the "triple data tax"]
50. With Virtual Data
[Diagram: the Production, Dev & QA, Reporting, and Backup Instances are all drawn against a single Database and File system]
59. QA : Virtual Data
• Fast
• Parallel
• A/B testing
60. Virtual Data : find bugs fast
[Chart: # of bugs found across Dev, QA, and UAT, and Cost To Correct (0 to 70) against Delay in Fixing the bug (1 to 7)]
61. Virtual Data : Parallel
[Diagram: Prod Instance feeding the DVA (Production Time Flow), which serves Dev and QA]
• Fast
• Full Size
• Run Parallel QA
• Lots of environments for projects like ERP Upgrades
63. Virtual Data : Fast Refresh
• Fast
• Full
• Fresh
• Efficient
[Diagram: 20 MIN TEST runs shown against the earlier row of 8 Hrs refreshes]
70. 9 TB database, 1 TB change per day : 30 days
[Chart: Storage Required (TB), 0 to 70, against Days (original, week1 to week4), Oracle vs Delphix]
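The comparison on this slide can be approximated with simple arithmetic. The sketch below is only a rough model and not the calculation behind the chart: it assumes the physical approach keeps one full copy per week plus daily incrementals, and that the virtual approach stores one initial copy plus daily changes, all compressed at the 3:1 ratio quoted on slide 45. The function names and the weekly-full policy are assumptions made for illustration.

```python
def physical_storage_tb(db_tb, change_tb_per_day, days, full_every_days=7):
    """Assumed policy: one full copy every `full_every_days` days plus daily incrementals."""
    fulls = -(-days // full_every_days)  # ceiling division
    return fulls * db_tb + days * change_tb_per_day


def virtual_storage_tb(db_tb, change_tb_per_day, days, compression_ratio=3.0):
    """Assumed policy: one initial copy plus daily changes, all stored compressed."""
    return (db_tb + days * change_tb_per_day) / compression_ratio


# Figures from the slide title: 9 TB database, 1 TB of change per day, 30 days.
print(physical_storage_tb(9, 1, 30))  # 75
print(virtual_storage_tb(9, 1, 30))   # 13.0
```

Under these assumptions the physical approach needs several times the storage of the virtual one; the exact gap depends on the backup policy and compression ratio actually in use.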
71. RPO & RTO
• RPO
  - Any time in last 30 days
  - Down to the second
• RTO
  - Minutes
  - Push button
[Chart: Delphix, 0 to 15, against original and week1 to week4]
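Recovering to any second in a retention window is commonly done by starting from the latest snapshot at or before the target time and replaying logged changes forward. The sketch below shows that generic selection step; it is not Delphix's implementation, and the function name, the 30-day default, and the daily snapshot schedule are assumptions.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pick_restore_point(snapshots, target, retention_days=30, now=None):
    """Return (base_snapshot, replay_span) for a restore to `target`.

    `snapshots` must be a sorted list of datetimes. `replay_span` is how much
    logged change has to be applied on top of the base snapshot.
    """
    now = now or datetime.now()
    if target > now or target < now - timedelta(days=retention_days):
        raise ValueError("target is outside the retention window")
    idx = bisect_right(snapshots, target)
    if idx == 0:
        raise ValueError("no snapshot at or before target")
    base = snapshots[idx - 1]
    return base, target - base


# Hypothetical schedule: one snapshot per day at midnight.
snaps = [datetime(2015, 4, 1) + timedelta(days=i) for i in range(30)]
base, replay = pick_restore_point(
    snaps, datetime(2015, 4, 15, 13, 45, 10), now=datetime(2015, 4, 30, 12)
)
print(base, replay)
```

The shorter the replay span, the closer the restore gets to the "minutes" RTO on the slide, which is one reason to snapshot frequently.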
79. Virtual Data: Audit
4/30/2015
[Diagram: Prod Instance feeding the DVA (Production Time Flow), which keeps a Live Archive]
Live Archive data for years
• Archive EBS R11 before upgrade to R12
• Sarbanes-Oxley
• Dodd-Frank
• Financial Stress tests
90. Use Case Summary
1. Development & QA
  - Dev throughput increase by 2x
2. Production Support
  - 30 days in size of source
3. Business Continuity
  - 24x7 ETL & federated cloning
91. Shift Left
ROI
• Time
• Reduced OpEx, CapEx
92. Virtual Data Quotes
• Projects "12 months to 6 months."
  - New York Life
• Insurance product "about 50 days ... to about 23 days"
  - Presbyterian Health
• "Can't imagine working without it"
  - State of California
93. Summary
• Problem: Data constraint
• Solution: Data Virtualization
Innovation
• Transformative
• Automation
• Self Service
111. Automation: Jenkins, Team City, Travis
Data Virtualization: Delphix, Open ZFS, Flocker
Configuration Management: Chef, Puppet, Ansible
Compute Virtualization: VMware, Vagrant, Docker, AWS, OpenStack
112. DevOps : Automation + Culture
Jenkins, Team City, Travis
OpenStack, Vagrant, Docker
Chef, Puppet, Ansible
Delphix
113. Snapshot 1 - full backup
Jonathan Lewis © 2013
Virtual DB
a b c d e f g h i
114. Snapshot 2 - incremental
Jonathan Lewis © 2013
b' c'
a b c d e f g h i
115. Snapshot 2 - apply
Jonathan Lewis © 2013
a b c d e f g h i b' c'
116. Snapshot 1 - drop
Jonathan Lewis © 2013
b' c' a d e f g h i
117. Creating a vDB
Jonathan Lewis © 2013
b' c' a d e f g h i
My vDB (filesystem)
Your vDB (filesystem)
b' c' a d e f g h i
118. Modify a vDB
Jonathan Lewis © 2013
b' c' a d e f g h i
My vDB (filesystem)
Your vDB (filesystem)
i'
b' c' a d e f g h i
b' c' a d e f g h i
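The block diagrams on slides 113 to 118 describe copy-on-write sharing: every vDB reads the same snapshot blocks and stores only the blocks it has changed. The sketch below is a minimal model of that idea, not Delphix or ZFS code. The class names are hypothetical, and which vDB writes block i' is an arbitrary choice made for the example.

```python
class Snapshot:
    """Immutable set of blocks shared by every clone."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, key):
        return self._blocks[key]


class ThinClone:
    """A vDB view: shared snapshot blocks plus privately modified blocks."""

    def __init__(self, base):
        self._base = base
        self._own = {}  # only the blocks this clone has written

    def read(self, key):
        if key in self._own:
            return self._own[key]
        return self._base.read(key)

    def write(self, key, value):
        # Copy-on-write: the shared snapshot is never touched.
        self._own[key] = value

    def private_block_count(self):
        return len(self._own)


# Snapshot after the incremental is applied: b and c replaced by b' and c'.
blocks = {k: k for k in "abcdefghi"}
blocks.update({"b": "b'", "c": "c'"})
base = Snapshot(blocks)

mine = ThinClone(base)
yours = ThinClone(base)
mine.write("i", "i'")  # only this vDB sees the change

print(mine.read("i"), yours.read("i"))  # i' i
print(mine.private_block_count(), yours.private_block_count())  # 1 0
```

Creating a clone copies no blocks at all, which is why provisioning is fast and why extra clones cost only the storage of their own changes.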
119. What is DevOps?
• Not Tools (required)
• Not a Process (not standardized yet)
• Not Culture (critical)
DevOps is a Goal
120. DevOps Goal :
Fast flow of features from development to IT operations to the customers
- Gene Kim
128. "A database refresh in 15 minutes? That is mind blowing! Delphix nailed it for us."
- Matt Lawrence, Sr Director, Wind River (Intel)
"Took 3 weeks to build a dev env; now with Delphix it takes less than a day, and the db part is less than 15 minutes."
- Marty Boos, Stubhub (Ebay)
"Delphix goes beyond storage. Delphix is so much more than we thought it was."
- Michael Brow, State of Colorado
129. "Worth investing in this product: the technology is strong and the value prop is high."
- Deloitte
"I'm convinced about Delphix's technology. Delphix can really increase the quality of Dev / QA."
- Oaktable Member
"Delphix allows us to move fast and set up database copies in seconds."
"Delphix is powerful and allowed us to scale from 2 projects to 11."
"We need Delphix to scale our agile environment."
- Tim Campos, CIO, Facebook
Editor's notes
Talking mainly about Delphix
What IT tasks have the most impact on company performance
<div>Icon made by <a href="http://www.freepik.com" title="Freepik">Freepik</a> from <a href="http://www.flaticon.com" title="Flaticon">www.flaticon.com</a> is licensed under <a href="http://creativecommons.org/licenses/by/3.0/" title="Creative Commons BY 3.0">CC BY 3.0</a></div>
If you look at what's really impeding flow from development to operations to the customer, it's typically IT operations.
Operations can never deliver environments upon demand. You have to wait months or quarters to get a test environment. When that happens, terrible things happen. People actually hoard environments. They invite people to their teams because they know they have a reputation for having a cluster of test environments, so people end up testing on environments that are years old, which doesn't actually achieve the goal.
One of the most powerful things that organizations can do is to enable development and testing to get the environments they need when they need them.
One of the best predictors of DevOps performance is that IT Operations can make environments available on demand to Development and Test, so that they can build and test the application in an environment that is synchronized with Production.
Eliyahu Goldratt
IT bottlenecks
Setting Priorities
Company Goals
Defining Metrics
Fast Iterations
IT version of "The Goal" by E. Goldratt
We know from our experience that there are some $1B+ Data center consolidation price tags. Taking even 30% of the cost out of that, and cutting the timeline, is a strong and powerful way to improve margin.
What about really big problems like consolidating data center real estate, or moving to the cloud?
If you can non-disruptively collect the data, and easily and repeatedly present it in the target data center, you take huge chunks out of these migration timelines. Moreover, with data being so easy to move on demand, you neutralize the hordes of users who insist that there isn't enough time to do this, or it's too hard, or too risky.
Annual time spent copying databases can run to 1000s of hours for DBAs alone, not including all the other personnel required to supply the necessary infrastructure.
Internet vs browser
Automate or die: the revolution will be automated
The worst enemy of companies today is thinking that they have the best processes that exist, that their IT organizations are using the latest and greatest technology, and that nothing better exists in the field. This mentality will be the undoing of many companies.
http://www.kylehailey.com/automate-or-die-the-revolution-will-be-automated/
Data IS the constraint
Business skeptics are saying to themselves that data processes are just a rounding error in most of their project timelines, and that they are sure their IT has developed processes to fix that. That's the fundamental mistake. The very large and often hidden data tax lies in all the ways that we've optimized our software, data protection, and decision systems around the expectation that data is simply not virtual. The belief that there is no agility problem is part of the problem.
http://www.kylehailey.com/data-is-the-constraint/
Due to the constraints of building clone copy database environments, one ends up in the "culture of no",
where developers stop asking for a copy of a production database because the answer is "no".
If the developers need to debug an anomaly seen on production, or if they need to write a custom module which requires a copy of production, they know not to even ask and just give up.
Everyone Standup
Sit down if your QA data sets are less than a
week old
Month old
6 months
Year
2 years
How long does it take a developer to get a copy of a database
Time: how long to get or make a DB copy?
Dev?
QA?
DBA?
Old: How old is data ?
BI ,DW
QA ,Dev
Storage : How much storage used?
Analysts: batch job windows, lock out periods?
Audits : can you support?
Fastest query is the query not run
In the physical database world, 3 clones take up 3x the storage.
In the virtual world 3 clones take up 1/3 the storage thanks to block sharing and compression
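The two storage claims above follow from simple arithmetic. The sketch below works them through for a 3 TB source compressed to 1 TB (the figures from slide 45). The function names are hypothetical, and the model assumes clones share all unchanged blocks and that per-clone changes compress at the same ratio.

```python
def physical_clones_tb(source_tb, clones):
    """Every physical clone is a full copy of the source."""
    return source_tb * clones


def virtual_clones_tb(source_tb, clones, compression_ratio=3.0,
                      changed_tb_per_clone=0.0):
    """One compressed shared copy plus each clone's own compressed changes."""
    shared = source_tb / compression_ratio
    private = clones * changed_tb_per_clone / compression_ratio
    return shared + private


print(physical_clones_tb(3, 3))  # 9 TB: 3x the source
print(virtual_clones_tb(3, 3))   # 1.0 TB: 1/3 of the source
```

Passing a non-zero `changed_tb_per_clone` shows how the virtual footprint grows only with what each clone modifies, not with the number of clones.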
Not sure if you've run into this, but I have personally experienced the following.
When I was talking to one group at Ebay, the developers in that development group shared a single copy of the production database.
What this sharing of a single copy of production meant is that whenever a developer wanted to modify that database, they had to submit their changes to code review, and that code review took 1 to 2 weeks.
I don't know about you, but that kind of delay would stifle my motivation, and I have direct experience with the kind of disgruntlement it can cause.
When I was last a DBA, all schema changes went through me.
It took me about half a day to process schema changes. That delay was too much, so the developers unilaterally decided to go to an EAV (entity attribute value) schema,
which meant that developers could add new fields without consulting me and without stepping on each other's feet.
It also meant that the SQL code was unreadable and performance was atrocious.
Besides creating developer frustration, sharing a database also makes refreshing the data difficult, as it takes a while to refresh the full copy,
and it takes even longer to coordinate a time when everyone stops using the copy so the refresh can be made.
All this means that the copy rarely gets refreshed and the data gets old and unreliable.
For example Stubhub went from 5 copies of production in development to 120
Giving each developer their own copy
To circumvent the problems of sharing a single copy of production
Many shops we talk to create subsets.
One company we talked to spends 50% of its time copying databases.
They have to subset because there is not enough storage.
The subsetting process constantly needs fixing and modification.
Now what happens when developers use subsets?
Stubhub estimated a 20% reduction in bugs that made it to production
Development asks for a database and it takes days or weeks.
KLA-Tencor
Stateado
Slow downs mean bottlenecks
We talked to Presbyterian Healthcare
And they told us that they spend 96% of their QA cycle time building the QA environment
And only 4% actually running the QA suite
This happens for every QA suite
meaning
For every dollar spent on QA there was only 4 cents of actual QA value
And that 96% cost is infrastructure time and overhead
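The 96% / 4% split is what the refresh and test times on slide 20 produce: a 20 minute test behind an 8 hour refresh. The sketch below computes that fraction. The function name is hypothetical, and the 15 minute refresh used for the comparison is an assumption borrowed from the customer quote on slide 128, not a measured figure.

```python
def useful_fraction(test_minutes, refresh_minutes):
    """Share of a QA cycle spent actually running the test suite."""
    return test_minutes / (test_minutes + refresh_minutes)


physical = useful_fraction(20, 8 * 60)  # 20 min test after an 8 hour refresh
virtual = useful_fraction(20, 15)       # assumed 15 minute refresh

print(f"{physical:.0%}")  # 4%
print(f"{virtual:.0%}")   # 57%
```

Since the test time is unchanged in both cases, the refresh time is the only thing driving the ratio.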
Physically independent but logically correlated
Cloning multiple source databases at the same time can be a daunting task
One example with our customers is Informatica
Who had a project to integrate 6 databases into one central database
The time of the project was estimated at 12 months
With much of that coming from orchestrating
copies of the 6 databases at the same point in time
Like herding cats
Walmart.com
Informatica had a 12 month project to integrate 6 databases.
After installing Delphix they did it in 6 months.
I delivered this early
I generated more revenue
I freed up money and put it into innovation
won an award with Ventana Research for this project
How big is the data tax? One way we can measure it is by looking at the improvements in project timelines at companies that have eliminated this data tax by implementing a data virtualization appliance (DVA) and creating a virtual data platform, also called an agile data platform (ADP). Virtual data is data that is delivered to the exact spot it's needed, just in time, and with much less time, cost, and effort. By comparing productivity rates after implementing an ADP with those before, we can get an idea of the price of the data tax without an ADP. IT experts building mission critical systems for Fortune 500 companies have seen real project returns averaging 20-50% productivity increases after having implemented an ADP. That's a big data tax to pay without an ADP. The data tax is real, and once you understand how real it is, you realize how many of your key business decisions and strategies are affected by the agility of the data in your applications.
Took us 50 days to develop an insurance product ... now we can get a product to the customer in 23 days with Delphix
Moving the data IS the big gorilla. Eliminating the data tax is crucial to the success of your company. And, if huge databases can be ready at target data centers in minutes, the rest of the excuses are flimsy.
Virtual data (virtualized data) uses a small footprint. A truly virtual data platform can deliver full size datasets cheaper than subsets. A truly virtual data platform can move the time or the location pointer on its data very rapidly, and can store any version that's needed in a library at an unbelievably low cost. And a truly virtual data platform can massively improve app quality by making it reliable and dead simple to return to a common baseline for one or many databases in a very short amount of time. Applications delivered with agile data can afford a lot more full size virtual copies, eliminating the wait time and extra work caused by sharing, as well as side effects. With the cost of data falling so dramatically, businesses can radically increase their utilization of existing hardware and storage, delivering much more rapidly without any additional cost. An agile data platform presents data so rapidly and reliably that the data becomes commoditized, and servers that sit idle because it would just take too long to rebuild them can now switch roles on demand.
One Last Thing
http://www.dadbm.com/wp-content/uploads/2013/01/12c_pluggable_database_vs_separate_database.png