DevOps derives from development and operations, two groups that DBAs often have a foot in.
There is a strong focus on collaboration, built around methodologies, processes and practices.
The goal is to release more frequently, more successfully and with fewer bugs.
Talk about the future of the DBA with DevOps-
I presented on this very topic just last week at both Oracle and SQL Server events.
What does this have to do with a DBA? Is it our future? Is it something we all have to embrace or convert to?
My answer is no. Just as we still see COBOL and Fortran apps in need of support, traditional relational database support for on-prem isn’t going away anytime soon.
We all have enough work to keep us busy as traditional DBAs for a good decade or more. Those of you in the municipal and federal jobs are safe for a few more decades…
Being empowered by DevOps requires automation, and with that, tools. Tools can include scripting through CLIs as well as GUI interaction.
At the Agile 2008 conference, Andrew Clay Shafer and Patrick Debois discussed “Agile Infrastructure”.
The term DevOps was popularized through a series of “devopsdays” events, starting in 2009 in Belgium.
With the introduction of the cloud, the old pattern of a department buying its own server and getting a developer to build something it needs outside of IT is now on steroids.
They now just open a cloud account, with the idea that it’s our problem once it becomes mission critical.
Arrow Electronics recently claimed at a dinner that 30% of their business will come from audits of insecure, non-policy-compliant cloud initiatives that reached production through this exact practice.
So we review code, but how often do we check the tools that are being used…
On the Oracle side, we saw this all the time: the developers used Toad or other tools to develop, but the Oracle DBA would require SQL*Plus for the release, and it would fail due to proprietary comments in the scripts or assumed parameter setup at the command line.
We are the masters of automation, so we should be involved in tool selection to ensure the chosen tools cover a broad range of tiers in the IT environment.
How many of you use these tools? How many of you use these tools when executing to production?
Keep in mind that there are many terms used for the concepts on this slide.
I’ve chosen the most common ones; depending on the Agile or DevOps methodology you choose, the words may change, but the goal is the same.
Build automation is the process of automating the creation of a software build and its associated processes, including compiling source code into binary code, packaging that binary code, and running automated tests.
Continuous delivery (CD) is a software engineering approach in which teams produce software in short, repeatable cycles, releasing incremental updates to applications in production. A straightforward and repeatable deployment process is important for continuous delivery.
At the same time, there are a few tools in CD, like Jenkins, that have been very popular with the DBA masses.
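As a rough illustration, here’s a minimal sketch of a declarative Jenkins pipeline; build.sh, run_tests.sh and deploy_schema.sql are hypothetical stand-ins for your own build, test and release steps.

```groovy
// Minimal declarative Jenkinsfile sketch; all script names are placeholders.
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh './build.sh'        // compile and package the application
            }
        }
        stage('Test') {
            steps {
                sh './run_tests.sh'    // run the automated test suite
            }
        }
        stage('Release') {
            steps {
                // release the vetted database change script; the script
                // performs its own CONNECT, so no credentials live here
                sh 'sqlplus -s /nolog @release/deploy_schema.sql'
            }
        }
    }
}
```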
Ant is another Java-based build tool that’s part of the Apache open-source project.
It’s similar to Make, with build files written in XML.
A Groovy step can simply execute another script, which makes it valuable in environments that already have a number of mature scripts in place that should be reused in automation (see the sketch below).
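Something like this, where refresh_test_env.sh is a hypothetical stand-in for an existing, battle-tested script:

```groovy
// Wrap an existing shell script in Groovy so it can be reused in automation.
// refresh_test_env.sh is a hypothetical placeholder for your own script.
def proc = ['bash', './refresh_test_env.sh'].execute()
proc.waitForProcessOutput(System.out, System.err)   // stream its output live
if (proc.exitValue() != 0) {
    throw new RuntimeException('refresh_test_env.sh failed, aborting')
}
```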
This is our plugin. That’s how important we find these tools: we’ve built support for them into Delphix.
Configuration management (CM) is a systems engineering process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life.
A DBA’s desire for low risk and stability helps here, as we value routines that produce expected outcomes.
This is a simple Ansible call to copy a script from one directory to another, change its permissions and then execute it, all on a Linux machine (along the lines of the sketch below).
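A hedged reconstruction of what such a playbook can look like; the host group, paths and script name are hypothetical placeholders:

```yaml
# Minimal Ansible playbook sketch: copy a script, set permissions, run it.
- hosts: linux_db_servers
  tasks:
    - name: Copy the maintenance script into place
      copy:
        src: files/refresh_stats.sh
        dest: /opt/scripts/refresh_stats.sh
        mode: '0755'               # make it executable

    - name: Execute the script
      command: /opt/scripts/refresh_stats.sh
```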
Solutions for DevOps, Security/compliance, configuration management, cloud/container management and “infrastructure as code”
They have new products outside of their Enterprise offering,
like Discovery, Bolt, Pipelines and the Container Registry.
This is another area that introduces risk, but DBAs are less averse to this methodology, as it focuses on one feature, even if that feature spans multiple tiers.
If something goes wrong, it can mean a higher degree of coordination to back a change out or to correct a problem.
Release orchestration is the use of tools like XL Release, which manage software releases from the development stage through to the actual release itself.
It is better known for diagnostics, but it also allows admin tasks to be automated quickly: perform analysis of data and create automated scripts for reuse.
Most useful? SQL Job Editor, SQL User Clone (full clone) and SQL Server Configuration Compare.
Redgate DLM Automation not only automates your release; you can also create release scripts from the application.
We all know how important it is to track changes, but a repository can be used in a number of other valuable ways.
I’m going to add to this definition with Data version control.
This is where we move from DevOps into DataOps, and it’s both the evolution of DevOps and the place where the DBA becomes a focal point of DevOps.
OK, yours may not currently be the same. We need to talk about how you can become aligned with everyone else’s goals.
It doesn’t mean you have to give up your first database.
You can be part of the company’s goals and still protect the data (all of the data) and the database.
The concept was coined just a few years ago by Dave McCrory, a Senior VP of Platform Engineering. It began as an open discussion aimed at understanding how data impacts the way technology changes when connected with network, software and compute.
He discusses the basic premise that “the speed with which information can get from memory (where data is stored) to computing (where data is acted upon) is the limiting factor in computing speed”, known as the von Neumann bottleneck.
These are essential concepts that I believe all DBAs and developers should understand, as data gravity impacts all of us. It’s the reason for many enhancements to database, network and compute power. It’s the reason optimization specialists are in such demand. Other tasks, such as backup, monitoring and error handling, can be automated, but no matter how much logic we drive into programs, nothing beats true skill in optimization when it comes to eliminating data gravity issues. Less data, less weight: it’s as simple as that.
In computing, virtualization means creating a virtual version of a device or resource, such as a server, storage device, network or even a database. The framework divides the resource into one or more execution environments. For data, this can result in a golden copy or source that serves as a centralized location and removes duplicated data. For reads and writes, each copy keeps only the data unique to it, while duplicated data is stored once.
RMAN duplicates, cold backup restores, Data Pump and other archaic data transfer processes are time consuming.
By virtualizing, we remove the “weight” of the data. We know that 80% of the data won’t change between copies, so why do we need individual copies of it? Our source is then deduplicated and compressed to conserve more space.
How do we “rewind” data and code changes now?
Why should the DBA rewind changes made in dev and test?
Why should you be the one to do this in test?
Virtualization removes this.
The virtual databases are read/write, so even maintenance tasks, like DBCCs, can be offloaded to one.
There’s the ability to version control not just the metadata, but the user data!
I work with Delphix, so you would think I know our virtualization the best, but the truth is, I also know many other virtualization tools at a very detailed level.
The amount of information I know on Oracle virtualization tools is pretty insane, in fact.
Point out the engine and size after we’ve compressed and de-duplicated.
Note that each of the VDBs will take approximately 5-10 GB, vs. 1 TB, to offer a FULL read/write copy of the production system.
It will do so in just a matter of minutes.
Note that this can also be done for the application tier!
Each virtual database (VDB) will no longer require its own space (only the background and transaction log data unique to that user database, etc.). This is a considerable savings, but…
If we take this a step further by writing changes only to blocks that differ from the source, we’ll fit 10-20 copies of a database in about the same space that one database requires; the sketch below puts rough numbers on it.
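A back-of-the-envelope check, using the figures from a couple of slides back (a ~1 TB source and roughly 10 GB of unique blocks per VDB); this is a sketch of the arithmetic, not a benchmark:

```groovy
// Rough storage math from the slide's own figures; not a benchmark.
def sourceGb = 1024   // ~1 TB production source
def copies   = 20     // number of environments needed
def vdbGb    = 10     // unique changed blocks per VDB (upper end of 5-10 GB)

println "Full physical copies: ${copies * sourceGb} GB (~20 TB)"
println "Virtual copies:       ${copies * vdbGb} GB of changed blocks"
```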
Package software into standardized units for development, shipment and deployment. A container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings.
The next step is moving to data pods. Containers are a buzz area of technology right now. If we’re talking Docker or Kubernetes, we know this is the way of the future. Instead of having locked, unique environments, the ability to package them as one, in a lighter and more flexible unit makes incredible sense.
As a DBA, I rarely, if ever, just released code to the database. It was commonly to the database, the application and linked products.
The ability to package and manage as a Data Pod is an impressive enhancement to the Developer, tester and DBA.
The next step is the ability to migrate to the cloud or from one cloud to another. Right now, 60% of customers are using 2-5 clouds on average. The ability to move a Data Pod from one cloud to another is incredibly powerful.
Companies are spending increasing time not just migrating to the cloud, but migrating between clouds. If that were as simple as moving a data pod with a few changes for the new storage location (i.e. the cloud), it could save companies millions of dollars.
A data pod is a set of virtual data environments and controls, built and then delivered to users for self-service data consumption. It allows self-management without the need for DBAs to manage standard processing, automates rebuilds, and even removes the need for backout scripts when development, testing or promotion goes wrong.
We refer to a container as a template in our product.
Note that a data pod can be moved here or to the cloud…
DBA has to commandeer a database for patch testing.
This has to be performed for EACH environment: hundreds or thousands of databases!
Most are not synchronized with production, causing different outcomes when released to production.
Bugs occur in one environment but not another!
Over 80% of time is spent waiting for RDBMS (relational database) environments to be refreshed. Developers and testers are waiting for data to do their primary jobs.
This allows for faster and less costly migrations to the cloud, too.
So what is “data versioning”? It’s similar to version control at the code level, but it tracks changes to the data itself.
There’s a lot of interest in SQL Server temporal tables (although very few real-world use cases); see the sketch below.
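For anyone who hasn’t seen one, a minimal sketch of a system-versioned temporal table; the table and column names are hypothetical:

```sql
-- System-versioned temporal table: SQL Server keeps prior row versions
-- in the history table automatically. All names here are hypothetical.
CREATE TABLE dbo.Customer (
    CustomerID INT PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL,
    ValidFrom  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.CustomerHistory));

-- "Rewind" a read: query the table as it looked at a point in time.
SELECT * FROM dbo.Customer
FOR SYSTEM_TIME AS OF '2019-01-01T00:00:00';
```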
Some of these products focus on the DBA controlling the changes, as the DBA is most often the one having to address how to “rewind” or correct changes when they occur.
Jet Stream focuses on developers and testers, and although it can work at the database level alone, we more often build it with data pods (i.e. containers) that consist of the database, the application and any other tier that interacts with the database.
There’s significant benefit to doing it this way and more third party providers may begin to do this as well.
This is a cornerstone for developers and testers. As DBAs, we know the pain when a developer comes to us to flash back a database or, before that, to recover or logically recover (import or Data Pump) individual objects. What if the developer or tester could do this for themselves?
This may appear to be a traffic disaster of changes, but for developers with Agile experience, a “sprint” looks just like this. You have different sprints that are quick runs and merges where developers are working separately on code that must merge successfully at the correct intersection and be deployed.
Versioning with source control is displayed at the top, using Virtual images. You can see each iteration of the sprints.
In the middle section are the branches that occur during the development process. A virtual copy can be spun from another virtual copy, which makes it easier for developers to build on the work another developer has produced.
Stopping points and a release via a clone take simply minutes vs. hours or days.
This is the interface for Developers and testers- they can bookmark before important tasks or rewind to any point in the process. They can bookmark and branch for full development/testing needs.
An Agile Framework
The Scrum framework consists of:
A product owner creates a wish list, and the “sprint” begins
Sprint planning takes place and the backlog is created
The team sets up a schedule and begins daily scrum standups (commonly 5 minutes)
The scrum master keeps the team focused and collaborating, and keeps track of status
The product is released
The sprint ends with feedback and lessons learned
The next sprint begins
Considered a very “visual” development process, based on grocery store shelf stocking.
Uses standardized cues and refined processes
Goal to reduce waste and maximize value
Most often uses sticky notes and whiteboard to create a picture of the work to complete, what’s in process and what’s done.
Visualize Work
Limit Work in Process
Focus on Flow
Continuous Improvement
Code may come first in XP, but the tests must already exist so you know what a successful outcome will look like.
Code is written by pairs of programmers, allowing for better collaboration.
Believes in the power of doing, vs. extensive planning. Failure is expected.
Always build foundations that can be built on later.
Rarely specialize- everyone develops, tests, designs, etc.
Shades of Crystal- orange, yellow, etc.
Similar to Rapid Deploy, but it is often focused on a single tier. The developers work toward client-focused project goals, and the value must be visible.
It’s not about correcting or fixing, but about driving a feature that users demand and that creates revenue.
FDD also defines a collection of supporting roles, including:
Domain Manager
Release Manager
Language Guru
Build Engineer
Toolsmith
System Administrator
Tester
Deployer
Technical Writer
Methods provide a format or guide to work from; hybrid approaches often work best in practice.
Collaboration methods ensure that communication continues when team members return to their desks
Deployment tools help with documenting and lessons learned
Build tools help with automation and orchestration
Or does it shift the problem toward authentication and authorization?
Idera SQL Secure identifies who has access to on-prem and cloud environments.
Set strong security policies.
It presents security violations, analyzes user permissions, and lets you create security templates to reproduce similar database roles and privileges in the future.
This includes templates pre-built for PCI, HIPAA and FERPA, plus guidelines for STIG and CIS.
SQL Compliance Manager, meanwhile, works to audit sensitive data and stop potential threats by tracking access.
It also has templates similar to the ones in SQL Secure. Compliance Manager offers a lot more in reporting and dashboards, but has fewer features.
For a typical Fortune 1000 company, just a 10% increase in data accessibility will result in more than $65 million additional net income.
Leveraging data could increase revenue by as much as 60%.
There are larger data sources every day. Databases are at the center of this friction and the natural life of a database is growth.
There are two different definitions of data gravity:
The weight of data causes applications, access and services to be pulled toward the data.
The very weight of data is heavy, creating a gravitational pull that is difficult to escape when working with it.
By 2020, we’ll grow from today’s 4.4 zettabytes to an approximate, but staggering, 44 zettabytes, or 44 trillion gigabytes.
And by 2020, a third of that data will pass through the cloud.
Data gravity is the ability of bodies of data to attract applications, services and other data. ... IT expert Dave McCrory coined the term data gravity as an analogy to the way that, in accordance with the physical laws of gravity, objects with more mass attract those with less.
And yet we state that we won’t need DBAs? That data isn’t the center of challenge?
Per Forbes, by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.
More data has been created in the past two years than in the entire previous history of the human race.
That data has to be stored somewhere and there’s a large chance it’s going to be in a relational data store.
We can’t eliminate the majority of data
We can optimize the code and the applications, but data is still data, i.e. large.
It will continue to grow
The business is able to provision new environments or refresh existing ones in a matter of minutes.
Developers and testers who’ve worked with bookmarks and branching of their code changes can now do the same with database changes, rewinding and refreshing as they need without impacting the DBA’s day. This allows the DBA to do more with their time.
Having tools that include the database in the Agile development cycle is a pivotal change in how the DBA can be part of DevOps.