This white paper discusses how software-defined storage enables organizations to protect and grow infrastructure investments to meet new massive scale data, cloud-scale workloads, and storage-as-a-service requirements through a simple, extensible, and open storage platform.
White Paper: Realizing the Benefits of Software-Defined Storage
REALIZING THE BENEFITS OF
SOFTWARE-DEFINED STORAGE
AN EMC PERSPECTIVE
EMC WHITE PAPER
ABSTRACT
This white paper describes how data storage needs to become more software-
defined to address the expectations of an increasingly mobile user community.
Data has cloud properties now and users demand data to be available at
any time, anywhere. IT must meet this demand or risk losing users to public
clouds. IT can realize the benefits of software-defined storage with an
architecture that abstracts physical storage and centrally manages it in the
control plane while leveraging the capabilities in existing storage infrastructure
investments in the data plane.
September 2013
TABLE OF CONTENTS
ABSTRACT
TABLE OF CONTENTS
REALIZING THE BENEFITS OF SOFTWARE-DEFINED STORAGE
  Cloud Transforms IT
  Application-Driven Needs
  The Software-Defined Data Center
  Data-Centric World
  The Future of Storage
  Applying Software-Defined Storage
    Business Benefits
CONCLUSION
CONTACT US
REALIZING THE BENEFITS OF SOFTWARE-DEFINED STORAGE
An EMC perspective on software-defined storage
The surest path to long-term success is the ability to quickly and accurately adapt to changing circumstances. Within the context
of the data center, successful IT departments must quickly adapt to new technologies to improve the organization’s ability to be
more agile and efficient in responding to marketplace conditions, including both internal and external customer needs. In recent
years, these new technologies have been anchored by a common theme of separating or abstracting data presentation and
manipulation from the constraints of physical devices. This trend is evident in the proliferation of server virtualization in data
centers and the increasing popularity of public clouds as an alternative to central IT departments for fulfilling computing needs.
Not surprisingly, more and more IT executives are looking to extend the concept of virtualization beyond servers to networking,
storage, and security to realize the benefits of virtualization throughout the data center. Additionally, these IT executives are
adopting different cloud models to deliver IT services internally.
Cloud Transforms IT
The Internet and cloud technologies have spurred the evolution and transformation of IT. Provisioning IT resources, which once
took days, weeks, or months, is now expected to happen on demand in minutes or seconds. The process of creating a
service request through an IT help desk has been replaced with a user self-service model. Cloud has also introduced
pay-as-you-go metered service in place of a flat, fixed operating expense spread across departments in an organization.
Similarly, IT-issued equipment has increasingly given way to users often supplying their own devices, in what has been termed
bring-your-own-device (BYOD). User devices, especially tablets and smart phones, employ new application types that demand
instant access to data from anywhere.
Application-Driven Needs
The data center has always been application and business process driven. Looking at the evolution of the data center, batch
processing applications drove mainframe adoption, while the advent of online transaction processing (OLTP) applications led to
client/server computing and shared storage. Applications were designed and optimized for particular compute, networking, and
storage resources that continued to grow more powerful during this evolution. Storage, for example, gained intelligence and
specialization with value-add features such as local data protection, point-in-time snapshots, synchronous and asynchronous
replication, online backups, and more. As long as storage IOPS and bandwidth limited application scale, growing the intelligence
and value-add capabilities of the storage substrate delivered additional agility and flexibility within the data center. While this
approach met current application requirements for a long time, it resulted in silos of resources with an abundance of
unaccounted and under-utilized capacity because storage capabilities grew to the point where a single application could no
longer fully utilize the array. Additional scale and capabilities within the storage subsystem no longer delivered the value within
the data center that they once did.
SILOS OF RESOURCES
With the Web, the scalability challenge shifted from moving data from disk to an application at a single location to moving data
simultaneously to thousands or tens of thousands of remote applications. The challenge demanded the adoption of compute
anywhere and of object storage. If you consider the Web as a global network, compute anywhere means user access from any
point to data stored anywhere on the Internet. The Internet and, more recently, the cloud contributed to the growth of data
objects (e.g., audio, video, and online documents) accessed by unique identifiers or URLs, and thus to the need for object storage.
Growth in unstructured, or object, data now accounts for about 90%¹ of overall data growth.
The change from batch to online applications was driven by users in the enterprise. User expectations for cloud models, and
consequently for IT, however, are being influenced by personal experience with the Internet, public cloud services, and social
media, where the expectations for scale and performance of IT infrastructure are vastly different. For instance, while the
bandwidth and IOPS needed for an individual email user in a cloud hardly tax a modern system, the aggregate demands of
approximately 2 billion worldwide email users are staggering, and could not be satisfied with an IT infrastructure tuned to
maximize performance in a silo for a single, local application.
Because of these changes in technology and user demands, IT must shift its thinking about data management and delivery to
satisfy users and market demands or risk users going to easy-to-consume public clouds for timely IT services. IT needs to
evolve to the software-defined data center.
The Software-Defined Data Center
The software-defined data center is intelligent software that abstracts hardware resources, pools them into aggregated capacity,
and automates distributing them as needed to applications. It consolidates all systems into a single platform built on an x86
architecture supporting both industry-standard protocols for stability and integration with the network backbone and open APIs
for application and management tool integration and portability.
The software-defined data center envisions a topology that abstracts, pools, and automates compute, network, and storage over
multiple data centers, including those owned by enterprises and service providers. It presents a variety of compute, network,
and storage services on-demand to give developers and application owners their own personal virtual data centers (VDCs), with
elasticity through a pay-as-you-go model to keep expenses in check.
Data-Centric World
The challenges in realizing the software-defined data center do not lie in applications, but with data. An estimated 50-70%
or more of servers in organizations are virtualized, and the cloud has sped the evolution of APIs, resulting in a common
integration point based on the Representational State Transfer (REST) design model. This enables rapid application
development that is agnostic to, and portable across, the underlying infrastructure.
The challenge lies in dealing with exploding data growth, with total data estimated to reach 40 zettabytes (ZB) by 2020, and with
hefty investments in legacy storage infrastructure characterized by proprietary APIs and proprietary operating systems. This lack of
standardization is a leading contributor to estimates that only about 5-15% of storage is virtualized, compared to compute
(50-70%) and networking (estimated at 20% or more).
Data is inherently heavy and difficult to move and manage, given the lack of standardization and the absence of a single access
point for heterogeneous storage. Working with different data types (block, file, and object) requires using specialty storage
devices, and operations on this data can require moving it across costly networks.
The Future of Storage
From an EMC perspective, the future of storage acknowledges the data-centric world we inhabit. Data still resides in
heterogeneous storage systems in multiple data centers, and customers still use storage systems that best meet their business
application needs for performance, cost, compliance, or data protection.
Data exhibits cloud properties now, however: always available, accessible from anywhere at any time. In a software-defined
model, storage infrastructure is abstracted, separating the physical devices in the data plane from the logical layer in the control
plane, similar to compute and network. This abstraction provides resiliency, massive scale, and geo-distribution. It also preserves
the properties of the underlying arrays and protects customers’ storage infrastructure investments.
¹ Source: IDC’s Digital Universe Study, sponsored by EMC, December 2013
THE FUTURE OF STORAGE IN A DATA-CENTRIC WORLD
Through this software-defined model, all storage allocations to various applications are made from a shared virtual pool. Data
appears to reside logically in one virtual pool even though it may be physically distributed across locations.
The entire virtual storage infrastructure is accessed via a single control point and managed through automated policies. New
capabilities are added to the underlying arrays in software—once for all arrays.
WHAT IS SOFTWARE-DEFINED STORAGE?
Software-defined storage transforms existing heterogeneous physical storage into a simple, extensible, and
open virtual storage platform that preserves the capabilities of underlying physical storage arrays. It abstracts
physical storage, pools aggregated capacity, and automates and centralizes management across
heterogeneous storage, including commodity storage, in a scale-out architecture. It includes storage services
for provisioning, orchestration, change management, monitoring, reporting, and quality of service, and the
ability to do new operations on data in-place via value-add data services.
Software-defined storage provides a central point of access to all management functions, translating requests
into specific calls to the underlying storage, while offering storage services to multiple users or tenants, with
different access roles through a single common portal. This approach standardizes operations, reduces
complexity, and improves an organization's efficiency and agility in delivering storage when and where needed.
Software-defined storage is an approach that enables companies to address the future of storage without having to replace
existing infrastructure. It enables new capabilities in the enterprise for data movement, management, and service delivery
using cloud-based models.
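
The central control point described above, which translates requests into calls on the underlying arrays, can be illustrated with a minimal Python sketch. The class names (ArrayDriver, InMemoryArray, VirtualPool) and the "most free space" placement rule are assumptions made for illustration only; they do not represent any EMC product or API.

```python
from abc import ABC, abstractmethod
from typing import List


class ArrayDriver(ABC):
    """Hypothetical adapter that hides one vendor's native interface."""

    @abstractmethod
    def free_tb(self) -> float:
        ...

    @abstractmethod
    def create_volume(self, size_tb: float) -> str:
        ...


class InMemoryArray(ArrayDriver):
    """Stand-in for a physical array; a real driver would call the vendor's API."""

    def __init__(self, name: str, capacity_tb: float) -> None:
        self.name = name
        self._free = capacity_tb
        self._count = 0

    def free_tb(self) -> float:
        return self._free

    def create_volume(self, size_tb: float) -> str:
        if size_tb > self._free:
            raise ValueError(f"{self.name}: insufficient capacity")
        self._free -= size_tb
        self._count += 1
        return f"{self.name}/vol{self._count}"


class VirtualPool:
    """Single control point that presents many arrays as one pool."""

    def __init__(self, drivers: List[ArrayDriver]) -> None:
        self._drivers = drivers

    def total_free_tb(self) -> float:
        # Aggregated capacity across every array in the pool.
        return sum(d.free_tb() for d in self._drivers)

    def allocate(self, size_tb: float) -> str:
        # Assumed placement rule: use the array with the most free space
        # among those that can hold the whole volume.
        candidates = [d for d in self._drivers if d.free_tb() >= size_tb]
        if not candidates:
            raise ValueError("no array in the pool can satisfy the request")
        return max(candidates, key=lambda d: d.free_tb()).create_volume(size_tb)
```

A consumer of VirtualPool only calls allocate and never learns which array served the request, which mirrors the separation of control plane and data plane described in the text.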
SOFTWARE-DEFINED STORAGE ARCHITECTURE
Software-defined storage in the data-centric world provides:
• Automation with Policy-based Storage Provisioning: Many of the challenges inherent to data center storage systems
arise from the many manual steps required to provision and deliver storage. Software-defined storage automates the
provisioning process, providing a single method to abstract disparate physical storage resources into a single virtual
storage pool that is divided up and delivered based on pre-defined policies aligned to the service-level agreements (SLAs) of
different users.
• Programmability through a Single, Central Control Point: Multiple storage systems have long had proprietary APIs and
proprietary operating systems adding complexity to data centers employing multi-vendor, tiered storage strategies.
Software-defined storage abstracts physical storage in the data path or data plane into a single logical layer making a single
access point possible, with a REST-based API in the control path or control plane. This approach enables customers to both
centralize management and extend the capabilities of the virtual storage platform through in-house development or via
integration with third-party solutions interfacing with one common API.
• Centralized Management across Heterogeneous Storage: Multiple storage systems are also characterized by multiple
management tools specific to vendors or storage devices. With software-defined storage, a single control point simplifies
management and provides a common user experience across supported storage devices. Monitoring, metering, reporting,
workflow orchestration, change management, and cataloging can be centralized, reducing the need for multiple management
tools.
• Scale-out Architecture for Unprecedented Growth: Physical storage is constrained by disk capacity, number of bays,
and available footprint in the traditional data center. By abstracting physical storage into a virtual pool, software-defined
storage aggregates existing and new arrays into limitless virtual capacity available to users through a single portal regardless
of where the data resides and the users’ physical location.
• Extensible Data Services for New Capabilities: Different data types usually require different, dedicated storage such as
block-based storage for OLTP applications, file-based storage for file sharing applications, and object storage for unstructured
data. These dedicated storage methods can require numerous data moves, such as between file-based and object-based storage,
when manipulating objects or applying data analytics to objects. By separating data access and management in the control plane
from the physical storage in the data plane, new operations are possible on data in-place. Remember: data is heavy. Being
able to do more with data without moving it among different storage devices saves time and money (e.g. hardware, staff,
network bandwidth). It also makes new technologies such as big data analytics for business intelligence more accessible to a
larger audience who might not otherwise afford dedicated storage and data scientists. This extensibility is a key tenet to
being truly software-defined.
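
To make the idea of one common REST-based API in the control plane more concrete, the following Python sketch builds, but does not send, a provisioning request. The controller URL, the resource path, and the JSON field names are hypothetical and chosen only for illustration; an actual software-defined storage platform defines its own endpoints and schema.

```python
import json
import urllib.request

# Hypothetical controller address and resource path, for illustration only.
CONTROLLER_URL = "https://sds-controller.example.com/api/volumes"


def build_volume_request(name: str, size_tb: int, service_level: str) -> urllib.request.Request:
    """Build (but do not send) a POST asking the control plane for a volume."""
    body = json.dumps(
        {"name": name, "size_tb": size_tb, "service_level": service_level}
    ).encode("utf-8")
    return urllib.request.Request(
        CONTROLLER_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# The caller names what it needs, not which array should provide it.
req = build_volume_request("fin-edu-tutorials", 500, "high-availability")
print(req.get_method(), req.full_url)
```

Because the request says nothing about vendors or devices, in-house tools and third-party integrations can share the same call regardless of what sits in the data plane.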
Applying Software-Defined Storage
How does software-defined storage apply in the real world?
Software-defined storage sounds good in theory but the power lies in putting it into action. Let’s explore how a financial services
firm would use software-defined storage compared to traditional storage to continuously develop and deploy a Financial-Services
Education application that helps financial analysts and planners keep up to date with changing government regulations and new
financial services offerings. With traditional storage, administrators would need to execute several steps using a number of
different management tools to allocate 500 terabytes (TB) of storage with high availability (HA) in each of the firm's sites. They would
also need to know up-front the storage capacity that will be accessed via block storage to store low-latency transactional data,
and via file storage for high-throughput processing of video and audio tutorials. But, in reality, it is extremely difficult to predict
the capacity required for each data type using traditional methods, and allocating new storage to address under-provisioned
applications is a lengthy and time-consuming process. As a result, storage capacity is generally over-provisioned in anticipation
of maximum future needs, leading to poor resource utilization.
In the context of software-defined storage, standardized operations offer the capability to allocate 500 terabytes of HA storage,
across different sites, with specific performance characteristics, without the need to execute each one of the individual steps, be
exposed to the details of the underlying environment, or even be aware of where data resides. With software-defined storage,
storage administrators define the policies governing who can access which types of data services and in which locations. The
system then allows storage consumers to express their requirements at a high level of abstraction, such as simply requesting
the allocation of 500 TB of HA storage. Then, the system validates the request against policies, identifies appropriate and
available infrastructure, and the services are provisioned and delivered automatically. All the complexity associated with the
configuration disappears. Adding more capacity to an existing environment is now a much simpler operation, and
under-provisioning errors can be addressed far more seamlessly and quickly. Capacity can therefore be estimated for normal use
cases rather than for maximum future needs.
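
The workflow in the preceding paragraph (validate the request against policy, find available infrastructure, then provision) can be sketched in a few lines of Python. The Policy and Request types, the role and service names, and the site names are all hypothetical, and the per-site capacity dictionary stands in for a real inventory of storage pools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Administrator-defined rule: which role may use which service, and where."""
    role: str
    service: str        # e.g. "block-ha" (hypothetical service name)
    sites: frozenset
    max_tb: int


@dataclass(frozen=True)
class Request:
    """What a storage consumer asks for, at a high level of abstraction."""
    role: str
    service: str
    site: str
    size_tb: int


def provision(request, policies, free_tb_by_site):
    """Validate a request against policy, then allocate from the site's pool."""
    allowed = any(
        p.role == request.role
        and p.service == request.service
        and request.site in p.sites
        and request.size_tb <= p.max_tb
        for p in policies
    )
    if not allowed:
        raise PermissionError("request violates storage policy")
    if free_tb_by_site.get(request.site, 0) < request.size_tb:
        raise RuntimeError("no available capacity at requested site")
    # Deduct the capacity and return an identifier for the new allocation.
    free_tb_by_site[request.site] -= request.size_tb
    return f"{request.service}@{request.site}:{request.size_tb}TB"
```

In this sketch the consumer supplies only a role, a service type, a site, and a size; everything about arrays and configuration steps stays behind the provision function.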
Additionally, through abstracted storage services, end users (in this case, the financial analysts and planners) get a
consistent experience based on their SLA, regardless of physical location. Standardization of operations has enabled these data
centers to consolidate and simplify operations and reduce complexity to deliver a consistent user experience.
Taking this concept a step further, these data centers could integrate policy-driven software-defined storage with other layers in
the data center including VMware® and OpenStack® cloud stacks. Then, VMware vCenter or other virtual administrators could
set policies that enable users to request and consume storage with compute. Add software-defined networking (SDN) to the mix
and these data centers could deliver personal VDCs and realize the software-defined data center.
Business Benefits
There are many potential use cases for software-defined storage like the one cited above. To sum it up, the business benefits of
software-defined storage include:
• Faster Time to Value: In the software-defined data center, IT is a competitive differentiator enabling organizations to react
more expediently to opportunities. Data is the intellectual capital of organizations, and software-defined storage is the means to
leverage this data to competitive advantage. Software-defined storage minimizes repetitive, time-consuming provisioning
tasks to present data storage quickly to get applications and new services up and running. In this way, organizations are
more agile in responding to changing marketplace conditions.
• Better Return on IT Spend: With three-to-five year technology refresh cycles, storage purchases are usually over-
configured in anticipation of future storage needs. Storage is also usually underutilized with allocated and available capacity
frequently unknown. While software-defined storage does not remove the need to provision for a degree of anticipated
storage needs, it does mean that a single buffer can serve the entire data center rather than a single application silo. This
approach allows the right storage to be available and applied to application needs as necessary. Centralized management
also makes getting to a single composite view of the entire storage infrastructure possible for monitoring and reporting on all
storage resources and for doing predictive analytics to better plan and manage storage purchases.
• No Vendor Lock-in: With software-defined storage, data access and management occur in the control plane, separate from
the physical dependencies that once tied applications, servers, networking, storage, and security together into impenetrable
silos. Software-defined storage enables customers to shift physical infrastructure to match changing needs, add and displace
storage vendors, and take advantage of lower-cost commodity storage options.
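
The point made under Better Return on IT Spend, that a single buffer can serve the entire data center rather than one buffer per application silo, can be illustrated with simple arithmetic. The Python sketch below uses made-up capacity figures and assumes that only a limited number of applications reach their peak at the same time; both the numbers and that assumption are illustrative and do not come from the paper.

```python
def siloed_tb(apps):
    """Capacity needed when every silo is sized for its own peak."""
    return sum(peak for _, peak in apps)


def pooled_tb(apps, concurrent_peaks=1):
    """Capacity needed when one shared buffer covers the largest growth margins.

    Assumes at most `concurrent_peaks` applications hit their peak at once.
    """
    base = sum(current for current, _ in apps)
    margins = sorted((peak - current for current, peak in apps), reverse=True)
    return base + sum(margins[:concurrent_peaks])


# Hypothetical (current usage, anticipated peak) pairs in terabytes.
apps = [(100, 180), (200, 260), (50, 120)]

print(siloed_tb(apps))       # 560
print(pooled_tb(apps))       # 430
```

If every application peaked at once (concurrent_peaks equal to the number of applications), the pooled figure would equal the siloed one, so the saving depends entirely on how well the peaks are staggered.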