© 2016 SAP SE or an SAP affiliate company. All rights reserved.
CIO Guide to Using the SAP HANA®
Platform for Big Data
February 2016
Table of Contents
Executive Summary
Introduction
Challenges and Opportunities of Big Data
Big Data Reference Architecture
SAP HANA Platform for Handling Big Data
Big Data Scenarios and Data Flows
Big Data Use Cases
SAP HANA Platform: Open, Flexible, Integrated, Scalable, and More
Find Out More
DISCLAIMER
This document outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject
to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation
or to develop or release any functionality mentioned in this presentation. This presentation and SAP’s strategy and possible future developments
are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of
any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-
infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally
or grossly negligent.
See http://global.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.
Executive Summary
Big Data is often characterized by the three Vs:
volume, velocity, and variety. These characteristics
pose great challenges for conventional disk-based
relational databases. The different categories of
data require different storage capabilities and
performance, which involve different costs.
The SAP HANA platform offers several types
of data storage and processing engines. Online
transaction processing (OLTP) and online analyt-
ical processing (OLAP) applications can now
easily run in one system, on one database. In-
memory stores in SAP HANA are recommended
for high-value data (hot data) that must be
accessed and processed with extreme speed, for
data that is frequently changed, and when you
need the platform’s native features. Customers
typically use in-memory stores for (compressed)
data volumes up to several terabytes.1
The dynamic tiering option extends the SAP HANA
database with disk-based columnar tables, based
on SAP® IQ software technology. This option is
recommended for storing big sets of structured
data, when high OLAP performance and deep
integration into SAP HANA are important, and
when the processing features of Structured Query
Language (SQL) are sufficient. Dynamic tiering can be used,
for example, for lower-value data for which in-
memory performance is not required (warm data).
It can manage data volumes from several hundred
terabytes to petabytes.
Hadoop is suited for raw data that can grow infi-
nitely, for unstructured and semistructured data,
and when massive scale-out is required for pro-
cessing. With Hadoop you can flexibly scale out
with low initial cost. Hadoop is also suited for
data from business applications that is no longer
required (cold data).
The SAP HANA Vora™ engine is the recommended
SQL engine for high-performance analytics on
structured data in Hadoop. It enables optimized
access to data in SAP HANA from Hadoop or
Spark.
For all data, SAP HANA serves as the central
point for data access, data modeling, and
system administration. Due to its openness,
the SAP HANA platform can be extended with
non-SAP technology depending on the require-
ments. This flexibility makes the platform a sus-
tainable investment. By streamlining system
administration and software lifecycle manage-
ment, SAP HANA enables CIOs to simplify their
system landscape and significantly reduce cost
of ownership.
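The rules of thumb above — hot data in the in-memory store, warm structured data in dynamic tiering, cold or unstructured data in Hadoop — can be condensed into a small decision helper. The function below is an illustrative sketch only; the category names and the volume threshold are assumptions for illustration, not SAP-defined limits:

```python
def choose_storage(temperature, structured, needs_realtime, volume_tb):
    """Pick a storage option per the guide's rules of thumb (illustrative only)."""
    if temperature == "hot" or needs_realtime:
        return "SAP HANA in-memory store"           # frequently accessed, high-value data
    if temperature == "warm" and structured and volume_tb <= 1000:
        return "dynamic tiering (extended store)"   # SQL processing is sufficient
    return "Hadoop + SAP HANA Vora"                 # cold, raw, or unstructured data

# Example: a 5,000 TB archive of raw sensor logs
print(choose_storage("cold", structured=False, needs_realtime=False, volume_tb=5000))
```

In practice the decision also weighs cost, features, and operational readiness, as the introduction explains; the sketch only encodes the coarse data-temperature rule.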
This guide supports CIOs in setting up a system infrastructure for their business
that can get the best out of Big Data. We describe what the SAP HANA® platform
can do and how it integrates with Hadoop and related technologies. We examine
typical data flows where different technologies interplay, looking at data lifecycle
management and data streaming. Concrete use cases point out the requirements
associated with Big Data as well as the opportunities it offers, and how companies
are already taking advantage of them.
1.	 In SAP HANA, a volume on the order of terabytes results
after compression. Therefore, this actually equates to much
larger data volumes in conventional systems.
Introduction
Technology has the ability to shape our world. Big Data is one of the most important technology trends that will impact our world between now and 2020.

Arguably, Big Data is an artificial category created by technology vendors as a convenient way to reference certain new tools (the value of which is undeniable). The term is broadly used to refer to large or complex data sets that traditional data processing applications are not able to manage. In the past five years we have created more data than all prior years combined, and having all this new data is making business operations much more complex.

In addition, a lot of data does not automatically equal a lot of useful information. An effective Big Data infrastructure should be able to separate the background noise from the valuable signals that can be translated to actionable insights.

There are many choices when it comes to designing and setting up a suitable Big Data system infrastructure, but there has been little guidance regarding the best approach for both exploiting the potential of Big Data and meeting enterprise-specific requirements. This document provides orientation for CIOs so they can choose the right storage and processing options for a given Big Data application and understand the impact and consequences of their decision. It helps to answer questions about what type of data should be stored in in-memory tables, in dynamic tiering, and in Hadoop, as well as which processing engine should be used for a given task.

Making the wrong decision can lead to unnecessarily complex and expensive solutions that do not meet the business requirements of Big Data. Consider, for example, an application that uses SAP HANA to store its business data and has to manage additional data coming from devices. Assume that the device data is structured, the total expected volume is in the gigabytes range, and the data is processed with SQL. Since such data can be efficiently managed in the SAP HANA database, a solution architecture that requires an additional Hadoop cluster would add unnecessary complexity and cost of operations.

The decision for a particular storage option or processing engine depends on several dimensions, including performance, data volume, cost, features, and operational readiness. This guide helps CIOs understand these dimensions and make the right decision for their business.

The guide is structured in the following sections:
•• The “Challenges and Opportunities of Big Data” section describes briefly what makes Big Data challenging yet, at the same time, a source of benefit for today’s businesses.
•• “Big Data Reference Architecture” explains the lambda architecture, which is one of the current reference architectures for Big Data.
•• “SAP HANA Platform for Handling Big Data” gives an overview of the SAP HANA platform and its options for managing Big Data.
•• “Big Data Scenarios and Data Flows” explains typical Big Data scenarios and data flows and how they influence the Big Data infrastructure setup.
•• “Big Data Use Cases” describes two real-world use cases.
•• “SAP HANA Platform: Open, Flexible, Integrated, Scalable, and More” summarizes the key characteristics of the SAP HANA platform that make it a sustainable investment.
•• “Find Out More” provides links to further information.
Challenges and Opportunities of Big Data
There has been an explosion of new technologies,
new data types and sources, and new ways of
using existing technology. From mobile and cloud
computing to social media and sentiment to
machine, log, and sensor data, data science is
wringing value from massive amounts of struc-
tured and unstructured data, and high-velocity
streaming data is providing insight and driving
decisions like never before.
Possessing data does not add value in and of itself,
but being able to use it to make timely, meaningful
decisions that impact business is enormously
valuable. Before Big Data can be monetized and
turned into a strategic asset, however, today’s
CIOs need to consider several things. To start, the
expectations of the business should be matched
by the most suitable technology (see Figure 1).
In today’s networked economy, characterized by
hyperconnectivity or instant connection to the
business network, we can confidently anticipate
that data quantities will continue to grow at high
rates. At the same time, this data will be of differ-
ent types – structured and unstructured, and
from low value to high value.
In the last decade, companies like Google,
Facebook, and Netflix have led the way in collect-
ing and monetizing huge amounts of data gener-
ated by consumers’ everyday activity. They look
on this data as a strategic asset – every decision
in their organizations is data driven, as is every
product they sell. This has created enormous
interest among traditional enterprises, which
can easily see the benefit of putting their data
to work in the same way.
Big Data is often characterized by the three
Vs: volume, velocity, and variety. These
characteristics pose great challenges for
conventional disk-based relational databases.
The left side of Figure 1 shows common expectations regarding Big Data that require consideration of:
•• New digital channels
•• Possible correlations between transactional and analytics data from the enterprise with data from other sources (for example, weather data or social media) that might be meaningful
•• The ability to create simulations and sophisticated data visualizations that depict insights from data in new, more compelling ways

On the technology side, CIOs should get familiar with the available technologies and their capabilities in order to make the right decisions to build a system infrastructure that satisfies business expectations without disregarding costs. For example, a unifying platform can reduce total cost of operations and at the same time preserve existing investments in technology, staff, and training.

To manage Big Data well, enterprises must have people with dedicated skills to exploit the opportunities that it offers. The main task of data analysts or data scientists is to separate the background noise from the meaningful signals to enable solid decision making and take appropriate actions.

SAP HANA is already helping businesses to unlock this valuable information by addressing at least one very important aspect: the ability to perform real-time analytics on very large data sets. Not only data scientists but also managers and executives can now get insight into their current state of affairs at any time and at “the speed of thought.”
Figure 1: Matching Business Expectations with Technology, Solutions, and Skills
Business expectations – What are the expectations of the following for Big Data?
•	 Marketing analytics
•	 Sales analytics
•	 Operational analytics
•	 Financial analytics
Technology, solutions, and skills
•	 Which technology should we consider?
•	 Is a unifying platform possible?
•	 Which skill sets will we need?
CHARACTERISTICS OF BIG DATA
The most conspicuous and technically measurable characteristics of Big Data are often referred to as the three Vs: volume, velocity, and variety (see Figure 2).2

Not only is the volume of data large, but it is arriving ever more rapidly (velocity) – consider machine data generated on the factory floor, or algorithmic trading data generated by financial markets. The data is also of many different types (variety) – from comments about products on Twitter or Facebook and logs of customer behavior on a company Web site to sensor and weather data.

A general architecture pattern has been postulated to account for exploding data volumes, fast-growing data quantities that require processing and storage, and increasing data variety. This pattern is called the lambda architecture, which we discuss next.

2.	 “Value” and “veracity” are further dimensions that are considered to be characteristic Vs of Big Data. Value refers to the ROI the data is able to provide to the business. ROI accounts for both the potential benefit that can be derived from the data and the cost of storage media. Veracity refers to the ability to trust the data used to make decisions (which, ultimately, is required of all data).
Figure 2: Characteristics and Opportunities of Big Data
Characteristics:
•	 Volume – exploding data
•	 Velocity – accelerated data processing
•	 Variety – increasing data
Opportunities:
•	 Deep data insights
•	 Broad data scope
•	 Ability to access most recent data
•	 Answers in real time
•	 No delays due to constant data preparation
•	 Predictive capabilities
Big Data Reference Architecture
The techniques that have been used to deal with Big Data as a strategic asset are based on technologies that allow the collection, storage, and processing of very large amounts of data at a low economic cost. Initially, some of these technologies (for example, Hadoop) were primarily used for batch workloads. However, in the last several years other technologies (for example, Spark) have emerged that enable both batch and real-time processing to work in parallel in the same infrastructure. It is precisely this combination of capabilities that is the foremost requirement of Big Data architectures. The lambda architecture describes how this requirement can be achieved. The Big Data community regards it as an important reference architecture for Big Data (although it has pros and cons). It is particularly well suited for predictive analytics where patterns are identified within a historical data set on a regular basis and incoming records are checked in real time to see if they correspond to these patterns.

The most important characteristics of a lambda architecture are:
•• Fault tolerance: ability to satisfy requirements in spite of failures (if a fault occurs, no information gets lost, since the incoming data can be recomputed from the append-only master data set)
•• Scalability: flexibility to accommodate data growth
•• Low latency for reads and writes: minimum time delay of system responses
•• Real-time results: quick return of results despite data load

Lambda architecture is defined by three functions: real-time processing, batch processing, and query processing. Figure 3 shows how these functions interact. Additionally, the master data set (“all data” in the figure) is being continuously updated by appending new data. To compensate for the latency of batch processing, three architectural layers are defined: the batch layer, the serving layer, and the speed layer.
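The interplay of the three functions — batch processing over the append-only master data set, real-time processing in the speed layer, and query-time merging in the serving layer — can be illustrated with a deliberately simplified sketch. The word-count workload and all names below are invented for illustration; real implementations use distributed engines rather than in-process dictionaries:

```python
master_data = []      # immutable, append-only master data set
batch_view = {}       # precomputed by (re)running batch jobs
realtime_view = {}    # incrementally updated by the speed layer

def new_data(record):
    """Route each incoming record into both the batch and the speed layer."""
    master_data.append(record)                                # batch layer: append only
    realtime_view[record] = realtime_view.get(record, 0) + 1  # speed layer: incremental

def run_batch_job():
    """Recompute the batch view from all data; the speed layer then only
    needs to cover records that arrive after this point (simplified)."""
    global batch_view, realtime_view
    batch_view = {}
    for record in master_data:
        batch_view[record] = batch_view.get(record, 0) + 1
    realtime_view = {}

def query(record):
    # serving layer: merge the precomputed batch view with the real-time view
    return batch_view.get(record, 0) + realtime_view.get(record, 0)

for r in ["a", "b", "a"]:
    new_data(r)
run_batch_job()
new_data("a")        # arrives after the batch job; visible only via the speed layer
print(query("a"))    # 3: 2 from the batch view + 1 from the real-time view
```

Note how fault tolerance falls out of the design: if either view is lost, it can be recomputed from `master_data`, which is never modified in place.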
Figure 3: Lambda Architecture
[Diagram: (1) new data is appended to the master data set (“all data”) in the batch layer, where batch processing overwrites the batch views in the serving layer; (2) in the speed layer, real-time processing updates the real-time views; (3) query processing combines batch views and real-time views into the query result.]
Incoming new data is routed into both the batch layer and the speed layer. In the batch layer, the incoming data is appended to the master data set – for example, in a Hadoop Distributed File System (HDFS). Batch jobs are used to read the master data set and produce precomputed and preaggregated results called batch views. The potentially long-running batch jobs are continuously reexecuted to overwrite the batch views with newer versions as soon as they are available. The batch views are loaded into a read-only data store in the serving layer, to support fast and random reads. Real-time views are not copied because an ad hoc creation of the view is desired; every query should be answered with the newest possible data. Therefore, the real-time view is part of the real-time layer and not reproduced in the query layer.

In the lambda architecture pattern, incoming queries are able to act on a combination of results from the batch and real-time views. Consequently, databases typically index the batch view to allow for ad hoc queries and have one component to merge both real-time and batch views.

Fault tolerance is achieved by maintaining an immutable, append-only master data set in which all incoming data is stored. If a fault occurs in any processing step, everything can be recomputed from the master data set. The continuously growing data is stored in HDFS, for example, as scalable and reliable distributed storage. The data in an HDFS environment is traditionally processed with MapReduce batch jobs.
For all data, SAP HANA serves as the central
point for data access, data modeling, and
system administration.
The serving layer enables fast queries in the batch views, but these views only contain information from the time before the producing batch jobs were started. To compensate for this time gap, all new data is fed into the speed layer in parallel. The speed layer processes new data in real time with stream-processing technology (such as Apache Storm or Spark Streaming). The results are used to incrementally update real-time views in a database supporting fast random reads and writes (for example, by using Apache Cassandra3).

The speed layer needs to approximate the results because it has only current information and no historical data. As explained, real-time views and batch views are merged in the serving layer when a query is executed.

Recently, the lambda architecture has been criticized because it creates a complex landscape with many different components that are loosely integrated, and it requires implementing different code bases on different technology stacks in the batch and speed layer. At the same time, the implemented functions must be maintained and kept synchronized so that they produce the same results from both layers. The effort involved is huge because of the complexity of the distributed systems.

But in spite of these criticisms, lambda architecture continues to be a valid way to handle Big Data. It is ultimately a matter of making lambda requirements more manageable.

SAP HANA enables you to implement a complete lambda architecture from preintegrated components of the platform, including the processing engines and data stores. SAP HANA even goes beyond lambda since it comes with options for integration with remote data sources and supports data integration and replication, data transformations, data quality operations, event streaming processing, and federated query execution. Moreover, SAP HANA offers the advantages of an integrated data platform – simpler installation, administration, lifecycle management, and development – because all three layers of the lambda architecture can run within one system and all persisted data can be accessed within the same database.

3.	 Apache Cassandra is an open-source distributed database management system initially developed at Facebook to deal with its in-box search feature. For details, see the DataStax Web site, “About Apache Cassandra.”
SAP HANA Platform for Handling Big Data
The SAP HANA platform includes application services, database services, and integration services (see Figure 4). Since database and integration services are most relevant for Big Data infrastructures, we discuss them first.

As mentioned earlier, more and more software solutions require capabilities to manage and process Big Data holistically, independent of whether the source is a machine sensor or social media. Big Data typically needs to be combined with traditional business data created by enterprise applications. SAP HANA is the SAP strategic platform for unifying and combining all this data. It is ideal for central data management for all applications because it is open and capable of handling not only transactional but also analytics workloads all on one platform. As described in this section, integration capabilities in SAP HANA make it possible to combine it with other technologies (such as Hadoop and members of its family) to obtain the most suitable and effective Big Data landscape.

DATABASE SERVICES
Database transactions are processed reliably because the system is compliant with ACID4 and SQL standards. The database services are accessible through Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), JavaScript Object Notation (JSON), and Open Data Protocol (OData). Ultimately, this means that SAP HANA is both standards based and open for connection through commonly used application programming interfaces (APIs) and protocols, which facilitates its adoption and adaptation to existing infrastructures.
Figure 4: Overview of the SAP HANA Platform
[Diagram: SAP (such as SAP® S/4HANA suite), ISV, and custom applications on all devices run on the SAP HANA® platform, which consists of application services, database services, and integration services.]
ISV = independent software vendor
4.	 Atomicity, consistency, isolation, and durability.
The database services encompass foundation and processing capabilities. The foundation consists of functionality that turns data into real-time information, with no sophisticated tuning required for complex and ad hoc queries (see Figure 5). OLTP and OLAP can run on a single copy of data in the same system because data is in-memory, and the columnar store in SAP HANA can handle both types of workloads with high performance.

In mixed environments with both OLAP and OLTP operations, SAP HANA deals with a blend of hundreds to tens of thousands of concurrent statements, from simple, short-lived, and high-priority transactions to deeply complex, long-running analytical queries that consume a lot of resources from the host. A dedicated workload manager helps ensure that this is done effectively by controlling parallel processing and prioritization of processing activities.

Because SAP HANA uses multiple tenant databases that can run in one instance of SAP HANA, it is cloud ready and allows secure, efficient management of a shared infrastructure. Moreover, SAP HANA maintains a strong separation of data, resources, and users among tenant databases. Multiple databases can be managed as a unit. Additionally, SAP HANA provides the flexibility to allocate memory or CPU to each tenant database.
Figure 5: Database Services – Foundation
[Diagram: within the SAP HANA® platform, the database services foundation provides in-memory ACID columnar storage, multicore processing and parallelization, advanced compression, multitenant database containers, and dynamic tiering.]
ACID = atomicity, consistency, isolation, and durability
Due to its advanced compression capabilities, SAP HANA can support scale-out deployments of several terabytes – it is not confined by the size of memory. Assuming a data compression factor of 7 and that approximately 50% of the memory is used for query processing, a 3 TB single-node system, for example, could store a database with an uncompressed size of 10 TB. Scale-out systems with much more memory are also possible, and there exist productive scale-out systems for databases with an uncompressed size in the range of hundreds of terabytes. The largest certified scale-out hardware configuration for SAP HANA has 94 cluster nodes with 3 TB of memory each.5 This extreme certified scale shows that SAP HANA is technically able to store Big Data in-memory, even though this is not often done in practice.

In addition, warm data can be stored on disk in a columnar format and accessed transparently. This is the dynamic tiering option, which extends SAP HANA with a disk-based columnar store, also called an extended store. It runs in the extended store server, which is an integration of SAP IQ into SAP HANA. The extended store server can manage up to petabytes of data and is optimized for fast execution of complex analytical queries on very big tables.

5.	 See the certified hardware directory for SAP HANA at http://global.sap.com/community/ebook/2014-09-02-hana-hardware/enEN/index.html.
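The sizing rule of thumb can be checked with simple arithmetic. The compression factor of 7 and the roughly 50% working-memory share are taken from the text; the node counts are the examples cited there:

```python
node_memory_tb = 3.0
working_share = 0.5        # ~50% of memory reserved for query processing
compression_factor = 7     # assumed average columnar compression factor

data_capacity_tb = node_memory_tb * (1 - working_share)   # 1.5 TB of compressed data
uncompressed_tb = data_capacity_tb * compression_factor
print(round(uncompressed_tb, 1))   # → 10.5, matching the "about 10 TB" in the text

# The largest certified scale-out configuration cited: 94 nodes x 3 TB each
cluster_uncompressed_tb = 94 * node_memory_tb * (1 - working_share) * compression_factor
print(round(cluster_uncompressed_tb))   # → 987, i.e. close to a petabyte uncompressed
```

Actual compression ratios vary by data model, so such estimates are starting points for sizing, not guarantees.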
Possessing data does not add value in and of
itself, but being able to use it to make timely,
meaningful decisions that impact business is
enormously valuable.
The processing capabilities of the database services (see Figure 6) allow running applications with almost any data characteristics in the same system. These capabilities include:
•• Analysis of sentiment by summarizing, classifying, and investigating text content
•• The ability to search across structured and unstructured data
•• Persistence, manipulation, and analysis of network relationships and graphs without data duplication
•• Built-in business rules and functions that accelerate application development
•• Prepackaged predictive algorithms that operate on current data, as well as open predictive and machine-learning abilities – for example, through the integration of an R server
•• Interactive planning without the need to move data to the application server
•• Cleansing of locally generated and imported data without the need for postprocessing

Particularly in the context of the Internet of Things, the capabilities of SAP HANA facilitate handling of series data, mostly consisting of successive events collected over a predefined time interval.
Figure 6: Database Services – Processing Capabilities
[Diagram: the processing capabilities of the database services in the SAP HANA® platform comprise spatial, predictive, search, text analytics, text mining, graph, data quality, series data, and function libraries.]
INTEGRATION SERVICES
Integration services play a strategic role in
handling Big Data coming from any source
and using it to provide a complete view of the
business (see Figure 7).
Figure 7: Integration Services of the SAP HANA Platform
[Diagram: the integration services comprise smart data access (federation with IBM DB2, Netezza, Oracle, MS SQL Server, Teradata, SAP HANA, SAP® ASE, SAP IQ, and Hive), smart data integration (loading from IBM DB2, Oracle, MS SQL Server, Twitter, Hive, OData, and custom adapters), smart data streaming, remote data sync (synchronizing), and Hadoop integration (SAP HANA Vora™, Spark, Hive, and Hadoop).]
SAP ASE = SAP Adaptive Server Enterprise
The integration services of SAP HANA allow access to information in different data sources. These services enable replication and movement of almost any type of data in near-real time:
•• Smart data access (SDA) enables remote query execution, also known as data virtualization. By means of virtual tables, data in remote systems is made available from queries executed in the SAP HANA platform.
•• Smart data integration (SDI) can be used for both batch and real-time data provisioning from a variety of remote data sources. Aside from supporting additional kinds of data sources, SDI also includes an adapter software development kit (SDK) for customers and partners. SDI partially uses SDA – for example, it uses the concept of remote source systems and virtual tables. It also plugs into the SDA federation framework for remote query execution.
•• Smart data streaming (SDS) enables capture and analysis of live data streams and routes them to the appropriate storage or dashboard. It includes a “streaming lite” deployment option that can run on small devices at the edge (for example, on ARM-based Linux devices). It can capture and preprocess data from sensors and machines and send results to a central core or cloud-based SDS.
•• Remote data synchronization can be used for bidirectional data synchronization with embedded SAP SQL Anywhere® solutions and UltraLite databases running on devices. This is particularly well suited for data synchronization across high-latency or intermittent networks.
•• Hadoop integration involves multiple access points from SAP HANA to Hadoop data (through Spark, Hive, HDFS, and MapReduce). Furthermore, SAP HANA Vora is an in-memory, column-based SQL engine that complements the SAP HANA platform with a native SAP query processing engine on Hadoop. It is designed for high-performance queries on Big Data volumes in large distributed clusters. SAP HANA Vora is integrated into the Spark computing framework.

Along with the database services, the integration services enable the SAP HANA platform to handle Big Data regardless of its characteristics (volume, velocity, and variety). In addition, the services offer openness and connectivity with virtually any technology, and the most important tools and toolkits are already closely integrated into the SAP HANA platform environment.
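The data virtualization idea behind smart data access — a virtual table that looks local but forwards queries to a remote source — can be sketched in a few lines. The registry, adapter class, and table name below are invented for illustration and are not the SAP HANA API; in the real feature, predicates are pushed down to the remote engine over the federation framework:

```python
class RemoteSource:
    """Stand-in for a remote system reachable through a federation adapter."""
    def __init__(self, rows):
        self.rows = rows

    def scan(self, predicate):
        # In real federation the predicate is pushed down to the remote engine
        return [r for r in self.rows if predicate(r)]

virtual_tables = {}   # virtual table name -> remote source

def create_virtual_table(name, source):
    virtual_tables[name] = source

def select(table, predicate=lambda r: True):
    """Query a virtual table as if it were local; execution happens remotely."""
    return virtual_tables[table].scan(predicate)

# Hypothetical remote Hive source exposed as a virtual table
hadoop_hive = RemoteSource([{"sensor": 1, "temp": 21}, {"sensor": 2, "temp": 35}])
create_virtual_table("V_SENSOR_DATA", hadoop_hive)
print(select("V_SENSOR_DATA", lambda r: r["temp"] > 30))
```

The value of the pattern is that consumers see one catalog of tables, while the data itself never has to be copied into the local store.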
APPLICATION SERVICES
Big Data is of limited value unless you can operationalize the insight through applications and business processes. New business processes may even be created by enhancing existing applications or developing new ones. The application services of SAP HANA support Big Data management by enabling all kinds of applications to create new or use existing data stored in the platform. Moreover, when building Big Data applications, these services help to simplify the landscape by running applications on the same platform. The application services deliver a first-class user experience on any device through the SAP Fiori® user experience (UX) technology. In addition, they support open development standards such as those for HTML5, JSON, and JavaScript. Built-in tools are included that support development, version control, bundling, transport, and installation of applications (see Figure 8).

The constituent services of the SAP HANA platform enable you to deal with the challenges and requirements associated with Big Data. The database foundation services support you when dealing with structured (OLTP and OLAP) and unstructured data from any source, and support high performance on one platform with different storage options. The processing capabilities facilitate handling data of almost any type, allowing sentiment and predictive analysis, among other types. The integration services facilitate connection to virtually any data source and cover the requirements of lambda and other architecture principles that may be related to Big Data handling. And finally, the application services provide a state-of-the-art look and feel for all applications running on SAP HANA and help simplify system landscapes by letting you run applications on the same platform.
Figure 8: Application Services of the SAP HANA Platform
[Diagram: the application services comprise a Web server, the SAP Fiori® user experience (UX), and application lifecycle management, accessible through SQL, JSON, ADO.NET, J/ODBC, OData, HTML5, MDX, and XML/A.]
SQL = Structured Query Language; JSON = JavaScript Object Notation; ODBC = Open Database Connectivity; OData = Open Data Protocol; MDX = Multidimensional Expressions
17 / 31
© 2016 SAP SE or an SAP affiliate company. All rights reserved.
Big Data Scenarios and Data Flows
Having introduced the elements of the SAP HANA platform, it is now time to show how they interact and enable you to deal best with your Big Data requirements. In this section we examine typical Big Data scenarios and the associated data flows. This will help you better understand the characteristics of the different storage and processing options. We will discuss two main scenarios:
•• How incoming external data is stored and processed
•• How the lifecycle for data created by enterprise applications is handled (with the associated moving of aged data between different storage tiers)

As mentioned in the previous section, the SAP HANA platform comprises memory-based and disk-based data stores. The platform supports relational data, text data, spatial data, series data, and graph-structured data and provides various processing options such as SQL, SQLScript, calculation views, and libraries for business calculations as well as predictive analysis. From the perspective of data flows, there are several typical patterns, some of which are explained in Figure 9.

The integration services of the SAP HANA platform facilitate connection to virtually any data source and cover the requirements of lambda and other architecture principles that may be related to Big Data handling.
Figure 9: Typical Data Flows
[Diagram: external data sources feed the SAP HANA platform either through smart data streaming or through data collection technologies (Kafka, Flume, and so on). Within the platform, the processing engines in SAP HANA work on the in-memory store and on dynamic tiering. Hadoop holds raw and preprocessed data in HDFS for offline processing with MapReduce, Hive, Spark SQL, and SAP HANA Vora™, accessed from SAP HANA through smart data access and vUDFs. Enterprise applications, analytics tools, data discovery tools, SAP HANA data warehousing, and other clients of SAP HANA consume the results. The flows labeled A, B, C1, C2, D, E, F, G, H1, H2, I, and J are explained in the text.
vUDF = virtual user-defined function; SQL = Structured Query Language; HDFS = Hadoop distributed file system]
(A) Incoming streams of raw data can be filtered, analyzed, cleansed, and preaggregated in real time with smart data streaming in SAP HANA. Such preprocessing can, for example, be used for correcting outliers and missing values in streams of sensor data, or for detecting patterns that indicate alert conditions.

(B) Smart data streaming creates preprocessed and condensed high-value data, which is stored in in-memory tables (real-time views), for example, for real-time analytics and for combining with data created by enterprise applications on SAP HANA. Smart data streaming is capable of handling high volumes of rapidly incoming data of various types. Steps A and B correspond to the speed layer in the lambda architecture.

(C) In parallel, the incoming raw data is appended to a huge data store (C1) where all data is collected. Such massively growing data stores are often called data lakes. Data collected this way can be utilized for later analysis and information extraction, potentially using the complete data set, including historical data. HDFS is a scalable, fault-tolerant, and comparatively inexpensive data store, which makes it a suitable choice for storing this constantly growing set of incoming raw data. Stored in HDFS, the data can be further processed with a variety of data technologies from the Hadoop family. It can also be queried and analyzed with SAP HANA Vora. It should be mentioned, however, that raw data need not necessarily be stored in HDFS in all cases. If the raw data is structured, and if SQL is sufficient for processing, it can also be stored in the dynamic tiering extended store (C2).

(D) Smart data streaming is the preferred option in an SAP HANA platform environment. This is because it has some unique processing features and reduces the cost of development and operation with powerful development and administration tools, as well as integration into the SAP HANA platform. But if real-time processing of incoming data is not required, the raw data can be directly stored in HDFS. Data routing is managed through data collection technologies such as Kafka and Flume as alternatives to smart data streaming.

(E) The raw data is preprocessed on the Hadoop side by means of batch jobs, such as information extraction, transformations, filtering, data cleansing, preaggregation, or analysis of multimedia content. SAP HANA is also capable of executing the corresponding MapReduce jobs. But depending on the selected setup, this kind of batch processing (batch layer of the lambda architecture) can be executed on the Hadoop cluster outside the control of SAP HANA. This can be done using the various processing engines of the Hadoop family. The result of the preprocessing is again stored in HDFS and can be accessed from SAP HANA (see step F).
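Steps A through E together form a lambda architecture: the same raw events feed both a speed layer (streaming into real-time views) and a batch layer (jobs over the data lake). The following is a minimal generic Python sketch of this split – the event data is hypothetical and the code does not use any SAP API; it only illustrates the division of labor between the two layers.

```python
# Minimal lambda-architecture sketch (hypothetical data, not SAP APIs):
# the speed layer maintains low-latency running views over incoming
# events, while the batch layer periodically recomputes exact results
# from the complete raw store (the "data lake").

raw_store = []          # stands in for the HDFS data lake (step C)
speed_view = {}         # stands in for real-time in-memory views (step B)

def ingest(event):
    """Step A/C: route one event to both layers."""
    raw_store.append(event)                      # append-only raw history
    value = event["value"]
    if value is None or value > 1000:            # correct missing/outlier values
        value = speed_view.get(event["sensor"], {}).get("avg", 0.0)
    view = speed_view.setdefault(event["sensor"], {"count": 0, "avg": 0.0})
    view["count"] += 1                           # incremental (approximate) view
    view["avg"] += (value - view["avg"]) / view["count"]

def batch_recompute():
    """Step E: exact batch view over the full raw store, skipping bad readings."""
    result = {}
    for event in raw_store:
        v = event["value"]
        if v is None or v > 1000:
            continue
        c, avg = result.get(event["sensor"], (0, 0.0))
        result[event["sensor"]] = (c + 1, avg + (v - avg) / (c + 1))
    return {sensor: avg for sensor, (c, avg) in result.items()}

for e in [{"sensor": "s1", "value": 10.0},
          {"sensor": "s1", "value": None},      # missing reading
          {"sensor": "s1", "value": 20.0}]:
    ingest(e)

print(speed_view["s1"]["avg"])   # fast but approximate
print(batch_recompute()["s1"])   # slower but exact: 15.0
```

The point of the split is that the speed layer answers immediately with the best available estimate, while the batch layer later replaces it with an exact result computed over all raw data.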
(F) Structured data can be directly read from HDFS files into SAP HANA. The virtual tables for smart data access can be used to execute federated queries against data in SAP HANA Vora, Spark SQL, or Hive. Virtual user-defined functions6 (vUDFs) that trigger MapReduce jobs can be used if application-specific processing is required that cannot be done with SQL on Hadoop. Whenever the vUDF is called, the MapReduce job is executed on the Hadoop cluster and the result is returned to SAP HANA. Because this option is slower, results can be cached on Hadoop to improve the performance if the underlying data changes infrequently.

(G) High-value data created as the result of preprocessing within Hadoop (by batch, vUDF, or federated queries) can be stored in in-memory tables in SAP HANA for low-latency access, for efficient processing with the database's native engines, and for combining it efficiently with other data in SAP HANA. Such derived high-value data can be pulled by SAP HANA (with vUDF calls and remote queries), or it can be pushed to SAP HANA from Hadoop – for example, over a JDBC connection. Storing the derived data in in-memory tables in SAP HANA has the advantage that the data can be accessed with very low latency; but depending on the use case, some mechanism for refreshing the data may need to be implemented. The alternative is to not persist the results in tables but to execute the remote queries or vUDFs each time the data is accessed by the application. The decision depends on how frequently the data is accessed, how fresh the data needs to be, and what access latency can be tolerated. Therefore, the option should be determined based on the specific use case.

(H) Data discovery tools can consume data from SAP HANA Vora through SAP HANA using smart data access (H1). It is also possible to connect such tools to SAP HANA Vora through Spark SQL for data discovery and visualization (H2), independently of the server for the SAP HANA database. That way, tools such as SAP Lumira® software can be used with SAP HANA Vora to analyze and visualize data in Hadoop. However, a decisive advantage of using SAP HANA is that its adapter for Spark SQL enables data analysts to combine data in SAP HANA Vora with business data in the SAP HANA database. With data virtualization through SAP HANA, any application or tool enabled for SAP HANA is automatically enabled for SAP HANA Vora, since the existing interfaces and connectivity are used.

6. To deliver and run MapReduce jobs using SAP HANA, developers need to define a table-valued virtual user-defined function (vUDF). SAP HANA applications can use vUDF calls in the FROM clause of an SQL query such as a table or view. The vUDF is associated with a MapReduce Java program, which is shipped together with the function definition as SAP HANA database content.
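The materialize-versus-query-on-access decision in step G can be expressed as a simple caching policy. The sketch below is generic Python, not an SAP API: `fetch_remote` is a hypothetical stand-in for whatever remote query or vUDF call actually produces the derived data, and the time-to-live parameter encodes how fresh the data must be.

```python
import time

class RemoteResultCache:
    """Cache a remote query result for ttl_seconds; after that, refresh.

    ttl_seconds=0 means "always execute remotely" (maximum freshness,
    highest latency); a very large value means "materialize once"
    (lowest latency, stalest data). The right setting depends on access
    frequency, freshness needs, and tolerable latency.
    """

    def __init__(self, fetch_remote, ttl_seconds):
        self.fetch_remote = fetch_remote
        self.ttl = ttl_seconds
        self._value = None
        self._fetched_at = None

    def get(self):
        now = time.monotonic()
        if self._fetched_at is None or now - self._fetched_at > self.ttl:
            self._value = self.fetch_remote()    # slow remote call
            self._fetched_at = now
        return self._value                       # fast local access

calls = 0
def fetch_remote():                              # hypothetical remote query
    global calls
    calls += 1
    return {"rows": 42}

cache = RemoteResultCache(fetch_remote, ttl_seconds=3600)
cache.get()
cache.get()                                      # served locally, no remote call
print(calls)                                     # remote executed only once
```

Persisting the result in an in-memory table corresponds to a long time-to-live; executing the remote query on every access corresponds to a time-to-live of zero.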
(I) A business application on top of SAP HANA may also create data. Initially, the data is of high value and therefore stored in in-memory tables, where it can be accessed with very low latency. After some time, the data may become less relevant and is moved either to disk-based tables in the extended store (dynamic tiering) or to HDFS. In dynamic tiering the data can continue to be managed with tools in SAP HANA and queried with high performance, and selective updates are still supported. In contrast, data in HDFS can no longer be changed, but applications that run on the SAP HANA database can still query it through virtual tables when required.

(J) Data warehousing based on the SAP HANA platform, which includes the SAP Business Warehouse application and native modeling tools in SAP HANA, can easily consume Big Data through the platform, and can make the warehousing process simpler and more agile, virtual, and comprehensive. Particularly in landscapes consisting of many systems and data sources, a centralized data warehouse supports the combination of Big Data stored in the platform with centralized corporate data to achieve new insights. Data warehousing using SAP HANA creates a central place where one version of the truth is available, based on trusted, harmonized data.

As we have said, raw external data may come from a variety of sources – sensors and machines, social media content, e-mails, Web content, Web logs, security logs, text documents, multimedia files, and more. The data may be collected, analyzed, and aggregated in Hadoop, and the extracted high-value (hot) data is moved to in-memory tables in SAP HANA. This case is depicted in Figure 10 (from right to left) as the blue arrow that turns red.
Figure 10: Data Flows in Combined Data Processing Technologies
[Diagram: on the left, an OLTP system writes to in-memory storage in SAP HANA®, which the processing engines in SAP HANA serve to analytics tools (such as SAP Lumira® software); on the right, sensors, social media, and similar sources feed Hadoop (Hadoop Distributed File System, HDFS) and dynamic tiering in SAP HANA (extended storage). A data aging/temperature arrow moves data from left to right; a data aggregation arrow moves data from right to left.
OLTP = online transaction processing]
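The aging flow in step I amounts to classifying records by their "temperature" and moving them to the matching store. A schematic Python sketch follows – the tier names mirror those in the text, but the age thresholds are made-up examples, not SAP defaults:

```python
from datetime import date, timedelta

# Illustrative aging thresholds; real policies depend on the application.
WARM_AFTER = timedelta(days=90)     # in-memory -> dynamic tiering
COLD_AFTER = timedelta(days=730)    # dynamic tiering -> HDFS

def tier_for(record_date, today):
    """Classify a record by age into the hot, warm, or cold store."""
    age = today - record_date
    if age < WARM_AFTER:
        return "in-memory"          # lowest latency, selective updates
    if age < COLD_AFTER:
        return "dynamic tiering"    # disk-based, still updatable and queryable
    return "hdfs"                   # immutable, queried via virtual tables

today = date(2016, 2, 1)
print(tier_for(date(2016, 1, 15), today))   # in-memory
print(tier_for(date(2015, 6, 1), today))    # dynamic tiering
print(tier_for(date(2013, 1, 1), today))    # hdfs
```

In practice such a rule would drive a periodic job that relocates rows between the stores; the essential idea is only that temperature is a function of age and access patterns, not of the data's type.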
Figure 10 also shows that another typical data flow goes in the opposite direction (from left to right). In this scenario, the high-value (OLTP) data is created by an SAP business application and stored in in-memory tables in the SAP HANA database (red arrow), where it can be accessed with very low latency by analytics tools such as SAP Lumira. But not all data has the same business value, and not all data needs to be kept in-memory forever. Aside from its business value, data can be categorized in terms of volume, access patterns, and performance requirements. Based on all these characteristics, the different categories of data can be stored in different data stores with different storage capabilities, performance, and cost. When data loses value over time, it is said to get colder. It can then be moved to a different storage tier with higher latency, bigger capacity, and maybe less cost. In Figure 10, this is represented by the red arrow that turns blue.

When considering the dimensions value, volume, and processing performance for data, clear differences can be observed. In-memory storage in SAP HANA has the highest processing performance and is therefore used for high-value data, which is characterized by its comparatively low volume (up to several terabytes, though higher volumes can be reached). Higher volumes of warm and raw data are not relevant for in-memory storage but are best managed with dynamic tiering (for warm data) or Hadoop (for cold business data, raw data, or data of unknown value). Figure 11 depicts a qualitative comparison of these technologies in terms of data value, processing performance, and volume.

As we have seen, there are multiple options to deal with data of different types. SAP HANA is ready to handle data regardless of its characteristics in different storage tiers and with different processing engines – enabling it to act as the unifying data platform.
Figure 11: Qualitative Assessment of Data Processing Technologies
[Diagram: value and performance on the vertical axis, volume on the horizontal axis. SAP HANA® in-memory ranks highest in value and performance at the lowest volumes, dynamic tiering sits in the middle, and Hadoop handles the highest volumes at the lowest value and performance.]
Big Data Use Cases
The data flow scenarios presented in the previous sections represent basic principles. Real-world Big Data use cases illustrate concrete areas of application. Possible new applications are emerging on a daily basis. Successful Big Data use cases are already being deployed in the following areas (to name just a few):
•• Anticipating consumer behavior
•• Increasing safety
•• Mastering performance
•• Redefining operational efficiency
•• Managing predictive maintenance
•• Preventing fraud
•• Saving lives by improving medical research and services
•• Personalizing real-time promotions
•• Preventing injuries in sport
•• Enhancing fan experience

We will look at two examples that involve using SAP technology alone. However, since the SAP HANA platform is open, other components such as Hadoop and technologies from the Hadoop family can be incorporated into the setup as required. In any case, when designing the Big Data system landscape for your business, you should look for opportunities to simplify your IT infrastructure. And putting the SAP HANA platform at the center of your Big Data landscape is a strong start toward simplification.

PREDICTIVE MAINTENANCE AND SERVICE USE CASE
This use case is about obtaining meaningful views on data from machines, assets, and devices for making better real-time decisions and predictions and for improving operational performance. Typically, new business models from Industry 4.0 demand that companies improve their asset maintenance so as to achieve maximum machine availability with minimum costs. Companies also need to reduce their spare parts inventory, minimizing the amount of materials consumed by maintenance and repairs. This requires predictive analytics and algorithms to forecast equipment health.

The architecture setup in this case should allow real-time operations, analyses, and actions. At the same time, a very large number of events per day must be correlated with enterprise data. The SAP HANA platform, enhanced with the SAP Predictive Maintenance and Service solution, meets all the requirements associated with this use case (see Figure 12).
Figure 12: Predictive Maintenance and Service Setup
[Diagram: the SAP HANA® platform (database services, integration services, application and UI services; SQL, SQLScript, JavaScript; spatial processing, search, text mining, stored procedures and data models, business function library, predictive analytics library, planning engine, and rules engine) ingests transaction, unstructured, machine, Hadoop, real-time, location, and other application data. On top run the technical foundation of SAP Predictive Maintenance and Service and any apps on any app server, with open connectivity (SQL, MDX, R, JSON, SAS) to SAP® Predictive Analytics software (functionality from SAP Predictive Analysis and SAP InfiniteInsight®), SAP Lumira® software, and SAP Business Suite with SAP NetWeaver® Application Server for ABAP®; any device is supported. Gold frames in the original indicate particularly important components.
SQL = Structured Query Language; MDX = Multidimensional Expressions; JSON = JavaScript Object Notation; SAS = Statistical Analysis System]
SAP Predictive Maintenance and Service enables equipment manufacturers and operators of machinery and assets to monitor machine health remotely, predict failures, and proactively maintain assets. It is offered as a standard cloud edition or as a repeatable custom solution for the technical foundation and a custom development project. Both options are based on SAP HANA.

Using a Lot of Data from Any System
To meet the requirements of this use case, your infrastructure must be able to use data from any system, SAP or non-SAP, regardless of the data type (business or sensor data). This is where SAP Data Services software is used. Furthermore, your infrastructure should be able to listen to 1 to 2 million records per second and only keep what is interesting. SAP Event Stream Processor handles this part. Through integration of Hadoop, the platform is able to connect to many terabytes, even petabytes, of back-end data.

Processing a Lot of Data
Predictive maintenance is an important step for keeping assets from failing. For that purpose, sensor, business, environmental, sentiment, and other data must be analyzed to discover relationships, patterns, rules, outliers, and root causes and thereby enable predictions. Based on this data mining, it is possible to take actions such as creating notifications, altering maintenance schedules, prepositioning spare parts, adjusting service scheduling, changing product specifications, and more. With its query engine and rules engine, the SAP HANA platform lets you perform ad hoc queries on a large number of records combined from many sources, in seconds. The SQL, SQLScript, and JavaScript capabilities allow utilization of fast programs that use many queries. With the built-in sophisticated algorithms of the predictive analytics library, you can carry out complex analyses. In addition, with the rules engine you can define and trigger business rules. You can also expand to planning scenarios by using the planning engine. Text analysis (search and relate) is no problem with the text analysis engine, which allows handling text as just another query aspect. Similarly, the geospatial analytics capabilities of the platform make comprehensive analyses of geospatial data possible. These engines enable you to find reasons for failure (root cause), detect deviations from the norm, and create a prediction model from sensor data and failures.

Open Connectivity
With its open connectivity, the SAP HANA platform enables you to reuse your Statistical Analysis System (SAS) models with high-speed execution, should you need to connect to an SAS application. Moreover, you can explore statistics or even let the system make a best proposal by using SAP Predictive Analytics software. Because SAP HANA is able to deal with R procedures, if an R server is connected, for example, in order to use data mining algorithms for predictive stock and workforce optimization, you can reuse your R models. By integrating a Hadoop system, you can flexibly scale out to meet the requirements of preprocessing and storing vast amounts of data. Hadoop may also be used for archiving historical data or for offline batch processes.
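Detecting deviations from the norm, as described above, can be as simple as flagging sensor readings that fall outside a band around the historical mean. Here is a generic Python sketch – the readings are hypothetical, and a production model would instead come from the predictive analytics library:

```python
import statistics

def find_deviations(readings, k=2.0):
    """Flag readings more than k standard deviations from the mean.

    Simple deviation flags like these, aggregated per machine, can
    feed rule-based alerts ("create a notification when a sensor
    deviates") before a full prediction model is built.
    """
    mean = statistics.mean(readings)
    stdev = statistics.stdev(readings)
    return [(i, value) for i, value in enumerate(readings)
            if abs(value - mean) > k * stdev]

# Vibration values from one (hypothetical) pump sensor:
vibration = [0.51, 0.49, 0.52, 0.50, 0.48, 0.51, 2.90, 0.50]
print(find_deviations(vibration))   # the spike at index 6 is flagged
```

A real setup would compute such flags continuously over streaming input and correlate them with business data (maintenance history, spare-parts stock) before triggering an action.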
This use case requires quick data correlations and actions to anticipate operational needs and unexpected breakdowns, and to automate triggers. In this system configuration, flexible predictive algorithms and tools combine technical and business data. Machine-to-machine communication helps monitor activities and stores data to fuel real-time reporting. Sophisticated business intelligence tools are required to obtain meaningful visualizations (such as SAP Lumira can provide).

DEMAND SIGNAL MANAGEMENT USE CASE
Demand signal management is a common requirement of consumer products companies. They are looking to apply their efforts toward those markets of retailers and consumers where they can realize the greatest growth. To realize this goal, they need a consistent and comprehensive global market view to be able to understand the demand from these various markets. Typically, these companies use syndicated data from various agencies to understand demand and brand perception so they can focus on the right areas. But if it is not automated, the consolidation and harmonization of data from various sources (internal and external) is a highly time-consuming and error-prone manual activity. The SAP Demand Signal Management application powered by SAP HANA addresses the most common challenges of data harmonization and automation for marketing, supply chain, and sales. With SAP Demand Signal Management, the SAP HANA platform can be evolved to include data sources such as weather data or social media data. SAP Demand Signal Management can act as a central platform for various use cases such as trade promotion optimization, sentiment analytics, demand forecast, and brand perception in consumer products and other industries. Figure 13 illustrates a typical demand signal management setup.

High-volume, high-variety, and high-velocity data can be processed both offline and in real time. The data can be stored in-memory or in the dynamic tiering option or Hadoop, depending on the requirements. Hadoop would be used, for example, when the volume of data coming particularly from external sources is extremely high and comes in with high velocity (as is the case with point-of-sale data).

In the retail industry, in the area of customer behavior analytics, a similar setup allows integration of several data sources into a central data store in SAP HANA, based on a predictive model that determines scores for several metrics such as return rate and likelihood to churn. The resulting scores can then be used in customer engagement intelligence for campaign target selection. The purpose is to improve the ROI of campaigns by targeting the right audience. In turn, this marketing optimization is expected not only to drive revenue and improve margins but also to deliver a personalized consumer experience across channels. A highly optimized analytical engine in SAP HANA enables processing huge amounts of data by using many different techniques to reduce the number of searches, as well as by employing the best scientific algorithms available.

In short, by deploying the SAP HANA platform, businesses are able to gain a distinct advantage over their competition. Powered by SAP HANA, SAP Demand Signal Management provides a centralized platform to monitor the aggregated market data, which in turn results in a better understanding of demand and the ability to focus on the right markets, thereby lowering costs and increasing revenues.
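The scoring step described for customer behavior analytics can be pictured as follows. This is generic Python, not the actual predictive model in SAP HANA; the metrics, weights, and threshold are invented for illustration only.

```python
# Hypothetical per-customer metrics aggregated from several sources.
customers = {
    "C1": {"return_rate": 0.40, "days_since_purchase": 200},
    "C2": {"return_rate": 0.05, "days_since_purchase": 15},
    "C3": {"return_rate": 0.25, "days_since_purchase": 120},
}

def churn_score(metrics):
    """Toy linear score in [0, 1]: high returns plus long inactivity = risk."""
    recency = min(metrics["days_since_purchase"] / 365.0, 1.0)
    return 0.5 * metrics["return_rate"] + 0.5 * recency

def campaign_targets(customers, threshold=0.3):
    """Select customers whose churn risk justifies a retention campaign."""
    return sorted(c for c, m in customers.items()
                  if churn_score(m) >= threshold)

print(campaign_targets(customers))
```

In the real setup, the scores would be computed in the database by the predictive analytics engine over far more metrics, and the selected audience would be handed to customer engagement intelligence for campaign execution.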
Figure 13: Demand Signal Management Setup
[Diagram: the same SAP HANA® platform stack as in Figure 12, here ingesting company-internal data (shipments, stock, promotions, and so on), POS data, Hadoop, other apps, and market research data from other sources. The SAP Demand Signal Management application runs alongside the technical foundation of SAP Predictive Maintenance and Service, with open connectivity (SQL, MDX, R, JSON, SAS) to SAP® Predictive Analytics software (functionality from SAP Predictive Analysis and SAP InfiniteInsight®), SAP Lumira® software, and SAP Business Suite with SAP NetWeaver® Application Server for ABAP®; any device is supported. Gold frames in the original indicate particularly important components.
SQL = Structured Query Language; MDX = Multidimensional Expressions; JSON = JavaScript Object Notation; SAS = Statistical Analysis System; POS = point of sale]
SAP HANA Platform: Open, Flexible,
Integrated, Scalable, and More
A UNIFYING DATA PLATFORM
SAP HANA is SAP's strategic platform for unifying and combining relational, text, spatial, series, and graph-structured data. It provides various processing options such as SQL, SQLScript, calculation views, and libraries for business calculations and predictive analysis. As described earlier, the platform comprises memory-based stores and a disk-based data store (the dynamic tiering option). To complete the picture, the SAP HANA platform comes with several options for integration with remote data sources, supporting data integration, data transformations, data quality operations, complex event processing, and federated query execution. SAP HANA puts a broad spectrum of capabilities at your disposal to manage the challenging characteristics of Big Data on one platform.

SAP HANA Vora also underlines the unifying character of the SAP HANA platform. This in-memory query processor has been built as an extension of Apache Spark and is effectively an in-memory query engine that can make the process of data analysis more in depth and oriented around business processes. Though it can be used independently of SAP HANA, businesses can benefit from using SAP HANA Vora as a bridge between regularly updated and accessed business data kept in SAP HANA and historical and mass data stored in Hadoop data lakes.

For all data, SAP HANA serves as the central point for data access, data modeling, and system administration. Having all data in one platform is a great advantage for application developers because they can access data in SAP HANA and external sources in a uniform way, such as with SAP SQL, SQLScript, and calculation views. The built-in rule engines and business functions accelerate application development. And having all data in a unified platform also facilitates analytics for real business value.

The SAP HANA platform helps your business meet the demands of the digital economy, where every company will be a technology company. A key challenge faced by all companies across every industry is driving innovation while tracking results in real time. Unlike traditional databases, SAP HANA is ready to manage this challenge successfully by removing data silos and providing a single platform for storing transaction, operational, warehousing, machine, event, and unstructured data – one real-time system operating on one copy of data, supporting any type of business workload, all at the same time.

OPEN AND FLEXIBLE
You can build a complete lambda architecture with components of the SAP HANA platform, including the processing engines and data stores. Further, because of its openness to integrating other technologies, SAP HANA lets you go beyond lambda and cover other reference architecture principles and requirements by combining different components. This makes the platform very flexible and therefore a sustainable investment.

SAP HANA has a powerful and flexible search function that allows searching across structured and unstructured data. Analysis of sentiment can be achieved by summarizing, classifying, and investigating text content. Persistence, manipulation, and analysis of network relationships and graphs can be carried out without data duplication.
INTEGRATED AND SCALABLE
Predictive analysis is enabled through prepackaged, state-of-the-art predictive algorithms that operate on current data, as well as through open predictive and machine-learning abilities, such as by integrating an R server. Interactive planning is possible without moving data to the application server. SAP HANA allows cleansing of locally generated and imported data, which can be performed without postprocessing.

Hadoop is suited for data that can grow infinitely – that is, for unstructured and raw data and when massive scale-out is required for processing. With Hadoop you can scale out flexibly. As a good complement, SAP HANA Vora is the recommended SQL engine for high-performance analytics on data in Hadoop and Spark. To achieve high performance, SAP HANA Vora uses advanced algorithms and data structures, and just-in-time compilation of query plans into machine-executable binary code. The integration of SAP HANA Vora into the Spark execution framework has the advantage of reusing various Spark capabilities, such as the Spark API, and Spark's ability to integrate with a cluster manager such as Hadoop YARN.7 This integration makes it possible for Spark programs and Spark SQL queries to access data using SAP HANA Vora and combine it with other Spark data sources and processing modules. As mentioned earlier, while SAP HANA Vora can be used independently of the SAP HANA platform, it is optimally integrated in the SAP HANA platform to help ensure very high efficiency.

7. YARN = Yet Another Resource Negotiator.

SIMPLIFICATION AND SECURITY
Using SAP HANA as the unifying platform for all your data also simplifies system administration and software lifecycle management, thus helping to reduce cost of ownership. You can gain efficiency and agility while reducing IT expenses by running SAP HANA in a virtualized environment. And the enterprise-class high-availability and disaster-recovery features in SAP HANA are designed for continuous operation, even if failures occur.

SAP HANA provides versatile tools such as the SAP HANA studio, SAP HANA cockpit, SAP DB Control Center systems console, SAP Solution Manager, and SAP Landscape Virtualization Management software to monitor the health of your system and effectively administer your data infrastructure. You can also manage the platform lifecycle more efficiently by streamlining installations, configurations, and upgrades for your entire SAP HANA platform environment, using another rich set of tools to help simplify deployment and maintenance while reducing costs.
With SAP HANA you can set up simplified data warehousing that reduces IT workloads, enhances data modeling, and simplifies administration – particularly important for landscapes consisting of many systems and data sources. The centralized data warehouse not only consumes Big Data stored in the SAP HANA platform but also harmonizes it and combines it with centralized corporate data, thereby providing one version of the truth based on trusted data.

Last but not least, SAP HANA helps you ensure your business data is safe by providing security functions to implement specific security policies. Additionally, when you run your SAP applications on SAP HANA, they benefit from the same security foundation, and you can even add incremental protection. For example, you can control database administrators' access and integrate SAP HANA with your existing security infrastructure to help ensure your data is always secure.

CORE MANAGEMENT SYSTEM FOR BIG DATA
With SAP HANA, you benefit from intuitive, state-of-the-art tools and technologies to effectively run a mission-critical, secure environment for your most valuable data assets. At the same time, you are preparing your business to manage the challenges of the digital economy.

For all these reasons, we recommend using SAP HANA as your core data management system for all Big Data applications, including custom applications, and adding other options such as SAP HANA Vora or Hadoop when additional capabilities are required.

SAP HANA enables you to implement a complete lambda architecture from preintegrated components of the platform, including the processing engines and data stores.
Find Out More
Here are sources for more information on the ways SAP can help you manage and get the most from Big Data:
•• SAP HANA platform: http://go.sap.com/solution/in-memory-platform.html
•• Administration and IT operations for SAP HANA: http://hana.sap.com/capabilities/admin-ops.html
•• SAP HANA and Apache Hadoop: www.sap.com/solution/big-data/software/hadoop/index.html
•• SAP HANA Vora: http://go.sap.com/product/data-mgmt/hana-vora-hadoop.html
•• SAP IQ for logical Big Data warehousing (OLAP): www.sap.com/iq
•• SAP SQL Anywhere for designing embedded database applications for mobile and remote environments: www.sap.com/pc/tech/database/software/sybase-sql-anywhere/index.html
•• SAP Data Services for all types of data integration, including smart data access, smart data integration, and smart data quality: www.sap.com/pc/tech/enterprise-information-management/software/data-services/index.html
•• SAP HANA smart data streaming: http://help.sap.com/hana_options_sds
•• SAP Event Stream Processor – a complex event processing platform for event-driven analytics: www.sap.com/pc/tech/database/software/sybase-complex-event-processing/index.html
•• Advanced analytics with SAP HANA for in-memory processing for text, spatial, graph, and predictive analysis: http://hana.sap.com/abouthana/hana-features/processing-capabilities.html
•• The Data Science organization at SAP, with experts who help implement predictive analytics and custom apps: www.sap.com/solution/big-data/software/data-science/index.html
•• SAP Predictive Maintenance and Service, cloud edition: http://help.sap.com/pdm-od
Studio SAP | 42065enUS (16/03) © 2016 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the
express permission of SAP SE or an SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are
trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for
additional trademark information and notices. Some software products marketed by SAP SE and its
distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only,
without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for
errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company
products and services are those that are set forth in the express warranty statements accompanying such
products and services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business
outlined in this document or any related presentation, or to develop or release any functionality mentioned
therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy
and possible future developments, products, and/or platform directions and functionality are all subject
to change and may be changed by SAP SE or its affiliated companies at any time for any reason without
notice. The information in this document is not a commitment, promise, or legal obligation to deliver
any material, code, or functionality. All forward-looking statements are subject to various risks and
uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned
not to place undue reliance on these forward-looking statements, which speak only as of their dates, and
they should not be relied upon in making purchasing decisions.

CIO Guide to Using SAP HANA Platform For Big Data

  • 1. ©2016SAPSEoranSAPaffiliatecompany.Allrightsreserved. CIO Guide to Using the SAP HANA® Platform for Big Data February 2016 Table of Contents 2 Executive Summary 3 Introduction 4 Challenges and Opportunities of Big Data 7 Big Data Reference Architecture 10 SAP HANA Platform for Handling Big Data 17 Big Data Scenarios and Data Flows 23 Big Data Use Cases 28 SAP HANA Platform: Open, Flexible, Integrated, Scalable, and More 31 Find Out More DISCLAIMER This document outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP’s strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non- infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent. See http://global.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices.
  • 2. 2 / 31 © 2016 SAP SE or an SAP affiliate company. All rights reserved. Executive Summary Big Data is often characterized by the three Vs: volume, velocity, and variety.These characteristics pose great challenges for conventional disk-based relational databases.The different categories of data require different storage capabilities and performance, which involve different costs. The SAP HANA platform offers several types of data storage and processing engines. Online transaction processing (OLTP) and online analyt- ical processing (OLAP) applications can now easily run in one system, on one database. In- memory stores in SAP HANA are recommended for high-value data (hot data) that must be accessed and processed with extreme speed, for data that is frequently changed, and when you need the platform’s native features. Customers typically use in-memory stores for (compressed) data volumes up to several terabytes.1 The dynamic tiering option extends the SAP HANA database with disk-based columnar tables, based on SAP® IQ software technology.This option is rec- ommended for storing big sets of structured data, when high OLAP performance and deep integra- tion into SAP HANA is important, and when the processing features of Structured Query Language (SQL) are sufficient. Dynamic tiering can be used, for example, for lower-value data for which in- memory performance is not required (warm data). It can manage data volumes from several hundred terabytes to petabytes. Hadoop is suited for raw data that can grow infi- nitely, for unstructured and semistructured data, and when massive scale-out is required for pro- cessing. With Hadoop you can flexibly scale out with low initial cost. Hadoop is also suited for data from business applications that is no longer required (cold data). The SAP HANA Vora™ engine is the recommended SQL engine for high-performance analytics on structured data in Hadoop. It enables optimized access to data in SAP HANA from Hadoop or Spark. 
For all data, SAP HANA serves as the central point for data access, data modeling, and system administration. Due to its openness, the SAP HANA platform can be extended with non-SAP technology depending on the require- ments. This flexibility makes the platform a sus- tainable investment. By streamlining system administration and software lifecycle manage- ment, SAP HANA enables CIOs to simplify their system landscape and significantly reduce cost of ownership. This guide supports CIOs in setting up a system infrastructure for their business that can get the best out of Big Data. We describe what the SAP HANA® platform can do and how it integrates with Hadoop and related technologies. We examine typical data flows where different technologies interplay, looking at data lifecycle management and data streaming. Concrete use cases point out the requirements associated with Big Data as well as the opportunities it offers, and how companies are already taking advantage of them. 1. In SAP HANA, a volume on the order of terabytes results after compression. Therefore, this actually equates to much larger data volumes in conventional systems.
  • 3. 3 / 31 © 2016 SAP SE or an SAP affiliate company. All rights reserved. Introduction Consider, for example, an application that uses SAP HANA to store its business data and has to manage additional data coming from devices. Assume that the device data is structured, the total expected volume is in the gigabytes range, and the data is processed with SQL. Since such data can be efficiently managed in the SAP HANA database, a solution architecture that requires an additional Hadoop cluster would add unneces- sary complexity and cost of operations. The decision for a particular storage option or processing engine depends on several dimen- sions, including performance, data volume, cost, features, and operational readiness. This guide helps CIOs understand these dimensions and make the right decision for their business. The guide is structured in the following sections: •• The “Challenges and Opportunities of Big Data” section describes briefly what makes Big Data challenging yet, at the same time, a source of benefit for today’s businesses. •• “Big Data Reference Architecture” explains the lambda architecture, which is one of the current reference architectures for Big Data. •• “SAP HANA Platform for Handling Big Data” gives an overview of the SAP HANA platform and its options for managing Big Data. •• “Big Data Scenarios and Data Flows”explains typical Big Data scenarios and data flows and how they influence the Big Data infrastructure setup. •• “Big Data Use Cases” describes two real-world use cases. •• “SAP HANA Platform: Open, Flexible, Integrated, Scalable, and More”summarizes the key charac- teristics of the SAP HANA platform that make it a sustainable investment. •• “Find Out More” provides links to further information. Technology has the ability to shape our world. Big Data is one of the most important technology trends that will impact our world between now and 2020. 
Arguably, Big Data is an artificial category created by technology vendors as a convenient way to reference certain new tools (the value of which is undeniable).The term is broadly used to refer to large or complex data sets that traditional data processing applications are not able to manage. In the past five years we have created more data than all prior years combined, and having all this new data is making business operations much more complex. In addition, a lot of data does not automatically equal a lot of useful information. An effective Big Data infrastructure should be able to separate the background noise from the valuable signals that can be translated to actionable insights. There are many choices when it comes to design- ing and setting up a suitable Big Data system infrastructure, but there has been little guidance regarding the best approach for both exploiting the potential of Big Data and meeting enterprise- specific requirements. This document provides orientation for CIOs so they can choose the right storage and processing options for a given Big Data application and understand the impact and consequences of their decision. It helps to answer questions about what type of data should be stored in in-memory tables, in dynamic tier- ing, and in Hadoop, as well as which processing engine should be used for a given task. Making the wrong decision can lead to unneces- sarily complex and expensive solutions that do not meet the business requirements of Big Data.
  • 4. 4 / 31 © 2016 SAP SE or an SAP affiliate company. All rights reserved. Challenges and Opportunities of Big Data There has been an explosion of new technologies, new data types and sources, and new ways of using existing technology. From mobile and cloud computing to social media and sentiment to machine, log, and sensor data, data science is wringing value from massive amounts of struc- tured and unstructured data, and high-velocity streaming data is providing insight and driving decisions like never before. Possessing data does not add value in and of itself, but being able to use it to make timely, meaningful decisions that impact business is enormously valuable. Before Big Data can be monetized and turned into a strategic asset, however, today’s CIOs need to consider several things.To start, the expectations of the business should be matched by the most suitable technology (see Figure 1). In today’s networked economy, characterized by hyperconnectivity or instant connection to the business network, we can confidently anticipate that data quantities will continue to grow at high rates. At the same time, this data will be of differ- ent types – structured and unstructured, and from low value to high value. In the last decade, companies like Google, Facebook, and Netflix have led the way in collect- ing and monetizing huge amounts of data gener- ated by consumers’ everyday activity. They look on this data as a strategic asset – every decision in their organizations is data driven, as is every product they sell. This has created enormous interest among traditional enterprises, which can easily see the benefit of putting their data to work in the same way. Big Data is often characterized by the three Vs: volume, velocity, and variety. These characteristics pose great challenges for conventional disk-based relational databases.
  • 5. 5 / 31 © 2016 SAP SE or an SAP affiliate company. All rights reserved. cost of operations and at the same time preserve existing investments in technology, staff, and training. To manage Big Data well, enterprises must have people with dedicated skills to exploit the opportu- nities that it offers.The main task of data analysts or data scientists is to separate the background noise from the meaningful signals to enable solid decision making and take appropriate actions. SAP HANA is already helping businesses to unlock this valuable information by addressing at least one very important aspect: the ability to perform real-time analytics on very large data sets. Not only data scientists but also managers and execu- tives can now get insight into their current state of affairs at any time and at“the speed of thought.” The left side of Figure 1 shows common expectations regarding Big Data that require consideration of: •• New digital channels •• Possible correlations between transactional and analytics data from the enterprise with data from other sources (for example, weather data or social media) that might be meaningful •• The ability to create simulations and sophisti- cated data visualizations that depict insights from data in new, more compelling ways On the technology side, CIOs should get familiar with the available technologies and their capabili- ties in order to make the right decisions to build a system infrastructure that satisfies business expectations without disregarding costs. For example, a unifying platform can reduce total Figure 1: Matching Business Expectations with Technology, Solutions, and Skills What are the expectations of the following for Big Data? • Marketing analytics • Sales analytics • Operational analytics • Financial analytics Business expectations • Which technology should we consider? • Is a unifying platform possible? • Which skill sets will we need? Technology, solutions, and skills
CHARACTERISTICS OF BIG DATA
The most conspicuous and technically measurable characteristics of Big Data are often referred to as the three Vs: volume, velocity, and variety (see Figure 2).2

Not only is the volume of data large, but it is arriving ever more rapidly (velocity) – consider machine data generated on the factory floor, or algorithmic trading data generated by financial markets. The data is also of many different types (variety) – from comments about products on Twitter or Facebook and logs of customer behavior on a company Web site to sensor and weather data.

A general architecture pattern has been postulated to account for exploding data volumes, fast-growing data quantities that require processing and storage, and increasing data variety. This pattern is called the lambda architecture, which we discuss next.

2. "Value" and "veracity" are further dimensions that are considered to be characteristic Vs of Big Data. Value refers to the ROI the data is able to provide to the business. ROI accounts for both the potential benefit that can be derived from the data and the cost of storage media. Veracity refers to the ability to trust the data used to make decisions (which, ultimately, is required of all data).

Figure 2: Characteristics and Opportunities of Big Data
Characteristics: exploding data volume, accelerated data processing velocity, increasing data variety. Opportunities: deep data insights, broad data scope, ability to access the most recent data, answers in real time, no delays due to constant data preparation, predictive capabilities.
Big Data Reference Architecture

The techniques that have been used to deal with Big Data as a strategic asset are based on technologies that allow the collection, storage, and processing of very large amounts of data at a low economic cost. Initially, some of these technologies (for example, Hadoop) were primarily used for batch workloads. However, in the last several years other technologies (for example, Spark) have emerged that enable both batch and real-time processing to work in parallel in the same infrastructure. It is precisely this combination of capabilities that is the foremost requirement of Big Data architectures. The lambda architecture describes how this requirement can be achieved. The Big Data community regards it as an important reference architecture for Big Data (although it has pros and cons). It is particularly well suited for predictive analytics, where patterns are identified within a historical data set on a regular basis and incoming records are checked in real time to see if they correspond to these patterns.

The most important characteristics of a lambda architecture are:
• Fault tolerance: ability to satisfy requirements in spite of failures (if a fault occurs, no information gets lost, since the incoming data can be recomputed from the append-only master data set)
• Scalability: flexibility to accommodate data growth
• Low latency for reads and writes: minimum time delay of system responses
• Real-time results: quick return of results despite data load

Lambda architecture is defined by three functions: real-time processing, batch processing, and query processing. Figure 3 shows how these functions interact. Additionally, the master data set ("all data" in the figure) is being continuously updated by appending new data. To compensate for the latency of batch processing, three architectural layers are defined: the batch layer, the serving layer, and the speed layer.

Figure 3: Lambda Architecture
(1) New data is appended to the master data set ("all data") in the batch layer and simultaneously routed to real-time processing in the speed layer, which updates the real-time views. (2) Batch processing overwrites the batch views in the serving layer. (3) Query processing combines batch views and real-time views into the query result.
Incoming new data is routed into both the batch layer and the speed layer. In the batch layer, the incoming data is appended to the master data set – for example, in a Hadoop Distributed File System (HDFS). Batch jobs are used to read the master data set and produce precomputed and preaggregated results called batch views. The potentially long-running batch jobs are continuously reexecuted to overwrite the batch views with newer versions as soon as they are available. The batch views are loaded into a read-only data store in the serving layer to support fast and random reads. Real-time views are not copied because an ad hoc creation of the view is desired; every query should be answered with the newest possible data. Therefore, the real-time view is part of the real-time layer and not reproduced in the query layer.

In the lambda architecture pattern, incoming queries are able to act on a combination of results from the batch and real-time views. Consequently, databases typically index the batch view to allow for ad hoc queries and have one component to merge both real-time and batch views.

Fault tolerance is achieved by maintaining an immutable, append-only master data set in which all incoming data is stored. If a fault occurs in any processing step, everything can be recomputed from the master data set. The continuously growing data is stored in HDFS, for example, as scalable and reliable distributed storage. The data in an HDFS environment is traditionally processed with MapReduce batch jobs.

For all data, SAP HANA serves as the central point for data access, data modeling, and system administration.
The serving layer enables fast queries in the batch views, but these views only contain information from the time before the producing batch jobs were started. To compensate for this time gap, all new data is fed into the speed layer in parallel. The speed layer processes new data in real time with stream-processing technology (such as Apache Storm or Spark Streaming). The results are used to incrementally update real-time views in a database supporting fast random reads and writes (for example, by using Apache Cassandra3). The speed layer needs to approximate the results because it has only current information and no historical data. As explained, real-time views and batch views are merged in the serving layer when a query is executed.

Recently, the lambda architecture has been criticized because it creates a complex landscape with many different components that are loosely integrated, and it requires implementing different code bases on different technology stacks in the batch and speed layers. At the same time, the implemented functions must be maintained and kept synchronized so that they produce the same results from both layers. The effort involved is huge because of the complexity of the distributed systems.

But in spite of these criticisms, lambda architecture continues to be a valid way to handle Big Data. It is ultimately a matter of making lambda requirements more manageable. SAP HANA enables you to implement a complete lambda architecture from preintegrated components of the platform, including the processing engines and data stores. SAP HANA even goes beyond lambda since it comes with options for integration with remote data sources and supports data integration and replication, data transformations, data quality operations, event stream processing, and federated query execution. Moreover, SAP HANA offers the advantages of an integrated data platform – simpler installation, administration, lifecycle management, and development – because all three layers of the lambda architecture can run within one system and all persisted data can be accessed within the same database.

3. Apache Cassandra is an open-source distributed database management system initially developed at Facebook to deal with its in-box search feature. For details, see the DataStax Web site, "About Apache Cassandra."
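The interplay of the three layers can be illustrated with a minimal, technology-neutral sketch. This is plain Python; all names are illustrative and do not correspond to any SAP or Hadoop API:

```python
# Minimal lambda-architecture sketch: an immutable, append-only master
# data set, a recomputable batch view, an incrementally updated
# real-time view, and a query that merges both views.

master_data = []     # batch layer: append-only master data set
batch_view = {}      # serving layer: precomputed aggregates per sensor
realtime_view = {}   # speed layer: incremental aggregates since the last batch run

def ingest(record):
    """New data is routed to BOTH the batch layer and the speed layer."""
    master_data.append(record)  # never updated in place, only appended
    key = record["sensor"]
    realtime_view[key] = realtime_view.get(key, 0) + record["value"]

def run_batch():
    """Recompute the batch view from the full master data set.
    Simplification: the real-time view is cleared afterwards, since its
    contents are now covered by the fresh batch view."""
    batch_view.clear()
    for record in master_data:
        key = record["sensor"]
        batch_view[key] = batch_view.get(key, 0) + record["value"]
    realtime_view.clear()

def query(key):
    """Answer with the newest possible data by merging both views."""
    return batch_view.get(key, 0) + realtime_view.get(key, 0)

ingest({"sensor": "s1", "value": 10})
ingest({"sensor": "s1", "value": 5})
run_batch()                           # batch view now holds 15 for s1
ingest({"sensor": "s1", "value": 2})  # arrives after the batch run
result = query("s1")                  # 15 (batch) + 2 (real-time) = 17
```

The sketch also shows why fault tolerance holds: `run_batch()` can always rebuild the batch view from `master_data`, so a failure in any derived view loses no information.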
SAP HANA Platform for Handling Big Data

As mentioned earlier, more and more software solutions require capabilities to manage and process Big Data holistically, independent of whether the source is a machine sensor or social media. Big Data typically needs to be combined with traditional business data created by enterprise applications. SAP HANA is the SAP strategic platform for unifying and combining all this data. It is ideal for central data management for all applications because it is open and capable of handling not only transactional but also analytics workloads, all on one platform. As described in this section, integration capabilities in SAP HANA make it possible to combine it with other technologies (such as Hadoop and members of its family) to obtain the most suitable and effective Big Data landscape.

The SAP HANA platform includes application services, database services, and integration services (see Figure 4). Since database and integration services are most relevant for Big Data infrastructures, we discuss them first.

DATABASE SERVICES
Database transactions are processed reliably because the system is compliant with ACID4 and SQL standards. The database services are accessible through Java Database Connectivity (JDBC), Open Database Connectivity (ODBC), JavaScript Object Notation (JSON), and Open Data Protocol (OData). Ultimately, this means that SAP HANA is both standards based and open for connection through commonly used application programming interfaces (APIs) and protocols, which facilitates its adoption and adaptation to existing infrastructures.

4. Atomicity, consistency, isolation, and durability.

Figure 4: Overview of the SAP HANA Platform
SAP (such as the SAP S/4HANA suite), ISV, and custom applications on all devices run on the SAP HANA platform, which provides application services, database services, and integration services. (ISV = independent software vendor)
The database services encompass foundation and processing capabilities. The foundation consists of functionality that turns data into real-time information, with no sophisticated tuning required for complex and ad hoc queries (see Figure 5). OLTP and OLAP can run on a single copy of data in the same system because data is in-memory, and the columnar store in SAP HANA can handle both types of workloads with high performance. In mixed environments with both OLAP and OLTP operations, SAP HANA deals with a blend of hundreds to tens of thousands of concurrent statements, from simple, short-lived, and high-priority transactions to deeply complex, long-running analytical queries that consume a lot of resources from the host. A dedicated workload manager helps ensure that this is done effectively by controlling parallel processing and prioritization of processing activities.

Because SAP HANA supports multiple tenant databases that can run in one instance of SAP HANA, it is cloud ready and allows secure, efficient management of a shared infrastructure. Moreover, SAP HANA maintains a strong separation of data, resources, and users among tenant databases. Multiple databases can be managed as a unit. Additionally, SAP HANA provides the flexibility to allocate memory or CPU to each tenant database.

Figure 5: Database Services – Foundation
In-memory ACID columnar store, multicore and parallelization, advanced compression, multitenant database containers, and dynamic tiering. (ACID = atomicity, consistency, isolation, and durability)
Due to its advanced compression capabilities, SAP HANA can support scale-out deployments of several terabytes – it is not confined by the size of memory. Assuming a data compression factor of 7 and that approximately 50% of the memory is used for query processing, a 3 TB single-node system, for example, could store a database with an uncompressed size of 10 TB. Scale-out systems with much more memory are also possible, and there exist productive scale-out systems for databases with an uncompressed size in the range of hundreds of terabytes. The largest certified scale-out hardware configuration for SAP HANA has 94 cluster nodes with 3 TB of memory each.5 This extreme certified scale shows that SAP HANA is technically able to store Big Data in-memory, even though this is not often done in practice.

In addition, warm data can be stored on disk in a columnar format and accessed transparently. This is the dynamic tiering option, which extends SAP HANA with a disk-based columnar store, also called an extended store. It runs in the extended store server, which is an integration of SAP IQ into SAP HANA. The extended store server can manage up to petabytes of data and is optimized for fast execution of complex analytical queries on very big tables.

5. See the certified hardware directory for SAP HANA at http://global.sap.com/community/ebook/2014-09-02-hana-hardware/enEN/index.html.
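The sizing rule of thumb above can be written as a small back-of-the-envelope calculation. The function name and the 50% working-memory share are taken from the text; treat the result as a rough estimate, not a sizing guarantee:

```python
# Back-of-the-envelope capacity estimate following the text's rule of
# thumb: usable uncompressed capacity = RAM * (1 - working share) * compression.

def uncompressed_capacity_tb(ram_tb, compression_factor=7.0, work_share=0.5):
    """Uncompressed data volume a system can hold, assuming a given
    compression factor and share of memory reserved for query processing."""
    return ram_tb * (1.0 - work_share) * compression_factor

# A single 3 TB node: 3 * 0.5 * 7 = 10.5 TB uncompressed (roughly the
# 10 TB figure cited in the text).
single_node = uncompressed_capacity_tb(3)

# The largest certified scale-out configuration: 94 nodes * 3 TB each.
cluster = uncompressed_capacity_tb(94 * 3)
```

Under these assumptions the 94-node cluster could hold just under a petabyte of uncompressed data, which is consistent with the "hundreds of terabytes" range mentioned above.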
The processing capabilities of the database services (see Figure 6) allow running applications with almost any data characteristics in the same system. These capabilities include:
• Analysis of sentiment by summarizing, classifying, and investigating text content
• The ability to search across structured and unstructured data
• Persistence, manipulation, and analysis of network relationships and graphs without data duplication
• Built-in business rules and functions that accelerate application development
• Prepackaged predictive algorithms that operate on current data, as well as open predictive and machine-learning abilities – for example, through the integration of an R server
• Interactive planning without the need to move data to the application server
• Cleansing of locally generated and imported data without the need for postprocessing

Particularly in the context of the Internet of Things, the capabilities of SAP HANA facilitate handling of series data, mostly consisting of successive events collected over a predefined time interval.

Figure 6: Database Services – Processing Capabilities
Spatial, predictive, search, text analytics, text mining, graph, data quality, series data, and function libraries.
INTEGRATION SERVICES
Integration services play a strategic role in handling Big Data coming from any source and using it to provide a complete view of the business (see Figure 7).

Figure 7: Integration Services of the SAP HANA Platform
Smart data access (federation with IBM DB2, Netezza, Oracle, MS SQL Server, Teradata, SAP HANA, SAP ASE, SAP IQ, and Hive), smart data integration (loading from IBM DB2, Oracle, MS SQL Server, Twitter, Hive, OData, and custom adapters), smart data streaming, remote data sync, and Hadoop integration (SAP HANA Vora, Spark, Hive, Hadoop). (SAP ASE = SAP Adaptive Server Enterprise)
The integration services of SAP HANA allow access to information in different data sources. These services enable replication and movement of almost any type of data in near-real time:
• Smart data access (SDA) enables remote query execution, also known as data virtualization. By means of virtual tables, data in remote systems is made available to queries executed in the SAP HANA platform.
• Smart data integration (SDI) can be used for both batch and real-time data provisioning from a variety of remote data sources. Aside from supporting additional kinds of data sources, SDI also includes an adapter software development kit (SDK) for customers and partners. SDI partially builds on SDA – for example, it uses the concept of remote source systems and virtual tables. It also plugs into the SDA federation framework for remote query execution.
• Smart data streaming (SDS) enables capture and analysis of live data streams and routes them to the appropriate storage or dashboard. It includes a "streaming lite" deployment option that can run on small devices at the edge (for example, on ARM-based Linux devices). It can capture and preprocess data from sensors and machines and send results to a central core or cloud-based SDS.
• Remote data sync can be used for bidirectional data synchronization with embedded SAP SQL Anywhere® solutions and UltraLite databases running on devices. This is particularly indicated for data synchronization across high-latency or intermittent networks.
• Hadoop integration involves multiple access points from SAP HANA to Hadoop data (through Spark, Hive, HDFS, and MapReduce). Furthermore, SAP HANA Vora is an in-memory, column-based SQL engine that complements the SAP HANA platform with a native SAP query processing engine on Hadoop. It is designed for high-performance queries on Big Data volumes in large distributed clusters. SAP HANA Vora is integrated into the Spark computing framework.

Along with the database services, the integration services enable the SAP HANA platform to handle Big Data regardless of its characteristics (volume, velocity, and variety). In addition, the services offer openness and connectivity with virtually any technology, and the most important tools and toolkits are already closely integrated into the SAP HANA platform environment.
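The core idea behind data virtualization through virtual tables can be sketched in a few lines. This is a concept illustration only; the class and method names are invented for the sketch and are not SAP APIs:

```python
# Concept sketch of data virtualization (as in smart data access): a
# virtual table holds no data of its own; every query is forwarded to
# the remote source at call time, so results always reflect the remote
# system's current state. All names here are illustrative.

class RemoteSource:
    """Stands in for a remote system such as Hive, SAP IQ, or Oracle."""
    def __init__(self, rows):
        self.rows = rows
    def execute(self, predicate):
        # The remote system evaluates the predicate and returns rows.
        return [r for r in self.rows if predicate(r)]

class VirtualTable:
    """Local handle to remote data; no rows are copied or cached."""
    def __init__(self, source):
        self.source = source
    def select(self, predicate):
        return self.source.execute(predicate)  # federated execution

hive = RemoteSource([{"id": 1, "region": "EMEA"},
                     {"id": 2, "region": "APJ"}])
vt = VirtualTable(hive)
emea = vt.select(lambda r: r["region"] == "EMEA")
```

The design point the sketch makes is that the local platform pays no storage cost for the virtual table and never serves stale data, at the price of a remote round trip per query.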
APPLICATION SERVICES
Big Data is of limited value unless you can operationalize the insight through applications and business processes. New business processes may even be created by enhancing existing applications or developing new ones. The application services of SAP HANA support Big Data management by enabling all kinds of applications to create new or use existing data stored in the platform. Moreover, when building Big Data applications, these services help to simplify the landscape by running applications on the same platform. The application services deliver a first-class user experience on any device through the SAP Fiori® user experience (UX) technology. In addition, they support open development standards such as those for HTML5, JSON, and JavaScript. Built-in tools are included that support development, version control, bundling, transport, and installation of applications (see Figure 8).

The constituent services of the SAP HANA platform enable you to deal with the challenges and requirements associated with Big Data. The database foundation services support you when dealing with structured (OLTP and OLAP) and unstructured data from any source, and support high performance on one platform with different storage options. The processing capabilities facilitate handling data of almost any type, allowing sentiment and predictive analysis, among other types. The integration services facilitate connection to virtually any data source and cover the requirements of lambda and other architecture principles that may be related to Big Data handling. And finally, the application services provide a state-of-the-art look and feel for all applications running on SAP HANA and help simplify system landscapes by letting you run applications on the same platform.

Figure 8: Application Services of the SAP HANA Platform
A Web server, the SAP Fiori® user experience (UX), and application lifecycle management, with access through SQL, JSON, ADO.NET, J/ODBC, OData, HTML5, MDX, and XML/A. (SQL = Structured Query Language; JSON = JavaScript Object Notation; ODBC = Open Database Connectivity; OData = Open Data Protocol; MDX = Multidimensional Expressions)
Big Data Scenarios and Data Flows

Having introduced the elements of the SAP HANA platform, it is now time to show how they interact and enable you to deal best with your Big Data requirements. In this section we examine typical Big Data scenarios and the associated data flows. This will help you better understand the characteristics of the different storage and processing options. We will discuss two main scenarios:
• How incoming external data is stored and processed
• How the lifecycle of data created by enterprise applications is handled (with the associated movement of aged data between different storage tiers)

As mentioned in the previous section, the SAP HANA platform comprises memory-based and disk-based data stores. The platform supports relational data, text data, spatial data, series data, and graph-structured data and provides various processing options such as SQL, SQLScript, calculation views, and libraries for business calculations as well as predictive analysis. From the perspective of data flows, there are several typical patterns, some of which are explained in Figure 9.
Figure 9: Typical Data Flows
External data sources feed smart data streaming (A, B) or data collection technologies such as Kafka and Flume (D); raw data is stored in HDFS (C1) or in dynamic tiering (C2) and preprocessed offline with MapReduce (E); SAP HANA accesses Hadoop data through smart data access, vUDFs, SAP HANA Vora, Spark SQL, and Hive (F, G, H1, H2); enterprise applications, analytics tools, data discovery tools, SAP HANA data warehousing, and other clients of SAP HANA consume the results (I, J). (vUDF = virtual user-defined function; SQL = Structured Query Language; HDFS = Hadoop Distributed File System)
(A) Incoming streams of raw data can be filtered, analyzed, cleansed, and preaggregated in real time with smart data streaming in SAP HANA. Such preprocessing can, for example, be used for correcting outliers and missing values in streams of sensor data, or for detecting patterns that indicate alert conditions.

(B) Smart data streaming creates preprocessed and condensed high-value data, which is stored in in-memory tables (real-time views), for example, for real-time analytics and for combining with data created by enterprise applications on SAP HANA. Smart data streaming is capable of handling high volumes of rapidly incoming data of various types. Steps A and B correspond to the speed layer in the lambda architecture.

(C) In parallel, the incoming raw data is appended to a huge data store (C1) where all data is collected. Such massively growing data stores are often called data lakes. Data collected this way can be utilized for later analysis and information extraction, potentially using the complete data set, including historical data. HDFS is a scalable, fault-tolerant, and comparatively inexpensive data store, which makes it a suitable choice for storing this constantly growing set of incoming raw data. Stored in HDFS, the data can be further processed with a variety of data technologies from the Hadoop family. It can also be queried and analyzed with SAP HANA Vora. It should be mentioned, however, that raw data need not necessarily be stored in HDFS in all cases. If the raw data is structured, and if SQL is sufficient for processing, it can also be stored in the dynamic tiering extended store (C2).

(D) Smart data streaming is the preferred option in an SAP HANA platform environment. This is because it has some unique processing features and reduces the cost of development and operation with powerful development and administration tools, as well as integration into the SAP HANA platform. But if real-time processing of incoming data is not required, the raw data can be directly stored in HDFS. Data routing is then managed through data collection technologies such as Kafka and Flume as alternatives to smart data streaming.

(E) The raw data is preprocessed on the Hadoop side by means of batch jobs, such as information extraction, transformations, filtering, data cleansing, preaggregation, or analysis of multimedia content. SAP HANA is also capable of executing the corresponding MapReduce jobs. But depending on the selected setup, this kind of batch processing (the batch layer of the lambda architecture) can be executed on the Hadoop cluster outside the control of SAP HANA, using the various processing engines of the Hadoop family. The result of the preprocessing is again stored in HDFS and can be accessed from SAP HANA (see step F).
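The kind of stream preprocessing described in step (A) can be sketched in plain Python. The thresholds, the valid range, and the window size are illustrative policy choices, not smart data streaming defaults:

```python
# Sketch of speed-layer preprocessing for sensor readings, as in step (A):
# carry forward missing values, clamp outliers into a valid range, and
# raise an alert when a short moving average exceeds a threshold.

ALERT_THRESHOLD = 90.0     # illustrative alert condition
VALID_RANGE = (0.0, 120.0) # illustrative physical sensor range

def preprocess(stream, window=3):
    """Yield cleansed readings plus an 'alert' flag when the moving
    average of the last `window` readings exceeds the threshold."""
    recent = []
    last_good = 0.0
    for value in stream:
        if value is None:              # missing value: repeat last good reading
            value = last_good
        lo, hi = VALID_RANGE
        value = min(max(value, lo), hi)  # clamp outliers into the valid range
        last_good = value
        recent.append(value)
        if len(recent) > window:
            recent.pop(0)
        avg = sum(recent) / len(recent)
        yield {"value": value, "alert": avg > ALERT_THRESHOLD}

readings = [80.0, None, 150.0, 100.0, 95.0]
cleansed = list(preprocess(readings))
# The None reading is replaced by 80.0, the 150.0 outlier is clamped to
# 120.0, and the moving average then trips the alert flag.
```

Only the condensed output of such a preprocessor (step B) needs to land in the in-memory real-time views, while the raw readings continue unchanged into the data lake (step C).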
(F) Structured data can be directly read from HDFS files into SAP HANA. The virtual tables of smart data access can be used to execute federated queries against data in SAP HANA Vora, Spark SQL, or Hive. Virtual user-defined functions6 (vUDFs) that trigger MapReduce jobs can be used if application-specific processing is required that cannot be done with SQL on Hadoop. Whenever the vUDF is called, the MapReduce job is executed on the Hadoop cluster and the result is returned to SAP HANA. Because this option is slower, results can be cached on Hadoop to improve performance if the underlying data changes infrequently.

(G) High-value data created as the result of preprocessing within Hadoop (by batch jobs, vUDFs, or federated queries) can be stored in in-memory tables in SAP HANA for low-latency access, for efficient processing with the database's native engines, and for combining it efficiently with other data in SAP HANA. Such derived high-value data can be pulled by SAP HANA (with vUDF calls and remote queries), or it can be pushed to SAP HANA from Hadoop – for example, over a JDBC connection. Storing the derived data in in-memory tables in SAP HANA has the advantage that the data can be accessed with very low latency; but depending on the use case, some mechanism for refreshing the data may need to be implemented. The alternative is to not persist the results in tables but to execute the remote queries or vUDFs each time the data is accessed by the application. The decision depends on how frequently the data is accessed, how fresh the data needs to be, and what access latency can be tolerated. Therefore, the option should be determined based on the specific use case.

(H) Data discovery tools can consume data from SAP HANA Vora through SAP HANA using smart data access (H1). It is also possible to connect such tools to SAP HANA Vora through Spark SQL for data discovery and visualization (H2), independently of the server for the SAP HANA database. That way, tools such as SAP Lumira® software can be used with SAP HANA Vora to analyze and visualize data in Hadoop. However, a decisive advantage of using SAP HANA is that its adapter for Spark SQL enables data analysts to combine data in SAP HANA Vora with business data in the SAP HANA database. With data virtualization through SAP HANA, any application or tool enabled for SAP HANA is automatically enabled for SAP HANA Vora, since the existing interfaces and connectivity are used.

6. To deliver and run MapReduce jobs using SAP HANA, developers define a table-valued virtual user-defined function (vUDF). SAP HANA applications can use vUDF calls in the FROM clause of an SQL query, like a table or view. The vUDF is associated with a MapReduce Java program, which is shipped together with the function definition as SAP HANA database content.
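The persist-versus-query trade-off in step (G) amounts to a caching decision, which the following sketch makes concrete. All names are illustrative; the `fetch` callable stands in for a remote query or vUDF call:

```python
# Sketch of the step (G) trade-off: either keep a local copy of a remote
# result and refresh it when it gets too old, or re-execute the remote
# query on every access. `max_age_s` encodes the tolerated staleness.

import time

class RemoteResultCache:
    """Caches one remote query result; re-executes the fetch only when
    the cached copy is older than `max_age_s`, trading data freshness
    against access latency and remote load."""
    def __init__(self, fetch, max_age_s, clock=time.monotonic):
        self.fetch, self.max_age_s, self.clock = fetch, max_age_s, clock
        self.result, self.fetched_at = None, None

    def get(self):
        now = self.clock()
        if self.fetched_at is None or now - self.fetched_at > self.max_age_s:
            self.result = self.fetch()  # e.g. a vUDF call or federated query
            self.fetched_at = now
        return self.result

calls = []  # records how often the "remote" fetch actually runs
cache = RemoteResultCache(lambda: calls.append(1) or len(calls),
                          max_age_s=60,
                          clock=lambda: 0)  # frozen clock for the demo
first, second = cache.get(), cache.get()    # second call is served from cache
```

Setting `max_age_s=0` degenerates to the "always execute remotely" option, while a large value approximates a persisted table with periodic refresh, which matches the decision criteria listed above (access frequency, required freshness, tolerable latency).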
(I) A business application on top of SAP HANA may also create data. Initially, the data is of high value and therefore stored in in-memory tables, where it can be accessed with very low latency. After some time, the data may become less relevant and is moved either to disk-based tables in the extended store (dynamic tiering) or to HDFS. In dynamic tiering, the data can continue to be managed with tools in SAP HANA and queried with high performance, and selective updates are still supported. In contrast, data in HDFS can no longer be changed, but applications that run on the SAP HANA database can still query it through virtual tables when required.

(J) Data warehousing based on the SAP HANA platform, which includes the SAP Business Warehouse application and native modeling tools in SAP HANA, can easily consume Big Data through the platform, and can make the warehousing process simpler and more agile, virtual, and comprehensive. Particularly in landscapes consisting of many systems and data sources, a centralized data warehouse supports the combination of Big Data stored in the platform with centralized corporate data to achieve new insights. Data warehousing using SAP HANA creates a central place where one version of the truth is available, based on trusted, harmonized data.

As we have said, raw external data may come from a variety of sources – sensors and machines, social media content, e-mails, Web content, Web logs, security logs, text documents, multimedia files, and more. The data may be collected, analyzed, and aggregated in Hadoop, and the extracted high-value (hot) data is moved to in-memory tables in SAP HANA. This case is depicted in Figure 10 (from right to left) as the blue arrow that turns red.

Figure 10: Data Flows in Combined Data Processing Technologies
An OLTP system feeds the in-memory storage in SAP HANA, while sensors, social media, and similar sources feed Hadoop (HDFS); data moves between in-memory storage, dynamic tiering (extended storage), and Hadoop as its temperature changes, and analytics tools (such as SAP Lumira® software) consume the aggregated data. (OLTP = online transaction processing)
Figure 10 also shows that another typical data flow goes in the opposite direction (from left to right). In this scenario, the high-value (OLTP) data is created by an SAP business application and stored in in-memory tables in the SAP HANA database (red arrow), where it can be accessed with very low latency by analytics tools such as SAP Lumira. But not all data has the same business value, and not all data needs to be kept in-memory forever. Aside from its business value, data can be categorized in terms of volume, access patterns, and performance requirements. Based on all these characteristics, the different categories of data can be stored in different data stores with different storage capabilities, performance, and cost. When data loses value over time, it is said to get colder. It can then be moved to a different storage tier with higher latency, bigger capacity, and possibly lower cost. In Figure 10, this is represented by the red arrow that turns blue.

When considering the dimensions of value, volume, and processing performance for data, clear differences can be observed. In-memory storage in SAP HANA has the highest processing performance and is therefore used for high-value data, which is characterized by its comparatively low volume (up to several terabytes, though higher volumes can be reached). Larger volumes of warm and raw data are not relevant for in-memory storage but are best managed with dynamic tiering (for warm data) or Hadoop (for cold business data, raw data, or data of unknown value). Figure 11 depicts a qualitative comparison of these technologies in terms of data value, processing performance, and volume.

As we have seen, there are multiple options to deal with data of different types. SAP HANA is ready to handle data regardless of its characteristics – in different storage tiers and with different processing engines – enabling it to act as the unifying data platform.
[Figure 11: Qualitative Assessment of Data Processing Technologies. Axes: value and performance (vertical) versus volume (horizontal). SAP HANA® in-memory offers the highest value and performance at lower volumes; dynamic tiering and then Hadoop handle progressively higher volumes.]
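The temperature-based placement described above can be expressed as a simple routing rule. The following is a hypothetical Python sketch of such a policy, not SAP HANA functionality; the age thresholds and value categories are invented for illustration:

```python
from datetime import date, timedelta

def storage_tier(last_accessed: date, business_value: str,
                 today: date = date(2016, 2, 1)) -> str:
    """Route a record to a storage tier by data 'temperature'.

    Thresholds are illustrative assumptions: hot, high-value data
    stays in memory; warm data moves to dynamic tiering; cold or
    low-value data lands in Hadoop.
    """
    age = today - last_accessed
    if business_value == "high" and age < timedelta(days=90):
        return "SAP HANA in-memory"
    if age < timedelta(days=365):
        return "dynamic tiering (extended storage)"
    return "Hadoop (HDFS)"

print(storage_tier(date(2016, 1, 15), "high"))  # recent, high value: in-memory
print(storage_tier(date(2015, 6, 1), "high"))   # aged: dynamic tiering
print(storage_tier(date(2013, 3, 1), "low"))    # cold: Hadoop
```

In a real landscape, such a rule would be implemented as an archiving or data-aging job rather than application code; the sketch only makes the decision criteria concrete.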
Big Data Use Cases

The data flow scenarios presented in the previous sections represent basic principles. Real-world Big Data use cases illustrate concrete areas of application. Possible new applications are emerging on a daily basis. Successful Big Data use cases are already being deployed in the following areas (to name just a few):
•• Anticipating consumer behavior
•• Increasing safety
•• Mastering performance
•• Redefining operational efficiency
•• Managing predictive maintenance
•• Preventing fraud
•• Saving lives by improving medical research and services
•• Personalizing real-time promotions
•• Preventing injuries in sport
•• Enhancing fan experience

We will look at two examples that involve using SAP technology alone. However, since the SAP HANA platform is open, other components such as Hadoop and technologies from the Hadoop family can be incorporated into the setup as required.

PREDICTIVE MAINTENANCE AND SERVICE USE CASE
This use case is about obtaining meaningful views on data from machines, assets, and devices for making better real-time decisions and predictions and for improving operational performance. Typically, new business models from Industry 4.0 demand that companies improve their asset maintenance so as to achieve maximum machine availability at minimum cost. Companies also need to reduce their spare parts inventory, minimizing the amount of materials consumed by maintenance and repairs. This requires predictive analytics and algorithms to forecast equipment health. The architecture setup in this case should allow real-time operations, analyses, and actions. At the same time, a very large number of events per day must be correlated with enterprise data. The SAP HANA platform, enhanced with the SAP Predictive Maintenance and Service solution, meets all the requirements associated with this use case (see Figure 12).
In any case, when designing the Big Data system landscape for your business, you should look for opportunities to simplify your IT infrastructure. And putting the SAP HANA platform at the center of your Big Data landscape is a strong start toward simplification.
[Figure 12: Predictive Maintenance and Service Setup. Gold frames indicate particularly important components. The SAP HANA® platform (database services; integration services; spatial, search, and text-mining engines; planning and rules engines; business function and predictive analytics libraries; stored procedures and data models; application and UI services with SQL, SQLScript, and JavaScript) forms the technical foundation of SAP Predictive Maintenance and Service, supporting any device. Open connectivity (SQL, MDX, R, JSON, SAS) links it to SAP® Predictive Analytics software (functionality from SAP Predictive Analysis and SAP InfiniteInsight®), SAP Lumira® software, SAP Business Suite and SAP NetWeaver® Application Server for ABAP®, and other apps on any app server. Data sources include transaction, unstructured, machine, real-time, and location data, plus Hadoop. SQL = Structured Query Language; MDX = Multidimensional eXpressions; JSON = JavaScript Object Notation; SAS = Statistical Analysis System]
SAP Predictive Maintenance and Service enables equipment manufacturers and operators of machinery and assets to monitor machine health remotely, predict failures, and proactively maintain assets. It is offered as a standard cloud edition or as a repeatable custom solution comprising the technical foundation and a custom development project. Both options are based on SAP HANA.

Using a Lot of Data from Any System
To meet the requirements of this use case, your infrastructure must be able to use data from any system, SAP or non-SAP, regardless of the data type (business or sensor data). This is where SAP Data Services software is used. Furthermore, your infrastructure should be able to listen to 1 to 2 million records per second and keep only what is interesting. SAP Event Stream Processor handles this part. Through integration of Hadoop, the platform is able to connect to many terabytes, even petabytes, of back-end data.

Processing a Lot of Data
Predictive maintenance is an important step in keeping assets from failing. For that purpose, sensor, business, environmental, sentiment, and other data must be analyzed to discover relationships, patterns, rules, outliers, and root causes and thereby enable predictions. Based on this data mining, it is possible to take actions such as creating notifications, altering maintenance schedules, prepositioning spare parts, adjusting service scheduling, changing product specifications, and more.

With its query engine and rules engine, the SAP HANA platform lets you perform ad hoc queries on a large number of records, combined from many sources, in seconds. The SQL, SQLScript, and JavaScript capabilities allow fast programs that use many queries. With the built-in sophisticated algorithms of the predictive analytics library, you can carry out complex analyses. In addition, with the rules engine you can define and trigger business rules. You can also expand to planning scenarios by using the planning engine. Text analysis (search and relate) is no problem with the text analysis engine, which handles text as just another query aspect. Similarly, the geospatial analytics capabilities of the platform make comprehensive analyses of geospatial data possible. These engines enable you to find reasons for failure (root cause), detect deviations from the norm, and create a prediction model from sensor data and failures.

Open Connectivity
With its open connectivity, the SAP HANA platform enables you to reuse your Statistical Analysis System (SAS) models with high-speed execution, should you need to connect to an SAS application. Moreover, you can explore statistics or even let the system make a best proposal by using SAP Predictive Analytics software. Because SAP HANA is able to deal with R procedures, you can reuse your R models if an R server is connected – for example, to use data mining algorithms for predictive stock and workforce optimization. By integrating a Hadoop system, you can flexibly scale out to meet the requirements of preprocessing and storing vast amounts of data. Hadoop may also be used for archiving historical data or for offline batch processes.
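The ingest-and-filter principle behind this setup – listening to a high-volume event stream and keeping only what is interesting – can be sketched in a few lines. This is an illustrative Python sketch, not SAP Event Stream Processor code; the window size, threshold, and data are assumptions:

```python
import math
from collections import deque

class SensorFilter:
    """Keep only 'interesting' readings: those that deviate sharply
    from a rolling baseline (a stand-in for stream filtering)."""

    def __init__(self, window=50, z_threshold=3.0):
        self.window = deque(maxlen=window)
        self.z_threshold = z_threshold

    def is_interesting(self, value):
        interesting = False
        if len(self.window) >= 10:  # need a baseline first
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var) or 1e-9
            interesting = abs(value - mean) / std > self.z_threshold
        self.window.append(value)
        return interesting

f = SensorFilter()
# 100 unremarkable temperature readings, then one spike
readings = [20.0 + 0.1 * (i % 5) for i in range(100)] + [95.0]
kept = [r for r in readings if f.is_interesting(r)]
print(kept)  # only the spike survives the filter
```

A production event stream processor evaluates such conditions continuously over millions of events per second; the sketch only shows why the volume reaching the database can be orders of magnitude smaller than the raw stream.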
This use case requires quick data correlations and actions to anticipate operational needs and unexpected breakdowns, and to automate triggers. In this system configuration, flexible predictive algorithms and tools combine technical and business data. Machine-to-machine communication helps monitor activities and stores data to fuel real-time reporting. Sophisticated business intelligence tools are required to obtain meaningful visualizations (such as SAP Lumira can provide).

DEMAND SIGNAL MANAGEMENT USE CASE
Demand signal management is a common requirement of consumer products companies. They are looking to apply their efforts toward those markets of retailers and consumers where they can realize the greatest growth. To realize this goal, they need a consistent and comprehensive global market view to be able to understand the demand from these various markets. Typically, these companies use syndicated data from various agencies to understand demand and brand perception so they can focus on the right areas. But if it is not automated, the consolidation and harmonization of data from various sources (internal and external) is a highly time-consuming and error-prone manual activity. The SAP Demand Signal Management application powered by SAP HANA addresses the most common challenges of data harmonization and automation for marketing, supply chain, and sales. With SAP Demand Signal Management, the SAP HANA platform can be evolved to include data sources such as weather data or social media data. SAP Demand Signal Management can act as a central platform for various use cases such as trade promotion optimization, sentiment analytics, demand forecasting, and brand perception in consumer products and other industries. Figure 13 illustrates a typical demand signal management setup.

Powered by SAP HANA, SAP Demand Signal Management provides a centralized platform to monitor the aggregated market data, which in turn results in a better understanding of demand and the ability to focus on the right markets, thereby lowering costs and increasing revenues. High-volume, high-variety, and high-velocity data can be processed both offline and in real time. The data can be stored in memory, in the dynamic tiering option, or in Hadoop, depending on the requirements. Hadoop would be used, for example, when the volume of data coming particularly from external sources is extremely high and arrives with high velocity (as is the case with point-of-sale data).

In the retail industry, in the area of customer behavior analytics, a similar setup allows integration of several data sources into a central data store in SAP HANA, based on a predictive model that determines scores for several metrics such as return rate and likelihood to churn. The resulting scores can then be used in customer engagement intelligence for campaign target selection. The purpose is to improve the ROI of campaigns by targeting the right audience. In turn, this marketing optimization is expected not only to drive revenue and improve margins but also to deliver a personalized consumer experience across channels. A highly optimized analytical engine in SAP HANA enables processing huge amounts of data by using many different techniques to reduce the number of searches, as well as by employing the best scientific algorithms available. In short, by deploying the SAP HANA platform, businesses are able to gain a distinct advantage over their competition.
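The score-based campaign target selection described for the retail scenario can be made concrete with a small sketch. This is a hypothetical Python illustration; the weighting scheme, field names, and customer data are invented, and in practice the scores would come from a predictive model running in the platform:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: str
    churn_likelihood: float  # 0..1, output of a predictive model
    return_rate: float       # 0..1, share of purchases returned

def campaign_score(c: Customer) -> float:
    # Prioritize customers at risk of churning; penalize heavy
    # returners. The 0.7/0.3 weights are illustrative assumptions.
    return 0.7 * c.churn_likelihood + 0.3 * (1.0 - c.return_rate)

customers = [
    Customer("C1", churn_likelihood=0.9, return_rate=0.1),
    Customer("C2", churn_likelihood=0.2, return_rate=0.0),
    Customer("C3", churn_likelihood=0.8, return_rate=0.6),
]
# Select the two highest-scoring customers as campaign targets
targets = sorted(customers, key=campaign_score, reverse=True)[:2]
print([c.customer_id for c in targets])  # ['C1', 'C3']
```

The point of the sketch is the separation of concerns: scoring runs close to the data in the analytical engine, while campaign tools only consume the resulting ranked list.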
[Figure 13: Demand Signal Management Setup. Gold frames indicate particularly important components. The same SAP HANA® platform foundation as in Figure 12 is used, with the SAP Demand Signal Management application on top and open connectivity (SQL, MDX, R, JSON, SAS) to SAP® Predictive Analytics software, SAP Lumira® software, SAP Business Suite and SAP NetWeaver® Application Server for ABAP®. Data sources include company-internal data (shipments, stock, promotions, and so on), point-of-sale (POS) data, market research data from other sources, Hadoop, and other apps. SQL = Structured Query Language; MDX = Multidimensional eXpressions; JSON = JavaScript Object Notation; SAS = Statistical Analysis System]
SAP HANA Platform: Open, Flexible, Integrated, Scalable, and More

The SAP HANA platform helps your business meet the demands of the digital economy, where every company will be a technology company. A key challenge faced by all companies across every industry is driving innovation while tracking results in real time. Unlike traditional databases, SAP HANA is ready to manage this challenge successfully by removing data silos and providing a single platform for storing transaction, operational, warehousing, machine, event, and unstructured data – one real-time system operating on one copy of data, supporting any type of business workload, all at the same time.

OPEN AND FLEXIBLE
You can build a complete lambda architecture with components of the SAP HANA platform, including the processing engines and data stores. Further, because of its openness to integrating other technologies, SAP HANA lets you go beyond lambda and cover other reference architecture principles and requirements by combining different components. This makes the platform very flexible and therefore a sustainable investment. SAP HANA has a powerful and flexible search function that allows searching across structured and unstructured data. Analysis of sentiment can be achieved by summarizing, classifying, and investigating text content. Persistence, manipulation, and analysis of network relationships and graphs can be carried out without data duplication.

A UNIFYING DATA PLATFORM
SAP HANA is SAP's strategic platform for unifying and combining relational, text, spatial, series, and graph-structured data. It provides various processing options such as SQL, SQLScript, calculation views, and libraries for business calculations and predictive analysis. As described earlier, the platform comprises memory-based stores and a disk-based data store (the dynamic tiering option). To complete the picture, the SAP HANA platform comes with several options for integration with remote data sources, supporting data integration, data transformations, data quality operations, complex event processing, and federated query execution. SAP HANA puts a broad spectrum of capabilities at your disposal to manage the challenging characteristics of Big Data on one platform.

SAP HANA Vora also underlines the unifying character of the SAP HANA platform. This in-memory query processor has been built as an extension of Apache Spark and can make the process of data analysis more in-depth and oriented around business processes. Though it can be used independently of SAP HANA, businesses can benefit from using SAP HANA Vora as a bridge between regularly updated and accessed business data kept in SAP HANA and historical and mass data stored in Hadoop data lakes.

For all data, SAP HANA serves as the central point for data access, data modeling, and system administration. Having all data in one platform is a great advantage for application developers because they can access data in SAP HANA and external sources in a uniform way, such as with SQL, SQLScript, and calculation views. The built-in rules engines and business functions accelerate application development. And having all data in a unified platform also facilitates analytics for real business value.
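The lambda architecture mentioned in this context combines a batch layer (complete but high-latency views over all historical data) with a speed layer (incremental, low-latency views over recent events), merged at query time by a serving layer. A minimal, platform-agnostic Python sketch of that merge step, with invented view contents and key names:

```python
# Batch layer: precomputed totals over historical data,
# recomputed periodically (e.g., nightly in Hadoop).
batch_view = {"sensor_a": 1000, "sensor_b": 400}

# Speed layer: incremental counts for events that arrived
# since the last batch run (e.g., kept in memory).
speed_view = {"sensor_a": 7, "sensor_c": 3}

def query(key: str) -> int:
    """Serving layer: merge batch and speed views at query time."""
    return batch_view.get(key, 0) + speed_view.get(key, 0)

print(query("sensor_a"))  # 1007: batch total plus recent increments
print(query("sensor_c"))  # 3: seen only by the speed layer so far
```

In an SAP HANA deployment, the roles map naturally: Hadoop or the dynamic tiering option can hold the batch views, while in-memory tables serve the speed layer, with one query interface over both.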
INTEGRATED AND SCALABLE
Predictive analysis is enabled through prepackaged, state-of-the-art predictive algorithms that operate on current data, as well as through open predictive and machine-learning abilities, such as by integrating an R server. Interactive planning is possible without moving data to the application server. SAP HANA allows cleansing of locally generated and imported data, which can be performed without postprocessing.

Hadoop is suited for data that can grow infinitely – that is, for unstructured and raw data and when massive scale-out is required for processing. With Hadoop you can scale out flexibly. As a good complement, SAP HANA Vora is the recommended SQL engine for high-performance analytics on data in Hadoop and Spark. To achieve high performance, SAP HANA Vora uses advanced algorithms and data structures, and just-in-time compilation of query plans into machine-executable binary code. The integration of SAP HANA Vora into the Spark execution framework has the advantage of reusing various Spark capabilities, such as the Spark API, and Spark's ability to integrate with a cluster manager such as Hadoop YARN (Yet Another Resource Negotiator). This integration makes it possible for Spark programs and Spark SQL queries to access data using SAP HANA Vora and combine it with other Spark data sources and processing modules. As mentioned earlier, while SAP HANA Vora can be used independently of the SAP HANA platform, it is optimally integrated in the SAP HANA platform to help ensure very high efficiency.

SIMPLIFICATION AND SECURITY
Using SAP HANA as the unifying platform for all your data also simplifies system administration and software lifecycle management, thus helping to reduce cost of ownership. You can gain efficiency and agility while reducing IT expenses by running SAP HANA in a virtualized environment. And the enterprise-class high-availability and disaster-recovery features in SAP HANA are designed for continuous operation, even if failures occur. SAP HANA provides versatile tools such as the SAP HANA studio, SAP HANA cockpit, SAP DB Control Center systems console, SAP Solution Manager, and SAP Landscape Virtualization Management software to monitor the health of your system and effectively administer your data infrastructure. You can also manage the platform lifecycle more efficiently by streamlining installations, configurations, and upgrades for your entire SAP HANA platform environment, using another rich set of tools to help simplify deployment and maintenance while reducing costs.
SAP HANA enables you to implement a complete lambda architecture from preintegrated components of the platform, including the processing engines and data stores.

With SAP HANA you can set up simplified data warehousing that reduces IT workloads, enhances data modeling, and simplifies administration – particularly important for landscapes consisting of many systems and data sources. The centralized data warehouse not only consumes Big Data stored in the SAP HANA platform but also harmonizes it and combines it with centralized corporate data, thereby providing one version of the truth based on trusted data.

Last but not least, SAP HANA helps you ensure your business data is safe by providing security functions to implement specific security policies. Additionally, when you run your SAP applications on SAP HANA, they benefit from the same security foundation, and you can even add incremental protection. For example, you can control database administrators' access and integrate SAP HANA with your existing security infrastructure to help ensure your data is always secure.

CORE MANAGEMENT SYSTEM FOR BIG DATA
With SAP HANA, you benefit from intuitive, state-of-the-art tools and technologies to effectively run a mission-critical, secure environment for your most valuable data assets. At the same time, you are preparing your business to manage the challenges of the digital economy. For all these reasons, we recommend using SAP HANA as your core data management system for all Big Data applications, including custom applications, and adding other options such as SAP HANA Vora or Hadoop when additional capabilities are required.
Find Out More

Here are sources for more information on the ways SAP can help you manage and get the most from Big Data:
•• SAP HANA platform: http://go.sap.com/solution/in-memory-platform.html
•• Administration and IT operations for SAP HANA: http://hana.sap.com/capabilities/admin-ops.html
•• SAP HANA and Apache Hadoop: www.sap.com/solution/big-data/software/hadoop/index.html
•• SAP HANA Vora: http://go.sap.com/product/data-mgmt/hana-vora-hadoop.html
•• SAP IQ for logical Big Data warehousing (OLAP): www.sap.com/iq
•• SAP SQL Anywhere for designing embedded database applications for mobile and remote environments: www.sap.com/pc/tech/database/software/sybase-sql-anywhere/index.html
•• SAP Data Services for all types of data integration, including smart data access, smart data integration, and smart data quality: www.sap.com/pc/tech/enterprise-information-management/software/data-services/index.html
•• SAP HANA smart data streaming: http://help.sap.com/hana_options_sds
•• SAP Event Stream Processor – a complex event processing platform for event-driven analytics: www.sap.com/pc/tech/database/software/sybase-complex-event-processing/index.html
•• Advanced analytics with SAP HANA for in-memory processing for text, spatial, graph, and predictive analysis: http://hana.sap.com/abouthana/hana-features/processing-capabilities.html
•• The Data Science organization at SAP, with experts who help implement predictive analytics and custom apps: www.sap.com/solution/big-data/software/data-science/index.html
•• SAP Predictive Maintenance and Service, cloud edition: http://help.sap.com/pdm-od

Studio SAP | 42065enUS (16/03)
© 2016 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE's or its affiliated companies' strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.