SAP VORA 1.4
July 2017
Puntis Jifroodian-Haghighi, SAP Vora Product Management
Jason Hinsperger, SAP Vora Product Management
Vitaliy Rudnytskiy, SAP Developer Relations
2INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ
o Introduction to SAP Vora and Big Data
o Vora Installation and Configuration
o Tables and Views in Vora
o Break
o Different Data Sources
o Hierarchies
o Vora Graph Analysis
o Break
o Time Series, Document Store and Disk Engine
o Wrap-up
Agenda
Introduction
For SAP, Big Data Expands the Customer Data Footprint
[Diagram: enterprise data systems (BW/4, HANA, ASE, IQ) alongside data lakes (HANA, Data Lake)]
What is a Data Lake, and why the rise in popularity?
A Data Lake is a massive, easily accessible, centralized repository of data stored on commodity
hardware. Data is stored in its raw format and transformed when needed. Data Lakes can scale to
the petabyte range, which has made them the de facto standard for Big Data initiatives.
• Scalable: Manage more data than ever before
• Store any data: Supports all data types
• Flexible: Analyze and simulate any business scenario
• Dynamic: Deploy on premise or in the Cloud
Hadoop: Cheaper, Faster, Effective Big Data Storage and Processing
• Highly scalable – handles terabytes
to petabytes and beyond
• Affordable – open source software,
runs on commodity machines
• Flexible – handles all varieties of data,
not just structured data
• Tackles the data challenges that
drive modern business – Yahoo!,
Netflix, Facebook all use Hadoop to
run core business
Challenges with Big Data
* Source: Gartner Big Data Adoption Survey insights November 2016 https://www.gartner.com/webinar/3451618
Achieving Value, Skills, Capabilities, Governance
Unlock Business Potential from Your Big Data
Insights from
one single solution
In-memory distributed computing
engines: Relational, Time Series,
Graph, JSON/Doc
Disk-to-memory accelerator
Enterprise-ready
Production-ready,
integrated solution
Support, Versioning and
Compatibility between different
Hadoop components
Seamless integration with
SAP HANA
Easier to use
Intuitive web interface
One SQL entry point
Connect with familiar tools
DW/Data Tiering
• Economically maintain context
data without compromising on
performance
• Immediate results from
petabytes of contextual data
without compromising core
business systems
IoT
• Immediately sense and respond to
data streams with Vora’s
embeddable design
• Off-load processing to edge
devices and simplify data
synchronization with gateways
and data warehouses
Data Lineage &
Compliance
• Query archived data without
losing data lineage & control
• Deliver timely, accurate
compliance reporting
• Perform complete audit trail
analysis from operational to
archive data
Cloud
• Analysis of large volumes of
external data without building
on-premise infrastructure
Data Lake
• Faster analytics with
compiled query for
distributed processing
• Better insights with drill-
down root-cause analysis
SAP Vora Usage Patterns
Use Case / Challenge / Why SAP HANA Vora?
Risk Management
• Challenge: Significant compute capacity required for valuations
• Why: Store larger volumes of data in Hadoop and leverage Spark + Vora to deliver massive parallel compute power + enterprise-grade analytics
Fraud Detection
• Challenge: Early detection of rogue/illicit trading
• Why: Leverage Vora to correlate seemingly unrelated transactions to identify fraudulent activity
Broker/Trade Compliance
• Challenge: Detect fraudulent transactions earlier to improve accuracy and reduce costs
• Why: Link trading and financial (ERP/S/4) transactions with unstructured content (emails/instant messaging/sentiment data)
Anti-Money Laundering
• Challenge: Detect illegal activities faster and with improved accuracy
• Why: Ingest, store and process large amounts of data and leverage machine learning algorithms to detect money laundering activities
Customer 360˚
• Challenge: Reduce customer churn; grow the business with targeted marketing campaigns
• Why: Better understand your customers' behaviors and needs (sentiment analysis) to proactively make relevant offers and reduce churn
SAP Vora for Financial Services/Capital Markets
SAP Vora Telco/Network capacity planning/Real-time bandwidth allocation
• Network capacity planning – By analyzing Call Detail Records (CDRs) and network loads, telcos can plan infrastructure expansion with greater precision
• Cellphone service improvement – Ability to analyze, understand and fix instances of poor cellular service, e.g. dropped calls or poor audio
• Targeted network maintenance/upgrades – Analysis of how cable network congestion affects churn, and where exactly network upgrades produce the most incremental revenue
• Real-time bandwidth allocation – With real-time packet inspection, operators can steer traffic and optimize network quality of service (QoS) in real time to maintain the best service quality
SAP Vora Oil & Gas Environmental Safety/Oil Production/Predictive Maintenance
• Environmental safety: Anomalies in drilling can be identified in real time. Well problems can be detected before they become serious, and drills can be shut down proactively to prevent environmental risks
• Oil production forecasting: Analysis of seismic, drilling and production data to enhance oil extraction from existing wells and forecast oil production
• Identification of potential drilling errors and equipment failures by analyzing sensor data from equipment (drill heads, down-hole sensors, etc.) as well as geological data. Understand what equipment works best in each environment
• Predictive analysis: Identification of events or patterns that could indicate an imminent security threat or cyber-terrorist act, in order to keep personnel, property and equipment safe
SAP Vora Retail/Consumer Buying Patterns/Dynamic Pricing
• 360° customer insight: Improve customer satisfaction and increase sales opportunities by integrating all relevant customer data across online transactions, POS transactions, social media, and customer service interactions into one single view
• Upsell/cross-sell recommendations: Increase online purchases by recommending relevant products and promotions in real time. Retailers can recommend products based on what other similar customers have bought
• Clickstream analysis: Helps retailers better understand how consumers make online purchase decisions, which in turn helps them optimize web pages and offers to increase conversion and lower cart abandonment
SAP Vora - Manufacturing
Potential Use Cases
• Predictive maintenance – Minimize Non-Productive Time (NPT) by monitoring equipment or product utilization in a live environment to identify patterns that indicate imminent failure.
• Assembly line quality assurance: Take measurements of work-in-progress products to find manufacturing defects as early as possible, while also identifying any potential process or design flaws. Analysis of
• Real-time parts flow monitoring: Attaching sensors to all parts in the production process and tracking them in real time gives manufacturers a real-time view of their production process.
• Product configuration planning: Helps accelerate production by offering fast delivery times for the manufacture of millions of different product configurations.
Introduction to SAP Vora
3xV of Big Data
Hadoop Overview
SAP Vora in the Hadoop Ecosystem
Apache Spark is a fast and general engine for large-scale data processing
Can run along with, or independent from Hadoop
Can work against data from many sources – HDFS, local files, Amazon S3, etc…
Supports Java, Scala or Python based applications
Works on data in an in-memory fashion
– 10-100 times faster than Hive on Map/Reduce
Supports SQL access
Includes a Streaming engine
Apache Spark
What does SAP Vora provide which Spark does not?
Integration with Enterprise data
– Move processing closer to the data by extending Spark to allow the computation of full logical plans at the data source (whether it is HANA
or the Vora engine)
– The result is a performance increase (but dependent on the actual query)
Bidirectional virtual data access from HANA to Hadoop/Spark extending beyond SDA capabilities
– No data duplication
Enterprise Analytics additions to SparkSQL
– Hierarchies, currency conversion
Unified access layer built on SQL allowing integrated analysis of disparate datasets
Simplified data modeling on HDFS/S3 data sets using web based modeler
SAP Vora Complementing Spark Framework
Distributed Computing Solution for the Digital Enterprise
[Architecture diagram: data science, predictive, business intelligence and visualization apps (and other apps) sit on top of SAP VORA, which comprises the Relational, Time Series, Graph and Doc Store engines, the Data Modeler, the Disk-to-Memory Accelerator and the Distributed Transaction Log. Vora runs alongside Spark on each node of a distributed computing cluster over files in Hadoop. The SAP HANA Platform in-memory store is an optional component.]
Key Capabilities of SAP VORA
• Native "database" in Hadoop
• Multiple engines: Relational (OLAP), In-Memory, Time Series, Doc Store, Graphs
• Intuitive tools
• Tight HANA integration
[Diagram: a grid of Vora nodes and a HANA/Hadoop scale labeled from 0.1 sec to ∞]
• Full support for Kerberos (AD, MIT) enabled Hadoop
landscapes
• Encrypted HDFS support
• Access Hadoop transparent encryption zones
• Consul and Nomad TLS (Transport Layer Security) & ACL (Access
Control Lists) support for Vora services
• Secure access to various Vora components (engines, Tools UI,
catalog)
Security
SAP Vora: Key Components
Cluster Management Tools: To administer and monitor the Hadoop landscape
Cloudera Manager, MapR Control System, Ambari
Vora Nodes
[Diagram: SAP VORA with the Vora Spark Extension on top of three layers]
• Control nodes (1..few): Transaction Coordinator, Metadata, Scheduler, Discovery, Landscape
• Compute nodes (1..many): Relational, Documents, Graph, Time Series, Disk
• Persistent storage: DLog, S3 / Swift / HDFS / ORC / Parquet
Vora Manager UI http://<manager_node>:19000
Shows the status of all services
▫ Start/stop all services
▫ Configuration management:
– specify node assignments
– change parameters
Consul
▫ Vora Discovery Service UI http://<discovery_server_node>:8500/ui
▫ Know where the services are
Nomad
▫ Process scheduler and resource manager
▫ Start/stop/restart of Vora services -- If a service fails, Nomad will
automatically keep trying to restart it until it succeeds
▫ Manage node assignment
Vora Manager Components
Vora Manager UI: Home Page
Vora Manager UI: User Management
Vora Manager UI: Nodes
Vora Manager UI: Services
SAP Vora Nodes: Control Nodes
[Diagram: control nodes (1..few) hosting the Transaction Coordinator, Metadata, Scheduler, Discovery and Landscape services, below the Vora Spark Extension]
• Transaction Coordinator
• manages user transactions
• Transaction Broker
• enforces consistent (meta)data
modifications
• Landscape Manager
• Controls data partitioning and
placement across different engines
• Lock Manager
• Provides a distributed read-write lock
mechanism for concurrent load
statements to avoid loading the same
partition multiple times
• A driver for query execution with user
session semantics
Control Nodes
Query Processing
[Diagram: a user query enters the Query Processor, which works with the Catalog, the Landscape Manager (host assignment) and the Scheduler; the Vora engines (graph, doc, series, disk) execute the query and return the result set]
Vora Nodes: Compute Nodes
[Diagram: compute nodes (1..many), each hosting the Relational, Documents, Graph, Time Series and Disk engines, below the Vora Spark Extension]
• Relational Engine – lets you frame your data as relational entities and work with them using SQL.
• Graph Engine – SAP VORA embeds an in-memory graph database for real-time graph analysis. The primary focus is on complex read-only analytical queries on very large graphs.
• Time Series – SAP VORA provides a highly distributed time series analysis engine that supports storing and analyzing time series data. It offers memory- and speed-efficient time series compression and features such as standard aggregation, granularization, and advanced analysis, and it lets you join relational data with series data to build efficient SQL models in Hadoop and other Big Data environments.
• Document Store – SAP VORA introduced NoSQL features such as storing JSON documents with the Document Store in the SAP VORA 1.3 release. The DocStore supports schema-less tables, so you can flexibly add or remove fields from any document, and it helps you scale horizontally.
• Disk-to-Memory Accelerator – SAP VORA provides relational capabilities without having to load all the data into memory, for cases where the data will not fit.
Vora Analysis Engines
Vora Nodes: Persistent Storage
[Diagram: persistent storage layer below the Vora Spark Extension: DLog, S3 / Swift / HDFS / ORC / Parquet]
Persistent Storage: HDFS + CATALOG + DLOG
Persistent: Vora Catalog (metadata), Vora DLOG
REGISTER ALL TABLES USING com.sap.spark.vora
REGISTER TABLE <name> USING com.sap.spark.vora
SHOW TABLES USING com.sap.spark.vora
Spark parameter: spark.sap.autoregister com.sap.spark.vora
Volatile: Spark Catalog in the Spark Context (Spark is session-based)
SHOW TABLES
• Shows tables in the local Spark Context
• After a restart (e.g. of spark-shell), Vora table metadata needs to be reloaded into the local Spark Catalog via REGISTER
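The split between the persistent Vora catalog and the volatile, session-based Spark catalog can be pictured with a toy model. This is a sketch in plain Python, not Vora or Spark code; the class and method names (VoraCatalog, ToySparkContext, register_all_tables) are hypothetical and only mimic the behavior described on this slide.

```python
class VoraCatalog:
    """Stands in for the persistent Vora catalog: metadata survives restarts."""

    def __init__(self):
        self.tables = {}


class ToySparkContext:
    """Stands in for a Spark session: its own catalog starts out empty."""

    def __init__(self, vora_catalog):
        self._vora = vora_catalog
        self.catalog = {}

    def create_table(self, name, schema):
        # CREATE TABLE makes the table known to both catalogs.
        self._vora.tables[name] = schema
        self.catalog[name] = schema

    def register_all_tables(self):
        # Mimics: REGISTER ALL TABLES USING com.sap.spark.vora
        self.catalog.update(self._vora.tables)

    def show_tables(self):
        # Mimics a plain SHOW TABLES: only the session-local catalog.
        return sorted(self.catalog)


vora = VoraCatalog()
first_session = ToySparkContext(vora)
first_session.create_table("USERS", "user_id string, age integer")

second_session = ToySparkContext(vora)      # simulates restarting spark-shell
tables_before = second_session.show_tables()
second_session.register_all_tables()
tables_after = second_session.show_tables()
print(tables_before, tables_after)
```

The new session sees no tables until the metadata is registered again, which is the situation the spark.sap.autoregister parameter is meant to avoid.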
Take a few minutes to explore the Vora Manager, its configuration components and the various settings
available.
• SAP Vora Manager: http://<IP_ADDRESS>:19000
Check It Out!
SAP Vora: Getting Started
On-Premise
– Developer Edition on Premise: http://developers.sap.com
– Install Official Vora Release on SAP SMP
Cloud Based
– Vora on CAL
– Amazon AWS with “one-click” – developer and production editions (AWS marketplace)
Vora Testdrives
– 3hr testdrives - http://testdrive.saphanavora.com/
▫ Retail
▫ Telco
▫ Time series
Vora Installation Options
– Install Official Vora Release
– Automated scripts for your own cluster (Ansible)
– Monsoon
Options to start with Vora
On premise and in the cloud developer editions:
http://www.sap.com/developer/topics/vora.html#freetrial
Set up and Start developer editions
https://www.sap.com/developer/tutorial-navigator.how-to.html?tag=products:data-management/sap-hana-vora
• Pre-defined business scenario--sample data
included
• Step-by-step instructions
• Includes all solution components (e.g. Vora +
Lumira)
• 100% free
• Time bound
Test Drive Lab vs. Developer Edition
http://testdrive.saphanavora.com/
Features
• Set up a fully functional and configured SAP Vora cluster with a few clicks. Available in most
AWS regions.
• SAP Vora Console to manage and scale the cluster
• Optimal 4-node cluster based on
• CentOS 7.2
• Vora 1.3.61 (GA)
• Apache Ambari 2.2.1.0
• Spark 1.6.1
• Hadoop Distribution HDP 2.4.2
• Zeppelin 0.6.0
SAP Vora Developer Edition on AWS
Prerequisites
Amazon Web Services account
Virtual Private Cloud (VPC) and Security Group as virtual firewall
Steps
Go to the Vora sign-up page to launch an instance from an Amazon Machine Image (AMI)
Use the SAP Vora console to set up the cluster, add nodes and configure
Use the console to view and manage the cluster
Get your SAP Vora Dev edition in AWS
Connecting to SAP Vora
How can I communicate with Vora?
[Diagram: access paths to Vora and Spark: SAP HANA via the Spark Adapter or the Thrift Server, Vora JDBC / ODBC, and Vora Tools]
Type
• Native Spark
• JDBC
• Graphical
• HANA
Vora Consumption
Consumption from HANA is possible through
• SDA with "Spark-Adapter" (since SAP HANA SPS 10)
• SDA with Thrift Server (since SAP HANA SPS 7)
• Direct Connectivity (End of 2016)
JDBC / ODBC Access
• Spark SQL is exposed through the Thrift Server (SAP VORA extensions included and using SapSQLContext)
Data visualization tools
• Apache Zeppelin (or Jupyter) via SAP Spark extensions
• SAP Lumira Data Analysis via JDBC channel
• SAP VORA Data Modelling via JDBC channel
Native language bindings
• for programmatic data access
• Available in Scala, Java, Python, R
How can I communicate with Vora?
Consumption of SAP VORA
– Vora Tools UI, http://<tools_node>:9225
– SQL Editor
– Data Browser
– Modeler Perspective
– User Management
Vora Tools
Zeppelin, Spark Shell and Jupyter Notebook
Apache Zeppelin
Jupyter Notebook
Spark Shell
• Spark-Shell with Vora
• $ source /etc/vora/vora-env.sh
• $ $VORA_SPARK_HOME/bin/start-spark-shell.sh
• Also possible: spark-submit; pyspark; other tools that can connect to the Vora
Thriftserver
Lumira
– Graphical Data Exploration
SAP Lumira
SAP Lumira, Zeppelin:
• Installation and Administration
Guide (do not install Zeppelin via Ambari)
Jupyter:
• https://blogs.sap.com/2016/01/21/visualizing-data-with-jupyter/
SAP HANA
– Virtual Table via SDA
– Remote Source
▫ VoraODBC Connector
▫ Spark Connector
SAP HANA (via Smart Data Access, SDA)
Pluggable API to access structured
data through Spark SQL
Data filtering and column pruning can
be pushed down to the data sources
in many cases (depends really on the
source capability)
Advantages:
• Less data in Spark, Less disk IO
• Less to do in Spark
• Less memory consumption
• Any Spark language can leverage
Data Sources API
Spark Data Sources API: Quick Refresher
[Diagram: Spark SQL, MLlib, Spark Streaming and GraphX (graph) on top of the Spark Core Engine; Spark SQL reaches data sources such as CSV and HANA]
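The effect of pushing filters and column pruning down to the source can be sketched with a toy data source. This is plain Python for illustration, not the Spark Data Sources API; ToySource and its scan method are hypothetical names, and the sample rows are made up.

```python
ROWS = [
    {"user_id": "u1", "age": 34, "gender": "F", "zip_code": "10115"},
    {"user_id": "u2", "age": 27, "gender": "M", "zip_code": "69190"},
    {"user_id": "u3", "age": 45, "gender": "F", "zip_code": "94304"},
    {"user_id": "u4", "age": 19, "gender": "M", "zip_code": "02139"},
]


class ToySource:
    """A data source that can apply a filter and a column list itself."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_shipped = 0  # how many rows left the source

    def scan(self, columns=None, predicate=None):
        for row in self.rows:
            if predicate is not None and not predicate(row):
                continue  # filtered at the source, never shipped
            self.rows_shipped += 1
            if columns is None:
                yield dict(row)
            else:
                yield {name: row[name] for name in columns}  # column pruning


def query_without_pushdown(source):
    # The engine pulls every row and column, then filters and projects itself.
    return [{"user_id": r["user_id"]} for r in source.scan() if r["age"] > 30]


def query_with_pushdown(source):
    # The filter and the column list travel to the source.
    return list(source.scan(columns=["user_id"], predicate=lambda r: r["age"] > 30))


plain, pushed = ToySource(ROWS), ToySource(ROWS)
result_plain = query_without_pushdown(plain)
result_pushed = query_with_pushdown(pushed)
print(result_plain == result_pushed, plain.rows_shipped, pushed.rows_shipped)
```

Both queries return the same rows, but the pushed-down variant ships fewer rows and fewer columns, which is where the savings in I/O and memory listed above come from.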
• Extends Spark Data Source to allow the creation of
relations in Vora
• The tables stored in SAP VORA are exposed to Spark as
standard Spark DataFrames
• To query the relations use standard Spark SQL or Vora
SQL
• Vora data source analyzes Spark SQL queries for
opportunities to push the execution of parts or all of the
query to SAP VORA
• With the Vora data source, a Spark SQL query can be executed on
multiple SAP VORA engines concurrently
• The execution result is returned to the Spark runtime as
an RDD (resilient distributed dataset)
SAP VORA Data Source (1)
SAP VORA provides the following data sources:
• com.sap.spark.hana for the SAP HANA data source
• com.sap.spark.engines for the specific SAP VORA engines (relational,
graph, time series, document store, and disk)
• com.sap.spark.vora for the SAP VORA data source (deprecated)
SAP VORA Data Source (2)
SAP VORA 1.4
Tables and Views
SQL Code:
CREATE TABLE USERS (user_id string, age
integer, gender string, occupation string,
zip_code string)
USING com.sap.spark.engines.relational
OPTIONS (
files
"/path/to/USERS1.csv,/path/to/USERS2.csv"
)
Working with Tables and Views in Vora
Creating Tables
• Options: storagebackend, format, csvdelimiter, csvquote, null,
csvskip, and datetimeformat.
• A CREATE TABLE statement registers the table in the SparkSqlContext and
creates a table in the SAP VORA engine.
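When many tables share the same pattern, the statement text can be generated rather than typed by hand. The following is a minimal sketch that only assembles a string in the shape shown on this slide; create_table_sql is a hypothetical helper, not part of any Vora or Spark library, and it performs no quoting or validation of its inputs.

```python
def create_table_sql(name, columns, files,
                     datasource="com.sap.spark.engines.relational", **options):
    """Build a CREATE TABLE ... USING ... OPTIONS (...) statement as text.

    columns: list of (column_name, type_name) pairs
    files:   list of file paths, joined with commas as in the slide
    options: further option names and values, e.g. csvdelimiter="|"
    """
    column_list = ", ".join(f"{col} {typ}" for col, typ in columns)
    all_options = {"files": ",".join(files), **options}
    option_lines = ",\n".join(f'  {key} "{value}"' for key, value in all_options.items())
    return (
        f"CREATE TABLE {name} ({column_list})\n"
        f"USING {datasource}\n"
        f"OPTIONS (\n{option_lines}\n)"
    )


statement = create_table_sql(
    "USERS",
    [("user_id", "string"), ("age", "integer")],
    ["/path/to/USERS1.csv", "/path/to/USERS2.csv"],
    csvdelimiter="|",
)
print(statement)
```

The resulting text could then be pasted into a Zeppelin paragraph or the SQL Editor; how it is submitted is left open here.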
• Create table if table exists in Vora catalog
• Create table if not exists
• Create table without data
• Create table without schema
Create Table Conditions
• APPEND TABLE testTableName
OPTIONS (files "path1/to/file/file1.csv,path2/to/file/file2.csv")
• APPEND TABLE testTableName
OPTIONS (files "path1/*")
• DROP TABLE/VIEW testTableName
• DROP TABLE/VIEW IF EXISTS tableName
• DROP TABLE/VIEW tableName CASCADE
Use wildcards to read from HDFS files in a folder:
OPTIONS ( ...
files "/dir1/*,/dir2/*"
... )
Working with Tables and Views in Vora
Register, Append and Drop
Working with Tables and Views in SAP VORA
Creating tables (2)
• Loading tables from SAP VORA into Spark:
REGISTER ALL TABLES USING
com.sap.spark.vora
REGISTER TABLE TABLE1
USING com.sap.spark.vora
IGNORING CONFLICTS
• Listing tables and views:
SHOW TABLES
USING com.sap.spark.vora
OPTIONS (…)
// show only tables registered in the Spark catalog
SHOW TABLES
Working with Tables and Views in SAP VORA
Listing and loading tables and views in SAP VORA
Views:
• SQL
CREATE VIEW MyView
AS SELECT * FROM Table1
USING com.sap.spark.vora
OPTIONS (…)
• Dimensions
CREATE DIMENSION VIEW MyDimensionView
AS SELECT * FROM Table1
USING com.sap.spark.vora
OPTIONS (…)
• Cubes
CREATE CUBE VIEW MyCubeView
AS SELECT * FROM Table1
USING com.sap.spark.vora
OPTIONS (…)
• DROP VIEW MyView
USING com.sap.spark.vora
OPTIONS (…)
• DESCRIBE TABLE MyView USING com.sap.spark.vora
Working with Tables and Views in SAP VORA
Persisted views
1. Financial institutions
2. Products
3. Complaints
Complaints.csv
FinancialInstitutions.csv
Products.csv
Take 15 minutes to get familiar with Apache Zeppelin, and execute the commands in the following
notebooks:
– 0_DATA – create base tables
– Tables and Views – create views, cubes and dimensions on the base tables
• https://www.sap.com/developer/tutorials/vora-ova-zeppelin0.html
• Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
Review – What did we do?
[Diagram: CREATE TABLE… and SELECT * FROM TABLE… statements pass through Spark and the Thrift Server; data from a data file is held in Vora in-memory]
SAP VORA 1.4
Different Data Sources
CREATE TABLE Users(id integer, name string)
USING com.sap.spark.vora
OPTIONS (
files "/S3_BUCKET/data.csv",
csvdelimiter "|",
storagebackend "s3",
s3accesskeyid "S3_KEY_ID",
s3secretaccesskey "S3_KEY_SECRET",
s3endpoint "S3_ENDPOINT",
s3region "S3_REGION" )
Loading Different Data Types
Loading data from Amazon S3
CREATE TABLE Users(id integer, name string)
USING com.sap.spark.vora
OPTIONS (
tablename "Users",
files "/ORC_Files/data.orc/*",
format "orc"
)
CREATE TABLE Users(id integer, name string)
USING com.sap.spark.vora
OPTIONS (
tablename "Users",
files "/Parquet_Files/data.parquet/*",
format "parquet"
)
Loading Different Data Types
Loading data from ORC and Parquet files
Individual tables via Vora
All tables via Scala:
import com.sap.spark.engines.client.EngineClient
EngineClient.getOrCreate().reloadTables()
Reloading tables into Vora engines after restart
• SAP HANA DATA SOURCE
CREATE TABLE $tableName
USING com.sap.spark.hana
OPTIONS
(
tablepath "$tableName",
dbschema "$dbSchema",
host "$host",
instance "$instance",
user "$user",
passwd "$passwd"
)
Loading Different Data Types: SAP VORA – SAP HANA Connection
Loading data from/to SAP HANA
[Diagram: the SAP HANA in-memory store connects through Smart Data Access and the Spark Extension to Spark SQL running on several Spark nodes]
Writing Data into HANA
SHOW TABLES USING com.sap.spark.hana
OPTIONS (
host "$host",
instance "$instance",
user "$user",
passwd "$passwd",
dbschema "$dbSchema",
tablePattern "$pattern"
)
Listing Tables in SAP HANA
REGISTER TABLE tablename
USING com.sap.spark.hana
OPTIONS (
host "$host",
instance "$instance",
user "$user",
passwd "$passwd",
dbschema "$dbSchema") [IGNORING CONFLICTS]
REGISTER ALL TABLES USING com.sap.spark.hana
OPTIONS (…)
Register HANA tables to current Spark Context
DROP TABLE testTableName
DROP TABLE IF EXISTS testTableName
DROP TABLE testTableName CASCADE
DESCRIBE TABLE tablename
USING com.sap.spark.hana
OPTIONS (
host "$host",
instance "$instance",
user "$user",
passwd "$passwd",
dbschema "$dbSchema")
Drop/Expose metadata
Pushing down HANA UDFs
Take 10 minutes to go through the tutorials on connecting Vora to HANA and reading different file
formats.
https://www.sap.com/developer/tutorials/vora-ova-zeppelin6.html
https://www.sap.com/developer/tutorials/vora-ova-hana-datasource.html
• SAP Vora Tools: http://<IP_ADDRESS>:9225
• Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Disk to memory
• Relational column based store
• You can create, register, and query disk engine tables in Spark SQL
statements in the same way as with other data sources (for example,
SAP VORA, SAP HANA, and native Spark data sources). The syntax
and options of the CREATE TABLE statement are compatible with the
SAP VORA data source.
• Uses ‘com.sap.spark.engines.disk’
• REGISTER ALL TABLES USING com.sap.spark.engines.disk
Vora Disk to Memory Accelerator
Take 10 minutes and work through the doc store notebook in Apache Zeppelin
https://www.sap.com/developer/tutorials/vora-ova-zeppelin3.html
• SAP Vora Tools: http://<IP_ADDRESS>:9225
• Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Hierarchies
Working with Hierarchies in SAP VORA
What are hierarchies?
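[Diagram: a clothes hierarchy with root Clothes; second level Men and Women; below them Shirts, Suits, Shorts, Skirts, Dresses and Blouses; lowest level Jackets, Slacks, Evening Gowns and Sun Dresses]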
Parent-Child Hierarchy (aka Adjacency-List)
[Diagram: the clothes hierarchy with numbered nodes: node 1 at the root (Clothes), nodes 2 and 3 on the second level (Men, Women), nodes 4 to 9 on the third level, and nodes 10 to 13 on the fourth level]
• ORDER SIBLINGS BY: is relevant for some UDFs, such as IS_PRECEDING and IS_FOLLOWING
• START WHERE
• SET
Create Parent-Child hierarchy from the clothes table
Parent-Child Hierarchies
h_src(level 1, level 2, level 3, level 4)
Level-Based Hierarchies (aka flattened hierarchies)
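The relationship between the two representations can be sketched by flattening a parent-child table into one row per leaf with one column per level, as in h_src(level 1, level 2, level 3, level 4). This is plain Python for illustration only. The parent assignments below are an assumption read off the clothes diagram, and padding short paths with None is likewise an assumption, not documented Vora behavior.

```python
# Parent-child (adjacency list) form: node -> parent. Edges are assumed.
PARENT = {
    "Clothes": None,
    "Men": "Clothes", "Women": "Clothes",
    "Shirts": "Men", "Suits": "Men", "Shorts": "Men",
    "Skirts": "Women", "Dresses": "Women", "Blouses": "Women",
    "Jackets": "Suits", "Slacks": "Suits",
    "Evening Gowns": "Dresses", "Sun Dresses": "Dresses",
}


def path_from_root(node, parent):
    """Walk up the parent links and return the path root -> node."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def flatten(parent, depth=4):
    """Return one (level1, ..., levelN) tuple per leaf node."""
    nodes_with_children = set(parent.values())
    rows = []
    for node in parent:
        if node not in nodes_with_children:  # leaf
            path = path_from_root(node, parent)
            rows.append(tuple(path + [None] * (depth - len(path))))
    return rows


rows = flatten(PARENT)
for row in rows:
    print(row)
```

Each printed tuple corresponds to one row of a level-based source table, with the root in the first column.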
• WITH LEVELS (col1, col2, col3, col4)
• MATCH PATH: Determines the way identical nodes are handled across the same and across different
columns.
• ORDER SIBLINGS BY
• SET
Level Hierarchies – Create statement
Working with Hierarchies in SAP VORA
Hierarchy UDFs
UDF
level(u)
is_root(u)
is_descendant(u,v)
is_descendant_or_self(u,v)
is_ancestor(u,v)
is_ancestor_or_self(u,v)
is_parent(u,v)
is_child(u,v)
is_sibling(u,v)
is_following(u,v)
is_preceding(u,v)
node(node)
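What several of these predicates test can be sketched on a small parent-child table. This is plain Python, not the Vora implementation. Two points are assumptions: the argument order is read as "u is a ... of v", and the root is given level 1. The order-dependent functions is_following and is_preceding are left out.

```python
# A small parent-child table: node -> parent (None marks the root).
PARENT = {
    "Clothes": None,
    "Men": "Clothes", "Women": "Clothes",
    "Suits": "Men", "Jackets": "Suits", "Dresses": "Women",
}


def ancestors(u):
    """Yield the parent of u, then its parent, up to the root."""
    node = PARENT[u]
    while node is not None:
        yield node
        node = PARENT[node]


def level(u):
    return 1 + sum(1 for _ in ancestors(u))  # root counted as level 1 (assumption)


def is_root(u):
    return PARENT[u] is None


def is_ancestor(u, v):
    return u in ancestors(v)  # u lies on the path from v up to the root


def is_descendant(u, v):
    return v in ancestors(u)


def is_parent(u, v):
    return PARENT[v] == u


def is_child(u, v):
    return PARENT[u] == v


def is_sibling(u, v):
    return u != v and PARENT[u] is not None and PARENT[u] == PARENT[v]


print(level("Jackets"), is_descendant("Jackets", "Men"), is_sibling("Men", "Women"))
```

The "_or_self" variants would additionally return true when both arguments are the same node.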
Using UDFs with Hierarchies – example
Using UDFs with Hierarchies
Take 10 minutes and work through the hierarchies notebook in Apache Zeppelin
https://www.sap.com/developer/tutorials/vora-ova-zeppelin2.html
• SAP Vora Tools: http://<IP_ADDRESS>:9225
• Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Time Series Engine
Time Series and Time Series Analysis
[Figure: line chart of temperature (°C, axis from -30 to 5) over time for Halifax and Waterloo.]
Time Series: Sequence of data points recorded over time, may
occur as equidistant / non-equidistant
Detect and correct errors / anomalies: Outlier Detection,
Missing Value Replacement, Editing
Standard aggregation: SUM, MIN, MAX, Select, Join,
Grouping across Series (e.g.: group by province)
Granularization Support: Hourly to Daily measurements
Series Analysis: Smoothing, Binning, Correlation
Time Series Data Analysis across big data
Efficiently analyze time series data in distributed
environments
 Interactive access to standard time series
analysis functions using the well-known SQL
language
 Efficient compression allowing analysis of
more data using less memory
 Build time series models visually using Vora
Data Modeler
Trend | Cyclical | Seasonal | Random | Exception
• Identify the series column
• Know what the series period is
• Think about the compression strategy, e.g. the maximum error bound for compression
• Define a partition function
• Define a partition scheme
Creating a Series Table - Prerequisites
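To make the idea of a maximum error bound concrete, here is a minimal Python sketch of a greedy piecewise-constant compressor. It is not Vora's compression code and not a faithful APCA or SDT implementation; the function names and the use of an absolute (rather than percentage) bound are assumptions chosen to keep the example short.

```python
def compress_constant(values, max_error):
    """Greedy piecewise-constant approximation with an absolute error bound.

    Returns (start_index, length, constant) segments. Every original value
    differs from the constant of its segment by at most max_error.
    """
    if not values:
        return []
    segments = []
    start = 0
    lo = hi = values[0]
    for i, v in enumerate(values[1:], start=1):
        new_lo, new_hi = min(lo, v), max(hi, v)
        if (new_hi - new_lo) / 2 > max_error:
            # The midpoint could no longer represent all values: close segment.
            segments.append((start, i - start, (lo + hi) / 2))
            start, lo, hi = i, v, v
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, len(values) - start, (lo + hi) / 2))
    return segments


def decompress(segments):
    """Expand the segments back into one value per original time stamp."""
    out = []
    for _, length, constant in segments:
        out.extend([constant] * length)
    return out
```

A looser bound produces fewer segments and therefore less memory per series, at the cost of less exact values when the data is read back.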
CREATE TABLE "name" ( "time" TIMESTAMP,
"value1" INTEGER DEFAULT NULL,
"value2" DOUBLE )
SERIES ( PERIOD FOR SERIES "time"
START TIMESTAMP '2010-01-01 00:00:00'
END TIMESTAMP '2010-12-31 00:00:00'
EQUIDISTANT INCREMENT BY 1 HOUR )
PARTITION BY PS1("time")
Creating a Series Table - Examples
CREATE PARTITION FUNCTION PF1(C TIMESTAMP)
AS RANGE
BOUNDARIES(TIMESTAMP '2010-04-01 09:00:00.0000',
TIMESTAMP '2010-09-01 09:00:00.0000');
CREATE PARTITION SCHEME PS1 USING PF1;
Creating a Series Table – Partitions Example
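As a rough illustration of what a RANGE partition function with two boundaries does, the Python sketch below maps a time stamp to one of three ranges. Whether a value equal to a boundary belongs to the lower or the upper range in Vora is not stated on the slide; the choice made here is an assumption.

```python
from bisect import bisect_right
from datetime import datetime

# The two boundaries from the PF1 example above.
BOUNDARIES = [datetime(2010, 4, 1, 9), datetime(2010, 9, 1, 9)]


def range_partition(ts, boundaries=BOUNDARIES):
    """Return the index of the range ts falls into (0 .. len(boundaries)).

    Two boundaries yield three ranges: before the first boundary, between
    the two, and after the second. A value equal to a boundary is placed in
    the upper range here (assumption).
    """
    return bisect_right(boundaries, ts)
```

With the series period running over the whole of 2010, rows from January to March land in range 0, April to August in range 1, and the rest in range 2.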
ALTER TABLE name ADD DATASOURCE (
ts,
j thousands separated by ',',
d thousands separated by ',' decimal separated by '.'
)
'file_path_and_name'
delimited by ';'
string delimited by ' '
thousands separated by ','
decimal separated by '.';
LOAD TABLE name;
Add data to TS Table: Importing Data
The definition is stored in the catalog, so whenever the data is loaded into memory
the compression is applied.
``LOAD TABLE name`` using com.sap.spark.engines;
Load Table
CREATE TABLE national_grid_demand (
ts TIMESTAMP, ND double, I014_ND double, TSD double, I014_TSD double,
England_Wales_Demand double, Embedded_Wind_Generation double,
Embedded_Wind_Capacity integer, Embedded_Solar_Generation double,
Embedded_Solar_Capacity integer )
SERIES ( PERIOD FOR SERIES ts
START TIMESTAMP '2015-01-01 00:00:00' END TIMESTAMP '2017-01-01 00:00:00'
EQUIDISTANT INCREMENT BY 30 MINUTE DEFAULT
COMPRESSION use (APCA error 3.0 percent)
COMPRESSION ON (Embedded_Wind_Generation) use (SDT error 4.0 percent)
COMPRESSION ON (Embedded_Solar_Generation) use (SDT error 5.0 percent) )
PARTITION BY PS1( ts )
USING com.sap.spark.engines
OPTIONS (files "/user/<userid>/DemandData_2015_2.csv",
csvskip "1",
csvdelimiter ";",
storagebackend "hdfs")
Add data to TS Table: Create and Include Data
SELECT ts, ND FROM national_grid_demand
WHERE PERIOD AS OF TIMESTAMP '2015-09-01 00:00:00';
SELECT ts, ND FROM national_grid_demand
WHERE PERIOD BETWEEN TIMESTAMP '2015-08-31 00:00:00' AND TIMESTAMP '2015-09-06 00:00:00';
Querying Series Data - Examples
Standard aggregations
 SUM | AVG | MIN | MAX | COUNT
Trend
 Calculates the non-stationary behavior of the column using exponential smoothing
Median
 Median value for a time series column
Mode
 Modal value for a time series column
Querying Series Data – Column Functions
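The column functions listed above can be pictured with a few lines of Python. This is only a sketch of the underlying statistics: the smoothing factor `alpha`, the function name, and the sample values are assumptions, and the slide does not say what exactly TREND returns.

```python
from statistics import median, multimode


def exponential_smoothing(values, alpha=0.5):
    """Simple exponential smoothing: each output blends the new value with
    the previous smoothed value, weighted by alpha."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical readings for one time series column.
wind = [10, 12, 12, 15, 30]

wind_median = median(wind)        # middle value of the sorted readings
wind_modes = multimode(wind)      # most frequent value(s)
wind_smoothed = exponential_smoothing(wind)
```

Smoothing damps the final spike in the sample data, which is what makes a longer-term tendency visible behind short-term noise.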
SELECT median(Embedded_Wind_Generation), median(Embedded_Wind_Capacity)
FROM national_grid_demand
WHERE PERIOD BETWEEN TIMESTAMP '2015-08-01 00:00:00' AND TIMESTAMP '2015-09-01 00:00:00';
SELECT TREND(Embedded_Wind_Capacity), TREND(Embedded_Solar_Capacity)
FROM national_grid_demand
WHERE PERIOD BETWEEN TIMESTAMP '2015-01-01 00:00:00' AND TIMESTAMP '2017-01-01 00:00:00';
Querying Series Data - Examples
Auto Correlation
• Calculates correlation coefficient for a given column and lag
• Allows you to look at a single column and find if there is a repeating pattern between the values of a column
Cross Correlation
• Calculates correlation coefficient for two columns and a given lag
• Find if there is a correlation between multiple columns of a table
Histogram
• Calculate the frequency distribution of the values for the specified column
• Shows what is the distribution of your values in your time series over a time period
Granulize
 Returns a new series based on an existing series by changing the interval between adjacent time
stamps
Querying Series Data – Table Functions
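What these table functions compute can be sketched in plain Python on an equidistant series. The function names, the equal-width binning, and the block-wise aggregation are assumptions for illustration; they are not Vora's implementation, and rounding modes such as ROUND_HALF_UP are left out.

```python
from statistics import mean


def auto_corr(values, lag):
    """Correlation coefficient between the series and itself shifted by lag."""
    m = mean(values)
    var = sum((v - m) ** 2 for v in values)
    cov = sum((values[i] - m) * (values[i + lag] - m)
              for i in range(len(values) - lag))
    return cov / var


def histogram(values, bins):
    """Frequency distribution over `bins` equal-width buckets."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # The maximum value falls into the last bucket.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def granulize(values, factor, agg=sum):
    """Coarsen the series: aggregate every `factor` adjacent points."""
    return [agg(values[i:i + factor]) for i in range(0, len(values), factor)]
```

For half-hourly data, `granulize(values, 48, sum)` would correspond to daily totals, and a high `auto_corr(values, 48)` would hint at a daily repeating pattern.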
SELECT * FROM HISTOGRAM (
SERIES national_grid_demand, 10, DESCRIPTOR( Embedded_Wind_Generation ) ) HIST;
SELECT * FROM AUTO_CORR (
SERIES national_grid_demand, 48, DESCRIPTOR( ND ) ) auto_corr_res;
SELECT ts, Embedded_Wind_Generation, Embedded_Wind_Capacity, Embedded_Solar_Generation,
Embedded_Solar_Capacity
FROM GRANULIZE(
SERIES national_grid_demand,
24 HOUR,
ROUND_HALF_UP,
SUM => DESCRIPTOR( Embedded_Wind_Generation, Embedded_Solar_Generation ),
AVG => DESCRIPTOR( Embedded_Wind_Capacity, Embedded_Solar_Capacity) )
WHERE PERIOD BETWEEN TIMESTAMP '2015-08-01 12:00:00' AND TIMESTAMP '2015-08-03 12:00:00';
Querying Series Data - Examples
SELECT * FROM
(``TS select statement 1`` USING … ) as T1,
(RT select statement USING … ) as T2
WHERE
T1.v1 = T2.v1
Join between TS table and Relational table
Take 15 minutes and work through the time series notebook in Apache Zeppelin
https://www.sap.com/developer/tutorials/vora-ova-zeppelin5.html
 SAP Vora Tools: http://<IP_ADDRESS>:9225
 Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Document Store Engine
Schemaless
 Very flexible
 Add/Remove fields to any document
Horizontally scalable (scale-out)
 Good for big data processing
Very low latency for key lookups
Easy to use (SQL syntax)
Document Store Overview
What does it store?
{
name: "Joe",
age: 25,
hobbies: ["soccer", "swimming"],
address: {
street: "4 Pennsylvania Plaza",
city: "New York"
}
}
(each entry is a Field: Value pair)
Documents are stored in Collections
{
name: "Jim",
age: 25,
hobbies: ["soccer", "swimming"],
address: {
street: "4 Pennsylvania Plaza",
city: "New York"
}
}
{
name: "Jane",
age: 25,
hobbies: ["soccer", "swimming"],
address: {
street: "4 Pennsylvania Plaza",
city: "New York"
}
}
{
name: "Joe",
age: 25,
hobbies: ["soccer", "swimming"],
address: {
street: "4 Pennsylvania Plaza",
city: "New York"
}
}
Schemaless – Missing Fields
{
firstName:"John",
lastName:"Smith",
age:22,
address: {
streetAddress: "21 2nd Street",
city: "New York",
state: "NY",
postalCode: "10021" }
}
{
firstName:"George",
lastName:"Brown",
licensePlate:"1ABC234",
address: {
streetAddress: "69 1st Street",
city: "Los Angeles",
state: "CA" }
}
Schemaless – Different Types
{
firstName:"John",
lastName:"Smith",
age:22,
address: {
streetAddress: "21 2nd Street",
city: "New York",
state: "NY",
postalCode: "10A021" }
}
{
firstName:"George",
lastName:"Brown",
age:22,
address: {
streetAddress: "69 1st Street",
city: "Los Angeles",
state: "CA",
postalCode: 34707 }
}
postalCode is a STRING in the first document and a NUMBER in the second.
Schemaless – Different Types
{
firstName:"John",
lastName:"Smith",
age:22,
address: { ... },
phoneNumber: [
{ type: "home", number: "212 555-1234" },
{ type: "fax", number: "646 555-4567" }
]
}
{
firstName:"George",
lastName:"Brown",
age:22,
address: { ... },
phoneNumber : {
countryCode:90,
number:"212 666-1234" }
}
phoneNumber is an ARRAY in the first document and an OBJECT in the second.
How to use Doc. Store: Create Collection
CREATE COLLECTION <collection_identifier>
[PARTITION BY <partition_scheme_identifier>]
USING com.sap.spark.engines OPTIONS (<list_of_options>);
e.g.
CREATE COLLECTION T
PARTITION BY PS2("state")
USING com.sap.spark.engines
OPTIONS (files 'path/to/file/to/load/some.json',
storagebackend "hdfs" );
Partitioning
CREATE PARTITION FUNCTION <partition_function_identifier>(<list_of_fields>)
AS HASH(<list_of_fields>) MIN PARTITIONS <num> MAX PARTITIONS <num>
USING com.sap.spark.engines;
CREATE PARTITION SCHEME <partition_scheme_identifier>
USING <partition_function_identifier> USING com.sap.spark.engines;
e.g.
CREATE PARTITION FUNCTION PF("_id") AS HASH("_id")
MIN PARTITIONS 3 MAX PARTITIONS 3
USING com.sap.spark.engines;
CREATE PARTITION SCHEME PS
USING PF USING com.sap.spark.engines;
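The effect of a HASH partition function with three partitions can be pictured as follows. The hash function used here (CRC32 from Python's standard library) is an assumption chosen for determinism; the slide does not say which hash Vora applies to the `_id` field.

```python
import zlib


def hash_partition(key, partitions=3):
    """Map a document key to one of `partitions` buckets.

    CRC32 is a stand-in hash (assumption); any deterministic hash gives the
    same property: equal keys always land in the same partition.
    """
    return zlib.crc32(str(key).encode("utf-8")) % partitions


# Hypothetical _id values spread over the three partitions.
placement = {doc_id: hash_partition(doc_id) for doc_id in ("a1", "b2", "c3", "d4")}
```

Because the partition depends only on the key, a lookup by `_id` can be routed to a single partition instead of scanning all of them.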
SQL Queries – Result Sets
SELECT
 Returns a relational result
SELECT {}
 Returns a JSON document
Use Dot Notation to access nested fields in a document
e.g. {
"author": {
"address": {
"street": "..."
},
...
},
...
}
Nested blocks are accessed via the array '[ ]' operator:
SELECT author.addresses[arraynum].street FROM <collection>
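How dot notation and the array operator walk a nested document can be sketched in Python. The helper name `get_path`, the `MISSING` sentinel, and zero-based array indexes are assumptions for illustration; the sentinel stands for the case where a field does not exist at all, as opposed to existing with a null value.

```python
MISSING = object()  # sentinel: the field is absent (different from None/null)


def get_path(doc, path):
    """Resolve a path such as 'author.addresses[0].street' against a dict."""
    current = doc
    for part in path.split("."):
        name, _, rest = part.partition("[")
        if not isinstance(current, dict) or name not in current:
            return MISSING
        current = current[name]
        while rest:  # consume one or more [index] suffixes
            index, _, rest = rest.partition("]")
            rest = rest.lstrip("[")
            i = int(index)
            if not isinstance(current, list) or i >= len(current):
                return MISSING
            current = current[i]
    return current


# Hypothetical document used only for this example.
doc = {
    "author": {
        "name": None,
        "addresses": [{"street": "Maple"}, {"street": "Oak"}],
    }
}
```

Here `get_path(doc, "author.name")` yields `None` (the field exists but is null), while `get_path(doc, "author.phone")` yields `MISSING`.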
Vora Doc Store
Queries
 Standard SQL support
 SELECT
– Returns a relational result
 SELECT { }
– Returns a JSON document
 IS MISSING – look for missing fields similar to IS NULL
 Standard SQL support – sorting, subselect, alias, joins
group by
sum, avg, min, max, count
upper, lower, modulus
Select – Different Types
Always STRINGIFY in case of type mismatch
 Even objects and arrays
SELECT s.firstName,
s.address.postalCode
FROM Students s
firstName   postalCode
"John"      "10A021"
"George"    "34707"
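The stringify rule for a relational result can be mimicked in Python: if the values collected for one column do not share a single type, everything that is not already a string is serialized. The function name and the use of JSON serialization for objects and arrays are assumptions about the presentation, not Vora's code.

```python
import json


def stringify_column(values):
    """Return the column unchanged if all values share one type; otherwise
    convert every non-string value (numbers, objects, arrays) to a string."""
    if len({type(v) for v in values}) <= 1:
        return list(values)
    return [v if isinstance(v, str) else json.dumps(v) for v in values]


# postalCode from the two student documents: one string, one number.
postal_codes = stringify_column(["10A021", 34707])
```

The same rule covers the phoneNumber case, where one document holds an array and the other an object: both end up as strings in the relational result.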
SQL Queries - Examples
SELECT {"ct": "city"} FROM <collection>
WHERE employeecollection.address.street = 'Maple'
SELECT city as ct FROM <collection>
WHERE employeecollection.address.street = 'Maple'
SELECT * FROM <collection>
WHERE author.address IS MISSING using com.sap.spark.engines;
SELECT * FROM <collection>
WHERE author.address[1].state='NY'
SQL Queries – Expressions and Aggregates
Standard Aggregate Functions are Supported
 count, min, max, avg, sum, etc…
GROUP BY
 Behaves as expected for a relational result
 For a JSON result, a new document is created where the field of the GROUP BY clause becomes the id of the
resulting document
 The _id field of the new documents is an object of grouped field names:
SELECT { _id: _id, aggr: avg(cost) } FROM <collection>
GROUP BY productgroup, productname using com.sap.spark.engines;
_id: {productgroup:value, productname:value}
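The shape of a JSON result for GROUP BY can be sketched in Python: each group becomes one new document whose `_id` is an object built from the grouped fields. The helper name and the sample documents are assumptions for illustration.

```python
from collections import defaultdict


def group_avg(docs, keys, field):
    """Group documents by `keys` and average `field` per group.

    Each result document carries an _id object made of the grouped field
    names and their values, plus the aggregate under 'aggr'.
    """
    groups = defaultdict(list)
    for d in docs:
        groups[tuple(d[k] for k in keys)].append(d[field])
    return [
        {"_id": dict(zip(keys, key)), "aggr": sum(vals) / len(vals)}
        for key, vals in groups.items()
    ]


# Hypothetical product documents.
products = [
    {"productgroup": "A", "productname": "x", "cost": 10},
    {"productgroup": "A", "productname": "x", "cost": 20},
    {"productgroup": "B", "productname": "y", "cost": 5},
]
result = group_avg(products, ["productgroup", "productname"], "cost")
```

The first result document is `{"_id": {"productgroup": "A", "productname": "x"}, "aggr": 15.0}`, matching the `_id: {productgroup:value, productname:value}` pattern shown above.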
employees.json
Take 10 minutes and work through the doc store notebook in Apache Zeppelin
https://www.sap.com/developer/tutorials/vora-ova-zeppelin7.html
 SAP Vora Tools: http://<IP_ADDRESS>:9225
 Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Graph Engine
Graph Databases
• Relationships between entities are of key interest
• Network view on complex structures
• Complex search patterns, e.g. subgraph search and link predictions
Application Areas
• Social networks
• Business networks (Users&Products, Learners&Contents)
• Knowledge graphs (Entities and concepts/relationships around them: Apple)
• Recommendation systems
• Org-/project structures (hierarchies / DAGs)
• Supply chain management
Graph Databases
Building Knowledge Graph
Build and infer relations between data points
[Figure: example knowledge graph. Nodes: SAP, Puntis (Product Manager), Balaji (Product Manager), Machine Learning, Machine Teaching. Edge labels: hasOrgUnit, hasLearningItem.]
SAP VORA Graph: Scalable Graph Analytics
Graph Analytics for
Enterprise Applications
Expressive, declarative graph query
language based on SQL and graph
pattern matching
High Performance for
Real-Time Applications
Native in-memory graph store and
light-weight query engine for super
fast query execution
Flexible Data Import from
Various Sources
Easy mapping from relational
tables to graphs. Direct loading
from files, HDFS, VORA catalog
Platform and
Application Integration
Part of VORA, connect to HANA
Distributed Engine
for Scale-Out Scenarios
Analyze billion-node graphs using
distributed version of the graph
store and query engine
Property Graph Model
Nodes have a type, a unique numeric ID
(64-bit integer > 0), and a set of
primitive-valued properties
Edges have a type and connect two
nodes
Directed and undirected graphs are
supported
Node properties have a type and a
primitive-typed value (nullable)
Supported primitive datatypes:
 64-bit signed integers
 variable-length strings
 double-precision floats
Graphs have no fixed schema
Properties can be indexed
Mapping Edge Properties
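[Figure: three drawings of a marriage between two :PERSON nodes.
(a) A :MARRIED edge carrying the property DATE='01-01-72'.
(b) An auxiliary :MARRIAGE node with DATE='01-01-72', connected to each :PERSON node by a :MARRIED edge.
(c) Or: a :MARRIED edge between the two :PERSON nodes, plus an auxiliary :MARRIAGE node with DATE='01-01-72' connected by :MARRIED_E edges.]
- For performance reasons, the design decision was not to have edge properties
- Edge properties can be modeled using auxiliary nodes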
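The auxiliary-node pattern can be sketched with plain Python containers. The data layout, the helper names, and the person names are assumptions for illustration; the point is only that the date lives on a :MARRIAGE node because edges themselves carry no properties.

```python
def add_edge_with_properties(nodes, edges, src, dst, edge_type,
                             aux_id, aux_type, properties):
    """Model a 'property edge' src -> dst as an auxiliary node that holds the
    properties and is linked from both endpoints by plain edges."""
    nodes[aux_id] = {"type": aux_type, "properties": dict(properties)}
    edges.append((src, edge_type, aux_id))
    edges.append((dst, edge_type, aux_id))


def edge_properties(nodes, edges, a, b, edge_type, aux_type):
    """Find the auxiliary node shared by a and b and return its properties."""
    linked_a = {t for s, e, t in edges if s == a and e == edge_type}
    linked_b = {t for s, e, t in edges if s == b and e == edge_type}
    for aux in linked_a & linked_b:
        if nodes[aux]["type"] == aux_type:
            return nodes[aux]["properties"]
    return None


# Hypothetical sample data: two persons and their marriage.
nodes = {
    1: {"type": "PERSON", "properties": {"NAME": "Ann"}},
    2: {"type": "PERSON", "properties": {"NAME": "Bob"}},
}
edges = []  # (source_id, edge_type, target_id); no properties on edges
add_edge_with_properties(nodes, edges, 1, 2, "MARRIED",
                         aux_id=3, aux_type="MARRIAGE",
                         properties={"DATE": "01-01-72"})
```

The cost of this modeling is one extra node and one extra hop per relationship that needs properties.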
Storing Graphs in Files
Graphs can be loaded from JSG-files (local disk or HDFS)
When loading from local disk, the file must be present on all VORA nodes at the same path
JSG-format:
 Optional JSON-header with metadata # { … }
 Every line in the body is a JSON array representing one node, consisting of:
[ <node-type>, <node-id>, <properties>, <edges> ]
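A reader for this line-oriented layout might look like the Python sketch below. It relies only on what is stated above (an optional header line starting with `#` followed by JSON, then one four-element JSON array per node). The inner shape of `<properties>` and `<edges>` is not given here, so the sample lines use placeholder shapes that are assumptions, and the parser passes both through without interpreting them.

```python
import json


def parse_jsg(lines):
    """Parse JSG-style text lines into (header, nodes).

    header is the decoded JSON metadata or None; each node is a dict with
    the four positional entries of its line.
    """
    header = None
    nodes = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            header = json.loads(line[1:])
            continue
        node_type, node_id, properties, edges = json.loads(line)
        nodes.append({"type": node_type, "id": node_id,
                      "properties": properties, "edges": edges})
    return header, nodes


# Hypothetical content; the property and edge shapes are placeholders.
sample = [
    '# {"name": "demo"}',
    '["PERSON", 1, {"NAME": "Ann"}, {"KNOWS": [2]}]',
    '["PERSON", 2, {"NAME": "Bob"}, {}]',
]
header, parsed_nodes = parse_jsg(sample)
```

Because every node sits on its own line, such a file can be split and read in parallel without parsing one large JSON document.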
Creating VORA Graphs from Spark
Graph: Partitioned and un-partitioned
1. Create a BLOCK partition function:
• Recommended PARTITIONS: (number-of-VORA-nodes * cores-per-VORA-node)
• Recommended BLOCKSIZE: 1000
2. Create a partition scheme
3. Create a graph using the partition scheme and a data source. The partition scheme must get NODEID as its parameter.
Create Graph
 SQL-based graph query language
 Aggregations, sorting, limits, and expressions inherited from SQL
Graph Query Language in a Nutshell
Basic Concepts
Select node type: SELECT NAME FROM ACTOR USING GRAPH IMDB (any property from one node type)
Select all nodes: SELECT NAME FROM ANY USING GRAPH IMDB (any property from ANY node type)
Select paths: SELECT PLAYS_IN.DIRECTED_BY.NAME FROM ACTOR USING GRAPH IMDB (traverse nodes)
Node type and ID: SELECT NODETYPE FROM ANY WHERE NODEID > 5 USING GRAPH IMDB (node ID and node type)
Built-in Graph Functions
Node degree: SELECT DEGREE(A) FROM ACTOR A USING GRAPH IMDB
Distance: SELECT DISTANCE(A,B) FROM ACTOR A, ACTOR B USING GRAPH IMDB WHERE…
Connected components: SELECT CONNECTED_COMPONENT(A) FROM ACTOR A USING GRAPH IMDB
Graph Pattern Matching
Check edges: SELECT A.NAME, M.TITLE FROM ACTOR A, MOVIE M USING GRAPH IMDB WHERE M IN A.PLAYS_IN
Check distance: SELECT A.NAME, B.NAME FROM ACTOR A, ACTOR B USING GRAPH IMDB WHERE DISTANCE(A,B) < 5
SELECT
Aggregation Functions
Graph Functions
Degree Function
Connected Component Function
Connected Component Function
SELECT A.NAME, CONNECTED_COMPONENT(STRONG A) AS SCC,
CONNECTED_COMPONENT(WEAK A) AS WCC
FROM ACTOR A USING GRAPH MOVIES
A.NAME           SCC  WCC
Brad Pitt        1    1
Angelina Jolie   2    1
Shah Rukh Khan   3    3
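The difference between strong and weak components can be reproduced on a tiny example in Python. The single directed edge from Brad Pitt to Angelina Jolie is a hypothetical choice that yields the grouping in the result table; how Vora numbers components is not stated, so labelling each component by the 1-based position of its first member is an assumption.

```python
def reachable(adj, start):
    """All nodes reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def connected_components(nodes, edges, strong):
    """Map each node to a component label.

    strong=True : two nodes share a component only if each reaches the other
                  following edge directions.
    strong=False: edge directions are ignored (weak components).
    """
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        if not strong:
            adj[v].add(u)
    reach = {n: reachable(adj, n) for n in nodes}
    labels = {}
    for idx, n in enumerate(nodes, start=1):
        if n in labels:
            continue
        for m in nodes:
            if m in reach[n] and n in reach[m]:
                labels[m] = idx
    return labels


actors = ["Brad Pitt", "Angelina Jolie", "Shah Rukh Khan"]
links = [("Brad Pitt", "Angelina Jolie")]  # hypothetical one-way edge
scc = connected_components(actors, links, strong=True)
wcc = connected_components(actors, links, strong=False)
```

With a one-way edge the two actors are weakly but not strongly connected, which is why their WCC values agree while their SCC values differ.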
The DISTANCE function calculates the distance of the directed or undirected path between two nodes:
• Sum of weights
• Number of hops between the source and destination Node.
Distance Function (Shortest Path)
Example, number of hops (S: 1, D: 4):
1-2-5-4 (3)
1-5-4 (2), the shortest path
1-2-3-4 (3)
1-5-2-3-4 (4)
Example, sum of weights (S: A, D: B):
A-B, w: 4
A-C-B, w: 3 (2+1), the shortest path
A-C-D-B, w: 8 (2+3+3)
A-C-D-E-B, w: 10 (2+3+1+4)
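Both examples can be checked with a short shortest-path sketch in Python. The edge lists are reconstructed from the paths listed above and treated as undirected, which is an assumption; Dijkstra's algorithm is used as a generic technique, without any claim about how Vora computes DISTANCE. With unit weights the result is the hop count.

```python
import heapq


def build_graph(edges):
    """Undirected adjacency map {node: {neighbour: weight}}."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def distance(adj, source, target):
    """Length of the shortest path (Dijkstra), or None if unreachable."""
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return None


# Hop example: every edge has weight 1.
hops = build_graph([(1, 2, 1), (2, 5, 1), (5, 4, 1), (1, 5, 1), (2, 3, 1), (3, 4, 1)])
# Weighted example.
weighted = build_graph([("A", "B", 4), ("A", "C", 2), ("C", "B", 1),
                        ("C", "D", 3), ("D", "B", 3), ("D", "E", 1), ("E", "B", 4)])
```

`distance(hops, 1, 4)` gives the 2 hops of path 1-5-4, and `distance(weighted, "A", "B")` gives 3 via A-C-B, even though the direct edge A-B needs fewer hops.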
Take 15 minutes and work through the graph notebook in Apache Zeppelin
https://www.sap.com/developer/tutorials/vora-ova-zeppelin4.html
 SAP Vora Tools: http://<IP_ADDRESS>:9225
 Apache Zeppelin: http://<IP_ADDRESS>:9099
Check It Out!
SAP VORA 1.4
Data Modeler
http://<DNS_NAME_OF_JUMPBOX_NODE>:9225
SAP VORA 1.4
Resources
TOP REASONS TO CHOOSE VORA
1. Integrated Solution
Combine relational, time series, JSON, and graph processing. No need to move data between systems
2. SQL Access
SQL access to relational, time series, JSON, and graph computing engines
3. Intuitive Interface
Web interface with SQL editor, data browser, and drag-and-drop to interact with Vora computing engines
4. Open Consumption
Open framework supports data integration via JDBC, OData and RESTful services
5. Metadata Persistence
Enabled by Distributed Transaction Log that can be recovered when needed
6. Business Semantics
Out-of-the-box hierarchy processing and currency conversion
7. HANA Integration
High-performance interactive analytics across enterprise data in HANA and Hadoop data
Documentation
SAP VORA website http://sap.com/vora
SAP VORA documentation https://help.sap.com/viewer/p/SAP_VORA
- Installation and Administration Guide
- Developer Guide
- Troubleshooting Guide
- Sizing Guide
Product Availability Matrix (PAM) incl. release strategy
https://apps.support.sap.com/sap/support/pam/pam.html?#pvnr=73555000100900000415
SAP Notes (component HAN-VO*) https://launchpad.support.sap.com/#/solutions/notes/
- 2303668 - SAP VORA 1.3 Release Note
- 2405200 - Release Restrictions for SAP VORA 1.3
- 2213226 - Prerequisites for installing SAP VORA: Operating Systems and Hadoop Components
- 2220859 - SAP VORA Documentation Corrections
Further help
- For product issues -> SAP customer message in components HAN-VO*
- Community-based help on Stack Overflow (no SLA, non-critical issues only) http://stackoverflow.com/questions/tagged/vora
Resources - documentation
SAP Vora CodeJam

Business intelligence in the era of big data
 
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AGSap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
 
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AGSap Executive Keynote   Dr. Wieland Schreiner, EVP - SAP AG
Sap Executive Keynote Dr. Wieland Schreiner, EVP - SAP AG
 
SAP Data Hub – What is it, and what’s new? (Sefan Linders)
SAP Data Hub – What is it, and what’s new? (Sefan Linders)SAP Data Hub – What is it, and what’s new? (Sefan Linders)
SAP Data Hub – What is it, and what’s new? (Sefan Linders)
 
Analytics Products L2 public 2020-23 Black.pptx
Analytics Products L2 public 2020-23 Black.pptxAnalytics Products L2 public 2020-23 Black.pptx
Analytics Products L2 public 2020-23 Black.pptx
 
Spark Summit Keynote with Ken Tsai
Spark Summit Keynote with Ken TsaiSpark Summit Keynote with Ken Tsai
Spark Summit Keynote with Ken Tsai
 
Spark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business OperationsSpark Usage in Enterprise Business Operations
Spark Usage in Enterprise Business Operations
 
Sapwebinar2 how 2transition2s4hanagetyourdatacleanandkeepitclean1569951002523
Sapwebinar2 how 2transition2s4hanagetyourdatacleanandkeepitclean1569951002523Sapwebinar2 how 2transition2s4hanagetyourdatacleanandkeepitclean1569951002523
Sapwebinar2 how 2transition2s4hanagetyourdatacleanandkeepitclean1569951002523
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
 
Building Custom Advanced Analytics Applications with SAP HANA
Building Custom Advanced Analytics Applications with SAP HANABuilding Custom Advanced Analytics Applications with SAP HANA
Building Custom Advanced Analytics Applications with SAP HANA
 
BPI_Topic #3_Introduction to SAP S4HANA (1)-merged (1).pdf
BPI_Topic #3_Introduction to SAP S4HANA (1)-merged (1).pdfBPI_Topic #3_Introduction to SAP S4HANA (1)-merged (1).pdf
BPI_Topic #3_Introduction to SAP S4HANA (1)-merged (1).pdf
 
SAP Inside Track Walldorf 2018 - Demistify SAP Leonardo Machine Learning Foun...
SAP Inside Track Walldorf 2018 - Demistify SAP Leonardo Machine Learning Foun...SAP Inside Track Walldorf 2018 - Demistify SAP Leonardo Machine Learning Foun...
SAP Inside Track Walldorf 2018 - Demistify SAP Leonardo Machine Learning Foun...
 

Mehr von Vitaliy Rudnytskiy

Gentle Introduction into Geospatial (using SQL in SAP HANA)
Gentle Introduction into Geospatial (using SQL in SAP HANA)Gentle Introduction into Geospatial (using SQL in SAP HANA)
Gentle Introduction into Geospatial (using SQL in SAP HANA)Vitaliy Rudnytskiy
 
Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Vitaliy Rudnytskiy
 
Mobile of People and Internet of Things: State of the Union
Mobile of People and Internet of Things: State of the UnionMobile of People and Internet of Things: State of the Union
Mobile of People and Internet of Things: State of the UnionVitaliy Rudnytskiy
 
Quantify your drive: IoT on a personal scale with SAP technologies
Quantify your drive: IoT on a personal scale with SAP technologiesQuantify your drive: IoT on a personal scale with SAP technologies
Quantify your drive: IoT on a personal scale with SAP technologiesVitaliy Rudnytskiy
 
Developing and Deploying Applications on the SAP HANA Platform
Developing and Deploying Applications on the SAP HANA PlatformDeveloping and Deploying Applications on the SAP HANA Platform
Developing and Deploying Applications on the SAP HANA PlatformVitaliy Rudnytskiy
 
Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Vitaliy Rudnytskiy
 
SAP Developer Center - March 2016 update
SAP Developer Center - March 2016 updateSAP Developer Center - March 2016 update
SAP Developer Center - March 2016 updateVitaliy Rudnytskiy
 
SAP Tech Innovation for Business - 2014.05
SAP Tech Innovation for Business - 2014.05SAP Tech Innovation for Business - 2014.05
SAP Tech Innovation for Business - 2014.05Vitaliy Rudnytskiy
 
SAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast DataSAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast DataVitaliy Rudnytskiy
 
SAP CodeJam Mobile - Poland 2013
SAP CodeJam Mobile - Poland 2013SAP CodeJam Mobile - Poland 2013
SAP CodeJam Mobile - Poland 2013Vitaliy Rudnytskiy
 
SAP Store (in Polish / po polsku)
SAP Store (in Polish / po polsku)SAP Store (in Polish / po polsku)
SAP Store (in Polish / po polsku)Vitaliy Rudnytskiy
 

Mehr von Vitaliy Rudnytskiy (20)

SIT Wrocław 2019 - Intro
SIT Wrocław 2019 - IntroSIT Wrocław 2019 - Intro
SIT Wrocław 2019 - Intro
 
Wroclaw SAP Meetup 2019/02
Wroclaw SAP Meetup 2019/02Wroclaw SAP Meetup 2019/02
Wroclaw SAP Meetup 2019/02
 
Wrocław SAP Meetup - 2018/02
Wrocław SAP Meetup - 2018/02Wrocław SAP Meetup - 2018/02
Wrocław SAP Meetup - 2018/02
 
Gentle Introduction into Geospatial (using SQL in SAP HANA)
Gentle Introduction into Geospatial (using SQL in SAP HANA)Gentle Introduction into Geospatial (using SQL in SAP HANA)
Gentle Introduction into Geospatial (using SQL in SAP HANA)
 
IoT at Scale
IoT at ScaleIoT at Scale
IoT at Scale
 
Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!
 
Wroclaw SAP Meetup 2017/10
Wroclaw SAP Meetup 2017/10Wroclaw SAP Meetup 2017/10
Wroclaw SAP Meetup 2017/10
 
Mobile of People and Internet of Things: State of the Union
Mobile of People and Internet of Things: State of the UnionMobile of People and Internet of Things: State of the Union
Mobile of People and Internet of Things: State of the Union
 
Wroclaw SAP Meetup - 2017/01
Wroclaw SAP Meetup - 2017/01Wroclaw SAP Meetup - 2017/01
Wroclaw SAP Meetup - 2017/01
 
Wroclaw SAP Meetup - 2016/10
Wroclaw SAP Meetup - 2016/10Wroclaw SAP Meetup - 2016/10
Wroclaw SAP Meetup - 2016/10
 
Quantify your drive: IoT on a personal scale with SAP technologies
Quantify your drive: IoT on a personal scale with SAP technologiesQuantify your drive: IoT on a personal scale with SAP technologies
Quantify your drive: IoT on a personal scale with SAP technologies
 
Developing and Deploying Applications on the SAP HANA Platform
Developing and Deploying Applications on the SAP HANA PlatformDeveloping and Deploying Applications on the SAP HANA Platform
Developing and Deploying Applications on the SAP HANA Platform
 
OpenUI5
OpenUI5OpenUI5
OpenUI5
 
Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!Welcome to SAP Community of Developers!
Welcome to SAP Community of Developers!
 
SAP Developer Center - March 2016 update
SAP Developer Center - March 2016 updateSAP Developer Center - March 2016 update
SAP Developer Center - March 2016 update
 
SAP Tech Innovation for Business - 2014.05
SAP Tech Innovation for Business - 2014.05SAP Tech Innovation for Business - 2014.05
SAP Tech Innovation for Business - 2014.05
 
SAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast DataSAP HANA - Big Data and Fast Data
SAP HANA - Big Data and Fast Data
 
SAP CodeJam Mobile - Poland 2013
SAP CodeJam Mobile - Poland 2013SAP CodeJam Mobile - Poland 2013
SAP CodeJam Mobile - Poland 2013
 
SAP Store (in Polish / po polsku)
SAP Store (in Polish / po polsku)SAP Store (in Polish / po polsku)
SAP Store (in Polish / po polsku)
 
SAP Runs SAP Mobile
SAP Runs SAP MobileSAP Runs SAP Mobile
SAP Runs SAP Mobile
 

Kürzlich hochgeladen

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 

Kürzlich hochgeladen (20)

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 

SAP Vora CodeJam

  • 1. SAP VORA 1.4, July 2017. Puntis Jifroodian-Haghighi, SAP Vora Product Management; Jason Hinsperger, SAP Vora Product Management; Vitaliy Rudnytskiy, SAP Developer Relations
  • 2. Agenda: Introduction to SAP Vora and Big Data; Vora Installation and Configuration; Tables and Views in Vora; Break; Different Data Sources; Hierarchies; Vora Graph Analysis; Break; Time Series, Document Store and Disk Engine; Wrap-up
  • 4. For SAP, Big Data Expands the Customer Data Footprint (diagram). Labels: Enterprise Data, Data Lakes, BW/4HANA, ASE, IQ, HANA, Data Lake
  • 5. What is a Data Lake and why the rise in popularity? A Data Lake is a massive, easily accessible, centralized repository of data stored on commodity hardware. Data is stored in its raw format and transformed when needed. Data Lakes can scale to the petabyte (PB) range, thus becoming the de facto standard for Big Data initiatives. Scalable: manage more data than ever before. Store any data: supports all data types. Flexible: analyze and simulate any business scenario. Dynamic: deploy on premise or in the cloud.
  • 6. Hadoop: Cheaper, Faster, Effective Big Data Storage and Processing. Highly scalable: handles terabytes to petabytes and beyond. Affordable: open source software, runs on commodity machines. Flexible: handles all varieties of data, not just structured data. Tackles the data challenges that drive modern business: Yahoo!, Netflix and Facebook all use Hadoop to run core business.
  • 7. Challenges with Big Data: Achieving Value, Skills, Capabilities, Governance. Source: Gartner Big Data Adoption Survey insights, November 2016, https://www.gartner.com/webinar/3451618
  • 8. Unlock Business Potential from Your Big Data. Insights from one single solution: in-memory distributed computing engines (Relational, Time Series, Graph, JSON/Doc); disk-to-memory accelerator. Enterprise-ready: production-ready, integrated solution; support, versioning and compatibility between different Hadoop components; seamless integration with SAP HANA. Easier to use: intuitive web interface; one SQL entry point; connect with familiar tools.
  • 9. SAP Vora Usage Patterns. DW/Data Tiering: economically maintain context data without compromising on performance; immediate results from petabytes of contextual data without compromising core business systems. IoT: immediately sense and respond to data streams with Vora’s embeddable design; off-load processing to edge devices and simplify data synchronization with gateways and data warehouses. Data Lineage & Compliance: query archived data without losing data lineage and control; deliver timely, accurate compliance reporting; perform complete audit trail analysis from operational to archive data. Cloud: analysis of large volumes of external data without building on-premise infrastructure. Data Lake: faster analytics with compiled query for distributed processing; better insights with drill-down root-cause analysis.
  • 10. SAP Vora for Financial Services/Capital Markets (columns: Use Case; Challenge; Why SAP HANA Vora?). Risk Management. Challenge: significant compute capacity required for valuations. Why: store larger volumes of data in Hadoop and leverage Spark + Vora to deliver massive parallel compute power + enterprise-grade analytics. Fraud Detection. Challenge: early detection of rogue/illicit trading. Why: leverage Vora to correlate seemingly unrelated transactions to identify fraudulent activity. Broker/Trade Compliance. Challenge: detect fraudulent transactions earlier to improve accuracy and reduce costs. Why: link trading and financial (ERP/S/4) transactions with unstructured content (emails/instant messaging/sentiment data). Anti-Money Laundering. Challenge: detect illegal activities faster and with improved accuracy. Why: ingest, store and process large amounts of data and leverage machine learning algorithms to detect money laundering activities. Customer 360°. Challenge: reduce customer churn; grow the business with targeted marketing campaigns. Why: better understand your customer behaviors and needs (sentiment analysis) to proactively make relevant offers and reduce churn.
  • 11. SAP Vora Telco: Network capacity planning/Real-time bandwidth allocation. Network capacity planning: by analyzing Call Detail Records (CDRs) and network loads, telcos can plan infrastructure expansion with greater precision. Cellphone service improvement: ability to analyze, understand and fix instances of poor cellular service, e.g. dropped calls or poor audio. Targeted network maintenance/upgrades: analysis of how cable network congestion affects churn, and where exactly network upgrades produce the most incremental revenue. Real-time bandwidth allocation: with real-time packet inspection, operators can steer traffic and optimize network quality of service (QoS) in real time in an attempt to maintain the best service quality.
  • 12. SAP Vora Oil & Gas: Environmental Safety/Oil Production/Predictive Maintenance. Environmental safety: anomalies in drilling can be identified in real time. Well problems can be detected before they become serious and drills can be shut down proactively to prevent environmental risks. Oil production forecasting: analysis of seismic, drilling and production data to enhance oil extraction from existing wells and forecast oil production. Identification of potential drilling errors and equipment failures by analyzing sensor data from equipment (drill heads, down hole sensors, etc.) as well as geological data; understand what equipment works best in each environment. Predictive analysis: identification of events or patterns that could indicate an imminent security threat or cyber-terrorist act, in order to keep personnel, property and equipment safe.
  • 13. SAP Vora Retail: Consumer Buying Patterns/Dynamic Pricing. 360° customer insight: improve customer satisfaction and increase sales opportunities by integrating all relevant customer data across online transactions, POS transactions, social media, and customer service interactions into one single view. Upsell/cross-sell recommendations: increase online purchases by recommending relevant products and promotions in real time. Retailers can recommend products based on what other similar customers have bought. Clickstream analysis: helps retailers to better understand how consumers make online purchase decisions, which in turn helps them to optimize web pages and offers to increase conversion, resulting in lower cart abandonment.
  • 14. SAP Vora Manufacturing: Potential Use Cases. Predictive maintenance: minimize Non-Productive Time (NPT) by monitoring equipment or product utilization in a live environment to identify patterns that indicate imminent failure. Assembly line quality assurance: take measurements of work-in-progress products to find manufacturing defects as early as possible, while also identifying any potential process or design flaws. Analysis of Real-time parts flow monitoring: attaching sensors to all parts in the production process and tracking them in real time gives manufacturers a real-time view of their production process. Product configuration planning: helps accelerate production by offering fast delivery times for the manufacture of millions of different product configurations.
  • 16. 3xV of Big Data
  • 17. Hadoop Overview
  • 18. SAP Vora in the Hadoop Ecosystem
  • 19. Apache Spark. Apache Spark is a fast and general engine for large-scale data processing. Can run along with, or independently from, Hadoop. Can work against data from many sources: HDFS, local files, Amazon S3, etc. Supports Java, Scala or Python based applications. Works on data in an in-memory fashion: 10-100 times faster than Hive on Map/Reduce. Supports SQL access. Includes a streaming engine.
  • 20. SAP Vora Complementing Spark Framework: what does SAP Vora provide which Spark does not? Integration with enterprise data: move processing closer to the data by extending Spark to allow the computation of full logical plans at the data source (whether it's HANA or the Vora engine); the result is a performance increase (but dependent on the actual query). Bidirectional virtual data access from HANA to Hadoop/Spark, extending beyond SDA capabilities, with no data duplication. Enterprise analytics additions to SparkSQL: hierarchies, currency conversion. Unified access layer built on SQL, allowing integrated analysis of disparate datasets. Simplified data modeling on HDFS/S3 data sets using a web-based modeler.
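The first point on slide 20, moving processing closer to the data, can be illustrated with a toy sketch. It uses no Spark or Vora API: `DataSource`, `scan`, `scan_where` and the sample `sales` rows are hypothetical names invented for this illustration. It only shows why evaluating a filter at the source ships fewer rows than filtering after a full scan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class DataSource:
    """Hypothetical stand-in for a remote source (not a real HANA or Vora client)."""
    rows: List[Row]
    rows_shipped: int = 0  # counts rows sent back to the caller

    def scan(self) -> List[Row]:
        # No pushdown: every row leaves the source.
        self.rows_shipped += len(self.rows)
        return list(self.rows)

    def scan_where(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Pushdown: the filter runs at the source, only matches are shipped.
        matches = [r for r in self.rows if predicate(r)]
        self.rows_shipped += len(matches)
        return matches


def is_emea(row: Row) -> bool:
    return row["region"] == "EMEA"


# Made-up sample data: every fourth row belongs to the EMEA region.
sales = [{"id": i, "region": "EMEA" if i % 4 == 0 else "APJ"} for i in range(1000)]

without_pushdown = DataSource(sales)
result_a = [r for r in without_pushdown.scan() if is_emea(r)]

with_pushdown = DataSource(sales)
result_b = with_pushdown.scan_where(is_emea)

print(without_pushdown.rows_shipped, with_pushdown.rows_shipped)
```

Both variants return the same rows; only the amount of data moved differs, which matches the slide's caveat that the gain depends on the actual query.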
  • 21. Distributed Computing Solution for the Digital Enterprise (architecture diagram). Labels: SAP VORA (Relational, Time Series, Graph, Doc Store; Disk-to-Memory Accelerator; Distributed Transaction Log; Data Modeler); distributed computing cluster with Vora and Spark on each node over Hadoop files; Data Science, Predictive, Business Intelligence, Visualization Apps; Other Apps; Spark; Hadoop; optional: SAP HANA Platform (In-Memory Store).
  • 22. Key Capabilities of SAP VORA: native "database" in Hadoop; multiple engines (relational (OLAP) in-memory, time series, doc store, graphs); intuitive tools; tight HANA integration (diagram scale: 0.1 sec to ∞, HANA to Hadoop).
  • 23. Security. Full support for Kerberos (AD, MIT) enabled Hadoop landscapes. Encrypted HDFS support: access Hadoop transparent encryption zones. Consul and Nomad TLS (Transport Layer Security) and ACL (Access Control Lists) support for Vora services. Secure access to various Vora components (engines, Tools UI, catalog).
  • 24. SAP Vora: Key Components
  • 25. Cluster Management Tools: to administer and monitor the Hadoop landscape. Cloudera Manager, MapR Control System, Ambari.
  • 26. Vora Nodes (diagram). Labels: Vora Spark Extension; SAP VORA; Control Nodes (1..few): Transaction Coordinator, Metadata, Scheduler, Discovery, Landscape; DLog; Compute Nodes (1..many): Relational, Documents, Graph, Time Series, Disk; Persistent Storage: S3 / Swift / HDFS / ORC / Parquet.
  • 27. Vora Manager Components. Vora Manager UI (http://<manager_node>:19000): shows the status of all services; start/stop all services; configuration management (specify node assignments, change parameters). Consul: Vora Discovery Service UI (http://<discovery_server_node>:8500/ui); know where the services are. Nomad: process scheduler and resource manager; start/stop/restart of Vora services (if a service fails, Nomad will automatically keep trying to restart it until it succeeds); manage node assignment.
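As a small illustration of slide 27, the sketch below builds the two UI addresses from the ports given on the slide and models the "keep trying to restart until it succeeds" behaviour with a plain loop. It does not call Nomad, Consul or Vora Manager; the function names, the host name `vora-node-1` and the simulated service are assumptions made for this example.

```python
import time
from typing import Callable


def manager_url(manager_node: str) -> str:
    # Port 19000 is the Vora Manager UI port given on the slide.
    return f"http://{manager_node}:19000"


def discovery_url(discovery_server_node: str) -> str:
    # Port 8500 and the /ui path are the discovery service UI address from the slide.
    return f"http://{discovery_server_node}:8500/ui"


def restart_until_running(start: Callable[[], bool], delay_s: float = 0.0) -> int:
    """Call start() repeatedly until it reports success; return the attempt count.

    Toy model of the retry behaviour described on the slide, not a Nomad client.
    """
    attempts = 0
    while True:
        attempts += 1
        if start():
            return attempts
        time.sleep(delay_s)


# Hypothetical service that fails twice before it comes up.
outcomes = iter([False, False, True])
attempts_needed = restart_until_running(lambda: next(outcomes))

print(manager_url("vora-node-1"), discovery_url("vora-node-1"), attempts_needed)
```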
  • 28. 28INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Manager UI: Home Page
  • 29. 29INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Manager UI: User Management
  • 30. 30INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Manager UI: Nodes
  • 31. 31INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Manager UI: Services
  • 32. 32INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ SAP Vora Nodes: Control Nodes Vora Spark Extension SAP VORA Transaction Coordinator Metadata Scheduler Discovery Landscape Control Nodes (1..few)
  • 33. 33INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ • Transaction Coordinator • manages user transactions • Transaction Broker • enforces consistent (meta)data modifications • Landscape Manager • Controls data partitioning and placement across different engines • Lock Manager • Provides a distributed read-write lock mechanism for concurrent load statements to avoid loading the same partition multiple times • A driver for query execution with user session semantics Control Nodes
  • 34. 34INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Query Processing Query Processor Catalog Landscape Manager Host Assignment Scheduler User Query Vora Engines (graph, doc, series, disk) Result Set
  • 35. 35INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Nodes: Compute Nodes Vora Spark Extension SAP VORA Compute Nodes (1..many) Relational Documents Graph Time Series Disc Relational Documents Graph Time Series Disc
  • 36. Vora Analysis Engines. • Relational Engine – lets you frame your data as relational entities and query them using SQL. • Graph Engine – SAP VORA embeds an in-memory graph database for real-time graph analysis. The primary focus is on complex read-only analytical queries on very large graphs. • Time Series – SAP VORA provides a highly distributed time series analysis engine that supports storing and analyzing time series data. It offers efficient (in memory and speed) time series compression and features such as standard aggregation, granularization, and advanced analysis, and it lets you join relational data with series data to build efficient SQL models in Hadoop and other Big Data environments. • Document Store – SAP VORA introduced NoSQL features such as storing JSON documents with the Document Store in the SAP VORA 1.3 release. The DocStore supports schema-less tables, allowing you to flexibly add or remove fields from any document, and it scales horizontally. • Disk-to-Memory Accelerator – SAP VORA provides relational capabilities without having to load all the data into memory in cases where it will not fit.
  • 37. 37INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Nodes: Persistent Storage Vora Spark Extension SAP VORA DLog S3 / Swift / HDFS/ORC/Parquet Persistent Storage
  • 38. 38INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Persistent Storage: HDFS + CATALOG + DLOG Vora Catalog (Metadata) Vora DLOG Spark Context REGISTER ALL TABLES USING com.sap.spark.vora REGISTER TABLE <name> USING com.sap.spark.vora SHOW TABLES USING com.sap.spark.vora Spark parameter: spark.sap.autoregister com.sap.spark.vora Persistent Volatile (Spark is session-based) • Shows tables in local Spark Context • After restart (e.g. of spark-shell) Vora table metadata needs to be reloaded into local Spark Catalog via REGISTER Spark Catalog SHOW TABLES
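The split between the persistent Vora catalog and the volatile, session-based Spark catalog can be pictured with a small toy model. This is only a sketch: `SessionCatalog` and its methods are hypothetical names invented for illustration and are not part of Spark or SAP Vora. It mimics why `REGISTER` has to be issued again after a restart of, for example, spark-shell.

```python
from typing import Dict, List


class SessionCatalog:
    """Toy stand-in for a volatile, per-session Spark catalog (hypothetical)."""

    def __init__(self, persistent: Dict[str, str]) -> None:
        self._persistent = persistent        # survives restarts, like the Vora catalog
        self._tables: Dict[str, str] = {}    # lost when the session ends

    def register(self, name: str) -> None:
        # Analogous to REGISTER TABLE <name> USING ...
        self._tables[name] = self._persistent[name]

    def register_all(self) -> None:
        # Analogous to REGISTER ALL TABLES USING ...
        self._tables.update(self._persistent)

    def show_tables(self) -> List[str]:
        # Analogous to plain SHOW TABLES: only what this session knows about
        return sorted(self._tables)


vora_catalog = {"USERS": "relational", "COMPLAINTS": "relational"}

first = SessionCatalog(vora_catalog)
first.register("USERS")

# A new session starts empty even though the persistent catalog is unchanged.
restarted = SessionCatalog(vora_catalog)
```

In this model, `restarted.show_tables()` stays empty until `register_all()` is called, which is the behavior the slide describes for a fresh Spark context.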
  • 39. 39INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take a few minutes and explore the Vora manager, configuration components and various settings available.  SAP Vora Manager: http://<IP_ADDRESS>:19000 Check It Out!
  • 40. SAP Vora: Getting Started
  • 41. 41INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ On-Premise – Developer Edition on Premise: http://developers.sap.com – Install Official Vora Release on SAP SMP Cloud Based – Vora on CAL – Amazon AWS with “one-click” – developer and production editions (AWS marketplace) Vora Testdrives – 3hr testdrives - http://testdrive.saphanavora.com/ ▫ Retail ▫ Telco ▫ Time series Vora Installation Options
  • 42. Options to start with Vora: – Install official Vora release – Automated scripts for your own cluster (Ansible) – Monsoon. On-premise and cloud developer editions: http://www.sap.com/developer/topics/vora.html#freetrial
  • 43. 43INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Set up and Start developer editions https://www.sap.com/developer/tutorial-navigator.how-to.html?tag=products:data-management/sap-hana-vora
  • 44. 44INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ • Pre-defined business scenario--sample data included • Step-by-step instructions • Includes all solution components (e.g. Vora + Lumira) • 100% free • Time bound Test Drive Lab vs. Developer Edition http://testdrive.saphanavora.com/
  • 45. SAP Vora Developer Edition on AWS. Features: • Set up a fully functional and configured SAP Vora cluster with a few clicks. Available in most AWS regions. • SAP Vora Console to manage and scale the cluster • Optimal 4-node cluster based on: • CentOS 7.2 • Vora 1.3.61 (GA) • Apache Ambari 2.2.1.0 • Spark 1.6.1 • Hadoop Distribution HDP 2.4.2 • Zeppelin 0.6.0
  • 46. Get your SAP Vora Developer Edition in AWS. Prerequisites: Amazon Web Services account; Virtual Private Cloud (VPC) and Security Group as virtual firewall. Steps: go to the Vora sign-up page to launch an instance from an Amazon Machine Image (AMI); use the SAP Vora console to set up the cluster, add nodes, and configure; use the console to view and manage the cluster.
  • 48. 48INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ How can I communicate with Vora? Spark Adapter SAP HANA Thrift Server Vora Spark Vora JDBC / ODBC Vora Tools Type • Native Spark • JDBC • Graphical • HANA
  • 49. 49INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Consumption Consumption from HANA is possible through  SDA with “Spark-Adapter” (since SAP HANA SPS 10)  SDA with Thrift Server (Since SAP HANA SPS 7)  Direct Connectivity (End of 2016) JDBC / ODBC Access  Spark SQL is exposed through the Thriftserver (SAP VORA Extensions included and using SapSQLContext) Data visualization tools  Apache Zeppelin (or Jupyter) via SAP Spark extensions  SAP Lumira Data Analysis via JDBC channel  SAP VORA Data Modelling via JDBC channel Native language bindings  for programmatic data access  Available in Scala, Java, Python, R How can I communicate with Vora? Consumption of SAP VORA Spark Adapter SAP HANA Thrift Server Vora Spark-Driver SAP Spark Extensions Vora ODBC Vora Tools
  • 50. 50INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ – Vora Tools UI, http://<tools_node>:9225 – SQL Editor – Data Browser – Modeler Perspective – User Management Vora Tools
  • 51. 51INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Zeppelin, Spark Shell and Jupyter Notebook Apache Zeppelin Jupyter Notebook Spark Shell • Spark-Shell with Vora • $ source /etc/vora/vora-env.sh • $ $VORA_SPARK_HOME/bin/start-spark-shell.sh • Also possible: spark-submit; pyspark; other tools that can connect to the Vora Thriftserver
  • 52. 52INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Lumira – Graphical Data Exploration SAP Lumira SAP Lumira, Zeppelin: • Installation and Administration Guide (do not install Zeppelin via Ambari) Jupyter: • https://blogs.sap.com/2016/01/21/visualizing-data-with-jupyter/
  • 53. 53INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ SAP HANA – Virtual Table via SDA – Remote Source ▫ VoraODBC Connector ▫ Spark Connector SAP HANA (via Smart Data Access, SDA)
  • 54. 54INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Pluggable API to access structured data through Spark SQL Data filtering and column pruning can be pushed down to the data sources in many cases (depends really on the source capability) Advantages: • Less data in Spark, Less disk IO • Less to do in Spark • Less memory consumption • Any Spark language can leverage Data Sources API Spark Data Sources API: Quick Refresher Spark SQL Spark Core Engine Data Sources MLlib Spark Streaming GraphX (graph) CSV HANA
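The advantage of pushing filters and column pruning down to a data source can be illustrated with a toy source. This is a sketch under stated assumptions: `ToySource` and its `scan` method are hypothetical and are not the Spark Data Sources API. They only show that a source which filters and prunes itself hands fewer rows and columns to the engine.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


class ToySource:
    """Hypothetical data source that can filter and prune columns itself."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.rows_handed_over = 0  # rows passed on to the "engine"

    def scan(
        self,
        columns: Optional[List[str]] = None,
        predicate: Optional[Callable[[Row], bool]] = None,
    ) -> Iterable[Row]:
        for row in self.rows:
            if predicate is not None and not predicate(row):
                continue  # filter applied inside the source
            self.rows_handed_over += 1
            if columns is None:
                yield dict(row)
            else:
                yield {c: row[c] for c in columns}  # column pruning


users = [
    {"user_id": "1", "age": 24, "gender": "M"},
    {"user_id": "2", "age": 53, "gender": "F"},
    {"user_id": "3", "age": 31, "gender": "F"},
]

# Without pushdown: the engine receives every row and column, then filters.
no_push = ToySource(users)
result_a = [{"user_id": r["user_id"]} for r in no_push.scan() if r["age"] > 30]

# With pushdown: the source filters and prunes before handing rows over.
push = ToySource(users)
result_b = list(push.scan(columns=["user_id"], predicate=lambda r: r["age"] > 30))
```

Both variants produce the same result, but the second hands over two single-column rows instead of three full rows.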
  • 55. SAP VORA Data Source (1) • Extends the Spark Data Source API to allow the creation of relations in Vora • The tables stored in SAP VORA are exposed to Spark as standard Spark DataFrames • To query the relations, use standard Spark SQL or Vora SQL • The Vora data source analyzes Spark SQL queries for opportunities to push the execution of parts or all of the query to SAP VORA • With the Vora data source, a Spark SQL query can be executed on multiple SAP VORA engines concurrently • The execution result is returned to the Spark runtime as an RDD (resilient distributed dataset)
  • 56. 56INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ SAP VORA provides the following data sources: • com.sap.spark.hana for the SAP HANA data source • com.sap.spark.engines for the specific SAP VORA engines (relational, graph, time series, document store, and disk) • com.sap.spark.vora for the SAP VORA data source (deprecated) SAP VORA Data Source (2)
  • 57. SAP VORA 1.4 Tables and Views
  • 58. Working with Tables and Views in Vora: Creating Tables. SQL Code: CREATE TABLE USERS (user_id string, age integer, gender string, occupation string, zip_code string) USING com.sap.spark.engines.relational OPTIONS ( files "/path/to/USERS1.csv,/path/to/USERS2.csv" ) • Options: storagebackend, format, csvdelimiter, csvquote, null, csvskip, and datetimeformat. • A CREATE TABLE statement registers the table in the SparkSqlContext and creates a table in the SAP VORA engine.
  • 59. Create Table Conditions • Create table if table exists in the Vora catalog • Create table if not exists • Create table without data • Create table without schema
  • 60. Working with Tables and Views in Vora: Register, Append and Drop • APPEND TABLE testTableName OPTIONS (files "path1/to/file/file1.csv,path2/to/file/file2.csv") • APPEND TABLE testTableName OPTIONS (files "path1/*") • DROP TABLE/VIEW testTableName • DROP TABLE/VIEW IF EXISTS tableName • DROP TABLE/VIEW tableName CASCADE. Use wildcards to read from HDFS files in a folder: OPTIONS ( ... files "/dir1/*,/dir2/*" ... )
  • 61. 62INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Working with Tables and Views in SAP VORA Creating tables (2)
  • 62. 63INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ  Listing Tables and Views: Working with Tables and Views in SAP VORA Listing and loading tables and views in SAP VORA REGISTER ALL TABLES USING com.sap.spark.vora REGISTER TABLE TABLE1 USING com.sap.spark.vora IGNORING CONFLICTS SHOW TABLES USING com.sap.spark.vora OPTIONS (…) // show only tables registered in the Spark catalog SHOW TABLES  Loading Tables from SAP VORA into Spark:
  • 63. 64INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Views:  SQL CREATE VIEW MyView AS SELECT * FROM Table1 USING com.sap.spark.vora OPTIONS ( …)  Dimensions CREATE DIMENSION VIEW MyDimensionView AS SELECT * FROM Table1 USING com.sap.spark.vora OPTIONS (…) Working with Tables and Views in SAP VORA Persisted views  Cubes CREATE CUBE VIEW MyCubeView AS SELECT * FROM Table1 USING com.sap.spark.vora OPTIONS (…)  DROP VIEW MyView USING com.sap.spark.vora OPTIONS (…)  DESCRIBE TABLE MyView USING com.sap.spark.vora
  • 64. © 2017 SAP SE or an SAP affiliate company. All rights reserved. 65PUBLIC 1. Financial institutions 2. Products 3. Complaints
  • 65. © 2017 SAP SE or an SAP affiliate company. All rights reserved. 66PUBLIC Complaints.csv FinancialInstitutions.csv Products.csv
  • 66. 67INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take 15 minutes and get familiar with Apache Zeppelin, and execute the commands in the following notebooks: – 0_DATA – create base tables – Tables and Views – create views, cubes and dimensions on the base tables  https://www.sap.com/developer/tutorials/vora-ova-zeppelin0.html  Apache Zeppelin: http://<IP_ADDRESS>:9099 Check It Out!
  • 67. 68INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora in-memory Review – What did we do? CREATE TABLE… … Thrift Server Spark SELECT * FROM TABLE… Data Data File
  • 68. SAP VORA 1.4 Different Data Sources
  • 69. 70INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ CREATE TABLE Users(id integer, name string) USING com.sap.spark.vora OPTIONS ( files "/S3_BUCKET/data.csv", csvdelimiter "|", storagebackend "s3", s3accesskeyid "S3_KEY_ID", s3secretaccesskey "S3_KEY_SECRET", s3endpoint "S3_ENDPOINT", s3region "S3_REGION" ) Loading Different Data Types Loading data from Amazon S3
  • 70. 71INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ CREATE TABLE Users(id integer, name string) USING com.sap.spark.vora OPTIONS ( tablename "Users", files "/ORC_Files/data.orc/*", format "orc" ) CREATE TABLE Users(id integer, name string) USING com.sap.spark.vora OPTIONS ( tablename "Users", files "/Parquet_Files/data.parquet/*", format "parquet" ) Loading Different Data Types Loading data from ORC and Parquet file
  • 71. 72INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Individual tables via Vora All tables via Scala  import com.sap.spark.engines.client.EngineClient  EngineClient.getOrCreate().reloadTables() Reloading tables into Vora engines after restart
  • 73. Loading Different Data Types: SAP VORA – SAP HANA Connection. Loading data from/to SAP HANA. • SAP HANA DATA SOURCE: CREATE TABLE $tableName USING com.sap.spark.hana OPTIONS ( tablepath "$tableName", dbschema "$dbSchema", host "$host", instance "$instance", user "$user", passwd "$passwd" ) [Diagram: SAP HANA In-Memory Store, Smart Data Access, Spark SQL, Spark Extension, Spark]
  • 74. 75INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Writing Data into HANA
  • 75. Listing Tables in SAP HANA: SHOW TABLES USING com.sap.spark.hana OPTIONS ( host "$host", instance "$instance", user "$user", passwd "$passwd", dbschema "$dbSchema", tablePattern "$pattern" )
  • 76. Register HANA tables to the current Spark Context: REGISTER TABLE tablename USING com.sap.spark.hana OPTIONS ( host "$host", instance "$instance", user "$user", passwd "$passwd", dbschema "$dbSchema") [IGNORING CONFLICTS] REGISTER ALL TABLES USING com.sap.spark.hana OPTIONS (…)
  • 77. Drop/Expose metadata: DROP TABLE testTableName; DROP TABLE IF EXISTS testTableName; DROP TABLE testTableName CASCADE; DESCRIBE TABLE tablename USING com.sap.spark.hana OPTIONS ( host "$host", instance "$instance", user "$user", passwd "$passwd", dbschema "$dbSchema")
  • 78. 79INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Pushing down the Hana UDF’s
  • 80. 81INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take 10 minutes and go through the tutorial to connect Vora to HANA and to read different file formats. https://www.sap.com/developer/tutorials/vora-ova-zeppelin6.html https://www.sap.com/developer/tutorials/vora-ova-hana-datasource.html  SAP Vora Tools: http://<IP_ADDRESS>:9225  Apache Zeppelin: http://<IP_ADDRESS>:9099 Check It Out!
  • 81. SAP VORA 1.4 Disk to memory
  • 82. Vora Disk-to-Memory Accelerator • Relational column-based store • You can create, register, and query disk engine tables in Spark SQL statements in the same way as with other data sources (for example, SAP VORA, SAP HANA, and native Spark data sources). The syntax and options of the CREATE TABLE statement are compatible with the SAP VORA data source. • Uses 'com.sap.spark.engines.disk' • REGISTER ALL TABLES USING com.sap.spark.engines.disk
  • 83. 84INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take 10 minutes and work through the doc store notebook in Apache Zeppelin https://www.sap.com/developer/tutorials/vora-ova-zeppelin3.html  SAP Vora Tools: http://<IP_ADDRESS>:9225  Apache Zeppelin: http://<IP_ADDRESS>:9099 Check It Out!
  • 85. 86INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Working with Hierarchies in SAP VORA What are hierarchies? Clothes Men Women Evening Gowns SkirtsShirts SuitsShorts JacketsSlacks Dresses Blouses Sun Dresses
  • 86. Parent-Child Hierarchy (aka Adjacency List). [Tree diagram with nodes numbered 1–13: Clothes, Men, Women, Evening Gowns, Skirts, Shirts, Suits, Shorts, Jackets, Slacks, Dresses, Blouses, Sun Dresses]
  • 87. Create Parent-Child hierarchy from the clothes table • ORDER SIBLINGS BY: relevant for some UDFs, such as IS_PRECEDING and IS_FOLLOWING • START WHERE • SET. [Tree diagram with nodes numbered 1–13: Clothes, Men, Women, Evening Gowns, Skirts, Shirts, Suits, Shorts, Jackets, Slacks, Dresses, Blouses, Sun Dresses]
  • 88. 89INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Parent-Child Hierarchies
  • 89. Level-Based Hierarchies (aka flattened hierarchies): h_src(level 1, level 2, level 3, level 4). [Tree diagram with nodes numbered 1–13: Clothes, Men, Women, Evening Gowns, Skirts, Shirts, Suits, Shorts, Jackets, Slacks, Dresses, Blouses, Sun Dresses]
  • 90. 91INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ • WITH LEVELS (col1, col2, col3, col4) • MATCH PATH: Determines the way identical nodes are handled across the same and across different columns. • ORDER SIBLINGS BY • SET Level Hierarchies – Create statement
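A level-based (flattened) source table can be turned into parent-child edges by walking each row from left to right. The following is a minimal sketch, not Vora's implementation: the sample rows, the placement of nodes in the tree, and the choice to identify a node by its full path (so that identical labels under different parents stay distinct) are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple

Path = Tuple[str, ...]

# Hypothetical rows of h_src(level1, level2, level3, level4); None pads short paths.
rows = [
    ("Clothes", "Men", "Shirts", None),
    ("Clothes", "Men", "Suits", None),
    ("Clothes", "Women", "Dresses", "Sun Dresses"),
]


def level_rows_to_edges(rows) -> Set[Tuple[Optional[Path], Path]]:
    """Return (parent_path, child_path) edges; the root has parent None."""
    edges = set()
    for row in rows:
        path: Path = ()
        for value in row:
            if value is None:
                break  # this row ends above the deepest level
            child = path + (value,)
            edges.add((path or None, child))
            path = child
    return edges


edges = level_rows_to_edges(rows)
```

Because edges are collected in a set, the shared prefix `Clothes` → `Men` appears only once even though two rows contain it.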
  • 92. 93INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Working with Hierarchies in SAP VORA Hierarchy UDFs UDF level(u) is_root(u) is_descendant(u,v) is_descendant_or_self(u,v) is_ancestor(u,v) is_ancestor_or_self(u,v) is_parent(u,v) is_child(u,v) is_sibling(u,v) is_following(u,v) is_preceding(u,v) node(node)
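The meaning of several of these hierarchy functions can be sketched over a plain parent-child (adjacency-list) mapping. This is an illustration only: the tree shape, the convention that the root is at level 1, and the argument order (`is_descendant(u, v)` read as "u is a descendant of v") are assumptions, and the functions are ordinary Python, not the Vora UDFs.

```python
from typing import Dict, Optional

# child -> parent (None marks the root); node placement is illustrative
PARENT: Dict[str, Optional[str]] = {
    "Clothes": None,
    "Men": "Clothes",
    "Women": "Clothes",
    "Shirts": "Men",
    "Suits": "Men",
    "Dresses": "Women",
}


def is_root(u: str) -> bool:
    return PARENT[u] is None


def level(u: str) -> int:
    """Depth of u, counting the root as level 1 (assumed convention)."""
    depth = 1
    while PARENT[u] is not None:
        u = PARENT[u]
        depth += 1
    return depth


def is_ancestor(u: str, v: str) -> bool:
    """True if u lies on the path from v's parent up to the root."""
    node = PARENT[v]
    while node is not None:
        if node == u:
            return True
        node = PARENT[node]
    return False


def is_descendant(u: str, v: str) -> bool:
    return is_ancestor(v, u)


def is_parent(u: str, v: str) -> bool:
    return PARENT[v] == u


def is_child(u: str, v: str) -> bool:
    return PARENT[u] == v


def is_sibling(u: str, v: str) -> bool:
    return u != v and PARENT[u] is not None and PARENT[u] == PARENT[v]
```

The `_or_self` variants would simply add a `u == v` check, and `is_following` / `is_preceding` additionally depend on the sibling order given by ORDER SIBLINGS BY, which this sketch does not model.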
  • 93. 94INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Using UDF’s with Hierarchies - example
  • 94. 95INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Using UDF’s with Hierarchies
  • 95. 96INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take 10 minutes and work through the hierarchies notebook in Apache Zeppelin https://www.sap.com/developer/tutorials/vora-ova-zeppelin2.html  SAP Vora Tools: http://<IP_ADDRESS>:9225  Apache Zeppelin: http://<IP_ADDRESS>:9099 Check It Out!
  • 96. SAP VORA 1.4 Time Series Engine
  • 97. 98INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Time Series and Time Series Analysis -30 -25 -20 -15 -10 -5 0 5 Temperature °C Halifax Waterloo Time Series: Sequence of data points recorded over time, may occur as equidistant / non-equidistant Detect and correct errors / anomalies: Outlier Detection, Missing Value Replacement, Editing Standard aggregation: SUM, MIN, MAX, Select, Join, Grouping across Series (e.g.: group by province) Granularization Support: Hourly to Daily measurements Series Analysis: Smoothing, Binning, Correlation
  • 98. 99INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Time Series Data Analysis across big data Efficiently analyze time series data in distributed environments  Interactive access to standard time series analysis functions using the well-known SQL language  Efficient compression allowing analysis of more data using less memory  Build time series models visually using Vora Data Modeler Trend | Cyclical | Seasonal | Random | Exception
  • 99. Creating a Series Table – Prerequisites • Identify the series column • Know what the series period is • Think about the compression strategy, e.g. the maximum error bound for compression • Define the partition function • Define the partition scheme
  • 100. 101INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ CREATE TABLE "name" ( "time" TIMESTAMP, "value1" INTEGER DEFAULT NULL, "value2" DOUBLE ) SERIES ( PERIOD FOR SERIES "time" START TIMESTAMP '2010-01-01 00:00:00' END TIMESTAMP '2010-12-31 00:00:00' EQUIDISTANT INCREMENT BY 1 HOUR ) PARTITION BY PS1("time") Creating a Series Table - Examples
  • 101. 102INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ CREATE PARTITION FUNCTION PF1(C TIMESTAMP) AS RANGE BOUNDARIES(TIMESTAMP '2010-04-01 09:00:00.0000', TIMESTAMP '2010-09-01 09:00:00.0000'); CREATE PARTITION SCHEME PS1 USING PF1; Creating a Series Table – Partitions Example
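The effect of a range partition function with two boundaries can be sketched with a binary search: two boundaries split the time axis into three partitions. This is only an illustration; in particular, whether a timestamp exactly equal to a boundary falls into the lower or the upper partition in Vora is not stated on the slide, so the choice made here (upper) is an assumption.

```python
from bisect import bisect_right
from datetime import datetime

# Boundaries copied from the PF1 example on the slide
BOUNDARIES = [datetime(2010, 4, 1, 9, 0, 0), datetime(2010, 9, 1, 9, 0, 0)]


def partition_index(ts: datetime) -> int:
    """Return 0, 1, or 2 depending on which range ts falls into.

    A timestamp equal to a boundary is placed in the upper partition (assumption).
    """
    return bisect_right(BOUNDARIES, ts)
```

For example, a reading from June 2010 lands in partition 1, between the two boundaries.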
  • 102. 103INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ ALTER TABLE name ADD DATASOURCE ( ts, j thousands separated by ',', d thousands separated by ',' decimal separated by '.' ) 'file_path_and_name' delimited by ';' string delimited by ' ' thousands separated by ',' decimal separated by '.'; LOAD TABLE name; Add data to TS Table: Importing Data
  • 103. 104INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ The definition is stored in the catalog, so whenever the data is loaded into memory the compression is applied. ``LOAD TABLE name`` using com.sap.spark.engines; Load Table
  • 104. Add data to TS Table: Create and Include Data. CREATE TABLE national_grid_demand ( ts TIMESTAMP, ND double, I014_ND double, TSD double, I014_TSD double, England_Wales_Demand double, Embedded_Wind_Generation double, Embedded_Wind_Capacity integer, Embedded_Solar_Generation double, Embedded_Solar_Capacity integer ) SERIES ( PERIOD FOR SERIES ts START TIMESTAMP '2015-01-01 00:00:00' END TIMESTAMP '2017-01-01 00:00:00' EQUIDISTANT INCREMENT BY 30 MINUTE DEFAULT COMPRESSION use (APCA error 3.0 percent) COMPRESSION ON (Embedded_Wind_Generation) use (SDT error 4.0 percent) COMPRESSION ON (Embedded_Solar_Generation) use (SDT error 5.0 percent) ) PARTITION BY PS1( ts ) USING com.sap.spark.engines OPTIONS (files "/user/<userid>/DemandData_2015_2.csv", csvskip "1", csvdelimiter ";", storagebackend "hdfs")
  • 105. Querying Series Data – Examples. SELECT ts, ND FROM national_grid_demand WHERE PERIOD AS OF TIMESTAMP '2015-09-01 00:00:00'; SELECT ts, ND FROM national_grid_demand WHERE PERIOD BETWEEN TIMESTAMP '2015-08-31 00:00:00' AND TIMESTAMP '2015-09-06 00:00:00';
  • 106. 107INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Standard aggregations  SUM | AVG | MIN | MAX | COUNT Trend  Calculates the non-stationary behavior of the column using exponential smoothing Median  Median value for a time series column Mode  Modal value for a time series column Querying Series Data – Column Functions
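What the median, mode, and trend column functions compute can be sketched in plain Python. The median and mode come from the standard library; the smoothing function is textbook simple exponential smoothing with a hypothetical `alpha` parameter and is not claimed to be the exact algorithm or output shape of Vora's TREND function. The sample values are invented.

```python
from statistics import median, mode
from typing import List


def exponential_smoothing(values: List[float], alpha: float = 0.5) -> List[float]:
    """Simple exponential smoothing: s[0] = x[0], s[t] = alpha*x[t] + (1-alpha)*s[t-1]."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


wind = [4.0, 8.0, 8.0, 6.0]  # invented sample readings

wind_median = median(wind)                # middle of the sorted values
wind_mode = mode(wind)                    # most frequent value
wind_trend = exponential_smoothing(wind)  # smoothed series
```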
  • 107. Querying Series Data – Examples. SELECT median(Embedded_Wind_Generation), median(Embedded_Wind_Capacity) FROM national_grid_demand WHERE PERIOD BETWEEN TIMESTAMP '2015-08-01 00:00:00' AND TIMESTAMP '2015-09-01 00:00:00'; SELECT TREND(Embedded_Wind_Capacity), TREND(Embedded_Solar_Capacity) FROM national_grid_demand WHERE PERIOD BETWEEN TIMESTAMP '2015-01-01 00:00:00' AND TIMESTAMP '2017-01-01 00:00:00';
  • 108. Querying Series Data – Table Functions. Auto Correlation • Calculates the correlation coefficient for a given column and lag • Lets you look at a single column and find out whether its values follow a repeating pattern. Cross Correlation • Calculates the correlation coefficient for two columns and a given lag • Finds out whether there is a correlation between multiple columns of a table. Histogram • Calculates the frequency distribution of the values for the specified column • Shows how the values in your time series are distributed over a time period. Granulize • Returns a new series based on an existing series by changing the interval between adjacent time stamps
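Auto and cross correlation at a given lag can be sketched as a Pearson correlation between a series and a shifted copy. This is a minimal illustration with invented data; the exact estimator Vora's AUTO_CORR uses (for example, how it normalizes or handles compressed data) is not documented in these slides, so treat the formula below as an assumption.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; undefined (division by zero) for constant input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def cross_corr(xs: Sequence[float], ys: Sequence[float], lag: int) -> float:
    """Correlate xs[t] with ys[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(xs, ys)
    return pearson(xs[:-lag], ys[lag:])


def auto_corr(xs: Sequence[float], lag: int) -> float:
    return cross_corr(xs, xs, lag)


series = [1, 2, 3, 1, 2, 3, 1, 2, 3]  # invented data repeating every 3 steps
```

For this series the coefficient is close to 1 at lag 3, which is how a repeating pattern shows up; at lag 1 the shifted copy no longer lines up and the coefficient is negative.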
  • 109. 110INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ SELECT * FROM HISTOGRAM ( SERIES national_grid_demand, 10, DESCRIPTOR( Embedded_Wind_Generation ) ) HIST; SELECT * FROM AUTO_CORR ( SERIES national_grid_demand, 48, DESCRIPTOR( ND ) ) auto_corr_res; SELECT ts, Embedded_Wind_Generation, Embedded_Wind_Capacity, Embedded_Solar_Generation, Embedded_Solar_Capacity FROM GRANULIZE( SERIES national_grid_demand, 24 HOUR, ROUND_HALF_UP, SUM => DESCRIPTOR( Embedded_Wind_Generation, Embedded_Solar_Generation ), AVG => DESCRIPTOR( Embedded_Wind_Capacity, Embedded_Solar_Capacity) ) WHERE PERIOD BETWEEN TIMESTAMP '2015-08-01 12:00:00' AND TIMESTAMP '2015-08-03 12:00:00'; Querying Series Data - Examples
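The idea behind GRANULIZE, turning 30-minute readings into 24-hour values with an aggregate per column, can be sketched as bucketing by time. This is an illustration with invented data: the `granulize` helper is hypothetical, it assigns each reading to the bucket that starts at or before it, and it does not model the ROUND_HALF_UP rounding mode used in the SQL example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Callable, Iterable, List, Tuple

Point = Tuple[datetime, float]


def granulize(
    points: Iterable[Point],
    bucket: timedelta,
    origin: datetime,
    agg: Callable[[List[float]], float],
) -> List[Point]:
    """Group points into fixed-width buckets starting at origin and aggregate each."""
    buckets = defaultdict(list)
    for ts, value in points:
        index = (ts - origin) // bucket  # whole number of buckets since origin
        buckets[origin + index * bucket].append(value)
    return [(start, agg(values)) for start, values in sorted(buckets.items())]


start = datetime(2015, 8, 1)
# Two days of invented half-hourly readings, each with value 1.0
half_hourly = [(start + timedelta(minutes=30 * i), 1.0) for i in range(96)]

daily_sum = granulize(half_hourly, timedelta(hours=24), start, sum)
```

Passing a different `agg`, such as a mean, corresponds to using AVG instead of SUM for a column.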
  • 110. Join between TS table and Relational table: SELECT * FROM (``TS select statement 1`` USING … ) as T1, (RT select statement USING … ) as T2 WHERE T1.v1 = T2.v1
  • 111. 112INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Take 15 minutes and work through the time series notebook in Apache Zeppelin https://www.sap.com/developer/tutorials/vora-ova-zeppelin5.html  SAP Vora Tools: http://<IP_ADDRESS>:9225  Apache Zeppelin: http://<IP_ADDRESS>:9099 Check It Out!
  • 112. SAP VORA 1.4 Document Store Engine
  • 113. 114INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Schemaless  Very flexible  Add/Remove fields to any document Horizontally scalable (scale-out)  Good for big data processing Very low latency for key lookups Easy to use (SQL syntax) Document Store Overview
  • 114. What does it store? { name: "Joe", age: 25, hobbies: ["soccer", "swimming"], address: { street: "4 Pennsylvania Plaza", city: "New York" } } [Callouts: each entry is a Field: Value pair]
  • 115. Documents are stored in Collections: { name: "Jim", age: 25, hobbies: ["soccer", "swimming"], address: { street: "4 Pennsylvania Plaza", city: "New York" } } { name: "Jane", age: 25, hobbies: ["soccer", "swimming"], address: { street: "4 Pennsylvania Plaza", city: "New York" } } { name: "Joe", age: 25, hobbies: ["soccer", "swimming"], address: { street: "4 Pennsylvania Plaza", city: "New York" } }
  • 116. Schemaless – Missing Fields: { firstName: "John", lastName: "Smith", age: 22, address: { streetAddress: "21 2nd Street", city: "New York", state: "NY", postalCode: "10021" } } { firstName: "George", lastName: "Brown", licensePlate: "1ABC234", address: { streetAddress: "69 1st Street", city: "Los Angeles", state: "CA" } }
  • 117. Schemaless – Different Types (STRING vs. NUMBER): { firstName: "John", lastName: "Smith", age: 22, address: { streetAddress: "21 2nd Street", city: "New York", state: "NY", postalCode: "10A021" } } { firstName: "George", lastName: "Brown", age: 22, address: { streetAddress: "69 1st Street", city: "Los Angeles", state: "CA", postalCode: 34707 } }
  • 118. Schemaless – Different Types (ARRAY vs. OBJECT): { firstName: "John", lastName: "Smith", age: 22, address: { ... }, phoneNumber: [ { type: "home", number: "212 555-1234" }, { type: "fax", number: "646 555-4567" } ] } { firstName: "George", lastName: "Brown", age: 22, address: { ... }, phoneNumber: { countryCode: 90, number: "212 666-1234" } }
  • 119. How to use Doc. Store: Create Collection. CREATE COLLECTION <collection_identifier> [PARTITION BY <partition_scheme_identifier>] USING com.sap.spark.engines OPTIONS (<list_of_options>); e.g. CREATE COLLECTION T PARTITION BY PS2("state") USING com.sap.spark.engines OPTIONS (files 'path/to/file/to/load/some.json', storagebackend "hdfs" );
  • 120. 121INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Partitioning CREATE PARTITION FUNCTION <partition_function_identifier>(<list_of_fields>) AS HASH(<list_of_fields>) MIN PARTITIONS <num> MAX PARTITIONS <num> USING com.sap.spark.engines; CREATE PARTITION SCHEME <partition_scheme_identifier> USING <partition_function_identifier> USING com.sap.spark.engines; eg. CREATE PARTITION FUNCTION PF("_id") AS HASH("_id") MIN PARTITIONS 3 MAX PARTITIONS 3 USING com.sap.spark.engines; CREATE PARTITION SCHEME PS USING PF USING com.sap.spark.engines;
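Hash partitioning on `_id` with a fixed number of partitions can be pictured as "hash the key, then take the remainder". The sketch below uses CRC32 from the Python standard library purely as a stand-in; the hash function Vora actually applies is not given on the slide, so both the function and the helper name `hash_partition` are assumptions.

```python
import zlib


def hash_partition(doc_id: str, partitions: int = 3) -> int:
    """Map a document id to one of `partitions` buckets (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % partitions


# The same id always maps to the same partition, so key lookups touch one partition.
assignments = {doc_id: hash_partition(doc_id) for doc_id in ("a1", "b2", "c3", "d4")}
```

With MIN PARTITIONS and MAX PARTITIONS both set to 3, as in the example above, the partition count is fixed at three.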
  • 121. 122INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ SQL Queries – Result Sets SELECT  Returns a relational result SELECT {}  Returns a JSON document Use Dot Notation to access nested fields in a document eg. { "author": { "address": { "street": "..." }, ... }, ... } –Nested blocks accessed via array ‘[ ]’ operator SELECT author.addresses[arraynum].street FROM <collection>
  • 122. 123INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Vora Doc Store Queries  Standard SQL support  SELECT – Returns a relational result  SELECT { } – Returns a JSON document  IS MISSING – look for missing fields similar to IS NULL  Standard SQL support – sorting, subselect, alias, joins group by sum, avg, min, max, count upper, lower, modulus
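Dot notation, array access, and the difference between a missing field and a null value can be sketched with a small path lookup over nested Python data. This is an illustration only: `get_path` and the `MISSING` sentinel are hypothetical helpers, and zero-based array indexing is an assumption (the slides do not say how the Doc Store numbers array elements).

```python
from typing import Any, Sequence, Union

MISSING = object()  # sentinel: the field does not exist (distinct from None/null)


def get_path(doc: Any, path: Sequence[Union[str, int]]) -> Any:
    """Follow field names (str) and array positions (int) through a document."""
    current = doc
    for part in path:
        if isinstance(part, int):
            if not isinstance(current, list) or not 0 <= part < len(current):
                return MISSING
            current = current[part]
        else:
            if not isinstance(current, dict) or part not in current:
                return MISSING
            current = current[part]
    return current


john = {"firstName": "John", "age": 22,
        "address": {"city": "New York", "postalCode": "10021"},
        "phoneNumber": [{"type": "home", "number": "212 555-1234"}]}
george = {"firstName": "George", "age": None,
          "address": {"city": "Los Angeles"}}
```

Here `get_path(george, ["age"])` returns `None` (the field exists but is null), while `get_path(george, ["address", "postalCode"])` returns `MISSING`, which is the distinction IS MISSING versus IS NULL expresses.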
  • 123. 124INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ Select – Different Types Always STRINGIFY in case of type mismatch  Even objects and arrays SELECT s.firstName, s.address.postalCode FROM Students s firstName postalCode ”John” ”10A021” ”George” ”34707”
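The "stringify on type mismatch" rule for relational results can be sketched as follows: if the values collected for one column do not all share a type, every non-string value is converted to text. The helper name `stringify_column` is hypothetical and `json.dumps` is used as a stand-in serializer; the exact text format Vora produces for objects and arrays is an assumption.

```python
import json
from typing import Any, List


def stringify_column(values: List[Any]) -> List[Any]:
    """Leave a uniformly typed column alone; otherwise convert every value to a string."""
    if len({type(v) for v in values}) <= 1:
        return list(values)
    return [v if isinstance(v, str) else json.dumps(v) for v in values]


postal_codes = stringify_column(["10A021", 34707])   # STRING and NUMBER mixed
ages = stringify_column([22, 22])                     # uniform, left unchanged
phones = stringify_column(["n/a", {"countryCode": 90}])  # STRING and OBJECT mixed
```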
  • 124. SQL Queries – Examples. SELECT {"ct": "city"} FROM <collection> WHERE employeecollection.address.street = 'Maple' SELECT city as ct FROM <collection> WHERE employeecollection.address.street = 'Maple' SELECT * FROM <collection> WHERE author.address IS MISSING using com.sap.spark.engines; SELECT * FROM <collection> WHERE author.address[1].state = 'NY'
  • 125. SQL Queries – Expressions and Aggregates
    Standard aggregate functions are supported: count, min, max, avg, sum, etc.
    GROUP BY
    - Behaves as expected for a relational result
    - For a JSON result, a new document is created where the field of the GROUP BY clause becomes the id of the resulting document
    - The _id field of the new documents is an object of grouped field names:
        SELECT { _id: _id, aggr: avg(cost) } FROM <collection> GROUP BY productgroup, productname USING com.sap.spark.engines;
        _id: {productgroup:value, productname:value}
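    The shape of a JSON GROUP BY result, where the grouped fields become the _id object, can be sketched in Python. The helper group_avg and the sample product rows are assumptions for illustration and do not come from the slides.

```python
from collections import defaultdict

def group_avg(docs, group_fields, value_field):
    """Group documents by the given fields and emit one result document per
    group, with _id holding the grouped field values and aggr the average."""
    buckets = defaultdict(list)
    for doc in docs:
        key = tuple(doc[f] for f in group_fields)
        buckets[key].append(doc[value_field])
    return [
        {"_id": dict(zip(group_fields, key)), "aggr": sum(vals) / len(vals)}
        for key, vals in buckets.items()
    ]

# Hypothetical collection contents.
products = [
    {"productgroup": "tools", "productname": "hammer", "cost": 10},
    {"productgroup": "tools", "productname": "hammer", "cost": 20},
    {"productgroup": "tools", "productname": "saw", "cost": 30},
]
grouped = group_avg(products, ["productgroup", "productname"], "cost")
```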
  • 127. Check It Out!
    Take 10 minutes and work through the doc store notebook in Apache Zeppelin: https://www.sap.com/developer/tutorials/vora-ova-zeppelin7.html
    - SAP Vora Tools: http://<IP_ADDRESS>:9225
    - Apache Zeppelin: http://<IP_ADDRESS>:9099
  • 129. Graph Databases
    Graph Databases
    • Relationships between entities are of key interest
    • Network view on complex structures
    • Complex search patterns, e.g. subgraph search and link predictions
    Application Areas
    • Social networks
    • Business networks (Users & Products, Learners & Contents)
    • Knowledge graphs (entities and the concepts/relationships around them, e.g. Apple)
    • Recommendation systems
    • Org/project structures (hierarchies / DAGs)
    • Supply chain management
  • 130. Building Knowledge Graph
    Build and infer relations between data points.
    [Figure: example knowledge graph with nodes SAP, Puntis (Product Manager), Balaji (Product Manager), Machine Learning, and Machine Teaching, connected by hasOrgUnit and hasLearningItem edges]
  • 132. SAP VORA Graph: Scalable Graph Analytics
    - Graph Analytics for Enterprise Applications: expressive, declarative graph query language based on SQL and graph pattern matching
    - High Performance for Real-Time Applications: native in-memory graph store and lightweight query engine for very fast query execution
    - Flexible Data Import from Various Sources: easy mapping from relational tables to graphs; direct loading from files, HDFS, and the VORA catalog
    - Platform and Application Integration: part of VORA, connects to HANA
    - Distributed Engine for Scale-Out Scenarios: analyze billion-node graphs using the distributed version of the graph store and query engine
  • 133. Property Graph Model
    - Nodes have a type, a unique numeric ID (64-bit integer > 0), and a set of primitive-valued properties
    - Edges have a type and connect two nodes
    - Directed and undirected graphs are supported
    - Node properties have a type and a primitive-typed value (nullable)
    - Supported primitive datatypes:
        64-bit signed integers
        variable-length strings
        double-precision floats
    - Graphs have no fixed schema
    - Properties can be indexed
  • 134. Mapping Edge Properties
    - For performance reasons, the design decision was not to have edge properties
    - Edge properties can be modeled using auxiliary nodes
    [Figure: a :MARRIED edge with DATE='01-01-72' between two :PERSON nodes is replaced by an auxiliary :MARRIAGE node carrying DATE='01-01-72'. Either both :PERSON nodes connect to it with :MARRIED edges, or they connect to it with :MARRIED_E edges while keeping the direct :MARRIED edge]
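    Modeling an edge property through an auxiliary node can be sketched in Python with plain dictionaries and tuples. The in-memory representation, the helper name add_marriage, and the node IDs are assumptions for illustration; only the :PERSON, :MARRIAGE, and :MARRIED types and the DATE property come from the slide.

```python
def add_marriage(nodes: dict, edges: list, person_a: int, person_b: int,
                 date: str, aux_id: int) -> None:
    """Store the marriage date on an auxiliary MARRIAGE node instead of on
    an edge, and link both persons to that node with MARRIED edges."""
    nodes[aux_id] = {"type": "MARRIAGE", "DATE": date}
    edges.append((person_a, "MARRIED", aux_id))
    edges.append((person_b, "MARRIED", aux_id))

# Hypothetical graph with two PERSON nodes.
graph_nodes = {1: {"type": "PERSON"}, 2: {"type": "PERSON"}}
graph_edges = []
add_marriage(graph_nodes, graph_edges, 1, 2, "01-01-72", aux_id=3)
```

    The date is now an ordinary node property, so it can be filtered and indexed like any other property, at the cost of one extra hop when traversing from one person to the other.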
  • 135. Storing Graphs in Files
    - Graphs can be loaded from JSG files (local disk or HDFS)
    - When loading from local disk, the file must be present on all VORA nodes at the same path
    - JSG format:
        Optional JSON header with metadata: # { … }
        Every line in the body is a JSON array representing one node, consisting of: [ <node-type>, <node-id>, <properties>, <edges> ]
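    Reading a file laid out this way can be sketched in Python. The slides give only the outer structure, so several details here are assumptions: that <properties> is a JSON object, that <edges> is a JSON object mapping an edge type to a list of target node IDs, and the header contents. The function parse_jsg and the sample text are hypothetical.

```python
import json

def parse_jsg(text: str):
    """Parse JSG-like text: an optional '# {...}' header line followed by one
    JSON array per node. Returns (header, nodes)."""
    header, nodes = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            header = json.loads(line[1:])  # metadata header
            continue
        node_type, node_id, properties, edges = json.loads(line)
        nodes.append({"type": node_type, "id": node_id,
                      "properties": properties, "edges": edges})
    return header, nodes

# Hypothetical file contents; the inner shapes are assumed, not documented here.
sample_jsg = "\n".join([
    '# {"name": "demo"}',
    '["ACTOR", 1, {"NAME": "Alice"}, {"PLAYS_IN": [2]}]',
    '["MOVIE", 2, {"TITLE": "Example"}, {}]',
])
jsg_header, jsg_nodes = parse_jsg(sample_jsg)
```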
  • 137. Creating VORA Graphs from Spark
    Graph: partitioned and un-partitioned
    1. Create a BLOCK partition function:
        Recommended PARTITIONS: (number-of-VORA-nodes * cores-per-VORA-node)
        Recommended BLOCKSIZE: 1000
    2. Create a partition scheme
    3. Create a graph using the partition scheme and data source. The partition scheme must get NODEID as parameter.
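    The sizing recommendation above, and one plausible reading of what a BLOCK partition function does with NODEID, can be sketched in Python. The mapping in block_partition (consecutive blocks of BLOCKSIZE node IDs assigned round-robin to partitions) is an assumption, since the slides do not define it; the cluster size is also hypothetical.

```python
def recommended_partitions(vora_nodes: int, cores_per_node: int) -> int:
    """Slide recommendation: number-of-VORA-nodes * cores-per-VORA-node."""
    return vora_nodes * cores_per_node

def block_partition(node_id: int, blocksize: int, partitions: int) -> int:
    """Assumed block scheme: IDs are cut into blocks of `blocksize`, and
    blocks are assigned to partitions in round-robin order."""
    return (node_id // blocksize) % partitions

# Hypothetical cluster: 3 Vora nodes with 4 cores each, BLOCKSIZE 1000.
num_partitions = recommended_partitions(3, 4)
first_block = block_partition(999, 1000, num_partitions)
second_block = block_partition(1000, 1000, num_partitions)
```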
  • 138. Create Graph
  • 139. Graph Query Language in a Nutshell
    - SQL-based graph query language
    - Aggregations, sorting, limits, and expressions inherited from SQL
    Basic Concepts
    - Select node type (any property from one node type):
        SELECT NAME FROM ACTOR USING GRAPH IMDB
    - Select all nodes (any property from nodes of ANY type):
        SELECT NAME FROM ANY USING GRAPH IMDB
    - Select paths (traverse nodes):
        SELECT PLAYS_IN.DIRECTED_BY.NAME FROM ACTOR USING GRAPH IMDB
    - Node type and ID:
        SELECT NODETYPE FROM ANY WHERE NODEID > 5 USING GRAPH IMDB
    Built-in Graph Functions
    - Node degree:
        SELECT DEGREE(A) FROM ACTOR A USING GRAPH IMDB
    - Distance:
        SELECT DISTANCE(A,B) FROM ACTOR A, ACTOR B USING GRAPH IMDB WHERE…
    - Connected components:
        SELECT CONNECTED_COMPONENT(A) FROM ACTOR A USING GRAPH IMDB
    Graph Pattern Matching
    - Check edges:
        SELECT A.NAME, M.TITLE FROM ACTOR A, MOVIE M USING GRAPH IMDB WHERE M IN A.PLAYS_IN
    - Check distance:
        SELECT A.NAME, B.NAME FROM ACTOR A, ACTOR B USING GRAPH IMDB WHERE DISTANCE(A,B) < 5
  • 141. SELECT
  • 142.
  • 143. Aggregation Functions
  • 144. Graph Functions
  • 145. Degree Function
  • 146.
  • 147. Connected Component Function
  • 148. Connected Component Function
      SELECT A.NAME, CONNECTED_COMPONENT(STRONG A) AS SCC, CONNECTED_COMPONENT(WEAK A) AS WCC FROM ACTOR A USING GRAPH MOVIES
      A.NAME            SCC   WCC
      Brad Pitt         1     1
      Angelina Jolie    2     1
      Shah Rukh Khan    3     3
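    The difference between strong and weak components in the table above can be reproduced with a small Python sketch. It assumes a single directed edge from node 1 (Brad Pitt) to node 2 (Angelina Jolie), an isolated node 3 (Shah Rukh Khan), and that a component is labeled with its smallest node ID. The edge direction, the numeric IDs, and the labeling rule are guesses chosen to match the slide's output, not documented Vora behavior.

```python
def reachable(adj: dict, start) -> set:
    """All nodes reachable from `start` by following edges in `adj`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def component_labels(nodes, edges, strong: bool) -> dict:
    """Label each node with the smallest node ID in its component.
    Strong: mutual reachability along edge direction. Weak: direction ignored."""
    fwd = {n: set() for n in nodes}
    bwd = {n: set() for n in nodes}
    for src, dst in edges:
        fwd[src].add(dst)
        bwd[dst].add(src)
    both = {n: fwd[n] | bwd[n] for n in nodes}
    labels = {}
    for n in nodes:
        if strong:
            members = reachable(fwd, n) & reachable(bwd, n)
        else:
            members = reachable(both, n)
        labels[n] = min(members)
    return labels

actor_ids = [1, 2, 3]      # 1 = Brad Pitt, 2 = Angelina Jolie, 3 = Shah Rukh Khan
actor_edges = [(1, 2)]     # assumed one-directional edge
scc = component_labels(actor_ids, actor_edges, strong=True)
wcc = component_labels(actor_ids, actor_edges, strong=False)
```

    Nodes 1 and 2 share a weak component because an edge connects them in some direction, but they are in separate strong components because node 2 cannot reach node 1.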
  • 149.
  • 150. Distance Function (Shortest Path)
    The DISTANCE function calculates the distance of the directed or undirected path between two nodes:
    • Sum of weights
    • Number of hops between the source and destination node
    [Figure: two example graphs]
    Hops example (S: 1, D: 4), candidate paths and hop counts:
      1-2-5-4 (3)
      1-5-4 (2)
      1-2-3-4 (3)
      1-5-2-3-4 (4)
    Weights example (S: A, D: B), candidate paths and total weights:
      A-B, w: 4
      A-C-B, w: 3 (2+1)
      A-C-D-B, w: 8 (2+3+3)
      A-C-D-E-B, w: 10 (2+3+1+4)
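    Both distance notions can be checked against the slide's two examples with a short Python sketch: breadth-first search for hop counts and Dijkstra's algorithm for summed weights. The edge lists are inferred from the candidate paths listed on the slide and the graphs are treated as undirected; both are assumptions, and this is not a description of how Vora computes DISTANCE.

```python
import heapq
from collections import deque

def hop_distance(edges, source, target):
    """Fewest hops between source and target (breadth-first search)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # target not reachable

def weighted_distance(edges, source, target):
    """Smallest sum of edge weights between source and target (Dijkstra)."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in adj.get(node, ()):
            nd = d + w
            if nd < best.get(nxt, float("inf")):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return None  # target not reachable

# Edges inferred from the slide's candidate paths.
hop_edges = [(1, 2), (2, 5), (5, 4), (1, 5), (2, 3), (3, 4)]
weight_edges = [("A", "B", 4), ("A", "C", 2), ("C", "B", 1),
                ("C", "D", 3), ("D", "B", 3), ("D", "E", 1), ("E", "B", 4)]
hops = hop_distance(hop_edges, 1, 4)
weight = weighted_distance(weight_edges, "A", "B")
```

    Note that in the weighted example the direct edge A-B (weight 4) is not the shortest path: the two-hop route A-C-B costs 3.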
  • 151.
  • 152. Check It Out!
    Take 15 minutes and work through the graph notebook in Apache Zeppelin: https://www.sap.com/developer/tutorials/vora-ova-zeppelin4.html
    - SAP Vora Tools: http://<IP_ADDRESS>:9225
    - Apache Zeppelin: http://<IP_ADDRESS>:9099
  • 153. SAP VORA 1.4 Data Modeler
  • 154. http://<DNS_NAME_OF_JUMPBOX_NODE>:9225
  • 156. TOP REASONS TO CHOOSE VORA
    1 Integrated Solution: combine relational, time series, JSON, and graph processing. No need to move data between systems
    2 SQL Access: SQL access to relational, time series, JSON, and graph computing engines
    3 Intuitive Interface: web interface with SQL editor, data browser, and drag-and-drop to interact with Vora computing engines
    4 Open Consumption: open framework supports data integration via JDBC, OData, and RESTful services
    5 Metadata Persistence: enabled by a Distributed Transaction Log that can be recovered when needed
    6 Business Semantics: out-of-the-box hierarchy processing and currency conversion
    7 HANA Integration: high-performance interactive analytics across enterprise data in HANA and Hadoop data
  • 157. Resources - documentation
    Documentation
    - SAP VORA website: http://sap.com/vora
    - SAP VORA documentation: https://help.sap.com/viewer/p/SAP_VORA
        Installation and Administration Guide
        Developer Guide
        Troubleshooting Guide
        Sizing Guide
    - Product Availability Matrix (PAM) incl. release strategy: https://apps.support.sap.com/sap/support/pam/pam.html?#pvnr=73555000100900000415
    - SAP Notes (component HAN-VO*): https://launchpad.support.sap.com/#/solutions/notes/
        2303668 - SAP VORA 1.3 Release Note
        2405200 - Release Restrictions for SAP VORA 1.3
        2213226 - Prerequisites for installing SAP VORA: Operating Systems and Hadoop Components
        2220859 - SAP VORA Documentation Corrections
    Further help
    - For product issues: SAP customer message in components HAN-VO*
    - Community-based help on Stack Overflow (no SLA, non-critical issues only): http://stackoverflow.com/questions/tagged/vora