SlideShare ist ein Scribd-Unternehmen logo
1 von 38
Downloaden Sie, um offline zu lesen
1
2
OVHcloud Data Processing :
Le nouveau service pour valoriser vos
données à l'aide du cloud
BELLLION Bastien
2020-06-25
3
Bastien BELLION
• Data Convergence team
• bastien.bellion@ovhcloud.com
• @BastienBellion
Data Convergence :
• Analytics Data Platform
• Data Processing
4
Content
• Big Data
• Cluster Computing
• Apache Spark
• Exemple 1: Lyrics Wordcount
• OVHcloud Data Processing platform
• Exemple 2: New-York Taxis
• Roadmap
5
Why do we need to process data?
• Invent new products
• Make better decisions
• Prevent danger and fraud
• Forecasting events
6
Flood of data
• 90 percent of data in the world has been
generated in last 2 years.
• Autonomouscars: 20TB /hour
• 500 million tweets per day.
• 75 billion IoT devices by 2025
• Data exhaust
7
Where is the data stored?
8
Cluster computing
Cluster
9
Selecting the right tool
• Open source
• Biggest open source
community in big data
• Fast
• Java/Scala/Python/R
• Cluster
Big data tools – Google trend graph
10
Who is using Spark ?
https://databricks.com/fr/session/netflixs-recommendation-ml-
pipeline-using-apache-spark
https://labs.criteo.com/2018/01/spark-out-of-memory/
11
Spark
12
Spark
Spark Cluster
Job 1
Job 2
Job 3
spark-submit --class org.apache.spark.examples.SparkPi
 --driver-memory 8G
 --master local[2]
 --executor-memory 8G
 --executor-cores 2
 --num-executors 3
 spark-examples.jar
13
Example 1 : Lyrics Wordcount
https://github.com/walkerkq/musiclyrics
Billboard hot 100 between 1965-2015
5100 song with for each:
• Rank of the year
• Song name
• Artist
• Year
• Lyrics
• Source
Full written guide : https://docs.ovh.com/gb/en/data-processing/wordcount-spark/
14
Example 1 : Lyrics Wordcount
1/3 of dataset, 1700 songs
1/3 of dataset, 1700 songs
1/3 of dataset, 1700 songs
Lyrics.csv
15
Example 1 : Lyrics Wordcount
Volume of processed data Time elapsed one computer Time elapsed spark cluster
7.59 Mb 0,5 s 60 s
759 Mb 5,1 s 100 s
7.59 Gb 48 s 186 s
76 Gb 490 s 358 s
760 Gb 4 820 s / 80 minutes 2 250 s / 37 minutes
Work computer
4 cpu cores, 8 threads
16 Gb of RAM
OVHcloud instances
10 workers
2 cpu cores per workers
8 Gb of RAM per workers
16
Manual Spark cluster problems
• Take time and knowledge to setup
• Cluster Management
Spark version 2.2.1
125 Cores , 500GB RAM
GRA
Job 1
Job 2
Job 3
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Spark 2.2.1
150GB RAM, 37 Cores
GRA
Spark 2.2.1
100GB RAM, 25 Cores
GRA
17
Manual Spark cluster problems
• Take time and knowledge to setup
• Cluster Management
• Spark version bounded
Spark version 2.2.1
125 Cores , 500GB RAM
GRA
Job 1
Job 2
Job 3
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Spark 2.2.1
150GB RAM, 37 Cores
GRA
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Job 4
Spark 3.0.0
50GB RAM, 10 Cores
GRA
18
Manual Spark cluster problems
• Take time and knowledge to setup
• Cluster Management
• Spark version bounded
• Limited ressources
Spark version 2.2.1
125 Cores , 500GB RAM
GRA
Job 1
Job 2
Job 3
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Spark 2.2.1
150GB RAM, 37 Cores
GRA
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Job 4
Spark 2.2.1
200GB RAM, 50 Cores
GRA
19
Manual Spark cluster problems
• Take time and knowledge to setup
• Cluster Management
• Spark version bounded
• Limited ressources
• Region bounded
Spark version 2.2.1
125 Cores , 500GB RAM
GRA
Job 1
Job 2
Job 3
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Spark 2.2.1
150GB RAM, 37 Cores
GRA
Spark 2.2.1
100GB RAM, 25 Cores
GRA
Job 4
Spark 2.2.1
50GB RAM, 10 Cores
BHS
20
Manual Spark cluster problems
Job 1
Job 2
Job 3
Spark 2.4.5
4 nodes, 100GB RAM
GRA
Spark 2.4.3
5 nodes, 200GB RAM
BHS
Spark 3
2 nodes, 50GB RAM
GRA
• No Cluster Management
• Start a cluster in seconds
• Dedicated resources
• No Unused resources
• Any Spark version
• Any region
Job 4
Spark 3.0.0
200GB RAM, 50 Cores
BHS
21
OVHcloud Data Processing
A single cluster per job To focus on what matters
Be in the cloud To scale up and down quickly
Bin different regions Data policy compliance and be close to
data
Distributed Computing
Technology
To distribute processing task to several
computers automatically
22
OVHcloud Data Processing
A single cluster per job To focus on what matters
Be in the cloud To scale up and down quickly
Bin different regions Data policy compliance and be close to
data
Distributed Computing
Technology
To distribute processing task to several
computers automatically
OVHcloud Data Processing
23
OVHcloud Data Processing
24
OVHcloud Data Processing
25
OVHcloud Data Processing
26
OVHcloud Data Processing
27
OVHcloud Data Processing
28
OVHcloud Data Processing
Supported languages :
• Java 8
• Scala 2.12
• Python v2.7+ and v3.4+
Available regions :
• Europe
• North America (Soon)
Engine:
• Apache Spark 2.4.3
• Apache Spark 3.0.0 (Soon)
Job size :
• Up to 60 GB of RAM per Executor
• Up to 16 Cores of CPU per Executor
• Up to 10 Executors per job
29
1 Via Control Panel 2 Via API OVHcloud 3 Via CLI (Spark-Submit)
$ ./ovh-spark-submit 
--project-id yourProjectId 
--upload ./spark-examples.jar 
--class org.apache.spark.examples.SparkPi 
--driver-cores 1 
--driver-memory 4G 
--executor-cores 1 
--executor-memory 4G 
--num-executors 1 
swift://odp/spark-examples.jar 1000
• Auto upload your application code
• Open source
https://github.com/ovh/data-processing-spark-submit
30
Example 2 : New-YorkTaxis
• Date
• Number of people per ride
• Price
• Tips
• Pickup and dropoff zone
• 10 years worth of taxis ride
• +100 Go of data
31
Example 2 : New-YorkTaxis
First step: make the data accessible
Public Cloud
Object Storage
32
Example 2 : New-YorkTaxis
Second step: submit your job
Data Processing
Start a new job
33
Example 2 : New-YorkTaxis
Second step: submit your job
34
Example 2 : New-YorkTaxis
Second step: submit your job
35
Example 2 : New-YorkTaxis
Second step: submit your job
36
Example 2 : New-YorkTaxis
.
.
. ..
..
.
.
Bill
I submit 1job, with
• 1 driver ( 4 vCore / 15GB RAM)
• 5 executors( 5 x 8vCore / 30GB RAM)
I alsoused OVHcloud Object Storage
Job lasted15 minutes
Payment: (44 vCores + 165 GB RAM ) for15 minutes
= 1,33€
+ classicpriceof Object Storage(storageanddatatransfer)
37
Roadmap
Development Lab GA Soon
ETA = July 2020Launched (April)
Free ! Full working:
• Product itself
• Control panel/API/CLI
• Documentation
Added:
• Bug fix
• Billing
S2 2020
Backlog:
• More Regions
• More flavors/prices
• More Engines
• More Spark versions
38
Thank You !
https://gitter.im/ovh/data-processing
https://www.ovh.com/blog/ovhcloud-techs-talk-fr-s01e09/

Weitere ähnliche Inhalte

Was ist angesagt?

How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodesaaronmorton
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionDataStax Academy
 
1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockage1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockageOVHcloud
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...DataStax Academy
 
Cassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsCassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsJeff Jirsa
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackDataStax Academy
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreDataStax Academy
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2DataStax Academy
 
mParticle's Journey to Scylla from Cassandra
mParticle's Journey to Scylla from CassandramParticle's Journey to Scylla from Cassandra
mParticle's Journey to Scylla from CassandraScyllaDB
 
Wide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingWide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingScyllaDB
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...DataStax
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScyllaDB
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpDataStax
 
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...DataStax
 
Building Scalable, Real Time Applications for Financial Services with DataStax
Building Scalable, Real Time Applications for Financial Services with DataStaxBuilding Scalable, Real Time Applications for Financial Services with DataStax
Building Scalable, Real Time Applications for Financial Services with DataStaxDataStax
 
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...DataStax
 
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016DataStax
 
Cassandra Day London 2015: Securing Cassandra and DataStax Enterprise
Cassandra Day London 2015: Securing Cassandra and DataStax EnterpriseCassandra Day London 2015: Securing Cassandra and DataStax Enterprise
Cassandra Day London 2015: Securing Cassandra and DataStax EnterpriseDataStax Academy
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityScyllaDB
 

Was ist angesagt? (20)

How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
 
1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockage1 sysadmin vs 250 clusters de stockage
1 sysadmin vs 250 clusters de stockage
 
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Cassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For OperatorsCassandra Summit 2015: Real World DTCS For Operators
Cassandra Summit 2015: Real World DTCS For Operators
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User StoreAzure + DataStax Enterprise (DSE) Powers Office365 Per User Store
Azure + DataStax Enterprise (DSE) Powers Office365 Per User Store
 
Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2Oracle to Cassandra Core Concepts Guide Pt. 2
Oracle to Cassandra Core Concepts Guide Pt. 2
 
mParticle's Journey to Scylla from Cassandra
mParticle's Journey to Scylla from CassandramParticle's Journey to Scylla from Cassandra
mParticle's Journey to Scylla from Cassandra
 
Wide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data ModelingWide Column Store NoSQL vs SQL Data Modeling
Wide Column Store NoSQL vs SQL Data Modeling
 
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
Lessons Learned on Java Tuning for Our Cassandra Clusters (Carlos Monroy, Kne...
 
Scylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla OperatorScylla on Kubernetes: Introducing the Scylla Operator
Scylla on Kubernetes: Introducing the Scylla Operator
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
 
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
 
Building Scalable, Real Time Applications for Financial Services with DataStax
Building Scalable, Real Time Applications for Financial Services with DataStaxBuilding Scalable, Real Time Applications for Financial Services with DataStax
Building Scalable, Real Time Applications for Financial Services with DataStax
 
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
Cold Storage That Isn't Glacial (Joshua Hollander, Protectwise) | Cassandra S...
 
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016
Troubleshooting Cassandra (J.B. Langston, DataStax) | C* Summit 2016
 
Cassandra Day London 2015: Securing Cassandra and DataStax Enterprise
Cassandra Day London 2015: Securing Cassandra and DataStax EnterpriseCassandra Day London 2015: Securing Cassandra and DataStax Enterprise
Cassandra Day London 2015: Securing Cassandra and DataStax Enterprise
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availability
 

Ähnlich wie OVHcloud Tech Talks S01E09 - OVHcloud Data Processing : Le nouveau service pour valoriser vos données à l'aide du cloud

The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)Nicolas Poggi
 
The state of Spark in the cloud
The state of Spark in the cloudThe state of Spark in the cloud
The state of Spark in the cloudNicolas Poggi
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comDamien Krotkine
 
Using BigBench to compare Hive and Spark (Long version)
Using BigBench to compare Hive and Spark (Long version)Using BigBench to compare Hive and Spark (Long version)
Using BigBench to compare Hive and Spark (Long version)Nicolas Poggi
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHungWei Chiu
 
Machine Learning With H2O vs SparkML
Machine Learning With H2O vs SparkMLMachine Learning With H2O vs SparkML
Machine Learning With H2O vs SparkMLArnab Biswas
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsAli Hodroj
 
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseDatabricks
 
Oracle Cloud Infrastructure Data Science 概要資料(20200406)
Oracle Cloud Infrastructure Data Science 概要資料(20200406)Oracle Cloud Infrastructure Data Science 概要資料(20200406)
Oracle Cloud Infrastructure Data Science 概要資料(20200406)オラクルエンジニア通信
 
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumbergerinside-BigData.com
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per SecondAmazon Web Services
 
There's no magic... until you talk about databases
 There's no magic... until you talk about databases There's no magic... until you talk about databases
There's no magic... until you talk about databasesESUG
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyCeph Community
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Anthony Baker
 
Databricks clusters in autopilot mode
Databricks clusters in autopilot modeDatabricks clusters in autopilot mode
Databricks clusters in autopilot modePrakash Chockalingam
 
Cloud: From Unmanned Data Center to Algorithmic Economy using Openstack
Cloud: From Unmanned Data Center to Algorithmic Economy using OpenstackCloud: From Unmanned Data Center to Algorithmic Economy using Openstack
Cloud: From Unmanned Data Center to Algorithmic Economy using OpenstackAndrew Yongjoon Kong
 
KSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKristofferson A
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesRose Toomey
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesRose Toomey
 

Ähnlich wie OVHcloud Tech Talks S01E09 - OVHcloud Data Processing : Le nouveau service pour valoriser vos données à l'aide du cloud (20)

Spark Uber Development Kit
Spark Uber Development KitSpark Uber Development Kit
Spark Uber Development Kit
 
The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)The state of Hive and Spark in the Cloud (July 2017)
The state of Hive and Spark in the Cloud (July 2017)
 
The state of Spark in the cloud
The state of Spark in the cloudThe state of Spark in the cloud
The state of Spark in the cloud
 
Using Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.comUsing Riak for Events storage and analysis at Booking.com
Using Riak for Events storage and analysis at Booking.com
 
Using BigBench to compare Hive and Spark (Long version)
Using BigBench to compare Hive and Spark (Long version)Using BigBench to compare Hive and Spark (Long version)
Using BigBench to compare Hive and Spark (Long version)
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
Machine Learning With H2O vs SparkML
Machine Learning With H2O vs SparkMLMachine Learning With H2O vs SparkML
Machine Learning With H2O vs SparkML
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGs
 
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-PremiseTackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
Tackling Network Bottlenecks with Hardware Accelerations: Cloud vs. On-Premise
 
Oracle Cloud Infrastructure Data Science 概要資料(20200406)
Oracle Cloud Infrastructure Data Science 概要資料(20200406)Oracle Cloud Infrastructure Data Science 概要資料(20200406)
Oracle Cloud Infrastructure Data Science 概要資料(20200406)
 
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with SchlumbergerGet Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
Get Your Head in the Cloud - Lessons in GPU Computing with Schlumberger
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
 
There's no magic... until you talk about databases
 There's no magic... until you talk about databases There's no magic... until you talk about databases
There's no magic... until you talk about databases
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)Introduction to Apache Geode (Cork, Ireland)
Introduction to Apache Geode (Cork, Ireland)
 
Databricks clusters in autopilot mode
Databricks clusters in autopilot modeDatabricks clusters in autopilot mode
Databricks clusters in autopilot mode
 
Cloud: From Unmanned Data Center to Algorithmic Economy using Openstack
Cloud: From Unmanned Data Center to Algorithmic Economy using OpenstackCloud: From Unmanned Data Center to Algorithmic Economy using Openstack
Cloud: From Unmanned Data Center to Algorithmic Economy using Openstack
 
KSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success StoryKSCOPE 2013: Exadata Consolidation Success Story
KSCOPE 2013: Exadata Consolidation Success Story
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark Pipelines
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelines
 

Mehr von OVHcloud

OVHcloud Startup Program : Découvrir l'écosystème au service des startups
OVHcloud Startup Program : Découvrir l'écosystème au service des startups OVHcloud Startup Program : Découvrir l'écosystème au service des startups
OVHcloud Startup Program : Découvrir l'écosystème au service des startups OVHcloud
 
Fine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsFine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsOVHcloud
 
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...OVHcloud
 
Webinar - Enterprise Cloud Databases
Webinar - Enterprise Cloud DatabasesWebinar - Enterprise Cloud Databases
Webinar - Enterprise Cloud DatabasesOVHcloud
 
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...OVHcloud
 
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...OVHcloud
 
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...OVHcloud
 
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilité
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilitéOVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilité
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilitéOVHcloud
 
OVHcloud TechTalks - ML serving
OVHcloud TechTalks - ML servingOVHcloud TechTalks - ML serving
OVHcloud TechTalks - ML servingOVHcloud
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloudOVHcloud
 
Les APIs OpenStack
Les APIs OpenStackLes APIs OpenStack
Les APIs OpenStackOVHcloud
 
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...OVHcloud
 
Industrialize Machine Learning
Industrialize Machine Learning Industrialize Machine Learning
Industrialize Machine Learning OVHcloud
 
OVHcloud – Enterprise Cloud Databases
OVHcloud – Enterprise Cloud DatabasesOVHcloud – Enterprise Cloud Databases
OVHcloud – Enterprise Cloud DatabasesOVHcloud
 
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSX
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSXOVHcloud Hosted Private Cloud Platform Network use cases with VMware NSX
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSXOVHcloud
 
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...OVHcloud
 
Online passwords – understanding "credential stuffing" cyberattack
Online passwords – understanding "credential stuffing" cyberattackOnline passwords – understanding "credential stuffing" cyberattack
Online passwords – understanding "credential stuffing" cyberattackOVHcloud
 
OVHcloud and Microsoft for the public sector
OVHcloud and Microsoft for the public sectorOVHcloud and Microsoft for the public sector
OVHcloud and Microsoft for the public sectorOVHcloud
 
The new AMD EPYC solutions from OVHcloud: what benefits?
The new AMD EPYC solutions from OVHcloud: what benefits?The new AMD EPYC solutions from OVHcloud: what benefits?
The new AMD EPYC solutions from OVHcloud: what benefits?OVHcloud
 
Maximising the security of your cloud infrastructure
Maximising the security of your cloud infrastructureMaximising the security of your cloud infrastructure
Maximising the security of your cloud infrastructureOVHcloud
 

Mehr von OVHcloud (20)

OVHcloud Startup Program : Découvrir l'écosystème au service des startups
OVHcloud Startup Program : Découvrir l'écosystème au service des startups OVHcloud Startup Program : Découvrir l'écosystème au service des startups
OVHcloud Startup Program : Découvrir l'écosystème au service des startups
 
Fine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP modelsFine tune and deploy Hugging Face NLP models
Fine tune and deploy Hugging Face NLP models
 
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...
OVHcloud Tech Talks S01E08 - GAIA-X pour les techs : OVHcloud & Scaleway vous...
 
Webinar - Enterprise Cloud Databases
Webinar - Enterprise Cloud DatabasesWebinar - Enterprise Cloud Databases
Webinar - Enterprise Cloud Databases
 
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...
OVHcloud Tech Talks S01E07 – Introduction à l’intelligence artificielle pour ...
 
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...
OVHcloud Tech Talks Fr S01E06 – BeeGFS, un filesystem orienté performance, ma...
 
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...
OVHcloud Tech Talks Fr S01E05 – L’opérateur Harbor, une nécessité pour certai...
 
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilité
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilitéOVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilité
OVHcloud Tech-Talk S01E04 - La télémétrie au service de l'agilité
 
OVHcloud TechTalks - ML serving
OVHcloud TechTalks - ML servingOVHcloud TechTalks - ML serving
OVHcloud TechTalks - ML serving
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Les APIs OpenStack
Les APIs OpenStackLes APIs OpenStack
Les APIs OpenStack
 
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...
Migrer 3 millions de sites sans maitriser leur code source ? Impossible mais ...
 
Industrialize Machine Learning
Industrialize Machine Learning Industrialize Machine Learning
Industrialize Machine Learning
 
OVHcloud – Enterprise Cloud Databases
OVHcloud – Enterprise Cloud DatabasesOVHcloud – Enterprise Cloud Databases
OVHcloud – Enterprise Cloud Databases
 
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSX
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSXOVHcloud Hosted Private Cloud Platform Network use cases with VMware NSX
OVHcloud Hosted Private Cloud Platform Network use cases with VMware NSX
 
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...
Pilotage et gestion proactive de vos machines virtuelles dans le Hosted Priva...
 
Online passwords – understanding "credential stuffing" cyberattack
Online passwords – understanding "credential stuffing" cyberattackOnline passwords – understanding "credential stuffing" cyberattack
Online passwords – understanding "credential stuffing" cyberattack
 
OVHcloud and Microsoft for the public sector
OVHcloud and Microsoft for the public sectorOVHcloud and Microsoft for the public sector
OVHcloud and Microsoft for the public sector
 
The new AMD EPYC solutions from OVHcloud: what benefits?
The new AMD EPYC solutions from OVHcloud: what benefits?The new AMD EPYC solutions from OVHcloud: what benefits?
The new AMD EPYC solutions from OVHcloud: what benefits?
 
Maximising the security of your cloud infrastructure
Maximising the security of your cloud infrastructureMaximising the security of your cloud infrastructure
Maximising the security of your cloud infrastructure
 

Kürzlich hochgeladen

Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...tanu pandey
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Servicesexy call girls service in goa
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.soniya singh
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLimonikaupta
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...sonatiwari757
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...Neha Pandey
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024APNIC
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girladitipandeya
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...aditipandeya
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxellan12
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsstephieert
 

Kürzlich hochgeladen (20)

Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girls
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICECall Girls In South Ex 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
Call Girls In South Ex 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SERVICE
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...
Call Girls in Mayur Vihar ✔️ 9711199171 ✔️ Delhi ✔️ Enjoy Call Girls With Our...
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
VIP 7001035870 Find & Meet Hyderabad Call Girls Dilsukhnagar high-profile Cal...
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Radiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girlsRadiant Call girls in Dubai O56338O268 Dubai Call girls
Radiant Call girls in Dubai O56338O268 Dubai Call girls
 

OVHcloud Tech Talks S01E09 - OVHcloud Data Processing : Le nouveau service pour valoriser vos données à l'aide du cloud

  • 1. 1
  • 2. 2 OVHcloud Data Processing : Le nouveau service pour valoriser vos données à l'aide du cloud BELLLION Bastien 2020-06-25
  • 3. 3 Bastien BELLION • Data Convergence team • bastien.bellion@ovhcloud.com • @BastienBellion Data Convergence : • Analytics Data Platform • Data Processing
  • 4. 4 Content • Big Data • Cluster Computing • Apache Spark • Exemple 1: Lyrics Wordcount • OVHcloud Data Processing platform • Exemple 2: New-York Taxis • Roadmap
  • 5. 5 Why do we need to process data? • Invent new products • Make better decisions • Prevent danger and fraud • Forecasting events
  • 6. 6 Flood of data • 90 percent of data in the world has been generated in last 2 years. • Autonomouscars: 20TB /hour • 500 million tweets per day. • 75 billion IoT devices by 2025 • Data exhaust
  • 7. 7 Where is the data stored?
  • 9. 9 Selecting the right tool • Open source • Biggest open source community in big data • Fast • Java/Scala/Python/R • Cluster Big data tools – Google trend graph
  • 10. 10 Who is using Spark ? https://databricks.com/fr/session/netflixs-recommendation-ml- pipeline-using-apache-spark https://labs.criteo.com/2018/01/spark-out-of-memory/
  • 12. 12 Spark Spark Cluster Job 1 Job 2 Job 3 spark-submit --class org.apache.spark.examples.SparkPi --driver-memory 8G --master local[2] --executor-memory 8G --executor-cores 2 --num-executors 3 spark-examples.jar
  • 13. 13 Example 1 : Lyrics Wordcount https://github.com/walkerkq/musiclyrics Billboard hot 100 between 1965-2015 5100 song with for each: • Rank of the year • Song name • Artist • Year • Lyrics • Source Full written guide : https://docs.ovh.com/gb/en/data-processing/wordcount-spark/
  • 14. 14 Example 1 : Lyrics Wordcount 1/3 of dataset, 1700 songs 1/3 of dataset, 1700 songs 1/3 of dataset, 1700 songs Lyrics.csv
  • 15. 15 Example 1 : Lyrics Wordcount Volume of processed data Time elapsed one computer Time elapsed spark cluster 7.59 Mb 0,5 s 60 s 759 Mb 5,1 s 100 s 7.59 Gb 48 s 186 s 76 Gb 490 s 358 s 760 Gb 4 820 s / 80 minutes 2 250 s / 37 minutes Work computer 4 cpu cores, 8 threads 16 Gb of RAM OVHcloud instances 10 workers 2 cpu cores per workers 8 Gb of RAM per workers
  • 16. 16 Manual Spark cluster problems • Take time and knowledge to setup • Cluster Management Spark version 2.2.1 125 Cores , 500GB RAM GRA Job 1 Job 2 Job 3 Spark 2.2.1 100GB RAM, 25 Cores GRA Spark 2.2.1 150GB RAM, 37 Cores GRA Spark 2.2.1 100GB RAM, 25 Cores GRA
  • 17. 17 Manual Spark cluster problems • Take time and knowledge to setup • Cluster Management • Spark version bounded Spark version 2.2.1 125 Cores , 500GB RAM GRA Job 1 Job 2 Job 3 Spark 2.2.1 100GB RAM, 25 Cores GRA Spark 2.2.1 150GB RAM, 37 Cores GRA Spark 2.2.1 100GB RAM, 25 Cores GRA Job 4 Spark 3.0.0 50GB RAM, 10 Cores GRA
  • 18. 18 Manual Spark cluster problems • Take time and knowledge to setup • Cluster Management • Spark version bounded • Limited ressources Spark version 2.2.1 125 Cores , 500GB RAM GRA Job 1 Job 2 Job 3 Spark 2.2.1 100GB RAM, 25 Cores GRA Spark 2.2.1 150GB RAM, 37 Cores GRA Spark 2.2.1 100GB RAM, 25 Cores GRA Job 4 Spark 2.2.1 200GB RAM, 50 Cores GRA
  • 19. 19 Manual Spark cluster problems • Take time and knowledge to setup • Cluster Management • Spark version bounded • Limited ressources • Region bounded Spark version 2.2.1 125 Cores , 500GB RAM GRA Job 1 Job 2 Job 3 Spark 2.2.1 100GB RAM, 25 Cores GRA Spark 2.2.1 150GB RAM, 37 Cores GRA Spark 2.2.1 100GB RAM, 25 Cores GRA Job 4 Spark 2.2.1 50GB RAM, 10 Cores BHS
  • 20. 20 Manual Spark cluster problems Job 1 Job 2 Job 3 Spark 2.4.5 4 nodes, 100GB RAM GRA Spark 2.4.3 5 nodes, 200GB RAM BHS Spark 3 2 nodes, 50GB RAM GRA • No Cluster Management • Start a cluster in seconds • Dedicated resources • No Unused resources • Any Spark version • Any region Job 4 Spark 3.0.0 200GB RAM, 50 Cores BHS
  • 21. 21 OVHcloud Data Processing A single cluster per job To focus on what matters Be in the cloud To scale up and down quickly Bin different regions Data policy compliance and be close to data Distributed Computing Technology To distribute processing task to several computers automatically
  • 22. 22 OVHcloud Data Processing A single cluster per job To focus on what matters Be in the cloud To scale up and down quickly Bin different regions Data policy compliance and be close to data Distributed Computing Technology To distribute processing task to several computers automatically OVHcloud Data Processing
  • 28. 28 OVHcloud Data Processing Supported languages : • Java 8 • Scala 2.12 • Python v2.7+ and v3.4+ Available regions : • Europe • North America (Soon) Engine: • Apache Spark 2.4.3 • Apache Spark 3.0.0 (Soon) Job size : • Up to 60 GB of RAM per Executor • Up to 16 Cores of CPU per Executor • Up to 10 Executors per job
  • 29. 29 1 Via Control Panel 2 Via API OVHcloud 3 Via CLI (Spark-Submit) $ ./ovh-spark-submit --project-id yourProjectId --upload ./spark-examples.jar --class org.apache.spark.examples.SparkPi --driver-cores 1 --driver-memory 4G --executor-cores 1 --executor-memory 4G --num-executors 1 swift://odp/spark-examples.jar 1000 • Auto upload your application code • Open source https://github.com/ovh/data-processing-spark-submit
  • 30. 30 Example 2 : New-YorkTaxis • Date • Number of people per ride • Price • Tips • Pickup and dropoff zone • 10 years worth of taxis ride • +100 Go of data
  • 31. 31 Example 2 : New-YorkTaxis First step: make the data accessible Public Cloud Object Storage
  • 32. 32 Example 2 : New-YorkTaxis Second step: submit your job Data Processing Start a new job
  • 33. 33 Example 2 : New-YorkTaxis Second step: submit your job
  • 34. 34 Example 2 : New-YorkTaxis Second step: submit your job
  • 35. 35 Example 2 : New-YorkTaxis Second step: submit your job
  • 36. 36 Example 2 : New-YorkTaxis . . . .. .. . . Bill I submit 1job, with • 1 driver ( 4 vCore / 15GB RAM) • 5 executors( 5 x 8vCore / 30GB RAM) I alsoused OVHcloud Object Storage Job lasted15 minutes Payment: (44 vCores + 165 GB RAM ) for15 minutes = 1,33€ + classicpriceof Object Storage(storageanddatatransfer)
  • 37. 37 Roadmap Development Lab GA Soon ETA = July 2020Launched (April) Free ! Full working: • Product itself • Control panel/API/CLI • Documentation Added: • Bug fix • Billing S2 2020 Backlog: • More Regions • More flavors/prices • More Engines • More Spark versions