Big Data meets Big Compute
Connecting MongoDB and Spark for Fun and Profit
{ name: "Ross Lawley", role: "Senior Software Engineer", twitter: "@RossC0" }
#MDBE16
Agenda
01 Spark introduction
•  What is Spark?
•  How does it work?
•  What problems can it solve?
•  What's the future of Spark?
02 The new connector
•  Introducing the new connector
•  How to install it and use it
•  How to use it in various languages
•  When to use it, when not to
03 Internals
•  A deep dive into the connector
•  Configuration options
•  Partitioning challenges
•  How to scale and keep data local
04 Demo
•  An impressive demonstration of MongoDB and Spark combined!
05 Conclusions
•  Quick recap
•  Where to go for more information
06 Questions
•  I'll try and help answer any questions you might have.
•  I'll also answer questions at the Drivers booth!
Spark: an introduction
What is Spark?
Fast and distributed general computing engine
•  Makes it easy and fast to process large datasets
•  Libraries for SQL, streaming, machine learning, graphs
•  APIs in Scala, Python, Java, R
•  It’s fundamentally different to what’s come before
So why not just use Hadoop?
Spark is FAST
•  Faster to write.
•  Friendly API in Scala, Python, Java and R
•  Faster to run.
•  Up to 100x faster than Hadoop MapReduce in memory
•  Up to 10x faster on disk
A visual comparison
[Figure: Hadoop vs. Spark]
Spark History
•  2009: The Beginning. Spark project started at UC Berkeley's AMPLab
•  2010: Spark open sourced
•  2013: Joined the Apache foundation
•  2014: Spark 1.0.0 – 1.2.0. Scala, Java & Python; Spark SQL; Streaming; MLlib; GraphX
•  2015: Spark 1.3.0 – 1.5.0. R support; Spark SQL out of alpha; DataFrames
•  2016: Spark 1.6.0 and Spark 2.0. Datasets; Structured Streaming
Spark Programming Model
Resilient Distributed Datasets
•  An RDD is a collection of elements that is immutable, distributed and fault-tolerant.
•  Transformations can be applied to an RDD, resulting in a new RDD.
•  Actions can be applied to an RDD to obtain a value.
•  RDDs are lazy: transformations are only evaluated when an action is called.
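The laziness described above can be illustrated without a cluster. The following is only an analogy, assuming a plain Scala collection view as a stand-in for an RDD: building the chain of `map` and `filter` (the "transformations") does no work, and the mapping function only runs when `sum` (the "action") forces the result. The object name `LazyRddAnalogy` is hypothetical.

```scala
object LazyRddAnalogy {
  /** Returns (evaluations before the action, result, evaluations after the action). */
  def run(): (Int, Int, Int) = {
    var evaluations = 0

    // "Transformations": building the view only records what to do.
    val transformed = (1 to 10).view
      .map { n => evaluations += 1; n * 2 }
      .filter(_ % 4 == 0)
    val before = evaluations

    // "Action": forcing the view runs the whole chain.
    val total = transformed.sum
    (before, total, evaluations)
  }

  def main(args: Array[String]): Unit = {
    val (before, total, after) = run()
    println(s"evaluations before action: $before, total: $total, evaluations after action: $after")
  }
}
```

A real RDD behaves similarly in spirit: `filter` and `map` return new RDDs immediately, and work is scheduled only when an action such as `count` is called.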
RDD Operations
Transformations: map, filter, flatMap, mapPartitions, sample, union, join, groupByKey, reduceByKey
Actions: reduce, collect, count, save, lookupKey, take, foreach
Built in fault tolerance
RDDs maintain lineage information that can be used to reconstruct lost
partitions
val searches = spark.textFile("hdfs://...")
                    .filter(_.contains("Search"))
                    .map(_.split("\t")(2)).cache()
                    .filter(_.contains("MongoDB"))
                    .count()
HDFS RDD → Filtered RDD → Mapped RDD → Cached RDD → Filtered RDD → Count
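To make the lineage idea concrete, here is a minimal sketch in plain Scala rather than Spark. It assumes a hypothetical in-memory source split into two partitions and models lineage as a function that can rebuild any partition from its source. Dropping a cached partition and asking for the count again rebuilds only that partition. The names `LineageSketch`, `sourcePartitions` and `cache` are illustrative and not part of Spark.

```scala
import scala.collection.mutable

object LineageSketch {
  // Hypothetical stand-in for the source data, already split into partitions.
  val sourcePartitions: Vector[Vector[String]] = Vector(
    Vector("Search\tweb\tMongoDB Spark", "Click\tweb\thome"),
    Vector("Search\tapp\tHadoop", "Search\tapp\tMongoDB Atlas")
  )

  // The lineage: the ordered transformations that derive a partition from its source.
  val lineage: Vector[String] => Vector[String] =
    _.filter(_.contains("Search")).map(_.split("\t")(2))

  // Cache of computed partitions, keyed by partition index.
  val cache: mutable.Map[Int, Vector[String]] = mutable.Map.empty

  // Use the cached partition if present, otherwise rebuild it from the lineage.
  def partition(i: Int): Vector[String] =
    cache.getOrElseUpdate(i, lineage(sourcePartitions(i)))

  def countMatches(): Int =
    sourcePartitions.indices.map(i => partition(i).count(_.contains("MongoDB"))).sum

  def main(args: Array[String]): Unit = {
    val before = countMatches()
    cache.remove(1)            // simulate losing a cached partition
    val after = countMatches() // partition 1 is rebuilt from its lineage
    println(s"before = $before, after = $after")
  }
}
```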
Spark topology
[Diagram: Spark Driver, Cluster Manager, Worker 1, Worker 2 . . . Worker n, Data source]
Spark high level view
RDDs – Unstructured Data
Datasets – Structured Data
Libraries: Spark SQL, Spark Streaming, MLlib, GraphX
The new connector
Connecting MongoDB and Spark
Big Data Storage (MongoDB) meets Big Data Compute (Spark)
Different use cases
Applications
OLTP
Fine grained operations
Offline Processing
Analytics
Data Warehousing
The MongoDB Spark Connector
•  Spark 1.6.x and Spark 2.0.x
•  Scala, Python, Java, and R
•  Idiomatic Scala API
•  Supports custom Aggregations
•  Multiple partitioning strategies
•  Automatic schema inference
•  Automatic conversion to Datasets
>	$SPARK_HOME/bin/spark-shell	--packages	org.mongodb.spark:mongo-spark-connector_2.10:2.0.0
“Users are already combining Apache Spark and MongoDB to build sophisticated analytics applications. The new native MongoDB Connector for Apache Spark provides higher performance, greater ease of use, and access to more advanced Apache Spark functionality than any MongoDB connector available today.”
Reynold Xin, Co-Founder and Chief Architect at Databricks
Fare Calculation Engine
One of the World's Largest Airlines Migrates from Oracle to MongoDB and Apache Spark to Support 100x Performance Improvement
Problem
•  China Eastern targeting 130,000 seats sold every day across its web and mobile channels
•  New fare calculation engine needed to support 20,000 search queries per second, but current Oracle platform supported only 200 per second
Solution
•  Apache Spark used for fare calculations, using business rules stored in MongoDB
•  Fare calculations written to MongoDB for access by the search application
•  MongoDB Connector for Apache Spark allows seamless integration with data locality awareness across the cluster
Results
•  Cluster of fewer than 20 API, Spark & MongoDB nodes supports 180m fare calculations & 1.6 billion searches per day
•  Each node delivers 15x higher performance and 10x lower latency than existing Oracle servers
•  MongoDB Enterprise Advanced provided Ops Manager for operational automation and access to expert technical support
Connector Internals
What's needed to connect to Spark?
1. Create a connection
•  This has some cost.
The Mongo Java Driver runs a connection pool, authenticates connections, performs replica set discovery, etc.
•  Only two modes to support:
Reads
Writes
What's needed to connect to Spark?
2. Partition the data
•  Partitions provide parallelism – splits the collection into parts
•  Challenging for mutable data sources, as the data is not a snapshot in time
RDD / Collection
MongoSamplePartitioner
The default partitioner
•  Oversamples the collection
•  Calculates the number of partitions.
Uses the average document size and the configured partition size.
•  Samples the collection, sampling n documents per partition
•  Sorts the data by partition key
•  Takes every nth document as a partition boundary
•  Adds a min and max key partition split at the start and end of the collection
{$gte: {_id: minKey}, $lt: {_id: 1}}
{$gte: {_id: 1}, $lt: {_id: 100}}
{$gte: {_id: 100}, $lt: {_id: 200}}
. . .
{$gte: {_id: 4900}, $lt: {_id: 5000}}
{$gte: {_id: 5000}, $lt: {_id: maxKey}}
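The sampling steps above can be sketched in plain Scala. This is a simplified model written for illustration, not the connector's implementation: the collection is assumed to be a vector of integer `_id` values, `None` stands in for minKey and maxKey, and the names `SamplePartitionerSketch`, `KeyRange` and `samplesPerPartition` are hypothetical.

```scala
import scala.util.Random

object SamplePartitionerSketch {
  // A half-open key range [lower, upper); None stands for minKey / maxKey.
  final case class KeyRange(lower: Option[Int], upper: Option[Int])

  def partition(
      ids: Vector[Int],          // stand-in for the collection's _id values
      avgDocSizeBytes: Int,
      partitionSizeBytes: Int,
      samplesPerPartition: Int,
      rng: Random
  ): Vector[KeyRange] = {
    // Number of partitions from average document size and target partition size.
    val docsPerPartition = math.max(1, partitionSizeBytes / avgDocSizeBytes)
    val numPartitions = math.max(1, math.ceil(ids.size.toDouble / docsPerPartition).toInt)

    // Oversample: several sampled keys for every partition, sorted by key.
    val sampleSize = math.min(ids.size, numPartitions * samplesPerPartition)
    val sample = rng.shuffle(ids).take(sampleSize).sorted

    // Every nth sampled key becomes a boundary between two partitions.
    val boundaries = sample.zipWithIndex.collect {
      case (id, i) if i > 0 && i % samplesPerPartition == 0 => id
    }

    // Pad with minKey and maxKey so the ranges cover the whole collection.
    val edges: Vector[Option[Int]] = (None +: boundaries.map(id => Option(id))) :+ None
    edges.zip(edges.tail).map { case (lo, hi) => KeyRange(lo, hi) }
  }

  def main(args: Array[String]): Unit =
    partition((0 until 1000).toVector, 100, 10000, 10, new Random(7)).foreach(println)
}
```

Because the boundaries come from a sample, the resulting partitions are only approximately equal in size, which is the trade-off for not scanning the whole collection.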
MongoShardedPartitioner
Sharded collections are already partitioned
•  Examines the shard config database
•  Creates partitions based on the shard chunk min and max ranges
•  Stores the Shard location data for the chunk, to help promote locality
•  Adds a min and max key partition split at the start and end of the collection
{$gte: {_id: minKey}, $lt: {_id: 1}}
. . .
{$gte: {_id: 194}, $lt: {_id: 232}}
. . .
{$gte: {_id: 1000}, $lt: {_id: maxKey}}
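As a rough sketch of the idea, the mapping from shard chunks to Spark partitions with preferred locations might look like the following. The types `Chunk` and `Partition` and the `shardHosts` lookup are hypothetical stand-ins for what would be read from the shard config database; this is not the connector's code.

```scala
object ShardedPartitionerSketch {
  // A chunk as recorded in the shard metadata; None stands for minKey / maxKey.
  final case class Chunk(min: Option[Int], max: Option[Int], shard: String)

  // A Spark partition: a key range plus the hosts where that data lives.
  final case class Partition(index: Int, min: Option[Int], max: Option[Int], preferredHosts: Seq[String])

  // One partition per chunk, carrying the owning shard's hosts to promote locality.
  def partitions(chunks: Seq[Chunk], shardHosts: Map[String, Seq[String]]): Seq[Partition] =
    chunks.zipWithIndex.map { case (chunk, i) =>
      Partition(i, chunk.min, chunk.max, shardHosts.getOrElse(chunk.shard, Nil))
    }

  def main(args: Array[String]): Unit = {
    val chunks = Seq(
      Chunk(None, Some(1), "shard0"),
      Chunk(Some(194), Some(232), "shard1"),
      Chunk(Some(1000), None, "shard0")
    )
    val hosts = Map("shard0" -> Seq("host-a"), "shard1" -> Seq("host-b"))
    partitions(chunks, hosts).foreach(println)
  }
}
```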
Alternative Partitioners
•  MongoSplitVectorPartitioner
A partitioner for standalone servers or replica sets. Uses the splitVector command, which requires special privileges.
•  MongoPaginateByCountPartitioner
Creates a maximum number of partitions
Costs a query to calculate each partition
•  MongoPaginateBySizePartitioner
As above but using average document size to determine the partitions.
•  Create your own
Just implement the MongoPartitioner trait and add its fully qualified class name to the config
What's needed to connect to Spark?
3. Support DataFrames & Datasets
•  RDDs with a schema
•  Supports Simple Types
•  BinaryType, BooleanType, ByteType, CalendarIntervalType, DateType, DoubleType, FloatType,
IntegerType, LongType, NullType, ShortType, StringType, TimestampType
•  Complex Types:
•  ArrayType - Typed Array
•  StructType – Map
•  Unsupported BSON types are represented as a StructType, similar to Extended JSON.
DataFrames & Datasets
•  Automatic Schema inference:
val dataFrame = MongoSpark.load(sparkSession)
•  Supply the schema:
case class Person(firstName: String, lastName: String)
val dataFrame = MongoSpark.load[Person](sparkSession)
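To give a feel for what automatic schema inference involves, here is a minimal sketch in plain Scala. It assumes sampled documents are simple `Map[String, Any]` values and that conflicting field types fall back to `StringType`; both the fallback rule and the names `SchemaInferenceSketch`, `typeOf`, `merge` and `infer` are assumptions made for illustration, not the connector's confirmed behaviour.

```scala
object SchemaInferenceSketch {
  // Name of the Spark SQL type a sampled value would map to (simple types only).
  def typeOf(value: Any): String = value match {
    case _: Int     => "IntegerType"
    case _: Long    => "LongType"
    case _: Double  => "DoubleType"
    case _: Boolean => "BooleanType"
    case _: String  => "StringType"
    case null       => "NullType"
    case _          => "StringType" // assumption: anything else is treated as a string
  }

  // Combine the types seen for one field across documents.
  def merge(a: String, b: String): String =
    if (a == b) a
    else if (a == "NullType") b
    else if (b == "NullType") a
    else "StringType" // assumption: conflicting types fall back to StringType

  // Infer one type per field name from a sample of documents.
  def infer(sample: Seq[Map[String, Any]]): Map[String, String] =
    sample.flatMap(_.toSeq)
      .groupBy { case (field, _) => field }
      .map { case (field, pairs) => field -> pairs.map { case (_, v) => typeOf(v) }.reduce(merge) }

  def main(args: Array[String]): Unit = {
    val sample: Seq[Map[String, Any]] = Seq(
      Map("firstName" -> "Ada", "age" -> 36),
      Map("firstName" -> "Grace", "age" -> null, "active" -> true),
      Map("firstName" -> "Linus", "age" -> "unknown")
    )
    infer(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Supplying a case class, as in the `Person` example, avoids the sampling step and any ambiguity about conflicting types.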
What's needed to connect to Spark?
4. Configuration
•  Read Config
•  uri, database, collection, partitioner, sampleSize, localThreshold, readPreference, readConcern
•  Write Config
•  uri, database, collection, writeConcern
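As an illustration of how these options are usually expressed, the sketch below keeps them in a plain Scala map. The key names assume the connector's `spark.mongodb.input.` and `spark.mongodb.output.` prefixes, and all values (URIs, sizes, thresholds) are hypothetical; check the connector documentation for the exact names supported by your version.

```scala
object ConnectorConfigSketch {
  // Hypothetical values; key names assume the spark.mongodb.* naming convention.
  val settings: Map[String, String] = Map(
    "spark.mongodb.input.uri"                 -> "mongodb://localhost/test.people",
    "spark.mongodb.input.partitioner"         -> "MongoSamplePartitioner",
    "spark.mongodb.input.sampleSize"          -> "1000",
    "spark.mongodb.input.localThreshold"      -> "15",
    "spark.mongodb.input.readPreference.name" -> "secondaryPreferred",
    "spark.mongodb.output.uri"                -> "mongodb://localhost/test.results",
    "spark.mongodb.output.writeConcern.w"     -> "majority"
  )

  // Strip a prefix to recover the short option names listed on the slide.
  def optionsFor(prefix: String): Map[String, String] =
    settings.collect { case (key, value) if key.startsWith(prefix) => key.stripPrefix(prefix) -> value }

  def main(args: Array[String]): Unit = {
    println(s"read config:  ${optionsFor("spark.mongodb.input.")}")
    println(s"write config: ${optionsFor("spark.mongodb.output.")}")
  }
}
```

In a real application these entries would typically be passed to the Spark configuration when the session is created.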
The Anatomy of a read
MongoSpark.load(sparkSession).count()
1.  Create a MongoRDD[Row]
2.  Infer the schema (none provided)
3.  Partition the data
4.  Calculate the partitions
5.  Allocate the workers
6.  For each partition on each worker:
i.  Query MongoDB and return the cursor
ii.  Iterate the cursor and sum up the data
7.  Finally, the Spark application returns the sum of the sums.
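Steps 6 and 7 amount to a partial count per partition followed by a final sum. A minimal sketch in plain Scala, assuming each partition is an in-memory sequence standing in for a MongoDB cursor and using Futures in place of Spark workers (the name `CountSketch` is hypothetical):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CountSketch {
  // Each inner Seq stands in for the documents one worker reads through its cursor.
  def countAll(partitions: Seq[Seq[String]]): Long = {
    // One task per partition: iterate the "cursor" and count locally.
    val partialCounts: Seq[Future[Long]] =
      partitions.map(docs => Future(docs.iterator.size.toLong))

    // Driver side: combine the partial counts into the final result.
    Await.result(Future.sequence(partialCounts), 10.seconds).sum
  }

  def main(args: Array[String]): Unit =
    println(countAll(Seq(Seq("a", "b"), Seq("c"), Seq.empty)))
}
```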
Performance
•  MongoDB Usual Suspects
Document design
Indexes
Read Concern
•  Spark Specifics
Partitioning Strategy
Data Locality
Data locality
[Diagram: MongoS instances]
Configure: LocalThreshold, MongoShardedPartitioner
[Diagram: MongoD and MongoS instances]
Configure: ReadPreference, LocalThreshold, MongoShardedPartitioner
Demo Time!
Scenario: You've won the EuroMillions lottery!
•  To celebrate you want to travel to Europe's 50 largest cities!
•  The nouveau riche only have one way to travel: in style by personal helicopter!
•  It's a logistical nightmare: the "Travelling Salesman Problem"
The scale of the problem
•  With 50 places to visit there are: 49 x 48 x 47 x … x 3 x 2 x 1
possible ways to travel between them.
This number is 63 digits long:
608,281,864,034,267,560,872,252,163,321,295,376,887,552,831,379,210,240,000,000,000
•  We don't need to calculate all possible routes. We just need a route that is good enough.
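The size of that number is easy to check. A small sketch, assuming the starting city is fixed so that the remaining 49 cities can be visited in any order (the name `RouteCount` is hypothetical):

```scala
object RouteCount {
  // With the starting city fixed, the other (cities - 1) can be ordered in (cities - 1)! ways.
  def routes(cities: Int): BigInt =
    (BigInt(1) to BigInt(cities - 1)).product

  def main(args: Array[String]): Unit = {
    val total = routes(50)
    println(s"$total (${total.toString.length} digits)")
  }
}
```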
Choosing MongoDB and Spark
Good fit:
•  Not possible directly via the aggregation framework
•  CPU intensive task
•  Needs code to solve the problem
Bad fit:
•  Not an obviously parallel problem
•  Can fork, divide and join using Spark
Finding a solution with a genetic algorithm
Slightly complex but basically we're using evolution.
•  Randomly generate a number of routes
•  Then "evolve" the routes over a number of generations
•  Crossover two parent routes to create a child route.
•  Randomly mutate a % of children routes.
•  Keep a percentage of the best routes.
•  After X generations we end up with an evolved route that is short
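The steps above can be sketched on a single machine in plain Scala. This is an illustrative sketch, not the demo's code: it assumes cities are 2D points, uses ordered crossover and swap mutation as the specific operators, and keeps a fixed fraction of the best routes each generation. All names and parameters (`TspGeneticSketch`, `eliteFraction`, `mutationRate`) are hypothetical.

```scala
import scala.util.Random

object TspGeneticSketch {
  type City = (Double, Double)
  type Route = Vector[Int] // a permutation of city indices

  // Length of the closed tour that visits the cities in route order.
  def tourLength(route: Route, cities: Vector[City]): Double =
    route.indices.map { i =>
      val (x1, y1) = cities(route(i))
      val (x2, y2) = cities(route((i + 1) % route.size))
      math.hypot(x2 - x1, y2 - y1)
    }.sum

  // Ordered crossover: keep a slice of parent a, fill the rest in parent b's order.
  def crossover(a: Route, b: Route, rng: Random): Route = {
    val i = rng.nextInt(a.size)
    val j = rng.nextInt(a.size)
    val start = math.min(i, j)
    val end = math.max(i, j) + 1
    val slice = a.slice(start, end)
    val rest = b.filterNot(slice.toSet)
    rest.take(start) ++ slice ++ rest.drop(start)
  }

  // Swap mutation: exchange two randomly chosen positions.
  def mutate(route: Route, rng: Random): Route = {
    val i = rng.nextInt(route.size)
    val j = rng.nextInt(route.size)
    route.updated(i, route(j)).updated(j, route(i))
  }

  // Assumes populationSize >= 2 and at least one city.
  def evolve(cities: Vector[City], populationSize: Int, generations: Int,
             eliteFraction: Double, mutationRate: Double, rng: Random): Route = {
    val base = cities.indices.toVector
    // Randomly generate the initial routes.
    var population = Vector.fill(populationSize)(rng.shuffle(base))
    val eliteCount = math.min(populationSize, math.max(2, (populationSize * eliteFraction).toInt))

    for (_ <- 1 to generations) {
      // Keep a percentage of the best routes.
      val elite = population.sortBy(tourLength(_, cities)).take(eliteCount)
      // Breed children from the elite and mutate some of them.
      val children = Vector.fill(populationSize - eliteCount) {
        val child = crossover(elite(rng.nextInt(eliteCount)), elite(rng.nextInt(eliteCount)), rng)
        if (rng.nextDouble() < mutationRate) mutate(child, rng) else child
      }
      population = elite ++ children
    }
    population.minBy(tourLength(_, cities))
  }

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val cities = Vector.fill(20)((rng.nextDouble() * 100, rng.nextDouble() * 100))
    val best = evolve(cities, 100, 200, 0.2, 0.1, rng)
    println(s"best route: $best, length: ${tourLength(best, cities)}")
  }
}
```

Because the best routes are carried over unchanged, the best tour length never gets worse from one generation to the next. The result is a good route, not necessarily the optimal one.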
Conclusions
An extremely powerful combination
•  Many possible use cases
•  Solve the right problems
Some operations may be faster if performed using the Aggregation Framework
•  Performance
•  Pick the correct partitioning strategy
•  Tune MongoDB
•  Tune Spark
•  Spark is evolving all the time
Questions?
https://docs.mongodb.com/spark-connector
https://github.com/mongodb/mongo-spark
https://university.mongodb.com/courses/M233/about
MongoDB Europe 2016 - Big Data meets Big Compute

Weitere ähnliche Inhalte

Was ist angesagt?

MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQL
MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQLMongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQL
MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQLMongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBMongoDB
 
Webinar: Choosing the Right Shard Key for High Performance and Scale
Webinar: Choosing the Right Shard Key for High Performance and ScaleWebinar: Choosing the Right Shard Key for High Performance and Scale
Webinar: Choosing the Right Shard Key for High Performance and ScaleMongoDB
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB
 
Exploring the replication and sharding in MongoDB
Exploring the replication and sharding in MongoDBExploring the replication and sharding in MongoDB
Exploring the replication and sharding in MongoDBIgor Donchovski
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...Gianfranco Palumbo
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingMongoDB
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB
 
MongoDB Days Silicon Valley: Introducing MongoDB 3.2
MongoDB Days Silicon Valley: Introducing MongoDB 3.2MongoDB Days Silicon Valley: Introducing MongoDB 3.2
MongoDB Days Silicon Valley: Introducing MongoDB 3.2MongoDB
 
Building Spring Data with MongoDB
Building Spring Data with MongoDBBuilding Spring Data with MongoDB
Building Spring Data with MongoDBMongoDB
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design PatternsMongoDB
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB
 
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineMongoDB
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBMongoDB
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLMongoDB
 

Was ist angesagt? (20)

MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQL
MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQLMongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQL
MongoDB .local Munich 2019: Managing a Heterogeneous Stack with MongoDB & SQL
 
MongodB Internals
MongodB InternalsMongodB Internals
MongodB Internals
 
MongoDB Schema Design Tips & Tricks
MongoDB Schema Design Tips & TricksMongoDB Schema Design Tips & Tricks
MongoDB Schema Design Tips & Tricks
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Webinar: Choosing the Right Shard Key for High Performance and Scale
Webinar: Choosing the Right Shard Key for High Performance and ScaleWebinar: Choosing the Right Shard Key for High Performance and Scale
Webinar: Choosing the Right Shard Key for High Performance and Scale
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
Exploring the replication and sharding in MongoDB
Exploring the replication and sharding in MongoDBExploring the replication and sharding in MongoDB
Exploring the replication and sharding in MongoDB
 
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
How to leverage MongoDB for Big Data Analysis and Operations with MongoDB's A...
 
Back to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to ShardingBack to Basics 2017: Introduction to Sharding
Back to Basics 2017: Introduction to Sharding
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 
MongoDB Days Silicon Valley: Introducing MongoDB 3.2
MongoDB Days Silicon Valley: Introducing MongoDB 3.2MongoDB Days Silicon Valley: Introducing MongoDB 3.2
MongoDB Days Silicon Valley: Introducing MongoDB 3.2
 
Building Spring Data with MongoDB
Building Spring Data with MongoDBBuilding Spring Data with MongoDB
Building Spring Data with MongoDB
 
Mongo db dhruba
Mongo db dhrubaMongo db dhruba
Mongo db dhruba
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design Patterns
 
MongoDB Aggregation Performance
MongoDB Aggregation PerformanceMongoDB Aggregation Performance
MongoDB Aggregation Performance
 
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
MongoDB Days Silicon Valley: Jumpstart: Ops/Admin 101
 
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local Munich 2019: MongoDB Atlas Data Lake Technical Deep Dive
 
Webinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage EngineWebinar: Schema Patterns and Your Storage Engine
Webinar: Schema Patterns and Your Storage Engine
 
Webinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDBWebinar: Best Practices for Getting Started with MongoDB
Webinar: Best Practices for Getting Started with MongoDB
 
Back to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQLBack to Basics Webinar 1: Introduction to NoSQL
Back to Basics Webinar 1: Introduction to NoSQL
 

Andere mochten auch

MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right WayMongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right WayMongoDB
 
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?MongoDB
 
MongoDB Europe 2016 - MongoDB Atlas
MongoDB Europe 2016 - MongoDB AtlasMongoDB Europe 2016 - MongoDB Atlas
MongoDB Europe 2016 - MongoDB AtlasMongoDB
 
L’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneL’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneMongoDB
 
MongoDB Europe 2016 - Building WiredTiger
MongoDB Europe 2016 - Building WiredTigerMongoDB Europe 2016 - Building WiredTiger
MongoDB Europe 2016 - Building WiredTigerMongoDB
 
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...MongoDB
 
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and Kafka
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and KafkaMongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and Kafka
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and KafkaMongoDB
 
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDB
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDBMongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDB
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDBMongoDB
 
Live Demo: Introducing the Spark Connector for MongoDB
Live Demo: Introducing the Spark Connector for MongoDBLive Demo: Introducing the Spark Connector for MongoDB
Live Demo: Introducing the Spark Connector for MongoDBMongoDB
 
MongoDB Europe 2016 - Welcome
MongoDB Europe 2016 - WelcomeMongoDB Europe 2016 - Welcome
MongoDB Europe 2016 - WelcomeMongoDB
 
Past, Present and Future of Data Processing in Apache Hadoop
Past, Present and Future of Data Processing in Apache HadoopPast, Present and Future of Data Processing in Apache Hadoop
Past, Present and Future of Data Processing in Apache HadoopCodemotion
 
Blazing Fast Analytics with MongoDB & Spark
Blazing Fast Analytics with MongoDB & SparkBlazing Fast Analytics with MongoDB & Spark
Blazing Fast Analytics with MongoDB & SparkMongoDB
 
Lambda Architecture in Practice
Lambda Architecture in PracticeLambda Architecture in Practice
Lambda Architecture in PracticeNavneet kumar
 
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQuery
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQueryMongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQuery
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQueryMongoDB
 
MongoDB Europe 2016 - The Rise of the Data Lake
MongoDB Europe 2016 - The Rise of the Data LakeMongoDB Europe 2016 - The Rise of the Data Lake
MongoDB Europe 2016 - The Rise of the Data LakeMongoDB
 
My other computer is a datacentre - 2012 edition
My other computer is a datacentre - 2012 editionMy other computer is a datacentre - 2012 edition
My other computer is a datacentre - 2012 editionSteve Loughran
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeMongoDB
 
Apache Spark and MongoDB - Turning Analytics into Real-Time Action
Apache Spark and MongoDB - Turning Analytics into Real-Time ActionApache Spark and MongoDB - Turning Analytics into Real-Time Action
Apache Spark and MongoDB - Turning Analytics into Real-Time ActionJoão Gabriel Lima
 
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Tugdual Grall
 
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCF
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCFMongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCF
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCFMongoDB
 

Andere mochten auch (20)

MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right WayMongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
MongoDB Europe 2016 - ETL for Pros – Getting Data Into MongoDB The Right Way
 
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?
MongoDB Europe 2016 - Star in a Reasonably Priced Car - Which Driver is Best?
 
MongoDB Europe 2016 - MongoDB Atlas
MongoDB Europe 2016 - MongoDB AtlasMongoDB Europe 2016 - MongoDB Atlas
MongoDB Europe 2016 - MongoDB Atlas
 
L’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova GenerazioneL’architettura di Classe Enterprise di Nuova Generazione
L’architettura di Classe Enterprise di Nuova Generazione
 
MongoDB Europe 2016 - Building WiredTiger
MongoDB Europe 2016 - Building WiredTigerMongoDB Europe 2016 - Building WiredTiger
MongoDB Europe 2016 - Building WiredTiger
 
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
MongoDB Europe 2016 - Enabling the Internet of Things at Proximus - Belgium's...
 
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and Kafka
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and KafkaMongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and Kafka
MongoDB Europe 2016 - Powering Microservices with Docker, Kubernetes, and Kafka
 
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDB
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDBMongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDB
MongoDB Europe 2016 - Distributed Ledgers, Blockchain + MongoDB
 
Live Demo: Introducing the Spark Connector for MongoDB
Live Demo: Introducing the Spark Connector for MongoDBLive Demo: Introducing the Spark Connector for MongoDB
Live Demo: Introducing the Spark Connector for MongoDB
 
MongoDB Europe 2016 - Welcome
MongoDB Europe 2016 - WelcomeMongoDB Europe 2016 - Welcome
MongoDB Europe 2016 - Welcome
 
Past, Present and Future of Data Processing in Apache Hadoop
Past, Present and Future of Data Processing in Apache HadoopPast, Present and Future of Data Processing in Apache Hadoop
Past, Present and Future of Data Processing in Apache Hadoop
 
Blazing Fast Analytics with MongoDB & Spark
Blazing Fast Analytics with MongoDB & SparkBlazing Fast Analytics with MongoDB & Spark
Blazing Fast Analytics with MongoDB & Spark
 
Lambda Architecture in Practice
Lambda Architecture in PracticeLambda Architecture in Practice
Lambda Architecture in Practice
 
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQuery
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQueryMongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQuery
MongoDB Europe 2016 - Warehousing MongoDB Data using Apache Beam and BigQuery
 
MongoDB Europe 2016 - The Rise of the Data Lake
MongoDB Europe 2016 - The Rise of the Data LakeMongoDB Europe 2016 - The Rise of the Data Lake
MongoDB Europe 2016 - The Rise of the Data Lake
 
My other computer is a datacentre - 2012 edition
My other computer is a datacentre - 2012 editionMy other computer is a datacentre - 2012 edition
My other computer is a datacentre - 2012 edition
 
Unlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data LakeUnlocking Operational Intelligence from the Data Lake
Unlocking Operational Intelligence from the Data Lake
 
Apache Spark and MongoDB - Turning Analytics into Real-Time Action
Apache Spark and MongoDB - Turning Analytics into Real-Time ActionApache Spark and MongoDB - Turning Analytics into Real-Time Action
Apache Spark and MongoDB - Turning Analytics into Real-Time Action
 
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
Lambda Architecture: The Best Way to Build Scalable and Reliable Applications!
 
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCF
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCFMongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCF
MongoDB Europe 2016 - MongoDB, Ops Manager & Docker at SNCF
 

Ähnlich wie MongoDB Europe 2016 - Big Data meets Big Compute

How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...How sitecore depends on mongo db for scalability and performance, and what it...
How sitecore depends on mongo db for scalability and performance, and what it...Antonios Giannopoulos
 
Spark Summit EU talk by Ross Lawley
Spark Summit EU talk by Ross LawleySpark Summit EU talk by Ross Lawley
Spark Summit EU talk by Ross LawleySpark Summit
 
Architecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDBArchitecting Wide-ranging Analytical Solutions with MongoDB
Architecting Wide-ranging Analytical Solutions with MongoDBMatthew Kalan
 
Hybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsHybrid Transactional/Analytics Processing with Spark and IMDGs
Hybrid Transactional/Analytics Processing with Spark and IMDGsAli Hodroj
 
MongoDB World 2016: Scaling MongoDB with Docker and cGroups
MongoDB World 2016: Scaling MongoDB with Docker and cGroupsMongoDB World 2016: Scaling MongoDB with Docker and cGroups
MongoDB World 2016: Scaling MongoDB with Docker and cGroupsMongoDB
 
Scaling MongoDB with Docker and cgroups
Scaling MongoDB with Docker and cgroupsScaling MongoDB with Docker and cgroups
Scaling MongoDB with Docker and cgroupsmarcoita
 
MongoDB at Scale
MongoDB at ScaleMongoDB at Scale
MongoDB at ScaleMongoDB
 
MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014Dylan Tong
 
MongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceMongoDB : Scaling, Security & Performance
MongoDB : Scaling, Security & PerformanceSasidhar Gogulapati
 
Jumpstart: Your Introduction to MongoDB
Jumpstart: Your Introduction to MongoDBJumpstart: Your Introduction to MongoDB
Jumpstart: Your Introduction to MongoDBMongoDB
 
Scaling MongoDB
Scaling MongoDBScaling MongoDB
Scaling MongoDBMongoDB
 
Agility and Scalability with MongoDB
Agility and Scalability with MongoDBAgility and Scalability with MongoDB
Agility and Scalability with MongoDBMongoDB
 
MongoDB Knowledge Shareing
MongoDB Knowledge ShareingMongoDB Knowledge Shareing
MongoDB Knowledge ShareingPhilip Zhong
 
Accra MongoDB User Group
Accra MongoDB User GroupAccra MongoDB User Group
Accra MongoDB User GroupMongoDB
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopDataWorks Summit
 
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDBMongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDB
MongoDB Days Silicon Valley: Winning the Dreamforce Hackathon with MongoDBMongoDB
 
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
 Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F... Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...
Scalable Monitoring Using Prometheus with Apache Spark Clusters with Diane F...Databricks
 
Everything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBEverything We Learned About In-Memory Data Layout While Building VoltDB
Everything We Learned About In-Memory Data Layout While Building VoltDBjhugg
 

MongoDB Europe 2016 - Big Data meets Big Compute

  • 1. Big Data meets Big Compute: Connecting MongoDB and Spark for Fun and Profit. { name: "Ross Lawley", role: "Senior Software Engineer", twitter: "@RossC0" }
  • 2. #MDBE16 Agenda. 01 Spark introduction: What is Spark? How does it work? What problems can it solve? What's the future of Spark? 02 The new connector: Introducing the new connector. How to install it and use it. How to use it in various languages. When to use it, when not to. 03 Internals: A deep dive into the connector. Configuration options. Partitioning challenges. How to scale and keep data local. 04 Demo: An impressive demonstration of MongoDB and Spark combined! 05 Conclusions: Quick recap. Where to go for more information. 06 Questions: I'll try and help answer any questions you might have. I'll also answer questions at the Drivers booth!
  • 4. What is Spark? A fast and distributed general computing engine. It makes it easy and fast to process large datasets; has libraries for SQL, streaming, machine learning and graphs; offers APIs in Scala, Python, Java and R; and is fundamentally different to what's come before.
  • 5. So why not just use Hadoop? Spark is FAST. Faster to write: friendly API in Scala, Python, Java and R. Faster to run: up to 100x faster than Hadoop in memory, 10x faster on disk.
  • 7. Spark History. 2009: The beginning, Spark project started at UC Berkeley's AMPLab. 2010: Spark open sourced. 2013: Joined the Apache Software Foundation. 2014: Spark 1.0.0 to 1.2.0 (Scala, Java & Python, Spark SQL, Streaming, MLlib, GraphX). 2015: Spark 1.3.0 to 1.5.0 (R support, Spark SQL out of alpha, DataFrames). 2016: Spark 1.6.0 and Spark 2.0 (Datasets, Structured Streams).
  • 8. Spark Programming Model: Resilient Distributed Datasets. An RDD is a collection of elements that is immutable, distributed and fault-tolerant. Transformations can be applied to an RDD, resulting in a new RDD. Actions can be applied to an RDD to obtain a value. RDDs are lazy.
  • 9. RDD Operations. Transformations: map, filter, flatMap, mapPartitions, sample, union, join, groupByKey, reduceByKey. Actions: reduce, collect, count, save, lookupKey, take, foreach.
  • 10. Built in fault tolerance. RDDs maintain lineage information that can be used to reconstruct lost partitions: val searches = spark.textFile("hdfs://...").filter(_.contains("Search")).map(_.split("\t")(2)).cache().filter(_.contains("MongoDB")).count() Lineage: HDFS RDD, Filtered RDD, Mapped RDD, Cached RDD, Filtered RDD, Count.
  • 11. Spark topology (diagram: Spark Driver, Cluster Manager, Worker 1, Worker 2, ... Worker n, Data source).
  • 13. Spark high level view: RDDs (unstructured data), Datasets (structured data), Spark SQL, Spark Streaming, MLlib, GraphX.
  • 15. Connecting MongoDB and Spark: Big Data Storage, Big Data Compute.
  • 16. Different use cases: applications, OLTP, fine grained operations; offline processing, analytics, data warehousing.
  • 17. The MongoDB Spark Connector.
  • 18. The MongoDB Spark Connector: Spark 1.6.x and Spark 2.0.x; Scala, Python, Java, and R; idiomatic Scala API; supports custom aggregations; multiple partitioning strategies; automatic schema inference; automatic conversion to Datasets. > $SPARK_HOME/bin/spark-shell --packages org.mongodb.spark:mongo-spark-connector_2.10:2.0.0
  • 19. "Users are already combining Apache Spark and MongoDB to build sophisticated analytics applications. The new native MongoDB Connector for Apache Spark provides higher performance, greater ease of use, and access to more advanced Apache Spark functionality than any MongoDB connector available today." Reynold Xin, Co-Founder and Chief Architect at Databricks.
  • 20. Fare Calculation Engine: One of World's Largest Airlines Migrates from Oracle to MongoDB and Apache Spark to Support 100x performance improvement. Problem: China Eastern targeting 130,000 seats sold every day across its web and mobile channels. New fare calculation engine needed to support 20,000 search queries per second, but current Oracle platform supported only 200 per second. Solution: Apache Spark used for fare calculations, using business rules stored in MongoDB. Fare calculations written to MongoDB for access by the search application. MongoDB Connector for Apache Spark allows seamless integration with data locality awareness across the cluster. Results: Cluster of fewer than 20 API, Spark & MongoDB nodes supports 180m fare calculations & 1.6 billion searches per day. Each node delivers 15x higher performance and 10x lower latency than existing Oracle servers. MongoDB Enterprise Advanced provided Ops Manager for operational automation and access to expert technical support.
  • 22. What's needed to connect to Spark? 1. Create a connection. This has some cost: the Mongo Java Driver runs a connection pool, authenticates connections, performs replica set discovery, etc. Only two modes to support: reads and writes.
  • 23. What's needed to connect to Spark? 2. Partition the data. Partitions provide parallelism by splitting the collection into parts. Mutable data sources are a challenge, as the collection is not a snapshot in time. (Diagram: RDD / Collection.)
  • 24. MongoSamplePartitioner: the default partitioner. It over-samples the collection: calculates the number of partitions, using the average document size and the configured partition size; samples the collection, taking n documents per partition; sorts the data by partition key; takes every nth sample as a partition boundary; adds a min and max key partition split at the start and end of the collection. Example partitions: {$gte: {_id: minKey}, $lt: {_id: 1}}, {$gte: {_id: 1}, $lt: {_id: 100}}, {$gte: {_id: 100}, $lt: {_id: 200}}, ..., {$gte: {_id: 4900}, $lt: {_id: 5000}}, {$gte: {_id: 5000}, $lt: {_id: maxKey}}
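The sampling steps on this slide can be sketched in plain Scala, with no MongoDB or Spark involved. This is only an illustration of the idea, not the connector's implementation: `SamplePartitionSketch`, `Partition` and `partitions` are hypothetical names, integer keys stand in for `_id` values, and `None` stands in for `minKey` / `maxKey`.

```scala
// Illustrative sketch only: hypothetical names, not the connector's API.
object SamplePartitionSketch {
  // A partition covers keys in [lower, upper); None stands in for minKey / maxKey.
  final case class Partition(lower: Option[Int], upper: Option[Int])

  // Sort the sampled keys, keep every n-th one as a split point,
  // then add open-ended partitions at both ends of the key range.
  def partitions(sampledKeys: Seq[Int], samplesPerPartition: Int): Seq[Partition] = {
    require(samplesPerPartition > 0, "samplesPerPartition must be positive")
    val sorted = sampledKeys.distinct.sorted
    val splits = sorted.zipWithIndex.collect {
      case (key, i) if i % samplesPerPartition == 0 => key
    }
    val bounds: Seq[Option[Int]] = (None +: splits.map(k => Option(k))) :+ None
    bounds.zip(bounds.tail).map { case (lo, hi) => Partition(lo, hi) }
  }

  def main(args: Array[String]): Unit = {
    val sample = Seq(250, 1, 100, 50, 300, 200, 150, 350)
    partitions(sample, samplesPerPartition = 2).foreach(println)
  }
}
```

Each resulting `Partition` corresponds to one `$gte` / `$lt` range query of the kind listed on the slide.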
  • 25. MongoShardedPartitioner: sharded collections are already partitioned. Examines the shard config database; creates partitions based on the shard chunk min and max ranges; stores the shard location data for the chunk, to help promote locality; adds a min and max key partition split at the start and end of the collection. Example partitions: {$gte: {_id: minKey}, $lt: {_id: 1}}, ..., {$gte: {_id: 194}, $lt: {_id: 232}}, ..., {$gte: {_id: 1000}, $lt: {_id: maxKey}}
  • 26. Alternative Partitioners. MongoSplitVectorPartitioner: a partitioner for standalone deployments or replica sets; the command requires special privileges. MongoPaginateByCountPartitioner: creates a maximum number of partitions; costs a query to calculate each partition. MongoPaginateBySizePartitioner: as above, but using the average document size to determine the partitions. Create your own: just implement the MongoPartitioner trait and add the full path to the config.
  • 27. What's needed to connect to Spark? 3. Support DataFrames & Datasets. RDDs with a schema. Supports simple types: BinaryType, BooleanType, ByteType, CalendarIntervalType, DateType, DoubleType, FloatType, IntegerType, LongType, NullType, ShortType, StringType, TimestampType. Complex types: ArrayType (typed array), StructType (map). Unsupported BSON types use a StructType similar to extended JSON.
  • 28. DataFrames & Datasets. Automatic schema inference: val dataFrame = MongoSpark.load(sparkSession) Supply the schema: case class Person(firstName: String, lastName: String) val dataFrame = MongoSpark.load[Person](sparkSession)
  • 29. What's needed to connect to Spark? 4. Configuration. Read config: uri, database, collection, partitioner, sampleSize, localThreshold, readPreference, readConcern. Write config: uri, database, collection, writeConcern.
  • 30. The anatomy of a read: MongoSpark.load(sparkSession).count() 1. Create a MongoRDD[Row]. 2. Infer the schema (none provided). 3. Partition the data. 4. Calculate the partitions. 5. Allocate the workers. 6. For each partition on each worker: (i) query and return the cursor; (ii) iterate the cursor and sum up the data. 7. Finally, the Spark application returns the sum of the sums.
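Steps 6 and 7 of this read, a per-partition count followed by a "sum of the sums", can be mimicked with plain Scala collections. This is a sketch under stated assumptions, not connector or Spark code: each inner `Seq` stands in for the documents one worker reads from its partition's cursor, and `SumOfSumsSketch`, `countPerPartition` and `totalCount` are hypothetical names.

```scala
// Illustrative sketch only: plain collections stand in for partitions and cursors.
object SumOfSumsSketch {
  // Step 6: each "worker" iterates its own cursor and counts what it sees.
  def countPerPartition[A](partitions: Seq[Seq[A]]): Seq[Long] =
    partitions.map(docs => docs.iterator.foldLeft(0L)((n, _) => n + 1))

  // Step 7: the driver adds up the per-partition counts.
  def totalCount[A](partitions: Seq[Seq[A]]): Long =
    countPerPartition(partitions).sum

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq("a", "b", "c"), Seq("d"), Seq.empty[String], Seq("e", "f"))
    println(countPerPartition(partitions)) // per-partition counts
    println(totalCount(partitions))        // sum of the sums
  }
}
```

Only the small per-partition counts travel back to the driver, never the documents themselves.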
  • 31. Performance. MongoDB usual suspects: document design, indexes, read concern. Spark specifics: partitioning strategy, data locality.
  • 32. Data locality.
  • 33. Data locality (diagram: multiple MongoS instances).
  • 34. Data locality. Configure: LocalThreshold, MongoShardedPartitioner (diagram: multiple MongoS instances).
  • 35. Data locality. Configure: ReadPreference, LocalThreshold, MongoShardedPartitioner (diagram: multiple MongoD and MongoS instances).
  • 37. Scenario: You've won the EuroMillions lottery! To celebrate you want to travel to Europe's largest 50 cities! The nouveau riche only have one way to travel: in style, by personal helicopter! It's a logistical nightmare, the "Travelling Salesman Problem".
  • 38. The scale of the problem. With 50 places to visit there are 49 x 48 x 47 x … x 3 x 2 x 1 possible ways to travel between them. This number is 63 digits long: 608,281,864,034,267,560,872,252,163,321,295,376,887,552,831,379,210,240,000,000,000. We don't need to calculate all possible routes, just a route that is good enough.
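The route count on this slide is 49 factorial, which is easy to check with arbitrary-precision integers. A minimal sketch follows; `RouteCount` and `factorial` are names chosen here for illustration.

```scala
// Illustrative check of the slide's arithmetic using BigInt.
object RouteCount {
  // n! as an arbitrary-precision integer.
  def factorial(n: Int): BigInt =
    (1 to n).foldLeft(BigInt(1))((acc, i) => acc * BigInt(i))

  def main(args: Array[String]): Unit = {
    val routes = factorial(49)
    println(routes)                 // the full number of orderings
    println(routes.toString.length) // prints 63
  }
}
```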
  • 39. Choosing MongoDB and Spark. Good fit: not possible directly via the aggregation framework; CPU intensive task; needs code to solve the problem. Bad fit: not an obviously parallel problem; can fork, divide and join using Spark.
  • 40. Finding a solution with a genetic algorithm. Slightly complex, but basically we're using evolution. Randomly generate a number of routes, then "evolve" the routes over a number of generations: cross over two parent routes to create a child route; randomly mutate a percentage of the child routes; keep a percentage of the best routes. After X generations we end up with an evolved route that is short.
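The evolution loop described on this slide might look roughly like the single-machine sketch below. It is not the demo's code: the slide does not say which crossover or mutation operators were used, so ordered crossover and a two-position swap are assumptions made here, and `GeneticRouteSketch` and every function and parameter name in it are hypothetical.

```scala
import scala.util.Random

// Illustrative sketch only: operators and names are assumptions, not the demo's code.
object GeneticRouteSketch {
  type City = (Double, Double)
  type Route = Vector[Int] // a permutation of city indices

  def distance(a: City, b: City): Double = math.hypot(a._1 - b._1, a._2 - b._2)

  // Length of the closed tour that visits the cities in route order.
  def tourLength(route: Route, cities: Vector[City]): Double =
    route.indices.map { i =>
      distance(cities(route(i)), cities(route((i + 1) % route.length)))
    }.sum

  // Ordered crossover: keep a slice of parent a in place,
  // fill the remaining positions in parent b's order.
  def crossover(a: Route, b: Route, rnd: Random): Route = {
    val i = rnd.nextInt(a.length)
    val j = rnd.nextInt(a.length)
    val (from, until) = (math.min(i, j), math.max(i, j) + 1)
    val slice = a.slice(from, until)
    val rest = b.filterNot(slice.contains)
    rest.take(from) ++ slice ++ rest.drop(from)
  }

  // Mutation: swap two randomly chosen positions.
  def mutate(route: Route, rnd: Random): Route = {
    val i = rnd.nextInt(route.length)
    val j = rnd.nextInt(route.length)
    route.updated(i, route(j)).updated(j, route(i))
  }

  def evolve(cities: Vector[City], populationSize: Int, generations: Int,
             keepFraction: Double, mutationRate: Double, rnd: Random): Route = {
    require(populationSize >= 2, "need at least two routes to breed")
    val base: Route = cities.indices.toVector
    var population = Vector.fill(populationSize)(rnd.shuffle(base))
    for (_ <- 1 to generations) {
      // Keep a percentage of the best routes.
      val ranked = population.sortBy(r => tourLength(r, cities))
      val survivors = ranked.take(math.max(2, (populationSize * keepFraction).toInt))
      // Refill the population with children, mutating a percentage of them.
      val children = Vector.fill(populationSize - survivors.length) {
        val child = crossover(survivors(rnd.nextInt(survivors.length)),
                              survivors(rnd.nextInt(survivors.length)), rnd)
        if (rnd.nextDouble() < mutationRate) mutate(child, rnd) else child
      }
      population = survivors ++ children
    }
    population.minBy(r => tourLength(r, cities))
  }

  def main(args: Array[String]): Unit = {
    val rnd = new Random(42)
    val cities = Vector.fill(10)((rnd.nextDouble(), rnd.nextDouble()))
    val best = evolve(cities, populationSize = 50, generations = 100,
                      keepFraction = 0.2, mutationRate = 0.1, rnd = rnd)
    println(best)
    println(tourLength(best, cities))
  }
}
```

Because the survivors are carried over unchanged, the best route in the population never gets longer from one generation to the next.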
  • 42. An extremely powerful combination. Many possible use cases. Solve the right problems: some operations may be faster if performed using the Aggregation Framework. Performance: pick the correct partitioning strategy, tune MongoDB, tune Spark. Spark is evolving all the time.