Introduction to Apache Spark
www.mammothdata.com | @mammothdataco
Lab Overview
● ‘Hello world’ RDD example
● Importing a dataset
● Dataframe operations and visualizations
● Using MLlib on the dataset
Lab — Hello World
● ./run_spark
Lab — Hello World
● val text = sc.parallelize(Seq("your text here"))
● val words = text.flatMap(line => line.split(" "))
● words.collect
Lab — Hello World
● val taggedWords = words.map(word => (word,1))
● val counts = taggedWords.reduceByKey(_ + _)
● counts.collect()
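The same word count can be sketched on plain Scala collections, which may help show what each step does before Spark distributes it. This is only an illustration: `WordCountSketch` and `wordCounts` are hypothetical names, and `groupBy` here is a local stand-in for the shuffle that `reduceByKey` performs on a cluster.

```scala
// Plain-Scala sketch of the word count above; no Spark dependency.
object WordCountSketch {
  // Split each line on spaces, tag every word with 1, then sum the 1s per word.
  def wordCounts(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))    // mirrors text.flatMap(...)
      .map(word => (word, 1))              // mirrors words.map(...)
      .groupBy { case (word, _) => word }  // local stand-in for reduceByKey's grouping
      .map { case (word, tagged) => (word, tagged.map(_._2).sum) }
}
```

For example, `WordCountSketch.wordCounts(Seq("to be or not to be"))` counts "to" and "be" twice each and "or" and "not" once each.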
Lab — Dataset
● https://archive.ics.uci.edu/ml/datasets/Wine
● Chemical analysis of 3 different types (cultivars) of wine grown in the same region of Italy, published by researchers in Genoa
● 178 entries (small!)
Lab — Loading The Wine Dataset
● val wines = sqlContext.read.json("wine.json")
● wines.registerTempTable("wines")
Lab — Showing the generated Schema
● wines.printSchema
Lab — Dataframe Operations
● wines.first
Lab — Dataframe Operations
● sqlContext.sql("SELECT Type, count(Type) AS count FROM wines GROUP BY Type").show
Lab — Dataframe Operations
● Experiment with %sql on the dataset (SELECT, COUNT, etc.)
Lab — K-means Clustering
● K-Means clustering is an unsupervised algorithm which splits a
dataset into a number of clusters (k) based on a notion of
similarity between points. It is often applied to real-world data
to obtain a picture of structure hidden in large datasets, for
example, identifying location clusters or breaking down sales
into distinct purchasing groups.
Lab — K-means Clustering
k initial "means" (in this case k=3)
are randomly generated within the
data domain (shown in colour).
Lab — K-means Clustering
k (in this case, 3) clusters are
created by assigning each data
point to its closest mean.
Lab — K-means Clustering
The centroid of each of these
clusters is computed, and these
centroids are used as the new means.
New clusters are formed by assigning
each data point to its closest new
mean, as shown in Step 2. The
process is repeated until the means
converge (or until we hit our
iteration limit).
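The three steps above can be sketched in plain Scala. This is a minimal illustration, not MLlib's implementation: `KMeansSketch` and its helpers are hypothetical names, and the initial means are passed in as an argument rather than generated at random.

```scala
import scala.annotation.tailrec

// Plain-Scala sketch of the assign/update loop; not MLlib's implementation.
object KMeansSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the mean closest to p (ties go to the lower index).
  def closest(p: Point, means: Seq[Point]): Int =
    means.indices.minBy(i => sqDist(p, means(i)))

  // Component-wise average of a non-empty group of points.
  def centroid(points: Seq[Point]): Point =
    points.transpose.map(_.sum / points.size).toVector

  // Repeat assignment and centroid update until the means stop moving
  // or the iteration limit is reached.
  @tailrec
  def run(points: Seq[Point], means: Seq[Point], maxIterations: Int): Seq[Point] =
    if (maxIterations <= 0) means
    else {
      val clusters = points.groupBy(p => closest(p, means))
      // A mean that attracted no points is left where it is.
      val updated = means.indices.map(i => clusters.get(i).map(centroid).getOrElse(means(i)))
      if (updated == means) means else run(points, updated, maxIterations - 1)
    }
}
```

Calling `KMeansSketch.run(points, initialMeans, 20)` returns the final means. The exact-equality check on the means is a simplification; a tolerance on how far the means moved would be a more usual convergence test.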
Lab — K-means Clustering: Imports
● import org.apache.spark.mllib.clustering.KMeans
● import org.apache.spark.mllib.linalg.Vectors
● import org.apache.spark.sql._
Lab — K-means Clustering: Features
● val featureCols = wines.select("Alcohol", "Hue", "Proline")
● val features = featureCols.rdd.map { case Row(a: Double, h: Double, p: Double) => Vectors.dense(a, h, p) }
● features.cache
Lab — K-means Clustering: Training Model
● val numClusters = 2
● val numIterations = 20
● val model = KMeans.train(features, numClusters, numIterations)
Lab — K-means Clustering: Finding k
● k can be any number you like!
● WSSSE: Within Set Sum of Squared Errors
● The sum of squared distances between each point and its nearest
centroid
● val wssse = model.computeCost(features)
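To make the definition concrete, WSSSE can be computed by hand on a few points. This is a plain-Scala sketch under the definition above, not how MLlib computes it internally; `WssseSketch` and its members are hypothetical names.

```scala
// Plain-Scala sketch of WSSSE; illustrates the definition only.
object WssseSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // For each point, take the squared distance to its nearest centroid,
  // then add those distances up.
  def wssse(points: Seq[Point], centroids: Seq[Point]): Double =
    points.map(p => centroids.map(c => sqDist(p, c)).min).sum
}
```

With the four points (0,0), (0,2), (10,0), (10,2), the two centroids (0,1) and (10,1) give a WSSSE of 4.0, while the single centroid (5,1) gives 104.0, which shows how a better-fitting set of centroids lowers the value.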
Lab — K-means Clustering: Finding k
● Test on k = 1 to 5
● (1 to 5).map(k => KMeans.train(features, k, numIterations).computeCost(features))
● WSSSE normally decreases as k increases
● Look for the ‘elbow’
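One way to look for the elbow programmatically is to compare how much WSSSE drops before and after each k. The sketch below is a hypothetical heuristic, not part of MLlib, and `ElbowSketch`, `drops` and `elbow` are made-up names; on real data, a plot of the costs is usually a better guide.

```scala
// Hypothetical elbow heuristic on a list of WSSSE values; not part of MLlib.
object ElbowSketch {
  // Reduction in WSSSE gained by moving from k to k + 1.
  // costs(0) is the WSSSE for k = 1, costs(1) for k = 2, and so on.
  def drops(costs: Seq[Double]): Seq[Double] =
    costs.zip(costs.tail).map { case (prev, next) => prev - next }

  // Pick the k after which the drop shrinks the most.
  def elbow(costs: Seq[Double]): Int = {
    require(costs.length >= 3, "need WSSSE values for at least three values of k")
    val d = drops(costs)
    val bend = d.zip(d.tail).map { case (before, after) => before - after }
    bend.indexOf(bend.max) + 2 // bend(0) describes the change around k = 2
  }
}
```

With the made-up costs `Seq(100.0, 60.0, 20.0, 18.0, 17.0)` for k = 1 to 5, the drops are 40, 40, 2 and 1, so the heuristic picks k = 3.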
Lab — K-means Clustering: Training Model
● val numClusters = 1
● val numIterations = 20
● val wssse = KMeans.train(features, numClusters, numIterations).computeCost(features)
Lab — K-means Clustering: k = 3
● val numClusters = 3
● val numIterations = 10
● val model = KMeans.train(features, numClusters, numIterations)
Lab — K-means Clustering: Obtaining Type Predictions
● val predictions = features.map(feature => model.predict(feature))
Lab — K-means Clustering: Comparing To Labels
● val counts = predictions.map (p => (p,1)).reduceByKey(_+_)
● counts.collect
Lab — Next Steps
● Looks good, right? Let’s look at what the labels for each point
really are.
● val labeledCols = wines.select("Type", "Alcohol", "Hue", "Proline")
● val labeled = labeledCols.rdd.map { case Row(t: Double, a: Double, h: Double, p: Double) => (t, Vectors.dense(a, h, p)) }
● val labeledPredictions = labeled.map(point => (point._1, model.predict(point._2)))
● val labelCounts = labeledPredictions.map(p => (p, 1)).reduceByKey(_ + _)
● labelCounts.collect
● A slightly different story!
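The (label, cluster) counts can be summarised further once they are collected. The sketch below works on plain Scala pairs and assumes labels are Doubles and cluster ids are Ints, as in the code above; `CrossTabSketch`, `crossTab` and `purity` are hypothetical names. Keep in mind that cluster ids are arbitrary, so cluster 0 need not correspond to wine type 1.

```scala
// Plain-Scala sketch for comparing true labels with predicted clusters.
object CrossTabSketch {
  // Count how often each (true label, predicted cluster) pair occurs.
  def crossTab(pairs: Seq[(Double, Int)]): Map[(Double, Int), Int] =
    pairs.groupBy(identity).map { case (pair, hits) => (pair, hits.size) }

  // Purity: for each cluster, count the points carrying its most common
  // label, then divide the total of those counts by the number of points.
  def purity(pairs: Seq[(Double, Int)]): Double = {
    val perCluster = pairs.groupBy { case (_, cluster) => cluster }
    val majority = perCluster.values.map(group => group.groupBy(_._1).values.map(_.size).max)
    majority.sum.toDouble / pairs.size
  }
}
```

A purity of 1.0 would mean every cluster contains a single wine type; lower values mean the clusters mix types.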
Lab — Next Steps
● k-means clustering - useful! But not perfect!
● Try again with more features in the vector and see if it
improves the clustering.
● Naive Bayes? Random Forests? All in MLlib and with similar
interfaces!
Lab — Next Steps
● spark.apache.org
Lab — Questions
● ?
Intro to Apache Spark - Lab