SlideShare ist ein Scribd-Unternehmen logo
1 von 14
Downloaden Sie, um offline zu lesen
Jag Jayaprakash
Ganesh Tiwari
Production Readiness Testing
Using Apache Spark
​FORCE
​Model-driven
development
platform
​HEROKU
​Polyglot platform for
elastic scale
​APP EXCHANGE
​Enterprise App
Marketplace
​LIGHTNING
​Visual Development
Platform
​THUNDER
​Stream & event based
primitives
Background: Salesforce App Cloud
Background:Apex
​STRONGLY TYPED
​Direct references to
schema objects
​LOOKS LIKE JAVA
​Acts like database
stored procedures
​EASY TO TEST
​Built-in support for creation &
execution of unit tests
​OBJECT ORIENTED
​Visual Development
Platform
​CLOUD HOSTED COMPILER
​Interpreted & executed on on
multitenant environment
Background: Hammer Process
(Aka Production Readiness Testing)
BASELINE Execution
Results
tests
tests
tests
HAMMER
Cluster 3
Cluster
2
Cluster
1
Cluster
5
UPGRADED
tests
tests
tests
​CUSTOMER UNIT TESTS ARE EXECUTED TWICE
​in Salesforce secured environment in data centers
​2nd EXECUTION CALLED THE UPGRADED
​on release candidate version
​UPGRADED AND BASELINE RESULTS ARE COMPARED
​When a test passes, it should pass in both versions.
​When a test fails, it should fail in both versions.
​Any other outcome is a potential bug in release candidate version!
​1st EXECUTION CALLED THE BASELINE
​on current productionversion BASELINE
Challenges
Infrastructure setup to
execute hammer on
two different platform
versions
Persist and compare
test execution results
Tests are executed
highly secured data
centers
150+ million customer
tests and growing
Internal SLA to keep
these mammoth efforts
to under 3 weeks
Hammer Core:A Functional Overview
Cluster 3
Cluster
2
Cluster
1
Cluster
5
Data Ingestion
(Secured Internal APIs)
Clustering
(Spark ML)
Each cluster is a
potential bug
Extract Transform Load
(Spark SQL)
Hammer:The Architecture
....
Web Server
Salesforce Hammer UI
Salesforce Hammer Core
Secured ExecutionEnvironment
Internal APIData Store
Data Preparation (ETL): Using Apache SparkSQL
​Example oftypical test executionresult ​Transformation Output
Baseline : Expected: 10, Actual: 10
Upgrade : Expected: 10, Actual: 10
PASS - PASS NOTA BUG
Baseline : NullPointerException: Attempt to de-reference a null object
Upgrade : NullPointerException: Attempt to de-reference a null object
FAIL - FAIL NOTA BUG
Baseline : Expected: 10, Actual: 10
Upgrade : NullPointerException: Attempt to de-reference a null object
PASS - FAIL
FOR FURTHER
ANALYSIS
Baseline : Expected: 10, Actual: 21
Upgrade : Expected: 10, Actual: 51 FAIL – FAIL’
FOR FURTHER
ANALYSIS
Spark Machine Learning Pipeline
Group test failure
records into K Clusters
Each cluster is a
potential bug in
Salesforce App Cloud
platform
Enables inspection of
cluster to determine if
it’s a bug or not
Designed to operate on
records marked “FOR
FURTHER
ANALYSIS”
Clustering Using Apache Spark Machine Learning
Filter
Transformer
Baseline: Passed
Upgrade: System.DmlException: Insert failed. First exception on 2016/3/4 first error: Cannot insert date 2016/3/4
Tokenizer
Transformer
Baseline: Passed
Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “on”, “2016/3/4”, “first”, “error”, “cannot”,
“insert”, “date”, “2016/3/4”]
Baseline: Passed
Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “on”, “2016/3/4”, “first”, “error”, “cannot”,
“insert”, “date”, “2016/3/4”]
Stop Words
Remover
Transformer
Canonicalizer
Transformer
Baseline: Passed
Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “<date>”:, “first”, “error”, “cannot”, “insert”,
“date”, “<date>”]
Baseline: Passed
Upgrade: [[100, 101, 123, 345, 543, 435, 213,321, 312,102],[1,2,1,1,1,1,1,2,1,2]]
TF Calculator
Transformer
Sparse vector
format
K-means
Clustering
Accomplishments
Test Records Analyzed
Old Hammer Engine
(hours)
New Hammer with Spark
(minutes)
Speed
Improvement
241K 7.5 9 97.9 %
562K 7.8 13 97.2 %
269K 8 11.5 97.6 %
242K 11.2 10 97.9 %
394K 14.2 20 97.0 %
374K 12.5 12 98.4 %
​In Extract Transform Load process -
Accomplishments
​In Clustering Analysis -
Well formed clusters
yielding to good quality
bugs
Speed – On an average
clustering took 40 minutes to
complete for 100K+ records
Fewer clusters to
analyze
Q & A
thank y u

Weitere ähnliche Inhalte

Was ist angesagt?

Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...
Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...
Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...Spark Summit
 
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit
 
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesDatabricks
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit
 
Apache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionApache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionDatabricks
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit
 
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingStructured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingDatabricks
 
Spark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit
 
Spark Summit EU talk by Rolf Jagerman
Spark Summit EU talk by Rolf JagermanSpark Summit EU talk by Rolf Jagerman
Spark Summit EU talk by Rolf JagermanSpark Summit
 
Apache Flink vs Apache Spark - Reproducible experiments on cloud.
Apache Flink vs Apache Spark - Reproducible experiments on cloud.Apache Flink vs Apache Spark - Reproducible experiments on cloud.
Apache Flink vs Apache Spark - Reproducible experiments on cloud.Shelan Perera
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit
 
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...Spark Summit
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Databricks
 
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration storyJoan Viladrosa Riera
 
Huawei Advanced Data Science With Spark Streaming
Huawei Advanced Data Science With Spark StreamingHuawei Advanced Data Science With Spark Streaming
Huawei Advanced Data Science With Spark StreamingJen Aman
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Databricks
 
An AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance ManagementAn AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance ManagementDatabricks
 
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Provectus
 
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...Spark Summit
 

Was ist angesagt? (20)

Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...
Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...
Scaling Apache Spark MLlib to Billions of Parameters: Spark Summit East talk ...
 
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
 
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
 
Apache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionApache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and Production
 
Spark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim HunterSpark Summit EU talk by Tim Hunter
Spark Summit EU talk by Tim Hunter
 
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim DowlingStructured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
Structured-Streaming-as-a-Service with Kafka, YARN, and Tooling with Jim Dowling
 
Spark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon Whitear
 
Spark Summit EU talk by Rolf Jagerman
Spark Summit EU talk by Rolf JagermanSpark Summit EU talk by Rolf Jagerman
Spark Summit EU talk by Rolf Jagerman
 
Apache Flink vs Apache Spark - Reproducible experiments on cloud.
Apache Flink vs Apache Spark - Reproducible experiments on cloud.Apache Flink vs Apache Spark - Reproducible experiments on cloud.
Apache Flink vs Apache Spark - Reproducible experiments on cloud.
 
Spark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca CanaliSpark Summit EU talk by Luca Canali
Spark Summit EU talk by Luca Canali
 
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
 
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS...
 
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story[Spark Summit EU 2017] Apache spark streaming + kafka 0.10  an integration story
[Spark Summit EU 2017] Apache spark streaming + kafka 0.10 an integration story
 
Huawei Advanced Data Science With Spark Streaming
Huawei Advanced Data Science With Spark StreamingHuawei Advanced Data Science With Spark Streaming
Huawei Advanced Data Science With Spark Streaming
 
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad FeinbergSpark Summit EU talk by Ram Sriharsha and Vlad Feinberg
Spark Summit EU talk by Ram Sriharsha and Vlad Feinberg
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...
 
An AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance ManagementAn AI-Powered Chatbot to Simplify Apache Spark Performance Management
An AI-Powered Chatbot to Simplify Apache Spark Performance Management
 
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
Data Summer Conf 2018, “Building unified Batch and Stream processing pipeline...
 
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
Highlights and Challenges from Running Spark on Mesos in Production by Morri ...
 

Andere mochten auch

Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Spark Summit
 
Spark Summit EU 2015: SparkUI visualization: a lens into your application
Spark Summit EU 2015: SparkUI visualization: a lens into your applicationSpark Summit EU 2015: SparkUI visualization: a lens into your application
Spark Summit EU 2015: SparkUI visualization: a lens into your applicationDatabricks
 
Spark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark Summit
 
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...Spark Summit
 
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)Spark Summit
 
An Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal MalohlavaAn Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal MalohlavaSpark Summit
 
Spark Tuning for Enterprise System Administrators By Anya Bida
Spark Tuning for Enterprise System Administrators By Anya BidaSpark Tuning for Enterprise System Administrators By Anya Bida
Spark Tuning for Enterprise System Administrators By Anya BidaSpark Summit
 
Continuous Integration for Spark Apps by Sean McIntyre
Continuous Integration for Spark Apps by Sean McIntyreContinuous Integration for Spark Apps by Sean McIntyre
Continuous Integration for Spark Apps by Sean McIntyreSpark Summit
 
Insights into Customer Behavior from Clickstream Data by Ronald Nowling
Insights into Customer Behavior from Clickstream Data by Ronald NowlingInsights into Customer Behavior from Clickstream Data by Ronald Nowling
Insights into Customer Behavior from Clickstream Data by Ronald NowlingSpark Summit
 
Beyond Parallelize and Collect by Holden Karau
Beyond Parallelize and Collect by Holden KarauBeyond Parallelize and Collect by Holden Karau
Beyond Parallelize and Collect by Holden KarauSpark Summit
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Spark Summit
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonSpark Summit
 
Production Readiness Testing Using Spark
Production Readiness Testing Using SparkProduction Readiness Testing Using Spark
Production Readiness Testing Using SparkSalesforce Engineering
 
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Spark Summit
 
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...Spark Summit
 
Spark summit2014 techtalk - testing spark
Spark summit2014 techtalk - testing sparkSpark summit2014 techtalk - testing spark
Spark summit2014 techtalk - testing sparkAnu Shetty
 
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...Spark Summit
 
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...Spark Summit
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiSpark Summit
 
Building Robust, Adaptive Streaming Apps with Spark Streaming
Building Robust, Adaptive Streaming Apps with Spark StreamingBuilding Robust, Adaptive Streaming Apps with Spark Streaming
Building Robust, Adaptive Streaming Apps with Spark StreamingDatabricks
 

Andere mochten auch (20)

Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
Tagging and Processing Data in Real Time-(Hari Shreedharan and Siddhartha Jai...
 
Spark Summit EU 2015: SparkUI visualization: a lens into your application
Spark Summit EU 2015: SparkUI visualization: a lens into your applicationSpark Summit EU 2015: SparkUI visualization: a lens into your application
Spark Summit EU 2015: SparkUI visualization: a lens into your application
 
Spark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher Batey
 
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
Spark Streaming: Pushing the throughput limits by Francois Garillot and Gerar...
 
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)
Some Important Streaming Algorithms You Should Know About-(Ted Dunning, MapR)
 
An Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal MalohlavaAn Introduction to Sparkling Water by Michal Malohlava
An Introduction to Sparkling Water by Michal Malohlava
 
Spark Tuning for Enterprise System Administrators By Anya Bida
Spark Tuning for Enterprise System Administrators By Anya BidaSpark Tuning for Enterprise System Administrators By Anya Bida
Spark Tuning for Enterprise System Administrators By Anya Bida
 
Continuous Integration for Spark Apps by Sean McIntyre
Continuous Integration for Spark Apps by Sean McIntyreContinuous Integration for Spark Apps by Sean McIntyre
Continuous Integration for Spark Apps by Sean McIntyre
 
Insights into Customer Behavior from Clickstream Data by Ronald Nowling
Insights into Customer Behavior from Clickstream Data by Ronald NowlingInsights into Customer Behavior from Clickstream Data by Ronald Nowling
Insights into Customer Behavior from Clickstream Data by Ronald Nowling
 
Beyond Parallelize and Collect by Holden Karau
Beyond Parallelize and Collect by Holden KarauBeyond Parallelize and Collect by Holden Karau
Beyond Parallelize and Collect by Holden Karau
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena EdelsonStreaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
Streaming Analytics with Spark, Kafka, Cassandra and Akka by Helena Edelson
 
Production Readiness Testing Using Spark
Production Readiness Testing Using SparkProduction Readiness Testing Using Spark
Production Readiness Testing Using Spark
 
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
Clickstream Analysis with Spark—Understanding Visitors in Realtime by Josef A...
 
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...
Relationship Extraction from Unstructured Text-Based on Stanford NLP with Spa...
 
Spark summit2014 techtalk - testing spark
Spark summit2014 techtalk - testing sparkSpark summit2014 techtalk - testing spark
Spark summit2014 techtalk - testing spark
 
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
Implementing Near-Realtime Datacenter Health Analytics using Model-driven Ver...
 
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala...
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek Laskowski
 
Building Robust, Adaptive Streaming Apps with Spark Streaming
Building Robust, Adaptive Streaming Apps with Spark StreamingBuilding Robust, Adaptive Streaming Apps with Spark Streaming
Building Robust, Adaptive Streaming Apps with Spark Streaming
 

Ähnlich wie Production Readiness Testing At Salesforce Using Spark MLlib

The_Little_Jenkinsfile_That_Could
The_Little_Jenkinsfile_That_CouldThe_Little_Jenkinsfile_That_Could
The_Little_Jenkinsfile_That_CouldShelley Lambert
 
Azure Powershell. Azure Automation
Azure Powershell. Azure AutomationAzure Powershell. Azure Automation
Azure Powershell. Azure AutomationAlexander Feschenko
 
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...Amazon Web Services
 
Testing for fun in production Into The Box 2018
Testing for fun in production Into The Box 2018Testing for fun in production Into The Box 2018
Testing for fun in production Into The Box 2018Ortus Solutions, Corp
 
2014 Joker - Integration Testing from the Trenches
2014 Joker - Integration Testing from the Trenches2014 Joker - Integration Testing from the Trenches
2014 Joker - Integration Testing from the TrenchesNicolas Fränkel
 
Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...
 Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S... Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...
Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...Skytap Cloud
 
Rewriting a Plugin Architecture 3 Times to Harness the API Economy
Rewriting a Plugin Architecture 3 Times to Harness the API EconomyRewriting a Plugin Architecture 3 Times to Harness the API Economy
Rewriting a Plugin Architecture 3 Times to Harness the API EconomyTim Pettersen
 
Alm Specialist Toolkit Team System Roadmap 2008 And Beyond External
Alm Specialist Toolkit   Team System Roadmap 2008 And Beyond ExternalAlm Specialist Toolkit   Team System Roadmap 2008 And Beyond External
Alm Specialist Toolkit Team System Roadmap 2008 And Beyond ExternalChristian Thilmany
 
Oracle Forms Performance Testing PushToTest TestMaker JAT
Oracle Forms Performance Testing PushToTest TestMaker JATOracle Forms Performance Testing PushToTest TestMaker JAT
Oracle Forms Performance Testing PushToTest TestMaker JATClever Moe
 
Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Pavel Kaminsky
 
QAing in a data platform project - Kalyan Muthiah
QAing in a data platform project - Kalyan MuthiahQAing in a data platform project - Kalyan Muthiah
QAing in a data platform project - Kalyan MuthiahThoughtworks
 
Automating Perl deployments with Hudson
Automating Perl deployments with HudsonAutomating Perl deployments with Hudson
Automating Perl deployments with Hudsonnachbaur
 
[ApacheCon 2016] Advanced Apache Cordova
[ApacheCon 2016] Advanced Apache Cordova[ApacheCon 2016] Advanced Apache Cordova
[ApacheCon 2016] Advanced Apache CordovaHazem Saleh
 
Sauce Labs Beta Program Overview
Sauce Labs Beta Program OverviewSauce Labs Beta Program Overview
Sauce Labs Beta Program OverviewAl Sargent
 
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesik
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A GrzesikApache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesik
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesikmfrancis
 
Declaring Server App Components in Pure Java
Declaring Server App Components in Pure JavaDeclaring Server App Components in Pure Java
Declaring Server App Components in Pure JavaAtlassian
 
Agile Open Source Performance Test Workshop for Developers, Testers, IT Ops
Agile Open Source Performance Test Workshop for Developers, Testers, IT OpsAgile Open Source Performance Test Workshop for Developers, Testers, IT Ops
Agile Open Source Performance Test Workshop for Developers, Testers, IT OpsClever Moe
 
Testing your application on Google App Engine
Testing your application on Google App EngineTesting your application on Google App Engine
Testing your application on Google App EngineInphina Technologies
 

Ähnlich wie Production Readiness Testing At Salesforce Using Spark MLlib (20)

The_Little_Jenkinsfile_That_Could
The_Little_Jenkinsfile_That_CouldThe_Little_Jenkinsfile_That_Could
The_Little_Jenkinsfile_That_Could
 
Azure Powershell. Azure Automation
Azure Powershell. Azure AutomationAzure Powershell. Azure Automation
Azure Powershell. Azure Automation
 
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...
AWS re:Invent 2016: Building a Platform for Collaborative Scientific Research...
 
Testing for fun in production Into The Box 2018
Testing for fun in production Into The Box 2018Testing for fun in production Into The Box 2018
Testing for fun in production Into The Box 2018
 
2014 Joker - Integration Testing from the Trenches
2014 Joker - Integration Testing from the Trenches2014 Joker - Integration Testing from the Trenches
2014 Joker - Integration Testing from the Trenches
 
Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...
 Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S... Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...
Tips to achieve continuous integration/delivery using HP ALM, Jenkins, and S...
 
Integration Testing in Python
Integration Testing in PythonIntegration Testing in Python
Integration Testing in Python
 
MidSem
MidSemMidSem
MidSem
 
Rewriting a Plugin Architecture 3 Times to Harness the API Economy
Rewriting a Plugin Architecture 3 Times to Harness the API EconomyRewriting a Plugin Architecture 3 Times to Harness the API Economy
Rewriting a Plugin Architecture 3 Times to Harness the API Economy
 
Alm Specialist Toolkit Team System Roadmap 2008 And Beyond External
Alm Specialist Toolkit   Team System Roadmap 2008 And Beyond ExternalAlm Specialist Toolkit   Team System Roadmap 2008 And Beyond External
Alm Specialist Toolkit Team System Roadmap 2008 And Beyond External
 
Oracle Forms Performance Testing PushToTest TestMaker JAT
Oracle Forms Performance Testing PushToTest TestMaker JATOracle Forms Performance Testing PushToTest TestMaker JAT
Oracle Forms Performance Testing PushToTest TestMaker JAT
 
Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments Java 6 [Mustang] - Features and Enchantments
Java 6 [Mustang] - Features and Enchantments
 
QAing in a data platform project - Kalyan Muthiah
QAing in a data platform project - Kalyan MuthiahQAing in a data platform project - Kalyan Muthiah
QAing in a data platform project - Kalyan Muthiah
 
Automating Perl deployments with Hudson
Automating Perl deployments with HudsonAutomating Perl deployments with Hudson
Automating Perl deployments with Hudson
 
[ApacheCon 2016] Advanced Apache Cordova
[ApacheCon 2016] Advanced Apache Cordova[ApacheCon 2016] Advanced Apache Cordova
[ApacheCon 2016] Advanced Apache Cordova
 
Sauce Labs Beta Program Overview
Sauce Labs Beta Program OverviewSauce Labs Beta Program Overview
Sauce Labs Beta Program Overview
 
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesik
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A GrzesikApache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesik
Apache Karaf - Building OSGi applications on Apache Karaf - T Frank & A Grzesik
 
Declaring Server App Components in Pure Java
Declaring Server App Components in Pure JavaDeclaring Server App Components in Pure Java
Declaring Server App Components in Pure Java
 
Agile Open Source Performance Test Workshop for Developers, Testers, IT Ops
Agile Open Source Performance Test Workshop for Developers, Testers, IT OpsAgile Open Source Performance Test Workshop for Developers, Testers, IT Ops
Agile Open Source Performance Test Workshop for Developers, Testers, IT Ops
 
Testing your application on Google App Engine
Testing your application on Google App EngineTesting your application on Google App Engine
Testing your application on Google App Engine
 

Mehr von Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang WuSpark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya RaghavendraSpark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingSpark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimSpark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraSpark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovSpark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkSpark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...Spark Summit
 

Mehr von Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
 

Kürzlich hochgeladen

办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxellehsormae
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 

Kürzlich hochgeladen (20)

办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
Vision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptxVision, Mission, Goals and Objectives ppt..pptx
Vision, Mission, Goals and Objectives ppt..pptx
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 

Production Readiness Testing At Salesforce Using Spark MLlib

  • 1. Jag Jayaprakash Ganesh Tiwari Production Readiness Testing Using Apache Spark
  • 2. ​FORCE ​Model-driven development platform ​HEROKU ​Polyglot platform for elastic scale ​APP EXCHANGE ​Enterprise App Marketplace ​LIGHTNING ​Visual Development Platform ​THUNDER ​Stream & event based primitives Background: Salesforce App Cloud
  • 3. Background:Apex ​STRONGLY TYPED ​Direct references to schema objects ​LOOKS LIKE JAVA ​Acts like database stored procedures ​EASY TO TEST ​Built-in support for creation & execution of unit tests ​OBJECT ORIENTED ​Visual Development Platform ​CLOUD HOSTED COMPILER ​Interpreted & executed on on multitenant environment
  • 4. Background: Hammer Process (Aka Production Readiness Testing) BASELINE Execution Results tests tests tests HAMMER Cluster 3 Cluster 2 Cluster 1 Cluster 5 UPGRADED tests tests tests ​CUSTOMER UNIT TESTS ARE EXECUTED TWICE ​in Salesforce secured environment in data centers ​2nd EXECUTION CALLED THE UPGRADED ​on release candidate version ​UPGRADED AND BASELINE RESULTS ARE COMPARED ​When a test passes, it should pass in both versions. ​When a test fails, it should fail in both versions. ​Any other outcome is a potential bug in release candidate version! ​1st EXECUTION CALLED THE BASELINE ​on current productionversion BASELINE
  • 5. Challenges Infrastructure setup to execute hammer on two different platform versions Persist and compare test execution results Tests are executed highly secured data centers 150+ million customer tests and growing Internal SLA to keep these mammoth efforts to under 3 weeks
  • 6. Hammer Core:A Functional Overview Cluster 3 Cluster 2 Cluster 1 Cluster 5 Data Ingestion (Secured Internal APIs) Clustering (Spark ML) Each cluster is a potential bug Extract Transform Load (Spark SQL)
  • 7. Hammer:The Architecture .... Web Server Salesforce Hammer UI Salesforce Hammer Core Secured ExecutionEnvironment Internal APIData Store
  • 8. Data Preparation (ETL): Using Apache SparkSQL ​Example oftypical test executionresult ​Transformation Output Baseline : Expected: 10, Actual: 10 Upgrade : Expected: 10, Actual: 10 PASS - PASS NOTA BUG Baseline : NullPointerException: Attempt to de-reference a null object Upgrade : NullPointerException: Attempt to de-reference a null object FAIL - FAIL NOTA BUG Baseline : Expected: 10, Actual: 10 Upgrade : NullPointerException: Attempt to de-reference a null object PASS - FAIL FOR FURTHER ANALYSIS Baseline : Expected: 10, Actual: 21 Upgrade : Expected: 10, Actual: 51 FAIL – FAIL’ FOR FURTHER ANALYSIS
  • 9. Spark Machine Learning Pipeline Group test failure records into K Clusters Each cluster is a potential bug in Salesforce App Cloud platform Enables inspection of cluster to determine if it’s a bug or not Designed to operate on records marked “FOR FURTHER ANALYSIS”
  • 10. Clustering Using Apache Spark Machine Learning Filter Transformer Baseline: Passed Upgrade: System.DmlException: Insert failed. First exception on 2016/3/4 first error: Cannot insert date 2016/3/4 Tokenizer Transformer Baseline: Passed Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “on”, “2016/3/4”, “first”, “error”, “cannot”, “insert”, “date”, “2016/3/4”] Baseline: Passed Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “on”, “2016/3/4”, “first”, “error”, “cannot”, “insert”, “date”, “2016/3/4”] Stop Words Remover Transformer Canonicalizer Transformer Baseline: Passed Upgrade: [“system.dmlexception”, “insert”, “failed”, “first”, “exception”, “<date>”:, “first”, “error”, “cannot”, “insert”, “date”, “<date>”] Baseline: Passed Upgrade: [[100, 101, 123, 345, 543, 435, 213,321, 312,102],[1,2,1,1,1,1,1,2,1,2]] TF Calculator Transformer Sparse vector format K-means Clustering
  • 11. Accomplishments Test Records Analyzed Old Hammer Engine (hours) New Hammer with Spark (minutes) Speed Improvement 241K 7.5 9 97.9 % 562K 7.8 13 97.2 % 269K 8 11.5 97.6 % 242K 11.2 10 97.9 % 394K 14.2 20 97.0 % 374K 12.5 12 98.4 % ​In Extract Transform Load process -
  • 12. Accomplishments ​In Clustering Analysis - Well formed clusters yielding to good quality bugs Speed – On an average clustering took 40 minutes to complete for 100K+ records Fewer clusters to analyze
  • 13. Q & A