Machine Learning with Spark
and Cassandra - Testing
Tests for Binary Classification Models,
Regression Models,
And Multi-class Classification Models
Series
Machine Learning with Spark and
Cassandra
● Environment Setup
● Data Pre-processing
● Testing
● Validation
● Model Selection Tests
How do we test machine
learning models?
● Tests are a statistical measure of how well our models
work.
● Calculated by running a model on held-out data with known
properties and comparing model predictions to known
labels.
● Works differently for different types of ML models.
● An attempt to capture the potential performance on data the
model will see in day-to-day operation.
When do we test?
On what data?
When to test?
● Whenever we have a trained model, we can start testing. Depending on what we find and where we
are, the test can have us proceeding on to next steps or returning to previous ones.
○ Sometimes we go back to tune the parameters of our model.
○ Sometimes we may want to pick a new algorithm to train altogether.
○ Other times we move forwards to more complex testing strategies or onwards to deployment.
● The same calculations for test statistics can also be a part of the mathematical process for training
our model
What data to test on.
● Should always test on held-out data, never the same data that was
used to train the model.
○ ML algorithms often work by optimizing these same test statistics on the
training dataset, so scores on the training set are optimistic and tell us
little about how well the model generalizes to real data.
● There exist multiple methods for choosing data to be held out; the choice
should always be made randomly.
○ Simplest method is to split data into two random chunks, train on one and
then test on the other
○ Can also split into three chunks, one for training, one for testing, one for
final validation
○ More complex schemes exist, to be covered next time in talk on validation
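The random hold-out split described above can be sketched in plain Python. This is a minimal illustration, not the code from the demo: the function name `split_dataset`, the 60/20/20 fractions, and the fixed seed are all assumptions chosen for the example.

```python
import random

def split_dataset(rows, fractions=(0.6, 0.2, 0.2), seed=42):
    """Shuffle rows and cut them into train / test / validation chunks.

    The fractions and seed are illustrative defaults, not values from the talk.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train_end = round(n * fractions[0])
    test_end = train_end + round(n * fractions[1])
    # Whatever remains after the first two chunks becomes the validation set.
    return shuffled[:train_end], shuffled[train_end:test_end], shuffled[test_end:]

train, test, validation = split_dataset(range(100))
print(len(train), len(test), len(validation))
```

For the simpler two-chunk split, pass fractions such as `(0.8, 0.2, 0.0)` and ignore the empty third chunk. When the data already lives in a Spark DataFrame, `DataFrame.randomSplit` with a list of weights and a seed should serve the same purpose.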
Binary Classification Tests
● Binary classifiers predict a boolean value. Sometimes the prediction is about the presence
or absence of a particular thing, other times it picks between two categories.
● In order to test our binary classification models we use something called a confusion matrix. It
categorizes our predictions based on what value we predicted and what the actual value is, giving
four counts: true positives, false positives, true negatives, and false negatives.
● We use these values to compute more meaningful metrics.
● The most commonly used is accuracy. Accuracy is computed as
correct predictions divided by all predictions. It's a general measure
of how likely we are to correctly predict a given example.
● Recall is computed as the number of correctly identified positive
values divided by the number of actual positive values. It measures
how well our model detects the presence of positive values.
● Precision is calculated as the number of correctly identified positive
values divided by the number of positive predictions. It measures the
reliability of the positive prediction.
● We can use recall and precision to calculate a composite value, the
F1 score, which is their harmonic mean. If either recall or precision is
low, the F1 score will also be small. It emphasizes the importance of the
incorrect predictions.
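The metrics above follow directly from the four confusion matrix counts. The sketch below is a plain-Python illustration with made-up labels; the function names are assumptions, and returning 0.0 when a denominator is zero is one possible convention rather than the only one.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1          # predicted positive, actually positive
        elif p and not a:
            fp += 1          # predicted positive, actually negative
        elif a:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, tn, fn

def binary_metrics(actual, predicted):
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up example: 4 actual positives, 6 actual negatives.
actual    = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]
metrics = binary_metrics(actual, predicted)
print(metrics)  # accuracy 0.8, precision 0.75, recall 0.75, f1 0.75
```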
Test Error for Regression Models
● Regression models estimate functions, and produce predictions in the form of scalar values.
Classification tests do not work for them. Instead we use the difference between predicted and
actual values as a simple error metric.
● Adding error values without extra processing is a bad idea since
errors in different directions can cancel out.
● Instead we use metrics like the sum of squared error (SSE), a simple
measure that captures error over the entire test set.
● We can also use mean squared error (MSE), which in some cases is
better since it is independent of the number of examples in the test
set.
● Root mean squared error (RMSE) is sometimes preferable since it is
returned in the same units as our predictions rather than units
squared, but still maintains many of the statistical features of the
MSE.
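A small worked example makes the relationship between these three measures concrete. The values below are invented for illustration, and the function name is an assumption. Note that the raw errors sum to zero even though three of the four predictions are wrong, which is the cancellation problem mentioned above.

```python
import math

def squared_error_metrics(actual, predicted):
    """Return (SSE, MSE, RMSE) for paired actual and predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)   # grows with the size of the test set
    mse = sse / len(errors)            # independent of the test set size
    rmse = math.sqrt(mse)              # back in the units of the predictions
    return sse, mse, rmse

actual    = [3.0, 5.0, 2.0, 7.0]
predicted = [4.0, 3.0, 2.0, 8.0]

raw_sum = sum(p - a for a, p in zip(actual, predicted))
sse, mse, rmse = squared_error_metrics(actual, predicted)
print(raw_sum)         # 0.0, the errors 1, -2, 0, 1 cancel out
print(sse, mse, rmse)  # SSE 6.0, MSE 1.5, RMSE about 1.22
```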
● We sometimes prefer absolute error measures to squared error measures, which we calculate by
taking the absolute value of each error rather than squaring it.
● Large error values, and therefore outliers, are emphasized more by squared error measures.
● The absolute value function is not differentiable at zero, which makes it harder to calculate gradients.
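To see how differently the two families of measures react to an outlier, the sketch below compares mean absolute error (MAE) with MSE on invented data. The function names and numbers are assumptions made for the example.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual       = [10.0, 10.0, 10.0, 10.0]
close        = [11.0, 9.0, 11.0, 9.0]    # every prediction is off by 1
with_outlier = [11.0, 9.0, 11.0, 19.0]   # one prediction is off by 9

# Without the outlier both measures agree.
print(mean_absolute_error(actual, close), mean_squared_error(actual, close))
# A single large error triples the MAE (1 to 3) but multiplies the MSE by 21.
print(mean_absolute_error(actual, with_outlier),
      mean_squared_error(actual, with_outlier))
```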
Confusion Matrices for Multiclass
Classification Models
● Multiclass classifiers predict a value that can take one of more than two, but still finitely many, possible values.
● We test them by building confusion matrices, similar to binary classification, but these cannot be
turned directly into test metrics.
● We build an n by n grid, where n is the number of possible classes and place each test result into its
cell based on what was predicted and the actual class of the example.
● We can then turn that into n individual matrices, one for each class. We treat correct predictions
on a particular class as true positives, and then all other predictions are classified based on their
relation to the class that the matrix is for.
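The two steps above, building the n by n grid and then collapsing it into one binary matrix per class, can be sketched as follows. The class names and labels are invented, the function names are assumptions, and placing actual classes on rows and predicted classes on columns is a convention chosen here.

```python
def confusion_matrix(actual, predicted, classes):
    """Build an n-by-n grid: rows are actual classes, columns are predictions."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def one_vs_rest(matrix, k):
    """Collapse the grid into (tp, fp, tn, fn) counts for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]                            # class k predicted correctly
    fn = sum(matrix[k]) - tp                     # class k predicted as another class
    fp = sum(row[k] for row in matrix) - tp      # another class predicted as k
    tn = total - tp - fn - fp                    # class k not involved at all
    return tp, fp, tn, fn

classes   = ["cat", "dog", "bird"]
actual    = ["cat", "cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "cat", "dog", "dog", "bird", "bird", "bird", "cat"]

matrix = confusion_matrix(actual, predicted, classes)
for k, name in enumerate(classes):
    print(name, one_vs_rest(matrix, k))
```

Each tuple can then be fed into the same accuracy, precision, recall, and F1 formulas used for binary classifiers.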
● From these new matrices, we can calculate our test metrics for each class. We can then combine
these values in various ways based on what is important for our application.
● We average scores together, but we can average based on the number of classes, weighting each
class's score equally (called macro-average), or we can weight each score by the number of
examples that class has (a weighted average; the closely related micro-average pools the
counts from all classes before computing the metric).
● Macro-average can act as a general score, though like any average it may obscure very high or low
performance on particular classes. If performance on the most common classes matters most we may
choose the example-weighted or micro-average, and if a particular class is important we should
look at its individual test scores.
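The averaging choices can be compared on a small imbalanced example. The sketch below uses recall only, with invented per-class counts and an assumed function name. It computes the macro-average, the average weighted by examples per class, and the micro-average obtained by pooling counts across classes; for recall the last two coincide.

```python
def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0

def averaged_recall(per_class):
    """per_class maps a class name to its one-vs-rest (tp, fn) counts."""
    recalls = {c: recall(tp, fn) for c, (tp, fn) in per_class.items()}
    support = {c: tp + fn for c, (tp, fn) in per_class.items()}
    total = sum(support.values())
    # Macro: every class counts equally, however rare it is.
    macro = sum(recalls.values()) / len(recalls)
    # Weighted: each class counts in proportion to its number of examples.
    weighted = sum(recalls[c] * support[c] for c in recalls) / total
    # Micro: pool the counts over all classes, then compute recall once.
    micro = recall(sum(tp for tp, _ in per_class.values()),
                   sum(fn for _, fn in per_class.values()))
    return {"per_class": recalls, "macro": macro,
            "weighted": weighted, "micro": micro}

# Invented counts: a common class handled well and a rare class handled badly.
counts = {"common": (90, 10), "medium": (8, 2), "rare": (1, 4)}
scores = averaged_recall(counts)
print(scores)  # macro is about 0.63; weighted and micro are about 0.86
```

The poor recall on the rare class (0.2) pulls the macro-average down noticeably but barely moves the weighted score, which is why the per-class values are worth inspecting when a specific class matters.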
Demo
Any Questions?
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
 www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037
