Grid search, pipeline, featureunion


zekeLabs Technologies


Agenda
● Hyperparameter
● GridSearch
● Transformers
● Estimators
● Pipeline - Connecting Estimators & Transformers
● FeatureUnion
● Connecting FeatureUnion using Pipeline
● Advantages of Pipeline
● Limitations of Pipeline

Hyperparameter
● Hyperparameters are the configurable parameters of transformers and
estimators, set before fitting rather than learned from the data.
● Configuring transformers and estimators with the best hyperparameters
gives the best model.
● Finding the best hyperparameters is a tricky task.

GridSearch
● Takes a set of candidate values for each hyperparameter
● Creates a model for every possible combination
● Trains and validates all the models
● Returns the best model
● Also returns the best-performing hyperparameters
● We may then narrow the search range and repeat.
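
The steps above can be sketched with scikit-learn's GridSearchCV. This is only an illustration: the iris dataset, the SVC estimator and the candidate values in param_grid are assumptions chosen for the example, not taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Example data (assumption: any labelled dataset would do).
X, y = load_iris(return_X_y=True)

# Candidate values per hyperparameter; the grid is their Cartesian product.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Every combination is trained and validated with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_model = search.best_estimator_    # refitted on all the data
best_params = search.best_params_      # the winning combination
n_candidates = len(search.cv_results_["params"])
print(best_params, search.best_score_)
```

To narrow down and repeat, one would build a finer param_grid around best_params and run the search again.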

Transformers
● Objects capable of transforming data are called transformers
● Instantiating a preprocessing class gives a transformer object
● They support the fit(), transform() and fit_transform() methods
● StandardScaler, MinMaxScaler, etc. are built in
● Using FunctionTransformer, we can create our own transformers
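
A minimal sketch of the transformer interface, using a built-in scaler and a custom transformer built with FunctionTransformer. The toy array and the choice of np.log1p as the custom function are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer, StandardScaler

# Toy data (assumption): three samples, two features.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# fit() learns the per-column mean and variance; transform() applies them.
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)

# fit_transform() does both steps in one call.
X_scaled_again = StandardScaler().fit_transform(X)

# A custom transformer wrapping a plain function.
log_transformer = FunctionTransformer(np.log1p)
X_log = log_transformer.fit_transform(X)
```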

Estimators
● After the right transformations, data is passed to an estimator
for training or prediction
● Objects of learning algorithms are known as estimators
● Examples: LinearRegression, KMeans, etc.
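
As a rough illustration, both estimators named above are trained with fit(). The toy data below is an assumption made up for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Toy data (assumption): y is exactly 2 * x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Supervised estimator: fit on (X, y), then predict on new data.
reg = LinearRegression().fit(X, y)
pred = reg.predict(np.array([[5.0]]))

# Unsupervised estimator: fit on X only, then read the cluster labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
```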

Pipeline
● Sequentially apply a list of transforms and a final estimator.
● Intermediate steps of the pipeline must be ‘transforms’, that is, they must
implement fit and transform methods.
● The final estimator only needs to implement fit.
● The transformers in the pipeline can be cached using the memory argument.
[Figure: example pipeline: Imputer → StandardScaler → PCA → SGDClassifier]
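
The four-step chain named on this slide could be assembled roughly as follows. The Imputer class of older scikit-learn releases has since been replaced by SimpleImputer, which is used here. The step names, the iris data and the artificially inserted missing values are assumptions for the sketch.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = X.copy()
X[::10, 0] = np.nan  # knock out some values so the imputer has work to do

# Intermediate steps are transformers; the last step is the final estimator.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("clf", SGDClassifier(random_state=0)),
])

# fit() runs fit_transform on each transformer in order, then fits the classifier.
pipe.fit(X, y)
predictions = pipe.predict(X)
```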

Pipeline
● Lets us quickly build a model, with all the pre-processing and imputation
chains, as a single scikit-learn object that has the fit and transform
methods these objects usually come with.
● In addition, we can wrap the object in a grid search to find the optimal
hyperparameters across all the steps in the pipeline.
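
A sketch of wrapping a pipeline in a grid search. Parameters of individual steps are addressed as step name, double underscore, parameter name. The step names, the estimators and the grid values are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# "<step name>__<parameter>" reaches into a step of the pipeline.
param_grid = {
    "pca__n_components": [2, 3],
    "clf__C": [0.1, 1.0, 10.0],
}

# The whole pipeline is refitted on each training fold, so the
# transformers never see the validation fold during fitting.
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```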

FeatureUnion
● Concatenates the results of multiple transformer objects.
● This estimator applies a list of transformer objects in parallel to the
input data, then concatenates the results.
● This is useful to combine several feature extraction mechanisms into a
single transformer.
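
A minimal sketch of a FeatureUnion, and of connecting it to an estimator through a Pipeline as the agenda mentions. The two feature extractors (PCA and SelectKBest), the classifier and the dataset are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

X, y = load_iris(return_X_y=True)

# Each transformer sees the same input; outputs are stacked column-wise.
features = FeatureUnion([
    ("pca", PCA(n_components=2)),   # 2 columns
    ("kbest", SelectKBest(k=1)),    # 1 column
])
X_combined = features.fit_transform(X, y)

# The union is itself a transformer, so it can be a pipeline step.
model = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
predictions = model.predict(X)
```

Here the combined matrix has three columns: two principal components plus the single best-scoring original feature.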

Advantages
● Integrates very well with GridSearchCV for hyper-parameter tuning
● Code is highly modular & reusable
● Caching of transformers can be enabled
● Perhaps one of the best features of scikit-learn
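
Caching can be switched on by passing a directory to the memory argument of Pipeline. The temporary directory, the steps and the dataset below are assumptions for this sketch.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# Fitted transformers are stored here and reused when the same
# transformer is fitted again with the same parameters and data.
cache_dir = mkdtemp()

pipe = Pipeline(
    [("pca", PCA(n_components=2)), ("clf", LogisticRegression(max_iter=1000))],
    memory=cache_dir,
)
pipe.fit(X, y)
accuracy = pipe.score(X, y)

# Remove the cache directory once it is no longer needed.
rmtree(cache_dir)
```

The benefit shows up mainly in a grid search, where many candidates share the same early transformer settings and would otherwise refit them each time.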

Limitations
● Unfortunately, Pipeline doesn’t support the partial_fit API.
● That’s because not all transformers are capable of out-of-core processing.
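
One possible workaround, sketched under assumptions, is to chain the steps by hand using components that do offer partial_fit. StandardScaler and SGDClassifier both do. The batch size, the dataset and the shuffling are assumptions for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
classes = np.unique(y)  # partial_fit needs the full label set up front

# Shuffle so that every batch contains a mix of classes.
order = np.random.default_rng(0).permutation(len(X))
X, y = X[order], y[order]

scaler = StandardScaler()
clf = SGDClassifier(random_state=0)

# Feed the data in batches, as if it did not fit in memory at once.
batch_size = 30
for start in range(0, len(X), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    scaler.partial_fit(X_batch)  # update running mean and variance
    clf.partial_fit(scaler.transform(X_batch), y_batch, classes=classes)

predictions = clf.predict(scaler.transform(X))
```

Note that early batches are scaled with statistics estimated from only part of the data, which is one reason this manual chain is not equivalent to fitting a Pipeline on the full dataset.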

Visit www.zekeLabs.com for more details
THANK YOU
Let us know how we can help your organization upskill its
employees to stay up to date in the ever-evolving IT industry.
Get in touch:
www.zekeLabs.com | +91-8095465880 | info@zekeLabs.com
