WEKA: Practical Machine Learning Tools and Techniques

Decision Trees: Dealing with numeric attributes
Standard method: binary splits.
Steps to decide where to split:
  - Evaluate the information gain for every possible split point of the attribute.
  - Choose the "best" split point.
But this is computationally intensive.

Decision Trees: Example
Split on the temperature attribute:
  temperature:  64   65   68   69   70   71   72   72   75   75   80   81   83   85
  class:        Yes  No   Yes  Yes  Yes  No   No   Yes  Yes  Yes  No   Yes  Yes  No
temperature < 71.5: yes/4, no/2
temperature > 71.5: yes/5, no/3
info([4,2],[5,3]) = 6/14 info([4,2]) + 8/14 info([5,3]) = 0.939 bits

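The split evaluation in this example can be reproduced with a short Python sketch. The function names (`info`, `split_info`, `candidate_thresholds`, `best_split`) are illustrative and not part of WEKA, and placing candidate thresholds halfway between adjacent distinct values is an assumption that matches the 71.5 used on the slide.

```python
from math import log2

def info(counts):
    """Entropy in bits of a list of class counts, e.g. [4, 2]."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def split_info(values, labels, threshold):
    """Weighted average entropy of the two subsets made by a binary split."""
    left = [l for v, l in zip(values, labels) if v < threshold]
    right = [l for v, l in zip(values, labels) if v >= threshold]
    n = len(labels)
    result = 0.0
    for part in (left, right):
        if part:
            counts = [part.count("yes"), part.count("no")]
            result += len(part) / n * info(counts)
    return result

def candidate_thresholds(values):
    """Midpoints between adjacent distinct values (an assumed convention)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def best_split(values, labels):
    """Threshold with the lowest weighted entropy, i.e. the highest info gain."""
    return min(candidate_thresholds(values),
               key=lambda t: split_info(values, labels, t))

temperature = [64, 65, 68, 69, 70, 71, 72, 72, 75, 75, 80, 81, 83, 85]
labels = ["yes", "no", "yes", "yes", "yes", "no", "no",
          "yes", "yes", "yes", "no", "yes", "yes", "no"]

print(round(split_info(temperature, labels, 71.5), 3))  # 0.939
```

Calling `best_split(temperature, labels)` evaluates every candidate threshold in this way, which is where the computational cost mentioned for numeric attributes comes from.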
Decision Trees: Dealing with missing values
  - Split instances with missing values into pieces.
  - A piece going down a branch receives a weight proportional to the popularity of the branch, that is, the share of training instances that go down it.
  - The weights sum to 1.

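A minimal sketch of this weighting scheme, assuming the popularity of a branch is measured by the number of training instances with a known value that follow it. The function name `distribute` and the branch counts below are hypothetical.

```python
def distribute(weight, branch_counts):
    """Split an instance's weight across branches in proportion to
    how many training instances went down each branch."""
    total = sum(branch_counts.values())
    return {branch: weight * count / total
            for branch, count in branch_counts.items()}

# Hypothetical counts of training instances per branch of a three-way split.
branch_counts = {"sunny": 5, "overcast": 4, "rainy": 5}

# An instance of weight 1 whose value for the split attribute is missing.
pieces = distribute(1.0, branch_counts)
print(pieces)
```

Each piece continues down its branch carrying its fractional weight, and a piece can be split again in the same way at a lower node.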
Decision Trees: Pruning
Pruning makes the decision tree less complex by removing parts that overfit the training data.
There are two types of pruning:
  - Prepruning: deciding during tree building when to stop developing a subtree
  - Postpruning: pruning after the tree has been fully constructed
The two postpruning operations generally used are:
  - Subtree replacement
  - Subtree raising
To decide whether to apply a postpruning operation, calculate the error rate before and after the pruning and compare the two.

Decision Trees: Subtree raising
[Figure: subtree raising]

Decision Trees: Subtree replacement
[Figure: subtree replacement]

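The following sketch illustrates subtree replacement driven by an error comparison. It is a simplified illustration, not WEKA's implementation: it assumes a separate pruning set is available for measuring errors, represents a tree as nested dictionaries, and assumes every attribute value in the pruning set has a branch. All names are hypothetical.

```python
from collections import Counter

# A tree is either a class label (a leaf) or a dict of the form
# {"attr": attribute_name, "branches": {value: subtree}}.

def classify(tree, instance):
    """Follow branches until a leaf (class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][instance[tree["attr"]]]
    return tree

def errors(tree, data):
    """Number of misclassified (instance, label) pairs."""
    return sum(classify(tree, x) != y for x, y in data)

def prune(tree, data):
    """Bottom-up subtree replacement: replace a subtree with a majority-class
    leaf whenever that does not increase the error on the pruning data."""
    if not isinstance(tree, dict) or not data:
        return tree
    pruned = {}
    for value, subtree in tree["branches"].items():
        subset = [(x, y) for x, y in data if x[tree["attr"]] == value]
        pruned[value] = prune(subtree, subset)
    candidate = {"attr": tree["attr"], "branches": pruned}
    majority = Counter(y for _, y in data).most_common(1)[0][0]
    if errors(majority, data) <= errors(candidate, data):
        return majority
    return candidate

tree = {"attr": "windy", "branches": {
    True: {"attr": "humid", "branches": {True: "no", False: "yes"}},
    False: "yes",
}}
pruning_set = [
    ({"windy": True, "humid": True}, "no"),
    ({"windy": True, "humid": False}, "no"),
    ({"windy": False, "humid": True}, "yes"),
    ({"windy": False, "humid": False}, "yes"),
]
print(prune(tree, pruning_set))
```

In this toy data the test on `humid` only adds errors, so the subtree below `windy = True` is replaced by the leaf `no`, while the root test is kept. Subtree raising is not covered by this sketch.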
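Classification rules: Criteria for choosing tests
p/t ratio (p = positive instances covered by the rule, t = total instances covered):
  - Maximizes the proportion of covered instances that are positive, with the stress on accuracy
p[log(p/t) - log(P/T)] (P and T = positive and total instances covered before the new test was added):
  - Favors tests that cover a large number of positive instances, at the cost of some accuracy
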
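To make the difference between the two criteria concrete, here is a small sketch with made-up counts. The function names and numbers are illustrative only, and both functions assume p > 0 and t > 0.

```python
from math import log2

def accuracy(p, t):
    """p/t: fraction of the covered instances that are positive."""
    return p / t

def info_gain(p, t, P, T):
    """p[log(p/t) - log(P/T)], where P and T are the positive and total
    counts covered by the rule before the new test was added."""
    return p * (log2(p / t) - log2(P / T))

# Before adding a test, the rule covers T = 100 instances, P = 50 positive.
P, T = 50, 100

# Candidate test A covers 2 instances, both positive.
# Candidate test B covers 50 instances, 40 of them positive.
print(accuracy(2, 2), accuracy(40, 50))
print(info_gain(2, 2, P, T), info_gain(40, 50, P, T))
```

With these counts the p/t ratio prefers test A (perfectly accurate but covering only two instances), whereas the information-based measure prefers test B, which covers many more positive instances.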
Classification rules: Generating good rules
Overfitting can be reduced by pruning, either during construction or after construction is complete.
To prune during construction, check each newly added test: if the error rate on the pruning set increases because of the new test, remove it.

Classification rules: Obtaining rules from partial decision trees
[Figure: algorithm]

Classification rules: Obtaining rules from partial decision trees (continued)
  - Because node 4 was not replaced, the procedure stops at this stage.
  - Each leaf node now gives a possible rule.
  - Choose the leaf that covers the greatest number of instances.

Extending linear models: Support vector machines
  - Support vector machines are algorithms for learning linear classifiers.
  - They use the maximum margin hyperplane, which reduces overfitting.
  - The instances closest to the maximum margin hyperplane are the support vectors; all other instances can be ignored.

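Extending linear models: Support vector machines (continued)
  - The hyperplane can be written in terms of the support vectors, with one coefficient alpha(i) per training instance and an offset b.
  - Support vectors: all instances for which alpha(i) > 0.
  - b and the alpha(i) are determined by the learning algorithm, in practice using software packages.
  - The hyperplane can also be written with a kernel function in place of the dot product, which gives a nonlinear decision boundary.
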
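The formulas for this slide did not survive as text. A standard form that is consistent with the symbols b and alpha(i) used here is sketched below. The remaining symbols are assumptions introduced for illustration: y_i is the class value (+1 or -1) of training instance a(i), a is the instance being classified, and K is a kernel function, with the polynomial kernel shown as one example.

```latex
% Maximum margin hyperplane expressed through the support vectors
x = b + \sum_{i \,\in\, \text{support vectors}} \alpha_i \, y_i \; \mathbf{a}(i) \cdot \mathbf{a}

% Same form with a kernel function K in place of the dot product
x = b + \sum_{i \,\in\, \text{support vectors}} \alpha_i \, y_i \; K(\mathbf{a}(i), \mathbf{a}),
\qquad \text{for example } K(\mathbf{a}(i), \mathbf{a}) = (\mathbf{a}(i) \cdot \mathbf{a})^n
```

Only instances with alpha(i) > 0 contribute to either sum, which is why the other training instances can be ignored once the classifier has been learned.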
Extending linear models: Multilayer perceptron
  - A network of perceptrons can approximate arbitrary target concepts.
  - The multilayer perceptron (MLP) is an example of an artificial neural network.
  - It consists of an input layer, one or more hidden layers, and an output layer.
  - The structure of an MLP is usually found by experimentation.
  - The parameters (weights) can be found using backpropagation.

Extending linear models: Examples
[Figure: examples]

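As one concrete illustration of a network of perceptrons (not necessarily one of the examples on the original slide), the sketch below wires three threshold units into a network that computes XOR, a concept that no single perceptron can represent because it is not linearly separable. The weights are set by hand for illustration rather than learned.

```python
def step(x):
    """Threshold activation: fires (1) when the weighted sum is positive."""
    return 1 if x > 0 else 0

def perceptron(weights, bias, inputs):
    """A single perceptron: weighted sum of inputs plus bias, then threshold."""
    return step(bias + sum(w * a for w, a in zip(weights, inputs)))

def xor_network(a, b):
    """Input layer (a, b) -> hidden layer (OR, AND) -> output layer."""
    h_or = perceptron([1.0, 1.0], -0.5, [a, b])    # fires if a or b is 1
    h_and = perceptron([1.0, 1.0], -1.5, [a, b])   # fires only if both are 1
    return perceptron([1.0, -1.0], -0.5, [h_or, h_and])  # OR and not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

The hidden layer is what makes the difference: the output unit is still a linear classifier, but it operates on the hidden units' outputs instead of the raw inputs.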
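Extending linear models: Backpropagation
Sigmoid unit: f(x) = 1/(1+exp(-x)), where x is the weighted sum of the unit's inputs
Squared error: E = ½(y - f(x))^2
The error is minimized by gradient descent, which requires the derivative dE/dw(i) for each weight w(i).
Calculate this derivative for all training instances, add up the results, and update each weight:
    w(i) = w(i) - L (dE/dw(i))
where L is the learning rate.
The initial values of w are assumed (typically small random values).
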
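For a single sigmoid unit with x equal to the sum of w(i) a(i), differentiating the squared error gives dE/dw(i) = (f(x) - y) f(x) (1 - f(x)) a(i), using the fact that the sigmoid's derivative is f(x)(1 - f(x)). The sketch below applies the resulting update rule to one unit. It covers only this single-unit case rather than a full multilayer network, and the data set (the OR function with a constant bias input), the starting weights, the learning rate, and the function names are all assumptions made for illustration.

```python
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def output(w, a):
    """Output of one sigmoid unit for input vector a."""
    return sigmoid(sum(wi * ai for wi, ai in zip(w, a)))

def total_error(instances, w):
    """Sum of the squared errors 1/2 (y - f(x))^2 over all instances."""
    return sum(0.5 * (y - output(w, a)) ** 2 for a, y in instances)

def train(instances, weights, rate=0.5, epochs=1000):
    """Batch gradient descent: w(i) = w(i) - L * dE/dw(i)."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for a, y in instances:
            fx = output(w, a)
            delta = (fx - y) * fx * (1.0 - fx)   # dE/dx for this instance
            for i, ai in enumerate(a):
                grad[i] += delta * ai            # accumulate dE/dw(i)
        w = [wi - rate * gi for wi, gi in zip(w, grad)]
    return w

# OR function; the first input is a constant 1 that acts as the bias.
data = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1)]
initial = [0.1, -0.1, 0.05]   # assumed starting weights
trained = train(data, initial)

print(total_error(data, initial), total_error(data, trained))
```

In a multilayer network the same update is applied to every weight, with the error derivative passed backwards from the output layer through the hidden layers.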
Clustering: Incremental clustering
Steps:
  - The tree initially consists of an empty root node.
  - Add instances one by one.
  - Update the tree appropriately at each stage.
  - To update, find the right leaf for the instance.
  - Updating may involve restructuring the tree.
Restructuring: merging two nodes, or splitting (replacing a node with its children)
Decisions are made using category utility.

Clustering: Example of incremental clustering
[Figure: example of incremental clustering]

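Category utility, the measure that drives the update decisions in incremental clustering, can be sketched for nominal attributes as follows. This uses the usual definition (the per-cluster gain in the sum of squared attribute-value probabilities, weighted by cluster size and divided by the number of clusters); the function name and toy data are illustrative, and every cluster is assumed to be non-empty.

```python
from collections import Counter

def category_utility(clusters):
    """clusters: list of clusters, each a non-empty list of instances,
    where an instance is a tuple of nominal attribute values."""
    everything = [inst for cluster in clusters for inst in cluster]
    n = len(everything)
    n_attrs = len(everything[0])

    def sum_squared_probs(instances):
        """Sum over attributes and values of Pr[attribute = value]^2."""
        total = 0.0
        for i in range(n_attrs):
            counts = Counter(inst[i] for inst in instances)
            total += sum((c / len(instances)) ** 2 for c in counts.values())
        return total

    base = sum_squared_probs(everything)
    gain = sum(len(c) / n * (sum_squared_probs(c) - base) for c in clusters)
    return gain / len(clusters)

# Two ways of grouping the same four instances.
good = [[("a", "x"), ("a", "x")], [("b", "y"), ("b", "y")]]
bad = [[("a", "x"), ("b", "y")], [("a", "x"), ("b", "y")]]

print(category_utility(good), category_utility(bad))
```

The grouping that keeps identical instances together scores higher. When a new instance arrives, the candidate updates (adding it to an existing node, creating a new leaf, merging, or splitting) can be compared by the category utility each would produce.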
EM Algorithm
EM = Expectation-Maximization
Generalizes k-means to a probabilistic setting.
Iterative procedure:
  - E "expectation" step: calculate the cluster probability for each instance
  - M "maximization" step: estimate the distribution parameters from the cluster probabilities
Store the cluster probabilities as instance weights.
Stop when the improvement is negligible.

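The iteration above can be sketched for a one-dimensional mixture of normal distributions. This is a minimal illustration under stated assumptions: the choice of Gaussian clusters, the use of log-likelihood as the measure of improvement, the small floor on the standard deviation, the starting parameters, and all names are assumptions, not details taken from the slides.

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mean, std):
    return exp(-((x - mean) ** 2) / (2 * std * std)) / (sqrt(2 * pi) * std)

def em(data, means, stds, priors, tol=1e-6, max_iter=200):
    means, stds, priors = list(means), list(stds), list(priors)
    k = len(means)
    prev_ll = None
    for _ in range(max_iter):
        # E step: cluster probabilities for each instance.
        weights, ll = [], 0.0
        for x in data:
            joint = [priors[j] * normal_pdf(x, means[j], stds[j])
                     for j in range(k)]
            total = sum(joint)
            ll += log(total)
            weights.append([p / total for p in joint])
        # Stop when the improvement in log-likelihood is negligible.
        if prev_ll is not None and ll - prev_ll < tol:
            break
        prev_ll = ll
        # M step: re-estimate parameters, using the cluster
        # probabilities as instance weights.
        for j in range(k):
            wj = [w[j] for w in weights]
            sw = sum(wj)
            means[j] = sum(w * x for w, x in zip(wj, data)) / sw
            var = sum(w * (x - means[j]) ** 2 for w, x in zip(wj, data)) / sw
            stds[j] = max(sqrt(var), 1e-3)  # floor avoids a zero-width cluster
            priors[j] = sw / len(data)
    return means, stds, priors, weights

data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]
means, stds, priors, weights = em(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
print(means, priors)
```

Unlike k-means, each instance belongs to every cluster with some probability, so the returned `weights` rows are soft assignments that sum to 1.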
Visit more self-help tutorials
Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support.
Visit us at www.dataminingtools.net
