SlideShare ist ein Scribd-Unternehmen logo
1 von 17
RAPIDMINER: INTRODUCTION TO DATAMINING
AGENDA ,[object Object]
Introduction to RapidMiner
Use of RapidMiner for Data Mining
Download and Installation Steps
Memory Usage , Plug-ins & Settings
Supported File Formats,[object Object]
Different levels of analysis that are available: Artificial neural networks – Non-linear predictive models that resemble biological neural networks in structure. Genetic algorithms - Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution. Decision trees – Provide a set of rules that you can apply to a new dataset to predict the outcome.        Examples: ,[object Object]
Chi Square Automatic Interaction Detection (CHAID) . CART and CHAID are decision tree techniques used for classification of a dataset. Rule induction – The extraction of useful if-then rules from data based on statistical significance. Nearest neighbor – Classify records based on the k-most similar  records Data visualization - Visual interpretation of complex relationships in multidimensional data.
Applications Can be divided into four major kinds: Classification Numerical prediction Association Clustering Some examples: Automatic abstraction Financial forecasting Targeted marketing Medical diagnosis Credit card fraud detection Weather forecasting etc.
Introduction to RapidMiner RapidMiner (formerly YALE*)is an environment for machine learning and data mining experiments.  RapidMiner is used for both research and real-world data mining tasks. Software versions:  ,[object Object]
Enterprise edition (Community Edition + More Features + Services + Guarantees) *YALE - Yet Another Learning Environment
  Some properties of RapidMiner: Written in Java Knowledge discovery processes are modelled as operator trees Internal XML representation ensures standardized interchange format of data mining experiments Scriptinglanguage allows for automating large-scale experiments Multi-layered data view concept ensures efficient and transparent data handling GUI, command-line mode (batch mode), and Java API for using RapidMiner from other programs Several plugins already exist A large set of high-dimensional visualization schemes for data and models offered by its plotting facility. Applications: text mining, multimedia mining, feature engineering, data stream mining and tracking drifting concepts, development of ensemble methods, and distributed data mining.
Use of RapidMiner for Data Mining Using RapidMiner ,[object Object]
GUI can be used to design XML description of the operator tree
Break points can be used to check the intermediate resultsUse from a separate program Command line version and Java API can be used to invoke RapidMiner in your programs without using the GUI
Download and Installation Steps Download The latest version of RapidMiner can be downloaded from http://rapid-i.com/content/blogsection/7/82/lang,en/ by selecting the appropriate version(Windows x86, x64 etc.) and RapidMiner edition Installation Windows executable Download the windows executable (.exe) file Double-click the rapidminer-xxx-instal.exe file to run it Follow the instructions

Weitere ähnliche Inhalte

Mehr von DataminingTools Inc

AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceDataminingTools Inc
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web miningDataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataDataminingTools Inc
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsDataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisDataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technologyDataminingTools Inc
 

Mehr von DataminingTools Inc (20)

Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Techniques Machine Learning
Techniques Machine LearningTechniques Machine Learning
Techniques Machine Learning
 
Machine learning Introduction
Machine learning IntroductionMachine learning Introduction
Machine learning Introduction
 
Areas of machine leanring
Areas of machine leanringAreas of machine leanring
Areas of machine leanring
 
AI: Planning and AI
AI: Planning and AIAI: Planning and AI
AI: Planning and AI
 
AI: Logic in AI 2
AI: Logic in AI 2AI: Logic in AI 2
AI: Logic in AI 2
 
AI: Logic in AI
AI: Logic in AIAI: Logic in AI
AI: Logic in AI
 
AI: Learning in AI 2
AI: Learning in AI 2AI: Learning in AI 2
AI: Learning in AI 2
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
 
AI: Belief Networks
AI: Belief NetworksAI: Belief Networks
AI: Belief Networks
 
AI: AI & Searching
AI: AI & SearchingAI: AI & Searching
AI: AI & Searching
 
AI: AI & Problem Solving
AI: AI & Problem SolvingAI: AI & Problem Solving
AI: AI & Problem Solving
 
Data Mining: Text and web mining
Data Mining: Text and web miningData Mining: Text and web mining
Data Mining: Text and web mining
 
Data Mining: Outlier analysis
Data Mining: Outlier analysisData Mining: Outlier analysis
Data Mining: Outlier analysis
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence dataData Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlationsData Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysisData Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
 
Data warehouse and olap technology
Data warehouse and olap technologyData warehouse and olap technology
Data warehouse and olap technology
 
Data Mining: Data processing
Data Mining: Data processingData Mining: Data processing
Data Mining: Data processing
 

Kürzlich hochgeladen

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 

Kürzlich hochgeladen (20)

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 

RAPIDMINER: Introduction To Datamining

  • 2.
  • 4. Use of RapidMiner for Data Mining
  • 6. Memory Usage , Plug-ins & Settings
  • 7.
  • 8.
  • 9. Chi Square Automatic Interaction Detection (CHAID) . CART and CHAID are decision tree techniques used for classification of a dataset. Rule induction – The extraction of useful if-then rules from data based on statistical significance. Nearest neighbor – Classify records based on the k-most similar records Data visualization - Visual interpretation of complex relationships in multidimensional data.
  • 10. Applications Can be divided into four major kinds: Classification Numerical prediction Association Clustering Some examples: Automatic abstraction Financial forecasting Targeted marketing Medical diagnosis Credit card fraud detection Weather forecasting etc.
  • 11.
  • 12. Enterprise edition (Community Edition + More Features + Services + Guarantees) *YALE - Yet Another Learning Environment
  • 13. Some properties of RapidMiner: Written in Java Knowledge discovery processes are modelled as operator trees Internal XML representation ensures standardized interchange format of data mining experiments Scriptinglanguage allows for automating large-scale experiments Multi-layered data view concept ensures efficient and transparent data handling GUI, command-line mode (batch mode), and Java API for using RapidMiner from other programs Several plugins already exist A large set of high-dimensional visualization schemes for data and models offered by its plotting facility. Applications: text mining, multimedia mining, feature engineering, data stream mining and tracking drifting concepts, development of ensemble methods, and distributed data mining.
  • 14.
  • 15. GUI can be used to design XML description of the operator tree
  • 16. Break points can be used to check the intermediate resultsUse from a separate program Command line version and Java API can be used to invoke RapidMiner in your programs without using the GUI
  • 17. Download and Installation Steps Download The latest version of RapidMiner can be downloaded from http://rapid-i.com/content/blogsection/7/82/lang,en/ by selecting the appropriate version(Windows x86, x64 etc.) and RapidMiner edition Installation Windows executable Download the windows executable (.exe) file Double-click the rapidminer-xxx-instal.exe file to run it Follow the instructions
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. Supported File Formats Can read data files, read & write models, parameter sets and attribute sets. Most important – examples and instances
  • 23. Data files & attribute description files ARFFEXAMPLESOURCE - .arff format DATABASEEXAMPLESOURCE – To read from databases SPARSEFORMATEXAMPLESOURCE DENSEFORMATEXAMPLESOURCE Attribute description file (.aml) in order to retrieve metadata about the instances XML Attributes that can be set: Name – unique name of the attribute Sourcefile – name of the file containing the data(default used if not specified) Sourcecol –column within the file(Starting from 1) Sourcecol_end – sourcecol-sourcecol_end attributes are generated with the same properties. Valuetype– one out of nominal,numeric, integer, real, ordered, binominal, polynominal and file_path Blocktype – one out of single_value, value_series, value_series_start, value_series_end, interval, interval_start, interval_end
  • 24. Model files (.mod files) Contains the models generated by previous runs MODELWRITER – to write model files MODELLOADER – to read model files MODELAPPLIER – to apply model files Attribute construction files (.att files) ATTRIBUTECONSTRUCTIONWRITER – writes an attribute set ATTRIBUTECONSTRUCTIONLOADER – reads an attribute set Parameter set files (.par files) GRIDPARAMETEROPTIMIZTION – generates a set of optimal parameters for a particular task PARAMETERSETLOADER – use the parameter files Attribute weight files (.wgt files) Attibute selection is seen as attribute weighing which allows for more flexibility ATTRIBUTEWEIGHTSWRITER – to write attribute weights to a file ATTRIBUTEWEIGHTSLOADER – to read the attribute weights ATTRIBUTEWEIGHTSAPPLIER – to apply in the example sets
  • 26. More questions… Reach us at support@dataminingtools.net VISIT: WWW.DATAMININGTOOLS.NET