Introduction to XLMiner™ Data Utilities
XLMiner and Microsoft Office are registered trademarks of their respective owners.
Brief description of the features of XLMiner: Data Utilities
XLMiner puts a set of data utilities at the user's disposal:
Sample from Worksheet/Database
Stratified Sampling
Missing Data Handling
Bin Continuous Data
Transform Categorical Data
http://dataminingtools.net
Sample data from Worksheet
When huge amounts of data are involved, statisticians prefer to work with a sample that represents the entire database. Such a representative sample can, however, be difficult to obtain.
The entire dataset we want information about is called the population. A sample is the part of the population that we actually examine in order to draw conclusions.
A good sample is a true representation of the data: as far as possible, the cases chosen for the sample should be like the cases that are not chosen. A poor sample design can produce misleading conclusions. Various methods and techniques have been developed to ensure a true sample, and XLMiner provides sampling facilities based on them.
Sample data from Worksheet
In XLMiner, sampling can be done in two ways:
Simple Random Sampling: A random sample of x records is chosen from the data such that every record in the data has an equal chance of being chosen.
Stratified Sampling: The data is divided into strata of similar items. Each stratum is then sampled using the simple random approach, and the results are combined to give the final sample.
Sample data from Worksheet: Simple Random Sampling
Select the variables to be present in the sample.
Here "Simple random sampling" is selected.
Specify the seed value (the value used to start the random selection), or let the wizard supply a default.
Set the size of the sampled set.
If sampling with replacement is selected, duplicate copies of records may appear in the sample.
Sample data from Worksheet: Simple Random Sampling (output)
Sample data from Worksheet: Simple Random Sampling (output with replacement). Duplicate copies of records exist in the sample.
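The two simple random sampling options above can be sketched outside XLMiner. This is a minimal illustration using only Python's standard library, not XLMiner's own implementation; the function name `simple_random_sample`, the default seed and the stand-in data are assumptions made for the example.

```python
import random

def simple_random_sample(records, size, seed=12345, with_replacement=False):
    """Draw `size` records so that every record has the same chance of selection.

    `seed` plays the role of the seed value in the dialog (hypothetical default).
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    if with_replacement:
        # Each draw is independent, so the same record may be picked again.
        return [rng.choice(records) for _ in range(size)]
    # Without replacement: each record can appear at most once.
    return rng.sample(records, size)

records = list(range(1, 101))  # stand-in for 100 worksheet rows
without = simple_random_sample(records, 20)
with_dup = simple_random_sample(records, 20, with_replacement=True)

print(sorted(without))
print(sorted(with_dup))  # may contain repeated values
```

Re-running with the same seed reproduces the same sample, which is why a fixed seed is useful when results need to be compared.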
Sample data from Worksheet: Stratified Sample (proportionate)
Sample data from Worksheet: Stratified Sample (proportionate, output). As selected, the percentage of records in each stratum of the sample is the same as in the input set.
Sample data from Worksheet: Stratified Sample (specify number)
Sample data from Worksheet: Stratified Sample (specify number, output). All strata have the same size, as specified by the user (here 10 records each).
Sample data from Worksheet: Stratified Sample (size of smallest stratum)
Sample data from Worksheet: Stratified Sample (size of smallest stratum, output). Every stratum in the sample has a size equal to that of the smallest stratum in the input.
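The three stratified options (proportionate, a specified number per stratum, and the size of the smallest stratum) differ only in how many records are drawn from each stratum. The sketch below illustrates that idea in plain Python; the function `stratified_sample`, its `mode` names and the example data are hypothetical and are not taken from XLMiner.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, key, mode, seed=12345, fraction=None, n_per_stratum=None):
    """Split records into strata by `key`, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    if mode == "proportionate":
        # Same fraction from every stratum keeps the input proportions.
        sizes = {k: round(len(rows) * fraction) for k, rows in strata.items()}
    elif mode == "fixed":
        # User-specified number per stratum (capped at the stratum size).
        sizes = {k: min(n_per_stratum, len(rows)) for k, rows in strata.items()}
    elif mode == "smallest":
        # Every stratum is cut down to the size of the smallest one.
        smallest = min(len(rows) for rows in strata.values())
        sizes = {k: smallest for k in strata}
    else:
        raise ValueError(f"unknown mode: {mode}")

    sample = []
    for k, rows in strata.items():
        sample.extend(rng.sample(rows, sizes[k]))  # simple random draw within the stratum
    return sample

# Stand-in data: 60 records in stratum A, 30 in B, 10 in C.
records = [(i, "A") for i in range(60)] + [(i, "B") for i in range(60, 90)] + [(i, "C") for i in range(90, 100)]
by_stratum = lambda rec: rec[1]

proportionate = stratified_sample(records, by_stratum, "proportionate", fraction=0.2)
fixed = stratified_sample(records, by_stratum, "fixed", n_per_stratum=10)
smallest = stratified_sample(records, by_stratum, "smallest")

for name, sample in [("proportionate", proportionate), ("fixed", fixed), ("smallest", smallest)]:
    print(name, dict(Counter(s for _, s in sample)))
```

With this data the proportionate sample keeps the 60/30/10 split, while the other two modes return equally sized strata.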
Missing Data Handling
This utility lets the user process the data before any mining method is applied to it. It detects missing values in the data and handles them the way the user wants.
XLMiner considers a cell to be missing data if it is empty or contains an invalid formula. XLMiner can also be told to treat a cell as missing data if it contains a particular value specified by the user.
The user specifies how XLMiner should correct these missing values, and a treatment can be assigned to every variable. Records with missing data can either be deleted entirely or have their missing values replaced.
XLMiner provides options for how to replace the missing data, e.g. by the mean, the median, the mode, or a value specified by the user. The available options depend on the type of variable.
Missing Data Handling
Missing Data Handling: Data Set. Select the action to handle the missing data in individual columns and click on "Apply this option to selected variable".
Missing Data Handling: Output. Changed records are highlighted.
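The treatments described above (delete the record, or replace the missing value by the mean, median, mode or a user-specified value) can be sketched for a single column as follows. This is an illustration with Python's standard library under stated assumptions, not XLMiner's code: the function `handle_missing`, the treatment names and the `-999` missing-value code are all made up for the example.

```python
import statistics

def handle_missing(rows, column, treatment, missing_code=None, fill_value=None):
    """Apply one treatment to one column; rows are dicts, one per record."""
    def is_missing(value):
        # Empty cells count as missing, as does an optional user-specified code.
        return value is None or value == "" or (missing_code is not None and value == missing_code)

    if treatment == "delete":
        # Drop every record whose value in this column is missing.
        return [dict(r) for r in rows if not is_missing(r[column])]

    present = [r[column] for r in rows if not is_missing(r[column])]
    if treatment == "mean":
        replacement = statistics.mean(present)
    elif treatment == "median":
        replacement = statistics.median(present)
    elif treatment == "mode":
        replacement = statistics.mode(present)
    elif treatment == "value":
        replacement = fill_value
    else:
        raise ValueError(f"unknown treatment: {treatment}")

    # Replace only the missing cells; other cells are copied unchanged.
    return [{**r, column: replacement} if is_missing(r[column]) else dict(r) for r in rows]

rows = [
    {"id": 1, "age": 20},
    {"id": 2, "age": None},   # empty cell
    {"id": 3, "age": 30},
    {"id": 4, "age": -999},   # hypothetical user-specified missing code
    {"id": 5, "age": 40},
]

deleted = handle_missing(rows, "age", "delete", missing_code=-999)
filled = handle_missing(rows, "age", "mean", missing_code=-999)
print([r["id"] for r in deleted])
print([r["age"] for r in filled])
```

Mean and median only make sense for numeric columns, which matches the remark that the available options depend on the type of variable.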
Transform Categorical Data
Data sets may contain variables that take non-numeric values, which makes it difficult to apply standard procedures. XLMiner therefore provides a tool that transforms non-numeric data into numeric data. There are two ways to transform categorical data:
Create dummies: Suppose the variable has 4 distinct values, A, B, C and D. Then 3 new columns, VAL1, VAL2 and VAL3, are created with values of either 1 or 0. If a row contains the value A, VAL1 is 1 and the rest are 0; likewise VAL2 is 1 for B and VAL3 is 1 for C. If all three are 0, the row has the value D.
Create category scores: If the non-numeric variable holds 4 distinct values as above, each value (ordered alphabetically) is numbered from 1 to 4, and a new column is created that contains, for each row, the number corresponding to that row's value.
Transform Categorical Data: Dummies. Select the variable that contains non-numeric data and needs to be transformed.
Transform Categorical Data: Category Scores
Transform Categorical Data: Category Scores (output)
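Both transformations can be written in a few lines. The sketch below follows the description given for the four values A, B, C and D; the function names `create_dummies` and `category_scores` are hypothetical, and treating the alphabetically last value as the all-zeros case is an assumption that matches that example.

```python
def create_dummies(values, prefix="VAL"):
    """One 0/1 column per category except the last, which is the all-zeros case."""
    categories = sorted(set(values))
    kept = categories[:-1]  # e.g. A, B, C get columns; D is implied by all zeros
    return [
        {f"{prefix}{i}": int(value == cat) for i, cat in enumerate(kept, start=1)}
        for value in values
    ]

def category_scores(values):
    """Number the distinct values 1..k in alphabetical order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)), start=1)}
    return [codes[value] for value in values]

values = ["A", "B", "C", "D", "A"]
dummies = create_dummies(values)
scores = category_scores(values)

for value, row, score in zip(values, dummies, scores):
    print(value, row, score)
```

Dummies avoid implying an order between categories, whereas category scores produce a single column but introduce an ordering (A < B < C < D) that may not be meaningful.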
Thank you. For more, visit: http://dataminingtools.net
