1. DATA ANALYSIS
Basic Data Modeling and Evaluation
Md Main Uddin Rony
Software Engineer, Infolytx, Inc.
2. Outline
❏ A Little Insight into Data Analysis
- What is it?
- Steps of Data Analysis
❏ Introduction to a Data Analysis Tool
- Tool overview, history
- Its offerings
- Why should we use it?
- Download
❏ Hands-On with the Tool
- Introducing its interface
- Building a Model
- Evaluating the Model
3. Data Analysis
Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful
information, suggesting conclusions, and supporting decision-making. (From Wikipedia)
❏ Acquiring meaningful insights from a dataset
❏ Structuring the findings from survey research or other means of data collection
❏ Breaking a macro picture into a micro one
❏ Basing critical decisions on findings
❏ Ruling out human bias through proper statistical treatment
❏ Used in industry to allow companies and organizations to make better business decisions
❏ Used to determine whether the systems in place effectively protect data, operate efficiently, and succeed in
accomplishing an organization’s goals
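The four activities named in the definition above (inspecting, cleaning, transforming, modeling) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the survey rows, the function names, and the choice of a per-region mean as the "model" are all hypothetical and do not come from the slides.

```python
from statistics import mean

# Hypothetical raw survey rows: (region, monthly_spend as text).
# None marks a missing answer; all values are made up for illustration.
raw_rows = [
    ("north", "120"),
    ("north", None),
    ("south", "80"),
    ("south", " 100 "),
    ("north", "140"),
]


def inspect(rows):
    """Inspecting: count all rows and the rows with a missing value."""
    missing = sum(1 for _, value in rows if value is None)
    return {"rows": len(rows), "missing": missing}


def clean(rows):
    """Cleaning: drop rows with missing values and strip stray whitespace."""
    return [(region, value.strip()) for region, value in rows if value is not None]


def transform(rows):
    """Transforming: convert the text values to numbers."""
    return [(region, float(value)) for region, value in rows]


def model(rows):
    """Modeling: a deliberately simple model, the mean value per region."""
    groups = {}
    for region, value in rows:
        groups.setdefault(region, []).append(value)
    return {region: mean(values) for region, values in groups.items()}


report = inspect(raw_rows)
region_means = model(transform(clean(raw_rows)))
print(report)
print(region_means)
```

Each stage is a separate function so that the stages can be chained or swapped independently, which loosely mirrors how an analysis is split into steps.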
5. RapidMiner
❏ A powerful, easy-to-use, and intuitive graphical user interface for designing analytical processes
❏ Formerly known as YALE (Yet Another Learning Environment)
❏ Developed in 2001 by Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer
❏ YALE was renamed RapidMiner in 2007
6. What Can RapidMiner Offer?
❏ An integrated environment written in Java for
- machine learning
- data mining
- text mining
- predictive analytics
- business analytics
❏ Provides a GUI to design and execute analytical workflows (Process, Operator)
❏ Provides 99% of an advanced analytical solution through template-based frameworks that speed delivery and reduce
7. Why Use RapidMiner?
❏ Powerful due to its learning operators and operator framework
❏ Easy to extend for Java programmers
❏ Stable
❏ Scalable
❏ Algorithms are optimized for speed
❏ Great visualization tools
❏ Available tools for data preprocessing
❏ Strong community (http://rapid-i.com/)
❏ Good debugging support
❏ Wide range of supported file formats
❏ Easy Hadoop integration
8. Download RapidMiner
❏ Open source
❏ Easy to download and set up
❏ Link: https://rapidminer.com/products/studio/
10. Building a Model Using RapidMiner
1. Importing Data
2. Visualizing Data
3. Creating a Model
4. Applying a Model
5. Evaluation of the Model
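In RapidMiner these five steps are carried out in the graphical interface. As a rough analogy only, and not RapidMiner's own API, the same flow can be sketched in plain Python. The inline CSV data, the function names `create_model` and `apply_model`, and the choice of a 1-nearest-neighbor classifier are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter

# 1. Importing data: a hypothetical inline CSV stands in for a real file.
CSV_TEXT = """height_cm,weight_kg,label
150,50,small
155,53,small
160,58,small
175,75,large
180,82,large
185,90,large
158,55,small
178,80,large
"""
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
data = [((float(r["height_cm"]), float(r["weight_kg"])), r["label"]) for r in rows]

# 2. Visualizing data: a text summary of the class balance in place of a chart.
label_counts = Counter(label for _, label in data)
print("Examples per label:", dict(label_counts))

# Hold out the last two rows so the model is evaluated on unseen examples.
train, test = data[:6], data[6:]


# 3. Creating a model: for 1-nearest-neighbor, "training" just stores the examples.
def create_model(examples):
    return list(examples)


# 4. Applying a model: predict the label of the closest stored example.
def apply_model(model, features):
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = min(model, key=lambda example: squared_distance(example[0], features))
    return nearest[1]


# 5. Evaluation of the model: accuracy on the held-out rows.
model = create_model(train)
predictions = [apply_model(model, features) for features, _ in test]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)
print("Predictions:", predictions)
print("Accuracy:", accuracy)
```

Holding out part of the data before step 3 matters: evaluating on the same rows the model was built from would overstate its accuracy.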
[Figure: data analysis process cycle (the CRISP-DM phases) around a central Data store: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment]