This document provides an overview of data mining. It defines data mining as extracting meaningful information from large data sets. It describes the typical data mining process, which includes problem definition, data gathering/preparation, model building/evaluation, and knowledge deployment. It also outlines several common data mining techniques like neural networks, clustering, decision trees, and support vector machines. Finally, it discusses applications of data mining in business, science, security, marketing, and spatial data analysis.
Data Mining
1. LOGO www.themegallery.com
DATA MINING
Dayanand Academy of Management Studies
2.
Contents
1 Data Mining Introduction
2 Data Mining Procedures
3 Data Mining Techniques
4 Data Mining Application
3.
Data Mining
Introduction
4.
Introduction
What is Data Mining?
Data mining is the process of extracting
meaningful information from data
warehouses. That information can be useful for
maximizing profit, detecting fraud, marketing,
and scientific research.
5.
Data Warehouses:
According to Stanford University,
"A Data Warehouse is a repository of integrated
information, available for queries and analysis. Data
and information are extracted from heterogeneous
sources as they are generated. This makes it much
easier and more efficient to run queries over data
that originally came from different sources."
6.
Data Mining Steps
First Step: Problem Definition
Second Step: Data Gathering
Third Step: Model Building
Fourth Step: Knowledge Deployment
7.
Data Mining Procedures
Problem Definition:-
A data mining project begins with understanding the
objectives and requirements of the business project.
The project must first be specified from a business
point of view. It can then be formulated as a data
mining problem, and a preliminary plan can be developed.
Data Gathering & Preparation:-
This task involves data collection and exploration. The
data is prepared by removing unnecessary information,
detecting duplicate records, and supplying new
information where it is needed.
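The three preparation tasks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`id`, `age`, `city`, `session_token`) and the choice to fill a missing value with the mean are all hypothetical and do not come from the slides.

```python
# Hypothetical raw records; field names and values are illustrative only.
raw_records = [
    {"id": 1, "age": 34, "city": "Pune", "session_token": "a1"},
    {"id": 2, "age": None, "city": "Delhi", "session_token": "b2"},
    {"id": 1, "age": 34, "city": "Pune", "session_token": "c3"},  # duplicate of id 1
    {"id": 3, "age": 52, "city": "Delhi", "session_token": "d4"},
]


def prepare(records, drop_fields, key_field, fill_field):
    """Drop unneeded fields, remove duplicate records, fill missing values."""
    seen = set()
    cleaned = []
    for rec in records:
        # 1. Remove unnecessary information.
        row = {k: v for k, v in rec.items() if k not in drop_fields}
        # 2. Detect duplicates by key and keep only the first occurrence.
        if row[key_field] in seen:
            continue
        seen.add(row[key_field])
        cleaned.append(row)
    # 3. Supply new information: fill missing values with the mean of the
    #    known ones (an assumed strategy, one of many possible).
    known = [r[fill_field] for r in cleaned if r[fill_field] is not None]
    mean = sum(known) / len(known)
    for r in cleaned:
        if r[fill_field] is None:
            r[fill_field] = mean
    return cleaned


prepared = prepare(
    raw_records, drop_fields={"session_token"}, key_field="id", fill_field="age"
)
for row in prepared:
    print(row)
```

In practice the duplicate key and the fill strategy depend on the business problem defined in the first step.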
8.
Data Mining Procedures
Model Building and Evaluation:-
In this phase, various modeling techniques are applied
to build a model that is likely to satisfy the
requirements. The model is then evaluated by
comparing its results with the originally stated project
goal.
Knowledge Deployment:-
Knowledge deployment is the use of data mining within a
target environment. In the deployment phase, insight and
actionable information can be derived from data.
9.
History of Data Mining Techniques
1950s: Neural Networks, Clustering
1960s: Decision Trees
1980s: Support Vector Machines
1999: Cross-Industry Standard Process for Data Mining (CRISP-DM 1.0)
2004: Java Data Mining Package (JDM 1.0)
10.
Data Mining
Neural Networks
11.
Neural Networks:-
Neural networks are non-linear statistical data modeling
tools. They can be used to model complex relationships
between inputs and outputs or to find patterns in data.
Using neural networks as a tool, data warehousing firms
are extracting information from datasets in the process
known as data mining.
A neural network is a technique derived from artificial
intelligence research that uses generalized regression
and provides methods to carry it out.
It is self-adaptive and uses a learning method.
12.
Processing of Neural Networks
Input data is presented to the
network and propagated
through the network until it
reaches the output layer. The
predicted output is subtracted
from the actual output, and an
error value for the network is
calculated (supervised
learning). The error is then
propagated backward through
the network to adjust its
weights (back propagation).
Once back propagation has
finished, the forward process
starts again, and this cycle
continues until the error
between predicted and actual
outputs is minimized.
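The forward pass, error calculation and weight update described above can be sketched with the smallest possible network: a single sigmoid neuron. The training data (the logical OR function), the learning rate and the epoch count are assumptions chosen for illustration. A real network has hidden layers, and back propagation passes the error through each of them in turn.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Assumed training data for logical OR: (inputs, actual output).
samples = [
    ((0.0, 0.0), 0.0),
    ((0.0, 1.0), 1.0),
    ((1.0, 0.0), 1.0),
    ((1.0, 1.0), 1.0),
]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.5  # illustrative value


def forward(inputs):
    """Forward pass: weighted sum of the inputs passed through the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def total_error():
    """Sum of squared errors over all training samples."""
    return sum((target - forward(inputs)) ** 2 for inputs, target in samples)


initial_error = total_error()

for epoch in range(2000):
    for inputs, target in samples:
        predicted = forward(inputs)   # forward pass
        error = target - predicted    # actual output minus predicted output
        # Backward pass: gradient of the squared error through the sigmoid.
        delta = error * predicted * (1.0 - predicted)
        for i, x in enumerate(inputs):
            weights[i] += learning_rate * delta * x
        bias += learning_rate * delta

final_error = total_error()
predictions = [round(forward(inputs)) for inputs, _ in samples]
print(initial_error, final_error, predictions)
```

Each pass through the loop is one forward/backward cycle. The loop stops after a fixed number of epochs here, where a production system would usually stop once the error falls below a chosen threshold.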
13.
Data Mining
Clustering
14.
Clustering
Clustering is used to segment the data.
Clustering models segment records into groups
whose members are similar to each other and
distinct from the members of other groups.
Typical applications of clustering are online
document classification and clustering web log
data to discover groups of similar access
patterns. Pattern recognition, spatial data
analysis, and image processing are other
applications in scientific areas.
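As one concrete way to segment records into groups, the sketch below uses k-means. The slides do not name a specific clustering algorithm, so k-means is an assumption made for illustration. The six two-dimensional points and the starting centroids are hypothetical too.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iterations=100):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    centroids = list(centroids)
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the algorithm has converged
        assignments = new_assignments
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical records, e.g. two measurements taken per web session.
points = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5),
          (9.0, 10.0), (10.0, 9.0), (9.5, 9.5)]

assignments, centroids = k_means(points, centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

The result depends on the starting centroids, so real applications usually run the algorithm several times from different starting points.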
16.
Data Mining
Decision Trees
17.
Decision Trees
The Decision Tree algorithm is based on
conditional probabilities. Decision trees
generate rules. A rule is a conditional statement
that can easily be understood by humans and
easily used within a database to identify a set of
records.
The Decision Tree algorithm produces accurate
and interpretable models with relatively little
user intervention. The algorithm can be used for
both binary and multi-class classification
problems.
18.
Decision Trees
Node 1 shows married persons and 0 describes single persons.
Node 1 has 712 records (cases). Of these, 382 have a target of 0 (not
likely to increase spending), and 330 have a target of 1 (likely to increase
spending).
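The counts for Node 1 can be turned into the conditional probabilities that a decision tree works with. The sketch below computes the predicted target and its confidence for the node. The Gini impurity is added as one common measure of how mixed a node is. It is an assumption here, since the slides do not say which measure the algorithm uses.

```python
# Counts taken from the Node 1 description above.
node_1 = {"records": 712, "target_0": 382, "target_1": 330}


def node_summary(node):
    """Return the predicted target, its confidence and the Gini impurity."""
    p0 = node["target_0"] / node["records"]  # P(target = 0 | node)
    p1 = node["target_1"] / node["records"]  # P(target = 1 | node)
    predicted = 0 if p0 >= p1 else 1
    confidence = max(p0, p1)
    gini = 1.0 - (p0 ** 2 + p1 ** 2)  # 0 = pure node, 0.5 = evenly mixed
    return predicted, confidence, gini


predicted, confidence, gini = node_summary(node_1)
print(f"predicted target: {predicted}, "
      f"confidence: {confidence:.3f}, gini: {gini:.3f}")
```

A node this close to an even split carries a weak rule, which is why a tree would normally split it further on another attribute.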
19.
Data Mining
Support Vector
Machines
20.
Support Vector Machine
An optimally defined surface.
Linear or non-linear input space.
A linear or high-dimensional feature space, which
is defined by a kernel function.
SVM involves fitting a hyperplane such
that the largest margin is formed between two
classes of vectors while minimizing the effects
of classification errors, so that the data can be
classified into groups.
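The idea of fitting a maximum-margin hyperplane can be sketched for the linear case. This is a simplified illustration, not a full SVM solver: it trains by subgradient descent on the regularized hinge loss instead of solving the usual quadratic program, and it uses no kernel function. The data points, learning rate, regularization strength and epoch count are all assumed values.

```python
# Hypothetical, linearly separable 2-D points with labels +1 / -1.
data = [
    ((2.0, 3.0), 1), ((3.0, 3.0), 1), ((3.0, 4.0), 1),
    ((-2.0, -3.0), -1), ((-3.0, -3.0), -1), ((-3.0, -4.0), -1),
]


def train_linear_svm(data, learning_rate=0.01, regularization=0.01, epochs=200):
    """Fit w and b of the hyperplane w.x + b = 0 by subgradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization shrinks w, which corresponds to a wider margin.
            w = [wi - learning_rate * regularization * wi for wi in w]
            if margin < 1:
                # The point is inside the margin or misclassified:
                # take a step along the hinge-loss subgradient.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane it falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


w, b = train_linear_svm(data)
print(w, b)
print([predict(w, b, x) for x, _ in data])
```

For data that is not linearly separable in the input space, a kernel function maps it to a feature space where a separating hyperplane can be found. That case needs a different training procedure from the one shown here.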
28.
Data Mining
Cross-Industry
Standard Process
for Data Mining
29.
CRISP-DM 1.0
Phases, arranged as a cycle around the data:
• Business Understanding
• Data Understanding
• Data Preparation
• Data Modeling
• Evaluation
• Deployment
30.
Data Mining
Data Mining
Applications
31.
Data Mining Applications
• Online Searching
• Business
• Spatial Data Mining
• Science
• Security
• Marketing
32.
Data Mining Applications
BUSINESS PERSPECTIVE:-
Data mining helps businesses extract information from
sources such as print media, television, the internet, and
investments. Data mining tools predict future trends and
behavior, allowing businesses to make proactive,
knowledge-driven decisions that increase the revenue and
profit of the company.
SCIENTIFIC PERSPECTIVE:-
The practical perspective describes how techniques from
data mining can be used to address and resolve
modern problems in science and engineering
domains.
33.
Data Mining Applications
SECURITY PERSPECTIVE:-
Data mining can prevent or detect fraud, such as a transaction
that shows the wrong geographical domain, and can identify a
stolen credit card from its transaction history. It can help make
online transactions more secure and reliable by analyzing
previous transaction records.
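A minimal sketch of checking a new transaction against previous records follows. The transaction fields, the sample history and the two rules (an unseen country, or an amount more than three times the historical average) are hypothetical and only illustrate the idea. Real fraud detection systems use far richer models.

```python
# Hypothetical transaction history for one card; values are illustrative.
history = [
    {"amount": 40.0, "country": "IN"},
    {"amount": 55.0, "country": "IN"},
    {"amount": 35.0, "country": "IN"},
    {"amount": 60.0, "country": "IN"},
]


def is_suspicious(transaction, history, amount_factor=3.0):
    """Flag a transaction from an unseen country or far above the usual amount."""
    known_countries = {t["country"] for t in history}
    average_amount = sum(t["amount"] for t in history) / len(history)
    unusual_location = transaction["country"] not in known_countries
    unusual_amount = transaction["amount"] > amount_factor * average_amount
    return unusual_location or unusual_amount


print(is_suspicious({"amount": 50.0, "country": "IN"}, history))   # usual pattern
print(is_suspicious({"amount": 45.0, "country": "BR"}, history))   # new country
print(is_suspicious({"amount": 900.0, "country": "IN"}, history))  # large amount
```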
SPATIAL DATA MINING:-
Geo-marketing companies segment customers
by spatial location, using data mining on
their purchase and subscription history.
34.
WEBSITE PROMOTION:-
A website owner can attract more visitors by
mining visitor data and then modifying the site layout on the
basis of the extracted information.