Automating Google Workspace (GWS) & more with Apps Script
A review on data mining
1. A REVIEW ON DATA MINING
1 Er.Nancy, perusing M-tech in CSE (2012-14), GNDEC,
Ludhiana, India.
2. ABSTRACT
Data Mining is an approach to discover or extract
knowledge from Databases.
Generally to store databases, enterprises make data
warehouses and data marts.
Data warehouses and data marts contain large
mountains of data.
The mountains represent valuable resource to the
enterprise.
Due to extracting knowledge from large data
warehouses or depositories, data mining plays great
role in various fields of machine learning.
Data mining is used in education, medical, scientific,
business fields. Various algorithms and programs are
2
used for data mining approach.[1]
3. INTRODUCTION
Now the world of information technology, all things are
become automated.
Information technology is used in every field of human
life such as business, engineering, medical,
mathematical, scientific.
All these fields of human life have lead to the large
volume of data storage in various formats such as
records, files, documents, images, sounds, recordings,
videos and many new data.
Collection of related data is also known as database. To
extract correct data from large databases, the proper
mechanism is used, that mechanism is also known as
DATA MINING. It is a knowledge discovery technique
(KDD).[2] 3
4. PROCESS
(1) Selection
(2) Pre-processing
(3) Transformation
(4) Data Mining
(5) Interpretation/Evaluation.[2,3]
4
6. PROCEDURE OF DATA MINING
1. Data cleaning: It is also known as data cleansing; in
this phase noise data and irrelevant data are removed
from the collection.
2. Data integration: In this stage, multiple data sources,
often heterogeneous, are combined in a common
source.
3. Data selection: The data relevant to the analysis is
decided on and retrieved from the data collection.
4. Data transformation: It is also known as data
consolidation; in this phase the selected data is
transformed into forms appropriate for the mining
procedure. 6
7. CONTD…….
5. Data mining: It is the crucial step in which clever
techniques are applied to extract potentially useful patterns.
6. Pattern evaluation: In this step, interesting patterns
representing knowledge are identified based on given
measures.
7. Knowledge representation: It is the final phase in
which the discovered knowledge is visually presented to
the user. This essential step uses visualization
techniques to help users understand and interpret the
data mining results.[4,2]
7
8. DATA MINING LIFE CYCLE
The Cross Industry Standard Process for Data
Mining (CRISP-DM) which defines six phases:
(1) Business Understanding
(2) Data Understanding
(3) Data Preparation
(4) Modeling
(5) Evaluation
(6) Deployment.[6]
8
9. CONTD……
1. Business Understanding: This phase focuses on
understanding the project objectives and requirements
from a business perspective, then converting this
knowledge into a data mining problem definition and a
preliminary plan designed to achieve the objectives.
2. Data Understanding: It starts with an initial data
collection, to get familiar with the data, to identify data
quality problems, to discover first insights into the data
or to detect interesting subsets to form hypotheses for
hidden information.
3. Data Preparation: It covers all activities to construct
the final dataset from the initial raw data.
9
10. CONTD……
4. Modeling: In this phase, various modeling techniques
are selected and applied and their parameters are
calibrated to optimal values.
5. Evaluation: In this stage the model is thoroughly
evaluated and reviewed. The steps executed to
construct the model to be certain it properly achieves
the business objectives.
6. Deployment: The purpose of the model is to increase
knowledge of the data, the knowledge gained will need
to be organized and presented in a way that the
10
customer can use it.[6,8]
11. TYPES OF DATA MINING SYSTEM
Classification of data mining systems according to the
type of data source mined: This classification is according to
the type of data handled such as spatial data, multimedia
Data, time-series data, text data, World Wide Web, etc.
Classification of data mining systems according to the
data model: This classification based on the data model
involved such as relational database, object-oriented
database, Data warehouse, transactional database, etc.
Classification of data mining systems according to the
kind of knowledge discovered: This classification based on
the kind of knowledge discovered or data mining
functionalities, such as characterization, discrimination,
association, classification, clustering, etc
11
12. CONTD……
Classification of data mining systems according to
mining techniques used: This classification is
according to the data analysis approach used such as
machine learning, Neural networks, genetic algorithms,
statistics, visualization, database oriented or data
warehouse-oriented, etc.[5,2]
12
13. DATA MINING METHODS
On-Line Analytical Factor analysis
Processing (OLAP) Neural Networks
Classification Regression analysis
Association Rule Mining Structured data analysis
Temporal Data Mining Sequence mining
Time Series Analysis Text mining
Spatial Mining, Drug Discovery
Anomaly/outlier/change Exploratory data analysis
detection
Predictive analytics
Association rule learning
Web Mining
Cluster analysis
Data analysis etc.[1,8] 13
Decision trees
14. DATA MINING APPLICATIONS
Games
Business
Science and Engineering
Human rights
Spatial data mining.[5]
14
15. CHALLENGES
Sensor data mining
Pattern mining
Visual data mining
Subject based data mining
Music data mining
The Digital Library retrieves.[1,2,5]
15
16. CONCLUSION
In this paper I briefly reviewed data mining, background
of data mining, methods, life cycle model, and types of
data mining, application of it.
This review will be helpful to you to easily understand
what data mining is, previously used data mining
techniques.
Now days use data mining techniques and also process
of data mining.
In this I completely, explain applications and types of
data mining. [1,2,4,5]
16
17. FUTURE SCOPE
Data mining is a very wide concept. It contains many
concepts such as various algorithms to extract
knowledge from large databases.
Soft Computing techniques like Fuzzy logic, Neural
Networks and Genetic Programming which contains
Complex data objects Includes high dimensional, high
speed data streams, sequence, noise in the time series,
graph, Multi instance objects, Multi represented objects
and temporal data etc.
These are used in Business, Web, Medical diagnosis,
Scientific and Research analysis fields (bio, remote
sensing etc…), Social networking etc.[1,7]
17
18. REFERENCES
1. WIKIPEDIA.Com
2. Mr. S. P. Deshpande1 and Dr. V. M. Thakar
International Journal of Distributed and Parallel systems
(IJDPS) Vol.1, No.1, September 2010 DOI.
3. A Review on Data mining from Past to the Future.
Venkatadri.M Research Scholar, Dept. of Computer
Science, Dravidian University, India. Lokanatha C.
Reddy Professor, Dept. of Computer Science,
International Journal of Computer Applications (0975 –
8887) Volume 15– No.7, February 2011.
18
19. CONTD…..
4. Data Mining for High Performance Data Cloud using
Association Rule Mining.1 T.V.Mahendra 2N.Deepika
3N.Keasava Rao Professor & HOD, IT, Narayana Engg.
College, Nellore, AP, India.2 Sr.Assistant Professor,
Dept. of ISE, New Horizon College of Engineering,
Bangalore, India 3 Associate Professor, IT, Narayana
Engg. College, Gudur, AP, India.
5. Data Mining and KDD: A Shifting Mosaic By Joseph
M. Firestone, Ph.D. White Paper No. Two March 12,
1997 the Idea.
19
20. CONTD….
6. Data mining and static’s: what’s the connection?
Jerome h. Friedman.
7.www.Google.Com.
8.www. IEEEXPLORE.Com.
20