Data mining
By Mamotlatsi Seotsa
Roll: 37
What is data mining?
It is the extraction of previously unknown, valid, novel and
understandable information or patterns from data held in
repositories or sources such as:
• Databases
• Text files
• Social networks
• Computer simulations
The information obtained should be usable by any
organization or enterprise for business decision making.
Why data mining?
• Organizations such as banks and e-commerce
stores collect large amounts of data and store it
in data warehouses.
• The need may arise to explore this data and find
possible solutions to known problems. These
solutions may take the form of patterns found in
previous data. The knowledge obtained can
improve decision making in organizations,
which is why data mining is needed.
Applications of Data mining
Industry | Application
Finance | Credit card analysis
Insurance | Claims and fraud analysis
Telecommunications | Call record analysis
Transport | Logistics management
Consumer goods | Promotion analysis
Scientific research | Image, video, speech
Components of data mining
• Knowledge Discovery
Concrete information gleaned from known data:
facts you may not have known before but which
are supported by the recorded data.
• Knowledge prediction
Uses known data to forecast future trends and
events, for example stock market predictions.
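As a rough illustration of knowledge prediction, the sketch below fits a straight-line trend to past values and extends it one step ahead. The linear trend is only a simple stand-in chosen for this example; the function names and the sales figures are invented, and real forecasts such as stock market predictions need far richer models.

```python
def fit_trend(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def forecast(values, steps_ahead=1):
    """Extend the fitted trend line beyond the last observation."""
    intercept, slope = fit_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


# Invented monthly sales figures (known data).
monthly_sales = [100.0, 110.0, 120.0, 130.0]
print(forecast(monthly_sales))  # trend value for the next month
```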
Steps in data mining
1. Data Integration
This involves combining data residing in different sources and
providing users with a unified or combined view of these data.
2. Data Selection
This is the process of determining the appropriate data type and
source as well as suitable instruments to collect data.
3. Data cleaning
Data cleaning is the process of detecting and correcting corrupt or
inaccurate records in a record set, table or database. It involves
identifying incomplete, incorrect, inaccurate or irrelevant parts of
the data and replacing, modifying or deleting this dirty data.
4. Data transformation
Data transformation converts a set of data values
from the data format of a source data system into
the data format of a destination data system
5. Data mining
Here techniques are applied to extract the patterns
of interest on which decisions will be based.
6. Pattern evaluation
In pattern evaluation, the extracted patterns are
analyzed against given measures to identify the
ones that are truly of interest.
7. Knowledge presentation
This is the final phase, in which the discovered
knowledge is presented visually to the user.
This phase uses easily understood presentation
techniques to help users understand and interpret
the data mining results.
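The seven steps above can be sketched as a small pipeline. This is only a toy illustration under stated assumptions: the purchase records, the function names and the choice of "support" (the share of baskets containing a pair of items) as the evaluation measure are all invented for the example and are not prescribed by the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records from two sources (all data invented).
store_sales = [
    {"basket": 1, "item": "Bread"}, {"basket": 1, "item": "milk "},
    {"basket": 2, "item": "bread"}, {"basket": 2, "item": "butter"},
    {"basket": 2, "item": None},
]
web_sales = [
    {"basket": 3, "item": "Bread"}, {"basket": 3, "item": "Milk"},
    {"basket": 4, "item": "milk"},
]


def integrate(*sources):
    """Step 1: combine records from several sources into one view."""
    return [row for source in sources for row in source]


def select(rows):
    """Step 2: keep only the fields relevant to the analysis."""
    return [(row["basket"], row["item"]) for row in rows]


def clean(pairs):
    """Step 3: drop records whose item is missing or blank."""
    return [(b, i) for b, i in pairs if isinstance(i, str) and i.strip()]


def transform(pairs):
    """Step 4: normalise item names and group them into one set per basket."""
    baskets = {}
    for basket_id, item in pairs:
        baskets.setdefault(basket_id, set()).add(item.strip().lower())
    return list(baskets.values())


def mine(baskets):
    """Step 5: count how often each pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    return counts


def evaluate(pair_counts, baskets, min_support=0.5):
    """Step 6: keep only pairs whose support reaches the threshold."""
    n = len(baskets)
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}


def present(patterns):
    """Step 7: render the surviving patterns as readable lines."""
    return [f"{a} + {b}: bought together in {s:.0%} of baskets"
            for (a, b), s in sorted(patterns.items())]


baskets = transform(clean(select(integrate(store_sales, web_sales))))
patterns = evaluate(mine(baskets), baskets)
for line in present(patterns):
    print(line)
```

Each function maps to one numbered step, so a step can be swapped out (for example, a different measure in `evaluate`) without touching the others.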
[Figure: data mining process diagram, based on
Knowledge Discovery in Databases (KDD)]
Advantages of data mining
• Marketing or Retailing
Data mining helps marketing companies build
models based on historical data to predict who will
respond to a new marketing campaign. With these
results, marketers can take an appropriate
approach to selling profitable products to
targeted customers.
Appropriate production arrangements can also be
made based on marketing analysis, making it easier
for customers to buy the products they purchase
frequently.
• Banking or Finance
Data mining gives financial institutions
information about loans and credit reporting.
By building a model from historical customer
data, a bank or financial institution can
distinguish good loans from bad ones. Moreover,
data mining helps banks detect fraudulent credit
card transactions to protect credit card owners.
• Manufacturing
By applying data mining to operational engineering
data, manufacturers can detect faulty equipment
and determine optimal control parameters.
• Governments
Data mining helps government agencies by
digging through and analyzing records of financial
transactions to build patterns that can detect
money laundering or criminal activities.
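To make the fraud detection idea concrete, here is a minimal sketch that flags transactions whose amount lies far from a customer's historical average, measured in standard deviations. The z-score rule, the threshold of 3, the function name and the amounts are all assumptions made for illustration; production fraud systems use many more signals than the amount alone.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return the new amounts lying more than `threshold` standard
    deviations away from the mean of the historical amounts."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        # All historical amounts are identical: anything different is unusual.
        return [a for a in new_amounts if a != mu]
    return [a for a in new_amounts if abs(a - mu) / sigma > threshold]


# Invented card history and two new transactions to check.
history = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0, 24.0, 21.0]
print(flag_unusual(history, [23.0, 950.0]))
```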
Disadvantages of data mining
• Privacy issues
The use of the internet, with social networks, e-commerce,
forums, blogs etc., raises many privacy concerns. People are
afraid that their personal information is collected and used in
unethical ways that could cause them trouble.
• Security issues
Businesses hold information about their employees and
customers, including social security numbers, birthdays,
payroll etc. If hackers access and steal this data, the
exposure of so much personal information may lead to an
unsafe environment, especially if the information obtained
involves finances.
• Misuse of information
Information may be exploited by unethical
people or businesses to take advantage of
vulnerable people or to discriminate against a
group of people.
Data mining techniques are also not perfectly
accurate, so if inaccurate information is used for
decision making it may cause serious
consequences.
Current research
• Supercomputer data mining
The aim of the project is to produce a
supercomputing data mining resource for use by the
United Kingdom academic community. It utilizes a
number of advanced machine learning and statistical
algorithms, and an ensemble machine approach will
be used to exploit the large-scale parallelism
possible in supercomputing. This purpose is
embodied in the following objectives:
• To develop a massively parallel approach for commonly
used statistical and machine learning techniques for
exploratory data analysis.
• To develop a massively parallel approach to the use of
evolutionary computing techniques for feature
creation and selection.
• To develop a massively parallel approach to the use of
evolutionary computing techniques for data modeling.
• To develop a massively parallel approach to the use of
ensemble machines for data modeling, consisting of
many well-known machine learning algorithms.
• To develop an appropriate supercomputing
infrastructure to support the use of such
advanced machine learning techniques with
large datasets.
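The ensemble idea in the objectives above can be sketched on a very small scale: several simple models are trained independently, so the training can run concurrently, and their predictions are combined by majority vote. Everything here is an assumption made for illustration: the decision-stump learner, the bootstrap sampling, the function names and the data are invented, and a thread pool stands in for the supercomputing infrastructure the project describes.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_stump(sample):
    """Pick the threshold on a 1-D feature that classifies the sample best,
    predicting class 1 when x >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold, _ in sample:
        correct = sum((x >= threshold) == bool(y) for x, y in sample)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def train_member(job):
    """Train one ensemble member on a bootstrap resample of the data."""
    data, seed = job
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]
    return train_stump(sample)


def train_ensemble(data, n_members=5, max_workers=4):
    """Train all members concurrently; each job is independent of the rest."""
    jobs = [(data, seed) for seed in range(n_members)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_member, jobs))


def predict(thresholds, x):
    """Combine the members' predictions by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Invented 1-D data set: (feature value, class label).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]
ensemble = train_ensemble(data)
print(predict(ensemble, 0.5), predict(ensemble, 9.5))
```

Because no member depends on another, the same structure could be spread across processes or machines; threads are used here only to keep the sketch short.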
• Medical data mining
It is estimated that 150 million people have diabetes
worldwide, and that this number may double by 2025.
There is no cure for diabetes; however, the condition can
be managed, and early treatment can minimize its
complications. A key factor in providing early
treatment is identifying those most at risk of
complications at an early stage. The data mining group of
the University of East Anglia has been working in this area
for some time on a collaborative project with St. Thomas
Hospital London.
• St. Thomas Hospital London has stored patient
information in a computerized clinical records
system since 1973.
• In their research they identified factors that
were associated with early mortality. Current
research and teaching on outcomes in people
with diabetes identify cardiac risk factors as
the most likely indicators of early
mortality. The data mining study occurred in
parallel with an independent analysis of a
cohort of 1000 patients with diabetes who were
re-examined after 10 years. This analysis also
identified peripheral neuropathy as the most
important risk factor for premature death.
• Time series data mining of electricity usage
patterns
A rollout of intelligent metering systems is set
to take place over the next decade and will
result in over 27 million households being
equipped with meters that can monitor
electricity consumption at 15-minute intervals
and facilitate easy communication of usage data.
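As a small illustration of working with 15-minute meter readings, the sketch below sums them into hourly totals and finds the hour with the highest consumption. The function names and the readings (in kWh) are invented for the example; a real study would look at many households and longer periods.

```python
def hourly_totals(readings_kwh):
    """Sum consecutive groups of four 15-minute readings into hourly totals."""
    if len(readings_kwh) % 4 != 0:
        raise ValueError("expected a whole number of hours (4 readings each)")
    return [sum(readings_kwh[i:i + 4]) for i in range(0, len(readings_kwh), 4)]


def peak_hour(readings_kwh):
    """Return the index (0-23 for one day) of the hour with the highest usage."""
    totals = hourly_totals(readings_kwh)
    return max(range(len(totals)), key=totals.__getitem__)


# One invented day: 96 readings, with higher usage between 18:00 and 19:00.
day = [0.1] * 96
day[72:76] = [0.5, 0.6, 0.7, 0.6]
print(peak_hour(day))
```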
Future research
• In the future, data mining is highly likely to evolve
into predictive analysis. Data mining applications that
will enrich human life in fields such as business,
education, medicine, science and politics include:
• Data mining for security and privacy preservation. For
example, records of electronic communication such as
email logs and web logs capture human processes.
• Mining of financial data and its challenges. For
example, investors use models of asset prices to gain
bigger profits.
• Detecting eco-system disturbances.
• Distributed data mining. Distributed algorithms
are developed for tasks such as association
analysis and parallel decision tree construction.
• Text mining: an example is opinion or
questionnaire mining, where the objective is
to obtain useful information from responses.
• Image mining: An example is the classification
of retinal image data and magnetic resonance
imaging scan data to identify disorders.
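For the opinion or questionnaire mining example, a very small sketch is to score each free-text response against lists of positive and negative words and count the outcomes. The word lists, the function names and the responses are invented; this word-counting approach is only a stand-in for real text mining techniques.

```python
import re
from collections import Counter

# Tiny hand-made word lists (assumptions for this sketch only).
POSITIVE = {"good", "great", "helpful"}
NEGATIVE = {"bad", "slow", "confusing"}


def opinion_score(text):
    """Count positive words minus negative words in one response."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarise(responses):
    """Label each response by the sign of its score and count the labels."""
    labels = Counter()
    for response in responses:
        score = opinion_score(response)
        if score > 0:
            labels["positive"] += 1
        elif score < 0:
            labels["negative"] += 1
        else:
            labels["neutral"] += 1
    return dict(labels)


# Invented questionnaire answers.
responses = [
    "Great service, very helpful staff",
    "The website is slow and confusing",
    "It was fine",
]
print(summarise(responses))
```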
Conclusion
Information extracted through data mining is
valuable to organizations in many different
industries, such as the health sector, logistics,
marketing, finance and engineering. Through it
businesses become information brokers: they can
weed out fraud and bad customers while targeting
good business customers, promising markets
and cross-selling opportunities.
