Management Information System
Third Year Information Technology
Part 06: Data Warehousing and Data Mining
Tushar B Kute, Department of Information Technology, Sandip Institute of Technology and Research Centre, Nashik
http://www.tusharkute.com
Databases
  • Databases are built on the idea that data is one of the critical raw materials of the Information Age.
  • Information, which is derived from data, becomes the basis for decision making.
DSS Database Requirements: Database Schema
  • Must support complex and non-normalized data
  • Summarized and aggregated data
  • Multiple relationships
  • Queries must extract multidimensional time slices
  • Redundant data
DSS Database Requirements: Data Extraction and Filtering
  • DSS databases are created mainly by extracting data from operational databases and combining it with data imported from external sources.
  • This requires advanced data extraction and filtering tools that:
  • Allow batch / scheduled data extraction
  • Support different types of data sources
  • Check for inconsistent data and apply data validation rules
  • Support advanced data integration and resolve data formatting conflicts
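The extraction and filtering requirements above can be sketched in a few lines of Python. This is only an illustrative sketch: the two source lists, the field names, the date and decimal formats, and the "amount must be positive" validation rule are all assumptions made for the example and do not come from the slides.

```python
from datetime import date

# Hypothetical rows from two sources with conflicting formats (assumed data).
operational_rows = [
    {"id": 1, "amount": "120.50", "sold_on": "2024-01-15"},
    {"id": 2, "amount": "-5.00", "sold_on": "2024-01-16"},  # fails validation
]
external_rows = [
    {"id": 3, "amount": "80,25", "sold_on": "17/01/2024"},  # comma decimal, D/M/Y date
]

def parse_date(text):
    """Accept YYYY-MM-DD or DD/MM/YYYY and return a date object."""
    if "/" in text:
        day, month, year = (int(part) for part in text.split("/"))
    else:
        year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)

def transform(row):
    """Resolve formatting conflicts by normalizing one source row."""
    return {
        "id": row["id"],
        "amount": float(row["amount"].replace(",", ".")),
        "sold_on": parse_date(row["sold_on"]),
    }

def is_valid(row):
    """Validation rule (assumed): sale amounts must be positive."""
    return row["amount"] > 0

def extract_batch(*sources):
    """Extract from every source, normalize, and separate valid from rejected rows."""
    loaded, rejected = [], []
    for source in sources:
        for raw in source:
            row = transform(raw)
            (loaded if is_valid(row) else rejected).append(row)
    return loaded, rejected

loaded, rejected = extract_batch(operational_rows, external_rows)
```

In a real deployment `extract_batch` would be run by a scheduler, which corresponds to the batch / scheduled extraction point in the list.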
DSS Database Requirements: End-User Analytical Interface
  • Must support advanced data modeling and data presentation tools
  • Data analysis tools
  • Query generation
  • Must allow the user to navigate through the DSS
DSS Database Requirements: Size
  • Very large (terabytes)
  • Advanced hardware (multiple processors, multiple disk arrays, etc.)
Data Warehouse
  • The DSS-friendly data repository is the data warehouse.
  • Definition: an integrated, subject-oriented, time-variant, nonvolatile database that provides support for decision making.
Generic Two-Level Data Warehousing Architecture
  [Figure: E, T, L stages (extract, transform, load) feeding the warehouse]
  • One company-wide warehouse
  • Periodic extraction, so data in the warehouse is not completely current
Integrated
  • The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization.
  • Multiple sources
  • Diverse sources
  • Diverse formats
Subject-Oriented
  • Data is arranged and optimized to provide answers to questions from diverse functional areas.
  • Data is organized and summarized by topic: sales, marketing, finance, distribution, etc.
Time-Variant
  • The data warehouse represents the flow of data through time.
  • It can contain projected data from statistical models.
  • Data is uploaded periodically, and time-dependent data is then recomputed.
Nonvolatile
  • Once data is entered, it is never removed.
  • Represents the company's entire history
  • Near-term history is continually added to it.
  • Always growing
  • Must support terabyte databases and multiprocessors
  • Read-only database for data analysis and query processing
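One way to picture the time-variant and nonvolatile properties together is an append-only history table in which every periodic load adds dated rows and nothing is updated or deleted. The sketch below uses Python's standard `sqlite3` module; the table name `sales_history`, its columns, and the figures are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_history (snapshot_date TEXT, region TEXT, total REAL)"
)

def load_snapshot(snapshot_date, totals):
    """Append one periodic snapshot; earlier rows are never updated or deleted."""
    conn.executemany(
        "INSERT INTO sales_history VALUES (?, ?, ?)",
        [(snapshot_date, region, total) for region, total in totals.items()],
    )

# Two periodic loads (hypothetical figures).
load_snapshot("2024-01-31", {"North": 100.0, "South": 80.0})
load_snapshot("2024-02-29", {"North": 120.0, "South": 70.0})

# Read-only analysis: how did the North region change over time?
north_trend = conn.execute(
    "SELECT snapshot_date, total FROM sales_history "
    "WHERE region = 'North' ORDER BY snapshot_date"
).fetchall()

row_count = conn.execute("SELECT COUNT(*) FROM sales_history").fetchone()[0]
```

Because each load only inserts, the table keeps growing, which is why the slide stresses support for very large databases.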
Additional Characteristics
  • Web based
  • Relational / multidimensional
  • Client-server
  • Real time
  • Includes metadata
Data Marts
  • Small data stores
  • More manageable data sets
  • Targeted to meet the needs of small groups within the organization
  • A data mart is a small, single-subject subset of the data warehouse that provides decision support to a small group of people.
Operational Data Stores
  • An operational data store (ODS) provides a fairly recent form of customer information file (CIF).
  • This type of database is often used as an interim staging area for a data warehouse.
  • It is used for short-term decisions involving mission-critical applications rather than for the medium- and long-term decisions associated with an enterprise data warehouse (EDW).
Enterprise Data Warehouse
  • An enterprise data warehouse (EDW) is a large-scale data warehouse that is used across the enterprise for decision support.
  • Its large scale provides integration of data from many sources into a standard format for effective BI and decision support applications.
  • It provides data for many types of DSS, including CRM, SCM, BPM, BAM, PLM, KMS, and revenue management.
OLAP: Online Analytical Processing Tools
  • DSS tools that use multidimensional data analysis techniques
  • Support for a DSS data store
  • Data extraction and integration filter
  • Specialized presentation interface
Rules of a Data Warehouse
  • The data warehouse and operational environments are separated.
  • Data is integrated.
  • Contains historical data over a long period of time
  • Data is snapshot data, captured at a given point in time.
  • Data is subject-oriented.
Rules of a Data Warehouse (continued)
  • Mainly read-only, with periodic batch updates
  • The development life cycle follows a data-driven approach rather than the traditional process-driven approach.
  • Data contains several levels of detail: current, old, lightly summarized, and highly summarized.
Rules of a Data Warehouse (continued)
  • The environment is characterized by read-only transactions against very large data sets.
  • The system traces data sources, transformations, and storage.
  • Metadata is a critical component: source, transformation, integration, storage, relationships, history, etc.
  • Contains a chargeback mechanism for resource usage that enforces optimal use of data by end users
OLAP
  • Addresses the need for more intensive decision support
  • 4 main characteristics:
  • Multidimensional data analysis
  • Advanced database support
  • Easy-to-use end-user interfaces
  • Support for client/server architecture
Multidimensional Data Analysis Techniques
  • Advanced data presentation functions: 3-D graphics, pivot tables, crosstabs, etc.
  • Compatible with spreadsheets and statistical packages
  • Advanced data aggregation, consolidation, and classification across time dimensions
  • Advanced computational functions
  • Advanced data modeling functions
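A crosstab, one of the presentation functions named above, simply arranges one dimension along the rows and another along the columns, with an aggregate in each cell. The following is a minimal sketch in plain Python; the `(region, quarter, amount)` facts are hypothetical, and a real OLAP tool would do this against the warehouse rather than an in-memory list.

```python
# Hypothetical facts: (region, quarter, sales amount).
facts = [
    ("North", "Q1", 10.0),
    ("North", "Q2", 12.0),
    ("South", "Q1", 7.0),
    ("North", "Q1", 3.0),
]

def crosstab(rows):
    """Build {row_label: {column_label: total}} from (row, column, value) facts."""
    table = {}
    for row_label, col_label, value in rows:
        cells = table.setdefault(row_label, {})
        cells[col_label] = cells.get(col_label, 0.0) + value
    return table

table = crosstab(facts)

# Print the crosstab as a small text grid; missing cells are shown as 0.0.
columns = sorted({col for _, col, _ in facts})
print("region".ljust(8) + "".join(col.rjust(8) for col in columns))
for region in sorted(table):
    cells = "".join(f"{table[region].get(col, 0.0):8.1f}" for col in columns)
    print(region.ljust(8) + cells)
```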
Advanced Database Support
  • Advanced data access features
  • Access to many kinds of DBMSs, flat files, and internal and external data sources
  • Access to aggregated data warehouse data
  • Advanced data navigation (drill-downs and roll-ups)
  • Ability to map end-user requests to the appropriate data source
  • Support for very large databases
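Roll-up and drill-down move along a dimension hierarchy: rolling up aggregates to a coarser level, and drilling down breaks one aggregated member into the next finer level. The sketch below assumes a two-level location hierarchy (country, then city) and made-up sales figures; the function names are illustrative, not part of any OLAP product.

```python
# Hypothetical sales facts with a country -> city hierarchy.
sales = [
    {"country": "India", "city": "Nashik", "amount": 30.0},
    {"country": "India", "city": "Pune", "amount": 50.0},
    {"country": "Japan", "city": "Osaka", "amount": 20.0},
]

def roll_up(rows, level):
    """Aggregate amounts to the given hierarchy level."""
    totals = {}
    for row in rows:
        totals[row[level]] = totals.get(row[level], 0.0) + row["amount"]
    return totals

def drill_down(rows, level, value, finer_level):
    """Break one aggregated member down into the next finer level."""
    members = [row for row in rows if row[level] == value]
    return roll_up(members, finer_level)

by_country = roll_up(sales, "country")
india_by_city = drill_down(sales, "country", "India", "city")
```

Here `by_country` is the rolled-up view, and `india_by_city` is what a user would see after drilling into the India total.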
Easy-to-Use End-User Interface
  • Graphical user interfaces
  • OLAP features are much more useful if access to them is kept simple.
Client/Server Architecture
  • Provides the framework within which new systems are designed, developed, and implemented
  • Divides the OLAP system into several components that define its architecture
  • The components can reside on the same computer or be distributed among several computers.
OLAP Architecture: 3 Main Modules
  • GUI
  • Analytical processing logic
  • Data-processing logic
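The three modules can be pictured as three layers that only talk to their neighbor. The sketch below is a deliberately tiny illustration: an in-memory list stands in for the warehouse, plain text stands in for the graphical front end, and all function names and data are assumptions for the example.

```python
# Data-processing logic: retrieves raw rows (an in-memory list stands in
# for the data warehouse in this sketch).
def fetch_rows(region):
    data = [("North", 10.0), ("North", 5.0), ("South", 7.0)]
    return [row for row in data if row[0] == region]

# Analytical processing logic: turns raw rows into an analytical result.
def total_sales(region):
    return sum(amount for _, amount in fetch_rows(region))

# GUI: presents the result to the end user (plain text stands in for a
# graphical front end here).
def render(region):
    return f"Total sales for {region}: {total_sales(region):.2f}"

report = render("North")
```

Keeping the layers separate is what allows them to run on one computer or be split across client and server machines.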
OLAP Client/Server Architecture
Data Warehouse Implementation
  • An active decision support framework, not a static database
  • Always a work in progress
  • A complete infrastructure for company-wide decision support: hardware, software, people, procedures, and data
  • The data warehouse is a critical component of the modern DSS, but not the only critical component.
Data Mining
  • Discovers previously unknown data characteristics, relationships, dependencies, or trends
  • Typical data analysis relies on end users to:
  • Define the problem
  • Select the data
  • Initiate the data analysis
  • Typical data analysis therefore reacts to an external stimulus.
Data Mining
  • Proactive
  • Automatically searches the data for anomalies and possible relationships
  • Identifies problems before the end user does
  • Data mining tools analyze the data, uncover problems or opportunities hidden in data relationships, form computer models based on their findings, and then use the models to predict business behavior, all with minimal end-user intervention.
Data Mining
  • A methodology designed to perform knowledge-discovery expeditions over the database data with minimal end-user intervention
  • 3 stages of data: data, information, knowledge
Extraction of Knowledge from Data
4 Phases of Data Mining
  1. Data preparation
  • Identify the main data sets to be used by the data mining operation (usually the data warehouse).
  2. Data analysis and classification
  • Study the data to identify common data characteristics or patterns:
  • Data groupings, classifications, clusters, sequences
  • Data dependencies, links, or relationships
  • Data patterns, trends, deviations
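Finding clusters is one concrete form the analysis and classification phase can take. The slides do not name a clustering algorithm, so the sketch below swaps in plain k-means on one-dimensional data as an assumed example; the monthly spend values and the starting centroids are made up.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on numbers, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: put each value in the cluster of its nearest centroid.
        for value in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(value - centroids[i])
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical monthly spend per customer.
monthly_spend = [12, 15, 14, 90, 95, 100]
centroids, clusters = kmeans_1d(monthly_spend, centroids=[0.0, 50.0])
```

With these inputs the values separate into a low-spend and a high-spend group, which is the kind of data grouping the phase is meant to surface.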
4 Phases of Data Mining (continued)
  3. Knowledge acquisition
  • Uses the results of the data analysis and classification phase
  • The data mining tool selects the appropriate modeling or knowledge-acquisition algorithms:
  • Neural networks
  • Decision trees
  • Rule induction
  • Genetic algorithms
  • Memory-based reasoning
  4. Prognosis
  • Predict future behavior
  • Forecast business outcomes
  • Example: 65% of customers who did not use a particular credit card in the last 6 months are 88% likely to cancel the account.
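A prognosis of the kind in the credit card example can be read as a rate learned from history and then applied to a new customer. The sketch below is an assumed, highly simplified stand-in for the modeling algorithms listed above: it estimates the cancellation rate for inactive and active customers from hypothetical records and predicts cancellation when that rate reaches a threshold.

```python
# Hypothetical customer history (assumed data, not from the slides).
customers = [
    {"inactive_6m": True, "cancelled": True},
    {"inactive_6m": True, "cancelled": True},
    {"inactive_6m": True, "cancelled": False},
    {"inactive_6m": False, "cancelled": False},
    {"inactive_6m": False, "cancelled": False},
    {"inactive_6m": False, "cancelled": True},
]

def cancel_rate(records, inactive):
    """Share of customers in the given activity group who cancelled."""
    group = [r for r in records if r["inactive_6m"] == inactive]
    return sum(r["cancelled"] for r in group) / len(group)

def predict_cancel(customer, records, threshold=0.5):
    """Predict cancellation when the historical rate for the group reaches the threshold."""
    return cancel_rate(records, customer["inactive_6m"]) >= threshold

inactive_rate = cancel_rate(customers, True)
will_cancel = predict_cancel({"inactive_6m": True}, customers)
```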
Data Mining
  • Still a new technique
  • May find many meaningless relationships
  • Good at finding practical relationships:
  • Define customer buying patterns
  • Improve product development and acceptance
  • Etc.
  • Has the potential to become the next frontier in database development
Data Mining and Visualization
  • Data mining: knowledge discovery using a blend of statistical, AI, and computer graphics techniques
  • Goals:
  • Explain observed events or conditions
  • Confirm hypotheses
  • Explore data for new or unexpected relationships
  • Techniques:
  • Statistical regression
  • Decision tree induction
  • Clustering and signal processing
  • Affinity
  • Sequence association
  • Case-based reasoning
  • Rule discovery
  • Neural nets
  • Fractals
  • Data visualization: representing data in graphical / multimedia formats for analysis
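Affinity analysis, one of the techniques listed above, is commonly illustrated with market baskets and two measures: support (how often an itemset occurs) and confidence (how often the consequent appears when the antecedent does). The baskets below are hypothetical, and the sketch only scores a single rule rather than searching for rules.

```python
# Hypothetical purchase baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent) / support(antecedent)

bread_milk_support = support({"bread", "milk"})
bread_to_milk = confidence({"bread"}, {"milk"})
```

A rule such as "bread implies milk" is usually kept only if both measures clear thresholds chosen by the analyst.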
MIS 06 Data Warehousing and Mining

  • 1. Management information system Third Year Information Technology Part 06 Data Warehousing Data Mining Tushar B Kute, Department of Information Technology, Sandip Institute of Technology and Research Centre, Nashik http://www.tusharkute.com
  • 2. Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information, which is created by data, becomes the bases for decision making
  • 3. DSS Database Requirements DSS Database Scheme Support Complex and Non-Normalized data Summarized and Aggregate data Multiple Relationships Queries must extract multi-dimensional time slices Redundant Data
  • 4. DSS Database Requirements Data Extraction and Filtering DSS databases are created mainly by extracting data from operational databases combined with data imported from external source Need for advanced data extraction & filtering tools Allow batch / scheduled data extraction Support different types of data sources Check for inconsistent data / data validation rules Support advanced data integration / data formatting conflicts
  • 5. DSS Database Requirements End User Analytical Interface Must support advanced data modeling and data presentation tools Data analysis tools Query generation Must Allow the User to Navigate through the DSS Size Requirements VERY Large – Terabytes Advanced Hardware (Multiple processors, multiple disk arrays, etc.)
  • 6. Data Warehouse DSS – friendly data repository for the DSS is the DATA WAREHOUSE Definition: Integrated, Subject-Oriented, Time-Variant, Nonvolatile database that provides support for decision making
  • 7. Generic two-level data warehousing architecture L One, company-wide warehouse T E Periodic extraction  data is not completely current in warehouse
  • 8. Integrated The data warehouse is a centralized, consolidated database that integrated data derived from the entire organization Multiple Sources Diverse Sources Diverse Formats
  • 9. Subject-Oriented Data is arranged and optimized to provide answer to questions from diverse functional areas Data is organized and summarized by topic Sales / Marketing / Finance / Distribution / Etc.
  • 10. Time-Variant The Data Warehouse represents the flow of data through time Can contain projected data from statistical models Data is periodically uploaded then time-dependent data is recomputed
  • 11. Nonvolatile Once data is entered it is NEVER removed Represents the company’s entire history Near term history is continually added to it Always growing Must support terabyte databases and multiprocessors Read-Only database for data analysis and query processing
  • 12. Additional characteristics Web based. Relational / Multidimensional. Client-Server Real Time. Include Metadata.
  • 13. Data Marts Small Data Stores More manageable data sets Targeted to meet the needs of small groups within the organization Small, Single-Subject data warehouse subset that provides decision support to a small group of people
  • 14. Operational data stores It provides a fairly recent form of customer information file (CRF). This type of database is often used as an interim staging area for a data warehouse. It is used for short term decisions involving mission-critical applications rather than for the medium and long term decisions associated with EDW.
  • 15. Enterprise data warehouse It is a large scale data warehouse that is used across the enterprise for decision support. The large scale nature provide integration of data from many sources into standard format for effective BI and decision support applications. It is used to provide data for many types of DSS includes: CRM, SCM, BPM, BAM, PLM, KMS, Revenue management.
  • 16. OLAP Online Analytical Processing Tools DSS tools that use multidimensional data analysis techniques Support for a DSS data store Data extraction and integration filter Specialized presentation interface
  • 17. Rules of a Data Warehouse Data Warehouse and Operational Environments are Separated Data is integrated Contains historical data over a long period of time Data is a snapshot data captured at a given point in time Data is subject-oriented
  • 18. Rules of Data Warehouse Mainly read-only with periodic batch updates Development Life Cycle has a data driven approach versus the traditional process-driven approach Data contains several levels of detail Current, Old, Lightly Summarized, Highly Summarized
  • 19. Rules of Data Warehouse Environment is characterized by Read-only transactions to very large data sets System that traces data sources, transformations, and storage Metadata is a critical component Source, transformation, integration, storage, relationships, history, etc Contains a chargeback mechanism for resource usage that enforces optimal use of data by end users
  • 20. OLAP Need for More Intensive Decision Support 4 Main Characteristics Multidimensional data analysis Advanced Database Support Easy-to-use end-user interfaces Support Client/Server architecture
  • 21. Multidimensional Data Analysis Techniques: Advanced data presentation functions (3-D graphics, pivot tables, crosstabs, etc.). Compatible with spreadsheets and statistical packages. Advanced data aggregation, consolidation, and classification across time dimensions. Advanced computational functions. Advanced data modeling functions.
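A crosstab of the kind mentioned on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `records` rows and the `crosstab` helper are hypothetical and do not come from the slides or from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical sales records: (region, quarter, amount)
records = [
    ("East", "Q1", 100),
    ("East", "Q2", 120),
    ("West", "Q1", 150),
    ("West", "Q2", 130),
    ("East", "Q1", 30),
]

def crosstab(rows):
    """Return {region: {quarter: total amount}} for the given rows."""
    table = defaultdict(lambda: defaultdict(int))
    for region, quarter, amount in rows:
        table[region][quarter] += amount
    return {region: dict(cols) for region, cols in table.items()}

table = crosstab(records)

# Print one row per region and one column per quarter.
quarters = sorted({quarter for _, quarter, _ in records})
print("Region", *quarters, sep="\t")
for region in sorted(table):
    print(region, *(table[region].get(q, 0) for q in quarters), sep="\t")
```

Regions become rows and quarters become columns, which is the two-dimensional view a pivot table or crosstab gives an end user.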
  • 22. Advanced Database Support: Advanced data access features: access to many kinds of DBMSs, flat files, and internal and external data sources; access to aggregated data warehouse data; advanced data navigation (drill-downs and roll-ups); ability to map end-user requests to the appropriate data source; support for very large databases.
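Roll-up and drill-down can be illustrated as the same aggregation run at a coarser or finer level of a time dimension. The sketch below assumes a hypothetical `SALES` fact list and an `aggregate` helper invented for this example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, sales amount)
SALES = [
    (2023, "Q1", "East", 100),
    (2023, "Q1", "West", 150),
    (2023, "Q2", "East", 120),
    (2023, "Q2", "West", 130),
    (2024, "Q1", "East", 90),
]

def aggregate(rows, key_indexes):
    """Sum the sales amount grouped by the chosen dimension columns."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[3]
    return dict(totals)

# Roll-up: summarize to the year level.
by_year = aggregate(SALES, [0])

# Drill-down: break each year into its quarters.
by_year_quarter = aggregate(SALES, [0, 1])

print(by_year)
print(by_year_quarter)
```

Adding the region column index to `key_indexes` would drill down one level further.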
  • 23. Easy-to-Use End-User Interface: Graphical user interfaces. Much more useful if access is kept simple.
  • 24. Client/Server Architecture: Framework for new systems to be designed, developed, and implemented. Divides the OLAP system into several components that define its architecture. Components can reside on the same computer or be distributed among several computers.
  • 25. OLAP Architecture: 3 main modules: GUI; analytical processing logic; data-processing logic.
  • 27. Data Warehouse Implementation: An active decision support framework. Not a static database; always a work in process. A complete infrastructure for company-wide decision support: hardware, software, people, procedures, data. The data warehouse is a critical component of the modern DSS, but not the only critical component.
  • 28. Data Mining: Discover previously unknown data characteristics, relationships, dependencies, or trends. Typical data analysis relies on end users to define the problem, select the data, and initiate the data analysis. Typical data analysis reacts to an external stimulus.
  • 29. Data Mining: Proactive. Automatically searches for anomalies and possible relationships. Identifies problems before the end user does. Data mining tools analyze the data, uncover problems or opportunities hidden in data relationships, form computer models based on their findings, and then use the models to predict business behavior, with minimal end-user intervention.
  • 30. Data Mining: A methodology designed to perform knowledge-discovery expeditions over the database data with minimal end-user intervention. 3 stages of data: data, information, knowledge.
  • 32. 4 Phases of Data Mining: (1) Data Preparation: identify the main data sets to be used by the data mining operation (usually the data warehouse). (2) Data Analysis and Classification: study the data to identify common data characteristics or patterns, such as data groupings, classifications, clusters, and sequences; data dependencies, links, or relationships; data patterns, trends, and deviations.
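One of the simplest checks in the data analysis and classification phase is flagging deviations. The sketch below uses a z-score test; the `daily_sales` figures, the `find_deviations` helper, and the threshold of 2 standard deviations are all assumptions made for illustration, not a method named in the slides.

```python
from statistics import mean, pstdev

# Hypothetical daily sales totals; the last value is deliberately unusual.
daily_sales = [102, 98, 101, 97, 103, 99, 100, 160]

def find_deviations(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

outliers = find_deviations(daily_sales)
print(outliers)
```

Real data mining tools use far more robust techniques, but the idea is the same: the tool surfaces the unusual value without the end user having to ask for it.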
  • 33. 4 Phases of Data Mining: (3) Knowledge Acquisition: uses the results of the Data Analysis and Classification phase. The data mining tool selects the appropriate modeling or knowledge-acquisition algorithms: neural networks, decision trees, rules induction, genetic algorithms, memory-based reasoning. (4) Prognosis: predict future behavior; forecast business outcomes. Example: 65% of customers who did not use a particular credit card in the last 6 months are 88% likely to cancel the account.
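A prognosis statement like the credit card example rests on counting how often a condition is followed by an outcome. The sketch below shows that counting on a hypothetical `customers` list; the data, the function names, and the resulting percentages are illustrative assumptions and are not the figures quoted on the slide.

```python
# Hypothetical customer records: (inactive_last_6_months, cancelled_account)
customers = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, True), (False, False),
]

def rule_confidence(rows):
    """Share of inactive customers who went on to cancel their account."""
    outcomes = [cancelled for inactive, cancelled in rows if inactive]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def inactive_share(rows):
    """Share of all customers who were inactive."""
    return sum(1 for inactive, _ in rows if inactive) / len(rows) if rows else 0.0

confidence = rule_confidence(customers)
share = inactive_share(customers)
print(f"{share:.0%} of customers were inactive; {confidence:.0%} of those cancelled")
```

A data mining tool would derive such rules automatically across many candidate conditions rather than for one hand-picked condition.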
  • 34. Data Mining: Still a new technique. May find many meaningless relationships. Good at finding practical relationships: define customer buying patterns, improve product development and acceptance, etc. Has the potential to become the next frontier in database development.
  • 35. Data Mining and Visualization: Data mining: knowledge discovery using a blend of statistical, AI, and computer graphics techniques. Goals: explain observed events or conditions; confirm hypotheses; explore data for new or unexpected relationships. Techniques: statistical regression, decision tree induction, clustering and signal processing, affinity, sequence association, case-based reasoning, rule discovery, neural nets, fractals. Data visualization: representing data in graphical/multimedia formats for analysis.
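Of the techniques listed, affinity analysis is easy to sketch: count how often pairs of items appear together in the same transaction. The `baskets` data and the `pair_support` helper below are hypothetical and only illustrate the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def pair_support(transactions):
    """Return {item pair: fraction of transactions containing both items}."""
    counts = Counter()
    for basket in transactions:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items()}

support = pair_support(baskets)
for pair, value in sorted(support.items()):
    print(pair, value)
```

Pairs with high support are candidates for the customer buying patterns that data mining is meant to uncover.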
  • 37. E. Turban, J. Aronson, T.P. Liang, R. Sharda, “Decision Support and Business Intelligence Systems”, 8th Edition, Pearson Education.