DATA WAREHOUSING AND MINING
Akhil Singhal 3263
Aunj Gaikwad 3268
Anushka Srivastava 3269
Rahul Raisinghani 3293
Russall DMello 3322
Angad Chattwal 3323
What is a Data Warehouse?
A data warehouse is an information system that contains historical and cumulative data
from a single source or multiple sources. It simplifies the organization's reporting and
analysis processes.
It also serves as a single version of the truth for a company's decision making and forecasting.
Characteristics of a Data Warehouse
A data warehouse has the following characteristics:
Subject-Oriented
Integrated
Time-variant
Non-volatile
Data Warehouse Architectures
There are three main types of data warehouse architecture:
• Single-tier architecture
The objective of a single layer is to minimize the amount of data stored by
removing data redundancy. This architecture is not frequently used in practice.
• Two-tier architecture
A two-layer architecture separates the physically available sources from the data warehouse. This
architecture is not expandable and does not support a large number of end users. It
also has connectivity problems because of network limitations.
• Three-tier architecture
This is the most widely used architecture.
It consists of the Top, Middle and Bottom Tier.
  • Bottom Tier: The database of the data warehouse serves as the bottom tier. Data is
  cleansed, transformed, and loaded into this layer using back-end tools.
  • Middle Tier: The middle tier is an OLAP server, which is
  implemented using either the ROLAP or the MOLAP model.
  • Top Tier: The top tier is a front-end client layer. It holds the tools and APIs that you
  use to connect to the data warehouse and get data out of it.
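The three tiers can be sketched in a few lines of Python. This is a minimal illustration rather than a production design. It assumes an in-memory SQLite database stands in for the warehouse. The table `sales`, its columns, and the functions `build_bottom_tier`, `olap_rollup` and `report` are hypothetical names chosen for this example.

```python
import sqlite3

# Dimensions the middle tier is allowed to aggregate over (assumed for this sketch).
ALLOWED_DIMENSIONS = {"region", "year"}


def build_bottom_tier():
    """Bottom tier: the warehouse database, loaded with already-cleansed rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", 2023, 100.0), ("North", 2024, 150.0), ("South", 2024, 80.0)],
    )
    return conn


def olap_rollup(conn, dimension):
    """Middle tier: a ROLAP-style roll-up, i.e. an aggregate expressed as SQL."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    query = (
        f"SELECT {dimension}, SUM(amount) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(query).fetchall()


def report(conn, dimension):
    """Top tier: a front-end client that formats what the middle tier returns."""
    return [f"{key}: {total:.2f}" for key, total in olap_rollup(conn, dimension)]
```

Calling `report(build_bottom_tier(), "region")` asks the top tier for totals per region. The client never touches the `sales` table directly, which is the separation the three-tier model describes.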
Data Warehouse Components
There are five main components of a data warehouse:
• Data Warehouse Database
  • The central database is the foundation of the data
  warehousing environment. This database is implemented with
  RDBMS technology.
• Sourcing, Acquisition, Clean-up and Transformation Tools (ETL)
  • The data sourcing, transformation, and migration tools
  perform all the conversions, summarizations,
  and other changes needed to transform data into a unified
  format in the data warehouse.
• Metadata
  • Metadata is data about data that defines the data warehouse. It
  is used for building, maintaining and managing the data
  warehouse.
  • Metadata can be classified into the following categories:
    • Technical metadata
    • Business metadata
• Query Tools
  • One of the primary objectives of data warehousing is to provide
  information that businesses can use to make strategic decisions. Query tools
  allow users to interact with the data warehouse system.
  • These tools fall into four categories:
    • Query and reporting tools
    • Application development tools
    • Data mining tools
    • OLAP tools
• Data Warehouse Bus Architecture
  • The data warehouse bus determines the flow of data
  in your warehouse. The data flow in a data
  warehouse can be categorized as inflow, upflow,
  downflow, outflow and meta flow.
• Data Marts
  • A data mart is an access layer that is used to get
  data out to the users. It is presented as an alternative
  to a large data warehouse because it takes less time
  and money to build.
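As a rough sketch of how the ETL tools and a data mart relate to the central database, the Python below transforms rows from two differently formatted sources into one unified format, loads them, and exposes a subset as a mart. Everything here is an assumption made for illustration: the sample rows, the table `fact_spend`, the view `mart_big_spenders`, the spending threshold of 100, and the use of a SQL view to stand in for a data mart.

```python
import sqlite3

# Hypothetical source rows: two systems that record the same facts differently.
CRM_ROWS = [{"customer": " Alice ", "spend_usd": "120.50"}]
SHOP_ROWS = [("BOB", 8000)]  # (name, spend in cents)


def transform(crm_rows, shop_rows):
    """Convert both sources into one unified (customer, spend_usd) format."""
    unified = []
    for row in crm_rows:
        unified.append((row["customer"].strip().title(), float(row["spend_usd"])))
    for name, cents in shop_rows:
        unified.append((name.strip().title(), cents / 100))
    return unified


def load(conn, rows):
    """Load the unified rows into the central warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_spend (customer TEXT, spend_usd REAL)"
    )
    conn.executemany("INSERT INTO fact_spend VALUES (?, ?)", rows)


def create_data_mart(conn):
    """Expose a subject-specific subset of the warehouse as an access layer."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS mart_big_spenders AS "
        "SELECT customer, spend_usd FROM fact_spend WHERE spend_usd >= 100"
    )
```

A user of the mart would query `mart_big_spenders` only and would not need to know how the two sources were reconciled.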
Data Mining
• Data mining is defined as a process
used to extract usable data from a
larger set of raw data.
• It involves analysing data patterns in
large batches of data using one or
more software tools.
• To segment the data and
evaluate the probability of future
events, data mining uses
sophisticated mathematical
algorithms. Data mining is also known
as Knowledge Discovery in Data
(KDD).
Key features of data mining
• Automatic pattern prediction based on trend and
behaviour analysis.
• Prediction based on likely outcomes.
• Creation of decision-oriented information.
• Focus on large data sets and databases for analysis.
• Clustering based on finding and visually documenting
groups of facts not previously known.
Data Mining Functionalities
• Data mining functionalities specify the kind of pattern to be found in data
mining tasks. There are two types of tasks:
Descriptive Tasks:
• These tasks present the general properties of the data stored in a
database. Descriptive tasks are used to find patterns in
data, e.g. clusters, correlations, trends and anomalies.
Predictive Tasks:
• Predictive data mining tasks predict the value of one attribute
on the basis of the values of other attributes. The predicted attribute is known as the
target or dependent variable, and the attributes used for making
the prediction are known as independent variables.
Clustering
• Clustering is used to identify data objects that are similar to one another. It is the process of
partitioning a set of objects or data so that similar items fall into the same group, called a cluster.
• It is used in machine learning, pattern recognition, image analysis and information
retrieval. For example, an insurance company can cluster its customers based on age,
residence, income, etc.
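To make the insurance example concrete, here is a minimal k-means sketch. The slides do not name a clustering algorithm, so k-means is an assumed choice. The function `kmeans`, the fixed starting centroids, and the customer figures (age, income in thousands) are hypothetical, and the features are left unscaled for brevity.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Squared Euclidean distance from this point to every centroid.
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers as (age, income in thousands).
customers = [(25, 30), (27, 32), (60, 90), (62, 88)]
centroids, clusters = kmeans(customers, centroids=[(25, 30), (60, 90)])
```

With these inputs the younger, lower-income customers end up in one cluster and the older, higher-income customers in the other.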
Associations and correlations:
• Association discovers the connections among a set of items.
• A retailer can identify the products that customers normally purchase together, or even
find the customers who respond to promotions of the same kind of products.
• For example, a set of items that are often bought together, such as a table and chair.
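One common way to quantify such an association is with support and confidence. The sketch below is illustrative only: the function `pair_rules` and the sample baskets are hypothetical, and it considers item pairs only rather than larger item sets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets):
    """For every item pair (a, b), with a before b alphabetically, compute
    support    = share of baskets containing both a and b
    confidence = share of baskets containing a that also contain b."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    total = len(baskets)
    return {
        (a, b): {"support": count / total, "confidence": count / item_counts[a]}
        for (a, b), count in pair_counts.items()
    }


# Hypothetical purchase baskets.
baskets = [["table", "chair"], ["table", "chair", "lamp"], ["table"], ["lamp"]]
rules = pair_rules(baskets)
```

Here `rules[("chair", "table")]` describes how often a chair and a table are bought together, and how often a basket with a chair also contains a table.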
Summarization
• A set of relevant data is summarized, which results in a smaller set that gives aggregated
information about the data.
• For example, the shopping done by a customer can be summarized into total products,
total spending, offers used, etc.
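The shopping example can be sketched as a simple aggregation. The function `summarize` and the purchase tuples (customer, price, whether an offer was used) are hypothetical and serve only to show raw rows collapsing into a smaller aggregated set.

```python
def summarize(purchases):
    """Collapse one row per purchase into one aggregated row per customer."""
    summary = {}
    for customer, price, used_offer in purchases:
        row = summary.setdefault(
            customer,
            {"total_products": 0, "total_spending": 0.0, "offers_used": 0},
        )
        row["total_products"] += 1
        row["total_spending"] += price
        row["offers_used"] += int(used_offer)
    return summary


# Hypothetical purchase records: (customer, price, used_offer).
purchases = [("alice", 10.0, True), ("alice", 5.5, False), ("bob", 3.0, False)]
summary = summarize(purchases)
```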
Clustering, association and summarization are data mining techniques under the Descriptive Task.
Prediction
• A prediction task predicts the possible values of future data.
• Prediction involves developing a model based on the available data; this model is then
used to predict future values for a new data set of interest.
• For example, a model can predict the income of an employee based on education,
experience and other demographic factors such as place of residence, gender, etc.
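A minimal sketch of the income example, assuming a straight-line model fitted by ordinary least squares on a single factor (years of experience). The slides do not prescribe a model. The functions `fit_line` and `predict` and the sample figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted model to a new, unseen value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: years of experience vs. income in thousands.
experience = [1, 2, 3, 4]
income = [30, 35, 40, 45]
model = fit_line(experience, income)
estimate = predict(model, 6)  # income estimate for 6 years of experience
```

The model is built from the available data (`fit_line`) and then applied to a new case (`predict`), which mirrors the two steps described above.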
Time-Series Analysis
• A time series is a sequence of events where the next event is determined by one or more
of the preceding events.
• Time-series analysis includes methods for analyzing time-series data in order to extract
useful patterns, trends, rules and statistics. Stock market prediction is an important
application of time-series analysis.
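One of the simplest ways to extract a trend from a time series is a moving average. The sketch below is an assumed illustration, not a stock-prediction method. The function `moving_average` and the sample prices are hypothetical.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values to smooth out noise."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Hypothetical daily closing prices.
prices = [10, 12, 14, 16, 18]
trend = moving_average(prices, window=3)
```

Each value in `trend` depends on the preceding observations, which is the property of time series that the first bullet describes.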
Classification:
• Classification builds a model from data with predefined classes; the model
is then used to classify new instances whose class is not known.
• For example, one may classify an employee's potential salary on the basis of the salary
classification of similar employees in the company.
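The idea of classifying an employee by looking at similar employees maps naturally onto a k-nearest-neighbours classifier, sketched below. This is an assumed choice of algorithm. The function `knn_classify`, the features (years of experience, years of education) and the salary bands are hypothetical.

```python
from collections import Counter


def knn_classify(training, query, k=3):
    """Label `query` with the majority class among its k nearest training rows."""
    nearest = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical employees: ((years of experience, years of education), salary band).
employees = [
    ((1, 12), "low"),
    ((2, 12), "low"),
    ((3, 14), "low"),
    ((8, 16), "high"),
    ((9, 16), "high"),
    ((10, 18), "high"),
]
band = knn_classify(employees, (9, 17))
```

The predefined classes are the salary bands of existing employees, and the new instance is the employee whose band is not yet known.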
Prediction, time-series analysis and classification are data mining techniques under the Predictive Task.
Applications of Data Mining
• Sales and Marketing
• Banking and Finance
• Healthcare and Insurance
• Retail Industry
• Telecommunications Industry
• Higher Education
Amazon Web Services, Inc.
(IT service management company)
• AWS allows you to take advantage of all of the core benefits associated with on-demand
computing, such as access to seemingly limitless storage and compute capacity, and the
ability to scale your system in parallel with the growing amount of data collected, stored,
and queried, paying only for the resources you provision.
• Further, AWS offers a broad set of managed services that integrate seamlessly with each
other so that you can quickly deploy an end-to-end analytics and data warehousing
solution.
Amazon Redshift
• Amazon Redshift is a fast, fully managed, and cost-effective data
warehouse that gives you petabyte-scale data warehousing and exabyte-scale
data lake analytics together in one service.
• Amazon Redshift is up to ten times faster than traditional on-premises
data warehouses. Get unique insights by querying across petabytes of
data in Redshift and exabytes of structured data or open file formats in
Amazon S3, without the need to move or transform your data.
• Redshift is 1/10th the cost of traditional on-premises data warehouse
solutions. You can start small for just $0.25 per hour with no commitments,
scale out to petabytes of data for $250 to $333 per uncompressed terabyte
per year, and extend analytics to your Amazon S3 data lake for as little as
$0.05 for every 10 gigabytes of data scanned.
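To put the quoted rates in perspective, the arithmetic can be sketched as follows. The rates are the figures stated above and may not match current pricing. The function names are hypothetical, and the sketch assumes a single smallest node running continuously and 1 TB = 1,000 GB.

```python
# Rates as quoted in the text above; assumed to be in US dollars.
HOURLY_RATE_USD = 0.25
SCAN_RATE_USD_PER_10_GB = 0.05


def yearly_node_cost(hours_per_day=24, days=365):
    """Cost of keeping one entry-level node running for a year."""
    return HOURLY_RATE_USD * hours_per_day * days


def scan_cost(gigabytes_scanned):
    """Cost of querying data in the S3 data lake, billed per 10 GB scanned."""
    return SCAN_RATE_USD_PER_10_GB * gigabytes_scanned / 10


always_on = yearly_node_cost()     # one node, 24 hours a day, 365 days
one_terabyte_scan = scan_cost(1000)  # scanning 1 TB (1,000 GB)
```

At these rates an always-on entry-level node comes to about $2,190 per year, and scanning one terabyte in the data lake costs about $5.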
Amazon Redshift Customer Success
• “Amazon Redshift enables faster business insights and growth, and provides an
easy-to-manage infrastructure to support our data workloads. Redshift has given us
the confidence to run more data and analytics workloads on AWS and helps us meet
the growing needs of our customers.”
(Abhi Bhatt, Director Global Data & Analytics, McDonald’s)
• “Amazon Redshift allows us to ingest, optimize, transform, and aggregate billions
of transactional events per day at scale, coming to us from a variety of first and
third party sources. We query live data across our data warehouse and data lake,
and now with the new Amazon Redshift Federated Query feature we can easily
query and analyse live data across our relational databases as well.”
(Alex Tverdohleb, Vice President Data Services, Consumer Products & Engineering,
FOX Corporation)
• “At WD we use Amazon Redshift to enable the enterprise to gain value and insights
from large, complex, and dispersed datasets. Our data is nearly doubling every year
and we run six Redshift clusters with a total of 78 nodes and 631+ TB of compressed
data stored to get insights that our business analysts and leadership depend on.”
(Fayaz Syed, Sr. Manager, Big Data Platform, Western Digital)

DATA WAREHOUSING AND MINING FOR INSIGHTS

  • 1. DATA WAREHOUSING AND MINING Akhil Singhal 3263 Aunj Gaikwad 3268 Anushka Srivastava 3269 Rahul Raisinghani 3293 Russall DMello 3322 Angad Chattwal 3323
  • 2. What is Data warehouse? Data warehouse is an information system that contains historical and commutative data from single or multiple sources. It simplifies reporting and analysis process of the organization. It is also a single version of truth for any company for decision making and forecasting. Characteristics of Data warehouse A data warehouse has following characteristics: Subject-Oriented Integrated Time-variant Non-volatile
  • 3.  Data Warehouse Architectures There are mainly three types of Data warehouse Architectures: -  Single-tier architecture The objective of a single layer is to minimize the amount of data stored.This goal is to remove data redundancy.This architecture is not frequently used in practice.  Two-tier architecture Two-layer architecture separates physically available sources and data warehouse.This architecture is not expandable and also not supporting a large number of end-users. It also has connectivity problems because of network limitations.  Three-tier architecture  This is the most widely used architecture. It consists of theTop, Middle and BottomTier.  Bottom Tier:The database of the Datawarehouse servers as the bottom tier. Data is cleansed, transformed, and loaded into this layer using back-end tools.  MiddleTier: The middle tier in Data warehouse is an OLAP server which is implemented using either ROLAP or MOLAP model.  Top-Tier: The top tier is a front-end client layer.Top tier is the tools and API that you connect and get data out from the data warehouse.
  • 4. Data warehouse Components There are mainly five components of DataWarehouse:  Data Warehouse Database  The central database is the foundation of the data warehousing environment.This database is implemented on the RDBMS technology.  Sourcing,Acquisition, Clean-up andTransformationTools (ETL)  The data sourcing, transformation, and migration tools are used for performing all the conversions, summarizations, and all the changes needed to transform data into a unified format in the datawarehouse.
  • 5.  Metadata  Metadata is data about data which defines the data warehouse. It is used for building, maintaining and managing the data warehouse.  Metadata can be classified into following categories:  Technical Meta Data  Business Meta Data  QueryTools  One of the primary objects of data warehousing is to provide information to businesses to make strategic decisions. Query tools allow users to interact with the data warehouse system.  These tools fall into four different categories:  Query and reporting tools  Application Development tools  Data mining tools  OLAP tools
  • 6. Data Warehouse Bus Architecture
      • The data warehouse bus determines the flow of data in the warehouse. The data flow can be categorized as inflow, upflow, downflow, outflow and meta flow.
      Data Marts
      • A data mart is an access layer used to get data out to the users. It is presented as an alternative to a large data warehouse because it takes less time and money to build.
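One simple way to picture a data mart as an access layer is a subject-specific view over the warehouse. The sketch below is only an assumption about how this might look: it uses an in-memory SQLite database, and the `warehouse_orders` table, the `sales_mart` view, and the sample rows are hypothetical. Real data marts are often separate physical stores rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding data for every department.
conn.execute("CREATE TABLE warehouse_orders (dept TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_orders VALUES (?, ?, ?)",
    [("sales", "table", 200.0), ("sales", "chair", 50.0), ("hr", "training", 300.0)],
)

# The data mart: an access layer exposing only the sales department's data.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT product, amount FROM warehouse_orders WHERE dept = 'sales'"
)

mart_rows = conn.execute(
    "SELECT product, amount FROM sales_mart ORDER BY product"
).fetchall()
print(mart_rows)
```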
  • 7. Data Mining
      • Data mining is the process of extracting usable data from a larger set of raw data.
      • It involves analysing patterns in large batches of data using one or more software tools.
      • Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events. It is also known as Knowledge Discovery in Data (KDD).
  • 8. Key features of data mining
      • Automatic pattern prediction based on trend and behaviour analysis.
      • Prediction based on likely outcomes.
      • Creation of decision-oriented information.
      • Focus on large data sets and databases for analysis.
      • Clustering based on finding and visually documenting groups of facts not previously known.
  • 9. Data Mining Functionalities
      Data mining functionalities specify the kind of patterns to be found in data mining tasks. There are two types of task:
      • Descriptive tasks: These tasks present the general properties of the data stored in a database. They are used to find patterns in data, such as clusters, correlations, trends and anomalies.
      • Predictive tasks: These tasks predict the value of one attribute on the basis of the values of other attributes. The predicted attribute is known as the target or dependent variable, and the attributes used to make the prediction are known as independent variables.
  • 10.
  • 11. Data mining under Descriptive Task
      Clustering
      • Clustering identifies data objects that are similar to one another. It is the process of partitioning a set of objects or data so that similar items fall into the same group, called a cluster.
      • It is used in machine learning, pattern recognition, image analysis and information retrieval. For example, an insurance company can cluster its customers based on age, residence, income, etc.
      Associations and correlations
      • Association discovers the connections among a set of items.
      • A retailer can identify the products that customers normally purchase together, or find the customers who respond to promotions of the same kind of products.
      • For example, a set of items that are bought together, such as a table and chair.
      Summarization
      • A set of relevant data is summarized, resulting in a smaller set that gives aggregated information about the data.
      • For example, the shopping done by a customer can be summarized into total products, total spending, offers used, etc.
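The three descriptive tasks can be sketched on toy data in a few lines of Python. Everything here is an assumption made for illustration: the ages, baskets and purchases are invented, clustering is shown with a tiny one-dimensional k-means (one of many clustering methods), and association is reduced to counting how often pairs of items appear in the same basket.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data.
customer_ages = [22, 25, 27, 48, 52, 55]
baskets = [{"table", "chair"}, {"table", "chair", "lamp"}, {"lamp"}, {"table", "chair"}]
purchases = [("table", 200.0), ("chair", 50.0), ("lamp", 30.0)]

def kmeans_1d(values, centroids, iterations=10):
    """Clustering: assign each value to its nearest centroid, then move each
    centroid to the mean of its cluster, and repeat."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters

def pair_support(all_baskets):
    """Association: fraction of baskets in which each pair of items occurs together."""
    counts = Counter()
    for basket in all_baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(all_baskets) for pair, n in counts.items()}

def summarize(items):
    """Summarization: reduce a purchase list to a few aggregate figures."""
    return {
        "total_products": len(items),
        "total_spending": sum(price for _, price in items),
    }

age_clusters = kmeans_1d(customer_ages, centroids=[20, 60])
support = pair_support(baskets)
summary = summarize(purchases)
print(age_clusters)
print(support[("chair", "table")])
print(summary)
```

In this toy data the table and chair pair has the highest support, which mirrors the retailer example on the slide.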
  • 12. Data mining under Predictive Task
      Prediction
      • A prediction task predicts the possible values of future data.
      • Prediction involves developing a model based on the available data; this model is then used to predict future values for a new data set of interest.
      • For example, a model can predict the income of an employee based on education, experience and other demographic factors such as place of stay, gender, etc.
      Time-Series Analysis
      • A time series is a sequence of events where the next event is determined by one or more of the preceding events.
      • Time-series analysis includes methods to analyze time-series data in order to extract useful patterns, trends, rules and statistics. Stock market prediction is an important application of time-series analysis.
      Classification
      • Classification builds a model from data with predefined classes; the model is then used to classify new instances whose class is not known.
      • For example, one may classify an employee's potential salary on the basis of the salary classification of similar employees in the company.
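A minimal sketch of the three predictive tasks follows. The data is invented, and each technique is a deliberately simple stand-in chosen for the illustration: a least-squares line for prediction, a moving-average forecast for time-series analysis, and a one-nearest-neighbour rule for classification. Real systems typically use richer models and more attributes.

```python
def fit_line(xs, ys):
    """Prediction: fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

def moving_average_forecast(series, window=3):
    """Time-series analysis: forecast the next value as the mean of the last few."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def nearest_neighbour_class(labelled, value):
    """Classification: give a new instance the class of its closest labelled example."""
    _, label = min(labelled, key=lambda pair: abs(pair[0] - value))
    return label

# Hypothetical data: years of experience vs. income (in thousands).
years = [1, 2, 3, 4]
income = [30.0, 35.0, 40.0, 45.0]
slope, intercept = fit_line(years, income)
predicted_income = slope * 6 + intercept

# Hypothetical daily prices.
prices = [10.0, 12.0, 14.0, 16.0]
next_price = moving_average_forecast(prices)

# Hypothetical employees labelled with a salary band by years of experience.
labelled_employees = [(1, "low"), (2, "low"), (8, "high"), (9, "high")]
salary_band = nearest_neighbour_class(labelled_employees, 7)

print(predicted_income, next_price, salary_band)
```

Here experience is the independent variable and income or salary band is the target variable.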
  • 13. Applications of Data Mining
      • Sales and Marketing
      • Banking and Finance
      • Healthcare and Insurance
      • Retail Industry
      • Telecommunications Industry
      • Higher Education
  • 14.
  • 15. Amazon Web Services, Inc. (IT service management company)
      • AWS allows you to take advantage of all of the core benefits associated with on-demand computing, such as access to seemingly limitless storage and compute capacity, and the ability to scale your system in parallel with the growing amount of data collected, stored, and queried, paying only for the resources you provision.
      • Further, AWS offers a broad set of managed services that integrate seamlessly with each other so that you can quickly deploy an end-to-end analytics and data warehousing solution.
  • 16. Amazon Redshift
      • Amazon Redshift is a fast, fully managed, and cost-effective data warehouse that gives you petabyte-scale data warehousing and exabyte-scale data lake analytics together in one service.
      • Amazon Redshift is up to ten times faster than traditional on-premises data warehouses. Get unique insights by querying across petabytes of data in Redshift and exabytes of structured data or open file formats in Amazon S3, without the need to move or transform your data.
      • Redshift is 1/10th the cost of traditional on-premises data warehouse solutions. You can start small for just $0.25 per hour with no commitments, scale out to petabytes of data for $250 to $333 per uncompressed terabyte per year, and extend analytics to your Amazon S3 data lake for as little as $0.05 for every 10 gigabytes of data scanned.
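To make the quoted prices concrete, the arithmetic can be sketched as below. The constants are the figures quoted on the slide and may not match current AWS pricing; the function names and the example workload sizes (100 TB stored, 1,000 GB scanned, one year of continuous use) are hypothetical.

```python
# Figures as quoted on the slide; current AWS pricing may differ.
HOURLY_START_USD = 0.25
PER_TB_YEAR_USD = (250.0, 333.0)  # low and high end, per uncompressed terabyte
SCAN_USD_PER_10_GB = 0.05

def yearly_storage_cost_range(terabytes):
    """Low and high yearly cost for a given amount of uncompressed data."""
    low, high = PER_TB_YEAR_USD
    return terabytes * low, terabytes * high

def scan_cost(gigabytes_scanned):
    """Cost of querying S3 data, charged per 10 GB scanned."""
    return gigabytes_scanned / 10 * SCAN_USD_PER_10_GB

def yearly_cost_of_smallest_start(hours_per_year=24 * 365):
    """Cost of running the $0.25/hour starting configuration all year."""
    return hours_per_year * HOURLY_START_USD

storage_low, storage_high = yearly_storage_cost_range(100)  # 100 TB example
one_tb_scan = scan_cost(1000)                               # 1,000 GB example
always_on = yearly_cost_of_smallest_start()
print(storage_low, storage_high, one_tb_scan, always_on)
```

At the quoted rate, scanning 1,000 GB costs about $5, and running the smallest configuration continuously for a year costs about $2,190.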
  • 17. Amazon Redshift Customer Success
      • “Amazon Redshift enables faster business insights and growth, and provides an easy-to-manage infrastructure to support our data workloads. Redshift has given us the confidence to run more data and analytics workloads on AWS and helps us meet the growing needs of our customers.” (Abhi Bhatt, Director Global Data & Analytics, McDonald’s)
      • “Amazon Redshift allows us to ingest, optimize, transform, and aggregate billions of transactional events per day at scale, coming to us from a variety of first and third party sources. We query live data across our data warehouse and data lake, and now with the new Amazon Redshift Federated Query feature we can easily query and analyse live data across our relational databases as well.” (Alex Tverdohleb, Vice President Data Services, Consumer Products & Engineering, FOX Corporation)
      • “At WD we use Amazon Redshift to enable the enterprise to gain value and insights from large, complex, and dispersed datasets. Our data is nearly doubling every year and we run six Redshift clusters with a total of 78 nodes and 631+ TB of compressed data stored to get insights that our business analysts and leadership depend on.” (Fayaz Syed, Sr. Manager, Big Data Platform, Western Digital)