SQL Server 2008 for Business Intelligence UTS Short Course
Peter Gfader: Specializes in C# and .NET (no longer Java); Testing (automated tests); Agile, Scrum (Certified Scrum Trainer); Technology aficionado (Silverlight, ASP.NET, Windows Forms)
Admin Stuff. Attendance: you initial the sheet. Hands-On Lab: you get me to initial the sheet. Homework. Certificate: at the end of 5 sessions, if I say that you have completed successfully.
Course Website. Course Timetable & Materials: http://www.ssw.com.au/ssw/Events/2010UTSSQL/ Resources: http://sharepoint.ssw.com.au/Training/UTSSQL/
Course Overview
Last week(s): Other cube browsers: Microsoft Data Analyzer, ProClarity, Excel 2003/2007/2010, Excel Services, Thinslicer, PerformancePoint, PowerPivot
Homework: Create a report on top of Northwind with: Top 10 customers (table); Top 10 products (table); Top 10 employees (table); 1 chart that shows the top 10 customers; 1 usage of the gauge control (surprise me)
The plan
Step by step to BI: Create data warehouse; Copy data to data warehouse; Create OLAP cubes; Create reports; Browse the cube; Do some data mining (discover relationships, predict future events)
Agenda: What is Data Mining?; Why?; Uses; Algorithms; Demo; Hands-on Lab
What is Data Mining? “Data mining is the use of powerful software tools to discover significant traits or relationships, from databases or data warehouses and often used to predict future events”
What is Data Mining? It exploits statistical algorithms. Once the “knowledge” is extracted, it can be used to discover relationships and to predict values of other cases.
Why Data Mining? Marketing: Who picks the movie? (The kids, the wife, me.) Who are our customers and what sort of films do they hire? Is a 30-year-old woman with 2 children going to hire Arnie’s latest film? Validation: Is this data sensible? (Terminator 2 and Toy Story.) Prediction: Sales next year.
Why? It's all about money. Get new information from data: future trends, past trends, outliers, maximums, minimums. Analyse data from different perspectives and summarize it into useful information. Use the new information to increase revenue, cut costs, or both :-)
Which Questions are Data Mining? Who are our biggest customers? What are customers buying with cigars? What are the customer retention levels of our branches? Which customers have bought olives and feta cheese but no ciabatta bread? Which regions have the highest male/female ratio of single 20-somethings? Which region has the lowest customer retention levels, and who are its lost customers?
What’s not data mining: Ad hoc query; Drill through to details; Business Intelligence tool
Data - Uncover patterns in samples: Good raw material leads to good data mining. Samples should be representative. Samples should be "similar" to the domain. Not an all-seeing crystal ball. Verify and validate!
OLAP versus Data Mining. OLAP: is about fast ad hoc querying; analysis by dimensions and measures; gives precise answers. Data Mining: may use an RDBMS or OLAP source; is about discovering and predicting; gives imprecise answers. OLAP is not a prerequisite for data mining, but it almost always comes first (learning to ride a bike before a car).
Types of Data Mining Algorithms: Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset. Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset. Segmentation algorithms divide data into groups, or clusters, of items that have similar properties. Association algorithms find correlations between different attributes in a dataset. Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a Web path flow.
Complete Set of Algorithms (ways to analyze your data): Clustering, Time Series, Decision Trees, Naïve Bayes, Association, Linear Regression, Neural Network, Sequence Clustering, Logistic Regression
Decision trees: Split data. Each branch is like an attribute. Brightness = amount of data.
Decision Trees (1): Decision trees assign (classify) each case to one of a few (discrete) broad categories of the selected attribute (variable) and explain the classification with a few selected input variables. The process of building is recursive partitioning: splitting data into partitions and then splitting those up further. Initially all cases are in one big box.
Decision Trees (2): The algorithm tries all possible breaks in classes using all possible values of each input attribute; it then selects the split that partitions the data into the purest classes of the searched variable. There are several measures of purity. Then it repeats the splitting for each new class, again testing all possible breaks. Useless branches of the tree can be pre-pruned or post-pruned.
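The split search described above can be sketched in a few lines of Python. This is a minimal illustration, not the Microsoft Decision Trees implementation: it assumes Gini impurity as the purity measure (one of several possible measures), tests only equality splits on discrete attributes, and uses made-up video-store cases. The names `gini`, `best_split` and the sample data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows, target):
    """Try every (attribute, value) equality test and return the purest split."""
    best = None
    attributes = [a for a in rows[0] if a != target]
    for attr in attributes:
        for value in sorted({r[attr] for r in rows}):
            left = [r[target] for r in rows if r[attr] == value]
            right = [r[target] for r in rows if r[attr] != value]
            if not left or not right:
                continue  # this test does not actually split the data
            # Impurity of the two partitions, weighted by their size
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, value)
    return best

# Hypothetical cases: which genre does a customer hire?
cases = [
    {"age_group": "young", "has_kids": "no", "genre": "action"},
    {"age_group": "young", "has_kids": "no", "genre": "action"},
    {"age_group": "adult", "has_kids": "yes", "genre": "family"},
    {"age_group": "adult", "has_kids": "yes", "genre": "family"},
    {"age_group": "adult", "has_kids": "no", "genre": "action"},
]
score, attr, value = best_split(cases, "genre")
print(attr, value, score)
```

A full tree builder would call `best_split` again on each resulting partition (the recursive partitioning step) and stop, or prune, when a split no longer improves purity.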
Decision Trees (3): Decision trees are used for classification and prediction. Typical questions: Predict which customers will leave. Help in mailing and promotion campaigns. Explain reasons for a decision. What are the movies young female customers like to buy?
Decision Trees – Who Decides
Naïve Bayes: Bayes formula. Uses statistics to say, with a probability, whether a case falls into a certain category or not. Spam filtering: score of spam (Bayes). Testing only a particular attribute.
Naïve Bayes: Quickly builds mining models that can be used for classification and prediction. It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute. This can later be used to predict an outcome of the predicted attribute based on the known input attributes. This makes the model a good option for exploring the data.
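As a rough sketch of the calculation described above, the following Python code counts how often each input state occurs for each state of the predictable attribute, then combines those probabilities with Bayes' formula. It is an illustration under stated assumptions, not the Microsoft Naive Bayes algorithm: add-one (Laplace) smoothing is an assumption added so that unseen states do not produce a zero probability, and the spam data and attribute names are invented.

```python
from collections import Counter, defaultdict

def train(rows, target):
    """Count class frequencies and per-class frequencies of each input state."""
    class_counts = Counter(r[target] for r in rows)
    state_counts = defaultdict(Counter)  # (attribute, class) -> Counter of states
    states = defaultdict(set)            # attribute -> all states seen
    for r in rows:
        for attr, state in r.items():
            if attr != target:
                state_counts[(attr, r[target])][state] += 1
                states[attr].add(state)
    return class_counts, state_counts, states

def predict(model, case):
    """Return a normalized probability for each class, given the input states."""
    class_counts, state_counts, states = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        p = n / total  # prior probability of the class
        for attr, state in case.items():
            # P(state | class) with add-one smoothing (an assumption of this sketch)
            p *= (state_counts[(attr, cls)][state] + 1) / (n + len(states[attr]))
        scores[cls] = p
    norm = sum(scores.values())
    return {cls: p / norm for cls, p in scores.items()}

# Hypothetical training mails
mails = [
    {"has_link": "yes", "all_caps": "yes", "label": "spam"},
    {"has_link": "yes", "all_caps": "no", "label": "spam"},
    {"has_link": "no", "all_caps": "no", "label": "ham"},
    {"has_link": "no", "all_caps": "no", "label": "ham"},
    {"has_link": "yes", "all_caps": "no", "label": "ham"},
]
model = train(mails, "label")
probs = predict(model, {"has_link": "yes", "all_caps": "yes"})
print(probs)
```

The "naïve" part is the multiplication inside `predict`: each input attribute is treated as independent of the others given the class.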
Cluster Analysis (1): Grouping data into clusters. Objects within a cluster have high similarity based on the attribute values. The class label of each object is not known. Several techniques: partitioning methods, hierarchical methods, density-based methods, model-based methods, and more…
Cluster Analysis (2): Segments a heterogeneous population into a number of more homogeneous subgroups or clusters. Some typical questions: Discover distinct groups of customers. Identify groups of houses in a city. In biology, derive animal and plant taxonomies. Find outliers.
Clustering (chart: Annual Income versus Age)
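To make the idea of grouping customers by age and annual income concrete, here is a small k-means sketch in Python. K-means is one partitioning method and is used here only as an example; the slides do not say which technique the chart was produced with. The customer points, the hand-picked starting centroids and the function name `kmeans` are assumptions for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Squared Euclidean distance to every centroid
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean of the cluster; keep the old one if the cluster is empty
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers as (age, annual income in thousands)
customers = [(22, 25), (25, 30), (27, 28), (48, 90), (52, 95), (55, 88)]
# Starting centroids are picked by hand here; real tools usually choose them automatically
centroids, clusters = kmeans(customers, [(22, 25), (55, 88)])
print(centroids)
print(clusters)
```

No class label is given to the algorithm; the two groups emerge from the attribute values alone. In practice the attributes would usually be scaled first, so that income does not dominate the distance simply because its numbers are larger.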
Time series: Time-based data, prediction
Sequence clustering: Numbers rank the stronger associations. Associations have a direction (not necessarily valid in the other direction).
Association: If you own certain stocks, you maybe own other ones as well. Probability = thickness of line.
Neural Nets: Let the system learn how to classify data. The neural network adapts to the new data. Formulate a statement/hypothesis. The outcome is known (data / surveys). 1. 70% of data to train the network (outcome is known) 2. 30% of data to test the network (outcome is known) 3. New data (no survey needed, predict from network). Other example: OCR
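The 70% / 30% procedure above can be sketched as follows. To keep the example short it trains a single-neuron perceptron rather than a full neural network, which is a deliberate simplification; the generated cases, the function names and the learning parameters are all assumptions for illustration.

```python
import random

def split(cases, train_percent=70, seed=42):
    """Shuffle the cases and cut them into a training set and a test set."""
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]

def predict(model, features):
    """Output 1 if the weighted sum of the inputs is positive, else 0."""
    weights, bias = model
    return 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0

def train_perceptron(cases, epochs=20, rate=0.1):
    """Single neuron: nudge the weights whenever a training case is misclassified."""
    weights, bias = [0.0] * len(cases[0][0]), 0.0
    for _ in range(epochs):
        for features, label in cases:
            error = label - predict((weights, bias), features)
            weights = [w + rate * error * x for w, x in zip(weights, features)]
            bias += rate * error
    return weights, bias

# Hypothetical cases with a known outcome: label is 1 when x + y > 10
cases = [((x, y), 1 if x + y > 10 else 0)
         for x in range(0, 10, 2) for y in range(0, 8, 2)]

train_cases, test_cases = split(cases)           # steps 1 and 2: 70% / 30%
model = train_perceptron(train_cases)
accuracy = sum(predict(model, f) == label for f, label in test_cases) / len(test_cases)
print(len(train_cases), len(test_cases), accuracy)

new_case = (9, 9)                                # step 3: outcome not known
print(predict(model, new_case))
```

The test set is never shown to the model during training, so the accuracy measured on it is an estimate of how well the model will do on genuinely new data.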
Difference between algorithms: Association and Sequence. Both have directions. Sequence Clustering has a probability number and colour. They are very similar. The difference is that Association analyses items that occur together, whereas Sequence Clustering analyses items that follow one another. For example, Sequence Clustering might be used by credit card companies to spot fraud: a petrol station refill followed by another petrol station refill followed by a big purchase = fraud (different transactions). Association is more like: when someone buys popcorn at the cinema, they also buy a drink (same transaction).
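The distinction can be shown with two small counting functions in Python. This is only a sketch of the underlying idea (unordered pairs within one transaction versus ordered pairs across consecutive events), not the Microsoft Association Rules or Sequence Clustering algorithms; the baskets, card histories and function names are made up.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(transactions):
    """Association view: count unordered item pairs inside the same transaction."""
    pairs = Counter()
    for items in transactions:
        for a, b in combinations(sorted(set(items)), 2):
            pairs[(a, b)] += 1
    return pairs

def transitions(sequences):
    """Sequence view: count ordered pairs of events that directly follow each other."""
    steps = Counter()
    for events in sequences:
        for current, following in zip(events, events[1:]):
            steps[(current, following)] += 1
    return steps

# Hypothetical cinema baskets (each list is one transaction)
baskets = [["popcorn", "drink"], ["popcorn", "drink", "candy"], ["candy"]]
# Hypothetical credit card histories (each list is an ordered series of transactions)
card_history = [["petrol", "petrol", "big purchase"], ["groceries", "petrol"]]

pairs = co_occurrence(baskets)
steps = transitions(card_history)
print(pairs[("drink", "popcorn")])
print(steps[("petrol", "big purchase")], steps[("big purchase", "petrol")])
```

In the association count the order of popcorn and drink does not matter, while in the sequence count "petrol then big purchase" and "big purchase then petrol" are different events.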
Conclusion: When To Use What
There is more...: Visual Numerics, 3rd-party algorithms: http://www.vni.com/company/whitepapers/MicrosoftBIwithNumericalLibraries.pdf
Excel Data Mining: Microsoft SQL Server 2008 Data Mining Add-ins for Microsoft Office 2007: http://www.microsoft.com/downloads/en/details.aspx?familyid=896A493A-2502-4795-94AE-E00632BA6DE7&displaylang=en
Other usages of data mining: Find patterns - Profiling. Train station / airport: Who is the bad guy? Farmers: Find the best crops. Supermarket: Figure out how to get you to buy more, and where to put the expensive items.
Tip: SSIS 2008 - Data Profiling task. Get a profile of the data in a table: potential candidate keys, length of data values in columns, null percentage of rows, distribution of values, ...
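To give a feel for what such a profile contains, here is a plain Python sketch that computes a few of the same statistics for an in-memory table. It does not call SSIS and does not reproduce the Data Profiling task's output format; the `profile` function, its result keys and the sample rows are assumptions for illustration.

```python
def profile(rows):
    """Compute simple per-column statistics for a list of row dictionaries."""
    report = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        non_null = [v for v in values if v is not None]
        report[col] = {
            # Share of rows where the column is NULL (None)
            "null_percent": 100.0 * (len(values) - len(non_null)) / len(values),
            # Number of different non-null values
            "distinct": len(set(non_null)),
            # Longest value when written as text
            "max_length": max((len(str(v)) for v in non_null), default=0),
            # A column with no nulls and no repeats could serve as a key
            "candidate_key": len(non_null) == len(values)
                             and len(set(values)) == len(values),
        }
    return report

# Hypothetical customer table
customers = [
    {"id": 1, "city": "Sydney"},
    {"id": 2, "city": "Sydney"},
    {"id": 3, "city": None},
    {"id": 4, "city": "Perth"},
]
report = profile(customers)
print(report["id"])
print(report["city"])
```

Running a profile like this before mining helps with the "verify and validate" step: columns that are mostly null or have a single distinct value are unlikely to be useful inputs.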
Resources 1: Video: Simple data mining model: http://www.sqlservercentral.com/articles/Video/65055/ Video: Data mining and Reporting Services: http://www.sqlservercentral.com/articles/Video/64190/ Data Mining Algorithms: http://msdn.microsoft.com/en-us/library/ms175595.aspx
Resources 2: Jamie MacLennan: http://blogs.msdn.com/b/jamiemac/ Richard Lees on BI: http://richardlees.blogspot.com/ Book: Data Mining with Microsoft SQL Server 2008: http://www.amazon.com/gp/product/0470277742?ie=UTF8&tag=sqlserverda09-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0470277742
Summary: Why Data Mining?; Uses; Algorithms; Demo; Hands-on Lab

SQL Server 2008 for Business Intelligence

  • 1. SQL Server 2008 for Business Intelligence UTS Short Course
  • 2. Peter Gfader. Specializes in C# and .NET (Java not anymore). Testing: automated tests. Agile, Scrum: Certified Scrum Trainer. Technology aficionado: Silverlight, ASP.NET, Windows Forms
  • 3. Admin Stuff. Attendance: you initial the sheet. Hands On Lab: you get me to initial the sheet. Homework. Certificate: at the end of the 5 sessions, if I say you have completed the course successfully
  • 4. Course Website Course Timetable & Materials http://www.ssw.com.au/ssw/Events/2010UTSSQL/ Resources http://sharepoint.ssw.com.au/Training/UTSSQL/
  • 6. Last week(s) Other cube browsers Microsoft Data Analyzer Proclarity Excel 2003/2007/2010 Excel services Thinslicer Performance Point Power Pivot
  • 7. Create report on top of Northwind Top 10 customers (Table) Top 10 products (Table) Top 10 employees (Table) 1 chart that shows the top 10 customers 1 usage of the gauge control (surprise me) Homework
  • 9. Step by step to BI Create Data Warehouse Copy data to data warehouse Create OLAP Cubes Create Reports Browse the cube Do some Data Mining Discovering relationships Predict future events
  • 10. Agenda What is Data Mining? Why? Uses Algorithms Demo Hands on Lab
  • 11. What is Data Mining? “Data mining is the use of powerful software tools to discover significant traits or relationships, from databases or data warehouses and often used to predict future events”
  • 12. What is Data Mining? It exploits statistical algorithms Once the “knowledge” is extracted it: Can be used to discover Can be used to predict values of other cases
  • 13. Why Data Mining? Marketing Who picks the movie? The kids, the wife, me Who are our Customers and what sort of films do they hire? Is a 30 year old woman with 2 children going to hire Arnie’s latest film Validation Is this data sensible? Terminator 2 and Toy Story Prediction Sales Next Year
  • 14. Get new information from data: future trends, past trends, outliers, maximums, minimums. Analyse data from different perspectives and summarize it into useful information. New information to increase revenue, cut costs, or both :-) Why? It's all about money
  • 15. Who are our biggest customers? What are customers buying with cigars? What are the customer retention levels of our branches? Which customers have bought olives, feta cheese but no ciabatta bread? Which regions have the highest male/female ratio of single 20 somethings? Which region has lowest customer retention levels and list out lost customers? Which Questions are Data Mining?
  • 16. Ad hoc query Drill through to details Business Intelligence tool What’s not data mining
  • 18. Good raw material → good data mining. Samples should be representative. Samples "similar" to domain. Not an all-seeing crystal ball. Verify and validate! Data: uncover patterns in samples
  • 19. OLAP Is about fast ad hoc querying Analysis by dimensions and measures Gives precise answers Data Mining May use RDBMS or OLAP source Is about discovering and predicting Gives imprecise answers OLAP is not a prerequisite for data mining, but it almost always comes first OLAP versus Data Mining (learning to ride a bike before a car)
  • 20. Classification algorithms predict one or more discrete variables, based on the other attributes in the dataset. Regression algorithms predict one or more continuous variables, such as profit or loss, based on other attributes in the dataset. Segmentation algorithms divide data into groups, or clusters, of items that have similar properties. Association algorithms find correlations between different attributes in a dataset. Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a Web path flow. Types of Data Mining Algorithms
  • 21. Complete Set Of Algorithms: ways to analyze your data. Clustering, Time Series, Decision Trees, Naïve Bayes, Association, Linear Regression, Neural Network, Sequence Clustering, Logistic Regression
  • 22. Split data. Each branch is like an attribute. Brightness = amount of data. Decision trees
  • 23. Decision Trees (1) Decision Trees assign (classify) each case to one of a few (discrete) broad categories of a selected attribute (variable) and explain the classification with a few selected input variables. The process of building the tree is recursive partitioning: splitting data into partitions and then splitting it up more. Initially all cases are in one big box
  • 24. Decision Trees (2) The algorithm tries all possible breaks in classes using all possible values of each input attribute; it then selects the split that partitions data to the purest classes of the searched variable. Several measures of purity. Then it repeats splitting for each new class, again testing all possible breaks. Branches of the tree that are not useful can be pre-pruned or post-pruned
  • 25. Decision Trees (3) Decision trees are used for classification and prediction Typical questions: Predict which customers will leave Help in mailing and promotion campaigns Explain reasons for a decision What are the movies young female customers like to buy?
  • 26. Decision Trees – Who Decides
  • 27. Naïve Bayes. Bayes formula. Uses statistics to say, with a probability, whether a case falls into a certain category or not. Spam filtering: score of spam (Bayes). Testing only a particular attribute
  • 28. Naïve Bayes Quickly builds mining models that can be used for classification and prediction It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute This can later be used to predict an outcome of the predicted attribute based on the known input attributes This makes the model a good option for exploring the data
  • 29. Cluster Analysis (1) Grouping data into clusters Objects within a cluster have high similarity based on the attribute values The class label of each object is not known Several techniques Partitioning methods Hierarchical methods Density based methods Model based methods And more…
  • 30. Cluster Analysis (2) Segments a heterogeneous population into a number of more homogeneous subgroups or clusters. Some typical questions: Discover distinct groups of customers. Identification of groups of houses in a city. In biology, derive animal and plant taxonomies. Find outliers
  • 31. Clustering Annual Income Age
  • 32. Time series. Time-based data → prediction
  • 33. Sequence clustering. Numbers rank the stronger associations. Direction of association (not necessarily the other direction)
  • 34. If you own certain stocks → you maybe own other ones as well. Probability = thickness of line. Association
  • 35. Let system learn how to classify data. Neural Network adapts to the new data. Formulate statement/hypothesis. Outcome is known (Data / Surveys). 1. 70% data to train network (outcome is known) 2. 30% of data to test network (outcome is known) 3. New data (no survey needed, predict from network). Other example: OCR. Neural Nets
  • 36. Both have directions. Sequence Clustering has probability number and colour. They are very similar. The difference is that Association analyses items that occur together whereas Sequence Clustering analyses items that follow one another. An example is that Sequence Clustering might be used by credit card companies to spot fraud, e.g. a petrol station refill followed by another petrol station refill followed by a big purchase = fraud (different transactions). Whereas Association will be more like: when someone buys popcorn at the cinema, they also buy a drink (same transaction). Difference between algorithms: Association and Sequence
  • 38. Visual Numerics 3rd party algorithms http://www.vni.com/company/whitepapers/MicrosoftBIwithNumericalLibraries.pdf There is more...
  • 39. Excel Data Mining Microsoft SQL Server 2008 Data Mining Add-ins for Microsoft Office 2007 http://www.microsoft.com/downloads/en/details.aspx?familyid=896A493A-2502-4795-94AE-E00632BA6DE7&displaylang=en
  • 40. Train station / airport: who is the bad guy? Farmers: find the best crops. Supermarket: figure out how to get you to buy more, where to put the expensive items. Other usages of data mining: find patterns, profiling
  • 41. SSIS 2008 - Data profiling task. Get a profile of the data in a table: potential candidate keys, length of data values in columns, null percentage of rows, distribution of values, .... Tip
  • 42. Video: Simple data mining model http://www.sqlservercentral.com/articles/Video/65055/ Video: Data mining and Reporting Services http://www.sqlservercentral.com/articles/Video/64190/ Data Mining Algorithms http://msdn.microsoft.com/en-us/library/ms175595.aspx Resources 1
  • 43. Jamie MacLennan http://blogs.msdn.com/b/jamiemac/ Richard Lees on BI http://richardlees.blogspot.com/ Book Data Mining with Microsoft SQL Server 2008 http://www.amazon.com/gp/product/0470277742?ie=UTF8&tag=sqlserverda09-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0470277742 Resources 2
  • 44. Summary Why Data Mining? Uses Algorithms Demo Hands on Lab
  • 46. Thank You! Gateway Court Suite 10 81 - 91 Military Road Neutral Bay, Sydney NSW 2089 AUSTRALIA ABN: 21 069 371 900 Phone: + 61 2 9953 3000 Fax: + 61 2 9953 3105 info@ssw.com.au www.ssw.com.au
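
Slide 24 (Decision Trees (2)) describes trying every possible break and keeping the split that gives the purest classes. The following is a minimal Python sketch of one such splitting step, not the SQL Server implementation. It assumes Gini impurity as the purity measure (the slide only says there are several), and the names `gini`, `best_split` and the video-store rows are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def best_split(rows, target):
    """Try every (attribute == value) break and keep the purest partition.

    Returns (weighted impurity, attribute, value), or None if no break
    separates the rows into two non-empty groups.
    """
    best = None
    attributes = [key for key in rows[0] if key != target]
    for attribute in attributes:
        for value in sorted({row[attribute] for row in rows}):
            left = [row[target] for row in rows if row[attribute] == value]
            right = [row[target] for row in rows if row[attribute] != value]
            if not left or not right:
                continue  # this break does not partition the data
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attribute, value)
    return best

# Hypothetical video-store customers; "hires" is the variable to predict.
customers = [
    {"age_group": "young", "has_children": "no", "hires": "action"},
    {"age_group": "young", "has_children": "no", "hires": "action"},
    {"age_group": "adult", "has_children": "yes", "hires": "family"},
    {"age_group": "adult", "has_children": "yes", "hires": "family"},
    {"age_group": "adult", "has_children": "no", "hires": "action"},
    {"age_group": "young", "has_children": "yes", "hires": "family"},
]

split = best_split(customers, "hires")
print(split)
```

In this made-up table, `has_children` separates the two film categories completely, so it is picked ahead of `age_group`. A full tree would repeat `best_split` on each resulting partition, which is the recursive partitioning named on slide 23.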
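
Slide 28 says Naïve Bayes calculates a probability for each state of an input attribute given each state of the predictable attribute, and then uses those to predict an outcome. A small Python sketch of that counting idea follows, using the spam example from slide 27. The data, the function names `train` and `predict`, and the add-one smoothing are assumptions for illustration, not the SQL Server algorithm.

```python
from collections import Counter, defaultdict

def train(rows, target):
    """Count how often each attribute value occurs for each class."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    states = defaultdict(set)            # attribute -> all values seen
    for row in rows:
        for attribute, value in row.items():
            if attribute == target:
                continue
            value_counts[(attribute, row[target])][value] += 1
            states[attribute].add(value)
    return class_counts, value_counts, states

def predict(model, case):
    """Return a normalized score per class for one new case."""
    class_counts, value_counts, states = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for attribute, value in case.items():
            seen = value_counts[(attribute, label)][value]
            # Add-one smoothing (an assumption) avoids multiplying by zero.
            score *= (seen + 1) / (count + len(states[attribute]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: score / norm for label, score in scores.items()}

# Hypothetical labelled messages.
messages = [
    {"has_link": "yes", "mentions_prize": "yes", "label": "spam"},
    {"has_link": "yes", "mentions_prize": "no", "label": "spam"},
    {"has_link": "no", "mentions_prize": "yes", "label": "spam"},
    {"has_link": "no", "mentions_prize": "no", "label": "ham"},
    {"has_link": "no", "mentions_prize": "no", "label": "ham"},
    {"has_link": "yes", "mentions_prize": "no", "label": "ham"},
]

model = train(messages, "label")
result = predict(model, {"has_link": "yes", "mentions_prize": "yes"})
print(result)
```

Each attribute is treated as independent of the others given the class, which is the "naïve" part. It is also why the model is quick to build: training is a single counting pass over the data.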

Editor's notes

  1. Peter Gfader shows SQL Server
  2. Java: current version 1.6 Update 21. 1.7 released next year 2010. Dynamic languages. Parallel computing. Maybe closures
  3. Create the following report on top of Northwind: Top 10 customers (Table), Top 10 products (Table), Top 10 employees (Table), 1 chart that shows the top 10 customers, 1 usage of the gauge control (surprise me). a. Download Report Builder 2 from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9f783224-9871-4eea-b1d5-f3140a253db6&displaylang=en b. Send me the screenshot of the final report
  4. Data mining can be used to uncover patterns in data samples, but it is important to be aware that the use of non-representative samples of data may produce results that are not indicative of the domain. Similarly, data mining will not find patterns that may be present in the domain if those patterns are not present in the sample being "mined". There is a tendency for insufficiently knowledgeable "consumers" of the results to attribute "magical abilities" to data mining, treating the technique as a sort of all-seeing crystal ball. Like any other tool, it only functions in conjunction with the appropriate raw material: in this case, indicative and representative data that the user must first collect. Further, the discovery of a particular pattern in a particular set of data does not necessarily mean that pattern is representative of the whole population from which that data was drawn. Hence, an important part of the process is the verification and validation of patterns on other samples of data.
  6. http://msdn.microsoft.com/en-us/library/ms175595.aspx Ways to analyze your data:
     Decision trees (DT) = split data. Each branch is like an attribute. Brightness = amount of data. TODO: Check out bars.
     Clustering = mapping of popular points. Number of children. Darkness = Lines are links between clusters (associations).
     Time series: time-based data → prediction.
     Sequence clustering: numbers rank the stronger associations. Direction of association (not necessarily the other direction).
     Association: if you own certain stocks → you maybe own other ones as well. Probability = thickness of line.
     Naive Bayes: Bayes formula. Uses statistics to say whether a case falls into a certain category or not (with probability). Spam filtering → score of spam (Bayes). Testing only a particular attribute.
     Neural nets: let system learn how to classify data. Formulate statement/hypothesis. Outcome is known (Data / Surveys). 1. 70% data to train network (outcome is known) 2. 30% of data to test network (outcome is known) 3. New data (no survey needed, predict from network). Ex: OCR. Example above = get loyalty of customers. Neural Network adapts to the new data.
  7. What attributes I am interested in. Algorithm splits data for me.
  8. Pruned = trimmed back (German: gestutzt)
  9. Diff. color = relationship. User clicked on Toy Story 2.
  10. Very easy to set up. Classifies and gives a score → prediction.
  11. Class label: combination of diff. attributes. Name clusters yourself.
  12. Diff. color = relationship. User clicked on Toy Story 2.
  13. Diff. color = relationship. User clicked on Toy Story 2.
  14. Get loyalty of customers
  15. Peter Gfader shows SQL Server
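
The neural net notes above describe using 70% of the cases with a known outcome to train and the remaining 30% to test. A short Python sketch of that hold-out procedure follows. The helper names `split_cases` and `accuracy`, the survey data, and the threshold rule standing in for a trained network are all hypothetical.

```python
import random

def split_cases(cases, train_fraction=0.7, seed=0):
    """Shuffle the cases and cut them into a training and a test set."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, test_cases):
    """Share of test cases where the prediction matches the known outcome."""
    hits = sum(1 for features, outcome in test_cases if predict(features) == outcome)
    return hits / len(test_cases)

# Hypothetical survey: (visits per month, customer is loyal).
surveys = [(visits, visits >= 5) for visits in range(10)]

train_set, test_set = split_cases(surveys)

# Stand-in for a model fitted on train_set; a real network would be trained here.
def predict_loyal(visits):
    return visits >= 5

print(len(train_set), len(test_set), accuracy(predict_loyal, test_set))
```

Because the test cases never take part in training, the accuracy measured on them is a fairer estimate of how the model will behave on new data where no survey exists.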