Presentation for SQL Saturday, Raleigh, NC, September 18, 2010
Overview of using DMX (Data Mining Extensions) in Excel, SSMS (SQL Server Management Studio), BIDS (Business Intelligence Development Studio), and PowerShell
7. Text Mining Product
Comparison from 2008
© 2010 Mark Tabladillo Ph.D.
Feinerer, I., Hornik, K., & Meyer, D. (2008). Text Mining Infrastructure in R. Journal of Statistical Software, 25(5).
8. SQL Server Data Mining
Activity    How
Preprocess  T-SQL; Integration Services; Data Mining Add-In for Excel; .NET programming
Associate   Microsoft Association Rules (algorithm)
Cluster     Microsoft Clustering (algorithm)
Summarize   Integration Services (Term Extraction, Term Lookup)
Categorize  Integration Services
API         Includes DMX, XMLA, AMO, ADOMD.NET
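The Summarize row names Term Extraction and Term Lookup. As a rough illustration of the two ideas only, the Python sketch below counts frequent terms across documents and then looks those terms up in a single document. It is a simplified stand-in, not the Integration Services transformations themselves; the stop-word list, the sample sentences, and every function name are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_terms(documents, min_count=2):
    """Return {term: frequency} for non-stop-word tokens seen at least min_count times."""
    counts = Counter(
        token
        for doc in documents
        for token in tokenize(doc)
        if token not in STOP_WORDS
    )
    return {term: n for term, n in counts.items() if n >= min_count}


def lookup_terms(document, reference_terms):
    """Count how often each reference term occurs in a single document."""
    counts = Counter(tokenize(document))
    return {term: counts[term] for term in reference_terms if counts[term] > 0}


# Made-up sample documents, used only to exercise the two functions.
docs = [
    "The mining model predicts customer churn.",
    "A clustering model groups each customer by behavior.",
    "Text mining starts with term extraction.",
]
terms = extract_terms(docs)          # terms that recur across the corpus
hits = lookup_terms(docs[0], terms)  # occurrences of those terms in one document
```

The extracted terms play the role of a reference table: extraction builds the vocabulary once, and lookup scores each incoming document against it.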
9. APIs for Data Mining
Acronym    Term                                              Definition
DMX        Data Mining Extensions                            SQL-like queries (OLE DB for Data Mining)
XMLA       Extensible Markup Language for Analysis           Client communication protocol
AMO        Analysis Management Objects                       .NET library to manage Analysis Services
ADOMD.NET  ActiveX Data Objects (Multidimensional) for .NET  .NET Framework data provider
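To make the relationship between DMX and XMLA concrete, the Python sketch below wraps a DMX statement in an XMLA Execute element. It is a minimal sketch that only builds the XML text: the SOAP envelope and the transport to Analysis Services are omitted, and the model name, catalog name, and the function build_execute are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by XMLA Execute and Discover requests.
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_execute(statement, catalog):
    """Build an XMLA Execute element that carries one statement (for example, DMX)."""
    ET.register_namespace("", XMLA_NS)  # serialize with a default namespace
    execute = ET.Element(f"{{{XMLA_NS}}}Execute")
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = statement
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    return ET.tostring(execute, encoding="unicode")


# Hypothetical model and database names, for illustration only.
xml_text = build_execute("SELECT * FROM [Hypothetical Model].CONTENT", "HypotheticalDB")
```

In this picture DMX is the query language and XMLA is the envelope that carries it to the server; AMO and ADOMD.NET expose the same server through .NET objects.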
10. DMX Tasks
• Data Definition
  • Create, Alter, Drop - Mining Structure
  • Create, Drop - Mining Model
  • Export and Import Models
• Data Manipulation
  • Query Models, Content, Cases, Sample Cases, Dimension Content
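The data definition and data manipulation tasks above can be sketched as DMX statement text. The Python helpers below only assemble strings; they do not connect to Analysis Services. All structure, model, and column names are hypothetical, and the bracket helper assumes identifiers that contain no closing bracket.

```python
def bracket(name):
    """Wrap an identifier in DMX square brackets (assumes no ']' in the name)."""
    return f"[{name}]"


def create_structure(name, columns):
    """Data definition: columns is a list of (name, data type, content type) tuples."""
    cols = ", ".join(f"{bracket(c)} {dtype} {content}" for c, dtype, content in columns)
    return f"CREATE MINING STRUCTURE {bracket(name)} ({cols})"


def add_model(structure, model, columns, algorithm):
    """Data definition: add a mining model to an existing structure."""
    cols = ", ".join(bracket(c) for c in columns)
    return (
        f"ALTER MINING STRUCTURE {bracket(structure)} "
        f"ADD MINING MODEL {bracket(model)} ({cols}) USING {algorithm}"
    )


def drop_structure(name):
    """Data definition: drop a mining structure."""
    return f"DROP MINING STRUCTURE {bracket(name)}"


def query(model, target="CONTENT"):
    """Data manipulation: select from one of the model's queryable targets."""
    allowed = {"CONTENT", "CASES", "SAMPLE_CASES", "DIMENSION_CONTENT"}
    if target not in allowed:
        raise ValueError(f"unsupported target: {target}")
    return f"SELECT * FROM {bracket(model)}.{target}"


# Hypothetical names, for illustration only.
create = create_structure(
    "Hypothetical Customers",
    [("CustomerKey", "LONG", "KEY"), ("Age", "LONG", "CONTINUOUS")],
)
add = add_model(
    "Hypothetical Customers",
    "Hypothetical Clusters",
    ["CustomerKey", "Age"],
    "Microsoft_Clustering",
)
content_query = query("Hypothetical Clusters")
cases_query = query("Hypothetical Clusters", "CASES")
drop = drop_structure("Hypothetical Customers")
```

The generated text could be pasted into a DMX query window in SSMS or sent through any of the APIs listed earlier.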
11. SQL Server Data Mining Applications (User Interfaces)
User Interface                                   Activity
Excel (and PowerPivot for Excel)                 DMX
BIDS (Business Intelligence Development Studio)  Analysis Services Project; Integration Services Project (T-SQL; DMX; XMLA)
SSMS (SQL Server Management Studio)              T-SQL; DMX; XMLA
PowerShell version 2.0                           T-SQL; DMX; XMLA; AMO; ADOMD.NET
SharePoint                                       (Requires Setup or Customization)
Your Name Here (Develop Your Own)                ?
12. Outline
Tools for
Demos
Text Mining
14. Excel
• Use the 32-bit Excel add-in for Data Mining
• Written for SQL Server 2008, OK for 2008 R2
• Written for Office 2007, OK for 2010
• (Optional) Add the free PowerPivot add-in (http://powerpivot.com)
15. [Diagram]
Diagram labels: Datasets & Models; Public Cloud or On-Premise Private Cloud; SQL Server; PowerPivot; Analysis Services
Data Sources:
• SQL Server
• Access
• Oracle
• Teradata
• Sybase
• Informix
• DB2
• Data Feeds
• Text Files
© 2010 Predixion Software
16. BIDS
• The preferred application for production data mining
• Analysis Services Projects
  • Make Mining Structures and Models
  • Data Mining for OLAP Cubes
  • Excellent for Experimentation
• Integration Services Projects
  • Term Extraction and Term Lookup Text Mining
  • Excellent for Production
• Reporting Services Projects
  • Similar to Crystal Reports
17. SSMS
• Production management and maintenance
• Scripts can become stored procedures
• T-SQL, DMX, MDX, XMLA
19. Excel in Production
• Can create and manage permanent data mining models
• Can document data mining models
• Can do some preprocessing (ETL)
20. BIDS in Production
• Can create a production workflow with Integration Services projects
• Can create production data mining models with Analysis Services projects
21. SSMS in Production
• The standard production user interface for SQL Server
• Also the standard production user interface for Analysis Services Databases
• Built for
  • Scripting (T-SQL, MDX, DMX, XMLA)
  • Security
  • Assembly Registration (Analysis Services)
  • Stored Procedures (SQL Server)
22. PowerShell in Production
• Features
  • Object-oriented
  • Command window or ISE (Integrated Scripting Environment)
  • Accesses .NET libraries and WMI (Windows Management Instrumentation)
  • Version two adds event and exception handling
23. Resources
• MarkTab.NET: blog, links, video resources, and information for data mining
• Blog: http://marktab.net/datamining
• Twitter: @MarkTabNet