An Overview of Microsoft Data Mining Technology
Mark Tabladillo, Ph.D. (MVP, MCAD .NET, MCITP, MCT)
February 12, 2013
About Data Science ATL
Meetup Group
http://www.meetup.com/Data-Science-ATL/
Networking
Interactive
About MarkTab
Training and Consulting with http://marktab.com
Data Mining Resources and Blog at http://marktab.net
Ph.D. – Industrial Engineering, Georgia Tech
Training and consulting internationally across many industries – SAS and Microsoft
Contributed to peer-reviewed research and legislation
  Mentoring doctoral dissertations at the accredited University of Phoenix
Presenter
Interactive
Name three things you want from enterprise data mining
Microsoft Offers
Bing
  Maps
Xbox Kinect
  Hacker Magnet
SQL Server 2012
  Analysis Services (Multidimensional and Data Mining)
  Integration Services
  Semantic Search
  Hadoop Partnership
Excel Projects from Microsoft Research
Outline
Definitions
What is data mining?
Definition
Data mining is the automated or semi-automated process of discovering patterns in data
Machine learning is the development and optimization of algorithms for automated or semi-automated pattern discovery
Purposes
    Phrase                Goal
    “Data Mining”         Inform actionable decisions
    “Machine Learning”    Determine best performing algorithm
MarkTab Decision Cycle
[Figure: decision cycle starting at GO, with Synthesis (art) on one side and Analysis (science) on the other]
Science needs science fiction -- MarkTab
Industry Comparisons
2012-2013
Gartner 2013
Magic Quadrant for Business Intelligence and Analytics Platforms
  Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DZLPEH&ct=130207&st=sb (February 5, 2013)
Microsoft Response
Focus on familiar, intuitive user experiences delivered via high quality, industry-leading products that businesses already know and use today is key to making BI truly accessible to all users.
By providing Business Intelligence capabilities in familiar tools such as Excel and SharePoint, we empower an entirely new segment of business users to build and consume rich BI solutions as part of their everyday work.
Delivering the server-side capabilities to enable self-service BI via SharePoint and SQL Server provides a common, scalable data platform to handle any data, any size, from anywhere, and tackle all of your Big Data needs.

Retrieved from http://blogs.msdn.com/b/microsoft_business_intelligence1/archive/2013/02/07/microsoft-in-leaders-quadrant-of-gartner-magic-quadrant-for-business-intelligence-and-analytics-platforms.aspx (February 2013)
Gartner 2013
Magic Quadrant for Data Warehouse Database Management Systems
  Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DU2VD4&ct=130131&st=sb (January 31, 2013)
KDNuggets 2012
http://marktab.net/datamining/2012/06/15/excel-number-commercial-tool-analytics-data-mining-big-data/
SQL Server 2012
Business Intelligence and Business Analytics
New Platform options: managed services
Deployment models (left to right): Platform (Self Managed), Infrastructure (as a Service), Platform (as a Service), Software (as a Service)
Stack layers shown for each model (top to bottom): Applications, Data, Runtime, Middleware, Database, O/S, Virtualization, Servers, Storage, Networking
[Figure: a "Managed Services" bracket covers a growing share of the stack moving from Infrastructure to Platform to Software (as a Service)]
SQL Release timelines
  1989  SQL Server 1.0 (OS/2)
  1991  SQL Server 1.1 (OS/2)
  1993  SQL Server 4.21 (NT)
  1995  SQL Server 6.0
  1996  SQL Server 6.5
  1998  SQL Server 7.0: Dynamic Locking, Auto-Tuning, Full-text search, Replication, Analysis Services
  2000  SQL Server 2000: Reporting Services
  2005  SQL Server 2005: Unicode Support, Native XML, SQLCLR, Service Broker, Integration Services
  2008  SQL Server 2008: Sparse Columns, Spatial Types, FILESTREAM
  2010  SQL Server 2008 R2: Data-tier Apps, StreamInsight, PowerPivot, Master Data Services
  2012  SQL Server 2012: AlwaysOn, Columnstore, FileTable, Semantic Search, Power View
SQL Azure releases, Feb 10 to Aug 11
  Feb 10  SQL Azure RTW
  Feb 10  SQL Azure SU1 RTW: Alter Edition
  Apr 10  SQL Azure SU2 RTW: MARS
  Jun 10  SQL Azure SU3 RTW: 50 GB Db, Spatial Type, HierarchyId Type
  Jul 10  DataSync CTP1
  Aug 10  SQL Azure SU4 RTW: Database Copy, Web Admin
  Nov 10  DataMarket RTW, SQL Azure Reporting CTP1
  Dec 10  SQL Azure SU6 RTW, DataSync CTP2
  Feb 11  SQL Azure Reporting CTP2, DataSync CTP2 Update
  Apr 11  SQL Azure SU V.Next: Multiple Servers, Server Mgmt API, JDBC, DAC Upgrade
  Aug 11  New Portal Experience, Sparse Columns, SQL Azure Reporting CTP3, SQL Azure DataSync CTP3, DAC Import/Export Service, Denali TSQL
Data platform: SQL Server 2012
  Database Services: SQL Server*, SQL Azure*, Replication, SQL Azure Data Sync*, Full Text & Semantic Search*
  Data Integration Services: Integration Services*, Master Data Services*, Data Quality Services*, StreamInsight*, Project “Austin”*
  Analytical Services: Analysis Services*, Data Mining, PowerPivot*
  Reporting Services: Reporting Services*, SQL Azure Reporting*, Report Builder, Power View*
* New / improved in SQL Server 2012
SQL Server 2012 Editions




    Retrieved from http://www.microsoft.com/en-us/sqlserver/editions.aspx -- February 2013
What Enterprise Tools support Microsoft Data Mining?
[Figure: Data Mining, supported by SSMS, SSIS, and PowerShell]
[Figure: a variable plotted on a scale from 0 to 7, illustrating the content types Discrete, Continuous, and Discretized (listed twice)]
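The content types on this slide (Discrete, Continuous, Discretized) can be illustrated with a minimal sketch. It assumes simple equal-width bucketing, which is only one possible way to discretize a continuous variable and is not necessarily the method Analysis Services applies. The function name discretize_equal_width and the bucket count are hypothetical and chosen for illustration.

```python
def discretize_equal_width(values, bucket_count):
    """Map each continuous value to a bucket index from 0 to bucket_count - 1.

    Assumption: equal-width buckets between the observed minimum and maximum.
    """
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so a single bucket holds everything.
        return [0 for _ in values]
    width = (high - low) / bucket_count
    # The maximum value would land one past the last bucket, so clamp it.
    return [min(int((v - low) / width), bucket_count - 1) for v in values]


# The 0 to 7 scale from the slide, split into four buckets.
scale = [0, 1, 2, 3, 4, 5, 6, 7]
print(discretize_equal_width(scale, 4))  # [0, 0, 1, 1, 2, 2, 3, 3]
```

A discretized variable keeps the ordering of the original continuous values but behaves like a discrete one with a small number of states.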
Data Mining Capacities
   SQL Server 2008 R2 Analysis Services Object                      Maximum sizes/numbers
   Maximum data mining models per structure                         2^31-1 = 2,147,483,647
   Maximum data mining structures per solution                      2^31-1 = 2,147,483,647
   Maximum data mining structures per Analysis Services database    2^31-1 = 2,147,483,647
   Maximum data mining attributes (variables) per structure         2^31-1 = 2,147,483,647

Reference:
http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/
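The limit 2^31-1 that appears in every row is the largest value a signed 32-bit integer can hold. The short check below confirms the arithmetic. That these counts are stored as signed 32-bit integers is an assumption inferred from the number itself, not something the slide states, and the helper name is hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # the limit quoted in the capacities table


def fits_in_signed_32_bit(value):
    """Return True if value can be packed as a signed 32-bit integer."""
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


print(f"{INT32_MAX:,}")                      # 2,147,483,647
print(fits_in_signed_32_bit(INT32_MAX))      # True
print(fits_in_signed_32_bit(INT32_MAX + 1))  # False
```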
Third-Party
Predixion Software
Semantic Search
Text Mining
Future: Most data is Text
Two Research Types
• Quantitative research = data mining
• Qualitative research = text mining
The future is combining both
Statistical Semantic Search
Comprises some aspects of text mining
Identifies statistically relevant key phrases
Based on these phrases, can identify (by score) similar documents
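The two ideas on this slide, statistically relevant key phrases and score-based document similarity, can be sketched in a few lines. This is a generic TF-IDF and cosine similarity illustration, not the algorithm SQL Server uses. Single words stand in for key phrases, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return [w for w in words if w]


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document (smoothed TF-IDF)."""
    tokenized = [tokenize(d) for d in documents]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    total = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * math.log((1 + total) / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return vectors


def key_phrases(vector, top_n):
    """Terms with the highest weights, ties broken alphabetically."""
    return sorted(vector, key=lambda term: (-vector[term], term))[:top_n]


def cosine_similarity(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def rank_similar(documents, index):
    """Other documents as (position, score), most similar first."""
    vectors = tfidf_vectors(documents)
    scores = [
        (i, cosine_similarity(vectors[index], v))
        for i, v in enumerate(vectors)
        if i != index
    ]
    return sorted(scores, key=lambda pair: -pair[1])


docs = [
    "Data mining finds patterns in data",
    "Text mining finds key phrases in documents",
    "The weather is sunny today",
]
print(key_phrases(tfidf_vectors(docs)[0], 1))
print(rank_similar(docs, 0))
```

In this sample, the second document ranks above the third because it shares weighted terms with the first, while the third shares none.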
FileTables
Built on existing SQL Server FILESTREAM technology
Files and documents
   Stored in special tables in SQL Server
   Accessed as if they were stored in the file system
Full-Text Search Enhancements
Property search: search on tagged properties (such as author or title)
Customizable NEAR: find words or phrases close to one another
New Word Breakers and Stemmers (for many languages)
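The idea behind a customizable NEAR, matching terms only when they occur close to one another, can be sketched as follows. This illustrates the concept of proximity search only. The function name near and the gap definition (number of words between the two terms) are assumptions for the example and are not claimed to match the exact semantics of the SQL Server operator.

```python
import re


def near(text, first, second, max_gap):
    """Return True if first and second occur with at most max_gap words between them."""
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) - 1 <= max_gap
        for i in first_positions
        for j in second_positions
        if i != j
    )


sentence = "Semantic search identifies statistically relevant key phrases"
print(near(sentence, "key", "phrases", 0))       # True: adjacent words
print(near(sentence, "semantic", "phrases", 5))  # True: five words in between
print(near(sentence, "semantic", "phrases", 4))  # False: too far apart
```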
From Documents to Output
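[Figure: components of the pipeline from documents to output]
  Inputs: Varchar and NVarchar columns; documents such as Office and PDF (iFilter required)
  Components: iFilters; Semantic Database
  Indexes: Full-Text Keyword Index “FTI”; Semantic Key Phrase Index – Tag Index “TI”; Semantic Document Similarity Index “DSI”
  Output: Rowset Output with Scores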
Languages Currently Supported
Traditional Chinese   Simplified Chinese
German                British English
English               Portuguese
French                Chinese (Hong Kong SAR, PRC)
Italian               Spanish
Brazilian             Chinese (Singapore)
Russian               Chinese (Macau SAR)
Swedish
Phases of Semantic Indexing
[Figure: Full Text Keyword Index “FTI”, Semantic Key Phrase Index – Tag Index “TI”, and Semantic Document Similarity Index “DSI”]
     http://msdn.microsoft.com/en-us/library/gg492085.aspx#SemanticIndexing
Integrated Full Text Search (iFTS)
Improved Performance and Scale:
  Scale-up to 350M documents for storage and search
  iFTS query performance 7-10 times faster than in SQL Server 2008
  Worst-case iFTS query response times less than 3 sec for corpus
  Similar or better than main database search competitors
(2012, Michael Rys, Microsoft)
Linear Scale of FTI/TI/DSI
First known linearly scaling end-to-end Search and Semantic product in the industry
[Figure: Time in Seconds vs. Number of Documents]
(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
Text Mining References
Video
  http://channel9.msdn.com/Shows/DataBound/DataBound-Episode-2-Semantic-Search
  http://www.microsoftpdc.com/2009/SVR32
Semantic Search (Books Online) – explains the demo
  http://msdn.microsoft.com/en-us/library/gg492075.aspx
Paper
  http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
Microsoft Resources
Links
Software
SQL Server 2012 Enterprise
(includes database engine, Analysis Services, SSMS and SSDT)
 http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx
Microsoft Office 2013 Professional
 http://office.microsoft.com/en-us/try
Organizations
 Professional Association for SQL Server http://www.sqlpass.org
   Atlanta MDF http://www.atlantamdf.com/
   Atlanta Microsoft BI Users Group http://www.meetup.com/Atlanta-Microsoft-Business-Intelligence-Users/
PASS Business Analytics Conference http://www.passbaconference.com
Microsoft TechEd North America http://northamerica.msteched.com/
Interactive
Takeaways
Conclusion
Microsoft competes well with other vendors
 Business Intelligence and Analytics
 Data Warehouse
 Excel
SQL Server Data Mining 2012 provides data mining and semantic search
Connect
Data Mining Resources and blog http://marktab.net
Data Mining Training and Consulting (especially Microsoft and SAS)
http://marktab.com

Weitere ähnliche Inhalte

Was ist angesagt?

Whats New Sql Server 2008 R2 Cw
Whats New Sql Server 2008 R2 CwWhats New Sql Server 2008 R2 Cw
Whats New Sql Server 2008 R2 CwEduardo Castro
 
SQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best PracticesSQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best PracticesDenny Lee
 
Introducing SQL Server Data Services
Introducing SQL Server Data ServicesIntroducing SQL Server Data Services
Introducing SQL Server Data Servicesgoodfriday
 
KoprowskiT_SQLSoton_WADBforbeginners
KoprowskiT_SQLSoton_WADBforbeginnersKoprowskiT_SQLSoton_WADBforbeginners
KoprowskiT_SQLSoton_WADBforbeginnersTobias Koprowski
 
SQLUG event: An evening in the cloud: the old, the new and the big
 SQLUG event: An evening in the cloud: the old, the new and the big  SQLUG event: An evening in the cloud: the old, the new and the big
SQLUG event: An evening in the cloud: the old, the new and the big Mike Martin
 
First Look to SSIS 2012
First Look to SSIS 2012First Look to SSIS 2012
First Look to SSIS 2012Pedro Perfeito
 
Sql server reporting services
Sql server reporting servicesSql server reporting services
Sql server reporting servicesssuser1eca7d
 
Ssrs introduction session 1
Ssrs introduction session 1Ssrs introduction session 1
Ssrs introduction session 1Muthuvel P
 
Microsoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMicrosoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMark Ginnebaugh
 
Session 2: SQL Server 2012 with Christian Malbeuf
Session 2: SQL Server 2012 with Christian MalbeufSession 2: SQL Server 2012 with Christian Malbeuf
Session 2: SQL Server 2012 with Christian MalbeufCTE Solutions Inc.
 
Microsoft SQL Server - SQL Server 2008 R2 Editions Datasheet
Microsoft SQL Server - SQL Server 2008 R2 Editions DatasheetMicrosoft SQL Server - SQL Server 2008 R2 Editions Datasheet
Microsoft SQL Server - SQL Server 2008 R2 Editions DatasheetMicrosoft Private Cloud
 
SQL Server 2008 New Features
SQL Server 2008 New FeaturesSQL Server 2008 New Features
SQL Server 2008 New FeaturesDan English
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewDavid Chou
 
Sql azure data services OData
Sql azure data services ODataSql azure data services OData
Sql azure data services ODataEduardo Castro
 
SQL SERVER 2008 R2 CTP
SQL SERVER 2008 R2 CTPSQL SERVER 2008 R2 CTP
SQL SERVER 2008 R2 CTPGovind S Yadav
 
SSRS integration with share point
SSRS integration with share pointSSRS integration with share point
SSRS integration with share pointJacob Chang
 

Was ist angesagt? (18)

Whats New Sql Server 2008 R2 Cw
Whats New Sql Server 2008 R2 CwWhats New Sql Server 2008 R2 Cw
Whats New Sql Server 2008 R2 Cw
 
SQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best PracticesSQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best Practices
 
Introducing SQL Server Data Services
Introducing SQL Server Data ServicesIntroducing SQL Server Data Services
Introducing SQL Server Data Services
 
KoprowskiT_SQLSoton_WADBforbeginners
KoprowskiT_SQLSoton_WADBforbeginnersKoprowskiT_SQLSoton_WADBforbeginners
KoprowskiT_SQLSoton_WADBforbeginners
 
SQLUG event: An evening in the cloud: the old, the new and the big
 SQLUG event: An evening in the cloud: the old, the new and the big  SQLUG event: An evening in the cloud: the old, the new and the big
SQLUG event: An evening in the cloud: the old, the new and the big
 
First Look to SSIS 2012
First Look to SSIS 2012First Look to SSIS 2012
First Look to SSIS 2012
 
Patel v res_(1)
Patel v res_(1)Patel v res_(1)
Patel v res_(1)
 
Sql server reporting services
Sql server reporting servicesSql server reporting services
Sql server reporting services
 
Ssrs introduction session 1
Ssrs introduction session 1Ssrs introduction session 1
Ssrs introduction session 1
 
Microsoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 BertucciMicrosoft SQL Server Distributing Data with R2 Bertucci
Microsoft SQL Server Distributing Data with R2 Bertucci
 
Session 2: SQL Server 2012 with Christian Malbeuf
Session 2: SQL Server 2012 with Christian MalbeufSession 2: SQL Server 2012 with Christian Malbeuf
Session 2: SQL Server 2012 with Christian Malbeuf
 
Microsoft SQL Server - SQL Server 2008 R2 Editions Datasheet
Microsoft SQL Server - SQL Server 2008 R2 Editions DatasheetMicrosoft SQL Server - SQL Server 2008 R2 Editions Datasheet
Microsoft SQL Server - SQL Server 2008 R2 Editions Datasheet
 
SQL Server 2008 New Features
SQL Server 2008 New FeaturesSQL Server 2008 New Features
SQL Server 2008 New Features
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Ssrs 2008 R2 webinar
Ssrs 2008 R2   webinarSsrs 2008 R2   webinar
Ssrs 2008 R2 webinar
 
Sql azure data services OData
Sql azure data services ODataSql azure data services OData
Sql azure data services OData
 
SQL SERVER 2008 R2 CTP
SQL SERVER 2008 R2 CTPSQL SERVER 2008 R2 CTP
SQL SERVER 2008 R2 CTP
 
SSRS integration with share point
SSRS integration with share pointSSRS integration with share point
SSRS integration with share point
 

Andere mochten auch

JBoss Enterprise Data Services (Data Virtualization)
JBoss Enterprise Data Services (Data Virtualization)JBoss Enterprise Data Services (Data Virtualization)
JBoss Enterprise Data Services (Data Virtualization)plarsen67
 
Big data insights with Red Hat JBoss Data Virtualization
Big data insights with Red Hat JBoss Data VirtualizationBig data insights with Red Hat JBoss Data Virtualization
Big data insights with Red Hat JBoss Data VirtualizationKenneth Peeples
 
Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data VirtualizationKenneth Peeples
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Deep Dive: OpenStack Summit (Red Hat Summit 2014)
Deep Dive: OpenStack Summit (Red Hat Summit 2014)Deep Dive: OpenStack Summit (Red Hat Summit 2014)
Deep Dive: OpenStack Summit (Red Hat Summit 2014)Stephen Gordon
 

Andere mochten auch (6)

JBoss Enterprise Data Services (Data Virtualization)
JBoss Enterprise Data Services (Data Virtualization)JBoss Enterprise Data Services (Data Virtualization)
JBoss Enterprise Data Services (Data Virtualization)
 
Big data insights with Red Hat JBoss Data Virtualization
Big data insights with Red Hat JBoss Data VirtualizationBig data insights with Red Hat JBoss Data Virtualization
Big data insights with Red Hat JBoss Data Virtualization
 
Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data Virtualization
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Deep Dive: OpenStack Summit (Red Hat Summit 2014)
Deep Dive: OpenStack Summit (Red Hat Summit 2014)Deep Dive: OpenStack Summit (Red Hat Summit 2014)
Deep Dive: OpenStack Summit (Red Hat Summit 2014)
 

Ähnlich wie Microsoft Data Mining Overview

SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2
SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2
SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2Mark Tabladillo
 
Introducing SQL Server Data Services
Introducing SQL Server Data ServicesIntroducing SQL Server Data Services
Introducing SQL Server Data Servicesgoodfriday
 
System Center
System CenterSystem Center
System CenterBlauge
 
SQL Server Workshop Paul Bertucci
SQL Server Workshop Paul BertucciSQL Server Workshop Paul Bertucci
SQL Server Workshop Paul BertucciMark Ginnebaugh
 
SQL Server 2008 Migration Workshop 04/29/2009
SQL Server 2008 Migration Workshop 04/29/2009SQL Server 2008 Migration Workshop 04/29/2009
SQL Server 2008 Migration Workshop 04/29/2009Database Architechs
 
SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)Gert Drapers
 
Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305Mark Tabladillo
 
Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310Mark Tabladillo
 
Software architecture to analyze licensing needs for pcms- pegasus cargo ma...
Software architecture   to analyze licensing needs for pcms- pegasus cargo ma...Software architecture   to analyze licensing needs for pcms- pegasus cargo ma...
Software architecture to analyze licensing needs for pcms- pegasus cargo ma...Shahzad
 
Introduction to NuoDB - March 2018
Introduction to NuoDB - March 2018Introduction to NuoDB - March 2018
Introduction to NuoDB - March 2018NuoDB
 
Raymond Cochrane 12_12_12
Raymond Cochrane 12_12_12Raymond Cochrane 12_12_12
Raymond Cochrane 12_12_12Ray Cochrane
 
Ms Sql Server Black Book
Ms Sql Server Black BookMs Sql Server Black Book
Ms Sql Server Black BookLiquidHub
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Tobias Koprowski
 
Leveraging PowerPivot
Leveraging PowerPivotLeveraging PowerPivot
Leveraging PowerPivotDan English
 
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginnersKoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginnersTobias Koprowski
 
Moving to the cloud azure, office365, and intune - concurrency
Moving to the cloud   azure, office365, and intune - concurrencyMoving to the cloud   azure, office365, and intune - concurrency
Moving to the cloud azure, office365, and intune - concurrencyConcurrency, Inc.
 
SQL Server 2008 for Developers
SQL Server 2008 for DevelopersSQL Server 2008 for Developers
SQL Server 2008 for Developersukdpe
 

Ähnlich wie Microsoft Data Mining Overview (20)

SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2
SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2
SQL Saturday 79 Enterprise Data Mining for SQL Server 2008 R2
 
Introducing SQL Server Data Services
Introducing SQL Server Data ServicesIntroducing SQL Server Data Services
Introducing SQL Server Data Services
 
SQL Server User Group 02/2009
SQL Server User Group 02/2009SQL Server User Group 02/2009
SQL Server User Group 02/2009
 
System Center
System CenterSystem Center
System Center
 
SQL Server Workshop Paul Bertucci
SQL Server Workshop Paul BertucciSQL Server Workshop Paul Bertucci
SQL Server Workshop Paul Bertucci
 
SQL Server 2008 Migration Workshop 04/29/2009
SQL Server 2008 Migration Workshop 04/29/2009SQL Server 2008 Migration Workshop 04/29/2009
SQL Server 2008 Migration Workshop 04/29/2009
 
SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)SSDT Workshop @ SQL Bits X (2012-03-29)
SSDT Workshop @ SQL Bits X (2012-03-29)
 
Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305
 
Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310
 
Confio presentation
Confio presentationConfio presentation
Confio presentation
 
Software architecture to analyze licensing needs for pcms- pegasus cargo ma...
Software architecture   to analyze licensing needs for pcms- pegasus cargo ma...Software architecture   to analyze licensing needs for pcms- pegasus cargo ma...
Software architecture to analyze licensing needs for pcms- pegasus cargo ma...
 
Introduction to NuoDB - March 2018
Introduction to NuoDB - March 2018Introduction to NuoDB - March 2018
Introduction to NuoDB - March 2018
 
Raymond Cochrane 12_12_12
Raymond Cochrane 12_12_12Raymond Cochrane 12_12_12
Raymond Cochrane 12_12_12
 
Ms Sql Server Black Book
Ms Sql Server Black BookMs Sql Server Black Book
Ms Sql Server Black Book
 
Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008Introduction to SQL Server Analysis services 2008
Introduction to SQL Server Analysis services 2008
 
Leveraging PowerPivot
Leveraging PowerPivotLeveraging PowerPivot
Leveraging PowerPivot
 
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginnersKoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
KoprowskiT_SQLRelay2014#6_Leeds_WADBForBeginners
 
Data In Cloud
Data In CloudData In Cloud
Data In Cloud
 
Moving to the cloud azure, office365, and intune - concurrency
Moving to the cloud   azure, office365, and intune - concurrencyMoving to the cloud   azure, office365, and intune - concurrency
Moving to the cloud azure, office365, and intune - concurrency
 
SQL Server 2008 for Developers
SQL Server 2008 for DevelopersSQL Server 2008 for Developers
SQL Server 2008 for Developers
 

Microsoft Data Mining Overview

  • 1. An Overview of Microsoft Data Mining Technology Mark Tabladillo, Ph.D. (MVP, MCAD .NET, MCITP, MCT) February 12, 2013
  • 2. About Data Science ATL Meetup Group http://www.meetup.com/Data-Science-ATL/
  • 4. About MarkTab: Training and consulting with http://marktab.com. Data mining resources and blog at http://marktab.net. Ph.D. in Industrial Engineering, Georgia Tech. Training and consulting internationally across many industries (SAS and Microsoft). Contributed to peer-reviewed research and legislation. Mentoring doctoral dissertations at the accredited University of Phoenix. Presenter.
  • 5. Interactive Name three things you want from enterprise data mining
  • 6. Microsoft Offers: Bing (Maps); Xbox Kinect (Hacker Magnet); SQL Server 2012 (Analysis Services (Multidimensional and Data Mining), Integration Services, Semantic Search, Hadoop Partnership); Excel Projects from Microsoft Research
  • 9. Definition: Data mining is the automated or semi-automated process of discovering patterns in data. Machine learning is the development and optimization of algorithms for automated or semi-automated pattern discovery.
  • 10. Purposes (Phrase: Goal). “Data Mining”: inform actionable decisions. “Machine Learning”: determine best performing algorithm.
  • 11. MarkTab Decision Cycle: GO; Synthesis (art); Analysis (science). “Science needs science fiction” (MarkTab)
  • 12. MarkTab Decision Cycle: GO; Synthesis (art); Analysis (science)
  • 14. Gartner 2013 Magic Quadrant for Business Intelligence and Analytics Platforms Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DZLPEH&ct=130207&st=sb – February 5, 2013
  • 15. Microsoft Response: “Focus on familiar, intuitive user experiences delivered via high quality, industry-leading products that businesses already know and use today is key to making BI truly accessible to all users. By providing Business Intelligence capabilities in familiar tools such as Excel and SharePoint, we empower an entirely new segment of business users to build and consume rich BI solutions as part of their everyday work. Delivering the server-side capabilities to enable self-service BI via SharePoint and SQL Server provides a common, scalable data platform to handle any data, any size, from anywhere, and tackle all of your Big Data needs.” Retrieved from http://blogs.msdn.com/b/microsoft_business_intelligence1/archive/2013/02/07/microsoft-in-leaders-quadrant-of-gartner-magic-quadrant-for-business-intelligence-and-analytics-platforms.aspx, February 2013
  • 16. Gartner 2013 Magic Quadrant for Data Warehouse Database Management Systems Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DU2VD4&ct=130131&st=sb – January 31, 2013
  • 18. SQL Server 2012 Business Intelligence and Business Analytics
  • 19. New Platform options: managed services. Four models are compared: Platform (Self Managed), Infrastructure (as a Service), Platform (as a Service), Software (as a Service). Each model shows the same stack (Applications, Data, Runtime, Middleware, Database, O/S, Virtualization, Servers, Storage, Networking), with some layers marked as Managed Services.
  • 20. SQL Release timelines. SQL Server: 1989 SQL Server 1.0 (OS/2); 1991 SQL Server 1.1 (OS/2); 1993 SQL Server 4.21 (NT); 1995 SQL Server 6.0; 1996 SQL Server 6.5; 1998 SQL Server 7.0; 2000 SQL Server 2000; 2005 SQL Server 2005; 2008 SQL Server 2008; 2010 SQL Server 2008 R2; 2012 SQL Server 2012. Features highlighted along the timeline: Dynamic Locking, Auto-Tuning, Full-text search, Replication, Analysis Services, Reporting Services, Unicode Support, Native XML, SQLCLR, Service Broker, Integration Services, Sparse Columns, Spatial Types, FILESTREAM, Data-tier Apps, StreamInsight, PowerPivot, Master Data Services, AlwaysOn, Columnstore, FileTable, Semantic Search, Power View. SQL Azure milestones between February 2010 and August 2011: SQL Azure RTW; service updates SU1, SU2, SU3, SU4, SU6 RTW and SU V.Next; DataSync CTP1, CTP2, CTP3; SQL Azure Reporting CTP1, CTP2, CTP3; DataMarket RTW; New Portal Experience; Database Copy; DAC Import/Export Service; DAC Upgrade; Web Admin; MARS; Alter Edition; 50 GB Db; Multiple Servers; Spatial Type; HierarchyId Type; Sparse Columns; Server Mgmt API; JDBC; Denali TSQL.
  • 21. Data platform: SQL Server 2012. Database Services: SQL Server*, SQL Azure*, Replication, SQL Azure Data Sync*, Full Text & Semantic Search*. Data Integration Services: Integration Services*, Master Data Services*, Data Quality Services*, StreamInsight*, Project “Austin”*. Analytical Services: Analysis Services*, Data Mining, PowerPivot*. Reporting Services: Reporting Services*, SQL Azure Reporting*, Report Builder, Power View*. (* New / improved in SQL Server 2012)
  • 22. SQL Server 2012 Editions Retrieved from http://www.microsoft.com/en-us/sqlserver/editions.aspx -- February 2013
  • 23. What Enterprise Tools support Microsoft Data Mining? Data Mining: SSMS, SSIS, PowerShell
  • 26-30. Variable (values 0 1 2 3 4 5 6 7), shown as Continuous, Discrete, and Discretized
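
The slides above contrast continuous, discrete, and discretized treatments of one variable with values 0 to 7. As a rough illustration of what discretizing means, the sketch below buckets those values into equal-width ranges. The function names (`equal_width_edges`, `discretize`) and the choice of four buckets are assumptions made for this example; equal-width bucketing is a generic technique and is not presented here as the method Analysis Services applies.

```python
from bisect import bisect_right

def equal_width_edges(values, buckets):
    """Return the interior cut points that split [min, max] into equal-width buckets."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets
    return [lo + width * i for i in range(1, buckets)]

def discretize(value, edges):
    """Map a continuous value to a bucket index (0-based) using the cut points."""
    return bisect_right(edges, value)

# The variable from the slides: values 0 through 7.
values = [0, 1, 2, 3, 4, 5, 6, 7]

# Hypothetical choice: four buckets.
edges = equal_width_edges(values, buckets=4)
labels = [discretize(v, edges) for v in values]

for v, label in zip(values, labels):
    print(f"value {v} -> bucket {label}")
```

A model that treats the variable as discretized would then work with the bucket index rather than the raw value.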
  • 31. Data Mining Capacities, SQL Server 2008 R2 Analysis Services (Object: Maximum sizes/numbers). Maximum data mining models per structure: 2^31-1 = 2,147,483,647. Maximum data mining structures per solution: 2^31-1 = 2,147,483,647. Maximum data mining structures per Analysis Services database: 2^31-1 = 2,147,483,647. Maximum data mining attributes (variables) per structure: 2^31-1 = 2,147,483,647. Reference: http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/
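
The figure 2^31-1 quoted for each capacity is also the largest value a signed 32-bit integer can hold. The short check below confirms the arithmetic; reading the limits as 32-bit counters is an inference from the number, not something the slide states, and `fits_int32` is a helper name invented for this example.

```python
import struct

LIMIT = 2**31 - 1  # the capacity quoted on the slide

def fits_int32(n):
    """Return True if n can be stored as a signed 32-bit integer."""
    try:
        struct.pack("<i", n)
        return True
    except struct.error:
        return False

print(f"{LIMIT:,}")
print(fits_int32(LIMIT), fits_int32(LIMIT + 1))
```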
  • 34. Future: Most data is Text. Two Research Types: • Quantitative research = data mining • Qualitative research = text mining. The future is combining both.
  • 35. Statistical Semantic Search: Comprises some aspects of text mining. Identifies statistically relevant key phrases. Based on these phrases, can identify (by score) similar documents.
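
The idea of scoring key phrases and then ranking documents by similarity can be sketched with TF-IDF weights and cosine similarity. This is a generic text-mining illustration that swaps in a textbook technique; it is not a description of how SQL Server's semantic search computes its scores. The sample documents, the function names, and the use of single words in place of multi-word phrases are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]

def tfidf_vectors(docs):
    """Return one {term: weight} dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_count = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append({
            term: (count / total) * (math.log(doc_count / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors

def key_terms(vector, k=3):
    """Return the k highest-weighted terms of one document."""
    return sorted(vector.items(), key=lambda item: (-item[1], item[0]))[:k]

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical sample documents.
docs = [
    "data mining finds patterns in data",
    "text mining finds patterns in documents",
    "the weather is sunny today",
]
vectors = tfidf_vectors(docs)

print(key_terms(vectors[0], k=2))
for i in (1, 2):
    print(f"similarity(doc 0, doc {i}) = {cosine(vectors[0], vectors[i]):.3f}")
```

The first two documents share several terms and so score as similar; the third shares none with the first and scores zero.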
  • 36. FileTables: Built on existing SQL Server FILESTREAM technology. Files and documents are stored in special tables in SQL Server and accessed as if they were stored in the file system.
  • 37. Full-Text Search Enhancements: Property search: search on tagged properties (such as author or title). Customizable NEAR: find words or phrases close to one another. New Word Breakers and Stemmers (for many languages).
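
A proximity search such as NEAR asks whether two terms occur within some distance of each other. The sketch below shows the idea over a plain token list. It is an assumption-laden illustration rather than SQL Server's implementation: the `near` function is invented for this example, and distance is taken here to mean the number of tokens lying between the two terms.

```python
def near(text, first, second, max_distance):
    """Return True if `first` and `second` occur with at most
    `max_distance` other tokens between them (in either order)."""
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    first_positions = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_positions = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(
        abs(i - j) - 1 <= max_distance
        for i in first_positions
        for j in second_positions
        if i != j
    )

# Hypothetical sample sentence.
sentence = "Data mining and semantic search"
print(near(sentence, "data", "search", max_distance=3))
print(near(sentence, "data", "search", max_distance=2))
```

In the sample sentence three tokens separate "data" and "search", so the match succeeds with a distance of 3 and fails with 2.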
  • 38. From Documents to Output: Office, PDF, Varchar, NVarchar; Rowset Output with Scores
  • 39. Components: Documents (iFilter Required); iFilters; Full-Text Keyword Index “FTI”; Semantic Database; Semantic Key Phrase Index (Tag Index “TI”); Semantic Document Similarity Index “DSI”
  • 40. Languages Currently Supported: Traditional Chinese; Simplified Chinese; German; British English; English; Portuguese; French; Chinese (Hong Kong SAR, PRC); Italian; Spanish; Brazilian; Chinese (Singapore); Russian; Chinese (Macau SAR); Swedish
  • 41. Phases of Semantic Indexing: Full Text Keyword Index “FTI”; Semantic Document Similarity Index “DSI”; Semantic Key Phrase Index (Tag Index “TI”). http://msdn.microsoft.com/en-us/library/gg492085.aspx#SemanticIndexing
  • 42. Integrated Full Text Search (iFTS): Improved Performance and Scale. Scale-up to 350M documents for storage and search. iFTS query performance 7-10 times faster than in SQL Server 2008. Worst-case iFTS query response times less than 3 sec for corpus. Similar or better than main database search competitors. (2012, Michael Rys, Microsoft)
  • 43. Linear Scale of FTI/TI/DSI: First known linearly scaling end-to-end Search and Semantic product in the industry. Chart: Time in Seconds vs. Number of Documents. (2011, K. Mukerjee, T. Porter, S. Gherman, Microsoft)
  • 44. Text Mining References. Video: http://channel9.msdn.com/Shows/DataBound/DataBound-Episode-2-Semantic-Search and http://www.microsoftpdc.com/2009/SVR32. Semantic Search (Books Online), explains the demo: http://msdn.microsoft.com/en-us/library/gg492075.aspx. Paper: http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
  • 46. Software: SQL Server 2012 Enterprise (includes database engine, Analysis Services, SSMS and SSDT) http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx; Microsoft Office 2012 Professional http://office.microsoft.com/en-us/try
  • 47. Organizations: Professional Association for SQL Server http://www.sqlpass.org; Atlanta MDF http://www.atlantamdf.com/; Atlanta Microsoft BI Users Group http://www.meetup.com/Atlanta-Microsoft-Business-Intelligence-Users/; PASS Business Analytics Conference http://www.passbaconference.com; Microsoft TechEd North America http://northamerica.msteched.com/
  • 49. Conclusion: Microsoft competes well with other vendors in Business Intelligence and Analytics and in Data Warehouse. Excel: Data Mining. SQL Server 2012 provides data mining and semantic search.
  • 50. Connect: Data Mining Resources and blog http://marktab.net; Data Mining Training and Consulting (especially Microsoft and SAS) http://marktab.com