Delivered to SQL Saturday Columbus, GA
Microsoft provides several technologies that can be used for data science, from casual exploration to serious work. This presentation provides an authoritative overview of two major categories: products and services. The products include SQL Server Analysis Services, the Excel Add-in for SSAS, Semantic Search, SQL Server R Services, Microsoft R technologies, and F#. The services include Cortana Intelligence and Bing Predicts. The presenter has used these technologies in various companies and industries, and he will speak to how Microsoft uses them today for its largest Azure customers.
5. Terms Definition
Data Science
Machine Learning
Data Mining
Applied Statistics
the automated or semi-automated process of discovering patterns in data
Applied scientific method
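As a toy illustration of what "discovering patterns in data" can mean in practice, the sketch below counts which pairs of items appear together often enough across a set of transactions. The function name `frequent_pairs`, the `min_support` threshold, and the sample baskets are all hypothetical and are not taken from the presentation or from any Microsoft product.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose share of transactions is at least min_support."""
    counts = Counter()
    for basket in transactions:
        # Count each unordered pair at most once per transaction.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# Hypothetical toy data, invented for this example.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

patterns = frequent_pairs(baskets, min_support=0.5)
for pair, support in patterns.items():
    print(pair, support)
```

Nothing here is specific to any tool in the deck; it only shows the automated part of the definition, where the program, not the analyst, surfaces the recurring combination.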
10. Technology Choices
SQL SERVER ANALYSIS SERVICES: Enterprise; Business Intelligence
EXCEL ADD-IN FOR SSAS: Office 365; Office 2013 or Higher x64
SEMANTIC SEARCH: Enterprise; Business Intelligence; Standard; Web; Express with Advanced Services
MICROSOFT AZURE ML: Free (Size Limited); Paid (Web Service): Experiment + Query
F#: Open Source
SQL SERVER R SERVICES: SQL Server 2016 or higher
22. Time in Seconds vs. Number of Documents
(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
27. Features
Products compared: Microsoft R Open (R Distribution, Free); Microsoft R Client (Free); Microsoft R Server (Commercial)

Big Data
  Microsoft R Open: In-memory bound. Can only process datasets that fit into the available memory.
  Microsoft R Client: In-memory bound. Can process datasets that fit into the available memory. Operates on large volumes when connected to R Server.
  Microsoft R Server: Disk scalability. Operates on bigger volumes & factors.

Speed of Analysis
  Microsoft R Open: Multi-threaded when MKL is installed for non-ScaleR functions.
  Microsoft R Client: Multi-threaded with MKL for non-ScaleR functions. Up to 2 threads for ScaleR functions with a local compute context.
  Microsoft R Server: Full parallel threading & processing.

Enterprise Readiness
  Microsoft R Open: Community support
  Microsoft R Client: Community support
  Microsoft R Server: Commercial support

Analytic Breadth & Depth
  Microsoft R Open: 8000+ open source packages
  Microsoft R Client: Leverage & optimize open source R packages plus 'Big Data'-ready ScaleR packages
  Microsoft R Server: Leverage & optimize open source R packages plus 'Big Data'-ready + Multithreaded ready ScaleR packages

Commercial Viability
  Microsoft R Open: Risk of deployment to open source
  Microsoft R Client: Free for everyone
  Microsoft R Server: Commercial licenses

DeployR Enterprise
  Microsoft R Open: Not available
  Microsoft R Client: Not available
  Microsoft R Server: Included
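The difference between "in-memory bound" and "disk scalability" in the comparison above can be pictured with a small sketch. The code below is not ScaleR and does not claim to reproduce how R Server works internally; it is a minimal illustration, under that assumption, of the general idea of computing a statistic block by block so that only one block has to be held in memory at a time. All names (`chunks`, `summarize`, `combine`, `chunked_mean_variance`) are hypothetical.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Summary = Tuple[int, float, float]  # (count, mean, sum of squared deviations)


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield lists of at most `size` values, as if read block by block from disk."""
    block: List[float] = []
    for value in values:
        block.append(value)
        if len(block) == size:
            yield block
            block = []
    if block:
        yield block


def summarize(block: List[float]) -> Summary:
    """Summarize one block; only this block needs to be in memory."""
    n = len(block)
    mean = sum(block) / n
    m2 = sum((x - mean) ** 2 for x in block)
    return n, mean, m2


def combine(a: Summary, b: Summary) -> Summary:
    """Merge two partial summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


def chunked_mean_variance(values: Iterable[float], chunk_size: int) -> Tuple[float, float]:
    """Return (mean, sample variance) computed one block at a time."""
    total: Optional[Summary] = None
    for block in chunks(values, chunk_size):
        part = summarize(block)
        total = part if total is None else combine(total, part)
    if total is None:
        raise ValueError("no data")
    n, mean, m2 = total
    return mean, (m2 / (n - 1) if n > 1 else 0.0)


# A generator stands in for a file too large to load at once.
mean, var = chunked_mean_variance((float(i) for i in range(1, 101)), chunk_size=8)
print(mean, var)
```

Because `combine` only needs the partial summaries, the same step could merge results produced by separate threads or machines, which is the intuition behind the parallel and distributed entries in the comparison.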
28. Microsoft R Server Editions
Columns: Edition; Description; Install ScaleR; Get Started

R Server for Hadoop
  Description: Scale your analysis transparently by distributing work across nodes without complex programming
  Install ScaleR: Doc
  Get Started: Doc

R Server for Teradata DB
  Description: Run advanced analytics in-database for seamless data analysis
  Install ScaleR: Doc
  Get Started: Doc

R Server for Linux
  Description: Bring predictive and prescriptive analytics power to your Linux environments
  Install ScaleR: Doc
  Get Started: Doc