.NET for Data Science
(and much more)
Marco Parenzan
Marco Parenzan
Solution Sales Specialist in Insight for Digital Innovation
Azure MVP
Community Lead 1nn0va // Pordenone
Linkedin: https://www.linkedin.com/in/marcoparenzan/
C# language evolution
• C# 1.0 was a new managed language
• C# 2.0 introduced generics
• C# 3.0 enabled LINQ
• C# 4.0 was all about interoperability with dynamic non-strongly typed languages.
• C# 5.0 simplified asynchronous programming with the async and await keywords.
• Since C# 6.0 the language has been increasingly shaped by conversation with the
community, now to the point of taking language features as contributions from
outside Microsoft
• C# 7.x was no exception, with tuples and pattern matching as the biggest
features, transforming and streamlining the flow of data and control in code.
Point releases
• C# 7.1, 7.2, 7.3: safe efficient code, more freedom, less code
• C# 8: continuing along the functional path
• C# 9: records, top-level statements
Top level statement
• In only one file per project, you can write the code that would go in a Main
method directly, without declaring the method or its class
• Doing so in a second file is a compile error
• You have no reference to that class (it is compiler generated)
• …and you can use await directly: the generated entry point becomes async
Pythonizing C# (since C# 7.x)
Guido van Rossum joins Microsoft
What about Data Science and Spark?
• In a recent survey, more than 70% of .NET devs expressed
interest in Apache Spark
• Millions of lines of big data-usable business logic are written in
.NET
• But .NET devs are locked out of big data processing because of the lack of
.NET support in OSS big data solutions
• We want a first-class .NET data processing experience
Batch vs. Notebooks
• Batch
• Work on slow data stored in a
data lake
• Submit a complete app in one
single deploy
• Receive the entire output
• Notebook
• «sketching» the code
• Write/delete/rewrite
continuously
• Run cell by cell (or all at
once), interactively
• In a world of Mathematica
The .NET Interactive
experience
Evolution of REPL
• At the beginning there was mono
• Then Dynamic/DLR (C# 4)
• C#/F# interactive (C#6 + Roslyn)
• .NET Try.NET Interactive
.NET Interactive Architectural Overview
• The kernel concept in .NET Interactive is a component that accepts commands and
produces outputs.
• The commands are typically blocks of arbitrary code, and the outputs are events that
describe the results and effects of that code. The Kernel class represents this core
abstraction.
• “Coding like a chat with a Bot”
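The "commands in, events out" idea can be sketched in a few lines. This is a toy in plain Python, not the .NET Interactive Kernel class: the names ToyKernel and Event and the event kinds are assumptions made up for illustration.

```python
import contextlib
import io
from dataclasses import dataclass


@dataclass
class Event:
    kind: str      # "stdout", "return_value", "failed" or "succeeded" (made-up names)
    payload: str


class ToyKernel:
    """Hypothetical stand-in for the kernel idea: commands in, events out."""

    def __init__(self):
        # Variables survive between commands, as they do between notebook cells.
        self.state = {}

    def submit(self, code):
        """Run one command (a block of code) and describe what happened as events."""
        events = []
        buffer = io.StringIO()
        value, error = None, None
        with contextlib.redirect_stdout(buffer):
            try:
                try:
                    # An expression produces a value...
                    value = eval(code, self.state)
                except SyntaxError:
                    # ...while statements only have effects.
                    exec(code, self.state)
            except Exception as exc:
                error = exc
        if buffer.getvalue():
            events.append(Event("stdout", buffer.getvalue()))
        if error is not None:
            events.append(Event("failed", repr(error)))
        else:
            if value is not None:
                events.append(Event("return_value", repr(value)))
            events.append(Event("succeeded", ""))
        return events


kernel = ToyKernel()
for command in ["x = 41", "x + 1", "print('hi')", "1 / 0"]:
    print(command, "->", [event.kind for event in kernel.submit(command)])
```

A front end such as a notebook only has to send commands and render the events it gets back, which is what makes the exchange feel like a chat.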
Jupyter
• Evolution and generalization of the seminal role of Mathematica
(notebook)
• +Python adoption (ipynb)
• +Web (HTTP+HTML+Markdown)
• +Kernel
Demo
Data Processing in an Azure World...
• Thousands of IoT sensors in a factory,
producing petabytes of data
• Started with Stream Analytics...
• Like Azure Data Explorer...
• Data Warehouse...
• ...but is there a standard? Yes!
Apache Spark
The data compute experience
Spark unifies:
• Batch processing
• Interactive SQL
• Real-time processing
• Machine learning
• Deep learning
• Graph processing
A unified, open source, parallel data processing framework for Big Data analytics
[Diagram: Spark Core Engine and Yarn, with Spark SQL (batch processing), Spark Structured Streaming and Spark Streaming (stream processing), Spark MLlib (machine learning) and GraphX (graph computation)]
http://spark.apache.org
General Spark Cluster Architecture
[Diagram: Driver Program (SparkContext), Cluster Manager, three worker nodes each with a cache, and Data Sources (HDFS, SQL, NoSQL, …)]
• ‘Driver’ runs the user’s ‘main’ function
and executes the various parallel
operations on the worker nodes.
• The results of the operations are
collected by the driver
• The worker nodes read and write data
from/to Data Sources including HDFS.
• Worker nodes also cache transformed
data in memory as RDDs (Resilient
Distributed Datasets).
• Worker nodes and the driver node
can run as VMs in public clouds
(AWS, Google and Azure).
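The division of labour between driver and workers can be sketched on a single machine. This is not Spark: threads stand in for worker nodes, and the function names (split, worker_task, driver) are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut the input into roughly equal partitions, one unit of work each."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def worker_task(partition):
    """Runs on a 'worker': transform one partition and return a partial result."""
    return sum(value * value for value in partition)


def driver(data, workers=3):
    """The 'driver': splits the work, ships it out, collects the results."""
    partitions = split(data, workers)
    # Threads stand in for worker nodes; a real cluster would ship the task
    # to other machines through the cluster manager.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(worker_task, partitions))
    # Only the small partial results travel back to the driver.
    return sum(partials)


total = driver(list(range(10)))
print(total)
```

The point of the shape is that the heavy work happens next to the data, and the driver only combines small partial results.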
What makes Spark fast
[Diagram: a chain of steps alternating "Read from HDFS" and "Write to HDFS"]
DataFrame as the core of Spark
• Recipe:
• Create Session
• Create DataFrame
• Define a user defined function
• Manipulate and view Data
[Diagram: CSV, JSON, RDBMS, Parquet and binary data all feed a DataFrame; the user programs against the DataFrame abstraction]
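The four steps of the recipe can be walked through with a toy stand-in. This is plain Python, not the Spark API: TinySession, TinyFrame and their methods are hypothetical names, and the sensor readings are made-up sample data.

```python
import csv
import io


class TinyFrame:
    """Hypothetical in-memory stand-in for a DataFrame: rows with named columns."""

    def __init__(self, rows):
        self.rows = [dict(row) for row in rows]

    def with_column(self, name, func, source):
        """Return a new frame where column `name` is `func` applied to `source`."""
        return TinyFrame({**row, name: func(row[source])} for row in self.rows)

    def where(self, predicate):
        """Return a new frame with only the rows matching `predicate`."""
        return TinyFrame(row for row in self.rows if predicate(row))

    def show(self):
        for row in self.rows:
            print(row)


class TinySession:
    """Hypothetical entry point that knows how to load data into a frame."""

    def read_csv(self, text):
        return TinyFrame(csv.DictReader(io.StringIO(text)))


# 1. Create the session.
session = TinySession()

# 2. Create the DataFrame (inline CSV text instead of a file, made-up values).
readings = session.read_csv("sensor,celsius\na,20.5\nb,31.0\nc,28.4\n")


# 3. Define a user-defined function.
def to_fahrenheit(celsius):
    return float(celsius) * 9 / 5 + 32


# 4. Manipulate and view the data.
hot = (readings
       .with_column("fahrenheit", to_fahrenheit, source="celsius")
       .where(lambda row: row["fahrenheit"] > 80))
hot.show()
```

Whatever the source format, the program only talks to the frame, which is the sense in which the user programs against the DataFrame abstraction.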
Demo
.NET for Apache Spark
• .NET bindings (C# and F#) to Spark
• Written on the Spark interop layer,
designed to provide high
performance bindings to multiple
languages
• Re-use knowledge, skills, code
you have as a .NET developer
• Compliant with .NET Standard
• You can use .NET for Apache
Spark anywhere you write .NET
code
• Original project: Mobius
• https://github.com/microsoft/Mobius
.NET Spark support
Spark DataFrames with SparkSQL
• Spark 2.3.x, 2.4.x, 3.0
• ~300 SparkSQL functions
• Delta Lake
.NET Standard 2.0
• C#/F#
• .NET Framework 4.6.1+
• .NET Core 2.1+
Batch & Streaming
• Structured Streaming
Data Science
• ML.NET
• Notebooks
Using .NET for Spark
• Get started with .NET for Apache Spark | Microsoft Docs
• https://docs.microsoft.com/en-us/dotnet/spark/tutorials/get-started?tabs=windows
• Install .NET
• Install Java
• Install Apache Spark
• Install .NET for Apache Spark
• Create your app
• Install NuGet package
Run Spark with .NET in a container
• Not an official Microsoft image
• https://hub.docker.com/r/3rdman/dotnet-spark
• Install nothing on your machine other than Docker
• Launch and debug your code from Visual Studio and Visual Studio
Code
• Very good for development
Demo
The Azure Synapse Analytics
Experience
Engine for business-changing insights with seamless ecosystem integration
[Diagram: Azure Synapse Analytics combines data integration, data warehousing and big data processing on top of Azure Data Lake Storage + Common Data Model]
Azure Synapse Analytics
Limitless analytics service with unmatched time to insight
• Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics. Data lake integrated and Common Data Model aware
• Metastore, Security, Management, Monitoring: integrated platform services for management, security, monitoring, and metastore
• Data Integration
• Analytics Runtimes: integrated analytics runtimes available provisioned and serverless on-demand
• Synapse SQL offering T-SQL for batch, streaming and interactive processing
• Apache Spark for big data processing with Python, Scala and .NET
• Form Factors: provisioned, on-demand
• Languages: SQL, Python, .NET, Java, Scala, R. Multiple languages suited to different analytics workloads
• Experience: Synapse Analytics Studio. SaaS developer experiences for code free and code first
• Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
• Designed for analytics workloads at any scale
Manage – Apache Spark pools
• Overview
• Provides ability to Pause, Scale, Assign Tags and upload packages from Studio.
Spark ML Algorithms
Synapse Job Service
[Diagram: Synapse Studio, AAD, Auth Service, Gateway and Resource Provider in front of the Synapse Service; Job Service Frontend (Spark API, Controller, …), Job Service Backend (Spark Plugin), Instance Creation Service and their DBs; an Azure Spark Instance made of VMs]
Develop Hub - Notebooks
• Notebooks
• Let you use multiple languages in one notebook
• %%<Name of language>
• Offer temporary tables shared across languages
• Language support: syntax highlighting, syntax errors, code completion,
smart indent, code folding
• Export results
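How a %%<Name of language> line can route a cell to the right language is easy to sketch. This toy is not Synapse code: the function names, the default language and the runners (which only label the code instead of executing it) are assumptions for illustration.

```python
def split_cell(cell_text, default_language="csharp"):
    """Return (language, code) for one notebook cell."""
    lines = cell_text.splitlines()
    if lines and lines[0].startswith("%%"):
        # The first line names the language; the rest is the code to run.
        language = lines[0][2:].strip().lower()
        return language, "\n".join(lines[1:])
    # No magic line: the cell uses the notebook's default language (an assumption).
    return default_language, cell_text


def run_notebook(cells, runners, default_language="csharp"):
    """Send every cell to the runner registered for its language."""
    outputs = []
    for cell in cells:
        language, code = split_cell(cell, default_language)
        if language not in runners:
            raise ValueError(f"no runner registered for %%{language}")
        outputs.append(runners[language](code))
    return outputs


# Hypothetical runners that only tag the code with the language that got it.
runners = {
    "csharp": lambda code: ("csharp", code),
    "sql": lambda code: ("sql", code),
}
cells = ["var x = 1;", "%%sql\nSELECT 1"]
results = run_notebook(cells, runners)
print(results)
```

Because every cell ends up in some language runner attached to the same session, sharing data between languages has to go through something all of them can see, which is the role the temporary tables play.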
Demo
Conclusion
• Spark is here to stay
• Now there is a place for .NET skills too
• Azure Synapse Analytics is the best fusion of the old and
the new world
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Using Databricks as an Analysis Platform
Using Databricks as an Analysis PlatformUsing Databricks as an Analysis Platform
Using Databricks as an Analysis PlatformDatabricks
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesJen Aman
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesAlexis Seigneurin
 
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiDatabricks
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingDatabricks
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleDatabricks
 
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...Databricks
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLDatabricks
 
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaJen Aman
 
Time Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTTime Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTMarco Parenzan
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and DeltaDatabricks
 
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Alex Zeltov
 
Data Science with Spark & Zeppelin
Data Science with Spark & ZeppelinData Science with Spark & Zeppelin
Data Science with Spark & ZeppelinVinay Shukla
 
SQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightSQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightTillmann Eitelberg
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Spark Summit
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...Databricks
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeDatabricks
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Databricks
 

Was ist angesagt? (20)

Using Databricks as an Analysis Platform
Using Databricks as an Analysis PlatformUsing Databricks as an Analysis Platform
Using Databricks as an Analysis Platform
 
Deep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best PracticesDeep Learning on Apache® Spark™ : Workflows and Best Practices
Deep Learning on Apache® Spark™ : Workflows and Best Practices
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and Microservices
 
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath LakkundiApache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
Apache Spark At Apple with Sam Maclennan and Vishwanath Lakkundi
 
Insights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured StreamingInsights Without Tradeoffs: Using Structured Streaming
Insights Without Tradeoffs: Using Structured Streaming
 
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life CycleMLflow: Infrastructure for a Complete Machine Learning Life Cycle
MLflow: Infrastructure for a Complete Machine Learning Life Cycle
 
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...
Real-Time Machine Learning with Redis, Apache Spark, Tensor Flow, and more wi...
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETL
 
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
 
Time Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETTTime Series Anomaly Detection with Azure and .NETT
Time Series Anomaly Detection with Azure and .NETT
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta
 
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
 
Data Science with Spark & Zeppelin
Data Science with Spark & ZeppelinData Science with Spark & Zeppelin
Data Science with Spark & Zeppelin
 
SQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsightSQL Server Konferenz 2014 - SSIS & HDInsight
SQL Server Konferenz 2014 - SSIS & HDInsight
 
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Spark Sum...
 
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
BigDL: Bringing Ease of Use of Deep Learning for Apache Spark with Jason Dai ...
 
Accelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks RuntimeAccelerating Machine Learning on Databricks Runtime
Accelerating Machine Learning on Databricks Runtime
 
LinkedIn
LinkedInLinkedIn
LinkedIn
 
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
Dynamic DDL: Adding Structure to Streaming Data on the Fly with David Winters...
 

Ähnlich wie .NET per la Data Science e oltre

Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Michael Rys
 
.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversaMarco Parenzan
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Michael Rys
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine LearningMark Tabladillo
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventTrivadis
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Wes McKinney
 
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015Mike Broberg
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopAmanda Casari
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 editionDavid Talby
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Paulo Gutierrez
 
Teaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Teaching Apache Spark: Demonstrations on the Databricks Cloud PlatformTeaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Teaching Apache Spark: Demonstrations on the Databricks Cloud PlatformYao Yao
 
Ananth_Ravishankar
Ananth_RavishankarAnanth_Ravishankar
Ananth_Ravishankarananth R
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAlberto Diaz Martin
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with SparkVincent GALOPIN
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksAlberto Diaz Martin
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsAndrew Brust
 
Using PySpark to Process Boat Loads of Data
Using PySpark to Process Boat Loads of DataUsing PySpark to Process Boat Loads of Data
Using PySpark to Process Boat Loads of DataRobert Dempsey
 

Ähnlich wie .NET per la Data Science e oltre (20)

Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
Building data pipelines for modern data warehouse with Apache® Spark™ and .NE...
 
.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa.net developer for Jupyter Notebook and Apache Spark and viceversa
.net developer for Jupyter Notebook and Apache Spark and viceversa
 
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
Bringing the Power and Familiarity of .NET, C# and F# to Big Data Processing ...
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018
 
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015
Apache Spark™ + IBM Watson + Twitter DataPalooza SF 2015
 
Apache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code WorkshopApache Spark for Everyone - Women Who Code Workshop
Apache Spark for Everyone - Women Who Code Workshop
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要
 
Teaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Teaching Apache Spark: Demonstrations on the Databricks Cloud PlatformTeaching Apache Spark: Demonstrations on the Databricks Cloud Platform
Teaching Apache Spark: Demonstrations on the Databricks Cloud Platform
 
Ananth_Ravishankar
Ananth_RavishankarAnanth_Ravishankar
Ananth_Ravishankar
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with Spark
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
 
Using PySpark to Process Boat Loads of Data
Using PySpark to Process Boat Loads of DataUsing PySpark to Process Boat Loads of Data
Using PySpark to Process Boat Loads of Data
 
Venkata
VenkataVenkata
Venkata
 
Ow
OwOw
Ow
 

Mehr von Marco Parenzan

Azure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineerAzure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineerMarco Parenzan
 
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxStatic abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxMarco Parenzan
 
Azure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT SolutionsAzure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT SolutionsMarco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT Central Marco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralMarco Parenzan
 
Power BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralPower BI Streaming Data Flow e Azure IoT Central
Power BI Streaming Data Flow e Azure IoT CentralMarco Parenzan
 
Developing Actors in Azure with .net
Developing Actors in Azure with .netDeveloping Actors in Azure with .net
Developing Actors in Azure with .netMarco Parenzan
 
Math with .NET for you and Azure
Math with .NET for you and AzureMath with .NET for you and Azure
Math with .NET for you and AzureMarco Parenzan
 
Power BI data flow and Azure IoT Central
Power BI data flow and Azure IoT CentralPower BI data flow and Azure IoT Central
Power BI data flow and Azure IoT CentralMarco Parenzan
 
.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogame.net for fun: write a Christmas videogame
.net for fun: write a Christmas videogameMarco Parenzan
 
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...
Building IoT infrastructure on edge with .net, Raspberry PI and ESP32 to conn...Marco Parenzan
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETMarco Parenzan
 
Deploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data SolutionsDeploy Microsoft Azure Data Solutions
Deploy Microsoft Azure Data SolutionsMarco Parenzan
 
Deep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetDeep Dive Time Series Anomaly Detection in Azure with dotnet
Deep Dive Time Series Anomaly Detection in Azure with dotnetMarco Parenzan
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .netMarco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .netMarco Parenzan
 
Running Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magicRunning Kafka and Spark on Raspberry PI with Azure and some .net magic
Running Kafka and Spark on Raspberry PI with Azure and some .net magicMarco Parenzan
 
Code Generation for Azure with .net
Code Generation for Azure with .netCode Generation for Azure with .net
Code Generation for Azure with .netMarco Parenzan
 

Mehr von Marco Parenzan (20)

Azure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineerAzure IoT Central per lo SCADA engineer
Azure IoT Central per lo SCADA engineer
 
Azure Hybrid @ Home
Azure Hybrid @ HomeAzure Hybrid @ Home
Azure Hybrid @ Home
 
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptxStatic abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
Static abstract members nelle interfacce di C# 11 e dintorni di .NET 7.pptx
 
Azure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT SolutionsAzure Synapse Analytics for your IoT Solutions
Azure Synapse Analytics for your IoT Solutions
 
Power BI Streaming Data Flow e Azure IoT Central
.NET for Data Science and Beyond

  • 1. .NET for Data Science (and more), Marco Parenzan
  • 3. Marco Parenzan
      • Solution Sales Specialist in Insight for Digital Innovation
      • Azure MVP
      • Community Lead 1nn0va // Pordenone
      • LinkedIn: https://www.linkedin.com/in/marcoparenzan/
  • 5. C# language evolution
      • C# 1.0 was a new managed language
      • C# 2.0 introduced generics
      • C# 3.0 enabled LINQ
      • C# 4.0 was all about interoperability with dynamic, non-strongly typed languages
      • C# 5.0 simplified asynchronous programming with the async and await keywords
      • C# 6.0: the language has been increasingly shaped by conversation with the community, now to the point of taking language features as contributions from outside Microsoft
      • C# 7.x is no exception, with tuples and pattern matching as the biggest features, transforming and streamlining the flow of data and control in code
      • Point releases
          • C# 7.1, 7.2, 7.3: safe efficient code, more freedom, less code
          • C# 8: continuing along the functional path
          • C# 9: records, top-level statements
  • 7. Top-level statements
      • In exactly one file per project, you can write directly the code that would otherwise go in a Main method, without declaring the method or its class
      • Top-level statements in a second file are a compile error
      • You cannot reference the enclosing class (it is compiler generated)
      • ...and you can use await directly: the generated entry point becomes async when needed
  • 8. Pythonizing C# (since C# 7.x)
  • 9. Guido van Rossum joins Microsoft
  • 10. What about Data Science and Spark?
      • In a recent survey, more than 70% of .NET devs expressed interest in Apache Spark
      • Millions of lines of big data-usable business logic are written in .NET
      • But .NET devs are locked out of big data processing, because OSS big data solutions lack .NET support
      • We want a first-class .NET data processing experience
  • 11. Batch vs. Notebooks
      • Batch
          • Work on slow data stored in a data lake
          • Submit a complete app in one single deploy
          • Receive the entire output
      • Notebook
          • "Sketching" the code
          • Write/delete/rewrite continuously
          • Run interactively, cell by cell (or all at once)
          • In the tradition of Mathematica
  • 13. Evolution of REPL
      • In the beginning there was Mono
      • Then Dynamic/DLR (C# 4)
      • C#/F# interactive (C# 6 + Roslyn)
      • Try .NET / .NET Interactive
  • 14. .NET Interactive Architectural Overview
      • The kernel concept in .NET Interactive is a component that accepts commands and produces outputs.
      • The commands are typically blocks of arbitrary code, and the outputs are events that describe the results and effects of that code. The Kernel class represents this core abstraction.
      • "Coding like a chat with a bot"
  • 15. Jupyter
      • Evolution and generalization of the seminal role of Mathematica (notebook)
      • + Python adoption (ipynb)
      • + Web (HTTP + HTML + Markdown)
      • + Kernel
  • 16. Demo
  • 17. Data Processing in an Azure World...
      • Thousands of IoT sensors in a factory, producing petabytes of data
      • Started with Stream Analytics...
      • Then the likes of Azure Data Explorer...
      • Data Warehouse...
      • ...but is there a standard? Yes!
  • 18. Apache Spark The data compute experience
  • 19. Apache Spark (http://spark.apache.org)
      • A unified, open source, parallel data processing framework for Big Data Analytics
      • Spark unifies: batch processing, interactive SQL, real-time processing, machine learning, deep learning, graph processing
      • [Diagram: Spark Core Engine; Spark SQL (batch processing); Spark Structured Streaming and Spark Streaming (stream processing); Spark MLlib (machine learning); GraphX (graph computation); Yarn]
  • 20. General Spark Cluster Architecture
      • [Diagram: driver program (SparkContext), cluster manager, worker nodes each with a cache, data sources (HDFS, SQL, NoSQL, …)]
      • The "driver" runs the user's "main" function and executes the various parallel operations on the worker nodes.
      • The results of the operations are collected by the driver.
      • The worker nodes read and write data from/to data sources, including HDFS.
      • Worker nodes also cache transformed data in memory as RDDs (Resilient Distributed Datasets).
      • Worker nodes and the driver node execute as VMs in public clouds (AWS, Google and Azure).
  • 21. What makes Spark fast
      • [Diagram: a chain of alternating "Read from HDFS" and "Write to HDFS" steps]
  • 22. DataFrame as the core of Spark
      • Recipe:
          • Create a session
          • Create a DataFrame
          • Define a user-defined function
          • Manipulate and view data
      • [Diagram: CSV, JSON, RDBMS, Parquet and binary data all feed a DataFrame; the user programs against the DataFrame abstraction]
  • 23. Demo
  • 24. .NET for Apache Spark
      • .NET bindings (C# and F#) to Spark
      • Written on the Spark interop layer, designed to provide high performance bindings to multiple languages
      • Re-use the knowledge, skills and code you have as a .NET developer
      • Compliant with .NET Standard
      • You can use .NET for Apache Spark anywhere you write .NET code
      • Original project: Mobius
      • https://github.com/microsoft/Mobius
  • 25. .NET Spark support
      • Spark DataFrames with SparkSQL
          • Spark 2.3.x, 2.4.x, 3.0
          • ~300 SparkSQL functions
          • Delta Lake
      • .NET Standard 2.0
          • C#/F#
          • .NET Framework 4.6.1+
          • .NET Core 2.1+
      • Batch & Streaming
          • Structured Streaming
      • Data Science
          • ML.NET
          • Notebooks
  • 26. Using .NET for Spark
      • Get started with .NET for Apache Spark | Microsoft Docs
      • https://docs.microsoft.com/en-us/dotnet/spark/tutorials/get-started?tabs=windows
      • Install .NET
      • Install Java
      • Install Apache Spark
      • Install .NET for Apache Spark
      • Create your app
      • Install NuGet package
  • 27. Run Spark with .NET in a container
      • Not an official Microsoft image
      • https://hub.docker.com/r/3rdman/dotnet-spark
      • Install nothing on your machine other than Docker
      • Launch and debug your code from Visual Studio and Visual Studio Code
      • Very good for development
  • 28. Demo
  • 29. The Azure Synapse Analytics Experience
  • 30. Azure Synapse Analytics: engine for business-changing insights with seamless ecosystem integration
      • Data integration
      • Data warehousing
      • Big data processing
      • Azure Data Lake Storage + Common Data Model
  • 31. Azure Synapse Analytics: limitless analytics service with unmatched time to insight
      • Designed for analytics workloads at any scale: Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
      • Experience: Synapse Analytics Studio, SaaS developer experiences for code free and code first
      • Languages: SQL, Python, .NET, Java, Scala, R; multiple languages suited to different analytics workloads
      • Form factors: provisioned and on-demand; integrated analytics runtimes available provisioned and serverless on-demand
      • Analytics runtimes: Synapse SQL offering T-SQL for batch, streaming and interactive processing; Apache Spark for big data processing with Python, Scala and .NET
      • Data integration
      • Integrated platform services for management, security, monitoring and metastore
      • Synapse Analytics Platform: Azure Data Lake Storage, Common Data Model, enterprise security, optimized for analytics; data lake integrated and Common Data Model aware
  • 32. Manage – Apache Spark pools
      • Overview
      • Provides the ability to pause, scale, assign tags and upload packages from Studio.
  • 35. Synapse Job Service
      • [Diagram: Synapse Studio, Gateway, AAD, Auth Service, Resource Provider, Synapse Service, Job Service Frontend (Spark API Controller, …), Job Service Backend (Spark Plugin), Instance Creation Service, DBs, Azure Spark instance made of VMs]
  • 36. Develop Hub - Notebooks
      • Notebooks
          • Allow writing multiple languages in one notebook, with %%<Name of language>
          • Offer use of temporary tables across languages
          • Language support: syntax highlighting, syntax errors, code completion, smart indent, code folding
          • Export results
  • 37. Demo
  • 39. Conclusion
      • Spark is here to stay
      • Now it is a place for .NET skills too
      • Azure Synapse Analytics is the best fusion between the old and the new world
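
The kernel idea from slide 14 (a component that accepts commands and produces output events) can be sketched in a few lines. This is a toy illustration in plain Python, not the .NET Interactive API: `ToyKernel`, `KernelEvent` and the event kinds `output`, `return_value` and `error` are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass
import contextlib
import io


@dataclass
class KernelEvent:
    """One event describing a result or effect of a submitted command."""
    kind: str        # "output", "return_value" or "error" (names assumed here)
    payload: object


class ToyKernel:
    """Hypothetical kernel: blocks of code go in, a list of events comes out."""

    def __init__(self):
        # State shared across submissions, like variables in a notebook session.
        self.namespace = {}

    def submit(self, code):
        """Run one block of code and return the events it produced."""
        stdout = io.StringIO()
        value = None
        error = None
        with contextlib.redirect_stdout(stdout):
            try:
                try:
                    # Try the block as a single expression first ...
                    compiled = compile(code, "<cell>", "eval")
                except SyntaxError:
                    # ... otherwise run it as one or more statements.
                    exec(compile(code, "<cell>", "exec"), self.namespace)
                else:
                    value = eval(compiled, self.namespace)
            except Exception as exc:
                error = exc

        events = []
        if stdout.getvalue():
            events.append(KernelEvent("output", stdout.getvalue()))
        if error is not None:
            events.append(KernelEvent("error", repr(error)))
        elif value is not None:
            events.append(KernelEvent("return_value", value))
        return events


if __name__ == "__main__":
    kernel = ToyKernel()
    for cell in ["x = 20", "x + 22", "print('hello')", "1 / 0"]:
        print(cell, "->", kernel.submit(cell))
```

Each submission behaves like one notebook cell: state survives between calls, and the caller only ever sees events, which is what makes the "chat with a bot" style of coding possible.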
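
The driver/worker split from slide 20 can be illustrated the same way. The sketch below is not Spark: threads stand in for worker nodes, a Python list stands in for the data source, and the function names (`split_into_partitions`, `worker_task`, `driver`) are assumptions made for this example. It only shows the shape of the flow: the driver partitions the work, workers process their partitions in parallel, and the driver collects and combines the results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def worker_task(partition):
    """Work done on one 'node': a partial sum of squares for its partition."""
    return sum(value * value for value in partition)


def driver(records, num_workers=3):
    """Driver role: hand partitions to workers, then collect and combine."""
    partitions = split_into_partitions(records, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each partition is processed independently, as on separate workers.
        partial_results = list(pool.map(worker_task, partitions))
    # The driver collects the per-partition results and merges them.
    return sum(partial_results)


if __name__ == "__main__":
    print(driver(list(range(1, 11))))
```

In a real cluster the partitions live on different machines and the cluster manager decides where each task runs; here everything stays in one process to keep the sketch self-contained.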