Presented at: Global Azure Bootcamp (Melbourne)
Participants will get a deep dive into one of Azure’s newest offering: Azure Databricks, a fast, easy and collaborative Apache® Spark™ based analytics platform optimized for Azure. In this session, we will go through Azure Databricks key collaboration features, cluster management, and tight data integration with Azure data sources. We’ll also walk through an end-to-end Recommendation System Data Pipeline built using Spark on Azure Databricks.
19. Optimized Databricks Runtime Engine
DATABRICKS I/O SERVERLESS
Collaborative Workspace
Cloud storage
Data warehouses
Hadoop storage
IoT / streaming data
Rest APIs
Machine learning models
BI tools
Data exports
Data warehouses
AZURE DATABRICKS
Enhance Productivity
Deploy Production Jobs & Workflows
APACHE SPARK
MULTI-STAGE PIPELINES
DATA ENGINEER
JOB SCHEDULER NOTIFICATION & LOGS
DATA SCIENTIST BUSINESS ANALYST
Build on secure & trusted cloud Scale without limits
20. Cosmos DB
Kafka on HDInsight
Event Hubs
Power BI
SQL DW
Data Factory
O R C H E S T R A T I O N
Storage (Azure) Azure Data Lake
S T O R A G E
I N G E S T V I S U A L I Z E
S E C U R E Azure Active Directory
A Z U RE DATA BRIC KS
25. https://movielens.org/
F. Maxwell Harper and Joseph A. Konstan. 2015.
The MovieLens Datasets: History and Context.
ACM Transactions on Interactive Intelligent
Systems (TiiS) 5, 4, Article 19 (December 2015), 19
pages. DOI=http://dx.doi.org/10.1145/2827872
35. Collaborative Workspace
Deploy Production Jobs & Workflows
MULTI-STAGE PIPELINES
DATA ENGINEER
JOB SCHEDULER NOTIFICATION & LOGS
DATA SCIENTIST BUSINESS ANALYST