Modern ETL: Azure Data Factory,
Data Lake, and SQL Database
Eric Bragas
Please Support Our Sponsors
Local User Groups
Los Angeles User Group
3rd Thursday of each odd month
sqlla.pass.org
Malibu User Group
3rd Wednesday of each month
sqlmalibu.pass.org
San Diego User Group
1st & 3rd Thursday of each month
meetup.com/sdsqlug
meetup.com/sdsqlbig
Los Angeles - Korean
Every Other Tuesday
sqlangeles.pass.org
Orange County User Group
2nd Thursday of each month
bigpass.pass.org
SQLSaturday Los Angeles
June 9th
SQLSaturday San Diego
September 15th
SQL Summit
Annual International Conference
November 6-9 | Seattle, WA
2 Days of Pre-Cons
200+ sessions over 3 days
Over 5,000 SQL Professionals
Evening Networking Activities
Discount Code: SSDISODNS
About Me
• Senior Business Intelligence Consultant with
DesignMind
• Undergoing a metamorphosis (somewhat
Kafkaesque) into a Cloud Data Engineer
• Always had a passion for art, design, and clean
engineering (I own a Dyson vacuum) and those
passions have stuck with me over the years
• I returned from a trip to Dresden, Germany,
Prague, and Venice this week
• Undergoing my Accelerated Freefall training to
become a certified skydiver
https://www.linkedin.com/in/ericbragas93/ @ericbragas
eric@designmind.com
Overview
This session IS
• A discussion of the awesome tools available in Azure for batch processing
data
• A comparison of ETL and ELT (or LETS)
• PaaS first!
This session IS NOT
• A technical deep dive
• A discussion about migrations
• For the faint of heart ;)
Overview (cont’d)
• Background in architecting and implementing SQL Server Data
Warehouses
• Experience with lift-and-shift, hybrid IaaS and PaaS warehouses, and
brand new implementations using just PaaS
PaaS vs. IaaS
Benefits of PaaS
• No server to maintain!
• Literally just data and configurations
• A lot less room for user error
• Ridiculous reliability
• Developers, develop
• Elasticity of all services, including on an as-needed basis
• U-SQL AUs
• Data Factory parallelism
• SQL Database scaling (kills connections)
PaaS vs. IaaS (cont’d)
Benefits of PaaS Development Process
• Wide variety of tools, both visual and via the API
• Azure Portal makes the dev-test cycle very fast
• It is also web-based, which makes working from anywhere really easy
• Visual Studio and VS Code extensions for development and tuning
• Excellent for integrating with source control
• And a bunch more!
The effectiveness of a solution is largely influenced by the effectiveness
of the team
Azure Data Factory
• "[Azure Data Factory] is a cloud-based data integration service
that allows you to create data-driven workflows in the cloud
that orchestrate and automate data movement and data transformation."
• Version 1 – service for batch
processing of time series data
• Version 2 – a general purpose
data processing and workflow
orchestration tool
Demo
Azure Data Factory
What if we need more?
Loading directly to SQL and transforming using SQL can be a good
option for smaller datasets where you don’t expect much evolution.
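The load-then-transform (ELT) approach described here can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for SQL Database, and the table names (stg_orders, customer_totals) and sample rows are hypothetical, not taken from the session.

```python
import sqlite3

# In-memory SQLite stands in for the target SQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT);
    CREATE TABLE customer_totals (
        customer TEXT PRIMARY KEY,
        order_count INTEGER,
        total_amount REAL
    );
""")

# Load: land the raw rows untouched, everything as text.
raw_rows = [
    ("1", " alice ", "19.99"),
    ("2", "BOB", "5.00"),
    ("3", "Alice", "10.01"),
]
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw_rows)

# Transform: clean and aggregate inside the database using SQL.
conn.execute("""
    INSERT INTO customer_totals (customer, order_count, total_amount)
    SELECT lower(trim(customer)),
           COUNT(*),
           ROUND(SUM(CAST(amount AS REAL)), 2)
    FROM stg_orders
    GROUP BY lower(trim(customer))
""")
conn.commit()

totals = conn.execute(
    "SELECT customer, order_count, total_amount "
    "FROM customer_totals ORDER BY customer"
).fetchall()
```

The transformation logic lives entirely in the SQL statement, which is what keeps this pattern simple for small, stable datasets.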
What if you want more flexibility to add larger or more varied
datasets? Or you need a warehouse, but the business doesn’t know
exactly what they need until they see it?
Enter, the Data Lake!
Azure Data Lake
Two components:
• Data Lake Store – a distributed file store that
enables massively parallel read/write on data by a
number of services, e.g. ADF, ADLA, HDInsight, ADW,
etc.
• Data Lake Analytics – a data processing engine that
leverages the hybrid SQL and C# language called
U-SQL to perform massively parallel processing of data.
Pay only for what you use.
Note: ADLA is not an ad hoc query engine. It is a batch
processing engine that takes file inputs and produces
file outputs.
What is a Data Lake?
• Place to load all your raw data into a folder framework
• Important to maintain order
• Schema-on-read queries to process data as needed
• Unstructured, semi-structured, and structured data
• Batch data processing at scale to feed your data marts
• Extensible query language
• Utilize as hub for analytics
• ADW, ADLA, ML, etc.
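The folder framework and schema-on-read ideas above can be sketched as follows. This is a minimal illustration, not the approach shown in the demo: the /raw/source/dataset/yyyy/mm/dd layout, the helper names (raw_path, read_with_schema), and the sample orders data are all assumptions made for the example.

```python
import csv
import io
from datetime import date, datetime


def raw_path(source: str, dataset: str, load_date: date, filename: str) -> str:
    """Build a predictable, date-partitioned path for a raw file (hypothetical layout)."""
    return (
        f"/raw/{source}/{dataset}/"
        f"{load_date:%Y}/{load_date:%m}/{load_date:%d}/{filename}"
    )


# The schema is declared by the reader; the stored file stays untyped text.
ORDERS_SCHEMA = {
    "order_id": int,
    "order_date": lambda s: datetime.strptime(s, "%Y-%m-%d").date(),
    "amount": float,
}


def read_with_schema(text_stream, schema):
    """Yield typed rows, keeping only the columns named in the schema."""
    for row in csv.DictReader(text_stream):
        yield {col: cast(row[col]) for col, cast in schema.items()}


# A raw file as it might land in the lake, including a column this query ignores.
raw_file = io.StringIO(
    "order_id,order_date,amount,free_text\n"
    "1,2018-06-01,19.99,first order\n"
    "2,2018-06-02,5.00,\n"
)

rows = list(read_with_schema(raw_file, ORDERS_SCHEMA))
path = raw_path("webshop", "orders", date(2018, 6, 2), "orders.csv")
```

A different consumer could read the same file with a different schema dictionary, which is the point of deferring the schema to read time.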
What are the Benefits?
• Load data without first defining or being locked into a particular
schema
• Explore the data before deciding what schema to impose and
how to process it for your downstream analytics
• Alleviates a major challenge with starting a DW project
• Faster time-to-value (less time deliberating, more time iterating)
• Feed multiple downstream systems from the same system
• Enable a variety of user types to interact with data at the level they
need
• Data Scientists on raw data; Analysts on Data Marts
Data Lake as a Hub
Demo
Azure Data Lake Store and Analytics
SQL Database
• Cloud managed database
service; similar but not the same
as SQL Server
• Use as the presentation or
semantic layer for your data
warehouse
• Fast ad hoc queries and many
concurrent connections
• Supports clustered indexes,
memory-optimized tables, etc.
Creating Data Marts
• Pre-process data incrementally using Data Lake Analytics, and stage in
Data Lake Store
• Copy to SQL Database table using pre-copy script and the copy
activity
• More advanced requirements can be serviced by a stored procedure sink
in the copy activity (the sqlWriterStoredProcedureName property)
• Maintain metadata for incremental loading within the same database
• Track what was loaded last, then load the difference using lookup activity
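The "track what was loaded last, then load the difference" pattern can be sketched like this. It is an illustration only, not Data Factory code: sqlite3 stands in for SQL Database, a Python list stands in for the files staged in Data Lake Store, and the table names (load_watermark, fact_sales) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the target SQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE load_watermark (table_name TEXT PRIMARY KEY, last_loaded TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, modified_at TEXT, amount REAL);
    INSERT INTO load_watermark VALUES ('fact_sales', '2018-06-01');
    INSERT INTO fact_sales VALUES (1, '2018-06-01', 10.0);
""")

# Stand-in for pre-processed rows staged in the lake: (sale_id, modified_at, amount).
staged = [
    (1, "2018-06-01", 10.0),
    (2, "2018-06-02", 20.0),
    (3, "2018-06-03", 30.0),
]


def incremental_load(conn, staged_rows):
    """Load only rows newer than the stored watermark, then advance it."""
    # Lookup step: read what was loaded last.
    (watermark,) = conn.execute(
        "SELECT last_loaded FROM load_watermark WHERE table_name = 'fact_sales'"
    ).fetchone()

    # ISO dates compare correctly as strings.
    delta = [row for row in staged_rows if row[1] > watermark]
    if not delta:
        return 0

    new_watermark = max(row[1] for row in delta)
    with conn:  # single transaction: all steps commit together or not at all
        # Pre-copy script: clear the slice about to be loaded so reruns are safe.
        conn.execute("DELETE FROM fact_sales WHERE modified_at > ?", (watermark,))
        # Copy step.
        conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", delta)
        # Metadata update kept in the same database as the data.
        conn.execute(
            "UPDATE load_watermark SET last_loaded = ? WHERE table_name = 'fact_sales'",
            (new_watermark,),
        )
    return len(delta)


loaded_first = incremental_load(conn, staged)
loaded_again = incremental_load(conn, staged)
```

Keeping the watermark in the same database as the target table lets the data load and the metadata update share one transaction, so a failed run does not advance the watermark.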
Demo
Run Pipeline and Query SQL Database
Q&A
Thanks!
