Cross-organization big data collaboration
Retail
Sales, inventory, demographics data for demand forecasting and price optimization
Finance
Financial market data for quantitative analytics
Utilities
Utility data for research on conservation, alternative energies
Farming
Field sensors, crop yields, weather data for smart agriculture
Automotive
Connected car IOT data for personalized experiences and failure analysis
Healthcare
Patient data for research and prevention
Government
Traffic, crime data for planning and justice
Education
Healthcare, student data for research
How data is shared today
Data provider extracts data, then:
• Sends via email or USB
• Copies to FTP server
• APIs or web app
Data consumer #1, Data consumer #2, Data consumer #3
Difficult to manage and track, and not suitable for big data
Azure Data Share
Secure and controlled
Manage what data is shared and with whom. No exchange of credentials between provider and consumer
Flexible
Share by snapshot or in-place, from and to different Azure data stores
Easily share data
Code-free data sharing with just a few clicks. No infrastructure to set up
Enhance analytics
Use the power of Azure analytics tools to enhance insights with shared data
Easily share data
Simple
• Share data cross tenant with a few clicks
• Intuitive user experience
• No infrastructure to set up and no code to write
Productive
• Focus on data, not infrastructure
• Schedule automated incremental updates with
granular control (hourly, daily)
• Automate sharing through REST API
Designed for big data
• Scales to handle big datasets
• Share terabytes of data in a single share with
multiple recipients
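As an illustration of the "Automate sharing through REST API" bullet, here is a minimal sketch that only builds the request for creating a share; it does not send anything or handle authentication. The URL shape, the `api-version` value, and the `shareKind` field are assumptions based on Azure Resource Manager conventions for the `Microsoft.DataShare` provider and should be checked against the current REST reference. The subscription ID, resource group, account, and share names are hypothetical placeholders.

```python
import json
from urllib.parse import quote

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2019-11-01"  # assumption: verify against the REST reference


def build_create_share_request(subscription_id, resource_group, account, share,
                               description, terms, share_kind="CopyBased"):
    """Return (method, url, body) for creating a share; nothing is sent."""
    path = (
        f"/subscriptions/{quote(subscription_id)}"
        f"/resourceGroups/{quote(resource_group)}"
        f"/providers/Microsoft.DataShare/accounts/{quote(account)}"
        f"/shares/{quote(share)}"
    )
    url = f"{ARM_ENDPOINT}{path}?api-version={API_VERSION}"
    # Assumed body shape: share metadata sits under "properties".
    body = {
        "properties": {
            "description": description,
            "terms": terms,
            "shareKind": share_kind,
        }
    }
    return "PUT", url, json.dumps(body)


if __name__ == "__main__":
    # All identifiers below are hypothetical.
    method, url, body = build_create_share_request(
        "00000000-0000-0000-0000-000000000000",
        "contoso-rg",
        "contoso-share-account",
        "sales",
        "Daily sales data",
        "Internal use only",
    )
    print(method, url)
    print(body)
```

Keeping request construction separate from sending makes it easy to review the payload before wiring in a token and an HTTP client.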
Flexible
Snapshot and in-place
• Snapshot-based sharing for batch processing
• In-place sharing for real time access
Different Azure data stores
• Blob storage, ADLS Gen1 and Gen2, Azure SQL DB, SQL
DW, Azure Data Explorer
• Heterogeneous source and target (e.g. table to file)
Various data formats
• Share both structured and unstructured data
• File systems, folders, files
• Containers, blobs
• Databases, tables, views
Secure and controlled
Manage
• Manage all data sharing relationships in one place
• Visibility into what data is shared, who it is shared with
and when data is sent
Control
• Set terms of use that the data consumer must accept to receive
data
• Revoke access and stop sharing at any time
• Logs and metrics to track sharing activities
Secure
• Leverages underlying Azure security measures to help
protect data
• AAD-based authentication
• Data is encrypted in transit; metadata is encrypted at rest
and in transit
• No exchange of credentials between data provider and
consumer
Expand analytics
Enrich
• Enhance insights in the modern
data warehouse with data from
partners and customers
Collaborate
• Form an industry-specific consortium
to pool data among members
Innovate
• Integrate into custom solutions;
expand market via new service
capabilities
[Diagram: modern data warehouse stages (Ingest & Prep, Store, Model & Serve, Visualize) built from Azure Data Factory, Azure Databricks (Data Prep), Azure Data Lake Storage, Azure Synapse Analytics, and Power BI. Company A, Company B, and Company C each connect through Azure Data Share.]
How data share works
[Diagram: the data provider (source store) sends an invitation; the data consumer accepts the share (target store). Data is then delivered as a snapshot or accessed in place. Sharing works intra-tenant and cross-tenant.]
Starting with Blob, ADLS, Azure SQL DB, Azure Synapse Analytics, and Azure Data Explorer
Supported Azure data stores
Source (rows) and target (columns)
Targets: Blob Storage | ADLS Gen1 | ADLS Gen2 | Azure SQL DB | Azure Synapse Analytics | Azure Data Explorer
Blob Storage: Snapshot, Snapshot
ADLS Gen1: Snapshot, Snapshot
ADLS Gen2: Snapshot, Snapshot
Azure SQL DB: Snapshot, Snapshot, Snapshot, Snapshot
Azure Synapse Analytics dedicated SQL pool: Snapshot, Snapshot, Snapshot, Snapshot
Azure Data Explorer: In-place
Accelerate innovation via open ecosystem
“Our decision to integrate Azure Data Share with Finastra’s FusionFabric.cloud
platform is now a great way to further accelerate innovation via an expanded
open ecosystem.”
Eli Rosner, Chief Product and Technology Officer, Finastra
Finastra, one of the world's leading FinTechs, is fully integrating Azure Data Share
with their open platform, FusionFabric.cloud, to enable seamless distribution of
premium datasets to a wider ecosystem of application developers across the
FinTech value chain.
Azure Data Share significantly reduces the go-to-market timeframe and unlocks
net new revenue potential for Finastra.
Streamline buy-side data analysis
“Our clients love that ability to easily, seamlessly, and securely connect to their
data and then build their own custom reports and analytics. And near real-time
sharing with Azure Data Explorer and Azure Data Share permits cross-
organizational data collaboration without compromising data security.”
Paul Stirpe, Chief Technology Officer, Financial Fabric
Improve analytics agility across the company
“We are currently managing 9 trillion rows of data with a total original size of
938TB in Azure Data Explorer. With many different efforts going on within DTNA,
the challenge has been making the data available to each group without making
multiple copies or extracts. For these large sensitive datasets, this would not only
increase cost, but risk losing control of the data.
Azure Data Share has solved the problem for us, allowing us to maintain one
instance, our single source of truth, then add decoupled compute clusters
provisioned to just the data needed by each group. This avoids over sharing as
well as competition for resources from other users accessing the same data.”
Sammi Li, Data Analyst, Daimler Trucks North America
(DTNA)
Next steps
1. Visit the Azure Data Share product page
2. Access documentation, quick starts, and tutorials
3. Get started with Azure Data Share
Service limits
Azure subscription limits and quotas - Azure Resource Manager | Microsoft Docs
Resource Limit
Maximum number of Data Share resources per Azure subscription 100
Maximum number of sent shares per Data Share resource 200
Maximum number of received shares per Data Share resource 100
Maximum number of invitations per sent share 200
Maximum number of share subscriptions per sent share 200
Maximum number of datasets per share 200
Maximum number of snapshot schedules per share 1
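A planned share can be checked against the limits in the table before it is created. The following is a minimal sketch: the limit values are taken from the table, while the `PlannedShare` type, its field names, and `check_share` are hypothetical helpers introduced only for illustration. Service limits can change, so the numbers should be confirmed against the current documentation.

```python
from dataclasses import dataclass

# Per-share limits as listed in the table above.
LIMITS = {
    "datasets": 200,
    "invitations": 200,
    "snapshot schedules": 1,
}


@dataclass
class PlannedShare:
    """Hypothetical description of a share someone intends to create."""
    name: str
    datasets: int
    invitations: int
    snapshot_schedules: int = 1


def check_share(plan):
    """Return a list of human-readable limit violations (empty if none)."""
    values = {
        "datasets": plan.datasets,
        "invitations": plan.invitations,
        "snapshot schedules": plan.snapshot_schedules,
    }
    problems = []
    for label, value in values.items():
        limit = LIMITS[label]
        if value > limit:
            problems.append(f"{plan.name}: {value} {label} exceeds the limit of {limit}")
    return problems


if __name__ == "__main__":
    for problem in check_share(PlannedShare("sales", datasets=250, invitations=12)):
        print(problem)
```

A share that exceeds a per-share limit can usually be split into several shares within the same Data Share resource, subject to the per-resource limits in the same table.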
Role-based Access Control (RBAC)
Who can create a Data Share resource?
Owner or Contributor of an Azure subscription
Who can access a Data Share resource?
Owner, Contributor or Reader of the Data Share resource
• Owner or Contributor can manage any share within a Data Share resource
• Reader can view shares within a Data Share resource
Who can share data from a source Azure data store?
User/service principal with Add Role Assignment and Write permission (e.g. Owner of storage account)
Who can receive data into a target Azure data store?
User/service principal with Add Role Assignment and Write permission (e.g. Owner of storage account)
Reference: https://docs.microsoft.com/azure/data-share/concepts-roles-permissions
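The role rules above can be summarized as a small lookup. This is a sketch of the rules as stated on this slide, not of how Azure evaluates role assignments; the action and permission labels (`view_shares`, `manage_shares`, `add_role_assignment`, `write`) are hypothetical names chosen for illustration.

```python
# What each role on the Data Share resource may do, per the slide.
RESOURCE_ROLE_ACTIONS = {
    "Owner": {"view_shares", "manage_shares"},
    "Contributor": {"view_shares", "manage_shares"},
    "Reader": {"view_shares"},
}

# Sharing from, or receiving into, a data store needs both permissions.
REQUIRED_STORE_PERMISSIONS = {"add_role_assignment", "write"}


def can_on_resource(role, action):
    """True if the given role allows the action on a Data Share resource."""
    return action in RESOURCE_ROLE_ACTIONS.get(role, set())


def can_use_data_store(permissions):
    """True if the permissions cover both requirements for a source or target store."""
    return REQUIRED_STORE_PERMISSIONS <= set(permissions)


if __name__ == "__main__":
    print(can_on_resource("Reader", "manage_shares"))         # Reader can only view
    print(can_use_data_store({"write"}))                      # missing role assignment
    print(can_use_data_store({"write", "add_role_assignment"}))
```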
Control flow and data flow
[Diagram: control flow runs between the Data Share resource in the data provider's Azure subscription, the Azure Data Share service, and the Data Share resource in the data consumer's Azure subscription. Data flow runs from the source data in the provider's Azure data store to the received data in the consumer's Azure data store.]
Data provider
• Log in to Azure and create a Data Share resource
• Create share, add source datasets, schedule and recipients
• Monitor invitation and snapshot status
Azure Data Share service
• Send email invite to data consumer admin
• Copy data per snapshot schedule
Data consumer
• View email
• Click on the email link to log in to Azure, create or select a Data Share resource, accept the invitation and configure the target data store
• Monitor snapshot status
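The flow above (create a share, invite recipients, accept, copy on snapshot) can be sketched as a toy in-memory model. This is only an illustration of the sequence of steps; the `Share` class, its methods, and the dictionaries standing in for data stores are hypothetical and do not reflect the service's real object model or API.

```python
import copy


class Share:
    """Toy model of a snapshot-based share."""

    def __init__(self, name, datasets, recipients):
        self.name = name
        self.datasets = list(datasets)
        # Each recipient starts with a pending invitation (the email invite step).
        self.invitations = {recipient: "Pending" for recipient in recipients}
        # recipient -> dict standing in for that consumer's target data store
        self.targets = {}

    def accept(self, recipient, target_store):
        """Consumer accepts the invitation and configures a target store."""
        if recipient not in self.invitations:
            raise KeyError(f"{recipient} was not invited to {self.name}")
        self.invitations[recipient] = "Accepted"
        self.targets[recipient] = target_store

    def run_snapshot(self, source_store):
        """Copy the shared datasets into every accepted recipient's target store."""
        copied = {}
        for recipient, target in self.targets.items():
            for dataset in self.datasets:
                target[dataset] = copy.deepcopy(source_store[dataset])
            copied[recipient] = len(self.datasets)
        return copied


if __name__ == "__main__":
    source = {"product": [1, 2, 3], "internal": [9]}
    share = Share("sales", ["product"], ["sally@fabrikam.com"])
    sally_store = {}
    share.accept("sally@fabrikam.com", sally_store)
    print(share.run_snapshot(source))
    print(sally_store)
```

Only the datasets added to the share are copied; anything else in the source store (here, `internal`) never reaches the consumer.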
Snapshot-based sharing user experience
John sets up Data Share Account
Creates new share
Views list of sent shares
Views details of sent share
Monitors usage of shared data
Views and accepts invitation
Configures received share
Monitors snapshots
Sally receives invitation
Overview
[Screenshots of the walkthrough steps listed above. Values visible in the screenshots include the dataset path storageaccount/product, the provider John@contoso.com (Contoso), the recipient Sally@fabrikam.com (Fabrikam), and a terms-of-use field reading "This is terms of use".]
Editor's notes
In a world where data volume, variety and type are exponentially growing, organizations need to collaborate using large datasets. In many cases data is at its most powerful when it can be shared and combined with data that resides outside organizational boundaries with business partners and third parties.
Microsoft is investing in data sharing because customers across industries are looking to share data with their partners and customers. We have customers approaching us from the retail, automotive, utilities, farming, finance, healthcare, education and government sectors. A typical scenario is to share data with partners or customers so that they can combine it with their own data, or with other data from third parties, and run analytics to derive insights.
For customers, sharing this data in a simple and governed way is challenging. Common data sharing approaches using FTP or web APIs are complex and require infrastructure to manage and knowledge of code. These tools do not provide the security or governance required to meet enterprise standards, and they are not suitable for sharing large datasets. To enable enterprise collaboration we are unveiling Azure Data Share, a new data management service for sharing big data across external organizations in Azure.
The majority of our customers are looking to share time series data, which is updated on a regular basis (e.g. new files are generated daily). Today, FTP, secure FTP, APIs and web apps are the most popular ways of sharing data. However, they require setup and maintenance. Some customers share data through email, USB sticks or tapes, which are not trackable and not efficient for ongoing data sharing. None of these technologies is suitable for sharing large amounts of data.
Heterogeneous support: for example, a data provider may share data in ADLS, but the consumer may opt to receive it in Azure Blob Storage.
Collaborate with large datasets
Combine existing data with shared data to enrich analytics use cases for deeper insights
Enhance insights in the modern data warehouse
Azure storage integrates with other Azure analytics services for preparing, processing, and analyzing data
In many cases data is at its most powerful when it can be shared and combined with data that resides outside organizational boundaries with business partners and third parties. Azure Data Share enables enterprise collaboration across organizational boundaries.
Cross-organization big data analytics. In all industries we have seen a need for data sharing between partners. For example, retailers share sales data with consumer goods suppliers, who then use the data for demand forecasting. In the automotive space, we have seen car OEMs sharing IOT data with service providers. Oil and gas companies are sharing data with equipment and infrastructure providers. In precision agriculture, service providers deploy sensors in the field and share soil data with farmers to inform watering and fertilizing decisions. In the financial industry, index data and transaction data are shared with financial institutions and hedge funds, and sometimes monetized. In the health care and education sectors, anonymous patient data is shared to research cures for diseases. Governments are sharing data between agencies and with commercial companies.
Analytics outsourcing is another scenario we have heard from our customers. Some customers do not have the expertise or the other datasets required to analyze the data and derive insights, so they outsource the work to a service provider, who analyzes the data and provides results back. This results in two data sharing relationships: one from the data owner to the service provider, and one from the service provider back to the data owner.
Another scenario we have heard from our customers is the industry-specific data consortium. For example, in the healthcare and education sectors, data is shared with the members of the consortium to conduct research on disease. The data can also potentially be sold to pharmaceutical companies for a fee.
Data marketplace is a scenario we have heard from a number of customers. These companies create their own marketplace storefront, where their customers discover the datasets. Once a purchase is made, the Data Share service is used to automate data distribution and tracking. In this case, the company that owns the marketplace uses the Data Share API for bulk data sharing.
Cross-industry data sharing through the supply chain
One example is IOT data collected by the service provider. Another example is transaction data in the financial industry collected by the bank.
e.g. Patient data
The Data Share resource is the resource you create in Azure. This is where metadata about the resource is located.
Snapshot execution is where the compute resource that copies data from the source storage account to the target storage account is located.
The Data Share resource and the storage accounts do not need to be colocated in the same data center. For example, the Data Share resource can be located in East US 2 while the storage account is in West US.