The BI Sandbox
Madison, Wisconsin Area
Business Intelligence & Data Warehousing
Discussion Group
BI architecture at a glance …
[Diagram. Labels: Legacy Source Systems; New Source Systems; Data Acquisition Layer; Triage; Production ETL; batch; transaction; XML Message; Operational Data Layer; Conformed Storage Area; Operational Data Stores; Analytic Data Layer; Data Marts; Analysis Sandboxes; Other Sources: operational systems, user-supplied data; Manual Loads]
BI architecture at a glance …
[Diagram. Layers: Operational Data Layer; Analytic Data Layer. Components: Conformed Storage Area; Operational Data Stores; Data Marts; Analysis Sandboxes. Callouts:]
• Consolidated data feeds (legacy & new) to downstream systems
• Near real-time data feeds of new systems’ data
• Standardized reporting, ad hoc reporting and analysis, data mining, predictive models
• Standardized reporting
What do you think of when you hear
“sandbox”?
Sandboxes are places to play where
• The sand and box are provided
• You bring your own toys
• What you create is temporary
Obviously some of us are more talented
with sandboxes than others…
Which is the best analogy for a BI environment?
• Assembly Line
• A Predictive Model Test Bed
• A Library
• An Artist’s Studio
• An Information Goldmine
sandbox noun /'san(d) , bäks/
The BI Sandbox, defined
Responsibilities
• To facilitate short-term ad hoc exploratory analysis.
• To remove roadblocks to client self-service (minimizing the need for I/S assistance) with short-term ad hoc exploratory analysis.
• To avoid the creation of unmanaged spreadsheet-based data on user desktops or shared network drives.
• To better enable short-term ad hoc exploratory analysis to be converted to long-term operational analysis as needed (through traceability).
Collaborators
Semantic Layer, Operational Data Layer (ODL), Analytic Data Layer (ADL)
Rationale
Typically, reporting and analysis are ongoing and consistent, and can be enabled by production structures such as ODSs and data marts.
Occasionally, business requirements indicate a need for temporary or ad hoc exploratory data analysis that cannot be supported by existing data structures. These business requirements often result in unmanaged, disparate spreadsheet data on individual user desktops or shared network drives.
Sandboxes are meant to mitigate two risks: that these ad hoc data sets are created through inconsistent techniques, and that analytical results discovered by using them are hard to trace and convert to a more permanent process. Such a conversion typically requires a complex project to turn the untraceable data set, integration, and analytical rules into repeatable rules.
The BI Sandbox, defined
Issues and Notes
• Sandbox data sets will be short-lived.
• The sandbox will support ad hoc analysis.
• Sandbox data sets will be intended for a specific purpose.
• Reporting generated from the sandbox will not be considered “official”.
• Sandbox data sets should be transitional.
• Sandboxes, if they cannot be decommissioned, should be transitioned into production structures (e.g., ODSs or data marts).
• Sandbox data set structure/format will be dependent on access tools.
• Sandbox data set composition and quality will be dependent on the source.
• Sandbox check-out (data validation) strategy will be the responsibility of the end user.
• Sandbox data sets should require minimal I/S intervention.
• Sandbox data can come from external or user-supplied sources.
• Data acquisition from operational systems is restricted.
• Sandbox data will not be automatically refreshed on a regular basis.
• Naming standards do not apply to sandbox structures.
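Several of these notes (short-lived, single purpose, no automated refresh, decommission or transition) could be tracked with a small registry entry per sandbox data set. The sketch below is illustrative only: the `SandboxDataSet` class, its field names, the sample values, and the 90-day default lifetime are assumptions and do not come from the deck.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SandboxDataSet:
    """Hypothetical registry entry for one sandbox data set."""
    name: str
    owner: str            # the end user, who also owns data validation
    purpose: str          # each data set is intended for a specific purpose
    source: str           # composition and quality depend on the source
    loaded_on: date       # date of the manual load; there is no automated refresh
    lifetime_days: int = 90   # assumed default lifetime, not from the slides

    @property
    def expires_on(self) -> date:
        return self.loaded_on + timedelta(days=self.lifetime_days)

    def status(self, today: date) -> str:
        """Return 'active' until expiry, then flag the set for a decision."""
        if today <= self.expires_on:
            return "active"
        return "decommission or transition to production"


ds = SandboxDataSet(
    name="claims_lapse_explore",
    owner="analyst_a",
    purpose="explore lapse drivers",
    source="user-supplied spreadsheet",
    loaded_on=date(2024, 1, 15),
)
print(ds.expires_on, ds.status(date(2024, 6, 1)))
```

Keeping the expiry as a derived property rather than a stored field means a re-load only has to update `loaded_on`.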
The BI Sandbox, the real why
• Shed light on data integration work clients do
whether I/S wishes to acknowledge it or not
• Increase partnership between I/S and business
– I/S has an appropriate solution to offer for more real
problems
• Most innovation doesn’t happen in well-defined
structures
The BI Sandbox, the how
Provide a place to play
• Typically SAS storage
Bring your own toys
• Manual loads of data from various sources including
• Data marts
• ODSs
• Operational systems
• User-supplied data sets
Create & Learn
• Use analysis tools (Business Objects, SAS, Excel) to
explore the data and discover
Transfer what you learn elsewhere
• Convert discoveries into operational changes to build value
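The "bring your own toys" step can be sketched as a manual load that also records where the data came from, which supports the traceability goal stated earlier. This is a minimal illustration, assuming an in-memory SQLite database as a stand-in for the SAS storage the deck mentions; the `manual_load` function, the `load_log` table, and the sample data are hypothetical.

```python
import csv
import io
import sqlite3


def manual_load(conn, table, csv_text, source):
    """Load CSV text into a sandbox table as-is and log where it came from."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)   # source column names kept unrenamed
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)
    # Record provenance so later findings can be traced back to their source.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_log (table_name, source, row_count)"
    )
    conn.execute(
        "INSERT INTO load_log VALUES (?, ?, ?)", (table, source, len(data))
    )
    return len(data)


conn = sqlite3.connect(":memory:")   # assumed stand-in for sandbox storage
n = manual_load(
    conn,
    "agent_sales",
    "agent_id,sales\nA1,100\nA2,250\n",
    "user-supplied spreadsheet",
)
log = conn.execute("SELECT * FROM load_log").fetchall()
print(n, log)
```

The load performs no cleansing or typing, so every value arrives as text, which matches the sandbox's hands-off intent.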
The BI Sandbox, the limitations
• Joins between disparate sources on natural keys
alone
– Operational system keys
– Functional keys
• No cleansing, no column renaming, minimal
metadata, no data modeling
• No automated refresh process
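The first limitation can be illustrated with a join that uses a natural key exactly as it arrives. The sketch below is hypothetical (the `join_on_natural_key` helper, the `policy_no` key, and the sample rows are assumptions); it shows how, with no cleansing, a key that differs only in letter case fails to match.

```python
def join_on_natural_key(left, right, key):
    """Inner-join two lists of dict rows on a shared natural key, as-is."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined, unmatched = [], []
    for row in left:
        matches = index.get(row[key])
        if not matches:
            unmatched.append(row[key])   # no surrogate key to fall back on
            continue
        for match in matches:
            joined.append({**match, **row})
    return joined, unmatched


policies = [
    {"policy_no": "P-1001", "premium": 1200},
    {"policy_no": "P-1002", "premium": 950},
    {"policy_no": "p-1003", "premium": 700},   # lower-case key, left uncleansed
]
claims = [
    {"policy_no": "P-1001", "claim_amt": 300},
    {"policy_no": "P-1003", "claim_amt": 4500},
]

joined, unmatched = join_on_natural_key(policies, claims, "policy_no")
print(joined)
print(unmatched)
```

Returning the unmatched keys alongside the joined rows gives the analyst a quick view of how much data the uncleansed join dropped.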
The BI Sandbox, the examples
• Prototyping a new enterprise measure
• Experimenting with integration of disparate data
sources
• Predictive model creation, testing & validation
(in parallel with production development)
Discussion
