SUMMIT BERLIN
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
The Scout24 Data Platform
A Technical Deep Dive
Sean Gustafson
Senior Technical Product Manager
Scout24
Raffael Dzikowski
Senior Data Engineer
Scout24
5 Core Geographies and an overall presence in 18 countries
80m Household Reach
2 Major Household Brand Names
Scout24 AG
• SDAX
• €489 million revenue (2017)
• ~1,500 employees
Our technical evolution
[Diagram: from a monolith app with a production database and a data warehouse to many microservices, each with its own persistence, alongside the data warehouse.]
Our data warehouse was a bottleneck
Scout24 wants to become a truly data-driven company
Fast & easy data-driven
product development…
…supported by
Data & Analytics
Everywhere in the company... ...without bloating up Data &
Analytics
Our solution:
Build an internal “platform” for data
We think of our Data Platform as a Product
Just like AWS, Salesforce, etc. – the platform is a generic layer upon which
Scout24’s products can be built
BUT, we have a very, very small number of customers.
That means, product teams get personalized support and there is lots of
opportunity for collaboration.
We don’t dictate anything.
We just try to make certain things easier by offering a “paved path”.
Product teams are fully empowered to make their own choices about what is the best use of their resources.
“In almost all cases, we will not mandate that internal teams use these platforms and services; these platform teams will have to win over and satisfy their internal customers, even competing with external vendors.”
Guiding principle of the platform
Autonomy for producers and consumers
Self-service Analytics
Self-service Data Ingestion
Self-service ETL
[Diagram: Data Producers feed the Central Data Lake on Amazon S3 through Self-service Data Ingestion; Self-service ETL and Self-service Analytics sit on top of the lake and serve Data Scientists, Analysts, PMs, Leaders, and Engineers.]
Our Approach
Multi-Account Setting
Data Lake Bucket Types
Data Lake Access Roles
Data Lake Account
ImmobilienScout24
Data Lake Account
AutoScout24
Regular DL Bucket Access Role
Restricted DL Bucket Access Roles
Personal DL Bucket Access Roles
Ingestion Goals
Microservices
Batch Support
Streaming Support
REST API
Scalability
Ingestion Options
Amazon Kinesis Data Firehose: Simple Config
Kafka Connect: Simple Config
Firehose Ingestion Architecture
[Diagram. Components: Data Lake Account, Producer Account, Data Producer, Amazon Kinesis Data Firehose, Producer to Firehose Role, Firehose to Datalake Role, Datalake Bucket. Steps: STS Assume Role (once per role), Send Data, Write Data.]
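On the producer side, the steps in this diagram (assume a role through STS, then send data to Firehose, which writes to the data lake bucket) could look roughly like the sketch below. It is an illustration under assumptions, not Scout24's code: the role ARN, stream name, session name, and the helpers `send_event` and `boto3_firehose_factory` are hypothetical. The clients are passed in so the logic can be tried without AWS access; the calls used (`assume_role`, `put_record`) are the standard boto3 ones.

```python
import json


def send_event(sts_client, make_firehose_client, role_arn, stream_name, event):
    """Assume the producer-to-Firehose role, then send one record.

    sts_client: object exposing boto3's STS `assume_role` method.
    make_firehose_client: callable that takes temporary credentials and
        returns an object exposing boto3's Firehose `put_record` method.
    role_arn, stream_name: hypothetical values, not taken from the talk.
    """
    # Step 1: STS Assume Role (temporary credentials for the cross-account role).
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="data-producer",  # assumed session name
    )
    creds = response["Credentials"]

    # Step 2: Send Data. Firehose then writes to the data lake bucket
    # under its own role; the producer never touches the bucket directly.
    firehose = make_firehose_client(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    payload = (json.dumps(event) + "\n").encode("utf-8")  # newline-delimited JSON
    return firehose.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )


def boto3_firehose_factory(**credentials):
    """Factory for real use; boto3 is imported lazily so the sketch loads without it."""
    import boto3
    return boto3.client("firehose", **credentials)
```

With boto3 installed, a call might look like `send_event(boto3.client("sts"), boto3_firehose_factory, role_arn, stream_name, {"id": 1})`, with `role_arn` and `stream_name` supplied by whoever owns the delivery stream.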
Kafka Connect
[Diagram. Systems: Elasticsearch, Amazon S3, ActiveMQ, Cassandra, Kafka, RDBMS… Steps: the Kafka Connect Cluster reads data from topic(s) and writes data.]
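To make the "read data from topic(s), write data" step concrete, here is a minimal sketch of a connector definition that sinks Kafka topics into S3. It assumes the Confluent S3 sink connector; the talk does not say which connector Scout24 uses. The connector name, topic, bucket, and region are hypothetical. The resulting JSON is the kind of body one would POST to the Kafka Connect REST API at `/connectors`.

```python
import json


def s3_sink_connector(name, topics, bucket, region, flush_size=1000):
    """Build a connector definition for Kafka Connect's REST API.

    Assumes the Confluent S3 sink connector; all argument values used
    below are hypothetical and not taken from the talk.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": ",".join(topics),   # read data from topic(s)
            "s3.bucket.name": bucket,     # write data to this bucket
            "s3.region": region,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            "flush.size": str(flush_size),  # records written before a file is committed
        },
    }


body = s3_sink_connector(
    name="example-s3-sink",
    topics=["example-topic"],
    bucket="example-datalake-bucket",
    region="eu-west-1",
)
print(json.dumps(body, indent=2))
```

Keeping the definition as plain data is what makes a "simple config" approach workable: a producing team only has to fill in topics and a target bucket.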
Scout24 Infinity Cluster
Amazon ECS
Infinity Service
Simple Config
Central Logging to Elasticsearch
Monitoring in Datadog
Managed By Scout24 Cloud Platform Engineering
Related Breakouts
15:00 in Hall 1
To Infinity and Beyond – Handling Heterogeneous Container
Clusters in AWS
Christine Trahe, Platform Engineer @ Scout24
16:00 in Hall 1
Boost your AWS Infrastructure
Philipp Garbe, AWS Container Hero @ Scout24
Kafka Connect on Infinity
Simple Config (Infinity)
Amazon ECS
Infinity Service
Simple Config (Kafka Connect)
Kafka Connect Service
Infinity Service
Kafka Connect Deployment
DataWario – Our Wrapper for AWS Data Pipeline
Built-in Support for Scout24 Ecosystem
Shortens Development Cycles
Only Exposes Configuration Essentials
Introduces Custom Step Types
Automatically Manages Artifacts
[Diagram: Simple Config, DataWario, AWS Data Pipeline]
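The idea of a wrapper that "only exposes configuration essentials" can be illustrated with a small sketch that expands a short step list into AWS Data Pipeline objects. DataWario's real config format and step types are not shown in the talk, so the input schema (`instance_type`, `steps`, `name`, `command`) and the function `expand` are hypothetical. Only the output shape is real: it follows the `pipelineObjects` structure (id, name, key/value fields) accepted by the Data Pipeline PutPipelineDefinition API.

```python
def expand(simple_config):
    """Turn a short, hypothetical step list into Data Pipeline objects.

    Steps run in the order given: each step depends on the previous one.
    """
    # One shared EC2 resource that all activities run on.
    objects = [{
        "id": "Worker",
        "name": "Worker",
        "fields": [
            {"key": "type", "stringValue": "Ec2Resource"},
            {"key": "instanceType", "stringValue": simple_config["instance_type"]},
        ],
    }]
    previous = None
    for step in simple_config["steps"]:
        fields = [
            {"key": "type", "stringValue": "ShellCommandActivity"},
            {"key": "command", "stringValue": step["command"]},
            {"key": "runsOn", "refValue": "Worker"},
        ]
        if previous is not None:
            # Chain the steps so the user never writes dependencies by hand.
            fields.append({"key": "dependsOn", "refValue": previous})
        objects.append({"id": step["name"], "name": step["name"], "fields": fields})
        previous = step["name"]
    return objects


example = {
    "instance_type": "m4.large",  # hypothetical value
    "steps": [
        {"name": "Extract", "command": "echo extract"},
        {"name": "Load", "command": "echo load"},
    ],
}
pipeline_objects = expand(example)
```

The point of the design is the ratio: a few lines of user-facing config become the more verbose object graph that the underlying service expects.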
DataWario Architecture
Library of Common Data Transformations
Query Challenges
What’s Ahead
OneScout Hive Metastore: Unlock the Datalake for Scout24’s Toolset and Users with Different Skillsets
Personal Analytics Cluster: Data Analysis for Various User Groups
Automatic Hive Partition Detection: Provide a Timely and Accurate Update of the Metadata Layer
OneScout Hive Metastore – A Schematic View
Personal Analytics Cluster
Hive Tables and Presto Views
Datalake
OneScout Hive Metastore – Recap of Ecosystem
EMR Metastore Configuration Options
The Scout24 Hive Metastore Proxy – A Motivation
The Personal Analytics Cluster – An Overview
Amazon EMR
Simple Config
Easy Access via Web Interface
Zeppelin and Jupyter Notebook Restore
One-Click Deployment
Managed Scaling and Shutdown
Support for Pre-baked AMIs and Configs
Automated Partition Detection – A Motivation
[Diagram. Components: Datalake, Partitioned Table, Hive Tables and Presto Views, Personal Analytics Cluster. Steps: Data Ingestion, Table Access, Automatic Partition Detection.]
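The motivation here is that newly ingested data in a partitioned table is not visible to queries until the metastore knows about the new partition. A minimal sketch of the core step follows, assuming the detector learns about new S3 object keys somehow (the talk does not show the trigger here) and that data is laid out Hive-style as `column=value/` path segments. Table, bucket, and prefix names are hypothetical, and values are not escaped, so this is not safe for untrusted input.

```python
def partition_from_key(key, table_prefix):
    """Extract Hive-style partition columns from an S3 object key.

    Returns a list of (column, value) pairs, or None if the key does not
    belong to the table or is not laid out as column=value segments.
    """
    if not key.startswith(table_prefix):
        return None
    relative = key[len(table_prefix):].lstrip("/")
    segments = relative.split("/")[:-1]  # drop the file name
    spec = []
    for segment in segments:
        if "=" not in segment:
            return None
        column, value = segment.split("=", 1)
        spec.append((column, value))
    return spec or None


def add_partition_statement(table, bucket, table_prefix, spec):
    """Build the Hive DDL that registers one partition in the metastore."""
    columns = ", ".join(f"{c}='{v}'" for c, v in spec)
    path = "/".join(f"{c}={v}" for c, v in spec)
    location = f"s3://{bucket}/{table_prefix.strip('/')}/{path}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({columns}) LOCATION '{location}'"
    )


spec = partition_from_key(
    "events/listing_views/year=2019/month=02/part-0000.json",
    "events/listing_views/",
)
statement = add_partition_statement(
    "listing_views", "example-datalake-bucket", "events/listing_views/", spec
)
print(statement)
```

`ADD IF NOT EXISTS` keeps the statement idempotent, which matters when many files land in the same partition and each one triggers the detector.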
Partition Detection Architecture
Build our own vs. AWS managed services
Metastore → Glue
Presto → Athena
DataWario → Glue, Step Functions, Lambda, …
Personal Analytics Cluster → Glue notebooks, SageMaker
We hope to throw out most of the custom components we built.
Our data warehouse was a bottleneck
Our data platform holds nothing back
Thank you!
Raffael Dzikowski
Senior Data Engineer
Scout24
Sean Gustafson
Senior Technical Product Manager
Scout24
Extra slides
Data Warehouse vs. Data Platform
Data Warehouse | Data Platform
Centralized | Federated
Control | Autonomy
Perfection | Scale
Pull | Push
Product is Data | Product is Platform
Reporting | Reporting, Advanced Analytics, Machine Learning, etc.
[Architecture diagram. Sources: Core DB, apps, CRM, microservices (Data Producers), REST API / Firehose. Storage: Central Data Lake on S3, Metastore. Processing and query: DataWario, Spark, Presto, SQL, Personal Analytics Clusters (Jupyter, Zeppelin). Tools: MicroStrategy, Alation Data Catalog. Users: Data Scientists, Analysts, PMs, Leaders, Engineers.]
Our Journey to Presto
Amazon Athena
Presto on EMR
Cross Account Support (OneScout Hive Metastore)
Leverages Datalake Access Roles (EMRFS)
Scheduled Scaling Configurations
Fits our GDPR Concept (multiple isolated Clusters)
SCOUT24
DATA LANDSCAPE
MANIFESTO
ROLES, RESPONSIBILITIES, AND VALUES
FOR A DATA-DRIVEN COMPANY AT SCALE
#1 Preamble
Data is a key asset of our company.
#2 Our Responsibility
We, Data & Analytics, are responsible for providing a solid Data Platform as well as clear guidelines and training on how to participate in the Data Landscape.
[Diagram: Data Platform, DnA, Data Landscape]
#3 Data Autonomy, Not Anarchy
Data autonomy puts data producers & data consumers in control of their data & of their metrics and thereby allows us to be data-driven at scale, but this comes with responsibility.
[Diagram: Data Platform, Data, Producer, Consumer, DnA, Data Landscape]
#4 Producer’s Responsibility
Data producers are responsible for publishing data to the central Data Lake, for the data's quality, and for publishing metadata that makes it easy to find and consume the data.
[Diagram: Data Platform, Metadata, Data, Producer, DnA, Data Landscape]
#5 Consumer’s Responsibility
Data consumers are responsible for the definition & visualization of metrics and for driving the implementation and maintenance of these metrics.
[Diagram: Data Platform, Producer, Consumer, DnA, Data Landscape]
#6 Exception: Core KPIs
We, Data & Analytics, take full ownership of and responsibility for the few top company-wide core KPIs.
[Diagram: Data Platform, Producer, Consumer, DnA, Data Landscape, Core metric]
#7 Transparency Over Continuity
We value data transparency over data continuity, which means we may break metric comparability if doing so enables better insights.
[Diagram: Data Platform, Producer, Consumer, DnA, Data Landscape, Core metric]
The Ultimate Goal
A federated landscape of data producers and consumers with just enough rules to ensure seamless co-operation without severely impeding autonomy.
[Diagram: Data Platform, Metadata, Data, Producer, Consumer, DnA, Data Landscape, Core metric]

Weitere ähnliche Inhalte

Was ist angesagt?

Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...
Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...
Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...Amazon Web Services
 
Airbyte - Series-B deck
Airbyte - Series-B deckAirbyte - Series-B deck
Airbyte - Series-B deckAirbyte
 
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...Amazon Web Services
 
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018Amazon Web Services
 
Democratization of Data @Indix
Democratization of Data @IndixDemocratization of Data @Indix
Democratization of Data @IndixManoj Mahalingam
 
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)Amazon Web Services
 
AWS101 Cloud is the New Normal
AWS101  Cloud is the New Normal AWS101  Cloud is the New Normal
AWS101 Cloud is the New Normal Sandy Carter
 
Javaedge 2010-cschalk
Javaedge 2010-cschalkJavaedge 2010-cschalk
Javaedge 2010-cschalkChris Schalk
 
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMakerBDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMakerAmazon Web Services
 
Amazon QuickSight First Call Deck
Amazon QuickSight First Call DeckAmazon QuickSight First Call Deck
Amazon QuickSight First Call DeckAmazon Web Services
 
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...Amazon Web Services
 
The Big Connection: Integrating Cloud with Enterprise Systems
The Big Connection: Integrating Cloud with Enterprise SystemsThe Big Connection: Integrating Cloud with Enterprise Systems
The Big Connection: Integrating Cloud with Enterprise SystemsInside Analysis
 
AWS Summit - Atlanta
AWS Summit - Atlanta AWS Summit - Atlanta
AWS Summit - Atlanta Sandy Carter
 
Big data: analyzing large data sets
Big data: analyzing large data setsBig data: analyzing large data sets
Big data: analyzing large data setsR A Akerkar
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningKai Wähner
 
Deep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseDeep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseGanesan Narayanasamy
 
Amazon Managed Blockchain and Quantum Ledger Database QLDB
Amazon Managed Blockchain and Quantum Ledger Database QLDBAmazon Managed Blockchain and Quantum Ledger Database QLDB
Amazon Managed Blockchain and Quantum Ledger Database QLDBJohn Yeung
 

Was ist angesagt? (20)

Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...
Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...
Breaking the Ice: Transform Cold Archival Data into Fresh Insights (STG355) -...
 
Airbyte - Series-B deck
Airbyte - Series-B deckAirbyte - Series-B deck
Airbyte - Series-B deck
 
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...
Connecting the dots - How Amazon Neptune and Graph Databases can transform yo...
 
CurrencyCloud and AWS
CurrencyCloud and AWSCurrencyCloud and AWS
CurrencyCloud and AWS
 
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018
How to Quickly Get Insights from IoT Data on AWS (ANT337-S) - AWS re:Invent 2018
 
Democratization of Data @Indix
Democratization of Data @IndixDemocratization of Data @Indix
Democratization of Data @Indix
 
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
雲上打造資料湖 (Data Lake):智能化駕馭商機 (Level 300)
 
AWS101 Cloud is the New Normal
AWS101  Cloud is the New Normal AWS101  Cloud is the New Normal
AWS101 Cloud is the New Normal
 
Javaedge 2010-cschalk
Javaedge 2010-cschalkJavaedge 2010-cschalk
Javaedge 2010-cschalk
 
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMakerBDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
 
Amazon QuickSight First Call Deck
Amazon QuickSight First Call DeckAmazon QuickSight First Call Deck
Amazon QuickSight First Call Deck
 
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...
Enable Your Smart Factory with the AWS Industrial IoT Reference Solution (MFG...
 
The Big Connection: Integrating Cloud with Enterprise Systems
The Big Connection: Integrating Cloud with Enterprise SystemsThe Big Connection: Integrating Cloud with Enterprise Systems
The Big Connection: Integrating Cloud with Enterprise Systems
 
AWS Summit - Atlanta
AWS Summit - Atlanta AWS Summit - Atlanta
AWS Summit - Atlanta
 
AWS Analytics Experience Argentina - Intro
AWS Analytics Experience Argentina - IntroAWS Analytics Experience Argentina - Intro
AWS Analytics Experience Argentina - Intro
 
Big data: analyzing large data sets
Big data: analyzing large data setsBig data: analyzing large data sets
Big data: analyzing large data sets
 
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine LearningData Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
Data Preparation vs. Inline Data Wrangling in Data Science and Machine Learning
 
Migrating database to cloud
Migrating database to cloudMigrating database to cloud
Migrating database to cloud
 
Deep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the EnterpriseDeep Learning Image Processing Applications in the Enterprise
Deep Learning Image Processing Applications in the Enterprise
 
Amazon Managed Blockchain and Quantum Ledger Database QLDB
Amazon Managed Blockchain and Quantum Ledger Database QLDBAmazon Managed Blockchain and Quantum Ledger Database QLDB
Amazon Managed Blockchain and Quantum Ledger Database QLDB
 


The Scout24 Data Platform - a technical deep dive

  • 1. SUMMIT BERLIN
  • 2. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. The Scout24 Data Platform: A Technical Deep Dive. Sean Gustafson, Senior Technical Product Manager, Scout24. Raffael Dzikowski, Senior Data Engineer, Scout24.
  • 3. 5 core geographies and an overall presence in 18 countries; 80m household reach; 2 major household brand names. Scout24 AG: SDAX; € 489 million revenue (2017); ~1500 employees.
  • 4. Our technical evolution (diagram labels: Production Database, Monolith app, Data Warehouse; then Data Warehouse, Microservice (repeated), Persistence (repeated))
  • 5. Our data warehouse was a bottleneck
  • 6. Scout24 wants to become a truly data-driven company: fast & easy data-driven product development… …supported by Data & Analytics
  • 7. Scout24 wants to become a truly data-driven company: everywhere in the company... ...without bloating up Data & Analytics
  • 8. Our solution: Build an internal “platform” for data
  • 10. We think of our Data Platform as a Product. Just like AWS, Salesforce, etc., the platform is a generic layer upon which Scout24’s products can be built. BUT, we have a very, very small number of customers. That means product teams get personalized support and there is lots of opportunity for collaboration.
  • 11. We don’t dictate anything. We just try to make certain things easier by offering a “paved path”. Product teams are fully empowered to make their own choices about what is the best use of their resources.
  • 12. “In almost all cases, we will not mandate that internal teams use these platforms and services; these platform teams will have to win over and satisfy their internal customers, even competing with external vendors.”
  • 13. Guiding principle of the platform: autonomy for producers and consumers. Self-service Data Ingestion; Self-service ETL; Self-service Analytics
  • 14. Diagram labels: Data Producer; Self-service Data Ingestion; Central Data Lake on Amazon S3; Self-service ETL; Self-service Analytics; Data Scientist, Analyst, PM, Leader, Engineer, Analyst
  • 16. Our Approach
  • 17. Multi-Account Setting
  • 18. Data Lake Bucket Types
  • 19. Data Lake Access Roles: Data Lake Account ImmobilienScout24; Data Lake Account AutoScout24; Regular DL Bucket Access Role; Restricted DL Bucket Access Roles; Personal DL Bucket Access Roles
  • 20. Ingestion Goals (Microservices): Batch Support; Streaming Support; REST API; Scalability
  • 21. Ingestion Options: Amazon Kinesis Data Firehose; Kafka Connect
  • 22. Ingestion Options: Amazon Kinesis Data Firehose (Simple Config); Kafka Connect (Simple Config)
  • 23. Firehose Ingestion Architecture: Data Lake Account; Amazon Kinesis Data Firehose; Producer to Firehose Role; Producer Account; Data Producer; Firehose to Datalake Role; Datalake Bucket
  • 24. Firehose Ingestion Architecture: Data Lake Account; Amazon Kinesis Data Firehose; Producer to Firehose Role; Producer Account; Data Producer; Firehose to Datalake Role; Datalake Bucket; STS Assume Role; STS Assume Role
  • 25. Firehose Ingestion Architecture: Data Lake Account; Amazon Kinesis Data Firehose; Producer to Firehose Role; Producer Account; Data Producer; Firehose to Datalake Role; Datalake Bucket; STS Assume Role; STS Assume Role; Send Data
  • 26. Firehose Ingestion Architecture: Data Lake Account; Amazon Kinesis Data Firehose; Producer to Firehose Role; Producer Account; Data Producer; Firehose to Datalake Role; Datalake Bucket; STS Assume Role; STS Assume Role; Send Data; Write Data
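The role chain on slides 23 to 26 can be sketched as a set of IAM policy documents. This is a minimal illustration, not Scout24's actual setup: it assumes that the "Producer to Firehose Role" is assumed by principals in the producer account, and that the "Firehose to Datalake Role" is assumed by the Firehose service to write into the bucket. The function names, account ID, ARNs and bucket name are hypothetical placeholders.

```python
def producer_trust_policy(producer_account_id: str) -> dict:
    """Trust policy for the (assumed) Producer to Firehose Role:
    principals in the producer account may call sts:AssumeRole on it."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{producer_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def producer_to_firehose_policy(delivery_stream_arn: str) -> dict:
    """Permissions of the Producer to Firehose Role: send data to one stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
            "Resource": delivery_stream_arn,
        }],
    }


def firehose_trust_policy() -> dict:
    """Trust policy for the (assumed) Firehose to Datalake Role:
    the Firehose service itself assumes this role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "firehose.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def firehose_to_datalake_policy(bucket_name: str) -> dict:
    """Permissions of the Firehose to Datalake Role: write objects to the bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:AbortMultipartUpload",
                "s3:GetBucketLocation",
                "s3:ListBucket",
            ],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }],
    }
```

Splitting the chain into two roles keeps each hop narrowly scoped: the producer can only put records on its stream, and only Firehose can write to the data lake bucket.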
  • 27. Kafka Connect
  • 28. Kafka Connect: Elasticsearch; Amazon S3; ActiveMQ; Cassandra; Kafka; RDBMS…
  • 29. Kafka Connect: Elasticsearch; Amazon S3; ActiveMQ; Cassandra; Kafka; RDBMS…; Kafka Connect Cluster
  • 30. Kafka Connect: Elasticsearch; Amazon S3; ActiveMQ; Cassandra; Kafka; RDBMS…; Kafka Connect Cluster; Read Data from Topic(s)
  • 31. Kafka Connect: Elasticsearch; Amazon S3; ActiveMQ; Cassandra; Kafka; RDBMS…; Kafka Connect Cluster; Read Data from Topic(s); Write Data
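To make the "read data from topic(s), write data" flow concrete, here is a hedged sketch of a request body that could be posted to the Kafka Connect REST API (`POST /connectors`) to create an S3 sink. The slides do not say which connector Scout24 uses; this example assumes the Confluent S3 sink connector, and the connector name, topic, bucket and region are hypothetical.

```python
import json


def s3_sink_connector_request(name, topics, bucket, region,
                              flush_size=1000, tasks_max=1):
    """Build the JSON body for creating an S3 sink connector.

    Kafka Connect expects every config value as a string.
    """
    config = {
        # Assumption: Confluent S3 sink connector (not named in the slides).
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": str(tasks_max),
        "topics": ",".join(topics),
        "s3.bucket.name": bucket,
        "s3.region": region,
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        # Number of records written per S3 object before a new one is started.
        "flush.size": str(flush_size),
    }
    return json.dumps({"name": name, "config": config}, indent=2, sort_keys=True)


# Hypothetical usage: one topic delivered to a data lake bucket.
body = s3_sink_connector_request(
    name="example-s3-sink",
    topics=["example-events"],
    bucket="example-datalake",
    region="eu-west-1",
)
```

The returned string would be sent with a `Content-Type: application/json` header to the Connect cluster's REST endpoint.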
  • 32. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Scout24InfinityCluster
  • 33. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Scout24InfinityCluster Amazon ECS Infinity Service Simple Config
  • 34. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Scout24InfinityCluster Amazon ECS Infinity Service Simple Config Central Logging to Elasticsearch Monitoring in Datadog Managed By Scout24 Cloud Platform Engineering
  • 35. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T RelatedBreakouts 15:00 in Hall 1 To Infinity and Beyond – Handling Heterogeneous Container Clusters in AWS Christine Trahe, Platform Engineer @ Scout24 16:00 in Hall 1 Boost your AWS Infrastructure Philipp Garbe, AWS Container Hero @ Scout24
  • 36. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnecton Infinity Simple Config (Infinity) Amazon ECS Infinity Service
  • 37. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnecton Infinity
  • 38. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnecton Infinity
  • 39. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnecton Infinity
  • 40. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnecton Infinity Simple Config (Kafka Connect) Kafka Connect Service Infinity Service
  • 41. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnectDeployment
  • 42. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnectDeployment
  • 43. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnectDeployment
  • 44. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnectDeployment
  • 45. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T KafkaConnectDeployment
  • 46. S U M M I T © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 47. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWario –OurWrapper forAWS DataPipeline
  • 48. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWario –OurWrapper forAWS DataPipeline Simple Config AWS Data Pipeline DataWario
  • 49. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWario –OurWrapper forAWS DataPipeline Builtin Support for Scout24 Ecosystem Shortens Development Cycles Only Exposes Configuration Essentials Introduces Custom Step Types Automatically Manages Artifacts Simple Config AWS Data Pipeline DataWario
  • 50. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWarioArchitecture
  • 51. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWarioArchitecture
  • 52. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWarioArchitecture
  • 53. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T DataWarioArchitecture
  • 54. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T Library ofCommon DataTransformations
  • 55. S U M M I T © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 56. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T QueryChallenges
  • 57. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T What’sAhead
  • 58. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T What’sAhead OneScout Hive Metastore Unlock the Datalake for Scout24’s Toolset and Users with Different Skillsets Data Analysis for Various User Groups Provide a Timely and Accurate Update of the Metadata Layer
  • 59. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T What’sAhead Personal Analytics ClusterOneScout Hive Metastore Unlock the Datalake for Scout24’s Toolset and Users with Different Skillsets Data Analysis for Various User Groups Provide a Timely and Accurate Update of the Metadata Layer
  • 60. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T What’sAhead Automatic Hive Partition DetectionPersonal Analytics ClusterOneScout Hive Metastore Unlock the Datalake for Scout24’s Toolset and Users with Different Skillsets Data Analysis for Various User Groups Provide a Timely and Accurate Update of the Metadata Layer
  • 61. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T OneScout HiveMetastore –ASchematicView Personal Analytics Cluster Hive Tables and Presto Views Datalake
  • 62. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T OneScout HiveMetastore – Recapof Ecosystem
  • 63. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 64. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 65. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 66. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 67. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 68. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 69. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 70. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 71. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 72. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 73. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 74. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T EMR MetastoreConfigurationOptions
  • 75. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 76. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 77. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 78. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 79. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 80. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 81. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T TheScout24Hive MetastoreProxy –AMotivation
  • 82. The Personal Analytics Cluster – An Overview
  • 83. The Personal Analytics Cluster – An Overview: Amazon EMR; Personal Analytics Cluster; Simple Config
  • 84. The Personal Analytics Cluster – An Overview: Amazon EMR; Personal Analytics Cluster; Simple Config; Easy Access via Web Interface; Zeppelin and Jupyter Notebook Restore; One-Click Deployment; Managed Scaling and Shutdown; Support for Pre-baked AMIs and Configs
  • 85. The Personal Analytics Cluster – An Overview
  • 86. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake
  • 87. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table
  • 88. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion
  • 89. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion
  • 90. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion; Table Access
  • 91. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion; Table Access
  • 92. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion; Table Access; Automatic Partition Detection
  • 93. Automated Partition Detection – A Motivation: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake; Partitioned Table; Data Ingestion; Table Access; Automatic Partition Detection
  • 94. Partition Detection Architecture
  • 95. Partition Detection Architecture
  • 96. Partition Detection Architecture
  • 97. Partition Detection Architecture
  • 98. Partition Detection Architecture
  • 99. Partition Detection Architecture
  • 100. Partition Detection Architecture
  • 101. Partition Detection Architecture
  • 102. Partition Detection Architecture
  • 103.
  • 104. Build our own vs. AWS managed services: Metastore → Glue; Presto → Athena; DataWario → Glue, Step Functions, Lambda, …; Personal Analytics Cluster → Glue notebooks, SageMaker. We hope to throw out most of the custom components we build.
  • 105. Our data warehouse was a bottleneck
  • 106. Our data platform holds nothing back
  • 107. Thank you! Raffael Dzikowski, Senior Data Engineer, Scout24. Sean Gustafson, Senior Technical Product Manager, Scout24.
  • 108.
  • 109.
  • 110. Extra slides
  • 111. Data Warehouse vs. Data Platform: Centralized vs. Federated; Control vs. Autonomy; Perfection vs. Scale; Pull vs. Push; Product is Data vs. Product is Platform; Reporting vs. Reporting, Advanced Analytics, Machine Learning, etc.
  • 112. (Architecture diagram; labels: Core DB, APP (×9), MicroStrategy, Presto, Central Data Lake on S3, CRM, Core DB, Micro Service, REST API / Firehose, Data Scientist, Jupyter, Zeppelin, Analyst, PM, Data Producer, Personal Analytics Clusters, SQL, Alation Data Catalog, Metastore, Leader, Engineer, Analyst, DataWario, Spark)
  • 113. Our Journey to Presto: Personal Analytics Cluster; Hive Tables and Presto Views; Datalake
  • 114. Our Journey to Presto
  • 115. Our Journey to Presto: Amazon Athena
  • 116. Our Journey to Presto: Amazon Athena
  • 117. Our Journey to Presto: Amazon Athena; On EMR
  • 118. Our Journey to Presto: Amazon Athena; On EMR: Cross Account Support (OneScout Hive Metastore); Leverages Datalake Access Roles (EMRFS); Scheduled Scaling Configurations; Fits our GDPR Concept (multiple isolated clusters)
  • 119.
  • 120. SCOUT24 DATA LANDSCAPE MANIFESTO: ROLES, RESPONSIBILITIES, AND VALUES FOR A DATA-DRIVEN COMPANY AT SCALE
  • 121. #1 Preamble: Data is a key asset of our company.
  • 122. #2 Our Responsibility: We, Data & Analytics, are responsible for providing a solid Data Platform as well as clear guidelines and training on how to participate in the Data Landscape. (Diagram labels: Data Platform, DnA, Data Landscape)
  • 123. #3 Data Autonomy, Not Anarchy: Data autonomy puts data producers & data consumers in control of their data & of their metrics and thereby allows us to be data-driven at scale, but this comes with responsibility. (Diagram labels: Data Platform, Data, Producer, Consumer, DnA, Data Landscape)
  • 124. #4 Producer’s Responsibility: Data producers are responsible for publishing data to the central Data Lake, for the data's quality, and for publishing metadata that makes it easy to find and consume the data. (Diagram labels: Data Platform, Metadata, Data, Producer, DnA, Data Landscape)
  • 125. #5 Consumer’s Responsibility: Data consumers are responsible for the definition & visualization of metrics and for driving the implementation and maintenance of these metrics. (Diagram labels: Data Platform, Producer, Consumer, DnA, Data Landscape)
  • 126. #6 Exception: Core KPIs: We, Data & Analytics, take full ownership of and responsibility for the few top company-wide core KPIs. (Diagram labels: Data Platform, Producer, Consumer, DnA, Data Landscape, Core metric)
  • 127. #7 Transparency Over Continuity: We value data transparency over data continuity, which means we may break metric comparability if doing so enables better insights. (Diagram labels: Data Platform, Producer, Consumer, DnA, Data Landscape, Core metric)
  • 128. The Ultimate Goal: A federal landscape of data producers and consumers with just enough rules to ensure seamless co-operation without severely impeding autonomy. (Diagram labels: Data Platform, Metadata, Data, Producer, Consumer, DnA, Data Landscape, Core metric)
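
Slides 86 to 102 motivate and outline automated partition detection, but the transcript carries no implementation detail. As general background: a partitioned Hive table only exposes data under partitions that are registered in the metastore, so files newly ingested into the data lake stay invisible to Hive and Presto until the partition is added. The following is a minimal Python sketch of that registration step, not the Scout24 implementation. It assumes Hive-style `name=value` path segments; the bucket, prefix, table, and function names are all hypothetical.

```python
from typing import Dict, Optional


def partition_spec_from_key(key: str, table_prefix: str) -> Optional[Dict[str, str]]:
    """Extract Hive-style partition values from an object key.

    Returns None when the key is outside the table prefix or does not
    follow the assumed name=value directory layout.
    """
    if not key.startswith(table_prefix):
        return None
    relative = key[len(table_prefix):].strip("/")
    segments = relative.split("/")[:-1]  # drop the trailing file name
    spec: Dict[str, str] = {}
    for segment in segments:
        name, sep, value = segment.partition("=")
        if not sep or not name or not value:
            return None  # not a name=value segment
        spec[name] = value
    return spec or None


def partition_location(bucket: str, table_prefix: str, spec: Dict[str, str]) -> str:
    """Rebuild the partition directory URI from the parsed spec."""
    path = "/".join(f"{name}={value}" for name, value in spec.items())
    return f"s3://{bucket}/{table_prefix.strip('/')}/{path}/"


def add_partition_statement(table: str, spec: Dict[str, str], location: str) -> str:
    """Build a HiveQL statement that registers the partition if it is missing.

    Values are interpolated without escaping, so this sketch is only
    suitable for trusted, well-formed keys.
    """
    assignments = ", ".join(f"{name}='{value}'" for name, value in spec.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({assignments}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical object key, as it might arrive in a storage event.
    key = "events/listing_views/year=2019/month=02/day=26/part-0000.parquet"
    spec = partition_spec_from_key(key, "events/listing_views/")
    if spec is not None:
        location = partition_location("example-datalake", "events/listing_views/", spec)
        print(add_partition_statement("events.listing_views", spec, location))
```

Using `ADD IF NOT EXISTS` keeps the statement idempotent, which matters if several files land in the same partition and each one triggers detection.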

Editor's notes

  1. https://www.pexels.com/photo/bottle-pouring-water-on-glass-1246745/
  2. https://www.pexels.com/photo/boardwalk-clouds-country-countryside-276299/
  3. https://www.pexels.com/photo/bottle-pouring-water-on-glass-1246745/
  4. https://www.pexels.com/photo/bird-s-eye-view-photography-of-water-falls-rushing-through-cliff-1229846/
  5. To signal a new way of thinking here, we had the idea to formulate a Data Landscape Manifesto which we as a company would agree on. It is about roles, responsibilities, and common values. It consists of 7 principles, each based on an assumption or a belief from which we derived that principle.
  6. We believe that collecting & analyzing data is crucial to understand our business, our customers, and the market in order to provide the right services & products. Although this is nothing surprising these days, we wanted to start with this in order to ensure a common understanding of why all of this is important in the first place. --> Loosely coupled (Microservices), strongly ALIGNED (Jez Humble, Adrian Cockcroft)
  7. We therefore believe that everyone in the company must have easy access to the data available and it must be easy to publish data which can be used by others. This requires a solid Data Platform: easy-to-use tools, reliable infrastructure, and simple guidelines for publishing & consuming data. … This is our core responsibility (and we wanted to start with this side). The data landscape is the playground on which data producers and data consumers interact. We provide the platform and the clear guidelines but we do not own that space. The reason for this is that we believe…
  8. We believe that an exhaustive centralized data management does not allow us to scale to the level of data creation and consumption we aspire to as a company, because it creates a bottleneck and introduces accidental, indirect dependencies. Instead, we believe that data autonomy is the only way for data usage to scale across the company. However, for data autonomy to not become data anarchy, there has to be a clear set of basic rules and responsibilities. Data autonomy puts…
  9. We believe that extensive data availability, data discoverability, and data usability are crucial and that – at scale – no one else can ensure this other than the one controlling the source where the data is originally generated.
  10. We believe that the stakeholder of a metric has to be the single owner of that metric and its definition, and has to drive its implementation. Without a single source of truth about what a metric means, we risk that multiple diverging and possibly contradicting understandings and implementations develop over time.
  11. We believe that a minimum level of company-wide comparability & reliability of core KPIs is crucial for leading the company in the right direction. The management is the owner of these core KPIs and the data group represents the management here in terms of metric ownership.
  12. We believe that transparency is crucial for understanding what the meaning of a metric is. If month-to-month comparability must never break, there is no way to continuously improve metrics and their transparency based on new insights. To stay with the example: if we come to understand that a certain number of orders are actually fraudulent, then we want to report the real revenue.
  13. A federal landscape of data producers and consumers with just enough rules to ensure seamless co-operation without severely impeding autonomy.