KAFKA Meetup - December 2021
Andrea Gioia
CTO at Quantyca
Co-Founder at Blindata
Digital Integration Hub: why, what and how?
Legacy systems
Some truths to face
Legacy systems are growing in
size and number. They are here
to stay!
If your architecture does not
manage legacy systems, legacy
systems will sooner or later
manage your architecture.
Who am I?
Not an easy question to answer but keeping it simple...
Andrea Gioia
andrea.gioia@quantyca.it
CTO at Quantyca
Quantyca is a privately owned technology consulting firm based in Italy,
specialized in data and metadata management.
quantyca.it
Co-founder at Blindata
Blindata is a SaaS platform that leverages Data
Governance and Compliance to empower your
Data Management projects.
blindata.io
What is legacy modernization
Digital transformation continuously pushes toward the
development of new
● touchpoints in an omnichannel logic (systems of
engagement)
● analytical and AI-based services (systems of insight)
These new applications are usually integrated with
back-office legacy systems in a point-to-point fashion.
This way of integrating the new with the legacy does not scale
in the long term.
Because legacy systems cannot simply be thrown away, a better
integration architecture is needed to modernize them
in place.
...and why it matters
System of Engagement System of Insight
System of Records
Legacy
Systems
Application
Layer
Integration
Layer
Point to point “Spaghetti” integration
Legacy modernization
Key business drivers
● TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT: Go beyond the limits imposed by
legacy systems to improve business agility
● COSTS AND RISKS REDUCTION: Rationalize integrations to reduce development and
maintenance costs and to avoid uncontrolled access to data
● RESILIENCE AND PERFORMANCE IMPROVEMENT: Ensure the uptime of legacy systems even
in the face of significant workload increases
Integration architecture #1
All new functionality is implemented directly by extending
the legacy system or by buying complementary products
offered by the legacy system's vendor.
The integration layer, if present, is limited to an API Gateway that
decouples the legacy backend from frontend applications.
Legacy systems take it all
System of Engagement
Frontend
System of Insight
Frontend
System of Records
Legacy
Systems
Application
Layer
Integration
Layer
API Gateway
SoE & SoI Backend (×5)
TIME-TO-MARKET AND BUSINESS AGILITY
IMPROVEMENT
COSTS AND RISKS REDUCTION
RESILIENCE AND PERFORMANCE
IMPROVEMENT
Integration architecture #2
Integration rationalization through composite services
System of engagement System Of Insight
System of Records
Legacy
Systems
Application
Layer
Integration
Platform
API Gateway
Request Based Integration Layer
Application Services
Process Services
Sourcing Services
Composite Services
Integrations are rationalized through different layers of
reusable and composable services.
Sourcing services wrap legacy systems, process services
orchestrate business processes, and application services
provide a backend for frontend applications.
TIME-TO-MARKET AND BUSINESS AGILITY
IMPROVEMENT
COSTS AND RISKS REDUCTION
RESILIENCE AND PERFORMANCE
IMPROVEMENT
Integration architecture #2
Integration rationalization through data virtualization
System of engagement System Of Insight
System of Records
Legacy
Systems
Application
Layer
Integration
Platform
API Gateway
Request Based Integration Layer
Application Layer
Business Layer
Physical Layer
Virtual DWH
TIME-TO-MARKET AND BUSINESS AGILITY
IMPROVEMENT
COSTS AND RISKS REDUCTION
RESILIENCE AND PERFORMANCE
IMPROVEMENT
Integrations are rationalized through different layers of
views served by a data virtualization application.
The physical layer wraps legacy systems, the business layer
exposes the business model, and the application layer provides
projections designed to facilitate consumption.
Integration architecture #2
Integration rationalization
System of engagement System Of Insight
System of Records
Legacy
Systems
Application
Layer
Hybrid
Integration
Platform
API Gateway
Request Based Integration Layer
Virtual DWH
Composite Services
TIME-TO-MARKET AND BUSINESS AGILITY
IMPROVEMENT
COSTS AND RISKS REDUCTION
RESILIENCE AND PERFORMANCE
IMPROVEMENT
Composite services and data virtualization can be used in the
same architecture. The former is preferred to back systems of
engagement, the latter to back systems of insight.
Both solutions simplify integrations but don’t reduce the
workload on the backend systems.
Integration architecture #3
Data offloading
System of engagement System Of Insight
System of Records
Legacy
Systems
Application
Layer
Hybrid
Integration
Platform
API Gateway
Event-Based Integration Layer
High-Performance Data Store
Microservices
Metadata Management
TIME-TO-MARKET AND BUSINESS AGILITY
IMPROVEMENT
COSTS AND RISKS REDUCTION
RESILIENCE AND PERFORMANCE
IMPROVEMENT
Data offloaded from legacy systems is aggregated into a
low-latency, high-performance data store accessible via APIs,
events or batch.
The data store synchronizes with the back ends via
event-driven integration patterns.
Digital Integration Hub
Key building blocks
Legacy Systems: where the data is stored
Applications: where the data is used
Connectors
● Keep the legacy systems and the high-performance data store in sync by offloading all modifications to relevant data in real time
Event store
● Transforms technical events coming from connectors into domain and business events that can be consumed downstream by the high-performance data store or by other consumers (event-driven integration)
High-performance data store
● Stores domain-specific data, exposing a single consolidated view of entities
● Supports fast ingestion to reduce the eventual consistency window
● Can support analytical queries
Services
● Connect to the high-performance data store for read queries
● Execute writes on the legacy systems by means of command events pushed to the event store (command query responsibility segregation)
Connectors
Data acquisition patterns
Trigger
(Push Mode)
Good for neo-legacies.
Problematic for old-school
legacies.
Change Data Capture
(Backend Interception)
The best option, but CDC
connectors can be quite
expensive.
Active Polling
(Pull Mode)
It is difficult to find a trade-off that
satisfies both the load constraints of
the legacy system and the real-time needs
of applications.
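The active polling trade-off can be illustrated with a minimal sketch. It assumes a hypothetical `orders` table with an `updated_at` column, and an in-memory SQLite database stands in for the legacy system. Each poll reads only the rows changed since the last high-water mark.

```python
import sqlite3

def poll_changes(conn, last_seen):
    """Return rows modified after last_seen, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark

# Hypothetical legacy table; updated_at is a plain integer clock here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, "NEW", 10), (2, "NEW", 20)]
)

first_batch, mark = poll_changes(conn, 0)          # initial load
conn.execute("UPDATE orders SET status = 'SHIPPED', updated_at = 30 WHERE id = 1")
second_batch, mark = poll_changes(conn, mark)      # only the changed row
```

The polling interval is the knob behind the trade-off: a shorter interval narrows the staleness window but adds query load on the legacy database. Deleted rows and intermediate updates between two polls are not visible to this approach.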
Interesting Kafka source connectors for legacy
modernization:
○ JDBC connectors: for active polling
○ Debezium connectors: for CDC from
MySQL, PostgreSQL, …
○ Salesforce connectors: for CDC from
Salesforce
○ Oracle connector: for CDC from Oracle
○ Partner connectors: for CDC from other
legacy systems such as SAP and mainframes (e.g. Qlik
Replicate connector)
Decorating Collaborator
(Frontend Interception)
Widely cited in the literature.
Good in theory, problematic in
reality.
Event Store
Event driven integration
Legacy System → Streaming Platform: Technical Events (Speed & Fidelity) → Domain Events (Trusted Views) → Business Events (Ease of consumption) → High-Performance Data Store
Event Store
Offloading patterns
Legacy System → Streaming Platform: Technical Events (Speed & Fidelity) → Domain Events (Trusted Views)
One table per topic
Changes to each table are mapped to
distinct topics, one topic per table.
Stream joins are used to
create domain events from technical
events spread across different topics
Preserving transactional coherence within
aggregates can be complex when the
aggregate is spread among multiple tables
updated by long running transactions
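A minimal sketch of the join step, assuming two hypothetical table topics (`order_header` and `order_line`) represented as plain lists. It is a batch join over already-received events, not a real stream processor, and only shows how technical events from different topics are combined into one domain event per aggregate.

```python
from collections import defaultdict

# Hypothetical technical events, one list per table topic.
order_header_topic = [
    {"order_id": 1, "customer": "acme"},
]
order_line_topic = [
    {"order_id": 1, "sku": "A", "qty": 2},
    {"order_id": 1, "sku": "B", "qty": 1},
]

def join_into_domain_events(headers, lines):
    """Join header and line events on order_id into one event per order."""
    lines_by_order = defaultdict(list)
    for line in lines:
        lines_by_order[line["order_id"]].append(
            {"sku": line["sku"], "qty": line["qty"]}
        )
    return [
        {
            "type": "OrderChanged",
            "order_id": header["order_id"],
            "customer": header["customer"],
            "lines": lines_by_order[header["order_id"]],
        }
        for header in headers
    ]

domain_events = join_into_domain_events(order_header_topic, order_line_topic)
```

In a real stream the join cannot see all events at once: it has to decide when every row written by a transaction has arrived, which is where the transactional coherence problem described above comes from.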
Event Store
Offloading patterns
Legacy System → Streaming Platform: Technical Events (Speed & Fidelity) → Domain Events (Trusted Views)
One aggregate per topic
All changes to tables that are part of the
same aggregate are mapped to the same
topic.
The identifier of the aggregate is used to
partition the topic.
It’s easier to create domain events from
technical events while preserving transactional
coherence, even with complex aggregates
or unpredictable transactional patterns.
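A minimal sketch of partitioning by aggregate identifier, assuming hypothetical aggregate ids and a hypothetical partition count. `zlib.crc32` is only a stand-in for a stable hash; Kafka's default partitioner uses a different one (murmur2).

```python
import zlib

NUM_PARTITIONS = 6  # hypothetical partition count

def partition_for(aggregate_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an aggregate id to a partition with a stable hash."""
    return zlib.crc32(aggregate_id.encode("utf-8")) % num_partitions

# Changes from different tables of the same aggregate, in commit order.
changes = [
    ("order-42", "ORDER_HEADER", "UPDATE"),
    ("order-42", "ORDER_LINE", "INSERT"),
    ("order-7", "ORDER_HEADER", "INSERT"),
    ("order-42", "ORDER_LINE", "DELETE"),
]

# Simulate the topic: one ordered list of events per partition.
partitions = {}
for aggregate_id, table, op in changes:
    partitions.setdefault(partition_for(aggregate_id), []).append(
        (aggregate_id, table, op)
    )
```

Because every change to `order-42` lands in the same partition, a consumer sees those changes in the order they were produced, regardless of which table they came from.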
Event Store
Offloading patterns
Legacy System → Streaming Platform: Technical Events (Speed & Fidelity) → Domain Events (Trusted Views)
Transactional outbox pattern
The legacy system is modified to
insert messages/events into an outbox
table as part of the local transaction.
The modification can be made at the
code or database level (e.g. triggers or
materialized views).
The connector that offloads data to the
streaming platform is triggered by the
outbox table.
OUTBOX Table
COMMIT TRX
INSERT
UPDATE
DELETE
INSERT
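A minimal sketch of the outbox write at code level, assuming hypothetical `orders` and `outbox` tables, with an in-memory SQLite database standing in for the legacy system. The business row and the event row are written in the same local transaction, so either both are committed or neither is.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "aggregate_id INTEGER, event_type TEXT, payload TEXT)"
)

def place_order(conn, order_id):
    """Write the business row and its event in one local transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, "NEW"))
        conn.execute(
            "INSERT INTO outbox (aggregate_id, event_type, payload) "
            "VALUES (?, ?, ?)",
            (order_id, "OrderPlaced", json.dumps({"id": order_id, "status": "NEW"})),
        )

place_order(conn, 1)
try:
    place_order(conn, 1)  # duplicate key: the transaction fails, no event is recorded
except sqlite3.IntegrityError:
    pass

outbox_rows = conn.execute(
    "SELECT aggregate_id, event_type FROM outbox ORDER BY seq"
).fetchall()
```

A connector then only needs to read the `outbox` table in `seq` order; it never has to reconstruct events from the business tables.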
Event Store
Offloading patterns
Legacy System → Streaming Platform: Technical Events (Speed & Fidelity) → Domain Events (Trusted Views)
Triggered publisher
All changes to tables that are part of the same
aggregate are mapped to the same topic as
technical events whose payload can contain only the
aggregate id and transaction id.
For every transaction id, a stream processor queries
the legacy database, extracts the modified
aggregate by filtering on its id, and publishes it as the
payload of a new domain event.
To reduce the workload on the legacy system, the stream
processor can query a read replica.
Transactional coherence within the aggregate is
guaranteed by the upstream database.
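A minimal sketch of the triggered publisher, assuming hypothetical `order_header` and `order_line` tables. An in-memory SQLite database stands in for the read replica and a plain list stands in for the technical-event topic.

```python
import sqlite3

replica = sqlite3.connect(":memory:")  # stands in for a read replica
replica.execute(
    "CREATE TABLE order_header (order_id INTEGER PRIMARY KEY, customer TEXT)"
)
replica.execute("CREATE TABLE order_line (order_id INTEGER, sku TEXT, qty INTEGER)")
replica.execute("INSERT INTO order_header VALUES (1, 'acme')")
replica.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)", [(1, "A", 2), (1, "B", 1)]
)

# Thin technical events: only the aggregate id and the transaction id.
technical_events = [
    {"aggregate_id": 1, "tx_id": 100},
    {"aggregate_id": 1, "tx_id": 100},  # second table touched by the same transaction
]

def publish_domain_events(events, db):
    """Query the aggregate once per (aggregate_id, tx_id) and build a domain event."""
    seen, published = set(), []
    for event in events:
        key = (event["aggregate_id"], event["tx_id"])
        if key in seen:
            continue  # this transaction was already published
        seen.add(key)
        header = db.execute(
            "SELECT customer FROM order_header WHERE order_id = ?",
            (event["aggregate_id"],),
        ).fetchone()
        lines = db.execute(
            "SELECT sku, qty FROM order_line WHERE order_id = ? ORDER BY sku",
            (event["aggregate_id"],),
        ).fetchall()
        published.append({
            "type": "OrderChanged",
            "order_id": event["aggregate_id"],
            "tx_id": event["tx_id"],
            "customer": header[0],
            "lines": [{"sku": sku, "qty": qty} for sku, qty in lines],
        })
    return published

domain_events = publish_domain_events(technical_events, replica)
```

The two technical events collapse into one domain event because the processor reads the whole aggregate from the database, which already holds a consistent committed state.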
High-performance data store
Some options with pros and cons
ksqlDB | Document DB | In-Memory DB | HTAP DB
PROS
+ Does not require external components
+ Low latency
+ Can handle very high throughput
+ Moving from event to state is simple and requires a small
integration effort
+ Stored data can also be consumed directly by stream
processors
CONS
- Not SQL compliant
- Serving external consumers has some limitations that
must be managed directly by the consumers
- Not a good fit for complex analytical workloads
- TCO may not be optimal for huge data volumes
High-performance data store
Some options with pros and cons
ksqlDB | Document DB | In-Memory DB | HTAP DB
PROS
+ Does not require format transformation during the whole
flow from streaming platform to services
+ Widely used by service developers, probably already
present in the architecture
+ Good fit to expose a single read view of domain entities
consolidated from different sources
+ Quite easy to handle schema changes
CONS
- Not SQL compliant
- Not a good fit for complex analytical workloads
- Not a good fit to expose business entities whose access
pattern from services is not predictable
- Can have some performance issues at very high
throughput
High-performance data store
Some options with pros and cons
ksqlDB | Document DB | In-Memory DB | HTAP DB
PROS
+ SQL compliant (some of them, not all)
+ Can handle very high throughput
+ Can handle complex analytical queries
+ Good fit to expose read views of domain events and
business events as well
+ TCO can be optimized by selecting the right strategy for
distributing stored data between RAM and disk
CONS
- Requires format transformation from document to
relational and then back to document, when moving data
first from the streaming platform and then to services
- Schema changes performed upstream must be actively
managed
High-performance data store
Some options with pros and cons
ksqlDB | Document DB | In-Memory DB | HTAP DB
PROS
+ Can handle very high throughput
CONS
- Not SQL compliant (in most cases, not all)
- Not a good fit for complex analytical workloads
- Can require format transformation when data is read
from the streaming platform and again when data
is consumed by services
- TCO may not be optimal for huge data volumes
Closing the loop with CQRS
From services back to legacy systems
Legacy System Streaming Platform
Technical
Events
(Speed &
Fidelity)
Domain
Events
(Trusted
Views)
High
Performance
Data Store
Business
Events
(Ease of
consumption)
Commands Micro/Mini
Services
READ
WRITE
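A minimal sketch of the read and write paths, assuming a hypothetical `ShipOrder` command. Plain dictionaries and a list stand in for the data store, the legacy system and the command topic.

```python
read_store = {"order-1": {"status": "NEW"}}   # high-performance data store
legacy_db = {"order-1": {"status": "NEW"}}    # system of record
command_topic = []                            # command events on the event store

def get_order(order_id):
    """Query side: reads are served by the data store only."""
    return read_store[order_id]

def ship_order(order_id):
    """Command side: the service only enqueues a command event."""
    command_topic.append({"type": "ShipOrder", "order_id": order_id})

def legacy_consume_commands():
    """The legacy side applies commands; changes flow back to the read store."""
    while command_topic:
        command = command_topic.pop(0)
        if command["type"] == "ShipOrder":
            legacy_db[command["order_id"]]["status"] = "SHIPPED"
            # In the hub this update would arrive through connectors and events;
            # it is copied directly here to keep the sketch short.
            read_store[command["order_id"]] = dict(legacy_db[command["order_id"]])

ship_order("order-1")
status_before_sync = get_order("order-1")["status"]  # still the old state
legacy_consume_commands()
status_after_sync = get_order("order-1")["status"]   # updated after the round trip
```

The gap between `status_before_sync` and `status_after_sync` is the eventual consistency window that services built on the hub have to tolerate.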
The legacy modernization journey
Offloading, Isolation and Refactoring
1. Legacy Offloading: Legacy System, Digital Integration Hub, Applications
2. Legacy Isolation: Legacy System, Digital Integration Hub, Applications, Anti Corruption Layer, Bubble Context
3. Legacy Refactoring: Digital Integration Hub, Applications, Anti Corruption Layer, Bubble Context
Takeaways
The digital integration hub can be seen as a way of decoupling systems, using data as an anti-corruption layer. Data offloaded into the
integration platform becomes a first-class citizen of the new data-centric architecture.
Benefits
○ Responsive user experience
○ Offload expensive workloads generated by
front-end services from legacy systems
○ Support legacy refactoring
○ Align services to business domain
○ Enable real time analytics
○ Foster a data centric approach to integration
Challenges
○ Adapting the conceptual architecture to your
specific context
○ Assembling different technology components,
possibly from different vendors
○ Operating a complex distributed and loosely coupled
architecture
○ Supporting bidirectional synchronization
○ Designing the domain data models for the business
entities
○ Developing services that can tolerate eventual
consistency
○ Managing organizational politics related to data
ownership
Questions?
Feel free to ask
andrea.gioia@quantyca.it

Digital integration hub: Why, what and how?

  • 1. KAFKA Meetup - December 2021 Andrea Gioia CTO at Quantyca Co-Founder at Blindata Digital Integration Hub: why, what and how?
  • 2. Legacy systems Some truths to face Legacy system are growing in size and number. They are here to stay! If your architecture does not manage legacy systems, legacy systems will menage sooner or later your architecture.
  • 3. Who am I? Not an easy question to answer, but keeping it simple... Andrea Gioia (andrea.gioia@quantyca.it), CTO at Quantyca and co-founder at Blindata. Quantyca (quantyca.it) is a privately owned technology consulting firm based in Italy, specialized in data and metadata management. Blindata (blindata.io) is a SaaS platform that leverages Data Governance and Compliance to empower your Data Management projects.
  • 4. What is legacy modernization ...and why it matters. Digital transformation continuously pushes toward the development of new ● touchpoints in an omnichannel logic (System of Engagement) ● analytical and AI-based services (System of Insight). These new applications are usually integrated with back-office legacy systems with a point-to-point logic. This way of integrating the new with the legacy does not scale in the long term. Because the legacy cannot simply be thrown away, a better integration architecture is needed to modernize it in place. [Diagram labels: System of Engagement; System of Insight; System of Records; Legacy Systems; Application Layer; Integration Layer; point-to-point “spaghetti” integration]
  • 5. Legacy modernization: key business drivers. TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT: Go beyond the limits imposed by legacy systems to improve business agility. COSTS AND RISKS REDUCTION: Rationalize integrations to reduce development and maintenance costs and to avoid uncontrolled access to data. RESILIENCE AND PERFORMANCE IMPROVEMENT: Ensure the uptime of legacy systems even in the face of significant increases in workload.
  • 6. Integration architecture #1: legacy systems take it all. All new functionalities are implemented directly by extending the legacy system or by buying complementary products offered by the same vendor as the legacy system. The integration layer, if present, is limited to an API Gateway that decouples the legacy backend from frontend applications. [Diagram labels: System of Engagement Frontend; System of Insight Frontend; System of Records; Legacy Systems; Application Layer; Integration Layer; API Gateway; SoE & SoI Backend] [Key business drivers: TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT; COSTS AND RISKS REDUCTION; RESILIENCE AND PERFORMANCE IMPROVEMENT]
  • 7. Integration architecture #2: integration rationalization through composite services. Integrations are rationalized through different layers of reusable and composable services. Sourcing services wrap legacy systems, process services orchestrate business processes, and application services provide a backend for frontend applications. [Diagram labels: System of Engagement; System of Insight; System of Records; Legacy Systems; Application Layer; Integration Platform; API Gateway; Request-Based Integration Layer; Composite Services: Application Services, Process Services, Sourcing Services] [Key business drivers: TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT; COSTS AND RISKS REDUCTION; RESILIENCE AND PERFORMANCE IMPROVEMENT]
  • 8. Integration architecture #2: integration rationalization through data virtualization. Integrations are rationalized through different layers of views served by a data virtualization application. The physical layer wraps legacy systems, the business layer exposes the business model, and the application layer provides projections designed to facilitate consumption. [Diagram labels: System of Engagement; System of Insight; System of Records; Legacy Systems; Application Layer; Integration Platform; API Gateway; Request-Based Integration Layer; Virtual DWH: Application Layer, Business Layer, Physical Layer] [Key business drivers: TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT; COSTS AND RISKS REDUCTION; RESILIENCE AND PERFORMANCE IMPROVEMENT]
  • 9. Integration architecture #2: integration rationalization. Composite services and data virtualization can be used in the same architecture. The former is preferred for backing systems of engagement, the latter for backing systems of insight. Both solutions simplify integrations but do not reduce the workload on the backend systems. [Diagram labels: System of Engagement; System of Insight; System of Records; Legacy Systems; Application Layer; Hybrid Integration Platform; API Gateway; Request-Based Integration Layer; Virtual DWH; Composite Services] [Key business drivers: TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT; COSTS AND RISKS REDUCTION; RESILIENCE AND PERFORMANCE IMPROVEMENT]
  • 10. Integration architecture #3: data offloading. Data offloaded from legacy systems is aggregated into a low-latency, high-performance data store accessible via APIs, events or batch. The data store synchronizes with the back ends via event-driven integration patterns. [Diagram labels: System of Engagement; System of Insight; System of Records; Legacy Systems; Application Layer; Hybrid Integration Platform; API Gateway; Event-Based Integration Layer; High-Performance Data Store; Microservices; Metadata Management] [Key business drivers: TIME-TO-MARKET AND BUSINESS AGILITY IMPROVEMENT; COSTS AND RISKS REDUCTION; RESILIENCE AND PERFORMANCE IMPROVEMENT]
  • 11. Digital Integration Hub: key building blocks. Legacy Systems: where the data is stored. Connectors: keep the legacy systems and the high-performance data store in sync, offloading all modifications to relevant data in real time. Event store: transforms technical events coming from connectors into domain and business events that can be consumed downstream by the high-performance data store or other consumers (event-driven integration). High-performance data store: stores domain-specific data, exposing a single consolidated view of entities; supports fast ingestion to reduce the eventual consistency window; can support analytical queries. Services: connect to the high-performance data store for read queries and execute writes on the legacy systems by means of command events pushed to the event store (command query responsibility segregation). Applications: where the data is used.
  • 12. Connectors: data acquisition patterns. Trigger (Push Mode): good for neo-legacies, problematic for old-school legacies. Change Data Capture (Backend Interception): the best option, but CDC connectors can be quite expensive. Active Polling (Pull Mode): it is difficult to find a trade-off that satisfies both the load constraints of the legacy and the real-time needs of applications. Decorating Collaborator (Frontend Interception): widely cited in the literature; good in theory, problematic in reality. Interesting source connectors for legacy modernization available for Kafka are: ○ JDBC Connectors: for active polling ○ Debezium Connector: for CDC from MySQL, Postgres, … ○ Salesforce Connectors: for CDC from Salesforce ○ Oracle Connector: for CDC from Oracle ○ Partner Connectors: for CDC from other legacies like SAP and Mainframe (e.g. Qlik Replicate Connector)
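The active polling pattern from the slide above can be sketched as a query that repeatedly fetches rows changed since a stored high-water mark. This is only an illustration, not how any specific Kafka connector is implemented: the `customers` table, its `updated_at` column, and the `poll_changes` helper are hypothetical, and an in-memory SQLite database stands in for the legacy system.

```python
import sqlite3


def poll_changes(conn, last_seen):
    """Return rows modified after `last_seen` plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Advance the watermark only if something was actually read.
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark


# Hypothetical legacy table with a modification timestamp column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)", [(1, "Ada", 10), (2, "Bob", 20)]
)

# First poll picks up everything; the second picks up only the later update.
first_batch, mark = poll_changes(conn, 0)
conn.execute("UPDATE customers SET name = 'Bobby', updated_at = 30 WHERE id = 2")
second_batch, mark = poll_changes(conn, mark)
```

Every poll is a query against the legacy database, so a shorter polling interval means fresher data but more load, which is the trade-off the slide mentions. A timestamp-based poll like this one also cannot see hard deletes.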
  • 13. Event Store: event-driven integration. [Diagram labels: Legacy System; Streaming Platform with Technical Events (Speed & Fidelity), Domain Events (Trusted Views) and Business Events (Ease of consumption); High-Performance Data Store]
  • 14. Event Store: offloading patterns. One table per topic: changes to each table are mapped to distinct topics, one topic per table. Stream joins are used to create domain events from technical events spread across different topics. Preserving transactional coherence within aggregates can be complex when the aggregate is spread among multiple tables updated by long-running transactions. [Diagram labels: Legacy System; Streaming Platform; Technical Events (Speed & Fidelity); Domain Events (Trusted Views)]
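As a rough illustration of the join described above, the sketch below combines technical events from two per-table topics into one domain event per order. The topic contents, field names, and the `join_order_events` helper are hypothetical, and plain Python lists stand in for Kafka topics.

```python
# Technical events as they might arrive on two per-table topics (hypothetical data).
orders_topic = [{"order_id": "o-1", "status": "PLACED"}]
order_lines_topic = [
    {"order_id": "o-1", "sku": "A-100", "qty": 2},
    {"order_id": "o-1", "sku": "B-200", "qty": 1},
]


def join_order_events(orders, lines):
    """Join header and line events on order_id into one domain event per order."""
    lines_by_order = {}
    for line in lines:
        lines_by_order.setdefault(line["order_id"], []).append(
            {"sku": line["sku"], "qty": line["qty"]}
        )
    return [
        {
            "type": "OrderChanged",
            "order_id": order["order_id"],
            "status": order["status"],
            "lines": lines_by_order.get(order["order_id"], []),
        }
        for order in orders
    ]


domain_events = join_order_events(orders_topic, order_lines_topic)
```

This batch-style join assumes all events of a transaction are already available. A real stream join has to cope with header and line events arriving at different times, which is where preserving transactional coherence becomes difficult.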
  • 15. Event Store: offloading patterns. One aggregate per topic: all changes to tables that are part of the same aggregate are mapped to the same topic. The identifier of the aggregate is used to partition the topic. It is easier to create domain events from technical events while preserving transactional coherence, even with complex aggregates or unpredictable transactional patterns. [Diagram labels: Legacy System; Streaming Platform; Technical Events (Speed & Fidelity); Domain Events (Trusted Views)]
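A minimal sketch of why partitioning by aggregate identifier helps: every change to the same aggregate lands in the same partition, in arrival order, regardless of which table it came from. The event shapes and partition count are hypothetical, and `zlib.crc32` is only a stand-in hash (Kafka's default Java partitioner hashes the key with murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(aggregate_id: str) -> int:
    # Stand-in hash for illustration; not the hash Kafka itself uses.
    return zlib.crc32(aggregate_id.encode("utf-8")) % NUM_PARTITIONS


# Changes from two tables that belong to the same "order" aggregate.
technical_events = [
    {"table": "orders", "order_id": "o-1", "op": "INSERT"},
    {"table": "order_lines", "order_id": "o-1", "op": "INSERT"},
    {"table": "orders", "order_id": "o-2", "op": "INSERT"},
    {"table": "order_lines", "order_id": "o-1", "op": "UPDATE"},
]

# One topic, keyed by aggregate id: partition number -> ordered list of events.
topic = defaultdict(list)
for event in technical_events:
    topic[partition_for(event["order_id"])].append(event)

o1_events = [e for e in topic[partition_for("o-1")] if e["order_id"] == "o-1"]
```

Because a partition is consumed in order, a downstream processor sees the changes to `o-1` in the order they were produced and does not need a cross-topic join to rebuild the aggregate.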
  • 16. Event Store: offloading patterns. Transactional outbox pattern: the legacy system is modified to insert messages/events into an outbox table as part of the local transaction. The modification can be performed at code or database level (e.g. triggers or materialized views). The connector that offloads data to the streaming platform is triggered by the outbox table. [Diagram labels: Legacy System; Streaming Platform; Technical Events (Speed & Fidelity); Domain Events (Trusted Views); OUTBOX Table; TRX: INSERT, UPDATE, DELETE, INSERT (into the outbox), COMMIT]
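The outbox pattern can be sketched at code level as follows. This is a minimal illustration under stated assumptions: the `orders` and `outbox` tables and the `place_order` function are hypothetical, and an in-memory SQLite database stands in for the legacy system. The point is that the business change and the outbox row are written in the same local transaction, so they are committed or rolled back together.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, aggregate_id INTEGER, payload TEXT)"
)
conn.commit()


def place_order(conn, order_id, fail=False):
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (aggregate_id, payload) VALUES (?, ?)",
            (order_id, json.dumps({"type": "OrderPlaced", "id": order_id})),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


place_order(conn, 1)
try:
    place_order(conn, 2, fail=True)  # neither the order nor its event survives
except RuntimeError:
    pass

orders = conn.execute("SELECT id FROM orders").fetchall()
outbox = conn.execute("SELECT aggregate_id, payload FROM outbox ORDER BY seq").fetchall()
```

A connector would then read only the `outbox` table, so it never publishes an event for a change that was rolled back.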
  • 17. Event Store: offloading patterns. Triggered publisher: all changes to tables that are part of the same aggregate are mapped to the same topic as technical events, which can contain only the aggregate id and transaction id as payload. For every transaction id, a stream processor queries the legacy database, extracting the modified aggregate, filtering by id, and publishing it as the payload of a new domain event. To reduce the workload on the legacy, the stream processor can query a read replica. Transactional coherence within the aggregate is guaranteed by the upstream database. [Diagram labels: Legacy System; Streaming Platform; Technical Events (Speed & Fidelity); Domain Events (Trusted Views)]
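The triggered publisher flow can be sketched as a processor that receives thin technical events and re-reads the full aggregate from the database. Everything here is a hypothetical illustration: the tables, the event fields `aggregate_id` and `tx_id`, and the `build_domain_event` helper are assumptions, and an in-memory SQLite database stands in for the legacy database or its read replica.

```python
import sqlite3

# Stand-in for the legacy database (or a read replica of it).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE order_lines (order_id INTEGER, sku TEXT, qty INTEGER);
    INSERT INTO orders VALUES (1, 'PLACED');
    INSERT INTO order_lines VALUES (1, 'A-100', 2), (1, 'B-200', 1);
    """
)


def build_domain_event(conn, technical_event):
    """Re-read the whole aggregate by id and wrap it as a domain event."""
    order_id = technical_event["aggregate_id"]
    status = conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()[0]
    lines = conn.execute(
        "SELECT sku, qty FROM order_lines WHERE order_id = ? ORDER BY sku",
        (order_id,),
    ).fetchall()
    return {
        "type": "OrderChanged",
        "order_id": order_id,
        "status": status,
        "lines": [{"sku": sku, "qty": qty} for sku, qty in lines],
    }


# Thin technical events: one transaction touched two tables of the same aggregate.
technical_events = [
    {"aggregate_id": 1, "tx_id": 77},
    {"aggregate_id": 1, "tx_id": 77},
]

seen = set()
domain_events = []
for event in technical_events:
    key = (event["aggregate_id"], event["tx_id"])
    if key in seen:  # publish once per aggregate and transaction
        continue
    seen.add(key)
    domain_events.append(build_domain_event(conn, event))
```

The aggregate is read in its committed state, so its consistency comes from the database rather than from stitching events together in the stream.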
  • 18. High-performance data store: some options with pros and cons. Options: KSQL DB, Document DB, In-Memory DB, HTAP DB. PROS + Does not require external components + Low latency + Can handle very high throughput + Moving from event to state is simple and requires a small integration effort + Stored data can also be consumed directly by stream processors CONS - Not SQL compliant - Serving to external consumers has some limitations that must be managed directly by the consumers - Not a good fit for complex analytical workloads - TCO may not be optimal for huge data volumes
  • 19. High-performance data store: some options with pros and cons. Options: KSQL DB, Document DB, In-Memory DB, HTAP DB. PROS + Does not require format transformation during the whole flow from streaming platform to services + Widely used by service developers, probably already present in the architecture + Good fit for exposing a single read view of domain entities consolidated from different sources + Quite easy to handle schema changes CONS - Not SQL compliant - Not a good fit for complex analytical workloads - Not a good fit for exposing business entities whose access pattern from services is not predictable - Can have some performance issues at very high throughput
  • 20. High-performance data store: some options with pros and cons. Options: KSQL DB, Document DB, In-Memory DB, HTAP DB. PROS + SQL compliant (some of them, not all) + Can handle very high throughput + Can handle complex analytical queries + Good fit for exposing read views of domain events and business events as well + TCO can be optimized by selecting the right strategy for distributing stored data between RAM and disk CONS - Requires format transformation from document to relational, and then back to document, when moving data first from the streaming platform and then to services - Schema changes performed upstream must be actively managed
  • 21. High-performance data store: some options with pros and cons. Options: KSQL DB, Document DB, In-Memory DB, HTAP DB. PROS + Can handle very high throughput CONS - Not SQL compliant (in most cases, not all) - Not a good fit for complex analytical workloads - Can require format transformation when data is read from the streaming platform, and then again when data is consumed by services - TCO may not be optimal for huge data volumes
  • 22. Closing the loop with CQRS: from services back to legacy systems. [Diagram labels: Legacy System; Streaming Platform with Technical Events (Speed & Fidelity), Domain Events (Trusted Views), Business Events (Ease of consumption) and Commands; High-Performance Data Store; Micro/Mini Services; READ; WRITE]
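The read and write paths on this slide can be sketched as follows: services read from the high-performance data store and write by pushing command events, which the legacy system applies before the change flows back to the store. All names here are hypothetical, and in-memory dictionaries and queues stand in for the legacy database, the data store, and the streaming platform topics.

```python
from collections import deque

legacy_db = {"c-1": {"email": "old@example.com"}}   # system of record
read_store = {"c-1": {"email": "old@example.com"}}  # high-performance data store
commands = deque()  # command topic on the streaming platform
changes = deque()   # change events flowing back from the legacy system


def get_customer(customer_id):
    """Query side: services read only from the high-performance data store."""
    return read_store[customer_id]


def change_email(customer_id, email):
    """Command side: services never write to the legacy system directly."""
    commands.append({"type": "ChangeEmail", "id": customer_id, "email": email})


def legacy_command_handler():
    """Applies pending commands to the legacy system and emits change events."""
    while commands:
        cmd = commands.popleft()
        legacy_db[cmd["id"]]["email"] = cmd["email"]
        changes.append({"id": cmd["id"], "email": cmd["email"]})


def sync_read_store():
    """Applies change events to the read store (the offloading path)."""
    while changes:
        change = changes.popleft()
        read_store[change["id"]]["email"] = change["email"]


change_email("c-1", "new@example.com")
stale = get_customer("c-1")["email"]  # read before the loop has closed
legacy_command_handler()
sync_read_store()
fresh = get_customer("c-1")["email"]  # read after synchronization
```

The `stale` read shows the eventual consistency window: until the command has travelled through the legacy system and back, services still see the old value.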
  • 23. The legacy modernization journey: offloading, isolation and refactoring. 1. Legacy Offloading [diagram labels: Legacy System; Digital Integration Hub; Applications]. 2. Legacy Isolation [diagram labels: Legacy System; Digital Integration Hub; Applications; Anti-Corruption Layer; Bubble Context]. 3. Legacy Refactoring [diagram labels: Digital Integration Hub; Applications; Anti-Corruption Layer; Bubble Context].
  • 24. Takeaways. A digital integration hub can be seen as a way of decoupling systems using data as an anti-corruption layer. Data offloaded into the integration platform becomes a first-class citizen of the new data-centric architecture. Benefits ○ Responsive user experience ○ Offload legacy systems from expensive workloads generated by front-end services ○ Support legacy refactoring ○ Align services to business domain ○ Enable real-time analytics ○ Foster a data-centric approach to integration Challenges ○ Adapting the conceptual architecture to your specific context ○ Assembling different technology components, possibly from different vendors ○ Operating a complex, distributed and loosely coupled architecture ○ Supporting bidirectional synchronization ○ Designing the domain data models for the business entities ○ Developing services that can tolerate eventual consistency ○ Managing organizational politics related to data ownership
  • 25. Questions? Feel free to ask andrea.gioia@quantyca.it