Achieving Predictability and Compliance with BNY Mellon’s Data Distribution Hub
Rajesh Kaveti, Senior Principal Developer
May 8, 2017
BNY Mellon
• $30.6 trillion assets under custody and/or administration*
• $1.7 trillion assets under management*
• 100+ markets across the world*
* All figures as of March 31, 2017
BNY Mellon Technology at a Glance
• 11,000+ employees in over 50 cities
• $195M retail accounts serviced
• 200,000 professionals access our services daily
• 8,000 Java virtual machines in production
• 1.6 billion Digital Pulse events/month from 100+ sources
Background
• In response to the financial crisis of 2007-2008, significant changes were made to financial regulation.
• Data management has become a focus in the past 10 years.
• This effort is primarily in response to some of those requirements.
• The main idea was Data Integration across the enterprise and managing lineage and quality.
How do we do this..?
With some kind of ETL…
Seems Simple..
[Diagram: Sources 1-4 each feed the Enterprise Data Warehouse through its own ETL job]
But typically it does not look that neat…
[Diagram: Sources 1-4 and Consumers 1-3 cross-wired through more than a dozen ETL jobs, a DB, Hadoop and a cache]
It is a lot messier than this.. It is more like a Hairball..
[Diagram: Sources 1-4 and Consumers 1-3 connected only through the Distribution Hub]
The idea was… a Distribution Hub
• Sources and consumers are not connected directly.
• Decoupling lets consumers and sources evolve independently.
• The middle layer evolves with technology.
Our Vision:
• Be the trusted, go-to data provider.
• Centralize
- Transformation and enrichment logic
- Security
• Make it easy to manage data lineage
• Monitor to ensure elasticity and consistent performance.
Our Challenges:
• 1000s of systems with different and diverse data structures.
• Needed a flexible schema applied at read time (schema-on-read).
• Given the scale, we didn't know what the end state would look like.
What we really needed was ...
• A platform for this distribution hub
• One that evolves with changing technology and requirements
• A self-service, business-friendly model
• Our own technology, not a vendor tool
• ZERO data LOSS
• Reconcilable
• Centralized Data Lineage
• The basis for an enterprise Data Dictionary
[Diagram: the same sources and consumers, now routed through the Distribution Hub, with an Enrichment step inside the hub]
Micro-services
• Framework built as Microservices
• LEGO blocks – allowing us to use these blocks to transform and morph, making no assumptions about the future.
Our platform uses KAFKA extensively in both message-based and file-based paradigms.
Let’s look at how we used KAFKA to achieve reliability, performance and economy.
Our Scaling Strategy - 3 Dimensions of Scaling*
• Functional Decomposition - MICROSERVICES
• Horizontal Scaling - INTERNAL CLOUD
• Data Splitting - SHARDING
*The Art of Scalability – Martin Abbott and Michael Fisher
Our Scaling Strategy - 3 Dimensions of Scaling

Functional Decomposition - MICROSERVICES
• Separation of work by responsibility.
• Based on micro-services where services are more specialized for a task. Each of these micro-services is exposed via the API Store, resulting in more reuse.
• Tasks that need more CPU can be scaled separately without scaling the entire infrastructure.

Horizontal Scaling - INTERNAL CLOUD
• Cloning of services or data such that work can be easily distributed across instances with absolutely no bias.
• Can be implemented by scaling out using the BNY Mellon Cloud.
• Functional decomposition is required for easy horizontal scaling.

Data Splitting - SHARDING
• As the data grows, handling scale in a horizontally scaled environment gets harder.
• For horizontal scaling, a data split is required to ensure that the memory requirement for each node is consistent.
• Data is split across a set of servers; each server only deals with a subset of data, which improves memory management and transaction scalability (see the sketch below).
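To make the sharding idea concrete, here is a minimal sketch in Java. The accountId key and fixed shard count are hypothetical, not the deck's actual scheme; a real deployment would also need a rebalancing strategy when the shard count changes.

```java
public class ShardRouter {
    // Stable hash routing: a given key always maps to the same shard, so each
    // server only ever deals with a bounded subset of the data.
    static int shardFor(String accountId, int shardCount) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(accountId.hashCode(), shardCount);
    }
}
```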
Functional Decomposition
2 Main Pillars:
• Inbound
• Outbound
Monolith to Micro-service
We started with a monolith.
[Diagram: the monolithic pipeline - Extraction, Validation, Staging, Enrichment, Distribution - decomposed into rule-driven Inbound and Outbound micro-services covering the same stages]
In both cases, we used Vertica as the way to stage our data.
[Pipeline: Extraction → Validation → Staging → Enrichment → Distribution]
Vertica – SQL Analytics Platform
Blazingly fast and scalable; runs on Hadoop, on premise, and in the cloud.
Traditional File Processing
Batch-based processing for files:
• Started with Hadoop
• Moved to traditional batch-based processing
• Batch-based - simple, but not scalable
• Resulted in missed SLAs and was very resource intensive
[Pipeline: Extraction → Validation → Staging → Enrichment → Distribution]
Integrated all micro-services using KAFKA.
Maintained state using OFFSETS, so we knew exactly where to restart if things failed (a minimal consumer sketch follows).
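A hedged sketch of that restart pattern with the Kafka Java consumer (newer kafka-clients API): auto-commit is off and the offset is committed only after the work succeeds, so a crashed instance resumes from the last committed offset. The broker address, group id and topic name are placeholders, not the deck's actual configuration.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SplitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka01:9092");  // hypothetical broker
        props.put("group.id", "hub-inbound");            // hypothetical group
        props.put("enable.auto.commit", "false");        // commit only after work succeeds
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("file-splits")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value());  // stage the split; must succeed before commit
                }
                consumer.commitSync();        // the committed offset is the restart point
            }
        }
    }
    private static void process(String splitInstruction) { /* validate + stage */ }
}
```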
Second Dimension - Horizontal Scaling
- KAFKA is the basis for integrating all micro-services.
- Partitions in KAFKA allowed separate paths for different files: larger files would not impact smaller files.
- File splits would run concurrently; for services that needed synchronous behavior, we used request-response.
- Maintained state using OFFSETS, so we knew exactly where to start from if things failed.
[Diagram: a KAFKA TOPIC in an active-active, multi-region setup. Each PARTITION contains the split instructions for one file; splits for a specific file are processed concurrently; KAFKA syncs the regions, and the cluster ensures there is no Single Point of Failure.]
A keyed-producer sketch follows.
Third Dimension - Data Sharding
[Diagram: the client sends a file; each file is split into smaller files (splits 1-5), which flow through KAFKA into Vertica concurrently and are then reconciled]
Instead of JDBC inserts, we use the COPY construct in Vertica.
Larger files are broken into smaller files, and those jobs can be sent to a cluster of servers.
We could now transfer files in parallel (a load sketch follows).
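A sketch of loading one split with Vertica's bulk COPY rather than row-by-row JDBC inserts; many of these loads can run in parallel, one per split. The connection string, table name and COPY options are illustrative; check the exact syntax against your Vertica version.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SplitLoader {
    // Bulk-loads one split file into a staging table.
    public static void loadSplit(String splitPath) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:vertica://vertica01:5433/hub", "dbadmin", "secret"); // hypothetical
             Statement stmt = conn.createStatement()) {
            // COPY ... FROM LOCAL streams the file through the JDBC session;
            // REJECTED DATA keeps bad rows aside so the load can be reconciled.
            stmt.execute("COPY staging.trades FROM LOCAL '" + splitPath + "' "
                    + "DELIMITER '|' REJECTED DATA '/tmp/rejects.txt' DIRECT");
        }
    }
}
```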
Resolved Concerns
• The COPY statement in Vertica bulk-loads data into an HPE Vertica database.
• KAFKA maintained the offsets, which meant a failure would not require us to restart from scratch.
• Idempotency was a key gain: the client could simply call again, and the system would know where to restart from.
• Network timeouts and similar problems were no longer an issue.
• Restarting the job - we could now resume even when it failed midway.
Ability to Fine-Tune Our Engine
Instances, split size and memory were three dimensions that we could play with, based on the availability of hardware (a sizing sketch follows).
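A toy illustration of how those three dials interact; the heuristic is hypothetical and stands in for the deck's actual tuning logic:

```java
public class EngineTuning {
    // How many splits a file of a given size produces for a given split size;
    // per-instance memory bounds the largest split size you can afford.
    static int splitsFor(long fileBytes, long splitBytes) {
        return (int) Math.max(1, (fileBytes + splitBytes - 1) / splitBytes);
    }

    // Effective parallelism is capped by the instances available.
    static int concurrency(int splits, int instances) {
        return Math.min(splits, instances);
    }
}
```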
Messaging
Real time for messages:
• STORM for managing the streams
• Kafka as a queue
• Vertica to stage the data and create outbounds
[Pipeline: Extraction → Validation → Staging → Enrichment → Distribution]
A topology sketch follows.
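A skeletal Storm 1.x topology in the shape the slide describes, using the storm-kafka-client spout; the bolt, broker and topic names are stand-ins, not BNY Mellon's actual components:

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

public class MessageHubTopology {
    // Hypothetical bolt standing in for the validation/staging micro-services.
    public static class ValidateBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            // "value" is the Kafka message payload emitted by the spout
            String payload = tuple.getStringByField("value");
            // validate, then hand off to staging here
        }
        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) { }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // Kafka feeds the topology; the spout tracks its own consumer offsets.
        builder.setSpout("messages", new KafkaSpout<>(
                KafkaSpoutConfig.builder("kafka01:9092", "inbound-messages").build()));
        builder.setBolt("validate", new ValidateBolt(), 4).shuffleGrouping("messages");
        new LocalCluster().submitTopology("message-hub", new Config(),
                builder.createTopology());
    }
}
```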
Typical Real time Streaming Architecture
[Diagram: five Storm topologies chained together through three KAFKA clusters]
Vertica-KAFKA Integration
• Kafka is designed for a streaming use case
• In Vertica, a streaming effect can be achieved by running a series of COPY statements
• BUT - that process can become tedious and complex
• So we used the Kafka integration to load data into the Vertica database (sketched below)
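A sketch of that integration: Vertica's KafkaSource lets a single COPY read a topic directly, replacing the hand-rolled series of COPY statements. The parameters follow the Vertica 8.x documentation pattern, but the table, topic and brokers are hypothetical; verify the syntax against your version.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class KafkaToVertica {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:vertica://vertica01:5433/hub", "dbadmin", "secret"); // hypothetical
             Statement stmt = conn.createStatement()) {
            // One micro-batch: read partition 0 of the topic from the earliest
            // offset (-2) for 10 seconds, parsing JSON straight into staging.
            stmt.execute("COPY staging.events "
                    + "SOURCE KafkaSource(stream='inbound-events|0|-2', "
                    + "brokers='kafka01:9092', duration=interval '10 seconds') "
                    + "PARSER KafkaJSONParser()");
        }
    }
}
```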
Performance - 4 TIMES THROUGHPUT
Resource Consumption - 70-80% Gain
[Chart: percent difference between JDBC & KAFKA Excavator]
This will be an ever-evolving and dynamic project. We have changed, evolved and redesigned so that it is easy for us to evolve and stay forward-looking.
Thanks

Weitere ähnliche Inhalte

Was ist angesagt?

EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...
EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...
EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...HostedbyConfluent
 
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...HostedbyConfluent
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...HostedbyConfluent
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streamsconfluent
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluentconfluent
 
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...HostedbyConfluent
 
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, PorscheHostedbyConfluent
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...HostedbyConfluent
 
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterStream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterHostedbyConfluent
 
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...confluent
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...HostedbyConfluent
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
 
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...HostedbyConfluent
 
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...HostedbyConfluent
 
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...confluent
 
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...HostedbyConfluent
 
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...confluent
 
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...HostedbyConfluent
 
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...HostedbyConfluent
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIconfluent
 

Was ist angesagt? (20)

EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...
EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...
EDA Governance Model: a multicloud approach based on GitOps | Alejandro Alija...
 
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
 
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
5 lessons learned for successful migration to Confluent cloud | Natan Silinit...
 
Real-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka StreamsReal-Time Market Data Analytics Using Kafka Streams
Real-Time Market Data Analytics Using Kafka Streams
 
Elastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using ConfluentElastically Scaling Kafka Using Confluent
Elastically Scaling Kafka Using Confluent
 
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
 
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche
0-330km/h: Porsche's Data Streaming Journey | Sridhar Mamella, Porsche
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
 
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, JupiterStream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
Stream Processing with Kafka and KSQL in Jupiter | Namit Mahuvakar, Jupiter
 
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...
The Migration to Event-Driven Microservices (Adam Bellemare, Flipp) Kafka Sum...
 
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
Stream processing IoT time series data with Kafka & InfluxDB | Al Sargent, In...
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
 
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...
Why Kafka Works the Way It Does (And Not Some Other Way) | Tim Berglund, Conf...
 
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...
Server Sent Events using Reactive Kafka and Spring Web flux | Gagan Solur Ven...
 
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
Kafka Summit SF 2017 - Providing Reliability Guarantees in Kafka at One Trill...
 
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...
Building a Codeless Log Pipeline w/ Confluent Sink Connector | Pollyanna Vale...
 
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...
Kafka Summit SF 2017 - Worldwide Scalable and Resilient Messaging Services wi...
 
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...
Exposing and Controlling Kafka Event Streaming with Kong Konnect Enterprise |...
 
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
 

Ähnlich wie Kafka Summit NYC 2017 - Achieving Predictability and Compliance with BNY Mellon's Data Distribution Hub

Including All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsIncluding All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsPrecisely
 
Including All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsIncluding All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsDATAVERSITY
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL DatabaseNuoDB
 
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeBig Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeDenodo
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItDenodo
 
Key Database Criteria for Cloud Applications
Key Database Criteria for Cloud ApplicationsKey Database Criteria for Cloud Applications
Key Database Criteria for Cloud ApplicationsNuoDB
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Denodo
 
Evolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the CloudEvolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the CloudDenodo
 
The Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationThe Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationInside Analysis
 
Overcoming Your Data Integration Challenges
Overcoming Your Data Integration Challenges Overcoming Your Data Integration Challenges
Overcoming Your Data Integration Challenges Precisely
 
Address Your Blind Spots Around Mission-Critical Data
Address Your Blind Spots Around Mission-Critical Data Address Your Blind Spots Around Mission-Critical Data
Address Your Blind Spots Around Mission-Critical Data Precisely
 
Modernizing Data Architecture using Data Virtualization for Agile Data Delivery
Modernizing Data Architecture using Data Virtualization for Agile Data DeliveryModernizing Data Architecture using Data Virtualization for Agile Data Delivery
Modernizing Data Architecture using Data Virtualization for Agile Data DeliveryDenodo
 
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics Bring Your Mission-Critical Data to Your Cloud Apps and Analytics
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics Precisely
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
How Yelp Leapt to Microservices with More than a Message Queue
How Yelp Leapt to Microservices with More than a Message QueueHow Yelp Leapt to Microservices with More than a Message Queue
How Yelp Leapt to Microservices with More than a Message Queueconfluent
 
Digital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraDigital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraAttunity
 
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentApache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentHostedbyConfluent
 
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...Precisely
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Societyconfluent
 

Ähnlich wie Kafka Summit NYC 2017 - Achieving Predictability and Compliance with BNY Mellon's Data Distribution Hub (20)

Including All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsIncluding All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and Analytics
 
Including All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and AnalyticsIncluding All Your Mission-Critical Data in Modern Apps and Analytics
Including All Your Mission-Critical Data in Modern Apps and Analytics
 
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database451 Research + NuoDB: What It Means to be a Container-Native SQL Database
451 Research + NuoDB: What It Means to be a Container-Native SQL Database
 
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeBig Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need It
 
Key Database Criteria for Cloud Applications
Key Database Criteria for Cloud ApplicationsKey Database Criteria for Cloud Applications
Key Database Criteria for Cloud Applications
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
 
Evolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the CloudEvolving From Monolithic to Distributed Architecture Patterns in the Cloud
Evolving From Monolithic to Distributed Architecture Patterns in the Cloud
 
The Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationThe Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data Implementation
 
Overcoming Your Data Integration Challenges
Overcoming Your Data Integration Challenges Overcoming Your Data Integration Challenges
Overcoming Your Data Integration Challenges
 
Address Your Blind Spots Around Mission-Critical Data
Address Your Blind Spots Around Mission-Critical Data Address Your Blind Spots Around Mission-Critical Data
Address Your Blind Spots Around Mission-Critical Data
 
Modernizing Data Architecture using Data Virtualization for Agile Data Delivery
Modernizing Data Architecture using Data Virtualization for Agile Data DeliveryModernizing Data Architecture using Data Virtualization for Agile Data Delivery
Modernizing Data Architecture using Data Virtualization for Agile Data Delivery
 
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics Bring Your Mission-Critical Data to Your Cloud Apps and Analytics
Bring Your Mission-Critical Data to Your Cloud Apps and Analytics
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
How Yelp Leapt to Microservices with More than a Message Queue
How Yelp Leapt to Microservices with More than a Message QueueHow Yelp Leapt to Microservices with More than a Message Queue
How Yelp Leapt to Microservices with More than a Message Queue
 
Digital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraDigital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming Era
 
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, ConfluentApache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
Apache Kafka and the Data Mesh | Ben Stopford and Michael Noll, Confluent
 
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...
Accelerate Innovation by Bringing all Your Mission-Critical Data to Your Clou...
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Society
 

Mehr von confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

Mehr von confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Dive
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Mesh
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservices
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernization
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time data
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesis
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streams
 

Kürzlich hochgeladen

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceanilsa9823
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 

Kürzlich hochgeladen (20)

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 

Kafka Summit NYC 2017 - Achieving Predictability and Compliance with BNY Mellon's Data Distribution Hub

  • 1. Achieving Predictability and Compliance with BNY Mellon’s Data Distribution Hub Rajesh Kaveti May 8, 2017 Senior Principal Developer
  • 2. 2 Information Classification: Confidential $30.6 trillion $1.7 trillion 100+ markets assets under custody and/or administration* assets under management* across the world* * All figures as of March 31, 2017 BNY Mellon
  • 3. 3 Information Classification: Confidential 11,000+ employees in over 50 cities $195M retail accounts serviced 200,000 professionals access our services daily 8,000 Java virtual machines in production 1.6 billion Digital Pulse events/ month from 100+ sources BNY Mellon Technology at a Glance
  • 4. 4 Information Classification: Confidential As a response to the financial crisis of 2007-2008, significant changes were done to financial regulation. Background DATA management has become a focus in the past 10 years. quality. This effort is primarily in response to some of those requirements. The main idea was around Data Integration across the enterprise and managing lineage and quality. quality.
  • 5. 5 Information Classification: Confidential With some kind of ETL… How do we do this..?
  • 6. 6 Information Classification: Confidential Source 1 Source 3 Source 2 Source 4 Enterprise Data Warehouse ETL ETL ETL ETL Seems Simple..
  • 7. 7 Information Classification: Confidential But typically it does not look that neat…
  • 8. 8 Information Classification: Confidential Consumer 1 Source 1 Source 3 Source 2 Source 4 Consumer 2 Consumer 3 DB HADOOP ETL ETL ETL ETLETL ETL ETL ETL ETL ETL Cache ETL ETL ETL ETL It is lot messier than this.. It is more like a Hairball..
  • 9. 9 Information Classification: Confidential Consumer 1Source 1 Source 3 Source 2 Source 4 Consumer 2 Consumer 3 Distribution Hub
  • 10. 10 Information Classification: Confidential Source and Consumer are not connected directly Decoupling helps to have consumers and sources evolve. Middle Layer evolving with technology The idea was… Distribution Hub
  • 11. 11 Information Classification: Confidential Our Vision: • Be the trusted, go-to data provider. • Centralize - Transformation and enrichment logic - Security • Make it easy to manage data lineage • Monitor to ensure elasticity and consistent performance.
  • 12. 12 Information Classification: Confidential • 1000s of systems with different and diverse data structures. • Needed a flexible read-level schema. • Given the size, didn't know what end state would look like. Our Challenges:
  • 13. 13 Information Classification: Confidential A platform for this distribution hub One that evolves with changing technology and requirements Self service, business friendly model Our own technology, not a vendor tool ZERO data LOSS Reconcilable Centralized Data Lineage Be the basis for enterprise Data Dictionary What we really needed was ...
  • 14. 14 Information Classification: Confidential14 Consumer 1 Source 1 Source 3 Source 2 Source 4 Consumer 2 Consumer 3 Distribution Hub Source 1 Enrichment
  • 15. 15 Information Classification: Confidential Micro-services • Framework built as Microservices • LEGO blocks – Allowing us to use these block to transform and morph and making no assumptions about the future.
  • 16. 16 Information Classification: Confidential Our platform uses KAFKA extensively in both message and file-based paradigms Let’s look at how we used KAFKA to achieve reliability, performance and economy.
  • 17. 17 Information Classification: Confidential Functional Decomposition- MICROSERVICES Horizontal Scaling- INTERNAL CLOUD Data Splitting - SHARDING Our Scaling Strategy - 3 Dimensions of Scaling* *The Art of Scalability – Martin Abott and Michael Fisher
  • 18. 18 Information Classification: Confidential Functional Decomposition- MICROSERVICES Horizontal Scaling- INTERNAL CLOUD Data Splitting - SHARDING Our Scaling Strategy - 3 Dimensions of Scaling • Separation of Work by responsibility. • Based on micro-services where services are more specialized for a task. Each of these micro- services exposed via API Store resulting in more reuse. • Tasks that need more CPU would be scaled separately without scaling the entire infrastructure. • Cloning of services or data such that work can be easily distributed across instances with absolutely no bias. • Can be implemented by scaling out using BNY Mellon Cloud. • Functional Decomposition required for easy horizontal scaling. • As the data grows, the ability to handle scale in a horizontal scale environment gets harder • For horizontal scaling, data split is required to ensure that the memory requirement for each node is consistent. • Idea of data splitting across a set of servers. Each server only deals with a subset of data and in the process improves memory management and transaction scalability.
  • 19. 19 Information Classification: Confidential 2 Main Pillars • Inbound • Outbound Functional Decomposition
  • 20. 20 Information Classification: Confidential Extraction Extraction Validation Staging Enrichment Distribution Validation Staging Enrichment Distribution RULES Monolith to Micro-service Inbound Outbound We started with Monolith Inbound Outbound
  • 21. 21 Information Classification: Confidential In both cases, we used Vertica as the way to stage our data Extraction Validation Staging Enrichment Distribution
  • 22. Vertica – SQL analytics platform: blazingly fast and scalable; runs on Hadoop, on premise, or in the cloud.
  • 23. Batch-based processing for files: traditional file processing; we started with Hadoop, then moved to traditional batch-based processing; batch-based was simple but not scalable, resulting in missed SLAs and heavy resource consumption. [Pipeline: Extraction → Validation → Staging → Enrichment → Distribution.]
  • 24. We integrated all micro-services using KAFKA, maintaining state using OFFSETS, so we knew exactly where to restart if things failed.
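
To make the offset-based recovery concrete, here is a minimal sketch, assuming a plain Apache Kafka Java consumer with auto-commit disabled; the topic name, broker address, and processing step are all hypothetical, not the team's actual code. The consumer commits offsets only after a unit of work completes, so a restarted worker resumes from the last committed position.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SplitWorker {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "file-split-workers");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit manually
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("file-splits")); // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        processSplit(record.value()); // stage one split, e.g. via Vertica COPY
                    }
                    consumer.commitSync(); // offset advances only after the work is done
                }
            }
        }

        private static void processSplit(String splitInstruction) {
            // ... load the split, then reconcile ...
        }
    }

Disabling auto-commit is what makes "knew exactly where to start from" literal: an unprocessed record is simply redelivered after a crash.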
  • 25. Second dimension – horizontal scaling: KAFKA is the basis for integrating all micro-services; partitions in KAFKA allowed separate paths for different files, so larger files would not impact smaller files and file splits could run concurrently; each partition carries the split instructions for a file, and the splits for a specific file are processed concurrently; for services that need synchronous behavior, we used request-response; state is maintained via OFFSETS, so we knew exactly where to restart on failure; KAFKA syncs multiple regions in an active-active cluster, ensuring there is no single point of failure.
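
One plausible way to realize the "separate paths per file" idea, shown here as a sketch under assumptions rather than the actual implementation, is to key each split instruction by a file identifier: Kafka's default partitioner hashes the key, so all of one file's splits land on the same partition while unrelated files flow down other partitions.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class SplitPublisher {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // hypothetical
            props.put(ProducerConfig.ACKS_CONFIG, "all"); // wait for replication: no data loss
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringSerializer");
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                String fileId = "positions-20170508"; // hypothetical file identifier
                for (int split = 1; split <= 5; split++) {
                    // key = fileId, so every split of this file routes to one partition
                    producer.send(new ProducerRecord<>("file-splits", fileId,
                                                       fileId + "|split-" + split));
                }
            }
        }
    }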
  • 26. Third dimension – data sharding: instead of JDBC inserts, we use the COPY construct in Vertica; each file is split into smaller files, the splits for a file are processed concurrently, and those jobs are sent to a cluster of servers, so we could now transfer files in parallel; once loaded, the splits are reconciled in Vertica. [Diagram: a client file split into five parts, distributed via KAFKA, loaded into Vertica, then reconciled.]
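
As a hedged illustration of "COPY instead of JDBC inserts", a worker that receives one split could load it with a single Vertica COPY ... FROM LOCAL statement issued over JDBC. The connection URL, table, credentials, and file path below are invented for the example, and the Vertica JDBC driver is assumed to be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class SplitLoader {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:vertica://vertica-host:5433/hubdb"; // hypothetical
            try (Connection conn = DriverManager.getConnection(url, "loader", "secret");
                 Statement stmt = conn.createStatement()) {
                // One bulk COPY per split; FROM LOCAL streams the file from the client.
                int rows = stmt.executeUpdate(
                    "COPY staging.positions FROM LOCAL '/data/splits/positions-3.csv' "
                    + "DELIMITER ',' ABORT ON ERROR");
                System.out.println("Loaded rows: " + rows); // reconcile against expected count
            }
        }
    }

Because each split is an independent COPY, different workers can load splits in parallel and the results can be reconciled afterwards against expected row counts.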
  • 27. Resolved concerns: the COPY statement in Vertica bulk-loads data into an HPE Vertica database; KAFKA maintained the offsets, so a failure no longer required a full restart; idempotency was a key gain, as the client could simply call again and the system would know where to resume; network timeouts and similar issues were no longer a problem; jobs could be restarted even when they failed midway.
  • 28. Ability to fine-tune our engine: instances, split size, and memory were three dimensions that we could play with based on the availability of hardware.
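
A minimal sketch of what such tuning knobs might look like, with entirely hypothetical property names; the point is only that instances, split size, and memory are independent dials an operator can turn to match the hardware.

    public final class EngineTuning {
        public static void main(String[] args) {
            // Read three independent tunables from system properties (names invented).
            int instances   = Integer.getInteger("hub.worker.instances", 8);        // parallel workers
            long splitBytes = Long.getLong("hub.split.bytes", 256L * 1024 * 1024);  // 256 MB splits
            long heapBytes  = Long.getLong("hub.worker.heapBytes", 2L * 1024 * 1024 * 1024);
            System.out.printf("instances=%d splitBytes=%d heapBytes=%d%n",
                              instances, splitBytes, heapBytes);
        }
    }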
  • 29. Real time for messages: STORM for managing the streams, Kafka as a queue, and Vertica to stage the data and create outbounds. [Pipeline: Extraction → Validation → Staging → Enrichment → Distribution.]
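
A bare-bones sketch of this streaming leg, assuming the storm-kafka-client spout; the topology name, topic, broker address, and bolt are invented, and the enrichment logic is elided.

    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.kafka.spout.KafkaSpout;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Tuple;

    public class MessageHubTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // Kafka acts as the queue feeding the Storm topology.
            builder.setSpout("messages", new KafkaSpout<>(
                KafkaSpoutConfig.builder("broker1:9092", "inbound-messages").build()));
            builder.setBolt("enrich", new EnrichBolt(), 4).shuffleGrouping("messages");
            StormSubmitter.submitTopology("message-hub", new Config(), builder.createTopology());
        }

        public static class EnrichBolt extends BaseBasicBolt {
            @Override
            public void execute(Tuple input, BasicOutputCollector collector) {
                // validate + enrich each message, then hand off for staging in Vertica
            }
            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) { }
        }
    }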
  • 30. Typical real-time streaming architecture. [Diagram: Storm topologies chained together through KAFKA topics.]
  • 31. Vertica–KAFKA integration: Kafka is designed for a streaming use case; in Vertica, a streaming effect can be achieved by running a series of COPY statements, but that process can become tedious and complex, so we used the Kafka integration to load data into the Vertica database.
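
For contrast, the "series of COPY statements" approach the slide mentions could look roughly like this hand-rolled micro-batcher (all names hypothetical): drain a small batch from Kafka, write it to a temporary file, COPY it into Vertica, and commit offsets only after the load succeeds. Vertica's built-in Kafka integration automates essentially this loop.

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;
    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class MicroBatchLoader {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // hypothetical
            props.put("group.id", "vertica-microbatch");
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
                 Connection conn = DriverManager.getConnection(
                     "jdbc:vertica://vertica-host:5433/hubdb", "loader", "secret")) {
                consumer.subscribe(Collections.singletonList("inbound-messages"));
                while (true) {
                    List<String> rows = new ArrayList<>();
                    for (ConsumerRecord<String, String> r : consumer.poll(Duration.ofSeconds(5))) {
                        rows.add(r.value());
                    }
                    if (rows.isEmpty()) continue;
                    Path batch = Files.createTempFile("batch", ".csv");
                    Files.write(batch, rows);
                    try (Statement stmt = conn.createStatement()) {
                        stmt.executeUpdate("COPY staging.events FROM LOCAL '"
                                           + batch + "' DELIMITER ','");
                    }
                    consumer.commitSync(); // offsets advance only after the COPY succeeds
                    Files.delete(batch);
                }
            }
        }
    }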
  • 32. Performance – 4 TIMES THROUGHPUT.
  • 33. Consumption – 70–80% gain in resource consumption (percent difference between JDBC and KAFKA Excavator).
  • 34. This will be an ever-evolving and dynamic project; we have changed, adapted, and redesigned so that it is easy for us to keep evolving and stay forward-looking. Thanks.

Editor's notes

  1. We have 1000s of systems with different and diverse data structures. We needed a flexible read-level schema; one cannot create a global schema to accommodate all the diversity. Given the size, we really didn't know what our end state would look like.
  2. What we really needed was to build a platform for this distribution hub which evolves with changing technology and requirements. At the same time, it provides a self-service model that is business friendly, and that is the only way to be cost effective. We needed our own technology and not a vendor tool: a true platform where business users can build their own custom transformation rules and which responds to all changing trends and thinking. ZERO data LOSS, and reconcilable. It serves as a centralized data hub between sources and consumers, enriching, transforming, and delivering data across the company. It is the brains behind everything around data movement: it enriches, transforms, and delivers data.
  3. And this is what we built: the framework is built as a collection of micro-services, like LEGO blocks, allowing us to morph this product over time while making no assumptions about the future.
  4. What were our pillars for functional decomposition? At the highest level: Inbound and Outbound. Source and consumer are not connected directly (no point-to-point connections). Decoupling helps consumers and sources evolve, each with different volume and performance requirements. The middle layer evolves with technology without ever changing the business rules or the interfaces.
  5. Vertica is a blazingly fast SQL analytics platform based on a columnar, MPP architecture. It supports advanced analytics and machine learning functions. Vertica is an ACID-compliant database that supports an ANSI SQL interface to query your data.
  6. KAFKA is the basis for integrating all microservices. Partitions in KAFKA allowed us to build separate paths for different files, so larger files would not impact smaller files. File splits run concurrently, as each split is handled by a group of nodes dedicated to that partition. We used KAFKA to load-balance across multiple REGIONS; load balancing was out of the box. We created separate paths for larger and smaller files. To increase concurrency, we split large files into multiple smaller files and processed them in parallel, again using KAFKA partitions, which maintain the state. Clusters listen on partitions.
  7. The COPY statement in Vertica bulk-loads data into an HPE Vertica database. One can initiate loading one or more files or pipes on a cluster host or on a client system (using the COPY LOCAL option). This helped us with some classic database issues; for instance, throughput when loading into RDBMS databases is a general concern in the ETL space and typically creates bottlenecks and backpressure, and Vertica COPY was very efficient. Once everything was processed, we reconciled and merged all the files. KAFKA maintained the offsets, which meant a failure would not require a restart from scratch. Idempotency was a key gain, as the client could just call again and the system would know where to restart from. Previously, a really large file meant network timeouts and other issues outside our control, and restarting the job meant more delay; now we could resume even when a job failed in between.
  8. Ability to fine-tune our engine: instances, split size, and memory were three dimensions that we could play with based on the availability of the hardware.
  9. Kafka is designed for a streaming use case (high volumes of data with low latency). In Vertica, one can achieve this streaming effect by running a series of COPY statements, each of which loads a small amount of data into the database. However, this process can become tedious and complex. Instead, we used the Kafka integration feature to automatically load data into the Vertica database as it streams through Kafka.