Designing a Feedback Loop for Event-Driven Data Sharing With Teresa Wang | Current 2022


In an integrated business environment where heterogeneous database technologies are deployed, Kafka Connect offers data sinks and sources that enable seamless integration, abstracting data-exchange details from senders and receivers. However, challenges may arise as the integration grows in complexity.

Recently, the Enterprise Business Information Systems division at Jet Propulsion Laboratory was tasked with delivering an event-driven data exchange between two of its major systems. Delivering this solution successfully required conquering complex data dependencies across tables and respecting business and atomicity requirements. However, enterprise data exchanges also require a robust feedback loop to properly identify, disseminate and remediate errors in the process to maintain data integrity and user trust.

In this talk, we will discuss how we overcame these challenges and delivered a fully automated and robust data exchange solution by extending Kafka Connect, leveraging ksqlDB streams/tables and aggregations, and developing custom microservices.



1. Designing a Feedback Loop for Event-Driven Data Sharing, Enabled with Kafka Connect and KSQL. Presented by: Teresa Wang (weijiuan.t.wang@jpl.nasa.gov), Enterprise Business Information Services Division (EBIS), Jet Propulsion Laboratory, Oct. 4th, 2022.
2. Agenda
   • Use Case
   • Challenges
   • Design Principles
   • Solution Design
   • Summary
3. Disclaimer: Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology.
4. Task: Design a fully automated data transport system that...
   • Handles complex data dependencies across tables and respects business and atomicity requirements between the source and target systems
   • Establishes a robust feedback loop to properly notify the sending system about successful deliveries and to identify/remediate errors
   (Diagram: Earned Value Management System; Scope, Budgets, Schedule)
5. Challenges
   • The triggering event is recorded in a staging table, and data related to the triggered event are located in 9 separate tables with foreign-key relationships
   • Some source tables have a large data structure (e.g., > 120 data elements)
     – Cannot use multi-joins in a single source connector for query-based polling across all involved source tables
     – Certain data elements require data-type recasting
   • Determining when the event-based data transport has completed and been received by the destination
     – Each source table contains a varying amount of data; there may be as few as dozens of rows, or more than 100,000
     – How to determine whether an error occurred during data transport?
   • Triggering downstream processing in the source and destination systems upon event completion
6. Design Principles
   • Maximizing the use of Kafka components
     – To ensure data consistency and integrity
     – Extend/augment as necessary
   • Separation of concerns: data transport abstraction means...
     – No need for software engineers to write Kafka producers or consumers
     – Kafka can apply SMTs (Single Message Transforms) for label translation of business domain-specific terms
   • Maintainability & reusability
     – Prefer configuration over coding for change requests
     – Future-proof, with data pipeline extensions in mind
     – The framework is reusable for other use cases
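The SMT-based label translation mentioned above can be expressed with Kafka Connect's built-in ReplaceField transform. The fragment below is a sketch only; the transform alias and the source/target field names are hypothetical, not taken from the talk:

```sql
-- Hypothetical fragment of a connector's WITH clause (ksqlDB syntax):
-- renames business domain-specific column labels in each record value.
'transforms'                 = 'relabel',
'transforms.relabel.type'    = 'org.apache.kafka.connect.transforms.ReplaceField$Value',
'transforms.relabel.renames' = 'PROJ_CD:PROJECT_CODE,ORG_CD:ORGANIZATION_CODE'
```

Because this is pure connector configuration, a rename like this is a change request handled without any producer/consumer code, in line with the "configuration over coding" principle.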
7. Solution Design Components
   • KSQL streams and KTables
   • Control manifest
   • Wiretapping JDBC (connectors)
   • Java microservices
   • Confluent connectors
8. The Basis: A Typical Data Pipeline Design with Source/Sink Connectors
   (Diagram: the event table and 9 data tables feed 9 source connectors into 9 Kafka topics, delivered by 9 sink connectors to the EVM system; question marks highlight the open problems of tracking event status and confirming delivery of the 9 data tables.)
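One per-table, query-based JDBC source connector of the kind shown above can be declared directly from ksqlDB. This is a sketch under assumptions: the connector name, connection URL, table, and column names are hypothetical, not the talk's actual configuration:

```sql
-- Hypothetical sketch: one query-based JDBC source connector per data table.
CREATE SOURCE CONNECTOR program_src WITH (
  'connector.class'          = 'io.confluent.connect.jdbc.JdbcSourceConnector',
  'connection.url'           = 'jdbc:oracle:thin:@//db.example.gov:1521/EBIS',
  -- poll for new/changed rows using a timestamp plus an incrementing id
  'mode'                     = 'timestamp+incrementing',
  'timestamp.column.name'    = 'LAST_MODIFIED',
  'incrementing.column.name' = 'ROW_ID',
  'query'                    = 'SELECT * FROM PROGRAM',
  -- with a custom query, topic.prefix is used as the full topic name
  'topic.prefix'             = 'PROGRAM'
);
```

Because a single multi-join query across all nine tables is not feasible (per the Challenges slide), each table gets its own connector and topic, and the correlation across tables is deferred to the KSQL layer.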
9. Visualized Data Pipeline Design: feedback flows
   (Diagram: source connectors feed Topic PROGRAM and other data topics into KSQL stream PROGRAM_S and KSQL table PROGRAM_T; wiretapped sink connectors feed Topic WIRETAPS into KSQL stream WIRETAPS_S and KSQL table WIRETAPS_T; the pivoted KSQL table/topic EVENT_COMPLETION_T compares running totals in WIRETAPS_T with control totals on the manifest in PROGRAM_T; a Watcher microservice feeds Topic PENDING_ALERTS into KSQL stream PENDING_ALERTS_S and KSQL table/topic ALERTS_T; sink connectors and a Notifications microservice carry the feedback onward.)
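The wiretap side of the pipeline above could be expressed in ksqlDB roughly as follows. This is a sketch under assumed message shapes; the field names and value format are illustrative, since the talk does not show its actual statements:

```sql
-- Hypothetical sketch: each wiretapped sink connector emits one record
-- per delivered batch, keyed by the event it belongs to.
CREATE STREAM WIRETAPS_S (
  EVENT_ID   VARCHAR KEY,
  TABLE_NAME VARCHAR,
  MSG_COUNT  BIGINT
) WITH (KAFKA_TOPIC = 'WIRETAPS', VALUE_FORMAT = 'JSON');

-- Running totals of messages written to the destination, per event.
CREATE TABLE WIRETAPS_T AS
  SELECT EVENT_ID,
         SUM(MSG_COUNT) AS DELIVERED_TOTAL
  FROM WIRETAPS_S
  GROUP BY EVENT_ID;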
10. Summary: An Enterprise Integration Design Pattern
   • Supervisory structure
     – Acquires from the source a manifest for each event
     – Reports unmet expectations with a continuously running Watcher microservice
     – Keeps producing "pending" alerts until an event is either completed or erred
   • Wiretapped sink connectors
     – Capture the number of messages written to the destination during sink-connector deliveries
   • KSQL aggregations
     – For declaring event completion when expectations are met
     – For declaring an alert when the number of pending alerts exceeds a preset threshold
   • Producing feedback
     – Consuming from both the event-completions (successes) and event-errors (alerts) topics
     – Sent to both source and target systems to trigger further processing
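The two KSQL aggregations described above might look like this. Both statements are sketches: PROGRAM_T, WIRETAPS_T, and PENDING_ALERTS_S are the names from the pipeline diagram, but their schemas, the field names, and the alert threshold are assumptions for illustration:

```sql
-- Hypothetical sketch: declare completion when the delivered running
-- total matches the control total on the event's manifest.
CREATE TABLE EVENT_COMPLETION_T AS
  SELECT p.EVENT_ID,
         p.EXPECTED_TOTAL,
         w.DELIVERED_TOTAL
  FROM PROGRAM_T p
  JOIN WIRETAPS_T w ON p.EVENT_ID = w.EVENT_ID
  WHERE w.DELIVERED_TOTAL = p.EXPECTED_TOTAL;

-- Declare an alert once an event has accumulated more than a preset
-- number of "pending" alerts from the Watcher microservice.
CREATE TABLE ALERTS_T AS
  SELECT EVENT_ID,
         COUNT(*) AS PENDING_COUNT
  FROM PENDING_ALERTS_S
  GROUP BY EVENT_ID
  HAVING COUNT(*) > 10;
```

Downstream consumers then read only the EVENT_COMPLETION_T and ALERTS_T topics, so success and error feedback each arrive as a single declarative record rather than requiring per-row validation.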
11. Summary: Why We Like This Solution Design
   • Data transport abstraction: software engineers don't need to code
     – Kafka producers or consumers
     – Web services for data-exchange validation with the target system
   • Leveraging KSQL aggregations for
     – Declaring an event "completion", or
     – Declaring an event "alert"
   • A minimalist approach: less (coding by data engineers) is more!
     – Extending JDBC to enable wiretapping on sink connectors, based on configurable attributes
     – A single-purpose Watcher microservice to produce "pending" alerts
     – Employing an existing notification microservice for multi-channel broadcasting about alerts
12. Acknowledgements
   • Jacob Nowicki, for introducing Kafka to the EBIS Division at JPL
   • Peter Grzegorczyk, for his innovative engineering collaboration
13. jpl.nasa.gov
