SlideShare a Scribd company logo
1 of 26
Download to read offline
Building a Dynamic DSL Rules
Engine with Kafka Streams
Will LaForest
Field CTO
Michael Peacock
Field Engineer
2
Why Something Other Than These
FRAMEWORKS
ksqlDB
Stream Processing Frameworks General Purpose
3
Stream processing very powerful
BUT
● Uses GPLs (Java, Python) or SQL
● 99.2% of workforce are NOT
coders
● Maybe SQL is easier
○ Still general purpose
○ Impractical for some
domains
99% 1%
• Volume of rules high? 100s or
1000s?
• If each rule is a stream
processing job
• Each a consumer and
producer
• Same network IO 1000x
• Process overhead
• Serialization/Deserialization
Technical Challenges
4
Cartoon style. Gnome sees something gross and vomits
Technical Challenges
5
• Multiple dynamic output
topics?
• Change rate of ruleset high?
• Minimize time to to
processing?
• Must be dynamic
• Don’t compile and deploy
• Don’t create a new stream
processor
• Don’t restart topology if
possible
Anyone Know what this is?
6
to spiral
make "n 1
while [:n < 100] [
make "n :n + 5
fd :n rt 90
]
end
Domain Specific Languages
Purpose built for a domain
● HTML web pages
● Regex for patterns
● MATLAB for numerical analysis
● GraphQL - Querying APIs
● Sigma for cyber and log data
If serialized as JSON, YAML, XML
● DSL as Data
● Apply data oriented operations
● Validate, secure, govern as data
● Can be transformed easily
● Generated from domain specific tool
● Dynamic
The Cybersecurity Domain
A Typical SIEM Platform Architecture
Agents /
Collectors
Reports and
alerts
Search and
investigate
Monitor and
analyze
SIEM
Tool
Network traffic
Firewall logs
RDBMS
Application logs
HTTP proxy logs
Forensic
Archive
AI/ML
Architecture with Kafka
Curate
Real-time Detection
APP SIEM
Index
SOAR
HDFS
S3
Big Query
Syslog
Network traffic
Firewall logs
RDBMS
Application logs
HTTP proxy logs
QRadar
Arcsight
Splunk
Machine Data
spooldir (files), SNMP Traps,
Databases, Sftp, MQs, S2S
Elastic
Confluent Sigma
https://github.com/confluentinc/confluent-sigma
What is Sigma?
12
https://github.com/SigmaHQ/sigma
● Sigma is an open signature (patterns) format
● Focused on log and network
● Researchers or analysts can describe developed
detection patterns
● Shareable with others (IMPORTANT)
● Rules schema agnostic
● Platform agnostic
○ Same rule can be be used for
○ Elastic, Splunk, Azure Sentinel, (now Kafka)
Confluent Sigma
13
Sigma Rules
(YAML)
Source
Data
Sigma Stream
Processor
SIEM
Applications
Cold storage
High Risk Detections
SOAR
Filtered
Cold retention
Data
Rules
Why Kafka Streams API?
14
● Simple to build a topology
● Easy to run as app
● No required runtime
● Start small easily
App 1
● Scale massively
● All you need is Kafka
● Can run anywhere!
A Walkthrough of the Sigma
Project
A Walkthrough of the Sigma Project
16
sigma-parser
sigma-streams
sigma-streams-ui
https://github.com/confluentinc/confluent-sigma
Sigma Stream Topologies
Simple Topology
supports one
sub-topology to
many rules
Aggregate Topology
requires one
sub-topology for
each rule
Simple Topology
18
flatMapValues KStream
iterate through each rule for the
processor (product/service
filtering)
validates the streaming data
against the DSL rule and add the
results to the output list if a
match is found
optional config variable to return
after the first match or return all
matches
dynamic output topic flatMapValues
Create a new KStream by transforming the value of
each record in this stream into zero or more
values with the same key in the new stream.
{
"@stream": "dns",
"@system": "bobs.bigwheel.local",
"@proc": "zeek",
"ts": 1588205199.82437,
"uid": "Cvf4XX17hSAgXDdGEd",
"id_orig_h": "10.0.1.6",
"id_orig_p": 54243,
"id_resp_h": "10.0.0.4",
"id_resp_p": 53,
"proto": "udp",
"trans_id": 41180,
"rtt": 0.001528024673461914,
"query": "newyork.dmevals.local",
"qclass": 1,
"qclass_name": "C_INTERNET",
"qtype": 1,
Sigma Rule
Streaming Data
1
2
3
4
5
Detection Results / Dynamic Output Topic
19
'^(?<timestamp>w{3}sd{2}sd{2}:d{2}:d{2})s(?< hostname>[^s]+)
s%ASA-d-(?< messageID>[^:]+):s(?< action>[^s]+)s(?< protocol>[^s]+)
ssrcsinside:(?< src>[0-9.]+)/(?< srcport>[0-9]+)sdstsoutside:(?< d
est>[0-9.]+)/(?< destport>[0-9]+)'
Sigma Rule
firewalls topic
adds the record data and rule
metadata
if the rule contains a regex with
mapping, adds the mapped
fields
adds any custom fields
dynamic output topic
1
2
3
4
Aggregate Topology
20
iterate through each rule for the
processor (product/service
filtering)
validates the streaming data
against the DSL rule
groups by the key
sliding window from rule
counts number of instances
within the window
filters based on aggregation and
operation in rule
dynamic output topic
{
"@stream": "dns",
"@system":
"bobs.bigwheel.local",
"@proc": "zeek",
"ts": 1588205199.82437,
"uid": "Cvf4XX17hSAgXDdGEd",
"id_orig_h": "10.0.1.6",
"id_orig_p": 54243,
"id_resp_h": "10.0.0.4",
"id_resp_p": 53,
"proto": "udp",
"trans_id": 41180,
"rtt": 0.001528024673461914,
"query":
"newyork.dmevals.local",
"qclass": 1,
"qclass_name": "C_INTERNET",
"qtype": 1,
Sigma Rule
Streaming Data
1
3
5
6
7
4
2
Confluent Sigma Demo
Sigma Stream Processors
Sigma Streams UI
Sigma Rule Editor
sigma rules topic
DNS
dns
detections
topic
dns topic
rule parsing,
filtering,
aggregation,
windowing
sigma
rules
cache
CONN
DHCP
HTTP
SSL
x509
Zeek Data
Demo
Considerations for DSL Stream Processing
● Consider how to handle newly arriving rules
○ Apply only to unprocessed messages
○ What about previous data?
■ Spawn separate app or topology
■ When “caught up” merge into single
● Versioned state stores now allow new strategy
○ Only join rules available for corresponding record time
○ For records only join against rules that existed at record time
■ Upcoming KIP-960, KIP-968, KIP-969 for our customized joins
23
https://www.confluent.io/blog/introducing-versioned-state-store-in-kafka-streams/
Considerations Continued
● Consider how routines may interact with each other
○ Sigma doesn’t support notion of priority or dependence
● For a DSL in which rules interact (IFTTT)
○ Use Processor API to map to an appropriate DAG
24
Whats Next?
“Roadmap” https://github.com/confluentinc/confluent-sigma/projects/1
● Refactoring framework to make DSL pluggable (on going)
○ Implement a builder class: DSL routines -> Kafka Streams topology
○ Re-use all the scaffolding we have built
○ LLM stream processing
● Optimizations
○ Aggregate topology optimizations (bin based on time and key)
○ More graceful management of aggregate topology changes
■ Self coordination
○ General profiling and optimization
● Switch to GlobalKTable from KCache 25
Building a Dynamic Rules Engine with Kafka Streams

More Related Content

What's hot

Migrating with Debezium
Migrating with DebeziumMigrating with Debezium
Migrating with DebeziumMike Fowler
 
Use case and integration of ClickHouse with Apache Superset & Dremio
Use case and integration of ClickHouse with Apache Superset & DremioUse case and integration of ClickHouse with Apache Superset & Dremio
Use case and integration of ClickHouse with Apache Superset & DremioAltinity Ltd
 
Productizing Structured Streaming Jobs
Productizing Structured Streaming JobsProductizing Structured Streaming Jobs
Productizing Structured Streaming JobsDatabricks
 
Oracle GoldenGate Microservices Overview ( with Demo )
Oracle GoldenGate Microservices Overview ( with Demo )Oracle GoldenGate Microservices Overview ( with Demo )
Oracle GoldenGate Microservices Overview ( with Demo )Mari Kupatadze
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersDataWorks Summit
 
Migration to Oracle Multitenant
Migration to Oracle MultitenantMigration to Oracle Multitenant
Migration to Oracle MultitenantJitendra Singh
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101Whiteklay
 
Oracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSOracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSDoug Gault
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practicesconfluent
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafkaconfluent
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solrKnoldus Inc.
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdfLars Albertsson
 
Deep Dive into Keystone Tokens and Lessons Learned
Deep Dive into Keystone Tokens and Lessons LearnedDeep Dive into Keystone Tokens and Lessons Learned
Deep Dive into Keystone Tokens and Lessons LearnedPriti Desai
 
Apache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentApache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentHostedbyConfluent
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersJean-Paul Azar
 
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB
 

What's hot (20)

Migrating with Debezium
Migrating with DebeziumMigrating with Debezium
Migrating with Debezium
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
Use case and integration of ClickHouse with Apache Superset & Dremio
Use case and integration of ClickHouse with Apache Superset & DremioUse case and integration of ClickHouse with Apache Superset & Dremio
Use case and integration of ClickHouse with Apache Superset & Dremio
 
Productizing Structured Streaming Jobs
Productizing Structured Streaming JobsProductizing Structured Streaming Jobs
Productizing Structured Streaming Jobs
 
Oracle GoldenGate Microservices Overview ( with Demo )
Oracle GoldenGate Microservices Overview ( with Demo )Oracle GoldenGate Microservices Overview ( with Demo )
Oracle GoldenGate Microservices Overview ( with Demo )
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
Migration to Oracle Multitenant
Migration to Oracle MultitenantMigration to Oracle Multitenant
Migration to Oracle Multitenant
 
Kafka connect 101
Kafka connect 101Kafka connect 101
Kafka connect 101
 
Oracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSOracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDS
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practices
 
Envoy and Kafka
Envoy and KafkaEnvoy and Kafka
Envoy and Kafka
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafka
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solr
 
Data engineering in 10 years.pdf
Data engineering in 10 years.pdfData engineering in 10 years.pdf
Data engineering in 10 years.pdf
 
Deep Dive into Keystone Tokens and Lessons Learned
Deep Dive into Keystone Tokens and Lessons LearnedDeep Dive into Keystone Tokens and Lessons Learned
Deep Dive into Keystone Tokens and Lessons Learned
 
elk_stack_alexander_szalonnas
elk_stack_alexander_szalonnaselk_stack_alexander_szalonnas
elk_stack_alexander_szalonnas
 
Apache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentApache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, Confluent
 
Exadata Cloud Service Overview(v2)
Exadata Cloud Service Overview(v2) Exadata Cloud Service Overview(v2)
Exadata Cloud Service Overview(v2)
 
Kafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced ProducersKafka Tutorial: Advanced Producers
Kafka Tutorial: Advanced Producers
 
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
MongoDB San Francisco 2013: Hash-based Sharding in MongoDB 2.4 presented by B...
 

Similar to Building a Dynamic Rules Engine with Kafka Streams

Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...HostedbyConfluent
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingYaroslav Tkachenko
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Apex
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at ScaleSean Zhong
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformApache Apex
 
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)Ontico
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...DataStax
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...DataWorks Summit/Hadoop Summit
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application Apache Apex
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksMatthias Niehoff
 
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Canada
 
Real-time Streaming Pipelines with FLaNK
Real-time Streaming Pipelines with FLaNKReal-time Streaming Pipelines with FLaNK
Real-time Streaming Pipelines with FLaNKData Con LA
 
WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202Timothy Spann
 
Hyperledger 구조 분석
Hyperledger 구조 분석Hyperledger 구조 분석
Hyperledger 구조 분석Jongseok Choi
 
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...FIWARE
 
Data Pipelines and Telephony Fraud Detection Using Machine Learning
Data Pipelines and Telephony Fraud Detection Using Machine Learning Data Pipelines and Telephony Fraud Detection Using Machine Learning
Data Pipelines and Telephony Fraud Detection Using Machine Learning Eugene
 
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...Altinity Ltd
 

Similar to Building a Dynamic Rules Engine with Kafka Streams (20)

Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streamin...
 
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
Strata Singapore: GearpumpReal time DAG-Processing with Akka at ScaleStrata Singapore: GearpumpReal time DAG-Processing with Akka at Scale
Strata Singapore: Gearpump Real time DAG-Processing with Akka at Scale
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
End to End Processing of 3.7 Million Telemetry Events per Second using Lambda...
 
Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application  Introduction to Apache Apex and writing a big data streaming application
Introduction to Apache Apex and writing a big data streaming application
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Data Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and FrameworksData Stream Processing - Concepts and Frameworks
Data Stream Processing - Concepts and Frameworks
 
Cisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven TelemetryCisco Connect Toronto 2017 - Model-driven Telemetry
Cisco Connect Toronto 2017 - Model-driven Telemetry
 
Real-time Streaming Pipelines with FLaNK
Real-time Streaming Pipelines with FLaNKReal-time Streaming Pipelines with FLaNK
Real-time Streaming Pipelines with FLaNK
 
WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202
 
Hyperledger 구조 분석
Hyperledger 구조 분석Hyperledger 구조 분석
Hyperledger 구조 분석
 
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...
FIWARE Global Summit - Fast RTPS: Programming with the Default middleware for...
 
Fast RTPS
Fast RTPSFast RTPS
Fast RTPS
 
Data Pipelines and Telephony Fraud Detection Using Machine Learning
Data Pipelines and Telephony Fraud Detection Using Machine Learning Data Pipelines and Telephony Fraud Detection Using Machine Learning
Data Pipelines and Telephony Fraud Detection Using Machine Learning
 
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
ClickHouse Paris Meetup. Pragma Analytics Software Suite w/ClickHouse, by Mat...
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfFIDO Alliance
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Julian Hyde
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FIDO Alliance
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceSamy Fodil
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftshyamraj55
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfFIDO Alliance
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...FIDO Alliance
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyUXDXConf
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsStefano
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireExakis Nelite
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomCzechDreamin
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfFIDO Alliance
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityScyllaDB
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPTiSEO AI
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераMark Opanasiuk
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...panagenda
 

Recently uploaded (20)

Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024Enterprise Knowledge Graphs - Data Summit 2024
Enterprise Knowledge Graphs - Data Summit 2024
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
Optimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through ObservabilityOptimizing NoSQL Performance Through Observability
Optimizing NoSQL Performance Through Observability
 
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
1111 ChatGPT Prompts PDF Free Download - Prompts for ChatGPT
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 

Building a Dynamic Rules Engine with Kafka Streams

  • 1. Building a Dynamic DSL Rules Engine with Kafka Streams Will LaForest Field CTO Michael Peacock Field Engineer
  • 2. 2 Why Something Other Than These FRAMEWORKS ksqlDB
  • 3. Stream Processing Frameworks General Purpose 3 Stream processing very powerful BUT ● Uses GPLs (Java, Python) or SQL ● 99.2% of workforce are NOT coders ● Maybe SQL is easier ○ Still general purpose ○ Impractical for some domains 99% 1%
  • 4. • Volume of rules high? 100s or 1000s? • If each rule is a stream processing job • Each a consumer and producer • Same network IO 1000x • Process overhead • Serialization/Deserialization Technical Challenges 4 Cartoon style. Gnome sees something gross and vomits
  • 5. Technical Challenges 5 • Multiple dynamic output topics? • Change rate of ruleset high? • Minimize time to to processing? • Must be dynamic • Don’t compile and deploy • Don’t create a new stream processor • Don’t restart topology if possible
  • 6. Anyone Know what this is? 6 to spiral make "n 1 while [:n < 100] [ make "n :n + 5 fd :n rt 90 ] end
  • 7. Domain Specific Languages Purpose built for a domain ● HTML web pages ● Regex for patterns ● MATLAB for numerical analysis ● GraphQL - Querying APIs ● Sigma for cyber and log data If serialized as JSON, YAML, XML ● DSL as Data ● Apply data oriented operations ● Validate, secure, govern as data ● Can be transformed easily ● Generated from domain specific tool ● Dynamic
  • 9. A Typical SIEM Platform Architecture Agents / Collectors Reports and alerts Search and investigate Monitor and analyze SIEM Tool Network traffic Firewall logs RDBMS Application logs HTTP proxy logs
  • 10. Forensic Archive AI/ML Architecture with Kafka Curate Real-time Detection APP SIEM Index SOAR HDFS S3 Big Query Syslog Network traffic Firewall logs RDBMS Application logs HTTP proxy logs QRadar Arcsight Splunk Machine Data spooldir (files), SNMP Traps, Databases, Sftp, MQs, S2S Elastic
  • 12. What is Sigma? 12 https://github.com/SigmaHQ/sigma ● Sigma is an open signature (patterns) format ● Focused on log and network ● Researchers or analysts can describe developed detection patterns ● Shareable with others (IMPORTANT) ● Rules schema agnostic ● Platform agnostic ○ Same rule can be be used for ○ Elastic, Splunk, Azure Sentinel, (now Kafka)
  • 13. Confluent Sigma 13 Sigma Rules (YAML) Source Data Sigma Stream Processor SIEM Applications Cold storage High Risk Detections SOAR Filtered Cold retention Data Rules
  • 14. Why Kafka Streams API? 14 ● Simple to build a topology ● Easy to run as app ● No required runtime ● Start small easily App 1 ● Scale massively ● All you need is Kafka ● Can run anywhere!
  • 15. A Walkthrough of the Sigma Project
  • 16. A Walkthrough of the Sigma Project 16 sigma-parser sigma-streams sigma-streams-ui https://github.com/confluentinc/confluent-sigma
  • 17. Sigma Stream Topologies Simple Topology supports one sub-topology to many rules Aggregate Topology requires one sub-topology for each rule
  • 18. Simple Topology 18 flatMapValues KStream iterate through each rule for the processor (product/service filtering) validates the streaming data against the DSL rule and add the results to the output list if a match is found optional config variable to return after the first match or return all matches dynamic output topic flatMapValues Create a new KStream by transforming the value of each record in this stream into zero or more values with the same key in the new stream. { "@stream": "dns", "@system": "bobs.bigwheel.local", "@proc": "zeek", "ts": 1588205199.82437, "uid": "Cvf4XX17hSAgXDdGEd", "id_orig_h": "10.0.1.6", "id_orig_p": 54243, "id_resp_h": "10.0.0.4", "id_resp_p": 53, "proto": "udp", "trans_id": 41180, "rtt": 0.001528024673461914, "query": "newyork.dmevals.local", "qclass": 1, "qclass_name": "C_INTERNET", "qtype": 1, Sigma Rule Streaming Data 1 2 3 4 5
  • 19. Detection Results / Dynamic Output Topic 19 '^(?<timestamp>w{3}sd{2}sd{2}:d{2}:d{2})s(?< hostname>[^s]+) s%ASA-d-(?< messageID>[^:]+):s(?< action>[^s]+)s(?< protocol>[^s]+) ssrcsinside:(?< src>[0-9.]+)/(?< srcport>[0-9]+)sdstsoutside:(?< d est>[0-9.]+)/(?< destport>[0-9]+)' Sigma Rule firewalls topic adds the record data and rule metadata if the rule contains a regex with mapping, adds the mapped fields adds any custom fields dynamic output topic 1 2 3 4
  • 20. Aggregate Topology 20 iterate through each rule for the processor (product/service filtering) validates the streaming data against the DSL rule groups by the key sliding window from rule counts number of instances within the window filters based on aggregation and operation in rule dynamic output topic { "@stream": "dns", "@system": "bobs.bigwheel.local", "@proc": "zeek", "ts": 1588205199.82437, "uid": "Cvf4XX17hSAgXDdGEd", "id_orig_h": "10.0.1.6", "id_orig_p": 54243, "id_resp_h": "10.0.0.4", "id_resp_p": 53, "proto": "udp", "trans_id": 41180, "rtt": 0.001528024673461914, "query": "newyork.dmevals.local", "qclass": 1, "qclass_name": "C_INTERNET", "qtype": 1, Sigma Rule Streaming Data 1 3 5 6 7 4 2
  • 21. Confluent Sigma Demo Sigma Stream Processors Sigma Streams UI Sigma Rule Editor sigma rules topic DNS dns detections topic dns topic rule parsing, filtering, aggregation, windowing sigma rules cache CONN DHCP HTTP SSL x509 Zeek Data
  • 22. Demo
  • 23. Considerations for DSL Stream Processing ● Consider how to handle newly arriving rules ○ Apply only to unprocessed messages ○ What about previous data? ■ Spawn separate app or topology ■ When “caught up” merge into single ● Versioned state stores now allow new strategy ○ Only join rules available for corresponding record time ○ For records only join against rules that existed at record time ■ Upcoming KIP-960, KIP-968, KIP-969 for our customized joins 23 https://www.confluent.io/blog/introducing-versioned-state-store-in-kafka-streams/
  • 24. Considerations Continued ● Consider how routines may interact with each other ○ Sigma doesn’t support notion of priority or dependence ● For a DSL in which rules interact (IFTTT) ○ Use Processor API to map to an appropriate DAG 24
  • 25. Whats Next? “Roadmap” https://github.com/confluentinc/confluent-sigma/projects/1 ● Refactoring framework to make DSL pluggable (on going) ○ Implement a builder class: DSL routines -> Kafka Streams topology ○ Re-use all the scaffolding we have built ○ LLM stream processing ● Optimizations ○ Aggregate topology optimizations (bin based on time and key) ○ More graceful management of aggregate topology changes ■ Self coordination ○ General profiling and optimization ● Switch to GlobalKTable from KCache 25