SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Real-Time Streaming Pipelines With FLaNK
(Apache Flink, Apache NiFi & Apache Kafka)
Timothy Spann - Principal DataFlow Field Engineer
15-April-2021
2
https://github.com/tspannhw https://www.datainmotion.dev/
https://www.meetup.com/futureofdata-princeton/
3
FLaNK Stack for Cloud Data Engineers
Multiple users, frameworks, languages, clouds, data sources & clusters
CLOUD DATA ENGINEER
• Experience in ETL/ELT
• Coding skills in Python or Java
• Knowledge of database query
languages such as SQL
• Experience with Streaming
• Knowledge of Cloud Tools
• Expert in ETL (Eating, Ties and Laziness)
• Edge Camera Interaction
• Typical User
• No Coding Skills
• Can use NiFi
• Questions your cloud spend
CAT AI / Deep Learning / ML / DS
• Can run in Apache NiFi
• Can run in Kafka Streams
• Can run in Apache Flink
• Can run in MiNiFi Agents
4
I Can Haz Data?
Today’s Data. REST and Websocket JSON “stonks”
{"symbol":"CLDR",
"uuid":"10640832-f139-4b82-8780-e3ad37b3d0
ce",
"ts":1618529574078,
"dt":1612098900000,
"datetime":"2021/01/31 08:15:00",
"open":"12.24500",
"close":"12.25500",
"high":"12.25500",
"volume":"12353",
"low":"12.24500"}
5
https://github.com/tspannhw/SmartStocks
6
End to End Streaming Demo Pipeline
Enterprise
sources
Weather
Errors
Aggregates
Alerts
Stocks
ETL
Analytics
Streaming SQL
Clickstream Market data
Machine logs Social
https://github.com/tspannhw/CloudDemo2021
7
8
Apache NiFi in a Nutshell
Enable easy ingestion, routing, management and delivery of any data anywhere (Edge, cloud, data
center) to any downstream system with built in end-to-end security and provenance
Advanced tooling to industrialize
flow development (Flow Development
Life Cycle)
ACQUIRE
• Over 340 Prebuilt Processors
• Easy to build your own
• Parse, Enrich & Apply Schema
• Filter, Split, Merger & Route
• Throttle & Backpressure
FTP
SFTP
HL7
UDP
XML
HTTP
EMAIL
HTML
IMAGE
SYSLOG
PROCESS
HASH
MERGE
EXTRACT
DUPLICATE
SPLIT
ENCRYPT
TALL
EVALUATE
EXECUTE
GEOENRICH
SCAN
REPLACE
TRANSLATE
CONVERT
ROUTE TEXT
ROUTE CONTENT
ROUTE CONTEXT
ROUTE RATE
DISTRIBUTE LOAD
DELIVER
• Guaranteed Delivery
• Full data provenance from
acquisition to delivery
• Diverse, Non-Traditional Sources
• Eco-system integration
FTP
SFTP
HL7
UDP
XML
HTTP
EMAIL
HTML
IMAGE
SYSLOG
© 2021 Cloudera, Inc. All rights reserved. 9
ParquetReader /
ParquetWriter
Records
• Native Record Processors for
Apache Parquet Files!
• CVS <-> Parquet
• XML <-> Parquet
• AVRO <-> Parquet
• JSON <-> Parquet
• More...
https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
https://www.datainmotion.dev/2019/10/migrating-apache-flume-flows-to-apac
he_7.html
© 2021 Cloudera, Inc. All rights reserved. 10
UpdateRecord
• Use with LookupRecord
• ELT
• Works on CSV, XML, JSON,
AVRO, …
• RecordPath or Literals
• Use Schemas and Schema
Registry
© 2021 Cloudera, Inc. All rights reserved. 11
ValidateRecord
• Works on CSV, XML, JSON,
AVRO, …
• RecordPath or Literals
• Use Schemas and Schema
Registry
• Checks fields, types, nullable
© 2021 Cloudera, Inc. All rights reserved. 12
RestLookupService
• Works on CSV, XML, JSON,
AVRO, …
• Use Schemas and Schema
Registry
• Can call Cloduera ML Models
• SSL and Proxy enabled
© 2021 Cloudera, Inc. All rights reserved. 13
https://www.datainmotion.dev/2020/06/no-more-spaghetti-flows.html
● Reduce, Reuse, Recycle. Use Parameters to reuse
common modules.
● Put flows, reusable chunks into separate Process
Groups.
● Write custom processors if you need new or
specialized features
● Use Cloudera supported NiFi Processors
● Use Record Processors everywhere
No More Spaghetti Flows
© 2021 Cloudera, Inc. All rights reserved. 14
New Features
… based on Apache NiFi 1.13.2
https://www.datainmotion.dev/2021/02/new-features-of-apache-nifi-1130.html
● ListenFTP
● Data Drift
● SampleRecord
● Generic Record Sink
● Generic Record Reader
● PutRecord
● WindowsEventLogReader
15
Yes, Franz, It’s Kafka
Let’s do a metamorphosis on your data. Don’t fear changing data.
You don’t need to be a brilliant writer to stream
data.
Franz Kafka was a German-speaking
Bohemian novelist and short-story writer,
widely regarded as one of the major figures of
20th-century literature. His work fuses
elements of realism and the fantastic.
Wikipedia
© 2021 Cloudera, Inc. All rights reserved. 16
Apache Kafka
• Highly reliable distributed
messaging system
• Decouple applications, enables
many-to-many patterns
• Publish-Subscribe semantics
• Horizontal scalability
• Efficient implementation to
operate at speed with big data
volumes
• Organized by topic to support
several use cases
Source
System
Source
System
Source
System
Kafka
Fraud
Detection
Security
Systems
Real-Time
Monitoring
Many-To-Many
Publish-Subscribe
EVENTS
© 2021 Cloudera, Inc. All rights reserved. 17
Flink SQL
… based on Apache Flink 1.12
https://www.datainmotion.dev/2021/04/cloudera-sql-stream-builder-ssb-updated.html
● Streaming Analytics
● Continuous SQL
● Continuous ETL
● Complex Event Processing
● Standard SQL Powered by Apache Calcite
● Deployed Apache Flink Apps on YARN
● Scalable Stream Processing
18
Flink SQL
-- specify Kafka partition key on output
SELECT foo AS _eventKey FROM sensors
-- use event time timestamp from kafka
-- exactly once compatible
SELECT eventTimestamp FROM sensors
-- nested structures access
SELECT foo.’bar’ FROM table; -- must quote nested
column
-- timestamps
SELECT * FROM payments
WHERE eventTimestamp > CURRENT_TIMESTAMP-interval
'10' second;
-- unnest
SELECT b.*, u.*
FROM bgp_avro b,
UNNEST(b.path) AS u(pathitem)
-- aggregations and windows
SELECT card,
MAX(amount) as theamount,
TUMBLE_END(eventTimestamp, interval '5' minute) as
ts
FROM payments
WHERE lat IS NOT NULL
AND lon IS NOT NULL
GROUP BY card,
TUMBLE(eventTimestamp, interval '5' minute)
HAVING COUNT(*) > 4 -- >4==fraud
-- try to do this ksql!
SELECT us_west.user_score+ap_south.user_score
FROM kafka_in_zone_us_west us_west
FULL OUTER JOIN kafka_in_zone_ap_south ap_south
ON us_west.user_id = ap_south.user_id;
Key Takeaway: Rich SQL grammar with advanced time and aggregation tools
© 2021 Cloudera, Inc. All rights reserved. 19
Flink SQL
SELECT location, station_id, latitude, longitude, observation_time, weather, temperature_string,
relative_humidity, wind_string, wind_dir, wind_degrees, wind_mph, pressure_in, dewpoint_string,
dewpoint_f, dewpoint_c FROM weather2 WHERE location is not null and location <> 'null' and
trim(location) <> '' and location like '%NJ'
SELECT HOP_END(eventTimestamp, INTERVAL '1' SECOND, INTERVAL '30' SECOND) as
windowEnd, count("close") as closeCount, sum(cast("close" as float)) as closeSum, avg(cast("close" as
float)) as closeAverage, min("close") as closeMin, max("close") as closeMax, sum(case when "close" >
14 then 1 else 0 end) as stockGreaterThan14 FROM stocksraw GROUP BY HOP(eventTimestamp,
INTERVAL '1' SECOND, INTERVAL '30' SECOND)
© 2021 Cloudera, Inc. All rights reserved. 20
© 2021 Cloudera, Inc. All rights reserved. 21
Upcoming Events
April 27 May 19
22
TH N Y U

Weitere ähnliche Inhalte

Was ist angesagt?

Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineDataWorks Summit
 
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Timothy Spann
 
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!Timo Walther
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Timothy Spann
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Using the flipn stack for edge ai (flink, nifi, pulsar)
Using the flipn stack for edge ai (flink, nifi, pulsar)Using the flipn stack for edge ai (flink, nifi, pulsar)
Using the flipn stack for edge ai (flink, nifi, pulsar)Timothy Spann
 
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021StreamNative
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksTimothy Spann
 
Api world apache nifi 101
Api world   apache nifi 101Api world   apache nifi 101
Api world apache nifi 101Timothy Spann
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaTimothy Spann
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computingTimothy Spann
 
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...Timothy Spann
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...Timothy Spann
 
Incrementally streaming rdbms data to your data lake automagically
Incrementally streaming rdbms data to your data lake automagicallyIncrementally streaming rdbms data to your data lake automagically
Incrementally streaming rdbms data to your data lake automagicallyTimothy Spann
 
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...Timothy Spann
 
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021StreamNative
 
Python web conference 2022 apache pulsar development 101 with python (f li-...
Python web conference 2022   apache pulsar development 101 with python (f li-...Python web conference 2022   apache pulsar development 101 with python (f li-...
Python web conference 2022 apache pulsar development 101 with python (f li-...Timothy Spann
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineData Con LA
 

Was ist angesagt? (20)

Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Bullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query EngineBullet: A Real Time Data Query Engine
Bullet: A Real Time Data Query Engine
 
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
 
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!ApacheCon 2020 - Flink SQL in 2020: Time to show off!
ApacheCon 2020 - Flink SQL in 2020: Time to show off!
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Using the flipn stack for edge ai (flink, nifi, pulsar)
Using the flipn stack for edge ai (flink, nifi, pulsar)Using the flipn stack for edge ai (flink, nifi, pulsar)
Using the flipn stack for edge ai (flink, nifi, pulsar)
 
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
How Tencent Applies Apache Pulsar to Apache InLong - Pulsar Summit Asia 2021
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworks
 
Api world apache nifi 101
Api world   apache nifi 101Api world   apache nifi 101
Api world apache nifi 101
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
 
Incrementally streaming rdbms data to your data lake automagically
Incrementally streaming rdbms data to your data lake automagicallyIncrementally streaming rdbms data to your data lake automagically
Incrementally streaming rdbms data to your data lake automagically
 
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...Osacon 2021   hello hydrate! from stream to clickhouse with apache pulsar and...
Osacon 2021 hello hydrate! from stream to clickhouse with apache pulsar and...
 
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
 
Python web conference 2022 apache pulsar development 101 with python (f li-...
Python web conference 2022   apache pulsar development 101 with python (f li-...Python web conference 2022   apache pulsar development 101 with python (f li-...
Python web conference 2022 apache pulsar development 101 with python (f li-...
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Migrating pipelines into Docker
Migrating pipelines into DockerMigrating pipelines into Docker
Migrating pipelines into Docker
 

Ähnlich wie Real-time Streaming Pipelines with FLaNK

Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019Thomas Weise
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesDatabricks
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
Ingesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedIngesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedwhoschek
 
Sparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with SparkSparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with Sparkfelixcss
 
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...HostedbyConfluent
 
Hands on with CoAP and Californium
Hands on with CoAP and CaliforniumHands on with CoAP and Californium
Hands on with CoAP and CaliforniumJulien Vermillard
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...Athens Big Data
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesTimothy Spann
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarTimothy Spann
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Apex
 
RTAS 2023: Building a Real-Time IoT Application
RTAS 2023:  Building a Real-Time IoT ApplicationRTAS 2023:  Building a Real-Time IoT Application
RTAS 2023: Building a Real-Time IoT ApplicationTimothy Spann
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming appsTimothy Spann
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Appsssuser73434e
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 

Ähnlich wie Real-time Streaming Pipelines with FLaNK (20)

Jug - ecosystem
Jug -  ecosystemJug -  ecosystem
Jug - ecosystem
 
Chti jug - 2018-06-26
Chti jug - 2018-06-26Chti jug - 2018-06-26
Chti jug - 2018-06-26
 
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
Python Streaming Pipelines on Flink - Beam Meetup at Lyft 2019
 
Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Running Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using KubernetesRunning Apache Spark Jobs Using Kubernetes
Running Apache Spark Jobs Using Kubernetes
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
Ingesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmedIngesting hdfs intosolrusingsparktrimmed
Ingesting hdfs intosolrusingsparktrimmed
 
Sparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with SparkSparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with Spark
 
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...
Kubernetes connectivity to Cloud Native Kafka | Evan Shortiss and Hugo Guerre...
 
Hands on with CoAP and Californium
Hands on with CoAP and CaliforniumHands on with CoAP and Californium
Hands on with CoAP and Californium
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
 
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache ApexApache Big Data EU 2016: Building Streaming Applications with Apache Apex
Apache Big Data EU 2016: Building Streaming Applications with Apache Apex
 
RTAS 2023: Building a Real-Time IoT Application
RTAS 2023:  Building a Real-Time IoT ApplicationRTAS 2023:  Building a Real-Time IoT Application
RTAS 2023: Building a Real-Time IoT Application
 
big data fest building modern data streaming apps
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
 
BigDataFest_ Building Modern Data Streaming Apps
BigDataFest_  Building Modern Data Streaming AppsBigDataFest_  Building Modern Data Streaming Apps
BigDataFest_ Building Modern Data Streaming Apps
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 

Mehr von Data Con LA

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA
 

Mehr von Data Con LA (20)

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
 
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AIData Con LA 2022 - Collaborative Data Exploration using Conversational AI
Data Con LA 2022 - Collaborative Data Exploration using Conversational AI
 
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
Data Con LA 2022 - Why Database Modernization Makes Your Data Decisions More ...
 
Data Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data ScienceData Con LA 2022 - Intro to Data Science
Data Con LA 2022 - Intro to Data Science
 
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing EntertainmentData Con LA 2022 - How are NFTs and DeFi Changing Entertainment
Data Con LA 2022 - How are NFTs and DeFi Changing Entertainment
 
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
Data Con LA 2022 - Why Data Quality vigilance requires an End-to-End, Automat...
 
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
Data Con LA 2022-Perfect Viral Ad prediction of Superbowl 2022 using Tease, T...
 
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...Data Con LA 2022- Embedding medical journeys with machine learning to improve...
Data Con LA 2022- Embedding medical journeys with machine learning to improve...
 
Data Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with KafkaData Con LA 2022 - Data Streaming with Kafka
Data Con LA 2022 - Data Streaming with Kafka
 

Kürzlich hochgeladen

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Kürzlich hochgeladen (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

Real-time Streaming Pipelines with FLaNK

  • 1. Real-Time Streaming Pipelines With FLaNK (Apache Flink, Apache NiFi & Apache Kafka) Timothy Spann - Principal DataFlow Field Engineer 15-April-2021
  • 3. 3 FLaNK Stack for Cloud Data Engineers Multiple users, frameworks, languages, clouds, data sources & clusters CLOUD DATA ENGINEER • Experience in ETL/ELT • Coding skills in Python or Java • Knowledge of database query languages such as SQL • Experience with Streaming • Knowledge of Cloud Tools • Expert in ETL (Eating, Ties and Laziness) • Edge Camera Interaction • Typical User • No Coding Skills • Can use NiFi • Questions your cloud spend CAT AI / Deep Learning / ML / DS • Can run in Apache NiFi • Can run in Kafka Streams • Can run in Apache Flink • Can run in MiNiFi Agents
  • 4. 4 I Can Haz Data? Today’s Data. REST and Websocket JSON “stonks” {"symbol":"CLDR", "uuid":"10640832-f139-4b82-8780-e3ad37b3d0 ce", "ts":1618529574078, "dt":1612098900000, "datetime":"2021/01/31 08:15:00", "open":"12.24500", "close":"12.25500", "high":"12.25500", "volume":"12353", "low":"12.24500"}
  • 6. 6 End to End Streaming Demo Pipeline Enterprise sources Weather Errors Aggregates Alerts Stocks ETL Analytics Streaming SQL Clickstream Market data Machine logs Social https://github.com/tspannhw/CloudDemo2021
  • 7. 7
  • 8. 8 Apache NiFi in a Nutshell Enable easy ingestion, routing, management and delivery of any data anywhere (Edge, cloud, data center) to any downstream system with built in end-to-end security and provenance Advanced tooling to industrialize flow development (Flow Development Life Cycle) ACQUIRE • Over 340 Prebuilt Processors • Easy to build your own • Parse, Enrich & Apply Schema • Filter, Split, Merger & Route • Throttle & Backpressure FTP SFTP HL7 UDP XML HTTP EMAIL HTML IMAGE SYSLOG PROCESS HASH MERGE EXTRACT DUPLICATE SPLIT ENCRYPT TALL EVALUATE EXECUTE GEOENRICH SCAN REPLACE TRANSLATE CONVERT ROUTE TEXT ROUTE CONTENT ROUTE CONTEXT ROUTE RATE DISTRIBUTE LOAD DELIVER • Guaranteed Delivery • Full data provenance from acquisition to delivery • Diverse, Non-Traditional Sources • Eco-system integration FTP SFTP HL7 UDP XML HTTP EMAIL HTML IMAGE SYSLOG
  • 9. © 2021 Cloudera, Inc. All rights reserved. 9 ParquetReader / ParquetWriter Records • Native Record Processors for Apache Parquet Files! • CVS <-> Parquet • XML <-> Parquet • AVRO <-> Parquet • JSON <-> Parquet • More... https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html https://www.datainmotion.dev/2019/10/migrating-apache-flume-flows-to-apac he_7.html
  • 10. © 2021 Cloudera, Inc. All rights reserved. 10 UpdateRecord • Use with LookupRecord • ELT • Works on CSV, XML, JSON, AVRO, … • RecordPath or Literals • Use Schemas and Schema Registry
  • 11. © 2021 Cloudera, Inc. All rights reserved. 11 ValidateRecord • Works on CSV, XML, JSON, AVRO, … • RecordPath or Literals • Use Schemas and Schema Registry • Checks fields, types, nullable
  • 12. © 2021 Cloudera, Inc. All rights reserved. 12 RestLookupService • Works on CSV, XML, JSON, AVRO, … • Use Schemas and Schema Registry • Can call Cloduera ML Models • SSL and Proxy enabled
  • 13. © 2021 Cloudera, Inc. All rights reserved. 13 https://www.datainmotion.dev/2020/06/no-more-spaghetti-flows.html ● Reduce, Reuse, Recycle. Use Parameters to reuse common modules. ● Put flows, reusable chunks into separate Process Groups. ● Write custom processors if you need new or specialized features ● Use Cloudera supported NiFi Processors ● Use Record Processors everywhere No More Spaghetti Flows
  • 14. © 2021 Cloudera, Inc. All rights reserved. 14 New Features … based on Apache NiFi 1.13.2 https://www.datainmotion.dev/2021/02/new-features-of-apache-nifi-1130.html ● ListenFTP ● Data Drift ● SampleRecord ● Generic Record Sink ● Generic Record Reader ● PutRecord ● WindowsEventLogReader
  • 15. 15 Yes, Franz, It’s Kafka Let’s do a metamorphosis on your data. Don’t fear changing data. You don’t need to be a brilliant writer to stream data. Franz Kafka was a German-speaking Bohemian novelist and short-story writer, widely regarded as one of the major figures of 20th-century literature. His work fuses elements of realism and the fantastic. Wikipedia
  • 16. © 2021 Cloudera, Inc. All rights reserved. 16 Apache Kafka • Highly reliable distributed messaging system • Decouple applications, enables many-to-many patterns • Publish-Subscribe semantics • Horizontal scalability • Efficient implementation to operate at speed with big data volumes • Organized by topic to support several use cases Source System Source System Source System Kafka Fraud Detection Security Systems Real-Time Monitoring Many-To-Many Publish-Subscribe EVENTS
  • 17. © 2021 Cloudera, Inc. All rights reserved. 17 Flink SQL … based on Apache Flink 1.12 https://www.datainmotion.dev/2021/04/cloudera-sql-stream-builder-ssb-updated.html ● Streaming Analytics ● Continuous SQL ● Continuous ETL ● Complex Event Processing ● Standard SQL Powered by Apache Calcite ● Deployed Apache Flink Apps on YARN ● Scalable Stream Processing
  • 18. 18 Flink SQL -- specify Kafka partition key on output SELECT foo AS _eventKey FROM sensors -- use event time timestamp from kafka -- exactly once compatible SELECT eventTimestamp FROM sensors -- nested structures access SELECT foo.’bar’ FROM table; -- must quote nested column -- timestamps SELECT * FROM payments WHERE eventTimestamp > CURRENT_TIMESTAMP-interval '10' second; -- unnest SELECT b.*, u.* FROM bgp_avro b, UNNEST(b.path) AS u(pathitem) -- aggregations and windows SELECT card, MAX(amount) as theamount, TUMBLE_END(eventTimestamp, interval '5' minute) as ts FROM payments WHERE lat IS NOT NULL AND lon IS NOT NULL GROUP BY card, TUMBLE(eventTimestamp, interval '5' minute) HAVING COUNT(*) > 4 -- >4==fraud -- try to do this ksql! SELECT us_west.user_score+ap_south.user_score FROM kafka_in_zone_us_west us_west FULL OUTER JOIN kafka_in_zone_ap_south ap_south ON us_west.user_id = ap_south.user_id; Key Takeaway: Rich SQL grammar with advanced time and aggregation tools
  • 19. © 2021 Cloudera, Inc. All rights reserved. 19 Flink SQL SELECT location, station_id, latitude, longitude, observation_time, weather, temperature_string, relative_humidity, wind_string, wind_dir, wind_degrees, wind_mph, pressure_in, dewpoint_string, dewpoint_f, dewpoint_c FROM weather2 WHERE location is not null and location <> 'null' and trim(location) <> '' and location like '%NJ' SELECT HOP_END(eventTimestamp, INTERVAL '1' SECOND, INTERVAL '30' SECOND) as windowEnd, count("close") as closeCount, sum(cast("close" as float)) as closeSum, avg(cast("close" as float)) as closeAverage, min("close") as closeMin, max("close") as closeMax, sum(case when "close" > 14 then 1 else 0 end) as stockGreaterThan14 FROM stocksraw GROUP BY HOP(eventTimestamp, INTERVAL '1' SECOND, INTERVAL '30' SECOND)
  • 20. © 2021 Cloudera, Inc. All rights reserved. 20
  • 21. © 2021 Cloudera, Inc. All rights reserved. 21 Upcoming Events April 27 May 19