© 2020 Ververica
Introduction
● Caito Scherr
● Developer Advocate
● Ververica GmbH
● Portland, OR, USA
Demo credit: Marta Paes
Agenda
● Pulsar + Flink
● Where SQL comes in
● Demo: Pulsar + Flink SQL
Pulsar + Flink >> What is Flink?
● Stateful
● Stream processing engine
● Unified batch & streaming
Pulsar + Flink >> Why Pulsar + Flink?
● “Batch as a special case of streaming”
● “Stream as a unified view on data”
Pulsar + Flink >> Pulsar: Unified Storage
● Pub/Sub messaging layer (Streaming)
● Durable storage layer (Batch)
Pulsar + Flink >> Flink: Unified Processing
● Reuse code and logic
● Consistent semantics
● Simplify operations
● Mix historic and real-time
[Figure: timeline from past to future, with markers for the start of the stream and now, showing bounded queries over finite ranges and unbounded queries that keep running into the future]
Pulsar + Flink >> A Unified Data Stack
● Flink: Unified Processing Engine (Batch / Streaming)
● Pulsar: Unified Storage (Segments / Pub/Sub)
Pulsar + Flink >> Pulsar + Flink History
Flink 1.6+ (2018)
● Streaming Source/Sink Connectors
● Table Sink Connector
Flink 1.9+
● Pulsar Schema + Flink Catalog
● Table API/SQL as 1st class citizens
● Exactly-once Source
● At-least-once Sink
Flink 1.12
● Upserts
● DDL Computed Columns, Watermarks, Metadata
● End-to-end Exactly-once
● Key-shared Subscription Model
Pulsar + Flink >> Why Flink SQL?
Flink Runtime: Stateful Computations over Data Streams
● Stateful Stream Processing: Streams, State, Time
● Event-Driven Applications: Stateful Functions
● Streaming Analytics & ML: SQL, PyFlink, Tables
Why SQL:
● Focus on business logic, not implementation
● Mixed workloads (batch + streaming)
● Maximize developer speed and autonomy
Use cases:
● ML Feature Generation
● Unified Online/Offline Model Training
● E2E Streaming Analytics Pipelines
Where SQL Fits In >> A Regular SQL Engine
Input table (clicks):
user cTime url
Mary 12:00:00 https://…
Bob 12:00:00 https://…
Mary 12:00:02 https://…
Liz 12:00:03 https://…
Query:
SELECT user_id,
       COUNT(url) AS cnt
FROM clicks
GROUP BY user_id;
Result:
user cnt
Mary 2
Bob 1
● Take a snapshot when the query starts
● A row that was added after the query was started is not considered (here, Liz's click at 12:00:03)
● A final result is produced
● The query terminates
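The one-time query can be tried without any external system by inlining the snapshot as a bounded VALUES table. This is a minimal sketch, not the deck's demo code: the SET line assumes a Flink 1.13+ SQL client (older clients switch to batch mode differently), and the URLs are made-up placeholders because the slide elides them.

```sql
-- Assumption: Flink 1.13+ SQL client syntax for batch execution.
SET 'execution.runtime-mode' = 'batch';

-- The snapshot the query sees when it starts: three clicks.
-- Liz's click is left out on purpose, it arrives after the snapshot.
-- URLs are illustrative placeholders.
SELECT user_id,
       COUNT(url) AS cnt
FROM (VALUES
        ('Mary', '12:00:00', 'https://example.com/a'),
        ('Bob',  '12:00:00', 'https://example.com/b'),
        ('Mary', '12:00:02', 'https://example.com/c')
     ) AS clicks (user_id, cTime, url)
GROUP BY user_id;
```

Because the input is bounded, the query emits one final row per user (Mary with 2, Bob with 1) and then finishes.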
Where SQL Fits In >> A Streaming SQL Engine
Input stream (clicks):
user cTime url
Mary 12:00:00 https://…
Bob 12:00:00 https://…
Mary 12:00:02 https://…
Liz 12:00:03 https://…
Query:
SELECT user_id,
       COUNT(url) AS cnt
FROM clicks
GROUP BY user_id;
Continuously updated result:
user cnt
Mary 1 (later updated to Mary 2)
Bob 1
Liz 1
● Ingest all changes as they happen
● Continuously update the result
● The result is identical to the one-time query (at this point)
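To watch the continuous behaviour locally, the same aggregation can run over an unbounded table. The sketch below is an illustration under assumptions: it swaps the clicks stream for Flink's built-in datagen connector, uses a numeric user_id instead of names, and relies on Flink 1.13+ SQL client settings.

```sql
-- Assumption: Flink 1.13+ SQL client; show each change as its own row.
SET 'execution.runtime-mode' = 'streaming';
SET 'sql-client.execution.result-mode' = 'changelog';

-- Hypothetical unbounded stand-in for the clicks stream:
-- one random row per second, spread over three users.
CREATE TEMPORARY TABLE clicks (
  user_id INT,
  url     STRING
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.user_id.min' = '1',
  'fields.user_id.max' = '3',
  'fields.url.length' = '8'
);

-- Same query text as the one-time version; here it never terminates.
SELECT user_id,
       COUNT(url) AS cnt
FROM clicks
GROUP BY user_id;
```

The first click of a user shows up as an insert (+I); every later click retracts the previous count (-U) and emits the new one (+U), which is the "Mary 1, then Mary 2" behaviour on the slide.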
Where SQL Fits In >> Flink SQL In A Nutshell
● Standard SQL syntax and semantics (i.e. not a “SQL-flavor”)
● Unified APIs for batch and streaming
● Support for advanced time handling and operations (e.g. CDC, pattern matching)
Execution: Batch, Streaming (TPC-DS Coverage)
UDF Support: Python, Java, Scala
Formats: Debezium, and more
Native Connectors: Apache Kafka, Elasticsearch, FileSystems, JDBC, HBase, Kinesis, and more
Data Catalogs: Metastore, Postgres (JDBC)
Demo >> 1a. Twitter Firehose
Demo >> 1b. Data?
Demo >> 2. SQL Client + Pulsar
Catalog DDL:
CREATE CATALOG pulsar WITH (
  'type' = 'pulsar',
  'service-url' = 'pulsar://pulsar:6650',
  'admin-url' = 'http://pulsar:8080',
  'format' = 'json'
);
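Once the catalog is registered, the SQL client can browse Pulsar topics as tables. The statements below are a sketch of that exploration step. They assume the connector exposes each tenant/namespace pair as a database (so the default namespace appears as `public/default`) and that the demo's tweets topic already exists; neither is spelled out on the slide.

```sql
-- Switch to the catalog created by the Catalog DDL.
USE CATALOG pulsar;

-- Assumption: databases correspond to Pulsar tenant/namespace pairs.
SHOW DATABASES;
USE `public/default`;

-- Topics in the namespace should be listed as tables.
SHOW TABLES;

-- Inspect the schema derived from the topic (assumes a topic named tweets).
DESCRIBE tweets;
```

The backticks are needed because the database name contains a slash.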
Not cool. 👹
Demo >> 3. Get relevant timestamp
CREATE TABLE pulsar_tweets (
  publishTime TIMESTAMP(3) METADATA,
  WATERMARK FOR publishTime AS publishTime - INTERVAL '5' SECOND
) WITH (
  'connector' = 'pulsar',
  'topic' = 'persistent://public/default/tweets',
  'value.format' = 'json',
  'service-url' = 'pulsar://pulsar:6650',
  'admin-url' = 'http://pulsar:8080',
  'scan.startup.mode' = 'earliest-offset'
)
LIKE tweets;
● Read and use Pulsar message metadata (the publishTime METADATA column)
● Define the source connector (Pulsar) (the WITH clause)
● Derive schema from the original topic (LIKE tweets)
Demo >> 4. Windowed aggregation
Sink Table DDL:
CREATE TABLE pulsar_tweets_agg (
  tmstmp TIMESTAMP(3),
  tweet_cnt BIGINT
) WITH (
  'connector' = 'pulsar',
  'topic' = 'persistent://public/default/tweets_agg',
  'value.format' = 'json',
  'service-url' = 'pulsar://pulsar:6650',
  'admin-url' = 'http://pulsar:8080'
);
Continuous SQL Query:
INSERT INTO pulsar_tweets_agg
SELECT TUMBLE_START(publishTime, INTERVAL '10' SECOND) AS wStart,
       COUNT(id) AS tweet_cnt
FROM pulsar_tweets
GROUP BY TUMBLE(publishTime, INTERVAL '10' SECOND);
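The tumbling-window aggregation can also be tried without a Pulsar cluster. The following is a minimal sketch rather than the demo itself: fake_tweets is a hypothetical stand-in for pulsar_tweets, generated by Flink's built-in datagen connector, and its publishTime is simply the local clock at generation time instead of Pulsar message metadata.

```sql
-- Hypothetical stand-in for pulsar_tweets: random ids, five rows per second.
-- publishTime is a computed column here, not Pulsar metadata.
CREATE TEMPORARY TABLE fake_tweets (
  id BIGINT,
  publishTime AS LOCALTIMESTAMP,
  WATERMARK FOR publishTime AS publishTime - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5'
);

-- Count rows per 10-second tumbling window, as in the demo query.
SELECT TUMBLE_START(publishTime, INTERVAL '10' SECOND) AS wStart,
       TUMBLE_END(publishTime, INTERVAL '10' SECOND)   AS wEnd,
       COUNT(id) AS tweet_cnt
FROM fake_tweets
GROUP BY TUMBLE(publishTime, INTERVAL '10' SECOND);
```

Each window produces exactly one row, emitted once the watermark passes the window end, which with the 5-second watermark delay is roughly five seconds after the window closes.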
Demo >> 5. Tweet count in windows
What Next? >> Flink SQL Cookbook
Resources
● Flink Ahead: What Comes After Batch & Streaming: https://youtu.be/h5OYmy9Yx7Y
● Apache Pulsar as one Storage System for Real Time & Historical Data Analysis: https://medium.com/streamnative/apache-pulsar-as-one-storage-455222c59017
● Flink Table API & SQL: https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#operations
● Flink SQL Cookbook: https://github.com/ververica/flink-sql-cookbook
● When Flink & Pulsar Come Together: https://flink.apache.org/2019/05/03/pulsar-flink.html
● How to Query Pulsar Streams in Flink: https://flink.apache.org/news/2019/11/25/query-pulsar-streams-using-apache-flink.html
● What’s New in the Flink/Pulsar Connector: https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html
● Marta’s Demo: https://github.com/morsapaes/flink-sql-pulsar
Thank You!
● Pulsar Conference staff!!
● Marta Paes
@Caito_200_OK

Select Star: Unified Batch & Streaming with Flink SQL & Pulsar

  • 5. Introduction ● Caito Scherr ● Developer Advocate ● Ververica, GmbH ● Portland, OR, USA
  • 7. Introduction ● Demo credit: Marta Paes
  • 8. Agenda ● Pulsar + Flink ● Where SQL comes in ● Demo: Pulsar + Flink SQL
  • 11. Pulsar + Flink >> What is Flink? ● Stateful ● Stream processing engine ● Unified batch & streaming
  • 14. Pulsar + Flink >> Why Pulsar + Flink? ● “Batch as a special case of streaming” ● “Stream as a unified view on data”
  • 15. Pulsar + Flink >> Pulsar: Unified Storage ● Pub/Sub messaging layer (Streaming) ● Durable storage layer (Batch)
  • 16. Pulsar + Flink >> Flink: Unified Processing ● Reuse code and logic ● Consistent semantics ● Simplify operations ● Mix historic and real-time ● Pub/Sub messaging layer (Stream) ● Durable storage layer (Batch) [Diagram: stream timeline (start of the stream, past, now, future) with bounded and unbounded queries]
  • 17. Pulsar + Flink >> A Unified Data Stack ● Unified Processing Engine (Batch / Streaming) ● Unified Storage (Segments / Pub/Sub)
  • 20. Pulsar + Flink >> Pulsar + Flink History
      ● Flink 1.6+ (2018): Streaming Source/Sink Connectors; Table Sink Connector
      ● Flink 1.9+: Pulsar Schema + Flink Catalog; Table API/SQL as 1st class citizens; Exactly-once Source; At-least-once Sink
      ● Flink 1.12: Upserts; DDL (Computed Columns, Watermarks, Metadata); End-to-end Exactly-once; Key-shared Subscription Model
  • 21. Pulsar + Flink >> Why Flink SQL? [Diagram: Flink Runtime (Stateful Computations over Data Streams) ● Stateful Stream Processing: Streams, State, Time ● Event-Driven Applications: Stateful Functions ● Streaming Analytics & ML: SQL, PyFlink, Tables]
  • 22. Pulsar + Flink >> Why Flink SQL? ● Focus on business logic, not implementation ● Mixed workloads (batch + streaming) ● Maximize developer speed and autonomy. Use cases: ML Feature Generation; Unified Online/Offline Model Training; E2E Streaming Analytics Pipelines
  • 23. Where SQL Fits In >> A Regular SQL Engine
      Input table (clicks):
        user | cTime    | url
        Mary | 12:00:00 | https://…
        Bob  | 12:00:00 | https://…
        Mary | 12:00:02 | https://…
        Liz  | 12:00:03 | https://…
      Query:
        SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id;
      Result:
        user | cnt
        Mary | 2
        Bob  | 1
      ● Take a snapshot when the query starts
      ● A row that was added after the query was started is not considered
      ● A final result is produced
      ● The query terminates
  • 24. Where SQL Fits In >> A Streaming SQL Engine
      Input table (clicks):
        user | cTime    | url
        Mary | 12:00:00 | https://…
        Bob  | 12:00:00 | https://…
        Mary | 12:00:02 | https://…
        Liz  | 12:00:03 | https://…
      Query:
        SELECT user_id, COUNT(url) AS cnt FROM clicks GROUP BY user_id;
      Result (continuously updated):
        user | cnt
        Mary | 1 → 2
        Bob  | 1
        Liz  | 1
      ● Ingest all changes as they happen
      ● Continuously update the result
      ● The result is identical to the one-time query (at this point)
  • 25. Where SQL Fits In >> Flink SQL In A Nutshell
      ● Standard SQL syntax and semantics (i.e. not a “SQL-flavor”)
      ● Unified APIs for batch and streaming
      ● Support for advanced time handling and operations (e.g. CDC, pattern matching)
      UDF Support: Python, Java, Scala
      Execution: Batch, Streaming, TPC-DS Coverage
      Native Connectors + Formats: Apache Kafka, Elasticsearch, FileSystems, JDBC, HBase, Kinesis, Debezium
      Data Catalogs: Metastore, Postgres (JDBC)
  • 26. Demo >> 1a. Twitter Firehose
  • 28. Demo >> 2. SQL Client + Pulsar
      Catalog DDL:
        CREATE CATALOG pulsar WITH (
          'type' = 'pulsar',
          'service-url' = 'pulsar://pulsar:6650',
          'admin-url' = 'http://pulsar:8080',
          'format' = 'json'
        );
  • 29. Demo ● Not cool. 👹
  • 30. Demo >> 3. Get relevant timestamp
      CREATE TABLE pulsar_tweets (
        publishTime TIMESTAMP(3) METADATA,  -- Read and use Pulsar message metadata
        WATERMARK FOR publishTime AS publishTime - INTERVAL '5' SECOND
      ) WITH (  -- Define the source connector (Pulsar)
        'connector' = 'pulsar',
        'topic' = 'persistent://public/default/tweets',
        'value.format' = 'json',
        'service-url' = 'pulsar://pulsar:6650',
        'admin-url' = 'http://pulsar:8080',
        'scan.startup.mode' = 'earliest-offset'
      )
      LIKE tweets;  -- Derive schema from the original topic
  • 31. Demo >> 4. Windowed aggregation
      Sink Table DDL:
        CREATE TABLE pulsar_tweets_agg (
          tmstmp TIMESTAMP(3),
          tweet_cnt BIGINT
        ) WITH (
          'connector' = 'pulsar',
          'topic' = 'persistent://public/default/tweets_agg',
          'value.format' = 'json',
          'service-url' = 'pulsar://pulsar:6650',
          'admin-url' = 'http://pulsar:8080'
        );
      Continuous SQL Query:
        INSERT INTO pulsar_tweets_agg
        SELECT
          TUMBLE_START(publishTime, INTERVAL '10' SECOND) AS wStart,
          COUNT(id) AS tweet_cnt
        FROM pulsar_tweets
        GROUP BY TUMBLE(publishTime, INTERVAL '10' SECOND);
  • 32. Demo >> 5. Tweet count in windows
  • 33. What Next? >> Flink SQL Cookbook
  • 34. Resources
      ● Flink Ahead: What Comes After Batch & Streaming: https://youtu.be/h5OYmy9Yx7Y
      ● Apache Pulsar as one Storage System for Real Time & Historical Data Analysis: https://medium.com/streamnative/apache-pulsar-as-one-storage-455222c59017
      ● Flink Table API & SQL: https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#operations
      ● Flink SQL Cookbook: https://github.com/ververica/flink-sql-cookbook
      ● When Flink & Pulsar Come Together: https://flink.apache.org/2019/05/03/pulsar-flink.html
      ● How to Query Pulsar Streams in Flink: https://flink.apache.org/news/2019/11/25/query-pulsar-streams-using-apache-flink.html
      ● What’s New in the Flink/Pulsar Connector: https://flink.apache.org/2021/01/07/pulsar-flink-connector-270.html
      ● Marta’s Demo: https://github.com/morsapaes/flink-sql-pulsar
      @Caito_200_OK
  • 35. Thank You! ● Pulsar Conference staff!! ● Marta Paes ● @Caito_200_OK (QR code on slide: links & resources)
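
The query semantics on slides 23, 24, and 31 can be sketched in plain Python. This is only an illustration under stated assumptions, not Flink code and not part of the talk: the helper names (`batch_count`, `streaming_count`, `tumbling_counts`, `to_seconds`) are hypothetical, the click rows are the ones shown on slides 23 and 24 with their URLs left elided, and the windowing ignores watermarks and late data entirely.

```python
from collections import Counter

# Click rows from slides 23-24: (user, click time, url). URLs are elided on the slides.
CLICKS = [
    ("Mary", "12:00:00", "https://…"),
    ("Bob", "12:00:00", "https://…"),
    ("Mary", "12:00:02", "https://…"),
    ("Liz", "12:00:03", "https://…"),
]


def batch_count(snapshot):
    """One-time query: count clicks per user over a fixed snapshot, then stop."""
    return dict(Counter(user for user, _, _ in snapshot))


def streaming_count(rows):
    """Continuous query: emit an updated (user, cnt) row for every incoming row."""
    counts = Counter()
    for user, _, _ in rows:
        counts[user] += 1
        yield user, counts[user]


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def tumbling_counts(rows, size_seconds=10):
    """Count rows per tumbling window, keyed by window start (seconds since midnight).

    Hypothetical stand-in for GROUP BY TUMBLE(..., INTERVAL '10' SECOND);
    it does not model watermarks or out-of-order events.
    """
    counts = Counter()
    for _, ctime, _ in rows:
        ts = to_seconds(ctime)
        counts[ts - ts % size_seconds] += 1
    return dict(counts)


if __name__ == "__main__":
    snapshot = CLICKS[:3]  # assume the one-time query starts before Liz's click arrives
    print(batch_count(snapshot))          # {'Mary': 2, 'Bob': 1}
    print(list(streaming_count(CLICKS)))  # [('Mary', 1), ('Bob', 1), ('Mary', 2), ('Liz', 1)]
    print(tumbling_counts(CLICKS))        # {43200: 4}
```

After the first three rows the continuous query has emitted the same counts as the one-time query over the snapshot, which is the point slide 24 makes with "identical (at this point)"; the fourth row then updates the streaming result while the batch result stays final.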