SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
Confidential - Do Not Share or Distribute
Apache Arrow Flight SQL
A universal standard for high performance
data transfers from databases
Alex Merced
Developer Advocate @ Dremio
Confidential - Do Not Share or Distribute
Agenda
● What if we could have for our data transfers…
● What options do we have?
● What is Apache Arrow? Is it enough?
● What is Arrow Flight? Is it enough?
● What is Arrow Flight SQL? Is it enough?
● What are the Arrow Flight SQL ODBC & JDBC
Drivers?
Confidential - Do Not Share or Distribute
What if we could have for our data transfers…
✓ Standardized APIs
✓ No need for many different drivers
✓ High performance transfers
✓ Parallel data transfers (1:many, many:many)
✓ Easy to implement on server side, easy to use on client side
Confidential - Do Not Share or Distribute
● Built in 1992 & 1997, respectively
○ Inherently row-based
○ Prevented the many:many of
client-to-database-APIs
○ But resulted in one:many of
API-to-database-protocols
Why don’t ODBC & JDBC fit in today’s analytics world?
Today
Yesterday
● Today: heterogeneous big data workloads
● Most tools involved with OLAP workloads
today are columnar
○ But there is no columnar transfer
standard
Confidential - Do Not Share or Distribute
Increasingly, columnar → columnar is the standard
Yesterday
Today
Confidential - Do Not Share or Distribute
● Custom protocols
○ One per server
○ Implemented individually by each client
What options do we have to keep things columnar?
Confidential - Do Not Share or Distribute
Pick your poison
Enter: Apache Arrow Flight
JDBC/ODBC Custom protocol
Standardize client interface ✅ ❌
Standardize client database access
interface
✅ ❌
Standardize server interface ❌ ❌
Prevent unnecessary serialization ❌ ✅
Confidential - Do Not Share or Distribute
What if?
Enter: Arrow Flight
Confidential - Do Not Share or Distribute
Before we go there:
Apache Arrow
Confidential - Do Not Share or Distribute
Introduction to Apache Arrow
● An in-memory columnar format
○ Includes libraries for working with the format
■ E.g., computation engine, IPC, serialization /
deserialization from file formats.
○ Support for many languages (12 currently)
● Co-created by Dremio and Wes McKinney
● Rapid adoption, popularity, and capability growth
Confidential - Do Not Share or Distribute
● Arrow is primarily geared towards operations on data in a single-node
○ No specification for cross-network transfers
■ Within a cluster
■ Client-server
11
Isn’t Arrow enough?
Enter: Arrow Flight
Confidential - Do Not Share or Distribute
12
What is Arrow Flight?
● A general-purpose client-server framework to simplify high performance transport of
large datasets over network interfaces
● Designed for data in the modern world
○ Column-oriented, rather than row-oriented
○ Arrow’s columnar format is designed for high compressibility and large numbers of
rows
● Makes including the client-side in distributed computing easier
● Enables parallel data transfers
Confidential - Do Not Share or Distribute
13
Parallel data transfers with Arrow Flight
Client
Node 2
Node N
Node 1
CPU
memory
CPU
memory
CPU
memory
CPU
memory
DoGet(<ticket>)
DoGet(<ticket>)
DoGet(<ticket>)
Confidential - Do Not Share or Distribute
Isn’t Arrow Flight Enough?
JDBC/ODBC Custom protocols Arrow Flight
Standardize client interface ✅ ❌ ✅
Standardize client database access
interface
✅ ❌ ❌
Standardize server interface ❌ ❌ ✅
Prevent unnecessary serialization ❌ ✅ ✅
Confidential - Do Not Share or Distribute
Isn’t Arrow Flight Enough?
● Client sends an arbitrary byte stream, server sends a result
○ The byte stream only has meaning for a particular server that’s chosen to
support it
○ Example – Dremio interprets the byte stream to be a UTF-8 encoded SQL
query string
● Flight is meant to serve any tabular data, not databases in particular
○ Catalog information is not part of Arrow Flight’s design
● Haven’t we seen these problems before?
○ ODBC and JDBC standardize database activities (e.g., query execution,
catalog access)
Enter: Arrow Flight SQL
Confidential - Do Not Share or Distribute
16
● Client-server protocol for interacting with SQL databases built on Arrow Flight
○ Allows databases to use Arrow Flight as the transport protocol
○ Leverage the performance of Arrow and Flight for database access
● Set of commands to standardize a SQL interface on Flight:
○ Query execution
○ Prepared statements
○ Database catalog metadata (tables, columns, data types).
○ SQL syntax capabilities
● Language and database-agnostic
○ A single set of client libraries to connect to any Flight SQL server
○ Clients can talk to any database implementing the protocol
○ No need for a client-side driver per database
What is Arrow Flight SQL?
Confidential - Do Not Share or Distribute
Client Database
2 - FlightInfo<Schema, Endpoints>
1 - GetFlightInfo(GetTables)
4 - Arrow record batches
3 - DoGet(<ticket>)
6 - FlightInfo<Schema, Endpoints>
5 - GetFlightInfo(CommandStatementQuery)
7 - DoGet(<ticket>)
Memory
Listing Tables:
Querying a
table:
Memory
8 - Arrow record batches
Confidential - Do Not Share or Distribute
Queries:
● CommandStatementQuery
● CommandStatementUpdate
Prepared statements:
● ActionClosePreparedStatementRequest
● ActionCreatePreparedStatementRequest
● CommandPreparedStatementQuery
● CommandPreparedStatementUpdate
Metadata requests:
● CommandGetCatalogs
● CommandGetCrossReference
● CommandGetDbSchemas
● CommandGetExportedKeys
● CommandGetImportedKeys
● CommandGetPrimaryKeys
● CommandGetSqlInfo
● CommandGetTables
● CommandGetTableTypes
Arrow Flight SQL Commands
Confidential - Do Not Share or Distribute
Isn’t Arrow Flight SQL Enough?
JDBC/ODBC Custom protocols Arrow Flight Arrow Flight SQL
Standardize client interface ✅ ❌ ✅ ✅
Standardize client database
access interface
✅ ❌ ❌ ✅
Standardize server interface ❌ ❌ ✅ ✅
Prevent unnecessary serialization ❌ ✅ ✅ ✅
Confidential - Do Not Share or Distribute
● Arrow Flight SQL is fairly new, will take time to be widely adopted by existing
client tools (especially the enterprise BI ones)
○ Most BI tools already support ODBC or JDBC
Isn’t Arrow Flight SQL Enough?
Enter: Arrow Flight SQL ODBC & JDBC Drivers
Confidential - Do Not Share or Distribute
21
Arrow Flight SQL ODBC & JDBC Drivers
● ODBC & JDBC Drivers built on top of Flight SQL client libraries
○ A single driver to connect to any Flight SQL server
○ Supports arbitrary server-side options as connection properties
● The drivers will work with any ODBC/JDBC client tool without any code changes
● Completely open source
Confidential - Do Not Share or Distribute
22
Traditional ODBC/JDBC vs Arrow Flight SQL ODBC/JDBC
Traditional ODBC/JDBC Arrow Flight SQL ODBC/JDBC
Driver Management
Performance
install and manage a driver
for each database
row-based transfer, so not
great for large amounts of
data
column-based transfer, so
benefit from compression of
data over the wire
install and manage
a single driver for all
databases
Confidential - Do Not Share or Distribute
When wouldn’t I use Arrow Flight ODBC/JDBC Drivers?
● Generally preferable to have native Flight SQL applications
○ Easier to harness future features like multiple endpoints
○ Better performance if your client is column-based
Client is row-based
Client is column-based
Network
gRPC
Arrow
Arrow Flight
Arrow Flight SQL
JDBC
Legacy
Client
SQL Native
Client
Non-SQL
Client
Confidential - Do Not Share or Distribute
What if we could have for our data transfers…
✓ Standardized APIs
○ Arrow Flight for arbitrary tabular transfers, Arrow Flight SQL for transfers from databases
✓ No need for many different drivers
○ Both client and server side APIs are standardized
✓ High performance transfers
○ Transfers via highly optimized Arrow format
✓ Parallel data transfers (1:many, many:many)
○ Lays the foundation for parallel data transfers
✓ Easy to implement on server side, easy to use on client side
○ Server-side implementation is as easy as implementing a few RPC request handlers
○ A single set of client libraries to connect to any Flight SQL server
Confidential - Do Not Share or Distribute
Q&A
Confidential - Do Not Share or Distribute
Thanks!
jason@dremio.com

Weitere ähnliche Inhalte

Ähnlich wie Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for High-Performance Data Transfers from Data

Presto: Query Anything - Data Engineer’s perspective
Presto: Query Anything - Data Engineer’s perspectivePresto: Query Anything - Data Engineer’s perspective
Presto: Query Anything - Data Engineer’s perspectiveAlluxio, Inc.
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDBI Goo Lee
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksDatabricks
 
Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data PlatformShu-Jeng Hsieh
 
introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro servicesSpyros Lambrinidis
 
Bootify Yyour App from Zero to Hero
Bootify Yyour App from Zero to HeroBootify Yyour App from Zero to Hero
Bootify Yyour App from Zero to HeroEPAM
 
Introduction to Datasource V2 API
Introduction to Datasource V2 APIIntroduction to Datasource V2 API
Introduction to Datasource V2 APIdatamantra
 
Introdcution to Azure
Introdcution to AzureIntrodcution to Azure
Introdcution to AzureOmid Vahdaty
 
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIs
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIsDefrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIs
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIsRestlet
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraAnant Corporation
 
Normalizing x pages web development
Normalizing x pages web development Normalizing x pages web development
Normalizing x pages web development Shean McManus
 
Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...DataWorks Summit
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flinkdatamantra
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개Ha-Yang(White) Moon
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauSam Palani
 
Ghost Environment
Ghost EnvironmentGhost Environment
Ghost EnvironmentPratipD
 
Provisioning infrastructure to AWS using Terraform – Exove
Provisioning infrastructure to AWS using Terraform – ExoveProvisioning infrastructure to AWS using Terraform – Exove
Provisioning infrastructure to AWS using Terraform – ExoveExove
 

Ähnlich wie Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for High-Performance Data Transfers from Data (20)

Presto: Query Anything - Data Engineer’s perspective
Presto: Query Anything - Data Engineer’s perspectivePresto: Query Anything - Data Engineer’s perspective
Presto: Query Anything - Data Engineer’s perspective
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
 
Serverless Data Platform
Serverless Data PlatformServerless Data Platform
Serverless Data Platform
 
introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro services
 
Vue d'ensemble Dremio
Vue d'ensemble DremioVue d'ensemble Dremio
Vue d'ensemble Dremio
 
Bootify Yyour App from Zero to Hero
Bootify Yyour App from Zero to HeroBootify Yyour App from Zero to Hero
Bootify Yyour App from Zero to Hero
 
Introduction to Datasource V2 API
Introduction to Datasource V2 APIIntroduction to Datasource V2 API
Introduction to Datasource V2 API
 
Introdcution to Azure
Introdcution to AzureIntrodcution to Azure
Introdcution to Azure
 
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIs
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIsDefrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIs
Defrag 2014 - Blend Web IDEs, Open Source and PaaS to Create and Deploy APIs
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
 
Normalizing x pages web development
Normalizing x pages web development Normalizing x pages web development
Normalizing x pages web development
 
Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...Present and future of unified, portable, and efficient data processing with A...
Present and future of unified, portable, and efficient data processing with A...
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
 
Red Hat Storage Roadmap
Red Hat Storage RoadmapRed Hat Storage Roadmap
Red Hat Storage Roadmap
 
Red Hat Storage Roadmap
Red Hat Storage RoadmapRed Hat Storage Roadmap
Red Hat Storage Roadmap
 
MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개MongoDB 4.0 새로운 기능 소개
MongoDB 4.0 새로운 기능 소개
 
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & TableauBig Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
Big Data Analytics on the Cloud Oracle Applications AWS Redshift & Tableau
 
Ghost Environment
Ghost EnvironmentGhost Environment
Ghost Environment
 
Provisioning infrastructure to AWS using Terraform – Exove
Provisioning infrastructure to AWS using Terraform – ExoveProvisioning infrastructure to AWS using Terraform – Exove
Provisioning infrastructure to AWS using Terraform – Exove
 

Mehr von Anant Corporation

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137Anant Corporation
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfAnant Corporation
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotAnant Corporation
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...Anant Corporation
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAnant Corporation
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapAnant Corporation
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowAnant Corporation
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksAnant Corporation
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Anant Corporation
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & FutureAnant Corporation
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Anant Corporation
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackAnant Corporation
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergAnant Corporation
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsAnant Corporation
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Anant Corporation
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessAnant Corporation
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsAnant Corporation
 

Mehr von Anant Corporation (20)

QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
YugabyteDB Developer Tools
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer Tools
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with Airflow
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
CL 121
CL 121CL 121
CL 121
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise ConsciousnessData Engineer's Lunch #60: Series - Developing Enterprise Consciousness
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness
 
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data PlatformsData Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
Data Engineer's Lunch #81: Reverse ETL Tools for Modern Data Platforms
 

Kürzlich hochgeladen

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...HyderabadDolls
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...HyderabadDolls
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...gragchanchal546
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...gajnagarg
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...HyderabadDolls
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdfkhraisr
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样wsppdmt
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...gajnagarg
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numberssuginr1
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...kumargunjan9515
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.pptibrahimabdi22
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...HyderabadDolls
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...Bertram Ludäscher
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...SOFTTECHHUB
 

Kürzlich hochgeladen (20)

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
Sealdah % High Class Call Girls Kolkata - 450+ Call Girl Cash Payment 8005736...
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Indore [ 7014168258 ] Call Me For Genuine Models We...
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
Nirala Nagar / Cheap Call Girls In Lucknow Phone No 9548273370 Elite Escort S...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
TrafficWave Generator Will Instantly drive targeted and engaging traffic back...
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 

Data Engineer's Lunch #77: Apache Arrow Flight SQL: A Universal Standard for High-Performance Data Transfers from Data

  • 1. Confidential - Do Not Share or Distribute Apache Arrow Flight SQL A universal standard for high performance data transfers from databases Alex Merced Developer Advocate @ Dremio
  • 2. Confidential - Do Not Share or Distribute Agenda ● What if we could have for our data transfers… ● What options do we have? ● What is Apache Arrow? Is it enough? ● What is Arrow Flight? Is it enough? ● What is Arrow Flight SQL? Is it enough? ● What are the Arrow Flight SQL ODBC & JDBC Drivers?
  • 3. Confidential - Do Not Share or Distribute What if we could have for our data transfers… ✓ Standardized APIs ✓ No need for many different drivers ✓ High performance transfers ✓ Parallel data transfers (1:many, many:many) ✓ Easy to implement on server side, easy to use on client side
  • 4. Confidential - Do Not Share or Distribute ● Built in 1992 & 1997, respectively ○ Inherently row-based ○ Prevented the many:many of client-to-database-APIs ○ But resulted in one:many of API-to-database-protocols Why don’t ODBC & JDBC fit in today’s analytics world? Today Yesterday ● Today: heterogeneous big data workloads ● Most tools involved with OLAP workloads today are columnar ○ But there is no columnar transfer standard
  • 5. Confidential - Do Not Share or Distribute Increasingly, columnar → columnar is the standard Yesterday Today
  • 6. Confidential - Do Not Share or Distribute ● Custom protocols ○ One per server ○ Implemented individually by each client What options do we have to keep things columnar?
  • 7. Confidential - Do Not Share or Distribute Pick your poison Enter: Apache Arrow Flight JDBC/ODBC Custom protocol Standardize client interface ✅ ❌ Standardize client database access interface ✅ ❌ Standardize server interface ❌ ❌ Prevent unnecessary serialization ❌ ✅
  • 8. Confidential - Do Not Share or Distribute What if? Enter: Arrow Flight
  • 9. Confidential - Do Not Share or Distribute Before we go there: Apache Arrow
  • 10. Confidential - Do Not Share or Distribute Introduction to Apache Arrow ● An in-memory columnar format ○ Includes libraries for working with the format ■ E.g., computation engine, IPC, serialization / deserialization from file formats. ○ Support for many languages (12 currently) ● Co-created by Dremio and Wes McKinney ● Rapid adoption, popularity, and capability growth
  • 11. Confidential - Do Not Share or Distribute ● Arrow is primarily geared towards operations on data in a single-node ○ No specification for cross-network transfers ■ Within a cluster ■ Client-server 11 Isn’t Arrow enough? Enter: Arrow Flight
  • 12. Confidential - Do Not Share or Distribute 12 What is Arrow Flight? ● A general-purpose client-server framework to simplify high performance transport of large datasets over network interfaces ● Designed for data in the modern world ○ Column-oriented, rather than row-oriented ○ Arrow’s columnar format is designed for high compressibility and large numbers of rows ● Makes including the client-side in distributed computing easier ● Enables parallel data transfers
  • 13. Confidential - Do Not Share or Distribute 13 Parallel data transfers with Arrow Flight Client Node 2 Node N Node 1 CPU memory CPU memory CPU memory CPU memory DoGet(<ticket>) DoGet(<ticket>) DoGet(<ticket>)
  • 14. Confidential - Do Not Share or Distribute Isn’t Arrow Flight Enough? JDBC/ODBC Custom protocols Arrow Flight Standardize client interface ✅ ❌ ✅ Standardize client database access interface ✅ ❌ ❌ Standardize server interface ❌ ❌ ✅ Prevent unnecessary serialization ❌ ✅ ✅
  • 15. Confidential - Do Not Share or Distribute Isn’t Arrow Flight Enough? ● Client sends an arbitrary byte stream, server sends a result ○ The byte stream only has meaning for a particular server that’s chosen to support it ○ Example – Dremio interprets the byte stream to be a UTF-8 encoded SQL query string ● Flight is meant to serve any tabular data, not databases in particular ○ Catalog information is not part of Arrow Flight’s design ● Haven’t we seen these problems before? ○ ODBC and JDBC standardize database activities (e.g., query execution, catalog access) Enter: Arrow Flight SQL
  • 16. Confidential - Do Not Share or Distribute 16 ● Client-server protocol for interacting with SQL databases built on Arrow Flight ○ Allows databases to use Arrow Flight as the transport protocol ○ Leverage the performance of Arrow and Flight for database access ● Set of commands to standardize a SQL interface on Flight: ○ Query execution ○ Prepared statements ○ Database catalog metadata (tables, columns, data types). ○ SQL syntax capabilities ● Language and database-agnostic ○ A single set of client libraries to connect to any Flight SQL server ○ Clients can talk to any database implementing the protocol ○ No need for a client-side driver per database What is Arrow Flight SQL?
  • 17. Confidential - Do Not Share or Distribute Client Database 2 - FlightInfo<Schema, Endpoints> 1 - GetFlightInfo(GetTables) 4 - Arrow record batches 3 - DoGet(<ticket>) 6 - FlightInfo<Schema, Endpoints> 5 - GetFlightInfo(CommandStatementQuery) 7 - DoGet(<ticket>) Memory Listing Tables: Querying a table: Memory 8 - Arrow record batches
  • 18. Confidential - Do Not Share or Distribute Queries: ● CommandStatementQuery ● CommandStatementUpdate Prepared statements: ● ActionClosePreparedStatementRequest ● ActionCreatePreparedStatementRequest ● CommandPreparedStatementQuery ● CommandPreparedStatementUpdate Metadata requests: ● CommandGetCatalogs ● CommandGetCrossReference ● CommandGetDbSchemas ● CommandGetExportedKeys ● CommandGetImportedKeys ● CommandGetPrimaryKeys ● CommandGetSqlInfo ● CommandGetTables ● CommandGetTableTypes Arrow Flight SQL Commands
  • 19. Confidential - Do Not Share or Distribute Isn’t Arrow Flight SQL Enough? JDBC/ODBC Custom protocols Arrow Flight Arrow Flight SQL Standardize client interface ✅ ❌ ✅ ✅ Standardize client database access interface ✅ ❌ ❌ ✅ Standardize server interface ❌ ❌ ✅ ✅ Prevent unnecessary serialization ❌ ✅ ✅ ✅
  • 20. Confidential - Do Not Share or Distribute ● Arrow Flight SQL is fairly new, will take time to be widely adopted by existing client tools (especially the enterprise BI ones) ○ Most BI tools already support ODBC or JDBC Isn’t Arrow Flight SQL Enough? Enter: Arrow Flight SQL ODBC & JDBC Drivers
  • 21. Confidential - Do Not Share or Distribute 21 Arrow Flight SQL ODBC & JDBC Drivers ● ODBC & JDBC Drivers built on top of Flight SQL client libraries ○ A single driver to connect to any Flight SQL server ○ Supports arbitrary server-side options as connection properties ● The drivers will work with any ODBC/JDBC client tool without any code changes ● Completely open source
  • 22. Confidential - Do Not Share or Distribute 22 Traditional ODBC/JDBC vs Arrow Flight SQL ODBC/JDBC Traditional ODBC/JDBC Arrow Flight SQL ODBC/JDBC Driver Management Performance install and manage a driver for each database row-based transfer, so not great for large amounts of data column-based transfer, so benefit from compression of data over the wire install and manage a single driver for all databases
  • 23. Confidential - Do Not Share or Distribute When wouldn’t I use Arrow Flight ODBC/JDBC Drivers? ● Generally preferable to have native Flight SQL applications ○ Easier to harness future features like multiple endpoints ○ Better performance if your client is column-based Client is row-based Client is column-based Network gRPC Arrow Arrow Flight Arrow Flight SQL JDBC Legacy Client SQL Native Client Non-SQL Client
  • 24. Confidential - Do Not Share or Distribute What if we could have for our data transfers… ✓ Standardized APIs ○ Arrow Flight for arbitrary tabular transfers, Arrow Flight SQL for transfers from databases ✓ No need for many different drivers ○ Both client and server side APIs are standardized ✓ High performance transfers ○ Transfers via highly optimized Arrow format ✓ Parallel data transfers (1:many, many:many) ○ Lays the foundation for parallel data transfers ✓ Easy to implement on server side, easy to use on client side ○ Server-side implementation is as easy as implementing a few RPC request handlers ○ A single set of client libraries to connect to any Flight SQL server
  • 25. Confidential - Do Not Share or Distribute Q&A
  • 26. Confidential - Do Not Share or Distribute Thanks! jason@dremio.com