Unlocking Proprietary Data
with PostgreSQL Foreign
Data Wrappers
Pat Patterson
Principal Developer Evangelist
ppatterson@salesforce.com
@metadaddy
Agenda

 ▪ Foreign Data Wrappers

 ▪ Writing FDWs in C

 ▪ Multicorn

 ▪ Database.com FDW for PostgreSQL

 ▪ FDW in action
Why Foreign Data Wrappers?

 ▪ External data sources look like local tables!
   – Other SQL databases
      • MySQL, Oracle, SQL Server, etc.
   – NoSQL databases
      • CouchDB, Redis, etc.
   – Files
   – LDAP
   – Web services
      • Twitter!
Why Foreign Data Wrappers?

 ▪ Make the database do the work
  – SELECT syntax
     • DISTINCT, ORDER BY, etc.
  – Functions
     • COUNT(), MIN(), MAX(), etc.
  – JOIN external data to internal tables
  – Use standard apps and libraries for data analysis and
    reporting
Foreign Data Wrappers

 ▪ 2003: SQL Management of External Data (SQL/MED)
 ▪ 2011: PostgreSQL 9.1 implementation
   – Read-only
   – SELECT-clause optimization
   – WHERE-clause push-down
      • Minimize data requested from external source

 ▪ Future improvements
   – JOIN push-down
      • Where two foreign tables are on the same server
   – Support cursors
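
To make the push-down idea concrete, here is a minimal Python sketch, not taken from the talk, of how a wrapper might turn simple WHERE conditions into a filter on the remote request so that fewer rows cross the wire. The `Qual` tuple, `SUPPORTED_OPERATORS`, `quote_literal`, and `build_remote_query` are hypothetical names invented for illustration. Conditions the sketch cannot translate are simply not forwarded, on the assumption that the local database re-checks them.

```python
from collections import namedtuple

# Hypothetical stand-in for a condition the planner hands to the wrapper.
Qual = namedtuple("Qual", ["field_name", "operator", "value"])

# Operators this sketch is willing to forward to the remote source.
SUPPORTED_OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def quote_literal(value):
    """Render a Python value as a SQL-style literal (strings and numbers only)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def build_remote_query(table, columns, quals):
    """Build a remote query that requests only the needed columns and rows."""
    pushed = [
        "%s %s %s" % (q.field_name, q.operator, quote_literal(q.value))
        for q in quals
        if q.operator in SUPPORTED_OPERATORS
    ]
    query = "SELECT %s FROM %s" % (", ".join(columns), table)
    if pushed:
        query += " WHERE " + " AND ".join(pushed)
    return query


query = build_remote_query(
    "Contact",
    ["firstname", "lastname"],
    [Qual("lastname", "=", "O'Hara"), Qual("firstname", "~~", "P%")],
)
print(query)
```

In this run the equality test is forwarded while the `~~` condition is not, so the remote source would return a superset of the final result.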
FDWs in PostgreSQL

 ▪ ‘Compiled language’ (C) interface
 ▪ Implement a set of callbacks
  typedef struct FdwRoutine
  {
      NodeTag type;
      /* These functions are required. */
      GetForeignRelSize_function GetForeignRelSize;
      GetForeignPaths_function GetForeignPaths;
      GetForeignPlan_function GetForeignPlan;
      ExplainForeignScan_function ExplainForeignScan;
      BeginForeignScan_function BeginForeignScan;
      IterateForeignScan_function IterateForeignScan;
      ReScanForeignScan_function ReScanForeignScan;
      EndForeignScan_function EndForeignScan;
      /* These functions are optional. */
      AnalyzeForeignTable_function AnalyzeForeignTable;
  } FdwRoutine;
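
The scan callbacks are invoked in a fixed order: begin once, iterate until no rows remain, then end. The following Python model is only an analogy for that control flow, not PostgreSQL code. `ToyScan` and `run_scan` are hypothetical names, the sample rows are invented, and the planning callbacks such as `GetForeignPaths` are left out.

```python
class ToyScan:
    """Toy model of the executor-facing scan callbacks (names are illustrative)."""

    def __init__(self, rows):
        self.rows = rows
        self.position = None
        self.calls = []

    def begin_foreign_scan(self):
        # Open the remote resource and reset the cursor.
        self.calls.append("begin")
        self.position = 0

    def iterate_foreign_scan(self):
        # Return one row per call, or None once the source is exhausted.
        self.calls.append("iterate")
        if self.position >= len(self.rows):
            return None
        row = self.rows[self.position]
        self.position += 1
        return row

    def rescan_foreign_scan(self):
        # Restart from the first row.
        self.calls.append("rescan")
        self.position = 0

    def end_foreign_scan(self):
        # Release whatever begin_foreign_scan acquired.
        self.calls.append("end")
        self.position = None


def run_scan(scan):
    """Drive one full scan: begin, iterate until exhausted, end."""
    scan.begin_foreign_scan()
    results = []
    while True:
        row = scan.iterate_foreign_scan()
        if row is None:
            break
        results.append(row)
    scan.end_foreign_scan()
    return results


scan = ToyScan([("Pat", "Patterson"), ("Ada", "Lovelace")])
print(run_scan(scan))
print(scan.calls)
```

Even this toy version hints at why the C route takes effort: each callback has to keep its own state consistent across calls.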
FDWs in PostgreSQL

 ▪ A lot of work!
     • CouchDB FDW
     • https://github.com/ZhengYang/couchdb_fdw/
     • couchdb_fdw.c is more than 1700 LoC
Multicorn

 ▪ http://multicorn.org/
 ▪ PostgreSQL 9.1+ extension
 ▪ Python framework for FDWs
 ▪ Implement two methods…
Multicorn
from multicorn import ForeignDataWrapper

class ConstantForeignDataWrapper(ForeignDataWrapper):

    def __init__(self, options, columns):
        super(ConstantForeignDataWrapper,
              self).__init__(options, columns)
        # Remember the columns of the foreign table
        self.columns = columns

    def execute(self, quals, columns):
        # Return 20 rows; each value is "<column name> <row index>"
        for index in range(20):
            line = {}
            for column_name in self.columns:
                line[column_name] = '%s %s' % (column_name, index)
            yield line
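
Because `execute()` is a plain generator, its behaviour can be checked without a database. The sketch below rests on stated assumptions: `ForeignDataWrapper` here is a hypothetical stub standing in for the real Multicorn base class so that the snippet runs on its own, and the empty `options` dict and the column names `test` and `test2` are made up. A plain list of names is passed for `columns` purely for illustration.

```python
class ForeignDataWrapper(object):
    """Hypothetical stub for multicorn.ForeignDataWrapper, for local testing only."""

    def __init__(self, options, columns):
        self.options = options


class ConstantForeignDataWrapper(ForeignDataWrapper):

    def __init__(self, options, columns):
        super(ConstantForeignDataWrapper, self).__init__(options, columns)
        self.columns = columns

    def execute(self, quals, columns):
        # Ignore quals and emit 20 synthetic rows, one dict per row.
        for index in range(20):
            line = {}
            for column_name in self.columns:
                line[column_name] = '%s %s' % (column_name, index)
            yield line


wrapper = ConstantForeignDataWrapper({}, ["test", "test2"])
rows = list(wrapper.execute([], ["test", "test2"]))
print(len(rows))
print(rows[0])
```

Each yielded dict maps a column name to its value, which is the row shape the wrapper on the slide produces.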
Database.com FDW for PostgreSQL

 ▪ OAuth login to Database.com / Force.com
   – Refresh on token expiry
 ▪ Force.com REST API
   – SOQL query
      • SELECT firstname, lastname FROM Contact

 ▪ Request thread puts records in a Queue; the execute()
  method gets them from the Queue
 ▪ JSON parsing – skip embedded metadata
 ▪ < 250 lines of code
Demo
Conclusion

 ▪ Foreign Data Wrappers make the whole world look like
  tables!
 ▪ Writing FDWs in C is hard!
   – Or, at least, time-consuming!
 ▪ Writing FDWs in Python via Multicorn is easy!
   – Or, at least, quick!
 ▪ Try it for yourself!
Resources

 ▪ http://wiki.postgresql.org/wiki/SQL/MED

 ▪ http://wiki.postgresql.org/wiki/Foreign_data_wrappers

 ▪ http://multicorn.org/

 ▪ https://github.com/metadaddy-sfdc/Database.com-FDW-for-PostgreSQL