Getting Started with PL/Proxy

Presentation from PgEast 2011

Published in: Technology

  1. 1. Getting Started with PL/Proxy Peter Eisentraut peter@eisentraut.org F-Secure Corporation PostgreSQL Conference East 2011 CC-BY
  2. 2. Concept • a database partitioning system implemented as a procedural language • “sharding”/horizontal partitioning • PostgreSQL’s No(t-only)SQL solution
  3. 3. Concept [diagram: several applications connect to a single frontend, which forwards calls to partitions 1 through 4]
  4. 4. Areas of Application • high write load • (high read load) • allow for some “eventual consistency” • have reasonable partitioning keys • use/plan to use server-side functions
  5. 5. Example Have (from the dellstore2 example database): CREATE TABLE products ( prod_id serial PRIMARY KEY , category integer NOT NULL , title varchar (50) NOT NULL , actor varchar (50) NOT NULL , price numeric (12 ,2) NOT NULL , special smallint , common_prod_id integer NOT NULL ); INSERT INTO products VALUES (...) ; UPDATE products SET ... WHERE ...; DELETE FROM products WHERE ...; plus various queries
  6. 6. Installation • Download: http://plproxy.projects.postgresql.org, Deb, RPM, . . . • Create language: psql -d dellstore2 -f ...../plproxy.sql
  7. 7. Backend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN INSERT INTO products ( category , title , actor , price , special , common_prod_id ) VALUES ( p_category , p_title , p_actor , p_price , p_special , p_common_prod_id ) ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  8. 8. Backend Functions II CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN UPDATE products SET price = p_price WHERE prod_id = p_prod_id ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  9. 9. Backend Functions III CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE cnt int ; BEGIN DELETE FROM products WHERE title = p_title ; GET DIAGNOSTICS cnt = ROW_COUNT ; RETURN cnt ; END ; $$ ;
  10. 10. Frontend Functions I CREATE FUNCTION insert_product ( p_category int , p_title varchar , p_actor varchar , p_price numeric , p_special smallint , p_common_prod_id int ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON hashtext ( p_title ) ; $$ ; CREATE FUNCTION update_product_price ( p_prod_id int , p_price numeric ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON ALL ; $$ ;
  11. 11. Frontend Functions II CREATE FUNCTION delete_product_by_title ( p_title varchar ) RETURNS int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON hashtext ( p_title ) ; $$ ;
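From the application's point of view, the frontend functions are called like any other function, and PL/Proxy forwards each call to the backend function with the same name and signature. A minimal usage sketch follows; all argument values are made up for illustration.

```sql
-- Run against the frontend database (dellstore2). All values are illustrative.

-- Routed to a single partition, chosen by hashtext(p_title).
-- The explicit cast is needed because an integer literal is not
-- implicitly converted to smallint when the function call is resolved.
SELECT * FROM insert_product(1, 'Some Title', 'Some Actor', 9.99, 0::smallint, 1);

-- Runs on all partitions and returns one row count per partition.
SELECT * FROM update_product_price(1, 12.99);

-- Routed to the partition that the title hashes to.
SELECT delete_product_by_title('Some Title');
```

Because update_product_price runs on every partition, most of the returned counts will be 0. That is the price of partitioning on the title rather than on prod_id.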
  12. 12. Frontend Query Functions I CREATE FUNCTION get_product_price ( p_prod_id int ) RETURNS SETOF numeric LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON ALL ; SELECT price FROM products WHERE prod_id = p_prod_id ; $$ ;
  13. 13. Frontend Query Functions II CREATE FUNCTION get_products_by_category ( p_category int ) RETURNS SETOF products LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON ALL ; SELECT * FROM products WHERE category = p_category ; $$ ;
  14. 14. Unpartitioned Small Tables CREATE FUNCTION insert_category ( p_categoryname varchar ) RETURNS SETOF int LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON 0; $$ ;
  15. 15. Which Hash Key? • natural keys (names, descriptions, UUIDs) • not serials (Consider using fewer “ID” fields.) • single columns • group sensibly to allow joins on backend
  16. 16. Set Basic Parameters • number of partitions (2^n), e. g. 8 • host names, e. g. • frontend: dbfe • backends: dbbe1, . . . , dbbe8 • database names, e. g. • frontend: dellstore2 • backends: store01, . . . , store08 • user names, e. g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
  17. 17. Set Basic Parameters • number of partitions (2^n), e. g. 8 • host names, e. g. • frontend: dbfe • backends: dbbe1, . . . , dbbe8 (or start at 0?) • database names, e. g. • frontend: dellstore2 • backends: store01, . . . , store08 (or start at 0?) • user names, e. g. storeapp • hardware: • frontend: lots of memory, normal disk • backends: full-sized database server
  18. 18. Configuration CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ ... $$ ; CREATE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ ... $$ ;
  19. 19. get_cluster_partitions Simplistic approach: CREATE FUNCTION plproxy . get_cluster_partitions ( cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = 'dellstore_cluster' THEN RETURN NEXT 'dbname=store01 host=dbbe1' ; RETURN NEXT 'dbname=store02 host=dbbe2' ; ... RETURN NEXT 'dbname=store08 host=dbbe8' ; RETURN ; END IF ; RAISE EXCEPTION 'Unknown cluster' ; END ; $$ ;
  20. 20. get_cluster_version Simplistic approach: CREATE FUNCTION plproxy . get_cluster_version ( cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ BEGIN IF cluster_name = 'dellstore_cluster' THEN RETURN 1; END IF ; RAISE EXCEPTION 'Unknown cluster' ; END ; $$ ;
  21. 21. get_cluster_config CREATE OR REPLACE FUNCTION plproxy . get_cluster_config ( IN cluster_name text , OUT key text , OUT val text ) RETURNS SETOF record LANGUAGE plpgsql AS $$ BEGIN -- same config for all clusters key := 'connection_lifetime' ; val := 30*60; -- 30 m RETURN NEXT ; RETURN ; END ; $$ ;
  22. 22. Table-Driven Configuration I CREATE TABLE plproxy . partitions ( cluster_name text NOT NULL , host text NOT NULL , port text NOT NULL , dbname text NOT NULL , PRIMARY KEY ( cluster_name , dbname ) ); INSERT INTO plproxy . partitions VALUES ( 'dellstore_cluster' , 'dbbe1' , 5432 , 'store01' ) , ( 'dellstore_cluster' , 'dbbe2' , 5432 , 'store02' ) , ... ( 'dellstore_cluster' , 'dbbe8' , 5432 , 'store08' ) ;
  23. 23. Table-Driven Configuration II CREATE TABLE plproxy . cluster_users ( cluster_name text NOT NULL , remote_user text NOT NULL , local_user text NOT NULL , PRIMARY KEY ( cluster_name , remote_user , local_user ) ); INSERT INTO plproxy . cluster_users VALUES ( 'dellstore_cluster' , 'storeapp' , 'storeapp' ) ;
  24. 24. Table-Driven Configuration III CREATE TABLE plproxy . remote_passwords ( host text NOT NULL , port text NOT NULL , dbname text NOT NULL , remote_user text NOT NULL , password text , PRIMARY KEY ( host , port , dbname , remote_user ) ); INSERT INTO plproxy . remote_passwords VALUES ( 'dbbe1' , 5432 , 'store01' , 'storeapp' , 'Thu1Ued0' ) , ... -- or use .pgpass ?
  25. 25. Table-Driven Configuration IV CREATE TABLE plproxy . cluster_version ( id int PRIMARY KEY ); INSERT INTO plproxy . cluster_version VALUES (1) ; GRANT SELECT ON plproxy . cluster_version TO PUBLIC ; /* extra credit : write trigger that changes the version when one of the other tables changes */
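The "extra credit" trigger could look roughly like the following. This is only a sketch under the table layout from the preceding slides; the function and trigger names are hypothetical, and it assumes plproxy.cluster_version holds exactly one row.

```sql
-- Hypothetical helper: bump the single version row whenever the
-- cluster configuration changes, so that PL/Proxy reloads it.
CREATE FUNCTION plproxy.bump_cluster_version() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    UPDATE plproxy.cluster_version SET id = id + 1;
    RETURN NULL;  -- the result is ignored for AFTER ... FOR EACH STATEMENT triggers
END;
$$;

-- One statement-level trigger per configuration table.
CREATE TRIGGER partitions_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.partitions
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER cluster_users_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.cluster_users
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER remote_passwords_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.remote_passwords
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();
```

Statement-level triggers are used so that a multi-row change bumps the version once rather than once per row.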
  26. 26. Table-Driven Configuration V CREATE OR REPLACE FUNCTION plproxy . get_cluster_partitions ( p_cluster_name text ) RETURNS SETOF text LANGUAGE plpgsql SECURITY DEFINER AS $$ DECLARE r record ; BEGIN FOR r IN SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname || ' user=' || remote_user || ' password=' || password AS dsn FROM plproxy . partitions NATURAL JOIN plproxy . cluster_users NATURAL JOIN plproxy . remote_passwords WHERE cluster_name = p_cluster_name AND local_user = session_user ORDER BY dbname -- important LOOP RETURN NEXT r.dsn ; END LOOP ; IF NOT found THEN RAISE EXCEPTION 'no such cluster: %' , p_cluster_name ; END IF ; RETURN ; END ; $$ ;
  27. 27. Table-Driven Configuration VI CREATE FUNCTION plproxy . get_cluster_version ( p_cluster_name text ) RETURNS int LANGUAGE plpgsql AS $$ DECLARE ret int ; BEGIN SELECT INTO ret id FROM plproxy . cluster_version ; RETURN ret ; END ; $$ ;
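Before any proxy function is created, the configuration functions can be called directly on the frontend to check what PL/Proxy will see. A small sketch, assuming the tables and functions from the table-driven configuration slides are in place:

```sql
-- Run on the frontend as the application user (storeapp in this example),
-- because get_cluster_partitions filters on session_user.

-- Should return one connection string per partition, ordered by dbname.
SELECT * FROM plproxy.get_cluster_partitions('dellstore_cluster');

-- Should return the current configuration version.
SELECT plproxy.get_cluster_version('dellstore_cluster');
```

The order of the returned connection strings matters, since the position in the list is the partition number that hash values are mapped to.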
  28. 28. SQL/MED Configuration CREATE SERVER dellstore_cluster FOREIGN DATA WRAPPER plproxy OPTIONS ( connection_lifetime '1800' , p0 'dbname=store01 host=dbbe1' , p1 'dbname=store02 host=dbbe2' , ... p7 'dbname=store08 host=dbbe8' ); CREATE USER MAPPING FOR storeapp SERVER dellstore_cluster OPTIONS ( user 'storeapp' , password 'sekret' ) ; GRANT USAGE ON SERVER dellstore_cluster TO storeapp ;
  29. 29. Hash Functions RUN ON hashtext ( somecolumn ) ; • want a fast, uniform hash function • typically use hashtext • problem: implementation might change • possible solution: https://github.com/petere/pgvihash
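To see which partition a given key would land on, the hash can be computed by hand. The sketch below assumes 8 partitions and that the partition number is taken from the low bits of the hash value (hash & 7), which matches the bitmask used on the repartitioning slide; the titles are made up.

```sql
-- Illustrative only: map a few made-up titles to partition numbers 0..7.
SELECT title,
       hashtext(title)     AS hash,
       hashtext(title) & 7 AS partition_no
FROM (VALUES ('Some Title'), ('Another Title')) AS t(title);
```

hashtext can return negative values, but the bitwise AND still yields a number between 0 and 7. The concrete hash values depend on the PostgreSQL version, which is exactly the "implementation might change" problem noted on the slide.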
  30. 30. Sequences shard 1: ALTER SEQUENCE products_prod_id_seq MINVALUE 1 MAXVALUE 100000000 START 1; shard 2: ALTER SEQUENCE products_prod_id_seq MINVALUE 100000001 MAXVALUE 200000000 START 100000001; etc.
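Rather than typing the ranges by hand, the statements for all shards can be generated. The following sketch assumes 8 shards with 100,000,000 IDs each, as above, and needs PostgreSQL 9.1 or later for format(). It also adds a RESTART clause, which is not in the statements above, so that the current sequence value is moved into the new range as well.

```sql
-- Generate one ALTER SEQUENCE statement per shard (shards numbered 1..8).
-- The output is meant to be run by hand on the matching backend.
SELECT shard,
       format(
         'ALTER SEQUENCE products_prod_id_seq MINVALUE %1$s MAXVALUE %2$s START %1$s RESTART %1$s;',
         (shard - 1) * 100000000 + 1,   -- lower bound of this shard's range
         shard * 100000000              -- upper bound of this shard's range
       ) AS ddl
FROM generate_series(1, 8) AS shard;
```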
  31. 31. Aggregates Example: count all products Backend: CREATE FUNCTION count_products () RETURNS bigint LANGUAGE SQL STABLE AS $$SELECT count (*) FROM products$$ ; Frontend: CREATE FUNCTION count_products () RETURNS SETOF bigint LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON ALL ; $$ ; SELECT sum ( x ) AS count FROM count_products () AS t(x);
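Counts can simply be summed, but an average cannot be computed by averaging the per-partition averages. One way around this, sketched below, is to have each backend return a sum and a count and to combine them on the frontend. The function name price_sum_count and its OUT parameters are hypothetical.

```sql
-- Backend (on every partition): partial aggregate for this partition.
CREATE FUNCTION price_sum_count(OUT total numeric, OUT cnt bigint)
LANGUAGE sql STABLE AS $$
    SELECT sum(price), count(*) FROM products
$$;

-- Frontend: collect one (total, cnt) row from each partition.
CREATE FUNCTION price_sum_count(OUT total numeric, OUT cnt bigint)
RETURNS SETOF record
LANGUAGE plproxy AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

-- Frontend query: combine the partial results into the overall average.
-- nullif avoids a division by zero when all partitions are empty.
SELECT sum(total) / nullif(sum(cnt), 0) AS avg_price
FROM price_sum_count();
```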
  32. 32. Dynamic Queries I a. k. a. “cheating” ;-) Frontend: CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plproxy AS $$ CLUSTER 'dellstore_cluster' ; RUN ON ALL ; $$ ; Backend: CREATE FUNCTION execute_query ( sql text ) RETURNS SETOF RECORD LANGUAGE plpgsql AS $$ BEGIN RETURN QUERY EXECUTE sql ; END ; $$ ;
  33. 33. Dynamic Queries II SELECT * FROM execute_query ( 'SELECT title , price FROM products' ) AS ( title varchar , price numeric ) ; SELECT category , sum ( sum_price ) FROM execute_query ( 'SELECT category , sum ( price ) FROM products GROUP BY category' ) AS ( category int , sum_price numeric ) GROUP BY category ;
  34. 34. Repartitioning • changing partitioning key is extremely cumbersome • adding partitions is somewhat cumbersome, e. g., to split shard 0: COPY ( SELECT * FROM products WHERE hashtext ( title :: text ) & 15 <> 0) TO somewhere ; DELETE FROM products WHERE hashtext ( title :: text ) & 15 <> 0; Better start out with enough partitions!
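Before copying and deleting anything, it can help to preview how the rows of a shard would be distributed after the split. A minimal sketch, assuming the cluster grows from 8 to 16 partitions and the query is run on the backend that currently holds shard 0:

```sql
-- Run on the backend database that currently holds shard 0 of 8.
-- Shows how many rows would stay (0) and how many would move (8)
-- once the hash is masked with 15 instead of 7.
SELECT hashtext(title::text) & 15 AS new_partition_no,
       count(*)                   AS row_count
FROM products
GROUP BY 1
ORDER BY 1;
```

Since every row on shard 0 satisfies hashtext(title) & 7 = 0, the only values that can appear are 0 and 8. Rows with 8 are the ones selected by the `<> 0` condition in the COPY and DELETE statements above.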
  35. 35. PgBouncer [diagram: several applications connect to the frontend, which reaches each of partitions 1 through 4 through its own PgBouncer] Use pool_mode = statement
  36. 36. Development Issues • foreign keys • notifications • hash key check constraints • testing (pgTAP), no validator
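A hash key check constraint guards against rows ending up on the wrong partition, for example after a mistake in a backend function or a manual insert. The sketch below assumes 8 partitions hashed on title, as in the earlier examples; the constraint name is hypothetical and the constant on the right-hand side differs per backend.

```sql
-- On the backend holding partition 0 of 8; use = 1 on partition 1, and so on.
ALTER TABLE products
    ADD CONSTRAINT products_partition_check
    CHECK ((hashtext(title::text) & 7) = 0);
```

The constraint has to be dropped and recreated with a new mask when the number of partitions changes, and it relies on hashtext producing the same values across upgrades.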
  37. 37. Administration • centralized logging • distributed shell (dsh) • query canceling/timeouts • access control, firewalling • deployment
  38. 38. High Availability Frontend: • multiple frontends (DNS, load balancer?) • replicate partition configuration (Slony, Bucardo, WAL) • Heartbeat, UCARP, etc. Backend: • replicate backend shards individually (Slony, WAL, DRBD) • use partition configuration to configure load spreading or failover
  39. 39. Advanced Topics • generic insert, update, delete functions • frontend joins • backend joins • finding balance between function interface and dynamic queries • arrays, SPLIT BY • use for remote database calls • cross-shard calls • SQL/MED (foreign table) integration
  40. 40. The End