SCALING A SAAS BACKEND 
WITH POSTGRESQL – A CASE STUDY 
PostgreSQL Conference Europe 
Madrid 2014-10-24 
Oliver Seemann - Bidmanagement GmbH 
oliver.seemann@adspert.net
Growing Data 
Gigabytes → Terabytes
We do productivity tools for 
advertisers
Significant amounts of data
Upper boundary: 
5M keywords × 365 days 
× 20 bigints/doubles 
≅ 300GB
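A quick sanity check of that bound (a sketch; assumes 8 bytes per bigint/double and ignores per-row and index overhead):

-- 5M keywords × 365 days × 20 values × 8 bytes, raw payload only
SELECT pg_size_pretty((5e6 * 365 * 20 * 8)::bigint); 
-- ≈ 272 GB, i.e. on the order of 300 GB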
OLTP / OLAP Duality
“Slow” OLAP data for daily batch-processing 
jobs
“Fast” OLTP data for 
human interaction
Initially separate databases 
Slow 
Data 
Fast 
Data
Data overlaps significantly 
Slow 
Data 
Fast 
Data
We went with a unified approach 
Slow 
Data 
Fast 
Data
Currently: 
7 machines running PG 9.3
Currently: 
~3 TB Data
Currently: 
largest table: ~100GB
How it all started…
It began as an experiment
Design by the book 
Customer: PK customer_id 
User: PK user_id, FK customer_id 
Account: PK account_id, FK customer_id 
UserAccountAccess: PK account_id (FK), user_id (FK) 
Campaign: PK campaign_id, FK account_id 
Adgroup: PK adgroup_id, FK campaign_id 
Keywords: PK adgroup_id (FK), keyword_id 
History: PK day, keyword_id (FK), adgroup_id (FK) 
Scenario: PK keyword_id (FK), adgroup_id (FK), factor
Soon tens of GB 
>100M records
All Accounts 
Account 1 – Rec 1 
Account 2 – Rec 1 
Account 1 – Rec 2 
Account 3 – Rec 1 
Account 2 – Rec 2 
Account 2 – Rec 3 
Account 1 – Rec 3 
Account 3 – Rec 2
~10-fold increase per level 
Account: >10 
Campaign: >1k 
Ad Group: >100K 
Keyword: >10M 
History: >100M
Partitioning, somehow 
Account 1 
Account 1 – Rec 1 
Account 1 – Rec 2 
Account 1 – Rec 3 
Account 2 
Account 2 – Rec 1 
Account 2 – Rec 2 
Account 2 – Rec 3 
Account 3 
Account 3 – Rec 1 
Account 3 – Rec 2 
Account 3 – Rec 3
Partitioning with inheritance 
Parent table with child tables 
CHECK constraints route SELECTs to the right child; 
INSERTs must target the children directly
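A minimal sketch of that mechanism, following the pattern in the PG manual (table and column names are illustrative, not our actual schema):

CREATE TABLE history (
    day        date   NOT NULL,
    keyword_id bigint NOT NULL,
    clicks     bigint
);

-- One child per month; the CHECK constraint lets the planner
-- skip irrelevant children (constraint_exclusion = partition).
CREATE TABLE history_2014_10 (
    CHECK (day >= DATE '2014-10-01' AND day < DATE '2014-11-01')
) INHERITS (history);

-- INSERTs on the parent must be redirected to the right child:
CREATE FUNCTION history_insert() RETURNS trigger AS $$
BEGIN
    -- real code would pick the child based on NEW.day
    INSERT INTO history_2014_10 VALUES (NEW.*);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER history_insert_trg
    BEFORE INSERT ON history
    FOR EACH ROW EXECUTE PROCEDURE history_insert();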
PG Partitioning is nifty – 
but not a match for our case
Our case: 
Little to no shared data between 
clients
Isolate accounts: 
one DB, or many DBs/schemas?
Both approaches: 
+ Good horizontal scaling
Both approaches: 
+ Good tool support 
(e.g. pg_dump/restore)
Partition into databases: 
+ Easy cloning 
CREATE DATABASE foo TEMPLATE bar;
Partition into databases: 
+ Stricter isolation (security)
Partition into databases: 
- Some Overhead
Partition into databases: 
- No direct references
Partition into schemas: 
+ More lightweight
Partition into schemas: 
+ Full references
Partition into schemas: 
- No easy cloning
Partition into schemas: 
- No cascading schemas
Now: 
Several thousand databases 
on five 1TB machines
Now: 
Plus main DB server pair 
with <10GB data
Setup 
Main DB Hosts: master + slave 
Account DB Hosts: standalone-0, standalone-1, standalone-2, standalone-3
No replication on 
account db hosts?
Performance Problems
Too many concurrent 
full table scans
From 300MB/s down to 30MB/s: 
more concurrent queries → longer query runtimes
Different apps, different access patterns 
Web apps: many small/fast queries 
Compute cluster: few very slow/big queries
Limit concurrent access with a counting semaphore 
Web apps: many small/fast queries 
Compute cluster: few very slow/big queries
Implement Semaphore using 
Advisory Locks
Simpler than setting up 
Zookeeper
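A minimal sketch of such a counting semaphore (the lock key 42 and the slot count 4 are invented for illustration):

-- Try to grab one of 4 slots; no row back means all slots are busy.
-- (42, n) is just an arbitrary two-int advisory lock key.
SELECT n AS slot
  FROM generate_series(1, 4) AS n
 WHERE pg_try_advisory_lock(42, n)
 LIMIT 1;

-- ... run the big query ...

-- Release the slot afterwards; advisory locks also vanish when
-- the session disconnects, so crashed jobs cannot leak slots.
SELECT pg_advisory_unlock(42, 1);  -- 1 = the slot acquired above

A worker that gets no row back simply sleeps and retries. Unfair, but that is no problem here.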
More performance problems: 
Bulk Inserts
Solved with common 
best practices:
COPY exclusively
Drop / Recreate indexes
COPY to new table + swap
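A sketch of that last pattern, combining it with index recreation (the file path and table names are invented):

BEGIN;
CREATE TABLE history_new (LIKE history INCLUDING DEFAULTS);
COPY history_new FROM '/path/to/bulk.csv' WITH (FORMAT csv);
-- Build indexes only after the data is in:
CREATE INDEX ON history_new (keyword_id, day);
ALTER TABLE history RENAME TO history_old;
ALTER TABLE history_new RENAME TO history;
DROP TABLE history_old;
COMMIT;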
Another problem:
CREATE DATABASE 
can take a while
Signup Delays 
Signup = Web App + CREATE DATABASE 
Can take 5-15 min
CREATE DATABASE 
performs a CHECKPOINT
Solution: 
Keep stock of 
spare databases
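One way to keep that stock (a sketch; database names are invented): pre-create databases from the template while write activity is low, then hand one out at signup with a cheap rename.

-- Done ahead of time, when the expensive CHECKPOINT doesn't hurt:
CREATE DATABASE spare_01 TEMPLATE account_template;

-- At signup, no CHECKPOINT needed anymore:
ALTER DATABASE spare_01 RENAME TO account_12345;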
In general: 
Very happy with our approach
Databases are tangible
Move DBs between hosts
Painless 9.0 -> 9.3 migration
Use schemas as partitions?
Would prevent 
regular schema usage
CREATE SCHEMA foo;          -- works 
CREATE SCHEMA foo.bar;      -- not possible 
CREATE SCHEMA foo.bar.baz;  -- not possible
Schemas are crucial for us
Versioning of database code
Grown to 
~15k SQL functions/views
Moved core algorithms 
from app to db
Previously: 
1. Read bulk raw data from DB 
2. Number crunching in app 
3. Write bulk results to DB
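Schematically, that app-side loop becomes a single set-based statement in the database (a purely illustrative query, not our actual algorithm):

-- Before: SELECT raw rows into the app, aggregate there,
-- INSERT the results back. After: one statement in the DB.
INSERT INTO results (keyword_id, avg_clicks)
SELECT keyword_id, avg(clicks)
  FROM history
 WHERE day >= current_date - 30
 GROUP BY keyword_id;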
4x-10x faster in DB 
2x-4x RAM reduction
SQL is harder to read & write
How to test it? 
App test suite 
goes a long way.
Different production stages 
Versioning with schemas
Every 4-8 weeks: 
CREATE SCHEMA version_%d;
Assign each version a stage: 
unstable -> testing -> stable
Stage    | App    | Schema     | COUNT(account) 
unstable | v22.4  | version_22 | 0% - 2% 
testing  | v21.13 | version_21 | 1% - 50% 
stable   | v20.19 | version_20 | 50% - 100%
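A sketch of how a version schema is rolled out and selected (object names are illustrative):

-- Every 4-8 weeks, all functions/views are created in a fresh schema:
CREATE SCHEMA version_22;
CREATE VIEW version_22.keyword_stats AS
    SELECT keyword_id, sum(clicks) AS clicks
      FROM history
     GROUP BY keyword_id;

-- App release v22.x selects its matching code version per session:
SET search_path TO version_22, public;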
Watchdogs on key metrics 
alert on suspicious behaviour
Schemas are VERY important
Takeaway: 
Databases can be aptly 
used as partitions
Takeaway: 
So can schemas
Takeaway: 
Schemas can be used 
for versioning
Thanks for listening 
Questions?
Managing Schema Changes
ORM 
Can’t live with it, 
can’t live without it
PG Wishes

Speaker Notes

  1. Hi, I'm Oliver. I'm a software developer, currently heading the development team at Bidmanagement GmbH in Berlin.
  2. I'm going to talk about how we're using PostgreSQL as the main datastore in our system. None of the solutions or approaches are … But by using some PG features in a non-standard way, certain problems can be solved quite elegantly. And seeing that this works very well for some, it may be helpful to some of you when you have similar problems now or in the future.
  3. Mostly in the area of search engine marketing, which today is mostly AdWords; however, we also support other networks, for example Yandex. Our flagship product is a fully automatic bid management solution. Every day we change the bids on tens of millions of keywords and learn from the effects to steer campaign performance towards goals configured by the user. The philosophy is to take the mind-numbing number-crunching tasks away from the user, because a computer can do them better and much more efficiently, especially when you have thousands or millions of objects to analyze.
  4. Replicate the campaign structure, provide a reporting interface. I don't want to bore you with the technical details of how search engine marketing works, so let's just say we store a lot of ints and floats, and especially time series of those. To get an idea of the ballpark we're working in, let's have a look at the upper boundary.
  5. Ballpark estimates, upper bound: time series data, hierarchical data; clicks, impressions and also lots of statistical projections with confidence intervals. Of course most of those values are actually zero and can be omitted when storing the data, so it may actually only be 5 or 10% of that. However, we have thousands of accounts, most of which only have a few hundred MB to a few GB. But the occasional outlier with 100GB must work just as well.
  6. The different kinds of data we store can be largely separated into two groups.
  7. One internal (batch processing), one external (web app access).
  8. Mostly the time series data. So we had to either duplicate lots of data and synchronize changes, or integrate both into one and make sure different parts of the system don't get in each other's way.
  9. We opted for the latter because it makes for a simpler system. We just have to make sure … So far it has turned out well and we haven't looked back.
  10. Let’s have a peek into the past in order to understand how the system evolved.
  11. Our CTO is a mathematician. Skunkworks project.
  12. PostgreSQL supports partitioning via inheritance [insert scheme]. Use CHECK constraints to tell the query planner where to look. Cannot insert into the parent table, must insert into a child table. A lot of effort goes into application logic. Tried it on one table, weren't convinced.
  13. The database or schema as a logical unit is a central part of PG with good tool support. Easy to add, easy to drop. Can be backed up, restored, moved between machines. Very tangible from an ops view.
  14. Main DB still replicated to enable quick failover; here we can't afford extended downtime.
  15. Can make availability / cost trade-offs here.
  16. Big cheap HDDs. Bottleneck is Gigabit Ethernet.
  17. Capacity doubled, cost reduced 40%. The more servers, the faster the restore. Gbit Ethernet on the backup server is the limiting factor.
  18. Not really feasible: we rewrite lots of data every day (crude approach, but simpler code). Complex administration (no dedicated DBA).
  19. From sequential reads to random reads. The cause of the problem is only on one side …
  20. Webapp queries with humans waiting are quite fast. The problematic queries are done by the analysis jobs: frequent full table scans, queries with huge results. Need a way to synchronize queries and control concurrency. Could use a connection pooler, or an external synchronization mechanism, e.g. Zookeeper.
  21. Webapp queries with humans waiting are quite fast. The problematic queries are done by the analysis jobs: frequent full table scans, queries with huge results. Need a way to synchronize queries and control concurrency. Could use a connection pooler, or an external synchronization mechanism, e.g. Zookeeper.
  22. Very simple mechanism. Unfair, but that's no problem.
  23. However, it's starting to spread, with a tendency to be misused.
  24. An ALTER INDEX foo DISABLE would come in handy.
  25. We added a self-service signup: a 2-minute process to add an AdWords account to the system. OAuth → User Info → Optimization Bootstrap. Biggest problem: CREATE DATABASE can take several minutes, depending on the current amount of write activity. A more granular checkpoint (per db) would be cool?
  26. Restrict checkpoints to databases?
  27. So all of the drawbacks that came up could be worked around, more or less elegantly. In total, we're very happy with the way the approach has turned out. Especially the scalability and isolation aspects of it have us very pleased. So much, in fact, that we also used it for a second product, and it feels very natural.
  28. Databases as a unit of abstraction on a client or account level are very much tangible, which makes them comfortable both from a development and an operations point of view. They can be connected to, renamed, cloned, copied, moved, backed up and restored. When we remove a customer from the system we just dump the account databases and put them on S3 Glacier for some amount of time, instead of keeping the 100GB in the system.
  29. To manage capacity. Currently this is still a manual process because it's not required very often. Making it automatic would require, amongst other things, a means to briefly keep the app from connecting. Does "ALTER DATABASE set CONNECTION LIMIT 0" work?
  30. Moving between hosts means we can also move between PG versions. We upgraded from 9.0 to 9.3 without much effort by installing both on all machines and then dumping the databases one after another from 9.0 and restoring into 9.3, over a period of 2-3 months. Memory is not a problem, as shared_buffers is relatively low (a few gigabytes), most memory is used by page and buffer cache, and all files continue to exist only once. We used 9.3 in development for a few months. Btw, I only remember one case where we needed to adapt code for 9.3, something with the order of a query result. Otherwise the upgrade was a breeze.
  31. But even though this works very well with databases as partitions: would schemas have worked the same way?
  32. The biggest problem we would have had is that we wouldn't be able to use schemas for other purposes anymore.
  33. This has become necessary.
  34. It has grown quite a bit because we started with lots of Perl code and a “dumb” data store
  35. Up to 100GB memory in step 2
  36. Only works when we can limit the concurrent batch jobs per machine (advisory locks).
  37. But it's not all sunshine and rainbows with that approach, of course, because SQL is much harder to write and to read than procedural code. The notion that "code is a liability" has some truth to it. So the more we move into the database, the harder it becomes to manage. Python is just much more tangible and malleable than SQL. We have to compromise between easy to debug & test, and performance.
  38. But given a bit of time and quiet, one can accomplish much with little code in SQL. Testing of individual snippets can be done by calling them from the application code, as part of an integrated test suite that has test data and expects certain results. Covering most of the code in tests is not the problem, but covering most data scenarios is much more work (div by zero sneaks in from time to time). Those cases are postponed to …
  39. The SQL code decides how to spend millions of euros in advertising money every month. We can't afford to deploy any code changes (app or db) to all account databases at the same time, so we use schemas to manage multiple versions of the optimization code.
  40. The schema is filled with all the objects from a set of source .sql files. The application software version that uses the db and the schema version are identical; the app sets the search path. We don't use minor versions for fixes in the db code.
  41. What we do is assign each version a stage: unstable, testing, stable (borrowed from Debian). And we can also assign individual client accounts a stage.
  42. Typically test accounts, or one with a pathological case that is fixed by the new release. Those are closely monitored (performance, errors, log files, debugging data). Brand new unstable: few, selected (test-)accounts. Testing stage for incremental roll-out.