PgDog is funded and coming to a database near you

(pgdog.dev)

153 points | by levkk 2 hours ago

24 comments

codegeek 29 minutes ago
"Why Us" => "I ran Postgres at Instacart, where we scaled the company 5x in April of 2020. The biggest problem we had was making Postgres serve 100,000s of grocery delivery orders per minute"
Couldn't be a better why us :)
yabones 1 hour ago
I'm curious how this might help with our biggest downtime-causer with postgres, which is major version upgrades. Poolers do a great job for failover and load balancing, but we consistently need ~10-20 minutes of downtime once or twice a year to do upgrades. Logical replication between old->new versions could probably help, but it would still require flipping everything over to the new cluster without partial writes or anything silly. Anybody have experience with this?
[-]
- tempest_ 1 hour ago
  We use logical replication and a pause / swap in pgbouncer for ~5s of paused (but not failed) writes.
  This is for DBs that are ~1-1.5TB but doesnt have a huge amount of churn/qps
  Effectively what is described here https://www.pgedge.com/blog/always-online-or-bust-zero-downt...
- tux3 1 hour ago
  Logical replication is how this is typically done. If you have some infra-as-code setup, you create a new cluster with identical settings except for the major version, import the schema, start copying data from a read-replica running the old version, stop accepting writes from the old version (downtime starts), sync the sequence numbers, and point your services to the new cluster (downtime ends).
  If you use something like CloudNativePG they automate parts of the process with cli tools and declarative syntax. Otherwise you take the time to figure it out by hand. It might sound complicated, but just practice on your staging DB, and if all goes well you do the same procedure in prod.
  Edit: Apparently Postgres 19 has a patch for one-shot logical replication of sequences! https://www.depesz.com/2025/11/11/waiting-for-postgresql-19-...
  [-]
  - paulryanrogers 17 minutes ago
    RDS has blue green deployments that can help. It was rough at first, though seems they worked out the kinks.
- tschellenbach 1 hour ago
  Logical replication solves this. You roll the cluster, downtime is minimal. like 60s maybe.
  [-]
  - briffle 32 minutes ago
    Logical replication needs a special 'upgrade' use case that will automate most of its pain points away. I understand why DDL does not replicate, and that you may want to replicate to a data warehouse that only needs some columns, etc, but there should be a case just for upgrading that handles all DDL, sequences all existing everything, and just works...
- znpy 47 minutes ago
  It's weird that PostgreSQL still doesn't have a proper, open source, general multi-master implementation.
  At this point i wonder if i'll ever see that.
  [-]
  - jjice 45 minutes ago
    Do other RDBMSs have this? I genuinely have no clue. I've been fortunate enough to be able to get away with one primary and multiple secondaries at my largest usage of Postgres. Multi-master is the kind of thing I am fully out of my depth on, so I'm curious if there's a well defined path for implementation here or what.
    [-]
    - aynyc 29 minutes ago
      MySQL has Galera cluster for that.
  - timacles 17 minutes ago
    It has been tried many times. Good luck to pgdog, but there’s a reason these projects don’t stick.
    Multi master, from even a conceptual perspective, is incredibly complicated. Databases, transactions, consistency, parallelism are all very complicated.
    It’s something that always seems promising at the start but as soon as maintenance and long term improvements enter the picture(ie integrating new Postgres versions), the complexity becomes too much.
- boxed 1 hour ago
  Seconded. Coming from MySQL this is a huge regression that makes Postgres look like something from the 80s. I still wonder why this isn't seen as the absolutely highest priority.
  [-]
  - jeltz 1 hour ago
    I have not ran MySQL for some years but it at least used to have exactly the same issue. Upgrading a database with MySQL can take a long time if you have many tables. The main difference is only really that PostgreSQL does it with a separate tool, pg_upgrade, while MySQL does it as part of the main binary.
    For both MySQL and PostgreSQL you will need to use some kind of logical upgrades if you want no downtime.
    [-]
    - boxed 30 minutes ago
      They don't change the on-disk structure all the time though...
    - tomnipotent 1 hour ago
      MySQL has advocated for decades spinning up a replica with the upgraded version, waiting for it to catch up to master before promoting it to the new master. You can do the same thing with Postgres.
      [-]
      - jeltz 1 hour ago
        Exactly, MySQL and PostgreSQL are the same here. Maybe one is a bit faster than the other at doing major version upgrades but the behaviours are quite similar.
  - Blackthorn 1 hour ago
    Probably because it's an open source project and apparently none of its users cared about this feature enough to develop it or fund it.
    [-]
    - jeltz 43 minutes ago
      It is also a bit tricky tradeoff. You do not want to be stuck with the same data format forever. So databases like MySQL and PostgreSQL need a downtime when doing a major version upgrade. They both try to keep it short, usually seconds, but minutes can happen in either database.
Ozzie_osman 1 hour ago
```
  We sharded over 20 TB that we know about.
```
This is probably a typo, right? 20TB isn't that big. I would imagine they've sharded a lot more than that
[-]
- singron 3 minutes ago
  If your working set is 20 TB, then it's pretty big. Each database has its own mix of hot/cold data, so it's impossible to compare without more information. A better measure might be IOPS. RDS has fairly low maximum IOPS unless you spend a lot more for provisioned IOPS or use Aurora.
- rbranson 1 hour ago
  You are correct. As a point of comparison: almost ten years ago at Segment we had a single Aurora PostgreSQL instance with ~50T of data, it was used to index potential identity data in a much larger corpus of files stored in S3.
- GiorgioG 1 hour ago
  For a vast majority of use cases 20TB is positively enormous.
  [-]
  - mplanchard 57 minutes ago
    RDS caps out at 64 TB unless you use Aurora, so 20 TB is totally manageable without sharding.
  - jeltz 58 minutes ago
    Yes. But for most workloads it is not much for PostgreSQL. You often will not have to shard at all.
  - returningfory2 1 hour ago
    This product is for Postgres deployments that are so large they need to be sharded. For these use cases, I think 20TB is about normal.
  - happyopossum 1 hour ago
    Sure, but 20TB in “the only database you need” is mere hours or minutes worth of data for many workflows.
  - tingletech 1 hour ago
    that article seems to suggest 20TB total over the dozen deployments in prod.
valorzard 17 minutes ago
I've seen a couple of these "distributed" postgres extensions.
My question is, has any of them been talked about being upstreamed to postgres itself? Or, adding a custom built in feature to postgres itself?
[-]
- levkk 7 minutes ago
  This is not an extension, it's a proxy! Very different. You can deploy it anywhere already without having to wait for upstreaming or your cloud provider adding support for it. It's one of the two reasons why we built it this way, the other being performance (it's much faster to do this in the proxy than inside Postgres).
welder 38 minutes ago
Three real-world issues I've run into recently with PgBouncer + Postgres are:
1. pool exhaustion from idle connections inside open long-running transactions
2. SQLAlchemy's client-side pool using dead connections that PgBouncer had already killed, causing periodic request errors
3. Some tasks have to bypass PgBouncer when they use SET or prepared statements
I've already sharded large datasets at the application layer, but looks like PgDog solves the above problems for any future work?
[-]
- tempest_ 27 minutes ago
  SQLA async is a bit of a struggle with pgbouncer.
  I had to disable application pooling as it was causing read only transactions I could couldnt pin down the cause.
tschellenbach 1 hour ago
PgDog, Neki, multigres, awesome to see. And yes this is the main issue with postgres. Well this and not having index hints, looking forward to 19
[-]
- paulryanrogers 6 minutes ago
  pg_hint_plan isn't in core yet pretty competent when you need to override planner.
- welder 48 minutes ago
  Don't forget the original PgBouncer. Hard to setup, but with the help of AI these days it's easier to configure.
kjuulh 2 hours ago
I tried out PgDog a while ago, but couldn't find a good way of handling the config except for having this users / pgdog toml file, which makes it a bit awkward to handle in kubernetes where we often do multi-tenancy in postgres - or rather having many databases on the same instance(s), and have them come and go at will.
Also had an issue with it because it cached authentication requests when doing passthrough it seems, I'd changed the roles password, but it kept using the old one, which was no bueno ;).
PgDog seems to make more sense when you really care about a few databases that need massive scale, rather than a simple proxy in front of postgres. I'll keep following the development though, it is much needed in this space, postgres can use all the investment it can get to get it past the single machine scale that it excels at currently.
[-]
- apt-get 1 minute ago
  We successfully did this with pgdog at $JOB using our own "controller" -- the same service that handles deploying new instances of our application (instancing an argoCD Application that fires Crossplane DB creation, making new Deployments of bricks, etc) will also, at the end of that process, scan the cluster for Database CRDs, use those to generate a new pgdog.toml + users.toml, update the Secrets in the cluster, enable maintenance mode on all pgdog pods, do a live config reload on each of them, then disable maintenance mode (this is to make the change atomic between all the pgdog instances). Downtime there is about 2-3 seconds and all it does is make new SQL requests from existing clients wait, it doesn't break the connection or anything.
- maherbeg 1 hour ago
  Happy to chat about this, but we use the AWS secrets manager flowing into External Secrets Operator to generate a pgdog_users.toml. We then kick off a workflow to refresh things, but our rate of change here is much smaller than a super dynamic multi-tenant system.
  You could also build a watcher side car that watches for changes of the pgdog_users.toml and have pgdog refresh itself then too with this combination. We thought about that but prefer to control the reloads for our needs.
chrisvenum 1 hour ago
I am trying to gain a basic understanding of this: Right now I have a 4TB DB on one large box. Is the idea that using a proxy tool like PGDog I could spin up 8 smaller boxes handling ~500GB each and then one medium box for the proxy?
Right now I have a project that has very heavy write traffic from multiple services and a web app that reads from this. We are starting to hit the point where no amount of indexing, query optimisation, caching or box upgrades is helping us. We are looking at maybe moving the bulk of the static data to clickhouse to reduce the DB size but I would love to hear if PgDog or other kind of sharding could be useful for this use case.
[-]
- levkk 5 minutes ago
  > 8 smaller boxes handling ~500GB each and then one medium box for the proxy?
  That's exactly right. Get in touch (lev@pgdog.dev), happy to help or at the very least tell you what current works (or doesn't) so you know what your options are.
mnbbrown 1 hour ago
I've loved using pgdog for the last 6 months. It's been incredibly stable. It's nifty how they've solved the LISTEN/NOTIFY on a transaction pooler problem.
maherbeg 1 hour ago
I'm a big PGDog fan! It really helped us scale our connection proxy needs pretty substantially and it has great features like auto mode to support Aurora failovers neatly. It's infra that just works.
htrp 2 hours ago
>PgDog is a sharder, connection pooler and load balancer for PostgreSQL. Written in Rust, PgDog is fast, reliable and scales databases horizontally without requiring changes to application code.
Still trying to figure out how this works technically, is the performance gain really just re-write in rust?
[-]
- levkk 2 hours ago
  Not quite. The performance gain is to bring those features to Postgres!
  Edit:
  Performance gains are from having the ability to load balance reads (horizontal scaling for read queries) and scale out writes (with sharding). Once instance bottleneck in Postgres has many faces:
  1. Behind schedule vacuums because of too many dead tuples (too many writes)
  2. The WALWriter is single-threaded and IO-bound - Postgres can only do about 200-300MB/sec in writes per instance (real prod numbers on EC2 with NVMes and ZFS, basically best case scenario).
  3. Bulkheading: single primary is a single point of failure. With 12 primaries, if one fails, 91% of your customers don't notice.
  The list goes on. Rust is just a side effect. We love it because it's fast and correct - the perfect match for a database product.
  [-]
  - hylaride 55 minutes ago
    So to oversimplify, is the idea to bring an AWS Aurora-style storage mechanism natively to Postgres?
    [-]
    - levkk 43 minutes ago
      Yes, except it doesn't have any cross-dependencies on the same volume, so the uptime here should be higher.
  - VeninVidiaVicii 2 hours ago
    Oh thanks for clearing that up.
    [-]
    - levkk 2 hours ago
      Sorry, out walking the dog (not a pun). I'll post more details in a few.
ParadisoShlee 2 hours ago
I've moved from pgbouncer to pgdog a few months ago without issue. Huge fan.
jeremyjh 2 hours ago
It’s surprising they don’t mention advantages over other sharding systems like Citus. Maybe it’s just the fact that it’s only a proxy and not core extensions? But that could limit capabilities.
[-]
- levkk 1 hour ago
  We do, just buried deep in our blog: https://pgdog.dev/blog/pgdog-vs-citus
  The same old processes vs. threads debate, plus having the ability to scale the coordinator past a single machine. So, if you're OLTP, definitely consider PgDog. OLAP - Citus still wins because of its advanced query engine. We'll get there.
sandeepkd 33 minutes ago
Nit-Pick: It might be anti-marketing, still it would be helpful if the use cases can be articulated in a way where it would make sense to use this Vs any other type of database. Honesty goes a long way with the more technical folks for anything related to infrastructure.
Surfacing where and how PG is better than Dynamo or any other database is probably a good starting point instead of calling out PG a silver bullet for everything. At the end of the day its all a trade-off.
[-]
- levkk 4 minutes ago
  Always is. Marketing is not our strong suit (only engineers here). We'll get better at it.
drchaim 1 hour ago
Good stuff, although I’m not quite sure about the fast OLAP use case.
If you’re already sharding by tenant for other reasons, OK… But I see CDC to a true OLAP system as more scalable.
PostgreSQL still needs real columnar tables in the core, hopefully one day
[-]
- levkk 58 minutes ago
  OLAP means different things to different people. For us, it's just making sure your admin dashboard keeps working basically:
```
  SELECT tenant_id, COUNT(clicks)
  FROM users
  GROUP BY tenant_id
  ORDER BY 2 DESC
  LIMIT 25;
```
  Performance is a side effect - definitely needed and we'll do everything we can, but we are not competing with ClickHouse or Snowflake - just trying to make sharded Postgres work with your app.
simonw 2 hours ago
Suggestion: have more than just helm and Docker in your quickstart documentation. I'd like to try this out just to see what it can do, but not quite enough to fire up one of those systems for it.
Is there a binary I can run directly?
[-]
- e12e 1 hour ago
  In addition - the docker compose example doesn't set up any data volumes for the postgres instances - that might be considered a bug?
  Then again, sharding on a single host probably isn't very useful anyway - but it might work with docker in swarm mode?
  [-]
  - levkk 1 hour ago
    The docker compose example is just a demo. I don't know anyone who runs Postgres with docker compose / swarm in prod :) But yes, happy to add volumes so it seems more real.
- levkk 1 hour ago
  We should add it to brew/apt/etc for sure. Also, we could add it to crates.io so you could do something like `cargo install pgdog`. Distribution, distribution, distribution.
  [-]
  - simonw 1 hour ago
    I also appreciate GitHub releases with pre-compiled binaries for different platforms. The more options the better!
- frogbydjsd 1 hour ago
  [dead]
fulafel 2 hours ago
Does making it "just work" here come with any caveats vs standard PG?
[-]
- levkk 2 hours ago
  Getting there! Cross-shard writes do because of 2pc. Reads are eventually consistent.
- danielheath 2 hours ago
  Given that they implement connection pooling and sharding, I'm going to say "not at all".
  You _could_ make that ACID, but it's not going to be faster than a single machine.
bourbonproof 2 hours ago
the reason mongo is a joy to use in scaled env is because no additional setup/software needed and all drivers natively support secondary/primary writes/reads and topological changes. so it's end to end, and adding is as a new proxy in frontend of postgres leads to all clients being incompatible or the code itself has no control anymore about when to use a secondary and what allowed stall is acceptable for a particular query. Any solutions to this by pgdog?
[-]
- saghm 1 hour ago
  > all drivers natively support secondary/primary writes/reads and topological changes.
  Expanding on that a bit, mongo drivers even have a shared specification of the state machine for monitoring topology changes[1] and algorithm for selecting the server to send an operation to[2] (along with various declarative test cases that the drivers use to validate them alongside the specs in the repo). I think people sometimes underestimate how important the client-side work is to this sort of experience; for all of the faults mongo has had over the years, the amount of investment that they put into the client libraries is something I've never seen anywhere else (although having spent several years working on some of these libraries, my take is likely very biased).
  [1]: https://github.com/mongodb/specifications/blob/master/source... [2]: https://github.com/mongodb/specifications/blob/master/source...
melon_tsui 2 hours ago
2M qps in production is legit. Curious how much RAM and CPU that takes on average per deployment though
[-]
- levkk 1 hour ago
  Depends. Only pooling, very little. Load balancing/sharding needs to parse queries, so a bit more. Could go up to a GB per pod, sometimes more if you have a lot of unique SQL queries (unique by text, not by parameters). We cache query ASTs to avoid parsing them on each request - that's the bulk of memory usage.
  [-]
  - parthdesai 1 hour ago
    Semi related question - I have always wondered, how do you tackle OOM issues at the proxy layer, i.e. let's say a particular SQL query requires proxy to fan out the query to multiple shards, which return a pretty large dataset. I'm assuming you would need to load this dataset in the ram to perform certain operations. What happens if the resulting dataset causes the proxy pod to go OOM?
    [-]
    - levkk 1 hour ago
      Two schools of thought:
      1. Let it crash. Increase the RAM, try again.
      2. Page to disk (swap), make it slow but ultimately work.
      Both have their trade-offs. There is no free lunch here.
Pet_Ant 2 hours ago
I hope people pronounce this as „pig-dog” and has a mascot that looks like „man-bear-pig”
[-]
- levkk 2 hours ago
  Crap! Missed opportunity.
faangguyindia 2 hours ago
i am not using any tool like pgbouncer and have not run into any issues so far. Is it even required these days? Have you guys tested your setup without these connection poolers/multiplexers?
[-]
- rswail 1 hour ago
  Each connection is a process on the server, that takes up both CPU and RAM, it will run out.
  This solves the thousands of clients case for read in a way that is transparent to the clients.
  Yes it's required at large scale, especially if you want to distribute reads or shard to a particular geographical area.
999900000999 1 hour ago
How are 3 developers going to QA this properly ?
[-]
- codegeek 25 minutes ago
  They are not just some random 3 have decades of real db experience behind them. They also just got funded which gives them the ability to expand and stay longer in the game.
- pantulis 1 hour ago
  How are 3 developers going to sell that to any company? Procurement will have a field day.
  [-]
  - rswail 1 hour ago
    They have funding. That's what it will be for. I wish them well and appreciate that people are still doing FOSS.
    As long as they don't get undercut by the equivalent of AWS https://aws.amazon.com/rds/proxy/ which is a managed pgbouncer.
    [-]
    - 999900000999 1 hour ago
      The issue is if the DB layer fails your product is going to completely stop working.
      You’d need a ton of faith in these 3 people.
      Feels more like it would work better inside of a bigger organization.
      The QA tester in me is kinda risk adverse.
skiwithuge 2 hours ago
we are using PG bouncer in production. Interesting, I will follow the evolution of this project
moralestapia 2 hours ago
Cool work, thanks.
Wrt. the pooler, how do you compare with pgbouncer?
I'm interested because I have a postgres instance, low-traffic but still like ... tens of r(eads)ps. I was not running anything close to the machine limits but still added pgbouncer to improve performance and didn't see a noticeable difference. I was stress-testing the machine obv., I'm not talking about the 10 rps, lol.
For context, my numbers were something like 10k rps +/- 1k vanilla postgres and like 9k rps +/- 1k with pgbouncer in front of it. So ... slightly slower but big error bars so I wouldn't say for sure. I ended up not using pgbouncer as the benefit was immaterial.
Also yeah, in case you want to check it out, it's the db that backs this project: https://httpstate.com.
[-]
- levkk 1 hour ago
  Old benchmark, but still good: https://pgdog.dev/blog/pgbouncer-vs-pgdog