PostgreSQL Reconsiders Its Process-Based Model 54
Jonathan Corbet, writing at LWN: In the fast-moving open-source world, programs can come and go quickly; a tool that has many users today can easily be eclipsed by something better next week. Even in this environment, though, some programs endure for a long time. As an example, consider the PostgreSQL database system, which traces its history back to 1986. Making fundamental changes to a large code base with that much history is never an easy task. As fundamental changes go, moving PostgreSQL away from its process-oriented model is not a small one, but it is one that the project is considering seriously.
A PostgreSQL instance runs as a large set of cooperating processes, including one for each connected client. These processes communicate through a number of shared-memory regions using an elaborate library that enables the creation of complex data structures in a setting where not all processes have the same memory mapped at the same address. This model has served the project well for many years, but the world has changed a lot over the history of this project. As a result, PostgreSQL developers are increasingly thinking that it may be time to make a change.
A PostgreSQL instance runs as a large set of cooperating processes, including one for each connected client. These processes communicate through a number of shared-memory regions using an elaborate library that enables the creation of complex data structures in a setting where not all processes have the same memory mapped at the same address. This model has served the project well for many years, but the world has changed a lot over the history of this project. As a result, PostgreSQL developers are increasingly thinking that it may be time to make a change.
Missing Key Info in Quote (Score:5, Informative)
The overhead of cross-process context switches is inherently higher than switching between threads in the same process - and my suspicion is that that overhead will continue to increase.
Skipped because they're a CS person? (Score:4, Interesting)
Yes, that is a general reason when separate processes might be a bad design choice where you have a lot of data being passed around. But it also keeps some things separate from each other. Makes it impossible to cheat in certain ways.
Processes share an address space. Meaning all the data in RAM that one thread can access is also physically possible for other threads to access. Separate processes should prevent that, even by mistake.
I assume data would go from untrusted (you'd want separate processes and higher security) to usable (reviewed or created from a trusted source). And there should be a good design pattern for that situation. I don't happen to know it, but... this isn't a rare situation in software.
Re: (Score:2)
I think you mean "Threads in a process share a single address space." ?
Re: (Score:1)
If it were a big problem, wouldn't a new chip architecture sell well?
Re: (Score:2)
Right. The original article summary just laments the problem, but does not mention the proposed solution and the /. TFS just copies it.
Re: (Score:2)
About time (Score:1)
I like PostgresSQL but there are a couple things they're stubborn about because they're so incredibly conservative when it comes to database design. No index hinting is annoyingly paternalistic.
At least some of those annoying things can be circumvented by using modules (for example, the lack of a single-byte int type is provided by pguint)
Glad to see they're catching up with these newfangled "threads"
Re: (Score:2)
They need to get with the times (Score:5, Funny)
For each individual SQL query, a new Docker container should be launched with a fresh web server to handle the request. Javascript on the server will convert the request to REST calls to a pool of backend VMs. These will then convert all of the data to YAML, which will also be the new on-disk format. This all needs to run exclusively in the cloud.
Re:They need to get with the times (Score:5, Funny)
you will burn in a lake of hellfire, and like it (Score:1)
Re: (Score:3, Insightful)
Re: (Score:2)
I assume you meant a data lake of hellfire. We can’t forget that in our new architecture. And of course we’ll like it. While we’re at it, we’ll just make Postgres into a NoSQL database.
Re:They need to get with the times (Score:5, Funny)
Stop accurately describing my employer's software architecture!
Re: (Score:3)
Have we worked for some of the same companies?
Re: (Score:2)
I'm angry you didn't have Rust anywhere in your stack. Maybe compile Rust to WASM and run it under Node.JS?
That sounds non-performant, unless your Node.JS instance was written in Go.
Re: (Score:1)
Excellent point! We definitely should port V8 to Go. That will be sure this architecture is both clean and performant! It's scary how long this can go -- I feel that many of us have encountered this sort of thing at work.
There needs to be a DailyWTF comic strip or something.
Re: (Score:1)
Re:They need to get with the times (Score:4, Funny)
As a CEO myself I find this conversation deeply arousing. Can you provide contact details so we may retain your consulting services?
Re: (Score:3)
I'm shocked! shocked I tell you!
You haven't got a blockchain running anywhere !
And I'm assuming that you're mounting this in a J2EE environment with (n+1) layers of almost empty files, XML definitions and multiple empty directories.
Re: (Score:2)
It's the new new thing (Score:3)
For each individual SQL query, a new Docker container should be launched
Yes it's the hot new server tech called "macro services"
Re: (Score:2)
Yes it's the hot new server tech called "macro services"
It looks more like "server full" code to me.
Great followup! (Score:1)
It looks more like "server full" code to me.
Even though this comes late enough few will ever see it, thought I'd let you know *I* appreciate the humor. :-)
Re: (Score:2)
But where is the Event DB?
Re: (Score:1)
Re: (Score:1)
Nah, each request should be queued and handled by a lambda. Then it can be cloud-native.
PostgreSQL de-microservicing? (Score:1)
Do the god's of architecture allow that?
Not a good idea (Score:5, Interesting)
If you want to fix some warts in Postgres efforts would be better spent by:
(1) Using io_uring or direct I/O rather than double buffering everything in the page cache.
(2) Handling correlated columns better so the planner is smarter about certain kinds of joins.
(3) Fixing the terrible plans that come out of queries with OR-clauses and some star-queries.
(4) Reducing write amplification in indexes if possible, ala HOT updates.
Re: (Score:2)
Doesn't x86 have address-space #s now? (Like other arches have had for decades.) If the working set for the TLB is close to its total size, then it wouldn't help much, but if it's a problem of TLB-invalidate on context switch, address space numbers allow the TLB and other virtually addressed caches to avoid flushing entries that might hit again when context is switched back.
Re: (Score:1)
Is that only on recent x86 CPU's? (Score:2)
I know a lot of features in modern CPU's is to fix issues like this. And that ARM compatible CPU's are not as high performing.
Would this situation change on a simpler CPU? Say with a smaller TLB and cache sizes in general.
There are companies making 100's of ARM compatible cores (per socket? board?) for cloud and server companies. Having software that works well on a range of devices is useful.
And maybe there are other issues with the processes, like making debugging harder? Or scaring away new open so
Re: (Score:1)
Almost no real-world Postgres databases fit in RAM - and because of ACID, even with synchronous_commit off, you will be doing writes unless you have a read-only database. So worrying about RAM (i.e. TLB) isn't as likely to have a good payoff as optimizing th
Re: (Score:2)
I know a lot of features in modern CPU's is to fix issues like this. And that ARM compatible CPU's are not as high performing.
Would this situation change on a simpler CPU? Say with a smaller TLB and cache sizes in general.
On any database where the performance really matters, you're going to have far more data than RAM. You'll be churning through tons of data on a big query, so you'll be thrashing the TLB and cache regardless.
I think the bigger issue is that Postgres will try to share data between processes when possible. Doing that adds a lot of overhead. You can't use direct pointers to the data, as the memory will be mapped to different addresses in each process. All your data accesses will have a layer of indirection adde
Re: Not a good idea (Score:1)
Ohhh... They read/write through the standard OS fs-libraries, thereby - as you write - hit that cache as well as their own? Sounds like something they should fix a.s.a.p
. and shouldn't be that hard to implement. Libraries for Direct I/O, or Asynch I/O, bypassing the OS fs, exists almost everywhere nowadays, right?
I was thinking of switching to PostgreSQL, but I understand I need to investigate this a bit further. ðY'
Re: (Score:2)
PostgreSQL gets so many things right. Even updating the system catalog is ACID, that's nice. PostgreSQL also handles dates well. In fact it handles many data types really, really well - far better than MySQL.
I also have found that when you start piling on the load (real world OLTP) Postgres h
Re: (Score:2)
If your queries are dominated by primary key lookups MySQL will probably win.
Why would you think that? My tables have hundreds of millions of rows, and I can construct a query that does direct primary key lookups for multiple columns (they usually have nine or ten columns) using subqueries, and pull in targeted results in three to five milliseconds.
Re: (Score:1)
This means for updates on tables with lots of secondary indexes (sometimes you just need those) you get far more writes in Postgres than MySQL.
In case my posts here didn
A quick summary and extrapolation (Score:5, Insightful)
The crux of the issue is that postgres is using a more expensive multi-process model instead of using kernel threading to communicate, duplicating both memory resources used by each process/connection but also requiring a non-trivial amount of IPC which would not otherwise happen with lighter threads.
Some OSes have a concept called "lightweight threads" which sort of helps with this condition. I'm somewhat surprised postgres still uses this model, honestly - it's exceedingly long in the tooth. What I've seen typically is, even with applications which have this kind of model they'll use multiple worker threads within each process, at least, to spread resource use across cores more effectively.
For reference, the Linux kernel treats both threads and processes roughly the same (as "tasks"), and the fundamental problem is simply in how postgres communicates.
From kernel docs -
"Linux uses a different implementation for threads. The basic unit is called a task (hence the struct task_struct) and it is used for both tasks and processes. Instead of embedding resources in the task structure it has pointers to these resources.
Thus, if two threads are the same process will point to the same resource structure instance. If two threads are in different processes they will point to different resource structure instances."
https://linux-kernel-labs.github.io/refs/heads/master/lectures/processes.html
Effectively, this has interesting implications for administration, as each thread has the same kind of limit a process would normally (eg. open file limits, for instance). You can get a better accounting of what's actually running on your system by using `ps -eLf` - and of course, /proc/ contains all the appropriate info as well.
Re: (Score:2)
And the typical use case of Postgres is to do some sort of connection pooling. Either externally, like PGbouncer, or within your application code (by maintaining a pool of connections at the ready).
Re: (Score:2)
All those things might be true in the specific context to postgres, which would be curious - because it's definitely not the norm for that to be the case with process vs thread models.
The IPC overhead, though... that's definitely an added expense, and is going to get more expensive/less performant disproportionate to the work being done as load increases, particularly with our ever-expanding many-cored NUMA systems.
Re: (Score:2)
Re: (Score:2)
I'm somewhat surprised postgres still uses this model, honestly - it's exceedingly long in the tooth.
After reading the whole article the issue seems to me fundamentally a project management and staffing one. Sure you can posit rewriting the process/memory management using threads in shared memory like you probably would if writing a new server from scratch, but how to not break stuff right and left (2000 gloal variables, etc.) and still support all old installs and have regular releases with new features?
A fork of the current system, which continues to be maintained, while the new version is developed and
Re: (Score:1)
The performance gains are likely to be quite modest. I say this as someone pretty familiar with the Postgres code base. I'm certainly a fan of threading in ma
Way over my head (Score:2)
Go the other way (Score:2)
Maybe this has to happen at the Linux kernel level, however, I am not quite sure.
But does it have web scale? (Score:2)
That's good and all but the important question is:
Does it have web scale? [youtube.com] =P
Since when is the FOSS world "fast moving"? (Score:2)
As it is not about getting rich on some hype before the victims/customers notice what something actually can and cannot do, FISS is typically moing at a rather sedate phase and that is how it should be for all solid engineering.