Tuesday, 12 May 2026

PostgreSQL Multiprocessing Explained

PostgreSQL uses multiprocessing, not multithreading, as a core architectural design.
This design choice affects performance, concurrency, isolation, and scalability.


1. What Is Multiprocessing in PostgreSQL?

Multiprocessing means:

PostgreSQL uses multiple operating system processes to do work in parallel.

Each important task runs in its own process, not as a thread inside one process.

This is different from databases that rely heavily on multithreading.


2. The PostgreSQL Process Model (Core Concept)

PostgreSQL follows a process‑per‑connection architecture.

Main process types:

Postmaster (main server)

 ── Client Backend Process (one per connection)

 ── WAL Writer

 ── Checkpointer

 ── Background Writer

 ── Autovacuum Workers

 ── Logical Replication Workers

 └── Parallel Query Workers

Each box above is a separate OS process.


3. One Connection = One Backend Process

When a client connects:

Plain Text

Client → PostgreSQL → New backend OS process

Show more lines

That backend process:

  • Parses SQL
  • Executes queries
  • Uses its own memory context
  • Communicates with shared memory

Strong isolation
Crash protection
More memory per connection


4. Shared Memory + Multiple Processes

Even though PostgreSQL uses multiple processes, they share data using:

  • Shared memory segments
  • Lightweight locks (LWLocks)
  • Spinlocks

Shared structures include:

  • Buffer cache
  • WAL buffers
  • Lock tables
  • Statistics

This is how multiprocessing works without data corruption.


5. Parallel Query Execution (True Multiprocessing)

PostgreSQL supports parallel execution, where one query uses multiple processes.

Example:

SQL

SET max_parallel_workers_per_gather = 4;

Show more lines

For large queries, PostgreSQL may spawn:

Leader Process

 ── Parallel Worker 1

 ── Parallel Worker 2

 ── Parallel Worker 3

 └── Parallel Worker 4

Used for:

  • Sequential scans
  • Aggregations
  • Hash joins

Uses multiple CPU cores
Faster analytics queries


6. Background Multiprocessing Tasks

PostgreSQL always runs multiple background processes, even with no users connected.

Examples:

  • Autovacuum workers → clean dead rows
  • WAL writer → write logs
  • Checkpointer → ensure durability
  • Logical replication workers
  • Stats collector

These run independently and concurrently with user queries.


7. MVCC + Multiprocessing (Key Advantage)

PostgreSQL’s MVCC (Multi‑Version Concurrency Control) works perfectly with multiprocessing.

Result:

  • Readers never block writers
  • Writers never block readers
  • Each process sees its own snapshot

This is critical for:

  • High concurrency
  • Large transactional systems
  • Long‑running queries

8. Why PostgreSQL Chose Multiprocessing (Not Threads)

Advantages:

Better stability (one process crash doesn’t kill others)
Easier memory cleanup
Safer extension execution
Strong OS‑level isolation

Trade‑offs:

Higher memory usage
Too many connections can overload the server

This is why connection pooling is essential.


9. Multiprocessing + Connection Pooling

Because every connection is a process:

5,000 connections = 5,000 OS processes
Use connection pools

Common tools:

  • PgBouncer
  • PgPool‑II
  • App‑level pools (HikariCP, psycopg pool)

Best practice:

Many app requests → Few PostgreSQL processes


10. Monitoring Multiprocessing in PostgreSQL

View running processes:

SQL

SELECT pid, state, backend_type, query

FROM pg_stat_activity;

Show more lines

Backend types include:

  • client backend
  • autovacuum worker
  • parallel worker
  • logical replication worker

11. Common Multiprocessing Issues

Too many backend processes

Use pooling
Reduce max_connections

High context switching

Tune parallel workers
Reduce idle connections

Autovacuum contention

Proper autovacuum tuning


12. Simple Summary

PostgreSQL uses multiprocessing, not multithreading
Each connection runs in its own OS process
Parallel queries use multiple worker processes
Background workers run continuously
Shared memory + locks keep data safe
This design favors stability and correctness


In One Line:

PostgreSQL multiprocessing prioritizes safety, isolation, and predictable concurrency over raw lightweight threading.

 

No comments:

Post a Comment

Multithreading in PostgreSQL Using dblink and FDW (Reality vs Myth) First: The Critical Truth (Must Be Clear) You cannot create true mul...