PostgreSQL Multiprocessing Explained
PostgreSQL uses multiprocessing, not multithreading,
as a core architectural design.
This design choice affects performance, concurrency, isolation, and
scalability.
1. What Is Multiprocessing in PostgreSQL?
Multiprocessing means:
PostgreSQL uses multiple operating system processes
to do work in parallel.
Each important task runs in its own process, not as a
thread inside one process.
This is different from databases that rely heavily on multithreading.
2. The PostgreSQL Process Model (Core Concept)
PostgreSQL follows a process‑per‑connection architecture.
Main process types:
Postmaster (main server)
├── Client Backend Process (one per connection)
├── WAL Writer
├── Checkpointer
├── Background Writer
├── Autovacuum Workers
├── Logical Replication Workers
└── Parallel Query
Workers
Each box above is a separate OS process.
3. One Connection = One Backend Process
When a client connects:
Plain Text
Client → PostgreSQL → New backend OS process
Show more lines
That backend process:
- Parses
SQL
- Executes
queries
- Uses
its own memory context
- Communicates
with shared memory
✅ Strong isolation
✅
Crash protection
❌
More memory per connection
4. Shared Memory + Multiple Processes
Even though PostgreSQL uses multiple processes, they share
data using:
- Shared
memory segments
- Lightweight
locks (LWLocks)
- Spinlocks
Shared structures include:
- Buffer
cache
- WAL
buffers
- Lock
tables
- Statistics
This is how multiprocessing works without data corruption.
5. Parallel Query Execution (True Multiprocessing)
PostgreSQL supports parallel execution, where one
query uses multiple processes.
Example:
SQL
SET max_parallel_workers_per_gather = 4;
Show more lines
For large queries, PostgreSQL may spawn:
Leader Process
├── Parallel Worker 1
├── Parallel Worker 2
├── Parallel Worker 3
└── Parallel Worker 4
Used for:
- Sequential
scans
- Aggregations
- Hash
joins
✅ Uses multiple CPU cores
✅
Faster analytics queries
6. Background Multiprocessing Tasks
PostgreSQL always runs multiple background processes,
even with no users connected.
Examples:
- Autovacuum
workers → clean dead rows
- WAL
writer → write logs
- Checkpointer
→ ensure durability
- Logical
replication workers
- Stats
collector
These run independently and concurrently with user
queries.
7. MVCC + Multiprocessing (Key Advantage)
PostgreSQL’s MVCC (Multi‑Version Concurrency Control)
works perfectly with multiprocessing.
Result:
- Readers
never block writers
- Writers
never block readers
- Each
process sees its own snapshot
This is critical for:
- High
concurrency
- Large
transactional systems
- Long‑running
queries
8. Why PostgreSQL Chose Multiprocessing (Not Threads)
Advantages:
✅ Better stability (one process
crash doesn’t kill others)
✅
Easier memory cleanup
✅
Safer extension execution
✅
Strong OS‑level isolation
Trade‑offs:
❌ Higher memory usage
❌
Too many connections can overload the server
This is why connection pooling is essential.
9. Multiprocessing + Connection Pooling
Because every connection is a process:
❌ 5,000 connections = 5,000 OS
processes
✅
Use connection pools
Common tools:
- PgBouncer
- PgPool‑II
- App‑level
pools (HikariCP, psycopg pool)
Best practice:
Many app requests → Few PostgreSQL processes
10. Monitoring Multiprocessing in PostgreSQL
View running processes:
SQL
SELECT pid, state, backend_type, query
FROM pg_stat_activity;
Show more lines
Backend types include:
- client
backend
- autovacuum
worker
- parallel
worker
- logical
replication worker
11. Common Multiprocessing Issues
Too many backend processes
✅ Use pooling
✅
Reduce max_connections
High context switching
✅ Tune parallel workers
✅
Reduce idle connections
Autovacuum contention
✅ Proper autovacuum tuning
12. Simple Summary
✅ PostgreSQL uses multiprocessing,
not multithreading
✅
Each connection runs in its own OS process
✅
Parallel queries use multiple worker processes
✅
Background workers run continuously
✅
Shared memory + locks keep data safe
✅
This design favors stability and correctness
In One Line:
PostgreSQL multiprocessing prioritizes safety, isolation,
and predictable concurrency over raw lightweight threading.
No comments:
Post a Comment