Thursday, 14 May 2026

Multithreading in PostgreSQL Using dblink and FDW (Reality vs Myth)

First: The Critical Truth (Must Be Clear)

You cannot create true multithreading inside a single PostgreSQL backend, even with dblink or Foreign Data Wrappers (FDW).

What you can do: Achieve parallel execution across multiple PostgreSQL backends
Simulate multithreading by delegating work to other connections
Run independent queries concurrently using remote sessions

This is process‑level parallelism, not thread‑level multithreading.


1. What dblink and FDW Actually Do

dblink

  • Opens another PostgreSQL connection
  • Executes SQL on that connection
  • Returns results to the caller
  • Runs in a separate backend process

postgres_fdw

  • Creates foreign tables mapped to another PostgreSQL server
  • Uses remote connections
  • Query planner decides whether to push down work

📌 Key Point
Each dblink or FDW call = another PostgreSQL process, not a thread.


2. Why People Use dblink / FDW for “Multithreading”

Typical goals:

  • Parallel processing of large datasets
  • Running independent tasks at the same time
  • Avoiding blocking in long‑running operations
  • Batch processing / ETL parallelism

PostgreSQL does not allow THREAD {} blocks, so people use extra connections instead.


3. What You CANNOT Do (Important)

You cannot:

  • Run two SQL statements at the same time inside one function
  • Spawn threads inside PL/pgSQL
  • Use dblink calls concurrently from the same backend
  • Force dblink calls to run asynchronously

This will still execute sequentially in one backend:

SQL

SELECT dblink_exec(...);

SELECT dblink_exec(...);

Show more lines


4. Correct Way: Parallel Work Using dblink + Multiple Sessions

Architecture Concept

Plain Text

Coordinator Session

── DBLink Session 1 Task A

── DBLink Session 2 Task B

└── DBLink Session 3 → Task C

Show more lines

Each DBLink session:

  • Is a separate PostgreSQL backend
  • Can run concurrently
  • Uses a different CPU core

5. Example: Parallel Task Execution Using dblink

Step 1: Enable dblink

Plain Text

CREATE EXTENSION dblink;

Show more lines


Step 2: Open Multiple Connections

SQL

SELECT dblink_connect('w1', 'dbname=mydb');

SELECT dblink_connect('w2', 'dbname=mydb');

SELECT dblink_connect('w3', 'dbname=mydb');

Show more lines

Each connection runs in a separate backend process.


Step 3: Dispatch Work

SQL

SELECT dblink_exec('w1', 'CALL process_partition(1)');

SELECT dblink_exec('w2', 'CALL process_partition(2)');

SELECT dblink_exec('w3', 'CALL process_partition(3)');

Show more lines

Work runs in parallel Uses multiple CPU cores Each task isolated


Step 4: Disconnect

SQL

SELECT dblink_disconnect('w1');

SELECT dblink_disconnect('w2');

SELECT dblink_disconnect('w3');

Show more lines


6. Using FDW for Parallelism (Cleaner, Safer)

FDW is often better than dblink.

Setup postgres_fdw

SQL

CREATE EXTENSION postgres_fdw;

 

CREATE SERVER worker1 FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host 'localhost', dbname 'mydb', port '5432');

 

CREATE USER MAPPING FOR CURRENT_USER

SERVER worker1 OPTIONS (user 'postgres');

Show more lines


Parallel Reads via FDW

SQL

SELECT count(*) FROM remote_table_part1

UNION ALL

SELECT count(*) FROM remote_table_part2;

Show more lines

If:

  • use_remote_estimate = on
  • parallel_append = on

PostgreSQL can plan parallel execution across FDW connections.

📌 FDW parallelism is planner‑dependent, not guaranteed.


7. Best Pattern: Leader / Worker Model

This is how production systems do it.

Leader (Coordinator)

  • Assigns tasks
  • Uses dblink or FDW connections

Workers

  • Run heavy logic
  • Independent transactions
  • Fail independently

Example use cases:

  • ETL pipelines
  • Parallel data validation
  • Background batch jobs
  • Sharded processing

8. Performance and Safety Considerations

Pros

  • True CPU parallelism
  • Failure isolation
  • Scales well on multi‑core systems

⚠️ Cons

  • High connection cost
  • More memory usage
  • Needs connection pooling
  • Harder error handling

9. When NOT to Use dblink/FDW for Parallelism

For OLTP transaction processing
For row‑by‑row loops
For small queries
Instead of native parallel queries

Use PostgreSQL parallel queries first, then dblink only when needed.


10. Recommended Alternatives (Often Better)

Goal

Better Option

CPU parallelism

Native parallel query

Task concurrency

App‑level threads

Batch processing

Job queue + workers

Scheduling

pg_cron

Background jobs

Background workers

High throughput

Partitioning


Final Verdict (Very Important)

dblink and FDW do NOT create multithreading.
They create controlled, parallel multiprocessing using additional sessions.

This is: Safe
Supported
Production‑proven

But it must be designed carefully.


Simple One‑Line Summary

Multithreading in PostgreSQL using dblink/FDW = parallel work via multiple backend processes, not real threads.

 

Wednesday, 13 May 2026

 How to Achieve Multithreading in PostgreSQL (Correct Way)

PostgreSQL uses multiprocessing, not multithreading
You do not convert PostgreSQL into a threaded engine
You achieve multithreading behavior using PostgreSQL‑supported mechanisms


1. Important Truth (Must Understand First)

PostgreSQL DOES NOT:

  • Run multiple threads inside one backend process
  • Allow user‑defined threaded execution in PL/pgSQL
  • Support THREAD, PARALLEL blocks like Java/C++

PostgreSQL DOES:

  • Run multiple OS processes
  • Execute parallel queries
  • Use parallel workers
  • Support concurrent client execution
  • Safely scale across CPU cores

This is intentional and architecturally correct


2. Primary Way: Parallel Query Execution (Best Replacement for Multithreading)

PostgreSQL supports true parallel execution using multiple worker processes.

Enable Parallelism

SQL

SHOW max_parallel_workers;

SHOW max_parallel_workers_per_gather;

Show more lines

Recommended settings (example):

Apache Config

max_parallel_workers = 8

max_parallel_workers_per_gather = 4

Show more lines


Example: Parallel Query Execution

SQL

EXPLAIN ANALYZE

SELECT COUNT(*)

FROM large_table;

Show more lines

You will see:

Gather

  Workers Planned: 4

  -> Parallel Seq Scan

Multiple CPU cores used
Similar to multithreading
Fully managed by PostgreSQL


3. Application‑Level Multithreading (MOST IMPORTANT)

Correct pattern

Threads live in the application, not in PostgreSQL

Each thread:

  • Uses its own DB connection (or pooled connection)
  • Executes queries concurrently

Example (Java / Python / Node):

Thread 1 → Query A → PostgreSQL Process 1

Thread 2 → Query B → PostgreSQL Process 2

Thread 3 → Query C → PostgreSQL Process 3

This gives real multithreaded behavior PostgreSQL handles concurrency safely This is the industry standard


Use Connection Pooling (MANDATORY)

Because PostgreSQL is process‑based:

Tools:

  • PgBouncer
  • PgPool
  • HikariCP / psycopg pool / Sequelize pool

Rule:

1000 threads ≠ 1000 DB connections


4. Background Workers (Advanced / Internal Multitasking)

PostgreSQL supports background worker processes (C extensions).

Used for:

  • Async processing
  • Queue consumers
  • Continuous tasks

Examples:

  • Logical replication
  • Autovacuum
  • pg_cron jobs

⚠️ Requires C programming ⚠️ Not for normal application use


5. Job Parallelism Using pg_cron or Queues

For “task‑based multithreading”:

pg_cron example

SQL

SELECT cron.schedule(

'parallel_job',

'*/5 * * * *',

$CALL process_batch();$

);

Show more lines

Run:

  • Multiple jobs
  • Parallel execution
  • Independent processes

Safe concurrency
Ideal for ETL / batch systems


6. Logical Sharding / Partition Parallelism

Partitioning enables parallel scans and writes:

SQL

CREATE TABLE events (

id bigint,

created_at date

) PARTITION BY RANGE (created_at);

Show more lines

PostgreSQL can:

  • Scan partitions in parallel
  • Insert concurrently without contention

Practical performance boost
Works like multithreaded data access


7. What You CANNOT Do (Common Mistakes)

Create threads in PL/pgSQL
Run multiple threads inside a function
Use fork() or pthread in SQL
Force PostgreSQL to behave like MySQL threads

PostgreSQL intentionally avoids threading to ensure:

  • Memory safety
  • Crash isolation
  • Extension safety

8. Summary: Correct Ways to Achieve Multithreading

Goal

Correct PostgreSQL Solution

CPU parallelism

Parallel queries

Concurrent execution

Multiple connections

Threaded workloads

Application‑level threads

Background tasks

Background workers / pg_cron

Batch parallelism

Job scheduling + queues

High performance

Pooling + partitioning


Final Verdict (Very Important)

You do not “add multithreading” to PostgreSQL.
You DESIGN around its multiprocessing architecture.

This is why PostgreSQL:

  • Scales exceptionally well
  • Rarely crashes
  • Handles massive concurrency safely

 

Tuesday, 12 May 2026

PostgreSQL Multiprocessing Explained

PostgreSQL uses multiprocessing, not multithreading, as a core architectural design.
This design choice affects performance, concurrency, isolation, and scalability.


1. What Is Multiprocessing in PostgreSQL?

Multiprocessing means:

PostgreSQL uses multiple operating system processes to do work in parallel.

Each important task runs in its own process, not as a thread inside one process.

This is different from databases that rely heavily on multithreading.


2. The PostgreSQL Process Model (Core Concept)

PostgreSQL follows a process‑per‑connection architecture.

Main process types:

Postmaster (main server)

 ── Client Backend Process (one per connection)

 ── WAL Writer

 ── Checkpointer

 ── Background Writer

 ── Autovacuum Workers

 ── Logical Replication Workers

 └── Parallel Query Workers

Each box above is a separate OS process.


3. One Connection = One Backend Process

When a client connects:

Plain Text

Client → PostgreSQL → New backend OS process

Show more lines

That backend process:

  • Parses SQL
  • Executes queries
  • Uses its own memory context
  • Communicates with shared memory

Strong isolation
Crash protection
More memory per connection


4. Shared Memory + Multiple Processes

Even though PostgreSQL uses multiple processes, they share data using:

  • Shared memory segments
  • Lightweight locks (LWLocks)
  • Spinlocks

Shared structures include:

  • Buffer cache
  • WAL buffers
  • Lock tables
  • Statistics

This is how multiprocessing works without data corruption.


5. Parallel Query Execution (True Multiprocessing)

PostgreSQL supports parallel execution, where one query uses multiple processes.

Example:

SQL

SET max_parallel_workers_per_gather = 4;

Show more lines

For large queries, PostgreSQL may spawn:

Leader Process

 ── Parallel Worker 1

 ── Parallel Worker 2

 ── Parallel Worker 3

 └── Parallel Worker 4

Used for:

  • Sequential scans
  • Aggregations
  • Hash joins

Uses multiple CPU cores
Faster analytics queries


6. Background Multiprocessing Tasks

PostgreSQL always runs multiple background processes, even with no users connected.

Examples:

  • Autovacuum workers → clean dead rows
  • WAL writer → write logs
  • Checkpointer → ensure durability
  • Logical replication workers
  • Stats collector

These run independently and concurrently with user queries.


7. MVCC + Multiprocessing (Key Advantage)

PostgreSQL’s MVCC (Multi‑Version Concurrency Control) works perfectly with multiprocessing.

Result:

  • Readers never block writers
  • Writers never block readers
  • Each process sees its own snapshot

This is critical for:

  • High concurrency
  • Large transactional systems
  • Long‑running queries

8. Why PostgreSQL Chose Multiprocessing (Not Threads)

Advantages:

Better stability (one process crash doesn’t kill others)
Easier memory cleanup
Safer extension execution
Strong OS‑level isolation

Trade‑offs:

Higher memory usage
Too many connections can overload the server

This is why connection pooling is essential.


9. Multiprocessing + Connection Pooling

Because every connection is a process:

5,000 connections = 5,000 OS processes
Use connection pools

Common tools:

  • PgBouncer
  • PgPool‑II
  • App‑level pools (HikariCP, psycopg pool)

Best practice:

Many app requests → Few PostgreSQL processes


10. Monitoring Multiprocessing in PostgreSQL

View running processes:

SQL

SELECT pid, state, backend_type, query

FROM pg_stat_activity;

Show more lines

Backend types include:

  • client backend
  • autovacuum worker
  • parallel worker
  • logical replication worker

11. Common Multiprocessing Issues

Too many backend processes

Use pooling
Reduce max_connections

High context switching

Tune parallel workers
Reduce idle connections

Autovacuum contention

Proper autovacuum tuning


12. Simple Summary

PostgreSQL uses multiprocessing, not multithreading
Each connection runs in its own OS process
Parallel queries use multiple worker processes
Background workers run continuously
Shared memory + locks keep data safe
This design favors stability and correctness


In One Line:

PostgreSQL multiprocessing prioritizes safety, isolation, and predictable concurrency over raw lightweight threading.

 

Monday, 11 May 2026

 PostgreSQL Multitasking Explained

PostgreSQL does not use multitasking in the traditional application sense (like threads performing different jobs in one process).
Instead, PostgreSQL achieves multitasking through process‑based concurrency, MVCC, and parallel execution.

In simple words:

PostgreSQL can handle many users and many queries at the same time safely and efficiently.


1. What “Multitasking” Means in PostgreSQL

In PostgreSQL, multitasking usually refers to:

  • Multiple users running queries at the same time
  • Reads and writes happening concurrently
  • Background maintenance tasks running alongside queries
  • Queries using parallel workers to run faster

This is handled through concurrency + parallelism, not OS‑level threading like MySQL or application code.


2. Process‑Based Architecture (Key Concept)

PostgreSQL uses a process‑per‑connection model.

How it works:

  • One Postmaster process controls the server
  • Each client connection gets its own backend process
  • Background tasks run in separate system processes

Example active processes:

  • Client query processes
  • Autovacuum workers
  • WAL writer
  • Checkpointer
  • Background writer

This provides strong isolation
More memory per connection


3. MVCC – How PostgreSQL Avoids Locks

The most important multitasking feature is MVCC (Multi‑Version Concurrency Control).

What MVCC Does:

  • Readers do not block writers
  • Writers do not block readers
  • Each transaction sees a consistent snapshot

Example:

Plain Text

User A reads a row

User B updates the same row

User A still sees the old version

User B sees the new version

Show more lines

High concurrency
Minimal locking
Excellent for multitasking workloads


4. Locks Still Exist (But Used Carefully)

PostgreSQL does use locks, but only when necessary.

Main lock types:

  • Row‑level locks (UPDATE, DELETE)
  • Table locks (DDL)
  • Advisory locks (developer‑controlled)

Example:

SQL

SELECT * FROM orders WHERE id = 10 FOR UPDATE;

Show more lines

This locks only that row, not the whole table.


5. Parallel Query Execution (True Parallelism)

PostgreSQL supports parallel query execution, meaning:

One query can use multiple CPU cores

Supported operations:

  • Sequential scans
  • Aggregations
  • Joins

Example:

SQL

SET max_parallel_workers_per_gather = 4;

``

Show more lines

PostgreSQL may spawn parallel worker processes.

Faster large queries
Better CPU utilization


6. Background Workers (Automatic Multitasking)

PostgreSQL runs multiple background tasks simultaneously.

Common examples:

  • Autovacuum (cleanup)
  • WAL writer (write‑ahead logging)
  • Checkpointer
  • Logical replication workers

They run independently of user queries.

Example:

SQL

SELECT * FROM pg_stat_activity;

Show more lines

You can see:

  • Active queries
  • Background tasks
  • Idle connections

7. Connection Pooling (Important for Multitasking)

Because each connection = one process:

Too many connections = performance problems

Use connection pooling

Popular options:

  • PgBouncer
  • PgPool‑II
  • Application pools (HikariCP, SQLAlchemy pool)

Best practice:

Plain Text

Many app requests → few PostgreSQL connections

Show more lines


8. Task Scheduling (Cron‑Like Multitasking)

PostgreSQL does not have native job scheduling, but you can multitask using:

  • pg_cron extension
  • OS cron jobs + psql
  • Background workers

Example with pg_cron:

SQL

SELECT cron.schedule(

'cleanup_job',

'0 3 * * *',

'VACUUM ANALYZE'

);

Show more lines


9. How PostgreSQL Handles Heavy Multitasking Safely

Key mechanisms:

  • MVCC snapshots
  • Fine‑grained locks
  • Per‑process memory
  • Autovacuum cleanup
  • Query planner cost models

This is why PostgreSQL scales well under:

  • High read workloads
  • Mixed read/write workloads
  • Long‑running queries

10. Common Multitasking Problems (and Fixes)

Too many active connections

Use connection pooling

Slow concurrent updates

Index properly
Reduce hot‑row updates

Autovacuum conflicts

Tune autovacuum settings

High CPU with parallel queries

Limit parallel workers


Simple Summary

PostgreSQL multitasking is built on:

  • Process‑based concurrency
  • MVCC (non‑blocking reads/writes)
  • Parallel query execution
  • Background workers
  • Connection pooling

It is very safe, scalable, and predictable

It is ideal for high‑concurrency systems

 

Sunday, 10 May 2026

Add Email Delivery for PostgreSQL HTML Index Report

With email delivery:

  • DBAs get reports without logging into servers
  • Managers can review performance easily
  • Issues are noticed early
  • Reports become part of operational workflow

We’ll use Linux mail / sendmail / mailx, which is the most common, reliable approach.


What This Will Do

Generate HTML report
Email it as HTML body (not just attachment)
Optional attachment support
Fully automated via cron
Safe for production


Prerequisites (One‑Time)

1️ Ensure Mail Utility Is Installed

Debian / Ubuntu

Shell

sudo apt install mailutils

Show more lines

RHEL / CentOS / Rocky

Shell

sudo yum install mailx

Show more lines

Assumes server can send email (SMTP configured or relay allowed).


Step 1: Decide Report Recipients

Edit once in script:

Shell

EMAIL_TO="dba@company.com,team@company.com"

EMAIL_FROM="postgres-monitor@company.com"

Show more lines


Step 2: Update HTML Report Generator Script

Below is the FULL updated script with HTML generation + email delivery.

📄 scripts/generate_index_report.sh

Shell

#!/bin/bash

 

DB="mydb"

BASE="/opt/pg_monitoring"

DATE=$(date +%F)

PSQL="/usr/bin/psql"

 

EMAIL_TO="dba@company.com"

EMAIL_FROM="postgres-monitor@company.com"

SUBJECT="PostgreSQL Index Report - $DATE"

 

HTML_OUT="$BASE/output/html/index_report_$DATE.html"

 

HEADER="$BASE/reports/html/header.html"

FOOTER="$BASE/reports/html/footer.html"

STYLE="$BASE/reports/html/style.css"

 

# ---------- HTML HEADER ----------

sed "s/{{DATE}}/$DATE/" $HEADER > $HTML_OUT

echo "<style>" >> $HTML_OUT

cat $STYLE >> $HTML_OUT

echo "</style>" >> $HTML_OUT

 

add_section () {

TITLE=$1

QUERY=$2

 

echo "<h2>$TITLE</h2>" >> $HTML_OUT

$PSQL -d $DB -H -c "$QUERY" >> $HTML_OUT

}

 

# ---------- HTML CONTENT ----------

add_section "Index Health Summary" "

SELECT

COUNT(*) AS total_indexes,

SUM(CASE WHEN idx_scan = 0 THEN 1 ELSE 0 END) AS unused_indexes,

pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS total_index_size

FROM pg_stat_user_indexes;

"

 

add_section "Top 10 Largest Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

ORDER BY pg_relation_size(indexrelid) DESC

LIMIT 10;

"

 

add_section "Fast Growing Indexes (Last 7 Days)" "

SELECT

index_name,

pg_size_pretty(MAX(index_size_bytes) - MIN(index_size_bytes)) AS growth

FROM monitoring.index_size_history

WHERE captured_at >= now() - INTERVAL '7 days'

GROUP BY index_name

HAVING MAX(index_size_bytes) - MIN(index_size_bytes) > 100000000;

"

 

add_section "Unused Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size

FROM pg_stat_user_indexes

WHERE idx_scan = 0

ORDER BY pg_relation_size(indexrelid) DESC;

"

 

add_section "GIN Indexes (High Risk)" "

SELECT

indexrelname AS gin_index,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

WHERE indexrelname ILIKE '%gin%'

ORDER BY pg_relation_size(indexrelid) DESC;

"

 

# ---------- HTML FOOTER ----------

cat $FOOTER >> $HTML_OUT

 

# ---------- EMAIL DELIVERY ----------

mail -a "Content-Type: text/html" \

-s "$SUBJECT" \

-r "$EMAIL_FROM" \

"$EMAIL_TO" < $HTML_OUT

Show more lines

Make executable (if not already):

Shell

chmod +x /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines


Step 3: Cron Job (No Change)

Your existing cron job now automatically emails the report.

Shell

0 3 * * * /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines

HTML is generated
Email is sent immediately
Zero manual steps


Step 4: What the Email Looks Like

📧 Email Subject

PostgreSQL Index Report - 2026-04-18

📄 Email Body

  • Full HTML report
  • Styled tables
  • Index risk information
  • Easy to read on desktop & mobile

No attachment required
Opens instantly
Manager‑friendly


Optional: Send as HTML Attachment Instead

If your mail server blocks HTML bodies, use attachment mode:

Shell

mail -s "$SUBJECT" \

-a "$HTML_OUT" \

"$EMAIL_TO" < /dev/null

Show more lines


Optional: Send Only If Issues Found

Add logic before sending:

Shell

ISSUES=$(grep -c "<tr>" $HTML_OUT)

 

if [ "$ISSUES" -gt 5 ]; then

mail -a "Content-Type: text/html" -s "$SUBJECT" "$EMAIL_TO" < $HTML_OUT

fi

Show more lines

Avoids alert fatigue
Sends only meaningful reports


Best Practices for Email Reports

Send once per day
Use clear subject with date
Group recipients wisely
Use HTML body, not just attachments
Keep reports under control (dont spam)


Summary

  • HTML index reports now emailed automatically
  • Works with cron
  • Uses standard Linux mail tools
  • Production‑safe and readable
  • Zero downtime impact

You now have end‑to‑end PostgreSQL index monitoring:

Collect
Analyze
Report
Email

 

Saturday, 9 May 2026

Add HTML Report Generation for PostgreSQL Index Monitoring

HTML reports are useful because they:

  • Are easy to read in browsers
  • Can be shared with managers and teams
  • Look professional
  • Can be emailed or hosted internally
  • Combine multiple reports into one dashboard

We will generate a single self‑contained HTML report automatically.


Final Architecture (After This Step)

Shell

/opt/pg_monitoring/

── reports/

── sql/

│ │ ── index_health_report.sql

│ │ ── largest_indexes.sql

│ │ ── growing_indexes.sql

│ │ ── unused_indexes.sql

│ │ └── gin_indexes.sql

│ └── html/

── header.html

── footer.html

│ └── style.css

── output/

── txt/

── csv/

│ └── html/

│ └── index_report_YYYY-MM-DD.html

└── scripts/

└── generate_index_report.sh

Show more lines


1️ Create HTML Template Files

📄 reports/html/header.html

HTML

<!DOCTYPE html>

<html>

<head>

<meta charset="UTF-8">

<title>PostgreSQL Index Report</title>

<link rel="stylesheet" href="style.css">

</head>

<body>

<h1>PostgreSQL Index Monitoring Report</h1>

<p>Generated at: {{DATE}}</p>

<hr>

Show more lines


📄 reports/html/footer.html

HTML

<hr>

<p style="text-align:center;">

Generated automatically by PostgreSQL Index Monitoring

</p>

</body>

</html>

Show more lines


📄 reports/html/style.css

CSS

body {

font-family: Arial, sans-serif;

margin: 40px;

background-color: #f9f9f9;

}

 

h1, h2 {

color: #2c3e50;

}

 

table {

border-collapse: collapse;

width: 100%;

margin-bottom: 30px;

background: #ffffff;

}

 

th, td {

border: 1px solid #dddddd;

padding: 8px;

}

 

th {

background-color: #34495e;

color: white;

}

 

tr:nth-child(even) {

background-color: #f2f2f2;

}

 

.warning {

color: red;

font-weight: bold;

}

Show more lines


2️ Generate HTML Tables Using psql

psql has a built‑in HTML output mode.

We’ll use:

Shell

psql -H -c "SELECT ..."

 

Show more lines


3️ Update the Report Generator Script (HTML Included)

📄 scripts/generate_index_report.sh

Shell

#!/bin/bash

 

DB="mydb"

BASE="/opt/pg_monitoring"

DATE=$(date +%F)

PSQL="/usr/bin/psql"

 

HTML_OUT="$BASE/output/html/index_report_$DATE.html"

 

HEADER="$BASE/reports/html/header.html"

FOOTER="$BASE/reports/html/footer.html"

STYLE="$BASE/reports/html/style.css"

 

# ---------- HTML HEADER ----------

sed "s/{{DATE}}/$DATE/" $HEADER > $HTML_OUT

 

# embed CSS directly

echo "<style>" >> $HTML_OUT

cat $STYLE >> $HTML_OUT

echo "</style>" >> $HTML_OUT

 

# ---------- SECTION FUNCTION ----------

add_section () {

TITLE=$1

QUERY=$2

 

echo "<h2>$TITLE</h2>" >> $HTML_OUT

$PSQL -d $DB -H -c "$QUERY" >> $HTML_OUT

}

 

# ---------- HTML SECTIONS ----------

add_section "Index Health Summary" "

SELECT

COUNT(*) AS total_indexes,

SUM(CASE WHEN idx_scan = 0 THEN 1 ELSE 0 END) AS unused_indexes,

pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS total_index_size

FROM pg_stat_user_indexes;

"

 

add_section "Top 10 Largest Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

ORDER BY pg_relation_size(indexrelid) DESC

LIMIT 10;

"

 

add_section "Fast Growing Indexes (Last 7 Days)" "

SELECT

index_name,

pg_size_pretty(MAX(index_size_bytes) - MIN(index_size_bytes)) AS growth

FROM monitoring.index_size_history

WHERE captured_at >= now() - INTERVAL '7 days'

GROUP BY index_name

HAVING MAX(index_size_bytes) - MIN(index_size_bytes) > 100000000;

"

 

add_section "Unused Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size

FROM pg_stat_user_indexes

WHERE idx_scan = 0

ORDER BY pg_relation_size(indexrelid) DESC;

"

 

add_section "GIN Indexes (High Risk)" "

SELECT

indexrelname AS gin_index,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

WHERE indexrelname ILIKE '%gin%'

ORDER BY pg_relation_size(indexrelid) DESC;

"

 

# ---------- HTML FOOTER ----------

cat $FOOTER >> $HTML_OUT

Show more lines

Make executable (if not already):

Shell

chmod +x /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines


4️ Cron Job (Already Works)

No change required — HTML is now generated automatically:

Shell

0 3 * * * /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines


5️ Sample HTML Output (What You’ll See)

Clear tables
Color‑coded headers
Scrollable and shareable
Manager‑friendly

Example sections:

  • Index Health Summary
  • Top Largest Indexes
  • Fast‑Growing Indexes
  • Unused Indexes
  • GIN Index Risk Table

Open it with:

Shell

xdg-open index_report_2026-04-18.html

Show more lines

or serve via Nginx/Apache.


6️ Best Practices for HTML Reports

One HTML file per day
Retain 3090 days
Embed CSS for portability
Do NOT auto‑fix based on HTML
Use HTML for decision review


Summary

  • HTML reports added cleanly
  • Uses built‑in psql HTML mode
  • Integrates with existing cron/job setup
  • Produces professional, readable reports
  • Safe for production systems

You now have TXT + CSV + HTML auto‑generated PostgreSQL index reports

 

Multithreading in PostgreSQL Using dblink and FDW (Reality vs Myth) First: The Critical Truth (Must Be Clear) You cannot create true mul...