PostgreSql: May 2026

Thursday, 14 May 2026

Multithreading in PostgreSQL Using dblink and FDW (Reality vs Myth)

First: The Critical Truth (Must Be Clear)

You cannot create true multithreading inside a single PostgreSQL backend, even with dblink or Foreign Data Wrappers (FDW).

What you can do: ✅ Achieve parallel execution across multiple PostgreSQL backends
✅ Simulate multithreading by delegating work to other connections
✅ Run independent queries concurrently using remote sessions

This is process‑level parallelism, not thread‑level multithreading.

1. What dblink and FDW Actually Do

dblink

Opens another PostgreSQL connection
Executes SQL on that connection
Returns results to the caller
Runs in a separate backend process

postgres_fdw

Creates foreign tables mapped to another PostgreSQL server
Uses remote connections
Query planner decides whether to push down work

📌 Key Point
Each dblink or FDW call = another PostgreSQL process, not a thread.

2. Why People Use dblink / FDW for “Multithreading”

Typical goals:

Parallel processing of large datasets
Running independent tasks at the same time
Avoiding blocking in long‑running operations
Batch processing / ETL parallelism

PostgreSQL does not allow THREAD {} blocks, so people use extra connections instead.

3. ❌ What You CANNOT Do (Important)

❌ You cannot:

Run two SQL statements at the same time inside one function
Spawn threads inside PL/pgSQL
Use dblink calls concurrently from the same backend
Force dblink calls to run asynchronously

This will still execute sequentially in one backend:

SQL

SELECT dblink_exec(...);

Show more lines

4. ✅ Correct Way: Parallel Work Using dblink + Multiple Sessions

Architecture Concept

Plain Text

Coordinator Session

├── DBLink Session 1 → Task A

├── DBLink Session 2 → Task B

└── DBLink Session 3 → Task C

Show more lines

Each DBLink session:

Is a separate PostgreSQL backend
Can run concurrently
Uses a different CPU core

5. Example: Parallel Task Execution Using dblink

Step 1: Enable dblink

Plain Text

CREATE EXTENSION dblink;

Show more lines

Step 2: Open Multiple Connections

SQL

SELECT dblink_connect('w1', 'dbname=mydb');

SELECT dblink_connect('w2', 'dbname=mydb');

SELECT dblink_connect('w3', 'dbname=mydb');

Show more lines

Each connection runs in a separate backend process.

Step 3: Dispatch Work

SQL

SELECT dblink_exec('w1', 'CALL process_partition(1)');

SELECT dblink_exec('w2', 'CALL process_partition(2)');

SELECT dblink_exec('w3', 'CALL process_partition(3)');

Show more lines

✅ Work runs in parallel ✅ Uses multiple CPU cores ✅ Each task isolated

Step 4: Disconnect

SQL

SELECT dblink_disconnect('w1');

SELECT dblink_disconnect('w2');

SELECT dblink_disconnect('w3');

Show more lines

6. Using FDW for Parallelism (Cleaner, Safer)

FDW is often better than dblink.

Setup postgres_fdw

SQL

CREATE EXTENSION postgres_fdw;

CREATE SERVER worker1 FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host 'localhost', dbname 'mydb', port '5432');

CREATE USER MAPPING FOR CURRENT_USER

SERVER worker1 OPTIONS (user 'postgres');

Show more lines

Parallel Reads via FDW

SQL

SELECT count(*) FROM remote_table_part1

UNION ALL

SELECT count(*) FROM remote_table_part2;

Show more lines

If:

use_remote_estimate = on
parallel_append = on

PostgreSQL can plan parallel execution across FDW connections.

📌 FDW parallelism is planner‑dependent, not guaranteed.

7. ✅ Best Pattern: Leader / Worker Model

This is how production systems do it.

Leader (Coordinator)

Assigns tasks
Uses dblink or FDW connections

Workers

Run heavy logic
Independent transactions
Fail independently

Example use cases:

ETL pipelines
Parallel data validation
Background batch jobs
Sharded processing

8. Performance and Safety Considerations

✅ Pros

True CPU parallelism
Failure isolation
Scales well on multi‑core systems

⚠️ Cons

High connection cost
More memory usage
Needs connection pooling
Harder error handling

9. When NOT to Use dblink/FDW for Parallelism

❌ For OLTP transaction processing
❌ For row‑by‑row loops
❌ For small queries
❌ Instead of native parallel queries

Use PostgreSQL parallel queries first, then dblink only when needed.

10. Recommended Alternatives (Often Better)

Goal	Better Option
CPU parallelism	Native parallel query
Task concurrency	App‑level threads
Batch processing	Job queue + workers
Scheduling	pg_cron
Background jobs	Background workers
High throughput	Partitioning

Final Verdict (Very Important)

dblink and FDW do NOT create multithreading.
They create controlled, parallel multiprocessing using additional sessions.

This is: ✅ Safe
✅ Supported
✅ Production‑proven

But it must be designed carefully.

Simple One‑Line Summary

Multithreading in PostgreSQL using dblink/FDW = parallel work via multiple backend processes, not real threads.

Wednesday, 13 May 2026

How to Achieve Multithreading in PostgreSQL (Correct Way)

✅ PostgreSQL uses multiprocessing, not multithreading
✅ You do not convert PostgreSQL into a threaded engine
✅ You achieve multithreading behavior using PostgreSQL‑supported mechanisms

1. Important Truth (Must Understand First)

PostgreSQL DOES NOT:

Run multiple threads inside one backend process
Allow user‑defined threaded execution in PL/pgSQL
Support THREAD, PARALLEL blocks like Java/C++

PostgreSQL DOES:

Run multiple OS processes
Execute parallel queries
Use parallel workers
Support concurrent client execution
Safely scale across CPU cores

✅ This is intentional and architecturally correct

2. ✅ Primary Way: Parallel Query Execution (Best Replacement for Multithreading)

PostgreSQL supports true parallel execution using multiple worker processes.

Enable Parallelism

SQL

SHOW max_parallel_workers;

SHOW max_parallel_workers_per_gather;

Show more lines

Recommended settings (example):

Apache Config

max_parallel_workers = 8

max_parallel_workers_per_gather = 4

Show more lines

Example: Parallel Query Execution

SQL

EXPLAIN ANALYZE

SELECT COUNT(*)

FROM large_table;

Show more lines

You will see:

Gather

Workers Planned: 4

-> Parallel Seq Scan

✅ Multiple CPU cores used
✅ Similar to multithreading
✅ Fully managed by PostgreSQL

3. ✅ Application‑Level Multithreading (MOST IMPORTANT)

✅ Correct pattern

Threads live in the application, not in PostgreSQL

Each thread:

Uses its own DB connection (or pooled connection)
Executes queries concurrently

Example (Java / Python / Node):

Thread 1 → Query A → PostgreSQL Process 1

Thread 2 → Query B → PostgreSQL Process 2

Thread 3 → Query C → PostgreSQL Process 3

✅ This gives real multithreaded behavior ✅ PostgreSQL handles concurrency safely ✅ This is the industry standard

Use Connection Pooling (MANDATORY)

Because PostgreSQL is process‑based:

Tools:

✅ PgBouncer
✅ PgPool
✅ HikariCP / psycopg pool / Sequelize pool

Rule:

1000 threads ≠ 1000 DB connections

4. ✅ Background Workers (Advanced / Internal Multitasking)

PostgreSQL supports background worker processes (C extensions).

Used for:

Async processing
Queue consumers
Continuous tasks

Examples:

Logical replication
Autovacuum
pg_cron jobs

⚠️ Requires C programming ⚠️ Not for normal application use

5. ✅ Job Parallelism Using pg_cron or Queues

For “task‑based multithreading”:

pg_cron example

SQL

SELECT cron.schedule(

'parallel_job',

'*/5 * * * *',

$CALL process_batch();$

);

Show more lines

Run:

Multiple jobs
Parallel execution
Independent processes

✅ Safe concurrency
✅ Ideal for ETL / batch systems

6. ✅ Logical Sharding / Partition Parallelism

Partitioning enables parallel scans and writes:

SQL

CREATE TABLE events (

id bigint,

created_at date

) PARTITION BY RANGE (created_at);

Show more lines

PostgreSQL can:

Scan partitions in parallel
Insert concurrently without contention

✅ Practical performance boost
✅ Works like multithreaded data access

7. ❌ What You CANNOT Do (Common Mistakes)

❌ Create threads in PL/pgSQL
❌ Run multiple threads inside a function
❌ Use fork() or pthread in SQL
❌ Force PostgreSQL to behave like MySQL threads

PostgreSQL intentionally avoids threading to ensure:

Memory safety
Crash isolation
Extension safety

8. ✅ Summary: Correct Ways to Achieve Multithreading

Goal	Correct PostgreSQL Solution
CPU parallelism	Parallel queries
Concurrent execution	Multiple connections
Threaded workloads	Application‑level threads
Background tasks	Background workers / pg_cron
Batch parallelism	Job scheduling + queues
High performance	Pooling + partitioning

Final Verdict (Very Important)

You do not “add multithreading” to PostgreSQL.
You DESIGN around its multiprocessing architecture.

This is why PostgreSQL:

Scales exceptionally well
Rarely crashes
Handles massive concurrency safely

Tuesday, 12 May 2026

PostgreSQL Multiprocessing Explained

PostgreSQL uses multiprocessing, not multithreading, as a core architectural design.
This design choice affects performance, concurrency, isolation, and scalability.

1. What Is Multiprocessing in PostgreSQL?

Multiprocessing means:

PostgreSQL uses multiple operating system processes to do work in parallel.

Each important task runs in its own process, not as a thread inside one process.

This is different from databases that rely heavily on multithreading.

2. The PostgreSQL Process Model (Core Concept)

PostgreSQL follows a process‑per‑connection architecture.

Main process types:

Postmaster (main server)

├── Client Backend Process (one per connection)

├── WAL Writer

├── Checkpointer

├── Background Writer

├── Autovacuum Workers

├── Logical Replication Workers

└── Parallel Query Workers

Each box above is a separate OS process.

3. One Connection = One Backend Process

When a client connects:

Plain Text

Client → PostgreSQL → New backend OS process

Show more lines

That backend process:

Parses SQL
Executes queries
Uses its own memory context
Communicates with shared memory

✅ Strong isolation
✅ Crash protection
❌ More memory per connection

4. Shared Memory + Multiple Processes

Even though PostgreSQL uses multiple processes, they share data using:

Shared memory segments
Lightweight locks (LWLocks)
Spinlocks

Shared structures include:

Buffer cache
WAL buffers
Lock tables
Statistics

This is how multiprocessing works without data corruption.

5. Parallel Query Execution (True Multiprocessing)

PostgreSQL supports parallel execution, where one query uses multiple processes.

Example:

SQL

SET max_parallel_workers_per_gather = 4;

Show more lines

For large queries, PostgreSQL may spawn:

Leader Process

├── Parallel Worker 1

├── Parallel Worker 2

├── Parallel Worker 3

└── Parallel Worker 4

Used for:

Sequential scans
Aggregations
Hash joins

✅ Uses multiple CPU cores
✅ Faster analytics queries

6. Background Multiprocessing Tasks

PostgreSQL always runs multiple background processes, even with no users connected.

Examples:

Autovacuum workers → clean dead rows
WAL writer → write logs
Checkpointer → ensure durability
Logical replication workers
Stats collector

These run independently and concurrently with user queries.

7. MVCC + Multiprocessing (Key Advantage)

PostgreSQL’s MVCC (Multi‑Version Concurrency Control) works perfectly with multiprocessing.

Result:

Readers never block writers
Writers never block readers
Each process sees its own snapshot

This is critical for:

High concurrency
Large transactional systems
Long‑running queries

8. Why PostgreSQL Chose Multiprocessing (Not Threads)

Advantages:

✅ Better stability (one process crash doesn’t kill others)
✅ Easier memory cleanup
✅ Safer extension execution
✅ Strong OS‑level isolation

Trade‑offs:

❌ Higher memory usage
❌ Too many connections can overload the server

This is why connection pooling is essential.

9. Multiprocessing + Connection Pooling

Because every connection is a process:

❌ 5,000 connections = 5,000 OS processes
✅ Use connection pools

Common tools:

PgBouncer
PgPool‑II
App‑level pools (HikariCP, psycopg pool)

Best practice:

Many app requests → Few PostgreSQL processes

10. Monitoring Multiprocessing in PostgreSQL

View running processes:

SQL

SELECT pid, state, backend_type, query

FROM pg_stat_activity;

Show more lines

Backend types include:

client backend
autovacuum worker
parallel worker
logical replication worker

11. Common Multiprocessing Issues

Too many backend processes

✅ Use pooling
✅ Reduce max_connections

High context switching

✅ Tune parallel workers
✅ Reduce idle connections

Autovacuum contention

✅ Proper autovacuum tuning

12. Simple Summary

✅ PostgreSQL uses multiprocessing, not multithreading
✅ Each connection runs in its own OS process
✅ Parallel queries use multiple worker processes
✅ Background workers run continuously
✅ Shared memory + locks keep data safe
✅ This design favors stability and correctness

In One Line:

PostgreSQL multiprocessing prioritizes safety, isolation, and predictable concurrency over raw lightweight threading.

Monday, 11 May 2026

PostgreSQL Multitasking Explained

PostgreSQL does not use multitasking in the traditional application sense (like threads performing different jobs in one process).
Instead, PostgreSQL achieves multitasking through process‑based concurrency, MVCC, and parallel execution.

In simple words:

PostgreSQL can handle many users and many queries at the same time safely and efficiently.

1. What “Multitasking” Means in PostgreSQL

In PostgreSQL, multitasking usually refers to:

Multiple users running queries at the same time
Reads and writes happening concurrently
Background maintenance tasks running alongside queries
Queries using parallel workers to run faster

This is handled through concurrency + parallelism, not OS‑level threading like MySQL or application code.

2. Process‑Based Architecture (Key Concept)

PostgreSQL uses a process‑per‑connection model.

How it works:

One Postmaster process controls the server
Each client connection gets its own backend process
Background tasks run in separate system processes

Example active processes:

Client query processes
Autovacuum workers
WAL writer
Checkpointer
Background writer

✅ This provides strong isolation
❌ More memory per connection

3. MVCC – How PostgreSQL Avoids Locks

The most important multitasking feature is MVCC (Multi‑Version Concurrency Control).

What MVCC Does:

Readers do not block writers
Writers do not block readers
Each transaction sees a consistent snapshot

Example:

Plain Text

User A reads a row

User B updates the same row

User A still sees the old version

User B sees the new version

Show more lines

✅ High concurrency
✅ Minimal locking
✅ Excellent for multitasking workloads

4. Locks Still Exist (But Used Carefully)

PostgreSQL does use locks, but only when necessary.

Main lock types:

Row‑level locks (UPDATE, DELETE)
Table locks (DDL)
Advisory locks (developer‑controlled)

Example:

SQL

SELECT * FROM orders WHERE id = 10 FOR UPDATE;

Show more lines

This locks only that row, not the whole table.

5. Parallel Query Execution (True Parallelism)

PostgreSQL supports parallel query execution, meaning:

One query can use multiple CPU cores

Supported operations:

Sequential scans
Aggregations
Joins

Example:

SQL

SET max_parallel_workers_per_gather = 4;

Show more lines

PostgreSQL may spawn parallel worker processes.

✅ Faster large queries
✅ Better CPU utilization

6. Background Workers (Automatic Multitasking)

PostgreSQL runs multiple background tasks simultaneously.

Common examples:

Autovacuum (cleanup)
WAL writer (write‑ahead logging)
Checkpointer
Logical replication workers

They run independently of user queries.

Example:

SQL

SELECT * FROM pg_stat_activity;

Show more lines

You can see:

Active queries
Background tasks
Idle connections

7. Connection Pooling (Important for Multitasking)

Because each connection = one process:

❌ Too many connections = performance problems

✅ Use connection pooling

Popular options:

PgBouncer
PgPool‑II
Application pools (HikariCP, SQLAlchemy pool)

Best practice:

Plain Text

Many app requests → few PostgreSQL connections

Show more lines

8. Task Scheduling (Cron‑Like Multitasking)

PostgreSQL does not have native job scheduling, but you can multitask using:

pg_cron extension
OS cron jobs + psql
Background workers

Example with pg_cron:

SQL

SELECT cron.schedule(

'cleanup_job',

'0 3 * * *',

'VACUUM ANALYZE'

);

Show more lines

9. How PostgreSQL Handles Heavy Multitasking Safely

Key mechanisms:

MVCC snapshots
Fine‑grained locks
Per‑process memory
Autovacuum cleanup
Query planner cost models

This is why PostgreSQL scales well under:

High read workloads
Mixed read/write workloads
Long‑running queries

10. Common Multitasking Problems (and Fixes)

Too many active connections

✅ Use connection pooling

Slow concurrent updates

✅ Index properly
✅ Reduce hot‑row updates

Autovacuum conflicts

✅ Tune autovacuum settings

High CPU with parallel queries

✅ Limit parallel workers

Simple Summary

✅ PostgreSQL multitasking is built on:

Process‑based concurrency
MVCC (non‑blocking reads/writes)
Parallel query execution
Background workers
Connection pooling

✅ It is very safe, scalable, and predictable

✅ It is ideal for high‑concurrency systems

Sunday, 10 May 2026

Add Email Delivery for PostgreSQL HTML Index Report

With email delivery:

DBAs get reports without logging into servers
Managers can review performance easily
Issues are noticed early
Reports become part of operational workflow

We’ll use Linux mail / sendmail / mailx, which is the most common, reliable approach.

What This Will Do

✅ Generate HTML report
✅ Email it as HTML body (not just attachment)
✅ Optional attachment support
✅ Fully automated via cron
✅ Safe for production

Prerequisites (One‑Time)

1️⃣ Ensure Mail Utility Is Installed

Debian / Ubuntu

Shell

sudo apt install mailutils

Show more lines

RHEL / CentOS / Rocky

Shell

sudo yum install mailx

Show more lines

✅ Assumes server can send email (SMTP configured or relay allowed).

Step 1: Decide Report Recipients

Edit once in script:

Shell

EMAIL_TO="dba@company.com,team@company.com"

EMAIL_FROM="postgres-monitor@company.com"

Show more lines

Step 2: Update HTML Report Generator Script

Below is the FULL updated script with HTML generation + email delivery.

📄 scripts/generate_index_report.sh

Shell

#!/bin/bash

DB="mydb"

BASE="/opt/pg_monitoring"

DATE=$(date +%F)

PSQL="/usr/bin/psql"

EMAIL_TO="dba@company.com"

EMAIL_FROM="postgres-monitor@company.com"

SUBJECT="PostgreSQL Index Report - $DATE"

HTML_OUT="$BASE/output/html/index_report_$DATE.html"

HEADER="$BASE/reports/html/header.html"

FOOTER="$BASE/reports/html/footer.html"

STYLE="$BASE/reports/html/style.css"

# ---------- HTML HEADER ----------

sed "s/{{DATE}}/$DATE/" $HEADER > $HTML_OUT

echo "<style>" >> $HTML_OUT

cat $STYLE >> $HTML_OUT

echo "</style>" >> $HTML_OUT

add_section () {

TITLE=$1

QUERY=$2

echo "<h2>$TITLE</h2>" >> $HTML_OUT

$PSQL -d $DB -H -c "$QUERY" >> $HTML_OUT

}

# ---------- HTML CONTENT ----------

add_section "Index Health Summary" "

SELECT

COUNT(*) AS total_indexes,

SUM(CASE WHEN idx_scan = 0 THEN 1 ELSE 0 END) AS unused_indexes,

pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS total_index_size

FROM pg_stat_user_indexes;

add_section "Top 10 Largest Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

ORDER BY pg_relation_size(indexrelid) DESC

LIMIT 10;

add_section "Fast Growing Indexes (Last 7 Days)" "

SELECT

index_name,

pg_size_pretty(MAX(index_size_bytes) - MIN(index_size_bytes)) AS growth

FROM monitoring.index_size_history

WHERE captured_at >= now() - INTERVAL '7 days'

GROUP BY index_name

HAVING MAX(index_size_bytes) - MIN(index_size_bytes) > 100000000;

add_section "Unused Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size

FROM pg_stat_user_indexes

WHERE idx_scan = 0

ORDER BY pg_relation_size(indexrelid) DESC;

add_section "GIN Indexes (High Risk)" "

SELECT

indexrelname AS gin_index,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

WHERE indexrelname ILIKE '%gin%'

ORDER BY pg_relation_size(indexrelid) DESC;

# ---------- HTML FOOTER ----------

cat $FOOTER >> $HTML_OUT

# ---------- EMAIL DELIVERY ----------

mail -a "Content-Type: text/html" \

-s "$SUBJECT" \

-r "$EMAIL_FROM" \

"$EMAIL_TO" < $HTML_OUT

Show more lines

Make executable (if not already):

Shell

chmod +x /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines

Step 3: Cron Job (No Change)

Your existing cron job now automatically emails the report.

Shell

0 3 * * * /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines

✅ HTML is generated
✅ Email is sent immediately
✅ Zero manual steps

Step 4: What the Email Looks Like

📧 Email Subject

PostgreSQL Index Report - 2026-04-18

📄 Email Body

Full HTML report
Styled tables
Index risk information
Easy to read on desktop & mobile

✅ No attachment required
✅ Opens instantly
✅ Manager‑friendly

Optional: Send as HTML Attachment Instead

If your mail server blocks HTML bodies, use attachment mode:

Shell

mail -s "$SUBJECT" \

-a "$HTML_OUT" \

"$EMAIL_TO" < /dev/null

Show more lines

Optional: Send Only If Issues Found

Add logic before sending:

Shell

ISSUES=$(grep -c "<tr>" $HTML_OUT)

if [ "$ISSUES" -gt 5 ]; then

mail -a "Content-Type: text/html" -s "$SUBJECT" "$EMAIL_TO" < $HTML_OUT

Show more lines

✅ Avoids alert fatigue
✅ Sends only meaningful reports

Best Practices for Email Reports

✅ Send once per day
✅ Use clear subject with date
✅ Group recipients wisely
✅ Use HTML body, not just attachments
✅ Keep reports under control (don’t spam)

Summary

HTML index reports now emailed automatically
Works with cron
Uses standard Linux mail tools
Production‑safe and readable
Zero downtime impact

You now have end‑to‑end PostgreSQL index monitoring:

✅ Collect
✅ Analyze
✅ Report
✅ Email

Saturday, 9 May 2026

Add HTML Report Generation for PostgreSQL Index Monitoring

HTML reports are useful because they:

Are easy to read in browsers
Can be shared with managers and teams
Look professional
Can be emailed or hosted internally
Combine multiple reports into one dashboard

We will generate a single self‑contained HTML report automatically.

Final Architecture (After This Step)

Shell

/opt/pg_monitoring/

├── reports/

│ ├── sql/

│ │ ├── index_health_report.sql

│ │ ├── largest_indexes.sql

│ │ ├── growing_indexes.sql

│ │ ├── unused_indexes.sql

│ │ └── gin_indexes.sql

│ └── html/

│ ├── header.html

│ ├── footer.html

│ └── style.css

├── output/

│ ├── txt/

│ ├── csv/

│ └── html/

│ └── index_report_YYYY-MM-DD.html

└── scripts/

└── generate_index_report.sh

Show more lines

1️⃣ Create HTML Template Files

📄 reports/html/header.html

HTML

<!DOCTYPE html>

<html>

<head>

<title>PostgreSQL Index Report</title>

</head>

<body>

<h1>PostgreSQL Index Monitoring Report</h1>

<p>Generated at: {{DATE}}</p>

<hr>

Show more lines

📄 reports/html/footer.html

HTML

<hr>

Generated automatically by PostgreSQL Index Monitoring

</p>

</body>

</html>

Show more lines

📄 reports/html/style.css

CSS

body {

font-family: Arial, sans-serif;

margin: 40px;

background-color: #f9f9f9;

}

h1, h2 {

color: #2c3e50;

}

table {

border-collapse: collapse;

width: 100%;

margin-bottom: 30px;

background: #ffffff;

}

th, td {

border: 1px solid #dddddd;

padding: 8px;

}

th {

background-color: #34495e;

color: white;

}

tr:nth-child(even) {

background-color: #f2f2f2;

}

.warning {

color: red;

font-weight: bold;

}

Show more lines

2️⃣ Generate HTML Tables Using psql

psql has a built‑in HTML output mode.

We’ll use:

Shell

psql -H -c "SELECT ..."

Show more lines

3️⃣ Update the Report Generator Script (HTML Included)

📄 scripts/generate_index_report.sh

Shell

#!/bin/bash

DB="mydb"

BASE="/opt/pg_monitoring"

DATE=$(date +%F)

PSQL="/usr/bin/psql"

HTML_OUT="$BASE/output/html/index_report_$DATE.html"

HEADER="$BASE/reports/html/header.html"

FOOTER="$BASE/reports/html/footer.html"

STYLE="$BASE/reports/html/style.css"

# ---------- HTML HEADER ----------

sed "s/{{DATE}}/$DATE/" $HEADER > $HTML_OUT

# embed CSS directly

echo "<style>" >> $HTML_OUT

cat $STYLE >> $HTML_OUT

echo "</style>" >> $HTML_OUT

# ---------- SECTION FUNCTION ----------

add_section () {

TITLE=$1

QUERY=$2

echo "<h2>$TITLE</h2>" >> $HTML_OUT

$PSQL -d $DB -H -c "$QUERY" >> $HTML_OUT

}

# ---------- HTML SECTIONS ----------

add_section "Index Health Summary" "

SELECT

COUNT(*) AS total_indexes,

SUM(CASE WHEN idx_scan = 0 THEN 1 ELSE 0 END) AS unused_indexes,

pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS total_index_size

FROM pg_stat_user_indexes;

add_section "Top 10 Largest Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

ORDER BY pg_relation_size(indexrelid) DESC

LIMIT 10;

add_section "Fast Growing Indexes (Last 7 Days)" "

SELECT

index_name,

pg_size_pretty(MAX(index_size_bytes) - MIN(index_size_bytes)) AS growth

FROM monitoring.index_size_history

WHERE captured_at >= now() - INTERVAL '7 days'

GROUP BY index_name

HAVING MAX(index_size_bytes) - MIN(index_size_bytes) > 100000000;

add_section "Unused Indexes" "

SELECT

indexrelname AS index_name,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size

FROM pg_stat_user_indexes

WHERE idx_scan = 0

ORDER BY pg_relation_size(indexrelid) DESC;

add_section "GIN Indexes (High Risk)" "

SELECT

indexrelname AS gin_index,

relname AS table_name,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,

idx_scan

FROM pg_stat_user_indexes

WHERE indexrelname ILIKE '%gin%'

ORDER BY pg_relation_size(indexrelid) DESC;

# ---------- HTML FOOTER ----------

cat $FOOTER >> $HTML_OUT

Show more lines

Make executable (if not already):

Shell

chmod +x /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines

4️⃣ Cron Job (Already Works)

No change required — HTML is now generated automatically:

Shell

0 3 * * * /opt/pg_monitoring/scripts/generate_index_report.sh

Show more lines

5️⃣ Sample HTML Output (What You’ll See)

✅ Clear tables
✅ Color‑coded headers
✅ Scrollable and shareable
✅ Manager‑friendly

Example sections:

Index Health Summary
Top Largest Indexes
Fast‑Growing Indexes
Unused Indexes
GIN Index Risk Table

Open it with:

Shell

xdg-open index_report_2026-04-18.html

Show more lines

or serve via Nginx/Apache.

6️⃣ Best Practices for HTML Reports

✅ One HTML file per day
✅ Retain 30–90 days
✅ Embed CSS for portability
✅ Do NOT auto‑fix based on HTML
✅ Use HTML for decision review

Summary

HTML reports added cleanly
Uses built‑in psql HTML mode
Integrates with existing cron/job setup
Produces professional, readable reports
Safe for production systems

You now have TXT + CSV + HTML auto‑generated PostgreSQL index reports ✅