PostgreSQL – Incremental Backups with pg_basebackup: What Changed, What Matters, and What Can Go Wrong

For years, PostgreSQL DBAs had to rely on external tools if they wanted true incremental physical backups.

We had:

  • pg_basebackup for full backups
  • WAL archiving for point-in-time recovery
  • Third-party tools for incremental strategies

Now PostgreSQL 17 changes that.

It introduces native incremental physical backups using pg_basebackup.

That’s a big deal.

But like every powerful feature in PostgreSQL, it looks simple — until you try to automate restore in production.

Let’s go deeper.


Why Incremental Backups Matter

Full physical backups are reliable.
But they are expensive.

If your cluster is:

  • 2 TB
  • 5 TB
  • 20 TB

Running full backups daily is not realistic.

Even if you can store them, you waste:

  • Disk space
  • Network bandwidth
  • Backup window time

Incremental backups change the equation.

They copy only what changed since the previous backup.

That means:

  • Smaller backup sizes
  • Faster execution
  • Lower I/O impact
  • Reduced storage cost

For large environments, this is not a luxury.

It’s survival.


The Traditional Problem Before PostgreSQL 17

Before v17:

  • pg_basebackup only supported full backups
  • You had to combine full + WAL archiving
  • Or use tools like pgBackRest / Barman

Those tools are excellent.

But having incremental support natively inside PostgreSQL is a major architectural evolution.

Now PostgreSQL tracks modified blocks through WAL summarization and, given the manifest of an earlier backup, copies only the blocks changed since that backup.

That is a big shift.


Creating a Full Backup with pg_basebackup (PostgreSQL 17)

The full backup still looks familiar:

pg_basebackup -h localhost -U postgres -D "C:\backup\backup1_full" -Ft -X fetch -P -v -c fast

Important flags explained practically:

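  • -Ft → TAR format (portable; add -z if you also want gzip compression)
  • -X fetch → Collects the required WAL files at the end of the backup (wal_keep_size must be high enough to keep them around)
  • -c fast → Fast checkpoint (the backup starts right away instead of waiting for a spread checkpoint, at the cost of an I/O burst)
  • -P → Progress display (very useful in large clusters)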

This creates a complete physical copy of your cluster.

But the key piece here is:

The backup creates a backup_manifest file.

This file is the foundation of incremental backups.

Without it, incremental does not exist.


Incremental Backup — The Game Changer

Now comes the interesting part.

Instead of copying everything again, we reference the previous manifest:

pg_basebackup -h localhost -U postgres -D "C:\backup\backup1_incr1" -i "C:\backup\backup1_full\backup_manifest" -c fast -v

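What happens here?

PostgreSQL:

  • Reads the previous manifest to identify the earlier backup and the WAL position it started from
  • Consults the WAL summaries written since then to find which blocks were modified
  • Copies only those blocks, plus any files that have to be sent in full

This is not a checksum comparison between files.

This is block-level incremental, driven by WAL summaries.

So the server must run with summarize_wal = on, and the summaries covering the period since the previous backup must still exist (see wal_summary_keep_time).

That distinction matters.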
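
As a minimal sketch of how a script might wrap that command, the Python helper below builds the pg_basebackup argument list and refuses to continue when the previous backup has no backup_manifest. The function name and its defaults are my own choices for illustration; only the pg_basebackup flags come from the command above.

```python
from pathlib import Path


def incremental_backup_cmd(prev_backup_dir, target_dir, host="localhost", user="postgres"):
    """Build the pg_basebackup argument list for an incremental backup.

    Raises FileNotFoundError when the previous backup has no backup_manifest,
    because pg_basebackup -i needs that file as its reference point.
    """
    manifest = Path(prev_backup_dir) / "backup_manifest"
    if not manifest.is_file():
        raise FileNotFoundError(f"no backup_manifest in {prev_backup_dir}")
    return [
        "pg_basebackup",
        "-h", host,
        "-U", user,
        "-D", str(target_dir),
        "-i", str(manifest),  # manifest of the backup this one builds on
        "-c", "fast",
        "-v",
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True). Failing before the backup starts is cheaper than discovering a missing manifest halfway through a nightly job.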


Second Incremental

You can chain them:

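pg_basebackup -h localhost -U postgres -D "C:\backup\backup1_incr2" -i "C:\backup\backup1_incr1\backup_manifest" -c fast -v -Ft

Note the -Ft here: a TAR-format backup has to be extracted before pg_combinebackup can use it at restore time.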

Now you are building a backup chain:

Full → Incremental 1 → Incremental 2

And this is where things get serious.
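
A chain like Full → Incremental 1 → Incremental 2 is only useful if you can rebuild its order later. Here is a small Python sketch that does this from a catalog you maintain yourself. The catalog structure (backup name mapped to the backup it was taken against) is hypothetical; PostgreSQL does not provide it.

```python
def order_chain(catalog, newest):
    """Return backup names from the full backup to `newest`, oldest first.

    `catalog` maps each backup name to the name of the backup it was taken
    against (None for a full backup). It is a hypothetical, self-maintained
    structure, not something PostgreSQL writes for you.
    """
    chain = []
    current = newest
    while current is not None:
        if current not in catalog:
            raise KeyError(f"chain is broken: {current!r} is missing")
        if current in chain:
            raise ValueError(f"chain loops back to {current!r}")
        chain.append(current)
        current = catalog[current]
    chain.reverse()
    return chain


catalog = {
    "backup1_full": None,
    "backup1_incr1": "backup1_full",
    "backup1_incr2": "backup1_incr1",
}
print(order_chain(catalog, "backup1_incr2"))
# ['backup1_full', 'backup1_incr1', 'backup1_incr2']
```

If backup1_incr1 disappears from the catalog, the function raises instead of returning a partial chain. That is the behavior you want from a restore script.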


The Manifest File: The Most Important Piece

The backup_manifest contains:

  • File names
  • Sizes
  • Checksums
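  • Metadata, including the WAL range the backup needs

Without it, pg_basebackup has no reference point for the next incremental.

If that file is lost or corrupted:

You cannot extend the chain from that backup, and you can no longer verify it with pg_verifybackup.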

That means:

Backup design must include manifest protection.

This is not optional.
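
One cheap protection is to read every manifest right after the backup finishes. The sketch below uses only Python's standard library and assumes the documented JSON layout of backup_manifest (a "Files" list with "Size" entries and a "WAL-Ranges" list). The function name and the shape of the returned summary are mine.

```python
import json


def summarize_manifest(path):
    """Return file count, total bytes and WAL ranges from a backup_manifest.

    json.load raises an error on a truncated or otherwise unparsable file,
    which already catches the crudest kind of manifest corruption.
    """
    with open(path, encoding="utf-8") as fh:
        manifest = json.load(fh)
    files = manifest.get("Files", [])
    return {
        "file_count": len(files),
        "total_bytes": sum(entry["Size"] for entry in files),
        "wal_ranges": [
            (r["Timeline"], r["Start-LSN"], r["End-LSN"])
            for r in manifest.get("WAL-Ranges", [])
        ],
        "has_manifest_checksum": "Manifest-Checksum" in manifest,
    }
```

This does not replace pg_verifybackup. It only tells you that the manifest is readable and roughly what the backup contains, which is enough to alert early and to feed a backup catalog.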


The Real Challenge: Restore Complexity

Creating incremental backups is easy.

Restoring them is where most DBAs get nervous.

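To restore:

You must:

  1. Have the full backup and every incremental after it available in plain (extracted) format
  2. Run pg_combinebackup with the backups listed from oldest to newest to produce a synthetic full backup
  3. Ensure WAL consistency (the WAL needed for recovery must be in the backup or in the archive)
  4. Verify integrity, for example with pg_verifybackup on the combined output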

If you skip one incremental layer?

Your restore fails.

If you misalign WAL?

Recovery fails.

If your script logic is weak?

Disaster recovery becomes a nightmare.

This is why automation is critical.

Manual incremental chains are dangerous.
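
To make the restore step concrete, here is a hedged Python sketch that builds a pg_combinebackup argument list from an ordered chain. The -o and --dry-run options are real pg_combinebackup options. The function name and the C:\restore\pgdata output path are assumptions for illustration.

```python
def combine_cmd(backup_dirs, output_dir, dry_run=False):
    """Build a pg_combinebackup argument list.

    `backup_dirs` must be ordered oldest to newest: the full backup first,
    the most recent incremental last. All of them must be plain-format
    (extracted) directories.
    """
    if not backup_dirs:
        raise ValueError("at least the full backup is required")
    cmd = ["pg_combinebackup"]
    if dry_run:
        cmd.append("--dry-run")  # check the chain without writing output
    cmd += ["-o", str(output_dir)]
    cmd += [str(d) for d in backup_dirs]
    return cmd


chain = [
    r"C:\backup\backup1_full",
    r"C:\backup\backup1_incr1",
    r"C:\backup\backup1_incr2",
]
# Hypothetical output directory for the synthetic full backup.
print(combine_cmd(chain, r"C:\restore\pgdata", dry_run=True))
```

Running the dry-run variant on a schedule is a cheap way to catch a broken chain before the day you actually need the restore.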


Full + Incremental Strategy Design

Here’s how I think about it in real environments:

Scenario 1 – Medium Environment (1–3 TB)

  • Weekly full backup
  • Daily incremental backups
  • Continuous WAL archiving

Scenario 2 – Large Enterprise (10+ TB)

  • Full backup every 2–4 weeks
  • Daily incrementals
  • Hourly WAL validation
  • Restore test automation

Backup without restore testing is fantasy.

Incremental backups multiply restore complexity.
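
The scheduling rule behind both scenarios fits in a few lines. This is only a sketch: the function name and the idea of tracking the date of the last full backup are my assumptions, and the interval is whatever your scenario calls for (7 days for the weekly case, 14 to 28 for the large one).

```python
from datetime import date


def next_backup_kind(last_full, today, full_interval_days=7):
    """Return "full" when a new full backup is due, else "incremental".

    `last_full` is the date of the most recent full backup, or None when
    no full backup exists yet.
    """
    if last_full is None or (today - last_full).days >= full_interval_days:
        return "full"
    return "incremental"


print(next_backup_kind(date(2024, 9, 29), date(2024, 10, 2)))   # incremental
print(next_backup_kind(date(2024, 9, 29), date(2024, 10, 6)))   # full
```

Keeping the rule in code rather than in two separate cron entries means a failed full backup is retried the next day instead of leaving the chain to grow for another week.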


Performance Considerations

Incremental backups:

  • Reduce disk I/O
  • Reduce network usage
  • Reduce backup window

But they:

  • Increase recovery planning complexity
  • Require stricter chain management
  • Demand better scripting discipline

This is not “set and forget.”


What Can Go Wrong?

From experience, here are real risks:

  • Losing a manifest file
  • Breaking incremental chain order
  • Forgetting WAL retention configuration
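  • Mixing TAR and plain formats incorrectly (pg_combinebackup needs plain, extracted backups)
  • Running without summarize_wal = on, or letting WAL summaries expire before the next incremental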
  • Not validating backup integrity
  • Long-running replication slots blocking WAL cleanup

Incremental backups reduce size — but increase responsibility.


Comparing to Other Databases

SQL Server

  • Differential backups built-in
  • Log backups mature
  • Restore chain well documented

Oracle

  • RMAN incremental backups highly mature

PostgreSQL (Before 17)

  • Required third-party tools

PostgreSQL 17

Now native incremental exists — but still young compared to RMAN.

This is evolution, not final perfection.


Should You Replace pgBackRest or Barman?

Not necessarily.

Enterprise-grade environments still benefit from:

  • Backup catalog management
  • Retention automation
  • Encryption integration
  • Parallel restore
  • Compression optimization

pg_basebackup incremental is powerful.

But ecosystem tools still add operational maturity.


My Professional Opinion

This feature is huge.

But don’t mistake feature availability for operational simplicity.

If you:

  • Run large clusters
  • Need predictable RTO
  • Have strict compliance requirements

You must:

  • Automate backup chains
  • Validate manifests
  • Test restores regularly
  • Monitor WAL archiving
  • Protect backup metadata

Incremental backups are not only about saving disk space.

They are about backup architecture discipline.


The Real DBA Question

Don’t ask:

“How do I create incremental backups?”

Ask:

“If my primary server crashes right now, can I restore everything correctly — fast?”

If the answer is not a confident yes,

Your backup strategy is incomplete.


Final Thought

PostgreSQL 17 incremental backups are a major step forward.

They:

  • Reduce backup overhead
  • Improve scalability
  • Lower storage cost

But they increase architectural responsibility.

Backup is not about copying data.

Backup is about restoring certainty.

And incremental strategies demand serious automation and testing.

That’s how I approach it.


Sandro Servino is a senior IT professional with over 30 years of experience in technology, having worked as a Developer, Project Manager (acting as a Requirements Analyst and Scrum Master), Professor, IT Infrastructure Team Coordinator, IT Manager, and Database Administrator. He has been working with Database technologies since 1996 and has been vendor-certified since the early years of his career. Throughout his professional journey, he has combined deep technical expertise with leadership, education, and consulting experience in mission-critical environments. Sandro has trained more than 20,000 students in database technologies, helping professionals build strong foundations and advance their careers in data platforms and database administration. He has delivered corporate training programs for multiple companies and served as a university professor teaching Database and Data Administration for over five years. For many years, he worked as an independent consultant specializing in SQL Server, providing strategic and technical support for complex database environments. He has extensive experience in troubleshooting and resolving critical issues in SQL Server production environments, including performance tuning, high availability, disaster recovery, security, and infrastructure optimization. His academic background includes: Postgraduate Degree in School Education MBA in IT Governance Master’s Degree in Knowledge Management and Information Technology Currently, Sandro works as a Database Administrator for multinational companies in Europe, managing enterprise-level SQL Server environments and supporting large-scale, high-demand infrastructures. 
Areas of Expertise SQL Server (Administration, Performance, HA/DR, Troubleshooting) Azure SQL Databases MySQL Oracle PostgreSQL Power BI Data Analytics Data Warehouse Windows Server Oracle Linux Server Ubuntu Linux Server DBA Training and Mentorship Business Continuity and Disaster Recovery Strategies Courses and Training Programs Sandro delivers professional training programs focused on the formation of DBAs and Data/BI Analysts, covering: SQL Server and Azure SQL Databases MySQL Oracle PostgreSQL Power BI Data Analytics Data Warehouse Windows Server Oracle Linux Server Ubuntu Linux Server With a unique combination of technical depth, academic knowledge, real-world consulting experience, and international exposure, Sandro Servino brings practical, results-driven expertise to database professionals and organizations seeking reliability, performance, and resilience in their data platforms.