pg_repack

1. Overview

pg_repack is an extension for PostgreSQL used to rebuild tables and indexes online and eliminate storage bloat. It can achieve effects similar to VACUUM FULL and CLUSTER (reclaiming space, reordering physical layout by index), while barely blocking business read/write operations. pg_repack rebuilds a table by creating a new table rather than modifying it in place. Throughout the process, it only holds a brief exclusive lock at the very start and end; the rest of the time the table remains available for normal SELECT, INSERT, UPDATE, and DELETE operations.

Both the PG mode and Oracle compatibility mode of IvorySQL have been adapted for pg_repack. Note that in Oracle compatibility mode, before compiling pg_repack, you need to manually modify the lib/pg_repack.control.in file in the extension source code to add a configuration item pg_dialect = true, otherwise the installation will fail.

License: BSD-3-Clause License

2. Principle

VACUUM FULL / CLUSTER hold an ACCESS EXCLUSIVE lock on the entire table while rewriting it, during which reads and writes are not possible. pg_repack avoids long-duration table locking by means of "shadow table + triggers + file node swap".

2.1. Shadow Table and Change Capture

  • Create a shadow table with the same structure as the original table (repack.table_<oid>), and bulk-copy the original table’s data into it.

  • Install row-level triggers on the original table to record the INSERT / UPDATE / DELETE operations that occur during the rebuild into a log table (repack.log_<oid>).

  • After the data copy is complete, replay the log so that the shadow table catches up with the latest state of the original table.

2.2. Swap and Cleanup

  • At the system catalog level, swap the physical file nodes of the original table and the shadow table. This step is instantaneous and only requires a brief exclusive lock.

  • Remove the triggers, log tables, and other temporary objects, and run ANALYZE on the new table.

Prerequisites for use: the target table must have a primary key or a non-null unique key; the operation requires approximately twice the table size of temporary disk space during the process.

3. Installation and Enabling

IvorySQL 5.0 or above is already installed in the environment. pg_repack consists of two parts: the server-side extension (pg_repack.so and SQL scripts) and the client-side command-line tool pg_repack.

3.1. Source Code Installation

  • Set the PG_CONFIG environment variable

export PG_CONFIG=/usr/local/ivorysql/bin/pg_config
  • Pull the pg_repack source code

git clone --branch ver_1.5.3 https://github.com/reorg/pg_repack.git
  • Compile and install pg_repack

To install pg_repack in IvorySQL’s Oracle compatibility mode, you need to first modify the lib/pg_repack.control.in file of pg_repack to add the pg_dialect configuration item:

# pg_repack/lib/pg_repack.control.in
pg_dialect = true

Perform compilation and installation:

cd pg_repack
sudo --preserve-env=PG_CONFIG make
sudo --preserve-env=PG_CONFIG make install
  • Create the pg_repack extension

[ivorysql@localhost ivorysql]$ psql
psql (18.0)
Type "help" for help.

ivorysql=# CREATE EXTENSION pg_repack;
CREATE EXTENSION

Note: CREATE EXTENSION pg_repack must be executed by a superuser; the client tool pg_repack also requires connecting and running as a superuser by default. After installation, the client tool is located at /usr/local/ivorysql/bin/pg_repack.

4. Usage

pg_repack is driven through the command-line client. The connection parameters are the same as those of psql (-h host, -p port, -U user, -d database, or use the PGHOST / PGPORT / PGUSER environment variables).

  • Rebuild all eligible tables in the entire database

pg_repack -d ivorysql
  • Rebuild a specified table (the schema.table form is recommended)

pg_repack -d ivorysql -t public.big_table
  • Rebuild indexes only

pg_repack -d ivorysql -t big_table --only-indexes    # rebuild all indexes of this table
pg_repack -d ivorysql -i public.big_table_idx        # rebuild a single index
  • Rebuild ordered by a specified column (online CLUSTER effect)

pg_repack -d ivorysql -t big_table -o "created_at DESC"
  • Migrate a table and its indexes to another tablespace

pg_repack -d ivorysql -t big_table -s fast_ssd_ts --moveidx
  • Dry-run mode (only shows the actions that would be performed, without actually rebuilding)

pg_repack -d ivorysql -t big_table --dry-run