0%

更新 Docker 运行的 PostgreSQL 实例的版本

背景

服务器的 PostgreSQL 实例通过 docker-compose 运行在 docker 环境中, 使用的官方镜像;

PostgreSQL 发布了新的主要的版本, 兴冲冲的修改镜像准备升级, 运行后报错:

2023-10-11 16:34:29.913 CST [1] FATAL:  database files are incompatible with server
2023-10-11 16:34:29.913 CST [1] DETAIL:  The data directory was initialized by PostgreSQL version 16, which is not compatible with this version 16.0 (Debian 16.0-1.pgdg120+1).

这是 PostgreSQL 16 发布时候的时候的报错.

postgres-1  | 2024-10-10 01:57:26.786 UTC [1] FATAL:  database files are incompatible with server
postgres-1  | 2024-10-10 01:57:26.786 UTC [1] DETAIL:  The data directory was initialized by PostgreSQL version 16, which is not compatible with this version 17.0 (Debian 17.0-1.pgdg120+1).

这是 PostgreSQL 17 发布时候的时候的报错.

Error: in 18+, these Docker images are configured to store database data in a
       format which is compatible with "pg_ctlcluster" (specifically, using
       major-version-specific directory names).  This better reflects how
       PostgreSQL itself works, and how upgrades are to be performed.

       See also https://github.com/docker-library/postgres/pull/1259

       Counter to that, there appears to be PostgreSQL data in:
         /var/lib/postgresql/17/docker

       This is usually the result of upgrading the Docker image without upgrading
       the underlying database using "pg_upgrade" (which requires both versions).

       See https://github.com/docker-library/postgres/issues/37 for a (long)
       discussion around this process, and suggestions for how to do so.

这是 PostgreSQL 18 发布时候的时候的报错.

原因是 PostgreSQL 数据库数据不同主要版本之间不兼容, 需要手动升级数据.

解决办法

阅读官方文档, 说可以使用 pg_dumpallpg_upgrade 工具进行处理:

  • pg_dumpall
    pg_dumpallPostgreSQL 的转储工具, 通过再老版本转储后再新版本导入的方式可以实现数据升级;
  • pg_upgrade
    pg_upgradePostgreSQL 的专用升级工具, 大量数据的情况下比 pg_dumpall 性能更好; 本文使用此工具进行操作;

pg_upgrade 工具需要传入至少 4 个参数:

pg_upgrade -b oldbindir [-B newbindir] -d oldconfigdir -D newconfigdir [option...]
  • -b: 旧版本的 PostgreSQL 程序文件目录;
  • -B: 新版本的 PostgreSQL 程序文件目录;
  • -d: 旧版本的数据目录;
  • -D: 新版本的数据目录;

升级数据

服务器本身跑的 docker, 直接使用一个临时容器来操作:

操作前务必停止正在运行的实例, 否则可能导致数据损坏!!!

启动新版本容器:

docker run -ti --rm -v ~/data/postgres/data/:/var/lib/postgresql/ --entrypoint bash postgres:18.0
  • ~/data/postgres/data/ 是本地工作数据目录。
  • postgres 18 开始修改了 docker 默认数据目录为 /var/lib/postgresql/PG_MAJOR/docker/, 且推荐挂载点为 /var/lib/postgresql/

接下来都在容器中操作即可:

# 配置信息
export PG_MAJOR_OLD=17
export PGBINOLD=/usr/lib/postgresql/$PG_MAJOR_OLD/bin/
export PGBINNEW=/usr/lib/postgresql/$PG_MAJOR/bin/
export PGDATAOLD=/var/lib/postgresql/$PG_MAJOR_OLD/docker/
export PGDATANEW=/var/lib/postgresql/$PG_MAJOR/docker/

# 安装旧版本,pg_upgrade 会用到相关的内容
sed -e 's|deb.debian.org|mirrors.tuna.tsinghua.edu.cn|g' -i.bak /etc/apt/sources.list.d/debian.sources
apt update
apt install -y --no-install-recommends --no-install-suggests postgresql-$PG_MAJOR_OLD

# 准备环境
chmod 0700 -R /var/lib/postgresql/
chown postgres:postgres -R /var/lib/postgresql/

# 先切到 postgres 用户, 并进入有写权限的目录, initdb、pg_upgrade 不允许以 root 用户执行
su postgres
cd ~

# 初始化新版本基础数据
initdb --pgdata=$PGDATANEW

# 同步配置文件
cp -rf $PGDATAOLD/pg_hba.conf $PGDATANEW/pg_hba.conf

# 检查 & 升级
pg_upgrade --check
pg_upgrade

问题记录

pg_upgrade 升级失败,提示旧版本数据没有数据开启校验

Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok

old cluster does not use data checksums but the new one does
Failure, exiting

解决方法,开启旧版本数据校验:

/usr/lib/postgresql/$PG_MAJOR_OLD/bin/pg_checksums --pgdata=$PGDATAOLD --enable --progress --verbose

也可以通过创建无校验的新数据库来处理:

initdb --pgdata=$PGDATANEW --no-data-checksums

pg_upgrade 升级失败,提示不能连接旧版本数据库

Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok

*failure*
Consult the last few lines of "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" for
the probable cause of the failure.

connection to server on socket "/var/lib/postgresql/.s.PGSQL.50432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?


could not connect to source postmaster started with the command:
"/usr/lib/postgresql/17/bin/pg_ctl" -w -l "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" -D "/var/lib/postgresql/17/docker" -o "-p 50432 -b  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start
Failure, exiting

进一步分析日志 pg_upgrade_server.log, 提示数据目录权限错误,启动服务失败。

-----------------------------------------------------------------
  pg_upgrade run on Thu Oct  9 10:00:17 2025
-----------------------------------------------------------------

command: "/usr/lib/postgresql/17/bin/pg_ctl" -w -l "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" -D "/var/lib/postgresql/17/docker" -o "-p 50432 -b  -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" 2>&1
waiting for server to start....2025-10-09 10:00:17.886 UTC [1536] FATAL:  data directory "/var/lib/postgresql/17/docker" has invalid permissions
2025-10-09 10:00:17.886 UTC [1536] DETAIL:  Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
 stopped waiting
pg_ctl: could not start server
Examine the log output.

解决方法,修改数据目录权限为 0700

chmod 0700 -R /var/lib/postgresql/

升级之后无法连接

2025-10-09 19:16:39.576 CST [35] FATAL:  no pg_hba.conf entry for host "192.168.8.12", user "postgres", database "postgres", no encryption

解决方法,同步 pg_hba.conf 配置文件,docker 镜像默认会开放所有连接:

cp -rf $PGDATAOLD/pg_hba.conf $PGDATANEW/pg_hba.conf

docker 镜像首次初始化之后会自动加上:host all all all scram-sha-256,因为升级是手动调用的 initdb 所有没有这个配置。

参考