背景
服务器的 PostgreSQL 实例通过 docker-compose 运行在 docker 环境中, 使用的官方镜像;
PostgreSQL 发布了新的主要的版本, 兴冲冲的修改镜像准备升级, 运行后报错:
2023-10-11 16:34:29.913 CST [1] FATAL: database files are incompatible with server
2023-10-11 16:34:29.913 CST [1] DETAIL: The data directory was initialized by PostgreSQL version 16, which is not compatible with this version 16.0 (Debian 16.0-1.pgdg120+1).
这是
PostgreSQL 16发布时候的时候的报错.
postgres-1 | 2024-10-10 01:57:26.786 UTC [1] FATAL: database files are incompatible with server
postgres-1 | 2024-10-10 01:57:26.786 UTC [1] DETAIL: The data directory was initialized by PostgreSQL version 16, which is not compatible with this version 17.0 (Debian 17.0-1.pgdg120+1).
这是
PostgreSQL 17发布时候的时候的报错.
Error: in 18+, these Docker images are configured to store database data in a
format which is compatible with "pg_ctlcluster" (specifically, using
major-version-specific directory names). This better reflects how
PostgreSQL itself works, and how upgrades are to be performed.
See also https://github.com/docker-library/postgres/pull/1259
Counter to that, there appears to be PostgreSQL data in:
/var/lib/postgresql/17/docker
This is usually the result of upgrading the Docker image without upgrading
the underlying database using "pg_upgrade" (which requires both versions).
See https://github.com/docker-library/postgres/issues/37 for a (long)
discussion around this process, and suggestions for how to do so.
这是
PostgreSQL 18发布时候的时候的报错.
原因是 PostgreSQL 数据库数据不同主要版本之间不兼容, 需要手动升级数据.
解决办法
阅读官方文档, 说可以使用 pg_dumpall 或 pg_upgrade 工具进行处理:
- pg_dumpall
pg_dumpall是PostgreSQL的转储工具, 通过再老版本转储后再新版本导入的方式可以实现数据升级; - pg_upgrade
pg_upgrade是PostgreSQL的专用升级工具, 大量数据的情况下比pg_dumpall性能更好; 本文使用此工具进行操作;
pg_upgrade 工具需要传入至少 4 个参数:
pg_upgrade -b oldbindir [-B newbindir] -d oldconfigdir -D newconfigdir [option...]
-b: 旧版本的PostgreSQL程序文件目录;-B: 新版本的PostgreSQL程序文件目录;-d: 旧版本的数据目录;-D: 新版本的数据目录;
升级数据
服务器本身跑的 docker, 直接使用一个临时容器来操作:
操作前务必停止正在运行的实例, 否则可能导致数据损坏!!!
启动新版本容器:
docker run -ti --rm -v ~/data/postgres/data/:/var/lib/postgresql/ --entrypoint bash postgres:18.0
~/data/postgres/data/是本地工作数据目录。postgres 18开始修改了docker默认数据目录为/var/lib/postgresql/PG_MAJOR/docker/, 且推荐挂载点为/var/lib/postgresql/。
接下来都在容器中操作即可:
# 配置信息
export PG_MAJOR_OLD=17
export PGBINOLD=/usr/lib/postgresql/$PG_MAJOR_OLD/bin/
export PGBINNEW=/usr/lib/postgresql/$PG_MAJOR/bin/
export PGDATAOLD=/var/lib/postgresql/$PG_MAJOR_OLD/docker/
export PGDATANEW=/var/lib/postgresql/$PG_MAJOR/docker/
# 安装旧版本,pg_upgrade 会用到相关的内容
sed -e 's|deb.debian.org|mirrors.tuna.tsinghua.edu.cn|g' -i.bak /etc/apt/sources.list.d/debian.sources
apt update
apt install -y --no-install-recommends --no-install-suggests postgresql-$PG_MAJOR_OLD
# 准备环境
chmod 0700 -R /var/lib/postgresql/
chown postgres:postgres -R /var/lib/postgresql/
# 先切到 postgres 用户, 并进入有写权限的目录, initdb、pg_upgrade 不允许以 root 用户执行
su postgres
cd ~
# 初始化新版本基础数据
initdb --pgdata=$PGDATANEW
# 同步配置文件
cp -rf $PGDATAOLD/pg_hba.conf $PGDATANEW/pg_hba.conf
# 检查 & 升级
pg_upgrade --check
pg_upgrade
问题记录
pg_upgrade 升级失败,提示旧版本数据没有数据开启校验
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
old cluster does not use data checksums but the new one does
Failure, exiting
解决方法,开启旧版本数据校验:
/usr/lib/postgresql/$PG_MAJOR_OLD/bin/pg_checksums --pgdata=$PGDATAOLD --enable --progress --verbose
也可以通过创建无校验的新数据库来处理:
initdb --pgdata=$PGDATANEW --no-data-checksums
pg_upgrade 升级失败,提示不能连接旧版本数据库
Performing Consistency Checks
-----------------------------
Checking cluster versions ok
*failure*
Consult the last few lines of "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" for
the probable cause of the failure.
connection to server on socket "/var/lib/postgresql/.s.PGSQL.50432" failed: No such file or directory
Is the server running locally and accepting connections on that socket?
could not connect to source postmaster started with the command:
"/usr/lib/postgresql/17/bin/pg_ctl" -w -l "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" -D "/var/lib/postgresql/17/docker" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start
Failure, exiting
进一步分析日志 pg_upgrade_server.log, 提示数据目录权限错误,启动服务失败。
-----------------------------------------------------------------
pg_upgrade run on Thu Oct 9 10:00:17 2025
-----------------------------------------------------------------
command: "/usr/lib/postgresql/17/bin/pg_ctl" -w -l "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" -D "/var/lib/postgresql/17/docker" -o "-p 50432 -b -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/var/lib/postgresql'" start >> "/var/lib/postgresql/18/docker/pg_upgrade_output.d/20251009T100017.514/log/pg_upgrade_server.log" 2>&1
waiting for server to start....2025-10-09 10:00:17.886 UTC [1536] FATAL: data directory "/var/lib/postgresql/17/docker" has invalid permissions
2025-10-09 10:00:17.886 UTC [1536] DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).
stopped waiting
pg_ctl: could not start server
Examine the log output.
解决方法,修改数据目录权限为 0700:
chmod 0700 -R /var/lib/postgresql/
升级之后无法连接
2025-10-09 19:16:39.576 CST [35] FATAL: no pg_hba.conf entry for host "192.168.8.12", user "postgres", database "postgres", no encryption
解决方法,同步 pg_hba.conf 配置文件,docker 镜像默认会开放所有连接:
cp -rf $PGDATAOLD/pg_hba.conf $PGDATANEW/pg_hba.conf
docker镜像首次初始化之后会自动加上:host all all all scram-sha-256,因为升级是手动调用的initdb所有没有这个配置。
参考
- https://www.postgresql.org/docs/current/release-18.html#RELEASE-18-MIGRATION
- https://www.postgresql.org/docs/current/release-17.html#RELEASE-17-MIGRATION
- https://www.postgresql.org/docs/current/pgupgrade.html
- https://www.postgresql.org/docs/current/app-pg-dumpall.html
- https://github.com/docker-library/postgres/pull/1259