前言
PostgreSQL事务id是通过32位无符号数来表示的:2^32 = 4294967296,当事务id超过40亿以后就会有溢出的风险,为了防止事务回卷,PostgreSQL通过vacuum freeze的方式来循环利用这些事务号,如果vacuum freeze 不及时或者遇到故障,数据库会给相关报警提示,甚至关闭数据库。
看到这个,我们就可以试着模拟,尽快让xmin达到vacuum_freeze_min_age,然后触发vacuum freeze,但是我们又想办法不让vacuum freeze真正执行成功。
实际验证
记录一下基本环境配置:
RELS7 + PostgreSQL 14.4
/proc/cpuinfo:
Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz 4Core memory: 8G
1、几个参数默认值
mydb=# show autovacuum_freeze_max_age ;
autovacuum_freeze_max_age
---------------------------
200000000
(1 row)
mydb=# show vacuum_freeze_min_age;
vacuum_freeze_min_age
-----------------------
50000000
(1 row)
mydb=# show vacuum_freeze_table_age ;
vacuum_freeze_table_age
-------------------------
150000000
(1 row)
mydb=# select name, setting from pg_settings where name ~ 'vacuum_freeze';
name | setting
---------------------------+-----------
autovacuum_freeze_max_age | 200000000
vacuum_freeze_min_age | 50000000
vacuum_freeze_table_age | 150000000
(3 rows)
上边几个参数都用的默认值。因为磁盘空间有限,将归档关闭
vi postgresql.conf
archive_mode = off
2、准备简单的表及数据
create table vacuum_test(id int);
create table vacuum_test_1(id int);
insert into vacuum_test select 1;
insert into vacuum_test_1 select 1;
顺带简单查一下相关的age及relfrozenxid
select xmin,age(xmin),* from vacuum_test;
select relname,relfilenode,relfrozenxid from pg_class where relname like 'vacuum_test%';
select oid,datname,datfrozenxid, age(datfrozenxid) from pg_database where datname='mydb';
mydb=# select xmin,age(xmin),* from vacuum_test;
xmin | age | id
------+-----+----
739 | 2 | 1
(1 row)
mydb=# select relname,relfilenode,relfrozenxid from pg_class where relname like 'vacuum_test%';
relname | relfilenode | relfrozenxid
---------------+-------------+--------------
vacuum_test | 16386 | 737
vacuum_test_1 | 16389 | 738
(2 rows)
mydb=# select oid,datname,datfrozenxid, age(datfrozenxid) from pg_database where datname='mydb';
oid | datname | datfrozenxid | age
-------+---------+--------------+-----
16384 | mydb | 726 | 15
(1 row)
3、disable avacuum freeze
嗯,这里使用到了一个技巧, 当我们把其中一个表的文件移走以后,执行vacuum freeze会失败,也就无效了,age继续往前猛涨。无效或失败时的日志大概是这个样子的:
2023-04-15 22:24:57.725 UTC [5638] DETAIL: Multiple failures --- write error might be permanent.
2023-04-15 22:24:58.761 UTC [5638] ERROR: could not open file "base/16384/32777": No such file or directory
2023-04-15 22:24:58.761 UTC [5638] CONTEXT: writing block 0 of relation base/16384/32777
2023-04-15 22:24:58.761 UTC [5638] WARNING: could not write block 0 of base/16384/32777
2023-04-15 22:24:58.761 UTC [5638] DETAIL: Multiple failures --- write error might be permanent.
2023-04-15 22:24:59.810 UTC [5638] ERROR: could not open file "base/16384/32777": No such file or directory
2023-04-15 22:24:59.810 UTC [5638] CONTEXT: writing block 0 of relation base/16384/32777
移出文件的操作如下:
mydb=# select a.setting || '/' || b.pg_relation_filepath from (select setting from pg_settings where name='data_directory') as a, (select pg_relation_filepath('vacuum_test')) as b;
?column?
-----------------------------------------
/var/lib/pgsql/14/data/base/16384/16386
(1 row)
\! mv -v /var/lib/pgsql/14/data/base/16384/16386 /var/lib/pgsql/14/data/base/16384/16386.old
4、加压不断生成新的XID
script.sql
begin;
select pg_current_xact_id();
-- 或者 select txid_current();
end;
然后执行:
nohup pgbench -c 60 -j 60 -n -f script.sql -T 18000 mydb &
transaction type: script.sql
scaling factor: 1
query mode: simple
number of clients: 60
number of threads: 60
duration: 18000 s
number of transactions actually processed: 0
pgbench: fatal: Run was aborted; the above results are incomplete.
pgbench (14.7)
[01:15:22-postgres@sean-rh1:/var/lib/pgsql]$ ps -ef | grep pgbench
postgres 3739 18994 99 01:04 pts/0 00:13:00 pgbench -c 60 -j 60 -n -f script.sql -T 18000 mydb
5、间接观测结果
与此同时,在另一个会话窗口,我们可以定期使用SQL脚本观测一下相关age值。
check.sql, 内容如下:
insert into vacuum_test_1 select 1;
select xmin,age(xmin),ctid,* from vacuum_test_1 limit 1;
select relname,relfilenode,relfrozenxid from pg_class where relname like 'vacuum_test%';
select oid,datname,datfrozenxid,age(datfrozenxid) from pg_database;
使用命令:
watch -n 5 "psql mydb < check.sql"
相当于每5秒钟执行一次。
Every 5.0s: psql mydb < check.sql Sun Apr 16 01:15:58 2023
INSERT 0 1
xmin | age | ctid | id
------+----------+-------+----
740 | 39743791 | (0,1) | 1
(1 row)
relname | relfilenode | relfrozenxid
---------------+-------------+--------------
vacuum_test | 16386 | 737
vacuum_test_1 | 16389 | 738
(2 rows)
oid | datname | datfrozenxid | age
-------+-----------+--------------+----------
14486 | postgres | 726 | 39744962
1 | template1 | 726 | 39744962
14485 | template0 | 726 | 39744962
16384 | mydb | 726 | 39744962
(4 rows)
我们也能看到13分钟的压力下,可以达到age: 39743791. 如果要达到20亿,差不到要13 * 50 = 650分钟。
继续调整参数跑:
nohup pgbench -c 60 -j 60 -n -f script.sql -T 54000 mydb &
在跑了近3上小时(2小时45分钟)以后:
[03:39:48-postgres@sean-rh1:/var/lib/pgsql]$ ps -ef | grep pgbench
postgres 5274 18994 99 01:19 pts/0 02:51:53 pgbench -c 60 -j 60 -n -f script.sql -T 54000 mydb
Every 5.0s: psql mydb < check.sql Sun Apr 16 03:40:04 2023
INSERT 0 1
xmin | age | ctid | id
------+-----------+-------+----
740 | 550387062 | (0,1) | 1
(1 row)
relname | relfilenode | relfrozenxid
---------------+-------------+--------------
vacuum_test | 16386 | 737
vacuum_test_1 | 16389 | 455547776
(2 rows)
oid | datname | datfrozenxid | age
-------+-----------+--------------+-----------
14486 | postgres | 150655352 | 399733650
1 | template1 | 202440850 | 347948152
14485 | template0 | 204187319 | 346201683
16384 | mydb | 737 | 550388265
(4 rows)
age达到5.5亿,在1.5亿的时候,其实就会强制进行vacuum操作,数据库里头有大量的错误消息:
2023-04-16 01:19:30.496 UTC [3837] LOG: unexpected EOF on client connection with an open transaction
2023-04-16 02:02:53.340 UTC [9835] ERROR: could not open file "base/16384/16386": No such file or directory
2023-04-16 02:02:53.340 UTC [9835] CONTEXT: automatic vacuum of table "mydb.public.vacuum_test"
2023-04-16 02:03:08.385 UTC [9860] ERROR: could not open file "base/16384/16386": No such file or directory
2023-04-16 02:03:08.385 UTC [9860] CONTEXT: automatic vacuum of table "mydb.public.vacuum_test"
2023-04-16 02:03:23.385 UTC [9881] ERROR: could not open file "base/16384/16386": No such file or directory
2023-04-16 02:03:23.385 UTC [9881] CONTEXT: automatic vacuum of table "mydb.public.vacuum_test"
我们对一下时间:02:02:53报的第一次,pgbench大概是:03:39 - 02:51 = 00: 48 时发生的。就是说大概经历了02:02 - 00:48 = 74分钟的时间,就会发生vacuum freeze强制操作。
监测到后来:
Every 5.0s: psql mydb < check.sql Sun Apr 16 11:05:07 2023
WARNING: database "mydb" must be vacuumed within 19399914 transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
INSERT 0 1
xmin | age | ctid | id
------+------------+-------+----
740 | 2128084416 | (0,1) | 1
(1 row)
relname | relfilenode | relfrozenxid
---------------+-------------+--------------
vacuum_test | 16386 | 737
vacuum_test_1 | 16389 | 1960501278
(2 rows)
oid | datname | datfrozenxid | age
-------+-----------+--------------+------------
14486 | postgres | 150655352 | 1977430842
1 | template1 | 202440850 | 1925645344
14485 | template0 | 204187319 | 1923898875
16384 | mydb | 737 | 2128085457
(4 rows)
-- 这个时候已经跑了差不多11个小时了:
[11:05:59-postgres@sean-rh1:/var/lib/pgsql/14/data/log]$ ps -ef | grep pgbench
postgres 5274 18994 99 01:19 pts/0 11:35:36 pgbench -c 60 -j 60 -n -f script.sql -T 54000 mydb
紧接着有这样的server log:
2023-04-16 11:44:31.889 UTC [1521] ERROR: could not open file "base/16384/16386": No such file or directory
2023-04-16 11:44:31.889 UTC [1521] CONTEXT: automatic vacuum of table "mydb.public.vacuum_test"
2023-04-16 11:44:31.891 UTC [1521] WARNING: database "mydb" must be vacuumed within 2999945 transactions
2023-04-16 11:44:31.891 UTC [1521] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
2023-04-16 11:44:34.906 UTC [1528] ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb"
2023-04-16 11:44:34.906 UTC [1528] HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
2023-04-16 11:44:34.906 UTC [1528] STATEMENT: insert into vacuum_test_1 select 1;
2023-04-16 11:44:39.917 UTC [1538] ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb"
2023-04-16 11:44:39.917 UTC [1538] HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
2023-04-16 11:44:39.917 UTC [1538] STATEMENT: insert into vacuum_test_1 select 1;
2023-04-16 11:44:44.961 UTC [1546] ERROR: database is not accepting commands to avoid wraparound data loss in database "mydb"
2023-04-16 11:44:44.961 UTC [1546] HINT: Stop the postmaster and vacuum that database in single-user mode.
而且再次查看log文件时,大的吓人:
[11:44:56-postgres@sean-rh1:/var/lib/pgsql/14/data/log]$ ls -la
total 11990072
drwx------ 2 postgres postgres 4096 Apr 16 11:46 .
drwx------ 20 postgres postgres 4096 Apr 16 11:44 ..
-rw------- 1 postgres postgres 12277819696 Apr 16 11:44 postgresql-2023-04-16.log
6、强行vacuum恢复
log里头已经提示了:Stop the postmaster and vacuum that database in single-user mode
postgres --single -D /data/pgdata mydb
2023-04-16 11:49:18.594 UTC [1810] WARNING: database with OID 16384 must be vacuumed within 2999945 transactions
2023-04-16 11:49:18.594 UTC [1810] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
PostgreSQL stand-alone backend 14.7
backend> vacuum;
2023-04-16 11:49:35.030 UTC [1810] ERROR: could not open file "base/16384/16386": No such file or directory
2023-04-16 11:49:35.030 UTC [1810] STATEMENT: vacuum;
backend> \! mv /var/lib/pgsql/14/data/base/16384/16386.old /var/lib/pgsql/14/data/base/16384/16386
2023-04-16 11:50:08.574 UTC [1810] ERROR: syntax error at or near "\" at character 1
2023-04-16 11:50:08.574 UTC [1810] STATEMENT: \! mv /var/lib/pgsql/14/data/base/16384/16386.old /var/lib/pgsql/14/data/base/16384/16386
backend> vacuum;
2023-04-16 11:50:33.285 UTC [1810] WARNING: bypassing nonessential maintenance of table "mydb.public.vacuum_test" as a failsafe after 0 index scans
2023-04-16 11:50:33.285 UTC [1810] DETAIL: The table's relfrozenxid or relminmxid is too far in the past.
2023-04-16 11:50:33.285 UTC [1810] HINT: Consider increasing configuration parameter "maintenance_work_mem" or "autovacuum_work_mem".
You might also need to consider other ways for VACUUM to keep up with the allocation of transaction IDs.
照着上边出错的,接着往下执行:
-- 恢复原来的文件
mv /var/lib/pgsql/14/data/base/16384/16386.old /var/lib/pgsql/14/data/base/16384/16386
backend> alter system set maintenance_work_mem=2048;
backend> vacuum;
backend> vacuum verbose;s
在这之后,恢复正常。
[12:22:32-postgres@sean-rh1:/var/lib/pgsql/14/data/log]$ psql mydb
psql (14.7)
Type "help" for help.
mydb=# \dt+
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+---------------+-------+----------+-------------+---------------+--------+-------------
public | vacuum_test | table | postgres | permanent | heap | 40 kB |
public | vacuum_test_1 | table | postgres | permanent | heap | 264 kB |
(2 rows)
mydb=# select count(*) from vacuum_test_1
mydb-# ;
count
-------
6392
(1 row)
-- 顺便把超大的log文件给移走
postgres=# \! ls -la
total 11990224
drwx------ 2 postgres postgres 4096 Apr 16 11:46 .
drwx------ 20 postgres postgres 4096 Apr 16 12:24 ..
-rw------- 1 postgres postgres 12277976039 Apr 16 12:23 postgresql-2023-04-16.log
postgres=# \! mv postgresql-2023-04-16.log postgresql-2023-04-16.log.old
postgres=# select pg_rotate_logfile();
pg_rotate_logfile
-------------------
t
(1 row)
postgres=# \! ls -la
total 11990224
drwx------ 2 postgres postgres 4096 Apr 16 12:24 .
drwx------ 20 postgres postgres 4096 Apr 16 12:24 ..
-rw------- 1 postgres postgres 0 Apr 16 12:24 postgresql-2023-04-16.log
-rw------- 1 postgres postgres 12277976039 Apr 16 12:23 postgresql-2023-04-16.log.old
7、曾经遇到的别的错:
可惜在跑了近3个小时的时候,log里头显示:
cp: error writing ‘/pgccc/archive/data/000000050000000400000047’: No space left on device
cp: failed to extend ‘/pgccc/archive/data/000000050000000400000047’: No space left on device
2023-04-16 00:33:14.768 UTC [5642] LOG: archive command failed with exit code 1
2023-04-16 00:33:14.768 UTC [5642] DETAIL: The failed archive command was: cp pg_wal/000000050000000400000047 /pgccc/archive/data/000000050000000400000047
2023-04-16 00:33:15.061 UTC [5638] PANIC: could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device
2023-04-16 00:33:15.064 UTC [5635] LOG: checkpointer process (PID 5638) was terminated by signal 6: Aborted
2023-04-16 00:33:15.064 UTC [5635] LOG: terminating any other active server processes
2023-04-16 00:33:15.067 UTC [5643] LOG: could not write temporary statistics file "pg_stat/db_14486.tmp": No space left on device
2023-04-16 00:33:15.069 UTC [5643] LOG: could not write temporary statistics file "pg_stat/db_14485.tmp": No space left on device
2023-04-16 00:33:15.069 UTC [5643] LOG: could not close temporary statistics file "pg_stat/db_16384.tmp": No space left on device
2023-04-16 00:33:15.077 UTC [5635] LOG: all server processes terminated; reinitializing
2023-04-16 00:33:15.099 UTC [878] LOG: database system was interrupted; last known up at 2023-04-14 13:24:25 UTC
2023-04-16 00:33:15.198 UTC [878] LOG: database system was not properly shut down; automatic recovery in progress
2023-04-16 00:33:15.201 UTC [878] LOG: redo starts at 0/1203CCF8
2023-04-16 00:33:18.033 UTC [884] FATAL: the database system is in recovery mode
2023-04-16 00:33:18.147 UTC [885] FATAL: the database system is in recovery mode
经检查发现:
archive开启消耗了大量的空间。把磁盘空间耗光了。
参考
[1] PostgreSQL 中 xid 与 txid
https://www.modb.pro/db/377530
[2] 复现HINT: To avoid a database shutdown,execute a database-wide vacuum in that database
:https://www.modb.pro/db/31736