前言:
最近,我们的某个系统应用中出现一种比较奇怪的现象,某个表中有一个sequence字段,当然,同时也有一个涉及createtime的时间戳字段。在并发环境下,发现后来插入的sequence值比先前插入的sequence值还要小。
该系统应用是一个Java服务程序,同时它也是PostgreSQL的客户端。实际上这跟后端的DB类型无关,换别的数据库,同样的架构依然会有类似的问题。
问题不在于这个“乱序”本身,而是该应用严重依赖于sequence前后的大小顺序,它实际的要求是后来插入的sequence值不能小于前边插入的值。
分析与实作:
整个应用系统架构,大概就是Cloud Foundry下边的一个Java应用服务,但是它跑的是一个多实例,相当于多个相同的Java进程同时运行,提供服务。由前边的router -> HAProxy进行统一调度。
CF (Cloud Foundry)有一个重要的header: x-cf-app-instance。当你显式指定:x-cf-app-instance: {app_id}:index时,内置的gorouter 会自动将请求发到对应的Java进程上。index为进程的序号。
如:
1x-cf-app-instance: b7f26d3a-db87-xxxx-bc90-6a12f662d239:0
2x-cf-app-instance: b7f26d3a-db87-xxxx-bc90-6a12f662d239:1
3x-cf-app-instance: b7f26d3a-db87-xxxx-bc90-6a12f662d239:2
可是这些进程之间是无法知道对方状态的。也就是说如果前边的请求本身也是无状态的,那这种并发,就有可能发生时间上的乱序。尤其是在JPA控制下的情况。
该表相关字段如下:
1@Id
2@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "C_KEY_GEN")
3@Basic(optional = false)
4@Column(name = "CID", insertable = true, updatable = false, nullable = false)
5private long cId;
6
7"create-ml-key-gen-sequence">
8 "MARK_RAN">
9 <not>
10 "C_KEY_GEN" />
11 not>
12
13 "1000" sequenceName="C_KEY_GEN"
14 startValue="1" />
15
当然还有一个相关的字段:created timestamp用于记录时间。上边我们可以看到,它的序列值每次增量是1000.平时这样用的好处就是与DB端的交互少了很多,利用JPA的cache,只有满了1000个增量之后,它才再向DB去请求下一批值。而这也为多实例的并发请求导致乱序埋下祸根。
如果不是JPA的进程内缓存,我们看看:
1postgres=# create sequence t1_seq increment by 1000;
2CREATE SEQUENCE
3postgres=# create table t1(id bigint default nextval('t1_seq'), col2 timestamp);
4CREATE TABLE
5postgres=# create sequence t2_seq increment by 1;
6CREATE SEQUENCE
7postgres=# create table t2(id bigint default nextval('t2_seq'), col2 timestamp);
8CREATE TABLE
这里就打算用t1, t2做这个试验。多个客户端去压一压,看看id与col2的值是否出现乱序。
针对增量为1000的t1
构造一个seq.sql:
1insert into t1 (col2) values (now());
使用pgbench压下看看:
1pgbench -U postgres -c 4 -f seq.sql -T 60
2
3pgbench (14.8)
4starting vacuum...pgbench: error: ERROR: relation "pgbench_branches" does not exist
5pgbench: (ignoring this error and continuing anyway)
6pgbench: error: ERROR: relation "pgbench_tellers" does not exist
7pgbench: (ignoring this error and continuing anyway)
8pgbench: error: ERROR: relation "pgbench_history" does not exist
9pgbench: (ignoring this error and continuing anyway)
10end.
11transaction type: seq.sql
12scaling factor: 1
13query mode: simple
14number of clients: 4
15number of threads: 1
16duration: 60 s
17number of transactions actually processed: 305297
18latency average = 0.786 ms
19initial connection time = 8.831 ms
20tps = 5088.967969 (without initial connection time)
这一压不得了,很快就能看到效果,先插入的sequence值有可能比后插入的值要大。因为它是分批的。
1postgres=# select count(*) from t1;
2 count
3--------
4 305297
5(1 row)
6
7postgres=# select * from t1 order by col2 limit 20;
8 id | col2
9-------+----------------------------
10 3001 | 2023-07-28 21:22:43.662688
11 1 | 2023-07-28 21:22:43.662697
12 1001 | 2023-07-28 21:22:43.663131
13 2001 | 2023-07-28 21:22:43.663332
14 4001 | 2023-07-28 21:22:43.664167
15 6001 | 2023-07-28 21:22:43.664578
16 5001 | 2023-07-28 21:22:43.664582
17 7001 | 2023-07-28 21:22:43.664652
18 8001 | 2023-07-28 21:22:43.66486
19 10001 | 2023-07-28 21:22:43.665212
20 9001 | 2023-07-28 21:22:43.665214
21 11001 | 2023-07-28 21:22:43.665264
22 12001 | 2023-07-28 21:22:43.665597
23 13001 | 2023-07-28 21:22:43.665974
24 14001 | 2023-07-28 21:22:43.665979
25 15001 | 2023-07-28 21:22:43.666029
26 16001 | 2023-07-28 21:22:43.666275
27 17001 | 2023-07-28 21:22:43.66665
28 18001 | 2023-07-28 21:22:43.666655
29 19001 | 2023-07-28 21:22:43.666713
30(20 rows)
31
观察下上边的id列值,并不是严格顺序排列的。
针对增量为1的t2
构造一个seq.sql:
1insert into t2 (col2) values (now());
使用pgbench, 4个并发压一下看看:
1pgbench -U postgres -c 4 -f seq.sql -T 60
2
3transaction type: seq.sql
4scaling factor: 1
5query mode: simple
6number of clients: 4
7number of threads: 1
8duration: 60 s
9number of transactions actually processed: 284814
10latency average = 0.843 ms
11initial connection time = 10.665 ms
12tps = 4747.670943 (without initial connection time)
再看看相关值的顺序:
1postgres=# select count(*) from t2;
2 count
3-------
4 60681
5(1 row)
6
7postgres=# select * from t2 order by col2 limit 50;
8 id | col2
9----+----------------------------
10 1 | 2023-07-28 21:29:47.274116
11 4 | 2023-07-28 21:29:47.274126
12 2 | 2023-07-28 21:29:47.274353
13 3 | 2023-07-28 21:29:47.274907
14 5 | 2023-07-28 21:29:47.275764
15 6 | 2023-07-28 21:29:47.27667
16 8 | 2023-07-28 21:29:47.276698
17 7 | 2023-07-28 21:29:47.276711
18 9 | 2023-07-28 21:29:47.276796
19 10 | 2023-07-28 21:29:47.277199
20 12 | 2023-07-28 21:29:47.277669
21 11 | 2023-07-28 21:29:47.277671
22 13 | 2023-07-28 21:29:47.277741
23 14 | 2023-07-28 21:29:47.278068
24 16 | 2023-07-28 21:29:47.278474
25 15 | 2023-07-28 21:29:47.278475
26 17 | 2023-07-28 21:29:47.278545
27 18 | 2023-07-28 21:29:47.278821
28 19 | 2023-07-28 21:29:47.2792
29 20 | 2023-07-28 21:29:47.2792
30 21 | 2023-07-28 21:29:47.27927
31 22 | 2023-07-28 21:29:47.279591
32 23 | 2023-07-28 21:29:47.279976
33 24 | 2023-07-28 21:29:47.279979
34 25 | 2023-07-28 21:29:47.280039
35 26 | 2023-07-28 21:29:47.280384
36 28 | 2023-07-28 21:29:47.280781
37 27 | 2023-07-28 21:29:47.280781
38 29 | 2023-07-28 21:29:47.280832
39 30 | 2023-07-28 21:29:47.281136
40 32 | 2023-07-28 21:29:47.281505
41 31 | 2023-07-28 21:29:47.281506
42 33 | 2023-07-28 21:29:47.281563
43 34 | 2023-07-28 21:29:47.28184
44 35 | 2023-07-28 21:29:47.282204
45 36 | 2023-07-28 21:29:47.282207
46 37 | 2023-07-28 21:29:47.282273
47 38 | 2023-07-28 21:29:47.28255
48 39 | 2023-07-28 21:29:47.283935
49 40 | 2023-07-28 21:29:47.283939
50 41 | 2023-07-28 21:29:47.284025
51 42 | 2023-07-28 21:29:47.284367
52 43 | 2023-07-28 21:29:47.284697
53 44 | 2023-07-28 21:29:47.284779
54 45 | 2023-07-28 21:29:47.284862
55 46 | 2023-07-28 21:29:47.28513
56 47 | 2023-07-28 21:29:47.28555
57 48 | 2023-07-28 21:29:47.285561
58 49 | 2023-07-28 21:29:47.285642
59 50 | 2023-07-28 21:29:47.28592
60(50 rows)
61
看着上边的还都是顺序生成的。再反向推一把,看最后一个id值的col2是不是比所有的都大。
1postgres=# select max(id) from t2;
2 max
3--------
4 284814
5(1 row)
6
7postgres=# select count(id) from t2 where col2 < (select max(col2) from t2);
8 count
9--------
10 284813
11(1 row)
我们再深入比较一下,再仔细比对一下,即每相邻的两个值,看看是不是严格的同向增大,同向变小。
1create index col2_idx_t2 on t2(col2);
2
3select a.id, (a.col2 - b.col2) as diff from t2 a, t2 b where a.id=b.id+1 and a.col2 < b.col2;
4
5 id | diff
6--------+------------------
7 244 | -00:00:00.000003
8 264 | -00:00:00.000013
9 370 | -00:00:00.000002
10 830 | -00:00:00.000003
11 838 | -00:00:00.000005
12 866 | -00:00:00.000004
13 926 | -00:00:00.000004
14 1164 | -00:00:00.000002
15 1252 | -00:00:00.000001
16................
哦,似乎看到了些怪现象:
1postgres=# select * from t2 where id between 243 and 265 order by col2;
2 id | col2
3-----+----------------------------
4 244 | 2023-07-28 21:29:47.324876
5 243 | 2023-07-28 21:29:47.324879
6 245 | 2023-07-28 21:29:47.324972
7 246 | 2023-07-28 21:29:47.325245
8 247 | 2023-07-28 21:29:47.325675
9 248 | 2023-07-28 21:29:47.325679
10 249 | 2023-07-28 21:29:47.325753
11 250 | 2023-07-28 21:29:47.326006
12 251 | 2023-07-28 21:29:47.326372
13 252 | 2023-07-28 21:29:47.326374
14 253 | 2023-07-28 21:29:47.326439
15 254 | 2023-07-28 21:29:47.326703
16 255 | 2023-07-28 21:29:47.327107
17 256 | 2023-07-28 21:29:47.32711
18 257 | 2023-07-28 21:29:47.32718
19 258 | 2023-07-28 21:29:47.327535
20 260 | 2023-07-28 21:29:47.327856
21 259 | 2023-07-28 21:29:47.327859
22 261 | 2023-07-28 21:29:47.327951
23 262 | 2023-07-28 21:29:47.328213
24 264 | 2023-07-28 21:29:47.32865
25 263 | 2023-07-28 21:29:47.328663
26 265 | 2023-07-28 21:29:47.328747
27(23 rows)
我们仍然能看到id值不完全按时间排列的现象。244排在243的前头. 260排在259的前头了。其实后边还有很多。虽然这种比例很小,但依然存在。
t2的变体
这次不用默认值,直接在SQL中显式调用:nextval('t2_seq')
1create table t2(id bigint, col2 timestamp);
2
3seq.sql:
4insert into t2 (id, col2) values (nextval('t2_seq'), now());
5
6[22:00:04-postgres@sean-rh1:/var/lib/pgsql]$ pgbench -U postgres -c 4 -f seq.sql -T 60
7pgbench (14.8)
8starting vacuum...pgbench: error: ERROR: relation "pgbench_branches" does not exist
9pgbench: (ignoring this error and continuing anyway)
10pgbench: error: ERROR: relation "pgbench_tellers" does not exist
11pgbench: (ignoring this error and continuing anyway)
12pgbench: error: ERROR: relation "pgbench_history" does not exist
13pgbench: (ignoring this error and continuing anyway)
14end.
15transaction type: seq.sql
16scaling factor: 1
17query mode: simple
18number of clients: 4
19number of threads: 1
20duration: 60 s
21number of transactions actually processed: 286483
22latency average = 0.838 ms
23initial connection time = 11.066 ms
24tps = 4775.527472 (without initial connection time)
再验证下看看:
1postgres=# select a.id, (a.col2 - b.col2) as diff from t2 a, t2 b where a.id=b.id+1 and a.col2 < b.col2 order by a.id limit 30;
2 id | diff
3-----+------------------
4 3 | -00:00:00.000068
5 21 | -00:00:00.000001
6 27 | -00:00:00.000002
7 35 | -00:00:00.000002
8 79 | -00:00:00.000003
9 85 | -00:00:00.000003
10 91 | -00:00:00.000001
11 99 | -00:00:00.000001
12 131 | -00:00:00.000002
13 159 | -00:00:00.000002
14 163 | -00:00:00.000001
15 167 | -00:00:00.000002
16 169 | -00:00:00.000002
17 171 | -00:00:00.000003
18 173 | -00:00:00.000001
19 183 | -00:00:00.000003
20 185 | -00:00:00.000002
21 205 | -00:00:00.000005
22 217 | -00:00:00.000002
23 220 | -00:00:00.000002
24 229 | -00:00:00.000002
25 232 | -00:00:00.000003
26 233 | -00:00:00.000001
27 236 | -00:00:00.000004
28 244 | -00:00:00.000002
29 249 | -00:00:00.000003
30 253 | -00:00:00.000002
31 272 | -00:00:00.000002
32 277 | -00:00:00.000004
33 281 | -00:00:00.000005
34(30 rows)
35
此现象仍然存在着:
1postgres=# select * from t2 where id between 20 and 35 order by col2;
2 id | col2
3----+----------------------------
4 21 | 2023-07-28 22:00:32.727989
5 20 | 2023-07-28 22:00:32.72799
6 22 | 2023-07-28 22:00:32.728422
7 23 | 2023-07-28 22:00:32.728425
8 24 | 2023-07-28 22:00:32.728818
9 25 | 2023-07-28 22:00:32.728824
10 27 | 2023-07-28 22:00:32.729197
11 26 | 2023-07-28 22:00:32.729199
12 28 | 2023-07-28 22:00:32.729652
13 29 | 2023-07-28 22:00:32.72966
14 30 | 2023-07-28 22:00:32.73023
15 31 | 2023-07-28 22:00:32.730238
16 32 | 2023-07-28 22:00:32.730796
17 33 | 2023-07-28 22:00:32.7308
18 35 | 2023-07-28 22:00:32.731178
19 34 | 2023-07-28 22:00:32.73118
20(16 rows)
21
小结:
完全依靠sequence的大小来决定时间戳的顺序或者反过来,在多个实例(客户端)并发访问的情况下,是不现实的。业务逻辑最好不要建立这种依赖关系。只把它作为一个非业务主键就好。这种应用,出了问题,也是非常难以排查。
虽然标题列成PostgreSQL数据库,事实上,不出所料的话,这种现象在其它数据库当中应该也普遍存在。