本节简单介绍了PostgreSQL中的参数work_mem。官方解释如下:

Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to one megabyte (1MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.

work_mem的内存从哪分配
work_mem是session(进程)的私有内存,与Oracle的PGA类似,由进程自行申请和管理.如果超出work_mem的限制,PG会把数据写入到临时文件中,如果OS的内存足够,写临时文件时会缓存到os的page cache中,相当于数据仍然在内存中.

work_mem对排序性能的影响
下面来看看work_mem大小对排序性能的影响.
测试表:


CREATE TABLE test (id serial PRIMARY KEY, random_text text );
\! perl -e '@c=("a".."z","A".."Z",0..9); print join("",map{$c[rand@c]}10..20+rand(40))."\n" for 1..1000000' > /tmp/random_strings
copy test (random_text) FROM '/tmp/random_strings';
analyze test;
[local:/data/run/pg12]:5120 pg12@testdb=# \d test
                               Table "public.test"
   Column    |  Type   | Collation | Nullable |             Default              
-------------+---------+-----------+----------+----------------------------------
 id          | integer |           | not null | nextval('test_id_seq'::regclass)
 random_text | text    |           |          | 
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
[local:/data/run/pg12]:5120 pg12@testdb=# select count(*) from test;
  count  
---------
 1000000
(1 row)
[local:/data/run/pg12]:5120 pg12@testdb=# select * from test limit 5;
 id |                 random_text                  
----+----------------------------------------------
  1 | 82nXOCCqPYxsOCGf3sXHTi51hG720
  2 | wsYU8uZhanrFoPwJneIvqJYcYDAnKrKVo
  3 | mTD4bJr83asYTRCtgdn
  4 | xqrw1QoGouIOa0vlxW9t
  5 | VbWuf4p3jhrsAOoMKQrwrBBPZib7ZMAUA387EhSO1qsU
(5 rows)
[local:/data/run/pg12]:5120 pg12@testdb=#

test表有2个列,其中id为主键,random_text是随机字符串,100w行数据.
work_mem设置为1MB

[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='1MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# 
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100 ORDER BY random_text ASC;
                                                        QUERY PLAN                              
----------------------------------------------------------------------------------------
 Sort  (cost=12.86..13.09 rows=89 width=35) (actual time=0.990..1.056 rows=100 loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 34kB
   ->  Index Scan using test_pkey on test  (cost=0.42..9.98 rows=89 width=35) (actual time=0.051
..0.165 rows=100 loops=1)
         Index Cond: (id <= 100)
 Planning Time: 1.028 ms
 Execution Time: 1.201 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000 ORDER BY random_text ASC;
                                                         QUERY PLAN                             
-----------------------------------------------------------------------------------------
 Sort  (cost=82.38..84.60 rows=887 width=35) (actual time=10.224..10.560 rows=1000 loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 122kB
   ->  Index Scan using test_pkey on test  (cost=0.42..38.95 rows=887 width=35) (actual time=0.0
97..2.090 rows=1000 loops=1)
         Index Cond: (id <= 1000)
 Planning Time: 0.924 ms
 Execution Time: 11.027 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 10000 ORDER BY random_text ASC;
                                                           QUERY PLAN                           
------------------------------------------------------------------------------------------
 Sort  (cost=914.20..936.37 rows=8869 width=35) (actual time=40.895..44.648 rows=10000 loops=1)
   Sort Key: random_text
   Sort Method: external merge  Disk: 448kB
   ->  Index Scan using test_pkey on test  (cost=0.42..332.63 rows=8869 width=35) (actual time=0
.054..7.950 rows=10000 loops=1)
         Index Cond: (id <= 10000)
 Planning Time: 0.501 ms
 Execution Time: 45.357 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100000 ORDER BY random_text ASC;
                                                             QUERY PLAN                         
------------------------------------------------------------------------------------------
 Sort  (cost=17731.80..17985.59 rows=101517 width=35) (actual time=274.599..344.113 rows=100000 
loops=1)
   Sort Key: random_text
   Sort Method: external merge  Disk: 4472kB
   ->  Index Scan using test_pkey on test  (cost=0.42..3731.97 rows=101517 width=35) (actual tim
e=0.072..29.042 rows=100000 loops=1)
         Index Cond: (id <= 100000)
 Planning Time: 0.192 ms
 Execution Time: 348.499 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000000 ORDER BY random_text ASC;
                                                            QUERY PLAN                          
-------------------------------------------------------------------------------------------
 Gather Merge  (cost=76126.17..173355.26 rows=833334 width=35) (actual time=1299.103..2370.246 r
ows=1000000 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=75126.15..76167.81 rows=416667 width=35) (actual time=1291.503..1559.785 rows
=333333 loops=3)
         Sort Key: random_text
         Sort Method: external merge  Disk: 14960kB
         Worker 0:  Sort Method: external merge  Disk: 14976kB
         Worker 1:  Sort Method: external merge  Disk: 14648kB
         ->  Parallel Seq Scan on test  (cost=0.00..13441.33 rows=416667 width=35) (actual time=
0.013..78.030 rows=333333 loops=3)
               Filter: (id <= 1000000)
 Planning Time: 0.205 ms
 Execution Time: 2418.291 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=#

work_mem设置为100MB

[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='100MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# 
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100 ORDER BY random_text ASC;
                                                        QUERY PLAN                              
------------------------------------------------------------------------------------------
 Sort  (cost=12.86..13.09 rows=89 width=35) (actual time=0.623..0.652 rows=100 loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 34kB
   ->  Index Scan using test_pkey on test  (cost=0.42..9.98 rows=89 width=35) (actual time=0.050
..0.163 rows=100 loops=1)
         Index Cond: (id <= 100)
 Planning Time: 1.029 ms
 Execution Time: 0.768 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000 ORDER BY random_text ASC;
                                                         QUERY PLAN                             
-----------------------------------------------------------------------------------------
 Sort  (cost=82.38..84.60 rows=887 width=35) (actual time=8.226..8.516 rows=1000 loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 122kB
   ->  Index Scan using test_pkey on test  (cost=0.42..38.95 rows=887 width=35) (actual time=0.0
97..1.322 rows=1000 loops=1)
         Index Cond: (id <= 1000)
 Planning Time: 0.997 ms
 Execution Time: 8.885 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 10000 ORDER BY random_text ASC;
                                                           QUERY PLAN                           
------------------------------------------------------------------------------------------
 Sort  (cost=914.20..936.37 rows=8869 width=35) (actual time=52.552..53.942 rows=10000 loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 1343kB
   ->  Index Scan using test_pkey on test  (cost=0.42..332.63 rows=8869 width=35) (actual time=0
.054..8.050 rows=10000 loops=1)
         Index Cond: (id <= 10000)
 Planning Time: 0.444 ms
 Execution Time: 55.059 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100000 ORDER BY random_text ASC;
                                                             QUERY PLAN                         
-----------------------------------------------------------------------------------------------
 Sort  (cost=12173.80..12427.59 rows=101517 width=35) (actual time=307.212..318.567 rows=100000 
loops=1)
   Sort Key: random_text
   Sort Method: quicksort  Memory: 12680kB
   ->  Index Scan using test_pkey on test  (cost=0.42..3731.97 rows=101517 width=35) (actual tim
e=0.040..28.441 rows=100000 loops=1)
         Index Cond: (id <= 100000)
 Planning Time: 0.184 ms
 Execution Time: 326.030 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000000 ORDER BY random_text ASC;
                                                      QUERY PLAN                                
------------------------------------------------------------------------------------------
 Sort  (cost=120390.84..122890.84 rows=1000000 width=35) (actual time=4333.238..4862.205 rows=10
00000 loops=1)
   Sort Key: random_text
   Sort Method: external merge  Disk: 44536kB
   ->  Seq Scan on test  (cost=0.00..20733.00 rows=1000000 width=35) (actual time=0.014..191.083
 rows=1000000 loops=1)
         Filter: (id <= 1000000)
 Planning Time: 0.215 ms
 Execution Time: 4909.541 ms
(7 rows)
[local:/data/run/pg12]:5120 pg12@testdb=#

可以看到,在work_mem设置为100MB时,PG会尽可能的在内存中执行排序(排序算法从字面上来看是快速排序算法),但性能比起1MB时并没有非常明显的改进,而且得益于并行算法,在最后一个场景中1MB的性能比起100MB的性能还要好.

work_mem对HashJoin性能的影响
下面来看看work_mem大小对HashJoin性能的影响.
测试表同上,测试脚本:

set work_mem='1MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000;
set work_mem='100MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000;
set work_mem='1MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 10000;
set work_mem='100MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 10000;
set work_mem='1MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;
set work_mem='100MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;
set work_mem='1MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;
set work_mem='100MB';
EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;

1MB vs 100MB

[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='1MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1050.01..15104.46 rows=886 width=39) (actual time=5.191..172.614 rows=999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=50.00..14015.86 rows=369 width=39) (actual time=105.367..160.113 rows=333 loops=3)
         Hash Cond: (b.random_text = a.random_text)
         ->  Parallel Seq Scan on test b  (cost=0.00..12399.67 rows=416667 width=35) (actual time=0.046..60.472 rows=333333 loops=3)
         ->  Hash  (cost=38.93..38.93 rows=886 width=35) (actual time=4.903..4.903 rows=999 loops=3)
               Buckets: 1024  Batches: 1  Memory Usage: 76kB
               ->  Index Scan using test_pkey on test a  (cost=0.42..38.93 rows=886 width=35) (actual time=0.315..2.816 rows=999 loops=3)
                     Index Cond: (id < 1000)
 Planning Time: 1.737 ms
 Execution Time: 173.096 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='100MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1050.01..15104.46 rows=886 width=39) (actual time=1.133..139.035 rows=999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=50.00..14015.86 rows=369 width=39) (actual time=89.747..135.071 rows=333 loops=3)
         Hash Cond: (b.random_text = a.random_text)
         ->  Parallel Seq Scan on test b  (cost=0.00..12399.67 rows=416667 width=35) (actual time=0.010..52.889 rows=333333 loops=3)
         ->  Hash  (cost=38.93..38.93 rows=886 width=35) (actual time=0.806..0.806 rows=999 loops=3)
               Buckets: 1024  Batches: 1  Memory Usage: 76kB
               ->  Index Scan using test_pkey on test a  (cost=0.42..38.93 rows=886 width=35) (actual time=0.032..0.452 rows=999 loops=3)
                     Index Cond: (id < 1000)
 Planning Time: 0.368 ms
 Execution Time: 139.139 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# 
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='1MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 10000;
                                                                  QUERY PLAN                                                                  
----------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1443.47..16329.38 rows=8868 width=39) (actual time=18.109..188.837 rows=9999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=443.47..14442.58 rows=3695 width=39) (actual time=22.259..182.177 rows=3333 loops=3)
         Hash Cond: (b.random_text = a.random_text)
         ->  Parallel Seq Scan on test b  (cost=0.00..12399.67 rows=416667 width=35) (actual time=0.021..61.790 rows=333333 loops=3)
         ->  Hash  (cost=332.62..332.62 rows=8868 width=35) (actual time=21.900..21.900 rows=9999 loops=3)
               Buckets: 16384  Batches: 1  Memory Usage: 801kB
               ->  Index Scan using test_pkey on test a  (cost=0.42..332.62 rows=8868 width=35) (actual time=0.069..12.185 rows=9999 loops=3)
                     Index Cond: (id < 10000)
 Planning Time: 0.786 ms
 Execution Time: 189.854 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='100MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 10000;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1443.47..16329.38 rows=8868 width=39) (actual time=7.854..157.510 rows=9999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Hash Join  (cost=443.47..14442.58 rows=3695 width=39) (actual time=8.019..152.570 rows=3333 loops=3)
         Hash Cond: (b.random_text = a.random_text)
         ->  Parallel Seq Scan on test b  (cost=0.00..12399.67 rows=416667 width=35) (actual time=0.010..55.844 rows=333333 loops=3)
         ->  Hash  (cost=332.62..332.62 rows=8868 width=35) (actual time=7.869..7.869 rows=9999 loops=3)
               Buckets: 16384  Batches: 1  Memory Usage: 801kB
               ->  Index Scan using test_pkey on test a  (cost=0.42..332.62 rows=8868 width=35) (actual time=0.031..4.434 rows=9999 loops=3)
                     Index Cond: (id < 10000)
 Planning Time: 0.410 ms
 Execution Time: 158.160 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='1MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;
                                                                        QUERY PLAN                                                                        
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=4999.50..36132.51 rows=101516 width=39) (actual time=235.147..405.768 rows=99999 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Hash Join  (cost=3999.50..24980.91 rows=42298 width=39) (actual time=222.076..293.543 rows=33333 loops=3)
         Hash Cond: (b.random_text = a.random_text)
         ->  Parallel Seq Scan on test b  (cost=0.00..12399.67 rows=416667 width=35) (actual time=0.013..52.181 rows=333333 loops=3)
         ->  Parallel Hash  (cost=3139.78..3139.78 rows=42298 width=35) (actual time=57.009..57.010 rows=33333 loops=3)
               Buckets: 16384  Batches: 16  Memory Usage: 608kB
               ->  Parallel Index Scan using test_pkey on test a  (cost=0.42..3139.78 rows=42298 width=35) (actual time=0.139..29.482 rows=33333 loops=3)
                     Index Cond: (id < 100000)
 Planning Time: 1.389 ms
 Execution Time: 410.420 ms
(12 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='100MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 100000;
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=5000.90..27999.06 rows=101516 width=39) (actual time=77.269..509.484 rows=99999 loops=1)
   Hash Cond: (b.random_text = a.random_text)
   ->  Seq Scan on test b  (cost=0.00..18233.00 rows=1000000 width=35) (actual time=0.014..129.504 rows=1000000 loops=1)
   ->  Hash  (cost=3731.95..3731.95 rows=101516 width=35) (actual time=77.152..77.152 rows=99999 loops=1)
         Buckets: 131072  Batches: 1  Memory Usage: 7760kB
         ->  Index Scan using test_pkey on test a  (cost=0.42..3731.95 rows=101516 width=35) (actual time=0.031..41.401 rows=99999 loops=1)
               Index Cond: (id < 100000)
 Planning Time: 0.311 ms
 Execution Time: 513.957 ms
(9 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# 
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='1MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000000;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=38546.00..96467.99 rows=999999 width=39) (actual time=483.527..1982.466 rows=999999 loops=1)
   Hash Cond: (a.random_text = b.random_text)
   ->  Seq Scan on test a  (cost=0.00..20733.00 rows=999999 width=35) (actual time=0.051..286.223 rows=999999 loops=1)
         Filter: (id < 1000000)
         Rows Removed by Filter: 1
   ->  Hash  (cost=18233.00..18233.00 rows=1000000 width=35) (actual time=482.952..482.952 rows=1000000 loops=1)
         Buckets: 16384  Batches: 128  Memory Usage: 644kB
         ->  Seq Scan on test b  (cost=0.00..18233.00 rows=1000000 width=35) (actual time=0.042..136.794 rows=1000000 loops=1)
 Planning Time: 1.413 ms
 Execution Time: 2023.608 ms
(10 rows)
[local:/data/run/pg12]:5120 pg12@testdb=# set work_mem='100MB';
SET
[local:/data/run/pg12]:5120 pg12@testdb=# EXPLAIN  analyze  SELECT * FROM test a JOIN test b USING (random_text) WHERE a.id < 1000000;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=30733.00..65215.99 rows=999999 width=39) (actual time=495.932..1368.250 rows=999999 loops=1)
   Hash Cond: (a.random_text = b.random_text)
   ->  Seq Scan on test a  (cost=0.00..20733.00 rows=999999 width=35) (actual time=0.023..204.935 rows=999999 loops=1)
         Filter: (id < 1000000)
         Rows Removed by Filter: 1
   ->  Hash  (cost=18233.00..18233.00 rows=1000000 width=35) (actual time=495.148..495.149 rows=1000000 loops=1)
         Buckets: 1048576  Batches: 1  Memory Usage: 74114kB
         ->  Seq Scan on test b  (cost=0.00..18233.00 rows=1000000 width=35) (actual time=0.011..130.569 rows=1000000 loops=1)
 Planning Time: 0.295 ms
 Execution Time: 1417.372 ms
(10 rows)
[local:/data/run/pg12]:5120 pg12@testdb=#

从日志输出来看,在100MB时Batches数明显比1MB时少很多,表示所有数据都可以放在内存中处理(1个批次即可),执行时间也相对于少20%-30%左右.

增加work_mem会存在边际递减效应,除非内存足够,否则建议逐个测试找到最佳值,同时需考虑连接数对内存的影响.

参考资料
Understanding postgresql.conf : work_mem

PostgreSQL DBA(167) - pgAdmin(config：work_mem)