MongoDB WiredTiger Performance Tuning

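Starting with MongoDB 3.2, WiredTiger is the default storage engine. In versions before 3.2, the storage engine can be selected with a configuration parameter. Current 4.0+ releases also use WiredTiger as the default storage engine.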

storage:
   engine: wiredTiger
   wiredTiger:
      [storage engine parameter settings]

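WiredTiger is a good fit for most workloads, which is why it is MongoDB's default storage engine. It should be the starting point for every new application, unless you need specific features of the in-memory or encrypted storage engines.

Main advantages of the WiredTiger storage engine:

Maximizing the available cache: WiredTiger makes the fullest possible use of available memory as cache to reduce I/O bottlenecks. Two caches are used: the WiredTiger cache and the file system cache. The WiredTiger cache stores uncompressed data and delivers memory-like performance. The operating system's file system cache stores compressed data. When data is not found in the WiredTiger cache, WiredTiger looks for it in the file system cache.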

Version relationship between MongoDB and WiredTiger

MongoDB 6.0    ---  WiredTiger 11.0.1    June 24, 2022
MongoDB 5.0    ---  WiredTiger 10.0.2    November 30, 2021
MongoDB 4.4    ---  WiredTiger 10.0.2    November 30, 2021
MongoDB 4.2    ---  WiredTiger 3.3.0     March 20, 2020

So, judging by the WiredTiger version, MongoDB 4.4 and 5.0 use the same version of the storage engine.

1. oplog: server layer, similar to MySQL's binlog

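2. journal log: storage engine layer; a write-ahead log (WAL), similar to a redo log. It is flushed to disk every 100 ms by default. storage.journal.enabled controls whether journaling is on, and storage.journal.commitIntervalMs sets the journal flush interval (default 100 ms). Clients can also specify a writeConcern of {j: true} on a write to make sure the journal is flushed for that write.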


Starting in MongoDB 4.0, you cannot specify  --nojournal  option or  storage.journal.enabled: false  for replica set members that use the WiredTiger storage engine.



3. checkpoint: runs once every 60 seconds by default; the parameters below can be used to adjust it


The WiredTiger configuration can be changed through a mongod startup option: --wiredTigerEngineConfigString "checkpoint=(wait=10,log_size=2GB)"

mongodb 3.0
2017-08-20T13:51:13.516+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=2G,
session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,
compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
mongodb 3.4
2017-10-20T09:26:55.931+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=30720M,
session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,
archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,
log_size=2GB),statistics_log=(wait=0),
checkpoint = ( periodically checkpoint the database. Enabling the checkpoint server uses a session from the configured session_max. a set of related configuration options defined below.
    log_size wait for this amount of log record bytes to be written to the log between each checkpoint. If non-zero, this value will use a minimum of the log file size. A database can configure both log_size and wait to set an upper bound for checkpoints; setting this value above 0 configures periodic checkpoints. an integer between 0 and 2GB; default  0.
    wait seconds to wait between each checkpoint; setting this value above 0 configures periodic checkpoints. an integer between 0 and 100000; default  0.
)
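
The wiredtiger_open config lines logged above are easier to compare between versions once they are split into keys. A minimal sketch, assuming only the comma and parenthesis layout visible in those log lines; parseWiredTigerConfig is a hypothetical helper written for this post, not WiredTiger's own parser:

```javascript
// Hypothetical helper: splits a wiredtiger_open style config string into an object.
// Bare words (e.g. "create") become true; "key=(...)" groups become nested objects.
function parseWiredTigerConfig(config) {
  const result = {};
  let depth = 0;   // current parenthesis nesting level
  let token = "";  // text of the item being collected

  const flush = () => {
    const item = token.trim();
    token = "";
    if (item === "") return; // ignore the trailing comma seen in the logs
    const eq = item.indexOf("=");
    if (eq === -1) {
      result[item] = true;
      return;
    }
    const key = item.slice(0, eq);
    const value = item.slice(eq + 1);
    result[key] = value.startsWith("(") && value.endsWith(")")
      ? parseWiredTigerConfig(value.slice(1, -1))
      : value;
  };

  for (const ch of config) {
    if (ch === "(") depth += 1;
    if (ch === ")") depth -= 1;
    // Only commas outside parentheses separate top-level items.
    if (ch === "," && depth === 0) flush();
    else token += ch;
  }
  flush();
  return result;
}

const logged = "create,cache_size=2G,eviction=(threads_max=4),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),";
console.log(parseWiredTigerConfig(logged).checkpoint); // { wait: '60', log_size: '2GB' }
```

All values stay as strings (for example '2GB'); converting units is left out on purpose.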


4. Memory

Starting with MongoDB 3.4, the default WiredTiger cache size is the larger of the following two values:

  • 50% of (RAM - 1 GB)
  • 256MB
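
The rule above is simple arithmetic, so it can be checked for a given host. A minimal sketch, assuming RAM is given in bytes; defaultWiredTigerCacheBytes is a hypothetical helper name, not a MongoDB API:

```javascript
// Sketch of the default-size rule quoted above: max(50% of (RAM - 1 GB), 256 MB).
// defaultWiredTigerCacheBytes is a hypothetical helper, not part of MongoDB.
const GiB = 1024 ** 3;
const MiB = 1024 ** 2;

function defaultWiredTigerCacheBytes(totalRamBytes) {
  const halfOfRamMinusOneGiB = 0.5 * (totalRamBytes - 1 * GiB);
  return Math.max(halfOfRamMinusOneGiB, 256 * MiB);
}

console.log(defaultWiredTigerCacheBytes(4 * GiB) / GiB);    // 1.5
console.log(defaultWiredTigerCacheBytes(1.25 * GiB) / MiB); // 256
```

On a 4 GB host the first term wins (1.5 GB); on a very small host the 256 MB floor applies.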


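cacheSizeGB can be adjusted according to the proportion of dirty data. MongoDB uses only about 50% of memory for its cache by default because, much like PostgreSQL, it also relies on the Linux file system cache, so data is effectively cached twice.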

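Two parameters govern cache eviction. eviction_trigger is the cache-usage percentage at which eviction starts. eviction_target is the percentage eviction works down to: as long as usage stays above it, pages keep being evicted. When the system is under heavy write pressure, these parameters can be adjusted as appropriate.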


db.adminCommand( { "setParameter": 1, "wiredTigerEngineRuntimeConfig": "eviction=(threads_min=3,threads_max=6),checkpoint=(wait=120),eviction_trigger=80,eviction_target=50"})
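
Because the config is passed as one string, a typo or an out-of-range value is easy to miss. A minimal sketch that builds the eviction part of the string and checks it against the ranges quoted in this post (1 to 20 threads, 10 to 99 percent, target below trigger); buildEvictionConfig is a hypothetical helper, the threads_min <= threads_max check is an assumption, and the ranges may differ in other WiredTiger versions:

```javascript
// Hypothetical helper: builds a wiredTigerEngineRuntimeConfig string and rejects
// values outside the ranges quoted in this post's copy of the WiredTiger docs.
function buildEvictionConfig({ threadsMin, threadsMax, trigger, target }) {
  const inRange = (v, lo, hi) => Number.isInteger(v) && v >= lo && v <= hi;

  if (!inRange(threadsMin, 1, 20) || !inRange(threadsMax, 1, 20)) {
    throw new RangeError("threads_min and threads_max must be integers between 1 and 20");
  }
  if (threadsMin > threadsMax) {
    // Assumption: a minimum above the maximum is treated as invalid.
    throw new RangeError("threads_min must not exceed threads_max");
  }
  if (!inRange(trigger, 10, 99) || !inRange(target, 10, 99)) {
    throw new RangeError("eviction_trigger and eviction_target must be integers between 10 and 99");
  }
  if (target >= trigger) {
    throw new RangeError("eviction_target must be less than eviction_trigger");
  }
  return `eviction=(threads_min=${threadsMin},threads_max=${threadsMax}),` +
         `eviction_trigger=${trigger},eviction_target=${target}`;
}

const cfg = buildEvictionConfig({ threadsMin: 3, threadsMax: 6, trigger: 80, target: 50 });
console.log(cfg);
// In the mongo shell the string could then be passed on:
// db.adminCommand({ setParameter: 1, wiredTigerEngineRuntimeConfig: cfg })
```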


eviction = ( eviction configuration options. a set of related configuration options defined below.
    threads_max maximum number of threads WiredTiger will start to help evict pages from cache. The number of threads started will vary depending on the current eviction load. an integer between 1 and 20; default  1.
    threads_min minimum number of threads WiredTiger will start to help evict pages from cache. The number of threads currently running will vary depending on the current eviction load. an integer between 1 and 20; default  1.
)

eviction_dirty_target continue evicting until the cache has less dirty memory than the value, as a percentage of the total cache size. Dirty pages will only be evicted if the cache is full enough to trigger eviction. an integer between 10 and 99; default  80.
eviction_target continue evicting until the cache has less total memory than the value, as a percentage of the total cache size. Must be less than  eviction_trigger. an integer between 10 and 99; default  80.
eviction_trigger trigger eviction when the cache is using this much memory, as a percentage of the total cache size. an integer between 10 and 99; default  95.
shared_cache = ( shared cache configuration options. A database should configure either a cache_size or a shared_cache not both. a set of related configuration options defined below.
    chunk the granularity that a shared cache is redistributed. an integer between 1MB and 10TB; default  10MB.
    name name of a cache that is shared between databases. a string; default empty.
    reserve amount of cache this database is guaranteed to have available from the shared cache. This setting is per database. Defaults to the chunk size. an integer; default  0.
    size maximum memory to allocate for the shared cache. Setting this will update the value if one is already set. an integer between 1MB and 10TB; default  500MB.
)

cache_size maximum heap memory to allocate for the cache. A database should configure either a cache_size or a shared_cache not both. an integer between 1MB and 10TB; default  100MB.
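  • wiredTiger cacheSize
    Check the "maximum bytes configured" field in db.serverStatus().wiredTiger.cache; it is the server's current cache size. To change it dynamically without a restart, run
    db.adminCommand({setParameter:1, wiredTigerEngineRuntimeConfig:'cache_size=600M'}). This setting is lost on restart. It can also be changed in the mongod configuration file, which takes effect permanently after a restart.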
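
When deciding whether to resize the cache, it helps to see usage as a percentage of the configured maximum. A minimal sketch, assuming the object passed in is db.serverStatus().wiredTiger.cache with its values already converted to plain numbers; cacheUsage is a hypothetical helper and the sample figures are made up:

```javascript
// Hypothetical helper: summarises db.serverStatus().wiredTiger.cache
// as percentages of the configured cache size.
function cacheUsage(cacheStats) {
  const max = cacheStats["maximum bytes configured"];
  const used = cacheStats["bytes currently in the cache"];
  const dirty = cacheStats["tracked dirty bytes in the cache"];
  return {
    maxBytes: max,
    usedPct: (100 * used) / max,
    dirtyPct: (100 * dirty) / max,
  };
}

// Hand-written sample numbers, not real server output.
const sampleCacheStats = {
  "maximum bytes configured": 1000,
  "bytes currently in the cache": 800,
  "tracked dirty bytes in the cache": 50,
};
console.log(cacheUsage(sampleCacheStats)); // { maxBytes: 1000, usedPct: 80, dirtyPct: 5 }
```

The two percentages can be read against eviction_trigger / eviction_target and eviction_dirty_target respectively.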

5. Concurrency

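  • wiredTigerConcurrentReadTransactions
    Maximum number of concurrent read transactions; the default is 128. It can be raised under heavy load, but it is usually left alone.
    db.adminCommand( { setParameter: 1, wiredTigerConcurrentReadTransactions: <num> } )
  • wiredTigerConcurrentWriteTransactions
    Maximum number of concurrent write transactions; the default is 128.

    Current values can be checked with: db.serverStatus().wiredTiger.concurrentTransactions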
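
Whether the ticket limits are actually being hit can be read from that output. A minimal sketch, assuming the document has read and write sub-documents with out, available and totalTickets fields holding plain numbers; ticketUsage is a hypothetical helper and the sample figures are made up:

```javascript
// Hypothetical helper: reports how many read/write tickets are in use, given
// the object from db.serverStatus().wiredTiger.concurrentTransactions.
function ticketUsage(concurrentTransactions) {
  const usage = {};
  for (const kind of ["read", "write"]) {
    const { out, totalTickets } = concurrentTransactions[kind];
    usage[kind] = { out, totalTickets, usedPct: (100 * out) / totalTickets };
  }
  return usage;
}

// Hand-written sample numbers, not real server output.
const sampleTickets = {
  read: { out: 32, available: 96, totalTickets: 128 },
  write: { out: 128, available: 0, totalTickets: 128 },
};
const ticketReport = ticketUsage(sampleTickets);
console.log(ticketReport.read.usedPct);  // 25
console.log(ticketReport.write.usedPct); // 100
```

A side that sits near 100% for long periods suggests operations are queuing for tickets, which is the situation where raising the limit is worth considering.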

6. Memory release

  • tcmalloc in WiredTiger

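WiredTiger uses tcmalloc as its memory allocator. tcmalloc caches the memory it has obtained inside the process, in effect maintaining an internal memory pool, so it does not have to request memory from the operating system every time and the overhead is low. The problem is that the pace at which tcmalloc releases memory is hard to control: the free buffer can grow very large without being returned to the operating system, which can lead to an OOM and mongod being killed.

tcmalloc memory usage can be viewed with db.serverStatus().tcmalloc.

tcmalloc provides a release-rate setting, tcmallocReleaseRate, to tune how fast cached memory is released. Reasonable values range from 0 to 10; 0 means memory is never released, and the default is 1. Bigger is not always better: the faster memory is released, the less tcmalloc differs from plain malloc. If possible, set a middle value and then observe performance.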

mongo shell command: db.adminCommand({setParameter:1, tcmallocReleaseRate:4})
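
To judge whether the release rate needs changing, one option is to look at how much of the heap tcmalloc is holding as free pages. A minimal sketch, assuming db.serverStatus().tcmalloc has generic.current_allocated_bytes, generic.heap_size and tcmalloc.pageheap_free_bytes fields already converted to plain numbers; tcmallocSummary is a hypothetical helper and the sample figures are made up:

```javascript
// Hypothetical helper: summarises db.serverStatus().tcmalloc to show how much
// heap is sitting in tcmalloc's page heap instead of being in use.
function tcmallocSummary(stats) {
  const allocated = stats.generic.current_allocated_bytes;
  const heap = stats.generic.heap_size;
  const pageheapFree = stats.tcmalloc.pageheap_free_bytes;
  return {
    allocatedBytes: allocated,
    heapBytes: heap,
    pageheapFreeBytes: pageheapFree,
    pageheapFreePctOfHeap: (100 * pageheapFree) / heap,
  };
}

// Hand-written sample numbers, not real server output.
const sampleTcmalloc = {
  generic: { current_allocated_bytes: 3000, heap_size: 4000 },
  tcmalloc: { pageheap_free_bytes: 500 },
};
console.log(tcmallocSummary(sampleTcmalloc).pageheapFreePctOfHeap); // 12.5
```

A percentage that keeps climbing while allocated bytes stay flat is the pattern described above, where freed memory is cached rather than returned to the operating system.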

7. The flowControl mechanism in replica sets

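By default, writes to a MongoDB replica set go to a majority of nodes (1 + half the number of nodes, rounded down; with 3 nodes, majority = 2). In a P (primary) -- S (secondary) -- A (arbiter) deployment, if either P or S goes down, or the secondary falls far behind the primary because of performance problems, writes reach fewer than 2 nodes. The new primary then triggers the flow control mechanism, which throttles writes while waiting for the secondary to catch up. To let MongoDB recover its write throughput first, the following parameters can be adjusted.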
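
The majority arithmetic for the P-S-A case can be worked through directly. A minimal sketch, assuming majority is simply 1 + floor(n / 2) over all members and that the arbiter holds no data; both function names are hypothetical:

```javascript
// Sketch of the majority arithmetic described above: 1 + floor(n / 2).
// Both helpers are hypothetical and ignore details such as non-voting members.
function majority(memberCount) {
  if (!Number.isInteger(memberCount) || memberCount < 1) {
    throw new RangeError("memberCount must be a positive integer");
  }
  return Math.floor(memberCount / 2) + 1;
}

// True when enough healthy data-bearing nodes remain to acknowledge a majority write.
function canSatisfyMajority(memberCount, healthyDataBearingNodes) {
  return healthyDataBearingNodes >= majority(memberCount);
}

console.log(majority(3));              // 2
console.log(canSatisfyMajority(3, 2)); // true  (P and S both healthy)
console.log(canSatisfyMajority(3, 1)); // false (S down, the arbiter holds no data)
```

This is why a P-S-A set cannot acknowledge majority writes with one data-bearing node down, while a three-node set with three data-bearing members still can.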

db.adminCommand({setParameter:1, flowControlMinTicketsPerSecond:10000}) // raise the number of tickets in flow control's leaky bucket

db.adminCommand({setParameter:1, enableFlowControl: false}) // if that is not enough, turn flow control off entirely

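About w:majority. In theory the server's default write concern is majority (check with rs.conf()), so why were clients not made to wait? When a client explicitly sets w:majority, its requests hang completely for as long as w:majority cannot be satisfied. The relevant part of rs.conf():
"getLastErrorDefaults" : {
        "w" : 1,
        "wtimeout" : 0
    }

"writeConcernMajorityJournalDefault" : true means that if a client writes with majority but does not set j, the write still waits for the journal by default.
getLastErrorDefaults gives the default number of acknowledgements for a write and the wait timeout. With "w" : 1 here, a write that does not ask for majority is acknowledged by the primary alone.

{ w: <value>, j: <boolean>, wtimeout: <number> } // When a client specifies w: majority together with a timeout, a timeout error only means a majority was not reached in time. The data has still been written and is not rolled back.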

