1. Rules for changing the parameter:
This system variable should have the same value on all group members. The value of this system variable can be changed while Group Replication is running. The change takes effect on each group member after you stop and restart Group Replication on the member. During this process, the value of the system variable is permitted to differ between group members, but members might be unable to reconnect in the event of a disconnection.
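The member-by-member change described above can be sketched as a sequence of statements to run on each member in turn. This is a minimal illustration rather than a tested procedure: the helper name `rolling_change_statements` and the 2 GB example value are assumptions made here, while the emitted SQL (`SET PERSIST`, `STOP GROUP_REPLICATION`, `START GROUP_REPLICATION`) is standard MySQL 8.0 syntax.

```python
def rolling_change_statements(new_size_bytes: int) -> list[str]:
    """Return the SQL to run on ONE group member.

    Hypothetical helper: repeat the sequence on each member in turn so
    that all members end up with the same value.
    """
    if new_size_bytes <= 0:
        raise ValueError("cache size must be a positive number of bytes")
    return [
        # Persist the new limit so it survives a server restart.
        f"SET PERSIST group_replication_message_cache_size = {new_size_bytes};",
        # Restart Group Replication on this member, as the rule above requires.
        "STOP GROUP_REPLICATION;",
        "START GROUP_REPLICATION;",
        # Confirm the value the member is now running with.
        "SELECT @@GLOBAL.group_replication_message_cache_size;",
    ]


if __name__ == "__main__":
    for statement in rolling_change_statements(2 * 1024**3):
        print(statement)
```

Because values may differ between members while the change is rolled out, it is probably wise to finish the sequence on all members without long pauses.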
2. What the parameter does:
group_replication_message_cache_size sets the maximum amount of memory that is available for the message cache in the group communication engine for Group Replication (XCom). The XCom message cache holds messages (and their metadata) that are exchanged between the group members as a part of the consensus protocol. Among other functions, the message cache is used for recovery of missed messages by members that reconnect with the group after a period where they were unable to communicate with the other group members.
3. Recommendations for setting the parameter:
The group_replication_member_expel_timeout system variable determines the waiting period (up to an hour) that is allowed in addition to the initial 5-second detection period for members to return to the group rather than being expelled. The size of the XCom message cache should be set with reference to the expected volume of messages in this time period, so that it contains all the missed messages required for members to return successfully. Up to MySQL 8.0.20, the default is only the 5-second detection period, but from MySQL 8.0.21, the default is a 5-second waiting period after the 5-second detection period, for a total time period of 10 seconds.
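In MGR the initial detection period is 5 seconds, and this threshold cannot be changed: a member that does not respond for 5 seconds is suspected of having failed. group_replication_member_expel_timeout specifies how much longer the group then waits before expelling the suspected member.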
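The default value of group_replication_member_expel_timeout depends on the server version: 0 seconds up to MySQL 8.0.20, and 5 seconds from MySQL 8.0.21.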
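To make the sizing guidance concrete, the sketch below estimates how many bytes of messages accumulate while a member is unreachable. It is a rough back-of-the-envelope model, not a formula from the MySQL manual: the function name, the `headroom` factor, and the traffic figures in the usage example are all assumptions; only the 5-second detection period and the one-hour upper bound on the expel timeout come from the text above.

```python
DETECTION_PERIOD_S = 5      # fixed initial detection period
MAX_EXPEL_TIMEOUT_S = 3600  # waiting period is "up to an hour"


def estimate_cache_bytes(expel_timeout_s: int,
                         messages_per_s: float,
                         avg_message_bytes: float,
                         headroom: float = 1.5) -> int:
    """Estimate the cache size needed to cover the whole suspicion window.

    messages_per_s, avg_message_bytes and headroom are hypothetical inputs
    that you would measure or choose for your own workload.
    """
    if not 0 <= expel_timeout_s <= MAX_EXPEL_TIMEOUT_S:
        raise ValueError("expel timeout must be between 0 and 3600 seconds")
    # Messages must be retained for detection period + expel timeout.
    window_s = DETECTION_PERIOD_S + expel_timeout_s
    return int(window_s * messages_per_s * avg_message_bytes * headroom)


if __name__ == "__main__":
    # Assumed workload: 2000 messages/s of about 4 KiB each, 30 s expel timeout.
    needed = estimate_cache_bytes(30, 2000, 4096)
    print(f"estimated cache size: {needed} bytes")
```

The estimate grows linearly with the expel timeout, which is why the two variables need to be chosen together.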
Ensure that sufficient memory is available on your system for your chosen cache size limit, considering the size of MySQL Server's other caches and object pools. The default setting is 1073741824 bytes (1 GB). The minimum setting is also 1 GB up to MySQL 8.0.20. From MySQL 8.0.21, the minimum setting is 134217728 bytes (128 MB), which enables deployment on a host that has a restricted amount of available memory, and good network connectivity to minimize the frequency and duration of transient losses of connectivity for group members. Note that the limit set using group_replication_message_cache_size applies only to the data stored in the cache, and the cache structures require an additional 50 MB of memory.
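The memory check described above can be sketched as follows. The constants for the default, the minimum, and the extra 50 MB come from the text; reading "50 MB" as 50 MiB, the function names, and the idea of a per-host memory budget are assumptions made for illustration.

```python
DEFAULT_LIMIT_BYTES = 1_073_741_824       # 1 GB default
MIN_LIMIT_BYTES = 134_217_728             # 128 MB minimum from MySQL 8.0.21
STRUCTURE_OVERHEAD_BYTES = 50 * 1024**2   # extra 50 MB, assumed to mean 50 MiB


def xcom_cache_footprint(limit_bytes: int) -> int:
    """Approximate total memory: cached data limit plus cache structures."""
    if limit_bytes < MIN_LIMIT_BYTES:
        # Up to MySQL 8.0.20 the minimum is 1 GB instead; not modelled here.
        raise ValueError("limit is below the 128 MB minimum")
    return limit_bytes + STRUCTURE_OVERHEAD_BYTES


def fits_in_budget(limit_bytes: int, budget_bytes: int) -> bool:
    """True if the cache plus its structures fit in the memory set aside for it."""
    return xcom_cache_footprint(limit_bytes) <= budget_bytes


if __name__ == "__main__":
    print(xcom_cache_footprint(DEFAULT_LIMIT_BYTES))
    print(fits_in_budget(DEFAULT_LIMIT_BYTES, 1024**3))
```

Note that a budget of exactly 1 GiB is not enough for the default setting, because the structure overhead sits on top of the configured limit.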
The cache size limit can be increased or reduced dynamically at runtime. If you reduce the cache size limit, XCom removes the oldest entries that have been decided and delivered until the current size is below the limit. Group Replication's Group Communication System (GCS) alerts you, by a warning message, when a message that is likely to be needed for recovery by a member that is currently unreachable is removed from the message cache. For more information on tuning the message cache size, see Section 18.7.6, “XCom Cache Management”.
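Summary: if the host has ample memory, increase this parameter; 5 GB or more is suggested, and MGR clusters with heavy traffic need an even larger value. Choose the value together with group_replication_member_expel_timeout.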