延迟消息在业务场景中使用的非常多,订单失效,过期通知等功能都可以借助延迟消息机制来实现。本文将从源码层面来分析Rocketmq的延迟消息实现原理机制。
一、延迟消息的使用
先看下延迟消息的使用,发送消息逻辑和普通消息一样,只要在生产者端将Message对象中设置延迟消息的等级,Rocketmq的开源版本支持18个等级,每个等级代表一个延迟时间。

Rocketmq有18个延迟等级分别对应如下延迟时间,最小1s,最大2h,客户端从等级从1开始,上面设置为2,代表延迟5s。
private String messageDelayLevel = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";二、broker端原理解析
实现原理分析,broker端分析,在Broker服务启动的时候,会初始化延迟消息的配置解析和对延迟消息处理的定时任务的开启。

brocker启动时,根据不同等级开启定时器,即分别对每个延迟等级都开启一个定时器,rocketmq是基于java.util.Timer的定时器。

源码中单独开启了一个定时任务来持久化对延迟等级和offset的关系,为什么需要持久化延迟等级和offset的关系呢?offset是每次写入消息都会实时更新,下次直接在新的offset写消息即可,持久化可以保证brocker重启后offset不丢失,实现高可用机制,并且高性能(避免重新计算)。
下面broker接收到延迟消息的处理逻辑。会将生产者提供的真实的topic和队列进行备份暂存,然后将消息存储到内置的一个延迟消息topic队列中。
public PutMessageResult putMessage(final MessageExtBrokerInner msg) {// Set the storage timemsg.setStoreTimestamp(System.currentTimeMillis());// Set the message body BODY CRC (consider the most appropriate setting// on the client)msg.setBodyCRC(UtilAll.crc32(msg.getBody()));// Back to ResultsAppendMessageResult result = null;StoreStatsService storeStatsService = this.defaultMessageStore.getStoreStatsService();String topic = msg.getTopic();int queueId = msg.getQueueId();final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());if (tranType == MessageSysFlag.TRANSACTION_NOT_TYPE|| tranType == MessageSysFlag.TRANSACTION_COMMIT_TYPE) {// Delay Delivery 处理延迟消息发送if (msg.getDelayTimeLevel() > 0) {if (msg.getDelayTimeLevel() > this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel()) {msg.setDelayTimeLevel(this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel());}//专门用来处理延迟消息的topic SCHEDULE_TOPIC_XXXXtopic = ScheduleMessageService.SCHEDULE_TOPIC;//队列id为 默认延迟等级-1queueId = ScheduleMessageService.delayLevel2QueueId(msg.getDelayTimeLevel());// Backup real topic, queueId 备份原始topic和队列MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, msg.getTopic());MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_QUEUE_ID, String.valueOf(msg.getQueueId()));msg.setPropertiesString(MessageDecoder.messageProperties2String(msg.getProperties()));//设置成内置的延迟消息topic和队列,这个topic消息不会被消费者接收处理msg.setTopic(topic);msg.setQueueId(queueId);}}//...}//获取延迟队列idpublic static int delayLevel2QueueId(final int delayLevel) {return delayLevel - 1;}
定时器检测消息到时间了,将消息存储到真实topic和队列。

延迟消息处理定时任务,每次都将当前延迟等级对应的消息,写入真实的topic和队列,并维护新的offset,然后重新开启定时任务,等下次再重新处理该延迟等级的消息。
class DeliverDelayedMessageTimerTask extends TimerTask {private final int delayLevel;private final long offset;public DeliverDelayedMessageTimerTask(int delayLevel, long offset) {this.delayLevel = delayLevel;this.offset = offset;}public void run() {try {if (isStarted()) {this.executeOnTimeup();}} catch (Exception e) {// XXX: warn and notify melog.error("ScheduleMessageService, executeOnTimeup exception", e);ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel, this.offset), DELAY_FOR_A_PERIOD);}}/*** @return*/private long correctDeliverTimestamp(final long now, final long deliverTimestamp) {long result = deliverTimestamp;long maxTimestamp = now + ScheduleMessageService.this.delayLevelTable.get(this.delayLevel);if (deliverTimestamp > maxTimestamp) {result = now;}return result;}public void executeOnTimeup() {ConsumeQueue cq =ScheduleMessageService.this.defaultMessageStore.findConsumeQueue(SCHEDULE_TOPIC,delayLevel2QueueId(delayLevel));long failScheduleOffset = offset;if (cq != null) {SelectMappedBufferResult bufferCQ = cq.getIndexBuffer(this.offset);if (bufferCQ != null) {try {long nextOffset = offset;int i = 0;ConsumeQueueExt.CqExtUnit cqExtUnit = new ConsumeQueueExt.CqExtUnit();for (; i < bufferCQ.getSize(); i += ConsumeQueue.CQ_STORE_UNIT_SIZE) {long offsetPy = bufferCQ.getByteBuffer().getLong();int sizePy = bufferCQ.getByteBuffer().getInt();long tagsCode = bufferCQ.getByteBuffer().getLong();if (cq.isExtAddr(tagsCode)) {if (cq.getExt(tagsCode, cqExtUnit)) {tagsCode = cqExtUnit.getTagsCode();} else {//can't find ext content.So re compute tags code.log.error("[BUG] can't find consume queue extend file content!addr={}, offsetPy={}, sizePy={}",tagsCode, offsetPy, sizePy);long msgStoreTime = defaultMessageStore.getCommitLog().pickupStoreTimestamp(offsetPy, sizePy);tagsCode = computeDeliverTimestamp(delayLevel, msgStoreTime);}}long now = System.currentTimeMillis();long deliverTimestamp = this.correctDeliverTimestamp(now, tagsCode);nextOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);long countdown = deliverTimestamp - now;if (countdown <= 0) {MessageExt msgExt =ScheduleMessageService.this.defaultMessageStore.lookMessageByOffset(offsetPy, sizePy);if (msgExt != null) {try {//还原消息的topic和队列MessageExtBrokerInner msgInner = this.messageTimeup(msgExt);//将消息写入真实的topic和队列PutMessageResult putMessageResult =ScheduleMessageService.this.writeMessageStore.putMessage(msgInner);if (putMessageResult != null&& putMessageResult.getPutMessageStatus() == PutMessageStatus.PUT_OK) {continue;} else {// XXX: warn and notify melog.error("ScheduleMessageService, a message time up, but reput it failed, topic: {} msgId {}",msgExt.getTopic(), msgExt.getMsgId());//重新开启一个定时任务,10s后重新处理这个延迟等级的消息ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel,nextOffset), DELAY_FOR_A_PERIOD);//延迟消息处理完了,更新新的offsetScheduleMessageService.this.updateOffset(this.delayLevel,nextOffset);return;}} catch (Exception e) {/** XXX: warn and notify me*/log.error("ScheduleMessageService, messageTimeup execute error, drop it. msgExt="+ msgExt + ", nextOffset=" + nextOffset + ",offsetPy="+ offsetPy + ",sizePy=" + sizePy, e);}}} else {ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel, nextOffset),countdown);ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);return;}} // end of fornextOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel, nextOffset), DELAY_FOR_A_WHILE);ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);return;} finally {bufferCQ.release();}} // end of if (bufferCQ != null)else {long cqMinOffset = cq.getMinOffsetInQueue();if (offset < cqMinOffset) {failScheduleOffset = cqMinOffset;log.error("schedule CQ offset invalid. offset=" + offset + ", cqMinOffset="+ cqMinOffset + ", queueId=" + cq.getQueueId());}}} // end of if (cq != null)ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel,failScheduleOffset), DELAY_FOR_A_WHILE);}private MessageExtBrokerInner messageTimeup(MessageExt msgExt) {MessageExtBrokerInner msgInner = new MessageExtBrokerInner();msgInner.setBody(msgExt.getBody());msgInner.setFlag(msgExt.getFlag());MessageAccessor.setProperties(msgInner, msgExt.getProperties());TopicFilterType topicFilterType = MessageExt.parseTopicFilterType(msgInner.getSysFlag());long tagsCodeValue =MessageExtBrokerInner.tagsString2tagsCode(topicFilterType, msgInner.getTags());msgInner.setTagsCode(tagsCodeValue);msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgExt.getProperties()));msgInner.setSysFlag(msgExt.getSysFlag());msgInner.setBornTimestamp(msgExt.getBornTimestamp());msgInner.setBornHost(msgExt.getBornHost());msgInner.setStoreHost(msgExt.getStoreHost());msgInner.setReconsumeTimes(msgExt.getReconsumeTimes());msgInner.setWaitStoreMsgOK(false);MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_DELAY_TIME_LEVEL);msgInner.setTopic(msgInner.getProperty(MessageConst.PROPERTY_REAL_TOPIC));String queueIdStr = msgInner.getProperty(MessageConst.PROPERTY_REAL_QUEUE_ID);int queueId = Integer.parseInt(queueIdStr);msgInner.setQueueId(queueId);return msgInner;}}
当消息写入真实的topic和队列中后,消费者就可以正常的消费到消息了。
总的流程如下:

写到这里,Rocketmq延迟消息实现已经全部清楚了,他的还是比较容易理解的。