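Ke Hui: Prolonged packet loss during IRF switchover on an S6813-48X6 at a customer site
Network Topology and Description
Not applicable
Alarm Information
None
Problem Description
The site runs an IRF fabric. After the user rebooted slot 1 (the IRF master) with the reboot command, packets were lost for about 20 s. The log on slot 1 shows: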
%@157917%Sep 15 17:59:58:896 2022 BJYZ109-T-11-41-IB-S6813-134.Int SHELL/6/SHELL_CMD_CONFIRM: Confirm option of command reboot slot 1 is yes.
GigabitEthernet2/0/30 changed to down.
The log on slot 2 shows:
%@836%Sep 15 18:00:23:399 2022 BJYZ109-T-11-41-IB-S6813-134.Int STM/2/STM_LINK_TIMEOUT: IRF port 1 went down because the heartbeat timed out.
%@837%Sep 15 18:00:23:588 2022 BJYZ109-T-11-41-IB-S6813-134.Int STM/3/STM_LINK_DOWN: IRF port 1 went down.
It took slot 2 about 20 s to detect that the IRF port had gone down.
The device is configured with link-delay 0.
BGP is configured with NSR.
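As a rough cross-check of the delay, the gap between the two logged timestamps can be computed directly. The sketch below uses only the timestamps quoted above; the variable names are illustrative. Note that the first timestamp is the reboot confirmation on slot 1, not the moment the links actually dropped, so the result is an upper bound on the detection delay rather than an exact measurement.

```python
from datetime import datetime

# Timestamp layout used in the log lines above: "Sep 15 17:59:58:896 2022".
# %f accepts the 3-digit millisecond field and reads it as 896000 microseconds.
LOG_TIME_FORMAT = "%b %d %H:%M:%S:%f %Y"

# Copied from the slot 1 SHELL_CMD_CONFIRM and slot 2 STM_LINK_TIMEOUT entries.
reboot_confirmed = datetime.strptime("Sep 15 17:59:58:896 2022", LOG_TIME_FORMAT)
irf_port_timeout = datetime.strptime("Sep 15 18:00:23:399 2022", LOG_TIME_FORMAT)

# Upper bound on how long slot 2 took to declare the IRF port down.
gap_seconds = (irf_port_timeout - reboot_confirmed).total_seconds()
print(f"reboot confirmed -> IRF port timeout: {gap_seconds:.3f} s")
```

The gap of roughly 24.5 s includes the time slot 1 needed to begin shutting down after the confirmation, which is consistent with the observed packet loss of about 20 s.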
irf mac-address persistent timer
irf auto-update enable
irf link-delay 0
irf member 1 priority 16
irf member 2 priority 1
irf member 1 description BJYZ109-T-11-41-IB-S6813-134
irf member 2 description BJYZ109-T-12-41-IB-S6813-135
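Process Analysis
During a master/standby switchover, other ports (for example, the MAD port) go down at almost the same time as the IRF port. Because the message-processing task handles messages strictly in sequence, the IRF port down message may be processed late. This delays the standby member's promotion to master and causes substantial packet loss.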
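The ordering effect described above can be illustrated with a toy model. This is a minimal sketch, not the device's actual implementation: the event names and per-event handling costs are hypothetical and chosen only to show how a message queued behind others in a strictly sequential handler completes later than it would on its own.

```python
from collections import deque

def completion_times(events):
    """Handle (name, cost_seconds) events strictly in FIFO order.

    Returns a dict mapping each event name to the time at which its
    handling finished, measured from the start of the burst.
    """
    queue = deque(events)
    clock = 0.0
    finished = {}
    while queue:
        name, cost = queue.popleft()
        clock += cost          # the single worker is busy for `cost` seconds
        finished[name] = clock
    return finished

# Hypothetical burst: other port-down events arrive just ahead of the IRF one.
burst = [
    ("mad_port_down", 2.0),
    ("other_port_down", 2.0),
    ("irf_port_down", 0.5),
]
# Same IRF event with nothing queued in front of it.
alone = [("irf_port_down", 0.5)]

late = completion_times(burst)["irf_port_down"]
early = completion_times(alone)["irf_port_down"]
print(f"IRF port down handled after {late} s in the burst, {early} s alone")
```

In this model the IRF port down event is delayed by the total cost of everything queued ahead of it, which is the kind of delay the analysis attributes to the slow standby-to-master promotion.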
Solution
Patch R6615P08H01 includes an optimization that resolves this issue.