
Zhang Wenning: Stack ports (IRF ports) fail to come UP after a stack reboot on S6520X at a customer site

1676540277,

Network topology and description

/

Alarm information

/

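Problem description

During network switchover testing on site on September 7, the two S6520X stacks APP-DSHLW-SW were rebooted to force a switchover, after which the stack ports failed to come UP. Later, in the early morning of November 20, a reproduction was attempted on the stack XX-B05-N05-OMS-ASW-6520; rebooting the standby device reproduced the stack-port-not-UP problem.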

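Analysis

The fault reproduced on November 20 was analyzed. This stack uses ports 49 and 50 as stack ports; after the standby member slot 2 was rebooted, port 50 failed to come UP.

1. First confirm the symptom: stack port 50 is not UP.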

  ===============display irf link=============== 

Member 1

 IRF Port  Interface                             Status

 1         disable                               --   

 2         FortyGigE1/0/49                       UP   

           FortyGigE1/0/50                       DOWN 

Member 2

 IRF Port  Interface                             Status

 1         FortyGigE2/0/49                       UP   

           FortyGigE2/0/50                       DOWN 

 2         disable                               --   

=========================================================
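The check in step 1 can be automated when many stacks have to be inspected. The following is a minimal sketch that assumes the `display irf link` output has the column layout shown above; the function name `down_irf_links` is illustrative and not part of any H3C tooling.

```python
import re

# Sample copied from the `display irf link` output above (blank lines removed).
SAMPLE = """\
Member 1
 IRF Port  Interface                             Status
 1         disable                               --
 2         FortyGigE1/0/49                       UP
           FortyGigE1/0/50                       DOWN
Member 2
 IRF Port  Interface                             Status
 1         FortyGigE2/0/49                       UP
           FortyGigE2/0/50                       DOWN
 2         disable                               --
"""


def down_irf_links(text):
    """Return (member, irf_port, interface) for every IRF physical link that is DOWN."""
    member = irf_port = None
    down = []
    for line in text.splitlines():
        m = re.match(r"\s*Member\s+(\d+)", line)
        if m:
            member, irf_port = int(m.group(1)), None
            continue
        # Data rows are "[irf-port] interface status"; continuation rows
        # omit the IRF port number and inherit it from the row above.
        m = re.match(r"\s*(\d+)?\s*(\S+)\s+(UP|DOWN|--)\s*$", line)
        if not m:
            continue  # header or separator line
        if m.group(1):
            irf_port = int(m.group(1))
        if m.group(3) == "DOWN":
            down.append((member, irf_port, m.group(2)))
    return down


print(down_irf_links(SAMPLE))
# [(1, 2, 'FortyGigE1/0/50'), (2, 1, 'FortyGigE2/0/50')]
```

Both ends of the same cable are reported, which matches the symptom: the link between 1/0/50 and 2/0/50 is down while the 49 link stays up.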

2. Check the low-level link state: port 50 is indeed physically down, but the low-level speed and mode configuration shows nothing abnormal.

  ===============phy info 1 xq=============== 

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]  

 hg1 (20)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

 hg2 (16)  -   -    ena/  -  down/ -   40000/  -   FD/-   N/- FWD    CR4/ -

============================================================

  ===============phy info 2 xq=============== 

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]  

 hg1 (20)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

 hg2 (16)  -   -    ena/  -  down/ -   40000/  -   FD/-   N/- FWD    CR4/ -
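As a rough sketch, the `phy info` table above can be reduced to the ports whose link column reads `down`. This assumes the row layout shown in the capture (port name, chip port number in parentheses, then slash-separated columns); `down_ports` is an illustrative name only.

```python
import re

# Two data rows plus the header, copied from the `phy info 1 xq` output above.
SAMPLE = """\
     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]
 hg1 (20)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -
 hg2 (16)  -   -    ena/  -  down/ -   40000/  -   FD/-   N/- FWD    CR4/ -
"""

# Port name, chip port number, then the first "up/" or "down/" token on the row.
ROW = re.compile(r"^\s*(?P<port>\S+)\s+\((?P<phy>\d+)\).*?\b(?P<link>up|down)/")


def down_ports(text):
    """Return (port_name, chip_port_number) for rows whose link state is down."""
    result = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m and m.group("link") == "down":
            result.append((m.group("port"), int(m.group("phy"))))
    return result


print(down_ports(SAMPLE))  # [('hg2', 16)]
```

In this capture `hg2 (16)` is the chip port behind panel port 50, which is the same port number that appears as `uiPhyPortNo[16]` in the serdes dumps.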

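3. Check the low-level serdes information: SignalDetect is FALSE on 1/0/50 (the side that was not rebooted), while 2/0/50 (the rebooted side) looks normal.

[XX-B05-N05-OMS-ASW-6520-probe]serdes read 1 0 50 0 // serdes information on the side that was not rebooted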

… …

*******************************************************

***********Port Parameter********uiPhyPortNo[16]********

 SignalDetect  FALSE

 AutoNegEnable  TRUE

 AutoNegBypassEnable  TRUE

 apEnable  TRUE

 ForceLinkDown  NOT SUPPORT

 ForcePeerDown  FALSE

 ForceLinkPass  FALSE

 SerdesTX  TRUE

 FecStatus(0-FEC/1-Disable/2-RS-FEC) 1

 RemFault NOT SUPPORT

 LocFault NOT SUPPORT

 invertTx TRUE, invertRx TRUE

[XX-B05-N05-OMS-ASW-6520-probe]serdes read 2 0 50 0 // serdes information on the rebooted side

      ... …

***********Port Parameter********uiPhyPortNo[16]********

 SignalDetect  TRUE

 AutoNegEnable  TRUE

 AutoNegBypassEnable  TRUE

 apEnable  TRUE

 ForceLinkDown  NOT SUPPORT

 ForcePeerDown  FALSE

 ForceLinkPass  FALSE

 SerdesTX  FALSE

 FecStatus(0-FEC/1-Disable/2-RS-FEC) 0

 RemFault NOT SUPPORT

 LocFault NOT SUPPORT

 invertTx TRUE, invertRx TRUE

[XX-B05-N05-OMS-ASW-6520-probe]
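Comparing the two dumps by eye is error-prone, so a small diff helps isolate the fields that actually differ. The sketch below assumes the `name value` layout of the `serdes read` output above; `parse_serdes` and `diff_params` are illustrative helpers, not device commands.

```python
# Parameter blocks copied from the two `serdes read` dumps above.
NOT_REBOOTED = """\
 SignalDetect  FALSE
 AutoNegEnable  TRUE
 AutoNegBypassEnable  TRUE
 apEnable  TRUE
 ForceLinkDown  NOT SUPPORT
 ForcePeerDown  FALSE
 ForceLinkPass  FALSE
 SerdesTX  TRUE
 FecStatus(0-FEC/1-Disable/2-RS-FEC) 1
 RemFault NOT SUPPORT
 LocFault NOT SUPPORT
 invertTx TRUE, invertRx TRUE
"""

REBOOTED = """\
 SignalDetect  TRUE
 AutoNegEnable  TRUE
 AutoNegBypassEnable  TRUE
 apEnable  TRUE
 ForceLinkDown  NOT SUPPORT
 ForcePeerDown  FALSE
 ForceLinkPass  FALSE
 SerdesTX  FALSE
 FecStatus(0-FEC/1-Disable/2-RS-FEC) 0
 RemFault NOT SUPPORT
 LocFault NOT SUPPORT
 invertTx TRUE, invertRx TRUE
"""


def parse_serdes(text):
    """Parse 'name value' lines into a dict; banner lines starting with '*' are skipped."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("*"):
            continue
        # "invertTx TRUE, invertRx TRUE" carries two pairs on one line.
        for part in line.split(","):
            key, _, value = part.strip().partition(" ")
            params[key] = value.strip()
    return params


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every field that differs."""
    return {k: (a[k], b.get(k)) for k in a if a[k] != b.get(k)}


for name, values in diff_params(parse_serdes(NOT_REBOOTED), parse_serdes(REBOOTED)).items():
    print(name, values)
```

For this capture only `SignalDetect`, `SerdesTX` and `FecStatus` differ between the two sides; every other field is identical.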

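4. 40G cables come up through AP negotiation. The negotiation session records of 1/0/50 and 2/0/50 show that 2/0/50 keeps trying to negotiate but never succeeds, while 1/0/50 has no negotiation-related log entries at all, although such entries are normally present. The negotiation anomaly is therefore judged to be on the 1/0/50 side.

input:10 1 3 cpssDxChPortApDebugInfoGet 1 16 // AP negotiation information for 2/0/50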

 Addr :0x70e78e90

Port Control Realtime Log

=========================

Num of log entries  415

Current entry index 1418

04739060: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable)
04739066: AP SM   Port 16, status 6 Resolution In Progress, O3(Resolution)
04743860: AP SM   Port 16, status 11 AP Resolution Timer Failure, O3(Resolution)
04743862: AP SM   Port 16, status 11 AP Resolution Timer Failure, O1(Init)
04743869: AP SM   Port 16, status 2 Init Success, O1(Init)
04743871: AP SM   Port 16, status 3 Tx Disable In Progress, O2(Tx Disable)
04743931: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable)
04743931: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable)
04743937: AP SM   Port 16, status 6 Resolution In Progress, O3(Resolution)
04748731: AP SM   Port 16, status 11 AP Resolution Timer Failure, O3(Resolution)
04748733: AP SM   Port 16, status 11 AP Resolution Timer Failure, O1(Init)
04748758: AP SM   Port 16, status 2 Init Success, O1(Init)
04748760: AP SM   Port 16, status 3 Tx Disable In Progress, O2(Tx Disable)
04748820: AP SM  

return value is: 00000000

 

10 1 3 cpssDxChPortApDebugInfoGet 0 16 // negotiation information for 1/0/50 (abnormal side)

 Addr :0x70e78e90

Port Control Realtime Log

=========================

Num of log entries  004

Current entry index 179

-258261135: Super   Port 16, AP, AP Status Msg Sent to execute, O2(Low Pri Msg)

-258261133: Super   Port 16, AP, AP Cfg Get Msg Sent to execute, O2(Low Pri Msg)

 

return value is: 00000000
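The retry loop on 2/0/50 can be made explicit by measuring how long each resolution attempt lasts before its timer expires. The sketch below works on an excerpt of the 2/0/50 log above and assumes the entry format shown there; the unit of the leading timestamp is not stated in the source, so the result is reported in raw ticks. `resolution_timeouts` is an illustrative name.

```python
import re

# Excerpt of the AP state machine log for 2/0/50, copied from the capture above.
LOG = (
    "04739060: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable) "
    "04739066: AP SM   Port 16, status 6 Resolution In Progress, O3(Resolution) "
    "04743860: AP SM   Port 16, status 11 AP Resolution Timer Failure, O3(Resolution) "
    "04743862: AP SM   Port 16, status 11 AP Resolution Timer Failure, O1(Init) "
    "04743869: AP SM   Port 16, status 2 Init Success, O1(Init) "
    "04743871: AP SM   Port 16, status 3 Tx Disable In Progress, O2(Tx Disable) "
    "04743931: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable) "
    "04743931: AP SM   Port 16, status 5 Tx Disable Success, O2(Tx Disable) "
    "04743937: AP SM   Port 16, status 6 Resolution In Progress, O3(Resolution) "
    "04748731: AP SM   Port 16, status 11 AP Resolution Timer Failure, O3(Resolution) "
)

# timestamp, port, status code, status text, state name
ENTRY = re.compile(r"(\d+): AP SM\s+Port (\d+), status (\d+) ([^,]+), O\d+\(([^)]*)\)")


def resolution_timeouts(log):
    """Ticks between each 'Resolution In Progress' (status 6) and the
    'AP Resolution Timer Failure' (status 11) that ends that attempt."""
    started = None
    durations = []
    for ts, _port, status, _text, state in ENTRY.findall(log):
        ts, status = int(ts), int(status)
        if status == 6:
            started = ts
        elif status == 11 and state == "Resolution" and started is not None:
            durations.append(ts - started)
            started = None
    return durations


print(resolution_timeouts(LOG))  # [4794, 4794]
```

Every attempt ends after the same number of ticks, which is what a fixed resolution timer expiring without a response from the peer would look like. This is consistent with 1/0/50 never taking part in the negotiation.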

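5. To further verify the conclusion that negotiation is abnormal on 1/0/50, run shutdown followed by undo shutdown on that port, which resets the negotiation. After the reset the port came UP normally.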

[XX-B05-N05-OMS-ASW-6520-FortyGigE1/0/50]dis this

#

interface FortyGigE1/0/50

 description Link-irf-to-XX-B05-N05-OMS-ASW-6520-02:FG2/0/50

 shutdown

#

return

[XX-B05-N05-OMS-ASW-6520-FortyGigE1/0/50]undo shut

[XX-B05-N05-OMS-ASW-6520-FortyGigE1/0/50]

[XX-B05-N05-OMS-ASW-6520-FortyGigE1/0/50]pro

[XX-B05-N05-OMS-ASW-6520-probe]phy info 1 xq

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]  

 hg1 (20)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

 hg2 (16)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

[XX-B05-N05-OMS-ASW-6520-probe]phy info 2 xq

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]  

 hg1 (20)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

 hg2 (16)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -

[XX-B05-N05-OMS-ASW-6520-probe]

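6. Analysis of the code logic shows that when the AP state machine is in an abnormal state, a change in the serdes signal cannot trigger an AP state machine transition, which leaves AP negotiation stuck. To verify this conclusion, a lab test was run: with the AP state machine in an abnormal state, the reproduced symptom matches the one seen on site.

[XX-B05-N05-OMS-ASW-6520-probe]phy info 1 xq         // after slot 2 reboots, negotiation on 1/0/49 is abnormal and the port cannot come up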

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]     

 hg1 (20)  -   -    ena/  -  down/ -   40000/  -   FD/-   N/- FWD    CR4/ -    

 hg2 (16)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -    

[XX-B05-N05-OMS-ASW-6520-probe]phy info 2 xq                                   

     port addr name ena[i/e]  link[i/e] speed[i/e] du[i/e] an STP itf[i/e]     

 hg1 (20)  -   -    ena/  -  down/ -   40000/  -   FD/-   N/- FWD    CR4/ -     

 hg2 (16)  -   -    ena/  -    up/ -   40000/  -   FD/-   N/- FWD    CR4/ -    

[XX-B05-N05-OMS-ASW-6520-probe]                                                

[XX-B05-N05-OMS-ASW-6520-probe]serdes read 1 0 49 0

... ...                                                                     

***********Port Parameter********uiPhyPortNo[20]********                                                                           

 SignalDetect  FALSE                            // SignalDetect is FALSE on the abnormal side, consistent with the site

 AutoNegEnable  TRUE                                                                                                               

 AutoNegBypassEnable  TRUE                                                                                                          

 apEnable  TRUE                                                                                                                    

 ForceLinkDown  NOT SUPPORT                                                                                                         

 ForcePeerDown  FALSE                                                                                                              

 ForceLinkPass  FALSE                                                                                                               

 SerdesTX  TRUE                                                                                                                    

 FecStatus(0-FEC/1-Disable/2-RS-FEC) 1                                                                                              

 RemFault NOT SUPPORT                                                                                                              

 LocFault NOT SUPPORT                                                                                                               

 invertTx TRUE, invertRx TRUE                                                                                                      

[XX-B05-N05-OMS-ASW-6520-probe]serdes read 2 0 49 0                                                                                                                

... ...                                                                                                                                   

***********Port Parameter********uiPhyPortNo[20]********                                                                           

 SignalDetect  TRUE                        // SignalDetect is TRUE on the rebooted side, consistent with the site

 AutoNegEnable  TRUE                                                                                                               

 AutoNegBypassEnable  TRUE                                                                                                         

 apEnable  TRUE                                                                                                                    

 ForceLinkDown  NOT SUPPORT                                                                                                         

 ForcePeerDown  FALSE                                                                                                              

 ForceLinkPass  FALSE                                                                                                              

 SerdesTX  FALSE                                                                                                                    

 FecStatus(0-FEC/1-Disable/2-RS-FEC) 0                                                                                             

 RemFault NOT SUPPORT                                                                                                               

 LocFault NOT SUPPORT                                                                                                              

 invertTx TRUE, invertRx TRUE                                                                                                       

[XX-B05-N05-OMS-ASW-6520-probe]              

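Solution

The analysis above confirms the following:

1. When S6520X devices use 40G cables, which come up through AP negotiation, an abnormal AP state machine transition prevents the stack ports from coming UP. This will be fixed in a later software version.

2. 40G optical modules come UP through RX training (RxTraining) and do not go through the AP negotiation flow, so using 40G optical modules is a temporary workaround.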

Source: Zhiliao Community (知了社区), licensed under the Creative Commons Attribution-ShareAlike 3.0 China Mainland license.