Posted

Greetings, everyone.

 

I have a Catalyst 6509 with two SUP720-10GE supervisors running in SSO mode.

IOS s72033-adventerprisek9-mz.151-2.SY11

 

One fine day, when I tried to enter configuration mode, I was refused with this message:

Config mode cannot be entered during Standby initialization

 

I looked and saw that one of the supervisors had dropped off.

 

#sh redundancy
Redundant System Information :
------------------------------
       Available system uptime = 6 days, 23 hours, 45 minutes
Switchovers system experienced = 1
              Standby failures = 14
        Last switchover reason = active unit failed

                 Hardware Mode = Duplex
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = sso
              Maintenance Mode = Disabled
                Communications = Up

Current Processor Information :
-------------------------------
               Active Location = slot 6
        Current Software state = ACTIVE
       Uptime in current state = 6 days, 22 hours, 37 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-M), Version 15.1(2)SY11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Fri 21-Jul-17 06:12 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY11.bin,12;sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY10.bin,12;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXJ9.bin,12;
                   CONFIG_FILE =
                       BOOTLDR =
        Configuration register = 0x2102

Peer (slot: 5) information is not available because it is in 'DISABLED' state

 

ds1#sh module
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1101CTTW
  2   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1021P4JY
  3   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1023QBRV
  4   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1021NW0X
  5    0  Supervisor-Other                       Unknown            Unknown
  6    5  Supervisor Engine 720 10GE (Active)    VS-S720-10G        SAL1223T4DA
  7   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1052BYZA
  8   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL09370BH6

Mod MAC addresses                       Hw    Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1  001a.6cbe.a460 to 001a.6cbe.a48f   2.5   12.2(14r)S5  15.1(2)SY11  Ok
  2  0017.e041.1d9c to 0017.e041.1dcb   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  3  0018.1833.8644 to 0018.1833.8673   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  4  0017.0ed4.82e4 to 0017.0ed4.8313   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  5  0000.0000.0000 to 0000.0000.0000   0.0   Unknown      Unknown      Unknown
  6  0019.e8bb.3114 to 0019.e8bb.311b   2.0   8.5(2)       15.1(2)SY11  Ok
  7  001a.2f80.f5c0 to 001a.2f80.f5ef   2.5   12.2(14r)S5  15.1(2)SY11  Ok
  8  0015.6245.f740 to 0015.6245.f76f   2.3   12.2(14r)S5  15.1(2)SY11  Ok

Mod  Sub-Module                  Model              Serial       Hw     Status
---- --------------------------- ------------------ ----------- ------- -------
  1  Centralized Forwarding Card WS-F6700-CFC       SAD103102LM  3.0    Ok
  2  Centralized Forwarding Card WS-F6700-CFC       SAL1019MBDB  2.0    Ok
  3  Centralized Forwarding Card WS-F6700-CFC       SAL1029W0ZC  2.0    Ok
  4  Centralized Forwarding Card WS-F6700-CFC       SAL1017LFEZ  2.0    Ok
  5  Policy Feature Card 3       VS-F6K-PFC3C       SAL12372VEC  1.0    Other
  6  Policy Feature Card 3       VS-F6K-PFC3C       SAL1222S0GS  1.0    Ok
  6  MSFC3 Daughterboard         VS-F6K-MSFC3       SAL1224TW93  1.0    Ok
  7  Centralized Forwarding Card WS-F6700-CFC       SAL10360MGJ  3.0    Ok
  8  Centralized Forwarding Card WS-F6700-CFC       SAL093813ST  2.0    Ok

Mod  Online Diag Status
---- -------------------
  1  Pass
  2  Pass
  3  Pass
  4  Pass
  5  Unknown
  6  Pass
  7  Pass
  8  Pass

 

I tried resetting the module (and later reseating it), but it didn't help: the module reboots in a loop and does not pair with the active supervisor. It reports a timeout. The logs show this:

 

Apr 22 16:52:36 ds1 294872: Apr 22 13:52:34.015: %OIR-SP-3-PWRCYCLE: Card in module 5, is being power-cycled 'Module reset'
Apr 22 17:03:09 ds1 295197: Apr 22 14:03:08.094: %PFREDUN-SP-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode
Apr 22 17:06:25 ds1 295308: Apr 22 14:06:25.094: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO mode
Apr 22 17:06:28 ds1 295310: Apr 22 14:06:28.705: %RF_ISSU-SP-3-RF_MSG_NOT_OK: RF ISSU msg type (101) for client (3) on domain (0) is not ok
Apr 22 17:06:29 ds1 295311: Apr 22 14:06:28.705: %RF-SP-5-SEND_FAIL: RF client progression send failure for reason (RF_BAD_MESSAGE)
Apr 22 17:06:29 ds1 295313: Apr 22 14:06:28.705: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:00:03 to ensure console debugging output.
Apr 22 17:06:30 ds1 295314: Apr 22 14:06:29.840: %PFREDUN-SP-6-ACTIVE: Standby processor removed or reloaded, changing to Simplex mode
Apr 22 17:06:31 ds1 295315: Apr 22 14:06:29.840: %OIR-SP-3-PWRCYCLE: Card in module 5, is being power-cycled 'Module reset'
Apr 22 17:09:43 ds1 295429: Apr 22 14:09:42.858: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO mode
Apr 22 17:19:57 ds1 295777: Apr 22 14:19:57.777: %ONLINE-SP-6-BOOT_TIMER: Module 5, Proc. 0. Failed to bring online because of boot timer event
Apr 22 17:19:58 ds1 295778: sm(cygnus_oir_bay slot5), running yes, state empty
Apr 22 17:19:58 ds1 295779: Last transition recorded: (remove)-> occupied (remove)-> empty (remove)-> empty_clr_persist (remove)-> empty (remove)-> empty_clr_persist (remove)-> empty (insert)-> may_be_occupied (remove)-> empty (remove)-> empty_clr_persist (remove)-> empty
Apr 22 17:19:59 ds1 295781: Apr 22 14:19:57.781: %SYS-SP-3-LOGGER_FLUSHED: System was paused for 00:10:14 to ensure console debugging output.
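
To see how often the standby goes around the reset loop, the syslog lines above can be tallied by facility and mnemonic. This is only a rough sketch: the regular expression is my own guess at the `%FACILITY-SP-SEVERITY-MNEMONIC:` shape seen in these particular lines, and `count_events` and `SAMPLE_LOG` are hypothetical names, not anything from IOS or a library.

```python
import re
from collections import Counter

# Assumed message shape, inferred only from the lines in this thread:
#   %FACILITY[-PROC]-SEVERITY-MNEMONIC: free text
IOS_MSG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?:(?P<proc>[A-Z]+)-)?"
    r"(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+): (?P<text>.*)"
)


def count_events(lines):
    """Count IOS messages per 'FACILITY-MNEMONIC'; other lines are skipped."""
    counts = Counter()
    for line in lines:
        match = IOS_MSG.search(line)
        if match:
            counts[f"{match['facility']}-{match['mnemonic']}"] += 1
    return counts


# A few lines copied from the log excerpt in this post.
SAMPLE_LOG = [
    "Apr 22 16:52:36 ds1 294872: Apr 22 13:52:34.015: %OIR-SP-3-PWRCYCLE: Card in module 5, is being power-cycled 'Module reset'",
    "Apr 22 17:06:25 ds1 295308: Apr 22 14:06:25.094: %PFREDUN-SP-6-ACTIVE: Standby initializing for SSO mode",
    "Apr 22 17:06:28 ds1 295310: Apr 22 14:06:28.705: %RF_ISSU-SP-3-RF_MSG_NOT_OK: RF ISSU msg type (101) for client (3) on domain (0) is not ok",
    "Apr 22 17:06:29 ds1 295311: Apr 22 14:06:28.705: %RF-SP-5-SEND_FAIL: RF client progression send failure for reason (RF_BAD_MESSAGE)",
    "Apr 22 17:06:31 ds1 295315: Apr 22 14:06:29.840: %OIR-SP-3-PWRCYCLE: Card in module 5, is being power-cycled 'Module reset'",
    "Apr 22 17:19:57 ds1 295777: Apr 22 14:19:57.777: %ONLINE-SP-6-BOOT_TIMER: Module 5, Proc. 0. Failed to bring online because of boot timer event",
]

if __name__ == "__main__":
    for event, number in count_events(SAMPLE_LOG).most_common():
        print(f"{number:3d}  {event}")
```

Fed the full syslog file instead of the sample, the `OIR-PWRCYCLE` count gives a quick estimate of how many times module 5 was power-cycled.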

 

A full chassis reboot cured it, but a week later it happened again on the same module.

By the way, this module ran alone in simplex for several years without any problems. About a month ago a second one was added as its pair and IOS was upgraded.

A second chassis runs exactly the same IOS in exactly the same configuration with two VS-S720-10G supervisors and has no such problems.

 

Does anyone know what this could be and how to get it out of this state without rebooting the whole chassis? I suspect that when the module drops off, redundancy gets stuck in SSO and then cannot recover. In theory, SSO should fall back to some other state, shouldn't it?

 

In short, something strange is going on :( Any ideas?

 

 

Posted
6 minutes ago, zhenya` said:

Plug a console into it and read the output. Most likely RIP.

I did plug in a console. It reports a timeout there too and goes into a reboot loop.

What is RIP?

Posted
12 minutes ago, vurd said:

Rest in peace

Ahh :) Well, after a chassis reboot everything works.

What looks suspicious is that after one supervisor drops off, redundancy does not switch to simplex. Why does it stay in SSO? That can't be right.

Maybe that is where the problem lies.

Posted
#show redundancy domain all
Redundant System Information :
------------------------------
       Available system uptime = 1 week, 1 hour, 58 minutes
Switchovers system experienced = 1
              Standby failures = 18
        Last switchover reason = active unit failed

                 Hardware Mode = Duplex
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = sso
              Maintenance Mode = Disabled
                Communications = Up

Current Processor Information :
-------------------------------
               Active Location = slot 6
        Current Software state = ACTIVE
       Uptime in current state = 1 week, 50 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-M), Version 15.1(2)SY11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Fri 21-Jul-17 06:12 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY11.bin,12;sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY10.bin,12;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXJ9.bin,12;
                   CONFIG_FILE =
                       BOOTLDR =
        Configuration register = 0x2102

Peer Processor Information :
----------------------------
              Standby Location = slot 5
        Current Software state = DISABLED
       Uptime in current state = 1 hour, 41 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-M), Version 15.1(2)SY11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Fri 21-Jul-17 06:12 by prod_rel_team
        Configuration register = 0x2102



Redundant System Information (domain# 1):
------------------------------
       Available system uptime = 1 week, 1 hour, 58 minutes
Switchovers system experienced = 0
              Standby failures = 0
        Last switchover reason = none

                 Hardware Mode = Simplex
              Maintenance Mode = Disabled
                Communications = Down      Reason: Failure

Current Processor Information :
-------------------------------
               Active Location = slot 6
        Current Software state = DISABLED
       Uptime in current state = 1 week, 1 hour, 58 minutes
                 Image Version = Cisco IOS Software, s72033_rp Software (s72033_rp-ADVENTERPRISEK9-M), Version 15.1(2)SY11, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Fri 21-Jul-17 06:12 by prod_rel_team
                          BOOT = sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY11.bin,12;sup-bootdisk:s72033-adventerprisek9-mz.151-2.SY10.bin,12;sup-bootdisk:s72033-adventerprisek9_wan-mz.122-33.SXJ9.bin,12;
                   CONFIG_FILE =
                       BOOTLDR =
        Configuration register = 0x2102

Peer (slot: 5) information is not available because it is in 'DISABLED' state

 

 

So why does it report SSO state now, when it does not see the module at all?

 

#show redundancy states
       my state = 13 -ACTIVE
     peer state = 1  -DISABLED
           Mode = Duplex
           Unit = Secondary
        Unit ID = 6

Redundancy Mode (Operational) = sso
Redundancy Mode (Configured)  = sso
Redundancy State              = sso
     Maintenance Mode = Disabled
 Communications = Up

   client count = 148
 client_notification_TMR = 30000 milliseconds
          keep_alive TMR = 9000 milliseconds
        keep_alive count = 0
    keep_alive threshold = 18
           RF debug mask = 0x0
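
The mismatch in the output above (peer DISABLED while the redundancy state still reads sso) can be checked mechanically. A minimal sketch, assuming the output is just `key = value` lines as pasted here; `parse_states` and `sso_without_standby` are hypothetical helper names, and the check encodes only what this thread shows, not any official definition of a healthy state.

```python
def parse_states(output):
    """Turn 'key = value' lines of 'show redundancy states' into a dict."""
    fields = {}
    for line in output.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def sso_without_standby(fields):
    """True when the redundancy state says sso but the peer is DISABLED."""
    state = fields.get("Redundancy State", "").lower()
    peer = fields.get("peer state", "")
    return state == "sso" and "DISABLED" in peer


# Excerpt copied from the 'show redundancy states' output in this post.
SAMPLE_OUTPUT = """\
       my state = 13 -ACTIVE
     peer state = 1  -DISABLED
           Mode = Duplex
           Unit = Secondary
        Unit ID = 6

Redundancy Mode (Operational) = sso
Redundancy Mode (Configured)  = sso
Redundancy State              = sso
"""

if __name__ == "__main__":
    parsed = parse_states(SAMPLE_OUTPUT)
    if sso_without_standby(parsed):
        print("Redundancy State is sso, but peer state is", parsed["peer state"])
```

Run periodically against collected output, a check like this would flag the stuck condition before someone runs into the configuration-mode lock.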

 

Posted
1 minute ago, zhenya` said:

Standby failures = 18

kind of hints that it already has one foot in the grave.

It keeps booting the module in a loop, which is why there are so many failures. The first post showed 14 :) That is how many more it racked up in an hour.

Posted

Any ideas how to get it out of the configuration-lock state right now without rebooting the chassis?

Posted

I rebooted and swapped the modules. We'll see...

#sh modul
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1101CTTW
  2   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1021P4JY
  3   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1023QBRV
  4   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1021NW0X
  5    5  Supervisor Engine 720 10GE (Active)    VS-S720-10G        SAL1223T4DA
  6    5  Supervisor Engine 720 10GE (Hot)       VS-S720-10G        SAL12372PJX
  7   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL1052BYZA
  8   48  CEF720 48 port 10/100/1000mb Ethernet  WS-X6748-GE-TX     SAL09370BH6

Mod MAC addresses                       Hw    Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1  001a.6cbe.a460 to 001a.6cbe.a48f   2.5   12.2(14r)S5  15.1(2)SY11  Ok
  2  0017.e041.1d9c to 0017.e041.1dcb   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  3  0018.1833.8644 to 0018.1833.8673   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  4  0017.0ed4.82e4 to 0017.0ed4.8313   2.3   12.2(14r)S5  15.1(2)SY11  Ok
  5  0019.e8bb.3114 to 0019.e8bb.311b   2.0   8.5(2)       15.1(2)SY11  Ok
  6  001d.45e2.6030 to 001d.45e2.6037   2.0   8.5(2)       15.1(2)SY11  Ok
  7  001a.2f80.f5c0 to 001a.2f80.f5ef   2.5   12.2(14r)S5  15.1(2)SY11  Ok
  8  0015.6245.f740 to 0015.6245.f76f   2.3   12.2(14r)S5  15.1(2)SY11  Ok

Mod  Sub-Module                  Model              Serial       Hw     Status
---- --------------------------- ------------------ ----------- ------- -------
  1  Centralized Forwarding Card WS-F6700-CFC       SAD103102LM  3.0    Ok
  2  Centralized Forwarding Card WS-F6700-CFC       SAL1019MBDB  2.0    Ok
  3  Centralized Forwarding Card WS-F6700-CFC       SAL1029W0ZC  2.0    Ok
  4  Centralized Forwarding Card WS-F6700-CFC       SAL1017LFEZ  2.0    Ok
  5  Policy Feature Card 3       VS-F6K-PFC3C       SAL1222S0GS  1.0    Ok
  5  MSFC3 Daughterboard         VS-F6K-MSFC3       SAL1224TW93  1.0    Ok
  6  Policy Feature Card 3       VS-F6K-PFC3C       SAL12372VEC  1.0    Ok
  6  MSFC3 Daughterboard         VS-F6K-MSFC3       SAL12351G4C  1.0    Ok
  7  Centralized Forwarding Card WS-F6700-CFC       SAL10360MGJ  3.0    Ok
  8  Centralized Forwarding Card WS-F6700-CFC       SAL093813ST  2.0    Ok

Mod  Online Diag Status
---- -------------------
  1  Pass
  2  Pass
  3  Pass
  4  Pass
  5  Pass
  6  Pass
  7  Pass
  8  Pass

 
