runtime: warm-reboot crashes on ossomain #37

Open
wdoekes opened this issue Nov 15, 2024 · 0 comments

Description

# warm-reboot

takes a while, and afterwards things are broken.
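To put numbers on "takes a while", the "took N sec" entries in the syncd log below can be pulled out with a short script. This is a minimal Python sketch, assuming the syslog line format seen in this issue; the helper name `slow_steps` and the 1-second threshold are made up for illustration and are not part of SONiC.

```python
import re

# Matches the syslog prefix used in the log in this issue, e.g.
# "2024 Nov 15 14:25:46.231040 spine1 NOTICE syncd#syncd: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<host>\S+) (?P<level>[A-Z]+) (?P<tag>[^:]+): (?P<msg>.*)$"
)
# Matches "... took 6.420766 sec" at the end of a message.
TOOK_RE = re.compile(r"took (?P<secs>\d+(?:\.\d+)?) sec$")


def slow_steps(lines, threshold=1.0):
    """Return (seconds, timestamp, message) tuples for steps that took
    at least `threshold` seconds, slowest first. Hypothetical helper."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip "..." elisions and anything that is not a syslog line
        took = TOOK_RE.search(m.group("msg"))
        if took and float(took.group("secs")) >= threshold:
            found.append((float(took.group("secs")), m.group("ts"), m.group("msg")))
    return sorted(found, reverse=True)
```

Fed the lines of a saved copy of the log, it returns the slow steps sorted by duration, which makes it easier to see where the time goes during shutdown and warm start.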

2024 Nov 15 14:25:38.903056 spine1 NOTICE syncd#syncd_request_shutdown: :- loadFromFile: no context config specified, will load default context config
2024 Nov 15 14:25:38.903056 spine1 NOTICE syncd#syncd_request_shutdown: :- insert: added switch: idx 0, hwinfo ''
2024 Nov 15 14:25:38.903318 spine1 NOTICE syncd#syncd_request_shutdown: :- send: requested PRE-SHUTDOWN shutdown
2024 Nov 15 14:25:38.903523 spine1 ERR syncd#syncd: message repeated 23 times: [ :- bulkCollectData: Failed to get stats of Queue Counter 0x8e011500000001: -5]
2024 Nov 15 14:25:38.903550 spine1 NOTICE syncd#syncd: :- run: is asic queue empty: 1
2024 Nov 15 14:25:38.903577 spine1 NOTICE syncd#syncd: :- run: drained queue
2024 Nov 15 14:25:38.903577 spine1 NOTICE syncd#syncd: :- handleRestartQuery: received PRE-SHUTDOWN switch shutdown event
2024 Nov 15 14:25:38.926710 spine1 NOTICE syncd#syncd: :- threadFunction: time span 23 ms for 'PRE-SHUTDOWN:PRE-SHUTDOWN'
2024 Nov 15 14:25:38.930481 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:syncdb_database_all_store:1566 SAI Timing: SyncDB metadata and data file write time 0.006 seconds
2024 Nov 15 14:25:38.930719 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:syncdb_schema_file_store:2534 SAI Timing: SyncDB schema file write time 0.000 seconds
2024 Nov 15 14:25:38.957698 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:_brcm_sai_dm_fini:32842 SAI Timing: SyncDB save to NV storage time 0.036 seconds
2024 Nov 15 14:25:38.957698 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:_brcm_sai_dm_fini:32855 SAI Timing: DM fini time 0.041 seconds
2024 Nov 15 14:25:38.957962 spine1 NOTICE syncd#syncd: :- run: switched to PRE_SHUTDOWN, from now on accepting only shutdown requests
2024 Nov 15 14:25:38.957994 spine1 NOTICE syncd#syncd: :- run: warm pre-shutdown took 0.054414 sec
2024 Nov 15 14:25:39.710144 spine1 NOTICE syncd#syncd_request_shutdown: :- loadFromFile: no context config specified, will load default context config
2024 Nov 15 14:25:39.710144 spine1 NOTICE syncd#syncd_request_shutdown: :- insert: added switch: idx 0, hwinfo ''
2024 Nov 15 14:25:39.710409 spine1 NOTICE syncd#syncd_request_shutdown: :- send: requested WARM shutdown
2024 Nov 15 14:25:39.710562 spine1 NOTICE syncd#syncd: :- run: is asic queue empty: 1
2024 Nov 15 14:25:39.710588 spine1 NOTICE syncd#syncd: :- run: drained queue
2024 Nov 15 14:25:39.710588 spine1 NOTICE syncd#syncd: :- handleRestartQuery: received WARM switch shutdown event
2024 Nov 15 14:25:39.710599 spine1 NOTICE syncd#syncd: :- profileGetValue: SAI_WARM_BOOT_WRITE_FILE: /var/warmboot/sai-warmboot.bin
2024 Nov 15 14:25:39.710599 spine1 NOTICE syncd#syncd: :- run: using warmBootWriteFile: '/var/warmboot/sai-warmboot.bin'
2024 Nov 15 14:25:39.710640 spine1 NOTICE syncd#syncd: :- run: Warm Reboot requested, keeping data plane running
2024 Nov 15 14:25:39.710640 spine1 NOTICE syncd#syncd: :- setUninitDataPlaneOnRemovalOnAllSwitches: Fast/warm reboot requested, keeping data plane running
2024 Nov 15 14:25:39.810354 spine1 NOTICE syncd#syncd: :- stopMdioThread: IPC task thread is stopped
2024 Nov 15 14:25:39.810354 spine1 NOTICE syncd#syncd: :- removeAllSwitches: Removing all switches
2024 Nov 15 14:25:39.926763 spine1 NOTICE syncd#syncd: :- threadFunction: time span 216 ms for 'shutting down syncd'
...
2024 Nov 15 14:25:46.231040 spine1 NOTICE syncd#syncd: :- removeAllSwitches: removing switch RID oid:0xb980112100000000 took 6.420766 sec
...
2024 Nov 15 14:26:24.472484 spine1 NOTICE syncd#syncd: :- Syncd: command line:  EnableDiagShell=YES EnableTempView=YES DisableExitSleep=NO EnableUnittests=NO EnableConsistencyCheck=NO EnableSyncMode=YES RedisCommunicationMode=redis_async EnableSaiBulkSuport=NO StartType=cold ProfileMapFile=/etc/sai.d/sai.profile GlobalContext=0 ContextConfig= BreakConfig=/tmp/break_before_make_objects WatchdogWarnTimeSpan=30000000
2024 Nov 15 14:26:24.472504 spine1 NOTICE syncd#syncd: :- loadFromFile: no context config specified, will load default context config
2024 Nov 15 14:26:24.472504 spine1 NOTICE syncd#syncd: :- insert: added switch: idx 0, hwinfo ''
2024 Nov 15 14:26:24.472504 spine1 WARNING syncd#syncd: :- Syncd: enable sync mode is deprecated, please use communication mode, FORCING redis sync mode
2024 Nov 15 14:26:24.474114 spine1 NOTICE syncd#syncd: :- RedisSelectableChannel: opened redis channel
2024 Nov 15 14:26:24.475998 spine1 NOTICE syncd#syncd: :- isVeryFirstRun: First Run: False
2024 Nov 15 14:26:24.475998 spine1 WARNING syncd#syncd: :- performStartupLogic: override command line startType=cold via SAI_START_TYPE_WARM_BOOT
2024 Nov 15 14:26:24.475998 spine1 NOTICE syncd#syncd: :- profileGetValue: SAI_WARM_BOOT_READ_FILE: /var/warmboot/sai-warmboot.bin
2024 Nov 15 14:26:24.475998 spine1 NOTICE syncd#syncd: :- performStartupLogic: using warmBootReadFile: '/var/warmboot/sai-warmboot.bin'
2024 Nov 15 14:26:24.480828 spine1 INFO syncd#syncd: [none] SAI_API_UNSPECIFIED:sai_api_initialize:451 BRCM SAI ver: [10.1.42.0], OCP SAI ver: [1.13.2], SDK ver: [sdk-6.5.29]
...
2024 Nov 15 14:26:24.482035 spine1 NOTICE syncd#syncd: :- sai_metadata_apis_query: :- failed to query api SAI_API_TWAMP: SAI_STATUS_NOT_IMPLEMENTED (-15)
2024 Nov 15 14:26:24.482047 spine1 NOTICE syncd#syncd: :- apiInitialize: sai_api_query failed for 14 apis
2024 Nov 15 14:26:24.482077 spine1 NOTICE syncd#syncd: :- apiInitialize: SAI API vendor version: 11302
2024 Nov 15 14:26:24.482077 spine1 NOTICE syncd#syncd: :- apiInitialize: SAI API min version: 10900
2024 Nov 15 14:26:24.482077 spine1 NOTICE syncd#syncd: :- apiInitialize: SAI API headers version: 11400
2024 Nov 15 14:26:24.482077 spine1 NOTICE syncd#syncd: :- parseBreakConfig: break config parse success, contains 2 entries
2024 Nov 15 14:26:24.482077 spine1 NOTICE syncd#syncd: :- Syncd: syncd started
2024 Nov 15 14:26:24.482893 spine1 NOTICE syncd#syncd: :- performWarmRestart: switches defined in warm restart: 1
2024 Nov 15 14:26:24.482893 spine1 NOTICE syncd#syncd: :- performWarmRestartSingleSwitch: switch oid:0x21000000000000
2024 Nov 15 14:26:24.482930 spine1 NOTICE syncd#syncd: :- performWarmRestartSingleSwitch:  - attr: SAI_SWITCH_ATTR_LAG_DEFAULT_HASH_SEED:0
2024 Nov 15 14:26:24.482930 spine1 NOTICE syncd#syncd: :- performWarmRestartSingleSwitch:  - attr: SAI_SWITCH_ATTR_ECMP_DEFAULT_HASH_OFFSET:0
...
2024 Nov 15 14:26:31.823494 spine1 WARNING syncd#syncd: [none] SAI_API_TAM:_brcm_sai_get_eapp_data:342 Invalid EAPP global info for idx:13.
2024 Nov 15 14:26:31.823556 spine1 WARNING syncd#syncd: [none] SAI_API_TAM:_brcm_sai_get_eapp_data:342 Invalid EAPP global info for idx:14.
2024 Nov 15 14:26:31.830920 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:_brcm_sai_dm_init:32564 SAI Timing: DM init time 0.397 seconds
2024 Nov 15 14:26:31.832597 spine1 INFO syncd#syncd: [none] SAI_API_BUFFER:driverMMUInit:987 Set CPU Tx queue to cosq 0
2024 Nov 15 14:26:31.832849 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:2234 WB ver: 10.1.42.0
2024 Nov 15 14:26:31.966605 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:3060 SAI Timing: RIF ingress stats de-init time 0 seconds
2024 Nov 15 14:26:31.966605 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:brcm_sai_create_switch:4083 Got max mtu 9412
2024 Nov 15 14:26:31.967610 spine1 NOTICE syncd#syncd: :- performWarmRestartSingleSwitch: Warm boot: create switch VID: oid:0x21000000000000 took 7.483851 sec
2024 Nov 15 14:26:32.050428 spine1 WARNING syncd#syncd: :- discover: skipping since it causes crash: SAI_STP_ATTR_BRIDGE_ID
2024 Nov 15 14:26:32.050702 spine1 NOTICE syncd#syncd: :- discover: discover took 0.083017 sec
2024 Nov 15 14:26:32.050702 spine1 NOTICE syncd#syncd: :- discover: discovered objects count: 1426
2024 Nov 15 14:26:32.051296 spine1 NOTICE syncd#syncd: :- discover: SAI_OBJECT_TYPE_PORT: 40
...
2024 Nov 15 14:26:32.051296 spine1 NOTICE syncd#syncd: :- discover: SAI_OBJECT_TYPE_PORT_SERDES: 39
2024 Nov 15 14:26:32.052721 spine1 NOTICE syncd#syncd: :- checkWarmBootDiscoveredRids: check warm boot RIDs
2024 Nov 15 14:26:32.065863 spine1 NOTICE syncd#syncd: :- checkWarmBootDiscoveredRids: spotted new RID oid:0x1003a0000003c missing from current RID2VID (new VID oid:0x3a0000000006d5) (SAI_OBJECT_TYPE_BRIDGE_PORT) on WARM BOOT
2024 Nov 15 14:26:32.066278 spine1 NOTICE syncd#syncd: :- checkWarmBootDiscoveredRids: spotted new RID oid:0x1003a0000003d missing from current RID2VID (new VID oid:0x3a0000000006d6) (SAI_OBJECT_TYPE_BRIDGE_PORT) on WARM BOOT
...
2024 Nov 15 14:26:32.215731 spine1 NOTICE syncd#syncd: :- run: syncd listening for events
2024 Nov 15 14:26:32.215731 spine1 NOTICE syncd#syncd: :- run: starting main loop
2024 Nov 15 14:26:32.215907 spine1 NOTICE syncd#syncd: :- clearTempView: clearing current TEMP VIEW
2024 Nov 15 14:26:32.216216 spine1 NOTICE syncd#syncd: :- clearTempView: clear temp view took 0.000321 sec
2024 Nov 15 14:26:32.216216 spine1 WARNING syncd#syncd: :- processNotifySyncd: syncd switched to INIT VIEW mode, all op will be saved to TEMP view
2024 Nov 15 14:26:32.216438 spine1 NOTICE syncd#syncd: :- syncd_ipc_task_main: IPC service is online
2024 Nov 15 14:26:32.223494 spine1 NOTICE syncd#syncd: :- onSwitchCreateInInitViewMode: new switch oid:0x21000000000000 contains hardware info: ''
2024 Nov 15 14:26:32.223611 spine1 NOTICE syncd#syncd: :- onSwitchCreateInInitViewMode: current oid:0x21000000000000 switch hardware info: ''
2024 Nov 15 14:26:32.233124 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:sai_query_attribute_enum_values_capability:649 Error in enum capability query for obj type 28
2024 Nov 15 14:26:32.233616 spine1 ERR syncd#syncd: [none] SAI_API_SWITCH:sai_query_attribute_capability:572 Error in capability query for obj type 28
2024 Nov 15 14:26:32.235075 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:sai_query_attribute_enum_values_capability:649 Error in enum capability query for obj type 33
2024 Nov 15 14:26:32.268097 spine1 INFO syncd#syncd: [none] SAI_API_SWITCH:sai_query_attribute_enum_values_capability:649 Error in enum capability query for obj type 33
2024 Nov 15 14:26:32.268097 spine1 NOTICE syncd#syncd: :- addPlugins: Queue Counter counters plugin 3a6159b8386c08e7285e34a85f4d2c68ad0f1168 registered
2024 Nov 15 14:26:32.268647 spine1 NOTICE syncd#syncd: :- addPlugins: Priority Group Counter counters plugin bcf34ae9158bb04b86095f829f890234e38613ff registered
2024 Nov 15 14:26:32.269188 spine1 NOTICE syncd#syncd: :- addPlugins: Port Counter counters plugin 21177aac7026a8c9b96bf4befa0b17910a87f560 registered

^- those "SAI_API_SWITCH:sai_query_attribute_capability:572 Error in capability query for obj type 28" errors then repeat in a loop.
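One way to confirm the loop is to count how often each ERR message shows up in the log. This is a rough Python sketch, assuming the syslog format shown above, including the "message repeated N times: [ ... ]" form that syslog uses to collapse duplicates; `count_messages` is a hypothetical helper, not an existing SONiC tool.

```python
import re
from collections import Counter

# Syslog prefix as seen in this issue:
# "2024 Nov 15 14:26:32.233616 spine1 ERR syncd#syncd: <message>"
SYSLOG_PREFIX_RE = re.compile(
    r"^\d{4} \w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d+ \S+ "
    r"(?P<level>[A-Z]+) [^:]+: (?P<msg>.*)$"
)
# Collapsed duplicates: "message repeated 23 times: [ <message>]"
REPEATED_RE = re.compile(r"^message repeated (?P<n>\d+) times: \[ ?(?P<msg>.*)\]$")


def count_messages(lines, level="ERR"):
    """Count messages of the given log level, keyed by message text.
    A "message repeated N times" line adds N to the wrapped message."""
    counts = Counter()
    for line in lines:
        m = SYSLOG_PREFIX_RE.match(line.strip())
        if not m or m.group("level") != level:
            continue  # not a syslog line, or a different log level
        msg = m.group("msg")
        rep = REPEATED_RE.match(msg)
        if rep:
            counts[rep.group("msg")] += int(rep.group("n"))
        else:
            counts[msg] += 1
    return counts
```

Calling `count_messages(lines).most_common(5)` on the log lines after the restart should put the "obj type 28" capability query error at the top if it really is looping.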

Which build are we running (if any)

SONiC Software Version: SONiC.ossomain.0-41ea968fc
SONiC OS Version: 12
Distribution: Debian 12.8
Kernel: 6.1.0-22-2-amd64
Build commit: 41ea968fc
Build date: Tue Nov 12 15:28:30 UTC 2024
Built by: [email protected]

Platform: x86_64-accton_as9716_32d-r0
HwSKU: Accton-AS9716-32D
ASIC: broadcom
ASIC Count: 1

Upstream issues/PRs
