Expiring callbacks and defer sending during status handling #189
…me. Defer sending operations in status handling methods.
Getting a somewhat reliable crash after an open/close cycle completes
|
seems to be getting stuck at 95% somehow
|
Eventually it crashed |
I reverted back to |
I don't see anything obviously wrong here that would explain the crashes..... |
No idea why the crashes happen, I can't replicate them on my GDO. I'll think about it. |
I've been testing a lot on this one ...maybe I've worn out the flash.... |
If that's the case it would crash on main too. |
Maybe it's bad flash and when I OTA I hit the other side... anyways |
Let me flash twice and see if it's stable |
I flashed a different one and the problem happens there as well, so it's not the chip
|
tried a third one same behavior
|
I can build a new box and do a fresh HA install with ESPHome and link one of them if you want to play around with it |
That sounds like too much work and I wouldn't see much more than what you've already posted here. It's very strange, I cannot get it to crash on mine... Maybe try with verbose logging to get more clues (update ESP_LOG1 and ESP_LOG2 in common.h). |
Crash with more logging
|
Let me know if I should pull out the car and get the ladder to get a serial console trace |
Sorry, I had to step away from the computer for a while. I pushed two branches. Also, what version of esphome/platformio/gcc are you using? |
I'm installing with the latest stable addon from HA. It's whatever Debian bookworm provides. ESPHome 2023.12.8. I can check the gcc and PlatformIO versions when I get back home. I'll have to test tomorrow as I won't be back home until late. |
|
I hooked up a serial console. trace is above |
The trace is from the near garage door. Testing expiring_callbacks_1 now on the mid garage door.
Confirmed crash still happens with expiring_callbacks_1 on mid garage door (console not attached as I'd have to pull out the other car but can if needed)
Trying _2 now |
crashed with this one as well
|
retesting as I realized the refs need to be changed in the branches |
retesting with expiring_callbacks_1 now ... |
|
|
Spoke too soon, _2 crashed.
Now properly testing _1 |
Summary
|
I just realized I'm getting crashes on
|
The best way I have found to trigger the crash is to open and close a few times, stop the door at various points in the open/close phase, and then restart the open/close. |
Thanks for all the testing!
Ok, that at least makes a bit more sense, since it was a large refactoring... I couldn't see what would cause it in this latest change. Still strange that it didn't happen for me at all, I'll have some time later today to look at it and test some more. |
I saw in the stack trace that the crash was a WDT reset, and looking through the code it seems that esphome does not feed the WDT in between scheduler tasks. I pushed a change that feeds the WDT after each defer task in the ratgdo component. When you have a chance, can you see if it fixes the crashes? |
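For anyone following along, the idea of feeding the WDT after each deferred task can be sketched roughly as below. This is a standalone illustration, not the code from the change: `FakeWatchdog` and `DeferQueue` are hypothetical names, and the counter stands in for whatever watchdog feed call the real firmware uses.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for the hardware watchdog. Real firmware would call
// the platform's feed function here instead of counting.
struct FakeWatchdog {
    int feeds = 0;
    void feed() { ++feeds; }
};

// Hypothetical deferred-task queue: status handlers enqueue work instead of
// sending right away, and the main loop drains the queue later.
class DeferQueue {
public:
    explicit DeferQueue(FakeWatchdog& wdt) : wdt_(wdt) {}

    void defer(std::function<void()> task) {
        assert(task && "deferred task must be callable");
        tasks_.push_back(std::move(task));
    }

    // Run the tasks that were queued when this call started, feeding the
    // watchdog after each one so a long batch cannot starve it.
    // Tasks queued while draining wait for the next call.
    std::size_t run_pending() {
        std::size_t ran = 0;
        std::size_t n = tasks_.size();
        while (n-- > 0) {
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop_front();
            task();
            wdt_.feed();
            ++ran;
        }
        return ran;
    }

    std::size_t pending() const { return tasks_.size(); }

private:
    FakeWatchdog& wdt_;
    std::deque<std::function<void()>> tasks_;
};
```

Popping the task before running it means a task can safely defer more work, and that work is picked up on the next pass rather than extending the current one.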
I re-flashed. Sorry to report the behavior is the same
|
Let me know if you want a full trace and I'll hook up the console again |
Try again with the latest? If the same, another trace might give some additional clues... frustrating to not be able to reproduce it, I've been opening/closing the door lots of times and nothing... |
I have a feeling it's 8266 only... |
It's hard crashed and not coming back, so I have to get the ladder back out anyways |
I'm using an 8266... |
That's super frustrating for sure, considering how many different GDOs I can replicate on |
Send me a binary, I'll flash it on mine, at least narrow down to HW or build environment. |
Not even ping/OTA? I've always been able to OTA even if HA was not connecting (ESP went into safe mode boot). |
I have wifi creds hard coded into it. Can you find me on discord (same handle), and I'll compile one for you. |
pings but no OTA |
Last push appears to have fixed it: 15 cycles and no crash. Couldn't get past 2 before.
Added expiring callbacks that will only be called within a certain time.
Defer sending operations in status handling methods.
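
A rough sketch of what an expiring callback could look like, for illustration only. The class name `ExpiringCallbacks`, the one-shot behaviour, and the millisecond timestamps passed in by the caller are all assumptions, not the code in this PR. A 64-bit timestamp is used so the sketch can ignore counter wraparound.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical container of one-shot callbacks, each with a deadline.
// A callback runs on the next trigger only if the trigger arrives no later
// than its deadline; expired callbacks are dropped without being called.
class ExpiringCallbacks {
private:
    struct Entry {
        std::uint64_t expires_at_ms;
        std::function<void()> cb;
    };
    std::vector<Entry> entries_;

public:
    // Register cb to run on the next trigger that happens at or before
    // expires_at_ms (same clock as the now_ms passed to trigger()).
    void add(std::uint64_t expires_at_ms, std::function<void()> cb) {
        entries_.push_back(Entry{expires_at_ms, std::move(cb)});
    }

    // Call every callback that has not expired, then forget all of them.
    // Returns how many callbacks were actually called.
    std::size_t trigger(std::uint64_t now_ms) {
        // Take ownership first so callbacks may register new entries safely.
        std::vector<Entry> entries = std::move(entries_);
        entries_.clear();
        std::size_t called = 0;
        for (Entry& e : entries) {
            if (now_ms <= e.expires_at_ms) {
                e.cb();
                ++called;
            }
        }
        return called;
    }

    std::size_t size() const { return entries_.size(); }
};
```

With this shape, a status update that arrives too late simply finds nothing left to call, so a stale callback cannot act on a state it was never meant for.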