BLE connection randomly hangs on Bangle.js2 #2560
Comments
Just to check, is the firmware on your Bangle.js up to date? Please can you turn on the debug logging option on the Bangle - probably to write messages to the screen as well as log them to a file - as that may help you see what's happening. Please can you also attempt to subscribe to notifications on the UART RX characteristic and print them? Even the act of doing that might help, and it would also show you any errors/warnings the Bangle sent. There are a few options here:
With no idea what commands you're sending, what the Bangle is reporting back or even the software running on the Bangle there's not really anything I can do to help solve this - if you could get me the Bangle and ESP32 code you're running I could try it on a device here and see if I can reproduce it, but realistically it'd have to occur more often than once a week for me to be able to be sure of doing so while the Bangle was on a debugger. In the meantime, you should be able to work around your problem by resetting the Bangle if it's not working as you expect. Based on https://www.espruino.com/Bangle.js+Button+Reset you can do:
So now, if that interval (with |
Thanks for the feedback and plan forward for diagnosing. I will implement and report back. Your watchdog fix will work in the interim, but finding the culprit is better. I cannot disclose the exact code at the moment as we are currently in the process of a research publication submission, however the code just sends a function to 1) update unix time (and tz offset), and 2) updates a cached json file on the watch with the time. Each function is written with echo off; I initially thought it was a writing to storage issue (my main app writes to the watch storage each minute), and so randomly the second command and the storage write within app would crash, so I implemented a 'watcher' system for storage writes, but the freezing still occurs. The connect to Bangle.js2 from an ESP32 is quite generic (so happy to share) and used from many tutorials online, and registers for notifications:
|
Ok, great! So one thing I do see is you are enabling notifications and adding a handler with I don't know how NimBLEDevice handles it but maybe if notifyCallback should be responding with an acknowledge and isn't that could be the issue. If you really don't care about the response you can always try removing every reference to |
One thing that came to mind is this...
What is the 'worst case' time duration for a Bluetooth connection to a Bangle? Could there be a case where the NRF thinks it's connected, but the ESP32 has timed out and stopped trying, so the connection never proceeds, thereby halting the Bangle.js2 from continuing the 'connection dance'? In other words, do you think 5 seconds is too short...? Hope that makes sense. |
Well, it shouldn't really matter - at whatever point, if the Bangle's not receiving any packets from the connecting device for a certain time period it should flag itself as disconnected. Out of interest, when the Bangle hangs, what happens if you just unplug or hard reset the ESP32? Does it have any effect on the Bangle? |
I haven't tried that, but the program on the microcontroller continues to run and connect to other Bangle.js2 in proximity, so it has already cleared/recycled its Bluetooth connection (only one Bangle.js2 is connected at a time). |
Update, after 2 days of solid connect/disconnects. We have a freeze on my dev/test watch. I wasn't able to capture the exact log of what happened. Got home, put watch on charger, and it froze on next BLE connection. The ESP32 continued to operate as expected (disconnect after a stale connection). I did have your interim suggestion included in the codebase;
But that didn't even restart the watch, despite the frozen clockface (I created a custom widget and just use the Anton clock). |
Well that's odd - so you think that the interval wasn't running? Maybe you could update your custom widget from within that setInterval so you can see if that is still working ok.
Did you actually try restarting the ESP32 in case it was still holding the connection open even though the code said it wasn't? |
I am not sure whether the interval was running or not (although the inability to use the btn for reset implies it's positioned correctly), but the Bangle.js2 accepts a connection (widbt shows blue on screen). So a connection is started. My guess is the ESP32 attempts to get the service/characteristic of the UART, and this fails so the ESP32 disconnects as expected. But the Bangle.js2 freezes at some point here. I can tell this as my custom widget writes data every minute to local storage, and I have no data between 19h35 and 20h55 during the freeze until I conducted a hard reset (holding button for +10 seconds). I did not physically restart my ESP32, but if a Bangle connection stalls for over 20 seconds (e.g. connected but no notifications received for 20 seconds), it automatically restarts the ESP32 as a failsafe. So the connection to the watch is not held. I also only allow for 1 connection to a Bangle at a given time on the ESP32. |
Please can you just try this next time it happens? It'd really help to track down what the underlying issue is |
Funny enough, the freeze episode happened today. I did power reset the ESP32, but it did nothing. The BangleJS2 was frozen on the clock face.
Hopefully that helps narrow it down... |
…rs to supply their own JS watchdog (#2560)
It at least shows that something in the 'idle loop' isn't completing, but it's probably not JS because a ~2 sec press of the button would break out otherwise. Without a way for me to easily reproduce or any logs from the Bangle it's not possible to know much more though! If you could make me a cut-down firmware for the ESP32 that could reproduce the issue on a Bangle here then I could help, but if not there's nothing I can do. But I did just add
This should now reboot the Bangle if the interval isn't called, which we're pretty sure it's not - so that should fix your problem for now |
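For readers following along, the idea of a watchdog driven from JavaScript can be sketched as below. This is an illustrative sketch, not the snippet posted in this thread: it assumes the firmware's own automatic kicking has already been switched off by the change mentioned above, and that `E.kickWatchdog()` and `BTN.read()` are available as in the Espruino reference. The helper name `makeWatchdogTick` and the 2000 ms period are hypothetical choices. Skipping the kick while the button is held keeps a long press usable as a manual reset.

```javascript
// Sketch of a watchdog that is only kicked from a JavaScript interval.
// If the interval stops being called (or the button is held long enough),
// the hardware watchdog times out and the watch reboots.
// Assumption: automatic kicking by the firmware has already been disabled.

// kick:         function that resets the watchdog timer
// isButtonHeld: function returning true while the button is pressed
function makeWatchdogTick(kick, isButtonHeld) {
  var stats = { kicks: 0, skipped: 0 };
  function tick() {
    if (isButtonHeld()) {
      stats.skipped++; // deliberately let the watchdog run down
    } else {
      kick();
      stats.kicks++;
    }
  }
  return { tick: tick, stats: stats };
}

// On the watch only (guarded so the file can also be loaded off-device).
// The period is assumed to be comfortably below the watchdog timeout.
if (typeof E !== "undefined" && typeof BTN !== "undefined") {
  var wd = makeWatchdogTick(
    function () { E.kickWatchdog(); },
    function () { return BTN.read(); }
  );
  setInterval(wd.tick, 2000);
}
```

Keeping the kick and button functions as parameters means the logic can be exercised on a desktop with fakes before it goes on a watch.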
I will try that and report back! Thanks :) The press of the button should have unlocked the watchface at least, but the ~2 sec button press didn't/wouldn't work as I followed your previous suggestion of adding these lines of code to the widget:
But with this cutting edge build I can ignore that first bit of code... I think I could create a simplified version of the firmware for you to test on a standard ESP32 ... not an easy task so may take a couple of days. Will have to send through email. I am also going to add a memory watcher too so when / if the screen freezes I have a visual of used memory too, just to confirm that is/is not the culprit. |
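A memory watcher of the kind mentioned here could look roughly like the sketch below. It assumes Espruino's `process.memory()`, which reports `usage` and `total` in blocks; the helper name `formatMemory`, the one-minute period and the use of `print` rather than drawing to the screen are illustrative choices, not anything taken from the app discussed in this thread.

```javascript
// Sketch: turn a memory report into a short string suitable for a log line
// or a widget. `mem` is expected to have numeric `usage` and `total` fields.
function formatMemory(mem) {
  var pct = Math.round(100 * mem.usage / mem.total);
  return mem.usage + "/" + mem.total + " (" + pct + "%)";
}

// On the watch only (guarded so the file can also be loaded off-device).
// Assumption: process.memory() exists and returns { usage, total, ... }.
if (typeof Bangle !== "undefined" && typeof process !== "undefined" &&
    typeof process.memory === "function") {
  setInterval(function () {
    print("mem " + formatMemory(process.memory()));
  }, 60000);
}
```

If the last value shown before a freeze is far from the total, memory exhaustion becomes a less likely culprit.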
Ok, great, thanks! And yes, dumping memory use on the screen would be handy. I believe there are some widgets for that so you may not even have to write any code.
Ahh, ok - interesting! So you think without If so it could be JS - if you enabled writing the debug log on the watch it might have even been able to write a stack trace of where it got stuck into the log |
Will do! |
updates... stack log looks like it might not be much assistance, but it seems that at the point of 'freeze' it prints this:
I also had to remove the
removing the
resolved that, but I still experienced a freeze (overnight last night) during one of the Bluetooth connections (attempts are made every 3 - 5 mins). It had the blue bluetooth symbol, suggesting a connection, but it was not advertising. |
Thanks for the update... Please keep If your code is taking too long to execute and you're sure it's ok, just add kickWatchdog inside it:
|
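As an illustration of kicking the watchdog from inside long-running code, a hedged sketch follows. It is not the snippet originally posted here; `processAll` and its parameters are hypothetical names, and on a watch the `kick` argument would be a wrapper around `E.kickWatchdog()`.

```javascript
// Sketch: run work(item) for every item, kicking the watchdog every `every`
// items so that a long but legitimate loop is not mistaken for a hang.
function processAll(items, work, kick, every) {
  var results = [];
  for (var i = 0; i < items.length; i++) {
    results.push(work(items[i]));
    if ((i + 1) % every === 0) kick();
  }
  return results;
}

// Example use on the watch (assumes E.kickWatchdog exists):
//   processAll(records, handleRecord, function () { E.kickWatchdog(); }, 50);
```

Kicking every N items rather than on every iteration keeps the overhead low while still bounding the time between kicks.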
Wow, ok, well that's odd. Was it still scrolling And ... unless your code is stuck executing some JS code that kicks the watchdog itself (just Storage compaction or defragmentation as far as I know) |
I think I have come to a conclusion that the Bluetooth stack freezes when you try to make connections too often from multiple independent devices (e.g. beacons connecting to the BangleJS2 to write a command and then immediately disconnect). The origins of the freeze are unknown at this time as once the BangleJS2 freezes, it is not responsive, but it likely occurs during connection initiation and before services/characteristics are found. I did observe that having only one beacon connect to the BangleJS2 intermittently (once every 5 minutes) resolved the issue. Only when I had multiple devices connecting (+2) did the BangleJS2 hang, even if I set a flag to ensure beacons did not connect to the same BangleJS2 within 5 minute windows. When it froze was irregular as well: sometimes after a couple of hours, sometimes after a couple of days. So I am marking this as closed for now and will reopen / update if I come by a fix |
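The "flag" that stops a beacon reconnecting to the same watch inside a 5 minute window can be sketched as a small cooldown table. This is only an illustration in JavaScript: the actual beacon firmware in this thread runs on an ESP32 and was not shared, so `makeCooldown`, `tryAcquire` and the address strings are hypothetical. The table here is local to one beacon; whether the real flag was shared between beacons is not stated.

```javascript
// Sketch: per-watch cooldown so the same address is not contacted more than
// once per window. Times are in milliseconds.
function makeCooldown(windowMs) {
  var lastSeen = {}; // address -> time of the last permitted attempt
  return {
    // Returns true (and records the attempt) if `address` may be contacted
    // at `nowMs`; returns false if it was contacted less than windowMs ago.
    tryAcquire: function (address, nowMs) {
      var last = lastSeen[address];
      if (last !== undefined && nowMs - last < windowMs) return false;
      lastSeen[address] = nowMs;
      return true;
    }
  };
}

// Example: a 5 minute window.
//   var cooldown = makeCooldown(5 * 60 * 1000);
//   if (cooldown.tryAcquire(address, Date.now())) { /* connect */ }
```

Note that a per-beacon table like this does not prevent two different beacons from contacting the same watch close together, which matches the observation above that the hang still occurred with more than one beacon.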
Ok, thanks! That explains why my test setup worked for days without issues. If you do have success reproducing (eg with two ESP32 trying to connect every few seconds) please let me know - the trick will be reproducing it here so I can do it with the watch on a debugger. I also believe the use of the watchdog tweaks above should really have resolved almost any crashing issues you had |
I thought so too (re: manual watchdog), but even with it I was experiencing the freeze condition, albeit less frequently so it did do something. I really did try various permutations for testing (e.g. different BLE connection intervals without flags [1,5,10 minutes], flags for ensuring the Bangle wasn't connected to more than once every 1, 5, or 10 minutes, 2 or more beacons in proximity, etc). The only resolution I experienced was to not have more than 1 ESP32 making direct communications with the watch. ~2, 5, or 10 min intervals were all ok for days. This, in general, makes me think that I am observing a CPU halt somewhere during the BLE connection, but I cannot pinpoint where in the stack this is happening as logging freezes. The only reset is a hardware button hold. It isn't the end of the world - just requires a different solution for my context! But my alternative approach sparked another observation with the low-powered GPS mode provided by the gpssetup module. |
Well that itself is odd because the only way long-pressing on the button resets the Bangle is because of the watchdog, which should be disabled by If you've got the code I gave you running on one of the recent firmwares:
then pressing the button on the watch will have zero effect on whether it reboots or not. |
I have noticed a random issue with the Bangle.js2 where, during a Bluetooth connection from a separate device to the UART, the Bangle.js2 will freeze and need a reboot (button hold).
I am connecting to the Bangle.js2 every 3 minutes over Bluetooth via an ESP32 (programmed in Arduino/ESP-IDF), sending 2 commands one after the other, without needing to read the response, and then disconnecting from the device. The ESP32 executes its code successfully, and this can work fine for hours, days, even weeks.
However, at random during the connection, the Bangle.js2 freezes at some point and halts function (e.g. time on the clock freezes, the Bluetooth icon from widbt stays blue). The Bangle.js2 also stops advertising itself, so it is stuck in a BLE connected state, despite the ESP32 disconnecting.
I have been trying to unravel when in the connection this happens, and I believe it occurs before commands are sent, thus during the connection between ESP32 and Bangle.js2.