Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error on switching to new asyncio #1566

Closed
drc38 opened this issue Dec 28, 2024 · 12 comments
Closed

Error on switching to new asyncio #1566

drc38 opened this issue Dec 28, 2024 · 12 comments
Labels

Comments

@drc38
Copy link

drc38 commented Dec 28, 2024

Hi,

I need to do a bit more debugging but is this error expected when switching to the new asyncio from the legacy implementation?

Thanks

Error doing job: Fatal error: protocol.data_received() call failed. (None)
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/asyncio/selector_events.py", line 1019, in _read_ready__data_received
    self._protocol.data_received(data)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/local/lib/python3.13/site-packages/websockets/asyncio/connection.py", line 1082, in data_received
    self.process_event(event)
    ~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/websockets/asyncio/server.py", line 218, in process_event
    super().process_event(event)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/websockets/asyncio/connection.py", line 733, in process_event
    self.acknowledge_pings(bytes(event.data))
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.13/site-packages/websockets/asyncio/connection.py", line 753, in acknowledge_pings
    pong_waiter.set_result(latency)
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
asyncio.exceptions.InvalidStateError: invalid state

It appears to be due to trying to process incoming 0 byte ping/pongs:

2024-12-29 09:56:44.343 DEBUG (MainThread) [websockets.server] < PING '' [0 bytes]
2024-12-29 09:56:44.343 DEBUG (MainThread) [websockets.server] > PONG '' [0 bytes]
2024-12-29 09:56:46.094 DEBUG (MainThread) [websockets.server] > PING 20 71 45 d0 [binary, 4 bytes]
@aaugustin aaugustin added the bug label Dec 29, 2024
@drc38
Copy link
Author

drc38 commented Dec 30, 2024

Extra debug:

2024-12-30 21:46:20.130 DEBUG (MainThread) [websockets.server] > PING 8c 77 10 4c [binary, 4 bytes]
2024-12-30 21:46:20.339 DEBUG (MainThread) [websockets.server] < PONG 8c 77 10 4c [binary, 4 bytes]
2024-12-30 21:46:40.341 DEBUG (MainThread) [websockets.server] > PING 0d 90 0f 00 [binary, 4 bytes]
2024-12-30 21:46:40.855 DEBUG (MainThread) [websockets.server] < PONG 0d 90 0f [binary, 3 bytes]
2024-12-30 21:47:17.717 DEBUG (MainThread) [websockets.server] < PING '' [0 bytes]
2024-12-30 21:47:17.717 DEBUG (MainThread) [websockets.server] > PONG '' [0 bytes]
2024-12-30 21:47:18.363 DEBUG (MainThread) [websockets.server] < TEXT '[2,"27db2f45-d1d7-91b4-2aee-4b60975d27ce","Mete...,"value":"50.100"}]}]}]' [735 bytes]
2024-12-30 21:47:18.392 DEBUG (MainThread) [websockets.server] > TEXT '[3,"27db2f45-d1d7-91b4-2aee-4b60975d27ce",{}]' [45 bytes]
2024-12-30 21:47:37.764 DEBUG (MainThread) [websockets.server] > PING c7 27 2c d6 [binary, 4 bytes]
2024-12-30 21:47:37.898 DEBUG (MainThread) [websockets.server] < PONG c7 27 2c d6 [binary, 4 bytes]
2024-12-30 21:47:37.898 DEBUG (MainThread) [websockets.server] pong_waiter.set_result failed on ping_id: b'\r\x90\x0f\x00'
2024-12-30 21:47:37.898 DEBUG (MainThread) [websockets.server] Ping_ids: [b'\r\x90\x0f\x00']
2024-12-30 21:47:37.898 DEBUG (MainThread) [websockets.server] Data: b"\xc7',\xd6"

By adding the following:

            try:
                pong_waiter.set_result(latency)
            except Exception:
                self.logger.debug("pong_waiter.set_result failed on ping_id: %s", ping_id)
                self.logger.debug("Ping_ids: %s", ping_ids)
                self.logger.debug("Data: %s", data)

@drc38
Copy link
Author

drc38 commented Dec 30, 2024

Looks like this line where the trailing 00 is lost could be the cause:
2024-12-30 21:46:40.855 DEBUG (MainThread) [websockets.server] < PONG 0d 90 0f [binary, 3 bytes]

@drc38
Copy link
Author

drc38 commented Dec 30, 2024

Another example where the data is not returned by the pong, I assume due to a network error:

2024-12-30 22:31:32.851 DEBUG (MainThread) [websockets.server] > PING d6 00 72 43 [binary, 4 bytes]
2024-12-30 22:31:33.017 DEBUG (MainThread) [websockets.server] < PONG d6 [binary, 1 byte]
2024-12-30 22:32:02.857 DEBUG (MainThread) [websockets.server] < PING '' [0 bytes]
2024-12-30 22:32:02.857 DEBUG (MainThread) [websockets.server] > PONG '' [0 bytes]
2024-12-30 22:32:12.858 DEBUG (MainThread) [websockets.server] > PING '_\x17W\x1b' [text, 4 bytes]
2024-12-30 22:32:13.034 DEBUG (MainThread) [websockets.server] < PONG '_\x17W\x1b' [text, 4 bytes]
2024-12-30 22:32:13.034 DEBUG (MainThread) [websockets.server] pong_waiter.set_result failed on ping_id: b'\xd6\x00rC'
2024-12-30 22:32:13.034 DEBUG (MainThread) [websockets.server] Ping_ids: [b'\xd6\x00rC']
2024-12-30 22:32:13.035 DEBUG (MainThread) [websockets.server] Data: b'_\x17W\x1b'

@aaugustin
Copy link
Member

Thank you for providing the debug logs. That makes it a lot easier.

Your server (websockets) sends a ping with a random binary payload; the client is expected to echo it as is; but it doesn't, as you can see in the debug logs.

When I say "is expected", it's a "MUST" in the RFC (https://datatracker.ietf.org/doc/html/rfc6455#section-5.5.2). Arguably, his is a bug in the client.

@aaugustin
Copy link
Member

From the perspective of websockets there are two options:

  1. make no changes; expect other implementations to improve
  2. switch from "random payloads" to "random printable ASCII payloads"; improve compatibility

The problem isn't super frequent, which is why I didn't really consider option 2 until now.

@drc38
Copy link
Author

drc38 commented Dec 30, 2024

Thanks for the reply, interestingly the error doesn't occur with the legacy implementation with the same client, was number 2 the previous approach?

Could the scenarios above occur due to a flaky network connection rather than poor client implementation?

Is there a 3rd option to remove the pong future gracefully if it isn't received/matched before the next pong receipt as per:

If an endpoint receives a Ping frame and has not yet sent Pong
   frame(s) in response to previous Ping frame(s), the endpoint MAY
   elect to send a Pong frame for only the most recently processed Ping
   frame.

@drc38
Copy link
Author

drc38 commented Dec 30, 2024

I assume the above could be achieved by moving:

latency = pong_timestamp - ping_timestamp
pong_waiter.set_result(latency)

Into the if statement:

if ping_id == data:
    latency = pong_timestamp - ping_timestamp
    pong_waiter.set_result(latency)
    self.latency = latency
    break

@aaugustin
Copy link
Member

Could the scenarios above occur due to a flaky network connection rather than poor client implementation?

No - websockets runs over TCP and TCP guarantees that there's no gap in the data stream.

@aaugustin
Copy link
Member

I'm not 100% sure how you end up in this situation but I'm relatively confident that the change that I just made fixed it. Would you be able to test it?

You can just patch websockets like this:

197b3ec#diff-76885032aeb05335d0dc3238fcc783c2af021b1b13d46b3b0401d64956912aa8L753-R754

@drc38
Copy link
Author

drc38 commented Jan 4, 2025

Thanks, will test but the patch looks good to me.

@aaugustin
Copy link
Member

Cool, thank you for reporting the bug!

@drc38
Copy link
Author

drc38 commented Jan 4, 2025

Cool, thank you for reporting the bug!

Been running patch for 12h with no issues

aaugustin added a commit that referenced this issue Jan 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants