Intermittent ConnectionResetError in SSL tests on Windows, probably due to openssl 1.1.1 bug #1293

@njsmith

Signature:

..\pyinstall\python\tools\lib\site-packages\trio\tests\test_ssl.py:94: in ssl_echo_serve_sync
    data = wrapped.recv(4096)
..\pyinstall\python\tools\lib\ssl.py:1056: in recv
    return self.read(buflen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <ssl.SSLSocket fd=1244, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 2560), raddr=('127.0.0.1', 2559)>
len = 4096, buffer = None

    def read(self, len=1024, buffer=None):
        """Read up to LEN bytes and return them.
        Return zero-length string on EOF."""
    
        self._checkClosed()
        if self._sslobj is None:
            raise ValueError("Read on closed or unwrapped SSL socket.")
        try:
            if buffer is not None:
                return self._sslobj.read(len, buffer)
            else:
>               return self._sslobj.read(len)
E               ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

..\pyinstall\python\tools\lib\ssl.py:931: ConnectionResetError

Example: https://dev.azure.com/python-trio/trio/_build/results?buildId=1210&view=logs

I seem to get this pretty often when testing locally, and now it's showing up on Azure as well...

I think this must be yet another manifestation of the famous openssl bug (openssl/openssl#7948). We've worked around it in Trio itself (#1171), but in this test we're using the stdlib ssl module directly in blocking mode in a background thread, as a "known good" reference implementation. Of course, it's not so good – it sends session tickets unconditionally after the handshake. In some of our tests, the client closes the connection after the handshake, before reading the tickets. If you close a socket while there's pending data in the receive buffer, then sometimes that triggers a RST packet to the peer. And then the peer might complain that the connection was reset, like it does here.

This would explain why we only started seeing it recently – this is new behavior in openssl v1.1.1, and that's still percolating out through various distribution channels.

I'm not sure if the reason we've only seen this on Windows so far is because it's a Windows-only quirk, or because only Windows uses TCP sockets here – we're using socketpair, and on Unix that generally returns Unix-domain sockets, which don't have RST packets. But Windows doesn't have those, so the stdlib emulates socketpair using a loopback TCP socket.
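A quick way to check which kind of socket socketpair actually hands back on a given platform is to look at the family of the returned sockets. This is only a sketch; the helper name `socketpair_family` is made up for illustration and is not part of the test suite.

```python
import socket


def socketpair_family():
    """Return the address family used by socket.socketpair() on this platform."""
    a, b = socket.socketpair()
    try:
        # On Unix this is generally AF_UNIX; where the stdlib has to emulate
        # socketpair over loopback TCP, it is an AF_INET-style family instead.
        return a.family
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(socketpair_family())
```

If this reports a TCP family, RST packets are possible on the pair, which fits the second explanation above.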

Anyway, I guess we should ... just ignore ConnectionResetError here, probably?
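Roughly what I have in mind, as a minimal sketch rather than the actual code in test_ssl.py: the blocking echo loop treats a reset the same way it treats EOF. The function name `echo_serve_sync` and its return values are hypothetical, and it uses a plain socket instead of a wrapped SSL socket to stay self-contained.

```python
import socket
import threading


def echo_serve_sync(sock):
    """Blocking echo loop that treats a connection reset like a normal EOF.

    Returns "eof" on a clean shutdown and "reset" if the peer reset the
    connection (hypothetical return values, only here to make the sketch
    easy to inspect).
    """
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                return "eof"
            sock.sendall(data)
    except ConnectionResetError:
        # The peer closed while it still had unread data (e.g. session
        # tickets) in its receive buffer, which can trigger an RST.
        # For the purposes of the test, that just means the client is gone.
        return "reset"
    finally:
        sock.close()
```

The downside is that this would also hide a genuine unexpected reset, so it probably only makes sense for the reference server thread and not for the code under test.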
