Request with data which consists of empty values only sends bad request #6122
romanyakovlev wants to merge 2 commits into
|
Hi @romanyakovlev, can you provide some more information about why you believe this needs to be changed? The original input data is I don't believe there's any specification implying you should remove the Content-Type for zero length bodies. That's likely breaking behavior for a subset of APIs. It's probably also worth noting that the semantics of a GET request body are undefined since this is non-standard behavior. What happens here cannot be "correct" or "incorrect". |
|
@nateprewitt the reason I pushed this PR is a problem with nginx. When this type of request was sent to it I saw this in the logs: So after That's the only reason. |
|
You'll want to check error logs on why nginx is throwing a 400. It's also hard to tell if this is an issue for nginx or the application it's fronting. |
|
Well, my mistake. The problem was not with |
|
The same thing is happening on apache 2.4: |
|
@nateprewitt I'm sending the request directly to nginx in Docker without any other apps, and the same is true for Apache. It's hard to understand what's wrong with the request because the response is just "bad request". |
|
But I'm going to investigate why this is happening and yes, I was wrong about the reason is |
|
Alright, the error is here. We do a check below to make sure we're not emitting a Content-Length for GET/HEAD requests, but not for this first check. When the body value is None, it bypasses this conditional, but an empty string does not. This results in a As I said earlier, what you're trying to do with this request is outside of the realm of defined HTTP semantics. You should not be emitting a GET request with a body but we allow it because some servers do crazy things. The issue I believe you're hitting with apache/nginx is this:
While we probably shouldn't be doing this we are. It's hard to tell what may be relying on this behavior at this point and given this is a SHOULD NOT rather than a MUST NOT, I don't think this is something we'd fix in Requests 2.x. |
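A simplified model of the conditional described above, as a sketch only. `content_length_header` is a hypothetical helper, not the code in requests/models.py, and plain `len()` stands in for the library's `super_len`.

```python
from typing import Optional


def content_length_header(method: str, body: Optional[str]) -> Optional[str]:
    """Return the Content-Length value to send, or None to send no header.

    Hypothetical stand-in for the check discussed in this thread.
    """
    if body is not None:
        length = len(body)  # assumption: len() in place of super_len()
        if length:
            return str(length)
        # Empty string: a body object exists, but no header is set at all.
        return None
    if method not in ("GET", "HEAD"):
        # No body on a method that may carry one: send an explicit zero.
        return "0"
    return None


for method in ("GET", "POST"):
    for body in (None, "", "a=1"):
        print(method, repr(body), "->", content_length_header(method, body))
```

In this model only the empty string ends up with a body object and no Content-Length, which is the combination that later gets treated as a chunked upload.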
|
Okay, I got it, thanks. PR is closed. |
|
@nateprewitt after some research I've discovered that the problem is not about And the most interesting thing - the same behavior is happening on
post('http://localhost:80', data={'foo': None})
gives: The raw request will be something like this:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("localhost", 80))
s.send(
    b'GET / HTTP/1.1\r\n'
    b'Host: localhost:80\r\n'
    b'User-Agent: python-requests/2.27.1\r\n'
    b'Accept-Encoding: gzip, deflate, br\r\n'
    b'Accept: */*\r\n'
    b'Connection: keep-alive\r\n'
    b'\r\n'
    b'0\r\n\r\n'  # this thing is added here https://github.com/psf/requests/blob/main/requests/adapters.py#L528
)
response = s.recv(4096)
print(response)
so |
|
Changed PR description and name |
|
Yep, this does appear to be a bug! We shouldn't be leaving this code branch without setting However, short-circuiting this by changing the I wrote up a quick test for tests/test_lowlevel.py to demonstrate what we're trying to fix.
@pytest.mark.parametrize(
    "method,include,exclude",
    (
        (requests.get, [], [b"Content-Length:", b"Transfer-Encoding:"]),
        (requests.post, [b"Content-Length: 0\r\n"], [b"Transfer-Encoding:"]),
    ),
)
def test_empty_urlencoded_form_body(method, include, exclude):
    """Ensure an empty urlencoded form body is sent without chunked framing."""
    close_server = threading.Event()
    server = Server(echo_response_handler, wait_to_close_event=close_server)
    with server as (host, port):
        url = f"http://{host}:{port}/"
        resp = method(url, data=(("a", None,),))
        close_server.set()  # release server block
    # The echoed request must not end with a stray chunk terminator.
    assert not resp.content.endswith(b'\r\n0\r\n\r\n')
    for header in include:
        assert header in resp.content
    for header in exclude:
        assert header not in resp.content
To fix it, I think the least invasive change would be updating this line to:
body = self._encode_params(data) or None
Let me know what you think about that, @romanyakovlev. I'm curious to hear from @sigmavirus24 and/or @sethmlarson on their thoughts. |
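To see why `or None` is enough, here is a rough sketch of the encoding step using only the standard library. `encode_form` is a hypothetical stand-in for `_encode_params`, assuming it drops `None` values before URL-encoding, which is the behavior described in this thread.

```python
from urllib.parse import urlencode


def encode_form(data):
    """Hypothetical stand-in for _encode_params: skip None values, then urlencode."""
    pairs = []
    for key, values in dict(data).items():
        # Treat scalars as one-element lists so both shapes share one loop.
        if isinstance(values, (str, bytes)) or not hasattr(values, "__iter__"):
            values = [values]
        pairs.extend((key, v) for v in values if v is not None)
    return urlencode(pairs, doseq=True)


empty = encode_form({"foo": None})
print(repr(empty))          # the empty string, not None
print(repr(empty or None))  # None, so the body is treated as absent
print(repr(encode_form({"foo": "bar"}) or None))
```

Because the empty string is falsy, `or None` only changes the all-empty case; any non-empty encoded body passes through unchanged.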
|
My input is "garbage in, garbage out". The input causing the behavior is garbage so I'm not worried about this |
|
This is definitely an edge case, but I'm hesitant to call it garbage because the interface allows arbitrary dictionaries as input. If you're constructing your input dynamically and end up with a value of None, Requests shouldn't start emitting nonsense message framing. Ideally, we either error out or make sure we know how to send the right pieces over the wire. This isn't scoped to None either; any empty iterable value will trigger this. I'm surprised this hasn't been raised before. |
e34885f to 4ebdc47
4ebdc47 to 37f376a
|
@nateprewitt I agree, your solution looks better. Pushed it with the test to the branch. |
If you're constructing it this way and you're not being careful then it is garbage input. None doesn't mean anything in this context. I've always opposed the support of None in the parameter but we can't remove it.
Again, garbage. If you're sending an empty iterable, that's garbage for us to try to do our best with, and there's no way to predict it. |
|
I'd like to work on this issue. Is anyone else currently working on it? |
When `data` encodes to an empty string (e.g. `data={'foo': None}`),
`prepare_body` set `body = ''` rather than `body = None`. This caused
`prepare_content_length` to skip setting `Content-Length: 0` (since
`super_len('')` is 0, which is falsy), and the adapter then treated the
request as chunked, sending a bare `0\r\n\r\n` terminator that servers
like nginx and Apache interpreted as a malformed second request.
The fix converts an empty encoded body to `None` so that
`prepare_content_length` correctly handles it: setting
`Content-Length: 0` for POST/PUT and omitting it for GET/HEAD.
Fixes psf#6122
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Daniel Bates <danielalanbates@gmail.com>
When data={'foo': None}, the body encodes to an empty string but
Content-Length was not set (because 0 is falsy). This caused the
adapter to fall back to Transfer-Encoding: chunked, sending a
terminating chunk that servers misinterpreted as a second request.
Fix: Change the Content-Length check from 'if length:' to
'if length is not None and Transfer-Encoding not in headers',
so that Content-Length: 0 is correctly set for empty bodies
while preserving chunked encoding for streams.
Fixes psf#6122
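The difference between the two checks in this commit message comes down to Python truthiness. A small sketch, with hypothetical function names and a plain dict standing in for the request headers:

```python
def old_check(length, headers):
    # 'if length:' treats 0 the same as "no length known".
    return bool(length)


def new_check(length, headers):
    # Explicit None test keeps 0, but still defers to chunked streams.
    return length is not None and "Transfer-Encoding" not in headers


cases = [
    (0, {}),                                   # empty body
    (3, {}),                                   # ordinary body
    (None, {"Transfer-Encoding": "chunked"}),  # stream of unknown length
]
for length, headers in cases:
    print(length, headers, old_check(length, headers), new_check(length, headers))
```

Only the first case differs: the old check refuses to set a header for a zero-length body, while the new one sets `Content-Length: 0`.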
Case - request with data which consists of empty values only
Response in nginx:
So it sends a second request, which gets a bad status code.
Here https://github.com/psf/requests/blob/main/requests/models.py#L576 length will be 0, so there is no Content-Length: 0 header in the request.
The problem occurs here: https://github.com/psf/requests/blob/main/requests/adapters.py#L471. Because request.body is '' and 'Content-Length' not in request.headers, it counts as chunked=True.
Because of that it acts as if it had a Transfer-Encoding: chunked header, and here https://github.com/psf/requests/blob/main/requests/adapters.py#L523-L528 it sends nothing but low_conn.send(b'0\r\n\r\n').
I guess that's why it gets a bad request like this:
The same behavior happens on a POST request.
gives:
The raw request will be something like this:
so 0\r\n\r\n is the reason for
apache_1 | 172.21.0.1 - - [05/May/2022:23:05:44 +0000] "0" 400 226
This PR fixes the problem. Tests for this case are added.
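A minimal sketch of the framing problem described above, using only the standard library. `is_chunked` mirrors the condition from adapters.py as it is described in this thread, and `frame_chunked` is a hypothetical helper that applies HTTP/1.1 chunked framing; neither is the library's own code.

```python
def is_chunked(body, headers):
    """Chunked when there is a body object and no Content-Length header."""
    return not (body is None or "Content-Length" in headers)


def frame_chunked(body: bytes) -> bytes:
    """Frame a body as HTTP/1.1 chunks, ending with the zero-length chunk."""
    framed = b""
    if body:
        # Chunk size in hex, CRLF, chunk data, CRLF.
        framed += f"{len(body):x}".encode("ascii") + b"\r\n" + body + b"\r\n"
    return framed + b"0\r\n\r\n"


headers = {}  # no Content-Length and no Transfer-Encoding, as in the bug
print(is_chunked("", headers))    # empty string still counts as a body
print(is_chunked(None, headers))  # None does not
print(frame_chunked(b""))         # only the terminator goes on the wire
```

Since the request headers never announced `Transfer-Encoding: chunked`, the server reads those trailing bytes as the start of a new request line, which matches the `"0" 400` entry in the Apache log.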