Fixed content-handler response-headers/object-storage collision. #4251
| # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/get_object.html | ||
| # response-headers S3 respects, and what they map to in an S3 object | ||
| _S3_RESPONSE_HEADER_MAP = { | ||
| "Content-Disposition": "ResponseContentDisposition", | ||
| "Content-Type": "ResponseContentType", | ||
| "Cache-Control": "ResponseCacheControl", | ||
| "Content-Language": "ResponseContentLanguage", | ||
| "Expires": "ResponseExpires", | ||
| "Content-Encoding": "ResponseContentEncoding", | ||
| } | ||
| # https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.contentsettings?view=azure-python | ||
| # response-headers azure respects, and what they map to in an azure object | ||
| _AZURE_RESPONSE_HEADER_MAP = { | ||
| "Content-Disposition": "content_disposition", | ||
| "Content-Type": "content_type", | ||
| "Cache-Control": "cache_control", | ||
| "Content-Language": "content_language", | ||
| "Content-Encoding": "content_encoding", | ||
| } | ||
| # https://gcloud.readthedocs.io/en/latest/storage-blobs.html | ||
| # response-headers Google Cloud Storage respects, and what they map to in a GCS object | ||
| _GCS_RESPONSE_HEADER_MAP = { | ||
| "Content-Disposition": "content_disposition", | ||
| "Content-Type": "content_type", | ||
| "Cache-Control": "cache_control", | ||
| "Content-Language": "content_language", | ||
| "Content-Encoding": "content_encoding", | ||
| } |
It might be better to store these maps in the constants.py file. Also, is it worth caching the return value of _set_params_from_headers (the function would return the parameters variable instead of overwriting it internally)?
These both sound like fine ideas, will do!
| The :class:`aiohttp.web.FileResponse` for the file. | ||
| """ | ||
|
|
||
| def _set_params_from_headers(hdrs, storage_domain): |
When we call this function, we have already identified the storage class, so I think we can spare ourselves one `if` and one dict lookup by passing the response map instead of the storage class name.
    def _set_params_from_headers(headers, response_header_map):
        params = {}
        for key, new_key in response_header_map.items():
            if key in headers:
                params[new_key] = headers[key]
        return params
or even:
    def _set_params_from_headers(headers, response_header_map):
        return {new_key: headers[key] for key, new_key in response_header_map.items() if key in headers}
We'd call it like this, e.g. in the S3 branch:
    _set_params_from_headers(headers, S3_RESPONSE_HEADER_MAP)
Passing the mapping-dictionary "assumes" that a simple lookup is all we're ever going to want to do. If we want/need to do anything more complicated in the future, the current approach lets us do so in a single place.
In terms of performance - given the current state of the content-handler, sparing "one dictionary lookup" is not convincing.
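For readers comparing the two designs, here is a minimal sketch of the approach being defended, where the function receives the storage class name and does the lookup itself. The `STORAGE_RESPONSE_MAP` name and the `storage_domain` parameter come from this conversation; the storage class paths used as keys and the public function name are assumptions for illustration, not the merged code.

```python
# Header maps copied from the diff above (trimmed for brevity).
S3_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "ResponseContentDisposition",
    "Content-Type": "ResponseContentType",
}
AZURE_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "content_disposition",
    "Content-Type": "content_type",
}

# Assumption: the keys are illustrative storage class paths, not
# necessarily the values the project uses.
STORAGE_RESPONSE_MAP = {
    "storages.backends.s3boto3.S3Boto3Storage": S3_RESPONSE_HEADER_MAP,
    "storages.backends.azure_storage.AzureStorage": AZURE_RESPONSE_HEADER_MAP,
}


def set_params_from_headers(hdrs, storage_domain):
    """Translate response headers into backend-specific parameters."""
    params = {}
    # All backend-specific handling lives behind this single check, so
    # anything more complicated than a lookup can later be added here.
    if storage_domain in STORAGE_RESPONSE_MAP:
        header_map = STORAGE_RESPONSE_MAP[storage_domain]
        for header, param in header_map.items():
            if header in hdrs:
                params[param] = hdrs[header]
    return params
```

Headers that the backend does not recognise are dropped, and an unknown storage class yields an empty dict rather than an error.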
I was thinking about my performance in understanding the code. But it's not going to block this contribution.
lubosmj left a comment
I like the change!
Optionally, we could remove the if storage_domain in STORAGE_RESPONSE_MAP condition and convert it to a try/except block for the for loop (https://www.oreilly.com/library/view/python-essentials/9781784390341/ch09s08.html).
I've heard that argument before, but personally feel exceptions should be for exceptional issues, not a main-flow "is this one of the things we know how to deal with?" path. Speed doesn't impress me either: no matter how fast Python is at exceptions, building one, executing the jump, and then processing it isn't as fast as a single key lookup into a (tiny) dictionary. "Try it and see" makes more sense the more complicated the possibilities are. (Or it makes sense until you start fielding "OMG FIX THIS ERROR" issues from users seeing database-logging saying "you can't do that" for things you handle via "ask forgiveness" processing...)
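To make the two styles under discussion concrete, the sketch below puts the membership check next to the try/except alternative. Both function names and the single-entry map are hypothetical; only `STORAGE_RESPONSE_MAP` and `storage_domain` appear in the conversation.

```python
# Assumption: a single-entry map is enough to show the control flow.
STORAGE_RESPONSE_MAP = {
    "s3": {"Content-Type": "ResponseContentType"},
}


def params_with_check(hdrs, storage_domain):
    """Look before you leap: test membership, then look up."""
    params = {}
    if storage_domain in STORAGE_RESPONSE_MAP:
        for header, param in STORAGE_RESPONSE_MAP[storage_domain].items():
            if header in hdrs:
                params[param] = hdrs[header]
    return params


def params_with_try(hdrs, storage_domain):
    """Ask forgiveness: attempt the lookup and handle the KeyError."""
    params = {}
    try:
        header_map = STORAGE_RESPONSE_MAP[storage_domain]
    except KeyError:
        # An unknown storage class is treated as "nothing to translate".
        return params
    for header, param in header_map.items():
        if header in hdrs:
            params[param] = hdrs[header]
    return params
```

The two functions return the same result for every input; the difference is only whether an unknown storage class is handled as ordinary control flow or as an exception.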
Backport to 3.28: 💚 backport PR created ✅ Backport PR branch: Backported as #4265 🤖 @patchback
Backport to 3.29: 💚 backport PR created ✅ Backport PR branch: Backported as #4266 🤖 @patchback
When GCP support was originally added, the parameters were `response_disposition` and `content_type`. When it was refactored in #4251, the headers for GCS were mistakenly changed and expanded. These parameters are used by the `generate_signed_url` method defined in https://github.com/googleapis/python-storage/blob/a8109e0/google/cloud/storage/blob.py#L463, which is reached through https://github.com/jschneier/django-storages/blob/758ad6f/storages/backends/gcloud.py#L350. This PR ensures that only the supported parameters are passed to that function call.
closes #6917
Signed-off-by: Balasankar 'Balu' C <balu@dravidam.net>
(cherry picked from commit fc99a60)
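As a rough illustration of what the follow-up fix describes, the sketch below restricts the GCS map to the two parameter names the commit message mentions. Which header feeds which parameter is an assumption here, as are the map and function names; the actual patch may differ.

```python
# Assumption: the header-to-parameter pairing is inferred from the commit
# message, which names only `response_disposition` and `content_type`.
GCS_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "response_disposition",
    "Content-Type": "content_type",
}


def gcs_params_from_headers(hdrs):
    """Keep only the headers that have a supported GCS parameter."""
    return {
        param: hdrs[header]
        for header, param in GCS_RESPONSE_HEADER_MAP.items()
        if header in hdrs
    }


if __name__ == "__main__":
    incoming = {
        "Content-Disposition": "attachment; filename=pkg.rpm",
        "Content-Type": "application/x-rpm",
        "Cache-Control": "no-cache",
    }
    # Cache-Control has no entry in the map, so it is not forwarded.
    print(gcs_params_from_headers(incoming))
```

Filtering through an explicit map means a header without a supported counterpart is silently left out instead of being passed on as an unexpected keyword argument.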
fixes #4028.