Fixed content-handler response-headers/object-storage collision. #4251

Merged
ggainey merged 1 commit into pulp:main from ggainey:4028_content_headers on Aug 9, 2023
@ggainey ggainey commented Aug 3, 2023

Contributor

fixes #4028.

Comment thread pulpcore/content/handler.py Outdated
Comment on lines +897 to +924
# https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3/client/get_object.html
# response-headers S3 respects, and what they map to in an S3 object
_S3_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "ResponseContentDisposition",
    "Content-Type": "ResponseContentType",
    "Cache-Control": "ResponseCacheControl",
    "Content-Language": "ResponseContentLanguage",
    "Expires": "ResponseExpires",
    "Content-Encoding": "ResponseContentEncoding",
}
# https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.contentsettings?view=azure-python
# response-headers azure respects, and what they map to in an azure object
_AZURE_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "content_disposition",
    "Content-Type": "content_type",
    "Cache-Control": "cache_control",
    "Content-Language": "content_language",
    "Content-Encoding": "content_encoding",
}
# https://gcloud.readthedocs.io/en/latest/storage-blobs.html
# response-headers Google Cloud Storage respects, and what they map to in a GCS object
_GCS_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "content_disposition",
    "Content-Type": "content_type",
    "Cache-Control": "cache_control",
    "Content-Language": "content_language",
    "Content-Encoding": "content_encoding",
}

Contributor

These maps might be better stored in the constants.py file. Also, is it worth capturing the return value of _set_params_from_headers (that is, having the function return the parameters variable instead of overwriting it inside the function)?

Contributor Author

These both sound like fine ideas, will do!

@ggainey ggainey closed this Aug 4, 2023
@ggainey ggainey reopened this Aug 4, 2023
@ggainey ggainey force-pushed the 4028_content_headers branch from 0debecc to e175c1f Compare August 4, 2023 20:06
@ggainey ggainey force-pushed the 4028_content_headers branch from e175c1f to cbc799f Compare August 5, 2023 23:17
The :class:`aiohttp.web.FileResponse` for the file.
"""

def _set_params_from_headers(hdrs, storage_domain):

Member

When we call this function, we have already identified the storage class, so I think we can spare ourselves one if and one dict lookup by passing the response map instead of the storage class name.

def _set_params_from_headers(headers, response_header_map):
    params = {}
    for key, new_key in response_header_map.items():
        if key in headers:
            params[new_key] = headers[key]
    return params

or even:

def _set_params_from_headers(headers, response_header_map):
    return {new_key: headers[key] for key, new_key in response_header_map.items() if key in headers}

We'd call it by, e.g. in the s3 branch:

_set_params_from_headers(headers, S3_RESPONSE_HEADER_MAP)

Contributor Author

Passing the mapping-dictionary "assumes" that a simple lookup is all we're ever going to want to do. If we want/need to do anything more complicated in the future, the current approach lets us do so in a single place.

In terms of performance - given the current state of the content-handler, sparing "one dictionary lookup" is not convincing.
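The "single place" argument can be sketched roughly as follows. This is not the merged implementation: the backend keys (`"s3"`, `"azure"`) and the trimmed maps are placeholders chosen for this example, and only the function signature and the `STORAGE_RESPONSE_MAP` name come from the discussion.

```python
# Placeholder backend keys and trimmed maps, assumed for this sketch only.
STORAGE_RESPONSE_MAP = {
    "s3": {"Content-Type": "ResponseContentType", "Expires": "ResponseExpires"},
    "azure": {"Content-Type": "content_type"},
}


def _set_params_from_headers(hdrs, storage_domain):
    """Build backend parameters from response headers, in one place."""
    params = {}
    if storage_domain in STORAGE_RESPONSE_MAP:
        for key, new_key in STORAGE_RESPONSE_MAP[storage_domain].items():
            if key in hdrs:
                # Anything beyond a plain rename (for example, converting a
                # value for one backend) could be added here later without
                # touching any caller.
                params[new_key] = hdrs[key]
    return params


hdrs = {"Content-Type": "text/html", "Expires": "Wed, 09 Aug 2023 00:00:00 GMT"}
s3_params = _set_params_from_headers(hdrs, "s3")
azure_params = _set_params_from_headers(hdrs, "azure")
other_params = _set_params_from_headers(hdrs, "filesystem")  # unknown backend
```

Callers pass only the backend name, so the decision about how each backend's headers are translated stays inside the helper.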

Member

I was thinking about my performance in understanding the code. But it's not going to block this contribution.

@ggainey ggainey requested review from lubosmj and mdellweg August 8, 2023 17:36

@lubosmj lubosmj left a comment

Contributor

I like the change!

Optionally, we could remove the if storage_domain in STORAGE_RESPONSE_MAP condition and convert it to a try/except block for the for loop (https://www.oreilly.com/library/view/python-essentials/9781784390341/ch09s08.html).
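To make the two styles concrete, here is a small sketch that puts the membership check next to a try/except variant. The map contents and the `"s3"` key are placeholders, the function names are hypothetical, and the try block is deliberately narrowed to the map lookup so that a `KeyError` can only mean "unknown backend".

```python
# Hypothetical, trimmed-down map; the "s3" key is a placeholder.
STORAGE_RESPONSE_MAP = {
    "s3": {
        "Content-Disposition": "ResponseContentDisposition",
        "Content-Type": "ResponseContentType",
    },
}


def params_with_check(hdrs, storage_domain):
    """Look before you leap: test membership first."""
    params = {}
    if storage_domain in STORAGE_RESPONSE_MAP:
        for key, new_key in STORAGE_RESPONSE_MAP[storage_domain].items():
            if key in hdrs:
                params[new_key] = hdrs[key]
    return params


def params_with_try(hdrs, storage_domain):
    """Ask forgiveness: attempt the lookup, treat KeyError as 'unknown backend'."""
    try:
        header_map = STORAGE_RESPONSE_MAP[storage_domain]
    except KeyError:
        return {}
    return {new_key: hdrs[key] for key, new_key in header_map.items() if key in hdrs}


hdrs = {"Content-Type": "text/plain"}
known = params_with_try(hdrs, "s3")
unknown = params_with_try(hdrs, "filesystem")
```

Both functions return the same result for known and unknown backends; the difference is only in how the "unknown backend" case is expressed.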

ggainey commented Aug 9, 2023

Contributor Author

> I like the change!
>
> Optionally, we could remove the if storage_domain in STORAGE_RESPONSE_MAP condition and convert it to a try/except block for the for loop (https://www.oreilly.com/library/view/python-essentials/9781784390341/ch09s08.html).

I've heard that argument before, but personally feel exceptions should be for exceptional issues, not a main-flow "is this one of the things we know how to deal with?" path. Speed doesn't impress me either: no matter how fast Python is at exceptions, building one, executing the jump, and then processing it isn't as fast as a "single key lookup into a (tiny) dictionary".

"try it and see" makes more sense the more complicated the possibilities are.

(Or it makes sense until you start fielding "OMG FIX THIS ERROR" issues from users seeing database-logging saying "you can't do that" for things you handle via "ask forgiveness" processing...)
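Anyone who wants to check the speed claim on their own interpreter could use a micro-benchmark along these lines. It is only a sketch: the dictionary contents and function names are placeholders, it times just the "unknown key" path, and the numbers will vary with the Python version and machine.

```python
import timeit

# Tiny placeholder dictionary standing in for the set of known backends.
KNOWN_BACKENDS = {"s3": 1, "azure": 2, "gcs": 3}


def with_check(name):
    """Membership test first, then the lookup."""
    if name in KNOWN_BACKENDS:
        return KNOWN_BACKENDS[name]
    return None


def with_try(name):
    """Attempt the lookup and handle the KeyError on a miss."""
    try:
        return KNOWN_BACKENDS[name]
    except KeyError:
        return None


runs = 100_000
check_miss = timeit.timeit(lambda: with_check("filesystem"), number=runs)
try_miss = timeit.timeit(lambda: with_try("filesystem"), number=runs)
print(f"membership check, miss: {check_miss:.4f}s")
print(f"try/except, miss:       {try_miss:.4f}s")
```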

@ggainey ggainey merged commit 4dd2edf into pulp:main Aug 9, 2023
patchback Bot commented Aug 9, 2023

Backport to 3.28: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.28/4dd2edf6edaf2cdc8a0cb2881e00f6c0bc9007b9/pr-4251

Backported as #4265

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback Bot commented Aug 9, 2023

Backport to 3.29: 💚 backport PR created

✅ Backport PR branch: patchback/backports/3.29/4dd2edf6edaf2cdc8a0cb2881e00f6c0bc9007b9/pr-4251

Backported as #4266

@ggainey ggainey deleted the 4028_content_headers branch February 2, 2024 19:56
balasankarc added a commit to balasankarc/pulpcore that referenced this pull request Sep 13, 2025
When GCP support was originally added, the parameters were
`response_disposition` and `content_type`. When it was refactored in
pulp#4251, the headers for GCS were
mistakenly changed and expanded.

These parameters are used by the `generate_signed_url` method defined in
https://github.com/googleapis/python-storage/blob/a8109e0/google/cloud/storage/blob.py#L463,
through
https://github.com/jschneier/django-storages/blob/758ad6f/storages/backends/gcloud.py#L350

This PR ensures only the supported parameters are passed to that
function call.

Signed-off-by: Balasankar 'Balu' C <balu@dravidam.net>
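As a rough illustration of what "only the supported parameters" could look like, the sketch below restricts the GCS map to the two parameter names quoted in the commit message. The pairing of headers to those names and the helper name `gcs_params_from_headers` are assumptions made for this example; this is not the actual patch.

```python
# Assumption: the two parameter names come from the commit message above; the
# header-to-parameter pairing is this sketch's guess, not the merged change.
GCS_RESPONSE_HEADER_MAP = {
    "Content-Disposition": "response_disposition",
    "Content-Type": "content_type",
}


def gcs_params_from_headers(headers):  # hypothetical helper name
    """Keep only the headers that map to a supported GCS parameter."""
    return {
        param: headers[name]
        for name, param in GCS_RESPONSE_HEADER_MAP.items()
        if name in headers
    }


headers = {
    "Content-Disposition": "attachment; filename=pkg.rpm",
    "Content-Type": "application/x-rpm",
    "Cache-Control": "no-cache",  # no longer forwarded in this sketch
}
params = gcs_params_from_headers(headers)
```

With a map like this, headers such as `Cache-Control` would simply be dropped rather than passed on as unexpected keyword arguments.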
balasankarc added a commit to balasankarc/pulpcore that referenced this pull request Sep 13, 2025
When GCP support was originally added, the parameters were
`response_disposition` and `content_type`. When it was refactored in
pulp#4251, the headers for GCS were
mistakenly changed and expanded.

These parameters are used by the `generate_signed_url` method defined in
https://github.com/googleapis/python-storage/blob/a8109e0/google/cloud/storage/blob.py#L463,
through
https://github.com/jschneier/django-storages/blob/758ad6f/storages/backends/gcloud.py#L350

This PR ensures only the supported parameters are passed to that
function call.

closes pulp#6917

Signed-off-by: Balasankar 'Balu' C <balu@dravidam.net>
balasankarc added a commit to balasankarc/pulpcore that referenced this pull request Sep 17, 2025
ggainey pushed a commit that referenced this pull request Sep 17, 2025
patchback Bot pushed a commit that referenced this pull request Sep 18, 2025
(cherry picked from commit fc99a60)
ggainey pushed a commit that referenced this pull request Sep 18, 2025
(cherry picked from commit fc99a60)
Development

Successfully merging this pull request may close these issues.

Content-app doesn't respect response-headers specified by plugins

3 participants