Skip to content

[test](regression) drop unsupported use_path_style=true in test_s3_tvf#64562

Merged
sollhui merged 2 commits into
apache:masterfrom
0AyanamiRei:fix/test-s3-tvf-drop-use-path-style
Jun 23, 2026
Merged

[test](regression) drop unsupported use_path_style=true in test_s3_tvf#64562
sollhui merged 2 commits into
apache:masterfrom
0AyanamiRei:fix/test-s3-tvf-drop-use-path-style

Conversation

@0AyanamiRei

@0AyanamiRei 0AyanamiRei commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: None

Related PR: None

Problem Summary:

regression-test/suites/load_p2/tvf/test_s3_tvf.groovy configured several S3 TVF attributes with both a virtual-host style URI and use_path_style=true:

uri = "s3://${bucket}.${endpoint}/..."
use_path_style = "true"

These two settings conflict. Aliyun OSS rejects path-style access for this bucket with HTTP 403 SecondLevelDomainForbidden and the message Please use virtual hosted style to access. The regression case could pass on the previous endpoint, but failed after the P2 environment switched to the Aliyun internal Beijing endpoint where OSS enforces virtual-host style access.

This PR fixes the regression case by removing the six active addProperty("use_path_style", "true") settings whose URI is already in virtual-host form, so the SDK sends requests in the addressing style required by OSS. The remaining S3 TVF attributes in this file do not set use_path_style and keep their previous behavior.

This PR also improves the failure path in the test. Previously, assertTrue(attribute.expectFiled) threw immediately and the later logger.info("error: ", ex) line was skipped, so the failure only showed a bare assertion line. The exception is now logged before the assertion, and the assertion message includes the loop index, table name, property map, and original error message.

Manual verification used the same OSS bucket, key prefix, and endpoint while only changing the addressing style:

Path-style request:
https://oss-cn-beijing.aliyuncs.com/doris-regression-bj/?prefix=regression/load/data/kd16=abcdefg/&max-keys=2
Result: HTTP 403 SecondLevelDomainForbidden

Virtual-host request:
https://doris-regression-bj.oss-cn-beijing.aliyuncs.com/?prefix=regression/load/data/kd16=abcdefg/&max-keys=2
Result: HTTP 200 OK

The same behavior was reproduced through Doris S3 TVF: the query fails with use_path_style=true and succeeds after the property is removed.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
      • Verified Aliyun OSS rejects the path-style URL with HTTP 403 SecondLevelDomainForbidden and accepts the virtual-host URL with HTTP 200 OK.
      • Verified Doris S3 TVF fails with use_path_style=true and succeeds after removing the property.
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes. The regression test now uses virtual-host style access for Aliyun OSS cases whose URI is already virtual-host style. Doris product behavior is unchanged.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

The TVF attributes set both
    uri = "s3://${bucket}.${endpoint}/..."   (virtual-host form)
and
    use_path_style = "true"                   (path-style request)
which contradict each other. Aliyun OSS S3-compatible API rejects
path-style addressing with HTTP 403 SecondLevelDomainForbidden /
"Please use virtual hosted style to access". The combination only
worked accidentally on the historical aliyun (HK) endpoint and broke
once the regression switched to aliyun-internal (BJ) where OSS
enforces the rule strictly.

Drop the six active addProperty("use_path_style", "true") lines so
the SDK uses virtual-host addressing that matches the URI form.
Reproduced end-to-end with curl against the public OSS endpoint and
with a Doris S3 TVF: same SQL passes when the property is absent
and fails with SecondLevelDomainForbidden when it is set.

Also surface the underlying error in the catch block so future
failures show the real exception instead of a bare assertTrue line
number. Move logger.info("error: ", ex) before assertTrue, since
assertTrue(false) throws AssertionFailedError and skipped the log
line entirely.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@0AyanamiRei

Copy link
Copy Markdown
Contributor Author

run buildall

@0AyanamiRei

Copy link
Copy Markdown
Contributor Author

run buildall

@sollhui

sollhui commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review completed. I am requesting changes for one regression-test correctness issue: the touched catch block improves the message for unexpected exceptions, but still does not fail when an expected-error TVF insert unexpectedly succeeds.

Critical checkpoint conclusions: the PR scope is narrow and test-only; no product concurrency, lifecycle, transaction, persistence, or FE/BE protocol compatibility change is involved. The S3/OSS addressing change follows the existing OSS property path: Aliyun endpoints select OSSProperties and the bucket.endpoint URI form is normalized before listing/scanning. The remaining issue is test coverage semantics for negative cases, covered inline. No additional user focus was provided.

Subagent conclusions: optimizer-rewrite reported no candidates. tests-session-config proposed TSC-001, accepted as MAIN-001 and submitted inline. The suspected bucket-authority issue and assertion-overload issue were dismissed with code evidence in the ledger. Convergence round 1 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.

Comment thread regression-test/suites/load_p2/tvf/test_s3_tvf.groovy

@sollhui sollhui left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 23, 2026
@github-actions

Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions

Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

@sollhui sollhui merged commit c0c5a8a into apache:master Jun 23, 2026
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants