[8.6.0] Refactor TLS error handling (Backport fix for #28459)#28461

Closed
Ashutosh0x wants to merge 2 commits into bazelbuild:release-8.6.0 from Ashutosh0x:fix-pr-28459-tls-handling

@Ashutosh0x
Contributor

This PR provides a more robust fix for the TLS error handling logic originally proposed in #28459. It addresses the review feedback by inspecting the exception cause chain for certificate errors instead of matching on exception message strings.

Supersedes #28459.

@Ashutosh0x Ashutosh0x requested a review from a team as a code owner January 28, 2026 08:22
@github-actions github-actions bot added team-Performance Issues for Performance teams team-Configurability platforms, toolchains, cquery, select(), config transitions team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. team-Rules-Java Issues for Java rules team-Rules-CPP Issues for C++ rules team-Rules-Python Native rules for Python team-Local-Exec Issues and PRs for the Execution (Local) team team-Remote-Exec Issues and PRs for the Execution (Remote) team team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website team-Documentation Documentation improvements that cannot be directly linked to other team labels team-Rules-ObjC Issues for Objective-C maintainers awaiting-review PR is awaiting review from an assigned reviewer team-CLI Console UI labels Jan 28, 2026
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a major update, bumping the Bazel version to 9.0.0 and overhauling CI configurations, dependencies, and build settings. It also introduces a substantial amount of new documentation and tooling. My review identifies a few high-severity concerns in the new configurations that could impact CI stability and build correctness. These include potentially incorrect test scoping in CI, improper flag application in .bazelrc, and a possible version mismatch in a dependency patch. None of the comments were dropped or modified based on the provided rules, as they did not directly apply to the specific issues raised.

@Ashutosh0x Ashutosh0x force-pushed the fix-pr-28459-tls-handling branch 3 times, most recently from 03c9588 to 1d8bce1 Compare January 28, 2026 09:11
while (t != null) {
if (t instanceof CertificateException || t instanceof CertPathValidatorException) {
String message = "TLS error: " + e.getMessage();
eventHandler.handle(Event.progress(message));
Collaborator

It doesn't look right to report one event per cause. Could we report just the outer exception's message?

@Ashutosh0x Ashutosh0x force-pushed the fix-pr-28459-tls-handling branch from 1d8bce1 to 7ac214c Compare January 28, 2026 10:06
@Ashutosh0x
Contributor Author

Addressed @fmeum's review feedback:

Changes made:

  • Refactored the SSL exception handling to first inspect the entire cause chain using a for-loop
  • Set a boolean flag when a certificate error is found
  • Report the TLS error event only once after the loop completes
  • Use the outer SSLException's message as requested

The code now clearly shows that:

  1. We check the cause chain first (for-loop)
  2. We report a single event with the outer exception's message
  3. We throw the unrecoverable exception

This addresses the concern about potentially reporting one event per cause.
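
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from HttpConnector.java: the class name CauseChainSketch and the methods hasCertificateCause and describe are hypothetical, and the real change reports the message through Bazel's event handler rather than returning a string.

```java
import java.security.cert.CertPathValidatorException;
import java.security.cert.CertificateException;
import javax.net.ssl.SSLException;

/** Hypothetical helper for illustration; not part of Bazel. */
final class CauseChainSketch {
  private CauseChainSketch() {}

  /** Returns true if the exception or any of its causes is a certificate error. */
  static boolean hasCertificateCause(Throwable e) {
    boolean found = false;
    // Walk the whole cause chain first; nothing is reported inside the loop.
    for (Throwable t = e; t != null; t = t.getCause()) {
      if (t instanceof CertificateException || t instanceof CertPathValidatorException) {
        found = true;
        break;
      }
    }
    return found;
  }

  /** Builds the single message to report, using only the outer exception's message. */
  static String describe(SSLException e) {
    return "TLS error: " + e.getMessage();
  }
}
```

A caller would check hasCertificateCause(e) once, report describe(e) a single time if it returns true, and then throw the unrecoverable exception.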

@Ashutosh0x
Contributor Author

Regarding the gemini-code-assist bot review comments:

The bot appears to have analyzed files that are not part of this PR diff. This PR only modifies HttpConnector.java to fix TLS error handling.

To clarify:

  • This PR does NOT change MODULE.bazel, .bazelrc, or .bazelci/postsubmit.yml
  • The rules_jvm_external patch concern is not applicable - the codebase correctly uses version 6.0 with rules_jvm_external_6.0.patch

The bot's comments about version mismatches (6.5/6.6) appear to be hallucinated.

@gemini-code-assist

Thank you for bringing this to my attention, @Ashutosh0x. I apologize for the incorrect review comments regarding files outside the scope of this pull request. My analysis mistakenly included files that were not part of the diff, leading to irrelevant suggestions.

I understand that this PR specifically modifies src/main/java/com/google/devtools/build/lib/bazel/repository/downloader/HttpConnector.java to improve TLS error handling. I will ensure that my future reviews are strictly limited to the changes within the provided diffs and relevant context.

Your feedback is valuable and helps me improve. I am an AI and sometimes make mistakes, and I appreciate you pointing out these inaccuracies.

@Ashutosh0x
Contributor Author

Thank you for bringing this to my attention, @Ashutosh0x. I apologize for the incorrect review comments regarding files outside the scope of this pull request. My analysis mistakenly included files that were not part of the diff, leading to irrelevant suggestions.

I understand that this PR specifically modifies src/main/java/com/google/devtools/build/lib/bazel/repository/downloader/HttpConnector.java to improve TLS error handling. I will ensure that my future reviews are strictly limited to the changes within the provided diffs and relevant context.

Your feedback is valuable and helps me improve. I am an AI and sometimes make mistakes, and I appreciate you pointing out these inaccuracies.

@fmeum AI can make mistakes 😂

@Ashutosh0x
Contributor Author

I've refactored the TLS error handling to inspect the exception cause chain for certificate errors, so that we only fail over when it's appropriate. I also fixed the missing SSLException import and removed the accidentally committed merge conflict markers. The CI should be much cleaner now. @fmeum, could you take another look?

@meteorcloudy meteorcloudy removed team-Performance Issues for Performance teams team-Configurability platforms, toolchains, cquery, select(), config transitions team-Rules-Java Issues for Java rules team-Rules-CPP Issues for C++ rules team-Rules-Python Native rules for Python labels Jan 28, 2026
@meteorcloudy meteorcloudy removed team-Local-Exec Issues and PRs for the Execution (Local) team team-Remote-Exec Issues and PRs for the Execution (Remote) team team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website team-Rules-ObjC Issues for Objective-C maintainers team-CLI Console UI labels Jan 28, 2026
@meteorcloudy meteorcloudy changed the title Refactor TLS error handling (Backport fix for #28459) [8.6.0] Refactor TLS error handling (Backport fix for #28459) Jan 28, 2026
@meteorcloudy
Member

meteorcloudy commented Jan 28, 2026

@Ashutosh0x Can you explain the difference between this PR and #28459?
Two suggestions:

  • Keep the PR description clear about which change you are backporting and what extra changes you made, if any.
  • It looks like the test is also missing.

Address @fmeum's review feedback: refactored the SSL exception handling
to first check the entire cause chain for certificate errors using a for
loop, then report a single event only if a certificate error is found.
This ensures we report just the outer exception's message once, rather
than appearing to report one event per cause.

Changes:
- Use a for-loop instead of while to inspect the exception cause chain
- Set a boolean flag when a certificate error is found
- Report the TLS error event only once after the loop completes
- Use the outer SSLException's message as requested
@Ashutosh0x Ashutosh0x force-pushed the fix-pr-28459-tls-handling branch from 7ac214c to e932303 Compare January 28, 2026 14:43
@Ashutosh0x
Contributor Author

@meteorcloudy Thank you for the review and suggestions!

Difference from #28459:
This PR backports the fix from #28459 (now merged to master) to the release-8.6.0 branch. The core logic is the same:

  • Inspects the exception cause chain for CertificateException or CertPathValidatorException
  • Reports a single TLS error event (addressing @fmeum's feedback)
  • Throws UnrecoverableHttpException to prevent retries on permanent certificate errors

Tests Added:
I've now added unit tests to HttpConnectorTest.java covering:

  1. ssLException_withCertificateCause_throwsUnrecoverableHttpException - verifies immediate failover on CertificateException
  2. ssLException_withCertPathValidatorCause_throwsUnrecoverableHttpException - verifies immediate failover on CertPathValidatorException
  3. ssLException_withoutCertificateCause_retries - ensures transient SSL errors (e.g., 'Connection reset') still trigger retries

All 36 HttpConnectorTest tests pass locally.
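
The behavior these tests target can be sketched as below. This is an assumption-laden illustration rather than Bazel's implementation: RetrySketch, Attempt, connectWithRetries, and UnrecoverableException (a stand-in for UnrecoverableHttpException) are all hypothetical names, and the real connector also handles backoff and timeouts that are omitted here.

```java
import java.io.IOException;
import java.security.cert.CertPathValidatorException;
import java.security.cert.CertificateException;
import javax.net.ssl.SSLException;

/** Hypothetical retry loop for illustration; not part of Bazel. */
final class RetrySketch {
  private RetrySketch() {}

  /** Hypothetical stand-in for UnrecoverableHttpException. */
  static final class UnrecoverableException extends IOException {
    UnrecoverableException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** One connection attempt; hypothetical interface for this sketch. */
  interface Attempt {
    String connect() throws IOException;
  }

  static boolean hasCertificateCause(Throwable e) {
    for (Throwable t = e; t != null; t = t.getCause()) {
      if (t instanceof CertificateException || t instanceof CertPathValidatorException) {
        return true;
      }
    }
    return false;
  }

  /** Retries transient failures, but gives up at once on certificate errors. */
  static String connectWithRetries(Attempt attempt, int maxAttempts) throws IOException {
    IOException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        return attempt.connect();
      } catch (SSLException e) {
        if (hasCertificateCause(e)) {
          // Permanent error: retrying the same URL cannot succeed.
          throw new UnrecoverableException("TLS error: " + e.getMessage(), e);
        }
        last = e; // Transient SSL error such as "Connection reset": try again.
      } catch (IOException e) {
        last = e;
      }
    }
    throw last != null ? last : new IOException("no attempts made");
  }
}
```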

Ready for another review!

@iancha1992
Member

iancha1992 commented Jan 28, 2026

@Ashutosh0x Don't we also need to make changes for
src/test/java/com/google/devtools/build/lib/bazel/repository/downloader/HttpDownloaderTest.java? Like in https://github.com/bazelbuild/bazel/pull/28459/changes

This commit adds the missing test case from PR bazelbuild#28459 as requested by
@iancha1992. The changes include:
- Updated imports to use FileSystemUtils instead of DataInputStream
- Replaced the readFile() method with FileSystemUtils.readContent()
- Added downloadFrom2UrlsFirstTlsErrorSecondOk() test to verify TLS
  error failover behavior

These changes ensure that the test coverage for TLS error handling is
complete and consistent with the main PR bazelbuild#28459.
@Ashutosh0x
Contributor Author

@iancha1992 Thank you for the review. I've added the missing changes to HttpDownloaderTest.java from PR #28459.

Changes made in commit 34ad9b1:

  1. Updated imports to use FileSystemUtils instead of DataInputStream
  2. Replaced the local readFile() method with FileSystemUtils.readContent()
  3. Added new test downloadFrom2UrlsFirstTlsErrorSecondOk() to verify TLS error failover behavior

This test validates that when the first URL encounters a TLS/SSL error, the downloader correctly fails over to the second URL and completes the download successfully.
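
As a rough sketch of the failover behavior described above, assuming a hypothetical Fetcher interface and a hypothetical downloadFirstAvailable method (neither is Bazel's HttpDownloader API), the idea is to move on to the next URL when one fails and to surface all failures only if every URL fails:

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical multi-URL failover for illustration; not part of Bazel. */
final class FailoverSketch {
  private FailoverSketch() {}

  /** Fetches the content of one URL; hypothetical interface for this sketch. */
  interface Fetcher {
    byte[] fetch(String url) throws IOException;
  }

  /** Returns the content of the first URL that can be fetched successfully. */
  static byte[] downloadFirstAvailable(List<String> urls, Fetcher fetcher) throws IOException {
    IOException combined = null;
    for (String url : urls) {
      try {
        return fetcher.fetch(url);
      } catch (IOException e) {
        // Remember the failure (including TLS errors) and try the next URL.
        if (combined == null) {
          combined = new IOException("All URLs failed");
        }
        combined.addSuppressed(e);
      }
    }
    if (combined == null) {
      throw new IOException("No URLs given");
    }
    throw combined;
  }
}
```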

The PR is now fully aligned with #28459.

@meteorcloudy
Member

merging #28459 instead.

auto-merge was automatically disabled January 30, 2026 10:03

Pull request was closed

@github-actions github-actions bot removed the awaiting-review PR is awaiting review from an assigned reviewer label Jan 30, 2026
Labels

team-Documentation Documentation improvements that cannot be directly linked to other team labels team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file.

4 participants