better handling of disconnections and rpc driver call issues#247
Conversation
📝 WalkthroughWalkthroughAdds RideSX-specific flash error type and robust upload/flash error handling with power-off cleanup, standardizes gRPC AioRpcError formatting in the core client, tightens stream-copy exception logging, and adds presigned HTTP resource handling with detailed exception mapping and cleanup. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py (1)
124-137:logger.exceptionwith a%sformat arg for the exception is redundant.
logger.exceptionimplicitly setsexc_info=True, so the exception string and traceback are already appended to the log record. Passingpower_erroras a format argument duplicates the error text in the emitted message.♻️ Proposed fix
- self.logger.exception("power-off cleanup failed after flash operation error: %s", power_error) + self.logger.exception("power-off cleanup failed after flash operation error")- self.logger.exception("power-off cleanup failed after successful flash operation: %s", power_error) + self.logger.exception("power-off cleanup failed after successful flash operation")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py` around lines 124 - 137, The logger.exception calls that pass the caught exception as a %s format arg are redundant; in both places around the flash operation (the inner except handling power_error after a failed flash and the later except handling power_error after a successful flash) remove the "%s" formatting and the power_error argument and call logger.exception with a plain descriptive message (e.g., "power-off cleanup failed after flash operation error" and "power-off cleanup failed after successful flash operation") so the exception and traceback are logged once by logger.exception while keeping the messages concise; target the calls to logger.exception associated with _power_off_if_available and power_error.python/packages/jumpstarter/jumpstarter/client/core.py (1)
129-138:streamingcall_asyncstill uses rawe.details()— inconsistent with the_format_rpc_errorhelper added in this PR.
call_asyncnow emitsDriverCall 'X' failed with gRPC STATUS: <details> | debug=…, but streaming errors still emit only the bare details string. This makes diagnosing streaming failures harder compared to regular calls.♻️ Proposed fix
try: async for response in self.stub.StreamingDriverCall(request): yield decode_value(response.result) except AioRpcError as e: + error_message = self._format_rpc_error(method, e) match e.code(): case StatusCode.UNIMPLEMENTED: - raise DriverMethodNotImplemented(e.details()) from None + raise DriverMethodNotImplemented(error_message) from None case StatusCode.INVALID_ARGUMENT: - raise DriverInvalidArgument(e.details()) from None + raise DriverInvalidArgument(error_message) from None case StatusCode.UNKNOWN: - raise DriverError(e.details()) from None + raise DriverError(error_message) from None case _: - raise DriverError(e.details()) from e + raise DriverError(error_message) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter/jumpstarter/client/core.py` around lines 129 - 138, The streamingcall_async exception handling is still using raw e.details(); update it to use the shared _format_rpc_error helper so streaming errors include the same formatted message and debug info as call_async. In the except AioRpcError block inside streamingcall_async, replace uses of e.details() when constructing DriverMethodNotImplemented, DriverInvalidArgument, DriverError (and the default raise) with _format_rpc_error(e, "DriverCall '<method_name>' failed") or similar so the raised exceptions carry the formatted string; keep the same exception classes and preserve the original match on e.code().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/packages/jumpstarter/jumpstarter/streams/common.py`:
- Line 29: The except clause is importing the private symbol
asyncio.exceptions.InvalidStateError; replace it with the public
asyncio.InvalidStateError to avoid relying on CPython internals. Update the
except tuple in the error handling in jumpstarter.streams.common (the except
handling BrokenResourceError, ClosedResourceError,
asyncio.exceptions.InvalidStateError) to use asyncio.InvalidStateError, and make
the same change in jumpstarter.streams.router where the same private symbol is
used (occurrences around the except blocks at the referenced locations).
---
Nitpick comments:
In
`@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py`:
- Around line 124-137: The logger.exception calls that pass the caught exception
as a %s format arg are redundant; in both places around the flash operation (the
inner except handling power_error after a failed flash and the later except
handling power_error after a successful flash) remove the "%s" formatting and
the power_error argument and call logger.exception with a plain descriptive
message (e.g., "power-off cleanup failed after flash operation error" and
"power-off cleanup failed after successful flash operation") so the exception
and traceback are logged once by logger.exception while keeping the messages
concise; target the calls to logger.exception associated with
_power_off_if_available and power_error.
In `@python/packages/jumpstarter/jumpstarter/client/core.py`:
- Around line 129-138: The streamingcall_async exception handling is still using
raw e.details(); update it to use the shared _format_rpc_error helper so
streaming errors include the same formatted message and debug info as
call_async. In the except AioRpcError block inside streamingcall_async, replace
uses of e.details() when constructing DriverMethodNotImplemented,
DriverInvalidArgument, DriverError (and the default raise) with
_format_rpc_error(e, "DriverCall '<method_name>' failed") or similar so the
raised exceptions carry the formatted string; keep the same exception classes
and preserve the original match on e.code().
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py (1)
78-83: Double-wrappingRideSXFlashErrorfrom_upload_file_if_needed
_upload_file_if_neededalready raisesRideSXFlashError(with file, filename, and scheme details) whenwrite_from_pathfails. SinceRideSXFlashErroris a subclass ofException, it is caught here and re-wrapped in a secondRideSXFlashError, producing a redundant exception chain:RideSXFlashError("Failed preparing image for partition 'X': /path") └─ caused by: RideSXFlashError("Failed to transfer '/path' … (scheme=…)") └─ caused by: <original>The outer wrapper's value (adding partition context) is only meaningful for exceptions that do NOT already carry flash context — e.g., unexpected errors from
_should_upload_file. Re-raisingRideSXFlashErrordirectly avoids the redundant layer:♻️ Proposed refactor
- try: - remote_files[partition] = self._upload_file_if_needed(file_path, operator) - except Exception as e: - raise RideSXFlashError( - f"Failed preparing image for partition '{partition}': {file_path}" - ) from e + try: + remote_files[partition] = self._upload_file_if_needed(file_path, operator) + except RideSXFlashError: + raise # already carries detailed context + except Exception as e: + raise RideSXFlashError( + f"Failed preparing image for partition '{partition}': {file_path}" + ) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py` around lines 78 - 83, The try/except is double-wrapping RideSXFlashError from _upload_file_if_needed; change the handler so that if the caught exception is already a RideSXFlashError you re-raise it as-is, otherwise wrap non-RideSXFlashError exceptions in a new RideSXFlashError that adds the partition/file context; update the block around remote_files[partition] = self._upload_file_if_needed(file_path, operator) to re-raise existing RideSXFlashError and only construct a new RideSXFlashError for other exception types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@python/packages/jumpstarter/jumpstarter/driver/base.py`:
- Around line 259-273: The broad except handlers for TimeoutError and OSError
are wrapping caller-side exceptions because both yield points live inside the
outer try; fix by tracking whether control has been yielded to the caller (e.g.,
add a boolean like setup_complete = False, set setup_complete = True immediately
after the yield at lines around the two yield statements) and in the
TimeoutError/OSError except blocks only wrap and re-raise with the "Presigned
HTTP ..." message when setup_complete is False; if setup_complete is True,
re-raise the original exception (raise) so caller-side errors are not
misattributed; keep the aiohttp-specific handlers (ClientResponseError,
ClientConnectionError, ClientPayloadError) as-is.
---
Nitpick comments:
In
`@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py`:
- Around line 78-83: The try/except is double-wrapping RideSXFlashError from
_upload_file_if_needed; change the handler so that if the caught exception is
already a RideSXFlashError you re-raise it as-is, otherwise wrap
non-RideSXFlashError exceptions in a new RideSXFlashError that adds the
partition/file context; update the block around remote_files[partition] =
self._upload_file_if_needed(file_path, operator) to re-raise existing
RideSXFlashError and only construct a new RideSXFlashError for other exception
types.
c7edd6d to
7ecf38e
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py (2)
147-160:logger.exception+%sformat arg causes the error to appear twice in output.
logger.exception(msg, *args)is equivalent tologger.error(msg, *args, exc_info=True)— theexc_info=Truealready appends the full traceback (including the exception message). Passingpower_erroras the%sargument embedsstr(power_error)in the log line and in the traceback, duplicating it.♻️ Suggested fix (both occurrences)
- self.logger.exception("power-off cleanup failed after flash operation error: %s", power_error) + self.logger.exception("power-off cleanup failed after flash operation error")- self.logger.exception("power-off cleanup failed after successful flash operation: %s", power_error) + self.logger.exception("power-off cleanup failed after successful flash operation")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py` around lines 147 - 160, The logger.exception calls in the cleanup blocks for _power_off_if_available pass power_error as a "%s" format argument which duplicates the exception message in the log; change both calls to not pass power_error (e.g. self.logger.exception("power-off cleanup failed after flash operation error") and self.logger.exception("power-off cleanup failed after successful flash operation")) so the traceback (exc_info=True provided by logger.exception) is printed once, leaving wrapped_error/flash_error handling unchanged.
78-83: Double-wrappingRideSXFlashErroradds a verbose, partially-redundant cause chain.
_upload_file_if_neededalready raisesRideSXFlashErrorwith a detailed message (file path + scheme). CatchingExceptionhere also catches thatRideSXFlashErrorand wraps it in anotherRideSXFlashError, producing a three-level chain:RideSXFlashError("Failed preparing image for partition 'boot_a': /path/to/boot.img") __cause__ = RideSXFlashError("Failed to transfer '/path/to/boot.img' … (scheme=file)") __cause__ = <storage exception>Since
JumpstarterException.__str__renders the full cause chain, users see the outer (less specific) message first. Letting already-typedRideSXFlashErrorpropagate unchanged avoids the noise while still wrapping lower-level errors (e.g.,ValueError) with partition context:♻️ Suggested refactor
- try: - remote_files[partition] = self._upload_file_if_needed(file_path, operator) - except Exception as e: - raise RideSXFlashError( - f"Failed preparing image for partition '{partition}': {file_path}" - ) from e + try: + remote_files[partition] = self._upload_file_if_needed(file_path, operator) + except RideSXFlashError: + raise + except Exception as e: + raise RideSXFlashError( + f"Failed preparing image for partition '{partition}': {file_path}" + ) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py` around lines 78 - 83, The current try/except around remote_files[partition] = self._upload_file_if_needed(file_path, operator) is double-wrapping existing RideSXFlashError instances; change the handler to let RideSXFlashError propagate unchanged and only wrap other exception types: i.e., in the except block check if isinstance(e, RideSXFlashError) then re-raise, otherwise raise a new RideSXFlashError that includes partition and file_path as context and set from e; reference _upload_file_if_needed, RideSXFlashError, partition, file_path and remote_files when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@python/packages/jumpstarter/jumpstarter/streams/common.py`:
- Around line 18-32: Replace the private CPython exception with the public
symbol and keep the new logging: ensure the except clause uses
asyncio.InvalidStateError (not asyncio.exceptions.InvalidStateError) along with
BrokenResourceError and ClosedResourceError, preserve the logger.warning("stream
copy interrupted (%s): %s", type(e).__name__, e) line and the optional
logger.debug("stream copy root cause: %r", e.__cause__) conditional, and keep
the suppress(...) block that calls await dst.send_eof() (including
AttributeError and OSError) so interrupted stream copies are logged rather than
silently suppressed.
---
Nitpick comments:
In
`@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py`:
- Around line 147-160: The logger.exception calls in the cleanup blocks for
_power_off_if_available pass power_error as a "%s" format argument which
duplicates the exception message in the log; change both calls to not pass
power_error (e.g. self.logger.exception("power-off cleanup failed after flash
operation error") and self.logger.exception("power-off cleanup failed after
successful flash operation")) so the traceback (exc_info=True provided by
logger.exception) is printed once, leaving wrapped_error/flash_error handling
unchanged.
- Around line 78-83: The current try/except around remote_files[partition] =
self._upload_file_if_needed(file_path, operator) is double-wrapping existing
RideSXFlashError instances; change the handler to let RideSXFlashError propagate
unchanged and only wrap other exception types: i.e., in the except block check
if isinstance(e, RideSXFlashError) then re-raise, otherwise raise a new
RideSXFlashError that includes partition and file_path as context and set from
e; reference _upload_file_if_needed, RideSXFlashError, partition, file_path and
remote_files when making the change.
7ecf38e to
92354b8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
python/packages/jumpstarter/jumpstarter/client/core.py (1)
129-138:streamingcall_asyncnot updated to use_format_rpc_error.
streamingcall_asyncstill uses rawe.details()and also lacks theNOT_FOUND→DriverMethodNotImplementedmapping that was added tocall_async. Given the PR explicitly introduces_format_rpc_errorto standardize all RPC error messages, not applying it here is an inconsistency.♻️ Suggested fix
try: async for response in self.stub.StreamingDriverCall(request): yield decode_value(response.result) except AioRpcError as e: + error_message = self._format_rpc_error(method, e) match e.code(): + case StatusCode.NOT_FOUND: + raise DriverMethodNotImplemented(error_message) from None case StatusCode.UNIMPLEMENTED: - raise DriverMethodNotImplemented(e.details()) from None + raise DriverMethodNotImplemented(error_message) from None case StatusCode.INVALID_ARGUMENT: - raise DriverInvalidArgument(e.details()) from None + raise DriverInvalidArgument(error_message) from None case StatusCode.UNKNOWN: - raise DriverError(e.details()) from None + raise DriverError(error_message) from None case _: - raise DriverError(e.details()) from e + raise DriverError(error_message) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/packages/jumpstarter/jumpstarter/client/core.py` around lines 129 - 138, Update the AioRpcError handling in streamingcall_async to use the new _format_rpc_error helper instead of raw e.details(), and add the NOT_FOUND → DriverMethodNotImplemented mapping to match call_async; specifically, in the except AioRpcError block inside streamingcall_async replace usages of e.details() with _format_rpc_error(e) and add a case for StatusCode.NOT_FOUND that raises DriverMethodNotImplemented(... ) from None, keeping the other StatusCode branches and the default raise DriverError(... ) from e as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py`:
- Around line 78-83: The current catch around self._upload_file_if_needed is
re-wrapping an existing RideSXFlashError and creating a redundant exception
chain; in the except block for the try that populates remote_files[partition]
call, check if the caught exception (e) is an instance of RideSXFlashError and
if so re-raise it directly (raise) to preserve the original error, otherwise
wrap it in a new RideSXFlashError that includes the partition and file_path
context; reference _upload_file_if_needed, RideSXFlashError, partition,
file_path and remote_files to locate where to implement the isinstance(e,
RideSXFlashError) conditional re-raise.
---
Duplicate comments:
In `@python/packages/jumpstarter/jumpstarter/driver/base.py`:
- Around line 240-279: Remove the redundant aiohttp.ServerTimeoutError from the
multi-except that currently reads except (aiohttp.ClientConnectionError,
aiohttp.ClientPayloadError, aiohttp.ServerTimeoutError) as e; ServerTimeoutError
is a subclass of ClientConnectionError so it's already matched — update that
except clause to only include aiohttp.ClientConnectionError and
aiohttp.ClientPayloadError (leaving the surrounding error message and raising
behaviour unchanged).
---
Nitpick comments:
In `@python/packages/jumpstarter/jumpstarter/client/core.py`:
- Around line 129-138: Update the AioRpcError handling in streamingcall_async to
use the new _format_rpc_error helper instead of raw e.details(), and add the
NOT_FOUND → DriverMethodNotImplemented mapping to match call_async;
specifically, in the except AioRpcError block inside streamingcall_async replace
usages of e.details() with _format_rpc_error(e) and add a case for
StatusCode.NOT_FOUND that raises DriverMethodNotImplemented(... ) from None,
keeping the other StatusCode branches and the default raise DriverError(... )
from e as before.
| try: | ||
| remote_files[partition] = self._upload_file_if_needed(file_path, operator) | ||
| except Exception as e: | ||
| raise RideSXFlashError( | ||
| f"Failed preparing image for partition '{partition}': {file_path}" | ||
| ) from e |
There was a problem hiding this comment.
Double RideSXFlashError wrapping creates a redundant exception chain.
_upload_file_if_needed already raises RideSXFlashError (line 53). Catching it here and re-wrapping in another RideSXFlashError produces a chain of two same-typed exceptions:
original_exception → RideSXFlashError("Failed to transfer 'X' …") → RideSXFlashError("Failed preparing image for partition 'P' …")
The partition context in the outer message is useful, but only when the inner exception is not already a RideSXFlashError. Consider re-raising the inner RideSXFlashError as-is:
🛠️ Suggested fix
try:
remote_files[partition] = self._upload_file_if_needed(file_path, operator)
- except Exception as e:
+ except RideSXFlashError:
+ raise
+ except Exception as e:
raise RideSXFlashError(
f"Failed preparing image for partition '{partition}': {file_path}"
) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@python/packages/jumpstarter-driver-ridesx/jumpstarter_driver_ridesx/client.py`
around lines 78 - 83, The current catch around self._upload_file_if_needed is
re-wrapping an existing RideSXFlashError and creating a redundant exception
chain; in the except block for the try that populates remote_files[partition]
call, check if the caught exception (e) is an instance of RideSXFlashError and
if so re-raise it directly (raise) to preserve the original error, otherwise
wrap it in a new RideSXFlashError that includes the partition and file_path
context; reference _upload_file_if_needed, RideSXFlashError, partition,
file_path and remote_files to locate where to implement the isinstance(e,
RideSXFlashError) conditional re-raise.
This PR improves the error information received by a user when performing a j storage flash , specially from http targets, when the remote server closes connection you end up with a very silent
"Error:"
which is hard to interpret, so this is trying to surface the details to the user.
NOTE:
I tested HTTP server resets by creating :
during the download, makes the http client receive TCP RSTs for any packet, this could be helpful for e2e tests
Summary by CodeRabbit