Fix: More accurate perf metrics for slow-starting streams#385
Conversation
WalkthroughWalkthroughThe pull request introduces modifications to the Changes
Possibly related PRs
Suggested labels
Recent review detailsConfiguration used: CodeRabbit UI Files selected for processing (1)
Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
/fix-pr
|
There was a problem hiding this comment.
Actionable comments posted: 0
Outside diff range and nitpick comments (1)
airbyte/progress.py (1)
425-425: Consider splitting the long line, wdyt?Ruff flagged line 425 for exceeding the line length limit of 100 characters. To improve readability and adhere to the coding style, how about splitting it into multiple lines like this:
self._print_info_message( f"Read started on stream `{stream_name}` at " f"`{pendulum.now().format('HH:mm:ss')}`..." )Tools
Ruff
425-425: Line too long (101 > 100)
(E501)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- airbyte/progress.py (3 hunks)
Additional context used
Ruff
airbyte/progress.py
425-425: Line too long (101 > 100)
(E501)
Additional comments not posted (3)
airbyte/progress.py (3)
421-427: LGTM!The
log_stream_startfunction looks good. It consolidates the logging functionality and ensures the start time is only recorded once per stream. Nice work!Tools
Ruff
425-425: Line too long (101 > 100)
(E501)
Line range hint
429-432: Looks good!The
_log_stream_read_endfunction is straightforward and does its job of logging the stream read completion and recording the end time. No issues found.Tools
Ruff
425-425: Line too long (101 > 100)
(E501)
279-287: Nicely done!The changes in
tally_records_readlook great. The function now correctly handles theAirbyteStreamStatus.STARTEDandAirbyteStreamStatus.COMPLETEtrace messages, calling the appropriate logging functions. The logic is clean and easy to follow.
Prior to this PR, PyAirbyte tracked the start time of a stream along with the first record received. This works well for most cases, but if a stream takes a while to send its first record, then our metrics will be skewed to look like the sync is faster than it really is.
This update uses the start trace event to get a better start time for each stream.
Summary by CodeRabbit
New Features
Bug Fixes