fix: extract usage from ResponsesAPI streaming events #2573
varad-ahirwadkar wants to merge 1 commit into kubernetes-sigs:main
Conversation
Hi @varad-ahirwadkar. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test` on its own line.

Tip: We noticed you've done this a few times! Consider joining the org to skip this step.

Once the patch is verified, the new status will be reflected by the `ok-to-test` label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
✅ Deploy Preview for gateway-api-inference-extension ready!
Force-pushed ab88f85 to 55d24ce
/ok-to-test
@varad-ahirwadkar: Cannot trigger testing until a trusted user reviews the PR and leaves an `/ok-to-test` message.

In response to this:

> /ok-to-test
/ok-to-test |
Force-pushed 55d24ce to 10fda3b
Thanks, what kind of testing was done to validate the fix?
/assign @zetxqx
zetxqx left a comment:
Looks good. Can you add a unit test for this, with some real streaming response data?
I validated the fix by running the relevant test. I also ran the full test suite for the OpenAI parser package to ensure there were no regressions. However, when I sent a curl request similar to the one mentioned in issue #2482, I still got the same results.
Sure, will add a unit test. Thanks
Signed-off-by: Varad <varad.ahirwadkar1@ibm.com>
Force-pushed 10fda3b to 107d3e4
Hi @zetxqx,
/approve

I will leave the lgtm to @zetxqx
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, varad-ahirwadkar

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The Responses API returns usage under `response.usage`, which was not previously handled by the streaming parser. This change adds support for that format while keeping compatibility with existing ChatCompletions and vLLM streaming responses.

Which issue(s) this PR fixes:
Fixes #2482
Does this PR introduce a user-facing change?: