feat: add e2e test to verify service is available #310
|
/kind test |
Is something wrong with golangci-lint? |
|
/kind feature |
This is because resp.Body.Close() returns an error that we don't check. I think we can silence the lint, or we have to do something like:
defer func() {
	_ = resp.Body.Close()
}() |
return nil
}

func CheckLlamacppServeAvaliable(localPort int) error {
Why should we have two different checks here? I think they're both OpenAI compatible, so a single client makes more sense. Maybe we can use https://github.com/openai/openai-go? The benefit is that it supports richer features, like constructing the system prompt and the chat conversation; otherwise we have to build the structures ourselves.
> I think they're both OpenAI compatible, only one client makes more sense I think.
IIUC, their APIs are different.
ollama: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion
llamacpp: https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md#post-completion-given-a-prompt-it-returns-the-predicted-completion
Let's support OpenAI API compatible chat completions first. It's a standard protocol across most of the inference engines.
I see the same code in other projects like kubernetes/lws, but there is no lint error in their CI. |
7c31097 to c972cba
|
All comments addressed. @kerthcet Will we be rate-limited by Hugging Face if we run CI many times? I suspect that most of the time is spent pulling the model. |
I think it's unrelated to this PR, right? We just check that the service is ready. |
|
Hmm, it seems we used to run the e2e tests within minutes; see the last record. |
|
/retest all |
|
Once the model is downloaded, loading it into memory still takes time. |
|
I reran the tests and they finished in 6 minutes. I think the time is acceptable. |
|
Rerunning again to see the final result. |
|
It seems stuck here; the model is already downloaded. |
|
I have no idea whether it's because of the port forwarding or resource contention. I may take a look later.
}()
<-readyChan
return check()
}).Should(gomega.Succeed())
Let's not use Eventually here; I don't think we want to forward the port several times. Eventually is meant for status checks, so we can wrap only the check() function with it. Generally it looks like:
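func ValidateServiceAvaliable() error {
	// port forward logic (started once)
	select {
	case <-readyChan:
	case <-time.After(TIMEOUT):
		return fmt.Errorf("port forwarding timeout")
	}
	// retry only the check, not the port forward
	gomega.Eventually(check, TIMEOUT, INTERVAL).Should(gomega.Succeed())
	return nil
}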
|
/lgtm Let's make it happen now, we can focus on the performance later! Thanks @nayihz |
What this PR does / why we need it
Which issue(s) this PR fixes
Fixes #274
Special notes for your reviewer
Does this PR introduce a user-facing change?