feat: add e2e test to verify service is avaliable by nayihz · Pull Request #310 · InftyAI/llmaz

nayihz · 2025-03-12T12:14:45Z

What this PR does / why we need it

Which issue(s) this PR fixes

Fixes ##274

Special notes for your reviewer

Does this PR introduce a user-facing change?

nayihz · 2025-03-12T12:30:53Z

/kind test
/kind feature

nayihz · 2025-03-12T12:44:23Z

Error: test/util/validation/validate_service.go:346:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^
  Error: test/util/validation/validate_service.go:[37](https://github.com/InftyAI/llmaz/actions/runs/13811392705/job/38633611131?pr=310#step:4:39)7:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^

Something wrong with golang ci lint?

nayihz · 2025-03-12T12:44:44Z

/kind feature

kerthcet · 2025-03-13T03:34:21Z

Error: test/util/validation/validate_service.go:346:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^
  Error: test/util/validation/validate_service.go:[37](https://github.com/InftyAI/llmaz/actions/runs/13811392705/job/38633611131?pr=310#step:4:39)7:23: Error return value of `resp.Body.Close` is not checked (errcheck)
  	defer resp.Body.Close()
  	                     ^

Something wrong with golang ci lint?

This is because resp.Body.Close() will return a value, but we don't validate the return. We can silent the lint I think. Or we have to do something like:

defer func() {
   _ = resp.Body.Close()
}()

kerthcet · 2025-03-13T03:36:08Z

+	return nil
+}
+
+func CheckLlamacppServeAvaliable(localPort int) error {


Why should we have two different Check here? I think they're both OpenAI compatible, only one client makes more sense I think. Maybe we can use the https://github.com/openai/openai-go? The benefit is it supports more rich features like constructing the system prompt and chat conversation, or we have to build the structure ourself.

I think they're both OpenAI compatible, only one client makes more sense I think.

IIUC, their API are different.
ollama: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion
llamacpp: https://github.com/ggml-org/llama.cpp/blob/master/examples/server/README.md#post-completion-given-a-prompt-it-returns-the-predicted-completion

Let's support OpenAI API compatible chat completions first. It's a standard protocol across most of the inference engines.

nayihz · 2025-03-13T05:52:39Z

This is because resp.Body.Close() will return a value, but we don't validate the return. We can silent the lint I think.

I see the same code in other project like kubernetes/lws, but no lint error in CI.

kerthcet

Generally LGTM. Thanks @nayihz

nayihz · 2025-03-14T08:49:58Z

All comments addressed. @kerthcet
e2e-test took a very long time.

playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [1075.442 seconds]

Will it be rate-limited by huggingFace if we run ci many times? I suspect that most of the time is spent on pulling the model.

kerthcet

Only one nit.

kerthcet · 2025-03-14T10:42:33Z

All comments addressed. @kerthcet e2e-test took a very long time.
playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [1075.442 seconds]
Will it be rate-limited by huggingFace if we run ci many times? I suspect that most of the time is spent on pulling the model.

I think it's unrelated to this PR, right? We just check the service ready.

kerthcet · 2025-03-14T10:44:44Z

em.. seems we used to run e2e tests with minutes. the last record.

succeeded  in 5m 55s

kerthcet · 2025-03-14T10:44:49Z

/retest all

kerthcet · 2025-03-14T10:47:08Z

Once model is downloaded, loading the model into memory still takes time.

kerthcet · 2025-03-14T10:53:51Z

I rerun the tests, it finished in 6mins. I think the time is acceptable.

succeeded  in 6m 11s

kerthcet · 2025-03-14T10:54:09Z

rerun again to see the final result.

kerthcet · 2025-03-14T11:05:14Z

Seems stuck here, model is already downloaded.

playground e2e tests Deploy a huggingface model with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:77
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080

kerthcet · 2025-03-14T11:40:08Z

Have no idea whether is because of the portforward or the resource contention. May take a look later.

playground e2e tests Deploy a huggingface model with llama.cpp
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:77
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [985.885 seconds]
------------------------------
playground e2e tests Deploy a huggingface model with customized backendRuntime
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:96
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Handling connection for 8080
• [964.528 seconds]
------------------------------
playground e2e tests Deploy a huggingface model with llama.cpp, HPA enabled
/home/runner/work/llmaz/llmaz/test/e2e/playground_test.go:122
• [16.392 seconds]

kerthcet · 2025-03-14T11:49:32Z

+		}()
+		<-readyChan
+		return check()
+	}).Should(gomega.Succeed())


let's not use eventually here, I don't think we want to forward the port for several time, eventually is used for status check, so we can wrap the check() function with eventually. Generally it looks like:

func ValidateServiceAvaliable() { // port forward logic select { case <-readyChan: case <-time.After(TIMEOUT): return fmt.Errorf("port forwarding timeout") } Eventually ({check()}, TIMEOUT, INTERVAL) }

kerthcet · 2025-03-17T02:43:55Z

/lgtm
/approve

Let's make it happen now, we can focus on the performance later! Thanks @nayihz

InftyAI-Agent added needs-triage Indicates an issue or PR lacks a label and requires one. needs-priority Indicates a PR lacks a label and requires one. do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Mar 12, 2025

InftyAI-Agent requested a review from kerthcet March 12, 2025 12:15

feat: add e2e test to verify service is avaliable

8c4ff45

nayihz force-pushed the feat_e2e branch from c2661d8 to 8c4ff45 Compare March 12, 2025 12:30

nayihz commented Mar 12, 2025

View reviewed changes

Comment thread test/util/validation/validate_service.go Outdated

InftyAI-Agent added feature Categorizes issue or PR as related to a new feature. and removed do-not-merge/needs-kind Indicates a PR lacks a label and requires one. labels Mar 12, 2025

kerthcet reviewed Mar 13, 2025

View reviewed changes

delete ollama test case

18e3686

nayihz changed the title ~~{WIP}feat: add e2e test to verify service is avaliable~~ feat: add e2e test to verify service is avaliable Mar 13, 2025

kerthcet reviewed Mar 14, 2025

View reviewed changes

nayihz force-pushed the feat_e2e branch 4 times, most recently from 7c31097 to c972cba Compare March 14, 2025 08:23

kerthcet reviewed Mar 14, 2025

View reviewed changes

Comment thread test/util/validation/validate_service.go Outdated

fix comments

3175d1b

nayihz force-pushed the feat_e2e branch from c972cba to 3175d1b Compare March 14, 2025 10:55

kerthcet reviewed Mar 14, 2025

View reviewed changes

nayihz force-pushed the feat_e2e branch from 696f1ef to dd6263f Compare March 15, 2025 00:05

move port forward outof eventually

38c2d45

nayihz force-pushed the feat_e2e branch from dd6263f to 38c2d45 Compare March 15, 2025 00:13

InftyAI-Agent added lgtm Looks good to me, indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 17, 2025

InftyAI-Agent assigned kerthcet Mar 17, 2025

InftyAI-Agent merged commit 5adc074 into InftyAI:main Mar 17, 2025

nayihz deleted the feat_e2e branch March 17, 2025 05:52

nayihz mentioned this pull request Mar 22, 2025

Use client to verify the model inference is ready in E2E test #274

Closed

3 tasks

carlory mentioned this pull request Jun 18, 2025

Update Karpenter image repository #461

Merged

Uh oh!

Conversation

nayihz commented Mar 12, 2025

What this PR does / why we need it

Which issue(s) this PR fixes

Special notes for your reviewer

Does this PR introduce a user-facing change?

Uh oh!

nayihz commented Mar 12, 2025

Uh oh!

Uh oh!

nayihz commented Mar 12, 2025

Uh oh!

nayihz commented Mar 12, 2025

Uh oh!

kerthcet commented Mar 13, 2025

Uh oh!

kerthcet Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nayihz Mar 13, 2025

Choose a reason for hiding this comment

Uh oh!

kerthcet Mar 13, 2025

Choose a reason for hiding this comment

Uh oh!

nayihz commented Mar 13, 2025

Uh oh!

kerthcet left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

nayihz commented Mar 14, 2025

Uh oh!

kerthcet left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet commented Mar 14, 2025

Uh oh!

kerthcet Mar 14, 2025

Choose a reason for hiding this comment

Uh oh!

nayihz Mar 15, 2025

Choose a reason for hiding this comment

Uh oh!

kerthcet commented Mar 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kerthcet Mar 13, 2025 •

edited

Loading