[LLM] Support for automatic deployment of services, modification of environment variable names #9966
Thanks for your contribution!
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #9966     +/-   ##
============================================
+ Coverage    17.38%   50.56%   +33.17%
============================================
  Files          752      752
  Lines       120314   121062     +748
============================================
+ Hits         20919    61213   +40294
+ Misses       99395    59849   -39546
export BLOCK_BS=${BLOCK_BS:-"4"}
export BLOCK_SIZE=${BLOCK_SIZE:-"64"}
export DTYPE=${DTYPE:-"bfloat16"}
export USE_CACHE_KV_INT8=${USE_CACHE_KV_INT8:-"0"} # c8 model requires configuration 1
Please check whether these variables are still needed; if not, they can be removed.
export METRICS_PORT=${METRICS_PORT:-"8722"}
export INFER_QUEUE_PORT=${INFER_QUEUE_PORT:-"8813"}
export PUSH_MODE_HTTP_PORT=${PUSH_MODE_HTTP_PORT:-"9965"}
export HEALTH_HTTP_PORT=${HTTP_PORT:-${HEALTH_HTTP_PORT:-"8110"}}
Is a $ sign missing here?
llm/server/server/server/utils.py
Outdated
# Cleanup on failure
if os.path.exists(temp_tar):
    os.remove(temp_tar)
raise Exception(f"Failed to get model from {url}, please recheck the model name from https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/docs/predict/inference.md")
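Is the logic here that, whatever string the user passes in, we try to download it first, and only raise this error if the download fails?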
Changed: the model name is now restricted.
# Check if model_name matches any supported pattern
if not any(re.match(pattern, model_name) for pattern in supported_patterns):
    raise ValueError(
        f"{model_name} is not in the supported list. Currently supported models: Qwen, Llama, Mixtral, DeepSeek."
Change this to: "Serving currently supports only the Qwen, LLaMA, Mixtral, and DeepSeekv3 model series; for the exact model name parameter, see the xx documentation."
Added.
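For reference, a minimal sketch of the name check discussed in this thread, written so that unsupported names are rejected before any download is attempted. The pattern list and the function name `check_model_name` are hypothetical; the PR's actual `supported_patterns` is not shown in this hunk.

```python
import re

# Hypothetical patterns for illustration only; the real list lives in the PR.
SUPPORTED_PATTERNS = [
    r"^Qwen/",
    r"^meta-llama/",
    r"^mistralai/Mixtral",
    r"^deepseek-ai/DeepSeek",
]


def check_model_name(model_name: str) -> None:
    """Raise ValueError if model_name matches none of the supported patterns."""
    if not any(re.match(pattern, model_name) for pattern in SUPPORTED_PATTERNS):
        raise ValueError(
            f"{model_name} is not supported. Serving currently supports only the "
            "Qwen, LLaMA, Mixtral, and DeepSeekv3 model series."
        )
```

Calling `check_model_name` first means a mistyped name fails fast with a clear message instead of surfacing later as a download error.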
@@ -0,0 +1 @@
from server.triton_server import TritonPythonModel
\ No newline at end of file
Is this file redundant? And is the folder really named 1?
Before submitting
tests folder. If there are codecov issues, please add test cases first.
PR types
PR changes
environment variable modification
add the feature to download models.
Description
In this update, we have adjusted the naming of environment variables to enhance the clarity and consistency of configurations. Below are the specific changes:
GRPC_PORT has been changed to SERVICE_GRPC_PORT
HTTP_PORT has been changed to HEALTH_HTTP_PORT
METRICS_PORT has been changed to METRICS_HTTP_PORT
INTER_QUEUE_PORT has been changed to INTER_PROC_PORT
PUSH_MODE_HTTP_PORT has been changed to SERVICE_HTTP_PORT
The original environment variables remain compatible, but it is recommended to use the latest ones.
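As an illustration of how the old and new names can coexist, here is a small Python sketch of the lookup. It mirrors the precedence of the one shell line quoted in the review (`HEALTH_HTTP_PORT=${HTTP_PORT:-${HEALTH_HTTP_PORT:-"8110"}}`): the legacy name wins if it is set and non-empty, then the new name, then the default. Applying that same order to all five pairs is an assumption, and the helper name `resolve_env` is hypothetical.

```python
import os

# New name -> legacy name, taken from the rename list in this description.
LEGACY_NAMES = {
    "SERVICE_GRPC_PORT": "GRPC_PORT",
    "HEALTH_HTTP_PORT": "HTTP_PORT",
    "METRICS_HTTP_PORT": "METRICS_PORT",
    "INTER_PROC_PORT": "INTER_QUEUE_PORT",
    "SERVICE_HTTP_PORT": "PUSH_MODE_HTTP_PORT",
}


def resolve_env(new_name, default, environ=None):
    """Return the legacy value if set, else the new value, else the default.

    Empty strings count as unset, matching the shell's ${VAR:-fallback}.
    """
    env = os.environ if environ is None else environ
    legacy_name = LEGACY_NAMES[new_name]
    return env.get(legacy_name) or env.get(new_name) or default
```

For example, `resolve_env("HEALTH_HTTP_PORT", "8110")` reads the current process environment; passing a dict as `environ` makes the precedence easy to test.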
Add a download_model function to download the model from the repository.
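A rough sketch of the download-and-clean-up flow that the review comments touch on, using only the standard library. The signature, the temporary file name, and the tar-archive layout are assumptions for illustration; the PR's actual `download_model` may differ.

```python
import os
import tarfile
import urllib.request


def download_model(url: str, target_dir: str) -> str:
    """Fetch a tar archive from url and unpack it into target_dir.

    The temporary archive is removed whether or not the download succeeds.
    """
    os.makedirs(target_dir, exist_ok=True)
    temp_tar = os.path.join(target_dir, "model.tar.part")  # assumed temp name
    try:
        urllib.request.urlretrieve(url, temp_tar)
        # Note: only unpack archives from a trusted source.
        with tarfile.open(temp_tar) as archive:
            archive.extractall(target_dir)
    except Exception as exc:
        raise RuntimeError(f"Failed to get model from {url}") from exc
    finally:
        # Cleanup on success and on failure
        if os.path.exists(temp_tar):
            os.remove(temp_tar)
    return target_dir
```

Putting the cleanup in `finally` keeps a partial archive from being left behind on any error path, and chaining with `from exc` preserves the underlying network or tar error for debugging.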