@ltd0924 ltd0924 commented Mar 3, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py
  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Environment variable modification.
Add a feature to download models.

Description

  • Environment variable modification
    This update renames several environment variables to make the configuration clearer and more consistent. The specific changes are:
    GRPC_PORT has been changed to SERVICE_GRPC_PORT
    HTTP_PORT has been changed to HEALTH_HTTP_PORT
    METRICS_PORT has been changed to METRICS_HTTP_PORT
    INFER_QUEUE_PORT has been changed to INTER_PROC_PORT
    PUSH_MODE_HTTP_PORT has been changed to SERVICE_HTTP_PORT
    The original environment variables remain compatible, but it is recommended to use the latest ones.
  • Model download and deployment
    Adds a download_model function that downloads the model from the repository.
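
As a minimal sketch of how a Python service might honor both the old and new variable names, the hypothetical helper below looks up the legacy name first, then the new name, then a default. The names `RENAMED` and `resolve_port` are illustrative and not part of this PR. The lookup order mirrors the shell line for HEALTH_HTTP_PORT quoted in the review further down; whether the server code resolves the names the same way is an assumption. Only the four pairs from the description that also appear unambiguously elsewhere in the conversation are included.

```python
import os

# New name -> legacy name, taken from the PR description.
RENAMED = {
    "SERVICE_GRPC_PORT": "GRPC_PORT",
    "HEALTH_HTTP_PORT": "HTTP_PORT",
    "METRICS_HTTP_PORT": "METRICS_PORT",
    "SERVICE_HTTP_PORT": "PUSH_MODE_HTTP_PORT",
}


def resolve_port(new_name, default, env=None):
    """Hypothetical helper: return the port configured for new_name as an int.

    Assumed lookup order: legacy variable, then new variable, then default.
    """
    env = os.environ if env is None else env
    legacy_name = RENAMED[new_name]
    # Empty strings count as unset, like the shell ${VAR:-default} form.
    value = env.get(legacy_name) or env.get(new_name) or default
    return int(value)


if __name__ == "__main__":
    print(resolve_port("HEALTH_HTTP_PORT", "8110"))
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.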

paddle-bot bot commented Mar 3, 2025

Thanks for your contribution!

codecov bot commented Mar 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 50.56%. Comparing base (4c2a3f7) to head (974621b).
Report is 325 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #9966       +/-   ##
============================================
+ Coverage    17.38%   50.56%   +33.17%     
============================================
  Files          752      752               
  Lines       120314   121062      +748     
============================================
+ Hits         20919    61213    +40294     
+ Misses       99395    59849    -39546     

export BLOCK_BS=${BLOCK_BS:-"4"}
export BLOCK_SIZE=${BLOCK_SIZE:-"64"}
export DTYPE=${DTYPE:-"bfloat16"}
export USE_CACHE_KV_INT8=${USE_CACHE_KV_INT8:-"0"} # c8 model requires configuration 1

Please check whether these variables are still needed; if not, they can be removed.

export METRICS_PORT=${METRICS_PORT:-"8722"}
export INFER_QUEUE_PORT=${INFER_QUEUE_PORT:-"8813"}
export PUSH_MODE_HTTP_PORT=${PUSH_MODE_HTTP_PORT:-"9965"}
export HEALTH_HTTP_PORT=${HTTP_PORT:-${HEALTH_HTTP_PORT:-"8110"}}

Is a $ sign missing here?

# Cleanup on failure
if os.path.exists(temp_tar):
    os.remove(temp_tar)
raise Exception(f"Failed to get model from {url}, please recheck the model name from https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/docs/predict/inference.md")

So the logic here is to attempt the download no matter what string the user passes, and to raise this error only if the download fails?

Contributor Author

Fixed: model names are now restricted.
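
The cleanup-on-failure fragment quoted at the top of this thread can be sketched as a self-contained function. This is only an illustration under stated assumptions: `download_and_extract` is a hypothetical name, the archive is assumed to be a tar file, and the PR's actual download_model function may be structured differently. It uses `try`/`finally` so the temporary archive is removed whether or not the download succeeds.

```python
import os
import tarfile
import tempfile
import urllib.request


def download_and_extract(url, target_dir):
    """Hypothetical helper: fetch a tar archive from url and unpack it into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    # Reserve a temporary file name for the archive inside target_dir.
    fd, temp_tar = tempfile.mkstemp(suffix=".tar", dir=target_dir)
    os.close(fd)
    try:
        urllib.request.urlretrieve(url, temp_tar)
        with tarfile.open(temp_tar) as archive:
            archive.extractall(target_dir)
    except Exception as exc:
        # Chain the original error so the root cause stays visible.
        raise RuntimeError(f"Failed to get model from {url}") from exc
    finally:
        # Remove the temporary archive on both success and failure.
        if os.path.exists(temp_tar):
            os.remove(temp_tar)
    return target_dir
```

Validating the model name before calling a function like this avoids issuing a network request for names that can never resolve, which is the concern raised in the review comment above.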

# Check if model_name matches any supported pattern
if not any(re.match(pattern, model_name) for pattern in supported_patterns):
    raise ValueError(
        f"{model_name} is not in the supported list. Currently supported models: Qwen, Llama, Mixtral, DeepSeek."

Change this to: Serving currently supports only the Qwen, LLaMA, Mixtral, and DeepSeek-V3 model series; for the specific model names, see the xx document.

Contributor Author

Added.
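
The name check discussed in this thread can be sketched as follows. The pattern strings here are assumptions made for illustration: the real contents of supported_patterns are not shown in the conversation, and `check_model_name` is a hypothetical wrapper name. The error message follows the wording the reviewer requested, without the document reference, which is elided in the review.

```python
import re

# Illustrative patterns only; the PR's actual supported_patterns list is not shown.
SUPPORTED_PATTERNS = [
    r"^Qwen/",
    r"^meta-llama/",
    r"^mistralai/Mixtral",
    r"^deepseek-ai/DeepSeek",
]


def check_model_name(model_name):
    """Hypothetical helper: return model_name if it matches a supported pattern."""
    if not any(re.match(pattern, model_name) for pattern in SUPPORTED_PATTERNS):
        raise ValueError(
            f"{model_name} is not in the supported list. Serving currently "
            "supports only the Qwen, LLaMA, Mixtral, and DeepSeek-V3 model series."
        )
    return model_name
```

Because `re.match` only anchors at the start of the string, each pattern acts as a prefix test; a stricter check would enumerate exact names.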

@@ -0,0 +1 @@
from server.triton_server import TritonPythonModel
\ No newline at end of file
Contributor

Is this file redundant? And is the folder really named 1?

@ZHUI ZHUI merged commit 2c1b106 into PaddlePaddle:develop Mar 6, 2025
10 of 12 checks passed