Checklist
Motivation
Currently in SGLang, the choice of FP8 GEMM kernel is controlled by a series of environment variables and implicit dispatch logic, as in https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/layers/quantization/fp8_utils.py#L151
For more explicit control, we need a server argument such as --fp8-gemm-runner-backend, similar to --moe-runner-backend.
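A rough sketch of what such an argument could look like, to make the proposal concrete. The backend names (`deep_gemm`, `cutlass`, `triton`), the priority order used for `auto`, and the helper `resolve_fp8_gemm_backend` are all assumptions for illustration only; the real set of backends and the integration point in the server arguments would need to be decided in the actual implementation.

```python
import argparse
from typing import Sequence

# Hypothetical backend names, listed in an assumed priority order for "auto".
FP8_GEMM_BACKENDS = ("auto", "deep_gemm", "cutlass", "triton")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the proposed flag (illustrative only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--fp8-gemm-runner-backend",
        type=str,
        choices=FP8_GEMM_BACKENDS,
        default="auto",
        help="Backend used for FP8 GEMM kernels.",
    )
    return parser


def resolve_fp8_gemm_backend(requested: str, available: Sequence[str]) -> str:
    """Pick a concrete backend from an explicit request or from 'auto'.

    `available` stands in for whatever hardware/package checks the
    current implicit dispatch performs.
    """
    if requested == "auto":
        # Walk the assumed priority order and take the first usable backend.
        for name in FP8_GEMM_BACKENDS[1:]:
            if name in available:
                return name
        raise RuntimeError("No FP8 GEMM backend is available.")
    if requested not in available:
        # Fail loudly instead of silently falling back to another kernel.
        raise ValueError(f"FP8 GEMM backend '{requested}' is not available.")
    return requested


if __name__ == "__main__":
    args = build_parser().parse_args(["--fp8-gemm-runner-backend", "cutlass"])
    print(resolve_fp8_gemm_backend(args.fp8_gemm_runner_backend, ["cutlass", "triton"]))
```

With an explicit flag, a request for an unavailable backend raises an error rather than silently selecting a different kernel, which is the main difference from environment-variable-based dispatch.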
Related resources
No response