Migrate export_llama to new ao quantize API #8422
Labels
module: examples — Issues related to demos under examples/; module: llm — Issues related to LLM examples and apps, and to the extensions/llm/ code; triaged — This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Status: Done
🚀 The feature, motivation and pitch
`Int8DynActInt4WeightQuantizer` (used for `-qmode 8da4w`) is no longer being developed by ao and doesn't support bias. Migrate to the new `quantize_` API, which can take in `int8_dynamic_activation_int4_weight`.
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @cccclai