Support Dynamically Quantized Convolutions #9021
Closed
Labels: good first issue, module: xnnpack, triaged
Status
Done
We want initial support for dynamically quantized convolutions.
Let's start by writing a test to drive development of the feature. Add it here:
executorch/backends/xnnpack/test/ops/test_conv2d.py
Line 4 in e9c2315
Let's handle only 2d convolutions for now. For reference, take a look at how we test dqlinears in general:
executorch/backends/xnnpack/test/ops/test_linear.py
Line 327 in e9c2315
Let's add a similar test for convolutions. Since we're adding quantizer support, the test should check that a choose_q_params node is in the graph after quantization. Once the test is added, run it with
python -m unittest backends.xnnpack.test.ops.test_conv2d....
It should fail because it can't find the choose_q_params node. Let's start by enabling the quantizer to properly annotate convolutions:
executorch/backends/xnnpack/quantizer/xnnpack_quantizer.py
Line 266 in e9c2315
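As background for the choose_q_params check mentioned above: in dynamic quantization, the activation scale and zero point are computed from the input tensor at runtime rather than fixed during calibration, which is why such a node appears in the graph. The following is a minimal pure-Python sketch of that idea using standard asymmetric affine quantization. It is not the ExecuTorch or XNNPACK implementation; the function names and the int8 range defaults are assumptions chosen for illustration.

```python
def choose_qparams(values, qmin=-128, qmax=127):
    """Pick (scale, zero_point) from the observed range of `values`.

    Hypothetical helper for illustration only.
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # All values are zero; any positive scale works.
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    # Clamp the zero point into the quantized range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to clamped integers."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    """Map integers back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


# Parameters are derived from this particular input, at "runtime".
activations = [-1.0, 0.0, 0.5, 2.0]
scale, zero_point = choose_qparams(activations)
restored = dequantize(quantize(activations, scale, zero_point), scale, zero_point)
```

Because the parameters depend on each input, a different activation tensor yields a different scale and zero point, whereas the weights can be quantized ahead of time.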
Since we're starting with conv2d, we should only annotate dynamically quantized convs if they are 2d. We can add a check that len(output_padding) == 2 somewhere here:
https://github.com/pytorch/executorch/blob/main/backends/xnnpack/quantizer/xnnpack_quantizer_utils.py#L295
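The 2d-only check described above could look roughly like the sketch below. The function names and arguments are hypothetical and are not taken from xnnpack_quantizer_utils.py; the sketch assumes the convolution's output padding is available as a sequence with one entry per spatial dimension.

```python
def is_2d_convolution(output_padding):
    """Return True when the padding has one entry per dimension of a 2d conv."""
    return len(output_padding) == 2


def should_annotate_conv(is_dynamic, output_padding):
    """Hypothetical gate for annotating a convolution.

    Statically quantized convolutions are left unaffected; dynamically
    quantized ones are annotated only when they are 2d.
    """
    if not is_dynamic:
        return True
    return is_2d_convolution(output_padding)
```

For example, should_annotate_conv(True, [0]) rejects a dynamically quantized 1d convolution, while should_annotate_conv(True, [0, 0]) accepts the 2d case.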
Now that convolutions are annotated, the test's check for the choose_q_params node should pass. Next, we need to update our partitioner to allow dynamically quantized convolutions:
executorch/backends/xnnpack/partition/config/gemm_configs.py
Line 396 in e9c2315
Again, it would be nice to check in our constraints that if we detect a dynamically quantized convolution and it is 1d, we don't partition it. After this, the test should pass. There may be some lingering issues with the wiring; if so, feel free to reach out in the Discord group:
https://discord.com/channels/1334270993966825602/1336777807509979188
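The partitioner constraint described above (skip dynamically quantized convolutions that are not 2d) might be sketched as follows. ConvCandidate and can_partition are hypothetical names, not part of gemm_configs.py. The sketch assumes the dimensionality can be inferred from the weight rank, using the usual PyTorch layout where a conv1d weight has 3 dimensions and a conv2d weight has 4. Note that it rejects every non-2d case, which is slightly stricter than rejecting 1d alone.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConvCandidate:
    """Hypothetical summary of a convolution node seen by a partitioner."""
    weight_shape: Tuple[int, ...]   # (out_channels, in_channels / groups, *kernel)
    is_dynamically_quantized: bool


def spatial_dims(conv):
    """Number of spatial dimensions, inferred from the weight rank."""
    return len(conv.weight_shape) - 2


def can_partition(conv):
    """Reject dynamically quantized convolutions unless they are 2d."""
    if conv.is_dynamically_quantized and spatial_dims(conv) != 2:
        return False
    return True


conv1d_dynamic = ConvCandidate(weight_shape=(8, 4, 3), is_dynamically_quantized=True)
conv2d_dynamic = ConvCandidate(weight_shape=(8, 4, 3, 3), is_dynamically_quantized=True)
```

Here can_partition(conv1d_dynamic) is False and can_partition(conv2d_dynamic) is True, while convolutions that are not dynamically quantized pass through this particular constraint unchanged.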
cc @digantdesai @cbilgin