Adds utility to replace Q/DQ ops with torchao quantized linear ops #1967
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1967
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 74f209e with merge base 3bbf42a.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
eager_results = model(activations)

unwrap_tensor_subclass(model)
is this still needed?
Let me double check. I thought without it, the exported graph didn't decompose into the Q/DQ ops.
Removed unwrap_tensor_subclass
torchao/experimental/quant_api.py
Outdated
    return pattern, replacement


def replace_q_dq_with_torchao_quantized_linear_ops(
Can you add a bit of context to the function's docstring on when this is used?
Added context to the docstring.
ca5e688 to 74f209e
Looks good. Maybe also add some context to the summary about when we use this, and why people first use QDQLayout and then do the fusion instead of generating these ops directly with some other layout.
This utility is for export scenarios in which people quantize with Q/DQ layout, and then later want to fuse the ops.
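To illustrate the idea of fusing Q/DQ ops after export, here is a toy sketch in plain Python. It does not use torchao or torch.fx: the `Node` class, the op names (`dequantize`, `linear`, `quantized_linear`), and `fuse_dq_linear` are all hypothetical stand-ins, and the real utility in this PR matches patterns in an exported graph rather than in a list like this one.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    """One op in a toy graph (hypothetical; not a torch.fx node)."""
    op: str                  # operator name, e.g. "dequantize" or "linear"
    inputs: Tuple[str, ...]  # names of the values this op consumes
    output: str              # name of the value this op produces


def fuse_dq_linear(nodes: List[Node]) -> List[Node]:
    """Replace each dequantize -> linear pair with one fused node.

    Assumes `nodes` is in topological order and that a linear node has
    inputs (activation, weight). A dequantize node is only folded in when
    the linear op is its sole consumer.
    """
    producers: Dict[str, Node] = {n.output: n for n in nodes}

    # Count how many times each value is consumed.
    uses: Dict[str, int] = {}
    for n in nodes:
        for name in n.inputs:
            uses[name] = uses.get(name, 0) + 1

    rewritten: List[Node] = []
    dead = set()  # outputs of dequantize nodes that were folded away
    for n in nodes:
        if n.op == "linear" and len(n.inputs) == 2:
            activation, weight = n.inputs
            w_prod = producers.get(weight)
            if w_prod is not None and w_prod.op == "dequantize" and uses[weight] == 1:
                dead.add(w_prod.output)
                # The fused op reads the quantized weight data directly.
                rewritten.append(
                    Node("quantized_linear", (activation,) + w_prod.inputs, n.output)
                )
                continue
        rewritten.append(n)

    # Drop the dequantize nodes that the fused ops made redundant.
    return [n for n in rewritten if n.output not in dead]


# A graph quantized in a Q/DQ style: the weight is stored quantized and
# dequantized right before the floating-point linear op.
graph = [
    Node("dequantize", ("w_int", "scale"), "w_fp"),
    Node("linear", ("x", "w_fp"), "y"),
]
fused_graph = fuse_dq_linear(graph)
print([n.op for n in fused_graph])
```

The sketch keeps the Q/DQ form as the portable representation and treats fusion as a separate, optional pass, which mirrors the two-step flow described above.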