Quantize Weight for Gemm/Conv on Quantized Model #22969

centwang · 2024-11-28T10:20:40Z

Some quantized models wrap Conv/Gemm nodes in QDQ (QuantizeLinear/DequantizeLinear) pairs but leave the weight and/or bias as float. This PR adds a WeightBiasQuantization optimizer that quantizes the float weight to INT8 and the float bias to INT32. It applies only when the weight and/or bias is an initializer, so that ConstantFolding can fold the resulting sub-graph into real quantized initializers in the next round of graph optimization.
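As a rough illustration of the arithmetic involved, here is a minimal sketch in plain Python. It is not the code from this PR: it assumes symmetric per-tensor quantization with a zero point of 0, a weight scale derived from the largest absolute weight, and a bias scale equal to `input_scale * weight_scale`, which is a common convention for QDQ models. The function names are hypothetical, and the scheme the optimizer actually uses may differ.

```python
# Illustrative sketch only; names and the scale-selection scheme are assumptions,
# not taken from weight_bias_quantization.cc.

INT8_MIN, INT8_MAX = -128, 127
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def _saturate(value, lo, hi):
    """Clamp an integer to the representable range [lo, hi]."""
    return max(lo, min(hi, value))


def quantize_weight_int8(weights):
    """Symmetric per-tensor INT8 quantization (zero point assumed to be 0).

    Returns (quantized_values, scale).
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Python's round() rounds ties to even, the same rule ONNX QuantizeLinear uses.
    quantized = [_saturate(round(w / scale), INT8_MIN, INT8_MAX) for w in weights]
    return quantized, scale


def quantize_bias_int32(bias, input_scale, weight_scale):
    """INT32 bias quantization with scale = input_scale * weight_scale (assumed).

    Returns (quantized_values, bias_scale).
    """
    bias_scale = input_scale * weight_scale
    quantized = [_saturate(round(b / bias_scale), INT32_MIN, INT32_MAX) for b in bias]
    return quantized, bias_scale


def dequantize(quantized, scale):
    """Map integers back to floats, as DequantizeLinear does with zero point 0."""
    return [q * scale for q in quantized]
```

Dequantizing the result recovers each original value to within half a quantization step, which is what lets a Q/DQ pair on a constant initializer be folded into a quantized initializer without changing how the graph is interpreted.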

onnxruntime/core/optimizer/qdq_transformer/weight_bias_quantization.h

onnxruntime/core/optimizer/qdq_transformer/weight_bias_quantization.cc

skottmckay

adrianlizarraga

Thank you!

quantize weight

9389bbe

centwang requested review from skottmckay, adrianlizarraga and jywu-msft November 28, 2024 10:20

centwang marked this pull request as ready for review November 28, 2024 10:20

centwang added 2 commits November 29, 2024 10:44

adjust ut scale and zp

aa00373

adjust ut data

3c170c3

skottmckay reviewed Dec 4, 2024

onnxruntime/core/optimizer/qdq_transformer/weight_bias_quantization.h (outdated, resolved)

onnxruntime/core/optimizer/qdq_transformer/weight_bias_quantization.cc (outdated, resolved)

centwang added 2 commits December 4, 2024 11:31

resolve comments

d8d1156

fix warn

e5b9b40

skottmckay approved these changes Dec 4, 2024

adrianlizarraga approved these changes Jan 8, 2025

centwang merged commit ff0ab0a into main Jan 8, 2025
95 checks passed

centwang deleted the weicwang/weight_quantization branch January 8, 2025 02:00
