Fix conversion of TensorData, TensorsData to json #22166

Merged: 8 commits, Oct 7, 2024

xadupre (Member) commented Sep 20, 2024

Description

Fix write_calibration_table to support TensorData, TensorsData
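
To make the serialization problem concrete, here is a minimal sketch of one way to round-trip a range object through JSON. It is not the onnxruntime implementation: `RangeData`, `RangeEncoder`, and `decode_range` are hypothetical names, and the `"CLS"` tag is only assumed to play the same role as the class marker visible in the dumps quoted later in this thread.

```python
import json
from dataclasses import dataclass


@dataclass
class RangeData:
    """Hypothetical stand-in for a TensorData-like object."""
    lowest: float
    highest: float


class RangeEncoder(json.JSONEncoder):
    """Encode RangeData as a dict tagged with its class name."""

    def default(self, o):
        if isinstance(o, RangeData):
            return {"CLS": "RangeData", "lowest": o.lowest, "highest": o.highest}
        # Array-like values (anything exposing tolist()) become plain lists.
        if hasattr(o, "tolist"):
            return o.tolist()
        return super().default(o)


def decode_range(d):
    """object_hook: rebuild RangeData from a tagged dict, pass others through."""
    if d.get("CLS") == "RangeData":
        return RangeData(d["lowest"], d["highest"])
    return d


cache = {"flatten_473": RangeData(0.0, 10.190684)}
text = json.dumps(cache, cls=RangeEncoder)
restored = json.loads(text, object_hook=decode_range)
```

The class tag is what lets the decoder tell a serialized range apart from an ordinary dictionary when the file is read back.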

chilo-ms (Contributor) commented Oct 3, 2024

Thanks for making TensorsData and TensorData serializable.

write_calibration_table writes out three calibration files in different formats: json, txt, and flatbuffers.
The TRT EP consumes the flatbuffers file and expects each tensor entry to contain only the name and max(abs(values[0]), abs(values[1])). It should look something like this:

resnetv17_stage4_conv9_fwd 0.381193
resnetv17_stage2_relu4_fwd 2.53578352
....
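
The reduction described here can be sketched in a few lines. This is only an illustration, not the onnxruntime code: `to_trt_entries` is a hypothetical helper, and the `(lowest, highest)` tuples stand in for whatever range object the calibrator actually produces.

```python
def to_trt_entries(ranges):
    """Map {tensor_name: (lowest, highest)} to sorted 'name value' lines.

    Hypothetical helper: each tensor is reduced to a single number,
    max(abs(lowest), abs(highest)).
    """
    lines = []
    for name in sorted(ranges):
        lowest, highest = ranges[name]
        lines.append(f"{name} {max(abs(lowest), abs(highest))}")
    return lines


# Illustrative ranges only.
ranges = {
    "flatten_473": (0.0, 10.190684),
    "resnetv17_batchnorm0_fwd": (-5.5807953, 5.954951),
}
for line in to_trt_entries(ranges):
    print(line)
```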

However, it currently serializes much more information for each tensor:

flatten_473 {'lowest': array([0.], dtype=float32), 'highest': array([10.190684], dtype=float32), 'CLS': 'TensorData'}
resnetv17_batchnorm0_fwd {'lowest': array([-5.5807953], dtype=float32), 'highest': array([5.954951], dtype=float32), 'CLS': 'TensorData'}
....

Could you help add functionality that extracts only the tensor name and the absolute value required by the TRT EP?

xadupre (Member, Author) commented Oct 4, 2024

I just made a change to restore the previous format. Let me know if that's ok with you.

chilo-ms (Contributor) commented Oct 4, 2024

Thanks!
The TRT EP actually reads calibration.flatbuffers, not the plain txt file, so could you please add the code to the block, as below?

    zero = np.array(0)
    for key in sorted(calibration_cache.keys()):
        values = calibration_cache[key]
        d_values = values.to_dict()
        floats = [
            float(d_values.get("highest", zero).item()),
            float(d_values.get("lowest", zero).item()),
        ]
        value = str(max(floats))  # str(max(abs(values[0]), abs(values[1])))

        flat_key = builder.CreateString(key)
        flat_value = builder.CreateString(value)
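
One detail that may be worth checking in the snippet above: it takes `max` of the raw `highest` and `lowest` values, while its trailing comment describes `max(abs(...))`. The two agree whenever `highest >= abs(lowest)`, as in the ranges quoted earlier in this thread, but they differ for a range skewed toward negative values. The sketch below uses made-up numbers and hypothetical helper names to show the difference; whether it matters depends on what the TRT EP expects.

```python
def max_raw(lowest, highest):
    """Mirror of the snippet: maximum of the raw bounds."""
    return max(highest, lowest)


def max_abs(lowest, highest):
    """Mirror of the trailing comment: maximum absolute bound."""
    return max(abs(highest), abs(lowest))


# Range taken from the dump quoted earlier: both reductions agree.
same_raw = max_raw(-5.5807953, 5.954951)
same_abs = max_abs(-5.5807953, 5.954951)

# Made-up range skewed toward negative values: the reductions differ.
skew_raw = max_raw(-8.0, 3.0)
skew_abs = max_abs(-8.0, 3.0)

print(same_raw, same_abs)
print(skew_raw, skew_abs)
```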

xadupre (Member, Author) commented Oct 4, 2024

builder.CreateString(key)

I don't know block very well. Do you know the file I should modify?

chilo-ms (Contributor) commented Oct 4, 2024

It's here https://github.com/xadupre/onnxruntime/blob/qdq_json/onnxruntime/python/tools/quantization/quant_utils.py#L698

xadupre (Member, Author) commented Oct 4, 2024

Sorry, I did not see it; it was just above. I just pushed the changes.

jywu-msft merged commit 407c1ab into microsoft:main on Oct 7, 2024
86 checks passed
xadupre deleted the qdq_json branch on November 7, 2024 at 10:35
ishwar-raut1 pushed a commit to ishwar-raut1/onnxruntime that referenced this pull request Nov 19, 2024
### Description
Fix write_calibration_table to support TensorData, TensorsData