Dear authors, thank you for making your work open source. I looked through the earlier issues and could not find similar points raised.
I was going through the repository and saw that the results were tested with 8-bit multipliers. However, for my work I require 32-bit floating-point multipliers. As I understand it, the two changes listed below will be needed. Could you please let me know what other changes might be required? Thanks in advance!
- Removing the quantization-related functions in FakeApproxConv2D (in tf2/python/keras/layers/fake_approx_convolutional.py) so that floating-point values reach the multiplier directly (see the sketch after the code below).
- Regenerating the binary file for the multiplier by changing the operand range from 256 (2^8) to the 32-bit range (2^32), as shown below.
#include <stdio.h>
#include <stdint.h>

FILE * f = fopen("output.bin", "wb");
// 2^32 must be written as a shift: in C, `2 ^ 32` is XOR, not exponentiation
for (uint64_t a = 0; a < (1ULL << 32); a++)
    for (uint64_t b = 0; b < (1ULL << 32); b++) {
        uint64_t val = approximate_mult(a, b); // replace by your own function call
        fwrite(&val, sizeof(uint64_t), 1, f);  // a 32x32-bit product needs 64 bits
    }
fclose(f);
// note: an exhaustive 32-bit table has 2^64 entries, i.e. 2^67 bytes (~128 EiB)
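For the first point, here is a minimal, hypothetical sketch of the kind of change I have in mind, assuming FakeApproxConv2D fake-quantizes its inputs and kernel to 8 bits before the approximate lookup (the function names and quantization ranges below are illustrative, not taken from fake_approx_convolutional.py):

import tensorflow as tf

def call_8bit(inputs, kernel, approx_conv_op):
    # Hypothetical current behavior: fake-quantize activations and weights
    # to 8 bits so that every product maps onto one of the 256 x 256
    # entries of the multiplier table.
    q_inputs = tf.quantization.fake_quant_with_min_max_args(
        inputs, min=-6.0, max=6.0, num_bits=8)
    q_kernel = tf.quantization.fake_quant_with_min_max_args(
        kernel, min=-1.0, max=1.0, num_bits=8)
    return approx_conv_op(q_inputs, q_kernel)

def call_fp32(inputs, kernel, approx_conv_op):
    # Proposed floating-point variant: skip the fake quantization and pass
    # the raw float32 tensors straight to the approximate convolution.
    return approx_conv_op(inputs, kernel)

Whether removing the quantization alone is sufficient presumably also depends on how the underlying custom op indexes the lookup table, so corrections are welcome.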