-
Notifications
You must be signed in to change notification settings - Fork 837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: improve hexToBase64 #3178
perf: improve hexToBase64 #3178
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3178 +/- ##
==========================================
- Coverage 93.18% 92.15% -1.04%
==========================================
Files 196 87 -109
Lines 6431 2497 -3934
Branches 1359 547 -812
==========================================
- Hits 5993 2301 -3692
+ Misses 438 196 -242
|
How did you generate the above timings? |
The old implementation doesn't make any sense to me. Why did it bother doing the hex to ascii conversion before Buffer.from? I know probably nobody here has an answer for this but I haven't looked at this code in so long it is just making me scratch my head a bit why that choice was made. |
const hexPair = hexStr.substring(i, i + 2); | ||
const hexVal = parseInt(hexPair, 16); | ||
hexAsciiCharsStr += String.fromCharCode(hexVal); | ||
const buf = Buffer.alloc(hexStr.length / 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alloc is probably the most expensive part of this function at this point. Preallocating common lengths likely results in a significant speedup
const buf8 = Buffer.alloc(8)
const buf16 = Buffer.alloc(16)
function hexToBase64(hexStr) {
let buf;
if (hexStr.length === 16) buf = buf8;
else if (hexStr.length === 32) buf = buf16;
else buf = Buffer.alloc(hexStr.length / 2);
let offset = 0;
for (let i = 0; i < hexStr.length; i += 2) {
const hi = intValue(hexStr.charCodeAt(i));
const lo = intValue(hexStr.charCodeAt(i + 1));
buf.writeUInt8((hi << 4) | lo, offset++);
}
return buf.toString('base64');
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, added in 95966fa
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used node-microbenchmark and the speedup is definitely real
Iterations: 375
+-------------------------+--------------------+----------------------+-----------+
| Name | nano seconds | milli seconds | Slower by |
+-------------------------+--------------------+----------------------+-----------+
| testHexToBase64Prealloc | 8349.088 | 0.008349088 | fastest |
| testHexToBase64 | 12648.157333333333 | 0.012648157333333333 | 1x |
+-------------------------+--------------------+----------------------+-----------+
Measured each of these calls with @legendecas did additional benchmarking in one of the previous resolved comments |
I have no idea, it seems to have slipped in as part of a larger PR 🤔 |
Not sure what's up with the codecov %, possible to rerun? |
our code coverage has been weird recently. I can see which paths are tested here so i'm not worried about it |
Which problem is this PR solving?
Improve
hexToBase64
which is used in OTLP exporters to convert span and trace IDs. The previous implementation created a lot of garbage which had to be cleaned up by the GC. Besides the GC overhead, the old version was slow (see the table).Why not use
Buffer.from(hexStr, 'hex')
instead? It turned out the handwritten one is faster (500k random inputs, Node 16.15):Buffer.from(str, 'hex').toString('base64')
Checklist: