refactor(record): prefer arrays over sets for lookups of blocked keys #68
Conversation
Thank you for your contribution. Yes, we can change the `Set` to an `Array`.
I prefer the way it was before. If something is in a condition, it is quite clear in which case it is executed.
Do you have proof of this? I suspect it doesn't matter for such a small number of entries.
Fair enough. This will be reverted ASAP.
I'll cite one article in particular out of many, but this is apparently a well-known performance pitfall. I'll also go ahead and cite ThePrimeagen's streams, though of course I can't scroll through all the Twitch VODs right now. Would you like to proceed with this refactor anyway?
I am not sure yet. I understand the performance advantage. However, I wonder if this approach is faster across the board or only after a certain number of object/array entries.
Performance aside for a moment, I must say that I do prefer the old construct semantically, so at least for the semantics and aesthetics, I can argue for keeping it. Performance-wise, however, I can't say I'm surprised by the results in many benchmarks.
There is some discussion regarding the same here: https://twitter.com/tmikov/status/1688271863087280128
Before we revise the loops, I would prefer a benchmark that demonstrates a performance difference for each schema. For example, I'm not sure if every schema benefits from this change.
Hello! I am glad to report that after some benchmarking, I have observed considerable improvements in the average execution time.

Methodology

I created an untracked file named `bench.mjs`:

```js
import { getRandomValues } from 'node:crypto';
import { array, record, object, number, string, date } from './dist/index.js';

// Generate a random Base64 string from 21 random bytes.
function randString() {
  const bytes = getRandomValues(new Uint8Array(21));
  return Buffer.from(bytes).toString('base64');
}

// Build a dictionary of n random string keys mapped to random numbers.
function randDict(n) {
  const dict = {};
  for (let i = 0; i < n; ++i)
    dict[randString()] = Math.random();
  return dict;
}

// Wrap a dictionary in an object that matches the schema below.
function randObj(dict) {
  return {
    test: Math.random(),
    name: randString(),
    expires: new Date(),
    dict,
  };
}

// Generate an array of arrLen random objects, each with a dictLen-entry dict.
function randArr(arrLen, dictLen) {
  const objs = Array(arrLen);
  for (let i = 0; i < arrLen; ++i)
    objs[i] = randObj(randDict(dictLen));
  return objs;
}

// The valibot schema describing the generated objects.
const ArrayOfObjects = array(object({
  test: number(),
  name: string(),
  expires: date(),
  dict: record(string(), number()),
}));

// Pre-allocate 16K randomly generated objects so that
// we don't incur the overhead during the benchmarking.
const objs = randArr(2 ** 14, 16);
// Warm up the optimization
ArrayOfObjects.parse(objs);
// Do the actual benchmark
const start = performance.now();
ArrayOfObjects.parse(objs);
const end = performance.now();
// Print the results
console.log(end - start);
```

I then ran the following PowerShell script to run the benchmark ten times on each branch.

```powershell
cd library
# Before PR
git switch main
pnpm build
for ($i = 0; $i -lt 10; ++$i) { node bench.mjs }
# After PR
git switch prefer-array
pnpm build
for ($i = 0; $i -lt 10; ++$i) { node bench.mjs }
```

Results
All of the results below are in milliseconds.
We thus have the following average execution times in milliseconds:
As we can see from the results, we are indeed about 16 milliseconds faster for 16K objects! That's effectively one frame in a 60 FPS render loop. Of course, in the real world we wouldn't really spawn 16K objects in an array all at once, but this at least gives us an idea of the performance improvement.
Thank you for your research. I plan to merge the PR. Purely out of interest, I would like to see if there is also a difference when only 100 or 1,000 objects are used instead of 16K. In a comparison between Valibot and Zod, Valibot was faster up to about 10K iterations of a loop, and Zod was faster beyond that. Therefore, I would be interested to see how this benchmark changes with fewer objects.
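For reference, one way to check this would be a small variation of the `bench.mjs` sketch above that loops over several array lengths in a single run. The sizes below are just examples, and the snippet assumes the helpers and schema from that file are in scope.

```js
// Sketch only: assumes randArr and ArrayOfObjects from bench.mjs above are
// in scope, e.g. by appending this in place of the single 16K run.
for (const arrLen of [100, 1_000, 10_000, 2 ** 14]) {
  const objs = randArr(arrLen, 16);
  // Warm up the optimization before measuring.
  ArrayOfObjects.parse(objs);
  const start = performance.now();
  ArrayOfObjects.parse(objs);
  console.log(`${arrLen} objects:`, performance.now() - start, 'ms');
}
```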
I'll be completely honest with you. I did not expect the benchmarks to be this much better across the board. 😅
I find the 1024 case to be the most fascinating because the range from 128 to 1024 sounds like a realistic figure in production (unlike the rather inflated 16K case). It seems that we get the biggest wins here, or at least I think this is the point where the optimizing compiler really shows its true colors.
Going from `Set` to `Array`

Following #67, this PR swaps the container of `BLOCKED_KEYS` from a `Set` into an `Array`. This is because although `Set#has` is $O(1)$, the constant coefficient for this is actually quite expensive due to underlying hashing algorithms. In comparison, the $O(n)$ function `Array#includes` should be faster for small arrays for the following reasons:

- `Array#includes` is a built-in method which does not perform any hashing.
- In the case of `BLOCKED_KEYS`, three elements are certainly sufficiently small to justify the linear search.

See the discussion in #67 (comment) for more details and some benchmarks.
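For illustration only, a standalone micro-benchmark in the spirit of this argument might look like the sketch below. The blocked keys listed here are an assumption, and the iteration count and sample keys are made up for the example.

```js
// Minimal sketch: membership checks against a three-element blocklist,
// comparing Array#includes with Set#has.
const BLOCKED_ARRAY = ['__proto__', 'prototype', 'constructor'];
const BLOCKED_SET = new Set(BLOCKED_ARRAY);

// A mix of blocked and regular keys to look up.
const keys = ['__proto__', 'name', 'constructor', 'value', 'prototype', 'id'];

function bench(label, isBlocked) {
  let hits = 0;
  const start = performance.now();
  for (let i = 0; i < 1_000_000; ++i) {
    // Rotate through the sample keys so the lookup is not constant-folded away.
    if (isBlocked(keys[i % keys.length])) ++hits;
  }
  console.log(label, performance.now() - start, 'ms', hits);
}

bench('Array#includes', (key) => BLOCKED_ARRAY.includes(key));
bench('Set#has', (key) => BLOCKED_SET.has(key));
```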
Short-circuiting

Along the way, I also inverted the logic in the `forEach` loops to accommodate short-circuiting and fewer indentations. Moreover, I converted a `forEach` loop in `record.ts` to a `for...of` loop because the latter is generally considered to be faster.
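As a rough illustration of that change (a sketch only, with made-up names rather than the actual `record.ts` code), the pattern looks roughly like this:

```js
// Illustrative only: not the actual record.ts implementation.
const BLOCKED_KEYS = ['__proto__', 'prototype', 'constructor'];
const input = { name: 'Jane', age: 42, constructor: 'spoofed' };

// Before (sketch): the check is nested inside the forEach callback.
Object.keys(input).forEach((key) => {
  if (!BLOCKED_KEYS.includes(key)) {
    console.log('validating', key); // parse the key and its value here
  }
});

// After (sketch): a for...of loop with the condition inverted, so blocked
// keys short-circuit early and the happy path stays un-indented.
for (const key of Object.keys(input)) {
  if (BLOCKED_KEYS.includes(key)) continue;
  console.log('validating', key); // parse the key and its value here
}
```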