
Multithreading benchmark #54

Answered by xhluca
raphaelsty asked this question in Q&A

There's a built-in multithreaded approach: set n_threads=n, where n is the desired number of threads. It uses Python's built-in multiprocessing library, so you don't need joblib/delayed.
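As a rough picture of what a thread-count setting like this does, the sketch below spreads a batch of queries over a pool of workers. It is not the library's code: `CORPUS`, `score_query`, `retrieve_batch`, and the toy token-overlap scoring are hypothetical stand-ins, and `multiprocessing.pool.ThreadPool` is used only as one standard-library way to get a worker pool.

```python
from multiprocessing.pool import ThreadPool

# Hypothetical toy corpus; a real BM25 index would replace this.
CORPUS = ["a cat sat", "a dog ran", "the cat ran"]


def score_query(query):
    """Return the index of the document sharing the most tokens with the query."""
    q_tokens = set(query.split())
    overlaps = [len(q_tokens & set(doc.split())) for doc in CORPUS]
    # On ties, max() keeps the first (lowest-index) document.
    return max(range(len(CORPUS)), key=overlaps.__getitem__)


def retrieve_batch(queries, n_threads=1):
    """Score every query, optionally spreading the work over n_threads workers."""
    if n_threads <= 1:
        return [score_query(q) for q in queries]
    with ThreadPool(n_threads) as pool:
        # map() preserves input order, so results line up with queries.
        return pool.map(score_query, queries)
```

In CPython, threads running pure-Python code share the GIL, so a sketch like this only speeds up when the per-query work releases it (as many NumPy operations do).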

However, the default doesn't seem to scale well, especially beyond 2 threads. You can see results for 4 threads here: https://github.com/xhluca/bm25-benchmarks
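When reading results like these, one common way to put a number on "scaling" is speedup (single-thread time divided by multi-thread time) and parallel efficiency (speedup divided by thread count). The helper names and the timings below are placeholders for illustration, not measurements from the benchmark repo.

```python
def speedup(t_single, t_parallel):
    """How many times faster the parallel run is than the single-thread run."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_threads):
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup(t_single, t_parallel) / n_threads


# Placeholder timings in seconds (hypothetical, not real benchmark numbers).
t1, t4 = 10.0, 5.0
print(speedup(t1, t4))        # 2.0
print(efficiency(t1, t4, 4))  # 0.5
```

With these placeholder numbers, 4 threads give only a 2x speedup, i.e. half of what perfect scaling would deliver.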

The new numba JIT backend that will be added in v0.2.0 should be better for multithreading, since it lets numba (i.e. LLVM) handle the parallel processing (via prange) rather than going through Python's multiprocessing. However, it might be bottlenecked by memory access: at 4 threads it doesn't reach 100% efficiency (more like 50-60%, roughly).
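For readers who haven't used prange, here is a minimal sketch of the pattern: a loop over queries marked with `prange` inside an `@njit(parallel=True)` function, so numba can distribute iterations across threads. The function `score_all` and its dense dot-product scoring are hypothetical stand-ins, not BM25 and not the v0.2.0 backend; the import fallback is only there so the sketch still runs (serially) without numba installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # Fallback: run as plain Python if numba is missing.
    prange = range

    def njit(*args, **kwargs):
        def wrap(fn):
            return fn
        return wrap


@njit(parallel=True)
def score_all(query_weights, doc_matrix):
    """Dot every query row against every document row.

    query_weights has shape (n_queries, n_terms);
    doc_matrix has shape (n_docs, n_terms).
    """
    n_queries = query_weights.shape[0]
    n_docs = doc_matrix.shape[0]
    n_terms = doc_matrix.shape[1]
    scores = np.zeros((n_queries, n_docs))
    for i in prange(n_queries):  # With numba, iterations run across threads.
        for j in range(n_docs):
            total = 0.0
            for t in range(n_terms):
                total += query_weights[i, t] * doc_matrix[j, t]
            scores[i, j] = total
    return scores
```

Each query iteration writes only to its own row of `scores`, which is what makes the outer loop safe to parallelize. Every iteration also reads the whole of `doc_matrix`, so in a loop shaped like this the threads compete for memory bandwidth as well as compute.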

Answer selected by raphaelsty