opened on May 15, 2024
As far as I can tell, the latency test runs one query at a time, while the throughput test sends in many queries concurrently, to the extent that OMP threads are available. I think this code confirms that:
In benchmark.hpp, function run_main:
if (metric_objective == Objective::LATENCY) {
    if (threads[0] != 1 || threads[1] != 1) {
        log_warn("Latency mode enabled. Overriding threads arg, running with single thread.");
        threads = {1, 1};
    }
}
What confuses me is hnswlib_wrapper.h, where a thread pool is created for the latency test and search calls are submitted to that pool:
// Create a pool if multiple query threads have been set and the pool hasn't been created already
bool create_pool = (metric_objective_ == Objective::LATENCY && num_threads_ > 1 && !thread_pool_);
if (create_pool) { thread_pool_ = std::make_unique(num_threads_); }
....
if (metric_objective_ == Objective::LATENCY && num_threads_ > 1) { thread_pool_->submit(f, batch_size); }
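To make the confusion concrete, here is a minimal sketch of how I read the two snippets together. It is not the actual benchmark code: the function names apply_latency_override and would_use_pool are hypothetical, and it assumes that num_threads_ in the wrapper ends up being taken from the overridden threads value, which I have not verified.

```cpp
#include <array>

// Stand-in for the benchmark's objective enum (assumed shape, illustration only).
enum class Objective { LATENCY, THROUGHPUT };

// Hypothetical helper mirroring the override in run_main:
// in latency mode, both thread counts are forced to 1.
std::array<int, 2> apply_latency_override(Objective objective, std::array<int, 2> threads) {
    if (objective == Objective::LATENCY && (threads[0] != 1 || threads[1] != 1)) {
        return {1, 1};
    }
    return threads;
}

// Hypothetical helper mirroring the wrapper's condition for submitting to the pool
// (the "pool not created yet" check is left out).
bool would_use_pool(Objective objective, int num_threads) {
    return objective == Objective::LATENCY && num_threads > 1;
}
```

If that assumption holds, would_use_pool(Objective::LATENCY, apply_latency_override(Objective::LATENCY, {8, 8})[1]) is false, so the pool branch in the wrapper would never be reached through run_main.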
Am I misunderstanding this? Thanks in advance!