“nothreads” argument explanation #138

Open
nadavaviv opened this issue Aug 17, 2020 · 1 comment

Comments

@nadavaviv

I've been fiddling for a bit with the fuse package with python3.

When trying to create a FUSE instance, I came across the nothreads argument.

Can anyone please elaborate on what this does?

My guess is that setting this flag to True disables multithreading, but I would like to know how it changes the software's behaviour: what does the flow look like with and without it set to True?

Thanks

@mxmlnkn

mxmlnkn commented Aug 29, 2024

This corresponds to the -s option given to fuse_main in libfuse. It switches between two event loops: either fuse_loop_mt or fuse_loop is started, and each of them lives in its own source file. fuse_loop starts a pretty simple event loop such as this:

while ( !fuse_session_exited() ) {
    fuse_session_receive_buf();
    fuse_session_process_buf();
}

This high-level API is implemented on top of libfuse's low-level API calls. The loop receives FUSE messages from the kernel via a communication file descriptor, which can be queried with fuse_chan_fd (I was only able to find online documentation for this for OpenBSD, not Linux), called from libfuse/lib/fuse_kern_chan.c. It then forwards each message to the callback interface, i.e., it calls one of the specified callbacks such as getattr or readdir. All of this happens on a single main thread.
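
To make the single-threaded flow concrete, here is a minimal Python sketch of such a loop. It is not libfuse or fusepy code: the queue stands in for the kernel communication file descriptor, the `None` sentinel stands in for fuse_session_exited, and the names `run_single_threaded_loop`, `messages`, and `callbacks` are hypothetical.

```python
import queue


def run_single_threaded_loop(messages, callbacks):
    """Handle (opcode, path) messages strictly one after another."""
    results = []
    while True:
        message = messages.get()  # stands in for fuse_session_receive_buf
        if message is None:  # stands in for fuse_session_exited
            break
        opcode, path = message
        # Stands in for fuse_session_process_buf: dispatch to one callback.
        results.append(callbacks[opcode](path))
    return results


# Hypothetical callbacks; a real filesystem would return actual metadata.
callbacks = {
    "getattr": lambda path: {"path": path, "st_size": 0},
    "readdir": lambda path: [".", ".."],
}

messages = queue.Queue()
for item in [("getattr", "/a"), ("readdir", "/"), None]:
    messages.put(item)

results = run_single_threaded_loop(messages, callbacks)
```

Because only one thread ever calls into `callbacks`, a slow callback delays every request queued behind it, but the callbacks need no locking.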

The multi-threaded version fuse_loop_mt looks like this:

fuse_start_cleanup_thread();

/* fuse_session_loop_mt */
fuse_loop_start_thread();  // basically starts thread that runs fuse_do_work
while ( !fuse_session_exited() ) {
    /* wait */
}

/* Cancel and destroy threads. */
for (w = mt.main.next; w != &mt.main; w = w->next)
    pthread_cancel(w->thread_id);
while (mt.main.next != &mt.main)
    fuse_join_worker(&mt, mt.main.next);

fuse_stop_cleanup_thread();

So yeah, there seems to be:

  • a main thread, which only waits for the worker threads to finish
  • the worker threads, which run fuse_do_work and which can dynamically start more worker threads running the same fuse_do_work
  • a cleanup thread that runs fuse_prune_nodes in a loop delimited by a sleep with a duration as returned by fuse_clean_cache.
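
The cleanup thread from the last bullet can be sketched roughly like this in Python. This is only an illustration under assumptions: `cleanup_loop` and `clean_cache` are hypothetical names, `clean_cache` plays the combined role of pruning and returning the next sleep duration, and a `threading.Event` replaces the thread cancellation that libfuse uses.

```python
import threading


def cleanup_loop(stop_event, clean_cache):
    """Repeatedly prune, then sleep for the duration the pruning step returns."""
    while not stop_event.is_set():
        sleep_seconds = clean_cache()
        # wait() returns early once stop_event is set, so shutdown is prompt.
        stop_event.wait(sleep_seconds)


prune_calls = []
stop_event = threading.Event()


def clean_cache():
    """Hypothetical pruning step; asks to stop after three rounds."""
    prune_calls.append(len(prune_calls))
    if len(prune_calls) >= 3:
        stop_event.set()
    return 0.01  # seconds until the next pruning round


cleanup_thread = threading.Thread(target=cleanup_loop, args=(stop_event, clean_cache))
cleanup_thread.start()
cleanup_thread.join()
```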

The worker threads have an event loop like this:

while ( !fuse_session_exited() ) {
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
    res = fuse_session_receive_buf(mt->se, &fbuf, &ch);
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);

    /* Start more worker threads on demand. */
    if (mt->numavail == 0)
        fuse_loop_start_thread(mt);

    fuse_session_process_buf();

    /* Delete worker on demand. */
    if (mt->numavail > 10) {
        list_del_worker(w);
        mt->numavail--;
        mt->numworker--;
    }
}

So yeah, it is the basic loop that we already saw in the single-threaded fuse_loop, except that it can dynamically start and stop more worker threads that run the same event loop. I'm not entirely sure how the serialization / locking works for reading from the FUSE kernel communication file descriptor, but, in the end, multiple threads read FUSE messages and call the user-given callbacks such as getattr and readdir in parallel. This means that every user-given callback must be thread-safe, but it also means that costly processing can run in parallel. For example, if getattr requests for 16 different file paths arrive on 16 different threads, each of them can query a read-only SQLite database for filesystem metadata at the same time.
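
A rough Python sketch of what "callbacks must be thread-safe" means in practice: several worker threads pull requests from a shared queue and call the same callback object, so any shared state inside it needs a lock. This is not fusepy code. `CountingOperations`, `worker`, and `run_workers` are hypothetical names, and the sketch uses a fixed number of workers rather than the dynamic start and stop logic described above.

```python
import queue
import threading


class CountingOperations:
    """Hypothetical callback object with shared state guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.calls = 0

    def getattr(self, path):
        # Without the lock, concurrent increments could be lost.
        with self._lock:
            self.calls += 1
        return {"path": path}


def worker(messages, operations):
    """Same loop in every worker: receive one request, call the callback."""
    while True:
        path = messages.get()
        if path is None:  # sentinel: no more requests for this worker
            break
        operations.getattr(path)


def run_workers(paths, operations, thread_count=4):
    messages = queue.Queue()
    for path in paths:
        messages.put(path)
    for _ in range(thread_count):
        messages.put(None)  # one sentinel per worker
    threads = [
        threading.Thread(target=worker, args=(messages, operations))
        for _ in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return operations.calls


operations = CountingOperations()
total_calls = run_workers([f"/file{i}" for i in range(1000)], operations)
```

With a single-threaded loop the lock would be unnecessary, which is the practical trade-off behind the nothreads flag.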

Note that fusepy automatically adds -s to force single-threaded code, probably because multi-threading does not bring any out-of-the-box performance benefit in Python anyway due to the global interpreter lock (GIL).
