Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Background jobs #11696

Draft
wants to merge 23 commits into
base: main
Choose a base branch
from
Draft

Background jobs #11696

wants to merge 23 commits into from

Conversation

IanManske
Copy link
Member

@IanManske IanManske commented Jan 31, 2024

Description

This PR adds support for simple, single command background jobs. Background jobs are started using the job start command. This takes an external command and its arguments (just like run-external) and runs the command in the background. The current list of jobs can be shown using job list. The output from this is TBD, but it currently shows the job ids, pids, command, and status (running or completed). For now, completed jobs are removed from the list after calling job list.

Technical Details

In non-interactive mode, jobs inherit stdin, stdout, and stderr and are can write to them without synchronization.

But, when the shell is in interactive mode, background jobs are started disconnected from the terminal: their stdin, stdout, and stderr are piped rather than inherited. This is to prevent background jobs from directly printing to the terminal when editing the prompt in the repl. This is a problem, since reedline puts the terminal in raw mode while editing, which means printed new lines will not return the cursor to the start of the line. So, you can get output like this:

nushell> some textbackground output 1
                                     background output 2
                                                        background output 3
nushell> some text

With the reedline external printer, the output should much nicer like this:

background output 1
background output 2
background output 3
nushell> some text

Additionally, the external printer makes sure that there are no intermingled or partial lines. If multiple background jobs print at the same time, each line is added atomically (this has a slight performance penalty, but it should hopefully be reasonable considering that stdout is locked behind a mutex in the Rust standard library).

The downside to this is that background jobs cannot be directly attached to the terminal. Some programs behave differently depending on whether they detect that stdin/stdout is a terminal. For example, nushell itself will fail:

> job start nu
Error:   × Nushell launched as a REPL, but STDIN is not a TTY; either launch in a
   valid terminal or provide arguments to invoke a script!

The --inherit / interactive flag can be used to force background jobs to use the terminal with weird effects:

> job start -i nu

On unix, this child nushell will stop itself since it is not in control of the terminal. There is currently no way to move background jobs to the foreground, so this job will be forever stopped unless it is killed or the parent nushell exits. Also, output from any jobs launched with the -i flag will create messed up new lines as described above.

User-Facing Changes

Adds the job start and job list commands.

Tests + Formatting

TODO

After Submitting

TODO

@IanManske
Copy link
Member Author

IanManske commented Jan 31, 2024

This PR makes use of reedline's external printer feature. As such, there is a companion PR there to flush out that feature and mark it as no longer experimental. Also, to (hopefully) make it easier for others to test this PR, there currently is a temporary commit which will make cargo use a local version of reedline. That is, it will use the reedline folder located in the parent directory of one's local nushell repo.

Currently, there is no way to redirect stdout and stderr for a background command, but I plan to add support for redirection (o>, e>, o+e>).

Other things to work out:

  • How should job status be reported?
    • Currently, job list reports jobs as either running or completed, but completed jobs have additional information (exit code, signal number, core dumped or not -- see JobExitStatus).
    • Should messages be printed as jobs finish/complete (in interactive mode only)?
    • Should completed jobs be removed after running job list, or should there be an explicit prune flag or something else.
  • Should this PR also include a job kill command?
  • Are background jobs killed on shell exit?
    I say yes, since otherwise there may be nothing reading the output from the jobs, so they may get stuck when writing to stdout/stderr. We can add another command in the future to "disown" jobs from the shell.
  • Since this PR needs integration with reedline, I added it as a dependency to nu-system. Does this seem ok?

@fdncred
Copy link
Collaborator

fdncred commented Jan 31, 2024

Excited for this 💥 !!!

@WindSoilder
Copy link
Collaborator

WindSoilder commented Jan 31, 2024

Good job! I want to have something like this for a long time!


Regarding to your questions:

Currently, job list reports jobs as either running or completed, but completed jobs have additional information (exit code, signal number, core dumped or not -- see JobExitStatus).

I think it's good to report additional information in a record.

Should messages be printed as jobs finish/complete (in interactive mode only)?

Yup, I think once we press enter, messages can be printed, it just shows something is finished, and user can get more information from job list.

Should completed jobs be removed after running job list, or should there be an explicit prune flag or something else.

Refer to showing completed jobs status, I think we don't need to remove it. Personally I'd prefer an explicit prune flag, or something like job clean command.

Should this PR also include a job kill command?

We can iterate on it. Just curious, does job kill works like sending a 'SIGKILL` or something else to another job?

Are background jobs killed on shell exit?
I say yes, since otherwise there may be nothing reading the output from the jobs, so they may get stuck when writing to stdout/stderr. We can add another command in the future to "disown" jobs from the shell.

I'd say yes and no. Generally I'm agree with you, not really sure is there a way to auto "disown" jobs when it starts? (like &! in zsh)

Since this PR needs integration with reedline, I added it as a dependency to nu-system. Does this seem ok?

Yup :)

@IanManske
Copy link
Member Author

@WindSoilder thanks for the feedback!

I think it's good to report additional information in a record.

Makes sense, will work on that!

Yup, I think once we press enter, messages can be printed, it just shows something is finished, and user can get more information from job list.

Sounds good, will be added :)

Refer to showing completed jobs status, I think we don't need to remove it. Personally I'd prefer an explicit prune flag, or something like job clean command.

I was thinking along those lines was well. I think the job clean command you suggested will be better, as job list --prune will still output the list which might be unnecessary.

We can iterate on it. Just curious, does job kill works like sending a 'SIGKILL` or something else to another job?

Yep, exactly. I was thinking it would send SIGTERM by default but also have the option to send SIGKILL through a --force flag or something similar. Without this command users would have to do job list | where id == $id | first | get pid | kill $in, which seems a little verbose.

I'd say yes and no. Generally I'm agree with you, not really sure is there a way to auto "disown" jobs when it starts? (like &! in zsh)

Yeah, in the future there should be a way to start a job disowned. I'm thinking that a --disown flag to job start should work.

My original comment was slightly inaccurate for unix. If the shell exits, so nothing is reading stdout/stderr for a background job, then the job will receive a SIGPIPE when it tries to call write. This will most likely cause the job to exit anyways. So, for consistency with other jobs that are redirected and for consistency with Windows, I still think all jobs should be terminated. This is for interactive mode only (repl); in non-interactive mode (script) I'm not sure what should happen.

@devyn
Copy link
Contributor

devyn commented Apr 6, 2024

Finally had a chance to look at this. It seems much more tied to external command management than I was thinking, which kind of makes my idea of using closures probably not work so well.

I think being able to find the PID of jobs and cancel them is probably more important than supporting background jobs written in nu, so I'm in support of this, unless it's reasonable to try to do both.

@IanManske
Copy link
Member Author

The original plan was to later generalize this to closures instead of only allowing a single external command. Since the external printer with reedline seems to be a little tricky to do, I'm leaning towards having background jobs always detached from the terminal, and then to have some way to view the job output log so far. With this approach, I think it shouldn't be too hard to implement closures from the start (and perhaps only closures).

@nome
Copy link
Contributor

nome commented Sep 19, 2024

Very cool 😎

Regarding job kill: This would definitely be needed. At least on Unix, it should send SIGTERM/SIGKILL to the entire process group of the job, not just the pg leader as kill would.

@QORTEC
Copy link

QORTEC commented Nov 2, 2024

The currently discussed implementation of jobs seems to be focused on interactive use.

Currently, job list reports jobs as either running or completed, but completed jobs have additional information (exit code, signal number, core dumped or not -- see JobExitStatus).

From a scripting viewpoint, it would be helpful if the output of the job were also stored for later retrieval.


Additionally, I have some suggestions that may extend beyond the current scope:

  • add --name and --group flags
  • replace job list with job status
  • add job queue
  • add job wait

The --name and --group flags would simplify direct interaction with a specific job or group of jobs, for example:

  • job kill name_or_group
  • job status name_or_group
  • job wait name_or_group

Implementing job queue would be useful for queuing parallel jobs that don’t require immediate results.

job wait would prevent code that depends on completed jobs from executing before the job actually completes. 🏁

The queue and wait combination is effectively superior to par-each, as it allows running background jobs while the main thread continues, only blocking when job results are required.

Syntactically, status feels appropriate, as it conveys the state of the job(s) (running or completed, with results).


Additional thoughts:

Unless I'm missing something, it should be relatively straightforward to manually implement queue and wait in nu if we have --name and --group.

The purpose of the queue is to limit background jobs to a certain number of threads:

  • Should jobs count toward the number if manually started?
  • Does this hold true both before and after the queue starts?

@132ikl 132ikl mentioned this pull request Jan 21, 2025
6 tasks
@GrabbenD
Copy link

GrabbenD commented Jan 22, 2025

If multiple background jobs print at the same time, each line is added atomically (this has a slight performance penalty, but it should hopefully be reasonable considering that stdout is locked behind a mutex in the Rust standard library).

gnu/screen takes consistently a lot longer time to compile GCC than in tmux, presumably due to slow std (even Gentoo recommends silent builds to avoid this bottleneck). Notably, I've confirmed that pseudo-TTY (pty) in podman run isn't affected by this issue even if watching podman logs (meaning compilation is launched directly from --entrypoint without tmux nor screen).

Wouldn't this approach in NuShell create the same bottleneck as in gnu/screen for verbose compilation times?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants