pipeline-manager: logs endpoint #2500

snkas · 2024-09-16T18:24:01Z

Summary
Adds a new API endpoint for following the logs of a pipeline. The logs are supplied by the runner, which invokes and manages the instance (e.g., a process) running the pipeline executable. The runner retrieves the logs from the running pipeline instance and stores them in an internal circular buffer bounded by both byte size and number of lines.
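As a rough illustration of a buffer bounded by both byte size and line count, here is a minimal std-only sketch. It is not the code in logs_buffer.rs: the struct layout, the `size_limit_num_lines` field, the accessor methods, and the choice to clear the buffer when a single line exceeds the byte limit are all assumptions. Only the `append` signature, `size_limit_byte`, and the `num_discarded_lines` accounting mirror the diff excerpt quoted in the review.

```rust
use std::collections::VecDeque;

/// Hypothetical log buffer bounded by total bytes and by number of lines.
pub struct LogsBuffer {
    size_limit_byte: usize,
    size_limit_num_lines: usize, // assumed name, not taken from the PR
    size_byte: usize,
    buffer: VecDeque<String>,
    num_discarded_lines: usize,
}

impl LogsBuffer {
    pub fn new(size_limit_byte: usize, size_limit_num_lines: usize) -> Self {
        Self {
            size_limit_byte,
            size_limit_num_lines,
            size_byte: 0,
            buffer: VecDeque::new(),
            num_discarded_lines: 0,
        }
    }

    /// Appends a line, evicting the oldest lines until both limits hold.
    pub fn append(&mut self, line: String) {
        if line.len() > self.size_limit_byte || self.size_limit_num_lines == 0 {
            // The line can never fit: count everything buffered plus the
            // new line as discarded (assumed: the buffer is then emptied).
            self.num_discarded_lines += self.buffer.len() + 1;
            self.buffer.clear();
            self.size_byte = 0;
            return;
        }
        while self.size_byte + line.len() > self.size_limit_byte
            || self.buffer.len() >= self.size_limit_num_lines
        {
            // Cannot be empty here: an empty buffer satisfies both limits.
            let oldest = self.buffer.pop_front().expect("buffer is non-empty");
            self.size_byte -= oldest.len();
            self.num_discarded_lines += 1;
        }
        self.size_byte += line.len();
        self.buffer.push_back(line);
    }

    /// Buffered lines, oldest first.
    pub fn lines(&self) -> Vec<&str> {
        self.buffer.iter().map(|line| line.as_str()).collect()
    }

    pub fn num_discarded_lines(&self) -> usize {
        self.num_discarded_lines
    }
}
```

Keeping a running `size_byte` total avoids re-summing line lengths on every append, at the cost of one extra field to keep in sync.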

Each runner holds the receiver of a channel, and the runner logs endpoint can look up the corresponding sender. Over this channel, the runner logs endpoint asks the runner to add it as a logs follower by sending a sender for which the endpoint holds the receiver. The runner first catches the new follower up by sending the entire content of the internal buffer; afterwards, every time a new line is added, the runner also sends it to all followers. If the logs endpoint disconnects, sending on the sender given to the runner will fail and the runner removes it. The runner can also lose its connection to the pipeline instance, in which case it sends a last line and then drops all follower senders, which causes the logs endpoint receivers to error and end the stream.
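The follower handshake described above can be sketched with standard library channels. This is only a model of the protocol, assuming `std::sync::mpsc` in place of the async channels used in the PR and plain `String` lines in place of `LogMessage`; the `Runner` type and all method names are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical runner-side state for the follow protocol.
pub struct Runner {
    buffered_lines: Vec<String>,
    followers: Vec<Sender<String>>,
    follow_requests: Receiver<Sender<String>>,
}

impl Runner {
    /// Returns the runner and the request sender a logs endpoint would look up.
    pub fn new() -> (Self, Sender<Sender<String>>) {
        let (request_sender, follow_requests) = channel();
        let runner = Self {
            buffered_lines: Vec::new(),
            followers: Vec::new(),
            follow_requests,
        };
        (runner, request_sender)
    }

    /// Accepts pending follow requests and catches each new follower up
    /// by sending it (and only it) the entire buffer content.
    pub fn accept_followers(&mut self) {
        while let Ok(follower) = self.follow_requests.try_recv() {
            let caught_up = self
                .buffered_lines
                .iter()
                .all(|line| follower.send(line.clone()).is_ok());
            if caught_up {
                self.followers.push(follower);
            }
        }
    }

    /// Buffers a new line and sends it to all followers; a follower whose
    /// receiver is gone produces a send error and is removed.
    pub fn process_log_line(&mut self, line: String) {
        self.buffered_lines.push(line.clone());
        self.followers
            .retain(|follower| follower.send(line.clone()).is_ok());
    }

    /// Sends a last line and drops all follower senders, which makes the
    /// receivers error once they have drained the remaining lines.
    pub fn connection_lost(&mut self, last_line: String) {
        self.process_log_line(last_line);
        self.followers.clear();
    }

    pub fn num_followers(&self) -> usize {
        self.followers.len()
    }
}
```

The catch-up step is the part that needs a dedicated per-follower sender: the buffered lines go to the new follower only, while later lines go to everyone.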

The API server logs endpoint uses the pipeline name to look up the pipeline identifier, which it uses to issue the request to the runner logs endpoint. The API server then streams the result from the runner to the client.

Local runner: listens to the stdout/stderr of the spawned local process and uses a line buffer to write complete lines to the internal circular buffer
UI: tab titled "Logs" which uses the new endpoint to display logs
Runner main web server default port: 8089
Runner main with web server and reconciliation loop
Additional CLI arguments to reach runner web server
Integration test
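To make the local runner's line buffering concrete, the following std-only sketch accumulates raw output chunks and only emits complete lines. It is a stand-in rather than the PR's code (the quoted diff reads lines asynchronously with `next_line()`); the `LineBuffer` type, its `push_chunk` method, and the lossy UTF-8 handling are assumptions made for illustration.

```rust
/// Hypothetical line buffer: collects bytes until a newline is seen.
#[derive(Default)]
pub struct LineBuffer {
    partial: Vec<u8>,
}

impl LineBuffer {
    /// Feeds a chunk of process output and returns the lines it completed.
    /// Bytes after the last newline stay buffered for the next chunk.
    pub fn push_chunk(&mut self, chunk: &[u8]) -> Vec<String> {
        let mut lines = Vec::new();
        for &byte in chunk {
            if byte == b'\n' {
                let raw = std::mem::take(&mut self.partial);
                // Assumed policy: invalid UTF-8 is replaced, not rejected.
                let mut line = String::from_utf8_lossy(&raw).into_owned();
                if line.ends_with('\r') {
                    line.pop(); // tolerate CRLF line endings
                }
                lines.push(line);
            } else {
                self.partial.push(byte);
            }
        }
        lines
    }
}
```

Each returned line would then be appended to the circular buffer, so followers never observe half-written lines.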

Example usage

Start pipeline manager: RUST_BACKTRACE=1 cargo run --package=pipeline-manager --features pg-embed --bin pipeline-manager -- --dev-mode
Create and start a pipeline, for example one named example, with the program below (this program is known to crash upon data ingestion):
```
CREATE TABLE t1(c1 INTEGER); CREATE VIEW v1 AS SELECT ELEMENT(ARRAY [2, 3]) FROM t1;
```
Use either the Logs tab in the UI or curl (curl -s -X GET http://localhost:8080/v0/pipelines/example/logs) to follow the logs.

To generate some interesting logs, trigger an error by running:

curl -i -X POST http://localhost:8080/v0/pipelines/example/ingress/t1 -d '100'

Remaining tasks

Investigate behavior when one of the logs endpoint requests is slower than others
Decide on a strategy for when one of the log followers is unable to keep up (potentially use try_send instead of send)
Ability to set logging level on a per-pipeline basis -> Later PR
Use oneshot channels where appropriate
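One possible shape of the try_send idea from the list above, sketched with the standard library's bounded channel (`std::sync::mpsc::sync_channel`) rather than the async channels used in the PR. Dropping a follower whose queue is full is just one candidate policy, not a decision made in this PR, and `send_to_followers` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Sends `line` to every follower without blocking and returns how many
/// followers were removed. A follower is removed when its queue is full
/// (it cannot keep up) or when its receiver has been dropped.
pub fn send_to_followers(followers: &mut Vec<SyncSender<String>>, line: &str) -> usize {
    let before = followers.len();
    followers.retain(|follower| match follower.try_send(line.to_string()) {
        Ok(()) => true,
        // Assumed policy: a slow follower is dropped instead of stalling
        // the runner and all other followers.
        Err(TrySendError::Full(_)) => false,
        Err(TrySendError::Disconnected(_)) => false,
    });
    before - followers.len()
}

/// Creates a follower channel with a bounded queue of `capacity` lines.
pub fn new_follower(capacity: usize) -> (SyncSender<String>, std::sync::mpsc::Receiver<String>) {
    sync_channel(capacity)
}
```

With a blocking send, a single slow follower would delay delivery to everyone; with try_send the cost moves to the slow follower, which loses its stream and has to reconnect.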

Related PRs/issues

Fixes Implement UI for pipeline logs #2503
Fixes don't log pipeline config on every startup #2419
Fixes Logs endpoint #2407

snkas · 2024-09-16T19:08:38Z

The desired behavior for the Logs tab would be based on the pipeline status of the currently open pipeline:

~~If the Logs tab is open, and pipeline status is Initializing, Running, Paused, or Failed, the logs are retrieved~~
If the Logs tab is opened, an attempt should be made to retrieve the logs
If the Logs tab is open and the pipeline switches from a non-log status (Shutdown/Provisioning) to a log-potential status (Initializing, Running, Paused, Failed), it should try to retrieve the logs, or at least show a button asking whether the user wants to retry getting the logs now that the status has changed
If the user switches away from the Logs tab, the request MAY be kept open so that the logs keep being followed; if simpler, the request can instead be closed and a fresh request started upon revisit
If the logs request ends (possibly due to a timeout), a message should appear saying "Logs follow request has returned early (pipeline is not yet shutdown). In order to see the latest, the logs need to be reloaded.", together with a "Reload logs" button

ryzhyk

Can we keep the old behavior at least in the local form factor where logs from all pipelines are also written to the terminal?

ryzhyk · 2024-09-23T19:32:32Z

crates/pipeline-manager/src/runner/logs_buffer.rs

+    ///   is sufficient space, the line is added to the buffer.
+    pub fn append(&mut self, line: String) {
+        if line.len() > self.size_limit_byte {
+            self.num_discarded_lines += self.buffer.len() + 1;


Would it be better to truncate such a line to a reasonable length? In fact, this might be a good strategy for all lines, even if they fit in the log.

I see it more as a terminal experience, where lines are not truncated either. A single line of more than a megabyte would be quite exceptional: the log lines are the output of the pipeline executable, so such long lines would not occur unless the adapters crate or the SQL compiler prints them.

gz · 2024-09-24T21:30:40Z

crates/pipeline-manager/src/integration_test.rs

+    let mut response_logs = config.get("/v0/pipelines/test/logs").await;
+    assert_eq!(response_logs.status(), StatusCode::OK);
+    assert_eq!(
+        "LOGS STREAM END: no logs currently available (likely, the pipeline has not yet started)\n",


Suggested change

"LOGS STREAM END: no logs currently available (likely, the pipeline has not yet started)\n",

"LOG STREAM: no logs available (likely, the pipeline has not yet started)\n",

maybe this could just be empty instead of this string?

I changed it to LOG STREAM UNAVAILABLE: the pipeline has likely not yet started
I think it is useful for the user of the logs endpoint to get some feedback rather than an empty response, because the stream does not wait until the pipeline starts running but returns immediately.

gz · 2024-09-24T21:33:28Z

crates/pipeline-manager/src/runner/local_runner.rs

+        stderr: ChildStderr,
+        mut log_follow_request_receiver: Receiver<Sender<LogMessage>>,
+    ) -> (Sender<()>, JoinHandle<Receiver<Sender<LogMessage>>>) {
+        let (terminate_sender, mut terminate_receiver) = mpsc::channel::<()>(10);


I've now replaced it with a oneshot channel, as it is only used for sending the termination message.

gz · 2024-09-24T21:41:05Z

crates/pipeline-manager/src/runner/local_runner.rs

+                    // New stdout line
+                    line = lines_stdout.next_line() => {
+                        if let Ok(line) = line {
+                            if let Some(line) = line {


When would this ever be None? Do we need an else branch?

That happens when stdout (or stderr) reports that there are no more lines: https://docs.rs/tokio/1.40.0/tokio/io/struct.Lines.html#method.poll_next_line. Thanks! I'll add an error case for that as well 👍

Added variables stdout_finished and stderr_finished that track whether either of them has returned None

gz · 2024-09-24T21:45:17Z

crates/pipeline-manager/src/runner/main.rs

+                    // browsers (in particular, Chrome) not yet displaying the content because
+                    // they want more data to infer the content type (even though it was provided).
+                    Ok(HttpResponse::Ok()
+                        .content_type("text/plain; charset=utf-8")


no extra method to set the charset?

charset is part of the content type. HttpResponseBuilder doesn't seem to have a dedicated method to set them separately: https://docs.rs/actix-web/latest/actix_web/struct.HttpResponseBuilder.html

gz · 2024-09-24T21:48:16Z

crates/pipeline-manager/src/runner/pipeline_executor.rs

+    /// Process a new log line by adding it to the lines buffer and
+    /// sending it out to all followers. Any followers that exhibit
+    /// a send error are removed.
+    async fn process_log_line_with_followers(


Can this be simplified by using https://docs.rs/tokio/1.40.0/tokio/sync/broadcast/index.html?

Unfortunately not, I think: at the start we need to catch up a new follower, which means sending to only that channel and not to all the other existing followers.

ryzhyk · 2024-09-25T14:59:51Z

crates/pipeline-manager/src/runner/local_runner.rs

+                                    stdout_finished = true;
+                                }
+                                Some(line) => {
+                                    println!("{line}"); // Also print it to manager's stdout


I guess we're printing unconditionally because pipeline's output is already controlled by the pipeline's log level.

Yes, we can see as we go if we need to tune down the logging verbosity, or make this a runner configuration option

Co-authored-by: Karakatiza666 <bulakh.96@gmail.com>
Signed-off-by: Karakatiza666 <bulakh.96@gmail.com>
Signed-off-by: Simon Kassing <simon.kassing@feldera.com>

snkas assigned snkas and Karakatiza666 Sep 16, 2024

snkas mentioned this pull request Sep 16, 2024

pipeline-manager: pipeline API logs stub #2412

Closed

Karakatiza666 force-pushed the logs-endpoint branch 4 times, most recently from 4721ff2 to 619846a, on September 19, 2024 14:23

snkas force-pushed the logs-endpoint branch from 619846a to cd79d82 on September 23, 2024 11:56

snkas marked this pull request as ready for review September 23, 2024 12:04

snkas requested review from ryzhyk and gz September 23, 2024 14:16

ryzhyk reviewed Sep 23, 2024

Karakatiza666 force-pushed the logs-endpoint branch from 9ad31ce to 6202cb6 on September 24, 2024 17:14

gz approved these changes Sep 24, 2024

gz mentioned this pull request Sep 24, 2024

[datagen] Invalid timestamp string is quietly accepted, but no data is generated #2524

Closed

snkas force-pushed the logs-endpoint branch from 6202cb6 to c6927a6 on September 25, 2024 14:00

snkas mentioned this pull request Sep 25, 2024

[pipeline-manager] Configure logging level per-pipeline #2574

Open

ryzhyk reviewed Sep 25, 2024

ryzhyk approved these changes Sep 25, 2024

snkas force-pushed the logs-endpoint branch from 2935518 to 8e656d7 on September 25, 2024 16:03

snkas added this pull request to the merge queue Sep 25, 2024

Merged via the queue into main with commit f39a943 Sep 25, 2024
6 checks passed

snkas deleted the logs-endpoint branch September 25, 2024 18:22
