
Rework API #2071

Merged
merged 64 commits into main from rework-api
Aug 3, 2024

Conversation


@snkas snkas commented Jul 19, 2024

API rework which makes the program part of the pipeline and defines the connectors from the SQL.
Removes services entirely.

Progress back-end:

  • Format and clippy
  • SQL migration to single pipeline table (first version)
  • Database operations
  • Make compiler work
  • Make local runner work
  • Simple demo to demonstrate new usage
  • Create connectors through the SQL schema properties
  • Support multiple connectors in SQL
  • Proptest database tests
  • Already existing unit tests
  • Manual database tests -> Will be a subsequent PR
  • Update authentication -> New authentication will be a subsequent PR
  • Improve error reporting -> Will be a subsequent PR
  • Improve API documentation
  • Errors instead of unwraps where possible -> Will be a subsequent PR
  • Generate connector names if not provided
  • Integration tests

Progress UI:

  • Merge the UI into this PR
  • Remove the old UI

Progress others:

  • Either temporarily disable or fix all existing demos
  • CHANGELOG entry

Try it out:
Start the pipeline-manager (note that it will delete all existing pipeline-related tables):

RUST_BACKTRACE=1 RUST_LOG=debug,tokio_postgres=info cargo run --package=pipeline-manager --features pg-embed --bin pipeline-manager -- --dev-mode

... followed by a Kafka instance:

docker compose -f deploy/docker-compose.yml -f deploy/docker-compose-dev.yml up redpanda --build

... followed by running the demo:

cd demo/simple-count
python3 run.py --api-url http://localhost:8080 --kafka localhost:19092

Is this a user-visible change (yes/no): yes


gz commented Jul 30, 2024

Great work, some feedback from giving it a first try:

  • Serve the new UI under /, remove the building and serving of the old UI, and delete all of web-console/

  • left navigation should be open by default at first:

Screenshot from 2024-07-30 13-12-14

  • Needs some text; how about: "Start a New Pipeline"?

Screenshot from 2024-07-30 13-12-45

  • This should have some more spacing between heading and description
    Screenshot from 2024-07-30 13-14-02

  • Clicking this doesn't do anything yet? In the best case, can we configure it to fetch a file from GitHub and populate the pipeline editor with that?

Screenshot from 2024-07-30 13-16-27

  • It seems odd that an empty SQL program is ready to run.

Screenshot from 2024-07-30 13-19-32

  • It's odd that we call this a "PROGRAM ERR" now; maybe the state for a failed Rust compilation should be called something else? Otherwise maybe we should say "SYNTAX ERROR".

Screenshot from 2024-07-30 13-21-00

  • It's a bit confusing that the pipeline only shows up with a delay when creating a new one
Screencast.from.07-30-2024.01.25.22.PM.webm
  • The whole space should be clickable to go to the pipeline, not just the "test" text

Screenshot from 2024-07-30 13-26-24

  • There are these ERROR entries in the log that probably shouldn't be ERROR

2024-07-30 20:27:51 ERROR [manager] [HTTP error response] PipelineNotRunningOrPaused: Pipeline 0191054a-5ae2-7362-9993-c8c097f5a56b is not currently running or paused.

  • Sometimes clicking the play button doesn't have any effect
Screencast.from.07-30-2024.01.28.57.PM.webm
  • remove ad-hoc query and query-plan tabs for now

  • The errors only show up seconds after the program failed to compile here
    Screencast from 07-30-2024 01:31:16 PM.webm

  • If I use the mini-view in the text editor to slide up in the text and release the mouse on the Slack button, it sends me to the community Slack (note: I just released the mouse button, I didn't click the community Slack link or anything -- this is on Firefox)

Screencast.from.07-30-2024.01.33.27.PM.webm

  • If autosave is off and turned back on, and the state has changed, the default action should be to save
Screencast from 07-30-2024 01:35:16 PM.webm

  • having this set to null seems wrong?

Screenshot from 2024-07-30 13-36-41


snkas commented Jul 31, 2024

Thanks a ton for the feedback, a few replies already:

There are these ERROR in the log that probably shouldn't be ERROR

The return code should be an error (issuing an action which is not valid in a certain status).

having this set to null seems wrong?

These are the current defaults. I agree it would be good to change them to None, though for this PR I'm not planning on changing that part of the API.

@Karakatiza666 Karakatiza666 force-pushed the rework-api branch 2 times, most recently from 0fd0650 to 2a9cde0 Compare July 31, 2024 14:33
.with(['Initializing', P.any, P.nullish, P._], () => 'Initializing' as const)
.with(['Paused', P.any, P.nullish, 'CompilingSql'], () => 'Compiling sql' as const)
.with(['Paused', P.any, P.nullish, 'Pending'], () => 'Queued' as const)
.with(['Paused', P.any, P.nullish, 'CompilingRust'], () => 'Compiling bin' as const)
Collaborator:

Compiling Rust

.with(['Shutdown', 'Running', P.nullish, P._], () => 'Running' as const)
.with(['Provisioning', P.any, P.nullish, P._], () => 'Starting up' as const)
.with(['Initializing', P.any, P.nullish, P._], () => 'Initializing' as const)
.with(['Paused', P.any, P.nullish, 'CompilingSql'], () => 'Compiling sql' as const)
Collaborator:

Compiling SQL

.with(['Paused', P.any, P.nullish, 'CompilingRust'], () => 'Compiling bin' as const)
.with(['Paused', P.any, P.nullish, P._], () => 'Paused' as const)
.with(['Running', P.any, P.nullish, P._], () => 'Running' as const)
.with(['ShuttingDown', P.any, P.nullish, P._], () => 'ShuttingDown' as const)
Collaborator:

Shutting Down

.with(['Shutdown', P.any, P.nullish, 'Success'], () => 'Shutdown' as const)
.with(['Shutdown', P.any, P.select(P.nonNullable), P.any], () => 'Shutdown' as const)
.with(['Shutdown', 'Running', P.nullish, P._], () => 'Running' as const)
.with(['Provisioning', P.any, P.nullish, P._], () => 'Starting up' as const)
Collaborator:

=> 'Provisioning' sounds clearer


gz commented Jul 31, 2024

The return code should be an error (issuing an action which is not valid in a certain status).

I think for logging we should only log error! if something goes really wrong; returning an error status code to the user that isn't due to something unexpected happening in the manager should probably just be warn! or info!

(I think there were cases even before this where we logged error! in the manager when we shouldn't have, so maybe it's still the same issue)
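The policy described above can be sketched as follows. This is a minimal, hypothetical helper (not the manager's actual code): failures originating inside the manager warrant error!, while client-caused 4xx responses are logged at a lower level.

```rust
/// Hypothetical helper illustrating the proposed policy: only failures
/// originating inside the manager (5xx) warrant `error!`; client-caused
/// HTTP errors (4xx) are logged at a lower level, since the client gets
/// the HTTP error response anyway.
fn log_level_for_status(status: u16) -> &'static str {
    match status {
        500..=599 => "error", // something unexpected went wrong in the manager
        400..=499 => "info",  // invalid request; not a manager fault
        _ => "debug",         // successful or informational responses
    }
}

fn main() {
    // A PipelineNotRunningOrPaused response is a 400-class error:
    // the user (or UI) issued an action invalid in the current state.
    assert_eq!(log_level_for_status(400), "info");
    assert_eq!(log_level_for_status(500), "error");
    println!("ok");
}
```

Under this sketch, the PipelineNotRunningOrPaused response from the log excerpt above would appear at info rather than error.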


lalithsuresh commented Jul 31, 2024

Yeah I noticed this too after coming back to the pipeline manager after a while. We shouldn't be spamming error! logs when the user isn't doing anything wrong. So either these are not legitimate errors, and shouldn't be logged, or the UI shouldn't be doing whatever it was that's causing these error logs.


snkas commented Jul 31, 2024

We could make the distinction that error-level log messages should only occur when there was truly an error at the api-server.

In this case, the UI is issuing an invalid request (we can diagnose why), so it should still get back an HTTP error (though maybe the log level for it should just be info, or even debug).


gz commented Jul 31, 2024

In this case, the UI is issuing an invalid request (we can diagnose why),

For a user it's confusing to see errors in a log if they aren't really errors for them -- for debugging our own code a lower log level should suffice. Sending HTTP errors back is totally fine, it just doesn't need to be logged as an error.

@gz gz left a comment:

I read every character!

@@ -0,0 +1,9 @@
{
Collaborator:

can we move this out of basedir into web-console?

@@ -117,7 +118,13 @@ fn parquet_output() {
let mut encoder = ParquetEncoder::new(
Box::new(consumer),
config,
Relation::new("TestStruct2", false, TestStruct2::schema(), false),
Relation::new(
Collaborator:

Probably should make this a method on TestStruct2, but not so important.

@@ -44,6 +41,7 @@ fn default_tracing() -> bool {

/// Pipeline configuration specified by the user when creating
/// a new pipeline instance.
/// TODO: change this description
Collaborator:

can it be fixed now?

Contributor Author:

Yes, changed to:

/// Overall pipeline configuration.
///
/// It is generated upon the deployment of a pipeline and contains
/// the shape of the overall pipeline configuration.
///
/// Its input and output endpoints are generated based on the schema
/// of the compiled program. The runtime configuration is directly
/// provided by the user. Storage configuration, if applicable,
/// is set by the runner.

@@ -74,6 +72,16 @@ pub struct PipelineConfig {
pub outputs: BTreeMap<Cow<'static, str>, OutputEndpointConfig>,
}

impl PipelineConfig {
pub fn from_yaml(s: &str) -> Self {
Collaborator:

Doesn't need to be fixed now, but maybe we can make an issue to settle on one config format (e.g., keep JSON everywhere or switch everything to a human-friendly format), but ideally not have both

Contributor Author:

Yes, currently it uses YAML in the database but JSON for the API; I'm in favor of the latter, I think

Contributor Author:

Will make a note of it, but likely not in this PR

version: Version,
// REGULAR ENDPOINTS

/// Extended pipeline descriptor with code being optionally included.
Collaborator:

maybe it should just always have the code?

Contributor Author:

The motivation is that with many pipelines it could become a large response if polled regularly.

Contributor Author (@snkas, Aug 1, 2024):

It would simplify things though -- let's see how we fare in terms of API poll performance

, body = ErrorResponse
, example = json!(examples::error_duplicate_name())),
(status = BAD_REQUEST
, description = "Pipeline is not shutdown"
Collaborator:

Suggested change
, description = "Pipeline is not shutdown"
, description = "Pipeline needs to be shutdown to be modified"

Contributor Author:

Done, updated in three places (full update, partial update, and delete).

- `static/`: Static assets served as-is
Collaborator:

Adding a general comment here, but we should rm -rf web-console and mv web-console-sveltekit to web-console


snkas commented Aug 1, 2024

Rebased on latest main

openapi.json Outdated
@@ -2913,7 +2913,7 @@
}
}
],
"description": "Pipeline configuration specified by the user when creating\na new pipeline instance.\nTODO: change this description\n\nThis is the shape of the overall pipeline configuration. It encapsulates a\n[`RuntimeConfig`], which is the publicly exposed way for users to configure\npipelines."
"description": "Overall pipeline configuration.\n\nIt is generated upon the deployment of a pipeline and contains\nthe shape of the overall pipeline configuration.\n\nIts input and output endpoints are generated based on the schema\nof the compiled program. The runtime configuration is directly\nprovided by the user. Storage configuration, if applicable,\nis set by the runner."
Collaborator:

Is this for the PipelineConfig? The first two sentences are redundant and don't tell the user much about the object. Just say: "Pipeline's runtime configuration, including the number of workers, resource allocations, and connectors used. It represents configuration entries provided by the user (...) and entries derived from the program and system deployment (connectors...)"

Contributor Author (@snkas, Aug 1, 2024):

It is part of the OpenAPI spec because it is a generated field, deployment_config, which the user has in the extended pipeline descriptor. I agree I wrote it a bit verbosely; will adjust the text 👍


gz commented Aug 1, 2024

@Karakatiza666 the pipeline settings no longer show after clicking, for me, on the latest commit

Screencast.from.08-01-2024.09.41.06.AM.webm

@lalithsuresh

@snkas Some bugs to fix.

When following the REST API tutorial, I noticed a few things. I created the following file:

➜  /tmp cat program.json
{
    "name": "sc-program",
    "description": "Supply Chain program",
    "runtime_config": {"workers": 4},
    "program_config": {},
    "program_code": "
CREATE TABLE vendor (                                    \n
    id BIGINT NOT NULL PRIMARY KEY,                      \n
    name VARCHAR,                                        \n
    address VARCHAR                                      \n
);                                                       \n
                                                         \n
CREATE TABLE part (                                      \n
    id bigint NOT NULL PRIMARY KEY,                      \n
    name VARCHAR                                         \n
);                                                       \n
                                                         \n
CREATE TABLE price (                                     \n
    part BIGINT NOT NULL,                                \n
    vendor BIGINT NOT NULL,                              \n
    price DECIMAL                                        \n
);                                                       \n
                                                         \n
CREATE VIEW low_price                                    \n
    (part, price)                                        \n
    AS                                                   \n
    SELECT                                               \n
        price.part AS part,                              \n
        MIN(price.price) AS price                        \n
    FROM price                                           \n
    GROUP BY price.part;                                 \n
                                                         \n
CREATE MATERIALIZED VIEW preferred_vendor                \n
    (part_id, part_name, vendor_id, vendor_name, price)  \n
    AS                                                   \n
    SELECT                                               \n
        part.id AS part_id,                              \n
        part.name AS part_name,                          \n
        vendor.id AS vendor_id,                          \n
        vendor.name AS vendor_name,                      \n
        price.price AS price                             \n
    FROM                                                 \n
        price, part, vendor, low_price                   \n
    WHERE                                                \n
        price.price = low_price.price AND                \n
        price.part = low_price.part AND                  \n
        part.id = price.part AND                         \n
        vendor.id = price.vendor;                        \n
"
}
  • I tried to use PUT /v0/pipelines/{name}. I started without "name" or the two config fields, and was forced to add each of them by consecutive error messages asking for the missing fields. The config fields should be optional.

  • Minor: After that, however, I get the following error for PUT, which was confusing (note the name in the URL differs from the one in the program, which I fixed afterwards). PUT doesn't usually signal "rename" to a user, so perhaps just give a better error message here:

➜  /tmp curl -i -X PUT http://localhost:8080/v0/pipelines/sc-pipeline \
-H 'Content-Type: application/json' \
-d @program.json
HTTP/1.1 400 Bad Request
content-length: 119
vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
content-type: application/json
access-control-allow-credentials: true
date: Fri, 02 Aug 2024 00:32:47 GMT

{"message":"Cannot rename a pipeline which does not exist","error_code":"CannotRenameNonExistingPipeline","details":{}}%      
  • The source position is wrong when there is a connector misconfiguration. I kept going to where the editor directed me (line 6), but what the error message actually means is line 6 within the CREATE TABLE statement on line 13.

image


snkas commented Aug 2, 2024

The config fields should be optional.
@lalithsuresh

The biggest downside to making them optional is that any typo will go unnoticed because the property is now optional (e.g., runtiem_config: { workers: 10 } would just be ignored and the default used instead). We could address this by disallowing unknown fields, but that would be inconsistent with all the other serialized types nested within runtime_config and program_config, and might also affect compatibility in the future.

The primary place where this is an inconvenience is direct API calls, and now that the curl tutorial is rewritten (thank you :) ) people will use it as a basis in any case. I'm leaning towards keeping it explicit for now ({} = set to defaults) and seeing how it goes -- what do you think?
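The typo-masking concern can be illustrated with a small sketch (hypothetical, not the manager's actual validation code): if the config fields are optional and unknown keys are tolerated, a misspelled key like runtiem_config is silently dropped and defaults apply; only a strict unknown-key check would flag it.

```rust
use std::collections::HashSet;

// Hypothetical sketch of the trade-off: if config fields are optional and
// unknown keys are tolerated, a typo like "runtiem_config" is silently
// ignored and the defaults are used. Rejecting unknown keys would catch it.
fn unknown_keys(provided: &[&str]) -> Vec<String> {
    let known: HashSet<&str> =
        ["name", "description", "runtime_config", "program_config", "program_code"]
            .into_iter()
            .collect();
    provided
        .iter()
        .filter(|k| !known.contains(**k))
        .map(|k| k.to_string())
        .collect()
}

fn main() {
    // The misspelled key is flagged only if the API opts into strict checking.
    assert_eq!(unknown_keys(&["name", "runtiem_config"]), vec!["runtiem_config"]);
    assert!(unknown_keys(&["name", "runtime_config"]).is_empty());
    println!("ok");
}
```

In serde terms this corresponds to opting into strict deserialization on the top-level request type, which is exactly the consistency/compatibility trade-off discussed above.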


snkas commented Aug 2, 2024

error for PUT, which was confusing
@lalithsuresh

Replaced with the error:

The pipeline name in the request body does not match the one provided in the URL path. This is not allowed when no pipeline with the name provided in the URL path exists.

Source position error when there is a connector misconfiguration
@lalithsuresh

I've now added the relation name to the error, and noted that the position is within the string, for example:

 ConnectorGenerationError:
relation 'example': property 'connectors' has value '[{
        "transport2": {
         "name": "datagen",
         "config": {
            "plan": [ { "rate": 1000 } ]
         }
      }
    }]' which is invalid: deserialization failed: missing field `transport` at line 8 column 5 (position is within the string itself)

Exact positional indication in the SQL program will require additional information from the SQL compiler.
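Until the compiler provides that information, the mapping itself is simple arithmetic. A hypothetical sketch (assuming the manager knew the SQL source line where the properties string starts): the absolute line is the string's starting line plus the deserializer's relative line, minus one.

```rust
/// Hypothetical sketch: serde reports a line number relative to the embedded
/// connectors JSON string; if we also knew the line in the SQL source where
/// that string starts, the absolute position would be their sum minus one
/// (line 1 of the string is the starting line itself).
fn absolute_line(string_start_line: usize, relative_line: usize) -> usize {
    string_start_line + relative_line - 1
}

fn main() {
    // E.g. the connectors string starts in a CREATE TABLE on SQL line 13
    // and the deserialization error is at line 6 within the string.
    assert_eq!(absolute_line(13, 6), 18);
    println!("ok");
}
```

This is what the per-property line/column metadata from the SQL compiler would enable.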

@mihaibudiu

The compiler can provide the starting point, in the source SQL program, of the connectors string. If that's enough, I can add a property to the Metadata JSON.


snkas commented Aug 2, 2024

Ideally it would be per property, such that in the future, if we have new or different property expectations, their line/column numbers can be used for those as well.

For now we can have a dedicated one for the connectors key. Where in the ProgramSchema is Metadata located, though? I cannot seem to find it.

@mihaibudiu

I call everything metadata, including the properties themselves.

ryzhyk and others added 9 commits August 2, 2024 17:42
gz and others added 18 commits August 2, 2024 17:56
  • Switch new web-console to serve from /
  • Otherwise, if you have a very low rate, it will not insert anything until the batch is full, which might take too long for the rate set to produce them.
@gz gz merged commit 2798d62 into main Aug 3, 2024
4 of 5 checks passed
@gz gz deleted the rework-api branch August 3, 2024 08:13