Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Streams 🌊] Add processors validation and simulation gate #206566

Merged
merged 12 commits into from
Jan 15, 2025

Conversation

tonyghiani
Copy link
Contributor

@tonyghiani tonyghiani commented Jan 14, 2025

📓 Summary

Closes https://github.com/elastic/streams-program/issues/66

This work adds changes to prevent invalid processors from being submitted.
The main rule is that a simulation is performed before any add/edit submission to guarantee that the processor config is valid.

This work also updates the simulation API to detect whether there is a non-additive change in any simulated document.

@patpscal error reporting UI for add/edit is different since the simulator is not visible for edit, I used a callout but we can easily update this once there is a final design in place.

Form validation + simulation

demo.mov

Non-additive processors

Screen.Recording.2025-01-14.at.14.15.18.mov

@tonyghiani tonyghiani changed the title 66 add processors validation [Streams 🌊] Add processors validation and simulation gate Jan 14, 2025
@tonyghiani tonyghiani marked this pull request as ready for review January 14, 2025 14:04
@tonyghiani tonyghiani requested review from a team as code owners January 14, 2025 14:04
@tonyghiani tonyghiani added release_note:skip Skip the PR/issue when compiling release notes backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Feature:Streams This is the label for the Streams Project labels Jan 14, 2025
@Kerry350 Kerry350 self-requested a review January 14, 2025 14:46
@flash1293
Copy link
Contributor

Tangential to this PR, but I just noticed we also don't validate whether the mapped field type will work:

Here I'm picking double as the field type which will break during ingest because "this" can't be parsed, but the UI doesn't tell me. It could do this by including the mapping update in the simulate call. We can tackle this separately but I think we should consider it.

Screenshot 2025-01-14 at 16 25 38

Also we should show a more aggressive warning when the user disabled "Ignore failures for this processor" and there are unmatched sample documents - while unmatched docs are benign as long as failures are ignored, they are a pretty big problem once they lead to failure for ingesting the doc.

Copy link
Contributor

@flash1293 flash1293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is breaking updating an existing processor:

Screenshot 2025-01-14 at 16 35 01

@tonyghiani
Copy link
Contributor Author

Here I'm picking double as the field type which will break during ingest because "this" can't be parsed, but the UI doesn't tell me. It could do this by including the mapping update in the simulate call.

Make sense, the current simulation is just simulating the pipeline, not a data ingestion, updating the simulate API to try an ingest simulation should also handle this. I'll open a follow up PR for this specifically.

Also we should show a more aggressive warning when the user disabled "Ignore failures for this processor" and there are unmatched sample documents - while unmatched docs are benign as long as failures are ignored, they are a pretty big problem once they lead to failure for ingesting the doc.

Sounds good, @patpscal can we add to design a variant that handles this case? Not sure what feedback we could provide here, something like a bold warning icon+text next to the disabled flag seems fine to me.

Looks like this is breaking updating an existing processor

Mmm I see what's happening, to validate that the updated processor configuration is correct, I'm running a new simulation on submit retrieving the data sample, but it's possible the user won't have data ingested for the time range used.

Actually, this might be limiting also when running the validation for creating the processor. I'll take a look at how we can handle this, but I'm not aware of any JS API available in Kibana to validate a processor config without simulating ingestion/pipeline or updating a real pipeline, I believe that syntax-check lives in Elasticsearch only.

@flash1293
Copy link
Contributor

flash1293 commented Jan 14, 2025

Actually, this might be limiting also when running the validation for creating the processor. I'll take a look at how we can handle this, but I'm not aware of any JS API available in Kibana to validate a processor config without simulating ingestion/pipeline or updating a real pipeline, I believe that syntax-check lives in Elasticsearch only.

I think we can't safely do this, because existing already processed documents might fail with the new configuration even if it would work with raw documents coming from the source (e.g. they will have the extracted field defined already and it would look like it updated it, but actually raw documents wouldn't have the extracted field set). I think we just shouldn't do the validation for the editing case, only for the new processor case. Annoying, but all coming back to the same root problem that we need original documents to test edits to processing, which we can't get.

We need some logic to only run this simulation check when the only change to the processing pipeline is adding new processors to the end of the pipeline, not reordering existing ones or changing existing ones.

@tonyghiani
Copy link
Contributor Author

@flash1293 I addressed your suggestions, this is ready for another look.

@patpscal I made this to provide a warning caption once the ignore_failure option is disable, let me know if looks good or we can implement something else 👌

Screen.Recording.2025-01-15.at.09.13.07.mov

Copy link
Contributor

@gsoldevila gsoldevila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we own @kbn/object-utils, TBD

Copy link
Contributor

@Kerry350 Kerry350 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM 👌 I went through with the latest changes, but I’ll defer to Joe on the approval with the edge cases.

Copy link
Contributor

@flash1293 flash1293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Beyond the stuff we already noted down as follow ups - for the "unmatched" filter it doesn't really make sense to keep the columns for the fields that were not extracted:
Screenshot 2025-01-15 at 10 40 23

This will give the message itself more space so it's easier to see what the problem is.

@tonyghiani
Copy link
Contributor Author

tonyghiani commented Jan 15, 2025

Beyond the stuff we already noted down as follow ups - for the "unmatched" filter it doesn't really make sense to keep the columns for the fields that were not extracted

Addressed with refactor(streams): hide detected fields columns for unmatched filter

Screen.Recording.2025-01-15.at.11.04.04.mov

I'll get this merged when it goes 💚 again and take on the next steps.

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
streamsApp 269 273 +4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
streamsApp 269.2KB 271.6KB +2.4KB
Unknown metric groups

ESLint disabled in files

id before after diff
streamsApp 2 3 +1

Total ESLint disabled count

id before after diff
streamsApp 5 6 +1

History

@tonyghiani tonyghiani merged commit 6429c53 into elastic:main Jan 15, 2025
8 checks passed
@tonyghiani tonyghiani deleted the 66-add-processors-validation branch January 15, 2025 15:09
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12791225692

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jan 15, 2025
…6566)

## 📓 Summary

Closes elastic/streams-program#66

This work adds changes to prevent invalid processors from being
submitted.
The main rule is that a simulation is performed before any add/edit
submission to guarantee that the processor config is valid.

This work also updates the simulation API to detect whether there is a
non-additive change in any simulated document.

@patpscal error reporting UI for add/edit is different since the
simulator is not visible for edit, I used a callout but we can easily
update this once there is a final design in place.

### Form validation + simulation

https://github.com/user-attachments/assets/f7fc351b-6efc-4500-8490-b7f1c85139bf

### Non-additive processors

https://github.com/user-attachments/assets/47b5b739-c2cf-4a74-93a8-6ef43521c7d4
(cherry picked from commit 6429c53)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Jan 15, 2025
) (#206803)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Streams 🌊] Add processors validation and simulation gate
(#206566)](#206566)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Marco Antonio
Ghiani","email":"marcoantonio.ghiani01@gmail.com"},"sourceCommit":{"committedDate":"2025-01-15T15:09:02Z","message":"[Streams
🌊] Add processors validation and simulation gate (#206566)\n\n## 📓
Summary\r\n\r\nCloses
https://github.com/elastic/streams-program/issues/66\r\n\r\nThis work
adds changes to prevent invalid processors from
being\r\nsubmitted.\r\nThe main rule is that a simulation is performed
before any add/edit\r\nsubmission to guarantee that the processor config
is valid.\r\n\r\nThis work also updates the simulation API to detect
whether there is a\r\nnon-additive change in any simulated
document.\r\n\r\n@patpscal error reporting UI for add/edit is different
since the\r\nsimulator is not visible for edit, I used a callout but we
can easily\r\nupdate this once there is a final design in
place.\r\n\r\n### Form validation +
simulation\r\n\r\n\r\nhttps://github.com/user-attachments/assets/f7fc351b-6efc-4500-8490-b7f1c85139bf\r\n\r\n###
Non-additive
processors\r\n\r\n\r\nhttps://github.com/user-attachments/assets/47b5b739-c2cf-4a74-93a8-6ef43521c7d4","sha":"6429c53597b203d0f4435909333131b00c01a996","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:prev-minor","Feature:Streams"],"title":"[Streams
🌊] Add processors validation and simulation
gate","number":206566,"url":"https://github.com/elastic/kibana/pull/206566","mergeCommit":{"message":"[Streams
🌊] Add processors validation and simulation gate (#206566)\n\n## 📓
Summary\r\n\r\nCloses
https://github.com/elastic/streams-program/issues/66\r\n\r\nThis work
adds changes to prevent invalid processors from
being\r\nsubmitted.\r\nThe main rule is that a simulation is performed
before any add/edit\r\nsubmission to guarantee that the processor config
is valid.\r\n\r\nThis work also updates the simulation API to detect
whether there is a\r\nnon-additive change in any simulated
document.\r\n\r\n@patpscal error reporting UI for add/edit is different
since the\r\nsimulator is not visible for edit, I used a callout but we
can easily\r\nupdate this once there is a final design in
place.\r\n\r\n### Form validation +
simulation\r\n\r\n\r\nhttps://github.com/user-attachments/assets/f7fc351b-6efc-4500-8490-b7f1c85139bf\r\n\r\n###
Non-additive
processors\r\n\r\n\r\nhttps://github.com/user-attachments/assets/47b5b739-c2cf-4a74-93a8-6ef43521c7d4","sha":"6429c53597b203d0f4435909333131b00c01a996"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/206566","number":206566,"mergeCommit":{"message":"[Streams
🌊] Add processors validation and simulation gate (#206566)\n\n## 📓
Summary\r\n\r\nCloses
https://github.com/elastic/streams-program/issues/66\r\n\r\nThis work
adds changes to prevent invalid processors from
being\r\nsubmitted.\r\nThe main rule is that a simulation is performed
before any add/edit\r\nsubmission to guarantee that the processor config
is valid.\r\n\r\nThis work also updates the simulation API to detect
whether there is a\r\nnon-additive change in any simulated
document.\r\n\r\n@patpscal error reporting UI for add/edit is different
since the\r\nsimulator is not visible for edit, I used a callout but we
can easily\r\nupdate this once there is a final design in
place.\r\n\r\n### Form validation +
simulation\r\n\r\n\r\nhttps://github.com/user-attachments/assets/f7fc351b-6efc-4500-8490-b7f1c85139bf\r\n\r\n###
Non-additive
processors\r\n\r\n\r\nhttps://github.com/user-attachments/assets/47b5b739-c2cf-4a74-93a8-6ef43521c7d4","sha":"6429c53597b203d0f4435909333131b00c01a996"}}]}]
BACKPORT-->

Co-authored-by: Marco Antonio Ghiani <marcoantonio.ghiani01@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Feature:Streams This is the label for the Streams Project release_note:skip Skip the PR/issue when compiling release notes v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants