-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: give more helpful error when trying to get a column that is missing from only some rows #14811
base: main
Are you sure you want to change the base?
Conversation
Nice, that looks like a good add. Thanks! |
So far, this only works for |
…ing from only some rows
c9b190e
to
4afd9ae
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this change makes the error more precise there are new performance costs to follow_cell_path
introduced both in the theoretical best case and additionally through how it is currently implemented in this PR.
The four different cases to worry about.
- All rows contain the cell path. All n need to be visited (For
Value::follow_cell_path
we sadly have to allocate ). - A missing value appears after a stretch of valid values -> We can already abort here and emit the new
RowLacksValue
error. Should not change the run time over the existing implementation usingResult as FromIterator
- At the beginning of the
List
the CellPath is not found but then the value is found. Compared to the previous Version we now have to iterate until this hit instead of failing early to reportRowLacksColumn
. - The column doesn't exist at all. Now instead of failing early we have to walk the whole List to then report
CantFindColumn
A good implementation should make 2.) cheap but to have a nicer message for 3.) we will always pay for 4.) That is a trade-off we should keep in mind, don't have a strong opinion yet on what should be the default here.
For the implementation we should at least make 2. and 3. reasonable
@@ -1252,9 +1252,21 @@ impl Value { | |||
}), | |||
} | |||
}) | |||
.collect::<Result<_, _>>()?; | |||
.collect::<Vec<Result<_, _>>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't collect all the ShellError
s as only one is ever reported and the Vec
would grow signifiicantly in size:
- 120 bytes per Unit of
ShellError
- 48 bytes previously for each
Value
Instead of a raw non-short-circuiting collect, we should replace the collect
with a custom collection mechanism that manages the Result::Err
cases depending on the situation.
- have we only observed the
Err
case so farSome(Err)
proceed consuming (we don't need to keep allocations around anymore)Some(Ok(Value))
returnRowLacksColumn
None
exhausted returnCantFindColumn
- We only saw a
Ok(Value)
Some(Err)
returnRowLacksColumn
Some(Ok(Value))
, pushValue
into the size-hintedVec
allocation.None
exhausted return theVec
/Value::List
/// | ||
/// ## Resolution | ||
/// | ||
/// Use the Optional Operator, --ignore-errors with get or select, or fill in the missing values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment only appears in the Rust crate documentation. If you want to show it you can use the miette
#[help]
attribute.
As follow_cell_path
and friends are also used in a bunch other commands , you should check that the error message like that makes universal sense, including its span annotations.
#[label = "try using '{col_name}?'"] | ||
span: Option<Span>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This points to the cell path component (if that is set correctly in all circumstances)
Question if we should have another label pointing to the source Value
guaranteed to have a span (even if sometimes a bit weirdly located)
Would it make sense to instead just tweak the That way we avoid the performance issue, but the user still gets a hint for resolving the situation. wdyt? |
Description
LIke the above issue describes, trying to get a column that some rows lack values for gives an unhelpful
cannot find column
error. This PR introduces a newRowLacksValue
error, that gives a clearer error:User-Facing Changes
Tests + Formatting
new tests:
commands::get::errors_column_missing_in_some_rows
changed tests:
commands::select::select_failed1
After Submitting