WIP: give more helpful error when trying to get a column that is missing from only some rows #14811

Draft
wants to merge 1 commit into
base: main
35 changes: 29 additions & 6 deletions crates/nu-command/src/filters/select.rs
@@ -230,7 +230,7 @@ fn select(
Value::List {
vals: input_vals, ..
} => {
- Ok(input_vals
+ let results = input_vals
.into_iter()
.map(move |input_val| {
if !columns.is_empty() {
@@ -250,11 +250,34 @@ fn select(
input_val.clone()
}
})
- .into_pipeline_data_with_metadata(
- call_span,
- engine_state.signals().clone(),
- metadata,
- ))
+ .collect::<Vec<Value>>();
+
+ let missing_col_errors = results
+ .iter()
+ .filter_map(|row| {
+ if let Value::Error { error, .. } = row {
+ if let ShellError::CantFindColumn { col_name, .. } = &**error {
+ return Some(col_name.as_str());
+ }
+ }
+ return None;
+ })
+ .collect::<Vec<&str>>();
+
+ if missing_col_errors.len() < results.len() {
+ if let Some(first) = missing_col_errors.first() {
+ return Err(ShellError::RowLacksValue {
+ col_name: first.to_string(),
+ span: Some(call_span),
+ });
+ }
+ }
+
+ Ok(results.into_pipeline_data_with_metadata(
+ call_span,
+ engine_state.signals().clone(),
+ metadata,
+ ))
}
_ => {
if !columns.is_empty() {
6 changes: 6 additions & 0 deletions crates/nu-command/tests/commands/get.rs
@@ -180,6 +180,12 @@ fn errors_fetching_by_accessing_empty_list() {
assert!(actual.err.contains("Row number too large (empty content)"),);
}

+ #[test]
+ fn errors_column_missing_in_some_rows() {
+ let actual = nu!("[{a: 1, b: 2} {a: 3, b: 5} {a: 3}] | get b ");
+ assert!(actual.err.contains("Not all rows contain a value for 'b'"));
+ }
+
#[test]
fn quoted_column_access() {
let actual = nu!(r#"'[{"foo bar": {"baz": 4}}]' | from json | get "foo bar".baz.0 "#);
2 changes: 1 addition & 1 deletion crates/nu-command/tests/commands/select.rs
@@ -192,7 +192,7 @@ fn select_failed1() {
let actual = nu!("[{a: 1, b: 2} {a: 3, b: 5} {a: 3}] | select b ");

assert!(actual.out.is_empty());
- assert!(actual.err.contains("cannot find column"));
+ assert!(actual.err.contains("Not all rows contain a value for 'b'"));
}

#[test]
13 changes: 13 additions & 0 deletions crates/nu-protocol/src/errors/shell_error.rs
@@ -595,6 +595,19 @@ pub enum ShellError {
src_span: Span,
},

+ /// Not all rows contain data in the requested column.
+ ///
+ /// ## Resolution
+ ///
+ /// Use the Optional Operator, --ignore-errors with get or select, or fill in the missing values.

Review comment (Member):

This doc comment only appears in the Rust crate documentation. If you want to show it to users, you can use the miette #[help] attribute.

As follow_cell_path and friends are also used in a bunch of other commands, you should check that an error message like this makes universal sense, including its span annotations.

+ #[error("Not all rows contain a value for '{col_name}'")]
+ #[diagnostic(code(nu::shell::row_lacks_value))]
+ RowLacksValue {
+ col_name: String,
+ #[label = "try using '{col_name}?'"]
+ span: Option<Span>,

Review comment (Member), on lines +607 to +608:

This points to the cell path component (assuming that span is set correctly in all circumstances).

Question: should we have another label pointing to the source Value, which is guaranteed to have a span (even if it is sometimes a bit weirdly located)?

+ },
+
/// Attempted to insert a column into a table, but a column with that name already exists.
///
/// ## Resolution
17 changes: 15 additions & 2 deletions crates/nu-protocol/src/value/mod.rs
@@ -1252,9 +1252,22 @@ impl Value {
}),
}
})
- .collect::<Result<_, _>>()?;
+ .collect::<Vec<Result<_, _>>>();

Review comment (Member):

We shouldn't collect all the ShellErrors, as only one is ever reported and the Vec would grow significantly in size:

  • 120 bytes per ShellError
  • 48 bytes previously for each Value

Instead of a raw, non-short-circuiting collect, we should replace the collect with a custom collection mechanism that handles the Result::Err cases depending on what has been seen so far:

  • Only Err cases observed so far:
    • Some(Err): keep consuming (we don't need to keep allocations around anymore)
    • Some(Ok(Value)): return RowLacksValue
    • None (exhausted): return CantFindColumn
  • Only Ok(Value) cases observed so far:
    • Some(Err): return RowLacksValue
    • Some(Ok(Value)): push the Value into the size-hinted Vec allocation
    • None (exhausted): return the Vec/Value::List
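
The collection mechanism described in this comment could look roughly like the sketch below. It is a minimal, std-only illustration and not code from the PR: `LookupError` and `collect_column` are hypothetical names, `i64` stands in for `Value`, and every `Err` is assumed to be a missing-column error (real code would presumably need to pass other `ShellError` variants through unchanged).

```rust
/// Hypothetical stand-in for the two relevant `ShellError` variants.
#[derive(Debug, PartialEq)]
enum LookupError {
    /// No row had the column.
    CantFindColumn { col_name: String },
    /// Some rows had the column and some did not.
    RowLacksValue { col_name: String },
}

/// Hypothetical helper: collects per-row lookups without storing errors.
/// Assumption: every `Err` in `rows` means "column missing in this row".
fn collect_column<I>(rows: I, col_name: &str) -> Result<Vec<i64>, LookupError>
where
    I: IntoIterator<Item = Result<i64, LookupError>>,
{
    let mut rows = rows.into_iter();
    // Error reported as soon as both Ok and Err rows have been seen.
    let mixed = || LookupError::RowLacksValue {
        col_name: col_name.to_string(),
    };

    match rows.next() {
        // Empty input: nothing to report, return an empty list.
        None => Ok(Vec::new()),
        // Only errors so far: later errors are dropped as they are consumed;
        // the first Ok row proves the column exists in some rows.
        Some(Err(first_err)) => {
            if rows.any(|row| row.is_ok()) {
                Err(mixed())
            } else {
                Err(first_err)
            }
        }
        // Only values so far: keep pushing until an error shows up.
        Some(Ok(first)) => {
            let mut vals = Vec::with_capacity(rows.size_hint().0.saturating_add(1));
            vals.push(first);
            for row in rows {
                match row {
                    Ok(val) => vals.push(val),
                    Err(_) => return Err(mixed()),
                }
            }
            Ok(vals)
        }
    }
}
```

With this shape, at most one error value is alive at any time, and both mixed cases return early instead of walking the list a second time. For instance, passing `vec![Ok(2), Ok(5), Err(LookupError::CantFindColumn { col_name: "b".to_string() })]` with `"b"` would yield the `RowLacksValue` error, while a list made only of `Err` rows would yield the first `CantFindColumn`.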


- current = Value::list(list, span);
+ let missing_col_errors = list
+ .iter()
+ .filter(|row| matches!(row, Err(ShellError::CantFindColumn { .. })))
+ .count();
+
+ if missing_col_errors != 0 && missing_col_errors < list.len() {
+ return Err(ShellError::RowLacksValue {
+ col_name: column_name.clone(),
+ span: Some(*origin_span),
+ });
+ }
+
+ current =
+ Value::list(list.into_iter().collect::<Result<_, _>>()?, span);
}
Value::Custom { ref val, .. } => {
current = match val.follow_path_string(
10 changes: 8 additions & 2 deletions tests/repl/test_cell_path.rs
@@ -105,8 +105,14 @@ fn list_single_field_failure() -> TestResult {
// Test the scenario where the requested column is not present in all rows
#[test]
fn jagged_list_access_fails() -> TestResult {
- fail_test("[{foo: 'bar'}, {}].foo", "cannot find column")?;
- fail_test("[{}, {foo: 'bar'}].foo", "cannot find column")
+ fail_test(
+ "[{foo: 'bar'}, {}].foo",
+ "cannot find value for 'foo' in some rows",
+ )?;
+ fail_test(
+ "[{}, {foo: 'bar'}].foo",
+ "cannot find value for 'foo' in some rows",
+ )
}

#[test]
2 changes: 1 addition & 1 deletion tests/repl/test_table_operations.rs
@@ -180,7 +180,7 @@ fn update_cell_path_1() -> TestResult {
fn missing_column_errors() -> TestResult {
fail_test(
r#"[ { name: ABC, size: 20 }, { name: HIJ } ].size.1 == null"#,
- "cannot find column",
+ "cannot find value for 'size' in some rows",
)
}
