-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite nested filters again #1935
Conversation
generall
commented
May 20, 2023
- Use regular payload checking function for nested case
- Move index-aware checking into the plain payload checker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've commented some minor suggestions to improve code quality a slight bit. You can decide whether you like them or not 😃 .
Not approving yet because there's still one todo!()
.
lib/segment/src/index/query_optimization/condition_converter.rs
Outdated
Show resolved
Hide resolved
field_indexes: &HashMap<PayloadKeyType, R>, | ||
) -> BitVec | ||
where | ||
R: AsRef<Vec<FieldIndex>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess just using &[FieldIndex]
would be better here, if possible.
We don't have a situation like with read_to_string
where we try to convert multiple types into &FieldIndex
. At least, I cannot find it.
for index in field_indexes.as_ref() { | ||
if let Some(index_check_res) = index.check_condition(field_condition, p) { | ||
if index_check_res { | ||
// If at least one object matches the condition, we can return true | ||
return true; | ||
} | ||
index_checked = true; | ||
// If index check of the condition returned something, we don't need to check | ||
// other indexes | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different approach without break
, not sure which is better:
for index in field_indexes.as_ref() { | |
if let Some(index_check_res) = index.check_condition(field_condition, p) { | |
if index_check_res { | |
// If at least one object matches the condition, we can return true | |
return true; | |
} | |
index_checked = true; | |
// If index check of the condition returned something, we don't need to check | |
// other indexes | |
break; | |
} | |
} | |
if let Some(index_check_res) = field_indexes | |
.as_ref() | |
.iter() | |
.filter_map(|index| index.check_condition(field_condition, p)) | |
.next() | |
{ | |
// If at least one object matches the condition, we can return true | |
if index_check_res { | |
return true; | |
} | |
// If index check of the condition returned something, we don't need to check | |
// other indexes | |
index_checked = true; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to the alternative is to use find_map()
instead of filter_map().next()
:P
lib/segment/src/index/query_optimization/condition_converter.rs
Outdated
Show resolved
Hide resolved
lib/segment/src/index/query_optimization/condition_converter.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
The new design blows my mind, this is a completely different approach that fits much better in the code base. 🏆
* Add optional ID tracker to check_payload, remove boxed closure * Replace some match with or_else
* Replace starts_with/substring with strip_prefix * Transform for-if-let-check-return into any iterator * Transform for-if-return into any iterator * Add comment to describe why check_payload has no ID tracker See: #1935 (comment)
let field_values = payload.get_value(&nested_path).values(); | ||
|
||
for value in field_values { | ||
if let Value::Object(object) = value { | ||
let get_payload = || OwnedPayloadRef::from(object); | ||
if check_payload( | ||
Box::new(get_payload), | ||
// None because has_id in nested is not supported. So retrieving | ||
// IDs throug the tracker would always return None. | ||
None, | ||
&nested.nested.filter, | ||
point_id, | ||
&nested_indexes, | ||
) { | ||
// If at least one nested object matches, return true | ||
return true; | ||
} | ||
} | ||
} | ||
false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wanted to add this to #1940 but got merged first :)
let field_values = payload.get_value(&nested_path).values(); | |
for value in field_values { | |
if let Value::Object(object) = value { | |
let get_payload = || OwnedPayloadRef::from(object); | |
if check_payload( | |
Box::new(get_payload), | |
// None because has_id in nested is not supported. So retrieving | |
// IDs throug the tracker would always return None. | |
None, | |
&nested.nested.filter, | |
point_id, | |
&nested_indexes, | |
) { | |
// If at least one nested object matches, return true | |
return true; | |
} | |
} | |
} | |
false | |
payload | |
.get_value(&nested_path) | |
.values() | |
.iter() | |
// Only use objects | |
.filter_map(|value| value.as_object()) | |
// If at least one nested object matches, return true | |
.any(|object| { | |
check_payload( | |
Box::new(|| OwnedPayloadRef::from(object)), | |
// None because has_id in nested is not supported. So retrieving | |
// IDs throug the tracker would always return None. | |
None, | |
&nested.nested.filter, | |
point_id, | |
&nested_indexes, | |
) | |
}) |
Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
* working nested filters * rm unused file * add comment example * todo * remove nester checkers * Box recursive generic Fn types * Box recursive generic Fn types [2/2] * Add optional ID tracker to check_payload, remove boxed closure (#1939) * Add optional ID tracker to check_payload, remove boxed closure * Replace some match with or_else * Some nested filter improvements (#1940) * Replace starts_with/substring with strip_prefix * Transform for-if-let-check-return into any iterator * Transform for-if-return into any iterator * Add comment to describe why check_payload has no ID tracker See: #1935 (comment) * Update lib/segment/src/payload_storage/query_checker.rs Co-authored-by: Luis Cossío <luis.cossio@qdrant.com> * Update lib/segment/src/payload_storage/query_checker.rs Co-authored-by: Luis Cossío <luis.cossio@qdrant.com> * fix clippy --------- Co-authored-by: timvisee <tim@visee.me> Co-authored-by: Tim Visée <tim+github@visee.me> Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>