Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite nested filters again #1935

Merged
merged 12 commits into from
May 22, 2023
Merged

Rewrite nested filters again #1935

merged 12 commits into from
May 22, 2023

Conversation

generall
Copy link
Member

  • Use regular payload checking function for nested case
  • Move index-aware checking into the plain payload checker

@generall generall requested a review from agourlay May 20, 2023 09:18
@agourlay agourlay changed the title Rewrive nested filters again Rewrite nested filters again May 22, 2023
Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've commented some minor suggestions to improve code quality a slight bit. You can decide whether you like them or not 😃 .

Not approving yet because there's still one todo!().

Comment on lines 146 to 149
field_indexes: &HashMap<PayloadKeyType, R>,
) -> BitVec
where
R: AsRef<Vec<FieldIndex>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess just using &[FieldIndex] would be better here, if possible.

We don't have a situation like with read_to_string where we try to convert multiple types into &FieldIndex. At least, I cannot find it.

lib/segment/src/payload_storage/query_checker.rs Outdated Show resolved Hide resolved
Comment on lines +135 to +146
for index in field_indexes.as_ref() {
if let Some(index_check_res) = index.check_condition(field_condition, p) {
if index_check_res {
// If at least one object matches the condition, we can return true
return true;
}
index_checked = true;
// If index check of the condition returned something, we don't need to check
// other indexes
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different approach without break, not sure which is better:

Suggested change
for index in field_indexes.as_ref() {
if let Some(index_check_res) = index.check_condition(field_condition, p) {
if index_check_res {
// If at least one object matches the condition, we can return true
return true;
}
index_checked = true;
// If index check of the condition returned something, we don't need to check
// other indexes
break;
}
}
if let Some(index_check_res) = field_indexes
.as_ref()
.iter()
.filter_map(|index| index.check_condition(field_condition, p))
.next()
{
// If at least one object matches the condition, we can return true
if index_check_res {
return true;
}
// If index check of the condition returned something, we don't need to check
// other indexes
index_checked = true;
}

Copy link
Contributor

@coszio coszio May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative to the alternative is to use find_map() instead of filter_map().next() :P

Copy link
Member

@agourlay agourlay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

The new design blows my mind, this is a completely different approach that fits much better in the code base. 🏆

* Add optional ID tracker to check_payload, remove boxed closure

* Replace some match with or_else
* Replace starts_with/substring with strip_prefix

* Transform for-if-let-check-return into any iterator

* Transform for-if-return into any iterator

* Add comment to describe why check_payload has no ID tracker

See: #1935 (comment)
Comment on lines +88 to +107
let field_values = payload.get_value(&nested_path).values();

for value in field_values {
if let Value::Object(object) = value {
let get_payload = || OwnedPayloadRef::from(object);
if check_payload(
Box::new(get_payload),
// None because has_id in nested is not supported. So retrieving
// IDs throug the tracker would always return None.
None,
&nested.nested.filter,
point_id,
&nested_indexes,
) {
// If at least one nested object matches, return true
return true;
}
}
}
false
Copy link
Contributor

@coszio coszio May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wanted to add this to #1940 but got merged first :)

Suggested change
let field_values = payload.get_value(&nested_path).values();
for value in field_values {
if let Value::Object(object) = value {
let get_payload = || OwnedPayloadRef::from(object);
if check_payload(
Box::new(get_payload),
// None because has_id in nested is not supported. So retrieving
// IDs throug the tracker would always return None.
None,
&nested.nested.filter,
point_id,
&nested_indexes,
) {
// If at least one nested object matches, return true
return true;
}
}
}
false
payload
.get_value(&nested_path)
.values()
.iter()
// Only use objects
.filter_map(|value| value.as_object())
// If at least one nested object matches, return true
.any(|object| {
check_payload(
Box::new(|| OwnedPayloadRef::from(object)),
// None because has_id in nested is not supported. So retrieving
// IDs throug the tracker would always return None.
None,
&nested.nested.filter,
point_id,
&nested_indexes,
)
})

generall and others added 3 commits May 22, 2023 18:55
Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
@generall generall merged commit 60ce060 into dev May 22, 2023
generall added a commit that referenced this pull request May 22, 2023
* working nested filters

* rm unused file

* add comment example

* todo

* remove nester checkers

* Box recursive generic Fn types

* Box recursive generic Fn types [2/2]

* Add optional ID tracker to check_payload, remove boxed closure (#1939)

* Add optional ID tracker to check_payload, remove boxed closure

* Replace some match with or_else

* Some nested filter improvements (#1940)

* Replace starts_with/substring with strip_prefix

* Transform for-if-let-check-return into any iterator

* Transform for-if-return into any iterator

* Add comment to describe why check_payload has no ID tracker

See: #1935 (comment)

* Update lib/segment/src/payload_storage/query_checker.rs

Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>

* Update lib/segment/src/payload_storage/query_checker.rs

Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>

* fix clippy

---------

Co-authored-by: timvisee <tim@visee.me>
Co-authored-by: Tim Visée <tim+github@visee.me>
Co-authored-by: Luis Cossío <luis.cossio@qdrant.com>
@agourlay agourlay deleted the retrive-nested-filters-again branch July 12, 2023 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants