feat: support json index #36750

Open
wants to merge 6 commits into
base: master

Contributor

@sunby sunby commented Oct 10, 2024

#35528

This PR adds JSON index support for JSON fields and dynamic fields. For now, the index can only be used by unary filter expressions such as 'a["b"] > 1'. Support for more filter types will be added later.

Basic usage:

from pymilvus import DataType  # `collection` is an existing pymilvus Collection
collection.create_index("json_field", {
    "index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING,
               "json_path": 'json_field["a"]["b"]'},
})
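When several paths need an index, assembling the parameter dictionary by hand gets repetitive. The following is a minimal sketch of a helper that builds the same structure as the call above. The names `json_path` and `json_index_params` are hypothetical and not part of pymilvus or of this PR; the cast type is passed in by the caller (for example `DataType.STRING`, as in the usage above).

```python
from typing import Any, Dict, Sequence


def json_path(field: str, keys: Sequence[str]) -> str:
    """Format a path like json_field["a"]["b"] from a field name and keys.

    Assumption: keys are plain strings without double quotes; no escaping
    is attempted here.
    """
    return field + "".join(f'["{key}"]' for key in keys)


def json_index_params(field: str, keys: Sequence[str], cast_type: Any) -> Dict[str, Any]:
    """Build the index-parameter dict in the shape shown in the PR description."""
    return {
        "index_type": "INVERTED",
        "params": {
            "json_cast_type": cast_type,
            "json_path": json_path(field, keys),
        },
    }
```

With this helper, the call above could be written as `collection.create_index("json_field", json_index_params("json_field", ["a", "b"], DataType.STRING))`.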

This index has the following limitations:

  1. If a record does not contain the JSON path you specify, the record is ignored and no error is raised.
  2. If the value at the JSON path cannot be cast to the type you specify, the value is ignored and no error is raised.
  3. Each JSON path can have at most one JSON index.
  4. If you create more than one JSON index on the same JSON field, the SDK (pymilvus<=2.4.7) may return immediately because of its internal implementation. This will be fixed in a later version.

@sre-ci-robot sre-ci-robot added area/internal-api size/XL Denotes a PR that changes 500-999 lines. labels Oct 10, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/feature Issues related to feature request from users labels Oct 10, 2024
Contributor

mergify bot commented Oct 10, 2024

@sunby Please associate the related issue to the body of your Pull Request. (eg. “issue: #”)

@sunby sunby force-pushed the add_json_index branch 2 times, most recently from 7ececff to 68a0644 Compare October 10, 2024 10:52
Contributor

mergify bot commented Oct 10, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Oct 10, 2024

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Oct 12, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

@sre-ci-robot sre-ci-robot added size/XXL Denotes a PR that changes 1000+ lines. and removed size/XL Denotes a PR that changes 500-999 lines. labels Oct 12, 2024
internal/core/src/segcore/SegmentInterface.h (outdated, resolved)
internal/datacoord/index_service.go (outdated, resolved)
internal/datacoord/index_service.go (outdated, resolved)
internal/querynodev2/segments/segment_loader.go (outdated, resolved)
Contributor

mergify bot commented Oct 31, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.


codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 69.34673% with 61 lines in your changes missing coverage. Please review.

Project coverage is 82.92%. Comparing base (2a02bbe) to head (42a790c).
Report is 7 commits behind head on master.

Current head 42a790c differs from pull request most recent head d0a0b18

Please upload reports for the commit d0a0b18 to get more accurate results.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/datacoord/index_service.go | 67.88% | 26 Missing and 9 partials ⚠️ |
| internal/datacoord/index_meta.go | 60.00% | 8 Missing and 4 partials ⚠️ |
| internal/util/indexparamcheck/inverted_checker.go | 30.00% | 6 Missing and 1 partial ⚠️ |
| internal/querynodev2/segments/segment.go | 81.81% | 3 Missing and 1 partial ⚠️ |
| internal/querynodev2/segments/segment_l0.go | 0.00% | 2 Missing ⚠️ |
| internal/querynodev2/segments/segment_loader.go | 95.83% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #36750      +/-   ##
==========================================
+ Coverage   81.11%   82.92%   +1.80%     
==========================================
  Files        1395     1099     -296     
  Lines      197434   170918   -26516     
==========================================
- Hits       160154   141728   -18426     
+ Misses      31641    23548    -8093     
- Partials     5639     5642       +3     

| Components | Coverage Δ |
|---|---|
| Client | 79.53% <ø> (ø) |
| Core | ∅ <ø> (∅) |
| Go | 83.06% <70.77%> (+0.01%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| internal/querycoordv2/checkers/index_checker.go | 79.09% <100.00%> (ø) |
| internal/querycoordv2/meta/segment_dist_manager.go | 89.84% <ø> (ø) |
| internal/querynodev2/services.go | 87.42% <100.00%> (+0.65%) ⬆️ |
| pkg/common/common.go | 77.18% <ø> (ø) |
| internal/querynodev2/segments/segment_loader.go | 73.00% <95.83%> (+0.13%) ⬆️ |
| internal/querynodev2/segments/segment_l0.go | 70.40% <0.00%> (-1.47%) ⬇️ |
| internal/querynodev2/segments/segment.go | 65.24% <81.81%> (+0.08%) ⬆️ |
| internal/util/indexparamcheck/inverted_checker.go | 65.00% <30.00%> (-35.00%) ⬇️ |
| internal/datacoord/index_meta.go | 94.06% <60.00%> (-1.26%) ⬇️ |
| internal/datacoord/index_service.go | 89.69% <67.88%> (-3.60%) ⬇️ |

... and 318 files with indirect coverage changes

Contributor

mergify bot commented Oct 31, 2024

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@xiaofan-luan
Collaborator

@sunby

Are you done with this feature so I can start to review it?

Contributor

mergify bot commented Nov 4, 2024

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Nov 4, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Nov 4, 2024

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Nov 5, 2024

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Nov 5, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 26, 2024

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 26, 2024

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 2, 2025

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Jan 2, 2025

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 2, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 2, 2025

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor Author

sunby commented Jan 3, 2025

rerun go-sdk

Contributor

mergify bot commented Jan 3, 2025

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 6, 2025

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Jan 6, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor Author

sunby commented Jan 9, 2025

/run-cpu-e2e

Contributor

mergify bot commented Jan 9, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor Author

sunby commented Jan 9, 2025

/run-cpu-e2e

Contributor

mergify bot commented Jan 9, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 10, 2025

@sunby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 10, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

sunby added 5 commits January 13, 2025 10:28
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Contributor

mergify bot commented Jan 13, 2025

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Jan 13, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
Contributor

mergify bot commented Jan 13, 2025

@sunby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Jan 13, 2025

@sunby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Labels
area/compilation area/internal-api area/test dco-passed DCO check passed. kind/feature Issues related to feature request from users sig/testing size/XXL Denotes a PR that changes 1000+ lines.
4 participants