[Feature]: Sorting segment data by primary key to support primary key index #33744

Open
1 task done
xiaocai2333 opened this issue Jun 11, 2024 · 1 comment

Comments

@xiaocai2333
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

  1. Sorting segment data by primary key.
  2. Support primary key index.

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

@xiaocai2333 xiaocai2333 added the kind/feature Issues related to feature request from users label Jun 11, 2024
@xiaocai2333 xiaocai2333 self-assigned this Jun 11, 2024
@xiaocai2333 xiaocai2333 modified the milestones: 2.4.5, 2.5.0 Jun 11, 2024
sre-ci-robot pushed a commit that referenced this issue Sep 2, 2024
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
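To illustrate why sorting a segment by primary key helps lookups, here is a minimal sketch in Python. It is not Milvus code: the `Segment` class, its `is_sorted` flag, and both functions are hypothetical stand-ins. Once the primary-key column is sorted, a point lookup can use binary search instead of a full scan.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical in-memory segment: a PK column plus a parallel row column."""
    pks: List[int]
    rows: List[str]
    is_sorted: bool = False


def sort_segment_by_pk(seg: Segment) -> Segment:
    """Return a new segment whose rows are ordered by primary key."""
    order = sorted(range(len(seg.pks)), key=lambda i: seg.pks[i])
    return Segment(
        pks=[seg.pks[i] for i in order],
        rows=[seg.rows[i] for i in order],
        is_sorted=True,
    )


def find_by_pk(seg: Segment, pk: int) -> Optional[str]:
    """Look up a row by PK: binary search when sorted, linear scan otherwise."""
    if seg.is_sorted:
        i = bisect_left(seg.pks, pk)
        if i < len(seg.pks) and seg.pks[i] == pk:
            return seg.rows[i]
        return None
    for key, row in zip(seg.pks, seg.rows):
        if key == pk:
            return row
    return None
```

In this sketch the sorted path costs O(log n) per lookup, while the unsorted path costs O(n).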

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 12, 2024
issue: #33744

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 15, 2024
This PR introduces a stats task for import:
1. Define new `Stats` and `IndexBuilding` states for importJob.
2. Add a new stats step to the import process: trigger the stats task and
wait for its completion.
3. Abort the stats task if the import job fails.
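The import flow described above can be sketched as a small state machine. This is a Python illustration, not the actual datacoord implementation. Only the `Stats` and `IndexBuilding` state names come from the commit message. The other states and the `ImportJob` and `StatsTask` classes are assumptions made for the example.

```python
from enum import Enum, auto
from typing import Optional


class ImportJobState(Enum):
    IMPORTING = auto()       # assumed earlier state
    STATS = auto()           # named in the commit message
    INDEX_BUILDING = auto()  # named in the commit message
    COMPLETED = auto()       # assumed terminal state
    FAILED = auto()          # assumed terminal state


class StatsTask:
    """Hypothetical handle for a running stats (sort-by-PK) task."""

    def __init__(self) -> None:
        self.done = False
        self.aborted = False

    def finish(self) -> None:
        self.done = True

    def abort(self) -> None:
        self.aborted = True


class ImportJob:
    """Hypothetical import job that waits on a stats task before indexing."""

    def __init__(self) -> None:
        self.state = ImportJobState.IMPORTING
        self.stats_task: Optional[StatsTask] = None

    def advance(self) -> None:
        """Move one step forward; stay in STATS until the stats task is done."""
        if self.state is ImportJobState.IMPORTING:
            self.stats_task = StatsTask()  # trigger the stats task
            self.state = ImportJobState.STATS
        elif self.state is ImportJobState.STATS:
            if self.stats_task is not None and self.stats_task.done:
                self.state = ImportJobState.INDEX_BUILDING
        elif self.state is ImportJobState.INDEX_BUILDING:
            self.state = ImportJobState.COMPLETED

    def fail(self) -> None:
        """Mark the job failed and abort an unfinished stats task."""
        if self.stats_task is not None and not self.stats_task.done:
            self.stats_task.abort()
        self.state = ImportJobState.FAILED
```

Calling `advance()` repeatedly leaves the job in `STATS` until the task reports completion, which mirrors the "trigger and wait" step.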

issue: #33744

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 20, 2024
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 28, 2024
issue: #33744

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 28, 2024
…ction (#36408)

issue: #33744

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 28, 2024
issue: #33744 

master pr: #36371

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Sep 30, 2024
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit to milvus-io/pymilvus that referenced this issue Oct 8, 2024
milvus issue:
[#milvus-io/milvus](milvus-io/milvus#33744)

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
hasansustcse13 pushed a commit to hasansustcse13/pymilvus that referenced this issue Oct 9, 2024
…o#2280)

milvus issue:
[#milvus-io/milvus](milvus-io/milvus#33744)

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: hasan <hasan@m2sys.com>
sre-ci-robot pushed a commit that referenced this issue Oct 28, 2024
issue: #33744 

Check whether the PK is truly sorted in debug mode.
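A debug-only sortedness check of this kind might look like the following Python sketch. The `DEBUG` flag and both function names are hypothetical. The sketch also assumes that "sorted" means non-decreasing, so equal adjacent keys pass. The commit message does not say whether duplicates are allowed.

```python
from typing import Sequence

DEBUG = True  # hypothetical switch standing in for a debug build or log level


def is_pk_sorted(pks: Sequence[int]) -> bool:
    """Return True if the primary keys are in non-decreasing order."""
    return all(pks[i] <= pks[i + 1] for i in range(len(pks) - 1))


def check_pk_sorted(pks: Sequence[int], debug: bool = DEBUG) -> None:
    """Raise ValueError on unsorted PKs, but only when debugging is enabled."""
    if debug and not is_pk_sorted(pks):
        raise ValueError("segment is marked sorted but primary keys are out of order")
```

Gating the check behind a flag keeps the O(n) scan out of the normal path.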

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
xiaocai2333 added a commit to xiaocai2333/milvus that referenced this issue Oct 28, 2024
issue: milvus-io#33744 

Check whether the PK is truly sorted in debug mode.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
@xiaofan-luan
Collaborator

this is almost done

sre-ci-robot pushed a commit that referenced this issue Oct 30, 2024
issue: #33744 

master pr: #36819 

Check whether the PK is truly sorted in debug mode.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Oct 31, 2024
issue: #33744 

master pr: #36819 

Check whether the PK is truly sorted in debug mode.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Nov 7, 2024
issue: #33744 

1. Segments generated from inserts will be loaded as growing until they
are sorted by primary key.
2. This PR may increase memory pressure on the delegator, so we still need
to test the performance of stats. In local testing, stats ran faster
than inserts.
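The loading rule in item 1 can be sketched in Python as below. This is only an illustration. The `SegmentInfo` fields and the `load_mode` function are hypothetical and do not reflect the actual Milvus types.

```python
from dataclasses import dataclass


@dataclass
class SegmentInfo:
    """Hypothetical segment metadata used only for this sketch."""
    segment_id: int
    from_insert: bool  # True if the segment was produced by inserts
    is_sorted: bool    # True once the stats task has sorted it by PK


def load_mode(seg: SegmentInfo) -> str:
    """Return "growing" for insert segments not yet sorted by PK, else "sealed"."""
    if seg.from_insert and not seg.is_sorted:
        return "growing"
    return "sealed"
```

Under this rule, an insert-generated segment stays growing until it is sorted, which is the source of the extra memory pressure noted in item 2.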

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
sre-ci-robot pushed a commit that referenced this issue Nov 8, 2024
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
@yanliang567 yanliang567 modified the milestones: 2.5.0, 2.5.1 Dec 24, 2024
3 participants