Optimize insert performance by batching #3621

Conversation
Codecov Report

@@           Coverage Diff           @@
##             main    #3621   +/-   ##
==========================================
- Coverage   74.17%   74.07%   -0.10%
==========================================
  Files         370      372       +2
  Lines       23564    23626      +62
==========================================
+ Hits        17478    17502      +24
- Misses       5059     5093      +34
- Partials     1027     1031       +4

... and 12 files with indirect coverage changes

Flags with carried forward coverage won't be shown.
Yeah, let's start by fixing the code to pass integration tests.
@princejha95 this pull request has merge conflicts.
Thanks for your contribution 🤗, I made changes for CI to pass.
Looks great!
	if params.Ordered {
		break
	}
} else {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's simplify that with:

if err = doc.ValidateData(); err == nil {
	docs = append(docs, doc)
	docsIndexes = append(docsIndexes, int32(i))
	continue
}
Thanks, let's do 🙏
var j int

for i := 0; i < len(params.Docs); i += batchSize {
	if j += batchSize; j > len(params.Docs) {
		j = len(params.Docs)
	}
I think that could be simplified by using a slice instead of a second index j:

docs := params.Docs

// ...

i := min(batchSize, len(docs))
batch, docs := docs[:i], docs[i:]
Thanks, it's easier to read too.
	metadata.DefaultColumn,
)
args = append(args, doc.RecordID())

q, args, err := prepareInsertStatement(meta.TableName, meta.Settings.CappedSize > 0, params.Docs[i:j])
Let's add a method for meta.Settings.CappedSize > 0
It was already added by another PR, using it now 👍
for {
	i, d, err := docsIter.Next()
	if errors.Is(err, iterator.ErrIteratorDone) {
		if done {
			break
		}
for !done {
Thanks
Description
Closes #3271.
Readiness checklist

- I ran task all, and it passed.
- I set Reviewers (@FerretDB/core), Milestone (Next), Labels, Project and project's Sprint fields.