Optimize detection of duplicate fields #3645

Merged: 4 commits, Oct 26, 2023

@AlekSi (Member) commented Oct 25, 2023

Description

We don't want to fully remove or refactor that code just yet (that's #2412), but we can make it quicker… quickly.
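
The diff itself is not shown in this conversation, so the following is only a sketch of one common way to make a duplicate-key check cheaper. It is not the change made in this PR. The helper name `firstDuplicate` and the `smallLimit` threshold are assumptions made for illustration.

```go
package main

import "fmt"

// firstDuplicate returns the first key that repeats an earlier key in keys,
// and false if all keys are unique.
// Hypothetical helper for illustration; not FerretDB's actual code.
func firstDuplicate(keys []string) (string, bool) {
	// Assumed threshold: for short key lists, a quadratic scan
	// avoids allocating a map at all.
	const smallLimit = 8

	if len(keys) <= smallLimit {
		for i := 1; i < len(keys); i++ {
			for j := 0; j < i; j++ {
				if keys[i] == keys[j] {
					return keys[i], true
				}
			}
		}
		return "", false
	}

	// For longer lists, use a set with a capacity hint so it does not
	// have to grow while the keys are being scanned.
	seen := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		if _, ok := seen[k]; ok {
			return k, true
		}
		seen[k] = struct{}{}
	}
	return "", false
}

func main() {
	fmt.Println(firstDuplicate([]string{"_id", "name", "name"}))
	fmt.Println(firstDuplicate([]string{"_id", "name", "email"}))
}
```

Whether a split like this pays off depends on typical document sizes, which is the kind of question the benchmarks below are meant to answer.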

task: [bench-postgresql] ../bin/benchstat old-postgresql.txt new-postgresql.txt
goos: darwin
goarch: arm64
pkg: github.com/FerretDB/FerretDB/integration
                                                          │ old-postgresql.txt │          new-postgresql.txt          │
                                                          │       sec/op       │    sec/op     vs base                │
Find/SmallDocuments/Docs1000/7dc4/Int32ID-10                      2.237m ±  6%   2.102m ± 14%        ~ (p=0.165 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32One-10                     2.337m ±  6%   1.902m ± 11%  -18.62% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32Many-10                    7.425m ±  1%   7.161m ±  1%   -3.55% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32ManyDotNotation-10         20.62m ±  0%   19.20m ±  3%   -6.90% (p=0.000 n=10)
ReplaceOne/SettingsDocuments/Docs1000/b34e-10                     36.25m ±  1%   25.30m ±  1%  -30.21% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1-10                 550.9m ±  1%   551.7m ±  1%        ~ (p=0.971 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch10-10                467.7m ±  3%   472.4m ±  5%        ~ (p=0.912 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch100-10               474.2m ±  2%   483.0m ±  5%        ~ (p=0.089 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1000-10              471.8m ±  3%   467.3m ±  3%        ~ (p=0.105 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1-10               5.793 ±  5%    5.857 ± 10%        ~ (p=0.684 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch10-10              1.578 ±  2%    1.532 ±  1%   -2.92% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch100-10             1.448 ±  3%    1.443 ±  2%        ~ (p=0.123 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1000-10            1.723 ± 31%    1.676 ±  3%        ~ (p=0.075 n=10)
geomean                                                           158.0m         148.9m         -5.71%

                                                          │ old-postgresql.txt │          new-postgresql.txt          │
                                                          │        B/op        │     B/op      vs base                │
Find/SmallDocuments/Docs1000/7dc4/Int32ID-10                      113.8Ki ± 0%   117.6Ki ± 0%   +3.31% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32One-10                     113.9Ki ± 0%   117.7Ki ± 0%   +3.37% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32Many-10                    6.670Mi ± 0%   6.947Mi ± 0%   +4.15% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32ManyDotNotation-10         18.91Mi ± 0%   19.59Mi ± 0%   +3.62% (p=0.000 n=10)
ReplaceOne/SettingsDocuments/Docs1000/b34e-10                    36.159Mi ± 0%   1.547Mi ± 0%  -95.72% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1-10                 65.81Mi ± 0%   68.32Mi ± 0%   +3.82% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch10-10                21.92Mi ± 0%   22.28Mi ± 0%   +1.60% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch100-10               17.60Mi ± 0%   17.73Mi ± 0%   +0.74% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1000-10              17.30Mi ± 0%   17.42Mi ± 0%   +0.72% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1-10              754.0Mi ± 0%   591.1Mi ± 0%  -21.60% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch10-10             804.4Mi ± 0%   639.6Mi ± 0%  -20.48% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch100-10            842.5Mi ± 0%   677.3Mi ± 0%  -19.60% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1000-10           833.1Mi ± 0%   668.3Mi ± 0%  -19.78% (p=0.000 n=10)
geomean                                                           29.06Mi        21.61Mi       -25.65%

                                                          │ old-postgresql.txt │         new-postgresql.txt          │
                                                          │     allocs/op      │  allocs/op   vs base                │
Find/SmallDocuments/Docs1000/7dc4/Int32ID-10                       1.006k ± 0%   1.014k ± 0%   +0.79% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32One-10                      1.010k ± 0%   1.020k ± 0%   +0.99% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32Many-10                     58.17k ± 0%   58.43k ± 0%   +0.45% (p=0.000 n=10)
Find/SmallDocuments/Docs1000/7dc4/Int32ManyDotNotation-10          173.5k ± 0%   174.8k ± 0%   +0.73% (p=0.000 n=10)
ReplaceOne/SettingsDocuments/Docs1000/b34e-10                     26.750k ± 0%   6.419k ± 0%  -76.00% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1-10                  693.7k ± 0%   698.3k ± 0%   +0.66% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch10-10                 301.6k ± 0%   300.1k ± 0%   -0.51% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch100-10                261.9k ± 0%   259.7k ± 0%   -0.83% (p=0.000 n=10)
InsertMany/SmallDocuments/Docs1000/7dc4/Batch1000-10               258.7k ± 0%   256.4k ± 0%   -0.87% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1-10               3.071M ± 0%   2.980M ± 0%   -2.97% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch10-10              2.668M ± 0%   2.570M ± 0%   -3.68% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch100-10             2.620M ± 0%   2.521M ± 0%   -3.78% (p=0.000 n=10)
InsertMany/SettingsDocuments/Docs1000/b34e/Batch1000-10            2.613M ± 0%   2.515M ± 0%   -3.76% (p=0.000 n=10)
geomean                                                            180.7k        160.3k       -11.29%
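
For reading the `vs base` column: it is the relative change from the old value to the new one. A minimal sketch of that arithmetic, using the displayed (rounded) `ReplaceOne` values from the tables above; `percentDelta` is a hypothetical helper, and since benchstat works from unrounded values, other rows may differ from this calculation in the last digit.

```go
package main

import "fmt"

// percentDelta returns the relative change from base to cur, in percent.
// Hypothetical helper for illustration; not part of benchstat.
func percentDelta(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// ReplaceOne/SettingsDocuments, sec/op: 36.25m -> 25.30m.
	fmt.Printf("sec/op: %+.2f%%\n", percentDelta(36.25, 25.30))

	// ReplaceOne/SettingsDocuments, B/op: 36.159Mi -> 1.547Mi.
	fmt.Printf("B/op:   %+.2f%%\n", percentDelta(36.159, 1.547))
}
```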

Refs #3633.
Refs #2412.

Readiness checklist

  • I added/updated unit tests (and they pass).
  • I added/updated integration/compatibility tests (and they pass).
  • I added/updated comments and checked rendering.
  • I made spot refactorings.
  • I updated user documentation.
  • I ran task all, and it passed.
  • I ensured that PR title is good enough for the changelog.
  • (for maintainers only) I set Reviewers (@FerretDB/core), Milestone (Next), Labels, Project and project's Sprint fields.
  • I marked all done items in this checklist.

@mergify (bot) assigned AlekSi Oct 25, 2023

@codecov (bot) commented Oct 25, 2023

Codecov Report

Merging #3645 (20a9a7e) into main (329efdf) will decrease coverage by 0.59%.
The diff coverage is 96.49%.

@@            Coverage Diff             @@
##             main    #3645      +/-   ##
==========================================
- Coverage   62.59%   62.00%   -0.59%     
==========================================
  Files         430      430              
  Lines       27887    27906      +19     
==========================================
- Hits        17455    17303     -152     
- Misses       9411     9568     +157     
- Partials     1021     1035      +14     
Files                        Coverage Δ
internal/types/types.go      94.20% <100.00%> (+0.08%) ⬆️
internal/types/document.go   95.29% <96.42%> (+4.15%) ⬆️

... and 18 files with indirect coverage changes

Flag           Coverage Δ
filter-false   ?
filter-true    58.64% <94.73%> (-0.56%) ⬇️
hana-1         ?
integration    58.64% <94.73%> (-0.60%) ⬇️
mongodb-1      4.46% <42.10%> (+0.03%) ⬆️
postgresql-1   42.95% <89.47%> (-0.27%) ⬇️
postgresql-2   40.29% <94.73%> (-0.28%) ⬇️
postgresql-3   42.54% <94.73%> (-0.35%) ⬇️
sort-false     58.64% <94.73%> (-0.14%) ⬇️
sort-true      ?
sqlite-1       42.37% <89.47%> (-0.15%) ⬇️
sqlite-2       39.74% <94.73%> (-0.18%) ⬇️
sqlite-3       41.98% <94.73%> (-0.40%) ⬇️
unit           22.79% <96.49%> (+0.04%) ⬆️

@AlekSi changed the title from "Speedup" to "Optimize detection of duplicate fields" Oct 25, 2023
@AlekSi added this to the Next milestone Oct 25, 2023
@AlekSi added the code/enhancement label (Some user-visible feature could work better) Oct 25, 2023
@AlekSi marked this pull request as ready for review October 25, 2023 10:31
@AlekSi requested a review from a team as a code owner October 25, 2023 10:31
@AlekSi enabled auto-merge (squash) October 25, 2023 10:31
@AlekSi requested review from a team and noisersup October 25, 2023 10:31

@noisersup (Member) left a comment

LGTM

@AlekSi merged commit ebe7cd1 into FerretDB:main Oct 26, 2023
29 of 32 checks passed
@AlekSi deleted the speedup branch October 26, 2023 01:27