Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sjson package fuzzing #3071

Merged
merged 5 commits into from
Jul 19, 2023
Merged

Improve sjson package fuzzing #3071

merged 5 commits into from
Jul 19, 2023

Conversation

quasilyte
Copy link
Contributor

@quasilyte quasilyte commented Jul 18, 2023

Description

Previously, only FuzzDocument was reaching the marshal+unmarshal steps due to the document type assertion. If it wasn't a documentType, the rest of the fuzzing function was skipped.

Since data validation is available only via types.Document.ValidateData, a helper isValidDocumentData function is introduced. It may go away in the future. But for now, it allows all sjson types to pass the validation.

The other change is related to the way we fuzz sjson. Instead of generating both values and schemes at the same time, focus on one thing at the time: a fixed scheme against the generated and fixed documents against the generated schemes.

This approach can be further improved by the increased number of relevant corpus seed entries, but that's a different issue (#3067).

To answer the question is it any better: yes, it is. The old approach skipped any cases where either of those two was invalid; it was a very rare occasion for a fuzz runner to generate a valid JSON for both document and scheme. It's even more rare to get both of them even remotely compatible.

Closes #1273

Readiness checklist

  • I added/updated unit tests (and they pass).
  • I added/updated integration/compatibility tests (and they pass).
  • I added/updated comments and checked rendering.
  • I made spot refactorings.
  • I updated user documentation.
  • I ran task all, and it passed.
  • I ensured that PR title is good enough for the changelog.
  • (for maintainers only) I set Reviewers (@FerretDB/core), Labels, Project and project's Sprint fields.
  • I marked all done items in this checklist.

Previously, only FuzzDocument was reaching the marshal+unmarshal
steps due to the document type assertion. If it wasn't a documentType,
the rest of the fuzzing function was skipped.

Since data validation is available only via `types.Document.ValidateData`,
a helper `isValidDocumentData` function is introduced. It may go
away in the future. But for now, it allows all sjson types to pass
the validation.

The other change is related to the way we fuzz sjson.
Instead of generating both values and schemes at the same time,
focus on one thing at the time: a fixed scheme against the generated
and fixed documents against the generated schemes.

This approach can be further improved by the increased number
of relevant corpus seed entries, but that's a different issue (#3067).

To answer the question is it any better: yes, it is.
The old approach skipped any cases where either of those two
was invalid; it was a very rare occasion for a fuzz runner
to generate a valid JSON for both document and scheme.
It's even more rare to get both of them even remotely compatible.

Updates #1273
@quasilyte quasilyte requested review from a team and AlekSi as code owners July 18, 2023 12:41
@quasilyte quasilyte requested a review from rumyantseva July 18, 2023 12:41
@codecov
Copy link

codecov bot commented Jul 18, 2023

Codecov Report

Merging #3071 (bb89b5b) into main (fbe7748) will decrease coverage by 0.05%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3071      +/-   ##
==========================================
- Coverage   76.48%   76.44%   -0.05%     
==========================================
  Files         386      386              
  Lines       21165    21165              
==========================================
- Hits        16188    16179       -9     
- Misses       4050     4060      +10     
+ Partials      927      926       -1     
Impacted Files Coverage Δ
internal/handlers/sjson/sjson.go 85.65% <100.00%> (+2.17%) ⬆️

... and 12 files with indirect coverage changes

Flag Coverage Δ
hana ?
integration 72.98% <0.00%> (-0.04%) ⬇️
mongodb 5.54% <0.00%> (ø)
pg 66.26% <0.00%> (-0.04%) ⬇️
shard-1 54.80% <0.00%> (-0.01%) ⬇️
shard-2 55.20% <0.00%> (+0.06%) ⬆️
shard-3 57.64% <0.00%> (-0.11%) ⬇️
sqlite 46.78% <0.00%> (-0.01%) ⬇️
unit 24.23% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@quasilyte quasilyte changed the title improve sjson package fuzzing Improve sjson package fuzzing Jul 18, 2023
@quasilyte quasilyte enabled auto-merge (squash) July 18, 2023 12:56
@quasilyte quasilyte requested review from a team, chilagrow and noisersup July 18, 2023 13:47
@quasilyte quasilyte disabled auto-merge July 18, 2023 13:52
@quasilyte quasilyte enabled auto-merge (squash) July 18, 2023 13:53
chilagrow
chilagrow previously approved these changes Jul 19, 2023
Copy link
Member

@chilagrow chilagrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvement!

internal/handlers/sjson/array_test.go Outdated Show resolved Hide resolved
internal/handlers/sjson/sjson_test.go Show resolved Hide resolved
Copy link
Member

@chilagrow chilagrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Let's start fuzzing!

Copy link
Member

@AlekSi AlekSi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I marked all done items in this checklist.

Please do that.

And Conform PR fails now ;)

LGTM overall

- go test -run=XXX -fuzz=FuzzArrayWithFixedSchemas -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/
- go test -run=XXX -fuzz=FuzzArrayWithFixedDocuments -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/
- go test -run=XXX -fuzz=FuzzDocumentWithFixedSchemas -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/
- go test -run=XXX -fuzz=FuzzDocumentWithFixedDocuments -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They were sorted alphabetically, please move them back

func addRecordedFuzzDocs(f *testing.F, needDocument, needSchema bool) int {
// We're trying to use that corpus with our hopes set high,
// but chances are, it will still be 0 extra documents.
// See #3067 for more details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always use full-issue URLs and TODO markers for such cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +296 to +297
// if there was no error,
// check that the documents match each other
Copy link
Member

@AlekSi AlekSi Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That comment duplicates the code, we should improve it:

FerretDB/CONTRIBUTING.md

Lines 226 to 229 in fbe7748

- In code comments, in general, do not describe _what_ the code does: it should be clear from the code itself
(and when it doesn't and the code is tricky, simplify it instead).
Instead, describe _why_ the code does that if it is not clear from the surrounding context, names, etc.
There is no need to add comments just because there are no comments if everything is already clear without them.

@AlekSi AlekSi added this to the Next milestone Jul 19, 2023
@quasilyte quasilyte added the code/chore Code maintenance improvements label Jul 19, 2023
@quasilyte quasilyte merged commit 4106994 into FerretDB:main Jul 19, 2023
@quasilyte quasilyte deleted the quasilyte/improve_sjson_fuzzing branch July 19, 2023 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code/chore Code maintenance improvements
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Clean-up sjson fuzzing
3 participants