-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve sjson package fuzzing #3071
Improve sjson package fuzzing #3071
Conversation
Previously, only FuzzDocument was reaching the marshal+unmarshal steps due to the document type assertion. If it wasn't a documentType, the rest of the fuzzing function was skipped. Since data validation is available only via `types.Document.ValidateData`, a helper `isValidDocumentData` function is introduced. It may go away in the future. But for now, it allows all sjson types to pass the validation. The other change is related to the way we fuzz sjson. Instead of generating both values and schemes at the same time, focus on one thing at the time: a fixed scheme against the generated and fixed documents against the generated schemes. This approach can be further improved by the increased number of relevant corpus seed entries, but that's a different issue (#3067). To answer the question is it any better: yes, it is. The old approach skipped any cases where either of those two was invalid; it was a very rare occasion for a fuzz runner to generate a valid JSON for both document and scheme. It's even more rare to get both of them even remotely compatible. Updates #1273
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #3071 +/- ##
==========================================
- Coverage 76.48% 76.44% -0.05%
==========================================
Files 386 386
Lines 21165 21165
==========================================
- Hits 16188 16179 -9
- Misses 4050 4060 +10
+ Partials 927 926 -1
... and 12 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 Let's start fuzzing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I marked all done items in this checklist.
Please do that.
And Conform PR fails now ;)
LGTM overall
- go test -run=XXX -fuzz=FuzzArrayWithFixedSchemas -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/ | ||
- go test -run=XXX -fuzz=FuzzArrayWithFixedDocuments -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/ | ||
- go test -run=XXX -fuzz=FuzzDocumentWithFixedSchemas -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/ | ||
- go test -run=XXX -fuzz=FuzzDocumentWithFixedDocuments -fuzztime={{.FUZZ_TIME}} ./internal/handlers/sjson/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were sorted alphabetically, please move them back
func addRecordedFuzzDocs(f *testing.F, needDocument, needSchema bool) int { | ||
// We're trying to use that corpus with our hopes set high, | ||
// but chances are, it will still be 0 extra documents. | ||
// See #3067 for more details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We always use full-issue URLs and TODO markers for such cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still like that in the main
branch: https://github.com/FerretDB/FerretDB/blob/main/internal/handlers/sjson/sjson_test.go#L196
// if there was no error, | ||
// check that the documents match each other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That comment duplicates the code, we should improve it:
Lines 226 to 229 in fbe7748
- In code comments, in general, do not describe _what_ the code does: it should be clear from the code itself | |
(and when it doesn't and the code is tricky, simplify it instead). | |
Instead, describe _why_ the code does that if it is not clear from the surrounding context, names, etc. | |
There is no need to add comments just because there are no comments if everything is already clear without them. |
Description
Previously, only FuzzDocument was reaching the marshal+unmarshal steps due to the document type assertion. If it wasn't a documentType, the rest of the fuzzing function was skipped.
Since data validation is available only via
types.Document.ValidateData
, a helperisValidDocumentData
function is introduced. It may go away in the future. But for now, it allows all sjson types to pass the validation.The other change is related to the way we fuzz sjson. Instead of generating both values and schemes at the same time, focus on one thing at the time: a fixed scheme against the generated and fixed documents against the generated schemes.
This approach can be further improved by the increased number of relevant corpus seed entries, but that's a different issue (#3067).
To answer the question is it any better: yes, it is. The old approach skipped any cases where either of those two was invalid; it was a very rare occasion for a fuzz runner to generate a valid JSON for both document and scheme. It's even more rare to get both of them even remotely compatible.
Closes #1273
Readiness checklist
task all
, and it passed.@FerretDB/core
), Labels, Project and project's Sprint fields.