Add scripts to generate a dictionary and seed corpus for the config fuzzing #5915

Merged on Oct 27, 2019 (1 commit)

tomrittervg
Contributor

This is partly a straw-man PR.

The dictionary should list keywords for the config file. Rather than make a hard-coded list that will eventually get out of date, I grab them from the sample config files and from keywords inside HasMember() in the source code.
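A minimal sketch of that extraction, not the script from this PR. It assumes the sample configs are JSON, that keys contain only letters, digits, and underscores, and that the source checks keys with literal `HasMember("key")` calls. The function name `gen_dict` and both directory arguments are hypothetical. Output is one quoted keyword per line, which is a form libFuzzer dictionaries accept.

```bash
# Sketch: print a de-duplicated, libFuzzer-style dictionary to stdout.
# $1 = directory of sample config files, $2 = directory of source code.
gen_dict() {
  {
    # JSON keys: a quoted word followed by a colon, then keep only the quoted part.
    grep -rhoE '"[A-Za-z0-9_]+"[[:space:]]*:' "$1" | grep -oE '"[A-Za-z0-9_]+"'
    # Keys the code looks up: the string literal inside HasMember("...").
    grep -rhoE 'HasMember\("[A-Za-z0-9_]+"\)' "$2" | grep -oE '"[A-Za-z0-9_]+"'
  } | sort -u
}
```

Printing to stdout keeps the sketch easy to inspect; redirecting it (`gen_dict configs/ src/ > out.dict`) produces the dictionary file.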

The seed corpus is a set of interesting config files for the fuzzer to start with. Again, I grab the sample config files.
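One way to bundle the seeds, as a sketch under assumptions rather than the script from this PR: sample configs are taken to be files ending in `.conf`, file names are assumed not to contain newlines, and the function names `stage_seeds` and `gen_corpus` are hypothetical. Seeds are renamed after a checksum of their contents so that same-named files from different directories do not overwrite each other once paths are flattened.

```bash
# Sketch: copy every *.conf under the given directories into one staging dir.
# $1 = staging directory, remaining arguments = directories to search.
stage_seeds() {
  stage_dir="$1"
  shift
  mkdir -p "$stage_dir"
  find "$@" -type f -name '*.conf' | while IFS= read -r f; do
    # cksum reading stdin prints "<crc> <size>"; keep the CRC as the file name.
    sum="$(cksum < "$f" | cut -d ' ' -f 1)"
    cp "$f" "$stage_dir/$sum.conf"
  done
}

# Sketch: zip the staged seeds. $1 = output zip, remaining = directories.
gen_corpus() {
  out_zip="$1"
  shift
  tmp_stage="$(mktemp -d)"
  stage_seeds "$tmp_stage" "$@"
  # -j stores files without directory paths; -q keeps the build log quiet.
  zip -q -j "$out_zip" "$tmp_stage"/*
  rm -rf "$tmp_stage"
}
```

Because the search directories are plain arguments, adding another source of configs is just one more argument to `gen_corpus`.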

I'd love more/better suggestions for how to identify config keywords in the codebase, or for config files that are interesting but not represented in the existing test configs.

If these techniques are decent but some hardcoded suggestions are also valuable, we can add them to the script (for the dict) or to a sub-directory of harnesses (for seed files).

Finally, if no one has any ideas, we can always add these to the tree, update oss-fuzz, and see if the code coverage improves.

osquery/main/harnesses/gen_fuzz_config_dict.sh (outdated)
}

function main() {
if [[ $# < 1 ]]; then
Member

What's the expected usage? This seems like it's meant to be called with an argument for where the output goes? I generally expect scripts to take arguments for input and to write output to stdout.

Contributor Author

These scripts will be called as part of the oss-fuzz build.sh file to put the resulting files in the $OUT dir with the appropriate name. jbig2dec demonstrates what the files are supposed to look like (although I don't know why it calls unzip -l).
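For reference, OSS-Fuzz picks up a dictionary named `<fuzz_target>.dict` and a seed archive named `<fuzz_target>_seed_corpus.zip` when they sit next to the fuzz target in `$OUT`. A small sketch of helpers that build those paths; the helper names and the target name used in the usage line are hypothetical, not taken from this PR.

```bash
# Sketch: derive the file names OSS-Fuzz looks for beside a fuzz target.
# $1 = output directory (OSS-Fuzz exports it as $OUT), $2 = fuzz target name.
dict_path() {
  printf '%s/%s.dict\n' "$1" "$2"
}

seed_corpus_path() {
  printf '%s/%s_seed_corpus.zip\n' "$1" "$2"
}
```

In build.sh this could be used as `gen_fuzz_config_dict.sh "$(dict_path "$OUT" my_config_fuzzer)"`, where `my_config_fuzzer` stands in for the real target name.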

Contributor Author

As far as taking an argument for output: for the .dict file I can print to stdout and pipe to a file if you like, but for the corpus the output needs to be a zip file. (My initial thought was to make "one argument, which is the output location" a uniform calling convention.)
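A sketch of what that uniform convention could look like in a shared entry point. `generate` is a hypothetical stand-in for the real dictionary or corpus step, and the keyword it writes is placeholder data.

```bash
# Hypothetical generator: writes a one-line dictionary to the path in $1.
generate() {
  printf '"schedule"\n' > "$1"
}

# Entry point: exactly one required argument, the output location.
main() {
  # -lt compares numerically; < inside [[ ]] would compare the operands as strings.
  if [ "$#" -lt 1 ]; then
    echo "usage: $0 <output-path>" >&2
    return 1
  fi
  generate "$1"
}
```

With this shape the dictionary and corpus scripts are called the same way, and only the body of `generate` differs between them.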

osquery/main/harnesses/gen_fuzz_config_dict.sh (outdated)
@directionless
Member

You could also pull in the stuff from ./packs/ (even recognizing those might move later)

@tomrittervg
Contributor Author

Updated!

Member

@theopolis left a comment

See the comments on #5923 about indents, naming, and the shebang/license header.

@tomrittervg
Contributor Author

Updated! LMK if I missed anything.

@theopolis merged commit f637199 into osquery:master on Oct 27, 2019