Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path prefix search #660

Merged
merged 2 commits into from
Sep 27, 2016
Merged

Path prefix search #660

merged 2 commits into from
Sep 27, 2016

Conversation

mstemm
Copy link
Contributor

@mstemm mstemm commented Sep 22, 2016

Add a 'pmatch' operator that allows for parallel testing of a pathname against a set of patch prefixes.

As a result, you can run sysdig using a command line like:

sudo sysdig "evt.type=open and fd.directory pmatch (/var, /usr)"

and see all the file opens for files below either /var or /usr.

This PR also adds unit test support using Google Test.

I still need to modify travis to install gtest locally but I'll do that next.

@luca3m
Copy link
Contributor

luca3m commented Sep 22, 2016

I'm not sure we want to add Google Tests here. We have already them in our agent repository, I think it's better to add tests there until we opensource sysdig related ones or do the latter thing. See #64

@mstemm
Copy link
Contributor Author

mstemm commented Sep 22, 2016

I could definitely move the prefix tree unit tests to the agent, as the agent pulls in most of the sysdig code anyway. I was thinking it would be better to keep the unit tests in the same repository of the actual code, though.

#64 refers to a bunch of third party libraries that needed to be added. I only had to add google test, so at least for these unit tests it was pretty easy.

@gianlucaborello @ldegio @adityakinifr any opinions?

@ldegio
Copy link
Contributor

ldegio commented Sep 22, 2016

We should have a comprehensive discussion on testing strategy one of these days. For the moment, however, I would just keep things simple and consistent and I would put these tests in the agent.

@mstemm mstemm force-pushed the path-prefix-search branch 2 times, most recently from 49bd466 to ae80918 Compare September 22, 2016 17:29
@mstemm
Copy link
Contributor Author

mstemm commented Sep 22, 2016

Ok, dropped the commit here that added unit tests. Moved the prefix_search unit tests to the agent in https://github.com/draios/agent/pull/200.

Love to hear feedback on the impl and choice of operator, though!

Add a data structure that allows testing a path aka /var/log/messages
against a set of path prefixes (/usr, /bin, /var/log, ...).

Thinking ahead to use in libsinsp, the search structure uses pointer +
length pairs instead of strings, meaning that it does not copy any data,
only refers to it. This way the structure doesn't copy filtercheck
values.

In order to share the idea of a pair of pointer + length between this
structure and the unordered_set used by "in (...)" set membership tests,
move the hashing and equality function from filterchecks to a standalone
header filter_value.h and use it for the unordered_map.

The paths are held in a tree-like structure. At each level, an unordered
map has the path components for that level and a sub-tree of paths for
that root.

The tree can change if new search paths are prefixes of any of the
current paths (the sub-tree for the longer path is replaced by the
prefix). If a new search path is a suffix of any existing path no change
is made, as it is already covered by the prefix.

Matching involves splitting off the first directory component and
testing it against the values at that level. If a match is found, it
recursively calls match for the subtree.
Remove the definitions of filter_value_member_t/hash function/equality
function from filterchecks.h, renaming it to filter_value_t along the
way.

Add an operator CO_PMATCH/"pmatch" which takes a set of values like
CO_IN does, but requires that the left hand side of the comparison is a
PT_CHARBUF. When filter values are added, they are added to the new
path_prefix_search object m_val_storages_paths.

in ::flt_compare, when the operator is CO_PMATCH, test the value against
m_val_storages_paths.

As a result, you can run sysdig using a command line like:

sudo ./userspace/sysdig/sysdig "evt.type=open and fd.directory
pmatch (/var, /usr)"

and see all the file opens for files below either /var or /usr.
@mstemm mstemm merged commit 9c57306 into dev Sep 27, 2016
@mstemm mstemm deleted the path-prefix-search branch September 27, 2016 01:32
luca3m pushed a commit that referenced this pull request Oct 10, 2016
* Add ability to test a path against many prefixes

Add a data structure that allows testing a path aka /var/log/messages
against a set of path prefixes (/usr, /bin, /var/log, ...).

Thinking ahead to use in libsinsp, the search structure uses pointer +
length pairs instead of strings, meaning that it does not copy any data,
only refers to it. This way the structure doesn't copy filtercheck
values.

In order to share the idea of a pair of pointer + length between this
structure and the unordered_set used by "in (...)" set membership tests,
move the hashing and equality function from filterchecks to a standalone
header filter_value.h and use it for the unordered_map.

The paths are held in a tree-like structure. At each level, an unordered
map has the path components for that level and a sub-tree of paths for
that root.

The tree can change if new search paths are prefixes of any of the
current paths (the sub-tree for the longer path is replaced by the
prefix). If a new search path is a suffix of any existing path no change
is made, as it is already covered by the prefix.

Matching involves splitting off the first directory component and
testing it against the values at that level. If a match is found, it
recursively calls match for the subtree.

* Add pmatch operator, using prefix_search struct.

Remove the definitions of filter_value_member_t/hash function/equality
function from filterchecks.h, renaming it to filter_value_t along the
way.

Add an operator CO_PMATCH/"pmatch" which takes a set of values like
CO_IN does, but requires that the left hand side of the comparison is a
PT_CHARBUF. When filter values are added, they are added to the new
path_prefix_search object m_val_storages_paths.

in ::flt_compare, when the operator is CO_PMATCH, test the value against
m_val_storages_paths.

As a result, you can run sysdig using a command line like:

sudo ./userspace/sysdig/sysdig "evt.type=open and fd.directory
pmatch (/var, /usr)"

and see all the file opens for files below either /var or /usr.
dmyerscough pushed a commit to dmyerscough/sysdig that referenced this pull request Mar 3, 2017
* Add ability to test a path against many prefixes

Add a data structure that allows testing a path aka /var/log/messages
against a set of path prefixes (/usr, /bin, /var/log, ...).

Thinking ahead to use in libsinsp, the search structure uses pointer +
length pairs instead of strings, meaning that it does not copy any data,
only refers to it. This way the structure doesn't copy filtercheck
values.

In order to share the idea of a pair of pointer + length between this
structure and the unordered_set used by "in (...)" set membership tests,
move the hashing and equality function from filterchecks to a standalone
header filter_value.h and use it for the unordered_map.

The paths are held in a tree-like structure. At each level, an unordered
map has the path components for that level and a sub-tree of paths for
that root.

The tree can change if new search paths are prefixes of any of the
current paths (the sub-tree for the longer path is replaced by the
prefix). If a new search path is a suffix of any existing path no change
is made, as it is already covered by the prefix.

Matching involves splitting off the first directory component and
testing it against the values at that level. If a match is found, it
recursively calls match for the subtree.

* Add pmatch operator, using prefix_search struct.

Remove the definitions of filter_value_member_t/hash function/equality
function from filterchecks.h, renaming it to filter_value_t along the
way.

Add an operator CO_PMATCH/"pmatch" which takes a set of values like
CO_IN does, but requires that the left hand side of the comparison is a
PT_CHARBUF. When filter values are added, they are added to the new
path_prefix_search object m_val_storages_paths.

in ::flt_compare, when the operator is CO_PMATCH, test the value against
m_val_storages_paths.

As a result, you can run sysdig using a command line like:

sudo ./userspace/sysdig/sysdig "evt.type=open and fd.directory
pmatch (/var, /usr)"

and see all the file opens for files below either /var or /usr.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants