-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC - Restrict LIST and SCAN to only accessible paths #769
Comments
@SharpEdgeMarshall, @stormshield-gt -- I see you both have commented on the linked Vault issue recently. If either of you two are willing to provide feedback on this design, it'd be appreciated! :-) |
@cipherboy: I must have a very partial understanding of the topic, but instinctively I would have thought that the pagination parameters ( |
@DanGhita Yeah, I generally concurred based on intuition and came down on that side as well. However, it does run the risk of being as expensive as a non-paginated list (pagination could be up to 1000s of entries at a time, with 1m+ total entries) from a storage PoV, though you would never consume all of that in memory like you would with a non-paginated list. Plus there's context cancellation and thus you'd probably be interruptible in a way that a regular list wouldn't quite be. But if you wanted this to be fully efficient, I think you'd have to at least push ACLs into the plugin layer, if not all the way down to the storage layer which wouldn't be... enjoyable. :-) So I think this is probably the most optimal set of tradeoffs we can make at the moment. |
@cipherboy, thank you for your consideration; I have limited knowledge about Vault internals, but if this proposal will allow users to list only the entities they can read (without negatively impacting performance too much), then LGTM. |
@SharpEdgeMarshall Yes, that is the intent! The policy author has to do some work to enforce this behavior (so it is opt-in, not applied globally by default) but after that, it gives you the behavior you describe. There's been lots of other core improvements happening in OpenBao as well outside of this, too! Would be happy to chat more if you have questions :-) |
Summary
Add a new keyword,
list_scan_response_keys_filter_path
, to ACL policies to support filtering well-formedlist
andscan
responses to only subpaths accessible by the token.Problem Statement
Administrators grant broad permissions via the
list
endpoint. Notably, if a user only hasread
orlist
access to select subkeys, but is grantedlist
orscan
on a parent, all results are visible to the user even if they are not otherwise accessible. Complicating this is that certain paths are unauthenticated (such as PKI'scerts/
listing, which is itself authenticated but any given entry undercert/:serial
is not), or similarly, that a list handler may not necessarily be a prefix of the single entry read handler (again, in the case of PKI's certificate or issuers lists).This makes automatic, backend-agnostic filtering hard to achieve without some guidance from the plugin or policy author.
We opt to do the latter and push this mapping requirement onto the policy author.
This also gives us an indication at the system level that filtering should be applied, as it is a potentially expensive operation and different callers may have differing requirements (e.g., in KVv2, some users may wish to filter based on
metadata/{{ key }}
access whereas others may wish to filter ondata/{{ key }}
access).If necessary, this parameter could be made to accept a list of paths. However, semantics would then also have to be defined (
OR
versusAND
) to decide how many need to match to show the entry in the result.User-facing Description
Policy authors may make a breaking (from a consuming application's PoV) change by introducing a
list_scan_response_keys_filter_path
argument inside a givenpath
grant in an ACL policy. This modifies the specificlist
response indicated by removing any paths whose corresponding item path (generated by evaluating{{ key }}
inlist_scan_response_keys_filter_path
with each item in the response's"keys"
data field) which the token does not have access to. This is relatively expensive, requiring a policy check for every response key, so should likely be combined withrequired_parameters=limit
on list operations to prevent having to evaluate thousands or potentially millions (in the case of PKI certificates or recursive K/V listing over the entire mount) of policy checks.Consuming applications will see a given key in the
keys
response field if one of two conditions holds:/
and the application has thelist
capability on the resulting path, or/
and the application has theread
on the resulting path.Example policy:
Technical Description
Permissions are currently governed only by a single type of policy, ACLs. Individual policies are compiled into an ACL (which is subsequently cached for potential reuse), which walks each policy and computes combined permission grants. We opt to say that if any
list
operation (for a given path) setslist_scan_response_keys_filter_path
, a value will be present, and the first such policy (in list order) determines the resulting filter.A token's is thus known at request handling time, which means we can, for
LIST
orSCAN
operations where filtering is desired, re-evaluate (without decrementing the token'snum_usage
!) the ACL against other paths without incurring additional storage accesses. Because such filtering is opt-in on a path-by-path basis, we can require plugins conform to the required response field structure (i.e., not be asecret
orauth
response and have onlykeys
and optionallykey_info
).As above, we know the two types of results from list entries (though only concrete keys are returned from
SCAN
results usually) and can apply policy evaluation against the simulated operation. Notably, the underlying policy which granted this request is only returned by name. We modify theACLResults
object to include theResponseKeysFilterPath
value (as chosen above) for later evaluation.Rationale and Alternatives
One alternative is globally enforcing this today. However, as this is technically a breaking change (insofar as applications could be acting on complete list results, without necessarily having read permissions on any entries), this isn't desirable without opting in. However, as different applications may have different behavior, it would be hard to opt-in globally and thus per-application (or, by proxy, per-policy) opt-in makes sense.
Additionally, one could consider pushing filter path identification to the plugin author for each list response. However this isn't ideal as shown above: KV has multiple potentially valid paths which could independently or dependently determine whether an entry should be visible in a list response. Thus, having
list_scan_response_keys_filter_path
take a string (and not simply be a boolean) seems to make the most sense to allow plugin authors to determine paths appropriately.In the future, more advanced path filters could be created via
text/template
; the existing proposal of{{ .key }}
is consistent with that, assumingkey
is injected as a data parameter.Downsides
This does add some complexity to the already complex routing and response mechanism. However, it is almost entirely opt-in and thus performance impact is minimal unless this filtering is desired by an operator.
Parameter Interaction
This initial proposal lacks support for certain confounding parameters:
min_wrapping_ttl
ormax_wrapping_ttl
; because our simulated request does not ask for wrapping, policy will deny this entry though the user will have access to it if requested with wrapping.required_parameters
) and these will also not be present in our simulated request.This could either be resolved by adding additional parameters (which may not be sufficiently flexible) or by iterating through the attached ACL on the templated path to consider required parameters and wrapping information for each policy for each path.
This can be done later if users utilizing these features wish to have support.
Cross-Mount/Invalid Path Confusion
One additional downside is that, because policy authors set
list_scan_response_keys_filter_path
, they could set this to a mount which doesn't exist or isn't the same as the list. That is, there is no common prefix requirement between the list and the templated path for filtering.Wildcard Matching Difficulties
Lastly, one downside is that there are difficulties in supporting a wildcard match: if any path within a given prefix is allowed by any policy, show the key in the list result. For instance, if the list path was
sys/mounts
, one could envision alist_scan_response_keys_filter_path
value of{{ .key }}/*
to indicate a matching mount should only be returned if the token's policy grants at least one operation on said mount. This is hard as the underlying trie structures do not allow reverse wildcard lookup.This may require careful subtree iteration with backtracking, rooted at a certain concrete path and thus is left for the future if someone desires this behavior. This has parallels to an earlier
dev-wg
mailing list post around glob-based alias matches.Security Implications
In addition to the downsides and complexity noted above, this probably has net-positive security benefit: when desired, users who shouldn't see certain
list
entries, because they could not read or list them, will no longer be able to see them if an operator applies relevant filtering. Because this is a breaking change and incurs some non-trivial performance impact, this cannot be done by default.User/Developer Experience
For application developers, this may make results change based on a policy author's actions. On the whole, assuming the application was reading list results and already handling permission denied errors, this will likely improve an application's performance with fewer denied requests.
For policy authors, this is another variable to consider but may make quality of life better on certain endpoints.
However, the greatest improvement comes from web UI users: list results will now only contain relevant paths that a policy author sets and thus.
Unresolved Questions
Interactions with Pagination
When implementing pagination, the design was given as two request keys (
after
andlimit
), returning the same response structure. The semantics were implicitly that an empty result meant no further data was present. As a user may not have access to a given range of keys (say, there are four-digit numbers encoded in a K/V entry and the user only has access to the one2024
,list
called withafter=1000
andlimit=100
would return 100 results from the plugin call which would be filtered to the empty list as the allowed2024
is not in the returned[1000, 1100)
). This would look to the client like they should finish iteration (as there are no results) instead of having ot increment theirafter
value.However, the client may not know
after
values that are useful for iteration. These may be random UUIDs (where the probability, with sufficient entries, of discovering a usefulafter=
value is rather small) or some other information-leaking construct. Thus returning anext
parameter would not be ideal, as the caller would also have to repeat calls and gains information about present-but-unaccessible keys.This may mean OpenBao should add a warning (that all results were pruned) or itself iterate the list handler until at least one entry is returned if a non-empty
list-page
result occurs. The latter adds overhead and auditing concern:limit
, if there were sufficient ACL-permitted entries) or only a single value?My current stance leads me to believe that ideally we'd fill the entire response as much as we could, letting the policy author set a recursion limit, not warn, and not have them count towards quotas or auditing.
Related Issues
Upstream:
Proof of Concept
https://github.com/cipherboy/openbao/pull/new/list-filter-by-readable
The text was updated successfully, but these errors were encountered: