Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dash is considered as a field instead as separator #7524

Closed
wants to merge 1 commit into from

Conversation

vvejell1
Copy link

@vvejell1 vvejell1 commented Oct 13, 2022

Signed-off-by: Vaishnavi Vejella vvejella@amazon.com

Description #5642:
Containers have labels and they act as filters when querying the containers. So, while checking a label with the 'dash' which is failing because '-' is interpreting here as an operator and it is not considering as a part of token field.
Solution:
This problem can be solved by treating - as a component of the field rather than a separator.
Also, it appears that there are some places where it is assumed that . is the only separator in a fieldpath.
[e.g]

value, ok := m[strings.Join(fieldpath, ".")]

@k8s-ci-robot
Copy link

Hi @vvejell1. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vvejell1 vvejell1 changed the title Dash is acting as operator instead as identifier Dash is acting as operator instead as an identifier Oct 13, 2022
@kzys
Copy link
Member

kzys commented Oct 13, 2022

Let's follow the terminology in parser.go.

/*
Parse the strings into a filter that may be used with an adaptor.
The filter is made up of zero or more selectors.
The format is a comma separated list of expressions, in the form of
`<fieldpath><op><value>`, known as selectors. All selectors must match the
target object for the filter to be true.
We define the operators "==" for equality, "!=" for not equal and "~=" for a
regular expression. If the operator and value are not present, the matcher will
test for the presence of a value, as defined by the target object.
The formal grammar is as follows:
selectors := selector ("," selector)*
selector := fieldpath (operator value)
fieldpath := field ('.' field)*
field := quoted | [A-Za-z] [A-Za-z0-9_]+
operator := "==" | "!=" | "~="
value := quoted | [^\s,]+
quoted := <go string syntax>
*/

The change is making - a valid character in a field, not an operator.

@kzys
Copy link
Member

kzys commented Oct 13, 2022

Also please write some unit tests for the change.

@estesp
Copy link
Member

estesp commented Oct 13, 2022

Also see @dmcgowan's comments in #6188 ; although when I tried to dig a bit I think it can't be related to . without more significant code changes as I mentioned in a comment.

@vvejell1 vvejell1 changed the title Dash is acting as operator instead as an identifier Dash is considered as a field instead as separator Oct 20, 2022
@@ -41,7 +41,7 @@ The formal grammar is as follows:
selectors := selector ("," selector)*
selector := fieldpath (operator value)
fieldpath := field ('.' field)*
field := quoted | [A-Za-z] [A-Za-z0-9_]+
field := quoted | [A-Za-z-] [A-Za-z0-9_]+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a field start from -?

Copy link
Author

@vvejell1 vvejell1 Oct 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 'dash' that separates two words is taken as a field/value. Also, hyphen(-) is accepted in regex. Here an example [-A-z0-9]+, [a-Za-9-]+, [a-z-0-9]+, and [a-z-0-9]+ are all similar.

Copy link
Member

@kzys kzys Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before your change, hello, hello1 and hello_1 are all valid fields, but 1hello and _hello are not.

The comment after your change indicates, _hello is still invalid but -hello is valid. Is that what you want to propose?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an '-' or '_' whenever possible to avoid positional dependencies and making the sentences less complex and valid. In my view, there may be a better way to write some of the special characters like 1hello, they must absolutely consider in certain situations.

filters/scanner_test.go Outdated Show resolved Hide resolved
@vvejell1 vvejell1 force-pushed the dash branch 2 times, most recently from 91c4f50 to 6efbbb9 Compare October 25, 2022 16:34
@vvejell1 vvejell1 requested a review from kzys October 25, 2022 17:44
@vvejell1 vvejell1 force-pushed the dash branch 3 times, most recently from 7ae1cd0 to 34cf495 Compare November 4, 2022 05:34
Signed-off-by: Vaishnavi Vejella <vvejella@amazon.com>
@@ -138,6 +138,10 @@ chomp:
return pos, tokenValue, s.input[pos:s.ppos]
case isFieldRune(ch):
s.scanField()
if ch == '-' || isDigitRune(ch) { // illegal token if it starts with '-' or any digit
s.error("illegal token field")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmcgowan / @estesp - The implementation was less strict than the syntax here. Do we want to make this change in 1.x?

The formal grammar is as follows:
selectors := selector ("," selector)*
selector := fieldpath (operator value)
fieldpath := field ('.' field)*
field := quoted | [A-Za-z] [A-Za-z0-9_]+
operator := "==" | "!=" | "~="
value := quoted | [^\s,]+
quoted := <go string syntax>

If we follow the syntax and update the implementation;

  • It would be better to explain why this is illegal in the error itself.
  • It should handle _ as well.

@kzys
Copy link
Member

kzys commented Nov 10, 2022

The situation is messier than I initially thought.

First, our implementation doesn't follow the grammar in the code comment. If not quoted, only [A-Za-z] is allowed in the beginning of a field, but isFieldRune allows both _ and numbers.

Second, our implementation matches neither http://label-schema.org/rc1/ nor https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations. Both allows - but _, but we do the opposite.

Third, neither http://label-schema.org/rc1/ nor https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations follow https://www.ietf.org/rfc/rfc2119.txt.

For example,

Label Schema labels MUST comply with Docker’s guidelines:

  • Keys should only consist of lowercase alphanumeric characters, dots and dashes (for example, [a-z0-9-.]).
  • Keys should start and end with an alphanumeric character.
  • Keys may not contain consecutive dots or dashes.

Are they MUST, should or may? :)

@kzys
Copy link
Member

kzys commented Nov 10, 2022

I synced-up offline with @vvejell1, @dims and @estesp and but want to hear thoughts from @thaJeztah, @stevvooe and @dmcgowan.

Especially since containerd has been around for years and the grammar is only in the code comment, I don't want to make the implementation more strict, especially in 1.x series. The ship has sailed.

Accepting - seems good, since both http://label-schema.org/rc1/ and https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations are doing so.

Thoughts?

@dmcgowan
Copy link
Member

Especially since containerd has been around for years and the grammar is only in the code comment, I don't want to make the implementation more strict, especially in 1.x series. The ship has sailed.

This change is only changing the grammar of the filters, which is intentionally limited and has quotes to express any value. If this change is only necessary to avoid quoting the value when providing a filter, then the dash seems like an arbitrary addition. Is there a case which is not working today when quoted or is this mostly about adding the case for CLI users?

@dmcgowan dmcgowan added the status/needs-discussion Needs discussion and decision from maintainers label Nov 30, 2022
@vvejell1
Copy link
Author

vvejell1 commented Dec 2, 2022

I think that quotes are necessary for the label. This implementation only allows the '-' as valid in the label but does not remove quotes from the label.

@vvejell1 vvejell1 closed this Dec 2, 2022
@vvejell1 vvejell1 reopened this Dec 2, 2022
@dmcgowan
Copy link
Member

dmcgowan commented Dec 5, 2022

I think that quotes are necessary for the label. This implementation only allows the '-' as valid in the label but does not remove quotes from the label.

The point is that the quotes make this change unnecessary. In the unit test example, labels.io-foo would already work as labels."io-foo". We commonly use / in labels which make quoting user provided fields necessary already. My understanding is the only advantage of this change is you could give commands like ctr images ls labels.io-foo==something instead of ctr images ls 'labels."io-foo"==something'. If that is the main motivation, then I was suggesting instead of changing the filter grammar, we could add a cli library to sanitize user input before passing it over the API. I would think that supporting ctr images ls labels.containerd.io/foo==something would be even more useful since we use that pattern quite often in our labels.

@thaJeztah
Copy link
Member

If I recall correctly, the idea behind labels being bookended by a non-separator was to prevent ambiguity when using these on the command-line; e.g., a label named --quiet would be easily to be interpreted as a command-line option ( moby/moby#9882 (comment), moby/moby#9882 (comment)), in addition, the convention discussed in the original implementation was to use reverse-dns notation to namespace labels (allowing - at the start would not match reverse-dns formats).

@thaJeztah
Copy link
Member

Accepting - seems good, since both http://label-schema.org/rc1/ and https://docs.docker.com/config/labels-custom-metadata/#key-format-recommendations are doing so.

Perhaps the choice of wording on that page in our docs is a bit poor;

strings which may contain periods (.) and hyphens (-).

So yes, the key (name) may contain periods and hyphens, but enclosed between alpha-numeric characters (original version, which got somewhat rewritten in docker/docs@67610e9). Some of the wording wasn't written in strict MUST/SHOULD conventions (I think as there was still some debate about enforcing or not).

@k8s-ci-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@stevvooe
Copy link
Member

stevvooe commented Mar 5, 2024

The syntax limitation comes from Protobuf's fieldpath syntax. I cannot find a link, since the containerd implementation comes up. The "fields" in fieldpath are limited to valid protobuf IDENT character set, which typically doesn't include "-". To make filters work on user specific input, quotes were added as a general escape hatch.

This might be an ok change but we need a few things:

  1. Need to propose grammar modification in
    field := quoted | [A-Za-z] [A-Za-z0-9_]+
    . - needs to be added to the field regexp for internal characters. This would diverge from the fieldpath/fieldmask spec this grammar is based on but that's ok, because I don't think there is a strong relationship anyways.
  2. There is a bug in the way fields are handled. The grammar and implementation do not agree. We allow all fieldRunes in the first position but the grammar clearly states it has to take an alpha before excepting the wider character set. You have part of the fix but we really need isFieldStartRune, the scanField. The current proposal might be too strict.

I can make a PR for this, as the change is slightly subtle.

Copy link

github-actions bot commented Jun 4, 2024

This PR is stale because it has been open 90 days with no activity. This PR will be closed in 7 days unless new comments are made or the stale label is removed.

@github-actions github-actions bot added the Stale label Jun 4, 2024
Copy link

This PR was closed because it has been stalled for 7 days with no activity.

@github-actions github-actions bot closed this Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Stale status/needs-discussion Needs discussion and decision from maintainers
Projects
Status: Done
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants