Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic field selectors #1362

Open
lavalamp opened this issue Sep 18, 2014 · 26 comments
Open

Generic field selectors #1362

lavalamp opened this issue Sep 18, 2014 · 26 comments
Labels
area/api Indicates an issue on api area. area/usability kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@lavalamp
Copy link
Member

lavalamp commented Sep 18, 2014

We need a general policy for listing based on object's fields. Something quick 'n dirty has been implemented for pods to allow for scheduling, but no implementation exists for other resource types, and it's not clear that we want to do this (as opposed to making synthetic labels or something).

EDIT, May 2019: It remains incredibly inefficient in our system to select objects this way, O(N); we can't even consider this until we have a technical solution (e.g., indexes). Additionally, it's very tricky to make an API with exactly the right amount of expressiveness. For these reasons, this is actually mostly an anti-goal at the moment.

@lavalamp lavalamp added the kind/design Categorizes issue or PR as related to design. label Sep 18, 2014
@smarterclayton
Copy link
Contributor

#1297 includes a discussion of how labels and fields could conflict if overlapping and other concerns that field names have that labels don't

@bgrant0607
Copy link
Member

A few alternatives:
0. Don't support this. Filter by fields in the client.

  1. Mirror certain blessed fields as system-defined labels.
  2. While evaluating label selectors, treat fields (aka attributes) as labels with namespaces comprised from the object Kinds and JSON paths.
  3. Generalize label selectors to field selectors, and refer to labels as fields.

The last option (3) would make field selection more convenient than label selection, which I don't think we want to encourage.

(2) would be the most flexible, both for users (all fields would be available) and in terms of the implementation (we could choose what to index).

(1) would pollute the label space, IMO.

(0) is a good starting point. :-)

But assuming we needed to do something, I prefer (2). We shouldn't necessarily support fields in label selectors of replication controllers or services, however.

That said, I wouldn't want to introduce types and comparisons into label selectors. Also, @smarterclayton makes a good point about fields having values that would be problematic to express in a label selector. Perhaps we could get away with not supporting such complex selector syntax -- field values could be arbitrary, but you just couldn't use the selector mechanism to search for funky values.

@bgrant0607 bgrant0607 added the area/api Indicates an issue on api area. label Sep 18, 2014
@erictune
Copy link
Member

Is this about selecting or projecting?

For projecting, this appears to be called "partial responses" in the context of REST. This article discusses 3 already in use by google, facebook, twitter and linkedin:
https://blog.apigee.com/detail/restful_api_design_can_your_api_give_developers_just_the_information
There is more documentation on the google format here:
https://developers.google.com/+/api/#partial-response

I think we should support this.

For Selecting, I agree with Brian that we should start with having clients do this (option 0).
Clients can use Filtering to reduce the amount of data that needs to be transferred when doing option 0.

I think mixing user-applied labels with all JSON properties is dangerous in that it dilutes the significance of labels.

@derekwaynecarr
Copy link
Member

I agree with @erictune that we need to do partial responses.

I do not think field selectors should share a common syntax with label selectors, and they should remain separate arguments to the desired operation.

I would be more interested in knowing what types of queries we are attempting to satisfy via a field selector as sets of specific use cases before introducing a general purpose solution as there may be better ways of getting the same information.

Field selectors to me are more like what I would want to do with SQL style queries, while labels are more like picking in a tag cloud. As a result, field selectors should be type aware, and support more complex operations like BETWEEN for date types for example.

@bgrant0607
Copy link
Member

Partial responses / projecting is #1459. Let's just discuss selection filtering here.

@erictune
Copy link
Member

Agree with what Derek said, particularly about field selection having typed data and more general expressions.

Labels will eventually be set and selected in a GUI. That GUI may have autocompletion or auto-populated check-boxes for labels. For that to work, there should be preferably < 100s of distinct label key-values.

Also, usage and monitoring statistics may be automatically grouped and accumulated by labels. This requires limited value count for each key.

@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Dec 4, 2014
@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. area/usability and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. kind/design Categorizes issue or PR as related to design. labels Dec 18, 2014
@bgrant0607 bgrant0607 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Jan 10, 2015
@bgrant0607
Copy link
Member

Note that this needs a general solution to #2518.

@bgrant0607
Copy link
Member

Copying @erictune's text from #2602:
Events takes a field selector but the other ones don't.

Suggest adding fields labels.Selector and labels labels.Selector to events type and all other corresponding types, so that a person can call:
c.Pods(ns).List() to get all pods or c.Events(ns).Select(someSelector).Project(someOtherSelector).List()
or various other combinations.

$ grep 'func.*List(' pkg/client/*.go | grep -v fake

pkg/client/endpoints.go:func (c *endpoints) List(selector labels.Selector) (result *api.EndpointsList, err error) {
pkg/client/events.go:func (e *events) List(label, field labels.Selector) (*api.EventList, error) {
pkg/client/minions.go:func (c *minions) List() (*api.MinionList, error) {
pkg/client/pods.go:func (c *pods) List(selector labels.Selector) (result *api.PodList, err error) {
pkg/client/replication_controllers.go:func (c *replicationControllers) List(selector labels.Selector) (result *api.ReplicationControllerList, err error) {
pkg/client/services.go:func (c *services) List(selector labels.Selector) (result *api.ServiceList, err error) {

@bgrant0607
Copy link
Member

Another use case for efficient lookup by field: lookup of pods (and probably nodes and services) by IP.

@errordeveloper
Copy link
Member

errordeveloper commented Jan 15, 2017

We need a mechanism to enumerate what fields on a given Kind are selectable as a prerequisite.

Should we just open an issue to discuss this? It seems like the key dependency, unless folks see anything else.

How does it work right now? There seems to be a way to tell what fields are not supported, e.g. it barked at me with field label not supported: status.hostIP. However, it doesn't complain about status.podIP... But, it doesn't match any pods when I pass status.podIP==172.17.0.4, where 172.17.0.4 is assigned to one of my pods, and status.podIP!=172.17.0.4 matches all the pods, including the one with the said IP address.

@derekwaynecarr
Copy link
Member

@errordeveloper -- I think sig-api-machinery is well aware that not all fields are selectable. if there are additional fields that you want to expose on pod as selectable, you can modify: https://github.com/kubernetes/kubernetes/blob/master/pkg/registry/core/pod/strategy.go#L163 along with the use case.

@frankgreco
Copy link
Contributor

Is it the plan to make this extensible for TPRs?

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 4, 2018
@errordeveloper
Copy link
Member

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 22, 2018
@stanxing
Copy link

stanxing commented Jan 22, 2018

@errordeveloper hello, I want to know this schedule about field selectors. I have a issue about getting pod or pod list via metadata.uid ; but when I try fieldSelector paramater with curl to call kubernets api ,it return the error as following:

curl -k -X GET -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://kubernetes.default/api/v1/pods?fieldSelector=metadata.uid=111
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "field label not supported: metadata.uid",
  "reason": "BadRequest",
  "code": 400

I try to change the fieldSelector parameter to metadata.name as following, it return a null result:

curl -k -X GET -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://kubernetes.default/apiv1/pods?fieldSelector=metadata.name%3Dinfras-https-redirector-1813593808
{
  "kind": "PodList",
  "apiVersion": "v1",
  "metadata": {
    "selfLink": "/api/v1/pods",
    "resourceVersion": "20792137"
  },
  "items": []

In fact , the result is incorrect and I have the pod named infras-https-redirector-1813593808
I think this behavior is very strange. I will be very happy if I can get your reply.

@errordeveloper
Copy link
Member

@stanxing I don't know how to answer your question, but perhaps you could try a different channel, e.g. #sig-apimachinery on Slack or #office-hours.

@anfernee
Copy link
Member

@stanxing Probably it's not well documented, but the supported fields contains 2 parts:

@stanxing
Copy link

@errordeveloper Thank you for your reply!

@stanxing
Copy link

stanxing commented Jan 23, 2018

@anfernee Thank you for the explanation. I viewed the code above, but I found that it didn't support pass metadata.uid parameter to get the pod in the api. Is this design unreasonable?

@liggitt
Copy link
Member

liggitt commented Jan 23, 2018

I found that it didn't support pass metadata.uid parameter to get the pod in the api. Is this design unreasonable?

Using a field selector to look up a single object in a list is extremely inefficient today. We'd likely need a way to build indexes before supporting that query pattern across all objects (especially across namespaces)

@stanxing
Copy link

@liggitt ok, I understood. This is a problem indeed. Thank you very much.

@tossmilestone
Copy link
Member

Hi, @liggitt, are there any progress updated at this issue?

@liggitt
Copy link
Member

liggitt commented May 6, 2019

No, this is not being worked on currently

@lavalamp
Copy link
Member Author

lavalamp commented May 6, 2019

Yeah this is mostly an explicit anti-goal at the moment. I'll update the OP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api Indicates an issue on api area. area/usability kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
Projects
None yet
Development

No branches or pull requests