Create ReplicaSet #3024

Closed
jstrachan opened this issue Dec 18, 2014 · 95 comments
Labels: area/api, area/app-lifecycle, area/usability, priority/important-soon

@jstrachan

When using the CLI or REST API, or just talking about Kubernetes to people, the phrase "replication controller" is a bit of a mouthful. "Pod" and "service" are 1 and 2 syllables; "replication controller" has as many syllables as "service", "pod" and "Kubernetes" combined ;)

How about using the phrase "replicator" instead? Shorter, less to type on the CLI and less to say out loud? Then the core pieces of Kubernetes would be pods, services and replicators?

@abonas
Contributor

abonas commented Dec 18, 2014

+1, I'd also like to see this name shortened and simplified.

bgrant0607 changed the title from "naming idea: /s/replicationController/replicator" to "Consider renaming ReplicationController" on Dec 18, 2014
bgrant0607 added the area/api and priority/backlog labels on Dec 18, 2014
bgrant0607 self-assigned this on Dec 18, 2014
@bgrant0607
Member

Thanks for the suggestion. We are considering a name change prior to 1.0. More suggestions welcome.

I actually think that "replication" isn't really the main function of the controller. Ensuring that the desired number of instances are running and healthy is.

Possibilities in that vein: controller, supervisor, overseer, governor, steward

Previous discussion: #1225 (comment)

Along with a name change, we're also considering whether to extend the functionality, such as to schedule per-node agents (#1518) or to properly handle RestartPolicyOnFailure pods (#1624).

@jbeda
Contributor

jbeda commented Dec 18, 2014

I agree that simplifying this name would be a good thing. Let the bikeshedding commence!

To explain the current name -- the thinking is that we'll have a lot of 'controller' objects. These are simple active actors that help to monitor and update some aspect of a deployment. While we only have one right now, we will likely have at least a few more before we hit v1. Specifically, I think we'll need a 'singleton controller' (which makes sure that 1 and only 1 instance of a pod is running at any one time) and a 'pod per node controller' (which makes sure that a specific pod is running on each node).

With this in mind, I think that 'controller' is a type/class/category of object in the system. Having that type called out explicitly in the name is good but it is also just too long.

So, I don't have any concrete suggestions here, but I would say that cutting it down to just 'controller' is too simplistic.

@jstrachan
Author

FWIW I quite like supervisor too, though that's kinda vague, like controller (a supervisor could do all kinds of different things, or there could be many kinds of supervisor).

I purposely went with replicator as in "a replicator is the controller that ensures the desired number of replicas are current"; we refer to the replica count in the CLI / REST / model so figured it was a nice close word that'd be easy for folks to grok.

But yeah, whatever the name is, so long as it's not too vague, long, or hard to remember or type, I'm more than happy ;).

I'd even prefer "jedi" or "ninja" to "replication controller" TBH :)

@jstrachan
Author

"scaler" or "sizer" are other ideas - as it controls the number of instances of a pod. Though there could be complex auto scalers which might cause confusion later

@bgrant0607
Member

@jstrachan Re. "scaler", see #1629 and #2863.

@bgrant0607
Member

"Supervisor" is along the lines of supervisord, and the definition is a person who observes and directs workers or activities and/or keeps watch in the interest of their well being.

@brendandburns
Contributor

I like "replicator"


@j3ffml
Contributor

j3ffml commented Dec 19, 2014

I rather like "controller", but +1 for replicator.

@bgrant0607
Member

The rough equivalent in GCE was called Replica Pool and is now called the Instance Group Manager, which is also a mouthful.

The replication controller creates replicas upon creation, and in response to changes in the replica count made by an auto-scaler. However, what it mainly does is replace pods that disappeared due to minions that disappeared. I expect it to become more sophisticated in the future, such as by watching pod readiness (#620), assisting in pod migration, proactively moving pods when nodes are shut down or due to performance problems, etc. I think "supervisor" resonates with this type of functionality.

I also would like to add a way to bulk-create pods from a standalone pod template (#170), with no controller. I'd use this for run-once pods, for example, or in bootstrapping scenarios. That would be "replication", too. I also think the distinction between a "replicator" and "auto-scaler" would be lost on most people.

I agree "controller" is too generic. We also have endpoints and node controllers. "Manager" has the same problem.
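
For illustration, the behaviour described in this comment (create replicas up to the desired count, and replace pods lost when their minion disappears) can be sketched as a single reconciliation pass. This is a minimal sketch under stated assumptions, not the actual controller code: `Pod`, `reconcile`, the `node_alive` flag and the `replica-N` naming are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Pod:
    name: str
    labels: Dict[str, str]
    node_alive: bool = True  # hypothetical flag: False once the pod's minion has disappeared


def reconcile(pods: List[Pod], selector: Dict[str, str], desired: int) -> List[Pod]:
    """Return the pod list after one reconciliation pass (hypothetical helper)."""

    def matches(pod: Pod) -> bool:
        # Equality-based selection: every selector key must match exactly.
        return all(pod.labels.get(k) == v for k, v in selector.items())

    # Pods whose minion vanished are treated as gone.
    survivors = [p for p in pods if p.node_alive]
    owned = [p for p in survivors if matches(p)]
    others = [p for p in survivors if not matches(p)]

    # Too many replicas: keep only the first `desired` of them.
    owned = owned[:desired]

    # Too few replicas: create new pods from the "template" (here, just the selector labels).
    existing = {p.name for p in survivors}
    next_id = 0
    while len(owned) < desired:
        name = f"replica-{next_id}"
        next_id += 1
        if name in existing:
            continue
        owned.append(Pod(name, dict(selector)))
        existing.add(name)

    return others + owned
```

A real controller would run such a pass repeatedly in response to watch events; the sketch only shows the decision made on each pass.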

@jstrachan
Author

FWIW I'm happy with supervisor too; if we make sure never to have any more different kinds of supervisor (like we have different kinds of controller).

I think I'm leaning towards supervisor too now. As you say, @bgrant0607, it's not just ensuring the replica count is realised; it's also ensuring the running instances are running correctly (e.g. ensuring the liveness checks indicate they are valid, that the minion is still valid, etc.).

@mikehughestwm

@jbeda just to throw my color-swatch on the bike-shed, regarding

"As to explain the current name -- the thinking is that we'll have a lot of 'controller' objects. These are simple active actors that help to monitor and update some aspect of a deployment. While we only have one right now, we will likely have at least a few more before we hit v1. Specifically, I think we'll need a 'singleton controller' (which makes sure that 1 and only 1 instance of a pod is running at any one time) and a 'pod per node controller' (which makes sure that a specific pod is running on each node)."

Perhaps "actor" instead of "controller" or "replicator"?

My personal vote, although it shouldn't count for much as I don't have a complete idea of the k8s road-map, is still with "controller" :-)

Also, I recognize that "Having that type called out explicitly in the name is good", but perhaps, if there are to be many types of "controller", having a "type" field in the "controller" definition would work for both "singleton" and "pod-per-node", and should be fairly extensible. This field would override the "replicas" field, but I presume having separate "controller" types would also mean having "controller" precedence rules (e.g. singleton may override pod-per-node, which overrides the "replicas" field in the "controller" definition). Does this make the API too confusing? There is already a "type" field on "livenessProbe" so it sort-of has a precedent.
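
To make the "type" field idea above concrete, here is a minimal sketch of a single controller definition whose optional `type` overrides `replicas`. Everything in it is an assumption for illustration: `ControllerSpec`, `effective_replicas` and the type values "singleton" and "pod-per-node" are hypothetical names, not part of any Kubernetes API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ControllerSpec:
    replicas: int = 1
    type: Optional[str] = None  # hypothetical values: "singleton", "pod-per-node"


def effective_replicas(spec: ControllerSpec, node_names: List[str]) -> int:
    """Resolve how many pods the controller should keep running (hypothetical helper)."""
    if spec.type == "singleton":
        # Exactly one instance, regardless of the replicas field.
        return 1
    if spec.type == "pod-per-node":
        # One pod for each known node, regardless of the replicas field.
        return len(node_names)
    if spec.type is None:
        # No type set: fall back to the plain replica count.
        return spec.replicas
    raise ValueError(f"unknown controller type: {spec.type!r}")
```

The sketch also shows the cost raised in the comment: each new type needs its own rule for how it interacts with `replicas`.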

@derekwaynecarr
Member

I think controller is too generic and not type-specific. I think replicator is good. The only longer term concern I have is if it pays to associate the resource type with the name if we ever think we will replicate other things besides pods in the future. If that were the case, podReplicator works for me as well. That is my general concern with controller, as I know there are other types of controllers that we will have for other resource types, so if we go that generic, it should include the resource name.

@bgrant0607
Member

@jstrachan FWIW, some of the responsibilities you mention are carried out by other components. For example, liveness probes are performed by Kubelet, and checking node health/appropriateness will be performed by the node controller. That does, perhaps, reduce the suitability of "supervisor".

Also, FWIW, if we have a single-instance controller, it won't be replicating.

@derekwaynecarr When we last discussed this, I said I was fine with podReplicator. In addition to my doubts about "replicator", I'm concerned that podReplicator is still too long, and I'd prefer a single-word term, ideally one that starts with a different letter than our other concepts, to facilitate abbreviation in kubectl, autocompletion, etc.

What about Overseer?

@bgrant0607
Member

More food for thought: We're planning to use one replication controller per deployment. So, if you had a canary deployment, daily release, and weekly release, and had a rolling update of the weekly release in progress, you'd have 4 replication controllers at that time. I wouldn't call the replication controller the Deployer (btw, I just stumbled across Vlad the Deployer -- great name), since I'd think that would be the entity controlling the rolling update -- similar to the distinction between the replication controller and auto-scaler. But, maybe there's terminology in that realm we could use.

@abonas
Contributor

abonas commented Dec 23, 2014

shortening to one word can also avoid this problem:
http://localhost:8080/api/v1beta1/replicationControllers --> works
http://localhost:8080/api/v1beta1/replicationcontrollers --> doesn't. (404)
Since it's the only entity that has a two-word name, it requires special "camelize" treatment on the client side.
Alternatively, make the lowercase URL respond correctly.
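
The client-side "camelize" treatment mentioned above might look like the following. It is a minimal sketch: `resource_path` is a hypothetical helper, and the only path taken from this thread is `/api/v1beta1/replicationControllers`; the assumption that single-word kinds map to all-lowercase paths is not confirmed here.

```python
def resource_path(kind: str, api_version: str = "v1beta1") -> str:
    """Build a collection path from a kind given as separate words (hypothetical helper).

    The kind must be passed as space-, hyphen- or underscore-separated words,
    e.g. "replication controllers"; an already-camelized name is not supported.
    """
    words = kind.replace("-", " ").replace("_", " ").split()
    if not words:
        raise ValueError("kind must not be empty")
    # First word stays lowercase; each following word is capitalized (lower camelCase).
    camel = words[0].lower() + "".join(w[:1].upper() + w[1:].lower() for w in words[1:])
    return f"/api/{api_version}/{camel}"
```

With a one-word name such as "replicators", the helper degenerates to simple lowercasing, which is the point of the comment above.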

@derekwaynecarr
Member

I believe v1beta3 will support lower case naming.


@thockin
Member

thockin commented Dec 23, 2014

Actuator

Cardinator

Inducer

Instigator

Factory

Initiator

@abonas
Contributor

abonas commented Dec 23, 2014

my vote goes to 'replicator' - it's a clear name that explains what that entity does.

@jasonkuhrt
Contributor

_podlicator_

@jasonkuhrt
Contributor

@bgrant0607 What's the status on this now? Is there an expected winner yet?

@bgrant0607
Member

@jasonkuhrt Was out for the holidays. Will pick this up again later this week -- a few things are ahead of it in my queue. Also, first we need to resolve #3058 -- subobjects vs. separate objects for the main different controller types.

@bgrant0607
Member

Some more fuel for the fire:

  • The existing name and proposals convey that the object is active: controller, overseer, replicator, etc. This is in distinct contrast to our other objects, pods and services, which are no less active. Services also monitor label selectors, for instance, and pods monitor and restart containers using liveness probes. This inconsistency suggests that we should at least entertain inactive-sounding names, as well.
  • If we go with the separate-object/plugin approach, there will be multiple controllers that replicate and/or oversee pods, so neither of those will be distinguishing properties. Other planned controllers include per-something controllers (a generalization of per-node; see Daemon, #1518), batch/workflow job controllers (Job Controller, #1624), cron controllers (Distributed CRON jobs in k8s, #2156), and, depending on your definition of "controller", auto-scalers (WIP: auto-scaler proposal, #2863) and nominal services (PetSet, formerly nominal services, #260).
  • For the separate-object approach, we'll want first-class metadata about Kinds of objects in order to be able to reliably determine which object Kinds represent controllers and the Kinds of objects they control, so "Controller" need not be part of the Kind name.

@karlkfi
Contributor

karlkfi commented Oct 15, 2015

The other distinction between JobController and ReplicationController is that RCs are created by users. I think the fact that it makes a controller is an implementation detail that the end user shouldn't have to know about. It could continue to be implemented as ReplicationController, as long as it's exposed to the user with a friendlier name. The dev and operator are the only personas that should have to know it's implemented by a controller.

That said, this isn't java. We don't need all our objects to say what kind of interface they satisfy in their name.

Also, a good interface is similar but not identical to the implementation. The API needs to be flexible enough to change when it improves usability without being tightly coupled to the architecture or impl details.

@davidopp
Member

> It's a single pod factory with a template for producing more of one kind of pod. It's also its own daemon that watches those pods and makes new ones when necessary.

How is this different from DaemonSet or Job? They're all pod factories that also manage the lifecycle of the pods they create.

@karlkfi
Contributor

karlkfi commented Oct 16, 2015

I'm not an expert in either.

The only clear difference between DaemonSet and RC seems to be the constraints that can be placed on the creation of pods. Both are constrained by node constraints but a ReplicationController can also have a desired number of pods independent of nodes and DaemonSets have an implicit maximum per node. I may be ignorant, but I don't see why they're not just the same thing with multiple optional constraint types.

As for Job, I think it's well named and exemplifies my point. A Job is a singular abstraction that creates Pods with a logical policy and a set of constraints.

@karlkfi
Contributor

karlkfi commented Oct 16, 2015

In fact, now that I think about it, if I were really going to clean house (in a backward-incompatible way) I'd combine DaemonSet and ReplicationController into one API resource and call it a Daemon, singular. That way it simplifies the abstractions down to just Daemons and Jobs. Daemons run till you stop them. Jobs run till they complete (for some configurable definition of completion). Both would be scheduled with the same types of constraints: min/max per node, min/max per cluster, min/max per filtered group of nodes, min/max concurrent, whatever.

@davidopp
Member

@karlkfi I would agree with you about combining DaemonSet and ReplicationController, but we are planning to move DaemonSet into the NodeController, so I think it makes sense to keep them separate.

Anyway, my point is that all three (D, R, and J) are "pod factory with a template for producing more of one kind of pod [and] its own daemon that watches those pods and makes new ones when necessary", so they should be named in a parallel fashion.

@karlkfi
Contributor

karlkfi commented Oct 16, 2015

> we are planning to move DaemonSet into the NodeController

Why?

> they should be named in a parallel fashion

Agreed.

Is it possible to move to a world of just Daemons and Jobs, or is that going to be impossible given backward-compatibility concerns and "daemon" already being used for a narrower concept?

@davidopp
Member

> Why?

To make it easier for it to put the daemons on newly-added machines before any scheduler can jump in and schedule something there, possibly blocking the daemon from scheduling. (Even if we had preemption, doing this in node controller allows us to avoid the preemption.)

> Is it possible to move to a world of just Daemons and Jobs, or is that going to be impossible given backward-compatibility concerns and "daemon" already being used for a narrower concept?

My main concern is the latter; when you read the code, documentation, comments, etc. it is really confusing to see "daemon" sometimes used to mean the traditional thing (the process that runs) and sometimes used to mean the API object/resource representing a collection of (traditional) daemons.

@karlkfi
Contributor

karlkfi commented Oct 16, 2015

> it is really confusing to see "daemon" sometimes used to mean the traditional thing (the process that runs) and sometimes used to mean the API object/resource representing a collection of (traditional) daemons.

Um, yes... Repurposing a term to mean something new is always going to cause intermediate confusion. It's only worth doing if the end state is sufficiently less confusing.

If the controllers are implementation details, however, the user can still conceptually create a Daemon and let k8s decide how to make that happen.

The difference to the user then is just "one per node + as many as possible total + high priority" vs "any number per node + exactly N total + low priority". High priority Daemons would be scheduled first, by the NodeController or Kubelet or however you want to implement it. "Priority" may or may not be the term you want to use; it could be a flag or something.
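
As a rough illustration of the two constraint profiles contrasted above ("one per node, as many as possible total" versus "any number per node, exactly N total"), the sketch below places pods round-robin under an optional per-node cap and an optional total. It is only a sketch: `DaemonSpec`, `place` and both field names are hypothetical, and priority is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DaemonSpec:
    max_per_node: Optional[int] = None  # None means no per-node cap
    total: Optional[int] = None         # None means "as many as possible"


def place(spec: DaemonSpec, nodes: List[str]) -> Dict[str, int]:
    """Return how many pods land on each node (hypothetical round-robin placement)."""
    if spec.max_per_node is None and spec.total is None:
        raise ValueError("unbounded: set max_per_node, total, or both")

    counts = {n: 0 for n in nodes}
    if not nodes:
        return counts

    # The per-node cap bounds the cluster-wide capacity, if it is set.
    capacity = None if spec.max_per_node is None else spec.max_per_node * len(nodes)
    if spec.total is None:
        target = capacity
    elif capacity is None:
        target = spec.total
    else:
        target = min(spec.total, capacity)

    placed, i = 0, 0
    while placed < target:
        node = nodes[i % len(nodes)]
        if spec.max_per_node is None or counts[node] < spec.max_per_node:
            counts[node] += 1
            placed += 1
        i += 1
    return counts
```

Under these assumptions, `DaemonSpec(max_per_node=1)` behaves like the per-node case and `DaemonSpec(total=5)` like the replica-count case, which is the unification the comment argues for.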

@bgrant0607
Member

@karlkfi More history about the separation of DaemonSet and ReplicationController is in #3058, if you're interested.

bgrant0607 changed the title from "Rename ReplicationController to ReplicaSet" to "Create ReplicaSet" on Nov 12, 2015
@bgrant0607
Member

We need to create a new controller in order to adopt the new label selector:
#341 (comment)

Back to original plan:
ReplicationController replacement should be called ReplicaSet
Deployment will not be renamed.
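
This thread does not spell out the new label selector. Assuming it is the set-based form that Kubernetes later exposed (requirements with the operators In, NotIn, Exists and DoesNotExist), matching could be sketched as below. `Requirement` and `matches` are hypothetical names, and the operator semantics are an assumption drawn from the later Kubernetes API rather than from this issue.

```python
from typing import Dict, List, Tuple

# (key, operator, values); operator is one of "In", "NotIn", "Exists", "DoesNotExist".
Requirement = Tuple[str, str, List[str]]


def matches(labels: Dict[str, str], requirements: List[Requirement]) -> bool:
    """Return True when the labels satisfy every requirement (all are ANDed)."""
    for key, op, values in requirements:
        if op == "In":
            # Key must be present with one of the listed values.
            if labels.get(key) not in values:
                return False
        elif op == "NotIn":
            # A missing key also satisfies NotIn.
            if key in labels and labels[key] in values:
                return False
        elif op == "Exists":
            if key not in labels:
                return False
        elif op == "DoesNotExist":
            if key in labels:
                return False
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return True
```

An equality-only selector, as used by ReplicationController, cannot express a requirement such as `("app", "In", ["web", "api"])`, which is one reason a new controller type would be needed to adopt the richer selector.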

bgrant0607 added this to the v1.2-candidate milestone on Nov 12, 2015
bgrant0607 modified the milestones: v1.2, v1.2-candidate on Nov 19, 2015
@mqliang
Contributor

mqliang commented Dec 18, 2015

Is there anybody working on this now? If not, I'd like to volunteer myself to do this work.

@madhusudancs
Contributor

@mqliang I am working on it. You can follow the progress here - https://github.com/madhusudancs/kubernetes/commits/deployment-pod-selector I haven't pushed my latest changes yet. I will do so soon. I am fixing up e2e tests for Deployments.

@bgrant0607
Member

@mqliang If you'd like to help, kubectl get/describe haven't been implemented yet for ReplicaSet.

@mqliang
Contributor

mqliang commented Feb 9, 2016

@bgrant0607 I sent PR #20886 to implement kubectl get/describe. I am not very familiar with kubectl, so I hope I have not missed anything.

@bgrant0607
Member

Remaining tasks should be covered by more specific issues.
