Add kubectl run --dir=<foo> and kubectl push ... #16010

Closed
wants to merge 4 commits into from

Conversation

brendandburns
Contributor

This adds a new option to kubectl run which allows a user to automatically upload a directory to storage and make that URL available to their Pod as an environment variable. It also adds a standalone command.

kubectl run example

kubectl run example --image=busybox --dir=/some/dir/on/my/machine --cmd='wget $ARCHIVE; tar -xvzf $(basename $ARCHIVE); ...'

This packs the specified directory into a tarball, uploads it to cloud storage, and exposes the location of the uploaded tarball to the Pod as an environment variable, so that the container can use ${ARCHIVE} to reference the uploaded file.
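The packing step and the environment entry can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names `pack_directory` and `pod_env` are hypothetical, and the upload itself is left out because it depends on the cloud provider.

```python
import io
import tarfile
from pathlib import Path


def pack_directory(src_dir: str) -> bytes:
    """Return a gzip-compressed tarball of src_dir as bytes (hypothetical helper)."""
    src = Path(src_dir).resolve()
    if not src.is_dir():
        raise NotADirectoryError(src_dir)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # Store entries relative to the directory's own name so the archive
        # unpacks into "<dirname>/..." instead of an absolute path.
        tar.add(str(src), arcname=src.name)
    return buf.getvalue()


def pod_env(archive_url: str) -> list:
    """Build the container env entry that exposes the upload location.

    ARCHIVE is the variable name used in this PR; the list-of-dicts shape
    mirrors the name/value pairs of a container's env list.
    """
    return [{"name": "ARCHIVE", "value": archive_url}]
```

Building the tarball in memory keeps the sketch short; a real client would more likely stream to a temporary file for large directories.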

kubectl push examples

kubectl push some/dir my-bucket/my-archive.tgz
kubectl push my-java.jar my-bucket
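One way to read the two forms above is that the destination is a bucket with an optional object name. The sketch below shows that parsing under stated assumptions: `parse_destination` is a hypothetical helper, and the default object name (the source's base name, with `.tgz` appended for directories) is a guess for illustration; the PR description does not say what name is used when only a bucket is given.

```python
from pathlib import PurePosixPath
from typing import Tuple


def parse_destination(source: str, destination: str, source_is_dir: bool) -> Tuple[str, str]:
    """Split "<bucket>[/<file>]" into (bucket, object_name).

    The fallback object name is an assumption made for this sketch.
    """
    bucket, _, obj = destination.strip("/").partition("/")
    if not bucket:
        raise ValueError("destination must start with a bucket name")
    if not obj:
        base = PurePosixPath(source.rstrip("/")).name
        if not base:
            raise ValueError("cannot derive an object name from %r" % source)
        # Directories are packed into a tarball, so give them a .tgz suffix.
        obj = base + ".tgz" if source_is_dir else base
    return bucket, obj
```

For example, `parse_destination("my-java.jar", "my-bucket", False)` yields the bucket `my-bucket` and the object name `my-java.jar`.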

@bgrant0607 @kubernetes/kubectl

Depends on #16009 and #15926

Please only review the top-most commit

@brendandburns changed the title from "Add kubectl run --directory=<foo>" to "Add kubectl run --dir=<foo>" on Oct 21, 2015
@k8s-bot

k8s-bot commented Oct 21, 2015

GCE e2e build/test failed for commit 176f893f12caca82ef6c7a83aa9d98f04a754b0c.

@k8s-github-robot k8s-github-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 21, 2015
@thockin
Member

thockin commented Oct 21, 2015

The flag name --dir could be more descriptive. --use-dir ? --upload-dir?

Is this a common enough pattern to warrant porcelain support?

@k8s-bot

k8s-bot commented Oct 21, 2015

GCE e2e build/test failed for commit fef7b1eb34c5281154023c66cc1974419fb70070.

@hw-qiaolei
Contributor

@brendandburns Is it an illusion, or is this file really at a path as deep as pkg/api/api/api/api/api/context.go?

@smarterclayton
Contributor

Not that I dislike the use case, but what happens when a cluster doesn't have integrated cloud storage? Why not attach and stream to stdin? The environment variable doesn't seem like the most natural target for a file. Is that URL secured? Time delayed? How do I make it secure?

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 21, 2015
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 22, 2015
@k8s-bot

k8s-bot commented Oct 22, 2015

GCE e2e test build/test passed for commit 69a54ba611617178dc60f67c33754240b71ca03d.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2015
@brendandburns
Contributor Author

@thockin
switched to --upload-directory

I would really like to support use cases where users don't have to build their own Docker images to use Kubernetes. Imagine something like:

kubectl run --upload-dir=./my-python-app --image=generic-python:2.7 my-python-app

This tars up your files and uploads them to cloud storage; the generic-python:2.7 image then knows how to use the ${ARCHIVE} variable to download the user's code and start executing it.
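What such a generic image might do at startup can be sketched as below. This is an illustration only: `fetch_and_unpack` is a hypothetical name, it assumes ${ARCHIVE} holds a URL that `urllib` can open without extra credentials, and it unpacks regular files only while rejecting paths that would escape the target directory.

```python
import shutil
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def fetch_and_unpack(url: str, dest: str) -> list:
    """Download a .tgz from url and unpack its regular files under dest.

    Returns the sorted list of unpacked paths, relative to dest.
    """
    dest_root = Path(dest).resolve()
    dest_root.mkdir(parents=True, exist_ok=True)
    written = []
    with tempfile.TemporaryFile() as tmp:
        # Spool the download to disk so large archives are not held in memory.
        with urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        tmp.seek(0)
        with tarfile.open(fileobj=tmp, mode="r:gz") as tar:
            for member in tar:
                if not member.isfile():
                    continue  # skip directories, links, and device nodes
                target = (dest_root / member.name).resolve()
                if dest_root not in target.parents:
                    raise ValueError("unsafe path in archive: %r" % member.name)
                target.parent.mkdir(parents=True, exist_ok=True)
                with tar.extractfile(member) as src, open(target, "wb") as out:
                    shutil.copyfileobj(src, out)
                written.append(target.relative_to(dest_root).as_posix())
    return sorted(written)
```

An entrypoint script in the image could call `fetch_and_unpack(os.environ["ARCHIVE"], "/app")` and then start the application from `/app`; both the `/app` location and the start command would be conventions of that particular image.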

@hw-qiaolei
ugh, something went wrong in my client. fixed

@smarterclayton
The trouble with the stream approach is that it won't work if you create a replication controller with multiple replicas (e.g. --replicas=3). It also won't work unless you require the pod to have a volume.

For security purposes, the uploads will require authenticated access which cloud providers enable via IAM credentials on the VMs in the cluster.

For now, this won't work in non-cloud environments, but I could easily imagine a cluster add-on that installs an S3 clone inside the cluster for physical environments.

@k8s-bot

k8s-bot commented Oct 22, 2015

GCE e2e test build/test passed for commit e5835c8a7d510e6014e597a13370c3652ebe4980.

@smarterclayton
Contributor

Why not add a URL volume type and bypass the env var injection?

@smarterclayton
Contributor

Having an opinionated env var (ARCHIVE) is something we've tried to avoid, but it is probably OK here. Is this really such a common use pattern that it belongs in run (vs. any of a hundred other tools that can do this before run)? This assumes pretty strongly one object store pattern per environment. I don't know that that makes sense in a heterogeneous deployment at scale.

@thockin
Member

thockin commented Oct 22, 2015

Yeah, I get the pattern and I like it. I share Clayton's concern about assuming a one-true storage system based on cloud-provider. It might be something we can get away with, but it's unpleasant. I mean, I could use GCS or S3 for my non-GCE kube runs, just because it is there. Would it make more sense to abstract that? I'm grasping...

On Wed, Oct 21, 2015 at 10:30 PM, Clayton Coleman notifications@github.com wrote:

Having an opinionated env var (ARCHIVE) is something we've tried to avoid, but probably is ok here. Is this really such a common use pattern that it belongs in run (vs any of a hundred other tools that can do this before run)? This assumes pretty strongly one object store pattern per environment - I don't know that makes sense in a heterogeneous deployment at scale.



@brendandburns
Contributor Author

The cloud provider comes in via the kubectl --cloud-provider ... flag, so you could conceivably use Google Cloud Storage with on-prem Kubernetes; you'd just have to make sure that gsutil works correctly on those nodes.
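The idea of selecting the storage backend by provider name, independent of where the cluster runs, could look something like the sketch below. Everything here is hypothetical and is not the cloudprovider API: the registry, the `Uploader` signature, and the local-directory stand-in backend exist only to show the shape of the abstraction. A real backend would call the provider's storage service instead.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict

# An uploader takes (local_path, bucket, object_name) and returns a URL.
Uploader = Callable[[str, str, str], str]

_UPLOADERS: Dict[str, Uploader] = {}


def register(provider: str, uploader: Uploader) -> None:
    """Associate a provider name (as passed on the command line) with a backend."""
    _UPLOADERS[provider] = uploader


def upload(provider: str, local_path: str, bucket: str, name: str) -> str:
    """Upload through the backend registered for provider and return its URL."""
    try:
        uploader = _UPLOADERS[provider]
    except KeyError:
        raise ValueError("no storage backend registered for %r" % provider) from None
    return uploader(local_path, bucket, name)


def make_local_uploader(root: str) -> Uploader:
    """Stand-in backend for the sketch: buckets are directories under root."""
    def _upload(local_path: str, bucket: str, name: str) -> str:
        target = Path(root).resolve() / bucket / name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(local_path, target)
        return target.as_uri()
    return _upload
```

With a registry like this, the choice of object store is decoupled from the cluster's own provider, which is the property being discussed in this thread.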

I like the idea of adding a URL backed volume, though that doesn't work for cloud storage providers since you need to supply the credentials to both GCS and S3 in order for private archives to work.

I could imagine adding both URL and cloud storage volumes, but that is beyond the scope of this PR.

@brendandburns brendandburns force-pushed the kubectl3 branch 2 times, most recently from bcf5aca to fa3148a Compare October 23, 2015 05:19
@k8s-bot

k8s-bot commented Oct 23, 2015

GCE e2e build/test failed for commit fa3148a443987a0c8f6e40d8764c807e9f028935.

@bgrant0607
Member

I'm going to try to look at this on Wednesday.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 3, 2015
@brendandburns
Contributor Author

Ok, I added a separate command:

kubectl push <file-or-directory> <bucket[/<file>]>

in addition to

kubectl run --upload-directory ...

ptal.

thanks!
--brendan

@brendandburns changed the title from "Add kubectl run --dir=<foo>" to "Add kubectl run --dir=<foo> and kubectl push ..." on Nov 17, 2015
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 17, 2015
@k8s-bot

k8s-bot commented Nov 17, 2015

GCE e2e test build/test passed for commit 2fd6dc1.

@bgrant0607
Member

Does bucket have any meaning on other cloud providers?

How are credentials provided?

What do you think about the approach of pushing to a registry, which already abstracts away the cloud provider?

@bgrant0607
Member

Or is there another third-party tool that does the equivalent of push?

@bgrant0607
Member

I prefer the idea of mounting an image or part of an image as mentioned here:
#831 (comment)

@brendandburns
Contributor Author

@bgrant0607

"Bucket" is used by both GCS and AWS S3, so it seems like a reasonable enough term of art.

Credentials are provided by the cloud provider in a cloud-provider-specific manner (either via installed credentials on the machine, or via a config file).

I don't like pushing to a registry because I would like people to be able to use this without thinking about a Docker registry. Also, I don't think we want to get into the business of building images (and the same goes for that comment).

@brendandburns
Contributor Author

@bgrant0607
And I don't know of an equivalent command line that is cross-cloud.

@bgrant0607
Member

We have a program for building simple images without the Docker client (https://github.com/kubernetes/contrib/blob/master/go2docker/go2docker.go), which we could update, generalize, make easier to use, and potentially integrate with kubectl.

A registry is required to run Kubernetes, and Docker more generally, in every environment.

We have a solution for distributing user-specific image pull secrets, rather than host-centric secrets.

We need to be able to mount images into pods/containers (part of #831) for other reasons.

So using images wouldn't add much to Kubernetes that doesn't already exist or need to exist for other use cases.

I might suggest git, since git is cloud-independent and we have git volumes. However, git supposedly isn't great with binaries, and we don't currently have a way to plumb credentials to git volumes. I'm not eager to add that until we generalize git volumes to arbitrary data-pulling containers.

Or maybe we could leverage a slug runner:
https://github.com/flynn/flynn/tree/master/slugrunner

I want to make it easier to run apps for people who don't know Docker well, but I'm not keen to add a non-portable, partially redundant feature to Kubernetes, nor am I eager to expand the cloudprovider API. This seems like it would add too much technical debt without advancing our other goals.

@saad-ali @thockin @smarterclayton @gabrtv: Other suggestions?

@gabrtv

gabrtv commented Nov 19, 2015

I'd like to better understand the use-case here, but my first impression is that I'm not thrilled about:

  1. Expanding usage of cloudprovider API in this way
  2. Moving away from fully-formed images as the de-facto runtime artifact

Imagine something [...] which tars up your files, uploads them to cloud storage and then generic-python:2.7 image knows how to use the ${ARCHIVE} variable to download the users code and start executing it.

This is trickier than it sounds. What happens when the user pushes a Python app that has a dependency on libpq5 or imagemagick? One approach is to ship a beefy Cedar stack as a base image. However, this isn't comprehensive either. If the goal is to...

...support use cases where users don't have to build their own Docker images to use kubernetes.

I think this is best addressed by an optional source-to-image capability running on top of Kubernetes. The Deis Builder, for example, runs on cluster and provides a git server interface which spits out Docker images to a registry and (optionally) calls out to the orchestrator. I'm supportive of something like this being integrated more tightly. The big problem is we need a registry!

A registry is required to run Kubernetes, and Docker more generally, in every environment.

Agree wholeheartedly. Let me propose something more drastic: ship an opinionated registry solution with Kubernetes (Docker registry or otherwise). While this is anything but trivial, it would allow intelligent image distribution/management across the cluster and even Swarm-style image affinity, which is a really nice feature. This feels like a better long-term solution.

@bgrant0607
Member

@brendandburns There are multiple examples in #831 of users wanting to just mount images, for war files, web site assets, etc.

@gabrtv Our internal package format is composable: the base file system, language runtime, application, configuration, etc. can be bundled as separate packages and dynamically composed. It's described here:
https://www.usenix.org/sites/default/files/conference/protected-files/lisa_2014_talk.pdf

It's actually pretty handy, though I agree there are tradeoffs and it needs to be used judiciously.

Shipping a registry: Yes. #1319. We've started down this path:
https://github.com/kubernetes/kubernetes/blob/master/cluster/saltbase/salt/kube-registry-proxy/kube-registry-proxy.yaml
But it's not very usable yet.

Image affinity: Thanks for the pointer. #17196

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 20, 2015
@brendandburns
Contributor Author

@bgrant0607 What you suggest adds far more complexity than is worth building for now. I would like to get this experience built ASAP. If we think that there are ultimately better solutions, great, we can leave TODOs for those as future expansion. I really don't want the perfect to be the enemy of the good here.

@brendandburns
Contributor Author

And @gabrtv, this is no different from slug runner, except that it enables users to develop and use their own "slug runner" style templates without necessarily relying on the Heroku format/compatibility.

I agree that all of the problems you present are potential problems, but I think that something very simple will cover the 80% case for most users, enabling them to directly integrate Kubernetes into their existing develop/build workflows.

@bgrant0607
Member

@brendandburns Regarding perfect vs. good: I can make the same argument. Just put this in contrib and write a blog post about it. Users could start using it right away, even without waiting for 1.2.

@bgrant0607
Member

cc @thockin @kubernetes/goog-cluster due to the relationship to registry, #831, storage, etc.

@bgrant0607
Member

@brendandburns

Regarding Heroku style: Buildpacks are pretty minimalistic and are used by more than just Heroku -- all CloudFoundry-derived systems, for instance.
https://dzone.com/articles/paas-buildpacks

@bgrant0607
Member

Another push request: #17707

@k8s-bot

k8s-bot commented Feb 17, 2016

Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist")

If this message is too spammy, please complain to ixdy.

@bgrant0607
Member

We're going to find another way to support this. For example, templates.md and initContainers (#23666) are in progress. Closing as inactive.
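For readers wondering how initContainers could cover the same use case, here is a rough sketch that builds a Pod manifest in which an init container downloads and unpacks the archive into a shared emptyDir volume before the application container starts. It assumes the `initContainers` field as it later appeared in the v1 Pod API, which was still in progress at the time of this comment. The helper name, the volume name, the `/work` path, and the use of busybox `wget` and `tar` are illustrative choices, and a private archive would additionally need credentials.

```python
import json


def pod_with_archive(name: str, image: str, archive_url: str, workdir: str = "/work") -> dict:
    """Build a Pod manifest whose init container unpacks an archive into workdir."""
    volume = {"name": "archive", "emptyDir": {}}
    mount = {"name": "archive", "mountPath": workdir}
    # The shell expands $ARCHIVE inside the init container, not on the client.
    fetch = (
        'wget -O /tmp/archive.tgz "$ARCHIVE" && '
        "tar -xzf /tmp/archive.tgz -C " + workdir
    )
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [volume],
            "initContainers": [
                {
                    "name": "fetch-archive",
                    "image": "busybox",
                    "command": ["sh", "-c", fetch],
                    "env": [{"name": "ARCHIVE", "value": archive_url}],
                    "volumeMounts": [mount],
                }
            ],
            # The application container sees the unpacked files at workdir.
            "containers": [
                {"name": name, "image": image, "volumeMounts": [mount]}
            ],
        },
    }
```

Calling `json.dumps(pod_with_archive(...), indent=2)` produces a document that could be piped to `kubectl create -f -`, since kubectl accepts JSON manifests as well as YAML.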

Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

10 participants