contrib: add git-sync container #3099

Merged: 8 commits, Feb 5, 2015


proppy
Contributor

@proppy proppy commented Dec 22, 2014

git-sync is a command that periodically syncs a git repository to a local directory.

It can be used to source a container volume with the content of a git repo.

Usage

usage:
GIT_SYNC_REPO= GIT_SYNC_DEST= [GIT_SYNC_INTERVAL= GIT_SYNC_BRANCH= GIT_SYNC_HANDLER=] \
git-sync -repo GIT_REPO_URL -dest PATH [-interval -branch -handler]

In addition to the periodic pull, it will re-sync the git repository whenever an HTTP request is made against :8080/${GIT_SYNC_HANDLER}.
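To illustrate the behaviour described above, here is a minimal sketch of a periodic pull that can also be forced over HTTP. It is not the code from this PR: the helper names `requestSync` and `pull`, the 30-second interval, and the use of a plain `git pull` in an existing clone are all assumptions made for the example.

```go
package main

import (
	"log"
	"net/http"
	"os"
	"os/exec"
	"time"
)

// requestSync queues a sync without blocking. It reports false when a sync
// is already pending, so bursts of HTTP requests coalesce into one pull.
func requestSync(trigger chan<- struct{}) bool {
	select {
	case trigger <- struct{}{}:
		return true
	default:
		return false
	}
}

// pull runs "git pull" in dest, which is assumed to already hold a clone.
func pull(dest string) error {
	cmd := exec.Command("git", "pull")
	cmd.Dir = dest
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}

func main() {
	dest := os.Getenv("GIT_SYNC_DEST")
	trigger := make(chan struct{}, 1)

	// Any request to /${GIT_SYNC_HANDLER} asks the loop below for a pull.
	http.HandleFunc("/"+os.Getenv("GIT_SYNC_HANDLER"), func(w http.ResponseWriter, r *http.Request) {
		requestSync(trigger)
		w.WriteHeader(http.StatusAccepted)
	})
	go func() { log.Fatal(http.ListenAndServe(":8080", nil)) }()

	ticker := time.NewTicker(30 * time.Second) // hypothetical interval
	for {
		if err := pull(dest); err != nil {
			log.Printf("sync failed: %v", err)
		}
		select {
		case <-ticker.C: // periodic pull
		case <-trigger: // forced pull over HTTP
		}
	}
}
```

The trigger channel has capacity one, so a request that arrives while a pull is running is remembered, and further requests in the same window are dropped rather than queued.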

@thockin
Member

thockin commented Dec 22, 2014

Why is this more useful than a shell script written around git?
while true; do git pull ...; sleep ...; done

Does this provide any semblance of atomicity? What happens to things that are using the repo while it updates? They can get one file from version A and another from version B.

Please think about what you are trying to do with this - I don't think it has been thought through fully.

@proppy
Contributor Author

proppy commented Dec 22, 2014

@thockin, I had a shell script + cron before, but @brendanburns convinced me to turn it into a small go program.

The added value is that you can ping it over HTTP to force a pull.

For atomicity, maybe we could fetch over a bare repo and checkout into a subdirectory named after the hash, with a HEAD symlink that is updated atomically.
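The symlink part of that idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic later added to the PR: `publish`, `current`, and the temporary name `.HEAD.tmp` are hypothetical, and the fetch and per-hash checkout steps are left out. It relies on rename(2) replacing the destination atomically on POSIX filesystems.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// publish points root/HEAD at rev (a directory name relative to root).
// It creates a temporary symlink and renames it over HEAD, so readers see
// either the old target or the new one, never a missing link.
func publish(root, rev string) error {
	tmp := filepath.Join(root, ".HEAD.tmp") // hypothetical temporary name
	_ = os.Remove(tmp)                      // clear a leftover from an interrupted run
	if err := os.Symlink(rev, tmp); err != nil {
		return err
	}
	return os.Rename(tmp, filepath.Join(root, "HEAD"))
}

// current returns the revision directory that HEAD points to.
func current(root string) (string, error) {
	return os.Readlink(filepath.Join(root, "HEAD"))
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: publish ROOT REV")
		os.Exit(2)
	}
	if err := publish(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	rev, err := current(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("HEAD ->", rev)
}
```

The swap itself is atomic, but a reader that re-resolves HEAD between two file reads can still mix revisions, which is the concern raised in the following comments.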

@proppy
Contributor Author

proppy commented Dec 22, 2014

Please think about what you are trying to do with this - I don't think it has been thought through fully.

I need this for a sample I'm working on, sent it as a PR because I thought it might be more generally useful.

Happy to move it to the sample repo if you think it's not in its current form.

@thockin
Member

thockin commented Dec 22, 2014

Now we're talking. Apps can still get one file from version A and another
from version B, though. How can we signal a change to apps that actually
care?


@thockin
Member

thockin commented Dec 22, 2014

I think it is under-specced in its current form.


@proppy
Contributor Author

proppy commented Dec 22, 2014

Apps can still get one file from version A and another
from version B, though

They could use http://golang.org/pkg/path/filepath/#EvalSymlinks to resolve to a stable location.

How can we signal a change to apps that actually
care?

Check the creation timestamp on the HEAD symlink?
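Both suggestions can be sketched in a few lines of Go. The helper names `pin` and `headModTime` are hypothetical, and the sketch assumes the sync directory contains a HEAD symlink as proposed earlier in the thread. Go's `os.FileInfo` exposes a modification time rather than a creation time, so the example reads that instead; for a symlink that is replaced rather than edited, the two should coincide.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// pin resolves root/HEAD once and returns the concrete revision directory.
// A reader that keeps using the returned path sees a single revision even
// if HEAD is repointed afterwards.
func pin(root string) (string, error) {
	return filepath.EvalSymlinks(filepath.Join(root, "HEAD"))
}

// headModTime returns the modification time of the HEAD symlink itself
// (Lstat does not follow the link), which changes when HEAD is replaced.
func headModTime(root string) (time.Time, error) {
	fi, err := os.Lstat(filepath.Join(root, "HEAD"))
	if err != nil {
		return time.Time{}, err
	}
	return fi.ModTime(), nil
}

func main() {
	root := os.Getenv("GIT_SYNC_DEST")
	dir, err := pin(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	stamp, err := headModTime(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("reading from", dir, "published at", stamp)
}
```

An application that cares about consistency would call `pin` once per request or per reload, and compare `headModTime` against the last value it saw to decide when to reload.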

@brendandburns
Contributor

Let's leave signaling to a later date. Plenty of webservers check file timestamps and invalidate caches when serving files, so you could easily use this as-is with something like nginx.

However, I think that the container you build should be completely parameterizable via environment variables. I don't care if you edit the go code to pick them up, or you add a startup script to do it.

@brendandburns
Contributor

(and most people don't care about atomic update of their PHP files)

@thockin
Member

thockin commented Dec 22, 2014

We should cut corners where doing so saves us significant time. I don't
think that doing this "right" (or right-er) is significantly harder, it
just needs to be thought about. I'm fine with leaving signalling or
detection as an exercise for the reader, as long as we think it is
plausibly doable.


@proppy
Contributor Author

proppy commented Dec 22, 2014

However, I think that the container you build should be completely parameterizable via environment variables. I don't care if you edit the go code to pick them up, or you add a startup script to do it.

@brendanburns
All flag values currently default to the corresponding environment variables. I updated the PR description with usage; let me know if it is missing any configuration.
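One common way to get flags that default to environment variables looks roughly like this. It is a sketch rather than the PR's code: the helper `envString` and the "master" fallback for the branch are assumptions, while the flag and variable names follow the usage text in the PR description.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// envString returns the value of the environment variable key, or def when
// the variable is unset or empty.
func envString(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	// Each flag's default comes from the environment, so the container can
	// be configured without passing any arguments.
	repo := flag.String("repo", envString("GIT_SYNC_REPO", ""), "git repo url")
	dest := flag.String("dest", envString("GIT_SYNC_DEST", ""), "destination path")
	branch := flag.String("branch", envString("GIT_SYNC_BRANCH", "master"), "git branch (fallback value is an assumption)")
	flag.Parse()

	if *repo == "" || *dest == "" {
		flag.Usage()
		os.Exit(2)
	}
	fmt.Printf("syncing %s (%s) into %s\n", *repo, *branch, *dest)
}
```

An explicit flag on the command line still wins over the environment, because `flag.Parse` overwrites the default only when the flag is present.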

@proppy
Contributor Author

proppy commented Dec 23, 2014

@thockin, added the symlink and individual checkout logic.

@proppy
Contributor Author

proppy commented Dec 23, 2014

It does complicate serving the directory from nginx quite a bit: you'd have to override the nginx conf to resolve symbolic links and set HEAD as the new root.

I wonder if it would make sense to have 2 modes triggered by a flag:

  • -atomic=true would explode every rev into its own dir and maintain a HEAD symlink:
.git/
6e96effe8409ed95404fe63c2be344587cc5fe2e/
HEAD -> ec48c3b590726afeecf4206ecb0fae80752de12d/
ec48c3b590726afeecf4206ecb0fae80752de12d/
  • -atomic=false would just update the repo in place (and allow you to bind-mount it without dealing with symlinks)

@bgrant0607
Member

What's our bar for contrib?

Anyway, a good starting point would be matching our existing git volume functionality:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/kubelet/volume/git_repo/git_repo.go

I'd like to eliminate that from our API and just rely on a sidecar container instead.

@proppy
Contributor Author

proppy commented Jan 29, 2015

One key difference from the existing git volume functionality is that it refreshes the git repo periodically.

Would you rather start with a simpler approach that just populates the git repo on startup, and address the question of atomic updates (which @thockin raised and which I tried to address in the last two commits) later?

@bgrant0607
Member

If that would help get this in sooner, yes. :-)

@proppy
Contributor Author

proppy commented Jan 29, 2015

PTAL, simplified the logic a lot, no more sync loop. Just initially populate the volume at a given rev.

@proppy
Contributor Author

proppy commented Jan 30, 2015

Some notes about offline discussion with @bgrant0607 and @thockin.

There are multiple ways we can approach periodic sync with this sidekick container:

  1. add the sync logic to the git-sync program itself (proppy@0003d5d)
  2. as @thockin suggested, make the container sleep before exiting after the git sync and rely on the pod to automatically restart it (5aba5f0)
  3. add a per-container restart policy in pods
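The second option, sync once, sleep, then exit and let the pod restart the container, might look roughly like this. The helper `runOnce`, the 30-second wait, and the plain `git pull` are assumptions for illustration and are not taken from the referenced commit.

```go
package main

import (
	"log"
	"os"
	"os/exec"
	"time"
)

// runOnce performs one sync, then pauses before returning so that a
// restarting supervisor does not relaunch the container in a tight loop.
// The pause runs even when the sync fails.
func runOnce(sync func() error, pause func()) error {
	err := sync()
	pause()
	return err
}

func main() {
	dest := os.Getenv("GIT_SYNC_DEST")
	err := runOnce(
		func() error {
			cmd := exec.Command("git", "pull") // assumes dest already holds a clone
			cmd.Dir = dest
			return cmd.Run()
		},
		func() { time.Sleep(30 * time.Second) }, // hypothetical wait
	)
	if err != nil {
		log.Fatal(err)
	}
	// Exiting here hands the "loop" to whatever restarts the container.
}
```

With this shape the sync interval is effectively the sleep plus the restart delay, so it is only as precise as the restart behaviour of the pod.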

Additionally, there are multiple ways we can synchronise the sidekick with the other containers in the same pod to make sure we don't serve partial data while the first sync is in progress:

  1. volumes as containers (Init container, #1589)
  2. container dependencies (Run Once containers vs. Chaining / Container dependencies, #1996)
  3. atomic checkout (proppy@9f37c9d)
  4. git-sync advertises 'readiness' of the pod over some HTTP endpoint (if multiple containers control different aspects of readiness, they can advertise it on different endpoints, and another sidekick could multiplex)
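The last option, advertising readiness over HTTP, could be sketched as follows. The `/ready` path, the helper names `markReady` and `readyStatus`, and the choice of 503 before the first sync are assumptions for the example; nothing here reflects an existing Kubernetes or git-sync API.

```go
package main

import (
	"log"
	"net/http"
	"sync/atomic"
)

// ready is 0 until the first sync completes, then 1.
var ready int32

// markReady records that the initial sync has finished.
func markReady() { atomic.StoreInt32(&ready, 1) }

// readyStatus returns the HTTP status the readiness endpoint reports.
func readyStatus() int {
	if atomic.LoadInt32(&ready) == 1 {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

func main() {
	http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(readyStatus())
	})
	go func() {
		// Placeholder for the initial sync; a real program would clone or
		// pull here and call markReady only on success.
		markReady()
	}()
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Other containers in the pod, or a multiplexing sidekick, would poll this endpoint and hold off serving until it returns 200.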

I'm happy to file separate issues for those ideas, but I think there is some value in discussing them in context first.

@proppy
Contributor Author

proppy commented Feb 3, 2015

PTAL, added demo for usage of the sidekick container, see the README

@proppy
Contributor Author

proppy commented Feb 3, 2015

Let me know if you'd prefer to have the two in separate PRs.

@bgrant0607
Member

Thanks! LGTM. We should merge to make it easier for people to play with.

Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged.