Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support nested namespaces (without ~prefix) #279

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

s-ol
Copy link
Contributor

@s-ol s-ol commented Jun 23, 2022

This is a major change that changes all existing route fomats.
The base route of each repo is no longer /repo or /~ns/repo but
/repo/-/ or /deeply/nested/repo/-/, whereas the Dulwich/SMARTHTTP
routes are at /repo.git or /deeply/nested/repo.git.

benefits

Klaus is expected to be used together with some method of hosting Git
repositories, such as plain SSH access, git-shell, or e.g. gitolite.
All of these allow organizing the repositories freely, creating directory
structures (and even mixing with other files) as wanted.

This routing scheme is inspired by GitLab, where repositories and groups
can be nested in this way also. It allows klaus to have that same flexibility
and keeps the SMARTHTTP clone URL "symmetric" to what e.g. an SSH clone URL may
look like:

git@host:path/to/repo.git
http://host/path/to/repo.git

downsides

Some might find the - in the URL irritating. The biggest issue is simply
that changing the URL scheme is not backwards compatible.

possible improvements

It would make sense to redirect /repo to /repo/-/ automatically, which would solve the most common issue (links to repositories not working).

It'd be great to be able to browse the namespace tree and maybe even show files (or just READMEs) there to use this organization scheme to the fullest.

@s-ol
Copy link
Contributor Author

s-ol commented Jun 23, 2022

I'm sure this is a controversial change, but I put it up here for discussion primarily.

One way to keep backwards compatibility might be to allow choosing from one of two routing schemes:

  1. "compat":
    • /<repo>/
    • /<repo>/tree/<rev>/<path>
    • /~path/~to/<repo>
    • /~path/~to/<repo>/tree/<rev>/<path>
  2. "bare":
    • /<repo>/-/
    • /<repo>/-/tree/<rev>/<path>
    • /path/to/<repo>/-/
    • /path/to/<repo>/-/tree/<rev>/<path>

@jonashaag
Copy link
Owner

jonashaag commented Jun 23, 2022

Love it! I think it's worth switching to this. Is it identical to the GitLab URL scheme?

I think redirects are sufficient for backwards compatibility. We definitely need to redirect SmartHTTP requests also. (Hope that's possible with git pull and push, otherwise we'll have to support the old scheme and maybe emit a warning.)

@s-ol
Copy link
Contributor Author

s-ol commented Jun 23, 2022

Demo is up here: https://git.s-ol.nu/

I think redirects are sufficient for backwards compatibility.

Got it. I think I'm done for the day, if you want to pick up from here go for it. Otherwise I might get around to it in the weekend.

@jelmer
Copy link
Contributor

jelmer commented Jun 23, 2022

FWIW I really like this change too - and +1 for using /-/, since that's consistent with GitLab. Redirects for the old URLs would indeed be good.

@s-ol
Copy link
Contributor Author

s-ol commented Jun 26, 2022

I think redirects are sufficient for backwards compatibility. We definitely need to redirect SmartHTTP requests also. (Hope that's possible with git pull and push, otherwise we'll have to support the old scheme and maybe emit a warning.)

Since dulwich runs first in the WSGI chain and only looks at the tail of the path to decide whether it should handle a request, this seems kin of difficult. When requesting /repo/HEAD we don't get a chance to redirect or warn, dulwich just 404s right away (if we configured it for /repo.git/HEAD instead). The simplest solution is just to register both I guess.

@s-ol
Copy link
Contributor Author

s-ol commented Jun 26, 2022

This last commit tries to redirect all old URLs, but it currently causes issues when there is a collision:

http://localhost:8000/test/tree/tofu/-/ is redirected to http://localhost:8000/test/-/tree/tofu/-/ instead of showing an index page.

Currently reading up on flask route ordering.

@s-ol s-ol force-pushed the nested-namespaces branch 2 times, most recently from 9470d0b to a30105b Compare August 11, 2022 15:07
@s-ol
Copy link
Contributor Author

s-ol commented Aug 11, 2022

Okay, this took some fiddling but I think I found a workable solution. Instead of relying on dulwich to decide whether to handle an URL or not, the request handling is now turned around like so:

  1. the main klaus app handles the request
  2. a secondary app redirects old-style URLs to new-style URLs (for known repos only)
  3. dulwich has a turn

Whenever one of the apps returns a 404, the request is forwarded to the next one.

This works because:

  • dulwich doesn't try to handle old-style git paths before they are redirected
  • the redirects don't clash with the real paths

This now also implements all redirects, including ~namespaced/repos, SmartHTTP, and all of the views klaus provides.

@s-ol
Copy link
Contributor Author

s-ol commented Aug 11, 2022

The only downside with the current flow is that the last app in the chain decides what the 404 request looks like - and that's dulwich. Klaus never had a particularily pretty 404 page, but dulwich's just prints "Method not supported" which is misleading. Not 100% sure how to tackle this, especially since we don't get good feedback from dulwich on whether a request went well or not...

@s-ol s-ol force-pushed the nested-namespaces branch 2 times, most recently from 9cd5398 to dd9bd9c Compare August 11, 2022 15:34
@s-ol
Copy link
Contributor Author

s-ol commented Aug 11, 2022

Okay, another change and I think that's a lot cleaner now:

  • the Klaus() and KlausRedirects() apps are chained as mentioned above
  • Dulwich is explicitly invoked inside of Klaus() only in the smarthttp route
  • if both Klaus() and KlausRedirects() 404, the first response (from the main klaus app) gets sent out

Another small bonus change:
/path/to/repo.git now redirects to /path/to/repo/-/. This does not disturb git clone but means that one URL can be used to link to the repo and to clone it.

I think the only thing that is missing now is tests for old-style URL redirects. I would appreciate if someone that actually uses the test suite wants to take a crack at that.

@s-ol s-ol force-pushed the nested-namespaces branch 2 times, most recently from 7ef51f1 to 1b54a50 Compare August 11, 2022 15:58
s-ol added 3 commits August 11, 2022 18:01
This is similar to jonashaag#168 but uses `os.scandir()` instead of `os.walk()` so as
not to recurse into detected repositories. It has so been extended for bare
repositories, although it only checks for the `.git` suffix in that case.
This is a major change that changes all existing route fomats.
The base route of each repo is no longer `/repo` or `/~ns/repo` but
`/repo/-/` or `/deeply/nested/repo/-/`, whereas the Dulwich/SMARTHTTP
routes are at `/repo.git` or `/deeply/nested/repo.git`.

Klaus is expected to be used together with some method of hosting Git
repositories, such as plain SSH access, git-shell, or e.g. gitolite.
All of these allow organizing the repositories freely, creating directory
structures (and even mixing with other files) as wanted.

This routing scheme is inspired by GitLab, where repositories and groups
can be nested in this way also. It allows klaus to have that same flexibility
and keeps the SMARTHTTP clone URL "symmetric" to what e.g. an SSH clone URL may
look like:

    git@host:path/to/repo.git
    http://host/path/to/repo.git

Some might find the `-` in the URL irritating. The biggest issue is simply
that changing the URL scheme is not backwards compatible.

It would make sense to redirect `/repo` to `/repo/-/` automatically,
which would solve the most common issue (links to repositories not working).
@s-ol s-ol force-pushed the nested-namespaces branch from 1b54a50 to e1182b3 Compare August 11, 2022 16:01
@jonashaag
Copy link
Owner

jonashaag commented Sep 12, 2022

@s-ol: I'll happily merge this if it has working tests. Not saying you need to fix them, but it can very well be that nobody else will step forward for a long time to do it either.

@s-ol
Copy link
Contributor Author

s-ol commented Sep 12, 2022

@jonashaag the existing tests do (should?) pass, what would be nice to have is tests that validate that old URLs are redirected as appropriate.

@jonashaag
Copy link
Owner

I spent some time updating the tests (and fixing them, mostly not your fault).

Open problems:

  • ChainedApps is broken (_cv_app is undefined)
  • SmartHTTP seems to be broken (/git-receive-pack not working?)
  • Do we still need the "namespace" thing? It was never officially released. Can't we just stuff all the repos in a single "namespace" like it was before the introduction of namespaces? Then repo names could be flat or nest/ed and repo_paths in __init__.py could be a flat list instead of a {namespace: repos} dict.

My changes are here, feel free to incorporate: https://github.com/jonashaag/klaus/tree/nested-namespaces

@jonashaag
Copy link
Owner

@s-ol are you planning to work on this some more?

@s-ol
Copy link
Contributor Author

s-ol commented Dec 29, 2022

Hey! Yeah, I want to but haven't really made any progress / had time to. I'm a bit surprised about SmartHTTP since it seems to be working on my hosted instance but I'll look into the whole list

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants