Figure out how to handle code in multiple repos #24343

Closed
bgrant0607 opened this issue Apr 15, 2016 · 108 comments
Labels
area/code-organization, kind/feature, lifecycle/frozen, priority/important-soon, sig/architecture, sig/contributor-experience, sig/testing

Comments

@bgrant0607
Member

bgrant0607 commented Apr 15, 2016

A large monorepo works for Google, but not on github.

We hit the ceiling of achievable velocity of a single github repo in early 2015:
https://github.com/kubernetes/kubernetes/graphs/contributors

There are many reasons: ACLs, notification management, issue triage, PR reviews, sequentialized submit testing, merge conflicts, etc.

We're chipping away at these issues, but we need more than incremental improvement.

We've discussed moving a number of things to other repos:

We need to seriously think about how to do this.

Known issues that need to be addressed:

An example of a Go project on github with good repo hygiene:
https://github.com/deis

I have no illusions that breaking the project into separate repos will be a silver bullet: it's necessary, but not sufficient. I also know that it will cause some pain. But that pain already exists: cadvisor, heapster, dashboard, contrib, docs, ....

Speaking of contrib, it needs to be broken up, too: kubernetes-retired/contrib#762

@thockin @smarterclayton @lavalamp @mikedanese @dchen1107 @davidopp @ixdy

@bgrant0607 added the priority/important-soon and team/none labels Apr 15, 2016
@hongchaodeng
Contributor

That would be super awesome.
I can definitely help move the scheduler part.
Let me know when it's ready for a move.

@davidopp @wojtek-t @xiang90

@davidopp
Member

davidopp commented Apr 15, 2016

I'd love to see an analysis of what is the best we can do without creating additional repos (of course re-organizing the directory structure would not count as creating additional repos). In other words, for each of the points you mentioned ("ACLs, notification management, issue triage, PR reviews, sequentialized submit testing, merge conflicts, etc."), what is the best we can do within the current framework, and what will these things look like if we have separate repos. So we can see how much things will be better with current repos (and understand what downsides there might be, other than the implementation work to get there).

Full disclosure: my personal feeling is that creating more repos this year will create an amount of churn that will be counter-productive to the project's velocity, and while it may make some things better, it will make other things worse, and we don't have a good handle on either how much better things will become or how much worse things will become.

@bprashanth
Contributor

Full disclosure: my personal feeling is that creating more repos this year will create an amount of churn that will be counter-productive to the project's velocity.

Another downside is people will start saying "that's not my job", cross-repo. I think it'll make dealing with e2e flakes that much harder.

@hongchaodeng
Contributor

hongchaodeng commented Apr 15, 2016

The boundaries between repos are APIs and releases. It's already hard for people from different groups to figure out why other e2e tests failed?

what is the best we can do within the current framework, and what will these things look like if we have separate repos.

I completely agree. Nevertheless, this is a good time to point out that the current development workflow funnels people with different interests (scheduling, client side, testing, node) into one crowded path. I think what was suggested here is not forcing people to choose but opening new paths for people w.r.t. separation of concerns. The goal is agility. Starting to evaluate different approaches and understand the potential problems is much better.

@smarterclayton
Contributor

Some repos will be compiled into Kube, and so the boundaries are also
imports, dependencies, and team process (along with testing).

Admission controllers may deserve separate repos, as well as authorizer
implementations.


@bprashanth
Contributor

bprashanth commented Apr 15, 2016

The boundaries between repos are APIs and releases. It's already hard for people from different groups to figure out why other e2e tests failed?

That really depends on how coupled the project is. This is more a statement about the current state of our repo, not about software development philosophy. ~50% of the "Services" test flake I debugged for the past release was not related to Services. If I'd just dropped the mic on those, things would probably be worse off.

I think the distinction is that a lot of contributors don't debug test flake (and this is being unfair to those contributors who do), but there's a very low chance I'm following anything into, e.g., kubelet code if it's a Godep.

Cadvisor is a good example of this.

@lavalamp
Member

I think we're not really equipped to run multiple repos right now. I imagine that leads to a world of N submit queues, N times the tests, N times the vendoring, basically N times the problems. It will badly hurt velocity because what you can do in a single PR now, you'd instead have to do with a well-ordered series of PRs & dependency bumps.

Building import walls in the repository is the thing we can do ~now and it will make an eventual split easier/possible (right now I think we'll have vendor loops if we split). The import-boss utility makes this possible.
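To make the idea of an import wall concrete, here is a minimal sketch in Python. It is not how import-boss works or how its rules are written; the package paths, the `ALLOWED_PREFIXES` table, and the `find_violations` helper are all hypothetical, and the import graph is written by hand instead of being parsed from Go source.

```python
# Hypothetical import-wall check: each walled package lists the import
# prefixes it may depend on; any other in-project import is a violation.
# Illustration only; this is not the import-boss rule format.

PROJECT_ROOT = "example.com/proj/"  # assumed project import prefix

ALLOWED_PREFIXES = {
    # package (made-up name) -> in-project prefixes it may import
    "example.com/proj/pkg/client": ["example.com/proj/pkg/api"],
    "example.com/proj/pkg/kubelet": [
        "example.com/proj/pkg/api",
        "example.com/proj/pkg/client",
    ],
}


def find_violations(imports_by_package, allowed=ALLOWED_PREFIXES):
    """Return sorted (package, forbidden_import) pairs."""
    violations = []
    for pkg, imports in imports_by_package.items():
        rules = allowed.get(pkg)
        if rules is None:
            continue  # no wall declared for this package
        for imp in imports:
            if not imp.startswith(PROJECT_ROOT):
                continue  # standard library or third-party import
            if not any(imp == p or imp.startswith(p + "/") for p in rules):
                violations.append((pkg, imp))
    return sorted(violations)


# Hand-written import graph standing in for what a real tool would parse.
example_graph = {
    "example.com/proj/pkg/client": [
        "fmt",
        "example.com/proj/pkg/api/types",
        "example.com/proj/pkg/kubelet/util",  # crosses the wall
    ],
    "example.com/proj/pkg/kubelet": [
        "example.com/proj/pkg/client/rest",
    ],
}

if __name__ == "__main__":
    for pkg, imp in find_violations(example_graph):
        print(f"{pkg} may not import {imp}")
```

Run as a verify-style check, this would flag the client package reaching into kubelet code, which is the kind of edge that would turn into a vendor loop after a split.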

I think we desperately need OWNERs, & to scale the number of reviewers.

I am not in favor of splitting our repository further until it looks like we're treating contrib/ with the same seriousness we treat this one. That means same tool stack, same testing standards, same set of code verification tests, same SLO on reviews, sane vendoring strategy. We shouldn't do more splits until our current split is looking like a success.

I do think it's good to come up with a strategy for splitting, but I don't have bandwidth to participate heavily at the moment. However, before we do any splits, I think we need to split out the tools that operate on our repo. That means many/most of the verify-*.sh scripts, for example. We need consistent tooling to run over all of our repositories. Without that, instead of one tangled mess, we will end up with N tangled messes.

@smarterclayton
Contributor

Agree with OWNER and import walls now.


@vishh
Contributor

vishh commented Apr 15, 2016

I like @lavalamp's suggestion, but it still doesn't help with managing notifications though...

@hongchaodeng
Contributor

May I ask what import wall is?

@alex-mohr
Contributor

I'd like to see more justification of why we should split -- that is, identify problems caused by the single repo, determine how significant those problems actually are and what their impact is, then discuss specifically how splitting into multiple repos will solve those problems, along with what the downsides of splitting are, and whether there are alternative approaches to solving those problems that come with less cost or other benefits.

Once we have that, we can decide whether cost/benefit works out vs. opportunity cost elsewhere.

@davidopp
Member

it still doesn't help with managing notifications though...

@vishh : you and @bgrant0607 both referred to a notifications problem. Can you explain this problem in more detail? In particular, what notifications do you receive today that you do not want, and what notifications do you not receive today that you do want?

@vishh
Contributor

vishh commented Apr 15, 2016

@davidopp I was referring to the notifications generated by github. If we were to have separate repos, I can choose to not watch or de-prioritize emails from certain parts of the system. As of now, it is difficult to identify the PRs and issues that I need to look at using notification emails.

@davidopp
Member

@vishh I feel there are solutions to that problem without going to separate repos.

Step 1: Reorg the directory structure a bit, to have cleaner separation between areas
Step 2: Either as part of OWNERS or as a separate mechanism, have a list that maps directory paths to github handles of people who want to be auto-subscribed to PRs that touch that directory
Step 3: Get the oncall to @ mention at least one SIG for every nontrivial issue that is filed. Use SIG labels on issues for easy searching.

This ensures, to first approximation, that you are subscribed to every issue and PR that you might be interested in. Then you can just ignore everything you are not subscribed to.
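Step 2 could look roughly like the sketch below. The `SUBSCRIBERS` table, the handles, and the directory names are invented for illustration; no existing OWNERS format or bot behavior is implied.

```python
# Hypothetical mapping from directory prefixes to GitHub handles that want
# to be auto-subscribed to PRs touching those directories.
SUBSCRIBERS = {
    "pkg/kubelet/": ["@node-reviewer"],
    "plugin/pkg/scheduler/": ["@scheduler-reviewer"],
    "docs/": ["@docs-reviewer"],
}


def handles_for_pr(changed_files, subscribers=SUBSCRIBERS):
    """Return the sorted set of handles whose directories a PR touches."""
    notified = set()
    for path in changed_files:
        for prefix, handles in subscribers.items():
            if path.startswith(prefix):
                notified.update(handles)
    return sorted(notified)


if __name__ == "__main__":
    # A PR touching kubelet code and docs subscribes both groups.
    print(handles_for_pr(["pkg/kubelet/kubelet.go", "docs/README.md"]))
```

A PR that touches none of the listed directories yields an empty list, so nobody outside the affected areas would be subscribed.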

@bgrant0607
Member Author

To be clear, there is no specific timeframe for this, but I view it as inevitable and ultimately healthy. I wanted a place to centralize the discussion.

More and more code is going into additional repos already, and I expect that trend to continue.

@bgrant0607 added the sig/contributor-experience label Apr 16, 2016
@smarterclayton
Contributor

+1 to this being healthy.


@thockin
Member

thockin commented Apr 18, 2016

I agree it's healthy long term, but we do need to think hard about the chunks. Having all of the binaries be built and versioned together is pretty important, I think. In fact, I'd love for us only to build one binary - hyperkube. Breaking out client utils and libs is pretty obvious. Gabe from Deis said it hurt a lot but was worth doing...

@bgrant0607
Member Author

Request to break out service discovery / DNS: https://twitter.com/jbeda/status/711585271221866496

@dchen1107
Member

+1 on this as a long term goal.

But I don't think we are ready to do this, especially for the top 4 areas listed above: Kubelet, Generic API infrastructure, Client libraries, and Misc. utilities. Taking Kubelet as an example: in the long term, I really want to run Kubelet as a standalone project, so that it can be packaged as a product to manage a single KNode without an API server. But to really achieve that, we need to solve the Generic API infrastructure issue first, the Kubelet checkpoint issue, etc. Not to mention that we first need to answer the questions related to version management, compatibility, testing, etc. Also, if one looks through all the features we have introduced to the core system so far, most of them still require changes to multiple components. I think splitting them out at this moment would have a negative impact on our velocity.

@spiffxp
Member

spiffxp commented Apr 21, 2016

/cc @kubernetes/sig-testing

@bgrant0607
Member Author

We can't put anything new into the main repo. At minimum, new things need to go in other repos. This is why minikube and the node-problem-detector were put into new repos, for instance.

Github is designed for small repos, small teams. ACLs (e.g., for label/PR management) are coarse-grain, on a per-repo basis. CI is on a per-repo basis. The notification flood from our giant repo is unmanageable. We have >3000 open issues and >1000 that we haven't really even looked at. PRs can't find reviewers and vice versa. We now have over 500 open PRs. PR merge latency is growing monotonically. The repo is infeasible to build-cop. We can barely keep our tests running. We hit a ceiling on commit rate over a year ago.

https://github.com/kubernetes/kubernetes/graphs/contributors

I haven't found any single repo on github with a sustained commit rate higher than 250 commits per week. The only way Docker can achieve 300 PRs / week is through multiple repos.

We also need to break up contrib:
kubernetes-retired/contrib#762

That repo has no automation and isn't build-copped, and most people on the project don't pay attention to it. Even things that should be maintained are hard to maintain because so many of the notifications are about irrelevant things in contrib, so it gets ignored.

Implementing OWNERS will enable us to give label/PR/wiki power to more people, and we need to do that, for automatic PR assignment/approval if nothing else, but it won't solve the notification, build-copping, and CI problems.

What's urgent to extract?

Ecosystem developers need stable client libraries, with minimal dependencies on the rest of our codebase, which can be imported independently.

Cloudproviders / cluster provisioning / getting-started guides: Many of these aren't maintained, and we can't really review or test most of them. We have at least 3 new ones waiting to be merged right now. We're just slowing them down.

Both of these require technical changes. I don't know if there is any lower-hanging fruit that also has significant value.

We're going to need to make our PR automation and test infrastructure easy to replicate for new repos. We have to do that, anyway, since we have several active repos already: contrib, kubernetes.github.io, dashboard, heapster, minikube, helm, ...

The documentation repo, which was created due to the way github hosts project sites, desperately needs automation.

kubernetes/website#310

cc @philips

@sebgoa
Contributor

sebgoa commented Jun 27, 2016

How about picking one component and breaking it into a separate repo to exercise the change needed and define a process? We won't be breaking everything out at once.

In some other projects, this would start with a [PROPOSAL] and then a [VOTE] on the mailing list.

Start with kubectl maybe, or something less controversial even like examples.

@bgrant0607
Member Author

@Runseb We decided we wanted to move out examples a long time ago. The challenge is finding someone to do the required work, and someone to review the necessary changes.

@bgrant0607
Member Author

@karlkfi Made some great points here: #16508 (comment)

@justinsb
Member

I agree that github is not good at big repos. But it's not clear that splitting into multiple repos is going to be any better. We had one separate repo (contrib) and it was worse on all the axes we are critiquing the main repo for: PRs and issues remained unattended, and they were less discoverable for being split across repos. We should prove with one extra repo that it is an improvement, before we go through all the churn involved in splitting into N repos.

We should also give due weight to the costs of having multiple repos. We are aware of the problems of having one repo, but multiple repos will bring their own problems. Will issues actually be more discoverable? Do we really expect issues to be opened in the correct repo, or in practice will we just track issues on the main repo? How will we coordinate releases, or PRs that touch more than one component? The obvious precedent (OpenStack) went through a similar fragmentation into a large number of repos & projects, and the outcome was not what we are hoping for.

Finally, I am not sure that the conclusion I would draw from "github is bad at big repos" is "not big repos". "Not github" feels like an equally valid position: many big projects seem to have gone that route, with a code mirror on github. I personally am much more excited about working on tooling that leap-frogs GitHub's functionality, than I am in teasing apart a repo into GitHub-approved chunk sizes.

I propose that we:

  1. pick one repo to start with
  2. make it work well, to prove that splitting into multiple repos could solve the problems we are trying to solve
  3. evaluate the overhead of dealing with multiple repos (this will only be N=2, but we have some other repos that are springing up anyway for things like management tools/installers, and we can also start encouraging additional controllers to be created in separate projects)

I propose that the client APIs would be a good candidate as a project to split off, because we definitely have a problem right now with the binary size of any program that uses the go client (I think being in one big repo means the client pulls in a lot more code than it strictly needs.) I think the protobuf work also makes this practical now, including having other language bindings to the k8s API. In some senses clients are a bad choice because the separation there should be more obvious than with some of our other components, but it also will give us a taste of the complexity because of the circular dependency (because our servers are also API clients)
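The circular-dependency concern can be illustrated with a small sketch that orders repos for dependency bumps and reports a loop when one exists. The repo names and the `bump_order` helper are hypothetical; the only real API used is `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def bump_order(depends_on):
    """Order repos so each comes after the repos it depends on.

    depends_on maps a repo name to the repos it vendors. Raises
    ValueError naming the loop if the graph has a cycle.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # The second element of args is the list of nodes in the cycle.
        cycle = err.args[1]
        raise ValueError("dependency loop: " + " -> ".join(cycle)) from err


if __name__ == "__main__":
    # Clean layering (made-up repo names): types, then client, then server.
    print(bump_order({"server": ["client"], "client": ["api-types"], "api-types": []}))

    # If the client still needs code from the server repo while the server
    # imports the client, no valid bump order exists.
    try:
        bump_order({"server": ["client"], "client": ["server"]})
    except ValueError as exc:
        print(exc)
```

In the layered case a change flows through one well-ordered series of bumps; in the looped case a shared piece would first have to be extracted so that both sides can depend on it.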

@chrislovecnm
Contributor

Is anyone working on a generic release process? If not I am raising my hand to help with that and get a release process for kops.

One hurdle is that we need a place to put bits and containers.

@sttts
Contributor

sttts commented Jul 17, 2017

Just something I noticed this morning about our release process: we have release-* branches for staging repos, but no tags. This makes vendoring tedious.

@bgrant0607
Member Author

Related: Development in other branches/forks:
kubernetes/community#566

@bgrant0607
Member Author

Some updates:

@ldemailly

ldemailly commented Oct 5, 2017

Is there a post-mortem or a list of lessons learned? If you had to do it again, would you do it, or what would you do differently? And at what number of contributors or velocity would you recommend a split?

@bgrant0607
Member Author

@SpamapS

SpamapS commented Oct 26, 2017

Update on Zuul, which I presented to the testing SIG a few months back as a potential solution for Jenkins scaling and multi-repo testing. OpenStack has fully migrated to Zuulv3 now. It can be considered "battle hardenING" at the moment. If a solution for cross-repo testing has not been created yet, perhaps someone should make an attempt to set up an experimental Zuulv3. The main challenge to success in that now would be that Zuul still only knows how to get computing resources from nodepool, and nodepool only supports OpenStack clouds. So, if there is someone in SIG testing with access to an OpenStack cloud who wants to work on this, I recommend reaching out to me directly, sending email to http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra , or joining #zuul on Freenode IRC to discuss further.

@chrislovecnm
Contributor

Just an FYI for everyone. The kops team and sig-release are working on an MVP of a build process for sub projects. Currently kops is not released with kubernetes/kubernetes, and we are working on fleshing out a release process.

@0xmichalis
Contributor

@kubernetes/sig-testing-feature-requests

@k8s-ci-robot added the sig/testing and kind/feature labels Nov 3, 2017
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Feb 7, 2018
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
Automatic merge from submit-queue

Redirect all files in /examples folder to kubernetes/examples repo

**What this PR does / why we need it**:

Examples are being moved to their own repository: https://github.com/kubernetes/examples

We need to remove them from the main repo, but first we need to keep a redirect.

This is a *big* organizational change, but nothing technical (aside from e2e tests)

**Which issue this PR fixes** 

fixes part of kubernetes#24343 

**Special notes for your reviewer**:

WIP, I still need to figure out what to do with the BUILD script and tests, plus take care of the e2e tests that use some of these examples.

**release notes**
```release-note
Redirect all examples README to the kubernetes/examples repo
```
@chrislovecnm
Contributor

/lifecycle frozen
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/stale label Feb 25, 2018
@oferb

oferb commented Sep 8, 2018

Hey, just to let you know - I'm working on a multi-repo code review tool.
It can do some cool stuff:

  • Well, multi-repo. Repos can be any combination of on-prem, GitHub, GitLab, Bitbucket, Google CSR etc.
  • No server to maintain. Code is served locally from your machine, metadata from Firebase.
  • Can view local changes in tool - no need to push.
  • Comes with multi-repo cli tool (it's not annoying)
  • Supports multiple auth providers - GitHub, Google (got that for free from Firebase)

Here's a 2-min video of the tool:
https://www.youtube.com/watch?v=BKdCrtcmYsM
Code is here (still WIP): https://github.com/google/startup-os/tree/master/tools/reviewer

The grand idea is that having multi-repo code review, multi-repo cli and multi-repo CI, we can do cross-repo changes, repos can depend on head of other repos, so also cross-repo tests at head. These are the main advantages of a monorepo, while also staying multi-repo.

Happy to answer any questions.

@bgrant0607
Member Author

/area code-organization

This specific issue is no longer useful, so closing.
/close

@k8s-ci-robot
Contributor

@bgrant0607: Closing this issue.


@k8s-ci-robot added the area/code-organization label May 7, 2019
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this issue Dec 23, 2019
…tapi-pkg

Picks Upstream 86256 - Remove use of testapi package 

Origin-commit: ae96064ab640c9e2206b28472f17163fefad022b