I just finished watching Why Google Stores Billions of Lines of Code in a Single Repository and honestly, while it looks intriguing, it also looks horrible.
Have you run into issues? Did you love it? How was it?
You usually run into issues if you are trying to use off-the-shelf tools and git providers. IMO GitHub and GitHub Actions suck hard for monorepos. The fact that all workflow files have to be stored in a single directory (.github/workflows), for example, is almost certainly an unmanageable rat’s nest waiting to happen at any sufficiently large business with a sufficiently complex product or set of products.
This is why companies like Google run their own version control setups with custom wrappers to let you do things like pull a segment of the terabyte-sized repo or run partial builds with tooling that basically runs some kind of graph against the changes. Bazel, for example, had to be invented to help solve that problem at Google, and Pants similarly at Twitter (which also has a monorepo).
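To make the "graph against the changes" idea concrete, here is a minimal Python sketch. It is not how Bazel or Pants work internally; the target names, the paths, and the hand-written DEPS table are all made up for illustration, and real tools derive the graph from build files instead.

```python
from collections import defaultdict, deque

# Hypothetical build graph: each target lists the targets it depends on.
DEPS = {
    "libs/auth": [],
    "libs/http": ["libs/auth"],
    "services/api": ["libs/http"],
    "services/billing": ["libs/auth"],
    "tools/cli": [],
}


def owning_target(path, targets):
    """Return the target whose directory contains the path (longest prefix wins)."""
    matches = [t for t in targets if path == t or path.startswith(t + "/")]
    return max(matches, key=len) if matches else None


def affected_targets(changed_files, deps):
    """Return every target that needs a rebuild for the given changed files."""
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            dependents[dep].add(target)

    # Seed the walk with the targets that directly own a changed file.
    affected = set()
    queue = deque()
    for path in changed_files:
        target = owning_target(path, deps)
        if target and target not in affected:
            affected.add(target)
            queue.append(target)

    # Breadth-first walk over the reverse edges.
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


if __name__ == "__main__":
    print(sorted(affected_targets(["libs/auth/token.py"], DEPS)))
```

A change under libs/auth pulls in everything that depends on it, while a change under tools/cli rebuilds only that one target. That is the whole trick behind partial builds; the hard part in practice is keeping the graph accurate.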
If you are willing to invest in using tools like Bazel and own building all these complex wrappers, then it can be fine. But if you want to use off-the-shelf GitLab or GitHub Actions and your IDE’s built-in git tooling, it’s not going to be for you. That’s the difference between what’s possible or a good idea at a medium shop vs. a company with 40k engineers.
In my experience at a company that just moved away from a monorepo, half the off-the-shelf vendors and FOSS tools out there balk at you if you expect monorepo support. We moved away specifically because, at our current company size, it is more tolerable to keep our different products separate and eat the occasional pain of mass pattern adjustments across the repos than to build out a team to manage the custom tooling required for a gigabyte-plus-sized monorepo.
Plus, even Google doesn’t have a true monorepo. Chrome and Android are not in the same repo as Search, for example. Find your seams and manage them appropriately.
Thanks for the insight. Are there any tools that you used at your company that you’d recommend? Did you encounter any open source CI/CD for monorepos that worked?
I discovered JOSH, which was intriguing to put in front of existing source forges, but I don’t know of source forges that support monorepos by design. GitHub and GitLab are multirepo for sure, and shoehorning a monorepo into that, like Nix did with nixpkgs, is cumbersome.
I think it mostly has to do with how coupled your code modules are. If you have a lot of tightly coupled modules/libraries/apps/etc, then it makes sense to put them in the same repo so that changes that ultimately have a large blast radius can be handled within a single repo instead of spanning many repos.
And that’s just a judgement call based on code organization and team organization.
I’m inclined to interpret monorepos as an anti-pattern intended to mask away fundamental problems in the way an organization structures its releases and dependency management.
It all boils down to being an artificial versioning constraint at the expense of autonomy and developer experience.
Huge multinationals don’t have a problem in organizing all their projects as independent (and sometimes multiple) source code repositories per project. What’s wrong with these small one-bus software shops that fail to do that when they operate at a scale that’s orders of magnitude smaller?
They work great when you have many teams working alongside each other within the same product.
It helps immensely with having consistent quality, structure, shared code, review practices, CI/CD…etc
The downside is that you essentially need an entire platform engineering team just to set up and maintain the monorepo, tooling, custom scripts, custom workflows…etc that support all the additional needs a monorepo and its users have. Something that would never be a problem in a single repository, like the list of pull requests, may need custom processes and workflows in a monorepo due to the volume of changes.
(Ofc small monorepos don’t require you to have a full team doing maintenance and platform engineering. But often you’ll still find yourself dedicating an entire FTE’s worth of time towards it.)
It’s similar to microservices in that monorepo is a solution to scaling an organizational problem, not a solution to scaling a technology problem. It will create new problems that you have to solve that you would not have had to solve before. And that solution requires additional work to be effective and ergonomic. If those ergonomic and consistency issues aren’t being solved then it will just devolve over time into a mess.
I’ve been a big fan of monorepos because it leads to more consistent style and coding across the whole company. It makes the code more transparent so you can see what’s going on with the rest of the company, too, which helps reduce code islands and duplicated work. It enables me to build everything from source, which helps catch bugs that would only show up in prod due to version drift. It also means that I can do massive refactorings across the company without breaking anything.
That said, tooling is slowly improving for decentralized repos, so some of these may be doable on git now/soon.
(…) you can see what’s going on with the rest of the company, too.
That’s a huge security problem.
Edit for those who are down voting this post, please explain why you believe that granting anyone in the organization full access to all the projects used across all organizations does not represent a security problem.
Because security through obscurity is not security at all.
Most companies will never have a monorepo at the level of these bigger companies. So I personally don’t think most people need to worry about the limitations of github/lab as platforms.
However if you happen to be having those kinds of issues, I think looking at what the big companies are doing and/or starting to split things up makes sense.
There’s also alternatives with custom CI jobs outside GitHub/GitLab but still within the git universe that may help out with those sorts of operations. I know Actions still feel very beta in some toolsets, so it may be easier/more useful to run your own architecture. I’ve been enjoying Forgejo/Gitea for example, but it’s not like you can’t do the same with GitLab runners or GitHub Enterprise. Depends on use case.
There’s also alternatives with custom ci jobs within non GitHub/lab within the git universe that may help out with those sorts of operations.
Why would anyone subject themselves to explore nonstandard and improvised solutions to try to fit a usecase that fails to meet your needs to a tool that was not designed to support it?
Do people enjoy creating their own problems just to complain about them?
We use a monorepo for a new cloud-based solution. So far it’s been really great.
The shared projects are all in one place so we don’t have to kick things out to a package manager just to pull them back in.
We use path filters in Azure Pipelines so things only get built if they or the projects they depend on get changed.
It makes big changes that span multiple projects effortless to implement.
Also, running a local deployment is as easy as hitting run in the IDE.
So far no problems at all.
We use filters in azure pipelines so things only get built if they or dependent projects get changed.
Any guides on how to do this? I know about filtering triggers by where changes happen, but how do dependent projects get triggered? Is that a manually maintained list or is that something automatic? I mostly use GitLab, but am curious how Azure Pipelines would do it.
You have a list of filters like “src/libs/whatever/*”. If there is a change under any of them, the pipeline runs.
I wrote a tool that automatically updates these based on recursive project references (C#).
So if any project referenced by the service (or recursively referenced by its dependencies) changes, the service is rebuilt.
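For anyone who wants to see the trigger decision spelled out, here is a rough Python sketch. It only approximates path filters with shell-style wildcards from the standard library's fnmatch module; it is not Azure Pipelines' actual matching logic, and the function name and paths are made up.

```python
from fnmatch import fnmatchcase


def should_run(changed_files, include_filters):
    """Return True if any changed file matches any of the include filters."""
    # fnmatchcase's "*" also matches "/" characters, so "src/libs/whatever/*"
    # covers files in nested folders below that directory as well.
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in include_filters
    )


if __name__ == "__main__":
    filters = ["src/services/api/*", "src/libs/whatever/*"]
    print(should_run(["src/libs/whatever/Models/User.cs"], filters))
    print(should_run(["docs/readme.md"], filters))
```

The filter list is the part that has to stay in sync with the real dependencies, which is why generating it automatically matters.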
I see. OK. I thought that was built into Azure pipelines.
Pretty cool tool you built 👍 Is it language agnostic?
No, it relies on the C# project files. It looks for all ProjectReference tags in the project file, recursively grabs all of them, and turns them into filters.
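The tool itself isn't shown in this thread, so the following is only a guess at the general shape of that approach, sketched in Python. The function names and the "directory plus /*" filter format are assumptions; the only thing taken from the description above is the idea of following ProjectReference tags recursively.

```python
import xml.etree.ElementTree as ET
from pathlib import Path, PureWindowsPath


def project_references(csproj: Path) -> list[Path]:
    """Return the resolved paths of the projects that csproj references directly."""
    refs = []
    for element in ET.parse(csproj).getroot().iter():
        # Compare local names so old-style, namespaced project files also match.
        if element.tag.split("}")[-1] == "ProjectReference" and element.get("Include"):
            # Include paths usually use backslashes; convert them to forward slashes.
            relative = PureWindowsPath(element.get("Include")).as_posix()
            refs.append((csproj.parent / relative).resolve())
    return refs


def transitive_projects(csproj: Path) -> set[Path]:
    """Return csproj plus everything it references, directly or indirectly."""
    seen = set()
    stack = [csproj.resolve()]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, which also guards against cycles
        seen.add(current)
        stack.extend(project_references(current))
    return seen


def path_filters(csproj: Path, repo_root: Path) -> list[str]:
    """Turn the transitive project set into directory filters relative to repo_root."""
    root = repo_root.resolve()
    return sorted(
        project.parent.relative_to(root).as_posix() + "/*"
        for project in transitive_projects(csproj)
    )
```

Calling path_filters on a service's project file gives one filter per project directory it depends on, including its own. The sketch assumes every referenced project file exists and lives inside repo_root; a real tool would need to handle missing files and references outside the repo.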
Like all other patterns, it can be done well or done poorly. I’ve experienced both with monorepos. The pain is greater when it is painful. But if the contribution, build, and release procedures are well designed and clearly documented it can also be nice.
All the tech companies I’ve worked at have used monorepos, so I don’t know any other way.
How do they handle updates to common code, especially breaking changes to the public API?
From a personal experience:
I see some benefits, but it will make your developer life a nightmare… it is like trying to focus on a single problem in a zoo on fire with all the gates open…
Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo