November 27, 2024
How to set up a monorepo with Turborepo and Yarn?
12-minute read
In the JavaScript ecosystem, it's common to work with multiple packages for a single project, or even multiple projects revolving around several packages. This situation often arises in open-source projects and in companies where teams work on different projects. It can quickly become complex to manage, especially when several projects share dependencies. In this article, we'll look at how to simplify your package management with a Turbo monorepo and Yarn workspaces.
History of Monorepos
Monoliths
Historically, JavaScript projects were managed as monoliths, with a single repository for the entire project. This approach has shown its limits, particularly in terms of maintainability, performance, and collaboration.
A few years ago, Google was still boasting about using a monolithic repository; see "Why Google Stores Billions of Lines of Code in a Single Repository".
Multi-repos
Then, around 2010, with the arrival of npm, came multi-repos: an approach that consists of separating each package into a dedicated repository. This solved some scaling issues but created others, particularly in terms of dependency management and version consistency: each package had to wait for its dependencies to be published before it could use them. Driven by the success of microservice architectures, this approach is still widely used today.
Monorepos
Around 2017, package managers began natively integrating workspace features, allowing multiple packages to be managed within a single repository. This is the case with Yarn and, later, npm, among others, and it has helped to democratize monorepos. This approach is very popular with maintainers of open-source projects, as it lets you manage multiple packages within a single repository while maintaining version and dependency consistency. The issue that can arise is the slowness of installations and builds, especially at scale. We quickly notice similarities with the two previous approaches: the centralized aspect of monoliths and the package autonomy of multi-repos.
From 2018 onwards, we began seeing tools like Nx and Turbo, which make managing monorepos more efficient. These tools cache task results and orchestrate scripts (build, test, etc.), which improves performance and makes monorepos easier to manage. We will revisit Turbo in more detail later in this article.
Why a Monorepo?
As you've probably gathered, each approach has its advantages and drawbacks. Whether or not you choose a monorepo will depend on your context, needs, and constraints. Here are a few advantages of using a monorepo:
- Facilitating collaboration: by gathering all the packages in the same repository, it becomes easier to share code between different packages and work on multiple packages simultaneously.
- Facilitating dependency management: managing dependencies centrally makes it easier to maintain version consistency and manage common dependencies.
- Facilitating maintenance: when all packages are grouped in the same repository, it becomes easier to maintain the packages, update dependencies, and manage versions.
- Facilitating deployment: grouping all packages in the same repository makes it easier to deploy the packages, and this can be done independently.
Setting Up a Monorepo with Turbo and Yarn
Now that we've explored the advantages of a monorepo, let's look at how to set one up with Turbo and Yarn. Throughout the article, we'll dive deeper into the key concepts of each tool and see how they can be used together. The aim is to provide a comprehensive understanding of a monorepo's architecture and how to adapt it to suit your context.
As an example, we'll set up a monorepo containing a React application, a Next.js application, a package of React components, and a CLI (a simple console.log). The structure of our project will be as follows:
monorepo
├── apps
│   ├── app-react          # @premieroctet/app-react
│   │   └── package.json
│   └── app-next           # @premieroctet/app-next
│       └── package.json
├── packages
│   ├── components         # @premieroctet/components
│   │   └── package.json
│   └── cli                # @premieroctet/cli
│       └── package.json
├── package.json
└── yarn.lock
The Yarn package manager
For the rest of this article, we will use Yarn 2+ to manage the dependencies of our monorepo. I recommend it for your own monorepos as well, since it brings many improvements over Yarn 1.x, particularly with regard to workspaces.
Initializing the monorepo
We will therefore start by initializing our monorepo with Yarn. To do this, we will create a new folder and initialize a new Node project.
mkdir monorepo
cd monorepo
yarn set version berry # Use the latest version of Yarn
yarn init -w
The -w option initializes a workspace root in the package.json file and sets the packages folder as the default location for packages.
In the package.json file that has been created, we will update the workspaces field to declare the apps and packages folders as workspaces.
{
  "name": "monorepo",
  "version": "1.0.0",
  "private": true,
  "workspaces": ["apps/*", "packages/*"],
  "packageManager": "yarn@4.5.1"
}
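To make the workspaces field more concrete, here is a minimal sketch of how a pattern such as apps/* can be read: any folder sitting directly under apps is treated as a workspace. The isWorkspace helper is hypothetical and only handles a trailing /*; Yarn's own matching supports richer globs.

```javascript
// Hypothetical helper (not part of Yarn): checks whether a folder path is
// covered by one of the workspace patterns. Only a trailing "/*" is handled.
function isWorkspace(patterns, dir) {
  return patterns.some((pattern) => {
    if (!pattern.endsWith('/*')) return pattern === dir
    const parent = pattern.slice(0, -2) // "apps/*" -> "apps"
    const rest = dir.startsWith(parent + '/') ? dir.slice(parent.length + 1) : null
    // Exactly one path segment must follow the parent folder.
    return rest !== null && rest.length > 0 && !rest.includes('/')
  })
}

const workspacePatterns = ['apps/*', 'packages/*']
console.log(isWorkspace(workspacePatterns, 'apps/app-react')) // true
console.log(isWorkspace(workspacePatterns, 'packages/cli')) // true
console.log(isWorkspace(workspacePatterns, 'docs')) // false
console.log(isWorkspace(workspacePatterns, 'apps/app-react/src')) // false
```

This is why a package.json placed in apps/app-react is picked up as a workspace, while a nested folder such as apps/app-react/src is not.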
Workspaces
We then create the apps and packages directories (the latter having already been created by Yarn) to store our applications and packages.
For the React and Next.js applications, we will use the create-react-app and create-next-app commands respectively to initialize them.
cd monorepo/apps
yarn create react-app app-react
yarn create next-app app-next
For the components and cli packages, we'll create a folder for each package and initialize a new Node project.
cd monorepo/packages
mkdir components
mkdir cli
cd components
yarn init -y
cd ../cli
yarn init -y
Now all our applications and packages are initialized. Before proceeding, we'll prefix the name of each project with a scope (here @premieroctet) in its package.json file, to distinguish package names from their folder names and identify them more easily.
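Now that we've renamed our projects, we can run our first yarn install at the root of the project to install the dependencies of all the packages. We can see that Yarn creates symbolic links for the projects that are part of the workspaces, and that node_modules is at the root of the monorepo. Note that this assumes the node-modules linker (nodeLinker: node-modules in .yarnrc.yml); by default, Yarn 2+ uses Plug'n'Play and does not create a node_modules folder.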
node_modules at the root
To reduce the number of installed dependencies, Yarn 2+ uses a global cache to store shared dependencies between packages. This reduces dependency installation time and disk space usage.
Managing Dependency Conflicts
Since there's only one node_modules at the root, how do you manage dependency conflicts between packages?
Taking our example, let's say the @premieroctet/app-react application uses react@18 and the @premieroctet/app-next application uses react@19. Yarn will install react@19 at the root of the monorepo and create a node_modules folder in the @premieroctet/app-react application containing react@18. This is how dependency conflicts between packages are handled. In this case, it's the topological order that determines the priority of the dependencies. However, if a third package uses react@18, then Yarn will install react@18 at the root of the monorepo and create a node_modules folder in the @premieroctet/app-next package containing react@19, since that version is "in the minority".
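The "minority" rule described above can be sketched as follows. This is a deliberately simplified model of the behavior described in this article, not Yarn's actual hoisting algorithm: the planHoisting function is hypothetical, it treats versions as plain strings, and it breaks ties arbitrarily (first seen) rather than by topological order.

```javascript
// Simplified model, NOT Yarn's real algorithm: the version requested by the
// most workspaces goes to the root node_modules, and every workspace asking
// for another version gets its own nested copy.
function planHoisting(requests) {
  // requests: { workspaceName: requestedVersion }
  const counts = new Map()
  for (const version of Object.values(requests)) {
    counts.set(version, (counts.get(version) || 0) + 1)
  }
  // Pick the most requested version (ties: first seen, an arbitrary choice).
  let root = null
  for (const [version, count] of counts) {
    if (root === null || count > counts.get(root)) root = version
  }
  const nested = {}
  for (const [workspace, version] of Object.entries(requests)) {
    if (version !== root) nested[workspace] = version
  }
  return { root, nested }
}

const hoistingPlan = planHoisting({
  '@premieroctet/app-react': '18',
  '@premieroctet/app-next': '19',
  '@premieroctet/components': '18',
})
console.log(hoistingPlan.root) // 18
console.log(hoistingPlan.nested) // app-next keeps its own copy of react 19
```

With three workspaces, react 18 is requested twice and ends up at the root, while @premieroctet/app-next is the only one with a nested copy.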
Using Components in Applications
Now that we've initialized our applications and packages, let's see how to use them in our applications.
Let's start by adding components to our @premieroctet/components package. To do this, we'll create a Button component in the packages/components/src/Button.js file.
import React from 'react'

const Button = ({ children }) => {
  return <button>{children}</button>
}

export default Button
Then, let's define an entry point in the packages/components/src/index.js file.
export { default as Button } from './Button'
Initialize TypeScript in the @premieroctet/components package to use types in the applications.
cd monorepo/packages/components
yarn add -D typescript
npx tsc --init
Add compiler options to the packages/components/tsconfig.json file.
{
  "compilerOptions": {
    "declaration": true,
    "outDir": "dist",
    ...
  }
}
And finally, define the entry point in the packages/components/package.json file and add a script to generate the dist folder.
{
  "name": "@premieroctet/components",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": {
    "build": "tsc"
  }
  ...
}
cd monorepo/packages/components
yarn build
Now we can add the dependencies of our packages into our applications.
cd monorepo/apps/app-react
yarn add @premieroctet/components
In the apps/app-react/package.json file, you can see that Yarn added a dependency on our @premieroctet/components package.
"@premieroctet/components": "workspace:^"
The workspace:^ range is replaced with the actual package version (here ^1.0.0) when running yarn npm publish or yarn pack. Other range forms are available depending on your needs; see the documentation.
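As a rough sketch of that replacement, the function below rewrites a workspace: range from the version found in the target package's package.json. The resolveWorkspaceRange name is hypothetical, and the last branch (an explicit range after the prefix) is an assumption about how the prefix is dropped; Yarn performs this step itself when packing or publishing.

```javascript
// Hypothetical sketch of the rewrite applied to "workspace:" ranges
// at pack/publish time.
function resolveWorkspaceRange(range, version) {
  if (!range.startsWith('workspace:')) return range // regular range, untouched
  const spec = range.slice('workspace:'.length)
  if (spec === '*') return version // exact version
  if (spec === '^' || spec === '~') return spec + version
  return spec // assumed: an explicit range simply loses its prefix
}

console.log(resolveWorkspaceRange('workspace:^', '1.0.0')) // ^1.0.0
console.log(resolveWorkspaceRange('workspace:~', '1.0.0')) // ~1.0.0
console.log(resolveWorkspaceRange('workspace:*', '1.0.0')) // 1.0.0
```

Inside the monorepo the dependency always points at the local workspace; the rewritten range only matters for consumers installing the published package.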
Now we can import the components of our @premieroctet/components package into our @premieroctet/app-react application.
For now, @premieroctet/components does not have any peerDependencies. If it did, the @premieroctet/app-react application would have to add them to its package.json, or already have dependencies that satisfy the peerDependencies of @premieroctet/components.
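To illustrate the peerDependencies constraint, here is a minimal sketch that lists the peers an application has not declared. The missingPeers helper and the sample manifests are hypothetical, and only package names are compared, not version ranges.

```javascript
// Hypothetical check: which peerDependencies of a package are not declared
// by the application consuming it? Only names are compared, not versions.
function missingPeers(pkg, app) {
  const declared = { ...app.dependencies, ...app.devDependencies }
  return Object.keys(pkg.peerDependencies || {}).filter((name) => !(name in declared))
}

// Hypothetical manifests: the article's components package has no peers yet.
const componentsManifest = {
  name: '@premieroctet/components',
  peerDependencies: { react: '^18.0.0', 'react-dom': '^18.0.0' },
}
const appManifest = {
  name: '@premieroctet/app-react',
  dependencies: { react: '^18.0.0' },
}
console.log(missingPeers(componentsManifest, appManifest)) // [ 'react-dom' ]
```

In practice the package manager reports unmet peer dependencies at install time; the sketch only shows what is being compared.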
We can also use the scripts of our @premieroctet/cli package in our applications. To do this, we will add a cli command in the packages/cli/src/index.js file.
#!/usr/bin/env node
console.log('Hello from @premieroctet/cli')
Then define the entry point in the packages/cli/package.json file and add a script to build the cli command.
{
  "name": "@premieroctet/cli",
  "version": "1.0.0",
  "bin": {
    "cli": "dist/index.js"
  },
  "scripts": {
    "build": "tsc"
  }
  ...
}
cd monorepo/packages/cli
yarn build
Now we can use the cli command of our @premieroctet/cli package in our applications.
cd monorepo/apps/app-react
yarn @premieroctet/cli cli
We can even add scripts to the package.json of our applications to run the scripts of our packages.
cd monorepo/apps/app-react
yarn add -D @premieroctet/cli
{
  "scripts": {
    "cli": "cli"
  }
  ...
}
The yarn workspaces commands
With the yarn workspaces foreach and yarn workspace commands, you can run a command in all workspaces or in a specific workspace.
yarn workspaces foreach --all run build
yarn workspace @premieroctet/components run build
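The -pt options (--parallel and --topological) of yarn workspaces foreach run the command in parallel across workspaces, while making sure each workspace only starts once its dependencies have finished running that same command.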
You can find all available commands in the Yarn 2+ documentation.
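To give an idea of what a topological order means here, the sketch below sorts workspaces so that each one comes after the workspaces it depends on. The topologicalOrder function and the dependency graph are hypothetical (the graph mirrors the dependencies added earlier in this article); it produces a sequential order and does not model parallel execution.

```javascript
// Hypothetical sketch: depth-first topological sort of workspaces.
function topologicalOrder(graph) {
  // graph: { workspaceName: [names of the workspaces it depends on] }
  const order = []
  const state = {} // undefined = unvisited, 'visiting', 'done'
  function visit(name, path) {
    if (state[name] === 'done') return
    if (state[name] === 'visiting') {
      throw new Error('Cyclic dependency: ' + [...path, name].join(' -> '))
    }
    state[name] = 'visiting'
    for (const dep of graph[name] || []) visit(dep, [...path, name])
    state[name] = 'done'
    order.push(name) // pushed only after all its dependencies
  }
  for (const name of Object.keys(graph)) visit(name, [])
  return order
}

const workspaceGraph = {
  '@premieroctet/app-react': ['@premieroctet/components', '@premieroctet/cli'],
  '@premieroctet/app-next': [],
  '@premieroctet/components': [],
  '@premieroctet/cli': [],
}
console.log(topologicalOrder(workspaceGraph))
```

Here @premieroctet/components and @premieroctet/cli are listed before @premieroctet/app-react, and a cycle between workspaces makes the function throw instead of returning an order.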
The limits of yarn workspaces
At this stage of our monorepo, we have only used Yarn workspaces to manage our packages. However, there are limitations to this approach, the most significant being the performance of commands such as yarn build or yarn test, which are executed for each package. This can quickly become a problem if you have many packages or if the commands take a long time to run.
A second limitation is the command orchestration between the packages. For example, if you have a command that depends on another command in another package, you'll have to manually manage the order of execution of the commands.
Therefore, we will see how Turbo can help us solve these problems.
Turborepo
Installation
Let's start by installing Turbo at the workspace root of our monorepo, then create a turbo.json configuration file to define the tasks.
cd monorepo
yarn add -D turbo
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {}
}
For more information on Turbo installation, you can refer to the documentation.
Task Orchestration
In our example, we can consider the following tasks:
- build: to build all the packages - all projects
- dev: to run the applications in development mode - app-react and app-next
- deploy: to deploy the applications - app-react and app-next
- cli: to run the script of the @premieroctet/cli package - app-react
- publish: to publish the packages - components and cli
The idea is to be able to run commands without having to move into each project and manage the order of execution of the commands.
In the turbo.json configuration file, we will define a task for each command.
{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", "build/**", ".next/**", "!.next/cache/**"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    },
    "deploy": {
      "dependsOn": ["^build"]
    },
    "@premieroctet/app-react#cli": {
      "dependsOn": ["@premieroctet/cli#build"]
    },
    "publish": {
      "dependsOn": ["^build"]
    }
  }
}
- dependsOn: defines dependencies between tasks. Here, ^build means that a package's build task runs after the build tasks of its dependencies.
- outputs: defines the files generated by the task. These files are stored in the cache and restored when the task can be served from the cache.
- cache: when set to false, disables caching for the task.
- persistent: marks the task as long-running (it keeps running instead of exiting).
In the case of cyclic dependencies, it will not be possible to use dependsOn to define the order of execution of tasks.
For the deploy and publish tasks, which are not defined in all of the packages, Turbo will only run the tasks that are defined.
For the cli task, we specify the package in which the task should run (@premieroctet/app-react#cli) and declare its dependency on the build task of the @premieroctet/cli package.
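The three forms a dependsOn entry can take (^task, package#task, and a bare task name) can be sketched as a small expansion function. The expandDependsOn helper and the packageDeps map are hypothetical; they only illustrate how the entries are read, not how Turbo builds its task graph internally.

```javascript
// Hypothetical sketch of how a dependsOn entry is read for one package task:
// "^build"    -> the build task of every direct dependency of the package
// "pkg#build" -> the build task of that specific package
// "build"     -> the build task of the same package
function expandDependsOn(pkg, dependsOn, packageDeps) {
  const result = []
  for (const entry of dependsOn) {
    if (entry.startsWith('^')) {
      const task = entry.slice(1)
      for (const dep of packageDeps[pkg] || []) result.push(dep + '#' + task)
    } else if (entry.includes('#')) {
      result.push(entry)
    } else {
      result.push(pkg + '#' + entry)
    }
  }
  return result
}

// Assumed dependency map, mirroring the packages added to app-react earlier.
const packageDeps = {
  '@premieroctet/app-react': ['@premieroctet/components', '@premieroctet/cli'],
}
console.log(expandDependsOn('@premieroctet/app-react', ['^build'], packageDeps))
console.log(expandDependsOn('@premieroctet/app-react', ['@premieroctet/cli#build'], packageDeps))
```

For @premieroctet/app-react, ^build expands to the build tasks of @premieroctet/components and @premieroctet/cli, whereas the cli task only waits for @premieroctet/cli#build.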
To run tasks, use the turbo command followed by the name of the task.
yarn turbo run build
You can specify the packages on which to run tasks using the --filter flag or by prefixing the task name with the package name (e.g., @premieroctet/app-react#dev). This is very handy for reducing the number of tasks executed in continuous integration or for testing a specific package task. See the documentation.
For convenience, you can add scripts to the root package.json to run Turbo tasks.
{
  "scripts": {
    "build": "turbo run build",
    "dev": "turbo run dev",
    "deploy": "turbo run deploy",
    "cli": "turbo run @premieroctet/app-react#cli",
    "publish": "turbo run publish"
  }
  ...
}
Turbo Cache
To optimize execution times, Turbo relies on a caching system that avoids re-running tasks that have already been executed when their inputs have not changed; the files declared in outputs are then restored from the cache. This improves performance during builds and tests. The cache is stored in the .turbo folder at the workspace root.
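The principle can be sketched with a tiny in-memory cache keyed by a hash of the task's inputs. Everything here is hypothetical (the hashText and createTaskRunner names, the toy FNV-1a hash, the shape of the inputs); Turbo's real hashing takes more into account and stores its cache on disk.

```javascript
// Toy content hash (FNV-1a, 32-bit); a stand-in for a real cryptographic hash.
function hashText(text) {
  let h = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h.toString(16)
}

// Hypothetical in-memory task cache keyed by a hash of the task's inputs.
function createTaskRunner() {
  const cache = new Map()
  return function run(taskName, inputs, execute) {
    const key = hashText(taskName + '\n' + JSON.stringify(inputs))
    if (cache.has(key)) return { cached: true, outputs: cache.get(key) }
    const outputs = execute(inputs) // cache miss: actually run the task
    cache.set(key, outputs)
    return { cached: false, outputs }
  }
}

const runTask = createTaskRunner()
const fakeBuild = (inputs) => ({ 'dist/index.js': inputs['src/index.js'].toUpperCase() })
console.log(runTask('build', { 'src/index.js': 'a' }, fakeBuild).cached) // false
console.log(runTask('build', { 'src/index.js': 'a' }, fakeBuild).cached) // true
console.log(runTask('build', { 'src/index.js': 'b' }, fakeBuild).cached) // false
```

The second call is served from the cache because the inputs are identical, while changing the source file produces a different key and forces the task to run again.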
For large monorepos, you can store the cache on a dedicated server and share it between developers. See Remote Caching.
Tips for a successful monorepo
As you've seen, managing a monorepo can quickly become complicated if it is not well mastered. Here are some points to keep in mind to avoid errors and make monorepo management easier:
- Avoid cyclic dependencies: they can quickly become a headache to manage, especially if you have many packages, and can even block your Turbo tasks.
- ⚠️ Explicitly define dependencies: it is important to declare each package's dependencies in its own package.json to avoid dependency conflicts, especially if you use the monorepo to publish packages. Since Yarn workspaces centralize dependencies, a dependency explicitly declared in package-a may be present in node_modules and therefore usable in another package (package-b) without being declared in the package.json of package-b. You won't get compilation errors in your workspace; however, anyone who uses package-b in another project will get compilation errors because the dependency will be missing.
- Align dependency versions: to reduce the size of your node_modules and avoid dependency conflicts, it's preferable to align dependency versions between your packages when possible.
- Use peerDependencies: if your packages share common dependencies, it is better to declare them as peerDependencies to avoid dependency conflicts. In our example, a good practice would be to declare react and react-dom in the peerDependencies of @premieroctet/components. That way, you can have a single version of react and react-dom for all your applications, and a warning during installation if an application lacks the required dependencies. Keep in mind, however, that peerDependencies are not installed automatically: if your workspace already contains a version of a peer dependency, that is the one that will be used.
- BONUS - Clone a part of the monorepo: if you have a large versioned monorepo and only want to work on part of it, you can use the git clone --depth=1 <url> command to clone only the last commit of your monorepo. If you want to delve into this technique, I invite you to read the article "Get up to speed with partial clone and shallow clone".
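The "explicitly define dependencies" tip can be checked mechanically. The sketch below lists imported package names that a package does not declare itself; the findUndeclaredImports helper, the sample manifest, and the lodash import are all hypothetical, and a real check would extract the imports from the source files.

```javascript
// Hypothetical check: list imported package names that are not declared
// in the importing package's own package.json.
function findUndeclaredImports(importedNames, pkgJson) {
  const declared = new Set([
    ...Object.keys(pkgJson.dependencies || {}),
    ...Object.keys(pkgJson.peerDependencies || {}),
    ...Object.keys(pkgJson.devDependencies || {}),
  ])
  return importedNames.filter((name) => !declared.has(name))
}

// package-b imports lodash, which only package-a declares (assumed example).
const packageB = {
  name: 'package-b',
  dependencies: { react: '^18.0.0' },
}
console.log(findUndeclaredImports(['react', 'lodash'], packageB)) // [ 'lodash' ]
```

Inside the workspace the lodash import would resolve thanks to the shared node_modules, which is exactly why the missing declaration goes unnoticed until package-b is installed elsewhere.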
Conclusion
As we've seen, combining Turbo with Yarn workspaces is a very powerful solution for managing your projects. It ensures version and dependency consistency and makes your packages easier to maintain and deploy. However, it is important to master the tools you use and to follow best practices to avoid errors and keep your monorepo manageable.
In conclusion, if you have projects that share dependencies, or if you have teams working on different projects, I would recommend using a monorepo with Turbo and Yarn workspaces. This will improve performance, maintainability, and collaboration.
👋