November 27, 2024

How to set up a monorepo with Turborepo and Yarn?

12-minute read

🇫🇷 This post is also available in French

In the JavaScript ecosystem, it's common to work with multiple packages for a single project, or even multiple projects revolving around several packages. This situation often arises in open-source projects or in companies with teams working on different projects. It can quickly become complex to manage, especially if several projects share dependencies. In this article, we'll look at how to simplify your package management with a Turborepo monorepo and Yarn workspaces.

History of Monorepos

Monoliths

Historically, JavaScript projects were managed as monoliths, with a single repository for the entire project. This approach has shown its limits, particularly in terms of maintainability, performance, and collaboration.

A few years ago, Google was still touting its use of a single monolithic repository. See: Why Google Stores Billions of Lines of Code in a Single Repository

Multi-repos

Then, around 2010, with the arrival of npm, came multi-repos: an approach that consists of separating each package into a dedicated repository. This solved some scaling issues but created others, particularly in terms of dependency management and version consistency. Each package had to wait for its dependencies to be published before it could use them. Driven by the success of microservice architectures, this approach is still widely used today.

Monorepos

From around 2017, package managers began natively integrating workspace features, which let you manage multiple packages within a single repository. This is the case with Yarn and, later, npm, among others, and it has helped popularize monorepos. This approach is very popular with maintainers of open-source projects, as it lets you manage multiple packages within a single repository while maintaining version and dependency consistency. The issue that can arise is the slowness of installations and builds, especially at a large scale. The similarities with the two previous approaches are easy to spot: the centralized aspect of monoliths and the package autonomy of multi-repos.

From 2018 onwards, tools like Nx and Turbo appeared to manage monorepos more efficiently. These tools cache the results of tasks and orchestrate scripts (build, test, etc.). This results in improved performance and easier monorepo management. We will revisit Turbo in more detail later in this article.

Why a Monorepo?

As you've probably gathered, each approach has its advantages and drawbacks. Whether or not you choose a monorepo will depend on your context, needs, and constraints. Here are a few advantages of using a monorepo:

Facilitating collaboration: by gathering all the packages in the same repository, it becomes easier to share code between different packages and work on multiple packages simultaneously.
Facilitating dependency management: managing dependencies centrally makes it easier to maintain version consistency and manage common dependencies.
Facilitating maintenance: when all packages are grouped in the same repository, it becomes easier to maintain the packages, update dependencies, and manage versions.
Facilitating deployment: grouping all packages in the same repository makes it easier to deploy the packages, and this can be done independently.

Setting Up a Monorepo with Turbo and Yarn

Now that we've explored the advantages of a monorepo, let's look at how to set one up with Turbo and Yarn. Throughout the article, we'll dive deeper into the key concepts of each tool and see how they can be used together. The aim is to provide a comprehensive understanding of a monorepo's architecture and how to adapt it to suit your context.

As an example, we'll set up a monorepo containing a React application, a Next.js application, a package of React components, and a CLI (a simple console.log). The structure of our project will be as follows:

monorepo
├── apps
│   ├── app-react # @premieroctet/app-react
│   │   └── package.json
│   └── app-next # @premieroctet/app-next
│       └── package.json
├── packages
│   ├── components # @premieroctet/components
│   │   └── package.json
│   └── cli # @premieroctet/cli
│       └── package.json
├── package.json
└── yarn.lock

The Yarn package manager

For the rest of this article, we will use Yarn 2+ to manage the dependencies of our monorepo. For your monorepos, I recommend using Yarn 2+ as it has brought many enhancements over Yarn 1.x, particularly with regard to workspaces.

Initializing the monorepo

We will therefore start by initializing our monorepo with Yarn. To do this, we will create a new folder and initialize a new Node project.

mkdir monorepo
cd monorepo
yarn set version berry # Use the latest version of Yarn
yarn init -w

The -w option is used to initialize a workspace in the package.json file and set the packages folder as the default location for packages.

In the package.json file that has been created, we will add the workspaces to define the apps and packages folders as workspaces.

package.json

{
  "name": "monorepo",
  "version": "1.0.0",
  "private": true,
  "workspaces": ["apps/*", "packages/*"],
  "packageManager": "yarn@4.5.1"
}

Workspaces

We then create the apps and packages directories (the latter being already created by Yarn) to store our applications and packages. For the React and Next.js applications, we will use the create-react-app and create-next-app commands respectively to initialize them.

mkdir -p monorepo/apps
cd monorepo/apps
yarn create react-app app-react
yarn create next-app app-next

For the components and cli packages, we'll create a folder for each package and initialize a new Node project.

cd monorepo/packages
mkdir components
mkdir cli
cd components
yarn init -y
cd ../cli
yarn init -y

Now all our applications and packages are initialized. Before proceeding, we'll prefix the name of each project with a scope (here @premieroctet) in its package.json file, to distinguish package names from folder names and identify them more easily.

Now that we've renamed our projects, we're going to run our first yarn install at the root of the project to install the dependencies of all the packages. We can see that Yarn creates symbolic links for the projects that are part of the workspaces, and that node_modules sits at the root of the monorepo (this assumes the node-modules linker; by default, Yarn 2+ uses Plug'n'Play and does not create a node_modules folder).

Figure: Yarn workspaces, with node_modules at the root

To reduce the number of installed dependencies, Yarn 2+ uses a global cache to store shared dependencies between packages. This reduces dependency installation time and disk space usage.

Managing Dependency Conflicts

Since there's only one node_modules at the root, how do you manage dependency conflicts between packages?

If we take our example, let's say the @premieroctet/app-react application uses react@18 and the @premieroctet/app-next application uses react@19. Yarn will install react@19 at the root of the monorepo and create a node_modules in the @premieroctet/app-react application with react@18. This allows for the management of dependency conflicts between packages. In this case, it's the topological order that determines the priority of the dependencies. However, if a third package uses react@18, then Yarn will install react@18 at the root of the monorepo and create a node_modules in the @premieroctet/app-next package with react@19 since it's "in the minority".
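The "majority wins" behaviour described above can be sketched with a toy model. This is a deliberately simplified illustration, not Yarn's actual hoisting algorithm: the planHoisting function is hypothetical, versions are compared as plain strings, and the choice of @premieroctet/components as the third package using react@18 is an assumption made for the example.

```javascript
// Simplified model of hoisting for ONE dependency (NOT Yarn's real algorithm).
// The version requested by the most workspaces goes to the root node_modules;
// every other workspace keeps its own nested copy.
function planHoisting(requests) {
  // requests: { workspaceName: requestedVersion }
  const counts = new Map();
  for (const version of Object.values(requests)) {
    counts.set(version, (counts.get(version) ?? 0) + 1);
  }

  // The most requested version wins. On a tie, the first version encountered
  // is kept: an arbitrary choice for this sketch, not Yarn's rule.
  let root = null;
  for (const [version, count] of counts) {
    if (root === null || count > counts.get(root)) root = version;
  }

  // Workspaces asking for another version get a nested node_modules entry.
  const nested = {};
  for (const [workspace, version] of Object.entries(requests)) {
    if (version !== root) nested[workspace] = version;
  }
  return { root, nested };
}

const plan = planHoisting({
  '@premieroctet/app-react': '18',
  '@premieroctet/app-next': '19',
  '@premieroctet/components': '18', // assumed third consumer of react@18
});
console.log(plan);
// { root: '18', nested: { '@premieroctet/app-next': '19' } }
```

Here react@18 ends up at the root and only @premieroctet/app-next keeps a nested react@19, which matches the three-package scenario described above.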

Using Components in Applications

Now that we've initialized our applications and packages, let's see how to use them in our applications.

Let's start by adding components to our @premieroctet/components package. To do this, we'll create a Button component in the packages/components/src/Button.js file.

Button.js

import React from 'react'

const Button = ({ children }) => {
  return <button>{children}</button>
}

export default Button

Then, let's define an entrypoint in the packages/components/src/index.js file.

index.js

export { default as Button } from './Button'

Initialize TypeScript in the @premieroctet/components package so that the applications can use its types.

cd monorepo/packages/components
yarn add -D typescript
npx tsc --init

Add compiler options to the packages/components/tsconfig.json file.

tsconfig.json

{
  "compilerOptions": {
    "declaration": true,
    "outDir": "dist",
    ...
  }
}

And finally, define the entry point in the packages/components/package.json file and add a script to generate the dist.

package.json

{
  "name": "@premieroctet/components",
  "version": "1.0.0",
  "main": "dist/index.js",
  "scripts": {
    "build": "tsc"
  }
  ...
}

cd monorepo/packages/components
yarn build

Now we can add the dependencies of our packages into our applications.

cd monorepo/apps/app-react
yarn add @premieroctet/components

In the apps/app-react/package.json file, you can see that Yarn added a dependency to our @premieroctet/components package.

package.json

  "@premieroctet/components": "workspace:^"

The workspace: protocol lets the package version be resolved at yarn npm publish or yarn pack time: workspace:^ is replaced by a caret range on the current version of the package. You can use another range depending on your needs, see the documentation.
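To make this substitution concrete, here is a minimal sketch of how a workspace: range could be rewritten when a package is packed or published. The resolveWorkspaceRange function is hypothetical and only mirrors the documented substitution rules; it is not Yarn's implementation.

```javascript
// Sketch of the "workspace:" substitution applied at pack/publish time.
// Mirrors the documented behaviour only; this is not Yarn's source code.
function resolveWorkspaceRange(range, targetVersion) {
  if (!range.startsWith('workspace:')) return range; // regular npm range
  const spec = range.slice('workspace:'.length);
  if (spec === '*') return targetVersion; // exact current version
  if (spec === '^' || spec === '~') return spec + targetVersion;
  return spec; // explicit range, e.g. workspace:^1.2.0 -> ^1.2.0
}

console.log(resolveWorkspaceRange('workspace:^', '1.0.0')); // ^1.0.0
console.log(resolveWorkspaceRange('workspace:~', '1.0.0')); // ~1.0.0
console.log(resolveWorkspaceRange('workspace:*', '1.0.0')); // 1.0.0
```

With the version 1.0.0 declared in packages/components/package.json, the published manifest of @premieroctet/app-react would therefore reference ^1.0.0 rather than workspace:^.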

Now we can import the components of our @premieroctet/components package into our @premieroctet/app-react application. For now, @premieroctet/components does not have any peerDependencies. If it did, the @premieroctet/app-react application would have to add them to its package.json, or already have dependencies that satisfy the peerDependencies of @premieroctet/components.

We can also use the scripts of our @premieroctet/cli package in our applications. To do this, we will add a cli command in the packages/cli/src/index.js file.

index.js

#!/usr/bin/env node

console.log('Hello from @premieroctet/cli')

Then define the entry point in the packages/cli/package.json file and add a script to run the cli command.

package.json

{
  "name": "@premieroctet/cli",
  "version": "1.0.0",
  "bin": {
    "cli": "dist/index.js"
  },
  "scripts": {
    "build": "tsc"
  }
  ...
}

cd monorepo/packages/cli
yarn build

Now we can use the cli command of our @premieroctet/cli package in our applications. To do so, we add the package as a development dependency of the application, then run its binary through Yarn.

cd monorepo/apps/app-react
yarn add -D @premieroctet/cli
yarn cli

We can even add scripts in the package.json of our applications to run the scripts of our packages.

package.json

{
  "scripts": {
    "cli": "cli"
  }
  ...
}

The `yarn workspaces` commands

With the yarn workspaces command, one can execute commands in all workspaces or in a specific workspace.

yarn workspaces foreach --all run build

yarn workspace @premieroctet/components run build

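With yarn workspaces foreach, the -pt options (--parallel and --topological) run the command in parallel while making sure that a workspace only starts once the workspaces it depends on have finished running the same command.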
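The topological ordering mentioned above can be illustrated with a small depth-first sort. This is a sketch of the general technique, not Yarn's code: the topologicalOrder function is hypothetical, and the dependency of app-next on components is an assumption (only app-react's dependencies are set up in this article).

```javascript
// Dependencies between workspaces. The app-next entry is an assumption
// made for this example.
const graph = {
  '@premieroctet/components': [],
  '@premieroctet/cli': [],
  '@premieroctet/app-react': ['@premieroctet/components', '@premieroctet/cli'],
  '@premieroctet/app-next': ['@premieroctet/components'],
};

// Depth-first topological sort: a workspace is emitted only after all of
// its dependencies. Throws when the graph contains a cycle.
function topologicalOrder(graph) {
  const order = [];
  const state = new Map(); // name -> 'visiting' | 'done'

  function visit(name) {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`Cyclic dependency involving ${name}`);
    }
    state.set(name, 'visiting');
    for (const dep of graph[name] ?? []) visit(dep);
    state.set(name, 'done');
    order.push(name);
  }

  Object.keys(graph).forEach(visit);
  return order;
}

console.log(topologicalOrder(graph));
// [ '@premieroctet/components', '@premieroctet/cli',
//   '@premieroctet/app-react', '@premieroctet/app-next' ]
```

Both packages come before the applications that consume them, which is the order a build has to follow.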

You can find all available commands in the Yarn 2+ documentation.

The limits of yarn workspaces

At this stage, we have only used Yarn workspaces to manage our packages. However, this approach has limitations, the most significant being the performance of commands such as yarn build or yarn test, which are executed for each package. This can quickly become a problem if you have many packages or if the commands are slow to run.

A second limitation is the command orchestration between the packages. For example, if you have a command that depends on another command in another package, you'll have to manually manage the order of execution of the commands.

Therefore, we will see how Turbo can help us solve these problems.

Turborepo

Installation

Let's start by installing Turbo at the workspace root of our monorepo, then create a turbo.json configuration file to define the tasks.

cd monorepo
yarn add -D turbo

turbo.json

{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {}
}

For more information on Turbo installation, you can refer to the documentation.

Task Orchestration

In our example, we can consider the following tasks:

build: to build all the packages - all projects
dev: to run the applications in development mode - app-react and app-next
deploy: to deploy the applications - app-react and app-next
cli: to run the script of the @premieroctet/cli package - app-react
publish: to publish the packages - components and cli

The idea is to be able to run commands without having to move into each project and manage the order of execution of the commands.

In the turbo.json configuration file, we will define tasks for each command.

turbo.json

{
  "$schema": "https://turbo.build/schema.json",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", "build/**", ".next/**", "!.next/cache/**"]
    },
    "dev": {
      "cache": false,
      "persistent": true
    },
    "deploy": {
      "dependsOn": ["^build"]
    },
    "@premieroctet/app-react#cli": {
      "dependsOn": ["@premieroctet/cli#build"]
    },
    "publish": {
      "dependsOn": ["^build"]
    }
  }
}

dependsOn defines dependencies between tasks. Here, the ^ prefix means that the build task of a package runs after the build tasks of the packages it depends on.
outputs lists the files generated by the task. Turbo stores them in its cache and restores them when the task does not need to run again.
cache: false disables caching for the task.
persistent marks the task as long-running (a development server, for example), so it is not expected to exit on its own.

In the case of cyclic dependencies, it will not be possible to use dependsOn to define the order of execution of tasks.

For deploy and publish tasks, which are not defined in all of the packages, Turbo will only run the defined tasks.

For the cli task, we specify the package in which the task should run by prefixing the task name with the package name. We also declare its dependency on the build task of the @premieroctet/cli package.
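The three forms a dependsOn entry can take (^task, package#task, and a bare task name) can be sketched as follows. The expandDependsOn function and the packageDeps map are hypothetical and only illustrate how each entry maps to concrete tasks; Turbo's real task graph handles more cases.

```javascript
// Expand Turbo-style "dependsOn" entries into concrete "package#task" ids.
//   "^task"     -> the same task in each direct dependency of the package
//   "pkg#task"  -> a task in one specific package
//   "task"      -> another task in the same package
function expandDependsOn(pkg, dependsOn, packageDeps) {
  return dependsOn.flatMap((entry) => {
    if (entry.startsWith('^')) {
      const task = entry.slice(1);
      return (packageDeps[pkg] ?? []).map((dep) => `${dep}#${task}`);
    }
    return entry.includes('#') ? [entry] : [`${pkg}#${entry}`];
  });
}

// Direct workspace dependencies of the example application.
const packageDeps = {
  '@premieroctet/app-react': ['@premieroctet/components', '@premieroctet/cli'],
};

console.log(expandDependsOn('@premieroctet/app-react', ['^build'], packageDeps));
// [ '@premieroctet/components#build', '@premieroctet/cli#build' ]

console.log(
  expandDependsOn('@premieroctet/app-react', ['@premieroctet/cli#build'], packageDeps)
);
// [ '@premieroctet/cli#build' ]
```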

To run tasks, you use the turbo command followed by the name of the task.

yarn turbo run build

One can specify the packages on which they want to run tasks using the --filter flag or by prefixing the task name with the package name (e.g., @premieroctet/app-react#dev). This is very handy when reducing the number of tasks to execute in continuous integration or when testing a specific package task. See documentation.

For developer comfort, one can add scripts to the package.json to run Turbo tasks.

package.json

{
  "scripts": {
    "build": "turbo run build",
    "dev": "turbo run dev",
    "deploy": "turbo run deploy",
    "cli": "turbo run @premieroctet/app-react#cli",
    "publish": "turbo run publish"
  }
  ...
}

Turbo Cache

To optimize execution times, Turbo relies on a caching system to avoid re-running tasks whose inputs (source files, dependencies, configuration) have not changed; the previously generated outputs are restored instead. This improves performance during builds and tests. The cache is stored in the .turbo folder at the workspace root.
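The principle behind this kind of cache can be sketched in a few lines: hash the inputs of a task, and reuse the stored outputs when the hash is already known. This is a simplified, in-memory illustration under stated assumptions, not Turbo's implementation; cacheKey and runCached are hypothetical names, and a real tool hashes many more inputs (lockfile, environment variables, dependency hashes).

```javascript
const { createHash } = require('node:crypto');

// Build a cache key from the task name and the content of its input files.
function cacheKey(taskName, inputs) {
  const hash = createHash('sha256');
  hash.update(taskName);
  hash.update('\0');
  // Sort by path so the key does not depend on enumeration order.
  for (const path of Object.keys(inputs).sort()) {
    hash.update(path);
    hash.update('\0');
    hash.update(inputs[path]);
    hash.update('\0');
  }
  return hash.digest('hex');
}

// In-memory cache: key -> outputs of the task.
const cache = new Map();

function runCached(taskName, inputs, run) {
  const key = cacheKey(taskName, inputs);
  if (cache.has(key)) return { hit: true, outputs: cache.get(key) };
  const outputs = run();
  cache.set(key, outputs);
  return { hit: false, outputs };
}

const build = () => ({ 'dist/index.js': '/* compiled */' });
const sources = { 'src/index.js': "export { default as Button } from './Button'" };

console.log(runCached('build', sources, build).hit); // false (first run)
console.log(runCached('build', sources, build).hit); // true (same inputs)
console.log(runCached('build', { ...sources, 'src/Button.js': 'changed' }, build).hit); // false
```

The second call skips the build function entirely, which is what makes repeated builds of unchanged packages nearly instantaneous.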

For large monorepos, the cache can also be stored on a dedicated server so that developers and continuous integration share it instead of each rebuilding the same tasks. See Remote Caching.

Tips for a successful monorepo

As you've seen, managing a monorepo can quickly become complicated if it is not well mastered. Here are some points to keep in mind to avoid errors and make monorepo management easier:

Avoid cyclic dependencies: they can quickly become a headache to manage, especially if you have many packages, and can even block your Turbo tasks.
⚠️ Explicitly define dependencies: it is important to declare the dependencies of each package in its own package.json, especially if you use the monorepo to publish packages. Since Yarn workspaces centralize dependencies, a dependency declared in package-a may be present in node_modules and therefore usable from package-b without being declared in the package.json of package-b. You won't get compilation errors in your workspace; however, anyone who uses package-b in another project will, because the dependency will be missing.
Align dependency versions: to reduce the size of your node_modules and avoid dependency conflicts, it's preferable to align dependency versions between your packages when possible.
Use peerDependencies: if your packages share common dependencies, it is better to declare them as peerDependencies to avoid dependency conflicts. In our example, a good practice would be to declare react and react-dom in the peerDependencies of @premieroctet/components. That way, you keep a single version of react and react-dom for all your applications and get a warning during installation if an application lacks the required dependencies. Keep in mind, however, that peerDependencies are not installed automatically: if your workspace already contains a version of a peer dependency, that version is the one that will be used.
BONUS - Clone a part of the monorepo: if you have a large monorepo with a long history and only need its current state, you can use the git clone --depth=1 <url> command to clone only the last commit. If you want to dig into this technique, I invite you to read the article: Get up to speed with partial clone and shallow clone.
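To illustrate the "Explicitly define dependencies" tip, here is a minimal sketch of a check that compares the modules a package imports with what its package.json declares. The findUndeclaredDependencies function, the package-b manifest, and the lodash import are all hypothetical; extracting the list of imported modules from the source files is left out.

```javascript
// Return the imported modules that the manifest does not declare.
// devDependencies are ignored on purpose: consumers of a published
// package do not get them installed.
function findUndeclaredDependencies(importedModules, manifest) {
  const declared = new Set([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.peerDependencies ?? {}),
  ]);
  return [...new Set(importedModules)].filter((name) => !declared.has(name));
}

// Hypothetical package-b: it imports lodash, which is only available in the
// workspace because another package declares it.
const packageB = {
  name: 'package-b',
  peerDependencies: { react: '^18.0.0' },
};

console.log(findUndeclaredDependencies(['react', 'lodash', 'lodash'], packageB));
// [ 'lodash' ]
```

Running this kind of check before publishing surfaces the missing declaration inside the workspace, instead of in the project of whoever installs the package.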

Conclusion

As you have seen, combining Turborepo with Yarn workspaces is a very powerful solution for managing your projects. It ensures version and dependency consistency and makes your packages easier to maintain and deploy. However, it is important to master the tools you use and follow best practices to avoid errors and keep your monorepo manageable.

In conclusion, if you have projects that share dependencies, or if you have teams working on different projects, I would recommend using a monorepo with Turbo and Yarn workspaces. This will improve performance, maintainability, and collaboration.

👋
