What I learned from the CI of Facebook and their Open Source Projects

Aurélien Le Masson, 7 min read

Stop wasting your time on tasks your CI could do for you.

Here are 4 tips to make better use of your CI so you can focus on what matters, and what you love: code. Let’s face it: as a developer, a huge part of the value you create is your code.

Note: Some of these tips use the GitHub / CircleCI combo. Don’t leave yet if you use BitBucket or Jenkins! I use GitHub and CircleCI on my personal and work-related projects, so they are the tools I know best. But most of these tips can be set up with any CI on the market.

Tip 1: Automatic Changelogs

I used to work on a library of reusable React components, like Material UI. Several teams were using components from our library, and with our regular updates, we were wasting a lot of time writing changelogs. We decided to use Conventional Commits. Conventional Commits is a fancy name for commits with a standardized name:

feat(NavTable): make NavTable responsive

fix(Fundsheet): correct body margin for responsive designs

(Examples of Conventional Commits)

The standard format is “TYPE(SCOPE): DESCRIPTION OF THE CHANGES”.

TYPE describes the kind of change, for instance `feat` for a new feature or `fix` for a bug fix.

SCOPE (optional parameter) describes what part of your codebase is changed within the commit.

DESCRIPTION OF THE CHANGES is pretty much what you would write in a “traditional” commit message. However, you can use keywords in your commit message to add more information. For instance:

fix(SomeButton): disable by default to fix IE7 behaviour

BREAKING CHANGE: prop `isDisabled` is now mandatory
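To make the format concrete, here is a minimal sketch of how a script could read such a message. The `Commit` class, the `parse_commit` function and the regular expression are illustrative assumptions, not the parser used by any particular tool.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Header shape described above: TYPE(SCOPE): DESCRIPTION, with SCOPE optional.
HEADER = re.compile(r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<description>.+)$")


@dataclass
class Commit:
    type: str
    scope: Optional[str]
    description: str
    breaking: bool


def parse_commit(message: str) -> Optional[Commit]:
    """Return the parsed commit, or None if the first line is not conventional."""
    lines = message.strip().splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        return None
    # The BREAKING CHANGE keyword is looked for on any line after the header.
    breaking = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    return Commit(match["type"], match["scope"], match["description"], breaking)


commit = parse_commit(
    "fix(SomeButton): disable by default to fix IE7 behaviour\n"
    "\n"
    "BREAKING CHANGE: prop `isDisabled` is now mandatory"
)
print(commit)
```

A message that does not follow the format (for example `update stuff`) simply yields `None`, which is what lets a script ignore it.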

Why is this useful? Three main reasons:

  1. Allow scripts to parse the commit names, and generate changelogs with them
  2. Help developers think about the impact of their changes (Does my feature add a Breaking Change?)
  3. Allow scripts to choose the correct version bump for your project, depending on “how big” the changes in a commit are (bugfix: x.y.Z, feature: x.Y.z, breaking change: X.y.z)

This standard way of numbering versions (MAJOR.MINOR.PATCH) is called Semantic Versioning. Depending on the version bump, you can anticipate the impact on your app and the amount of work needed.
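As an illustration of that calculation, the sketch below picks the next version from a list of commit messages. It is a simplified assumption of how such a script could work (only `feat`, `fix` and the `BREAKING CHANGE:` keyword are considered), not the exact rule set of any release tool; `bump_for` and `next_version` are hypothetical names.

```python
import re
from typing import List

# Bumps ordered from smallest to largest impact.
ORDER = ["none", "patch", "minor", "major"]


def bump_for(message: str) -> str:
    """Map one commit message to 'major', 'minor', 'patch' or 'none'."""
    if "BREAKING CHANGE:" in message:
        return "major"
    header = message.splitlines()[0] if message else ""
    if re.match(r"^feat(\(.+\))?: ", header):
        return "minor"
    if re.match(r"^fix(\(.+\))?: ", header):
        return "patch"
    return "none"


def next_version(current: str, messages: List[str]) -> str:
    """Apply the largest bump found in the commits to an X.Y.Z version."""
    major, minor, patch = (int(part) for part in current.split("."))
    bump = max((bump_for(m) for m in messages), key=ORDER.index, default="none")
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current


print(next_version("1.4.2", ["fix(Fundsheet): correct body margin for responsive designs"]))
print(next_version("1.4.2", ["feat(NavTable): make NavTable responsive"]))
```

The largest bump wins: one breaking change among ten bugfixes still produces a new major version.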

Be careful though! Not everyone follows this standard, and even those who do can miss a breaking change! You should never update your dependencies without testing that everything still works 😉

How to set up Conventional Commits

  1. Install Commitizen
  2. Install semantic-release
  3. Add GITHUB_TOKEN and NPM_TOKEN to the environment variables of your CI
  4. Add `npx semantic-release` after the bundle & tests steps on your CI master/production build
  5. Use `git cz` instead of `git commit` to get used to the commit message standard
  6. Squash & merge your feature branch on master/production branch

When you get used to the commit message standard, you can go back to `git commit`, but remember the format! (e.g. `git commit -m "feat: add an awesome feature"`)

Now, every developer working on your codebase will create changelogs without even noticing it. Plus, if your project is used by others, they only need a glance at your package version/changelog to know what changes you’ve made, and if they are Breaking.
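To give an idea of what "creating changelogs without noticing" means in practice, here is a small sketch that turns commit headers into a changelog entry. The section titles, the `changelog` function and the output layout are assumptions made for illustration; real tools produce a richer format.

```python
import re
from collections import defaultdict
from typing import Dict, List

HEADER = re.compile(r"^(\w+)(?:\(([^)]+)\))?: (.+)$")
# Section titles are an assumption for this sketch, not part of the standard.
TITLES = {"feat": "Features", "fix": "Bug Fixes"}


def changelog(version: str, headers: List[str]) -> str:
    """Group conventional commit headers by type under a version title."""
    sections: Dict[str, List[str]] = defaultdict(list)
    for header in headers:
        match = HEADER.match(header)
        if match is None or match[1] not in TITLES:
            continue  # commits that are not feat/fix are left out of the changelog
        commit_type, scope, description = match.groups()
        entry = f"- {scope}: {description}" if scope else f"- {description}"
        sections[TITLES[commit_type]].append(entry)
    lines = [f"Version {version}"]
    for title in TITLES.values():
        if sections[title]:
            lines += ["", title] + sections[title]
    return "\n".join(lines)


print(changelog("1.5.0", [
    "feat(NavTable): make NavTable responsive",
    "fix(Fundsheet): correct body margin for responsive designs",
    "chore: bump dependencies",
]))
```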

Tip 2a: Run parallel tasks on your CI

Why do I say task instead of tests? Because a CI can do a lot more than run tests! It can also build your app, deploy your Storybook, or publish your library.

There are several ways to use parallelism to run your tasks.

The blunt approach

This simply consists of using the built-in parallelism of your tasks, combined with a multi-thread CI container.

With Jest, you can choose the number of workers (threads) to use for your tests with the `--maxWorkers` flag.

With Pytest, try the pytest-xdist plugin and its `-n` flag to split your tests across multiple CPUs.

Another way of parallelizing tests is by splitting the test files between your CI containers, as React does. However, I won’t write about this approach in this article since the correct way of doing it is nicely explained in the CircleCI docs.
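For tasks that have no such flag, the same blunt idea can be applied one level up by launching independent commands side by side. The sketch below is only an illustration: `run_parallel` and the two placeholder commands are hypothetical stand-ins for your own lint or test commands, and it assumes the container has at least two cores.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_parallel(tasks: Dict[str, List[str]], max_workers: int = 2) -> Dict[str, int]:
    """Run each command in its own process and return the exit code per task name."""

    def run(command: List[str]) -> int:
        return subprocess.run(command).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run, command) for name, command in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


# Placeholder commands: replace them with your real lint and test commands.
exit_codes = run_parallel({
    "lint": [sys.executable, "-c", "print('lint ok')"],
    "tests": [sys.executable, "-c", "print('tests ok')"],
})
print(exit_codes)
```

A non-zero exit code for any task should fail the build, so the CI step would typically exit with the largest value in `exit_codes`.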

Tip 2b: CircleCI Workflows

With Workflows, we reduced our CI build time by about 25% on feature branches (from 11 min to 8 min 30 s) and by 30% on our master branch (from 16 min 30 s to 11 min 30 s). With an average of 7 features merged on master a day, this is 1 hour and 30 minutes less waiting every day for our team.

Workflows is a feature of CircleCI. Group your tasks in Jobs, then order your Jobs however best suits your project. Let’s imagine you are building a library of re-usable React Components (huh, I think I’ve already read that somewhere…). Your CI:

  1. Sets up your project (maybe spawns a Docker container, installs your dependencies, builds your app)
  2. Runs unit/integration tests
  3. Runs E2E tests
  4. Deploys your Storybook
  5. Publishes your library

Each of those bullet points can be a Job: it may have several tasks in it, but all serve the same purpose. But do you need to wait for your unit tests to pass before launching your E2E tests? Those two jobs are independent and could be running on two different machines.
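The gain can be estimated before touching any configuration. The sketch below compares running the five jobs one after another with running each job as soon as the jobs it depends on are done. The durations and the dependency graph are made-up assumptions for illustration, and the calculation assumes a free machine is always available.

```python
from typing import Dict, List

# Hypothetical durations in minutes; job names mirror the list above.
DURATION: Dict[str, float] = {
    "setup": 3.0,
    "unit_tests": 4.0,
    "e2e_tests": 5.0,
    "storybook": 2.0,
    "publish": 1.0,
}
# Each job lists the jobs it must wait for (an assumed dependency graph).
REQUIRES: Dict[str, List[str]] = {
    "setup": [],
    "unit_tests": ["setup"],
    "e2e_tests": ["setup"],
    "storybook": ["unit_tests", "e2e_tests"],
    "publish": ["unit_tests", "e2e_tests"],
}


def sequential_time(duration: Dict[str, float]) -> float:
    """Build time when all jobs run one after another on a single machine."""
    return sum(duration.values())


def workflow_time(duration: Dict[str, float], requires: Dict[str, List[str]]) -> float:
    """Build time when every job starts as soon as its requirements have finished."""
    finish: Dict[str, float] = {}

    def finish_time(job: str) -> float:
        if job not in finish:
            start = max((finish_time(dep) for dep in requires[job]), default=0.0)
            finish[job] = start + duration[job]
        return finish[job]

    return max(finish_time(job) for job in duration)


print(sequential_time(DURATION))           # 15.0
print(workflow_time(DURATION, REQUIRES))   # 10.0
```

With these numbers the build is only as long as its slowest chain (setup, E2E tests, Storybook), which is why running the two test jobs side by side pays off.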

[Figure: Our CircleCI workflow]

[Figure: Extract of our config.yml]

As you can see, it is pretty straightforward to re-order or add dependencies between steps.

Note: Having trouble setting up a workflow? You can SSH into the machine during the build.

Parallelization drawbacks

But be careful with parallelism: resources are not unlimited. If you share your CI plan with other teams in your organization, make sure that using more resources for parallelism will not be counter-productive at a larger scale. Using 2 machines for 10 minutes consumes 20 machine-minutes, whereas using 1 machine for 15 minutes consumes only 15, so the parallel build can leave other projects waiting:

[Figure: Project #2 is queued on CI because no machine was free when its build was triggered]

Plus, sharing the Workspace (the current state) of one machine with others (e.g. after running `yarn`, so that your dependencies are installed for every job) costs time, both when saving the state on the first machine and when loading it on the others.
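The queueing effect is easy to reproduce with a toy simulation. The sketch below assumes a shared plan with 2 machines, a strict first-in-first-out queue, and four builds triggered at the same time; all of these numbers are assumptions chosen for illustration, and `finish_times` is a hypothetical helper.

```python
from typing import List, Tuple


def finish_times(machines: int, builds: List[Tuple[float, int, float]]) -> List[float]:
    """Simulate a FIFO queue; each build is (trigger time, machines needed, duration).

    Assumes no build asks for more machines than the plan contains.
    """
    free_at = [0.0] * machines  # time at which each machine becomes free
    finished = []
    for triggered, needed, duration in builds:
        free_at.sort()
        # The build starts once `needed` machines are free, never before it is triggered.
        start = max(triggered, free_at[needed - 1])
        for i in range(needed):
            free_at[i] = start + duration
        finished.append(start + duration)
    return finished


# Four builds triggered at t=0 on a 2-machine plan (times in minutes).
one_machine_each = finish_times(2, [(0, 1, 15)] * 4)
two_machines_each = finish_times(2, [(0, 2, 10)] * 4)
print(one_machine_each)    # [15.0, 15.0, 30.0, 30.0]
print(two_machines_each)   # [10.0, 20.0, 30.0, 40.0]
```

The first parallel build finishes sooner, but the last one finishes 10 minutes later and the average wait goes up (25 minutes instead of 22.5).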

So, when should I parallelize my CI tasks?

The most optimized formula would be to split jobs when jobDuration > (nb_containers_available * workspaceSharingDuration).

If you want to remember something simpler, a good rule of thumb is to always merge jobs whose duration is under 1 minute.

Workspace sharing can take up to a minute for a large codebase. You should try several workflow configurations to find what’s best for you.
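The formula above fits in a one-line helper, which can be handy when comparing several workflow configurations. The function name and the sample numbers are assumptions for illustration; all durations must use the same unit (minutes here).

```python
def should_split(job_duration: float,
                 containers_available: int,
                 workspace_sharing_duration: float) -> bool:
    """Apply the rule: split when jobDuration > containers * workspaceSharingDuration."""
    return job_duration > containers_available * workspace_sharing_duration


# A 5-minute job, 2 containers, 1 minute to share the workspace: worth splitting.
print(should_split(5.0, 2, 1.0))
# A 90-second job under the same conditions: keep it merged.
print(should_split(1.5, 2, 1.0))
```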

Tip 3: Set up cron(tab)s

Crontabs help make your CI more reliable without making builds longer.

Some of you may wonder: what is a cron/crontab? Cron is a job scheduler: a program that executes a series of instructions at a given time. It can be once an hour, once a day, once a year… The name is usually traced to chronos, the Greek word for time, and a crontab (short for "cron table") is the file that lists the scheduled jobs.

I worked on a project in finance linking several sources of data and APIs. Regression was the biggest fear of our client. If you give a user outdated or incorrect info, global financial regulators could issue you a huge fine.

Therefore, I built a tool to generate requests with randomized parameters (country, user profile…), play them, and check for regressions. The whole process can take an hour. We run it via our CI, daily, at night, and it saved the client a lot of trouble.

You can easily set up crons on CircleCI if you’ve already tried Jobs/Workflows. Check out the documentation.

Note: Crons use the POSIX crontab notation (five fields: minute, hour, day of month, month, day of week), which can be a bit tricky at first. Check out this neat Crontab Tester tool to get used to it!
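To get a feel for the notation, here is a minimal sketch that checks whether a five-field expression matches a given moment. It only handles `*`, single numbers, ranges (`a-b`) and comma-separated lists, and the function names are hypothetical; it is a learning aid, not a replacement for a real cron implementation.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', 'a', 'a-b', or a comma-separated list of those."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = (int(x) for x in part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Fields are: minute, hour, day of month, month, day of week (0 = Sunday)."""
    minute, hour, day, month, weekday = expression.split()
    if not (field_matches(minute, moment.minute)
            and field_matches(hour, moment.hour)
            and field_matches(month, moment.month)):
        return False
    day_ok = field_matches(day, moment.day)
    weekday_ok = field_matches(weekday, moment.isoweekday() % 7)
    # When both day fields are restricted, cron runs the job if either one matches.
    if day != "*" and weekday != "*":
        return day_ok or weekday_ok
    return day_ok and weekday_ok


# "Every day at 03:00", a plausible schedule for a nightly regression run.
print(cron_matches("0 3 * * *", datetime(2024, 1, 15, 3, 0)))
print(cron_matches("0 3 * * *", datetime(2024, 1, 15, 3, 1)))
```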

Misc tips:

If you have any other tip you would like to share, don’t hesitate!
