CI#

At first, you’ll want to write your tests locally, and test them against as many local browsers as possible. However, to really test out your features, you’ll want to:

  • run them against as many real browsers on other operating systems as possible

  • have easy access to human- and machine-readable test results and build assets

  • integrate with development tools like GitHub

Enter Continuous Integration (CI).

Providers: Cloud#

Multi-Provider#

Historically, Jupyter projects have used a mix of free-as-in-beer-for-open-source hosted services. Each brings its own syntax, features, and constraints to building and maintaining robust CI workflows.

JupyterLibrary started on Travis-CI, but as soon as we wanted to support more platforms and browsers…

Azure Pipelines#

At the risk of putting all your eggs in one (proprietary) basket, Azure Pipelines provides a single-file approach to automating all of your tests against reasonably modern versions of browsers.

JupyterLibrary was formerly built on Azure, and looking through the pipeline and its various jobs and steps shows some evolving approaches…

GitHub Actions#

At the risk of putting all your eggs in one (proprietary) basket, if your code is on GitHub, GitHub Actions offers the tightest integration, requiring no additional accounts.

JupyterLibrary is itself built on GitHub Actions, and looking at the workflows offers some of the best patterns we have found.

Providers: On-Premises#

Jenkins#

If you are working on in-house projects, and/or have the ability to support it, Jenkins is the gold standard for self-hosted continuous integration. It has almost limitless configurability, and commercial support is available.

Approach: It’s Just Scripts#

No matter how shiny or magical your continuous integration tools appear, the long-term well-being of your repo depends on techniques that are:

  • simple

  • cross-platform

  • as close to real browsers as possible

  • easily reproducible outside of CI

Practically, since this is Jupyter, this boils down to putting as much as possible into platform-independent Python (and, when necessary, Node.js) code.

JupyterLibrary uses doit to manage a relatively complex lifecycle across multiple environments with minimal CLI churn.

  • doit has very few runtime dependencies, and works well with caching, etc.
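As a minimal sketch of the pattern (the task names and actions below are illustrative, not JupyterLibrary's actual definitions), doit tasks are plain Python functions in a `dodo.py` that return dictionaries describing actions and dependencies:

```python
# dodo.py -- hypothetical doit task file; names and commands are
# illustrative, not JupyterLibrary's actual task definitions
def task_lint():
    """Cheap syntax checks that never launch a browser."""
    return {
        "actions": ["python scripts/lint.py"],
    }

def task_test():
    """Full browser-based acceptance tests, run only after lint passes."""
    return {
        "actions": ["python -m robot atest"],
        "task_dep": ["lint"],
    }
```

Running `doit test` would then execute `lint` first, and doit's dependency tracking avoids re-running tasks whose inputs have not changed.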

Environment variables are used for feature flags.

  • aside from some inevitable path issues, environment variables are easy to migrate onto another CI provider
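A small helper keeps flag parsing consistent across providers; the variable names below are hypothetical examples, not JupyterLibrary's actual flags:

```python
import os

def flag(name: str, default: str = "0") -> bool:
    """Read a boolean feature flag from the environment.

    Treats '1', 'true', and 'yes' (any case) as enabled, so the same
    flag works across CI providers with different conventions.
    """
    return os.environ.get(name, default).lower() in {"1", "true", "yes"}

# Hypothetical flags, with sensible defaults for local runs
RUNNING_IN_CI = flag("CI")
TEST_HEADLESS = flag("TEST_HEADLESS", "1")
```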

A small collection of scripts, not shipped as part of the distribution, provide some custom behaviors around particularly complex tasks.

  • sometimes doit is too heavy a hammer for delicate work

Approach: Caching#

Most of the CI providers offer nuanced approaches to caching files. Things to try caching (it doesn’t always help):

  • packages/metadata for your package manager, e.g. conda, pip, yarn

  • built web assets
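One common technique, sketched here under the assumption that your provider accepts an arbitrary string as a cache key, is to derive the key from the content of your dependency lockfiles, so the cache is invalidated exactly when pinned dependencies change:

```python
import hashlib
from pathlib import Path

def cache_key(*lockfiles: str) -> str:
    """Build a cache key by hashing dependency lockfiles.

    Any change to a pinned dependency changes the key, forcing a
    fresh cache; unrelated commits keep hitting the warm cache.
    """
    digest = hashlib.sha256()
    for path in lockfiles:
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()[:16]
```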

Approach: Pay technical debt forward#

A heavy CI pipeline can become necessary to manage many competing concerns. Each non-trivial, browser-based robot test can easily cost tens of seconds. Some approaches:

  • use an up-front dry-run robot test

    • this can help catch whitespace errors in robot syntax

    • this usually costs $\sim\frac{1}{100}$ the time of running the full test

  • run tests in subsets, in parallel, and in random order with pabot

    • requires avoiding shared resources, e.g. network ports, databases, logfiles
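The two approaches above can be sketched as command builders; `--dryrun` and `--randomize` are real Robot Framework options and `--processes` is a real pabot option, but the suite path and process count here are illustrative:

```python
import sys

def dry_run_cmd(suite_dir: str) -> list:
    """Cheap up-front check: --dryrun parses every suite and keyword
    without launching a browser."""
    return [sys.executable, "-m", "robot", "--dryrun", suite_dir]

def parallel_cmd(suite_dir: str, processes: int = 4) -> list:
    """Run suites in parallel and in random order with pabot.

    Each worker process must get its own network ports, databases,
    and logfiles, or the runs will interfere with each other.
    """
    return [sys.executable, "-m", "pabot.pabot",
            "--processes", str(processes),
            "--randomize", "all",
            suite_dir]
```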