Dependabot

Over the past few weeks, I've been on a big security/DevOps/supply-chain/continuous integration kick. Not just for my own projects, mind you, but also for client and open source work.

It may have been triggered by one of the many NPM package hijacks that happened last month, or maybe it's related to one of the Arch AUR hacks that have happened recently. Oh hey, and for good measure, here's a Rust RCE in an abandoned library... Yayyy...

This sort of thing has always been on my mind, but something made me finally decide to brush the dust off my ToDo list and start implementing some more robust security protocols around my development machines and the code that gets to run on them.

In my own projects, I tend towards minimal (or zero) third-party dependencies, which reduces my supply-chain attack surface. In open-source projects or certain client work, I don't have the same liberty to write a bunch of code on my own timelines. Dependencies are also introduced by other developers, and I don't always have control over that. Never mind transitive dependencies, which are a can of worms all on their own.

Some preachy examples

I wrote an iOS SDK which is in use by a few million users. It's a non-trivial amount of code which does a surprising amount of work. It has exactly 1 non-Apple dependency which has no transitive dependencies. There are a couple Apple dependencies, and I believe transitively, there are two non-Apple dependencies. I don't particularly like that, however, given that Apple has paid people to work on the libraries I'm using, I would hope (wishful thinking maybe) they're keeping an eye on the external dependencies they use. I've even considered removing those dependencies in favour of something I could write that would be more performant and cross-platform, but that would take some time.

I've written an SSH client for my iPad to let me do some work when I'm not in front of my daily driver, and there are no external dependencies. Not via a package manager, not via vendoring. I wrote all the code, including the GPU rendering pipeline, and Apple wrote the UI framework and built the hardware.

Overall, I really do try to practice what I preach.

Enough preaching

Those specific examples aside, there are a lot of ideas and topics at varying levels of paranoia I might implement or write about. I wrote a list of about 30 items last night, but they could likely be batched to a top-10 list someday...

The first is the easiest and most obvious to implement... If you use GitHub... And I won't be for any of my non-open-source or non-source-available work for much longer. BUT! If YOU do...

Most people who use GitHub probably know what Dependabot is, and if you do, you've almost certainly been annoyed by its incessant pull request churn.

For those who don't know, Dependabot is a bot built into GitHub which runs at some pre-defined frequency over your repo's dependencies (for supported languages and ecosystems) and checks for package updates, or specifically for security updates to your packages. It can then open a PR automatically - which I find mostly useless other than as a signal that there are dependencies to update.

If you want to use it in one of your repos, check out their quick start. That site has a metric ton of information, which is great. The downside is that it has a metric ton of information that you have to parse through to find the 3-4 things that 95% of users will care about. Fortunately, I've done that already.

Here is the default dependabot.yml file we added to the PantsBuild website repo the other day:

version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"

Pretty simple, but will eventually annoy the crap outta you.

Groups

The first thing everyone should do is group dependencies (and optionally tune the schedule to however often you think you might need updates):

version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    groups:
      npm-deps:
        patterns:
          - "*"
    schedule:
      interval: "weekly"

groups, in this context, simply batches updates: instead of 1 PR per update (yuck), you get 1 PR per group per run. With a single catch-all group like the one above, that takes us from this:

Dependabot - too many PRs

to this:

Dependabot - just enough PRs

Labels

Depending on the size of your project and the number of contributors, this next step has varying value. For Pants, PRs require certain labels to kick off CI, so it only makes sense to automatically apply them.

version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    groups:
      npm-deps:
        patterns:
          - "*"
    labels:
      - "category:internal"
      - "dependencies"
      - "release-notes:not-required"
    schedule:
      interval: "weekly"

Not reviewers, apparently

GitHub removed the reviewers option in favour of using a CODEOWNERS file.

So, if you were using it before April 2025, you should clean it up. It had questionable value anyway. I find that, unless the PR is incredibly trivial, auto-assigned reviewers don't make a large impact either way.

Who actions the actioners

You can have multiple package ecosystems in the same Dependabot file (each runs as a separate action), so don't forget to include one to cover your imported GitHub Actions workflows (and ensure it gets grouped and labelled correctly too).

version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"
    groups:
      gha-updates:
        patterns:
          - "*"
    labels:
      - "maybe"
    schedule:
      interval: "monthly"

  - package-ecosystem: "npm"
    directory: "/"
    groups:
      npm-deps:
        patterns:
          - "*"
    labels:
      - "maybe"
    schedule:
      interval: "weekly"

That's basically it. The reference docs go in-depth on a lot of the options and sub-options, but what I have above is probably closer to what a default or typical Dependabot config should look like.

Non-standard use cases

While setting this up for Pants, I ran into a few non-standard use cases that I'll briefly mention below. I didn't even realize Dependabot covered some of these workflows, so this was pretty cool to find out.

Security groups vs version groups

This hasn't worked out quite as well as I'd hoped. My intention (without writing too much configuration) was to split the frequency of security updates from general package updates: frequent security notifications, infrequent version notifications. I won't know how well it works until it has run for a few weeks after we've cleaned up our stale dependencies. To get there, I use groups and apply them situationally - the order matters, since it changes which group a dependency ends up in. There is a chance the * pattern might be too broad, but I'll revisit this if it doesn't work.

groups:
  rust-security-updates:
    applies-to: security-updates
    patterns:
      - "*"
    update-types:
      - "minor"
      - "patch"
  rust-version-updates:
    applies-to: version-updates
    patterns:
      - "*"

Non-standard (or multiple) directories

Pants is an interesting case study in Dependabot, because it has internal dependencies in several folders, and then a set of external-ish dependencies elsewhere. Currently, I'm covering the internal dependencies before I move on to the external ones. It's pretty easy to get each ecosystem to look in different directories for their desired dependency files:

  - package-ecosystem: "cargo"
    directory: "/src/rust"
    ...

  - package-ecosystem: "pip"
    directory: "/3rdparty/python"
    ...

  - package-ecosystem: "npm"
    directories:
      - "build-support/**/*"
      - "src/python/pants/backend/javascript/**/*"
      - "src/python/pants/backend/typescript/**/*"
      - "testprojects/src/js/**/*"
      - "testprojects/src/ts/**/*"
    ...

How to approach dependency updates

The most overwhelming part of this process is what happens right after the PR that commits your first Dependabot config. You might have a huge number of out-of-date dependencies, and it's a chore to get them fresh.

There are a few ways to handle this, but my general strategy is pretty simple:

1. Review the changelog for every single dependency update
2. Identify no-brainers and land them quickly, like yanked packages, security patches, or no-ops (e.g. a release that only happened due to a README update), to bring the overall numbers down
3. For the remainder, evaluate whether you actually need the dependency or whether you can write some code to replace the subset of functionality you use
4. Leave breaking changes until last and then, again, really think about whether you NEED the dependency or not

From there, the updates get more involved and have to be handled case by case.

Maybe a controversial idea

I (almost) always explicitly pin dependencies to specific versions. I know a lot of people don't do that, and a lot of people have been taught not to - but I absolutely detest any mechanism by which dependencies can ever change under my feet. In the best cases, it might be helpful for zero days - but in the common case, I just have a slightly less reliable pipeline.

And, in reality, you're trusting some random 3rd party to religiously adhere to SemVer... Why would you ever transitively trust people you don't know to do that?
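To make "pinned" concrete using Cargo as one example: an exact pin there is written with the `=` operator and a full version, such as `=1.2.3`, whereas a bare `1.2.3` is a range. Here is a minimal sketch of a check that flags anything else. The `unpinned` helper and the sample dependency map are hypothetical, and the regex deliberately ignores pre-release and build-metadata suffixes.

```python
import re
from typing import Dict, List

# "=" followed by a full major.minor.patch version.
# Assumption: pre-release/build suffixes (e.g. "=1.2.3-beta.1") are out of scope.
_EXACT_PIN = re.compile(r"^=\s*\d+\.\d+\.\d+$")


def unpinned(dependencies: Dict[str, str]) -> List[str]:
    """Return the sorted names of dependencies whose requirement is not an exact pin."""
    return sorted(
        name
        for name, requirement in dependencies.items()
        if not _EXACT_PIN.match(requirement.strip())
    )


# Hypothetical requirement strings, as they might appear in a Cargo.toml.
example = {
    "tokio": "1.2.3",   # bare version: a range, not a pin
    "axum": "=0.7.5",   # exact pin
    "prost": "^0.12",   # explicit caret range
    "serde": "=1.0",    # "=" with a partial version still allows 1.0.x
}
```

Calling `unpinned(example)` returns every name except `axum`, which is the only entry that cannot move without an edit to the manifest.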

My Slack rant

In writing this, I just remembered a rant I had in the Pants Slack earlier this month about this very subject, where I somehow keep forgetting (and then re-remembering) that Cargo has the WORST dependency specifiers.

Quick Rust rant: Cargo's default requirements are insane. For a language that prides itself on safety and security - it should be falling into a pit of safety everywhere, especially with supply chain.

If I type in the package version "1.2.3" - I'm typically not expecting the possibility of "1.99.99" as well. While I think npm versioning has too many options, at least if you type a full specifier, you get the full specifier and nothing else. Sane default. To take it a step further, I think without a range, caret, or tilde - you should enforce wildcards so people always know what they're getting into.

Default requirements (quoted from the Cargo docs)

Default requirements specify a minimum version with the ability to update to SemVer compatible versions. Versions are considered compatible if their left-most non-zero major/minor/patch component is the same. This is different from SemVer which considers all pre-1.0.0 packages to be incompatible.

1.2.3 is an example of a default requirement.

1.2.3  :=  >=1.2.3, <2.0.0
1.2    :=  >=1.2.0, <2.0.0
1      :=  >=1.0.0, <2.0.0
0.2.3  :=  >=0.2.3, <0.3.0
0.2    :=  >=0.2.0, <0.3.0
0.0.3  :=  >=0.0.3, <0.0.4
0.0    :=  >=0.0.0, <0.1.0
0      :=  >=0.0.0, <1.0.0
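The table above can be reproduced with a few lines of Python. This is a sketch of the rule as the quoted text states it, not Cargo's own resolver: `expand_default_requirement` is a hypothetical helper, it only handles plain numeric versions with one to three components, and it ignores pre-release tags.

```python
from typing import Tuple


def expand_default_requirement(spec: str) -> Tuple[str, str]:
    """Expand a bare requirement such as "1.2.3" into (inclusive lower, exclusive upper)."""
    parts = [int(piece) for piece in spec.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 numeric components, got {spec!r}")

    # Lower bound: the version as written, padded with zeros.
    lower = parts + [0] * (3 - len(parts))

    # Upper bound: bump the left-most non-zero component that was written.
    # If everything written is zero, bump the last written component instead.
    bump_at = next(
        (index for index, value in enumerate(parts) if value != 0),
        len(parts) - 1,
    )
    upper = lower[:bump_at] + [lower[bump_at] + 1] + [0] * (2 - bump_at)

    def render(version):
        return ".".join(str(number) for number in version)

    return render(lower), render(upper)
```

For instance, `expand_default_requirement("1.2.3")` gives `("1.2.3", "2.0.0")`, which is exactly the window in which a "1.2.3" requirement is allowed to float.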

Not just ranting about dependencies either

My other complaint is about non-namespaced, centralized dependency repos like crates.io or NPM.

The other one is that crates.io still (as far as I can see) [doesn't have] mandated namespaced crates. Relying on a flat namespace is a great way to get name squatters and it just makes the dependency space more confusing.

NPM eventually added them, and I think JSR mandates namespaces. Even the standard library for Deno is namespaced: https://jsr.io/@std

Makes fat-fingering a squatted package even harder if instead of:

tokio = ...
axum = ...
prost = ...

I had to do:

@tokio/runtime = ...
@tokio/axum = ...
@tokio/prost = ...

Becomes an org squatting issue, but that is a more tractable problem, I think.

How do you name a fork of something that is unmaintained?

Similar problem for lsp_types - there needed to be ls_types and I think Microsoft used lsprotocol for theirs.