Software artist. Writer aficionado. Open source enthusiast.
Runner. Father of two.
Currently: Senior Software Engineer at Google, New York City.
Bazel likes creating very deep and large trees on disk during a build. One example is the output tree, which naturally contains all the artifacts of your build. Another, more problematic example is the symlink forest trees created for every action when sandboxing is enabled. As garbage gets created, it must be deleted. It turns out, however, that deleting file system trees can be very expensive—and especially so on macOS. In fact, calls to our deleteTree algorithm routinely showed up in my profiling runs when trying to diagnose slowdowns using the dynamic scheduler.
Since the publication of Bazel a few years ago, users have reported (and I myself have experienced) general slowdowns when Bazel is running on Macs: things like the window manager stutter and others like the web browser cannot load new pages. Similarly, after the introduction of the dynamic spawn scheduler, some users reported slower builds than pure remote or pure local builds, which made no sense. All along we guessed that these problems were caused by Bazel’s abuse of system threads, as it used to spawn 200 runnable threads during analysis and used to run 200 concurrent compiler subprocesses.
This is the tale of yet another Bazel bug, this time involving environment variables, global state, and gRPC. Through it, I'll argue that you should never use setenv within a program unless you are doing so to execute something else.
The point of this post is simple and I’ll spoil it from the get go: every time you make an assumption in a piece of code, make such assumption explicit in the form of an assertion or error check. If you cannot do that (are you sure?), then write a detailed comment. In fact, I’m exceedingly convinced that the amount of assertion-like checks in a piece of code is a good indicator of the programmer’s expertise.
I am pleased to announce that the first release of sandboxfs, 0.1.0, is finally here! You can download the sources and prebuilt binaries from the 0.1.0 release page and you can read the installation instructions for more details. The journey to this first release has been a long one. sandboxfs was first conceived over two years ago, was first announced in August 2017, showed its first promising results in April 2018, and has been undergoing a rewrite from Go to Rust.
Bazel’s original raison d’etre was to support Google’s monorepo. A consequence of using a monorepo is that some builds will become very large. And large builds can be very resource hungry, especially when using a tool like Bazel that tries to parallelize as many actions as possible for efficiency reasons. There are many resource types in a system, but today I’d like to focus on the number of open files at any given time (nofiles).
I used to be good at replying to emails on time. When tens of emails came in every day, I would sort them out and I would reply right away to anything that needed or caught my attention. The so-called Inbox Zero wasn’t a specific goal that required effort: “it just was”. Things have changed over the years and I am now quite awful at dealing with personal email. Some emails can go weeks (or, I confess, months) before getting a reply.