Node.js Startup: Series Introduction & Measuring Startup
Tags: nodejs-startup, low-level
This blog post is part of a series to see how much I can optimize Node.js’s startup time. Startup time is something that users care about, especially for interactive tooling or for workloads with many short-lived processes. The most important step of performance analysis is measurement, so let’s start by measuring Node.js’s startup time.
I decided to measure the time to execute node -e 0, which simply evaluates
the no-op expression “0”. I focused on “warm startup”, i.e. when the
various file system caches were already warm. This felt more realistic since
mostly when you care about startup time, it’s for something you’ll be executing
often, so the operating system will already have the executable and shared
libraries paged into memory.
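To make the idea of a warm measurement concrete, here is a minimal sketch in Node.js. It is not the method used for the graphs in this post, and the names timeOnce, measureStartup, and median are made up for illustration. The timings include the parent's spawnSync overhead, so they are only comparable to other runs of the same script.

```javascript
const { spawnSync } = require('child_process');

// Time one run of `<nodePath> -e 0` in milliseconds (includes spawn overhead).
function timeOnce(nodePath) {
  const start = process.hrtime.bigint();
  const result = spawnSync(nodePath, ['-e', '0'], { stdio: 'ignore' });
  const end = process.hrtime.bigint();
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`node exited with status ${result.status}`);
  }
  return Number(end - start) / 1e6;
}

// Run `warmup` untimed iterations to warm the file system caches,
// then collect `runs` timed iterations.
function measureStartup(nodePath, { warmup = 10, runs = 20 } = {}) {
  for (let i = 0; i < warmup; i++) timeOnce(nodePath);
  const times = [];
  for (let i = 0; i < runs; i++) times.push(timeOnce(nodePath));
  return times;
}

// Median of a non-empty array of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Usage (assumption: measuring the same binary that runs this script):
//   const times = measureStartup(process.execPath);
//   console.log(`median: ${median(times).toFixed(1)} ms`);
```

The untimed warmup loop is the important part: without it, the first few runs pay for reading the executable and shared libraries from disk.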
Here’s a boxplot of the runtimes of older versions of Node.js, along with the main branch (“main”, 4b80a7b0c404e). As a teaser for the rest of this series, I’ve also included my WIP branch (“mine”).
As you can see, I’ve got a branch where the startup is faster than it’s
been in a long time (at least since 2017). It’s partially offset by a
minor regression in main. There’s not much variance in runtimes, so
the boxplot looks smushed.
Typical process startup is memory intensive, so optimizing startup time will likely optimize memory usage as well, and vice versa. Here’s the same graph except focusing on memory usage.
It’s not as impressive as the runtime graph unfortunately: again it’s
fighting a regression in main. The final results bring us back to
v17.9.1 (released June 2022), but still 2.3 MiB above the glory days of
v15.14.0 (released April 2021).
Node.js also provides its own startup benchmarks, which we can check to verify our results.
$ node benchmark/compare.js --old ./node_main --new ./node \
    --runs 10 --filter startup misc > results.csv
$ node-benchmark-compare results.csv
                          confidence improvement
process require-builtins         ***     35.96 %
process semicolon                ***     37.63 %
worker  require-builtins         ***     34.96 %
worker  semicolon                ***     34.80 %
0.00 false positives, when considering a 0.1% risk acceptance (***)
In addition to speeding up the no-op benchmark (semicolon), we’ll also be speeding up the overall performance of requiring Node.js’s builtin library (require-builtins).
Check out the rest of this series to see how we’ll achieve this amazing feat!
Extra: How I made the graphs
All steps were performed on an Amazon EC2 Linux instance running Debian
10. First, I downloaded a bunch of old versions of Node.js using
nodeenv.
for n in 9.11.2 11.15.0 13.14.0 15.14.0 17.9.1 19.9.0; do
  nodeenv $n -n $n &
done
wait
I also built the “main” branch on the day I started this project
(SHA 4b80a7b0c404e) as a comparison point. I did so by creating a
release tarball via CUSTOMTAG=t DISTTYPE=custom make -j$(nproc) binary,
and extracting it to main/.
I used hyperfine to benchmark the runtime. Executing warmup
iterations via --warmup was important to avoid outliers, since
Node.js’s startup is very IO heavy.
hyperfine --export-json timings.json \
-L node_version 9.11.2,11.15.0,13.14.0,15.14.0,17.9.1,19.9.0,main,mine \
--shell=none --warmup 100 './{node_version}/bin/node -e 0'
This exports a timings.json file. Although hyperfine comes with some
utilities for graphing it, I prefer Vega-Lite. Vega-Lite’s builtin
transform functionality is sufficient to convert hyperfine’s format into one
that Vega-Lite can use for graphing:
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "/assets/node-startup-runtime-data.json",
    "format": {"type": "json", "property": "results"}
  },
  "transform": [
    {"flatten": ["times"]},
    {"calculate": "1000*datum.times", "as": "times"}
  ],
  "mark": {"type": "boxplot", "extent": "min-max"},
  "encoding": {
    "x": {
      "field": "parameters.node_version",
      "title": "Node.js version",
      "type": "nominal",
      "sort": [],
      "axis": {"labelAngle": 0}
    },
    "y": {
      "field": "times",
      "type": "quantitative",
      "title": "Time (ms)"
    },
    "color": {
      "title": "Node.js version",
      "field": "parameters.node_version",
      "type": "nominal",
      "sort": []
    }
  },
  "config": {"numberFormat": ".3"},
  "title": {"text": "Startup time of Node.js over the years"},
  "width": "container",
  "height": 500
}
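For readers who would rather preprocess the data than lean on Vega-Lite transforms, the two transform steps (flatten the times array, then convert seconds to milliseconds) can be sketched in plain JavaScript. This is an illustrative equivalent, not part of the original pipeline. The flattenTimings name and the sample numbers are made up, and the sample only mimics the shape of hyperfine's JSON export.

```javascript
// Produce one row per timed run, converting hyperfine's seconds to
// milliseconds. Mirrors the "flatten" and "calculate" transform steps.
function flattenTimings(hyperfineJson) {
  const flattened = [];
  for (const result of hyperfineJson.results) {
    for (const seconds of result.times) {
      flattened.push({
        node_version: result.parameters.node_version,
        times: 1000 * seconds,
      });
    }
  }
  return flattened;
}

// Made-up sample in the shape of a hyperfine export; values are illustrative.
const sampleTimings = {
  results: [
    {
      command: './main/bin/node -e 0',
      parameters: { node_version: 'main' },
      times: [0.04, 0.05],
    },
    {
      command: './mine/bin/node -e 0',
      parameters: { node_version: 'mine' },
      times: [0.03],
    },
  ],
};

console.log(flattenTimings(sampleTimings));
```

Each output row carries its own node_version, which is what lets a boxplot group the individual runs by version.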
To measure memory usage, I decided to look at “unique set size” (USS). Unique
set size is a measure of how much memory an individual process adds, i.e.
excluding any memory shared by any other process. Resident set size (RSS) is
also interesting, but it includes all the process’s memory, a lot of which
will be shared (like shared libraries, the node binary itself, etc.), so it’s
not as meaningful for our purposes. USS is measured by smem, but
actually collecting the data required some ugly Bash:
for i in {1..30}; do
  for n in 9.11.2 11.15.0 13.14.0 15.14.0 17.9.1 19.9.0 main mine; do
    # Fire off two node processes, so that the executable and
    # shared libraries will be shared.
    $n/bin/node --expose-gc -e 'gc(),gc();while(1);' &
    $n/bin/node --expose-gc -e 'gc(),gc();while(1);' &
    # Delay until startup/GCs hopefully finish.
    sleep 1
    # Collect the memory used by all processes.
    smem | tee -a $n/smem_results
    # Kill the node processes.
    kill $(jobs -p)
  done
done
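To illustrate what “unique” means in USS, here is a rough sketch that sums the private pages listed in a Linux /proc/<pid>/smaps file. It assumes USS corresponds to Private_Clean plus Private_Dirty, which is my understanding of the usual definition rather than something verified against smem's source. The ussFromSmaps name and the sample text are made up for illustration.

```javascript
// Sum the Private_Clean and Private_Dirty fields of smaps text, in kB.
// Assumption: this sum approximates the unique set size.
function ussFromSmaps(smapsText) {
  let totalKb = 0;
  for (const line of smapsText.split('\n')) {
    const match = /^Private_(?:Clean|Dirty):\s+(\d+) kB$/.exec(line.trim());
    if (match) totalKb += Number(match[1]);
  }
  return totalKb;
}

// Made-up excerpt with two mappings: one fully shared, one fully private.
const sampleSmaps = [
  '55d0c0a00000-55d0c0a01000 r--p 00000000 08:01 123 /usr/bin/node',
  'Rss:                   4 kB',
  'Shared_Clean:          4 kB',
  'Shared_Dirty:          0 kB',
  'Private_Clean:         0 kB',
  'Private_Dirty:         0 kB',
  '7f0000000000-7f0000100000 rw-p 00000000 00:00 0',
  'Rss:                 512 kB',
  'Shared_Clean:          0 kB',
  'Shared_Dirty:          0 kB',
  'Private_Clean:        12 kB',
  'Private_Dirty:       500 kB',
].join('\n');

console.log(ussFromSmaps(sampleSmaps)); // prints 512

// On Linux, the current process could be inspected with:
//   ussFromSmaps(require('fs').readFileSync('/proc/self/smaps', 'utf8'))
```

The shared mapping contributes 4 kB to RSS but nothing to the private total, which is why launching two node processes matters: pages both processes map stop counting as private.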
And now some extra ugly Bash to convert it into JSON. Note that each smem
call gives us two node processes, so we use paste/awk to select the one
which has a larger USS.
for n in 9.11.2 11.15.0 13.14.0 15.14.0 17.9.1 19.9.0 main mine; do
  printf '{"node_version":"%s","memory":[%s]},\n' $n \
    $(< $n/smem_results grep /node \
      | awk '{print $(NF-2)}' \
      | paste - - | awk '{print ($1>$2?$1:$2)}' \
      | paste -sd,)
done
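The same extraction can be sketched in JavaScript, which may be easier to follow than the pipeline above. This is an illustrative alternative, not what produced the data. The ussPerRun name and the sample smem output are made up, and the code assumes smem's default column order, where USS is the third column from the end.

```javascript
// Extract one USS value (kB) per smem call from concatenated smem output.
function ussPerRun(smemText) {
  const uss = smemText
    .split('\n')
    .filter((line) => line.includes('/node'))
    .map((line) => {
      const fields = line.trim().split(/\s+/);
      // Same field that awk selects with $(NF-2).
      return Number(fields[fields.length - 3]);
    });
  // Each smem call saw two node processes; keep the larger USS of each pair.
  const perRun = [];
  for (let i = 0; i + 1 < uss.length; i += 2) {
    perRun.push(Math.max(uss[i], uss[i + 1]));
  }
  return perRun;
}

// Made-up output from two smem calls; the numbers are illustrative only.
const sampleSmem = [
  '  PID User     Command                         Swap      USS      PSS      RSS',
  ' 1001 admin    main/bin/node --expose-gc -e        0    20100    21500    41000',
  ' 1002 admin    main/bin/node --expose-gc -e        0    20400    21800    41200',
  '  PID User     Command                         Swap      USS      PSS      RSS',
  ' 1003 admin    main/bin/node --expose-gc -e        0    20300    21700    41100',
  ' 1004 admin    main/bin/node --expose-gc -e        0    20250    21650    41050',
].join('\n');

console.log(JSON.stringify({ node_version: 'main', memory: ussPerRun(sampleSmem) }));
// prints {"node_version":"main","memory":[20400,20300]}
```

Counting fields from the end of the line matters because the command column itself contains spaces.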
And finally, the data goes into Vega-Lite to make a pretty chart.
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "/assets/node-startup-memory-data.json",
    "format": {"type": "json"}
  },
  "transform": [
    {"flatten": ["memory"]},
    {"calculate": "datum.memory/1024", "as": "memory"}
  ],
  "mark": {"type": "boxplot", "extent": "min-max"},
  "encoding": {
    "x": {
      "field": "node_version",
      "title": "Node.js version",
      "type": "nominal",
      "sort": [],
      "axis": {"labelAngle": 0}
    },
    "y": {
      "field": "memory",
      "type": "quantitative",
      "title": "Unique Set Size (MiB)"
    },
    "color": {
      "title": "Node.js version",
      "field": "node_version",
      "type": "nominal",
      "sort": []
    }
  },
  "config": {"numberFormat": ".2"},
  "title": {
    "text": "Unique set size of Node.js over the years"
  },
  "width": "container",
  "height": 500
}