Description
Reported by @zbjornson in #17054. All text below the horizontal line below is a copy/paste from their comment on that PR. /ping @davisjam @nodejs/fs
The comments above noted that this doesn't appear to cause a significant performance regression, but we're seeing a 7.6-13.5x drop in read throughput between 8.x and 10.x in both the readfile benchmark and our real-world benchmarks that heavily exercise fs.readFile
. Based on my troubleshooting, I think it's from this change, but it's possible some other change is responsible.
The readfile benchmark (Ubuntu 16):
Test | v8.15.0 | v10.15.0 | 8 ÷ 10 |
---|---|---|---|
concurrent=1 len=1024 | 6,661 | 7,066 | 1.06x |
concurrent=10 len=1024 | 23,100 | 21,079 | 0.91x |
concurrent=1 len=16777216 | 156.6 | 11.6 | 13.5x |
concurrent=10 len=16777216 | 584 | 76.6 | 7.6x |
From what I can extract from the comments in this PR, either no degradation or a 3.6-4.8x degradation was expected for the len=16M cases.
As for why I think it's because of this change, the benchmark below compares fs.readFile
against a simple version of how fs.readFile
used to work (one-shot read), measuring time to read the same 16 MB file 50 times.
// npm i async
const fs = require("fs");
const async = require("async");
function chunked(filename, cb) {
fs.readFile(filename, cb);
}
function oneshot(filename, cb) {
// shoddy implementation -- leaks fd in case of errors
fs.open(filename, "r", 0o666, (err, fd) => {
if (err) return cb(err);
fs.fstat(fd, (err, stats) => {
if (err) return cb(err);
const data = Buffer.allocUnsafe(stats.size);
fs.read(fd, data, 0, stats.size, 0, (err, bytesRead) => {
if (err) return cb(err);
fs.close(fd, err => {
cb(err, data);
});
});
});
});
}
fs.writeFileSync("./test.dat", Buffer.alloc(16e6, 'x'));
function bm(method, name, cb) {
const start = Date.now();
async.timesSeries(50, (n, next) => {
method("./test.dat", next);
}, err => {
if (err) return cb(err);
const diff = Date.now() - start;
console.log(name, diff);
cb();
});
}
async.series([
cb => bm(chunked, "fs.readFile()", cb),
cb => bm(oneshot, "oneshot", cb)
])
Node.js | OS | fs.readFile() (ms) |
one-shot (ms) |
---|---|---|---|
v10.15.0 | Ubuntu 16 | 7320 | 370 |
v8.15.0 | Ubuntu 16 | 693 | 378 |
v10.15.0 | Win64 | 2972 | 493 |
We've switched to fs.fstat()
and then fs.read()
(similar to above) as a work-around, but I wouldn't be surprised if this has also negatively impacted other apps/tools. As far as the original justification: web servers aside, other sorts of apps like build tools and compilers (for which DoS attacks are irrelevant) often need to read an entire file as fast as possible, and furthermore aren't typically concerned with concurrency.
Is anyone else able to verify that this degradation exists and/or was expected?
Originally posted by @zbjornson in #17054 (comment)