
fs.readFile() performance degradation between 8.x and 10.x #25740

Closed
@Trott

Description

Reported by @zbjornson in #17054. All text below the horizontal line is a copy/paste from their comment on that PR. /ping @davisjam @nodejs/fs

---
The comments above noted that this doesn't appear to cause a significant performance regression, but we're seeing a 7.6-13.5x drop in read throughput between 8.x and 10.x in both the readfile benchmark and our real-world benchmarks that heavily exercise fs.readFile. Based on my troubleshooting, I think it's from this change, but it's possible some other change is responsible.
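For context on what that change does: fs.readFile() now reads large files in multiple smaller chunks rather than in a single read. The following is only a rough sketch of that chunked pattern to make the suspected cost concrete; the chunk size and loop structure here are assumptions for illustration, not Node's actual internals.

```js
const fs = require("fs");

const CHUNK_SIZE = 512 * 1024; // illustrative chunk size, not Node's actual constant

function chunkedRead(fd, size, cb) {
	const data = Buffer.allocUnsafe(size);
	let offset = 0;
	(function next() {
		const length = Math.min(CHUNK_SIZE, size - offset);
		if (length === 0) return cb(null, data);
		// Each chunk is a separate asynchronous fs.read(), so one large file
		// becomes many round trips through the thread pool and event loop.
		fs.read(fd, data, offset, length, offset, (err, bytesRead) => {
			if (err) return cb(err);
			if (bytesRead === 0) return cb(null, data.slice(0, offset)); // unexpected EOF
			offset += bytesRead;
			next();
		});
	})();
}
```

With a 16 MB file and a chunk size in the hundreds of kilobytes, each read turns into dozens of separate asynchronous operations instead of one.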

The readfile benchmark (Ubuntu 16):

| Test | v8.15.0 | v10.15.0 | 8 ÷ 10 |
| --- | --- | --- | --- |
| concurrent=1 len=1024 | 6,661 | 7,066 | 1.06x |
| concurrent=10 len=1024 | 23,100 | 21,079 | 0.91x |
| concurrent=1 len=16777216 | 156.6 | 11.6 | 13.5x |
| concurrent=10 len=16777216 | 584 | 76.6 | 7.6x |

From what I can extract from the comments in this PR, either no degradation or a 3.6-4.8x degradation was expected for the len=16M cases.

As for why I think it's because of this change, the benchmark below compares fs.readFile against a simple version of how fs.readFile used to work (one-shot read), measuring time to read the same 16 MB file 50 times.

```js
// npm i async
const fs = require("fs");
const async = require("async");

// Current behavior: fs.readFile(), which reads the file in chunks on 10.x.
function chunked(filename, cb) {
	fs.readFile(filename, cb);
}

// One-shot read, roughly how fs.readFile() worked on 8.x: stat for the size,
// then a single fs.read() of the whole file.
function oneshot(filename, cb) {
	// shoddy implementation -- leaks fd in case of errors
	fs.open(filename, "r", 0o666, (err, fd) => {
		if (err) return cb(err);
		fs.fstat(fd, (err, stats) => {
			if (err) return cb(err);
			const data = Buffer.allocUnsafe(stats.size);
			fs.read(fd, data, 0, stats.size, 0, (err, bytesRead) => {
				if (err) return cb(err);
				fs.close(fd, err => {
					cb(err, data);
				});
			});
		});
	});
}

// Create a 16 MB test file.
fs.writeFileSync("./test.dat", Buffer.alloc(16e6, "x"));

// Read the file 50 times in series and report the total elapsed time in ms.
function bm(method, name, cb) {
	const start = Date.now();
	async.timesSeries(50, (n, next) => {
		method("./test.dat", next);
	}, err => {
		if (err) return cb(err);
		const diff = Date.now() - start;
		console.log(name, diff);
		cb();
	});
}

async.series([
	cb => bm(chunked, "fs.readFile()", cb),
	cb => bm(oneshot, "oneshot", cb)
]);
```

| Node.js | OS | fs.readFile() (ms) | one-shot (ms) |
| --- | --- | --- | --- |
| v10.15.0 | Ubuntu 16 | 7320 | 370 |
| v8.15.0 | Ubuntu 16 | 693 | 378 |
| v10.15.0 | Win64 | 2972 | 493 |

We've switched to fs.fstat() and then fs.read() (similar to the above) as a workaround, but I wouldn't be surprised if this has also negatively impacted other apps/tools. As for the original justification: web servers aside, other sorts of apps such as build tools and compilers (for which DoS attacks are irrelevant) often need to read an entire file as fast as possible, and typically aren't concerned with concurrency.
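For reference, here is a minimal sketch of that workaround written with fs.promises (available since Node 10), so the descriptor is also closed on error paths, unlike the quick oneshot() above. The function name is ours, not an existing API.

```js
// Workaround sketch: one stat() for the size, one read() for the data,
// and the fd is always closed, even if stat or read throws.
const fsp = require("fs").promises;

async function readFileOneShot(filename) {
	const handle = await fsp.open(filename, "r");
	try {
		const { size } = await handle.stat();
		const data = Buffer.allocUnsafe(size);
		await handle.read(data, 0, size, 0);
		return data;
	} finally {
		await handle.close();
	}
}
```

Calling readFileOneShot("./test.dat") on the benchmark file above returns a single Buffer from one read, matching the 8.x one-shot behavior.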

Is anyone else able to verify that this degradation exists and/or was expected?

Originally posted by @zbjornson in #17054 (comment)
