Commit b1c2964
crates/ignore: switch to depth first traversal
This replaces the use of channels in the parallel directory traversal with a simple stack. The primary motivation for this change is to reduce peak memory usage.

In particular, when using a channel (which is a queue), we wind up visiting files in a breadth first fashion. Using a stack switches us to a depth first traversal. While there are no real intrinsic differences between the two, depth first traversal generally tends to use less memory because directory trees are more commonly wide than they are deep.

The queue/stack size itself is not the only concern, though. In one recent case documented in #1550, a user wanted to search all Rust crates. The directory structure was shallow but extremely wide, with a single directory containing all crates. This in turn results in descending into each of those directories and building a gitignore matcher for each (since most crates have `.gitignore` files) before ever searching a single file. This means that ripgrep has all such matchers in memory simultaneously, which winds up using quite a bit of memory.

In a depth first traversal, peak memory usage is much lower because gitignore matchers are built and discarded more quickly. In the case of searching all crates, the decrease in peak memory usage is dramatic. On my system, it shrinks by an order of magnitude, from almost 1GB to 50MB. The decline in peak memory usage is consistent across other use cases as well, but is typically more modest. For example, searching the Linux repo sees a 50% decrease in peak memory usage, and searching the Chromium repo sees a 25% decrease.

Search times generally remain unchanged, although some ad hoc benchmarks that I typically run have gotten a bit slower. As far as I can tell, this appears to be a result of scheduling changes. Namely, the depth first traversal seems to result in searching some very large files towards the end of the search, which reduces the effectiveness of parallelism and makes the overall search take longer. This suggests that a stack isn't optimal. It would perhaps be better to prioritize searching larger files first, but it's not clear how to do that without introducing more overhead (getting the file size for each file requires a stat call).

Fixes #1550
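To make the ordering difference concrete, here is a minimal, single-threaded sketch. This is not the code in this commit (the real change lives in ignore's parallel walker below, and `walk_bfs`/`walk_dfs` are hypothetical names used only for illustration); it merely shows that swapping a FIFO queue for a LIFO stack is precisely the breadth first/depth first switch described above:

```rust
use std::collections::VecDeque;
use std::fs;
use std::io;
use std::path::PathBuf;

// Breadth first: pending paths come off the front of a FIFO queue, so every
// directory at one level is expanded before any directory below it.
fn walk_bfs(root: PathBuf) -> io::Result<Vec<PathBuf>> {
    let mut queue = VecDeque::from(vec![root]);
    let mut visited = Vec::new();
    while let Some(path) = queue.pop_front() {
        if path.is_dir() {
            for entry in fs::read_dir(&path)? {
                queue.push_back(entry?.path());
            }
        }
        visited.push(path);
    }
    Ok(visited)
}

// Depth first: pending paths come off the back of a LIFO stack, so one
// subtree is fully finished before its next sibling is started.
fn walk_dfs(root: PathBuf) -> io::Result<Vec<PathBuf>> {
    let mut stack = vec![root];
    let mut visited = Vec::new();
    while let Some(path) = stack.pop() {
        if path.is_dir() {
            for entry in fs::read_dir(&path)? {
                stack.push(entry?.path());
            }
        }
        visited.push(path);
    }
    Ok(visited)
}

fn main() -> io::Result<()> {
    println!("breadth first: {} entries", walk_bfs(PathBuf::from("."))?.len());
    println!("depth first: {} entries", walk_dfs(PathBuf::from("."))?.len());
    Ok(())
}
```

Note that both containers momentarily hold all of a directory's children, so the win is about ordering rather than container size alone: depth first finishes one subtree, and can therefore drop its per-directory state such as a gitignore matcher, before touching the next sibling, whereas breadth first builds that state for every sibling before searching a single file.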
1 parent afb325f · commit b1c2964

3 files changed (+57, -43 lines)

crates/ignore/Cargo.toml (+3, -1)

@@ -18,7 +18,6 @@ name = "ignore"
 bench = false
 
 [dependencies]
-crossbeam-channel = "0.4.0"
 crossbeam-utils = "0.7.0"
 globset = { version = "0.4.3", path = "../globset" }
 lazy_static = "1.1"
@@ -32,5 +31,8 @@ walkdir = "2.2.7"
 
 [target.'cfg(windows)'.dependencies.winapi-util]
 version = "0.1.2"
 
+[dev-dependencies]
+crossbeam-channel = "0.4.0"
+
 [features]
 simd-accel = ["globset/simd-accel"]

crates/ignore/src/lib.rs (-1)

@@ -46,7 +46,6 @@ See the documentation for `WalkBuilder` for many other options.
 
 #![deny(missing_docs)]
 
-extern crate crossbeam_channel as channel;
 extern crate globset;
 #[macro_use]
 extern crate lazy_static;

crates/ignore/src/walk.rs (+54, -41)

@@ -5,10 +5,11 @@ use std::fs::{self, FileType, Metadata};
 use std::io;
 use std::path::{Path, PathBuf};
 use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
-use std::sync::Arc;
+use std::sync::{Arc, Mutex};
+use std::thread;
+use std::time::Duration;
 use std::vec;
 
-use channel::{self, TryRecvError};
 use same_file::Handle;
 use walkdir::{self, WalkDir};
 
@@ -364,7 +365,8 @@ impl DirEntryRaw {
         })
     }
 
-    // Placeholder implementation to allow compiling on non-standard platforms (e.g. wasm32).
+    // Placeholder implementation to allow compiling on non-standard platforms
+    // (e.g. wasm32).
     #[cfg(not(any(windows, unix)))]
     fn from_entry_os(
         depth: usize,
@@ -413,7 +415,8 @@ impl DirEntryRaw {
         })
     }
 
-    // Placeholder implementation to allow compiling on non-standard platforms (e.g. wasm32).
+    // Placeholder implementation to allow compiling on non-standard platforms
+    // (e.g. wasm32).
     #[cfg(not(any(windows, unix)))]
     fn from_path(
         depth: usize,
@@ -1186,16 +1189,9 @@ impl WalkParallel {
     /// can be merged together into a single data structure.
     pub fn visit(mut self, builder: &mut dyn ParallelVisitorBuilder) {
         let threads = self.threads();
-        // TODO: Figure out how to use a bounded channel here. With an
-        // unbounded channel, the workers can run away and fill up memory
-        // with all of the file paths. But a bounded channel doesn't work since
-        // our producers are also are consumers, so they end up getting stuck.
-        //
-        // We probably need to rethink parallel traversal completely to fix
-        // this. The best case scenario would be finding a way to use rayon
-        // to do this.
-        let (tx, rx) = channel::unbounded();
+        let stack = Arc::new(Mutex::new(vec![]));
         {
+            let mut stack = stack.lock().unwrap();
             let mut visitor = builder.build();
             let mut paths = Vec::new().into_iter();
             std::mem::swap(&mut paths, &mut self.paths);
@@ -1232,28 +1228,27 @@
                         }
                     }
                 };
-                tx.send(Message::Work(Work {
+                stack.push(Message::Work(Work {
                     dent: dent,
                     ignore: self.ig_root.clone(),
                     root_device: root_device,
-                }))
-                .unwrap();
+                }));
             }
             // ... but there's no need to start workers if we don't need them.
-            if tx.is_empty() {
+            if stack.is_empty() {
                 return;
             }
         }
         // Create the workers and then wait for them to finish.
         let quit_now = Arc::new(AtomicBool::new(false));
-        let num_pending = Arc::new(AtomicUsize::new(tx.len()));
+        let num_pending =
+            Arc::new(AtomicUsize::new(stack.lock().unwrap().len()));
         crossbeam_utils::thread::scope(|s| {
             let mut handles = vec![];
             for _ in 0..threads {
                 let worker = Worker {
                     visitor: builder.build(),
-                    tx: tx.clone(),
-                    rx: rx.clone(),
+                    stack: stack.clone(),
                     quit_now: quit_now.clone(),
                     num_pending: num_pending.clone(),
                     max_depth: self.max_depth,
@@ -1263,8 +1258,6 @@
                 };
                 handles.push(s.spawn(|_| worker.run()));
             }
-            drop(tx);
-            drop(rx);
             for handle in handles {
                 handle.join().unwrap();
             }
@@ -1362,10 +1355,13 @@ impl Work {
 struct Worker<'s> {
     /// The caller's callback.
     visitor: Box<dyn ParallelVisitor + 's>,
-    /// The push side of our mpmc queue.
-    tx: channel::Sender<Message>,
-    /// The receive side of our mpmc queue.
-    rx: channel::Receiver<Message>,
+    /// A stack of work to do.
+    ///
+    /// We use a stack instead of a channel because a stack lets us visit
+    /// directories in depth first order. This can substantially reduce peak
+    /// memory usage by keeping both the number of file paths and gitignore
+    /// matchers in memory lower.
+    stack: Arc<Mutex<Vec<Message>>>,
     /// Whether all workers should terminate at the next opportunity. Note
     /// that we need this because we don't want other `Work` to be done after
     /// we quit. We wouldn't need this if we had a priority channel.
@@ -1550,25 +1546,25 @@ impl<'s> Worker<'s> {
     /// If all work has been exhausted, then this returns None. The worker
     /// should then subsequently quit.
     fn get_work(&mut self) -> Option<Work> {
-        let mut value = self.rx.try_recv();
+        let mut value = self.recv();
         loop {
             // Simulate a priority channel: If quit_now flag is set, we can
             // receive only quit messages.
             if self.is_quit_now() {
-                value = Ok(Message::Quit)
+                value = Some(Message::Quit)
             }
             match value {
-                Ok(Message::Work(work)) => {
+                Some(Message::Work(work)) => {
                     return Some(work);
                 }
-                Ok(Message::Quit) => {
+                Some(Message::Quit) => {
                     // Repeat quit message to wake up sleeping threads, if
                     // any. The domino effect will ensure that every thread
                     // will quit.
-                    self.tx.send(Message::Quit).unwrap();
+                    self.send_quit();
                     return None;
                 }
-                Err(TryRecvError::Empty) => {
+                None => {
                     // Once num_pending reaches 0, it is impossible for it to
                     // ever increase again. Namely, it only reaches 0 once
                     // all jobs have run such that no jobs have produced more
@@ -1580,17 +1576,21 @@
                     if self.num_pending() == 0 {
                         // Every other thread is blocked at the next recv().
                         // Send the initial quit message and quit.
-                        self.tx.send(Message::Quit).unwrap();
+                        self.send_quit();
                         return None;
                     }
                     // Wait for next `Work` or `Quit` message.
-                    value = Ok(self
-                        .rx
-                        .recv()
-                        .expect("channel disconnected while worker is alive"));
-                }
-                Err(TryRecvError::Disconnected) => {
-                    unreachable!("channel disconnected while worker is alive");
+                    loop {
+                        if let Some(v) = self.recv() {
+                            value = Some(v);
+                            break;
+                        }
+                        // Our stack isn't blocking. Instead of burning the
+                        // CPU waiting, we let the thread sleep for a bit. In
+                        // general, this tends to only occur once the search is
+                        // approaching termination.
+                        thread::sleep(Duration::from_millis(1));
+                    }
                 }
             }
         }
@@ -1614,7 +1614,20 @@ impl<'s> Worker<'s> {
     /// Send work.
     fn send(&self, work: Work) {
        self.num_pending.fetch_add(1, Ordering::SeqCst);
-        self.tx.send(Message::Work(work)).unwrap();
+        let mut stack = self.stack.lock().unwrap();
+        stack.push(Message::Work(work));
+    }
+
+    /// Send a quit message.
+    fn send_quit(&self) {
+        let mut stack = self.stack.lock().unwrap();
+        stack.push(Message::Quit);
+    }
+
+    /// Receive work.
+    fn recv(&self) -> Option<Message> {
+        let mut stack = self.stack.lock().unwrap();
+        stack.pop()
     }
 
     /// Signal that work has been received.
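Taken together, the diff replaces a blocking mpmc channel with a shared `Arc<Mutex<Vec<Message>>>` plus an atomic count of outstanding work. Below is a condensed, self-contained sketch of that termination protocol, under stated assumptions: the `u64` payload stands in for the real `Work` struct, and the fan-out of child work is faked. It is not the crate's code, but it shows how `num_pending`, the repeated `Quit` message, and the sleep-based polling cooperate so that every worker exits:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

enum Message {
    Work(u64), // stand-in for the real `Work` (dir entry + ignore matcher)
    Quit,
}

fn main() {
    // One shared stack seeded with a root unit, and a count of outstanding
    // work, mirroring the setup in `WalkParallel::visit` above.
    let stack = Arc::new(Mutex::new(vec![Message::Work(0)]));
    let num_pending = Arc::new(AtomicUsize::new(1));
    let mut handles = vec![];
    for _ in 0..4 {
        let (stack, num_pending) = (stack.clone(), num_pending.clone());
        handles.push(thread::spawn(move || loop {
            // A non-blocking pop; the mutex guard drops immediately.
            let msg = stack.lock().unwrap().pop();
            match msg {
                Some(Message::Work(n)) => {
                    // Fake fan-out: small units spawn two children, like a
                    // directory yielding subdirectories. The counter is
                    // bumped *before* each push and decremented only once
                    // this unit is done, so it reaches 0 only when no
                    // further work can ever appear.
                    if n < 20 {
                        let mut stack = stack.lock().unwrap();
                        for child in 1..=2 {
                            num_pending.fetch_add(1, Ordering::SeqCst);
                            stack.push(Message::Work(2 * n + child));
                        }
                    }
                    num_pending.fetch_sub(1, Ordering::SeqCst);
                }
                Some(Message::Quit) => {
                    // Repeat the quit message so every other worker sees
                    // one: the domino effect from `get_work`.
                    stack.lock().unwrap().push(Message::Quit);
                    return;
                }
                None => {
                    if num_pending.load(Ordering::SeqCst) == 0 {
                        // All work is done; start the quit domino and exit.
                        stack.lock().unwrap().push(Message::Quit);
                        return;
                    }
                    // The stack can't block like a channel, so sleep briefly
                    // instead of spinning.
                    thread::sleep(Duration::from_millis(1));
                }
            }
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
}
```

The key invariant matches the patch: `num_pending` is incremented before new work is pushed and decremented only after a unit is fully processed, so a worker that observes zero can safely conclude that no work will ever reappear.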
