Skip to content

Commit 9fa1562

Browse files
committed
biolatency
1 parent 74016c3 commit 9fa1562

File tree

4 files changed

+370
-0
lines changed

4 files changed

+370
-0
lines changed

README.md

+1
Original file line numberDiff line numberDiff line change
@@ -64,6 +64,7 @@ Examples:
6464

6565
Tools:
6666

67+
- tools/[biolatency](tools/biolatency): Summarize block device I/O latency as a histogram. [Examples](tools/biolatency_example.txt).
6768
- tools/[biosnoop](tools/biosnoop): Trace block device I/O with PID and latency. [Examples](tools/biosnoop_example.txt).
6869
- tools/[funccount](tools/funccount): Count kernel function calls. [Examples](tools/funccount_example.txt).
6970
- tools/[funclatency](tools/funclatency): Time kernel functions and show their latency distribution. [Examples](tools/funclatency_example.txt).

man/man8/biolatency.8

+89
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,89 @@
1+
.TH biolatency 8 "2015-08-20" "USER COMMANDS"
2+
.SH NAME
3+
biolatency \- Summarize block device I/O latency as a histogram.
4+
.SH SYNOPSIS
5+
.B biolatency [\-h] [\-T] [\-Q] [\-m] [interval [count]]
6+
.SH DESCRIPTION
7+
biolatency traces block device I/O (disk I/O), and records the distribution
8+
of I/O latency (time). This is printed as a histogram either on Ctrl-C, or
9+
after a given interval in seconds.
10+
11+
The latency of the disk I/O is measured from the issue to the device to its
12+
completion. A \-Q option can be used to include time queued in the kernel.
13+
14+
This tool uses in-kernel eBPF maps for storing timestamps and the histogram,
15+
for efficiency.
16+
17+
This works by tracing various kernel blk_*() functions using dynamic tracing,
18+
and will need updating to match any changes to these functions.
19+
20+
Since this uses BPF, only the root user can use this tool.
21+
.SH REQUIREMENTS
22+
CONFIG_BPF and bcc.
23+
.SH OPTIONS
24+
\-h
25+
Print usage message.
26+
.TP
27+
\-T
28+
Include timestamps on output.
29+
.TP
30+
\-m
31+
Output histogram in milliseconds.
32+
.TP
33+
interval
34+
Output interval, in seconds.
35+
.TP
36+
count
37+
Number of outputs.
38+
.SH EXAMPLES
39+
.TP
40+
Summarize block device I/O latency as a histogram:
41+
#
42+
.B biolatency
43+
.TP
44+
Print 1 second summaries, 10 times:
45+
#
46+
.B biolatency 1 10
47+
.TP
48+
Print 1 second summaries, using milliseconds as units for the histogram, and
49+
include timestamps on output:
50+
#
51+
.B biolatency \-mT 1
52+
.TP
53+
Include OS queued time in I/O time:
54+
#
55+
.B biolatency \-Q
56+
.SH FIELDS
57+
.TP
58+
usecs
59+
Microsecond range
60+
.TP
61+
mecs
62+
Millisecond range
63+
.TP
64+
count
65+
How many I/O fell into this range
66+
.TP
67+
distribution
68+
An ASCII bar chart to visualize the distribution (count column)
69+
.SH OVERHEAD
70+
This traces kernel functions and maintains in-kernel timestamps and a histgroam,
71+
which are asynchronously copied to user-space. This method is very efficient,
72+
and the overhead for most storage I/O rates (< 10k IOPS) should be negligible.
73+
If you have a higher IOPS storage environment, test and quantify the overhead
74+
before use.
75+
.SH SOURCE
76+
This is from bcc.
77+
.IP
78+
https://github.com/iovisor/bcc
79+
.PP
80+
Also look in the bcc distribution for a companion _examples.txt file containing
81+
example usage, output, and commentary for this tool.
82+
.SH OS
83+
Linux
84+
.SH STABILITY
85+
Unstable - in development.
86+
.SH AUTHOR
87+
Brendan Gregg
88+
.SH SEE ALSO
89+
biosnoop(8)

tools/biolatency

+120
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,120 @@
1+
#!/usr/bin/python
2+
#
3+
# biolatency Summarize block device I/O latency as a histogram.
4+
# For Linux, uses BCC, eBPF.
5+
#
6+
# USAGE: biolatency [-h] [-T] [-Q] [-m] [interval] [count]
7+
#
8+
# Copyright (c) 2015 Brendan Gregg.
9+
# Licensed under the Apache License, Version 2.0 (the "License")
10+
#
11+
# 20-Sep-2015 Brendan Gregg Created this.
12+
13+
from __future__ import print_function
14+
from bcc import BPF
15+
from time import sleep, strftime
16+
import argparse
17+
18+
# arguments
19+
examples = """examples:
20+
./biolatency # summarize block I/O latency as a histogram
21+
./biolatency 1 10 # print 1 second summaries, 10 times
22+
./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps
23+
./biolatency -Q # include OS queued time in I/O time
24+
"""
25+
parser = argparse.ArgumentParser(
26+
description="Summarize block device I/O latency as a histogram",
27+
formatter_class=argparse.RawDescriptionHelpFormatter,
28+
epilog=examples)
29+
parser.add_argument("-T", "--timestamp", action="store_true",
30+
help="include timestamp on output")
31+
parser.add_argument("-Q", "--queued", action="store_true",
32+
help="include OS queued time in I/O time")
33+
parser.add_argument("-m", "--milliseconds", action="store_true",
34+
help="millisecond histogram")
35+
parser.add_argument("interval", nargs="?", default=99999999,
36+
help="output interval, in seconds")
37+
parser.add_argument("count", nargs="?", default=99999999,
38+
help="number of outputs")
39+
args = parser.parse_args()
40+
countdown = int(args.count)
41+
debug = 0
42+
43+
# load BPF program
44+
bpf_text = """
45+
#include <uapi/linux/ptrace.h>
46+
#include <linux/blkdev.h>
47+
48+
BPF_TABLE(\"array\", int, u64, dist, 64);
49+
BPF_HASH(start, struct request *);
50+
51+
// time block I/O
52+
int trace_req_start(struct pt_regs *ctx, struct request *req)
53+
{
54+
u64 ts = bpf_ktime_get_ns();
55+
start.update(&req, &ts);
56+
return 0;
57+
}
58+
59+
// output
60+
int trace_req_completion(struct pt_regs *ctx, struct request *req)
61+
{
62+
u64 *tsp, delta;
63+
64+
// fetch timestamp and calculate delta
65+
tsp = start.lookup(&req);
66+
if (tsp == 0) {
67+
return 0; // missed issue
68+
}
69+
delta = bpf_ktime_get_ns() - *tsp;
70+
FACTOR
71+
72+
// store as histogram
73+
int index = bpf_log2l(delta);
74+
u64 *leaf = dist.lookup(&index);
75+
if (leaf) (*leaf)++;
76+
77+
start.delete(&req);
78+
return 0;
79+
}
80+
"""
81+
if args.milliseconds:
82+
bpf_text = bpf_text.replace('FACTOR', 'delta /= 1000000;')
83+
label = "msecs"
84+
else:
85+
bpf_text = bpf_text.replace('FACTOR', 'delta /= 1000;')
86+
label = "usecs"
87+
if debug:
88+
print(bpf_text)
89+
90+
# load BPF program
91+
b = BPF(text=bpf_text)
92+
if args.queued:
93+
b.attach_kprobe(event="blk_account_io_start", fn_name="trace_req_start")
94+
else:
95+
b.attach_kprobe(event="blk_start_request", fn_name="trace_req_start")
96+
b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
97+
b.attach_kprobe(event="blk_account_io_completion",
98+
fn_name="trace_req_completion")
99+
100+
print("Tracing block device I/O... Hit Ctrl-C to end.")
101+
102+
# output
103+
exiting = 0 if args.interval else 1
104+
dist = b.get_table("dist")
105+
while (1):
106+
try:
107+
sleep(int(args.interval))
108+
except KeyboardInterrupt:
109+
exiting=1
110+
111+
print()
112+
if args.timestamp:
113+
print("%-8s\n" % strftime("%H:%M:%S"), end="")
114+
115+
dist.print_log2_hist(label)
116+
dist.clear()
117+
118+
countdown -= 1
119+
if exiting or countdown == 0:
120+
exit()

tools/biolatency_example.txt

+160
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,160 @@
1+
Demonstrations of biolatency, the Linux eBPF/bcc version.
2+
3+
4+
biolatency traces block device I/O (disk I/O), and records the distribution
5+
of I/O latency (time), printing this as a histogram when Ctrl-C is hit.
6+
For example:
7+
8+
# ./biolatency
9+
Tracing block device I/O... Hit Ctrl-C to end.
10+
^C
11+
usecs : count distribution
12+
0 -> 1 : 0 | |
13+
2 -> 3 : 0 | |
14+
4 -> 7 : 0 | |
15+
8 -> 15 : 0 | |
16+
16 -> 31 : 0 | |
17+
32 -> 63 : 0 | |
18+
64 -> 127 : 1 | |
19+
128 -> 255 : 12 |******** |
20+
256 -> 511 : 15 |********** |
21+
512 -> 1023 : 43 |******************************* |
22+
1024 -> 2047 : 52 |**************************************|
23+
2048 -> 4095 : 47 |********************************** |
24+
4096 -> 8191 : 52 |**************************************|
25+
8192 -> 16383 : 36 |************************** |
26+
16384 -> 32767 : 15 |********** |
27+
32768 -> 65535 : 2 |* |
28+
65536 -> 131071 : 2 |* |
29+
30+
The latency of the disk I/O is measured from the issue to the device to its
31+
completion. A -Q option can be used to include time queued in the kernel.
32+
33+
This example output shows a large mode of latency from about 128 microseconds
34+
to about 32767 microseconds (33 milliseconds). The bulk of the I/O was
35+
between 1 and 8 ms, which is the expected block device latency for
36+
rotational storage devices.
37+
38+
The highest latency seen while tracing was between 65 and 131 milliseconds:
39+
the last row printed, for which there were 2 I/O.
40+
41+
For efficiency, biolatency uses an in-kernel eBPF map to store timestamps
42+
with requests, and another in-kernel map to store the histogram (the "count")
43+
column, which is copied to user-space only when output is printed. These
44+
methods lower the perormance overhead when tracing is performed.
45+
46+
47+
In the following example, the -m option is used to print a histogram using
48+
milliseconds as the units (which eliminates the first several rows), -T to
49+
print timestamps with the output, and to print 1 second summaries 5 times:
50+
51+
# ./biolatency -mT 1 5
52+
Tracing block device I/O... Hit Ctrl-C to end.
53+
54+
06:20:16
55+
msecs : count distribution
56+
0 -> 1 : 36 |**************************************|
57+
2 -> 3 : 1 |* |
58+
4 -> 7 : 3 |*** |
59+
8 -> 15 : 17 |***************** |
60+
16 -> 31 : 33 |********************************** |
61+
32 -> 63 : 7 |******* |
62+
64 -> 127 : 6 |****** |
63+
64+
06:20:17
65+
msecs : count distribution
66+
0 -> 1 : 96 |************************************ |
67+
2 -> 3 : 25 |********* |
68+
4 -> 7 : 29 |*********** |
69+
8 -> 15 : 62 |*********************** |
70+
16 -> 31 : 100 |**************************************|
71+
32 -> 63 : 62 |*********************** |
72+
64 -> 127 : 18 |****** |
73+
74+
06:20:18
75+
msecs : count distribution
76+
0 -> 1 : 68 |************************* |
77+
2 -> 3 : 76 |**************************** |
78+
4 -> 7 : 20 |******* |
79+
8 -> 15 : 48 |***************** |
80+
16 -> 31 : 103 |**************************************|
81+
32 -> 63 : 49 |****************** |
82+
64 -> 127 : 17 |****** |
83+
84+
06:20:19
85+
msecs : count distribution
86+
0 -> 1 : 522 |*************************************+|
87+
2 -> 3 : 225 |**************** |
88+
4 -> 7 : 38 |** |
89+
8 -> 15 : 8 | |
90+
16 -> 31 : 1 | |
91+
92+
06:20:20
93+
msecs : count distribution
94+
0 -> 1 : 436 |**************************************|
95+
2 -> 3 : 106 |********* |
96+
4 -> 7 : 34 |** |
97+
8 -> 15 : 19 |* |
98+
16 -> 31 : 1 | |
99+
100+
How the I/O latency distribution changes over time can be seen.
101+
102+
103+
104+
The -Q option begins measuring I/O latency from when the request was first
105+
queued in the kernel, and includes queuing latency:
106+
107+
# ./biolatency -Q
108+
Tracing block device I/O... Hit Ctrl-C to end.
109+
^C
110+
usecs : count distribution
111+
0 -> 1 : 0 | |
112+
2 -> 3 : 0 | |
113+
4 -> 7 : 0 | |
114+
8 -> 15 : 0 | |
115+
16 -> 31 : 0 | |
116+
32 -> 63 : 0 | |
117+
64 -> 127 : 0 | |
118+
128 -> 255 : 3 |* |
119+
256 -> 511 : 37 |************** |
120+
512 -> 1023 : 30 |*********** |
121+
1024 -> 2047 : 18 |******* |
122+
2048 -> 4095 : 22 |******** |
123+
4096 -> 8191 : 14 |***** |
124+
8192 -> 16383 : 48 |******************* |
125+
16384 -> 32767 : 96 |**************************************|
126+
32768 -> 65535 : 31 |************ |
127+
65536 -> 131071 : 26 |********** |
128+
131072 -> 262143 : 12 |**** |
129+
130+
This better reflects the latency suffered by the application (if it is
131+
synchronous I/O), whereas the default mode without kernel queueing better
132+
reflects the performance of the device.
133+
134+
Note that the storage device (and storage device controller) usually have
135+
queues of their own, which are always included in the latency, with or
136+
without -Q.
137+
138+
139+
USAGE message:
140+
141+
# ./biolatency -h
142+
usage: biolatency [-h] [-T] [-Q] [-m] [interval] [count]
143+
144+
Summarize block device I/O latency as a histogram
145+
146+
positional arguments:
147+
interval output interval, in seconds
148+
count number of outputs
149+
150+
optional arguments:
151+
-h, --help show this help message and exit
152+
-T, --timestamp include timestamp on output
153+
-Q, --queued include OS queued time in I/O time
154+
-m, --milliseconds millisecond histogram
155+
156+
examples:
157+
./biolatency # summarize block I/O latency as a histogram
158+
./biolatency 1 10 # print 1 second summaries, 10 times
159+
./biolatency -mT 1 # 1s summaries, milliseconds, and timestamps
160+
./biolatency -Q # include OS queued time in I/O time

0 commit comments

Comments
 (0)