Commit 6332a0e

Big Count Collectives Test Suite
* Run collectives with a `count` parameter as close to `INT_MAX` as possible.
  - This test suite often highlights cases where the underlying algorithm assumes that the payload (roughly `count x sizeof(datatype)`) is an `int` when it should be handled as a `size_t`.
* Includes:
  - Test with `int` and `double _Complex` primitive data types
  - Correctness checks
  - Mechanism to control (as best as we can) the amount of memory consumed per node.
* Assumes:
  - Roughly the same amount of memory per node
  - Same number of processes per node
* See `README.md` for details

Signed-off-by: Joshua Hursey <[email protected]>
1 parent 394d50a commit 6332a0e

15 files changed: +3208 -0 lines changed

.gitignore

Lines changed: 12 additions & 0 deletions
@@ -20,3 +20,15 @@ sessions/sessions_test7
 sessions/sessions_test8
 sessions/sessions_test9

+collective-big-count/diagnostic
+collective-big-count/test_allgather
+collective-big-count/test_allgatherv
+collective-big-count/test_allreduce
+collective-big-count/test_alltoall
+collective-big-count/test_bcast
+collective-big-count/test_gather
+collective-big-count/test_gatherv
+collective-big-count/test_reduce
+collective-big-count/test_scatter
+collective-big-count/test_scatterv
+collective-big-count/*_uniform_count

collective-big-count/Makefile

Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
#
# Copyright (c) 2021-2022 IBM Corporation. All rights reserved.
#
# $COPYRIGHT$
#

######################################################################
# Utilities
######################################################################
.PHONY: default help

CC = mpicc
F77 = mpif77
F90 = mpif90
MPIRUN = mpirun
RM = /bin/rm -f

# GCC
CC_FLAGS = -g -O0 -Wall -Werror
# Clang
#CC_FLAGS = -g -O0 -Wall -Wshorten-64-to-32 -Werror
F90_FLAGS =
F77_FLAGS = $(F90_FLAGS)

######################################################################
# TEST_UNIFORM_COUNT: (Default defined here)
# The 'count' size to be used regardless of the datatype
# This should never exceed that of INT_MAX (2147483647) which
# is the maximum count allowed by the MPI Interface in MPI 3
######################################################################
# Test at the limit of INT_MAX : 2147483647
TEST_UNIFORM_COUNT=2147483647

######################################################################
# TEST_PAYLOAD_SIZE: (Default in common.h)
# This value is the total payload size the collective should perform.
# The 'count' is calculated as relative to the datatype size so
# as to target this payload size as closely as possible:
# count = TEST_PAYLOAD_SIZE / sizeof(datatype)
######################################################################
# INT_MAX : == 2 GB so guard will not trip (INT_MAX == 2GB -1byte)
TEST_PAYLOAD_SIZE=2147483647

######################################################################
# Binaries
######################################################################
BINCC = \
	test_alltoall \
	test_allgather test_allgatherv \
	test_allreduce \
	test_bcast \
	test_gather test_gatherv \
	test_reduce \
	test_scatter test_scatterv \
	diagnostic

BIN = $(BINCC)

######################################################################
# Targets
######################################################################
all: $(BIN)

clean:
	$(RM) $(BIN) *.o *_uniform_count *_uniform_payload

diagnostic: common.h diagnostic.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. diagnostic.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. diagnostic.c

test_allgather: common.h test_allgather.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_allgather.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_allgather.c

test_allgatherv: common.h test_allgatherv.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_allgatherv.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_allgatherv.c

test_allreduce: common.h test_allreduce.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_allreduce.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_allreduce.c

test_alltoall: common.h test_alltoall.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_alltoall.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_alltoall.c

test_bcast: common.h test_bcast.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_bcast.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_bcast.c

test_gather: common.h test_gather.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_gather.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_gather.c

test_gatherv: common.h test_gatherv.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_gatherv.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_gatherv.c

test_reduce: common.h test_reduce.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_reduce.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_reduce.c

test_scatter: common.h test_scatter.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_scatter.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_scatter.c

test_scatterv: common.h test_scatterv.c
	$(CC) $(CC_FLAGS) -DTEST_PAYLOAD_SIZE=$(TEST_PAYLOAD_SIZE) -o $@ -I. test_scatterv.c
	$(CC) $(CC_FLAGS) -DTEST_UNIFORM_COUNT=$(TEST_UNIFORM_COUNT) -o $@_uniform_count -I. test_scatterv.c

collective-big-count/README.md

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# Big Count Collectives Tests

This test suite exercises MPI collectives with large **count** payloads. A large payload is defined as:

> total payload size (`count x sizeof(datatype)`) is greater than `UINT_MAX` (4294967295, roughly 4 GB)

| Test Suite | Count | Datatype |
| ---------- | ----- | -------- |
| N/A | small | small |
| BigCount | **LARGE** | small |
| [BigMPI](https://github.com/jeffhammond/BigMPI) | small | **LARGE** |

* Assumes:
  - Roughly the same amount of memory per node
  - Same number of processes per node

## Building

```
make clean
make all
```

## Running

For each unit test, two different binaries are generated:
* `test_FOO` : Run with a total payload size as close to `INT_MAX` as possible relative to the target datatype.
* `test_FOO_uniform_count` : Run with a uniform count regardless of the datatype. Default `count = 2147483647 (INT_MAX)`

Currently, the unit tests use the `int` and `double _Complex` datatypes in the MPI collectives.

```
mpirun --np 8 --map-by ppr:2:node --host host01:2,host02:2,host03:2,host04:2 \
--mca coll basic,inter,libnbc,self ./test_allreduce

mpirun --np 8 --map-by ppr:2:node --host host01:2,host02:2,host03:2,host04:2 \
-x BIGCOUNT_MEMORY_PERCENT=15 -x BIGCOUNT_MEMORY_DIFF=10 \
--mca coll basic,inter,libnbc,self ./test_allreduce

mpirun --np 8 --map-by ppr:2:node --host host01:2,host02:2,host03:2,host04:2 \
-x BIGCOUNT_MEMORY_PERCENT=15 -x BIGCOUNT_MEMORY_DIFF=10 \
--mca coll basic,inter,libnbc,self ./test_allreduce_uniform_count
```

The expected output looks something like the following. Depending on the `BIGCOUNT_MEMORY_PERCENT` environment variable, you might see an `Adjust count to fit in memory` message for a collective as the test harness tries to honor that limit.
```
shell$ mpirun --np 4 --map-by ppr:1:node --host host01,host02,host03,host04 \
-x BIGCOUNT_MEMORY_PERCENT=6 -x BIGCOUNT_MEMORY_DIFF=10 \
--mca coll basic,inter,libnbc,self ./test_allreduce_uniform_count
----------------------:-----------------------------------------
Total Memory Avail. : 567 GB
Percent memory to use : 6 %
Tolerate diff. : 10 GB
Max memory to use : 34 GB
----------------------:-----------------------------------------
INT_MAX : 2147483647
UINT_MAX : 4294967295
SIZE_MAX : 18446744073709551615
----------------------:-----------------------------------------
: Count x Datatype size = Total Bytes
TEST_UNIFORM_COUNT : 2147483647
V_SIZE_DOUBLE_COMPLEX : 2147483647 x 16 = 32.0 GB
V_SIZE_DOUBLE : 2147483647 x 8 = 16.0 GB
V_SIZE_FLOAT_COMPLEX : 2147483647 x 8 = 16.0 GB
V_SIZE_FLOAT : 2147483647 x 4 = 8.0 GB
V_SIZE_INT : 2147483647 x 4 = 8.0 GB
----------------------:-----------------------------------------
---------------------
Results from MPI_Allreduce(int x 2147483647 = 8589934588 or 8.0 GB):
Rank 3: PASSED
Rank 2: PASSED
Rank 1: PASSED
Rank 0: PASSED
--------------------- Adjust count to fit in memory: 2147483647 x 50.0% = 1073741823
Root : payload 34359738336 32.0 GB = 16 dt x 1073741823 count x 2 peers x 1.0 inflation
Peer : payload 34359738336 32.0 GB = 16 dt x 1073741823 count x 2 peers x 1.0 inflation
Total : payload 34359738336 32.0 GB = 32.0 GB root + 32.0 GB x 0 local peers
---------------------
Results from MPI_Allreduce(double _Complex x 1073741823 = 17179869168 or 16.0 GB):
Rank 3: PASSED
Rank 2: PASSED
Rank 0: PASSED
Rank 1: PASSED
---------------------
Results from MPI_Iallreduce(int x 2147483647 = 8589934588 or 8.0 GB):
Rank 2: PASSED
Rank 0: PASSED
Rank 3: PASSED
Rank 1: PASSED
--------------------- Adjust count to fit in memory: 2147483647 x 50.0% = 1073741823
Root : payload 34359738336 32.0 GB = 16 dt x 1073741823 count x 2 peers x 1.0 inflation
Peer : payload 34359738336 32.0 GB = 16 dt x 1073741823 count x 2 peers x 1.0 inflation
Total : payload 34359738336 32.0 GB = 32.0 GB root + 32.0 GB x 0 local peers
---------------------
Results from MPI_Iallreduce(double _Complex x 1073741823 = 17179869168 or 16.0 GB):
Rank 2: PASSED
Rank 0: PASSED
Rank 3: PASSED
Rank 1: PASSED
```

## Environment variables

* `BIGCOUNT_MEMORY_DIFF` (Default: `0`): Maximum difference (as integer in GB) in total available memory between processes.
* `BIGCOUNT_MEMORY_PERCENT` (Default: `80`): Maximum percent (as integer) of memory to consume.
* `BIGCOUNT_ENABLE_NONBLOCKING` (Default: `1`): Enable/Disable the nonblocking collective tests. `y`/`Y`/`1` means Enable, otherwise disable.
* `BIGCOUNT_ALG_INFLATION` (Default: `1.0`): Memory overhead multiplier for a given algorithm. Some algorithms use internal buffers relative to the size of the payload and/or communicator size. This environment variable allows you to account for that overhead to help avoid Out-Of-Memory (OOM) scenarios.

## Missing Collectives (to do list)

Collectives missing from this test suite:
* Barrier (N/A)
* Alltoallv
* Reduce_scatter
* Scan / Exscan
