Commit 2cbb968

Update README.md
Improve the MPI example to avoid confusing the number of processes with the total number of GPUs. NVIDIA/nccl-tests#54 (comment)
1 parent 0b4c4cb commit 2cbb968

1 file changed, +2 -2 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -29,9 +29,9 @@ Run on 8 GPUs (`-g 8`), scanning from 8 Bytes to 128MBytes :
 $ ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 8
 ```
 
-Run with MPI on 40 processes (potentially on multiple nodes) with 4 GPUs each :
+Run with MPI on 10 processes (potentially on multiple nodes) with 4 GPUs each, for a total of 40 GPUs:
 ```shell
-$ mpirun -np 40 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
+$ mpirun -np 10 ./build/all_reduce_perf -b 8 -e 128M -f 2 -g 4
 ```
 
 ### Performance
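
The arithmetic behind this change can be checked with a short sketch. The variable names `NP`, `GPUS_PER_PROC`, and `TOTAL_GPUS` are hypothetical and used only for illustration. The sketch assumes each MPI process runs a single thread, so the total is the `-np` value multiplied by the `-g` value. It only prints the command and does not launch `mpirun`.

```shell
#!/bin/sh
# Hypothetical variable names, chosen for illustration only.
NP=10             # number of MPI processes, the value passed to mpirun -np
GPUS_PER_PROC=4   # GPUs driven by each process, the value passed to -g

# Assumption: one thread per process, so total GPUs = processes x GPUs per process.
TOTAL_GPUS=$((NP * GPUS_PER_PROC))

echo "total GPUs: ${TOTAL_GPUS}"
# Print (do not run) the command line from the updated README example.
echo "mpirun -np ${NP} ./build/all_reduce_perf -b 8 -e 128M -f 2 -g ${GPUS_PER_PROC}"
```

With the old wording, `-np 40` together with `-g 4` would by the same arithmetic have requested 160 GPUs, which is the confusion the commit message describes.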
