
CUDA-aware MPI build on fresh Ubuntu 24.04 LTS, MPIX_Query_cuda_support() returns zero #13130

Open
@niklebedenko


Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

v5.0.7

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Obtained from https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.7.tar.gz

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

N/A

Please describe the system on which you are running

  • Operating system/version: Ubuntu 24.04 LTS
  • Computer hardware: x86_64
  • Network type: single node, single GPU

Details of the problem

I'm struggling to get CUDA-aware MPI working on a single node. I want this so that I can test my code locally before deploying to a cluster. I've reproduced the problem on fresh installs of Ubuntu 24.04 on two different machines.

Here are my install steps:

tar xf openmpi-5.0.7.tar.gz
cd openmpi-5.0.7
mkdir build
cd build
../configure --with-cuda=/usr/local/cuda --prefix=/opt/openmpi | tee config.out
make -j$(nproc) all | tee make.out
sudo make install
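
One way to confirm that the installed build was configured with CUDA support, independently of any test program, is to query `ompi_info`. The sketch below assumes the `/opt/openmpi` prefix from the configure line above. `cuda_flag_from_ompi_info` is a hypothetical helper name, introduced only so the parsing can also be run on captured output.

```shell
#!/bin/sh
# Hypothetical helper: reads `ompi_info --parsable --all` output on stdin and
# prints the value of mpi_built_with_cuda_support ("true" or "false"),
# or "unknown" if the parameter does not appear in the input.
cuda_flag_from_ompi_info() {
    awk -F: '/mpi_built_with_cuda_support:value:/ { print $NF; found = 1; exit }
             END { if (!found) print "unknown" }'
}

# Assumed install prefix, taken from the configure line above.
OMPI_INFO=${OMPI_INFO:-/opt/openmpi/bin/ompi_info}

if [ -x "$OMPI_INFO" ]; then
    "$OMPI_INFO" --parsable --all | cuda_flag_from_ompi_info
else
    echo "ompi_info not found at $OMPI_INFO" >&2
fi
```

This reports only how the library was built. It says nothing about whether CUDA support is active at run time, which is the part that differs in the outputs below.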

Now, I build a very simple test program:

// mpi_check.c
#include "mpi.h"
#include <stdio.h>

#if !defined(OPEN_MPI) || !OPEN_MPI
#error This source code uses an Open MPI-specific extension
#endif

/* Needed for MPIX_Query_cuda_support(), below */
#include "mpi-ext.h"

int main(int argc, char* argv[]) {
        MPI_Init(&argc, &argv);

        printf("Compile time check:\n");
#if defined(MPIX_CUDA_AWARE_SUPPORT) && MPIX_CUDA_AWARE_SUPPORT
        printf("This MPI library has CUDA-aware support.\n");
#else
        printf("This MPI library does not have CUDA-aware support.\n");
#endif /* MPIX_CUDA_AWARE_SUPPORT */

        printf("Run time check:\n");
#if defined(MPIX_CUDA_AWARE_SUPPORT)
        if (1 == MPIX_Query_cuda_support()) {
                printf("This MPI library has CUDA-aware support.\n");
        }
        else {
                printf("This MPI library does not have CUDA-aware support.\n");
        }
#endif /* MPIX_CUDA_AWARE_SUPPORT */

        MPI_Finalize();

        return 0;
}

This was built and run with:

/opt/openmpi/bin/mpicc mpi_check.c -o mpi_check

/opt/openmpi/bin/mpirun -n 1 ./mpi_check

Then, we get this output:

Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library does not have CUDA-aware support.

However, if I run ./mpi_check directly, i.e. without mpirun, I get this output:

Authorization required, but no authorization protocol specified

Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library has CUDA-aware support.

There are no other MPI installations, and this was reproduced on two independent machines.
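
A quick way to back up the claim that only one MPI installation is in play is to check which `libmpi` the test binary resolves at load time. This is a sketch only: `libmpi_path_from_ldd` is a hypothetical helper name, and it assumes a Linux system where `ldd` is available and `./mpi_check` is the binary built above.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints the resolved
# path of every libmpi shared object it lists.
libmpi_path_from_ldd() {
    awk '$1 ~ /^libmpi\.so/ && $2 == "=>" { print $3 }'
}

# Assumed binary name, from the build command above.
BIN=${BIN:-./mpi_check}

if [ -x "$BIN" ]; then
    ldd "$BIN" | libmpi_path_from_ldd
else
    echo "$BIN not found; build it first" >&2
fi
```

If the printed path is not under /opt/openmpi/lib, the binary is loading a different library from the one that /opt/openmpi/bin/mpirun belongs to.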

Perhaps I'm missing a step or some configuration, but I've tried many variations of each of the above commands to no avail, and (I think?) I've followed the install instructions in the documentation correctly, so I believe this is a bug.

If I'm missing something, please let me know. Also please let me know if you'd like the config.out and make.out log files.
