Description
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
latest
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
compiled directly from latest sources
Details of the problem
In 9bced03 the handling of MPI_Wtime was changed to set the time origin at the first call, return 0.0 for that call, and return values relative to that origin afterwards.
This improves precision, but there is a small catch/bug that we found while running some simulations recently.
The issue is that the check for whether the value has already been initialized simply tests whether the seconds field of the stored value is 0, which is its default value; see line 65 in e5cc709.
In most cases the first call to clock_gettime (or the other timer backends) returns the time since boot or since the epoch, and that value is larger than 0 seconds, so the check works. But in our particular case boot is very fast, and we enter MPI before the 1 s mark on our platform's timers. This means that every call to MPI_Wtime made before the 1 s mark considers the timer uninitialized, reinitializes it, and returns 0.0, which led to some odd timing and bandwidth results.
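To illustrate, here is a minimal standalone sketch of the pattern as I understand it (not the actual Open MPI code; the names `wtime_origin` and `my_wtime` are made up, and `CLOCK_MONOTONIC` stands in for whatever timer backend is selected):

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the stored origin; zero-initialized like a
 * static variable in the real code. */
static struct timespec wtime_origin = { 0, 0 };

double my_wtime(void)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);

    /* The problematic check: "seconds == 0" is taken to mean "not yet
     * initialized".  If the raw clock itself is still below the 1 s mark
     * (e.g. very shortly after boot), every call takes this branch,
     * resets the origin and returns 0.0. */
    if (wtime_origin.tv_sec == 0) {
        wtime_origin = now;
        return 0.0;
    }

    return (double)(now.tv_sec - wtime_origin.tv_sec)
         + 1e-9 * (double)(now.tv_nsec - wtime_origin.tv_nsec);
}

int main(void)
{
    printf("%f\n", my_wtime());
    printf("%f\n", my_wtime());
    return 0;
}
```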
Fix options are either to mark the state explicitly as uninitialized/initialized through a dedicated boolean (tested, works), to set a truly impossible uninitialized value (-1 or similar), or to also check the nanoseconds field.
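For reference, a hedged sketch of the dedicated-boolean option applied to the fragment above (again with made-up names, not a patch against the actual source):

```c
#include <stdbool.h>
#include <time.h>

static struct timespec wtime_origin;
static bool wtime_origin_set = false;   /* explicit "initialized" marker */

double my_wtime(void)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);

    /* Initialization is now decided by the flag, not by the value of
     * tv_sec, so a raw clock below 1 s no longer looks "uninitialized". */
    if (!wtime_origin_set) {
        wtime_origin = now;
        wtime_origin_set = true;
        return 0.0;
    }

    return (double)(now.tv_sec - wtime_origin.tv_sec)
         + 1e-9 * (double)(now.tv_nsec - wtime_origin.tv_nsec);
}
```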
Even if this only triggers on a particular emulated platform, another aspect is that the initialization seems to be non-reentrant. I don't see a case where having several threads store almost the same value as the default origin would actually hurt, but it might be a good idea to make it reentrant while fixing this.
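One possible way to address both points at once, sketched under the same assumptions as above, would be to funnel the initialization through pthread_once so that concurrent first callers cannot race on the origin:

```c
#include <pthread.h>
#include <time.h>

static struct timespec wtime_origin;
static pthread_once_t wtime_once = PTHREAD_ONCE_INIT;

static void wtime_init(void)
{
    /* Runs exactly once, even if several threads call my_wtime()
     * concurrently during startup. */
    clock_gettime(CLOCK_MONOTONIC, &wtime_origin);
}

double my_wtime(void)
{
    struct timespec now;

    pthread_once(&wtime_once, wtime_init);
    clock_gettime(CLOCK_MONOTONIC, &now);

    return (double)(now.tv_sec - wtime_origin.tv_sec)
         + 1e-9 * (double)(now.tv_nsec - wtime_origin.tv_nsec);
}
```

Note that with this variant the very first call returns a tiny positive value rather than exactly 0.0, since the origin and the reading are taken a few nanoseconds apart; whether that behavioral difference is acceptable is up to the maintainers.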