Error about "cudaMemsetAsync on f falied: invalid argument" when running gmx 2020.2 + deepmd-kit #4683
wangshuai-simulation asked this question in Q&A (unanswered)
-
The error message does not seem to be related to deepmd-kit. Do you have the issue when not using deepmd-kit?
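One way to check that (a sketch, assuming an unpatched GROMACS 2020.2 build is available; the install-plain prefix below is hypothetical and should be adjusted to the actual setup) is to rerun the same system with stock GROMACS:
# source a GROMACS 2020.2 build that was NOT patched with dp_gmx_patch (hypothetical prefix)
source /path/to/gromacs-2020.2/install-plain/bin/GMXRC
gmx grompp -f md.mdp -c water.gro -p water.top -o md_plain.tpr
gmx mdrun -gpu_id 3 -ntmpi 2 -ntomp 8 -deffnm md_plain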
-
When I run "gmx mdrun" using the example in deepmd-kit-r3.0/examples/water/gmx, the following error is reported:
Program: gmx mdrun, version 2020.2-UNCHECKED
Source file: src/gromacs/nbnxm/cuda/nbnxm_cuda_data_mgmt.cu (line 588)
MPI rank: 0 (out of 2)

Fatal error:
cudaMemsetAsync on f falied: invalid argument
Here are my commands:
source ../../../../../gromacs-2020.2/install-deepmd/bin/GMXRC
export GMX_DEEPMD_INPUT_JSON=input.json
gmx grompp -f md.mdp -c water.gro -p water.top -o md.tpr >& grompp.log
nohup gmx mdrun -gpu_id 3 -ntmpi 2 -ntomp 8 -deffnm md &
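To narrow the problem down, it may also be worth checking whether the failure depends on the two thread-MPI ranks sharing one GPU, and having CUDA report the error at the failing call rather than at a later synchronization point. A sketch (same input, only the rank count changed):
# report CUDA errors at the call that actually fails
export CUDA_LAUNCH_BLOCKING=1
# single thread-MPI rank on the same GPU
gmx mdrun -gpu_id 3 -ntmpi 1 -ntomp 8 -deffnm md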
Here is my mdp file:
integrator = md
ld-seed = -1
bd-fric = 0
dt = 0.0005
nsteps = 1000000
nstcomm = 100
nstxout-compressed = 100
nstlog = 0
nstenergy = 100
tcoupl = nose-hoover
nsttcouple = 10
tc_grps = System
tau_t = 0.5
ref_t = 300.0
constraints = none
constraint_algorithm = Lincs
lincs_order = 4
comm-mode = Linear
cutoff-scheme = Verlet
nstlist = 10
pbc = xyz
rlist = 0.8
;coulombtype = PME
coulombtype = cut-off
coulomb-modifier = none
rcoulomb = 0.8
fourierspacing = 0.16
pme_order = 4
ewald_rtol = 1.0E-5
vdwtype = cut-off
vdw-modifier = force-switch
rvdw = 0.8
rvdw-switch = 0.7
DispCorr = EnerPres
pcoupl = no
gen_temp = 300
gen-vel = yes
Here is the information about my GROMACS build:
Command line:
gmx --version
GROMACS version: 2020.2-UNCHECKED
The source code this program was compiled from has not been verified because the reference checksum was missing during compilation. This means you have an incomplete GROMACS distribution, please make sure to download an intact source distribution and compile that before proceeding.
Computed checksum: NoChecksumFile
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX_512
FFT library: fftw-3.3.8-sse2-avx
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 9.4.0
C compiler flags: -mavx512f -mfma -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /usr/bin/c++ GNU 9.4.0
C++ compiler flags: -mavx512f -mfma -fexcess-precision=fast -funroll-all-loops SHELL:-fopenmp -O3 -DNDEBUG
CUDA compiler: /data/bld-wangshuai/deepmd/cuda11/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2020 NVIDIA Corporation;Built on Mon_Nov_30_19:08:53_PST_2020;Cuda compilation tools, release 11.2, V11.2.67;Build cuda_11.2.r11.2/compiler.29373293_0
CUDA compiler flags:-std=c++14;-gencode;arch=compute_86,code=compute_86;-use_fast_math;-D_FORCE_INLINES;-mavx512f -mfma -fexcess-precision=fast -funroll-all-loops SHELL:-fopenmp -O3 -DNDEBUG
CUDA driver: 12.80
CUDA runtime: 11.20
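Since the build targets compute_86 only (PTX that the driver JIT-compiles at run time), one thing worth verifying is that GPU 3 really has compute capability 8.6 or newer. A sketch, assuming an nvidia-smi recent enough to support the compute_cap query field:
# list each GPU with its compute capability
nvidia-smi --query-gpu=index,name,compute_cap --format=csv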
I suspect that the "cudaMemsetAsync" error when I execute "gmx mdrun" might be related to the tests that failed during "make check" while compiling GROMACS with deepmd-kit, but I do not know how to resolve those failures. Here are the failed tests:
[----------] Global test environment tear-down
[==========] 73 tests from 21 test cases ran. (27297 ms total)
[ PASSED ] 67 tests.
[ FAILED ] 6 tests, listed below:
[ FAILED ] HostAllocatorTestCopyable/0.ManualPinningOperationsWorkWithCuda, where TypeParam = int
[ FAILED ] HostAllocatorTestCopyable/1.ManualPinningOperationsWorkWithCuda, where TypeParam = float
[ FAILED ] HostAllocatorTestCopyable/2.ManualPinningOperationsWorkWithCuda, where TypeParam = gmx::BasicVector
[ FAILED ] PinnedMemoryCheckerTest.DefaultContainerIsRecognized
[ FAILED ] PinnedMemoryCheckerTest.NonpinnedContainerIsRecognized
[ FAILED ] PinnedMemoryCheckerTest.DefaultCBufferIsRecognized
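All six failures involve pinned (page-locked) host memory. To get more detail on why they fail, the tests can be rerun from the build directory with their output shown; a sketch (the test name passed to -R is a guess and may differ between GROMACS versions):
cd $SOFTWARE/gromacs-2020.2/build
ctest --output-on-failure -R GpuUtils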
Here is the build process for GROMACS + deepmd-kit:
SOFTWARE=/data/bld-wangshuai/deepmd
wget https://github.com/deepmodeling/deepmd-kit/releases/latest/download/libdeepmd_c.tar.gz
tar xzvf libdeepmd_c.tar.gz
git clone https://github.com/deepmodeling/deepmd-kit.git deepmd-kit
cd $SOFTWARE/deepmd-kit/source
mkdir build
cd build
cmake -DDEEPMD_C_ROOT=$SOFTWARE/libdeepmd_c -DCMAKE_INSTALL_PREFIX=$SOFTWARE/deepmd-kit/source/install-wangshuai ..
make -j8
make install
tar -zxvf gromacs-2020.2.tar.gz
export PATH=$PATH:$SOFTWARE/deepmd-kit/source/install-wangshuai/bin
dp_gmx_patch -d $SOFTWARE/gromacs-2020.2 -v 2020.2 -p
mkdir build install-deepmd
cd build
cmake .. -DGMX_CUDA_TARGET_COMPUTE=86 -DCMAKE_CXX_STANDARD=14 -DGMX_MPI=OFF -DGMX_GPU=CUDA \
  -DCUDA_TOOLKIT_ROOT_DIR=/data/bld-wangshuai/deepmd/cuda11 \
  -DCMAKE_INSTALL_PREFIX=/data/bld-wangshuai/deepmd/gromacs-2020.2/install-deepmd \
  -DCMAKE_PREFIX_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/bin \
  -DCMAKE_LIBRARY_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/bin/fftw-3.3.10/install-wangshuai/lib/ \
  -DCMAKE_INCLUDE_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/include/ \
  -DGMX_BUILD_OWN_FFTW=OFF -DGMX_FFT_LIBRARY=fftw3 -DGMX_DOUBLE=OFF -DREGRESSIONTEST_DOWNLOAD=OFF \
  -DGMXAPI=ON -DGMX_VERSION_STRING_OF_FORK=deepmd -DFFTW_LIBRARY='/usr/lib/x86_64-linux-gnu/libfftw3f.so'
make -j 15
export CUDA_LAUNCH_BLOCKING=1
export TF_ENABLE_ONEDNN_OPTS=0
export TF_XLA_FLAGS=--tf_xla_enable_xla_devices=false
make check
make install
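If the compute capability check above shows that GPU 3 does not match the compute_86 target, a rebuild that sets the real SM architecture explicitly might be worth trying. A sketch (GMX_CUDA_TARGET_SM is the GROMACS CMake option for real GPU architectures; replace 86 with the GPU's actual compute capability and keep the remaining -D options from the cmake call above):
cd $SOFTWARE/gromacs-2020.2/build
# clear the cached configuration before reconfiguring
rm -f CMakeCache.txt
cmake .. -DGMX_CUDA_TARGET_SM=86 -DGMX_GPU=CUDA   # plus the other -D options used above
make -j 15
make check
make install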
I would appreciate it if anyone could help me!