Error about "cudaMemsetAsync on f falied: invalid argument" when running gmx 2020.2 + deepmd-kit #4683
wangshuai-simulation asked this question in Q&A (unanswered)
-
The error message does not seem to be related to deepmd-kit. Do you have the issue when not using deepmd-kit?
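One way to check that (a sketch, assuming an unpatched GROMACS 2020.2 build is available; the install-plain prefix below is hypothetical and should be adjusted to the actual setup) is to rerun the same system with stock GROMACS:
# source a GROMACS 2020.2 build that was NOT patched with dp_gmx_patch (hypothetical prefix)
source /path/to/gromacs-2020.2/install-plain/bin/GMXRC
gmx grompp -f md.mdp -c water.gro -p water.top -o md_plain.tpr
gmx mdrun -gpu_id 3 -ntmpi 2 -ntomp 8 -deffnm md_plain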
-
When I run "gmx mdrun" using the example in deepmd-kit-r3.0/examples/water/gmx, the following error is reported:
Program: gmx mdrun, version 2020.2-UNCHECKED
Source file: src/gromacs/nbnxm/cuda/nbnxm_cuda_data_mgmt.cu (line 588)
MPI rank: 0 (out of 2)

Fatal error:
cudaMemsetAsync on f falied: invalid argument
Here are my commands:
source ../../../../../gromacs-2020.2/install-deepmd/bin/GMXRC
export GMX_DEEPMD_INPUT_JSON=input.json
gmx grompp -f md.mdp -c water.gro -p water.top -o md.tpr >& grompp.log
nohup gmx mdrun -gpu_id 3 -ntmpi 2 -ntomp 8 -deffnm md &
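To narrow the problem down, it may also be worth checking whether the failure depends on the two thread-MPI ranks sharing one GPU, and having CUDA report the error at the failing call rather than at a later synchronization point. A sketch (same input, only the rank count changed):
# report CUDA errors at the call that actually fails
export CUDA_LAUNCH_BLOCKING=1
# single thread-MPI rank on the same GPU
gmx mdrun -gpu_id 3 -ntmpi 1 -ntomp 8 -deffnm md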
Here is my mdp file:
integrator = md
ld-seed = -1
bd-fric = 0
dt = 0.0005
nsteps = 1000000
nstcomm = 100
nstxout-compressed = 100
nstlog = 0
nstenergy = 100
tcoupl = nose-hoover
nsttcouple = 10
tc_grps = System
tau_t = 0.5
ref_t = 300.0
constraints = none
constraint_algorithm = Lincs
lincs_order = 4
comm-mode = Linear
cutoff-scheme = Verlet
nstlist = 10
pbc = xyz
rlist = 0.8
;coulombtype = PME
coulombtype = cut-off
coulomb-modifier = none
rcoulomb = 0.8
fourierspacing = 0.16
pme_order = 4
ewald_rtol = 1.0E-5
vdwtype = cut-off
vdw-modifier = force-switch
rvdw = 0.8
rvdw-switch = 0.7
DispCorr = EnerPres
pcoupl = no
gen_temp = 300
gen-vel = yes
Here is the information about my GROMACS build:
Command line:
gmx --version
GROMACS version: 2020.2-UNCHECKED
The source code this program was compiled from has not been verified because the reference checksum was missing during compilation. This means you have an incomplete GROMACS distribution, please make sure to download an intact source distribution and compile that before proceeding.
Computed checksum: NoChecksumFile
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX_512
FFT library: fftw-3.3.8-sse2-avx
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
C compiler: /usr/bin/cc GNU 9.4.0
C compiler flags: -mavx512f -mfma -fexcess-precision=fast -funroll-all-loops -O3 -DNDEBUG
C++ compiler: /usr/bin/c++ GNU 9.4.0
C++ compiler flags: -mavx512f -mfma -fexcess-precision=fast -funroll-all-loops SHELL:-fopenmp -O3 -DNDEBUG
CUDA compiler: /data/bld-wangshuai/deepmd/cuda11/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2020 NVIDIA Corporation;Built on Mon_Nov_30_19:08:53_PST_2020;Cuda compilation tools, release 11.2, V11.2.67;Build cuda_11.2.r11.2/compiler.29373293_0
CUDA compiler flags:-std=c++14;-gencode;arch=compute_86,code=compute_86;-use_fast_math;-D_FORCE_INLINES;-mavx512f -mfma -fexcess-precision=fast -funroll-all-loops SHELL:-fopenmp -O3 -DNDEBUG
CUDA driver: 12.80
CUDA runtime: 11.20
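Since the build targets compute_86 only (PTX that the driver JIT-compiles at run time), one thing worth verifying is that GPU 3 really has compute capability 8.6 or newer. A sketch, assuming an nvidia-smi recent enough to support the compute_cap query field:
# list each GPU with its compute capability
nvidia-smi --query-gpu=index,name,compute_cap --format=csv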
I suspect that the "cudaMemsetAsync" error when I execute "gmx mdrun" might be related to the tests that failed during "make check" while compiling GROMACS with deepmd-kit, but I do not know how to resolve those failures. Here are the failed tests:
[----------] Global test environment tear-down
[==========] 73 tests from 21 test cases ran. (27297 ms total)
[ PASSED ] 67 tests.
[ FAILED ] 6 tests, listed below:
[ FAILED ] HostAllocatorTestCopyable/0.ManualPinningOperationsWorkWithCuda, where TypeParam = int
[ FAILED ] HostAllocatorTestCopyable/1.ManualPinningOperationsWorkWithCuda, where TypeParam = float
[ FAILED ] HostAllocatorTestCopyable/2.ManualPinningOperationsWorkWithCuda, where TypeParam = gmx::BasicVector
[ FAILED ] PinnedMemoryCheckerTest.DefaultContainerIsRecognized
[ FAILED ] PinnedMemoryCheckerTest.NonpinnedContainerIsRecognized
[ FAILED ] PinnedMemoryCheckerTest.DefaultCBufferIsRecognized
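All six failures involve pinned (page-locked) host memory. To get more detail on why they fail, the tests can be rerun from the build directory with their output shown; a sketch (the test name passed to -R is a guess and may differ between GROMACS versions):
cd $SOFTWARE/gromacs-2020.2/build
ctest --output-on-failure -R GpuUtils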
Here is the build process for GROMACS + deepmd-kit:
SOFTWARE=/data/bld-wangshuai/deepmd
wget https://github.com/deepmodeling/deepmd-kit/releases/latest/download/libdeepmd_c.tar.gz
tar xzvf libdeepmd_c.tar.gz
git clone https://github.com/deepmodeling/deepmd-kit.git deepmd-kit
cd $SOFTWARE/deepmd-kit/source
mkdir build
cd build
cmake -DDEEPMD_C_ROOT=$SOFTWARE/libdeepmd_c -DCMAKE_INSTALL_PREFIX=$SOFTWARE/deepmd-kit/source/install-wangshuai ..
make -j8
make install
tar -zxvf gromacs-2020.2.tar.gz
export PATH=$PATH:$SOFTWARE/deepmd-kit/source/install-wangshuai/bin
dp_gmx_patch -d $SOFTWARE/gromacs-2020.2 -v 2020.2 -p
mkdir build install-deepmd
cd build
cmake .. -DGMX_CUDA_TARGET_COMPUTE=86 -DCMAKE_CXX_STANDARD=14 -DGMX_MPI=OFF -DGMX_GPU=CUDA \
  -DCUDA_TOOLKIT_ROOT_DIR=/data/bld-wangshuai/deepmd/cuda11 \
  -DCMAKE_INSTALL_PREFIX=/data/bld-wangshuai/deepmd/gromacs-2020.2/install-deepmd \
  -DCMAKE_PREFIX_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/bin \
  -DCMAKE_LIBRARY_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/bin/fftw-3.3.10/install-wangshuai/lib/ \
  -DCMAKE_INCLUDE_PATH=/data/bld-wangshuai/deepmd/fftw-3.3.10/install-wangshuai/include/ \
  -DGMX_BUILD_OWN_FFTW=OFF -DGMX_FFT_LIBRARY=fftw3 -DGMX_DOUBLE=OFF -DREGRESSIONTEST_DOWNLOAD=OFF \
  -DGMXAPI=ON -DGMX_VERSION_STRING_OF_FORK=deepmd -DFFTW_LIBRARY='/usr/lib/x86_64-linux-gnu/libfftw3f.so'
make -j 15
export CUDA_LAUNCH_BLOCKING=1
export TF_ENABLE_ONEDNN_OPTS=0
export TF_XLA_FLAGS=--tf_xla_enable_xla_devices=false
make check
make install
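If the compute capability check above shows that GPU 3 does not match the compute_86 target, a rebuild that sets the real SM architecture explicitly might be worth trying. A sketch (GMX_CUDA_TARGET_SM is the GROMACS CMake option for real GPU architectures; replace 86 with the GPU's actual compute capability and keep the remaining -D options from the cmake call above):
cd $SOFTWARE/gromacs-2020.2/build
# clear the cached configuration before reconfiguring
rm -f CMakeCache.txt
cmake .. -DGMX_CUDA_TARGET_SM=86 -DGMX_GPU=CUDA   # plus the other -D options used above
make -j 15
make check
make install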
I would appreciate it if anyone could help me!