speed up compute_small_box #1122

Merged
brucefan1983 merged 1 commit into brucefan1983:master from MoseyQAQ:gpu_vector
Aug 15, 2025

@MoseyQAQ
Contributor

Summary
In the original implementation of compute_small_box() in the NEP class, several GPU_Vector objects are created with cudaMalloc and then destroyed with cudaFree on each call, which introduces a significant computational cost:

GPUMD/src/force/nep.cu

Lines 1387 to 1391 in ef31657

GPU_Vector<int> NN_radial(type.size());
GPU_Vector<int> NL_radial(size_x12);
GPU_Vector<int> NN_angular(type.size());
GPU_Vector<int> NL_angular(size_x12);
GPU_Vector<float> r12(size_x12 * 6);

Modification
I added a new member variable, small_box_data, which stores the relevant data up front so that these GPU_Vectors are not recreated:

https://github.com/MoseyQAQ/GPUMD/blob/c00ef6d62c4a33cc922d2779b12f22d8e64753e1/src/force/nep.cuh#L111-L117
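
To illustrate why keeping the buffers in a member variable helps, here is a minimal host-only sketch of the two allocation patterns. It is not the code from this PR: `Buffer`, `SmallBoxData`, `ensure`, and the two `count_allocations_*` helpers are hypothetical names, `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so the example runs without a GPU, and the "reallocate only when the size changes" policy is an assumption about one reasonable way to reuse buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Counts allocations so the two patterns can be compared.
static int g_alloc_count = 0;

// Hypothetical stand-in for GPU_Vector: an owning buffer that uses
// std::malloc/std::free where the real class would use cudaMalloc/cudaFree.
template <typename T>
class Buffer {
public:
  Buffer() = default;
  explicit Buffer(std::size_t n) { resize(n); }
  Buffer(const Buffer&) = delete;
  Buffer& operator=(const Buffer&) = delete;
  ~Buffer() { std::free(data_); }

  // Assumed policy: reallocate only when the requested size changes.
  void resize(std::size_t n) {
    if (n == size_) return;
    std::free(data_);
    data_ = nullptr;
    if (n > 0) {
      data_ = static_cast<T*>(std::malloc(n * sizeof(T)));
      assert(data_ != nullptr);
      ++g_alloc_count;
    }
    size_ = n;
  }

  std::size_t size() const { return size_; }
  T* data() { return data_; }

private:
  T* data_ = nullptr;
  std::size_t size_ = 0;
};

// Hypothetical analogue of the small_box_data member: scratch buffers
// that outlive a single compute call.
struct SmallBoxData {
  Buffer<int> NN_radial, NL_radial, NN_angular, NL_angular;
  Buffer<float> r12;

  void ensure(std::size_t num_atoms, std::size_t size_x12) {
    NN_radial.resize(num_atoms);
    NL_radial.resize(size_x12);
    NN_angular.resize(num_atoms);
    NL_angular.resize(size_x12);
    r12.resize(size_x12 * 6);
  }
};

// Old pattern: five temporaries are allocated and freed on every call.
int count_allocations_with_temporaries(int steps, std::size_t num_atoms, std::size_t size_x12) {
  g_alloc_count = 0;
  for (int s = 0; s < steps; ++s) {
    Buffer<int> NN_radial(num_atoms);
    Buffer<int> NL_radial(size_x12);
    Buffer<int> NN_angular(num_atoms);
    Buffer<int> NL_angular(size_x12);
    Buffer<float> r12(size_x12 * 6);
    // ... kernels would read and write the buffers here ...
  }
  return g_alloc_count;
}

// New pattern: the buffers persist across calls and are only resized on demand.
int count_allocations_with_persistent(int steps, std::size_t num_atoms, std::size_t size_x12) {
  g_alloc_count = 0;
  SmallBoxData data;
  for (int s = 0; s < steps; ++s) {
    data.ensure(num_atoms, size_x12);
    // ... kernels would read and write data.NN_radial etc. here ...
  }
  return g_alloc_count;
}
```

In this sketch, with fixed nonzero sizes, the temporary pattern performs five allocations per step, while the persistent pattern performs five allocations in total regardless of the number of steps.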

Verifications

For a given model.xyz and a given nep.txt, I performed the following calculations with the old and new implementations separately:

  • Predict energy, force, and virial, and compare them.
  • Perform minimization, and compare the optimized positions and the corresponding energy, force, and virial.
  • Perform a 1 ps NPT simulation (1000 steps), and compare the position, cell, energy, force, and virial across the 1000 configurations.

All of the tests above passed, with the old and new implementations yielding the same results. All original input and output files are attached.
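
A comparison of this kind could be sketched as follows. The PR does not say how the outputs were compared, so the helper names `max_abs_diff` and `results_match` and the tolerance parameter are assumptions made for illustration; the default tolerance of zero corresponds to requiring identical values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest element-wise absolute difference between two result arrays
// (for example, flattened forces from the old and new implementations).
double max_abs_diff(const std::vector<double>& a, const std::vector<double>& b) {
  assert(a.size() == b.size());  // both runs must produce the same number of values
  double worst = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    worst = std::max(worst, std::fabs(a[i] - b[i]));
  }
  return worst;
}

// True when every pair of values agrees within tol (hypothetical tolerance;
// the default of 0.0 demands exact agreement).
bool results_match(const std::vector<double>& a, const std::vector<double>& b, double tol = 0.0) {
  return max_abs_diff(a, b) <= tol;
}
```

The same check could be applied per frame to positions, cell, energy, force, and virial along a trajectory.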

Small_Cell_PR_test_upload.zip
(Note: the raw trajectory files for the NPT simulations have been deleted, since they are too large to upload to a GitHub issue.)

Speed Benchmark

[Image: speed benchmark]

@brucefan1983 brucefan1983 marked this pull request as draft August 13, 2025 19:19
@brucefan1983
Owner

Very nice speedup!

@brucefan1983 brucefan1983 marked this pull request as ready for review August 15, 2025 17:43
@brucefan1983 brucefan1983 merged commit d727b7a into brucefan1983:master Aug 15, 2025
2 checks passed
@MoseyQAQ MoseyQAQ deleted the gpu_vector branch August 15, 2025 17:46