[FIX]: cuda.core: simplify _check_runtime_error logic #2003

Merged
rwgk merged 2 commits into NVIDIA:main from rwgk:nvbugs_6115382_accommodate_windows_hybrid_cudart on May 1, 2026

@rwgk rwgk commented May 1, 2026

Summary

An inspection of the cuda.core error-handling path during the investigation of nvbugs 6115382 showed that _check_runtime_error() can be simplified structurally.

_check_error() only routes runtime.cudaError_t values into _check_runtime_error(). Because the function already receives a runtime.cudaError_t enum member, it can use error.name directly instead of calling runtime.cudaGetErrorName() first and decoding the returned string.

This keeps runtime error formatting aligned with the generated bindings enum and removes a runtime name lookup that is unnecessary in the normal cuda.core path.
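The simplification described above can be sketched without a CUDA installation. This is a minimal illustration, not the actual cuda.core code: `CudaErrorSketch`, `CUDAErrorSketch`, and `check_runtime_error_sketch` are hypothetical stand-ins for `runtime.cudaError_t`, `CUDAError`, and `_check_runtime_error()`, and the real message format may differ.

```python
import enum


class CudaErrorSketch(enum.IntEnum):
    """Hypothetical stand-in for runtime.cudaError_t (three members only)."""

    cudaSuccess = 0
    cudaErrorInvalidValue = 1
    cudaErrorMemoryAllocation = 2


class CUDAErrorSketch(Exception):
    """Hypothetical stand-in for the CUDAError raised by cuda.core."""


def check_runtime_error_sketch(error):
    # Success produces no exception.
    if error == CudaErrorSketch.cudaSuccess:
        return
    # The enum member already carries its own name, so no call into the
    # runtime library is needed to recover it.
    raise CUDAErrorSketch(f"{error.name} (code {int(error)})")
```

Calling `check_runtime_error_sketch(CudaErrorSketch.cudaErrorInvalidValue)` raises an exception whose message contains `cudaErrorInvalidValue`, taken from the enum member itself.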

Rationale

Before this change, _check_runtime_error() asked the runtime library for the error name even though the function had already been given a runtime.cudaError_t enum member. In the expected path, the name lookup was therefore recovering information already present on the enum object.

Using error.name is also a better fit for the situation that prompted this investigation. When the Windows runtime library's error-name table lags the generated cuda.bindings enum table, the enum member still carries the correct installed-bindings name, so the raised CUDAError continues to use a stable and specific error name.
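To make the lagging-table scenario concrete, here is a small simulation under stated assumptions. `OLDER_NAME_TABLE` plays the role of a runtime library whose name table predates the bindings, and `cudaErrorHypotheticalNew` with code 9999 is an invented member, not a real CUDA error. The fallback string is only a placeholder for whatever an older library would report.

```python
import enum


class NewerEnumSketch(enum.IntEnum):
    """Hypothetical enum generated from newer headers than the runtime library."""

    cudaSuccess = 0
    cudaErrorInvalidValue = 1
    cudaErrorHypotheticalNew = 9999  # invented code, for illustration only


# Simulated name table of an older runtime library: it lacks code 9999.
OLDER_NAME_TABLE = {0: "cudaSuccess", 1: "cudaErrorInvalidValue"}


def name_from_table(error):
    # Lookup through the (simulated) library table; unknown codes fall back
    # to a generic placeholder string.
    return OLDER_NAME_TABLE.get(int(error), "unrecognized error code")


def name_from_enum(error):
    # Lookup through the enum member itself; always specific.
    return error.name
```

For the invented member, `name_from_table` yields only the generic placeholder, while `name_from_enum` yields `cudaErrorHypotheticalNew`. For codes known to both sources the two functions agree.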

Test Simplification

The runtime-side test can be simplified for the same reason. Since test_check_runtime_error() iterates over runtime.cudaError_t members, it now exercises a path where the error name always comes from error.name.

That means the runtime test no longer needs to account for an UNEXPECTED ERROR CODE branch in this loop. The test can directly assert that each non-success runtime error message includes error.name.
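The shape of the simplified test loop might look roughly like the following. It is a sketch only: the enum, exception, and checker are hypothetical stand-ins defined inline so the example runs without cuda.bindings, and the real test_check_runtime_error() may be structured differently.

```python
import enum


class CudaErrorSketch(enum.IntEnum):
    """Hypothetical stand-in for runtime.cudaError_t."""

    cudaSuccess = 0
    cudaErrorInvalidValue = 1
    cudaErrorMemoryAllocation = 2


class CUDAErrorSketch(Exception):
    """Hypothetical stand-in for CUDAError."""


def check_runtime_error_sketch(error):
    # Stand-in for _check_runtime_error(): the name comes from the enum member.
    if error == CudaErrorSketch.cudaSuccess:
        return
    raise CUDAErrorSketch(f"{error.name} (code {int(error)})")


def run_runtime_error_loop():
    """Iterate over every enum member and return the names that were verified."""
    verified = []
    for error in CudaErrorSketch:
        if error == CudaErrorSketch.cudaSuccess:
            # Success must not raise.
            assert check_runtime_error_sketch(error) is None
            continue
        try:
            check_runtime_error_sketch(error)
        except CUDAErrorSketch as exc:
            # No "UNEXPECTED ERROR CODE" branch: the name is always present.
            assert error.name in str(exc)
            verified.append(error.name)
        else:
            raise AssertionError(f"{error.name} did not raise")
    return verified
```

Because the loop only ever sees enum members, a single assertion on `error.name` covers every non-success case.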

Side Note

The same structural simplification does not apply to _check_driver_error(). The driver-side path still performs meaningful lookup through the driver API and retains explicit handling for unexpected codes, so its current cuGetErrorName() / cuGetErrorString() structure remains appropriate.
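For contrast, a lookup-based path with an explicit unexpected-code branch could be sketched as below. Everything here is an assumption made for illustration: `fake_get_error_name` merely imitates a lookup that returns a status and a byte-string name, and the exact text of the fallback message is invented rather than copied from _check_driver_error().

```python
import enum


class DriverResultSketch(enum.IntEnum):
    """Hypothetical stand-in for the driver result enum (two members only)."""

    CUDA_SUCCESS = 0
    CUDA_ERROR_INVALID_VALUE = 1


def fake_get_error_name(code):
    """Hypothetical stand-in for a driver name lookup: returns (status, name)."""
    names = {0: b"CUDA_SUCCESS", 1: b"CUDA_ERROR_INVALID_VALUE"}
    if int(code) in names:
        return 0, names[int(code)]
    # A nonzero status signals that the lookup itself failed.
    return 1, None


def describe_driver_error_sketch(code, lookup=fake_get_error_name):
    status, name = lookup(code)
    if status != 0:
        # Explicit handling for codes the lookup does not recognize.
        return f"UNEXPECTED ERROR CODE: {int(code)}"
    return name.decode()
```

Here the lookup can fail independently of the value passed in, which is why the fallback branch stays meaningful on this path.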

Use the generated runtime error enum as the name source for known CUDA Runtime errors so error messages remain stable when the runtime name table differs from the installed bindings.

Made-with: Cursor
rwgk added this to the cuda.core v1.0.0 milestone May 1, 2026
rwgk self-assigned this May 1, 2026
rwgk added the bug, P1, and cuda.core labels May 1, 2026

copy-pr-bot Bot commented May 1, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk commented May 1, 2026

/ok to test

`_check_error()` only routes `runtime.cudaError_t` instances into
`_check_runtime_error()`, so consulting `cudaGetErrorName()` and keeping a
fallback for unknown values does not improve the normal `cuda.core` path. The
Windows hybrid cudart issue is that the runtime name table can lag the
generated enum table, so using `error.name` directly is both simpler and a
better match for the values the code already has.

With the runtime path now relying on enum members, the runtime-side tests no
longer need to account for `UNEXPECTED ERROR CODE` in this loop or keep a
separate monkeypatch test for avoiding the runtime name lookup.

Made-with: Cursor

rwgk commented May 1, 2026

/ok to test

rwgk changed the title cuda.core: prefer binding runtime error names when available May 1, 2026
rwgk marked this pull request as ready for review May 1, 2026 07:46
rwgk requested a review from rparolin May 1, 2026 07:46
rwgk merged commit 371fa42 into NVIDIA:main May 1, 2026
98 checks passed
rwgk deleted the nvbugs_6115382_accommodate_windows_hybrid_cudart branch May 1, 2026 15:01

github-actions Bot commented May 1, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.

Labels

bug (Something isn't working), cuda.core (Everything related to the cuda.core module), P1 (Medium priority - Should do)

2 participants