Move driver and nvrtc cython and internal layers to new generator #1972

Open
mdboom wants to merge 14 commits into NVIDIA:main from mdboom:driver-v2

@mdboom
Contributor

@mdboom mdboom commented Apr 24, 2026

This continues the work in #1900. It adds driver to the mix, so both nvrtc and driver are now generated from the "real" new generator.

@copy-pr-bot
Contributor

copy-pr-bot Bot commented Apr 24, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the cuda.bindings (Everything related to the cuda.bindings module) label Apr 24, 2026
@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@github-actions github-actions Bot added the CI/CD (CI/CD infrastructure) and cuda.core (Everything related to the cuda.core module) labels Apr 24, 2026
@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 24, 2026

/ok to test

@leofang leofang self-requested a review April 24, 2026 23:59
@leofang leofang added this to the cuda.bindings 13.3.0 & 12.9.7 milestone Apr 24, 2026
@mdboom
Contributor Author

mdboom commented Apr 25, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 25, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 25, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 29, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 29, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 29, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 29, 2026

/ok to test

@mdboom
Contributor Author

mdboom commented Apr 29, 2026

/ok to test

@mdboom mdboom marked this pull request as ready for review April 29, 2026 22:31
Member

@leofang leofang left a comment

First wave of questions

from libc.stdint cimport uintptr_t
from cpython cimport PyUnicode_AsWideCharString, PyMem_Free

# You must 'from .utils import NotSupportedError' before using this template
Member

Note to self: check if this is from cybind template

cdef int err, driver_ver = 0

# Load driver to check version
handle = dlopen('libcuda.so.1', RTLD_NOW | RTLD_GLOBAL)
Member

Q: why don't we use pathfinder here?

raise RuntimeError("Failed to get __cuGetProcAddress_v2")
_F_cuGetProcAddress_v2 = <__cuGetProcAddress_v2_T>__cuGetProcAddress_v2

if os.getenv('CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM', default=0):
Member

Wouldn't this evaluate to True for export CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0? os.getenv returns the string '0' in that case, and any non-empty string is truthy:

>>> if '0': print(123)
...
123
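
To make the concern above concrete, here is a minimal sketch of a stricter check. `env_flag` and the set of accepted spellings are hypothetical and not part of cuda.bindings; the semantics the project actually wants for this variable may differ.

```python
import os

# Hypothetical helper, not part of cuda.bindings. The accepted spellings
# below are an assumption made for illustration only.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Return True only if the variable holds a recognized truthy value."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        # Variable is unset: fall back to the caller's default.
        return default
    return raw.strip().lower() in _TRUTHY


name = "CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM"

# Simulate `export CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0` with a
# plain dict so the example does not touch the real environment.
fake_env = {name: "0"}

# The naive truthiness test treats the non-empty string "0" as enabled.
naive = bool(fake_env.get(name, 0))

# The stricter helper treats "0" as disabled.
strict = env_flag(name, environ=fake_env)

print(naive, strict)
```

Passing `environ` explicitly keeps the helper easy to test; in real use it would read `os.environ` by default.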

Comment on lines +591 to +596
# Get latest __cuGetProcAddress_v2
global __cuGetProcAddress_v2
__cuGetProcAddress_v2 = dlsym(handle, 'cuGetProcAddress_v2')
if __cuGetProcAddress_v2 == NULL:
    raise RuntimeError("Failed to get __cuGetProcAddress_v2")
_F_cuGetProcAddress_v2 = <__cuGetProcAddress_v2_T>__cuGetProcAddress_v2
Member

IIRC the old code has a path where we use dlsym to get unversioned symbols. Is that no longer relevant?
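
For reference, the kind of fallback being asked about could look roughly like the sketch below. It is not the old code's logic: `resolve_first` is a hypothetical helper, `lookup` stands in for dlsym, and the symbol table and address are made up so the example runs without a driver installed.

```python
def resolve_first(lookup, names):
    """Return (name, address) for the first name that `lookup` resolves.

    `lookup` stands in for dlsym: it takes a symbol name and returns an
    address, or None when the symbol is missing.
    """
    for name in names:
        addr = lookup(name)
        if addr:
            return name, addr
    raise RuntimeError(f"none of {names} could be resolved")


# Made-up symbol table for a library that exports only the unversioned
# entry point. The address is arbitrary.
fake_driver = {"cuGetProcAddress": 0x1000}

# Prefer the versioned symbol, then fall back to the unversioned one.
found = resolve_first(
    fake_driver.get,
    ["cuGetProcAddress_v2", "cuGetProcAddress"],
)
print(found)
```

Whether such a fallback is still needed depends on which driver versions the new generator has to support, which is the open question here.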

cdef int err, driver_ver = 0

# Load driver to check version
handle = LoadLibraryExW("nvcuda.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32)
Member

ditto, re: pathfinder

@leofang leofang added the P0 (High priority - Must do!) and enhancement (Any code-related improvements) labels and removed the CI/CD (CI/CD infrastructure) and cuda.core (Everything related to the cuda.core module) labels May 1, 2026
Labels

cuda.bindings (Everything related to the cuda.bindings module), enhancement (Any code-related improvements), P0 (High priority - Must do!)

2 participants