Skip to content

Move driver and nvrtc cython and internal layers to new generator#1972

Open
mdboom wants to merge 18 commits into
NVIDIA:mainfrom
mdboom:driver-v2
Open

Move driver and nvrtc cython and internal layers to new generator#1972
mdboom wants to merge 18 commits into
NVIDIA:mainfrom
mdboom:driver-v2

Conversation

@mdboom

@mdboom mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

This is a continuation of the work in #1900. Now adds driver to the mix and both nvrtc and driver are generated from the "real" new generator.

@copy-pr-bot

copy-pr-bot Bot commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the cuda.bindings Everything related to the cuda.bindings module label Apr 24, 2026
@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

1 similar comment
@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@github-actions github-actions Bot added CI/CD CI/CD infrastructure cuda.core Everything related to the cuda.core module labels Apr 24, 2026
@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 24, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@github-actions

Copy link
Copy Markdown

@leofang leofang self-requested a review April 24, 2026 23:59
@leofang leofang added this to the cuda.bindings 13.3.0 & 12.9.7 milestone Apr 24, 2026
@mdboom

mdboom commented Apr 25, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 25, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

1 similar comment
@mdboom

mdboom commented Apr 25, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom mdboom marked this pull request as ready for review April 29, 2026 22:31

@leofang leofang left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First wave of questions

Comment thread cuda_bindings/cuda/bindings/_internal/driver_windows.pyx
cdef int err, driver_ver = 0

# Load driver to check version
handle = dlopen('libcuda.so.1', RTLD_NOW | RTLD_GLOBAL)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: why don't we use pathfinder here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from linux_externs.pxd snippet in cybind and is pasted into every _internal/X_linux.pyx file that the generator generates. We should probably update this to use pathfinder, but it would have implications beyond cuda_bindings, so I didn't want to touch it.

I have an open PR in cybind to move away from snippets, because I think we are seeing one of the downsides of that approach here.

Importantly, this is dead code in this file. get_cuda_version is never called elsewhere -- it's a utility for libraries that need to get the cuda version before loading their own library. (Again, that should probably get moved to pathfinder as well, but is orthogonal to this change).

raise RuntimeError("Failed to get __cuGetProcAddress_v2")
_F_cuGetProcAddress_v2 = <__cuGetProcAddress_v2_T>__cuGetProcAddress_v2

if os.getenv('CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM', default=0):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this evaluate to True for export CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM=0?

>>> if '0': print(123)
...
123

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good catch, and we should probably fix it. Note this was copied directly from the existing code which also has this bug:

https://github.com/NVIDIA/cuda-python/blob/main/cuda_bindings/cuda/bindings/_bindings/cydriver.pyx.in#L522

Comment on lines +591 to +596
# Get latest __cuGetProcAddress_v2
global __cuGetProcAddress_v2
__cuGetProcAddress_v2 = dlsym(handle, 'cuGetProcAddress_v2')
if __cuGetProcAddress_v2 == NULL:
raise RuntimeError("Failed to get __cuGetProcAddress_v2")
_F_cuGetProcAddress_v2 = <__cuGetProcAddress_v2_T>__cuGetProcAddress_v2

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the old code has a path where we do dlsym to get unversioned symbols, is it no longer relevant?

@mdboom mdboom May 13, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the old code had a path to fall back to dlsym if we can't load cuGetProcAddress_v2. This now assumes it's available and works. IIUC, the old code path was for old CTK's where that might not be available and we don't need it anymore. But I think the best way would be to have QA run this across a range of things. It would be great to drop that complexity if we could. The complexity is not just another branch of this that uses dlsym -- it's from re-creating the versioned symbol mapping that cuGetProcAddress_v2 handles for us. It's a big part of what the old generator did that doesn't yet exist in the new generator.

cdef int err, driver_ver = 0

# Load driver to check version
handle = LoadLibraryExW("nvcuda.dll", NULL, LOAD_LIBRARY_SEARCH_SYSTEM32)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, re: pathfinder

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason not to update this now as for Linux.

@leofang leofang added P0 High priority - Must do! enhancement Any code-related improvements and removed CI/CD CI/CD infrastructure cuda.core Everything related to the cuda.core module labels May 1, 2026
@mdboom mdboom requested a review from leofang May 13, 2026 13:38
@mdboom mdboom marked this pull request as draft May 19, 2026 20:32
@mdboom

mdboom commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom mdboom marked this pull request as ready for review June 9, 2026 14:17
@github-actions github-actions Bot added the cuda.core Everything related to the cuda.core module label Jun 9, 2026
# SPDX-License-Identifier: LicenseRef-NVIDIA-SOFTWARE-LICENSE
#
# This code was automatically generated across versions from 12.9.1 to 13.3.0, generator version 0.3.1.dev1719+g565f73f4e. Do not modify it directly.
# This code was automatically generated across versions from 12.9.0 to 13.2.0, generator version 0.3.1.dev1630+gadce055ea.d20260422. Do not modify it directly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether the 13.3 -> 13.2 change hints a potential issue?

Comment thread cuda_bindings/cuda/bindings/_lib/utils.pxi
Comment thread .gitignore
cuda_bindings/cuda/bindings/_bindings/cyruntime_ptds.pxd
cuda_bindings/cuda/bindings/_bindings/cyruntime_ptds.pyx
cuda_bindings/cuda/bindings/_internal/_nvml.pyx
cuda_bindings/cuda/bindings/_internal/cufile.pyx

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should add them here now that both are auto-gen'd?

Suggested change
cuda_bindings/cuda/bindings/_internal/cufile.pyx
cuda_bindings/cuda/bindings/_internal/driver.pyx
cuda_bindings/cuda/bindings/_internal/nvrtc.pyx
cuda_bindings/cuda/bindings/_internal/cufile.pyx

@leofang

leofang commented Jun 9, 2026

Copy link
Copy Markdown
Member

LGTM overall, @mdboom it seems the build fails at cythonization again 😛

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda.bindings Everything related to the cuda.bindings module cuda.core Everything related to the cuda.core module enhancement Any code-related improvements P0 High priority - Must do!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants