Add managed-memory advise, prefetch, and discard-prefetch free functions #1775
Open

rparolin wants to merge 63 commits into NVIDIA:main from rparolin:rparolin/managed_mem_advise_prefetch (base: main)
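The commits below describe unified "1..N" free functions (managed_memory.prefetch, advise, and discard_prefetch) that accept either a single buffer or a sequence of buffers. As a rough illustration of that shape only: the real cuda.core code dispatches to the CUDA driver, and every name in this sketch (prefetch, _normalize_buffers, the location argument) is an assumption, not the actual API.

```python
# Sketch of one plausible way a 1..N free function might normalize its
# input before issuing a batched driver call. Not the cuda.core code.
from collections.abc import Sequence


def _normalize_buffers(buffers):
    """Accept a single buffer or a non-empty sequence of buffers."""
    # str/bytes are Sequences too, so exclude them explicitly.
    if isinstance(buffers, Sequence) and not isinstance(buffers, (str, bytes)):
        bufs = list(buffers)
    else:
        bufs = [buffers]
    if not bufs:
        raise ValueError("expected at least one buffer")
    return bufs


def prefetch(buffers, location):
    # Stand-in for the batched driver call: return the per-buffer work
    # items instead of actually prefetching anything.
    return [(buf, location) for buf in _normalize_buffers(buffers)]
```

A caller can then write `prefetch(buf, loc)` and `prefetch([buf_a, buf_b], loc)` against one API surface, which is the "single API surface per operation" idea several commits below refer to.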
Commits (63; changes shown from 1 commit)
All commits are by rparolin.

- abdec47 wip
- c418050 wip
- b879fa5 fixing ci compiler errors
- 04ee3de skipping tests that aren't supported
- 9ab3f46 cu12 support
- bd75bc3 Merge branch 'main' into rparolin/managed_mem_advise_prefetch
- 1b1343b Merge branch 'main' into rparolin/managed_mem_advise_prefetch
- a948066 Moving to function from Buffer class methods to free standing functio…
- 1457599 precommit format
- acb4024 iterating on implementation
- d10ab07 Simplify managed-memory helpers: remove long-form aliases, cache look…
- ae1de36 Merge branch 'main' into rparolin/managed_mem_advise_prefetch
- c250c92 fix(test): reset _V2_BINDINGS cache so legacy-signature tests take th…
- 89329d9 fix(test): require concurrent_managed_access for advise tests that hi…
- 8a75d1b fix: validate managed buffer before checking discard_prefetch binding…
- 9e9b1e0 refactor: extract managed memory ops into dedicated _managed_memory_o…
- 90f0711 pre-commit fix
- b4d252c Removing blank file
- faaa1d8 wip
- 18786be Merge branch 'main' into rparolin/managed_mem_advise_prefetch
- 9766ddc Merge remote-tracking branch 'upstream/main' into rparolin/managed_me…
- cf2f20d fix(cuda.core): update binding_version import after upstream merge
- db3bac2 revert: drop managed_memory shim in cuda.core.experimental
- 20d036e feat(cuda.core): add Location dataclass for managed memory
- c2dae53 feat(cuda.core): add _coerce_location helper
- 935c8ba test(cuda.core): update monkeypatch target after binding_version rename
- dc46535 refactor(cuda.core): tighten memory-attr query
- 818f5d2 feat(cuda.core): unified 1..N managed_memory.prefetch with cydriver
- e296e72 feat(cuda.core): add managed_memory.discard
- e697131 feat(cuda.core): unified 1..N managed_memory.discard_prefetch with cy…
- 3bc1021 feat(cuda.core): unified 1..N managed_memory.advise + drop legacy app…
- fa23869 refactor(cuda.core): use Buffer.is_managed property in managed_memory…
- 68bdd14 docs(cuda.core): document Location, discard, and 1..N managed_memory ops
- b4d9cbf chore(cuda.core): drop narrative comments and tighten _coerce_locatio…
- ee96758 chore(cuda.core): satisfy pre-commit hooks
- d6f60f2 refactor(cuda.core): move managed_memory ops to cuda.core.utils
- 3176271 chore(cuda.core): use __all__ in utils instead of per-import noqa
- 782f6a9 chore(cuda.core): collapse nested if in Location.__post_init__ (SIM102)
- 0789bf6 test(cuda.core): share one DummyUnifiedMemoryResource per batched test
- e0c782a test(cuda.core): query all buffers before closing in test_batched_sam…
- 10de998 review(cuda.core): address PR #1775 feedback
- ab9a3ab test(cuda.core): split managed-memory ops tests into tests/memory/
- a3f342f test(cuda.core): fix options regex for AdviseOptions ("an" vs "a")
- c2a9662 chore(cuda.core): drop unused utils import + trailing blank lines
- bede674 feat(cuda.core): add ManagedBuffer subclass + Host location
- f59af4e chore(cuda.core): simplify ManagedBuffer per /simplify review
- 5147a7d ci: re-trigger CI (transient cuInit INVALID_DEVICE on l4 runner)
- 2151e61 refactor(cuda.core): use libcpp.vector for batched-op C arrays (R14)
- 5c6d054 fix(cuda.core): restore CUDA_ERROR_NOT_INITIALIZED auto-init in _quer…
- 47d5609 refactor(cuda.core): make Host a plain class instead of a dataclass (R1)
- a40bb81 feat(cuda.core)!: drop int location shorthand from managed-memory ops…
- c43e81e docs(cuda.core): add AccessedBySet to api_private.rst (R5)
- 71e9daa docs(cuda.core): note the legacy NUMA round-trip limitation on prefer…
- df928a0 refactor(cuda.core): use collections.abc.Sequence for input checks (R…
- f522916 refactor(cuda.core): narrow Buffer.from_handle to Buffer-only (R3)
- 6204c57 refactor(cuda.core): single API surface per operation (R9, R10, R11)
- 36012fd refactor(cuda.core): build advise reverse-lookup eagerly at module lo…
- 067fb15 refactor(cuda.core): factor shared body of _do_batch_{prefetch,discar…
- a9cd713 test(cuda.core): reuse production _get_int_attr in managed-memory tes…
- d75a7bd feat(cuda.core): cu12 fallback for prefetch_batch (N3)
- 0af5bd4 test(cuda.core): cover AccessedBySet read methods (N7)
- b0d1a21 feat(cuda.core): cu13 NUMA round-trip for ManagedBuffer.preferred_loc…
- 4c228eb docs(cuda.core): replace stale utils autosummary entries
refactor(cuda.core): make Host a plain class instead of a dataclass (R1)
Per Leo's review on PR #1775 (_host.py:9), drop the @dataclass(frozen=True) in favor of a hand-written class with property accessors. This matches Leo's original sketch from the 2026-04-28 drive-by comment and aligns with how Device is structured in this codebase.

Behavior preserved: Host(), Host(numa_id=N), and Host.numa_current() all work identically. __eq__, __hash__, and immutability are hand-rolled rather than dataclass-generated. is_numa_current is no longer an __init__ kwarg; it is internal state settable only via the Host.numa_current() classmethod.

Two existing TestHost cases updated:

- test_numa_current_with_id_rejected → test_numa_current_only_via_classmethod
- test_frozen → test_immutable (AttributeError instead of FrozenInstanceError)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
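The refactor described in this commit message can be sketched as follows. The names (Host, numa_id, is_numa_current, numa_current) come from the message itself; the body is an illustration of the stated design (hand-rolled equality, hashing, and immutability; classmethod-only is_numa_current), not the actual cuda.core implementation.

```python
class Host:
    """CPU (host) location for managed-memory operations (sketch)."""

    __slots__ = ("_numa_id", "_is_numa_current")

    def __init__(self, numa_id: int = 0):
        # object.__setattr__ bypasses the __setattr__ guard below so
        # construction can still populate the slots.
        object.__setattr__(self, "_numa_id", numa_id)
        object.__setattr__(self, "_is_numa_current", False)

    @classmethod
    def numa_current(cls) -> "Host":
        # is_numa_current is internal state settable only here,
        # never via an __init__ kwarg.
        self = cls()
        object.__setattr__(self, "_is_numa_current", True)
        return self

    @property
    def numa_id(self) -> int:
        return self._numa_id

    @property
    def is_numa_current(self) -> bool:
        return self._is_numa_current

    def __setattr__(self, name, value):
        # Hand-rolled immutability: AttributeError replaces the
        # dataclass-generated FrozenInstanceError mentioned in the tests.
        raise AttributeError(f"Host is immutable; cannot set {name!r}")

    def __eq__(self, other):
        if not isinstance(other, Host):
            return NotImplemented
        return (self._numa_id, self._is_numa_current) == (
            other._numa_id,
            other._is_numa_current,
        )

    def __hash__(self):
        return hash((Host, self._numa_id, self._is_numa_current))
```

With this shape, Host(), Host(numa_id=N), and Host.numa_current() all construct usable instances, while any later attribute assignment raises AttributeError, matching the renamed test_immutable case.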