Move to nanobind #522

Merged
evertlammerts merged 50 commits into duckdb:main from evertlammerts:prototype/nanobind-cutover
Jul 1, 2026
Conversation

@evertlammerts

Member

No description provided.

…ilding)

Build-system integration WORKS (CMake configure passes): find_package(Python)+nanobind,
nanobind_build_library(nanobind-static) feeding the object libs, nanobind_add_module NB_STATIC;
pyproject build dep pybind11->nanobind. Umbrella (pybind_wrapper.hpp) + enum caster macro +
identifier caster ported to nanobind from_python/from_cpp API; mechanical renames applied
(NB_MODULE, python_error, borrow/steal, def_prop_ro, namespace py = nanobind).

First build surfaced 254 errors; keystone fixes bring it to 224, cascade cleared. Remaining work
concentrated: numpy nb::ndarray port (~122), arrow_array_stream (59), py:: API diffs in
python_objects/relation/result/connection headers (~60), object wrappers in dataframe.hpp (12),
optional/pyconnection_default casters, register_exception, py::options, init_implicit, 81 .none().
Cleared categorically: identifier+enum casters, object wrappers (borrow_t ctors,
handle_type_name blocks removed), module_::import_, py::module_, namespace py = nanobind.

Build system still green (configure passes). Remaining concentrated in: numpy nb::ndarray
port (py::dtype has no nanobind equiv -> reroute via numpy.empty + nb::ndarray; touches
callers, not just the facade), ~150 scattered py:: API diffs (py::str->string, handle/object
nuances) across connection/relation/result/expression, optional/pyconnection_default casters,
register_exception->nb::exception, init_implicit, py::options.

numpy DONE: NumpyArray facade ported off py::array/py::dtype (cold-path ctypes.data buffer
access, dtype-as-string Allocate via numpy.empty, in-place resize) -- move-faithful, no copies.
Converted 15 .cast<>() method calls -> py::cast<>(), py::ssize_t->Py_ssize_t,
py::function->py::callable, dropped py::options. numpy_array.hpp + arrow_array_stream.hpp now
compile. Remaining: per-site py:: tail (~25 functional-cast string(obj)->py::cast, ~36
missing-member, move/ref bindings) across 12 files + pybind_wrapper.cpp impl + pyconnection_default
caster.
…ault caster retirement, bulk str/int/type-of/cast conversions
…, type_object, capsule.data, len, more conversions
…ssion), dict/list builds, bytes; numpy buffer-pointer caching (perf)
…t type-punning) in dataframe/scan/bind/map/udf
…or implicit conversions); guard numpy ctypes eager-compute
…o PyObject_Str runs) across numpy/pandas/udf/replacement paths
…ls crash cascade)

Add a custom type_caster<shared_ptr<DuckDBPyExpression>> (mirrors the DuckDBPyType one): keep
cast_flags::convert so the registered implicit conversions (str->column, scalar->constant) fire
for shared_ptr args, and when the inner caster yields no instance, construct through the registered
Python ctor (None->NULL constant) -- a real owned object, no dangling -- with PyErr_Clear() on
failure. Allow None on the Expression object-ctor (py::arg.none()). The PyErr_Clear is what
eliminates the stale-PyErr segfault CASCADE: the full fast suite now runs clean in parallel
(0 crashes, was unmeasurable). Failures 86 -> 66; expression/spark Expression cluster resolved
(spark 6->3). Belt-and-suspenders None guard in CreateCompareExpression/Coalesce.

@evertlammerts force-pushed the prototype/nanobind-cutover branch from b9929c1 to 2983c92 on June 30, 2026 18:36

The NumpyArray facade read the buffer pointer via numpy's `ctypes.data`
attribute chain and allocated via `numpy.empty(count, dtype_string)`. For a
top-level column that runs once per 2048-row chunk (amortized), but the
LIST/ARRAY per-element converter allocates a fresh array per row, so at 200k
rows it became ~600k ctypes-object allocations: df()/fetchnumpy() of a LIST
column ran ~6x slower than the pybind11 baseline (829ms vs 136ms).
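
The figures quoted above can be checked with a little arithmetic. The sketch below only restates numbers from this commit message; the "objects per row" value is a ratio derived from those numbers, not something the message states directly.

```python
import math

ROWS = 200_000      # row count quoted in the commit message
CHUNK_ROWS = 2048   # rows per chunk quoted in the commit message

# Top-level column: one allocation per chunk, so the cost is amortized.
top_level_allocations = math.ceil(ROWS / CHUNK_ROWS)

# LIST/ARRAY per-element converter: one fresh array per row.
per_element_allocations = ROWS

# Derived ratio (an inference): ~600k ctypes objects spread over 200k rows.
ctypes_objects_per_row = 600_000 / ROWS

# Slowdown against the pybind11 baseline, from the quoted timings in ms.
slowdown = 829 / 136

print(top_level_allocations, per_element_allocations)
print(ctypes_objects_per_row, round(slowdown, 1))
```

The gap between roughly a hundred chunk-level allocations and 200k per-row allocations is why the same facade code was cheap for plain columns and expensive for LIST columns.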

Read the buffer pointer directly from numpy's PyArrayObject C struct (a plain
field read, as pybind11's array.data() did), gated by a PyObject_TypeCheck
against numpy.ndarray so non-ndarray wrappers are never reinterpreted. Cache the
numpy.empty callable and per-dtype np.dtype objects, and skip the no-op
resize-to-current-length on the per-element path.

Output is byte-identical (lists, nested, nulls, empty, masked, large-N); the row
and arrow paths and the int/double/struct columnar paths are unaffected. LIST
df()/fetchnumpy() now match-or-beat the pybind11 baseline (69ms).

@evertlammerts marked this pull request as draft July 1, 2026 08:57
@evertlammerts marked this pull request as ready for review July 1, 2026 08:57
@evertlammerts merged commit d7e138f into duckdb:main Jul 1, 2026
16 checks passed