BUG: Fix MultiIndex partial-key lookup when np.datetime64 indexes datetime.date level (GH#55969)#64343

Open
logiop wants to merge 5 commits into pandas-dev:main from
logiop:fix/gh-55969-datetime-date-multiindex-np-datetime64

@logiop logiop commented Feb 27, 2026

Summary

Fixes #55969.

When a MultiIndex has an object-dtype level whose values are
datetime.date objects, using np.datetime64 as a partial key
(e.g. df.loc[(np.datetime64("2023-11-01"), "A")]) silently ignores all
key components after the date level and returns every row whose date matches —
regardless of what the remaining key levels say.

Root cause

MultiIndex._partial_tup_index guards with lab not in lev to detect keys
absent from a level and then takes a fast short-circuit path that returns
immediately, without processing the remaining key levels.

The hashtable-based __contains__ does not coerce np.datetime64 to
datetime.date, so the lab not in lev check always evaluated to True (key
treated as "absent") for np.datetime64 keys against object-dtype levels holding
datetime.date values. The subsequent key components (e.g. "A", "B")
were therefore never applied, producing wrong results.

The same problem exists in _get_loc_single_level_index, which is called
when the key falls into the follow_key portion of MultiIndex.get_loc.

Fix

Before the lab not in lev guard in both _partial_tup_index and
_get_loc_single_level_index, attempt to convert a bare np.datetime64
key to its Python-scalar equivalent when the target level is object-dtype:

  • midnight timestamps → datetime.date
  • all other timestamps → datetime.datetime

Only substitute the converted value when it is actually present in the level,
so genuine type-mismatch key errors are preserved.
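
The conversion rule above can be sketched outside pandas. This is a minimal illustration, not the code in the PR: the helper names datetime64_to_python_scalar and coerce_key_for_object_level are hypothetical, and a plain list stands in for the object-dtype level. It assumes the key is first cast to microsecond resolution, because .item() on a nanosecond np.datetime64 returns an int rather than a datetime.datetime.

```python
import datetime

import numpy as np


def datetime64_to_python_scalar(key):
    """Hypothetical helper: map np.datetime64 to date/datetime, else return key."""
    if not isinstance(key, np.datetime64) or np.isnat(key):
        return key
    # Cast to microseconds so .item() yields datetime.datetime
    # (sub-microsecond precision, if any, is truncated by this cast).
    py_value = key.astype("datetime64[us]").item()
    if not isinstance(py_value, datetime.datetime):
        # Outside the range datetime.datetime can represent: leave untouched.
        return key
    if py_value.time() == datetime.time(0):
        # Midnight timestamps map to datetime.date.
        return py_value.date()
    # All other timestamps map to datetime.datetime.
    return py_value


def coerce_key_for_object_level(key, level_values):
    """Hypothetical helper: substitute the converted key only if the level has it."""
    converted = datetime64_to_python_scalar(key)
    return converted if converted in level_values else key


level = [datetime.date(2023, 11, 1), datetime.date(2023, 11, 2)]
present = coerce_key_for_object_level(np.datetime64("2023-11-01"), level)
absent = coerce_key_for_object_level(np.datetime64("2024-01-01"), level)
print(type(present).__name__, type(absent).__name__)
```

Returning the original key when the converted value is not in the level mirrors the "only substitute when present" rule, so a key that is genuinely missing still reaches the existing error path unchanged.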

Example

import datetime, numpy as np, pandas as pd

dates = [datetime.date(2023, 11, 1),
         datetime.date(2023, 11, 1),
         datetime.date(2023, 11, 2)]
df = pd.DataFrame({"dates": dates, "t1": ["A","B","C"], "t2": ["C","D","E"],
                   "vals": [0.1, 0.2, 0.3]})
df.set_index(["dates","t1","t2"], inplace=True)

date = np.datetime64("2023-11-01")

# Before fix: both return 2 rows (all rows for 2023-11-01)
# After fix:  each returns exactly 1 row
df.loc[(date, "A")]   # 1 row: (2023-11-01, A, C)
df.loc[(date, "B")]   # 1 row: (2023-11-01, B, D)
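
For comparison, the datetime.date-keyed lookup is the reference result that the np.datetime64 lookup is expected to match after the fix. The sketch below rebuilds the same frame and uses only the datetime.date key, so it does not depend on the fix being applied; the variable names are illustrative.

```python
import datetime

import pandas as pd

dates = [datetime.date(2023, 11, 1),
         datetime.date(2023, 11, 1),
         datetime.date(2023, 11, 2)]
df = pd.DataFrame({"dates": dates, "t1": ["A", "B", "C"], "t2": ["C", "D", "E"],
                   "vals": [0.1, 0.2, 0.3]})
df = df.set_index(["dates", "t1", "t2"])

# Partial key using the level's own scalar type (datetime.date).
py_date = datetime.date(2023, 11, 1)
expected_a = df.loc[(py_date, "A"), :]
expected_b = df.loc[(py_date, "B"), :]

print(len(expected_a), len(expected_b))
```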

Test plan

  • Added test_loc_datetime_date_index_with_np_datetime64 in
    pandas/tests/indexes/multi/test_indexing.py which exercises:
    • df.loc[(np_date, second_level)] for two different second-level values
    • equality with the datetime.date-keyed equivalent
    • MultiIndex.slice_locs with np.datetime64 partial tuple keys

🤖 Generated with Claude Code

…etime.date level (GH#55969)

When a MultiIndex has an object-dtype level containing `datetime.date` values,
using `np.datetime64` as a partial key (e.g. `df.loc[(np_date, "A")]`) would
silently ignore all key levels after the first date level and return all rows
matching that date.

Root cause: `_partial_tup_index` checks `lab not in lev` to detect keys that
are absent from the level and takes a fast short-circuit path that returns
immediately.  The hashtable-based `__contains__` does not coerce
`np.datetime64` to `datetime.date`, so the check incorrectly evaluated to True
and the remaining key components (e.g. "A") were never applied.

Fix: before the `lab not in lev` guard in both `_partial_tup_index` and
`_get_loc_single_level_index`, convert a bare `np.datetime64` key to its
Python-scalar equivalent (`datetime.date` for midnight timestamps,
`datetime.datetime` otherwise) when the target level is object-dtype. This
allows the normal lookup path to be taken and keeps subsequent key levels in
play.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
logiop added a commit to logiop/pandas that referenced this pull request Feb 28, 2026
…#64343

- Fix numpy datetime64 to Python datetime conversion by adding .item()
  to ensure proper type coercion in _partial_tup_index and
  _get_loc_single_level_index (resolves typing validation failure)
- Fix namespace inconsistency in test_indexing.py: use DataFrame
  instead of pd.DataFrame to match file conventions
- Reorganize imports in test_indexing.py to put datetime import
  at module level (resolves inconsistent-namespace-usage pre-commit check)

These changes resolve GitHub Actions typing check failures and
pre-commit.ci validation errors for the MultiIndex datetime.date +
np.datetime64 bugfix.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@logiop logiop force-pushed the fix/gh-55969-datetime-date-multiindex-np-datetime64 branch 2 times, most recently from ad4a1f5 to fadf5cb on February 28, 2026 21:48
logiop and others added 4 commits February 28, 2026 22:50
- Move import datetime to module level (isort compliance)
- Remove local import datetime from function
- Use DataFrame instead of pd.DataFrame (namespace consistency)

Resolves pre-commit.ci validation errors:
- isort check
- inconsistent-namespace-usage check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Apply isort rules:
- multi.py: move 'import datetime' after 'from collections.abc'
- test_indexing.py: reorder imports (from collections before import datetime)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add # type: ignore[union-attr] to suppress pyright errors where we access
datetime.datetime attributes (hour, minute, second, microsecond, date)
on a union type datetime.datetime | datetime.date. The code logic
guarantees these attributes exist at those points.

Fixes the typing validation check in GitHub Actions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>