Preserve non-string dict keys in rich display #9301

Merged
manzt merged 7 commits into main from manzt/dict-keys on Apr 21, 2026

@manzt

@manzt manzt commented Apr 21, 2026

Collaborator

Fixes #9288
Fixes #2667

Marimo displays dicts using application/json but Python dicts aren't JSON and accept non-string keys (int, tuple, ...).

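The existing serializer worked around this limitation by passing primitive keys to json.dumps (which stringifies them) and running str() on composite keys. This led to two problems: key collisions, where {"2": "oh", 2: "no"} emits duplicate JSON keys and JSON.parse silently drops an entry (#9288), and lost type information, where a tuple key such as (1, 2) renders as the quoted string "(1, 2)" (#2667).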
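
Both failure modes can be reproduced with the standard library alone. The snippet below is a minimal illustration rather than marimo's serializer; it uses Python's json.loads as a stand-in for the frontend's JSON.parse, since both keep only the last value for a duplicated key.

```python
import json

my_map = {"2": "oh", 2: "no"}

# json.dumps stringifies the int key, so the wire JSON carries "2" twice.
wire = json.dumps(my_map)
print(wire)  # {"2": "oh", "2": "no"}

# The parser keeps only the last duplicate, so one entry disappears.
parsed = json.loads(wire)
print(len(my_map), len(parsed))  # 2 1

# Composite keys are rejected by json.dumps outright; falling back to
# str() produces a plain string and the tuple type is lost.
try:
    json.dumps({(1, 2): "v"})
except TypeError:
    fallback_key = str((1, 2))
    print(repr(fallback_key))  # '(1, 2)'
```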

These changes extend the existing text/plain+<type>: leaf-mimetype convention (already used for value encoding) to dict keys. Non-string keys are emitted as prefixed strings that the frontend decodes on render and copy. Literal string keys that happen to start with text/plain+ are escaped so they round-trip unchanged.

Wire format

| Python key | Wire string |
| --- | --- |
| "hello" | "hello" |
| "text/plain+int:2" (literal str that looks encoded) | "text/plain+str:text/plain+int:2" |
| 2 / 2**64 (any int) | "text/plain+int:<value>" |
| 2.5 | "text/plain+float:2.5" |
| float('nan') / float('inf') | "text/plain+float:nan" / "text/plain+float:inf" |
| True / False | "text/plain+bool:True" / "text/plain+bool:False" |
| None | "text/plain+none:" |
| (1, 2) | "text/plain+tuple:[1, 2]" |
| frozenset({1, 2}) | "text/plain+frozenset:[1, 2]" |
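
The table can be read as a small encoding function. The sketch below is one possible reading of it, not the code from this PR: `encode_key`, `_escape`, and `PREFIX` are hypothetical names, and the use of `default=str` for elements JSON cannot represent is an assumption.

```python
import json
from typing import Any

PREFIX = "text/plain+"


def _escape(text: str) -> str:
    # Strings that already look encoded get the str: escape so they round-trip.
    return f"{PREFIX}str:{text}" if text.startswith(PREFIX) else text


def encode_key(key: Any) -> str:
    """Hypothetical encoder that follows the wire-format table."""
    if isinstance(key, str):
        return _escape(key)
    if isinstance(key, bool):  # must come before int: bool subclasses int
        return f"{PREFIX}bool:{key}"
    if isinstance(key, int):
        return f"{PREFIX}int:{key}"
    if isinstance(key, float):
        return f"{PREFIX}float:{key}"  # str() gives "nan", "inf", "-inf"
    if key is None:
        return f"{PREFIX}none:"
    if isinstance(key, (tuple, frozenset)):
        kind = "tuple" if isinstance(key, tuple) else "frozenset"
        # Assumption: elements JSON cannot represent fall back to str().
        return f"{PREFIX}{kind}:{json.dumps(list(key), default=str)}"
    return _escape(str(key))  # any other hashable


encoded = {encode_key(k): v for k, v in {"2": "oh", 2: "no"}.items()}
print(encoded)  # {'2': 'oh', 'text/plain+int:2': 'no'}
```

Checking bool before int matters because `isinstance(True, int)` is true in Python; with the order reversed, `True` would be encoded as an int.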

Before / after

my_map = {"2": "oh", 2: "no"}

Before:

{ 1 Items
  "2": "no"          # silently dropped one entry
}

After:

Copilot AI review requested due to automatic review settings April 21, 2026 14:19
@manzt manzt added the bug ("Something isn't working") label Apr 21, 2026
Copilot AI left a comment

Pull request overview

This PR fixes rich dict rendering/copying when Python dicts contain non-string keys by introducing a reversible “typed key” wire encoding (text/plain+<type>:) so keys survive JSON round-trips without collisions or type loss.

Changes:

  • Backend: extend structure flattening to support a key_formatter, and use it in the structures formatter to encode non-string dict keys with text/plain+... prefixes (escaping literal string keys that already start with the prefix).
  • Tests (Python): add regression and coverage tests for non-string key encoding (ints, floats incl. NaN/Inf, tuples, frozensets, escaping, nesting).
  • Frontend: decode encoded keys for tree rendering and for “copy as Python” output; add corresponding unit tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/_output/formatters/test_structures.py | Adds coverage/regression tests asserting dict keys encode safely and round-trip through strict JSON parsing. |
| marimo/_utils/flatten.py | Adds optional key_formatter hook to control dict-key repacking during flatten/unflatten. |
| marimo/_output/formatters/structures.py | Implements key encoding (_key_formatter) and wires it into format_structure() output for JSON. |
| frontend/src/components/editor/output/JsonOutput.tsx | Decodes typed key strings for display in the JSON tree and for Python-like copy output. |
| frontend/src/components/editor/output/__tests__/json-output.test.ts | Adds copy-output tests to ensure encoded keys decode into correct Python literals. |
| frontend/src/components/editor/output/__tests__/JsonOutput-mimetype.test.tsx | Adds render test verifying encoded keys display as Python-style keys (unquoted ints, tuples, etc.). |

@manzt manzt force-pushed the manzt/dict-keys branch from 2346ad0 to 64379ce Compare April 21, 2026 14:41
manzt added a commit that referenced this pull request Apr 21, 2026
Review feedback from Copilot on #9301:

- Tuple key with a single element rendered as `(1)` instead of `(1,)`
  (the former is just `1` in Python, not a tuple). Parse the JSON list
  payload and format with a trailing comma for length-1.
- Empty frozenset key rendered as `frozenset({})` (which Python reads
  as constructing from an empty dict). Special-case empty payloads as
  `frozenset()`.
- Both fixes apply to tree rendering (`KEY_DECODERS`) and copy output
  (`decodeKeyForCopy`); shared helpers `formatTuplePayload` and
  `formatFrozensetPayload` handle both paths.
- Python `str(k)` fallback paths could emit strings starting with
  `text/plain+` (e.g. a custom hashable with a hostile `__str__`),
  which the frontend would then mis-decode. Route all fallbacks through
  `_escape_fallback` so they get the same `text/plain+str:` escape
  we use for literal string keys.
- Tightened the `not.toContain(...)` test assertions that had extra
  trailing characters making them pass trivially.

New tests:
- Python: `{frozenset(): "v"}`, `{(42,): "v"}`, and a `Hostile`
  class that returns `text/plain+int:99` from `__str__`.
- Frontend: copy output for 1-tuple and empty-frozenset keys.
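
The edge cases in this commit message can be sketched in a few lines. The real helpers (`formatTuplePayload`, `formatFrozensetPayload`, `_escape_fallback`) are not shown on this page, so the Python functions below are hypothetical stand-ins that only mirror the described behaviour; nested containers inside a payload are not treated specially.

```python
import json


def _literal(item) -> str:
    # None / True / False keep their Python spelling; everything else
    # uses JSON text (double-quoted strings, plain numbers).
    if item is None or isinstance(item, bool):
        return str(item)
    return json.dumps(item)


def format_tuple_payload(payload: str) -> str:
    items = [_literal(item) for item in json.loads(payload)]
    if len(items) == 1:
        return f"({items[0]},)"  # (1) would just be the int 1
    return "(" + ", ".join(items) + ")"


def format_frozenset_payload(payload: str) -> str:
    items = [_literal(item) for item in json.loads(payload)]
    if not items:
        return "frozenset()"  # frozenset({}) would build from an empty dict
    return "frozenset({" + ", ".join(items) + "})"


def escape_fallback(key) -> str:
    # Fallback keys go through str(); escape results that look encoded.
    text = str(key)
    return f"text/plain+str:{text}" if text.startswith("text/plain+") else text


class Hostile:
    def __str__(self) -> str:
        return "text/plain+int:99"


print(format_tuple_payload("[42]"))      # (42,)
print(format_frozenset_payload("[]"))    # frozenset()
print(escape_fallback(Hostile()))        # text/plain+str:text/plain+int:99
```
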
@manzt manzt force-pushed the manzt/dict-keys branch from 64379ce to 3ec0172 Compare April 21, 2026 14:54
@manzt

manzt commented Apr 21, 2026

Collaborator Author

As a follow-up, I think we could make quoting consistent between keys and values. Update: fixed.

@manzt manzt force-pushed the manzt/dict-keys branch from 3ec0172 to 4b84ff9 Compare April 21, 2026 15:14
@manzt manzt force-pushed the manzt/dict-keys branch from 4b84ff9 to 0d60939 Compare April 21, 2026 15:15
@manzt manzt force-pushed the manzt/dict-keys branch from fac7e59 to 151907c Compare April 21, 2026 20:20
@manzt manzt force-pushed the manzt/dict-keys branch from 151907c to 1e07171 Compare April 21, 2026 21:05
manzt added 2 commits April 21, 2026 17:25
Rich display of a Python dict was serialized as application/json via
json.dumps, which coerces non-string keys to strings. That meant
{"2": "oh", 2: "no"} emitted duplicate JSON keys, which JSON.parse
collapses on the frontend (entries silently dropped), and tuple keys
like (1, 2) rendered as the quoted string "(1, 2)" (type info lost).

Non-string primitive and composite keys are now encoded with the same
text/plain+<type>: mimetype convention used for values. The frontend
can decode them to restore the original Python types.

- flatten: new optional key_formatter param applied to each dict key
  before repacking; existing json_compat_keys behavior preserved as the
  default for other callers.
- structures: _key_formatter handles int, float (incl. NaN/Inf), bool,
  None, tuple, frozenset, and escapes literal string keys that start
  with 'text/plain+' so they round-trip unchanged.

Fixes #9288. Partial fix for #2667 (frontend render in follow-up).
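
As a rough picture of what a key_formatter hook does, the sketch below walks a nested structure and passes every dict key through a caller-supplied function. It is a simplified recursive walk, not marimo's flatten/unflatten implementation; `apply_key_formatter` and `tag_non_string` are hypothetical names.

```python
from typing import Any, Callable


def apply_key_formatter(obj: Any, key_formatter: Callable[[Any], str]) -> Any:
    """Rebuild containers, passing every dict key through key_formatter."""
    if isinstance(obj, dict):
        return {
            key_formatter(k): apply_key_formatter(v, key_formatter)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [apply_key_formatter(v, key_formatter) for v in obj]
    if isinstance(obj, tuple):
        return tuple(apply_key_formatter(v, key_formatter) for v in obj)
    return obj  # leaves are returned unchanged


def tag_non_string(key: Any) -> str:
    # Toy formatter: strings pass through, other keys get a type prefix.
    return key if isinstance(key, str) else f"text/plain+{type(key).__name__}:{key}"


nested = {"2": "oh", 2: [{3: "deep"}]}
result = apply_key_formatter(nested, tag_non_string)
print(result)
# {'2': 'oh', 'text/plain+int:2': [{'text/plain+int:3': 'deep'}]}
```

Keeping the formatter as a parameter lets callers that do not pass one retain their existing key handling, which matches the "default for other callers" note above.
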

Decode the text/plain+<type>: keys emitted by the Python side so dict
output renders with the right Python types: int/float/bool/None
unquoted, tuples in parens, frozenset({...}), and string keys that
were escaped (because they looked encoded) are re-quoted as plain
strings.

- JsonOutput: standalone keyRenderer backed by a small KEY_DECODERS
  table, wired into the JsonViewer only when valueTypes is 'python'.
- getCopyValue: pre-walk the data to rewrite encoded keys into
  REPLACE_PREFIX/SUFFIX marker strings so the existing quote-strip
  pass unquotes them as Python literals. NaN/Inf float keys copy as
  float('nan'), float('inf'), -float('inf').

Closes #2667 (tuple keys display as strings). Companion to the
backend encoder that fixes #9288.
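
The decoding direction for primitive keys can be sketched as a prefix lookup. The frontend code is TypeScript and is not shown on this page; the Python function below is a hypothetical illustration of the copy-path rules described above (`decode_key_for_copy` is an invented name), and it leaves tuple and frozenset payloads out for brevity.

```python
import json

PREFIX = "text/plain+"

# Special float payloads and the Python source text they copy as.
SPECIAL_FLOATS = {
    "nan": "float('nan')",
    "inf": "float('inf')",
    "-inf": "-float('inf')",
}


def decode_key_for_copy(wire_key: str) -> str:
    """Turn a wire key into Python source text (primitives only)."""
    if not wire_key.startswith(PREFIX):
        return json.dumps(wire_key)  # ordinary string key stays quoted
    kind, _, payload = wire_key[len(PREFIX):].partition(":")
    if kind == "str":
        return json.dumps(payload)  # escaped literal string, re-quoted
    if kind in ("int", "bool"):
        return payload  # already valid Python: 2, True, False
    if kind == "none":
        return "None"
    if kind == "float":
        return SPECIAL_FLOATS.get(payload, payload)
    return json.dumps(wire_key)  # unknown type: keep as a quoted string


for key in ["hello", "text/plain+int:2", "text/plain+str:text/plain+int:2",
            "text/plain+float:-inf", "text/plain+none:"]:
    print(decode_key_for_copy(key))
```
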
manzt added 5 commits April 21, 2026 17:25
Manual smoke-test notebook for the rich display of Python dicts, exercising:

- baselines (empty, single-entry, record-shaped string-key dicts)
- value variety (ints, bigints, floats, NaN/Inf, bools, None, strings,
  lists, tuples, sets, frozensets, nested dicts, bytes)
- non-string keys (collision cases, all primitives, NaN/Inf, tuple,
  frozenset)
- the text/plain+str: string-escape edge case
- nesting and dict-in-list / tuple-of-dict composition
- defaultdict and OrderedDict
- Python-level True/1/1.0 hash-collapse
- the copy-to-Python button with an all-types target

Each section includes a serialized() helper that shows the wire JSON and
the JSON.parse entry count, making it easy to spot if a future change
silently drops entries.

Related issues: #9288, #2667.
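
The True/1/1.0 hash-collapse listed above happens inside Python before any serializer sees the dict, so no wire encoding can recover the lost entries. A short demonstration using only built-ins:

```python
# True, 1 and 1.0 compare equal and share a hash, so they are one dict key.
collapsed = {True: "bool", 1: "int", 1.0: "float"}

print(collapsed)       # {True: 'float'}
print(len(collapsed))  # 1

# The first key object is kept; later assignments only replace the value.
print(hash(True) == hash(1) == hash(1.0))  # True
```
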

Previously, set values serialized as `text/plain+set:{1, 2, 3}` (Python
set-literal string via `str()`) and frozenset values fell through to
the `text/plain:` fallback (plain-text display). Both used Python's
single-quoted repr for string elements, so a dict like:

    {"a": frozenset({"x", "y"})}

rendered with inconsistent quoting — double-quoted keys/values
throughout except for the frozenset value's elements, which came out
single-quoted (`frozenset({'x', 'y'})`).

Normalize both to the JSON-list payload form we already use for tuple
values and non-string key encoding. The frontend shares a pair of
helpers (`formatSetPayload`, `formatFrozensetPayload`) between the
tree renderer and the copy path, handling the empty cases correctly
(`set()` and `frozenset()`, not `{}`).

Wire format changes:

- set:       `text/plain+set:{1, 2, 3}`       -> `text/plain+set:[1, 2, 3]`
- frozenset: `text/plain:frozenset({'x','y'})` -> `text/plain+frozenset:["x", "y"]`

Rendering is now consistent:

- `{1, 2}` / `set()` for sets
- `frozenset({"x", "y"})` / `frozenset()` for frozensets

Tests updated accordingly.
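
A sketch of the new value encoding and its rendering, under the assumption that elements are JSON-serializable. `encode_set_value` and `format_set_payload` are hypothetical Python names chosen for illustration; the actual frontend helper is TypeScript and is not reproduced here.

```python
import json


def encode_set_value(value) -> str:
    """Encode a set or frozenset value with a JSON-list payload."""
    kind = "frozenset" if isinstance(value, frozenset) else "set"
    # Element order follows set iteration order; no sorting is assumed.
    return f"text/plain+{kind}:{json.dumps(list(value))}"


def format_set_payload(payload: str) -> str:
    """Render a set payload as Python-style text."""
    items = [json.dumps(item) for item in json.loads(payload)]
    if not items:
        return "set()"  # {} would be an empty dict, not an empty set
    return "{" + ", ".join(items) + "}"


print(encode_set_value({1}))             # text/plain+set:[1]
print(encode_set_value(frozenset()))     # text/plain+frozenset:[]
print(format_set_payload('["x", "y"]'))  # {"x", "y"}
print(format_set_payload("[]"))          # set()
```

Using a JSON list as the payload is what gives string elements the same double-quoted form as keys and other values, instead of Python's single-quoted repr.
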

Replace `assert x == {literal}` patterns in the dict-key-encoding
tests with `assert x == snapshot({literal})`. Functionally identical
today, but if we ever need to update the expected wire format, the
snapshots auto-update with `pytest --inline-snapshot=update` instead
of being hand-edited in every test.

Also cleans up the redundant `import json` inside several test
functions — the module-level import (hoisted earlier) is in scope.
@manzt manzt force-pushed the manzt/dict-keys branch from 1e07171 to 6ca6b9d Compare April 21, 2026 21:25
@manzt manzt merged commit 2952e30 into main Apr 21, 2026
43 checks passed
@manzt manzt deleted the manzt/dict-keys branch April 21, 2026 21:52
@github-actions

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.3-dev15
