Strip ANSI escapes and control bytes from terminal tool output #554

Merged: 0xallam merged 1 commit into main from tui/sanitize-terminal-output on Jun 9, 2026
0xallam (Member) commented Jun 9, 2026

Why

Shell tool output was passed to Rich's Text verbatim, so SGR colour codes, cursor-movement CSI sequences, DEC private mode toggles, bracketed paste markers, OSC window-title escapes, and raw control bytes (NUL, backspace, BEL) all rendered as literal characters. The TUI layout broke whenever a command produced colour output (ls --color, npm, pip, make) or when the target's shell printed a title escape.
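For readers unfamiliar with the categories named above, the following sketch shows what each one typically looks like as raw bytes. The samples are illustrative assumptions, not the cases used in this PR.

```python
# Illustrative samples only; these are NOT the PR's own test cases.
SAMPLES = {
    "SGR colour": "\x1b[1;31mERROR\x1b[0m",
    "CSI cursor movement": "\x1b[2A\x1b[10Cdone",
    "DEC private mode": "\x1b[?25lworking\x1b[?25h",
    "bracketed paste": "\x1b[200~pasted text\x1b[201~",
    "OSC window title": "\x1b]0;user@host: ~\x07$ ",
    "raw control bytes": "a\x00b\x08c\x07",
}

for name, raw in SAMPLES.items():
    # repr() makes the otherwise invisible bytes visible.
    print(f"{name:22} {raw!r}")
```

Printing with repr() is a convenient way to see what the renderer received before any cleaning.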

What

_clean_output runs Text.from_ansi(...).plain so that Rich's parser strips the ANSI sequences it recognises; a stdlib str.translate pass then drops the remaining control bytes, preserving \t and \n. _truncate_line drops its now-redundant in-place SGR strip.
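The str.translate step can be sketched with the standard library alone. The table contents below are an assumption (every C0 control byte except TAB and LF, plus DEL); the PR does not show what _CONTROL_BYTES_TO_DROP actually contains, and drop_control_bytes is a hypothetical helper name.

```python
# Assumed table contents: all C0 control bytes except TAB (0x09) and LF (0x0A).
# Mapping a code point to None makes str.translate delete that character.
_CONTROL_BYTES_TO_DROP = {
    code: None for code in range(0x20) if code not in (0x09, 0x0A)
}
_CONTROL_BYTES_TO_DROP[0x7F] = None  # DEL, also an assumption


def drop_control_bytes(text: str) -> str:
    """Delete control characters while keeping tabs and newlines."""
    return text.translate(_CONTROL_BYTES_TO_DROP)


print(repr(drop_control_bytes("a\x00b\x08c\x07\td\n")))
```

A translate table is a single pass over the string and needs no regex, which is presumably why it suits a final clean-up step like this one.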

Verified

11-case offline before/after diff over realistic samples (ls --color, npm/pip output, bracketed paste, DEC private modes, OSC titles, raw NUL/BS/BEL bytes, CR progress bars, ssh banner with title escape). Plus an in-scan probe prompt that exercises each category via the agent's shell tool.

Known partial cases (acceptable)

  • OSC payloads (\x1b]0;title\x07) — Rich strips the opener and BEL, the title string leaks as plain text. Not TUI-corrupting (no cursor moves), just a visual artefact.
  • CR progress bars expand to multi-line — Rich converts \r to \n. Same behaviour as before this PR.
Shell tool output was passed to Rich verbatim, so SGR colour codes,
DEC private mode toggles, bracketed paste, OSC window-title escapes,
and raw control bytes (NUL, backspace, BEL) all rendered as literal
characters and corrupted the TUI layout.

_clean_output now runs Text.from_ansi(...).plain to let Rich parse
the ANSI sequences, then a stdlib str.translate drops the remaining
control bytes (preserves TAB and LF). _truncate_line drops its now
redundant in-place SGR strip.
greptile-apps Bot (Contributor) commented Jun 9, 2026

Greptile Summary

This PR fixes TUI layout corruption caused by ANSI/CSI escape sequences and raw control bytes being passed verbatim to Rich's Text renderer. The fix threads all shell tool output through Text.from_ansi(...).plain followed by a str.translate pass that drops non-printable control bytes (preserving tab and newline), and removes a now-redundant SGR-strip from _truncate_line.

  • _clean_output gains a two-step pipeline: Rich's ANSI parser consumes SGR colours, cursor-movement sequences, bracketed-paste markers, and most OSC escapes; str.translate then drops any stray control bytes (NUL, BS, BEL, ESC, etc.) that survive the ANSI parse.
  • _truncate_line is simplified to a plain len(line) check, which is now correct because _clean_output is always applied before _format_output/_truncate_line in the call chain; the old code also had a latent bug where it checked length against the ANSI-stripped string but sliced the raw string, potentially cutting mid-escape-sequence.
  • Two known partial cases are documented in the PR: OSC title text (\x1b]0;…\x07) leaks as plain text after Rich strips its delimiters, and \r-based progress bars expand to multi-line because Rich converts \r to \n before .plain is returned.
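The latent bug described in the second bullet can be reproduced with a small sketch. The old implementation is not shown in this PR, so truncate_old is a reconstruction from the description above; the SGR pattern, the "..." suffix, and both function names are assumptions.

```python
import re

# Hypothetical SGR matcher; the renderer's original pattern is not shown.
_SGR = re.compile(r"\x1b\[[0-9;]*m")


def truncate_old(line: str, limit: int) -> str:
    """Measure the visible length but slice the raw string (the latent bug)."""
    visible = _SGR.sub("", line)
    if len(visible) > limit:
        return line[:limit] + "..."  # may cut in the middle of an escape
    return line


def truncate_new(line: str, limit: int) -> str:
    """Plain length check; assumes the line has already been cleaned."""
    if len(line) > limit:
        return line[:limit] + "..."
    return line


raw = "\x1b[1;31mERROR: disk full\x1b[0m"
print(repr(truncate_old(raw, 4)))                 # slice ends inside the escape
print(repr(truncate_new("ERROR: disk full", 4)))  # slice counts visible text only
```

With the cleaning step guaranteed to run first, the length check and the slice operate on the same string, so the mismatch cannot occur.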

Confidence Score: 5/5

Safe to merge; the change is narrow and well-contained, and the ordering invariant (_clean_output before _format_output/_truncate_line) is maintained throughout the file.

The two-step ANSI-strip pipeline is logically correct: Text.from_ansi handles the structured escape sequences and str.translate catches any residual control bytes. The only artefact is the documented OSC title-string leak, which is cosmetic and non-corrupting. The _truncate_line simplification is a genuine improvement over the old code, which measured visible length against the stripped string but sliced the raw (escape-laden) string.

No files require special attention; the single changed file is small and the logic is easy to follow.

Important Files Changed

Filename: strix/interface/tui/renderers/shell_renderer.py
Overview: Adds _clean_output pipeline: Text.from_ansi().plain strips ANSI/CSI sequences, then str.translate drops remaining control bytes; simplifies _truncate_line since output is already ANSI-free at that point. Logic is correct and the ordering guarantee holds.

Reviews (1): Last reviewed commit: "Strip ANSI escapes and control bytes fro..."


def _clean_output(output: str) -> str:
    cleaned = output
    cleaned = Text.from_ansi(output).plain.translate(_CONTROL_BYTES_TO_DROP)


P2 OSC title text leaks into visible output

Text.from_ansi consumes the \x1b]0; opener and the \x07 terminator but leaves the enclosed title string as plain text. A process that emits \x1b]0;some text\x07 (e.g. SSH banners, shell PS1 sequences, many CLI tools that set the terminal title) will have "some text" appear verbatim in the rendered output. \x07 is already dropped by translate, so the only artefact is the title payload itself. This is noted in the PR as an acceptable known limitation, but worth a follow-up re.sub(r"\x1b\][^\x07\x1b]*[\x07\x1b\\]?", "", ...) pass before the translate call if that leakage becomes noisy in practice.
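A minimal sketch of such a follow-up pass, using only the standard library. It assumes the pass would run on the raw output before Rich parses it, since per the description above the \x1b] opener is already gone afterwards. The pattern also differs slightly from the one suggested: it treats the two-byte ST terminator (ESC \) as a unit, because a single-character class would consume only the ESC and could leave a stray backslash. strip_osc is a hypothetical helper, not part of the PR.

```python
import re

# Matches an OSC opener (ESC ]), its payload, and an optional terminator:
# either BEL or the two-byte ST (ESC \).
# Caveat: an unterminated OSC swallows text up to the next BEL/ESC or the
# end of the string.
_OSC = re.compile(r"\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)?")


def strip_osc(raw: str) -> str:
    """Remove OSC sequences, payload included, from raw terminal output."""
    return _OSC.sub("", raw)


print(repr(strip_osc("\x1b]0;user@host: ~\x07$ ls")))
print(repr(strip_osc("\x1b]2;title\x1b\\ok")))
```

Whether the extra pass is worth it depends on how often title escapes show up in practice, as the comment above notes.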

Path: strix/interface/tui/renderers/shell_renderer.py
Line: 74

0xallam merged commit 7217abf into main on Jun 9, 2026
2 checks passed
0xallam deleted the tui/sanitize-terminal-output branch on June 9, 2026 16:23