Strip ANSI escapes and control bytes from terminal tool output#554
Conversation
Shell tool output was passed to Rich verbatim, so SGR colour codes, DEC private mode toggles, bracketed paste, OSC window-title escapes, and raw control bytes (NUL, backspace, BEL) all rendered as literal characters and corrupted the TUI layout. _clean_output now runs Text.from_ansi(...).plain to let Rich parse the ANSI sequences, then a stdlib str.translate drops the remaining control bytes (preserves TAB and LF). _truncate_line drops its now redundant in-place SGR strip.
Greptile SummaryThis PR fixes TUI layout corruption caused by ANSI/CSI escape sequences and raw control bytes being passed verbatim to Rich's
Confidence Score: 5/5Safe to merge; the change is narrow and well-contained, and the ordering invariant ( The two-step ANSI-strip pipeline is logically correct: No files require special attention; the single changed file is small and the logic is easy to follow. Important Files Changed
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
strix/interface/tui/renderers/shell_renderer.py:74
**OSC title text leaks into visible output**
`Text.from_ansi` consumes the `\x1b]0;` opener and the `\x07` terminator but leaves the enclosed title string as plain text. A process that emits `\x1b]0;some text\x07` (e.g. SSH banners, shell PS1 sequences, many CLI tools that set the terminal title) will have `"some text"` appear verbatim in the rendered output. `\x07` is already dropped by `translate`, so the only artefact is the title payload itself. This is noted in the PR as an acceptable known limitation, but worth a follow-up `re.sub(r"\x1b\][^\x07\x1b]*[\x07\x1b\\]?", "", ...)` pass before the `translate` call if that leakage becomes noisy in practice.
Reviews (1): Last reviewed commit: "Strip ANSI escapes and control bytes fro..." | Re-trigger Greptile |
|
|
||
| def _clean_output(output: str) -> str: | ||
| cleaned = output | ||
| cleaned = Text.from_ansi(output).plain.translate(_CONTROL_BYTES_TO_DROP) |
There was a problem hiding this comment.
OSC title text leaks into visible output
Text.from_ansi consumes the \x1b]0; opener and the \x07 terminator but leaves the enclosed title string as plain text. A process that emits \x1b]0;some text\x07 (e.g. SSH banners, shell PS1 sequences, many CLI tools that set the terminal title) will have "some text" appear verbatim in the rendered output. \x07 is already dropped by translate, so the only artefact is the title payload itself. This is noted in the PR as an acceptable known limitation, but worth a follow-up re.sub(r"\x1b\][^\x07\x1b]*[\x07\x1b\\]?", "", ...) pass before the translate call if that leakage becomes noisy in practice.
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/interface/tui/renderers/shell_renderer.py
Line: 74
Comment:
**OSC title text leaks into visible output**
`Text.from_ansi` consumes the `\x1b]0;` opener and the `\x07` terminator but leaves the enclosed title string as plain text. A process that emits `\x1b]0;some text\x07` (e.g. SSH banners, shell PS1 sequences, many CLI tools that set the terminal title) will have `"some text"` appear verbatim in the rendered output. `\x07` is already dropped by `translate`, so the only artefact is the title payload itself. This is noted in the PR as an acceptable known limitation, but worth a follow-up `re.sub(r"\x1b\][^\x07\x1b]*[\x07\x1b\\]?", "", ...)` pass before the `translate` call if that leakage becomes noisy in practice.
How can I resolve this? If you propose a fix, please make it concise.
Why
Shell tool output was passed to Rich's
Textverbatim, so SGR colour codes, cursor-movement CSI sequences, DEC private mode toggles, bracketed paste markers, OSC window-title escapes, and raw control bytes (NUL, backspace, BEL) all rendered as literal characters. The TUI layout broke whenever a command produced colour output (ls --color,npm,pip,make) or when the target's shell printed a title escape.What
_clean_outputrunsText.from_ansi(...).plainso Rich's parser strips all ANSI sequences, then a stdlibstr.translatedrops remaining control bytes (preserves\tand\n)._truncate_linedrops its now-redundant in-place SGR strip.Verified
11-case offline before/after diff over realistic samples (
ls --color, npm/pip output, bracketed paste, DEC private modes, OSC titles, raw NUL/BS/BEL bytes, CR progress bars, ssh banner with title escape). Plus an in-scan probe prompt that exercises each category via the agent's shell tool.Known partial cases (acceptable)
\x1b]0;title\x07) — Rich strips the opener and BEL, the title string leaks as plain text. Not TUI-corrupting (no cursor moves), just a visual artefact.\rto\n. Same behaviour as before this PR.