Add OpenTelemetry observability with local JSONL traces and Traceloop option#347
Conversation
Greptile SummaryThis PR introduces a dual-write OpenTelemetry observability layer — local Key remaining issues:
Both issues are straightforward to fix and do not block core functionality, but the credential redaction gap is a security concern for users with Base64-encoded tokens. Confidence Score: 4/5
Last reviewed commit: 9f42719 |
e4af469 to
827a4f6
Compare
|
Looks good! All three changes check out:
Nice cleanup! ✅ |
| _SENSITIVE_TOKEN_PATTERN = re.compile( | ||
| r"(?i)\b(" | ||
| r"bearer\s+[a-z0-9._-]+|" | ||
| r"sk-[a-z0-9_-]{8,}|" | ||
| r"gh[pousr]_[a-z0-9_-]{12,}|" | ||
| r"xox[baprs]-[a-z0-9-]{12,}" | ||
| r")\b" | ||
| ) |
There was a problem hiding this comment.
Bearer token regex misses standard Base64 characters, leaving tokens partially un-redacted.
_SENSITIVE_TOKEN_PATTERN uses [a-z0-9._-]+ for the bearer-token value (line 33), but standard Base64 encoding also uses +, /, and = (padding). Any OAuth 2.0 access_token or similar that contains these characters will only be partially matched — the regex stops at the first +, /, or =, leaving the remainder of the token un-redacted in the JSONL output.
For example, Bearer ya29.A0AXzc1v+abc/def= would match only Bearer ya29.A0AXzc1v, leaving +abc/def= in plaintext.
Consider broadening the character class to cover standard Base64:
| _SENSITIVE_TOKEN_PATTERN = re.compile( | |
| r"(?i)\b(" | |
| r"bearer\s+[a-z0-9._-]+|" | |
| r"sk-[a-z0-9_-]{8,}|" | |
| r"gh[pousr]_[a-z0-9_-]{12,}|" | |
| r"xox[baprs]-[a-z0-9-]{12,}" | |
| r")\b" | |
| ) | |
| _SENSITIVE_TOKEN_PATTERN = re.compile( | |
| r"(?i)\b(" | |
| r"bearer\s+[a-z0-9._\-+/]+=*|" | |
| r"sk-[a-z0-9_-]{8,}|" | |
| r"gh[pousr]_[a-z0-9_-]{12,}|" | |
| r"xox[baprs]-[a-z0-9-]{12,}" | |
| r")\b" | |
| ) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/telemetry/utils.py
Line: 31-38
Comment:
Bearer token regex misses standard Base64 characters, leaving tokens partially un-redacted.
`_SENSITIVE_TOKEN_PATTERN` uses `[a-z0-9._-]+` for the bearer-token value (line 33), but standard Base64 encoding also uses `+`, `/`, and `=` (padding). Any OAuth 2.0 `access_token` or similar that contains these characters will only be partially matched — the regex stops at the first `+`, `/`, or `=`, leaving the remainder of the token un-redacted in the JSONL output.
For example, `Bearer ya29.A0AXzc1v+abc/def=` would match only `Bearer ya29.A0AXzc1v`, leaving `+abc/def=` in plaintext.
Consider broadening the character class to cover standard Base64:
```suggestion
_SENSITIVE_TOKEN_PATTERN = re.compile(
r"(?i)\b("
r"bearer\s+[a-z0-9._\-+/]+=*|"
r"sk-[a-z0-9_-]{8,}|"
r"gh[pousr]_[a-z0-9_-]{12,}|"
r"xox[baprs]-[a-z0-9-]{12,}"
r")\b"
)
```
How can I resolve this? If you propose a fix, please make it concise.| def sanitize_run_dir_name(run_dir_name: str) -> str: | ||
| normalized = run_dir_name.strip() | ||
| digest = f"{zlib.crc32(normalized.encode('utf-8')):08x}" | ||
|
|
||
| sanitized = re.sub(r"[^A-Za-z0-9._-]+", "-", normalized).strip(".-") | ||
| if not sanitized: | ||
| sanitized = f"run-{digest}" | ||
| elif sanitized != normalized: | ||
| sanitized = f"{sanitized}-{digest}" | ||
|
|
||
| if len(sanitized) > 80: | ||
| prefix = sanitized[:71].rstrip(".-") | ||
| sanitized = f"{prefix}-{digest}" if prefix else f"run-{digest}" | ||
|
|
||
| return sanitized |
There was a problem hiding this comment.
sanitize_run_dir_name silently produces a fallback directory when the input becomes empty after sanitization, with no warning logged.
When the function applies strip(".-") on a string like "...", the result becomes empty. The fallback to f"run-{digest}" occurs (line 117–118) without any log output to help operators diagnose why their configured run name was not usable.
Consider adding a warning when the name becomes fully sanitized:
| def sanitize_run_dir_name(run_dir_name: str) -> str: | |
| normalized = run_dir_name.strip() | |
| digest = f"{zlib.crc32(normalized.encode('utf-8')):08x}" | |
| sanitized = re.sub(r"[^A-Za-z0-9._-]+", "-", normalized).strip(".-") | |
| if not sanitized: | |
| sanitized = f"run-{digest}" | |
| elif sanitized != normalized: | |
| sanitized = f"{sanitized}-{digest}" | |
| if len(sanitized) > 80: | |
| prefix = sanitized[:71].rstrip(".-") | |
| sanitized = f"{prefix}-{digest}" if prefix else f"run-{digest}" | |
| return sanitized | |
| if not sanitized: | |
| sanitized = f"run-{digest}" | |
| logger.warning("run_dir_name %r was fully sanitized; using fallback %r", run_dir_name, sanitized) | |
| elif sanitized != normalized: |
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/telemetry/utils.py
Line: 112-126
Comment:
`sanitize_run_dir_name` silently produces a fallback directory when the input becomes empty after sanitization, with no warning logged.
When the function applies `strip(".-")` on a string like `"..."`, the result becomes empty. The fallback to `f"run-{digest}"` occurs (line 117–118) without any log output to help operators diagnose why their configured run name was not usable.
Consider adding a warning when the name becomes fully sanitized:
```suggestion
if not sanitized:
sanitized = f"run-{digest}"
logger.warning("run_dir_name %r was fully sanitized; using fallback %r", run_dir_name, sanitized)
elif sanitized != normalized:
```
How can I resolve this? If you propose a fix, please make it concise.LiteLLM OTel callbacks are not enabled, so _build_trace_metadata() and its test are dead code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove all streaming event emission (agent.streaming.updated, cleared, interrupted) — high-frequency noise with no observability value. Revert unnecessary i->index, loc->location renames. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run names are auto-generated by generate_run_name() which already slugifies the input. The extra sanitization layer was redundant. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Traceloop uses raw print() statements for "exporting traces to a custom exporter" and "Metrics are disabled" messages. Redirect stdout during init to suppress them. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Now that traces are always written to the run dir, the output path is always relevant — not just when vulnerabilities are found. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Base64 screenshot payloads from browser_action were bloating events.jsonl (~10MB for small runs). Strip them at the sanitizer level since DOM/HTML content is sufficient for debugging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Docker SDK uses urllib3 for container API calls, which Traceloop auto-instruments — flooding events.jsonl with noise. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: 0xallam <ahmed39652003@gmail.com>
Co-authored-by: 0xallam <ahmed39652003@gmail.com>
Summary
Tracerstrix_runs/<run_name>/events.jsonlTRACELOOP_BASE_URL+TRACELOOP_API_KEY(+ optionalTRACELOOP_HEADERS)agent_idincludeagent_namewhen availableConfig / Docs
Tests
run.completed, streaming throttling, and agent-name enrichmentotelcallback is added without clobbering existing callbacksValidation
uvx poetry run ruff checkon changed telemetry/test filesuvx poetry run pytest tests/config/test_config_telemetry.py tests/telemetry/test_tracer.py tests/llm/test_llm_otel.pyhttps://github.com/usestrix/strixandhttps://app.strix.aito verify runtime telemetry output inevents.jsonl