fix(title): generate thread titles asynchronously #922
tda1017 wants to merge 5 commits into bytedance:main
Conversation
@tda1017 Please check the `.gitconfig` user setting for your commits. It looks like you just used `root` as the `user.name`.
Force-pushed from `0bdc88b` to `c691ac2`
Hi @WillemJiang, thanks for the review! I've checked the commits and they now show the correct author. The CLA has also been signed. Is there anything else that needs to be addressed?
@tda1017 Please check the "Ready for review" button if the PR is ready.

@tda1017 Do you mind fixing the lint error?
Pull request overview
This PR addresses #887 by decoupling thread-title generation from the main agent run so the UI can complete a conversation without waiting on title-model latency. It introduces an asynchronous title generation queue/updater with timeout/retry/fallback behavior and adds test coverage for the new async flow.
Changes:
- Move title generation out of `TitleMiddleware.after_agent` into an async queue + updater that writes back titles after the main run completes.
- Add `timeout_seconds` and `max_retries` to title generation config.
- Add backend tests covering enqueue behavior, timeout fallback, and non-overwrite semantics.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| config.example.yaml | Adds new title timeout/retry configuration knobs. |
| backend/src/config/title_config.py | Extends TitleConfig with timeout and retry fields + validation. |
| backend/src/agents/middlewares/title_middleware.py | Switches from synchronous title generation to enqueueing async work. |
| backend/src/agents/title/queue.py | Introduces a singleton queue that dispatches background workers. |
| backend/src/agents/title/updater.py | Implements title generation with timeout/retry/fallback and guarded write-back via LangGraph SDK. |
| backend/src/agents/title/__init__.py | Exposes queue/updater symbols at the package level. |
| backend/tests/test_title_generation.py | Updates config tests to cover the new config fields/validation. |
| backend/tests/test_title_async.py | Adds tests for async enqueueing and updater behavior (fallback + non-overwrite). |
```python
set_title_config(TitleConfig(timeout_seconds=0.01, max_retries=0, max_chars=20))
monkeypatch.setattr("src.agents.title.updater.create_chat_model", lambda **kwargs: SlowModel())

updater = TitleGenerationUpdater(client_factory=lambda url: None)
title = updater.generate_title([_message("human", "A long first message for fallback title")])

set_title_config(original)
assert title.startswith("A long first message")
```
This test mutates the global title config and restores it afterwards, but the restoration is not protected if the test fails midway (assertion error or exception), which can leak the modified config into later tests. Use try/finally or a fixture to guarantee `set_title_config(original)` always runs.
Suggested change:
```diff
-set_title_config(TitleConfig(timeout_seconds=0.01, max_retries=0, max_chars=20))
-monkeypatch.setattr("src.agents.title.updater.create_chat_model", lambda **kwargs: SlowModel())
-updater = TitleGenerationUpdater(client_factory=lambda url: None)
-title = updater.generate_title([_message("human", "A long first message for fallback title")])
-set_title_config(original)
-assert title.startswith("A long first message")
+try:
+    set_title_config(TitleConfig(timeout_seconds=0.01, max_retries=0, max_chars=20))
+    monkeypatch.setattr("src.agents.title.updater.create_chat_model", lambda **kwargs: SlowModel())
+    updater = TitleGenerationUpdater(client_factory=lambda url: None)
+    title = updater.generate_title([_message("human", "A long first message for fallback title")])
+    assert title.startswith("A long first message")
+finally:
+    set_title_config(original)
```
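Beyond try/finally, a fixture-style guard keeps restoration out of each test body entirely. A minimal, self-contained sketch using a context manager — the dict-based `get_title_config`/`set_title_config` stand-ins here are illustrative, not the PR's real helpers:

```python
from contextlib import contextmanager

# Illustrative stand-ins for the PR's real TitleConfig helpers, so this sketch
# runs standalone; the real ones live in backend/src/config/title_config.py.
_config = {"timeout_seconds": 5.0, "max_retries": 2, "max_chars": 100}

def get_title_config():
    return dict(_config)

def set_title_config(cfg):
    _config.clear()
    _config.update(cfg)

@contextmanager
def title_config_override(**overrides):
    """Temporarily swap the global title config; restores even if the test fails."""
    original = get_title_config()
    set_title_config({**original, **overrides})
    try:
        yield
    finally:
        set_title_config(original)

# Usage inside a test body:
with title_config_override(timeout_seconds=0.01, max_retries=0, max_chars=20):
    assert get_title_config()["timeout_seconds"] == 0.01
assert get_title_config()["timeout_seconds"] == 5.0  # restored
```

Wrapping the same context manager in a `pytest.fixture` with a `yield` gives the fixture variant the comment mentions.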
```python
updater = TitleGenerationUpdater(client_factory=lambda url: ClientStub())
updater.process(TitleGenerationTask(thread_id="thread-2", messages=[_message("human", "x")]))
```
`test_updater_does_not_override_manual_title` calls `updater.process(...)`, which currently runs `generate_title()` (and may try to load `config.yaml` or hit model creation) even though the thread title is already manual and no update should occur. To keep this unit test fast and deterministic, stub `create_chat_model` or `TitleGenerationUpdater.generate_title` in this test as well.
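A hermetic version of that test could stub generation before calling `process()`. The sketch below is self-contained: `TitleGenerationTask` and `TitleGenerationUpdater` are minimal stand-ins for the PR's classes, with bodies invented for illustration:

```python
from dataclasses import dataclass
from typing import Any, Callable

# Minimal stand-ins so this sketch runs without the PR's backend; the real
# generate_title() loads config and constructs a chat model.
@dataclass
class TitleGenerationTask:
    thread_id: str
    messages: list

class TitleGenerationUpdater:
    def __init__(self, client_factory: Callable[[str], Any]):
        self._client_factory = client_factory

    def generate_title(self, messages: list) -> str:
        # Stand-in for the real path that touches config.yaml and model creation.
        raise RuntimeError("unit test should not reach config/model loading")

    def process(self, task: TitleGenerationTask) -> None:
        title = self.generate_title(task.messages)
        # guarded write-back via client omitted in this sketch

def test_process_with_stubbed_generation():
    updater = TitleGenerationUpdater(client_factory=lambda url: None)
    # In a real pytest test: monkeypatch.setattr(TitleGenerationUpdater, "generate_title", ...)
    updater.generate_title = lambda messages: "stub title"
    updater.process(TitleGenerationTask(thread_id="thread-2", messages=["x"]))

test_process_with_stubbed_generation()
```

With the stub in place, `process()` never reaches config loading or model construction, so the test stays deterministic regardless of the environment.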
```python
def process(self, task: TitleGenerationTask) -> None:
    title = self.generate_title(task.messages)
    self._update_thread_title_if_needed(task.thread_id, title)
```
`process()` generates a title before checking whether the thread already has a non-default/manual title. If the user manually renames the thread before the async worker runs, this will still spend an LLM call (and potentially retries/timeouts) even though write-back is skipped. Consider fetching the current thread title first (or passing the known title into the task) and returning early before calling `generate_title()` when the title is already set.
Suggested change:
```diff
-    def process(self, task: TitleGenerationTask) -> None:
-        title = self.generate_title(task.messages)
-        self._update_thread_title_if_needed(task.thread_id, title)
+    def process(self, task: TitleGenerationTask) -> None:
+        # Avoid unnecessary title generation if the thread already has a non-default/manual title.
+        if self._has_non_default_title(task.thread_id):
+            return
+        title = self.generate_title(task.messages)
+        self._update_thread_title_if_needed(task.thread_id, title)
+
+    def _has_non_default_title(self, thread_id: str) -> bool:
+        """
+        Returns True if the thread already has a non-empty title, indicating that
+        a manual or non-default title has been set and we should skip generation.
+        """
+        # If no client is available, we cannot check the existing title; fall back to generation.
+        if self._client_factory is None and _get_sync_client is None:
+            return False
+        try:
+            if self._client_factory is not None:
+                client = self._client_factory(self._langgraph_url)
+            else:
+                client = _get_sync_client(self._langgraph_url)  # type: ignore[misc]
+            thread = client.threads.get(thread_id)
+        except Exception as exc:
+            logger.warning("Failed to fetch thread %s while checking existing title: %s", thread_id, exc)
+            return False
+        existing_title = getattr(thread, "title", None)
+        if not isinstance(existing_title, str):
+            return False
+        return existing_title.strip() != ""
```
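The other option mentioned above — carrying the known title on the task at enqueue time so the worker needs no extra fetch — might look like this (a sketch; the `current_title` field and `should_generate` helper are hypothetical, not part of the PR):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TitleGenerationTask:
    thread_id: str
    messages: list
    # Hypothetical: snapshot of the thread title taken by the enqueuing code.
    current_title: Optional[str] = None

def should_generate(task: TitleGenerationTask) -> bool:
    """Skip generation when the enqueuing code already saw a non-empty title."""
    return not (task.current_title and task.current_title.strip())

# Enqueue-time snapshot avoids a client round-trip in the worker:
assert should_generate(TitleGenerationTask("t1", [], current_title=None))
assert not should_generate(TitleGenerationTask("t2", [], current_title="My manual title"))
```

A snapshot taken at enqueue time can still race with a rename that happens afterwards, so the guarded write-back (`_update_thread_title_if_needed`) would remain necessary either way.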
```python
def _invoke_with_timeout(self, prompt: str, config: TitleConfig) -> str:
    result: dict[str, Any] = {}
    error: dict[str, BaseException] = {}

    def _invoke() -> None:
        try:
            model = create_chat_model(name=config.model_name, thinking_enabled=False)
            response = model.invoke(prompt)
            result["content"] = str(getattr(response, "content", "") or "")
        except Exception as exc:
            error["exception"] = exc

    worker = threading.Thread(target=_invoke, daemon=True)
    worker.start()
    worker.join(timeout=config.timeout_seconds)
    if worker.is_alive():
        raise TimeoutError("title generation timed out")
    if "exception" in error:
```
The timeout implementation starts a daemon thread to run `model.invoke()` and raises `TimeoutError` if `join()` times out, but the underlying invoke thread continues running. With retries enabled, multiple timed-out invocations can run concurrently and accumulate, causing unbounded thread growth and lingering network calls. Prefer enforcing timeouts at the model/client layer (e.g., request timeout settings) or using a cancellable execution mechanism so timed-out attempts don't keep running.
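If the client layer cannot enforce a timeout directly, one interim mitigation is routing all invocations through a shared bounded pool so timed-out calls cannot accumulate without limit (a sketch; the pool size and `invoke` parameter are illustrative, and a timed-out call still runs to completion, since Python threads cannot be killed):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# One shared pool for all title generation: at most 2 model calls in flight,
# so timed-out attempts occupy a fixed number of workers instead of growing unboundedly.
_title_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="title-gen")

def invoke_with_timeout(invoke, prompt: str, timeout_seconds: float) -> str:
    """Run invoke(prompt) on the shared pool; raise TimeoutError at the deadline.

    Note: a timed-out call keeps running in its worker thread; the bound only
    caps how many such leaked calls can exist at once.
    """
    future = _title_pool.submit(invoke, prompt)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        future.cancel()  # only effective if the task has not started yet
        raise TimeoutError("title generation timed out")
```

This trades the thread-per-attempt pattern for backpressure: when both workers are stuck in slow calls, new attempts queue instead of spawning more threads.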
```python
def add(self, thread_id: str, messages: list[Any]) -> None:
    with self._lock:
        if thread_id in self._pending:
            return
        self._pending.add(thread_id)

    worker = threading.Thread(target=self._process, args=(thread_id, messages), daemon=True)
    worker.start()
```
`TitleGenerationQueue.add()` spawns a new daemon thread per enqueued task. Under load (many new threads completing around the same time), this can create a large number of concurrent threads and increase memory/CPU overhead. Consider using a bounded worker pool (e.g., `ThreadPoolExecutor`) or a single background worker thread consuming from a queue to cap concurrency.
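A single consumer thread draining a `queue.Queue` is one way to cap concurrency at one in-flight generation (a sketch; only the `add`/de-duplication shape follows the PR, the rest of the wiring is illustrative):

```python
import queue
import threading

class TitleWorker:
    """Single background consumer replacing thread-per-task dispatch."""

    def __init__(self, process):
        self._tasks: queue.Queue = queue.Queue()
        self._pending: set[str] = set()
        self._lock = threading.Lock()
        self._process = process  # callable(thread_id, messages), e.g. the updater
        threading.Thread(target=self._run, daemon=True, name="title-worker").start()

    def add(self, thread_id: str, messages) -> None:
        with self._lock:
            if thread_id in self._pending:
                return  # de-duplicate per thread, as in the PR's queue
            self._pending.add(thread_id)
        self._tasks.put((thread_id, messages))

    def _run(self) -> None:
        while True:
            thread_id, messages = self._tasks.get()
            try:
                self._process(thread_id, messages)
            finally:
                with self._lock:
                    self._pending.discard(thread_id)
                self._tasks.task_done()
```

Bursts of completed conversations then queue behind one worker instead of fanning out into unbounded threads; swapping the single thread for a small `ThreadPoolExecutor` would allow limited parallelism with the same cap.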
Summary
- Move title generation out of `TitleMiddleware.after_agent` so main run completion is no longer blocked by title model latency

Testing
- `make test` (fails: `/bin/sh: 1: uv: not found`)
- `python3 -m pytest backend/tests/test_title_generation.py backend/tests/test_title_async.py -v` (fails: `No module named pytest`)
- `pnpm -C frontend check` (fails: `next: not found`, `node_modules` missing)
- `http://localhost:2026` and `http://localhost:3000` (both `ERR_CONNECTION_REFUSED`)

Fixes #887