fix(agent): hot-init providers on session restore credential miss (#74)#75
Open
Kantosaurus wants to merge 1 commit into1jehuang:masterfrom
Open
fix(agent): hot-init providers on session restore credential miss (#74)#75Kantosaurus wants to merge 1 commit into1jehuang:masterfrom
Kantosaurus wants to merge 1 commit into1jehuang:masterfrom
Conversation
When restoring a session, `Agent::new_with_session` and
`Agent::restore_session` (in turn_execution) call
`provider.set_model(&model)` to re-apply the persisted model selection.
If the running `MultiProvider` was constructed before an Anthropic
sub-provider could be hot-initialised (e.g. across a `restart-snapshot`
auto-restore where credentials were written between the original probe
and reload), `set_model` returns:
Claude credentials not available. Run `jcode login --provider claude` first.
The error was logged and silently swallowed. The user-visible symptom
was: jcode prompts for an OAuth login on every reload even though
`~/.jcode/auth.json` already holds valid, unexpired credentials. The
log evidence (1jehuang#74) shows the same on-disk credentials succeed
immediately after a manual `Hot-initialized Anthropic provider after
auth change` event.
Fix: introduce `set_session_model_with_lazy_auth_init` which retries
`set_model` exactly once after invoking `provider.on_auth_changed()`,
the same hook the login flow uses. `on_auth_changed` is idempotent and
cheap (a few file reads), and is a no-op for providers that are already
configured. If the second attempt still fails, the original error is
preserved verbatim so the existing failure mode (and message) is
unchanged for genuine credential-missing cases.
Both session-restore call sites (`Agent::new_with_session` in
`src/agent.rs` and `Agent::restore_session` in
`src/agent/turn_execution.rs`) now go through this helper.
Tests:
- recovers_after_on_auth_changed: a stub provider whose first
`set_model` fails with the canonical error, and whose
`on_auth_changed` flips an internal credentials-loaded flag, succeeds
on the second attempt with exactly one `set_model` retry and one
`on_auth_changed` call.
- returns_original_error_on_persistent_failure: when the underlying
provider keeps failing, the original error message ("credentials not
available") is preserved.
- does_not_retry_when_first_call_succeeds: `on_auth_changed` is never
called on the success path.
Fixes 1jehuang#74
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #74.
When restoring a session, jcode prompts for an OAuth login on every reload even though
~/.jcode/auth.jsonalready holds valid, unexpired Claude credentials. The user-visible message is:The log evidence (see #74) shows the on-disk credentials work fine before and after the prompt — they only stop "being seen" during the session-restore window.
Root cause
Agent::new_with_session(src/agent.rs) andAgent::restore_session(src/agent/turn_execution.rs) callprovider.set_model(&model)to re-apply the persisted model selection. When the runningMultiProviderwas constructed before an Anthropic sub-provider could be hot-initialised — e.g. across arestart-snapshotauto-restore where credentials were written between the original auth probe and the reload —set_model_on_providerreturns:The error was logged via
logging::error(...)and silently swallowed; the provider stayed in its credential-less state and any subsequent attempt to use it tripped the OAuth flow. Meanwhile,MultiProvider::on_auth_changed()is exactly the entry point that re-scans~/.jcode/auth.jsonand hot-initialises the Anthropic provider — but it was only invoked from the login flow.Fix
Introduce
set_session_model_with_lazy_auth_initinsrc/agent.rs. It callsprovider.set_model(model)and, on failure, invokesprovider.on_auth_changed()once and retries.on_auth_changedis idempotent and cheap (a few file reads), and a no-op for already-configured providers. If the second attempt still fails, the original error is returned verbatim so genuinely missing credentials produce the same message as before.Both session-restore call sites now go through this helper.
Tests
Added three unit tests in
src/agent_tests.rs:set_session_model_with_lazy_auth_init_recovers_after_on_auth_changed— stub provider whose firstset_modelreturns the canonical "credentials not available" error, and whoseon_auth_changedflips an internal flag. Verifies the helper returnsOk,set_modelis called exactly twice, andon_auth_changedis called exactly once.set_session_model_with_lazy_auth_init_returns_original_error_on_persistent_failure— when the provider never has credentials, the original error message is preserved.set_session_model_with_lazy_auth_init_does_not_retry_when_first_call_succeeds—on_auth_changedis never invoked on the success path (no perf regression for the common case).Note:
cargo checkpasses locally; fullcargo test --no-runwas still compiling at submission time and will be validated on the reviewer's hardware / CI.Diff
Risk
Low. The behaviour change is strictly additive on the failure path:
set_modelcall, noon_auth_changed).on_auth_changedcall (already used safely from the login flow) plus oneset_modelretry. Worst-case latency on a true credential miss is a few file-system reads.No public APIs change. The new free function is
pub(crate).Need help on this PR? Tag
@codesmithwith what you need.