Skip to content

refactor(session): improve testability and simplify the session package#2550

Merged
dgageot merged 9 commits intodocker:mainfrom
dgageot:board/improving-testability-of-session-package-9919443e
Apr 28, 2026
Merged

refactor(session): improve testability and simplify the session package#2550
dgageot merged 9 commits intodocker:mainfrom
dgageot:board/improving-testability-of-session-package-9919443e

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented Apr 27, 2026

Summary

Improves the testability and readability of pkg/session/ without changing any behaviour observable by callers. The branch is structured as 9 small, focused commits — each with its own validation — so it can be reviewed (and reverted) one piece at a time.

Why

The session package was hard to test in isolation:

  • session.New(), UserMessage(), SystemMessage(), ImplicitUserMessage() all called time.Now() and uuid.New() directly, so tests had to overwrite CreatedAt/ID after construction or rebuild chat.Message literals by hand.
  • The token-budget heuristic in truncateOldToolContent was an inline len(s)/4, so the test had to encode the divisor in comments.
  • NewSQLiteSessionStore only accepted a path, forcing every store test to allocate a temp directory and write WAL files.
  • MigrationManager baked in the production migration list, so any test of ordering, idempotency, version-mismatch or per-step failure had to drag in all 21 migrations.
  • AddSession, UpdateSession and addSessionTx each had their own copy of the JSON marshalling for Permissions / AgentModelOverrides / CustomModelsUsed plus the parent_id NULL dance — adding a column meant editing 3+ places.
  • Two near-identical message deep-copy implementations existed, and one of them (in branch.go) silently forgot to duplicate MessagePart.File, so branched sessions shared that pointer with the parent.
  • scanSession read 7 columns as strings and then strconv.Parse{Bool,Int,Float}'d them.
  • GetSession/loadSessionWith and loadSessionItems/loadSessionItemsWith were thin wrapper pairs, and GetMessages/GetAllMessages duplicated the same lock-snapshot-clone pattern (and disagreed on which Item fields to preserve).

What

Testability (commits 1–5)

# Commit Effect
1 test(session): inject clock and ID generator for deterministic tests Package-level seams nowFn/newIDFn plus setNowForTest/setIDForTest test helpers. Constructors now produce deterministic CreatedAt/ID. Persisted migrations.applied_at keeps using real time.Now().
2 refactor(session): extract approximateTokens helper The len(content)/4 heuristic becomes a named function with its own table-driven test; doc comments updated.
3 feat(session): add NewSQLiteSessionStoreFromDB for in-memory tests New constructor accepts a caller-supplied *sql.DB. Tests can use sql.Open("sqlite", ":memory:") instead of needing a temp dir & WAL files. Shared setupAndMigrate helper is the new bootstrap path.
4 refactor(session): make MigrationManager migration list injectable NewMigrationManager(db) keeps the production list (callers unchanged); NewMigrationManagerWithMigrations(db, migrations) lets tests drive synthetic migrations. Adds 7 unit tests covering rollback, retry, UpFunc failure, and version mismatch — paths that were previously unreachable.
5 refactor(session): collapse duplicated session marshalling/SQL sessionPersistedFieldsOf(*Session) is now the single source of truth for JSON-bearing columns + parent_id NULL handling. sessionSelectColumns is the canonical SELECT list. ~30 fewer lines in store.go; future column additions become one-place changes.

Simplification (commits 6–8)

# Commit Effect
6 refactor(session): consolidate the two deep-copy implementations One cloneMessage/cloneChatMessage pair instead of two. Bonus: fixes a latent bug — branch.go's version forgot to deep-copy MessagePart.File, leaving branched sessions sharing that pointer with the parent. Added a regression test that mutates the branched copy and asserts the parent is untouched.
7 refactor(session): scan SQL columns into native Go types Modernc/sqlite returns INTEGER/BOOLEAN as their native Go types; the strconv.Parse* layer (and 7 untestable error branches) goes away. scanSession drops from ~95 to ~55 lines.
8 refactor(session): extract snapshot helper and collapse store loaders One Session.snapshotItems() for GetMessages/GetAllMessages (which now agree on which Item fields to preserve). One loadSession(ctx, q, id) and one loadSessionItems(ctx, q, sessionID) instead of two thin-wrapper pairs.

Review polish (commit 9)

# Commit Effect
9 fix(session): plug review nits in test helpers Defensive t.Cleanup(db.Close) in openMemoryStore so a migration-time failure can't leak the connection. Doc comments on setNowForTest/setIDForTest calling out that they mutate package-level state and must not be combined with t.Parallel().

Diffstat

 pkg/session/branch.go                      |  56 +---
 pkg/session/branch_test.go                 |  41 +++
 pkg/session/clock_test.go                  | 135 +++++++++
 pkg/session/migrations.go                  |  29 ++-
 pkg/session/migrations_unit_test.go        | 170 ++++++++++++
 pkg/session/session.go                     |  85 +++---
 pkg/session/store.go                       | 404 ++++++++++++-----------------
 pkg/session/store_memory_test.go           |  74 ++++++
 pkg/session/store_persisted_fields_test.go |  50 ++++
 pkg/session/truncate_test.go               |  22 ++
 10 files changed, 732 insertions(+), 334 deletions(-)

No behaviour change

  • No public API removed. New constructors / helpers are additive: NewSQLiteSessionStoreFromDB, NewMigrationManagerWithMigrations, approximateTokens, sessionPersistedFieldsOf, Session.snapshotItems.
  • All previous tests pass unchanged.
  • One pre-existing bug fixed (the missing MessagePart.File deep-copy on branched sessions), with a regression test.

Validation

  • mise lint (golangci-lint + custom lint + go mod tidy --diff): 0 issues
  • go test ./... across the whole repo (~125 packages, including all consumers of pkg/session): all green
  • go test -race ./pkg/session/...: green (run multiple times)
  • go vet ./...: clean

How to review

The commits are deliberately self-contained and stand on their own — reviewing them in order tells the story:

  1. Start with commits 1–2 (clock seam + token helper) — small, low-risk seams.
  2. Commit 3 introduces the in-memory test path; commits 4–5 broaden it to migrations and SQL marshalling.
  3. Commit 6 is the only one that fixes a behaviour bug (with the regression test that locks it in).
  4. Commits 7–8 are pure code-shape simplification.
  5. Commit 9 is review polish.
dgageot added 9 commits April 27, 2026 17:46
Replace direct time.Now() and uuid.New() calls inside session.New(),
UserMessage(), ImplicitUserMessage(), SystemMessage() and the summary
message builder with package-level seams (nowFn, newIDFn) that tests can
override.

Persisted timestamps in migrations.applied_at intentionally keep using
time.Now() since they record real wall-clock events.

Adds setNowForTest / setIDForTest helpers and a small test suite that
exercises the seams (TestNew_UsesInjectedClockAndID,
TestUserMessage_UsesInjectedClock, TestSystemMessage_UsesInjectedClock,
TestImplicitUserMessage_IsImplicitAndUsesClock,
TestDuration_DeterministicWithInjectedClock,
TestNew_WithIDOverridesGenerator).

Assisted-By: docker-agent
The len(content)/4 token-budget heuristic in truncateOldToolContent
is now a named function (approximateTokens). Updates the docstrings on
DefaultMaxOldToolCallTokens and Session.MaxOldToolCallTokens to point
at the helper, and adds a small TestApproximateTokens table covering
boundary cases.

Centralising the heuristic clarifies intent in the truncation loop and
gives a single place to swap in a real tokenizer or per-session
overrides later.

Assisted-By: docker-agent
The store can now be wrapped around a caller-supplied *sql.DB so that
tests don't need a temp directory or WAL files on disk. The bootstrap
schema and migrations are run in a small shared setupAndMigrate helper
used by both the file-backed (NewSQLiteSessionStore) and DB-backed
(NewSQLiteSessionStoreFromDB) constructors.

Adds an openMemoryStore(t) test helper that opens sql.Open("sqlite",
":memory:") and wires it through the new constructor, plus three small
tests covering the nil-db error, the migration round-trip, and
end-to-end message persistence.

Assisted-By: docker-agent
MigrationManager used to bake getAllMigrations() in, which forced any
test of ordering, idempotency, version-mismatch detection or per-step
failure handling to drag in the full 21-migration production schema.

Add an injectable migration list:
* NewMigrationManager(db) — convenience constructor that uses the
  production list (production callers unchanged)
* NewMigrationManagerWithMigrations(db, migrations) — for tests that
  want a synthetic list

RunPendingMigrations and checkForUnknownMigrations now read from the
field instead of calling getAllMigrations() directly.

Adds migrations_unit_test.go covering paths that were previously
untestable in isolation:
* applies all migrations + idempotent re-run
* only applies new migrations on subsequent runs
* invalid UpSQL rolls back transaction and stays retryable
* UpFunc failure surfaces error (and pins current SQL-commit-first
  behaviour)
* checkForUnknownMigrations rejects newer DBs and accepts equal/older
* empty migration list is a no-op

Assisted-By: docker-agent
AddSession, UpdateSession and addSessionTx each had their own copy of
the JSON marshalling logic for Permissions / AgentModelOverrides /
CustomModelsUsed and the parent_id NULL-coalescing dance. GetSession,
GetSessions and loadSessionWith each repeated the full SELECT column
list. Adding a new field meant editing 3+ places.

Extract two helpers:

* sessionPersistedFieldsOf(*Session) -> sessionPersistedFields, error
  is the single source of truth for the JSON-bearing fields and the
  parent_id NULL handling. The three INSERT call sites now use it.

* sessionSelectColumns is a constant holding the canonical SELECT list
  used by all three read paths.

Net effect: ~30 fewer lines in store.go and any future column addition
is a one-place change. Behaviour is unchanged (verified by the existing
suite plus a new TestSessionPersistedFieldsOf_* unit covering the
default, populated and empty-collection cases of the helper).

Assisted-By: docker-agent
branch.go and session.go each carried a near-identical message
deep-copy: deepCopyMessage/deepCopyChatMessage in session.go and
cloneMessage/cloneChatMessage in branch.go. The two differed in:

  * branch.go's cloneMessage reset Message.ID to 0 (because branched
    messages get fresh database IDs when persisted)
  * branch.go's cloneChatMessage forgot to deep-copy MessagePart.File,
    leaving branched sessions sharing that pointer with the parent
    (latent bug)

Keep the complete session.go implementation, rename it to clone* for
stylistic consistency with the surrounding clone* helpers in branch.go,
and let cloneSessionItem reset Message.ID at the call site instead of
inside the cloner. This:

  * removes ~50 lines of duplicated traversal code
  * fixes the missing File-pointer copy
  * keeps the ID-reset behaviour explicit at the only place it matters

Adds a regression test (TestBranchSession/deep-copies_pointer_fields_in_MultiContent)
that mutates branched ImageURL/File pointers and asserts the parent is
unaffected.

Assisted-By: docker-agent
scanSession was reading 7 columns (tools_approved, input_tokens,
output_tokens, cost, send_user_message, max_iterations, starred) as
strings and then running strconv.ParseBool / ParseInt / ParseFloat on
each, with seven separate error branches. The same string-typed dance
existed for starred in GetSessionSummaries.

Modernc.org/sqlite returns INTEGER and BOOLEAN columns as their native
Go types when scanned into bool/int64/float64, which eliminates the
parsing layer entirely. Scan straight into struct fields where possible
and skip the intermediate variables.

The thinking column is still selected (so the SELECT list and column
order stay stable) but is now scanned into a discarded local rather
than a string + comment.

Net effect: scanSession drops from ~95 to ~55 lines, with seven fewer
error paths; GetSessionSummaries' inner loop drops by ~10 lines.

No behaviour change. Verified with the full session test suite under
-race.

Assisted-By: docker-agent
Two pieces of duplicated structure that obscured intent:

* Session.GetMessages and Session.GetAllMessages each opened the read
  lock, allocated a snapshot of s.Messages, deep-copied each Message,
  released the lock. The two copies even disagreed on whether non-Message
  fields (Summary, SubSession, Cost) should be preserved on Message
  items. Extract a single Session.snapshotItems() helper that does the
  conservative thing (preserve all Item fields, deep-copy Message) and
  use it from both call sites.

* SQLiteSessionStore had GetSession + loadSessionWith doing the same
  SELECT + scanSession + load-items dance, differing only in the
  querier they use; and loadSessionItems was a one-line wrapper around
  loadSessionItemsWith. Collapse to GetSession (validates id, delegates)
  + loadSession(ctx, q, id) + loadSessionItems(ctx, q, sessionID).
  Recursive sub-session loads inside loadSessionItems now call
  loadSession directly.

Behaviour unchanged. Verified with the full session test suite under
-race and golangci-lint.

Assisted-By: docker-agent
Two small defensive fixes flagged during a review of the recent
testability work:

* openMemoryStore now registers t.Cleanup(db.Close) immediately after
  sql.Open, before the call to NewSQLiteSessionStoreFromDB that may
  fail. Previously a migration error would have leaked the *sql.DB for
  the duration of the test process. Calling Close twice on a *sql.DB is
  a no-op, so the existing store.Close() cleanup remains harmless.

* setNowForTest and setIDForTest now document that they mutate
  package-level variables and therefore must NOT be combined with
  t.Parallel(). setIDForTest's docstring also notes that the installed
  generator calls t.Fatalf, which is only valid on the test goroutine
  \u2014 a constraint to remember if production code under test ever calls
  New() from a background goroutine.

No behaviour change. Verified with go test -race and mise lint.

Assisted-By: docker-agent
@dgageot dgageot requested a review from a team as a code owner April 27, 2026 16:28
@dgageot dgageot merged commit 4068813 into docker:main Apr 28, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants