Skip to content

Add W&B Inference prefix caching docs#2372

Open
corbt wants to merge 3 commits intomainfrom
codex/prefix-caching-docs
Open

Add W&B Inference prefix caching docs#2372
corbt wants to merge 3 commits intomainfrom
codex/prefix-caching-docs

Conversation

@corbt
Copy link
Copy Markdown

@corbt corbt commented Mar 26, 2026

Summary

Add a new W&B Inference docs page for prefix caching and cache isolation, and add it to the English Inference "Response Settings" nav.

Motivation

Motivating Slack thread:
https://weightsandbiases.slack.com/archives/C08RU04P36G/p1774553788470369

A customer is worried about security on a multi-tenant installation, and we want to show that we take that concern seriously. This page explains how prefix caching works at a high level and how cache_salt can be used to isolate cache reuse across trust boundaries.

Notes

  • This keeps the existing chat-completions page simple instead of documenting one advanced parameter there while more basic request parameters are still undocumented.
  • I limited this change to English content plus the English nav entry. I did not edit JA or KO content.
  • I verified the behavior live against https://api.inference.wandb.ai/v1/chat/completions on moonshotai/Kimi-K2.5:
    • same prompt + same cache_salt reused almost the entire prompt cache
    • same prompt + different cache_salt forced a cold miss
    • empty cache_salt returned a 400
@corbt corbt requested a review from a team as a code owner March 26, 2026 21:16
@mintlify
Copy link
Copy Markdown
Contributor

mintlify Bot commented Mar 26, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
wandb 🟢 Ready View Preview Mar 26, 2026, 9:20 PM
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 26, 2026

📚 Mintlify Preview Links

🔗 View Full Preview

✨ Added (1 total)

📄 Pages (1)

File Preview
inference/response-settings/prefix-caching.mdx Prefix Caching

📝 Changed (1 total)

⚙️ Other (1)
File
docs.json

🤖 Generated automatically when Mintlify deployment succeeds
📍 Deployment: 0b22697 at 2026-04-29 15:58:27 UTC

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 26, 2026

🔗 Link Checker Results

All links are valid!

No broken links were detected.

Checked against: https://wb-21fd5541-codex-prefix-caching-docs.mintlify.app

@mdlinville
Copy link
Copy Markdown
Contributor

Tagging @jamie-rasmussen for technical review to start with, since he was involved in the Slack thread. 🙏

"content": "Summarize this document in one sentence: <long shared prefix here>"
},
],
cache_salt="tenant-a-user-123-secret",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corbt I don't think this works - gives you TypeError: Completions.create() got an unexpected keyword argument 'cache_salt'

I think you can replace with something like:

    extra_body={
        "cache_salt": "tenant-a-user-123-secret",
    },
Style-only changes, no technical content added or removed:
- Fix title and heading to sentence case
- Add document purpose statement, section intros, and list lead-ins
- Expand KV abbreviation on first use
- Use present tense, contractions, and active voice
- Replace idiom with plain language for global audience
- Tighten parallel structure in bullet lists
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants