Token-Based Pricing


W&B hosted models

Prices shown are per 1 million tokens.

Model                           Input Tokens   Output Tokens   Cache Hit
Z.AI GLM 5.2                    $1.39          $4.40           $0.26
Moonshot AI Kimi K2.7 Code      $0.94          $4.00           $0.19
NVIDIA Nemotron 3 Ultra         $0.75          $2.75           $0.15
JetBrains Mellum2 12B A2.5B     $0.05          $0.10           -
IBM Granite 4.1 8B              $0.05          $0.10           -
DeepSeek V4-Pro                 $1.74          $3.46           $0.14
DeepSeek V4-Flash               $0.14          $0.28           $0.07
Qwen3.6 27B                     $0.60          $3.60           $0.12
Moonshot AI Kimi K2.6           $0.95          $4.00           $0.16
Z.AI GLM 5.1                    $1.40          $4.40           $0.26
Google Gemma 4 31B              $0.12          $0.35           $0.09
Qwen3.6 35B A3B                 $0.25          $1.25           -
NVIDIA Nemotron 3 Super 120B    $0.20          $0.80           -
Qwen3.5 27B                     $0.39          $3.12           $0.08
Qwen3.5 35B A3B                 $0.25          $1.25           -
MiniMax M2.5                    $0.30          $1.20           -
Z.AI GLM 5                      $1.00          $3.20           -
Moonshot AI Kimi K2.5           $0.60          $3.00           $0.10
DeepSeek V3.1                   $0.55          $1.65           -
OpenAI GPT OSS 120B             $0.04          $0.14           -
OpenAI GPT OSS 20B              $0.03          $0.13           -
Qwen3 30B A3B                   $0.10          $0.30           -
Qwen3 235B A22B Thinking-2507   $0.10          $0.10           -
Qwen3 Coder 480B A35B           $1.00          $1.50           -
Qwen3 235B A22B-2507            $0.10          $0.10           -
OpenPipe Qwen3 14B Instruct     $0.05          $0.22           -
Meta Llama 3.1 8B               $0.22          $0.22           -
Meta Llama 3.3 70B              $0.71          $0.71           -
Meta Llama 3.1 70B              $0.80          $0.80           -
Microsoft Phi 4 Mini 3.8B       $0.08          $0.35           -
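
Because the prices above are quoted per 1 million tokens, the cost of a single request is each token count multiplied by its price and divided by 1,000,000. The following is a minimal sketch of that arithmetic, not an official calculator. The function name `estimate_cost` and the `PRICES` dictionary are hypothetical, only two rows of the table are copied in, and the sketch assumes that cached input tokens are billed at the Cache Hit price instead of the Input price. For models where the table shows "-", it assumes cached tokens are billed at the regular Input price; the page does not state this.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)

# (input, output, cache hit) in USD per 1M tokens, copied from the table above.
# None stands for the "-" entries in the Cache Hit column.
PRICES = {
    "DeepSeek V4-Flash": (Decimal("0.14"), Decimal("0.28"), Decimal("0.07")),
    "OpenAI GPT OSS 120B": (Decimal("0.04"), Decimal("0.14"), None),
}


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens that was served from cache.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    input_price, output_price, cache_price = PRICES[model]
    if cache_price is None:
        # Assumption: with no listed Cache Hit price, cached tokens
        # are billed like ordinary input tokens.
        cache_price = input_price

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * input_price
        + cached_tokens * cache_price
        + output_tokens * output_price
    )
    return total / TOKENS_PER_MILLION
```

Under these assumptions, a DeepSeek V4-Flash request with 2,000,000 input tokens (half of them cache hits) and 500,000 output tokens comes to $0.14 + $0.07 + $0.14 = $0.35. `Decimal` is used rather than `float` so that small per-token amounts add up without binary rounding error.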

ARIA Pricing

AI Research & Iteration Agent

ARIA runs on OpenAI models. A single run may use multiple models (e.g., gpt-5.4-mini and gpt-5.5).

All plans include an Agent Token Rate of $0.50 per million tokens. This rate is added to the model API price for every token type: input, cached input, and output. The table below shows the combined cost per million tokens:

Model           Input / 1M    Cached Input / 1M    Output / 1M
gpt-5.4         $3.00         $0.75                $15.50
gpt-5.4-mini    $1.25         $0.575               $5.00
gpt-5.5         $5.50         $1.00                $30.50
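
Since a single ARIA run may use more than one model, its cost is the sum over models and token types of the token count multiplied by the combined rate. The sketch below illustrates that sum and separately shows how much of the total is the $0.50 Agent Token Rate. The names `COMBINED_RATES`, `run_cost`, and `agent_fee` are hypothetical, the token counts are made up for illustration, and the sketch assumes the three token types are counted separately (cached input is not also counted as input).

```python
from decimal import Decimal

AGENT_TOKEN_RATE = Decimal("0.50")  # USD per 1M tokens, from the text above

# Combined rates (model API price + Agent Token Rate), USD per 1M tokens,
# copied from the ARIA table above.
COMBINED_RATES = {
    "gpt-5.4": {
        "input": Decimal("3.00"),
        "cached_input": Decimal("0.75"),
        "output": Decimal("15.50"),
    },
    "gpt-5.4-mini": {
        "input": Decimal("1.25"),
        "cached_input": Decimal("0.575"),
        "output": Decimal("5.00"),
    },
    "gpt-5.5": {
        "input": Decimal("5.50"),
        "cached_input": Decimal("1.00"),
        "output": Decimal("30.50"),
    },
}


def run_cost(usage):
    """Total USD cost of a run; usage maps model -> {token_type: count}."""
    total = Decimal(0)
    for model, counts in usage.items():
        for token_type, count in counts.items():
            total += COMBINED_RATES[model][token_type] * count
    return total / 1_000_000


def agent_fee(usage):
    """Portion of the run cost that comes from the Agent Token Rate."""
    tokens = sum(sum(counts.values()) for counts in usage.values())
    return AGENT_TOKEN_RATE * tokens / 1_000_000


# Hypothetical run that used two models.
example_usage = {
    "gpt-5.4-mini": {"input": 2_000_000, "cached_input": 1_000_000, "output": 200_000},
    "gpt-5.5": {"input": 400_000, "output": 100_000},
}
```

For `example_usage`, the gpt-5.4-mini portion is $2.50 + $0.575 + $1.00 and the gpt-5.5 portion is $2.20 + $3.05, for a total of $9.325, of which $1.85 (3.7 million tokens at $0.50 per million) is the Agent Token Rate.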

Frequently Asked Questions

Will I be charged for API usage in the Playground?

Yes. Playground usage is treated the same as regular API usage, and you are billed at the per-token input and output prices listed above.

A token is the unit of text that a model reads and writes, typically a word or a fragment of a word, and it is the unit in which usage is billed. Log in to your account to view your billing dashboard. The dashboard shows how many tokens you have used in the current month and in past months.

Usage of the Weights & Biases Inference APIs and Playground is billed in addition to your Enterprise or Pro license. Weights & Biases subscription pricing can be found at https://wandb.ai/site/pricing/.

You can set a monthly W&B Inference budget in your billing settings. Once the budget is reached, we stop serving your requests. Enforcement of the limit may be delayed, and you are responsible for any overage incurred. You can also configure an email notification threshold to receive an alert when your spend crosses that threshold each month. We recommend checking your usage tracking dashboard regularly to monitor your spend.
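
Because enforcement of the budget may lag, some teams may also want a local check against their own running cost estimate. The sketch below shows one way to classify spend against a budget and an alert threshold. It is purely illustrative: `budget_status` is a hypothetical helper, it does not call any W&B API, and the spend figure would have to come from your own tracking or from the billing dashboard.

```python
from decimal import Decimal


def budget_status(spend, monthly_budget, alert_threshold):
    """Classify month-to-date spend (all amounts in USD).

    Returns "over_budget" once spend reaches the budget, "alert" once it
    reaches the notification threshold, and "ok" otherwise.
    """
    if alert_threshold > monthly_budget:
        raise ValueError("alert_threshold should not exceed monthly_budget")
    if spend >= monthly_budget:
        return "over_budget"
    if spend >= alert_threshold:
        return "alert"
    return "ok"
```

For example, with a $50 budget and a $40 threshold, a month-to-date spend of $45 would be classified as "alert".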