Reasoning

Extended thinking tokens for complex tasks

Overview

Reasoning gives the model extra "thinking" time before answering. Useful for math, logic, multi-step analysis, and code review. Two ways to enable it: the reasoning object or the top-level reasoning_effort parameter. Reasoning tokens count as output tokens.

Two Configuration Formats

Format A: reasoning object (OpenRouter-style)

The reasoning parameter is not part of the OpenAI SDK's native type definitions, so each SDK needs a small workaround. In TypeScript, include it directly in the request params — the SDK serializes unknown keys into the request body — and suppress the type error:

const response = await client.chat.completions.create({
  model: 'sansa-auto',
  messages: [{ role: 'user', content: 'Solve this math problem...' }],
  // @ts-expect-error — reasoning is not in OpenAI's type definitions
  reasoning: { effort: 'high' },
});

Or, equivalently, cast the params object so no suppression comment is needed:

const response = await client.chat.completions.create({
  model: 'sansa-auto',
  messages: [{ role: 'user', content: 'Solve this math problem...' }],
  reasoning: { effort: 'high' },
} as Parameters<typeof client.chat.completions.create>[0]);

In Python, use the SDK's extra_body keyword argument — its contents are merged into the request body:

response = client.chat.completions.create(
    model="sansa-auto",
    messages=[{"role": "user", "content": "Solve this math problem..."}],
    extra_body={"reasoning": {"effort": "high"}},
)

Raw JSON (curl / any HTTP client):

{
  "messages": [{"role": "user", "content": "Solve this math problem..."}],
  "reasoning": {
    "effort": "high"
  }
}
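If you are not using an SDK, the same body can be built and serialized with any HTTP client. A minimal sketch of constructing the payload in Python (endpoint and auth handling omitted; substitute your actual Sansa URL and key when POSTing):

```python
import json

# Request body for a reasoning-enabled chat completion.
payload = {
    "model": "sansa-auto",
    "messages": [{"role": "user", "content": "Solve this math problem..."}],
    "reasoning": {"effort": "high"},
}

# Serialize for the POST body of any HTTP client.
body = json.dumps(payload)
```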

Full reasoning object:

interface ReasoningConfig {
  // Reasoning depth; see "Effort Levels" for when to use each.
  effort?: "xhigh" | "high" | "medium" | "low" | "minimal" | "none";

  // Direct token budget (alternative to effort).
  max_tokens?: number;

  // If true, reasoning is used internally but not returned in the response.
  // Tokens are still generated and billed.
  exclude?: boolean;
}

Rule: set effort or max_tokens, not both.
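This rule can be enforced client-side before a request ever leaves your code. A sketch of a validating builder (the helper name is illustrative, not part of any SDK):

```python
def build_reasoning_config(effort=None, max_tokens=None, exclude=False):
    """Build a `reasoning` object, rejecting effort + max_tokens together."""
    if effort is not None and max_tokens is not None:
        raise ValueError("Use effort OR max_tokens, not both")
    config = {}
    if effort is not None:
        config["effort"] = effort
    if max_tokens is not None:
        config["max_tokens"] = max_tokens
    if exclude:
        config["exclude"] = True
    return config
```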

Format B: reasoning_effort (OpenAI-style)

{
  "messages": [{"role": "user", "content": "Solve this math problem..."}],
  "reasoning_effort": "high"
}

Accepted values: "xhigh", "high", "medium", "low", "minimal", "none".

Equivalent to reasoning: { effort: "high" }.

Precedence: If both reasoning and reasoning_effort are provided, the reasoning object takes precedence.
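The precedence rule can be mirrored client-side when requests are assembled from defaults that might set either style. A sketch (the function name is mine, not an SDK API):

```python
def resolve_reasoning(request: dict):
    """Return the effective reasoning config for a request dict.

    The `reasoning` object takes precedence over the shorthand
    `reasoning_effort` when both are present.
    """
    if "reasoning" in request:
        return request["reasoning"]
    if "reasoning_effort" in request:
        return {"effort": request["reasoning_effort"]}
    return None
```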

Effort Levels

| Level   | Description             | Use when                          |
|---------|-------------------------|-----------------------------------|
| xhigh   | Maximum reasoning depth | Complex math, logic puzzles       |
| high    | Deep reasoning          | Multi-step analysis, code review  |
| medium  | Balanced (default)      | General-purpose reasoning         |
| low     | Light reasoning         | Simple analysis, quick decisions  |
| minimal | Very light              | Basic tasks that benefit slightly |
| none    | No reasoning            | Disable reasoning entirely        |

Reasoning in Responses

Non-Streaming

Reasoning appears in choices[0].message.reasoning (string) and/or choices[0].message.reasoning_details (structured array).

TypeScript note: reasoning and reasoning_details are not part of ChatCompletionMessage's type definition in the OpenAI SDK. They are present in the response JSON and the SDK passes unrecognized fields through untouched, but TypeScript will not know about them. Access them via a type assertion:

const msg = response.choices[0].message as any;
console.log('Reasoning:', msg.reasoning);

Example response body:

{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The answer is 42.",
      "reasoning": "Let me think step by step...",
      "reasoning_details": [{
        "type": "reasoning.text",
        "text": "Let me think step by step...",
        "id": "reasoning-1",
        "format": "rf-c",
        "index": 0
      }]
    }
  }]
}
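Working from the raw response dict sidesteps the typing issue entirely. A sketch that pulls both fields, tolerating their absence (the helper name is hypothetical):

```python
def extract_reasoning(response: dict):
    """Return (reasoning_text, reasoning_details) from a chat completion
    response dict. Either may be None/empty when no reasoning was emitted."""
    message = response["choices"][0]["message"]
    return message.get("reasoning"), message.get("reasoning_details", [])
```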

Streaming

Reasoning details appear in delta.reasoning_details before content starts:

Note: Line breaks added for readability.

data: {
  "choices": [{
    "delta": {
      "reasoning_details": [{
        "type": "reasoning.text",
        "text": "Step 1..."
      }]
    }
  }]
}

data: {
  "choices": [{
    "delta": { "content": "The answer is " }
  }]
}

data: {
  "choices": [{
    "delta": { "content": "42." }
  }]
}
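When consuming the stream manually, reasoning deltas can be accumulated alongside content. A sketch over already-parsed `data:` payloads (SSE parsing and network handling omitted; the function name is mine):

```python
def accumulate_stream(chunks):
    """Fold streaming deltas into (reasoning_text, content).

    Each chunk is a parsed `data:` payload; a delta may carry
    `reasoning_details` entries, `content`, or neither.
    """
    reasoning_parts, content_parts = [], []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        for detail in delta.get("reasoning_details", []):
            if detail.get("type") == "reasoning.text":
                reasoning_parts.append(detail.get("text", ""))
        if delta.get("content"):
            content_parts.append(delta["content"])
    return "".join(reasoning_parts), "".join(content_parts)
```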

reasoning_details Types

interface ReasoningDetail {
  // Common fields
  id: string;
  index: number;

  // Opaque format token assigned by Sansa (e.g., "rf-a", "rf-b", "rf-c", "rf-d").
  // Pass it back unchanged — do not parse or interpret these values.
  format: string;

  // Type-specific fields
  type: "reasoning.text" | "reasoning.summary" | "reasoning.encrypted";
  
  // For "reasoning.text"
  text?: string;
  signature?: string;
  
  // For "reasoning.summary"
  summary?: string;
  
  // For "reasoning.encrypted"
  data?: string;
}
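Consumers should branch on `type` and ignore unknown variants so that new detail types don't break parsing. A display-rendering sketch (the function is illustrative; encrypted blocks stay opaque and are only ever echoed back):

```python
def render_detail(detail: dict) -> str:
    """Render one reasoning_details entry for display."""
    kind = detail.get("type")
    if kind == "reasoning.text":
        return detail.get("text", "")
    if kind == "reasoning.summary":
        return detail.get("summary", "")
    if kind == "reasoning.encrypted":
        # Opaque payload: pass it back to the API, don't show it.
        return "[encrypted reasoning]"
    return ""  # Unknown type: skip for display, keep when echoing back.
```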

Preserving Reasoning Across Turns

When using the OpenAI SDK, reasoning data is preserved automatically in non-streaming responses. Just append the full message object to your messages array — the SDK keeps reasoning_details intact:

const response = await client.chat.completions.create({
  model: "sansa-auto",
  messages: [...],
  extra_body: { reasoning: { effort: "high" } },
});

// Append the full message — reasoning_details is preserved automatically
messages.push(response.choices[0].message);
messages.push({ role: "user", content: "Follow up question..." });

This maintains reasoning continuity across turns and tool calls.

If reasoning is missing, requests still work. If you manually construct messages or strip reasoning fields, Sansa handles it gracefully. Reasoning continuity improves quality but is never required.
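With raw dicts the same pattern applies: copy reasoning_details (and tool_calls) unmodified onto the assistant message you append. A sketch (the helper is illustrative, not an SDK API):

```python
def append_assistant_turn(messages: list, response: dict) -> None:
    """Append the assistant message from a raw response dict,
    carrying reasoning_details along unmodified when present."""
    msg = response["choices"][0]["message"]
    turn = {"role": "assistant", "content": msg.get("content")}
    if msg.get("tool_calls"):
        turn["tool_calls"] = msg["tool_calls"]
    if msg.get("reasoning_details"):
        turn["reasoning_details"] = msg["reasoning_details"]
    messages.append(turn)
```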

Streaming limitation

The OpenAI SDK does not currently auto-accumulate reasoning_details from streaming deltas. As a result, reasoning continuity is not available for streaming responses. This is a current limitation of SDK compatibility — non-streaming requests are recommended when reasoning continuity matters (e.g., multi-turn tool calling flows).

Example: tool calling with reasoning

{
  "messages": [
    {"role": "user", "content": "What's the weather?"},
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [{"id": "callwx001", "function": {"name": "get_weather", "arguments": "{}"}}],
      "reasoning_details": [/* pass back unmodified from previous response */]
    },
    {
      "role": "tool",
      "tool_call_id": "callwx001",
      "content": "{\"temp\": 72}"
    }
  ]
}

Billing

  • Reasoning tokens are counted as output tokens.
  • Included in usage.completion_tokens.
  • Higher reasoning effort = more output tokens = higher cost.
  • Use exclude: true if you want reasoning benefits without tokens in the response (tokens are still generated and billed, just not returned).
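Because reasoning tokens are already folded into usage.completion_tokens, a cost estimate only needs the standard usage fields. A sketch (the per-token prices here are placeholders, not real rates):

```python
def estimate_cost(usage: dict, prompt_price: float, completion_price: float) -> float:
    """Estimate request cost from a `usage` block, given per-token prices.
    Reasoning tokens are already included in completion_tokens."""
    return (usage["prompt_tokens"] * prompt_price
            + usage["completion_tokens"] * completion_price)
```

With exclude: true the reasoning text is absent from the response, but completion_tokens still includes it, so the estimate is unchanged.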