Grok 4 Fast Non-Reasoning

Grok 4 Fast Non-Reasoning is the speed-optimized, non-reasoning variant of xAI's Grok 4 Fast. It delivers fast inference without chain-of-thought overhead and targets high-throughput applications, with a context window of 2M tokens.

Capabilities: Tool Use, Implicit Caching, Tiered Cost, Vision (Image), File Input

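import { streamText } from 'ai'

// The "provider/model" string selects the model through AI Gateway.
const result = streamText({
  model: 'xai/grok-4-fast-non-reasoning',
  prompt: 'Why is the sky blue?'
})

// Print the response text as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}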

Playground

Try out Grok 4 Fast Non-Reasoning by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About Grok 4 Fast Non-Reasoning

Grok 4 Fast Non-Reasoning is the non-reasoning configuration of xAI's Grok 4 Fast model, released September 19, 2025. It disables the extended chain-of-thought reasoning process, producing direct answers without intermediate reasoning traces. This eliminates reasoning token overhead and reduces both latency and output cost per request.

The model builds on the Grok 4 training foundation, carrying forward its language understanding and instruction following capabilities, but operates in a direct-response mode optimized for speed. With a context window of 2M tokens, it handles general-purpose tasks including text generation, summarization, classification, and tool calling.

Grok 4 Fast Non-Reasoning is available at $0.20 per million input tokens and $0.50 per million output tokens through Vercel AI Gateway. It pairs naturally with its reasoning counterpart: use the non-reasoning variant for straightforward tasks and the reasoning variant when analytical depth is needed.
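As a rough worked example of the pricing above, the sketch below estimates the cost of one request from its token counts. The helper name `estimateCostUsd` and the constant names are illustrative. The rates are the ones quoted on this page, and the calculation ignores any tiered or cached pricing.

```typescript
// Rates quoted on this page, in USD per one million tokens (subject to change).
const INPUT_USD_PER_MILLION = 0.2
const OUTPUT_USD_PER_MILLION = 0.5

// Hypothetical helper: estimates the cost of one request from its token counts.
// Assumption: flat per-token pricing, with no tiered or cache-read discounts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  return inputCost + outputCost
}

// 10,000 input tokens and 1,000 output tokens: 0.002 + 0.0005 USD.
console.log(estimateCostUsd(10_000, 1_000).toFixed(4))
```

Because output tokens cost more than input tokens, skipping reasoning traces lowers the output side of this sum.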

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
0.7s	43tps	$0.20/M	$0.50/M	$0.05/M	—	09/19/2025

Legal: Terms • Privacy
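The latency and throughput figures in the provider row can be combined into a back-of-the-envelope response time. This is a sketch under two assumptions that the page does not state: that latency means time to first token, and that throughput stays constant for the whole response. The helper name `estimateResponseSeconds` is illustrative.

```typescript
// Figures from the provider row on this page; they are point-in-time measurements.
const LATENCY_SECONDS = 0.7
const THROUGHPUT_TOKENS_PER_SECOND = 43

// Hypothetical helper: time until the first token plus steady-state generation time.
function estimateResponseSeconds(outputTokens: number): number {
  return LATENCY_SECONDS + outputTokens / THROUGHPUT_TOKENS_PER_SECOND
}

// A 430-token answer: 0.7 s of latency plus 430 / 43 = 10 s of generation.
console.log(estimateResponseSeconds(430).toFixed(1))
```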

More models by xAI

Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
	1.0s	91tps	$1.25/M	$2.50/M	$0.2/M	—	04/30/2026
	0.8s	109tps	$2.00/M	$6.00/M	$0.2/M	—	03/11/2026
	1.3s	108tps	$2.00/M	$6.00/M	$0.2/M	—	03/09/2026
256K	0.3s	97tps	$0.20/M	$1.50/M	$0.02/M	—	08/28/2025
	0.6s	97tps	$0.20/M	$0.50/M	$0.05/M	—	07/09/2025
	6.9s	183tps	$0.20/M	$0.50/M	$0.05/M	—	07/09/2025

What To Consider When Choosing a Provider

Configuration: This variant produces direct answers without chain-of-thought output. If you need to inspect the model's reasoning process or require multi-step analytical depth, use the reasoning variant instead.
Throughput: Without reasoning overhead, Grok 4 Fast Non-Reasoning delivers higher tokens-per-second throughput, which suits streaming applications and high-volume pipelines.
Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
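To illustrate the authentication point in the list above, the sketch below builds (but does not send) a request that carries a gateway API key as a bearer token. The endpoint URL and the OpenAI-style request body are assumptions that do not come from this page, and `buildGatewayRequest` is a hypothetical helper, so check the AI Gateway documentation before relying on any of them.

```typescript
// Assumption: an OpenAI-compatible chat completions endpoint at this URL.
// This page does not state the URL; confirm it in the AI Gateway docs.
const GATEWAY_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions'

interface GatewayRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: assembles the request without sending it.
function buildGatewayRequest(apiKey: string, prompt: string): GatewayRequest {
  return {
    url: GATEWAY_URL,
    init: {
      method: 'POST',
      headers: {
        // The gateway key goes here; no xAI credential is involved.
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model: 'xai/grok-4-fast-non-reasoning',
        messages: [{ role: 'user', content: prompt }]
      })
    }
  }
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`. Keeping construction separate from sending makes the request easy to inspect in tests.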

When to Use Grok 4 Fast Non-Reasoning

Best For

High-throughput production APIs: Direct, low-latency answers for end users
Chat and conversational interfaces: Fast, natural responses without verbose reasoning
Text generation and content creation: Drafting, editing, and rephrasing, where throughput matters more than deep reasoning
Classification and routing pipelines: Quick categorization of inputs before downstream processing
Tool-calling agentic workflows: Cases where the model must decide and act quickly rather than deliberate
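For the tool-calling workflows in the list above, the application side often comes down to mapping a tool call from the model onto a local function. The sketch below shows that dispatch step only. It assumes an OpenAI-style tool call shape (a function name plus JSON-encoded arguments), and the `getWeather` tool and its canned reply are hypothetical stand-ins.

```typescript
// Assumed shape of a tool call returned by the model (OpenAI-style):
// a function name plus its arguments encoded as a JSON string.
interface ToolCall {
  name: string
  arguments: string
}

type ToolHandler = (args: Record<string, unknown>) => string

// Hypothetical tool registry; `getWeather` stands in for a real integration.
const toolHandlers: Record<string, ToolHandler> = {
  getWeather: (args) => `Weather for ${String(args.city)}: sunny`
}

// Looks up the requested tool and runs it with the parsed arguments.
function runToolCall(call: ToolCall): string {
  const handler = toolHandlers[call.name]
  if (!handler) {
    throw new Error(`Unknown tool: ${call.name}`)
  }
  return handler(JSON.parse(call.arguments) as Record<string, unknown>)
}

console.log(runToolCall({ name: 'getWeather', arguments: '{"city":"Oslo"}' }))
```

Rejecting unknown tool names explicitly keeps a fast, non-deliberating model from silently triggering nothing when it names a tool that does not exist.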

Consider Alternatives When

Complex analytical tasks: For work that requires multi-step reasoning, use the Grok 4 Fast Reasoning variant or the full Grok 4
Competition-level math or science: Chain-of-thought produces measurably better accuracy
Tasks where showing reasoning builds trust: In medical or legal analysis, for example, the reasoning variant exposes its thinking
Maximum cost efficiency on simple tasks: Grok 3 Mini Fast offers even lower per-token costs

Conclusion

Grok 4 Fast Non-Reasoning strips away reasoning overhead to deliver the Grok 4 foundation at maximum speed. Use it for production workloads that need direct answers without chain-of-thought latency or token cost. Pair it with the reasoning variant for a two-tier architecture that matches model capability to task complexity.
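The two-tier architecture described above can be sketched as a small routing function. The task categories and the `pickModel` helper are illustrative, and the reasoning model identifier is an assumption that should be confirmed on that model's own page.

```typescript
const NON_REASONING_MODEL = 'xai/grok-4-fast-non-reasoning'
// Assumption: identifier of the reasoning variant; it is not stated on this page.
const REASONING_MODEL = 'xai/grok-4-fast-reasoning'

// Illustrative task categories, loosely based on the lists on this page.
type TaskKind = 'chat' | 'classification' | 'summarization' | 'analysis' | 'math'

// Hypothetical router: analytical work goes to the reasoning tier,
// everything else goes to the fast non-reasoning tier.
function pickModel(task: TaskKind): string {
  const needsReasoning = task === 'analysis' || task === 'math'
  return needsReasoning ? REASONING_MODEL : NON_REASONING_MODEL
}

console.log(pickModel('classification'))
```

In practice the task kind might come from request metadata or from a first-pass classification made by the non-reasoning model itself.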

Frequently Asked Questions

What does 'non-reasoning' mean for Grok 4 Fast Non-Reasoning?
The model produces direct answers without generating chain-of-thought reasoning traces. This reduces latency and output token consumption compared to the reasoning variant.
How does Grok 4 Fast Non-Reasoning differ from Grok 4 Fast Reasoning?
Both share the same Grok 4 Fast foundation. The reasoning variant generates chain-of-thought traces for analytical tasks, while Grok 4 Fast Non-Reasoning produces direct responses optimized for speed.
What is the context window?
2M tokens.
What does Grok 4 Fast Non-Reasoning cost?
Rates are listed on this page. They reflect the prices of the providers that AI Gateway routes to and change when those providers update their pricing.
How do I authenticate with Grok 4 Fast Non-Reasoning through Vercel AI Gateway?
Use your Vercel AI Gateway API key with xai/grok-4-fast-non-reasoning as the model identifier. No separate xAI account is required for gateway-managed access.
Can Grok 4 Fast Non-Reasoning call tools and functions?
Yes. Grok 4 Fast Non-Reasoning supports tool calling and function invocation, making it suitable for agentic workflows that need fast decision-making.
Does Vercel AI Gateway support Zero Data Retention for Grok 4 Fast Non-Reasoning?
Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.