Trinity Large Preview is a 400B-parameter sparse mixture-of-experts model from Arcee AI that activates 13B parameters per forward pass, targeting math, coding, and multi-step agent workloads across a context window of 131K tokens.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-large-preview',
  prompt: 'Why is the sky blue?',
})
```
What To Consider When Choosing a Provider
- Configuration: Stream long analytical outputs to improve time-to-first-token. At $0.25 per million input tokens and $1 per million output tokens, compare spend to your latency budget before you scale traffic.
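As a rough illustration of the pricing above, a minimal sketch for estimating per-request spend (the rates come from this page; the helper name is ours):

```typescript
// Listed preview rates, expressed per token:
// $0.25 per million input tokens, $1.00 per million output tokens.
const INPUT_RATE = 0.25 / 1_000_000
const OUTPUT_RATE = 1.0 / 1_000_000

function estimateCost(inputTokens: number, outputTokens: number): number {
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE
}

// e.g. a long-context request: 100K input tokens, 4K output tokens ≈ $0.029
console.log(estimateCost(100_000, 4_000).toFixed(4))
```

At these rates, even near-full-context requests stay in the cents range, so latency budget rather than raw spend is usually the deciding factor.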
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
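The credential precedence described above can be sketched as a small resolver: an explicit API key wins, otherwise an OIDC token is used. The environment variable names here are assumptions for illustration, not a documented contract:

```typescript
// Sketch of gateway credential resolution. The variable names
// AI_GATEWAY_API_KEY and VERCEL_OIDC_TOKEN are assumptions.
type GatewayAuth = { scheme: 'api-key' | 'oidc'; token: string }

function resolveAuth(env: Record<string, string | undefined>): GatewayAuth {
  if (env.AI_GATEWAY_API_KEY) {
    return { scheme: 'api-key', token: env.AI_GATEWAY_API_KEY }
  }
  if (env.VERCEL_OIDC_TOKEN) {
    return { scheme: 'oidc', token: env.VERCEL_OIDC_TOKEN }
  }
  throw new Error('No AI Gateway credentials found')
}
```

In practice the AI SDK handles this lookup for you; the point is that no Arcee AI provider key ever enters the picture.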
When to Use Trinity Large Preview
Best For
- Multi-step agent pipelines: The model plans, calls tools, and synthesizes results over many turns
- Sustained math and coding: Tasks that need continuous reasoning across a long context
- Large-scale code work: Generation or debugging where the model must follow logic across large files or refactors
- Long-context analysis: Ingesting a large corpus and producing structured conclusions
Consider Alternatives When
- Short single-turn requests: A smaller or faster model may match quality at lower cost
- Generally available contract: Preview terms don't fit teams needing a fixed long-term API contract
- Latency-dominant tasks: Simpler models suffice when deep multi-step reasoning isn't required
Conclusion
Trinity Large Preview brings Arcee AI's large MoE stack to AI Gateway for agentic, math, and coding workloads. If you need long-context reasoning and can accept preview terms, run it through your own benchmarks on AI Gateway.
Frequently Asked Questions
What kinds of tasks is Trinity Large Preview explicitly designed for?
Math, coding, and complex multi-step agent workflows. The release notes also emphasize efficient extended multi-turn use with high inference throughput.
Why is this model labeled a "preview" release?
It ships before general availability. Expect changes while Arcee AI finalizes production behavior, pricing, and versioning.
How does Trinity Large Preview differ from Trinity Mini?
Trinity Large Preview is a 400B-parameter MoE with 13B active per forward pass, targeting deep reasoning on math, coding, and multi-step agent tasks. Trinity Mini is a 26B-parameter MoE with 3B active parameters, tuned for lean inference and volume. Pick Mini when cost per token is the binding constraint; pick this model when you need the larger parameter space.
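The sparsity trade-off between the two models can be made concrete with the figures above: Trinity Large Preview activates a much smaller fraction of its weights per token than Trinity Mini does.

```typescript
// Active-parameter fraction for the two MoE configurations cited above.
const activeFraction = (activeB: number, totalB: number) => activeB / totalB

// Trinity Large Preview: 13B active of 400B total ≈ 3.3%
console.log((activeFraction(13, 400) * 100).toFixed(1) + '%')
// Trinity Mini: 3B active of 26B total ≈ 11.5%
console.log((activeFraction(3, 26) * 100).toFixed(1) + '%')
```

The larger model buys its deeper capacity with a bigger expert pool, not proportionally more compute per token.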
Do I need an Arcee AI account to use this model through AI Gateway?
No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.
Can I use this model with the AI SDK?
Yes. Set `model` to `arcee-ai/trinity-large-preview` in the AI SDK's `streamText` or `generateText` call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.
Does AI Gateway provide observability for requests to Trinity Large Preview?
Yes. Token usage, latency, and cost show in your AI Gateway dashboard for each request without extra instrumentation.