Trinity Mini

Trinity Mini is a 26B-parameter sparse mixture-of-experts (MoE) model from Arcee AI that activates 3B parameters per forward pass. It handles function calling and multi-step agent workflows at low per-token cost, and was trained end-to-end in the United States.

index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-mini',
  prompt: 'Why is the sky blue?',
})

// Stream the response to stdout as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

What To Consider When Choosing a Provider

  • Configuration: MoE routing keeps active parameters low per token, which helps cost at scale. At $0.045 per million input tokens and $0.15 per million output tokens, stress-test cost against quality on your traffic.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
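As a quick sanity check on the pricing above, per-request cost can be estimated from token counts. This is a minimal sketch; the function name and the example token counts are illustrative:

```typescript
// Trinity Mini rates on AI Gateway, in USD per million tokens.
const INPUT_RATE = 0.045
const OUTPUT_RATE = 0.15

// Estimate the dollar cost of one request from its token counts.
function estimateCost(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_RATE +
    (outputTokens / 1_000_000) * OUTPUT_RATE
  )
}

// A 2,000-token prompt with a 500-token completion is a fraction of a cent.
console.log(estimateCost(2_000, 500)) // ≈ $0.000165
```

At these rates, output tokens dominate spend only when completions run several times longer than prompts, so long-prompt, short-answer traffic stays especially cheap.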

When to Use Trinity Mini

Best For

  • High-volume reasoning routes: Deployments where cost per token is a hard constraint
  • Structured inference tasks: Reverse engineering or deduction from partial observations
  • U.S. training provenance: Teams that need domestic-only training for policy or procurement

Consider Alternatives When

  • Deepest large-model reasoning: Trinity Large Preview offers a larger parameter space at higher cost
  • Long-term enterprise SLA: This tier does not offer a fixed enterprise support contract
  • Tight latency budgets: Some workloads rule out even a compact MoE path

Conclusion

Trinity Mini pairs MoE efficiency with U.S. training provenance for teams balancing cost, control, and reasoning depth. Benchmark real traffic against the $0.045 per million input tokens and $0.15 per million output tokens rates, then scale what works.

Frequently Asked Questions

  • What does "26B parameters, 3B active" mean in practice?

    The stack is mixture-of-experts with 26B total parameters. Each token activates roughly 3B parameters through routed experts, so cost and latency stay closer to a 3B-class forward pass than a dense 26B run.

  • What does "trained end-to-end in the U.S." mean?

    Arcee AI ran the full training pipeline in the United States. Buyers who care about geography for compliance or sourcing can use that fact in reviews.

  • How is Trinity Mini different from Trinity Large Preview?

    Trinity Mini is the 26B / 3B active open-weight MoE built for efficient volume inference. Trinity Large Preview is the 400B-parameter (13B active) large MoE aimed at heavier long-context reasoning. They sit at different cost-versus-capability points.

  • Do I need a separate Arcee AI account to access Trinity Mini on AI Gateway?

    No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.

  • What reasoning style does Trinity Mini use?

    It supports chain-of-thought style traces, including the long-form machine-inference example from the model's AI Gateway announcement. Use that pattern when you need stepwise causal analysis from partial evidence.

  • Can I use Trinity Mini with the AI SDK?

    Yes. Set model to arcee-ai/trinity-mini in the AI SDK's streamText or generateText call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.
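The "26B total, 3B active" arithmetic from the FAQ can be made concrete with a small calculation (the variable names here are illustrative):

```typescript
// Parameter counts for Trinity Mini, in billions.
const totalParams = 26
const activeParams = 3

// Under MoE routing, each token passes through only the routed experts,
// so roughly this fraction of the weights participates per forward pass.
const activeFraction = activeParams / totalParams
console.log(`${(activeFraction * 100).toFixed(1)}% of parameters active per token`)
```

That is roughly 11.5% of the network per token, which is why cost and latency track a 3B-class forward pass rather than a dense 26B one.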