
Arcee AI Unveils Trinity Large: A 400B Sparse MoE Pushing Frontier Performance on a Budget

Arcee AI has launched Trinity Large, a 400B-parameter Sparse Mixture-of-Experts (MoE) model that demonstrates frontier-level capabilities across reasoning and knowledge benchmarks. The release includes three checkpoints: a fast, chat-ready Preview; the fully pre-trained Base; and the research-focused TrueBase. The run was notable for its training efficiency, covering 17T tokens in just 33 days on 2048 B300 GPUs.



The landscape of open-source foundation models just experienced a significant upward shift. Arcee AI, defying the typical astronomical costs associated with frontier model development, has unveiled Trinity Large—a colossal 400B parameter Sparse Mixture-of-Experts (MoE) architecture. This initiative signals a growing maturity in parameter-efficient design, proving that cutting-edge performance need not be locked behind the budgets of the largest labs.

Trinity Large operates with remarkable sparsity: each token is routed to 4 of 256 experts, so only 13 billion of the model's 400 billion total parameters are active per token. This aggressive sparsity ratio, balanced against routing stability by increasing the number of dense layers from three to six, is what drives the model's throughput: the lineage runs roughly 2-3x faster than peers in the same weight class on equivalent hardware, a critical advantage for both training and deployment.
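
To put that sparsity in concrete terms, the short Python sketch below is pure bookkeeping over the figures quoted above (no architectural details beyond those numbers are assumed): it shows that a token touches roughly 3% of the total parameters and fewer than 2% of the routed experts.

```python
from dataclasses import dataclass

@dataclass
class TrinityLargeFigures:
    """Top-line figures reported for Trinity Large (illustrative bookkeeping only)."""
    total_params: float = 400e9    # total parameter count
    active_params: float = 13e9    # parameters engaged per token
    num_experts: int = 256         # routed experts per MoE layer
    experts_per_token: int = 4     # experts the router selects for each token
    dense_layers: int = 6          # dense layers, raised from three for routing stability

    @property
    def active_fraction(self) -> float:
        """Share of all parameters a single token actually touches."""
        return self.active_params / self.total_params

    @property
    def expert_fraction(self) -> float:
        """Share of routed experts consulted per token."""
        return self.experts_per_token / self.num_experts


figs = TrinityLargeFigures()
print(f"parameters active per token: {figs.active_fraction:.2%}")   # ~3.25%
print(f"experts consulted per token: {figs.expert_fraction:.2%}")   # ~1.56%
```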

Arcee is shipping three distinct checkpoints to serve varied community needs. Trinity-Large-Preview is optimized for immediate utility: lightly post-trained for chat, it excels at creative tasks and agentic workflows such as OpenCode navigation. The centerpiece, Trinity-Large-Base, represents the culmination of a rigorous 17T token pretraining recipe curated by DatologyAI, targeting superior performance in math, coding, and scientific reasoning.

For the research community, Arcee has made a crucial offering: Trinity-Large-TrueBase. This is an unadulterated checkpoint from the 10T token mark, devoid of instruction tuning or learning rate annealing. It provides a rare, clean baseline for studying what a model learns purely from high-quality pretraining data before any alignment processes—a vital tool for academic ablation studies.
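
As a rough illustration of how such a checkpoint might serve as a clean baseline, the sketch below scores held-out text with a raw pretrained model through Hugging Face transformers. The repository ID is hypothetical, and loading a 400B MoE in practice requires multi-GPU sharding; the point is the shape of the workflow, not Arcee's actual release artifacts.

```python
# Sketch: perplexity of held-out text under a raw, pretraining-only checkpoint.
# The repo ID is hypothetical; substitute whatever Arcee actually publishes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "arcee-ai/Trinity-Large-TrueBase"  # hypothetical repository name

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",   # shard across available GPUs (requires `accelerate`)
)

def perplexity(text: str) -> float:
    """Token-level perplexity of `text` under the un-annealed base checkpoint."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

print(perplexity("The derivative of sin(x) with respect to x is cos(x)."))
```

Because TrueBase has seen no instruction tuning or annealing, scores like this reflect what the pretraining data alone instilled, which is precisely what ablation studies want to isolate.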

The scale of the training run was staggering: 2048 Nvidia B300 GPUs for 33 days. Efficiency was paramount, driving the adoption of techniques such as careful nudging of the MoE router bias, momentum on those updates, and a per-sequence balance loss to keep expert utilization even. The team also employed a z-loss to prevent logit drift, keeping the loss curve smooth and stable throughout the entire pretraining phase.
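
Arcee has not published its exact formulations here, so the PyTorch sketch below should be read as generic reference implementations of the two named losses: a per-sequence expert balance loss in the style of standard MoE auxiliary losses, and a router z-loss (as in ST-MoE) that penalizes large routing logits so they do not drift. Tensor shapes and coefficients are assumptions for illustration, not Trinity's actual training code.

```python
import torch
import torch.nn.functional as F

def per_sequence_balance_loss(router_logits: torch.Tensor, top_k: int) -> torch.Tensor:
    """Encourage even expert utilization within each sequence.

    router_logits: [batch, seq_len, num_experts] raw router scores.
    """
    num_experts = router_logits.shape[-1]
    probs = router_logits.softmax(dim=-1)                            # routing probabilities
    top_idx = probs.topk(top_k, dim=-1).indices                      # experts actually selected
    dispatch = F.one_hot(top_idx, num_experts).sum(dim=-2).float()   # [B, S, E] selection indicators

    # Per-sequence statistics: fraction of tokens each expert receives, and the
    # mean routing probability assigned to it (computed per sequence, not per batch).
    load = dispatch.mean(dim=1) / top_k                              # [B, E]
    importance = probs.mean(dim=1)                                   # [B, E]

    # Switch-style balance term, averaged over sequences; minimized when both
    # load and importance are uniform across experts.
    return num_experts * (load * importance).sum(dim=-1).mean()

def router_z_loss(router_logits: torch.Tensor) -> torch.Tensor:
    """Penalize large router logits so they do not drift over a long run."""
    return torch.logsumexp(router_logits, dim=-1).pow(2).mean()

# Example: fold both terms into the main LM loss with small coefficients (values illustrative).
logits = torch.randn(2, 16, 256)             # [batch, seq, experts]
aux = 0.01 * per_sequence_balance_loss(logits, top_k=4) + 1e-3 * router_z_loss(logits)
```

Computing the balance statistics per sequence rather than per batch stops one pathological sequence from hiding behind a balanced batch average, which is the usual motivation for the per-sequence variant.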

The data diet underpinning Trinity Large was equally ambitious, featuring 17 trillion tokens curated by DatologyAI, including over 8 trillion tokens of synthetic data across multiple domains and 14 non-English languages. This focused curation is directly reflected in the Base model’s frontier-level performance across targeted capability domains.

Perhaps the most striking revelation is the cost efficiency. The entire endeavor—compute, salaries, data, and operations—was accomplished for approximately $20 million over six months. This figure positions Trinity Large as a high-value proposition when contrasted with the operational expenditure of hyperscale AI labs, underscoring the democratization potential of optimized MoE architectures.

As Arcee continues post-training on the reasoning variant of Trinity Large, the initial Preview release already showcases strong utility. This launch is not merely about parameter count; it’s a demonstration of engineering discipline, showing how meticulous architectural choices and efficient training methodologies can redefine the ceiling for publicly accessible, frontier-class AI.
