Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads
Summary
<p>Prime Intellect has released prime-rl 0.6.0, an open framework for asynchronous reinforcement learning on trillion-parameter Mixture-of-Experts (MoE) models. The framework trained GLM-5 on software engineering (SWE) tasks at sequence lengths up to 131k, with sub-5-minute step times and 256 rollouts, on 28 H200 nodes. This breakdown covers the inference and training optimizations behind those numbers: FP8 inference, Wide Expert Parallelism, prefill/decode disaggregation, router replay, and 3-D parallelism (FSDP, EP, CP).</p> <p>The post <a href="https://www.marktechpost.com/2026/06/23/prime-intellect-releases-prime-rl-0-6-0-to-train-trillion-parameter-moe-models-on-agentic-rl-workloads/">Prime Intellect Releases prime-rl 0.6.0 to Train Trillion-Parameter MoE Models on Agentic RL Workloads</a> appeared first on <a href="https://www.marktechpost.com">MarkTechPost</a>.</p>