Cohere sold sovereign AI to enterprises; now it’s targeting developers with its first coding model
Canadian foundation model company Cohere has spent the past few years selling a specific idea to banks, governments, and healthcare providers: that AI should run on their infrastructure, under their control, with their data never leaving the perimeter.
Cohere’s pitch went down well in regulated industries. Now the company is taking it to a different audience, with the launch of North Mini Code — its first coding model, released under an Apache 2.0 license from the get-go.
Model access as infrastructure
The sovereignty argument Cohere has long made to enterprise customers is, at its root, about ownership. Regulated industries have hard requirements: data can’t leave certain boundaries, and the intelligence layer running on sensitive infrastructure needs to be something the organization controls. That requirement shaped how Cohere built its products — deployable anywhere, runnable on private infrastructure.
What’s changed, according to Cohere co-founder Nick Frosst, is who is asking those same questions.
“We’re now hearing similar concerns from developers,” Frosst tells The New Stack. “They’re starting to think of model access as infrastructure, and infrastructure should be something you own and control. That is an extension of sovereignty.”
North Mini Code is a direct response to that demand. It’s a 30-billion-parameter Mixture of Experts (MoE) model with just 3 billion active parameters and is designed for agentic coding tasks: the kind of multi-step, tool-using work that coding agents like Claude Code and Cursor are built around.
Cohere says it runs on a single Nvidia H100 GPU, making self-hosting practical without a larger multi-GPU deployment. Developers who would rather not manage their own infrastructure can access it via API instead.
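A rough back-of-the-envelope check helps put the single-GPU claim in context. The sketch below counts weight memory only, ignoring the KV cache, activations, and runtime overhead. It assumes an 80 GB H100. The precisions listed are illustrative assumptions, not formats Cohere has confirmed it ships or serves.

```python
# Back-of-the-envelope weight-memory estimate (assumptions, not Cohere's figures).
TOTAL_PARAMS = 30e9       # total parameters, from the article
ACTIVE_PARAMS = 3e9       # active parameters per token, from the article
H100_MEMORY_GB = 80       # assumption: the 80 GB H100 variant

# Assumed storage precisions, in bytes per parameter (hypothetical choices).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_on_gpu(num_params: float, bytes_per_param: float, gpu_gb: float) -> bool:
    """True if the weights alone fit; ignores KV cache and activations."""
    return weight_memory_gb(num_params, bytes_per_param) <= gpu_gb


if __name__ == "__main__":
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
        fits = fits_on_gpu(TOTAL_PARAMS, nbytes, H100_MEMORY_GB)
        print(f"{name}: {gb:.0f} GB of weights, fits on one GPU: {fits}")
    print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.0%}")
```

In a typical MoE deployment, all 30 billion parameters must be held in memory even though only about 3 billion are used per token, so the active-parameter count mainly affects compute per token and speed, not the memory footprint.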
“We want to give developers a capable, fast, open-weight model they can run locally on their own terms, and that fits in their compute environments,” Frosst says.
Cohere claims North Mini Code outperforms comparable open-weight models, including Alibaba’s Qwen3 and Google’s Gemma 4, on the Artificial Analysis Coding Index, where it scores 33.4. The company also says the model delivers up to 2.8x higher output throughput than Mistral’s Devstral Small 2 on identical hardware.
Cohere’s own benchmark testing shows North Mini Code leading on terminal and code generation tasks, but results are mixed across the full evaluation suite: Qwen 3.6 is ahead on SWE-Bench Verified and LiveCodeBench v6, according to the company’s chart. Because those comparisons come from Cohere’s own testing, they should be treated as indicative rather than independently verified.

A growing club
Cohere’s timing puts it alongside a growing group of international companies that have made open-weight coding models a deliberate product choice. Mistral, the Paris-based AI company, launched Devstral in May 2025 — its first dedicated agentic coding model, also under Apache 2.0 — and followed it with Devstral 2 in December. JetBrains, the Czech developer tools company, recently open-sourced Mellum2, its second-generation coding model.
The emphasis differs. Mistral has explicitly linked open weights to AI sovereignty and the ability to deploy models on private infrastructure, while JetBrains focuses on latency, cost and deployment flexibility. In practice, both approaches give developers and enterprises more control over where models run and how they are operated.
Owning the infrastructure
The appetite for open-weight alternatives to frontier models is clearly there. AI agent platform Lindy recently announced it had moved 100% of its inference traffic from Anthropic to China’s DeepSeek, saying the switch would save the company millions while actually improving performance on its core use cases. Lindy’s CEO Flo Crivello addressed the obvious question about routing through a Chinese-developed model: the company uses Atlas Cloud, a US-based inference provider that hosts DeepSeek on American soil. The open-weight nature of DeepSeek made that possible — the model can be hosted by any provider, in any jurisdiction.
That’s precisely the dynamic Frosst is pointing to. Open weights give developers optionality that a proprietary API does not: the ability to choose where the model runs, who operates it, and under what terms. For companies whose inference bill has grown to exceed payroll — as Crivello noted is the case at Lindy — those are decisions with real commercial consequences.
Cohere’s Command family — its flagship line of enterprise models built for agentic, multilingual, and multimodal tasks — had previously shipped as open-weight models under more restrictive licenses. With Command A+, the company moved to Apache 2.0 in May, making the legal terms around use and redistribution significantly more permissive.
Frosst draws a direct line between the enterprise sovereignty argument Cohere has made for years and the thinking behind North Mini Code. The open-source coding model, he says, is a response to the same concentration problem Cohere saw in enterprise AI — only now playing out at the developer layer.
“Open-source development was concentrated in a small number of jurisdictions, and organizations running critical infrastructure had no reliable alternative,” Frosst says. “North Mini Code extends that thinking to the developer layer. As coding agents become the infrastructure software engineering runs on, whoever controls those systems controls how they work, how they evolve, and what they’re optimized for. We think that developers and enterprises should be in control.”