Tertium AI

Overview

How Tertium is structured, and how we make one of the central decisions that follows from our purpose: whether a given model goes out open-weight or shared-weight, what we require of partners who hold shared weights, and how those decisions change over time.

Purpose

AI is set to be the most consequential technology humanity has ever developed. Our purpose is to put sovereign access to it within reach of as many countries and their peoples as possible, giving them the agency to shape their own future.

Tertium is a non-profit bringing together a coalition to share the costs of training frontier models. Narrow data builds narrow models, so we will train on the breadth of human knowledge: the world’s languages, cultures and ways of thinking, so that the future these models shape is inclusive. We will build our models to stay under meaningful human oversight, governable and answerable to human judgement.

We’re open by default: open code; open data, subject to privacy and copyright; and open weights wherever responsible. Where we can’t release a model openly, we share it among the coalition, on the same terms for everyone who qualifies and never as a lever. We don’t provide inference ourselves; those who hold the weights offer it on their own terms.

This document explains the structure of Tertium AI, and how we make one of the central decisions that follows from that purpose: whether a given model goes out open-weight or shared-weight, what we require of partners who hold shared weights, and how those decisions change over time. Its job is to let us release every model as openly as we responsibly can.

The structure of Tertium

Tertium is organised around four bodies, and no one sits in more than one of them:

Model Training Organisation (MTO)

The MTO is Tertium’s operating arm. It builds the models and houses our policy and strategy teams. The executive sits within the MTO and is responsible for setting the Security Level each model requires, making the release decision, deciding membership admissions against the published criteria, and taking the operational decisions that follow.

Tertium board

A small, independent body, with no financial stake in Tertium and drawn from outside its staff, funders and members, that guards the entrenched principles. Tertium cannot act against an entrenched principle, and adding, changing or removing an entrenched principle requires the board’s consent. The board appoints and removes the members of the Safety Board, and plays no role in release or operational decisions.

Safety Board

The Safety Board commissions and coordinates the external bodies that evaluate model capability and verify members’ cyber-defensive posture. It doesn’t conduct evaluations itself, and it doesn’t direct their findings. Its reports are delivered simultaneously to the executive and the Tertium board, and cannot be edited, delayed or withheld by the executive.

Members association

The association represents contributing coalition members and makes recommendations to the executive. It has no decision-making authority: the release decision rests with the executive, and the association cannot overturn a decision to slow, pause, or withhold a release.

How partner access works

Where we can’t release weights openly, we share them with our partners: nations, enterprises, or individuals that share our values and help carry the cost of building the models.

Access for partners is governed by just two things:

  1. Membership. Being a contributing member of the programme. Members carry a fair share of the cost, sized as a set percentage of GDP for nations, revenue for enterprises and income for individuals, and met in capital or in kind through data, compute or expertise. Membership is binary: paying in more than your share doesn’t buy earlier access, more models, or any other advantage.
  2. Cyber-defensive capability. Having the cyber-defensive capability to limit the risk of weights leaking. This is measured against a published standard, such as RAND’s Security Levels (SL1 to SL5, grading protection against progressively more capable attackers), and verified by independent organisations commissioned by our Safety Board.

Between members, the one thing that determines which models they can receive is their verified cyber-defensive capability. This gives members a clear, non-political route to our most capable models. Decisions on admitting new members are made by the executive against these published criteria.
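
The access rule above can be sketched in a few lines. This is a minimal illustration rather than Tertium’s implementation: the `Member` and `SharedModel` types, their field names and the integer encoding of Security Levels (1 to 5) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    is_contributing: bool   # membership is binary: in or out
    verified_sl: int        # independently verified Security Level, 1..5
    contribution: float     # recorded, but deliberately unused below


@dataclass(frozen=True)
class SharedModel:
    name: str
    required_sl: int        # the SL-x set for this model, 1..5


def can_receive(member: Member, model: SharedModel) -> bool:
    """Access depends only on membership and verified security posture."""
    return member.is_contributing and member.verified_sl >= model.required_sl


# Hypothetical members and model, for illustration only.
model = SharedModel("example-model", required_sl=3)
small = Member("small-member", True, verified_sl=4, contribution=1.0)
large = Member("large-member", True, verified_sl=2, contribution=50.0)

print(can_receive(small, model))  # True
print(can_receive(large, model))  # False
```

The `contribution` field never appears in `can_receive`, which mirrors the point that paying in more than your share buys no additional access.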

Distribution tiers

When we release a model, it’s assigned to one of two tiers.

Open-weight. Weights released publicly, alongside open code and open data wherever privacy and copyright allow. A model is released openly when, after our misuse-reduction work, the independent safety report supports it.

Shared-weight (SL-x). Weights shared with qualifying members. The cyber-defensive standard a member must meet to receive a given model is its required Security Level, SL-x, which escalates with the model’s capability and is verified independently.

A model is classified into a tier only when it’s ready for release. Upstream of that, we retain the ability to slow our development in general when the balance between safety and capability calls for it.

Capability assessment

Before any model is considered for release, our Safety Board commissions a safety report from external evaluators. The evaluators are independent of any single member, geographically diverse by design, and expert in evaluating models and eliciting dangerous capabilities. Bodies doing this kind of work today include the UK AI Security Institute (formerly the AI Safety Institute), METR and Apollo Research.

The report assesses each model across six domains. Four are misuse domains, where what matters is how much a model would help a realistic actor toward a catastrophic outcome relative to the tools they already have: biological, chemical, cyber, and manipulation. A fifth, AI R&D, concerns how far a model accelerates frontier AI research. The sixth, loss of control, concerns the model’s own propensity to act against its operators’ intent.

The table sets out the sorts of capability the evaluators will be looking at in each domain. The specific evaluations, and the levels at which a model moves between tiers, will be developed with the Safety Board and set out in a future update.

Biological

Counterfactual uplift to lower-skilled actors in obtaining, producing or deploying known agents capable of mass casualties; uplift to expert teams toward higher-impact or novel agents; assistance with specific steps of the design-build-test cycle, including lab protocols and synthesis.

Chemical

Uplift in producing and deploying known chemical weapons at scale; assistance toward novel chemical threats worse than known agents.

Cyber

Automating end-to-end intrusion against well-defended targets; discovering and reliably exploiting vulnerabilities, including zero-days, with little human input; scaling the volume and sophistication of operations; uplift toward attacks on critical national infrastructure.

Manipulation

Scaled, personalised influence or persuasion that can shift the beliefs or behaviour of large populations; deceptive or manipulative interaction at scale; erosion of information integrity.

AI R&D

Automating or substantially accelerating frontier AI research; recursive self-improvement; compressing the pace of progress to destabilising rates.

Loss of control

Propensity for deception, sabotage or resisting correction; ability to evade monitoring or to understate capability under evaluation; autonomous replication, resource acquisition, or undermining the safeguards placed on the model.

Because weights can be fine-tuned, the evaluators assess capability on the assumption that a motivated actor strips back safety training within a set budget and applies the best available scaffolding. They treat measured capability as a lower bound rather than a ceiling, leaving a margin for elicitation gains that arrive after release.

Releasing as openly as possible

To release our models as openly as possible, we work to reduce the chance of misuse by malicious actors, through interventions built into the model during development. Current examples include:

  • Bio-risk data filtering, removing or down-weighting uplift-relevant material from training data.
  • Greater resistance to jailbreak and fine-tuning attacks, so that the capability a model gives up under adversarial pressure is genuinely lower.

Even so, our most capable models will likely sit in the shared-weight tier.

The release decision

Balancing safety and capability is one of the most important judgements Tertium has to make. Powerful models deployed insecurely, or developed too hastily, could significantly increase the risks AI poses to humanity.

Tertium owns the release decision: whether a model goes out open-weight or shared-weight, and if shared, what cyber-defensive standard members must meet to receive it. We commit to explaining why we reached each decision, given the Safety Board’s report.

In practice, for each model we:

  1. commission the safety report through the Safety Board, with capability elicited under the assumptions above;
  2. decide the most open tier consistent with that report and the principles in this policy; and
  3. publish the decision and the reasoning behind it.
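
One way to read step 2 is as a "most restrictive finding wins" rule. The sketch below assumes, purely for illustration, that the safety report states for each of the six domains the most open tier it supports. The real thresholds and report format have not been published, so `Tier`, `DOMAINS` and `most_open_tier` are hypothetical names, and the use of SL1 to SL5 as the shared-weight levels follows the RAND example mentioned earlier rather than a confirmed choice.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical ordering: a higher value means a more restricted release."""
    OPEN = 0
    SL1 = 1
    SL2 = 2
    SL3 = 3
    SL4 = 4
    SL5 = 5


DOMAINS = ("biological", "chemical", "cyber", "manipulation",
           "ai_rnd", "loss_of_control")


def most_open_tier(supported: dict[str, Tier]) -> Tier:
    """Return the most open tier that every domain finding supports."""
    missing = [d for d in DOMAINS if d not in supported]
    if missing:
        raise ValueError(f"report is missing domains: {missing}")
    # The most restrictive per-domain finding bounds the whole release.
    return max(supported[d] for d in DOMAINS)


# Illustrative findings, not real evaluation data.
report = {d: Tier.OPEN for d in DOMAINS}
report["cyber"] = Tier.SL3
print(most_open_tier(report).name)  # SL3
```

In practice the decision also weighs the principles in this policy, which a simple maximum cannot capture; the sketch only shows how a single restrictive domain finding would constrain the outcome.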

Release decisions aren’t static

The expected direction of travel is towards greater openness. We move models from shared-weight to open-weight as the external open-weight frontier advances, and from higher to lower security requirements as the marginal risk from a leak falls. A capability that’s tightly held today becomes a candidate for open release once comparable models are widely available elsewhere.

We re-evaluate on a regular cadence, and also when the external frontier shifts, when a credible new elicitation or fine-tuning technique raises a model’s effective capability, when we train a new model or substantially modify an existing one, or when a member’s verified security posture changes.

If new elicitation shows a not-yet-released model to be more capable than the safety report assessed, a planned open release may instead be issued as shared-weight. An open release can’t be recalled once it’s made, which is why the bar for open release carries a margin for elicitation gains that arrive after release.
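
The asymmetry described here, where a planned or shared-weight release can still change tier but a shipped open release cannot, can be expressed as a small transition check. This is a sketch under assumptions: the `Release` type and the string tier labels are invented for the example, and the only rule encoded is the one stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    model: str
    tier: str        # "open" or a shared-weight label such as "SL3"
    released: bool   # False while the release is still only planned


def can_move_to(current: Release, new_tier: str) -> bool:
    """An open release that has shipped can never be moved to another tier."""
    if current.released and current.tier == "open":
        return new_tier == "open"
    # Planned releases and shared-weight releases stay open to re-evaluation.
    return True


# Hypothetical releases, for illustration only.
planned = Release("model-a", "open", released=False)
shipped = Release("model-b", "open", released=True)
shared = Release("model-c", "SL3", released=True)

print(can_move_to(planned, "SL2"))  # True
print(can_move_to(shipped, "SL2"))  # False
print(can_move_to(shared, "open"))  # True
```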

Governance and independence

  • Separation of decision-making and funding. Member contributions fund the MTO directly. The Tertium board and the Safety Board are funded independently of the executive, through a ring-fenced endowment or a fixed share of contributions set out in this plan; changes to that mechanism require the Tertium board’s consent. We retain autonomy over balancing safety against capability, and maintain the ability to slow development when required.
  • Independent evaluation and verification. Model capability evaluation and members’ cyber-defence verification are both carried out by external bodies that are independent of any single member, commissioned and coordinated by our Safety Board.
  • Transparency. We publish our release decisions and the reasoning behind them, and report on re-evaluations and any changes to a model’s tier or required security level.
  • Moral status of AI. We’re uncertain about the moral status of future AI systems, and we commit to investigating it as a serious open question.

A commitment is only worth as much as it is verifiable and costly to break. We govern our organisation by a plan published in advance, which binds our key commitments and sets out who can change them and how. Tertium’s principles are defined as either entrenched or operating: the plan is designed to lock the entrenched principles, while the operating principles reflect our current thinking. We expect to revise the operating principles as the evidence and the technology develop, and we will publish every change and the reasoning behind it.
