Not every document needs the same level of provenance. A company's internal wiki has different requirements than a pharmaceutical dosage reference. MX defines six compliance levels (0-5), from not ready to fully audited, each describing both what the publisher provides and what AI agents can do with the content.
| Level | Name | Key Requirement | Use Case | Agent Outcome |
|---|---|---|---|---|
| 0 | Not Ready | Auto-generated boilerplate, no deliberate MX intent | Default state | Agents guess, infer, hallucinate |
| 1 | Basic | Deliberate metadata, publisher identified | Internal documentation | Agents can parse and discover |
| 2 | Structured | Full MX fields, maintainer + contact | Public documentation | Agents can cite and attribute |
| 3 | Attested | Cryptographically attested, review cycle | Commercial documentation | Agents can compare and recommend |
| 4 | Registered | Attested + registered, full contract with SLA | Enterprise documentation | Agents can transact with confidence |
| 5 | Audited | Attested + registered + third-party verified | Critical documentation | Agents can guarantee accuracy |
Level 0 describes the default state before any deliberate MX work. Level 3 is the minimum for REGINALD registration. The progression is cumulative — each level includes everything from the level below.
Level 0: Default state
The baseline before any deliberate MX work. The site may have auto-generated meta tags from a CMS, boilerplate social sharing cards, or templated HTML. None of this was placed with deliberate machine readability in mind. An AI agent visiting this site must infer meaning from visual layout, guess at business identity, and hallucinate missing context.
What exists: a `<title>` from the CMS or template. What it signals: nothing. This is the absence of deliberate MX. Most websites on the internet are at Level 0.
Agents must guess, infer, and hallucinate. They cannot reliably identify the publisher, extract structured facts, or verify any claims. Recommendations based on Level 0 content are unreliable.
Any website that has not been deliberately structured for machine consumption. CMS-generated pages with default templates, marketing sites built for visual impact without metadata consideration, legacy sites with outdated HTML.
Level 1: Quick-start adoption
The entry point. MX metadata is present and the publisher is identified. This is the minimum structure that makes content machine-parseable rather than just human-readable. Any file type can qualify — markdown, HTML, JavaScript, CSS, or shell scripts — each using its own carrier format.
Requirements:

- `title` (or equivalent) and `description` fields present
- `author` field identifying the publisher

What it signals: someone has deliberately structured this content for machine consumption. It is not a random file — it has identity.
Agents can parse and discover this content. They can find it, extract basic facts, and confirm the publisher's identity. Discovery-stage agent tasks succeed.
Internal wikis, team documentation, product pages, personal knowledge bases. Content that needs to be parseable by internal AI tools but does not require external trust verification.
A markdown carrier uses frontmatter:

```markdown
---
title: "Internal API Guide"
description: "Authentication flows for the payments service"
author: "Engineering Team"
version: "1.0"
---

# Internal API Guide

Content here...
```
An HTML carrier uses structured data and meta tags:

```html
<!-- Schema.org structured data -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "name": "Internal API Guide",
  "description": "Authentication flows for the payments service",
  "author": { "@type": "Organization", "name": "Engineering Team" }
}
</script>

<!-- MX governance metadata -->
<meta name="mx:status" content="active">
<meta name="mx:contentType" content="guide">
```
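At Level 1, discovery can be as simple as an agent reading the frontmatter. A minimal sketch using plain string handling (a real agent would use a proper YAML parser; `parse_frontmatter` is an illustrative helper, not part of any MX tooling):

```python
def parse_frontmatter(text: str) -> dict:
    """Extract flat key: value pairs from a '---'-delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter: the document is effectively Level 0
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip('"')
    return fields

doc = """---
title: "Internal API Guide"
description: "Authentication flows for the payments service"
author: "Engineering Team"
version: "1.0"
---
# Internal API Guide
"""
meta = parse_frontmatter(doc)
print(meta["author"])  # Engineering Team
```

With `title`, `description`, and `author` extracted, the discovery-stage tasks described above (find, extract basic facts, confirm publisher identity) succeed without any visual-layout inference.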
Level 2: Professionalised content
The document follows the full MX metadata specification. It has a named maintainer, contact details, and the operational metadata that tells AI agents not just what the content is, but who is responsible for it.
Requirements:

- `mx:status`, `mx:contentType`, `mx:tags` fields
- `created` and `modified` dates (or equivalent)

What it signals: this content has governance. A named person or team is responsible for its accuracy. An AI agent can assess not just the content, but who stands behind it.
Agents can cite and attribute this content. They can reference it as a source, link to the named maintainer, and assess when it was last updated. Citation-stage agent tasks succeed.
Public-facing documentation, product pages, API references. Content that external parties will read and rely upon, but where cryptographic attestation is not yet required.
```yaml
---
title: "Product Specification"
description: "Rancilio Silvia Pro X technical spec"
author: "Rancilio Group"
created: 2026-01-15
modified: 2026-03-01
version: "2.1"
mx:
  status: active
  contentType: product
  tags: [espresso, dual-boiler, pid]
  maintainer: "product-team@rancilio.com"
  audience: [humans, machines]
---
```
```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Rancilio Silvia Pro X",
  "description": "Dual-boiler espresso machine with PID",
  "brand": { "@type": "Brand", "name": "Rancilio" },
  "offers": {
    "@type": "Offer",
    "price": "1299.00",
    "priceCurrency": "GBP",
    "availability": "https://schema.org/InStock"
  }
}
</script>

<meta name="mx:status" content="active">
<meta name="mx:contentType" content="product">
<meta name="mx:tags" content="espresso, dual-boiler, pid">
<meta name="mx:audience" content="humans,machines">
<meta name="mx:content-policy" content="extract-with-attribution">
```
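The HTML carrier can be consumed with nothing more than a standard HTML parser. A sketch of the agent side using Python's stdlib (`MXMetaExtractor` is an illustrative name, not part of any MX tooling):

```python
from html.parser import HTMLParser

class MXMetaExtractor(HTMLParser):
    """Collect <meta name="mx:..."> governance tags from an HTML page."""
    def __init__(self):
        super().__init__()
        self.mx = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name", "")
        if name.startswith("mx:"):
            self.mx[name] = attrs.get("content", "")

page = """
<meta name="mx:status" content="active">
<meta name="mx:contentType" content="product">
<meta name="mx:tags" content="espresso, dual-boiler, pid">
"""
parser = MXMetaExtractor()
parser.feed(page)
print(parser.mx["mx:status"])  # active
```

Once the `mx:` fields and maintainer are machine-extractable like this, an agent can attach attribution (who maintains it, when it changed) to any fact it cites.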
Level 3: Commercial documentation
The threshold for REGINALD registration. The document is cryptographically attested by the publisher, proving authorship and integrity. It has a review cycle and update triggers — the Contract of Governance is now enforceable.
Requirements:

- `mx.reviewCycle` defined (e.g. quarterly, monthly)
- `mx.expires` date set — content has a defined shelf life

What it signals: two things simultaneously. The Certificate of Genuineness proves "I wrote this and it has not been tampered with." The Contract of Governance proves "I will keep this current, and here is the schedule."
Without cryptographic attestation, an AI agent cannot distinguish between genuine documentation and a modified copy. Without a review cycle, content decays silently. Level 3 is where computational trust begins — the point where an AI system can programmatically confirm that content is authentic and maintained.
Agents can compare and recommend this content against alternatives. Cryptographic attestation means agents can programmatically confirm authenticity before including it in comparisons. Search and compare agent tasks succeed.
Commercial product documentation, pricing data, technical specifications that customers and AI agents rely upon for purchasing decisions. The level where getting it wrong costs money.
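The attestation format itself is out of scope here, but the integrity half of the Certificate of Genuineness can be sketched with a content digest. In a real Level 3 document the publisher signs this digest with a private key; this illustration covers only the hash comparison an agent performs to detect a modified copy:

```python
import hashlib

def content_digest(body: str) -> str:
    """SHA-256 digest of the canonical document body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

# Digest published alongside the attestation at publish time:
published_body = "Rancilio Silvia Pro X: dual-boiler espresso machine with PID."
attested_digest = content_digest(published_body)

# An agent re-hashes the copy it actually fetched and compares:
assert content_digest(published_body) == attested_digest                 # genuine copy
assert content_digest(published_body + " tampered") != attested_digest   # modified copy
print("integrity check passed")
```

Any single-character change to the body produces a different digest, which is what lets an agent programmatically exclude tampered copies before making comparisons or recommendations.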
Level 4: Enterprise documentation
The full contract. Attested content is registered with REGINALD and carries a service-level agreement. The publisher commits to response times, update frequencies, and availability guarantees.
This publisher is committed to accuracy as a service. Not just "I wrote this" or "I will review this" — but "I guarantee this will be correct and available within defined parameters."
Agents can transact with confidence using this content. SLA guarantees, aliveness checks, and registry presence give agents the assurance needed for procurement and commerce workflows. Transaction-stage agent tasks succeed.
Enterprise documentation, regulated industry content, partner integrations. Organisations where downstream systems depend on the accuracy of this data and where SLA breaches have contractual consequences.
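As an illustration of an agent-side staleness check against an SLA (the record fields here are hypothetical, not the REGINALD schema):

```python
from datetime import date

# Hypothetical registry record; field names are illustrative only.
record = {
    "modified": date(2026, 3, 1),          # last review date
    "reviewCycle_days": 90,                # quarterly review commitment
    "sla_max_staleness_days": 120,         # SLA: content never older than this
}

def within_sla(record: dict, today: date) -> bool:
    """True if the content's last review falls inside the SLA staleness window."""
    age_days = (today - record["modified"]).days
    return age_days <= record["sla_max_staleness_days"]

print(within_sla(record, date(2026, 5, 1)))   # inside the window
print(within_sla(record, date(2026, 9, 1)))   # SLA breached
```

A transaction-stage agent would run a check like this (plus an aliveness probe) before relying on the data, and fall back or flag the publisher when the SLA window has lapsed.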
Level 5: Critical documentation
The highest level of trust. Everything from Level 4, plus independent third-party verification. An external auditor has confirmed that the content is accurate, the governance processes are followed, and the publisher's claims are substantiated.
Someone independent has checked this. Not just the publisher saying "trust me" — a third party confirming "we have verified this."
Level 5 exists for industries where a wrong answer does not just waste tokens — it harms people. A misquoted drug dosage. A wrong financial disclosure. An incorrect legal precedent. These are domains where computational trust must be backed by human accountability.
Agents can guarantee accuracy of this content. Independent third-party verification means agents can make safety-critical recommendations backed by human accountability. Guarantee-stage agent tasks succeed.
Healthcare documentation, pharmaceutical data, financial disclosures, legal references. Content where regulatory compliance demands independent verification.
Most sites start at Level 0. Each level is cumulative. You cannot skip levels — a document must satisfy all requirements from the levels below before qualifying for the next. This ensures that higher-trust documents always carry the full chain of provenance.
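The cumulative rule can be expressed directly: a document's level is the highest level whose requirements, together with all requirements below it, are satisfied. A sketch with illustrative requirement names (not the formal spec):

```python
# Illustrative only: requirement names follow the tables above, not a formal schema.
LEVEL_REQUIREMENTS = [
    set(),                                           # Level 0: nothing required
    {"title", "description", "author"},              # Level 1: basic identity
    {"mx:status", "mx:contentType", "maintainer"},   # Level 2: governance fields
    {"attestation", "reviewCycle"},                  # Level 3: attested + review cycle
    {"registered", "sla"},                           # Level 4: registry + SLA
    {"third_party_audit"},                           # Level 5: independent verification
]

def compliance_level(present: set) -> int:
    """Highest level whose cumulative requirements are all satisfied."""
    satisfied = set()
    level = 0
    for n, reqs in enumerate(LEVEL_REQUIREMENTS):
        satisfied |= reqs                  # requirements accumulate level by level
        if satisfied <= present:
            level = n
        else:
            break                          # a gap here blocks all higher levels
    return level

doc_fields = {"title", "description", "author",
              "mx:status", "mx:contentType", "maintainer"}
print(compliance_level(doc_fields))  # 2 — Level 3 needs attestation; it cannot be skipped
```

Because `satisfied` accumulates, a document with an SLA but no attestation still scores below Level 3, which is exactly the no-skipping property the progression requires.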
| If Your Content Is | Start At | Why |
|---|---|---|
| No deliberate MX work done | Level 0 | This is where most sites start — the audit shows what to do first |
| Internal team documentation | Level 1 | Machine-parseable is enough — trust is implicit within the organisation |
| Public product pages or API docs | Level 2 | External readers need to know who maintains this and when it was last updated |
| Commercial product data, pricing | Level 3 | AI agents making purchasing recommendations need cryptographic proof of authenticity |
| Enterprise integrations, partner APIs | Level 4 | Downstream systems depend on this data — SLA guarantees reduce integration risk |
| Healthcare, finance, legal content | Level 5 | Regulatory compliance demands independent verification — self-attestation is insufficient |
Compliance levels describe document quality. Pricing tiers describe registry access. They are related but distinct.
| Pricing Tier | Minimum Level | Notes |
|---|---|---|
| Open (Free) | Level 3 | All REGINALD-registered COGs must be attested |
| Professional (£149/yr) | Level 3 | Same minimum, plus analytics and priority refresh |
| Business (£499/yr) | Level 4 | SLA requires full registered status |
| Enterprise (Custom) | Level 4+ | Level 5 available for regulated content |
See pricing for full tier details.