Guide · March 7, 2026 · 12 min read

AI Image Provenance: The Complete Guide for Publishers

EU AI Act Article 50 takes effect August 2, 2026. This guide covers everything publishers need to know: what provenance actually means, why the most common marking techniques fail in the real world, and what a compliant workflow looks like. Each section links to a deeper technical article if you want to go further.

What is AI image provenance?

Provenance, in the traditional sense, is the documented history of an object — where it came from, who owned it, how it moved through the world. For AI images, it means something more specific: can you prove, at any point after publication, that this image was AI-generated, who published it, and when?

The reason this matters is the EU AI Act's Article 50, which requires publishers of AI-generated content to ensure their images are marked in a machine-readable, verifiable way. Not a text label. Not a social media caption. A technically verifiable proof that persists through the image's distribution lifecycle.

The challenge: images don't travel in controlled environments. By the time an AI-generated marketing image reaches a regulator who asks about it, it may have passed through a CMS, a CDN, Instagram, WhatsApp, and a screenshot. Each of those steps can — and usually does — destroy embedded metadata.

The legal obligation

Article 50(4) of the EU AI Act places the disclosure obligation on the deployer — the organization that publishes AI-generated content — not just on the AI tool provider. This is the distinction that most publishers miss.

OpenAI, Google, and Adobe embed C2PA metadata in images their tools generate. That fulfills their obligation as system providers. But that metadata identifies the AI tool, not the publisher. Under Article 50(4), you must ensure the disclosure is present, verifiable, and survives your publishing workflow.

The compliance gap

AI tools sign the creation of an image. That C2PA signature doesn't survive your publishing pipeline. By the time the image reaches your audience — through a CMS thumbnail, a CDN re-encode, a social media upload — the metadata is gone. You're left responsible for a proof that was destroyed before it could be used.

Non-compliance carries fines of up to €15 million or 3% of global annual turnover, whichever is higher, under Article 99(4). Enforcement begins August 2, 2026.

→ Deep dive: EU AI Act Article 50: What Every Publisher Needs to Know

Why C2PA alone isn't enough

C2PA is a real standard. It's supported by Adobe, Google, Microsoft, and the BBC. It embeds a cryptographically signed manifest into the image file, recording who published it, when, and how it was created.

The problem is infrastructure. Every major social platform re-encodes images on upload: Instagram, LinkedIn, WhatsApp, X. This re-encoding discards the JUMBF containers that hold C2PA manifests. Screenshots lose everything. CMS thumbnail generators strip non-pixel data. In our testing, C2PA survived zero of ten common transformations.
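To make the stripping problem concrete, here is a minimal Python sketch. It relies on one fact about the format: in JPEG files, JUMBF boxes (and therefore C2PA manifests) are carried in APP11 marker segments. Everything else is an illustrative assumption: the helper names are hypothetical, the sample bytes are a structural stand-in rather than a decodable image, and `drop_app_segments` only imitates the effect of a re-encode on metadata (a real re-encoder decodes the pixels and writes a fresh file).

```python
import struct

SOI = b"\xff\xd8"   # start of image
APP11 = 0xEB        # marker segment that carries JUMBF boxes in JPEG
SOS = 0xDA          # start of scan: compressed pixel data follows
EOI = 0xD9          # end of image


def iter_segments(jpeg: bytes):
    """Yield (marker, segment_bytes) for each header segment before the pixel data."""
    if jpeg[:2] != SOI:
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed segment header")
        marker = jpeg[pos + 1]
        if marker in (SOS, EOI):
            return
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        yield marker, jpeg[pos:pos + 2 + length]
        pos += 2 + length


def has_app11_segment(jpeg: bytes) -> bool:
    """True if the file still has a container that could hold a C2PA manifest."""
    return any(marker == APP11 for marker, _ in iter_segments(jpeg))


def drop_app_segments(jpeg: bytes) -> bytes:
    """Imitate a metadata-stripping re-encode: discard APP1..APP15, keep the rest."""
    out = bytearray(SOI)
    pos = 2
    for marker, seg in iter_segments(jpeg):
        if not (0xE1 <= marker <= 0xEF):
            out += seg
        pos += len(seg)
    return bytes(out) + jpeg[pos:]  # scan data and EOI are copied unchanged


def segment(marker: int, payload: bytes) -> bytes:
    """Build one marker segment (hypothetical helper for the sample below)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Structural stand-in for a signed JPEG: APP0 header, APP11 "manifest", scan data.
sample = (
    SOI
    + segment(0xE0, b"JFIF\x00")
    + segment(APP11, b"placeholder-manifest")
    + b"\xff\xda\x00\x02scan-data\xff\xd9"
)
stripped = drop_app_segments(sample)
```

Running `has_app11_segment` on `sample` and on `stripped` shows the difference: the pixel payload is byte-for-byte identical, but the container that held the manifest is gone.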

→ See the test results: We Sent an AI Image via WhatsApp. Here's What Happened.

There's a second issue: even an intact C2PA signature from DALL·E or Firefly identifies the AI tool as the signer, not your organization. Under Article 50(4), you need your own publisher signature — separate from the tool's C2PA — that identifies you as the deployer.

→ Why this matters: Why Your AI Image Is Not Compliant — Even with C2PA

The four-layer approach

The EU Draft Code of Practice (December 2025) is explicit that no single marking technique is sufficient. The reason is exactly the stripping problem: each technology has failure modes, and they don't all fail in the same scenarios.

A robust provenance strategy combines four independent layers:

1. C2PA Publisher Signature

Your cryptographic publisher identity embedded in the file. Machine-readable by any C2PA tool.

Survives: direct download, CMS, PDF embedding
Fails on: social media upload, re-encoding, screenshots

2. Invisible Watermark (TrustMark)

A neural-network payload encoded in the pixel data itself. Imperceptible to human eyes.

Survives: JPEG compression, resize, format conversion, social media
Fails on: heavy cropping (>30%), adversarial removal attacks

3. Perceptual Fingerprint

A visual hash of the image stored in an external database. Finds records by visual similarity.

Survives: most transformations that preserve visual content
Fails on: extreme modification that changes visual appearance

4. Blockchain Anchor (Polygon)

An immutable timestamp on a public blockchain. Proves a record existed at a specific time.

Survives: everything, even MarkMyAI shutting down
Fails on: nothing (but alone it can't recover an image record)
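The perceptual fingerprint layer is the least familiar of the four, so a small sketch may help. The guide does not say which hashing algorithm is used; the example below uses an average hash, one common technique, purely as an illustration. It works on a tiny grid of grayscale values, which assumes the image has already been decoded and downscaled, and the match threshold is an arbitrary assumption.

```python
def average_hash(gray):
    """Hash a 2D grid of 0-255 luminance values: one bit per pixel, set if above the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


MATCH_THRESHOLD = 2  # assumed tolerance for a 16-bit hash; real systems tune this


def is_match(a: int, b: int) -> bool:
    return hamming_distance(a, b) <= MATCH_THRESHOLD


# A 4x4 stand-in for a downscaled image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# Small per-pixel shifts, as lossy recompression might introduce.
recompressed = [
    [max(0, min(255, p + (8 if (r + c) % 2 else -5))) for c, p in enumerate(row)]
    for r, row in enumerate(original)
]

# A visually different image: the same grid mirrored left to right.
mirrored = [list(reversed(row)) for row in original]
```

Here `is_match(average_hash(original), average_hash(recompressed))` holds because the small shifts never move a pixel across the mean, while the mirrored grid flips every bit and is rejected. That is the property the layer relies on: stable under transformations that preserve appearance, different when the appearance changes.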

Together, the four layers create a cascading fallback system. If C2PA is stripped, the watermark recovers the record. If the watermark degrades, the fingerprint matches the record. If nothing else works, the blockchain anchor confirms the proof existed. No single failure mode takes down the whole system.
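The cascade can be sketched as an ordered list of lookups where the first hit wins. This is a sketch of the idea, not MarkMyAI's implementation: the lookup functions are hypothetical stubs, and the record ID is made up. The blockchain anchor is left out of the lookup chain because, as noted above, it cannot recover a record on its own; it would be checked against a record that one of the other layers has already found.

```python
from typing import Callable, Optional, Sequence, Tuple

# A lookup takes image bytes and returns a proof record ID, or None if the layer fails.
Lookup = Callable[[bytes], Optional[str]]


def find_record(
    image: bytes, layers: Sequence[Tuple[str, Lookup]]
) -> Optional[Tuple[str, str]]:
    """Try each layer in order; return (layer_name, record_id) for the first hit."""
    for name, lookup in layers:
        record_id = lookup(image)
        if record_id is not None:
            return name, record_id
    return None


# Hypothetical stubs simulating an image that went through a social platform:
# the C2PA manifest is gone, but the watermark payload still decodes.
def c2pa_lookup(image: bytes) -> Optional[str]:
    return None


def watermark_lookup(image: bytes) -> Optional[str]:
    return "rec-123"


def fingerprint_lookup(image: bytes) -> Optional[str]:
    return "rec-123"


LAYERS = [
    ("c2pa", c2pa_lookup),
    ("watermark", watermark_lookup),
    ("fingerprint", fingerprint_lookup),
]

result = find_record(b"image-bytes", LAYERS)
```

With these stubs, `result` names the watermark layer as the one that recovered the record, and the fingerprint lookup is never called.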

→ Full comparison: C2PA vs. Watermark vs. Blockchain: Which Layer Does What?

What a compliant workflow looks like

The practical workflow is simpler than the underlying technology:

1. Generate an AI image using any tool (DALL·E, Midjourney, Firefly, Stable Diffusion, etc.).

2. Before publishing, send it through POST /v1/mark (API) or upload it via the MarkMyAI dashboard. All four layers are applied automatically.

3. Publish the marked image to your website, CMS, or social channels as usual.

4. If provenance is ever questioned, anyone can verify via the public checker at markmyai.com/check. No account is required.

5. If needed for legal or compliance purposes, download the Proof PDF, a self-contained document that works even if MarkMyAI's servers are offline.

For WordPress sites, the plugin automates steps 2–3: every uploaded image gets marked before it's published. For high-volume pipelines, the REST API lets you integrate marking into existing content workflows without a UI.
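For pipeline integration, the marking step amounts to one HTTP call before publishing. Only the path POST /v1/mark comes from this guide. The host, the bearer-token header, and sending the raw image bytes as the request body are all assumptions made for illustration, so check the API documentation for the real request shape. The sketch builds the request without sending it.

```python
import urllib.request

# Placeholder host: the real API base URL is not given in this guide.
API_BASE = "https://api.example.com"


def build_mark_request(
    image_bytes: bytes, api_key: str, content_type: str = "image/jpeg"
) -> urllib.request.Request:
    """Build (but do not send) a request to the marking endpoint."""
    return urllib.request.Request(
        url=f"{API_BASE}/v1/mark",
        data=image_bytes,  # assumed: raw image bytes as the body
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
    )


request = build_mark_request(b"\xff\xd8placeholder", "test-key")
```

Once the real host and authentication details are filled in, the request could be sent with `urllib.request.urlopen(request)` and the marked image read from the response.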

What survives if MarkMyAI shuts down

A compliance tool you can't rely on long-term isn't much of a compliance tool. We designed the proof to outlast us:

Survives permanently

✅ Invisible watermark (open-source TrustMark decoder)

✅ C2PA signature (any C2PA-compatible tool)

✅ Blockchain anchor (Polygon is independent)

✅ Proof PDF (offline document with all data)

Server-dependent

❌ Audit trail database

❌ Fingerprint lookup

❌ Verify API

→ Proof PDF replaces these for long-term use

5 of 8 proof elements survive indefinitely without any server infrastructure. The Proof PDF is a complete offline backup that includes the blockchain transaction hash, image hashes, and step-by-step instructions for independent verification using only publicly available tools.
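One piece of independent verification needs nothing beyond a standard library: recomputing a file hash and comparing it with the value recorded in the Proof PDF. The guide does not name the hash algorithm, so SHA-256 is an assumption here, and the function names are hypothetical. An exact file hash also only matches the unmodified marked file, not a re-encoded copy.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(file_bytes: bytes, recorded_hex: str) -> bool:
    """Compare a file's digest with a recorded value, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_hex(file_bytes), recorded_hex.strip().lower())


# Stand-in for the marked image and the hash copied out of a proof document.
marked_file = b"marked image bytes"
recorded = sha256_hex(marked_file).upper()

unchanged_ok = matches_recorded_hash(marked_file, recorded)
tampered_ok = matches_recorded_hash(marked_file + b"\x00", recorded)
```

A single changed byte makes the comparison fail, which is why the hash check complements the watermark and fingerprint layers, which tolerate modification, rather than replacing them.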

The three proof states you'll see

When any image is submitted to our checker, it returns one of three states:

Verified Provenance

C2PA signature is present and cryptographically valid. Full publisher identity confirmed. This is the ideal state for images that haven't been through a social platform.

⚠️ Recovered Provenance

C2PA was stripped, but the watermark or fingerprint matched a proof record. This is normal after social media sharing — and it's exactly what the multi-layer system was designed to produce.

No Verifiable Provenance

No match found across any layer. Either the image was never marked, or it was modified beyond what the system can recover from. Unmarked AI images will always return this state.

"Recovered" is not a degraded result. It means the image traveled through a distribution pipeline that destroyed embedded metadata — which is the normal case for social media — and the system recovered the proof anyway. That's the whole point of having multiple layers.
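The mapping from layer results to the three states is simple enough to write down. This is a sketch of the logic as described above, with hypothetical names, not the checker's actual code.

```python
from enum import Enum


class ProofState(Enum):
    VERIFIED = "Verified Provenance"
    RECOVERED = "Recovered Provenance"
    NONE = "No Verifiable Provenance"


def classify(
    c2pa_valid: bool, watermark_match: bool, fingerprint_match: bool
) -> ProofState:
    """Map per-layer results to the state shown by the checker."""
    if c2pa_valid:
        # Signature present and cryptographically valid.
        return ProofState.VERIFIED
    if watermark_match or fingerprint_match:
        # Metadata was stripped, but another layer found the proof record.
        return ProofState.RECOVERED
    return ProofState.NONE


after_social_share = classify(
    c2pa_valid=False, watermark_match=True, fingerprint_match=True
)
```

An image that has been through a social platform typically lands in the second branch, which is the expected outcome rather than a failure.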

Go deeper: everything we've written on this

The Publisher's Guide to AI Image Provenance

A 15-page PDF covering everything in this guide in printable form — with real test data tables, the four-layer comparison, and a step-by-step independent verification guide. Designed for compliance officers and decision-makers.

Read the full guide →