AI Image Provenance: The Complete Guide for Publishers
EU AI Act Article 50 takes effect August 2, 2026. This guide covers everything publishers need to know: what provenance actually means, why the most common marking techniques fail in the real world, and what a compliant workflow looks like. Each section links to a deeper technical article if you want to go further.
What is AI image provenance?
Provenance, in the traditional sense, is the documented history of an object — where it came from, who owned it, how it moved through the world. For AI images, it means something more specific: can you prove, at any point after publication, that this image was AI-generated, who published it, and when?
The reason this matters is the EU AI Act's Article 50, which requires publishers of AI-generated content to ensure their images are marked in a machine-readable, verifiable way. Not a text label. Not a social media caption. A technically verifiable proof that persists through the image's distribution lifecycle.
The challenge: images don't travel in controlled environments. By the time an AI-generated marketing image reaches a regulator who asks about it, it may have passed through a CMS, a CDN, Instagram, WhatsApp, and a screenshot. Each of those steps can — and usually does — destroy embedded metadata.
The legal obligation
Article 50(4) of the EU AI Act places the disclosure obligation on the deployer — the organization that publishes AI-generated content — not just on the AI tool provider. This is the distinction that most publishers miss.
OpenAI, Google, and Adobe embed C2PA metadata in images their tools generate. That fulfills their obligation as system providers. But that metadata identifies the AI tool, not the publisher. Under Article 50(4), you must ensure the disclosure is present, verifiable, and survives your publishing workflow.
The compliance gap
AI tools sign the creation of an image. That C2PA signature doesn't survive your publishing pipeline. By the time the image reaches your audience — through a CMS thumbnail, a CDN re-encode, a social media upload — the metadata is gone. You're left responsible for a proof that was destroyed before it could be used.
Non-compliance carries fines of up to €15 million or 3% of global annual turnover, whichever is higher, under Article 99(4). Enforcement begins August 2, 2026.
→ Deep dive: EU AI Act Article 50: What Every Publisher Needs to Know
Why C2PA alone isn't enough
C2PA is a real standard. It's supported by Adobe, Google, Microsoft, and the BBC. It embeds a cryptographically signed manifest into the image file, recording who published it, when, and how it was created.
The problem is infrastructure. Every major social platform re-encodes images on upload: Instagram, LinkedIn, WhatsApp, X. This re-encoding discards the JUMBF containers that hold C2PA manifests. Screenshots lose everything. CMS thumbnail generators strip non-pixel data. In our testing, C2PA survived zero of ten common transformations.
→ See the test results: We Sent an AI Image via WhatsApp. Here's What Happened.
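To see the stripping for yourself, you can check whether a JPEG still carries the segments that hold a C2PA manifest. In JPEG files, the JUMBF boxes are stored in APP11 marker segments (0xFFEB). The sketch below is a minimal, standard-library-only scan of the JPEG header; the function name and the synthetic demo bytes are illustrative assumptions, and it only detects whether APP11 data is present. It does not validate a signature.

```python
import struct

def jpeg_app11_segments(data: bytes) -> list[bytes]:
    """Return the payloads of all APP11 (0xFFEB) segments found before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    payloads = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost marker sync; stop rather than guess
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            break  # malformed segment length
        if marker == 0xEB:
            # The length field counts its own two bytes, so skip them.
            payloads.append(data[pos + 4:pos + 2 + length])
        pos += 2 + length
    return payloads

def _segment(marker: int, payload: bytes) -> bytes:
    """Build one marker segment for the synthetic demo below."""
    return b"\xff" + bytes([marker]) + struct.pack(">H", len(payload) + 2) + payload

# Synthetic demo bytes (not real images): one header with APP11, one without.
marked = b"\xff\xd8" + _segment(0xE0, b"JFIF\x00") + _segment(0xEB, b"JP\x00\x01jumb-demo") + b"\xff\xda\x00\x02\xff\xd9"
stripped = b"\xff\xd8" + _segment(0xE0, b"JFIF\x00") + b"\xff\xda\x00\x02\xff\xd9"

print(len(jpeg_app11_segments(marked)), len(jpeg_app11_segments(stripped)))
```

Running the same function on a file before and after it passes through a platform shows whether the APP11 segments were dropped along the way.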
There's a second issue: even an intact C2PA signature from DALL·E or Firefly identifies the AI tool as the signer, not your organization. Under Article 50(4), you need your own publisher signature — separate from the tool's C2PA — that identifies you as the deployer.
→ Why this matters: Why Your AI Image Is Not Compliant — Even with C2PA
The four-layer approach
The EU Draft Code of Practice (December 2025) is explicit that no single marking technique is sufficient. The reason is exactly the stripping problem: each technology has failure modes, and they don't all fail in the same scenarios.
A robust provenance strategy combines four independent layers:
C2PA Publisher Signature
Your cryptographic publisher identity embedded in the file. Machine-readable by any C2PA tool.
Invisible Watermark (TrustMark)
A neural-network payload encoded in the pixel data itself. Imperceptible to human eyes.
Perceptual Fingerprint
A visual hash of the image stored in an external database. Finds records by visual similarity.
Blockchain Anchor (Polygon)
An immutable timestamp on a public blockchain. Proves a record existed at a specific time.
Together, the four layers create a cascading fallback system. If C2PA is stripped, the watermark recovers the record. If the watermark degrades, the fingerprint matches the record. If nothing else works, the blockchain anchor confirms the proof existed. No single failure mode takes down the whole system.
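Of the four layers, the perceptual fingerprint is probably the least familiar. The sketch below uses an average hash, a deliberately simple stand-in chosen for illustration; it is an assumption, not a description of the algorithm MarkMyAI actually uses. It works on a small grayscale grid, where a real pipeline would first downscale and desaturate the image with an imaging library.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Hash a small grayscale grid: one bit per pixel, set if the pixel is above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a small distance means visually similar."""
    return bin(a ^ b).count("1")

# Illustrative 4x4 grids: a gradient, a uniformly brightened copy, and an inverted copy.
original = [[row * 16 + col * 4 for col in range(4)] for row in range(4)]
brightened = [[value + 10 for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(brightened)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

The brightened copy hashes identically to the original even though every byte changed, which is the property that lets a fingerprint lookup find a record after re-encoding. The inverted copy differs in every bit.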
→ Full comparison: C2PA vs. Watermark vs. Blockchain: Which Layer Does What?
What a compliant workflow looks like
The practical workflow is simpler than the underlying technology:
1. Generate an AI image using any tool (DALL·E, Midjourney, Firefly, Stable Diffusion, etc.)
2. Before publishing, send it through POST /v1/mark (API) or upload via the MarkMyAI dashboard. All four layers are applied automatically.
3. Publish the marked image to your website, CMS, or social channels as usual
4. If provenance is ever questioned, anyone can verify via the public checker at markmyai.com/check (no account required)
5. If needed for legal or compliance purposes, download the Proof PDF, a self-contained document that works even if MarkMyAI's servers are offline
For WordPress sites, the plugin automates steps 2–3: every uploaded image gets marked before it's published. For high-volume pipelines, the REST API lets you integrate marking into existing content workflows without a UI.
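As a rough idea of what the API step might look like in a pipeline, here is a sketch that builds (but does not send) a POST /v1/mark request using only the Python standard library. Only the path /v1/mark comes from this guide. The host, the bearer-token authentication, and the raw-bytes request body are all assumptions made for illustration, so check the API documentation for the real request format.

```python
import urllib.request

# Hypothetical placeholders: neither value is taken from the real service.
API_BASE = "https://api.example.invalid"
API_KEY = "YOUR_API_KEY"

def build_mark_request(image_bytes: bytes) -> urllib.request.Request:
    """Build a POST /v1/mark request for one image without sending it."""
    return urllib.request.Request(
        url=f"{API_BASE}/v1/mark",
        data=image_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",          # assumed auth scheme
            "Content-Type": "application/octet-stream",    # assumed body format
        },
    )

request = build_mark_request(b"\xff\xd8...image bytes...")
print(request.get_method(), request.full_url)
```

In a real integration you would pass the request to urllib.request.urlopen and store whatever the service returns alongside the published asset.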
What survives if MarkMyAI shuts down
A compliance tool you can't rely on long-term isn't much of a compliance tool. We designed the proof to outlast us:
Survives permanently
✅ Invisible watermark (open-source TrustMark decoder)
✅ C2PA signature (any C2PA-compatible tool)
✅ Blockchain anchor (Polygon is independent)
✅ Proof PDF (offline document with all data)
Server-dependent
❌ Audit trail database
❌ Fingerprint lookup
❌ Verify API
→ Proof PDF replaces these for long-term use
The elements listed under "Survives permanently" need no server infrastructure at all, and the Proof PDF stands in for the server-dependent ones. It is a complete offline backup that includes the blockchain transaction hash, image hashes, and step-by-step instructions for independent verification using only publicly available tools.
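One of those independent checks can be as simple as recomputing a file hash and comparing it with the value printed in the Proof PDF. The sketch below assumes the recorded hash is a SHA-256 digest of the exact file bytes, written as hexadecimal. The guide only says "image hashes", so treat the algorithm and the function names as assumptions.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of the given bytes as lowercase hexadecimal."""
    return hashlib.sha256(data).hexdigest()

def matches_recorded_hash(data: bytes, recorded_hex: str) -> bool:
    """Compare the computed digest with a recorded one, ignoring case and stray whitespace."""
    return hmac.compare_digest(sha256_hex(data), recorded_hex.strip().lower())

# Illustrative bytes standing in for the marked image file.
file_bytes = b"example marked image bytes"
recorded = sha256_hex(file_bytes).upper()  # pretend this value was copied from the PDF

print(matches_recorded_hash(file_bytes, recorded))
print(matches_recorded_hash(file_bytes + b"\x00", recorded))
```

A byte-level hash like this only matches the exact file that was marked. A re-encoded copy will not match, which is why the watermark and fingerprint layers exist alongside it.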
The three proof states you'll see
When any image is submitted to our checker, it returns one of three states:
✅ Verified Provenance
C2PA signature is present and cryptographically valid. Full publisher identity confirmed. This is the ideal state for images that haven't been through a social platform.
⚠️ Recovered Provenance
C2PA was stripped, but the watermark or fingerprint matched a proof record. This is normal after social media sharing — and it's exactly what the multi-layer system was designed to produce.
❌ No Verifiable Provenance
No match found across any layer. Either the image was never marked, or it was modified beyond what the system can recover from. Unmarked AI images will always return this state.
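The three states above map naturally onto a cascade: try C2PA first, then fall back to the watermark and the fingerprint. The following is a minimal sketch of that decision logic under stated assumptions. The lookup functions are hypothetical stand-ins that return a proof-record ID or None, and none of the names reflect the checker's real implementation.

```python
from enum import Enum
from typing import Callable, Optional

class ProofState(Enum):
    VERIFIED = "Verified Provenance"
    RECOVERED = "Recovered Provenance"
    NONE = "No Verifiable Provenance"

# A lookup takes image bytes and returns a proof-record ID, or None if it finds nothing.
Lookup = Callable[[bytes], Optional[str]]

def check_image(
    image: bytes, c2pa: Lookup, watermark: Lookup, fingerprint: Lookup
) -> tuple[ProofState, Optional[str], Optional[str]]:
    """Return (state, layer that matched, record ID), trying layers in fallback order."""
    record = c2pa(image)
    if record is not None:
        return ProofState.VERIFIED, "c2pa", record
    for name, lookup in (("watermark", watermark), ("fingerprint", fingerprint)):
        record = lookup(image)
        if record is not None:
            return ProofState.RECOVERED, name, record
    return ProofState.NONE, None, None

def found_nothing(image: bytes) -> Optional[str]:
    """Stand-in for a layer that was stripped or had no match."""
    return None

# Simulated social-media copy: C2PA is gone, the watermark still decodes.
state, layer, record = check_image(
    b"image", found_nothing, lambda image: "rec-123", found_nothing
)
print(state.value, layer, record)
```

Returning the matching layer along with the state makes it clear which fallback did the work for a given copy of the image.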
"Recovered" is not a degraded result. It means the image traveled through a distribution pipeline that destroyed embedded metadata — which is the normal case for social media — and the system recovered the proof anyway. That's the whole point of having multiple layers.
Go deeper: everything we've written on this
C2PA vs. Watermark vs. Blockchain: Which Layer Does What?
A side-by-side breakdown of what each technology proves and where each one breaks.
Invisible Watermarks: How TrustMark Survives What C2PA Can't
How neural watermarking works, what gets encoded, and why 'recovered' is a success state.
We Sent an AI Image via WhatsApp. Here's What Happened.
Real experiment: full C2PA in, zero metadata out. Platform-by-platform results.
EU AI Act Article 50: What Every Publisher Needs to Know
What the law actually requires, who is responsible, and what 'machine-readable marking' means.
Why Your AI Image Is Not Compliant — Even with C2PA
The AI tool's C2PA signature is not your publisher signature. Here's why that matters.
EU Code of Practice: What's Actually in the Draft
What the December 2025 draft requires, what it leaves open, and what it means for publishers.
The Publisher's Guide to AI Image Provenance
A 15-page PDF covering everything in this guide in printable form — with real test data tables, the four-layer comparison, and a step-by-step independent verification guide. Designed for compliance officers and decision-makers.
Read the full guide →