Zion Boggan · April 2026 · Oversight Protocol v0.4.7

Intent of this document

This page is the threat-model companion to the Oversight specification and mirrors the language in docs/security.md. The point is to state, in public, what Oversight does and does not defend against, so that issuers, recipients, and reviewers can reason about the protocol honestly. The keywords "MUST", "SHOULD", and "MAY" carry their RFC 2119 / BCP 14 meaning only when written in all capitals; in lowercase they are ordinary English.

1. Watermark layer survival matrix

Oversight's three watermark layers target different attack classes. Each survives different transformations, and each degrades when pushed past its intended envelope. The matrix below is the condensed version of the table in docs/security.md.

Layer         | Screenshot | Reformat | Manual retype | Motivated adversary with vocabulary
L1 zero-width | No         | Often no | No            | No
L2 whitespace | No         | No       | No            | No
L3 semantic   | Yes        | Yes      | Often yes     | No; canonicalization can defeat it

L1 and L2 are steganographic convenience layers. They are useful forensic signals but fragile under normalization: a copy-paste through an editor that strips non-printable Unicode will take L1 out, and any save-on-normalize tool will take L2 out. L3 is stronger because it encodes recipient identity in visible prose choices, but that strength is bought at a cost: the recipient copy is textually non-identical to the canonical source. v0.4.5's L3 policy forces that cost to be an explicit decision at seal time, not a silent default.
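
The fragility of L1 and L2 can be illustrated with a minimal sketch. It assumes a small set of zero-width code points and a simple whitespace normalizer; the set `ZERO_WIDTH` and both function names are hypothetical and are not taken from the Oversight implementation, whose actual L1 alphabet and L2 encoding are not described here.

```python
import re

# Assumed set of zero-width code points; the real L1 alphabet may differ.
ZERO_WIDTH = "\u200b\u200c\u200d\u2060\ufeff"

def strip_zero_width(text: str) -> str:
    """Remove zero-width characters, as many editors and sanitizers do."""
    return text.translate({ord(ch): None for ch in ZERO_WIDTH})

def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces and tabs, and drop trailing whitespace per line."""
    lines = [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.splitlines()]
    return "\n".join(lines)

# A toy "marked" string carrying invisible characters and whitespace variation.
marked = "Quarterly\u200b results\u200c  are  final. \nDo not\tforward.\u200d"
cleaned = normalize_whitespace(strip_zero_width(marked))
print(repr(cleaned))
```

Neither step requires any knowledge of the watermark: ordinary normalization is enough, which is why both layers are best read as convenience signals.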

2. The L3 collusion and canonicalization attack

L3's synonym choices are deterministic per mark ID. If multiple recipients collude and compare their copies, they can identify the positions where their texts diverge and reconstruct the controlled vocabulary at those positions. Once the vocabulary is known, they can canonicalize it: replace every variant with a single canonical choice before leaking. The leaked text still carries the original semantic content, but the L3 attribution signal is destroyed.
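
The attack can be sketched on toy data. The example below assumes the colluders' copies are token-aligned and that every variant is a single token; real L3 variants may span phrases and would need an alignment step first. The function names and the sample sentences are illustrative only.

```python
def divergent_positions(copies: list[list[str]]) -> dict[int, set[str]]:
    """Map each token index to the variants seen there, where copies disagree."""
    variants: dict[int, set[str]] = {}
    for i, column in enumerate(zip(*copies)):
        seen = set(column)
        if len(seen) > 1:
            variants[i] = seen
    return variants

def canonicalize(tokens: list[str], variants: dict[int, set[str]]) -> list[str]:
    """Replace every exposed variant with one fixed choice (alphabetically first)."""
    return [min(variants[i]) if i in variants else tok
            for i, tok in enumerate(tokens)]

copy_a = "the vendor will begin the rollout soon".split()
copy_b = "the vendor will start the rollout shortly".split()
copy_c = "the vendor will begin the rollout shortly".split()

variants = divergent_positions([copy_a, copy_b, copy_c])
leaks = [" ".join(canonicalize(c, variants)) for c in (copy_a, copy_b, copy_c)]
print(variants)
print(leaks[0])
```

After canonicalization all three copies are identical, so the exposed positions no longer distinguish the colluders. Positions where every colluder happened to receive the same variant stay hidden from this comparison, so a small group may expose only part of the vocabulary.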

This is not a bug. It is a property of deterministic semantic watermarking in the presence of an N-party collusion oracle, and no protocol-layer change to L3 alone can escape it. Mitigations under evaluation include per-recipient vocabulary randomization, stronger candidate scoring that models collusion edits, and warnings or thresholds for large recipient sets before L3 is enabled. Until those land, issuers should treat L3 as attribution evidence against ordinary leaks and low-to-medium-effort stripping, not as a perfect collusion-resistant watermark.

The manifest's canonical_content_hash and l3_policy fields, added in v0.4.5, do not defeat collusion. They preserve the evidentiary trail: a later investigation can verify which L3 mode was in force at seal time, whether the recipient acknowledged the non-identity disclosure, and what canonical source bytes the sealed copy derived from.
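
As a rough illustration of the evidentiary check, the sketch below compares canonical source bytes against the recorded hash and reads back the L3 policy. Only the field names canonical_content_hash and l3_policy come from the manifest; the digest algorithm (SHA-256, hex-encoded) and the inner keys `mode` and `recipient_acknowledged` are assumptions made for the example, and signature verification of the manifest itself is out of scope.

```python
import hashlib

def canonical_hash_matches(manifest: dict, canonical_source: bytes) -> bool:
    """True if the canonical source bytes hash to the value recorded at seal time."""
    # Assumption: the field holds a hex-encoded SHA-256 digest.
    recorded = manifest["canonical_content_hash"].lower()
    return hashlib.sha256(canonical_source).hexdigest() == recorded

def l3_summary(manifest: dict) -> tuple[str, bool]:
    """Return (L3 mode, acknowledged?) from a hypothetical l3_policy layout."""
    policy = manifest["l3_policy"]
    return policy["mode"], bool(policy["recipient_acknowledged"])

# Toy manifest built for the example; a real one would be parsed and
# signature-checked before any field is trusted.
source = b"Canonical text of the sealed document.\n"
manifest = {
    "canonical_content_hash": hashlib.sha256(source).hexdigest(),
    "l3_policy": {"mode": "enabled", "recipient_acknowledged": True},
}

print(canonical_hash_matches(manifest, source))
print(l3_summary(manifest))
```

A match shows only that the sealed copy derives from those canonical bytes under the recorded policy; it says nothing about who leaked a canonicalized copy.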

3. Passive beacons are telemetry, not guarantees

Oversight supports passive beacons that fire callbacks when a sealed document is opened in an environment with network egress. A beacon is a forensic signal, not a detection guarantee. Absence of a beacon does not prove absence of a leak. Corporate egress filtering, air-gapped readers, privacy tools, sandboxed previews, and offline workflows can all suppress callbacks. Any deployment that treats beacon silence as "the document was not opened" is building a policy on an assumption the protocol does not support.
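
One way to keep that assumption out of a deployment's policy code is to make "not opened" unrepresentable. The sketch below is a hypothetical consumer-side helper, not part of Oversight: the enum, its member names, and the shape of `beacon_events` are all assumptions.

```python
from enum import Enum

class OpenStatus(Enum):
    OPEN_OBSERVED = "open_observed"  # at least one callback was received
    UNKNOWN = "unknown"              # no callback; proves nothing either way
    # Deliberately no NOT_OPENED member: beacon silence cannot establish it.

def classify(beacon_events: list[dict]) -> OpenStatus:
    """Classify a sealed document from its (possibly empty) list of callbacks."""
    return OpenStatus.OPEN_OBSERVED if beacon_events else OpenStatus.UNKNOWN

print(classify([{"received_at": "2026-04-01T12:00:00Z"}]))
print(classify([]))
```

Downstream rules can then branch on `UNKNOWN` explicitly, for example by falling back to other evidence, instead of silently reading silence as safety.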

4. Jurisdiction-by-IP is a soft policy control

Jurisdiction-by-IP is a useful operational signal for audit trails, routing decisions, and honest-client policy enforcement. It is not a cryptographic security boundary. VPNs, corporate NATs, residential proxies, and IP geolocation errors can all defeat or blur the control. An issuer who wants to prevent opens from a particular jurisdiction with cryptographic certainty has to enforce it through key custody and revocation, not through IP-based policy.

5. What RFC 3161 timestamps actually prove

An RFC 3161 qualified timestamp proves that a particular datum existed at or before the signing time asserted by the Time Stamping Authority (TSA). It does not prove authorship, ownership, or the content of the underlying document beyond the hash that was submitted. The TSA remains a trust anchor: its clock, its private key, and its operational integrity determine the strength of the attestation. Oversight's Sigstore Rekor v2 integration reduces reliance on any single TSA by anchoring evidence in a public append-only log, but transparency logging does not eliminate timestamp trust; it distributes it across a wider set of auditors.

For evidentiary purposes, the combination is stronger than either control alone: the RFC 3161 chain documents an "exists-before" bound, the Rekor entry documents "appended to a public log at checkpoint X," and the signed manifest links both to the specific recipient and content hash. An auditor who trusts any one of these anchors plus Oversight's cryptographic construction can verify the claim without having to trust Oversight as an operator.

6. What this threat model does not claim

Oversight does not claim to defeat all adversaries. Specifically:

7. References