Threat Model: Layer Survival, Collusion, Beacons, and Timestamp Trust
Honest companion to the Oversight protocol specification
Zion Boggan · April 2026 · Oversight Protocol v0.4.7
Intent of this document
This page is the threat-model companion to the Oversight specification and mirrors the
language in docs/security.md. The point is to state, in public, what Oversight
does and does not defend against, so that issuers, recipients, and reviewers can reason
about the protocol honestly. The keywords "MUST", "SHOULD", and "MAY" carry their BCP 14
(RFC 2119, RFC 8174) meanings only when written in all capitals.
1. Watermark layer survival matrix
Oversight's three watermark layers target different attack classes. Each survives different
transformations, and each degrades when pushed past its intended envelope. The matrix below
is the condensed version of the table in docs/security.md.
| Layer | Screenshot | Reformat | Manual retype | Motivated adversary with vocabulary |
|---|---|---|---|---|
| L1 zero-width | No | Often no | No | No |
| L2 whitespace | No | No | No | No |
| L3 semantic | Yes | Yes | Often yes | No; canonicalization can defeat it |
L1 and L2 are steganographic convenience layers. They are useful forensic signals but fragile under normalization: a copy-paste through an editor that strips non-printable Unicode will take L1 out, and any save-on-normalize tool will take L2 out. L3 is stronger because it encodes recipient identity in visible prose choices, but that strength is bought at a cost: the recipient copy is textually non-identical to the canonical source. v0.4.5's L3 policy forces that cost to be an explicit decision at seal time, not a silent default.
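The fragility of L1 and L2 under normalization can be illustrated with a minimal sketch. The encoding here is hypothetical (one zero-width character per bit, and a spacing payload that is simply collapsed); it is not Oversight's actual L1 or L2 scheme, and the function names are illustrative only.

```python
import unicodedata

# Hypothetical L1-style encoding, for illustration only: one zero-width
# character per payload bit, placed after successive visible characters.
ZW_ZERO = "\u200b"  # ZERO WIDTH SPACE encodes bit 0
ZW_ONE = "\u200c"   # ZERO WIDTH NON-JOINER encodes bit 1


def embed_l1(text: str, bits: str) -> str:
    """Interleave one zero-width character per bit into the visible text."""
    if len(bits) > len(text):
        raise ValueError("sketch needs at least one visible character per bit")
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        if i < len(bits):
            out.append(ZW_ONE if bits[i] == "1" else ZW_ZERO)
    return "".join(out)


def extract_l1(text: str) -> str:
    """Read back whatever zero-width payload is still present."""
    return "".join(
        "1" if ch == ZW_ONE else "0" for ch in text if ch in (ZW_ZERO, ZW_ONE)
    )


def strip_format_chars(text: str) -> str:
    """Drop Unicode format characters (category Cf), as many sanitizers do."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


def normalize_whitespace(text: str) -> str:
    """Collapse whitespace runs and trim, as a save-on-normalize tool might."""
    return " ".join(text.split())


marked = embed_l1("quarterly report", "1011")
print(extract_l1(marked))                            # payload readable before stripping
print(extract_l1(strip_format_chars(marked)))        # empty: the L1-style payload is gone
print(normalize_whitespace("quarterly  report \n"))  # any spacing payload is gone too
```

Neither stripping step needs any knowledge of the mark, which is why the matrix treats both layers as convenience signals.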
2. The L3 collusion and canonicalization attack
L3's synonym choices are deterministic per mark ID. If multiple recipients collude and compare their copies, they can identify the positions where their texts diverge and reconstruct the controlled vocabulary at those positions. Once the vocabulary is known, they can canonicalize it: replace every variant with a single canonical choice before leaking. The leaked text still carries the original semantic content, but the L3 attribution signal is destroyed.
This is not a bug. It is a property of deterministic semantic watermarking in the presence of an N-party collusion oracle, and no protocol-layer change to L3 alone can escape it. Mitigations under evaluation include per-recipient vocabulary randomization, stronger candidate scoring that models collusion edits, and warnings or thresholds for large recipient sets before L3 is enabled. Until those land, issuers should treat L3 as attribution evidence against ordinary leaks and low-to-medium-effort stripping, not as a perfect collusion-resistant watermark.
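The collusion and canonicalization steps described above can be sketched in a few lines. This is a toy model under stated assumptions: copies are already aligned token-for-token, variants are single words, and the canonical choice is simply the lexicographically smallest variant. Oversight's real vocabulary and scoring are not represented here.

```python
def divergent_positions(copies: list[list[str]]) -> dict[int, set[str]]:
    """Return the token positions where the colluders' copies disagree."""
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("sketch assumes token-aligned copies")
    vocab = {}
    for i in range(length):
        variants = {c[i] for c in copies}
        if len(variants) > 1:
            vocab[i] = variants
    return vocab


def canonicalize(tokens: list[str], vocab: dict[int, set[str]]) -> list[str]:
    """Replace each recovered variant with one fixed choice before leaking."""
    return [min(vocab[i]) if i in vocab else tok for i, tok in enumerate(tokens)]


copy_a = "the team will begin the review shortly".split()
copy_b = "the team will start the review soon".split()

vocab = divergent_positions([copy_a, copy_b])
leak_a = canonicalize(copy_a, vocab)
leak_b = canonicalize(copy_b, vocab)
print(leak_a == leak_b)  # both colluders now emit the same text
```

Two colluders only expose the positions where their own copies differ; positions where they happen to share a variant stay hidden, which is why larger recipient sets make the attack more complete.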
The manifest's canonical_content_hash and l3_policy fields,
added in v0.4.5, do not defeat collusion. They preserve the evidentiary trail: a later
investigation can verify which L3 mode was in force at seal time, whether the recipient
acknowledged the non-identity disclosure, and what canonical source bytes the sealed
copy derived from.
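A minimal sketch of how an investigator might use these two fields follows. Only the field names canonical_content_hash and l3_policy come from the text above; the hex SHA-256 encoding, the shape of the l3_policy value, and the function name check_seal_evidence are assumptions made for illustration.

```python
import hashlib
import hmac


def check_seal_evidence(manifest: dict, canonical_source: bytes) -> dict:
    """Compare the manifest's recorded hash with the canonical source bytes."""
    recorded = manifest["canonical_content_hash"]
    computed = hashlib.sha256(canonical_source).hexdigest()  # assumed: hex SHA-256
    return {
        "hash_matches": hmac.compare_digest(recorded, computed),
        "l3_policy": manifest["l3_policy"],  # surfaced as-is for the investigator
    }


source = b"Canonical source text.\n"
manifest = {
    "canonical_content_hash": hashlib.sha256(source).hexdigest(),
    # Hypothetical shape; the real l3_policy structure is defined by the spec.
    "l3_policy": {"mode": "enabled", "recipient_acknowledged": True},
}

print(check_seal_evidence(manifest, source)["hash_matches"])
print(check_seal_evidence(manifest, b"tampered")["hash_matches"])
```

The check says nothing about who leaked; it only ties the sealed copy back to a specific canonical source and policy record.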
3. Passive beacons are telemetry, not guarantees
Oversight supports passive beacons that fire callbacks when a sealed document is opened in an environment with network egress. A beacon is a forensic signal, not a detection guarantee. Absence of a beacon does not prove absence of a leak. Corporate egress filtering, air-gapped readers, privacy tools, sandboxed previews, and offline workflows can all suppress callbacks. Any deployment that treats beacon silence as "the document was not opened" is building a policy on an assumption the protocol does not support.
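One way to keep that distinction visible in deployment code is to make "not opened" unrepresentable. The sketch below is illustrative only; OpenEvidence and interpret_beacons are hypothetical names, not part of Oversight.

```python
from enum import Enum


class OpenEvidence(Enum):
    OBSERVED = "observed"  # at least one callback was received
    UNKNOWN = "unknown"    # no callback; says nothing about whether it was opened
    # Deliberately no NOT_OPENED member: beacon silence cannot establish it.


def interpret_beacons(callback_count: int) -> OpenEvidence:
    """Map a callback count to the strongest claim the evidence supports."""
    if callback_count < 0:
        raise ValueError("callback_count must be non-negative")
    return OpenEvidence.OBSERVED if callback_count > 0 else OpenEvidence.UNKNOWN


print(interpret_beacons(0))
print(interpret_beacons(3))
```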
4. Jurisdiction-by-IP is a soft policy control
Jurisdiction-by-IP is a useful operational signal for audit trails, routing decisions, and honest-client policy enforcement. It is not a cryptographic security boundary. VPNs, corporate NATs, residential proxies, and IP geolocation errors can all defeat or blur the control. An issuer who wants to prevent opens from a particular jurisdiction with cryptographic certainty has to enforce it through key custody and revocation, not through IP-based policy.
5. What RFC 3161 timestamps actually prove
An RFC 3161 qualified timestamp proves that a particular datum existed at or before the TSA's signing time. It does not prove authorship, ownership, or the content of the underlying document beyond the hash that was submitted. The Time Stamp Authority remains a trust anchor: its clock, its private key, and its operational integrity determine the strength of the attestation. Oversight's Sigstore Rekor v2 integration reduces reliance on any single TSA by anchoring evidence in a public append-only log, but transparency logging does not eliminate timestamp trust; it distributes it across a wider set of auditors.
For evidentiary purposes, the combination is stronger than either control alone: the RFC 3161 chain documents an "exists-before" bound, the Rekor entry documents "appended to a public log at checkpoint X," and the signed manifest links both to the specific recipient and content hash. An auditor who trusts any one of these anchors plus Oversight's cryptographic construction can verify the claim without having to trust Oversight as an operator.
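The limit on what a timestamp proves follows from what the TSA actually receives. An RFC 3161 request carries a message imprint (a hash algorithm identifier and the hashed message), not the document. The sketch below assumes SHA-256 and uses hypothetical variable names; it does not build or parse a real timestamp request.

```python
import hashlib


def message_imprint(datum: bytes) -> bytes:
    """Digest of the datum; the TSA is sent this digest, not the datum itself."""
    return hashlib.sha256(datum).digest()  # assumed hash algorithm


manifest_bytes = b"example signed manifest bytes"

# Two different submitters holding the same bytes produce the same imprint,
# so a token over it cannot say who authored or owns the content.
imprint_from_issuer = message_imprint(manifest_bytes)
imprint_from_third_party = message_imprint(manifest_bytes)
print(imprint_from_issuer == imprint_from_third_party)

# Any change to the bytes yields a different imprint, so the token binds
# only to the exact datum that was hashed.
print(message_imprint(manifest_bytes + b"!") == imprint_from_issuer)
```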
6. What this threat model does not claim
Oversight does not claim to defeat all adversaries. Specifically:
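- A motivated adversary with a large language model, enough time, and willingness to paraphrase sentence-by-sentence will destroy every watermark layer. In channel terms, once rewriting makes each embedded bit as likely to flip as to survive (a crossover probability of 0.5), the Shannon capacity of the watermark channel is zero and no payload can be recovered.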
- A recipient with root access on their own machine can always decrypt documents they legitimately received; protection beyond that requires hardware-backed key custody, which is a planned feature rather than a shipped one.
- Registry revocation depends on recipients' clients reaching the registry. REGISTRY mode fails closed on unreachable registries; HYBRID mode uses bounded local staleness as a cache. Neither mode can revoke a document from a client that refuses to check.
- Beacons, geolocation, and open-count limits are defense-in-depth controls, not root-of-trust controls. They raise the cost of misuse; they do not guarantee its absence.
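The channel-capacity remark in the first bullet above can be written out under a simplifying assumption: each embedded watermark bit is modeled as passing through a binary symmetric channel in which paraphrasing flips it independently with probability p. This is a modeling choice for illustration, not a claim about how Oversight's detector scores candidates.

```latex
\[
C(p) = 1 - H_2(p), \qquad
H_2(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)
\]
\[
H_2(0.5) = -\tfrac{1}{2}\log_2 \tfrac{1}{2} - \tfrac{1}{2}\log_2 \tfrac{1}{2} = 1
\quad\Longrightarrow\quad
C(0.5) = 0
\]
```

At p = 0.5 the received bits are statistically independent of the embedded ones, so no amount of redundancy in the mark recovers the recipient identity.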
7. References
- RFC 2119 / BCP 14 requirement keyword handling: datatracker.ietf.org/doc/html/rfc2119
- RFC 8174 clarification of RFC 2119 keyword scope.
- RFC 3161 Time-Stamp Protocol.
- ETSI EN 319 421 for qualified timestamping operational requirements.
- Sigstore Rekor v2 protocol reference.
- OpenAPI latest specification for registry interoperability: spec.openapis.org/oas/latest
- Companion document: docs/security.md in the Oversight repository.