Interactive Watermark Demo
One document, two recipients, three watermark layers. Explore how Oversight embeds invisible marks, how stripping attacks destroy some layers but not others, and how L3 semantic watermarks survive even a complete formatting wipe.
1. The Original Document
Below is a confidential memo before any watermarking. This is the clean source text that an issuer would seal and distribute. No hidden marks, no invisible characters, no synonym rotations. Plain, unmodified text.
2. Two Watermarked Copies (All 3 Layers)
The same document is distributed to two recipients: Alice and Bob. Each copy contains all three watermark layers: L1 zero-width characters embedded in the text, L2 trailing whitespace patterns at line endings, and L3 synonym rotations that change specific words to encode the recipient's identity. L1 and L2 are invisible, and the L3 substitutions read as ordinary wording, so neither copy looks marked to a human reader, yet every copy carries a unique forensic fingerprint.
3. The Stripping Attack
An adversary who suspects watermarking will attempt to strip invisible characters before sharing the document. Below, you can simulate three levels of attack against Alice's watermarked copy. Watch how each attack destroys certain layers while L3 remains untouched, because its signal lives in the word choices, not in formatting.
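The effect of a stripping pass can be sketched in a few lines of Python. The function names and the sample string are assumptions made for illustration, not the demo's actual attack code. The sketch removes the zero-width code points U+200B, U+200C, and U+200D plus any spaces or tabs at line ends, and leaves the word choices alone.

```python
import re

# Zero-width code points used by an L1-style layer: ZWSP, ZWNJ, ZWJ.
ZERO_WIDTH = "\u200b\u200c\u200d"


def strip_zero_width(text: str) -> str:
    """Delete every zero-width character (defeats an L1-style layer)."""
    return text.translate({ord(ch): None for ch in ZERO_WIDTH})


def strip_trailing_whitespace(text: str) -> str:
    """Delete spaces and tabs at the end of each line (defeats an L2-style layer)."""
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


def strip_all_invisible(text: str) -> str:
    """Apply both stripping steps in sequence."""
    return strip_trailing_whitespace(strip_zero_width(text))


# Hypothetical marked sample: zero-width characters after "will",
# a trailing space on line 1, and a trailing tab on line 2.
marked = "We will\u200b\u200c commence the audit \nnext week.\t\n"
stripped = strip_all_invisible(marked)
print(repr(stripped))
```

The word "commence" is still present after stripping, which is the property an L3-style layer relies on.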
4. Attribution After Stripping
This is the core demonstration. Even after every invisible character and whitespace pattern has been destroyed, the attribution pipeline can still identify the source recipient using L3 semantic verification and content fingerprinting. Select a stripped copy and run the full 5-phase pipeline.
5. How L3 Survives (Visual Diff)
L3 encodes the recipient's identity by rotating words through synonym classes. Each recipient gets a different set of synonyms, drawn from 151 word classes (verbs, adjectives, adverbs, nouns, connectors), 25 British/American spelling variants, 30 contraction choices, and number formatting rules. The table below shows every word that differs between the original and each recipient's copy. Because these changes exist in the content itself, no amount of invisible-character stripping, format conversion, or even screenshot-to-OCR can erase them.
| Word # | Original | Alice's Version | Bob's Version |
|---|---|---|---|
15 word substitutions per copy shown above. In production, the rotation pool is larger; this demo uses a representative subset to illustrate the mechanism.
6. The Three Watermark Layers
Oversight Protocol applies watermarks at three independent layers, each targeting a different threat model. If one layer is stripped, the remaining layers can still provide attribution.
L1: Zero-Width Characters
Unicode code points U+200B (ZWSP), U+200C (ZWNJ), and U+200D (ZWJ) are inserted into the plaintext at deterministic positions. ZWSP and ZWNJ encode binary 0 and 1 respectively; ZWJ pairs serve as frame delimiters. This layer survives copy-paste and most text reformatting.
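A minimal sketch of this encoding, assuming a single frame inserted at one caller-chosen position. The function names, the bit string, and the insertion position are hypothetical; the source does not say how the deterministic positions are chosen.

```python
import re

ZWSP, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"
FRAME = ZWJ + ZWJ  # a ZWJ pair acts as the frame delimiter

# A frame is: delimiter, one or more bit characters, delimiter.
FRAME_RE = re.compile(FRAME + "([" + ZWSP + ZWNJ + "]+)" + FRAME)


def embed_l1(text: str, bits: str, position: int) -> str:
    """Insert one framed bit string at `position` (ZWSP = 0, ZWNJ = 1)."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    payload = "".join(ZWSP if b == "0" else ZWNJ for b in bits)
    return text[:position] + FRAME + payload + FRAME + text[position:]


def extract_l1(text: str) -> list[str]:
    """Return the bit string of every complete frame found in `text`."""
    return [
        "".join("0" if ch == ZWSP else "1" for ch in payload)
        for payload in FRAME_RE.findall(text)
    ]


original = "Quarterly results are confidential."
marked = embed_l1(original, "1011", position=9)
print(extract_l1(marked))
```

The marked string renders the same as the original in most viewers, but its length grows by the frame and payload characters.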
L2: Trailing Whitespace Patterns
Each line ending encodes one bit via trailing space (bit 0) or trailing tab (bit 1). The mark_id is distributed across the first N lines of the document. This layer survives most editors and format conversions that preserve line structure, but is defeated by trailing-whitespace normalization.
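The line-ending scheme can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, existing trailing whitespace is cleared before marking, and a line with no trailing mark is reported as "?" so that partial recovery stays visible.

```python
def embed_l2(text: str, bits: str) -> str:
    """Append a trailing space (bit 0) or tab (bit 1) to the first len(bits) lines."""
    lines = text.split("\n")
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry all bits")
    marked = []
    for index, line in enumerate(lines):
        line = line.rstrip(" \t")  # clear any pre-existing trailing whitespace
        if index < len(bits):
            line += " " if bits[index] == "0" else "\t"
        marked.append(line)
    return "\n".join(marked)


def extract_l2(text: str, n_bits: int) -> str:
    """Read one bit per line from the first n_bits lines; '?' means the mark is missing."""
    bits = []
    for line in text.split("\n")[:n_bits]:
        if line.endswith("\t"):
            bits.append("1")
        elif line.endswith(" "):
            bits.append("0")
        else:
            bits.append("?")
    return "".join(bits)


document = "line one\nline two\nline three\nline four"
marked = embed_l2(document, "101")
print(extract_l2(marked, 3))

# Trailing-whitespace normalization, as many editors apply on save, erases the layer.
normalized = "\n".join(line.rstrip() for line in marked.split("\n"))
print(extract_l2(normalized, 3))
```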
L3: Semantic Rotation (151 classes)
Synonym rotation across 151 word classes (verbs, adjectives, adverbs, nouns, connectors) encodes bits in the choice of words themselves. Additional channels: 25 British/American spelling variants, 30 contraction choices, punctuation style (Oxford comma, dash style, quote style), and number formatting. These marks survive format conversion, invisible-char stripping, and screenshot/OCR because they exist in the content, not the formatting.
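The synonym channel can be illustrated with a toy example. Everything specific here is an assumption: the four two-word classes, the one-bit-per-class mapping, and the function names are hypothetical stand-ins, not the 151 production classes or the actual mapping from mark_id to word choice.

```python
import re

# Hypothetical synonym classes: index 0 encodes bit 0, index 1 encodes bit 1.
CLASSES = [("begin", "start"), ("large", "big"), ("quickly", "rapidly"), ("help", "assist")]

# word -> (class index, bit value)
LOOKUP = {word: (ci, bi) for ci, pair in enumerate(CLASSES) for bi, word in enumerate(pair)}
WORD_RE = re.compile(r"[A-Za-z]+")


def embed_l3(text: str, bits: str) -> str:
    """Rewrite each class word so that class i carries bits[i] (output words are lowercase)."""
    def swap(match: re.Match) -> str:
        hit = LOOKUP.get(match.group(0).lower())
        if hit is None or hit[0] >= len(bits):
            return match.group(0)
        return CLASSES[hit[0]][int(bits[hit[0]])]
    return WORD_RE.sub(swap, text)


def extract_l3(text: str) -> dict[int, str]:
    """Map class index -> bit read from the first class word seen in the text."""
    found: dict[int, str] = {}
    for match in WORD_RE.finditer(text):
        hit = LOOKUP.get(match.group(0).lower())
        if hit is not None:
            found.setdefault(hit[0], str(hit[1]))
    return found


source = "We will begin the large migration quickly; please help."
alice = embed_l3(source, "0101")
bob = embed_l3(source, "1010")
print(alice)
print(bob)

# A formatting wipe (here: upper-casing and collapsing whitespace) leaves the bits readable.
wiped = " ".join(alice.upper().split())
recovered = "".join(extract_l3(wiped)[i] for i in range(4))
print(recovered)
```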
Content Fingerprinting (server-side)
When all embedded watermarks are stripped (VM-export attack), a server-side content fingerprint database identifies which recipient's copy was leaked. Winnowing hash fingerprints and sentence-level content hashes are computed at seal time and stored alongside the sealed file. At attribution time, the leaked text's fingerprint is compared against all stored copies via Jaccard similarity.
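A simplified sketch of winnowing plus Jaccard comparison is shown below. The k-gram length, window size, normalization (lowercase, alphanumerics only), and use of truncated SHA-256 are assumptions chosen for illustration. The sketch keeps only the selected hash values, not their positions, and it omits the sentence-level hashes.

```python
import hashlib


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-gram of the normalized text (lowercase, alphanumerics only)."""
    norm = "".join(ch.lower() for ch in text if ch.isalnum())
    return [
        int.from_bytes(hashlib.sha256(norm[i:i + k].encode()).digest()[:8], "big")
        for i in range(len(norm) - k + 1)
    ]


def winnow(text: str, k: int = 5, w: int = 4) -> set[int]:
    """Keep the minimum hash of each window of w consecutive k-gram hashes."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Fingerprints computed at seal time (hypothetical per-recipient copies).
stored = {
    "alice": winnow("We will begin the big migration quickly; please assist."),
    "bob": winnow("We will start the large migration rapidly; please help."),
}

# Leaked text: Alice's wording with case and spacing destroyed.
leaked = "WE  WILL  BEGIN  THE  BIG  MIGRATION  QUICKLY  PLEASE  ASSIST"
leaked_fp = winnow(leaked)
scores = {name: jaccard(fp, leaked_fp) for name, fp in stored.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this sketch the per-recipient fingerprints differ only because the copies carry different word choices, so the comparison depends on the leaked text retaining the recipient's wording.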
7. The Attribution Pipeline (5 Phases)
When leaked text is recovered, the attribution pipeline runs five phases to identify the source recipient, combining evidence from all available layers.
Direct Extract
Scan for L1 zero-width frames and L2 trailing whitespace patterns. Partial L2 recovery reports confidence instead of failing silently.
Registry Query
Look up recovered mark_ids against the registry at registry.oversightprotocol.dev and fetch candidate mark_ids for L3 verification.
L3 Semantic Verify
Test each candidate mark_id against the leaked text's synonym choices, spelling, contractions, and punctuation. Score each candidate.
Bayesian Fusion
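Combine L1, L2, and L3 evidence into ranked candidates. Each candidate's fused score is 1 - product(1 - s_i) over its per-layer scores s_i, a noisy-OR combination that assumes the layers provide independent evidence.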
Fingerprint Match
When all embedded marks are stripped, compare the leaked text's winnowing and sentence hashes against stored per-recipient fingerprints.
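The scoring rule named in the Bayesian Fusion phase can be sketched directly. The candidate names and per-layer scores below are made up for illustration, and the function names are hypothetical; only the formula 1 - product(1 - s_i) comes from the description above.

```python
import math


def fuse(scores: list[float]) -> float:
    """Combine per-layer scores in [0, 1] as 1 - product(1 - s_i)."""
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    return 1.0 - math.prod(1.0 - s for s in scores)


def rank_candidates(evidence: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (mark_id, fused score) pairs, highest score first."""
    fused = {mark_id: fuse(scores) for mark_id, scores in evidence.items()}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical evidence per candidate: [L1 score, L2 score, L3 score].
# L1 is fully stripped (0.0), L2 is partially recovered, L3 is intact.
evidence = {
    "mark-alice": [0.0, 0.5, 0.8],
    "mark-bob": [0.0, 0.0, 0.2],
}
ranking = rank_candidates(evidence)
for mark_id, score in ranking:
    print(f"{mark_id}: {score:.3f}")
```

Under this rule a layer with score 0 contributes nothing, and any single layer with score 1 pushes the fused score to 1, which is why the independence assumption matters when layers are correlated.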
Each recipient registration produces a signed DSSE attestation recorded in Sigstore's Rekor v2 transparency log, providing a tamper-evident, publicly auditable record that the mark was assigned before the leak occurred. Any auditor can verify the attestation using standard Sigstore tooling without Oversight-specific code.
8. The .sealed File Format
On disk, every watermarked document is stored as a .sealed binary file.
The format is compact, self-describing, and designed for both confidentiality and
tamper-evidence. Below is the binary layout.
+--------+----------+-------+-------------------------------------------+
| Offset | Size     | Field | Description                               |
+--------+----------+-------+-------------------------------------------+
| 0      | 6 bytes  | Magic | b"OSGT\x01\x00" (file signature)          |
| 6      | 1 byte   | FVer  | Format version (1)                        |
| 7      | 1 byte   | Suite | 0x01=CLASSIC, 0x02=HYBRID (PQ)            |
| 8      | 4 bytes  | MLen  | Manifest length (u32, big-endian)         |
| 12     | MLen     | Man.  | Canonical JSON manifest, Ed25519 signed   |
| 12+M   | 4 bytes  | WLen  | Wrapped DEK length (u32, big-endian)      |
| 16+M   | WLen     | WDEK  | Wrapped DEK (ephemeral pub + nonce + key) |
| ...    | 24 bytes | Nonce | XChaCha20-Poly1305 nonce                  |
| ...    | 4 bytes  | CLen  | Ciphertext length (u32, big-endian)       |
| ...    | CLen     | CT    | AEAD ciphertext (AAD = content_hash)      |
+--------+----------+-------+-------------------------------------------+
The JSON manifest contains the document identifier, issuer public key, recipient public key, creation timestamp, and a digital signature over the canonical JSON encoding. Because the manifest is in cleartext, any third party can verify the issuer's signature without needing the recipient's private key. The watermarked plaintext is only accessible after the recipient unwraps the DEK with their X25519 secret key and decrypts the ciphertext.
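The layout can be read back with Python's standard `struct` module. The sketch below only splits a file into its fields; it does not verify the manifest signature, unwrap the DEK, or decrypt anything. The function names, the returned dictionary keys, and the `doc_id` manifest key in the sample are hypothetical.

```python
import json
import struct

MAGIC = b"OSGT\x01\x00"
SUITES = {0x01: "CLASSIC", 0x02: "HYBRID"}
NONCE_LEN = 24  # XChaCha20-Poly1305 nonce size


def parse_sealed(blob: bytes) -> dict:
    """Split a .sealed blob into its fields, following the layout table."""
    if not blob.startswith(MAGIC):
        raise ValueError("not a .sealed file: bad magic")
    offset = len(MAGIC)

    def take(n: int) -> bytes:
        nonlocal offset
        chunk = blob[offset:offset + n]
        if len(chunk) != n:
            raise ValueError("truncated .sealed file")
        offset += n
        return chunk

    def take_u32() -> int:
        return struct.unpack(">I", take(4))[0]  # u32, big-endian

    fver, suite = struct.unpack(">BB", take(2))
    manifest = json.loads(take(take_u32()))
    wrapped_dek = take(take_u32())
    nonce = take(NONCE_LEN)
    ciphertext = take(take_u32())
    return {
        "format_version": fver,
        "suite": SUITES.get(suite, f"unknown(0x{suite:02x})"),
        "manifest": manifest,
        "wrapped_dek": wrapped_dek,
        "nonce": nonce,
        "ciphertext": ciphertext,
    }


def build_sample() -> bytes:
    """Assemble a dummy file with placeholder bytes, for exercising the parser only."""
    manifest = json.dumps({"doc_id": "demo"}, sort_keys=True).encode()
    wrapped_dek, ciphertext = b"\x00" * 8, b"\xaa" * 16
    return (
        MAGIC + struct.pack(">BB", 1, 0x01)
        + struct.pack(">I", len(manifest)) + manifest
        + struct.pack(">I", len(wrapped_dek)) + wrapped_dek
        + b"\x11" * NONCE_LEN
        + struct.pack(">I", len(ciphertext)) + ciphertext
    )


parsed = parse_sealed(build_sample())
print(parsed["suite"], parsed["manifest"])
```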