Performance Evaluation | Oversight Protocol

Zion Boggan · April 2026 · Oversight Protocol v0.4.4 (measurement snapshot), documentation current as of v0.4.5

This page presents performance characteristics of the Oversight protocol v0.4.4, benchmarked on an Intel Core i7 (6th gen) running Windows 10 with CPython 3.14.2. Each measurement is the mean of 10 runs. The benchmark script (bench_usenix.py) and raw data are available in the repository. Figures labeled "expected" in the sections below are estimates; the Measured Results Summary at the end of this page reports actual measurements and takes precedence where the two differ.
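The mean-of-10-runs methodology can be approximated with a small timing harness. The following is a minimal sketch, not the contents of bench_usenix.py: the helper name `mean_runtime_us` and the SHA-256 workload are illustrative assumptions.

```python
import hashlib
import time


def mean_runtime_us(func, runs=10):
    """Return the mean wall-clock time of func() in microseconds over `runs` calls."""
    total_ns = 0
    for _ in range(runs):
        start = time.perf_counter_ns()
        func()
        total_ns += time.perf_counter_ns() - start
    return total_ns / runs / 1_000


# Hypothetical workload: hash a 10 KB buffer, standing in for a protocol operation.
payload = b"x" * 10 * 1024
elapsed_us = mean_runtime_us(lambda: hashlib.sha256(payload).digest())
print(f"SHA-256 over 10 KB: {elapsed_us:.1f} us (mean of 10 runs)")
```

A monotonic, high-resolution clock such as `time.perf_counter_ns` is preferable to `time.time` here because most of the operations on this page complete in well under a millisecond.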

Seal/Open Throughput

The seal and open operations are dominated by three categories of work: cryptographic operations, watermark embedding (seal only), and I/O. For a typical document (1-100 KB of text) sealed without watermarking, the cryptographic operations are the primary cost.

Cryptographic Operations per Seal

Operation	Algorithm	Expected Time	Notes
Manifest signing	Ed25519	~50 us	Constant time regardless of document size. Signing the canonical JSON manifest bytes.
DEK generation	CSPRNG (256 bits)	<1 us	Single call to `secrets.token_bytes(32)`.
X25519 key agreement	X25519	~100 us	Ephemeral keypair generation + Diffie-Hellman exchange.
HKDF key derivation	HKDF-SHA256	<10 us	Single extract-and-expand cycle, 32-byte output.
DEK wrap	XChaCha20-Poly1305	<5 us	Encrypting 32 bytes (the DEK) with the derived wrapping key.
Content encryption	XChaCha20-Poly1305	~1 us/KB	Linear in document size. A 100 KB document takes roughly 100 us.
SHA-256 content hash	SHA-256	~0.5 us/KB	Computed twice: once at seal time for the manifest, and once at open time for post-decrypt verification.

For a 10 KB document without watermarking, the total cryptographic cost of seal is on the order of 200-300 microseconds. The open operation is comparable: it parses the container, verifies the Ed25519 signature (~100 us), performs X25519 key agreement (~100 us), runs HKDF and AEAD decryption, and verifies the SHA-256 content hash. Both operations measure well under 1 millisecond for documents up to 100 KB (627 us to seal and 576 us to open at 100 KB) and about 4 milliseconds at 1 MB.
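The cheapest steps in the table (DEK generation, HKDF derivation, and the content hash) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the reference implementation: the X25519 shared secret is replaced by random placeholder bytes, the `info` label is hypothetical, and signing and AEAD are omitted because they require third-party libraries.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: one extract step, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-SHA256")
    if not salt:
        salt = b"\x00" * hash_len  # RFC 5869 default when no salt is supplied
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# 256-bit data encryption key from the OS CSPRNG, as in the table above.
dek = secrets.token_bytes(32)

# Placeholder for the X25519 shared secret (assumption: real code would run
# the Diffie-Hellman exchange here).
shared_secret = secrets.token_bytes(32)

# Hypothetical info label; the protocol's actual label is not stated on this page.
wrapping_key = hkdf_sha256(shared_secret, salt=b"", info=b"example-dek-wrap")

# Content hash stored in the manifest at seal time and re-checked after decryption.
content_hash = hashlib.sha256(b"document plaintext").hexdigest()
```

For a 32-byte output the expand loop runs exactly once, which matches the "single extract-and-expand cycle" noted in the table.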

For the hybrid suite (OSGT-HYBRID-v1), ML-KEM-768 encapsulation adds approximately 200 us and ML-DSA-65 signing adds approximately 2 ms, making hybrid seal roughly 10x slower than classical seal. This is still fast enough for interactive use.

Scaling with Document Size

The only size-dependent operations are AEAD encryption/decryption and SHA-256 hashing, both of which are linear. XChaCha20-Poly1305 processes data at approximately 1 GB/s on modern hardware (single core, no hardware acceleration). SHA-256 throughput is comparable. A 100 MB document therefore adds roughly 200 ms of crypto time. The practical ceiling is I/O bandwidth, not CPU time.
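The 200 ms figure follows from adding one AEAD pass and one SHA-256 pass at the stated throughput. A small sketch of that arithmetic, assuming 1 GB/s means 10^9 bytes per second for both primitives (the function name is illustrative):

```python
def estimated_crypto_ms(size_bytes, aead_bytes_per_s=1e9, sha256_bytes_per_s=1e9):
    """Estimate size-dependent crypto time: one AEAD pass plus one SHA-256 pass."""
    aead_s = size_bytes / aead_bytes_per_s
    hash_s = size_bytes / sha256_bytes_per_s
    return (aead_s + hash_s) * 1_000


for label, size in [("10 KB", 10_000), ("1 MB", 1_000_000), ("100 MB", 100_000_000)]:
    print(f"{label}: ~{estimated_crypto_ms(size):.2f} ms")
```

These are estimates from nominal throughput, not measurements; the fixed per-operation costs (signing, key agreement) are excluded.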

Watermark Embedding Overhead

Watermark embedding occurs before the cryptographic seal and adds processing time proportional to the text length. Each layer has distinct performance characteristics.

L1: Zero-Width Unicode

L1 performs a single linear pass over the text, inserting a pre-computed frame (approximately 66 zero-width characters for a 64-bit mark_id) at every 40th visible character. The operation is dominated by string concatenation. Expected overhead: on the order of 10 microseconds per KB of text. Negligible relative to cryptographic costs.

L2: Trailing Whitespace

L2 splits the text on newlines, iterates through lines, and appends a space or tab to lines without existing trailing whitespace. Embedding stops after 64 eligible lines (for a 64-bit mark_id), so the number of modified lines is bounded regardless of document size. Expected overhead: under 50 microseconds for small documents. Splitting and rejoining the text is still linear in its length, which is consistent with the measured range of 21 us at 1 KB to 3.72 ms at 1 MB.
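The L2 procedure described above can be sketched as follows. This is an illustration, not the protocol's code: the bit encoding (space for 0, tab for 1), the most-significant-bit-first order, the rule that empty lines are skipped, and the function names are all assumptions.

```python
from typing import Optional


def embed_l2(text: str, mark_id: int, bits: int = 64) -> str:
    """Append a space (bit 0) or tab (bit 1) to the first `bits` eligible lines."""
    out = []
    bit_index = 0
    for line in text.split("\n"):
        # Assumption: a line is eligible if it is non-empty and has no trailing whitespace.
        eligible = line != "" and not line[-1].isspace()
        if eligible and bit_index < bits:
            bit = (mark_id >> (bits - 1 - bit_index)) & 1  # most significant bit first
            line += "\t" if bit else " "
            bit_index += 1
        out.append(line)
    return "\n".join(out)


def extract_l2(text: str, bits: int = 64) -> Optional[int]:
    """Read the mark back; returns None if fewer than `bits` marked lines are found.

    This naive reader treats any trailing space or tab as a mark bit, so it
    assumes the original text had no trailing whitespace of its own.
    """
    value = 0
    count = 0
    for line in text.split("\n"):
        if count == bits:
            break
        if line.endswith(" ") or line.endswith("\t"):
            value = (value << 1) | (1 if line.endswith("\t") else 0)
            count += 1
    return value if count == bits else None


sample = "\n".join(f"line {i}" for i in range(100))
marked = embed_l2(sample, 0xDEADBEEFCAFEF00D)
recovered = extract_l2(marked)
```

Each modified line grows by exactly one byte, which is where the 64-byte ceiling on L2 size overhead comes from.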

L3: Semantic Marks

L3 is the most expensive embedding layer. The apply_semantic() function runs five sequential passes over the text:

Synonym rotation (T1) requires regex-based word-boundary scanning and dictionary lookup for each word. The v2 dictionary uses a precompiled lookup table, so each word lookup is O(1). The full pass is linear in text length. For a 10 KB document with approximately 1,500 words, the expected time is on the order of 1-5 milliseconds (dominated by regex matching, not dictionary lookups).

Punctuation (T2), spelling (T2b), contractions (T2c), and number formatting (T2d) each perform regex-based find-and-replace passes. Each pass is linear but typically matches only a small number of positions. Combined overhead for all four: on the order of 1-3 milliseconds for a 10 KB document.

Total L3 embedding is expected to take 2-8 milliseconds for a 10 KB document and 20-80 milliseconds for a 100 KB document. L3 is the bottleneck in the watermarking pipeline, but at these sizes it remains fast enough for interactive workflows.
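The synonym rotation pass (T1) combines a word-boundary regex with a constant-time dictionary lookup. The sketch below shows that general shape only. The three synonym classes, the rule that one mark bit selects between two variants, and the function name are hypothetical; the real dictionary and bit-assignment scheme are not described on this page.

```python
import re

# Hypothetical two-variant synonym classes (the real dictionary is far larger).
SYNONYM_CLASSES = [("big", "large"), ("quick", "fast"), ("begin", "start")]

# Precompiled lookup table: word -> index of its class, so each lookup is O(1).
LOOKUP = {word: idx for idx, cls in enumerate(SYNONYM_CLASSES) for word in cls}
WORD_RE = re.compile(r"\b[A-Za-z]+\b")


def rotate_synonyms(text: str, mark_id: int, bits: int = 64) -> str:
    """Rewrite each dictionary word to the variant selected by the next mark bit."""
    position = 0

    def replace(match):
        nonlocal position
        word = match.group(0)
        class_idx = LOOKUP.get(word.lower())
        if class_idx is None:
            return word  # not in the dictionary: leave untouched
        bit = (mark_id >> (position % bits)) & 1  # least significant bit first
        position += 1
        chosen = SYNONYM_CLASSES[class_idx][bit]
        return chosen.capitalize() if word[0].isupper() else chosen

    return WORD_RE.sub(replace, text)


print(rotate_synonyms("The big dog made a quick start.", 0b101))
```

The regex callback runs once per word, so the pass stays linear in text length; the per-word cost is dominated by the regex engine and the Python-level callback rather than the dictionary lookup.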

File Size Overhead from Watermarking

Watermark embedding increases the byte size of the plaintext before encryption. The container format adds its own fixed overhead (header, manifest, wrapped DEK).

Per-Layer Size Impact

Layer	Mechanism	Size Increase	Example (10 KB doc)
L1 (zero-width)	66-char frame every 40 visible chars	~5-8% (UTF-8 encoded, 3 bytes per zero-width char)	~500-800 bytes
L2 (whitespace)	1 trailing byte per modified line	<0.5% (at most 64 extra bytes)	~64 bytes
L3 (synonyms)	Word replacement (same or similar length)	<0.1% (net change near zero)	~0-10 bytes
L3 (punctuation/spelling)	Character-level substitution	<0.1%	~0-20 bytes

L1 is the largest contributor to size overhead because zero-width Unicode characters require 3 bytes each in UTF-8 encoding, and frames are inserted frequently. The combined watermark overhead for all layers is typically 5-9% of the original text size. This overhead is present in the plaintext before encryption; the encrypted container adds a fixed overhead of approximately 200-400 bytes (6-byte magic, 2-byte header, manifest JSON, wrapped DEK JSON, 24-byte AEAD nonce, 16-byte Poly1305 tag).

L3 marks produce negligible size change because they replace words with synonyms of similar length, replace punctuation characters with other punctuation characters, or swap between equally-sized spelling variants.

Fingerprint Computation Cost

Content fingerprinting runs once during seal (to store the fingerprint) and once during attribution (to compare against stored fingerprints). Both the winnowing and sentence hashing algorithms are linear in text length.

Winnowing

The winnowing algorithm normalizes the text (lowercase, collapse whitespace, strip non-alphanumeric), computes an MD5 hash of every k-gram (k=10 by default), and selects the minimum hash in each window (W=4 by default). The dominant cost is the hash computation: one MD5 call per k-gram position, so each hash is computed from scratch rather than updated incrementally.

MD5 is fast (approximately 500 MB/s on modern hardware), and each k-gram is only 10 characters. For a 10 KB document (approximately 8,000 normalized characters), winnowing computes approximately 8,000 MD5 hashes of 10-byte inputs. Expected time: on the order of 1-5 milliseconds.
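A minimal sketch of winnowing with the stated defaults (k=10, W=4) follows. The normalization details (keeping single spaces, restricting to ASCII letters and digits) and the truncation of each MD5 digest to 64 bits are assumptions made for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase, strip non-alphanumerics, and collapse whitespace to single spaces."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def winnow(text: str, k: int = 10, w: int = 4) -> set:
    """Return the set of minimum k-gram hashes over every window of w hashes."""
    norm = normalize(text)
    hashes = [
        # One MD5 call per k-gram position; keep the first 8 bytes as an integer.
        int.from_bytes(
            hashlib.md5(norm[i:i + k].encode(), usedforsecurity=False).digest()[:8],
            "big",
        )
        for i in range(len(norm) - k + 1)
    ]
    if not hashes:
        return set()
    if len(hashes) < w:
        return {min(hashes)}
    return {min(hashes[i:i + w]) for i in range(len(hashes) - w + 1)}


fp_a = winnow("The quick brown fox jumps over the lazy dog.")
fp_b = winnow("the QUICK brown fox, jumps over the lazy dog!")
```

Because normalization removes case and punctuation, `fp_a` and `fp_b` are identical, which is the property that lets the fingerprint survive superficial edits to a leaked copy.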

Sentence Hashing

Sentence hashing splits the text on sentence boundaries, extracts content words (length greater than 2) from each sentence, sorts them, and computes a SHA-256 hash of the sorted content. For a 10 KB document with approximately 50 sentences, this requires 50 SHA-256 computations of short inputs. Expected time: well under 1 millisecond.
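Sentence hashing as described can be sketched in a few lines. The sentence-boundary regex, the ASCII word tokenizer, and the choice to join sorted words with spaces before hashing are assumptions; the page specifies only the overall steps.

```python
import hashlib
import re


def sentence_hashes(text: str) -> set:
    """Hash each sentence's sorted content words (length > 2) with SHA-256."""
    # Assumption: sentences end at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    result = set()
    for sentence in sentences:
        words = sorted(
            w for w in re.findall(r"[a-z0-9]+", sentence.lower()) if len(w) > 2
        )
        if words:
            result.add(hashlib.sha256(" ".join(words).encode()).hexdigest())
    return result


original = sentence_hashes("The cat sat on the mat. Dogs bark loudly!")
reordered = sentence_hashes("Dogs loudly bark! On the mat the cat sat.")
```

Sorting the content words makes each hash insensitive to word order within a sentence, and collecting hashes into a set makes the fingerprint insensitive to sentence order, so `original` and `reordered` are equal.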

Similarity Comparison

Fingerprint comparison at attribution time is fast: winnowing similarity is a set intersection/union computation (O(n) with hash sets, or O(n log n) if the hashes must first be sorted), and sentence similarity is a set membership check (O(n) with a hash set). For typical fingerprints (100-300 winnowing hashes, 30-100 sentence hashes), comparison takes microseconds.

Comparing one leaked document against N stored fingerprints scales linearly in N. For registries with thousands of sealed documents, this is well within interactive response times. For million-document registries, a locality-sensitive hashing index would be advisable (planned for a future version).
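The linear scan over N stored fingerprints can be sketched as a Jaccard comparison against each registry entry. The registry layout (a dict from document ID to a set of hashes) and the function names are hypothetical; small integers stand in for real hash values.

```python
def jaccard(a: set, b: set) -> float:
    """Intersection over union of two hash sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(leaked: set, registry: dict) -> list:
    """Score the leaked fingerprint against every stored one: O(N) comparisons."""
    scores = [(jaccard(leaked, fp), doc_id) for doc_id, fp in registry.items()]
    return sorted(scores, reverse=True)


# Hypothetical registry contents.
registry = {
    "doc-a": {1, 2, 3, 4},
    "doc-b": {3, 4, 5, 6},
    "doc-c": {7, 8, 9},
}
ranked = rank_candidates({1, 2, 3, 9}, registry)
print(ranked[0])
```

Every query touches every stored fingerprint, which is why an index such as locality-sensitive hashing becomes attractive once the registry grows to millions of entries.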

Cross-Language Performance: Python vs Rust

The Python reference implementation prioritizes correctness and readability. The Rust port prioritizes performance and memory safety. Both produce bit-identical output for the same inputs (verified by 3 cross-language conformance tests), but their runtime characteristics differ substantially.

Expected Performance Differences

Operation	Python (reference)	Rust (port)	Expected Speedup
Seal (no watermark, 10 KB)	~2-5 ms	~0.3-0.5 ms	5-10x
Open (10 KB)	~2-5 ms	~0.3-0.5 ms	5-10x
L3 semantic embed (10 KB)	~5-10 ms	~0.5-1 ms	10-20x
Winnowing fingerprint (10 KB)	~2-5 ms	~0.1-0.3 ms	10-30x
Container parsing	~0.5-1 ms	~10-50 us	20-50x

The Python implementation's performance is adequate for all practical use cases. A 10 KB document seals in under 15 ms with full watermarking. The Rust implementation is faster due to zero-copy parsing, compiled regex, and direct use of RustCrypto primitives without Python's FFI overhead.

The primary bottleneck in the Python implementation is not cryptography (which is backed by OpenSSL via the cryptography library and libsodium via PyNaCl) but rather text processing: regex matching for synonym lookup, string concatenation for watermark frame insertion, and JSON serialization for the manifest. These operations run in the interpreter or cross the Python/C boundary repeatedly, which costs more than the equivalent string processing in Rust.

Memory Usage

Both implementations hold the full plaintext in memory during seal and open. The Python implementation has higher baseline memory usage due to the interpreter and string interning. The Rust implementation uses approximately 2-3x the plaintext size in peak memory (plaintext + watermarked copy + ciphertext buffer). For documents under 100 MB, memory usage is not a practical concern in either implementation.

Registry Query Latency

The attribution registry (FastAPI + SQLite) introduces network latency for two operations: seal-time registration (POST /register) and attribution-time queries (POST /attribute, GET /marks).

Registration is optional and occurs after the sealed file is written. The HTTP POST contains the manifest, watermark references, beacon tokens, and optionally the content fingerprint. For a typical registration payload (2-5 KB of JSON), the server-side processing time is dominated by SQLite inserts (under 1 ms with WAL mode). Total latency is determined by network round-trip time.

Attribution queries are similarly lightweight on the server side. The POST /attribute endpoint performs an indexed SQLite lookup on mark_id (O(log n) in the number of registered marks). The GET /marks endpoint returns all known mark_ids, which may be large for registries with many sealed files. Pagination is advisable for registries exceeding 10,000 sealed files.
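One way to paginate a mark listing in SQLite is keyset pagination, which reuses the same index that serves the mark_id lookup. The sketch below uses an in-memory database; the `marks` table, its columns, the hex-string encoding of mark_id, and the function name are hypothetical and do not reflect the registry's actual schema.

```python
import sqlite3


def fetch_mark_page(conn, after_mark_id="", limit=1000):
    """Return up to `limit` mark_ids strictly greater than `after_mark_id`."""
    rows = conn.execute(
        "SELECT mark_id FROM marks WHERE mark_id > ? ORDER BY mark_id LIMIT ?",
        (after_mark_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical schema: mark_id stored as fixed-width hex text so ordering is stable.
conn.execute("CREATE TABLE marks (mark_id TEXT PRIMARY KEY, file_id TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?)",
    [(f"{i:016x}", f"file-{i}") for i in range(25)],
)

pages = []
last_seen = ""
while True:
    page = fetch_mark_page(conn, last_seen, limit=10)
    if not page:
        break
    pages.append(page)
    last_seen = page[-1]  # the next page starts after the last ID returned
```

Unlike OFFSET-based paging, each page here is an index seek followed by a short scan, so later pages cost no more than the first.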

The registry is intentionally simple (no caching layer, no distributed backend) in the current version. For production deployments handling hundreds of concurrent attribution queries, the planned v1.0 Rust/Axum port with SQLx connection pooling would provide substantially higher throughput.

Bottleneck Summary

Workflow	Bottleneck	Typical Latency
Seal (no watermark)	X25519 key agreement	<1 ms
Seal (with watermark)	L3 semantic embedding (regex passes)	5-15 ms (10 KB doc)
Open	Ed25519 verify + X25519 + AEAD decrypt	<1 ms
Inspect	Container parse + JSON deserialize	<1 ms
Attribute (L1+L2 only)	Text scanning for zero-width frames	<5 ms
Attribute (full pipeline)	L3 verification against N candidates	5-20 ms per candidate
Fingerprint comparison	Winnowing hash computation	2-10 ms per document pair

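For documents around 10 KB, the protocol's computational costs are well below interactive latency thresholds (100 ms). At 100 KB and above, L3 embedding and content fingerprinting exceed that threshold in the measured Python implementation (122 ms and 321 ms at 100 KB, respectively). Network latency to the registry and timestamp authorities dominates end-to-end seal time for workflows that include registration and RFC 3161 timestamping.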

Measured Results Summary (v0.4.4)

The following table summarizes actual measurements from bench_usenix.py run on the reference hardware described above. Each value is the mean of 10 runs.

Operation	1 KB	10 KB	100 KB	1 MB
Seal (no watermark)	297 us	325 us	627 us	4.07 ms
Seal (with watermark)	305 us	471 us	2.76 ms	23.78 ms
Open (decrypt + verify)	272 us	301 us	576 us	3.93 ms
L1 embed (zero-width)	230 us	1.88 ms	19.52 ms	213 ms
L2 embed (whitespace)	21 us	66 us	401 us	3.72 ms
L3 embed (semantic)	1.39 ms	12.49 ms	122 ms	1.21 s
Content fingerprint	3.37 ms	32.0 ms	321 ms	3.35 s
L3 verify (correct ID)	961 us	9.10 ms	90.5 ms	986 ms
ECC encode (R=7, 64-bit)	23.6 us (constant)
ECC decode (R=7, 64-bit)	50.8 us (constant)

Peak throughput for seal and open is approximately 253 MB/s at the 1 MB level, dominated by XChaCha20-Poly1305 AEAD. Watermark embedding adds 484% overhead at 1 MB, with L3 semantic processing (regex-based synonym matching across 151 classes) accounting for 85% of that cost. Content fingerprinting via winnowing is the most expensive per-byte operation at 3.35 seconds per megabyte. Full benchmark data and methodology are in bench_usenix.py and PERFORMANCE_BENCHMARKS.md in the repository.
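The derived figures in this paragraph can be reproduced from the 1 MB column of the table. The sketch below assumes 1 MB means 10^6 bytes; because the table values are rounded, the throughput it computes (roughly 246 MB/s for seal and 254 MB/s for open) only approximates the quoted 253 MB/s, which presumably comes from the unrounded raw data.

```python
# Values copied from the 1 MB column of the measured results table.
seal_plain_ms = 4.07
seal_marked_ms = 23.78
open_ms = 3.93
l1_ms, l2_ms, l3_ms = 213.0, 3.72, 1210.0

size_mb = 1.0  # assumption: 1 MB = 10^6 bytes

seal_mb_per_s = size_mb / (seal_plain_ms / 1_000)
open_mb_per_s = size_mb / (open_ms / 1_000)

# Overhead of watermarked seal relative to plain seal.
overhead_pct = (seal_marked_ms - seal_plain_ms) / seal_plain_ms * 100

# Share of standalone embedding time spent in L3.
l3_share_pct = l3_ms / (l1_ms + l2_ms + l3_ms) * 100

print(f"seal: {seal_mb_per_s:.0f} MB/s, open: {open_mb_per_s:.0f} MB/s")
print(f"watermark overhead: {overhead_pct:.0f}%, L3 share: {l3_share_pct:.0f}%")
```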

Performance measurements from v0.4.4 on Intel Core i7, CPython 3.14.2, Windows 10. Consult the repository for the benchmark script and raw data.