# Bitcoin-Anchored Verifiable Attribution Index

**Design Specification**
*May 2026*

---

> **Core invariant.** Per-commitment production cost is positive and scales linearly with volume up to operational constants, under a stated, adversary-favourable threat model. The architectural contribution is cost preservation and attribution: production cost paid by a holder persists as attributable scarcity across every system that recognises the commitment. See *Orange Anchor White Paper v2.3* for the full argument.

---

## Purpose of This Document

This document is the design specification for the Bitcoin-Anchored Verifiable Attribution Index (BAVAI) — a construction-agnostic key-value attribution infrastructure for recording signed observations against collateral commitments anchored to Bitcoin. The index is designed to serve any BACC construction; Orange Anchor is the reference instantiation. This document establishes the key and value encoding, tree structure, on-chain commitment mechanism, lookup flow, and the architectural properties of the index.

The mechanism is named the **Bitcoin-anchored verifiable attribution index**. This naming aligns with established terminology in the cryptographic literature (authenticated data structures, verifiable key-value commitments, transparency logs) while being specific enough to describe the function: an index that supports verifiable attribution of actions to actors against a Bitcoin commitment.

---

## What the Mechanism Is

A Bitcoin-anchored verifiable attribution index is a key-value data structure that answers two questions with cryptographic certainty:

**Attribution.** Who performed a specific action against a specific subject?

**Verifiability.** Can anyone, anywhere, at any future time, independently prove the answer is correct without trusting any intermediary?

The structure is built from well-understood components — Merkle trees, OP_RETURN commitments, signature verification — assembled in a specific architectural pattern. The novelty is in the application, not in the cryptography.

---

## The Core Design

The index supports lookups of the form: *given a composite key describing what was acted upon, return the cryptographic identity of who acted, with a proof verifiable against Bitcoin.*

### Key Construction

```
KEY   = flag_indicator (1 byte) || anchor_reference (32 bytes)
      = 33 bytes composite key
```

The key encodes two things together: the type of attribution event (flag, dispute, endorsement — extensible via the indicator byte) and the subject of the event (the **anchor reference** of the Orange Anchor being attributed against). The anchor reference is the canonical 32-byte identifier of an Orange Anchor commitment, defined as `SHA256(start_anchor_txid ‖ end_anchor_txid)` (see *BAVAI Operator Specification v1.0* §2). The anchor reference is preferred over the 64-byte commitment signature because (a) it is the canonical identity used throughout the rest of the protocol, (b) it is 32 bytes shorter, and (c) every party that needs to construct a lookup against a given Orange Anchor already holds its anchor reference.
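
As a minimal sketch of the key construction above, the following Python derives an anchor reference and assembles the 33-byte composite key. The function names are illustrative, and the assumption that both txids are supplied as raw 32-byte values in one agreed byte order is not taken from this document; the operator specification is the authority on byte order.

```python
import hashlib

FLAG = 0x01  # indicator value used for a flag in the worked example


def anchor_reference(start_anchor_txid: bytes, end_anchor_txid: bytes) -> bytes:
    """SHA256(start_anchor_txid || end_anchor_txid), a 32-byte identifier."""
    # Assumption: both txids are raw 32-byte values in one agreed byte order.
    if len(start_anchor_txid) != 32 or len(end_anchor_txid) != 32:
        raise ValueError("txids must be 32 bytes each")
    return hashlib.sha256(start_anchor_txid + end_anchor_txid).digest()


def composite_key(flag_indicator: int, anchor_ref: bytes) -> bytes:
    """flag_indicator (1 byte) || anchor_reference (32 bytes) = 33 bytes."""
    if not 0 <= flag_indicator <= 0xFF:
        raise ValueError("flag_indicator must fit in one byte")
    if len(anchor_ref) != 32:
        raise ValueError("anchor_reference must be 32 bytes")
    return bytes([flag_indicator]) + anchor_ref
```

Because the key is a fixed-width concatenation, any party holding an anchor reference can rebuild the lookup key without contacting an operator.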

### Value Construction

```
VALUE = flagger_pubkey (33 bytes compressed) || flagger_signature (64 bytes) || timestamp (4 bytes)
      = 101 bytes
```

The value contains the cryptographic identity of the actor (the *flagger*) and a timestamped, signed record of the attribution event. The flagger's signature is over a canonical message containing the flag indicator, the target anchor reference, and the timestamp.
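
The value layout and the signed message can be sketched as follows. The field widths come from the layout above; encoding the timestamp as an unsigned 32-bit big-endian integer is an assumption made for illustration, as are the function names. Signing itself is left out because the signature scheme is not fixed here.

```python
import struct
from typing import Tuple

PUBKEY_LEN, SIG_LEN, TS_LEN = 33, 64, 4
VALUE_LEN = PUBKEY_LEN + SIG_LEN + TS_LEN  # 101 bytes


def canonical_message(flag_indicator: int, anchor_ref: bytes, timestamp: int) -> bytes:
    """Message the flagger signs: flag_indicator || anchor_reference || timestamp."""
    # Assumption: timestamp is serialised as uint32, big-endian.
    return bytes([flag_indicator]) + anchor_ref + struct.pack(">I", timestamp)


def encode_value(pubkey: bytes, signature: bytes, timestamp: int) -> bytes:
    """Concatenate the three fields into the fixed-width 101-byte value."""
    if len(pubkey) != PUBKEY_LEN or len(signature) != SIG_LEN:
        raise ValueError("unexpected pubkey or signature length")
    return pubkey + signature + struct.pack(">I", timestamp)


def decode_value(value: bytes) -> Tuple[bytes, bytes, int]:
    """Split a 101-byte value back into (pubkey, signature, timestamp)."""
    if len(value) != VALUE_LEN:
        raise ValueError("value must be 101 bytes")
    pubkey = value[:PUBKEY_LEN]
    signature = value[PUBKEY_LEN:PUBKEY_LEN + SIG_LEN]
    (timestamp,) = struct.unpack(">I", value[PUBKEY_LEN + SIG_LEN:])
    return pubkey, signature, timestamp
```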

### Tree Structure

A sorted Merkle tree is built over all `(KEY, VALUE)` pairs in an epoch. Sorting the leaves by key lets an operator locate an entry by binary search, and the inclusion proof grows logarithmically with the number of entries. A Patricia Trie is functionally equivalent and gives native key-based lookup; the sorted Merkle tree is preferred for Bitcoin contexts because it is simpler to implement and culturally aligned with the existing Bitcoin ecosystem (OpenTimestamps, for example, aggregates timestamps with Merkle trees).
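
One way to build such a tree is sketched below. This document does not fix the hashing convention, so the leaf and interior prefixes (`0x00` and `0x01`, in the style of RFC 6962) and the rule that an unpaired node is promoted unchanged are assumptions, as are all function names.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Assumed leaf encoding: SHA256(0x00 || key || value)
    return hashlib.sha256(b"\x00" + key + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumed interior encoding: SHA256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(pairs: Dict[bytes, bytes]) -> Tuple[List[bytes], List[List[bytes]]]:
    """Return (sorted_keys, levels); levels[0] holds leaf hashes, levels[-1] == [root]."""
    if not pairs:
        raise ValueError("an epoch tree needs at least one entry")
    keys = sorted(pairs)  # lexicographic byte order
    level = [leaf_hash(k, pairs[k]) for k in keys]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(node_hash(level[i], level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
        levels.append(level)
    return keys, levels


def inclusion_proof(keys: List[bytes], levels: List[List[bytes]], key: bytes) -> List[Tuple[bytes, bool]]:
    """Return the path as (sibling_hash, sibling_is_left) pairs, leaf to root."""
    index = bisect.bisect_left(keys, key)  # binary search over sorted keys
    if index == len(keys) or keys[index] != key:
        raise KeyError(key)
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):  # a promoted node has no sibling at this level
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof
```

The root is `levels[-1][0]`. A proof carries at most one sibling hash per level, which is where the logarithmic proof size comes from.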

### On-chain Commitment

The 32-byte Merkle root is committed to Bitcoin via OP_RETURN at the close of each epoch:

```
OP_RETURN <magic_bytes> <merkle_root>
```

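Bitcoin Core's long-standing default relay policy accepts up to 80 bytes of data in an OP_RETURN output, which comfortably fits the magic bytes plus a 32-byte root, and OP_RETURN is the standard Bitcoin mechanism for application-layer commitments. The full tree data lives off-chain with operators; only the root sits on the chain.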
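
The output script for such a commitment can be sketched as below. The magic value `b"BAVI"` is a hypothetical placeholder, since this document does not fix the magic bytes, and encoding the magic and the root as two separate data pushes is an assumption read off the notation above.

```python
from typing import Optional

OP_RETURN = 0x6A
MAGIC_BYTES = b"BAVI"  # hypothetical placeholder; the real magic is not specified here


def push(data: bytes) -> bytes:
    """Direct script push for 1..75 bytes: one length byte followed by the data."""
    if not 1 <= len(data) <= 75:
        raise ValueError("sketch only handles direct pushes of 1..75 bytes")
    return bytes([len(data)]) + data


def commitment_script(merkle_root: bytes, magic: bytes = MAGIC_BYTES) -> bytes:
    """Serialise OP_RETURN <magic_bytes> <merkle_root> as raw script bytes."""
    if len(merkle_root) != 32:
        raise ValueError("merkle_root must be 32 bytes")
    return bytes([OP_RETURN]) + push(magic) + push(merkle_root)


def parse_commitment_script(script: bytes, magic: bytes = MAGIC_BYTES) -> Optional[bytes]:
    """Return the 32-byte root if the script matches the expected shape, else None."""
    prefix = bytes([OP_RETURN]) + push(magic) + b"\x20"  # 0x20 = push 32 bytes
    if script.startswith(prefix) and len(script) == len(prefix) + 32:
        return script[len(prefix):]
    return None
```

With a 4-byte magic the script is 39 bytes, well inside the 80-byte data allowance.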

### Lookup Flow

```
1. Verifier holds Orange Anchor with anchor_reference AR
2. Constructs lookup key: flag_indicator || AR
3. Queries any operator for value at that key in epoch N
4. Operator returns: flagger_pubkey + flagger_signature + timestamp + Merkle proof
5. Verifier reads OP_RETURN root for epoch N from Bitcoin directly
6. Verifies Merkle proof against on-chain root
7. Verifies flagger_signature using flagger_pubkey over the canonical message (flag_indicator || AR || timestamp)
8. Result: cryptographically confirmed identity of who attributed against AR
```

The verification is fully local. The operator can refuse to serve, but cannot forge a valid proof against the Bitcoin-committed root. The operator is a service provider, not a trust authority.
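
Steps 2, 6, 7, and 8 of the flow above can be sketched as a single local check. The hashing convention (`0x00` leaf prefix, `0x01` interior prefix) and the proof shape are assumptions, not part of this specification. The signature scheme is not fixed here either, so signature checking is passed in as a caller-supplied function, and the on-chain root is assumed to have been read from Bitcoin already (step 5).

```python
import hashlib
from typing import Callable, List, Optional, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling_hash, sibling_is_left), leaf to root
SigVerifier = Callable[[bytes, bytes, bytes], bool]  # (pubkey, message, signature)


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Assumed leaf encoding: SHA256(0x00 || key || value)
    return hashlib.sha256(b"\x00" + key + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumed interior encoding: SHA256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(key: bytes, value: bytes, proof: Proof, root: bytes) -> bool:
    """Fold the sibling hashes into the leaf hash and compare with the root."""
    acc = leaf_hash(key, value)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


def verify_lookup(flag_indicator: int, anchor_ref: bytes, value: bytes, proof: Proof,
                  onchain_root: bytes, verify_sig: SigVerifier) -> Optional[bytes]:
    """Return the flagger pubkey if every check passes, otherwise None."""
    if len(anchor_ref) != 32 or len(value) != 101:
        return None
    key = bytes([flag_indicator]) + anchor_ref                  # step 2
    if not verify_inclusion(key, value, proof, onchain_root):   # step 6
        return None
    pubkey, signature, timestamp = value[:33], value[33:97], value[97:]
    message = key + timestamp  # flag_indicator || AR || timestamp, as stored
    if not verify_sig(pubkey, message, signature):              # step 7
        return None
    return pubkey                                               # step 8
```

Nothing in this check consults the operator after the response is received, which is why a dishonest operator can withhold data but cannot make an invalid record pass.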

---

## Worked Example — The John and Alice Case

**Setup.** John creates an Orange Anchor with anchor reference `AR_John = SHA256(start_txid_John ‖ end_txid_John)`. Alice decides to flag John.

**Commit phase.** Alice constructs:

```
KEY   = 0x01 || AR_John
VALUE = Alice_pubkey || Alice_signature || timestamp
```

Where `Alice_signature` is Alice's signature over `0x01 || AR_John || timestamp`.

This `(KEY, VALUE)` pair is included in the operator's flag tree for epoch N. The operator builds the sorted Merkle tree, computes the 32-byte root, and commits it to Bitcoin via OP_RETURN.

**Query phase.** Bob wants to know whether John has been flagged. Bob has `AR_John` (anchor references are public; any party that has seen John's portable proof envelope holds it). Bob constructs:

```
lookup_key = 0x01 || AR_John
```

Bob queries any operator. The operator returns `Alice_pubkey, Alice_signature, timestamp` and a Merkle proof.

**Verify phase.** Bob reads the OP_RETURN root for epoch N from Bitcoin, verifies the Merkle proof against that root, and verifies Alice's signature over the canonical message.

**Result.** Alice flagged John at the recorded timestamp. The operator cannot have fabricated this; Alice cannot repudiate it.

---

## Multiple Flagger Resolution

The base design supports one flag per `(flag_indicator, anchor_reference)` pair per epoch. If multiple parties flag the same Orange Anchor in the same epoch, the key is extended with a 6-byte flagger short-id (matching *BAVAI Operator Specification v1.0* §6.4.9):

```
KEY = flag_indicator || anchor_reference || flagger_short_id
```

where `flagger_short_id = first 6 bytes of SHA256(flagger_anchor_reference)`, giving a 39-byte key. The 6-byte (48-bit) short-id keeps the key short while making accidental collisions unlikely at realistic scale: by the birthday approximation, the chance that any two flaggers of the same target share a short-id is about 0.2% at one million flaggers and rises to roughly 39% at 2^24 (about 16.8 million). Looking up a specific flag requires knowing the flagger's short-id, which works for direct queries. For enumeration of all flags against a specific Orange Anchor, the operator returns the list of keys sharing the 33-byte prefix `flag_indicator || anchor_reference`, after which the verifier requests proofs for each. Patricia Tries support prefix queries natively; with a sorted Merkle tree, keys that share a prefix are adjacent in sort order, so the operator only needs a simple index over the sorted key list.
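
The extended key, an operator-side prefix scan, and a birthday estimate for the 48-bit short-id can be sketched as follows. The function names are illustrative, and the scan assumes the operator keeps the epoch's keys in a sorted in-memory list, which is one possible indexing choice rather than a requirement of this design.

```python
import bisect
import hashlib
import math
from typing import List


def flagger_short_id(flagger_anchor_ref: bytes) -> bytes:
    """First 6 bytes of SHA256(flagger_anchor_reference)."""
    return hashlib.sha256(flagger_anchor_ref).digest()[:6]


def extended_key(flag_indicator: int, target_anchor_ref: bytes, flagger_anchor_ref: bytes) -> bytes:
    """flag_indicator || anchor_reference || flagger_short_id = 39 bytes."""
    return bytes([flag_indicator]) + target_anchor_ref + flagger_short_id(flagger_anchor_ref)


def keys_with_prefix(sorted_keys: List[bytes], prefix: bytes) -> List[bytes]:
    """All keys starting with prefix; they are contiguous in a sorted list."""
    lo = bisect.bisect_left(sorted_keys, prefix)  # first key >= prefix
    hi = lo
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


def collision_probability(n_flaggers: int, id_bits: int = 48) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(id_bits+1))."""
    return -math.expm1(-n_flaggers * (n_flaggers - 1) / 2 ** (id_bits + 1))
```

The estimate covers accidental collisions only; it says nothing about a party deliberately searching for a matching short-id.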

---

## Position Within Existing Literature

The components of this mechanism are well-established. The novelty lies in the architectural assembly applied to public, attributable Sybil-resistance flagging.

### Established Components

**Authenticated data structures** — Foundational cryptographic primitive dating to the 1980s. Merkle trees, Patricia Tries, sparse Merkle trees, and authenticated dictionaries all support proofs of inclusion against a published root.

**Bitcoin-anchored commitments via OP_RETURN** — The standard mechanism for committing application data to Bitcoin. OpenTimestamps (Peter Todd, 2016) is the canonical example, using Merkle trees for timestamping. Various protocols use OP_RETURN to anchor higher-level state.

**Verifiable key-value lookup with Merkle proofs** — Native to Ethereum's Patricia Trie state. Used in Certificate Transparency logs (Google, 2013) for attributing certificate issuance. Used in Git signed commits combined with Merkle trees for attributing code changes. Used in various blockchain event indexes for attributing token transfers.

### Direct Precedents

**Certificate Transparency** uses a Merkle-tree-backed log of certificate issuance, where each entry attributes a certificate to its issuer. Anyone can verify that a specific certificate was logged at a specific time. The pattern is identical in structure: authenticated index, public attribution, verifiable proofs against a published root.

**OpenTimestamps** anchors Merkle roots to Bitcoin via OP_RETURN. The verifiable attribution index extends this pattern with key-value semantics, where the key encodes the subject and the value encodes the actor.

**Signed Git commits with Merkle history** form an attribution chain in version control: each commit is signed by its author and chained through Merkle hashes. The Bitcoin-anchored index follows the same pattern and additionally anchors its history to a public commitment on Bitcoin.

### Why This Application Has Not Been Standard

The pattern has been technically available for years but has not been the default approach in Bitcoin specifically. Three reasons explain this.

Bitcoin culture has historically been reserved about application-layer use of OP_RETURN, even for commitment-only purposes. Some segments of the community view any non-monetary OP_RETURN data as borderline, even when the data is a single 32-byte commitment with no semantic content on-chain.

When developers wanted verifiable key-value lookups with cryptographic proofs, they typically used Ethereum or a similar chain where the state model is natively a Patricia Trie. Building the same functionality on Bitcoin requires more off-chain infrastructure, which felt like a regression compared to having it native to the chain.

The use cases that genuinely required Bitcoin-anchored attribution were narrow until recently. Timestamping needed only inclusion proofs. Token systems on Bitcoin (Stamps, Runes, BRC-20) had different shapes. Public attributable Sybil-resistance flagging is a use case that emerged with the BACC pattern and specifically benefits from Bitcoin's permanence and neutrality as the commitment substrate.

### What Is Genuinely Novel

The cryptography is not new. The Bitcoin-anchored commitment pattern is not new. The use of Merkle trees for attribution is not new. The Bitcoin-native architectural assembly applied to flagging of self-issued cost-backed digital objects is the new contribution. This contribution sits in the *application of well-understood techniques to a previously unaddressed problem*, not in invention of new cryptographic primitives.

The honest framing: standard primitives, novel architectural application.

---

## Architectural Properties

### Public

The index is fully public. Anyone can query, anyone can verify, anyone can read the on-chain root. There is no access control. This is intentional — public attribution is the mechanism by which flags acquire weight.

### Trustless Verification

Verification depends only on Bitcoin and the cryptographic content of the proof. No operator, indexer, or platform is in the trust path. Operators serve data and proofs; Bitcoin serves the root; verification is local.

### Extensible

The flag_indicator byte allows multiple types of attribution events within the same architectural pattern. Flags, disputes, endorsements, votes, and other public attributable events can all share the same index structure with different indicator bytes. The extension does not require any change to the on-chain commitment mechanism.

### Append-only with Temporal Dimension

Each epoch produces a new commitment. Older commitments remain on Bitcoin permanently. The temporal sequence forms an append-only log of all attribution events, with cryptographic guarantees against backdating or rewriting. Combined with timestamps in the value, this enables historical queries with full verifiability.

### Operator-as-Service-Not-Gatekeeper

This pattern is consistent with the broader BACC trust architecture. Operators provide convenience (storage, indexing, proof serving). Operators cannot affect verification correctness. If one operator becomes unavailable or unreliable, another can serve the same data, because any complete copy of the underlying tree can be checked against the public on-chain commitment before it is served.
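
A replacement operator, or anyone mirroring the data, could check a full copy of an epoch's records against the on-chain root along these lines. The hashing convention (`0x00` leaf prefix, `0x01` interior prefix, unpaired node promoted unchanged) is an assumption and must match whatever convention the original operator used; the function names are illustrative.

```python
import hashlib
from typing import Dict


def merkle_root(pairs: Dict[bytes, bytes]) -> bytes:
    """Recompute the sorted-tree root from a full set of (key, value) records."""
    if not pairs:
        raise ValueError("an epoch tree needs at least one entry")
    # Leaves in lexicographic key order; assumed encoding SHA256(0x00 || key || value).
    level = [hashlib.sha256(b"\x00" + k + pairs[k]).digest() for k in sorted(pairs)]
    while len(level) > 1:
        # Assumed interior encoding SHA256(0x01 || left || right).
        nxt = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # unpaired node is promoted unchanged
        level = nxt
    return level[0]


def copy_matches_commitment(pairs: Dict[bytes, bytes], onchain_root: bytes) -> bool:
    """True if this copy reproduces exactly the root committed on Bitcoin."""
    return merkle_root(pairs) == onchain_root
```

A copy that passes this check is byte-for-byte the committed record set, so proofs served from it verify exactly as proofs from the original operator would.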

---

## Limitations and Honest Trade-offs

### Public Disclosure

The mechanism is fundamentally public. Every flag reveals the flagger's pubkey to anyone querying. Privacy-preserving variants would require zero-knowledge layers or private set intersection techniques. These are possible extensions but are not part of the base design.

### Data Availability

If all operators disappear, the on-chain root remains but the underlying data needed to construct proofs is gone. Mitigations include periodic archival to IPFS or Arweave with additional on-chain pointers, multiple independent operators maintaining the index, and incentive structures for operator persistence. The mitigation strategy belongs to deployment, not to the core architectural pattern.

### Bootstrapping

The index is useful in proportion to the number of Orange Anchors using it and the number of attribution events recorded. Like any network-effect system, early-stage utility is lower than mature-stage utility. This is structural to the pattern, not a flaw.

### Query Cost

Single-key lookups are fast and cheap. Large prefix scans (enumerating all flags against a popular Orange Anchor) require operator-side indexing and may impose serving costs. With such an index in place, these costs scale with the number of matching flags rather than with the size of the whole index, which is acceptable for the expected query patterns.

### Operator Economics

Operators incur real costs for storage, computation, and proof serving. Sustainable operator economics requires some combination of fee structures, donations, public-goods funding, and volunteer maintenance. This is a deployment question rather than an architectural one, but it should be addressed in the broader operator specification.

---

## Naming and Communication

### Recommended Public Naming

For external communication and documentation, use **Bitcoin-anchored verifiable attribution index** or shorter forms when context is established:

- **Verifiable attribution index** — when BACC context is already understood
- **Anchored attribution index** — when Bitcoin substrate is implicit
- **Attribution index** — in technical contexts where the verifiable and anchored properties are stipulated

This naming is precise, descriptive, and aligned with existing terminology in cryptographic literature (authenticated data structure, transparency log, verifiable key-value commitment). It does not overclaim novelty and gives readers a recognisable conceptual frame.

### What Not to Call It

Avoid terminology that overclaims novelty or sounds proprietary. Avoid terminology that obscures the relationship to existing techniques. The mechanism gains credibility from being recognisable as a competent application of well-known primitives, not from being framed as something unprecedented.

### Reference Vocabulary for Documents

For the *Orange Anchor White Paper v2.3*, *BACC v1.9*, and companion materials, the consistent vocabulary is:

- *Flag* — an attribution event recorded in the index
- *Flag indicator* — the type byte distinguishing event categories
- *Anchor reference* — the canonical 32-byte identifier `SHA256(start_anchor_txid ‖ end_anchor_txid)` of the Orange Anchor being attributed against (the lookup-key subject)
- *Flagger* — the actor performing the attribution (preferred term; *raiser* is a deprecated synonym retained only in cross-references to earlier drafts)
- *Flagger short-id* — a 6-byte prefix of `SHA256(flagger_anchor_reference)` used in multi-flagger key extension
- *Flag record* — the (key, value) pair recording one attribution event
- *Index epoch* — the time period over which one Merkle root commits
- *Commitment root* — the on-chain Merkle root
- *Operator* — a service provider hosting the index data
- *Proof* — the Merkle path validating a specific entry against the root

---

## Integration with BACC

### Where the Mechanism Sits in the Architecture

The verifiable attribution index is the implementation mechanism for the *flag* role described in the BACC white paper §4.1 (Roles and Lifecycle). The white paper describes flagging at the architectural level: "a flagging party makes the standing of a commitment publicly disputable through a signed, persistent record." The verifiable attribution index is the concrete realisation of this — it provides the signed, persistent, publicly verifiable record.

### Relationship to Other BACC Components

The index does not interact with the production of Orange Anchor commitments themselves. Commitment production is independent and uses Bitcoin anchoring directly. The index is a separate Bitcoin-anchored structure that records attribution events about commitments after they exist.

The index is consistent with the BACC trust architecture. Operators serve as service providers without verification authority. Bitcoin is the source of truth for all on-chain commitments. Verification is local to anyone who reads the chain and queries an operator.

### Position in the Document Ecosystem

This document specifies the BAVAI index design. The *BAVAI Operator Specification v1.0* specifies the operator protocol surfaces — batching, indexing, flag publication, and the wire formats for attribution events. The *Orange Anchor White Paper v2.3* §6 describes the attribution index at the architectural level; the White Paper need not contain detailed mechanism specification, and a brief reference in §6 is sufficient for the architectural argument. The *Strategic Cost Calibration Model v1.3* treats flag effects as external inputs to attacker economics and does not engage with the index mechanism directly.

---

*BAVAI Reference v1.0 — Design specification for the Bitcoin-Anchored Verifiable Attribution Index. May 2026.*
