Privacy on Mir
Predicate Labs · 2021-3-18
TLDR: As blockchain usage grows, privacy is becoming increasingly important. However, existing private smart contract designs like ZEXE aren't practical at scale due to state growth. Mir offers a new design for scalable privacy.
Transparency is expensive
As more financial infrastructure is built on decentralized protocols, users face an increasing need for privacy. Without privacy, blockchains become an adversarial environment where every transaction is an arbitrage opportunity for miners and adversarial bots to exploit users. Transparent blockchains become Dark Forests.
On existing chains, transactions become public as soon as they're submitted to the network. As a result, arbitrage bots and miners can extract value from transactions before they're included in a block. This Miner Extractable Value (MEV) can include anything from frontrunning trades to censoring transactions.
We wanted a better financial system, instead we got an updated version of Robinhood selling order flow to Citadel. Blockchains need privacy.
Enter ZEXE (but there's a problem...)
ZEXE is an exciting academic work that offers privacy for smart contracts. It hides the amounts involved in a transaction (confidentiality), the transaction graph of who transacted with whom (anonymity), and even smart contract logic (function privacy).
But there's a problem. State growth is the biggest bottleneck for scaling blockchains (state is the data stored by validators to verify new transactions). On Ethereum, gas limits are lowered to control state growth, with the current state size at 50 gigabytes after operating for nearly six years.1
Unfortunately, the anonymity scheme used in ZEXE means that state size grows with each transaction (rather than with each active account or UTXO). This state can't be discarded or pruned like historical transaction data on Ethereum.
If an implementation of ZEXE like Aleo handled even 100 transactions per second, chain state would grow by 200 gigabytes per year. At 1000 tps, it would grow by 2 terabytes every year.2 This is a prohibitive barrier for scalability.
Anonymity and state growth
State growth is a problem because ZEXE uses the same anonymity scheme3 as Zcash. In this design, when a coin is spent, the spender generates a serial number unique to the coin. An observer can't determine which coin is spent in a transaction from the serial number (providing anonymity). Validators must check that each serial numbers appears only once in the ledger to prevent double spends.
This is a significant drawback because validators must store serial numbers forever.4 This problem has been discussed in relation to Zcash, but it's much worse for a smart contract platform with high throughput.
We can't have a system that adds 40x more state than the entire Ethereum chain every year. Decentralized protocols need privacy, but we also need to scale beyond 10 transactions per second.
Mir to the rescue
We propose a way for Mir to provide anonymity, confidentiality, and function privacy, without state growth. Validators on Mir only need to store 5 bits per active application, so we can provide privacy and scalability.
But this design also makes a tradeoff. In the nullifier scheme, the anonymity set for a transaction consists of every transaction in the blockchain's history. On Mir, the anonymity set for a transaction includes all transactions within a fixed-size block interval.5
In practice, we can make the anonymity set on Mir as large as necessary to provide sufficient privacy. It could include hundreds of thousands or millions of transactions (even bigger than the current anonymity set on Zcash6).
Most importantly, validators won't have to bear the cost of state growth forever, enabling privacy without limiting scalability.
A new design for anonymity
So how does Mir provide anonymity? We take a transparent transaction and split it, separating consumed records from created records.7
We generate a pre-nullifier from the set of created records. The transaction proof proves that the pre-nullifier is generated correctly.
The transaction sender submits the transaction and the pre-nullifier to the network. After the transaction is accepted, the sender generates a proof showing that the pre-nullifier is included in the chain, and sends the proof and the created records to the recipient.
The recipient generates a unique post-nullifier and provides a proof showing that the created commitments and post-nullifier are valid. The pre- and post-nullifiers ensure that an observer can't link transaction inputs and outputs, preserving anonymity. For more details, see the appendix.
For simplicity, we've batched output records together with a single post-nullifier. However, this design links commitments created in the same transaction. Instead, we can simply generate a separate pre- and post-nullifier for every created record.
No state growth
Validators check that post-nullifiers are included once within a set of blocks, preventing double spends. Validators can discard post-nullifiers after a certain number of blocks, eliminating state growth.
Conclusion
ZEXE is an exciting research project that shows that strong privacy guarantees are possible for smart contracts. Mir uses a different approach to offer privacy at scale. Scalability is important because everyone should have access to privacy, not just wealthy speculators who can afford to pay high transaction fees.
We plan to release a more formal specification of Mir's privacy design in the near future, and we welcome feedback and comments. You can find us on Discord and Telegram. We're building one of the most talented teams in crypto. Join us.
Thanks to Dev Ojha and Ashwin Ramachandran for feedback on earlier drafts.
Appendix
Here's a more detailed description of Mir's design.
First, we take a transaction that has some input and output records.
The recipient samples a secret $\mathrm{S}$ and generates $\mathrm{K = H(S, r)}$, for a CRH $\mathrm{H}$ and random nonce $\mathrm{r}$ and sends $\mathrm{K}$ to the transaction sender. This is equivalent to the address public key in ZEXE or Zcash.
Then the sender generates $\mathrm{C}$, a commitment to the created records. The sender generates the pre-nullifier $\mathrm{Pre = H(C, K, r)}$ for new random nonce $\mathrm{r}$. The transaction proof verifies that the pre-nullifier is generated correctly given the set of created records.
From here there are two variants.
Variant 1
The sender submits the transaction inputs and auxiliary data, the transaction proof, and the pre-nullifier to the network. After it is accepted, the sender queries an inclusion proof for the pre-nullifier in a block.
The sender generates a ZKP $\pi'$ with public inputs $\mathrm{R}$, the root of Merkle log consisting of recent blocks, $\mathrm{K}$, and $\mathrm{C}$. This ZKP verifies that a pre-nullifier $\mathrm{Pre}$ is included in the Merkle log with root $\mathrm{R}$.
The sender transmits the proof and created records to the transaction recipient. The recipient generates new hiding commitments $\mathrm{C'}$ for the created records, $\mathrm{Post = H(C, S)}$ and generates a proof $\pi$ with $\mathrm{C'}$, $\mathrm{R}$, and $\mathrm{Post}$ as public inputs that proves that:
- $\mathrm{Post}$ is generated correctly with created records $\mathrm{C}$ and secret $\mathrm{S}$.
- The proof $\pi'$ is valid and consistent with $\mathrm{K}$, $\mathrm{S}$, and $\mathrm{C}$.
- $\mathrm{C}$ and $\mathrm{C'}$ have the same preimage.
- $\mathrm{R}$ is within the desired block interval.
We send this proof to the network with the post-nullifier and created commitments. The post-nullifier is indistinguishable even to the transaction sender, given the hiding properties of the commitment
Variant 2
Alternately, the transaction creator could reveal the pre-nullifier, or the entire transaction, to the recipient. This allows the recipient to generate all proofs and submit them to the network. This presents a more similar transaction flow to Zcash or ZEXE, but at the expense of linking input commitments for the recipient.
However, in practice, these inputs should be anonymous, and revealing input commitments seems equivalent to revealing created commitments in a nullifier transaction. In any case, inputs can be further anonymized by creating a dummy transaction spending and creating an identical record.
Interactivity and transaction flow
In the most private version of Mir, the transaction sender submits a transaction to the network and then must wait until after the transaction is accepted to send it to the recipient. In the nullifier scheme, the commitment can be sent at any time, though in both Mir and the nullifier model, outputs are spendable only after users query an inclusion proof.
Alternately, the sender can accept a very slight anonymity penalty and open the pre-nullifier commitment to the recipient, revealing which input record commitments are used in the transaction (but only to the recipient). In practice input records are anonymized, so this should represent a negligible loss of privacy.
In this mode, the recipient can verify that the pre-nullifier has been accepted by the network, and generate all proofs independently. Commitments can be spent in the same block that a post-nullifier is submitted. This is equivalent to the nullifier model, with the caveat that the post-nullifier must be submitted within a (large) window of blocks.
Adapting the nullifier model to limit state growth
ZEXE or Zcash could accept a similar tradeoff to Mir, by periodically discarding commitments and nullifiers. This would cap state growth in exchange for a smaller anonymity set. This already happens in practice with turnstiling, when users reveal their coins to move between different shielded pools.
However, discarding old nullifiers would require users to periodically consume their records to move to the new nullifier set or lose their funds. From a UX perspective, this is unacceptable.
We can think of a Mir transaction as functionally equivalent to combining a shielded-shielded transaction with a shielded-unshielded transaction, without revealing balances or application logic.
Transaction reversal
The transaction creator might go offline, or an attacker may somehow censor a post-nullifier from being included in the ledger. In this case, we want to ensure that transactions can be reversed.
The transaction sender or recipient could provide a proof after the window for submitting a post-nullifier has expired. This proof would verify that a post-nullifier doesn't exist for a transaction. We can reverse a transaction, un-consuming the input records.8
Timing and nation-state attacks
Mir's privacy scheme is more susceptible to timing attacks, so wallets must vary the time interval between submitting pre- and post-nullifiers to the network.
Mir's privacy scheme can also be broken if an adversary pays transaction fees to entirely fill a block containing a targeted transaction. This would be enormously expensive for an adversary, and would just break anonymity, not confidentiality or function privacy.
Notes
Etherscan reports 130gb using Geth, with 50gb in the current state.
Each nullifier is 32 bytes, and each transaction outputs 2 nullifiers, or 64 bytes. At 100 tps for a year, this would total 200 gigabytes. We can compress this a bit on disk, but it remains an insurmountable scaling problem.
From Sander and Ta-Shma 1999, as in Zerocash/Zcash.
In theory it's possible to use algebraic accumulators like RSA accumulators to generate exclusion proofs without storing the entire nullifier set. Unfortunately, these proofs are extremely expensive to compute for large sets, and nodes would still need to verify that the accumulator is updated correctly after each block. It isn't practical to eliminate the requirement that nodes store the entire nullifier set.
To put it another way, this design relies on the observation that what matters for privacy is that the anonymity set has a minimum size, not that it's perpetually growing. With the moving window approach, we can set the anonymity set at or slightly above this minimum and achieve privacy without state growth.
As of January 2021, the total number of shielded transactions on Zcash was ~310,000.
We can think of records as UTXOs, or effectively snapshots of an application state.
In practice, transaction reversal should never happen, and is included only to address the case of a very small anonymity set consisting of a short block interval.
In theory, the transaction creator could privately prove that a nullifier doesn't exist in an authenticated data structure, but with fast block times, this proof would become invalid by the time it was published to the network.
Depending on the transaction flow, the sender can provide witness data to generate the proof for the post-nullifier, or can simply open a record commitment already on chain. In the first case, the recipient could identify which records are consumed in the transaction, but could also provide the tx reversal proof (and potentially replay the transaction).