Hardware Acceleration

Leading Problem

Business Models

3 Potential Scenarios

  1. Permissioned networks, e.g. StarkNet
  2. Permissionless networks where provers compete on cost, e.g. Mina’s Snarketplace
  3. Permissionless networks where provers compete on latency, e.g. Polygon Hermez & Polygon Zero

Selection metrics

  1. Performance: latency, throughput, and power consumption
  2. Total cost of ownership (capital expenditures and maintenance costs)
  3. NRE cost (non-recurring engineering: onboarding difficulty)
  4. Long-term comparative advantage: competitor performance may double in a year at the same MSRP (e.g. Intel and Nvidia GPUs)

Structure of ZKP Hardware Acceleration

Light Client

Leading Problem

Technology Highlight

Commit-and-prove Schemes

| Name | Mechanism | Prover Complexity | Verifier Complexity | Trusted Setup | Reference |
| --- | --- | --- | --- | --- | --- |
| Recursive SNARKs | Cycle of elliptic curves | High | Low | Yes | https://minaprotocol.com/wp-content/uploads/technicalWhitepaper.pdf |
| KVC (Key-Value Commitment) | Constant-size key-value map | Med-high | Medium | No | https://eprint.iacr.org/2020/1161.pdf |
| AMT (Authenticated Multipoint evaluation Tree) | Authenticates a polynomial multipoint evaluation at the first n roots of unity | Medium | Medium | No | https://people.csail.mit.edu/devadas/pubs/scalable_thresh.pdf |
| KZG Polynomial Commitment | Elliptic curve pairings with BLS12-381 | High | Low | Yes | https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf |
| Verkle Tree | (with anonymous revocation) | High | Low | No | https://math.mit.edu/research/highschool/primes/materials/2018/Kuszmaul.pdf |
| Vector Commitment | RSA and computational Diffie-Hellman over bilinear groups | High | Low | Yes | https://eprint.iacr.org/2011/495.pdf |
| One-to-many prover VSS | Prover batching with sumcheck and Fiat-Shamir | Medium | Low | No | https://www.usenix.org/system/files/sec22summer_zhang-jiaheng.pdf |

Arithmetic Fields

Leading Problem

Solutions

Reference Reading

Efficient Signatures

Leading Problems

Specialized ZK signatures

Native curves

Bignum techniques

CRT inspired methods

Other tricks

Edge cases

Conventional optimizations

Suggestions

  1. Use “ZK-native” signatures if possible (e.g. Picnic)
  2. Use “native” curves if possible (e.g. JubJub)
  3. In non-native curve arithmetic, e.g. Ed25519, combine the limb products with the same weight mod 255 (in bits); see the sketch after this list
    1. Minimize the number of splits by adding intermediate products with the same weight first. For example, when we have 2 intermediate products of weight 2^51, we can split the sum into 52 bits
    2. If we have an intermediate product with weight 2^255 or more, we can change its weight to 1 and multiply it by 19, since 2^255 = 19 mod p_ed25519
  4. Use a CRT-related method like Aztec’s approach: have the prover supply a purported product, then check the identity mod p_native so that large limb products can be ignored
  5. Using PLONK or STARK can cut down the cost significantly
  6. Check windowing in the scalar multiplication code
  7. Leverage the constant generator in one of the multiplications in EdDSA
  8. GLV endomorphisms can split a curve multiplication into two smaller ones on curves like secp256k1
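
To make suggestion 3 concrete, here is a minimal Python sketch of non-native Ed25519 field multiplication with 5 limbs of 51 bits each. The limb count, accumulator layout, and function names are illustrative choices, not taken from any particular library or circuit framework.

```python
# A minimal sketch of suggestion 3: multiplying two Ed25519 field elements
# represented as 5 x 51-bit limbs. Limb products sharing a weight 2^(51k)
# are summed before any splitting (3.1), and products of weight 2^255 or
# more are folded back via 2^255 = 19 (mod p) (3.2). In a circuit each
# accumulator would then be range-decomposed; plain Python integers stand
# in for that step here.
import random

P_ED25519 = 2**255 - 19

def to_limbs(x, n=5, w=51):
    """Split x into n limbs of w bits (weights 2^0, 2^51, ..., 2^204)."""
    return [(x >> (w * i)) & ((1 << w) - 1) for i in range(n)]

def mul_mod_p(a, b, w=51):
    al, bl = to_limbs(a), to_limbs(b)
    acc = [0] * 9                        # one accumulator per weight 2^(51k)
    for i in range(5):
        for j in range(5):
            acc[i + j] += al[i] * bl[j]  # 3.1: add same-weight products first
    for k in range(8, 4, -1):            # 3.2: 2^(51k) = 19 * 2^(51(k-5)) mod p
        acc[k - 5] += 19 * acc[k]
        acc[k] = 0
    return sum(c << (w * k) for k, c in enumerate(acc[:5])) % P_ED25519

# quick check against plain modular multiplication
a, b = random.randrange(P_ED25519), random.randrange(P_ED25519)
assert mul_mod_p(a, b) == (a * b) % P_ED25519
```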

Proof Aggregation

Leading Problem

When comparing zk-SNARKs, we are usually primarily interested in three different metrics: prover time, verifier time, and proof length - some may also want to add quantum security and setup trust assumptions to that list.

Within the context of blockchains, there is one further criterion which I would argue to be as important as those three, and we’ll give it the clunky name aggregation-friendliness.

A zk-SNARK with universal setup can be said to be aggregation-friendly if (informally) there exists some m such that it is easy to merge m proofs $P_1, \ldots, P_m$ into one proof $P_{1,\ldots,m}$, so that $P_{1,\ldots,m}$ is not much bigger than any single $P_i$, and certainly much smaller than all of them together. Specifically, there should exist a deterministic polynomial-time algorithm $\mathfrak{A}$ (which we call the aggregator) and $\mathfrak{V}_{\mathfrak{A}}$ (which we call the aggregation verifier) such that for any $P_1, \ldots, P_m$:
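
Written out, the informal requirements amount to something like the following. This is a paraphrase of the description above, not a verbatim definition, and the notation $\mathfrak{V}$ for the scheme's ordinary verifier is ours:

$$\mathfrak{A}(P_1,\ldots,P_m) = P_{1,\ldots,m}, \qquad |P_{1,\ldots,m}| \ll \sum_{i=1}^{m} |P_i|,$$

$$\mathfrak{V}_{\mathfrak{A}}(P_{1,\ldots,m}) = 1 \iff \mathfrak{V}(P_i) = 1 \ \text{for all}\ i \quad \text{(except with negligible probability)},$$

with $\mathfrak{A}$ running in deterministic polynomial time.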

So why is aggregation-friendliness so important to us? In blockchains, especially for scalability purposes, SNARKs are used to prove the validity of a bunch of statements at once. Now, if we want to prove a block of n transactions, each of (say, for simplicity) the same size m, with a single zk-SNARK, the prover time will typically be $O(nm \log nm)$. Often both proof length and verifier time are also non-constant in n. Bigger n therefore leads to significantly bigger costs, and furthermore, this doesn’t really scale with hardware parallelization or multiple provers.

On the other hand, imagine if the same proof system were aggregation-friendly, and let us for now ignore network limitations. You could have n different provers, each proving one transaction in time $O(m \log m)$, and then aggregating the proofs in time sublinear in n, therefore generating a block proof much more quickly. Hopefully the resulting proof would also not be much bigger than the original ones, nor much more difficult to verify.

To hammer the point home, assume we have two schemes, $S_1 := (P_1, V_1)$ and $S_2 := (P_2, V_2)$. $P_1$ is more efficient than $P_2$, with the respective proving times for a statement of length m being $t_1 = 5m\log_2 m$ and $t_2 = 10m\log_2^2 m$. However, while $S_2$ is aggregation-friendly, $S_1$ is not: there exists an aggregator $A$ for $S_2$ which can aggregate n proofs of a length-m statement in time $4\log_2^2 n$. In total, the proving time for $P_1$ over all n statements will be $5nm\log_2 nm$. On the other hand, we can run $P_2$ in parallel for each of the statements, and then run $A$ to aggregate them into a single proof, for a total time of

$$10m\log_2^2 m \cdot 4\log_2^2 n = 40m\log_2^2 m\,\log_2^2 n$$

Let’s look at how $t_1$ and $t_2$ compare for some different values of n and m:
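
The comparison can be recomputed directly from the formulas above; the sample values of n and m below are arbitrary illustrative choices:

```python
# t1 = 5nm*log2(nm) for the monolithic prover P1, and
# t2 = 40m*log2(m)^2*log2(n)^2 for n parallel runs of P2
# followed by the aggregator A, as derived above.
from math import log2

def t1(n, m):
    return 5 * n * m * log2(n * m)

def t2(n, m):
    return 40 * m * log2(m) ** 2 * log2(n) ** 2

m = 1_000
for n in (10, 1_000, 100_000, 10_000_000):
    print(f"n={n:>10,}  t1={t1(n, m):.2e}  t2={t2(n, m):.2e}")
# t1 wins for small n; t2 overtakes it as n grows.
```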

Here we see that, although $P_1$ performs significantly better than $P_2 + A$ for small n, it’s a very different story for big n, which should make it clear why aggregation-friendliness is important.

This post will focus on some different approaches to zk-SNARK proof aggregation. At a high level, there are three axes which divide them into categories:

Popular zk-SNARK Aggregation Constructions

Halo

As with so much of cryptography, zero-knowledge and otherwise, Halo is based on elliptic curves. Specifically, it uses a polynomial commitment scheme (PCS) which was previously used in another well-known zk-SNARK scheme, namely Bulletproofs: given an elliptic curve E (where we write $E(\mathbb{F})$ for the group defined by the points of E over the field $\mathbb{F}$ in which we work) and random points $P_i \in E(\mathbb{F})$, one can commit to a polynomial $p(x) = \sum_i a_i x^i$ with the element $\sum_i a_i \cdot P_i$, where $\cdot$ denotes scalar multiplication in the additive group $E(\mathbb{F})$. This commitment is also often written $\langle a, P \rangle$, as if it were an inner product.
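
Here is a toy model of this Pedersen-style commitment. For brevity it uses the multiplicative group of integers mod a prime rather than an elliptic curve group $E(\mathbb{F})$, so $\mathrm{commit}(p) = \prod_i g_i^{a_i}$ mirrors $\sum_i a_i \cdot P_i$; the parameters and names are illustrative only:

```python
# Toy Pedersen-style polynomial commitment in a multiplicative group.
# The additive homomorphism that Halo relies on later in this section
# is visible directly: commit(p1) "+" commit(p2) == commit(p1 + p2).
import random

Q = 2**127 - 1                                     # a Mersenne prime (toy size)
GENS = [random.randrange(2, Q) for _ in range(4)]  # the "random points" P_i

def commit(coeffs):
    """Commit to p(x) = sum_i a_i x^i as prod_i g_i^(a_i) mod Q."""
    c = 1
    for a, g in zip(coeffs, GENS):
        c = c * pow(g, a, Q) % Q
    return c

p1 = [random.randrange(Q - 1) for _ in range(4)]   # exponents mod group order
p2 = [random.randrange(Q - 1) for _ in range(4)]
sum_poly = [(a + b) % (Q - 1) for a, b in zip(p1, p2)]
assert commit(p1) * commit(p2) % Q == commit(sum_poly)
```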

The "inner product argument" (IPA) is a protocol between the prover and the verifier where the prover tries to convince the verifier that he knows the polynomial corresponding to his commitment. Roughly speaking, he does this by iteratively chopping his tuples a and P in two equally long tuples and then putting them together so as to get two new tuples of half the length. Eventually, after a logarithmic number of rounds, the prover is left with a single equation, where, almost if and only if he has acted honestly throughout will he be able to satisfy the verifier’s final challenge.

The problem is that, although the number of rounds is nice and small, the verifier still needs to compute two "inner products" of "vectors" of the original length. One may think that this ruins the protocol: the verifier’s work is now linear rather than logarithmic. However, this is where the recursive proof composition (aggregation) comes in:

The aggregation technique largely relies on an ad hoc polynomial $g_{u_1,\ldots,u_k}(x)$, defined in terms of the verifier’s challenges $u_i$ to the prover in the IPA. As it turns out, $g_{u_1,\ldots,u_k}(x)$ can be used to compute both of the "inner products"; we will focus on one of them. At the end of the IPA, the prover has a single elliptic curve point P, and by the construction of $g_{u_1,\ldots,u_k}(x)$, the commitment to it defined by the above scheme yields exactly P! The other "inner product" can also be defined as an evaluation of $g_{u_1,\ldots,u_k}(x)$. This does not help us directly, as evaluating the polynomial still takes linear time; however, the prover can now create such polynomials for many statements, commit to them, and then at the end prove that a linear combination of them corresponds to a desired value. This uses a property of the PCS which we have not mentioned thus far, namely that it is additively homomorphic (specifically, it is based on Pedersen commitments).
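
For reference, this polynomial has a compact product structure. Up to indexing and inversion conventions, which vary between the Halo paper and later write-ups, it looks like:

$$g_{u_1,\ldots,u_k}(x) = \prod_{i=1}^{k}\left(u_i + u_i^{-1}\,x^{2^{k-i}}\right)$$

Although it has $2^k$ coefficients, it can be evaluated at any point with only $O(k)$ field operations, which is what makes deferring and batching its openings worthwhile.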

To recap, the verifier previously had to do a linear-time operation for each statement. By invoking a certain polynomial with some nice properties, the prover can defer the final opening proof until he has computed and committed to arbitrarily many such polynomials. Halo people often talk about the "inner" and the "outer" circuit to reflect the distinction between the point of view of the prover (who finishes most of each proof before computing an aggregated proof for all together) and the verifier (for whom the proof really boils down to the linear-time computation she has to perform).

For specifics, we refer to the original Halo paper. The Halo approach is clearly algebraic in nature: it depends on the nice homomorphic properties of Pedersen commitments, as well as on a particular polynomial. Moreover, it is sequential: the prover iteratively adds commitments $G_1, G_2, \ldots$ to his inner proof, before opening a linear combination of them at the end of the protocol. Finally, it is recursion-based: the prover proves the validity of the previous inner proof to the current one.

Halo 2 is used by several protocols, and its main selling point is the aggregation property. Notably, Mina protocol uses Halo to create a constant-sized blockchain, in the sense that the validity of the current state is proven simply by verifying the validity of the proof of the previous state together with the transactions in the last block. Finally, a proof is computed, representing everything that has happened up to that state.
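
A schematic of this recursive pattern, with stub prove/verify functions standing in for the real SNARK machinery; the names and the hash-based tags are placeholders, not Mina's actual construction:

```python
# Each proof attests that the previous proof verified AND that the new
# block was applied, so verifying only the latest proof covers the whole
# history, and verification cost stays constant in chain length.
from dataclasses import dataclass
from hashlib import sha256
from typing import Optional

@dataclass
class Proof:
    state: str   # commitment to the chain state this proof attests to
    tag: str     # stand-in for the actual SNARK proof data

def prove_step(prev: Optional[Proof], block: str) -> Proof:
    """Stub prover: binds the previous proof and the new block together."""
    prev_tag = prev.tag if prev else "genesis"
    state = sha256((prev_tag + block).encode()).hexdigest()
    return Proof(state, sha256(("proof:" + state).encode()).hexdigest())

def verify(p: Proof) -> bool:
    """Stub verifier: one constant-time check, regardless of history length."""
    return p.tag == sha256(("proof:" + p.state).encode()).hexdigest()

p = None
for block in ["block1", "block2", "block3"]:
    p = prove_step(p, block)   # prover work per block
assert verify(p)               # one verification covers all three blocks
```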

Electric Coin Company, the developer of the scheme, recently incorporated Halo 2 as part of their upgrade to the Orchard shielded pools.

Halo 2 is also used by the zkEVM rollup project Scroll, in order to aggregate multiple blocks into one final proof and thereby amortize the L1 verification costs between them.

Plonky2