# BLS Signature Aggregation

## 1. Hierarchies

We have roughly four aggregation-like techniques that apply to BLS signatures. We briefly explain them and classify their possible layerings according to several criteria:

- performance or priority,
- mutual compatibility and natural layering,
- whether signature-side aggregation can be done securely outside the verifier, or whether the scheme is really just a local batching technique,
- and possible future work.

We have two linear aggregation strategies:

_Distinct message_ aggregation requires that verifiers compute one pairing for each message, which favours distinct messages being among the final/outermost aggregation techniques, and sounds compatible with the other aggregation techniques. We may aggregate the signature side for distinct messages early, meaning outside the verifier. In principle, we might aggregate the message hashes if an identical set of messages is signed by several signers, improving performance, but this requires some care. We might consider whether any threshold schemes support distinct message aggregation like this.

_Proof-of-possession_ requires that key material be previously established and verified by the protocol, but requires no masking with expensive exponentiations. See the security arguments in "The Power of Proofs-of-Possession: Securing Multiparty Signatures against Rogue-Key Attacks" by Thomas Ristenpart and Scott Yilek at https://eprint.iacr.org/2007/264.pdf

We expect this proof-of-possession assumption makes delinearization aggregation techniques unnecessary, although this might not be true if different security goals apply to different aggregation layers, maybe even in our para-chain vs relay chain situation. In particular, verifiers might either batch verifications, or prove their own correct behavior using ideas around random oracle delinearization.

We believe proof-of-possession must occur "below" other aggregation techniques because the first/lowest aggregator should itself validate the proof-of-possession, with later/higher aggregators avoiding this only by attributing all responsibility to the first/lowest.

We also have two delinearization strategies:

_Random oracle delinearization_ is the most robust and flexible aggregation technique, and also the slowest. It delinearizes by exponentiating both signatures and public keys with the result of a query to a random oracle that depends upon both the signer's public key and the set of all signing keys included in the signature. See the security arguments in "Compact Multi-Signatures for Smaller Blockchains" by Dan Boneh, Manu Drijvers, and Gregory Neven at https://eprint.iacr.org/2018/483.pdf

We envision random oracle delinearization coming "below" other aggregation techniques because doing it higher up requires access to the separate signatures, and we'd expect to simply pay the high cost throughout. That said, an initial validation layer might aggregate differently due to some trade-off in performance vs threat model, maybe with local small random maskings.

_Local small random delinearization_, aka _batching_, should normally occur entirely within each verifier, making it likely the final or outermost aggregation technique. Any such batching scheme takes a collection of signature triples (public key, message, signature) and considers them all simultaneously. We know batching is superfluous when combined with proof-of-possession, unless the proofs-of-possession come with different security assurances. We envision batching being superfluous when combined with random oracle delinearization too, provided all keys live in one key set. In principle, one might delinearize between different key sets locally though, which protects the signatures in different key sets from one another.
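To make random oracle delinearization concrete, here is a minimal sketch of how the per-signer exponents could be derived in the Boneh–Drijvers–Neven style, with each exponent a random-oracle output over the signer's public key together with the whole key set. The `PublicKey` and `Scalar` types and `hash_to_scalar` are hypothetical stand-ins: real code needs curve types and a cryptographic hash-to-field, not std's `DefaultHasher`.

```
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-ins for real curve types: a public key is opaque bytes here, and a
// delinearization exponent is a u64 rather than a scalar field element.
type PublicKey = Vec<u8>;
type Scalar = u64;

// Placeholder random oracle.  Real code must use a cryptographic
// hash-to-field, not std's non-cryptographic DefaultHasher.
fn hash_to_scalar(pk: &PublicKey, key_set: &[PublicKey]) -> Scalar {
    let mut h = DefaultHasher::new();
    pk.hash(&mut h);
    key_set.hash(&mut h); // the exponent depends on the *whole* key set
    h.finish()
}

// One exponent per signer.  The aggregate public key and aggregate signature
// are then formed as sum_i a_i * pk_i and sum_i a_i * sigma_i on their
// respective curves (not shown, as we have no curve arithmetic here).
fn delinearization_exponents(key_set: &[PublicKey]) -> Vec<Scalar> {
    key_set.iter().map(|pk| hash_to_scalar(pk, key_set)).collect()
}

fn main() {
    let keys: Vec<PublicKey> =
        vec![b"alice".to_vec(), b"bob".to_vec(), b"carol".to_vec()];
    println!("exponents: {:?}", delinearization_exponents(&keys));
}
```

Local batching would instead pick such exponents as small random values inside the verifier, rather than from a random oracle over the key set, which is why it cannot be delegated outside the verifier.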
We now see four likely hierarchies that yield nice APIs:

- If we have proofs-of-possession, then we simply aggregate freely and support distinct messages, but avoid delinearization by requiring that all proofs-of-possession have similar security assurances. Polkadot will use this for validators.
- If we pay the cost of random oracle delinearization, then we do so early, and support distinct message aggregation too, but require identical key sets only for signatures corresponding to the same message. Fog of Trust shall use roughly this.
- If we merely batch, then we do so at verification time, and delinearize only between keys corresponding to the same message. We can make the combined points cacheable using `BuildHasher`.
- If we do not want any duplicate messages, then callers can simply detect duplicates themselves and give an error. In this case, both the first and third options provide equivalent performance. Polkadot could use this for payments, as messages should mostly be distinct, but EdDSA batching sounds faster.

## 2. Performance

In a normal BLS signature, one verification equation is either `e(H_1(m), P_2) = e(sigma_1, G_2)` where `P_2 = G_2^p`, or `e(P_1, H_2(m)) = e(G_1, sigma_2)` where `P_1 = G_1^p`. We have `G_1` living in a curve defined over the base field `F_q`, while `G_2` lives in a curve defined over an extension field `F_{q^e}`, which makes `G_2` much bigger.

There are no exponentiations in a single BLS signature verification, so only the `H_1` vs `H_2` speeds influence verifiers' preferences, and verifiers prefer `H_1`. Signers prefer `H_1` too, assuming they sign many messages but cache their public key.

Aggregating distinct messages requires a pairing and an addition in the target group for each message: with `H_1`, the verifier checks that the product of `e(H_1(m_i), P_2^(i))` over all messages equals `e(sigma_1, G_2)`, where `sigma_1` sums the individual signatures. As the public keys are not aggregated here, their prepared point form may be cached, so presumably the `H_i` speed dominates for verifiers, magnifying the preference for `H_1` from the no-aggregation case.

Aggregating identical messages with proof-of-possession requires verifiers do additions in `P_{3-i}` when the signature uses `H_i`, so this eventually favours `H_2` with enough signatures per message, but the number is large. If the signing key sets are commonly identical, then again their combined prepared point form may be cached.

Aggregating identical messages with repeated signers under any delinearization scheme requires verifiers do exponentiations in `P_{3-i}` when the signature uses `H_i`, so verifiers favour `H_2` much sooner, although less so with batching than with random oracle delinearization. If the signer list is fixed, then perhaps one could cache the delinearized signers too, which delays the switch to `H_2`.

If we do switch to `H_2`, so that verifier exponentiations happen in `P_1`, then caching affine or prepared points for messages actually becomes more important, assuming the same messages get verified in different batches. In all this, proof-of-possession becomes much more tempting of course.

We expect threshold BLS signatures favour `H_2` much sooner for signers because they compute [commitments on the public key curve](https://github.com/poanetwork/threshold_crypto/blob/master/src/poly.rs#L578), which requires numerous exponentiations. In principle, verifiers might still favour `H_1`, so if the verifiers outnumber the signers by a wide margin then the system may become quite slow.
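Since identical signer sets recur, the combined, pairing-ready public key can be cached, keyed by the key set, as suggested above. Here is a minimal sketch of such a cache; `AggregateCache`, the byte-vector key type, and the dummy aggregation closure are hypothetical stand-ins, and a custom `BuildHasher` could be supplied to the `HashMap` to tune how key sets are hashed.

```
use std::collections::HashMap;

// Stand-ins: a compressed public key is opaque bytes, and the cached value
// could be an aggregated or "prepared" form of the whole key set.
type PublicKeyBytes = Vec<u8>;

struct AggregateCache<T> {
    cache: HashMap<Vec<PublicKeyBytes>, T>,
}

impl<T: Clone> AggregateCache<T> {
    fn new() -> Self {
        AggregateCache { cache: HashMap::new() }
    }

    // Return the cached aggregate for this key set, computing it with
    // `aggregate` (e.g. sum the keys on the curve and run point preparation)
    // on a miss.  We sort the keys so the cache key is order independent.
    fn get_or_insert<F>(&mut self, mut key_set: Vec<PublicKeyBytes>, aggregate: F) -> T
    where
        F: FnOnce(&[PublicKeyBytes]) -> T,
    {
        key_set.sort();
        self.cache
            .entry(key_set)
            .or_insert_with_key(|ks| aggregate(ks))
            .clone()
    }
}

fn main() {
    let mut cache: AggregateCache<usize> = AggregateCache::new();
    let validators = vec![b"alice".to_vec(), b"bob".to_vec()];
    // A dummy "aggregate": just count the keys.  Real code would add the
    // public keys on the curve and cache the prepared (pairing-ready) point.
    let a = cache.get_or_insert(validators.clone(), |ks| ks.len());
    let b = cache.get_or_insert(validators, |_| unreachable!("cache hit expected"));
    assert_eq!(a, b);
}
```

A real implementation would store the prepared aggregate public key in place of the `usize` used here.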
## 3. Operations

Serialization and deserialization go from and to affine elements, but deserialization requires an expensive scalar-multiplication (subgroup) check. Anything algebraic, like scalar multiplication, results in projective points, but may or may not operate faster when given affine points. Hashing produces projective points too. Affine's `mul_bits` function has less short-circuiting logic, but still remains non-constant-time. If affine is faster, then we should support cached affine points for delinearization.

Pairing requires prepared points, which one obtains from affine points. It appears the preparation step is expensive only on the big curve.

We think conversion to affine always benefits from being batched, so currently we batch them whenever possible. We could cache affine or prepared points, but only if they'll actually be repeated, as we cannot do algebra on them. We need to check benchmarks for all these operations.

Pippenger's algorithm: https://github.com/rust-lang/rfcs/pull/2289#issuecomment-409337644

```
// Proposed signature for batched conversion to affine (normalization).
fn batch_normalization<II>(v: &mut II)
where
    for<'a> &'a mut II: IntoIterator,
    for<'a> <&'a mut II as IntoIterator>::IntoIter: DoubleEndedIterator + ExactSizeIterator,
```

- ECDSA: https://cr.yp.to/badbatch/badbatch-20120919.pdf
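Batched conversion to affine is typically implemented with Montgomery's simultaneous-inversion trick, which replaces one field inversion per point by a single inversion plus a few multiplications per point. Below is a minimal self-contained sketch of the trick over a toy prime field rather than real curve coordinates; the backwards pass over the batch is presumably what motivates the `DoubleEndedIterator + ExactSizeIterator` bounds in the signature above.

```
// Minimal sketch of Montgomery's simultaneous-inversion trick, the core of
// batch normalization: inverting n field elements naively costs n inversions,
// but only one inversion plus ~3n multiplications when batched.  This uses a
// toy prime field, not curve points, and assumes every element is nonzero.

const P: u128 = 2_147_483_647; // 2^31 - 1, a convenient toy prime

fn mul(a: u128, b: u128) -> u128 { (a * b) % P }

// Single inversion via Fermat's little theorem: a^(p-2) mod p.
fn inv(a: u128) -> u128 {
    let mut base = a % P;
    let mut exp = P - 2;
    let mut acc = 1u128;
    while exp > 0 {
        if exp & 1 == 1 { acc = mul(acc, base); }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

// Invert every element of `v` in place using one call to `inv`.
fn batch_inversion(v: &mut [u128]) {
    // prefix[i] = v[0] * v[1] * ... * v[i-1]
    let mut prefix = Vec::with_capacity(v.len());
    let mut acc = 1u128;
    for x in v.iter() {
        prefix.push(acc);
        acc = mul(acc, *x);
    }
    // One expensive inversion of the full product.
    let mut acc_inv = inv(acc);
    // Walk backwards, peeling off one inverse at a time.
    for (x, pre) in v.iter_mut().zip(prefix.into_iter()).rev() {
        let x_inv = mul(acc_inv, pre);
        acc_inv = mul(acc_inv, *x);
        *x = x_inv;
    }
}

fn main() {
    let mut zs = vec![3u128, 7, 12_345, 999_983];
    let original = zs.clone();
    batch_inversion(&mut zs);
    for (z, z_inv) in original.iter().zip(zs.iter()) {
        assert_eq!(mul(*z, *z_inv), 1);
    }
    println!("batched inverses verified: {:?}", zs);
}
```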