The Hunting of the SNARK∗

Nir Bitansky† Ran Canetti‡ Alessandro Chiesa§ Shafi Goldwasser§ Huijia Lin¶ Aviad Rubinstein† Eran Tromer†

July 24, 2014

Abstract

The existence of succinct non-interactive arguments for NP (i.e., non-interactive computationally-sound proofs where the verifier’s work is essentially independent of the complexity of the NP nondeterministic verifier) has been an intriguing question for the past two decades. Other than CS proofs in the random oracle model [Micali, FOCS ’94], the only existing candidate construction is based on an elaborate assumption that is tailored to a specific protocol [Di Crescenzo and Lipmaa, CiE ’08].

We formulate a general and relatively natural notion of an extractable collision-resistant hash function (ECRH) and show that, if ECRHs exist, then a modified version of Di Crescenzo and Lipmaa’s protocol is a succinct non-interactive argument for NP. Furthermore, the modified protocol is actually a succinct non-interactive adaptive argument of knowledge (SNARK). We then propose several candidate constructions for ECRHs and relaxations thereof.

We demonstrate the applicability of SNARKs to various forms of delegation of computation, to succinct non-interactive zero knowledge arguments, and to succinct two-party secure computation. Finally, we show that SNARKs essentially imply the existence of ECRHs, thus demonstrating the necessity of the assumption.

Going beyond ECRHs, we formulate the notion of extractable one-way functions (EOWFs). Assuming the existence of a natural variant of EOWFs, we construct a 2-message selective-opening-attack secure commitment scheme and a 3-round zero-knowledge argument of knowledge. Furthermore, if the EOWFs are concurrently extractable, the 3-round zero-knowledge protocol is also concurrent zero-knowledge. Our constructions circumvent previous black-box impossibility results regarding these protocols by relying on EOWFs as the non-black-box component in the security reductions.

∗ This research was supported by the Check Point Institute for Information Security, by the Israeli Centers of Research Excellence (I-CORE) program (center No. 4/11), by the European Community’s Seventh Framework Programme (FP7/2007-2013) grant 240258, by a European Union Marie Curie grant, by the Israeli Science Foundation, and by the Israeli Ministry of Science and Technology. This paper is a merge of [BCCT11] and [GLR11]. A preliminary version including part of the results appeared in ITCS 2012.
† Tel Aviv University, {nirbitan,tromer,aviadrub}@tau.ac.il
‡ Boston University and Tel Aviv University, canetti@tau.ac.il
§ MIT, {alexch,shafi}@csail.mit.edu
¶ Boston University and MIT, huijia@csail.mit.edu

1 Introduction

For the Snark’s a peculiar creature, that won’t
Be caught in a commonplace way.
Do all that you know, and try all that you don’t:
Not a chance must be wasted to-day!

The Hunting of the Snark, Lewis Carroll

The notion of interactive proof systems [GMR89] is central to both modern cryptography and complexity theory. One extensively studied aspect of interactive proof systems is their expressiveness; this study culminated with the celebrated result that IP = PSPACE [Sha92]. Another aspect of such systems, which is the focus of this work, is that proofs for rather complex NP-statements can potentially be verified much faster than by direct checking of an NP witness. We know that if statistical soundness is required then any non-trivial savings would cause unlikely complexity-theoretic collapses (see, e.g., [BHZ87, GH98, GVW02, Wee05]). However, if we settle for proof systems with only computational soundness (also known as interactive arguments [BCC88]) then significant savings can be made. Indeed, using collision-resistant hash functions (CRHs), Kilian [Kil92] shows a four-message interactive argument for NP: the prover first uses a Merkle hash tree to bind itself to a polynomial-size PCP (Probabilistically Checkable Proof) string for the statement, and then answers the PCP verifier’s queries while demonstrating consistency with the Merkle tree. This way, membership of an instance y in an NP language L can be verified in time that is bounded by p(k, |y|, log t), where t is the time to evaluate the NP verification relation for L on input y, p is a fixed polynomial independent of L, and k is a security parameter that determines the soundness error. Following tradition, we call such argument systems succinct.

Can we have succinct argument systems which are non-interactive? Having posed and motivated this question, Micali [Mic00] provides a one-message succinct non-interactive argument for NP, in the random oracle model, by applying the Fiat-Shamir paradigm [FS87] to Kilian’s protocol.
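
As a rough illustration of the hash-then-open step in Kilian’s protocol, the sketch below commits to a short string with a Merkle tree and opens a single position with an authentication path. It is a minimal sketch under assumptions that do not come from the text: SHA-256 stands in for the collision-resistant hash, the number of leaves is a power of two, and the names `build_tree`, `open_leaf`, and `verify_opening` are hypothetical. The PCP itself is not modeled; the leaves are placeholder data.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    # SHA-256 is only a stand-in for the CRH (an assumption of this sketch).
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all tree levels: levels[0] holds leaf hashes, levels[-1] == [root]."""
    n = len(leaves)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of leaves must be a power of two")
    # Distinct prefixes separate leaf hashes from inner-node hashes.
    level = [_hash(b"\x00" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [
            _hash(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def open_leaf(levels, index):
    """Authentication path for leaf `index`: one sibling hash per level."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


def verify_opening(root, index, leaf, path):
    """Recompute the root from the claimed leaf and its authentication path."""
    node = _hash(b"\x00" + leaf)
    for sibling in path:
        if index % 2 == 0:
            node = _hash(b"\x01" + node + sibling)
        else:
            node = _hash(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

For a committed string of n positions, the prover sends the root `build_tree(leaves)[-1][0]` first and later answers a query at position i with the leaf and `open_leaf(levels, i)`. The path holds log2(n) hashes, which is consistent with the verifier’s work growing only logarithmically in the length of the committed string.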
In the standard model, such “totally non-interactive” succinct arguments (against non-uniform provers) do not exist except for “quasi-trivial” languages (i.e., languages in $\mathrm{BPtime}(n^{\mathrm{polylog}\, n})$), because the impossibility results for statistical soundness can be directly extended to this case. Nonetheless, it may still be possible to obtain a slightly weaker notion of non-interactivity:

Definition 1.1. A succinct non-interactive argument (SNARG) is a succinct argument where the verifier (or a trusted entity) generates ahead of time a succinct verifier-generated reference string (VGRS) and sends it to the prover. The prover can then use the VGRS to generate a succinct non-interactive proof π for a statement y of his choice. (The VGRS is thus independent of the statements to be proven later, and the definition requires “adaptive soundness”, since y is chosen by the prover, potentially based on the VGRS.)

In this paper, we consider the following questions:

Can SNARGs for NP exist in the standard model? And if so, under what assumptions can we prove their existence?

Attempted solutions. To answer the above question, Aiello et al. [ABOR00] propose to avoid Kilian’s hash-then-open paradigm, and instead use a polylogarithmic PIR (Private Information Retrieval) scheme to access the PCP oracle as a long database. The verifier’s first message consists of the queries of the underlying PCP verifier, encrypted using the PIR chooser algorithm. The prover applies the PIR sender algorithm to the PCP oracle, and the verifier then runs the underlying PCP verifier on the values obtained from the PIR protocol. However, Dwork et al. [DLN+04] point out that this “PCP+PIR approach” is inherently problematic, because a cheating prover could “zigzag” and answer different queries according to different databases.^1 (Natural extensions that try to force consistency by using multiple PIR instances run into trouble due to potential PIR malleability.)

Di Crescenzo and Lipmaa [DCL08] propose to address this problem by further requiring the prover to bind itself (in the clear) to a specific database using a Merkle Tree (MT) as in Kilian’s protocol. Intuitively, the prover should now be forced to answer according to a single PCP string. In a sense, this “PCP+MT+PIR approach” squashes Kilian’s four-message protocol down to two messages “under the PIR”. However, while initially appealing, it is not a-priori clear how this intuition can be turned into a proof of security under some well defined properties of the Merkle tree hash. Indeed, to prove soundness of their protocol Di Crescenzo and Lipmaa use an assumption that is non-standard in two main ways: first, it is a “knowledge assumption,” in the sense that any adversary that generates a value of a certain form is assumed to “know” a corresponding preimage (see more discussion on such assumptions below). Furthermore, their assumption is very specific and intimately tied to the actual hash, PIR, and PCP schemes in use, as well as the language under consideration. Two other non-interactive arguments for NP, based on more concise knowledge assumptions, are due to Mie [Mie08] and Groth [Gro10]. However, neither of these protocols is succinct: in both protocols the verifier’s runtime is polynomially related to the time needed to directly verify the NP witness.

Recently, Gentry and Wichs [GW11] showed that some of the difficulty is indeed inherent by proving that no SNARG construction can be proved secure via a black-box reduction to an efficiently falsifiable assumption [Nao03]. For example, the assertion that one-way functions exist or that fully-homomorphic encryption exists are both falsifiable assumptions; in general, an assumption is efficiently falsifiable if it can be modeled as a game between an adversary and a challenger, where the challenger can efficiently decide whether the adversary has won the game. The impossibility result of Gentry and Wichs holds even for designated-verifier protocols, where the verifier needs secret randomness in order to verify. This suggests that non-standard assumptions, such as the knowledge (extractability) assumptions described next, may be inherent.
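
To make the “game with an efficient challenger” shape of a falsifiable assumption concrete, here is a minimal sketch of a one-wayness game. Everything in it is an illustrative assumption rather than something taken from the paper: SHA-256 stands in for a candidate one-way function, and the class `OneWaynessChallenger` and its method names are hypothetical.

```python
import hashlib
import secrets


def f(x: bytes) -> bytes:
    # Stand-in candidate one-way function (an assumption of this sketch).
    return hashlib.sha256(x).digest()


class OneWaynessChallenger:
    """Challenger for the game 'invert f on a random input'."""

    def __init__(self, input_len: int = 16):
        self._x = secrets.token_bytes(input_len)  # secret preimage
        self.challenge = f(self._x)               # value handed to the adversary

    def adversary_wins(self, guess: bytes) -> bool:
        # Deciding the winner costs a single evaluation of f, so the
        # challenger is efficient; this is what makes the game falsifiable.
        return f(guess) == self.challenge
```

An adversary receives `challenge` and submits a guess; `adversary_wins` settles the outcome with one evaluation of `f`. A knowledge assumption does not obviously fit this mold, since it asserts that an extractor exists for every adversary rather than describing a winning condition that a challenger can check.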

Knowledge (extractability) assumptions. Knowledge (or extractability) assumptions capture our belief that certain computational tasks can be achieved efficiently only by (essentially) going through specific intermediate stages and thereby obtaining, along the way, some specific intermediate values. Such an assumption asserts that, for any efficient algorithm that achieves the task, there exists a knowledge extractor algorithm that efficiently recovers the said intermediate values. A number of different extractability assumptions exist in the literature, most of which are specific number theoretic assumptions (such as several variants of the knowledge of exponent assumption [Dam92]). It is indeed hard to gain assurance regarding their relative strengths. Abstracting from such specific assumptions, one can formulate general notions of extractability for one-way functions and other basic primitives (see [CD09, Dak09]). That is, say that a function family $F$ is extractable if, given a random $f \leftarrow F$, it is infeasible to produce $y \in \mathrm{Image}(f)$ without actually “knowing” $x$ such that $f(x) = y$. This is expressed by saying that for any efficient adversary $A$ there is an efficient extractor $E_A$ such that, if $A(f) = f(x)$ for some $x$, then $E_A(f)$ almost always outputs $x'$ such that $f(x') = f(x)$. Typically, for such a family

(^1) The problem becomes evident when implementing the PIR using fully-homomorphic encryption; indeed, since any efficient adversarial strategy can be executed “under the encryption”, such a solution would be as insecure as sending the PCP queries in the clear.

A SNARG for NP could improve on these by minimizing both interaction and the verifier’s computational effort. (Note that adaptive soundness of the SNARG seems crucial for this application since a cheating worker might choose a bogus result of the computation based on the delegator’s first message.^2) However, the application to delegation schemes brings with it additional security concerns. For example, the untrusted worker may store a long database $z$ whose short Merkle hash $h = \mathrm{MT}(z)$ is known to the delegator; the delegator may then ask the worker to compute $F(z)$ for some function $F$. However, from the delegator’s perspective, merely being convinced that “there exists $\tilde{z}$ such that $h = \mathrm{MT}(\tilde{z})$ and $F(\tilde{z}) = f$” is not enough. The delegator should also be convinced that the worker knows such a $\tilde{z}$, which implies, due to collision resistance of MT, that indeed $\tilde{z} = z$. Thus, the delegator may not only be interested in establishing that a witness for a claimed theorem exists, but also want that such a witness can be extracted from a convincing prover. That is, we require proof of knowledge (or rather, an argument of knowledge) and thus SNARKs (rather than merely SNARGs) are needed. The ability to efficiently extract a witness for an adaptively-chosen theorem seems almost essential for making use of a delegation scheme when the untrusted worker is expected to contribute its own input (such as a database, as in the above example, or a signature, and so on) to a computation. Another application where adaptive proofs of knowledge are crucial is recursive proof composition, a technique that has already been shown to enable desirable cryptographic tasks [Val08, CT10, BSW11, BCCT13].

1.2 ECRHs, SNARKs, and Applications

(i) Extractable collision-resistant hash functions. We start by defining a natural strengthening of collision-resistant hash functions (CRHs): a function ensemble $H = \{H_k\}_k$ is an extractable CRH (ECRH) if (a) it is collision-resistant in the standard sense, and (b) it is extractable in the sense that for any efficient adversary that is able to produce a valid evaluation of the function there is an extractor that is able to produce a corresponding preimage. More precisely, extractability is defined as follows:

Definition 1. A function ensemble $H = \{H_k\}_k$ mapping $\{0,1\}^{\ell(k)}$ to $\{0,1\}^k$ is extractable if for any polynomial-size adversary $A$ there exists a polynomial-size extractor $E_A$ such that for large enough security parameter $k \in \mathbb{N}$ and any auxiliary input $z \in \{0,1\}^{\mathrm{poly}(k)}$:

$$
\Pr_{h \leftarrow H_k}\left[
\begin{array}{c} y \leftarrow A(h, z) \\ \exists\, x : h(x) = y \end{array}
\;\wedge\;
\begin{array}{c} x' \leftarrow E_A(h, z) \\ h(x') \neq y \end{array}
\right] \leq \mathrm{negl}(k).
$$

We do not require that there is an efficient way to tell whether a given string in $\{0,1\}^k$ is in the image of a given $h \in H_k$. We note that:

  • For extractability and collision resistance (or one-wayness) to coexist, it should be hard to “obliviously sample images”; in particular, this implies that the image of almost any $h \in H_k$ should be sparse in $\{0,1\}^k$, i.e., with cardinality at most $2^{k - \omega(\log k)}$. (This remark is a bit over-simplified and not entirely accurate; see discussion in Section 6.1.)
  • For simplicity of exposition, the above definition accounts for the most general case of arbitrary polynomial-size auxiliary-input; this is, in fact, too strong and cannot be achieved assuming indistinguishability obfuscation [BCPR14, BP13]. However, for our main result, we can actually settle for a relaxed definition that only considers a specific distribution over auxiliary inputs of a-priori bounded size. See further discussion in Section 6.1.

(^2) More precisely, this seems to be the case when the worker contributes an input to the computation. In contrast, when the worker does not contribute any inputs, the delegator can use a SNARG with non-adaptive soundness by requesting two proofs for each bit of the claimed output.
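
The event bounded in Definition 1 can be spelled out as a small experiment. The sketch below is only meant to show which runs count as extraction failures; it rests on toy assumptions that are not part of the paper: the family `x -> a*x mod 257` on a 16-element domain is neither collision-resistant nor extractable in any meaningful sense, image membership is decided by brute force, and all function names are hypothetical.

```python
import secrets

P = 257              # toy range {0, ..., 256}
DOMAIN = range(16)   # toy domain, so the image is sparse in the range


def sample_h():
    """Sample a member of the toy family: x -> a * x mod P."""
    a = 1 + secrets.randbelow(P - 1)
    return lambda x: (a * x) % P


def honest_adversary(h, z):
    # Produces an image by actually evaluating h on an input derived from z.
    return h(z % 16)


def honest_extractor(h, z):
    # Re-derives the input that honest_adversary used.
    return z % 16


def extraction_failures(adversary, extractor, z, trials=200):
    """Count runs where y is in the image but the extractor misses a preimage."""
    failures = 0
    for _ in range(trials):
        h = sample_h()
        y = adversary(h, z)
        in_image = any(h(x) == y for x in DOMAIN)  # brute force, toy sizes only
        x_prime = extractor(h, z)
        if in_image and (x_prime is None or h(x_prime) != y):
            failures += 1
    return failures
```

Runs in which the adversary outputs a value outside the image never count as failures, matching the conjunction inside the probability. The definition asks that, for every adversary, some extractor makes the failure frequency negligible in k.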

(ii) From ECRHs to adaptive succinct arguments of knowledge, and back again. We modify the “PCP+MT+PIR” construction of [DCL08] and show that the modified construction can be proven to be a SNARK based solely on the existence of ECRHs and PIR schemes with polylogarithmic complexity.^3

Theorem 1 (informal). If there exist ECRHs and (appropriate) PIRs then there exist SNARKs (for NP).

A single VGRS in our construction suffices for only logarithmically many proofs; however, since the VGRS can be succinctly generated, the cost of occasionally resending a new one is limited.

We complement Theorem 1 by showing that ECRHs are in fact essential for SNARKs:

Theorem 2 (informal). If there exist SNARKs and (standard) CRHs then there exist ECRHs.

More accurately, we show that SNARKs and CRHs imply a slightly relaxed notion of ECRHs that we call proximity ECRHs, and which is still sufficient for our construction of SNARKs. To simplify the exposition of our main results we defer the discussion of the details of this relaxation to Section 1.3.

We also show that SNARKs can be used to construct extractable variants of other cryptographic primitives. A naïve strategy to obtain this may be to “add a succinct proof of knowledge of a preimage to the output”. While this strategy does not work as such because the proof may leak secret information, we show that in many cases this difficulty can be overcome by combining SNARKs with (non-extractable) leakage-resilient primitives. For example, since CRHs and subexponentially-hard OWFs are leakage-resilient, we obtain:

Theorem 3 (informal). Assume SNARKs and (standard) CRHs exist. Then there exist extractable one-way functions and extractable computationally hiding and binding commitments. Alternatively, if there exist SNARKs and (standard) subexponentially-hard one-way functions then there exist extractable one-way functions. Furthermore, if these functions are one-to-one, then we can construct perfectly-binding computationally-hiding extractable commitments.

We believe that this approach merits further investigation. One question, for example, is whether extractable pseudorandom generators and extractable pseudorandom functions can be constructed from generic extractable primitives (as was asked and left open in [CD09]). Seemingly, our SNARK-based approach can be used to obtain the weaker variants of extractable pseudo-entropy generators and pseudo-entropy functions, by relying on previous results regarding leakage-resilience of PRGs [DK08, RTTV08, GW11] and leakage-resilient pseudo-entropy functions [BHK11].

(iii) Applications of SNARKs. As discussed earlier, SNARKs directly enable non-interactive delegation of computation, including settings where the delegator has a very long input or where the worker supplies his own input to the computation. An important property of SNARK-based delegation is that it does not require expensive preprocessing and (as a result) soundness can be maintained even when the prover learns the verifier’s responses between successive delegation sessions, because a fresh VGRS can simply be resent each time. In addition, SNARKs can be used to obtain zkSNARKs, that is, zero-knowledge succinct non-interactive arguments of knowledge in the common reference string (CRS) model. We provide two constructions to do so, depending on whether the succinct argument is “on top or below” the NIZK.

(^3) More precisely, we shall require PIR schemes with polylogarithmic complexity where a fixed polynomial bound for the database size is not required by the query algorithm. See Section 1.4 for more details.

(or t-KEA for short) proceeds as follows. For any polynomial-size adversary, there exists a polynomial-size extractor such that, on input $g_1, \ldots, g_t, g_1^{\alpha}, \ldots, g_t^{\alpha}$ where each $g_i$ is a random generator (of an appropriate group) and $\alpha$ is a random exponent: if the adversary outputs $(f, f^{\alpha})$, then the extractor finds a vector of “coefficients” $(x_1, \ldots, x_t)$ such that $f = \prod_{i \in [t]} g_i^{x_i}$. This assumption can be viewed as a simplified version of the assumption used by Groth in [Gro10] (the formal relation between the assumptions is discussed in Section 8.1). Similarly to Groth’s assumption, t-KEA holds in the generic group model.

Theorem 4 (informal). If t-KEA holds in a group where taking discrete logs is hard, then there exists an ECRH whose compression is proportional to t.

The construction is straightforward: the function family is parameterized by $(g_1, \ldots, g_t, g_1^{\alpha}, \ldots, g_t^{\alpha})$. Given input $(x_1, \ldots, x_t)$, the function outputs the two group elements $\left(\prod_{i \in [t]} g_i^{x_i},\ \prod_{i \in [t]} g_i^{\alpha x_i}\right)$. Extractability directly follows from t-KEA, while collision resistance is ensured by the hardness of taking discrete logs. See Section 8.1 for more details.

Next we proceed to propose ECRH candidates that are based on the subset sum problem in finite groups. Here, however, we are only able to construct candidates for somewhat weaker variants of ECRHs that are still sufficient for constructing SNARKs. While these variants are not as elegantly and concisely stated as the “vanilla” ECRH notion, they are still natural. Furthermore, we can show that these variants are necessary for SNARKs. We next proceed to formulate these weaker variants.
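
The t-KEA-based hash described above can be sketched in a few lines. This is an illustration under loudly stated assumptions, not the paper’s instantiation: the group is the order-1019 subgroup of squares modulo the toy safe prime 2039, in which discrete logs are trivial, and the names `keygen`, `hash_eval`, and `is_well_formed` are hypothetical. The well-formedness check uses the secret exponent and is included only to show the relation between the two output elements.

```python
import secrets

P = 2039  # toy safe prime, P = 2*Q + 1 (far too small for any security)
Q = 1019  # prime order of the subgroup of squares modulo P


def random_generator():
    """Random non-identity square mod P; it generates the order-Q subgroup."""
    r = 2 + secrets.randbelow(P - 3)  # r in [2, P-2], so r*r != 1 mod P
    return pow(r, 2, P)


def keygen(t):
    """Return the public key (g_1..g_t, g_1^alpha..g_t^alpha) and the secret alpha."""
    alpha = 1 + secrets.randbelow(Q - 1)
    gs = [random_generator() for _ in range(t)]
    gs_alpha = [pow(g, alpha, P) for g in gs]
    return (gs, gs_alpha), alpha


def hash_eval(key, xs):
    """Map (x_1..x_t) to (prod g_i^x_i, prod g_i^(alpha*x_i)) using only the key."""
    gs, gs_alpha = key
    f, f_alpha = 1, 1
    for g, g_a, x in zip(gs, gs_alpha, xs):
        f = f * pow(g, x, P) % P
        f_alpha = f_alpha * pow(g_a, x, P) % P
    return f, f_alpha


def is_well_formed(image, alpha):
    """Trapdoor check (needs the secret alpha): second element equals first^alpha."""
    f, f_alpha = image
    return pow(f, alpha, P) == f_alpha
```

An input of t exponents is mapped to two group elements regardless of t. Honest evaluations always satisfy the relation checked by `is_well_formed`, whereas a pair assembled without the key structure would have to hit that relation for an unknown exponent.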

1.3.1 Proximity ECRH

We say that $H$ defined on domain $D$ is a proximity ECRH (PECRH) if (for any $h \in H$) there exist a reflexive “proximity” relation $\stackrel{h}{\approx}$ on values in the range and an extension $\bar{h}$ of the hash to a larger domain $D_h \supseteq D$ fulfilling the following: (a) proximity collision resistance: given $h \leftarrow H$, it is hard to find distinct $x, x' \in D_h$ such that $\bar{h}(x) \stackrel{h}{\approx} \bar{h}(x')$, and (b) proximity extraction: for any poly-time adversary $A$ there exists an extractor $E$ such that, whenever $A$ outputs $y \in h(D)$, $E$ outputs $x \in D_h$ such that $\bar{h}(x) \stackrel{h}{\approx} y$. (See Definition 6.2 for further details.)

Harder to find collisions, easier to extract. The notions of proximity extraction and proximity collision resistance are the same as standard extraction and collision resistance in the “strict” case, where $x \stackrel{h}{\approx} y$ is the equality relation and the domain is not extended ($D_h = \{0,1\}^{\ell(k)}$, $\bar{h} = h$). However, in general, proximity collision resistance is stronger than (standard) collision resistance, because even “near collisions” (i.e., $x \neq y$ such that $\bar{h}(x) \stackrel{h}{\approx} \bar{h}(y)$) must not be efficiently discoverable, not even over the extended domain $D_h$. Conversely, proximity extraction is weaker than (standard) extraction, since it suffices that the extractor finds a point mapping merely close to the adversary’s output (i.e., finds $x'$ such that $\bar{h}(x') \stackrel{h}{\approx} y$); moreover, it suffices that the point is in the extended domain $D_h$. Thus, the notion of PECRH captures another, somewhat more flexible tradeoff between the requirements of extractability and collision resistance. We show that any point on this tradeoff (i.e., any choice of $\stackrel{h}{\approx}$, $D_h$ and $\bar{h}$ fulfilling the conditions) suffices for the construction of SNARKs:

Theorem 5 (informal). If there exist PECRHs and (appropriate) PIRs then there exist SNARKs for NP.

Candidate PECRHs based on knapsack (subset sum) problems. A necessary property of ECRHs is that the image should be sparse; knapsack-based CRHs, which typically rely on a proper algebraic structure, can often be tweaked to obtain this essential property. For example, in the t-KEA-based ECRH that we already discussed, we start from a standard knapsack hash $f = \prod_{i \in [t]} g_i^{x_i}$ and extend it to a “sparsified” knapsack hash $(f, f^{\alpha})$ for a secret $\alpha$. While for t-KEA this step is enough for plausibly assuming precise extractability (leading to a full fledged ECRH), for other knapsack-based CRHs this is not the case.

For example, let us consider the task of sparsifying modular subset-sum [Reg03]. Here, the hash function is given by random coefficients $l_1, \ldots, l_t \in \mathbb{Z}_N$ and the hash of $x \in \{0,1\}^t$ is simply the corresponding modular subset-sum $\sum_{i : x_i = 1} l_i \bmod N$. A standard way to sparsify the function is, instead of drawing random coefficients, drawing them from a distribution of noisy multiples of some secret integer. However, by doing so, we lose the “algebraic structure” of the problem. Hence, now we also have to deal with new “oblivious image-sampling attacks” that exploit the noisy structure. For example, slightly perturbing an honestly computed subset-sum is likely to “hit” another image of the function. This is where the relaxed notion of proximity extraction comes into play: it allows the extractor to output the preimage of the nearby (honest) image and, more generally, to thwart “perturbation attacks”.

Sparsification of modular subset-sum in fact introduces additional problems. For instance, an attacker may take “small-norm” combinations of the coefficients that are not 0/1 and still obtain an element in the image (e.g., if there are two even coefficients); to account for this, we need to further relax the notion of extraction by allowing the extractor to output a preimage in an extended domain, while ensuring that (proximity) collision resistance still holds for the extended domain too. Additionally, in some cases a direct naïve sparsification is not sufficient and we also need to consider amplified knapsacks.

The relaxations of extractability discussed above have to be matched by a corresponding strengthening of collision resistance following the definition of PECRH. Fortunately, this can still be done under standard hardness assumptions. A similar approach can be taken in order to sparsify the modular matrix subset-sum CRH [Ajt96, GGH96], resulting in a noisy inner-product knapsack hash based on the LWE assumption [Reg05]. Overall, we propose three candidates for PECRHs:

Theorem 6 (informal). There exist PECRHs under any of the following assumptions:

  1. A Knowledge of Knapsack of Exponent assumption (which in fact follows from t-KEA) and hardness of discrete logs.
  2. A Knowledge of Knapsack of Noisy Multiples assumption and lattice assumptions.
  3. A Knowledge of Knapsack of Noisy Inner Products assumption and learning with errors.
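
The sparsification of modular subset-sum by noisy multiples, and the perturbation issue that motivates proximity extraction, can be illustrated with a toy sketch. The parameter choices here are assumptions made purely for illustration and are not the paper’s: the modulus N is taken to be a multiple of the secret integer M so that membership can be tested through the residue modulo M, and the names `keygen`, `subset_sum_hash`, and `looks_like_image` are hypothetical.

```python
import secrets


def keygen(t, secret_m, multiplier_bound, noise_bound):
    """Coefficients l_i = q_i * M + e_i mod N with |e_i| <= noise_bound.

    ASSUMPTION of this sketch: N = M * multiplier_bound, a multiple of M.
    """
    n = secret_m * multiplier_bound
    coeffs = []
    for _ in range(t):
        q = secrets.randbelow(multiplier_bound)
        e = secrets.randbelow(2 * noise_bound + 1) - noise_bound
        coeffs.append((q * secret_m + e) % n)
    return n, coeffs


def subset_sum_hash(n, coeffs, bits):
    """Modular subset-sum of the coefficients selected by the 0/1 vector bits."""
    return sum(l for l, b in zip(coeffs, bits) if b) % n


def centered_residue(y, m):
    """Representative of y mod m in the interval (-m/2, m/2]."""
    r = y % m
    return r - m if r > m // 2 else r


def looks_like_image(y, secret_m, t, noise_bound):
    """Trapdoor test: y is within accumulated noise of a multiple of M."""
    return abs(centered_residue(y, secret_m)) <= t * noise_bound
```

With, say, M = 10007, t = 8 and noise bound 3, every honest hash passes `looks_like_image`, while a value shifted by about M/2 does not. A slightly perturbed honest value such as y + 1 typically still passes the test even though no subset was used to compute it, which is the kind of “perturbation attack” that the proximity relation is meant to absorb.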

1.3.2 Weak PECRHs^4

Our second weakening is essentially orthogonal to the first one and relates to the condition that determines when the extractor has to “work”. The ECRH and PECRH definitions required extraction whenever the adversary outputs a valid image; here the sparseness of the image appears to be key. In particular, unstructured CRHs where one can sample elements in the image obliviously of their preimage have no hope to be either ECRH or PECRH. However, for our purposes it seems sufficient to only require the extractor to “work” when the adversary outputs an image $y$ together with extra encoding of a preimage that can be verified given proper trapdoor information; oblivious image-sampling, on its own, is no longer sufficient for failing the extractor. More formally, a family $H$ of functions is weakly extractable if for any efficient adversary $A$ there exists an efficient extractor $E_A^H$ such that for any auxiliary input $z$ and efficient decoder $Y$, the probability of the

(^4) This further weakening was inspired by private communication with Ivan Damgård.

length, all independent of the length of the witness. At a very high-level, the soundness follows from the fact that the Merkle tree provides the verifier “virtual access” to the PCP proof, in the sense that given the root value of the Merkle tree, for every query q, it is infeasible for a cheating prover to answer q differently depending on the queries. Therefore, interacting with the prover is “equivalent” to having access to a PCP proof oracle. Then it follows from the soundness of the PCP system that Kilian’s protocol is sound.

The “PCP+MT+PIR approach”: The work of [DCL08] proposed the “PCP+MT+PIR approach” to “squash” Kilian’s four-message protocol into a two-message protocol as follows. In Kilian’s protocol, the verifier obtains from the prover a Merkle hash to a PCP oracle and only then asks the prover to locally open the queries requested by the PCP verifier. In the protocol of [DCL08], the verifier also sends, in the first message, a PIR-encrypted version of the PCP queries (the first message of a PIR scheme can be viewed as an encryption to the queries); the prover then prepares the required PCP oracle, computes and sends a Merkle hash of it, and answers the verifier’s queries by replying to the PIR queries according to a database that contains the answer (as well as the authentication path with respect to the Merkle hash) to every possible verifier’s query.

[DCL08] proved the soundness of the above scheme based on the assumption that any convincing prover $P^*$ must essentially behave as an honest prover: Namely, if a proof is accepting, then the prover must have in mind a full PCP oracle $\pi$, which maps under the Merkle hash procedure to the claimed root, and such a proof $\pi$ can be obtained by an efficient extractor $E_{P^*}$.^5 [DCL08] then showed that, if this is the case, the extracted string $\pi$ must be consistent with the answers the prover provides to the PCP queries, for otherwise the extractor can be used to obtain collisions of the hash function underlying the Merkle tree. Therefore, the extracted string $\pi$ also passes the PCP test, where the queries are encrypted under PIR. Then, it follows from the privacy of the PIR scheme that the string $\pi$ is “computationally independent” of the query. Hence from the soundness of PCP, they conclude that the statement must be true.

1.4.1 The main challenges and our solutions.

Our goal is to obtain the stronger notion of SNARK, based on the more restricted assumption that ECRHs exist. At a very high level, our construction follows the “PCP+MT+PIR approach” but replaces the CRH underlying the Merkle tree in their construction with an ECRH. Unlike [DCL08], which directly assumed the “global extraction” guarantees from a Merkle tree, we show that we can lift the “local extraction” guarantee provided by the ECRH to the “global extraction” guarantee on the entire Merkle tree. More precisely, by relying on the (local) ECRH extraction guarantee, we show that for every convincing prover, there is an extractor that can efficiently extract a PCP proof π̃ that is “sufficiently satisfying” (i.e., a PCP verifier given π̃ accepts with high probability). Furthermore, to obtain the argument of knowledge property, we instantiate the underlying PCP system with PCPs of knowledge, which allow for extracting a witness from any sufficiently-satisfying proof oracle. (See details for the requisite PCP system in Section 3.7.) We next describe the high-level ideas for achieving global extraction guarantees from the local extraction of ECRHs. Full details are contained in Section 5, and the construction is summarized in Figure 1.

From local to global extraction. The main technical challenge lies in establishing a “global” knowledge feature (i.e., informally, extraction of a sufficiently satisfying proof π̃ from a convincing prover given that it outputs the root of a Merkle tree) from a very “local” one (namely, extraction of a preimage from a machine that outputs a valid hash value of the ECRH h). A natural attempt is to start from the root of the Merkle tree

(^5) Note that, as originally formulated, the assumption of [DCL08] seems to be false; indeed, a malicious prover can always start from a good PCP oracle π for a true statement and compute an “almost full” Merkle hash on π, skipping very few branches — so one should at least formulate an analogous but more plausible assumption by only requiring “sufficient consistency” with the claimed root.

and extract the values of the intermediate nodes of the Merkle tree layer by layer towards the leaves; that is, to extract a candidate proof π̃ by recursively applying the ECRH-extractor to extract the entire Merkle tree M̃T, where the leaves should correspond to π̃.

However, recursively composing ECRH-extractors encounters a difficulty: each application of the ECRH extraction incurs a polynomial blowup in extraction time. Therefore (without making a very strong assumption on the amount of “blowup” incurred by the extractor), we can only apply the ECRH extraction recursively a constant number of times; as a result, we can only extract from Merkle trees of constant depth. We address this problem by opting to use a “squashed” Merkle tree, where the fan-in of each intermediate node is polynomial rather than binary as in the traditional case. Consequently, for every NP-statement, since its PCP proof is of polynomial length, the depth of the Merkle tree built over the PCP proof is a constant. Then, we can apply the ECRH extraction procedure recursively to extract the whole Merkle tree, while overall incurring polynomial overhead in the size of the extractor. A technical problem that arises when applying the above recursive extraction procedure is that we might not be able to extract a full Merkle tree (where all paths have the same length). More precisely, after applying the ECRH extraction recursively ℓ times, we obtain the values for the intermediate nodes up to level ℓ (thinking about the root as level zero). However, at each intermediate node, when applying the ECRH extractor, extraction could have failed to extract a preimage if the given intermediate node is not a valid hash image under the ECRH. Hence, the extracted tree might be at some points “disconnected”.^6 Nevertheless, we show (relying solely on ECRH extraction) that the leaves in the extracted (perhaps partial) tree connected to the root are sufficiently satisfying for witness-extraction.
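The blowup argument above can be made concrete with a back-of-the-envelope calculation. This is only a sketch under assumptions not stated in the text: each application of the ECRH extractor is assumed to turn a circuit of size s into one of size at most s^c for a constant c > 1, the PCP proof is assumed to have length |π| ≤ k^b for a constant b, and the depth estimates are approximate. The symbols s_j, c and b are introduced here for illustration only.

```latex
\begin{align*}
  s_0 = s, \quad s_{j+1} \le s_j^{\,c}
    &\;\Longrightarrow\; s_d \le s^{\,c^{d}}
    && \text{(size after $d$ nested extractions)} \\
  \text{binary fan-in: } d \approx \log_2 |\pi| = \Theta(\log k)
    &\;\Longrightarrow\; c^{d} = k^{\Theta(1)}, \;\; s_d \le s^{\,k^{\Theta(1)}}
    && \text{(superpolynomial in $k$)} \\
  \text{fan-in } k\text{: } d \approx \log_k |\pi| \le b
    &\;\Longrightarrow\; s_d \le s^{\,c^{b}}
    && \text{(polynomial, since $b$ and $c$ are constants)}
\end{align*}
```

Under these assumptions, widening the fan-in from 2 to a polynomial is what brings the number of nested extractions down from logarithmic to constant, and hence keeps the composed extractor of polynomial size.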

Proof at high level. Given the foregoing discussion, we show the correctness of the extraction procedure in two steps:

  • Step 1: “local consistency”. We first show that whenever the verifier is convinced, the recursively extracted string π̃ contains valid answers to the verifier’s PCP queries specified in its PIR queries. Otherwise, it is possible to find collisions within the ECRH h as follows. A collision finder could simulate the PIR-encryption on its own, invoke both the extraction procedure and the prover, and obtain two paths that map to the same root but must differ somewhere (as one is satisfying and the other is not) and therefore obtain a collision.
  • Step 2: “from local to global consistency”. Next, using the privacy guarantees of the PIR scheme, we show that, whenever we extract a set of leaves that are satisfying with respect to the PIR-encrypted queries, the same set of leaves must also be satisfying for almost all other possible PCP queries and is thus sufficient for witness-extraction. Indeed, if this was not the case then we would be able to use the polynomial-size extraction circuit to break the semantic security of the PIR.

For further details of the proof we refer the reader to Section 5.2. Our construction ensures that the communication complexity and the verifier’s time complexity are bounded by a polynomial in the security parameter, the size of the instance, and the logarithm of the time it takes to verify a valid witness for the instance; furthermore, this polynomial is independent of the specific NP language at hand.

On Adaptive Soundness: The soundness requirement of SNARK prevents a cheating prover from proving any false statement even if it is allowed to choose the statement after seeing the verifier’s first message (or, rather, the verifier-generated reference string). In the above construction, the verifier sends in the first message the PIR-encrypted PCP queries. However, in general, the PCP queries may depend on the statement

(^6) This captures for example the behavior of the prover violating the [DCL08] assumption described above.

2 Other Related Work

Knowledge assumptions. A popular class of knowledge assumptions, which have been successfully used to solve a number of (at times notoriously open) cryptographic problems, is that of Knowledge of Exponent assumptions. These have the following flavor: if an efficient circuit, given the description of a finite group along with some other public information, computes a list of group elements that satisfies a certain algebraic relation, then there exists a knowledge extractor that outputs some related values that “explain” how the public information was put together to satisfy the relation. Most such assumptions have been proven secure against generic algorithms (see Nechaev [Nec94], Shoup [Sho97], and Dent [Den06]), thus offering some evidence for their truth. In the following we briefly survey prior works which, like ours, relied on Knowledge of Exponent assumptions. Damgård [Dam92] first introduced a Knowledge of Exponent assumption to construct a CCA-secure encryption scheme. Later, Hada and Tanaka [HT98] showed how to use two Knowledge of Exponent assumptions to construct the first three-round zero-knowledge argument. Bellare and Palacio [BP04] then showed that one of the assumptions of [HT98] was likely to be false, and proposed a modified assumption, using which they constructed a three-round zero-knowledge argument. More recently, Abe and Fehr [AF07] extended the assumption of [BP04] to construct the first perfect NIZK for NP with “full” adaptive soundness. Prabhakaran and Xue [PX09] constructed statistically-hiding sets for trapdoor DDH groups [DG06] using a new Knowledge of Exponent assumption. Gennaro et al. [GKR10] used another Knowledge of Exponent assumption (with an interactive flavor) to prove that a modified version of the Okamoto-Tanaka key-agreement protocol [OT89] satisfies perfect forward secrecy against fully active attackers. In a different direction, Canetti and Dakdouk [CD08, CD09, Dak09] study extractable functions.
Roughly, a function f is extractable if finding a value x in the image of f implies knowledge of a preimage of x. The motivation of Canetti and Dakdouk for introducing extractable functions is to capture the abstract essence of prior knowledge assumptions, and to formalize the “knowledge of query” property that is sometimes used in proofs in the Random Oracle Model. They also study which security reductions are “knowledge-preserving” (e.g., whether it is possible to obtain extractable commitment schemes from extractable one-way functions). [BCPR14, BP13] show that, assuming indistinguishability obfuscation [BGI+12], extractable one-way functions (and thus also ECRHs) cannot be constructed against adversaries with arbitrary polynomial-size auxiliary input if the (efficient) extractor is universally fixed before the adversary’s auxiliary input. On the other hand, they show that, under standard assumptions, extractable one-way functions are achievable against adversaries with a-priori bounded auxiliary input. (It is still not known whether such ECRHs can also be constructed under standard assumptions.)

Prior (somewhat) succinct arguments from Knowledge of Exponent assumptions. Knowledge of Exponent assumptions have been used to obtain somewhat succinct arguments, in the sense that the non-interactive proof is short, but the verifier’s running time is long. Recently, Groth [Gro10] introduced a family of Knowledge of Exponent assumptions, generalizing those of [AF07], and used them to construct extractable length-reducing commitments, as a building block for a short non-interactive perfect zero-knowledge argument system for circuit satisfiability. These arguments have very succinct proofs (independent of the circuit size), though the public key is large: quadratic in the size of the circuit. Groth’s assumption holds in the generic group model. For a comparison between our t-KEA assumption and Groth’s assumptions see Section 8.1. Mie [Mie08] observes that the PCP+MT+PIR approach works as long as the PIR scheme is database aware: essentially, a prover that is able to provide valid answers to PIR queries must “know” their decrypted values, or, equivalently, must “know” a database consistent with those answers (by arbitrarily setting the rest of the database). Mie then shows how to make the PIR scheme of Gentry and Ramzan [GR05] PIR-aware, based on Damgård’s Knowledge of Exponent assumption [Dam92]; unfortunately, while the communication complexity is very low, the receiver in [GR05], and thus also the verifier in [Mie08], are inefficient relative to the database size. We note that PIR schemes with database awareness can be constructed directly from ECRHs (without going through the PCPs of a SNARK construction); moreover, if one is willing to use PCPs to obtain a SNARK, one would then be able to obtain various stronger notions of database awareness.

Delegation of computation. An important application of succinct arguments is delegation of computation schemes, where one usually also cares about privacy, and not only soundness, guarantees. Specifically, a succinct argument can usually be combined in a trivial way with fully-homomorphic encryption [Gen09] (in order to ensure privacy) to obtain a delegation scheme where the delegator runs in time polylogarithmic in the running time of the computation (see Section 10.1). Within the setting of delegation, however, where the same weak delegator may be asking a powerful untrusted worker to evaluate an expensive function on many different inputs, a weaker preprocessing approach may still be meaningful. In such a setting, the delegator performs a one-time function-specific expensive setup phase, followed by inexpensive input-specific delegations to amortize the initial expensive phase. Indeed, in the preprocessing setting a number of prior works have already achieved constructions where the online stage is only two messages [GGP10, CKV10, AIK10]. These constructions do not allow an untrusted worker to contribute his own input to the computation, namely they are “P-delegation schemes” rather than “NP-delegation schemes”. Note that none of these works rely on any knowledge assumption; indeed, the impossibility results of [GW11] only apply for NP and not for P. However, even given that the preprocessing model is very strong, all of the mentioned works maintain soundness over many delegations only as long as the verifier’s answers remain secret. (A notable exception is the work of Benabbas et al. [BGV11], though their constructions are not generic, and are only for specific functionalities such as polynomial functions.) Goldwasser et al. [GKR08] construct interactive proofs for log-space uniform NC where the verifier running time is quasi-linear.
When combining [GKR08] with the PIR-based squashing technique of Kalai and Raz [KR06], one can obtain a succinct two-message delegation scheme. Canetti et al. [CRR11] introduce an alternative way of squashing [GKR08], in the preprocessing setting; their scheme is of the public coin type and hence the verifier’s answers need not remain secret (another bonus is that the preprocessing state is publicly verifiable and can thus be used by anyone).

3 Preliminaries

In this section we give basic definitions for the cryptographic primitives that we use (along with any non-standard properties that we may need). Throughout, negl(k) is any negligible function in k.

3.1 Interactive Proofs, Zero Knowledge and Witness Indistinguishability

We use the standard definitions of interactive proofs (and interactive Turing machines) [GMR89] and arguments (also known as computationally-sound proofs) [BCC88]. Given a pair of interactive Turing machines, P and V, we denote by 〈P(w), V〉(x) the random variable representing the final (local) output of V, on common input x, when interacting with machine P with private input w, when the random input to each machine is uniformly and independently chosen.

3.2 Proofs and Arguments of Knowledge

Given a language L ∈ NP and an instance x, a proof or argument of knowledge (POK or AOK) not only convinces the verifier that x ∈ L, but also demonstrates that the prover possesses an NP-witness for x. This is formalized by the existence of an extractor: given black-box access to a machine that can successfully complete the proof or argument of knowledge on input x, the extractor can compute a witness for x.

Definition 3.4 (Proofs and arguments of knowledge). An interactive protocol Π = (P, V) is a proof of knowledge (resp. argument of knowledge) of NP-language L with respect to witness relation R_L if Π is indeed an interactive proof (resp. argument) for L and, additionally, there exist a polynomial q, a negligible function ν, and a PPT oracle machine E, such that for every interactive machine P∗ (resp. for every polynomial-size machine P∗), every x ∈ L and every auxiliary input z ∈ {0, 1}^∗, the following holds: on input x and oracle access to P∗(x, z), machine E outputs a string from R_L(x) with probability at least q(Pr[〈P∗(z), V〉(x) = 1]) − ν(|x|). The machine E is called the knowledge extractor.

3.3 Commitment Scheme

A commitment scheme (C, R) consists of a pair of PPT ITMs C and R that interact in a commit stage and a reveal stage. In this work, we consider commitment schemes that are computationally binding and computationally hiding. The computational-binding property asserts that the probability that a malicious committer, after the interaction with an honest receiver, can decommit to two different values is negligible. The computational-hiding property guarantees that commitments to any two different values are computationally indistinguishable. (See [Gol01] for a formal definition.) Furthermore, we restrict our attention to commitment schemes where the reveal phase is non-interactive: the committer decommits to value v by simply sending a decommitment pair (v, d). One-message statistically-binding commitment schemes can be constructed using any one-to-one one-way function (see Section 4.4.1 of [Gol01]). Allowing some minimal interaction (in which the receiver first sends a single random initialization message), statistically-binding commitment schemes can be obtained from any one-way function [Nao91, HILL99].
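To illustrate the non-interactive reveal phase with a decommitment pair (v, d), here is a minimal Python sketch. It is not one of the constructions cited above: it uses a hash-based commitment whose hiding and binding are only heuristic (they rest on treating SHA-256 as an idealized hash), and the function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(value: bytes) -> tuple[bytes, bytes]:
    """Commit stage: return (c, d), where c is sent to the receiver
    and d is the secret opening string kept by the committer."""
    d = secrets.token_bytes(32)                 # fresh randomness hides the value
    c = hashlib.sha256(d + value).digest()      # d has fixed length, so d + value parses uniquely
    return c, d

def verify(c: bytes, value: bytes, d: bytes) -> bool:
    """Reveal stage: the receiver checks the decommitment pair (value, d)."""
    return hmac.compare_digest(c, hashlib.sha256(d + value).digest())

c, d = commit(b"my value")
print(verify(c, b"my value", d))      # the honest opening is accepted
print(verify(c, b"other value", d))   # a different value with the same d is rejected
```

Opening the same commitment to two different values would require a SHA-256 collision, which mirrors the binding property described above in this heuristic setting.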

3.4 Collision-Resistant Hashes

A collision-resistant hash (CRH) is a function ensemble for which it is hard to find two inputs that map to the same output. Formally:

Definition 3.5. A function ensemble H is a CRH if it is collision-resistant in the following sense: for every polynomial-size adversary A,

$$\Pr_{h \leftarrow H_k}\left[\, x \neq y \;\wedge\; h(x) = h(y) \;:\; (x, y) \leftarrow A(h) \,\right] \le \mathrm{negl}(k).$$

We say that a function ensemble H is (ℓ(k), k)-compressing if each h ∈ H_k maps strings of length ℓ(k) to strings of length k < ℓ(k).
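The event bounded in Definition 3.5 can be illustrated with a toy experiment. The sketch below is an assumption-laden illustration, not part of the paper: `toy_hash` truncates SHA-256 to 16 output bits (so it maps 64-bit inputs to 16-bit outputs and is deliberately far too short to be collision-resistant), and `find_collision` plays the role of the adversary A, outputting a pair (x, y) with x ≠ y and h(x) = h(y).

```python
import hashlib
from itertools import count

def toy_hash(x: bytes, out_bits: int = 16) -> int:
    """Hypothetical compressing function: keep only the top out_bits of SHA-256."""
    digest = hashlib.sha256(x).digest()
    return int.from_bytes(digest, "big") >> (256 - out_bits)

def find_collision(out_bits: int = 16) -> tuple[bytes, bytes]:
    """Adversary: enumerate 64-bit inputs until two of them share an image.
    By the pigeonhole principle this stops after at most 2**out_bits + 1 inputs."""
    seen: dict[int, bytes] = {}
    for i in count():
        x = i.to_bytes(8, "big")
        y = toy_hash(x, out_bits)
        if y in seen:
            return seen[y], x
        seen[y] = x

x, y = find_collision()
print(x != y, toy_hash(x) == toy_hash(y))
```

The search succeeds quickly because the output is only 16 bits long; the definition asks that, for output length k, no polynomial-size adversary succeeds with more than negligible probability.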

3.5 Merkle Trees

Merkle tree (MT) hashing [Mer89] enables a party to use a CRH to compute a succinct commitment to a long string π and later to locally open to any bit of π (again in a succinct manner). Specifically, given a function h : {0, 1}^{ℓ(k)} → {0, 1}^k randomly drawn from a CRH ensemble, the committer divides π into |π|/ℓ(k) parts (padding with 0’s if needed) and evaluates h on each of these; the same operation is applied to the resulting string, and so on, until one reaches the single k-bit root. For |π| = (ℓ/k)^{d+1}, this results in a tree of depth d, whose nodes are all the intermediate k-bit hash images. An opening to a leaf in π (or any bit within it) includes all the nodes and their siblings along the path from the root to the leaf, and is of size ℓd. Typically, ℓ(k) = 2k, resulting in a binary tree of depth log π. In this work, we shall also be interested in “wide trees” with polynomial fan-in (relying on CRHs with polynomial compression); see Section 5.1.
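The commit-and-open procedure just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions: SHA-256 stands in for the CRH h (so k is 32 bytes), the parameter `fan_in` plays the role of ℓ/k, and the names `commit`, `open_block` and `verify` are hypothetical rather than taken from the paper.

```python
import hashlib

K = 32  # digest length in bytes (SHA-256 stands in for the CRH h)

def h(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

def split(data: bytes, ell: int) -> list[bytes]:
    """Cut data into ell-byte blocks, padding with zeros if needed."""
    data = data or b"\0"
    data += b"\0" * (-len(data) % ell)
    return [data[i:i + ell] for i in range(0, len(data), ell)]

def commit(pi: bytes, fan_in: int = 2) -> tuple[bytes, list[list[bytes]]]:
    """Hash every block, re-split the concatenated digests, repeat until one block is left."""
    ell = fan_in * K
    levels = [split(pi, ell)]
    while len(levels[-1]) > 1:
        digests = b"".join(h(block) for block in levels[-1])
        levels.append(split(digests, ell))
    return h(levels[-1][0]), levels

def open_block(levels: list[list[bytes]], index: int, fan_in: int = 2) -> list[bytes]:
    """Opening for block `index` of pi: the block plus, per level, the block holding its ancestor."""
    path = []
    for level in levels:
        path.append(level[index])
        index //= fan_in
    return path

def verify(root: bytes, index: int, path: list[bytes], fan_in: int = 2) -> bool:
    """Recompute the hashes along the path and compare the result with the root."""
    if any(len(block) != fan_in * K for block in path):
        return False
    digest = h(path[0])
    for block in path[1:]:
        pos = (index % fan_in) * K          # slot of the child digest inside its parent block
        if block[pos:pos + K] != digest:
            return False
        index //= fan_in
        digest = h(block)
    return digest == root

pi = bytes(range(256)) * 10
root, levels = commit(pi, fan_in=2)
path = open_block(levels, 7, fan_in=2)
print(verify(root, 7, path, fan_in=2))
```

Calling `commit(pi, fan_in=16)` on the same string produces far fewer levels than the binary tree, which is the effect the "wide trees" mentioned above rely on.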

3.6 Private Information Retrieval

A (single-server) polylogarithmic private information retrieval (PIR) scheme [CMS99] consists of a triple of algorithms (PEnc, PEval, PDec) where:

  • PEnc_R(1^k, i) outputs an encryption C_i of query i to a database DB using randomness R,
  • PEval(DB, C_i) outputs a succinct blob e_i “containing” the answer DB[i], and
  • PDec_R(e_i) decrypts the blob e_i to an answer DB[i].

Formally:

Definition 3.6. A triple of algorithms (PEnc, PEval, PDec) is a PIR if it has the following properties:

  1. Correctness. For any database DB, any query i ∈ {1, ..., |DB|}, and security parameter k ∈ N,

$$\Pr_{R}\left[\, \mathsf{PDec}_R(e_i) = DB[i] \;:\; C_i \leftarrow \mathsf{PEnc}_R(1^k, i),\ e_i \leftarrow \mathsf{PEval}(DB, C_i) \,\right] = 1,$$

where PEval(DB, Ci) runs in poly(k, |DB|) time.

  2. Succinctness. The running time of both PEnc_R(1^k, i) and PDec_R(e_i) is bounded by poly(k, log |DB|). In particular, the sizes of the two messages C_i and e_i are also so bounded.
  3. Semantic security. The query encryption is semantically secure for multiple queries, i.e., for any polynomial-size A, all large enough security parameters k ∈ N and any two tuples of queries i = (i_1 ⋯ i_q), i′ = (i′_1 ⋯ i′_q) ∈ {0, 1}^{poly(k)},

$$\Pr\left[ A(\mathsf{PEnc}_R(1^k, i)) = 1 \right] - \Pr\left[ A(\mathsf{PEnc}_R(1^k, i')) = 1 \right] \le \mathrm{negl}(k),$$

where PEnc_R(1^k, i) is the coordinate-wise encryption of the tuple i.

PIR schemes with the above properties have been constructed under various hardness assumptions such as ΦHA [CMS99] or LWE [BV11].

A-priori unknown database size. In certain cases, we want the server to be able to specify the database DB only after receiving the query. In such cases, the client might not know the database’s size when generating his query, but will only know some superpolynomial bound, such as |DB| ≤ 2^{log^2 k}. In this case we require that the PIR scheme allows the server to deduce from an encrypted (long) query r ∈ {0, 1}^{log^2 k} a shorter ρ-bit query r̂ ∈ {0, 1}^ρ that works for |DB| = 2^ρ. In FHE-based schemes, such as the one in [BV11] (which is in turn based on LWE), this extra property can be easily supported. In other cases (as in delegation of computation), even if an adversary is adaptive, an a-priori bound on the database size is still available; whenever this is the case, no additional properties are required of the PIR scheme.