Survey: Recovering cryptographic keys from partial information, by example

Side-channel attacks targeting cryptography may leak only partial or indirect information about the secret keys. There are a variety of techniques in the literature for recovering secret keys from partial information. In this work, we survey several of the main families of partial key recovery algorithms for RSA, (EC)DSA, and (elliptic curve) Diffie-Hellman, the classical public-key cryptosystems in common use today. We categorize the known techniques by the structure of the information that is learned by the attacker, and give simplified examples for each technique to illustrate the underlying ideas.


Introduction
In a side-channel attack, an attacker exploits side effects from computation or storage to reveal ostensibly secret information. Many side-channel attacks stem from the fact that a computer is a physical object in the real world, and thus computations can take different amounts of time [Koc96], cause changing power consumption [KJJ99], generate electromagnetic radiation [QS01], or produce sound [GST14], light [FH08], or temperature [HS14] fluctuations. The specific character of the information that is leaked depends on the high- and low-level implementation details of the algorithm and often the computer hardware itself: branch conditions, error conditions, memory cache eviction behavior, or the specifics of capacitor discharges.
The first work on side-channel attacks in the published literature did not directly target cryptography [EL85], but since Kocher's work on timing and power analysis in the 90s [Koc96, KJJ99], cryptography has become a popular target for side-channel work. However, it is rare that an attacker will be able to simply read a full cryptographic secret through a side channel. The information revealed by many side-channel attacks is often indirect or incomplete, or may contain errors.
Thus in order to fully understand the nature of a given vulnerability, the side-channel analyst often needs to make use of additional cryptanalytic techniques. The main goal for the cryptanalyst in this situation is typically: "I have obtained the following type of incomplete information about the secret key. Does it allow me to efficiently recover the rest of the key?" Unfortunately there is not a one-size-fits-all answer: it depends on the specific algorithm used, and on the nature of the information that has been recovered.
The goal of this work is to collect some of the most useful techniques in this area together in one place, and provide a reasonably comprehensive classification of what is known to be efficient for the most commonly encountered scenarios in practice. That is, this is a non-exhaustive survey and a concrete tutorial with motivational examples. Many of the algorithmic papers in this area give constructions in full generality, which can sometimes obscure the reader's intuition about why a method works. Here, we aim to give minimal working examples to illustrate each algorithm for simple but nontrivial cases. We restrict our focus to public-key cryptography, and in particular, the algorithms that are currently in wide use and thus the most popular targets for attack: RSA, (EC)DSA, and (elliptic curve) Diffie-Hellman.
Throughout this work, we will illustrate the information known for key values as follows:
[Diagram legend: most significant bits; least significant bits; known bits]
The organization of this survey is given in Table 1.

Motivation
While this survey is mostly operating at a higher level of mathematical abstraction than the side-channel attacks that we are motivated by, we will give a few examples of how attackers can learn partial information about secrets.
Modular exponentiation. All of the public-key cryptographic algorithms we discuss involve modular exponentiation or elliptic curve scalar multiplication operating on secret values. For RSA signatures, the victim computes $s = m^d \bmod N$ where d is the secret exponent.
For DSA signatures, the victim generates a per-signature secret value k and computes the value $r = g^k \bmod p$, where g and p are public parameters. For Diffie-Hellman key exchange, the victim generates a secret exponent a and computes the public key exchange value $A = g^a \bmod p$, where g and p are public parameters. Naive modular exponentiation algorithms like square-and-multiply operate bit by bit over the bits of the exponent: each iteration will execute a square operation, and if that bit of the exponent is a 1, will execute a multiply operation. More sophisticated modular exponentiation algorithms precompute a digit representation of the exponent using non-adjacent form (NAF), windowed non-adjacent form (wNAF) [Möl03], sliding windows, or Booth recoding [Boo51] and then operate on the precomputed digit representation [Gor98].
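The leak from naive square-and-multiply can be made concrete with a short Python sketch. The function names and the operation trace (a list of "S" and "M" markers standing in for what a side channel might reveal) are illustrative assumptions and are not taken from any particular implementation.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that also records its operations."""
    result = 1
    trace = []                      # hypothetical side-channel observation
    for bit in bin(exponent)[2:]:   # most significant bit first
        result = (result * result) % modulus
        trace.append("S")           # a square happens for every bit
        if bit == "1":
            result = (result * base) % modulus
            trace.append("M")       # a multiply happens only for 1 bits
    return result, trace


def exponent_from_trace(trace):
    """Rebuild the exponent from the sequence of squares and multiplies."""
    bits = []
    for op in trace:
        if op == "S":
            bits.append("0")        # assume a 0 bit until a multiply follows
        else:
            bits[-1] = "1"          # the multiply marks the previous bit as 1
    return int("".join(bits), 2)


# Toy values chosen for illustration only.
g, k, p = 3, 0xC0FFEE, 2**61 - 1
r, observed = square_and_multiply(g, k, p)
recovered = exponent_from_trace(observed)
```

An attacker who can distinguish squares from multiplies therefore reads the exponent directly; real measurements are noisy or partial, which is what motivates the key recovery techniques surveyed here.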
Cache attacks on modular exponentiation. Cache timing attacks are one of the most commonly exploited families of side-channel attacks in the academic literature [Pag02, TTMH02, TSS+03, Per05, Ber05, OST06]. There are many variants of these attacks, but they all share in common that the attacker is able to execute code on a CPU that is co-located with the victim process and shares a CPU cache. While the victim code executes, the attacker measures the amount of time that it takes to load information from locations in the cache, and thus deduces information about the data that the victim process loaded into those cache locations during execution. In the context of the modular exponentiation or scalar multiplication algorithms discussed above, a cache attack on a vulnerable implementation might reveal whether a multiply operation was executed at a particular bit location if the attacker can detect whether the code to execute the multiply instruction was loaded into the cache. Alternatively, for a pre-computed digit representation of the number, the attacker may be able to use a cache attack to observe the digit values that were accessed [ASK07, AS08, BvSY14].
Other attacks on modular exponentiation. Other families of side channels that have been used to exploit vulnerable modular exponentiation implementations include power analysis and differential power analysis attacks [KJJ99, KJJR11], electromagnetic radiation [QS01], acoustic emanations [GST14], raw timing [Koc96], photonic emission [FH08], and temperature [HS14]. These attacks similarly exploit code or circuits whose execution varies based on secrets.
Cold boot and memory attacks. An entirely different class of side-channel attacks that can reveal partial information about keys consists of attacks that leak the contents of memory. These include cold boot attacks [HSH+08], DMA (Direct Memory Access), Heartbleed, and Spectre/Meltdown [LSG+18, KHF+19]. While these attacks may reveal incomplete information, and thus serve as theoretical motivation for some of the algorithms we discuss, most of the vulnerabilities in this family of attacks can simply be used to read arbitrary memory with near-perfect precision, and cryptanalytic algorithms are rarely necessary.
Length-dependent operations. A final class of vulnerabilities arises in implementations whose behavior depends on the length of a secret value, so that variations in the behavior may leak information about the number of leading zeros in a secret. A simple example is copying a secret key to a buffer in such a way that it reveals the bit length of a secret key.
In another example, the Raccoon attack observes that TLS versions 1.2 and below strip leading zeros from the Diffie-Hellman shared secret before applying the key derivation function, resulting in a timing difference depending on the number of hash input blocks required for the length of the secret [MBA+21].

Mathematical background
Lattices and lattice reduction algorithms. Several of the algorithms we present make use of lattices and lattice algorithms.
For the purposes of this survey, we will specify a lattice by giving a basis matrix B, which is an n × n matrix of linearly independent row vectors with rational (but in our applications usually integer) entries. The lattice generated by B, written as L(B), consists of all vectors that are integer linear combinations of the row vectors of B. The determinant of a lattice is the absolute value of the determinant of a basis matrix: $\det L(B) = |\det B|$.
Geometrically, a lattice resembles a discrete, possibly skewed, grid of points in n-dimensional space. This discreteness property ensures that there is a shortest vector in the lattice: there is a non-infinitesimal smallest length of a nonzero vector in the lattice, and there is at least one vector $v_1$ that achieves this length. For a random lattice, the Euclidean length of this vector is approximated using the Gaussian heuristic: $|v_1|_2 \approx \sqrt{n/(2\pi e)}\,(\det L)^{1/n}$. We rarely need this much precision; for lattices of very small dimension we will often use the approximation that $|v_1|_2 \approx (\det L)^{1/n}$.
The shortest vector in an arbitrary lattice is NP-hard to compute exactly, but the LLL algorithm [LLL82] will compute an exponential approximation to this shortest vector in polynomial time: in the worst case, it will return a vector $b_1$ satisfying $\|b_1\|_2 \le 2^{(n-1)/4} (\det L)^{1/n}$. In practice, for random lattices, the LLL algorithm obtains a better approximation factor $\|b_1\|_2 \le 1.02^n (\det L)^{1/n}$ [NS06]. In fact, the LLL algorithm will return an entire basis for the lattice whose vectors are good approximations for what are called the successive minima for the lattice; for our purposes the only fact we need is that these vectors will be fairly short, and for a random lattice they will be close to the same length. Current implementations of the LLL algorithm can be run fairly straightforwardly on lattices of dimension from a few hundred to a few thousand [RH23].
To compute a closer approximation to the shortest vector than LLL, one can use the BKZ algorithm [Sch87, SE94]. This algorithm runs in time exponential in a block size, which is a parameter to the algorithm that determines the quality of the approximation factor. The theoretical guarantees of this algorithm are complicated to express; for our purposes we only need to know that for lattices of dimension below around 100, one can easily compute the shortest vector in the heuristically random-looking lattices we consider using the BKZ algorithm, and can often find the shortest vector, or a "good enough" approximation to it, by using smaller block sizes. Theoretically, the LLL algorithm is equivalent to using BKZ with block size 2.
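To build intuition for what lattice reduction does, the following Python sketch implements Lagrange-Gauss reduction, the classical two-dimensional procedure that LLL generalizes. It is not the LLL or BKZ algorithm itself; the function names and the example basis are illustrative assumptions. In dimension 2 the first output vector is a shortest nonzero lattice vector, and its length is on the order of $(\det L)^{1/2}$.

```python
from fractions import Fraction


def dot(u, v):
    """Integer inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def lagrange_gauss(u, v):
    """Reduce a basis (u, v) of a 2-dimensional integer lattice.

    Returns a basis (u, v) of the same lattice with |u| <= |v|,
    where u is a shortest nonzero vector of the lattice.
    """
    u, v = list(u), list(v)
    if dot(u, u) > dot(v, v):
        u, v = v, u
    while True:
        # Subtract the nearest-integer multiple of u from v.
        c = round(Fraction(dot(u, v), dot(u, u)))
        v = [b - c * a for a, b in zip(u, v)]
        if dot(v, v) >= dot(u, u):
            return u, v
        u, v = v, u                 # v became shorter: swap and repeat


# Example basis (rows) with determinant 65537, chosen for illustration.
b1, b2 = lagrange_gauss([1, 12345], [0, 65537])
determinant = abs(b1[0] * b2[1] - b1[1] * b2[0])
```

The reduction changes the basis but not the lattice, so the determinant is preserved, and the reduced first vector satisfies $|b_1|^2 \le (2/\sqrt{3}) \det L$.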

RSA Preliminaries
Parameter Generation. To generate an RSA key pair, implementations typically start by choosing the public exponent e. By far the most common choice is to simply fix e = 65537. Some implementations use small primes like 3 or 17. Almost no implementations use public exponents larger than 32 bits. This means that attacks that involve brute forcing values less than e are generally feasible in practice.
In the next step, the implementation generates two random primes p and q such that p − 1 and q − 1 are relatively prime to e. The public modulus is N = pq. The private exponent is then computed as $d = e^{-1} \bmod (p-1)(q-1)$.
The public key is the pair (e, N). In theory, the secret key is the pair (d, N), but in practice many implementations store keys in a data structure including much more information. For example, the PKCS#1 private key format includes the fields p, q, $d_p = d \bmod (p-1)$, $d_q = d \bmod (q-1)$, and $q_{inv} = q^{-1} \bmod p$ to speed up decryption and signing using the Chinese Remainder Theorem.

Encryption and Signatures.
In textbook RSA, Alice encrypts the message m to Bob by computing $c = m^e \bmod N$. In practice, the message m is not a "raw" message, but has first been transformed from the content using a padding scheme. The most common encryption padding scheme in network protocols is PKCS#1v1.5, but OAEP [BR95] is also sometimes used or specified in protocols. To decrypt the encrypted ciphertext, Bob computes $m = c^d \bmod N$ and verifies that m has the correct padding.
To generate a digital signature, Bob first hashes and pads the message he wishes to sign using a padding scheme like PKCS#1v1.5 signature padding (most common) or PSS (less common); let m be the hashed and padded message of this form. Then Bob generates the signature as $s = m^d \bmod N$. Alice can verify the signature by computing the value $m' = s^e \bmod N$ and verifying that m′ is the correct hashed and padded value.
Since encryption and signature verification only use the public key, decryption and signature generation are the operations typically targeted by side-channel attacks.

RSA-CRT.
To speed up decryption, instead of computing $c^d \bmod N$ directly, implementations often use the Chinese remainder theorem (CRT). RSA-CRT splits the exponent d into two parts $d_p = d \bmod (p-1)$ and $d_q = d \bmod (q-1)$.
To decrypt using the Chinese remainder theorem, Bob would compute $m_p = c^{d_p} \bmod p$ and $m_q = c^{d_q} \bmod q$. The message can be recovered with the help of the pre-computed value $q_{inv} = q^{-1} \bmod p$ by computing
$$m = m_q + q \cdot \left(q_{inv}(m_p - m_q) \bmod p\right).$$
This is called Garner's formula [Gar59].
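As a sanity check of the CRT decryption steps, here is a minimal Python sketch of textbook RSA-CRT with Garner's recombination. The toy primes (the prime $2^{64} - 59$ and the Mersenne prime $2^{61} - 1$), the variable names, and the absence of padding are all assumptions made for illustration; `pow(x, -1, n)` requires Python 3.8 or later.

```python
# Toy key: far too small for real use, chosen only for illustration.
p = 2**64 - 59
q = 2**61 - 1
N = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent

# CRT fields as stored in a PKCS#1-style private key.
d_p = d % (p - 1)
d_q = d % (q - 1)
q_inv = pow(q, -1, p)


def decrypt_crt(c):
    """Textbook RSA decryption using the CRT and Garner's formula."""
    m_p = pow(c, d_p, p)              # half-size exponentiation mod p
    m_q = pow(c, d_q, q)              # half-size exponentiation mod q
    h = (q_inv * (m_p - m_q)) % p     # correction term
    return m_q + q * h


message = 0x1234567890ABCDEF
ciphertext = pow(message, e, N)
recovered = decrypt_crt(ciphertext)
```

The two exponentiations use exponents and moduli of half the size, which is why side-channel attacks on RSA-CRT implementations typically leak information about $d_p$ and $d_q$ rather than d.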
Relationships Between RSA Key Elements. For the purpose of secret key recovery, we typically assume that the attacker knows the public key.
RSA keys have a lot of mathematical structure that can be used to relate the different components of the public and private keys together for key recovery algorithms. The RSA public and private keys are related to each other as
$$ed \equiv 1 \bmod (p-1)(q-1).$$
The modular equivalence can be removed by introducing a new variable k to obtain an integer relation
$$ed = 1 + k(p-1)(q-1).$$
We know that d < (p − 1)(q − 1), so k < e. The value of k is not known to the attacker, but since generally e ≤ 65537 in practice it is efficient to brute force over all possible values of k.
For attacks against the CRT coefficients $d_p$ and $d_q$, we can obtain similar relations:
$$ed_p = 1 + k_p(p-1) \quad \text{and} \quad ed_q = 1 + k_q(q-1) \qquad (1)$$
for some integers $k_p < e$ and $k_q < e$. Brute forcing over two independent 16-bit values can be burdensome, but we can relate $k_p$ and $k_q$ as follows. Rearranging the two relations, we obtain $ed_p - 1 + k_p = k_p p$ and $ed_q - 1 + k_q = k_q q$. Multiplying these together, we get
$$(ed_p - 1 + k_p)(ed_q - 1 + k_q) = k_p k_q N.$$
Reducing the above modulo e, we get
$$(k_p - 1)(k_q - 1) \equiv k_p k_q N \bmod e. \qquad (2)$$
Thus given a value for $k_p$, we can solve for the unique value of $k_q \bmod e$, and for applications that require brute forcing values of $k_p$ and $k_q$ we only need to brute force at most e pairs [IGA+15].
The multiplier k also has a nice relationship to these values. Multiplying the relations from Equation 1 together, we have
$$(ed_p - 1)(ed_q - 1) = k_p k_q (p-1)(q-1).$$
Substituting $(p-1)(q-1) = (ed-1)/k$ and reducing modulo e, we can relate the coefficients as
$$k \equiv -k_p k_q \bmod e.$$
Any of the secret values p, q, d, $d_p$, $d_q$, or $q_{inv}$ suffices to compute all of the other values when the public key (N, e) is known.
From either p or q, computing the other values is straightforward.
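The relations among the multipliers k, $k_p$, and $k_q$ can be checked numerically. The Python sketch below uses an assumed toy key (the primes and all names are illustrative) and a hypothetical helper `k_q_from_k_p` that solves the congruence $(k_p - 1)(k_q - 1) \equiv k_p k_q N \bmod e$ for $k_q$.

```python
# Toy key for illustration only.
p = 2**64 - 59
q = 2**61 - 1
N = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))
d_p, d_q = d % (p - 1), d % (q - 1)

# The integer multipliers; every division below is exact.
k = (e * d - 1) // ((p - 1) * (q - 1))
k_p = (e * d_p - 1) // (p - 1)
k_q = (e * d_q - 1) // (q - 1)


def k_q_from_k_p(k_p_guess, N, e):
    """Solve (k_p - 1)(k_q - 1) = k_p * k_q * N (mod e) for k_q.

    Collecting the k_q terms gives
        k_q * ((k_p - 1) - k_p * N) = (k_p - 1)  (mod e).
    pow() raises ValueError when the coefficient of k_q is not
    invertible mod e; that does not happen for the correct k_p.
    """
    coefficient = (k_p_guess - 1 - k_p_guess * N) % e
    return ((k_p_guess - 1) * pow(coefficient, -1, e)) % e


derived_k_q = k_q_from_k_p(k_p, N, e)
```

With this helper, an attack that needs both multipliers iterates over the fewer than e candidates for $k_p$ and derives the matching $k_q$ for each, instead of searching all pairs.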
For small e, N can be factored from d using the relation
$$ed = 1 + k(p-1)(q-1) = 1 + k(N - (p+q) + 1). \qquad (3)$$
The integer multiplier k can be recovered by rounding: $k = \lceil (ed-1)/N \rfloor$. Once k is known, then Equation 3 can be rearranged to solve for s = p + q. Once s is known, we have $(p+q)^2 = s^2 = p^2 + 2N + q^2$ and $s^2 - 4N = p^2 - 2N + q^2 = (p-q)^2$. Then N can be factored by computing gcd((p + q) − (p − q), N).
When e is small, p can be computed from $d_p$ as $p = \gcd((ed_p - 1)/k_p + 1, N)$, where $k_p$ can be brute forced from 1 to e. If $k_p$ is not known and is too large to brute force, with high probability for a random a, $p = \gcd(a^{ed_p - 1} - 1, N)$.
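Both recoveries translate almost line for line into Python. This is a minimal sketch with an assumed toy key; the function names `factor_from_d` and `p_from_dp` are hypothetical.

```python
from math import gcd, isqrt

# Toy key for illustration only.
p = 2**64 - 59
q = 2**61 - 1
N = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))
d_p = d % (p - 1)


def factor_from_d(N, e, d):
    """Factor N from the private exponent d, for small e."""
    k = (e * d - 1 + N // 2) // N     # nearest integer to (ed - 1)/N
    phi = (e * d - 1) // k            # (p - 1)(q - 1)
    s = N + 1 - phi                   # s = p + q
    t = isqrt(s * s - 4 * N)          # t = |p - q|
    smaller = gcd(s - t, N)           # s - t is twice the smaller prime
    return N // smaller, smaller


def p_from_dp(N, e, d_p):
    """Recover a prime factor of N from d_p by brute forcing k_p."""
    x = e * d_p - 1                   # equals k_p * (p - 1)
    for k_p in range(1, e):
        if x % k_p == 0:
            candidate = gcd(x // k_p + 1, N)
            if 1 < candidate < N:
                return candidate
    return None


factors_from_d = factor_from_d(N, e, d)
factor_from_dp = p_from_dp(N, e, d_p)
```

The loop in `p_from_dp` performs at most e − 1 gcd computations, which is negligible for e = 65537.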
Factoring from $q_{inv}$ is more complex. As noted in [HS09], $q_{inv}$ satisfies $q_{inv} q^2 - q \equiv 0 \bmod N$, and q can be recovered using Coppersmith's method, described below.

RSA Key Recovery with Consecutive bits known
This section covers techniques for recovering RSA private keys when large contiguous portions of the secret keys are known.The main technique used in this case is lattice basis reduction.
For the key recovery problems in this section, we can typically recover a large unknown chunk of bits of an unknown secret key value (p, d mod (p − 1), or d). We typically assume that the attacker has access to the public key (N, e) but does not have any other auxiliary information (about q or d mod (q − 1), for example). Knowledge of large contiguous portions of secret keys is unlikely to arise in side channels that involve noisy measurements, but could arise in scenarios where secrets are being read out of memory that got corrupted in an identifiable region. These techniques can also help make attacks more efficient if a high cost is paid to recover known bits.

Warm-up: Lattice attacks on low-exponent RSA with bad padding.
The main algorithmic technique used for RSA key recovery with contiguous bits is to formulate the problem as finding a small root of a polynomial modulo an integer, and then to use lattice basis reduction to solve this problem.
In order to introduce the main tool of using lattice basis reduction to find roots of polynomials, we will start with an illustrative example for the concrete application of breaking small-exponent RSA with known padding. In later sections we will show how to modify the technique to cover different RSA key recovery scenarios.
The original formulation of this problem is due to Coppersmith [Cop96]. Howgrave-Graham [HG97] gave a dual approach that we find easier to explain and easier to implement. May's survey [May10] contains a detailed description of the Coppersmith/Howgrave-Graham algorithm.
To set up the problem, we have an integer N, and a polynomial f(x) of degree k that has a root r modulo N, that is, f(r) ≡ 0 mod N. We wish to find r. Finding roots of polynomials can be done efficiently modulo primes [LLL82], so this problem is easy to solve if N is prime or the prime factorization of N is known. The Coppersmith/Howgrave-Graham methods are generally of interest when the prime factorization of N is not known: they give an efficient algorithm for finding all small roots (if they exist) modulo N of unknown factorization.
Cast the problem as finding roots of a polynomial. Let a = 0x01FFFFFFFFFFFFFFFF0000 be the known padding string, offset to the correct byte location. We also know the length of the message; in this case $m < 2^{16}$. Thus we have that $c = (a + m)^3 \bmod N$, for unknown small m. Let $f(x) = (a + x)^3 - c$; we have set up the problem so that we wish to find a small root m satisfying f(m) ≡ 0 mod N for this polynomial, whose coefficients we reduce modulo N.
Construct a lattice. Let the coefficients of f be $f(x) = x^3 + f_2 x^2 + f_1 x + f_0$. Let $M = 2^{16}$ be an upper bound on the size of the root m. We construct the matrix
$$B = \begin{bmatrix} M^3 & f_2 M^2 & f_1 M & f_0 \\ 0 & N M^2 & 0 & 0 \\ 0 & 0 & N M & 0 \\ 0 & 0 & 0 & N \end{bmatrix}.$$
We then apply the LLL lattice basis reduction algorithm to the matrix and take the shortest vector $v = (v_3 M^3, v_2 M^2, v_1 M, v_0)$ of the reduced basis.
Extract a polynomial from the lattice and find its roots. We then construct the polynomial $g(x) = v_3 x^3 + v_2 x^2 + v_1 x + v_0$ by dividing out the scaling of each column. The polynomial g has one integer root, 0x42, which is the desired solution for m.
This specific 4 × 4 lattice construction works to find roots up to size $N^{1/6}$. For the small key size we used in our example, this is only 16 bits, but since it scales directly with the modulus size, this same lattice construction would suffice to learn 170 unknown bits of message for a 1024-bit RSA modulus, or 341 bits of message for a 2048-bit RSA modulus. Lattice reduction on a 4 × 4 lattice basis with entries that have a few thousand bits is essentially instantaneous on a modern laptop.
More detailed explanation. Why does this work? The rows of this matrix correspond to the coefficient vectors of the polynomials f(x), $N x^2$, $N x$, and N. We know that each of these polynomials evaluated at x = m will be 0 modulo N. Each column is scaled by a power of M, so that the $\ell_1$ norm of any vector in this lattice is an upper bound on the value of the corresponding (un-scaled) polynomial evaluated at m. For a vector $v = (v_3 M^3, v_2 M^2, v_1 M, v_0)$ in the lattice with corresponding polynomial $g(x) = v_3 x^3 + v_2 x^2 + v_1 x + v_0$, we have
$$|g(m)| \le |v_3| M^3 + |v_2| M^2 + |v_1| M + |v_0| = |v|_1.$$
We have constructed the lattice so that every polynomial g we extract from it has the property that g(m) ≡ 0 mod N. We have also constructed our lattice so that the length of the shortest vector in a reduced basis will be less than N. The only integer multiple of N with absolute value less than N is 0, so by construction the polynomial corresponding to this short vector satisfies g(m) = 0 over the integers, not just modulo N. Since finding roots of polynomials over the integers, rationals, reals, and complex numbers can be done in polynomial time, we can compute the roots of this polynomial and check which of them is our desired solution.
This method will always work if the lattice is constructed properly. That is, we need to ensure that the reduced basis will contain a vector of length less than N. For this example, $\det B = M^6 N^3$. Heuristically, the LLL algorithm will find a vector of $\ell_2$ norm $|v|_2 \le 1.02^n (\det B)^{1/\dim B}$. We ignore the $1.02^n$ factor, and the difference between the $\ell_2$ and $\ell_1$ norms for the moment. Then the condition we wish to satisfy is
$$|g(m)| \le |v|_1 \approx (\det B)^{1/\dim B} < N.$$
For our example, we have $(\det B)^{1/\dim B} = (M^6 N^3)^{1/4} < N$. Solving for M, this will be satisfied when $M < N^{1/6}$. In this case, N has 96 bits, and m is 16 bits, so the condition is satisfied.
This can be extended to roots of size up to $N^{1/k}$, where k is the degree of the polynomial f (in our example, k = e = 3), by using a larger dimension lattice. Howgrave-Graham's dissertation [HG98] and May's survey [May10] give detailed explanations of this method and improvements.
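The whole warm-up attack fits in a short Python sketch. Several things here are assumptions made for illustration: the 96-bit toy modulus (built from the primes $2^{64} - 59$ and $2^{32} - 5$, not the modulus of the example above), the slow textbook LLL routine written with exact fractions, and the bound $M = 2^{12}$, which is chosen below $N^{1/6}$ so that the worst-case LLL guarantee applies and the sketch does not rely on heuristic behavior.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(B):
    """Gram-Schmidt vectors and coefficients mu for the rows of B."""
    n = len(B)
    Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in B[i]]
        for j in range(i):
            mu[i][j] = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
            v = [a - mu[i][j] * b for a, b in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu


def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL on integer row vectors (short, not fast)."""
    B = [list(row) for row in B]
    k = 1
    while k < len(B):
        for j in range(k - 1, -1, -1):        # size-reduce row k
            _, mu = gram_schmidt(B)
            c = round(mu[k][j])
            B[k] = [a - c * b for a, b in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        lhs = dot(Bs[k], Bs[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1])
        if lhs >= rhs:                        # Lovasz condition holds
            k += 1
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            k = max(k - 1, 1)
    return B


def small_roots(N, a, c, M):
    """Find all m in [0, M) that are roots of the short lattice polynomial."""
    f2, f1, f0 = (3 * a) % N, (3 * a * a) % N, (a**3 - c) % N
    B = [[M**3, f2 * M**2, f1 * M, f0],       # f(x), columns x^3, x^2, x, 1
         [0, N * M**2, 0, 0],                 # N x^2
         [0, 0, N * M, 0],                    # N x
         [0, 0, 0, N]]                        # N
    v = lll(B)[0]
    g = [v[3], v[2] // M, v[1] // M**2, v[0] // M**3]   # undo column scaling
    return [x for x in range(M)
            if sum(gi * x**i for i, gi in enumerate(g)) == 0]


N = (2**64 - 59) * (2**32 - 5)                # toy modulus, illustration only
a = 0x01FFFFFFFFFFFFFFFF0000                  # known padding
c = pow(a + 0x42, 3, N)                       # ciphertext of the padded message
roots = small_roots(N, a, c, 2**12)
```

For brevity the integer roots are found by trying every value below M; a real implementation would use a polynomial root finder and an optimized lattice library.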

Factorization from consecutive bits of p.
In this section we show how to use lattices to factor the RSA modulus N if a large portion of contiguous bits of one of the factors (without loss of generality p) is known.Coppersmith solves this problem in [Cop96] but we find the reformulation from Howgrave-Graham as "approximate integer common divisors" [HG01] simpler to apply, and will give that construction here.

Problem setup.
Let N = pq be an RSA modulus with equal-sized p and q. Choosing an example with numbers small enough to fit on the page, we use a 240-bit RSA modulus N. We assume N is known. Assume we know a large contiguous portion of the most significant bits b of p, so that p = a + r, where we do not know r but do know the value $a = 2^\ell b$. Here ℓ = 30 is the number of unknown bits, or equivalently the left shift of the known bits.
Cast the problem as finding the roots of a polynomial. Let f(x) = a + x. We know that there is some value r such that f(r) = p ≡ 0 mod p. We do not know p, but we know that p divides N and we know N. We know that the unknown r is small, and in particular |r| < R for some bound R that is known. Here, $R = 2^{30}$.

Construct a lattice. We can form the lattice basis
$$B = \begin{bmatrix} R^2 & R a & 0 \\ 0 & R & a \\ 0 & 0 & N \end{bmatrix}.$$
We then run the LLL algorithm on our lattice basis. Let $v = (v_2 R^2, v_1 R, v_0)$ be the shortest vector in the reduced basis.
Extract a polynomial and find the roots. We form the polynomial $g(x) = v_2 x^2 + v_1 x + v_0$. We can then calculate the roots of g. In this example, g has one integer root, r = 0x873209.
We can then reconstruct a + r and verify that gcd(a + r, N) factors N.
This 3 × 3 lattice construction works for any $|r| < p^{1/3}$, and directly scales as p increases. In our example, we chose p and q so that they have 120 bits, and r has 30 bits. However, this same construction will work to recover 170 bits from a 512-bit factor of a 1024-bit RSA modulus, or 341 bits from a 1024-bit factor of a 2048-bit RSA modulus.
More detailed explanation. The rows of this matrix correspond to the coefficient vectors of the polynomials x(x + a), x + a, and N. We know that each of these polynomials evaluated at x = r will be 0 modulo p, and thus every polynomial corresponding to a vector in the lattice has this property. As in the previous example, each column is scaled by a power of R, so that the $\ell_1$ norm of any vector in this lattice is an upper bound on the value of the corresponding (un-scaled) polynomial evaluated at r.
If we can find a vector in the lattice of length less than p, then it corresponds to a polynomial g that must satisfy |g(r)| < p. Since by construction, g(r) ≡ 0 mod p, this means that g(r) = 0 over the integers.
We compute the determinant of the lattice to verify that it contains a sufficiently small vector. For this example, $\det B = R^3 N$. This means we need $(\det B)^{1/\dim B} = (R^3 N)^{1/3} < p$. For an RSA modulus we have $p \approx N^{1/2}$, so solving for R gives $R < p^{1/3}$, or equivalently $R < N^{1/6}$. This method works up to $R < p^{1/2}$ at the limit by increasing the dimension of the lattice. This is accomplished by taking higher multiples of f and N. See Howgrave-Graham's dissertation [HG98] and May's survey [May10] for details on how to do this.
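The 3 × 3 construction can be run end to end with the following Python sketch. The toy primes ($2^{64} - 59$ and $2^{61} - 1$), the choice of 16 unknown bits, and the helper names are assumptions for illustration; the bound $R = 2^{16}$ is small enough that the worst-case LLL guarantee already ensures a vector shorter than p. The short textbook LLL routine is included so that the block runs on its own.

```python
from fractions import Fraction
from math import gcd


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(B):
    """Gram-Schmidt vectors and coefficients mu for the rows of B."""
    n = len(B)
    Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in B[i]]
        for j in range(i):
            mu[i][j] = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
            v = [a - mu[i][j] * b for a, b in zip(v, Bs[j])]
        Bs.append(v)
    return Bs, mu


def lll(B, delta=Fraction(3, 4)):
    """Textbook LLL on integer row vectors (short, not fast)."""
    B = [list(row) for row in B]
    k = 1
    while k < len(B):
        for j in range(k - 1, -1, -1):        # size-reduce row k
            _, mu = gram_schmidt(B)
            c = round(mu[k][j])
            B[k] = [a - c * b for a, b in zip(B[k], B[j])]
        Bs, mu = gram_schmidt(B)
        lhs = dot(Bs[k], Bs[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1])
        if lhs >= rhs:                        # Lovasz condition holds
            k += 1
        else:
            B[k], B[k - 1] = B[k - 1], B[k]
            k = max(k - 1, 1)
    return B


def recover_p(N, a, R):
    """Recover p = a + r with 0 <= r < R from the known high bits a."""
    B = [[R**2, R * a, 0],                    # x(x + a), columns x^2, x, 1
         [0, R, a],                           # x + a
         [0, 0, N]]                           # N
    v = lll(B)[0]
    g2, g1, g0 = v[0] // R**2, v[1] // R, v[2]    # undo column scaling
    for x in range(R):                        # brute-force the integer roots
        if g2 * x * x + g1 * x + g0 == 0 and 1 < gcd(a + x, N) < N:
            return gcd(a + x, N)
    return None


p, q = 2**64 - 59, 2**61 - 1                  # toy primes, illustration only
N = p * q
R = 2**16
a = p - (p % R)                               # p with its low 16 bits zeroed
found = recover_p(N, a, R)
```

Only N, a, and R are passed to `recover_p`; the unknown low bits of p are recovered from the integer root of the short polynomial.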

RSA key recovery from least significant bits of p
It is also straightforward to adapt this method to the case where the least significant bits of p are known and the contiguous chunk of unknown bits lies in the most significant positions: if the unknown chunk begins at bit position ℓ, the input polynomial will have the form $f(x) = 2^\ell x + a$. This can be multiplied by $2^{-\ell} \bmod N$ to make it monic and then solved exactly as above.

RSA key recovery from middle bits of p
RSA key recovery from middle bits of p is somewhat more complex than the previous examples, because there are two unknown chunks of bits in the most and least significant bits of p.
Problem setup. Assume we know a large contiguous portion of the middle bits of p, so that $p = a + r_\ell + 2^t r_m$, where a is an integer representing the known bits of p, $r_\ell$ and $r_m$ are unknown integers representing the least and most significant bits of p that we wish to solve for, and t is the starting bit position of the unknown most significant bits. We know that $|r_\ell| < R$ and $|r_m| < R$ for some bound R.
As a concrete example, let N be an RSA modulus and let a be the middle bits of one of its factors p; there are 16 unknown bits in each of the most and least significant bit positions. Thus we know that $R = 2^{16}$ in our concrete example. We wish to recover p.
Cast the problem as finding solutions to a polynomial. In the previous examples, we only had one variable to solve for. Here, we have two, so we need to use a bivariate polynomial. We can write down $f(x, y) = x + 2^t y + a$, so that $f(r_\ell, r_m) = p$.
In our concrete example, p has 164 bits, so we have $f(x, y) = x + 2^{148} y + a$. We hope to construct two polynomials $g_1(x, y)$ and $g_2(x, y)$ satisfying $g_1(r_\ell, r_m) = 0$ and $g_2(r_\ell, r_m) = 0$ over the integers. Then we can solve the system for the simultaneous roots.

Construct a lattice.
As before, we will use our input polynomial f and the public RSA modulus N to construct a lattice.Unfortunately for the simplicity of our example, the smallest polynomial that is guaranteed to result in a nontrivial bound on the solution size for our desired roots has degree 3, and results in a lattice of dimension 10.
As before, each column corresponds to a monomial that appears in our polynomials, and each row corresponds to a polynomial that evaluates to 0 mod p at our desired solution.
In our example, we will use the polynomials $f^3$, $f^2 y$, $f y^2$, $y^3 N$, $f^2$, $f y$, $y^2 N$, $f$, $yN$, and $N$; the monomials in the columns are $x^3$, $x^2 y$, $x y^2$, $y^3$, $x^2$, $xy$, $y^2$, $x$, $y$, and 1. Each column is scaled by the appropriate power of R.
We reduce this matrix using the LLL algorithm, and reconstruct the bivariate polynomials corresponding to each row of the reduced basis.Unfortunately, these are too large to fit on a page.
Solve the system of polynomials to find common roots. Heuristically, we would hope to only need two sufficiently short vectors and then compute the resultant of the corresponding polynomials or use a Gröbner basis to find the common roots, but in our example the two shortest vectors are not algebraically independent. In this case it suffices to use the first three vectors. Concretely, we construct an ideal over the ring of bivariate polynomials with integer coefficients whose basis is the polynomials corresponding to the three shortest vectors in the reduced basis for L(B) above, and then call a Gröbner basis algorithm on it. For this example, the Gröbner basis is exactly the polynomials (x − 0x339b, y − 0x5a94), which reveals the desired solutions for $x = r_\ell$ and $y = r_m$.
In this example, the nine shortest vectors all vanish at the desired solution, so we could have constructed our Gröbner basis from other subsets of these short vectors.
More detailed explanation. The determinant of our lattice is $\det B = R^{20} N^4$, and the lattice has dimension 10. We hope to find two vectors $v_1$ and $v_2$ of length approximately $(\det B)^{1/\dim B}$; this is not guaranteed to be possible, but for random lattices we expect the vectors in a reduced basis to have close to the same lengths. The $\ell_1$ norms of the vectors $v_1$ and $v_2$ are upper bounds on the magnitude of the corresponding polynomials $f_{v_1}(x, y)$, $f_{v_2}(x, y)$ evaluated at the desired roots $r_\ell$, $r_m$. In order to guarantee that these vanish, we want the inequality
$$|f_{v_i}(r_\ell, r_m)| \le |v_i|_1 \approx (\det B)^{1/\dim B} < p.$$
Thus the desired condition for success is
$$(R^{20} N^4)^{1/10} < p \approx N^{1/2}, \quad \text{that is,} \quad R < N^{1/20}.$$
In our example, N was 326 bits long, and we chose R to have 16 bits.
This attack was applied in [BCC+13] to recover RSA keys generated by a faulty random number generator that generated primes with predictable sequences of bits.

RSA key recovery from multiple chunks of bits of p
The above idea can be extended to handle more chunks of p at the cost of increasing the dimension of the lattice. Each unknown "chunk" of bits introduces a new variable in the linear equation that will be solved for p. At the limit, the algorithm requires 70% of the bits of p divided into at most log log N blocks [HM08].

Open problem: RSA key recovery from many nonconsecutive bits of p
The above methods scale poorly with the number of chunks of known bits.It is an open problem to develop a subexponential-time method to recover an RSA key or factor the RSA modulus N with more than log log N unknown chunks of bits, if these bits are only known about, say, one factor p of N .If information is known about both p and q or other fields of the RSA private key, then the methods of Section 4.3.1 may be applicable.

Partial recovery of RSA d p
Recovering the CRT coefficient $d_p = d \bmod (p-1)$ from a large block of contiguous bits can be done using the approach given in Sections 4.2.2, 4.2.3 and 4.2.4. We illustrate the method in the case of known most significant bits.

Problem setup.
Let N be a 240-bit RSA modulus. We will use public exponent e = 65537.
In this problem, we are given some of the most significant bits b of $d_p$, and we want to recover the rest. As before, let ℓ be the number of least significant bits of $d_p$ we need to recover, so that there is some value $a = 2^\ell b$ with $a + r = d_p$ for some $r < 2^\ell$.
Cast the problem as finding the roots of a polynomial. We start with the relation $ed_p \equiv 1 \bmod (p-1)$ and rewrite it as an integer relation by introducing a new variable $k_p$:
$$ed_p = 1 + k_p(p-1). \qquad (4)$$
The integer $k_p$ is unknown, but we know that $k_p < e$ since $d_p < p - 1$. In our example, and typically in practice, we have e = 65537, so we will run the attack for all possible values of $1 \le k_p < 65537$. With the correct parameters, we are guaranteed to find a solution for the correct value of $k_p$. For other incorrect guesses of $k_p$, in practice the attack is unlikely to result in any solutions found, but any spurious solutions that arise can be eliminated because they will not result in a factorization of N.
We can rearrange Equation 4, with $e^{-1}$ computed modulo N:
$$d_p + e^{-1}(k_p - 1) \equiv 0 \bmod p.$$
Let $A = a + e^{-1}(k_p - 1) \bmod N$. Then we wish to find a small root r of the polynomial f(x) = A + x modulo p, where |r| < R.
For our concrete example, we have $R = 2^{30}$ and $k_p = 23592$.
Construct a lattice. Since the form of the problem is identical to the previous section, we use the same lattice construction:
$$B = \begin{bmatrix} R^2 & R A & 0 \\ 0 & R & A \\ 0 & 0 & N \end{bmatrix}.$$
We apply the LLL algorithm to this basis and take the shortest vector $v = (v_2 R^2, v_1 R, v_0)$ in the reduced basis. We construct the corresponding polynomial $g(x) = v_2 x^2 + v_1 x + v_0$. Computing the roots of g, we discover that r = 0x39d9b141 is among them, and that gcd(A + r, N) = p.
At the limit, this technique can work up to $R < p^{1/2}$ [BM03] by increasing the dimension of the lattice with higher degree polynomials and higher multiplicities of the root.
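The algebra that turns known bits of $d_p$ into a polynomial with a small root modulo p can be checked with a few lines of Python. This sketch covers only the setup, not the lattice step; the toy key, the choice of 16 unknown bits, and the helper name `shifted_constant` are assumptions for illustration, and the correct $k_p$ is taken from the key rather than brute forced.

```python
from math import gcd

# Toy key for illustration only.
p, q = 2**64 - 59, 2**61 - 1
N, e = p * q, 65537
d_p = pow(e, -1, p - 1)               # equals d mod (p - 1)
k_p = (e * d_p - 1) // (p - 1)        # exact by the relation e*d_p = 1 + k_p*(p - 1)

ell = 16                              # number of unknown low bits (assumed)
r = d_p % 2**ell                      # the part an attacker would not know
a = d_p - r                           # the known most significant bits, shifted


def shifted_constant(a, k_p_guess, N, e):
    """Constant term A of f(x) = A + x for a given guess of k_p."""
    return (a + pow(e, -1, N) * (k_p_guess - 1)) % N


A = shifted_constant(a, k_p, N, e)
factor = gcd(A + r, N)                # A + r is a multiple of p
```

An attack would compute A for each candidate $k_p$ and feed f(x) = A + x to the same 3 × 3 lattice construction used for known bits of p.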

Partial recovery of RSA d from most significant bits is not possible
Partial recovery for d varies somewhat depending on the bits that are known and the size of e.Since e is small in practice, we will focus on that case here.Most significant bits of d.When e is small enough to brute force, the most significant half of bits of d can be recovered easily with no additional information.This implies that if full key recovery were possible from only the most significant half of bits of d, then small public exponent RSA would be completely broken.Since small public exponent RSA is not known to be insecure in general, this unfortunately means that no such key recovery method is possible for this case.
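Consider the RSA equation $ed = 1 + k(N - (p + q) + 1)$, which can be rewritten as $d = kN/e - (k(p + q - 1) - 1)/e$. Since $p + q \approx \sqrt{N}$ and $k < e$, the second term affects only the least significant half of the bits of $d$, so the value $kN/e$ shares approximately the most significant half of its bits in common with $d$.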
On the positive side, this observation allows the attacker to narrow down possible values for k if the attacker knows any most significant bits of d for certain.See Boneh, Durfee, and Frankel [BDF98] for more details.
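The observation that $kN/e$ approximates $d$ can be checked numerically. The following is a minimal sketch: the primes, the exponent, and all variable names are illustrative assumptions, not values from the text, and $d$ is taken modulo $(p-1)(q-1)$.

```python
# Toy RSA key; the primes and exponent are illustrative choices.
p, q = 1_000_000_007, 1_000_000_009
e = 65537
N = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)
k = (e * d - 1) // phi       # multiplier in e*d = 1 + k*phi, with 0 < k < e

# An attacker does not know k, but it is one of fewer than e candidates.
# For the correct k, floor(k*N/e) overshoots d by less than p + q,
# so only about the low half of the bits of d can differ.
approx = k * N // e
gap = approx - d
print(hex(d))
print(hex(approx))
print(gap.bit_length(), "low-order bits can differ, out of", d.bit_length())
```

Because the gap is bounded by $p + q$, the two printed values agree on roughly their top half, which is exactly the information an attacker can compute without any side channel.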

Partial recovery of RSA d from least significant bits
For low-exponent RSA, if an adversary knows the least significant $t$ bits of $d$, then this can be transformed into knowledge of the least significant $t$ bits of $p$, and then the method of Section 4.2.3 can be applied. This algorithm is due to Boneh, Durfee, and Frankel [BDF98]. Assume the adversary knows the $t$ least significant bits of $d$; call this value $d_0$. Then, writing $s = p + q$, we have $e d_0 \equiv 1 + k(N - s + 1) \bmod 2^t$. The adversary tries all possible values of $k$, $1 < k < e$, to obtain at most $e$ candidate values for the $t$ least significant bits of $s$.
Then for each candidate $s$, the least significant bits of $p$ are solutions to the quadratic equation $p^2 - sp + N \equiv 0 \bmod 2^t$. Let $a$ be a candidate solution for the least significant bits of $p$. Putting this in the context of Section 4.2.3, the attacker wishes to solve $f(x) = a + 2^t x \equiv 0 \bmod p$. This can be multiplied by $2^{-t} \bmod N$ and the exact method of Section 4.2.3 can be applied to recover $p$. Since at the limit the methods of Section 4.2.3 can recover an unknown part of $p$ of size up to $N^{1/4}$, this method will work when as few as a quarter of the bits of $N$ are known from the least significant bits of $d$.
There are more sophisticated lattice algorithms that involve different tradeoffs, but for very small e, which is typically the case in practice, they require nearly all of the least significant bits of d to be known [BM03].
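The reduction from low bits of $d$ to low bits of $p$ can be sketched for toy sizes as follows. This is only an illustration: the function name is hypothetical, and the congruence for $s = p + q$ and the quadratic for $p$ are both solved by exhaustive search modulo $2^t$, which is feasible only for very small $t$.

```python
def p_candidates_from_d_lsbs(N, e, d0, t):
    """Residues mod 2^t that p can take, given d0 = d mod 2^t."""
    mod = 1 << t
    found = set()
    for k in range(1, e):                      # unknown multiplier k
        for s in range(mod):                   # guess for (p + q) mod 2^t
            if (e * d0 - 1 - k * (N - s + 1)) % mod:
                continue                       # violates e*d0 = 1 + k(N - s + 1)
            for x in range(mod):               # roots of x^2 - s*x + N mod 2^t
                if (x * x - s * x + N) % mod == 0:
                    found.add(x)
    return found

# Toy key N = 29 * 31 with e = 17; the attacker sees the low 4 bits of d.
d = pow(17, -1, 28 * 30)
candidates = p_candidates_from_d_lsbs(899, 17, d % 16, 4)
print(sorted(candidates))
```

Each surviving residue would then be fed to the lattice method as the known low bits $a$ of $p$; wrong candidates are discarded when they fail to yield a factor of $N$.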

Non-consecutive bits known with redundancy
This section covers key recovery in the case that many non-consecutive bits of secret values are known or need to be recovered.The lattice methods covered in the previous section can be adapted to recover multiple chunks of unknown key bits, but at a high cost: the lattice dimension increases with the number of chunks, and when a large number of bits is to be recovered, the running time can be exponential in the number of chunks.
In this section, we explore a different technique that allows a different tradeoff.In this case, the attacker has knowledge of many non-contiguous bits of secret key values, and knows these for multiple secret values of the key.The attacker might have learned parts of both p and q, or d mod (p − 1) and d mod (q − 1), for example.We begin by analyzing a case that is less likely to arise in practice, the case of random erasures of bits of p and q, in order to give the main ideas behind the algorithm in the simplest setting.

Random known bits of p and q
The main technique used for these cases is a branch and prune algorithm.The idea behind the branch and prune algorithm is to write down an integer relationship between the elements in the secret key and the public key, and progressively solve for unknown bits of the secret key, starting at the least significant bits.This produces a tree of solutions: every branch corresponds to guesses for one or more unknown bits at a particular solution, and branches are pruned if the guesses result in incorrect relationships to the public key.
This algorithm is presented and analyzed in [HS09].
Problem setup.Let N = 899.Imagine we have learned some bits of p and q, in an erasure model: for each bit position, we either know the bit value, or we know that we do not know it.For example, we have p = ⊔11 ⊔ 1, and q = ⊔1 ⊔ 0⊔.
Defining an integer relation.The integer relation that we will take advantage of for this example is N = pq.
Iteratively solve for each bit.The main idea of the algorithm is to iteratively solve for the bits of the unknowns p and q, starting at the least significant bits.These can then be checked against the known public value of N .At the least significant bit, the value is known for p and is unknown for q.There are two options for the value of q, but only the bit value 1 satisfies the constraint that pq = N mod 2. The algorithm then proceeds to the next step, where the value of the second bit is known for q but not for p.Only the bit value 1 satisfies the constraint pq = N mod 2 2 , so the algorithm continues down this branch.Since this generates a tree, the tree can be traversed in depth-first or breadth-first order; depth-first will be more memory efficient.This is illustrated in Figure 10.
Figure 10: The branch and prune tree for our numeric example.The algorithm begins at the right-hand node representing the least significant bits, and iteratively branches and prunes guesses for successive bits moving towards the most significant bits.
The algorithm works because N = pq mod 2 i for all values of i.Additionally, we want some assurance that an incorrect guess for a value at a particular bit location should eventually lead to that branch being pruned.Heuristically, when the ith bits of both p and q are unknown, the tree will branch; when bit i is known for one but not the other, there will be a unique solution; and when the ith bits of both p and q are known, an incorrect solution has around a 50% probability of being pruned.Thus the algorithm is expected to be efficient as long as there are not long runs of simultaneous unknown bits.We assume the length of p and q is known.Once the algorithm has traversed this many bits, the final solution pq = N can be checked without modular constraints.
When random bits are known from p and q, the analysis of [HS09] shows that the tree of generated solutions is expected to have polynomial size when 57% of the bits of p and q are revealed at random.This algorithm can still be efficient if the distribution of bits known is not random, as long as it allows efficient pruning of the tree.An example would be learning 3 out of every 5 bits of p and q, as in [YGH16].
Paterson, Polychroniadou, and Sibborn [PPS12] give an analysis of the required information for different scenarios, and observe that doing a depth-first search is more efficient memory-wise than a breadth-first search.
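The branch and prune procedure for the erasure example above can be sketched in a few lines. This is a breadth-first illustration under assumptions of mine: the function name is hypothetical, and erased bits are written as `?` in place of the erasure symbol used in the text.

```python
def branch_and_prune(N, p_pattern, q_pattern):
    """Recover (p, q) with p*q == N from bit patterns (MSB first, '?' = erased)."""
    p_known, q_known = p_pattern[::-1], q_pattern[::-1]   # index 0 = LSB

    def options(c):
        return (0, 1) if c == "?" else (int(c),)

    partial = [(0, 0)]                      # candidates for the low bits of (p, q)
    for i in range(len(p_known)):
        mod = 1 << (i + 1)
        survivors = []
        for p, q in partial:
            for pb in options(p_known[i]):
                for qb in options(q_known[i]):
                    pc, qc = p | (pb << i), q | (qb << i)
                    if (pc * qc - N) % mod == 0:      # prune on p*q = N mod 2^(i+1)
                        survivors.append((pc, qc))
        partial = survivors
    # With all bits assigned, check the relation over the integers.
    return [(p, q) for p, q in partial if p * q == N]

print(branch_and_prune(899, "?11?1", "?1?0?"))   # [(31, 29)]
```

On this input one spurious candidate, (15, 13), survives every modular check because $15 \cdot 13 \equiv 899 \bmod 32$; it is eliminated only by the final integer check, which is why the known bit length matters.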

Random known bits of the Chinese remainder coefficients d mod (p − 1) and d mod (q − 1)
The description in Section 4.3.1 can be extended to recover the Chinese remainder exponents $d_p = d \bmod (p - 1)$ and $d_q = d \bmod (q - 1)$ using the same branch and prune technique. This is the most common case encountered in RSA side channel attacks.
Factorization of N = pq given non-consecutive bits of d p , d q .
Problem setup.Let N = 899 be the RSA public modulus, and e = 17 be the public exponent.Imagine that the adversary has recovered some bits of the secret Chinese remainder exponents d p = d mod (p − 1) and d q = d mod (q − 1).
We wish to recover the missing unknown bits of d p and d q , which will allow us to recover the secret key itself.

Define integer relations.
We know that $e d_p \equiv 1 \bmod (p - 1)$ and $e d_q \equiv 1 \bmod (q - 1)$. We rewrite these as integer relations
$$e d_p = 1 + k_p(p - 1), \qquad e d_q = 1 + k_q(q - 1).$$
We have no information about the values of $p$ and $q$, but their values are uniquely determined from a guess for $d_p$ or $d_q$.
We also know that pq = N.
The values k p and k q are unknown, so we must brute force them by running the algorithm for all possible values.We expect it to fail for incorrect guesses, and succeed for the unique correct guess.Equation 2 in Section 4.1 shows that there is a unique value of k q for a given guess for k p .Since k p < e we need to brute force at most e pairs of values for k p and k q .
In our example, we have k p = 13 and k q = 3, although this won't be verified as the correct guesses until the solution is found.
Iteratively solve for each bit.With our integer relations in place, we can then use them to iteratively solve for each bit of the unknowns d p , d q , p, and q, starting from the least significant bit.We check guesses for each value against our three integer relations, and at bit i we prune those that do not satisfy the relations mod 2 i .We have three relations and four unknowns, so we generate at most two new branches at each bit.
Figure 11: A sample branch and prune tree for recovering $d_p$ and $d_q$ from known bits, starting from the least significant bits on the right side of the tree. At each bit location, the value of $p$ up to bit $i$ is uniquely determined by the guess for $d_p$ up to bit $i$, and the value of $q$ up to bit $i$ is uniquely determined by the guess for $d_q$ up to bit $i$. The red X marks the branches that are pruned by verifying the relation $pq \equiv N \bmod 2^i$.

Since the values of $p$ and $q$ up to bit $i$ are uniquely determined by our guess for $d_p$ and $d_q$ up to bit $i$, the algorithm prunes solutions based on the relation $pq \equiv N \bmod 2^i$. The analysis of this case is then identical to the case of learning bits of $p$ and $q$ at random.
For incorrect guesses for the values of k p and k q , we expect the equations to act like random constraints, and thus to quickly become unsatisfiable.Once there are no more possible solutions in a tree, the guess for k p and k q is known to be incorrect.This is illustrated by Figure 11.
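The same idea for the Chinese remainder exponents can be sketched as below, for a single guess of $(k_p, k_q)$. The function name and the known-bit patterns are hypothetical (the patterns are chosen to be consistent with $d_p = 23$ and $d_q = 5$ for $N = 899$, $e = 17$), and the sketch assumes $k_p$ and $k_q$ are odd so that they can be inverted modulo powers of two.

```python
def recover_crt_exponents(N, e, kp, kq, dp_pattern, dq_pattern):
    """Branch and prune on (d_p, d_q) for one guess of (k_p, k_q).

    Patterns are MSB first with '?' for unknown bits. Assumes kp, kq odd.
    """
    dp_known, dq_known = dp_pattern[::-1], dq_pattern[::-1]

    def options(c):
        return (0, 1) if c == "?" else (int(c),)

    partial = [(0, 0)]
    for i in range(len(dp_known)):
        mod = 1 << (i + 1)
        kp_inv, kq_inv = pow(kp, -1, mod), pow(kq, -1, mod)
        survivors = []
        for dp, dq in partial:
            for b in options(dp_known[i]):
                for c in options(dq_known[i]):
                    dp_i, dq_i = dp | (b << i), dq | (c << i)
                    # e*d_p = 1 + k_p(p - 1) fixes p mod 2^(i+1); same for q.
                    p_i = (e * dp_i - 1 + kp) * kp_inv % mod
                    q_i = (e * dq_i - 1 + kq) * kq_inv % mod
                    if (p_i * q_i - N) % mod == 0:
                        survivors.append((dp_i, dq_i))
        partial = survivors

    solutions = []
    for dp, dq in partial:                   # final check over the integers
        if (e * dp - 1) % kp == 0 and (e * dq - 1) % kq == 0:
            p, q = (e * dp - 1) // kp + 1, (e * dq - 1) // kq + 1
            if p * q == N:
                solutions.append((dp, dq))
    return solutions

print(recover_crt_exponents(899, 17, 13, 3, "1?1?1", "0?1?1"))   # [(23, 5)]
```

A full attack would wrap this in a loop over all guesses for $k_p$ (with $k_q$ derived from it) and keep the guess whose tree yields a factorization.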

Recovering RSA keys from indirect information
For this type of key recovery algorithm, it is not always necessary to have direct knowledge of bits of the secret key values with certainty.It can still be possible to apply the branchand-prune technique to recover secret keys even if only "implicit" information is known about the secret values, as long as this implicit information implies a relationship that can be checked to prioritize or prune candidate key guesses from the least significant bits.Examples in the literature include [BBG + 17], which computes partial sliding window square-and-multiply sequences for candidate guesses and compares them to the ground truth measurements, and [MVH + 20], which compares the sequence of program branches in a binary GCD algorithm implementation computed over the cryptographic secrets to a ground truth measurement.

Open problem: Random known bits without redundancy
As mentioned in Section 4.2.6, it is an open problem to recover an RSA secret key when many nonconsecutive chunks of bits need to be recovered, and the bits known are from only one secret key field, with no additional information from other values. Applying the branch-and-prune methods discussed in this section to a single secret key value, say a factor $p$ of $N$, where random bits are known, would result in a tree with exponentially many solutions unless additional information were available to prune the tree.
5 Key recovery methods for DSA and ECDSA

DSA and ECDSA preliminaries
From the perspective of partial key recovery, DSA and ECDSA are very similar, and we will cover them together.We will use slightly nonstandard notation to describe each signature scheme to make them as close as possible, so that we can use the same notation to describe the attacks simultaneously.

DSA
The Digital Signature Algorithm [NIS13] (DSA) is an adaptation of the ElGamal Signature Scheme [EG85] that reduces the amount of computation required and the resulting signature size by using Schnorr groups [Sch90].
Parameter Generation.A DSA public key includes several global parameters specifying the group to work over: a prime p, a subgroup of order n satisfying n | (p − 1), and an integer g that generates a group of order n mod p, where n is typically much smaller than p, for example 256 bits for a 2048-bit p.A single set of group parameters can be shared across many public keys, or individually generated for a given public key.
To generate a long-term private signing key, an implementation starts by choosing the secret key 0 < d < n and computing y = g d mod p.The public key is the tuple (y, g, p, n) and the private key is (d, g, p, n).Signature Generation.To sign a message m, implementations apply a collision-resistant hash function H to m to obtain a hashed message h = H(m).To generate the signature, the implementation generates an ephemeral secret integer 0 < k < n, and computes the integers r = g k mod p mod n, and s = k −1 (h + dr) mod n.The signature is the pair (r, s).

ECDSA
The Elliptic Curve Digital Signature Algorithm (ECDSA) is an adaptation of DSA to use elliptic curves instead of Schnorr groups.
Parameter Generation.An ECDSA public key includes global parameters specifying an elliptic curve E over a finite field together with a generator point g of a subgroup over E of order n.
To generate a long-term private signing key, an implementation starts by choosing a secret integer 0 < d < n, and computing the elliptic curve point y = dg on E. The public key is the elliptic curve point y together with the global parameters specifying E, g, and n.The private key is the integer d together with these global parameters.Signature Generation.To sign a message m, implementations apply a collision-resistant hash function H to m to obtain a hashed message h = H(m).To generate the signature, the implementation generates an ephemeral secret 0 < k < n.The implementation computes the elliptic curve point kg and sets the value r to be the x-coordinate of kg.The implementation then computes the integer s = k −1 (h + dr) mod n.The signature is the pair of integers (r, s).

Nonce recovery and (EC)DSA security.
The security of (EC)DSA is extremely dependent on the signature nonce k being securely generated, uniformly distributed, and unique for every signature.If the nonce for one or more signatures is generated in a vulnerable manner, then an attacker may be able to efficiently recover the long-term secret signing key.Because of this property, side channel attacks against (EC)DSA almost universally target properties of the signature nonces.
Key recovery from signature nonce. For a DSA or ECDSA key, if the nonce $k$ is known for a single signature, it is simple to compute the long-term private key. Rearranging the expression for $s$, the secret key $d$ can be recovered as
$$d = r^{-1}(ks - h) \bmod n. \qquad (5)$$
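Rearranging $s = k^{-1}(h + dr) \bmod n$ gives $d = r^{-1}(ks - h) \bmod n$. A minimal sketch with a toy DSA group follows; the parameters ($p = 23$, $n = 11$, $g = 2$) and the function names are illustrative assumptions, far too small for real use.

```python
def dsa_sign(h, d, k, g, p, n):
    """Toy DSA signature on hash h with secret key d and nonce k."""
    r = pow(g, k, p) % n
    s = pow(k, -1, n) * (h + d * r) % n
    return r, s

def key_from_nonce(h, r, s, k, n):
    """Recover the secret key from one signature and its nonce."""
    return (s * k - h) * pow(r, -1, n) % n

p, n, g = 23, 11, 2          # g = 2 has order 11 modulo 23
r, s = dsa_sign(h=5, d=7, k=3, g=g, p=p, n=n)
print((r, s))                               # (8, 2)
print(key_from_nonce(5, r, s, 3, n))        # 7
```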

(EC)DSA key recovery from most significant bits of the nonce k
There are two families of techniques for (EC)DSA key recovery from most significant bits of the nonce k.Both techniques require knowing information about the nonce used in multiple signatures from the same secret key.We assume that the attacker knows the long-term public signature verification key, and has access to multiple signatures generated using the corresponding secret signing key.The attacker also needs to know the hash of the messages that the signatures correspond to.The first technique is via lattices.This is generally considered more straightforward to implement, and works well when more nonce bits are known, and information from fewer signatures is available: we would need to know at least two most significant bits from the nonces of dozens to hundreds of signatures.We cover this technique below.
The second technique is via Fourier analysis. This technique can deal with as little as one known most significant bit from signature nonces, but empirically appears to require an order of magnitude or more signatures than the lattice approach, and as many as $2^{32}$ to $2^{35}$ for record computations [ANT + 20]. We leave a more detailed tutorial on this technique to future work. Nice descriptions of the algorithm can be found in [DHMP13, TTA18].

Lattice attacks
The main idea behind lattice attacks for (EC)DSA key recovery is to formulate the (EC)DSA key recovery problem as an instance of the Hidden Number Problem and then compute the shortest vector of a specially constructed lattice to reveal the solution.
Below we give a simplified example that shows how to recover the key from a small number of signatures when many of the most significant bits of the nonce are zero, and then we will show how to extend the attack to more signatures with fewer bits known from each nonce, and cover the case of arbitrary bits known from the nonce.Problem setup.Let p = 0xffffffffffffd21f be a 64-bit prime, and let E : y 2 = x 3 +3 be an elliptic curve over F p .Let g = (1, 2) be our generator point on E, which has order n = 0xfffffffefa23f437.
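Cast the problem as a system of equations. Our signatures above satisfy the equivalencies
$$s_1 \equiv k_1^{-1}(h_1 + d r_1) \bmod n, \qquad s_2 \equiv k_2^{-1}(h_2 + d r_2) \bmod n.$$
The values $k_1$, $k_2$, and $d$ are unknown; the other values are known. We can eliminate the variable $d$ and rearrange terms as follows:
$$k_1 - s_1^{-1} s_2 r_1 r_2^{-1} k_2 + s_1^{-1} r_1 h_2 r_2^{-1} - s_1^{-1} h_1 \equiv 0 \bmod n.$$
We can then simplify the above as
$$k_1 + t k_2 + u \equiv 0 \bmod n,$$
where $t = -s_1^{-1} s_2 r_1 r_2^{-1}$ and $u = s_1^{-1} r_1 h_2 r_2^{-1} - s_1^{-1} h_1$. We wish to solve for $k_1$ and $k_2$, and we know that they are both small. Let $|k_1|, |k_2| < K$.
For our example, we have $K = 2^{32}$.
Construct a lattice. We construct the following lattice basis:
$$B = \begin{pmatrix} n & 0 & 0 \\ t & 1 & 0 \\ u & 0 & K \end{pmatrix}$$
The vector $v = (-k_1, k_2, K)$ is in this lattice by construction, and we expect it to be particularly short.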
Calling the BKZ algorithm on B results in a basis that contains this short vector v = (−0x270feca3, 0x4dbd2db0, 0x100000000) as the third vector in the reduced basis.We can verify that the value r 1 in our example matches the x-coordinate of k 1 g, and we can use Equation 5 to compute the private key d.
More detailed explanation.In our example, we have constructed a lattice that is guaranteed to contain our target vector.In order for this method to work, we hope that it is the shortest vector, or close to the shortest vector in the lattice, and we solve the shortest vector problem in the lattice in order to find it.The vector v = (k 1 , k 2 , K) has length |v| 2 ≤ √ 3K by construction.Our lattice has determinant det B = nK.Ignoring constants for the moment, if our lattice were truly random, we would expect the shortest vector to have length ≈ det B 1/ dim B .Thus if |v| 2 < det B 1/ dim B , we expect it to be the shortest vector in the lattice, and to be found by a sufficiently good approximation to the shortest vector problem.
For our example, we expect this to be satisfied when $K < (nK)^{1/3}$, or when $K < \sqrt{n}$. The way we have presented this method may remind the reader of the flavor of the methods in Section 4.2.1. The specific lattice construction used here is a sort of "dual" to the constructions from Section 4.2.1, in that the target vector is the desired solution to our system of equations. However, in contrast to Section 4.2.1, we are not guaranteed to find the solution we desire once we find a sufficiently short vector: this method can fail with probability that decreases the shorter our target vector $v$ is compared to the expected shortest vector length.
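The two-signature relation $k_1 + t k_2 + u \equiv 0 \bmod n$ and the basis with rows $(n, 0, 0)$, $(t, 1, 0)$, $(u, 0, K)$ can be illustrated on fabricated toy data. In this sketch everything is an assumption made for illustration: the group order, the secret values, and the $r_i$ (which stand in for values a real signer would derive from $g^k$ or $kg$). An exhaustive search over small nonces replaces lattice reduction, which is only feasible because $K$ is tiny.

```python
n = 2**31 - 1            # toy prime group order (not the example's n)
K = 2**10                # nonce bound, comfortably below sqrt(n)

# Fabricated test data: d, k1, k2 are secrets; r1, r2 are stand-in values.
d, k1, k2 = 12345, 700, 900
r1, h1 = 1111, 2222
r2, h2 = 3333, 4444
s1 = pow(k1, -1, n) * (h1 + d * r1) % n
s2 = pow(k2, -1, n) * (h2 + d * r2) % n

# Eliminate d to obtain k1 + t*k2 + u = 0 (mod n).
s1_inv, r2_inv = pow(s1, -1, n), pow(r2, -1, n)
t = -s1_inv * s2 * r1 * r2_inv % n
u = (s1_inv * r1 * h2 * r2_inv - s1_inv * h1) % n

# Basis rows: k2*B[1] + B[2] - j*B[0] equals (-k1, k2, K) for some integer j.
B = [[n, 0, 0], [t, 1, 0], [u, 0, K]]

# Exhaustive stand-in for lattice reduction: all nonce pairs below K.
small = [((-t * c - u) % n, c) for c in range(1, K) if (-t * c - u) % n < K]

# Each candidate k1 yields a candidate key via d = r^-1 (k*s - h) mod n.
recovered = [(s1 * a - h1) * pow(r1, -1, n) % n for a, _ in small]
print(small, recovered)
```

Since $K^2$ is far smaller than $n$, very few nonce pairs below $K$ satisfy the relation, which mirrors the heuristic that the target vector is the shortest one when $K < \sqrt{n}$. In a realistic attack the rows of `B` would instead be passed to an LLL or BKZ implementation.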
The Hidden Number Problem.The lattice-based algorithms we describe for solving these problems are based on the Hidden Number Problem introduced by Boneh and Venkatesan [BV96].They applied the technique to show that the most significant bits of a Diffie-Hellman shared secret are hardcore.Nguyen and Shparlinski showed how to use this approach to break DSA and ECDSA from information about the nonces [NS02,NS03].

Various extensions of the technique can deal with different numbers of bits known per signature [BvSY14] or errors [DDE + 18].
There is another algorithm to solve this problem using Fourier analysis [Ble98, DHMP13] originally due to Bleichenbacher; it requires more samples than the lattice approach but can handle fewer bits known.

Scaling to many signatures to decrease the number of bits known.
To decrease the number of bits required from each signature, we can incorporate more signatures into the lattice.If we have access to many signatures (r 1 , s 1 ), . . ., (r m , s m ) on message hashes h 1 , . . ., h m , we use the same method above to write down equivalencies s i ≡ k −1 i (h i + dr i ) mod n, then as above we rearrange terms and eliminate the variable d to obtain We then construct the lattice In order to solve SVP, we must run an algorithm like BKZ with block size dim L(B) = m + 1.Using BKZ to look for the shortest vector can be done relatively efficiently up to dimension around 100 currently; beyond that it becomes increasingly expensive.In practice, one can often achieve a faster running time for fixed parameters by using more samples to construct a larger dimension lattice, and applying BKZ with a smaller block size to find the target vector.This method can recover a secret key from knowledge of the 4 most significant bits of nonces from 256-bit ECDSA signatures using about 70 samples, and 3 most significant bits using around 95 samples.For fewer bits known, either the Fourier analysis technique or a more powerful application of these lattice techniques is required, along with significantly more computational power.
Known nonzero most significant bits.If the most significant bits of the k i are nonzero and known, we can write k i = a i + b i , where the a i are known, and the b i are small, so satisfy some bound |b i | < K. Then substituting into Equation 6, we obtain , and use the same lattice construction as above, with u ′ i substituted for u i .Nonce rebalancing.The signature nonces k i take values in the range 0 < k i < n, but the lattice construction bounds the absolute value |k i |.Thus if we know that 0 < k i < K for some bound K, we can achieve a tighter bound by renormalizing the signatures.Let Then we can write Equations 7 as Thus we have an equivalent problem with t ′ i = t i , u ′ i = (t i + 1)K/2 + u i , and K ′ = K/2, and can solve as before.This optimization can make a significant difference in practice by reducing the number of required samples.

(EC)DSA key recovery from least significant bits of the nonce k
The attack described in the previous section works just as well for known least significant bits of the (EC)DSA nonce.

Problem setup. We input a collection of (EC)DSA signatures $(r_i, s_i)$ on message hashes $h_i$. For each signature, we know the $\ell$ least significant bits of the nonce, so the signature nonces $k_i$ satisfy $k_i = a_i + 2^\ell b_i$ for known $a_i$, and $b_i$ unknown but satisfying $|b_i| < B$. Substituting these into Equations 7 and multiplying through by $2^{-\ell} \bmod n$, we get an equivalent instance of the problem in the unknowns $b_i$, with correspondingly adjusted known coefficients and $B' = B$, and solve as above.

(EC)DSA key recovery from middle bits of the nonce
Recovering an ECDSA key from middle bits of the nonce $k$ is slightly more complex than the methods discussed above, because we have two unknown "chunks" of the nonce to recover per signature. Fortunately, we can deal with these by extending the methods to multiple variables per signature. The method we will use here is similar to the multivariate extension in Section 4.2.4, but this case is simpler.
Problem setup.We will use the same elliptic curve group parameters as above.Let p = 0xffffffffffffd21f be a 64-bit prime, and let E : y 2 = x 3 +3 be an elliptic curve over F p .Let g = (1, 2) be our generator point on E, which has order n = 0xfffffffefa23f437.
We have two ECDSA signatures (r 1 , s 1 ) =(1a4adeb76b4a90e0, eba129bb2f97f7cd) on message hash h 1 = 608932fcfaa7785d and (r 2 , s 2 ) =(c4e5bec792193b51, 0202d6eecb712ae3) We know some middle bits of the corresponding nonces.Let be the middle 34 bits of the signature nonce k 1 used for the first signature above.The first and last 15 bits are unknown.Let be the middle 34 bits of the signature nonce k 2 used for the second signature above.
Cast the problem as a system of equations.As above, our two signature nonces k 1 and k 2 satisfy the where Since we know the middle bits of k 1 and k 2 are a 1 and a 2 respectively, we can write where b 1 , c 1 , b 2 , and c 2 are unknown but small, less than some bound K.In our example, we have Substituting and rearranging into Equation 8, we have We wish to find the small solution Construct a lattice.We construct the following lattice basis: If we call the BKZ algorithm on B, we obtain a basis that contains the vector v = (0x6589e5fb1823K, −0x42b0986d3e11K, This corresponds to the linear equation We can do the same for the next three short vectors in the basis, and obtain four linear polynomials in our four unknowns.Solving the system, we obtain the solutions More detailed explanation.The row vectors of the lattice correspond to the weighted coefficient vectors of the linear polynomial f in Equation 9, nx 1 , ny 1 , nx 2 , and ny 2 .Each of these linear polynomials vanishes by construction modulo n when evaluated at the desired solution and thus so does any linear polynomial corresponding to a vector in this lattice.If we can find a lattice vector whose ℓ 1 norm is less than n, then the corresponding linear equation vanishes over the integers when evaluated at the desired solution.Since we have four unknowns, if we can find four sufficiently short lattice vectors corresponding to four linearly independent equations, we can solve for our desired unknowns.The determinant of our example lattice is det B = K 4 n 4 , and the lattice has dimension 5. Thus, ignoring approximation factors and constants, we expect to find a vector of length det B 1/ dim B = (Kn) (4/5) .This is less than n when K 4 < n; in our example this is satisfied because we have chosen a 15-bit K and a 64-bit n.
The determinant bounds guarantee that we will find one short lattice vector, but do not guarantee that we will find four short lattice vectors.For that, we rely on the heuristic that the reduced vectors of a random lattice are close to the same length.

(EC)DSA key recovery from many chunks of nonce bits
The above technique can be extended to an arbitrary number of variables.The extension is called the Extended Hidden Number problem [HR07] and can be used to solve for ECDSA keys when many chunks of signature nonces are known.Each unknown "chunk" of nonce in each signature introduces a new variable, so the resulting lattice will have dimension one larger than the total number of unknowns; if there are m signatures and h unknown chunks of nonce per signature, the lattice will have dimension mh + 1.We expect this technique to find the solution when the parameters are such that the system of equations has a unique solution.If the size of each chunk is K, heuristically this will happen when K mh < n m−1 .This technique has been used in practice in [FWC16] and further explored in [DPP20].
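The heuristic condition $K^{mh} < n^{m-1}$ can be turned into a quick feasibility check. The sketch below is a rough estimate under assumptions of mine: the function name is hypothetical, and bit lengths stand in for $\log_2 K$ and $\log_2 n$, ignoring constants and approximation factors.

```python
def min_signatures(chunk_bits, chunks_per_sig, order_bits, limit=1000):
    """Smallest m with K^(m*h) < n^(m-1), comparing exponents in bits."""
    for m in range(1, limit + 1):
        if m * chunks_per_sig * chunk_bits < (m - 1) * order_bits:
            return m
    return None          # the known bits never outweigh the unknown ones

# Middle-bits example: two 15-bit unknown chunks per nonce, 64-bit n.
print(min_signatures(15, 2, 64))    # 2
# Two 32-bit unknown chunks of a 64-bit nonce leak nothing: no m works.
print(min_signatures(32, 2, 64))    # None
```

For the middle-bits example this reproduces the condition $K^4 < n$ with two signatures.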
6 Key recovery method for the Diffie-Hellman Key Exchange

Finite field and elliptic curve Diffie-Hellman preliminaries
The Diffie-Hellman (DH) key exchange protocol [DH76] allows two parties to create a common secret in a secure manner.We summarize the protocol in the context of finite fields and elliptic curves.
Finite field Diffie-Hellman.Finite-field Diffie-Hellman parameters are specified by a prime p and a group generator g.Common implementation choices are p a safe prime, i.e., q = (p − 1)/2 is prime, in which case g is often equal to 2, 3 or 4, or p is chosen such that p − 1 has a 160, 224, or 256-bit prime factor q and g generates a subgroup of F * p of order q.Key exchange is performed as follows: 1. Alice chooses a random private key a, where 1 ≤ a < q and computes a public key A = g a mod p.
2. Bob chooses a random private key b, where 1 ≤ b < q and computes a public key B = g b mod p.
3. Alice and Bob exchange the public keys.
4. Alice computes s A = B a mod p.

5. Bob computes $s_B = A^b \bmod p$.
Because B a mod p = (g b ) a mod p = (g a ) b mod p = A b mod p, we have s A = s B .The latter is the secret that now Alice and Bob share.
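The finite field exchange can be sketched directly with modular exponentiation. The toy parameters $p$ = 0xfef3 and $g = 3$ (of order $q = (p-1)/2$) are borrowed from the small example used later for the kangaroo algorithm; the variable names are mine, and a 16-bit prime offers no security.

```python
import secrets

p, g = 0xfef3, 3
q = (p - 1) // 2                      # order of the subgroup generated by g

a = secrets.randbelow(q - 1) + 1      # Alice's private key, 1 <= a < q
b = secrets.randbelow(q - 1) + 1      # Bob's private key, 1 <= b < q
A = pow(g, a, p)                      # Alice's public key
B = pow(g, b, p)                      # Bob's public key

s_A = pow(B, a, p)                    # computed by Alice
s_B = pow(A, b, p)                    # computed by Bob
print(s_A == s_B)                     # True
```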
Elliptic Curve Diffie-Hellman.The Elliptic Curve Diffie-Hellman (ECDH) protocol is the elliptic curve counterpart of the Diffie-Hellman key exchange protocol.In ECDH, Alice and Bob agree on an elliptic curve E over a finite field and a generator G of order q.
The protocol proceeds as follows: 1. Alice chooses a random private integer a, where 1 ≤ a < q and computes a public key A = aG.
2. Bob chooses a random private integer b, where 1 ≤ b < q and computes a public key B = bG.
3. Alice and Bob exchange the public keys.

4. Alice computes $s_A = aB$.

5. Bob computes $s_B = bA$.

The shared secret is $s_A = s_B = abG$.

Most significant bits of finite field Diffie-Hellman shared secret
The Hidden Number Problem approach we used in the previous section to recover ECDSA or DSA keys from information about the nonces can also be used to recover a Diffie-Hellman shared secret from most significant bits.
Recovering Diffie-Hellman shared secret from most significant bits of s.
Problem setup.Let p = 0xffffffffffffffffffffffffffffc3a7 be a 128-bit prime used for finite field Diffie-Hellman, and let g = 2 be a generator of the multiplicative group modulo p.
Let $s$ be the Diffie-Hellman shared secret between public keys $A = g^a \bmod p$ = 0x3526bb85185259cd42b61e5532fe60e0 and $B = g^b \bmod p$ = 0x564df0b92ea00ea314eb5a246b01ac9c.
We have learned the value of the first 65 bits of s: let r 1 = 0x3330422f6047011b8000000000000000, so we know that s = r 1 + k 1 where k 1 < K = 2 63 .Let c = 0x56e112dac14f4a4cc02951414aa43a38.We have also learned the most significant 65 bits of the Diffie-Hellman shared secret between AC = g a+c = g a g c mod p and B. Let r 2 = 0x80097373878e37d20000000000000000.
We know that $g^{(a+c)b} = g^{ab} g^{bc} = sB^c \bmod p$. Let $t = B^c$, so $st = r_2 + k_2 \bmod p$ where $k_2 < K$.

Cast the problem as a system of equations. We have two relations
$$s \equiv r_1 + k_1 \bmod p, \qquad st \equiv r_2 + k_2 \bmod p,$$
where $s$, $k_1$, and $k_2$ are unknown, $k_1$ and $k_2$ are small, and $r_1$, $r_2$, and $t$ are known. We can eliminate the variable $s$ to obtain the linear equation
$$k_1 - t^{-1} k_2 + r_1 - t^{-1} r_2 \equiv 0 \bmod p.$$
We now have a linear equation in the same form as the Hidden Number Problem we solved in the previous section.

Construct a lattice. We construct the lattice basis
If we call the LLL algorithm on M , we obtain a basis that contains the vector (−0x2ddb23aa673107bd, −0x216afa75f66a39d5, 0x10000000000000000) This corresponds to our desired solution (k 1 , k 2 , K), although if the Diffie-Hellman assumption is true we cannot verify its correctness.More detailed explanation.This method is due to Boneh and Venkatesan [BV96], and was the original motivation for their formulation of the Hidden Number Problem.The Raccoon attack demonstrated an attack scenario using this technique in the context of TLS [MBA + 21].
This method can be adapted to multiple samples with the same number of bits required as the attacks on ECDSA.Knowing the most significant bits of s is not necessary either; we only need the most significant bits of known multiples t i of s.

Discrete log from contiguous bits of Diffie-Hellman secret exponents
This section addresses the problem of Diffie-Hellman key recovery when the known partial information is part of one or the other of the secret exponents.The technique we apply in this section is Pollard's kangaroo (also known as lambda) algorithm [Pol78].Unlike the techniques of the previous sections, which are generally efficient when the attacker's knowledge of the key is above a certain threshold, and either inefficient or infeasible when the attacker's knowledge of the key is below this threshold, this algorithm runs in exponential time: square root of the size of the interval.Thus it provides a significant benefit over brute force, but in practice is likely limited to 80 bits or fewer of key recovery unless one has access to an unusually large amount of computational resources.The Pollard kangaroo algorithm is a generic discrete logarithm algorithm that is designed to compute discrete logarithms when the discrete logarithm lies in a small known interval.It applies to both elliptic curve and finite field discrete logarithms.We will use finite field discrete logarithms for our examples, but the algorithm is the same in the elliptic curve context.

Known most significant bits of the Diffie-Hellman secret exponent.
Problem setup. Using the same notation for finite fields as in Section 6.1, let $A$ be a Diffie-Hellman public key, $p$ be a prime modulus, and $g$ a generator of a multiplicative group of order $q$ modulo $p$. These values are all public, and thus we assume that they are known. Imagine that we have obtained a consecutive fraction of the most significant bits of the secret exponent $a$, and we wish to recover the unknown bits of $a$ to reconstruct the secret. In other words, let $a = m + r$, where $m = 2^\ell m'$ for some known integers $m'$ and $\ell$, and $0 \le r < 2^\ell$ is unknown. Let $w$ be the width of the interval that $r$ is contained in: here we have $w = 2^\ell$.

For our concrete example, let $p$ = 0xfef3 be a 16-bit prime, and let $g = 3$ be a multiplicative generator of the group of order $q = (p - 1)/2$ = 0x7f79 modulo $p$. We know a Diffie-Hellman public key $A$ = 0xa163 and we are given the most significant bits of the secret exponent $a$, but the 8 least significant bits of $a$ are unknown, corresponding to $m$ = 0x1400, $\ell = 8$, and $r < 2^8$.

Take some pseudorandom walks. We define a deterministic pseudorandom walk along values $s_0, s_1, \ldots, s_i, \ldots$ in our multiplicative group modulo $p$ (and the corresponding exponents $x_0, x_1, \ldots$ with $s_i = g^{x_i} \bmod p$, when known) by choosing a set of random step lengths for the exponents in $[0, \sqrt{w}]$. For our example, we pseudorandomly generated the lengths (1, 3, 7, 10).
This is a small sample pseudorandom walk generated to run our small example computation. Each step in the pseudorandom walk is determined by the representation of the previous value as an integer $0 \le s_i < p$.
We run two random walks. The first random walk, which is called the "tame kangaroo", starts in the middle of the interval of exponents to be searched, at $s_0 = g^{m + \lfloor w/2 \rfloor} \bmod p$. In our example, we have $m = \mathtt{0x1400}$ and $w = 2^8 = 256$, so the tame kangaroo begins at $s_0 = g^{\mathtt{0x1480}} \bmod p = \mathtt{0x9581}$. We take $\sqrt{w}$ steps along this deterministic pseudorandom path, and store the values $s_i$ together with the exponent $x_i$ that is computed at each step so that $g^{x_i} \equiv s_i \bmod p$.
The second random walk is called the "wild kangaroo". It begins at the target $s'_0 = A = \mathtt{0xa163}$ and follows the same rules as above. We do not know the secret exponent $a$, but at every step of the walk, we know that $s'_i = A g^{x'_i} \bmod p = g^{a + x'_i} \bmod p$, where $x'_i$ is the total distance the wild kangaroo has travelled. We take at most $\sqrt{w}$ steps along this deterministic pseudorandom path. If at some point the wild kangaroo's path intersects the tame kangaroo's path, then we are done and can compute the result.

Compute the discrete log. We know that $s_i = s'_j$ for some $s_i$ on the tame kangaroo's path and $s'_j$ on the wild kangaroo's path. Thus we have $g^{x_i} \equiv g^{a + x'_j} \bmod p$, and hence $a \equiv x_i - x'_j \bmod q$. In our example, the kangaroos' paths intersected at $g^{\mathtt{0x1497}}$ and $g^{a + \mathtt{0x36}}$; we can thus compute $a = \mathtt{0x1497} - \mathtt{0x36} = \mathtt{0x1461}$ and verify that $g^{\mathtt{0x1461}} \equiv \mathtt{0xa163} \bmod p$.

More detailed explanation. Pollard gave the original version of this algorithm in [Pol78]. Teske gives an alternative random walk in [Tes00] that should provide an advantage in theory, but in practice, it seems that no noticeable advantage is gained from it.
We expect this algorithm to reach a collision in $O(\sqrt{w})$ steps; the algorithm thus takes $O(\sqrt{w})$ time to compute a discrete log in an interval of width $w$. Thus in principle, the armchair cryptanalyst should be able to compute discrete logarithms within intervals of 64 to 80 bits, and those with more resources should be able to go slightly higher than this.
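To make the tame and wild walks concrete, the following is a minimal Python sketch of the procedure described above. Several details are our own assumptions rather than part of the text: the function name `kangaroo`, the rule that selects a step length from two bits of $s_i$ (on the first attempt this is $s_i \bmod 4$), the choice to let the tame kangaroo take $4\sqrt{w}$ rather than $\sqrt{w}$ steps so that the walks are very likely to meet, and the retry with a different selection rule if they do not. It is intended for toy parameters only and assumes the walks never wrap around the group order.

```python
from math import isqrt


def kangaroo(g, p, A, m, w, steps=(1, 3, 7, 10), tries=8):
    """Search for a in [m, m + w) with g^a = A (mod p); return a or None."""
    jump = [pow(g, d, p) for d in steps]  # g^d for every step length d
    A %= p
    for salt in range(tries):
        def pick(s):
            # Assumed selection rule: two bits of the integer s choose the step.
            # salt = 0 gives s mod len(steps); other salts give other walks.
            return (s >> salt) % len(steps)

        # Tame kangaroo: starts mid-interval with a known exponent x.
        x = m + w // 2
        s = pow(g, x, p)
        tame = {s: x}  # group element -> exponent, for every tame footprint
        for _ in range(4 * isqrt(w)):
            i = pick(s)
            s = s * jump[i] % p
            x += steps[i]
            tame[s] = x

        # Wild kangaroo: starts at A = g^a and tracks only its distance y.
        y, s = 0, A
        while y <= x - m:  # beyond this the wild walk has passed the tame one
            if s in tame:
                a = tame[s] - y  # g^(tame exponent) = g^(a + y)
                if m <= a < m + w and pow(g, a, p) == A:
                    return a
            i = pick(s)
            s = s * jump[i] % p
            y += steps[i]
    return None
```

With the parameters of our example, the call `kangaroo(3, 0xfef3, 0xa163, 0x1400, 256)` searches the interval of exponents from 0x1400 to 0x14ff for the discrete log of $A$. Storing every tame footprint in a dictionary is only reasonable for small $w$.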
In order to scale to intervals of 64 bits and beyond, several changes are necessary. First, one typically uses a random walk with many more subdivisions than the four step lengths of our example: 32 might be a typical value. Second, van Oorschot and Wiener [OW99] show how to parallelize the kangaroo algorithm using the method of distinguished points. The idea behind this method is that storing the entire tame kangaroo walk will require too much memory. Instead, one stores only the subset of values that satisfy some distinguishing property, such as starting with a certain number of zeros. Then the algorithm launches many wild and tame kangaroo walks, storing distinguished points in a central database. The algorithm is finished when a wild and a tame kangaroo land on the same distinguished point.
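The distinguished-point idea can be sketched in a few lines of Python, here simulated serially on the toy parameters of our example rather than run in parallel. The names `kangaroo_dp`, `herd`, `zero_bits`, and `max_rounds` are hypothetical, as are the choices to select the step by $s \bmod 4$ and to call a value distinguished when its top `zero_bits` bits are zero. The round limit keeps all exponents below the group order for these parameters, which is an assumption that a real implementation would replace with arithmetic modulo $q$.

```python
def kangaroo_dp(g, p, A, m, w, steps=(1, 3, 7, 10),
                herd=2, zero_bits=3, max_rounds=2000):
    """Kangaroo search for a in [m, m + w) storing only distinguished points."""
    A %= p
    # A value is "distinguished" if its top zero_bits bits are zero (assumed rule).
    limit = 1 << (p.bit_length() - zero_bits)
    jump = [pow(g, d, p) for d in steps]

    # Each kangaroo is [kind, group element, exponent information]:
    # tame kangaroos know their full exponent, wild ones only their offset from a.
    roos = []
    for i in range(herd):
        x0 = m + w // 2 + i
        roos.append(["tame", pow(g, x0, p), x0])
        roos.append(["wild", A * pow(g, i, p) % p, i])

    table = {}  # central database: distinguished point -> (kind, exponent info)
    for _ in range(max_rounds):
        for roo in roos:
            kind, s, e = roo
            if s < limit:
                other_kind, other_e = table.setdefault(s, (kind, e))
                if other_kind != kind:
                    # A tame and a wild kangaroo met: g^x = g^(a + y).
                    x, y = (e, other_e) if kind == "tame" else (other_e, e)
                    if x >= y and pow(g, x - y, p) == A:
                        return x - y
            i = s % len(steps)
            roo[1] = s * jump[i] % p
            roo[2] = e + steps[i]
    return None
```

Only values below `limit` are ever written to `table`, so memory use shrinks by roughly a factor of $2^{\mathtt{zero\_bits}}$ compared with storing every footprint, at the cost of a few extra steps after the walks merge before the collision is noticed.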
Elliptic curves. This algorithm applies equally well to the elliptic curve discrete logarithm problem. One can gain a $\sqrt{2}$ improvement in the complexity of the algorithm as a by-product of the efficiency of inversion on elliptic curves. Since the points $P$ and $-P$ share the same $x$-coordinate, one can do a pseudorandom walk on equivalence classes for the relation $P \sim \pm P$.

It is straightforward to extend the kangaroo method to solve for unknown most significant bits of the exponent. As before, we have a known $A = g^a \bmod p$ for unknown $a$ that we wish to solve for. In the case of unknown most significant bits, we know an $m$ such that $a = m + 2^{\ell} r$ for some unknown $r$ satisfying $0 \le r < w$. The offset $\ell$ is known. Then we can reduce to the previous problem by running the kangaroo algorithm on the value $A' = A^{2^{-\ell}} = g^{2^{-\ell} m + r} \bmod p$, where $2^{-\ell}$ denotes the inverse of $2^{\ell}$ modulo the group order $q$. The exponent of $A'$ is the known offset $2^{-\ell} m \bmod q$ plus the unknown $r$, which lies in an interval of width $w$.

The case of recovering a Diffie-Hellman secret key in practice with multiple chunks of unknown bits is still an open problem. In theory, finding the secret key in this particular case can be done using a multi-dimensional variant of the discrete log problem. The latter generalizes the discrete logarithm problem in an interval to the case of multiple intervals; see [Rup10, Chapter 6] for further details. In [Rup10], Ruprai analyzes the multi-dimensional discrete log problem for small dimensions. This approach appears to run into boundary issues for multi-dimensional pseudorandom walks when the dimension is greater than five, suggesting that this approach may not extend to the case of recovering many unknown chunks of a Diffie-Hellman exponent.
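Returning to the single-chunk case of unknown most significant bits, the reduction to an interval problem can be sketched in Python as follows. The function names are hypothetical, the code assumes that $g$ has odd order $q$ so that $2^{\ell}$ is invertible modulo $q$, and the final linear scan over the $w$ candidates is only a stand-in for the kangaroo search so that the example is self-contained. Negative exponents in the three-argument `pow` require Python 3.8 or later.

```python
def reduce_unknown_msb(g, p, q, A, m, ell):
    """Map A = g^(m + 2^ell * r) to (A', m') with A' = g^(m' + r) mod p."""
    inv = pow(2, -ell, q)      # inverse of 2^ell modulo the group order q
    A_prime = pow(A, inv, p)   # exponent becomes inv*m + r (mod q)
    m_prime = inv * m % q      # known offset of the new interval
    return A_prime, m_prime


def recover_unknown_msb(g, p, q, A, m, ell, w):
    """Recover a = m + 2^ell * r with 0 <= r < w, or return None."""
    A_prime, m_prime = reduce_unknown_msb(g, p, q, A, m, ell)
    # Stand-in for the kangaroo search: try r = 0, 1, ..., w - 1 in order.
    cur = pow(g, m_prime, p)
    for r in range(w):
        if cur == A_prime:
            return m + (r << ell)
        cur = cur * g % p
    return None
```

For instance, with $p = \mathtt{0xfef3}$, $q = \mathtt{0x7f79}$, $g = 3$, and known low bits $m = \mathtt{0x61}$ with $\ell = 8$, the unknown $r$ is the top part of the exponent, and the search runs over the interval $[m', m' + w)$ instead of over every exponent congruent to $m$ modulo $2^{\ell}$.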

Conclusion
This work surveyed key recovery methods with partial information for popular public-key cryptographic algorithms. We focused in particular on the most widely deployed asymmetric primitives: RSA, (EC)DSA and Diffie-Hellman. The motivation for these algorithms arises from a variety of side-channel attacks.

Figure 1: Illustration of low-exponent RSA message recovery attack setup. The attacker knows the public modulus N, a ciphertext c, and the padding a prepended to the unknown message m before encryption. The attacker wishes to recover m.

Figure 2: Factorization of N = pq given contiguous known most significant bits of p.

Figure 3: Factorization of N = pq given contiguous known least significant bits of p.

Figure 4: Factorization of N = pq given contiguous known bits of p in the middle.

Figure 5: Factorization of N = pq given multiple chunks of p.

Figure 6: Efficient factorization of N = pq given many chunks of p and no information about q is an open problem.

Figure 7: For small exponent e, the most significant bits of d do not allow full key recovery.

Figure 8: Recovering RSA p given contiguous least significant bits of d.

Figure 9: Factorization of N = pq given non-consecutive bits of both p and q.

Figure 12: (EC)DSA key recovery from signatures where most significant bits of the nonces are known.

Figure 13: (EC)DSA key recovery from signatures where least significant bits of the nonces are known.

Figure 14: (EC)DSA key recovery from signatures where middle bits of the nonces are known.

DSA key recovery from signatures where multiple chunks of the nonces are known.

Figure 15: Recovering Diffie-Hellman shared secret with most significant bits of secret exponent.

Figure 16: Recovering Diffie-Hellman shared secret with least significant bits

Figure 17: Recovering Diffie-Hellman shared secret with multiple chunks of unknown bits.

# Section 4.2.4: RSA Key recovery from middle bits of p
def bivariate_coppersmith():
    R.<x, y> = ZZ[]
    #p = random_prime(2^164, 2^163)
    #q = random_prime(2^164, 2^163)
    #N = p*q
    #a = lift(mod(p, 2^148)) - lift(mod(p, 2^16 [...] + a + y*2^148
    monomial_list = (f.monomials()
    function_list = [f^3, f^2*y, f*(y)^2, (y)^3*N, f^2, f*(y), (y)^2*N, f, (y)*N, N]
    M = matrix(10)
    for i in range(10):
        M[i] = [R(function_list[i])(x*X, y*Y).monomial_coefficient(m) for m in monomial_list]
    scaled_monomials = [m(x/X, y/Y) for m in monomial_list]
    def getf(M, i):
        return sum(b*m for b, m in zip(M[i], scaled_monomials))
    A = M.LLL()
    print("N =", hex(N))
    print("a =", hex(a))
    I = ideal(getf(A, 0), getf(A, 1))
    print(I.groebner_basis())
    I = ideal(getf(A, i) for i in range(3))
    print(I.groebner_basis())
    I = ideal(getf(A, i) for i in range(9))
    print(I.groebner_basis())
    #print(getf(A, 0))
    #print(getf(A, 1))

def print_matrix():
    RR.<x, y, T, a, R, N> = ZZ[]

# Section 5.2.1: Lattice attacks
def lattice_attacks():
    p, F, C, n, G, x = ecdsa_params()
    print("n", hex(n))
    print("p", hex(p))
    print("G" [...] bfb40c9c 621ee64e65d1e938'
    r1, s1 = [Integer(f, 16) for f in sig1.split()]
    sig2 = '3ea8720afa6d03c2 16fc6aa65bf241ea'
    r2, s2 = [Integer(f, 16) for f in sig2.split()]
    print(hex(r2))
    t = Integer(-inverse_mod(s1, n)*s2*r1*inverse_mod(r2, n))
    u = Integer(inverse_mod(s1, n) [...] 1)
    return "vector (k1, k2)", (hex(k1), hex(k2))

# Section 5.2.3: (EC)DSA key recovery from middle bits of the nonce k
def ecdsa_middle_bits():
    p, F, C, n, G, x = ecdsa_params()
    h1 = 0x608932fcfaa7785d
    h2 = 0xe5f8eca48ac2a45c
    k1 = 0x734450e2fd5da41c
    sig1 = '1a4adeb76b4a90e0 eba129bb2f97f7cd'
    r1, s1 = [Integer(f, 16) for f in sig1.split()]
    k2 = 0x4de972930ab4a534
    sig2 = 'c4e5bec792193b51 0202d6eecb712ae3'
    r2, s2 = [Integer(f, 16) for f in sig2.split()]
    a1 = lift(mod(k1, 2^(64-15))) - lift(mod(k1, 2^15))
    a2 = lift(mod(k2, 2^(64-15))) - lift(mod(k2, 2^15))
    print("a1 =", hex(a1))
    print("a2 =", hex(a2))
    b1 = lift(mod(k1, 2^15))
    b2 = lift(mod(k2, 2^15))
    c1 = 2^(-64+15)*(k1 - lift(mod(k1, 2^(64-15))))
    c2 = 2^(-64+15)*(k2 - lift(mod(k2, 2^(64-15))))
    t = Integer(r1*inverse_mod(s1, n)*inverse_mod(r2, n)*s2)
    u = Integer(-inverse_mod(s1, n)*h1 + r1*inverse_mod(s1, n)*inverse_mod(r2, n)*h2)
    print(mod(b1 + c1*2^(64-15) - t*b2 - t*c2*2^(64-15) + a1 - t*a2 + u, n))
    M = matrix(5)
    X = 2^15
    M[0] = [X, X*2^(64-15), -X*t, -X*t*2^(64-15), a1 - t*a2 + u]
    M[1,1] = n*X
    M[2,2] = n*X
    M[3,3] = n*X
    M[4,4] = n
    A = M.LLL()
    R.<x1, y1, x2, y2> = ZZ[]
    def getf(M, i):
        return M[i,0]/X*x1 + M[i,1]/X*y1 + M[i,2]/X*x2 + M[i,3]/X*y2 + M[i,4]
    I = ideal(getf(A, i) for i in range(4))
    return I.groebner_basis()

# Section 6.2: Most significant bits of finite field Diffie-Hellman shared secret
def dh_msb( [...] ift(mod(g, p)^((d+r)*c))
    a1 = DH1 - lift(mod(DH1, 2^63))
    a2 = DH2 - lift(mod(DH2, 2^63))
    b1 = DH1 - a1
    b2 = DH2 - a2
    t = lift(mod(g, p)^(c*r))
    M = matrix(3)
    M[0,0] = p
    M[1,0] = inverse_mod(t, p)
    M[1,1] = 1
    M[2,0] = a1 - inverse_mod(t, p)*a2
    M[2,2] = 2^64
    N = M.LLL()
    print("a1 =", hex(a1))
    print("a2 =", hex(a2))
    return "solution (k1, k2) is given by", (hex(b1), hex(b2))
    #print(N)
    #print(mod(b1 - inverse_mod(t, p)*b2 + a1 - inverse_mod(t, p)*a2, p))

# Other code: setting parameters and other
def gen_curve():
    p = 0xffffffffffffffc5
    done = False
    i = 1
    while not done:
        print(i)
        F = FiniteField(p)
        C = EllipticCurve([F(0), F(3)])
        if is_prime(C.cardinality()):
            done = True
            return p
        else:
            p = previous_prime(p)
            i += 1

def ecdsa_params():
    p = 0xffffffffffffd21f
    F = FiniteField(p)
    C = EllipticCurve([F(0), F(3)])
    n = 0xfffffffefa23f437
    G = C.lift_x(1)  # (1, 2)
    x = 0x34aad140ec2c3a3
    return p, F, C, n, G, x

def dsa_params():
    g = 0x17dfdbf2bbbae7d6c052c2fdc5d3288d
    p = 0x89524bfca958c9165a087cc4f889a08f
    q = 0xffffffffffffffc5
    y = 0x2410f15634222d3300eabeb44226cea8
    x = 0x38dbefc062cd4cf3

def dh_params(): [...]

[...] safe_prime(l=128):
    p = previous_prime(2^l)
    done = False
    i = 0
    while not done:
        print(i)
        if is_prime(Integer((p-1)/2)):
            done = True
            return p
        else:
            p = previous_prime(p)
            i += 1

def btoi(b):
    return int.from_bytes(b, "big")

def itob(i, baselen):
    return int.to_bytes(int(i), length=baselen, byteorder="big")

def sign(h, klen=32, return_k=False):
    p, F, C, n, G, x = ecdsa_params()
    d = x
    hi = btoi(h)
    k = ZZ.random_element(2**klen)
    r = Integer((G*k).xy()[0])
    s = lift(inverse_mod(k, n)*mod(hi + d*r, n))
    sig = bytes.hex(itob(r, 8)) + " " + bytes.hex(itob(s, 8))
    if return_k:
        return k, sig
    else:
        return sig

def gen_dsa_prime():
    p = 2*q*random_prime(2^64) + 1
    i = 1
    while not is_prime(p):
        p = 2*q*random_prime(2^64) + 1
        i += 1
    print(i)
    return p

def gen_sig():
    h = itob(ZZ.random_element(2^64), 64/8)
    return bytes.hex(h), sign(h)

[...]
    i = len(pi) + 1
    candidate_list = [(bp+pi, bq+qi) for (bp, bq) in [('0','0'), ('0','1'), ('1','0'), ('1','1')]]
    for new_pi, new_qi in candidate_list:
        if len(new_pi) <= len(p) and p[-i] != '?' and p[-i] != new_pi[-i]:
[...] 1')]]
    for new_dpi, new_dqi in dpdq_candidates:
        if len(new_dpi) <= len(dp) and dp[-i] != '?' and dp[-i] != new_dpi[-i]:
            continue
        if len(new_dqi) <= len(dq) and dq[-i] != '?' and dq[-i] != new_dqi[-i]:
            continue
        for new_pi, new_qi in pq_candidates:

Table 1: Visual table of contents for key recovery methods for public-key cryptosystems.