New proposals for scalable key rank estimation methods have appeared recently, in particular the sampling-based approach MCRank. The idea is that one can consistently estimate the key rank by sampling only a small portion of the key space as a “proxy”, leading to an approach that is both accurate and scalable, at least in comparison with another approach based on histograms. We show that the (earlier) GEEA algorithm is in fact a sampling-based algorithm, and provide an in-depth comparison between GEEA (adapted to produce rank estimates rather than guessing entropy estimates), GM bounds, MCRank and the currently most performant counting-based rank estimation, as implemented in the Labynkyr library. We find that although MCRank lives up to the promised accuracy and scalability for probability-based distinguishers, it fails to handle cases with unusual distinguisher distributions.
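To make the sampling idea concrete, the sketch below shows one way a sampling-based rank estimator can be set up, assuming an additive (log-probability) distinguisher split into independent subkey chunks. The function name, the importance-sampling weights and the overall structure are illustrative assumptions, not the MCRank implementation from the paper.

```python
import numpy as np

def estimate_rank(log_probs, correct_key, n_samples=10_000, rng=None):
    """Illustrative sampling-based key rank estimator (hypothetical sketch).

    log_probs: list of m numpy arrays, one per subkey chunk, holding the
               log-probabilities the distinguisher assigns to each candidate.
    correct_key: tuple of the m correct subkey values.
    Returns an estimate of the number of full keys whose combined score is
    at least as high as the correct key's score.
    """
    rng = rng or np.random.default_rng()
    # Normalise each chunk's scores into a sampling distribution.
    dists = [np.exp(lp - lp.max()) for lp in log_probs]
    dists = [d / d.sum() for d in dists]
    # Combined score of the correct key (additive in the log domain).
    threshold = sum(lp[k] for lp, k in zip(log_probs, correct_key))

    total = 0.0
    for _ in range(n_samples):
        # Draw one candidate per chunk from its score distribution.
        idx = [rng.choice(len(d), p=d) for d in dists]
        score = sum(lp[i] for lp, i in zip(log_probs, idx))
        if score >= threshold:
            # Importance-sampling weight: 1 / probability of drawing this key.
            q = np.prod([d[i] for d, i in zip(dists, idx)])
            total += 1.0 / q
    return total / n_samples
```

Note that an estimator of this kind has to turn the distinguisher's scores into a sampling distribution, which hints at why distinguishers that do not behave like probabilities can be problematic.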
Furthermore, we put forward a novel proposal for a highly scalable key rank estimation method by introducing the notion of an “attacker budget”. Our proposal is based on the idea that, in particular for very long keys, the exact key rank matters less than knowing whether the key lies within a certain bound. Our “budget approach” is therefore based on efficiently checking whether the result of an attack is such that the attacker's budget suffices for successful enumeration. The budget approach scales linearly with the key size and thus enables security estimations even for post-quantum key lengths.
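The abstract does not spell out the budget algorithm, but a very crude, one-sided check in the same spirit can be sketched as follows, again assuming an additive per-chunk distinguisher: the product of the per-chunk counts of candidates scoring at least as well as the correct subkey values lower-bounds the full key rank, so if even that product exceeds the budget, enumeration within the budget provably cannot succeed. The function name and interface are hypothetical and this is not the paper's method.

```python
def budget_certainly_insufficient(chunk_scores, correct_key, budget):
    """Crude one-sided budget check (hypothetical sketch, not the paper's
    algorithm). Assumes an additive distinguisher: the full-key score is the
    sum of per-chunk scores, with higher meaning more likely.

    Every full key whose chunks are each either strictly better scored than,
    or identical to, the correct chunk value scores at least as well as the
    correct key, so the product of those per-chunk counts (minus one for the
    correct key itself) lower-bounds the key rank.
    Returns True if the budget is certainly insufficient, False if the check
    is inconclusive.
    """
    lower_bound = 1
    for scores, k in zip(chunk_scores, correct_key):
        # 1 for the correct value plus all strictly better-scored candidates.
        better_or_equal = 1 + sum(1 for s in scores if s > scores[k])
        lower_bound *= better_or_equal
        if lower_bound - 1 > budget:  # early exit; cost linear in the number of chunks
            return True
    return False
```

The early exit illustrates how a yes/no budget-style answer can be obtained with work linear in the number of key chunks, even when the exact rank would be far more expensive to estimate.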
We present a solution to the open problem of designing a linear-time, unbiased and timing attack-resistant shuffling algorithm for fixed-weight sampling. Although it can be implemented without timing leakage of secret data on any architecture, we illustrate it with ARMv7-M and ARMv8-A implementations; for the latter, we take advantage of architectural features such as NEON and conditional instructions, which are representative of features available on architectures targeting similar systems, such as Intel. Our proposed algorithm improves asymptotically upon the current approach based on constant-time sorting networks ($O(n)$ versus $O(n \log^2 n)$), and an implementation of the new algorithm applied to NTRU is also faster in practice, by a factor of up to $6.91\ (591\%)$ on ARMv8-A cores and $12.89\ (1189\%)$ on the Cortex-M4; it also requires fewer uniform random bits. This translates into performance improvements for NTRU encapsulation, compared to state-of-the-art implementations, of up to 50% on ARMv8-A cores and 72% on the Cortex-M4, and into small improvements to key generation (up to 2.7% on ARMv8-A cores and 6.1% on the Cortex-M4), with negligible impact on code size and a slight improvement in RAM usage for the Cortex-M4.
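For context, the following Python sketch shows what fixed-weight sampling computes and the sorting-based baseline the abstract compares against: a template vector with exactly w nonzero coefficients is shuffled by sorting on random tags. The function name, the alternating ±1 coefficients and the 32-bit tags are illustrative assumptions; real constant-time implementations perform the sort with a sorting network (hence the $O(n \log^2 n)$ cost), which the paper's linear-time shuffle replaces, and Python's sort is of course not timing attack-resistant.

```python
import secrets

def fixed_weight_sample_baseline(n, w, nonzero=(1, -1)):
    """Illustrative sorting-based baseline for fixed-weight sampling
    (hypothetical sketch, not the paper's linear-time algorithm).

    Produces a length-n vector with exactly w nonzero coefficients in
    uniformly random positions by permuting a fixed-weight template.
    """
    assert w <= n
    # Template: w nonzero coefficients (alternating +1/-1 here), then zeros.
    template = [nonzero[i % len(nonzero)] for i in range(w)] + [0] * (n - w)
    # Tag each coefficient with a fresh random value and sort by the tag;
    # with high probability the tags are distinct, giving a uniform shuffle.
    tagged = sorted((secrets.randbits(32), c) for c in template)
    return [c for _, c in tagged]
```

For instance, a call shaped roughly like fixed_weight_sample_baseline(509, 254) corresponds to the vector length and weight magnitudes that NTRU-style parameter sets require.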