Communications in Cryptology IACR CiC


Dates are inconsistent
7 results sorted by publication date
Editors in chief
Call for papers: IACR Communications in Cryptology Submit a paper Communications in Cryptology is a journal for original research papers which welcomes submissions on any topic in cryptology. This covers all research topics in cryptography and cryptanalysis, including but not limited to foundational theory and mathematics the design, proposal, and analysis of cryptographic primitives a...
Aein Rezaei Shahmirzadi, Michael Hutter
Published 2024-10-07 PDFPDF

Masking schemes are key in thwarting side-channel attacks due to their robust theoretical foundation. Transitioning from Boolean to arithmetic (B2A) masking is a necessary step in various cryptography schemes, including hash functions, ARX-based ciphers, and lattice-based cryptography. While there exists a significant body of research focusing on B2A software implementations, studies pertaining to hardware implementations are quite limited, with the majority dedicated solely to creating efficient Boolean masked adders. In this paper, we present first- and second-order secure hardware implementations to perform B2A mask conversion efficiently without using masked adder structures. We first introduce a first-order secure low-latency gadget that executes a B2A2k in a single cycle. Furthermore, we propose a second-order secure B2A2k gadget that has a latency of only 4 clock cycles. Both gadgets are independent of the input word size k. We then show how these new primitives lead to improved B2Aq hardware implementations that perform a B2A mask conversion of integers modulo an arbitrary number. Our results show that our new gadgets outperform comparable solutions by more than a magnitude in terms of resource requirements and are at least 3 times faster in terms of latency and throughput. All gadgets have been formally verified and proven secure in the glitch-robust PINI security model. We additionally confirm the security of our gadgets on an FPGA platform using practical TVLA tests.

Ruize Wang, Kalle Ngo, Joel Gärtner, Elena Dubrova
Published 2024-10-07 PDFPDF

Most of the previous attacks on Dilithium exploit side-channel information which is leaked during the computation of the polynomial multiplication cs1, where s1 is a small-norm secret and c is a verifier's challenge. In this paper, we present a new attack utilizing leakage during secret key unpacking in the signing algorithm. The unpacking is also used in other post-quantum cryptographic algorithms, including Kyber, because inputs and outputs of their API functions are byte arrays. Exploiting leakage during unpacking is more challenging than exploiting leakage during the computation of cs1 since c varies for each signing, while the unpacked secret key remains constant. Therefore, post-processing is required in the latter case to recover a full secret key. We present two variants of post-processing. In the first one, a half of the coefficients of the secret s1 and the error s2 is recovered by profiled deep learning-assisted power analysis and the rest is derived by solving linear equations based on t = As1 + s2, where A and t are parts of the public key. This case assumes knowledge of the least significant bits of t, t0. The second variant uses lattice reduction to derive s1 without the knowledge of t0. However, it needs a larger portion of s1 to be recovered by power analysis. We evaluate both variants on an ARM Cortex-M4 implementation of Dilithium-2. The experiments show that the attack assuming the knowledge of t0 can recover s1 from a single trace captured from a different from profiling device with a non-negligible probability.

Soichiro Kobayashi, Rei Ueno, Yosuke Todo, Naofumi Homma
Published 2024-10-07 PDFPDF

This paper presents a new side-channel attack (SCA) on unrolled implementations of stream ciphers, with a particular focus on Trivium. Most conventional SCAs predominantly concentrate on leakage of some first rounds prior to the sufficient diffusion of the secret key and initial vector (IV). However, recently, unrolled hardware implementation has become common and practical, which achieves higher throughput and energy efficiency compared to a round-based hardware. The applicability of conventional SCAs to such unrolled hardware is unclear because the leakage of the first rounds from unrolled hardware is hardly observed. In this paper, focusing on Trivium, we propose a novel SCA on unrolled stream cipher hardware, which can exploit leakage of rounds latter than 80, while existing SCAs exploited intermediate values earlier than 80 rounds. We first analyze the algebraic equations representing the intermediate values of these rounds and present the recursive restricted linear decomposition (RRLD) strategy. This approach uses correlation power analysis (CPA) to estimate the intermediate values of latter rounds. Furthermore, we present a chosen-IV strategy for a successful key recovery through linearization. We experimentally demonstrate that the proposed SCA achieves the key recovery of a 288-round unrolled Trivium hardware implementation using 360,000 traces. Finally, we evaluate the performance of unrolled Trivium hardware implementations to clarify the trade-off between performance and SCA (in)security. The proposed SCA requires 34.5 M traces for a key recovery of 384-round unrolled Trivium implementation and is not applicable to 576-round unrolled hardware.

Dinal Kamel, François-Xavier Standaert, Olivier Bronchain
Published 2024-10-07 PDFPDF

Raccoon is a lattice-based scheme submitted to the NIST 2022 call for additional post-quantum signatures. One of its main selling points is that its design is intrinsically easy to mask against side-channel attacks. So far, Raccoon's physical security guarantees were only stated in the abstract probing model. In this paper, we discuss how these probing security results translate into guarantees in more realistic leakage models. We also highlight that this translation differs from what is usually observed (e.g., in symmetric cryptography), due to the algebraic structure of Raccoon's operations. For this purpose, we perform an in-depth information theoretic evaluation of Raccoon's most innovative part, namely the AddRepNoise function which allows generating its arithmetic shares on-the-fly. Our results are twofold. First, we show that the resulting shares do not enforce a statistical security order (i.e., the need for the side-channel adversary to estimate higher-order moments of the leakage distribution), as usually expected when masking. Second, we observe that the first-order leakage on the (large) random coefficients manipulated by Raccoon cannot be efficiently turned into leakage on the (smaller) coefficients of its long-term secret. Concretely, our information theoretic evaluations for relevant leakage functions also suggest that Raccoon's masked implementations can ensure high security with less shares than suggested by a conservative analysis in the probing model.

Lichao Wu, Azade Rezaeezade, Amir Ali-pour, Guilherme Perin, Stjepan Picek
Published 2024-10-07 PDFPDF

Profiling side-channel analysis has gained widespread acceptance in both academic and industrial realms due to its robust capacity to unveil protected secrets, even in the presence of countermeasures. To harness this capability, an adversary must access a clone of the target device to acquire profiling measurements, labeling them with leakage models. The challenge of finding an effective leakage model, especially for a protected dataset with a low signal-to-noise ratio or weak correlation between actual leakages and labels, often necessitates an intuitive engineering approach, as otherwise, the attack will not perform well.

In this paper, we introduce a deep learning approach with a flexible leakage model, referred to as the multi-bit model. Instead of trying to learn a pre-determined representation of the target intermediate data, we utilize the concept of the stochastic model to decompose the label into bits. Then, the deep learning model is used to classify each bit independently. This versatile multi-bit model can adjust to existing leakage models like the Hamming weight and Most Significant Bit while also possessing the flexibility to adapt to complex leakage scenarios. To further improve the attack efficiency, we extend the multi-bit model to profile all 16 subkey bytes simultaneously, which requires negligible computational effort. The experimental results show that the proposed methods can efficiently break all key bytes across four considered datasets while the conventional leakage models fail. Our work signifies a significant step forward in deep learning-based side-channel attacks, showcasing a high degree of flexibility and efficiency with the proposed leakage model.

Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, Amir Moradi, François-Xavier Standaert
Published 2024-07-08 PDFPDF

Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency vs. security tradeoff of masked implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform most competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a higher rate per cycle for the same or lower cost as Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating $n$ fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as $20n$ to $30n$ ASIC gate equivalents (GE) or $3n$ to $4n$ FPGA look-up tables (LUTs), where $n$ is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable. This provides further motivation to potentially move low randomness usage from a primary to a secondary design goal in hardware masking research.