Communications in Cryptology IACR CiC

Volume 1, Issue 2

Published 2024-07-08
A Survey of Polynomial Multiplications for Lattice-Based Cryptosystems

Vincent Hwang

SoK paper PDFPDF

We survey various mathematical tools used in software works multiplying polynomials in \[ \frac{\mathbb{Z}_q[x]}{\left\langle {x^n - \alpha x - \beta} \right\rangle}. \] In particular, we survey implementation works targeting polynomial multiplications in lattice-based cryptosystems Dilithium, Kyber, NTRU, NTRU Prime, and Saber with instruction set architectures/extensions Armv7-M, Armv7E-M, Armv8-A, and AVX2.

There are three emphases in this paper: (i) modular arithmetic, (ii) homomorphisms, and (iii) vectorization. For modular arithmetic, we survey Montgomery, Barrett, and Plantard multiplications. For homomorphisms, we survey (a) various homomorphisms such as Cooley–Tukey FFT, Good–Thomas FFT, Bruun's FFT, Rader's FFT, Karatsuba, and Toom–Cook; (b) various algebraic techniques for adjoining nice properties to the coefficient rings, including localization, Schönhage's FFT, Nussbaumer's FFT, and coefficient ring switching; and (c) various algebraic techniques related to the polynomial moduli, including twisting, composed multiplication, evaluation at $\infty$, truncation, incomplete transformation, striding, and Toeplitz matrix-vector product. For vectorization, we survey the relations between homomorphisms and vector arithmetic.

We then go through several case studies: We compare the implementations of modular multiplications used in Dilithium and Kyber, explain how the matrix-to-vector structure was exploited in Saber, and review the design choices of transformations for NTRU and NTRU Prime with vectorization. Finally, we outline several interesting implementation projects.

An Explicit High-Moment Forking Lemma and its Applications to the Concrete Security of Multi-Signatures

Gil Segev, Liat Shapira

Research paper PDFPDF

In this work we first present an explicit forking lemma that distills the information-theoretic essence of the high-moment technique introduced by Rotem and Segev (CRYPTO '21), who analyzed the security of identification protocols and Fiat-Shamir signature schemes. Whereas the technique of Rotem and Segev was particularly geared towards two specific cryptographic primitives, we present a stand-alone probabilistic lower bound, which does not involve any underlying primitive or idealized model. The key difference between our lemma and previous ones is that instead of focusing on the tradeoff between the worst-case or expected running time of the resulting forking algorithm and its success probability, we focus on the tradeoff between higher moments of its running time and its success probability.

Equipped with our lemma, we then establish concrete security bounds for the BN and BLS multi-signature schemes that are significantly tighter than the concrete security bounds established by Bellare and Neven (CCS '06) and Boneh, Drijvers and Neven (ASIACRYPT '18), respectively. Our analysis does not limit adversaries to any idealized algebraic model, such as the algebraic group model in which all algorithms are assumed to provide an algebraic justification for each group element they produce. Our bounds are derived in the random-oracle model based on the standard-model second-moment hardness of the discrete logarithm problem (for the BN scheme) and the computational co-Diffie-Hellman problem (for the BLS scheme). Such second-moment assumptions, asking that the success probability of any algorithm in solving the underlying computational problems is dominated by the second moment of the algorithm's running time, are particularly plausible in any group where no better-than-generic algorithms are currently known.

Decentralized Multi-Client Functional Encryption with Strong Security

Ky Nguyen, David Pointcheval, Robert Schädlich

Research paper PDFPDF

Decentralized Multi-Client Functional Encryption (DMCFE) extends the basic functional encryption to multiple clients that do not trust each other. They can independently encrypt the multiple plaintext-inputs to be given for evaluation to the function embedded in the functional decryption key, defined by multiple parameter-inputs. And they keep control on these functions as they all have to contribute to the generation of the functional decryption keys. Tags can be used in the ciphertexts and the keys to specify which inputs can be combined together. As any encryption scheme, DMCFE provides privacy of the plaintexts. But the functions associated to the functional decryption keys might be sensitive too (e.g. a model in machine learning). The function-hiding property has thus been introduced to additionally protect the function evaluated during the decryption process.

In this paper, we provide new proof techniques to analyze a new concrete construction of function-hiding DMCFE for inner products, with strong security guarantees: the adversary can adaptively query multiple challenge ciphertexts and multiple challenge keys, with unbounded repetitions of the same tags in the ciphertext-queries and a fixed polynomially-large number of repetitions of the same tags in the key-queries. Previous constructions were proven secure in the selective setting only.

Randomness Generation for Secure Hardware Masking – Unrolled Trivium to the Rescue

Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, Amir Moradi, François-Xavier Standaert

Research paper PDFPDF

Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency vs. security tradeoff of masked implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform most competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a higher rate per cycle for the same or lower cost as Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating $n$ fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as $20n$ to $30n$ ASIC gate equivalents (GE) or $3n$ to $4n$ FPGA look-up tables (LUTs), where $n$ is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable. This provides further motivation to potentially move low randomness usage from a primary to a secondary design goal in hardware masking research.

Quantum-Resistance Meets White-Box Cryptography: How to Implement Hash-Based Signatures against White-Box Attackers?

Kemal Bicakci, Kemal Ulker, Yusuf Uzunay, Halis Taha Şahin, Muhammed Said Gündoğan

Research paper PDFPDF

The adversary model of white-box cryptography includes an extreme case where the adversary, sitting at the endpoint, has full access to a cryptographic scheme. Motivating by the fact that most existing white-box implementations focus on symmetric encryption, we present implementations for hash-based signatures so that the security against white-box attackers (who have read-only access to data with a size bounded by a space-hardness parameter M) depends on the availability of a white-box secure cipher (in addition to a general one-way function). We also introduce parameters and key-generation complexity results for white-box secure instantiation of stateless hash-based signature scheme SPHINCS+, one of the NIST selections for quantum-resistant digital signature algorithms, and its older version SPHINCS. We also present a hash tree-based solution for one-time passwords secure in a white-box attacker context. We implement the proposed solutions and share our performance results.

A Comprehensive Survey on Post-Quantum TLS

Nouri Alnahawi, Johannes Müller, Jan Oupický, Alexander Wiesmaier

SoK paper PDFPDF

Transport Layer Security (TLS) is the backbone security protocol of the Internet. As this fundamental protocol is at risk from future quantum attackers, many proposals have been made to protect TLS against this threat by implementing post-quantum cryptography (PQC). The widespread interest in post-quantum TLS has given rise to a large number of solutions over the last decade. These proposals differ in many aspects, including the security properties they seek to protect, the efficiency and trustworthiness of their post-quantum building blocks, and the application scenarios they consider, to name a few.

Based on an extensive literature review, we classify existing solutions according to their general approaches, analyze their individual contributions, and present the results of our extensive performance experiments. Based on these insights, we identify the most reasonable candidates for post-quantum TLS, which research problems in this area have already been solved, and which are still open. Overall, our work provides a well-founded reference point for researching post-quantum TLS and preparing TLS in practice for the quantum age.

A Central Limit Approach for Ring-LWE Noise Analysis

Sean Murphy, Rachel Player

Research paper PDFPDF

This paper develops Central Limit arguments for analysing the noise in ciphertexts in two homomorphic encryption schemes that are based on Ring-LWE. The first main contribution of this paper is to present and evaluate an average-case noise analysis for the BGV scheme. Our approach relies on the recent work of Costache et al.(SAC 2023) that gives the approximation of a polynomial product as a multivariate Normal distribution. We show how this result can be applied in the BGV context and evaluate its efficacy. We find this average-case approach can much more closely model the noise growth in BGV implementations than prior approaches, but in some cases it can also underestimate the practical noise growth. Our second main contribution is to develop a Central Limit framework to analyse the noise growth in the homomorphic Ring-LWE cryptosystem of Lyubashevsky, Peikert and Regev (Eurocrypt 2013, full version). Our approach is very general: apart from finite variance, no assumption on the distribution of the noise is required (in particular, the noise need not be subgaussian). We show that our approach leads to tighter bounds for the probability of decryption failure than those of prior work.

Compact Key Function Secret Sharing with Non-linear Decoder

Chandan Kumar, Sikhar Patranabis, Debdeep Mukhopadhyay

Research paper PDFPDF
Provably Secure and Area-Efficient Modular Addition over Boolean Shares

Guilhèm Assael, Philippe Elbaz-Vincent

Research paper PDFPDF

Several cryptographic schemes, including lattice-based cryptography and the SHA-2 family of hash functions, involve both integer arithmetic and Boolean logic. Each of these classes of operations, considered separately, can be efficiently implemented under the masking countermeasure when resistance against vertical attacks is required. However, protecting interleaved arithmetic and logic operations is much more expensive, requiring either additional masking conversions to switch between masking schemes, or implementing arithmetic functions as nonlinear operations over a Boolean masking. Both solutions can be achieved by providing masked arithmetic addition over Boolean shares, which is an operation with relatively long latency and usually high area utilization in hardware. A further complication arises when the arithmetic performed by the scheme is over a prime modulus, which is common in lattice-based cryptography. In this work, we propose a first-order masked implementation of arithmetic addition over Boolean shares occupying a very small area, while still having reasonable latency. Our proposal is specifically tuned for efficient addition and subtraction modulo an arbitrary integer, but it can also be configured at runtime for power-of-two arithmetic. To the best of our knowledge, we propose the first such construction whose security is formally proven in the glitch+transition-robust probing model.

Communication-Efficient Multi-Party Computation for RMS Programs

Thomas Attema, Aron van Baarsen, Stefan van den Berg, Pedro Capitão, Vincent Dunning, Lisa Kohl

Research paper PDFPDF

Despite much progress, general-purpose secure multi-party computation (MPC) with active security may still be prohibitively expensive in settings with large input datasets. This particularly applies to the secure evaluation of graph algorithms, where each party holds a subset of a large graph. Recently, Araki et al. (ACM CCS '21) showed that dedicated solutions may provide significantly better efficiency if the input graph is sparse. In particular, they provide an efficient protocol for the secure evaluation of “message passing” algorithms, such as the PageRank algorithm. Their protocol's computation and communication complexity are both $\tilde{O}(M\cdot B)$ instead of the $O(M^2)$ complexity achieved by general-purpose MPC protocols, where $M$ denotes the number of nodes and $B$ the (average) number of incoming edges per node. On the downside, their approach achieves only a relatively weak security notion; $1$-out-of-$3$ malicious security with selective abort.

In this work, we show that PageRank can instead be captured efficiently as a restricted multiplication straight-line (RMS) program, and present a new actively secure MPC protocol tailored to handle RMS programs. In particular, we show that the local knowledge of the participants can be leveraged towards the first maliciously-secure protocol with communication complexity linear in $M$, independently of the sparsity of the graph. We present two variants of our protocol. In our communication-optimized protocol, going from semi-honest to malicious security only introduces a small communication overhead, but results in quadratic computation complexity $O(M^2)$. In our balanced protocol, we still achieve a linear communication complexity $O(M)$, although with worse constants, but a significantly better computational complexity scaling with $O(M\cdot B)$. Additionally, our protocols achieve security with identifiable abort and can tolerate up to $n-1$ corruptions.

Optimizing and Implementing Fischlin's Transform for UC-Secure Zero Knowledge

Yi-Hsiu Chen, Yehuda Lindell

Research paper PDFPDF

Fischlin's transform (CRYPTO 2005) is an alternative to the Fiat-Shamir transform that enables straight-line extraction when proving knowledge. In this work we focus on the problem of using the Fischlin transform to construct UC-secure zero-knowledge from Sigma protocols, since UC security – that guarantees security under general concurrent composition – requires straight-line (non-rewinding) simulators. We provide a slightly simplified transform that is much easier to understand, and present algorithmic and implementation optimizations that significantly improve the running time. It appears that the main obstacles to the use of Fischlin in practice is its computational cost and implementation complexity (with multiple parameters that need to be chosen). We provide clear guidelines and a simple methodology for choosing parameters, and show that with our optimizations the running-time is far lower than expected. For just one example, on a 2023 MacBook, the cost of proving the knowledge of discrete log with Fischlin is only 0.41ms (on a single core). This is 15 times slower than plain Fiat-Shamir on the same machine, which is a significant multiple but objectively not significant in many applications. We also extend the transform so that it can be applied to batch proofs, and show how this can be much more efficient than individually proving each statement. We hope that this paper will both encourage and help practitioners implement the Fischlin transform where relevant.

PACIFIC Privacy-preserving automated contact tracing featuring integrity against cloning

Scott Griffy, Anna Lysyanskaya

Research paper PDFPDF

To be useful and widely accepted, automated contact tracing schemes (also called exposure notification) need to solve two seemingly contradictory problems at the same time: they need to protect the anonymity of honest users while also preventing malicious users from creating false alarms. In this paper, we provide, for the first time, an exposure notification construction that guarantees the same levels of privacy and integrity as existing schemes but with a fully malicious database (notably similar to Auerbach et al. CT-RSA 2021) without special restrictions on the adversary. We construct a new definition so that we can formally prove our construction secure. Our definition ensures the following integrity guarantees: no malicious user can cause exposure warnings in two locations at the same time and that any uploaded exposure notifications must be recent and not previously uploaded. Our construction is efficient, requiring only a single message to be broadcast at contact time no matter how many recipients are nearby. To notify contacts of potential infection, an infected user uploads data with size linear in the number of notifications, similar to other schemes. Linear upload complexity is not trivial with our assumptions and guarantees (a naive scheme would be quadratic). This linear complexity is achieved with a new primitive: zero knowledge subset proofs over commitments which is used by our "no cloning" proof protocol. We also introduce another new primitive: set commitments on equivalence classes, which makes each step of our construction more efficient. Both of these new primitives are of independent interest.

Simple Watermarking Pseudorandom Functions from Extractable Pseudorandom Generators

Estuardo Alpirez Bock, Chris Brzuska, Russell W. F. Lai

Research paper PDFPDF

Watermarking pseudorandom functions (PRF) allow an authority to embed an unforgeable and unremovable watermark into a PRF while preserving its functionality. In this work, we extend the work of Kim and Wu [Crypto'19] who gave a simple two-step construction of watermarking PRFs from a class of extractable PRFs satisfying several other properties – first construct a mark-embedding scheme, and then upgrade it to a message-embedding scheme.

While the message-embedding scheme of Kim and Wu is based on complex homomorphic evaluation techniques, we observe that much simpler constructions can be obtained and from a wider range of assumptions, if we forego the strong requirement of security against the watermarking authority. Concretely, we introduce a new notion called extractable PRGs (xPRGs), from which extractable PRFs (without security against authorities) suitable for the Kim-Wu transformations can be simply obtained via the Goldreich-Goldwasser-Micali (GGM) construction. We provide simple constructions of xPRGs from a wide range of assumptions such as hardness of computational Diffie-Hellman (CDH) in the random oracle model, as well as LWE and RSA in the standard model.

Efficient isochronous fixed-weight sampling with applications to NTRU

Décio Luiz Gazzoni Filho, Tomás S. R. Silva, Julio López

Research paper PDFPDF

We present a solution to the open problem of designing a linear-time, unbiased and timing attack-resistant shuffling algorithm for fixed-weight sampling. Although it can be implemented without timing leakages of secret data in any architecture, we illustrate with ARMv7-M and ARMv8-A implementations; for the latter, we take advantage of architectural features such as NEON and conditional instructions, which are representative of features available on architectures targeting similar systems, such as Intel. Our proposed algorithm improves asymptotically upon the current approach based on constant-time sorting networks ($O(n)$ versus $O(n \log^2 n)$), and an implementation of the new algorithm applied to NTRU is also faster in practice, by a factor of up to $6.91\ (591\%)$ on ARMv8-A cores and $12.89\ (1189\%)$ on the Cortex-M4; it also requires fewer uniform random bits. This translates into performance improvements for NTRU encapsulation, compared to state-of-the-art implementations, of up to 50% on ARMv8-A cores and 72% on the Cortex-M4, and small improvements to key generation (up to 2.7% on ARMv8-A cores and 6.1% on the Cortex-M4), with negligible impact on code size and a slight improvement in RAM usage for the Cortex-M4.

Ad Hoc Broadcast, Trace, and Revoke Plus Time-Space Trade-Offs for Attribute-Based Encryption

Ji Luo

Research paper PDFPDF

Traitor tracing schemes [Chor–Fiat–Naor, Crypto ’94] help content distributors fight against piracy and are defined with the content distributor as a trusted authority having access to the secret keys of all users. While the traditional model caters well to its original motivation, its centralized nature makes it unsuitable for many scenarios. For usage among mutually untrusted parties, a notion of *ad hoc* traitor tracing (naturally with the capability of broadcast and revocation) is proposed and studied in this work. Such a scheme allows users in the system to generate their own public/secret key pairs, without trusting any other entity. To encrypt, a list of public keys is used to identify the set of recipients, and decryption is possible with a secret key for any of the public keys in the list. In addition, there is a tracing algorithm that given a list of recipients’ public keys and a pirate decoder capable of decrypting ciphertexts encrypted to them, identifies at least one recipient whose secret key must have been used to construct the said decoder.

Two constructions are presented. The first is based on functional encryption for circuits (conceptually, obfuscation) and has constant-size ciphertext, yet its decryption time is linear in the number of recipients. The second is a generic transformation that reduces decryption time at the cost of increased ciphertext size. A matching lower bound on the trade-off between ciphertext size and decryption time is shown, indicating that the two constructions achieve all possible optimal trade-offs, i.e., they fully demonstrate the Pareto front of efficiency. The lower bound also applies to broadcast encryption (hence all mildly expressive attribute-based encryption schemes) and is of independent interest.

Public-Key Authenticated Encryption with Keyword Search Made Easy

Qinyi Li, Xavier Boyen

Research paper PDFPDF

Public-key searchable encryption allows keyword-associated tokens to be used to test if a ciphertext contains specific keywords. Due to the low entropies of keywords, the token holder can create ciphertexts from candidate keywords and test them using the token in hand to recover the keywords, known as inside keyword guessing attacks (IKGA). Public-key authenticated encryption with keyword search is a searchable encryption proposed to defend against such attacks. It ensures the sender's private key protects the ciphertexts from the IKGA. PAEKS schemes with reasonable security and practical efficiency remain elusive despite many proposals. This work provides a simple generic PAEKS scheme from non-interactive key exchange (NIKE) and symmetric-key equality-predicate encryption with three new constructions for the latter, respectively from pseudorandom functions (PRFs), the decision bilinear Diffie-Hellman assumption, and the learning-with-errors assumption. Instantiating our generic scheme, we derive several PAEKS schemes from the most well-known assumptions, with some of them achieving full cipher-keyword indistinguishability and full token indistinguishability in the standard model, for the first time. Our instantiated schemes allow practical implementations and outperform the existing PAEKS schemes under the same assumptions.

A Long Tweak Goes a Long Way: High Multi-user Security Authenticated Encryption from Tweakable Block Ciphers

Benoît Cogliati, Jérémy Jean, Thomas Peyrin, Yannick Seurin

Research paper PDFPDF

We analyze the multi-user (mu) security of a family of nonce-based authentication encryption (nAE) schemes based on a tweakable block cipher (TBC). The starting point of our work is an analysis of the mu security of the SCT-II mode which underlies the nAE scheme Deoxys-II, winner of the CAESAR competition for the defense-in-depth category. We extend this analysis in two directions, as we detail now.

First, we investigate the mu security of several TBC-based variants of the counter encryption mode (including CTRT, the encryption mode used within SCT-II) that differ by the way a nonce, a random value, and a counter are combined as tweak and plaintext inputs to the TBC to produce the keystream blocks that will mask the plaintext blocks. Then, we consider the authentication part of SCT-II and study the mu security of the nonce-based MAC Nonce-as-Tweak (NaT) built from a TBC and an almost universal (AU) hash function. We also observe that the standard construction of an AU hash function from a (T)BC can be proven secure under the assumption that the underlying TBC is unpredictable rather than pseudorandom, allowing much better conjectures on the concrete AU advantage. This allows us to derive the mu security of the family of nAE modes obtained by combining these encryption/MAC building blocks through the NSIV composition method.

Some of these modes require an underlying TBC with a larger tweak length than what is usually available for existing ones. We then show the practicality of our modes by instantiating them with two new TBC constructions, Deoxys-TBC-512 and Deoxys-TBC-640, which can be seen as natural extensions of the Deoxys-TBC family to larger tweak input sizes. Designing such TBCs with unusually large tweaks is prone to pitfalls: Indeed, we show that a large-tweak proposal for SKINNY published at EUROCRYPT 2020 presents an inherent construction flaw. We therefore provide a sound design strategy to construct large-tweak TBCs within the Superposition Tweakey (STK) framework, leading to new Deoxys-TBC and SKINNY variants. We provide software benchmarks indicating that while ensuring a very high security level, the performances of our proposals remain very competitive.

Inspector Gadget

Camille Mutschler, Laurent Imbert, Thomas Roche

Research paper PDFPDF

We introduce InspectorGadget, an Open-Source Python-based software for assessing and comparing the complexity of masking gadgets. By providing a limited set of characteristics of a hardware platform, our tool allows to estimate the cost of a masking gadget in terms of cycle count equivalent and memory footprint. InspectorGadget is highly flexible. It enables the user to define her own estimation functions, as well as to expand the set of gadgets and predefined microcontrollers. As a case-study, we produce a fair comparison of several masked versions of Kyber compression function from the literature, together with novel alternatives automatically generated by our tool. Our results confirm that an interesting middle ground exists between theoretical performance measures (asymptotic complexity or operations count) and real implementations benchmarks (clock cycle accurate evaluations). InspectorGadget offers both simplicity and genericity while capturing the main performance-related parameters of a hardware platform.

Synchronous Distributed Key Generation without Broadcasts

Nibesh Shrestha, Adithya Bhat, Aniket Kate, Kartik Nayak

Research paper PDFPDF

Distributed key generation (DKG) is a key building block in developing many efficient threshold cryptosystems. This work initiates the study of communication complexity and round complexity of DKG protocols over a point-to-point (bounded) synchronous network. Our key result is the first synchronous DKG protocol for discrete log-based cryptosystems with $O(\kappa n^3)$ communication complexity ($\kappa$ denotes a security parameter) that tolerates any $t < n/2$ Byzantine faults among $n$ parties. We present two variants of the protocol: (i) a protocol with worst-case $O(\kappa n^3)$ communication and $O(t)$ rounds, and (ii) a protocol with expected $O(\kappa n^3)$ communication and expected constant rounds. In the process of achieving our results, we design (1) a novel weak gradecast protocol with a communication complexity of $O(\kappa n^2)$ for linear-sized inputs and constant rounds, (2) a protocol called “recoverable-set-of-shares” for ensuring recovery of shared secrets, (3) an oblivious leader election protocol with $O(\kappa n^3)$ communication and constant rounds, and (4) a multi-valued validated Byzantine agreement (MVBA) protocol with $O(\kappa n^3)$ communication complexity for linear-sized inputs and expected constant rounds. Each of these primitives is of independent interest.

Twinkle: A family of Low-latency Schemes for Authenticated Encryption and Pointer Authentication

Jianhua Wang, Tao Huang, Shuang Wu, Zilong Liu

Research paper PDFPDF

In this paper, we aim to explore the design of low-latency authenticated encryption schemes particularly for memory encryption, with a focus on the temporal uniqueness property. To achieve this, we present the low-latency Pseudo-Random Function (PRF) called Twinkle with an output up to 1152 bits. Leveraging only one block of Twinkle, we developed Twinkle-AE, a specialized authenticated encryption scheme with six variants covering different cache line sizes and security requirements. We also propose Twinkle-PA, a pointer authentication algorithm, which takes a 64-bit pointer and 64-bit context as input and outputs a tag of 1 to 32 bits.

We conducted thorough security evaluations of both the PRFs and these schemes, examining their robustness against various common attacks. The results of our cryptanalysis indicate that these designs successfully achieve their targeted security objectives.

Hardware implementations using the FreePDK45nm library show that Twinkle-AE achieves an encryption and authentication latency of 3.83 ns for a cache line. In comparison, AES-CTR with WC-MAC scheme and Ascon-128a achieve latencies of 9.78 ns and 27.30 ns, respectively. Moreover, Twinkle-AE is also most area-effective for the 1024-bit cache line. For the pointer authentication scheme Twinkle-PA, the latency is 2.04 ns, while QARMA-64-sigma0 has a latency of 5.57 ns.

FEDT: Forkcipher-based Leakage-resilient Beyond-birthday-secure AE

Nilanjan Datta, Avijit Dutta, Eik List, Sougata Mandal

Research paper PDFPDF

There has been a notable surge of research on leakage-resilient authenticated encryption (AE) schemes, in the bounded as well as the unbounded leakage model. The latter has garnered significant attention due to its detailed and practical orientation. Designers have commonly utilized (tweakable) block ciphers, exemplified by the TEDT scheme, achieving $\mathcal{O}(n-\log(n^2))$-bit integrity under leakage and comparable AE security in the black-box setting. However, the privacy of TEDT was limited by $n/2$-bits under leakage; TEDT2 sought to overcome these limitations by achieving improved security with $\mathcal{O}(n-\log n)$-bit integrity and privacy under leakage.

This work introduces FEDT, an efficient leakage-resilient authenticated encryption (AE) scheme based on fork-cipher. Compared to the state-of-the-art schemes TEDT and TEDT2, which process messages with a rate of $1/2$ block per primitive call for encryption and one for authentication, FEDT doubles their rates at the price of a different primitive. FEDT employs a more parallelizable tree-based encryption compared to its predecessors while maintaining $\mathcal{O}(n-\log n)$-bit security for both privacy and integrity under leakage. FEDT prioritizes high throughput at the cost of increased latency. For settings where latency is important, we propose FEDT*, which combines the authentication part of FEDT with a CTR-based encryption. FEDT* offers security equivalent to FEDT while increasing the encryption rate of $4/3$ and reducing the latency.

Efficient Post-Quantum Pattern Matching on Encrypted Data

Anis Bkakria, Malika Izabachène

Research paper PDFPDF

Pattern matching methods are essential in various applications where users must disclose highly sensitive information. Among these applications are genomic data analysis, financial records inspection, and intrusion detection processes, all of which necessitate robust privacy protection mechanisms. Balancing the imperative of protecting the confidentiality of analyzed data with the need for efficient pattern matching presents a significant challenge.

In this paper, we propose an efficient post-quantum secure construction that enables arbitrary pattern matching over encrypted data while ensuring the confidentiality of the data to be analyzed. In addition, we address scenarios where a malicious data sender, intended to send an encrypted content for pattern detection analysis, has the ability to modify the encrypted content. We adapt the data fragmentation technique to handle such a malicious sender. Our construction makes use of a well-suited Homomorphic Encryption packing method in the context of fragmented streams and combines homomorphic operations in a leveled mode (i.e. without bootstrapping) to obtain a very efficient pattern matching detection process.

In contrast to the most efficient state-of-the-art scheme, our construction achieves a significant reduction in the time required for encryption, decryption, and pattern matching on encrypted data. Specifically, our approach decreases the time by factors of $1850$, $10^6$, and $245$, respectively, for matching a single pattern, and by factors of $115$, $10^5$, and $12$, respectively, for matching $2^{10}$ patterns.