Towards Practical Transciphering for FHE with Setup Independent of the Plaintext Space

Fully Homomorphic Encryption (FHE) is a powerful tool to achieve non-interactive privacy-preserving protocols with optimal computation/communication complexity. However, its main disadvantage is that the actual communication cost (bandwidth) is high due to the large size of FHE ciphertexts. As a solution, a technique called transciphering (also known as Hybrid Homomorphic Encryption) was introduced to achieve almost optimal bandwidth for such protocols. However, all existing works require clients to fix a precision for the messages or a mathematical structure for the message space beforehand. This results in unwanted constraints on the plaintext size or the underlying structure of FHE-based applications. In this article, we introduce, for the first time, a new approach for transciphering which does not require the client to fix the message precision. In more detail, a client uses any kind of FHE-friendly symmetric cipher for $\{0, 1\}$ to send its input data encrypted bit-by-bit; then the server can choose a precision $p$ depending on the application and homomorphically transform the encrypted bits into FHE ciphertexts encrypting integers in $\mathbb{Z}_p$. To illustrate our new technique, we evaluate a transciphering using the FiLIP cipher and adapt the most practical homomorphic evaluation technique [CCS'22] to keep the latency practical. As a result, our proof-of-concept implementation for $p$ from $2^2$ to $2^8$ takes only from 13 ms to 137 ms.


Introduction
As fully homomorphic encryption (FHE) allows any computation over encrypted data, it can be applied to secure outsourced computation, where a server with strong computational resources performs the requested computation over a client's data while preserving the client's privacy. It has recently unlocked many real-world applications, such as privacy-preserving machine learning [TBK20, ZS21, CDPP22, BPM22, KJL+22, SFB+23] and secure outsourced storage [CCR19, CDNP23].
While the latency of this protocol is deemed sufficiently practical for prototype applications, the associated network cost remains undesirable due to the considerable size of FHE ciphertexts. In fact, FHE suffers from a large ciphertext expansion: the client needs to upload data that is several orders of magnitude larger than the original data. For example, in the worst case, an FHE ciphertext encrypting one single bit costs 2.5 KB if we use a small TFHE [CGGI20] ciphertext achieving 128 bits of security.
To solve this problem, the transciphering (a.k.a. Hybrid Homomorphic Encryption (HHE)) approach has been proposed [NLV11]: a client encrypts the data using some block or stream cipher $\Pi$, and sends the ciphertexts to the server. Because these ciphers have ciphertext expansion close to one or exactly one, the upload size is now almost the same as the size of the data itself. The server then runs the decryption of $\Pi$ homomorphically, using an FHE scheme, obtains FHE.Enc($m$), and can then proceed with the usual homomorphic computation.
One way of implementing transciphering is by simply asking the client to encrypt the data using some well-known cipher, like AES. However, it is generally very expensive to evaluate the decryption of such ciphers using FHE. Hence, multiple works have proposed transciphering strategies by first constructing FHE-friendly ciphers [ARS+15, CCF+16, MJSC16, DEG+18, HL20, DGH+23, MCJS19, HKC+20, CHK+21, HKL+22, AMT22, CHMS22], whose decryption function can be evaluated homomorphically using little memory and time. However, all of these FHE-friendly ciphers fix a plaintext space (denoted by $\mathcal{M}$) to obtain efficiency gains during the FHE evaluation or to fit the constraints of the security analysis. For example, most of them [ARS+15, DEG+18, MCJS19, HL20] are designed for $\mathcal{M} = \mathbb{F}_2$. For larger spaces, PASTA [DGH+23] requires $\mathcal{M} = \mathbb{F}_p^k$, for a positive integer $k$ and a prime $p$ such that $p - 1$ is not divisible by 3, and HERA [CHK+21] requires $\mathcal{M} = \mathbb{Z}_t$ where $t \geq 2^{16}$.
These natural strategies limit the versatility of the computations to evaluate, since each application has its ideal plaintext space, which can also evolve based on the client's requests. For example, if we want to evaluate binary circuits, we set $\mathcal{M} = \mathbb{F}_2$; if we want to work with bytes, we set $\mathcal{M} = \mathbb{Z}_{2^8}$; etc. Thus, to use transciphering without relying on a sole model of computation, the client would have to be able to implement several different ciphers, depending on each application. For illustration, we consider the following scenario: medical doctors/researchers handling patients' private data want to study a relation between certain diseases and a specific human genome sequence via a secure machine learning algorithm, preserving each individual patient's privacy. Patients' data is already stored up to a certain number of bits (let's say 32 bits). Depending on which algorithm the computing party uses and the accuracy rate they want to achieve, the input precision differs. For example, the recent secure neural network instantiation [SFB+23] uses 8-bit integers for its inputs, which is enough to achieve over 90% accuracy, and [CDPP22, TBK20] use 11 to 16 bits for private decision tree evaluation. We now discuss the complexities and inconveniences that would arise for both the server and the client from employing distinct ciphers for various computations.
First, it means that the client would have to generate and manage several keys and run the setup of the transciphering for each plaintext space appropriate for the desired computations. The setup part of a transciphering is expensive (it is the bottleneck in bandwidth for the HHE protocol, e.g., [DGH+23]), and this cost is amortized because the setup is run only once in the full protocol. However, this is no longer true if the client has to run several different setups. Second, working with multiple different ciphers is far from optimal from the security point of view, because the security of the full protocol relies on the security of all these ciphers. The full protocol is secure only if all of them are secure, alone and combined, since the same data is encrypted with the different schemes. Assuming the security of multiple ciphers alone and combined is a stronger assumption than relying on the security of a sole cipher.
As a last downside of a fixed plaintext space for transciphering, the aforementioned message spaces of existing FHE-friendly ciphers do not always match the most common plaintext spaces and structures used by applications. Namely, since general-purpose CPUs use bytes and 32-bit or 64-bit integers, we expect to use these data types most of the time in our applications. Thus, the FHE schemes would have to use the rings $\mathbb{Z}_{2^8}$, $\mathbb{Z}_{2^{32}}$, and $\mathbb{Z}_{2^{64}}$ as message spaces. Moreover, FHE-friendly symmetric ciphers are designed to fit the particularities of one FHE scheme, and the best performance in terms of latency and throughput of a transciphering is obtained when the transciphering is considered on its own. One downside is that obtaining this best transciphering performance constrains the choice of parameters of the chosen FHE scheme, which focuses the optimization on the symmetric cipher rather than on the evaluation of the functions that the server will have to compute later on.
Therefore, it would be ideal to have an efficient transciphering technique that does not require the FHE scheme's message space as a parameter on the client's side, so that the server can transform the client's data, on the fly, into FHE ciphertexts whose message space fits the application at hand.

Our Contributions
In this article, we introduce, for the first time, a possible solution to achieve the aforementioned ideal case. Our solution is to "compose" bits into an integer homomorphically. In more detail, a client encrypts each bit of its input data separately using an FHE-friendly cipher for $\mathbb{Z}_2$, then uploads those ciphertexts. The server homomorphically transforms them into FHE ciphertexts encrypting bits (where by bits we refer to the integer values 0 and 1, without the structure of $\mathbb{F}_2$) and homomorphically composes them into an integer, by taking some of the given encrypted bits, depending on the precision that the target application requires.
Since a client only sends the bit representation of its data, regardless of its original data size, the message space for applications based on FHE is not specified beforehand. Let us assume that the client's input data size is initially set to $t$ bits. If an application which the client targets requires $\mathcal{M} = \mathbb{Z}_p$, where $\log p \leq t$, the server takes the upper $\log p$ bits of the client's data, stored under the FHE-friendly cipher, and transforms them into an FHE ciphertext encrypting a message in $\mathcal{M}$ via our transciphering technique.
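To make the bit selection concrete, the following plaintext-only sketch decomposes a $t$-bit integer into bits and recomposes its upper $\log p$ bits as an element of $\mathbb{Z}_p$. It involves no encryption at all; the function names `to_bits` and `compose_top_bits` are ours and purely illustrative, and we assume that $p$ is a power of two.

```python
def to_bits(mu: int, t: int) -> list[int]:
    """Little-endian bits b_0, ..., b_{t-1} of a t-bit integer mu."""
    if not 0 <= mu < (1 << t):
        raise ValueError("mu must fit in t bits")
    return [(mu >> j) & 1 for j in range(t)]


def compose_top_bits(bits: list[int], log_p: int) -> int:
    """Compose the log_p most significant bits into an element of Z_p, p = 2**log_p."""
    t = len(bits)
    if not 0 < log_p <= t:
        raise ValueError("need 0 < log_p <= t")
    top = bits[t - log_p:]  # the bits the server chooses to keep
    # Bit number j of the selection is weighted by 2**j, as in the composition step.
    return sum(b << j for j, b in enumerate(top))


# The client always sends all t bits; the server picks the precision afterwards.
bits = to_bits(0b10110111, 8)
value_3_bits = compose_top_bits(bits, 3)  # keeps the top 3 bits
value_8_bits = compose_top_bits(bits, 8)  # keeps everything
```

The same list of bits serves every precision, which is the point of making the setup independent of $p$.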
In our instantiation, we use the FiLIP cipher [MCJS19] on the client's side, since its homomorphic decryption is already optimized by [CDPP22] for a practical application and takes only 2.6 milliseconds per bit. Therefore, this small computational overhead is preserved in our case. Moreover, to minimize the additional computational overhead, we tweak their approach to directly produce an FHE ciphertext encrypting a bit scaled by the corresponding power of 2, instead of computing $2^j \cdot \mathsf{FHE.Enc}(b_j) = \mathsf{FHE.Enc}(2^j \cdot b_j)$, where $b_j \in \{0, 1\}$, after transciphering.
We implement our result as a proof of concept using FINAL [BIP+22] as the underlying FHE scheme and choose the message precision $\log p$ in the range from 2 to 8 bits. The corresponding running time of the server ranges from 6.5 ms to 18 ms per bit, using only a single thread, for the full transciphering over $\mathbb{Z}_p$. Since we cannot directly use the optimization of [CDPP22] for an efficient composition technique, we have more computational overhead than their result. Moreover, the choice of parameters differs for different $\log p$ to manage the noise growth, which affects the computation time. Compared to the most recent transciphering result (Elisabeth-4 [CHMS22], which is designed for 4-bit integers only), whose best performance is 371 ms per bit without parallel computation, our method is faster and easier to adapt to different use cases.

Technical Overview
We start with a bit representation of integer data elements. On the client side, an integer $\mu$ of $t$ bits is written as $\mu = \sum_{j=0}^{t-1} 2^j \cdot b_j$ with $b_j \in \{0, 1\}$. Then, each $b_j$ is encrypted with the FiLIP cipher, generating a ciphertext $c_j$, which is sent to the server. On the server side, depending on the target application, a precision $\log p$ is chosen, and the $\log p$ most significant bits of $\mu$ are considered. The goal is to generate an FHE ciphertext encrypting $\tilde{\mu} := \mu - (\mu \bmod 2^{t - \log p})$. To do so, we proceed in two steps. First, we homomorphically evaluate a modified version of FiLIP's decryption so that, instead of just generating $\mathsf{FHE.Enc}(b_j)$ for the desired bits, we generate $\mathsf{FHE.Enc}(2^j \cdot b_j)$, where the message space is $\mathbb{Z}_p$. This is the main step. Second, it just remains to homomorphically add all those ciphertexts, so that we obtain a single ciphertext encrypting the composed integer.

First of all, notice that FiLIP's decryption requires evaluating a Boolean function $g$ on a secret vector $\mathbf{v}$ derived from the secret key sk. Thus, it would be natural to work with an FHE scheme whose message space is binary. However, we have to adapt the decryption to work modulo $p$. More specifically, for the instances of FiLIP we consider, $g(\mathbf{v})$ consists of two main sub-functions: one that computes the XOR of the first $k$ bits of $\mathbf{v}$, denoted by $x := \mathrm{XOR}(x_1, \ldots, x_k)$, and a Boolean threshold function, denoted by $y := T_{d,s}(\mathbf{y})$, which outputs 1 if the Hamming weight (HW, denoted by $W_H$) of $\mathbf{y} = (y_1, \ldots, y_s)$, the last $s$ bits of $\mathbf{v}$, is equal to or greater than $d$. The results of these two sub-functions are then XORed.

Since the threshold function involves a comparison, it is hard to evaluate it homomorphically. Thus, our strategy is to use homomorphic look-up tables to evaluate it. In more detail, we use the standard technique of mapping an encryption of a monomial $X^m$, for some integer $m$, to an encryption of $f(m)$ by multiplying by a so-called "test polynomial" $T(X)$ that depends on $f$. Thus, we define $T(X)$ with respect to the function $g$ of FiLIP's decryption and develop an arithmetic gate that computes the XORs already multiplied by $T(X)$, so that, applying it $k$ times, we have $\mathsf{FHE.Enc}(T(X) \cdot \mathrm{XOR}(x_1, \ldots, x_k))$, which we can lift to the exponent of $X$, obtaining $\mathsf{FHE.Enc}(T(X) \cdot X^{\mathrm{XOR}(x_1, \ldots, x_k)})$. Then, we multiply this ciphertext by encryptions of $X^{2 y_i}$, for each of the last $s$ bits of $\mathbf{v}$, denoted $y_i$. With this, we obtain $\mathsf{FHE.Enc}(T(X) \cdot X^{\mathrm{XOR}(x_1, \ldots, x_k) + 2 \cdot W_H(\mathbf{y})})$. But, due to the way $T(X)$ is defined, this is basically the same as $\mathsf{FHE.Enc}(2^j \cdot g(\mathbf{v}))$; thus, we just have to combine this result with the FiLIP ciphertext $c_j$ encrypting the bit $b_j$ to finally produce $\mathsf{FHE.Enc}(2^j \cdot b_j)$. We illustrate this process in Figure 1. It is worth noting that the building blocks presented in Section 3, such as the homomorphic scaled XOR gate and the homomorphic lifting operation including the test polynomial, are novel operations that can be of independent interest.

Related works, alternatives for secure computations without FHE ciphertext expansion
In this work we focus on the communication cost of FHE-based solutions for secure computation in practice; other techniques, such as multi-party computation protocols, can also be used. The baselines we compare are the size of the communicated or stored data when it is encrypted using FHE only (with high ciphertext expansion) and with HHE, where symmetric encryption allows communication without ciphertext expansion. There are orthogonal methods to circumvent the ciphertext expansion.
Rate-1 FHE. Recent works such as [GH19] and [BDGM19] propose FHE schemes with smaller ciphertext expansion, referred to as high-rate FHE schemes. These compressed FHE schemes bring the expansion factor asymptotically close to 1 and are called rate-1 FHE schemes. The first one gives a ratio between aggregate plaintext size and aggregate ciphertext size of $1 - \varepsilon$ for any $\varepsilon$ (assuming the aggregate plaintext is sufficiently large, proportional to $1/\varepsilon^3$). This rate-1 scheme is used in [MW22] to get a better rate ($n^2/(n^2 + n)$, where $n$ is the plaintext dimension) to perform single-server private information retrieval. The second one allows compressing many ciphertexts into a compressed ciphertext reaching rate $1 - 1/\lambda$, where $\lambda$ is the security parameter.
Compact storage with homomorphic encryption retrieval. Recently, the study in [AOSV23] introduced an alternative to FHE-based outsourced computation in a two-server model. This approach is built around two protocols that depend on the collaboration of two servers, which are assumed not to collude. For data storage, a client (or multiple clients) employs a two-out-of-two secret sharing scheme for their data. This entails that one share is held by an auxiliary server, while the other is held by a computing server. The auxiliary server then homomorphically encrypts its share and forwards it to the computing server. Using its plaintext share in conjunction with the encrypted share from the auxiliary server, the computing server is able to reconstruct a homomorphic encryption of the secret. Consequently, the computing server can either return the homomorphically encrypted data to the client or apply homomorphic computations on it prior to doing so. According to this protocol, the storage requirement for the client is double the size of the original data (alternatively, it remains the same size, as described in [AOSV23], through the on-the-fly generation of the second share). The burden of ciphertext expansion is borne by the two servers during the retrieval of data rather than by the client, as they are the ones using homomorphic encryption.

Vectors, matrices, distributions
Notation: We use lower-case bold letters for vectors and upper-case bold letters for matrices. The zero vector is denoted by $\mathbf{0}$. The inner (dot) product of two vectors $\mathbf{a}$ and $\mathbf{b}$ is denoted by $\mathbf{a} \cdot \mathbf{b}$ or $\langle \mathbf{a}, \mathbf{b} \rangle$. For any vector $\mathbf{u}$, $\|\mathbf{u}\|_\infty$ denotes the infinity norm. For a vector $\mathbf{x}$, $\mathbf{x}[i]$ or $x_i$ denotes the $i$-th component of $\mathbf{x}$. We use the Euclidean norm as the default norm $\|\mathbf{x}\|$ of a vector $\mathbf{x}$.

Subgaussian distribution.
For the analysis of noise of each homomorphic operation, we need subgaussian random variables over R.
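A random variable $V$ over $\mathbb{R}$ is $\sigma$-subgaussian if $\mathbb{E}[\exp(tV)] \leq \exp(t^2 \sigma^2 / 2)$ for all $t \in \mathbb{R}$. From the definition, we can prove that the variance of $V$, denoted by $\mathrm{Var}(V)$, is bounded by $\sigma^2$, i.e., $\mathrm{Var}(V) \leq \sigma^2$. Informally, the tails of $V$ are dominated by a Gaussian function with standard deviation $\sigma$. For our noise analysis, we use the property that a vector with subgaussian coordinates is also subgaussian, which is proved in [JLP23]. Subgaussian random variables have an important property called Pythagorean additivity. Given two independent random variables, $\alpha$-subgaussian $X$ and $\beta$-subgaussian $Y$, and $a, b \in \mathbb{Z}$, the random variable $a \cdot X + b \cdot Y$ is $\sqrt{a^2 \alpha^2 + b^2 \beta^2}$-subgaussian.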
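As a small numerical illustration of Pythagorean additivity, the sketch below combines subgaussian parameters and shows that adding $k$ independent terms of parameter $E$ yields a parameter of $\sqrt{k} \cdot E$ rather than $k \cdot E$. The helper names are ours; the code only manipulates parameters and does not sample any distribution.

```python
import math


def combine(a: int, alpha: float, b: int, beta: float) -> float:
    """Subgaussian parameter of a*X + b*Y for independent X (alpha) and Y (beta)."""
    return math.hypot(a * alpha, b * beta)


def sum_of_k_terms(E: float, k: int) -> float:
    """Parameter of the sum of k independent E-subgaussian terms, built iteratively."""
    acc = 0.0
    for _ in range(k):
        acc = combine(1, acc, 1, E)
    return acc


# Sixteen additions of noise with parameter 2.0 give about 8.0, not 32.0.
print(sum_of_k_terms(2.0, 16))
```

This is the reason noise bounds in the rest of the analysis are expressed with sums of squares under a square root.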

Fully homomorphic encryption
Roughly speaking, we can divide fully homomorphic encryption (FHE) schemes into two classes: one with large ciphertexts, packing, and slow bootstrapping, and the other with small ciphertexts, no packing, but fast and programmable bootstrapping. The first family contains schemes such as BGV [BGV12], FV [FV12], and CKKS [CKKS17], while the second one is represented by schemes like FHEW [DM15], TFHE [CGGI16], FHE over the integers [Per21], and FINAL [BIP+22]. In this work, we are only interested in the second type of FHE; thus, in this section we present a general and abstract definition of an FHE scheme that can be instantiated with any of those schemes.
There are three types of ciphertexts:
• Integer ciphertext, which is defined over the set $\mathbb{Z}_q$ for some $q \in \mathbb{N}$. We denote by $\mathsf{IntCtxt}_z(\lfloor q/p \rceil \cdot m, E)$ the set of integer ciphertexts encrypting $m \in \mathbb{Z}_p$, under key $z$, and with $E$-subgaussian noise. They are the output format of our transciphering and the input format of the subsequent homomorphic computation.
• Ring ciphertext, which is defined as a single element or a pair of elements of $R_Q := R / Q R$, where $R := \mathbb{Z}[X] / \langle X^N + 1 \rangle$ with $N$ a power of two. We denote by $\mathsf{RingCtxt}_s(\lfloor Q/p \rceil \cdot m, E)$ the set of ring ciphertexts encrypting $m \in R_p$, under key $s \in R$, and with $E$-subgaussian noise.
• Gadget ciphertext, which is defined as a vector or a matrix with entries in $R_Q$. They are parametrized by a decomposition base, say, $B_g$, and a dimension $\ell = O(\log_{B_g} Q)$. We denote by $\mathsf{GadgetCtxt}^{Q,\ell}_s(m, E)$ the set of gadget ciphertexts encrypting $m \in R$ under key $s \in R$, with ciphertext modulus $Q$, dimension $\ell$, and with $E$-subgaussian noise.
Note that we omit the noise parameter when we define any ciphertext if it is not necessary in the context.
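The scaled encoding $\lfloor q/p \rceil \cdot m$ used in these definitions can be illustrated with a toy model that has no secret key and offers no security: a message is scaled, a noise term is added, and decoding rounds back to $\mathbb{Z}_p$. The functions `encode` and `decode` are hypothetical names of ours and only mimic the message and noise part of an integer ciphertext.

```python
def encode(m: int, p: int, q: int, noise: int) -> int:
    """Return round(q/p) * m + noise modulo q (toy model, no key involved)."""
    delta = (2 * q + p) // (2 * p)  # integer rounding of q/p
    return (delta * m + noise) % q


def decode(c: int, p: int, q: int) -> int:
    """Round c * p / q to the nearest integer and reduce modulo p."""
    return ((c * p + q // 2) // q) % p


# With q = 1024 and p = 8, the scaling factor is 128 and any noise of
# absolute value below 64 is removed by the rounding.
p, q = 8, 1024
recovered = [decode(encode(m, p, q, noise=-17), p, q) for m in range(p)]
```

In this toy setting, decoding fails as soon as the noise reaches half of the scaling factor, which is why the noise parameter $E$ is tracked through every homomorphic operation.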
Given the security parameter $\lambda$, we typically have $q, Q \in \tilde{O}(\lambda^{1.5})$ and $N \in O(\lambda)$. We assume that all ciphertexts carry an estimation of their current noise, which increases as we operate homomorphically with them.
The concrete structure of the ciphertexts is not relevant for the general description of our transciphering; thus, we present them and their corresponding homomorphic operations only abstractly, by the following algorithms. For some concrete instantiations of this abstract scheme, we refer to [CGGI16], where the gadget ciphertexts are $2\ell \times 2$ matrices, and to [BIP+22], where the gadget ciphertexts are $\ell$-dimensional vectors.
• FHE.ParamGen($1^\lambda$, $p$): generate parameters params that achieve $\lambda$ bits of security and allow us to work with plaintext space $\mathbb{Z}_p$. The parameters also include the ring $R$, an integer $B_g$, called the decomposition base, and $\ell := \lceil \log_{B_g} Q \rceil$, which defines the dimension of the gadget ciphertexts.
• FHE.KeyGen(params): generate the secret key sk := ($z$, $s$), a key-switching key ksk from $\mathbf{s}$ to $z$, where $\mathbf{s}$ is the vector of coefficients of $s$, and the bootstrapping key bk.
• FHE.DecInt($z$, $c$): output the message $m \in \mathbb{Z}_p$ encrypted by $c$ under the secret key $z$.
• FHE.EncGadget($s$, $m$): output a gadget ciphertext encrypting $m$ under the key $s$.
• Trivial-noiseless ciphertext: any FHE ciphertext defined above where all the randomness and the noise are set to 0.
• FHE.Add: homomorphically add two ciphertexts of the same type, e.g., it maps two ring ciphertexts encrypting $m_0$ and $m_1$ with noise parameters $E_0$ and $E_1$ to a ring ciphertext encrypting $m_0 + m_1$ with noise parameter $\sqrt{E_0^2 + E_1^2}$.
• FHE.AddPtxt: given a ciphertext of any type, encrypting some message $m_0$, and a plaintext $m_1$, this operation outputs a ciphertext of the same type encrypting $m_0 + m_1$. The noise is unchanged, i.e., both input and output have the same noise.
• FHE.MultPtxt: given a message $m_0 \in R_p$ and a ciphertext , where $B_g$ is the decomposition base. For succinctness, we can write $c_0$ , where $m_i$ is the $i$-th coefficient of $m$. We note that the extraction algorithm (FHE.Extract) is defined as SampleExtract in [CGGI20] and it does not add any noise to the ciphertext. Moreover, it is almost free in practice, since it only rearranges the order of the components of the input vector/polynomial, which is much cheaper than the other operations.
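• FHE.bootstrap: given $c' \in \mathsf{IntCtxt}_z(\lfloor q/p \rceil \cdot m, E')$ and a function $f : \mathbb{Z}_p \to \mathbb{Z}_{\bar{p}}$, output an integer ciphertext encrypting $f(m)$ with ciphertext modulus $\bar{q}$ and plaintext modulus $\bar{p}$. Notice that the bootstrapping allows us to change the ciphertext modulus from $q$ to $\bar{q}$ and the plaintext modulus from $p$ to $\bar{p}$, but in most cases one chooses $\bar{q} = q$ and $\bar{p} = p$.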
To analyze the noise growth of a sequence of homomorphic operations, we can iteratively apply the noise bounds of each operation. For example, to homomorphically compute $3 \cdot (m_0 + m_1)$, we could have $c' = \mathsf{FHE.Add}(c_0, c_1)$ and the final ciphertext as $c = \mathsf{FHE.MultPtxt}(c', 3)$. Then, assuming that $c_i$ has noise with parameter $E_i$, the noise of $c'$ would have parameter $\bar{E} = \sqrt{E_0^2 + E_1^2}$, and the final noise would be $E$-subgaussian, where $E$ follows from $\bar{E}$ and the noise bound of FHE.MultPtxt.

Most of the time, deriving the noise as above is good enough. However, there is a special case, used to construct our homomorphic XOR gate modulo $p$ presented in Section 3.1, where we can obtain better bounds by analyzing the final noise more carefully (see Lemma 1). Notice that the final noise in the lemma just has $E_0$ itself, while a naive computation would give us noise including $2 \cdot E_0$. This would be problematic because applying this homomorphic computation iteratively $k$ times would introduce a factor of $2^k$ in the noise estimate, which would then be far from the actual noise.
Lemma 1. Let $c_0$ be a ring ciphertext with $E_0$-subgaussian noise and let $C_1 \in \mathsf{GadgetCtxt}^{Q,\ell}_s(m_1, E_1)$, with $m_1 \in \{0, 1\}$. Now, consider the following homomorphic computation:

Proof. For any ciphertext $c$, denote by $\mathrm{Err}(c)$ the noise term included in $c$. Let $\mathbf{y}$ be the vector with the decomposition in base $B_g$ of $c'$. Then, assuming that $\mathrm{Err}(C_1)$ and $\mathrm{Err}(c_0)$ are independent, the Pythagorean additivity of subgaussians gives the noise bound. The correctness of the encrypted message follows directly from the definition of the homomorphic operations.

Corollary 1. Instead of the homomorphic computation presented in Lemma 1, if we compute
Proof. This follows directly from the fact that FHE.AddPtxt does not add any noise to the ciphertexts.

FiLIP cipher
FiLIP is a binary stream cipher based on the improved filter permutator paradigm [MCJS19].
The encryption and decryption algorithms work as follows. Let $K \in \{0, 1\}^Z$ be the secret key; for each bit $m_i$ of the message, we use a forward-secure PRNG to sample
• $S_i$: a subset of $z$ out of the $Z$ key bits,
• $P_i$: a $z$-to-$z$ permutation,
• $w_i$: a $z$-dimensional binary vector called whitening.
Then, for a filter function $f : \{0, 1\}^z \to \{0, 1\}$ fixed beforehand, we compute $c_i = m_i \oplus f(P_i(S_i(K)) \oplus w_i)$. The paradigm of FiLIP is recalled in Figure 2. We implemented the variant that is called FiLIP-144 in [HMR20], which consists in setting $Z = 2^{14}$, $z = 144$, and $f$ as the XOR-THR function XTHR[81, 32, 63] (that we recall in Definition 2). We note that those parameters of FiLIP-144 yield 128-bit security, following the analysis in [MCJS19]. The cryptographic parameters of XOR-THR functions are studied in detail in [CM22].
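The following sketch mimics this paradigm in the clear, with the XOR-THR filter described in the technical overview (XOR of the first $k$ bits, XORed with a threshold on the Hamming weight of the last $s$ bits). It is only an illustration under strong assumptions: Python's `random.Random`, seeded with the IV, stands in for the forward-secure PRNG and is not cryptographically secure, and the function names are ours, not those of any FiLIP implementation.

```python
import random


def xthr(bits: list[int], k: int, d: int, s: int) -> int:
    """XOR of the first k bits, XORed with [Hamming weight of the last s bits >= d]."""
    assert len(bits) == k + s
    xor_part = sum(bits[:k]) % 2
    threshold_part = 1 if sum(bits[k:]) >= d else 0
    return xor_part ^ threshold_part


def keystream_bit(key: list[int], rng: random.Random, k: int, d: int, s: int) -> int:
    z = k + s
    subset = rng.sample(range(len(key)), z)          # S_i: z key positions
    perm = list(range(z))
    rng.shuffle(perm)                                 # P_i: permutation of the subset
    whitening = [rng.randrange(2) for _ in range(z)]  # w_i: whitening vector
    v = [key[subset[perm[i]]] ^ whitening[i] for i in range(z)]
    return xthr(v, k, d, s)


def filip_like_encrypt(key, iv, message_bits, k=81, d=32, s=63):
    """Bitwise stream encryption; applying it twice with the same IV decrypts."""
    rng = random.Random(iv)  # stand-in only, NOT a forward-secure PRNG
    return [m ^ keystream_bit(key, rng, k, d, s) for m in message_bits]
```

Since each message bit yields exactly one ciphertext bit, the ciphertext expansion is one, and decryption is the same function called with the same key and IV.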

Ad hoc homomorphic building blocks
In this section we define new homomorphic operations that will be used in our transciphering. They are constructed using the operations defined in Section 2.2; thus, they can also be instantiated with any FHEW-like scheme.

Homomorphic XOR modulo p
We start with the simplest scenario where the ciphertexts only encrypt bits, but using $\mathbb{Z}_p$ as the message space. Given $m_0, m_1 \in \{0, 1\}$, we can see that $\mathrm{XOR}(m_0, m_1) = m_0 + m_1 - 2 \cdot m_0 \cdot m_1$. Thus, we can easily compute an encryption of $\mathrm{XOR}(m_0, m_1)$ given encryptions of $m_0$ and $m_1$, as we show in Appendix A. In fact, one could implement FiLIP modulo $p$ using such a simple homomorphic XOR. However, at the very end of the main loop, after all the external products, one would obtain an encryption of a power of $X$ and would have to multiply it by the test polynomial $T(X)$ to extract XTHR[$k$, $d$, $s$]. But multiplying by $T(X)$ introduces an extra factor of $\sqrt{N}$ in the final noise. Thus, as was done originally in the bootstrapping of TFHE [CGGI16], we would like to start the loop with $T(X)$ already, so that it is always multiplied on the left and does not impact the noise.

Figure 3: Two strategies to compute a sequence of XOR gates multiplied by a polynomial. In both cases, the output is $T(X) \cdot \mathrm{XOR}(m_2, \mathrm{XOR}(m_1, m_0))$, but the second computation inserts $T(X)$ right at the beginning and carries it until the end, thus reducing the final noise when evaluated homomorphically.

Algorithm 1: FHE.XOR
For this, we introduce another homomorphic XOR gate modulo $p$ that outputs an encryption of $u \cdot \mathrm{XOR}(m_0, m_1)$ for any polynomial $u$, so that we can carry the test polynomial $T(X)$ from the beginning of the computation and do not need to multiply by it at the end, thus reducing the final noise. This is illustrated in Figure 3.
In more detail, let $m_0, m_1 \in \{0, 1\}$ and let $u$ be a polynomial. We show the gate in detail in Algorithm 1; its noise bound follows from Lemma 1 and the properties of FHE.Add. Moreover, one can see that the output of FHE.XOR is composable: if we execute $k$ consecutive compositions of this gate with gadget ciphertexts in $\mathsf{GadgetCtxt}^{Q,\ell}_s(m_j, \hat{E}_j)$ for $1 \leq j \leq k$, we obtain an encryption of $u \cdot \mathrm{XOR}(m_0, m_1, \ldots, m_k)$. Additionally, we can verify that the final noise is $E$-subgaussian, with a bound that we refer to as Inequality (1).
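At the plaintext level, carrying $u$ through a chain of XORs can be checked with the identity $u \cdot \mathrm{XOR}(x, m) = u x + (u - 2 u x) \cdot m$ for bits $x$ and $m$. The sketch below works on coefficient lists modulo $p$ and is only our reading of how such a gate can act on messages; it is not a transcription of Algorithm 1 and involves no ciphertexts or noise.

```python
def scaled_xor_step(acc: list[int], u: list[int], m: int, p: int) -> list[int]:
    """Given acc = u * x for a bit x, return u * XOR(x, m) modulo p, coefficient-wise."""
    return [(a + (ui - 2 * a) * m) % p for a, ui in zip(acc, u)]


def scaled_xor_chain(u: list[int], bits: list[int], p: int) -> list[int]:
    """Compute u * XOR(bits[0], ..., bits[-1]) modulo p, with u present from the start."""
    acc = [(ui * bits[0]) % p for ui in u]
    for m in bits[1:]:
        acc = scaled_xor_step(acc, u, m, p)
    return acc


# With u = [3, 0, 5, 1] and bits whose XOR is 1, the result is u itself.
result = scaled_xor_chain([3, 0, 5, 1], [1, 0, 1, 1], p=16)
```

The polynomial $u$ is never multiplied in at the end; it is present in the accumulator from the first step, which mirrors the second strategy of Figure 3.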

Homomorphically lifting a bit to the exponent
Let $u$ be a polynomial and $b \in \{0, 1\}$. This homomorphic operation takes an encryption of $u \cdot b$ and outputs an encryption of $u \cdot X^b$. Given $c \in \mathsf{RingCtxt}_s(\Delta \cdot u \cdot b, E)$, the output is computed with plaintext-ciphertext operations only, and the correctness and noise growth follow directly from the properties of the plaintext-ciphertext addition and multiplication. We show it in detail in Algorithm 2.
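One message-level computation consistent with this description is $u \cdot b \cdot (X - 1) + u$, which equals $u$ when $b = 0$ and $u \cdot X$ when $b = 1$, that is, $u \cdot X^b$. The sketch below checks this in $\mathbb{Z}[X]/(X^N + 1)$ on coefficient lists. It is our assumption about the shape of the operation, not a transcription of Algorithm 2, and the helper names are illustrative.

```python
def mul_by_x(poly: list[int]) -> list[int]:
    """Multiply by X in Z[X]/(X^N + 1): rotate and negate the wrapped coefficient."""
    return [-poly[-1]] + poly[:-1]


def lift_exp_plain(ub: list[int], u: list[int]) -> list[int]:
    """Given the coefficients of u*b for a bit b, return those of u*b*(X - 1) + u."""
    shifted = mul_by_x(ub)
    return [s - c + ui for s, c, ui in zip(shifted, ub, u)]


u = [1, 2, 3, 4]
lifted_zero = lift_exp_plain([0, 0, 0, 0], u)  # b = 0, expected u
lifted_one = lift_exp_plain(u, u)              # b = 1, expected u * X
```

Only one multiplication by a fixed plaintext and one plaintext addition are needed, which matches the remark that correctness and noise follow from the plaintext-ciphertext operations.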
Algorithm 2: FHE.LiftExp

Transciphering for $\mathbb{Z}_p$ from transciphering for $\{0, 1\}$
In this part we detail the transciphering protocol for $\mathbb{Z}_p$ from a transciphering for $\{0, 1\}$, using the example of the FiLIP cipher. First, in Section 4.1 we elaborate on the setup phase, i.e., on the generation of the homomorphic ciphertexts of the symmetric key that will be used in the HHE protocol. Then, in Section 4.2 we specify the different steps of the online phase. We detail the two main algorithms, BinaryTranscipher and $\mathbb{Z}_p$Transcipher. Finally, we prove the correctness and a bound on the error growth for the ciphertexts obtained from these algorithms.

Setup for homomorphic FiLIP
This phase starts with the client generating the secret keys for FiLIP and for the FHE scheme, then encrypting FiLIP's key under the FHE key and sending it to the server. This is called the client's setup and it is shown in Algorithm 3, where we assume that the noise of fresh ciphertexts is sampled from a $\sigma$-subgaussian distribution.
Then, the server expands the FHE encryptions by running a global setup which is independent of the FHE plaintext space $p$. This is shown in detail in Algorithm 4. Moreover, for any given $p$, the server also has to run, only once, a setup step. We call this $p$-Setup and show it in detail in Algorithm 5. It depends on the function $F_d$, which maps a value of the form $b + 2 \cdot w$, with $b \in \{0, 1\}$, to $(b + y) \bmod 2$, where $y = 1$ if $w \geq d$ and $y = 0$ otherwise. In the online phase, we will compute $b$ as the XOR of some bits of FiLIP's secret key and $w$ as the Hamming weight of some other bits; then $F_d(b + 2 \cdot w)$ is applied to those bits. After that, the server is ready to apply the transciphering as many times as needed to transform FiLIP's ciphertexts into FHE ciphertexts with $\mathbb{Z}_p$ as the plaintext space.
To do so, we proceed by generating ciphertexts $c^{(j)}$, each encrypting a polynomial $\mu_j$ whose first coefficient is equal to $2^j \cdot F(k, \mathrm{IV}_j)$. Then, by using FiLIP's ciphertexts $c_j$, we can generate encryptions of $2^j \cdot b_j$ and add them together to obtain an encryption of $m$, as desired. Notice that each $c^{(j)}$ can be computed in parallel. This first step is described in Algorithm 6 and it is similar to the homomorphic evaluation of FiLIP presented in [CDPP22], but the XOR is no longer computed with homomorphic additions and the whole computation carries the power of two and the test vector $T(X)$. In Lemma 2, we prove the correctness of Algorithm 6 and analyze the noise of its output.

Lemma 2. [Correctness and noise analysis of BinaryTranscipher]
$T(X)$ is the test vector defined in $p$-Setup. Let $\Delta := \lfloor Q/p \rceil$. For $i \in [\![0, Z - 1]\!]$, consider the input ciphertexts of Algorithm 6. Then, if $c$ is the output of BinaryTranscipher, it holds that $c \in \mathsf{RingCtxt}_s(\lfloor Q/p \rceil \cdot \mu, E_{out})$, where $\mu \in R_t$ with $\mu_0 = 2^j \cdot F(k, \mathrm{IV})$.

Algorithm 6: BinaryTranscipher
Input: An integer IV, an integer j, and, for i ∈ [[0, Z − 1]], the ciphertexts c i,j and ci,j computed in p−Setup, Ci , Ĉi , and Ci computed in GlobalSetup, and C i generated by ClientSetup.

Proof. Let $k'_0, \ldots, k'_{143}$ be the bits of FiLIP's secret key after taking the subset and applying the permutation and the whitening. Notice that $F(k, \mathrm{IV}) = \mathrm{XOR}(k'_0, \ldots, k'_{80}) + T_{32,63}(k'_{81}, \ldots, k'_{143}) \bmod 2$, as in Definition 3. Since the whitening corresponds to negating the bit when $w_i = 1$, at the end of the first loop of BinaryTranscipher the ciphertexts $c_{i,j}$, $C^{(i)}$, and $\hat{C}^{(i)}$ encrypt the values associated with the whitened bit $k'_i$, the last one being $X^{2 \cdot k'_i}$. Thus, from the correctness of FHE.XOR, at the end of the second for loop, $c$ encrypts the XOR part, carried together with $T(X)$ and the power of two. Finally, each iteration of the last loop adds $2 \cdot k'_i$ to the exponent of $X$ encrypted in $c$. But notice that, since $k'_i \in \{0, 1\}$, the Hamming weight is equal to the sum, where $k'_i$ are the permuted and whitened bits of FiLIP's secret key. Hence, $c$ is a ring ciphertext with $E_{out}$-subgaussian noise, for some $E_{out}$. Now, recall that the test vector $T(X)$ encodes the function $F_d$ from Equation 5; thus, $T(X) \cdot X^k$ results in a polynomial $\mu$ whose constant term is $m_0 = F_d(k)$. Hence, $c$ encrypts $\mu$ such that $\mu_0 = 2^j \cdot F(k, \mathrm{IV})$, as desired.

Now it remains to analyze the noise. Inequality 5 bounds the noise after the XOR loop, and the 63 consecutive external products in the last loop then give the bound on $E_{out}$.

The full transciphering procedure is shown in Algorithm 7. It works by calling BinaryTranscipher $L$ times, then combining the ciphertexts, and finally using key- and modulus-switching procedures to output an integer ciphertext with the right format.
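The combination step can be mirrored on plaintext values. Assuming, as in FiLIP, that each ciphertext bit is $c_j = b_j \oplus F_j$ for a keystream bit $F_j$, the value $2^j \cdot b_j$ is $2^j \cdot F_j$ when $c_j = 0$ and $2^j - 2^j \cdot F_j$ when $c_j = 1$, and the terms are summed modulo $p$. The function below is a hypothetical plaintext analogue of ours, not Algorithm 7 itself.

```python
def compose_from_bits(cipher_bits: list[int], keystream_bits: list[int], p: int) -> int:
    """Return sum_j 2^j * b_j mod p, where cipher_bits[j] = b_j XOR keystream_bits[j]."""
    total = 0
    for j, (c, f) in enumerate(zip(cipher_bits, keystream_bits)):
        term = (1 << j) * f          # scaled keystream bit, as output per bit
        if c == 1:
            term = (1 << j) - term   # 2^j - 2^j * F = 2^j * (1 - F)
        total = (total + term) % p
    return total


# Message 11 = 0b1011 (little-endian bits [1, 1, 0, 1]) under keystream [0, 1, 1, 0].
keystream = [0, 1, 1, 0]
cipher = [b ^ f for b, f in zip([1, 1, 0, 1], keystream)]
message = compose_from_bits(cipher, keystream, p=16)
```

Each term depends on a single index $j$, so the per-bit computations are independent and only the final sum couples them.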

Lemma 3. [Correctness and noise analysis of Z p Transcipher]
Consider the same notation and inputs used in Lemma 2. Let $L := \lceil \log p \rceil$. Assume that the key-switching key ksk has $\sigma_{ksk}$-subgaussian noise for some $\sigma_{ksk}$. The statement is indexed by $j \in [\![0, L - 1]\!]$ and uses $\ell_{ksk} = \log_{B_{ksk}} q$.
Proof. From Lemma 2, we know that the constant term of the message encrypted by $c^{(j)}$ is equal to $2^j \cdot F(k, \mathrm{IV})$. Notice that if $c_j = 0$, then $b_j = F(k, \mathrm{IV}) \in \{0, 1\}$; thus, this constant term is already equal to $2^j \cdot b_j$. If $c_j = 1$, then $b_j = 1 - F(k, \mathrm{IV}) \in \{0, 1\}$, and line 4 turns the constant term into $2^j - 2^j \cdot F(k, \mathrm{IV}) = 2^j \cdot (1 - F(k, \mathrm{IV}))$. Therefore, at the end of the for loop, each $c^{(j)}$ encrypts $2^j \cdot b_j$ in the constant term. It follows that $c$ encrypts $m$ in the constant term.
Hence, from the correctness of FHE.Extract, FHE.ModSwt, and FHE.KeySwt, $c \in \mathsf{IntCtxt}_z(\lfloor q/p \rceil \cdot m, E_{out})$, for some $E_{out}$, as desired. Now it remains to prove the noise bound. Again from Lemma 2, and using the fact that FHE.AddPtxt does not change the noise, at the end of the for loop each $c^{(j)}$ has $E$-subgaussian noise. In line 5, we apply FHE.Add $\log p$ times; then FHE.Extract does not change the noise distribution, and FHE.ModSwt and FHE.KeySwt contribute the remaining terms of the bound.

The results are reported in Table 3. Since the client has to encrypt each bit of FiLIP's secret key $k \in \{0, 1\}^{2^{14}}$ into one gadget ciphertext, the total upload, in bits, is $2^{14} \cdot \ell \cdot N \cdot \log Q$ plus the size of the key-switching key. "On-line phase" shows the running time of executing $\mathbb{Z}_p$Transcipher, and the next column shows this time divided by the number of bits, i.e., $\log p$. We note that the running times are already very low, although we have a non-optimized proof of concept (for example, one could speed it up by using a dedicated FFT library for the cyclotomic rings used in FHE instead of the general library FFTW that we used). We stress that the first loop of our transciphering is composed of $\log p$ independent calls to BinaryTranscipher; therefore, it can be easily parallelized, which should divide the total time, and thus also the amortized time per bit, by almost $\log p$, since the step where the outputs of BinaryTranscipher are combined is very cheap compared to the running time of BinaryTranscipher itself.
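The upload formula can be evaluated directly. In the sketch below, the values chosen for $\ell$, $N$, and $\log Q$ are hypothetical placeholders for illustration and are not the parameters of Table 3; the key-switching key is left out, as in the first term of the formula.

```python
def setup_upload_bits(key_bits: int, ell: int, N: int, log_Q: int) -> int:
    """Size in bits of key_bits gadget ciphertexts of ell ring elements each."""
    return key_bits * ell * N * log_Q


# Hypothetical parameters: ell = 3, N = 1024, log Q = 32.
bits = setup_upload_bits(2 ** 14, ell=3, N=1024, log_Q=32)
mebibytes = bits / 8 / 2 ** 20
```

This cost is paid once per client, whereas every later message is uploaded with the ciphertext expansion of the symmetric cipher only.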

Failure probabilities
As is done in virtually all FHE schemes that use subgaussian noise analysis [DM15, CGGI16, BIP+22], we use the central limit heuristic to model as a Gaussian the final error in the LWE ciphertexts output by $\mathbb{Z}_p$Transcipher. Moreover, based on Lemma 3, we assume the following variance: where $\ell := \log_{B_g}(Q/2)$ and $\ell_{ksk} := \log_{B_{ksk}}(q/2)$, since in practice, before decomposing the values, we can put them in the centered representation, e.g., in $\{-q/2, ..., q/2\}$ instead of in $\{0, ..., q - 1\}$. Also, $B'_g := B_g - 1$ and $B'_{ksk} := B_{ksk} - 1$, since when we decompose integers )), thus, increasing p increases the probability exponentially (this is the case for any FHE scheme). In Table 4, we present all the values $\sigma_{LWE}$ and the corresponding probabilities. We stress that this is the probability that an LWE ciphertext output by $\mathbb{Z}_p$Transcipher does not encrypt the correct value. The failure probability of the programmable bootstrapping executed afterwards is independent of this and can be chosen by setting accordingly the parameters of the FHE scheme used for the computation, which is not necessarily the same scheme and parameters used for the transciphering.
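To make the dependence on p concrete, the sketch below evaluates a Gaussian tail bound of the kind discussed above. It assumes the usual decoding condition, namely that an LWE ciphertext encrypting $\lfloor q/p \rceil \cdot m$ with error $e \sim \mathcal{N}(0, \sigma^2)$ decodes correctly whenever $|e| < q/(2p)$, so the failure probability is $\mathrm{erfc}\big(q / (2 p \sigma \sqrt{2})\big)$. The function name and the numeric values in the example are illustrative assumptions, not the parameters or results of Table 4.

```python
import math


def lwe_failure_probability(log2_sigma: float, log2_q: int, log2_p: int) -> float:
    """Probability that a centered Gaussian error exceeds q / (2p) in absolute value.

    Assumes (hypothetically) that decoding fails exactly when |e| >= q / (2p).
    """
    sigma = 2.0 ** log2_sigma
    threshold = 2.0 ** (log2_q - log2_p - 1)  # q / (2p) for power-of-two q and p
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    # Illustrative values only: q = 2^30 and a fixed sigma = 2^18.
    for log2_p in range(2, 9):
        print(log2_p, lwe_failure_probability(18.0, 30, log2_p))
```

With the other parameters fixed, each extra bit of precision halves the decoding threshold, which is what drives the rapid growth of the failure probability in p.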

General comparisons
Since our transciphering is adapted to so-called third-generation schemes, we compare it with the other transcipherings performed with this type of scheme. Namely, we compare our performance to the transcipherings with FiLIP performed with TFHE in [HMR20, CHMS22] and FINAL in [CDPP22], and with Elisabeth with TFHE in [CHMS22]. For a further extension to other types of schemes which require batched ciphertexts, we discuss the possibility and the limitation in Section 6.1.
Since no previous work discusses plaintext-independent setup, we use this section to compare our solution with the naive strategy one would have to apply to obtain plaintext-independent transciphering. Namely, to obtain transciphering for an arbitrary plaintext space $\mathbb{Z}_p$, one would run a binary transciphering, such as FiLIP, $k := \lceil \log(p) \rceil$ times to obtain LWE ciphertexts of each bit in $\mathbb{Z}_2$, then run a bootstrapping on $\mathbb{Z}_p$ on each of the k ciphertexts to change their message space from modulo 2 to modulo p, and finally combine them into a single ciphertext as we do. Thus, in Table 5, we compare our strategy with this naive one for 3- and 7-bit message spaces. For this, we used the fastest implementation of TFHE bootstrappings that we know of for the required number of bits, and estimate the transciphering time per bit as $t_B + t_s$, where $t_B$ is the time to run a functional bootstrapping modulo p and $t_s$, which is extracted from each paper, is the time of on-line computation required by the server to obtain a homomorphic ciphertext encrypting a single bit. Notice that we are ignoring the composition step where all the k ciphertexts are combined (while the running times corresponding to our results include it).

We use the most efficient transcipherings generating 3rd-generation ciphertexts with $\mathbb{Z}_2$ as message space [HMR20, CHMS22, CDPP22] to benchmark this naive strategy, since it relies on evaluating a binary transciphering k times. That is, we extracted from those papers the value $t_s$ of a single execution of the binary transciphering. We notice that all of them use FiLIP, which is expected, as FiLIP is an FHE-friendly cipher that encrypts bit by bit. We defer the comparison with Elisabeth to the next subsection, since each of its ciphertexts contains 4 bits of information, which makes it comparable with ours for p with 4 bits.
As expected, our results are faster than applying the naive strategy to existing binary transcipherings, especially for larger message precision, as the bootstrapping becomes much more expensive as we increase the message space p.
We note that all the works in Table 5 used parameters yielding 128 bits of security. In particular, [HMR20], [CDPP22], and our work with Set-I used the same polynomial degree N = 1024; however, the works implemented with TFHE used a larger ciphertext modulus, $q = 2^{32}$, than FINAL. To achieve larger message precision, [CHMS22] and our work with Set-II used N = 2048, and [CHMS22] used a larger $q = 2^{64}$, compared to $2^{30}$ in our work. The noise variance of each work is chosen accordingly to achieve the desired security level.
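The cost model used for the naive strategy can be written down directly. The sketch below is a minimal illustration of that estimate (time per bit $t_B + t_s$, latency equal to k times the time per bit, composition step ignored); the function names and the timings in the example are placeholders, not measurements from Table 5.

```python
def bits_needed(p: int) -> int:
    """k = ceil(log2(p)), computed exactly on integers."""
    return max(1, (p - 1).bit_length())


def naive_time_per_bit(t_bootstrap_ms: float, t_binary_ms: float) -> float:
    """Estimated time per bit of the naive strategy: t_B + t_s."""
    return t_bootstrap_ms + t_binary_ms


def naive_latency(t_bootstrap_ms: float, t_binary_ms: float, p: int) -> float:
    """Latency of the naive strategy: k times the time per bit.

    The final composition of the k ciphertexts is ignored, as in the text.
    """
    return bits_needed(p) * naive_time_per_bit(t_bootstrap_ms, t_binary_ms)


if __name__ == "__main__":
    # Placeholder timings (hypothetical): t_B = 10 ms, t_s = 2.5 ms.
    for p in (2 ** 3, 2 ** 7):
        print(p, naive_latency(10.0, 2.5, p))
```

In practice $t_B$ itself grows with p, so the two calls in the example would use different bootstrapping timings.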

Comparison with Elisabeth-4
In [CHMS22], an HHE scheme is presented combining the symmetric cipher Elisabeth-4 with TFHE as the FHE scheme. Since it generates 3rd-generation FHE ciphertexts, more specifically TFHE ciphertexts, and Elisabeth-4 works with plaintexts over $\mathbb{Z}_{16}$, this work is directly comparable with our method when we fix $p = 2^4$.
Their algorithm uses homomorphic additions modulo 16 and evaluations of Negacyclic Look-Up Tables (NLUT) from $\mathbb{Z}_{16}$ to $\mathbb{Z}_{16}$ using the Programmable Bootstrapping (PBS). To  6, where their evaluation corresponds to 75264 external products for the mode with 2 key switchings and for the mode with a single key switching. The comparison of the running time of Elisabeth-4 in [CHMS22] and ours when setting $p = 2^4$ is given in Table 7, from which we can conclude that our transciphering (for 4 bits) is much faster than the one with Elisabeth-4. Compared to their monothreaded computations, our implementation is more than 20 times faster for the different sets of parameters. The latency we obtain is smaller than, but of the same order as, the timings of the multithreaded evaluation of Elisabeth-4. Since most of the operations in our transciphering are performed independently on the 4 bits, we could expect a latency close to the current time per bit with 4 threads. The main reason for the efficiency of our transciphering is that it needs far fewer external products: executing a single PBS requires more external products than transciphering one bit with our method.
On the downside, in our method, each bit of FiLIP's secret key is encrypted into one gadget ciphertext, while in Elisabeth-4, the client sends LWE ciphertexts to the server, which can be compressed with standard techniques. Namely, each LWE ciphertext consists of n + 1 elements of $\mathbb{Z}_q$, but n of them are uniformly distributed and only one of them depends on the secret key. Thus, instead of sending those n + 1 elements, the client can send the seed used to generate the n random elements together with one single element of $\mathbb{Z}_q$, hence drastically reducing the upload. In [CHMS22], it is reported that the client just has to upload 8 KB or 20 KB, depending on the mode, to send the symmetric key encrypted with compressed LWE ciphertexts. However, to evaluate Elisabeth-4, the server also needs the bootstrapping keys, which correspond to more than 12 MB. Thus, depending on whether one considers the bootstrapping keys to be part of the setup step of Elisabeth-4 or not, the client's upload is estimated as a few kilobytes or a few megabytes. In our case, the client's upload ranges from megabytes to one gigabyte. We stress that if the client wants to use applications with different values of the plaintext modulus p, then extra costly conversions of homomorphic ciphertexts and more uploads are needed, since Elisabeth-4 is restricted to $p = 2^4$.
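The upload trade-off can be made concrete with a little arithmetic. The sketch below counts bits for (i) LWE ciphertexts sent in full, (ii) the seed-compressed form described above, and (iii) the gadget-ciphertext upload $2^{14} \cdot \ell \cdot N \cdot \log Q$ of our method (without the key-switching key). All numeric values in the example, including the 128-bit seed and the choice of one seed per ciphertext, are illustrative assumptions and not the parameters of [CHMS22] or of Table 2.

```python
def lwe_upload_bits(num_ciphertexts: int, n: int, log2_q: int) -> int:
    """Uncompressed LWE upload: n + 1 elements of Z_q per ciphertext."""
    return num_ciphertexts * (n + 1) * log2_q


def compressed_lwe_upload_bits(num_ciphertexts: int, log2_q: int,
                               seed_bits: int = 128) -> int:
    """Seed-compressed LWE upload: one seed plus one element of Z_q each.

    One seed per ciphertext is assumed here for simplicity.
    """
    return num_ciphertexts * (seed_bits + log2_q)


def gadget_upload_bits(key_bits: int, ell: int, N: int, log2_Q: int) -> int:
    """Gadget-ciphertext upload: key_bits * ell * N * log Q bits."""
    return key_bits * ell * N * log2_Q


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show orders of magnitude.
    print(lwe_upload_bits(256, 800, 32) // 8, "bytes (LWE, uncompressed)")
    print(compressed_lwe_upload_bits(256, 32) // 8, "bytes (LWE, compressed)")
    print(gadget_upload_bits(2 ** 14, 2, 1024, 20) // 8, "bytes (gadget ciphertexts)")
```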

Further comparisons with transcipherings using TFHE
Recently, two new works proposing a transciphering with TFHE have been presented at WAHC 2023.
In [BOS23], the authors evaluate the standardized cipher Trivium, which has a security claim of 80 bits, and its (non-standardized) variant Kreyvium, with a security claim of 128 bits, introduced in [CCF+16]. We can compare the timings obtained for the evaluation of Kreyvium with the ones of Table 5, since the output consists of 64 TFHE ciphertexts, each encrypting one bit. After a warm-up phase of 2883 ms, their optimized transciphering produces 64 encrypted bits in 150 ms using 128 virtual CPUs (Table 2 in [BOS23]).
In [TCBS23], the authors evaluate AES using TFHE and the programmable bootstrapping, with a representation in basis 16. Since the 128-bit blocks of AES are obtained as 32 ciphertexts of 4 bits each, we can compare the timings they obtain with the ones with $p = 2^4$ of Table 7. The best timing is obtained with 16 threads, resulting in 28.73 s for 128 bits.
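For a rough like-for-like reading of these two figures, they can be amortized per output unit. The arithmetic below uses only the numbers quoted above; the warm-up phase of [BOS23] is excluded, and the thread counts differ between the two works, so this is an order-of-magnitude indication rather than a benchmark.

```latex
\begin{align*}
\text{[BOS23], Kreyvium (128 virtual CPUs):}\quad
  & \frac{150\ \text{ms}}{64\ \text{bits}} \approx 2.34\ \text{ms per bit},\\
\text{[TCBS23], AES (16 threads):}\quad
  & \frac{28.73\ \text{s}}{128\ \text{bits}} \approx 224\ \text{ms per bit},
  \qquad
  \frac{28.73\ \text{s}}{32} \approx 0.90\ \text{s per 4-bit ciphertext}.
\end{align*}
```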

Comparison for neural networks evaluation
Our method is particularly well suited to neural network evaluation. The versatility of our non-binary transciphering allows the precision of the plaintexts to be adapted, which fits well with Convolutional Neural Networks (CNN) working on quantized data. For example, the transciphering of [CHMS22] is followed by a CNN evaluating the classification of Fashion MNIST pictures homomorphically. The Fashion MNIST picture database consists of images of 784 gray pixels, each carrying 8 bits of information. For a faster evaluation (taking advantage of the 4-bit PBS implemented in Concrete), the evaluation of [CHMS22] restricts the gray-scale to only 3 bits of information and homomorphically evaluates the quantized CNN. The advantage of our approach is that, from the same encrypted client data, the server can choose to evaluate either a cheap CNN with data quantized over t bits, with potentially lower accuracy, or a more costly CNN with data quantized over t′ > t bits, with higher accuracy, depending on the computing environment. The CNN choice does not require the client to re-encrypt data, and choosing the precision after the client's query allows the server to adapt the cost and the precision to each functionality requested by the client.
For the particular CNN used in [CHMS22], the transciphering is considered with parameters already compatible with the PBS used for the CNN, rather than the optimal ones we recalled in Table 6. Moreover, only 3 of the 4 bits of each plaintext of Elisabeth-4 are used in the ciphertext with plaintext space $\mathbb{Z}_{16}$, since the CNN takes bootstrapped LWE ciphertexts that allow PBS on 3 bits of data, one bit being used for padding. In [CHMS22], the transciphering with Elisabeth-4 and the homomorphic inference take 427 seconds, compared to 6 seconds without the transciphering. Using the optimal parameters for Elisabeth-4 evaluation recalled in Table 6 reduces the total time to 77 seconds, without considering a key switching before entering the CNN. If we use our technique, which outputs LWE ciphertexts encrypting 3-bit integers, in the same setting, we would expect the total running time to be reduced to 15 seconds (with parameter Set-I) and 43 seconds (with parameter Set-II).
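The server-side choice of precision can be illustrated on cleartext values. In the sketch below, an 8-bit pixel is split into bits once (as the client would do before bit-by-bit encryption), and the server then recombines only the t most significant bits into an element of $\mathbb{Z}_{2^t}$. The function names are hypothetical and no encryption is performed; in the actual protocol the recombination happens homomorphically.

```python
def pixel_to_bits(pixel: int, width: int = 8) -> list:
    """Client side: split a pixel into `width` bits, least significant first."""
    return [(pixel >> j) & 1 for j in range(width)]


def recombine_upper_bits(bits: list, t: int) -> int:
    """Server side: keep the t most significant bits and rebuild an integer mod 2^t."""
    upper = bits[len(bits) - t:]
    return sum(b << j for j, b in enumerate(upper))


if __name__ == "__main__":
    bits = pixel_to_bits(182)  # sent once by the client
    # The server picks the precision per application without a new upload.
    for t in (3, 4, 8):
        print(t, recombine_upper_bits(bits, t))
```

Keeping the upper t bits is the same as an integer right shift by 8 − t, so a 3-bit CNN and an 8-bit CNN can both be served from the same encrypted bits.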

Discussion
Our idea of homomorphically composing elements of {0, 1} into an integer can be naturally extended to any FHE scheme which uses batching methods [BGV12, FV12, CKKS17]. The naive approach for the server is to run our transciphering per coefficient and then homomorphically move each coefficient message into the corresponding slot by computing a linear transformation [HS21]. However, the complexity of this process is O(N), where N is the number of slots, so it would require further optimizations for practical use.
Additionally, one might argue that the same property for general message precision can be achieved by functional bootstrapping [CJP21]. However, our approach is much cheaper than running one bootstrapping, since our transciphering only requires 144 external products (using log p threads), whereas one bootstrapping requires at least 630 and up to around 900 external products, depending on the desired precision.

Conclusion
In this article, we have presented a new transciphering method which, for the first time, can be used for any message precision. In other words, the client does not need to set the message precision before sending its data to the server. Therefore, the server can reuse the given data for several application algorithms by taking only the necessary upper bits of the data, depending on the target application, without running different setups with the client. This approach gives more freedom to clients in cloud-based services in terms of parameter setting and communication with the server. Hence, a service provider can offer a more user-friendly environment to its clients.

Figure 1: Main pieces of the first step of our homomorphic decryption, where we transform a FiLIP ciphertext $c_j$ encrypting a bit $b_j$ with initialization vector IV and secret key sk into an FHE encryption of $2^j \cdot b_j$. The red boxes represent values encrypted with FHE.
we denote by Var(a) (resp. Var(x)) the maximum variance of each coefficient (resp. component) of a (resp. x). The variance of the product of two polynomials $a, b \in R$ is $\mathrm{Var}(a \cdot b) = n \cdot \mathrm{Var}(a) \cdot \mathrm{Var}(b)$. Similarly, Var(X) denotes the maximum variance of each column of X.

Table 2: The parameters of NTRU gadget ciphertexts, the decomposition base of the NTRU-to-LWE key-switching, and the parameters of the LWE ciphertexts. An upper bound on $\sigma_{LWE}$ is presented in Lemma 3.

Table 3: Running times and upload depending on different parameter sets.

Table 4: Failure probability of the output of $\mathbb{Z}_p$Transcipher. Columns: p, $\log(\sigma_{LWE})$, upper bound on failure probability.

we actually obtain values less than or equal to B − 1. And since we used ternary keys for the NTRU secret, the value $\|s\|_2$ was replaced by $\sqrt{N}$. Notice that $\sigma_{LWE}$ grows very slowly as we increase p (assuming other parameters fixed), as it is just proportional to $\sqrt{\log p}$. However, the failure probability is computed as 1

Table 5: Comparison of running time (in milliseconds) of transcipherings with FiLIP. For previous works, we present the latency as $t_0, t_1$, corresponding to the latency for plaintext space $p = 2^3$ and the latency for $p = 2^7$, respectively. The time per bit is presented in the same way. The latency is just the time per bit multiplied by k. All the timings correspond to monothreaded computations.

Table 6: TFHE parameters for Elisabeth-4.

is the size of the LWE key and 203 LWE ciphertext additions. The error constraints to enter the PBS require the use of key switching during the evaluations; therefore, the authors present two evaluations with different sets of parameters, with one or two key switchings. The parameters for these two modes are shown in Table

Table 7: Comparison between the evaluation of Elisabeth-4 with TFHE from [CHMS22] and FiLIP recombining 4 bits with our transciphering. Multithreaded versions of Elisabeth-4 were probably executed on 12, 48, or 64 threads, but this information is not explicitly written in [CHMS22].