Monday, October 17, 2016

Supersingular isogeny Diffie-Hellman

Supersingular isogeny Diffie-Hellman (SIDH) is a relatively recent proposal for a post-quantum secure key-exchange method which I will briefly outline here.

Most post-quantum cryptography is based on lattices, codes, multivariate quadratics or hashes. See Gustavo's post for more.

A fifth category seems to be slowly establishing itself: isogeny-based crypto.

These schemes are based on the difficulty of finding an isogeny between two supersingular elliptic curves. Isogenies are particular rational maps between elliptic curves that are also group homomorphisms with respect to the group of points on the curves.

The original proposal [Stolbunov et al., 2006] was to use the problem of finding isogenies between ordinary elliptic curves but this system was shown to be breakable with a quantum computer [Childs et al., 2010]. After that it was proposed to use supersingular elliptic curves instead [De Feo et al., 2011].

SIDH currently has significantly worse performance than lattice-based key-exchange schemes but offers much smaller key sizes. Compared to Frodo it is about 15 times slower, but the keys are only one twentieth of the size. Compared to NewHope it is over 100 times slower at less than one third of the key size. This can be relevant in scenarios like IoT, where cryptographic computations require orders of magnitude less energy than actually transmitting data via the radio.

Although finding isogenies between curves is difficult, Vélu's formulas allow calculating an isogeny with a given finite subgroup as its kernel. All isogenies with the same kernel are identical up to isomorphism.

Now, starting with a public curve \(E\) that is a system parameter, we have both parties, Alice and Bob, generate isogenies with kernels \(\langle r_a \rangle, \langle r_b\rangle\) respectively. For now, let \(r_a, r_b\) be any generators of a subgroup. This gives us two isogenies
$$ \phi_a: E \rightarrow E_a\\
\phi_b: E \rightarrow E_b.$$
Now we would like to exchange $E_a, E_b$ between the two parties and somehow derive a common $E_{ab}$ using the kernels we have used. Unfortunately, $r_a$ does not even lie on $E_b$, so we have a problem.

The solution that was proposed by De Feo et al. is to use 4 more points $P_a, P_b, Q_a, Q_b$ on $E$ as public parameters, two for each party. This allows constructing
$$r_a = m_aP_a + n_aQ_a\\
r_b = m_bP_b + n_bQ_b$$ using random integers $m_a, n_a, m_b, n_b$ appropriate for the order.
Now, after calculating the isogenies $\phi_a, \phi_b$ the parties not only exchange the curves $E_a, E_b$ but also $\phi_a(P_b), \phi_a(Q_b)$ and $\phi_b(P_a), \phi_b(Q_a)$.
Looking at Alice's side, she can now calculate
$$m_a\phi_b(P_a)+n_a\phi_b(Q_a) = \phi_b(m_aP_a + n_aQ_a) = \phi_b(r_a)$$ and Bob can perform the analogous computation. Constructing another isogeny using $\langle \phi_b(r_a) \rangle$ and $\langle \phi_a(r_b) \rangle$ respectively gives Alice and Bob two curves $E_{ba}, E_{ab}$ which are isomorphic, so their common j-invariant can be used as a shared secret.
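Spelling out Bob's side for completeness: he computes $m_b\phi_a(P_b)+n_b\phi_a(Q_b) = \phi_a(r_b)$, and both parties end up on curves that are quotients by the same subgroup,
$$E_{ba} = E_b/\langle \phi_b(r_a) \rangle \;\cong\; E/\langle r_a, r_b \rangle \;\cong\; E_a/\langle \phi_a(r_b) \rangle = E_{ab},$$
so $j(E_{ba}) = j(E_{ab})$ is the value both sides can compute.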

I will leave you with this wonderfully formula-laden image from the De Feo et al. paper showing the protocol.

Monday, October 10, 2016

Quantum computation, algorithms and some walks.. pt.1

During the last meeting of the ECRYPT-NET project in Leuven, I gave an introduction to quantum algorithms. In the first part I explained quantum computation, since it is natural to start from that base, and in this post I will explain it in more detail. First, we need to understand the concept of superposition in quantum physics. To do this, we go back to Schrödinger's idea of a cat that may be simultaneously both alive and dead. This phenomenon is called superposition in quantum physics, and quantum computers take advantage of it. If you are interested in how a quantum processor is physically implemented, you can have a look at [2].

Well, computers aren't made with cats inside; they use bits and registers to work. So, how is it possible to compute with a quantum computer? How do we represent bits? How do we take advantage of the superposition?

Quantum Notation

First, let's learn about the Dirac notation, which is commonly used in quantum physics and in most of the literature about quantum computers. Since Blogger doesn't allow me to create mathematical equations, I will get some help from the physciense blog and pick some images from it:
The Dirac notation can appear in slightly different forms. If you want to read up on it, I selected some nice lecture notes in the following links: Lecture 1 and Lecture 2.

So, the bra-ket notation is just vectors, and we are going to use it to represent our qubit (yes, that is the name for the bit of a quantum computer). In classical computers we represent a bit with 0 and 1, but in quantum computers it is a little bit different. We can represent the states as follows:

As the image shows, a qubit is 0, 1 or a superposition of 0 and 1. However, if we measure, i.e. observe, the qubit, we lose the superposition. In other words, our state collapses and we cannot take advantage of the superposition anymore. In the same way that classical computers have gates, quantum computers also have gates. One very famous gate is the Hadamard gate, which has the property of putting a qubit into a superposition state. We can see the action of this gate in the following image:
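In formulas, using the bra-ket notation from above, the two basis states, a general qubit and the Hadamard gate look as follows:
$$|0\rangle = \begin{pmatrix}1\\0\end{pmatrix}, \qquad |1\rangle = \begin{pmatrix}0\\1\end{pmatrix}, \qquad |\psi\rangle = \alpha|0\rangle + \beta|1\rangle \text{ with } |\alpha|^2+|\beta|^2=1,$$
$$H = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\1 & -1\end{pmatrix}, \qquad H|0\rangle = \frac{|0\rangle+|1\rangle}{\sqrt{2}}, \qquad H|1\rangle = \frac{|0\rangle-|1\rangle}{\sqrt{2}}.$$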

Quantum Algorithms

Now we know what a qubit is and how we can operate on it, so we can move to the next step and create some algorithms to solve problems. The most common and very well-known example comes from Deutsch and Jozsa. It is known as the Deutsch-Jozsa problem and consists of:

  • Input: f: {0,1}^n to {0,1}, either constant or balanced
  • Output:   0 iff the function f is constant
  • Constraints: f is a black-box
On a classical computer we can represent this problem as follows:
If we solve this problem with a quantum computer, we make exactly one query. The function f is implemented as a black box in the quantum computer, and it looks like this:
However, this implementation alone is not enough to solve the problem. To solve it, we build a quantum circuit around it using Hadamard gates; in fact, the Hadamard gates are what let us take advantage of the superposition. We end up with this circuit:

Now we have a circuit that determines whether a function is constant or balanced. Nevertheless, how do we compute this circuit? How does it work? To answer these questions, we work out the state of the qubit after each gate, ending with the final measurement. Suppose that we have just one qubit, start it in state 0 and pass it through the first gate. We obtain this:
After this, we can see that we have put our qubit into a superposition state. Now we go to our function and call our black box. The result can be seen as:
In the third step, we use interference to "reconstruct" the qubit. In the last step, we compute the last gate and see what is "expected" when we measure:
In the end we have the final state: if the function is constant, the measurement yields the qubit in 0; if the function is balanced, it yields the qubit in 1. The algorithm can be generalized and works for a system with n qubits. More information can be found in [1]. Also, you can check this class.
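To make the walk-through concrete, here is a small NumPy simulation of the single-qubit case (Deutsch's algorithm, i.e. n = 1): it builds the black box as a unitary, applies the Hadamard gates exactly as described above, and a single query suffices to read off 0 for constant functions and 1 for balanced ones. This is only a sketch of the idea, not how a real quantum computer is programmed.

import numpy as np

# Single-qubit gates and basis states
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I = np.eye(2)
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

def oracle(f):
    # The black box as a unitary: U_f |x>|y> = |x>|y XOR f(x)>
    U = np.zeros((4, 4))
    for x in (0, 1):
        for y in (0, 1):
            U[2 * x + (y ^ f(x)), 2 * x + y] = 1
    return U

def deutsch(f):
    state = np.kron(ket0, ket1)         # start in |0>|1>
    state = np.kron(H, H) @ state       # Hadamard on both qubits: superposition
    state = oracle(f) @ state           # a single query to the black box
    state = np.kron(H, I) @ state       # Hadamard on the first qubit: interference
    p1 = state[2] ** 2 + state[3] ** 2  # probability of measuring 1 on the first qubit
    return 0 if p1 < 0.5 else 1         # 0 means constant, 1 means balanced

# Two constant and two balanced functions on one bit
for f in (lambda x: 0, lambda x: 1, lambda x: x, lambda x: 1 - x):
    print(deutsch(f))                   # prints 0, 0, 1, 1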

In my next blog entry, I will talk about Shor's and Grover's algorithms and random quantum walks. We have only seen the beginning of quantum algorithms here; if you want more information, you can check the slides from my presentation in Leuven on my website.

[1] David Deutsch and Richard Jozsa (1992). "Rapid solutions of problems by quantum computation". Proceedings of the Royal Society of London A.
[2] Gambetta, Jay M., Jerry M. Chow, and Matthias Steffen. "Building logical qubits in a superconducting quantum computing system." arXiv preprint arXiv:1510.04375 (2015).

Tuesday, September 6, 2016

Requirements and security goals for the Cloud

This is a two-part blog post giving an overview on the requirements and security goals for the Internet of Things (IoT) and the Cloud.

But first ... I'd like to mention that the ECRYPT-NET Summer School on cryptography for the cloud in Leuven is coming up soon.
This blog post can be seen as an intro to cryptology in the cloud, so there are some topics we will certainly hear about in depth at the event. I'm looking forward to a good discourse during the scheduled talks and to the discussions afterwards.

Part I: Requirements and security goals for the Cloud

The cloud here is loosely defined as computing and storage resources out-sourced to servers located off-site that are accessible on demand via the Internet.

Overview of cloud computing services to be secured
(Image by Wikimedia user Sam Johnston)
The usage of cloud services should thus be viewed as a potential threat: inherently critical data is released from personal control and uploaded, which immediately raises several security as well as privacy issues.

Requirements and the corresponding cryptologic counter-measures are defined for various use-cases (such as those depicted in the image) and briefly explained. It is important to realize that this needs to be done without deliberately weakening (or "back-dooring") solutions, as is sometimes suggested. To ensure democratic use of the developed technologies it is vital to see that the difference between a "front door" and a "back door" is merely one's viewpoint. Intentionally implementing two entrances makes an attacker happy, not the legitimate user.

Transparent services, meaning the use of the cloud as if it were on-site, should ideally be built upon a back-end with verifiable, clear, possibly standardized cryptologic concepts and open code for public scrutiny.

To capture real-world cloud settings one has to view them from different perspectives: first that of a private entity (user), then that of an organization (such as a company) and lastly the global perspective of, say, a government.

An interesting starting point for the threats we have to consider is the Cloud Security Alliance's 2016 report on cloud security, naming twelve critical issues ranked by severity of real-world impact.

Data breaches are followed directly by weak identity, credential and access management and insecure Application Programming Interfaces (APIs) in this report --- issues that can be addressed by assessing requirements and tailoring cryptographic solutions from the beginning and by deploying state-of-the-art implementations instead of sticking to legacy code.

Roughly four categories of requirements for the usages of the cloud can be distinguished:
  • Computations in the Cloud
  • Sharing Data in the Cloud
  • Information Retrieval from the Cloud
  • Privacy Preservation in the Cloud
The approaches to securing these areas are manifold. Generally speaking, data minimization is a fundamental paradigm for efficient interaction with the cloud, while security-by-design --- considering best practices from the beginning --- is enabled by encapsulating the scope of each method.
The following, briefly explained, concepts tackle concrete use-cases for end-users, companies and e-Government tasks alike:
  • Order-Preserving Encryption (OPE) allows efficient range queries, but does it diminish security too much?
  • Format-Preserving Encryption (FPE) offers in-place encryption in legacy databases.
  • Data De-Duplication (DDD) enhances servers' back-end performance.
  • Secret Sharing Schemes (SSS) solve back-up and availability issues by distributing encrypted parts of the whole (see the sketch after this list).
  • Malleable Signature Schemes (MSS) provide flexible authentication for documents and derived data.
  • Private Information Retrieval (PIR) allows privately accessing elements in a database.
  • Provable Data Possession (PDP) ensures that the cloud is storing the (uncorrupted) files.
  • Multi-Party Computation (MPC) allows secure cooperation across the Internet.
  • Verifiable Computation (VC) builds trust in results of a delegated computation.
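To make one of these building blocks tangible, here is a minimal Python sketch of Shamir's k-out-of-n secret sharing over a prime field, one classical instantiation of the SSS idea referenced above; it is a toy illustration with a fixed prime, not a production implementation.

import secrets

P = 2**127 - 1  # a Mersenne prime, large enough for a toy 16-byte secret

def share(secret, k, n):
    # Split `secret` into n shares so that any k of them reconstruct it.
    # Share i is the point (i, f(i)) on a random degree-(k-1) polynomial f with f(0) = secret.
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(k - 1)]
    return [(i, sum(c * pow(i, j, P) for j, c in enumerate(coeffs)) % P)
            for i in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the secret from any k shares.
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = share(secret=123456789, k=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
assert reconstruct(shares[2:]) == 123456789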
We can observe that the cryptology community is striving for efficient cryptographic primitives that serve as building blocks for implementations. Naturally, researchers base their constructions on well-established security assumptions and derive clear security proofs to offer, e.g., long-term confidentiality. This is not least important because of the emergence of future quantum computers with powerful quantum algorithms, a threat that needs to be addressed today, before the transition of current solutions to the cloud, where the constructions would be exposed and possibly vulnerable to a quantum attacker.

Let's hope this trend continues and ultimately leads to reliable, standardized primitives for real-world applications.

Stay tuned for Part II: Requirements and security goals for the IoT...

Tuesday, August 23, 2016


The morning of the second day at CRYPTO 2016 started with a track on “Compromised Systems”, consisting of three talks covering different scenarios and attacks usually disregarded in the vast majority of the cryptographic literature. They all shared a concern about mass surveillance as well.

Suppose Alice wishes to send a message to Bob privately over an untrusted channel. Cryptographers have worked hard on this scenario, developing a whole range of tools with different notions of security and setup assumptions; one of the most common axioms is that Alice has access to a trusted computer with a proper implementation of the cryptographic protocol she wants to run. The harsh truth is that this is a naïve assumption. The Snowden revelations show us that powerful adversaries can and will corrupt users' machines via extraordinary means, such as subverting cryptographic standards, intercepting and tampering with hardware on its way to users, or using Tailored Access Operations units.

Nevertheless, the relevance of these talks was not just a matter of a “trending topic” or of distrust of the authoritarian and unaccountable practices of intelligence agencies. More frequently than we would like, presumably accidental vulnerabilities (such as POODLE, Heartbleed, etc.) are found in popular cryptographic software, leaving the end user unprotected even when using honest implementations. In the meantime, as Paul Kocher reminded us in his invited talk the day after, for most of our community it passes without notice that, when we design our primitives and protocols, we blindly rely on a mathematical model of reality that sometimes has little to do with it.

In the same way that people from the CHES community have become more aware (mainly also the hard way) that relying on the wrong assumptions leads to a false confidence in the security of deployed systems and devices, I think those of us not that close to hardware should also try to step back and look at how realistic our assumptions are. This includes, as these talks addressed in different ways, starting to assume that some standards might be compromised at some point (and most systems will be), and understanding what can still be done in those cases.

What would a cryptography that worries not only about prevention, but also about the whole security cycle, look like? How can the cryptography and information security communities come closer together?

Message Transmission with Reverse Firewalls— Secure Communication on Corrupted Machines

The reverse firewalls framework was recently introduced by Mironov and Stephens-Davidowitz in a paper that has already been discussed in our group’s seminars and on this same blog. A secure reverse firewall is a third party that “sits between Alice and the outside world” and modifies her sent and received messages so that even if her machine has been corrupted, Alice’s security is still guaranteed.

Their elegant construction does not require the users to place any additional trust in the firewall, and relies on the underlying cryptographic schemes being rerandomizable. With this threat model and these rerandomization capabilities, they describe impossibility results as well as concrete constructions.

For example, in the context of semantically secure public-key encryption, in order to provide a reverse firewall for Bob, the scheme must allow a third party to rerandomize a public key and to map ciphertexts under the rerandomized public key to ciphertexts under the original public key. In the same context, Alice’s reverse firewall must be able to rerandomize the ciphertext she sends to Bob, in such a way that Dec(Rerand(Enc(m))) = m.
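To get a feeling for what rerandomization means, here is a toy Python sketch using textbook ElGamal, a standard example of a rerandomizable public-key scheme; the paper treats the general framework, and the tiny parameters below are only there to make the example runnable.

import secrets

# Toy parameters: p = 2q + 1 is a safe prime and g generates the subgroup of order q.
p, q = 467, 233
g = 4

def keygen():
    x = secrets.randbelow(q - 1) + 1        # secret key
    return x, pow(g, x, p)                  # (sk, pk)

def enc(pk, m):
    r = secrets.randbelow(q - 1) + 1
    return pow(g, r, p), m * pow(pk, r, p) % p

def rerand(pk, c):
    # Multiply by a fresh encryption of 1: fresh randomness, same plaintext.
    s = secrets.randbelow(q - 1) + 1
    c1, c2 = c
    return c1 * pow(g, s, p) % p, c2 * pow(pk, s, p) % p

def dec(sk, c):
    c1, c2 = c
    return c2 * pow(c1, q - sk, p) % p      # c1^(q-sk) = c1^(-sk) since c1 has order dividing q

sk, pk = keygen()
m = pow(g, 42, p)                           # a toy message inside the subgroup
assert dec(sk, rerand(pk, enc(pk, m))) == m # Dec(Rerand(Enc(m))) = m

Multiplying a ciphertext by a fresh encryption of 1 changes its randomness but not its plaintext, which is exactly the kind of behaviour a reverse firewall needs.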

Big-Key Symmetric Encryption: Resisting Key Exfiltration

The threat addressed in Bellare’s talk is that of malware that aims to exfiltrate a user's key, likely using her system’s network connection. In their work, they design schemes that aim to protect against this kind of Advanced Persistent Threat by making secret keys so big that their undetected exfiltration by the adversary is difficult, while keeping the user’s overhead almost exclusively in terms of storage instead of speed.

Their main result is a subkey prediction lemma, which gives a nice bound on an adversary’s ability to guess a subkey of modest length, derived by randomly selecting bits of a big key from which partial information has already been leaked. This approach, known as the Bounded Retrieval Model, has been, in the words of the authors, largely a theoretical area of research, whereas they give a fully concrete security analysis with good numerical bounds, constants included.
Other highlighted aspects of their paper were the concrete improvements over [ADW09] and the key encapsulation technique, carefully based on different security assumptions (random oracle, standard model).

Backdoors in Pseudorandom Number Generators: Possibility and Impossibility Results

The last talk of the session focused on the concrete problem of backdoored Pseudorandom Number Generators (PRGs) and PRNGs with input, which are fundamental building blocks in cryptographic protocols and have already been successfully compromised, as we learnt when the DUAL_EC_DRBG scandal came to light.
In their paper, the authors revisit a previous abstraction of backdoored PRGs [DGA+15], which modeled the adversary (Big Brother) with weaker powers than it could actually have. By giving concrete backdoored PRG constructions, they show how that model fails. Moreover, they also study robust PRNGs with input, for which they show that Big Brother is still able to predict the values of the PRNG state backwards; they also give bounds on the number of previous states it can compromise, depending on the state size of the generator.

[ADW09] J. Alwen, Y. Dodis, and D. Wichs. Leakage-resilient public-key cryptography in the bounded-retrieval model. In S. Halevi, editor, CRYPTO 2009, volume 5677 of LNCS, pages 36–54. Springer, Heidelberg, Aug. 2009.

[DGA+15] Yevgeniy Dodis, Chaya Ganesh, Alexander Golovnev, Ari Juels, and Thomas Ristenpart. A formal treatment of backdoored pseudorandom generators. In Elisabeth Oswald and Marc Fischlin, editors, EUROCRYPT 2015, Part I, volume 9056 of LNCS, pages 101–126, Sofia, Bulgaria, April 26–30, 2015. Springer, Heidelberg, Germany.

Saturday, July 30, 2016


From the 17th to the 22nd of July the northernmost crypto workshop ever organized took place in Longyearbyen, Svalbard, at latitude 78.13N. That is only about 1300 kilometers from the North Pole.

Nordenskiöld glacier
Three ECRYPT-NET fellows (Marie-Sarah, Matthias and Ralph) had the opportunity to join the fascinating event.

The program was a good mixture of talks from invited speakers and talks from researchers who had submitted a paper. The topics ranged from symmetric cryptography to fully homomorphic encryption as well as digital signatures and side-channel attacks. The full program can be found here:

One of the most interesting talks on Monday was given by Eik List: POEx: A Beyond-Birthday-Bound-Secure On-Line Cipher. In his talk he presented POEx, which achieves beyond-birthday-bound security with one call to a tweakable blockcipher and one call to a 2n-bit universal hash function per message block. He then showed a security proof and gave possible instantiations.

In the night from Monday to Tuesday there were two fascinating talks at midnight, under the midnight sun. The first one was given by Ron Rivest on Symmetric Encryption based on Keyrings and Error Correction. The second talk was by Adi Shamir: How Can Drunk Cryptographers Locate Polar Bears.

Adi Shamir explaining how drunk cryptographers can locate polar bears.

On Tuesday, Joan Daemen gave his invited talk on Generic security of full-state keyed duplex. In his talk he briefly explained sponge constructions and how they can be used for authenticated encryption. Afterwards, he explained how to achieve beyond-birthday-bound security using sponges. In the end, he showed a new core for sponges, the (full-state) keyed duplex construction.

On Wednesday a full day of sightseeing was planned, with a boat trip to the "ghost town" Pyramiden. We started our boat trip in Longyearbyen, where we saw a minke whale. The captain of the ship told us that he had also seen a blue whale a few days earlier. After a while we approached the bird cliffs, where many seagulls and puffins were nesting. Birds are very important for the ecosystem of Svalbard, as they carry nutrients from the water to the mainland. From there we continued our journey to the Nordenskiöld glacier, a huge glacier with bluish shining ice. After a whiskey with glacier ice, we continued to our final destination. The "ghost town" Pyramiden was a Russian settlement and coal-mining community that was closed in 1998 and has been a tourist attraction since 2007.
Polar bear
Polar bear warning sign
On Thursday, Marie-Sarah presented her paper Security of BLS and BGLS signatures in a multi-user setting, which is related to her master's thesis at the University of Waterloo. Marie-Sarah was inspired to revisit it after reading about how the tightness of security reductions for the Schnorr signature scheme played a role in the CFRG's selection of an elliptic-curve-based signature scheme for standardization. The standard notion of existential unforgeability under chosen-message attacks is in the single-user setting, since the adversary is given one public key to target.
In the more realistic multi-user setting, the attacker gets all users' public keys and can choose which one to attack. In her paper, she first analysed the BLS (Boneh-Lynn-Shacham) signature scheme's security in a manner similar to what was done for Schnorr: is key-prefixing necessary to maintain unforgeability of signatures in a multi-user setting? Next, she analysed the multi-user security of the aggregate signature scheme BGLS (Boneh-Gentry-Lynn-Shacham). She proposed a multi-user security notion for aggregate signatures analogous to the one for normal (non-aggregate) signatures, then analysed BGLS's security in this model.

On Friday, Gregor Leander presented Structural Attacks on Block Ciphers, where he covered invariant subspace attacks. Furthermore, he introduced an improved technique called non-linear invariant attacks.

Matthias, Marie-Sarah and Ralph
The artist of this sign never saw a polar bear before in his life. Therefore, they told him to paint a big white dog.
Santa Claus mailbox

Wednesday, July 13, 2016

Crypto events in Île-de-France

The sunny weather and the general holiday feeling did not stop crypto-enthusiasts around Paris from meeting and discussing the advancements in this topic. On one hand, the Paris Crypto Day brought together people working on different aspects of cryptography who are based in the Paris area. The last such meeting was organized by ENS on 30.06.2016 and was fortunate to have Anne Canteaut (INRIA Paris), Leo Reyzin (BU), Victor Shoup (NYU) and Rafael Pass (Cornell) speaking about their research. On July 5-7, Paris also hosted a workshop organized within the HEAT (Homomorphic Encryption Applications and Technology) programme. It was held at Université Pierre et Marie Curie (a.k.a. Paris 6) and consisted of six invited talks given by well-known researchers within the homomorphic encryption community and ten "regular" talks given by younger researchers and students.

Paris Crypto Day

The first presentation was given by Anne Canteaut on Algebraic Distinguishers against Symmetric Primitives. The talk presented a unified view of the notion of cube distinguishers and the more recently introduced division property. The aforementioned attacks are based on Knudsen's higher-order differential attacks, which exploit properties of the polynomial representation of the cipher. The presentation was much appreciated by both symmetric and asymmetric cryptographers.

Victor Shoup gave a talk about hash proof systems1, in which he reviewed definitions, constructions and applications. Hash proof systems can be seen as a family of keyed hash functions $H_{sk}$ associated to a language $L$ defined over a domain $D$. The secret hashing key $sk$ is used to compute a hash value for every input $x \in D$. Magically, there is a second way to compute the same hash value: it uses a projection key $pk$ (derived from $sk$) and a witness $w$ for $x \in L$. The original definition of hash proof systems requires that the projection key does not depend on the word $x$; later variants of smooth projective hash functions relax this. Smooth projective hash functions have found applications, among others, in password authenticated key exchange.
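As a toy illustration of the two evaluation modes, here is a Python sketch of the classical DDH-based hash proof system for the language of Diffie-Hellman tuples $(g_1^w, g_2^w)$; the parameters are deliberately tiny and purely illustrative.

import secrets

# Toy DDH group: p = 2q + 1 is a safe prime; g1, g2 generate the subgroup of order q.
p, q = 467, 233
g1 = 4
g2 = pow(g1, 7, p)   # in a real instantiation the discrete log of g2 w.r.t. g1 must be unknown

# Secret hashing key sk = (x1, x2) and public projection key pk
x1, x2 = secrets.randbelow(q), secrets.randbelow(q)
pk = pow(g1, x1, p) * pow(g2, x2, p) % p

def hash_with_sk(u1, u2):
    # Whoever holds sk can hash any word (u1, u2) in the domain.
    return pow(u1, x1, p) * pow(u2, x2, p) % p

def hash_with_witness(w):
    # For words (g1^w, g2^w) in the language L, the projection key plus the witness w suffice.
    return pow(pk, w, p)

w = secrets.randbelow(q)
u1, u2 = pow(g1, w, p), pow(g2, w, p)   # a word in L with witness w
assert hash_with_sk(u1, u2) == hash_with_witness(w)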

Leo Reyzin from Boston University presented an analysis of SCrypt (joint work with Joel Alwen, Jeremiah Blocki, and Krzysztof Pietrzak). SCrypt was originally introduced by Colin Percival in 2009 for Tarsnap, and its potential applications include the realization of time-lock puzzles from memory-hard problems. The starting point for their work was the key derivation function in SCrypt. As stated by Leo during the talk, SCrypt is defined as the result of $n$ steps, where each step consists of selecting one of two previously computed values (the selection depends on the values themselves) and hashing them. It is conjectured that this function is memory-hard. The new result shows that in the Parallel Random Oracle Model, SCrypt is maximally memory-hard. One metric used is the product of time and memory used during the execution of SCrypt, for which the authors show a $\Theta(n^2)$ bound. Interestingly, for a non-constant amount of memory used during the computation (a scenario that simulates real applications), a more accurate metric, defined by the sum of memory usage over time, is again proven to be $\Theta(n^2)$, and this holds even if the adversary is allowed to make an unbounded number of parallel random oracle queries at each step.
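For readers who have never looked inside SCrypt, here is a simplified Python sketch of its core data-dependent loop (ROMix), with SHA-256 standing in for the real BlockMix core; the data-dependent table lookups in the second phase are what the memory-hardness analysis is about.

import hashlib

def H(x):
    # Stand-in for SCrypt's BlockMix core; SHA-256 keeps the sketch short.
    return hashlib.sha256(x).digest()

def romix(block, n):
    x = H(block)
    # Phase 1: fill a table with the hash chain V[i] = H^(i+1)(block).
    v = []
    for _ in range(n):
        v.append(x)
        x = H(x)
    # Phase 2: n data-dependent steps; which table entry is read depends on the
    # current value, which is what makes cheating on memory expensive.
    for _ in range(n):
        j = int.from_bytes(x, "little") % n
        x = H(bytes(a ^ b for a, b in zip(x, v[j])))
    return x

print(romix(b"password||salt", 1 << 10).hex())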

The last speaker was Rafael Pass, from Cornell, who gave a gripping talk about the Analysis of the Blockchain Protocol in Asynchronous Networks. During his talk, Rafael defined the notions of consistency and liveness in asynchronous networks. He then explained his result proving that the blockchain consensus mechanism satisfies strong forms of consistency and liveness in an asynchronous network with adversarial delays that are a-priori bounded.

HEAT Workshop

The workshop was really interesting because, besides new theoretical advances in the field, many talks were about the practical side of FHE: how to set the parameters, concrete results in cryptanalysis, libraries and real-world applications. The part about lattice reduction techniques was particularly interesting.

In particular, Antoine Joux gave a talk named "The prehistory of lattice-based cryptanalysis", where he reviewed some lattice reduction algorithms (Gauss's algorithm for two dimensions and LLL for higher dimensions) and gave some cryptanalytic results, e.g. Shamir's attack against the knapsack problem and the low-density attack against the Merkle-Hellman knapsack. Basically, lattice reduction aims at finding a "good" basis, made of short and almost orthogonal vectors, from a "bad" one, made of long and non-orthogonal vectors. With a good basis, problems like SVP or CVP become easy and it is possible to break cryptosystems based on these problems. There are algorithms that do this (like the famous LLL), but the conclusion was that lattice-based cryptography remains secure as long as the lattices are big enough: all the lattice-reduction algorithms work well only if the dimension is not too high. In higher dimensions many problems appear and lattice reduction remains hard.
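Gauss's two-dimensional algorithm mentioned above is short enough to show in full; the following small Python version (often called Lagrange-Gauss reduction) turns a "bad" basis into a "good" one.

def lagrange_gauss(b1, b2):
    # Two-dimensional lattice reduction: returns a basis in which b1 is a
    # shortest non-zero vector of the lattice spanned by the input basis.
    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1]
    while True:
        if dot(b1, b1) > dot(b2, b2):
            b1, b2 = b2, b1                   # keep b1 the shorter vector
        m = round(dot(b1, b2) / dot(b1, b1))  # nearest-integer Gram-Schmidt coefficient
        if m == 0:                            # b2 cannot be shortened any further
            return b1, b2
        b2 = (b2[0] - m * b1[0], b2[1] - m * b1[1])

# A long, skewed basis of the lattice {(a, b) : b = 47*a mod 101}
print(lagrange_gauss((1, 47), (0, 101)))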

Another interesting talk on this topic was "An overview of lattice reduction algorithms" by Damien Stehlé, who pointed out that lattice reduction has two main goals: besides the predictable one of cryptanalysing lattice-based cryptosystems (such as NTRU and all those based on SIS and LWE), it is useful for cryptanalysing other cryptosystems as well, like variants of RSA. He then presented the two main algorithms in this field, i.e. BKZ and LLL, and outlined their differences, like the global strategy used by BKZ versus the local one used by LLL. He also introduced faster-LLL2, an improvement of the LLL algorithm which is the subject of one of his most recent works. In the conclusions, he mentioned some open problems, and finding a "quantum acceleration" is certainly one of the most interesting ones. In fact, as far as we know, lattice problems are not easier for quantum computers, and this is why they are considered the most promising candidates for post-quantum cryptography.

If someone is into coding, this may be interesting: Shi Bai gave a short talk about FPLLL, an implementation of floating-point LLL and BKZ reduction algorithms created by Damien Stehlé. It is a C++ library (also available in Python under the name of FPyLLL) which is also used by the popular Sage. Its goal, as stated by the authors, is to provide benchmarks for lattice reduction algorithms and, more generally, lattice reduction for everyone. More details can be found online, and contributions are welcome!

Besides lattice reduction algorithms, another interesting talk was given by Florian Bourse, who presented a recent work3 about circuit privacy for FHE. The main result is that it is possible to homomorphically evaluate branching programs over GSW ciphertexts without revealing anything about the computation, i.e. the branching program, except for the result and a bound on the circuit's size, by adding just a small amount of noise at each step of the computation. This means that the "price" to pay is quite low, especially compared to other techniques based on bootstrapping. Also, this method does not rely on not-so-well-understood assumptions like circular security and only assumes the hardness of LWE with polynomial modulus-to-noise ratio.


1. R. Cramer and V. Shoup. Universal hash proofs and a paradigm for adaptive chosen ciphertext secure public-key encryption. In Advances in Cryptology – EUROCRYPT 2002, volume 2332 of LNCS, pages 45–64. Springer, 2002.

2. Arnold Neumaier and Damien Stehlé. Faster LLL-type reduction of lattice bases. ISSAC 2016.

3. Florian Bourse, Rafael Del Pino, Michele Minelli and Hoeteck Wee. FHE circuit privacy almost for free. CRYPTO 2016, to appear.

This blog post has been collaboratively written by Michele and Razvan.

Tuesday, July 12, 2016

The Subset-Sum Problem

The Subset-Sum problem (also known as the knapsack problem) we want to review in this blog post is defined as follows: Given $n, S, a_1, a_2, \dots, a_n \in \mathbb{N}$, find \begin{align} I \subseteq [n] = \{1,2, \dots, n\}: \sum_{i \in I} a_i = S. \quad (1) \end{align}
Historical Remarks

This old problem was first studied in 1897, the same year the first airborne mission attempting to reach the geographical north pole (NP) started (and ended...), and it was one of the first problems proven to be NP-complete, meaning that worst-case instances are computationally intractable. Subset-Sum was proved to be NP-complete by reducing '3-SAT' to the 'Graph Coloring Problem', which was reduced to 'Exact Cover', which in turn was reduced to Knapsack and close variants thereof. These rigorous reduction proofs were carried out during the early 1970s, and the Subset-Sum problem features on Karp's somewhat famous list of 21 NP-complete problems, all infeasible to solve with current computers and algorithms and thus a possible basis for cryptographic primitives. The following table shows how the expected time/space requirements of algorithms solving (1) in hard cases evolved as the techniques were refined by modern research:
Expected time and space requirements of algorithms solving (1) in average hard instances.
The time and space requirements of current approaches are considerably lower than those of exhaustive search, of a generic meet-in-the-middle attack, or even of a finer combinatorial split into four (or more) lists. The asymptotically best known algorithm is currently a quantum one. Determining a lower bound on the run-time is still an open research question; it seems more difficult than seeing a possible link between the declining polar bear population in the arctic regions and the retreating sea ice accelerated by the observable rapid climate change.

Let us review two classical techniques that led to remarkable speed-ups:

Technique 1 - Meet in the Middle

Schröppel-Shamir: Combining disjoint sub-problems of smaller weight.
Hard instances of the Subset-Sum problem are characterized by relatively large elements $(\log_2 a_i \approx n)$ and a balanced solution, i.e. $|I| \approx \frac n 2$ in Equation (1). Identifying subsets of $[n]$ with length-$n$ vectors $x$ over the 'number-set' $\{0,1\}$ via $i \in I \Leftrightarrow x[i] = 1$, one constructs two lists $L_1, L_2$ of pairs that are merged into a solution list $L_0$:
Algorithms based on the birthday paradox construct expected collisions in the second component of the sub-problems in the lists $L_1, L_2$, forcing any $x \in L_0$ to fulfill (1). The difficulty is to estimate the list size needed to observe the existence of one solution with high probability. It is desirable to ensure that it is more likely to terminate the algorithm with a non-empty $|L_0| \geq 1$ (i.e. to have a solution) than the chance to see a polar bear towards the north-east or meet one in the middle of Svalbard, Norway.
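As a concrete illustration, here is a small Python sketch of the basic two-list meet-in-the-middle approach, which uses time and memory of order $2^{n/2}$ (Schröppel-Shamir refine this by splitting into four lists to reduce the memory):

from itertools import combinations

def subset_sum_mitm(a, s):
    # Split a into two halves, enumerate all subset sums of each half and
    # look for a pair of partial sums that collides on the target s.
    n = len(a)
    left, right = a[:n // 2], a[n // 2:]

    def all_sums(part, offset):
        sums = {}
        for r in range(len(part) + 1):
            for idx in combinations(range(len(part)), r):
                sums.setdefault(sum(part[i] for i in idx), [offset + i for i in idx])
        return sums

    l_sums, r_sums = all_sums(left, 0), all_sums(right, n // 2)
    for sigma, idx in l_sums.items():
        if s - sigma in r_sums:
            return sorted(idx + r_sums[s - sigma])
    return None

a = [267, 493, 869, 961, 1000, 1153, 1246, 1598]
s = 267 + 961 + 1246                 # target built from a known subset, so a solution exists
sol = subset_sum_mitm(a, s)
assert sol is not None and sum(a[i] for i in sol) == s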

Technique 2 - Enlarge Number Set

BCJ11: Adding length $n$ sub-solutions increases the number-set.
The idea of Howgrave-Graham-Joux (2010), later extended by Becker-Coron-Joux (2011), was to allow multiple representations of a solution. This comes at the price of enlarging the number-set, i.e. having $$x_0[i] = x_1[i] + x_2[i] \not \in \{0, 1\}.$$ Although constructing colliding sums in this way produces too many potential solutions and requires a non-trivial filtering step to remove 'inconsistent' ones when merging the two lists $L_1$ and $L_2$ back into one, it still gives an overall asymptotic speed-up.
The number-set used by the authors was $\{-1,0,1\}$, a $-1$ indicating a summand appearing on both sides of Equation (1).
After constructing sufficiently many sub-problems and their respective partial solutions, a collision can be expected, and the combination forms a solution for the given instance.

The cryptanalytic methods for structurally approaching the Subset-Sum problem are valuable algorithmic meta-techniques also applicable to other NP-complete problems like lattice- or code-based problems.
Such problems are promising candidates for the construction of post-quantum cryptosystems, cloud security applications and encrypted Polar Bear TV broadcasts. There is a conference focusing on such topics (Polar bears and NP-complete problems) coming up - stay tuned.

PS: The bad image quality is due to Blogger not letting me include vector graphics like .pdf or .eps, nor rendering LaTeX code directly... :-(