Sunday, March 27, 2016

FSE 2016 and DISC workshop

This is part two of a two-part blog post collaboratively written by Matthias and Ralph.

FSE 2016


From 20 to 23 March, the 23rd International Conference on Fast Software Encryption (FSE) took place at the Ruhr University Bochum in Bochum, Germany. The conference focuses on fast and secure primitives for symmetric cryptography.


The conference was split into nine sessions.

Session 1: Operating Modes


This session was on operating modes and the following three papers were presented:

The paper "A MAC Mode for Lightweight Block Ciphers" was presented by Atul Luykx, where he presented a new MAC mode called LightMAC that focus on extending the lifetime of a symmetric key. 

Session 2: Stream-Cipher Cryptanalysis

This session was on stream-cipher cryptanalysis and the following two papers were presented:

The paper "Cryptanalysis of the Full Spritz Stream Cipher" was presented by Subhadeep Banik where he presented an improved state recovery attack that takes advantage of a special state, that when entered all even values in the permutation are mapped to even values and all odd values to odd values. 

Session 3: Components

This session was on components and the following three papers were presented: 
Siang Meng Sim presented the paper on "Lightweight MDS Generalized Circulant Matrices", where they showed left-circulant MDS matrices of order $\le 8$.

Session 4: Side-Channels and Implementations

This session was on side-channels and implementations and the following three papers were presented: 
Pascal Sasdrich presented the paper "White-Box Cryptography in the Gray Box - A Hardware Implementation and its Side Channels". In his talk, he presented the first AES white-box implementation in hardware and provided results of a practical gray-box (side-channel) analysis.

Session 5: Automated Tools for Cryptanalysis

This session was on automated tools for cryptanalysis and the following three papers were presented: 

Vesselin Velichkov presented the paper "Automatic Search for the Best Trails in ARX: Application to Block Cipher Speck", where they present a new automatic search tool that applies Matsui's algorithm and produces optimal results. They then searched for differential trails in Speck and presented new bounds on the security of Speck with respect to differential cryptanalysis.

Session 6: Designs

This session was on designs and the following two papers were presented: 

Jeremy Jean presented the paper on "Efficient Design Strategies Based on the AES Round Function". In their paper, they present cascaded iterations of the AES round function together with intermediate XORs, which achieves high performance.

Invited Talk: On White-Box Cryptography

The invited talk on white-box cryptography was given by Henri Gilbert from ANSSI, France. 

Session 7: Block-Cipher Cryptanalysis

This session was on block-cipher cryptanalysis and the following five papers were presented: 

  • Bit-Based Division Property and Application to Simon Family
  • Algebraic Insights into the Secret Feistel Network
  • Integrals go Statistical: Cryptanalysis of Full Skipjack Variants
  • Note on Impossible Differential Attacks
  • Improved Linear Hull Attack on Round-Reduced Simon with Dynamic Key-guessing Techniques

Rump Session

The rump session consisted of several short talks of a few minutes each.

Thomas Peyrin presented a new block cipher called Skinny that is based on the TWEAKEY framework. Skinny has an AES-like design and achieves better performance than SIMON and other lightweight designs.

In his talk about TikZ for Cryptographers, Jeremy Jean presented a website with standardized figures for ciphers. His aim is that every cryptographer uses these standardized figures in their papers.

Christian Rechberger presented the FHEMPCZK-Cipher Zoo, where one can compare ciphers for Fully Homomorphic Encryption (FHE), Multi-Party Computation (MPC) and Zero-Knowledge (ZK) proofs.

Session 8: Foundations and Theory


This session was on foundations and theory and the following four papers were presented:
  • Modeling Random Oracles under Unpredictable Queries
  • Practical Order-Revealing Encryption with Limited Leakage
  • Strengthening the Known-Key Security Notion for Block Ciphers
  • Related-Key Almost Universal Hash Functions: Definitions, Constructions and Applications

Session 9: Authenticated-Encryption and Hash Function Cryptanalysis


This session was on authenticated-encryption and hash function cryptanalysis and the following three papers were presented:
  • Key Recovery Attack against 2.5-round $\pi$-Cipher
  • Cryptanalysis of Reduced NORX
  • Analysis of the Kupyna-256 Hash Function
Yu Sasaki presented "Cryptanalysis of Reduced NORX", where they present state- and key-recovery attacks on the core permutation reduced to 2 out of 4 rounds.

DISC workshop


After FSE, the Directions in Symmetric Cryptography (DISC) workshop for PhD students and young post-docs took place at the Ruhr University Bochum.

The workshop was divided into 5 working groups of 5 to 9 participants, who got the opportunity to work together for one and a half days on a specific topic. Furthermore, the goal was to meet other people working in the same area and to build research collaborations.

Topic 1: How to design a bad key schedule

The goal of this topic was to approach key schedule design from the opposite direction: Can we design a key schedule that seems harmless but has a detrimental effect on the block cipher's security? Is it even possible to hide backdoors in the key schedule only?

Topic 2: The TWEAKEY framework - New Designs and Cryptanalysis of STK

The TWEAKEY framework was introduced at ASIACRYPT 2014 as a more general design idea for tweak/key (tweakey) scheduling. In this framework, one does not need to separate between key and tweak material. The authors proposed a specific instance called superposition TWEAKEY (STK) and designed three tweakable block ciphers, Joltik-BC, Deoxys-BC and Kiasu-BC, based on this idea. This topic was both about cryptanalysis of the STK construction and about thinking of design alternatives.

Topic 3: Distinguishing block ciphers: Is the attack space covered?

Block cipher cryptanalysis relies to a large degree on the existence of efficient distinguishers. In this topic, we want to discuss and explore possible directions where novel cryptanalytic techniques might be found or alternatively find arguments why new techniques are unlikely to be found.

Topic 4: How reliable are our assumptions in statistical attacks?

In symmetric cryptanalysis, statistical attacks such as differential and linear cryptanalysis, boomerang attacks or differential-linear attacks play an important role in the security evaluation of block ciphers. These attacks inherently rely on varying independence or randomization assumptions that are necessary to estimate their success probability. Is it possible to determine criteria for when these assumptions hold or fail? Can we sometimes remove the assumptions or substitute them with weaker variants? Can we give heuristic arguments for their validity to increase our faith in them?

Topic 5: Resistance against cryptanalytic attacks: What can we prove?

Unlike algorithms in public-key schemes, block ciphers are usually not based on hard problems. To estimate their security, block ciphers are instead evaluated against the range of known attack vectors. From both the designers' and the evaluators' perspective, it would be desirable to have proofs against larger classes of attacks. Even finding good heuristic formulas that determine the number of rounds needed for security would be a large step forward. In this topic, we would like to discuss design, evaluation and proof strategies that might help us move towards this goal.



Friday, March 18, 2016

Spring School on Symmetric Cryptography

This is part one of a two-part blog post collaboratively written by Matthias and Ralph.

The Spring School on Symmetric Cryptography takes place at Beckmanns Hof, just one minute south of the Ruhr University Bochum. The Spring School, which can be seen as a supplementary event for FSE 2016 next week, lies literally at the edge of the botanical garden of Bochum!

The Spring School's program covers lectures on the theory of "Boolean Functions" and "Statistical Models" in symmetric cryptography. Additionally, there are exercises, compiled by Anne Canteaut and Celine Blondeau, to gain a firm understanding thereof.
The theoretical part is supplemented by the more practically oriented talks of the invited researchers Joan Daemen and Ventzi Nikov.


For the lecture on "Boolean Functions" Anne Canteaut preferred a sympathetic, old-fashioned presentation with chalk on black board in the lecture hall over an overhead presentation in the modern seminar room.

The topics ranged from Boolean functions and their algebraic normal form to their interpretation as Reed–Muller codes. Then the Walsh transform was introduced as a tool to analyze and approximate Boolean functions by linear ones.
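To make the definition concrete, here is a small, naive Python sketch (my own illustration, not part of the lecture material) that computes the Walsh transform $W_f(a) = \sum_x (-1)^{f(x) \oplus \langle a, x \rangle}$ of a Boolean function f on n variables; a coefficient with large absolute value means that f is well approximated by the linear function defined by a.

```python
# Naive Walsh transform of a Boolean function f on n variables.
# f maps an n-bit integer to 0 or 1; the result lists W_f(a) for all a.

def walsh_transform(f, n):
    spectrum = []
    for a in range(1 << n):
        total = 0
        for x in range(1 << n):
            dot = bin(a & x).count("1") & 1      # inner product <a, x> over GF(2)
            total += (-1) ** (f(x) ^ dot)
        spectrum.append(total)
    return spectrum

# Example: the 3-variable majority function.
majority = lambda x: 1 if bin(x).count("1") >= 2 else 0
print(walsh_transform(majority, 3))
```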


In the lecture on "Statistical Models", Celine Blondeau (lecture notes here) reminded us of basic probability distributions well known to statisticians, and we explored some of their uses in cryptography. The discussion led us from the binomial, Poisson, normal, chi-squared and hypergeometric distributions to the study of Kullback-Leibler divergence and its applications in, e.g., linear attacks, differential attacks and impossible differential cryptanalysis. She also gave directions for ongoing work in that field.
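For instance, the Kullback-Leibler divergence between two discrete distributions, which roughly governs how many samples are needed to tell them apart, fits in a couple of lines of Python (a toy example with made-up probabilities, not numbers from the lecture):

```python
from math import log

# Kullback-Leibler divergence D(P || Q) between two discrete distributions.
def kl_divergence(p, q):
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy example: a slightly biased distribution versus the uniform one.
p = [0.27, 0.25, 0.25, 0.23]
q = [0.25, 0.25, 0.25, 0.25]
print(kl_divergence(p, q))   # a small positive value
```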



The advantages of "Permutation Based Cryptography" were presented by Keccak (SHA-3) and Rijndael (AES) (co-)designer Joan Daemen.


The talk about "Threshold Implementations" by Ventzi Nikov covered techniques to decompose (non-linear) functions to separate the inputs from intermediate results and masking.
The key-values are masked by splitting up function inputs into separate shares such that knowledge of less than all shares does not allow to recover the value.

Matthias and Ralph will stay tuned for some S-box theory and the statistical wrap-up on Saturday morning. The slides and recorded videos of this Spring School will soon be available at the official website.

Tuesday, March 15, 2016

Path ORAM

In the past few months, I've learned about many different areas of securely outsourcing storage and computation. One particularly interesting idea I've come across is oblivious random-access machines (ORAMs). In this post, I explain the basics of ORAMs and present the simplest version of path ORAM (eprint, ACM DL).

First, what is a random-access machine? It's a model of computation. Essentially, it's a simple computer that has two parts—a control unit and memory—and can run a program stored in its memory by fetching instructions one at a time. The control unit has a small amount of storage, the registers, and it can read from or write to specific addresses in the memory. Already, you might see how it could model cloud storage: the data owner (or client) is like the control unit, and the remote storage is like the memory. In this post, instead of referring to a control unit and memory, I'll refer to a "client" or "data owner" and a "server."

Oblivious RAMs (ORAMs) hide access patterns to the memory or server. They were proposed by Ostrovsky and Goldreich about 25 years ago (Ostrovsky's PhD thesis). Their motivation was software protection: ORAMs can prevent reverse engineering of programs—useful for software companies with proprietary implementations (and for malware authors). But ORAM is useful in settings other than software protection, like searchable encryption and cloud storage.

If you're already familiar with oblivious transfer or PIR, you may wonder how they relate to ORAM. The following table gives an overview of the main differences.

Scheme | Participants | Remote data | Requirements
oblivious RAM | client, server | read and write, private | server doesn't know the access pattern
private information retrieval (PIR) | client(s), server | read-only, public | requested item unknown to server
symmetrically private information retrieval (SPIR) | client, server | read-only, public | requested item unknown to server, non-requested items unknown to client
oblivious transfer (OT) | chooser, sender | read-only | requested item(s) unknown to sender, non-requested items unknown to chooser

What does it mean to hide access patterns? It means that the server (or anyone who sees all client-server communication and what's stored on the server at any time) doesn't learn anything about which data is actually being accessed, whether it is being updated or only read, how frequently it's accessed, whether the same data is accessed twice, etc. The server should be oblivious to what the owner is doing with its data.

ORAM, more generally, is a cryptographic protocol with three algorithms, Init, Read, and Write, between a client and a server:

  • Init(B, N): the client allocates some memory on the server to store N data blocks of size B and maybe allocates some local memory.
  • Read(i): the server obliviously returns data block i to the client.
  • Write(i, d): the server obliviously replaces block i's data with d.

An ORAM is pattern-hiding if an adversary who chooses two access sequences of the same length can't distinguish which one is being executed, given a complete transcript of communication between the client and server and the contents of the server's memory at any time.

For example, maybe an adversary chooses the two access sequences
( (read, 2), (read, 2), (read, 2) )
and
( (read, 5), (write, 7, 0x30e84a13), (read, 1) ).
It sees the resulting communication transcript, which could look like
( (read, 10), (read, 12), (read, 8),
(write, 10, 0x53d35dd9), (write, 12, 0x16701de9), (write, 8, 0x0eae293c),
(read, 6), (read, 8), (read, 1),
(write, 6, 0x30e84a13), (write, 8, 0xffe8d5a4), (write, 1, 0xc113bc8b),
(read, 7), (read, 2), (read, 6),
(write, 7, 0xbb7da98a), (write, 2, 0x08bbfc25), (write, 6, 0xcaf6d2e2) ).
Given this transcript and what's stored on the server at any time, it shouldn't be able to determine which access sequence the ORAM is actually executing with probability greater than 1/2.

ORAM schemes (like path ORAM) tend to ignore exactly how data blocks are encrypted and instead focus on hiding the access pattern. However, all data should be encrypted, otherwise the server could easily learn about the access pattern. A scheme with IND-CPA security would be sufficient.

Here's a trivial ORAM scheme to show that it is indeed possible to hide access patterns. Init(B, N): the server allocates BN bits of memory. Read(i): the client requests all N data blocks from the server, decrypts them, identifies block i among them (e.g., by always prepending a block's identifier i to its data before encrypting it), then re-encrypts all N blocks and sends them back to the server. Write(i, d): the client requests all N data blocks from the server, decrypts them, replaces the data of block i with d, then re-encrypts all N blocks, and sends them back to the server. This scheme is inefficient, but it hides the access pattern!
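To make the bookkeeping explicit, here is a minimal Python sketch of this trivial scheme (my own illustration; the "encryption" is only a placeholder, whereas a real instantiation would re-encrypt every block with fresh randomness under an IND-CPA scheme on each access):

```python
def reencrypt(block):
    # Placeholder: a real scheme would re-encrypt the block with fresh
    # randomness here, so the server cannot link old and new versions.
    return bytes(block)

class TrivialORAM:
    def __init__(self, block_size, num_blocks):
        # Init(B, N): the "server" allocates N blocks of B zero bytes each.
        self.B, self.N = block_size, num_blocks
        self.server = [reencrypt(bytes(block_size)) for _ in range(num_blocks)]

    def read(self, i):
        # Read(i): fetch all N blocks, pick out block i, re-upload all N blocks.
        blocks = [bytes(b) for b in self.server]
        result = blocks[i]
        self.server = [reencrypt(b) for b in blocks]
        return result

    def write(self, i, data):
        # Write(i, d): fetch all N blocks, replace block i, re-upload all N blocks.
        blocks = [bytes(b) for b in self.server]
        blocks[i] = bytes(data)
        self.server = [reencrypt(b) for b in blocks]

oram = TrivialORAM(block_size=16, num_blocks=4)
oram.write(2, b"secret data".ljust(16, b"\x00"))
print(oram.read(2))
```

Every operation touches all N blocks in exactly the same way, so reads and writes to different addresses are indistinguishable, at the cost of O(N) bandwidth per access.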

Path ORAM

I'll explain the main ideas of an elegant tree-based ORAM protocol, path ORAM (eprint, ACM DL), developed by Emil Stefanov and others.

Init(B, N)

Suppose the client wants to store N data blocks of size B. In the simplest variant of path ORAM, the storage on the server is treated as a binary tree of height L = ⌈log N⌉ with 2^L leaves. (A binary tree of height L has L + 1 node levels.) Each node in the tree is a bucket that contains Z blocks of size B, where Z is a small constant (e.g., 3–5) independent of N. Buckets are padded with dummy blocks when necessary so they all appear full.


An example of a path ORAM tree with height L = 3 and Z = 3 blocks per bucket that can store up to N = 8 data blocks. These data blocks are indistinguishable from the dummy blocks that fill up the rest of the tree.

The client stores two data structures: a position map and a stash. The stash stores data blocks during block accesses and data blocks that have overflowed from the tree. It's initially empty. The position map associates each of the N data blocks to one of the 2^L leaves, chosen independently and uniformly at random. What makes path ORAM work is the following rule: at any time, every data block is stored either in the stash or in a node along the path from the root to its assigned leaf.
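One convenient way to picture the server's storage (my own choice of layout, not something prescribed by the paper) is as a flat array of 2^(L+1) - 1 buckets in heap order, with the root at index 0 and leaf l at index l + 2^L - 1. The client can then compute the bucket indices it has to fetch for a given leaf as follows:

```python
def path_to(leaf, L):
    """Bucket indices on the path from the root to the given leaf (root first),
    assuming the tree is stored as a flat array in heap order."""
    node = leaf + (1 << L) - 1        # index of the leaf bucket in the flat array
    path = [node]
    while node > 0:
        node = (node - 1) // 2        # parent in heap order
        path.append(node)
    return list(reversed(path))

print(path_to(5, 3))   # [0, 2, 5, 12] for L = 3 and leaf 5
```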

Position map
Block ID | Assigned leaf
0 | 3
1 | 2
2 | 2
3 | 7
4 | 1
5 | 0
6 | 3
7 | 6

An example of a position map for N = 8 items, each assigned one of the tree's 2^L = 8 leaves.

Read(i)

To read block i, the client first looks up its associated leaf, PosMap[i]. Then, it sends a read request to the server for the contents of all the buckets along the path from the root to the leaf PosMap[i].

The client decrypts all of these (L+1)Z blocks, discards any dummy blocks, and stores the rest in its stash. It identifies block i among the blocks in the stash. According to the "law" of path ORAM, the block was either already in the stash or was in the path that is now stored in the stash. The client updates block i's assigned leaf in the position map, picking another one of the 2^L leaves uniformly at random.

Next, it tries to empty the stash, encrypting data blocks with fresh randomness and writing them back to the same path—the path from the root to block i's former assigned leaf. Data blocks are put as close to the leaves as they can go while still respecting their assigned leaves: a block can be stored in a bucket only if that bucket is also on the path from its assigned leaf to the root. (Any two paths will intersect at the root bucket, at least.) If there aren't enough eligible data blocks in the stash to fill a bucket, then dummy blocks are added. If a bucket is full of data blocks and there are still eligible blocks in the stash, they will be eligible for the next bucket in the path, one level up. If the root bucket is full of data blocks, then any remaining data blocks will stay in the stash.

If you've been following along closely, you might have noticed that after a block is read, it ends up in the root node with probability 1/2. However, even if this block is accessed twice in a row, the access will be indistinguishable from an access to any other block.

Write(i, d)

To write data d to block i, the client proceeds exactly as if it were reading block i, but simply changes the contents of block i to d in the stash before re-encrypting it and re-writing it to the server.
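Putting the pieces together, here is a much-simplified Python sketch of the client logic (my own toy version, not the simulation linked below): encryption, dummy-block padding and all optimizations are left out, so it only illustrates the position map, the stash and the write-back rule.

```python
import random

class PathORAM:
    def __init__(self, num_blocks, bucket_size=4):
        self.N, self.Z = num_blocks, bucket_size
        self.L = max(1, (num_blocks - 1).bit_length())   # ceil(log2 N)
        self.num_leaves = 2 ** self.L
        # Server storage: 2^(L+1) - 1 buckets, each a list of (block id, data).
        self.tree = [[] for _ in range(2 ** (self.L + 1) - 1)]
        # Client storage: position map and stash.
        self.position = [random.randrange(self.num_leaves) for _ in range(self.N)]
        self.stash = {}

    def _path(self, leaf):
        """Bucket indices from the root down to the given leaf (heap layout)."""
        node = leaf + self.num_leaves - 1
        path = [node]
        while node > 0:
            node = (node - 1) // 2
            path.append(node)
        return list(reversed(path))

    def access(self, op, block_id, new_data=None):
        leaf = self.position[block_id]
        # Remap the block to a fresh random leaf before writing back.
        self.position[block_id] = random.randrange(self.num_leaves)

        # Read the whole path into the stash.
        path = self._path(leaf)
        for idx in path:
            for bid, data in self.tree[idx]:
                self.stash[bid] = data
            self.tree[idx] = []

        result = self.stash.get(block_id)    # None if the block was never written
        if op == "write":
            self.stash[block_id] = new_data

        # Write the path back, filling buckets as close to the leaf as possible.
        for idx in reversed(path):
            bucket = []
            for bid in list(self.stash):
                if len(bucket) == self.Z:
                    break
                # A block may be placed here only if this bucket also lies on
                # the path from the root to the block's (new) assigned leaf.
                if idx in self._path(self.position[bid]):
                    bucket.append((bid, self.stash.pop(bid)))
            self.tree[idx] = bucket
        return result

oram = PathORAM(num_blocks=8)
oram.access("write", 3, b"hello")
print(oram.access("read", 3))    # b'hello'
```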

And that's the basic version of path ORAM. Simple, right? Have a look at my path ORAM simulation on GitHub.

Security

Why is path ORAM "secure" in the sense of hiding the client's access pattern? Its security is mainly due to the property that each data block is assigned a leaf that was chosen independently and uniformly at random, and is updated after each access. Therefore, each data access consists of reading and writing the path from the root to a random leaf. Since all values are re-encrypted with fresh randomness every time they are accessed, only the client can distinguish reads and writes.

Efficiency

The client must store the position map and the stash. The size of the position map is NL bits, or O(N log N) bits. During an access, the stash must store an entire path from the root to a leaf node (transient storage). The transient storage required is (L+1)ZB bits, which is O(log N) data blocks. Between accesses, the stash must store any blocks that overflowed (persistent storage). Stefanov et al. show that for a bucket size of Z = 5, the probability that the stash exceeds a capacity of R blocks is at most 14 · (0.6002)^R, i.e., it decreases exponentially with the stash size. (If you look at the path ORAM paper, you'll notice that the security analysis is half a page, while proving a bound on the persistent stash usage requires 8 pages!)

The server must store the tree, whose total size is (2^(L+1) - 1)ZB bits, while it is storing NB bits of actual data (i.e., non-dummy data), which corresponds to an overhead of about 2Z.

For each block access, the server must send an entire path from the root to a leaf node, and the client must also send it back. So, the bandwidth cost is 2(L+1)ZB bits.
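Plugging in concrete numbers makes these costs more tangible. The short calculation below uses parameters I picked arbitrarily for illustration (N = 2^20 blocks of 4 KiB, Z = 4), not values from the paper:

```python
import math

N = 2**20            # number of data blocks
B = 4096 * 8         # block size in bits (4 KiB)
Z = 4                # blocks per bucket
L = math.ceil(math.log2(N))

position_map_bits = N * L                      # client: O(N log N) bits
transient_stash_bits = (L + 1) * Z * B         # one full path
server_bits = (2 ** (L + 1) - 1) * Z * B       # the whole tree
data_bits = N * B                              # actual (non-dummy) data
bandwidth_bits = 2 * (L + 1) * Z * B           # read and write one path

print("position map: %.1f MiB" % (position_map_bits / 8 / 2**20))
print("transient stash: %.0f KiB" % (transient_stash_bits / 8 / 2**10))
print("server storage: %.0f GiB for %.0f GiB of data (~%dx overhead)"
      % (server_bits / 8 / 2**30, data_bits / 8 / 2**30, 2 * Z))
print("bandwidth per access: %.0f KiB" % (bandwidth_bits / 8 / 2**10))
```

With these parameters, the client keeps a 2.5 MiB position map, the server stores roughly 32 GiB to hold 4 GiB of actual data, and each access transfers about 672 KiB in total (one path down, one path up).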

There's a lot more to path ORAM than what I presented in this post. If you're interested in learning more about it, its variants, or ORAMs in general, there are several recent papers worth looking up.

Sunday, March 6, 2016

Paris Crypto Day (Part II)

The afternoon session of the Paris Crypto Day was just as interesting as the morning part. Here is a summary of Romain Gay, Sonia Belaïd and Alain Passelègue's talks.


Romain Gay, a PhD student at ENS, presented his EuroCrypt 2016 paper (joint work with D. Hofheinz, E. Kiltz and H. Wee), which introduces a public-key encryption scheme based on the DDH assumption. The work also considers the security loss of the construction and positively answers the open question of finding a tightly secure, pairing-free, CCA-secure construction based on the DDH hardness assumption. Aside from not using pairings, the novel construction is also efficient in terms of the number of group elements needed to represent the ciphertext.

The following table compares the new PKE scheme with previous CCA-secure works¹:
Reference | |ct| - |m| | Loss L | Assumption | Pairing
CS98 | 3 | O(Q) | DDH | no
KD04 | 2 | O(Q) | DDH | no
HJ12 | O($\lambda$) | O(1) | DLIN | yes
LPJY15 | 47 | O($\lambda$) | DLIN | yes
AHY15 | 12 | O($\lambda$) | DLIN | yes
GCDCT15 | 10 | O($\lambda$) | SXDH | yes
presented work | 3 | O($\lambda$) | DDH | no

Sonia Belaïd, who obtained her PhD from ENS Paris, gave a presentation on the use of masking to defeat power-analysis attacks. The theoreticians in the audience welcomed the presentation, since it brought into the spotlight side-channel attacks, which are used in the real world. A power-analysis attack works by tracing the electrical consumption of a hardware implementation of a cipher in order to recover sensitive variables. Common algorithmic countermeasures include fresh re-keying and masking, with the presentation focusing on the latter technique.

Masking breaks a sensitive variable (one depending on the key and on public data) into t+1 components, known as the masks (or shares). Precisely t of the shares are generated uniformly at random, while the last one is computed so that the composition of all t+1 shares equals the sensitive variable. Commonly, Boolean, additive or multiplicative field operations are used for the composition. The protection gained through masking has direct consequences for efficiency: the running time of algorithms with large states, such as Keccak, can increase significantly.
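As a toy illustration (mine, not from the talk), Boolean masking of a single sensitive byte into t + 1 = 3 shares looks as follows:

```python
import secrets

# Boolean masking of a sensitive byte: t shares are drawn uniformly at random
# and the last share is chosen so that the XOR of all t + 1 shares equals the
# secret. Any subset of at most t shares is uniformly distributed.

def mask(secret, t):
    shares = [secrets.randbits(8) for _ in range(t)]
    last = secret
    for s in shares:
        last ^= s
    return shares + [last]

def unmask(shares):
    value = 0
    for s in shares:
        value ^= s
    return value

key_byte = 0x2B                   # a hypothetical sensitive variable
shares = mask(key_byte, t=2)      # second-order masking: three shares
assert unmask(shares) == key_byte
```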


Following Sonia's talk, Alain Passelègue (also a PhD student at ENS Paris) presented his EuroCrypt 2016 paper (joint work with S. Belaïd, F. Benhamouda, E. Prouff, A. Thillard and D. Vergnaud). Alain's talk followed the ideas introduced in the previous one and focused on the d-probing model, a model for analyzing the power of masking in which a sensitive variable is broken into masks but an attacker can learn up to d of the intermediate variables. The goal is to construct generic masking schemes that are provably secure in the d-probing model.

Specifically, they consider the computation of c = ab, where a, b and c are bits: a and b are given as input shares, and the question is the minimal number of random bits needed to securely compute a sharing of c. They introduce a scheme that obtains the shares of c from sums of products of the input shares. Another significant practical contribution is an automated tool for finding attacks against masking schemes, which is also described in detail in their work.
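For intuition, here is the classical ISW (Ishai-Sahai-Wagner) multiplication gadget in Python, which computes shares of c = ab using d(d+1)/2 fresh random bits. Note that this is only the textbook baseline, not the randomness-optimized construction from the paper discussed above:

```python
import secrets

def share(bit, d):
    """Split a bit into d + 1 Boolean shares (d of them uniformly random)."""
    shares = [secrets.randbits(1) for _ in range(d)]
    shares.append(bit ^ (sum(shares) & 1))
    return shares

def unshare(shares):
    return sum(shares) & 1

def isw_and(a_shares, b_shares):
    """Shares of c = a AND b from shares of a and b (classical ISW gadget)."""
    n = len(a_shares)
    c = [a_shares[i] & b_shares[i] for i in range(n)]   # share-wise products
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(1)                      # one fresh bit per pair
            c[i] ^= r
            c[j] ^= r ^ (a_shares[i] & b_shares[j]) ^ (a_shares[j] & b_shares[i])
    return c

a, b, d = 1, 1, 2
c_shares = isw_and(share(a, d), share(b, d))
assert unshare(c_shares) == (a & b)
```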


Overall, the Paris Crypto Day was a good chance for the members of the crypto community around Paris to present their work and exchange technical ideas. To see a summary of the morning session talks, check Paris Crypto Day I. Please check the official webpage for the future editions.


Notes

1. |ct| and |m| denote the number of group elements representing the ciphertexts and the message, respectively.