CSci 8980-1:

Program Analysis for Security

Quantitative information flow, part 2

James Newsome, Stephen McCamant, and Dawn Song. “Measuring channel capacity to distinguish undue influence”. In Programming Languages and Analysis for Security (PLAS), pages 73–85, Dublin, Ireland, June 2009.
[ACM]

Miguel Castro, Manuel Costa, and Jean-Philippe Martin. “Better bug reporting with better privacy”. In Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII), pages 319–328, Seattle, WA, USA, March 2008.
[ACM]

Question: The Newsome et al. paper is missing some details about how the XOR streamlining approach works. (This doesn't really excuse it, but one reason is that this feature was added to the system only at the last minute.) In particular, it does not say from what distribution the random parity constraints are selected. There are two possible parity constraints for each subset of the output bits: for instance if x₁, x₃, and x₇ are three bits in the output, you could either require that (x₁ xor x₃ xor x₇) = 0 (even parity), or that (x₁ xor x₃ xor x₇) = 1 (odd parity). But the paper doesn't specify which such parity constraints are chosen with what probability.
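To make the even/odd distinction concrete, here is a minimal sketch of the two parity constraints over the example subset {x₁, x₃, x₇}. The function names, the bit-numbering convention (bit 0 is the least significant), and the 8-bit toy width used for enumeration are assumptions made for illustration; none of them come from the paper.

```python
def parity(value, bit_indices):
    """XOR together the selected bits of value (bit 0 = least significant, an assumed convention)."""
    p = 0
    for i in bit_indices:
        p ^= (value >> i) & 1
    return p


def satisfies(value, bit_indices, target):
    """True if the XOR of the selected bits equals target (0 = even parity, 1 = odd parity)."""
    return parity(value, bit_indices) == target


# The subset from the example: x1, x3, and x7.
subset = (1, 3, 7)

# Enumerate a toy 8-bit output space (an assumption, to keep the enumeration small).
outputs = range(256)
even = [v for v in outputs if satisfies(v, subset, 0)]
odd = [v for v in outputs if satisfies(v, subset, 1)]
```

Every value satisfies exactly one of the two constraints, so over the full space of possible values each constraint keeps exactly half of them.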

Your classmate Ben argues that using simpler constraints will make the solving process faster, so he proposes that you should use parity constraints over a single bit. For each constraint, you should pick one of the 32 bits in the output at random, and then constrain that bit to be either zero (with probability one-half) or one (with probability one-half).
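Ben's proposal can be written down as a short sketch. The names `ben_constraint` and `holds`, the (bit, target) tuple representation, and the bit-numbering convention are hypothetical choices made for illustration only.

```python
import random


def ben_constraint(rng, width=32):
    """Pick one output bit uniformly at random and a required value for it.

    Returns a (bit, target) pair: the constraint is "output bit `bit` equals `target`".
    """
    bit = rng.randrange(width)   # one of the 32 output bits
    target = rng.randrange(2)    # 0 or 1, each with probability one-half
    return bit, target


def holds(value, constraint):
    """True if the output value satisfies the single-bit constraint."""
    bit, target = constraint
    return ((value >> bit) & 1) == target
```

For example, `holds(0b100, (2, 1))` is true because bit 2 of the value is set, while `holds(0b100, (2, 0))` is false.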

Explain to Ben why his technique won't work, by giving a counter-example function that takes a 32-bit integer as input and produces a 32-bit integer as output. Show how XOR streamlining with Ben's one-variable constraints would apply to your counter-example, and explain how the result is either too large or too small compared to the correct result.

Optional

Michael Backes, Boris Köpf, and Andrey Rybalchenko. “Automatic discovery and quantification of information leaks”. In IEEE Symposium on Security and Privacy “Oakland”, pages 141–153, Oakland, CA, USA, May 2009.
[IEEE]

Another near-exhaustive approach to quantitative information flow that appeared at almost the same time as the influence paper in the main readings. By concentrating on linear constraints, this approach can be more precise and can easily support more kinds of definitions, at the expense of supporting a smaller set of programs.

Hirotoshi Yasuoka and Tachio Terauchi. “Quantitative information flow — verification hardness and possibilities”. In Computer Security Foundations (CSF), pages 15–27, Edinburgh, UK, July 2010.
[IEEE]

A more theoretical exploration of the difficulty of quantitative information flow measurement. A key result is that most problems are #P-hard, which can be seen as another motivation for using #SAT solving as an approach.

Quoc-Sang Phan, Pasquale Malacaria, Oksana Tkachuk, and Corina Păsăreanu. “Symbolic quantitative information flow”. In Java Pathfinder Workshop (JPF), Cary, NC, USA, November 2012.
[ACM]

An even more recent near-exhaustive approach, this time incorporating symbolic execution.