\section{Introduction}
We have learned a lot about modern government surveillance from the Snowden
revelations starting in 2013.
For our current treatment, the most interesting ones are the tapping of
fibre-optic cables~\cite{Fibretap}, the storage of all intercepted encrypted
data~\cite{NSAStoringCiphertexts}, the search~\cite{XKeyscore} and
visualization capabilities~\cite{BoundlessInformant} for all intercepted data.
It is not the details that are interesting, it is the fact that one actor can
collect, store and search Internet-wide transcripts of communication.
This paper focuses on the possibility of deniability in this setting.
Today, \ac{GPG}~\cite{gpg}, \ac{OTR}~\cite{otr2004} and Text\-Secure
\cite{textsecure} are among the popular services used for private
communication.
\ac{GPG} provides standard asymmetric and symmetric encryption, intended for
use with email.
In 2004, \citeauthor{otr2004}~\cite{otr2004} first described \ac{OTR},
motivated by the limited deniability of \ac{GPG}.
The design goal of the protocol is to achieve strong privacy properties for
users' online communication, the same properties as expected from
a face-to-face conversation.
The main application at the time was \ac{IM}.
In 2010, OpenWhisperSystems adapted the \ac{OTR} messaging
protocol~\cite{frosch2014secure} for use in the smartphone text-messaging app
TextSecure.
The construction used for deniability in \ac{OTR}, and in the protocols
derived from it, is based on the principle <innocent until proven otherwise>.
While this holds true for most civil societies, it is not true everywhere.
There are circumstances in which the principle <guilty until proven otherwise>
is applied instead.
For these circumstances, with an adversary that can record all network traffic,
it is not possible to create any false witness (proof-of-innocence) due to the
deterministic nature of the protocol.
In \cref{Undeniability} we show that this allows an adversary with
transcripts of all network traffic to verify any statements about the
conversation using the transcripts.
To thwart this we need truly deniable encryption, as defined by
\citet{DeniableEncryption}, which means that we need to introduce some
randomness.
\subsection{Our Contributions}
\label{Contributions}
We start from protocols like \ac{OTR}, but we assume a stronger adversary
(\cref{ModelsOverview}).
This stronger adversary model breaks some assumptions in \ac{OTR}-like
protocols and removes the possibility for deniability.
We still want to achieve the same basic properties, e.g.~mutual authentication,
but we also want to have stronger deniability.
In \cref{Undeniability} we show that an adversary who can record all
communication in a network can use the deterministic properties of commonly
used mechanisms to reject lies about any communication.
We then outline the security properties needed and formally describe them and
some results about them (\cref{SecurityProperties}).
We continue to present a scheme which is a combination of authenticated
encryption and deniable encryption (\cref{AchievingDeniability}).
This protocol is stateful, and as such, is also secure against replay and
out-of-order attacks.
It is a general design, so any deniable encryption and message authentication
schemes with the right properties can be plugged in.
To show that this scheme is practically feasible, we present an implementation
(\cref{Implementation}).
We use \ac{NFC} in smartphones to exchange large-enough amounts of random data
when two users physically meet.
Later, when the users are apart, this data is used to key a \ac{OTP} for use
when communicating.
To estimate the feasibility of this scheme we investigate
\begin{itemize}
\item the order of magnitude of random data needed to be able to cover
everyday text-message conversation;
\item if the \ac{NFC} transmission rates and the random number generation in
combination with the number of physical meetings can provide high enough
exchange rates in practice; and
\item how the continuous key-generation in this scheme affects battery life
in the device.
\end{itemize}
We answer the first question by estimating the amount of private communication
for some users (\cref{NeededRandomness}).
We answer the second question by estimating the required number of physical
meetings for the same users (\cref{NumberOfMeetings}).
Since we have an estimate of the amount of exchanged randomness needed and the
transmission rates for the \ac{NFC} protocol, we can estimate how many physical
meetings and how long transfers are needed to cover the needs.
For the third question, we estimate the battery usage by performing key
generation and exchanges using different Android-based phones while monitoring
the battery consumption (\cref{BatteryConsumption}).
\section{The System and Adversary Models}
\label{ModelsOverview}
In our system model, we assume that we have two communication channels: one
private and one public.
We can implement the private channel as an \ac{NFC} channel and the public
channel as a network channel, e.g.\ the Internet.
We can always use the public channel, but we can only use the private
(\ac{NFC}) channel more rarely, e.g.\ if we are in the same physical space.
We assume that a user can, at least once, use the private channel.
They must do this before any secure communication over the public channel can
be started.
We assume that a device can generate cryptographically strong random data and
that this key-material can be securely stored on the device.
For the adversary model, we assume a stronger adversary, Eve.
Eve records all traffic in the public channel.
This means that she records all traffic for the entire Internet.
Thus Eve has a transcript of all communication that has taken place in the
public channel and any future communication will also be entered into her
transcript.
But Eve cannot record any communication in the private channel.
Also, Eve cannot access the devices used in the communication.
Instead, she will force us to reveal the keys that produced the ciphertexts in
her transcript.
In summary: in related work, Eve has the role of a prosecutor who must prove
things to a judge.
In our model, Eve is both prosecutor and judge, as is the case in some
surveillance states.
\subsection{A Formal Definition of the Adversary}
\label{FormalAdversary}
More formally, we summarize Eve's capabilities in the definition below.
She has the transcript of all communication in the public channel and she
forces Alice to reveal the keys used for the ciphertexts in the transcribed
conversation with Bob.
Eve's task is to decide whether Alice is trying to lie.
\begin{definition}[Deniability under Surveillance-State Attack,
DEN-SS]\label{DEN-SS}
Let \(A\) be an efficient adversary.
Let \(P = \left( (m_i, k_i) \right)_{i=1}^{n}\) be pairs of messages and
keys and let \(T = \left( t_i = \Enc[k_i]{ m_i } \right)_{(m_i, k_i)\in P}\)
be the transcript of ciphertexts corresponding to the entries in \(P\).
\(P\) and \(T\) are generated by some algorithm using the encryption scheme
\(\Scheme{S} = (\Keygen{}, \Enc{}, \Dec{})\).
Let \(\phi\) be the challenge algorithm.
First let the adversary choose (the index of) a transcript \[
i\gets A( r, T ),
] where \(r\rgets R\) denotes random coins sampled from a set \(R\).
We then create two challenges, \(c_0\) and \(c_1\):
\begin{align*}
c_0 &\gets k_i, \\
c_1 &\gets \phi( r^\prime, i, P ),
\end{align*}
where \(r^\prime\rgets R\) again denotes random coins.
Next, we choose a bit \(b\rgets \{0, 1\}\) uniformly at random.
Finally, the adversary outputs a bit \[
b^\prime\gets A( r, T, c_b ).
\]
We define the surveillance-state adversary's advantage as
\begin{equation}\label{DEN-SSAdvantage}
\Adv{\denss}{\Scheme{S}, \phi}[A] =
\left|\Prob{b = b^\prime} - \Prob{b\neq b^\prime}\right|.
\end{equation}
\end{definition}
Eve's transcript \(T\) and Alice and Bob's plaintext transcript will be
generated by a protocol.
We will present one in \cref{Implementation} for which Eve cannot
win the above game.
In the above, Alice uses the challenge algorithm \(\phi\) to produce a key
\(k_i^\prime\neq k_i\).
The goal of our scheme is that Eve cannot distinguish the two keys using
the transcript \(T\).
Next we will outline the problems other protocols have against this adversary.
We will use \ac{OTR} as an example, but similar arguments can be made against
similar protocols.
Then we will return to our solution in \cref{SecurityProperties} and onwards.
\section{Why Alice and Bob Currently Must Forget Their Conversation}
\label{Undeniability}
The security of today's popular services --- \ac{GPG}, \ac{OTR} and TextSecure
--- relies on standard cryptographic mechanisms.
These mechanisms provide strong security properties;
in this section we outline why some of these properties are too strong for
deniability in the setting where the adversary has a transcript of all
communications.
\ac{GPG} provides asymmetric and symmetric encryption intended to be used with
email.
\citet{otr2004} have already presented arguments against \ac{GPG} (and
\acl{PGP}, \acs{PGP}), but we will summarize them here.
If Alice wants to send a message \(m\) to Bob, then she will encrypt it for
Bob's public key \(\PubKey{B}\).
She will then create a signature for the resultant ciphertext \(c
= \Enc[\PubKey{B}]{ m }\) with her own private key \(\PriKey{A}\), i.e.~\(s
= \Sign[\PriKey{A}]{ H(c) }\).
Alice will then send the ciphertext block and the signature to Bob, and this
transaction will be recorded in Eve's transcript.
This scheme provides non-repudiation, i.e.~Alice cannot deny having sent the
message \(m\) at a later time and Bob can also prove to a third party that
Alice sent \(m\).
Further, Eve can also prove that Alice sent \(c\), but she can only verify the
plaintext \(m\) if Bob would reveal it to her.
In their paper, \citeauthor{otr2004} suggested a scheme which does not have
this property: the \ac{OTR} messaging protocol.
This protocol provides authentication for Alice and Bob, so that they can trust
they are talking to the right person.
But they can do no more than that: Bob cannot prove to a third party
what Alice has sent.
They accomplish this by a continuous use of the \ac{DH} key-exchange and
a \ac{MAC} based on symmetric keys.
We provide a simplified description here, mainly to give an understanding of
the underlying ideas, see the original paper~\cite{otr2004} for a detailed
description.
Alice chooses a secret exponent \(a\) and Bob chooses a secret exponent \(b\).
Alice signs \(g^a\) and sends \[
A\to B\colon g^a, \Sign[\PriKey{A}]{ g^a }
\] to Bob.
Bob conversely sends \[%
B\to A\colon g^b, \Sign[\PriKey{B}]{ g^b }
\] to Alice.
By this time they can both compute the secret shared-key \(k = g^{ab}\).
Let \(H_E\) and \(H_M\) be two cryptographically secure hash functions, used
for deriving encryption and \ac{MAC} keys, respectively.
When Alice wants to send the message \(m\) to Bob, she chooses a random
\(a^\prime\) and sends \[
A\to B\colon g^{a^\prime}, c = \Enc[H_E( k )]{ i }\oplus m,
\MAC[H_M( k )]{ g^{a^\prime}, c },
\] where \(i\) is some counter, to Bob.
Once she knows Bob has received the message she also sends the \ac{MAC} key
\(H_M( k )\) to Bob.
The next time Alice wants to send a message to Bob, she will use \(k^\prime
= g^{a^\prime b}\).
Now, Bob can no longer prove to a third party what Alice has said.
This is due to the \ac{MAC} being based on a secret key which Bob has access
to.
Also, since the encryption is done in counter mode~\cite{blockmodes}, the
ciphertext is malleable.
This means that flipping a bit in the ciphertext yields the same flip in the
plaintext.
Thus, anyone possessing the \ac{MAC} key can modify the plaintext by flipping
the bits in the ciphertext and then generate a new \ac{MAC}.
\subsection{Verifying Who Sent What}
The arguments for forgeability using malleable encryption and publishing the
\ac{MAC} keys only hold if the adversary cannot trust the source of the
transcript.
This more powerful Eve (\cref{DEN-SS}) can ultimately trust the transcript
since she collected it herself from the network.
And \emph{if} the courts trust Eve, if there are any courts, they also trust
the transcript.
In this setting the forgeability property vanishes.
Eve knows that no one has modified the ciphertext: she recorded it in her
transcript as it left Alice and arrived at Bob.
She also recorded Alice publishing the \ac{MAC} key used for the signature.
This allows Eve to verify each ciphertext against its \ac{MAC} tag.
She knows that Alice is the author of a message because she observes when Alice
publishes the \ac{MAC} key.
Thus, Eve also knows that no one has used the malleability property, because if
they did, that action would be recorded in Eve's transcript.
\subsection{Verifying Encryption Keys}
\label{VerifyingKeys}
Furthermore, Eve also learns some information about the key from the ciphertext
and \ac{MAC} tag.
Eve can use the \ac{MAC} to discard false keys for the ciphertext.
Since Eve has \(t = \MAC[H_M( k )]{ c }\) for a ciphertext \(c\) recorded in
her transcript, she can reject a key \(k^\prime\neq k\) by verifying that
\(\MAC[H_M( k^\prime )]{ c } \neq t\).
Hence, by having the \ac{MAC} key depend on the encryption key, we
automatically decrease the number of spurious keys and thus also reduce our
possibility for deniability.
\subsection{How Hard Is Deniability?}
\label{HardnessOfDeniability}
As suggested above, we have difficulty achieving deniability.
This is illustrated by the following equations.
Assume
\begin{equation*}
\Enc[H_E(k)]{m} = c = \Enc[H_E(k^\prime)]{m^\prime}
\end{equation*}
and \(k\neq k^\prime\), then
\begin{equation*}
\Pr\left[
\MAC[H_M(k)]{c} = \MAC[H_M(k^\prime)]{c}
\right]
\approx
\Pr\left[ H_M(k) = H_M(k^\prime) \right].
\end{equation*}
That is, our chance of lying about the key \(k\), replacing it with a key
\(k^\prime\), is reduced to finding a collision for the hash function \(H_M\).
(There is also the negligible probability of \(\MAC[x]{c} = \MAC[x^\prime]{c}\)
for \(x\neq x^\prime\) to consider.)
Furthermore, we find the key \(k^\prime\) by finding the preimage of \(H_E(
k^\prime )\).
And if the encryption system \(\Enc{}\) is a trap-door permutation, then we
will have to break that first, just to find \(H_E( k^\prime )\) before we can
attempt finding its preimage.
\section{Required Security Properties}
\label{SecurityProperties}
To be able to get deniability in our given scenario, Alice and Bob need to be
able to modify the plaintext without modifying the ciphertext.
They also need a \ac{MAC} key independent of the encryption key.
Then they can change the encryption key and the plaintext, but the ciphertext
and \ac{MAC} remains the same.
In this section we will cover the needed security properties.
\citeauthor{DeniableEncryption} gave the original formal definition of deniable
encryption in their seminal paper~\cite{DeniableEncryption}.
We will give their definition of sender-deniable encryption for shared-key
schemes here.
\begin{definition}[Shared-key sender-deniable
encryption]\label{DeniableEnc}
A protocol \(\pi\) with sender \(S\) and receiver \(R\), and with security
parameter \(n\), is a shared-key sender-deniable encryption protocol if:
\begin{description}
\item[Correctness] The probability that \(R\)'s output is different than
\(S\)'s output is negligible (as a function of \(n\)).
\item[Security] For any \(m_1, m_2\in M\) in the message-space \(M\) and
shared-keys \(k, k^\prime\in K\) chosen uniformly at random from the
key-space \(K\), we have \(\Pr[ \Enc[k]{ m_1 } = c ] \approx
\Pr[ \Enc[k^\prime]{ m_2 } = c ]\).
\item[Deniability] There exists an efficient <faking> algorithm \(\phi\)
having the following property with respect to any \(m_1, m_2\in M\).
Let \(k, r_S, r_R\) be uniformly chosen shared-key and random inputs of
\(S\) and \(R\), respectively, let \(c = \Enc[k, r_S, r_R]{ m_1 }\) and
let \((k^\prime, r_S^\prime) = \phi( m_1, k, r_S, c, m_2 )\).
Then the random variables \[
( m_2, k^\prime, r_S^\prime, c ) \text{ and }
( m_2, k, r_S, \Enc[k, r_S, r_R]{ m_2 } )
\] are distinguishable with negligible probability in the security
parameter \(n\).
\end{description}
\end{definition}
This means that given a ciphertext \(c = \Enc[k]{ m }\) and a false plaintext
\(m^\prime\), there exists a polynomial-time algorithm \(\phi\) such that
\(\phi( c, m^\prime ) = k^\prime\) yields a key \(k^\prime\) and \(m^\prime
= \Dec[k^\prime]{ c }\).
As we illustrated in \cref{HardnessOfDeniability}, there exists no such
polynomial-time algorithm \(\phi\) for \ac{OTR} or \ac{GPG}.
But one encryption system for which the algorithm \(\phi\) is trivial is the
\ac{OTP}.
\begin{definition}[One-Time Pad]\label{OTP}
Let \(M = K = {(\Z_2)}^n\).
Then let \(m\in M\) be a message in the message-space \(M\), let \(k\in K\)
be a uniformly chosen key in the key-space \(K\).
Then we define \[
\Enc[k]{ m } = m\oplus k \textand \Dec[k]{} = \Enc[k]{}.
\]
\end{definition}
\citet{ShannonSecrecy} proved that this scheme is perfectly secret.
But this requires that the key \(k\) is as long as the message \(m\).
The key must be uniformly chosen and must never be reused.
This is why this scheme is usually considered impractical.
However, we can easily see, and it is also pointed out
in~\cite{DeniableEncryption}, that the \ac{OTP} fulfils
\cref{DeniableEnc}.
We can simply define \(\phi( m_2, c ) = m_2\oplus c\) and this would yield
\(k^\prime\) such that \[
\Dec[k^\prime]{ c } = c\oplus k^\prime = c\oplus ( m_2\oplus c ) = m_2.
\]
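The faking algorithm for the \ac{OTP} really is this simple. A minimal Python sketch (our own, with illustrative names) of the scheme and of \(\phi\):

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def keygen(n: int) -> bytes:
    return secrets.token_bytes(n)      # uniformly random, one-time key

def enc(k: bytes, m: bytes) -> bytes:
    return xor(m, k)

def dec(k: bytes, c: bytes) -> bytes:
    return xor(c, k)                   # Dec = Enc for the OTP

def phi(m2: bytes, c: bytes) -> bytes:
    """Faking algorithm: derive a key that opens c to the fake plaintext m2."""
    return xor(m2, c)

m = b"attack at dawn"
k = keygen(len(m))
c = enc(k, m)

m2 = b"lovely weather"                 # the lie (same length as m)
k2 = phi(m2, c)
assert dec(k2, c) == m2                # the fake key decrypts c to the lie
```

Because the fake key is itself uniformly distributed given \(c\), Eve cannot tell \(k_2\) from the real key.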
When we use an encryption scheme for communication we also want authenticity.
\citet{AuthEncryption} treats authenticated encryption and how to create
composed authenticated encryption schemes.
We will use the \ac{EtM} composition.
This means that we will encrypt and then compute a \ac{MAC} tag on the
ciphertext.
We use the same formal definition of \ac{EtM} as in
\cite{AuthEncryption}.
\begin{definition}[Encrypt-then-MAC, EtM]\label{EtM}
Let \(\Scheme{E} = (\Keygen[E]{}, \Enc{}, \Dec{})\) be an encryption scheme
and \(\Scheme{A} = (\Keygen[A]{}, \Tag{}, \Verify{})\) be a message
authentication scheme.
We can then construct the authenticated encryption scheme \(\Scheme*{E}
= (\Keygen*{}, \Enc*{}, \Dec*{})\) as follows:
\begin{center}
\normalfont{}
\begin{minipage}[t]{0.3\textwidth}
\begin{algorithmic}
\Function{$\Keygen*{}$}{}
\State{$\Key{}\rgets \Keygen[E]{}$}
\State{$\TagKey{}\rgets \Keygen[A]{}$}
\State{\Return{$\Key{}\concat \TagKey{}$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}%
\vline%
\begin{minipage}[t]{0.33\textwidth}
\begin{algorithmic}
\Function{$\Enc*{}$}{$K, m$}
\State{$\Key{}\concat \TagKey{}\gets K$}
\State{$c\rgets \Enc[\Key{}]{ m }$}
\State{$t\gets \Tag[\TagKey{}]{ c }$}
\State{\Return{$c\concat t$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}%
\vline%
\begin{minipage}[t]{0.33\textwidth}
\begin{algorithmic}
\Function{$\Dec*{}$}{$K, C$}
\State{$\Key{}\concat \TagKey{}\gets K$}
\State{$c\concat t\gets C$}
\If{$\Verify[\TagKey{}]{ c, t }$}
\State{$m\gets \Dec[\Key{}]{ c }$}
\State{\Return{$m$}}
\EndIf{}
\State{\Return{$\bot$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}
\end{center}
\end{definition}
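The \ac{EtM} composition above, instantiated with the \ac{OTP} and an \ac{HMAC}, can be sketched in a few lines of Python (our own illustration with assumed names; note the information-theoretically independent keys):

```python
import hmac, hashlib, secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def keygen(n: int):
    # Independent encryption and MAC keys, as the deniability argument requires.
    return secrets.token_bytes(n), secrets.token_bytes(32)

def enc_etm(K, m: bytes):
    ek, ak = K
    c = xor(m, ek)                                    # deniable OTP encryption
    t = hmac.new(ak, c, hashlib.sha256).digest()      # tag over ciphertext only
    return c, t

def dec_etm(K, C):
    ek, ak = K
    c, t = C
    if not hmac.compare_digest(t, hmac.new(ak, c, hashlib.sha256).digest()):
        return None                                   # reject invalid tags
    return xor(c, ek)

K = keygen(14)
C = enc_etm(K, b"attack at dawn")
assert dec_etm(K, C) == b"attack at dawn"

# Deniability survives the composition: a faked OTP key combined with the
# SAME MAC key still passes verification and opens C to the fake plaintext.
c, t = C
fake_ek = xor(b"lovely weather", c)
assert dec_etm((fake_ek, K[1]), C) == b"lovely weather"
```

Since the tag covers only the ciphertext, lying about the encryption key does not disturb it --- this is the content of \cref{DeniableEtM} below.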
\ac{OTR}, for instance, uses a variant of \ac{EtM} composition.
It is a variant since in \ac{OTR} the \ac{MAC} key is derived from a master
key.
The results of~\cite{AuthEncryption} are proved for independent keys.
Recall that this is one problem with \ac{OTR} that we want to avoid:
the construction where the \ac{MAC} key is a witness for the correct
encryption key.
Instead of deriving the encryption key and the \ac{MAC} key by using two
different key-derivation functions on the same master key, we have to use
information-theoretically independent keys.
\citet{AuthEncryption} proved some properties about \ac{EtM}:
If the encryption scheme \(\Scheme{E}\) provides \ac{IND-CPA} and the message
authentication scheme \(\Scheme{A}\) provides \ac{SUF-CMA}, then the \ac{EtM}
scheme \(\Scheme{\overline{E}}\) provides \ac{INT-CTXT} and \ac{IND-CCA}.
Consequently, we are interested in what happens if we use a deniable encryption
scheme in \ac{EtM}.
Since this authenticates the ciphertext, and not the plaintext, it will not
interfere with our deniability.
Since the key for encryption and authentication are independent, we can lie
about one but not the other.
We summarize this in the following theorem.
\begin{theorem}\label{DeniableEtM}
If \(\Scheme{D} = (\Keygen[D]{}, \Enc{}, \Dec{}, \phi)\) is a shared-key
sender-deniable encryption scheme and \(\Scheme{A} = (\Keygen[A]{}, \Tag{},
\Verify{})\) is a message authentication scheme,
then the scheme \(\Scheme*{D}\) formed from the composition of \(\Scheme{D}\)
and \(\Scheme{A}\) as in \cref{EtM} is also a shared-key sender-deniable
encryption scheme.
\end{theorem}
\begin{proof}
\citet{AuthEncryption} proved that the resulting scheme \(\Scheme*{D}\)
inherits the security properties from the original encryption scheme
\(\Scheme{D}\).
So the security and correctness of \cref{DeniableEnc} remains.
Let \(K = k\concat \TagKey{} \rgets \Keygen*{}\).
For any message \(M\) we have \(C = c\concat t = \Enc*[K]{ M }\) and \(M
= \Dec*[K]{ C }\).
Use \(\phi\) to derive a new \(k^\prime\) as follows:
\(k^\prime\gets \phi( M, k, c, M^\prime )\), where \(M^\prime\) is
a new message such that \(M\neq M^\prime\).
Now let \(K^\prime = k^\prime\concat \TagKey{}\).
By the construction of \(\Dec*{}\) (\cref{EtM}) we will have
\(\Dec*[K^\prime]{ C } = M^\prime\).
Thus the deniability property is retained as well.
\qed{}
\end{proof}
Note that the independence of the encryption key and the \ac{MAC} key is
crucial in the above theorem.
If they are not independent, as in \ac{OTR}, then this will only work if there
exists an algorithm that can generate a new \ac{MAC} key with the property that
the \ac{MAC} algorithm generates the same tag \(t\) for the same ciphertext
\(c\) but with this new different key.
We will call a scheme composed as in \cref{DeniableEtM} a \emph{deniable
authenticated encryption} scheme.
\section{Achieving Deniability Against the Surveillance State}
\label{AchievingDeniability}
Due to the deniability requirements outlined above, the randomness used for
encryption cannot be extended by a \ac{PRNG}: if we do, then we are in the same
situation as when we were using a trap-door permutation --- we cannot
efficiently find a seed to the \ac{PRNG} which yields a stream that decrypts
the ciphertext to the desired plaintext.
Instead we generate randomness continuously and then exchange as much as needed
using the private channel.
This way we can use the everyday chance-encounters for exchanging the generated
randomness when we meet, and then use it to key a deniable authenticated
encryption scheme when physically apart.
\subsection{A Protocol}
\label{TheProtocol}
% XXX Add descriptions of security notions: IND-SFCCA, INT-SFCTXT
Alice and Bob want to communicate securely with the possibility of deniability.
They agree on using a stateful deniable authenticated encryption scheme
\(\Scheme*{D} = (\Keygen*{}, \Enc*{}, \Dec*{}, \phi)\).
The scheme \(\Scheme*{D}\) provides \ac{IND-SFCCA}, \ac{INT-SFCTXT} and
shared-key sender-deniability.
Alice and Bob start by each generating a string of random bits.
Alice generates the string \(\Key{A}\rgets \Keygen*{}\) of length
\(|\Key{A}|\).
When Alice and Bob meet, Alice sends \(\Key{A}\) over the private channel.
Thus Eve cannot see this traffic.
Later Alice wants to send a message to Bob over the public channel.
To send the message \(m\), Alice computes \(C = c\concat t\rgets
\Enc*[\Key{A}]{m}\) and sends it to Bob.
We assume the scheme \(\Scheme*{D}\) is stateful, so when Alice wants to send
her next message to Bob she simply computes \(C^\prime\rgets \Enc*[\Key{A}]{
m^\prime }\) and sends \(C^\prime\) to Bob.
(We can turn any scheme into a stateful scheme by e.g.\ adding a counter,
cf.~\cite{StatefulDecryption}.)
The protocol is unidirectional.
If Bob wants to send messages to Alice, he has to do the same set up:
first generate \(\Key{B}\rgets \Keygen*{}\), then send the key to Alice over
the private channel.
After that he can encrypt messages to Alice using \(\Enc*[\Key{B}]{\cdot}\).
% XXX Add an example of an attack against the state in bidirectional OTP
The reason we want the protocol unidirectional is to easily maintain the state
of the encryption and decryption algorithms.
The protocol, run once in each direction, is illustrated in
\cref{ProtocolOverview}.
\begin{figure}
\centering
\begin{msc}[msc keyword=]{}
\drawframe{no}
\declinst{A}{}{Alice}
\declinst{E}{}{Eve}
\declinst{B}{}{Bob}
\mess*{$K_A$}{A}{B}
\nextlevel{}
\mess*{$K_B$}{B}{A}
\nextlevel{}
\mess{$c, t$}{A}{E}
\action*{Records $A\to B\colon c, t$ to transcript}{E}
\mess{$c, t$}{E}{B}
\nextlevel[1.6]
\mess{$c^\prime, t^\prime$}{B}{E}
\action*{Records $B\to A\colon c^\prime, t^\prime$ to transcript}{E}
\mess{$c^\prime, t^\prime$}{E}{A}
\end{msc}
\caption{%
A sequence diagram illustrating the protocol.
\(K_A\) and \(K_B\) are long strings of bits sent over the private channel
(dashed lines), Eve cannot record this.
The messaging over the public channel is full lines:
\(c = \Enc[\Key{A,m}]{ m }\) and \(t = \MAC[\TagKey{A,m}]{ c }\);
\(c^\prime = \Enc[\Key{B,m^\prime}]{ m^\prime }\) and \(t^\prime
= \MAC[\TagKey{B,m^\prime}]{ c^\prime }\).
Eve records this data as it is sent over the public channel.
}\label{ProtocolOverview}
\end{figure}
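As a minimal illustration of the stateful, unidirectional use of the pad (our own Python sketch; the classes and the tag-over-counter construction are illustrative assumptions, not the paper's implementation), binding the \ac{MAC} tag to a position counter makes replayed or reordered ciphertexts fail verification:

```python
import hmac, hashlib, secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class Sender:
    def __init__(self, pad: bytes, mac_key: bytes):
        self.pad, self.mac_key, self.pos = pad, mac_key, 0
    def send(self, m: bytes):
        k = self.pad[self.pos:self.pos + len(m)]      # consume fresh pad bits
        c = xor(m, k)
        # Tag binds the ciphertext to the current state (pad position).
        t = hmac.new(self.mac_key,
                     self.pos.to_bytes(8, "big") + c, hashlib.sha256).digest()
        self.pos += len(m)
        return c, t

class Receiver:
    def __init__(self, pad: bytes, mac_key: bytes):
        self.pad, self.mac_key, self.pos = pad, mac_key, 0
    def recv(self, C):
        c, t = C
        expected = hmac.new(self.mac_key,
                            self.pos.to_bytes(8, "big") + c,
                            hashlib.sha256).digest()
        if not hmac.compare_digest(t, expected):
            return None                               # replay/reorder rejected
        k = self.pad[self.pos:self.pos + len(c)]
        self.pos += len(c)
        return xor(c, k)

pad = secrets.token_bytes(1024)   # exchanged over the private (NFC) channel
mk = secrets.token_bytes(32)
alice, bob = Sender(pad, mk), Receiver(pad, mk)

assert bob.recv(alice.send(b"hello")) == b"hello"
C = alice.send(b"world")
assert bob.recv(C) == b"world"
assert bob.recv(C) is None        # replay: state has advanced, tag fails
```

Keeping the protocol unidirectional means each party only ever advances its own send counter and the peer's receive counter, so the two states stay in lockstep.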
\subsection{The Security of the Protocol}
We will now show that the protocol just described yields negligible advantage
to the surveillance-state adversary (\cref{DEN-SS}).
However, to do this we need some more details to work with the formal
definition of the adversary.
% XXX Review how transcript is generated for the adversary
Let \(A\) be an algorithm representing Alice in the above protocol description.
\(A\) is a randomized algorithm which generates a sequence \(M
= ( m_i )_{i=1}^n\) of \(n\) messages.
\(A\) then generates a key \(\Key{A}\rgets \Keygen*{}\) by running the key
generator of the scheme.
Then \(A\) computes the sequence \(T = ( t_i\rgets\Enc*[\Key{A}]{ m_i
} )_{i=1}^n\) by encrypting each message in the sequence \(M\).
The sequence \(T\) is the transcript of the surveillance-state adversary from
\cref{DEN-SS}.
While computing \(T\), \(A\) also stores (possibly part of) the state of the
decryption algorithm \(\Dec*[\Key{A}]{\cdot}\) in the following way:
For each \(t_i\rgets\Enc*[\Key{A}]{ m_i }\) operation \(A\) will extract the
state \(K_i = \Key{i}\concat \TagKey{i}\) from \(\Dec*[\Key{A}]{\cdot}\) so
that \(\Dec*[K_i][\prime]{t_i} = m_i\).
We denote the sequence of states as \(K = ( K_i )_{i=1}^{n}\).
And then we can form the sequence \(P = M\times K\) in \cref{DEN-SS} as the
pairwise combination of \(M\) and \(K\).
The first theorem shows that this scheme yields negligible advantage to the
surveillance-state adversary.
So our deniability property holds.
% XXX Prove deniability against surveillance state
\begin{theorem}[Deniability against the surveillance state]
Given any adversary \(E_d\) against \(\Scheme*{D}\), we can construct an
adversary \(E_d^\prime\) such that if \(E_d\) wins the \(\denss\) game, then
\(E_d^\prime\) can distinguish between \[
( m_2, k^\prime, r_S^\prime, c ) \text{ and }
( m_2, k, r_S, \Enc[k, r_S, r_R]{ m_2 } )
\] of \(\Scheme{D}\) with non-negligible probability.
\end{theorem}
\begin{proof}
Assume that \(E_d\) has non-negligible advantage in the \(\denss\) game of
\cref{DEN-SS}.
Then we can construct \(E_d^\prime\) as follows.
\(E_d^\prime\) forms \(T = ( c )\) and runs \(i\rgets E_d( T )\).
Then \(E_d^\prime\) runs \(b^\prime\gets E_d( T, k )\).
If \(b^\prime = 1\), then \(E_d^\prime\) knows that \(k\) was generated using
the algorithm \(\phi\) of \(\Scheme{D}\).
So \(E_d^\prime\) can distinguish \(( m_2, k^\prime, r_S^\prime, c )\) from
\(( m_2, k, r_S, \Enc[k, r_S, r_R]{ m_2 } )\) with non-negligible
probability.
\qed{}
\end{proof}
The next theorem states that if the deniable encryption scheme \(\Scheme{D}\)
provides \ac{IND-SFCCA}, then so will the scheme \(\Scheme*{D}\).
% XXX Prove IND-SFCCA security
\begin{theorem}[Chosen-ciphertext security]
Given any adversary \(E_p\) against \(\Scheme*{D}\), we can construct an
adversary \(E_p^\prime\) such that
\begin{equation}
\Adv{\indsfcca}{\Scheme*{D}}[E_p]
\leq \Adv{\indsfcca}{\Scheme{D}}[E_p^\prime]
\end{equation}
and \(E_p^\prime\) makes the same number of queries as \(E_p\) does.
\end{theorem}
\begin{proof}[sketch]
We use a similar construction as in~\cite{AuthEncryption,StatefulDecryption},
i.e.\ construct \(E_p^\prime\) by letting \(E_p^\prime\) generate and add the
message authentication tags itself.
Then \(E_p^\prime\) will win \(\indsfcca_{\Scheme{D}}\) when \(E_p\) wins
\(\indsfcca_{\Scheme*{D}}\).
%\qed{}
\end{proof}
\section{Implementation and Evaluation}
\label{Implementation}
% XXX Rewrite the implementation and evaluation section
% - Integrate the OTP specific details
% - Integrate text moved from other sections
% XXX Prove implementation security properties
% - OTP is IND-SFCCA
% - We can construct a INT-SFCTXT from SUF-CMA
We want to make a practical implementation of the protocol described above.
To do this we need an encryption scheme which is deniable (\cref{DeniableEnc})
and provides \ac{IND-SFCCA}.
As pointed out above, \ac{OTP} provides both.
We will formulate this more rigorously below.
We also need a message authentication scheme providing \ac{SUF-CMA}, which we
will use to achieve \ac{INT-SFCTXT}.
But first we need to describe the stateful use of the \ac{OTP} and \acp{MAC}.
\begin{definition}[Stateful \acs{OTP}]\label{StatefulOTP}
Let \(\Keygen{}\) be an algorithm which generates a key \(\Key{}\) consisting
of a string of \(|\Key{}|\) uniformly random bits.
Then we define the encryption and decryption functions for a message \(m\)
and a ciphertext \(c\), respectively, as follows:
\begin{center}
\normalfont{}
\begin{minipage}[t]{0.4\textwidth}
\begin{algorithmic}
\Function{$\Enc{}$}{$\Key{}, m$}
\State{State $s$ initialized to 0}
\If{$s+|m| > |\Key{}|$}
\State{\Return{$\bot$}}
\EndIf{}
\State{$\Key{m}\gets \Key{}[s, s+|m|]$}
\State{$s\gets s+|m|$}
\State{$c\gets m\oplus \Key{m}$}
\State{\Return{$c$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}%
\vline%
\begin{minipage}[t]{0.4\textwidth}
\begin{algorithmic}
\Function{$\Dec{}$}{$\Key{}, c$}
\State{State $s$ initialized to 0}
\If{$s+|c| > |\Key{}|$}
\State{\Return{$\bot$}}
\EndIf{}
\State{$\Key{c}\gets \Key{}[s, s+|c|]$}
\State{$s\gets s+|c|$}
\State{$m\gets c\oplus \Key{c}$}
\State{\Return{$m$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}
\end{center}
We call the scheme \(\Scheme{E} = (\Keygen{}, \Enc{}, \Dec{})\)
a \emph{stateful \ac{OTP}} encryption scheme.
\end{definition}
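To make the definition concrete, the following is a minimal Python sketch of the stateful \ac{OTP} (an illustration only, not the Android implementation): both parties consume the shared pad left to right and never reuse key bits.

```python
# Illustrative sketch of the stateful OTP scheme: each party keeps an
# offset s into the shared pad and never reuses key bytes.
import secrets


class StatefulOTP:
    def __init__(self, key: bytes):
        self.key = key  # pre-shared pad of uniformly random bytes (Keygen)
        self.s = 0      # state: offset of the first unused key byte

    def _next_pad(self, n: int):
        # Return None (standing in for bot) when the pad is exhausted.
        if self.s + n > len(self.key):
            return None
        pad = self.key[self.s:self.s + n]
        self.s += n
        return pad

    def enc(self, m: bytes):
        pad = self._next_pad(len(m))
        if pad is None:
            return None
        return bytes(a ^ b for a, b in zip(m, pad))

    # XOR is its own inverse; decryption advances its own copy of the state.
    dec = enc


key = secrets.token_bytes(64)
alice, bob = StatefulOTP(key), StatefulOTP(key)
c1 = alice.enc(b"hello")
assert bob.dec(c1) == b"hello"
c2 = alice.enc(b"world")
assert bob.dec(c2) == b"world"  # fresh key bits for every message
```

Note that, as in the definition, the state only ever moves forward, so as long as encryption and decryption process the same sequence of messages the two parties stay in sync.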
The following theorem states that this scheme provides \ac{IND-SFCCA}.
\begin{theorem}[\acs{OTP} implies \acs{IND-SFCCA}]
If \(\Scheme{E} = (\Keygen{}, \Enc{}, \Dec{})\) is the stateful \ac{OTP}
encryption scheme (\cref{StatefulOTP}),
then \(\Adv{\indsfcca}{\Scheme{E}}[ A ]\) is negligible for any adversary
\(A\).
\end{theorem}
\begin{proof}[sketch]
By construction each encryption is perfectly secure.
As such the adversary's advantage is no better than random guessing.
\end{proof}
As mentioned above, we also need to have a stateful mechanism for message
authentication.
For this purpose, we now describe a stateful message authentication algorithm
based solely on a \ac{MAC} algorithm with \ac{SUF-CMA}.
\begin{definition}[Stateful \acsp{MAC}]
Let \(\Scheme{A} = (\Keygen{}, \Tag{}, \Verify{})\) be a message
authentication scheme yielding \ac{SUF-CMA} and using random bit-strings of
length \(l_{\Scheme{A}}\) as keys.
Let \(\Keygen*{}\) be an algorithm which generates a key \(\Key{}\)
consisting of a string of \(|\Key{}|\) uniformly random bits.
Then we define the tag and verification functions for a ciphertext \(c\) and
a tag \(t\), respectively, as follows:
\begin{center}
\normalfont{}
\begin{minipage}[t]{0.4\textwidth}
\begin{algorithmic}
\Function{$\Tag*{}$}{$\Key{}, c$}
\State{State $s$ initialized to 0}
\If{$s+l_{\Scheme{A}} > |\Key{}|$}
\State{\Return{$\bot$}}
\EndIf{}
\State{$\Key{c}\gets \Key{}[s, s+l_{\Scheme{A}}]$}
\State{$s\gets s+l_{\Scheme{A}}$}
\State{$t\gets \Tag[\Key{c}]{ c }$}
\State{\Return{$t$}}
\EndFunction{}
\end{algorithmic}
\end{minipage}%
\vline%
\begin{minipage}[t]{0.4\textwidth}
\begin{algorithmic}
\Function{$\Verify*{}$}{$\Key{}, c, t$}
\State{State $s$ initialized to 0}
\If{$s+l_{\Scheme{A}} > |\Key{}|$}
\State{\Return{$\bot$}}
\EndIf{}
\State{$\Key{c}\gets \Key{}[s, s+l_{\Scheme{A}}]$}
\If{$\Verify[\Key{c}]{ c, t } = 1$}
\State{$s\gets s+l_{\Scheme{A}}$}
\State{\Return{1}}
\EndIf{}
\State{\Return{0}}
\EndFunction{}
\end{algorithmic}
\end{minipage}
\end{center}
We call the scheme \(\Scheme*{A} = (\Keygen*{}, \Tag*{}, \Verify*{})\)
a \emph{stateful message authentication} scheme.
\end{definition}
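The construction can be sketched in Python as follows; HMAC-SHA256 stands in for the \ac{SUF-CMA} scheme \(\Scheme{A}\) (an assumption for illustration, not a prescribed primitive).

```python
# Hedged sketch of the stateful MAC construction.  HMAC-SHA256 is used as
# a stand-in SUF-CMA scheme; per-message keys are cut from a random pool.
import hashlib
import hmac
import secrets

L_A = 32  # l_A: per-message MAC key length, in bytes


class StatefulMAC:
    def __init__(self, key: bytes):
        self.key = key  # pool of uniformly random bytes from Keygen*
        self.s = 0      # state: offset of the first unused key byte

    def tag(self, c: bytes):
        if self.s + L_A > len(self.key):
            return None                    # bot: key pool exhausted
        k = self.key[self.s:self.s + L_A]
        self.s += L_A                      # tagging always advances
        return hmac.new(k, c, hashlib.sha256).digest()

    def verify(self, c: bytes, t: bytes) -> bool:
        if self.s + L_A > len(self.key):
            return False
        k = self.key[self.s:self.s + L_A]
        expected = hmac.new(k, c, hashlib.sha256).digest()
        if hmac.compare_digest(expected, t):
            self.s += L_A                  # advance only on success
            return True
        return False                       # failure leaves the state alone


pool = secrets.token_bytes(3 * L_A)
sender, receiver = StatefulMAC(pool), StatefulMAC(pool)
t = sender.tag(b"ciphertext-1")
assert receiver.verify(b"forged", t) is False  # rejected, state unchanged
assert receiver.verify(b"ciphertext-1", t) is True  # still verifies
```

The last two assertions illustrate the design choice discussed next: a rejected forgery must not advance the verifier's state, otherwise the genuine ciphertext that follows would be verified against the wrong key.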
Note that we only update the state if the verification is successful.
We do not want to update the state in the case of an attack, since that might
bring the state of the verifying function out-of-sync with the state of the
tagging function~\cite{StatefulDecryption}.
(Similarly for \(\Dec*{}\) in \cref{EtM}: we do not want to bring \(\Enc{}\)
and \(\Dec{}\) out-of-sync.)
We now show that the stateful message authentication mechanism provides
\ac{INT-SFCTXT}.
\begin{theorem}[Stateful \acsp{MAC} implies \acs{INT-SFCTXT}]
If the message authentication scheme \(\Scheme*{A} = (\Keygen*{}, \Tag*{},
\Verify*{})\) is a stateful message authentication scheme,
then \(\Adv{\intsfctxt}{\Scheme*{A}}[I]\) is negligible for all adversaries
\(I\).
\end{theorem}
\begin{proof}[sketch]
By construction each tag will use a new key.
As such the adversary's advantage is at best to randomly guess the next key.
\end{proof}
The private channel of the protocol is implemented using \ac{NFC} on
smartphones:
Alice and Bob exchange the generated keys over \ac{NFC}.
From a user perspective, putting two phones together ``charges'' the deniable
encryption tool.
This is probably a good metaphor to build on, since it draws on the mental
model of a battery, with which users are already familiar:
when running low on randomness, fewer messages should be exchanged until
another physical meeting can be arranged to ``charge'' the tool again.
We have developed an app\footnote{%
The source code is available at URL
\url{https://github.com/MKjellqvist/OTPNFCTransfer/}.
} for Android devices which implements the above ideas.
It generates randomness continuously in the background to build up a pool of
randomness.
It can also exchange this randomness with another phone over \ac{NFC}.
\subsection{The Amount of Randomness Needed}
\label{NeededRandomness}
Since we use the \ac{OTP}, we need as much key material for encryption as we
have plaintext.
We need some additional key-material for the \acp{MAC}, e.g.~128--256 bits
per sent message.
Thus we can estimate the total amount of randomness needed by estimating the
exchange rate of plaintext.
To do this we analyse the Enron email dataset\footnote{%
The source code for the data analysis described below is available at URL
\url{https://github.com/dbosk/mailstat/}.
The Enron dataset is available from URL
\url{https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz}.
}.
We are interested in personal communication, i.e.~we are not interested in
newsletters and the like.
To filter out the newsletter category of messages, we rely on emails found in
the users' ``sent'' directories, since these are emails sent by real users.
%Since we are using the \ac{OTP}, we also use key material for the replies.
%We thus also include the received replies to the sent emails.
%The rationale for this is that received replies are not necessarily from people
%within the Enron company, but the emails are written by real users and should
%thus be included to give us more accurate data.
Since this dataset contains a mix of corporate and private emails, and is
fairly small, it is hard to draw any general conclusions from it.
So the Enron dataset is just one example.
Another dataset, covering communication over other media, e.g.~text messages
rather than email, would probably show different user behaviour and thus yield
different numbers.
But our main goal is to get an estimate of user communication to see whether
our scheme is completely infeasible or not, and we argue that this dataset lets
us reach that goal.
% XXX Fix the numbers
% - Compute the message frequency
% - Use correct number of significant digits
\begin{pycode}[random]
import math
import sqlite3
import pathlib
import sys
import decimal
sys.path.insert( 0, "mailstat" )
import mailstat
#import libsci
metadata = sqlite3.connect( "enron-sent.sqlite3" )
prectime = decimal.Decimal( "0.1" )
precdata = decimal.Decimal( "1000" )
mean_msg_size, stddev_msg_size = \
mailstat.mean_message_size( metadata )
mean_msg_size = mean_msg_size.quantize( precdata )
stddev_msg_size = stddev_msg_size.quantize( precdata )
mean_msg_freq, stddev_msg_freq = mailstat.mean_message_freq( metadata )
mean_msg_freq *= 60*60*24
mean_msg_freq = mean_msg_freq.quantize( precdata )
stddev_msg_freq *= 60*60*24
stddev_msg_freq = stddev_msg_freq.quantize( precdata )
mean_contacts, stddev_contacts = \
mailstat.mean_number_of_contacts( metadata )
mean_contacts = mean_contacts.quantize( precdata )
stddev_contacts = stddev_contacts.quantize( precdata )
data_per_day = ( mean_msg_size + stddev_msg_size ) * \
( mean_msg_freq + stddev_msg_freq )
\end{pycode}
In the Enron dataset, we found that the average message was
\(\unit{\py[random]{mean_msg_size}}{\byte}\)
excluding any headers and attachments.
The standard deviation was
\(\unit{\py[random]{stddev_msg_size}}{\byte}\).
The large standard deviation can probably be explained by the medium being
email:
if a conversation requires a few rounds, then the previous messages accumulate
in the body of the email as quoted history.
We also found that the average user communicates with
\(\py[random]{mean_contacts}\)
other users.
The standard deviation was
\(\py[random]{stddev_contacts}\).
This means that a user has approximately
\(\py[random]{(mean_contacts
+ stddev_contacts).quantize(decimal.Decimal(10))}\)
users to exchange keys with.
If a user sends
\(\py[random]{mean_msg_freq}\)
messages per day (standard deviation:
\(\py[random]{stddev_msg_freq}\)%
), then we need on average less than
\(\unit{\py[random]{(data_per_day/1024).quantize( 10 )}}{\kibi\byte}\)
per day.
This means that we need approximately
\(\unit{\py[random]{(data_per_day*365/1024/1024).quantize( precdata
)}}{\mebi\byte}\) to store the key-material of one year.
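The estimate above is simply a product of message size, message frequency and per-message \ac{MAC}-key length. The following sketch redoes the arithmetic with illustrative placeholder values (not the Enron figures reported above):

```python
# Back-of-envelope sketch of the key-material estimate.  The message size
# and frequency are illustrative placeholders, not the measured values.
MSG_BYTES = 2 * 1024   # assumed average plaintext per message
MSGS_PER_DAY = 10      # assumed messages sent per day
MAC_KEY_BYTES = 32     # 256-bit MAC key per message

per_day = MSGS_PER_DAY * (MSG_BYTES + MAC_KEY_BYTES)
per_year = 365 * per_day
print(f"{per_day / 1024:.1f} KiB/day, {per_year / 2**20:.1f} MiB/year")
```

With these placeholder values the pool stays in the tens of kibibytes per day and the single-digit mebibytes per year, comfortably within what a smartphone can store and transfer.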
We use Android's \texttt{SecureRandom} to generate our randomness.
This is the only supported way to generate randomness on the Android platform,
and it allows us to generate sufficient amounts of random data.
Some research~\cite{AndroidLowEntropyMyth,JavaRandomness} suggests that
\texttt{SecureRandom} under certain circumstances uses a low-entropy seed.
However, the documentation states that \texttt{SecureRandom} can be relied
upon for cryptographic purposes.
Given these contradictory statements, the security of \texttt{SecureRandom}
for use with the \ac{OTP} must be investigated further.
\subsection{The Number of Meetings and Transfer Time}
\label{NumberOfMeetings}
From the above analysis, we know the average amount of data communicated
between users per day.
We also know that the \ac{NFC} protocol can achieve a transmission rate of up
to \unit{424}{\kilo\bit\per\second}~\cite{NFCController}.
Considering this, we can see that even if a user sends \emph{ten times} the
average amount, the time required for key exchanges is still on the order of
tens of seconds per day
(or, equivalently, minutes per week and half-hours per year).
This time is divided among the contacts with whom the user communicates;
more frequently communicating contacts will require a larger share of it.
The times provided do not include the set-up of the \ac{NFC} radio channel;
only actual transmission is considered.
The set-up phase takes about \unit{5}{\second} on the tested devices.
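As a rough sanity check of this claim, the following sketch computes the transmission time at the \unit{424}{\kilo\bit\per\second} rate; the assumed daily key-material volume is an illustrative placeholder:

```python
# Sanity check of the transfer-time claim.  The daily volume of key
# material is an illustrative assumption, not a measured figure.
NFC_BITS_PER_S = 424_000       # NFC transmission rate, bits per second
daily_bytes = 50 * 1024        # assumed ~50 KiB of key material per day
ten_x = 10 * daily_bytes       # ten times the average usage

seconds = ten_x * 8 / NFC_BITS_PER_S
print(f"{seconds:.1f} s of transmission per day at 10x usage")
```

Even at ten times the assumed usage the raw transmission time stays under a minute per day, which is consistent with the set-up phase of about \unit{5}{\second} dominating short exchanges.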
\subsection{The Battery Consumption}
\label{BatteryConsumption}
To estimate the effects on battery consumption, we note a typical RF-active
current rating of \unit{60}{\milli\ampere} for the \ac{NFC}
chip~\cite{NFCController}.
The battery effect of this is negligible, on the order of
\unit{2}{\text{\textperthousand}} of the battery charge at the considered
usages.
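This order-of-magnitude estimate can be reproduced with a short calculation; the battery capacity and RF-active duration below are assumptions for illustration, not measurements:

```python
# Sanity check of the battery estimate.  The 60 mA RF-active current is
# the chip's rated figure; capacity and duration are assumed values.
NFC_MA = 60          # RF-active current of the NFC chip, mA
BATTERY_MAH = 2500   # assumed typical smartphone battery capacity
transfer_s = 60      # assume a generous minute of RF-active time per day

used_mah = NFC_MA * transfer_s / 3600
permille = 1000 * used_mah / BATTERY_MAH
print(f"{permille:.2f} permille of the battery charge per day")
```

Under these assumptions a full minute of RF-active time drains well under \unit{2}{\text{\textperthousand}} of the charge, so the claim holds with margin to spare.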
To measure these effects in practice, we first built a baseline using the
Android system's built-in power-consumption estimates.
We used one phone as a reference and two others running the app implementing
our scheme.
For the component generating the randomness, tests were performed where we
generated the annual demand of key-material.
This provided no indication of battery drain.
The processor load was measured at \unit{2}{\%} and the input-output load was
measured at \unit{15}{\%}.
\subsection{Some Extensions}
A problem that can occur is that Alice and Bob run out of key-material before
they can meet again.
One way to handle this is for them to communicate less as they approach the
end of their random bit strings, and to use the last of the randomness to
schedule a new meeting.
An alternative way to handle this problem is to switch to another scheme, with
the knowledge that it is no longer deniable.
In a similar fashion, Alice and Bob might not need deniability for all their
communications.
Thus they can switch to e.g.~\ac{OTR} or TextSecure when they do not need
deniability against Eve, and switch back when they do.
This strategy would use less randomness, so they would need to meet less
often.
As an extension to the protocol (\cref{TheProtocol}), Alice can do as in
\ac{OTR} and publish the \ac{MAC} key when she receives a reply from Bob.
The effect is that the \ac{MAC} key is recorded in Eve's transcript, which
might lower the trust in that transcript.
\section{Conclusions}
\label{Conclusions}
We set out to design a scheme which provides users with deniability in
a stronger adversary model.
Provided that we can generate random data with high-enough entropy, then our
protocol provides
\begin{inparablank}
\item perfect secrecy,
\item authenticated and
\item deniable encryption.