Brute force searching, the typical set and Guesswork
Mark M. Christiansen and Ken R. Duffy
Hamilton Institute
National University of Ireland, Maynooth
Email: {mark.christiansen, ken.duffy}@nuim.ie
Flávio du Pin Calmon and Muriel Médard
Research Laboratory of Electronics
Massachusetts Institute of Technology
Email: {flavio, medard}@mit.edu
Abstract: Consider the situation where a word is chosen probabilistically from a finite list. If an attacker knows the list and can inquire about each word in turn, then selecting the word via the uniform distribution maximizes the attacker's difficulty, its Guesswork, in identifying the chosen word. It is tempting to use this property in cryptanalysis of computationally secure ciphers by assuming coded words are drawn from a source's typical set and so, for all intents and purposes, uniformly distributed within it. By applying recent results on Guesswork, for i.i.d. sources, it is this equipartition ansatz that we investigate here. In particular, we demonstrate that the expected Guesswork for a source conditioned to create words in the typical set grows, with word length, at a lower exponential rate than that of the uniform approximation, suggesting use of the approximation is ill-advised.
I. INTRODUCTION
Consider the problem of identifying the value of a discrete random variable by only asking questions of the sort: is its value X? That this is a time-consuming task is a cornerstone of computationally secure ciphers [1]. It is tempting to appeal to the Asymptotic Equipartition Property (AEP) [2], and the resulting assignment of code words only to elements of the typical set of the source, to justify restriction to consideration of a uniform source, e.g. [3], [4], [5]. This assumed uniformity has many desirable properties, including maximum obfuscation and difficulty for the inquisitor, e.g. [6]. In typical set coding it is necessary to generate codes for words whose logarithmic probability is within a small distance of the word length times the specific Shannon entropy. As a result, while all these words have near-equal likelihood, the distribution is not precisely uniform. It is the consequence of this lack of perfect uniformity that we investigate here by proving that results on Guesswork [7], [8], [9], [10], [11] extend to this setting. We establish that for source words originally constructed from an i.i.d. sequence of letters, as a function of word length it is exponentially easier to guess a word conditioned to be in the source's typical set in comparison to the corresponding equipartition approximation. This raises questions about the wisdom of appealing to the AEP to justify sole consideration of the uniform distributions for cryptanalysis and provides alternate results in their place.
II. THE TYPICAL SET AND GUESSWORK
Let $A = \{0, \ldots, m-1\}$ be a finite alphabet and consider a stochastic sequence of words, $\{W_k\}$, where $W_k$ is a word of length $k$ taking values in $A^k$. The process $\{W_k\}$ has specific Shannon entropy
\[ H_W := -\lim_{k\to\infty} \frac{1}{k} \sum_{w \in A^k} P(W_k = w) \log P(W_k = w), \]
and we shall take all logs to base $e$. For $\epsilon > 0$, the typical set of words of length $k$ is
\[ T_k^\epsilon := \left\{ w \in A^k : e^{-k(H_W+\epsilon)} \le P(W_k = w) \le e^{-k(H_W-\epsilon)} \right\}. \]
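For most reasonable sources [2], $P(W_k \in T_k^\epsilon) > 0$ for all $k$ sufficiently large and typical set encoding results in a new source of words of length $k$, $W_k^\epsilon$, with statistics
\[ P(W_k^\epsilon = w) = \begin{cases} \dfrac{P(W_k = w)}{P(W_k \in T_k^\epsilon)} & \text{if } w \in T_k^\epsilon, \\ 0 & \text{if } w \notin T_k^\epsilon. \end{cases} \tag{1} \]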
Appealing to the AEP, these distributions are often substituted for their more readily manipulated uniformly random counterpart, $U_k$,
\[ P(U_k = w) := \begin{cases} \dfrac{1}{|T_k^\epsilon|} & \text{if } w \in T_k^\epsilon, \\ 0 & \text{if } w \notin T_k^\epsilon, \end{cases} \tag{2} \]
where $|T_k^\epsilon|$ is the number of elements in $T_k^\epsilon$. While the distribution of $W_k^\epsilon$ is near-uniform for large $k$, it is not perfectly uniform unless the original $W_k$ was uniformly distributed on a subset of $A^k$. Is a word selected using the distribution of $W_k^\epsilon$ easier to guess than if it was selected uniformly, $U_k$?

Given knowledge of $A^k$, the source statistics of words, say those of $W_k$, and an oracle against which a word can be tested one at a time, an attacker's optimal strategy is to generate a partial-order of the words from most likely to least likely and guess them in turn [12], [7]. That is, the attacker generates a function $G : A^k \to \{1, \ldots, m^k\}$ such that $G(w') < G(w)$ if $P(W_k = w') > P(W_k = w)$. The integer $G(w)$ is the number of guesses until word $w$ is guessed, its Guesswork.
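The optimal guessing strategy described above can be made concrete for a toy case. The following is a minimal Python sketch, not code from the paper: it enumerates every word of length k built from i.i.d. letters, ranks the words from most to least likely, and averages the rank. The function names (`word_probs`, `guesswork`, `expected_guesswork`) and the Bernoulli(0.8, 0.2) source are illustrative assumptions, and ties in probability are broken arbitrarily by the sort.

```python
import itertools
import math

def word_probs(p, k):
    """Map every word in A^k to its probability when letters are i.i.d. with law p."""
    letters = range(len(p))
    return {w: math.prod(p[a] for a in w)
            for w in itertools.product(letters, repeat=k)}

def guesswork(probs):
    """Assign rank 1 to the most likely word, 2 to the next, and so on."""
    order = sorted(probs, key=probs.get, reverse=True)
    return {w: rank for rank, w in enumerate(order, start=1)}

def expected_guesswork(probs):
    """Average number of guesses under the optimal (most-likely-first) order."""
    G = guesswork(probs)
    return sum(probs[w] * G[w] for w in probs)

if __name__ == "__main__":
    biased = word_probs((0.8, 0.2), 3)    # hypothetical biased source
    uniform = word_probs((0.5, 0.5), 3)   # uniform source on the same words
    print(expected_guesswork(biased), expected_guesswork(uniform))
```

On words of length 3 the biased source needs fewer guesses on average than the uniform one, which is the small-scale version of the effect studied here. Exhaustive enumeration costs m^k, so this only serves as a sanity check for short words.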
arXiv:1301.6356v3 [cs.IT] 13 May 2013

For fixed $k$ it is shown in [12] that the Shannon entropy of the underlying distribution bears little relation to the expected Guesswork, $E(G(W_k))$, the average number of guesses required to guess a word chosen with distribution $W_k$ using the optimal strategy. In a series of subsequent papers [7], [8], [9], [10], under ever less restrictive stochastic assumptions from words made up of i.i.d. letters to Markovian letters to sofic shifts, an asymptotic relationship as word length grows between scaled moments of the Guesswork and specific Rényi entropy was identified:
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)^\alpha) = \alpha R_W\!\left(\frac{1}{1+\alpha}\right), \tag{3} \]
for $\alpha > -1$, where $R_W(\beta)$ is the specific Rényi entropy for the process $\{W_k\}$ with parameter $\beta > 0$,
\[ R_W(\beta) := \lim_{k\to\infty} \frac{1}{k} \frac{1}{1-\beta} \log \left( \sum_{w \in A^k} P(W_k = w)^\beta \right). \]
These results have recently [11] been built on to prove that $\{k^{-1} \log G(W_k)\}$ satisfies a Large Deviation Principle (LDP), e.g. [13]. Define the scaled Cumulant Generating Function (sCGF) of $\{k^{-1} \log G(W_k)\}$ by
\[ \Lambda_W(\alpha) := \lim_{k\to\infty} \frac{1}{k} \log E\left(e^{\alpha \log G(W_k)}\right) \quad \text{for } \alpha \in \mathbb{R} \]
and make the following two assumptions.
• Assumption 1: For $\alpha > -1$, the sCGF $\Lambda_W(\alpha)$ exists, is equal to $\alpha R_W(1/(1+\alpha))$ and has a continuous derivative in that range.
• Assumption 2: The limit
\[ g_W := \lim_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1) \tag{4} \]
exists in $(-\infty, 0]$.
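The moment scaling in equation (3), which Assumption 1 builds on, can be checked numerically for i.i.d. letters, where the specific Rényi entropy reduces to the single-letter expression. The sketch below is illustrative only: the function names and the Bernoulli(0.8, 0.2) source are assumptions, and the exact expected Guesswork is computed for a binary alphabet by grouping words by their number of ones, which is the optimal order when P(0) > 1/2.

```python
import math

def renyi_rate(p, beta):
    """Specific Renyi entropy of i.i.d. letters: log(sum_a p_a^beta) / (1 - beta), beta != 1."""
    return math.log(sum(pa ** beta for pa in p)) / (1.0 - beta)

def moment_rate(p, alpha):
    """Right-hand side of equation (3): alpha * R(1/(1+alpha)), for alpha > -1, alpha != 0."""
    return alpha * renyi_rate(p, 1.0 / (1.0 + alpha))

def binary_expected_guesswork(p0, k):
    """Exact E[G(W_k)] for i.i.d. bits with P(0) = p0 > 1/2."""
    total, seen = 0.0, 0
    for j in range(k + 1):                       # words with j ones; probability falls as j grows
        n = math.comb(k, j)                      # number of such words
        rank_sum = n * seen + n * (n + 1) // 2   # sum of the ranks seen+1, ..., seen+n
        total += (p0 ** (k - j)) * ((1.0 - p0) ** j) * rank_sum
        seen += n
    return total

if __name__ == "__main__":
    p0, k = 0.8, 200
    print(math.log(binary_expected_guesswork(p0, k)) / k, moment_rate((p0, 1 - p0), 1.0))
```

For finite k the empirical rate sits slightly below the limit, with a gap that shrinks like log(k)/k, in line with the non-asymptotic bounds of [7].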
Should Assumptions 1 and 2 hold, Theorem 3 of [11] establishes that $\Lambda_W(\alpha) = g_W$ for all $\alpha \le -1$ and that the sequence $\{k^{-1} \log G(W_k)\}$ satisfies a LDP with a rate function given by the Legendre-Fenchel transform of the sCGF, $\Lambda_W^*(x) := \sup_{\alpha \in \mathbb{R}} \{x\alpha - \Lambda_W(\alpha)\}$. Assumption 1 is motivated by equation (3), while Assumption 2 is a regularity condition on the probability of the most likely word. With
\[ \gamma_W := \lim_{\alpha \downarrow -1} \frac{d}{d\alpha} \Lambda_W(\alpha), \tag{5} \]
where the order of the size of the set of maximum probability words of $W_k$ is $\exp(k\gamma_W)$ [11], $\Lambda_W^*(x)$ can be identified as
\[ \Lambda_W^*(x) = \begin{cases} -x - g_W & \text{if } x \in [0, \gamma_W], \\ \sup_{\alpha \in \mathbb{R}} \{x\alpha - \Lambda_W(\alpha)\} & \text{if } x \in (\gamma_W, \log(m)], \\ +\infty & \text{if } x \notin [0, \log(m)]. \end{cases} \tag{6} \]
Corollary 5 of [11] uses this LDP to prove a result suggested in [14], [15], that
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k))) = H_W, \tag{7} \]
making clear that the specific Shannon entropy determines the expectation of the logarithm of the number of guesses to guess the word $W_k$. The growth rate of the expected Guesswork is a distinct quantity whose scaling rules can be determined directly from the sCGF in equation (3),
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)) = \Lambda_W(1). \]
From these expressions and Jensen's inequality, $E(\log G(W_k)) \le \log E(G(W_k))$, it is clear that the growth rate of the expected Guesswork is at least $H_W$. Finally, as a corollary to the LDP, [11] provides the following approximation to the Guesswork distribution for large $k$:
\[ P(G(W_k) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_W^*(k^{-1} \log n)\right) \tag{8} \]
for $n \in \{1, \ldots, m^k\}$. Thus to approximate the Guesswork distribution, it is sufficient to know the specific Rényi entropy of the source and the decay-rate of the likelihood of the sequence of most likely words.
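To illustrate how the approximation in equation (8) can be evaluated, the sketch below computes the rate function numerically for a source with i.i.d. letters, using the closed-form sCGF for alpha > -1 and the constant value log(max_a p_a) for alpha <= -1. The grid search over alpha, its range and step, and all function names are assumptions made for this sketch rather than anything prescribed by [11].

```python
import math

def scgf(alpha, p):
    """sCGF for i.i.d. letters: (1+alpha)*log(sum_a p_a^(1/(1+alpha))) for alpha > -1,
    and log(max_a p_a), the decay rate of the most likely word, for alpha <= -1."""
    if alpha <= -1.0:
        return math.log(max(p))
    t = 1.0 / (1.0 + alpha)
    logs = [t * math.log(pa) for pa in p]
    top = max(logs)  # log-sum-exp keeps the evaluation stable near alpha = -1
    return (1.0 + alpha) * (top + math.log(sum(math.exp(v - top) for v in logs)))

def rate_function(x, p, alpha_max=30.0, step=1e-3):
    """Grid approximation to sup_alpha {x*alpha - scgf(alpha)} over [-1, alpha_max]."""
    n = int(round((alpha_max + 1.0) / step))
    return max(x * a - scgf(a, p) for a in (-1.0 + i * step for i in range(n + 1)))

def guesswork_pmf_approx(n, k, p):
    """Equation (8): P(G = n) is approximately exp(-k * rate(log(n)/k)) / n."""
    return math.exp(-k * rate_function(math.log(n) / k, p)) / n

if __name__ == "__main__":
    p = (0.8, 0.2)
    H = -sum(pa * math.log(pa) for pa in p)
    print(rate_function(H, p))               # close to 0: the rate function vanishes at the entropy
    print(guesswork_pmf_approx(1, 10, p))    # close to 0.8**10, the most likely word
```

For alpha below -1 the objective is linear in alpha and, for x >= 0, maximized at alpha = -1, so restricting the grid to start at -1 loses nothing on the range of interest.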
Here we show that if $\{W_k\}$ is constructed from i.i.d. letters, then both of the processes $\{U_k\}$ and $\{W_k^\epsilon\}$ also satisfy Assumptions 1 and 2 so that, with the appropriate rate functions, the approximation in equation (8) can be used with $U_k$ or $W_k^\epsilon$ in lieu of $W_k$. This enables us to compare the Guesswork distribution for typical set encoded words with their assumed uniform counterpart. Even in the simple binary alphabet case we establish that, apart from edge cases, a word chosen via $W_k^\epsilon$ is exponentially easier in $k$ to guess on average than one chosen via $U_k$.
III. STATEMENT OF MAIN RESULTS
Assume that the words $\{W_k\}$ are made of i.i.d. letters, defining $p = (p_0, \ldots, p_{m-1})$ by $p_a = P(W_1 = a)$. We shall employ the following short-hand: $h(l) := -\sum_a l_a \log l_a$ for $l = (l_0, \ldots, l_{m-1}) \in [0,1]^m$, $l_a \ge 0$, $\sum_a l_a = 1$, so that $H_W = h(p)$, and $D(l\|p) := -\sum_a l_a \log(p_a/l_a)$. Furthermore, define $l^- \in [0,1]^m$ and $l^+ \in [0,1]^m$ by
\[ l^- \in \arg\max_l \{h(l) : h(l) + D(l\|p) - \epsilon = h(p)\}, \tag{9} \]
\[ l^+ \in \arg\max_l \{h(l) : h(l) + D(l\|p) + \epsilon = h(p)\}, \tag{10} \]
should they exist. For $\alpha > -1$, also define $l^W(\alpha)$ and $\eta(\alpha)$ by
\[ l^W_a(\alpha) := \frac{p_a^{1/(1+\alpha)}}{\sum_{b \in A} p_b^{1/(1+\alpha)}} \quad \text{for all } a \in A \text{ and} \tag{11} \]
\[ \eta(\alpha) := -\sum_a l^W_a(\alpha) \log p_a = -\frac{\sum_{a \in A} p_a^{1/(1+\alpha)} \log p_a}{\sum_{b \in A} p_b^{1/(1+\alpha)}}. \tag{12} \]
Assume that $h(p) + \epsilon \le \log(m)$. If this is not the case, $\log(m)$ should be substituted in place of $h(l^-)$ for the $\{U_k\}$ results. Proofs of the following are deferred to the Appendix.
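The short-hand quantities above are straightforward to compute. The following is a small Python sketch under the stated definitions, with function names (`h`, `D`, `tilted`, `eta`) chosen here for illustration: `tilted` implements the distribution in equation (11) and `eta` the quantity in equation (12). It also makes visible the identity h(l) + D(l||p) = -sum_a l_a log p_a, which is what turns the constraints in (9) and (10) into linear constraints on l.

```python
import math

def h(l):
    """Shannon entropy (nats) of a probability vector l."""
    return -sum(x * math.log(x) for x in l if x > 0)

def D(l, p):
    """Relative entropy D(l || p) in nats; assumes p_a > 0 wherever l_a > 0."""
    return sum(x * math.log(x / q) for x, q in zip(l, p) if x > 0)

def tilted(p, alpha):
    """Equation (11): l^W_a(alpha) proportional to p_a^(1/(1+alpha)), for alpha > -1."""
    weights = [q ** (1.0 / (1.0 + alpha)) for q in p]
    total = sum(weights)
    return [w / total for w in weights]

def eta(p, alpha):
    """Equation (12): minus the expected log-probability of a letter under l^W(alpha)."""
    return -sum(x * math.log(q) for x, q in zip(tilted(p, alpha), p))

if __name__ == "__main__":
    p = (0.8, 0.2)                 # hypothetical source
    l = tilted(p, 1.0)
    print(l, eta(p, 1.0), h(l) + D(l, p))
```

At alpha = 0 the tilted distribution is p itself, so eta(0) = h(p); increasing alpha flattens the distribution and raises eta.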
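Lemma 1: Assumption 1 holds for $\{U_k\}$ and $\{W_k^\epsilon\}$ with
\[ \Lambda_U(\alpha) := \alpha h(l^-) \]
and
\[ \Lambda_{W^\epsilon}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p), \]
where
\[ l^*(\alpha) = \begin{cases} l^+ & \text{if } \eta(\alpha) \le h(p) - \epsilon, \\ l^W(\alpha) & \text{if } \eta(\alpha) \in (h(p) - \epsilon, h(p) + \epsilon), \\ l^- & \text{if } \eta(\alpha) \ge h(p) + \epsilon. \end{cases} \tag{13} \]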
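Lemma 2: Assumption 2 holds for $\{U_k\}$ and $\{W_k^\epsilon\}$ with $g_U = -h(l^-)$ and
\[ g_{W^\epsilon} = \min\left(-h(p) + \epsilon, \log \max_{a \in A} p_a\right). \]
Thus by direct evaluation of the sCGFs at $\alpha = 1$,
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(U_k)) = h(l^-) \quad \text{and} \quad \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k^\epsilon)) = \Lambda_{W^\epsilon}(1). \]
As the conditions of Theorem 3 [11] are satisfied,
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k))) = \Lambda_U'(0) = h(l^-) \quad \text{and} \quad \lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k^\epsilon))) = \Lambda_{W^\epsilon}'(0) = h(p), \]
and we have the approximations
\[ P(G(U_k) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_U^*(k^{-1} \log n)\right) \quad \text{and} \quad P(G(W_k^\epsilon) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_{W^\epsilon}^*(k^{-1} \log n)\right). \]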
IV. EXAMPLE
Consider a binary alphabet $A = \{0, 1\}$ and words $\{W_k\}$ constructed of i.i.d. letters with $P(W_1 = 0) = p_0 > 1/2$. In this case there are unique $l^-$ and $l^+$ satisfying equations (9) and (10) determined by:
\[ l^-_0 = p_0 - \frac{\epsilon}{\log(p_0) - \log(1 - p_0)}, \qquad l^+_0 = p_0 + \frac{\epsilon}{\log(p_0) - \log(1 - p_0)}. \]
Selecting $0 < \epsilon < (\log(p_0) - \log(1 - p_0)) \min(p_0 - 1/2, 1 - p_0)$ ensures that the typical set is growing more slowly than $2^k$ and that $1/2 < l^-_0 < p_0 < l^+_0 < 1$.
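With $l^W(\alpha)$ defined in equation (11), from equations (3) and (4) we have that
\[ \Lambda_W(\alpha) = \begin{cases} \log(p_0) & \text{if } \alpha < -1, \\ \alpha h(l^W(\alpha)) - D(l^W(\alpha)\|p) & \text{if } \alpha \ge -1, \end{cases} = \begin{cases} \log(p_0) & \text{if } \alpha < -1, \\ (1+\alpha) \log\left(p_0^{\frac{1}{1+\alpha}} + (1-p_0)^{\frac{1}{1+\alpha}}\right) & \text{if } \alpha \ge -1. \end{cases} \]
From Lemmas 1 and 2 we obtain
\[ \Lambda_U(\alpha) = \begin{cases} -h(l^-) & \text{if } \alpha < -1, \\ \alpha h(l^-) & \text{if } \alpha \ge -1, \end{cases} \]
and
\[ \Lambda_{W^\epsilon}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p), \]
where $l^*(\alpha)$ is defined in equation (13) and $\eta(\alpha)$ is defined in equation (12).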
With $\gamma$ defined in equation (5), we have $\gamma_W = 0$, $\gamma_U = h(l^-)$ and $\gamma_{W^\epsilon} = h(l^+)$ so that, as $h(l^-) > h(l^+)$, the ordering of the growth rates with word length of the set of most likely words from smallest to largest is: unconditioned source, conditioned source and uniform approximation.

From these sCGF equations, we can determine the average growth rates and estimates on the Guesswork distribution. In particular, we have that
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k))) = \Lambda_W'(0) = h(p), \]
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k^\epsilon))) = \Lambda_{W^\epsilon}'(0) = h(p), \]
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k))) = \Lambda_U'(0) = h(l^-). \]
As $h((x, 1-x))$ is monotonically decreasing for $x > 1/2$ and $1/2 < l^-_0 < p_0$, the expectation of the logarithm of the Guesswork is growing faster for the uniform approximation than for either the unconditioned or conditioned word source.
The growth rate of the expected Guesswork reveals more features. In particular, with $A = \eta(1) - (h(p) + \epsilon)$,
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)) = 2 \log\left(p_0^{1/2} + (1-p_0)^{1/2}\right), \]
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k^\epsilon)) = \begin{cases} 2 \log\left(p_0^{1/2} + (1-p_0)^{1/2}\right) & \text{if } A \le 0, \\ h(l^-) - D(l^-\|p) & \text{if } A > 0, \end{cases} \]
\[ \lim_{k\to\infty} \frac{1}{k} \log E(G(U_k)) = h(l^-). \]
For the growth rate of the expected Guesswork, from these it can be shown that there is no strict order between the unconditioned and uniform source, but there is a strict ordering between the uniform approximation and the true conditioned distribution, with the former being strictly larger.
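The three growth rates of the expected Guesswork in the binary example can be evaluated numerically. The sketch below is illustrative, with hypothetical function names; it follows the formulae of this section, switching the conditioned rate on the sign of A = eta(1) - (h(p) + epsilon), and assumes p0 > 1/2 with epsilon in the admissible range of the example.

```python
import math

def h2(x):
    """Entropy (nats) of the binary distribution (x, 1 - x)."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0)

def kl2(x, p0):
    """Relative entropy D((x, 1-x) || (p0, 1-p0)) in nats."""
    pairs = ((x, p0), (1.0 - x, 1.0 - p0))
    return sum(t * math.log(t / q) for t, q in pairs if t > 0)

def growth_rates(p0, eps):
    """Return (unconditioned, conditioned, uniform) growth rates of expected Guesswork."""
    gap = math.log(p0) - math.log(1.0 - p0)
    l_minus = p0 - eps / gap
    renyi_half = 2.0 * math.log(math.sqrt(p0) + math.sqrt(1.0 - p0))
    # eta(1): cross-entropy of the tilted distribution l^W(1) against p
    w0 = math.sqrt(p0) / (math.sqrt(p0) + math.sqrt(1.0 - p0))
    eta1 = -(w0 * math.log(p0) + (1.0 - w0) * math.log(1.0 - p0))
    A = eta1 - (h2(p0) + eps)
    conditioned = renyi_half if A <= 0 else h2(l_minus) - kl2(l_minus, p0)
    return renyi_half, conditioned, h2(l_minus)

if __name__ == "__main__":
    for p0 in (0.7, 0.8, 0.9):
        print(p0, growth_rates(p0, 0.1))
```

With epsilon = 0.1, the uniform rate exceeds the unconditioned rate at p0 = 0.7 but falls below it at p0 = 0.8, while the conditioned rate is below the uniform rate in every case, matching the ordering claims above.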
With $\epsilon = 1/10$ and for a range of $p_0$, these formulae are illustrated in Figure 1. The top line plots
\[ \lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k)) - \log(G(W_k))) = \lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k)) - \log(G(W_k^\epsilon))) = h(l^-) - h(p), \]
showing that the expected growth rate in the logarithm of the Guesswork is always higher for the uniform approximation than both the conditioned and unconditioned sources. The second highest line plots the difference in growth rates of the expected Guesswork of the uniform approximation and the true conditioned source
\[ \lim_{k\to\infty} \frac{1}{k} \log \frac{E(G(U_k))}{E(G(W_k^\epsilon))} = \begin{cases} h(l^-) - 2 \log\left(p_0^{1/2} + (1-p_0)^{1/2}\right) & \text{if } \eta(1) \le h(p) + \epsilon, \\ D(l^-\|p) & \text{if } \eta(1) > h(p) + \epsilon. \end{cases} \]
That this difference is always positive, which can be established readily analytically, shows that the expected Guesswork of the true conditioned source is growing at a slower exponential rate than the uniform approximation. The second line and the lowest line, the latter being the difference in growth rates of the uniform and unconditioned expected Guesswork
\[ \lim_{k\to\infty} \frac{1}{k} \log \frac{E(G(U_k))}{E(G(W_k))} = h(l^-) - 2 \log\left(p_0^{1/2} + (1-p_0)^{1/2}\right), \]
initially agree. The latter can, depending on $p_0$ and $\epsilon$, be either positive or negative. It is negative if the typical set is particularly small in comparison to the number of unconditioned words.

[Figure 1: difference in expected growth rate plotted against $p_0$; curves $\Lambda_U'(0) - \Lambda_W'(0)$, $\Lambda_U(1) - \Lambda_W(1)$ and $\Lambda_U(1) - \Lambda_{W^\epsilon}(1)$.]

Fig. 1. Bernoulli$(p_0, 1-p_0)$ source. Difference in exponential growth rates of Guesswork between uniform approximation, unconditioned and conditioned distribution with $\epsilon = 0.1$. Top curve is the difference in expected logarithms between the uniform approximation and both the conditioned and unconditioned word sources. Bottom curve is the log-ratio of the expected Guesswork of the uniform and unconditioned word sources, with the latter harder to guess for large $p_0$. Middle curve is the log-ratio of the uniform and conditioned word sources, which initially follows the lower line, before separating and staying positive, showing that the conditioned source is always easier to guess than the typically used uniform approximation.
For $p_0 = 8/10$, the typical set is growing sufficiently slowly that a word selected from the uniform approximation is easier to guess than one from the unconditioned source. For this value, we illustrate the difference in Guesswork distributions between the unconditioned $\{W_k\}$, conditioned $\{W_k^\epsilon\}$ and uniform $\{U_k\}$ word sources. If we used the approximation in (8) directly, the graph would not be informative as the range of the unconditioned source is growing exponentially faster than the other two. Instead Figure 2 plots $-x - \Lambda^*(x)$ for each of the three processes. That is, using equation (8) and its equivalents for the other two processes, it plots
\[ \frac{1}{k} \log G(w), \quad \text{where } G(w) \in \{1, \ldots, 2^k\}, \]
against the large deviation approximations to
\[ \frac{1}{k} \log P(W_k = w), \quad \frac{1}{k} \log P(W_k^\epsilon = w) \quad \text{and} \quad \frac{1}{k} \log P(U_k = w), \]
as the resulting plot is unchanging in $k$. The source of the discrepancy in expected Guesswork is apparent, with the unconditioned source having substantially more words to cover (due to the log x-scale). Both it and the true conditioned source have higher probability words that skew their Guesswork. The first plateau for the conditioned and uniform distributions corresponds to those words with approximately the highest probability; that is, the length of this plateau is $\gamma_{W^\epsilon}$ or $\gamma_U$, defined in equation (5), so that, for example, approximately $\exp(k\gamma_{W^\epsilon})$ words have probability of approximately $\exp(k g_{W^\epsilon})$.

[Figure 2: curves $-x - \Lambda_W^*(x)$, $-x - \Lambda_{W^\epsilon}^*(x)$ and $-x - \Lambda_U^*(x)$ plotted against $x$.]

Fig. 2. Bernoulli$(8/10, 2/10)$ source, $\epsilon = 0.1$. Guesswork distribution approximations. For large $k$, the x-axis is $x = \frac{1}{k} \log G(w)$ for $G(w) \in \{1, \ldots, 2^k\}$ and the y-axis is the large deviation approximation $\frac{1}{k} \log P(X = w) \approx -x - \Lambda_X^*(x)$ for $X = W_k$, $W_k^\epsilon$ and $X = U_k$.
V. CONCLUSION
By establishing that the expected Guesswork of a source conditioned on the typical set is growing with a smaller exponent than its usual uniform approximation, we have demonstrated that appealing to the AEP for the latter is erroneous in cryptanalysis and instead provide a correct methodology for identifying the Guesswork growth rate.
APPENDIX
The proportion of the letter $a \in A$ in a word $w = (w_1, \ldots, w_k) \in A^k$ is given by
\[ n_k(w, a) := \frac{|\{1 \le i \le k : w_i = a\}|}{k}. \]
The number of words in a type $l = (l_0, \ldots, l_{m-1})$, where $l_a \ge 0$ for all $a \in A$ and $\sum_{a \in A} l_a = 1$, is given by
\[ N_k(l) := |\{w \in A^k \text{ such that } n_k(w, a) = l_a \ \forall a \in A\}|. \]
The set of all types, those just in the typical set and smooth approximations to those in the typical set are denoted
\[ L_k := \{l : \exists w \in A^k \text{ such that } n_k(w, a) = l_a \ \forall a \in A\}, \]
\[ L_{\epsilon,k} := \{l : \exists w \in T_k^\epsilon \text{ such that } n_k(w, a) = l_a \ \forall a \in A\}, \]
\[ L_\epsilon := \left\{ l : \sum_a l_a \log p_a \in [-h(p) - \epsilon, -h(p) + \epsilon] \right\}, \]
where it can readily be seen that $L_{\epsilon,k} \subset L_\epsilon$ for all $k$.
For $\{U_k\}$ we need the following Lemma.
Lemma 3: The exponential growth rate of the size of the typical set is
\[ \lim_{k\to\infty} \frac{1}{k} \log |T_k^\epsilon| = \begin{cases} \log m & \text{if } \log m \le h(p) + \epsilon, \\ h(l^-) & \text{otherwise,} \end{cases} \]
where $l^-$ is defined in equation (9).
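Lemma 3 can be sanity-checked by brute force in the binary case, where the typical set is a union of type classes and its size is a sum of binomial coefficients. The sketch below is an illustration under assumed values (p0 = 0.8, epsilon = 0.1, k = 2000) with hypothetical function names; it assumes the typical set is non-empty for the chosen k.

```python
import math

def h2(x):
    """Entropy (nats) of the binary distribution (x, 1 - x)."""
    return -sum(t * math.log(t) for t in (x, 1.0 - x) if t > 0)

def typical_set_log_size(p0, eps, k):
    """log of the number of length-k binary words whose probability lies in
    [exp(-k(H+eps)), exp(-k(H-eps))], counted type class by type class."""
    H = h2(p0)
    lower, upper = -k * (H + eps), -k * (H - eps)
    size = 0
    for j in range(k + 1):   # j = number of ones in the word
        log_prob = (k - j) * math.log(p0) + j * math.log(1.0 - p0)
        if lower <= log_prob <= upper:
            size += math.comb(k, j)
    return math.log(size)    # math.log accepts arbitrarily large integers

if __name__ == "__main__":
    p0, eps, k = 0.8, 0.1, 2000
    l_minus = p0 - eps / (math.log(p0) - math.log(1.0 - p0))
    print(typical_set_log_size(p0, eps, k) / k, h2(l_minus))
```

The empirical rate approaches h(l^-) from below, with a correction of order log(k)/k coming from the polynomial factors in Stirling's bounds.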
Proof: For fixed $k$, by the union bound
\[ \max_{l \in L_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!} \le |T_k^\epsilon| \le (k+1)^m \max_{l \in L_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!}. \]
For the logarithmic limit, these two bounds coincide so consider the concave optimization problem
\[ \max_{l \in L_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!}. \]
We can upper bound this optimization by replacing $L_{\epsilon,k}$ with the smoother version, its superset $L_\epsilon$. Using Stirling's bound we have that
\[ \limsup_{k\to\infty} \frac{1}{k} \log \sup_{l \in L_\epsilon} \frac{k!}{\prod_{a \in A} (k l_a)!} \le \sup_{l \in L_\epsilon} h(l) = \begin{cases} \log(m) & \text{if } h(p) + \epsilon \ge \log(m), \\ h(l^-) & \text{if } h(p) + \epsilon < \log(m). \end{cases} \]
For the lower bound, we need to construct a sequence $\{l^{(k)}\}$ such that $l^{(k)} \in L_{\epsilon,k}$ for all $k$ sufficiently large and $h(l^{(k)})$ converges to either $\log(m)$ or $h(l^-)$, as appropriate. Let $l^* = (1/m, \ldots, 1/m)$ or $l^-$ respectively, letting $c \in \arg\max_a p_a$, and define
\[ l^{(k)}_a = \begin{cases} k^{-1} \lfloor k l^*_a \rfloor + 1 - \sum_{b \in A} \frac{1}{k} \lfloor k l^*_b \rfloor & \text{if } a = c, \\ k^{-1} \lfloor k l^*_a \rfloor & \text{if } a \ne c. \end{cases} \]
Then $l^{(k)} \in L_{\epsilon,k}$ for all $k > -m \log(p_c)/(2\epsilon)$ and $h(l^{(k)}) \to h(l^*)$, as required.
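Proof of Lemma 1: Considering $\{U_k\}$ first,
\[ \alpha R_U\!\left(\frac{1}{1+\alpha}\right) = \alpha \lim_{k\to\infty} \frac{1}{k} \log |T_k^\epsilon| = \alpha h(l^-), \]
by Lemma 3. To evaluate $\Lambda_U(\alpha)$, as for any $n \in \mathbb{N}$ and $\alpha > 0$
\[ \sum_{i=1}^{n} i^\alpha \ge \int_0^n x^\alpha \, dx, \]
again using Lemma 3 we have
\[ \alpha h(l^-) = \lim_{k\to\infty} \frac{1}{k} \log \frac{1}{1+\alpha} |T_k^\epsilon|^\alpha \le \lim_{k\to\infty} \frac{1}{k} \log E\left(e^{\alpha \log G(U_k)}\right) = \lim_{k\to\infty} \frac{1}{k} \log \frac{1}{|T_k^\epsilon|} \sum_{i=1}^{|T_k^\epsilon|} i^\alpha \le \lim_{k\to\infty} \frac{1}{k} \log |T_k^\epsilon|^\alpha = \alpha h(l^-), \]
where we have used Lemma 3. The reverse of these bounds holds for $\alpha \in (-1, 0]$, giving the result.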
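The sandwich used for the uniform source rests on comparing a power sum with an integral. The sketch below, with hypothetical function names and arbitrary test values, checks both directions: for alpha > 0 the sum of i^alpha over i = 1..n lies between n^(1+alpha)/(1+alpha) and n^(1+alpha), and for alpha in (-1, 0] the two bounds swap roles.

```python
def power_sum(n, alpha):
    """Sum of i^alpha for i = 1, ..., n; dividing by n gives E[G^alpha] for a uniform source on n words."""
    return sum(i ** alpha for i in range(1, n + 1))

def integral_bound(n, alpha):
    """Integral of x^alpha over [0, n], finite for alpha > -1."""
    return n ** (1.0 + alpha) / (1.0 + alpha)

if __name__ == "__main__":
    n = 1000
    for alpha in (1.5, 0.5, -0.5):
        print(alpha, integral_bound(n, alpha), power_sum(n, alpha), n ** (1.0 + alpha))
```

Since the upper and lower bounds differ only by the constant factor 1/(1+alpha), they share the same exponential growth rate once n is replaced by the size of the typical set.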
We break the argument for $\{W_k^\epsilon\}$ into three steps. Step 1 is to show the equivalence of the existence of $\Lambda_{W^\epsilon}(\alpha)$ and $\alpha R_{W^\epsilon}(1/(1+\alpha))$ for $\alpha > -1$ with the existence of the following limit
\[ \lim_{k\to\infty} \frac{1}{k} \log \max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}. \tag{14} \]
Step 2 then establishes this limit and identifies it. Step 3 shows that $\Lambda_{W^\epsilon}'(\alpha)$ is continuous for $\alpha > -1$. To achieve steps 1 and 2, we adopt and adapt the method of types argument employed in the elongated web-version of [8].
Step 1 Two changes from the bounds of [8] Lemma 5.5 are necessary: the consideration of non-i.i.d. sources by restriction to $T_k^\epsilon$; and the extension of the $\alpha$ range to include $\alpha \in (-1, 0]$ from that for $\alpha \ge 0$ given in that document. Adjusted for conditioning on the typical set we get
\[ \frac{1}{1+\alpha} \frac{\max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^\epsilon} P(W_k = w)} \le E\left(e^{\alpha \log G(W_k^\epsilon)}\right) \le (k+1)^{m(1+\alpha)} \frac{\max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^\epsilon} P(W_k = w)}. \tag{15} \]
The necessary modification of these inequalities for $\alpha \in (-1, 0]$ gives
\[ \frac{\max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^\epsilon} P(W_k = w)} \le E\left(e^{\alpha \log G(W_k^\epsilon)}\right) \le \frac{(k+1)^m}{1+\alpha} \frac{\max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^\epsilon} P(W_k = w)}. \tag{16} \]
To show the lower bound holds if $\alpha \in (-1, 0]$ let
\[ l^* \in \arg\max_{l \in L_{\epsilon,k}} \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^\epsilon} P(W_k = w)}. \]
Taking $\liminf_{k\to\infty} k^{-1} \log$ and $\limsup_{k\to\infty} k^{-1} \log$ of equations (15) and (16) establishes that if the limit (14) exists, $\Lambda_{W^\epsilon}(\alpha)$ exists and equals it. Similar inequalities provide the same result for $\alpha R_{W^\epsilon}(1/(1+\alpha))$.
Step 2 The problem has been reduced to establishing the existence of
\[ \lim_{k\to\infty} \frac{1}{k} \log \max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \]
and identifying it. The method of proof is similar to that employed at the start of Lemma 1 for $\{U_k\}$: we provide an upper bound for the limsup and then establish a corresponding lower bound.
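If $l^{(k)} \to l$ with $l^{(k)} \in L_k$, then using Stirling's bounds we have that
\[ \lim_{k\to\infty} \frac{1}{k} \log N_k(l^{(k)}) = h(l). \]
This convergence occurs uniformly in $l$ and so, as $L_{\epsilon,k} \subset L_\epsilon$ for all $k$,
\[ \limsup_{k\to\infty} \frac{1}{k} \log \max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \le \sup_{l \in L_\epsilon} \left( (1+\alpha) h(l) + \sum_a l_a \log p_a \right) = \sup_{l \in L_\epsilon} \left( \alpha h(l) - D(l\|p) \right). \tag{17} \]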
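The final equality in (17) uses only the definitions of $h$ and $D$ from Section III. As a worked step, written here in LaTeX for convenience:

```latex
\begin{align*}
(1+\alpha)\,h(l) + \sum_{a} l_a \log p_a
  &= \alpha\,h(l) + \Big( -\sum_{a} l_a \log l_a + \sum_{a} l_a \log p_a \Big) \\
  &= \alpha\,h(l) + \sum_{a} l_a \log \frac{p_a}{l_a} \\
  &= \alpha\,h(l) - D(l\|p).
\end{align*}
```

The first term of the left-hand side is the exponential rate of $N_k(l)^{1+\alpha}$ and the second is the rate of $\prod_a p_a^{k l_a}$, so the objective trades the number of words in a type against the probability of each such word.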
This is a concave optimization problem in $l$ with convex constraints. Not requiring $l \in L_\epsilon$, the unconstrained optimizer over all $l$ is attained at $l^W(\alpha)$ defined in equation (11), which determines $\eta(\alpha)$ in equation (12). Thus the optimizer of the constrained problem (17) can be identified as that given in equation (13). Thus we have that
\[ \limsup_{k\to\infty} \frac{1}{k} \log \max_{l \in L_{\epsilon,k}} N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \le \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p), \]
where $l^*(\alpha)$ is defined in equation (13).
We complete the proof by generating a matching lower bound. To do so, for given $l^*(\alpha)$ we need only create a sequence such that $l^{(k)} \to l^*(\alpha)$ and $l^{(k)} \in L_{\epsilon,k}$ for all $k$. If $l^*(\alpha) = l^-$, then the sequence used in the proof of Lemma 3 suffices. For $l^*(\alpha) = l^+$, we use the same sequence but with the surplus probability distributed to a least likely letter instead of a most likely letter. For $l^*(\alpha) = l^W(\alpha)$, either of these sequences can be used.
Step 3 As $\Lambda_{W^\epsilon}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p)$, with $l^*(\alpha)$ defined in equation (13),
\[ \frac{d}{d\alpha} \Lambda_{W^\epsilon}(\alpha) = h(l^*(\alpha)) + \Lambda_{W^\epsilon}(\alpha) \frac{d}{d\alpha} l^*(\alpha). \]
Thus to establish continuity it suffices to establish continuity of $l^*(\alpha)$ and its derivative, which can be done readily by calculus.
Proof of Lemma 2: This can be established directly by a letter substitution argument. More generically, however, it can be seen as a consequence of the existence of specific min-entropy as a result of Assumption 1 via the following inequalities
\begin{align}
\alpha R\!\left(\frac{1}{1+\alpha}\right) - (1+\alpha) \log m
 &\le \liminf_{k\to\infty} \frac{1+\alpha}{k} \log \frac{m^k P(G(W_k) = 1)^{1/(1+\alpha)}}{m^k} \tag{18} \\
 &\le \limsup_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1) \notag \\
 &= (1+\alpha) \limsup_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1)^{1/(1+\alpha)} \notag \\
 &\le (1+\alpha) \limsup_{k\to\infty} \frac{1}{k} \log \left( P(G(W_k) = 1)^{1/(1+\alpha)} + \sum_{i=2}^{m^k} P(G(W_k) = i)^{1/(1+\alpha)} \right) = \alpha R\!\left(\frac{1}{1+\alpha}\right). \notag
\end{align}
Equation (18) holds as $P(G(W_k) = 1) \ge P(G(W_k) = i)$ for all $i \in \{1, \ldots, m^k\}$. The veracity of the lemma follows as $\alpha R\left((1+\alpha)^{-1}\right)$ exists and is continuous for all $\alpha > -1$ by Assumption 1 and $(1+\alpha) \log m$ tends to 0 as $\alpha \downarrow -1$.
ACKNOWLEDGMENT
M.C. and K.D. supported by the Science Foundation Ireland Grant No. 11/PI/1177 and the Irish Higher Educational Authority (HEA) PRTLI Network Mathematics Grant. F.d.P.C. and M.M. sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, recommendations, and conclusions are those of the authors and are not necessarily endorsed by the United States Government. Specifically, this work was supported by Information Systems of ASD(R&E).
REFERENCES
[1] A. Menezes, S. Vanstone, and P. V. Oorschot, Handbook of Applied Cryptography. CRC Press, Inc., 1996.
[2] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991.
[3] J. Pliam, "On the incomparability of entropy and marginal guesswork in brute-force attacks," in INDOCRYPT, 2000, pp. 67–79.
[4] S. Draper, A. Khisti, E. Martinian, A. Vetro, and J. Yedidia, "Secure storage of fingerprint biometrics using Slepian-Wolf codes," in ITA Workshop, 2007.
[5] Y. Sutcu, S. Rane, J. Yedidia, S. Draper, and A. Vetro, "Feature extraction for a Slepian-Wolf biometric system using LDPC codes," in ISIT, 2008.
[6] F. du Pin Calmon, M. Médard, L. Zegler, J. Barros, M. Christiansen, and K. Duffy, "Lists that are smaller than their parts: A coding approach to tunable secrecy," in Proc. 50th Allerton Conference, 2012.
[7] E. Arikan, "An inequality on guessing and its application to sequential decoding," IEEE Trans. Inf. Theory, vol. 42, no. 1, pp. 99–105, 1996.
[8] D. Malone and W. Sullivan, "Guesswork and entropy," IEEE Trans. Inf. Theory, vol. 50, no. 4, pp. 525–526, 2004, http://www.maths.tcd.ie/~dwmalone/p/guess02.pdf.
[9] C.-E. Pfister and W. Sullivan, "Rényi entropy, guesswork moments and large deviations," IEEE Trans. Inf. Theory, no. 11, pp. 2794–00, 2004.
[10] M. K. Hanawal and R. Sundaresan, "Guessing revisited: A large deviations approach," IEEE Trans. Inf. Theory, vol. 57, no. 1, pp. 70–78, 2011.
[11] M. M. Christiansen and K. R. Duffy, "Guesswork, large deviations and Shannon entropy," IEEE Trans. Inf. Theory, vol. 59, no. 2, pp. 796–802, 2013.
[12] J. L. Massey, "Guessing and entropy," IEEE Int. Symp. Inf. Theory, pp. 204–204, 1994.
[13] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications. Springer-Verlag, 1998.
[14] E. Arikan and N. Merhav, "Guessing subject to distortion," IEEE Trans. Inf. Theory, vol. 44, pp. 1041–1056, 1998.
[15] R. Sundaresan, "Guessing based on length functions," in Proc. 2007 International Symp. on Inf. Th., 2007.

Fuzzy random variables and Kolomogrovā€™s important resultsFuzzy random variables and Kolomogrovā€™s important results
Fuzzy random variables and Kolomogrovā€™s important resultsinventionjournals
Ā 
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...UniversitƩ Internationale de Rabat
Ā 
Cs229 notes11
Cs229 notes11Cs229 notes11
Cs229 notes11VuTran231
Ā 
Fixed point result in menger space with ea property
Fixed point result in menger space with ea propertyFixed point result in menger space with ea property
Fixed point result in menger space with ea propertyAlexander Decker
Ā 
Mining group correlations over data streams
Mining group correlations over data streamsMining group correlations over data streams
Mining group correlations over data streamsyuanchung
Ā 
EVEN GRACEFUL LABELLING OF A CLASS OF TREES
EVEN GRACEFUL LABELLING OF A CLASS OF TREESEVEN GRACEFUL LABELLING OF A CLASS OF TREES
EVEN GRACEFUL LABELLING OF A CLASS OF TREESFransiskeran
Ā 
Machine learning (12)
Machine learning (12)Machine learning (12)
Machine learning (12)NYversity
Ā 
Strong convergence of an algorithm about strongly quasi nonexpansive mappings
Strong convergence of an algorithm about strongly quasi nonexpansive mappingsStrong convergence of an algorithm about strongly quasi nonexpansive mappings
Strong convergence of an algorithm about strongly quasi nonexpansive mappingsAlexander Decker
Ā 
The Chase in Database Theory
The Chase in Database TheoryThe Chase in Database Theory
The Chase in Database TheoryJan Hidders
Ā 
Document Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsDocument Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsTomonari Masada
Ā 
Steven Duplij, "Polyadic Hopf algebras and quantum groups"
Steven Duplij, "Polyadic Hopf algebras and quantum groups"Steven Duplij, "Polyadic Hopf algebras and quantum groups"
Steven Duplij, "Polyadic Hopf algebras and quantum groups"Steven Duplij (Stepan Douplii)
Ā 
Iswc 2016 completeness correctude
Iswc 2016 completeness correctudeIswc 2016 completeness correctude
Iswc 2016 completeness correctudeAndrƩ Valdestilhas
Ā 

Ƅhnlich wie Brute force searching, the typical set and guesswork (20)

machinelearning project
machinelearning projectmachinelearning project
machinelearning project
Ā 
1404.1503
1404.15031404.1503
1404.1503
Ā 
Context-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic ModelingContext-dependent Token-wise Variational Autoencoder for Topic Modeling
Context-dependent Token-wise Variational Autoencoder for Topic Modeling
Ā 
Fuzzy random variables and Kolomogrovā€™s important results
Fuzzy random variables and Kolomogrovā€™s important resultsFuzzy random variables and Kolomogrovā€™s important results
Fuzzy random variables and Kolomogrovā€™s important results
Ā 
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...
Convexity of the Set of k-Admissible Functions on a Compact KƤhler Manifold (...
Ā 
Cs229 notes11
Cs229 notes11Cs229 notes11
Cs229 notes11
Ā 
Fixed point result in menger space with ea property
Fixed point result in menger space with ea propertyFixed point result in menger space with ea property
Fixed point result in menger space with ea property
Ā 
Ijetr021233
Ijetr021233Ijetr021233
Ijetr021233
Ā 
draft
draftdraft
draft
Ā 
Mining group correlations over data streams
Mining group correlations over data streamsMining group correlations over data streams
Mining group correlations over data streams
Ā 
04_AJMS_330_21.pdf
04_AJMS_330_21.pdf04_AJMS_330_21.pdf
04_AJMS_330_21.pdf
Ā 
Programming Exam Help
Programming Exam Help Programming Exam Help
Programming Exam Help
Ā 
EVEN GRACEFUL LABELLING OF A CLASS OF TREES
EVEN GRACEFUL LABELLING OF A CLASS OF TREESEVEN GRACEFUL LABELLING OF A CLASS OF TREES
EVEN GRACEFUL LABELLING OF A CLASS OF TREES
Ā 
Machine learning (12)
Machine learning (12)Machine learning (12)
Machine learning (12)
Ā 
Strong convergence of an algorithm about strongly quasi nonexpansive mappings
Strong convergence of an algorithm about strongly quasi nonexpansive mappingsStrong convergence of an algorithm about strongly quasi nonexpansive mappings
Strong convergence of an algorithm about strongly quasi nonexpansive mappings
Ā 
The Chase in Database Theory
The Chase in Database TheoryThe Chase in Database Theory
The Chase in Database Theory
Ā 
Document Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior DistributionsDocument Modeling with Implicit Approximate Posterior Distributions
Document Modeling with Implicit Approximate Posterior Distributions
Ā 
Steven Duplij, "Polyadic Hopf algebras and quantum groups"
Steven Duplij, "Polyadic Hopf algebras and quantum groups"Steven Duplij, "Polyadic Hopf algebras and quantum groups"
Steven Duplij, "Polyadic Hopf algebras and quantum groups"
Ā 
Iswc 2016 completeness correctude
Iswc 2016 completeness correctudeIswc 2016 completeness correctude
Iswc 2016 completeness correctude
Ā 
BNL_Research_Report
BNL_Research_ReportBNL_Research_Report
BNL_Research_Report
Ā 

KĆ¼rzlich hochgeladen

Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment Booking
Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment BookingAlmora call girls šŸ“ž 8617697112 At Low Cost Cash Payment Booking
Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment BookingNitya salvi
Ā 
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...ranjana rawat
Ā 
Call Girls In Goa 9316020077 Goa Call Girl By Indian Call Girls Goa
Call Girls In Goa  9316020077 Goa  Call Girl By Indian Call Girls GoaCall Girls In Goa  9316020077 Goa  Call Girl By Indian Call Girls Goa
Call Girls In Goa 9316020077 Goa Call Girl By Indian Call Girls Goasexy call girls service in goa
Ā 
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...anamikaraghav4
Ā 
2k Shot Call girls Laxmi Nagar Delhi 9205541914
2k Shot Call girls Laxmi Nagar Delhi 92055419142k Shot Call girls Laxmi Nagar Delhi 9205541914
2k Shot Call girls Laxmi Nagar Delhi 9205541914Delhi Call girls
Ā 
Call Girls Nashik Gayatri 7001305949 Independent Escort Service Nashik
Call Girls Nashik Gayatri 7001305949 Independent Escort Service NashikCall Girls Nashik Gayatri 7001305949 Independent Escort Service Nashik
Call Girls Nashik Gayatri 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
Ā 
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Me
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near MeBook Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Me
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Meanamikaraghav4
Ā 
Goa Call "Girls Service 9316020077 Call "Girls in Goa
Goa Call "Girls  Service   9316020077 Call "Girls in GoaGoa Call "Girls  Service   9316020077 Call "Girls in Goa
Goa Call "Girls Service 9316020077 Call "Girls in Goasexy call girls service in goa
Ā 
Desi Bhabhi Call Girls In Goa šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goa
Desi Bhabhi Call Girls  In Goa  šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort GoaDesi Bhabhi Call Girls  In Goa  šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goa
Desi Bhabhi Call Girls In Goa šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goarussian goa call girl and escorts service
Ā 
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur Escorts
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur EscortsCall Girl Nagpur Roshni Call 7001035870 Meet With Nagpur Escorts
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
Ā 
Call Girl Nashik Amaira 7001305949 Independent Escort Service Nashik
Call Girl Nashik Amaira 7001305949 Independent Escort Service NashikCall Girl Nashik Amaira 7001305949 Independent Escort Service Nashik
Call Girl Nashik Amaira 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
Ā 
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment Booking
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment BookingCall Girls in Barasat | 7001035870 At Low Cost Cash Payment Booking
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment Bookingnoor ahmed
Ā 
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...Apsara Of India
Ā 
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...anamikaraghav4
Ā 
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...rahim quresi
Ā 
Russian Escorts Agency In Goa šŸ’š 9316020077 šŸ’š Russian Call Girl Goa
Russian Escorts Agency In Goa  šŸ’š 9316020077 šŸ’š Russian Call Girl GoaRussian Escorts Agency In Goa  šŸ’š 9316020077 šŸ’š Russian Call Girl Goa
Russian Escorts Agency In Goa šŸ’š 9316020077 šŸ’š Russian Call Girl Goasexy call girls service in goa
Ā 
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...ritikasharma
Ā 
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...noor ahmed
Ā 
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...noor ahmed
Ā 

KĆ¼rzlich hochgeladen (20)

Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment Booking
Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment BookingAlmora call girls šŸ“ž 8617697112 At Low Cost Cash Payment Booking
Almora call girls šŸ“ž 8617697112 At Low Cost Cash Payment Booking
Ā 
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...
(KRITI) Pimpri Chinchwad Call Girls Just Call 7001035870 [ Cash on Delivery ]...
Ā 
Call Girls In Goa 9316020077 Goa Call Girl By Indian Call Girls Goa
Call Girls In Goa  9316020077 Goa  Call Girl By Indian Call Girls GoaCall Girls In Goa  9316020077 Goa  Call Girl By Indian Call Girls Goa
Call Girls In Goa 9316020077 Goa Call Girl By Indian Call Girls Goa
Ā 
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...
Russian Call Girl South End Park - Call 8250192130 Rs-3500 with A/C Room Cash...
Ā 
2k Shot Call girls Laxmi Nagar Delhi 9205541914
2k Shot Call girls Laxmi Nagar Delhi 92055419142k Shot Call girls Laxmi Nagar Delhi 9205541914
2k Shot Call girls Laxmi Nagar Delhi 9205541914
Ā 
Call Girls Nashik Gayatri 7001305949 Independent Escort Service Nashik
Call Girls Nashik Gayatri 7001305949 Independent Escort Service NashikCall Girls Nashik Gayatri 7001305949 Independent Escort Service Nashik
Call Girls Nashik Gayatri 7001305949 Independent Escort Service Nashik
Ā 
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Me
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near MeBook Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Me
Book Call Girls in Panchpota - 8250192130 | 24x7 Service Available Near Me
Ā 
Goa Call "Girls Service 9316020077 Call "Girls in Goa
Goa Call "Girls  Service   9316020077 Call "Girls in GoaGoa Call "Girls  Service   9316020077 Call "Girls in Goa
Goa Call "Girls Service 9316020077 Call "Girls in Goa
Ā 
Desi Bhabhi Call Girls In Goa šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goa
Desi Bhabhi Call Girls  In Goa  šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort GoaDesi Bhabhi Call Girls  In Goa  šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goa
Desi Bhabhi Call Girls In Goa šŸ’ƒ 730 02 72 001šŸ’ƒdesi Bhabhi Escort Goa
Ā 
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur Escorts
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur EscortsCall Girl Nagpur Roshni Call 7001035870 Meet With Nagpur Escorts
Call Girl Nagpur Roshni Call 7001035870 Meet With Nagpur Escorts
Ā 
Call Girl Nashik Amaira 7001305949 Independent Escort Service Nashik
Call Girl Nashik Amaira 7001305949 Independent Escort Service NashikCall Girl Nashik Amaira 7001305949 Independent Escort Service Nashik
Call Girl Nashik Amaira 7001305949 Independent Escort Service Nashik
Ā 
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment Booking
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment BookingCall Girls in Barasat | 7001035870 At Low Cost Cash Payment Booking
Call Girls in Barasat | 7001035870 At Low Cost Cash Payment Booking
Ā 
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...
Karnal Call Girls 8860008073 Dyal Singh Colony Call Girls Service in Karnal E...
Ā 
Call Girls New Ashok Nagar Delhi WhatsApp Number 9711199171
Call Girls New Ashok Nagar Delhi WhatsApp Number 9711199171Call Girls New Ashok Nagar Delhi WhatsApp Number 9711199171
Call Girls New Ashok Nagar Delhi WhatsApp Number 9711199171
Ā 
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...
College Call Girls New Alipore - For 7001035870 Cheap & Best with original Ph...
Ā 
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...
š“€¤Call On 6297143586 š“€¤ Ultadanga Call Girls In All Kolkata 24/7 Provide Call W...
Ā 
Russian Escorts Agency In Goa šŸ’š 9316020077 šŸ’š Russian Call Girl Goa
Russian Escorts Agency In Goa  šŸ’š 9316020077 šŸ’š Russian Call Girl GoaRussian Escorts Agency In Goa  šŸ’š 9316020077 šŸ’š Russian Call Girl Goa
Russian Escorts Agency In Goa šŸ’š 9316020077 šŸ’š Russian Call Girl Goa
Ā 
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...
Behala ( Call Girls ) Kolkata āœ” 6297143586 āœ” Hot Model With Sexy Bhabi Ready ...
Ā 
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...
ā†‘Top Model (Kolkata) Call Girls Salt Lake āŸŸ 8250192130 āŸŸ High Class Call Girl...
Ā 
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...
ā†‘Top Model (Kolkata) Call Girls Rajpur āŸŸ 8250192130 āŸŸ High Class Call Girl In...
Ā 

Brute force searching, the typical set and guesswork

Mark M. Christiansen and Ken R. Duffy
Hamilton Institute, National University of Ireland, Maynooth
Email: {mark.christiansen, ken.duffy}@nuim.ie

Flávio du Pin Calmon and Muriel Médard
Research Laboratory of Electronics, Massachusetts Institute of Technology
Email: {flavio, medard}@mit.edu

Abstract: Consider the situation where a word is chosen probabilistically from a finite list. If an attacker knows the list and can inquire about each word in turn, then selecting the word via the uniform distribution maximizes the attacker's difficulty, its Guesswork, in identifying the chosen word. It is tempting to use this property in cryptanalysis of computationally secure ciphers by assuming coded words are drawn from a source's typical set and so, for all intents and purposes, uniformly distributed within it. By applying recent results on Guesswork, for i.i.d. sources, it is this equipartition ansatz that we investigate here. In particular, we demonstrate that the expected Guesswork for a source conditioned to create words in the typical set grows, with word length, at a lower exponential rate than that of the uniform approximation, suggesting use of the approximation is ill-advised.

I. INTRODUCTION

Consider the problem of identifying the value of a discrete random variable by only asking questions of the sort: is its value X? That this is a time-consuming task is a cornerstone of computationally secure ciphers [1]. It is tempting to appeal to the Asymptotic Equipartition Property (AEP) [2], and the resulting assignment of code words only to elements of the typical set of the source, to justify restriction to consideration of a uniform source, e.g. [3], [4], [5]. This assumed uniformity has many desirable properties, including maximum obfuscation and difficulty for the inquisitor, e.g. [6].
In typical set coding it is necessary to generate codes for words whose logarithmic probability is within a small distance of the word length times the specific Shannon entropy. As a result, while all these words have near-equal likelihood, the distribution is not precisely uniform. It is the consequence of this lack of perfect uniformity that we investigate here by proving that results on Guesswork [7], [8], [9], [10], [11] extend to this setting. We establish that for source words originally constructed from an i.i.d. sequence of letters, as a function of word length it is exponentially easier to guess a word conditioned to be in the source's typical set in comparison to the corresponding equipartition approximation. This raises questions about the wisdom of appealing to the AEP to justify sole consideration of the uniform distributions for cryptanalysis and provides alternate results in their place.

II. THE TYPICAL SET AND GUESSWORK

Let $A = \{0, \ldots, m-1\}$ be a finite alphabet and consider a stochastic sequence of words, $\{W_k\}$, where $W_k$ is a word of length $k$ taking values in $A^k$. The process $\{W_k\}$ has specific Shannon entropy

$$H_W := -\lim_{k\to\infty} \frac{1}{k} \sum_{w \in A^k} P(W_k = w) \log P(W_k = w),$$

and we shall take all logs to base $e$. For $\epsilon > 0$, the typical set of words of length $k$ is

$$T_k^\epsilon := \left\{ w \in A^k : e^{-k(H_W+\epsilon)} \le P(W_k = w) \le e^{-k(H_W-\epsilon)} \right\}.$$

For most reasonable sources [2], $P(W_k \in T_k^\epsilon) > 0$ for all $k$ sufficiently large and typical set encoding results in a new source of words of length $k$, $W_k^\epsilon$, with statistics

$$P(W_k^\epsilon = w) = \begin{cases} \dfrac{P(W_k = w)}{P(W_k \in T_k^\epsilon)} & \text{if } w \in T_k^\epsilon, \\ 0 & \text{if } w \notin T_k^\epsilon. \end{cases} \tag{1}$$

Appealing to the AEP, these distributions are often substituted for their more readily manipulated uniformly random counterpart, $U_k^\epsilon$,

$$P(U_k^\epsilon = w) := \begin{cases} \dfrac{1}{|T_k^\epsilon|} & \text{if } w \in T_k^\epsilon, \\ 0 & \text{if } w \notin T_k^\epsilon, \end{cases} \tag{2}$$

where $|T_k^\epsilon|$ is the number of elements in $T_k^\epsilon$.
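To make the typical set and the two distributions in equations (1) and (2) concrete, the following is a minimal sketch that enumerates them by brute force for a short word length. It assumes i.i.d. letters, so the specific Shannon entropy is the single-letter entropy; the function names `typical_set` and `conditioned_and_uniform` are illustrative and do not come from the paper.

```python
import itertools
import math


def typical_set(p, k, eps):
    """Map each word of length k in the typical set to its probability.

    Assumes i.i.d. letters with distribution p, so H_W is the
    single-letter Shannon entropy (natural logarithms).
    """
    entropy = -sum(q * math.log(q) for q in p if q > 0)
    lower = math.exp(-k * (entropy + eps))
    upper = math.exp(-k * (entropy - eps))
    members = {}
    for word in itertools.product(range(len(p)), repeat=k):
        prob = math.prod(p[a] for a in word)
        if lower <= prob <= upper:
            members[word] = prob
    return members


def conditioned_and_uniform(p, k, eps):
    """Return the conditioned law (1) and the uniform law (2) as dicts."""
    members = typical_set(p, k, eps)
    mass = sum(members.values())
    conditioned = {w: q / mass for w, q in members.items()}
    uniform = {w: 1.0 / len(members) for w in members}
    return conditioned, uniform


conditioned, uniform = conditioned_and_uniform((0.8, 0.2), 12, 0.1)
print(len(conditioned), max(conditioned.values()) / min(conditioned.values()))
```

For this choice of parameters the typical set holds the words with two or three ones, and the most likely member is four times as likely as the least likely one, so the conditioned law is visibly not uniform even though every member satisfies the typical set bounds.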
While the distribution of $W_k^\epsilon$ is near-uniform for large $k$, it is not perfectly uniform unless the original $W_k$ was uniformly distributed on a subset of $A^k$. Is a word selected using the distribution of $W_k^\epsilon$ easier to guess than if it was selected uniformly, $U_k^\epsilon$?

Given knowledge of $A^k$, the source statistics of words, say those of $W_k$, and an oracle against which a word can be tested one at a time, an attacker's optimal strategy is to generate a partial-order of the words from most likely to least likely and guess them in turn [12], [7]. That is, the attacker generates a function $G : A^k \to \{1, \ldots, m^k\}$ such that $G(w') < G(w)$ if $P(W_k = w') > P(W_k = w)$. The integer $G(w)$ is the number of guesses until word $w$ is guessed, its Guesswork.

arXiv:1301.6356v3 [cs.IT] 13 May 2013

For fixed $k$ it is shown in [12] that the Shannon entropy of the underlying distribution bears little relation to the expected Guesswork, $E(G(W_k))$, the average number of guesses required to guess a word chosen with distribution $W_k$ using the optimal strategy. In a series of subsequent papers [7], [8], [9], [10], under ever less restrictive stochastic assumptions from words made up of i.i.d. letters to Markovian letters to sofic shifts, an asymptotic relationship as word length grows between scaled moments of the Guesswork and specific Rényi entropy was identified:

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)^\alpha) = \alpha R_W\left(\frac{1}{1+\alpha}\right), \tag{3}$$

for $\alpha > -1$, where $R_W(\beta)$ is the specific Rényi entropy for the process $\{W_k\}$ with parameter $\beta > 0$,

$$R_W(\beta) := \lim_{k\to\infty} \frac{1}{k} \frac{1}{1-\beta} \log \left( \sum_{w \in A^k} P(W_k = w)^\beta \right).$$

These results have recently [11] been built on to prove that $\{k^{-1} \log G(W_k)\}$ satisfies a Large Deviation Principle (LDP), e.g. [13]. Define the scaled Cumulant Generating Function (sCGF) of $\{k^{-1} \log G(W_k)\}$ by

$$\Lambda_W(\alpha) := \lim_{k\to\infty} \frac{1}{k} \log E\left(e^{\alpha \log G(W_k)}\right) \quad \text{for } \alpha \in \mathbb{R}$$

and make the following two assumptions.

• Assumption 1: For $\alpha > -1$, the sCGF $\Lambda_W(\alpha)$ exists, is equal to $\alpha R_W(1/(1+\alpha))$ and has a continuous derivative in that range.

• Assumption 2: The limit

$$g_W := \lim_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1) \tag{4}$$

exists in $(-\infty, 0]$.

Should Assumptions 1 and 2 hold, Theorem 3 of [11] establishes that $\Lambda_W(\alpha) = g_W$ for all $\alpha \le -1$ and that the sequence $\{k^{-1} \log G(W_k)\}$ satisfies a LDP with a rate function given by the Legendre-Fenchel transform of the sCGF, $\Lambda_W^*(x) := \sup_{\alpha \in \mathbb{R}} \{x\alpha - \Lambda_W(\alpha)\}$. Assumption 1 is motivated by equation (3), while Assumption 2 is a regularity condition on the probability of the most likely word. With

$$\gamma_W := \lim_{\alpha \downarrow -1} \frac{d}{d\alpha} \Lambda_W(\alpha), \tag{5}$$

where the order of the size of the set of maximum probability words of $W_k$ is $\exp(k\gamma_W)$ [11], $\Lambda_W^*(x)$ can be identified as

$$\Lambda_W^*(x) = \begin{cases} -x - g_W & \text{if } x \in [0, \gamma_W], \\ \sup_{\alpha \in \mathbb{R}} \{x\alpha - \Lambda_W(\alpha)\} & \text{if } x \in (\gamma_W, \log(m)], \\ +\infty & \text{if } x \notin [0, \log(m)]. \end{cases} \tag{6}$$

Corollary 5 of [11] uses this LDP to prove a result suggested in [14], [15], that

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k))) = H_W, \tag{7}$$

making clear that the specific Shannon entropy determines the expectation of the logarithm of the number of guesses to guess the word $W_k$.
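Equation (3) with $\alpha = 1$ can be checked at finite word length for a binary i.i.d. source, where the limit evaluates to $2\log(\sqrt{p_0} + \sqrt{1-p_0})$. The sketch below computes the expected Guesswork exactly by grouping words by their number of ones, since all words in a group are equally likely and any order within a group is optimal. The function name `expected_guesswork_rate` and the parameter values are assumptions made for illustration.

```python
import math


def expected_guesswork_rate(p0, k):
    """(1/k) log E[G(W_k)] for i.i.d. binary letters with P(letter = 0) = p0 > 1/2."""
    p1 = 1.0 - p0
    guessed_before = 0   # number of words strictly more likely than the current group
    expectation = 0.0
    for ones in range(k + 1):            # fewer ones means a more likely word
        size = math.comb(k, ones)        # words sharing this probability
        word_prob = p0 ** (k - ones) * p1 ** ones
        mean_rank = guessed_before + (size + 1) / 2   # average rank within the tie
        expectation += size * word_prob * mean_rank
        guessed_before += size
    return math.log(expectation) / k


p0 = 0.8
limit = 2.0 * math.log(math.sqrt(p0) + math.sqrt(1.0 - p0))
shannon = -(p0 * math.log(p0) + (1.0 - p0) * math.log(1.0 - p0))
for k in (10, 50, 200):
    print(k, expected_guesswork_rate(p0, k), limit, shannon)
```

The finite-length rate approaches the Rényi limit slowly, with a correction of order $(\log k)/k$, and already sits above the specific Shannon entropy of equation (7) at $k = 200$.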
The growth rate of the expected Guesswork is a distinct quantity whose scaling rules can be determined directly from the sCGF in equation (3),

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)) = \Lambda_W(1).$$

From these expressions and Jensen's inequality, it is clear that the growth rate of the expected Guesswork is no less than $H_W$. Finally, as a corollary to the LDP, [11] provides the following approximation to the Guesswork distribution for large $k$:

$$P(G(W_k) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_W^*(k^{-1} \log n)\right) \tag{8}$$

for $n \in \{1, \ldots, m^k\}$. Thus to approximate the Guesswork distribution, it is sufficient to know the specific Rényi entropy of the source and the decay-rate of the likelihood of the sequence of most likely words.

Here we show that if $\{W_k\}$ is constructed from i.i.d. letters, then both of the processes $\{U_k^\epsilon\}$ and $\{W_k^\epsilon\}$ also satisfy Assumptions 1 and 2 so that, with the appropriate rate functions, the approximation in equation (8) can be used with $U_k^\epsilon$ or $W_k^\epsilon$ in lieu of $W_k$. This enables us to compare the Guesswork distribution for typical set encoded words with their assumed uniform counterpart. Even in the simple binary alphabet case we establish that, apart from edge cases, a word chosen via $W_k^\epsilon$ is exponentially easier in $k$ to guess on average than one chosen via $U_k^\epsilon$.

III. STATEMENT OF MAIN RESULTS

Assume that the words $\{W_k\}$ are made of i.i.d. letters, defining $p = (p_0, \ldots, p_{m-1})$ by $p_a = P(W_1 = a)$. We shall employ the following short-hand: $h(l) := -\sum_a l_a \log l_a$ for $l = (l_0, \ldots, l_{m-1}) \in [0, 1]^m$, $l_a \ge 0$, $\sum_a l_a = 1$, so that $H_W = h(p)$, and $D(l\|p) := -\sum_a l_a \log(p_a/l_a)$. Furthermore, define $l^- \in [0, 1]^m$ and $l^+ \in [0, 1]^m$ by

$$l^- \in \arg\max_l \{h(l) : h(l) + D(l\|p) - \epsilon = h(p)\}, \tag{9}$$

$$l^+ \in \arg\max_l \{h(l) : h(l) + D(l\|p) + \epsilon = h(p)\}, \tag{10}$$

should they exist. For $\alpha > -1$, also define $l^W(\alpha)$ and $\eta(\alpha)$ by

$$l^W_a(\alpha) := \frac{p_a^{1/(1+\alpha)}}{\sum_{b \in A} p_b^{1/(1+\alpha)}} \quad \text{for all } a \in A \text{ and} \tag{11}$$

$$\eta(\alpha) := -\sum_a l^W_a(\alpha) \log p_a = -\frac{\sum_{a \in A} p_a^{1/(1+\alpha)} \log p_a}{\sum_{b \in A} p_b^{1/(1+\alpha)}}. \tag{12}$$

Assume that $h(p) + \epsilon \le \log(m)$. If this is not the case, $\log(m)$ should be substituted in place of $h(l^-)$ for the $\{U_k^\epsilon\}$ results. Proofs of the following are deferred to the Appendix.

Lemma 1: Assumption 1 holds for $\{U_k^\epsilon\}$ and $\{W_k^\epsilon\}$ with $\Lambda_{U^\epsilon}(\alpha) := \alpha h(l^-)$ and $\Lambda_{W^\epsilon}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p)$, where

$$l^*(\alpha) = \begin{cases} l^+ & \text{if } \eta(\alpha) \le h(p) - \epsilon, \\ l^W(\alpha) & \text{if } \eta(\alpha) \in (h(p) - \epsilon, h(p) + \epsilon), \\ l^- & \text{if } \eta(\alpha) \ge h(p) + \epsilon. \end{cases} \tag{13}$$
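The short-hand above is easy to evaluate numerically. The following sketch, with illustrative function names that are not from the paper, implements $h$, $D$, the tilted distribution of equation (11) and $\eta$ of equation (12), and checks that $\alpha h(l^W(\alpha)) - D(l^W(\alpha)\|p)$ agrees with $(1+\alpha)\log\sum_a p_a^{1/(1+\alpha)}$, which is the i.i.d. form of $\alpha R_W(1/(1+\alpha))$.

```python
import math


def entropy(l):
    """h(l) = -sum_a l_a log l_a, in nats."""
    return -sum(x * math.log(x) for x in l if x > 0)


def kl_divergence(l, p):
    """D(l || p) = sum_a l_a log(l_a / p_a)."""
    return sum(x * math.log(x / q) for x, q in zip(l, p) if x > 0)


def tilted(p, alpha):
    """l^W(alpha) from equation (11); requires alpha > -1."""
    s = 1.0 / (1.0 + alpha)
    weights = [q ** s for q in p]
    total = sum(weights)
    return [w / total for w in weights]


def eta(p, alpha):
    """eta(alpha) from equation (12)."""
    return -sum(x * math.log(q) for x, q in zip(tilted(p, alpha), p))


def scgf_unconditioned(p, alpha):
    """alpha * h(l^W(alpha)) - D(l^W(alpha) || p) for alpha > -1."""
    l = tilted(p, alpha)
    return alpha * entropy(l) - kl_divergence(l, p)


p = [0.5, 0.3, 0.2]
for alpha in (-0.5, 0.5, 1.0, 3.0):
    closed_form = (1.0 + alpha) * math.log(sum(q ** (1.0 / (1.0 + alpha)) for q in p))
    print(alpha, scgf_unconditioned(p, alpha), closed_form, eta(p, alpha))
```

Note that $\eta(0) = h(p)$ and that $\eta(\alpha)$ grows with $\alpha$ as the tilted distribution puts more weight on unlikely letters, which is what decides which branch of equation (13) applies.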
Lemma 2: Assumption 2 holds for $\{U_k^\epsilon\}$ and $\{W_k^\epsilon\}$ with $g_{U^\epsilon} = -h(l^-)$ and

$$g_{W^\epsilon} = \min\left(-h(p) + \epsilon, \log \max_{a \in A} p_a\right).$$

Thus by direct evaluation of the sCGFs at $\alpha = 1$,

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(U_k^\epsilon)) = h(l^-) \quad \text{and} \quad \lim_{k\to\infty} \frac{1}{k} \log E(G(W_k^\epsilon)) = \Lambda_{W^\epsilon}(1).$$

As the conditions of Theorem 3 [11] are satisfied,

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k^\epsilon))) = \Lambda'_{U^\epsilon}(0) = h(l^-) \quad \text{and} \quad \lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k^\epsilon))) = \Lambda'_{W^\epsilon}(0) = h(p),$$

and we have the approximations

$$P(G(U_k^\epsilon) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_{U^\epsilon}^*(k^{-1} \log n)\right) \quad \text{and} \quad P(G(W_k^\epsilon) = n) \approx \frac{1}{n} \exp\left(-k \Lambda_{W^\epsilon}^*(k^{-1} \log n)\right).$$

IV. EXAMPLE

Consider a binary alphabet $A = \{0, 1\}$ and words $\{W_k\}$ constructed of i.i.d. letters with $P(W_1 = 0) = p_0 > 1/2$. In this case there are unique $l^-$ and $l^+$ satisfying equations (9) and (10) determined by:

$$l^-_0 = p_0 - \frac{\epsilon}{\log(p_0) - \log(1 - p_0)}, \qquad l^+_0 = p_0 + \frac{\epsilon}{\log(p_0) - \log(1 - p_0)}.$$

Selecting $0 < \epsilon < (\log(p_0) - \log(1 - p_0)) \min(p_0 - 1/2, 1 - p_0)$ ensures that the typical set is growing more slowly than $2^k$ and that $1/2 < l^-_0 < p_0 < l^+_0 < 1$.

With $l^W(\alpha)$ defined in equation (11), from equations (3) and (4) we have that

$$\Lambda_W(\alpha) = \begin{cases} \log(p_0) & \text{if } \alpha \le -1, \\ \alpha h(l^W(\alpha)) - D(l^W(\alpha)\|p) & \text{if } \alpha > -1, \end{cases} = \begin{cases} \log(p_0) & \text{if } \alpha \le -1, \\ (1 + \alpha) \log\left(p_0^{\frac{1}{1+\alpha}} + (1 - p_0)^{\frac{1}{1+\alpha}}\right) & \text{if } \alpha > -1. \end{cases}$$

From Lemmas 1 and 2 we obtain

$$\Lambda_{U^\epsilon}(\alpha) = \begin{cases} -h(l^-) & \text{if } \alpha \le -1, \\ \alpha h(l^-) & \text{if } \alpha > -1, \end{cases}$$

and

$$\Lambda_{W^\epsilon}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p),$$

where $l^*(\alpha)$ is defined in equation (13) and $\eta(\alpha)$ is defined in equation (12). With $\gamma$ defined in equation (5), we have $\gamma_W = 0$, $\gamma_{U^\epsilon} = h(l^-)$ and $\gamma_{W^\epsilon} = h(l^+)$ so that, as $h(l^-) > h(l^+)$, the ordering of the growth rates with word length of the set of most likely words from smallest to largest is: unconditioned source, conditioned source and uniform approximation.

From these sCGF equations, we can determine the average growth rates and estimates on the Guesswork distribution.
In particular, we have that

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k))) = \Lambda'_W(0) = h(p),$$

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(W_k^\epsilon))) = \Lambda'_{W^\epsilon}(0) = h(p),$$

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k^\epsilon))) = \Lambda'_{U^\epsilon}(0) = h(l^-).$$

As $h((x, 1 - x))$ is monotonically decreasing for $x > 1/2$ and $1/2 < l^-_0 < p_0$, the expectation of the logarithm of the Guesswork is growing faster for the uniform approximation than for either the unconditioned or conditioned word source. The growth rate of the expected Guesswork reveals more features. In particular, with $A = \eta(1) - (h(p) + \epsilon)$,

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(W_k)) = 2 \log\left(p_0^{\frac{1}{2}} + (1 - p_0)^{\frac{1}{2}}\right),$$

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(W_k^\epsilon)) = \begin{cases} 2 \log\left(p_0^{\frac{1}{2}} + (1 - p_0)^{\frac{1}{2}}\right), & A \le 0, \\ h(l^-) - D(l^-\|p), & A > 0, \end{cases}$$

$$\lim_{k\to\infty} \frac{1}{k} \log E(G(U_k^\epsilon)) = h(l^-).$$

For the growth rate of the expected Guesswork, from these it can be shown that there is no strict order between the unconditioned and uniform source, but there is a strict ordering between the uniform approximation and the true conditioned distribution, with the former being strictly larger.

With $\epsilon = 1/10$ and for a range of $p_0$, these formulae are illustrated in Figure 1. The top line plots

$$\lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k^\epsilon)) - \log(G(W_k))) = \lim_{k\to\infty} \frac{1}{k} E(\log(G(U_k^\epsilon)) - \log(G(W_k^\epsilon))) = h(l^-) - h(p),$$

showing that the expected growth rate in the logarithm of the Guesswork is always higher for the uniform approximation than both the conditioned and unconditioned sources. The second highest line plots the difference in growth rates of the expected Guesswork of the uniform approximation and the true conditioned source

$$\lim_{k\to\infty} \frac{1}{k} \log \frac{E(G(U_k^\epsilon))}{E(G(W_k^\epsilon))} = \begin{cases} h(l^-) - 2 \log\left(p_0^{\frac{1}{2}} + (1 - p_0)^{\frac{1}{2}}\right) & \text{if } \eta(1) \le h(p) + \epsilon, \\ D(l^-\|p) & \text{if } \eta(1) > h(p) + \epsilon. \end{cases}$$

That this difference is always positive, which can be established readily analytically, shows that the expected Guesswork of the true conditioned source is growing at a slower exponential rate than the uniform approximation.
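The three growth rates of the expected Guesswork in the binary example can be evaluated directly from the closed forms above. The sketch below is a numerical illustration only, with the hypothetical function name `growth_rates`; it assumes the constraint on $\epsilon$ stated in Section IV and raises an error otherwise.

```python
import math


def binary_entropy(x):
    """h((x, 1 - x)) in nats."""
    return -sum(q * math.log(q) for q in (x, 1.0 - x) if q > 0)


def growth_rates(p0, eps):
    """Return (unconditioned, conditioned, uniform) growth rates of E[G]."""
    p1 = 1.0 - p0
    log_odds = math.log(p0) - math.log(p1)
    if not 0.0 < eps < log_odds * min(p0 - 0.5, p1):
        raise ValueError("eps violates the constraint assumed in Section IV")
    l_minus = p0 - eps / log_odds                       # l^-_0
    h_minus = binary_entropy(l_minus)
    unconditioned = 2.0 * math.log(math.sqrt(p0) + math.sqrt(p1))
    # eta(1) from equation (12) with alpha = 1
    eta_1 = -(math.sqrt(p0) * math.log(p0) + math.sqrt(p1) * math.log(p1)) / (
        math.sqrt(p0) + math.sqrt(p1)
    )
    if eta_1 <= binary_entropy(p0) + eps:
        conditioned = unconditioned
    else:
        # h(l^-) + D(l^- || p) = h(p) + eps by equation (9)
        divergence = binary_entropy(p0) + eps - h_minus
        conditioned = h_minus - divergence
    return unconditioned, conditioned, h_minus


for p0 in (0.70, 0.80, 0.90, 0.95):
    print(p0, growth_rates(p0, 0.1))
```

With $\epsilon = 0.1$, the uniform rate exceeds the conditioned rate at every value tried, while the uniform rate is above the unconditioned rate at $p_0 = 0.7$ and below it at $p_0 = 0.8$, in line with the absence of a strict order between those two.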
The second line and the lowest line, the latter plotting the difference in growth rates of the uniform and unconditioned expected Guesswork,
$$\lim_{k\to\infty} \frac{1}{k} \log \frac{E(G(U_k))}{E(G(W_k))} = h(l^-) - 2 \log\left(p_0^{1/2} + (1 - p_0)^{1/2}\right),$$
initially agree. This difference can, depending on $p_0$ and $\epsilon$, be either positive or negative. It is negative if the typical set is particularly small in comparison to the number of unconditioned words.

[Figure 1: difference in expected growth rate (y-axis) plotted against $p_0$ (x-axis).]

Fig. 1. Bernoulli$(p_0, 1 - p_0)$ source. Difference in exponential growth rates of Guesswork between uniform approximation, unconditioned and conditioned distribution with $\epsilon = 0.1$. Top curve is the difference in expected logarithms between the uniform approximation and both the conditioned and unconditioned word sources. Bottom curve is the log-ratio of the expected Guesswork of the uniform and unconditioned word sources, with the latter harder to guess for large $p_0$. Middle curve is the log-ratio of the uniform and conditioned word sources, which initially follows the lower line, before separating and staying positive, showing that the conditioned source is always easier to guess than the typically used uniform approximation.

For $p_0 = 8/10$, the typical set is growing sufficiently slowly that a word selected from the uniform approximation is easier to guess than one from the unconditioned source. For this value, we illustrate the difference in Guesswork distributions between the unconditioned $\{W_k\}$, conditioned $\{\tilde{W}_k\}$ and uniform $\{U_k\}$ word sources. If we used the approximation in (8) directly, the graph would not be informative as the range of the unconditioned source is growing exponentially faster than the other two. Instead Figure 2 plots $-x - \Lambda^*(x)$ for each of the three processes. That is, using equation (8) and its equivalents for the other two processes, it plots $\frac{1}{k} \log G(w)$, where $G(w) \in \{1, \ldots, 2^k\}$, against the large deviation approximations to $\frac{1}{k} \log P(W_k = w)$, $\frac{1}{k} \log P(\tilde{W}_k = w)$ and $\frac{1}{k} \log P(U_k = w)$, as the resulting plot is unchanging in $k$.
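The quantity $-x - \Lambda^*(x)$ can be approximated by taking the Legendre-Fenchel transform of an sCGF over a finite grid of $\alpha$ values. The sketch below does this for the uniform and unconditioned sources of the binary example, using the expressions for $\Lambda_U$ and $\Lambda_W$ given earlier in this section. The function names and the grid are illustrative assumptions, and at $\alpha = -1$ the unconditioned sCGF is evaluated by its limiting value $\log(p_0)$; this is not the procedure used to produce Figure 2.

```python
import math


def scgf_uniform(alpha, h_minus):
    """Lambda_U(alpha) for the uniform source on the typical set."""
    return alpha * h_minus if alpha >= -1.0 else -h_minus


def scgf_unconditioned(alpha, p0):
    """Lambda_W(alpha) for the i.i.d. binary source with P(letter = 0) = p0 > 1/2."""
    if alpha <= -1.0:
        return math.log(p0)  # includes the limiting value at alpha = -1
    beta = 1.0 / (1.0 + alpha)
    # Numerically stable form of (1 + alpha) * log(p0**beta + (1 - p0)**beta).
    return math.log(p0) + (1.0 + alpha) * math.log1p(((1.0 - p0) / p0) ** beta)


def legendre(scgf, x, alphas):
    """Grid approximation to Lambda^*(x) = sup_alpha (alpha * x - Lambda(alpha))."""
    return max(a * x - scgf(a) for a in alphas)


def log_prob_rate(scgf, x, alphas):
    """Large deviation approximation -x - Lambda^*(x) to (1/k) log P(X = w)."""
    return -x - legendre(scgf, x, alphas)


# Grid of alpha values from -5 to 20 in steps of 0.01 (contains -1.0 exactly).
ALPHAS = [i / 100.0 for i in range(-500, 2001)]

if __name__ == "__main__":
    p0, eps = 0.8, 0.1
    l_minus = p0 - eps / (math.log(p0) - math.log(1.0 - p0))
    h_minus = -(l_minus * math.log(l_minus) + (1 - l_minus) * math.log(1 - l_minus))
    for x in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        u = log_prob_rate(lambda a: scgf_uniform(a, h_minus), x, ALPHAS)
        w = log_prob_rate(lambda a: scgf_unconditioned(a, p0), x, ALPHAS)
        print(f"x={x:.1f}  uniform={u:.4f}  unconditioned={w:.4f}")
```

For the uniform source the supremum is attained at $\alpha = -1$ whenever $0 \leq x \leq h(l^-)$, so the computed curve is flat at $-h(l^-)$, which is the plateau discussed in the text.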
The source of the discrepancy in expected Guesswork is apparent, with the unconditioned source having substantially more words to cover (due to the log x-scale). Both it and the true conditioned source have higher probability words that skew their Guesswork. The first plateau for the conditioned and uniform distributions corresponds to those words with approximately the highest probability; that is, the length of this plateau is $\gamma_{\tilde{W}}$ or $\gamma_U$, defined in equation (5), so that, for example, approximately $\exp(k\gamma_{\tilde{W}})$ words have probability of approximately $\exp(k g_{\tilde{W}})$.

[Figure 2: $-x - \Lambda^*_W(x)$, $-x - \Lambda^*_{\tilde{W}}(x)$ and $-x - \Lambda^*_U(x)$ (y-axis) plotted against $x$ (x-axis).]

Fig. 2. Bernoulli$(8/10, 2/10)$ source, $\epsilon = 0.1$. Guesswork distribution approximations. For large $k$, the x-axis is $x = \frac{1}{k} \log G(w)$ for $G(w) \in \{1, \ldots, 2^k\}$ and the y-axis is the large deviation approximation $\frac{1}{k} \log P(X = w) \approx -x - \Lambda^*_X(x)$ for $X = W_k$, $\tilde{W}_k$ and $X = U_k$.

V. CONCLUSION

By establishing that the expected Guesswork of a source conditioned on the typical set is growing with a smaller exponent than its usual uniform approximation, we have demonstrated that appealing to the AEP for the latter is erroneous in cryptanalysis and instead provide a correct methodology for identifying the Guesswork growth rate.

APPENDIX

The proportion of the letter $a \in A$ in a word $w = (w_1, \ldots, w_k) \in A^k$ is given by
$$n_k(w, a) := \frac{|\{1 \leq i \leq k : w_i = a\}|}{k}.$$
The number of words in a type $l = (l_0, \ldots, l_{m-1})$, where $l_a \geq 0$ for all $a \in A$ and $\sum_{a \in A} l_a = 1$, is given by
$$N_k(l) := |\{w \in A^k \text{ such that } n_k(w, a) = l_a \ \forall a \in A\}|.$$
The set of all types, those just in the typical set and smooth approximations to those in the typical set are denoted
$$\mathbb{L}_k := \{l : \exists w \in A^k \text{ such that } n_k(w, a) = l_a \ \forall a \in A\},$$
$$\mathbb{L}_{\epsilon,k} := \{l : \exists w \in T_k^{\epsilon} \text{ such that } n_k(w, a) = l_a \ \forall a \in A\},$$
$$\mathbb{L}_{\epsilon} := \left\{l : \sum_a l_a \log p_a \in [-h(p) - \epsilon, -h(p) + \epsilon]\right\},$$
where it can readily be seen that $\mathbb{L}_{\epsilon,k} \subset \mathbb{L}_{\epsilon}$ for all $k$.
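To make these definitions concrete, the sketch below computes letter proportions, the number of words in a type, and the size of the typical set for the binary source of Section IV by summing over the types it contains. The function names are hypothetical and the restriction to a binary alphabet is an assumption made for brevity.

```python
import math
from collections import Counter


def letter_proportions(word, alphabet):
    """n_k(w, a) for each letter a: the type of the word w."""
    counts = Counter(word)
    k = len(word)
    return tuple(counts[a] / k for a in alphabet)


def words_in_type(counts):
    """N_k(l) for integer letter counts k * l_a: a multinomial coefficient."""
    n = math.factorial(sum(counts))
    for c in counts:
        n //= math.factorial(c)
    return n


def typical_types_binary(k, p0, eps):
    """Letter counts (j, k - j) whose type l satisfies
    sum_a l_a log p_a in [-h(p) - eps, -h(p) + eps]."""
    h_p = -(p0 * math.log(p0) + (1.0 - p0) * math.log(1.0 - p0))
    types = []
    for j in range(k + 1):
        s = (j * math.log(p0) + (k - j) * math.log(1.0 - p0)) / k
        if -h_p - eps <= s <= -h_p + eps:
            types.append((j, k - j))
    return types


def typical_set_size(k, p0, eps):
    """Size of the typical set, counted type by type."""
    return sum(words_in_type(c) for c in typical_types_binary(k, p0, eps))


if __name__ == "__main__":
    print(letter_proportions("0010", "01"))
    for k in (10, 20, 200):
        size = typical_set_size(k, 0.8, 0.1)
        print(k, size, math.log(size) / k)
```

Because every word of a given type has the same probability under an i.i.d. source, membership of the typical set depends only on the type, which is why the count can be organised this way.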
For $\{U_k\}$ we need the following lemma.
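Lemma 3: The exponential growth rate of the size of the typical set is
$$\lim_{k\to\infty} \frac{1}{k} \log |T_k^{\epsilon}| = \begin{cases} \log m & \text{if } \log m \leq h(p) + \epsilon, \\ h(l^-) & \text{otherwise,} \end{cases}$$
where $l^-$ is defined in equation (9).

Proof: For fixed $k$, by the union bound
$$\max_{l \in \mathbb{L}_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!} \leq |T_k^{\epsilon}| \leq (k + 1)^m \max_{l \in \mathbb{L}_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!}.$$
For the logarithmic limit, these two bounds coincide, so consider the concave optimization problem
$$\max_{l \in \mathbb{L}_{\epsilon,k}} \frac{k!}{\prod_{a \in A} (k l_a)!}.$$
We can upper bound this optimization by replacing $\mathbb{L}_{\epsilon,k}$ with the smoother version, its superset $\mathbb{L}_{\epsilon}$. Using Stirling's bound we have that
$$\limsup_{k\to\infty} \frac{1}{k} \log \sup_{l \in \mathbb{L}_{\epsilon}} \frac{k!}{\prod_{a \in A} (k l_a)!} \leq \sup_{l \in \mathbb{L}_{\epsilon}} h(l) = \begin{cases} \log(m) & \text{if } h(p) + \epsilon \geq \log(m), \\ h(l^-) & \text{if } h(p) + \epsilon < \log(m). \end{cases}$$
For the lower bound, we need to construct a sequence $\{l^{(k)}\}$ such that $l^{(k)} \in \mathbb{L}_{\epsilon,k}$ for all $k$ sufficiently large and $h(l^{(k)})$ converges to either $\log(m)$ or $h(l^-)$, as appropriate. Let $l^* = (1/m, \ldots, 1/m)$ or $l^-$ respectively, let $c \in \arg\max_a p_a$ and define
$$l^{(k)}_a = \begin{cases} k^{-1} \lceil k l^*_a \rceil + 1 - \sum_{b \in A} \frac{1}{k} \lceil k l^*_b \rceil & \text{if } a = c, \\ k^{-1} \lceil k l^*_a \rceil & \text{if } a \neq c. \end{cases}$$
Then $l^{(k)} \in \mathbb{L}_{\epsilon,k}$ for all $k > -m \log(p_c)/(2\epsilon)$ and $h(l^{(k)}) \to h(l^*)$, as required.

Proof of Lemma 1: Considering $\{U_k\}$ first,
$$\alpha R_U\left(\frac{1}{1 + \alpha}\right) = \alpha \lim_{k\to\infty} \frac{1}{k} \log |T_k^{\epsilon}| = \alpha h(l^-),$$
by Lemma 3. To evaluate $\Lambda_U(\alpha)$, as for any $n \in \mathbb{N}$ and $\alpha > 0$
$$\sum_{i=1}^{n} i^{\alpha} \geq \int_0^n x^{\alpha}\, dx,$$
again using Lemma 3 we have
$$\alpha h(l^-) = \lim_{k\to\infty} \frac{1}{k} \log \frac{1}{1 + \alpha} |T_k^{\epsilon}|^{\alpha} \leq \lim_{k\to\infty} \frac{1}{k} \log E\left(e^{\alpha \log G(U_k)}\right) = \lim_{k\to\infty} \frac{1}{k} \log \frac{1}{|T_k^{\epsilon}|} \sum_{i=1}^{|T_k^{\epsilon}|} i^{\alpha} \leq \lim_{k\to\infty} \frac{1}{k} \log |T_k^{\epsilon}|^{\alpha} = \alpha h(l^-).$$
The reverse of these bounds holds for $\alpha \in (-1, 0]$, giving the result.

We break the argument for $\{\tilde{W}_k\}$ into three steps. Step 1 is to show the equivalence of the existence of $\Lambda_{\tilde{W}}(\alpha)$ and $\alpha R_{\tilde{W}}(1/(1 + \alpha))$ for $\alpha > -1$ with the existence of the following limit
$$\lim_{k\to\infty} \frac{1}{k} \log \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \right\}. \tag{14}$$
Step 2 then establishes this limit and identifies it.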
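Step 3 shows that $\Lambda'_{\tilde{W}}(\alpha)$ is continuous for $\alpha > -1$. To achieve steps 1 and 2, we adopt and adapt the method of types argument employed in the elongated web-version of [8].

Step 1: Two changes from the bounds of [8] Lemma 5.5 are necessary: the consideration of non-i.i.d. sources by restriction to $T_k^{\epsilon}$; and the extension of the $\alpha$ range to include $\alpha \in (-1, 0]$ from that for $\alpha \geq 0$ given in that document. Adjusted for conditioning on the typical set we get
$$\frac{1}{1 + \alpha} \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^{\epsilon}} P(W_k = w)} \right\} \leq E\left(e^{\alpha \log G(\tilde{W}_k)}\right) \leq (k + 1)^{m(1+\alpha)} \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^{\epsilon}} P(W_k = w)} \right\}. \tag{15}$$
The necessary modification of these inequalities for $\alpha \in (-1, 0]$ gives
$$\max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^{\epsilon}} P(W_k = w)} \right\} \leq E\left(e^{\alpha \log G(\tilde{W}_k)}\right) \leq \frac{(k + 1)^m}{1 + \alpha} \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^{\epsilon}} P(W_k = w)} \right\}. \tag{16}$$
To show the lower bound holds if $\alpha \in (-1, 0]$ let
$$l^* \in \arg\max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ \frac{N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a}}{\sum_{w \in T_k^{\epsilon}} P(W_k = w)} \right\}.$$
Taking $\liminf_{k\to\infty} k^{-1} \log$ and $\limsup_{k\to\infty} k^{-1} \log$ of equations (15) and (16) establishes that if the limit (14) exists, $\Lambda_{\tilde{W}}(\alpha)$ exists and equals it. Similar inequalities provide the same result for $\alpha R_{\tilde{W}}(1/(1 + \alpha))$.

Step 2: The problem has been reduced to establishing the existence of
$$\lim_{k\to\infty} \frac{1}{k} \log \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \right\}$$
and identifying it. The method of proof is similar to that employed at the start of Lemma 1 for $\{U_k\}$: we provide an upper bound for the limsup and then establish a corresponding lower bound.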
If $l^{(k)} \to l$ with $l^{(k)} \in \mathbb{L}_k$, then using Stirling's bounds we have that
$$\lim_{k\to\infty} \frac{1}{k} \log N_k(l^{(k)}) = h(l).$$
This convergence occurs uniformly in $l$ and so, as $\mathbb{L}_{\epsilon,k} \subset \mathbb{L}_{\epsilon}$ for all $k$,
$$\limsup_{k\to\infty} \frac{1}{k} \log \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \right\} \leq \sup_{l \in \mathbb{L}_{\epsilon}} \left( (1 + \alpha) h(l) + \sum_a l_a \log p_a \right) = \sup_{l \in \mathbb{L}_{\epsilon}} \left( \alpha h(l) - D(l\|p) \right). \tag{17}$$
This is a concave optimization problem in $l$ with convex constraints. Not requiring $l \in \mathbb{L}_{\epsilon}$, the unconstrained optimizer over all $l$ is attained at $l^W(\alpha)$ defined in equation (11), which determines $\eta(\alpha)$ in equation (12). Thus the optimizer of the constrained problem (17) can be identified as that given in equation (13). Thus we have that
$$\limsup_{k\to\infty} \frac{1}{k} \log \max_{l \in \mathbb{L}_{\epsilon,k}} \left\{ N_k(l)^{1+\alpha} \prod_{a \in A} p_a^{k l_a} \right\} \leq \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p),$$
where $l^*(\alpha)$ is defined in equation (13).

We complete the proof by generating a matching lower bound. To do so, for given $l^*(\alpha)$ we need only create a sequence such that $l^{(k)} \to l^*(\alpha)$ and $l^{(k)} \in \mathbb{L}_{\epsilon,k}$ for all $k$. If $l^*(\alpha) = l^-$, then the sequence used in the proof of Lemma 3 suffices. For $l^*(\alpha) = l^+$, we use the same sequence but with floors in lieu of ceilings and the surplus probability distributed to a least likely letter instead of a most likely letter. For $l^*(\alpha) = l^W(\alpha)$, either of these sequences can be used.

Step 3: As $\Lambda_{\tilde{W}}(\alpha) = \alpha h(l^*(\alpha)) - D(l^*(\alpha)\|p)$, with $l^*(\alpha)$ defined in equation (13),
$$\frac{d}{d\alpha} \Lambda_{\tilde{W}}(\alpha) = h(l^*(\alpha)) + \Lambda_{\tilde{W}}(\alpha) \frac{d}{d\alpha} l^*(\alpha).$$
Thus to establish continuity it suffices to establish continuity of $l^*(\alpha)$ and its derivative, which can be done readily by calculus.

Proof of Lemma 2:
This can be established directly by a letter substitution argument; more generically, however, it can be seen as a consequence of the existence of specific min-entropy as a result of Assumption 1 via the following inequalities:
$$\alpha R\left(\frac{1}{1 + \alpha}\right) - (1 + \alpha) \log m \leq \liminf_{k\to\infty} \frac{1 + \alpha}{k} \log \frac{m^k P(G(W_k) = 1)^{1/(1+\alpha)}}{m^k} \tag{18}$$
$$\leq \limsup_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1) = (1 + \alpha) \limsup_{k\to\infty} \frac{1}{k} \log P(G(W_k) = 1)^{1/(1+\alpha)}$$
$$\leq (1 + \alpha) \limsup_{k\to\infty} \frac{1}{k} \log \left( P(G(W_k) = 1)^{1/(1+\alpha)} + \sum_{i=2}^{m^k} P(G(W_k) = i)^{1/(1+\alpha)} \right) = \alpha R\left(\frac{1}{1 + \alpha}\right).$$
Equation (18) holds as $P(G(W_k) = 1) \geq P(G(W_k) = i)$ for all $i \in \{1, \ldots, m^k\}$. The veracity of the lemma follows as $\alpha R((1 + \alpha)^{-1})$ exists and is continuous for all $\alpha > -1$ by Assumption 1 and $(1 + \alpha) \log m$ tends to 0 as $\alpha \downarrow -1$.

ACKNOWLEDGMENT

M.C. and K.D. supported by the Science Foundation Ireland Grant No. 11/PI/1177 and the Irish Higher Educational Authority (HEA) PRTLI Network Mathematics Grant. F.d.P.C. and M.M. sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, recommendations, and conclusions are those of the authors and are not necessarily endorsed by the United States Government. Specifically, this work was supported by Information Systems of ASD(R&E).

REFERENCES

[1] A. Menezes, S. Vanstone, and P. V. Oorschot, Handbook of Applied Cryptography. CRC Press, Inc., 1996.
[2] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991.
[3] J. Pliam, "On the incomparability of entropy and marginal guesswork in brute-force attacks," in INDOCRYPT, 2000, pp. 67–79.
[4] S. Draper, A. Khisti, E. Martinian, A. Vetro, and J. Yedidia, "Secure storage of fingerprint biometrics using Slepian-Wolf codes," in ITA Workshop, 2007.
[5] Y. Sutcu, S. Rane, J. Yedidia, S. Draper, and A. Vetro, "Feature extraction for a Slepian-Wolf biometric system using LDPC codes," in ISIT, 2008.
[6] F. du Pin Calmon, M. Médard, L. Zeger, J. Barros, M. Christiansen, and K. Duffy, "Lists that are smaller than their parts: A coding approach to tunable secrecy," in Proc. 50th Allerton Conference, 2012.
[7] E. Arikan, "An inequality on guessing and its application to sequential decoding," IEEE Trans. Inf. Theory, vol. 42, no. 1, pp. 99–105, 1996.
[8] D. Malone and W. Sullivan, "Guesswork and entropy," IEEE Trans. Inf. Theory, vol. 50, no. 4, pp. 525–526, 2004, http://www.maths.tcd.ie/~dwmalone/p/guess02.pdf.
[9] C.-E. Pfister and W. Sullivan, "Rényi entropy, guesswork moments and large deviations," IEEE Trans. Inf. Theory, no. 11, pp. 2794–00, 2004.
[10] M. K. Hanawal and R. Sundaresan, "Guessing revisited: A large deviations approach," IEEE Trans. Inf. Theory, vol. 57, no. 1, pp. 70–78, 2011.
[11] M. M. Christiansen and K. R. Duffy, "Guesswork, large deviations and Shannon entropy," IEEE Trans. Inf. Theory, vol. 59, no. 2, pp. 796–802, 2013.
[12] J. L. Massey, "Guessing and entropy," IEEE Int. Symp. Inf. Theory, pp. 204–204, 1994.
[13] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications. Springer-Verlag, 1998.
[14] E. Arikan and N. Merhav, "Guessing subject to distortion," IEEE Trans. Inf. Theory, vol. 44, pp. 1041–1056, 1998.
[15] R. Sundaresan, "Guessing based on length functions," in Proc. 2007 International Symp. on Inf. Th., 2007.