1. Composite Field Multiplier based on Look-Up Table
for Elliptic Curve Cryptography Implementation
Marisa W. Paryasto#1, Budi Rahardjo#2, Fajar Yuliawan*3, Intan Muchtadi-Alamsyah*4, Kuspriyanto#5
#
School of Electrical Engineering and Informatics, Institut Teknologi Bandung
Jl. Ganesha No. 10 Bandung 40132 - Indonesia
1
marisa@stei.itb.ac.id
2
br@paume.itb.ac.id
5
kuspriyanto@yahoo.com
*
Algebra Research Group, Faculty of Mathematics and Natural Sciences, Institut Teknologi Bandung
Jl. Ganesha No. 10 Bandung 40132 - Indonesia
3
fajar.yuliawan@math.itb.ac.id
4
ntan@math.itb.ac.id
Abstract— Abstract--- In this work we propose the use of The keylength used in ECC determines the level of
composite field to implement finite field multiplication, which will security. In general, longer key requires more components in
be use in ECC implementation. We use 299-bit keylength and the corresponding hardware implementation. As the internet
GF((213)23) is used instead of GF(2299) . Composite field multiplier technology grows, the demand to implement ECC on
can be implemented using conventional multiplication operation or
constraint devices increases. As a consequence, there is a need
using LUT (Look-Up Table). In this paper, LUT is used for
multiplication in ground field and Karatsuba Offman Algorithm for efficient algorithm and architecture for implementing ECC
for the extension field multiplication. A generic architecture for the on constraint devices. For the security level requirement at the
multiplier is presented. Implementation is done with VHDL with present, ECC should have above 160 bits keylength.
the target device Altera DE -2. We implement 299 bits in this research. However
implementing 299-bit ECC in constrained device, simulated
with FPGA board, cannot be done using conventional
Keywords— security, cryptography, elliptic curve, finite field, algorithm and architecture. For example, 299-bit binary
multiplier, composite field classic multiplier cannot fit in our FPGA device. One
alternative solution is using composite field.
Composite field is a finite field divided into subfield
I. INTRODUCTION (ground field) and extended field. It is done by composing a
long string of bit into smaller groups of bit. This
Elliptic Curve Cryptography (ECC) is a public-key encryption representation allows arithmetic operations to be done in
that requires high computation for solving complex arithmetic smaller chunks of string so the complex operations can be
operations. Elliptic curve is used in cryptography because of broke down into simpler operations. The focus of our paper is
its mathematical properties that fits the encryption process. implementation of multiplier using composite field. We focus
Elliptic curve has its own arithmetic operations, very specific on multiplier because multiplication is the most frequently
and unpredictable, makes it cryptographically strong and used operation in ECC. The benefit of using composite field is
becomes the most preferable cryptography algorithm to the lower of memory usage and lower components.
replace RSA. Unfortunately, to implement ECC requires The use of composite field characteristic of dividing one
sophisticated mathematical skills. There are many layers, big chunk of operation into smaller ones, combined with the
restrictions, and combinations that make various ECC recursive Karatsuba Offman Algorithm (KOA), allows us to
implementations difficult to compare. Every level of ECC implement multiplier for ECC in limited resources with
offers many things to explore. In this work we focus on the adequate level of security. Other multipliers are limited up to
lowest level: finite field operations. Since multiplication is the 260 bits.
most frequently used operations, we investigate existing
multipliers algoritms and make improvements.
II. PREVIOUS WORK
2. The first idea multiplier for composite field was initiated by the standard alog table. It contains the values (k, gk) sorted
Mastrovito [1]. The multiplier is called hybrid-multiplier. It with repect to the index k , where k = 0, 1, 2, . . .,2n+1-2 . Since
performs the multiplication by doing multiplication serially in the values of i and j in step 1 and 2 of the multiplications are
the ground field and parallel in the extension field. in the range 0, 2n-1 , the range of k = i + j is 0, 2n+1-2 . Thus
Mastrovito's multiplier basically works using a multiplication modular addition operation can be omitted and the ground
matrix that includes the reduction process. Paar in his works field multiplication operation can be simplified as follows:
[2,3,4] added some improvements to Mastrovito's. Paar
implemented multiplication in the ground field using KOA 1. i := log[A]
and Mastrovito for multiplication in the extension field. Later, 2. j := log[B]
Rosner [5] conducted further research of Mastrovito and Paar. 3. k := i + j
4. C := extended-alog[k]
Look-Up Table (LUT) for composite field operations has
been implemented in [6]. The algorithm for ground field 3.2 Composite Field
multiplication using logaritmic table lookup is proven to be
fast. Let GF(2k) denote a binary extension field defined over
GF(2) . If the elements of the set
III. METHODOLOGY
3.1 Look-Up Table
are linearly independent, then B1 forms a polynomial basis for
LUT is used for storing log and alog (anti log) table to GF(2k) . Given an element A in GF(2k) , it can be written as
make multiplication operation in the ground field GF(2n)
perform faster. [6] concludes that n does not have to be
exactly the same as a single computer word (e.g. 8, 16). It has
been proved that n < 2 is more efficient because the table will
be smaller and thus will take advantage of the first level cache
of computers. where a0 , a1 , ..., ak-1 in GF(2) are the coefficients.
One of the reason why this research uses LUT for storing An extension field defined over one of its subfields is
precomputed log and alog table in GF(213) is that Table 2 in known as a composite field. It is denoted as GF((2n)m) where
[6] shows that LUT for n =13 is efficient for polynomial GF(2n) is known as the ground field over which the composite
basis multiplication compared to bigger n. To construct field is defined. For each given degree, there is only one finite
logaritmic lookup table, a primitive element g in GF(2n) is field of characteristic 2, both the binary and composite fields
selected to be the generator of the field GF(2n) , so that every refer to the same field although their representation methods
element A in this field can be written as a power of g as A=gi , are different. To represent the elements in the composite field
where 0 < i < 2n-1 . Then the powers of the primitive element GF((2n)m) , the basis that can be used is
gi can be computed for i=0, 1, 2,.. , 2n-1 , and obtain 2n pairs
of the form (A, i). Two tables sorting these pairs have to be
constructed in two different ways: the log table sorted with
where β is the root of a degree m irreducible polinomial
respect to A and the alog table sorted with the respect i .
whose coefficients are in the base field GF(2n). An element A
These tables then can be used for performing the field
in GF((2n)m) can be written as
multiplication, squaring and inversion operations. Given two
elements A, B in GF(2n) , the multiplication C=AB is
performed as follows:
1. i := log[A] where a0' , a1' , ..., am-1' in GF(2n). The operations in the
2. j := log[B] ground field are carried out using pre-calculated logarithmic
3. k := i+j (mod 2n-1) lookup tables. To construct logarithmic tables, a primitive
4. C := alog[k] element in GF(2n) need to be found.
The steps above is based on the fact that C = AB = gigj = The reason why we use GF((213)23) in this implementation
i+j mod 2n-1
g . Ground field multiplication requires three memory is because GF((213)23)=GF(2299) which complies with the
access and a single addition operation with modulus 2n-1 . security level needed. The other reasons is that
GCD(13,23)=1 so irreducible polynomial can be used for
[6] also proposed the use of the extended alog table for both ground field and extension field, and there are trinomials
eliminating modular addition operation (step 3). The extended and polynomials available. Carefully chosen irreducible
alog table is 2n+1-1 long, which is about twice the length of
3. polynomial will reduce the complexity of multiplication algorithm eventually terminates after t steps. In the final steps
operation. the polynomials M(t)(x) are degenarated into single
coefficients. Since every step halves the number of
coefficients, the algorithm terminates after t=log2 m steps.
3.3 Karatsuba-Offman Algorithm Multiplier
IV. DESIGN AND IMPLEMENTATION
The Karatsuba-Offman Algorithm (KOA) is a recursive The generic architecture of our circuit is shown in the
method for efficient polynomial multiplication. It is known in Figure 1. On the left side there are two set of input registers,
[2] that two arbitrary polynomials in one variable of degree each for input A and B. The size of the input registers depends
less or equal to m-1 with coefficient from a field GF(2m) can on the bit size. For our particular case, it is 13-bit. If we use
be multiplied with not more than m2 multiplications in GF(2m) off the shelf components, we may have to use 16-bit registers.
and (m-1)2 additions in GF(2m). The KOA provides a
recursive algorithm which reduces the above multiplicative
and additive (for large enough m) complexities. A KOA
restricted to polynomials, where m=2t with t an integer is
presented in [2].
Let a(x) and b(x) be two elements in GF(2m). Find a
product d(x) = a(x)b(x) , with degree < 2m – 2. Both elements
can be represented in the polynomial basis as
Fig.1 Multiplier General Architecture
Using the equation above, the polynomial product is given as Right next to the input registers are temporary registers that
are used to store addition terms before they are multiplied.
These terms depend on the KOA splitting. For 223, we can
split it into 12+11 or 8+8+7. The decision will effect the
number of temporary registers (and adders needed).
Auxiliary polynomials M(1)(x) are defined:
In the center of our circuit is the GF(213) multiplier. In this
particular design we have only one multiplier, implemented as
LUT multiplier. Multiplication is done in serial fashion.
Figure 2 shows an estimated timing diagram, which will be
implemented in the sequencer. For example, when
multiplying a22 and b22, the enable lines of registers related to
those element and the result register are activated.
If area is permitting, we could add more multipliers.
Additional multipliers will reduce the time to perform all
And then the product can be defined as: multiplications at the expense of more area. Careful timing
consideration must be done in order to avoid race condition is
multiple multipliers are implemented.
The results of multiplications are stored in temporary
The algorithm becomes recursive if it is applied again to registers before they are added to create the final results. Thus,
the polynomials given above. The next iteration step splits the there is a network of adders on the right side.
polynomials AL , BL , AH , BH , (AL+AH) , and (BL+BH) again
in half. With these newly halved polynomials, new auxiliary
polynomials M(2)(x) can be defined in a similar way. The
4. Fig.2 Estimated timing diagram
Below is the snippets of VHDL code LUT implementing
GF(213) multiplier. A 13-bit multiplier requires 213 entries ≈
8000 entries, for each table. If we implement this in general
purpose hardware then it should be implemented in 16 bit (2
bytes). In our implementation, the log and alog table occupies
2*8*2 bytes = 32 Kbytes .
process (clk) begin
if clk'event and clk = '1' then
case a is
when "0000000000000" => i <= "0000000000001";
when "0000000000001" => i <= "0000000000010";
when "0000000000010" => i <= "0000000000100";
when "0000000000011" => i <= "0000000001000";
when "0000000000100" => i <= "0000000010000";
when "0000000000101" => i <= "0000000100000";
…
V. CONCLUSIONS
We have presented a composite field multiplier using LUT
and KOA. A 13-bit LUT multiplier is used in the ground field,
KOA multiplier is used in the extension field. A general
architecture of the design is presented.
REFERENCES
[1] Mastrovito Edoardo. VLSI Architecture for Computations in
Galois Fields. PhD thesis, Linkoping University, 1991.
[2] Christof Paar. Efficient VLSI Architectures for Bit-parallel
Computation in Galois Fields. PhD thesis, 1994.
[3] Christof Paar. Fast Arithmetic Architectures for Public-Key
Algorithms over Galois Fields GF((2n)m), pages 363–378.
Number 1233 in Lecture Notes in Computer Science. Springer-
Verlag, 1997.
[4] Christof Paar and Peter Fleischmann. Fast arithmetic for public-
key algorithms in galois fields with composite exponents. IEEE
Transactions on Computers, 48(10):1025–1034, October 1999.
[5] Martin Christopher Rosner. Elliptic curve cryptosystems on
reconfigurable hardware. Master’s thesis, Worcester Polytechnic
Institute, May 1998.
[6] E. Savas and C. K. Koc. Efficient methods for composite fields
arithmetic. Technical report, Oregon State University, 1999.