High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture...

26
1 High Speed Cryptoprocessor for η T Pairing on 128-bit Secure Supersingular Elliptic Curves over Characteristic Two Fields Santosh Ghosh, Dipanwita Roy Chowdhury, and Abhijit Das Computer Science and Engineering Indian Institute of Technology Kharagpur India, 721302 {santosh,drc,abhij}@cse.iitkgp.ernet.in CHES 2011, Nara, Japan

Transcript of High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture...

Page 1: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

1

High Speed Cryptoprocessor for ηT Pairing on 128-bit Secure Supersingular Elliptic Curves over

Characteristic Two Fields

Santosh Ghosh, Dipanwita Roy Chowdhury, and Abhijit Das

Computer Science and EngineeringIndian Institute of Technology Kharagpur

India, 721302{santosh,drc,abhij}@cse.iitkgp.ernet.in

CHES 2011, Nara, Japan

Page 2: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

Introduction

ContributionsKaratsuba multiplier over F21223 fieldArchitectures for ηT pairing

Conclusion

2

Contents

CHES 2011, Nara, Japan

Page 3: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

Introduction

• Pairing for designing cryptographic scheme introduced in 2000− ID-based encryption (IBE) scheme by Boneh and Franklin [1]

• It is well suited for identity based cryptography− has gained lot of importance in recent times

• As a natural consequence, implementations of pairings are also extremely important

• This paper broadly addresses design techniques of a pairing cryptoprocessorwith high security level

3

[1] Boneh, D., and Franklin, M.K.: Identity-based encryption from the Weil pairing. Crypto 2001, LNCS 2139, pp. 213-229, 2001.

CHES 2011, Nara, Japan

Page 4: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

Introduction cont…• Cryptographic pairings are computed on elliptic or hyperelliptic curves

− defined over suitably large finite fields− having small embedding degree [2, 3].

• The security of a pairing depends on the underlying algebraic curves and respective field types

− Example of an128-bit secure pairing : ηT pairing computed on a supersingular elliptic curve defined over F21223 and having embedding degree k = 4.

• NIST recommendation: 128-bit symmetric security is essential beyond 2030− it is of importance to explore the efficient implementation techniques of 128-bit

secure pairings on different platforms.

4

[2] Hoffstein, J., Pipher, J., and Silverman, J.H.: An introduction to mathmatical cryptography. Springer, 2008.[3] Galbraith, S.: Pairings. In I. F. Blake, G. Seroussi, and N. P. Smart, editors. Advances in elliptic curve cryptography.

London Mathematical Society Lecture Note Series, chapter IX. Cambridge University Press, 2005.

CHES 2011, Nara, Japan

Page 5: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

Existing works

5CHES 2011, Nara, Japan

• Hardware implementation of 128-bit secure pairings was introduced in 2009, individually by Kammler et al. [4] and Fan et al. [5].

− Described hardware implementation techniques for computing 128-bit secure pairings over Barreto-Naehrig curves (BN curves) [6].

• Thereafter, designs in [7, 8, 9, 10] are appeared in literature− computes 128-bit secure pairings in 2.3ms, 16.4ms, 3.5ms, and 1.07ms.

• High-speed software implementations reported in [11, 12] − compute 128-bit secure pairings in 0.832ms and 1.87ms.

[4] Kammler et al: Designing an ASIP for cryptographic pairings over Barreto-Naehrig curves. CHES 2009.[5] Fan et al: Faster Fp-arithmetic for cryptographic pairings on Barreto-Naehrig curves. CHES 2009.[6] Barreto, P.S.L.M., and Naehrig, M.: Pairing-friendly elliptic curves of prime order. SAC 2005.[7] Estibals, N.: Compact hardware for computing the Tate pairing over 128-bitsecurity supersingular curves. Pairing 2010.[8] Ghosh et al: High speed flexible pairing cryptoprocessor on FPGA platform. Pairing 2010.[9] Aranha et al: Optimal Eta pairing on supersingular genus-2 binary hyperelliptic curves. ePrint Report 2010/559.[10] Duquesne et al: A FPGA pairing implementation using the residue number system. ePrint Report 2011/176.[11] Beuchat et al: High-speed software implementation of the Optimal Ate pairing over Barreto-Naehrig curves. Pairing 2010.[12] Beuchat et al: Multicore Implementation of the Tate Pairing over Supersingular Elliptic Curves. ePrint Report 2009/276.

Page 6: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

Major contributions of the paper are:

It explores area-time tradeoff designs of Karatsuba multiplier over F21223

field.

It further explores high speed architecture for computing ηT pairing on supersingular elliptic curves.

It provides the first hardware implementation result of an 128-bit secure pairing on elliptic curves over characteristic two fields.

The proposed design achieves the fastest computation (190 µs) of an 128-bit secure pairing.

6

Contributions

CHES 2011, Nara, Japan

Page 7: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

7

The F21223-Multiplier

CHES 2011, Nara, Japan

Multiplication is the key operation of a pairing computation

The 128-bit secure ηT pairing demands multiplication in an 1223-bit characteristic-two field.

Karatsuba multiplication is an efficient and popular technique for fields like Fqm.

• It is a divide-and-conquer algorithm• An m-bit multiplication is divided recursively into several

m/k-bit multiplications with small k ∈ {2, 3}.

Page 8: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

8

The F21223-Multiplier (Cont…)

CHES 2011, Nara, Japan

Multiplication for k = 2 in Fqm could be computed as :

An m-bit multiplication can be performed by: - three m/2-bit multiplications - four m-bit and two m/2-bit additions.

Implementation could be performed in several ways.- We show four different design techniques

Page 9: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

9

1. Fully Parallel Multiplier for F21223

CHES 2011, Nara, Japan

Decomposition is done as:

The synthesis tool estimates 324342 LUTs for an 1223-bit fully parallel Karatsuba multiplier.

− makes it infeasible to implement on a single Virtex-4 FPGA device

Page 10: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

10

2. Serial Use of 612-bit Parallel Multiplier

CHES 2011, Nara, Japan

It consists of a fully parallel 612-bit Karatsuba multiplier• Three 612-bit multiplications are performed in serial for computing

a multiplication in F21223

The synthesis tool estimates 95324 LUTs. − It is feasible to implement on a high-end single FPGA device

− a full pairing hardware demands much more circuits than a single multiplier

− may infeasible to put in a single FPGA.

Page 11: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

11

3. Serial Use of 306-bit Parallel Multiplier

CHES 2011, Nara, Japan

It consists of a fully parallel 306-bit Karatsuba multiplier• Nine 306-bit multiplications are performed in serial for computing

a multiplication in F21223

The computation is performed as:

- Three 612-bit multiplications: a0b0, a1b1, and (a1+a0)(b1+b0)- performed as:

Three 306-bit multiplications: a01b01, a00b00, and (a01+a00)(b01+b00)

- To sum up : nine 306-bit multiplications.

Page 12: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

12

Multiplier Architecture

CHES 2011, Nara, Japan

b c

RL

k

L RL RLR RLRL RL

306-bit hybrid-parallel Karatsuba multiplier

h 4 h 31 0

c1

h 2

1 0

c1

h 1

1 0

c1

b 11

1 0

c0

b 101 0

c0

h 0

1 0

c1

b 01

1 0

c0

1 0

c0

+2

1 0

c1

b 00

d0d1d2d3

c3 c2c4c5

1 0 c61 0c7

+4+4

e0e1e2e3

c9 c8c10c11

1 0 c121 0c13

+4+4

f0f1f2f3

c15 c14c16c17

1 0 c181 0c19

+4+4

t0t1

c20c21

r0r1r2r3

c22 c22c22c22

+4

r4r5r6r7

c22 c22c22c22

+4 +4 +4

+2

+2

+2

+2

+4a d

f = a + b + c + db

a

+2 f = a + b

a b

g4

g3

1 0

c1

g2

1 0

c1

g1 1 0

c1

a11

1 0

c0

a10

1 0c0

g0 1 0

c1

a01

1 0

c0

1 0

c0

+2

1 0

c1

a00

+2

+2

+2

+2

a0-305 a306-611 a612-917 a918-1222

b0-305b306-611b612-917b918-1222

• The operands of nine multiplications are stored into two sets of nine 306-bit parallel shift registers.

• The registers are automatically reloaded by synchronous shift operations.

- ensures two correct operands at a00and b00 registers.

• Multiplier latency: one clock cycle.

• Partial results of 1223-bit multiplication are accumulated accordingly. (Algorithm is provided in Appendix)

• Latency of one 1223-bit multiplication:10 clock cycles

Page 13: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

13

Multiplier on FPGA

CHES 2011, Nara, Japan

Demands ≈30k LUT. • affordable to implement on a medium range FPGA

Page 14: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

14

4. Serial use of 153-bit parallel multiplier

CHES 2011, Nara, Japan

Demands low resources : 16231 LUTs

Requires 27 serial use

Latency of one 1223-bit multiplication : 151ns• 2.5 times slower than 306-bit parallel multiplier• The A⋅T value is 2.46.

- is 1.2 times higher than 306-bit parallel multiplier

Serial use of 306-bit parallel multiplier provides the most optimized design.

Page 15: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

15

The ηT Pairing Cryptoprocessor over F21223

CHES 2011, Nara, Japan

The pairing computation consists of two major operations1. the non-reduced pairing (Miller’s algorithm)2. the final exponentiation

Main Features:Consists of a common datapath for both operationsAdequate parallelism is applied to achieve high speed

Major difference with existing architecture of [13]:Contrast to two separate coprocessors current design has one processing unit for both operations.

[13] Beuchat et al.: Fast architectures for the T pairing over small-characteristic supersingular elliptic curves.IEEE Transactions on Computers,

Page 16: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

16

Computation of ηT pairing

CHES 2011, Nara, Japan

-1 and 2 are dependent operations

- Performed in serial

- Parallelism is applied inside these steps

1

2

Final exponentiation

Page 17: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

17

Proposed Architecture

CHES 2011, Nara, Japan

x1

s(i) c1

0 1

+2 1

x1

c2

1 0 resetx2

1 0 reset

( )2

y11 0 reset

y21 0 reset

( )2

x(i)1

c3x(i-1)2 c2y(i)

1 c3y(i-1)2

reset

0 1c4

+2

+2 1 +2+2

0 1 c5

x + xs + x

y + yy + y + x + 1

c7

0 1

+2

c6

(i-1)0f c9

0 1 c8

f (i-1)1

c11

0 1 c10

f (i-1)2

c13

0 1 c12

f (i-1)3

c14g(i)0

c14g(i)1

1 0

6 5 4 3 2 1 0 3 2 1 0

1223-bit hybrid Karatsuba multiplication

and reduction unit

a b

c15 c16

0 1 c17 0 1 c17

+2 +2 +2

c19r0c20r 1 m2, m3,

m5, m6

m1 ,m

4 +2

c25(i)0f

+2

c26(i)1f

+2

c27(i)2f

+2

c28(i)3f

+2

+2 +2

(i)1

(i-1)2

(i) (i-1)2

(i-1)2

(i)1

(i-1)2

(i)1

(i)1

t (i)0 t (i)

1

Squaring and addition in

final exponentiation

c18

Data access mechanism

c29

0 1 c21 0 1 c22 0 1 c23 0 1 c24

c29 c29 c29

0t

1t

dt

compute squaring and square roots

hold intermediate results of (i-1)-th iteration.

computesmultiplications

combine the results of i-thiteration

performs operations other than multiplication in final exponentiation

Page 18: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

18

Execution procedure

CHES 2011, Nara, Japan

Initialization : Step 1 to step 5 is preformed synchronously with reset.

Computation of G(i) : - the sparse value of in {1, u, v, uv} basis is represented by .

- it consists of one multiplication .- in total it takes 12 clock cycles.

Sparse Multiplication over : - it consists only six multiplications in .- in total it takes 61 clock cycles.

Computation Cost of Miller's Algorithm :

- one iteration takes 73 clock cycles. - in total its cost 44688 clock cycles.

Page 19: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

19

Execution procedure

CHES 2011, Nara, Japan

Final Exponentiation : - the output of Miller’s algorithm is raised to the power of (22446 - 1)(21223 – 2612 + 1).

- powering G = g0 + g1u + g2v +g3uv in is easy :

- further one inversion followed by one multiplication in.

- in total, cost of final exponentiation is : 98M + 1842S + 135A.

- the proposed cryptoprocessor computes final exponentiation in 2922 clock cycles.

Total clock cycle count for computing an 128-bit secure ηTpairing is 47610 on our proposed cryptoprocessor.

Page 20: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

20

Experimental results

CHES 2011, Nara, Japan

• The whole design has been done in Verilog (HDL).

• Results from the place-and-route report of Xilinx ISE Design Suit is shown here:

• it finishes computation of one 128-bit secure ηT pairing in 190 µs on a Virtex-6 FPGA.

Page 21: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

21

Comparison

CHES 2011, Nara, Japan

Two aspects of the proposed design

1. Existing ηT pairing processors over characteristic-two fields

Higher security level

Lower area

Intermediate speed

[28] Shu et al. : Reconfigurable computing approach for tate pairing cryptosystems over binary fields. IEEE Transactions on Computers, 2009.

[5] Beuchat et al.: Fast architectures for the T pairing over small-characteristic supersingular elliptic curves.IEEE Transactions on Computers, 2011.

Page 22: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

22

Comparison cont…

CHES 2011, Nara, Japan

2. Existing 128-bit secure pairing implementations irrespective of underlying curve and field types.

Smallest AT valueIntermediate area- Highest speed- First in micro-second range

Page 23: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

23

Comparison cont…

CHES 2011, Nara, Japan

Existing high speed software for 128-bit pairings are also slower than our proposed design. Here we summarize the software results.

The most efficient on take 0.83 ms, which is 4.3 times slower than our proposed design

Page 24: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

24

Conclusion

• Area and time optimized hybrid Karatsuba multiplier for F21223

o Significant parallelism for speed upo Moderate area for optimization.

• Pairing cryptoprocessoro A common datapath for both non-reduced pairing and final

exponentiation.o Reduces the overall logic cellso It computes ηT pairing in characteristic-two field with higher

security (128:105) in half area.o it achieves eight times speedup and provides the best area * time

product compared to the existing designs.

CHES 2011, Nara, Japan

Page 25: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

25

Minor corrections:

CHES 2011, Nara, Japan

1.

g2 and g3 should be g0 and g1, respectively

2.

[MHz] should be at the bottom of Frequency

Page 26: High Speed Cryptoprocessor for ηT Pairing on 128 …...It further explores high speed architecture for computing η T pairing on supersingular elliptic curves. It provides the first

26CHES 2011, Nara, Japan