2008 Generalized SEN

Optimal All-To-All Personalized Exchange Algorithms in Generalized Shuffle-Exchange Networks Student: YuChieh Chiu Advisor: Chiuyuan Chen Department of Applied Mathematics National Chiao Tung University July 31, 2008

Transcript of 2008 Generalized SEN

Page 1: 2008 Generalized SEN

Optimal All-To-All Personalized Exchange Algorithms in Generalized Shuffle-Exchange Networks

Student: YuChieh Chiu Advisor: Chiuyuan Chen

Department of Applied Mathematics

National Chiao Tung University

July 31, 2008

Page 2: 2008 Generalized SEN

Shuffle

[Figure: the shuffle connection from inputs 000, 001, ..., 111 to outputs 000, 001, ..., 111]

When $N = 2^n$, the shuffle operation is:

$\pi(i_{n-1} i_{n-2} \cdots i_1 i_0) = i_{n-2} \cdots i_1 i_0 i_{n-1}$.
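As a minimal sketch of the shuffle above: the permutation is a one-bit left rotation of the n-bit address. The function name `perfect_shuffle` is a hypothetical helper, not something taken from the slides.

```python
def perfect_shuffle(i, n):
    """Rotate the n-bit address i left by one bit (requires 0 <= i < 2**n)."""
    mask = (1 << n) - 1
    # Shift left, drop the bit that overflows, and re-insert it at the bottom.
    return ((i << 1) & mask) | (i >> (n - 1))


if __name__ == "__main__":
    for i in range(8):
        print(f"{i:03b} -> {perfect_shuffle(i, 3):03b}")
```

For n = 3 this maps 100 to 001 and 011 to 110, which is the wiring pattern the shuffle figure depicts.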

Page 3: 2008 Generalized SEN

Binary Switch

A binary switch is a 2 × 2 switch element (SE)

Legitimate States = 4

Permutation Connections = 2

Page 4: 2008 Generalized SEN

Binary Switch

Control bit

0 for straight and 1 for exchange (cross)

the 2 broadcast states are not used in this paper

Figure: The different settings of the 2 × 2 SE (straight, exchange, upper-broadcast, lower-broadcast)
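The two permutation states can be sketched as a tiny function. This is only an illustration under the convention stated on the slide (0 for straight, 1 for exchange); the name `switch_element` is hypothetical and the broadcast states are left out, as in the talk.

```python
def switch_element(upper, lower, control):
    """Return (upper_out, lower_out) of a 2 x 2 SE.

    control = 0 keeps the inputs in place (straight),
    control = 1 swaps them (exchange).
    """
    if control not in (0, 1):
        raise ValueError("control bit must be 0 or 1")
    return (upper, lower) if control == 0 else (lower, upper)


if __name__ == "__main__":
    print(switch_element("a", "b", 0))
    print(switch_element("a", "b", 1))
```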

Page 5: 2008 Generalized SEN

Multistage Interconnection Network (MIN)

three typical MINs

8 × 8 baseline network, shuffle exchange network, and indirect binary n-cube network

Page 6: 2008 Generalized SEN

MIN Implementation

Control (X)

Source (S) Destination (D)

Page 7: 2008 Generalized SEN

N × N Shuffle Exchange Networks

N × N Shuffle Exchange Networks = N × N SENs

N = number of nodes, n = number of stages

$N = 2^n$

Figure: 4 × 4 and 8 × 8 SENs

Page 8: 2008 Generalized SEN

N × N Generalized Shuffle Exchange Networks

N × N Generalized Shuffle Exchange Networks = N × N GSENs

N = number of nodes, n + 1 = number of stages

$2^n < N \le 2^{n+1}$

SENs are contained in GSENs

Figure: 4 × 4, 6 × 6, and 8 × 8 GSENs
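The stage count follows directly from the inequality above. The sketch below also includes the generalized shuffle wiring as it is usually defined for GSENs, $\pi(p) = (2p + \lfloor 2p/N \rfloor) \bmod N$; that formula is not stated on the slides and is an assumption here, as are the function names.

```python
def gsen_stage_count(N):
    """Return n + 1, where n is the integer with 2**n < N <= 2**(n + 1)."""
    if N < 2 or N % 2:
        raise ValueError("N must be an even integer >= 2")
    return (N - 1).bit_length()


def generalized_shuffle(p, N):
    """Assumed wiring: link p feeds switch p mod N/2, lower port if p >= N/2."""
    return (2 * p + (2 * p) // N) % N


if __name__ == "__main__":
    for N in (4, 6, 8, 10):
        print(N, gsen_stage_count(N), [generalized_shuffle(p, N) for p in range(N)])
```

For N a power of two the assumed wiring reduces to the ordinary shuffle, which is consistent with SENs being contained in GSENs.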

Page 9: 2008 Generalized SEN

NOT unique-path

A MIN is unique-path if there is a unique path between each pair of input and output.

SENs are unique-path

GSENs are NOT unique-path

[Figure: 10 × 10 GSEN with inputs and outputs 0 to 9 and stages 0 to 3]
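One way to see the difference is to count paths by brute force. The sketch below assumes the usual generalized shuffle wiring (link p enters switch p mod N/2, and leaving on output port f puts the message on link 2(p mod N/2) + f); the wiring and the function names are assumptions, not taken from the slides.

```python
from itertools import product


def count_paths(N):
    """paths[i][j] = number of distinct paths from input i to output j."""
    stages = (N - 1).bit_length()
    half = N // 2
    paths = [[0] * N for _ in range(N)]
    for i in range(N):
        # Each path is one choice of output port (0 = upper, 1 = lower) per stage.
        for ports in product((0, 1), repeat=stages):
            p = i
            for f in ports:
                p = 2 * (p % half) + f
            paths[i][p] += 1
    return paths


def is_unique_path(N):
    return all(c == 1 for row in count_paths(N) for c in row)


if __name__ == "__main__":
    for N in (4, 6, 8, 10):
        print(N, is_unique_path(N))
```

When N is a power of two the number of port choices equals N, so every input reaches every output exactly once; for other N there are more port choices than outputs, so some pairs get two paths.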


Page 11: 2008 Generalized SEN

Communications among processors

one-to-one

one-to-many

all-to-all

all-to-all broadcast

all-to-all personalized exchange (ATAPE) ← our focus here

Definition

In ATAPE, each processor sends a specific message to every other processor.
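To make the definition concrete, here is a network-free sketch of what ATAPE has to achieve: if `outbox[i][j]` is the message processor i prepared for processor j, then after the exchange processor j holds one message from every i. The names are hypothetical, and the sketch ignores how a MIN actually delivers the messages.

```python
def all_to_all_personalized_exchange(outbox):
    """Return inbox, where inbox[j][i] is the message j received from i."""
    n = len(outbox)
    if any(len(row) != n for row in outbox):
        raise ValueError("outbox must be an n x n table")
    return [[outbox[i][j] for i in range(n)] for j in range(n)]


if __name__ == "__main__":
    outbox = [[f"{i}->{j}" for j in range(3)] for i in range(3)]
    for j, row in enumerate(all_to_all_personalized_exchange(outbox)):
        print(j, row)
```

Seen this way the exchange is a transposition of the message table.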

Page 12: 2008 Generalized SEN

Why ATAPE?

ATAPE occurs in many applications:

matrix transposition

fast Fourier transform (FFT)

Compare MIN with other networks

network model   scalability   communication delay
hypercubes      poor          shorter
meshes          better        higher
tori            better        higher
MINs            better        shorter

Page 13: 2008 Generalized SEN

Is ATAPE easy?

Figure: 6 × 6 GSEN


Page 16: 2008 Generalized SEN

Stage control

Stage Control (SC): all the SEs at the same stage are set to the same state.

Figure: SC in 10 × 10 GSEN (inputs and outputs 0 to 9, stages 0 to 3)
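A small simulator helps to see what one stage-controlled configuration does. The sketch assumes the usual generalized shuffle wiring (link p enters switch p mod N/2, on the lower port when p >= N/2) and assumes that the leftmost configuration bit controls stage 0; both are assumptions of this sketch, as is the function name.

```python
def route_stage_control(N, config):
    """Return dest, where dest[i] is the output reached by input i."""
    stages = (N - 1).bit_length()
    half = N // 2
    dest = []
    for i in range(N):
        p = i
        for s in range(stages):
            bit = (config >> (stages - 1 - s)) & 1   # leftmost bit -> stage 0
            switch, in_port = p % half, p // half    # in_port 0 = upper
            # Straight keeps the port, exchange flips it.
            p = 2 * switch + (in_port ^ bit)
        dest.append(p)
    return dest


if __name__ == "__main__":
    for config in (0b0000, 0b0001, 0b1001):
        print(f"{config:04b}", route_stage_control(10, config))
```

Each configuration yields a permutation of the outputs, because every switch is either straight or exchange.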

Page 17: 2008 Generalized SEN

Configuration

Network configuration: the states of the switches of the network

in matrix form

1 0 0 1
1 0 0 1
1 0 0 1
1 0 0 1
1 0 0 1

under stage control, a configuration is represented as an (n + 1)-tuple

(1 0 0 1)

or as an integer C

$(1001)_2 = 9$
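The tuple and integer forms are interchangeable; a short sketch of the conversion, reading the tuple as a binary number with its first entry as the most significant bit (function names are hypothetical):

```python
def config_to_int(bits):
    """Read a tuple of stage bits as a binary number, first entry first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def int_to_config(C, stages):
    """Inverse of config_to_int for a network with the given stage count."""
    return tuple((C >> (stages - 1 - s)) & 1 for s in range(stages))


if __name__ == "__main__":
    print(config_to_int((1, 0, 0, 1)))
    print(int_to_config(9, 4))
```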

Page 18: 2008 Generalized SEN

Previous Results

Yang & Wang (2000), IEEE Transactions on Parallel and Distributed Systems, "Optimal all-to-all personalized exchange in self-routable multistage networks."

Yang & Wang propose an optimal all-to-all personalized exchange algorithm that uses the stage control technique together with the Latin square method.

Yang & Wang's algorithm requires constructing a Latin square in advance and allocating memory for storing the Latin square.

MINs in this paper must be unique-path and self-routable.

Page 19: 2008 Generalized SEN

Previous Results

Massini (2003), Discrete Applied Mathematics, "All-to-all personalized communication on multistage interconnection networks."

Massini's algorithm requires neither precomputation of a Latin square nor extra memory space.

MINs in this paper must be unique-path and self-routable.

Page 20: 2008 Generalized SEN

Motivation

The purpose of this thesis

To our knowledge, no one has studied ATAPE algorithms in GSENs.

The purpose of this thesis is to propose ATAPE algorithms for GSENs.

We propose two algorithms

Algorithm 1 uses the stage control technique and works for all even N.

In contrast, Algorithm 2 works for all N ≡ 2 (mod 4) without stage control.

Both are optimal.

Page 21: 2008 Generalized SEN

R(N) and Rsc(N)

Definition

Let $R(N)$ denote the minimum number of network configurations required to realize ATAPE in an N × N GSEN. Also, let $R_{sc}(N)$ denote the minimum number of network configurations required to realize ATAPE in an N × N GSEN when the stage control technique is assumed.

Lemma 1

$N \le R(N) \le R_{sc}(N) \le 2^{n+1}$.

Page 22: 2008 Generalized SEN

Main results

We propose two optimal ATAPE algorithms for N ×N GSENs.

Algorithm 1

with stage control

needs $2^{n+1}$ configurations

needs a destination matrix constructed in advance

Algorithm 2

without stage control

needs N configurations

computes destinations directly

Page 23: 2008 Generalized SEN

Algorithm 1

Preprocessing phase: the destination matrix constructing phase

for each processor i ($0 \le i < N$) do in parallel
    for each time k ($0 \le k < 2^{n+1}$) do sequentially
        prepare a null message;
        equip the message with configuration k and send it out;
        when an output (say, j) receives the message, set $s_{j,k} = i$
    for each j ($0 \le j < N$) do
        for each k ($0 \le k < 2^{n+1}$) do
            if $s_{j,k} = i$ then set $d_{i,k} = j$;
for each i ($0 \le i < N$) do
    for each j ($0 \le j < N$) do
        set mark[j] = 0;

Page 24: 2008 Generalized SEN

Algorithm 1

Preprocessing phase (cont.)

    for each k ($0 \le k < 2^{n+1}$) do
        if mark[$d_{i,k}$] = 0 then set mark[$d_{i,k}$] = 1;
        else set $d_{i,k} = -1$;

S =

0 4 8 5 7 6 1 3 5 9 3 0 2 1 6 8
4 0 5 8 6 7 3 1 9 5 0 3 1 2 8 6
8 3 0 2 1 5 7 9 3 8 5 7 6 0 2 4
3 8 2 0 5 1 9 7 8 3 7 5 0 6 4 2
7 2 6 3 0 9 4 5 2 7 1 8 5 4 9 0
2 7 3 6 9 0 5 4 7 2 8 1 4 5 0 9
6 1 7 9 4 8 0 2 1 6 2 4 9 3 5 7
1 6 9 7 8 4 2 0 6 1 4 2 3 9 7 5
5 9 4 1 3 2 6 8 0 4 9 6 8 7 1 3
9 5 1 4 2 3 8 6 4 0 6 9 7 8 3 1

D =

0 1 2 3 4 5 6 7 8 9 − − − − − −
7 6 9 8 2 3 0 1 − − 4 5 − − − −
5 4 3 2 9 8 7 6 − − − − 0 1 − −
3 2 5 4 8 9 1 0 − − − − 7 6 − −
1 0 8 9 6 7 4 5 − − − − − − 3 2
8 9 1 0 3 2 5 4 − − − − − − 6 7
6 7 4 5 1 0 8 9 − − − − 2 3 − −
4 5 6 7 0 1 2 3 − − − − 9 8 − −
2 3 0 1 7 6 9 8 − − 5 4 − − − −
9 8 7 6 5 4 3 2 1 0 − − − − − −
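The preprocessing phase can be sketched as a sequential simulation rather than by sending null messages through a real network. The wiring (link p enters switch p mod N/2, lower port when p >= N/2) and the bit order (leftmost bit controls stage 0) are assumptions of this sketch; `build_matrices` is a hypothetical name, and −1 stands for the "−" entries of D.

```python
def build_matrices(N):
    """Return (S, D): S[j][k] = source seen at output j under configuration k,
    D[i][k] = destination of input i under k, or -1 if already covered."""
    stages = (N - 1).bit_length()
    half = N // 2
    K = 1 << stages
    S = [[None] * K for _ in range(N)]
    D = [[-1] * K for _ in range(N)]
    for k in range(K):
        for i in range(N):
            p = i
            for s in range(stages):
                bit = (k >> (stages - 1 - s)) & 1
                p = 2 * (p % half) + ((p // half) ^ bit)
            S[p][k] = i
            D[i][k] = p
    # Keep only the first configuration that reaches each destination.
    for i in range(N):
        seen = set()
        for k in range(K):
            if D[i][k] in seen:
                D[i][k] = -1
            else:
                seen.add(D[i][k])
    return S, D


if __name__ == "__main__":
    S, D = build_matrices(10)
    print(S[0])
    print(D[1])
```

Under these assumptions the first rows agree with the matrices S and D shown on the slide for N = 10.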

Page 25: 2008 Generalized SEN

Algorithm 1

Phase 1: The message preparing phase

for each processor i ($0 \le i < N$) do in parallel
    for each time k ($0 \le k < 2^{n+1}$) do sequentially
        if $d_{i,k} \ne -1$ then prepare a personalized message to $d_{i,k}$
        else prepare a null message;
        equip the message with configuration k;
        insert the message into the message queue of i;

Phase 2: The message sending phase

for each processor i ($0 \le i < N$) do in parallel
    for each time k ($0 \le k < 2^{n+1}$) do sequentially
        send a message in the message queue of i;

Page 26: 2008 Generalized SEN

Correct and Optimal

Correctness

Easy part, since we use every configuration k, $0 \le k < 2^{n+1}$.

Optimality

Hard part. We have to show that every configuration contains a unique path.

Definition

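[Figure: a path from input i to output j through stage 0, stage 1, ..., stage n, labeled with the bit pairs $(f_n, b_n)$, $(f_{n-1}, b_{n-1})$, ..., $(f_0, b_0)$]

Forward control tag $F = f_n 2^n + \cdots + f_1 2^1 + f_0 2^0$

Backward control tag $B = b_n 2^n + \cdots + b_1 2^1 + b_0 2^0$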
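The forward tag can be tied to a destination with a short sketch. It assumes that bit $f_{n-s}$ names the output port (0 upper, 1 lower) taken at stage s, and that link p enters switch p mod N/2; under those assumptions the stage-by-stage walk collapses to $j = (i \cdot 2^{n+1} + F) \bmod N$. Both the interpretation of the bits and the function names are assumptions of this sketch.

```python
def destination_by_stages(N, i, F):
    """Follow the output ports named by the bits of F, leftmost bit first."""
    stages = (N - 1).bit_length()
    half = N // 2
    p = i
    for s in range(stages):
        f = (F >> (stages - 1 - s)) & 1
        p = 2 * (p % half) + f      # same as (2 * p + f) % N
    return p


def destination_closed_form(N, i, F):
    """Closed form obtained by unrolling the recurrence p -> (2p + f) mod N."""
    stages = (N - 1).bit_length()
    return (i * (1 << stages) + F) % N


if __name__ == "__main__":
    print([destination_closed_form(10, 0, F) for F in range(16)])
```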

Page 27: 2008 Generalized SEN

Observation

Page 28: 2008 Generalized SEN

sketch of proof

[Diagram: proof outline relating the configuration C and the control tags B and F through Lem. 7, Lem. 8, Lem. 9, Lem. 10, Lem. 11 (no 2 holes), and Thm. FBCT to Thm. 21]

Page 29: 2008 Generalized SEN

Without SC

Can we do better, if we abandon the SC technique?

N ≤ R(N), at least N configurations

Alternating stage control (ASC)

A variation of stage control in which the states of the switches of a stage alternate between straight and cross.

[Figure: ASC in a 10 × 10 GSEN with inputs and outputs 0 to 9; the stages are labeled 1, 1, 0, 0]
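One ASC stage can be sketched as follows. The alternation rule used here (switch y is straight when the control bit equals y mod 2 and exchanged otherwise) is inferred from the round-by-round example at the end of the deck, so treat it as an assumption; `asc_stage` is a hypothetical name.

```python
def asc_stage(messages, bit):
    """Apply the shuffle and one alternating-stage-control stage.

    messages[p] is the message currently on link p; the result lists the
    messages on the links leaving the stage.
    """
    N = len(messages)
    half = N // 2
    out = [None] * N
    for y in range(half):
        upper, lower = messages[y], messages[y + half]
        if bit ^ (y % 2):            # assumed rule: state alternates with y
            upper, lower = lower, upper
        out[2 * y], out[2 * y + 1] = upper, lower
    return out


if __name__ == "__main__":
    print(asc_stage(list(range(10)), 0))
    print(asc_stage(list(range(10)), 1))
```

With control bit 0 on the inputs 0 to 9 this gives 0 5 6 1 2 7 8 3 4 9, the same arrangement that appears after the first stage in the Round 1 slide.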

Page 30: 2008 Generalized SEN

Observation

To choose N configurations

We reordered the configurations $A_F$ by the control tag $F$ of input processor 0 and found that $A_0 = A_N$, $A_1 = A_{N+1}$, ...

Page 31: 2008 Generalized SEN

Algorithm 2

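Phase 1: The message preparing phase

for each processor i ($0 \le i < N$) do in parallel
    calculate $m_i$ by the formula:

$$m_i = \begin{cases} (i \cdot 2^{n+1}) \bmod N, & \text{if } i \text{ is even;} \\ ((i + 1) \cdot 2^{n+1} - 1) \bmod N, & \text{if } i \text{ is odd;} \end{cases}$$

i\k  0 1 2 3 4 5 6 7 8 9
0    0 1 8 3 6 5 4 7 2 9
1    1 0 3 8 5 6 7 4 9 2
2    2 3 0 5 8 7 6 9 4 1
3    3 2 5 0 7 8 9 6 1 4
4    4 5 2 7 0 9 8 1 6 3
5    5 4 7 2 9 0 1 8 3 6
6    6 7 4 9 2 1 0 3 8 5
7    7 6 9 4 1 2 3 0 5 8
8    8 9 6 1 4 3 2 5 0 7
9    9 8 1 6 3 4 5 2 7 0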

Page 32: 2008 Generalized SEN

Algorithm 2

Phase 1: The message preparing phase

    for each time k ($0 \le k < N$) do sequentially
        prepare a personalized message for destination processor

$$\begin{cases} (m_i + k) \bmod N, & \text{if } i \text{ is even;} \\ (m_i - k) \bmod N, & \text{if } i \text{ is odd;} \end{cases}$$

        equip the message with configuration $A_k = k \oplus \lfloor k/2 \rfloor$;
        insert the message into the message queue of i;

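The arithmetic of Phase 1 can be sketched directly from the formulas: $m_i$, the destination at time k, and the configuration $A_k$. Function names are hypothetical; only the formulas come from the slides.

```python
def gray(k):
    """A_k = k XOR floor(k / 2)."""
    return k ^ (k >> 1)


def m(i, N):
    """m_i as defined in Phase 1, with 2**(n + 1) = 2**(number of stages)."""
    stages = (N - 1).bit_length()
    if i % 2 == 0:
        return (i << stages) % N
    return (((i + 1) << stages) - 1) % N


def destination(i, k, N):
    """Destination of the message processor i sends at time k."""
    return (m(i, N) + k) % N if i % 2 == 0 else (m(i, N) - k) % N


if __name__ == "__main__":
    N = 10
    print([f"{gray(k):04b}" for k in range(N)])
    for i in range(N):
        print(i, [destination(i, k, N) for k in range(N)])
```

For N ≡ 2 (mod 4) the resulting schedule is a Latin square: each processor addresses every destination once, and at each time k no two processors address the same destination.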

Page 33: 2008 Generalized SEN

Algorithm 2

Phase 2: The message sending phase

for each processor i ($0 \le i < N$) do in parallel
    for each time k ($0 \le k < N$) do sequentially
        send a message in the message queue of i;


Page 34: 2008 Generalized SEN

Correct and Optimal

Optimality

Easy part. We use only N configurations $A_k$, $0 \le k < N$, and $N \le R(N)$.

Correctness

Hard part. We proved that the link from any input i to any output j exists in our algorithm, and that the message sent by input i reaches exactly the output j calculated in our algorithm.

Why N ≡ 2 (mod 4)?

In the proof of correctness, property (∗) holds only when N ≡ 2 (mod 4). Thus Algorithm 2 works only when N ≡ 2 (mod 4).
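The correctness claim can be spot-checked by simulation for small N. The sketch below assumes the usual generalized shuffle wiring, assumes that the leftmost bit of $A_k$ controls stage 0, and assumes the ASC rule that switch y is exchanged when the control bit differs from y mod 2; these conventions are inferred from the worked 10 × 10 example in this deck, and all names are hypothetical. It is a check on two instances, not a proof.

```python
def simulate(N, k):
    """Route all inputs through the GSEN under A_k; return dest[i]."""
    stages = (N - 1).bit_length()
    half = N // 2
    A = k ^ (k >> 1)
    pos = list(range(N))                 # pos[p] = source currently on link p
    for s in range(stages):
        bit = (A >> (stages - 1 - s)) & 1
        nxt = [None] * N
        for y in range(half):
            upper, lower = pos[y], pos[y + half]
            if bit ^ (y % 2):            # assumed ASC alternation rule
                upper, lower = lower, upper
            nxt[2 * y], nxt[2 * y + 1] = upper, lower
        pos = nxt
    dest = [None] * N
    for p, src in enumerate(pos):
        dest[src] = p
    return dest


def computed(N, k):
    """Destinations given by the formulas of Algorithm 2."""
    stages = (N - 1).bit_length()
    out = []
    for i in range(N):
        if i % 2 == 0:
            out.append(((i << stages) + k) % N)
        else:
            out.append((((i + 1) << stages) - 1 - k) % N)
    return out


def algorithm2_matches(N):
    return all(simulate(N, k) == computed(N, k) for k in range(N))


if __name__ == "__main__":
    print(algorithm2_matches(6), algorithm2_matches(10))
```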

Page 35: 2008 Generalized SEN

Property (∗)

Property (∗)

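1. If the alternating control bit is 0, $E \xrightarrow{0} E$ and $O \xrightarrow{1} O$;

2. Else the control bit is 1, and $E \xrightarrow{1} O$ and $O \xrightarrow{0} E$.

[Figure: four panels (a) to (d) for a 10 × 10 GSEN, showing the even ports 0, 2, 4, 6, 8 and the odd ports 1, 3, 5, 7, 9; panels (a) and (b) are labeled with control bit 0, panels (c) and (d) with control bit 1]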

Page 36: 2008 Generalized SEN

Property (∗)

Proof of property (∗)

$N/2$ is odd because N ≡ 2 (mod 4).

$x_0 = y$ and $x_1 = y + N/2$.

Thus one of the input ports of switch y is even while the other input port is odd.

$z_0$ is even and $z_1$ is odd.

Now set the control bit to 0 or 1 and take switch y even or odd; there are 4 cases in total.

Examining the states of the switches in all 4 cases completes the proof.

[Figure: one stage with links 0 to N − 1 and switches 0 to N/2 − 1; switch y has input ports $x_0$, $x_1$ and output ports $z_0$, $z_1$]
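The four cases of the proof can also be checked mechanically. The sketch uses the port numbering from the slide ($x_0 = y$, $x_1 = y + N/2$, $z_0 = 2y$, $z_1 = 2y + 1$) and assumes the ASC rule that switch y is exchanged when the control bit differs from y mod 2; that rule and the function name are assumptions.

```python
def property_star_holds(N):
    """True if control bit 0 preserves port parity and bit 1 flips it."""
    half = N // 2
    for bit in (0, 1):
        for y in range(half):
            x0, x1 = y, y + half
            z0, z1 = 2 * y, 2 * y + 1
            if bit ^ (y % 2):                 # exchange
                routes = {x0: z1, x1: z0}
            else:                             # straight
                routes = {x0: z0, x1: z1}
            for x, z in routes.items():
                # x + z is even when parity is preserved, odd when it flips.
                if (x + z) % 2 != bit:
                    return False
    return True


if __name__ == "__main__":
    for N in (4, 6, 8, 10, 12, 14):
        print(N, property_star_holds(N))
```

When N/2 is even, both inputs of a switch have the same parity while its outputs differ in parity, so the property cannot hold; this is where N ≡ 2 (mod 4) enters.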

Page 37: 2008 Generalized SEN

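[Diagram: proof outline for Algorithm 2, relating Lem. 6 ($j = (i \cdot 2^{n+1} + T) \bmod N$), Lem. 16, Lem. 17, Thm. 15 (relation between the control tags and A), Property (∗) with cases (i) $E \xrightarrow{0} E$, $O \xrightarrow{1} O$ and (ii) $E \xrightarrow{1} O$, $O \xrightarrow{0} E$, Thm. 18 ($A_0, A_1, \ldots, A_{N-1}$ fulfill ATAPE), and Thm. 19 (the algorithm is correct)]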

Page 38: 2008 Generalized SEN

Concluding remarks

In this thesis

We have proposed two optimal ATAPE algorithms for GSENs

We obtained $R_{sc}(N) = 2^{n+1}$

We obtained $R(N) = N$ if N ≡ 2 (mod 4)

Open questions

Determine $R(N)$ for N ≡ 0 (mod 4).

Generalize to d × d switch elements.

Page 39: 2008 Generalized SEN

Initial GSEN (messages listed by link position 0 to 9)

configuration 0000, at the inputs: 0 1 2 3 4 5 6 7 8 9

Page 40: 2008 Generalized SEN

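Round 1 (messages listed by link position 0 to 9)

configuration 0001, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0000, after stage 0: 0 5 6 1 2 7 8 3 4 9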

Page 41: 2008 Generalized SEN

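Round 2 (messages listed by link position 0 to 9)

configuration 0011, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0001, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0000, after stage 1: 0 7 8 5 6 3 4 1 2 9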

Page 42: 2008 Generalized SEN

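Round 3 (messages listed by link position 0 to 9)

configuration 0010, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0011, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0001, after stage 1: 0 7 8 5 6 3 4 1 2 9
configuration 0000, after stage 2: 0 3 4 7 8 1 2 5 6 9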

Page 43: 2008 Generalized SEN

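Round 4 (messages listed by link position 0 to 9)

configuration 0110, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0010, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0011, after stage 1: 0 7 8 5 6 3 4 1 2 9
configuration 0001, after stage 2: 0 3 4 7 8 1 2 5 6 9
configuration 0000, after stage 3: 0 1 2 3 4 5 6 7 8 9

Table so far: column k = 0.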

Page 44: 2008 Generalized SEN

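Round 5 (messages listed by link position 0 to 9)

configuration 0111, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0110, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0010, after stage 1: 0 7 8 5 6 3 4 1 2 9
configuration 0011, after stage 2: 3 0 7 4 1 8 5 2 9 6
configuration 0001, after stage 3: 1 0 3 2 5 4 7 6 9 8

Table so far: columns k = 0 to 1.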

Page 45: 2008 Generalized SEN

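Round 6 (messages listed by link position 0 to 9)

configuration 0101, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0111, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0110, after stage 1: 7 0 5 8 3 6 1 4 9 2
configuration 0010, after stage 2: 3 0 7 4 1 8 5 2 9 6
configuration 0011, after stage 3: 8 3 0 5 2 7 4 9 6 1

Table so far: columns k = 0 to 2.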

Page 46: 2008 Generalized SEN

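Round 7 (messages listed by link position 0 to 9)

configuration 0100, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0101, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0111, after stage 1: 7 0 5 8 3 6 1 4 9 2
configuration 0110, after stage 2: 6 7 0 1 4 5 8 9 2 3
configuration 0010, after stage 3: 3 8 5 0 7 2 9 4 1 6

Table so far: columns k = 0 to 3.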

Page 47: 2008 Generalized SEN

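Round 8 (messages listed by link position 0 to 9)

configuration 1100, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 0100, after stage 0: 0 5 6 1 2 7 8 3 4 9
configuration 0101, after stage 1: 7 0 5 8 3 6 1 4 9 2
configuration 0111, after stage 2: 6 7 0 1 4 5 8 9 2 3
configuration 0110, after stage 3: 6 5 8 7 0 9 2 1 4 3

Table so far: columns k = 0 to 4.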

Page 48: 2008 Generalized SEN

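Round 9 (messages listed by link position 0 to 9)

configuration 1101, at the inputs: 0 1 2 3 4 5 6 7 8 9
configuration 1100, after stage 0: 5 0 1 6 7 2 3 8 9 4
configuration 0100, after stage 1: 7 0 5 8 3 6 1 4 9 2
configuration 0101, after stage 2: 7 6 1 0 5 4 9 8 3 2
configuration 0111, after stage 3: 5 6 7 8 9 0 1 2 3 4

Table so far: columns k = 0 to 5.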

Page 49: 2008 Generalized SEN

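Round 10 (messages listed by link position 0 to 9)

configuration 1101, after stage 0: 5 0 1 6 7 2 3 8 9 4
configuration 1100, after stage 1: 2 5 0 3 8 1 6 9 4 7
configuration 0100, after stage 2: 7 6 1 0 5 4 9 8 3 2
configuration 0101, after stage 3: 4 7 6 9 8 1 0 3 2 5

Table so far: columns k = 0 to 6.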

Page 50: 2008 Generalized SEN

Round 11 (messages listed by link position 0 to 9)

configuration 1101, after stage 1: 2 5 0 3 8 1 6 9 4 7
configuration 1100, after stage 2: 2 1 6 5 0 9 4 3 8 7
configuration 0100, after stage 3: 7 4 9 6 1 8 3 0 5 2

Table so far: columns k = 0 to 7.

Page 51: 2008 Generalized SEN

Round 12 (messages listed by link position 0 to 9)

configuration 1101, after stage 2: 2 1 6 5 0 9 4 3 8 7
configuration 1100, after stage 3: 2 9 4 1 6 3 8 5 0 7

Table so far: columns k = 0 to 8.

Page 52: 2008 Generalized SEN

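Round 13 (messages listed by link position 0 to 9)

configuration 1101, after stage 3: 9 2 1 4 3 6 5 8 7 0

Completed table (rows 0 to 9, columns k = 0 to 9):

0 1 8 3 6 5 4 7 2 9
1 0 3 8 5 6 7 4 9 2
2 3 0 5 8 7 6 9 4 1
3 2 5 0 7 8 9 6 1 4
4 5 2 7 0 9 8 1 6 3
5 4 7 2 9 0 1 8 3 6
6 7 4 9 2 1 0 3 8 5
7 6 9 4 1 2 3 0 5 8
8 9 6 1 4 3 2 5 0 7
9 8 1 6 3 4 5 2 7 0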