Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

17
Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010

description

Domain Distribution Domains may be distributed across locales var D: domain(2) dmapped Block(CompGrid, …) = …; A distribution defines… …ownership of the domain’s indices (and its arrays’ elements) …default work ownership for operations on the domains/arrays  e.g., forall loops or promoted operations …memory layout/representation of array elements/domain indices …implementation of operations on its domains and arrays  e.g., accessors, iterators, communication patterns, … D A B CompGrid L0L1L2L3 L4L5L6L7 Distributions Data Parallelism Task Parallelism Locality Control Target Machine Base Language

Transcript of Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Page 1: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Chapel: User-Defined Distributions

Brad ChamberlainCray Inc.

CSEP 524May 20, 2010

Page 2: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Chapel DistributionsDistributions: “Recipes for parallel, distributed arrays”

• help the compiler map from the computation’s global view…

…down to the fragmented, per-processor implementation

=

α ·+

=

α ·+

=

α ·+

=

α ·+

=

α ·+

MEMORY MEMORY MEMORY MEMORY

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 3: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Distribution

Domains may be distributed across localesvar D: domain(2) dmapped Block(CompGrid, …) = …;

A distribution defines……ownership of the domain’s indices (and its arrays’ elements)…default work ownership for operations on the domains/arrays

e.g., forall loops or promoted operations…memory layout/representation of array elements/domain indices …implementation of operations on its domains and arrays

e.g., accessors, iterators, communication patterns, …

D AB

CompGrid

L0 L1 L2 L3

L4 L5 L6 L7

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 4: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Distributions Any domain type may be distributed Distributions do not affect program semantics

• only implementation details and therefore performance

“steve”“lee”“sung”“david”“jacob”“albert”“brad”

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 5: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Distributions

“steve”“lee”“sung”“david”“jacob”“albert”“brad”

Any domain type may be distributed Distributions do not affect program semantics

• only implementation details and therefore performance

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 6: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Distributions: Goals & Research Advanced users can write their own distributions

• specified in Chapel using lower-level language features

Chapel will provide a standard library of distributions• written using the same user-defined distribution mechanism

(Draft paper describing user-defined distribution strategy available by request)

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 7: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

A Simple Distribution: Block1DIntent: block a 1D index space across a set of locales

L0 L1 L20 1-1 ……

Use a bounding boxto compute the blocking

8… …

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 8: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Distributions vs. DomainsQ1: Why distinguish between distributions and domains?Q2: Why do distributions map an index space rather than a

fixed index set?A: To permit several domains to share a single distribution

• amortizes the overheads of storing a distribution• supports trivial domain/array alignment and compiler optimizations

const D : …dmapped B1 = [1..8], outerD: …dmapped B1 = [0..9], innerD: subdomain(D) = [2..7], slideD: subdomain(D) = [4..6];

L0 L1 L2

Sharing a distributionsupports trivial alignment

of these domains

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 9: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Distributions vs. DomainsQ1: Why distinguish between distributions and domains?Q2: Why do distributions map an index space rather than a

fixed index set?A: To permit several domains to share a single distribution

• amortizes the overheads of storing a distribution• supports trivial domain/array alignment and compiler optimizations

const D : …dmapped B1 = [1..8], outerD: …dmapped B2 = [0..9], innerD: …dmapped B3 = [2..7], slideD: …dmapped B4 = [4..6];

L0 L1 L2

When each domain isgiven its own distribution,

the compiler cannot reasonabout alignment of indices.

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 10: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

The Block Distribution maps the indices of a domain in a dense fashion across the target Locales according to the boundingBox argument

The Block Distribution

const Dist = new dmap(new Block(boundingBox=[1..4, 1..8]));

var Dom: domain(2) dmapped Dist = [1..4, 1..8];

L0 L1 L2 L3

L4 L5 L6 L7distributed over

Page 11: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

The Cyclic Distribution maps the indices of a domain in a round-robin fashion across the target Locales according to the startIdx argument

The Cyclic Distribution

const Dist = new dmap(new Cyclic(startIdx=(1,1)));

var Dom: domain(2) dmapped Dist = [1..4, 1..8];

L0 L1 L2 L3

L4 L5 L6 L7distributed over

Page 12: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Maps: Distributions and LayoutsDomain Map: The general concept that indicates how to

implement a domain and its arrays

Two flavors:• layout: a domain map targeting a single locale

row-major order, column major order, Morton order hierarchically tiled, linked data structure, etc. compressed sparse row, open hashing w/ quadratic probing

• distribution: a domain map targeting multiple locales block, cyclic, block-cyclic recursive bisection, hashing across locales, graph partitioning …

Page 13: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Map Framework: Layoutsdistribution domain array

Responsibility: How to generate new domains

Responsibility: How to store, iterate over domain indices

Responsibility: How to store, access, iterate over array elements

three descriptors:

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 14: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Map Framework: Distributionsdistribution domain array

Responsibility: How to generate new domains and map indices to locales

Responsibility: How to store, iterate over domain indices

Responsibility: How to store, access, iterate over array elements

global descriptors(one global instance

or replicated per locale)

local descriptors(one instance per locale)

Responsibility: How to store, iterate over local domain indices

Responsibility: How to store, access, iterate over local array elements

Responsibility: How to store asingle locale’sportion of theindex space

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 15: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

Domain Map Framework: Distributions

global descriptors(one global instance

or replicated per locale)

local descriptors(one instance per locale)

distribution domain array

domain descriptor

distribution descriptor

localarray

descriptor

local domain

descriptor

array descriptor

target locale setdistribution argsmap index to locale…

global index infoindex typeindex iterationallocate array…

element typeglobal value iterationrandom access…

store local indiceslocal index iterationadd new indices…

store local valueslocal value iterationlocal random access…

= descriptor state/type= descriptor methods

legend

local distributiondescriptor

Distributions

Data Parallelism

Task Parallelism

Locality Control

Target Machine

Base Language

Page 16: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

myElems =

4 5

myElems =

6 8

myElems =

31

1D Block Distribution Classes

global descriptors

local descriptors

distribution domain arraycode

L0 L1 L2

myChunk =

min(idxType) 3

myChunk =

max(idxType)6

myChunk =

4 5

myElems =

4 5

myElems =

6 8

myElems =

31

const B1 = new dmap( new Block(bbox=[1..8]));

const D: domain(1) dmapped B1 = [1..8];

var A, B: [D] real;

myBlock =

4 5

myBlock =

6 8

myBlock =

31

boundingBox =

1 8

targetLocales =

L0 L1 L2

whole =

1 8

(LocaleSpace = [0..2])

Page 17: Chapel: User-Defined Distributions Brad Chamberlain Cray Inc. CSEP 524 May 20, 2010.

myElems =

4 5

myElems =

6

myElems =

1D Block Distribution Classes

global descriptors

local descriptors

distribution domain arraycode

L0 L1 L2

myChunk =

min(idxType) 3

myChunk =

max(idxType)6

myChunk =

4 5

myElems =

4 5

myElems =

6

myElems =

const B1 = new dmap( new Block(bbox=[1..8]));

const sliceD: domain(1) dmapped B1 = [4..6];

var A2, B2: [sliceD] real;

myBlock =

4 5

myBlock =

6

myBlock =

-10 …

boundingBox =

1 8

targetLocales =

L0 L1 L2

whole =

4 6

(LocaleSpace = [0..2])