Automatic Code Generation for Simulating Non-Newtonian Fluid Flows with ExaStencils

Sebastian Kuckuk, Harald Köstler, FAU Erlangen-Nürnberg, 15.04.2016, SiamPP, Paris, France

Page 1: Automatic Code Generation for Simulating Non-Newtonian ...

Automatic Code Generation for Simulating Non-Newtonian Fluid Flows with ExaStencils

Sebastian Kuckuk, Harald Köstler

FAU Erlangen-Nürnberg

15.04.2016

SiamPP, Paris, France

Page 2: Automatic Code Generation for Simulating Non-Newtonian ...

Outline

Page 3: Automatic Code Generation for Simulating Non-Newtonian ...

Outline

● Scope and Motivation

● Project ExaStencils

● Application – CFD

● Results

● Conclusion and Outlook

Page 4: Automatic Code Generation for Simulating Non-Newtonian ...

Scope and Motivation

Page 5: Automatic Code Generation for Simulating Non-Newtonian ...

Solving PDEs

● Goal: Solve a partial differential equation approximately by solving a discretized form

[Figures: Optical Flow (LIC Visualization); Particles in Flow]

$\Delta u = f$ in $\Omega$, $u = g$ on $\partial\Omega$

$A u_h = f_h$
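
As a rough illustration of how a PDE turns into a linear system of the form A u_h = f_h, the sketch below discretizes the 1D analogue u'' = f on (0, 1) with second-order finite differences and Dirichlet values. The 1D setting, the function names and the tridiagonal solver are assumptions made for this example; the slides do not prescribe them.

```python
def assemble_poisson_1d(n, f, g_left, g_right):
    """Assemble the tridiagonal system A u_h = f_h for u'' = f on (0, 1).

    n interior points, Dirichlet values g_left and g_right (illustrative setup).
    """
    h = 1.0 / (n + 1)
    lower = [1.0 / h**2] * n    # coefficient of u[i-1] (lower[0] is unused)
    diag = [-2.0 / h**2] * n    # coefficient of u[i]
    upper = [1.0 / h**2] * n    # coefficient of u[i+1] (upper[-1] is unused)
    rhs = [f((i + 1) * h) for i in range(n)]
    # Known boundary values move to the right-hand side.
    rhs[0] -= g_left / h**2
    rhs[-1] -= g_right / h**2
    return lower, diag, upper, rhs


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm: forward elimination followed by back substitution."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u


def max_error(n=15):
    """Compare against the exact solution u(x) = x^2 of u'' = 2, u(0)=0, u(1)=1."""
    lower, diag, upper, rhs = assemble_poisson_1d(n, lambda x: 2.0, 0.0, 1.0)
    u = solve_tridiagonal(lower, diag, upper, rhs)
    h = 1.0 / (n + 1)
    return max(abs(u[i] - ((i + 1) * h) ** 2) for i in range(n))
```

The three-point stencil is exact for quadratic solutions, so `max_error()` should only show rounding error. A direct solver like this does not scale to the 3D problems targeted in the talk, which is where multigrid comes in.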

Page 6: Automatic Code Generation for Simulating Non-Newtonian ...

Restriction – Computational Grids

● Regular

● Can be non-uniform

● Can be patch-based

Page 7: Automatic Code Generation for Simulating Non-Newtonian ...

Restriction – Algorithm

● (Geometric) Multigrid

Smoothing of High Frequency Errors

Coarsened Representation of Low Frequency Errors
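
The two ingredients named here can be made concrete with a small experiment. The sketch below applies weighted Jacobi sweeps to pure sine error modes of the 1D model problem -u'' = f and reports how much of each mode survives. The model problem, the damping factor 2/3 and all names are assumptions chosen for illustration.

```python
import math


def jacobi_sweep(e, omega=2.0 / 3.0):
    """One weighted Jacobi sweep applied to the error of the 1D model problem.

    Values outside the grid are zero (homogeneous Dirichlet boundary).
    """
    n = len(e)
    new = [0.0] * n
    for i in range(n):
        left = e[i - 1] if i > 0 else 0.0
        right = e[i + 1] if i < n - 1 else 0.0
        new[i] = (1.0 - omega) * e[i] + omega * 0.5 * (left + right)
    return new


def remaining_error(k, n=63, sweeps=5):
    """Fraction of the k-th sine error mode that survives the given sweeps."""
    e = [math.sin(k * math.pi * (i + 1) / (n + 1)) for i in range(n)]
    start = max(abs(v) for v in e)
    for _ in range(sweeps):
        e = jacobi_sweep(e)
    return max(abs(v) for v in e) / start
```

With these settings, `remaining_error(48)` is tiny while `remaining_error(1)` stays close to one: the smoother removes oscillatory error quickly but barely touches smooth error, which is why the smooth part is handed to a coarser grid.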

Page 8: Automatic Code Generation for Simulating Non-Newtonian ...

Multigrid V-Cycle

[Figure: V-cycle with the stages Smoothing, Restriction, Coarse Grid Solving, Prolongation & Correction]
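
The stages of the V-cycle can be sketched in a few lines for the 1D model problem -u'' = f with homogeneous Dirichlet boundaries. This is a minimal sketch under assumptions not taken from the slides: weighted Jacobi smoothing, full-weighting restriction, linear interpolation, two pre- and post-smoothing steps, and grids with 2^L - 1 interior points.

```python
def residual(u, f, h):
    """r = f - A u for the three-point discretization of -u''."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r


def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi: u += omega * D^-1 * r with D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [u[i] + omega * 0.5 * h * h * r[i] for i in range(len(u))]
    return u


def restrict(r):
    """Full weighting; coarse point j sits at fine index 2j + 1."""
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range((len(r) - 1) // 2)]


def prolongate(e, n_fine):
    """Linear interpolation of the coarse correction to the fine grid."""
    fine = [0.0] * n_fine
    for j, v in enumerate(e):
        fine[2 * j + 1] += v
        fine[2 * j] += 0.5 * v
        fine[2 * j + 2] += 0.5 * v
    return fine


def v_cycle(u, f, h):
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]                    # coarse grid solving
    u = smooth(u, f, h, 2)                             # pre-smoothing
    r_coarse = restrict(residual(u, f, h))             # restriction
    e_coarse = v_cycle([0.0] * len(r_coarse), r_coarse, 2.0 * h)
    e = prolongate(e_coarse, n)                        # prolongation
    u = [u[i] + e[i] for i in range(n)]                # correction
    return smooth(u, f, h, 2)                          # post-smoothing


def solve(n=63, cycles=10):
    """Run a fixed number of V-cycles for f = 1, starting from zero."""
    h = 1.0 / (n + 1)
    f = [1.0] * n
    u = [0.0] * n
    for _ in range(cycles):
        u = v_cycle(u, f, h)
    return u, max(abs(r) for r in residual(u, f, h))
```

For f = 1 the exact solution is u(x) = x(1 - x)/2, which the discretization reproduces exactly, so the result of `solve()` can be checked directly against it.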

Page 9: Automatic Code Generation for Simulating Non-Newtonian ...

Main Focus: HPC

● Highly optimized and highly scalable geometric multigrid solvers

● ‘Traditional’ supercomputers like JUQUEEN and SuperMUC

● Alternative architectures like Piz Daint and Mont Blanc

Page 10: Automatic Code Generation for Simulating Non-Newtonian ...

A lot of variance…

● Composition of optimal solvers requires choosing

● MG components: Cycle type? Which smoother(s)? Which coarse grid solver? Which inter-grid operators?

● MG parameters: Relaxation? Number of smoothing steps? Other component-dependent parameters?

● Applied optimizations: Vectorization? (Software) prefetching? Tiling? Temporal blocking? Loop transformations?

● …

Page 11: Automatic Code Generation for Simulating Non-Newtonian ...

A lot of variance…

● Composition of optimal solvers requires choosing

● MG components: …

● MG parameters: …

● Applied Optimizations: …

● But the choice relies heavily on various influences

● Hardware: CPU, GPU or both? Number of nodes, sockets and cores? Cache characteristics? Network characteristics?

● Software: MPI, OpenMP or both? CUDA or OpenCL? Which versions?

● Problem description: Which PDE? Which boundary conditions?

● Discretization: Finite Differences, Finite Elements or Finite Volumes?

● Domain: Uniform or block-structured? How to partition?

● …
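
To get a feeling for how quickly these choices multiply, the sketch below enumerates every combination of a handful of options. The option names and values are made up for this example and cover only a small part of the questions listed above.

```python
from itertools import product

# Hypothetical option lists; a real configuration space is far larger.
CHOICES = {
    "cycle": ["V", "W"],
    "smoother": ["Jacobi", "Gauss-Seidel", "red-black Gauss-Seidel"],
    "smoothing_steps": [1, 2, 3, 4],
    "vectorization": [False, True],
    "parallelization": ["MPI", "OpenMP", "MPI+OpenMP"],
}


def enumerate_variants(choices):
    """Return one dict per combination of the given options."""
    keys = list(choices)
    return [dict(zip(keys, combo))
            for combo in product(*(choices[key] for key in keys))]


variants = enumerate_variants(CHOICES)
```

Five small option lists already give 2 * 3 * 4 * 2 * 3 = 144 solver variants, which suggests why tuning such solvers by hand does not scale.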

Page 12: Automatic Code Generation for Simulating Non-Newtonian ...

What about SW frameworks?

● Frequent questions

● Will it run optimally on my chosen hardware?

● Will it support new architectures?

● Do I have to invest a lot of time to understand the framework?

● How quickly and easily can I express my ideas and models?

● What happens when I change the target hardware?

● This poses serious challenges for developers of frameworks and applications alike

● One possible alternative is given by domain-specific languages (DSLs) in conjunction with code generation techniques

Page 13: Automatic Code Generation for Simulating Non-Newtonian ...

Project ExaStencils

Page 14: Automatic Code Generation for Simulating Non-Newtonian ...

• Sebastian Kuckuk

• Harald Köstler

• Ulrich Rüde

• Christian Schmitt

• Frank Hannig

• Jürgen Teich

• Hannah Rittich

• Lisa Claus

• Matthias Bolten

• Alexander Grebhahn

• Sven Apel

• Stefan Kronawitter

• Armin Größlinger

• Christian Lengauer

Project ExaStencils

Page 15: Automatic Code Generation for Simulating Non-Newtonian ...

ExaSlang

● We propose a multilayered, external DSL

Page 16: Automatic Code Generation for Simulating Non-Newtonian ...

Workflow

● Transition between different layers should be automatic

● Review and extension by users is possible

● Highly modular transformations implement functionality

Page 17: Automatic Code Generation for Simulating Non-Newtonian ...

Features

● With our transformation-based approach we can

● Add MPI/OMP parallelization on demand

● Add GPU support on demand

● Determine coarse performance estimation models

● Perform automatic vectorization, loop transformations, address precalculation, common subexpression elimination, etc.

● Efficiently generate many variants forming the basis for auto-tuning and advanced performance prediction models

● Vastly improve user experience
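
The idea of small, composable transformations can be illustrated on a toy expression tree. The sketch below is not the ExaStencils implementation; the tuple-based tree, the two passes and all names are assumptions for illustration, and only the operators "+" and "*" are handled.

```python
def transform(node, rule):
    """Apply a rewrite rule bottom-up to every node of an expression tree.

    Inner nodes are tuples (op, lhs, rhs); leaves are floats or variable names.
    """
    if isinstance(node, tuple):
        op, lhs, rhs = node
        node = (op, transform(lhs, rule), transform(rhs, rule))
    return rule(node)


def fold_constants(node):
    """Replace an operation on two constants by its value."""
    if isinstance(node, tuple):
        op, lhs, rhs = node
        if isinstance(lhs, float) and isinstance(rhs, float):
            return lhs + rhs if op == "+" else lhs * rhs
    return node


def drop_neutral_elements(node):
    """Simplify x + 0 and x * 1 (in either operand order) to x."""
    if isinstance(node, tuple):
        op, lhs, rhs = node
        neutral = 0.0 if op == "+" else 1.0
        if lhs == neutral:
            return rhs
        if rhs == neutral:
            return lhs
    return node


def run_pipeline(expr, passes):
    """Apply each pass to the whole tree, one after the other."""
    for rule in passes:
        expr = transform(expr, rule)
    return expr


# (0.5 + 0.5) * u + 0.0 * 1.0
expr = ("+", ("*", ("+", 0.5, 0.5), "u"), ("*", 0.0, 1.0))
result = run_pipeline(expr, [fold_constants, drop_neutral_elements])
```

Each pass knows nothing about the others, so passes can be added, removed or reordered per target. That is the property the transformation-based approach relies on when functionality such as parallelization is added on demand.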

Page 18: Automatic Code Generation for Simulating Non-Newtonian ...

Application – CFD

Page 19: Automatic Code Generation for Simulating Non-Newtonian ...

Overall Goal

● Simulation of non-isothermal/non-Newtonian fluid flows

● Suspensions of particles or macromolecules

● E.g. pastes, gels, foams, drilling fluids, food products, blood, etc.

● Importance in mining, chemical and food industry as well as medical applications

https://www.youtube.com/watch?v=G1Op_1yG6lQ

Page 20: Automatic Code Generation for Simulating Non-Newtonian ...

The SIMPLE Algorithm

● Semi-Implicit Method for Pressure Linked Equations

● Concept:

Solve for velocity components

Solve for pressure correction

Apply pressure correction

Update properties

Page 21: Automatic Code Generation for Simulating Non-Newtonian ...

The SIMPLE Algorithm

● Temperature can be added as a separate step

Solve for velocity components

Solve for pressure correction

Apply pressure correction

Solve for temperature

Update properties
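
The order of the steps within one SIMPLE iteration can be written down as a small control-flow sketch. Every step below is a placeholder that only records that it ran; the step names, and the reading of the three velocity boxes as one solve per component (u, v, w), are assumptions made for this example.

```python
# Step order as listed on the slide; names are hypothetical.
STEP_ORDER = [
    "solve_velocity_u",
    "solve_velocity_v",
    "solve_velocity_w",
    "solve_pressure_correction",
    "apply_pressure_correction",
    "solve_temperature",
    "update_properties",
]


def make_placeholder(name):
    """Create a stand-in step that only logs its own name."""
    def step(state):
        state["log"].append(name)
    return step


def simple_iteration(state, steps):
    """Run the steps of one SIMPLE iteration in the order shown above."""
    for name in STEP_ORDER:
        steps[name](state)
    return state


steps = {name: make_placeholder(name) for name in STEP_ORDER}
state = simple_iteration({"log": []}, steps)
```

In a real solver each placeholder would assemble and solve the linear system for its quantity, and the iteration would repeat until the convergence criterion of the time step is met.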

Page 22: Automatic Code Generation for Simulating Non-Newtonian ...

Staggered grid

● Finite volume discretization on a staggered grid

values associated with the x-staggered grid, e.g. U

values associated with the y-staggered grid, e.g. V

values associated with the cell centers, e.g. p and θ

control volumes associated with cell-centered values

control volumes associated with x-staggered values

control volumes associated with y-staggered values
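
One practical consequence of the staggered layout is that the fields have different sizes. The 2D sketch below is a minimal illustration with assumed names and a made-up velocity field: for nx by ny cells, U lives on (nx + 1) by ny x-faces, V on nx by (ny + 1) y-faces, and p on the nx by ny cell centers.

```python
def make_staggered_fields(nx, ny, dx, dy):
    """Create U, V and p for the test field u = x, v = -y (divergence-free)."""
    u = [[i * dx for _ in range(ny)] for i in range(nx + 1)]     # x-staggered
    v = [[-j * dy for j in range(ny + 1)] for _ in range(nx)]    # y-staggered
    p = [[0.0] * ny for _ in range(nx)]                          # cell-centered
    return u, v, p


def divergence(u, v, dx, dy):
    """Cell-centered divergence from the face values around each cell."""
    nx, ny = len(v), len(u[0])
    return [[(u[i + 1][j] - u[i][j]) / dx + (v[i][j + 1] - v[i][j]) / dy
             for j in range(ny)]
            for i in range(nx)]


u, v, p = make_staggered_fields(4, 3, 0.25, 0.5)
div = divergence(u, v, 0.25, 0.5)
```

Because the velocities sit exactly on the faces of the cell-centered control volumes, the divergence needs no interpolation, which is one common motivation for this arrangement.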

Page 23: Automatic Code Generation for Simulating Non-Newtonian ...

Porting code

● Straightforward for simple kernels

    subroutine advance_fields ()
      ! copy the current density field into rho0
      !$omp parallel do &
      !$omp private(i,j,k) &
      !$omp firstprivate(l1,m1,n1) &
      !$omp shared(rho,rho0) &
      !$omp schedule(static) default(none)
      do k=1,n1
        do j=1,m1
          do i=1,l1
            rho0(i,j,k)=rho(i,j,k)
          end do
        end do
      end do
      !$omp end parallel do
    end subroutine advance_fields

    Function AdvanceFields@finest () : Unit {
      loop over rho@current {
        rho[next]@current = rho[active]@current
      }
      advance rho@current
    }
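
The DSL version replaces the explicit copy into a second array by slots of one field. One way to picture this is a field that holds several copies of its data plus an index marking the active one. The class below is a sketch of that reading with hypothetical names; it is not generated code and not the ExaStencils data structure.

```python
class SlottedField:
    """A field with several slots, one of which is active at any time."""

    def __init__(self, values, num_slots=2):
        self.slots = [list(values) for _ in range(num_slots)]
        self.active_index = 0

    @property
    def active(self):
        return self.slots[self.active_index]

    @property
    def next(self):
        return self.slots[(self.active_index + 1) % len(self.slots)]

    def advance(self):
        """Make the next slot the active one; no data is copied."""
        self.active_index = (self.active_index + 1) % len(self.slots)


def advance_fields(rho):
    """Mirror of the DSL function: copy active into next, then advance."""
    target = rho.next
    for i, value in enumerate(rho.active):
        target[i] = value
    rho.advance()


rho = SlottedField([1.0, 2.0, 3.0])
rho.active[0] = 5.0
advance_fields(rho)
```

After the call the active slot holds the copied values, and what used to be the active slot plays the role of `rho0` in the Fortran version.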

Page 24: Automatic Code Generation for Simulating Non-Newtonian ...

Porting code

● Rather than specifying how the calculations are done…

    ! if not at the boundary
    ! convective part: v times interpolated density on both halves of the face
    fl = xcvi(i) * v(i,jp,k) * (fy(jp)*rho(i,jp,k) + fym(jp)*rho(i,j,k))
    flm = xcvip(im) * v(im,jp,k) * (fy(jp)*rho(im,jp,k) + fym(jp)*rho(im,j,k))
    flownu = zcv(k) * (fl+flm)
    ! diffusive part: neighboring viscosities combined harmonically
    gm = xcvi(i) * vis(i,j,k) * vis(i,jp,k) / (ycv(j)*vis(i,jp,k) + ycv(jp)*vis(i,j,k) + 1.e-30)
    gmm = xcvip(im) * vis(im,j,k) * vis(im,jp,k) / (ycv(j)*vis(im,jp,k) + &
          ycv(jp)*vis(im,j,k) + 1.e-30)
    diff = 2. * zcv(k) * (gm+gmm)
    call diflow(flownu,diff,acof)
    adc = acof + max(0.,flownu)
    anu(i,j,k) = adc - flownu
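
The expressions for `gm`, `gmm` and `diff` have the shape of a diffusion conductance built from a harmonic mean of the two neighboring viscosities. The sketch below checks that reading on plain numbers. Interpreting `ycv` as control-volume widths and `xcvi`/`zcv` as face extents is an assumption, and the function names are hypothetical.

```python
def face_conductance(mu_a, mu_b, width_a, width_b, face_area, eps=1.0e-30):
    """Same shape as 2 * zcv * gm in the excerpt, written for one face part."""
    return 2.0 * face_area * mu_a * mu_b / (width_a * mu_b + width_b * mu_a + eps)


def harmonic_face_viscosity(mu_a, mu_b, width_a, width_b):
    """Width-weighted harmonic mean of the two cell viscosities."""
    return (width_a + width_b) / (width_a / mu_a + width_b / mu_b)


def conductance_from_mean(mu_a, mu_b, width_a, width_b, face_area):
    """Face viscosity times face area divided by the cell-center distance."""
    distance = 0.5 * (width_a + width_b)
    return harmonic_face_viscosity(mu_a, mu_b, width_a, width_b) * face_area / distance
```

Both routes give the same value, for example 64/9 for viscosities 1 and 4, widths 0.5 and 0.25 and a face area of 2. The tiny `eps` term only guards against division by zero when both viscosities vanish.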

Page 25: Automatic Code Generation for Simulating Non-Newtonian ...

Porting code

● … we’d like to specify what we want to calculate

    Val flownu : Real = integrateOverXStaggeredNorthFace (
        v[active]@current * rho[active]@current )

    Val diffnu : Real = integrateOverXStaggeredNorthFace (
        evalAtXStaggeredNorthFace ( vis@current, "harmonicMean" ) )
        / vf_stagControlVolumeWidth_y@current@[0, 1, 0]

    AuStencil@current:[0, 1, 0] = -1.0 * ( calc_diflow ( flownu, diffnu )
        - min ( 0.0, flownu ) )
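
The last line of the DSL version and the last two lines of the Fortran excerpt (`adc = acof + max(0.,flownu)`, `anu = adc - flownu`) describe the same quantity up to sign, because max(0, F) - F = -min(0, F). The sketch below checks this numerically. It assumes that `calc_diflow` returns the value that `diflow` stores in `acof`; the Python function names are hypothetical.

```python
def fortran_coefficient(acof, flownu):
    """anu as computed in the Fortran excerpt."""
    adc = acof + max(0.0, flownu)
    return adc - flownu


def dsl_stencil_entry(acof, flownu):
    """Stencil entry as written in the DSL excerpt."""
    return -1.0 * (acof - min(0.0, flownu))


def entries_match(acof, flownu, tol=1.0e-12):
    """True if the stencil entry equals the negated Fortran coefficient."""
    return abs(fortran_coefficient(acof, flownu) + dsl_stencil_entry(acof, flownu)) < tol
```

The sign flip would be consistent with the neighbor coefficient being stored as an off-diagonal stencil entry on the left-hand side of the equation, though the slides do not state this.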

Page 26: Automatic Code Generation for Simulating Non-Newtonian ...

Results

Page 27: Automatic Code Generation for Simulating Non-Newtonian ...

Results

● Natural convection problem

● With and without non-Newtonian model

● Convergence criteria for

● solving the single components: At least one V-cycle and

● SIMPLE: Convergence for all components is achieved after updating the LSE but before starting the solve routine

● 10,000 timesteps

● One socket on Emmy (Intel Xeon E5-2660v2) with 20 OpenMP threads

$r \le \alpha \, (1 + \beta \, b)$

$r_{i+1} - r_i < \alpha$
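
The two inequalities on this slide, r ≤ α(1 + β b) and r_{i+1} - r_i < α, can be turned into simple checks. The sketch below is one possible reading and not the authors' code: it assumes that r and b are norms of the residual and the right-hand side, that the second inequality refers to the magnitude of the change between two iterations, and it leaves open how the two checks are combined.

```python
def residual_small_enough(res_norm, rhs_norm, alpha, beta):
    """First inequality: r <= alpha * (1 + beta * b), with r and b read as norms."""
    return res_norm <= alpha * (1.0 + beta * rhs_norm)


def residual_change_small(res_norm_next, res_norm_prev, alpha):
    """Second inequality, read as |r_(i+1) - r_i| < alpha (an assumption)."""
    return abs(res_norm_next - res_norm_prev) < alpha
```

With α = 1e-6 and β = 1, a residual norm of 1.5e-6 passes the first check for a right-hand side of norm 1, since the threshold is then 2e-6.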

Page 28: Automatic Code Generation for Simulating Non-Newtonian ...

Results

[Figure: Temperature distribution along slice (z at 50% of the box depth) for the Newtonian and non-Newtonian case]

Page 29: Automatic Code Generation for Simulating Non-Newtonian ...

Results

[Figure: Avg. execution time per timestep [ms] on a logarithmic axis (0.01 to 1000) over problem size (16, 32, 64, 128) for the Newtonian and non-Newtonian case; series: Update quantities, Compile LSEs, Solve, Total]

Page 30: Automatic Code Generation for Simulating Non-Newtonian ...

Conclusion and Outlook

Page 31: Automatic Code Generation for Simulating Non-Newtonian ...

Conclusion

● ExaStencils approach

● Multi-layered, external DSL

● Transformation-based code generation framework

● Automatic optimizations and tuning

● CFD application

Page 32: Automatic Code Generation for Simulating Non-Newtonian ...

Outlook

● Enhanced performance optimizations

● Comparison with analytical models

● Study of parallel performance characteristics

● DSL extensions for relevant concepts

● More complex numeric components

● (Multi-) GPU support

Page 33: Automatic Code Generation for Simulating Non-Newtonian ...

Thank you for your attention!

Questions?