Deployment of Data-flow Applications in Many-core ... · Deployment of Data-flow Applications in...

Post on 09-Sep-2018

250 views 0 download

Transcript of Deployment of Data-flow Applications in Many-core ... · Deployment of Data-flow Applications in...

Deployment of Data-flow Applications

in Many-core Architectures

UltrasoundToGo RTD 2013

Stefanos Skalistis and Alena Simalatsar

Rigorous System Design Laboratory (RiSD), EPFL

Task A

Task C

ω(eAC)

Task B

ω(eAB)

Cluster 1 Cluster 2b(eAC)

IA TA0 ω(eAB)

FA

0

b(eAB)

IB

TB

0

0

b(eBA)

Application Partitioning

Application Placement

Mapping, Scheduling, Buffer Allocation

Power Management

Real-Time Scheduling

SMT

Solver

Offline

Online

Image credit: Marcio Castro et al.

Many-core architectures:

+ Efficient for high-parallelizable applications

+ Support for power management

- Complex communication mechanisms

- Safe code-generation is not straightforward

Kalray MPPA-256:

• 256 cores (400 MHz) in 16 clusters with 2MB shared scratchpad memory, on 2D-torus NoC

• Scheduling constraints

• Resource constraints

• Power constraints

Providing real-time guarantees is challenging

Existing methods:

• Based on WCET → largely overestimated

• Do not focus on optimizing performance

Real-time performance and power efficiency:

• Real-time scheduling respecting time constraints

• Operating within power constraints

• Optimal use of resources

Motivation

Multi-criteria optimization decisions:

• Right degree of parallelism?

• Application partitioning and mapping to PE?

• Size of buffers? Task scheduling with real-time constraints?

Deploying on many-core: Bridging safety-critical and best-effort

Hybrid approach:

Offline: Guarantees based on worst-case execution time

Online: Optimizations based on actual execution times

[1] Tendulkar, Pranav, et al. "Many-Core Scheduling of Data Parallel Applications using SMT Solvers." Digital System Design (DSD), 2014 17th Euromicro Conference on. IEEE, 2014.

[2] De Dinechin, Benoît Dupont, et al. "Time-critical computing on a single-chip massively parallel processor." Design, Automation and Test in Europe Conference and Exhibition (DATE). IEEE, 2014.

[3] Ibrahim, Aya, et al. "Assessment of Image Quality vs. Computation Cost for Different Parameterizations of Ultrasound Imaging Pipelines.“ Medical Cyber-Physical Systems (MCPS) Workshop, 2015

Case study: 3D Ultrasound Imaging

• Data-flow application

• High degree of parallelism

• Medical nature requires guarantees

• Portability imposes power limitations

Finding solutions using SMT solvers:

Safe deployment without

undermining performance