Holistic, Distributed Stream Processing in IoT Environments 23. 03. ‘18 UK Systems Research Workshop Peter Michalák CDT Cloud Computing for Big Data Prof. Paul Watson School of Computing Prof. Mike Trenell Medical School Dr. Sarah Heaps School of Maths and Stats

Page 1: Holistic, Distributed Stream Processing in IoT Environments (sysws.org.uk/workshop/2018/54-michalak-streams.pdf)

Holistic, Distributed Stream Processing in IoT Environments

23. 03. ‘18 UK Systems Research Workshop

Peter Michalák, CDT Cloud Computing for Big Data

Prof. Paul Watson, School of Computing

Prof. Mike Trenell, Medical School

Dr. Sarah Heaps, School of Maths and Stats

Page 2

Stream Processing in IoT

[Diagram: Sensors → Field Gateways → Message Broker → Stream Processing Engine (Clouds); flow labelled Telemetry]

Page 3

Holistic Stream Processing in IoT

[Diagram: Sensors → Field Gateways → Message Broker → Stream Processing Engine (Clouds), with DB; flows labelled Telemetry, Command and Control, Notification]

Page 4

Healthcare use case: Behavioural Prompts & Feedback

[Diagram labels: Behavioural Prompt, Feedback]

Page 5

Challenges

• Energy
  • Pebble Watch battery life ~7 days
  • Streaming raw accelerometer data reduces battery life to ~18 hours
• Hand-crafting bespoke solutions
  • Definition of data stream processing
• Programming for Heterogeneous Platforms
  • Multitude of devices, APIs and programming languages

[Diagram labels, languages per platform: C / JavaScript; Objective C / Swift; Java / Kotlin ..; ESPer/Spark/Storm ("sky is the limit")]

Page 6

PATH2iot system: Automating Computational Placement

[System diagram: Input (Non-functional Requirements, Description of the Computation); PATHfinder (Logical Plan, Physical Plan, Execution Plan; Resource Catalogue, Cost Models, Platform Specific Compilation); PATHdeployer; PATHmonitor (Re-optimisation on Infrastructure change)]

• PATHfinder
  • Automated Computational Decisions
  • Non-functional requirements
  • Device-specific compilation
• PATHdeployer
  • A deployment tool delivers configuration to enable computation
• PATHmonitor
  • Future work for PATH2iot

Page 7

PATHfinder: High-Level Declarative Description of Computation

[Pipeline diagram: Input (Non-functional Requirements, Description of the Computation), Resource Catalogue]

• Event Processing Language (EPL) from Esper
  • High Level Declarative Description of Computation
  • SQL based with extended grammar to support CEP operations
  • Decomposable into directed graph of stream operators

1. INSERT INTO AccelEvent SELECT getAccelData(25, 60) FROM AccelEventSource

2. INSERT INTO EdEvent SELECT Math.pow(x*x+y*y+z*z, 0.5) AS ed, ts FROM AccelEvent WHERE vibe=0

3. INSERT INTO StepEvent SELECT ed1('ts') as ts FROM EdEvent MATCH_RECOGNIZE (MEASURES A AS ed1, B AS ed2 PATTERN (A B) DEFINE A AS (A.ed > THR), B AS (B.ed <= THR))

4. INSERT INTO StepCount SELECT count(*) as steps FROM StepEvent.win:time_batch(120 sec)

5. SELECT persistResult(steps, "time_series", "step_sum") FROM StepCount

[1] N. Zhao, “Full-featured pedometer design realized with 3-axis digital accelerometer,” Analog Dialogue, vol. 44, no. 06, 2010.

Step count algorithm[1] in EPL
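The five queries can be read as a small pipeline: filter, compute the Euclidean distance, detect a threshold crossing, then count per window. The following is a minimal Python sketch of that reading, not the PATH2iot or Esper implementation. The function names, the sample tuple layout and the THR value are assumptions made for illustration (the slides do not state THR), and the 120 s batch is approximated by bucketing on event timestamps rather than on engine time.

```python
import math
from collections import Counter

THR = 1.2        # hypothetical threshold; the slides leave THR unspecified
WINDOW_S = 120   # batch length taken from query 4


def euclidean_distance(x, y, z):
    """Query 2: Math.pow(x*x + y*y + z*z, 0.5)."""
    return math.sqrt(x * x + y * y + z * z)


def count_steps(samples, thr=THR, window_s=WINDOW_S):
    """Count steps per window.

    samples: iterable of (ts_seconds, x, y, z, vibe) tuples in time order
             (an assumed layout for this sketch).
    Returns a dict mapping window index -> number of steps.
    """
    steps = Counter()
    prev_ed = None
    prev_ts = None
    for ts, x, y, z, vibe in samples:
        if vibe != 0:                 # query 2: WHERE vibe=0
            continue
        ed = euclidean_distance(x, y, z)
        # Query 3: two adjacent events A, B with A.ed > THR and B.ed <= THR.
        if prev_ed is not None and prev_ed > thr and ed <= thr:
            # The step carries A's timestamp; bucket it into a window.
            steps[int(prev_ts // window_s)] += 1
        prev_ed, prev_ts = ed, ts
    return dict(steps)
```

Feeding the function a short list of samples returns one count per 120 s bucket, which corresponds to the `steps` value that query 5 would persist.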

Page 8

PATHfinder: High-Level Declarative Description of Computation

[Operator graph for the step count queries, grouped by query:
Q1: Ω getAccelData(25, 60)
Q2: σ vibe=0; Π x*x, Π y*y, Π z*z; Π +; Π +; Ω Math.pow(ed, 0.5)
Q3: Ω match_recognize (A.ed > THR) AND (B.ed <= THR)
Q4: ω (120); Ω COUNT(∗)
Q5: Ω persistResult(“steps”, “step_sum”, “time_series”)]

Step count algorithm[1] in EPL
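One way to picture the decomposition into a directed graph of stream operators is as a dependency map that can be sorted into an execution order. The sketch below uses Python's standard `graphlib` module; the operator names and the edges are one reading of the slide's graph, not an export from PATHfinder.

```python
from graphlib import TopologicalSorter

# Each operator maps to the set of upstream operators it consumes from.
# Names and edges are an assumed reading of the slide, for illustration only.
UPSTREAM = {
    "getAccelData": set(),                      # Q1 source
    "select vibe=0": {"getAccelData"},          # Q2 filter
    "x*x": {"select vibe=0"},                   # Q2 projections
    "y*y": {"select vibe=0"},
    "z*z": {"select vibe=0"},
    "add1": {"x*x", "y*y"},                     # first sum
    "add2": {"add1", "z*z"},                    # second sum
    "pow": {"add2"},                            # Math.pow(ed, 0.5)
    "match_recognize": {"pow"},                 # Q3 pattern match
    "window(120)": {"match_recognize"},         # Q4 time batch
    "count": {"window(120)"},                   # Q4 aggregate
    "persistResult": {"count"},                 # Q5 sink
}


def execution_order(upstream):
    """Return the operators so that every operator follows its inputs."""
    return list(TopologicalSorter(upstream).static_order())


ORDER = execution_order(UPSTREAM)
```

Holding the graph in this form makes it straightforward to ask which operators could be cut off and placed on a different device, since any prefix of a valid order is a candidate split point.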

Page 9

PATHfinder: Optimisation

[Pipeline diagram: Input (Non-functional Requirements, Description of the Computation), Resource Catalogue, Logical Plan, Physical Plan]

• Logical Optimisation[2]
  • Pushing Selects & Windows closer to the data source
• Physical Optimisation
  • Enumerating Physical Plans
  • Placement of the physical plans on available infrastructure
  • 225 Physical Plans
• Physical Plan Pruning
  • Removing non-deployable plans based on infrastructure capabilities
  • 18 Physical Plans

[Figure: grid of physical plans PP0, PP1, … PP225, some marked with crosses]

[2] Hirzel, Martin, Robert Soulé, Scott Schneider, Buğra Gedik, and Robert Grimm. ‘A Catalog of Stream Processing Optimizations’. ACM Computing Surveys 46
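The enumerate-then-prune idea can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: the simplified operator chain, the three tiers, and the capability sets standing in for the Resource Catalogue. The sketch keeps the operator order fixed, so it does not model logical optimisation and is not expected to reproduce the 225 and 18 plan counts from the slide.

```python
from itertools import product

TIERS = ("watch", "phone", "cloud")

# Simplified operator chain, in data-flow order (hypothetical).
OPERATORS = ("getAccelData", "select", "ed", "match_recognize",
             "window", "count", "persistResult")

# Hypothetical resource catalogue: which operators each tier can run.
CAPABILITIES = {
    "watch": {"getAccelData", "select", "ed", "match_recognize", "window"},
    "phone": {"select", "ed"},
    "cloud": {"select", "ed", "match_recognize", "window",
              "count", "persistResult"},
}


def enumerate_plans(n_ops=len(OPERATORS), n_tiers=len(TIERS)):
    """Enumerate placements where data only flows watch -> phone -> cloud.

    A placement is a tuple of tier indices, one per operator.
    """
    plans = []
    for placement in product(range(n_tiers), repeat=n_ops):
        if all(a <= b for a, b in zip(placement, placement[1:])):
            plans.append(placement)
    return plans


def is_deployable(placement):
    """Prune step: every operator must be supported by its assigned tier."""
    return all(op in CAPABILITIES[TIERS[tier]]
               for op, tier in zip(OPERATORS, placement))


ALL_PLANS = enumerate_plans()
DEPLOYABLE = [plan for plan in ALL_PLANS if is_deployable(plan)]
```

Each surviving placement would then be handed to a cost model, so pruning early keeps the number of plans that need costing small.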

Page 10

PATHfinder: Energy Cost Model

[Pipeline diagram: Input (Non-functional Requirements, Description of the Computation), Resource Catalogue, Logical Plan, Physical Plan, Cost Models]

• Cost Models
  • Energy cost model[3]
  • Power coefficients with confidence intervals
  • Estimated Battery Life

[3] M. Forshaw, N. Thomas, and A. S. McGough, “The case for energy-aware simulation and modelling of Internet of Things (IoT),” ACM ENERGY-SIM, 2016.

Page 13

PATHfinder: Energy Cost Model

[Figure: Exec Plan 040 with 120 s cycle; x-axis Time (s), y-axis Average Power Consumption (mW); annotations numbered 1 to 5]

Page 14

PATHdeployer: Architecture Overview

[Architecture diagram, steps numbered 1 to 5. Input: Infra, Req, EPL. Components: PATHfinder, PATHdeployer, Neo4j. Cloud Deployment: ZooKeeper, ActiveMQ, D2ESPer (multiple instances), InfluxDB. IoT Deployment: REST API (Flask) with /path2iot/pebble/set_config and /path2iot/pebble/get_config, MariaDB, VersatilePebble (C), VersatilePebble (JavaScript). One edge is labelled data stream.]

• Cloud Deployment
  • ZooKeeper: configuration delivery
  • ActiveMQ: event propagation
  • D2ESPer: in-house built dynamic ESPer based stream processing tool
  • InfluxDB: time series database
• IoT Deployment
  • Flask REST API: configuration delivery for IoT devices
  • MariaDB: storage endpoint
  • IoT agents – iPhone, Pebble Watch

Page 15

Results

Plan   Operators (watch, phone, cloud)    Energy Impact (mJ)
pp00   Ω1 σ1 Ω2 Ω3 sxfer Ω4 ω1 Ω6 Ω5      27.08
pp01   Ω1 σ1 Ω2 sxfer Ω3 Ω4 ω1 Ω6 Ω5      27.05
pp02   Ω1 σ1 sxfer Ω2 Ω3 Ω4 ω1 Ω6 Ω5      26.71
pp03   Ω1 sxfer σ1 Ω2 Ω3 Ω4 ω1 Ω6 Ω5      26.62
pp04   Ω1 σ1 Ω2 Ω3 ω1 sxfer Ω4 Ω6 Ω5      5.91
pp05   Ω1 σ1 Ω2 Ω3 sxfer ω1 Ω4 Ω6 Ω5      27.08
pp06   Ω1 σ1 Ω2 sxfer Ω3 ω1 Ω4 Ω6 Ω5      27.05
pp07   Ω1 σ1 sxfer Ω2 Ω3 ω1 Ω4 Ω6 Ω5      26.71
pp08   Ω1 sxfer σ1 Ω2 Ω3 ω1 Ω4 Ω6 Ω5      26.62
pp09   Ω1 σ1 Ω2 ω1 sxfer Ω3 Ω4 Ω6 Ω5      5.87
pp10   Ω1 σ1 Ω2 sxfer ω1 Ω3 Ω4 Ω6 Ω5      27.05
pp11   Ω1 σ1 sxfer Ω2 ω1 Ω3 Ω4 Ω6 Ω5      26.71
pp12   Ω1 sxfer σ1 Ω2 ω1 Ω3 Ω4 Ω6 Ω5      26.62
pp13   Ω1 σ1 ω1 sxfer Ω2 Ω3 Ω4 Ω6 Ω5      10.6
pp14   Ω1 σ1 sxfer ω1 Ω2 Ω3 Ω4 Ω6 Ω5      26.71
pp15   Ω1 sxfer σ1 ω1 Ω2 Ω3 Ω4 Ω6 Ω5      26.62
pp16   Ω1 ω1 sxfer σ1 Ω2 Ω3 Ω4 Ω6 Ω5      10.51
pp17   Ω1 sxfer ω1 σ1 Ω2 Ω3 Ω4 Ω6 Ω5      26.62

[Bar chart: Physical Plan − Energy Impact overview; x-axis Energy Impact (mJ), y-axis Physical Plan; annotations: baseline, best plan]

Page 16

Results

[Bar chart: Physical Plan − Energy Impact overview; x-axis Physical Plan (pp00 to pp17), y-axis Energy Impact (mJ)]

Plan   Operators (watch, phone, cloud)    EI (mJ)   95 % conf int    Power (mW)   Bat Life (h)
pp03   Ω1 sxfer σ1 Ω2 Ω3 Ω4 ω1 Ω6 Ω5      26.62     26.48 - 26.76    26.69        18.0-18.2
pp09   Ω1 σ1 Ω2 ω1 sxfer Ω3 Ω4 Ω6 Ω5      5.87      5.70 - 6.05      5.88         79.5-84.4

• 453 % battery life improvement[5]
• 3x data reduction between wearable and cloud
• Non-functional requirement satisfied

[5] P. Michalák, P. Watson, “PATH2iot: A Holistic, Distributed Stream Processing System”, CloudCom, 2017
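As a quick arithmetic check on the headline figure, the sketch below recomputes the battery-life ratio from the two reported ranges (18.0-18.2 h for pp03, 79.5-84.4 h for pp09) using range midpoints, which is an assumption about how the slide's number was derived. It also pairs the higher reported power (26.69 mW) with the shorter life and the lower power (5.88 mW) with the longer life, to see whether both imply a similar battery energy. The variable names are illustrative only.

```python
def midpoint(low, high):
    """Midpoint of a reported range."""
    return (low + high) / 2.0


# Battery-life ranges as reported on the slide, in hours.
BASELINE_LIFE_H = midpoint(18.0, 18.2)   # pp03
BEST_LIFE_H = midpoint(79.5, 84.4)       # pp09

# Ratio of best-plan life to baseline life.
LIFE_RATIO = BEST_LIFE_H / BASELINE_LIFE_H

# Reported average power figures, in mW.
HIGH_POWER_MW = 26.69
LOW_POWER_MW = 5.88

# Implied battery energy (mWh) = average power x battery life.
ENERGY_HIGH_MWH = HIGH_POWER_MW * BASELINE_LIFE_H
ENERGY_LOW_MWH = LOW_POWER_MW * BEST_LIFE_H
```

Under these assumptions the ratio comes out at roughly 4.53, which lines up with the 453 % figure if that figure is read as a ratio, and the two implied battery energies agree to within about one percent.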

Page 17

Holistic Distributed Stream Processing in IoT Environments

• Holistic, Distributed Stream Processing System
• Design and open-source implementation[6]
• EPL decomposition
• Logical and Physical Optimisation
• Energy Impact coefficients for Pebble Watch
• Battery life increased dramatically
• PoC Deployment Architecture
• Future work on Multi-objective optimisation
  • e.g. Bandwidth, Performance, Accuracy

[6] https://github.com/PetoMichalak/iotPower