
Aerie: Flexible File-System Interfaces to Storage-Class Memory

Haris Volos†, Sanketh Nalli, Sankaralingam Panneerselvam, Venkatanathan Varadarajan, Prashant Saxena, Michael M. Swift

University of Wisconsin-Madison, †HP Labs

Storage-Class Memory (SCM)

• Persistent
• Short access time

[Figure: memory and storage technologies on a latency axis from ns to μs: flash-backed DRAM, phase-change memory, spin-torque MRAM, resistive RAM, and flash.]

Software overhead matters.

Storage-Class Memory (SCM)

• Persistent
• Short access time
• Byte addressable

[Figure: SCM sits next to DRAM behind the memory controller.]

Accessible via loads/stores, with no software overhead.

Accessing SCM today

• Direct user-mode access for fast access to data
  – Moneta-D, PMFS, Quill, NV-Heaps, Mnemosyne
• File system for sharing
  – Shared namespace
  – Protection
  – Integrity

[Figure: today's OS storage stack. Applications enter the kernel through the virtual file system; a disk FS then goes through the generic block layer, I/O scheduler, and device driver to the I/O bus, while an SCM FS sits alongside with its own device driver.]

Does SCM need a kernel FS?

SCM                        Disk
MMU protects CPU access    DMA is not protected
~Constant latency          Variable latency
Load/store interface       No standard interface

Library file systems (libFS)
[Exokernel (MIT), Nemesis (Cambridge)]

• Enable implementation flexibility
  – Optimize file-system interface semantics
  – Optimize metadata operations

[Figure: an application linked directly against a libFS, which exposes the file-system API in user mode.]

Aerie libFS in a nutshell

[Figure: each application links its own libFS (or a different libFS entirely), which implements layout and logic behind a user-mode API; below the user/kernel boundary, the kernel safely multiplexes the SCM hardware.]

[Figure: the shared namespace (/, common, alice, bob) is mapped into every process, so an open call is served by the libFS in user mode and the kernel stays out of the data path.]

Outline

• Overview
• Motivation: Interface flexibility
• Aerie: In-memory library file systems
• Evaluation
• Conclusion

POSIX File: Expensive abstraction

• Universal abstraction: everything is a file
  – Carries a generic-overhead cost

[Figure: applications reach storage files, IPC, and network sockets through the single POSIX file abstraction (virtual file system).]

POSIX File: Expensive abstraction

• Rigid interface and policies
  – Fixed components and costs
  – Hinders application-specific customization

[Figure: the POSIX file abstraction bundles UNIX concurrency semantics, hierarchical names, byte streams, and permissions.]

POSIX File: Expensive abstraction

[Figure: cost breakdown of an open() call through a kernel file system: syscall entry, naming + permissions, in-memory state for byte streams, concurrency control, and file descriptors. Total: ~2.5 μs, roughly 25x SCM latency.]

Motivating Example: Web Proxy

[Figure: a web proxy storing its cache under /cache.]

Characteristics:
• Flat namespace
• Immutable files
• Infrequent sharing

Motivating Example: Web Proxy

[Figure: the proxy cache could run over a specialized Proxy FS instead of a general-purpose POSIX FS.]

Customizing the file system today

• Modify the kernel
• Add a layer over an existing kernel file system
• Use a user-mode framework such as FUSE

Cumbersome options.

Flexible interfaces more important than ever

• Software interface overheads handicap fast SCM
• A flexible interface is a must for fast SCM
• Library file systems can help remove generic software overheads

Outline

• Overview
• Motivation: Interface flexibility
• Aerie: In-memory library file systems (libFS)
• Evaluation
• Conclusion

Kernel safely multiplexes SCM

• Allocation: allocates SCM regions (i.e., extents)
• Protection: keeps track of region access rights
• Addressing: memory-maps SCM regions

[Figure: the kernel layer exports allocation, protection, and addressing of SCM extents.]
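A minimal sketch of what such an extent interface might look like from user mode; the names and signatures below (scm_alloc_extent, scm_set_rights, scm_map_extent) are hypothetical illustrations of the three roles, not Aerie's actual kernel API:

```c
/* Hypothetical sketch of a kernel extent interface for SCM.
 * Names and signatures are illustrative only, not Aerie's API. */
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

typedef uint64_t extent_id_t;

/* Allocation: carve a region (extent) out of SCM. */
extent_id_t scm_alloc_extent(size_t bytes);

/* Protection: record which principal may access the extent,
 * and with what rights (e.g., read-only vs. read-write). */
int scm_set_rights(extent_id_t ext, uid_t principal, int rights);

/* Addressing: map the extent into the caller's address space;
 * afterwards it is accessed with ordinary loads and stores. */
void *scm_map_extent(extent_id_t ext);
```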

Library implements functionality

[Figure: each application links a libFS (layout, logic) exporting a file-system API in user mode; the kernel underneath provides only allocation, protection, and addressing.]

Implementing file-system features

• File-system objects
• Shared namespace
• Protection (access control)
• Integrity

File-system objects build on SCM extents

• Collection (or directory)
  – Maps key → object ID (oid)
• mFile (or memory file)
  – Maps offset → data extent ID
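A minimal sketch of these two object types in C; the struct layouts and field names are assumptions for illustration, not Aerie's on-SCM format:

```c
/* Illustrative sketch only: layouts and names are assumptions,
 * not Aerie's actual on-SCM object representation. */
#include <stdint.h>

typedef uint64_t oid_t;        /* object ID */
typedef uint64_t extent_id_t;  /* data extent ID */

/* Collection (directory): maps a name key to an object ID. */
struct collection_entry {
    char  key[256];
    oid_t oid;
};

/* mFile (memory file): maps a file offset to the extent
 * holding that range of the file's data. */
struct mfile_mapping {
    uint64_t    offset;
    extent_id_t extent;
};
```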

Shared namespace

[Figure: each application's libFS maps the shared namespace (/, common, alice, bob) into its address space, so an open call resolves names entirely in user mode.]

Decentralize access control via hardware-enforced permissions

[Figure: the namespace is mapped into both Alice's and Bob's processes, but memory protection prevents Bob from accessing Alice's files.]

Hardware protection cannot guarantee integrity

[Figure: a misbehaving libFS with write access to shared metadata (e.g. directory /common with entries foo and bar) can corrupt the namespace other users see; this motivates a Trusted FS Service (TFS).]

Integrity via Trusted File Service

[Figure: applications with their libFS instances run alongside a user-mode Trusted FS Service (TFS), all above the kernel's allocation/protection/addressing layer.]

Decentralized architecture

[Figure: each libFS reads and writes file data in SCM directly and reads metadata directly, but metadata updates are sent over RPC to the TFS metadata server, which enforces integrity.]

Reducing communication: Hierarchical leases + Batching

[Figure: the TFS gains a lease manager (for sharing) next to the metadata server (for integrity). A libFS acquires a hierarchical lease (e.g. on /common), appends its metadata updates to a local log, and ships them to the server in batches; the TFS also hands out free blocks in bulk and adjusts lease granularity under contention.]
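A sketch of how a libFS client could combine the two ideas; the function names (lease_acquire, log_append, batch_ship, lease_release) and the protocol details are hypothetical, chosen only to show why a hierarchical lease plus batching turns n metadata updates into two RPCs:

```c
/* Hypothetical sketch of the lease + batching pattern; names and
 * protocol details are illustrative, not Aerie's implementation. */
#include <stdio.h>

struct lease;       /* hierarchical lease on a subtree, e.g. /common */
struct update_log;  /* local log of pending metadata updates */

struct lease *lease_acquire(const char *path);              /* RPC to lease manager */
void          log_append(struct update_log *log, const char *op); /* local, no RPC */
void          batch_ship(struct update_log *log);            /* one RPC, many updates */
void          lease_release(struct lease *l);

void create_many(struct update_log *log, int n)
{
    /* One lease covers the whole subtree, so n creates need
     * only two RPCs (acquire + ship) instead of n. */
    struct lease *l = lease_acquire("/common");
    for (int i = 0; i < n; i++) {
        char op[64];
        snprintf(op, sizeof op, "CreateFile /common/file%d", i);
        log_append(log, op);
    }
    batch_ship(log);
    lease_release(l);
}
```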

Prototype Implementation

• Extent API added through modifications to the Linux 3.2.2 x86-64 kernel
• Communication via loopback RPC
• Crash consistency through
  – the x86 CLFLUSH instruction (cache-line flush)
  – redo logging
• SCM emulated using DRAM
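As a rough illustration of the first crash-consistency ingredient, this is the usual pattern for forcing a range of cache lines out to SCM with CLFLUSH; a minimal sketch assuming 64-byte cache lines, not Aerie's actual redo-logging code:

```c
/* Minimal sketch: flush a range to memory with CLFLUSH, assuming
 * 64-byte cache lines. Aerie's redo logging builds on primitives
 * like this; the rest of the logging code is not shown. */
#include <stddef.h>
#include <stdint.h>
#include <immintrin.h>

#define CACHELINE 64

static void persist(const void *addr, size_t len)
{
    uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);  /* evict each line to memory */
    _mm_mfence();                      /* order flushes before later stores */
}
```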

Example: A shared file

[Figure: the first application calls creat(/common/foo). Its libFS acquires a lease via LOCK (/common) over RPC, logs CreateFile and LinkBlock metadata updates, and then services write(...) with direct stores to file data in SCM.]

[Figure: a second application calls read(/common/foo). Its libFS issues READ (/common), which makes the first libFS RELEASE its lease and ship its logged updates; the reader then takes LOCK (foo) and reads the file data directly from SCM.]
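In application code the same sequence could look like the following, written against the POSIX-style interface of PXFS (described next); a sketch with error handling omitted, not a verbatim excerpt:

```c
/* Sketch of the shared-file example using POSIX-style calls. */
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

void writer(const char *buf, size_t len)
{
    /* libFS takes a lease on /common and logs CreateFile + LinkBlock;
     * the data itself is written with plain stores into SCM extents. */
    int fd = creat("/common/foo", 0644);
    write(fd, buf, len);
    close(fd);
}

void reader(char *buf, size_t len)
{
    /* Opening from another process revokes the writer's lease and
     * ships its batched metadata updates before the read proceeds. */
    int fd = open("/common/foo", O_RDONLY);
    read(fd, buf, len);
    close(fd);
}
```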

File Systems

Functionality: PXFS
• POSIX interface: open/read/write/unlink
• Hierarchical namespace
• POSIX concurrency semantics
• File byte streams

Optimization: FlatFS
• Key-value interface: put/get/erase
• Flat namespace
  – Simplifies name resolution
• KV-store concurrency semantics
  – Reduce in-memory state
• Short, immutable files
  – Simplify storage allocation

[Figure: one application runs over PXFS, another over FlatFS, side by side on the same SCM.]
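For contrast with the POSIX calls above, a FlatFS client might look like this; the C signatures are assumptions based only on the put/get/erase operations the slide names:

```c
/* Sketch of a FlatFS-style key-value interface; signatures are
 * assumptions based on the put/get/erase operations named above. */
#include <stddef.h>

int    flatfs_put(const char *key, const void *val, size_t len);
size_t flatfs_get(const char *key, void *buf, size_t buflen);
int    flatfs_erase(const char *key);

void cache_response(const char *url, const void *body, size_t len)
{
    /* One flat key per cached object: no hierarchical name
     * resolution, and each object is written once (immutable). */
    flatfs_put(url, body, len);
}
```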

Performance Evaluation

• Performance model
  – Writes to DRAM + software-created delay
  – Reads to DRAM
• Configurations
  – RamFS: in-memory kernel FS
  – ext4: ext4fs + RAM disk
  – LibFS: PXFS and FlatFS
• Filebench workloads: Fileserver, Webserver, Webproxy

Application-workload performance

[Figure: latency per operation (μs, 0-20) for RamFS, ext4, PXFS, and FlatFS on the Fileserver, Webserver, and Webproxy workloads, with annotated improvements of 10%, 9%, 22%, 30%, and 53% (2.1x).]

• PXFS performs better than kernel-mode file systems
• FlatFS exploits application semantics to improve performance further

Sensitivity to SCM performance: Webproxy

[Figure: latency per operation (μs, log scale) for PXFS, ext4, and FlatFS as extra software delay grows from 0 to 100, 1000, and 10000 ns.]

• Shorter SCM latencies favor
  – Direct access via load/store instructions
  – Interface specialization

Scalability: Webproxy

[Figure: throughput (Kops/s, 0-1200) versus thread count (0-10) for PXFS, RamFS, ext4, and FlatFS. Machine: Intel 6-core, 2-way HT.]

• FlatFS retains its benefits over kernel-mode file systems

Conclusion

• Software interface overheads handicap fast SCM
• A flexible interface is a must for fast SCM
• Aerie: library file systems help remove generic overheads for higher performance
  – FlatFS improves performance by up to 110%

Thank you! Questions?