Stefan Bucur Vlad Ureche Cristian Zam˚r George CandeaParallel Symbolic Execution for Automated...

1
Parallel Symbolic Execution for Automated Real-World Software Testing Cloud9 Stefan Bucur stefan.bucur@epfl.ch http://dslab.epfl.ch Vlad Ureche Cristian Zamfir George Candea vlad.ureche@epfl.ch cristian.zamfir@epfl.ch george.candea@epfl.ch Under The Hood Path Encoding 0 1 1 0 1 x=λ λ>0 λ≤0 Program State if (x>0) { ... x=λ x=λ Symbolic Data x>0 x≤0 Path Constraint W1 W2 W3 x=read(β) P1 P3 P2 Multiple processes Shared memory Multithreading Concretization + External Call Environment Model Parallel Symbolic Execution Symbolic Execution Parallel Tree Exploration Work Transfer The POSIX Environment Model http://cloud9.epfl.ch Full POSIX Model Symbolic Engine Modifications • Unconstrained symbolic data Execution forks when a branch in- volves symbolic values • Resulting execution tree increases ex- ponentially with program size — “path ex- plosion” problem • Parallel symbolic execution engine runs on commodity clusters Worker nodes run independent symbolic engines and are coordi- nated by a load balancer • When frontier becomes unbalanced, the load balancer instructs pairs of workers to exchange jobs • Jobs are encoded as paths from the tree root to the nodes, and the destination node “replays” that path • Outside the symbolic domain (e.g. the program under test), the environ- ment is complex • One may “concretize” calls and lose completeness • A model extends the symbolic domain, while simplifying the envi- ronment behavior Multithreading and Scheduling • Cooperative scheduler simplifies model implementa- tion • Deterministic (round-robin) or symbolic scheduling Address Spaces • Use copy-on-write (CoW) to reuse memory between processess, as well as across states • Put address spaces in CoW domains, to permit memory sharing 1 2 3 4 5 Scalable Cluster-Based Testing 0 1 2 3 4 5 6 2 4 6 12 24 48 Time to complete exhaustive test [hours] Number of workers Time to exhaustively test memcached with symbolic packets 0 10 20 30 40 50 0 10 20 30 40 50 60 70 80 90 Additional code covered [% of program LOC] Index of tested Coreutil (sorted by additional coverage) Cloud9 code coverage improvements on the 96 Coreutils (1-worker vs. 12-worker) Parallel symbolic execution on large clusters of commodity hardware • Suitable for running on public and private cloud infrastructures, such as Amazon EC2 or Eucalyptus Real-World Automated Testing Apache Memcached GNU Coreutils We used Cloud9 to test software ranging from system utilities to large networked and distributed systems Testing Platform API • An API that developers can use to produce symbolic test cases and control the behavior of the OS environment: Inject symbolic data Symbolic fault injection Thread scheduling • Cloud9 can test programs with complex environment interactions: char httpData[10]; make_symbolic(httpData); strcat(req, “X-NewExtension: ”); strcat(req, hData); ioctl(ssock, SIO_PKT_FRAGMENT, RD); ioctl(ssock, SIO_FAULT_INJ, RD | WR); Example: Preparing a symbolic HTTP packet to test a header extension multithreading multiple processes IPC synchronization networking filesystem To ensure exploration disjointness and completeness, each worker’s local tree has 3 kinds of nodes: internal nodes (already explored) fence nodes that demarcate the portion being ex- plored (correspond to nodes explored on other workers) candidate nodes (nodes ready to be explored lo- cally) W1 W2 Symbolic System Calls Replaced the standard OS system calls with a simplified set of calls into the symbolic execution engine: • create/destroy threads • fork/terminate processes • share memroy across processes • sleep/wake-up on waiting queues POSIX Model Architecture Symbolic Execution Engine Stream/Block Buffers Symbolic Files Network (TCP/UDP/UNIX) Pipes Threads C Library Program Under Test Aggregate view

Transcript of Stefan Bucur Vlad Ureche Cristian Zam˚r George CandeaParallel Symbolic Execution for Automated...

Page 1: Stefan Bucur Vlad Ureche Cristian Zam˚r George CandeaParallel Symbolic Execution for Automated Cloud9 Real-World Software Testing Stefan Bucur stefan.bucur@ep˜.ch ˜.ch Vlad Ureche

Parallel Symbolic Execution for AutomatedReal-World Software TestingCloud9

Stefan Bucurstefan.bucur@ep�.ch

http://dslab.ep�.ch

Vlad Ureche Cristian Zam�r George Candeavlad.ureche@ep�.ch cristian.zam�r@ep�.ch george.candea@ep�.ch

Under The Hood

PathEncoding

01

10

1

x=λλ>0 λ≤0

Program State

if (x>0) { ...x=λ x=λ

Symbolic Datax>0 x≤0PathConstraint

W1 W2

W3

x=read(β)

P1

P3

P2

Multipleprocesses

Sharedmemory

Multithreading

Concretization + External Call

EnvironmentModel

Parallel Symbolic Execution Symbolic Execution

Parallel Tree Exploration

Work Transfer

The POSIX Environment Model

http://cloud9.epfl.ch

Full POSIX Model

Symbolic Engine Modi�cations

• Unconstrained symbolic data • Execution forks when a branch in-volves symbolic values

• Resulting execution tree increases ex-ponentially with program size — “path ex-

plosion” problem

• Parallel symbolic execution engine runs on commodity clusters • Worker nodes run independent symbolic engines and are coordi-nated by a load balancer

• When frontier becomes unbalanced, the load balancer instructs pairs of workers to exchange jobs • Jobs are encoded as paths from the tree root to the nodes, and the destination node “replays” that path

• Outside the symbolic domain (e.g. the program under test), the environ-ment is complex

• One may “concretize” calls and lose completeness

• A model extends the symbolic domain, while simplifying the envi-ronment behavior

Multithreading and Scheduling • Cooperative scheduler simplifies model implementa-tion • Deterministic (round-robin) or symbolic scheduling

Address Spaces • Use copy-on-write (CoW) to reuse memory between processess, as well as across states • Put address spaces in CoW domains, to permit memory sharing

1

2

3

45

Scalable Cluster-Based Testing

0

1

2

3

4

5

6

2 4 6 12 24 48

Tim

e to

com

plet

eex

haus

tive

test

[h

ours

]

Number of workers

Time to exhaustively test memcachedwith symbolic packets

0

10

20

30

40

50

0 10 20 30 40 50 60 70 80 90Add

ition

al c

ode

cove

red

[% o

f pro

gram

LO

C]

Index of tested Coreutil (sorted by additional coverage)

Cloud9 code coverage improvementson the 96 Coreutils (1-worker vs. 12-worker)

• Parallel symbolic execution on large clusters of commodity hardware • Suitable for running on public and private cloud infrastructures, such as Amazon EC2 or Eucalyptus

Real-World Automated Testing

ApacheMemcached GNU Coreutils

We used Cloud9 to test software ranging fromsystem utilities to large networked and distributed systems

Testing Platform API • An API that developers can use to produce symbolic test cases and control the behavior of the OS environment: • Inject symbolic data • Symbolic fault injection • Thread scheduling• Cloud9 can test programs with complex environment interactions:

char httpData[10];make_symbolic(httpData);strcat(req, “X-NewExtension: ”);strcat(req, hData);

ioctl(ssock, SIO_PKT_FRAGMENT, RD);ioctl(ssock, SIO_FAULT_INJ, RD | WR);

Example: Preparing a symbolic HTTP packet to test a header extension

multithreading

multiple processes

IPC

synchronization

networking

�lesystem

To ensure exploration disjointness and completeness, each worker’s local tree has 3 kinds of nodes: internal nodes (already explored) fence nodes that demarcate the portion being ex-plored (correspond to nodes explored on other workers) candidate nodes (nodes ready to be explored lo-cally)

W1 W2

Symbolic System CallsReplaced the standard OS system calls with a simplified set of calls into the symbolic execution engine: • create/destroy threads • fork/terminate processes • share memroy across processes • sleep/wake-up on waiting queues

POSIX Model Architecture

Symbolic Execution Engine

Stream/Block Bu�ers

SymbolicFiles

Network(TCP/UDP/UNIX) Pipes

Thre

ads

C Library

Program Under Test

Aggregateview