Formal Techniques for Distributed Objects, Components, and Systems: LNCS 9039, 35th IFIP WG 6.1 International Conference, FORTE 2015



Susanne Graf · Mahesh Viswanathan (Eds.)

LNCS 9039

35th IFIP WG 6.1 International Conference, FORTE 2015
Held as Part of the 10th International Federated Conference
on Distributed Computing Techniques, DisCoTec 2015
Grenoble, France, June 2–4, 2015, Proceedings

Formal Techniques
for Distributed Objects, Components, and Systems


Lecture Notes in Computer Science 9039

Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zürich, Zürich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany


More information about this series at http://www.springer.com/series/7408


Susanne Graf · Mahesh Viswanathan (Eds.)

Formal Techniques
for Distributed Objects,
Components, and Systems

35th IFIP WG 6.1 International Conference, FORTE 2015
Held as Part of the 10th International Federated Conference
on Distributed Computing Techniques, DisCoTec 2015
Grenoble, France, June 2–4, 2015
Proceedings



Editors

Susanne Graf
Université Grenoble Alpes / VERIMAG
Grenoble, France

Mahesh Viswanathan
University of Illinois at Urbana-Champaign
Urbana, Illinois, USA

ISSN 0302-9743        ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-19194-2        ISBN 978-3-319-19195-9 (eBook)
DOI 10.1007/978-3-319-19195-9

Library of Congress Control Number: 2015939161

LNCS Sublibrary: SL2 – Programming and Software Engineering

Springer Cham Heidelberg New York Dordrecht London
© IFIP International Federation for Information Processing 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)


Foreword

The 10th International Federated Conference on Distributed Computing Techniques (DisCoTec) took place in Montbonnot, near Grenoble, France, during June 2–5, 2015. It was hosted and organized by INRIA, the French National Research Institute in Computer Science and Control. The DisCoTec series is one of the major events sponsored by the International Federation for Information Processing (IFIP). It comprises three conferences:

– COORDINATION, the IFIP WG6.1 International Conference on Coordination Models and Languages.

– DAIS, the IFIP WG6.1 International Conference on Distributed Applications and Interoperable Systems.

– FORTE, the IFIP WG6.1 International Conference on Formal Techniques for Distributed Objects, Components and Systems.

Together, these conferences cover a broad spectrum of distributed computing subjects, ranging from theoretical foundations and formal description techniques to systems research issues.

Each day of the federated event began with a plenary keynote speaker nominated by one of the conferences. The three invited speakers were Alois Ferscha (Johannes Kepler Universität, Linz, Austria), Leslie Lamport (Microsoft Research, USA), and Willy Zwaenepoel (EPFL, Lausanne, Switzerland).

Associated with the federated event were also three satellite workshops, which took place on June 5, 2015:

– The 2nd International Workshop on Formal Reasoning in Distributed Algorithms (FRIDA), with a keynote speech by Leslie Lamport (Microsoft Research, USA).

– The 8th International Workshop on Interaction and Concurrency Experience (ICE), with keynote lectures by Jade Alglave (University College London, UK) and Steve Ross-Talbot (ZDLC, Cognizant Technology Solutions, London, UK).

– The 2nd International Workshop on Meta Models for Process Languages (MeMo).

Sincere thanks go to the chairs and members of the Program and Steering Committees of the involved conferences and workshops for their highly appreciated efforts. Organizing DisCoTec was only possible thanks to the dedicated work of the Organizing Committee from INRIA Grenoble-Rhône-Alpes, including Sophie Azzaro, Vanessa Peregrin, Martine Consigney, Alain Kersaudy, Sophie Quinton, Jean-Bernard Stefani, and the excellent support from Catherine Nuel and the people at Insight Outside. Finally, many thanks go to IFIP WG6.1 for sponsoring this event, and to INRIA Grenoble-Rhône-Alpes and its Director Patrick Gros for their support and sponsorship.

Alain Girault
DisCoTec 2015 General Chair



DisCoTec Steering Committee

Farhad Arbab, CWI, Amsterdam, The Netherlands
Rocco De Nicola, IMT Lucca, Italy
Kurt Geihs, University of Kassel, Germany
Michele Loreti, University of Florence, Italy
Elie Najm, Telecom ParisTech, France (Chair)
Rui Oliveira, University of Minho, Portugal
Jean-Bernard Stefani, Inria Grenoble - Rhône-Alpes, France
Uwe Nestmann, Technical University of Berlin, Germany


Preface

This volume contains the proceedings of FORTE 2015, the 35th IFIP International Conference on Formal Techniques for Distributed Objects, Components and Systems. The conference was organized as part of the 10th International Federated Conference on Distributed Computing Techniques (DisCoTec) and was held in Grenoble, France, during June 2–4, 2015.

The FORTE conference series represents a forum for fundamental research on theory, models, tools, and applications for distributed systems. The conference encourages contributions that combine theory and practice, and that exploit formal methods and theoretical foundations to present novel solutions to problems arising from the development of distributed systems. FORTE covers distributed computing models and formal specification, testing, and verification methods. The application domains include all kinds of application-level distributed systems, telecommunication services, Internet, embedded and real-time systems, as well as networking and communication security and reliability.

We received a total of 53 full paper submissions for review. Each submission was reviewed by at least three members of the Program Committee. Based on high-quality reviews, and a thorough (electronic) discussion by the Program Committee, we selected 15 papers for presentation at the conference and for publication in this volume.

Leslie Lamport (Microsoft Research) was the keynote speaker of FORTE 2015. Leslie received the Turing Award in 2013. He is known for his seminal contributions in distributed systems. He has developed algorithms, formal models, and verification methods for distributed systems. Leslie's keynote lecture was on the Temporal Logic of Actions.

We would like to thank all those who contributed to the success of FORTE 2015: the authors, for submitting high-quality work to FORTE 2015; the Program Committee and the external reviewers, for providing constructive, high-quality reviews, an efficient discussion, and a fair selection of papers; the invited speaker, for an inspiring talk; and, of course, all the attendees of FORTE 2015. We are also grateful to the DisCoTec General Chair, Alain Girault, the Organization Chair, Jean-Bernard Stefani, and all members of their local organization team. The EasyChair conference management system facilitated PC discussions and the preparation of these proceedings. Thank you.

June 2015

Susanne Graf
Mahesh Viswanathan


Organization

Program Committee Chairs

Susanne Graf, VERIMAG & CNRS, Grenoble, France
Mahesh Viswanathan, University of Illinois at Urbana-Champaign, USA

Program Committee Members

Erika Abraham, RWTH Aachen University, Germany
Luca Aceto, Reykjavik University, Iceland
S. Akshay, IIT Bombay, India
Paul Attie, American University of Beirut, Lebanon
Rohit Chadha, University of Missouri, USA
Rance Cleaveland, University of Maryland, USA
Frank de Boer, CWI, Amsterdam, The Netherlands
Borzoo Bonakdarpour, McMaster University, Ontario, Canada
Michele Boreale, Università degli Studi di Firenze, Italy
Stephanie Delaune, CNRS & ENS Cachan, France
Wan Fokkink, Vrije Universiteit Amsterdam, The Netherlands
Gregor Goessler, Inria Grenoble, France
Gerard Holzmann, Jet Propulsion Laboratory, Pasadena, CA, USA
Alan Jeffrey, Alcatel-Lucent Bell Labs, USA
Petr Kuznetsov, Telecom ParisTech, France
Ivan Lanese, University of Bologna/INRIA, Italy
Kim Larsen, University of Aalborg, Denmark
Antonia Lopes, University of Lisbon, Portugal
Stephan Merz, LORIA & INRIA Nancy, France
Catuscia Palamidessi, INRIA Saclay, France
Alan Schmitt, IRISA & INRIA Rennes, France

Steering Committee

Erika Abraham, RWTH Aachen, Germany
Dirk Beyer, University of Passau, Germany
Michele Boreale, Università degli Studi di Firenze, Italy
Einar Broch Johnsen, University of Oslo, Norway
Frank de Boer, CWI, Amsterdam, The Netherlands
Holger Giese, University of Potsdam, Germany
Catuscia Palamidessi, INRIA, Saclay, France
Grigore Rosu, University of Illinois at Urbana-Champaign, USA
Jean-Bernard Stefani, INRIA, Grenoble, France (Chair)
Heike Wehrheim, University of Paderborn, Germany



Additional Reviewers

Agrawal, Shreya
Astefanoaei, Lacramioara
Azadbakht, Keyvan
Bauer, Matthew
Bettini, Lorenzo
Bezirgiannis, Nikolaos
Bracciali, Andrea
Bresolin, Davide
Castellani, Ilaria
Corzilius, Florian
Dalsgaard, Andreas Engelbredt
Dang, Thao
Della Monica, Dario
Demangeon, Romain
Denielou, Pierre-Malo
Di Giusto, Cinzia
Dokter, Kasper
Enea, Constantin
Fehnker, Ansgar
Foshammer, Louise
Francalanza, Adrian
Franco, Juliana
Griffith, Dennis
Guha, Shibashis
Henrio, Ludovic
Herbreteau, Frédéric
Hirsch, Martin
Höfner, Peter
Jongmans, Sung-Shik T.Q.
Kemper, Stephanie
Kini, Dileep
Laurent, Mounier
Lenglet, Sergueï
Loreti, Michele
Mandel, Louis
Marques, Eduardo R.B.
Martins, Francisco
Massink, Mieke
Mateescu, Radu
Mezzina, Claudio Antares
Najm, Elie
Ober, Iulian
Padovani, Luca
Peressotti, Marco
Pessaux, François
Phawade, Ramchandra
Poulsen, Danny Bøgsted
Prisacariu, Cristian
Pérez, Jorge A.
Quinton, Sophie
Ravi, Srivatsan
Reniers, Michel
Rezine, Ahmed
S. Krishna
Sangnier, Arnaud
Serbanescu, Vlad Nicolae
Sirjani, Marjan
Tapia Tarifa, Silvia Lizeth
Tiezzi, Francesco
Trivedi, Ashutosh
Valencia, Frank
Wognsen, Erik Ramsgaard
Xue, Bingtian


Contents

Ensuring Properties of Distributed Systems

Types for Deadlock-Free Higher-Order Programs . . . . . . . . . . . . . . . . . . . . . 3Luca Padovani and Luca Novara

On Partial Order Semantics for SAT/SMT-Based Symbolic Encodings of Weak Memory Concurrency . . . . . . . . . . . . . . . . . . . . . . . . 19
Alex Horn and Daniel Kroening

A Strategy for Automatic Verification of Stabilization of Distributed Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 35
Ritwika Ghosh and Sayan Mitra

Faster Linearizability Checking via P-Compositionality . . . . . . . . . . . . . 50
Alex Horn and Daniel Kroening

Translation Validation for Synchronous Data-Flow Specification in the SIGNAL Compiler . . . . . . . . . . . . . . . . . . . . . . . . 66
Van Chan Ngo, Jean-Pierre Talpin, and Thierry Gautier

Formal Models of Concurrent and Distributed Systems

Dynamic Causality in Event Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83Youssef Arbach, David Karcher, Kirstin Peters, and Uwe Nestmann

Loop Freedom in AODVv2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98Kedar S. Namjoshi and Richard J. Trefler

Code Mobility Meets Self-organisation: A Higher-Order Calculus of Computational Fields . . . . . . . . . . . . . . . . . . . . . . . . 113
Ferruccio Damiani, Mirko Viroli, Danilo Pianini, and Jacob Beal

Real Time Systems

Timely Dataflow: A Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Martín Abadi and Michael Isard

Difference Bound Constraint Abstraction for Timed Automata Reachability Checking . . . . . . . . . . . . . . . . . . . . . . . . 146
Weifeng Wang and Li Jiao



Compliance and Subtyping in Timed Session Types . . . . . . . . . . . . . . 161
Massimo Bartoletti, Tiziana Cimoli, Maurizio Murgia, Alessandro Sebastian Podda, and Livio Pompianu

Security

Type Checking Privacy Policies in the π-calculus . . . . . . . . . . . . . . . . . . . . . 181Dimitrios Kouzapas and Anna Philippou

Extending Testing Automata to All LTL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196Ala Eddine Ben Salem

Efficient Verification Techniques

Simple Isolation for an Actor Abstract Machine . . . . . . . . . . . . . . . . . . . . . . 213Benoit Claudel, Quentin Sabah, and Jean-Bernard Stefani

Sliced Path Prefixes: An Effective Method to Enable Refinement Selection . . . . . . . . . . . . . . . . . . . . . . . . 228
Dirk Beyer, Stefan Löwe, and Philipp Wendler

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245


Ensuring Properties of Distributed Systems


Types for Deadlock-Free Higher-Order Programs

Luca Padovani (✉) and Luca Novara

Dipartimento di Informatica, Università di Torino, Torino, Italy
[email protected]

Abstract. Type systems for communicating processes are typically studied using abstract models – e.g., process algebras – that distill the communication behavior of programs but overlook their structure in terms of functions, methods, objects, and modules. It is not always obvious how to apply these type systems to structured programming languages. In this work we port a recently developed type system that ensures deadlock freedom in the π-calculus to a higher-order language.

1 Introduction

In this article we develop a type system guaranteeing that well-typed programs that communicate over channels are free from deadlocks. Type systems ensuring this property already exist [7,8,10], but they all use the π-calculus as the reference language. This choice overlooks some aspects of concrete programming languages, like the fact that programs are structured into compartmentalized blocks (e.g., functions) within which only the local structure of the program (the body of a function) is visible to the type system, and little if anything is known about the exterior of the block (the callers of the function). The structure of programs may hinder some kinds of analysis: for example, the type systems in [7,8,10] enforce an ordering of communication events and, to do so, they take advantage of the nature of π-calculus processes, where programs are flat sequences of communication actions. How do we reason about such an ordering when the execution order is dictated by the reduction strategy of the language rather than by the syntax of programs, or when events occur within a function, and nothing is known about the events that are supposed to occur after the function terminates? We answer these questions by porting the type system in [10] to a higher-order functional language.

To illustrate the key ideas of the approach, let us consider the program

〈send a (recv b)〉 | 〈send b (recv a)〉        (1.1)

consisting of two parallel threads. The thread on the left is trying to send the message received from channel b on channel a; the thread on the right is trying to do the opposite. The communications on a and b are mutually dependent, and the program is a deadlock. The basic idea used in [10] and derived from [7,8] for detecting deadlocks is to assign each channel a number – which we call level – and to verify that channels are used in order according to their levels. In (1.1) this mechanism requires b to have a smaller level than a in the leftmost thread, and a to have a smaller level than b in the rightmost thread. No level assignment can simultaneously satisfy both constraints. In order to perform these checks with a type system, the first step is to attach levels to

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 3–18, 2015.
DOI: 10.1007/978-3-319-19195-9_1



channel types. We therefore assign the types ![int]m and ?[int]n respectively to a and b in the leftmost thread of (1.1), and ?[int]m and ![int]n to the same channels in the rightmost thread of (1.1). Crucially, distinct occurrences of the same channel have types with opposite polarities (input ? and output !) and equal level. We can also think of the assignments send : ∀ı.![int]ı → int → unit and recv : ∀ı.?[int]ı → int for the communication primitives, where we allow polymorphism on channel levels. In this case, the application send a (recv b) consists of two subexpressions, the partial application send a having type int → unit and its argument recv b having type int. Neither of these types hints at the I/O operations performed in these expressions, let alone at the levels of the channels involved. To recover this information we pair types with effects [1]: the effect of an expression is an abstract description of the operations performed during its evaluation. In our case, we take as effect the level of the channels used for I/O operations, or ⊥ in the case of pure expressions that perform no I/O. So, the judgment

b : ?[int]n ⊢ recv b : int & n

states that recv b is an expression of type int whose evaluation performs an I/O operation on a channel with level n. As usual, function types are decorated with a latent effect saying what happens when the function is applied to its argument. So,

a : ![int]m ⊢ send a : int →m unit & ⊥

states that send a is a function that, applied to an argument of type int, produces a result of type unit and, in doing so, performs an I/O operation on a channel with level m. By itself, send a is a pure expression whose evaluation performs no I/O operations, hence the effect ⊥. Effects help us detect dangerous expressions: in a call-by-value language an application e1e2 evaluates e1 first, then e2, and finally the body of the function resulting from e1. Therefore, the channels used in e1 must have smaller levels than those occurring in e2, and the channels used in e2 must have smaller levels than those occurring in the body of e1. In the specific case of send a (recv b) we have ⊥ < n for the first condition, which is trivially satisfied, and n < m for the second one. Since the same reasoning on send b (recv a) also requires the symmetric condition (m < n), we detect that the parallel composition of the two threads in (1.1) is ill typed, as desired.
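The two ordering conditions on an application, and the resulting unsatisfiability for (1.1), can be replayed mechanically. A minimal sketch, with Python standing in for the formal rules (the function names and the brute-force search over integer levels are ours, not the paper's):

```python
BOT = float("-inf")  # the effect ⊥ of pure expressions

# For an application e1 e2 under call-by-value, the paper requires:
#  (i)  the channels used by e1 lie strictly below those occurring in e2;
#  (ii) the channels used by e2 lie strictly below those in the body of e1.
def app_ok(eff1, lvl2, eff2, latent1):
    return eff1 < lvl2 and eff2 < latent1

# Thread <send a (recv b)> with a at level m and b at level n:
# e1 = send a (pure, latent effect m), e2 = recv b (level n, effect n),
# so the conditions are BOT < n and n < m.
def thread_ok(m, n):
    return app_ok(BOT, n, n, m)

# The two threads of (1.1) need n < m and m < n simultaneously;
# no assignment of levels satisfies both.
found = any(thread_ok(m, n) and thread_ok(n, m)
            for m in range(-5, 5) for n in range(-5, 5))
print(found)  # False: (1.1) is ill typed for every level assignment
```

Either thread in isolation is typeable (e.g., `thread_ok(2, 1)` holds); only their parallel composition is rejected, matching the discussion above.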

It turns out that the information given by latent effects in function types is not sufficient for spotting some deadlocks. To see why, consider the function

f def= λx.(send a x; send b x)

which sends its argument x on both a and b, and where ; denotes sequential composition. The level of a (say m) should be smaller than the level of b (say n), for a is used before b (we assume that communication is synchronous and that send is a potentially blocking operation). The question is, what is the latent effect that decorates the type of f, of the form int →h unit? Consider the two obvious possibilities: if we take h = m, then

〈recv a〉 | 〈f 3; recv b〉        (1.2)

is well typed because the effect m of f 3 is smaller than the level of b in recv b, which agrees with the fact that f 3 is evaluated before recv b; if we take h = n, then

〈recv a; f 3〉 | 〈recv b〉        (1.3)



is well typed for similar reasons. This is unfortunate because both (1.2) and (1.3) reduce to a deadlock. To flag both of them as ill typed, we must refine the type of f to int →m,n unit, where we distinguish the smallest level of the channels that occur in the body of f (that is, m) from the greatest level of the channels that are used by f when f is applied to an argument (that is, n). The first annotation gives information on the channels in the function's closure, while the second annotation is the function's latent effect, as before. So (1.2) is ill typed because the effect of f 3 is the same as the level of b in recv b, and (1.3) is ill typed because the effect of recv a is the same as the level of f in f 3.

In the following, we define a core multithreaded functional language with communication primitives (Section 2), we present a basic type and effect system, extend it to address recursive programs, and state its properties (Section 3). Finally, we briefly discuss closely related work and a few extensions (Section 4). Proofs and additional material can be found in the long version of the paper, on the first author's home page.

2 Language Syntax and Semantics

In defining our language, we assume a synchronous communication model based on linear channels. This assumption limits the range of systems that we can model. However, asynchronous and structured communications can be encoded using linear channels: this has been shown to be the case for binary sessions [5] and for multiparty sessions to a large extent [10, technical report].

We use a countable set of variables x, y, . . . , a countable set of channels a, b, . . . , and a set of constants k. Names u, . . . are either variables or channels. We consider a language of expressions and processes as defined below:

e ::= k | u | λx.e | ee        P, Q ::= 〈e〉 | (νa)P | P|Q

Expressions comprise constants k, names u, abstractions λx.e, and applications e1e2. We write _ for unused/fresh variables. Constants include the unitary value (), the integer numbers m, n, . . . , as well as the primitives fix, fork, new, send, recv, whose semantics will be explained shortly. Processes are either threads 〈e〉, or the restriction (νa)P of a channel a with scope P, or the parallel composition P|Q of processes.

The notions of free and bound names are as expected, given that the only binders are λ's and ν's. We identify terms modulo renaming of bound names and we write fn(e) (respectively, fn(P)) for the set of names occurring free in e (respectively, in P).

The reduction semantics of the language is given by two relations, one for expressions, another for processes. We adopt a call-by-value reduction strategy, for which we need to define reduction contexts E, . . . and values v, w, . . . respectively as:

E ::= [ ] | E e | vE        v, w ::= k | a | λx.e | send v

The reduction relation −→ for expressions is defined by the standard rules

(λx.e)v −→ e{v/x}        fix λx.e −→ e{fix λx.e/x}

and is closed under reduction contexts. As usual, e{e′/x} denotes the capture-avoiding substitution of e′ for the free occurrences of x in e.



Table 1. Reduction semantics of expressions and processes

〈E[send a v]〉 | 〈E′[recv a]〉 −a→ 〈E[()]〉 | 〈E′[v]〉        〈E[fork v]〉 −τ→ 〈E[()]〉 | 〈v()〉

〈E[new()]〉 −τ→ (νa)〈E[a]〉  (a ∉ fn(E))        e −→ e′ implies 〈e〉 −τ→ 〈e′〉

P −ℓ→ P′ implies P|Q −ℓ→ P′|Q        P −ℓ→ Q and ℓ ≠ a implies (νa)P −ℓ→ (νa)Q

P −a→ Q implies (νa)P −τ→ Q        P ≡ P′ −ℓ→ Q′ ≡ Q implies P −ℓ→ Q

The reduction relation of processes (Table 1) has labels ℓ, . . . that are either a channel name a, signalling that a communication has occurred on a, or the special symbol τ denoting any other reduction. There are four base reductions for processes: a communication occurs between two threads when one is willing to send a message v on a channel a and the other is waiting for a message from the same channel; a thread that contains a subexpression fork v spawns a new thread that evaluates v(); a thread that contains a subexpression new() creates a new channel; the reduction of an expression causes a corresponding τ-labeled reduction of the thread in which it occurs. Reduction for processes is then closed under parallel compositions, restrictions, and structural congruence. The restriction of a disappears as soon as a communication on a occurs: in our model channels are linear and can be used for one communication only; structured forms of communication can be encoded on top of this simple model (see Example 2 and [5]). Structural congruence is defined by the standard rules rearranging parallel compositions and channel restrictions, where 〈()〉 plays the role of the inert process.

We conclude this section with two programs written using a slightly richer language equipped with let bindings, conditionals, and a few additional operators. All these constructs either have well-known encodings or can be easily accommodated.

Example 1 (parallel Fibonacci function). The fibo function below computes the n-th number in the Fibonacci sequence and sends the result on a channel c:

1 fix λfibo.λn.λc.if n ≤ 1 then send c n

2 else let a = new() and b = new() in

3 (fork λ_.fibo (n - 1) a);

4 (fork λ_.fibo (n - 2) b);

5 send c (recv a + recv b)

The fresh channels a and b are used to collect the results from the recursive, parallel invocations of fibo. Note that expressions are intertwined with I/O operations. It is relevant to ask whether this version of fibo is deadlock free, namely whether it is able to reduce until a result is computed without blocking indefinitely on an I/O operation. □
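The example can be transcribed and run almost verbatim. In the sketch below (Python; a capacity-one queue stands in for a linear channel, an approximation since the calculus is synchronous while the queue buffers one message):

```python
import threading
import queue

def new():        return queue.Queue(maxsize=1)  # one linear channel
def send(c, v):   c.put(v)
def recv(c):      return c.get()
def fork(f):      threading.Thread(target=f).start()

def fibo(n, c):
    if n <= 1:
        send(c, n)
    else:
        a, b = new(), new()              # fresh channels collecting the
        fork(lambda: fibo(n - 1, a))     # results of the two parallel
        fork(lambda: fibo(n - 2, b))     # recursive invocations
        send(c, recv(a) + recv(b))

c = new()
fork(lambda: fibo(10, c))
print(recv(c))  # 55
```

Each channel carries exactly one message and is then abandoned, mirroring the linear discipline of the calculus.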

Example 2 (signal pipe). In this example we implement a function pipe that forwards signals received from an input stream x to an output stream y:

1 let cont = λx.let c = new() in (fork λ_.send x c); c in

2 let pipe = fix λpipe.λx.λy.pipe (recv x) (cont y)

Page 18: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

Types for Deadlock-Free Higher-Order Programs 7

Note that this pipe is only capable of forwarding handshaking signals. A more interesting pipe transmitting actual data can be realized by considering data types such as records and sums [5]. The simplified realization we consider here suffices to illustrate a relevant family of recursive functions that interleave actions on different channels.

Since linear channels are consumed after communication, each signal includes a continuation channel on which the subsequent signals in the stream will be sent/received. In particular, cont x sends a fresh continuation c on x and returns c, so that c can be used for subsequent communications, while pipe x y sends a fresh continuation on y after it has received a continuation from x, and then repeats this behavior on the continuations. The program below connects two pipes:

3 let a = new() and b = new() in

4 (fork λ_.pipe a b); (fork λ_.pipe b (cont a))

Even if the two pipes realize a cyclic network, we will see in Section 3 that this program is well typed and therefore deadlock free. Forgetting cont on line 4 or not forking the send on line 1, however, produces a deadlock. □
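A bounded transcription makes the deadlock-freedom claim observable. In the sketch below (Python; the bound k, the queue-based channels, and the join timeout are our testing scaffolding, not part of the example), three signals travel through the cyclic network and both pipes run to completion:

```python
import threading
import queue

def new():      return queue.Queue(maxsize=1)
def send(c, v): c.put(v)
def recv(c):    return c.get()
def fork(f):
    t = threading.Thread(target=f, daemon=True)
    t.start()
    return t

def cont(x):                  # line 1 of the example: send a fresh
    c = new()                 # continuation channel on x and return it
    fork(lambda: send(x, c))
    return c

def pipe(k, x, y):            # bounded variant of line 2:
    if k > 0:                 # forward k signals, then stop
        pipe(k - 1, recv(x), cont(y))

a, b = new(), new()
t1 = fork(lambda: pipe(3, a, b))
t2 = fork(lambda: pipe(3, b, cont(a)))   # lines 3-4: the cyclic network
t1.join(timeout=5); t2.join(timeout=5)
print(t1.is_alive() or t2.is_alive())    # False: both pipes completed
```

Dropping the `cont` in the last fork, or replacing the fork inside `cont` with a direct `send`, reproduces the deadlocks mentioned above: the join then times out with a thread still blocked.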

3 Type and Effect System

We present the features of the type system gradually, in three steps: we start with a monomorphic system (Section 3.1), then we introduce the level polymorphism required by Examples 1 and 2 (Section 3.2), and finally the recursive types required by Example 2 (Section 3.3). We end the section studying the properties of the type system (Section 3.4).

3.1 Core Types

Let L def= Z ∪ {⊥, ⊤} be the set of channel levels, ordered in the obvious way (⊥ < n < ⊤ for every n ∈ Z); we use ρ, σ, . . . to range over L and we write ρ ⊓ σ (respectively, ρ ⊔ σ) for the minimum (respectively, the maximum) of ρ and σ. Polarities p, q, . . . are non-empty subsets of {?, !}; we abbreviate {?} and {!} with ? and ! respectively, and {?, !} with #. Types t, s, . . . are defined by

t,s ::= B∣∣ p[t]n

∣∣ t →ρ ,σ s

where basic types B, . . . include unit and int. The type p[t]n denotes a channel withpolarity p and level n. The polarity describes the operations allowed on the channel: ?means input, ! means output, and # means both input and output. Channels are linearresources: they can be used once according to each element in their polarity. The typet →ρ ,σ s denotes a function with domain t and range s. The function has level ρ (itsclosure contains channels with level ρ or greater) and, when applied, it uses channelswith level σ or smaller. If ρ =�, the function has no channels in its closure; if σ =⊥,the function uses no channels when applied. We write → as an abbreviation for →�,⊥,so → denotes pure functions not containing and not using any channel.

Recall from Section 1 that levels are meant to impose an order on the use of channels: roughly, the lower the level of a channel, the sooner the channel must be used. We extend the notion of level from channel types to arbitrary types: basic types have level ⊤ because there is no need to use them as far as deadlock freedom is concerned; the level of functions is written in their type. Formally, the level of t, written |t|, is defined as:


8 L. Padovani and L. Novara

|B| ≝ ⊤        |p[t]n| ≝ n        |t →ρ,σ s| ≝ ρ        (3.1)

Levels can be used to distinguish linear types, denoting values (such as channels) that must be used to guarantee deadlock freedom, from unlimited types, denoting values that have no effect on deadlock freedom and may be disregarded. We say that t is linear if |t| ∈ ℤ; we say that t is unlimited, written un(t), if |t| = ⊤.
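The level function (3.1) and the linear/unlimited distinction can be transcribed directly. The encoding below (tagged tuples for types, ±math.inf for ⊤ and ⊥) is our own sketch, not the paper's formalization.

```python
import math

TOP, BOT = math.inf, -math.inf   # the levels ⊤ and ⊥

# types: ('basic', name) | ('chan', polarity, payload, level) | ('arrow', t, s, rho, sigma)
def level(t):
    tag = t[0]
    if tag == 'basic':               # |B| = ⊤
        return TOP
    if tag == 'chan':                # |p[t]^n| = n
        return t[3]
    if tag == 'arrow':               # |t ->^{rho,sigma} s| = rho
        return t[3]
    raise ValueError(t)

def unlimited(t):
    return level(t) == TOP           # un(t) iff |t| = ⊤

def linear(t):
    return level(t) not in (TOP, BOT)  # |t| ∈ Z
```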

Below are the type schemes of the constants that we consider. Some constants have many types (constraints are on the right); we write types(k) for the set of types of k.

() : unit
n : int
fix : (t → t) → t
fork : (unit →ρ,σ unit) → unit
new : unit → #[t]n        (n < |t|)
recv : ?[t]n →⊤,n t        (n < |t|)
send : ![t]n → t →n,n unit        (n < |t|)

The types of (), of the numbers, and of fix are ordinary. The primitive new creates a fresh channel with the full set # of polarities and arbitrary level n. The primitive recv takes a channel of type ?[t]n, blocks until a message is received, and returns the message. The primitive itself contains no free channels in its closure (hence the level ⊤) because the only channel it manipulates is its argument. The latent effect is the level of the channel, as expected. The primitive send takes a channel of type ![t]n, a message of type t, and sends the message on the channel. Note that the partial application send a is a function whose level and latent effect are both the level of a. Note also that in new, recv, and send the level of the message must be greater than the level of the channel: since levels are used to enforce an order on the use of channels, this condition follows from the observation that a message cannot be used until after it has been received, namely after the channel on which it travels has been used. Finally, fork accepts a thunk with arbitrary level ρ and latent effect σ and spawns the thunk into an independent thread (see Table 1). Note that fork is a pure function with no latent effect, regardless of the level and latent effect of the thunk. This phenomenon is called effect masking [1], whereby the effect of evaluating an expression becomes unobservable: in our case, fork discharges effects because the thunk runs in parallel with the code executing the fork.

We now turn to the typing rules. A type environment Γ is a finite map u1 : t1, ..., un : tn from names to types. We write ∅ for the empty type environment, dom(Γ) for the domain of Γ, and Γ(u) for the type associated with u in Γ; we write Γ1, Γ2 for the union of Γ1 and Γ2 when dom(Γ1) ∩ dom(Γ2) = ∅. We also need a more flexible way of combining type environments. In particular, we make sure that every channel is used linearly by distributing different polarities of a channel to different parts of the program. To this aim, following [9], we define a partial combination operator + between types:

t + t ≝ t        if un(t)
p[t]n + q[t]n ≝ (p ∪ q)[t]n        if p ∩ q = ∅        (3.2)

that we extend to type environments, thus:

Γ + Γ′ ≝ Γ, Γ′        if dom(Γ) ∩ dom(Γ′) = ∅
(Γ, u : t) + (Γ′, u : s) ≝ (Γ + Γ′), u : t + s        (3.3)

For example, we have (x : int, a : ![int]n) + (a : ?[int]n) = x : int, a : #[int]n, so we might have some part of the program that (possibly) uses a variable x of type int along with channel a for sending an integer and another part of the program that uses the same channel a but this time for receiving an integer. The first part of the program would be typed in the environment x : int, a : ![int]n and the second one in the environment a : ?[int]n. Overall, the two parts would be typed in the environment x : int, a : #[int]n

indicating that a is used for both sending and receiving an integer.

We extend the function |·| to type environments so that |Γ| ≝ ⊓u∈dom(Γ) |Γ(u)|, with the convention that |∅| = ⊤; we write un(Γ) if |Γ| = ⊤.

Table 2. Core typing rules for expressions and processes

Typing of expressions

[T-NAME]   Γ, u : t ⊢ u : t & ⊥   if un(Γ)

[T-CONST]   Γ ⊢ k : t & ⊥   if un(Γ) and t ∈ types(k)

[T-FUN]   Γ, x : t ⊢ e : s & ρ   ⟹   Γ ⊢ λx.e : t →|Γ|,ρ s & ⊥

[T-APP]   Γ1 ⊢ e1 : t →ρ,σ s & τ1   and   Γ2 ⊢ e2 : t & τ2   ⟹   Γ1 + Γ2 ⊢ e1 e2 : s & σ ⊔ τ1 ⊔ τ2   if τ1 < |Γ2| and τ2 < ρ

Typing of processes

[T-THREAD]   Γ ⊢ e : unit & ρ   ⟹   Γ ⊢ ⟨e⟩

[T-PAR]   Γ1 ⊢ P   and   Γ2 ⊢ Q   ⟹   Γ1 + Γ2 ⊢ P | Q

[T-NEW]   Γ, a : #[t]n ⊢ P   ⟹   Γ ⊢ (νa)P

We are now ready to discuss the core typing rules, shown in Table 2. Judgments of the form Γ ⊢ e : t & ρ denote that e is well typed in Γ, has type t, and has effect ρ; judgments of the form Γ ⊢ P simply denote that P is well typed in Γ.

Axioms [T-NAME] and [T-CONST] are unremarkable: as in all substructural type systems, the unused part of the type environment must be unlimited. Names and constants have no effect (⊥); they are evaluated expressions that do not use (but may contain) channels.

In rule [T-FUN], the effect ρ caused by evaluating the body of the function becomes the latent effect in the arrow type of the function, and the function itself has no effect. The level of the function is determined by that of the environment Γ in which the function is typed. Intuitively, the names in Γ are stored in the closure of the function; if any of these names is a channel, then we must be sure that the function is eventually used (i.e., applied) to guarantee deadlock freedom. In fact, |Γ| gives slightly more precise information, since it records the smallest level of all channels that occur in the body of the function. We have seen in Section 1 why this information is useful. A few examples:

– the identity function λx.x has type int →⊤,⊥ int in any unlimited environment;
– the function λ_.a has type unit →n,⊥ ![int]n in the environment a : ![int]n; it contains channel a with level n in its closure (whence the level n in the arrow), but it does not use a for input/output (whence the latent effect ⊥); it is nonetheless well typed because a, which is a linear value, is returned as result;
– the function λx.send x 3 has type ![int]n →⊤,n unit; it has no channels in its closure but it performs an output on the channel it receives as argument;


– the function λx.(recv a + x) has type int →n,n int in the environment a : ?[int]n; note that neither the domain nor the codomain of the function mention any channel, so the fact that the function has a channel in its closure (and that it performs some I/O) can only be inferred from the annotations on the arrow;
– the function λx.send x (recv a) has type ![int]n+1 →n,n+1 unit in the environment a : ![int]n; it contains channel a with level n in its closure and performs input/output operations on channels with level n+1 (or smaller) when applied.

Rule [T-APP] deals with applications e1 e2. The first thing to notice is the type environments in the premises for e1 and e2. Normally, these are exactly the same as the type environment used for the whole application. In our setting, however, we want to distribute polarities in such a way that each channel is used for exactly one communication. For this reason, the type environment Γ1 + Γ2 in the conclusion is the combination of the type environments in the premises. Regarding effects, τi is the effect caused by the evaluation of ei. As expected, e1 must result in a function of type t →ρ,σ s and e2 in a value of type t. The evaluation of e1 and e2 may however involve blocking I/O operations on channels, and the two side conditions make sure that no deadlock can arise. To better understand them, recall that reduction is call-by-value and applications e1 e2 are evaluated sequentially from left to right. Now, the condition τ1 < |Γ2| makes sure that any I/O operation performed during the evaluation of e1 involves only channels whose level is smaller than that of the channels occurring free in e2 (the free channels of e2 must necessarily be in Γ2). This is enough to guarantee that the functional part of the application can be fully evaluated without blocking on operations concerning channels that occur later in the program. In principle, this condition should be paired with the symmetric one τ2 < |Γ1|, making sure that any I/O operation performed during the evaluation of the argument does not involve channels that occur in the functional part. However, when the argument is being evaluated, we know that the functional part has already been reduced to a value (see the definition of reduction contexts in Section 2). Therefore, the only really critical condition to check is that no channel involved in I/O operations during the evaluation of e2 occurs in the value of e1. This is expressed by the condition τ2 < ρ, where ρ is the level of the functional part. Note that, when e1 is an abstraction, by rule [T-FUN] ρ coincides with |Γ1|, but in general ρ may be greater than |Γ1|, so the condition τ2 < ρ gives better accuracy. The effect of the whole application e1 e2 is, as expected, the combination of the effects of evaluating e1, e2, and the latent effect of the function being applied. In our case the "combination" is the greatest level of any channel involved in the application. Below are some examples:

– (λx.x) a is well typed, because both λx.x and a are pure expressions whose effect is ⊥, hence the two side conditions of [T-APP] are trivially satisfied;
– (λx.x) (recv a) is well typed in the environment a : ?[int]n: the effect of recv a is n (the level of a), which is smaller than the level ⊤ of the function;
– send a (recv a) is ill typed in the environment a : #[int]n because the effect of evaluating recv a, namely n, is the same as the level of send a;
– (recv a) (recv b) is well typed in the environment a : ?[int → int]0, b : ?[int]1. The effect of the argument is 1, which is not smaller than the level of the environment a : ?[int → int]0 used for typing the function. However, 1 is smaller than ⊤, which is the level of the result of the evaluation of the functional part of the application. This application would be illegal had we used the side condition τ2 < |Γ1| in [T-APP].

The typing rules for processes are standard: [T-PAR] splits contexts for typing the processes in parallel, [T-NEW] introduces a new channel in the environment, and [T-THREAD] types threads. The effect of threads is ignored: effects are used to prevent circular dependencies between channels used within the sequential parts of the program (i.e., within expressions); circular dependencies that arise between parallel threads are indirectly detected by the fact that each occurrence of a channel is typed with the same level (see the discussion of (1.1) in Section 1).

3.2 Level Polymorphism

Looking back at Example 1, we notice that fibo n c may generate two recursive calls with two corresponding fresh channels a and b. Since the send operation on c is blocked by recv operations on a and b (line 5), the level of a and b must be smaller than that of c. Also, since expressions are evaluated left-to-right and recv a + recv b is syntactic sugar for the application (+) (recv a) (recv b), the level of a must be smaller than that of b. Thus, to declare fibo well typed, we must allow different occurrences of fibo to be applied to channels with different levels. Even more critically, this form of level polymorphism of fibo is necessary within the definition of fibo itself, so it is an instance of polymorphic recursion [1].

The core typing rules in Table 2 do not support level polymorphism. Following the previous discussion on fibo, the idea is to realize level polymorphism by shifting levels in types. We define level shifting as a type operator ⇑n, thus:

⇑n B ≝ B        ⇑n p[t]m ≝ p[⇑n t]n+m        ⇑n (t →ρ,σ s) ≝ ⇑n t →n+ρ,n+σ ⇑n s        (3.4)

where + is extended from integers to levels so that n + ⊤ = ⊤ and n + ⊥ = ⊥. The effect of ⇑n t is to shift all the finite level annotations in t by n, leaving ⊤ and ⊥ unchanged.

Now, we have to understand in which cases we can use a value of type ⇑n t where one of type t is expected; more specifically, when a value of type ⇑n t can be passed to a function expecting an argument of type t. This is possible if the function has level ⊤. We express this form of level polymorphism with an additional typing rule for applications:

[T-APP-POLY]   Γ1 ⊢ e1 : t →⊤,σ s & τ1   and   Γ2 ⊢ e2 : ⇑n t & τ2   ⟹   Γ1 + Γ2 ⊢ e1 e2 : ⇑n s & (n + σ) ⊔ τ1 ⊔ τ2   if τ1 < |Γ2| and τ2 < ⊤

This rule admits an arbitrary mismatch n between the level of the argument expected by the function and that of the argument supplied to the function. The type of the application and the latent effect are consequently shifted by the same amount n.

Soundness of [T-APP-POLY] can be intuitively explained as follows: a function with level ⊤ has no channels in its closure. Therefore, the only channels possibly manipulated by the function are those contained in the argument to which the function is applied or channels created within the function itself. Then, the fact that the argument has level n + k rather than level k is completely irrelevant. Conversely, if the function has channels in its closure, then the absolute level of the argument might have to satisfy specific ordering constraints with respect to these channels (recall the two side conditions in [T-APP]). Since level polymorphism is a key distinguishing feature of our type system, and one that accounts for much of its expressiveness, we elaborate more on this intuition using an example. Consider the term

fwd ≝ λx.λy.send y (recv x)

which forwards on y the message received from x. The derivation

[T-APP]   y : ![int]1 ⊢ send y : int →1,1 unit & ⊥
[T-APP]   x : ?[int]0 ⊢ recv x : int & 0
[T-APP]   x : ?[int]0, y : ![int]1 ⊢ send y (recv x) : unit & 1
[T-FUN]   x : ?[int]0 ⊢ λy.send y (recv x) : ![int]1 →0,1 unit & ⊥
[T-FUN]   ⊢ fwd : ?[int]0 → ![int]1 →0,1 unit & ⊥

does not depend on the absolute values 0 and 1, but only on the level of x being smaller than that of y, as required by the fact that the send operation on y is blocked by the recv operation on x. Now, consider an application fwd a, where a has type ?[int]2. The mismatch between the level of x (0) and that of a (2) is not critical, because all the levels in the derivation above can be uniformly shifted up by 2, yielding a derivation for

⊢ fwd : ?[int]2 → ![int]3 →2,3 unit & ⊥

This shifting is possible because fwd has no free channels in its body (indeed, it is typed in the empty environment). Therefore, using [T-APP-POLY], we can derive

a : ?[int]2 ⊢ fwd a : ![int]3 →2,3 unit & ⊥

Note that (fwd a) is a function having level 2. This means that (fwd a) is not level polymorphic and can only be applied, through [T-APP], to channels with level 3. If we allowed (fwd a) to be applied to a channel with level 2 using [T-APP-POLY], we could derive

a : #[int]2 ⊢ fwd a a : unit & 2

which reduces to a deadlock.

Example 3. To show that the term in Example 1 is well typed, consider the environment

Γ ≝ fibo : int → ![int]0 →⊤,0 unit, n : int, c : ![int]0

In the proof derivation for the body of fibo, this environment is eventually enriched with the assignments a : #[int]−2 and b : #[int]−1. Now we can derive

[T-APP]   Γ ⊢ fibo (n - 2) : ![int]0 →⊤,0 unit & ⊥
[T-NAME]   a : ![int]−2 ⊢ a : ![int]−2 & ⊥
[T-APP-POLY]   Γ, a : ![int]−2 ⊢ fibo (n - 2) a : unit & −2

where the application fibo (n - 2) a is well typed despite the fact that fibo (n - 2) expects an argument of type ![int]0, while a has type ![int]−2. A similar derivation can be obtained for fibo (n - 1) b, and the proof derivation can now be completed. □

3.3 Recursive Types

Looking back at Example 2, we see that in a call pipe x y the channel recv x is used in the same position as x. Therefore, according to [T-APP-POLY], recv x must have the same type as x, up to some shifting of its level. Similarly, channel c is both sent on y and then used in the same position as y, suggesting that c must have the same type as y, again up to some shifting of its level. This means that we need recursive types in order to properly describe x and y.

Instead of adding explicit syntax for recursive types, we just consider the possibly infinite trees generated by the productions for t shown earlier. In light of this broader notion of types, the inductive definition of type level (3.1) is still well founded, but type shift (3.4) must be reinterpreted coinductively, because it has to operate on possibly infinite trees. The formalities, nonetheless, are well understood.

It is folklore that, whenever infinite types are regular (that is, when they are made of finitely many distinct subtrees), they admit finite representations either using type variables and the familiar μ notation, or using systems of type equations [4]. Unfortunately, a careful analysis of Example 2 suggests that – at least in principle – we also need non-regular types. To see why, let a and c be the channels to which (recv x) and (cont y) respectively evaluate on line 2 of the example. Now:

– x must have smaller level than a, since a is received from x (cf. the types of recv).
– y must have smaller level than c, since c is sent on y (cf. the types of send).
– x must have smaller level than y, since x is used in the functional part of an application in which y occurs in the argument (cf. line 2 and [T-APP-POLY]).

Overall, in order to type pipe in Example 2 we should assign x and y the types tn and sn that respectively satisfy the equations

tn = ?[tn+2]n        sn = ![tn+3]n+1        (3.5)

Unfortunately, these equations do not admit regular types as solutions. We recover typeability of pipe with regular types by introducing a new type constructor

t ::= ··· | ⌈t⌉n

that wraps types with a pending shift: intuitively ⌈t⌉n and ⇑n t denote the same type, except that in ⌈t⌉n the shift ⇑n on t is pending. For example, ⌈?[int]0⌉1 and ⌈?[int]2⌉−1 are both possible wrappings of ?[int]1, while int →0,⊥ ![int]0 is the unwrapping of ⌈int →1,⊥ ![int]1⌉−1. To exclude meaningless infinite types such as ⌈⌈⌈···⌉n⌉n⌉n we impose a contractiveness condition requiring every infinite branch of a type to contain infinite occurrences of channel or arrow constructors. To see why wraps help find regular representations for otherwise non-regular types, observe that the equations

tn = ?[⌈tn⌉2]n        sn = ![⌈tn+1⌉2]n+1        (3.6)

denote – up to pending shifts – the same types as the ones in (3.5), with the key difference that (3.6) admit regular solutions and therefore finite representations. For example, tn could be finitely represented as a familiar-looking μα.?[⌈α⌉2]n term.

We should remark that ⌈t⌉n and ⇑n t are different types, even though the former is morally equivalent to the latter: wrapping is a type constructor, whereas shift is a type operator. Having introduced a new constructor, we must suitably extend the notions of type level (3.1) and type shift (3.4) we have defined earlier. We postulate

|⌈t⌉n| ≝ n + |t|        ⇑n ⌈t⌉m ≝ ⌈⇑n t⌉m

in accordance with the fact that ⌈·⌉n denotes a pending shift by n (note that |·| extended to wrappings is well defined thanks to the contractiveness condition).

We also have to define introduction and elimination rules for wrappings. To this aim, we conceive two constants, wrap and unwrap, having the following type schemes:

wrap : ⇑n t → ⌈t⌉n        unwrap : ⌈t⌉n → ⇑n t

We add wrap v to the value forms. Operationally, we want wrap and unwrap to annihilate each other. This is done by enriching reduction for expressions with the axiom

unwrap (wrap v)−→ v

Example 4. We suitably dress the code in Example 2 using wrap and unwrap:

1 let cont = λx.let c = new() in (fork λ_.send x (wrap c)); c in

2 let pipe = fix λpipe.λx.λy.pipe (unwrap (recv x)) (cont y)

and we are now able to find a typing derivation for it that uses regular types. In particular, we assign cont the type sn → sn+2 and pipe the type tn → sn →n,⊤ unit, where tn and sn are the types defined in (3.6). Note that cont is a pure function because its effects are masked by fork, and that pipe has latent effect ⊤ since it loops performing recv operations on channels with increasing level. Because of the side conditions in [T-APP] and [T-APP-POLY], this means that pipe can only be used in tail position, which is precisely what happens above and in Example 2. □

3.4 Properties

To formulate subject reduction, we must take into account that linear channels are consumed after communication (last but one reduction in Table 1). This means that when a process P communicates on some channel a, a must be removed from the type environment used for typing the residual of P. To this aim, we define a partial operation Γ − ℓ that removes ℓ from Γ, when ℓ is a channel. Formally:

Theorem 1 (Subject Reduction). If Γ ⊢ P and P −ℓ→ Q, then Γ − ℓ ⊢ Q, where Γ − τ ≝ Γ and (Γ, a : #[t]n) − a ≝ Γ.

Note that Γ − a is undefined if a ∉ dom(Γ). This means that well-typed programs never attempt to use the same channel twice, namely that channels in well-typed programs are indeed linear channels. This property has important practical consequences, since it allows the efficient implementation (and deallocation) of channels [9].

Deadlock freedom means that if the program halts, then there must be no pending I/O operations. In our language, the only halted program without pending operations is (structurally equivalent to) ⟨()⟩. We can therefore define deadlock freedom thus:

Definition 1. We say that P is deadlock free if P −τ→* Q ↛ implies Q ≡ ⟨()⟩.

As usual, −τ→* is the reflexive, transitive closure of −τ→, and Q ↛ means that Q is unable to reduce further. Now, every well-typed, closed process is free from deadlocks:

Theorem 2 (Soundness). If ∅ ⊢ P, then P is deadlock free.

Theorem 2 may look weaker than desirable, considering that every process P (even an ill-typed one) can be "fixed" and become part of a deadlock-free system if composed in parallel with the diverging thread ⟨fix λx.x⟩. It is not easy to state an interesting property of well-typed partial programs – programs that are well typed in uneven environments – or of partial computations – computations that have not reached a stable (i.e., irreducible) state. One might think that well-typed programs eventually use all of their channels. This property is false in general, for two reasons. First, our type system does not ensure termination of well-typed expressions, so a thread like ⟨send a (fix λx.x)⟩ never uses channel a, because the evaluation of the message diverges. Second, there are threads that continuously generate (or receive) new channels, so that the set of channels they own is never empty; this happens in Example 2. What we can prove is that, assuming that a well-typed program does not internally diverge, then each channel it owns is eventually used for a communication or is sent to the environment in a message. To formalize this property, we need a labeled transition system describing the interaction of programs with their environment. Labels π, ... of transitions are defined by

π ::= ℓ | a?e | a!v

and the transition relation −π→ extends reduction with the rules

a ∉ bn(C)  ⟹  C[send a v] −a!v→ C[()]
a ∉ bn(C), fn(e) ∩ bn(C) = ∅  ⟹  C[recv a] −a?e→ C[e]

where C ranges over process contexts C ::= ⟨E⟩ | (C | P) | (P | C) | (νa)C. Messages of input transitions have the form a?e, where e is an arbitrary expression instead of a value. This is just to allow a technically convenient formulation of Definition 2 below. We formalize the assumption concerning the absence of internal divergences as a property that we call interactivity. Interactivity is a property of typed processes, which we write as pairs Γ ⊢ P, since the messages exchanged between a process and the environment in which it executes are not arbitrary in general.

Definition 2 (Interactivity). Interactivity is the largest predicate on well-typed processes such that Γ ⊢ P interactive implies Γ ⊢ P and:

1. P has no infinite reduction P −ℓ1→ P1 −ℓ2→ P2 −ℓ3→ ···, and
2. if P −ℓ→ Q, then Γ − ℓ ⊢ Q is interactive, and

3. if P −a!v→ Q and Γ = Γ′, a : ![t]n, then Γ′′ ⊢ Q is interactive for some Γ′′ ⊆ Γ′, and
4. if P −a?x→ Q and Γ = Γ′, a : ?[t]n, then Γ′′ ⊢ Q{v/x} is interactive for some v and Γ′′ ⊇ Γ′ such that n < |Γ′′ \ Γ′|.

Clause (1) says that an interactive process does not internally diverge: it will eventually halt, either because it terminates or because it needs interaction with the environment in which it executes. Clause (2) states that internal reductions preserve interactivity. Clause (3) states that a process with a pending output on a channel a must reduce to an interactive process after the output is performed. Finally, clause (4) states that a process with a pending input on a channel a may reduce to an interactive process after the input of a particular message v is performed. The definition looks demanding, but many conditions are direct consequences of Theorem 1. The really new requirements besides well-typedness are convergence of P (1) and the existence of v (4). It is now possible to prove that well-typed, interactive processes eventually use their channels.

Theorem 3 (Interactivity). Let Γ ⊢ P be an interactive process such that a ∈ fn(P). Then P −π1→ P1 −π2→ ··· −πn→ Pn for some π1, ..., πn such that a ∉ fn(Pn).

4 Concluding Remarks

We have demonstrated the portability of a type system for deadlock freedom of π-calculus processes [10] to a higher-order language using an effect system [1]. We have shown that effect masking and polymorphic recursion are key ingredients of the type system (Examples 1 and 2), and also that latent effects must be paired with one more annotation – the function level. The approach may seem to hinder program modularity, since it requires storing levels in types and levels have global scope. In this respect, level polymorphism (Section 3.2) alleviates this shortcoming of levels by granting them a relative – rather than absolute – meaning, at least for non-linear functions.

Other type systems for higher-order languages with session-based communication primitives have been recently investigated [6,14,2]. In addition to safety, types are used for estimating bounds on the size of message queues [6] and for detecting memory leaks [2]. Since binary sessions can be encoded using linear channels [5], our type system can address the same family of programs considered in these works, with the advantage that, in our case, well-typed programs are guaranteed to be deadlock free also in the presence of session interleaving. For instance, the pipe function in Example 2 interleaves communications on two different channels. The type system described by Wadler [14] is interesting because it guarantees deadlock freedom without resorting to any type annotation dedicated to this purpose. In his case the syntax of (well-typed) programs prevents the modeling of cyclic network topologies, which is a necessary condition for deadlocks. However, this also means that some useful program patterns cannot be modeled. For instance, the program in Example 2 is ill typed in [14].

The type system discussed in this paper lacks compelling features. Structured data types (records, sums) have been omitted for lack of space; an extended technical report [13] and previous works [11,10] show that they can be added without issues. The same goes for non-linear channels [10], possibly with the help of dedicated accept and request primitives as in [6]. True polymorphism (with level and type variables) has also been studied in the technical report [13]. Its impact on the overall type system is significant, especially because level and type constraints (those appearing as side conditions in the type schemes of constants, Section 3.1) must be promoted from the metatheory to the type system. The realization of level polymorphism as type shifting that we have adopted in this paper is an interesting compromise between impact and flexibility. Our type system can also be relaxed with subtyping: arrow types are contravariant in the level and covariant in the latent effect, whereas channel types are invariant in the level. Invariance of channel levels can be relaxed by refining levels to pairs of numbers, as done in [7,8]. This can also improve the accuracy of the type system in some cases, as discussed in [10] and [3]. It would be interesting to investigate which of these features are actually necessary for typing concrete functional programs using threads and communication/synchronization primitives.

Type reconstruction algorithms for similar type systems have been defined [11,12]. We are confident that they scale to type systems with arrow types and effects.

Acknowledgments. The authors are grateful to the reviewers for their detailed comments and useful suggestions. The first author has been supported by Ateneo/CSP project SALT, ICT COST Action IC1201 BETTY, and MIUR project CINA.

References

1. Amtoft, T., Nielson, F., Nielson, H.: Type and Effect Systems: Behaviours for Concurrency. Imperial College Press (1999)
2. Bono, V., Padovani, L., Tosatto, A.: Polymorphic Types for Leak Detection in a Session-Oriented Functional Language. In: Beyer, D., Boreale, M. (eds.) FMOODS/FORTE 2013. LNCS, vol. 7892, pp. 83–98. Springer, Heidelberg (2013)
3. Carbone, M., Dardha, O., Montesi, F.: Progress as compositional lock-freedom. In: Kühn, E., Pugliese, R. (eds.) COORDINATION 2014. LNCS, vol. 8459, pp. 49–64. Springer, Heidelberg (2014)
4. Courcelle, B.: Fundamental properties of infinite trees. Theor. Comp. Sci. 25, 95–169 (1983)
5. Dardha, O., Giachino, E., Sangiorgi, D.: Session types revisited. In: PPDP 2012, pp. 139–150. ACM (2012)
6. Gay, S.J., Vasconcelos, V.T.: Linear type theory for asynchronous session types. J. Funct. Program. 20(1), 19–50 (2010)
7. Kobayashi, N.: A type system for lock-free processes. Inf. and Comp. 177(2), 122–159 (2002)
8. Kobayashi, N.: A new type system for deadlock-free processes. In: Baier, C., Hermanns, H. (eds.) CONCUR 2006. LNCS, vol. 4137, pp. 233–247. Springer, Heidelberg (2006)
9. Kobayashi, N., Pierce, B.C., Turner, D.N.: Linearity and the pi-calculus. ACM Trans. Program. Lang. Syst. 21(5), 914–947 (1999)
10. Padovani, L.: Deadlock and Lock Freedom in the Linear π-Calculus. In: CSL-LICS 2014, pp. 72:1–72:10. ACM (2014), http://hal.archives-ouvertes.fr/hal-00932356v2/
11. Padovani, L.: Type Reconstruction for the Linear π-Calculus with Composite and Equi-Recursive Types. In: Muscholl, A. (ed.) FOSSACS 2014. LNCS, vol. 8412, pp. 88–102. Springer, Heidelberg (2014)
12. Padovani, L., Chen, T.-C., Tosatto, A.: Type Reconstruction Algorithms for Deadlock-Free and Lock-Free Linear π-Calculi. In: Holvoet, T., Viroli, M. (eds.) COORDINATION 2015. LNCS, vol. 9037, pp. 85–100. Springer, Heidelberg (2015)
13. Padovani, L., Novara, L.: Types for Deadlock-Free Higher-Order Concurrent Programs. Technical report, Università di Torino (2014), http://hal.inria.fr/hal-00954364
14. Wadler, P.: Propositions as sessions. In: ICFP 2012, pp. 273–286. ACM (2012)


On Partial Order Semantics for SAT/SMT-Based Symbolic Encodings of Weak Memory Concurrency

Alex Horn (✉) and Daniel Kroening

University of Oxford, Oxford, UK
[email protected]

Abstract. Concurrent systems are notoriously difficult to analyze, and technological advances such as weak memory architectures greatly compound this problem. This has renewed interest in partial order semantics as a theoretical foundation for formal verification techniques. Among these, symbolic techniques have been shown to be particularly effective at finding concurrency-related bugs because they can leverage highly optimized decision procedures such as SAT/SMT solvers. This paper gives new fundamental results on partial order semantics for SAT/SMT-based symbolic encodings of weak memory concurrency. In particular, we give the theoretical basis for a decision procedure that can handle a fragment of concurrent programs endowed with least fixed point operators. In addition, we show that a certain partial order semantics of relaxed sequential consistency is equivalent to the conjunction of three extensively studied weak memory axioms by Alglave et al. An important consequence of this equivalence is an asymptotically smaller symbolic encoding for bounded model checking, which has only a quadratic number of partial order constraints compared to the state-of-the-art cubic-size encoding.

1 Introduction

Concurrent systems are notoriously difficult to analyze, and technological advances such as weak memory architectures as well as highly available distributed services greatly compound this problem. This has renewed interest in partial order concurrency semantics as a theoretical foundation for formal verification techniques. Among these, symbolic techniques have been shown to be particularly effective at finding concurrency-related bugs because they can leverage highly optimized decision procedures such as SAT/SMT solvers. This paper studies partial order semantics from the perspective of SAT/SMT-based symbolic encodings of weak memory concurrency.

Given the diverse range of partial order concurrency semantics, we link our study to a recently developed unifying theory of concurrency by Tony Hoare et al. [1]. This theory is known as Concurrent Kleene Algebra (CKA), which is an algebraic concurrency semantics based on quantales, a special case of the

This work is funded by a gift from Intel Corporation for research on Effective Validation of Firmware and the ERC project ERC 280053.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 19–34, 2015.
DOI: 10.1007/978-3-319-19195-9_2


fundamental algebraic structure of idempotent semirings. Based on quantales, CKA combines the familiar laws of the sequential program operator (;) with a new operator for concurrent program composition (‖). A distinguishing feature of CKA is its exchange law (U ‖ V); (X ‖ Y) ⊆ (U; X) ‖ (V; Y), which describes how sequential and concurrent composition operators can be interchanged. Intuitively, since the binary relation ⊆ denotes program refinement, the exchange law expresses a divide-and-conquer mechanism for how concurrency may be sequentially implemented on a machine. The exchange law, together with a uniform treatment of programs and their specifications, is key to unifying existing theories of concurrency [2]. CKA provides such a unifying theory [3,2] that has practical relevance for proving program correctness, e.g. using rely/guarantee reasoning [1]. Conversely, however, pure algebra cannot refute that a program is correct or that certain properties about every program always hold [3,2,4]. This is problematic for theoretical reasons, but also in practice, because today's software complexity requires a diverse set of program analysis tools that range from proof assistants to automated testing. The solution is to accompany CKA with a mathematical model which satisfies its laws so that we can prove as well as disprove properties about programs.

One such well-known model-theoretical foundation for CKA is Pratt's [5] and Gischer's [6] partial order model of computation that is constructed from labelled partially ordered multisets (pomsets). Pomsets generalize the concept of a string in finite automata theory by relaxing the total ordering of the occurrence of letters within a string to a partial order. For example, a ‖ a denotes a pomset that consists of two unordered events that are both labelled with the letter a. By partially ordering events, pomsets form an integral part of the extensive theoretical literature on so-called 'true concurrency', e.g. [7,8,9,10,5,6], in which pomsets strictly generalize Mazurkiewicz traces [11], and prime event structures [10] are pomsets enriched with a conflict relation subject to certain conditions. From an algorithmic point of view, the complexity of the pomset language membership (PLM) problem is NP-complete, whereas the pomset language containment (PLC) problem is Πᵖ₂-complete [12].

Importantly, these aforementioned theoretical results only apply to star-free pomset languages (without fixed point operators). In fact, the decidability of the equational theory of the pomset language closed under least fixed point, sequential and concurrent composition operators (but without the exchange law) has been established only most recently [13]; its complexity remains an open problem [13]. Yet another open problem is the decidability of this equational theory together with the exchange law [13]. In addition, it is still unclear how theoretical results about pomsets may be applicable to formal techniques for finding concurrency-related bugs. In fact, it is not even clear how insights about pomsets may be combined with most recently studied language-specific or hardware-specific concurrency semantics, e.g. [14,15,16,17].

These gaps are motivation to reinvestigate pomsets from an algorithmic perspective. In particular, our work connects pomsets to a SAT/SMT-based bounded model checking technique [18] where shared memory concurrency is symbolically encoded as partial orders. To make this connection, we adopt pomsets as partial strings (Definition 1) that are ordered by a refinement relation (Definition 3) based on Ésik's notion of monotonic bijective morphisms [19]. Our partial-string model then follows from the standard Hoare powerdomain construction where sets of partial strings are downward-closed with respect to monotonic bijective morphisms (Definition 4). The relevance of this formalization for the modelling of weak memory concurrency (including data races) is explained through several examples. Our main contributions are as follows:

1. We give the theoretical basis for a decision procedure that can handle a fragment of concurrent programs endowed with least fixed point operators (Theorem 2). This is accomplished by exploiting a form of periodicity, thereby giving a mechanism for reducing a countably infinite number of events to a finite number. This result particularly caters to partial order encoding techniques that can currently only encode a finite number of events due to the deliberate restriction to quantifier-free first-order logic, e.g. [18].

2. We then interpret a particular form of weak memory in terms of certain downward-closed sets of partial strings (Definition 11), and show that our interpretation is equivalent to the conjunction of three fundamental weak memory axioms (Theorem 3), namely 'write coherence', 'from-read' and 'global read-from' [17]. Since all three axioms underpin extensive experimental research into weak memory architectures [20], Theorem 3 gives denotational partial order semantics a new practical dimension.

3. Finally, we prove that there exists an asymptotically smaller quantifier-free first-order logic formula that has only O(N²) partial order constraints (Theorem 4) compared to the state-of-the-art O(N³) partial order encoding for bounded model checking [18], where N is the maximal number of reads and writes on the same shared memory address. This is significant because N can be prohibitively large when concurrent programs frequently share data.

The rest of this paper is organized into three parts. First, we recall familiar concepts of partial-string theory (§ 2) on which the rest of this paper is based. We then prove a least fixed point reduction result (§ 3). Finally, we characterize a particular form of relaxed sequential consistency in terms of three weak memory axioms by Alglave et al. (§ 4).

2 Partial-String Theory

In this section, we adapt an axiomatic model of computation that uses partial orders to describe the semantics of concurrent systems. For this, we recall familiar concepts (Definitions 1, 2, 3 and 4) that underpin our mathematical model of CKA (Theorem 1). This model is the basis for subsequent results in § 3 and § 4.

Definition 1 (Partial String). Denote with E a nonempty set of events. Let Γ be an alphabet. A partial string p is a triple 〈Ep, αp, ≼p〉 where Ep is a subset of E, αp : Ep → Γ is a function that maps each event in Ep to an alphabet symbol in Γ, and ≼p is a partial order on Ep. Two partial strings p and q are said to be disjoint whenever Ep ∩ Eq = ∅. A partial string p is called empty whenever Ep = ∅. Denote with Pf the set of all finite partial strings p whose event set Ep is finite.

[Fig. 1 diagram: an inverted Hasse diagram with e0 drawn above e1 and e2 above e3, i.e. e0 ≼p e1 and e2 ≼p e3, and no order between the two chains]

Fig. 1. A partial string p = 〈Ep, αp, ≼p〉 with events Ep = {e0, e1, e2, e3} and the labelling function αp satisfying the following: αp(e0) = 'r0 := [b]acquire', αp(e1) = 'r1 := [a]none', αp(e2) = '[a]none := 1' and αp(e3) = '[b]release := 1'

Each event in the universe E should be thought of as an occurrence of a computational step, whereas letters in Γ describe the computational effect of events. Typically, we denote a partial string by p, or letters from x through z. In essence, a partial string p is a partially-ordered set 〈Ep, ≼p〉 equipped with a labelling function αp. A partial string is therefore the same as a labelled partial order (lpo), see also Remark 1. We draw finite partial strings in Pf as inverted Hasse diagrams (e.g. Fig. 1), where the ordering between events may be interpreted as a happens-before relation [8], a fundamental notion in distributed systems and formal verification of concurrent systems, e.g. [16,17]. We remark the obvious fact that the empty partial string is unique under component-wise equality.

Example 1. In the partial string in Fig. 1, e0 happens-before e1, whereas both e0 and e2 happen concurrently because neither e0 ≼p e2 nor e2 ≼p e0.

We abstractly describe the control flow in concurrent systems by adopting the sequential and concurrent operators on labelled partial orders [9,5,6,19,21].

Definition 2 (Partial String Operators). Let x and y be disjoint partial strings. Let x ‖ y ≜ 〈Ex‖y, αx‖y, ≼x‖y〉 and x; y ≜ 〈Ex;y, αx;y, ≼x;y〉 be their concurrent and sequential composition, respectively, where Ex‖y = Ex;y ≜ Ex ∪ Ey such that, for all events e, e′ in Ex ∪ Ey, the following holds:

– e ≼x‖y e′ exactly if e ≼x e′ or e ≼y e′,
– e ≼x;y e′ exactly if (e ∈ Ex and e′ ∈ Ey) or e ≼x‖y e′,
– αx‖y(e) = αx;y(e) ≜ αx(e) if e ∈ Ex, and αy(e) if e ∈ Ey.

For simplicity, we assume that partial strings can always be made disjoint by renaming events if necessary. But this assumption could be avoided by using coproducts, a form of constructive disjoint union [21]. When clear from the context, we construct partial strings directly from the labels in Γ.

Example 2. If we ignore labels for now and let pi, for all 0 ≤ i ≤ 3, be four partial strings which each consist of a single event ei, then (p0; p1) ‖ (p2; p3) corresponds to a partial string that is isomorphic to the one shown in Fig. 1.
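To make Definitions 1 and 2 concrete, here is a minimal Python sketch of finite partial strings and the two composition operators. The names (PartialString, par, seq, atom) are ours, not from the paper, and the partial order is kept as an explicit set of pairs:

```python
from itertools import product

class PartialString:
    """A finite partial string: events, a labelling function, and a partial
    order kept as a set of ordered pairs (reflexive pairs added here)."""
    def __init__(self, events, labels, order):
        self.events = frozenset(events)
        self.labels = dict(labels)                      # event -> letter of the alphabet
        self.order = set(order) | {(e, e) for e in events}

    def leq(self, e1, e2):
        """The partial order predicate e1 ≼p e2."""
        return (e1, e2) in self.order

def par(x, y):
    """Concurrent composition x ‖ y: disjoint union with no new order."""
    assert not (x.events & y.events), "operands must be disjoint"
    return PartialString(x.events | y.events,
                         {**x.labels, **y.labels},
                         x.order | y.order)

def seq(x, y):
    """Sequential composition x ; y: every event of x precedes every event of y."""
    p = par(x, y)
    p.order |= set(product(x.events, y.events))
    return p

def atom(e, label):
    """A single-event partial string."""
    return PartialString({e}, {e: label}, set())

# (p0 ; p1) ‖ (p2 ; p3) -- isomorphic to the partial string of Fig. 1
p = par(seq(atom('e0', 'r0 := [b]acquire'), atom('e1', 'r1 := [a]none')),
        seq(atom('e2', '[a]none := 1'), atom('e3', '[b]release := 1')))

assert p.leq('e0', 'e1') and p.leq('e2', 'e3')          # happens-before, as in Example 1
assert not p.leq('e0', 'e2') and not p.leq('e2', 'e0')  # e0 and e2 are concurrent
```

Running the composition reproduces exactly the happens-before pairs of Examples 1 and 2.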

To formalize the set of all possible happens-before relations of a concurrent system, we rely on Ésik's notion of monotonic bijective morphism [19]:

Definition 3 (Partial String Refinement). Let x and y be partial strings such that x = 〈Ex, αx, ≼x〉 and y = 〈Ey, αy, ≼y〉. A monotonic bijective morphism from x to y, written f : x → y, is a bijective function f from Ex to Ey such that, for all events e, e′ ∈ Ex, αx(e) = αy(f(e)) and, if e ≼x e′, then f(e) ≼y f(e′). Then x refines y, written x ⊑ y, if there exists a monotonic bijective morphism f : y → x from y to x.

[Fig. 2 diagram: the partial string y on the left orders e0 ≼y e1 and e2 ≼y e3 (as in Fig. 1); the N-shaped partial string x on the right orders e′0 ≼x e′1 and e′2 ≼x e′3, plus one additional ordering constraint between the two chains that gives the N shape]

Fig. 2. Two partial strings x and y such that x ⊑ y provided all the labels are preserved, e.g. αx(e′0) = αy(e0)

Remark 1. Partial words [9] and pomsets [5,6] are defined in terms of isomorphism classes of lpos. Unlike lpos in pomsets, however, we study partial strings in terms of monotonic bijective morphisms [19] because isomorphisms are about sameness whereas the exchange law on partial strings is an inequation [21].

The purpose of Definition 3 is to disregard the identity of events but retain the notion of 'subsumption', cf. [6]. The intuition is that ⊑ orders partial strings according to their determinism. In other words, x ⊑ y for partial strings x and y implies that all events ordered in y have the same order in x.

Example 3. Fig. 2 shows a monotonic bijective morphism from a partial string as given in Fig. 1 to an N-shaped partial string that is almost identical to the one in Fig. 1 except that it has an additional partial order constraint, giving its N shape. One well-known fact about N-shaped partial strings is that they cannot be constructed as x; y or x ‖ y under any labelling [5]. However, this is not a problem for our study, as will become clear after Definition 4.

Our notion of partial string refinement is particularly appealing for symbolic techniques of concurrency because the monotonic bijective morphism can be directly encoded as a first-order logic formula modulo the theory of uninterpreted functions. Such a symbolic partial order encoding would be fully justified from a computational complexity perspective, as shown next.

Proposition 1. Let x and y be finite partial strings in Pf. The partial string refinement (PSR) problem, i.e. deciding whether x ⊑ y, is NP-complete.

Proof. Clearly PSR is in NP. The NP-hardness proof proceeds by reduction from the PLM problem [12]. Let Γ∗ be the set of strings, i.e. the set of finite partial strings s such that ≼s is a total order (for all e, e′ ∈ Es, e ≼s e′ or e′ ≼s e). Given a finite partial string p, let Lp be the set of all strings which refine p; equivalently, Lp ≜ {s ∈ Γ∗ | s ⊑ p}. So Lp denotes the same as L(p) in [12, Definition 2.2].


Let s be a string in Γ∗ and P be a pomset over the alphabet Γ. By Remark 1, fix p to be a partial string in P. Thus s refines p if and only if s is a member of Lp. Since this membership problem is NP-hard [12, Theorem 4.1], it follows that the PSR problem is NP-hard. So the PSR problem is NP-complete. □
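The brute-force nature of the PSR problem can be illustrated directly. The Python sketch below uses our own encoding (a partial string given as a label map plus a set of order pairs, names of our choosing) and searches for a monotonic bijective morphism f : y → x, which by Definition 3 is exactly the witness for x ⊑ y; the exhaustive search over bijections is exponential, in line with Proposition 1's NP-completeness result:

```python
from itertools import permutations

def refines(x_labels, x_order, y_labels, y_order):
    """Check x ⊑ y: search for a label-preserving bijection f from the
    events of y to the events of x such that e ≼y e' implies f(e) ≼x f(e')."""
    ey, ex = sorted(y_labels), sorted(x_labels)
    if len(ey) != len(ex):
        return False
    for image in permutations(ex):
        f = dict(zip(ey, image))
        if all(y_labels[e] == x_labels[f[e]] for e in ey) and \
           all((f[e1], f[e2]) in x_order for (e1, e2) in y_order):
            return True
    return False

# y is the pomset a ‖ a (two unordered events); x totally orders the same labels (a ; a)
y_labels, y_order = {'e0': 'a', 'e1': 'a'}, set()
x_labels, x_order = {'f0': 'a', 'f1': 'a'}, {('f0', 'f1')}
assert refines(x_labels, x_order, y_labels, y_order)      # x ⊑ y: x is more deterministic
assert not refines(y_labels, y_order, x_labels, x_order)  # y does not refine x
```

The example matches the intuition after Definition 3: the more ordered string refines the less ordered one, never the other way around.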

Note that a single partial string is not enough to model mutually exclusive (nondeterministic) control flow. To see this, consider a simple (possibly sequential) system such as if * then P else Q, where * denotes nondeterministic choice. If the semantics of a program were a single partial string, then we would need to find exactly one partial string that represents the fact that P executes or Q executes, but never both. To model this, rather than using a conflict relation [10], we resort to the simpler Hoare powerdomain construction where we lift the sequential and concurrent composition operators to sets of partial strings. But since we are aiming (similar to Gischer [6]) at an over-approximation of concurrent systems, these sets are downward closed with respect to our partial string refinement ordering from Definition 3. Additional benefits of using the downward closure include that program refinement then coincides with familiar set inclusion, and the ease with which the Kleene star operators can later be defined.

Definition 4 (Program). A program is a downward-closed set of finite partial strings with respect to ⊑; equivalently, X ⊆ Pf is a program whenever ↓X = X where ↓X ≜ {y ∈ Pf | ∃x ∈ X : y ⊑ x}. Denote with P the family of all programs.

Since we only consider systems that terminate, each partial string x in a program X is finite. We reemphasize that the downward closure of such a set X can be thought of as an over-approximation of all possible happens-before relations in a concurrent system whose instructions are ordered according to the partial strings in X. Later on (§ 4) we make the downward closure of partial strings more precise to model a certain kind of relaxed sequential consistency.

Example 4. Recall that N-shaped partial strings cannot be constructed as x; y or x ‖ y under any labelling [5]. Yet, by downward-closure of programs, such partial strings are included in the over-approximation of all the happens-before relations exhibited by a concurrent system. In particular, according to Example 3, the downward-closure of the set containing the partial string in Fig. 1 includes (among many others) the N-shaped partial string shown on the right in Fig. 2. In fact, we shall see in § 4 that this particular N-shaped partial string corresponds to a data race in the concurrent system shown in Fig. 3.

It is standard [6,21] to define 0 ≜ ∅ and 1 ≜ {⊥} where ⊥ is the (unique) empty partial string. Clearly 0 and 1 form programs in the sense of Definition 4. For the next theorem, we lift the two partial string operators (Definition 2) to programs in the standard way:

Definition 5 (Bow Tie). Given two partial strings x and y, denote with x ⋈ y either concurrent or sequential composition of x and y. For all programs X, Y in P and partial string operators ⋈, X ⋈ Y ≜ ↓{x ⋈ y | x ∈ X and y ∈ Y}, where X ‖ Y and X; Y are called concurrent and sequential program composition, respectively.


By denoting programs as sets of partial strings, we can now define Kleene star operators (−)^‖ and (−)^; for iterative concurrent and sequential program composition, respectively, as least fixed points (μ) using set union (∪) as the binary join operator that we interpret as the nondeterministic choice of two programs. We remark that this is fundamentally different from the pomset recursion operators in ultra-metric spaces [22]. The next theorem could then be summarized as saying that the resulting structure of programs, written S, is a partial order model of an algebraic concurrency semantics that satisfies the CKA laws [1]. Since CKA is an exemplar of the universal laws of programming [2], we base the rest of this paper on our partial order model of CKA.

Theorem 1. The structure S = 〈P, ⊆, ∪, 0, 1, ;, ‖〉 is a complete lattice, ordered by subset inclusion (i.e. X ⊆ Y exactly if X ∪ Y = Y), such that ‖ and ; form unital quantales over ∪ where S satisfies the following:

(U ‖ V); (X ‖ Y) ⊆ (U; X) ‖ (V; Y)
X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z
X ∪ X = X
X ∪ 0 = 0 ∪ X = X
X ∪ Y = Y ∪ X
X ‖ Y = Y ‖ X
X ‖ 1 = 1 ‖ X = X
X; 1 = 1; X = X
X ‖ 0 = 0 ‖ X = 0
X; 0 = 0; X = 0
X ‖ (Y ∪ Z) = (X ‖ Y) ∪ (X ‖ Z)
X; (Y ∪ Z) = (X; Y) ∪ (X; Z)
(X ∪ Y) ‖ Z = (X ‖ Z) ∪ (Y ‖ Z)
(X ∪ Y); Z = (X; Z) ∪ (Y; Z)
X ‖ (Y ‖ Z) = (X ‖ Y) ‖ Z
X; (Y; Z) = (X; Y); Z
P^‖ = μX . 1 ∪ (P ‖ X)
P^; = μX . 1 ∪ (P; X)

Proof. The details are in the accompanying technical report of this paper [21].

By Theorem 1, it makes sense to call 1 in the structure S the ⋈-identity program, where ⋈ is a placeholder for either ; or ‖. In the sequel, we call the binary relation ⊆ on P the program refinement relation.

3 Least Fixed Point Reduction

This section is about the least fixed point operators (−)^; and (−)^‖. Henceforth, we shall denote these by (−)^⋈. We show that under a certain finiteness condition (Definition 7) the program refinement problem X^⋈ ⊆ Y^⋈ can be reduced to a bounded number of program refinement problems without least fixed points (Theorem 2). To prove this, we start by inductively defining the notion of iteratively composing a program with itself under ⋈.

Definition 6 (n-iterated-⋈-program-composition). Let N0 ≜ N ∪ {0} be the set of non-negative integers. For all programs P in P and non-negative integers n in N0, P^(0·⋈) ≜ 1 = {⊥} is the ⋈-identity program and P^((n+1)·⋈) ≜ P ⋈ P^(n·⋈).

Clearly (−)^⋈ is the limit of its approximations in the following sense:


Proposition 2. For every program P in P, P^⋈ = ⋃_{n≥0} P^(n·⋈).

Definition 7 (Elementary Program). A program P in P is called elementary if P is the downward-closed set, with respect to ⊑, of some finite and nonempty set Q of finite partial strings, i.e. P = ↓Q. The set of elementary programs is denoted by P⋆.

An elementary program can therefore be seen as a machine-representable program generated from a finite and nonempty set of finite partial strings. This finiteness restriction makes the notion of elementary programs a suitable candidate for the study of decision procedures. To make this precise, we define the following unary partial string operator:

Definition 8 (n-repeated-⋈ Partial String Operator). For every non-negative integer n in N0, x^(0·⋈) ≜ ⊥ is the empty partial string and x^((n+1)·⋈) ≜ x ⋈ x^(n·⋈).

Intuitively, p^(n·⋈) is a partial string that consists of n copies of a partial string p, each combined by the partial string operator ⋈. This is formalized as follows:

Proposition 3. Let n ∈ N0 be a non-negative integer. Define [0] ≜ ∅ and [n + 1] ≜ {1, . . . , n + 1}. For every partial string x, x^(n·⋈) is isomorphic to y = 〈Ey, αy, ≼y〉 where Ey ≜ Ex × [n] such that, for all e, e′ ∈ Ex and i, i′ ∈ [n], the following holds:

– if '⋈' is '‖', then 〈e, i〉 ≼y 〈e′, i′〉 exactly if i = i′ and e ≼x e′,
– if '⋈' is ';', then 〈e, i〉 ≼y 〈e′, i′〉 exactly if i < i′ or (i = i′ and e ≼x e′),
– αy(〈e, i〉) = αx(e).

Definition 9 (Partial String Size). The size of a finite partial string p, denoted by |p|, is the cardinality of its event set Ep.

For example, the partial string in Fig. 1 has size four. It is obvious that the size of finite partial strings is non-decreasing under the n-repeated-⋈ partial string operator from Definition 8 whenever 0 < n. This simple fact is important for the next step towards our least fixed point reduction result in Theorem 2:

Proposition 4 (Elementary Least Fixed Point Pre-reduction). For all elementary programs X and Y in P⋆, if the ⋈-identity program 1 is not in Y and X ⊆ Y^⋈, then X ⊆ ⋃_{n≥k≥0} Y^(k·⋈) where n = ⌊ℓX/ℓY⌋ such that ℓX ≜ max {|x| | x ∈ X} and ℓY ≜ min {|y| | y ∈ Y} is the size of the largest and smallest partial strings in X and Y, respectively.

Proof. Assume X ⊆ Y^⋈. Let x ∈ Pf be a finite partial string. We can assume x ∈ X because X ≠ 0. By assumption, x ∈ Y^⋈. By Proposition 2, there exists k ∈ N0 such that x ∈ Y^(k·⋈). Fix k to be the smallest such non-negative integer. We show k ≤ ⌊ℓX/ℓY⌋ (the fraction is well-defined because X and Y are nonempty and 1 ∉ Y). By downward closure and the definition of ⊑ in terms of a one-to-one correspondence, it suffices to consider the case where x is one of the (not necessarily unique) longest partial strings in X, i.e. |x′| ≤ |x| for all x′ ∈ X; equivalently, |x| = ℓX.


If |x| = 0, set k = 0, satisfying 1 = X ⊆ Y^(k·⋈) = 1 and k ≤ n = 0 as required. Otherwise, since the size of partial strings in a program can never decrease under the k-iterated program composition operator ⋈ when 0 < k, it suffices to consider the case x ⊑ y^(k·⋈) for some shortest partial string y in Y. Since the event set of y^(k·⋈) is the Cartesian product of Ey and [k], it follows that |x| = k · |y|. Since |x| ≤ ℓX and ℓY ≤ |y|, we get k ≤ ⌊ℓX/ℓY⌋. By definition n = ⌊ℓX/ℓY⌋, proving x ∈ ⋃_{n≥k≥0} Y^(k·⋈). □

Equivalently, if there exists a partial string x in X such that x ∉ Y^(k·⋈) for all non-negative integers k between zero and ⌊ℓX/ℓY⌋, then X ⊈ Y^⋈. Since we are interested in decision procedures for program refinement checking, we need to show that the converse of Proposition 4 also holds. Towards this end, we prove the following left (−)^⋈ elimination rule:

Proposition 5. For every program X and Y in P, X^⋈ ⊆ Y^⋈ exactly if X ⊆ Y^⋈.

Proof. Assume X^⋈ ⊆ Y^⋈. By Proposition 2, X ⊆ X^⋈. By transitivity of ⊆ in P, X ⊆ Y^⋈. Conversely, assume X ⊆ Y^⋈. Let i, j ∈ N0. By induction on i, X^(i·⋈) ⋈ X^(j·⋈) = X^((i+j)·⋈). Thus, by Proposition 2 and distributivity of ⋈ over least upper bounds in P, X^⋈ ⋈ X^⋈ = X^⋈, i.e. (−)^⋈ is idempotent. This, in turn, implies that (−)^⋈ is a closure operator. Therefore, by monotonicity, X^⋈ ⊆ (Y^⋈)^⋈ = Y^⋈, proving that X^⋈ ⊆ Y^⋈ is equivalent to X ⊆ Y^⋈. □

Theorem 2 (Elementary Least Fixed Point Reduction). For all elementary programs X and Y in P⋆, if the ⋈-identity program 1 is not in Y, then X^⋈ ⊆ Y^⋈ is equivalent to X ⊆ ⋃_{n≥k≥0} Y^(k·⋈) where n = ⌊ℓX/ℓY⌋ such that ℓX ≜ max {|x| | x ∈ X} and ℓY ≜ min {|y| | y ∈ Y} is the size of the largest and smallest partial strings in X and Y, respectively.

Proof. By Proposition 5, it remains to show that X ⊆ Y^⋈ is equivalent to X ⊆ ⋃_{n≥k≥0} Y^(k·⋈) where n = ⌊ℓX/ℓY⌋. The forward and backward implications follow from Propositions 4 and 2, respectively. □

From Theorem 2 it follows immediately that X^⋈ ⊆ Y^⋈ is decidable for all elementary programs X and Y in P⋆, because there exists an algorithm that could iteratively make O(|X| × |Y|^n) calls to another decision procedure to check whether x ⊑ y for all x ∈ X and y ∈ Y^(k·⋈) where n ≥ k ≥ 0. However, by Proposition 1, each iteration in such an algorithm would have to solve an NP-complete subproblem. But this high complexity is expected since the PLC problem is Πᵖ₂-complete [12].
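The shape of the decision procedure implied by Theorem 2 can be outlined as below. This is only a sketch under our own encoding: `refines` (a ⊑-check, cf. Proposition 1) and `compose_n` (an enumeration of representatives of Y^(k·⋈)) are hypothetical helpers supplied by the caller, and partial strings are anything whose len(·) gives their size:

```python
def lfp_refinement_bound(sizes_X, sizes_Y):
    """The bound n = floor(lX / lY) from Theorem 2, where lX is the size of
    the largest partial string in X and lY the size of the smallest in Y
    (1 is not in Y, so lY >= 1 and the fraction is well-defined)."""
    return max(sizes_X) // min(sizes_Y)

def check_lfp_refinement(X, Y, refines, compose_n):
    """Per Theorem 2, X^bowtie is contained in Y^bowtie iff every x in X
    refines some partial string of a k-iterated composition of Y, for
    some k with 0 <= k <= n.  `refines` and `compose_n` are assumed
    helpers, not defined in the paper."""
    n = lfp_refinement_bound([len(x) for x in X], [len(y) for y in Y])
    return all(any(refines(x, z)
                   for k in range(n + 1)
                   for z in compose_n(Y, k))
               for x in X)
```

With |X| candidate strings and up to |Y|^n composites, this loop performs the O(|X| × |Y|^n) refinement calls mentioned above, each of which is itself an NP-complete check by Proposition 1.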

Corollary 1. For all elementary programs X and Y in P⋆, if |x| = |y| for all x ∈ X and y ∈ Y, then X^⋈ ⊆ Y^⋈ is equivalent to X ⊆ Y.

We next move on to enriching our model of computation to accommodate a certain kind of relaxed sequential consistency.


4 Relaxed Sequential Consistency

For efficiency reasons, all modern computer architectures implement some form of weak memory model rather than sequential consistency [23]. A defining characteristic of weak memory architectures is that they violate interleaving semantics unless specific instructions are used to restore sequential consistency. This section fixes a particular interpretation of weak memory and studies the mathematical properties of the resulting partial order semantics. For this, we separate memory accesses into synchronizing and non-synchronizing ones, akin to [24]. A synchronized store is called a release, whereas a synchronized load is called an acquire. The intuition behind release/acquire is that prior writes made to other memory locations by the thread executing the release become visible in the thread that performs the corresponding acquire. Crucially, the particular form of release/acquire semantics that we formalize here is shown to be equivalent to the conjunction of three weak memory axioms (Theorem 3), namely 'write coherence', 'from-read' and 'global read-from' [17]. Subsequently, we look at one important ramification of this equivalence on bounded model checking (BMC) techniques for finding concurrency-related bugs (Theorem 4).

We start by defining the alphabet that we use for identifying events that denote synchronizing and non-synchronizing memory accesses.

Definition 10 (Memory access alphabet). Define 〈LOAD〉 � {none, acquire},〈STORE〉 � {none, release} and 〈BIT〉 � {0, 1}. Let 〈ADDRESS〉 and 〈REG〉 bedisjoint sets of memory locations and registers, respectively. Let load tag ∈ 〈LOAD〉and store tag ∈ 〈STORE〉. Define the set of load and store labels, respectively:

Γload, load tag � {load tag} × 〈REG〉 × 〈ADDRESS〉Γstore, store tag � {store tag} × 〈ADDRESS〉 × 〈BIT〉

Let Γ � Γload,none ∪ Γload,acquire ∪ Γstore,none ∪ Γstore,release be the memory ac-cess alphabet. Given r ∈ 〈REG〉, a ∈ 〈ADDRESS〉 and b ∈ 〈BIT〉, we write‘r := [a]load tag’ for the label 〈load tag, r, a〉 in Γload, load tag; similarly, ‘[a]store tag := b’is shorthand for the label 〈store tag, a, b〉 in Γstore, store tag.

Let x be a partial string and e be an event in Ex. Then e is called a load or store if its label, αx(e), is in Γload,load_tag or Γstore,store_tag, respectively. A load or store event e is a non-synchronizing memory access if αx(e) ∈ Γnone ≜ Γload,none ∪ Γstore,none; otherwise, it is a synchronizing memory access. Let a ∈ 〈ADDRESS〉 be a memory location. An acquire on a is an event e such that αx(e) = 'r := [a]acquire' for some r ∈ 〈REG〉. Similarly, a release on a is an event e labelled by '[a]release := b' for some b ∈ 〈BIT〉. An event is simply called a release (respectively, an acquire) if it is a release (acquire) on some memory location.
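For concreteness, the alphabet of Definition 10 can be transcribed into executable form. The following Python sketch (all type and function names are ours, not the paper's) represents labels as tuples and classifies the accesses of the two threads of Fig. 3:

```python
from typing import NamedTuple, Union

# Hypothetical transcription of Definition 10 (names are ours).
class Load(NamedTuple):      # label 'r := [a]load_tag'
    tag: str                 # element of <LOAD> = {"none", "acquire"}
    reg: str                 # element of <REG>
    addr: str                # element of <ADDRESS>

class Store(NamedTuple):     # label '[a]store_tag := b'
    tag: str                 # element of <STORE> = {"none", "release"}
    addr: str                # element of <ADDRESS>
    bit: int                 # element of <BIT> = {0, 1}

Label = Union[Load, Store]

def is_synchronizing(label: Label) -> bool:
    # An access is synchronizing iff its tag is not 'none'.
    return label.tag != "none"

def is_acquire_on(label: Label, a: str) -> bool:
    return isinstance(label, Load) and label.tag == "acquire" and label.addr == a

def is_release_on(label: Label, a: str) -> bool:
    return isinstance(label, Store) and label.tag == "release" and label.addr == a

# The two threads of Fig. 3 written with these labels:
T1 = [Load("acquire", "r0", "b"), Load("none", "r1", "a")]
T2 = [Store("none", "a", 1), Store("release", "b", 1)]
```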

Example 5. Fig. 3 shows the syntax of a program that consists of two threads T1 and T2. This concurrent system can be directly modelled by the partial string


On Partial Order Semantics for SAT/SMT-Based Symbolic Encodings 29

Thread T1:          Thread T2:
r0 := [b]acquire    [a]none := 1
r1 := [a]none       [b]release := 1

Fig. 3. A concurrent system T1 ‖ T2 consisting of two threads. The memory accesses on memory location b are synchronized, whereas those on a are not.

shown in Fig. 1 where memory location b is accessed through acquire and release, whereas memory location a is accessed through non-synchronizing loads and stores (shortly, we shall see that this leads to a data race).

Given Definition 10, we are now ready to refine our earlier conservative over-approximation of the happens-before relations (Definition 4) to obtain a particular form of release/acquire semantics. For this, we restrict the downward closure of programs X in P, in the sense of Definition 4, by requiring all partial strings in X to satisfy the following partial ordering constraints:

Definition 11 (SC-relaxed program). A program X is called SC-relaxed if, for all a ∈ 〈ADDRESS〉 and partial strings x in X, the set of release events on a is totally ordered by ⪯x and, for every acquire l ∈ Ex and release s ∈ Ex on a, l ⪯x s or s ⪯x l.

Henceforth, we denote loads and stores by l, l′ and s, s′, respectively. If s and s′ are release events that modify the same memory location, either s happens-before s′, or vice versa. If l is an acquire and s is a release on the same memory location, either l happens-before s or s happens-before l. Importantly, however, two acquire events l and l′ on the same memory location may still happen concurrently in the sense that neither l happens-before l′ nor l′ happens-before l, in the same way that non-synchronizing memory accesses are generally unordered.
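These ordering constraints are directly checkable on a finite partial string. The sketch below uses our own representation (events as a dictionary from event ids to kind/address pairs, happens-before as a transitively closed set of pairs) to test the two requirements of Definition 11:

```python
from itertools import combinations

def leq(hb, e, f):
    # Reflexive happens-before: e precedes-or-equals f, 'hb' being the strict relation.
    return e == f or (e, f) in hb

def is_sc_relaxed(events, hb):
    """Check Definition 11 on one finite partial string.

    'events' maps an event id to a (kind, address) pair with kind in
    {'acquire', 'release', 'load', 'store'}; 'hb' is the transitively
    closed strict happens-before relation, given as a set of pairs.
    """
    for a in {addr for (_, addr) in events.values()}:
        rels = [e for e, (k, x) in events.items() if k == "release" and x == a]
        acqs = [e for e, (k, x) in events.items() if k == "acquire" and x == a]
        # the release events on a must be totally ordered
        if any(not (leq(hb, e, f) or leq(hb, f, e))
               for e, f in combinations(rels, 2)):
            return False
        # every acquire on a must be ordered against every release on a
        if any(not (leq(hb, l, s) or leq(hb, s, l))
               for l in acqs for s in rels):
            return False
    return True
```

Note that two acquires on the same location left unordered are still accepted, in line with the discussion above about concurrent acquire events.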

Example 6. Examples 4 and 5 illustrate the SC-relaxed semantics of the concurrent system in Fig. 3. In particular, the N-shaped partial string in Fig. 2 corresponds to a data race in T1 ‖ T2 because the non-synchronizing memory accesses on memory location a happen concurrently. To see this, it may help to consider the interleaving r0 := [b]acquire; [a]none := 1; r1 := [a]none; [b]release := 1, where both memory accesses on location a are unordered through the happens-before relation because there is no release instruction separating [a]none := 1 from r1 := [a]none. One way of fixing this data race is by changing thread T1 to if [b]acquire = 1 then r1 := [a]none. Since CKA supports non-deterministic choice with the ∪ binary operator (recall Theorem 1), it would not be difficult to give semantics to such conditional checks, particularly if we introduce 'assume' labels into the alphabet in Definition 10.

We ultimately want to show that the conjunction of three existing weak memory axioms as studied in [17] fully characterizes our particular interpretation of relaxed sequential consistency, thereby paving the way for Theorem 4. For this, we recall the following memory axioms, which can be thought of as relations on loads and stores on the same memory location:

30 A. Horn and D. Kroening

Definition 12 (Memory axioms). Let x be a partial string in Pf. The read-from function, denoted by rf : Ex → Ex, is defined to map every load to a store on the same memory location. A load l synchronizes-with a store s if rf(l) = s implies s ⪯x l. Write-coherence means that all stores s, s′ on the same memory location are totally ordered by ⪯x. The from-read axiom holds whenever, for all loads l and stores s, s′ on the same memory location, if rf(l) = s and s ≺x s′, then l ⪯x s′.

By definition, the read-from function is total on all loads. The synchronizes-with axiom says that if a load reads-from a store (necessarily on the same memory location), then the store happens-before the load. This is also known as the global read-from axiom [17]. Write-coherence, in turn, ensures that all stores on the same memory location are totally ordered. This corresponds to the fact that "all writes to the same location are serialized in some order and are performed in that order with respect to any processor" [24]. Note that this is different from the modification order ('mo') on atomics in C++14 [25] because 'mo' is generally not a subset of the happens-before relation. The from-read axiom [17] requires that, for all loads l and two different stores s, s′ on the same location, if l reads-from s and s happens-before s′, then l happens-before s′. We start by deriving from these three memory axioms the notion of SC-relaxed programs.
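The three axioms can likewise be phrased as executable predicates over an explicit happens-before relation. The encoding below is a sketch under our own representation (rf as a dictionary from loads to stores, hb as a set of pairs); none of the names come from the paper:

```python
def synchronizes_with(rf, hb):
    # Global read-from: rf(l) = s implies s happens-before l.
    return all((s, l) in hb for l, s in rf.items())

def write_coherent(stores_by_addr, hb):
    # All stores on one location are totally ordered by happens-before.
    return all((s, t) in hb or (t, s) in hb
               for stores in stores_by_addr.values()
               for i, s in enumerate(stores)
               for t in stores[i + 1:])

def from_read(rf, stores_by_addr, hb, addr_of):
    # If rf(l) = s and s strictly precedes s', then l precedes-or-equals s'.
    return all(s2 == s or (s, s2) not in hb or (l, s2) in hb or l == s2
               for l, s in rf.items()
               for s2 in stores_by_addr[addr_of[s]])

# A tiny instance: stores s1, s2 and load l, all on location "a".
hb = {("s1", "s2"), ("s1", "l"), ("l", "s2")}
rf = {"l": "s1"}
stores_by_addr = {"a": ["s1", "s2"]}
addr_of = {"s1": "a", "s2": "a"}
```

On this instance all three predicates hold; dropping the edge from l to s2 violates from-read, since l would then read from an overwritten store without preceding the overwrite.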

Proposition 6 (SC-relaxed consistency). For all X in P, if, for each partial string x in X, the synchronizes-with, write-coherence and from-read axioms hold on all release and acquire events in Ex on the same memory location, then X is an SC-relaxed program.

Proof. Let a ∈ 〈ADDRESS〉 be a memory location, l be an acquire on a and s′ be a release on a. By write-coherence on release/acquire events, it remains to show l ⪯x s′ or s′ ⪯x l. Since the read-from function is total, rf(l) = s for some release s on a. By the synchronizes-with axiom, s ⪯x l. We therefore assume s ≠ s′. By write-coherence, s ≺x s′ or s′ ≺x s. The former implies l ⪯x s′ by the from-read axiom, whereas the latter implies s′ ⪯x l by transitivity. This proves, by case analysis, that X is an SC-relaxed program. □

We need to prove some form of converse of the previous implication in order to characterize SC-relaxed semantics in terms of the three aforementioned weak memory axioms. For this purpose, we define the following:

Definition 13 (Read consistency). Let a ∈ 〈ADDRESS〉 be a memory location and x be a finite partial string in Pf. For all loads l ∈ Ex on a, define the following set of store events: Hx(l) ≜ {s ∈ Ex | s ⪯x l and s is a store on a}. The read-from function rf is said to satisfy weak read consistency whenever, for all loads l ∈ Ex and stores s ∈ Ex on memory location a, the least upper bound ⋁Hx(l) exists, and rf(l) = s implies ⋁Hx(l) ⪯x s; strong read consistency implies rf(l) = s = ⋁Hx(l).

By the next proposition, a natural sufficient condition for the existence of the least upper bound ⋁Hx(l) is the finiteness of the partial strings in Pf and the total ordering of all stores on the same memory location from which the load l reads, i.e. write coherence. This could be generalized to well-ordered sets.

Proposition 7 (Weak read consistency existence). For all partial strings x in Pf, write coherence on memory location a implies that ⋁Hx(l) exists for all loads l on a.
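Under write coherence, Hx(l) is totally ordered by happens-before, so its least upper bound is simply its maximum. A small sketch (representation ours: stores as ids, happens-before as a strict set of pairs) makes this concrete:

```python
def H(l, stores, hb):
    # Hx(l): the stores (on l's location) that happen-before l (Definition 13).
    return [s for s in stores if (s, l) in hb]

def lub(hs, hb):
    # Least upper bound of a totally ordered set of stores: its maximum.
    # Returns None for the empty set, standing in for the bottom element ⊥.
    top = None
    for s in hs:
        if top is None or (top, s) in hb:
            top = s
    return top

# Two coherent stores s1 ≺ s2, both before the load l:
hb = {("s1", "s2"), ("s1", "l"), ("s2", "l")}
```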


We remark that ⋁Hx(l) = ⊥ if Hx(l) = ∅; alternatively, to avoid that Hx(l) is empty, we could require that programs are always constructed such that their partial strings have minimal store events that initialize all memory locations.

Proposition 8 (Weak read consistency equivalence). Write coherence implies that weak read consistency is equivalent to the following: for all loads l and stores s, s′ on memory location a ∈ 〈ADDRESS〉, if rf(l) = s and s′ ⪯x l, then s′ ⪯x s.

Proof. By write coherence, ⋁Hx(l) exists, and s′ ⪯x ⋁Hx(l) because s′ ∈ Hx(l) by the assumption s′ ⪯x l and Definition 13. By the assumption of weak read consistency, ⋁Hx(l) ⪯x s. From transitivity follows s′ ⪯x s.

Conversely, assume rf(l) = s. Let s′ be a store on a such that s′ ∈ Hx(l). Thus, by hypothesis, s′ ⪯x s. Since s′ is arbitrary, s is an upper bound. Since the least upper bound is well-defined by write coherence, ⋁Hx(l) ⪯x s. □

Weak read consistency therefore says that if a load l reads from a store s and another store s′ on the same memory location happens-before l, then s′ happens-before s. This implies the next proposition.

Proposition 9 (From-read equivalence). For all SC-relaxed programs in P, weak read consistency with respect to release/acquire events is equivalent to the from-read axiom with respect to release/acquire events.

We can characterize strong read consistency as follows:

Proposition 10 (Strong read consistency equivalence). Strong read consistency is equivalent to the conjunction of weak read consistency and the synchronizes-with axiom.

Proof. Let x be a partial string in Pf. Let l be a load and s be a store on the same memory location. The forward implication is immediate from ⋁Hx(l) ⪯x l.

Conversely, assume rf(l) = s. By synchronizes-with, s ⪯x l, whence s ∈ Hx(l). By definition of least upper bound, s ⪯x ⋁Hx(l). Since ⋁Hx(l) ⪯x s by hypothesis, and ⪯x is antisymmetric, we conclude s = ⋁Hx(l). □

Theorem 3 (SC-relaxed equivalence). For every program X in P, X is SC-relaxed where, for all partial strings x in X and acquire events l in Ex, rf(l) = ⋁Hx(l), if and only if the synchronizes-with, write-coherence and from-read axioms hold for all x in X with respect to all release/acquire events in Ex on the same memory location.

Proof. Assume X is an SC-relaxed program according to Definition 11. Let x be a partial string in X and l be an acquire in the set of events Ex. By Proposition 7, ⋁Hx(l) exists. Assume rf(l) = ⋁Hx(l). Since l is arbitrary, this is equivalent to assuming strong read consistency. Since release events are totally ordered by ⪯x, by assumption, it remains to show that the synchronizes-with and from-read axioms hold. This follows from Propositions 10 and 9, respectively.

Conversely, assume the three weak memory axioms hold on x with respect to all release/acquire events in Ex on the same memory location. By Proposition 6, X is an SC-relaxed program. Therefore, by Propositions 9 and 10, rf(l) = ⋁Hx(l), proving the equivalence. □


While the state-of-the-art weak memory encoding is cubic in size [18], the previous theorem has as an immediate consequence that there exists an asymptotically smaller weak memory encoding with only a quadratic number of partial order constraints.

Theorem 4 (Quadratic-size weak memory encoding). There exists a quantifier-free first-order logic formula that has a quadratic number of partial order constraints and is equisatisfiable to the cubic-size encoding given in [18].

Proof. Instead of instantiating the three universally quantified events in the from-read axiom, symbolically encode the least upper bound of weak read consistency. This can be accomplished with a new symbolic variable for every acquire event. It is easy to see that this reduces the cubic number of partial order constraints to a quadratic number. □

In short, the asymptotic reduction in the number of partial order constraints is due to a new symbolic encoding for how values are being overwritten in memory: the current cubic-size formula [18] encodes the from-read axiom (Definition 12), whereas the proposed quadratic-size formula encodes a certain least upper bound (Definition 13). We reemphasize that this formulation is in terms of release/acquire events rather than machine-specific accesses as in [18]. The construction of the quadratic-size encoding, therefore, is generally only applicable if we can translate the machine-specific reads and writes in a shared memory program to acquire and release events, respectively. This may require the program to be data race free, as illustrated in Example 6.
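To see the asymptotics concretely, one can tally instantiated constraints schematically. The sketch below is not the actual encodings of [18]; it merely counts one constraint per from-read triple (load, store, distinct store) versus one ordering constraint per (load, store) pair plus one lub-defining constraint per load:

```python
def cubic_from_read_constraints(n_loads, n_stores):
    # One from-read instance per (load, store, distinct store') triple.
    return n_loads * n_stores * (n_stores - 1)

def quadratic_lub_constraints(n_loads, n_stores):
    # With one symbolic lub variable per load, only (load, store) pairs need
    # ordering constraints, plus one defining constraint per load.
    return n_loads * n_stores + n_loads
```

With 100 loads and 100 stores on a location, the schematic counts are 990,000 versus 10,100, which is the cubic-versus-quadratic gap of Theorem 4.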

Furthermore, as mentioned in the introduction of this section, the primary application of Theorem 4 is in the context of BMC. Recall that BMC assumes that all loops in the shared memory program under scrutiny have been unrolled (the same restriction as in [18]). This makes it possible to symbolically encode branch conditions, thereby alleviating the need to explicitly enumerate each finite partial string in an elementary program.

5 Concluding Remarks

This paper has studied a partial order model of computation that satisfies the axioms of a unifying algebraic concurrency semantics by Hoare et al. By further restricting the partial string semantics, we obtained a relaxed sequential consistency semantics which was shown to be equivalent to the conjunction of three weak memory axioms by Alglave et al. This allowed us to prove the existence of an equisatisfiable but asymptotically smaller weak memory encoding that has only a quadratic number of partial order constraints, compared to the state-of-the-art cubic-size encoding. In upcoming work, we will experimentally compare both encodings in the context of bounded model checking using SMT solvers. As future theoretical work, it would be interesting to study the relationship between categorical models of partial string theory and event structures.


Acknowledgements. We would like to thank Tony Hoare and Stephan van Staden for their valuable comments on an early draft of this paper, and we thank Jade Alglave, César Rodríguez, Michael Tautschnig, Peter Schrammel, Marcelo Sousa, Björn Wachter and John Wickerson for invaluable discussions.

References

1. Hoare, C.A., Moller, B., Struth, G., Wehrman, I.: Concurrent Kleene algebra and itsfoundations. J. Log. Algebr. Program. 80(6), 266–296 (2011)

2. Hoare, T., van Staden, S.: The laws of programming unify process calculi. Sci. Com-put. Program. 85, 102–114 (2014)

3. Hoare, T., van Staden, S.: In praise of algebra. Formal Aspects of Computing 24(4-6),423–431 (2012)

4. Hoare, T., van Staden, S., Moller, B., Struth, G., Villard, J., Zhu, H., O’Hearn, P.: De-velopments in Concurrent Kleene Algebra. In: Hofner, P., Jipsen, P., Kahl, W., Muller,M.E. (eds.) RAMiCS 2014. LNCS, vol. 8428, pp. 1–18. Springer, Heidelberg (2014)

5. Pratt, V.: Modeling concurrency with partial orders. Int. J. Parallel Program. 15(1),33–71 (1986)

6. Gischer, J.L.: The equational theory of pomsets. Theor. Comput. Sci. 61(2-3), 199–224(1988)

7. Petri, C.A.: Communication with automata. PhD thesis, Universitat Hamburg (1966)8. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Com-

mun. ACM 21(7), 558–565 (1978)9. Grabowski, J.: On partial languages. Fundam. Inform. 4(2), 427–498 (1981)

10. Nielsen, M., Plotkin, G.D., Winskel, G.: Petri nets, event structures and domains,part I. Theor. Comput. Sci. 13(1), 85–108 (1981)

11. Bloom, B., Kwiatkowska, M.: Trade-offs in true concurrency: Pomsets andMazurkiewicz traces. In: Brookes, S., Main, M., Melton, A., Mislove, M., Schmidt,D. (eds.) MFPS 1991. LNCS, vol. 598, pp. 350–375. Springer, Heidelberg (1992)

12. Feigenbaum, J., Kahn, J., Lund, C.: Complexity results for POMSET languages. SIAMJ. Discret. Math. 6(3), 432–442 (1993)

13. Laurence, M.R., Struth, G.: Completeness theorems for Bi-Kleene algebras andseries-parallel rational pomset languages. In: Hofner, P., Jipsen, P., Kahl, W., Muller,M.E. (eds.) RAMiCS 2014. LNCS, vol. 8428, pp. 65–82. Springer, Heidelberg (2014)

14. Sewell, P., Sarkar, S., Owens, S., Nardelli, F.Z., Myreen, M.O.: x86-TSO: A rigorousand usable programmer’s model for x86 multiprocessors. Commun. ACM 53(7), 89–97 (2010)

15. Sevcık, J., Vafeiadis, V., Zappa Nardelli, F., Jagannathan, S., Sewell, P.: Relaxed-memory concurrency and verified compilation. SIGPLAN Not. 46(1), 43–54 (2011)

16. Batty, M., Owens, S., Sarkar, S., Sewell, P., Weber, T.: Mathematizing C++ concur-rency. SIGPLAN Not. 46(1), 55–66 (2011)

17. Alglave, J., Maranget, L., Sarkar, S., Sewell, P.: Fences in weak memory models (ex-tended version). FMSD 40(2), 170–205 (2012)

18. Alglave, J., Kroening, D., Tautschnig, M.: Partial orders for efficient bounded modelchecking of concurrent software. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS,vol. 8044, pp. 141–157. Springer, Heidelberg (2013)

19. Esik, Z.: Axiomatizing the subsumption and subword preorders on finite and infi-nite partial words. Theor. Comput. Sci. 273(1-2), 225–248 (2002)

Page 45: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

34 A. Horn and D. Kroening

20. Alglave, J., Maranget, L., Sarkar, S., Sewell, P.: Litmus: Running tests against hard-ware. In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS 2011. LNCS, vol. 6605, pp. 41–44. Springer, Heidelberg (2011)

21. Horn, A., Alglave, J.: Concurrent Kleene algebra of partial strings. ArXiv e-printsabs/1407.0385 (July 2014)

22. de Bakker, J.W., Warmerdam, J.H.A.: Metric pomset semantics for a concurrent lan-guage with recursion. In: Guessarian, I. (ed.) LITP 1990. LNCS, vol. 469, pp. 21–49.Springer, Heidelberg (1990)

23. Lamport, L.: How to make a multiprocessor computer that correctly executes multi-process programs. IEEE Trans. Comput. 28(9), 690–691 (1979)

24. Gharachorloo, K., Lenoski, D., Laudon, J., Gibbons, P., Gupta, A., Hennessy, J.: Mem-ory consistency and event ordering in scalable shared-memory multiprocessors.SIGARCH Comput. Archit. News 18(2SI), 15–26 (1990)

25. ISO: International Standard ISO/IEC 14882:2014(E) Programming Language C++.International Organization for Standardization (2014) (Ratified, to appear soon)


A Strategy for Automatic Verification of Stabilization of Distributed Algorithms

Ritwika Ghosh and Sayan Mitra

University of Illinois, Urbana Champaign, USA
{rghosh9,mitras}@illinois.edu

Abstract. Automatic verification of convergence and stabilization properties of distributed algorithms has received less attention than verification of invariance properties. We present a semi-automatic strategy for verification of stabilization properties of arbitrarily large networks under structural and fairness constraints. We introduce a sufficient condition that guarantees that every fair execution of any (arbitrarily large) instance of the system stabilizes to the target set of states. In addition to specifying the protocol executed by each agent in the network and the stabilizing set, the user also has to provide a measure function or a ranking function. With this, we show that for a restricted but useful class of distributed algorithms, the sufficient condition can be automatically checked for arbitrarily large networks, by exploiting the small model properties of these conditions. We illustrate the method by automatically verifying several well-known distributed algorithms including link-reversal, shortest path computation, distributed coloring, leader election and spanning-tree construction.

1 Introduction

A system is said to stabilize to a set of states X∗ if all its executions reach some state in X∗ [1]. This property can capture common progress requirements like absence of deadlocks and live-locks, counting to infinity, and achievement of self-stabilization in distributed systems. Stabilization is a liveness property, and like other liveness properties, it is generally impossible to verify automatically. In this paper, we present sufficient conditions which can be used to automatically prove stabilization of distributed systems with arbitrarily many participating processes.

The sufficient condition we propose is similar in spirit to Tsitsiklis' conditions given in [2] for convergence of iterative asynchronous processes. We require the user to provide a measure function, parameterized by the number of processes, such that its sub-level sets are invariant with respect to the transitions and there is a progress-making action for each state.1 Our point of departure is a

This work is supported in part by research grants NSF CAREER 1054247 and AFOSR YIP FA9550-12-1-0336.

1 A sub-level set of a function comprises all points in the domain that map to the same value or less.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 35–49, 2015.
DOI: 10.1007/978-3-319-19195-9_3


36 R. Ghosh and S. Mitra

non-interference condition that turned out to be essential for handling models of distributed systems. Furthermore, in order to handle non-deterministic communication patterns, our condition allows us to encode fairness conditions and different underlying communication graphs.

Next, we show that these conditions can be transformed to a forall-exists form with a small model property. That is, there exists a cut-off number N0 such that if the conditions are valid in all models of sizes up to N0, then they are valid for all models. We use the small model results from [3] to determine the cut-off parameter and apply this approach to verify several well-known distributed algorithms.

We have a Python implementation based on the sufficient conditions for stabilization we develop in Section 3. We present precondition-effect style transition systems of algorithms in Section 4, and they serve as pseudo-code for our implementation. The SMT solver is provided with the conditions for invariance, progress and non-interference as assertions. We encode the distributed system models in Python and use the Z3 theorem prover's Python module [4] to check the conditions for stabilization for different model sizes.

We have used this method to analyze a number of well-known distributed algorithms, including a simple distributed coloring protocol, a self-stabilizing algorithm for constructing a spanning tree of the underlying network graph, a link-reversal routing algorithm, and a binary gossip protocol. Our experiments suggest that this method is effective for constructing a formal proof of stabilization of a variety of algorithms, provided the measure function is chosen carefully. Among other things, the measure function should be locally computable: changes from the measure of the previous state to that of the current state only depend on the vertices involved in the transition. It is difficult to determine whether such a measure function exists for a given problem. For instance, consider Dijkstra's self-stabilizing token ring protocol [5]. The proof of correctness relies on the fact that the leading node cannot push for a value greater than its previous unique state until every other node has the same value. We were unable to capture this in a locally computable measure function because, if translated directly, it involves looking at every other node in the system.

1.1 Related Work

The motivation for our approach is from the paper by John Tsitsiklis on convergence of asynchronous iterative processes [2], which contains conditions for convergence similar to the sufficient conditions we state for stabilization. Our use of the measure function to capture stabilization is similar to the use of Lyapunov functions to prove stability as explored in [6], [7] and [8]. In [9], Dhama and Theel present a progress-monitor-based method of designing self-stabilizing algorithms with a weakly fair scheduler, given a self-stabilizing algorithm with an arbitrary, possibly very restrictive scheduler. They also use the existence of a ranking function to prove convergence under the original scheduler. Several authors [10] employ functions to prove termination of distributed algorithms, but while they may provide an idea of what the measure function can be, in general they do not translate exactly to the measure functions that our verification


A Strategy for Automatic Verification of Stabilization 37

strategy can employ. The notion of fairness we have is also essential in dictating what the measure function should be, while not prohibiting too many behaviors. In [7], the assumption of serial execution semantics is compatible with our notion of fair executions.

The idea central to our proof method is the small model property of the sufficient conditions for stabilization. The small model nature of certain invariance properties of distributed algorithms (e.g. distributed landing protocols for small aircraft as in [11]) has been used to verify them in [12]. In [13], Emerson and Kahlon utilize a small model argument to perform parameterized model checking of ring-based message-passing systems.

2 Preliminaries

We will represent distributed algorithms as transition systems. Stabilization is a liveness property and is closely related to convergence as defined in the works of Tsitsiklis [2]; it is identical to the concept of region stability as presented in [14]. We will use measure functions in our definition of stabilization. A measure function on a domain provides a mapping from that domain to a well-ordered set. A well-ordered set W is one on which there is a total ordering <, such that there is a minimum element with respect to < in every non-empty subset of W. Given a measure function C : A → B, there is a partition of A into sub-level sets. All elements of A which map to the same element b ∈ B under C are in the same sub-level set Lb.
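In the sense of footnote 1, a sub-level set collects the points whose measure is at most a given value; a minimal sketch (names ours):

```python
def sublevel_set(domain, C, b):
    # L_b in the sense of footnote 1: all points with measure b or less.
    return {x for x in domain if C(x) <= b}
```

For example, over the four states of a two-node {0, 1} system with the measure C = sum, the sub-level set at 1 contains every state except the all-ones state.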

We are interested in verifying stabilization of distributed algorithms independent of the number of participating processes or nodes. Hence, the transition systems are parameterized by N, the number of nodes. Given a non-negative integer N, we use [N] to denote the set of indices {1, 2, . . . , N}.

Definition 1. For a natural number N and a set Q, a transition system A(N) with N nodes is defined as a tuple (X, A, D) where

a) X is the state space of the system. If the state space of each node is Q, then X = Q^N.
b) A is a set of actions.
c) D : X × A → X is a transition function that maps a system-state–action pair to a system-state.

For any x ∈ X, the ith component of x is the state of the ith node and we refer to it as x[i]. Given a transition system A(N) = (X, A, D), we refer to the state obtained by the application of the action a on a state x ∈ X, i.e. D(x, a), by a(x).

An execution of A(N) records a particular run of the distributed system with N nodes. Formally, an execution α of A(N) is a (possibly infinite) alternating sequence of states and actions x0, a1, x1, . . ., where each xi ∈ X and each ai ∈ A such that D(xi, ai+1) = xi+1. Given that the choice of actions is non-deterministic in the execution, it is reasonable to expect that not all executions may stabilize. For instance, an execution in which not all nodes participate may not stabilize.


Definition 2. A fairness condition F for A(N) is a finite collection of subsets of actions {Ai}i∈I, where I is a finite index set. An action sequence σ = a1, a2, . . . is F-fair if every Ai in F is represented in σ infinitely often, that is,

∀A′ ∈ F, ∀i ∈ ℕ, ∃k > i : ak ∈ A′.

For instance, if the fairness condition is the collection of all singleton subsets of A, then each action occurs infinitely often in an execution. This notion of fairness is similar to action-based fairness constraints in temporal logic model checking [15]. The network graph itself enforces whether an action is enabled: every pair of adjacent nodes determines a continuously enabled action. An execution is strongly fair if, given a set of actions A such that all actions in A are infinitely often enabled, some action in A occurs infinitely often in it. An F-fair execution is an infinite execution such that the corresponding sequence of actions is F-fair.
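For an eventually periodic schedule σ = prefix · cycle^ω, exactly the actions occurring in the cycle appear infinitely often, so F-fairness reduces to a finite check; a sketch under our own representation (action sets as Python sets, the cycle as a list):

```python
def is_fair(fairness, cycle):
    # σ = prefix · cycle^ω is F-fair iff every action set in F meets the cycle:
    # exactly the actions inside the cycle occur infinitely often in σ.
    cyc = set(cycle)
    return all(any(a in cyc for a in Ai) for Ai in fairness)
```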

Definition 3. Given a system A(N), a fairness condition F, and a set of states X∗ ⊆ X, A(N) is said to F-stabilize to X∗ iff for any F-fair execution α = x0, a1, x1, a2, . . ., there exists k ∈ ℕ such that xk ∈ X∗. X∗ is called a stabilizing set for A and F.

This is different from the definition of self-stabilization found in the literature [1], in that the stabilizing set X∗ is not required to be an invariant of A(N). We view proving the invariance of X∗ as a separate problem that can be approached using one of the available techniques for proving invariance of parametrized systems in [3], [12].

Example 1. (Binary gossip) We look at binary gossip in a ring network composed of N nodes. The nodes are numbered clockwise from 1, and nodes 1 and N are also neighbors. Each node has one of two states: {0, 1}. A pair of neighboring nodes communicates to exchange their values, and the new state is set to the binary or (∨) of the original values. Clearly, if all the interactions happen infinitely often, and the initial state has at least one node with state 1, this transition system stabilizes to the state x = 1^N. The set of actions is specified by the set of edges of the ring. We first represent this protocol and its transitions using a standard precondition-effect style notation similar to the one used in [16].

Automaton Gossip[N : ℕ]
  type indices : [N]
  type values : {0, 1}
  variables
    x : [indices → values]
  transitions
    step(i : indices, j : indices)
      pre True
      eff x[i] = x[j] = x[i] ∨ x[j]
  measurefunc C : x ↦ Sum(x)


The above representation translates to the transition system A(N) = (X, A, D) where

1. The state space of each node is Q = {0, 1}, i.e. X = {0, 1}^N.
2. The set of actions is A = {step(i, i+1) | 1 ≤ i < N} ∪ {step(N, 1)}.
3. The transition function is D(x, step(i, j)) = x′ where x′[i] = x′[j] = x[i] ∨ x[j].

We define the stabilizing set to be X∗ = {1^N}, and the fairness condition is F = {{step(i, i+1)} | 1 ≤ i < N} ∪ {{step(N, 1)}}, which ensures that all possible interactions take place infinitely often. In Section 3 we will discuss how this type of stabilization can be proven automatically with a user-defined measure function.
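The binary gossip system is small enough to execute directly. The following Python sketch (function names ours, indices 0-based rather than the 1-based indices of Example 1) models the transition function D and runs a round-robin schedule, which is F-fair since every edge fires in every round:

```python
def step(x, i, j):
    # D(x, step(i, j)): both endpoints take the binary OR of their values.
    y = list(x)
    y[i] = y[j] = y[i] | y[j]
    return tuple(y)

def ring_actions(N):
    # One action per ring edge, including the wrap-around edge (N-1, 0).
    return [(i, (i + 1) % N) for i in range(N)]

def run_round_robin(x, N, rounds=5):
    # A round-robin schedule is F-fair: every edge fires once per round.
    for _ in range(rounds):
        for i, j in ring_actions(N):
            x = step(x, i, j)
    return x
```

Starting from (0, 1, 0, 0) with N = 4, the schedule reaches the all-ones state (1, 1, 1, 1), whereas the all-zeros state stays put, matching the caveat that the initial state must contain at least one 1.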

3 Verifying Stabilization

3.1 A Sufficient Condition for Stabilization

We state a sufficient condition for stabilization in terms of the existence of a measure function. The measure functions are similar to Lyapunov stability conditions in control theory [17] and well-founded relations used in proving termination of programs and rewriting systems [18].

Theorem 1. Suppose A(N) = 〈X , A,D〉 is a transition system parameterizedby N , with a fairness condition F , and let X ∗ be a subset of X . Suppose furtherthat there exists a measure function C : X → W , with minimum element ⊥ suchthat the following conditions hold for all states x ∈ X:

– (invariance) ∀a ∈ A, C(a(x)) ≤ C(x),
– (progress) ∃Ax ∈ F, ∀a ∈ Ax, C(x) ≠ ⊥ ⇒ C(a(x)) < C(x),
– (noninterference) ∀a, b ∈ A, C(a(x)) < C(x) ⇒ C(a(b(x))) < C(x), and
– (minimality) C(x) = ⊥ ⇒ x ∈ X∗.

Then, A(N) F-stabilizes to X∗.

Proof. Consider an F-fair execution α = x0 a1 x1 . . . of A(N) and let xi be an arbitrary state in that execution. If C(xi) = ⊥, then by minimality we have xi ∈ X∗. Otherwise, by the progress condition we know that there exists a set of actions Axi ∈ F and k > i such that ak ∈ Axi and C(ak(xi)) < C(xi). We perform induction on the length of the sub-sequence xi ai+1 xi+1 . . . ak xk and prove that C(xk) < C(xi). For any sequence β of intervening actions of length n,

C(ak(xi)) < C(xi) ⇒ C(ak(β(xi))) < C(xi).

The base case of the induction is n = 0, which is trivially true. By the induction hypothesis we have: for any j < n, with the length of β equal to j,

C(ak(β(xi))) < C(xi).


40 R. Ghosh and S. Mitra

We have to show that for any action b ∈ A,

C(ak(β(b(xi)))) < C(xi).

There are two cases to consider. If C(b(xi)) < C(xi), then the result follows from the invariance property. Otherwise, let x′ = b(xi). From the invariance of b we have C(x′) = C(xi). From the noninterference condition we have

C(ak(b(xi))) < C(xi),

which implies that C(ak(x′)) < C(x′). By applying the induction hypothesis to x′

we have the required inequality C(ak(β(b(xi)))) < C(xi). So far we have proved that either a state xi in an execution is already in the stabilizing set, or there is a state xk, k > i, such that C(xk) < C(xi). Since < is a well-ordering on C(X), there cannot be an infinite descending chain. Thus

∃j (j > i ∧ C(xj) = ⊥).

By minimality, xj ∈ X∗. By invariance again, we have F-stabilization to X∗. □

We make some remarks on the conditions of Theorem 1. It requires the measure function C and the transition system A(N) to satisfy four conditions. The invariance condition requires the sub-level sets of C to be invariant with respect to all the transitions of A(N). The progress condition requires that for every state x for which the measure function is not already ⊥, there exists a fair set of actions Ax that takes x to a lower value of C.

The minimality condition asserts that C(x) drops to ⊥ only if the state is inthe stabilizing set X ∗. This is a part of the specification of the stabilizing set.

The noninterference condition requires that if a results in a decrease in the value of the measure function at state x, then application of a to another state x′ that is reachable from x also decreases the measure value below that of x. Note that this does not necessarily mean that a decreases the measure value at x′, only that either x′ has a measure value less than that of x at the time of application of a, or it drops below it after the application. In contrast, the progress condition of Theorem 1 requires that for every sub-level set of C there is a fair action that takes all states in the sub-level set to a smaller sub-level set.

To see the motivation for the noninterference condition, consider a sub-level set with two states x1 and x2 such that b(x1) = x2 and a(x2) = x1, and there is only one action a such that C(a(x1)) < C(x1). As long as a does not occur at x1, an infinite (fair) execution x1 b x2 a x1 b x2 . . . may never enter a smaller sub-level set.

In our examples, the actions change the state of a node or at most a small set of nodes, while the measure functions succinctly capture global progress conditions such as the number of nodes that have different values. Thus, it is often impossible to find actions that reduce the measure function for all possible states in a level set. In Section 4, we will show how a candidate measure function can be checked for arbitrarily large instances of a distributed algorithm, and hence lead to a method for automatic verification of stabilization.
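For a small fixed N, the four conditions of Theorem 1 can be checked by brute force before attempting any symbolic encoding. The sketch below (Python; the function and variable names are our own, and as an assumption we take the measure to be the number of zeros so that ⊥ = 0, restricting attention to states with at least one 1, since gossip cannot leave the all-zero state) enumerates every state and action pair of a 4-node gossip ring:

```python
from itertools import product

def check_conditions(states, actions, step, C, fair_sets, bottom, X_star):
    """Brute-force the four conditions of Theorem 1 on a finite instance."""
    ok = True
    for x in states:
        for a in actions:                                   # invariance
            ok &= C(step(a, x)) <= C(x)
        if C(x) != bottom:                                  # progress
            ok &= any(all(C(step(a, x)) < C(x) for a in A) for A in fair_sets)
        for a, b in product(actions, repeat=2):             # noninterference
            if C(step(a, x)) < C(x):
                ok &= C(step(a, step(b, x))) < C(x)
        if C(x) == bottom:                                  # minimality
            ok &= x in X_star
    return ok

N = 4
edges = [(i, (i + 1) % N) for i in range(N)]                # ring, 0-indexed

def step(a, x):
    i, j = a
    y = list(x)
    y[i] = y[j] = x[i] | x[j]
    return tuple(y)

# Restrict to states with at least one 1; the measure counts remaining zeros.
states = [x for x in product((0, 1), repeat=N) if any(x)]
C = lambda x: x.count(0)
fair_sets = [[e] for e in edges]      # each edge fires infinitely often
assert check_conditions(states, edges, step, C, fair_sets, 0, {(1,) * N})
```

Such an exhaustive check is only feasible for tiny N; the point of the rest of the section is to make the check meaningful for all N.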



3.2 Automating Stabilization Proofs

For finite instances of a distributed algorithm, we can use formal verification tools to check the sufficient conditions in Theorem 1 to prove stabilization. For transition systems with invariance, progress and noninterference conditions that can be encoded appropriately in an SMT solver, these checks can be performed automatically. Our goal, however, is to prove stabilization of algorithms with an arbitrary or unknown number of participating nodes. We would like to define a parameterized family of measure functions and show that ∀N ∈ N, A(N) satisfies the conditions of Theorem 1. This is a parameterized verification problem, and most of the prior work on this problem has focused on verifying invariant properties (see Section 1 for related work). Our approach will be based on exploiting the small model nature of the logical formulas representing these conditions.

Suppose we want to check the validity of a logical formula of the form ∀N ∈ N, φ(N). Of course, this formula is valid iff the negation ∃N ∈ N, ¬φ(N) has no satisfying solution. In our context, checking whether ¬φ(N) has a satisfying solution over all integers is the (large) search problem of finding a counterexample, that is, a particular instance of the distributed algorithm and specific values of the measure function for which the conditions in Theorem 1 do not hold. The formula ¬φ(N) is said to have a small model property if there exists a cut-off value N0 such that if no counterexample is found in any of the instances A(1), A(2), . . . , A(N0), then there are no counterexamples at all. Thus, if the conditions of Theorem 1 can be encoded in such a way that they have these small model properties, then by checking them over finite instances we can infer their validity for arbitrarily large systems.

In [3], a class of ∀∃ formulas with small model properties was used to check invariants of timed distributed systems on arbitrary networks. In this paper, we will use the same class of formulas to encode the sufficient conditions for checking stabilization. We use the following small model theorem, as presented in [3]:

Theorem 2. Let Γ (N) be an assertion of the form

∀i1, . . . , ik ∈ [N ]∃j1, . . . , jm ∈ [N ], φ(i1, . . . , ik, j1, . . . , jm)

where φ is a quantifier-free formula involving the index variables and the global and local variables in the system. Then, ∀N ∈ N : Γ(N) is valid iff for all n ≤ N0 = (e + 1)(k + 2), Γ(n) is satisfied by all models of size n, where e is the number of index array variables in φ and k is the largest subscript of the universally quantified index variables in Γ(N).
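The cut-off of Theorem 2 is a simple arithmetic function of the shape of the formula. As a sketch (the function name is ours):

```python
def small_model_bound(e, k):
    """Cut-off N0 = (e + 1)(k + 2) from the small model theorem:
    e = number of index array variables in the quantifier-free body,
    k = number of universally quantified index variables."""
    return (e + 1) * (k + 2)

# e.g. a condition over two state arrays x, x' with two universal indices
assert small_model_bound(2, 2) == 12
```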

3.3 Computing the Small Model Parameter

Computing the small model parameter N0 for verifying a stability property of a transition system first requires expressing all the conditions of Theorem 1 using formulas which have the structure specified by Theorem 2. There are a few important considerations while doing so.



Translating the sufficient conditions. In their original form, none of the conditions of Theorem 1 has the structure of ∀∃-formulas as required by Theorem 2. For instance, a leading ∀x ∈ X quantification is not allowed by Theorem 2, so we transform the conditions into formulas with implicit quantification. Take for instance the invariance condition: ∀x ∈ X, ∀a ∈ A, (C(a(x)) ≤ C(x)). Checking the validity of the invariance condition is equivalent to checking the validity of ∀a ∈ A, (a(x) = x′ ⇒ C(x′) ≤ C(x)), where x′ and x are free variables, which are checked over all valuations. Here we need to check that x and x′ are actually states and that they satisfy the transition function. For instance, in the binary gossip example, we get

Invariance : ∀x ∈ X, ∀a ∈ A, C(a(x)) ≤ C(x) is verified as

∀a ∈ A, x′ = a(x) ⇒ C(x′) ≤ C(x)

≡ ∀i, j ∈ [N], x′ = step(i, j)(x) ⇒ Sum(x′) ≤ Sum(x).

Progress : ∀x ∈ X, ∃a ∈ A, C(x) ≠ ⊥ ⇒ C(a(x)) < C(x) is verified as

C(x) ≠ 0 ⇒ ∃i, j ∈ [N], x′ = step(i, j)(x) ∧ Sum(x′) < Sum(x).

Noninterference : ∀x ∈ X, ∀a, b ∈ A, (C(a(x)) < C(x) ⇒ C(a(b(x))) < C(x))

is verified as ∀i, j, k, l ∈ [N], x′ = step(i, j)(x) ∧ x′′ = step(k, l)(x)

∧ x′′′ = step(i, j)(x′′) ⇒ (C(x′) < C(x) ⇒ C(x′′′) < C(x)).

Interaction graphs. In distributed algorithms, the underlying network topology dictates which pairs of nodes can interact, and therefore the set of actions. We need to be able to specify the available set of actions in a way that is in the format demanded by the small model theorem. In this paper we focus on specific classes of graphs like complete graphs, star graphs, rings, k-regular graphs, and k-partite complete graphs, as we know how to capture these constraints using predicates in the requisite form. For instance, we use edge predicates E(i, j): i and j are node indices, and the predicate is true if there is an undirected edge between them in the interaction graph. For a complete graph, E(i, j) = true. In the Binary Gossip example, the interaction graph is a ring, and E(i, j) = (i < N ∧ j = i + 1) ∨ (i > 1 ∧ j = i − 1) ∨ (i = 1 ∧ j = N). If the graph is a d-regular graph, we use d arrays reg1, . . . , regd, where ∃i, regi[k] = l if there is an edge between k and l, and i = j ≡ regi[k] = regj[k]. This only expresses that the degree of each vertex is d; there is no information about the connectivity of the graph. For that, we can have a separate index-valued array which satisfies certain constraints if the graph is connected. These constraints need to be expressed in a format satisfying the small model property as well. Other graph predicates can be introduced based on the model requirements, for instance Parent(i, j), Child(i, j), Direction(i, j). In our case studies we verify stabilization under the assumption that all pairs of nodes in E interact infinitely often. For the progress condition, the formula then simplifies to ∃a ∈ A, C(x) ≠ ⊥ ⇒ C(a(x)) < C(x). More general fairness constraints can be encoded in the same way as we encode graph constraints.
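The ring edge predicate can be written down directly. The sketch below (Python; the function name is ours, and we add the symmetric clause for the edge between N and 1, which the predicate in the text leaves implicit) evaluates E(i, j) on 1-based indices:

```python
def ring_edge(i, j, n):
    """E(i, j) for a ring on indices 1..n, written symmetrically so that
    both (1, n) and (n, 1) count as edges."""
    return (i < n and j == i + 1) or (i > 1 and j == i - 1) \
        or (i == 1 and j == n) or (i == n and j == 1)

assert ring_edge(1, 2, 5) and ring_edge(5, 1, 5)
assert not ring_edge(2, 4, 5)
```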



4 Case Studies

In this section, we present the details of applying our strategy to various distributed algorithms. We begin by defining some predicates that are used in our case studies. Recall that we want to check the conditions of Theorem 1 using the transformation outlined in Section 3.3, involving x, x′, etc., representing the states of a distributed system that are related by the transitions. These conditions are encoded using the following predicates, which we illustrate using the binary gossip example given in Section 2:

– isState(x) returns true iff the array variable x represents a state of the system. In the binary gossip example, isState(x) = ∀i ∈ [N], x[i] = 0 ∨ x[i] = 1.

– isAction(a) returns true iff a is a valid action for the system. Again, for the binary gossip example, isAction(step(i, j)) = True for all i, j ∈ [N] in the case of a complete communication graph.

– isTransition(x, step(i, j), x′) returns true iff the state x goes to x′ when the transition function for action step(i, j) is applied to it. In the case of the binary gossip example, isTransition(x, step(i, j), x′) is

(x′[j] = x′[i] = x[i] ∨ x[j]) ∧ (∀p, p ∉ {i, j} ⇒ x[p] = x′[p]).

– Combining the above predicates, we define P (x, x′, i, j) as

isState(x) ∧ isState(x′) ∧ isTransition(x, step(i, j), x′) ∧ isAction(step(i, j)).

Using these constructions, we rewrite the conditions of Theorem 1 as follows:

Invariance : ∀i, j, P (x, x′, i, j) ⇒ C(x′) ≤ C(x). (1)

Progress : C(x) ≠ ⊥ ⇒ ∃i, j, P(x, x′, i, j) ∧ C(x′) < C(x). (2)

Noninterference : ∀p, q, s, t, P(x, x′, p, q) ∧ P(x, x′′, s, t) ∧ P(x′′, x′′′, p, q)
⇒ (C(x′) < C(x) ⇒ C(x′′′) < C(x)). (3)

Minimality : C(x) = ⊥ ⇒ x ∈ X∗. (4)

4.1 Graph Coloring

This algorithm colors a given graph with d + 1 colors, where d is the maximum degree of a vertex in the graph [10]. Two neighboring nodes are said to have a conflict if they have the same color. A transition is made by choosing a single vertex; if it has a conflict with any of its neighbors, then it sets its own state to the least available value which is not the state of any of its neighbours. We want to verify that the system stabilizes to a state with no conflicts. The measure function is chosen as the set of pairs with conflicts.
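A quick simulation of the recoloring rule illustrates why the conflict set is a natural measure: a recolored node resolves its own conflicts and creates none, so the set only shrinks. A sketch (Python; the helper names are ours, random node choices stand in for a fair schedule, and colours are numbered from 0):

```python
import random

def conflicts(x, adj):
    """The measure: the set of conflicting (monochromatic) edges."""
    return {(i, j) for i in adj for j in adj[i] if i < j and x[i] == x[j]}

def color_step(x, adj, i):
    """If node i conflicts with a neighbour, it takes the least colour not
    used by any neighbour (at most deg(i) colours are excluded)."""
    if any(x[j] == x[i] for j in adj[i]):
        used = {x[j] for j in adj[i]}
        x[i] = min(c for c in range(len(adj)) if c not in used)

adj = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}  # a 5-cycle, d = 2
x = [0] * 5                                              # every edge conflicts
rng = random.Random(1)
while conflicts(x, adj):
    color_step(x, adj, rng.randrange(5))
assert not conflicts(x, adj)
assert max(x) <= 2            # d + 1 = 3 colours, numbered 0..2
```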



Automaton Coloring[N : N]
  type indices : [N]
  type values : {1, . . . , N}
  variables
    x[indices ↦ values]
  transitions
    internal step(i : indices)
      pre ∃j ∈ [N] (E(j, i) ∧ x[j] = x[i])
      eff x[i] = min(values \ {c | j ∈ [N] ∧ E(i, j) ∧ x[j] = c})
  measure
    func C : x ↦ {(i, j) | E(i, j) ∧ x[i] = x[j]}

Here, the ordering on the image of the measure function is set inclusion.

Invariance : ∀i ∈ [N], P(x, x′, i) ⇒ C(x′) ⊆ C(x). (from (1))

≡ ∀i, j, k ∈ [N], P(x, x′, i) ⇒ ((j, k) ∈ C(x′) ⇒ (j, k) ∈ C(x)).

≡ ∀i, j, k ∈ [N], P(x, x′, i) ⇒ (E(j, k) ∧ x′[j] = x′[k] ⇒ x[j] = x[k]).

(E is the set of edges in the underlying graph)

Progress : ∃m ∈ [N], C(x) ≠ ∅ ⇒ C(step(m)(x)) ⊂ C(x).

≡ ∀i, j ∈ [N], ∃m, n ∈ [N], ¬(E(i, j) ∧ x[i] = x[j]) ∨
(P(x, x′, m) ∧ E(m, n) ∧ x[m] = x[n] ∧ x′[m] ≠ x′[n]).

Noninterference : ∀q, r, s, t ∈ [N], (P(x, x′, q) ∧ P(x, x′′, s) ∧ P(x′′, x′′′, q))
⇒ ((E(q, r) ∧ x[q] = x[r] ∧ x′[q] ≠ x′[r]) ⇒ (E(s, t)
∧ (x′[s] ≠ x′[t] ⇒ x′′′[s] ≠ x′′′[t]) ∧ x′′′[q] ≠ x′′′[r])).
(from (3) and expansion of the ordering)

Minimality : C(x) = ∅ ⇒ x ∈ X∗.

From the above conditions, using Theorem 2, N0 is calculated to be 24.

4.2 Leader Election

This algorithm is a modified version of the Chang-Roberts leader election algorithm [10]. We apply Theorem 1 directly by defining a straightforward measure function. The state of each node in the network consists of (a) its own uid, (b) the index and uid of its proposed candidate, and (c) the status of the election according to the node (0: the node itself is elected, 1: the node is not the leader, 2: the node is still waiting for the election to finish). A node i communicates its state to its clockwise neighbor j (i + 1 if i < N, 1 otherwise), and if the uid of i's proposed candidate is greater than that of j's proposed candidate, then j's candidate is out of the running. The proposed candidate of each node is initially the node itself. When a node gets back its own index and uid, it sets its election status to 0. This status and the correct leader identity then propagate through the network, and we want to verify that the system stabilizes to a state where a leader is elected. The measure function is the sum of the election-status values of all nodes.
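A simplified simulation of this max-uid propagation (Python; the helper names and the encoding of the three transition rules as branches are our own reconstruction, a random schedule stands in for fairness, and indices are 0-based) shows exactly one node reaching status 0:

```python
import random

def election_step(uid, cand, status, i, n):
    """One interaction of node i with its clockwise neighbour j."""
    j = (i + 1) % n
    if status[i] == 0:                      # an elected node announces itself
        status[j], cand[j] = 1, cand[i]
    elif status[j] == 2 and cand[i] == j:   # j's own uid made it all the way round
        status[j], cand[j] = 0, j
    elif uid[cand[i]] > uid[cand[j]]:       # i's candidate beats j's candidate
        status[j], cand[j] = 1, cand[i]

n = 6
uid = [3, 9, 1, 7, 4, 6]     # distinct identifiers
cand = list(range(n))        # every node initially proposes itself
status = [2] * n             # 2 = still waiting for the election to finish
rng = random.Random(0)
for _ in range(5000):
    election_step(uid, cand, status, rng.randrange(n), n)
assert status.count(0) == 1 and status.count(1) == n - 1
assert status[uid.index(max(uid))] == 0    # the max-uid node is the leader
```

The final status vector has one 0 and N − 1 ones, matching the value N − 1 of the measure at the stabilizing set.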



Automaton Leader[N : N]
  type indices : [N]
  variables
    uid[indices ↦ [N]]
    candidate[indices ↦ [N]]
    leader[indices ↦ {0, 1, 2}]
  transitions
    internal step(i : indices, j : indices)
      pre leader[i] = 1 ∧ uid[candidate[i]] > uid[candidate[j]]
      eff leader[j] = 1 ∧ candidate[j] = candidate[i]

      pre leader[j] = 2 ∧ candidate[i] = j
      eff leader[j] = 0 ∧ candidate[j] = j

      pre leader[i] = 0
      eff leader[j] = 1 ∧ candidate[j] = i
  measure
    func C : x ↦ Sum(x.leader)

The function Sum() represents the sum of all elements in the array, and it can be updated when a transition happens by just looking at the interacting nodes. We encode the sufficient conditions for stabilization of this algorithm using the strategy outlined in Section 3.2.

Invariance : ∀i, j ∈ [N], P(x, x′, i, j) ⇒ Sum(x′.leader) ≤ Sum(x.leader).

≡ ∀i, j ∈ [N], P(x, x′, i, j) ⇒ Sum(x.leader) − x.leader[i] − x.leader[j]
+ x′.leader[i] + x′.leader[j] ≤ Sum(x.leader).

(the difference is only due to the interacting nodes)

≡ ∀i, j ∈ [N], P(x, x′, i, j)
⇒ (x′.leader[i] + x′.leader[j] ≤ x.leader[i] + x.leader[j]).

Progress : ∃m, n ∈ [N], Sum(x.leader) ≠ N − 1

⇒ Sum(step(m, n)(x).leader) < Sum(x.leader).

≡ ∀p ∈ [N], x.leader[p] = 2 ⇒ ∃m, n ∈ [N], (P(x, x′, m, n) ∧ E(m, n) ∧
x′.leader[m] + x′.leader[n] < x.leader[m] + x.leader[n]).

(some node is still waiting for the election to end)

Noninterference : ∀q, r, s, t ∈ [N], P(x, x′, q, r) ∧ P(x, x′′, s, t) ∧ P(x′′, x′′′, q, r)
⇒ (x′[q] + x′[r] < x[q] + x[r]

⇒ (x′′′[q] + x′′′[r] + x′′′[s] + x′′′[t] < x[q] + x[r] + x[s] + x[t])).

(expanding out Sum)

Minimality : C(x) = N − 1 ⇒ x ∈ X∗.

From the above conditions, using Theorem 2, N0 is calculated to be 35.

4.3 Shortest Path

This algorithm computes the shortest path to every node in a graph from a root node. It is a simplified version of the Chandy-Misra shortest path algorithm [10]. We are allowed to distinguish the nodes with indices 1 or N in the formula



structure specified by Theorem 2. The state of a node represents its distance from the root node. The root node (index 1) has state 0. Each pair of neighboring nodes communicate their states to each other, and if one of them has a lesser value v, then the one with the larger value updates its state to v + 1. This stabilizes to a state where every node stores the shortest distance from the root in its state. We do not have an explicit value of ⊥ for the measure function here, but it can be seen that we do not need one in this case. Let the interaction graph be a d-regular graph. The measure function is the sum of the distances.
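The relaxation rule can be sanity-checked with a deterministic round-robin schedule, which is one admissible fair schedule. The sketch below (Python; the names are ours, indices are 0-based with node 0 as the root) converges in at most n sweeps, as in Bellman-Ford:

```python
def sp_step(x, i, j):
    """Neighbours i, j compare estimates; the larger one is relaxed."""
    if x[j] > x[i] + 1:
        x[j] = x[i] + 1
    elif x[i] > x[j] + 1:
        x[i] = x[j] + 1

# ring of 6 nodes; the root (node 0) holds the fixed distance 0
n = 6
edges = [(i, (i + 1) % n) for i in range(n)]
x = [0] + [n] * (n - 1)          # pessimistic initial estimates
for _ in range(n):               # n sweeps over all edges suffice
    for i, j in edges:
        sp_step(x, i, j)
assert x == [0, 1, 2, 3, 2, 1]   # shortest ring distances from node 0
```

Since each relaxation strictly lowers one entry and the estimates are bounded below by the true distances, the sum of distances is a decreasing measure, as the paper uses.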

Automaton Shortest[N : N]
  type indices : [N]
  type values : {1, . . . , N}
  variables
    x[indices ↦ values]
  transitions
    internal step(i : indices, j : indices)
      pre x[j] > x[i] + 1
      eff x[j] = x[i] + 1

      pre x[i] = 0
      eff x[j] = 1
  measure
    func C : x ↦ Sum(x)

The ordering on the image of the measure function is the usual one on the natural numbers.

Invariance : ∀i, j ∈ [N], P(x, x′, i, j) ⇒ Sum(x′) ≤ Sum(x).

≡ ∀i, j ∈ [N], P(x, x′, i, j) ⇒ Sum(x) − x[i] − x[j] + x′[i] + x′[j] ≤ Sum(x).

≡ ∀i, j ∈ [N], P(x, x′, i, j) ⇒ x′[i] + x′[j] ≤ x[i] + x[j].

Progress : ∃m, n ∈ [N], C(x) ≠ ⊥ ⇒ P(x, x′, m, n) ∧ Sum(x′) < Sum(x).

≡ ∀k, l ∈ [N], (E(k, l) ⇒ x[k] ≤ x[l] + 1)

∨ ∃m, n ∈ [N], (P(x, x′, m, n) ∧ E(m, n)

∧ x[m] + x[n] > x′[m] + x′[n]).

(C(x) = ⊥ if there is no pair of neighboring vertices more than distance 1 apart from each other)

Noninterference : ∀q, r, s, t ∈ [N], P(x, x′, q, r) ∧ P(x, x′′, s, t) ∧ P(x′′, x′′′, q, r)
⇒ (x′[q] + x′[r] < x[q] + x[r]

⇒ (x′′′[q] + x′′′[r] + x′′′[s] + x′′′[t] < x[q] + x[r] + x[s] + x[t])).

Minimality : C(x) = ⊥ ⇒ x ∈ X∗

≡ (∀i, j ∈ [N], E(i, j) ⇒ x[i] − x[j] ≤ 1) ⇒ x ∈ X∗. (by definition)

N0 is 7(d + 1), where the graph is d-regular.

4.4 Link Reversal

We describe the full link reversal algorithm as presented by Gafni and Bertsekas in [19], which, given a directed graph with a distinguished sink vertex, outputs



a graph in which there is a path from every vertex to the sink. There is a distinguished sink node (index N). Any other node which detects that it has only incoming edges reverses the direction of all its edges with its neighbours. We use the vector of reversal distances (the least number of edges required to be reversed for a node to have a path to the sink) for proving termination. The states store the reversal distances, and the measure function is the identity.
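The full reversal rule itself is straightforward to simulate. The sketch below (Python; the names are ours, the graph is kept as per-node sets of outgoing edges, and random choices stand in for fairness) reverses nodes with only incoming edges until every node has a directed path to the sink:

```python
import random

def full_reversal(out_edges, sink, rng, max_steps=10_000):
    """Repeatedly pick a non-sink node with no outgoing edges and
    reverse every edge incident to it."""
    nodes = list(out_edges)
    for _ in range(max_steps):
        stuck = [v for v in nodes if v != sink and not out_edges[v]]
        if not stuck:
            break
        v = rng.choice(stuck)
        for u in nodes:                   # every edge into v becomes an edge out of v
            if v in out_edges[u]:
                out_edges[u].discard(v)
                out_edges[v].add(u)
    return out_edges

def reaches(out_edges, v, sink):
    """Directed reachability from v to the sink."""
    seen, stack = set(), [v]
    while stack:
        u = stack.pop()
        if u == sink:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(out_edges[u])
    return False

g = {0: set(), 1: {0}, 2: {1}, 3: {2}}    # every edge points away from sink 3
result = full_reversal(g, 3, random.Random(0))
assert all(reaches(result, v, 3) for v in g)
```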

Automaton Reversal[N : N]
  type indices : [N]
  type values : [N]
  variables
    x[indices ↦ values]
  transitions
    internal step(i : indices)
      pre i ≠ N ∧ ∀j ∈ [N] (E(i, j) ⇒ direction(i, j) = −1)
      eff ∀j ∈ [N] (E(i, j) ⇒ Reverse(i, j)) ∧ x[i] = min{x[j] | E(i, j)}
  measure
    func C : x ↦ x

The ordering on the image of the measure function is component-wise comparison:

V1 < V2 ⇔ ∀i(V1[i] < V2[i])

We mentioned earlier that the image of C has a well-ordering. That is a condition formulated with the idea of continuous spaces in mind. The proposed ordering for this problem works because the image of the measure function is discrete and has a lower bound (specifically, 0^N). We elaborate a bit on P here, because it needs to include the condition that the reversal distances are calculated accurately. The node N has reversal distance 0. Any other node has reversal distance rd(i) = min(rd(j1), . . . , rd(jm), rd(k1) + 1, . . . , rd(kn) + 1), where jp (p = 1 . . . m) are the nodes to which it has outgoing edges, and kq (q = 1 . . . n) are the nodes it has incoming edges from. P also needs to include the condition that in a transition, the reversal distances of no nodes apart from the transitioning nodes change. The interaction graph in this example is complete.

Invariance : ∀i, j ∈ [N ], P (x, x′, i) ⇒ x′[j] ≤ x[j] (ordering)

Progress : ∃m ∈ [N], C(x) ≠ ⊥ ⇒ C(step(m)(x)) < C(x).

≡ ∀n ∈ [N], (x[n] = 0) ∨ ∃m ∈ [N], (P(x, x′, m) ∧ x′[m] < x[m]).

Noninterference : ∀i, j ∈ [N], P(x, x′, i) ∧ P(x, x′′, j) ∧ P(x′′, x′′′, i)
⇒ (x′[i] < x[i] ⇒ x′′′[i] < x[i]). (decreasing measure)

Minimality : C(x) = 0^N ⇒ x ∈ X∗.

From the above conditions, using Theorem 2, N0 is calculated to be 21.

5 Experiments and Discussion

We verified that instances of the aforementioned systems with sizes less than the small model parameter N0 satisfy the four conditions (invariance, progress, noninterference, minimality) of Theorem 1 using the Z3 SMT solver [4]. The models are checked by symbolic execution.



Fig. 1. Instance size vs log10(T ), where T is the running time in seconds

The interaction graphs were complete graphs in all the experiments. In Figure 1, the x-axis represents the problem instance sizes, and the y-axis is the log of the running time (in seconds) for verifying Theorem 1 for the different algorithms.2

We observe that the running times grow rapidly with the increase in the model sizes. For the binary gossip example, the program completes in ∼17 seconds for a model size of 7, which is the N0 value. In the case of link reversal, for a model size of 13, the program completes in ∼30 minutes. We have used complete graphs in all our experiments, but as we mentioned earlier in Section 3.2, we can encode more general graphs as well. This method is a general approach to automated verification of stabilization properties of distributed algorithms under specific fairness constraints and structural constraints on graphs. The small model nature of the conditions to be verified is crucial to the success of this approach. We saw that many distributed graph algorithms, routing algorithms and symmetry-breaking algorithms can be verified using the techniques discussed in this paper. The problem of finding a suitable measure function which satisfies the conditions of Theorem 1 is a non-trivial one in itself; however, for the problems we study, the natural measure function of the algorithms seems to work.

References

1. Dolev, S.: Self-Stabilization. MIT Press (2000)
2. Tsitsiklis, J.N.: On the stability of asynchronous iterative processes. Mathematical Systems Theory 20(1), 137–153 (1987)
3. Johnson, T.T., Mitra, S.: A small model theorem for rectangular hybrid automata networks. In: Giese, H., Rosu, G. (eds.) FMOODS/FORTE 2012. LNCS, vol. 7273, pp. 18–34. Springer, Heidelberg (2012)
4. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)

2 The code is available at http://web.engr.illinois.edu/rghosh9/code.



5. Dijkstra, E.W.: Self-stabilization in spite of distributed control. In: Selected Writings on Computing: A Personal Perspective, pp. 41–46. Springer (1982)
6. Theel, O.: Exploitation of Ljapunov theory for verifying self-stabilizing algorithms. In: Herlihy, M. (ed.) DISC 2000. LNCS, vol. 1914, pp. 209–222. Springer, Heidelberg (2000)
7. Oehlerking, J., Dhama, A., Theel, O.: Towards automatic convergence verification of self-stabilizing algorithms. In: Tixeuil, S., Herman, T. (eds.) SSS 2005. LNCS, vol. 3764, pp. 198–213. Springer, Heidelberg (2005)
8. Theel, O.E.: A new verification technique for self-stabilizing distributed algorithms based on variable structure systems and Lyapunov theory. In: HICSS (2001)
9. Dhama, A., Theel, O.: A transformational approach for designing scheduler-oblivious self-stabilizing algorithms. In: Dolev, S., Cobb, J., Fischer, M., Yung, M. (eds.) SSS 2010. LNCS, vol. 6366, pp. 80–95. Springer, Heidelberg (2010)
10. Ghosh, S.: Distributed Systems: An Algorithmic Approach. CRC Press (2010)
11. Umeno, S., Lynch, N.: Safety verification of an aircraft landing protocol: A refinement approach. In: Bemporad, A., Bicchi, A., Buttazzo, G. (eds.) HSCC 2007. LNCS, vol. 4416, pp. 557–572. Springer, Heidelberg (2007)
12. Johnson, T.T., Mitra, S.: Invariant synthesis for verification of parameterized cyber-physical systems with applications to aerospace systems. In: Proceedings of the AIAA Infotech at Aerospace Conference (AIAA Infotech 2013), Boston, MA. AIAA (August 2013)
13. Allen Emerson, E., Kahlon, V.: Reducing model checking of the many to the few. In: McAllester, D. (ed.) CADE-17. LNCS (LNAI), vol. 1831, pp. 236–254. Springer, Heidelberg (2000)
14. Duggirala, P.S., Mitra, S.: Abstraction refinement for stability. In: 2011 IEEE/ACM International Conference on Cyber-Physical Systems (ICCPS), pp. 22–31. IEEE (2011)
15. Huth, M., Ryan, M.: Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press (2004)
16. Mitra, S.: A verification framework for hybrid systems. PhD thesis, Massachusetts Institute of Technology (2007)
17. Khalil, H.K., Grizzle, J.W.: Nonlinear Systems, vol. 3. Prentice Hall, Upper Saddle River (2002)
18. Dershowitz, N.: Termination of rewriting. Journal of Symbolic Computation 3(1), 69–115 (1987)
19. Gafni, E.M., Bertsekas, D.P.: Distributed algorithms for generating loop-free routes in networks with frequently changing topology. IEEE Transactions on Communications 29(1), 11–18 (1981)


Faster Linearizability Checking via P-Compositionality

Alex Horn and Daniel Kroening

University of Oxford, Oxford, UK
[email protected]

Abstract. Linearizability is a well-established consistency and correctness criterion for concurrent data types. An important feature of linearizability is Herlihy and Wing's locality principle, which says that a concurrent system is linearizable if and only if all of its constituent parts (so-called objects) are linearizable. This paper presents P-compositionality, which generalizes the idea behind the locality principle to operations on the same concurrent data type. We implement P-compositionality in a novel linearizability checker. Our experiments with over nine implementations of concurrent sets, including Intel's TBB library, show that our linearizability checker is one order of magnitude faster and/or more space efficient than the state-of-the-art algorithm.

1 Introduction

Linearizability [1] is a well-established correctness criterion for concurrent data types, and it corresponds to one of the three desirable properties of a distributed system, namely consistency [2]. The intuition behind linearizability is that every operation on a concurrent data type is guaranteed to take effect instantaneously at some point between its call and return.

The significance of linearizability for contemporary distributed key/value stores has been highlighted recently by the Jepsen project, an extensive case study into the correctness of distributed systems.1 Interestingly, Jepsen found linearizability bugs in several distributed key/value stores despite the fact that they were designed based on formally verified distributed consensus protocols. This illustrates that there is often a gap between the design and the implementation of distributed systems. This gap motivates the study in this paper into runtime verification techniques (in the form of so-called linearizability checkers) for finding linearizability bugs in a single run of a concurrent system.

The input to a linearizability checker consists of a sequential specification of a data type and a certain partially ordered set of operations, called a history. A history represents a single terminating run of a concurrent system. We assume

This work is funded by a gift from Intel Corporation for research on Effective Validation of Firmware and the ERC project ERC 280053.

1 https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 50–65, 2015.
DOI: 10.1007/978-3-319-19195-9_4



that the concurrent system is deadlock-free, since there already exist good deadlock detection tools. Despite the restriction to single histories, the problem of checking linearizability is NP-complete [3]. This high computational complexity means that writing an efficient linearizability checker is inherently difficult. The problem is to find ways of pruning a huge search space: in the worst case, its size is O(N!), where N is the length of the run of a concurrent system.

This paper presents a novel linearizability checker that efficiently prunes the search space by partitioning it into independent, faster to solve subproblems. To achieve this, we propose P-compositionality (Definition 6), a new partitioning scheme of which Herlihy and Wing's locality principle [1] is an instance. Recall that locality says that a concurrent system Q is linearizable if and only if each concurrent object in Q is linearizable. The crux of P-compositionality is that it generalizes the idea behind the locality principle to operations on the same concurrent object. For example, the operations on a concurrent unordered set or map are linearizable if and only if the restriction to each key is linearizable. This is not a consequence of Herlihy and Wing's locality principle.
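Concretely, the per-key partitioning that P-compositionality licenses for sets and maps can be sketched as follows (Python; the triple encoding of events is our simplification of a real history, which would also carry call/return timing):

```python
from collections import defaultdict

def partition_by_key(history):
    """Split a history of set/map operations into per-key sub-histories.
    Under P-compositionality, each part can be checked for linearizability
    independently, shrinking the search space."""
    parts = defaultdict(list)
    for op, key, result in history:
        parts[key].append((op, key, result))
    return dict(parts)

h = [("insert", 1, True), ("insert", 2, True),
     ("remove", 1, False), ("contains", 2, True)]
parts = partition_by_key(h)
assert set(parts) == {1, 2}
assert parts[1] == [("insert", 1, True), ("remove", 1, False)]
```

The worst-case O(N!) interleaving search then applies to each (much shorter) sub-history rather than to the whole run.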

In this paper, we study the pragmatics of P-compositionality through its implementation in a novel linearizability checker and an experimental evaluation. Our implementation is based on Wing and Gong's algorithm (the WG algorithm) [4] and a recent extension by Lowe [5]. We call Lowe's extension of Wing and Gong's algorithm the WGL algorithm. The idea behind the WGL algorithm is to prune states that are equivalent to an already seen state. Lowe's experiments show that the WGL algorithm can solve a significantly larger number of problem instances than the WG algorithm. We therefore use the more recent WGL algorithm as our starting point.

Our linearizability checker preserves three practical properties of the algorithms in the WG family that we deem important. Firstly, our tool is precise, i.e., it reports no false alarms. This is particularly significant for evaluating large code bases, as effectively shown by the Jepsen project. Secondly, our tool takes as input an executable specification of the data type to be checked. This significantly simplifies the task of expressing the expected behaviour of a data type because one merely writes code, i.e., no expertise in formal modeling is required. Finally, our tool can be easily integrated with a range of runtime monitors to generate a history from a run of a concurrent system. This is essential to make it a viable runtime verification technique.

We experimentally evaluate our linearizability checker using nine different implementations of concurrent sets, including Intel's TBB library, as exemplars of P-compositionality. Our experiments show that our linearizability checker is at least one order of magnitude faster and/or more space efficient than the WGL algorithm. Overall, the results of our work can therefore dramatically increase the number of runs that can be checked for linearizability bugs in a given time budget.

The rest of this paper is organized as follows. We first formalize the problem by recalling familiar concepts (§ 2). We then present P-compositionality (§ 3) on which our decision procedure (§ 4) is based. We implement and experimentally


52 A. Horn and D. Kroening

[Figure: a history diagram with three intervals on a shared time axis — call1 〈 set.insert(1) : true 〉 ret1 and call2 〈 set.remove(1) : false 〉 ret2 overlap, and call3 〈 set.contains(1) : true 〉 ret3 starts after both have returned]

Fig. 1. A history diagram H1 for the operations on a concurrent set

evaluate our decision procedure (§ 5). Finally, we discuss related work (§ 6) and conclude the paper (§ 7).

2 Background

We recall familiar concepts that are fundamental to everything that follows.

Definition 1 (History). Let E ≜ {call, ret} × N. For all natural numbers n in N, calln ≜ 〈call, n〉 in E is called a call and retn ≜ 〈ret, n〉 in E is called a return. The invocation of a procedure with input and output arguments is called an operation. An object comprises a finite set of such operations. For all e in E, obj(e) and op(e) denote the object and operation of e, respectively. A history is a tuple 〈H, obj, op〉 where H is a finite sequence of calls and returns, totally ordered by ≤H. When no ambiguity arises, we simply write H for a history. We write |H| for the length of H.

Intuitively, a history H records a particular run of a concurrent system. Using the implicitly associated functions obj and op, a history H gives relevant information on all operations performed at runtime, and the sequence of calls and returns in H gives the relative points in time at which an operation started and completed with respect to other operations. This can be visualized using the familiar history diagrams [1], as illustrated next.

Example 1. Consider a concurrent set with the usual operations: ‘insert’ adds an element to a set, whereas ‘remove’ does the opposite, and ‘contains’ checks membership. The return value indicates the success of the operation. For example, ‘set.remove(1) : true’ denotes the operation that successfully removed ‘1’ from the object ‘set’, whereas ‘set.remove(1) : false’ denotes the operation that did not modify ‘set’ because ‘1’ is already not in the set. Then the history diagram in Fig. 1 can be defined by H1 = 〈call1, call2, ret1, ret2, call3, ret3〉 such that, for all 1 ≤ i ≤ 3, obj(calli) = obj(reti) = ‘set’, and the following holds:

– op(call1) = op(ret1) = ‘insert(1) : true’,
– op(call2) = op(ret2) = ‘remove(1) : false’,
– op(call3) = op(ret3) = ‘contains(1) : true’.

Note that |H1| = 6 and the total ordering ≤H1 satisfies, among other constraints, ret1 ≤H1 call3 because ret1 precedes call3 in the sequence H1.
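Definition 1 and the history H1 of Example 1 can be encoded directly; below is a minimal Python sketch of this encoding (the tuple representation and the dictionary form of obj and op are our own choices, not the paper's):

```python
from typing import Dict, List, Tuple

# An event is a pair (kind, n): calln is ("call", n) and retn is ("ret", n).
Event = Tuple[str, int]

def call(n: int) -> Event:
    return ("call", n)

def ret(n: int) -> Event:
    return ("ret", n)

# H1 from Example 1: insert(1) and remove(1) overlap, contains(1) starts later.
H1: List[Event] = [call(1), call(2), ret(1), ret(2), call(3), ret(3)]

# The implicitly associated obj and op functions, given as dictionaries over n.
obj: Dict[int, str] = {n: "set" for n in (1, 2, 3)}
op: Dict[int, str] = {
    1: "insert(1) : true",
    2: "remove(1) : false",
    3: "contains(1) : true",
}

assert len(H1) == 6  # |H1| = 6, as noted above
```

The total order ≤H1 is simply the position of each event in the list.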


Henceforth, we draw diagrams as in Fig. 1. Linearizability is ultimately defined in terms of sequential histories, in the following sense:

Definition 2 (Complete and Sequential History). Let e, e′ ∈ E and H be a history. If e is a call and e′ is a return in H, both are matching whenever e ≤H e′ and their objects and operations are equal, i.e. obj(e) = obj(e′) and op(e) = op(e′). A history is called complete if every call has a unique matching return. A complete history is called sequential whenever it alternates between matching calls and returns (necessarily starting with a call).
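Definition 2 can be phrased directly over the event encoding of Definition 1; a minimal sketch (the function names are ours, and we assume distinct event identifiers):

```python
def is_complete(h):
    # Every call has a matching return that occurs later (Definition 2).
    # Events are (kind, n) pairs; matching is by equal identifier n,
    # and identifiers are assumed to be distinct.
    calls = [n for kind, n in h if kind == "call"]
    rets = [n for kind, n in h if kind == "ret"]
    return sorted(calls) == sorted(rets) and all(
        h.index(("call", n)) < h.index(("ret", n)) for n in calls)

def is_sequential(h):
    # A complete history that alternates between matching calls and
    # returns, necessarily starting with a call.
    return is_complete(h) and all(
        h[i][0] == "call" and h[i + 1] == ("ret", h[i][1])
        for i in range(0, len(h), 2))

# H1 from Fig. 1 is complete but not sequential; H2 below is both.
H1 = [("call", 1), ("call", 2), ("ret", 1), ("ret", 2), ("call", 3), ("ret", 3)]
H2 = [("call", 2), ("ret", 2), ("call", 1), ("ret", 1), ("call", 3), ("ret", 3)]
assert is_complete(H1) and not is_sequential(H1)
assert is_sequential(H2)
```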

Example 2. The following history H2 is sequential:

〈 remove(1) : false 〉 〈 insert(1) : true 〉 〈 contains(1) : true 〉

And so is H3, which we get when we swap the first two operations in H2 (although the resulting sequence of operations is not what we would expect from a sequential set, as discussed next):

〈 insert(1) : true 〉 〈 remove(1) : false 〉 〈 contains(1) : true 〉

H3 in Example 2 illustrates that a history can be sequential even though it may not satisfy the expected sequential behaviour of the data type. This is addressed by the following definition:

Definition 3 (Specification). A specification, denoted by φ (possibly with a subscript), is a unary predicate on sequential histories.

Example 3. Define φset to be the specification of a sequential finite set. This means that, given a sequential history S according to Definition 2, the predicate φset(S) holds if and only if the input and output of ‘insert’, ‘remove’ and ‘contains’ in S are consistent with the operations on a set. For example, φset(H2) = true, whereas φset(H3) = false for the histories from Example 2.

Remark 1. In the upcoming decision procedure (§ 4), every φ is an executable specification. Informally, this is achieved by ‘replaying’ all operations in a sequential history S in the order in which they appear in S. If in any step the output deviates from the expected result, the executable specification returns false; otherwise, if it reaches the end of S, it returns true.
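Such an executable specification for φset might be sketched as follows (the triple encoding of operations is our own assumption, not the paper's):

```python
def phi_set(seq_history):
    # Executable specification of a sequential set (Remark 1): replay each
    # operation against a model set and return False on the first deviation.
    # Operations are (name, key, expected_result) triples.
    model = set()
    for name, key, expected in seq_history:
        if name == "insert":
            actual = key not in model  # succeeds iff key was absent
            model.add(key)
        elif name == "remove":
            actual = key in model      # succeeds iff key was present
            model.discard(key)
        elif name == "contains":
            actual = key in model
        else:
            return False               # unknown operation
        if actual != expected:
            return False
    return True

# H2 from Example 2 is phi_set-sequential, H3 is not:
H2 = [("remove", 1, False), ("insert", 1, True), ("contains", 1, True)]
H3 = [("insert", 1, True), ("remove", 1, False), ("contains", 1, True)]
assert phi_set(H2) and not phi_set(H3)
```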

The next definition will be key to answering which calls may be reordered in a history in order to satisfy a specification.

Definition 4 (Happens-before). Given a history H, the happens-before relation is defined to be a partial order <H over calls e and e′ such that e <H e′ whenever e’s matching return, denoted by ret(e), precedes e′ in H, i.e. ret(e) ≤H e′. We say that two calls e and e′ happen concurrently whenever e ≮H e′ and e′ ≮H e.


Example 4. For the history H1 in Fig. 1, we get:

– call1 <H1 call3 and call2 <H1 call3, i.e. call1 and call2 happen-before call3;
– call1 ≮H1 call2 and call2 ≮H1 call1, i.e. call1 and call2 happen concurrently.
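The happens-before relation of Definition 4 can be computed directly from the event sequence; a small sketch (ours, not the authors' code), assuming distinct event identifiers:

```python
def happens_before(h):
    # Compute <H from Definition 4 over a complete history of (kind, n)
    # events: m happens-before n iff retm precedes calln in the sequence.
    pos = {e: i for i, e in enumerate(h)}
    ids = [n for kind, n in h if kind == "call"]
    return {(m, n) for m in ids for n in ids
            if m != n and pos[("ret", m)] < pos[("call", n)]}

# H1 from Fig. 1 reproduces Example 4:
H1 = [("call", 1), ("call", 2), ("ret", 1), ("ret", 2), ("call", 3), ("ret", 3)]
hb = happens_before(H1)
assert (1, 3) in hb and (2, 3) in hb            # call1, call2 happen-before call3
assert (1, 2) not in hb and (2, 1) not in hb    # call1 and call2 are concurrent
```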

Note that a history H is sequential if and only if <H is a total order. More generally, <H is an interval order [6]: for every x, y, u, v in H, if x <H y and u <H v, then x <H v or u <H y. Observe that a partial order 〈P, ≤〉 is an interval order if and only if no restriction of 〈P, ≤〉 is isomorphic to the following Hasse diagram [7]: two disjoint two-element chains, i.e. four elements x, y, u, v where x < y and u < v are the only strict order relationships.

Put differently, this paper is about a decision procedure (§ 4) that concerns a certain class of partial orders. The decision problem rests on the next definition:

Definition 5 (Linearizability). Let φ be a specification. A φ-sequential history is a sequential history H that satisfies φ(H). A history H is linearizable with respect to φ if it can be extended to a complete history H′ (by appending zero or more returns) and there is a φ-sequential history S with the same obj and op functions as H′ such that

L1 H′ and S are equal when seen as two sets of calls and returns;
L2 <H ⊆ <S, i.e. for all calls e, e′ in H, if e happens-before e′, the same is true in S.

Informally, extending H to H′ means that all pending operations have completed. This paper therefore considers only complete histories. This is fully justified under our stated assumption (§ 1) that the concurrent system is deadlock-free [5]. Condition L1 means that H′ and S are identical if we disregard the order in which calls and returns occur in both sequences. Condition L2 says that the happens-before relation between calls in H must be preserved in S.

Example 5. Recall Example 3. Then H1 in Fig. 1 is linearizable with respect to φset because H2 is a witness for a φset-sequential history that respects the happens-before relation <H1 detailed in Example 4. In particular, call1 <H1 call3 and call2 <H1 call3 cannot be reordered.

3 P-compositionality

In this section, we introduce P-compositionality. We illustrate our new partitioning scheme in Examples 7–9.

Definition 6 (P-compositionality). Let P be a function that maps a history H to a non-trivial partition of H, i.e. P satisfies P(H) ≠ {H}. A specification φ is called P-compositional whenever any history H is linearizable with respect to φ if and only if, for every history H′ ∈ P(H), H′ is linearizable with respect to φ. When this equivalence holds we speak of P-compositionality.


In the following examples, we assume that the partitions are non-trivial. The first example illustrates that the locality principle [1] is an instance of P-compositionality.

Example 6. Denote with Obj the set of objects. Let φ be a specification for all objects in Obj. Let PObj be the function that maps every history H to the set of histories H where each sub-history H′ ∈ H is the restriction of H to an object in Obj. Then PObj(H) is a partition of H. By the locality principle [1], a history H is linearizable with respect to φ if and only if, for all Hobj ∈ PObj(H), Hobj is linearizable with respect to φ. Therefore φ is a PObj-compositional specification.

The remaining examples show that P-compositionality strictly generalizes the locality principle because P-compositionality can partition a history even if the implementation details or constituent parts (i.e. objects) of a concurrent system are unknown. For example, there are at least eight different implementations of concurrent sets (Table 2), but we do not need to know the objects (e.g. registers, buckets) of which such implementations consist in order to partition one of their histories. This is in contrast to the locality principle where such knowledge is required. Put differently, P-compositionality is all about the interface of a concurrent data type, whereas the locality principle hinges on the implementation details of such an interface.

Example 7. Reconsider φset, the specification of a set from Example 3, where all operations have the form insert(k), remove(k) and contains(k) for some k. Let Pset be the function that partitions every history H according to such k. Since the ‘insert’, ‘remove’ and ‘contains’ operations on a single set object are linearizable if and only if the restriction to each k is linearizable, φset is a Pset-compositional specification of a set.
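A Pset-style partitioning can be sketched by grouping events by key while preserving their relative order (our own illustrative code, not the tool's implementation; the event ids and key table below are hypothetical):

```python
from collections import defaultdict

def partition_by_key(h, key_of):
    # Split a history into per-key sub-histories, preserving the relative
    # order of events. key_of maps an event id to the key it operates on.
    parts = defaultdict(list)
    for kind, n in h:
        parts[key_of(n)].append((kind, n))
    return dict(parts)

# Events 1 and 2 operate on key 0 and event 3 on key 1.
key = {1: 0, 2: 0, 3: 1}
H = [("call", 1), ("ret", 1), ("call", 2), ("call", 3), ("ret", 2), ("ret", 3)]
parts = partition_by_key(H, key.get)
assert parts[0] == [("call", 1), ("ret", 1), ("call", 2), ("ret", 2)]
assert parts[1] == [("call", 3), ("ret", 3)]
```

Each sub-history can then be checked for linearizability on its own.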

Similarly, there exists a Pmap-compositional specification for concurrent unordered maps where every history is partitioned by each key k.

Example 8. Consider a concurrent array. Like its sequential counterpart, a concurrent array can only be read or written at a particular array index. Let Parray be the function that partitions a history based on such array indexes. This gives a Parray-compositional specification of an array.

Example 9. Consider a concurrent stack where each pop and push operation also returns the height of the stack before it is modified. Among other things, the return value can be used to determine whether the operation has succeeded. For example, if stack.pop returns zero, we know the pop operation was unsuccessful (and the popped element is undefined) because the stack was empty at the time the operation was called. We can use the returned height to partition a history such that a concurrent stack is linearizable if and only if each partition is linearizable. This way we get a Pstack-compositional specification of a stack.

Intuitively, the reason why the previous specifications are P-compositional is because all operations in one partition are, informally speaking, unaffected


by all operations in every other partition. For example, the return value of set.insert(k) is unaffected by set.insert(k′), set.remove(k′) and set.contains(k′) for k ≠ k′. This clearly, however, has its limitations. For example, a ‘size’ operation that returns the number of elements in a concurrent collection data type cannot be generally partitioned this way.

Note that all these examples have in common that their P-compositional specifications can be expressed as a conjunction of specifications that each partition a history. For example, φset = ⋀k∈K φset(k), where φset(k) for every k is a sequential specification that only concerns operations on k, e.g. set.insert(k).

Next, we show how to leverage the concept of P-compositionality to more efficiently find linearizability bugs.

4 Decision Procedure

In this section, we explain our linearizability checking algorithm, which decides whether a history is linearizable with respect to some P-compositional specification (Definition 6). The novelty of our decision procedure is Algorithm 3, which leverages P-compositionality. In the next section (§ 5), we experimentally evaluate the effectiveness of Algorithm 3.

Since we base our work on the WGL algorithm (recall § 1), we use the following data structures to represent the input to the decision procedure:

1. The specification (Definition 3) is modelled by a persistent data structure, e.g. [8]. Most standard data types in functional programming languages can be almost directly used this way. For instance, the specification of a set can be modelled through an immutable sequential set.

2. A history (Definition 1), in turn, is represented by a doubly-linked list of so-called entries. Consequently, each entry e has an e.next and e.prev field that point to the next and previous entry, respectively. In addition, each entry e has a match field, and we say that e is a call entry exactly if e.match ≠ null; otherwise, e is called a return entry. Given a call entry e, e.match corresponds to the matching return entry of e. This linked-list data structure therefore aligns directly with the usual definition of history (Definition 1).
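The entry structure and the constant-time lift/unlift updates of Algorithm 2 can be rendered in Python as follows (a sketch of the pseudocode; the `link` and `to_list` helpers are our own additions for demonstration):

```python
class Entry:
    """Doubly-linked history entry: match points from a call entry to its
    matching return entry and is None for return entries."""
    def __init__(self, name):
        self.name = name
        self.prev = None
        self.next = None
        self.match = None

def lift(entry):
    # Unlink the call entry and its matching return (Algorithm 2, LIFT).
    # The lifted entries keep their prev/next fields, which is what makes
    # unlift a constant-time operation.
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    match = entry.match
    match.prev.next = match.next
    if match.next is not None:
        match.next.prev = match.prev

def unlift(entry):
    # Re-link the saved pointers in reverse order (Algorithm 2, UNLIFT).
    match = entry.match
    match.prev.next = match
    if match.next is not None:
        match.next.prev = match
    entry.prev.next = entry
    entry.next.prev = entry

def link(entries):
    # Helper (ours): prepend a sentinel head and doubly-link the entries.
    head = Entry("head")
    prev = head
    for e in entries:
        prev.next, e.prev = e, prev
        prev = e
    return head

def to_list(head):
    # Helper (ours): the entry names reachable from the sentinel, in order.
    out, e = [], head.next
    while e is not None:
        out.append(e.name)
        e = e.next
    return out
```

A sentinel head entry ensures that entry.prev always exists, and a call entry always precedes its return, so entry.next exists as well; only match.next needs a null check.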

The idea behind the WGL algorithm (Algorithm 1) is threefold: it keeps track of provisionally linearized call entries in a stack; it uses the stack to backtrack if necessary; and it caches already seen configurations. We briefly explain each idea in turn. Denote the stack of call entries by calls. Given a history H, the height of calls is at most half of H’s length, i.e. |calls| ≤ 0.5 × |H| = N. Note that there is no rounding involved because |H| is always even, since every call entry has a matching return entry. The height of the stack grows only if a call entry can be linearized (line 5). When the stack grows or shrinks, the history is modified (lines 13 and 23) by the LIFT and UNLIFT procedures (Algorithm 2). We remark that the workings of both procedures are illustrated by Example 10. If no further call entries can be linearized but the stack is nonempty, the algorithm backtracks and tries the next possible call entry (lines 18–24). The backtracking


Algorithm 1. WGL linearizability checker [5]

Require: head entry is such that head entry.next points to the beginning of history H.
Require: N = 0.5 × |H| is half of the total number of entries reachable from head entry.
Require: linearized is a bitset (array of bits) such that linearized[k] = 0 for all 0 ≤ k < N.
Require: For all entries e in H, 0 ≤ entry id(e) < N.
Require: For all entries e and e′ in H, if entry id(e) = entry id(e′), then e = e′.
Require: cache is an empty set and calls is an empty stack.
1:  while head entry.next ≠ null do
2:    if entry.match ≠ null then                     ▷ Is call entry?
3:      〈is linearizable, s′〉 ← apply(entry, s)      ▷ Simulate entry’s operation
4:      cache′ ← cache                               ▷ Copy set
5:      if is linearizable then
6:        linearized′ ← linearized                   ▷ Copy bitset
7:        linearized′[entry id(entry)] ← 1           ▷ Insert entry id(entry) into bitset
8:        cache ← cache ∪ {〈linearized′, s′〉}        ▷ Update configuration cache
9:      if cache′ ≠ cache then
10:       calls ← push(calls, 〈entry, s〉)            ▷ Provisionally linearize call entry and state
11:       s ← s′                                     ▷ Update state of persistent data type
12:       linearized[entry id(entry)] ← 1            ▷ Keep track of linearized entries
13:       LIFT(entry)                                ▷ Provisionally remove the entry from the history
14:       entry ← head entry.next                    ▷ Continue search in shortened history
15:     else                                         ▷ Cannot linearize call entry
16:       entry ← entry.next                         ▷ Continue search in unmodified history
17:   else                                           ▷ Handle return entry
18:     if is empty(calls) then
19:       return false                               ▷ Cannot linearize entries in history
20:     〈entry, s〉 ← top(calls)                      ▷ Revert to earlier state
21:     linearized[entry id(entry)] ← 0
22:     calls ← pop(calls)
23:     UNLIFT(entry)                                ▷ Undo provisional linearization
24:     entry ← entry.next
25: return true

points depend on the return value of apply(entry, s) and the cache. The former (line 3) models the specification φ: by Remark 1, it determines whether entry can be applied to the current state s of a persistent data type. The latter (lines 4–8) is an optimization due to Lowe [5] that prunes the search space by memoizing already seen configurations which are known to be non-linearizable. More accurately, each configuration is a pair that consists of a set of unique call entry identifiers and a state of the persistent data structure. The intuition behind pruning already seen configurations is that only one of two permutations of operations on a concurrent data type needs to be considered if they lead to an identical state [5]. We remark that the total correctness of the WGL algorithm follows from Wing and Gong’s total correctness argument [4].
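To make the search concrete, the following much-simplified sketch backtracks over the calls that are minimal in the happens-before order, without Lowe's caching, the entry linked list, or the stack-based undo (so it is only in the spirit of the WG algorithm, not a faithful rendering; the set specification table is our own):

```python
def linearizable(h, apply_op, init_state):
    """Backtracking search: try each call that is minimal in the
    happens-before order, apply it to the state of the sequential
    specification, and backtrack on failure. h is a complete history of
    (kind, id) events with distinct ids; apply_op(op_id, state) returns
    (ok, new_state). Worst-case exponential, as expected for an
    NP-complete problem."""
    def minimal_calls(rest):
        # The calls that may be linearized first are exactly the calls
        # that occur before the first return in the remaining history.
        out = []
        for kind, n in rest:
            if kind == "ret":
                break
            out.append(n)
        return out

    def search(rest, state):
        if not rest:
            return True
        for n in minimal_calls(rest):
            ok, state2 = apply_op(n, state)
            # Remove both the call and its matching return, then recurse.
            if ok and search([e for e in rest if e[1] != n], state2):
                return True
        return False

    return search(list(h), init_state)

# A set specification in the style of Remark 1; the op table is hypothetical.
ops = {1: ("insert", 1, True), 2: ("remove", 1, False),
       3: ("contains", 1, True), 4: ("remove", 1, True)}

def apply_set(n, state):
    name, key, expected = ops[n]
    if name == "insert":
        return (key not in state) == expected, state | {key}
    if name == "remove":
        return (key in state) == expected, state - {key}
    return (key in state) == expected, state
```

On H1 from Fig. 1 this returns true, matching Example 5: remove(1) : false must be linearized before insert(1) : true.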

Example 10. We illustrate the handling of entries in the history data structure. For this, consider the two histories in Fig. 2. In Fig. 2a, the entries satisfy the


following: call2.prev = call1, call2.next = call3 and call2.match = ret2, etc. Then LIFT(call2) (Algorithm 2) produces the history shown in Fig. 2b. Note that both call2 and ret2 are still valid entry pointers whose fields remain unchanged. This explains how UNLIFT(call2) reverts the change in constant time.

Algorithm 3 gives our partitioning scheme. This is an iterative algorithm that, given an entry in a history H and a positive integer n, partitions H starting from that entry into at most n separate sub-histories. The partitioning is controlled by the function partition : E → N from the set of call and return entries to the natural numbers.

Example 11. Consider the history in Fig. 2b. For all entries e in this history, let partition(e) = k where k is the integer argument of the operation. For example, partition(call3) = partition(ret3) = 1 because op(call3) = op(ret3) = ‘remove(1) : false’. Then the function PARTITION(call1) returns two disjoint sub-histories for the operations on ‘0’ and ‘1’, respectively:

call1 〈 set.insert(0) : true 〉 ret1   call2 〈 set.contains(0) : true 〉 ret2

and

call3 〈 set.remove(1) : false 〉 ret3.

Given a nonempty set of disjoint sub-histories returned by the PARTITION function (Algorithm 3), we invoke Algorithm 1 on each sub-history. It is not too difficult to implement sub-histories such that there is no sharing between them, and Algorithm 1 could therefore be run in parallel for each sub-history. Nevertheless, this addresses a challenging problem that was identified independently by Lowe [5] and Kingsbury [9].

Theorem 1. Let φ be a P-compositional specification and H be a history. Denote with head entry the entry that represents the beginning of H. Associate with each disjoint history Hk in partition P(H) a unique number 0 ≤ k < |P(H)| = n. If, for all Hk ∈ P(H) and e ∈ Hk, partition(e) = k, then H is linearizable with respect to φ if and only if Algorithm 1 returns true for every history in PARTITION(head entry, n).

We next experimentally quantify the benefits of the previous theorem.

[Figure: (a) a history with three intervals — call1 〈 set.insert(0) : true 〉 ret1, call2 〈 set.contains(0) : true 〉 ret2 and call3 〈 set.remove(1) : false 〉 ret3; (b) the same history with the call2/ret2 entries lifted out, leaving call1 〈 set.insert(0) : true 〉 ret1 and call3 〈 set.remove(1) : false 〉 ret3]

Fig. 2. After calling LIFT(call2) in history (2a), we get the history in (2b). UNLIFT(call2) reverts this change in constant time.


Algorithm 2. History modifications

1:  procedure LIFT(entry)
2:    entry.prev.next ← entry.next
3:    entry.next.prev ← entry.prev
4:    match ← entry.match
5:    match.prev.next ← match.next
6:    if match.next ≠ null then
7:      match.next.prev ← match.prev
8:
9:  procedure UNLIFT(entry)
10:   match ← entry.match
11:   match.prev.next ← match
12:   if match.next ≠ null then
13:     match.next.prev ← match
14:   entry.prev.next ← entry
15:   entry.next.prev ← entry

Algorithm 3. History partitioner

Require: n is a positive integer
Require: entries is an array of size n
1:  function PARTITION(entry, n)
2:    for 0 ≤ i < n do
3:      entries[i] ← null
4:    while entry ≠ null do
5:      i ← partition(entry) mod n
6:      if entries[i] ≠ null then
7:        entries[i].next ← entry
8:      next entry ← entry.next
9:      entry.prev ← entries[i]
10:     entry.next ← null
11:     entries[i] ← entry
12:     entry ← next entry
13:   return entries

5 Implementation and Experiments

In this section, we discuss and experimentally evaluate our implementation of the decision procedure (§ 4). As an exemplar of P-compositionality, our experiments use Intel’s TBB library and Lowe’s implementations of concurrent sets.

5.1 Implementation

The implementation details of an NP-complete decision procedure matter, especially for our experimental evaluation of P-compositionality. We particularly consider hashing and cache eviction options because these were not studied in previous implementations of the WG-based algorithms [4,5].

For experimental robustness, we implemented our linearizability checker in C++11 [10] because this language has built-in concurrency support while allowing us to rule out interference from managed runtime environments (e.g. the JVM) due to garbage collection etc. The choice of language, though, meant that we had to implement persistent data structures from scratch. In doing so, we focused on optimizing equality checks for our specific purposes. This way, we managed to avoid a known performance bottleneck in Lowe’s implementation of the WGL algorithm [5], where the cost of equality checks had to be compensated with an additional union-find data structure. Another optimization in our implementation is a constant-time (instead of linear-time) hash function for bitsets, where we exploit the fact that the bitwise XOR operator over fixed-size bit vectors forms an abelian group. This optimization turns out to be important when histories are longer than 8 K, cf. [5]. To see this, consider the computational steps for retrieving a configuration from the cache and updating it (line 8 in Algorithm 1). For example, a history of length 2^16 means that each bitset in a configuration is at least 3 KiB, and so a constant-time hash function can


Table 1. Experimental results for three variants of the same linearizability checker. The results for the baseline are reported in the WGL column. The rows correspond to benchmarks drawn from Intel’s TBB library and Lowe’s implementations of concurrent sets (see Table 2 for mnemonics).

Benchmark       WGL                          WGL+LRU                      WGL+P
                Time   Memory     Timeout    Time   Memory    Timeout    Time   Memory    Timeout
TBB             101 s  9792 MiB   0%         11 s   670 MiB   0%         6 s    672 MiB   0%
CRLSL           20 s   15738 MiB  0%         25 s   678 MiB   0%         6 s    400 MiB   0%
CRLFSL          14 s   15029 MiB  0%         18 s   678 MiB   0%         5 s    401 MiB   0%
FGL             16 s   14297 MiB  0%         81 s   678 MiB   0%         5 s    401 MiB   0%
LLL             23 s   16494 MiB  0%         94 s   678 MiB   0%         6 s    401 MiB   0%
LSL             20 s   15736 MiB  0%         25 s   678 MiB   14%        6 s    401 MiB   0%
LFLL            11 s   11847 MiB  0%         15 s   678 MiB   0%         5 s    402 MiB   0%
LFSL            14 s   14712 MiB  0%         18 s   678 MiB   0%         5 s    401 MiB   0%
LFSLF0          14 s   13125 MiB  0%         18 s   678 MiB   0%         5 s    402 MiB   0%
LFSLF1          < 1 s  404 MiB    0%         < 1 s  407 MiB   0%         < 1 s  402 MiB   0%
OPTIMIST        16 s   13818 MiB  0%         54 s   678 MiB   9%         5 s    401 MiB   0%

make a measurable difference when the cache is frequently accessed. In fact, it is not uncommon for the cache to contain more than 27 K of such configurations. For this reason, we also implemented a least-recently-used (LRU) cache eviction feature that can optionally be enabled at compile time. The effects of the LRU cache will be evaluated shortly.
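The constant-time hash can be sketched as follows (our own reconstruction of the idea, not the paper's C++ code): give each entry a fixed pseudo-random 64-bit tag and maintain the XOR of the tags of all linearized entries. Since XOR over fixed-size bit vectors forms an abelian group in which every element is its own inverse, both insertion and removal are a single O(1) update, and the hash is independent of insertion order.

```python
import random

class XorHash:
    """Order-independent, incrementally updated hash of a set of entry ids:
    the hash is the XOR of fixed pseudo-random 64-bit tags, one per id."""
    def __init__(self, n, seed=0):
        rng = random.Random(seed)
        self.tag = [rng.getrandbits(64) for _ in range(n)]
        self.value = 0

    def toggle(self, entry_id):
        # Insert if absent, remove if present: the same O(1) XOR either way,
        # because every tag is its own inverse under XOR.
        self.value ^= self.tag[entry_id]

h = XorHash(n=8)
h.toggle(3); h.toggle(5)
snapshot = h.value
h.toggle(3); h.toggle(5)   # removing both members restores the empty hash
assert h.value == 0
h.toggle(5); h.toggle(3)   # insertion order does not matter
assert h.value == snapshot
```

By contrast, rehashing the whole bitset on every cache update would cost time linear in N.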

Overall, our implementation and experimental setup is around 4 K lines of code, including several dozen unit tests. All the code and benchmarks are publicly available in our source code repository.2

5.2 TBB and Concurrent Set Experiments

For the experimental evaluation of our partitioning scheme, we collected over 700 histories from nine different implementations of concurrent sets by Lowe [5] and the concurrent unordered set implementation in Intel’s TBB library.3 We performed all experiments on a 64-bit machine running GNU/Linux 3.17 with 12 Intel Xeon 2.4 GHz cores and 94 GB of main memory.

Each history is generated by running 4 concurrent threads that pseudo-randomly invoke operations on a single shared concurrent set. The argument of each operation is a pseudo-random, uniformly distributed integer between 0

2 https://github.com/ahorn/linearizability-checker
3 https://www.threadingbuildingblocks.org/


(inclusive) and 24 (exclusive). Each thread invokes 70 K such operations. Note that this is significantly more than in previous experiments, where each process is limited to 2^13 ≈ 8 K operations [5]. In total, since every call generates a pair of entries, every history H in our benchmarks has length |H| = 4 × 2 × 70 K = 560 K. We discuss the experimental results using Intel’s TBB library and Lowe’s concurrent set implementations in turn.

The experimental results are given in Table 1. Each of the three main columns corresponds to one variant of the same linearizability checker: ‘WGL’ is the baseline, ‘WGL+LRU’ is the WGL algorithm with LRU cache eviction enabled (§ 5.1), and ‘WGL+P’ is the WGL algorithm combined with our partitioning algorithm (Algorithm 3 in § 4). We tried to use the WG algorithm [4] without the extension by Lowe [5], but WG times out on the majority of benchmarks. We therefore do not report the results on the WG algorithm and focus on WGL, WGL+LRU and WGL+P. The meaning of the sub-columns is as follows. The ‘Time’ and ‘Memory’ columns give the average of the elapsed time and virtual memory usage, respectively. These averages exclude runs that we had to terminate after 1 hour. The percentage of such terminated runs is given in the ‘Timeout’ column. In each row, all variants are compared with respect to the same benchmark data. We therefore do not report confidence intervals.

The TBB benchmark corresponds to the first row in Table 1 and consists of a total of 100 histories. Table 1 clearly shows that the WGL+P algorithm is at least one order of magnitude faster compared to the baseline. We also see that enabling the LRU cache eviction decreases the memory footprint by at least one order of magnitude, approximately 10 GiB versus 700 MiB. In fact, the runtime performance of WGL+LRU is almost one order of magnitude faster than the baseline. The WGL+P algorithm is at least as fast and almost as space efficient as WGL+LRU. In the experiments with Lowe’s implementations of concurrent sets (see next paragraph), we further investigate the effect of the LRU cache eviction feature and how it compares to the partitioning scheme.

We give Lowe’s implementations of concurrent sets mnemonics (Table 2) that identify the remaining ten benchmarks in Table 1. Each of these ten benchmarks comprises between 50 and 100 histories, with an average of 70 histories per benchmark. To avoid bias, we collected these using Lowe’s tool. The significance of the experimental results in Table 1 is twofold. Firstly, they show that on average, WGL+P is three times faster than WGL, and WGL+P consumes one order of magnitude less space than WGL. Secondly, and more crucially, however, these experiments reveal that WGL+LRU is not as efficient as WGL+P, in neither time nor space. For example, for WGL+LRU the average elapsed time of the FGL and LLL benchmarks is 81 s and 94 s, respectively, with an average memory usage of 678 MiB in both cases. By contrast, WGL+P achieves an average runtime of less than 7 s (and so WGL+P is one order of magnitude faster than WGL+LRU) and consumes even less memory on average (401 MiB) than WGL+LRU. The higher average runtime of WGL+LRU in the FGL benchmark is due to a single check that took several orders of magnitude longer (3068 s) than the remaining checks (20 s on average when the 3068 s outlier is excluded).


Table 2. Mnemonics for Lowe’s implementation of concurrent sets [5]

Benchmark name                             Mnemonic
collision resistance lazy skip list        CRLSL
collision resistance lock-free skip list   CRLFSL
fine-grained lock                          FGL
lazy linked-list                           LLL
lazy skip list                             LSL
lock-free linked-list                      LFLL
lock-free skip list                        LFSL
lock-free skip list faulty (bad hash)      LFSLF0
lock-free skip list faulty (good hash)     LFSLF1
optimistic lock                            OPTIMIST

In the LLL benchmark there are two such outliers (2201 s and 675 s, whereas the other checks average 27 s). The observed difference between WGL+LRU and WGL+P is even more pronounced in both the LSL and OPTIMIST benchmarks, where the LRU cache eviction causes 14% and 9% of runs to time out, whereas the WGL+P algorithm always runs to completion in less than a few seconds.

This experimentally confirms that WGL+P is one order of magnitude faster as well as more space efficient than the baseline, and that WGL+P consumes even less space than our WGL+LRU implementation.

6 Related Work

Linearizability is related to the concept of atomicity, including weaker forms such as k-atomicity [11]. An important difference is that atomicity is typically not defined in terms of a sequential specification, e.g. [12]. The theoretical limitations of automatically verifying linearizability are well understood. Of course, the problem is generally undecidable [13]. In fact, even checking finite-state implementations against atomic specifications, provided the number of program threads is bounded, is in EXPSPACE [14]. And the best known lower bound for this problem is PSPACE-hardness. This explains the restrictions in this paper and its focus on runtime verification instead.

The literature on machine-assisted techniques for checking linearizability can be broadly divided into simulation-based methods (e.g. [15,16]), model checking (e.g. [17,18,19,20]), static analysis (e.g. [21,22,23,24]) and fully automatic testing (e.g. [4,25,26,27,28,29,30,5]). The simulation-based methods have been used by experts to mechanically verify simple fine-grained and lock-free implementations. Model checking requires less expertise but is typically limited to very small programs and a small number of threads due to the state explosion problem. By contrast, static analysis tools aim to prove correctness with respect to an unbounded number of threads. In general, these techniques are necessarily incomplete and require the user to supply linearization points and/or invariants. Vafeiadis [24] proposes a more automatic form of static analysis that works well on simpler concurrent data types such as stacks, but reportedly not so well on data types that have more complicated invariants, including the CAS-based and lazy concurrent sets extensively studied in our experiments.


Faster Linearizability Checking via P-Compositionality 63

Our work is most closely related to linearizability testing techniques that are precise, fully automatic and necessarily incomplete, e.g. [4,25,26,27,28,29,30,5]. We focus our discussion on tools that do not require the notion of commit points, cf. [31]. The work in [25,30] checks k-atomicity with a polynomial-time algorithm, assuming that each write to a register assigns a distinct value. By contrast, we solve a more general NP-complete problem of which k-atomicity is an instance. The tool in [26] analyzes code that uses concurrent collection data types such as maps. To make the analysis scale, the authors assume that the collection data types are linearizable, whereas our tool could be used to check such an assumption. A different tool [27] requires programmers to annotate concurrent implementations with so-called state summary functions that act as a form of specification. Our approach is more modular because it strictly separates the concurrent implementation from its specification. By contrast, [28] works without the programmer having to provide a sequential specification. As a result, however, the tool can only find linearizability violations when an exception is thrown or a deadlock occurs. Subsequent work [29] circumvents this, in the context of object-oriented programs, by considering the special case of a superclass serving as an executable, possibly non-deterministic, specification for all its subclasses. The fact that the superclass can be non-deterministic may explain why even checks of two threads can take a significant amount of time (e.g. 108 min) despite the fact that each concurrent test considers only two possible linearizations [29]. By contrast, the WGL algorithm [4,5], on which our decision procedure is based (§ 4), is significantly faster but limited to deterministic specifications. Crucially, our experiments (§ 5) with P-compositional specifications show a significant improvement over the WGL algorithm.

7 Concluding Remarks

We have presented a precise, fully automatic runtime verification technique for finding linearizability bugs in implementations of concurrent data types that are expected to satisfy a P-compositional specification. Our experiments show that our partitioning scheme improves the WGL algorithm [4,5] by one order of magnitude, in both time and space. An additional strength of our technique is that it is applicable to any linearizability checker. Our work assumes, however, that the specification is P-compositional. This is not always the case, and it would therefore be interesting to further generalize P-compositionality, perhaps with a less modular partitioning scheme that can make more assumptions about the underlying decision procedure.

Acknowledgements. We would like to thank Gavin Lowe, Kyle Kingsbury and Alexey Gotsman for invaluable discussions.


References

1. Herlihy, M.P., Wing, J.M.: Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

2. Gilbert, S., Lynch, N.: Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2), 51–59 (2002)

3. Gibbons, P.B., Korach, E.: Testing shared memories. SIAM J. Comput. 26(4), 1208–1244 (1997)

4. Wing, J.M., Gong, C.: Testing and verifying concurrent objects. J. Parallel Distrib. Comput. 17(1-2), 164–182 (1993)

5. Lowe, G.: Testing for linearizability. In: PODC 2015 (2015) (under submission), http://www.cs.ox.ac.uk/people/gavin.lowe/LinearizabiltyTesting/

6. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Tractable refinement checking for concurrent objects. In: POPL 2015, pp. 651–662. ACM (2015)

7. Rabinovitch, I.: The dimension of semiorders. Journal of Combinatorial Theory, Series A 25(1), 50–61 (1978)

8. Okasaki, C.: Purely Functional Data Structures. Cambridge University Press (1998)

9. Kingsbury, K.: Computational techniques in Knossos (May 2014), https://aphyr.com/posts/314-computational-techniques-in-knossos

10. ISO: International Standard ISO/IEC 14882:2011(E) Programming Language C++. International Organization for Standardization (2011)

11. Aiyer, A., Alvisi, L., Bazzi, R.A.: On the availability of non-strict quorum systems. In: Fraigniaud, P. (ed.) DISC 2005. LNCS, vol. 3724, pp. 48–62. Springer, Heidelberg (2005)

12. Wang, L., Stoller, S.D.: Static analysis of atomicity for programs with non-blocking synchronization. In: PPoPP 2005, pp. 61–71. ACM (2005)

13. Bouajjani, A., Emmi, M., Enea, C., Hamza, J.: Verifying concurrent programs against sequential specifications. In: Felleisen, M., Gardner, P. (eds.) ESOP 2013. LNCS, vol. 7792, pp. 290–309. Springer, Heidelberg (2013)

14. Alur, R., McMillan, K., Peled, D.: Model-checking of correctness conditions for concurrent objects. Inf. Comput. 160(1-2), 167–188 (2000)

15. Colvin, R., Doherty, S., Groves, L.: Verifying concurrent data structures by simulation. Electron. Notes Theor. Comput. Sci. 137(2), 93–110 (2005)

16. Derrick, J., Schellhorn, G., Wehrheim, H.: Mechanically verified proof obligations for linearizability. ACM Trans. Program. Lang. Syst. 33(1), 4:1–4:43 (2011)

17. Vechev, M., Yahav, E., Yorsh, G.: Experience with model checking linearizability. In: Pasareanu, C.S. (ed.) SPIN 2009. LNCS, vol. 5578, pp. 261–278. Springer, Heidelberg (2009)

18. Burckhardt, S., Dern, C., Musuvathi, M., Tan, R.: Line-up: A complete and automatic linearizability checker. SIGPLAN Not. 45(6), 330–340 (2010)

19. Cerny, P., Radhakrishna, A., Zufferey, D., Chaudhuri, S., Alur, R.: Model checking of linearizability of concurrent list implementations. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 465–479. Springer, Heidelberg (2010)

20. Liu, Y., Chen, W., Liu, Y.A., Sun, J., Zhang, S.J., Dong, J.S.: Verifying linearizability via optimized refinement checking. IEEE Trans. Softw. Eng. 39(7), 1018–1039 (2013)

21. Amit, D., Rinetzky, N., Reps, T., Sagiv, M., Yahav, E.: Comparison under abstraction for verifying linearizability. In: Damm, W., Hermanns, H. (eds.) CAV 2007. LNCS, vol. 4590, pp. 477–490. Springer, Heidelberg (2007)

22. Berdine, J., Lev-Ami, T., Manevich, R., Ramalingam, G., Sagiv, M.: Thread quantification for concurrent shape analysis. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 399–413. Springer, Heidelberg (2008)

23. Vafeiadis, V.: Shape-value abstraction for verifying linearizability. In: Jones, N.D., Muller-Olm, M. (eds.) VMCAI 2009. LNCS, vol. 5403, pp. 335–348. Springer, Heidelberg (2009)

24. Vafeiadis, V.: Automatically proving linearizability. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 450–464. Springer, Heidelberg (2010)

25. Anderson, E., Li, X., Shah, M.A., Tucek, J., Wylie, J.J.: What consistency does your key-value store actually provide? In: HotDep 2010, pp. 1–16. USENIX Association (2010)

26. Shacham, O., Bronson, N., Aiken, A., Sagiv, M., Vechev, M., Yahav, E.: Testing atomicity of composed concurrent operations. SIGPLAN Not. 46(10), 51–64 (2011)

27. Fonseca, P., Li, C., Rodrigues, R.: Finding complex concurrency bugs in large multi-threaded applications. In: EuroSys 2011, pp. 215–228. ACM (2011)

28. Pradel, M., Gross, T.R.: Fully automatic and precise detection of thread safety violations. SIGPLAN Not. 47(6), 521–530 (2012)

29. Pradel, M., Gross, T.R.: Automatic testing of sequential and concurrent substitutability. In: ICSE 2013, pp. 282–291. IEEE Press (2013)

30. Golab, W., Hurwitz, J., Li, X.S.: On the k-atomicity-verification problem. In: ICDCS 2013, pp. 591–600. IEEE Computer Society (2013)

31. Elmas, T., Tasiran, S., Qadeer, S.: VYRD: Verifying concurrent programs by runtime refinement-violation detection. SIGPLAN Not. 40(6), 27–37 (2005)


Translation Validation for Synchronous Data-Flow Specification in the SIGNAL Compiler

Van Chan Ngo (✉), Jean-Pierre Talpin, and Thierry Gautier

INRIA, 35042 Rennes, France
{chan.ngo,jean-pierre.talpin,thierry.gautier}@inria.fr

Abstract. We present a method to construct a validator based on the translation validation approach to prove the value-equivalence of variables in the compilation of the Signal compiler. The computation of output signals in a Signal program and their counterparts in the generated C code is represented by a Synchronous Data-flow Value-Graph (Sdvg). Our validator proves that every output signal and its counterpart variable have the same values by transforming the Sdvg graph.

Keywords: Value-Graph · Graph transformation · Formal verification · Translation validation · Certified compiler · Synchronous programs

1 Introduction

Motivation. A compiler is a large and very complex program which often consists of hundreds of thousands, if not millions, of lines of code, and is divided into multiple sub-systems and modules. In addition, each compiler implements a particular algorithm in its own way. This results in two main drawbacks regarding the formal verification of the compiler itself. First, constructing the specifications of the actual compiler implementation is a long and tedious task. Second, the correctness proof of a compiler implementation, in general, cannot be reused for another compiler.

To deal with these drawbacks of formally verifying the compiler itself, one can instead prove that the source program and the compiled program are semantically equivalent, which is the approach of translation validation [13,12,5]. The principle of translation validation is as follows: the source and the compiled programs are represented in a common semantics. Based on the representations of the input and compiled programs, the notion of “correct transformation” is formalized. An automated proof method is provided to generate the proof scripts in case the compiled program correctly implements the input program. Otherwise, it produces a counter-example.

In this work, to adopt the translation validation approach, we use a value-graph as a common semantics to represent the computation of variables in the source and compiled programs. The “correct transformation” is defined by the assertion that every output variable in the source program and the corresponding variable in the compiled program have the same values.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 66–80, 2015.
DOI: 10.1007/978-3-319-19195-9_5


The Language. Signal [3,7] is a synchronous data-flow language that allows the specification of multi-clocked systems. Signal handles unbounded sequences of typed values (x(t))t∈N, called signals, denoted by x. Each signal is implicitly indexed by a logical clock indicating the set of instants at which the signal is present, noted Cx. At a given instant, a signal may be present, where it holds a value, or absent, where it holds no value (denoted by ⊥). Two signals are synchronous iff they have the same clock. In Signal, a process (written P or Q) consists of the synchronous composition, noted |, of equations over signals x, y, z, written x := y op z or x := op(y, z), where op is an operator. Naturally, equations and processes are concurrent.
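The present/absent discipline can be modelled directly. The sketch below is an illustrative C model of our own (not part of Signal or its compiler): an instant of an integer signal either carries a value or is ⊥, and a stepwise operation such as addition is defined only when all of its operands are present.

```c
#include <assert.h>
#include <stdbool.h>

/* One instant of an integer signal: either present with a value,
 * or absent (the ⊥ of the paper). */
struct instant { bool present; int value; };

static const struct instant ABSENT = { false, 0 };

/* Stepwise application y := x1 + x2: defined only when both operands are
 * present (x1 and x2 must be synchronous for this to be well-clocked). */
static struct instant add_instant(struct instant a, struct instant b) {
    if (!a.present || !b.present) return ABSENT;
    return (struct instant){ true, a.value + b.value };
}
```

In the real language the clock calculus guarantees statically that both operands of a stepwise function are present at the same instants; here the check is merely dynamic.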

Contribution. A Sdvg symbolically represents the computation of the output signals in a Signal program and their counterparts in its generated C code. The same structures are shared in the graph, meaning that they are represented by the same subgraphs. Suppose that we want to show that an output signal and its counterpart have the same values. To do so, we simply check that they are represented by the same subgraph, meaning that they label the same node. We realize this check by transforming the graph using rewrite rules, a process called normalizing.

Let A and C be the source program and its generated C code, respectively. Cp denotes the unverified Signal compiler, which compiles A into C = Cp(A) or a compilation error. We now associate Cp with a validator checking that any output signal x in A and the corresponding variable xc in C have the same values (denoted by x = xc). We denote this fact by C ⊑val A.

if (Cp(A) is Error) return Error;
else {
  if (C ⊑val A) return C;
  else return Error;
}

The main components of the validator are depicted in Fig. 1. It works as follows. First, a shared value-graph that represents the computation of all signals and variables in both programs is constructed. The value-graph can be considered a generalization of symbolic evaluation. Then, the shared value-graph is transformed by applying graph rewrite rules (the normalization). The set of rewrite rules reflects the general rules of inference of operators, or the optimizations of the compiler. For instance, consider the 3-node subgraph representing the expression (1 > 0): the normalization will transform this graph into a single-node subgraph representing the value true, reflecting constant folding. Finally, the validator compares the values of the output signals and the corresponding variables in the C code. For every output signal and its corresponding variable, the validator checks whether they point to the same node in the graph, meaning that their computation is represented by the same subgraph. Therefore, in the best case, when semantics has been preserved, this check has constant time complexity O(1). In fact, it is expected that most transformations and optimizations are semantics-preserving, so the best-case complexity is important.
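The constant-time check relies on maximal sharing: structurally identical terms are represented by a single node, so value equality reduces to node identity. The following hash-consing sketch is ours, a minimal illustration of that idea and not the validator's actual data structure: building the same term twice yields the same pointer, so the final comparison is a pointer comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A tiny hash-consed expression node: an operator name applied to up to
 * two previously built nodes (NULL for absent children). */
struct node { const char *op; const struct node *l, *r; };

#define POOL 256
static struct node pool[POOL];
static int used = 0;

/* Intern a node: if an identical node already exists, return it;
 * otherwise allocate a fresh one from the pool. */
static const struct node *mk(const char *op, const struct node *l,
                             const struct node *r) {
    for (int i = 0; i < used; i++)
        if (strcmp(pool[i].op, op) == 0 && pool[i].l == l && pool[i].r == r)
            return &pool[i];                 /* reuse: maximal sharing */
    pool[used] = (struct node){ op, l, r };
    return &pool[used++];
}
```

With sharing maximized, asking whether an output signal and its C counterpart compute the same value amounts to asking whether two pointers are equal, which is the O(1) best case described above.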


[Fig. 1 (figure): the Signal program and the generated C program each feed an SDVG construction step producing a shared value-graph; normalization yields a normalized value-graph, on which the validator asks whether every output signal has the same value as the corresponding variable in the generated C code.]

Fig. 1. Sdvg Translation Validation Architecture

This work is part of the larger effort of formally verifying the Signal compiler. Our approach is to separate the concerns and prove each analysis and transformation stage of the compiler separately, with respect to ad-hoc data structures that carry the semantic information relevant to that phase. The preservation of the semantics can be decomposed into the preservation of clock semantics at the clock calculation and Boolean abstraction phase, the preservation of data dependencies at the static scheduling phase, and the value-equivalence of variables at the code generation phase. Fig. 2 shows the integration of this verification framework into the compilation process of the Signal compiler. For each phase, the validator takes the source program and its compiled counterpart, then constructs the corresponding formal models of both programs. Finally, it checks the existence of the refinement relation to prove the preservation of the considered semantics. If the relation does not exist, a “compiler bug” message is emitted. Otherwise, the compiler continues its work.

Outline. The remainder of this paper is organized as follows. In Section 2, we consider the formal definition of Sdvg and the representation of a Signal program and its generated C code as a shared Sdvg. Section 3 addresses the mechanism of the verification process based on the normalization of a Sdvg. Section 4 illustrates the concept of Sdvg and the verification procedure. Section 5 concludes this paper with some related work, a conclusion and an outlook on future work.

2 Synchronous Data-Flow Value-Graph

Let X be the set of variables which are used to denote the signals, clocks and variables in a Signal program and its generated C code, and let F be the set of function symbols. In our setting, F contains the usual logic operators (not, and, or), numerical comparison functions (<, >, =, <=, >=, /=), numerical operators (+, -, *, /), and the gated φ-function [2]. A gated φ-function such as x = φ(c, x1, x2) represents a branching in a program: x takes the value of x1 if the condition c is satisfied, and the value of x2 otherwise. A constant is defined as a function symbol of arity 0.
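Evaluated on concrete values, a gated φ-function is simply a guarded choice. The sketch below is illustrative only (in an Sdvg the arguments of a φ node are subgraphs, not integers); it uses INT_MIN as an arbitrary stand-in for the absent value ⊥.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BOT INT_MIN   /* sentinel standing in for the absent value ⊥ */

/* x = φ(c, x1, x2): x takes the value of x1 when the gate c holds,
 * and the value of x2 otherwise. */
static int phi(bool c, int x1, int x2) { return c ? x1 : x2; }
```

For example, the clocked value of a signal is phi(clock_holds, value, BOT): a value when the clock ticks, ⊥ otherwise.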


[Fig. 2 (figure): the Signal compiler transforms *.SIG through *_BASIC_TRA.SIG, *_BOOL_TRA.SIG and *_SEQ_TRA.SIG down to C/C++ or Java, via clock calculation and Boolean abstraction, scheduling, and code generation. Alongside each phase, the validator checks clock refinement between clock models (preservation of clock semantics), SDDG refinement (preservation of data dependency), and SDVG normalizing (preservation of value-equivalence of variables).]

Fig. 2. The Translation Validation for the SIGNAL Compiler

Definition 1. A Sdvg associated with a Signal program and its generated C code is a directed graph G = 〈N, E, lN, mN〉, where N is a finite set of nodes that represent clocks, signals, variables, or functions; E ⊆ N × N is the set of edges that describe the computation relations between nodes; lN : N → X ∪ F is a mapping labeling each node with an element of X ∪ F; and mN : N → P(N) is a mapping labeling each node with a finite set of clocks, signals, and variables, defining the set of equivalent clocks, signals and variables.

A subgraph rooted at a node is used to describe the computation of the corresponding element labelled at this node. In a graph, for a node labelled by y, the set of clocks, signals or variables mN(y) = {x0, ..., xn} is written as a node with label {x0, ..., xn} y.

2.1 SDVG of SIGNAL Program

Let P be a Signal program. We write X = {x1, ..., xn} to denote the set of all signals in P, which consists of the input, output, state (corresponding to the delay operator) and local signals, denoted by I, O, S and L, respectively. For each xi ∈ X, Dxi denotes its domain of values, and D⊥xi = Dxi ∪ {⊥} is the domain of values extended with the absent value. The domain of values of X with the absent value is then defined as D⊥X = ⋃ni=1 Dxi ∪ {⊥}. Each signal xi is associated with a Boolean variable x̂i that encodes its clock at a given instant t (true: xi is present at t; false: xi is absent at t), and with a variable x̃i, of the same type as xi, that encodes its value. Formally, the abstract clock and value of a signal can be represented by a gated φ-function: xi = φ(x̂i, x̃i, ⊥).


Assume that the computation of signals in processes P1 and P2 is represented by shared value-graphs G1 and G2, respectively. Then the value-graph G of the synchronous composition P1|P2 can be defined as G = 〈N, E, lN, mN〉, in which any node labelled by x is replaced by the subgraph rooted at the node labelled by x in G1 and G2. Every identical subgraph is reused; in other words, we maximize sharing among the graph nodes of G1 and G2. Thus, the shared value-graph of P can be constructed as a combination of the sub-value-graphs of its equations.

A Signal program is built from a set of primitive operators. Therefore, to construct the Sdvg of a Signal program, we construct a subgraph for each primitive operator. In the following, we present the value-graph corresponding to each Signal primitive operator.

Stepwise Function. Consider an equation using the stepwise function, y := f(x1, ..., xn): if all signals x1 to xn are defined, then the output signal y is defined by applying f to the values of x1, ..., xn; otherwise, y is assigned no value. The computation of y can thus be represented by the gated φ-function y = φ(ŷ, f(x̃1, x̃2, ..., x̃n), ⊥), where ŷ ⇔ x̂1 ⇔ x̂2 ⇔ ... ⇔ x̂n (since the signals are synchronous). The graph representation of the stepwise function is depicted in Fig. 3. Note that in the graph, the node labelled by {x̂1, ..., x̂n} ŷ means that mN(ŷ) = {x̂1, ..., x̂n}. In other words, the subgraph representing the computation of ŷ is also the computation of x̂1, ..., and x̂n.


Fig. 3. The graphs of y := f(x1, ..., xn) and y := x$1 init a

Delay. Consider an equation using the delay operator, y := x$1 init a. The output signal y is defined by the last value of the signal x when x is present; otherwise, y is assigned no value. The computation of y can be represented by the nodes y = φ(ŷ, m.x, ⊥) and m.x0 = a, where ŷ ⇔ x̂, and where m.x and m.x0 denote the last value of x and the initial value of y, respectively. The graph representation is depicted in Fig. 3.
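Operationally, y := x$1 init a behaves like a one-place buffer: at each instant where x is present, y emits the stored previous value m.x and the store is overwritten with the current x. The following minimal sketch is our own model of that behaviour, not compiler-generated code.

```c
#include <assert.h>
#include <stdbool.h>

/* State for y := x$1 init a: m_x holds the last value of x (m.x0 = a). */
struct delay { int m_x; };

/* One synchronous instant: if x is present, y gets the stored value and
 * the state is updated; if x is absent, y is absent too (returns false). */
static bool delay_step(struct delay *d, bool x_present, int x, int *y) {
    if (!x_present) return false;   /* y is ⊥ at this instant */
    *y = d->m_x;
    d->m_x = x;
    return true;
}
```

Note that at an absent instant the state m.x is left untouched, matching the fact that the delay only advances when its input is present.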


Merge. Consider the equation which corresponds to the merge operator, y := x default z. If the signal x is defined, then the signal y is defined and holds the value of x. The signal y is assigned the value of z when x is not defined and z is defined. When both x and z are undefined, y holds no value. The computation of y can be represented by the node y = φ(ŷ, φ(x̂, x̃, z̃), ⊥), where ŷ ⇔ (x̂ ∨ ẑ). The graph representation is depicted in Fig. 4. Note that in the graph, the clock ŷ is represented by the subgraph of x̂ ∨ ẑ.


Fig. 4. The graphs of y := x default z and y := x when b

Sampling. Consider the equation which corresponds to the sampling operator, y := x when b. If the signals x and b are defined and b holds the value true, then the signal y is defined and holds the value of x. Otherwise, y holds no value. The computation of y can be represented by the node y = φ(ŷ, x̃, ⊥), where ŷ ⇔ (x̂ ∧ b̂ ∧ b̃). Fig. 4 shows its graph representation.

Restriction. The graph representation of restriction process P1\x is the sameas the graph of P1.

Clock Relations. Given the above graph representations of the primitive operators, we can obtain the graph representations for the derived operators on clocks as the gated φ-function z = φ(ẑ, true, ⊥), where ẑ is computed as ẑ ⇔ x̂ for z := ^x, ẑ ⇔ (x̂ ∨ ŷ) for z := x ^+ y, ẑ ⇔ (x̂ ∧ ŷ) for z := x ^* y, ẑ ⇔ (x̂ ∧ ¬ŷ) for z := x ^- y, and ẑ ⇔ (b̂ ∧ b̃) for z := when b. The clock relation x ^= y is represented by a single-node graph labelled by {x̂} ŷ.

2.2 SDVG of Generated C Code

For constructing the shared value-graph, the generated C code is translated into a subgraph alongside the subgraph of the Signal program. Let A be a Signal program and C its generated C code. We write XA = {x1, ..., xn} to denote the set of all signals in A, and XC = {xc1, ..., xcm} to denote the set of all variables in C. We add “c” as a superscript to the variables to distinguish them from the signals in A.

As described in [4,8,6,1], the generated C code of A consists of the following files:

• A main.c is the implementation of the main function. It opens the IO communication channels by calling functions provided in A io.c, and calls the initialization function. Then it calls the step function repeatedly in an infinite loop to interact with the environment.

• A body.c is the implementation of the initialization function and the step function. The initialization function is called once to provide initial values to the program variables. The step function, which also contains the step initialization and finalization functions, is responsible for the calculation of the outputs to interact with the environment. This function, which is called repeatedly in an infinite loop, is the essential part of the concrete code.

• A io.c is the implementation of the IO communication functions. The IO functions are called to set up communication channels with the environment.

The scheduling and the computations are done inside the step function. Therefore, it is natural to construct a graph of this function in order to prove that its variables and the corresponding signals have the same values. To construct the graph of the step function, the following considerations need to be studied. The generated C code in the step function consists of only assignment and if-then statements. For each signal named x in A, there is a corresponding Boolean variable named C_x in the step function. The computation of x is then implemented by a conditional if-then statement as follows:

if (C_x) {
  computation(x);
}

If x is an input signal, then its computation is the reading operation which gets the value of x from the environment. In case x is an output signal, after computing its value, it will be written to the IO communication channel with the environment. Note that the C programs use persistent variables (i.e., variables which always have some value) to implement the Signal program A, which uses volatile variables. As a result, there is a difference between the type of a signal in the Signal program and that of the corresponding variable in the C code. When a signal has the absent value, ⊥, at a given instant, the corresponding C variable always has a value. This implies that we have to detect when the value of a variable in the C code is not updated; in that case, it is assigned the absent value, ⊥. Thus, the computation of a variable xc can fully be represented by a gated φ-function xc = φ(C_xc, x̃c, ⊥), where x̃c denotes the newly updated value of the variable.

In the generated C code, the computation of the variable whose clock is the master clock, which ticks every time the step function is called, and the computation of some local variables (introduced by the Signal compiler) are implemented using the forms below.


if (C_x) {
  computation(x);
} else computation(x);
// or without if-then
computation(x);

It is obvious that x is always updated when the step function is invoked. The computation of such variables can be represented by a single-node graph labelled by {x̃c} xc. That means the variable xc is always updated and holds the value x̃c.

Considering the following code segment, we observe that the variable x is involved in the computation of the variable y before the updating of x.

if (C_y) {
  y = x + 1;
}
// code segment
if (C_x) {
  x = ...
}

In this situation, we refer to the value of x as the previous value, denoted by m.xc. This happens when a delay operator is applied to the signal x in the Signal program. The computation of y is represented by the gated φ-function yc = φ(C_yc, m.xc + 1, ⊥).

3 Translation Validation of SDVG

In this section, we introduce the set of rewrite rules used to transform the shared value-graph resulting from the previous step; this procedure is called normalizing. At the end of the normalization, for any output signal x and its corresponding variable xc in the generated C code, we check whether x and xc label the same node in the resulting graph. The normalizing procedure can be adapted to any future optimization of the compiler by updating the set of rewrite rules.

3.1 Normalizing

Once a shared value-graph is constructed for the Signal program and its generated C code, if the values of an output signal and its corresponding variable in the C code are not already equivalent (they do not point to the same node in the shared value-graph), we start to normalize the graph. Given a set of term rewrite rules, the normalizing process works as described below. The normalizing algorithm applies the rewrite rules to each graph node individually. When no more rules can be applied to the resulting graph, we maximize the sharing of nodes, reusing identical subgraphs. The process terminates when there is no more sharing and no rule that can be applied.

We classify our set of rewrite rules into three basic types: general simplification rules, optimization-specific rules, and synchronous rules. In the following, we present the rewrite rules of these types, and we assume that all nodes in our shared value-graph are typed. We write a rewrite rule in the form of a term rewrite rule, tl → tr, meaning that the subgraph represented by tl is replaced by the subgraph represented by tr when the rule is applied. Due to lack of space, we present only a part of these rules; the full set of rules is shown in the appendix.

// Input: G: a shared value-graph. R: the set of rewrite rules.
//        S: the sharing among graph nodes.
// Output: the normalized graph
while (∃s ∈ S or ∃r ∈ R that can be applied on G) {
  while (∃r ∈ R that can be applied on G) {
    for (n ∈ G)
      if (r can be applied on n)
        apply the rewrite rule to n
  }
  maximize sharing
}
return G
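As a concrete, much-simplified instance of such a rewriting loop, the sketch below repeatedly applies a single rule, the constant folding of +, to an expression tree until no rule fires; the validator's normalization iterates a whole rule set over a shared graph in the same way. This is illustrative code of our own, not the validator's implementation, and error handling and memory reclamation are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* A tiny expression tree: either a constant or an addition node. */
struct expr { bool is_const; int value; struct expr *l, *r; };

static struct expr *cst(int v) {
    struct expr *e = malloc(sizeof *e);
    *e = (struct expr){ true, v, NULL, NULL };
    return e;
}
static struct expr *add(struct expr *l, struct expr *r) {
    struct expr *e = malloc(sizeof *e);
    *e = (struct expr){ false, 0, l, r };
    return e;
}

/* One bottom-up pass of the rule +(cst1, cst2) → cst; returns true iff
 * some rewrite fired anywhere in the tree. */
static bool fold_pass(struct expr *e) {
    if (e->is_const) return false;
    bool changed = fold_pass(e->l) | fold_pass(e->r);  /* visit both sides */
    if (e->l->is_const && e->r->is_const) {            /* rule applies here */
        e->is_const = true;
        e->value = e->l->value + e->r->value;
        changed = true;
    }
    return changed;
}

/* Repeat passes until a fixpoint, mirroring the outer loop above. */
static void normalize(struct expr *e) { while (fold_pass(e)) {} }
```

The outer fixpoint loop is redundant for this single bottom-up rule, but it mirrors the structure of the algorithm above, where rule application and sharing maximization alternate until neither makes progress.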

General Simplification Rules. The general simplification rules are related to the general rules of inference of operators, denoted by the corresponding function symbols in F. In our setting, the operators used in the primitive stepwise functions and in the generated C code are the usual logic operators, numerical comparison functions, and numerical operators. When applying these rules, we replace a subgraph rooted at a node by a smaller subgraph; in consequence, we reduce the number of nodes by eliminating unnecessary structures. The first set of rules simplifies numerical and Boolean comparison expressions. In these rules, the subgraph t represents a structure of value computing (e.g., the computation of the expression b = x ≠ true). These rules are self-explanatory; for instance, with any structure represented by a subgraph t, the expression t = t can always be replaced with a single-node subgraph labelled by the value true.

= (t, t) → true

≠(t, t) → false

The second set of general simplification rules eliminates unnecessary nodes in the graph that represent φ-functions, where c is a Boolean expression. For instance, we consider the following rules.

φ(true, x1, x2) → x1

φ(c, true, false) → c

φ(c, φ(c, x1, x2), x3) → φ(c, x1, x3)

The first rule replaces a φ-function with its left branch if the condition always holds the value true. The second rule operates on Boolean expressions represented by the branches. When the branches are Boolean constants holding different values, the φ-function can be replaced with the value of the condition


Translation Validation for Synchronous Data-Flow Specification 75

c. Consider a φ-function such that one of its branches is another φ-function. The third rule removes the φ-function in the branches if the conditions of the two φ-functions are the same.
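These three φ-rules can be phrased as one simplification step on a single φ-node. The sketch below is a hypothetical illustration, with φ-nodes encoded as tuples ('phi', c, left, right); it is not the validator's actual data structure.

```python
def phi_simplify(c, left, right):
    """Apply the three phi rules from the text to one phi node (a sketch)."""
    # phi(true, x1, x2) -> x1
    if c is True:
        return left
    # phi(c, true, false) -> c
    if left is True and right is False:
        return c
    # phi(c, phi(c, x1, x2), x3) -> phi(c, x1, x3)
    if isinstance(left, tuple) and left[0] == 'phi' and left[1] == c:
        return ('phi', c, left[2], right)
    return ('phi', c, left, right)
```

For instance, phi_simplify('c', ('phi', 'c', 'x1', 'x2'), 'x3') yields ('phi', 'c', 'x1', 'x3'), the third rule.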

Optimization-Specific Rules. Based on the optimizations of the Signal compiler, we define a number of optimization-specific rules that reflect the effects of specific optimizations of the compiler. These rules do not always reduce the graph or make it simpler, and one has to know the specific optimizations of the compiler in order to add such rules to the set of rewrite rules. In our case, we use the rules with which the Signal compiler simplifies constant expressions, such as:

+(cst1, cst2) → cst, where cst = cst1 + cst2

∧(cst1, cst2) → cst, where cst = cst1 ∧ cst2

⋈(cst1, cst2) → cst

where ⋈ denotes a numerical comparison function, and the Boolean value cst is the evaluation of the constant expression ⋈(cst1, cst2), which can hold either the value false or true.
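Constant folding of this kind is a one-step local rewrite. A minimal, hypothetical sketch follows, with a small operator table standing in for +, ∧, and a comparison such as <=; the names FOLD and fold_constants are illustrative.

```python
import operator

# Hypothetical operator table for the constant-folding rules.
FOLD = {'+': operator.add, 'and': lambda a, b: a and b, '<=': operator.le}

def fold_constants(term):
    """op(cst1, cst2) -> cst when both arguments are constants (a sketch)."""
    if isinstance(term, tuple) and term[0] in FOLD:
        a, b = term[1], term[2]
        if not isinstance(a, tuple) and not isinstance(b, tuple):
            return FOLD[term[0]](a, b)
    return term
```

For example, fold_constants(('+', 1, 2)) yields the single node 3, while a term with non-constant arguments is returned unchanged.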

We may also add a number of rewrite rules derived from the rules of inference for propositional logic. For example, we have a group of laws for rewriting formulas with the and operator, such as:

∧(x, true) → x

∧(x, ⇒(x, y)) → x ∧ y

Synchronous Rules. In addition to the general and optimization-specific rules, we also have a number of rewrite rules that are derived from the semantics of the code generation mechanism of the Signal compiler.

The first rule is that if a variable in the generated C code is always updated, then we require that the corresponding signal in the source program is present at every instant, meaning that the signal never holds the absent value. As a consequence of this rewrite rule, the signal x and its value when present (resp. the variable x^c and its updated value in the generated C code) point to the same node in the shared value-graph, and every reference to either of them points to that node.

We consider the equation pz := z$1 init 0. We use the variable m.z to capture the last value of the signal z. In the generated C program, the last value of the variable z^c is denoted by m.z^c. The second rule requires that the last values of a signal and of the corresponding variable in the generated C code are the same, that is, m.z = m.z^c.

Finally, we add rules that mirror the relation between input signals and their corresponding variables in the generated C code. First, for any input signal x and the corresponding variable x^c in the generated C code, if x is present, then the value of x read from the environment and the value of the variable x^c after the reading statement must be equivalent. That means x^c and x are represented by the same subgraph in the graph. Second, if the clock of x is also read from the environment as a parameter, then the clock of the input signal x



is equivalent to the condition under which the variable x^c is updated. It means that we represent the clock of x and the condition variable C_x^c by the same subgraph. Consequently, every reference to them (resp. to x and x^c) points to the same node.

4 Illustrative Example

Let us illustrate the verification process of Fig. 1 on the program DEC in Listing 1.1 and its generated C code DEC_step() in Listing 1.2.

In the first step, we compute the shared value-graph for both programs to represent the computation of all signals and their corresponding variables. This graph is depicted in Fig. 5.

process DEC=
  (? integer FB;
   ! integer N)
  (| FB ^= when (ZN<=1)
   | N := FB default (ZN-1)
   | ZN := N$1 init 1
   |)
  where integer ZN init 1
end;

Listing 1.1. DEC in Signal

1  EXTERN logical DEC_step() {
2    C_FB = N <= 1;
3    if (C_FB) {
4      if (!r_DEC_FB(&FB)) return FALSE; // read input FB
5    }
6    if (C_FB) N = FB; else N = N - 1;
7    w_DEC_N(N); // write output N
8    DEC_step_finalize();
9    return TRUE;
10 }

Listing 1.2. Generated C code of DEC

Note that in the C program, the variable N^c ("c" is added as a superscript to the C program variables, to distinguish them from the signals in the Signal program) is always updated (line (6)). In lines (2) and (6), the references to the variable N^c are references to the last value of N^c, denoted by m.N^c. The variable FB^c, which corresponds to the input signal FB, is updated only when the variable C_FB^c is true.
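To see informally that Listings 1.1 and 1.2 compute the same output stream, one can simulate both by hand. The following Python sketch mirrors the two listings at the reader's level; the function names and the explicit step count are illustrative, and this is of course not the validator, which compares value-graphs rather than executing the programs.

```python
def dec_signal(fb_inputs, steps):
    # Reference run of Listing 1.1: ZN := N$1 init 1,
    # FB is read only when ZN <= 1, and N := FB default (ZN - 1).
    fb = iter(fb_inputs)
    n_prev, out = 1, []
    for _ in range(steps):
        zn = n_prev
        n = next(fb) if zn <= 1 else zn - 1
        out.append(n)
        n_prev = n
    return out

def dec_step_c(fb_inputs, steps):
    # Mirror of DEC_step() in Listing 1.2; m_n plays the role of m.N^c.
    fb = iter(fb_inputs)
    m_n, out = 1, []
    for _ in range(steps):
        c_fb = m_n <= 1                    # line (2): C_FB = N <= 1 (last value of N)
        n = next(fb) if c_fb else m_n - 1  # lines (3)-(6)
        out.append(n)                      # line (7): write output N
        m_n = n                            # DEC_step_finalize(): store the last value
    return out
```

With the inputs 3 and 2, both runs produce the stream 3, 2, 1, 2, 1: the counter counts down and is reloaded from FB whenever it reaches 1.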

In the second step, we normalize the above initial graph. Below is one potential normalization scenario; there may be more than one such scenario, and the validator can choose any of them, since, given a set of applicable rules, it can apply them in different orders. Fig. 6 depicts the intermediate resulting graph of this normalization scenario, and Fig. 7 the final normalized graph, obtained from the initial graph when no further normalization can be performed.



Fig. 5. The shared value-graph of DEC and DEC step

1. The clock of the output signal N is a master clock, which is indicated in the generated C code by the variable N^c being always updated. The node {N, ZN} ∨ is rewritten into true.

2. By the rule ∧(true, x) → x, the node {FB} ∧ is rewritten into {FB} <=.

3. The φ-function node representing the computation of N is removed, and N points to the node {N} φ.

4. The φ-function node representing the computation of ZN is removed, and ZN points to the node {ZN} m.N.

5. The nodes FB^c and FB are rewritten into a single node {FB} FB^c. All references to them are replaced by references to {FB} FB^c.

6. The nodes m.N^c and m.N are rewritten into a single node {m.N} m.N^c. All references to them are replaced by references to {m.N} m.N^c.

Fig. 6. The resulting value-graph of DEC and DEC step



Fig. 7. The final normalized graph of DEC and DEC step

In the final step, we check that the value of the output signal and that of its corresponding variable in the generated code merge into a single node. In this example, we can safely conclude that the output signal N and its corresponding variable N^c are equivalent, since they point to the same node in the final normalized graph.

5 Related Work and Conclusion

There is a wide range of work on value-graph representations of expression evaluations in a program. For example, in [16], Weise et al. present a nice summary of the various types of value-graphs. In our context, the value-graph is used to represent the computation of variables in both the source program and its generated C code, in which identical structures are shared. We believe that this representation reduces the required storage and makes the normalization process more efficient compared with two separate graphs. Another remark is that the calculation of clocks, as well as the special absent value, is also represented in the shared graph.

In another related work adopting the translation validation approach for the verification of optimizations, Tristan et al. [15] recently proposed a framework for translation validation of the LLVM optimizer. For a function and its optimized counterpart, they construct a shared value-graph. The graph is normalized (i.e., reduced). After the normalization, if the outputs of the two functions are represented by the same subgraph, they can safely conclude that both functions are equivalent.

On the other hand, Tate et al. [14] proposed a framework for translation validation. Given a function in the input program and the corresponding optimized version of the function in the output program, they compute two value-graphs to represent the computations of the variables. Then they transform the graphs by adding equivalent terms through a process called equality saturation. After the saturation, if both value-graphs are the same, they can conclude that the return



values of the two given functions are the same. However, for translation validation purposes, our normalization process is more efficient and scalable, since we can add rewrite rules to the validator that reflect what a typical compiler intends to do (e.g., if a compiler performs constant folding, we can add the rewrite rule for constant expressions by which the three-node subgraph (1 + 2) is replaced by the single node 3).

The present paper provides a verification framework to prove the value-equivalence of variables and applies this approach to the synchronous data-flow compiler Signal. Given the simplicity of the graph normalization, we believe that translation validation of synchronous data-flow value-graphs for the industrial compiler Signal is feasible and efficient. Moreover, the normalization process can always be extended by adding new rewrite rules, which makes the translation validation of Sdvg scalable and flexible.

We have considered sequential code generation. One possibility is to extend this framework to other code generation schemes, including clustered code with static and dynamic scheduling, modular code, and distributed code. One path forward is to combine this work with the work on data dependency graphs in [10,11,9]: synchronous data-flow dependency graphs and synchronous data-flow value-graphs would serve as a common semantic framework to represent the semantics of the generated code, and the notion of "correct transformation" would be formalized as refinements between two synchronous data-flow dependency graphs and in a shared value-graph as described above.

Another possibility is to use an Smt solver to reason about the rewrite rules. For example, we recall the following rules:

φ(c1, φ(c2, x1, x2), x3) → φ(c1, x1, x3) if c1 ⇒ c2

φ(c1, φ(c2, x1, x2), x3) → φ(c1, x2, x3) if c1 ⇒ ¬c2

To apply these rules to a shared value-graph in order to reduce nested φ-functions (e.g., from φ(c1, φ(c2, x1, x2), x3) to φ(c1, x1, x3)), we have to check the validity of first-order logic formulas; for instance, we check that |= (c1 ⇒ c2) and |= (c1 ⇒ ¬c2). We consider the use of Smt solving to decide the validity of the conditions in the above rewrite rules when normalizing value-graphs.
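For purely propositional conditions, such validity checks can even be discharged by exhaustive enumeration of valuations; an Smt solver is only needed for richer theories. The toy stand-in below is illustrative (the function name and the encoding of valuations as dictionaries are assumptions, not part of the validator).

```python
from itertools import product

def valid_implies(f, g, names):
    """|= f => g, checked over all Boolean valuations of the given variables."""
    return all((not f(env)) or g(env)
               for vals in product([False, True], repeat=len(names))
               for env in [dict(zip(names, vals))])
```

For instance, valid_implies(lambda e: e['a'] and e['b'], lambda e: e['a'], ['a', 'b']) holds, so the first rule would fire for c1 = a ∧ b and c2 = a.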

References

1. Aubry, P., Guernic, P.L., Machard, S.: Synchronous distribution of SIGNAL programs. In: Proceedings of the 29th Hawaii International Conference on System Sciences, vol. 1, pp. 656–665. IEEE Computer Society Press (1996)

2. Ballance, R., Maccabe, A., Ottenstein, K.: The program dependence web: A representation supporting control, data, and demand driven interpretation of imperative languages. In: Proc. of the SIGPLAN 1990 Conference on Programming Language Design and Implementation, pp. 257–271 (1990)

3. Benveniste, A., Guernic, P.L.: Hybrid dynamical systems theory and the SIGNAL language. IEEE Transactions on Automatic Control 35(5), 535–546 (1990)

Page 91: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

80 V.C. Ngo et al.

4. Besnard, L., Gautier, T., Guernic, P.L., Talpin, J.-P.: Compilation of polychronous data-flow equations. In: Synthesis of Embedded Software, pp. 1–40. Springer (2010)

5. Blazy, S.: Which C semantics to embed in the front-end of a formally verified compiler? In: Tools and Techniques for Verification of System Infrastructure, TTVSI (2008)

6. Gautier, T., Guernic, P.L.: Code generation in the SACRES project. In: Towards System Safety, Proceedings of the Safety-critical Systems Symposium, pp. 127–149 (1999)

7. Gautier, T., Guernic, P.L., Besnard, L.: SIGNAL: A declarative language for synchronous programming of real-time systems. In: Kahn, G. (ed.) FPCA 1987. LNCS, vol. 274, pp. 257–277. Springer, Heidelberg (1987)

8. Maffeis, O., Guernic, P.L.: Distributed implementation of SIGNAL: Scheduling & graph clustering. In: Langmaack, H., de Roever, W.-P., Vytopil, J. (eds.) FTRTFT 1994 and ProCoS 1994. LNCS, vol. 863, pp. 547–566. Springer, Heidelberg (1994)

9. Ngo, V.C.: Formal verification of a synchronous data-flow compiler: from Signal to C. Ph.D Thesis (2014), http://tel.archives-ouvertes.fr/tel-01058041

10. Ngo, V.C., Talpin, J.-P., Gautier, T., Guernic, P.L., Besnard, L.: Formal verification of compiler transformations on polychronous equations. In: Derrick, J., Gnesi, S., Latella, D., Treharne, H. (eds.) IFM 2012. LNCS, vol. 7321, pp. 113–127. Springer, Heidelberg (2012)

11. Ngo, V.C., Talpin, J.-P., Gautier, T., Guernic, P.L., Besnard, L.: Formal verification of synchronous data-flow program transformations toward certified compilers. Frontiers of Computer Science, Special Issue on Synchronous Programming 7(5), 598–616 (2013)

12. Pnueli, A., Shtrichman, O., Siegel, M.: Translation validation: From SIGNAL to C. In: Olderog, E.-R., Steffen, B. (eds.) Correct System Design. LNCS, vol. 1710, pp. 231–255. Springer, Heidelberg (1999)

13. Pnueli, A., Siegel, M., Singerman, E.: Translation validation. In: Steffen, B. (ed.) TACAS 1998. LNCS, vol. 1384, pp. 151–166. Springer, Heidelberg (1998)

14. Tate, R., Stepp, M., Tatlock, Z., Lerner, S.: Equality saturation: A new approach to optimization. In: 36th Principles of Programming Languages, pp. 264–276 (2009)

15. Tristan, J.-B., Govereau, P., Morrisett, G.: Evaluating value-graph translation validation for LLVM. In: ACM SIGPLAN Conference on Programming Language Design and Implementation (2011)

16. Weise, D., Crew, R., Ernst, M., Steensgaard, B.: Value dependence graphs: Representation without taxation. In: 21st Principles of Programming Languages, pp. 297–310 (1994)


Formal Models of Concurrent and Distributed Systems


Dynamic Causality in Event Structures

Youssef Arbach, David Karcher, Kirstin Peters, and Uwe Nestmann

Technische Universität Berlin, Berlin, Germany
{youssef.arbach,david.s.karcher,kirstin.peters,uwe.nestmann}@tu-berlin.de

Abstract. Event Structures (ESs) address the representation of direct relationships between individual events, usually capturing the notions of causality and conflict. Up to now, such relationships have been static, i.e. they cannot change during a system run. Thus the common ESs only model a static view on systems. We dynamize causality such that causal dependencies between some events can be changed by occurrences of other events. We first model and study the case in which events may entail the removal of causal dependencies, then we consider the addition of causal dependencies, and finally we combine both approaches in the so-called Dynamic Causality ESs. For all three newly defined types of ESs, we study their expressive power in comparison to the well-known Prime ESs, Dual ESs, Extended Bundle ESs, and ESs for Resolvable Conflicts. Interestingly, Dynamic Causality ESs subsume Extended Bundle ESs and Dual ESs but are incomparable with ESs for Resolvable Conflicts.

1 Introduction

Concurrency Model. Event Structures (ESs) usually address statically defined relationships that constrain the possible occurrences of events, typically represented as causality (for precedence) and conflict (for choice). An event is a single occurrence of an action; it cannot be repeated. ESs were first used to give semantics to Petri nets [14], then to process calculi [4,8], and recently to model quantum strategies and games [16]. The semantics of an ES itself is usually provided by the sets of traces compatible with the constraints, or by configuration-based sets of events, possibly in their partially ordered variant (posets).

Motivation. Modern process-aware systems emphasize the need for flexibility in their design to adapt to changes in their environment [13]. One form of flexibility is the ability to change the work-flow during the runtime of the system, deviating from the default path, due to changes in regulations or to exceptions. Such changes could be ad hoc or captured at the build time of the system [10]. For instance—as taken from [13]—during the treatment process, and for a particular patient, a planned computer tomography must not be performed due to the fact that she has a cardiac pacemaker. Instead, an X-ray activity shall be performed. In this paper, we provide a formal model that can be used for such scenarios, showing what is the regular execution path and what is the exceptional

Supported by the DFG Research Training Group SOAMED.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 83–97, 2015.
DOI: 10.1007/978-3-319-19195-9_6


84 Y. Arbach et al.


Fig. 1. Landscape of Event Structures (new ESs are bold)

one [10]. In the conclusion, we highlight the advantages of our model over other static-causality models w.r.t. such scenarios.

Overview. We study the idea—motivated by application scenarios—of events changing the causal dependencies of other events. To deal with dynamicity in causality, duplications of events are usually used (see e.g. [5], where copies of the same event have the same label but different dependencies). In this paper we want to express dynamic changes of causality more directly, without duplications. We allow dependencies to change during a system run by modifying the causality itself. In this way we avoid duplications of events and keep the model simple and more intuitive. We separate the idea of dropping (shrinking) causality from adding (growing) causality, study each one separately first, and then combine them. In § 3 we define Shrinking Causality Event Structures (SESs) and compare their expressive power with other types of ESs. In § 4 we do the same for Growing Causality Event Structures (GESs). In § 5 we combine both concepts within Dynamic Causality Event Structures (DCESs) and show that they are strictly more expressive than Extended Bundle Event Structures (EBESs) [8], which are incomparable to SESs and GESs. Although Event Structures for Resolvable Conflicts (RCESs) [12] are shown to be more expressive than GESs and SESs, they are incomparable with DCESs. The relations among the various classes of ESs are summarised in Fig. 1, where an arrow from one class to another means that the first is less expressive than the second. In § 6 we summarize the contributions, show the limitations of other static-causality models w.r.t. our example, and conclude with future work.

Related Work. Kuske and Morin [7] worked on local independence, using local traces; there, actions can be independent from each other after a given history. Compared to this related work, which abstracts from the way actions become independent, we provide a mechanism for the independence of events through growing and shrinking causality. In [12], van Glabbeek and Plotkin introduced RCESs, where conflicts can be resolved or created by the occurrence of other events. This dynamicity of conflicts is complementary to our approach. As visualized in Fig. 1, DCESs and RCESs are incomparable, but—similarly to RCESs—DCESs are more expressive than many other types of ESs.


Dynamic Causality in Event Structures 85

2 Technical Preliminaries

We investigate the idea of dynamically evolving dependencies between events. Therefore we want to allow that the occurrence of events creates new causal dependencies between events or removes such dependencies. We base our extension on prime event structures, because they provide a very simple causality model. In the following we shortly revisit the main definitions of the types of ESs from the literature we compare with. We omit the labels of events, since our results are not influenced by their presence.

2.1 Prime Event Structures

A prime event structure (PES) [15] consists of a set of events and two relations describing conflicts and causal dependencies. To cover the intuition that events causally depending on an infinite number of other events can never occur, [15] requires PESs to satisfy the axiom of finite causes. Additionally, the enabling relation is assumed to be a partial order, i.e. transitive and reflexive. Furthermore, the concept of conflict heredity is required, saying that an event conflicting with another event conflicts with all of its causal successors.

If we allow causal dependencies to be added or dropped, it is hard to maintain conflict heredity as well as the transitivity and reflexivity of enabling. Because of that, we consider neither the partial order property nor the axiom of conflict heredity in our definition of PESs. The same applies to the finite causes property, which will be covered through finite configurations, as in Def. 13 later on. Note however that the following version of PESs has the same expressive power as the PESs in [15] w.r.t. finite configurations.

Definition 1. A Prime Event Structure (PES) is a triple π = (E, #, →), where E is a set of events, # ⊆ E² is an irreflexive symmetric relation (the conflict relation), and → ⊆ E² is the enabling relation.

The computation state of a process that is modeled as a PES is represented by the set of events that have occurred. Given a PES π = (E, #, →), we call such sets C ⊆ E that respect # and → configurations of π.

Definition 2. Let π = (E, #, →) be a PES. A set of events C ⊆ E is a configuration of π if it is conflict-free, i.e. ∀e, e′ ∈ C . ¬(e # e′), downward-closed, i.e. ∀e, e′ ∈ E . e → e′ ∧ e′ ∈ C =⇒ e ∈ C, and the transitive closure of the enabling relation is acyclic on C, i.e. →* ∩ C² is free of cycles. We denote the set of configurations of π by C(π).

An event e is called impossible in a PES if it does not occur in any of its configurations. Events can be impossible because of enabling cycles, an overlap between the enabling and the conflict relation, or impossible predecessors.
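For finite PESs, Def. 2 can be checked by brute force. The sketch below is illustrative only (helper names are hypothetical; conflict and enabling are given as sets of ordered pairs): it enumerates all subsets and keeps the conflict-free, downward-closed, cycle-free ones.

```python
from itertools import combinations

def acyclic(cs, enabling):
    # Kahn-style check: repeatedly remove events with no enabling
    # predecessor inside cs; a stuck non-empty remainder means a cycle.
    rem = set(cs)
    while rem:
        leaves = {e for e in rem if not any((d, e) in enabling for d in rem)}
        if not leaves:
            return False
        rem -= leaves
    return True

def pes_configurations(events, conflict, enabling):
    # Enumerate all subsets and keep those satisfying Def. 2.
    configs = []
    for r in range(len(events) + 1):
        for c in combinations(sorted(events), r):
            cs = set(c)
            conflict_free = all((e, f) not in conflict for e in cs for f in cs)
            downward_closed = all(e in cs for (e, f) in enabling if f in cs)
            if conflict_free and downward_closed and acyclic(cs, enabling):
                configs.append(frozenset(cs))
    return configs
```

For E = {a, b, c} with a # b and a → c, this yields ∅, {a}, {b}, and {a, c}: an event d on an enabling cycle or with an impossible predecessor would appear in no configuration, matching the notion of impossible events above.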




Fig. 2. A Bundle ES, an Extended Bundle ES, and a Dual ES

2.2 Bundle, Extended Bundle and Dual Event Structures

PESs are simple but also limited. They do not allow one to describe optional or conditional enabling of events. Bundle event structures (BESs)—among others—were designed to overcome these limitations [8]. Enabling of events is based on bundles, which are pairs (X, e), denoted as X → e, where X is a set of events and e is the event pointed to by that bundle. A bundle is satisfied when one event of X occurs. An event is enabled when all bundles pointing to it are satisfied. This disjunctive causality allows for optionality in enabling events.

Definition 3. A Bundle Event Structure (BES) is a triple β = (E, #, →), where E is a set of events, # ⊆ E² is an irreflexive symmetric relation (the conflict relation), and → ⊆ P(E) × E is the enabling relation, such that for all X ⊆ E and e ∈ E the bundle X → e implies that for all e1, e2 ∈ X with e1 ≠ e2 it holds that e1 # e2 (Stability).

Figure 2 (a) shows an example of a BES. The solid arrows denote causality, i.e. reflect the enabling relation, where the bar between the arrows indicates a bundle, and the dashed line denotes a mutual conflict.

A configuration of a BES is again a conflict-free set of events that is downward-closed. Thereby the stability condition avoids causal ambiguity [9]. To exclude sets of events that result from enabling cycles, we use traces. For a sequence t = e1 · · · en of events let t̄ = {e1, . . . , en} and ti = e1 · · · ei. Let ε denote the empty sequence.

Definition 4. Let β = (E, #, →) be a BES. A trace is a sequence of distinct events t = e1 · · · en with t̄ ⊆ E such that ∀1 ≤ i, j ≤ n . ¬(ei # ej) and such that ∀1 ≤ i ≤ n . ∀X ⊆ E . X → ei =⇒ t̄i−1 ∩ X ≠ ∅.

A set of events C ⊆ E is a configuration of β if there is a trace t such that C = t̄. This trace-based definition of a configuration will be the same for Extended Bundle and Dual ESs. Let T(β) denote the set of traces and C(β) the set of configurations of β.
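Def. 4 suggests a direct way to enumerate the traces of a finite BES by repeatedly extending trace prefixes. The following is a hypothetical sketch (bundles are given as pairs of a set and a target event; the function name is illustrative):

```python
def bes_traces(events, conflict, bundles):
    # Breadth-first extension of trace prefixes (Def. 4): an event may be
    # appended if it conflicts with no earlier event and every bundle
    # pointing to it is already satisfied by the prefix.
    traces, frontier = [()], [()]
    while frontier:
        nxt = []
        for t in frontier:
            for e in sorted(events - set(t)):
                if any((e, f) in conflict or (f, e) in conflict for f in t):
                    continue
                if all(set(X) & set(t) for (X, target) in bundles if target == e):
                    nxt.append(t + (e,))
        traces += nxt
        frontier = nxt
    return traces
```

For the BES of Fig. 2 (a), i.e. the bundle {a, b} → c with a # b, this yields the traces ε, a, b, ac, and bc; the configurations are the underlying sets of these traces.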

Partially ordered sets, abbreviated posets, are used as a semantic model for different kinds of ESs and other concurrency models (see e.g. [11]). In contrast to configurations, a poset does not only record the set of events that happened, but also captures the precedence relations between the events. Formally, a poset is a pair (A, ≤), where A is a finite set of events and ≤ is a partial order over A.

A poset represents a set of system runs, differing by permutations of independent events. To describe the semantics of the entire ES, families of posets [11]



with a prefix relation are used. According to Rensink [11], families of posets form a convenient underlying model for models of concurrency, and are more expressive than families of configurations.

To obtain the posets of a BES, endow each of its configurations with a partial order. Let β = (E, #, →) be a BES, C ∈ C(β), and e, e′ ∈ C. Then e ≺C e′ if ∃X ⊆ E . e ∈ X ∧ X → e′. Let ≤C be the reflexive and transitive closure of ≺C. It is proved in [8] that ≤C is a partial order over C. Let P(β) denote the set of posets of β.

Let x and y be two ESs of arbitrary kind on which posets are defined. We denote that x and y have the same set of posets by x ≃p y. Note that for BESs, EBESs, and DESs, families of posets are the most discriminating semantics studied in the literature. So, in these cases, we consider two ESs as behaviorally equivalent if they have the same set of posets.

The first extension of BESs we consider are Extended Bundle Event Structures (EBESs) from [8]. The conflict relation # is replaced by a disabling relation. That an event e1 disables another event e2 means that once e1 occurs, e2 cannot occur anymore. The symmetric conflict # can be modeled through mutual disabling. Therefore EBESs are a generalization of BESs, and thus are more expressive [8].

Definition 5. An Extended Bundle Event Structure (EBES) is a triple ξ = (E, ⇝, →), where E is a set of events, ⇝ ⊆ E² is the irreflexive disabling relation, and → ⊆ P(E) × E is the enabling relation, such that for all X ⊆ E and e ∈ E the bundle X → e implies that for all e1, e2 ∈ X with e1 ≠ e2 it holds that e1 ⇝ e2 (Stability).

Stability ensures that two distinct events of a bundle set are in mutual disabling. Figure 2 (b) shows an EBES with the two bundles {a, c} → d and {b, c} → d. The dashed lines again denote mutual disabling, as required by stability. A disabling d ⇝ e, to be read 'e disables d', is represented by a dashed arrow.

Definition 6. Let ξ = (E, ⇝, →) be an EBES. A trace is a sequence of distinct events t = e1 · · · en with t̄ ⊆ E such that ∀1 ≤ i, j ≤ n . ei ⇝ ej =⇒ i < j and ∀1 ≤ i ≤ n . ∀X ⊆ E . X → ei =⇒ t̄i−1 ∩ X ≠ ∅.

We adapt the definitions of configurations and traces of BESs accordingly. For C ∈ C(ξ) and e, e′ ∈ C, let e ≺C e′ if ∃X ⊆ E . e ∈ X ∧ X → e′ or if e ⇝ e′. Again, ≤C denotes the reflexive and transitive closure of ≺C, and P(ξ) denotes the set of posets of ξ.

Dual Event Structures (DESs) are the second extension of BESs examined here. They are obtained by dropping the stability condition. This leads to causal ambiguity, i.e. given a trace and one of its events, it is not always possible to determine what caused this event. The definition of DESs varies between [6] (based on EBESs) and [9] (based on BESs). Here we rely on the version of [9].

Definition 7. A Dual Event Structure (DES) is a triple δ = (E, #, →), where E is a set of events, # ⊆ E² is an irreflexive symmetric relation (the conflict relation), and → ⊆ P(E) × E is the enabling relation.



Figure 2 (c) shows a DES with one bundle, namely {a, b, c} → d, and without conflicts. Again, the definitions of configurations and traces are exactly the same as for BESs (cf. Def. 4); therefore we omit them here.

Because of the causal ambiguity, the definition of ≤C is difficult, and the behavior of a DES w.r.t. a configuration cannot be described by a single poset anymore. [9] illustrates that there are different possible interpretations of causality. The authors defined five different intentional posets: liberal, bundle satisfaction, minimal, early, and late posets. They show the equivalence of the behavioral semantics, and that early causality and trace equivalence coincide. Thus we concentrate on early causality. The remaining intentional partial order semantics are discussed in [1]. To capture causal ambiguity, we have to consider all traces of a configuration to obtain its posets. Below, U1 is earlier than U2 if the largest index in U1 \ U2 is smaller than the largest index in U2 \ U1 [9].

Definition 8. Let δ = (E, #, →) be a DES, t = e1 · · · en one of its traces, 1 ≤ i ≤ n, and X1 → ei, . . . , Xm → ei all bundles pointing to ei. A set U is a cause of ei in t if ∀e ∈ U . ∃1 ≤ j < i . e = ej, ∀1 ≤ k ≤ m . Xk ∩ U ≠ ∅, and U is the earliest set satisfying the previous two conditions. Let Pd(t) be the set of posets obtained this way for t.

2.3 Event Structures for Resolvable Conflicts

Event Structures for Resolvable Conflicts (RCESs) were introduced in [12] to generalize former types of ESs and to give semantics to general Petri nets. They allow one to model the case where a and c cannot occur together until b takes place, i.e. initially a and c are in a conflict until the occurrence of b resolves this conflict. An RCES consists of a set of events and an enabling relation between sets of events. Here the enabling relation also models conflicts between events. The behavior is defined by a transition relation between sets of events that is derived from the enabling relation ⊢.

Definition 9. An Event Structure for Resolvable Conflicts (RCES) is a pair ρ = (E, ⊢), where E is a set of events and ⊢ ⊆ P(E)² is the enabling relation.

In [12] several versions of configurations are defined. Here we consider only configurations which are both reachable and finite.

Definition 10. Let ρ = (E, ⊢) be an RCES and X, Y ⊆ E. Then X →rc Y if (X ⊆ Y ∧ ∀Z ⊆ Y . ∃W ⊆ X . W ⊢ Z). The set of configurations of ρ is defined as C(ρ) = {X ⊆ E | ∅ →∗rc X ∧ X is finite}, where →∗rc is the reflexive and transitive closure of →rc.

As an example consider the RCES ρ = (E, ⊢), where E = {a, b, c}, {b} ⊢ {a, c}, and ∅ ⊢ X iff X ⊆ E and X ≠ {a, c}. It models the above described initial conflict between a and c that can be resolved by b. In Fig. 3 (ρ) the respective transition graph is shown, i.e. the nodes are all reachable configurations of ρ and the directed edges represent →rc. Note that, because {a, c} ⊂ {a, b, c} and ∅ ⊬ {a, c}, there is no transition from ∅ to {a, b, c}.
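Definition 10 is directly executable for this three-event example. The following Python sketch (brute force over all subsets, illustrative only) encodes the enabling relation above and computes the reachable configurations; it reproduces the seven nodes of the transition graph in Fig. 3 (ρ), with {a, c} absent.

```python
from itertools import combinations

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

E = frozenset("abc")

# Enabling relation of the example: {b} |- {a,c}, and {} |- X for all X != {a,c}.
enabling = {(frozenset("b"), frozenset("ac"))}
enabling |= {(frozenset(), X) for X in powerset(E) if X != frozenset("ac")}

def step(X, Y):
    # X -->rc Y  iff  X <= Y and every Z <= Y has some enabling W <= X.
    return X <= Y and all(
        any((W, Z) in enabling for W in powerset(X)) for Z in powerset(Y)
    )

def reachable_configurations():
    seen, todo = {frozenset()}, [frozenset()]
    while todo:
        X = todo.pop()
        for Y in powerset(E):
            if step(X, Y) and Y not in seen:
                seen.add(Y)
                todo.append(Y)
    return seen
```

In particular, step(∅, {a, b, c}) fails because of the non-enabled subset {a, c}, while step({b}, {a, b, c}) succeeds.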


Dynamic Causality in Event Structures 89


Fig. 3. Transition graphs of RCESs with resolvable conflict (ρ) and disabling (ργ)

We consider two RCESs as equivalent if they have the same transition graphs. Note that, since we consider only reachable configurations, the transition equivalence defined below is denoted as reachable transition equivalence in [12].

Definition 11. Two RCESs ρ = (E, →rc) and ρ′ = (E′, →′rc) are transition equivalent, denoted by ρ ≃t ρ′, if E = E′ and →rc ∩ (C(ρ))² = →′rc ∩ (C(ρ′))².

Again we adapt the notion of transition equivalence to arbitrary types of ESs with a transition relation. Let x and y be two arbitrary types of ESs on which a transition relation is defined. We denote the fact that x and y have the same transition graphs by x ≃t y. Note that for RCESs, transition equivalence is the most discriminating semantics studied in the literature. So we consider two RCESs as behaviorally equivalent if they have the same transition graphs.

3 Shrinking Causality

Now we add a new relation which represents the removal of causal dependencies as a ternary relation between events ◁ ⊆ E³. For instance (a, c, b) ∈ ◁, denoted as [a → b] ◁ c, models that a is dropped from the set of causal predecessors of b by the occurrence of c. The dropping is visualized in Fig. 4 (a) by a dashed arrow with an empty head from the initial cause a → b to its dropper c. We add this relation to PESs and denote the result as shrinking causality event structures.

Definition 12. A Shrinking Causality Event Structure (SES) is a pair σ = (π, ◁), where π = (E, #, →) is a PES and ◁ ⊆ E³ is the shrinking causality relation such that [e → e′′] ◁ e′ implies e → e′′ for all e, e′, e′′ ∈ E.

Sometimes we expand (π, ◁) and write (E, #, →, ◁). For [c → t] ◁ m we call m the modifier, t the target, and c the contribution. We denote the set of all modifiers dropping c → t by [c → t]◁. We refer to the set of dropped causes of an event w.r.t. a specific history by the function dc : P(E) × E → P(E) defined as: dc(H, e) = {e′ | ∃d ∈ H . [e′ → e] ◁ d}. We refer to the initial causes of an event by the function ic : E → P(E) such that: ic(e) = {e′ | e′ → e}. The semantics of a SES can be defined based on posets, similar to BESs, EBESs, and DESs, or based on a transition relation, similar to RCESs. We consider both.


90 Y. Arbach et al.


Fig. 4. A SES and GESs modeling disabling, conflict, temporary disabling, and resolvable conflicts

Definition 13. Let σ = (E, #, →, ◁) be a SES.
– A trace of σ is a sequence of distinct events t = e1 · · · en with t̄ ⊆ E, where t̄i denotes the set {e1, . . . , ei} and t̄ = t̄n, such that ∀1 ≤ i, j ≤ n . ¬(ei # ej) and ∀1 ≤ i ≤ n . (ic(ei) \ dc(t̄i−1, ei)) ⊆ t̄i−1. C ⊆ E is a trace-based configuration of σ if there is a trace t such that C = t̄. Let CTr(σ) be the set of trace-based configurations and T(σ) the set of traces of σ.
– Let t = e1 · · · en ∈ T(σ) and 1 ≤ i ≤ n. A set U is a cause of ei in t if ∀e ∈ U . ∃1 ≤ j < i . e = ej, (ic(ei) \ dc(U, ei)) ⊆ U, and U is the earliest set satisfying the previous two conditions. Let Ps(t) be the set of posets obtained this way for t.
– Let X, Y ⊆ E. Then X →s Y if X ⊆ Y , ∀e, e′ ∈ Y . ¬(e # e′), and ∀e ∈ Y \ X . (ic(e) \ dc(X, e)) ⊆ X.
– The set of all configurations of σ is C(σ) = {X ⊆ E | ∅ →∗s X ∧ X is finite}, where →∗s is the reflexive and transitive closure of →s.

The combination of initial and dropped causes ensures that, for each ei in t, all its initial causes either precede ei or are dropped by other events preceding ei. Note that, as for DESs, we concentrate on early causality. We consider the reachable and finite configurations w.r.t. →s as well as the configurations based on traces. Note that both definitions coincide.
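The trace condition of Def. 13 is easy to check mechanically. In the Python sketch below (a hypothetical encoding: drops holds triples (cause, target, modifier) for [cause → target] ◁ modifier, conflicts holds unordered pairs), the SES of Fig. 4 (a), where the dependency a → b can be dropped by c, admits the traces ab and cb but not b alone.

```python
def is_trace(t, ic, drops, conflicts=frozenset()):
    """Def. 13 (sketch): t must consist of distinct, conflict-free events,
    and each event's initial causes, minus those dropped by earlier
    modifiers, must lie in its prefix."""
    if len(set(t)) != len(t):
        return False
    if any(frozenset((e, f)) in conflicts for e in t for f in t):
        return False
    for i, e in enumerate(t):
        prefix = set(t[:i])
        dropped = {c for (c, target, m) in drops if target == e and m in prefix}
        if not (ic.get(e, set()) - dropped) <= prefix:
            return False
    return True

# Fig. 4 (a): initial cause a -> b, dropped by c.
ic, drops = {"b": {"a"}}, {("a", "b", "c")}
```

After c has occurred, b no longer needs a, which is exactly the disjunctive reading exploited in the comparison with DESs.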

To show that ≃p and ≃t coincide on SESs we make use of the result that ≃p and trace equivalence coincide on DESs (compare to [9]) and show that SESs are as expressive as DESs. Consider the shrinking causality [c → t] ◁ d. It models the case that initially t causally depends on c, and this dependency can be dropped by the occurrence of d. Thus for t to be enabled, either c occurs or d does. This is a disjunctive causality as modeled by DESs. In fact [c → t] ◁ d corresponds to the bundle {c, d} → t. We prove that we can map each SES into a DES with the same behavior and vice versa. To translate a SES into a DES we create a bundle for each initial causal dependence and add all its droppers to the bundle set.
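This translation direction is a one-liner per dependency. The sketch below (an illustrative encoding: ic maps an event to its initial causes, drops holds triples (cause, target, modifier)) builds one bundle per initial dependence and joins in all of its droppers.

```python
def ses_to_des_bundles(ic, drops):
    """Translate a SES into DES bundles: for every initial dependence
    c -> t, emit the bundle ({c} plus all droppers of c -> t) -> t."""
    bundles = set()
    for t, causes in ic.items():
        for c in causes:
            droppers = {m for (c2, t2, m) in drops if (c2, t2) == (c, t)}
            bundles.add((frozenset({c} | droppers), t))
    return bundles
```

For Fig. 4 (a) this yields the single bundle {a, c} → b: b is enabled once a or c has occurred.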

In the opposite direction we map each DES into a set of similar SESs such that each SES in this set has the same behavior as the DES. Intuitively we have to choose an initial dependency for each bundle from its set, and to translate the rest of the bundle set into droppers for that dependency. Unfortunately the bundles that point to the same event are not necessarily disjoint. Consider for example {a, b} → e and {b, c} → e. If we choose b → e as initial dependency for both bundles, to be dropped as [b → e] ◁ a and [b → e] ◁ c, then {a, e} is a configuration of the resulting SES but not of the original DES. So we have to ensure that we choose distinct events as initial causes for all bundles pointing to


the same event. Thus for each bundle we choose a fresh event as initial cause, make it impossible by a self-loop, and add all events of the bundle as droppers. Note that to translate a DES into a SES we have to introduce additional events, i.e. it is not always possible to translate a DES into a SES without additional impossible events. All proofs can be found in [1].

Theorem 1. SESs are as expressive as DESs.

In [1] we show that each SES and its translation, as well as each DES and its translation, have the same set of posets considering not only early but also liberal, minimal, and late causality. Thus the concepts of SESs and DESs are not only behaviorally equivalent but, except for the additional impossible events, also structurally closely related.

Note that ≃p, ≃t, and trace equivalence coincide on SESs.

Theorem 2. Let σ, σ′ be two SESs. Then σ ≃p σ′ iff σ ≃t σ′ iff T(σ) = T(σ′).

As shown above, SESs allow modeling disjunctive causality. As an example consider the dropping of a causality as in Fig. 4 (a). Such a disjunctive causality is not possible in EBESs. On the other hand, the asymmetric conflict of an EBES cannot be modeled with a SES. As an example consider Fig. 2 (b), where e cannot precede d.

Theorem 3. SESs and EBESs are incomparable.

SESs are strictly less expressive than RCESs, because each SES can be translated into a transition-equivalent RCES, while on the other hand there are RCESs that cannot be translated into a transition-equivalent SES. As a counterexample we use the RCES ρσ = ({e, f}, {∅ ⊢ ∅, ∅ ⊢ {e}, ∅ ⊢ {f}, {e} ⊢ {e, f}}) that captures disabling in an EBES.

Theorem 4. SESs are strictly less expressive than RCESs.

4 Growing Causality

As in SESs we base our extension for growing causality on PESs. We add the new relation ▷ ⊆ E³, where (a, c, b) ∈ ▷, denoted as c ▷ [a → b], models that c adds a as a cause for b. Thus c is a condition for the causal dependency a → b.

The adding is visualized in Fig. 4 (d) by a dashed line with a filled head from the modifier c to the added dependency a → b, which is dotted, denoting that this dependency does not exist initially (in this example there is an additional causality c → a).

Definition 14. A Growing Causality Event Structure (GES) is a pair γ = (π, ▷), where π = (E, #, →) is a PES and ▷ ⊆ E³ is the growing causality relation such that ∀e, e′, e′′ ∈ E . e′ ▷ [e → e′′] =⇒ ¬(e → e′′).

We refer to the causes added to an event w.r.t. a specific history by the function ac : P(E) × E → P(E), defined as ac(H, e) = {e′ | ∃a ∈ H . a ▷ [e′ → e]}, and to the initial causality by the function ic as defined in § 3. Similar to RCESs, the behavior of a GES can be defined by a transition relation. Thus we consider two GESs as equally expressive if they are transition equivalent.


Definition 15. Let γ = (E, #, →, ▷) be a GES.
– A trace of γ is a sequence of distinct events t = e1 · · · en with t̄ ⊆ E such that ∀1 ≤ i, j ≤ n . ¬(ei # ej) and (ic(ei) ∪ ac(t̄i−1, ei)) ⊆ t̄i−1 for all i ≤ n. Then C ⊆ E is a trace-based configuration of γ if there is a trace t such that C = t̄. The set of traces of γ is denoted by T(γ) and the set of its trace-based configurations is denoted by CTr(γ).
– Let X, Y ⊆ E. Then X →g Y if X ⊆ Y , ∀e, e′ ∈ Y . ¬(e # e′), ∀e ∈ Y \ X . (ic(e) ∪ ac(X, e)) ⊆ X, and ∀t, m ∈ Y \ X . ∀c ∈ E . m ▷ [c → t] =⇒ (c ∈ X ∨ m ∈ {c, t}).
– The set of all configurations of γ is C(γ) = {X ⊆ E | ∅ →∗g X ∧ X is finite}, where →∗g is the reflexive and transitive closure of →g.

The last condition in the transition definition prevents the concurrent occurrence of a target and its modifier, since they are not independent. One exception is when the contribution has already occurred; in that case, the modifier does not change the target's predecessors. It also captures the trivial cases of self-adding, i.e. when a target adds a contribution to itself or a modifier adds itself to a target. Again we consider the reachable and finite configurations, and show in [1] that the definitions of reachable and trace-based configurations coincide.

Disabling as defined in EBESs or in the asymmetric event structures of [3] can be modeled by ▷. For example, that b disables a can be modeled by b ▷ [a → a], as depicted in Fig. 4 (b). Conflicts can be modeled by ▷ through mutual disabling, as depicted in Fig. 4 (c), and thus the conflict relation can be omitted in this ES model.

In inhibitor event structures [2] there is a kind of disabling where an event e can be disabled by another event d until an event out of a set X occurs. This kind of temporary disabling provides disjunction in the re-enabling, which cannot be modeled in GESs but can be in DCESs (cf. the next section). However, temporary disabling without a disjunctive re-enabling can be modeled by a GES, as in Fig. 4 (d).

Also resolvable conflicts can be modeled by a GES. For example the GES in Fig. 4 (e) with a ▷ [c → b] and b ▷ [c → a] models a conflict between a and b that can be resolved by c. Note that this example depends on the idea that a modifier and its target cannot occur concurrently (cf. Def. 15). Note also that resolvable conflicts are a reason why families of configurations cannot be used to define the semantics of GESs or RCESs.
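The trace condition of Def. 15 makes this resolvable conflict executable. In the Python sketch below (an illustrative encoding: adds holds triples (modifier, cause, target) for modifier ▷ [cause → target]), the GES of Fig. 4 (e) rejects the trace ab but accepts acb and cab: a and b exclude each other until c has occurred.

```python
def is_ges_trace(t, ic, adds, conflicts=frozenset()):
    """Def. 15 (sketch): each event needs its initial causes plus all
    causes added by modifiers in its prefix to lie in that prefix."""
    if len(set(t)) != len(t):
        return False
    if any(frozenset((e, f)) in conflicts for e in t for f in t):
        return False
    for i, e in enumerate(t):
        prefix = set(t[:i])
        added = {c for (m, c, target) in adds if target == e and m in prefix}
        if not (ic.get(e, set()) | added) <= prefix:
            return False
    return True

# Fig. 4 (e): a adds c as a cause of b, and b adds c as a cause of a.
adds = {("a", "c", "b"), ("b", "c", "a")}
```
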

As shown in Fig. 4 (b), GESs can model disabling. Nevertheless EBESs and GESs are incomparable, because GESs cannot model the disjunction in the enabling relation that EBESs inherit from BESs. On the other hand, EBESs cannot model conditional causal dependencies. Thus GESs are incomparable to BESs as well as to EBESs.

Theorem 5. GESs are incomparable to BESs and EBESs.

GESs are also incomparable to SESs, because the adding of causes cannot be modeled by SESs. As a counterexample we use the GES of Fig. 4 (c). Then, since BESs are incomparable to GESs, BESs are less expressive than DESs, and DESs are as expressive as SESs, we conclude that GESs and SESs are incomparable.


Theorem 6. GESs and SESs are incomparable.

As illustrated in Fig. 4 (e), GESs can model resolvable conflicts. Nevertheless they are strictly less expressive than RCESs, because each GES can be translated into a transition-equivalent RCES while, on the other hand, there exists no transition-equivalent GES for the RCES ργ = ({a, b, c}, ⊢) that is given by the second transition graph in Fig. 3. It models the case where after a and b the event c becomes impossible, i.e. it models disabling by a set instead of by a single event.

Theorem 7. GESs are strictly less expressive than RCESs.

5 Dynamic Causality

Up to now we have investigated shrinking and growing causality separately. In this section we combine them and examine the resulting expressiveness.

Definition 16. A Dynamic Causality Event Structure (DCES) is a triple Δ = (π, ◁, ▷), expanded as (E, #, →, ◁, ▷), where π = (E, #, →) is a PES, ◁ ⊆ E³ is the shrinking causality relation, and ▷ ⊆ E³ is the growing causality relation such that for all e, e′, e′′ ∈ E:
1. [e → e′′] ◁ e′ ∧ ∄m ∈ E . m ▷ [e → e′′] =⇒ e → e′′
2. e′ ▷ [e → e′′] ∧ ∄m ∈ E . [e → e′′] ◁ m =⇒ ¬(e → e′′)
3. e′ ▷ [e → e′′] =⇒ ¬([e → e′′] ◁ e′).

Conditions 1 and 2 are just generalizations of the conditions in Defs. 12 and 14, respectively. If there are droppers and adders for the same causal dependency, we do not specify whether this dependency is contained in →, because the semantics depends on the order in which the droppers and adders occur. Condition 3 prevents a modifier from adding and dropping the same cause for the same target.

The order of occurrence of droppers and adders determines the causes of an event. For example assume a ▷ [c → t] and [c → t] ◁ d. Then after ad, t does not depend on c, whereas after da, t depends on c. Thus configurations like {a, d} are not expressive enough to represent the state of such a system (cf. Lem. 1).
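This order sensitivity is easy to replay. The Python sketch below (illustrative names, one fixed target t: adders and droppers map a modifier to the causes it adds or drops for t) tracks the still-needed causes of t along a history, mimicking the causal state function cs of Def. 17.

```python
def causes_after(history, initial, adders, droppers):
    """Replay a history of modifiers for one fixed target event t and
    return the causes t still needs afterwards; the order matters."""
    cs = set(initial)
    for e in history:
        cs |= adders.get(e, set())    # e adds causes for t
        cs -= droppers.get(e, set())  # e drops causes for t
    return cs

# a adds c as a cause of t, d drops c; c is not an initial cause.
adders, droppers = {"a": {"c"}}, {"d": {"c"}}
```

After the history ad the set of needed causes is empty again, whereas after da it is {c}; the configuration {a, d} alone cannot tell the two states apart.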

Therefore in a DCES a state is a pair of a configuration C and a causal state function cs, which computes the causal predecessors of an event that are still needed.

Definition 17. Let Δ = (E, #, →, ◁, ▷) be a DCES. The function mc : P(E) × E → P(E) denotes the maximal causality that an event can have after some history C ⊆ E, and is defined as mc(C, e) = {e′ ∈ E \ C | e′ → e ∨ ∃a ∈ C . a ▷ [e′ → e]}. A state of Δ is a pair (C, cs), where C ⊆ E and cs : E \ C → P(E \ C) such that cs(e) ⊆ mc(C, e). We call cs the causality state function; it shows, for an event e that did not occur yet, which events are still missing such that e is enabled. The initial state of Δ is S0 = (∅, csi), where csi(e) = {e′ ∈ E | e′ → e}. Note that S0 is the only state with an empty set of events; for other sets of events there can be multiple states. The behavior of a DCES is defined by the transition relation on its reachable states with finite configurations.


Definition 18. Let Δ = (E, #, →, ◁, ▷) be a DCES and C, C′ ⊆ E. Then (C, cs) →d (C′, cs′) if:
1. C ⊆ C′
2. ∀e, e′ ∈ C′ . ¬(e # e′)
3. ∀e ∈ C′ \ C . cs(e) = ∅
4. ∀e, e′ ∈ E \ C′ . e′ ∈ cs(e) \ cs′(e) =⇒ [e′ → e]◁ ∩ (C′ \ C) ≠ ∅
5. ∀e, e′ ∈ E \ C′ . [e′ → e]◁ ∩ (C′ \ C) ≠ ∅ =⇒ e′ ∉ cs′(e)
6. ∀e, e′ ∈ E \ C′ . e′ ∈ cs′(e) \ cs(e) =⇒ ▷[e′ → e] ∩ (C′ \ C) ≠ ∅
7. ∀e, e′ ∈ E \ C′ . ▷[e′ → e] ∩ (C′ \ C) ≠ ∅ =⇒ e′ ∈ cs′(e)
8. ∀e, e′ ∈ E \ C . [e′ → e]◁ ∩ (C′ \ C) = ∅ ∨ ▷[e′ → e] ∩ (C′ \ C) = ∅
9. ∀t, m ∈ C′ \ C . ∀c ∈ E . m ▷ [c → t] =⇒ (c ∈ C ∨ m ∈ {c, t}).
Condition 1 ensures the accumulation of events. Condition 2 ensures conflict freeness. Condition 3 ensures that only events which are enabled after C can take place in C′. Condition 4 ensures that, if a cause disappears, there has to be a dropper of it. The same is ensured by Condition 6 for appearing causes. Condition 5 ensures that causes disappear from the causal state when their droppers occur, unless the cause itself occurred. Similarly, Condition 7 ensures that causes appear in the new causal state when their adders occur, unless they occurred. To keep the theory simple, Condition 8 avoids race conditions; it forbids the occurrence of an adder and a dropper of the same causal dependency within one transition. Condition 9 ensures that DCESs coincide with GESs.

Definition 19. Let Δ be a DCES. The set of (reachable) states of Δ is defined as S(Δ) = {(X, csX) | S0 →∗d (X, csX) ∧ X is finite}, where →∗d is the reflexive and transitive closure of →d.
Two DCESs Δ = (E, #, →, ◁, ▷) and Δ′ = (E′, #′, →′, ◁′, ▷′) are state transition equivalent, denoted by Δ ≃s Δ′, if E = E′ and →d ∩ (S(Δ))² = →′d ∩ (S(Δ′))².

Lemma 1. There are DCESs that are transition equivalent but not state transition equivalent.

Because of the previous lemma, we consider the more discriminating equivalence ≃s, instead of ≃t, to compare DCESs and to compare with DCESs. To compare with RCESs, we use the counterexample ργ of Fig. 3 to show that not for every RCES there is a transition-equivalent DCES. Moreover, RCESs cannot distinguish between different causality states of one configuration (cf. Lem. 1). Consequently DCESs and RCESs are incomparable.

Theorem 8. DCESs and RCESs are incomparable.

By construction, DCESs are at least as expressive as GESs and SESs. To embed a SES (or GES) into a DCES it suffices to choose ▷ = ∅ (or ◁ = ∅). Furthermore, DCESs are incomparable to RCESs, which are strictly more expressive than GESs and SESs. Thus DCESs are strictly more expressive than GESs and SESs.

Theorem 9. DCESs are strictly more expressive than GESs and SESs.

To compare with EBESs, we use the disabling of GESs and the disjunctive causality of SESs. The translation of an EBES into a DCES is formally defined in


Fig. 5. A DCES, a BES, and a DES modeling the medical example

[1], where disabling uses self-loops of target events, while droppers use auxiliary impossible events so as not to interfere with the disabling. Besides, we construct posets for the configurations of the translation and compare them with those of the original EBES. In this way we prove that DCESs are at least as expressive as EBESs. But since EBESs cannot model the conflict-free disjunctive causality of SESs, which are included in DCESs, the following result holds.

Theorem 10. DCESs are strictly more expressive than EBESs.

6 Conclusions

We study the idea that causality may change during system runs in event structures. For this, we enhance a simple type of ESs, the PES, by means of additional relations capturing the changes in events' dependencies, driven by the occurrence of other events.

First, in § 3, we limit our concern to the case where dependencies can only be dropped. We call the resulting event structure a Shrinking Causality ES (SES). In that section, we show that the exhibited dynamic causality can be expressed through a completely static perspective, by proving the equivalence between SESs and DESs. By such a proof, we do not only show the expressive power of our new ES, but also the big enhancement in expressive power (w.r.t. PESs) gained by adding only this one relation.

Later on, in § 4, we study the complementary style where dependencies can be added to events, resulting in the Growing Causality ES (GES). We show that growing causality can model both permanent and temporary disabling. Besides, it can be used to resolve conflicts and, furthermore, to force conflicts. Unlike SESs, GESs are not directly comparable to other types of ESs from the literature, except for PESs; one reason is that they provide a conjunctive style of causality, another is their ability to express conditioning in causality.


Finally, in § 5, we combine both approaches of dynamicity in a new type of event structures, which we call the Dynamic Causality ES (DCES). Therein a dependency can be both added and dropped. For this new type of ESs the following two, possibly surprising, facts can be observed: (1) there are types of ESs that are incomparable to both SESs and GESs, but that are comparable to (here: strictly less expressive than) DCESs, i.e. the combination of SESs and GESs; one such type is EBESs. (2) Though SESs and GESs are strictly less expressive than RCESs, their combination, the newly defined DCESs, is incomparable to RCESs, as well as to any other type of ESs with a static causality.

To highlight the pragmatic advantages of dynamic-causality ESs over their equivalent and non-equivalent competitor ESs, we return to the example mentioned in the motivation. Reichert et al. [10] emphasize that the model of such processes should distinguish between the regular execution path and the exceptional one. Accordingly, they define two labels, REGULAR and EXCEPTIONAL, to be assigned to tasks. Fig. 5 (a) shows a DCES model of our example, where rest represents the rest of the treatment process and ct represents the computer tomography. The initial causality in a DCES, e.g. ct → rest, corresponds to the regular path of a process, while the changes carried by modifiers, e.g. cardiac pacemaker, correspond to the exceptional one. Other static-causality ESs like a BES and a DES can model the fact that either the computer tomography XOR the X-ray is needed, as shown in Fig. 5 (b) and 5 (c). The same can be done by an equivalent RCES. However, we argue that none of these models can distinguish between regular and exceptional paths.

Thus our main contributions are: 1. We provide a formal model that allows us to express dynamicity in causality. Using this model, we enhance the PESs, yielding SESs, GESs, and DCESs. 2. We show the equivalence of SESs and DESs. 3. We show the incomparability of GESs to many other types of ESs. 4. We show that DCESs are strictly more expressive than EBESs and thus strictly more expressive than many other existing types of ESs. 5. We show that DCESs are incomparable to RCESs. 6. The new model succinctly supports modern workflow management systems.

In [5] Crafa et al. defined an event structure semantics for the π-calculus based on Prime ESs. Since the latter do not allow for disjunctive causality, which they needed, and in order to avoid duplications of events, they extended Prime ESs with a set of bound names and altered the configuration definition to allow for such disjunction. With Shrinking Causality ESs, which can express disjunctive causality, this problem could possibly be addressed more naturally without copying events. Here higher-order dynamicity, i.e. allowing the adding and dropping of adders and droppers, might help to deal with the instantiation of variables caused by communications involving bound names.

Up to now, we limit the execution of a DCES such that an interleaving between adders and droppers of the same causal dependency is forced. As future work, we want to study the case where modifiers of the same dependency can occur concurrently, i.e. at the very same instant of time, in DCESs. Similarly, we want to investigate the situation of concurrent occurrence of an


adder and its target in GESs. Furthermore, we want to study the ideas of adding and dropping by sets of events, or even higher-order dynamics, i.e. events that may change the role of other events to adders, droppers, or back to normal events. Additionally, the set of possible changes in our newly defined ESs must still be declared statically. We will also investigate the idea that ESs can evolve, by supporting ad hoc changes, such that new dependencies as well as new events can be added to a structure.

References

1. Arbach, Y., Karcher, D., Peters, K., Nestmann, U.: Dynamic Causality in Event Structures (Technical Report). TU Berlin (2015), http://arxiv.org/

2. Baldan, P., Busi, N., Corradini, A., Pinna, G.M.: Domain and event structure semantics for Petri nets with read and inhibitor arcs. Theoretical Computer Science 323(1-3), 129–189 (2004)

3. Baldan, P., Corradini, A., Montanari, U.: Contextual Petri Nets, Asymmetric Event Structures, and Processes. Information and Computation 171(1), 1–49 (2001)

4. Boudol, G., Castellani, I.: Flow Models of Distributed Computations: Three Equivalent Semantics for CCS. Information and Computation 114(2), 247–314 (1994)

5. Crafa, S., Varacca, D., Yoshida, N.: Event structure semantics of parallel extrusion in the pi-calculus. In: Birkedal, L. (ed.) FOSSACS 2012. LNCS, vol. 7213, pp. 225–239. Springer, Heidelberg (2012)

6. Katoen, J.P.: Quantitative and Qualitative Extensions of Event Structures. PhD thesis, Twente (1996)

7. Kuske, D., Morin, R.: Pomsets for Local Trace Languages - Recognizability, Logic & Petri Nets. In: Palamidessi, C. (ed.) CONCUR 2000. LNCS, vol. 1877, pp. 426–441. Springer, Heidelberg (2000)

8. Langerak, R.: Transformations and Semantics for LOTOS. PhD thesis, Twente (1992)

9. Langerak, R., Brinksma, E., Katoen, J.P.: Causal Ambiguity and Partial Orders in Event Structures. In: Mazurkiewicz, A., Winkowski, J. (eds.) CONCUR 1997. LNCS, vol. 1243, pp. 317–331. Springer, Heidelberg (1997)

10. Reichert, M., Dadam, P., Bauer, T.: Dealing with forward and backward jumps in workflow management systems. Software and Systems Modeling 2(1), 37–58 (2003)

11. Rensink, A.: Posets for Configurations! In: Cleaveland, W.R. (ed.) CONCUR 1992. LNCS, vol. 630, pp. 269–285. Springer, Heidelberg (1992)

12. van Glabbeek, R., Plotkin, G.: Event Structures for Resolvable Conflict. In: Fiala, J., Koubek, V., Kratochvíl, J. (eds.) MFCS 2004. LNCS, vol. 3153, pp. 550–561. Springer, Heidelberg (2004)

13. Weber, B., Reichert, M., Rinderle-Ma, S.: Change patterns and change support features - enhancing flexibility in process-aware information systems. Data & Knowledge Engineering 66(3), 438–466 (2008)

14. Winskel, G.: Events in Computation. PhD thesis, Edinburgh (1980)

15. Winskel, G.: An introduction to event structures. In: de Bakker, J.W., de Roever, W.-P., Rozenberg, G. (eds.) Linear Time, Branching Time and Partial Order in Logics and Models for Concurrency. LNCS, vol. 354, pp. 364–397. Springer, Heidelberg (1989)

16. Winskel, G.: Distributed Probabilistic and Quantum Strategies. In: Proceedings of MFPS. ENTCS, vol. 298, pp. 403–425. Elsevier (2013)


Loop Freedom in AODVv2

Kedar S. Namjoshi1 and Richard J. Trefler2

1 Bell Laboratories, Alcatel-Lucent, New York, [email protected]

2 University of Waterloo, Waterloo, [email protected]

Abstract. The AODV protocol is used to establish routes in a mobile, ad-hoc network (MANET). The protocol must operate in an adversarial environment where network connections and nodes can be added or removed at any point. While the ability to establish routes is best-effort under these conditions, the protocol is required to ensure that no routing loops are ever formed. AODVv2 is currently under development at the IETF; we focus attention on version 04. We detail two scenarios that show how routing loops may form in AODVv2 routing tables. The second scenario demonstrates a problem with the route table update performed on a Broken route entry. Our solution to this problem has been incorporated by the protocol designers into AODVv2, version 05. With the fix in place, we present an inductive and compositional proof showing that the corrected core protocol is loop-free for all valid configurations.

1 Introduction

The AODV (“Ad-Hoc On-Demand Distance Vector”) protocol family is under development by the IETF MANET (Mobile, Ad-Hoc Networking) group. Its current form is AODVv2¹, which has evolved from the earlier DYMO² and AODV³ protocols. As stated in the protocol description, AODVv2 “is intended for use by mobile routers in wireless, multihop networks. AODVv2 determines unicast routes among AODVv2 routers within the network in an on-demand fashion, offering rapid convergence in dynamic topologies.” AODVv2 is still evolving; our work focuses on the recent version 04 (which we refer to as AODVv2-04), published in July 2014. Subsequently, AODVv2-05 was issued in October 2014, and AODVv2-06 in December 2014.

K.S. Namjoshi – Supported, in part, by DARPA under agreement number FA8750-12-C-0166. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.
R.J. Trefler – Supported in part by Natural Sciences and Engineering Research Council of Canada Discovery and Collaborative Research and Development Grants.

1 http://datatracker.ietf.org/doc/draft-ietf-manet-aodvv2/2 http://datatracker.ietf.org/doc/draft-ietf-manet-dymo/3 http://datatracker.ietf.org/doc/rfc3561/

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 98–112, 2015.
DOI: 10.1007/978-3-319-19195-9_7


Loop Freedom in AODVv2 99

The environment in which AODVv2 operates is challenging, as network connections and nodes may be added or removed at any point. In such a setting, routes are established in a best-effort mode. However, the protocol is required to enforce a key safety property: there are no routing loops in any reachable global state. A routing loop is formed when the next-hop entries in routing tables are connected in a cyclic manner (e.g., node A has next-hop B; node B has next-hop C; and node C has next-hop A).
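The safety property can be stated operationally: following next-hop pointers must never revisit a node. The Python sketch below (illustrative, not part of the protocol: next_hop maps each node to its next hop toward a fixed destination, or None if it has no route) detects such a cycle.

```python
def has_routing_loop(next_hop):
    """Return True iff following next-hop entries from some node
    revisits a node, i.e. the routing tables contain a loop."""
    for start in next_hop:
        seen = set()
        node = start
        while node is not None and node in next_hop:
            if node in seen:
                return True
            seen.add(node)
            node = next_hop[node]
    return False
```

The example from the text, A with next-hop B, B with next-hop C, and C with next-hop A, is flagged as a loop, while the acyclic table ending at C is not.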

We construct a formal, inductive proof that an abstract model of the protocol has no routing loops. Such a proof has utility even though AODVv2 is not yet finalized. A proof elucidates broad conditions under which loop freedom can be guaranteed; those conditions can then be taken into account as the protocol is refined, and any necessary fixes can be incorporated quickly and with relatively little cost. Indeed, in the course of our analysis, we found that AODVv2-04 allows routing loops to form under certain sequences of actions; we discuss those scenarios in Section 1.2. The first example illustrates that if instance-specific timing constants are not set correctly, then routing loops may form. The second example is more serious, since it can occur even if the protocol parameters are set correctly. This problem was quickly acknowledged by the protocol designers and corrected, based on our input, for version 05 of AODVv2 (cf. [14], Appendix C: Changes since revision ...-04.txt).

Our model aims to capture the core of the AODVv2 protocol by abstracting

away some detail and by leaving out optional features. The main abstractionis that timer-driven actions are replaced either with non-determinism or withglobal predicate guards. For instance, our model allows a route entry to beinvalidated at any point, while the protocol permits this only after timer expira-tion. In another instance, routes marked as Expired are expunged (i.e., removedcompletely) only after there is no activity for at least MAX_SEQNUM_LIFETIME

seconds. Note that MAX_SEQNUM_LIFETIME is one of several instance specificconstants in the protocol. Our model abstracts away from such constants byreplacing the time-based preconditions with global network predicates. The pred-icates abstract nicely from the specifics of network structure, delays and process-ing power, which must go into determining a correct setting for these symbolicconstants.

The full AODVv2 protocol has mechanisms that allow a wide variety of metrics to determine the cost of a route. Our model considers only the hopcount metric, i.e., the metric that counts the number of network edge hops between the origin node of a route and the target node of a route. Hop count is an important metric used in practice, and AODVv2 correctness requires at a minimum that the protocol behave correctly with the hop count metric. Our proof can be adapted to other cost metrics where the cost of a link is greater than 0 and cost along a path is additive. Within these limits, the model exhibits all of the actual protocol computations, and more: hence, any proof that the model is correct shows also that the protocol, under the stated restrictions, is correct.

To summarize, our work makes two contributions: (1) we exhibit scenarios where AODVv2-04 allows routing loops to form, and suggest a protocol fix,


100 K.S. Namjoshi and R.J. Trefler

which has been adopted by the designers, and (2) we construct a formal model of the protocol and an inductive proof showing that the corrected core protocol ensures loop freedom. This proof is interesting in its treatment of adversarial actions and its use of compositional reasoning.

1.1 Protocol Sketch

We informally sketch the main features of the protocol before proceeding to the proof. The model we use is given in Section 3. The model fixes an origin node, O, and a target node, T. The protocol establishes a route from O to T in two phases. The first phase is initiated by O, and consists of flooding a RREQ (route request) message through the network (see footnote 4). Every node receiving this RREQ message maintains an "origin route" next-hop entry, which points to the neighboring node from which it has received the best route so far from O, i.e., (roughly) the (first) path with the least cost. Note that a node may receive multiple copies of the RREQ message sent from O, through different paths. Whenever the target node T receives a better RREQ route from O, it responds with an RREP (route reply) message. This message is not flooded: it follows (backwards) the path to O that has been established by the origin route entries. With fixed network connectivity and no message losses, this procedure converges (under mild conditions) to a least cost path from O to T. Under the network disruptions that are expected in the MANET model, though, there is no guarantee of convergence. Under adversarial control of the network and message transmission, the only property that is required of the protocol is that it should never form a global state which has a routing loop: i.e., a state where the set of origin route entries forms a cycle, such as where node A has next-hop B, B has next-hop C, and C has next-hop A.
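The first phase can be illustrated with a small simulation. The sketch below is ours, not part of AODVv2: it computes the origin-route entries that flooding converges to under the best-case assumptions just stated (fixed connectivity, no message loss), using breadth-first propagation in place of the actual asynchronous message exchange.

```python
from collections import deque

def flood_rreq(adj, origin):
    """Return {node: (hopcount, next_hop_toward_origin)} after RREQ flooding,
    assuming fixed connectivity and no message loss (best-case convergence)."""
    routes = {origin: (0, None)}        # dummy route at the origin: hopcount 0
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        hop, _ = routes[node]
        for nbr in adj[node]:
            # accept only a strictly better (cheaper) route, as the protocol does
            if nbr not in routes or routes[nbr][0] > hop + 1:
                routes[nbr] = (hop + 1, node)   # next hop points back to sender
                queue.append(nbr)
    return routes

# The line topology O - H' - H - T from the first scenario in Section 1.2.
adj = {"O": ["H'"], "H'": ["O", "H"], "H": ["H'", "T"], "T": ["H"]}
print(flood_rreq(adj, "O"))
# {'O': (0, None), "H'": (1, 'O'), 'H': (2, "H'"), 'T': (3, 'H')}
```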

The tricky part of the analysis has to do with the case of "broken" route entries, which are created when links in the network fail. If A has next-hop B and the A−B link fails, then the entry at A is marked as Broken. However, new copies of the RREQ message from O may arrive at A after the breakage. When should a route from one of those messages be accepted at A? Accepting any route at all – which makes sense in a way: an unbroken route, however bad, is surely preferable to a broken one – may lead to a routing loop, as shown in the second scenario below. This scenario was possible in version 04 of the protocol, and it

4 A data structure, the RREQ Table, is used in AODVv2-04 to control the flooding. Appendices A.1 and A.2.2 of AODVv2-04 describe precisely how the table is used: an incoming RREQ message is used to update a route entry, then the message is checked against the table to determine if it should be regenerated and sent to neighboring nodes. (We have confirmed this order of actions with the protocol authors, to resolve a slight ambiguity in the main text.) Hence, the table does not influence route updates; it may only stop the regeneration of RREQs, which is already included in our model as message loss. Therefore, we do not model the table.


was discovered by us in the attempt to construct a proof of loop freedom (see footnote 5). The partial proof pointed to the condition "accept any route that is not worse than the current broken route" as a possible resolution. We confirm that this is indeed a correct resolution through the formal proof given next. That resolution has also been accepted by the authors of AODVv2 and included starting with revision 05 of AODVv2 (cf. [14], Appendix C: Changes since revision ...-04.txt).

1.2 Loop Formation Scenarios

The first scenario creates a loop when the timer MAX_SEQNUM_LIFETIME is not set to a large enough value. The second creates a loop when any route is accepted in place of a broken one. Reading through the scenarios helps build intuition about how the protocol operates, which is helpful in understanding the proofs.

Poor choice of timer values. The AODVv2 protocol has several actions that are triggered by symbolic time constants. A protocol implementer has to give concrete values to these constants, a very difficult decision, as the correct values depend on the topology of the network, processing speeds, and transmission delays. As shown in the scenario below, a routing loop may result if the constants are inadvertently not set properly. The loop prevents RREP messages that are sent back from T, the target node, from reaching O, the originating node. As a result, no messages can be transferred from O to T.

We should note that this is not an error in the protocol – with a correct choice of constants, the loop will not occur. We model the time-based actions as guarded commands with an untimed guard over the network state. The proof shows that the model is loop-free. The modeling, therefore, helps to narrow down the choice of time constants: the values chosen for a network instance should be such that the guard condition is guaranteed to hold when the timers expire.

Fig. 1. Network: Early Expunge Scenario. Number by edge indicates hop-count, number in brackets indicates transmission delay in time units.

5 From Section 6.3 of AODVv2-04, one case of the condition for acceptance of a new route is "((Route.State == Broken) && LoopFree(RteMsg, Route))". The predicate LoopFree is defined in Section 5.6 as "LoopFree (R1, R2) is TRUE when Cost(R2) <= (Cost(R1) + 1)". Thus, LoopFree(RteMsg, Route) is true iff Cost(Route) <= (Cost(RteMsg) + 1). This allows the cost of the route in the incoming message, RteMsg, to be arbitrarily larger than the cost of the stored route, Route, if the stored route is in a Broken state.


The network is fixed as shown in Figure 1. The scenario is as follows:

1. An RREQ message created by O travels along the path O; H'; H; T. As a result, the origin route entries at these nodes have hop counts O=0, H'=1, H=2, T=3. A copy of the RREQ message remains undelivered on the link H−H'.

2. The route entry at H' is expired and then expunged.
This is the critical step. In AODVv2-04, timer conditions say when a route must be expunged. For non-timed routes, this happens (ref. Section 6.3) when (Current_Time - Route.LastUsed) >= MAX_SEQNUM_LIFETIME. However, the protocol (ref. Section 6.3) also allows routes to be expunged without reference to MAX_SEQNUM_LIFETIME: an Expired route may be expunged at any time (least recently used first). If this constant is set to too low a value, there will be messages within the network which are still undelivered. In the network of Figure 1, if MAX_SEQNUM_LIFETIME is set to 4 units, and the H−H' path (a single link is shown but it could be a path through intermediate nodes) has the delay shown in the figure, the protocol will force the routing entry at H' to be expunged while there is an undelivered RREQ. The correct value depends on many factors, including the size of the network, the length of paths in the network, and processing speeds and buffering at the nodes. In the model, we abstract this to a global predicate which must be met before the expunge action can occur.

3. The undelivered RREQ from H now reaches H'. Since H' has no entry, it accepts this route; its next-hop is now H.

4. H' sends a RREQ to H with hopcount 3. Since H already has an entry with a better hopcount, it rejects this message. At this point, there are no undelivered RREQ messages. The H'-H entries form a routing loop.

Broken Routes. A route entry at a node is marked as Broken if the node is made aware of a break in connectivity. The following scenario shows that a loop may form if a broken route is replaced by any valid route (as may seem reasonable, even if the new route has a higher cost).

Fig. 2. Network: Broken Route Scenario. Number over edge indicates hop-count (cost); red (slanted) line indicates break.

1. A RREQ message generated at O sets up the route O(0); X(1); A(2); B(3); T(4). The numbers in () are the hopcounts for the origin route entries at each node. A RREQ message remains undelivered on the lower A-B link.


2. The link X-A breaks, causing the route at A to be marked as Broken. After the link breaks, both X and A are required to send out RERR messages to their neighbors. We assume that those messages are lost and therefore neither O nor B is notified of the break.

3. A now receives a long route from O, with cost 100. This is the critical point. In version 04 of AODVv2, any valid route is acceptable in place of a broken one (see footnote 5 for details on why this is permitted), so this route will be accepted. The route entry at A is now valid and has hopcount 100.

4. A now has a non-broken route. It receives the previously undelivered RREQ from B, which has cost 4. As this cost is less than its current route cost (i.e., 100), A switches its next-hop to B. Node A then sends an RREQ to B, but that has higher cost than B's current route, and is rejected. No further RREQ messages remain in the network, so the A-B loop is stable.

2 Proof of Loop Freedom

The proof method is standard: we identify a suitable assertion and prove that it is an inductive invariant by showing that it is preserved by every action. However, the proof structure is more interesting: (1) we explicitly model network disruptions as adversarial actions, and (2) the induction proof is localized to the neighborhood of an arbitrarily chosen network edge; thus, it implicitly uses symmetry and is compositional in nature. Some aspects of the model are especially important for the proof (see Section 3 for more details of the model):

1. There is an underlying connectivity graph of nodes. We assume that the graph is finite but of arbitrary size. Nodes and links may fail, and new links can be formed at any point. A node can also be restarted after failure.

2. Any link change and the reaction to it happens atomically with respect tothe actions of the protocol.

3. We fix an arbitrary origin node, O, and an arbitrary target node, T, such that T differs from O. Protocol analysis is then based on the discovery and maintenance of bidirectional routes from O to T.

4. A route entry has a sequence number, a hop-count, and a state (see footnote 6). We say that an entry x is "better" than an entry y if (seq_x, −hop_x) is lexicographically strictly greater than (seq_y, −hop_y), i.e., if seq_x > seq_y, or if seq_x = seq_y and hop_x < hop_y. In this situation, we also say that y is "worse" than x. We write this relationship as y ≺ x. We treat sequence numbers as natural numbers; i.e., we do not model wrap-around effects. In AODVv2-04, a node has its own sequence number generator, with the range [0 . . . 65535]. A new sequence number is assigned for a fresh route request/response. As the numbers are assigned per node and the protocol separates routing entries by (origin, target), the AODVv2 drafts implicitly assume that comparison of a route with a wrap-around successor route is very unlikely.

6 In AODVv2-04, an entry is also labeled with an (origin, target) pair. As the model fixes the origin and target nodes, we omit this label.
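The ordering in item 4 can be written down directly. In the sketch below an entry is a hypothetical (seq, hop) pair and the function name is ours; it implements exactly the lexicographic comparison on (seq, −hop).

```python
def better(x, y):
    """True iff entry x = (seq, hop) is strictly better than entry y,
    i.e. (seq_x, -hop_x) is lexicographically greater than (seq_y, -hop_y)."""
    seq_x, hop_x = x
    seq_y, hop_y = y
    return (seq_x, -hop_x) > (seq_y, -hop_y)

print(better((2, 5), (1, 3)))   # True: a fresher sequence number wins
print(better((2, 3), (2, 5)))   # True: same freshness, fewer hops wins
print(better((2, 5), (2, 5)))   # False: the ordering is strict
```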


2.1 Proof Summary

We consider first the origin route established through RREQ messages, and show that it cannot have a routing loop. To avoid case-splitting, we suppose that there is a dummy route to O at O, given by hopcount 0 and the sequence number of O. The proof hinges on showing the following lemma, which gives the desired theorem below.

Lemma 1. The following is an inductive invariant: for any node H, and for any node G: if H has a route entry to O with next hop G, then G has a route entry to O that is better than the entry at H.

Theorem 1. The protocol never reaches a state with a routing loop formed from origin route entries.

Proof: The proof is by contradiction. Suppose that there is a reachable protocol state with a routing loop induced by the entries for origin routes. Pick a node, say H, on the loop other than O (there must be one such) and go around the loop from H in the next-hop direction. By Lemma 1, the route entries along this circuit improve strictly at each hop. By the transitivity of ≺, the route entry at H is strictly worse than the route entry at H, a contradiction. EndProof.

2.2 The Main Proof

The bulk of the proof lies in establishing the invariance condition in Lemma 1. We do so by induction: i.e., we show that the statement holds in the initial protocol configuration, and that it is preserved by protocol actions and by dynamic network changes. For brevity, we use "route" in place of "route entry" throughout; it should be understood that route does not refer to a path connecting several nodes together. We require an auxiliary lemma, given below. It states that routes in RREQ/RREP messages on outgoing channels adjacent to a node are no better than the corresponding route at that node.

Lemma 2. The following is an inductive invariant:

(a) For any node H, the route to O in any RREQ message for (O, T) on any outgoing link from H is not better than the route for O at H.

(b) For any node H, the route to T in any RREP message for (O, T) on the link from H to its next-hop on the route to O is not better than the route for T at H.

Proof of Lemma 2(a): The claim holds trivially at the initial state, as all connection links are empty.

Consider a transition from global state s to global state t, and suppose that the claim holds at s. To show that it holds for t, consider a node H (other than O) in t. The proof is by case analysis on the transition which takes the system from s to t.


Consider first the normal operations. Most cases are straightforward. If the transition is for a node other than H, it only affects the neighborhood of H if a message is removed from an outgoing link of H; in this case, the invariant is trivially preserved. Changes to the route state of H (idle-route or expire-route) do not change any routes. Expunging the route (expunge-route) preserves the invariant as its guard requires all outgoing channels from H to be empty. The generation of a RREQ (rreq-gen) can only be done by H if it is O: in that case the route generated is equal to the (dummy) route for O at O.

The interesting case is where the route at H is updated through a RREQ or RREP message (rreq-recv, rrep-recv). Let x be the route at H to O in s and let y be its route in t. Then either y ≻ x (in the normal case) or y ⪰ x (if x is Broken). Now for every route r in a RREQ message on the link in s, the inductive hypothesis requires that x ⪰ r, so that y ⪰ r by transitivity. Every new RREQ message generated by H through rreq-recv carries the route y. This re-establishes the invariant. Processing an RERR message may only invalidate but not change the origin route.

We now consider the dynamic changes. Dropping a message, and removing a node or a link, trivially preserves the invariant as no routes are changed. The addition of a link to H establishes the invariant for that link, as the link is empty. The interesting case is if H is a recovered node (recover-node). By the pre-condition for recovery (see the model detailed in the next sections), all of H's outgoing channels are empty, so the invariant holds. EndProof.

The proof for part (b) is essentially identical, as the processing of RREQ messages in rreq-recv and RREP messages in rrep-recv is nearly symmetric.

Proof of Lemma 1: The claim holds trivially at the initial state, as all routes are undefined. Consider a transition from global state s to global state t, and suppose that the claim holds at s. We show that it holds at t by case analysis on the transition.

We first consider the normal protocol actions. In state t, consider node H, and a node G such that the route to O from H has next-hop G. We have to show that G has a route better than the route at H. Consider the possible actions.

(1) The action does not involve either H or G. So there is no change in the routes at the two nodes. By assumption, the claim holds for (G, H) in s, so it continues to hold in t.

(2) The action is one of G. Modifications to route state (idle-route, expire-route) do not affect routes, so the claim continues to hold from the assumption for s. The action cannot be an expunge, as its guard is not met in s, as the entry for H in s has next-hop G. Processing of RERR messages does not change the route at G (although its state may change). The interesting case is where G updates its route to O from r_G in s to r′_G in t by processing an RREQ or an RREP message. By the protocol, r′_G ⪰ r_G. Since r′_H = r_H, and r_G ≻ r_H by assumption for s, we get that r′_G ≻ r′_H in t.

(3) The action is one of H. Modifications to route state (idle-route, expire-route) do not affect routes, so the invariant is preserved from s. The action cannot be an expunge, as H has an entry in t. The interesting case is if H


updates its route to O through a rreq-recv for (O, T) or through a rrep-recv for (X, O), where X is some node. Since H points to G in t, the updating message must be from G. Say this message carries a route r, and let r_G be the route to O at G in s. By Lemma 2, regardless of the message type (RREQ or RREP), at state s, r_G ⪰ r. The new route at H is obtained from r by incrementing its hopcount, so it is worse than r (i.e., r′_H ≺ r). The route in G is unchanged in the transition (i.e., r′_G = r_G). Hence, we have r′_G = r_G ⪰ r ≻ r′_H. By transitivity, r′_G ≻ r′_H, as is desired. (Note the crucial role played by the hopcount increment at H.) Actions which process RERR messages do not change the route at H, so they preserve the invariant.

We now consider dynamic changes which affect H and G.

(4) Dropping a message from a link, and removing a link, trivially preserves the invariant as no routes are changed. (Note that the link between H and G may be broken by the transition, yet H's route entry still points to G in t.)

(5) The action cannot be the addition or restart of H, as the newly added H would not have an origin route in t. The action may not add G as a fresh node either, as the next-hop entry for G exists for H in s.

(6) The action cannot be the restart of G, as its precondition requires there to be no entries which have G as a next-hop, and H has such an entry in s.

(7) The action cannot be the removal of nodes G or H, as we are only stating the claim where both nodes exist in the network at t. (In t, there may be a node H′ which has a next-hop entry for G′, but G′ is no longer in the network at t. Such a (G′, H′) pair is not part of the invariant claim.) EndProof.

RREP Invariants. RREP (route response) messages are generated whenever a new RREQ message reaches its target. They follow a single path from target to source which is set up by the origin route entries. I.e., unlike RREQs, the RREP messages do not flood the network. The RREP messages create "target route" entries at each node, which determine a path from that node to the target, T. However, the origin route path at the point an RREP message is created may change as the protocol progresses and intermediate nodes receive better routes. It may also change as the result of network disruptions and rearrangements. Hence, it is not obvious that RREP messages do not induce a routing loop in the target route entries. The proof that the target routes created by RREP messages are loop-free is similar in structure to the RREQ loop-freedom proof. This is possible as the protocol is nearly symmetric in its handling of RREQ and RREP messages. We therefore omit this proof.

3 AODVv2 Model

We describe the protocol model from the viewpoint of a node with name H .

Data Structures. A node maintains a route table route, indexed by nodes. The route to a node may be undefined, which we denote by ⊥. If defined, a route to a node is a pair (n, e), where n is the next-hop node and e is its route entry.


An entry is of the form (s, h, x), where s is a sequence number, h is the hopcount (or, more generally, the cost), and x is the state of the route (one of Active, Idle, Expired, or Broken). It is assumed that s and h are non-negative numbers. In addition, a node maintains its own sequence number, referred to as seq. We use standard notation to refer to these components; for instance, n.route[O].e.h refers to the hopcount of the route entry to node O at node n.
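As a concrete illustration, the per-node state can be sketched as follows. The Python representation and names are ours; the draft prescribes no particular encoding, and a missing key in the route table plays the role of ⊥.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    s: int      # sequence number
    h: int      # hopcount (more generally, the route cost)
    x: str      # route state: "Active", "Idle", "Expired", or "Broken"

@dataclass
class Route:
    n: str      # next-hop node
    e: Entry    # route entry

@dataclass
class Node:
    seq: int = 0                               # the node's own sequence number
    route: dict = field(default_factory=dict)  # destination -> Route; absent = ⊥

n = Node()
n.route["O"] = Route(n="G", e=Entry(s=3, h=2, x="Active"))
print(n.route["O"].e.h)   # the notation n.route[O].e.h from the text -> 2
```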

Messages. The protocol has three types of messages: RREQ (route request), RREP (route reply) and RERR (route error). Each message has the following components: h (a hopcount), tlv = (sO, sT) (sequence numbers for origin and target, possibly undefined), and (O, T) – the origin and target pair. We write a message as, for example, RREQ(h, (sO, sT), (O, T)).

Initial State. In its initial state, a node has undefined origin and target routes, and sequence number 0.

Protocol Actions. Here, we list the actions taken during normal operation. The actions are atomic but may occur at any time. In the protocol, actions such as expire-route are based on timers, to ensure that they do not happen too often. Since we are concerned with correctness, not performance, we replace such uses of timing by non-determinism. There are some parts of the protocol where timed actions are used as a proxy for global conditions. In the model, we replace such timers with global guards.

In the description below, we have also made certain actions (e.g., processing of RERRs) have more effect, or be more often enabled, than the actual protocol recommends. This can only result in the model having more executions than the actual protocol, so any invariants shown for the model also hold for the protocol.

The notation y >> x expresses that the route in the route message y is preferable to the route table entry x. From the AODVv2 protocol description, this is true if (1) y.s > x.s, or if (2) y.s = x.s, and either (a) y.h+1 < x.h, or (b) x is in the Broken state and y.h+1 ≤ x.h. (Term (b) is the correction introduced in AODVv2-05 based on the second loop-formation scenario from Section 1.2.)
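The corrected comparison can be sketched directly from this definition. The field names (s, h, broken) are hypothetical; y is the route carried in an incoming message (before the hopcount increment at the receiver), and x is the stored table entry.

```python
from collections import namedtuple

Msg = namedtuple("Msg", "s h")          # route in an incoming route message
Ent = namedtuple("Ent", "s h broken")   # stored route table entry

def preferable(y, x):
    """The y >> x predicate: True iff the message route y should replace x."""
    if y.s > x.s:                           # (1) strictly fresher sequence number
        return True
    if y.s == x.s:
        if y.h + 1 < x.h:                   # (2a) strictly cheaper
            return True
        if x.broken and y.h + 1 <= x.h:     # (2b) not worse than a Broken route
            return True                     #      (the AODVv2-05 correction)
    return False

# The Broken Route scenario of Section 1.2: A holds a Broken entry with
# hopcount 2 and a cost-100 route arrives. Under AODVv2-04's "accept any
# valid route" rule it would be taken; the corrected rule rejects it.
print(preferable(Msg(s=1, h=100), Ent(s=1, h=2, broken=True)))  # False
print(preferable(Msg(s=1, h=1), Ent(s=1, h=2, broken=True)))    # True
```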

We introduce the global predicate AllClear, which replaces the time-driven actions based on MAX_SEQNUM_LIFETIME. The predicate AllClear(H) holds iff (1) there are no messages in any channel of the network with origin or target being H, and (2) all outgoing channels from H are empty, and (3) no other node has an Active route entry with next-hop H. This global condition is not present in the actual protocol, as it cannot be checked locally. The protocol instead defines a symbolic time constant, MAX_SEQNUM_LIFETIME – a node waits until that much time has expired before expunging an entry. The protocol description does not specify how this value is to be chosen for a network instance: the value should, clearly, depend on factors such as the size of the network, the link delays, and the processing power of a node. The global condition defined here abstracts from these considerations: the time value should be set so that the global condition is guaranteed to be true after that much time has elapsed.
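The three conjuncts of AllClear(H) can be sketched as a check over a global state. The structures are hypothetical: channels maps each directed link (u, v) to its list of in-flight messages (each carrying origin and target fields), and routes maps each node to a table from destinations to (next-hop, state) pairs.

```python
def all_clear(h, channels, routes):
    """True iff the global predicate AllClear(h) holds in this state."""
    # (1) no message anywhere in the network has h as origin or target
    for msgs in channels.values():
        for m in msgs:
            if m["origin"] == h or m["target"] == h:
                return False
    # (2) all outgoing channels from h are empty
    for (src, _dst), msgs in channels.items():
        if src == h and msgs:
            return False
    # (3) no other node has an Active route entry with next-hop h
    for node, table in routes.items():
        if node == h:
            continue
        for _dest, (nxt, state) in table.items():
            if nxt == h and state == "Active":
                return False
    return True

channels = {("A", "H"): [], ("H", "B"): []}
routes = {"A": {"O": ("H", "Expired")}, "B": {}}
print(all_clear("H", channels, routes))   # True: no Active entry points at H
```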


skip Do nothing.
expunge-route Remove route if its state is Expired, and AllClear(H) holds.
idle-route Change route state to Idle if Active.
expire-route Change route state to Expired if Idle.
rreq-gen(T) This generates an RREQ (request) message to node T.

true ==>
  let msg = RREQ(h=0, (sO=H.seq+1, sT=H.route[T].e.s), (H,T)) in
  H.seq := H.seq+1;
  multicast(msg)

rreq-recv(RREQ(m),K) This action processes an RREQ message m = (h, (sO, sT), (O, T)) from neighbor K. It is guarded by the condition that the route in m is better than the origin route at node H.

(m.sO,m.h,Active) >> H.route[O].e ==>   // m has a better route to the origin
  // update the origin route
  H.route[O] := (K,(m.sO,m.h+1),Active);
  // propagate or reply as appropriate
  if (H=T) then                         // H is the target node: reply with RREP
    let reply = RREP(h=0,(sO=m.sO,sT=H.seq+1),(O,T)) in
    H.seq := H.seq+1;                   // update local sequence number
    unicast(reply, K)                   // send only to K
  else                                  // H is an intermediate node: propagate
    let msg = RREQ(m.h+1, m.tlv, (O,T)) in
    multicast(msg)                      // send to all neighbors
  endif

rrep-recv(RREP(m),K) This action processes a reply (RREP) message m = (h, (sO, sT), (O, T)) from neighbor K if it contains a better target route.

(m.sT,m.h,Active) >> H.route[T].e ==>   // m has a better route to the target
  // update the target route
  H.route[T] := (K,(m.sT,m.h+1),Active);
  // propagate as appropriate
  if (H = O) then                       // H is the origin node: do nothing
    skip
  else                                  // H is an intermediate node
    if (H.route[O] is defined) then     // propagate RREP
      let replymsg = RREP(m.h+1, m.tlv, (O,T)) in
      unicast(replymsg, H.route[O].n)
    else                                // generate error RERR
      let errormsg = RERR(h=0,tlv=(_,_)) in
      unicast(errormsg, K)
    endif
  endif

rerr-recv(RERR(m),K) This action processes an error (RERR) message from neighbor K. Mark any routes passing through K as broken, and propagate the error. This is more permissive than the protocol in marking routes as Broken: in the protocol, there are other fields in the RERR message which H can use to distinguish whether the error message from K pertains to an origin or a target route.

true ==>
  for all nodes w:
    if (H.route[w].n = K) then
      H.route[w].e.x := Broken;         // mark route as broken
      multicast(RERR(m))                // propagate RERR to all neighbors
    endif

Dynamic Actions. We now describe protocol actions taken in response to dynamic changes. In our model the adversary may add, recover, or delete nodes, and may add or delete edges. Edges may be deleted by the adversary at any point during protocol execution. However, the adversary may delete a node only if the node is not linked to any edge. Below we give the detailed response that the protocol takes to adversarial actions.

remove-node(H) Do nothing.

new-node(H) If H is a new node, it starts at its initial state, and all outgoing channels are empty.

recover-node(H) H is a recovered node. It does not re-join the protocol until the condition AllClear(H) holds. This is the same global guard as that for expunge-route. That is not a coincidence: the two conditions should be the same, as shown by the first loop-formation scenario from Section 1.2. The actual AODVv2-04 protocol says that a node can re-join the protocol once MAX_SEQNUM_LIFETIME seconds have elapsed.

remove-link(H,K) Mark any routes through K as being broken, and send RERR messages accordingly.

true ==>
  for all nodes w:
    if (H.route[w].n = K) then
      H.route[w].e.x := Broken;         // mark route as broken
      multicast(RERR(m))                // propagate RERR to all neighbors
    endif

add-link(H,K) New link from H to K established. Do nothing.

4 Related Work and Conclusions

There is a long history of research on inductive and compositional analysis applied to network protocols: the work in [13,15,3,16,4] is representative. The contribution of this work is to apply these ideas to the verification of a protocol operating under dynamic, adversarial network changes. Our proof technique is standard (cf. [4]): we postulate an assertion and show that it is inductive by proving that it is preserved by every action. However, there are interesting aspects to the structure of the proof. Most importantly, our proof technique is 'local', that is, it is applied to a generic protocol node (or edge), and considers interference from only the nodes in the neighborhood of that node (or edge) during protocol execution. Hence, the method is compositional. It relies on symmetry in the sense that the generic node analyzed represents any of the nodes that may arise during the execution of the actual protocol. In addition, the possibility of adversarial network change is taken care of by modeling the changes as non-deterministic actions, which are always enabled, and may take effect at any point. In [12] a


simpler model of AODVv2 was analyzed. In particular, the AODVv2 model in that earlier work did not incorporate node restarts or the expunging of Expired route table entries. This meant that the earlier model did not need to consider the AllClear global guard. We note that consideration of Expired routes leads to the first example of a routing loop in Section 1.2.

The formation of routing loops has been studied for earlier forms of the AODVv2 protocol (AODV and DYMO) in [1,5,17,11] and [18]. Although it operates in the same environment and has the same goals, the version of AODVv2 under development differs significantly, in part due to efforts made to ensure that routing loop scenarios discovered for earlier forms are avoided. For instance, the use of sequence numbers in AODVv2 is completely different from that in AODV and DYMO. The version of AODVv2 (DYMO) analyzed in [9] by model checking fixed configurations allows intermediate, non-target nodes to generate RREQs; this is not possible in AODVv2-04. Nonetheless, some key features have been retained across the protocol versions. An important one is the use of (sequence number, hopcount) as a metric to ensure loop freedom. That is to be expected, as the intuition given in all of the protocol descriptions is that the sequence number represents the "freshness" of a route, while hopcount represents its "cost". Our work shows that this intuition is valid; but it also shows (from the loop formation scenarios) that care must be taken when considering disruptive network changes. We have found it surprisingly easy to construct the proof, and we suspect that this is so because of a focus, through compositional reasoning, on 'local' state rather than 'global' state, and the many simplifications introduced by the designers.

The AODVv2 model verified here represents a possible abstract protocol implementation. However, several features or options of the full protocol are either not modeled or are not modeled in their full generality. For instance, in our version each addressable entity in the network is, if present, identified with a single node in any network topology. In contrast, in the full AODVv2 protocol, entities may be 'multi-homed', and therefore messages sent to the entity may be sent to multiple destinations.

Another significant difference is that in the model, we assume that the metric used by all nodes to determine the 'least cost route' to a destination is based on hop count. That is, the distance between any two neighboring nodes is 1, and the cost of a path from node O to node T is the number of nodes in the path minus 1. The protocol actually allows protocol implementers to choose a different metric, which changes the 'least cost route'. In practice, such metrics may include information relating to the bandwidth of individual edges connecting neighboring nodes, or the implementation of individual edge connections (wireless, wired, etc.), to name just a few possible metrics.
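As an illustration (ours, not part of the paper's formal model), the (sequence number, hop count) route preference discussed above can be sketched as a simple comparison: a route with a fresher sequence number always wins, and among equally fresh routes the one with fewer hops wins.

```python
# Hypothetical sketch of AODV-style route preference by the
# (sequence number, hop count) metric: higher seqnum = fresher
# information; for equal seqnums, lower hopcount = cheaper path.

def better_route(candidate, current):
    """Return True if `candidate` should replace `current` in a route table.

    Each route is a (seqnum, hopcount) pair.
    """
    cand_seq, cand_hops = candidate
    cur_seq, cur_hops = current
    if cand_seq != cur_seq:
        return cand_seq > cur_seq
    return cand_hops < cur_hops

# A fresher route wins even if it is longer:
assert better_route((5, 7), (4, 2))
# With equal freshness, the shorter route wins:
assert better_route((5, 2), (5, 3))
```

This lexicographic ordering is what makes the "freshness before cost" intuition precise, and it is the ordering that the loop-formation scenarios stress-test under disruptive network changes.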

In addition, we note that the full AODVv2 protocol allows great scope for implementation decisions in the following form. Many per-node protocol decisions are described as 'must' but some are described as 'may.' For instance, if the route from node H to node T is marked as 'expired' in the route table of H, then H must not advertise this route to its neighbors. However, if H receives


Loop Freedom in AODVv2 111

an RREQ for T from a neighbor G, then H may choose to add this route to O to H's routing table and advertise the RREQ message to H's neighbors. In our analysis, we model these decisions as must instructions. Hence, any RREQ message received at a node H will be processed at H and forwarded to H's neighbors. We note that the models described in our work represent models allowed by the AODVv2 protocol, and therefore any errors or discrepancies found in the modelled protocol would represent discrepancies in the full AODVv2 protocol.

There are several other approaches to the analysis of dynamic and ad-hoc networks. The work in [2] shows that Hoare triples for restricted logics are decidable. Work in [8,6] applies well-quasi-ordering (wqo) theory to ad-hoc networks, while the algorithm of [7] relies on symbolic forward exploration, as does (in a different way) the method of [17]. It would be interesting to see how well these algorithmic and semi-algorithmic methods apply to the AODVv2 model. Our own recent work [12] shows that the loose coupling forced by dynamic network changes contributes to the effectiveness of compositional reasoning and local symmetry reduction.

4.1 Conclusion and Future Work

We describe a formal proof of loop-freedom for a model of the AODVv2 protocol. In the course of doing so, we discovered a mistake in version 04 of the protocol, which has been acknowledged and corrected by the designers. The straightforward nature of the proof strengthens the conjecture which originally inspired this work: that dynamic network protocols must be loosely coupled and, hence, especially amenable to inductive compositional analysis.

There are several open questions that remain. For instance, we are interested in techniques for the automatic generation of inductive compositional assertions for use in the analysis of loosely coupled dynamic systems. Other questions surround the analysis of AODVv2 itself. One is to check whether the chosen values for timing constants are correct for a given configuration of the protocol; the work in [10] can be a good starting point. Another is to generalize this proof to apply to a richer class of distance metrics, as well as to network features such as multi-homing. A particularly important question is to find a good strategy for constructing proofs for the various combinations of "may" options which are permitted by the protocol, while avoiding a combinatorial explosion of protocol variants. As nearly all network protocols include a number of may options, this is a broadly applicable question, and especially relevant in practice.

Acknowledgments. We would like to thank the authors of the AODVv2-04 protocol, in particular Charles Perkins, for helpful comments on the loop-formation scenarios and the proof.



References

1. Bhargavan, K., Obradovic, D., Gunter, C.A.: Formal verification of standards for distance vector routing protocols. J. ACM 49(4), 538–576 (2002)

2. Bouajjani, A., Jurski, Y., Sighireanu, M.: A generic framework for reasoning about dynamic networks of infinite-state processes. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 690–705. Springer, Heidelberg (2007)

3. Chandy, K., Misra, J.: Proofs of networks of processes. IEEE Transactions on Software Engineering 7(4) (1981)

4. Chandy, K.M., Misra, J.: Parallel Program Design: A Foundation. Addison-Wesley (1988)

5. Das, S., Dill, D.L.: Counter-example based predicate discovery in predicate abstraction. In: Aagaard, M.D., O'Leary, J.W. (eds.) FMCAD 2002. LNCS, vol. 2517, pp. 19–32. Springer, Heidelberg (2002)

6. Delzanno, G., Sangnier, A., Traverso, R., Zavattaro, G.: On the complexity of parameterized reachability in reconfigurable broadcast networks. In: FSTTCS. LIPIcs, vol. 18, pp. 289–300. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2012)

7. Delzanno, G., Sangnier, A., Zavattaro, G.: Parameterized verification of safety properties in ad hoc network protocols. In: PACO. EPTCS, vol. 60, pp. 56–65 (2011)

8. Delzanno, G., Sangnier, A., Zavattaro, G.: Verification of ad hoc networks with node and communication failures. In: Giese, H., Rosu, G. (eds.) FMOODS/FORTE 2012. LNCS, vol. 7273, pp. 235–250. Springer, Heidelberg (2012)

9. Edenhofer, S., Höfner, P.: Towards a rigorous analysis of AODVv2 (DYMO). In: 20th IEEE International Conference on Network Protocols, ICNP 2012, Austin, TX, USA, October 30–November 2, pp. 1–6. IEEE (2012)

10. Fehnker, A., van Glabbeek, R., Höfner, P., McIver, A., Portmann, M., Tan, W.L.: Automated analysis of AODV using UPPAAL. In: Flanagan, C., König, B. (eds.) TACAS 2012. LNCS, vol. 7214, pp. 173–187. Springer, Heidelberg (2012)

11. Höfner, P., van Glabbeek, R.J., Tan, W.L., Portmann, M., McIver, A., Fehnker, A.: A rigorous analysis of AODV and its variants. In: MSWiM, pp. 203–212. ACM (2012)

12. Namjoshi, K.S., Trefler, R.J.: Analysis of dynamic process networks. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 164–178. Springer, Heidelberg (2015)

13. Owicki, S.S., Gries, D.: Verifying properties of parallel programs: An axiomatic approach. Commun. ACM 19(5), 279–285 (1976)

14. Perkins, C., Ratliff, S., Dowdell, J.: IETF MANET WG Internet Draft (December 2014), http://datatracker.ietf.org/doc/draft-ietf-manet-aodvv2, current revision 06

15. Pnueli, A.: The temporal logic of programs. In: FOCS (1977)

16. Pnueli, A.: In transition from global to modular reasoning about programs. In: Logics and Models of Concurrent Systems. NATO ASI Series (1985)

17. Saksena, M., Wibling, O., Jonsson, B.: Graph grammar modeling and verification of ad hoc routing protocols. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 18–32. Springer, Heidelberg (2008)

18. van Glabbeek, R.J., Höfner, P., Tan, W.L., Portmann, M.: Sequence numbers do not guarantee loop freedom – AODV can yield routing loops. In: MSWiM, 10 pages. ACM (2013)


Code Mobility Meets Self-organisation: A Higher-Order Calculus of Computational Fields

Ferruccio Damiani1(�), Mirko Viroli2, Danilo Pianini2, and Jacob Beal3

1 University of Torino, Torino, Italy
[email protected]

2 University of Bologna, Bologna, Italy
{mirko.viroli,danilo.pianini}@unibo.it
3 Raytheon BBN Technologies, Cambridge, USA

[email protected]

Abstract. Self-organisation mechanisms, in which simple local interactions result in robust collective behaviors, are a useful approach to managing the coordination of large-scale adaptive systems. Emerging pervasive application scenarios, however, pose an openness challenge for this approach, as they often require flexible and dynamic deployment of new code to the pertinent devices in the network, and safe and predictable integration of that new code into the existing system of distributed self-organisation mechanisms. We approach this problem of combining self-organisation and code mobility by extending "computational field calculus", a universal calculus for specification of self-organising systems, with a semantics for distributed first-class functions. Practically, this allows self-organisation code to be naturally handled like any other data, e.g., dynamically constructed, compared, spread across devices, and executed in safely encapsulated distributed scopes. Programmers may thus be provided with the novel first-class abstraction of a "distributed function field", a dynamically evolving map from a network of devices to a set of executing distributed processes.

1 Introduction

In many different ways, our environment is becoming ever more saturated with computing devices. Programming and managing such complex distributed systems is

This work has been partially supported by HyVar (www.hyvar-project.eu; this project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 644298 - Damiani), by EU FP7 project SAPERE (www.sapere-project.eu, under contract No 256873 - Viroli), by ICT COST Action IC1402 ARVI (www.cost-arvi.eu - Damiani), by ICT COST Action IC1201 BETTY (www.behavioural-types.eu - Damiani), by the Italian PRIN 2010/2011 project CINA (sysma.imtlucca.it/cina - Damiani & Viroli), by Ateneo/CSP project SALT (salt.di.unito.it - Damiani), and by the United States Air Force and the Defense Advanced Research Projects Agency under Contract No. FA8750-10-C-0242 (Beal). The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views, opinions, and/or findings contained in this article are those of the author(s)/presenter(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. Approved for public release; distribution is unlimited.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 113–128, 2015.
DOI: 10.1007/978-3-319-19195-9_8


114 F. Damiani et al.

a difficult challenge and the subject of much ongoing investigation in contexts such as cyber-physical systems, pervasive computing, robotic systems, and large-scale wireless sensor networks. A common theme in these investigations is aggregate programming, which aims to take advantage of the fact that the goals of many such systems are best described in terms of aggregate operations and behaviours, e.g., "distribute the new version of the application to all subscribers", or "gather profile information from everybody in the festival area", or "switch on safety lights on fast and safe paths towards the emergency exit". Aggregate programming languages provide mechanisms for building systems in terms of such aggregate-level operations and behaviours, and a global-to-local mapping that translates such specifications into an implementation in terms of the actions and interactions of individual devices. In this mapping, self-organisation techniques provide an effective source of building blocks for making such systems robust to device faults, network topology changes, and other contingencies. A wide range of such aggregate programming approaches have been proposed [3]: most of them share the same core idea of viewing the aggregate in terms of dynamically evolving fields, where a field is a function that maps each device in some domain to a computational value. Fields then become first-class elements of computation, used for tasks such as modelling input from sensors, output to actuators, program state, and the (evolving) results of computation.

Many emerging pervasive application scenarios, however, pose a challenge to these approaches due to their openness. In these scenarios, there is a need to flexibly and dynamically deploy new or revised code to pertinent devices in the network, to adaptively shift which devices are running such code, and to safely and predictably integrate it into the existing system of distributed processes. Prior aggregate programming approaches, however, have either assumed that no such dynamic changes of code exist (e.g., [2,21]), or else provide no safety guarantees ensuring that dynamically composed code will execute as designed (e.g., [15,22]). Accordingly, our goal in this paper is to develop a foundational model that supports both code mobility and the predictable composition of self-organisation mechanisms. Moreover, we aim to support this combination such that these same self-organisation mechanisms can also be applied to manage and direct the deployment of mobile code.

To address the problem in a general and tractable way, we start from the field calculus [21], a recently developed minimal and universal [5] computational model that provides a formal mathematical grounding for the many languages for aggregate programming. In field calculus, all values are fields, so a natural approach to code mobility is to support fields of first-class functions, just as with first-class functions in most modern programming languages and in common software design patterns such as MapReduce [10]. By this mechanism, functions (and hence, code) can be dynamically consumed as input, passed around by device-to-device communication, and operated upon just like any other type of program value. Formally, expressions of the field calculus are enriched with function names, anonymous functions, and application of function-valued expressions to arguments, and the operational semantics properly accommodates them with the same core field calculus mechanisms of neighbourhood filtering and alignment [21]. This produces a unified model supporting both code mobility and self-organisation, greatly improving over the independent and generally incompatible mechanisms which have typically been employed in previous aggregate programming approaches. Programmers are thus provided with a new first-class abstraction of a "distributed function field": a dynamically evolving map from the network to a set of executing distributed processes.

Section 2 introduces the concepts of higher-order field calculus; Section 3 formalises their semantics; Section 4 illustrates the approach with an example; and Section 5 concludes with a discussion of related and future work.

2 Fields and First-Class Functions

The defining property of fields is that they allow us to see computation from two different viewpoints. On the one hand, by the standard "local" viewpoint, computation is seen as occurring in a single device, and it hence manipulates data values (e.g., numbers) and communicates such data values with other devices to enable coordination. On the other hand, by the "aggregate" (or "global") viewpoint [21], computation is seen as occurring on the overall network of interconnected devices: the data abstraction manipulated is hence a whole distributed field, a dynamically evolving data structure having extent over a subset of the network. This latter viewpoint is very useful when reasoning about aggregates of devices, and will be used throughout this document. Put more precisely, a field value φ may be viewed as a function φ : D → L that maps each device δ in the domain D to an associated data value ℓ in the range L. Field computations then take fields as input (e.g., from sensors) and produce new fields as outputs, whose values may change over time (e.g., as inputs change or the computation progresses). For example, the input of a computation might be a field of temperatures, as perceived by sensors at each device in the network, and its output might be a Boolean field that maps to true where temperature is greater than 25 °C, and to false elsewhere.
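The temperature example above can be made concrete with a minimal sketch, under the assumption (ours, not prescribed by the calculus) that a field over a finite domain is represented as a plain dictionary from device identifiers to local values:

```python
# A field phi : D -> L as a dict; a "point-wise" operation applies a
# local function at every device, producing a new field on the same domain.

def pointwise(op, *fields):
    """Apply a local operation at every device of the (shared) domain."""
    domain = fields[0].keys()
    return {d: op(*(f[d] for f in fields)) for d in domain}

# A field of temperatures perceived by the sensor at each device ...
temperature = {0: 22.5, 1: 27.1, 2: 30.0, 3: 19.8}
# ... mapped to a Boolean field: true where temperature exceeds 25 C.
too_hot = pointwise(lambda t: t > 25, temperature)
assert too_hot == {0: False, 1: True, 2: True, 3: False}
```

This is exactly the aggregate viewpoint: the threshold computation is written once, but denotes a value at every device in the domain.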

Field Calculus. The field calculus [21] is a tiny functional calculus capturing the essential elements of field computations, much as the λ-calculus [7] captures the essence of functional computation and FJ [12] the essence of object-oriented programming. The primitive expressions of field calculus are data values denoted ℓ (Booleans, numbers, and pairs), representing constant fields holding the value ℓ everywhere, and variables x, which are either function parameters or state variables (see the rep construct below). These are composed into programs using a Lisp-like syntax with five constructs:

(1) Built-in function call (o e1 · · · en): A built-in operator o is a means to uniformly model a variety of "point-wise" operations, i.e. involving neither state nor communication. Examples include simple mathematical functions (e.g., addition, comparison, sine) and context-dependent operators whose result depends on the environment (e.g., the 0-ary operator uid returns the unique numerical identifier δ of the device, and the 0-ary nbr-range operator yields a field where each device maps to a subfield mapping its neighbours to estimates of their current distance from the device). The expression (o e1 · · · en) thus produces a field mapping each device identifier δ to the result of applying o to the values at δ of its n ≥ 0 arguments e1, . . . , en.
(2) Function call (f e1 . . . en): Abstraction and recursion are supported by function definition: functions are declared as (def f(x1 . . . xn) e) (where elements xi are formal



(a) (if x (f (sns)) (g (sns))) (b) ((if x f g) (sns))

Fig. 1. Field calculus functions are evaluated over a domain of devices. E.g., in (a) the if operation partitions the network into two subdomains, evaluating f where field x is true and g where it is false (both applied to the output of sensor sns). With first-class functions, however, domains must be constructed dynamically based on the identity of the functions stored in the field, as in (b), which implements an equivalent computation.

parameters and e is the body), and expressions of the form (f e1 . . . en) are the way of calling function f passing n arguments.
(3) Time evolution (rep x e0 e): The "repeat" construct supports dynamically evolving fields, assuming that each device computes its program repeatedly in asynchronous rounds. It initialises the state variable x to the result of the initialisation expression e0 (a value or a variable), then updates it at each step by computing e against the prior value of x. For instance, (rep x 0 (+ x 1)) is the (evolving) field counting at each device how many rounds that device has computed.
(4) Neighbourhood field construction (nbr e): Device-to-device interaction is encapsulated in nbr, which returns a field φ mapping each neighbouring device to its most recent available value of e (i.e., the information available if devices broadcast the value of e to their neighbours upon computing it). Such "neighbouring" fields can then be manipulated and summarised with built-in operators, e.g., (min-hood (nbr e)) outputs a field mapping each device to the minimum value of e amongst its neighbours.
(5) Domain restriction (if e0 e1 e2): Branching is implemented by this construct, which computes e1 in the restricted domain where e0 is true, and e2 in the restricted domain where e0 is false.
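A rough single-device sketch (our own illustration, not the formal semantics) may help with two of the constructs above: `rep` evolving state across asynchronous rounds, and `min-hood` summarising a neighbourhood field built by `nbr`:

```python
# One round of (rep x init update) at a single device: read the prior
# value of x (or the initial value on the first round), then update it.

def rep(state, init, update):
    x = state.get("x", init)
    state["x"] = update(x)
    return state["x"]

state = {}
for _ in range(3):
    count = rep(state, 0, lambda x: x + 1)  # (rep x 0 (+ x 1))
assert count == 3  # this device has computed three rounds

# (min-hood (nbr e)): given each neighbour's most recent value of e,
# the summary at this device is simply their minimum.
neighbour_values = {7: 4.2, 8: 1.5, 9: 3.0}  # neighbour id -> value of e
assert min(neighbour_values.values()) == 1.5
```

The dictionary `state` stands in for the per-device memory that persists between rounds; the neighbour map stands in for the field returned by `nbr`.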

Any field calculus computation may thus be viewed as a function f taking zero or more input fields and returning one output field, i.e., having the signature f : (D → L)^k → (D → L). Figure 1a illustrates this concept, showing an example with complementary domains on which two functions are evaluated. This aggregate-level model of computation over fields can then be "compiled" into an equivalent system of local operations and message passing actually implementing the field calculus program on a distributed system [21].

Higher-order Field Calculus. The higher-order field calculus (HFC) is an extension of the field calculus with embedded first-class functions, with the primary goal of allowing it to handle functions just like any other value, so that code can be dynamically injected, moved, and executed in network (sub)domains. If functions are "first class" in



e ::= x | v | (e e) | (rep x w e) | (nbr e) | (if e e e)    expression
v ::= ℓ | φ    value
ℓ ::= b | n | 〈ℓ,ℓ〉 | o | f | (fun (x) e)    local value
w ::= x | ℓ    variable or local value
F ::= (def f(x) e)    user-defined function
P ::= F e    program

Fig. 2. Syntax of HFC (differences from field calculus are highlighted in grey)

the language, then: (i) functions can take functions as arguments and return a function as result (higher-order functions); (ii) functions can be created "on the fly" (anonymous functions); (iii) functions can be moved between devices (via the nbr construct); and (iv) the function one executes can change over time (via the rep construct).

The syntax of the calculus is reported in Fig. 2. Values in the calculus include fields φ, which are produced at run-time and may not occur in source programs; also, local values may be smoothly extended by adding other ground values (e.g., characters) and structured values (e.g., lists). Borrowing syntax from [12], the overbar notation denotes metavariables over sequences, and the empty sequence is denoted by •. E.g., for expressions, we let e range over sequences of expressions, written e1, e2, . . . , en (n ≥ 0). The differences from the field calculus are as follows: function application expressions (e e) can take an arbitrary expression e instead of just an operator o or a user-defined function name f; anonymous functions can be defined (by the syntax (fun (x) e)); and built-in operators, user-defined function names, and anonymous functions are values. This implies that the range of a field can be a function as well. To apply the functions mapped to by such a field, we have to be able to transform the field back into a single aggregate-level function. Figure 1b illustrates this issue, with a simple example of a function call expression applied to a function-valued field with two different values.

How can we evaluate a function call with such a heterogeneous field of functions? It would seem excessive to run a separate copy of function f for every device that has f as its value in the field. At the opposite extreme, running f over the whole domain is problematic for implementation, because it would require devices that may not have a copy of f to help in evaluating f. Instead, we will take a more elegant approach, in which making a function call acts as a branch, with each function in the range applied only on the subspace of devices that hold that function. Formally, this may be expressed as transforming a function-valued field φ into a function fφ that is defined as:

fφ(ψ1, ψ2, . . .) = ⋃_{f ∈ φ(D)} f(ψ1|φ−1(f), ψ2|φ−1(f), . . .)    (1)

where the ψi are the input fields, φ(D) is the set of all functions held as data values by some devices in the domain D of φ, and ψi|φ−1(f) is the restriction of ψi to the subspace of only those devices that φ maps to function f. In fact, when the field of functions is constant, this reduces to be precisely equivalent to a standard function call. This means that we can view ordinary evaluation of a function f as equivalent to creating a function-valued field with the constant value f, then making a function call applying that field to its argument fields. This elegant transformation is the key insight of this paper, enabling first-class functions to be implemented with a minimal change to the existing semantics while also ensuring compatibility with the prior semantics, thus inheriting its previously established desirable properties.
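Under the same dictionary-based representation of fields as before (an assumption of ours, not the paper's formalism), equation (1) can be sketched directly: each function held somewhere in the field runs only on the subdomain of devices that hold it, and the disjoint results are unioned.

```python
# A sketch of equation (1): calling a function-valued field phi acts as
# a branch. Fields are dicts from device identifiers to values.

def call_field(phi, *inputs):
    """Apply the function-valued field `phi` to the input fields."""
    result = {}
    for f in set(phi.values()):                 # each f in phi(D)
        subdomain = [d for d, g in phi.items() if g is f]
        # restrict every input field psi_i to phi^{-1}(f)
        restricted = [{d: psi[d] for d in subdomain} for psi in inputs]
        # f runs only on its own subdomain; subdomains are disjoint,
        # so the union of the partial results is well defined.
        result.update(f(*restricted))
    return result

double = lambda field: {d: 2 * v for d, v in field.items()}
negate = lambda field: {d: -v for d, v in field.items()}

phi = {0: double, 1: double, 2: negate}         # heterogeneous function field
values = {0: 10, 1: 20, 2: 30}
assert call_field(phi, values) == {0: 20, 1: 40, 2: -30}
```

When `phi` is constant (all devices hold the same function), `call_field` degenerates to an ordinary function call over the whole domain, matching the observation in the text.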

3 The Higher-order Field Calculus: Dynamic and Static Semantics

Dynamic Semantics (Big-Step Operational Semantics). As for the field calculus [21], devices undergo computation in rounds. In each round, a device sleeps for some time, wakes up, gathers information about messages received from neighbours while sleeping, performs an evaluation of the program, and finally emits a message to all neighbours with information about the outcome of the computation before going back to sleep. The scheduling of such rounds across the network is fair and non-synchronous. This section presents a formal semantics of device computation, which is intended to serve as a specification for any HFC-like programming language implementation.

The syntax of the HFC calculus has been introduced in Section 2 (Fig. 2). In the following, we let the meta-variable δ range over the denumerable set D of device identifiers (which are numbers). To simplify the notation, we shall assume a fixed program P. We say that "device δ fires" to mean that the main expression of P is evaluated on δ.

We model device computation by a big-step operational semantics where the result of evaluation is a value-tree θ, which is an ordered tree of values, tracking the result of any evaluated subexpression. Intuitively, the evaluation of an expression at a given time in a device δ is performed against the recently-received value-trees of neighbours; namely, its outcome depends on those value-trees. The result is a new value-tree that is conversely made available to δ's neighbours (through a broadcast) for their firing; this includes δ itself, so as to support a form of state across computation rounds (note that any implementation might massively compress the value-tree, storing only enough information for expressions to be aligned). A value-tree environment Θ is a map from device identifiers to value-trees, collecting the outcome of the last evaluation on the neighbours. This is written δ ↦ θ as short for δ1 ↦ θ1, . . . , δn ↦ θn.

The syntax of field values, value-trees and value-tree environments is given in Fig. 3 (top). Figure 3 (middle) defines: the auxiliary functions ρ and π for extracting the root value and a subtree of a value-tree, respectively (further explanations about the function π will be given later); the extension of the functions ρ and π to value-tree environments; and the auxiliary functions args and body for extracting the formal parameters and the body of a (user-defined or anonymous) function, respectively. The computation that takes place on a single device is formalised by the big-step operational semantics rules given in Fig. 3 (bottom). The derived judgements are of the form δ;Θ ⊢ e ⇓ θ, to be read "expression e evaluates to value-tree θ on device δ with respect to the value-tree environment Θ", where: (i) δ is the identifier of the current device; (ii) Θ is the field of the value-trees produced by the most recent evaluation of (an expression corresponding to) e on δ's neighbours; (iii) e is a run-time expression (i.e., an expression that may contain field values); (iv) the value-tree θ represents the values computed for all the expressions encountered during the evaluation of e—in particular ρ(θ) is the resulting value of expression e. The first firing of a device δ after activation or reset is performed with respect to the empty tree environment, while any other firing must consider the outcome of the most recent firing of δ (i.e., whenever Θ is not empty, it includes the



Field values, value-trees, and value-tree environments:
φ ::= δ ↦ ℓ    field value
θ ::= v(θ)    value-tree
Θ ::= δ ↦ θ    value-tree environment

Auxiliary functions:
ρ(v(θ)) = v
πi(v(θ1, . . . , θn)) = θi  if 1 ≤ i ≤ n        πℓ,n(v(θ1, . . . , θn+2)) = θn+2  if ρ(θn+1) = ℓ
πi(θ) = •  otherwise                            πℓ,n(θ) = •  otherwise

For aux ∈ {ρ, πi, πℓ,n}:
aux(δ ↦ θ) = δ ↦ aux(θ)  if aux(θ) ≠ •
aux(δ ↦ θ) = •  if aux(θ) = •
aux(Θ, Θ′) = aux(Θ), aux(Θ′)

args(f) = x  if (def f(x) e)        body(f) = e  if (def f(x) e)
args((fun (x) e)) = x               body((fun (x) e)) = e

Rules for expression evaluation:   δ;Θ ⊢ e ⇓ θ

[E-LOC]
  --------------
  δ;Θ ⊢ ℓ ⇓ ℓ()

[E-FLD]
  φ′ = φ|dom(Θ)∪{δ}
  ------------------
  δ;Θ ⊢ φ ⇓ φ′()

[E-B-APP]
  δ;πn+1(Θ) ⊢ en+1 ⇓ θn+1    ρ(θn+1) = o
  δ;π1(Θ) ⊢ e1 ⇓ θ1  · · ·  δ;πn(Θ) ⊢ en ⇓ θn    v = ε^o_{δ;Θ}(ρ(θ1), . . . , ρ(θn))
  --------------------------------------------------------------------------------
  δ;Θ ⊢ (en+1 e1 · · · en) ⇓ v(θ1, . . . , θn+1)

[E-D-APP]
  δ;πn+1(Θ) ⊢ en+1 ⇓ θn+1    ρ(θn+1) = ℓ    args(ℓ) = x1, . . . , xn
  δ;π1(Θ) ⊢ e1 ⇓ θ1  · · ·  δ;πn(Θ) ⊢ en ⇓ θn    body(ℓ) = e
  δ;πℓ,n(Θ) ⊢ e[x1 := ρ(θ1) . . . xn := ρ(θn)] ⇓ θn+2    v = ρ(θn+2)
  --------------------------------------------------------------------------------
  δ;Θ ⊢ (en+1 e1 · · · en) ⇓ v(θ1, . . . , θn+2)

[E-REP]
  ℓ0 = ρ(Θ(δ)) if Θ ≠ ∅, otherwise ℓ
  δ;π1(Θ) ⊢ e[x := ℓ0] ⇓ θ1    ℓ1 = ρ(θ1)
  -----------------------------------------
  δ;Θ ⊢ (rep x ℓ e) ⇓ ℓ1(θ1)

[E-NBR]
  Θ1 = π1(Θ)    δ;Θ1 ⊢ e ⇓ θ1    φ = ρ(Θ1)[δ ↦ ρ(θ1)]
  -----------------------------------------------------
  δ;Θ ⊢ (nbr e) ⇓ φ(θ1)

[E-THEN]
  δ;π1(Θ) ⊢ e ⇓ θ1    ρ(θ1) = true    δ;πtrue,0(Θ) ⊢ e′ ⇓ θ2    ℓ = ρ(θ2)
  -------------------------------------------------------------------------
  δ;Θ ⊢ (if e e′ e′′) ⇓ ℓ(θ1, θ2)

[E-ELSE]
  δ;π1(Θ) ⊢ e ⇓ θ1    ρ(θ1) = false    δ;πfalse,0(Θ) ⊢ e′′ ⇓ θ2    ℓ = ρ(θ2)
  ---------------------------------------------------------------------------
  δ;Θ ⊢ (if e e′ e′′) ⇓ ℓ(θ1, θ2)

Fig. 3. Big-step operational semantics for expression evaluation

value of the most recent evaluation of e on δ)—this is needed to support the stateful semantics of the rep construct.

The operational semantics rules are based on rather standard rules for functional languages, extended so as to be able to evaluate a subexpression e′ of e with respect to the value-tree environment Θ′ obtained from Θ by extracting the corresponding subtree (when present) in the value-trees in the range of Θ. This process, called alignment, is modelled by the auxiliary function π, defined in Fig. 3 (middle). The function π has two different behaviours (specified by its subscript or superscript): πi(θ) extracts the i-th subtree of θ, if it is present; and πℓ,n(θ) extracts the (n+2)-th subtree of θ, if it is present and the root of the (n+1)-th subtree of θ is equal to the local value ℓ.
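The value-tree bookkeeping described above can be sketched concretely, under our own assumption that a tree is represented as a (root value, list of subtrees) pair:

```python
# Value-trees as (root, subtrees) pairs, with the extraction
# functions rho and pi_i described in the text.

def rho(tree):
    """rho extracts the root value of a value-tree."""
    root, _subtrees = tree
    return root

def pi(tree, i):
    """pi_i extracts the i-th subtree (1-based), or None if absent."""
    _root, subtrees = tree
    return subtrees[i - 1] if 1 <= i <= len(subtrees) else None

# The value-tree for (+ 1 2): root 3, with subtrees for 1, 2 and +.
theta = (3, [(1, []), (2, []), ("+", [])])
assert rho(theta) == 3
assert rho(pi(theta, 1)) == 1
assert pi(theta, 4) is None
```

Returning `None` when the subtree is absent plays the role of • in the formal definition; alignment then simply evaluates a subexpression against the subtrees extracted from each neighbour's tree.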



Rules [E-LOC] and [E-FLD] model the evaluation of expressions that are either a local value or a field value, respectively. For instance, evaluating the expression 1 produces (by rule [E-LOC]) the value-tree 1(), while evaluating the expression + produces the value-tree +(). Note that, in order to ensure that domain restriction is obeyed (cf. Section 2), rule [E-FLD] restricts the domain of the field value φ to the domain of Θ augmented by δ.

Rule [E-B-APP] models the application of built-in functions. It is used to evaluate expressions of the form (e_{n+1} e₁ ··· eₙ) such that the evaluation of e_{n+1} produces a value-tree θ_{n+1} whose root ρ(θ_{n+1}) is a built-in function o. It produces the value-tree v(θ₁, …, θₙ, θ_{n+1}), where θ₁, …, θₙ are the value-trees produced by the evaluation of the actual parameters e₁, …, eₙ (n ≥ 0) and v is the value returned by the function. Rule [E-B-APP] exploits the special auxiliary function ε, whose actual definition is abstracted away. This is such that ε^o_{δ;Θ}(v̄) computes the result of applying built-in function o to values v̄ in the current environment of the device δ. In particular, we assume that the built-in 0-ary function uid gets evaluated to the current device identifier (i.e., ε^{uid}_{δ;Θ}() = δ), and that mathematical operators have their standard meaning, which is independent from δ and Θ (e.g., ε^{+}_{δ;Θ}(1, 2) = 3). The ε function also encapsulates measurement variables such as nbr-range and interactions with the external world via sensors and actuators. In order to ensure that domain restriction is obeyed, for each built-in function o we assume that: ε^o_{δ;Θ}(v₁, ···, vₙ) is defined only if all the field values in v₁, …, vₙ have domain dom(Θ) ∪ {δ}; and if ε^o_{δ;Θ}(v₁, ···, vₙ) returns a field value φ, then dom(φ) = dom(Θ) ∪ {δ}. For instance, evaluating the expression (+ 1 2) produces the value-tree 3(1(), 2(), +()). The value of the whole expression, 3, has been computed by using rule [E-B-APP] to evaluate the application of the sum operator + (the root of the third subtree of the value-tree) to the values 1 (the root of the first subtree of the value-tree) and 2 (the root of the second subtree of the value-tree). In the following, for the sake of readability, we sometimes write the value v as short for the value-tree v(). Following this convention, the value-tree 3(1(), 2(), +()) is shortened to 3(1, 2, +).
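The construction of 3(1(), 2(), +()) under rule [E-B-APP] can be sketched with a toy Python evaluator (a hypothetical encoding: applications are tuples written operator-first, and only the built-in + is provided):

```python
# Value-trees as (root, [subtrees]); locals evaluate to leaves by [E-LOC].
def eval_expr(expr):
    if not isinstance(expr, tuple):          # a local value or operator name
        return (expr, [])                    # [E-LOC]: v evaluates to v()
    op_expr, *args = expr                    # application (e_{n+1} e_1 ... e_n),
    arg_trees = [eval_expr(a) for a in args] # encoded here operator-first
    op_tree = eval_expr(op_expr)
    builtins = {'+': lambda a, b: a + b}     # epsilon for the + operator
    v = builtins[op_tree[0]](*[t[0] for t in arg_trees])
    return (v, arg_trees + [op_tree])        # [E-B-APP]: v(theta_1 .. theta_{n+1})

print(eval_expr(('+', 1, 2)))  # -> (3, [(1, []), (2, []), ('+', [])])
```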

Rule [E-D-APP] models the application of user-defined or anonymous functions, i.e., it is used to evaluate expressions of the form (e_{n+1} e₁ ··· eₙ) such that the evaluation of e_{n+1} produces a value-tree θ_{n+1} whose root ℓ = ρ(θ_{n+1}) is a user-defined function name or an anonymous function. It is similar to rule [E-B-APP]; however, it produces a value-tree which has one more subtree, θ_{n+2}, which is produced by evaluating the body of the function ℓ with respect to the value-tree environment π_{ℓ,n}(Θ) containing only the value-trees associated to the evaluation of the body of the same function ℓ.

Code Mobility Meets Self-Organisation 121

To illustrate rule [E-REP] (rep construct), as well as computational rounds, we consider program (rep x 0 (+ x 1)) (cf. Section 2). The first firing of a device δ after activation or reset is performed against the empty tree environment. Therefore, according to rule [E-REP], evaluating (rep x 0 (+ x 1)) means evaluating the subexpression (+ 0 1), obtained from (+ x 1) by replacing x with 0. This produces the value-tree θ₁ = 1(1(0, 1, +)), where the root 1 is the overall result as usual, while its subtree is the result of evaluating the third argument. Any subsequent firing of the device δ is performed with respect to a tree environment Θ that associates to δ the outcome of the most recent firing of δ. Therefore, evaluating (rep x 0 (+ x 1)) at the second firing means evaluating the subexpression (+ 1 1), obtained from (+ x 1) by replacing x with 1, which is the root of θ₁. Hence the results of computation are 1, 2, 3, and so on.
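The round-by-round behaviour of rep can be sketched as follows (Python; a deliberately simplified encoding in which the stored value-tree plays the role of Θ(δ), and the body's own subtrees are elided):

```python
def fire_rep(init, body, last_tree=None):
    """One firing of (rep x init body): x is seeded from the previous round's
    root, or from init on the first firing (empty tree environment)."""
    x = init if last_tree is None else last_tree[0]
    v = body(x)                    # evaluate the body with x := l_0
    return (v, [(v, [])])          # simplified value-tree l_1(theta_1)

tree, results = None, []
for _round in range(4):            # four firings of (rep x 0 (+ x 1))
    tree = fire_rep(0, lambda x: x + 1, tree)
    results.append(tree[0])
print(results)  # -> [1, 2, 3, 4]
```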

Value-trees also support modelling information exchange through the nbr construct, as of rule [E-NBR]. Consider the program e′ = (min-hood (nbr (sns-num))), where the 1-ary built-in function min-hood returns the minimum value in the range of its field argument, and the 0-ary built-in function sns-num returns the numeric value measured by a sensor. Suppose that the program runs on a network of three fully connected devices δA, δB, and δC where sns-num returns 1 on δA, 2 on δB, and 3 on δC. Considering an initial empty tree-environment ∅ on all devices, we have the following: the evaluation of (sns-num) on δA yields 1(sns-num) (by rules [E-LOC] and [E-B-APP], since ε^{sns-num}_{δA;∅}() = 1); the evaluation of (nbr (sns-num)) on δA yields (δA ↦ 1)(1(sns-num)) (by rule [E-NBR]); and the evaluation of e′ on δA yields

θA = 1((δA ↦ 1)(1(sns-num)), min-hood)

(by rule [E-B-APP], since ε^{min-hood}_{δA;∅}((δA ↦ 1)) = 1). Therefore, after its first firing, device δA produces the value-tree θA. Similarly, after their first firing, devices δB and δC produce the value-trees

θB = 2((δB ↦ 2)(2(sns-num)), min-hood)
θC = 3((δC ↦ 3)(3(sns-num)), min-hood)

respectively. Suppose that device δB is the first device that fires a second time. Then the evaluation of e′ on δB is now performed with respect to the value-tree environment ΘB = (δA ↦ θA, δB ↦ θB, δC ↦ θC) and the evaluation of its subexpressions (nbr (sns-num)) and (sns-num) is performed, respectively, with respect to the following value-tree environments obtained from ΘB by alignment:

Θ′B = π₁(ΘB) = (δA ↦ (δA ↦ 1)(1(sns-num)), δB ↦ ···, δC ↦ ···)
Θ′′B = π₁(Θ′B) = (δA ↦ 1(sns-num), δB ↦ 2(sns-num), δC ↦ 3(sns-num))

We have that ε^{sns-num}_{δB;Θ′′B}() = 2; the evaluation of (nbr (sns-num)) on δB with respect to Θ′B yields φ(2(sns-num)) where φ = (δA ↦ 1, δB ↦ 2, δC ↦ 3); and ε^{min-hood}_{δB;ΘB}(φ) = 1. Therefore the evaluation of e′ on δB produces the value-tree 1(φ(2(sns-num)), min-hood). Namely, the computation at device δB after the first round yields 1, which is the minimum of sns-num across neighbours, and similarly for δA and δC.
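This three-device scenario can be replayed with a small Python sketch (an assumed encoding in which a field is a dict from device identifiers to values, and each firing exports the root of its (sns-num) subtree for neighbours to align with):

```python
sns_num = {'A': 1, 'B': 2, 'C': 3}
exported = {}   # root of (sns-num) from each device's most recent firing

def fire(dev, neighbours):
    """One firing of (min-hood (nbr (sns-num))) on device dev."""
    # (nbr (sns-num)): a field over dev plus aligned neighbours, as in [E-NBR]
    phi = {d: exported[d] for d in neighbours if d in exported}
    phi[dev] = sns_num[dev]
    exported[dev] = phi[dev]
    return min(phi.values())       # min-hood: least value in the field's range

# First firings happen against empty tree environments:
assert fire('A', []) == 1 and fire('B', []) == 2 and fire('C', []) == 3
# delta_B fires a second time, now aligned with both neighbours:
assert fire('B', ['A', 'C']) == 1
```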

We now present an example illustrating first-class functions. Consider the program ((pick-hood (nbr (sns-fun)))), where the 1-ary built-in function pick-hood returns at random a value in the range of its field argument, and the 0-ary built-in function sns-fun returns a 0-ary function returning a value of type num. Suppose that the program runs again on a network of three fully connected devices δA, δB, and δC where sns-fun returns ℓ₀ = (fun () 0) on δA and δB, and returns ℓ₁ = (fun () e′) on δC, where e′ = (min-hood (nbr (sns-num))) is the program illustrated in the previous example. Assume that sns-num returns 1 on δA, 2 on δB, and 3 on δC. Then after its first firing, device δA produces the value-tree

θ′A = 0(ℓ₀((δA ↦ ℓ₀)(ℓ₀(sns-fun)), pick-hood), 0)


where the root of the first subtree of θ′A is the anonymous function ℓ₀ (defined above), and the second subtree of θ′A, 0, has been produced by the evaluation of the body 0 of ℓ₀. After their first firing, devices δB and δC produce the value-trees

θ′B = 0(ℓ₀((δB ↦ ℓ₀)(ℓ₀(sns-fun)), pick-hood), 0)
θ′C = 3(ℓ₁((δC ↦ ℓ₁)(ℓ₁(sns-fun)), pick-hood), θC)

respectively, where θC is the value-tree for e′ given in the previous example. Suppose that device δA is the first device that fires a second time. The computation is performed with respect to the value-tree environment Θ′A = (δA ↦ θ′A, δB ↦ θ′B, δC ↦ θ′C) and produces the value-tree 1(ℓ₁(φ′(ℓ₁(sns-fun)), pick-hood), θ′′A), where

φ′ = (δA ↦ ℓ₁, δC ↦ ℓ₁)    and    θ′′A = 1((δA ↦ 1, δC ↦ 3)(1(sns-num)), min-hood),

since, according to rule [E-D-APP], the evaluation of the body e′ of ℓ₁ (which produces the value-tree θ′′A) is performed with respect to the value-tree environment π_{ℓ₁,0}(Θ′A) = (δC ↦ θC). Namely, device δA executed the anonymous function ℓ₁ received from δC, and this was able to correctly align with the execution of ℓ₁ at δC, gathering the values perceived by sns-num: 1 at δA and 3 at δC.
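The effect of aligning on the function's identity, as in this example, can be sketched in Python (a hypothetical encoding: the tags 'l0' and 'l1' stand for the two anonymous functions, and exports records which function each device ran, with its shared nbr value, in its last round):

```python
# Alignment for first-class functions: a device shares nbr values only with
# neighbours whose last round executed the same function tag.
def run(dev, fun, exports, sns_num):
    """exports: device -> (fun_tag, shared_value) from each device's last round."""
    if fun == 'l0':                         # l0 = (fun () 0)
        value = 0
    else:                                   # l1 = (fun () (min-hood (nbr (sns-num))))
        aligned = {d: v for d, (tag, v) in exports.items() if tag == 'l1'}
        aligned[dev] = sns_num[dev]
        value = min(aligned.values())
    return value, (fun, sns_num[dev])

sns_num = {'A': 1, 'B': 2, 'C': 3}
exports = {'C': ('l1', 3), 'B': ('l0', 2)}  # previous round: C ran l1, B ran l0
value, _export = run('A', 'l1', exports, sns_num)
assert value == 1                           # min over {A: 1, C: 3}; B is not aligned
```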

Static Semantics (Type-Inference System). We have developed a variant of the Hindley-Milner type system [9] for the HFC calculus. This type system has two kinds of types, local types (the types for local values) and field types (the types for field values), and is aimed at guaranteeing the following two properties:

Type Preservation. If a well-typed expression e has type T and e evaluates to a value-tree θ, then ρ(θ) also has type T.

Domain Alignment. The domain of every field value arising during the evaluation of a well-typed expression on a device δ consists of δ and of the aligned neighbours.

Alignment is key to guaranteeing that the semantics correctly relates the behaviour of if, nbr, rep and function application, namely, that two fields with different domains are never allowed to be combined. Besides performing standard checks (i.e., in a function application expression (e_{n+1} e₁ ··· eₙ) the arguments e₁, …, eₙ have the expected type; in an if-expression (if e₀ e₁ e₂) the condition e₀ has type bool and the branches e₁ and e₂ have the same type; etc.), the type system performs additional checks in order to ensure domain alignment. In particular, the type rules check that:

– In an anonymous function (fun (x) e) the free variables y of e that are not in x have local type. This prevents a device δ from creating a closure e′ = (fun (x) e)[y := φ] containing field values φ (whose domain is by construction equal to the subset of the aligned neighbours of δ). The closure e′ may lead to a domain alignment error since it may be shifted (via the nbr construct) to another device δ′ that may use it (i.e., apply e′ to some arguments); and the evaluation of the body of e′ may involve the use of a field value φ in φ such that the set of aligned neighbours of δ′ is different from the domain of φ.

– In a rep-expression (rep x w e) it holds that x, w and e have (the same) local type. This prevents a device δ from storing in x a field value φ that may be reused in the next computation round of δ, when the set of aligned neighbours may be different from the domain of φ.


– In a nbr-expression (nbr e) the expression e has local type. This prevents the attempt to create a “field of fields” (i.e., a field that maps device identifiers to field values), which is pragmatically often overly costly to maintain and communicate.

– In an if-expression (if e₀ e₁ e₂) the branches e₁ and e₂ have (the same) local type. This prevents the if-expression from evaluating to a field value whose domain is different from the subset of the aligned neighbours of δ.

4 A Pervasive Computing Example

We now illustrate the application of first-class functions using a pervasive computing example. In this scenario, people wandering a large environment (like an outdoor festival, an airport, or a museum) each carry a personal device with short-range point-to-point ad-hoc capabilities (e.g., a smartphone sending messages to others nearby via Bluetooth or Wi-Fi). All devices run a minimal “virtual machine” that allows runtime injection of new programs: any device can initiate a new distributed process (in the form of a 0-ary anonymous function), which the virtual machine spreads to all other devices within a specified range (e.g., 30 meters). For example, a person might inject a process that estimates crowd density by counting the number of nearby devices, or a process that helps people to rendezvous with their friends, with such processes likely implemented via various self-organisation mechanisms. The virtual machine then executes these using the first-class function semantics above, providing predictable deployment and execution of an open class of runtime-determined processes.

Virtual Machine Implementation. The complete code for our example is listed in Figure 4, with syntax coloring to increase readability: grey for comments, red for field calculus keywords, blue for user-defined functions, and green for built-in operators. In this code, we use the following naming conventions for built-ins: functions sns-* embed sensors that return a value perceived from the environment (e.g., sns-injection-point returns a Boolean indicating whether a device's user wants to inject a function); functions *-hood yield a local value ℓ obtained by aggregating over the field value φ in input (e.g., sum-hood sums all values in each neighbourhood); functions *-hood+ behave the same but exclude the value associated with the current device; and built-in functions pair, fst, and snd respectively create a pair of locals and access a pair's first and second component. Additionally, given a built-in o that takes n ≥ 1 locals and returns a local, the built-ins o[*,...,*] are variants of o where one or more inputs are fields (as indicated in the bracket, l for local or f for field), and the return value is a field, obtained by applying operator o in a point-wise manner. For instance, as = compares two locals returning a Boolean, =[f,f] is the operator taking two field inputs and returning a Boolean field where each element is the comparison of the corresponding elements in the inputs; similarly, =[f,l] takes a field and a local and returns a Boolean field where each element is the comparison of the corresponding element of the field in input with the local.
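The point-wise lifting behind the o[*,...,*] built-ins can be sketched as follows (Python; fields encoded as dicts over a shared domain, an assumption of this sketch):

```python
import operator

def lift(op, *args):
    """Apply op point-wise: dict arguments are fields, other arguments are
    locals. All field arguments are assumed to share the same domain."""
    domain = next(a for a in args if isinstance(a, dict)).keys()
    return {d: op(*[a[d] if isinstance(a, dict) else a for a in args])
            for d in domain}

f1 = {'A': 1, 'B': 2}
f2 = {'A': 1, 'B': 3}
print(lift(operator.eq, f1, f2))   # =[f,f] -> {'A': True, 'B': False}
print(lift(operator.eq, f1, 2))    # =[f,l] -> {'A': False, 'B': True}
```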

The first two functions in Figure 4 implement frequently used self-organisation mechanisms. Function distance-to, also known as gradient [8,14], computes a field of minimal distances from each device to the nearest “source” device (those mapping to


;; Computes a field of minimum distance from 'source' devices
(def distance-to (source) ;; has type: (bool) → num
  (rep d infinity
    (mux source 0
      (min-hood+ (+[f,f] (nbr d) (nbr-range))))))

;; Computes a field of pairs of distance to nearest 'source' device,
;; and the most recent value of 'v' there
(def gradcast (source v) ;; has type: ∀ β . (bool, β) → β
  (snd ((fun (x)
          (rep t x
            (mux source (pair 0 v)
              (min-hood+ (pair[f,f] (+[f,f] (nbr-range) (nbr (fst t)))
                                    (nbr (snd t)))))))
        (pair infinity v))))

;; Evaluate a function field, running 'g' from 'source' within 'range' meters,
;; and 'no-op' elsewhere
(def deploy (range source g no-op) ;; has type: ∀ β . (num, bool, () → β, () → β) → β
  ((if (< (distance-to source) range) (gradcast source g) no-op)))

;; The entry-point function executed to run the virtual machine on each device
(def virtual-machine () ;; has type: () → num
  (deploy (sns-range) (sns-injection-point) (sns-injected-fun) (fun () 0)))

;; Sums values of 'summand' into a minimum of 'potential', by descent
(def converge-sum (potential summand) ;; has type: (num, num) → num
  (rep v summand
    (+ summand
       (sum-hood+ (mux[f,f,l] (=[f,l] (nbr (parent potential)) (uid))
                              (nbr v)
                              0)))))

;; Maps each device to the uid of the neighbour with minimum value of 'potential'
(def parent (potential) ;; has type: (num) → num
  (snd (min-hood (pair[l,f] potential
                            (mux[f,f,l] (<[f,l] (nbr potential) potential)
                                        (nbr (uid))
                                        NaN)))))

;; Simple low-pass filter for smoothing noisy signal 'value' with rate constant 'alpha'
(def low-pass (alpha value) ;; has type: (num, num) → num
  (rep filtered value
    (+ (* value alpha) (* filtered (- 1 alpha)))))

Fig. 4. Virtual machine code (top) and application-specific code (bottom)

true in the Boolean input field). This is computed by repeated application of the triangle inequality (via rep): at every round, source devices take distance zero, while all others update their distance estimates d to the minimum distance estimate through their neighbours (min-hood+ of each neighbour's distance estimate (nbr d) plus the distance to that neighbour nbr-range); source and non-source devices are discriminated by mux, a built-in “multiplexer” that operates like an if but, differently from it, always evaluates both branches on every device. Repeated application of this update procedure self-stabilises into the desired field of distances, regardless of any transient perturbations or faults [13]. The second self-organisation mechanism, gradcast, is a directed broadcast, achieved by a computation identical to that of distance-to, except that the values are pairs (note that pair[f,f] produces a field of pairs, not a pair of fields), with the second element set to the value of v at the source: min-hood operates on pairs by applying lexicographic ordering, so the second value of the pair is automatically carried along shortest paths from the source. The result is a field of pairs of distance and most recent value of v at the nearest source, of which only the value is returned.
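The triangle-inequality iteration of distance-to can be sketched as a synchronous Python simulation (an idealisation: actual field-calculus rounds are asynchronous and per-device, and the topology below is an assumption of the sketch):

```python
INF = float('inf')
# Undirected line network A - B - C, with nbr-range 1.0 on each link.
nbrs = {'A': {'B': 1.0}, 'B': {'A': 1.0, 'C': 1.0}, 'C': {'B': 1.0}}
source = {'A'}

dist = {d: INF for d in nbrs}            # rep d infinity ...
for _round in range(3):                  # synchronous rounds until stable
    # mux(source, 0, min-hood+ of neighbour estimate plus link distance)
    dist = {d: 0.0 if d in source
            else min((dist[n] + w for n, w in nbrs[d].items()), default=INF)
            for d in nbrs}
print(dist)  # -> {'A': 0.0, 'B': 1.0, 'C': 2.0}
```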


The latter two functions in Figure 4 use these self-organisation methods to implement our simple virtual machine. Code mobility is implemented by function deploy, which spreads a 0-ary function g via gradcast, keeping it bounded within distance range from sources, and holding 0-ary function no-op elsewhere. The corresponding field of functions is then executed (note the double parenthesis). The virtual-machine then simply calls deploy, linking its arguments to sensors configuring deployment range and detecting who wants to inject which functions (and using (fun () 0) as the no-op function).

In essence, this virtual machine implements a code-injection model much like those used in a number of other pervasive computing approaches (e.g., [15,11,6]), though of course it has much more limited features, since it is only an illustrative example. With these previous approaches, however, code shares lexical scope and cannot have its network domain externally controlled. Thus, injected code may spread through the network unpredictably and may interact unpredictably with other injected code that it encounters. The extended field calculus semantics that we have presented, however, ensures that injected code moves only within the range specified to the virtual machine and remains lexically isolated from different injected code, so that no variable can be unexpectedly affected by interactions with neighbours.

Simulated Example Application. We further illustrate the application of first-class functions with an example in a simulated scenario. Consider a museum, whose docents monitor their efficacy in part by tracking the number of patrons nearby while they are working. To monitor the number of nearby patrons, each docent's device injects the following anonymous function (of type: () → num):

(fun () (low-pass 0.5 (converge-sum (distance-to (sns-injection-point))
                                    (sns-patron))))

This counts patrons using the function converge-sum defined in Figure 4 (bottom), a simple version of another standard self-organisation mechanism [4] which operates like an inverse broadcast, summing the values sensed by sns-patron (1 for a patron, 0 for a docent) down the distance gradient back to its source, in this case the docent at the injection point. In particular, each device's local value is summed with those of the devices identifying it as their parent (their closest neighbour to the source, breaking ties with device unique identifiers from built-in function uid), resulting in a relatively balanced spanning tree of summations with the source at its root. This very simple version of summation is somewhat noisy on a moving network of devices, so its output is passed through a simple low-pass filter, the function low-pass, also defined in Figure 4 (bottom), in order to smooth its output and improve the quality of the estimate.
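The smoothing performed by low-pass is ordinary exponential filtering, filtered = alpha·value + (1 − alpha)·filtered; a Python sketch of its round-by-round behaviour:

```python
def low_pass(alpha, values):
    """rep filtered value (+ (* value alpha) (* filtered (- 1 alpha))),
    unrolled over a sequence of rounds; state persists as in rep."""
    filtered, out = None, []
    for value in values:
        # first round: rep initialises the state to the current value
        filtered = value if filtered is None else alpha * value + (1 - alpha) * filtered
        out.append(filtered)
    return out

print(low_pass(0.5, [0, 4, 4, 4]))  # -> [0, 2.0, 3.0, 3.5]
```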

Figure 5a shows a simulation of a docent and 250 patrons in a large 100x30 meter museum gallery. Of the patrons, 100 are a large group of school-children moving together past the stationary docent from one side of the gallery to the other, while the rest are wandering randomly. In this simulation, people move at an average 1 m/s; the docent and all patrons carry personal devices running the virtual machine, executing asynchronously at 10 Hz, and communicating via low-power Bluetooth to a range of 10 meters. The simulation was implemented using the ALCHEMIST [18] simulation framework and the Protelis [17] incarnation of field calculus, updated to the extended version of the calculus presented in this paper.

[Figure 5: (a) simulation snapshots; (b) plot of devices within 25 m from sensor (y-axis, 0–200) against time in seconds (x-axis, 0–80)]

Fig. 5. (a) Two snapshots of museum simulation: patrons (grey) are counted (black) within 25 meters of the docent (green). (b) Estimated number of nearby patrons (grey) vs. actual number (black) in the simulation.

In this simulation, at time 10 seconds, the docent injects the patron-counting function with a range of 25 meters, and at time 70 seconds removes it. Figure 5a shows two snapshots of the simulation, at times 11 (top) and 35 (bottom) seconds, while Figure 5b compares the estimated value returned by the injected process with the true value. Note that upon injection, the process rapidly disseminates and begins producing good estimates of the number of nearby patrons, then cleanly terminates upon removal.

5 Conclusion, Related and Future Work

Conceiving emerging distributed systems in terms of computations involving aggregates of devices, and hence adopting higher-level abstractions for system development, is a thread that has recently received a good deal of attention. A wide range of aggregate programming approaches have been proposed, including Proto [2], TOTA [15], the (bio)chemical tuple-space model [19], Regiment [16], the στ-Linda model [22], Paintable Computing [6], and many others included in the extensive survey of aggregate programming languages given in [3]. Those that best support self-organisation approaches to robust and environment-independent computations have generally lacked well-engineered mechanisms to support openness and code mobility (injection, update, etc.). Our contribution has been to develop a core calculus, building on the work presented in [21], that smoothly combines for the first time self-organisation and code mobility, by means of the abstraction of “distributed function field”. This combination of first-class functions with the domain-restriction mechanisms of field calculus allows the predictable and safe composition of distributed self-organisation mechanisms at runtime, thereby enabling robust operation of open pervasive systems. Furthermore, the simplicity of the calculus enables it to easily serve as both an analytical framework and a programming framework, and we have already incorporated this into Protelis [17], thereby allowing these mechanisms to be deployed both in simulation and in actual distributed systems.

Future plans include consolidation of this work, by extending the calculus and its conceptual framework, to support an analytical methodology and a practical toolchain for system development, as outlined in [4]. First, we aim to apply our approach to support various application needs for dynamic management of distributed processes [1], which may also impact the methods of alignment for anonymous functions. Second, we plan to isolate fragments of the calculus that satisfy behavioural properties such as self-stabilisation, quasi-stabilisation to a dynamically evolving field, or density independence, following the approach of [20]. Finally, these foundations can be applied in developing APIs enabling the simple construction of complex distributed applications, building on the work in [4] to define a layered library of self-organisation patterns, and applying these APIs to support a wide range of practical distributed applications.

References

1. Beal, J.: Dynamically defined processes for spatial computers. In: Spatial Computing Workshop, New York, pp. 206–211. IEEE (September 2009)
2. Beal, J., Bachrach, J.: Infrastructure for engineered emergence in sensor/actuator networks. IEEE Intelligent Systems 21, 10–19 (2006)
3. Beal, J., Dulman, S., Usbeck, K., Viroli, M., Correll, N.: Organizing the aggregate: Languages for spatial computing. In: Mernik, M. (ed.) Formal and Practical Aspects of Domain-Specific Languages: Recent Developments, ch. 16, pp. 436–501. IGI Global (2013). A longer version available at http://arxiv.org/abs/1202.5509
4. Beal, J., Viroli, M.: Building blocks for aggregate programming of self-organising applications. In: 2nd FoCAS Workshop on Fundamentals of Collective Systems, pp. 1–6. IEEE CS (2014) (to appear)
5. Beal, J., Viroli, M., Damiani, F.: Towards a unified model of spatial computing. In: 7th Spatial Computing Workshop (SCW 2014), AAMAS 2014, Paris, France (May 2014)
6. Butera, W.: Programming a Paintable Computer. PhD thesis, MIT, Cambridge, USA (2002)
7. Church, A.: A set of postulates for the foundation of logic. Annals of Mathematics 33(2), 346–366 (1932)
8. Clement, L., Nagpal, R.: Self-assembly and self-repairing topologies. In: Workshop on Adaptability in Multi-Agent Systems, RoboCup Australian Open (2003)
9. Damas, L., Milner, R.: Principal type-schemes for functional programs. In: Symposium on Principles of Programming Languages, POPL 1982, pp. 207–212. ACM (1982)
10. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Communications of the ACM 51(1), 107–113 (2008)
11. Gelernter, D.: Generative communication in Linda. ACM Trans. Program. Lang. Syst. 7(1), 80–112 (1985)
12. Igarashi, A., Pierce, B.C., Wadler, P.: Featherweight Java: A minimal core calculus for Java and GJ. ACM Transactions on Programming Languages and Systems 23(3) (2001)
13. Kutten, S., Patt-Shamir, B.: Time-adaptive self stabilization. In: Proceedings of ACM Symposium on Principles of Distributed Computing, pp. 149–158. ACM (1997)
14. Lin, F.C.H., Keller, R.M.: The gradient model load balancing method. IEEE Trans. Softw. Eng. 13(1), 32–38 (1987)
15. Mamei, M., Zambonelli, F.: Programming pervasive and mobile computing applications: The TOTA approach. ACM Trans. on Software Engineering Methodologies 18(4), 1–56 (2009)
16. Newton, R., Welsh, M.: Region streams: Functional macroprogramming for sensor networks. In: Workshop on Data Management for Sensor Networks, pp. 78–87 (August 2004)
17. Pianini, D., Beal, J., Viroli, M.: Practical aggregate programming with Protelis. In: ACM Symposium on Applied Computing (SAC 2015) (to appear, 2015)
18. Pianini, D., Montagna, S., Viroli, M.: Chemical-oriented simulation of computational systems with Alchemist. Journal of Simulation 7, 202–215 (2013)
19. Viroli, M., Casadei, M., Montagna, S., Zambonelli, F.: Spatial coordination of pervasive services through chemical-inspired tuple spaces. ACM Transactions on Autonomous and Adaptive Systems 6(2), 14:1–14:24 (2011)
20. Viroli, M., Damiani, F.: A calculus of self-stabilising computational fields. In: Kühn, E., Pugliese, R. (eds.) COORDINATION 2014. LNCS, vol. 8459, pp. 163–178. Springer, Heidelberg (2014)
21. Viroli, M., Damiani, F., Beal, J.: A calculus of computational fields. In: Canal, C., Villari, M. (eds.) ESOCC 2013. CCIS, vol. 393, pp. 114–128. Springer, Heidelberg (2013)
22. Viroli, M., Pianini, D., Beal, J.: Linda in space-time: An adaptive coordination model for mobile ad-hoc environments. In: Sirjani, M. (ed.) COORDINATION 2012. LNCS, vol. 7274, pp. 212–229. Springer, Heidelberg (2012)


Real Time Systems


Timely Dataflow: A Model

Martín Abadi¹,² and Michael Isard³

¹ Google, Mountain View, CA, USA
² University of California, Santa Cruz, CA, USA
³ Microsoft Research, Mountain View, CA, USA

Abstract. This paper studies timely dataflow, a model for data-parallel computing in which each communication event is associated with a virtual time. It defines and investigates the could-result-in relation, which is central to this model, then the semantics of timely dataflow graphs.

1 Introduction

Timely dataflow is a model of data-parallel computation that extends traditional dataflow (e.g., [10]) by associating each communication event with a virtual time [12]. Virtual times need not be linearly ordered, nor correspond to the order in which events are processed. As in the Time Warp mechanism [7], virtual times serve to differentiate between data in different phases or aspects of a computation, for example data associated with different batches of inputs and different loop iterations. Thus, an implementation may overlap, but still distinguish, work that corresponds to multiple logical parts of a computation.

In this model, each node in a dataflow graph can request to be notified when it has received all messages for a given virtual time. The facilities for asynchronous processing and completion notifications imply that, even within a single program, some components can function in batch mode (queuing inputs and delaying processing until an appropriate notification) and others in streaming mode (processing inputs as they arrive). For example, an application may process a stream of GPS readings; as these readings arrive, the application may update a map and, after each batch of readings, recompute shortest paths between landmarks.

The Naiad system [12] is the origin and an embodiment of timely dataflow. Naiad aspires to serve as a coherent platform for data-parallel applications, offering both high throughput and low latency. Timely dataflow is crucial to this goal. Naiad contrasts with other systems that focus on narrower domains (e.g., graph problems) or on particular classes of programs (e.g., without loops).

The development and presentation of timely dataflow in the context of Naiad was fairly precise but informal. Only one of its critical components (a distributed algorithm that keeps track of virtual times for which there may remain work) was rigorously specified and verified [4]. Moreover, in the context of Naiad, definitions focus on particular structures of dataflow graphs and particular types of nodes. Specifically, Naiad supports iterative computations, with loops that

Most of this work was done at Microsoft Research.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 131–145, 2015.
DOI: 10.1007/978-3-319-19195-9_9


132 M. Abadi and M. Isard

include special nodes for ingress, feedback, and egress, and with a set of virtual times that includes coordinates for input epochs and loop counters.

The goal of this paper is to provide a general, rigorous definition of timely dataflow. We allow arbitrary graph structures, partial orders of virtual times, and stateful local computations at each of the nodes. The local computations are deterministic (only for simplicity); non-determinism is introduced by the ordering of events. We specify the semantics of timely dataflow graphs using a linear-time temporal logic. In this setting, we explore some of the fundamental concepts and properties of the model. In particular, we study the could-result-in relation, which drives completion notifications; for instance, we investigate how it applies to recursive dataflow computations, which are beyond Naiad's present scope. The semantics serves as the basis for rigorous proofs, as we demonstrate with an example application. We are finding the semantics valuable in other, more substantial applications. Specifically, the results of this paper have already been useful to us in our work on information-flow security properties [1] and on fault-tolerance [2]. Our rather elementary formulation of the semantics amply suffices for these present purposes; we leave algebraic or categorical presentations (see, e.g., [6]) for further work.

The next section defines dataflow graphs and other basic notions. Section 3 concerns the could-result-in relation. Section 4 describes the semantics of graphs, and Section 5 applies it. Section 6 concludes. Because of space constraints, proofs are omitted.

2 Dataflow Graphs, Messages, and Times

As is typical in dataflow models, we specify computations as directed graphs, with distinguished input and output edges. The graphs may contain cycles. During execution, stateful nodes send and receive timestamped messages, and in addition may request and receive notifications that they have received all messages with a certain timestamp. This section defines the graphs and the behavior of individual nodes; later sections cover more global aspects of the semantics.

We write ∅ both for the empty sequence and for the empty set. We write 〈〈m0, m1, . . .〉〉 for the sequence (finite or infinite) that consists of m0, m1, . . . . We use "·" for sequence concatenation and also for appending elements to sequences, for example writing m·u instead of 〈〈m〉〉·u, where u is a sequence and m an element. A mapping f on elements is extended to a mapping on sequences by letting f(〈〈m0, m1, m2, . . .〉〉) = 〈〈f(m0), f(m1), . . .〉〉, and to a mapping on sets by letting f(S) = {f(s) : s ∈ S}. When A is a set, we write P(A) for its powerset, and A∗ and Aω, respectively, for the sets of finite and infinite sequences of elements of A. When f is a function with a domain that includes A, we write f↾A for the restriction of f to A. When B is also a set, we write Πx∈A.B for the set of functions that map each x ∈ A to an element of B; if A is a finite set {a1, . . . , ak} and b1, . . . , bk are elements of B, we write such a function 〈a1 ↦ b1, . . . , ak ↦ bk〉.
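The notational conventions above can be mirrored in a few lines of Python; the helper names (map_seq, restrict, powerset) are our own labels, chosen for illustration, and only the finite cases are covered.

```python
from itertools import chain, combinations

def map_seq(f, seq):
    # f(<<m0, m1, ...>>) = <<f(m0), f(m1), ...>>, finite case only
    return tuple(f(m) for m in seq)

def restrict(f, A):
    # f|A: restriction of a function (here a dict) to the subdomain A
    return {x: f[x] for x in f if x in A}

def powerset(A):
    # P(A): all subsets of the finite set A
    s = list(A)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]
```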


Timely Dataflow: A Model 133

2.1 Basics of Graphs, Messages, and Times

We assume a set of messages M, a partial order of times (T, ≤), and a time time(m) ∈ T for each m ∈ M. We also assume a finite set of nodes (processors) P, and a set of local states ΣLoc for them. Finally, we assume a set of edges (channels), partitioned into input edges I, internal edges E, and output edges O. Edges have sources and destinations (not always both): for each i ∈ I, dst(i) ∈ P, and src(i) is undefined; for each e ∈ E, src(e), dst(e) ∈ P, and we require that they are distinct; for each o ∈ O, src(o) ∈ P, and dst(o) is undefined.

Input edges are not essential for computations getting started, because nodes can initially create data in response to notifications. We include input edges as a convenience, and because they can serve for connecting graphs.

We allow (but do not require) the set of times to be the disjoint union of multiple "time domains". For example, a node may receive inputs tagged with GMT times, and produce outputs tagged with GMT times, PST times, or perhaps with sequence numbers, and may even send outputs in different time domains along different edges. In Naiad, the nodes for loop ingress and egress, respectively, add and remove time coordinates that represent loop counters. Accordingly, we do not assume, for example, that it is always immediately meaningful to compare the times of inputs and outputs.

2.2 Processor Behavior

Timely dataflow supports stateful computations in which each node maintains local state. For each node p, a subset Initial(p) of ΣLoc × P(T) describes the possible initial states and initial notification requests for p. A local history for p is a finite sequence of the form 〈〈(s, N), x1, . . . , xn〉〉 where (s, N) ∈ Initial(p), n ≥ 0, and each xi is either a pair (d, m) where m ∈ M and d ∈ I ∪ E with dst(d) = p, or a time t ∈ T. In this context, we call a pair (d, m) or a time t an event. Thus, a local history records the order in which a node consumes events; it also determines what the node does in response to these events, via the function g1 introduced below. We write Histories(p) for the set of local histories of node p.

For each node p, the function g1(p) maps ΣLoc × (T ∪ ({d ∈ I ∪ E | dst(d) = p} × M)) to ΣLoc × P(T) × (Π{d∈E∪O|src(d)=p}.M∗). Intuitively, g1 describes one step of computation by one node:

– g1(p)(s, t) = (s′, {t1, . . . , tn}, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) means that, in response to a notification for time t and at a state s, the node p can move to state s′, request notifications for times t1, . . . , tn, and add message sequences μ1, . . . , μk on outgoing edges e1, . . . , ek, respectively.

– g1(p)(s, (d, m)) = (s′, {t1, . . . , tn}, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) means that, in response to a message m on incoming edge d and at a state s, the node p can move to state s′, request notifications for times t1, . . . , tn, and add message sequences μ1, . . . , μk on outgoing edges e1, . . . , ek, respectively.
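As an illustration of the shape of g1, here is a Python sketch of one hypothetical node: a counter that relays each incoming message and requests a notification for the message's time. The dictionary-based message representation and the edge name "out" are our own assumptions, not part of the model.

```python
def g1_counter(state, event):
    # event is either a time t (a notification) or a pair (d, m):
    # message m on incoming edge d; times are plain integers here
    if isinstance(event, tuple):
        d, m = event
        t = m["time"]
        # count the message, request a notification for its time,
        # and relay it on the single outgoing edge "out"
        return state + 1, {t}, {"out": [m]}
    # a notification: no state change, no requests, no output
    return state, set(), {"out": []}

s, N, out = g1_counter(0, ("in", {"time": 3, "payload": "x"}))
```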



We could easily restrict these definitions so that a message for time t cannot appear in a history to the right of a notification for time t, and so that notifications appear only in response to notification requests. However, such restrictions do not seem necessary; each node can enforce them.

We extend the function g1 to a function g that applies to local histories. For each node p, g(p) maps Histories(p) to ΣLoc × P(T) × (Π{d∈E∪O|src(d)=p}.M∗), and is defined inductively by:

– g(p)(〈〈(s, N)〉〉) = (s, N, 〈e1 ↦ ∅, . . . , ek ↦ ∅〉)

– If g(p)(h) = (s′, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉), h′ = h·t, and g1(p)(s′, t) = (s′′, N′, 〈e1 ↦ μ′1, . . . , ek ↦ μ′k〉), then g(p)(h′) = (s′′, N − {t} ∪ N′, 〈e1 ↦ μ1·μ′1, . . . , ek ↦ μk·μ′k〉)

– If g(p)(h) = (s′, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉), h′ = h·(d, m), and g1(p)(s′, (d, m)) = (s′′, N′, 〈e1 ↦ μ′1, . . . , ek ↦ μ′k〉), then g(p)(h′) = (s′′, N ∪ N′, 〈e1 ↦ μ1·μ′1, . . . , ek ↦ μk·μ′k〉)

Given a triple (s, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉), perhaps obtained via one of these functions, we write: ΠLoc(s, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) for s, ΠNR(s, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) for N, and Πei(s, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) for μi.

In this model, each node can consume and produce multiple events in one atomic action. For example, a node may simultaneously dequeue an input message and produce two output messages on each of two distinct edges. Alternative models could be more asynchronous; in our example, the node would first dequeue the input message, and after some delay produce the two output messages one after the other. Fortunately, such an asynchronous model can be seen as a special case of ours: in our model, asynchronous behavior can be produced by buffering (see, e.g., [14]). We say that p ∈ P is a buffer node if there exist exactly one e1 ∈ I ∪ E such that dst(e1) = p and exactly one e2 ∈ E ∪ O such that src(e2) = p, and g1(p)(s, t) = (s, ∅, 〈e2 ↦ ∅〉) and g1(p)(s, (e1, m)) = (s, ∅, 〈e2 ↦ 〈〈m〉〉〉). Such a node p is simply a relay between queues. (The term "buffer" comes from the literature.) In order to simulate a more asynchronous semantics, we could require that every non-buffer node has its output edges going into buffer nodes. However, we do not need to impose this constraint.
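The buffer-node clauses translate almost directly into code; this Python sketch fixes hypothetical edge names e1 and e2 and represents the triple (state, notification requests, outputs per edge) as a Python tuple.

```python
def g1_buffer(state, event, in_edge="e1", out_edge="e2"):
    # A buffer node: state unchanged, no notification requests,
    # each incoming message relayed unchanged on the one outgoing edge.
    if isinstance(event, tuple) and event[0] == in_edge:
        return state, set(), {out_edge: [event[1]]}
    # notifications (and messages on other edges) produce nothing
    return state, set(), {out_edge: []}
```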

3 Pointstamps and the Could-result-in Relation

As indicated in the Introduction, each node can request to be notified when it has received all messages for a given virtual time. Furthermore, "under the covers", an implementation may benefit from knowing that a virtual time is complete in order to reclaim associated resources. Thus, the notion of completion of virtual times is central to timely dataflow and to its practical realization. Reasoning about completion is based on the could-result-in relation on pointstamps. In this section we define this relation and establish some of its properties.



3.1 Defining Could-result-in

A pointstamp is a pair (x, t) of a location x (node or edge) in a graph and a time t. Thus, the set of pointstamps is ((I ∪ E ∪ O) ∪ P) × T. We say that pointstamp (x, t) could-result-in pointstamp (x′, t′), and write (x, t) ⤳ (x′, t′), if a message or notification at location x and time t may lead to a message or notification at location x′ and time t′. We define ⤳ via an auxiliary relation ⤳1 that reflects one step of computation.

Definition 1. (p, t) ⤳1 (d, t′) if and only if src(d) = p and there exist a history h for p and a state s such that

g(p)(h) = (s, . . .)

and an event x such that either x = t or x = (e, m) for some e and m such that t = time(m), and

g1(p)(s, x) = (. . . , 〈. . . d ↦ μ . . .〉)

where some element of μ has time t′.

Definition 2. (x, t) ⤳ (x′, t′) if and only if

– x = x′ and t ≤ t′, or
– there exist k > 1, distinct xi for i = 1 . . . k, and (not necessarily distinct) ti for i = 1 . . . k, such that x = x1, x′ = xk, t ≤ t1, and tk ≤ t′, and for all i = 1 . . . k − 1:
  • xi ∈ I ∪ E, xi+1 ∈ P, dst(xi) = xi+1, and ti = ti+1, or
  • xi ∈ P, xi+1 ∈ E ∪ O, src(xi+1) = xi, and there exist t′i ≥ ti and t′′i ≤ ti+1 such that (xi, t′i) ⤳1 (xi+1, t′′i).

In the first case, we say that the proof of (x, t) ⤳ (x′, t′) has length 1; in the second, that it has length k. (These lengths are helpful in inductive arguments. Different proofs of (x, t) ⤳ (x′, t′) may in general have different lengths.)
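Under Naiad's conservative assumption mentioned in the next paragraph (for most nodes p, (p, t) ⤳1 (e, t) for every outgoing edge e) and a single totally ordered time domain, Definition 2 collapses to "t ≤ t′ and x′ is reachable from x in the location graph". The following Python sketch decides that special case only; it is not a decision procedure for the general relation.

```python
from collections import deque

def could_result_in(x, t, x2, t2, dst, src):
    # dst: edge -> destination node; src: edge -> source node.
    # Conservative special case: times totally ordered, every node
    # passes every time straight through to each outgoing edge.
    if not t <= t2:
        return False
    if x == x2:
        return True
    outgoing = {}
    for e, p in src.items():
        outgoing.setdefault(p, []).append(e)
    # BFS over locations: an edge steps to its dst node,
    # a node steps to each of its outgoing edges
    seen, frontier = {x}, deque([x])
    while frontier:
        y = frontier.popleft()
        nexts = [dst[y]] if y in dst else outgoing.get(y, [])
        for z in nexts:
            if z == x2:
                return True
            if z not in seen:
                seen.add(z)
                frontier.append(z)
    return False
```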

This definition captures the semantics of an arbitrary node p, via the functions g1 and g. The function g is applied to a local history to generate a state s, then g1 is applied at s. Thus, the definition restricts attention to states s that can arise in some execution with p. However, we do not attempt to guarantee that this execution is one of those that can occur in the context of the other nodes in the graph of interest, in order to avoid a circularity: this latter set of executions is itself defined in terms of the relation ⤳ (in Section 4.2).

An implementation, such as Naiad's, may soundly use simple, conservative approximations to the relation ⤳ as we define it here. In Naiad, for most nodes p, it is assumed that (p, t) ⤳1 (e, t) for all t and each outgoing edge e; certain nodes (loop ingress, feedback, and egress) receive special treatment.

The definition implies that ⤳ is reflexive. The following proposition asserts a few of the additional properties of ⤳ that we have found useful.



Proposition 1.

1. If (p, t1) ⤳ (e, t2) then there are e′ ∈ E ∪ O and t′ ∈ T such that src(e′) = p, (p, t1) ⤳ (e′, t′) and (e′, t′) ⤳ (e, t2), with the proof of (e′, t′) ⤳ (e, t2) strictly shorter than that of (p, t1) ⤳ (e, t2).
2. If (x, t1) ⤳ (x, t2) then t1 ≤ t2.
3. If (x1, t1) ⤳ (x2, t2), t′1 ≤ t1, and t2 ≤ t′2, then (x1, t′1) ⤳ (x2, t′2).

The definition is designed to be convenient in proofs and to reflect important aspects of implementations (and of Naiad's specifically). In particular, the distinctness requirement ("there exist k > 1, distinct xi for i = 1 . . . k") means that proofs and implementations do not need to chase around cycles.

On the other hand, because of the distinctness requirement, the definition does not immediately yield that ⤳ is transitive, as one might expect, and as one might often want in proofs. More broadly, the definition of ⤳ may not correspond to the intuitive understanding of could-result-in without some additional assumptions, which we address next.

3.2 On Sending Notification Requests and Messages into the Past

In timely dataflow, and in Naiad in particular, it is generally expected that events do not give rise to other events at earlier times. When those other events are notification requests, the required condition is easy to state. When they are messages, it is not, because we do not wish to compare times across time domains. In this section we formulate and study these two conditions.

The first considers the generation of notification requests, which the definition of ⤳ ignores. We formulate it via an additional relation ⤳N, a local variant of the could-result-in relation that focuses on the generation of notification requests. (This relation is not intended to be reflexive or transitive.)

Definition 3. (p, t) ⤳N (p, t′) if and only if there exist a history h for p and a state s such that

g(p)(h) = (s, N1, . . .)

and an event x such that either x = t′′ for some t′′ such that t ≤ t′′, or x = (e, m) for some e and m such that t ≤ time(m), and

g1(p)(s, x) = (. . . , N, . . .)

where some element of N − N1 is ≤ t′.

Using this relation, we can express that an event at time t can trigger notification requests only at greater times t′:

Condition 1. If (p, t) ⤳N (p, t′) then t ≤ t′.

The question of the transitivity of ⤳ is closely related to the expectation that nodes should not be allowed to send messages into the past. Indeed, a sufficient condition for transitivity is that, for all pointstamps (x, t) and (x′, t′),



if (x, t) ⤳ (x′, t′) then t ≤ t′ (as implied by Theorem 1, below). However, the converse does not hold, for trivial reasons. For example, in a graph with a single node, the relation ⤳ will always be transitive but we may not have that (x, t) ⤳ (x′, t′) implies t ≤ t′. Still, we can compare times at a node and at its incoming edges, and fortunately such comparisons suffice, as the following theorem demonstrates.

Condition 2. For all p ∈ P, e ∈ E with dst(e) = p, and t, t′ ∈ T, if (p, t) ⤳ (e, t′) then t ≤ t′.

Theorem 1. The relation ⤳ is transitive if and only if Condition 2 holds.

Conditions 1 and 2 both depend on the semantics of individual nodes; Condition 2 also depends on the topology of the graph. Although we assume them in some of our results, we do not discuss how they can be enforced. In practice, Naiad simply assumes analogous properties, but type systems and other static analyses may well help in checking them.

The following proposition offers another way of thinking about transitivity by comparing times at different nodes and edges, via an embedding of these times into an additional partial order (T′, ⊑). One may view (T′, ⊑) as a set of times normalized into a coherent universal time—the "GMT" of timely dataflow. (This proposition is fairly straightforward, and we do not need it below.)

Proposition 2. The relation ⤳ is transitive if and only if there exist a partial order (T′, ⊑) and a mapping E from the set of pointstamps ((I ∪ E ∪ O) ∪ P) × T to T′ such that, for all (x, t) and (x′, t′), (x, t) ⤳ (x′, t′) if and only if E(x, t) ⊑ E(x′, t′).

3.3 Closure

We say that a set S of pointstamps is upward closed if and only if, for all pointstamps (x, t) and (x′, t′), (x, t) ∈ S and (x, t) ⤳ (x′, t′) imply (x′, t′) ∈ S. For any set S of pointstamps, Close↑(S) is the least upward closed set that contains S. Assuming that ⤳ is transitive, the following proposition provides a simpler formulation for Close↑.

Proposition 3. Assume that Condition 2 holds. Then Close↑(S) = {(x′, t′) | ∃(x, t) ∈ S. (x, t) ⤳ (x′, t′)}.
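Close↑ can be computed as a least fixpoint whenever the set of pointstamps reachable by ⤳ is finite. This Python sketch takes the one-step successor function as a parameter; the paper defines Close↑ abstractly, so the successor representation is our own assumption.

```python
def close_up(S, successors):
    # Least upward closed superset of S: saturate under the
    # given one-step could-result-in successors.
    closed = set(S)
    frontier = list(S)
    while frontier:
        ps = frontier.pop()
        for q in successors(ps):
            if q not in closed:
                closed.add(q)
                frontier.append(q)
    return closed
```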

3.4 Recursion

Naiad focuses on iterative computation, and the could-result-in relation for the nodes that support iteration (loop ingress, feedback, and egress) has been discussed informally [12]. We could revisit iteration using our definitions. However, the definitions are much more general. We demonstrate the value of this generality by outlining how they apply to recursive dataflow computation (e.g., [5]).



Let us consider a dataflow graph that includes a distinguished input node in with no incoming edges, a distinguished output node out with no outgoing edges, some ordinary nodes for operations on data, and other nodes that represent recursive calls to the entire computation. For simplicity, we let I = O = ∅, and do not consider multiple mutually recursive graphs and other variants. We assume that every node is reachable from in, that out is reachable from every node, and that there is a path from in to out that does not go through any call nodes. In order to make the recursion explicit, we modify the dataflow graph by splitting each call node c into a call part call-c and a return part ret-c, where the former is the source of a back edge to in and the latter is the destination of a back edge from out.

A stack ∫ is a finite sequence of call nodes. We let ⤳̇ be the least reflexive, transitive relation on pairs (p, ∫) such that

1. (call-c, ∫) ⤳̇ (in, ∫·c);
2. symmetrically, (out, ∫·c) ⤳̇ (ret-c, ∫); and
3. if p is not call-c and p′ is not ret-c for any c, and there is an edge from p to p′, then (p, ∫) ⤳̇ (p′, ∫).

We will have that ⤳̇ is a conservative approximation of ⤳.

At each node p, we define a pre-order on stacks: ∫ ≼p ∫′ if and only if (p, ∫) ⤳̇ (p, ∫′). We write (Tp, ≤p) for the partial order induced by ≼p (so, Tp identifies ∫ and ∫′ when both ∫ ≼p ∫′ and ∫′ ≼p ∫). The partial order is thus different at each node. The partial order of virtual times (T, ≤) is the disjoint ("tagged") union of the partial orders (Tp, ≤p) for all the nodes. We write [∫]p for the element of T obtained by tagging the equivalence class of ∫ at p.

We assume that each node p uses the appropriate tags for its outgoing messages and notification requests, and ignores inputs and notifications not tagged with p, and also that the behavior of p, as reflected in the relation ⤳1, conforms to what the relation ⤳̇ expresses:

If (p, [∫]q) ⤳1 (d, [∫′]p′) then q = p, p′ = dst(d), and (p, ∫) ⤳̇ (p′, ∫′).

We obtain:

Proposition 4. If (p, [∫]p) ⤳ (p′, [∫′]p′) then (p, ∫) ⤳̇ (p′, ∫′).

Proposition 5. If (p, [∫]q) ⤳ (p′, [∫′]q′) and p ≠ p′, then q = p and q′ = p′.

Applying Theorem 1, we also obtain:

Proposition 6. The relation ⤳ is transitive.

Furthermore, the relation ⤳̇ can be decided quite simply by finding the first call in which two stacks differ and performing an easy check based on that difference. This check relies on an alternative modified graph, in which we split each call node c into a call part call-c and a return part ret-c, but add a direct



forward edge from the former to the latter (rather than back edges). Suppose (without loss of generality) that ∫ is of the form ∫1·∫2 and ∫′ is of the form ∫1·∫′2, where ∫2 and ∫′2 start with c and c′ respectively if they are not empty. We assume that c and c′ are distinct if ∫2 and ∫′2 are both non-empty (so, ∫1 is maximal). Let l be ret-c if ∫2 is non-empty, and be p if it is empty; let l′ be call-c′ if ∫′2 is non-empty, and be p′ if it is empty. Then we can prove that (p, ∫) ⤳̇ (p′, ∫′) if and only if there is a path from l to l′ in the alternative modified graph.
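The stack check just described can be sketched in Python as follows; the function name dot_crin is ours, stacks are Python lists of call nodes, and reachable is an assumed oracle for paths in the alternative modified graph.

```python
def dot_crin(p, stack, p2, stack2, reachable):
    # Drop the longest common prefix of the two stacks
    i = 0
    while i < len(stack) and i < len(stack2) and stack[i] == stack2[i]:
        i += 1
    rest, rest2 = stack[i:], stack2[i:]
    # l is ret-c for the first differing call c (or p itself);
    # l' is call-c' for the other side (or p' itself)
    l = ("ret", rest[0]) if rest else p
    l2 = ("call", rest2[0]) if rest2 else p2
    # decided by a path in the alternative modified graph
    return reachable(l, l2)
```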

Special cases (in particular, special graph topologies) may allow further simplifications, which could be helpful in implementations.

4 Semantics

We describe the semantics of timely dataflow graphs in a state-based framework [3,11]. In this section, we first review this framework, then specify the semantics. Finally, we discuss matters of compositionality.

4.1 The Framework (Review)

The sequence 〈〈s0, s1, s2, . . .〉〉 is said to be stutter-free if, for each i, either si ≠ si+1 or the sequence is infinite and si = sj for all j ≥ i. We let ♮σ be the stutter-free sequence obtained from σ by replacing every maximal finite subsequence si, si+1, . . . , sj of identical elements with the single element si. A set of sequences S is closed under stuttering when σ ∈ S if and only if ♮σ ∈ S.
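For finite sequences, the stutter-removal operation is a one-pass collapse of adjacent duplicates (the infinite-tail case in the definition is elided here):

```python
def stutter_free(seq):
    # Replace every maximal run of identical consecutive states
    # with a single occurrence (finite sequences only).
    out = []
    for s in seq:
        if not out or out[-1] != s:
            out.append(s)
    return out
```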

A state space Σ is a subset of ΣE × ΣI for some sets ΣE of externally visible states and ΣI of internal states. If Σ is a state space, then a Σ-behavior is an element of Σω. A ΣE-behavior is called an externally visible behavior. A Σ-property P is a set of Σ-behaviors that is closed under stuttering. When Σ is clear from context or is irrelevant, we may leave it implicit. We sometimes apply the adjective "complete", as in "complete behavior", in order to distinguish behaviors and properties from externally visible behaviors and properties.

A state machine is a triple (Σ, F, N) where Σ is a state space; F, the set of initial states, is a subset of Σ; and N, the next-state relation, is a subset of Σ × Σ. The complete property generated by a state machine (Σ, F, N) consists of all infinite sequences 〈〈s0, s1, . . .〉〉 such that s0 ∈ F and, for all i ≥ 0, either 〈si, si+1〉 ∈ N or si = si+1. The externally visible property generated by a state machine is the externally visible property obtained from its complete property by projection onto ΣE and closure under stuttering. For brevity, we do not consider fairness conditions or other liveness properties that can be added to state machines; their treatment is largely orthogonal to our present goals.
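A finite prefix can be checked against a state machine (Σ, F, N) directly; this Python sketch tests whether a prefix could extend to a behavior in the complete property, allowing stuttering steps.

```python
def is_generated_prefix(seq, F, N):
    # seq must start in an initial state, and every adjacent pair
    # must be a next-state step in N or a stutter (equal states)
    if not seq or seq[0] not in F:
        return False
    return all((a, b) in N or a == b for a, b in zip(seq, seq[1:]))
```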

Although we are not fully formal in the use of TLA [11], we generally follow its approach to writing specifications. Specifically, we express state machines by formulas of the form:

∃y1, . . . , yn. F ∧ □[N]v1,...,vk



where:

– state functions that we write as variables represent the state;

– we distinguish external variables and internal variables, and the internal variables (in this case, y1, . . . , yn) are existentially quantified;

– F is a formula that may refer to the variables;

– □ is the temporal-logic operator "always";

– N is a formula that may refer to the variables and also to primed versions of the variables (thus denoting the values of those variables in the next state);

– [N]v1,...,vk abbreviates N ∨ ((v′1 = v1) ∧ . . . ∧ (v′k = vk)).

4.2 Semantics Specification

In our semantics, the externally visible states map each e ∈ I ∪ O to a value Q(e) in M∗. In other words, we observe only the state of input and output edges. The internally visible states map each e ∈ E to a value Q(e) in M∗, and each p ∈ P to a local state LocState(p) ∈ ΣLoc and to a set of pending notification requests NotRequests(p) ∈ P(T).

An auxiliary state function Clock (whose name comes from Naiad, and is unrelated to "clocks" elsewhere) tracks pointstamps for which work may remain:

Clock ≜ Close↑(
  {(e, time(m)) | e ∈ I ∪ E ∪ O, m ∈ Q(e)}
  ∪ {(p, t) | p ∈ P, t ∈ NotRequests(p)}
)
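The Clock definition can be read operationally: collect the pointstamp of every queued message and every pending notification request, then close upward. A Python sketch, with the upward-closure step passed in as an assumed helper:

```python
def clock(Q, not_requests, time, close_up):
    # Q: edge -> list of queued messages;
    # not_requests: node -> set of requested times;
    # time: message -> its virtual time;
    # close_up: assumed upward closure under could-result-in
    base = {(e, time(m)) for e, msgs in Q.items() for m in msgs}
    base |= {(p, t) for p, ts in not_requests.items() for t in ts}
    return close_up(base)
```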

We define an initial condition, the actions that constitute a next-state relation, and finally the specification.

Initial condition:

InitProp ≜
  ∀e ∈ E ∪ O. Q(e) = ∅ ∧ ∀i ∈ I. Q(i) ∈ M∗
  ∧ ∀p ∈ P. (LocState(p), NotRequests(p)) ∈ Initial(p)

Actions:

1. Receiving a message:

Mess ≜ ∃p ∈ P. Mess1(p)

Mess1(p) ≜
  ∃m ∈ M. ∃e ∈ I ∪ E such that p = dst(e).
  Q(e) = m·Q′(e) ∧ Mess2(p, e, m)



Mess2(p, e, m) ≜
  let
    {e1, . . . , ek} = {d ∈ E ∪ O | src(d) = p},
    s = LocState(p),
    (s′, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) = g1(p)(s, (e, m))
  in
    LocState′(p) = s′
    ∧ NotRequests′(p) = NotRequests(p) ∪ N
    ∧ Q′(e1) = Q(e1)·μ1 ∧ . . . ∧ Q′(ek) = Q(ek)·μk
    ∧ ∀q ∈ P, q ≠ p. LocState′(q) = LocState(q)
    ∧ ∀q ∈ P, q ≠ p. NotRequests′(q) = NotRequests(q)
    ∧ ∀d ∈ I ∪ E ∪ O − {e, e1, . . . , ek}. Q′(d) = Q(d)

These formulas describe how a node p dequeues a message m and reacts to it, producing messages and notification requests.
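The Mess action can be mirrored as an atomic update on a dictionary-shaped global state; this Python sketch is our own rendering (the field names Q, LocState, NotRequests follow the formulas), not Naiad's implementation.

```python
def mess_step(state, p, e, g1):
    # One atomic Mess step: dequeue the head of Q(e), run g1(p),
    # append its outputs, and merge its notification requests.
    Q, loc, nreq = state["Q"], state["LocState"], state["NotRequests"]
    m, rest = Q[e][0], Q[e][1:]
    s2, N, outputs = g1(p)(loc[p], (e, m))
    Q2 = dict(Q)
    Q2[e] = rest
    for d, mu in outputs.items():
        Q2[d] = Q2.get(d, []) + list(mu)
    loc2 = dict(loc)
    loc2[p] = s2
    nreq2 = dict(nreq)
    nreq2[p] = nreq[p] | N
    return {"Q": Q2, "LocState": loc2, "NotRequests": nreq2}
```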

2. Receiving a notification:

Not ≜ ∃p ∈ P. Not1(p)

Not1(p) ≜
  ∃t ∈ NotRequests(p).
  (∀e ∈ I ∪ E such that dst(e) = p. (e, t) ∉ Clock)
  ∧ Not2(p, t)

Not2(p, t) ≜
  let
    {e1, . . . , ek} = {d ∈ E ∪ O | src(d) = p},
    s = LocState(p),
    (s′, N, 〈e1 ↦ μ1, . . . , ek ↦ μk〉) = g1(p)(s, t)
  in
    LocState′(p) = s′
    ∧ NotRequests′(p) = NotRequests(p) − {t} ∪ N
    ∧ Q′(e1) = Q(e1)·μ1 ∧ . . . ∧ Q′(ek) = Q(ek)·μk
    ∧ ∀q ∈ P, q ≠ p. LocState′(q) = LocState(q)
    ∧ ∀q ∈ P, q ≠ p. NotRequests′(q) = NotRequests(q)
    ∧ ∀d ∈ I ∪ E ∪ O − {e1, . . . , ek}. Q′(d) = Q(d)



These formulas describe how a node p consumes a notification t for which it has an outstanding notification request, and how it reacts to the notification, producing messages and notification requests.

3. External input and output changes:

Inp ≜
  ∀i ∈ I. Q(i) is a subsequence of Q′(i)
  ∧ ∀p ∈ P. LocState′(p) = LocState(p)
  ∧ ∀p ∈ P. NotRequests′(p) = NotRequests(p)
  ∧ ∀d ∈ E ∪ O. Q′(d) = Q(d)

Outp ≜
  ∀o ∈ O. Q′(o) is a subsequence of Q(o)
  ∧ ∀p ∈ P. LocState′(p) = LocState(p)
  ∧ ∀p ∈ P. NotRequests′(p) = NotRequests(p)
  ∧ ∀d ∈ I ∪ E. Q′(d) = Q(d)

External input changes allow the contents of input edges to be extended rather arbitrarily. We do not assume that such extensions are harmonious with notifications and the use of Clock; from this perspective, it would be reasonable and straightforward to add the constraint Clock′ ⊆ Clock to Inp. Similarly, external output changes allow the contents of output edges to be removed, not necessarily in order. We ask that Q(i) be a subsequence of Q′(i) and that Q′(o) be a subsequence of Q(o), so that it is easy to attribute state transitions. While variants on these two actions are viable, allowing some degree of external change to input and output edges seems attractive for composability (see Section 4.3).

The high-level specification:

ISpec ≜ InitProp ∧ □[Mess ∨ Not ∨ Inp ∨ Outp]LocState,NotRequests,Q

Spec ≜ ∃LocState, NotRequests, Q↾E. ISpec

ISpec describes a complete property and Spec an externally visible property. This specification is the most basic of several that we have studied. For instance, another one allows certain message reorderings, replacing Mess1(p) with

∃m ∈ M. ∃e ∈ I ∪ E such that p = dst(e). ∃u, v ∈ M∗.
Q(e) = u·m·v ∧ Q′(e) = u·v ∧ (∀n ∈ u. time(n) ≰ time(m)) ∧ Mess2(p, e, m)

Given a queue of messages Q(e), p is allowed to process any message m such that there is no message n ahead of m with time(n) ≤ time(m). Mathematically, we may think of Q(e) as a partially ordered multiset (pomset) [13]; with that view, m is a minimal element of Q(e). This relaxation is useful, for example, for



enabling optimizations in which several messages for the same time are processed together, even if they are not all at the head of a queue.
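The relaxed rule can be read as selecting pomset-minimal messages; this Python sketch lists every m with no message n ahead of it such that time(n) ≤ time(m).

```python
def eligible_messages(queue, time, leq):
    # queue: list of messages in arrival order; leq: the partial
    # order on times. A message is eligible iff it is minimal in
    # the pomset view of the queue.
    out = []
    for i, m in enumerate(queue):
        if not any(leq(time(n), time(m)) for n in queue[:i]):
            out.append(m)
    return out
```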

4.3 Composing Graphs

We briefly discuss how to compose graphs, without however fully developing the corresponding definitions and theory (in part, simply, because we have not needed them in our applications of the semantics to date).

We can regard the specifications of this paper as being parameterized by a Clock variable, rather than as being specifically for Clock as defined in Section 4.2. Once we regard Clock as a parameter, the specifications that correspond to multiple dataflow graphs can be composed meaningfully, and along standard lines [9]. Suppose that we are given graphs G1 and G2, with nodes P1 and P2, input edges I1 and I2, internal edges E1 and E2, and output edges O1 and O2, and specifications Spec1 and Spec2. We assume that P1 and P2, I1 and I2, E1 and E2, and O1 and O2 are pairwise disjoint. We also assume that I1, I2, O1, and O2 are disjoint from E1 and E2. We write X12 = I2 ∩ O1 and X21 = I1 ∩ O2. The edges in X12 and X21 will connect the two graphs. We define a specification for the composite system with nodes P = P1 ∪ P2, input edges I = I1 ∪ I2 − X12 − X21, internal edges E = E1 ∪ E2 ∪ X12 ∪ X21, and output edges O = O1 ∪ O2 − X12 − X21, by

Spec12 = ∃Q↾(X12 ∪ X21). Spec1 ∧ Spec2 ∧ □[¬(Acts1 ∧ Acts2)]

where, for j = 1, 2,

Actsj =
  ∃i ∈ Ij. Q′(i) is a proper subsequence of Q(i)
  ∨ ∃o ∈ Oj. Q(o) is a proper subsequence of Q′(o)

The formula □[¬(Acts1 ∧ Acts2)] ensures that the actions of the two subsystems that are visible on their input and output edges are not simultaneous. It does not say anything about internal edges, nor does it address notification requests.

It remains to study how Spec12 relates to the non-compositional specification of the same system. Going further, the definition of a global Clock might be obtained compositionally from multiple, more local could-result-in relations. Finally, one might address questions of full abstraction. Although we rely on a state-based formalism, results such as Jonsson's [9] (which are cast in terms of I/O automata) should translate. However, a fully abstract treatment of timely dataflow would have interesting specificities, such as the handling of completion notifications and the possible restrictions on contexts (in particular, contexts constrained not to send messages into the past).

5 An Application

In order to leverage the definitions and to test them, we state and prove a basic but important property for timely dataflow. Specifically, we argue that, once a pointstamp (e, t) is not in Clock, messages on e will never have times ≤ t. For this property to hold, however, we require a hypothesis on inputs; we simply



assume that, for all input edges i, Q(i) never grows, though it may contain some messages initially. (Alternatively, we could add the constraint Clock′ ⊆ Clock to Inp, as suggested in Section 4.2.)

First, we establish some auxiliary propositions:

Proposition 7. ISpec implies that always, for all p ∈ P, there exists a local history H(p) for p such that LocState(p) = ΠLoc(g(p)(H(p))) and NotRequests(p) = ΠNR(g(p)(H(p))).

Proposition 8. Assume that Conditions 1 and 2 hold. Then ISpec implies

□[(∀i ∈ I. Q′(i) is a subsequence of Q(i)) ⇒ (Clock′ ⊆ Clock)]

Proposition 9.

□[(∀i ∈ I. Q′(i) is a subsequence of Q(i)) ⇒ (Clock′ ⊆ Clock)]
∧ □[∀i ∈ I. Q′(i) is a subsequence of Q(i)]
⇒ ∀e ∈ I ∪ E ∪ O, t ∈ T. □((e, t) ∉ Clock ⇒ □((e, t) ∉ Clock))

We obtain:

Theorem 2. Assume that Conditions 1 and 2 hold. Then ISpec and

□[∀i ∈ I. Q′(i) is a subsequence of Q(i)]

imply

∀e ∈ I ∪ E ∪ O, t ∈ T, m ∈ M. □((e, t) ∉ Clock ⇒ □(m ∈ Q(e) ⇒ time(m) ≰ t))

Previous work [4] studies a distributed algorithm for tracking the progress of a computation, and arrives at a somewhat analogous result. This previous work assumes a notion of virtual time but defines neither a dataflow model nor a corresponding could-result-in relation (so, in particular, it does not treat analogues of Conditions 1 and 2). In the distributed algorithm, information at each processor serves for constructing a conservative approximation of the pending work in a system. Naiad relies on such an approximation for implementing its clock, which the state function Clock represents in our model.

6 Conclusion

This paper aims to develop a rigorous foundation for timely dataflow, a model for data-parallel computing. Some of the ingredients in timely dataflow, as defined in this paper, have a well-understood place in the literature on semantics and programming languages. For instance, many programming languages support messages and message streams. On the other hand, despite similarities to extant concepts, other ingredients are more original, so giving them self-contained


Timely Dataflow: A Model 145

semantics can be both interesting and valuable for applications. In particular, virtual times and completion notifications may be reminiscent of the notion of priorities [15,8], but a straightforward reduction seems impossible. More broadly, there should be worthwhile opportunities for further foundational and formal contributions to research on data-parallel software, currently a lively area of experimental work in which several computational abstractions and models are being revisited, adapted, or invented.

Acknowledgments. We are grateful to our coauthors on work on Naiad for discussions that led to this paper. In addition, conversations with Nikhil Swamy and Dimitrios Vytiniotis motivated our study of recursion.

References

1. Abadi, M., Isard, M.: On the flow of data, information, and time. In: Focardi, R., Myers, A. (eds.) POST 2015. LNCS, vol. 9036, pp. 73–92. Springer, Heidelberg (2015)

2. Abadi, M., Isard, M.: Timely rollback: Specification and verification. In: Havelund, K., Holzmann, G., Joshi, R. (eds.) NFM 2015. LNCS, vol. 9058, pp. 19–34. Springer, Heidelberg (2015)

3. Abadi, M., Lamport, L.: The existence of refinement mappings. Theoretical Computer Science 82(2), 253–284 (1991)

4. Abadi, M., McSherry, F., Murray, D.G., Rodeheffer, T.L.: Formal analysis of a distributed algorithm for tracking progress. In: Beyer, D., Boreale, M. (eds.) FMOODS/FORTE 2013. LNCS, vol. 7892, pp. 5–19. Springer, Heidelberg (2013)

5. Blelloch, G.E.: Programming parallel algorithms. Communications of the ACM 39(3), 85–97 (1996)

6. Hildebrandt, T.T., Panangaden, P., Winskel, G.: A relational model of non-deterministic dataflow. Mathematical Structures in Computer Science 14(5), 613–649 (2004)

7. Jefferson, D.R.: Virtual time. ACM Transactions on Programming Languages and Systems 7(3), 404–425 (1985)

8. John, M., Lhoussaine, C., Niehren, J., Uhrmacher, A.M.: The attributed pi-calculus with priorities. In: Priami, C., Breitling, R., Gilbert, D., Heiner, M., Uhrmacher, A.M. (eds.) Transactions on Computational Systems Biology XII. LNCS (LNBI), vol. 5945, pp. 13–76. Springer, Heidelberg (2010)

9. Jonsson, B.: A fully abstract trace model for dataflow and asynchronous networks. Distributed Computing 7(4), 197–212 (1994)

10. Kahn, G.: The semantics of a simple language for parallel programming. In: IFIP Congress, pp. 471–475 (1974)

11. Lamport, L.: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley (2002)

12. Murray, D.G., McSherry, F., Isaacs, R., Isard, M., Barham, P., Abadi, M.: Naiad: a timely dataflow system. In: ACM SIGOPS 24th Symposium on Operating Systems Principles, pp. 439–455 (2013)

13. Pratt, V.: Modeling concurrency with partial orders. International Journal of Parallel Programming 15(1), 33–71 (1986)

14. Selinger, P.: First-order axioms for asynchrony. In: Mazurkiewicz, A., Winkowski, J. (eds.) CONCUR 1997. LNCS, vol. 1243, pp. 376–390. Springer, Heidelberg (1997)

15. Versari, C., Busi, N., Gorrieri, R.: An expressiveness study of priority in process calculi. Mathematical Structures in Computer Science 19(6), 1161–1189 (2009)


Difference Bound Constraint Abstraction

for Timed Automata Reachability Checking

Weifeng Wang1,2 and Li Jiao1(✉)

1 State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

{wangwf,ljiao}@ios.ac.cn
2 University of Chinese Academy of Sciences, Beijing 100049, China

Abstract. We consider the reachability problem for timed automata. One of the most well-known solutions for this problem is the zone-based search method. Max-bound abstraction and LU-bound abstraction on zones have been proposed to reduce the state space for zone-based search. These abstractions use bounds collected from the timed automata structure to compute an abstract state space. In this paper we propose a difference bound constraint abstraction for zones. In this abstraction, sets of difference bound constraints collected from the symbolic run are used to compute the abstract states. Based on this new abstraction scheme, we propose an algorithm for the reachability checking of timed automata. Experiment results are reported on several timed automata benchmarks.

1 Introduction

Model checking of timed automata has been studied extensively since the model was proposed [2]. One of the most interesting properties to be verified for timed automata is the reachability property. In this paper, we focus on the reachability problem of timed automata.

It is known that the reachability problem for timed automata is PSPACE-complete [3]. Initially, the region-based method [2] was used to discretize the state space and convert a timed automaton into a finite automaton. However, the resulting finite automata are so large that it is not practical to perform model checking on them. BDD-based [4,8] and SAT-based [17] symbolic model checking can be used to fight the state explosion.

The zone-based method is an important approach to the reachability problem of timed automata. In this method, a group of difference bound inequalities is used to symbolically represent a convex set of clock valuations (called a "zone"), and exhaustive search is performed on the symbolic state space [7]. Abstraction techniques for zones are used to reduce the symbolic state space and to ensure that the reduced symbolic state space is finite. In max-bound abstraction (a.k.a. k-approximation), the maximum constants appearing in the guards of the timed automaton are collected and used to compute abstractions for zones. LU-abstraction [6] improves on this by classifying the constants into two categories: those appearing in lower-bound guards and those appearing in upper-bound guards.

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 146–160, 2015.
DOI: 10.1007/978-3-319-19195-9_10


Difference Bound Constraint Abstraction 147

Behrmann et al. [5] used static analysis on the structure of timed automata to obtain smaller bounds, which lead to coarser abstractions. Herbreteau et al. [12] proposed to calculate the bounds on the fly, and used a non-convex abstraction based on LU-bounds. All the techniques mentioned above are based on bounds collected from the timed automata, which capture only a limited amount of information about the system.

In this paper, we explore the possibility of using difference bound constraints as abstractions for zones. In our abstraction scheme, a set of difference bound constraints is used as the abstraction of a zone. In fact, the conjunction of these difference bound constraints is a zone that is larger than the original zone. This abstraction is, to some extent, similar to predicate abstraction in the field of program verification.

A lazy search algorithm similar to that in [13] is used to gradually refine the abstraction. Each node is a tuple (l, Z, C), where (l, Z) is the symbolic state, and C is a set of difference bound constraints, which serves as the abstraction. Initially, the difference bound constraint set of each node is ∅, which means there is no difference bound constraint in the abstraction, i.e., the abstracted zone is the set of all clock valuations. If a transition t is disabled from a node (l, Z, C), we extract a set of difference bound constraints Ct from Z such that Postt(⟦Ct⟧) = ∅; Ct is sufficient to prove that the transition t from the configuration (l, Z) is disabled. The difference bound constraints in Ct are added to C, after which the change in C is propagated backward according to certain rules. The addition of difference bound constraints into the abstraction is in fact a refinement operation.

The key problem here is how to compute and propagate the sets of difference bound constraints. We propose a method to propagate difference bound constraints that makes use of structural information of the timed automaton to separate the "important" difference bound constraints in the zones from those that are "irrelevant".

Unfortunately, the lazy search algorithm using only difference bound constraint abstraction does not necessarily terminate. The LU-abstraction Extra+LU [6] is used in our algorithm to ensure termination. The resulting algorithm can be seen as a state space reduction using difference bound constraint abstraction on top of Extra+LU-based symbolic search.

We performed experiments to compare our method with zone-based search and the lazy abstraction method proposed in [13]. The results show that our method generally behaves similarly to that in [13], while in some cases it achieves a better reduction of the state space.

Related Work. Abstraction refinement techniques [9] have attracted much attention in recent years. These techniques check a property of the system by iteratively checking and refining abstract models, which tend to be smaller than the original model. Efforts have been devoted to adapting abstraction refinement techniques for the verification of timed automata [16,10].

Lazy abstraction [11] is an important abstraction refinement technique. In thelazy abstraction procedure, an abstract reachability tree is built on-the-fly, along


148 W. Wang and L. Jiao

with the refinement procedure, and predicate formulas are used to represent the abstract symbolic states. Difference bound constraint abstraction is similar to predicate abstraction, and the constraint propagation resembles interpolation [15]. In our method, abstractions only take the form of conjunctions of difference bound constraints, which is more efficient than general-purpose first-order formulas. Herbreteau et al. [13] proposed a lazy search scheme to dynamically compute the LU-bounds during state space exploration, which results in smaller LU-bounds and coarser abstractions. We use a similar lazy search scheme.

Organization of the Paper. In Section 2 we briefly review basic concepts related to timed automata. We present the difference bound constraint abstraction, and the model checking algorithm based on this abstraction, in Section 3. An example is given in Section 4 to illustrate how our method achieves state space reduction. Experiment results are reported in Section 5, and conclusions are given in Section 6.

2 Preliminaries

2.1 Timed Automata and the Reachability Property

A set of clock variables X is a set of non-negative real-valued variables. A clock constraint is a conjunction of constraints of the form x ∼ c or x − y ∼ c, where x, y ∈ X, c ∈ N, and ∼ ∈ {<, ≤, >, ≥}. A difference bound constraint on X is a constraint of the form x − y ≺ c, where x, y ∈ X ∪ {0}, c ∈ N, and ≺ ∈ {<, ≤}. Obviously a clock constraint can be rewritten as a conjunction of difference bound constraints. A clock valuation is a function ν : X → R≥0, which assigns to each clock variable a nonnegative real value. We denote by 0 the special clock valuation that assigns 0 to every clock variable. For a formula ϕ on X, we write ν |= ϕ if ϕ is satisfied by the valuation ν. Furthermore, we denote by ⟦ϕ⟧ the set of all clock valuations satisfying ϕ, i.e., ⟦ϕ⟧ = {ν | ν |= ϕ}.

Definition 1 (Timed Automata). A timed automaton is a tuple ⟨L, linit, X, T⟩, where L is a finite set of locations, linit is the initial location, X is a finite set of clocks, and T is a finite set of transitions of the form l −a,g,r→ l′, where a is an action label, g is a clock constraint, which we call the guard, and r ⊆ X is the set of clocks to be reset.

For a transition t = l −a,g,r→ l′ ∈ T, we use t.a, t.g, t.r to denote the corresponding action, guard, and set of clocks to be reset.

Definition 2 (Semantics of Timed Automata). A configuration of a timed automaton A = ⟨L, linit, X, T⟩ is a pair (l, ν), where l ∈ L is a location and ν is a clock valuation. The initial configuration is (linit, 0). There are two kinds of transitions:

– Action. For each pair of configurations (l, ν) and (l′, ν′), (l, ν) →t (l′, ν′) iff there is a transition t = l −a,g,r→ l′ ∈ T, and
  • ν |= g, and
  • ν′(x) = ν(x) for each x ∉ r, and
  • ν′(x) = 0 for each x ∈ r.
– Delay. For each pair of configurations (l, ν) and (l′, ν′), and an arbitrary δ ∈ R≥0, (l, ν) →δ (l′, ν′) iff l = l′ and ν′(x) = ν(x) + δ for each clock x ∈ X.

A run of a timed automaton is a (possibly infinite) sequence of configurations ρ = (l0, ν0)(l1, ν1) · · · , where (l0, ν0) = (linit, 0), and for each i ≥ 0, either (li, νi) →t (li+1, νi+1) for some t ∈ T, or (li, νi) →δ (li+1, νi+1) for some δ ∈ R≥0.
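For concreteness, the two transition kinds can be sketched in a few lines of Python (an illustrative sketch with our own representation choices, not code from the paper): a valuation is a plain mapping from clock names to reals.

```python
def delay(nu, d):
    """Delay transition: let d >= 0 time units elapse on every clock."""
    return {x: v + d for x, v in nu.items()}

def action(nu, guard, reset):
    """Action transition: if the guard holds, zero the clocks in `reset`
    and keep the others; return None when the transition is disabled."""
    if not guard(nu):
        return None
    return {x: (0.0 if x in reset else v) for x, v in nu.items()}

# From (l, {x: 0, y: 0}), delay 2.5 time units, then fire a transition
# with guard x >= 2 that resets y:
nu = delay({"x": 0.0, "y": 0.0}, 2.5)
nu = action(nu, lambda v: v["x"] >= 2, {"y"})  # {"x": 2.5, "y": 0.0}
```

A run alternates such delay and action steps, starting from the valuation that maps every clock to 0.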

The definition of timed automata is commonly extended to networks of timed automata, i.e., parallel compositions of timed automata. The parallel composition can be obtained as the product of the components. Usually this product is not computed directly, but on the fly during verification. In this paper we describe our method for a single timed automaton, but it extends naturally to networks of timed automata.

In this paper we will consider the reachability problem. Basically, a location of a timed automaton is reachable iff there is a run of the timed automaton that reaches the location.

Definition 3 (Reachability). A location lacc of a timed automaton A is reachable iff there is a finite run ρ = (l0, ν0)(l1, ν1) · · · (lk, νk), where lk = lacc.

2.2 Zone Based Symbolic Semantics

The symbolic semantics of timed automata has been proposed to fight state explosion. Basically, the idea is to represent a set of clock valuations using clock constraints. Zones are used in timed automata model checking to symbolically represent sets of clock valuations. A zone is a convex set of clock valuations that can be represented by a set of difference bound constraints.

For a zone Z and a clock constraint g, we define Z ∧ g as {ν | ν ∈ Z ∧ ν |= g}, Z[r := 0] as {ν | ∃ν′ ∈ Z · ∀x ∈ r (ν(x) = 0) ∧ ∀x ∉ r (ν(x) = ν′(x))}, and Z↑ as {ν | ∃ν′ ∈ Z, δ ∈ R≥0 · ν = ν′ + δ}. Zones are closed under these operations [7].
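On the DBM representation introduced below, the operations Z↑ and Z[r := 0] have simple implementations. The following is our own illustrative sketch (not the paper's code), under the assumptions that a DBM over n clocks is an (n+1)×(n+1) matrix of pairs (c, strict), index 0 stands for the reference clock, and the input matrix is in canonical form; intersection with a guard would additionally tighten individual entries and re-canonicalize.

```python
INF = (float("inf"), True)  # "no bound"; True encodes strict "<"

def up(D):
    """Z↑: the delay closure drops every upper bound x - 0 ≺ c."""
    Dp = [row[:] for row in D]
    for i in range(1, len(D)):
        Dp[i][0] = INF
    return Dp

def reset(D, r):
    """Z[r := 0]: each clock in r inherits the bounds of the reference
    clock 0. Assumes D is canonical; the result may need re-canonicalization."""
    Dp = [row[:] for row in D]
    for x in r:
        for j in range(len(D)):   # row x copies row 0
            Dp[x][j] = Dp[0][j]
        for j in range(len(D)):   # column x copies column 0
            Dp[j][x] = Dp[j][0]
    return Dp

# The zone x = 2 (one clock, index 1): 0 - x <= -2 and x - 0 <= 2.
D = [[(0, False), (-2, False)],
     [(2, False), (0, False)]]
assert up(D)[1][0] == INF                    # upper bound on x removed
assert reset(D, [1]) == [[(0, False), (0, False)],
                         [(0, False), (0, False)]]   # now x = 0
```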

Definition 4 (Symbolic Semantics of Timed Automata). The symbolic semantics of a timed automaton A = ⟨L, linit, X, T⟩ is a labeled transition system (S, ⇒, s0). Each state s ∈ S is a symbolic configuration (l, Z), where l is a location and Z is a zone. The initial state is s0 = (linit, ⟦0 ≤ x1 = x2 = · · · = xn⟧). For each pair of states s = (l, Z) and s′ = (l′, Z′), s ⇒t s′ iff there exists a transition t = l −a,g,r→ l′ ∈ T such that Z′ = (Z ∧ g)[r := 0]↑. A symbolic run is a sequence (l0, Z0)(l1, Z1) · · · , where (l0, Z0) = (linit, ⟦0 ≤ x1 = x2 = · · · = xn⟧), and for each i ≥ 0, (li, Zi) ⇒t (li+1, Zi+1) for some t ∈ T.

In addition, we define the Post operator Postt(Z) := (Z ∧ t.g)[t.r := 0]↑. A transition t = l −a,g,r→ l′ ∈ T is disabled at (l, Z) if Postt(Z) = ∅.


The symbolic semantics is sound and complete with respect to the reachability property [7]. A given location lacc is reachable iff there is a symbolic run ending with a configuration (lacc, Z) where Z ≠ ∅.

Zones can be represented as Difference Bound Matrices (DBMs), and efficient algorithms for manipulating DBMs have been proposed [7]. The DBM representation of a zone on the clock set X is a (|X| + 1) × (|X| + 1) matrix, each element of which is a tuple (≺, c), where ≺ ∈ {<, ≤} and c ∈ N. In a DBM D, D0x = (≺, c) means 0 − x ≺ c, i.e., x is bounded below by −c; Dx0 = (≺, c) means x − 0 ≺ c, i.e., x ≺ c; and for xi, xj ≠ 0, Dij = (≺, c) represents the constraint xi − xj ≺ c. Two different DBMs might correspond to the same zone. To tackle this problem, the canonical form of a DBM can be computed by the Floyd-Warshall algorithm [7].
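The canonicalization step can be sketched as follows (our illustration, not the paper's implementation): treat the DBM as an adjacency matrix of weights (c, ≺) and run Floyd-Warshall under the natural arithmetic on such pairs.

```python
import itertools

# A DBM entry is (c, strict): strict=True means "<", False means "<=".
def wadd(w1, w2):
    """Weight addition: constants add; strict iff either operand is strict."""
    return (w1[0] + w2[0], w1[1] or w2[1])

def wless(w1, w2):
    """(c1,≺1) < (c2,≺2) iff c1 < c2, or c1 = c2 with ≺1 = "<", ≺2 = "<="."""
    return w1[0] < w2[0] or (w1[0] == w2[0] and w1[1] and not w2[1])

def canonical(D):
    """Tighten every entry to its shortest-path weight (Floyd-Warshall)."""
    n = len(D)
    C = [row[:] for row in D]
    for k, i, j in itertools.product(range(n), repeat=3):
        via = wadd(C[i][k], C[k][j])
        if wless(via, C[i][j]):
            C[i][j] = via
    return C

INF = (float("inf"), True)
# Zone x >= 0 and y - x = 10 (clock indices: x -> 1, y -> 2).
# Row i, column j holds the bound on x_i - x_j.
D = [[(0, False), (0, False),   INF],
     [INF,        (0, False),   (-10, False)],
     [INF,        (10, False),  (0, False)]]
# canonical(D)[0][2] == (-10, False): the derived bound 0 - y <= -10, i.e. y >= 10
```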

The zone-based semantics described above is not necessarily finite. Max-bound abstraction and LU-bound abstraction have been proposed to reduce the state space to a finite one, and the former can be seen as a special case of the latter. Basically, these abstraction techniques remove from the zone those constraints that exceed certain bounds, resulting in an abstracted zone that is larger than the original one. Coarser abstractions lead to smaller symbolic state spaces. As far as we know, Extra+LU [6] is the coarsest convexity-preserving abstraction based on LU-bounds.

An LU-bound is a pair of functions (L, U), where L : X → N ∪ {−∞} is called a lower bound function and U : X → N ∪ {−∞} an upper bound function.

Definition 5 (LU-extrapolation [6]). Let Z be a zone whose canonical DBM is ⟨ci,j, ≺i,j⟩i,j=0,1,...,|X|. Given an LU-bound (L, U), the LU-extrapolation Extra+LU(Z) of Z is the zone Z′ represented by the DBM ⟨c′i,j, ≺′i,j⟩i,j=0,1,...,|X|, where

⟨c′i,j, ≺′i,j⟩ =
  ∞ if ci,j > L(xi);
  ∞ if −c0,i > L(xi);
  ∞ if −c0,j > U(xj), i ≠ 0;
  (−U(xj), <) if −c0,j > U(xj), i = 0;
  (ci,j, ≺i,j) otherwise.
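Definition 5 translates almost directly into code. The following is our illustrative sketch (representation assumptions ours: entries are pairs (c, strict), index 0 is the reference clock, the L(xi)/U(xj) cases apply only to rows/columns of actual clocks, and L, U map clock indices to bound constants):

```python
INF = (float("inf"), True)

def extra_lu_plus(D, L, U):
    """Extra+LU extrapolation of a canonical DBM D, following Definition 5.
    Diagonal entries are left unchanged."""
    n = len(D)
    Dp = [row[:] for row in D]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if i != 0 and D[i][j][0] > L[i]:
                Dp[i][j] = INF                 # c_ij above the lower bound of x_i
            elif i != 0 and -D[0][i][0] > L[i]:
                Dp[i][j] = INF                 # x_i already forced above L(x_i)
            elif j != 0 and i != 0 and -D[0][j][0] > U[j]:
                Dp[i][j] = INF
            elif j != 0 and i == 0 and -D[0][j][0] > U[j]:
                Dp[i][j] = (-U[j], True)       # weaken 0 - x_j to x_j > U(x_j)
    return Dp

# One clock x with L(x) = U(x) = 2; the zone x = 5 is abstracted to x > 2:
D = [[(0, False), (-5, False)],
     [(5, False), (0, False)]]
Z = extra_lu_plus(D, {1: 2}, {1: 2})
# Z[1][0] == INF (the bound x <= 5 exceeds L(x)), Z[0][1] == (-2, True)
```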

We denote by ⇒E the symbolic semantics of timed automata augmented with Extra+LU: for two symbolic configurations (l, Z) and (l′, Z′), (l, Z) ⇒E (l′, Z′) iff there is a zone Z′′ such that (l, Z) ⇒ (l′, Z′′) and Z′ = Extra+LU(Z′′). We can choose the LU-bounds L(x) and U(x) as follows: for each clock x, L(x) (resp. U(x)) is the largest constant c such that x > c or x ≥ c (resp. x < c or x ≤ c) appears in the guard of some transition. Intuitively, L(x) (U(x)) collects the maximum constant appearing in a lower-bound (upper-bound) guard on x. It can be proved that ⇒E preserves reachability [6].


3 Difference Bound Constraint Abstraction Based Reachability Checking

3.1 Difference Bound Constraint Abstraction and Adaptive Simulation Graph

For a zone Z and a difference bound constraint c, we write Z |= c for ∀ν ∈ Z · ν |= c. For a set C of difference bound constraints, we denote by ⟦C⟧ the set of valuations that satisfy all the constraints in C, i.e., ⟦C⟧ = {ν | ∀c ∈ C · ν |= c}. Intuitively, a set C of difference bound constraints is interpreted as the conjunction of the constraints from C. In fact, ⟦C⟧ is also a zone, although it will not be stored as a canonical DBM in our algorithm. In the sequel, we may mix set-theoretic notations (e.g., intersection) and logical notations (e.g., conjunction) on zones and constraints.

Definition 6 (Difference Bound Constraint Abstraction). A difference bound constraint abstraction C for a zone Z is a set of difference bound constraints such that Z ⊆ ⟦C⟧.

This definition resembles the concept of predicate abstraction in program verification. Each difference bound constraint in C can be seen as a predicate. In our algorithm, only those difference bound constraints that are useful for the reachability problem are kept, while the irrelevant constraints are ignored. The difference bound constraint abstraction of a zone is still a zone, but we store it as a set of constraints rather than a canonical DBM. Based on this abstraction, we define an Adaptive Simulation Graph (ASG) similar to that in [13].

Definition 7 (Adaptive Simulation Graph). Given a timed automaton A, the adaptive simulation graph ASGA of A is a graph with nodes of the form (l, Z, C), where l is a location, Z is a zone, and C is a set of difference bound constraints such that Z ⊆ ⟦C⟧. A node may be marked tentative (which roughly means it is covered by another node). Three constraints should be satisfied:

G1 For the initial state l0 and initial zone Z0, there is a node (l0, Z0, C0) in the graph for some C0.
G2 If a node (l, Z, C) is not tentative, then for every transition (l, Z) ⇒t (l′, Z′) s.t. Z′ ≠ ∅, there is a successor (l′, Z′, C′) of this node.
G3 If a node (l, Z, C) is tentative, there is a non-tentative node (l, Z′, C′) covering it, i.e., Z ⊆ ⟦C⟧ ⊆ ⟦C′⟧.

In addition, two invariants are required:

I1 If a transition t is disabled from (l, Z), and (l, Z, C) is a non-tentative node, then t should also be disabled from (l, ⟦C⟧).
I2 For every edge (l, Z, C) ⇒t (l′, Z′, C′) in the ASG: Postt(⟦C⟧) ⊆ ⟦C′⟧.

For a node v = (l, Z, C), we use v.l, v.Z, and v.C to denote its three components. We say that a node (l, Z, C) is covered by a node (l, Z′, C′) if Z ⊆ ⟦C′⟧.

The following theorem states that the adaptive simulation graph preserves reachability of the corresponding timed automaton.


Theorem 1. A location lacc in the timed automaton A is reachable iff there is a node (lacc, Z, C) with Z ≠ ∅ in ASGA.

The right-to-left direction of this theorem is obviously true: by the definition of the ASG, each path in ASGA corresponds to a symbolic run in A. In order to prove the other direction of Theorem 1, we prove a slightly stronger lemma.

Lemma 1. If there is a symbolic run (l0, Z0) · · · (l, Z) in A with Z ≠ ∅, then there must be a non-tentative node (l, Z1, C1) in ASGA such that Z ⊆ ⟦C1⟧.

Proof. We prove this lemma by induction on the length of the run. For the 0-length run and (l0, Z0), the lemma is trivially true. Assume the lemma is true for a run (l0, Z0) · · · (l, Z); we now prove that it holds for every successor (l′, Z′) of (l, Z) with (l, Z) ⇒t (l′, Z′) and Z′ ≠ ∅.

We need to show that there is a node in ASGA corresponding to (l′, Z′) as described in the lemma. By the induction hypothesis, there is a non-tentative node v such that Z ⊆ ⟦v.C⟧. We assert that Postt(v.Z) ≠ ∅: otherwise, by I1 of Definition 7 we would have Postt(⟦v.C⟧) = ∅, and consequently Postt(Z) ⊆ Postt(⟦v.C⟧) = ∅, which contradicts the assumption Z′ ≠ ∅. From G2 we know that there is a successor v′ of v such that (v.l, v.Z) ⇒t (v′.l, v′.Z). By I2 we have Postt(⟦v.C⟧) ⊆ ⟦v′.C⟧, so Z′ = Postt(Z) ⊆ Postt(⟦v.C⟧) ⊆ ⟦v′.C⟧. If v′ is non-tentative, then it is the node we want, and the lemma is proved. Otherwise, there is a non-tentative node v′′ covering v′. From G3 we know that ⟦v′.C⟧ ⊆ ⟦v′′.C⟧, so Z′ ⊆ ⟦v′′.C⟧, and v′′ is the qualifying node.

According to Theorem 1, the reachability problem can be solved by exploring the ASG. The algorithm for constructing the ASG is described in the next subsection.

3.2 The ASG-Constructing Algorithm

The algorithm for constructing the ASG is shown in Algorithm 1. The main procedure repeatedly calls EXPLORE to explore the nodes until an accepting node is found (lines 10–11) or the worklist is empty (line 8).

The function EXPLORE proceeds as follows. For a node v to be explored, it first checks whether v is an accepting node. If so, the algorithm exits with the result "reachable"; otherwise it checks whether there is a non-tentative node v′′ covering v. If so, v is marked tentative with respect to v′′ (line 13), and the set of difference bound constraints of v′′ is copied to v to maintain G3 of Definition 7 (line 14); there is no need to generate successor nodes for a tentative node. Otherwise, the difference bound constraint set of v is computed from the transitions disabled at v to maintain I1 (line 17), after which successor nodes of v are generated (maintaining G2) and put into the worklist for exploration (lines 19–21).

Whenever the difference bound constraint set of a node v′ changes, PROPAGATE is called (lines 15, 18, 38) to propagate the newly-added constraints backward to its parent v (to maintain I2) (lines 29–31), and further to the nodes


Algorithm 1. The ASG-constructing algorithm

1: function MAIN
2:   let vroot = (l0, Z0, ∅)
3:   add vroot to the worklist
4:   while worklist not empty do
5:     remove v from the worklist
6:     EXPLORE(v)
7:     RESOLVE
8:   return "not reachable"

9: function EXPLORE(v)
10:   if v.l is accepting then
11:     exit "reachable"
12:   else if ∃v′′ non-tentative s.t. v.l = v′′.l ∧ v.Z ⊆ ⟦v′′.C⟧ then
13:     mark v tentative w.r.t. v′′
14:     v.C ← v′′.C
15:     PROPAGATE(v, v.C)
16:   else
17:     v.C ← DISABLED(v.l, v.Z)
18:     PROPAGATE(v, v.C)
19:     for all (l′, Z′) s.t. (v.l, v.Z) ⇒ (l′, Z′) and Z′ ≠ ∅ do
20:       create the successor v′ = (l′, Z′, ∅) of v
21:       add v′ to the worklist

22: function RESOLVE
23:   for all v tentative w.r.t. v′ do
24:     if v.Z ⊈ ⟦v′.C⟧ then
25:       mark v non-tentative
26:       set v.C ← ∅
27:       add v to the worklist

28: function PROPAGATE(v′, C′)
29:   let v = parent(v′)
30:   C ← BACKPROP(v, v′, C′)
31:   C1 ← UPDATE(v, C)
32:   if C1 ≠ ∅ then
33:     for all vt tentative w.r.t. v do
34:       if vt.Z ⊆ ⟦v.C⟧ then
35:         Ct ← UPDATE(vt, C1)
36:         PROPAGATE(vt, Ct)
37:     if v ≠ vroot then
38:       PROPAGATE(v, C1)

tentative with respect to v (to maintain G3) (line 36). If a tentative node vt is no longer covered by v, the function RESOLVE will eventually be called (line 7) to mark it non-tentative, remove all constraints from its difference bound constraint set, and put it into the worklist for exploration (lines 25–27).

For each node v, the difference bound constraint abstraction v.C is stored as a set of difference bound constraints rather than a canonical DBM. Checking whether Z ⊆ ⟦C⟧ for a zone Z and a difference bound constraint set C can be accomplished by checking whether Z |= c for all c ∈ C. When adding constraints to a difference bound constraint abstraction, only the strongest constraints are kept; this is handled by the function UPDATE, whose code is omitted here.
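The containment test can be sketched as follows (our illustration, with representation assumptions: a constraint is a triple (i, j, w) encoding xi − xj ≺ c with weight w = (c, strict), and Z is given by its canonical DBM, so Z |= c holds exactly when the DBM entry is at least as tight as c):

```python
def wless(w1, w2):
    """(c1,≺1) < (c2,≺2) iff c1 < c2, or c1 = c2 with ≺1 = "<", ≺2 = "<="."""
    return w1[0] < w2[0] or (w1[0] == w2[0] and w1[1] and not w2[1])

def entails(D, con):
    """Z |= (x_i - x_j ≺ c): the canonical entry D[i][j] is not looser than w."""
    i, j, w = con
    return not wless(w, D[i][j])

def zone_in_abstraction(D, C):
    """Z ⊆ [[C]] iff Z |= c for every difference bound constraint c in C."""
    return all(entails(D, con) for con in C)

# Zone x <= 3 (canonical DBM): entailed by x <= 5, but not by x <= 2.
D = [[(0, False), (0, False)],
     [(3, False), (0, False)]]
assert zone_in_abstraction(D, {(1, 0, (5, False))})
assert not zone_in_abstraction(D, {(1, 0, (2, False))})
```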

3.3 Computing the Difference Bound Constraint Sets

The algorithm for building the ASG relies on two functions, DISABLED and BACKPROP, to extract the "important" difference bound constraints from zones. In this subsection we describe an implementation of the two functions. Before doing so, we introduce the arithmetic on {<, ≤} × N. For two arbitrary pairs (≺1, c1), (≺2, c2) ∈ {<, ≤} × (N ∪ {+∞}), (≺1, c1) + (≺2, c2) = (≺3, c3), where c3 = c1 + c2, and ≺3 = < iff ≺1 = < or ≺2 = <. The order "<" is defined as: (≺1, c1) < (≺2, c2) iff c1 < c2, or c1 = c2 ∧ ≺1 = < ∧ ≺2 = ≤.


DISABLED. In order to maintain the invariant I1, the result C computed by DISABLED(l, Z) should satisfy: i) Z ⊆ ⟦C⟧, and ii) Postt(⟦C⟧) = ∅ for each t disabled at (l, Z).

By Definition 4 we know that Postt(Z) = ∅ iff Z ∧ t.g = ∅. Thus the problem is reduced to: given a zone Z and a formula ϕ that is a conjunction of difference bound constraints such that Z ∧ ϕ = ∅, find a set C of difference bound constraints such that Z ⊆ ⟦C⟧ and ⟦C⟧ ∧ ϕ = ∅. Since Z ∧ ϕ = ∅, there must be a sequence of difference bound constraints x0 − x1 ≺0 c0, x1 − x2 ≺1 c1, . . . , xm−1 − xm ≺m−1 cm−1, xm − x0 ≺m cm such that

– C1 Each of them appears either in ϕ or in the canonical DBM of Z.
– C2 xi ≠ xj for i, j ∈ {0, 1, . . . , m} and i ≠ j.
– C3 (≺0, c0) + · · · + (≺m, cm) < (≤, 0), i.e., this sequence of difference bound constraints forms a contradiction.

We take C to be {c = xi − x(i+1) mod (m+1) ≺i ci | c is from Z}. Obviously, the C obtained above satisfies i) and ii).

As shown in [7], a conjunction of difference bound constraints can be seen as a directed weighted graph, where each clock variable corresponds to a node, and each constraint corresponds to a weighted edge whose weight is a pair in {<, ≤} × (N ∪ {+∞}). Figure 1 illustrates the directed weighted graph for the zone x ≥ 0 ∧ y − x = 10. There is a contradiction in the conjunction iff there is a negative cycle in the graph, i.e., a cycle whose sum of weights is less than (≤, 0).

Fig. 1. Directed weighted graph of the zone x ≥ 0 ∧ y − x = 10, with nodes 0, x, y and edge weights (≤, 0), (≤, −10), (≤, 10), and (<, +∞)

The difference bound constraints in C correspond to the edges in the graph of Z that form a negative cycle together with edges from the graph of ϕ. So the problem is reduced to finding a negative cycle in the merged graph of Z and ϕ, and picking the edges in the cycle that belong to Z. This task is accomplished by the function FINDCONTRA in Algorithm 2, where the Floyd-Warshall algorithm is used to find a negative cycle. The function DISABLED in Algorithm 2 calls FINDCONTRA for every disabled transition and collects all the constraints obtained.
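A naive version of FINDCONTRA can be sketched as follows (our illustration only: it enumerates simple cycles, which is adequate for the handful of nodes in a clock graph, whereas Algorithm 2 uses Floyd-Warshall; an edge (u, v) with weight (c, strict) encodes the constraint u − v ≺ c, tagged with whether it comes from Z or from ϕ):

```python
from itertools import combinations, permutations

ZERO = (0, False)

def wadd(w1, w2):
    """Weight addition: constants add; strict iff either operand is strict."""
    return (w1[0] + w2[0], w1[1] or w2[1])

def wless(w1, w2):
    return w1[0] < w2[0] or (w1[0] == w2[0] and w1[1] and not w2[1])

def find_contra(edges, nodes):
    """edges: {(u, v): (weight, origin)} with origin "Z" or "phi".
    Returns the Z-edges of a simple cycle whose weight sum is below (<=, 0)
    (a contradiction, conditions C1-C3), or None if none exists."""
    for r in range(2, len(nodes) + 1):          # C2: cycles visit distinct nodes
        for subset in combinations(nodes, r):
            for perm in permutations(subset):
                cycle = list(zip(perm, perm[1:] + (perm[0],)))
                if any(e not in edges for e in cycle):
                    continue                     # C1: every edge must exist
                total = ZERO
                for e in cycle:
                    total = wadd(total, edges[e][0])
                if wless(total, ZERO):           # C3: x0 - x0 ≺ c is unsatisfiable
                    return [e for e in cycle if edges[e][1] == "Z"]
    return None

# Z: x >= 0 and y - x = 10; phi (a disabled guard): y < 5. The contradiction
# is the cycle 0 - x <= 0, x - y <= -10, y - 0 < 5; its Z-edges are returned.
edges = {("0", "x"): ((0, False), "Z"),
         ("x", "y"): ((-10, False), "Z"),
         ("y", "x"): ((10, False), "Z"),
         ("y", "0"): ((5, True), "phi")}
assert find_contra(edges, ["0", "x", "y"]) == [("0", "x"), ("x", "y")]
```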

BACKPROP. In order to maintain the invariant I2, the result C computed by BACKPROP(v, v′, C′) (where (v.l, v.Z) ⇒t (v′.l, v′.Z)) should satisfy: i) Postt(⟦C⟧) ⊆ ⟦C′⟧, and ii) v.Z ⊆ ⟦C⟧.


If there is a set Cc′ of difference bound constraints for each c′ ∈ C′ such that v.Z ⊆ ⟦Cc′⟧ and Postt(⟦Cc′⟧) |= c′, then we can choose C as ∪c′∈C′ Cc′. One can easily check that such a C satisfies the above two conditions. So we only need to consider the propagation of each difference bound constraint in C′ one by one.

For simplicity, we break each transition into three steps: guard, reset, and time delay, and explain the propagation for each step. The overall description of BACKPROP is shown in Algorithm 2.

We assume two zones Z1, Z2 and a difference bound constraint c2 that corresponds to Z2 (i.e., Z2 |= c2).

Delay. Suppose Z1↑ = Z2; we want to find a Cc2 such that Z1 ⊆ ⟦Cc2⟧ and ⟦Cc2⟧↑ |= c2. Observe that c2 must be in one of three forms: x − 0 ≺ c, 0 − x ≺ c, or x − y ≺ c, where x, y ∈ X and c ∈ N. Since Z1↑ = Z2, c2 cannot be of the form x − 0 ≺ c, and for the other two forms we have Z1 |= c2 and ⟦{c2}⟧↑ |= c2. So we can just choose Cc2 to be {c2}. Intuitively, for time delay, we simply copy c2 from the successor to the predecessor.

Reset. For a set r of clocks such that Z1[r] = Z2, we want to find a Cc2 such that Z1 ⊆ ⟦Cc2⟧ and ⟦Cc2⟧[r] |= c2. Let c2 be x − y ≺ c, where x ∈ X ∪ {0}, y ∈ X, and c ∈ N. We choose Cc2 as {c1}, where:

c1 = x − y ≺ c, if x, y ∉ r
     0 − y ≺ c, if x ∈ r, y ∉ r
     x − 0 ≺ c, if x ∉ r, y ∈ r
     0 − 0 ≺ c, if x, y ∈ r        (1)
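Equation (1) is a simple case analysis; as a sketch (representation ours, not from the paper: a constraint is (x, y, w) for x − y ≺ c with weight w, and "0" names the reference clock):

```python
def backprop_reset(c2, r):
    """Propagate x - y ≺ c backward over the reset of the clocks in r,
    following equation (1): every reset clock is replaced by the reference 0."""
    x, y, w = c2
    return ("0" if x in r else x, "0" if y in r else y, w)

# Propagating x - y <= 3 backward over a reset of {x} yields 0 - y <= 3:
assert backprop_reset(("x", "y", (3, False)), {"x"}) == ("0", "y", (3, False))
```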

The first and the last cases are trivial. Let us look at the second case. Since x ∈ r, according to the definition of the reset operation, it must be the case that Z2 |= x = 0. By assumption, Z2 |= c2, and c2 is x − y ≺ c. Combining these two facts, we have Z2 |= 0 − y ≺ c. Since y ∉ r, the value of y does not change during the reset operation, and thus Z1 |= 0 − y ≺ c. Conversely, one can check that ⟦{0 − y ≺ c}⟧[r] |= x − y ≺ c. Thus {c1} is a qualified candidate for Cc2. The third case is similar.

Here we ignore the case where c2 is x − 0 ≺ c because, due to the time delay operation (which always follows the reset operation in a timed automaton run), this case is impossible.

Guard. For a guard g such that Z1 ∧ g = Z2, we want to find a Cc2 such that Z1 ⊆ ⟦Cc2⟧ and ⟦Cc2⟧ ∧ g |= c2. Notice that Z1 ∧ g |= c2 iff Z1 ∧ (g ∧ ¬c2) = ∅; similarly, ⟦Cc2⟧ ∧ g |= c2 iff ⟦Cc2⟧ ∧ (g ∧ ¬c2) = ∅. Like c2, its negation ¬c2 is also a difference bound constraint, so g ∧ ¬c2 is a conjunction of difference bound constraints, and we can use FINDCONTRA to compute the set Cc2.


156 W. Wang and L. Jiao

Algorithm 2

 1: function FINDCONTRA(Z, ϕ)
 2:   Find a negative cycle using the Floyd-Warshall algorithm on the merged graph of Z and ϕ
 3:   Take the sequence of constraints corresponding to the negative cycle in Z ∧ ϕ: c0, c1, . . . , cm−1
 4:   return {ci | ci is from Z}
 5: function DISABLED(l, Z)
 6:   C ← ∅
 7:   for all t disabled at (l, Z) do
 8:     C ← C ∪ FINDCONTRA(Z, t.g)
 9:   return C
10: function BACKPROP(v, v′, C′)
11:   Given v ⇒t v′
12:   C ← ∅
13:   for all c2 ∈ C′ do
14:     Compute c1 according to (1)
15:     C ← C ∪ FINDCONTRA(v.Z, g ∧ ¬c2)
16:   return C

3.4 Termination of the ASG-Constructing Algorithm

In order to ensure that the ASG-constructing algorithm terminates, we make slight modifications to Definition 7 and to Algorithm 1. Condition G2 of Definition 7 is modified to:

G2’ If a node (l, Z, C) is not tentative, then for every transition (l, Z) ⇒Et (l′, Z′) s.t. Z′ ≠ ∅, there is a successor (l′, Z′, C′) for this node.

The symbolic transition relation ⇒ is replaced with ⇒E, which means that the operator Extra+LU is used when computing the successor nodes. Accordingly, in Line 19 of Algorithm 1, (v.l, v.Z) ⇒ (v′.q, v′.Z) should be changed to (v.l, v.Z) ⇒E (v′.q, v′.Z).

Now our algorithm can be seen as a further reduction on top of Extra+LU-based search.

Theorem 2. For an arbitrary timed automaton A, the modified version of Algorithm 1 as described above will terminate.

Proof. Assume, to the contrary, that the algorithm does not terminate. There must be an infinite sequence of explored nodes v1, v2, . . . (listed in the order of exploration) such that v1.l = v2.l = · · ·. Since the symbolic state space with the Extra+LU abstraction is finite, there must be two nodes vi, vj (with i < j) in the sequence such that vi.Z = vj.Z. From Definition 6 we know that vj.Z = vi.Z ⊆ ⟦vi.C⟧, so vj will never be explored, which contradicts the assumption.

4 An Example

Here we illustrate how our method works on the example timed automaton A1 shown in Figure 2a, where q4 is the accepting location. Following [5], instead of considering one global LU-bound, we associate a local LU-bound with each location using static guard analysis. The LU-bound at each location is given in Figure 2c.


Difference Bound Constraint Abstraction 157

[Panels (a) and (b) are diagrams of A1 and of its ASG; panel (c) is the LU-bound table below.]

(c)
Location | Lx  | Ux  | Ly  | Uy  | Lz  | Uz
q0       | 100 | 100 | 200 | −∞  | 200 | 200
q1       | −∞  | 99  | 200 | −∞  | 200 | 200
q2       | −∞  | −∞  | −∞  | −∞  | −∞  | −∞
q3       | −∞  | −∞  | −∞  | −∞  | −∞  | −∞
q4       | −∞  | −∞  | −∞  | −∞  | −∞  | −∞

Fig. 2. (2a): a timed automaton A1; (2b): the ASG of A1. Solid arrows represent the transition relation, while dotted arrows represent the cover relation. (2c): the LU-bounds of A1

Using zone-based search, the following symbolic configurations will be generated: (q0, x = z ≥ 0 ∧ y ≤ z), (q0, z = x + 1 ∧ y ≤ z ∧ x ≥ 0), (q0, z = x + 2 ∧ y ≤ z ∧ x ≥ 0), . . . , (q0, z = x + 100 ∧ y ≤ z ∧ x ≥ 0), . . . , which is more than 100 nodes. When using our method, the resulting ASG has only 3 nodes, as shown in Figure 2b. In this example our method successfully ignores many of the constraints that are irrelevant to the reachability problem, achieving a huge reduction.

The algorithm in [13] cannot avoid generating too many nodes either. The reason is that LU-bound based abstractions cannot detect that the difference bound constraints on z − x and y − x are irrelevant. In our abstraction scheme, we consider more information than just LU-bounds, so our method can identify these constraints as irrelevant.

5 Experiments

We have implemented UPPAAL's search algorithm, the algorithm in [13], and our algorithm. Similar to [14], an improvement is made to Algorithm 1 in the implementation: for each node v, if there is a node v′ such that v.l = v′.l and v.Z ⊆ v′.Z, then v is deleted, the parents of v are inherited by v′, the difference bound constraints in v′.C are propagated to the parents of v, and any nodes covered by v are marked non-tentative.
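The subsumption improvement described above can be sketched as follows. The node representation (zones abstracted as plain sets) and the names are ours, and the constraint propagation is simplified with respect to the full BACKPROP machinery.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)              # identity-based hashing, so nodes can sit in sets
class Node:
    loc: str
    zone: frozenset               # abstract stand-in for a zone: a set of states
    parents: set = field(default_factory=set)
    constraints: set = field(default_factory=set)   # the set v.C

def subsume(nodes, v):
    """If some node v' has the same location as v and a larger zone,
    delete v, let v' inherit v's parents, and propagate v''s covering
    constraints to those parents (simplified: the paper propagates
    them through BACKPROP)."""
    for w in nodes:
        if w is not v and w.loc == v.loc and v.zone <= w.zone:
            w.parents |= v.parents
            for p in v.parents:
                p.constraints |= w.constraints
            nodes.remove(v)
            return w
    return None

# A toy use: v is covered by w, so v is merged into w.
parent = Node("l0", frozenset())
v = Node("l1", frozenset({1}), parents={parent})
w = Node("l1", frozenset({1, 2}), constraints={"c"})
nodes = [w, v]
subsume(nodes, v)
```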

We performed experiments on several benchmarks. The results are shown in Table 1 and Table 2, which give the results for breadth-first search (bfs) and



Table 1. Experiment results of the three methods using bfs search. "Search-Extra+LU" stands for Extra+LU-based search similar to UPPAAL. "a≼LU,disabled" is the algorithm in [13]. "DBCA-Extra+LU" is difference bound constraint abstraction combined with Extra+LU. "tnds" is the total number of nodes generated, "fnds" is the number of nodes finally left, and "tm(s)" is the running time (in seconds). "to" stands for time-out (200s).

Model       | Search-Extra+LU          | a≼LU,disabled            | DBCA-Extra+LU
            | tnds    fnds    tm(s)    | tnds    fnds    tm(s)    | tnds    fnds    tm(s)
A1          | 405     204     0.03     | 405     203     0.14     | 3       3       0.00
D′′ 7       | 12869   12869   6.77     | 113     113     0.00     | 113     113     0.01
D′′ 8       | 48619   48619   100.08   | 145     145     0.01     | 145     145     0.01
D′′ 70      | to                       | 9941    9941    4.35     | 9941    9941    90.19
CSMA/CD 9   | 99288   45836   3.13     | 78552   35084   7.33     | 78552   35084   4.38
CSMA/CD 10  | 258249  120845  8.44     | 200649  90125   12.15    | 200649  90125   14.91
CSMA/CD 11  | 656312  311310  25.30    | 501432  226830  65.16    | 501432  226830  38.27
CSMA/CD 12  | 1636261 786447  68.81    | 1230757 561167  118.12   | 1230757 561167  98.58
FDDI 12     | 52555   727     13.72    | 422     341     0.56     | 176     154     0.13
FDDI 30     | to                       | 2923    2227    26.84    | 464     406     2.29
Fischer 8   | 132593  25080   2.61     | 132593  25080   8.44     | 132593  25080   7.55
Fischer 9   | 487459  81035   11.24    | 487459  81035   30.82    | 487459  81035   22.35
Critical 4  | 434421  53937   6.83     | 499441  53697   31.58    | 548781  54180   23.59
Lynch 4     | 46432   12700   1.19     | 46432   12700   1.26     | 46432   12700   1.05

Table 2. Experiment results of the three methods using dfs search. All settings are the same as in Table 1, except that dfs search is performed here.

Model       | Search-Extra+LU          | a≼LU,disabled            | DBCA-Extra+LU
            | tnds    fnds    tm(s)    | tnds    fnds    tm(s)    | tnds    fnds    tm(s)
A1          | 405     204     0.02     | 405     203     0.23     | 3       3       0.00
D′′ 7       | 12869   12869   6.61     | 113     113     0.01     | 113     113     0.01
D′′ 8       | 48619   48619   110.15   | 145     145     0.01     | 145     145     0.01
D′′ 70      | to                       | 9941    9941    4.68     | 9941    9941    55.30
CSMA/CD 9   | 246072  45836   9.43     | 136813  36901   18.63    | 129718  35415   11.58
CSMA/CD 10  | 822699  120845  37.98    | 452788  98731   96.94    | 362407  90769   35.57
CSMA/CD 11  | 2758945 311310  150.23   | to                       | 997243  228054  120.85
FDDI 12     | 1016    727     0.27     | 96      96      0.02     | 96      96      0.03
FDDI 30     | 6308    4507    5.73     | 240     240     0.26     | 240     240     0.41
FDDI 50     | 17508   12507   53.21    | 400     400     0.72     | 400     400     1.35
Fischer 8   | 218017  25080   4.37     | 196738  25080   14.00    | 156634  25080   7.84
Fischer 9   | 1058685 81035   25.80    | 906766  81035   86.21    | 642739  81035   44.13
Critical 4  | 1067979 54469   15.19    | 1025269 53731   66.45    | 1009995 54488   38.25
Lynch 4     | 84421   12700   1.21     | 83171   12700   3.81     | 83967   12700   3.25



depth-first search (dfs), respectively. D′′ is from [13]; Fischer comes from the demos in the UPPAAL tool. The other models are from [1]. In our experiment settings, no location is set to be accepting, thus forcing the algorithms to perform exhaustive state-space exploration. The programs are run on a VMware virtual machine with the Ubuntu 10.04 operating system, allocated 2GB of memory. The underlying PC has an Intel Core i7 CPU at 2.93GHz and 3GB of RAM.

The results on the model A1 show the advantage of our method over LU-bound based abstractions. This is the case where LU-bound information is not sufficient to identify irrelevant constraints. On the other models, our method behaves similarly to a≼LU,disabled. This is quite reasonable, since we use the same lazy search framework.

In many cases DBCA-Extra+LU generates fewer nodes, but costs more time than Search-Extra+LU. This is due to the overhead of maintaining the invariants of ASGs. The situation is similar for a≼LU,disabled. However, for some of the models we can see that the state-space reduction of our method over Search-Extra+LU is large enough to cover the overhead.

6 Conclusion

In this paper we proposed a difference bound constraint abstraction on zones for timed automata reachability checking. Difference bound constraint sets are used as abstractions in a lazy search algorithm, and Extra+LU is used to ensure termination. Experiments show that in some cases the new abstraction scheme reduces the state space. Future work includes performing experiments on other models to further investigate the performance.

Our abstraction is not necessarily coarser than [13], because non-convex abstractions [12] are used in [13], while a difference bound constraint abstraction is in fact a conjunction of constraints, which is convex. However, our abstraction scheme makes it possible to use more information than just LU-bounds, achieving state-space reduction in some cases.

The difference bound constraint abstraction is used in a forward lazy search scheme, and Extra+LU is used to ensure termination. In fact, backward zone-based search is also possible [18], and does not need Max-bound abstraction or LU-abstraction to ensure termination. A possible direction for future work is to explore performing lazy search with difference bound constraint abstraction in a backward manner.

Acknowledgements. We thank the anonymous reviewers for their helpful comments on the earlier version of this paper.

References

1. Benchmarks of Timed Automata, http://www.comp.nus.edu.sg/~pat/bddlib/timedexp.html

2. Alur, R., Dill, D.: Automata for modeling real-time systems. In: Paterson, M. (ed.) ICALP 1990. LNCS, vol. 443, pp. 322–335. Springer, Heidelberg (1990)



3. Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994)

4. Asarin, E., Bozga, M., Kerbrat, A., Maler, O., Pnueli, A., Rasse, A.: Data-structures for the verification of timed automata. In: Maler, O. (ed.) HART 1997. LNCS, vol. 1201, pp. 346–360. Springer, Heidelberg (1997)

5. Behrmann, G., Bouyer, P., Fleury, E., Larsen, K.G.: Static guard analysis in timed automata verification. In: Garavel, H., Hatcliff, J. (eds.) TACAS 2003. LNCS, vol. 2619, pp. 254–270. Springer, Heidelberg (2003)

6. Behrmann, G., Bouyer, P., Larsen, K.G., Pelanek, R.: Lower and upper bounds in zone based abstractions of timed automata. In: Jensen, K., Podelski, A. (eds.) TACAS 2004. LNCS, vol. 2988, pp. 312–326. Springer, Heidelberg (2004)

7. Bengtsson, J., Yi, W.: Timed automata: Semantics, algorithms and tools. In: Desel, J., Reisig, W., Rozenberg, G. (eds.) ACPN 2003. LNCS, vol. 3098, pp. 87–124. Springer, Heidelberg (2004)

8. Beyer, D.: Improvements in BDD-based reachability analysis of timed automata. In: Oliveira, J.N., Zave, P. (eds.) FME 2001. LNCS, vol. 2021, pp. 318–343. Springer, Heidelberg (2001)

9. Clarke, E., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000. LNCS, vol. 1855, pp. 154–169. Springer, Heidelberg (2000)

10. Dierks, H., Kupferschmid, S., Larsen, K.G.: Automatic abstraction refinement for timed automata. In: Raskin, J.-F., Thiagarajan, P.S. (eds.) FORMATS 2007. LNCS, vol. 4763, pp. 114–129. Springer, Heidelberg (2007)

11. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: POPL, pp. 58–70. ACM (2002)

12. Herbreteau, F., Kini, D., Srivathsan, B., Walukiewicz, I.: Using non-convex approximations for efficient analysis of timed automata. In: FSTTCS 2011, pp. 78–89 (2011)

13. Herbreteau, F., Srivathsan, B., Walukiewicz, I.: Lazy abstractions for timed automata. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 990–1005. Springer, Heidelberg (2013)

14. Herbreteau, F., Srivathsan, B., Walukiewicz, I.: Lazy abstractions for timed automata. CoRR abs/1301.3127 (2013)

15. McMillan, K.L.: Lazy abstraction with interpolants. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 123–136. Springer, Heidelberg (2006)

16. Sorea, M.: Lazy approximation for dense real-time systems. In: Lakhnech, Y., Yovine, S. (eds.) FORMATS/FTRTFT 2004. LNCS, vol. 3253, pp. 363–378. Springer, Heidelberg (2004)

17. Wozna, B., Zbrzezny, A., Penczek, W.: Checking reachability properties for timed automata via SAT. Fundam. Inf. 55(2), 223–241 (2003)

18. Yovine, S.: Model checking timed automata. In: Rozenberg, G., Vaandrager, F.W. (eds.) Lectures on Embedded Systems. LNCS, vol. 1494, pp. 114–152. Springer, Heidelberg (1998)


Compliance and Subtyping in Timed Session Types

Massimo Bartoletti(✉), Tiziana Cimoli, Maurizio Murgia, Alessandro Sebastian Podda, and Livio Pompianu

Università degli Studi di Cagliari, Cagliari, [email protected]

Abstract. We propose an extension of binary session types, to formalise timed communication protocols between two participants at the endpoints of a session. We introduce a decidable compliance relation, which generalises to the timed setting the usual progress-based notion of compliance between untimed session types. We then show a sound and complete technique to decide when a timed session type admits a compliant one, and if so, to construct the least session type compliant with a given one, according to the subtyping preorder induced by compliance. Decidability of subtyping follows from these results. We exploit our theory to design and implement a message-oriented middleware, where distributed modules with compliant protocols can be dynamically composed, and their communications monitored, so as to guarantee safe interactions.

1 Introduction

Session types are formal descriptions of interaction protocols involving two or more participants over a network [18,23]. They can be used to specify the behavioural interface of a service or a component, and to statically check through a (session-)type system that this conforms to its implementation, thus enabling compositional verification of distributed applications. Session types support formal definitions of compatibility or compliance (when two or more session types, composed together, behave correctly), and of substitutability or subtyping (when a service can be safely replaced by another one, while preserving the interaction capabilities with the context). Since these notions are often decidable and computationally tractable (for synchronous session types), or safely approximable (for asynchronous ones), session typing is becoming a particularly attractive approach to the problem of correctly designing distributed applications. This is witnessed by a steady flow of foundational studies [16,10,15] and of tools [12,24] based on them in the last few years.

In the simplest setting, session types are terms of a process algebra featuring a selection construct (an internal choice among a set of branches), a branching construct (an external choice offered to the environment), and recursion. In this basic form, session types cannot faithfully capture a natural and relevant aspect of interaction protocols, i.e., the timing constraints among the communication actions. While formal methods for time have been studied for at least a couple

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 161–177, 2015.
DOI: 10.1007/978-3-319-19195-9_11


162 M. Bartoletti et al.

of decades, they have approached the realm of session types very recently [9,22]. However, these approaches introduce time into an already sophisticated framework, featuring multiparty session types with asynchronous communication (via unbounded buffers). While on the one hand this has the advantage of extending to the timed setting type techniques which enable compositional verification [19], on the other hand it seems that some of the key notions of the untimed setting (e.g., compliance, duality) have not yet been explored in the timed case.

We think that studying timed session types in a basic setting (synchronous communication between two endpoints, as in the seminal untimed version) is worthy of attention. From a theoretical point of view, the objective is to lift to the timed case some decidability results, like those of compliance and subtyping. Some intriguing problems arise: unlike in the untimed case, a timed session type does not always admit a compliant one; hence, besides deciding whether two session types are compliant, it becomes a relevant problem whether a session type has a compliant one at all. From a more practical perspective, decision procedures for timed session types, like those for compliance and for dynamic verification, enable the implementation of programming tools and infrastructures for the development of safe communication-oriented distributed applications.

Contributions. In this paper we introduce a theory of binary timed session types (TSTs), and we explore its viability as a foundation for programming tools to tame the complexity of developing distributed applications.

We start in Section 2 by giving the syntax and semantics of TSTs. E.g., we describe with the following TST the contract of a service taking as input a zip code, and then either providing as output the current weather, or aborting:

p = ?zip{x}. (!weather{5 < x < 10} ⊕ !abort{x < 1})

The prefix ?zip{x} states that the service can receive a zip code, and then reset a clock x. The continuation is an internal choice between two outputs: either the service sends weather in a time window of (5, 10) time units, or it will abort the protocol within 1 time unit.
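As a concrete reading of the syntax, the contract p can be written down as a small abstract syntax tree. This is an illustrative encoding of our own, not taken from the paper's tools; guards are kept as plain strings.

```python
from dataclasses import dataclass

@dataclass
class Success:                 # the success state 1
    pass

@dataclass
class Branch:
    action: str                # message label, e.g. "zip"
    guard: str                 # time guard, kept as plain text
    resets: frozenset          # clocks reset after the action fires
    cont: object               # continuation TST

@dataclass
class Internal:                # internal choice of outputs (⊕)
    branches: list

@dataclass
class External:                # external choice of inputs (+)
    branches: list

# p = ?zip{x}. (!weather{5 < x < 10} ⊕ !abort{x < 1})
p = External([
    Branch("zip", "true", frozenset({"x"}),
           Internal([
               Branch("weather", "5 < x < 10", frozenset(), Success()),
               Branch("abort",   "x < 1",      frozenset(), Success()),
           ]))
])
```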

The semantics of TSTs is a conservative extension of the synchronous semantics of untimed session types [4], adding clock valuations to associate each clock with a positive real. We also extend to the timed setting the standard semantic notion of compliance, which relates two session types whenever they enjoy progress until reaching success. For instance, p above is not compliant with:

q = !zip{y}. (?weather{y < 7} + ?abort{y < 5})

because q is available to receive weather only until 7 time units after it has sent the zip code, while p can choose to send weather until 10 time units (note that p and q, stripped of all time annotations, are compliant in the untimed setting).

Despite the semantics of TSTs being infinite-state (while it is finite-state in the untimed case), we develop a sound and complete decision procedure for verifying compliance (Theorem 1). To do that, we reduce this problem to that


Compliance and Subtyping in Timed Session Types 163

of model-checking deadlock freedom in timed automata [2], which is decidable, and we implement our technique using the Uppaal model checker [7].

Another difference from the untimed case is that not every TST admits a compliant one (while in the untimed case, a session type is always compliant with its syntactic dual). For instance, consider the client contract:

q′ = !zip{y < 10}. (?weather{y < 7} + ?abort{y < 5})

No service can be compliant with q′, because if q′ sends the zip code, e.g., at time 8, no one can send weather or abort within the given time constraints. We develop a procedure to detect whether a TST admits a compliant one. This takes the form of a kind system which associates with each p a set of clock valuations under which p admits a compliant one. The kind system is sound and complete (Theorems 4 and 5), and kind inference is decidable (Theorem 3); summing up, we have a (sound and complete) decision procedure for the existence of a compliant. When p admits a compliant, by exploiting the kind system we can construct the greatest TST compliant with p (Theorem 7), according to the semantic subtyping relation [4]. Decidability of subtyping follows from that of compliance and kind inference. This provides us with an effective way of checking whether a service with type p can be replaced by one with a subtype p′ of p, guaranteeing that all the services which interacted correctly with the old one will do the same with the new one.

In Section 4 we address the problem of dynamically monitoring interactions regulated by TSTs. To do so, we provide TSTs with a monitoring semantics, which detects when a participant is not respecting its TST. This semantics enjoys some desirable properties: it is deterministic, and it guarantees that in each state of an interaction, either we have reached success, or someone is in charge of a move, or someone is not respecting its TST. We then exploit all the theoretical results discussed above to discuss the design and implementation of a message-oriented middleware which uses TSTs to enable and regulate the interaction of distributed services. This infrastructure pursues the bottom-up approach to service composition: it allows services to advertise contracts (in the form of TSTs); all the advertised TSTs are collected by a broker, which finds pairs of compliant TSTs, and creates sessions between the respective services. These can then start interacting, by doing the actions prescribed by their TSTs (or even by choosing not to do so). In a system of honest services, compliance between TSTs ensures progress of the whole system; in any case, dynamic verification of all the exchanged messages guarantees safe executions.

Due to space constraints, the proofs of our statements, additional examples, as well as some tools related to the middleware, are available in [5].

2 Timed Session Types

We introduce binary timed session types (TSTs) by giving their syntax and semantics, and by defining a compliance relation between them. The main result of this section is Theorem 1, which states that compliance is decidable.



Syntax. Let A be a set of actions, ranged over by a, b, . . .. We denote with A! the set {!a | a ∈ A} of output actions, with A? the set {?a | a ∈ A} of input actions, and with L = A! ∪ A? the set of branch labels, ranged over by ℓ, ℓ′, . . ..

We use δ, δ′, . . . to range over the set R≥0 of positive real numbers including zero, and d, d′, . . . to range over N. Let C be a set of clocks, namely variables in R≥0, ranged over by t, t′, . . .. We use R, T, . . . ⊆ C to range over sets of clocks.

Definition 1 (Guards). The set GC of guards over clocks C is defined as:

g ::= true | ¬g | g ∧ g | t ◦ d | t − t′ ◦ d        (where ◦ ∈ {<, ≤, =, ≥, >})

Definition 2 below introduces the syntax of TSTs. A TST p models the behaviour of a single participant involved in an interaction. TSTs are terms of a process algebra featuring the success state 1, internal choice ⊕i∈I !ai{gi, Ri}. pi, external choice Σi∈I ?ai{gi, Ri}. pi, and recursion recX. p.

To give some intuition, we consider two participants, Alice (A) and Bob (B), which want to interact. Alice advertises an internal choice ⊕i !ai{gi, Ri}. pi when she wants to do one of the outputs !ai in a time window where gi is true; further, the clocks in Ri will be reset after the output is performed. The meaning of an external choice Σi ?ai{gi, Ri}. qi (advertised, say, by B) is somehow dual: B is saying that he is available to receive each message ai in any instant within the time window defined by gi (and the clocks in Ri will be reset after the input).

Definition 2 (Timed session types). Timed session types p, q, . . . are terms of the following grammar:

p ::= 1 | ⊕i∈I !ai{gi, Ri}. pi | Σi∈I ?ai{gi, Ri}. pi | recX. p | X

where (i) the set I is finite and non-empty, (ii) the actions in internal/external choices are pairwise distinct, and (iii) recursion is guarded. Unless stated otherwise, we consider TSTs up to unfolding of recursion. A TST is closed when it has no recursion variables. If q = ⊕i∈I !ai{gi, Ri}. pi and 0 ∉ I, we write !a0.p0 ⊕ q for ⊕i∈I∪{0} !ai{gi, Ri}. pi (and similarly for external choices). True guards, empty resets, and trailing occurrences of the success state can be omitted.

Example 1. Along the lines of the PayPal User Agreement [1], we specify the protection policy for buyers of a simple on-line payment platform, called PayNow (see [5] for the full version). PayNow helps customers with on-line purchases, providing protection against misbehaviour. In case a buyer has not received what he has paid for, he can open a dispute within 180 days from the date the buyer made the payment. After the opening of the dispute, the buyer and the seller may try to come to an agreement. If this is not the case, within 20 days, the buyer can escalate the dispute to a claim. However, the buyer must wait at least 7 days from the date of payment to escalate a dispute. Upon not reaching an agreement, if the buyer still does not escalate the dispute to a claim within 20 days, the dispute is considered aborted. During a claim procedure, PayNow will ask the buyer to provide documentation certifying the payment, within 3 days of the date the dispute was escalated to a claim. After that, the payment will be refunded within 7 days. The contract of PayNow is described by the following TST p:



p = ?pay{tpay}. (?ok + ?dispute{tpay < 180, td}. p′)   where

p′ = ?ok{td < 20}
   + ?claim{td < 20 ∧ tpay > 7, tc}. ?rcpt{tc < 3, tc}. !refund{tc < 7}
   + ?abort

Semantics. To define the behaviour of TSTs we use clock valuations, which associate each clock with its value. The state of the interaction between two TSTs is described by a configuration (p, ν) | (q, η), where the clock valuations ν and η record (keeping the same pace) the time of the clocks in p and q, respectively. The dynamics of the interaction is formalised as a transition relation between configurations (Definition 5). This relation describes all and only the correct interactions: for instance, we do not allow time passing to make all the guards in an internal choice unsatisfiable, since doing so would prevent a participant from respecting her protocol. In Section 4 we will study another semantics of TSTs, which can also describe the behaviour of dishonest participants who do not respect their protocols.

We denote with V = C → R≥0 the set of clock valuations (ranged over by ν, η, . . .), and with ν0 the valuation mapping each clock to zero. We write ν + δ for the valuation which increases ν by δ, i.e., (ν + δ)(t) = ν(t) + δ for all t ∈ C. For a set R ⊆ C, we write ν[R] for the reset of the clocks in R, i.e., ν[R](t) = 0 if t ∈ R, and ν[R](t) = ν(t) otherwise.
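The two operations on clock valuations can be sketched directly, with a valuation as a mapping from clock names to non-negative reals (an illustrative encoding of ours):

```python
def delay(nu, d):
    """nu + d: advance every clock by d time units."""
    return {t: v + d for t, v in nu.items()}

def reset(nu, r):
    """nu[R]: set the clocks in r to zero, keep the others."""
    return {t: (0.0 if t in r else v) for t, v in nu.items()}

nu0 = {"x": 0.0, "y": 0.0}    # the initial valuation ν0 on clocks x, y
nu = delay(nu0, 3.5)
print(reset(nu, {"x"}))       # {'x': 0.0, 'y': 3.5}
```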

Definition 3 (Semantics of guards). For all guards g, we define the set of clock valuations ⟦g⟧ inductively as follows, where ◦ ∈ {<, ≤, =, ≥, >}:

⟦true⟧ = V    ⟦¬g⟧ = V \ ⟦g⟧    ⟦g1 ∧ g2⟧ = ⟦g1⟧ ∩ ⟦g2⟧
⟦t ◦ d⟧ = {ν | ν(t) ◦ d}    ⟦t − t′ ◦ d⟧ = {ν | ν(t) − ν(t′) ◦ d}

Before defining the semantics of TSTs, we recall from [8] some basic operations on sets of clock valuations (ranged over by K, K′, . . . ⊆ V).
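Definition 3 can be read operationally as a membership test ν ∈ ⟦g⟧ for a single valuation. The following sketch (our own encoding of guards as nested tuples) mirrors the five clauses:

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">=": operator.ge, ">": operator.gt}

# Guards as nested tuples: ("true",), ("not", g), ("and", g1, g2),
# ("cmp", t, op, d) for t ◦ d, and ("diff", t, t2, op, d) for t - t2 ◦ d.
def sat(nu, g):
    """Decide nu ∈ ⟦g⟧, following Definition 3 clause by clause."""
    tag = g[0]
    if tag == "true":
        return True
    if tag == "not":
        return not sat(nu, g[1])
    if tag == "and":
        return sat(nu, g[1]) and sat(nu, g[2])
    if tag == "cmp":
        _, t, op, d = g
        return OPS[op](nu[t], d)
    if tag == "diff":
        _, t, t2, op, d = g
        return OPS[op](nu[t] - nu[t2], d)
    raise ValueError(tag)

print(sat({"t": 3.0}, ("cmp", "t", "<", 5)))                    # True
print(sat({"t": 3.0, "u": 1.0}, ("diff", "t", "u", ">=", 2)))   # True
```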

Definition 4 (Past and inverse reset). For all sets K of clock valuations, the set of clock valuations ↓K (the past of K) and K[T]−1 (the inverse reset of K) are defined as: ↓K = {ν | ∃δ ≥ 0 : ν + δ ∈ K} and K[T]−1 = {ν | ν[T] ∈ K}.

Definition 5 (Semantics of TSTs). A configuration is a term of the form (p, ν) | (q, η), where p, q are TSTs extended with committed choices [!a{g,R}] p. The semantics of TSTs is defined as a labelled relation −→ over configurations, whose labels are either silent actions τ, delays δ, or branch labels.

We now comment on the rules in Figure 1. The first four rules are auxiliary, as they describe the behaviour of a TST in isolation. Rule [⊕] allows a TST to commit to the branch !a of her internal choice, provided that the corresponding guard is satisfied in the clock valuation ν. This results in the term [!a{g,R}] p, which represents the fact that the endpoint has committed to branch !a in a specific time instant: indeed, it can only fire !a through rule [!] (which also



(!a{g,R}. p ⊕ p′, ν) −τ→ ([!a{g,R}] p, ν)        if ν ∈ ⟦g⟧        [⊕]

([!a{g,R}] p, ν) −!a→ (p, ν[R])                                     [!]

(?a{g,R}. p + p′, ν) −?a→ (p, ν[R])               if ν ∈ ⟦g⟧        [?]

(p, ν) −δ→ (p, ν + δ)                if δ > 0 ∧ ν + δ ∈ rdy(p)      [Del]

(p, ν) −τ→ (p′, ν′)
─────────────────────────────────────────             [S-⊕]
(p, ν) | (q, η) −τ→ (p′, ν′) | (q, η)

(p, ν) −δ→ (p, ν′)    (q, η) −δ→ (q, η′)
─────────────────────────────────────────             [S-Del]
(p, ν) | (q, η) −δ→ (p, ν′) | (q, η′)

(p, ν) −!a→ (p′, ν′)    (q, η) −?a→ (q′, η′)
─────────────────────────────────────────────         [S-τ]
(p, ν) | (q, η) −τ→ (p′, ν′) | (q′, η′)

rdy(⊕ !ai{gi,Ri}. pi) = ↓ ⋃ ⟦gi⟧      rdy(Σ · · ·) = rdy(1) = V      rdy([!a{g,R}] p) = ∅

Fig. 1. Semantics of timed session types (symmetric rules omitted)

resets the clocks in R), while time cannot pass. Rule [?] allows an external choice to fire any of its input actions whose guard is satisfied. Rule [Del] allows time to pass; this is always possible for external choices and the success term, while for an internal choice we require that at least one of the guards remains satisfiable; this is obtained through the function rdy in Figure 1. The last three rules deal with configurations of two TSTs. Rule [S-⊕] allows a TST to commit in an internal choice. Rule [S-τ] is the standard synchronisation rule à la CCS; note that B is assumed to read a message as soon as it is sent, so A never blocks on internal choices. Rule [S-Del] allows time to pass, equally for both endpoints.

Example 2. Let p = !a ⊕ !b{t > 2}, let q = ?b{t > 5}, and consider the following computations:

(p, ν0) | (q, η0) −7→ −τ→ ([!b{t > 2}], ν0 + 7) | (q, η0 + 7) −τ→ (1, ν0 + 7) | (1, η0 + 7)   (1)

(p, ν0) | (q, η0) −δ→ −τ→ ([!a], ν0 + δ) | (q, η0 + δ)   (2)

(p, ν0) | (q, η0) −3→ −τ→ ([!b{t > 2}], ν0 + 3) | (q, η0 + 3)   (3)

The computation in (1) reaches success, while the other two computations reach a deadlock state. In (2), p commits to the choice !a after some delay δ; at this point, time cannot pass (because the leftmost endpoint is a committed choice), and no synchronisation is possible (because the other endpoint is not offering ?a). In (3), p commits to !b after 3 time units; here, the rightmost endpoint would offer ?b — but not at the time chosen by the leftmost endpoint. Note that, were we to allow time to pass in committed choices, then we would have obtained



e.g. that (!b{t > 2}, ν0) | (q, η0) never reaches deadlock — contradicting our intuition that these endpoints should not be considered compliant.

Compliance. We now extend to the timed setting the standard progress-based compliance between (untimed) session types [21,11,4]. If p is compliant with q, then whenever an interaction between p and q becomes stuck, it means that both participants have reached the success state. Intuitively, when two TSTs are compliant and the participants behave honestly (according to their TSTs), then the interaction will progress until both of them reach the success state.

Definition 6 (Compliance). We say that (p, ν) | (q, η) is deadlock whenever (i) it is not the case that both p and q are 1, and (ii) there is no δ such that (p, ν + δ) | (q, η + δ) −τ→. We then write (p, ν) ⋈ (q, η) whenever:

(p, ν) | (q, η) −→* (p′, ν′) | (q′, η′)  implies  (p′, ν′) | (q′, η′) not deadlock

We say that p and q are compliant whenever (p, ν0) ⋈ (q, η0) (in short, p ⋈ q).
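To make the definition concrete, here is a toy checker for a fragment with no recursion, no resets, and one clock t with guards of the form t < c (integer c). The encoding and all names are ours; the universally quantified delay of the coinductive characterisation (Definition 7 below) is approximated by sampling at half-integers up to a horizon, which suffices for strict integer bounds (region-graph intuition). This is a sketch, not the decision procedure of Theorem 1:

```python
SUCCESS = ('end',)

def enabled(branches, t):
    """Branches (label, bound, continuation) whose guard t < bound holds."""
    return [(lbl, c, cont) for (lbl, c, cont) in branches if t < c]

def compliant(p, q, t, horizon=10, step=0.5):
    if p == SUCCESS or q == SUCCESS:
        return p == SUCCESS and q == SUCCESS        # clause 1: 1 iff 1
    if p[0] == 'out':                                # p is the internal choice
        chooser, other = p, q
    else:                                            # symmetric case
        chooser, other = q, p
    if other[0] == chooser[0]:
        return False                                 # need an output vs. input pair
    fireable = False
    d = 0.0
    while t + d <= horizon:
        for (lbl, _, cont) in enabled(chooser[1], t + d):
            fireable = True
            match = [c2 for (l2, _, c2) in enabled(other[1], t + d)
                     if l2 == lbl]
            if not match:
                return False                         # no matching input at this delay
            nxt = (cont, match[0]) if chooser is p else (match[0], cont)
            if not compliant(nxt[0], nxt[1], t + d, horizon, step):
                return False
        d += step
    return fireable              # rdy: some branch of the internal choice fires

# Example 3: p = ?a{t<5}. !b{t<3}
p  = ('in',  [('a', 5, ('out', [('b', 3, SUCCESS)]))])
q  = ('out', [('a', 2, ('in',  [('b', 3, SUCCESS)]))])   # q  = !a{t<2}. ?b{t<3}
q2 = ('out', [('a', 5, ('in',  [('b', 3, SUCCESS)]))])   # q' = !a{t<5}. ?b{t<3}

print(compliant(p, q, 0.0))    # True
print(compliant(p, q2, 0.0))   # False
```

On Example 3 this reports that p is compliant with q but not with q′: in the latter, firing !a after 3 time units strands the subsequent !b{t < 3}, whose guard can no longer be satisfied.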

Example 3. Let p = ?a{t < 5}. !b{t < 3}. We have that p is compliant with q = !a{t < 2}. ?b{t < 3}, but it is not compliant with q′ = !a{t < 5}. ?b{t < 3}.

Example 4. Consider a customer of PayNow (see Example 1) who is willing to wait 10 days to receive the item she has paid for, but after that she will open a claim. Further, she will instantly provide PayNow with any documentation required. The customer contract is described by the following TST, which is compliant with PayNow's contract p in Example 1:

!pay{tpay}. (!ok{tpay < 10} ⊕ !dispute{tpay = 10}. !claim{tpay = 10}. !rcpt{tpay = 10}. ?refund)

Compliance between TSTs is somewhat more liberal than the untimed notion, as it can relate terms which, when cleaned of all time annotations, would not be compliant in the untimed case. The following example shows, e.g., that a recursive internal choice can be compliant with a non-recursive external choice — which can never happen in untimed session types.

Example 5. Consider the TSTs p = recX. (!a ⊕ !b{x ≤ 1}. ?c. X) and q = ?a + ?b{y ≤ 1}. !c{y > 1}. ?a. We have that p ⋈ q. Indeed, if p chooses the output !a, then q has the corresponding input, and they both succeed; instead, if p chooses !b, then it will read ?c when x > 1, and so at the next loop it is forced to choose !a, since the guard of !b has become unsatisfiable.

Definition 7 and Lemma 1 below coinductively characterise compliance between TSTs, by extending to the timed setting the coinductive compliance for untimed session types in [3]. Intuitively, an internal choice p is compliant with q when (i) q is an external choice, (ii) for each output !a that p can fire after δ time units, there exists a corresponding input ?a that q can fire after δ time units, and (iii) their continuations are coinductively compliant. The case where p is an external choice is symmetric.


168 M. Bartoletti et al.

Definition 7. We say that R is a coinductive compliance iff (p, ν) R (q, η) implies:

1. p = 1 ⟺ q = 1

2. p = ⊕_{i∈I} !ai{gi, Ri}. pi ⟹ ν ∈ rdy(p) ∧ q = Σ_{j∈J} ?aj{gj, Rj}. qj ∧ ∀δ, i : ν + δ ∈ ⟦gi⟧ ⟹ ∃j : ai = aj ∧ η + δ ∈ ⟦gj⟧ ∧ (pi, (ν + δ)[Ri]) R (qj, (η + δ)[Rj])

3. p = Σ_{j∈J} ?aj{gj, Rj}. pj ⟹ η ∈ rdy(q) ∧ q = ⊕_{i∈I} !ai{gi, Ri}. qi ∧ ∀δ, i : η + δ ∈ ⟦gi⟧ ⟹ ∃j : ai = aj ∧ ν + δ ∈ ⟦gj⟧ ∧ (pj, (ν + δ)[Rj]) R (qi, (η + δ)[Ri])

Lemma 1. p ⋈ q ⟺ ∃R coinductive compliance : (p, ν0) R (q, η0)

The following theorem establishes decidability of compliance. To prove it, we reduce the problem of checking p ⋈ q to that of model-checking deadlock freedom in a network of timed automata constructed from p and q (see [5] for details).

Theorem 1. Compliance between TSTs is decidable.

3 On Duality and Subtyping

The dual of an untimed session type is computed by simply swapping internal choices with external ones (and inputs with outputs) [10]. A naïve attempt to extend this construction to TSTs is to swap internal with external choices, as in the untimed case, and leave guards and resets unchanged. This construction does not work as expected, as shown by the following example.

Example 6. Consider the following TSTs:

p1 = !a{x ≤ 2}. !b{x ≤ 1}        p2 = !a{x ≤ 2} ⊕ !b{x ≤ 1}. ?a{x ≤ 0}
p3 = recX. ?a{x ≤ 1 ∧ y ≤ 1}. !a{x ≤ 1, {x}}. X

The TST p1 is not compliant with its naïve dual q1 = ?a{x ≤ 2}. ?b{x ≤ 1}: even though q1 can do the input ?a in the required time window, p1 cannot perform !b if !a is performed after 1 time unit. For this very reason, no TST is compliant with p1. Note instead that q1 ⋈ !a{x ≤ 1}. !b{x ≤ 1}, which is not its naïve dual. In p2, a similar deadlock situation occurs if the !b branch is chosen, and so p2 also does not admit a compliant. The reason why p3 does not admit a compliant is more subtle: p3 can loop until the clock y reaches the value 1; after this point, the guard y ≤ 1 can no longer be satisfied, and then p3 reaches a deadlock.

As suggested in the above example, the dual construction makes sense only for those TSTs for which a compliant exists. To this purpose, we define a procedure (more precisely, a kind system) which computes the set K of clock valuations (called a kind) such that p admits a compliant TST in all ν ∈ K. We then provide a constructive proof of its soundness, by showing a TST q compliant with p, which we call the dual of p.

We now define our kind system for TSTs.



[T-1]    Γ ⊢ 1 : V

         Γ ⊢ pi : Ki   for i ∈ I
[T-+]    ───────────────────────────────────────────────────────────
         Γ ⊢ Σ_{i∈I} ?ai{gi, Ti}. pi : ⋃_{i∈I} ↓(⟦gi⟧ ∩ Ki[Ti]⁻¹)

         Γ ⊢ pi : Ki   for i ∈ I
[T-⊕]    ───────────────────────────────────────────────────────────────────────
         Γ ⊢ ⊕_{i∈I} !ai{gi, Ti}. pi : (⋃_{i∈I} ↓⟦gi⟧) \ (⋃_{i∈I} ↓(⟦gi⟧ \ Ki[Ti]⁻¹))

[T-Var]  Γ, X : K ⊢ X : K

         ∃K, K′ : Γ{K/X} ⊢ p : K′
[T-Rec]  ───────────────────────────────────────────────────────────
         Γ ⊢ recX. p : ⋃ {K | Γ{K/X} ⊢ p : K′ ∧ K ⊆ K′}

Fig. 2. Kind system for TSTs

Definition 8 (Kind system). Kind judgements Γ ⊢ p : K are defined in Figure 2, where Γ is a partial function which associates kinds to recursion variables.

Rule [T-1] says that the success TST 1 admits a compliant in every ν: indeed, 1 is compliant with itself. The kind of an external choice is the union of the kinds of its branches (rule [T-+]), where the kind of a branch is the past of those clock valuations which satisfy both the guard and, after the reset, the kind of their continuation. Internal choices are dealt with by rule [T-⊕], which computes the difference between the union of the past of the guards and a set of error clock valuations. The error clock valuations are those which can satisfy a guard but not the kind of its continuation. Rule [T-Var] is standard. Rule [T-Rec] looks for a kind which is preserved by unfolding of the recursion (hence a fixed point). In order to obtain completeness of the kind system we need the greatest fixed point.

Example 7. Recall p2 from Example 6. We have the following kind derivation:

  ⊢ 1 : V  [T-1]
  ⊢ ?a{x ≤ 0} : ↓(⟦x ≤ 0⟧ ∩ V) = ⟦x ≤ 0⟧  [T-+]
  ⊢ p2 : (↓⟦x ≤ 2⟧ ∪ ↓⟦x ≤ 1⟧) \ (↓(⟦x ≤ 2⟧ \ V) ∪ ↓(⟦x ≤ 1⟧ \ ⟦x ≤ 0⟧)) = K  [T-⊕]

where K = ⟦(x > 1) ∧ (x ≤ 2)⟧. As noted in Example 6, intuitively p2 has no compliant; this will be asserted by Theorem 5 below, as a consequence of the fact that ν0 ∉ K. However, since K is non-empty, Theorem 4 guarantees that there exist q and η such that (p2, ν) ⋈ (q, η), for all clock valuations ν ∈ K.
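The derivation above can be replayed mechanically. The sketch below computes kinds for a toy fragment (one clock x, guards x ≤ c with integer c, no resets, no recursion); the sampled representation of valuation sets and all names are ours. Sampling at half-integers distinguishes all the regions that integer constants can tell apart:

```python
from fractions import Fraction as F

SAMPLES = [F(i, 2) for i in range(0, 7)]          # 0, 1/2, ..., 3
V = frozenset(SAMPLES)                            # all clock valuations

def guard(c):                                     # ⟦x <= c⟧
    return frozenset(t for t in SAMPLES if t <= c)

def past(S):                                      # ↓S
    return frozenset(t for t in SAMPLES if S and t <= max(S))

def kind(p):                                      # no resets in this fragment
    if p == ('end',):
        return V                                  # [T-1]
    tag, branches = p
    ks = [(guard(c), kind(cont)) for (_, c, cont) in branches]
    if tag == 'in':                               # [T-+]
        return frozenset().union(*(past(g & k) for (g, k) in ks))
    # [T-⊕]: reachable valuations minus the "error" valuations
    ok  = frozenset().union(*(past(g) for (g, _) in ks))
    err = frozenset().union(*(past(g - k) for (g, k) in ks))
    return ok - err

# p2 = !a{x<=2}  ⊕  !b{x<=1}. ?a{x<=0}
p2 = ('out', [('a', 2, ('end',)),
              ('b', 1, ('in', [('a', 0, ('end',))]))])

print(sorted(kind(p2)))     # [Fraction(3, 2), Fraction(2, 1)]
```

This reproduces K = ⟦(x > 1) ∧ (x ≤ 2)⟧ from Example 7: exactly the samples above 1 and up to 2 survive, and in particular ν0 (i.e. x = 0) is not in the kind.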

The following theorem states that every TST is kindable. We stress the fact that being kindable does not imply admitting a compliant: this holds if and only if ν0 belongs to the kind (see Theorems 4 and 5).

Theorem 2. For all closed p, there exists some K such that ⊢ p : K.

The following theorem states that the problem of determining the kind of a TST is decidable. This might seem surprising, as the cardinality of the set of kinds is 2^(2^ℵ0). However, the kinds constructed by our inference rules can always be represented syntactically by guards (as in Definition 1) [17].

Theorem 3. Kind inference is decidable.



coΓ(1) = 1

coΓ(Σ_{i∈I} ?ai{gi, Ti}. pi) = ⊕_{i∈I} !ai{gi ∧ Ki[Ti]⁻¹, Ti}. coΓ(pi)   if Γ ⊢ pi : Ki

coΓ(⊕_{i∈I} !ai{gi, Ti}. pi) = Σ_{i∈I} ?ai{gi, Ti}. coΓ(pi)

coΓ(X) = X   if Γ(X) defined

coΓ(recX. p) = recX. coΓ{K/X}(p)   if Γ ⊢ recX. p : K

Fig. 3. Dual of a TST

We now define the canonical compliant of kindable TSTs. Roughly, we turn internal choices into external ones (without changing guards or resets), and external into internal, changing the guards so that the kind of the continuations is preserved. Decidability of this construction follows from that of kind inference.

Definition 9 (Dual). For all kindable p and kinding environments Γ, we define the TST coΓ(p) (in short, co(p) when Γ = ∅) in Figure 3.

The following theorem states the soundness of the kind system: in particular, if the clock valuation ν0 belongs to the kind of p, then p admits a compliant.

Theorem 4 (Soundness). If ⊢ p : K and ν ∈ K, then (p, ν) ⋈ (co(p), ν).

Example 8. Recall the TST q1 = ?a{x ≤ 2}. ?b{x ≤ 1} in Example 6. We have:

co(q1) = !a{x ≤ 1}. !b{x ≤ 1}

Since ⊢ q1 : K = ⟦x ≤ 1⟧ and ν0 ∈ K, by Theorem 4 we have that q1 ⋈ co(q1), as anticipated in Example 6.
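For input-only, non-recursive TSTs over a single clock with guards x ≤ c and no resets, kinds are downward-closed, so each kind can be represented just by its supremum, and the construction of Fig. 3 collapses to strengthening each guard with the continuation's kind. The encoding below is ours, and the general case needs the full kind system; still, it reproduces Example 8:

```python
import math

SUCCESS = ('end',)

def kind(p):
    """Supremum of the kind K with ⊢ p : K (toy fragment: [T-1], [T-+])."""
    if p == SUCCESS:
        return math.inf                       # [T-1]: 1 has kind V
    _, branches = p
    # [T-+]: union over branches of the past of (guard ∩ kind of continuation)
    return max(min(c, kind(cont)) for (_, c, cont) in branches)

def dual(p):
    """co(p): inputs become outputs, each guard strengthened with the
    continuation's kind (g ∧ K), as in Fig. 3."""
    if p == SUCCESS:
        return SUCCESS
    _, branches = p
    return ('out', [(lbl, min(c, kind(cont)), dual(cont))
                    for (lbl, c, cont) in branches])

# Example 8: q1 = ?a{x<=2}. ?b{x<=1}  =>  co(q1) = !a{x<=1}. !b{x<=1}
q1 = ('in', [('a', 2, ('in', [('b', 1, SUCCESS)]))])
print(dual(q1))
# ('out', [('a', 1, ('out', [('b', 1, ('end',))]))])
```

The strengthening min(c, kind(cont)) is exactly why the naïve dual of Example 6 fails: !a keeps the bound 1 imposed by the continuation ?b{x ≤ 1}, rather than the liberal bound 2 of the original guard.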

The following theorem states that the kind system is also complete: in particular, if p admits a compliant, then the clock valuation ν0 belongs to the kind of p.

Theorem 5 (Completeness). If ⊢ p : K and ∃q, η. (p, ν) ⋈ (q, η), then ν ∈ K.

Compliance is not transitive, in general (see [5]); however, the following Theorem 6 states that transitivity holds when passing through duals.

Theorem 6. If p ⋈ p′ and co(p′) ⋈ q, then p ⋈ q.

We now show that the dual is maximal w.r.t. the subtyping relation, like the dual in the untimed setting. We start by defining the semantic subtyping preorder, which is a sound and complete model of the Gay and Hole subtyping relation (in reverse order) for untimed session types [4]. Intuitively, p is a subtype of q if every q′ compliant with q is compliant with p, too.

Definition 10 (Semantic subtyping). For all TSTs p, we define the set p^⋈ as {q | p ⋈ q}. Then, we define the relation p ⊑ q whenever p^⋈ ⊇ q^⋈.

The following theorem states that co(p) is the maximum (i.e., the most "precise") in the set of the compliants of p, if not empty.



Theorem 7. q ⋈ p ⟹ q ⊑ co(p)

The following theorem reduces the problem of deciding p ⊑ q to that of checking compliance between p and co(q). Since both compliance and the dual construction are decidable, this implies decidability of subtyping.

Theorem 8. If q admits a compliant, then: p ⊑ q ⟺ p ⋈ co(q).

4 Runtime Monitoring

In this section we study runtime monitoring based on TSTs. The setting is the following: two participants A and B want to interact according to two (compliant) TSTs pA and pB, respectively. This interaction happens through a server, which monitors all the messages exchanged between A and B, while keeping track of the passing of time. If a participant (say, A) sends a message not expected by her TST, then the monitor classifies A as culpable of a violation. There are two other circumstances where A is culpable: (i) pA is an internal choice, but A loses time until all the branches become unfeasible, or (ii) pA is an external choice, but A does not readily receive an incoming message sent by B.

Note that the semantics in Figure 1 cannot be directly exploited to define such a runtime monitor, for two reasons. First, the synchronisation rule is purely symmetric, while the monitor outlined above assumes an asymmetry between internal and external choices. Second, the semantics in Figure 1 does not have transitions (either messages or delays) which are not allowed by the TSTs: for instance, (!a{t ≤ 1}, ν) cannot take any transition (neither !a nor δ) if ν(t) > 1. In a runtime monitor we want to avoid this kind of situation, where no actions are possible and time is frozen. More specifically, our desideratum is that the runtime monitor acts as a deterministic automaton, which reads a timed trace (a sequence of actions and time delays) and reaches a unique state γ, which can be inspected to find which of the two participants (if any) is culpable.

To reach this goal, we define the semantics of the runtime monitor on two levels. The first level, specified by the relation −→→, deals with the case of honest participants; however, differently from the semantics in Section 2, here we decouple the action of sending from that of receiving. More precisely, if A has an internal choice and B has an external choice, then we postulate that A must move first, by doing one of the outputs in her choice, and then B must be ready to do the corresponding input. The second level, called monitoring semantics and specified by the relation −→→M, builds upon the first one. Each move accepted by the first level is also accepted by the monitor. Additionally, the monitoring semantics defines transitions for actions not accepted by the first level, for instance unexpected input/output actions, and improper time delays. In these cases, the monitoring semantics signals which of the two participants is culpable.

Definition 11 (Monitoring semantics of TSTs). Monitoring configurations γ, γ′, . . . are terms of the form P ‖ Q, where P and Q are triples (p, c, ν), where



[M-⊕]    (!a{g,R}. p ⊕ p′, [], ν) ‖ (q, [], η)  −A:!a→→  (p, [!a], ν[R]) ‖ (q, [], η)   if ν ∈ ⟦g⟧

[M-+]    (p, [!a], ν) ‖ (?a{g,R}. q + q′, [], η)  −B:?a→→  (p, [], ν) ‖ (q, [], η[R])   if η ∈ ⟦g⟧

         ν + δ ∈ rdy(p)      η + δ ∈ rdy(q)
[M-Del]  ──────────────────────────────────────────────────────────
         (p, [], ν) ‖ (q, [], η)  −δ→→  (p, [], ν + δ) ‖ (q, [], η + δ)

         (p, c, ν) ‖ (q, d, η)  −λ→→  (p′, c′, ν′) ‖ (q′, d′, η′)
[M-Ok]   ──────────────────────────────────────────────────────────
         (p, c, ν) ‖ (q, d, η)  −λ→→M  (p′, c′, ν′) ‖ (q′, d′, η′)

         (p, c, ν) ‖ (q, d, η) has no −A:ℓ→→ transition
[M-FailA] ─────────────────────────────────────────────────────────
         (p, c, ν) ‖ (q, d, η)  −A:ℓ→→M  (0, c, ν) ‖ (q, d, η)

         (d = [] ∧ ν + δ ∉ rdy(p)) ∨ d ≠ []
[M-FailD] ─────────────────────────────────────────────────────────
         (p, c, ν) ‖ (q, d, η)  −δ→→M  (0, c, ν + δ) ‖ (q, d, η + δ)

Fig. 4. Monitoring semantics (symmetric rules omitted)

p is either a TST or 0, and c is a one-position buffer (either empty or containing an output label). The transition relations −→→ and −→→M over monitoring configurations, with labels λ, λ′, . . . ∈ ({A,B} × L) ∪ R≥0, are defined in Figure 4.

In the rules in Figure 4, we always assume that the leftmost TST is governed by A, while the rightmost one is governed by B. In rule [M-⊕], A has an internal choice, and she can fire one of her outputs !a, provided that her buffer is empty, and the guard g is satisfied. When this happens, the message !a is written to the buffer, and the clocks in R are reset. Then, B can read the buffer, by firing ?a in an external choice through rule [M-+]; this requires that the buffer of B is empty, and the guard g of the branch ?a is satisfied. Rule [M-Del] allows time to pass, provided that the delay δ is permitted for both participants, and both buffers are empty. The last three rules specify the runtime monitor. Rule [M-Ok] says that any move accepted by −→→ is also accepted by the monitor. Rule [M-FailA] is used when participant A attempts to do an action not permitted by −→→: this makes the monitor evolve to a configuration where A is culpable (denoted by the term 0). Rule [M-FailD] makes A culpable when time passes, in two cases: either A has an internal choice, but the guards are no longer satisfiable; or she has an external choice, and there is an incoming message.
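A monitor with exactly this shape can be prototyped directly. Below is a toy encoding (ours, not from the paper): one global clock, guards as open intervals, no resets; 0 marks a culpable endpoint, and each buffer holds at most one sent label. The fail cases follow [M-FailA] and [M-FailD] as described above:

```python
SUCCESS, CULPABLE = ('end',), 0

def sat(guard, t):
    lo, hi = guard
    return lo < t < hi

def rdy(p, t):
    """Time may always pass for 1 and external choices; an internal choice
    needs at least one guard still satisfiable (cf. rdy in Fig. 1)."""
    if p in (SUCCESS, CULPABLE) or p[0] == 'in':
        return True
    return any(t < hi for (_, (lo, hi), _) in p[1])

def step(cfg, action):
    """One monitored move. action: ('delay', d), ('A', '!a'), ('B', '?a'), ..."""
    (p, ba), (q, bb), t = cfg
    if action[0] == 'delay':
        d = action[1]
        # [M-FailD]: culpable if the partner left a message unread,
        # or one's own internal choice has run out of time
        pa = p if (bb == [] and rdy(p, t + d)) else CULPABLE
        qb = q if (ba == [] and rdy(q, t + d)) else CULPABLE
        return (pa, ba), (qb, bb), t + d
    who, lbl = action
    me, my_buf, partner_buf = (p, ba, bb) if who == 'A' else (q, bb, ba)
    a = lbl[1:]
    if me not in (SUCCESS, CULPABLE):
        if lbl[0] == '!' and me[0] == 'out' and my_buf == []:
            for (l, g, cont) in me[1]:
                if l == a and sat(g, t):       # [M-⊕]: write into own buffer
                    new = (cont, [lbl])
                    return ((new, (q, bb), t) if who == 'A'
                            else ((p, ba), new, t))
        if lbl[0] == '?' and me[0] == 'in' and my_buf == [] \
                and partner_buf == ['!' + a]:
            for (l, g, cont) in me[1]:
                if l == a and sat(g, t):       # [M-+]: consume partner's buffer
                    return (((cont, []), (q, []), t) if who == 'A'
                            else ((p, []), (cont, []), t))
    # [M-FailA] (or its symmetric rule): unexpected action
    return (((CULPABLE, ba), (q, bb), t) if who == 'A'
            else ((p, ba), (CULPABLE, bb), t))

p = ('out', [('a', (2, 4), SUCCESS)])
q = ('in',  [('a', (2, 5), SUCCESS), ('b', (2, 5), SUCCESS)])

cfg = (p, []), (q, []), 0.0
for act in [('delay', 3), ('A', '!a'), ('B', '?a')]:
    cfg = step(cfg, act)
print(cfg)            # both endpoints reach 1 (success) at time 3.0

late = step(((p, []), (q, []), 0.0), ('delay', 6))
print(late[0][0] == 0, late[1][0] == 0)   # True False
```

The last line shows [M-FailD] at work: after 6 time units A's only guard 2 < t < 4 is unsatisfiable, so A becomes culpable, while B's external choice may always let time pass.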

When both participants behave honestly, i.e., they never take [M-Fail*] moves, the monitoring semantics preserves compliance (Theorem 9). The monitoring compliance relation ⋈M is the straightforward adaptation of that in Definition 6, except that −→→ transitions are used instead of −→ ones (see [5]).

Theorem 9. ⋈ = ⋈M.



The following lemma establishes that the monitoring semantics is deterministic: that is, if γ −λ→→M γ′ and γ −λ→→M γ′′, then γ′ = γ′′. Determinism is a very desirable property indeed, because it ensures that the culpability of a participant at any given time is uniquely determined by the past actions. Furthermore, for all finite timed traces λ (i.e., sequences of actions A : ℓ or time delays δ), there exists some configuration γ reachable from the initial one.

Lemma 2. Let γ0 = (p, [], ν0) ‖ (q, [], η0). If p ⋈ q, then (−→→M, γ0) is deterministic, and for all finite timed traces λ there exists a (unique) γ such that γ0 −λ→→M γ.

The goal of the runtime monitor is to detect, at any state of the execution, which of the two participants is culpable (if any). Further, we want to identify who is in charge of the next move. This is formalised by the following definition.

Definition 12 (Duties & culpability). Let γ = (p, c, ν) ‖ (q, d, η). We say that A is culpable in γ iff p = 0. We say that A is on duty in γ if (i) A is not culpable in γ, and (ii) either p is an internal choice, or d is not empty.
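Definition 12 translates directly into two predicates over a monitoring configuration. The encoding — configurations as ((p, buf_A), (q, buf_B), t), with 0 for a culpable endpoint and 'out' tagging internal choices — is ours:

```python
def culpable_A(cfg):
    (p, _), _, _ = cfg
    return p == 0                           # Definition 12, clause (i)

def on_duty_A(cfg):
    """A is on duty: not culpable, and either her TST is an internal
    choice or B's buffer holds a message that A must read."""
    (p, _), (_, buf_B), _ = cfg
    if p == 0:
        return False
    return p[0] == 'out' or buf_B != []

# A holds an internal choice, B's buffer is empty: A is on duty.
g = (('out', [('a', (2, 4), ('end',))]), []), (('end',), []), 0.0
print(on_duty_A(g), culpable_A(g))   # True False
```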

Lemma 3 guarantees that, in each reachable configuration, only one of the participants can be on duty; and if no one is on duty nor culpable, then both participants have reached success.

Lemma 3. If p ⋈ q and (p, [], ν0) ‖ (q, [], η0) −→→*M γ, then:

1. there exists at most one participant on duty in γ;
2. if there exists some culpable participant in γ, then no one is on duty in γ;
3. if no one is on duty in γ, then γ is success, or someone is culpable in γ.

Note that both participants may be culpable in a configuration. E.g., let γ = (!a{true}, [], ν0) ‖ (?a{true}, [], η0). By applying [M-FailA] twice (the second time in its symmetric version), we obtain:

γ  −A:?b→→M  (0, [], ν0) ‖ (?a{true}, [], η0)  −B:?b→→M  (0, [], ν0) ‖ (0, [], η0)

and in the final configuration both participants are culpable.

Example 9. Let p = !a{2 < t < 4} be the TST of participant A, and let q = ?a{2 < t < 5} + ?b{2 < t < 5} be that of B. We have that p ⋈ q. Let γ0 = (p, [], ν0) ‖ (q, [], η0). A correct interaction is given by the timed trace η = ⟨3.2, A : !a, B : ?a⟩. Indeed, γ0 −η→→M (1, [], ν0 + 3.2) ‖ (1, [], η0 + 3.2). On the contrary, things may go awry in three cases:

(i) a participant does something not permitted. E.g., if A fires a at 1 t.u., by [M-FailA]: γ0 −1→→M −A:!a→→M (0, [], ν0 + 1) ‖ (q, [], η0 + 1), where A is culpable.

(ii) a participant avoids doing something she is supposed to do. E.g., assume that after 6 t.u., A has not yet fired a. By rule [M-FailD], we obtain γ0 −6→→M (0, [], ν0 + 6) ‖ (q, [], η0 + 6), where A is culpable.

(iii) a participant does not receive a message as soon as it is sent. For instance, after a is sent at 3.2 t.u., at 5.2 t.u. B has not yet fired ?a. By [M-FailD], γ0 −3.2→→M −A:!a→→M −2→→M (1, [!a], ν0 + 5.2) ‖ (0, [], η0 + 5.2), where B is culpable.



5 Conclusions

We have studied a theory of timed session types (TSTs), featuring timed synchronous communication between two endpoints. We have defined a decidable notion of compliance between TSTs, a decidable procedure to detect when a TST admits a compliant, a decidable subtyping relation, and a (decidable) runtime monitoring.

All these notions have been exploited in the design and development of a message-oriented middleware which uses TSTs to drive safe interactions among distributed components. The idea is a contract-oriented, bottom-up composition, where only those services with compliant contracts can interact via (binary) sessions. The middleware makes available a global store where services can advertise contracts, in the form of TSTs. Assume that A advertises a contract p to the store (this is only possible if p admits a compliant). A session between A and B can be established if (i) B advertises a contract q compliant with p, or (ii) B accepts the contract p (in this case, the contract of B is the dual of p). When the session is established, A and B can interact by sending/receiving messages through the session. During the interaction, all their actions are monitored (according to Definition 11), and possible misbehaviours are detected (according to Definition 12). The middleware is accessible through a set of public APIs; a suite of tools for developing contract-oriented applications is available at [5].

Related work. Compliance between TSTs is loosely related to the notion of compliance between untimed session types (in symbols, ⋈u). Let u(p) be the session type obtained by erasing from p all the timing annotations. It is easy to check that the semantics of (u(p), ν0) | (u(q), ν0) in Section 2 coincides with the semantics of u(p) | u(q) in [4]. Therefore, if u(p) ⋈ u(q), then u(p) ⋈u u(q). Instead, semantic conservation of compliance does not hold, i.e. it is not true in general that if p ⋈ q, then u(p) ⋈u u(q). E.g., let p = !a{t < 5} ⊕ !b{t < 0}, and let q = ?a{t < 7}. We have that p ⋈ q (because the branch !b can never be chosen), whereas u(p) = !a ⊕ !b is not compliant with ?a = u(q). Note that, for every p, u(co(p)) = co(u(p)).

In the context of session types, time was originally introduced in [9]. However, that setting is different from ours (multiparty and asynchronous, while ours is bi-party and synchronous), as are its objectives: while we have focussed on primitives for the bottom-up approach to service composition [6], [9] extends the top-down approach to the timed case. There, a choreography (expressing the overall communication behaviour of a set of participants) is projected into a set of session types, which in turn are refined as processes, to be type-checked against their session types in order to make service composition preserve the properties enjoyed by the choreography.

Our approach is a conservative extension of untimed session types, in the sense that a participant which performs an output action chooses not only the branch, but the time of writing too; dually, when performing an input, one has to passively follow the choice of the other participant. Instead, in [9] external choices can also delay the reading time. The notion of correct interaction studied in [9] is called feasibility: a choreography is feasible iff all its reducts can reach the



success state. This property implies progress, but it is undecidable in general, as shown by [20] in the context of communicating timed automata (however, feasibility is decidable for the subclass of infinitely satisfiable choreographies). The problem of deciding if, given a local type T, there exists a choreography G such that T is in the projection of G and G enjoys (global) progress is not addressed in [9]. We think that it can be solved by adapting our kind system (in particular, rule [T-+] must be adjusted).

Another problem not addressed by [9] is that of determining whether a set of session types enjoys progress (which, like feasibility of choreographies, would be undecidable). In our work we have considered this problem, under a synchronous semantics, and with the restriction to two participants. Extending our semantics to an asynchronous one would make compliance undecidable (as it is for untimed asynchronous session types [15]). Note that our progress-based notion of compliance does not imply progress with the semantics of [9] (adapted to the binary case). For instance, let p = ?a{x ≤ 2}. !a{x ≤ 1} and q = !a{y ≤ 1}. ?a{y ≤ 1}. We have that p ⋈ q, while in the semantics of [9], (ν0, (p, q, w0)) −→* (ν, (!a{x ≤ 1}, ?a{y ≤ 1}, w0)) with ν(x) = ν(y) > 1, which is a deadlock state.

Dynamic verification of timed multiparty session types is addressed by [22], where the top-down approach to service composition is pursued [19]. Our middleware instead composes and monitors services in a bottom-up fashion [6].

In [13] timed specifications are studied in the setting of timed I/O transition systems (TIOTS). They feature a notion of correct composition, called compatibility, following the optimistic approach pursued in [14]: roughly, two systems are compatible whenever there exists an environment which, composed with them, makes "undesirable" states unreachable. A notion of refinement is coinductively formalised as an alternating timed simulation. Refinement is a preorder, and it is included in the semantic subtyping relation (using compatibility instead of ⋈). Because of the different assumptions (open systems and broadcast communications in [13], closed binary systems in TSTs), compatibility/refinement seem unrelated to our notions of compliance/subtyping. Although the main notions in [13] are defined on semantic objects (TIOTS), they can be decided on timed I/O automata, which are finite representations of TIOTS. With respect to TSTs, timed I/O automata are more liberal: e.g., they allow for mixed choices, while in TSTs each state is either an input or an output. However, this increased expressiveness does not seem appropriate for our purposes: first, it makes the concept of culpability unclear (and it breaks one of the main properties of ours, i.e. that at most one participant is on duty at each execution step); second, it seems to invalidate any dual construction. This is particularly unwelcome, since this construction is one of the crucial primitives of contract-oriented interactions.

Acknowledgments. Work partially supported by Aut. Reg. of Sardinia L.R. 7/2007 CRP-17285 (TRICS), P.I.A. 2010 ("Social Glue"), P.O.R. Sardegna F.S.E. Operational Programme of the Aut. Reg. of Sardinia, EU Social Fund 2007-13 (Axis IV Human Resources, Objective l.3, Line of Activity l.3.1), by MIUR PRIN 2010-11 project "Security Horizons", and by EU COST Action IC1201 "Behavioural Types for Reliable Large-Scale Software Systems" (BETTY).



References

1. PayPal buyer protection, https://www.paypal.com/us/webapps/mpp/ua/useragreement-full#13 (accessed: January 20, 2015)

2. Alur, R., Dill, D.L.: A theory of timed automata. Theor. Comput. Sci. 126(2), 183–235 (1994)

3. Barbanera, F., de'Liguoro, U.: Two notions of sub-behaviour for session-based client/server systems. In: PPDP, pp. 155–164 (2010)

4. Barbanera, F., de'Liguoro, U.: Sub-behaviour relations for session-based client/server systems. Math. Struct. in Comp. Science, 1–43 (January 2015)

5. Bartoletti, M., Cimoli, T., Murgia, M., Podda, A.S., Pompianu, L.: Compliance and subtyping in timed session types. Technical report (2015), http://co2.unica.it

6. Bartoletti, M., Tuosto, E., Zunino, R.: Contract-oriented computing in CO2. Sci. Ann. Comp. Sci. 22(1) (2012)

7. Behrmann, G., David, A., Larsen, K.G.: A tutorial on Uppaal. In: Bernardo, M., Corradini, F. (eds.) SFM-RT 2004. LNCS, vol. 3185, pp. 200–236. Springer, Heidelberg (2004)

8. Bengtsson, J., Yi, W.: Timed automata: Semantics, algorithms and tools. In: Desel, J., Reisig, W., Rozenberg, G. (eds.) ACPN 2003. LNCS, vol. 3098, pp. 87–124. Springer, Heidelberg (2004)

9. Bocchi, L., Yang, W., Yoshida, N.: Timed multiparty session types. In: Baldan, P., Gorla, D. (eds.) CONCUR 2014. LNCS, vol. 8704, pp. 419–434. Springer, Heidelberg (2014)

10. Castagna, G., Dezani-Ciancaglini, M., Giachino, E., Padovani, L.: Foundations of session types. In: PPDP, pp. 219–230 (2009)

11. Castagna, G., Gesbert, N., Padovani, L.: A theory of contracts for web services. ACM Transactions on Programming Languages and Systems 31(5) (2009)

12. Corin, R., Denielou, P.-M., Fournet, C., Bhargavan, K., Leifer, J.J.: A secure compiler for session abstractions. Journal of Computer Security 16(5) (2008)

13. David, A., Larsen, K.G., Legay, A., Nyman, U., Traonouez, L., Wasowski, A.: Real-time specifications. STTT 17(1), 17–45 (2015)

14. de Alfaro, L., Henzinger, T.A.: Interface automata. In: ACM SIGSOFT, pp. 109–120 (2001)

15. Denielou, P.-M., Yoshida, N.: Multiparty compatibility in communicating automata: Characterisation and synthesis of global session types. In: Fomin, F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds.) ICALP 2013, Part II. LNCS, vol. 7966, pp. 174–186. Springer, Heidelberg (2013)

16. Dezani-Ciancaglini, M., de'Liguoro, U.: Sessions and session types: An overview. In: Laneve, C., Su, J. (eds.) WS-FM 2009. LNCS, vol. 6194, pp. 1–28. Springer, Heidelberg (2010)

17. Henzinger, T.A., Nicollin, X., Sifakis, J., Yovine, S.: Symbolic model checking for real-time systems. Inf. Comput. 111(2), 193–244 (1994)

18. Honda, K., Vasconcelos, V.T., Kubo, M.: Language primitives and type discipline for structured communication-based programming. In: Hankin, C. (ed.) ESOP 1998. LNCS, vol. 1381, pp. 122–138. Springer, Heidelberg (1998)

19. Honda, K., Yoshida, N., Carbone, M.: Multiparty asynchronous session types. In: POPL (2008)

20. Krcal, P., Yi, W.: Communicating timed automata: The more synchronous, the more difficult to verify. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 249–262. Springer, Heidelberg (2006)


21. Laneve, C., Padovani, L.: The must preorder revisited. In: Caires, L., Vasconcelos, V.T. (eds.) CONCUR 2007. LNCS, vol. 4703, pp. 212–225. Springer, Heidelberg (2007)

22. Neykova, R., Bocchi, L., Yoshida, N.: Timed runtime monitoring for multiparty conversations. In: BEAT, pp. 19–26 (2014)

23. Takeuchi, K., Honda, K., Kubo, M.: An interaction-based language and its typing system. In: Halatsis, C., Maritsas, D., Philokyprou, G., Theodoridis, S. (eds.) PARLE 1994. LNCS, vol. 817, pp. 398–413. Springer, Heidelberg (1994)

24. Yoshida, N., Hu, R., Neykova, R., Ng, N.: The Scribble protocol language. In: Abadi, M., Lluch Lafuente, A. (eds.) TGC 2013. LNCS, vol. 8358, pp. 22–41. Springer, Heidelberg (2014)


Security


Type Checking Privacy Policies in the π-calculus

Dimitrios Kouzapas1(✉) and Anna Philippou2

1 Department of Computing, Imperial College London and Department of Computing Science, University of Glasgow, Glasgow, UK

[email protected]
2 Department of Computer Science, University of Cyprus, Nicosia, Cyprus

[email protected]

Abstract. In this paper we propose a formal framework for studying privacy. Our framework is based on the π-calculus with groups accompanied by a type system for capturing privacy requirements relating to information collection, information processing and information dissemination. The framework incorporates a privacy policy language. We show that a system respects a privacy policy if the typing of the system is compatible with the policy. We illustrate our methodology via analysis of privacy-aware schemes proposed for electronic traffic pricing.

1 Introduction

The notion of privacy is a fundamental notion for society and, as such, it has been an object of study within various scientific disciplines. Recently, its importance has become increasingly pronounced, as technological advances and the associated widespread accessibility of personal information are redefining the very essence of the term privacy.

A study of the diverse types of privacy, their interplay with technology, and the need for formal methodologies for understanding and protecting privacy is presented in [19], where the authors follow the analysis of David Solove, a legal scholar who has provided a discussion of privacy as a taxonomy of possible privacy violations [18]. According to Solove, privacy violations can be distinguished into four categories: invasions, information collection, information processing, and information dissemination. Invasion-related privacy violations are violations that occur on the physical sphere of an individual. The authors of [19] concentrate on the latter three categories and identify a model for studying them, consisting of a data holder possessing information about the data subject and being responsible for protecting this information against unauthorized adversaries within the environment.

The motivation of this paper stems from the need for formal frameworks for reasoning about privacy-related concepts. Such frameworks may provide solid foundations for understanding the notion of privacy and allow us to rigorously model and study privacy-related situations. More specifically, our objective is to develop a static method for ensuring that a privacy policy is satisfied by an information system, using the π-calculus as the underlying theory.

To achieve this objective, we develop a meta-theory for the π-calculus that captures privacy as policy. Following the model of [19], we create a policy language that enables us to describe privacy requirements for private data over data entities. For each type of

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 181–195, 2015.
DOI: 10.1007/978-3-319-19195-9_12


private data we expect entities to follow different policy requirements. Thus, we define policies as objects that describe a hierarchical nesting of entities, where each node/entity of the hierarchy is associated with a set of privacy permissions. The choice of permissions encapsulated within a policy language is an important issue, because the identification of these permissions constitutes, in a sense, a characterization of the notion of privacy. In this work, we make a first attempt at identifying some such permissions, our choice emanating from the more obvious privacy violations of Solove's taxonomy, which we refine by considering some common applications where privacy plays a central role.

As an example, consider a medical system obligated to protect patient data. Inside the system, a nurse may access patient files in order to disseminate them to doctors. Doctors are able to process the data without any right to disseminate them. Overall, the data cannot be disclosed outside the hospital. We formalize this policy as follows:

t ≫ Hospital : {nondisclose} [ Nurse : {access, disclose Hospital 1}, Doctor : {access, read, write} ]

where t is the type of the patient's data. The policy describes the existence of the Hospital entity at the top level of the hierarchy, associated with the nondisclose permission, signifying that patient data should not be disclosed outside the system. Within this structure, a nurse may access (but not read) a patient file and disseminate the file once (disclose Hospital 1). Similarly, a doctor may be given access to a patient file but is also allowed to read and write data within the files (permissions access, read and write).
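As a concrete illustration, the nested policy above can be represented as a simple tree of groups with attached permission sets. This is a sketch only: the `Hierarchy` class, the tuple encoding of disclose permissions, and all names are our own assumptions, not notation from the paper.

```python
# Illustrative encoding (assumed, not from the paper) of
# t >> Hospital : {nondisclose}[ Nurse : {access, disclose Hospital 1},
#                                Doctor : {access, read, write} ]
from dataclasses import dataclass, field

# A permission is a plain string ("read", "access", ...) or a tuple
# ("disclose", group, linearity) where linearity is an int or "*".
@dataclass
class Hierarchy:
    group: str
    perms: set
    children: list = field(default_factory=list)

patient_policy = {
    "t": Hierarchy("Hospital", {"nondisclose"}, [
        Hierarchy("Nurse", {"access", ("disclose", "Hospital", 1)}),
        Hierarchy("Doctor", {"access", "read", "write"}),
    ])
}
```

Each node of the tree corresponds to one `G : p[...]` entry of the policy, so the nesting of entities mirrors the nesting of groups in the formal hierarchy.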

Moving on to the framework underlying our study, we employ the π-calculus with groups [5]. This calculus extends the π-calculus with the notion of groups and an associated type system in a way that controls how data is disseminated inside a system. It turns out that groups give a natural abstraction for the representation of entities in a system. Thus, we build on the notion of a group from the calculus of [5], and we use the group memberships of processes to distinguish their roles within systems. Information processing issues can be analysed through the use of names of the calculus in input, output and object position, to identify when a channel is reading or writing private data or when links to private data are being communicated between groups.

An implementation of the hospital scenario in the π-calculus with groups would be

(ν Hospital)((ν Nurse)(a⟨l⟩.0) | (ν Doctor)(a(x).x(y).x⟨d⟩.0))

In this system, (ν Hospital) creates a new group that is known to the two processes of the subsequent parallel composition, while (ν Nurse) and (ν Doctor) are groups nested within the Hospital group and available to processes a⟨l⟩.0 and a(x).x(y).x⟨d⟩.0, respectively. The group memberships of the two processes characterize their nature while reflecting the entity hierarchy expressed in the privacy policy defined above.

The types of the names in the above process are defined as y : t and d : t, that is, y and d are values of sensitive data, while l : Hospital[t] signifies that l is a channel that can be used only by processes which belong to group Hospital, to carry data of type t. Similarly, a : Hospital[Hospital[t]] states that a is a channel that can be used by members of group Hospital, to carry objects of type Hospital[t]. Intuitively, we may see that this system conforms to the defined policy, both in terms of the group structure as well as the permissions exercised by the processes. Instead, if the nurse were able to engage in


an l⟨d⟩ action, then the defined policy would be violated, as it would also be if the type of a were defined as a : Other[Hospital[t]] for some distinct group Other. Thus, the encompassing group is essential for capturing requirements of non-disclosure.

Using these building blocks, our methodology is applied as follows: given a system and a typing, we perform type checking to confirm that the system is well typed, while inferring a permission interface. This interface captures the permissions exercised by the system. To check that the system complies with a privacy policy, we provide a correspondence between policies and permission interfaces, the intention being that a permission interface satisfies a policy if and only if the system exercises a subset of the permissions allowed by the policy. With this machinery at hand, we state and prove a safety theorem according to which, if a system Sys type-checks against a typing Γ and produces an interface Θ, and Θ satisfies a privacy policy P, then Sys respects P.

2 The Calculus

Our study of privacy is based on the π-calculus with groups proposed by Cardelli et al. [5]. This calculus is an extension of the π-calculus with the notion of a group and an operation of group creation, where a group is a type for channels. In [5] the authors establish a close connection between group creation and secrecy, as they show that a secret belonging to a certain group cannot be communicated outside the initial scope of the group. This is related to the fact that groups can never be communicated between processes. Our calculus is based on the π-calculus with groups, with some modifications.

We assume the existence of two basic entities: G, ranged over by G, G1, . . ., is the set of groups, and N, ranged over by a, b, x, y, . . ., is the set of names. Furthermore, we assume a set of basic types D, ranged over by ti, which refer to the basic data of our calculus on which privacy requirements should be enforced. Specifically, we assign each name in N a type, such that a name may either be of some base type t or of type G[T], where G is the group of the name and T the type of value that can be carried on the name. Given the above, a type is constructed via the following BNF.

T ::= t | G[T ]

Then the syntax of the calculus is defined at two levels. At the process level, P, we have the standard π-calculus syntax. At the system level, S, we include the group construct, applied both at the level of processes, (ν G)P, and at the level of systems, (ν G)S, as well as the name restriction construct and parallel composition for systems.

P ::= x(y:T ).P | x〈z〉.P | (ν a:T )P | P1 | P2 | !P | 0

S ::= (ν G)P | (ν G)S | (ν a:T )S | S1 | S2 | 0

In (ν a:T)P and (ν a:T)S, name a is bound in P and S, respectively, and in process x(y:T).P, name y is bound in P. In (ν G)P and (ν G)S, the group G is bound in P and S. We write fn(P) and fn(S) for the sets of names free in a process P and a system S, and fg(S) and fg(T) for the free groups in a system S and a type T, respectively. Note that free occurrences of groups occur within the types T of a process/system.


We now turn to defining a labelled transition semantics for the calculus. We define a labelled transition semantics instead of a reduction semantics due to a characteristic of the intended structural congruence in our calculus. In particular, the definition of such a congruence would omit the axiom (ν G)(S1 | S2) ≡ (ν G)S1 | S2 if G ∉ fg(S2), as it was used in [5]. This is due to our intended semantics of the group concept, which is considered to assign capabilities to processes. Thus, the nesting of a process P within some group G, as in (ν G)P, cannot be lost even if G ∉ fg(P), since the (ν G) construct has the additional meaning of group membership in our calculus and it instills P with privacy-related permissions, as we will discuss in the sequel. The absence of this law renders a reduction-semantics rule for parallel composition rather complex.

To define a labelled transition semantics we first define a set of labels:

ℓ ::= τ | x(y) | x⟨y⟩ | (ν y)x⟨y⟩

Label τ is the internal action, whereas labels x(y) and x⟨y⟩ are the input and output actions, respectively. Label (ν y)x⟨y⟩ is the restricted output, where the object y of the action is restricted. Functions fn(ℓ) and bn(ℓ) return the sets of free and bound names of ℓ, respectively. We also define the relation dual(ℓ, ℓ′), which relates dual actions:

dual(ℓ, ℓ′) if and only if {ℓ, ℓ′} = {x(y), x⟨y⟩} or {ℓ, ℓ′} = {x(y), (ν y)x⟨y⟩}.

We use the meta-notation F ::= P | S to define the labelled transition semantics uniformly over processes and systems.

(In)    x(y:T).P --x(z)--> P{z/y}

(Out)   x⟨z⟩.P --x⟨z⟩--> P

(ParL)  if F1 --ℓ--> F1′ and bn(ℓ) ∩ fn(F2) = ∅, then F1 | F2 --ℓ--> F1′ | F2

(ParR)  if F2 --ℓ--> F2′ and bn(ℓ) ∩ fn(F1) = ∅, then F1 | F2 --ℓ--> F1 | F2′

(ResN)  if F --ℓ--> F′ and x ∉ fn(ℓ), then (ν x:T)F --ℓ--> (ν x:T)F′

(Scope) if F --x⟨y⟩--> F′, then (ν y:T)F --(ν y)x⟨y⟩--> F′

(ResG)  if F --ℓ--> F′, then (ν G)F --ℓ--> (ν G)F′

(Alpha) if F ≡α F′′ and F′′ --ℓ--> F′, then F --ℓ--> F′

(Repl)  if P --ℓ--> P′, then !P --ℓ--> P′ | !P

(Com)   if F1 --ℓ1--> F1′, F2 --ℓ2--> F2′ and dual(ℓ1, ℓ2),
        then F1 | F2 --τ--> (ν bn(ℓ1) ∪ bn(ℓ2))(F1′ | F2′)

Fig. 1. The labelled transition system

The labelled transition semantics follows along the lines of standard π-calculus semantics, where ≡α denotes α-equivalence and the rule for the group-creation construct, (ResG), captures that transitions are closed under group restriction.
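The dual(ℓ, ℓ′) relation used by rule (Com) can be sketched as a small predicate over labels. The label encoding below (tags "in", "out", "bout" for input, output and bound output, with subject and object names) is an assumption of this sketch, not notation from the paper.

```python
# Sketch of dual(l, l'): an input x(y) is dual to an output x<y> or to a
# bound output (vy)x<y> with the same subject x and object y; tau has no dual.
def dual(l1, l2):
    if l1 == "tau" or l2 == "tau":
        return False
    kinds = {l1[0], l2[0]}
    same_subject_object = l1[1:] == l2[1:]
    return same_subject_object and kinds in ({"in", "out"}, {"in", "bout"})
```

Rule (Com) fires exactly when the two parallel components can perform labels related by this predicate, producing a τ step with the bound names re-restricted.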

3 Policies and Types

In this section we define a policy language and the appropriate type machinery to enforce policies over processes.


3.1 Policies

Typically, privacy policy languages express positive and negative norms that are expected to hold in a system. These norms distinguish what may happen, in the case of a positive norm, and what may not happen, in the case of a negative norm, on data attributes, which are the types of sensitive data within a system, and, in particular, how the various agents, referred to by their roles, may or may not handle this data.

The notions of an attribute and a role are reflected in our framework via the notions of base types and groups, respectively. Thus, our policy language is defined in such a way as to specify the allowed and disallowed permissions associated with the various groups for each base type (sensitive data). This is achieved via the following entities:

(Linearities)   λ ::= 1 | 2 | . . . | ∗
(Permissions)   p ::= read | write | access | disclose G λ | nondisclose
(Hierarchies)   H ::= ε | G : p [Hj] j∈J     (p a set of permissions)
(Policies)      P ::= t ≫ H | P; P

Specifically, we define the set of policy permissions p: they express that data may be read (read) and written (write), and that links to data may be accessed (access) or disclosed within some group G up to λ times (disclose G λ). The linearity λ is either a natural number or ∗, which denotes infinity. While the above are positive permissions, permission nondisclose is a negative permission and, when associated with a group and a base type, expresses that the base type cannot be disclosed to any participant who is not a member of the group.

In turn, a policy has the form t1 ≫ H1; . . . ; tn ≫ Hn, assigning a structure Hi to each type of sensitive data ti. The components Hi, which we refer to as permission hierarchies, specify the group–permission associations for each base type. A permission hierarchy H has the form G : p [H1, . . . , Hm] and expresses that an entity belonging to group G has rights p to the data in question; if additionally it is a member of some group Gi where Hi = Gi : pi [. . .], then it also has the rights pi, and so on.

We define the auxiliary functions groups(H) and perms(H), which gather the set of groups and the set of permissions, respectively, inside a hierarchy structure:

groups(H) = {G} ∪ (⋃ j∈J groups(Hj))   if H = G : p[Hj] j∈J
groups(H) = ∅                          if H = ε

perms(H) = p ∪ (⋃ j∈J perms(Hj))       if H = G : p[Hj] j∈J
perms(H) = ∅                           if H = ε

We say that a policy P = t1 ≫ H1; . . . ; tn ≫ Hn is well formed if it satisfies the following:

1. The ti are distinct.
2. If H = G : p[Hj] j∈J occurs within some Hi, then G ∉ groups(Hj) for all j ∈ J; that is, the group hierarchy is acyclic.
3. If H = G : p[Hj] j∈J occurs within some Hi, nondisclose ∈ p and disclose G′ λ ∈ perms(Hj) for some j ∈ J, then G′ ∈ groups(H). In words, no non-disclosure requirement imposed at some level of a hierarchy is in conflict with a disclosure requirement granted in its sub-hierarchy.


Hereafter, we assume that policies are well formed. As a shorthand, we write G : p for G : p[ε], and we abbreviate G : ∅ to G.

As an example, consider a hospital containing the departments of surgery (Surgery), cardiology (Cardiology), and psychotherapy (Psychotherapy), where cardiologists of group Cardio belong to the cardiology department, surgeons of group Surgeon belong to the surgery department, psychiatrists of group Psy belong to the psychotherapy department, and CarSurgeon refers to doctors who may have a joint appointment with the surgery and cardiology departments. Further, let us assume the existence of data of type MedFile which (1) should not be disclosed to any participant outside the Hospital group, (2) may be read, written, accessed and disclosed freely within both the surgery and the cardiology departments, and (3) may be read, written and accessed but not disclosed outside the psychotherapy department. We capture these requirements via policy P = MedFile ≫ H, where pcd = {read, access, write, disclose Cardiology ∗}, psd = {read, access, write, disclose Surgery ∗}, ppd = {nondisclose, read, access, write} and

H = Hospital : {nondisclose} [ Cardiology : pcd [Cardio, CarSurgeon],
                               Surgery : psd [Surgeon, CarSurgeon],
                               Psychotherapy : ppd [Psy] ]

At this point we note that, as illustrated in the above policy, hierarchies need not be tree-structured: group CarSurgeon may be reached both via the Hospital, Cardiology path and via the Hospital, Surgery path. In effect, this allows us to define a process (ν Hospital)(ν Cardiology)(ν Surgery)(ν CarSurgeon)P belonging to four groups and inheriting the permissions of each one of them.

3.2 The Type System

We proceed to define a typing system for the calculus.

Typing Judgements. The environment on which type checking is carried out consists of the component Γ. During type checking we infer the two additional structures of Δ-environments and Θ-interfaces, as follows:

Γ ::= ∅ | Γ · x : T | Γ · G
Δ ::= ∅ | t : p · Δ
Θ ::= t ≫ Hθ ; Θ | t ≫ Hθ

with Hθ ::= G[Hθ] | G[p]. Note that Hθ captures a special type of hierarchy where the nesting of groups is linear. We refer to Hθ as interface hierarchies. The domain of environment Γ, dom(Γ), contains all groups and names recorded in Γ. Environment Δ assigns permissions to sensitive data types t. When associated with a base type t, permissions read and write express that it is possible to read/write data of type t along channels of type G[t] for any group G. Permission access, when associated with a type t, expresses that it is possible to receive a channel of type G[t] for any G and, finally, if permission disclose G λ is associated with t, then it is possible to send channels of type G[t] up to λ times. Thus, while permissions read and write are related to manipulating sensitive data, permissions access and disclose are related to manipulating links to sensitive data. Finally, interface Θ associates a sensitive type with a linear hierarchy of groups and a set of permissions, namely, an entity of the form G1[G2[. . . Gn[p] . . .]].

We define three typing judgements: Γ ⊢ x ▷ T, Γ ⊢ P ▷ Δ and Γ ⊢ S ▷ Θ. Judgement Γ ⊢ x ▷ T says that under typing environment Γ, name x has type T. Judgement Γ ⊢ P ▷ Δ stipulates that process P is well typed under the environment Γ and produces a permission environment Δ. In this judgement, Γ records the types of the names of P and Δ records the permissions exercised by the names in P for each base type. Finally, judgement Γ ⊢ S ▷ Θ states that system S is well typed under the environment Γ and produces interface Θ, which records the group memberships of all components of S as well as the permissions exercised by each component.

Typing System. We now move on to our typing system. We begin with some useful notation. We write:

Δ_T^r = t : read          if T = t
        t : access        if T = G[t]
        ∅                 otherwise

Δ_T^w = t : write         if T = t
        t : disclose G 1  if T = G[t]
        ∅                 otherwise

Furthermore, we define the ⊎ operator over permission sets:

p1 ⊎ p2 = {p | p ∈ p1 ∪ p2 ∧ p ≠ disclose G λ ∧ p ≠ nondisclose}
        ∪ {disclose G (λ1 + λ2) | disclose G λ1 ∈ p1 ∧ disclose G λ2 ∈ p2}
        ∪ {disclose G λ | disclose G λ ∈ p1 ∧ disclose G λ′ ∉ p2}
        ∪ {disclose G λ | disclose G λ′ ∉ p1 ∧ disclose G λ ∈ p2}

Operator ⊎ adds two permission sets by taking the union of the non-nondisclose permissions, while adding the linearities of matching disclose G λ permissions. We extend the ⊎ operator to Δ-environments: assuming t : ∅ ∈ Δ if t : p ∉ Δ, we define Δ1 ⊎ Δ2 = {t : p1 ⊎ p2 | t : p1 ∈ Δ1, t : p2 ∈ Δ2}.
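The ⊎ operator on permission sets can be sketched directly from its definition. The encoding is assumed (as in earlier sketches): a permission is a string or a tuple ("disclose", G, λ) with λ an int or "*".

```python
# Sketch of p1 ⊎ p2: plain permissions are unioned (nondisclose dropped),
# and linearities of matching disclose-G permissions are added, with "*"
# (infinity) absorbing.
def psum(p1, p2):
    plain = {p for p in p1 | p2 if isinstance(p, str) and p != "nondisclose"}
    d1 = {p[1]: p[2] for p in p1 if isinstance(p, tuple)}
    d2 = {p[1]: p[2] for p in p2 if isinstance(p, tuple)}
    disc = set()
    for g in d1.keys() | d2.keys():
        if g in d1 and g in d2:
            lam = "*" if "*" in (d1[g], d2[g]) else d1[g] + d2[g]
        else:
            lam = d1.get(g, d2.get(g))   # present on one side only
        disc.add(("disclose", g, lam))
    return plain | disc
```

The four clauses of the formal definition correspond to the plain-permission union, the matched case with added linearities, and the two one-sided cases.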

Finally, we define the ⊕ operator as:

G ⊕ (t1 : p1, . . . , tm : pm) = t1 ≫ G[p1], . . . , tm ≫ G[pm]
G ⊕ (t1 ≫ Hθ1, . . . , tm ≫ Hθm) = t1 ≫ G[Hθ1], . . . , tm ≫ G[Hθm]

Operator ⊕, when applied to a group G and a Δ-environment, attaches G to all permission sets of Δ, thus yielding a Θ-interface, whereas, when applied to a group G and an interface Θ, it attaches group G to all interface hierarchies of Θ.

The typing system is defined in Fig. 2. Rule (Name) is used to type names: in name typing we require that all group names of the type are present in Γ. Process 0 can be typed under any typing environment (axiom (Nil)) to infer the empty Δ-environment.

Rule (In) types the input-prefixed process. If environment Γ extended with the type of y produces Δ as the interface of P, we conclude that the process x(y:T).P produces an interface where Δ is extended with the permissions Δ_T^r, where (i) if T is a base type t, then Δ is extended by t : read, since the process is reading an object of type t, (ii) if T = G′[t] for some group G′, then Δ is extended by t : access, since the process has obtained access to a link for base type t, and (iii) Δ remains unaffected otherwise.


(Name)   if fg(T) ⊆ Γ, then Γ · x : T ⊢ x ▷ T

(Nil)    Γ ⊢ 0 ▷ ∅

(In)     if Γ · y : T ⊢ P ▷ Δ and Γ ⊢ x ▷ G[T], then Γ ⊢ x(y:T).P ▷ Δ ⊎ Δ_T^r

(Out)    if Γ ⊢ P ▷ Δ, Γ ⊢ x ▷ G[T] and Γ ⊢ y ▷ T, then Γ ⊢ x⟨y⟩.P ▷ Δ ⊎ Δ_T^w

(ParP)   if Γ ⊢ P1 ▷ Δ1 and Γ ⊢ P2 ▷ Δ2, then Γ ⊢ P1 | P2 ▷ Δ1 ⊎ Δ2

(ParS)   if Γ ⊢ S1 ▷ Θ1 and Γ ⊢ S2 ▷ Θ2, then Γ ⊢ S1 | S2 ▷ Θ1 · Θ2

(ResNP)  if Γ · x : T ⊢ P ▷ Δ, then Γ ⊢ (ν x:T)P ▷ Δ

(ResNS)  if Γ · x : T ⊢ S ▷ Θ, then Γ ⊢ (ν x:T)S ▷ Θ

(ResGP)  if Γ · G ⊢ P ▷ Δ, then Γ ⊢ (ν G)P ▷ G ⊕ Δ

(ResGS)  if Γ · G ⊢ S ▷ Θ, then Γ ⊢ (ν G)S ▷ G ⊕ Θ

(Rep)    if Γ ⊢ P ▷ Δ, then Γ ⊢ !P ▷ Δ!

Fig. 2. The Typing System

Rule (Out) is similar: if y is of type T, x is of type G[T] and Δ is the permission interface for P, then x⟨y⟩.P produces an interface which extends Δ with the permissions Δ_T^w. These permissions are (i) {t : write} if T = t, since the process is writing data of type t, (ii) {t : disclose G 1} if T = G[t], since the process is disclosing once a link to private data via a channel of group G, and (iii) the empty set of permissions otherwise.

Rule (ParP) uses the ⊎ operator to compose the process interfaces of P1 and P2. Parallel composition of systems, rule (ParS), concatenates the system interfaces of S1 and S2. For name restriction, (ResNP) specifies that if P type checks within an environment Γ · x : T, then (ν x:T)P type checks in environment Γ; (ResNS) is defined similarly. Moving on to group creation, for rule (ResGP) we have that, if P produces a typing Δ, then system (ν G)P produces the Θ-interface G ⊕ Δ, whereas for rule (ResGS), if S produces a typing interface Θ, then system (ν G)S produces interface G ⊕ Θ. Thus, enclosing a system within a (ν G) operator results in adding G to the group memberships of each of its components.

Finally, for replication, rule (Rep) states that if P produces an interface Δ, then !P produces the interface Δ!, where Δ! is such that if a type is disclosed with some linearity λ in Δ, then it is disclosed an unlimited number of times in Δ!. That is, Δ! = {t : p! | t : p ∈ Δ}, where p! = {p ∈ p | p ≠ disclose G λ} ∪ {disclose G ∗ | disclose G λ ∈ p}. Note that the type system never assigns the nondisclose permission; thus, interfaces are never inferred on the nondisclose permission. This is the reason the nondisclose permission is ignored in the definition of the ⊎ operator.

As an example, consider S = (ν G1)(ν G2)P, where P = !get(loc : Tl).put⟨loc⟩.0. Further, suppose the existence of a base type Loc and types Tl = G1[Loc], Tr = G2[Tl] and Ts = G1[Tl]. Let us write Γ = get : Tr · put : Ts · loc : Tl. Then we have:

Γ ⊢ 0 ▷ ∅                                                        by (Nil)
Γ ⊢ put⟨loc⟩.0 ▷ {Loc : {disclose G1 1}}                          by (Out)
Γ ⊢ get(loc : Tl).put⟨loc⟩.0 ▷ {Loc : {disclose G1 1, access}}    by (In)
Γ ⊢ !get(loc : Tl).put⟨loc⟩.0 ▷ {Loc : {disclose G1 ∗, access}}   by (Rep)
Γ ⊢ (ν G2)P ▷ Loc ≫ G2[{disclose G1 ∗, access}]                   by (ResGP)
Γ ⊢ S ▷ Loc ≫ G1[G2[{disclose G1 ∗, access}]]                     by (ResGS)

4 Soundness and Safety

In this section we establish soundness and safety results for our framework. Missing proofs can be found in the appendix. First, we establish that typing is preserved under substitution.

Lemma 1 (Substitution). If Γ · x : T ⊢ P ▷ Δ, then Γ · y : T ⊢ P{y/x} ▷ Δ.

The next definition introduces an ordering that captures the changes to the interface environment when a process executes an action.

Definition 1 (Θ1 ⪯ Θ2).

1. p1 ⪯ p2 if (i) for all p ∈ p1 with p ≠ disclose G λ, we have p ∈ p2, and (ii) for all disclose G λ ∈ p1, there exists disclose G λ′ ∈ p2 with λ′ ≥ λ or λ′ = ∗.
2. Δ1 ⪯ Δ2 if, for all t, t : p1 ∈ Δ1 implies that t : p2 ∈ Δ2 and p1 ⪯ p2.
3. (i) G[p1] ⪯ G[p2] if p1 ⪯ p2, and (ii) G[H1] ⪯ G[H2] if H1 ⪯ H2.
4. Θ1 ⪯ Θ2 if (i) dom(Θ1) = dom(Θ2), and (ii) for all t, t ≫ H1 ∈ Θ1 implies that t ≫ H2 ∈ Θ2 and H1 ⪯ H2.

Specifically, when a process executes an action, we expect a name to maintain or lose its interface capabilities, which are expressed through the typing of the name.
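The base case of this ordering, p1 ⪯ p2 on permission sets, can be sketched as a direct check of the two conditions of Definition 1. The encoding (strings and ("disclose", G, λ) tuples with λ an int or "*") is assumed, as in the earlier sketches.

```python
# Sketch of p1 ⪯ p2: every plain permission of p1 occurs in p2, and every
# disclose-G λ of p1 is matched in p2 by disclose-G λ' with λ' ≥ λ or λ' = ∗.
def leq(p1, p2):
    disc2 = {p[1]: p[2] for p in p2 if isinstance(p, tuple)}
    for p in p1:
        if isinstance(p, tuple):
            _, g, lam = p
            if g not in disc2:
                return False
            lam2 = disc2[g]
            if lam2 != "*" and (lam == "*" or lam2 < lam):
                return False
        elif p not in p2:
            return False
    return True
```

The remaining clauses of Definition 1 lift this check pointwise to Δ-environments, interface hierarchies and Θ-interfaces.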

We are now ready to define the notion of satisfaction of a policy P by a permission interface Θ, thus connecting our type system with policy compliance.

Definition 2.

– Consider a policy hierarchy H = G : p[Hj] j∈J and an interface hierarchy Hθ. We say that Hθ satisfies H, written H ⊨ Hθ, if:

  groups(Hθ) = {G} ∪ ⋃ j∈J groups(Hθ_j)     Hj ⊨ Hθ_j for all j ∈ J     perms(Hθ) ⪯ (⊎ j∈J perms(Hθ_j)) ⊎ p
  ──────────────────────────────────────────────────────────────────
  G : p[Hj] j∈J ⊨ Hθ

– Consider a policy P and an interface Θ. Θ satisfies P, written P ⊨ Θ, if:

  H ⊨ Hθ                      H ⊨ Hθ     P ⊨ Θ
  ─────────────────           ─────────────────────
  t ≫ H; P ⊨ t ≫ Hθ           t ≫ H; P ⊨ t ≫ Hθ; Θ

According to the definition of H ⊨ Hθ, an interface hierarchy Hθ satisfies a policy hierarchy H if its groups can be decomposed into a partition {G} ∪ ⋃ j∈J G_j such that there exist interface hierarchies Hθ_j referring to groups G_j, each satisfying hierarchy Hj, and where the sum of the permissions of the Hθ_j with the permissions p bounds the permissions of Hθ, that is, perms(Hθ) ⪯ (⊎ j∈J perms(Hθ_j)) ⊎ p. Similarly, a Θ-interface satisfies a policy, P ⊨ Θ, if for each component t ≫ Hθ of Θ there exists a component t ≫ H of P such that Hθ satisfies H. A direct corollary of the definition is the preservation of the satisfiability relation under the ⪯ ordering:

Corollary 1. If P ⊨ Θ1 and Θ2 ⪯ Θ1, then P ⊨ Θ2.

The next definition formalises when a system satisfies a policy:

Definition 3 (Policy Satisfaction). Let P be well formed. We say that S satisfies P, written P ⊨ S, if Γ ⊢ S ▷ Θ for some Γ and Θ such that P ⊨ Θ.

We may now state our result on type preservation by action execution of processes.

Theorem 1 (Type Preservation).

1. If Γ ⊢ P ▷ Δ and P --ℓ--> P′, then Γ ⊢ P′ ▷ Δ′ with Δ′ ⪯ Δ.
2. If Γ ⊢ S ▷ Θ and S --ℓ--> S′, then Γ ⊢ S′ ▷ Θ′ with Θ′ ⪯ Θ.

Corollary 2. If P ⊨ S and S --ℓ--> S′, then P ⊨ S′.

Let countLnk(P, Γ, G[t]) count the number of output prefixes of the form x⟨y⟩ in process P where x : G[t] for some base type t. (This can be defined inductively on the structure of P.) Moreover, given a policy hierarchy H and a set of groups G, let us write H_G for the interface hierarchy such that (i) groups(H_G) = G, (ii) H ⊨ H_G and (iii) for all Hθ such that groups(Hθ) = G and H ⊨ Hθ, we have perms(Hθ) ⪯ perms(H_G). Intuitively, H_G captures an interface hierarchy with the maximum possible permissions for groups G, as determined by H. We may now define the notion of an error process, which clarifies the satisfiability relation between policies and processes.

Definition 4 (Error Process). Consider a policy P, an environment Γ and a system

S ≡ (ν G1)(ν x1 : T1)((ν G2)(ν x2 : T2)(. . . ((ν Gn)(ν xn : Tn)P | Q | Sn) . . . ) | S1)

System S is an error process with respect to P and Γ if there exists t such that P = t ≫ H; P′ and at least one of the following holds, where G = ⟨G1, . . . , Gn⟩:

1. read ∉ perms(H_G) and ∃x such that Γ ⊢ x ▷ G[t] and P = x(y).P′.
2. write ∉ perms(H_G) and ∃x such that Γ ⊢ x ▷ G[t] and P = x⟨y⟩.P′.
3. access ∉ perms(H_G) and ∃x such that Γ ⊢ x ▷ G[t] and P = y(x).P′.
4. disclose G′ λ ∉ perms(H_G) and ∃x, y such that Γ ⊢ x ▷ G[t], Γ ⊢ y ▷ G′[G[t]] and P = y⟨x⟩.P′.
5. disclose G λ ∈ perms(H_G), λ ≠ ∗ and countLnk(P, Γ, G[t]) > λ.
6. There exists a sub-hierarchy H′ = Gk : p[Hi] i∈I of H, 1 ≤ k ≤ n, with nondisclose ∈ p, and ∃x, y such that Γ ⊢ x ▷ G[t], Γ ⊢ y ▷ G′[G[t]] and P = y⟨x⟩.P′ with G′ ∉ groups(H′).

The first two clauses state that a process with no read or write permission at a certain level of the hierarchy should not have a prefix receiving or sending, respectively, an object typed with the private data. Similarly, an error process with no access permission at a certain level of the hierarchy should not have an input prefix whose object is a link to private data. An output-prefixed process that sends links through a channel of sort G′ is an error process if it is found in a group hierarchy with no disclose G′ λ permission. In the fifth clause, a process is an error if the number of output prefixes on links in its definition (counted with the countLnk(P, Γ, G[t]) function) exceeds the λ of the disclose G λ permission of the process's hierarchy. Finally, if a policy specifies that no data should be disclosed outside some group G, then a process should not be able to send private-data links to groups that are not contained within the hierarchy of G.

As expected, if a process is an error with respect to a policy P and an environment Γ, its Θ-interface does not satisfy P:

Lemma 2. Let system S be an error process with respect to a well-formed policy P and environment Γ. If Γ ⊢ S ▷ Θ, then P ⊭ Θ.

By Corollary 2 and Lemma 2 we conclude with our safety theorem, which verifies that the satisfiability of a policy by a typed process is preserved by the semantics.

Theorem 2 (Safety). If Γ ⊢ S ▷ Θ, P ⊨ Θ and S ⟶∗ S′, then S′ is not an error with respect to policy P.

5 Example

Electronic Traffic Pricing (ETP) is an electronic toll collection scheme in which the fee to be paid by drivers depends on the road usage of their vehicles, where factors such as the types of roads used and the times of usage determine the toll. To achieve this, detailed time and location information must be collected and processed for each vehicle, and the due amount can be calculated with the help of a digital tariff and a road map. A number of possible implementation schemes may be considered for this system [8]. In the centralized approach, all location information is communicated to the pricing authority, which computes the fee to be paid based on the received information. In the decentralized approach, the fee is computed locally on the car via the use of a trusted third entity such as a smart card. In the following subsections we consider these approaches and their associated privacy characteristics.

5.1 The Centralized Approach

This approach makes use of on-board equipment (OBE) which regularly computes the geographical position of the car and forwards it to the Pricing Authority (PA). To avoid drivers tampering with their OBE and communicating false information, the authorities may perform checks on the spot to confirm that the OBE is reliable.

We may model this system with the aid of five groups: ETP corresponds to the entirety of the ETP system, Car refers to the car and is divided into the OBE and GPS subgroups, and PA refers to the pricing authority. As far as types are concerned, we assume the existence of two base types: Loc, referring to the attribute of locations, and Fee, referring to the attribute of fees. We write Tl = ETP[Loc], Tr = Car[Tl], Tpa = ETP[Tl], Tx = ETP[Tl] and Tsc = ETP[Tx].


O = !read(loc : Tl).topa⟨loc⟩.0 | !spotcheck(s : Tx).read(ls : Tl).s⟨ls⟩.0
L = !(ν newl : Tl)read⟨newl⟩.0
A = !topa(z : Tl).z(l : Loc).0 | !send⟨fee⟩.0 | !(ν x : Tx)spotcheck⟨x⟩.x(y : Tl).y(ls : Loc).0
System = (ν ETP)(ν spotcheck : Tsc)(ν topa : Tpa)[ (ν PA)A | (ν Car)((ν read : Tr)((ν OBE)O | (ν GPS)L)) ]

In the above model we have the component of the OBE, O, belonging to group OBE, and the component responsible for computing the current location, L, belonging to group GPS. These two components are nested within the Car group and share the private name read on which it is possible for L to pass to O a name via which the current location may be read. The OBE O may spontaneously read on name read or it may enquire the current location for the purposes of a spot check. Such a check is initiated by the pricing authority A, who may engage in three different activities: Firstly, it may receive a name z from the OBE via channel topa and then use z for reading the car location (action z(l)). Secondly, it may periodically compute the fee to be paid and communicate the link (fee : ETP[Fee]) via name send : ETP[ETP[Fee]]. Thirdly, it may initiate a spot check, during which it creates and sends the OBE a new channel via which the OBE is expected to return the current location for a verification check.

By applying the rules of our type system we may show that Γ ⊢ System ▹ Θ, where Γ = {fee : ETP[Fee], send : ETP[ETP[Fee]]} and where

Θ = Fee ▹ ETP[PA[{disclose ETP∗}]]; Loc ▹ ETP[PA[{access, read}]];
Loc ▹ ETP[Car[OBE[{access, disclose ETP∗}]]]; Loc ▹ ETP[Car[GPS[{disclose Car∗}]]]

A possible privacy policy for the system might be one that states that locations may be freely forwarded by the OBE. We may define this by P = Loc ▹ H where

H = ETP : nondisclose [Car : [OBE : {access, disclose ETP∗}, GPS : {disclose Car∗}], PA : {access, read}]

We have that P ⊢ Loc ▹ ETP[PA[{access, read}]], since the permissions assigned to groups ETP and PA by the policy are equal to {access, read} ⊇ {access, read} = perms(ETP[PA[{access, read}]]). Similarly, P ⊢ Loc ▹ ETP[Car[OBE[{access, disclose ETP∗}]]] and P ⊢ Loc ▹ ETP[Car[GPS[{disclose Car∗}]]]. Thus, we conclude that System satisfies P.
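The permission-inclusion check in this example can be mimicked concretely. The following Python sketch uses a hypothetical encoding of our own (group paths as tuples, permissions as string sets; none of these names come from the paper) of the criterion that the policy must grant, at the relevant group, at least the permissions inferred by typing:

```python
# Hypothetical encoding of the satisfaction check illustrated above.
# A policy maps a path of nested groups to the permission set it grants;
# an interface entry produced by typing is a (path, permissions) pair.

def satisfies(policy, entry):
    """True iff the policy grants at least the inferred permissions."""
    path, perms = entry
    return policy.get(path, set()) >= perms

policy = {
    ("ETP", "PA"): {"access", "read"},
    ("ETP", "Car", "OBE"): {"access", "disclose ETP*"},
    ("ETP", "Car", "GPS"): {"disclose Car*"},
}

# Loc ▹ ETP[PA[{access, read}]] is satisfied by the policy:
print(satisfies(policy, (("ETP", "PA"), {"access", "read"})))  # True
# A type demanding an extra permission would be rejected:
print(satisfies(policy, (("ETP", "PA"), {"access", "read", "write"})))  # False
```

The set-inclusion test mirrors the perms comparison in the text: satisfaction fails exactly when typing infers a permission the policy does not grant.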

This architecture is simple but also very weak in protecting the privacy of individuals: the fact that the PA gets detailed travel information about every vehicle constitutes a privacy and security threat. An alternative implementation that limits the transmission of locations is presented in the second proposal below.


Type Checking Privacy Policies in the π-calculus 193

5.2 The Decentralized Approach

To avoid the disclosure of the complete travel logs of a system, this solution employs a trusted third entity (e.g. a smart card) to compute the fee locally on the car and send its value to the authority, which in turn may make spot checks to obtain evidence on the correctness of the calculation.

The policy here would require that locations can be communicated at most a small fixed number of times, and that the OBE may read the fee computed by the smart card but not change its value. Precisely, the new privacy policy might be:

Loc ▹ ETP : nondisclose [
    Car : [ OBE : {access, disclose ETP²},
            GPS : {disclose Car∗},
            SC : {access, read} ],
    PA : {access, read} ]

Fee ▹ ETP : nondisclose [
    Car : [ OBE : {},
            GPS : {},
            SC : {write, disclose ETP∗} ],
    PA : {access, read} ]

The new system as described above may be modelled as follows, where we have a new group SC and a new component S, a smart card, belonging to this group:

S = !read(loc : Tl).loc(l : Loc).(ν newval : Fee)fee⟨newval⟩.send⟨fee⟩.0
O = spotcheck(s1 : Tx).read(ls1 : Tl).s1⟨ls1⟩.spotcheck(s2 : Tx).read(ls2 : Tl).s2⟨ls2⟩.0
L = !(ν newl : Tl)read⟨newl⟩.0
A = !(ν x : Tx)spotcheck⟨x⟩.x(y : Tl).y(ls : Loc).0 | send(fee).fee(v : Fee).0
System = (ν ETP)(ν spotcheck : Tsc)(ν topa : Tpa)[ (ν PA)A | (ν Car)((ν read : Tr)((ν OBE)O | (ν GPS)L) | (ν SC)S) ]

We may verify that Γ ⊢ System ▹ Θ, where Γ = {fee : ETP[Fee], send : ETP[ETP[Fee]]} and interface Θ satisfies the enunciated policy.

6 Related Work

There exists a large body of literature concerned with formally reasoning about privacy. To begin with, a number of languages have been proposed to express privacy policies [16,15]; these can be used to verify the consistency of policies or to check whether a system complies with a certain policy via static techniques such as model checking [15,13], on-the-fly using monitoring, or through audit procedures [7,2].

Related to the privacy properties we are considering in this paper is the notion of Contextual Integrity [2]. Aspects of this notion have been formalized in a logical framework and were used for specifying privacy regulations, while notions of compliance of policies by systems were considered. Also related to our work is [17], where a family of models named P-RBAC (Privacy-aware Role Based Access Control) is presented that extends traditional role-based access control to support the specification of complex privacy policies. In particular, the variation thereby introduced, called Hierarchical P-RBAC, introduces, amongst others, the notion of role hierarchies, which is reminiscent of our policy hierarchies. However, the methodology proposed is mostly geared towards



expressing policies and checking for conflicts within policies, as opposed to assessing the satisfaction of policies by systems, which is the goal of our work.

Also related to our work is the research line on type-based security in process calculi. Among these works, numerous studies have focused on access control, which is closely related to privacy. For instance, the work on the Dπ calculus has introduced sophisticated type systems for controlling the access to distributed resources [11,12]. Furthermore, discretionary access control has been considered in [4], which, similarly to our work, employs the π-calculus with groups, while role-based access control (RBAC) has been considered in [3,9,6]. In addition, authorization policies and their analysis via type checking have been considered in a number of papers including [10,1]. While adopting a similar approach, our work departs from these works in the following respects: To begin with, we note that role-based access control is insufficient for reasoning about certain privacy violations. While in RBAC it is possible to express that a doctor may read a patient's data and send emails, it is not possible to detect the privacy violation committed when the doctor sends an email containing the sensitive patient data. In our framework, we may control such information dissemination by distinguishing between different types of data and how these can be manipulated. Furthermore, a novelty of our approach is the concept of hierarchies within policies, which allows the system to be arranged into a hierarchy of disclosure zones while allowing the inheritance of permissions between groups within the hierarchy.

To conclude, we mention our previous work [14]. In that work we employed the π-calculus with groups accompanied by a type system based on i/o and linear types for capturing privacy-related notions. In the present work, the type system is reconstructed and simplified, using the notion of groups to distinguish between different entities and introducing permission inference during type checking. Most importantly, a contribution of this work in comparison to [14] is that we introduce a policy language and prove a safety criterion that establishes policy satisfaction by typing.

7 Conclusions

In this paper we have presented a formal framework based on the π-calculus with groups for studying privacy. Our framework is accompanied by a type system for capturing privacy-related notions and a privacy language for expressing privacy policies. We have proved a type preservation theorem and a safety theorem which establishes sufficient conditions for a system to satisfy a policy.

The policy language we have proposed is a simple language that constructs a hierarchical structure of the entities composing a system and assigns permissions for accessing sensitive data to each of the entities, while allowing one to reason about some simple privacy violations. These permissions are certainly not intended to capture every possible privacy issue, but rather to demonstrate a method of how one might formalize privacy rights. Identifying an appropriate and complete set of permissions providing foundations for the notion of privacy in the general context should be the result of intensive and probably interdisciplinary research that justifies each choice. To this effect, Solove's taxonomy of privacy violations [18] forms a promising context on which these efforts can be based, and it provides various directions for future work.



Other possible directions for future work can be inspired by privacy approaches such as contextual integrity and P-RBAC. We are currently extending our work to reason about more complex privacy policies that include conditional permissions and the concepts of purpose and obligation as in P-RBAC. Finally, it would be interesting to explore more dynamic settings where the roles evolve over time.

References

1. Backes, M., Hritcu, C., Maffei, M.: Type-checking zero-knowledge. In: Proceedings of CCS 2008, pp. 357–370 (2008)

2. Barth, A., Datta, A., Mitchell, J.C., Nissenbaum, H.: Privacy and contextual integrity: Framework and applications. In: Proceedings of S&P 2006, pp. 184–198 (2006)

3. Braghin, C., Gorla, D., Sassone, V.: Role-based access control for a distributed calculus. Journal of Computer Security 14(2), 113–155 (2006)

4. Bugliesi, M., Colazzo, D., Crafa, S., Macedonio, D.: A type system for discretionary access control. Mathematical Structures in Computer Science 19(4), 839–875 (2009)

5. Cardelli, L., Ghelli, G., Gordon, A.D.: Secrecy and group creation. Information and Computation 196(2), 127–155 (2005)

6. Compagnoni, A.B., Gunter, E.L., Bidinger, P.: Role-based access control for boxed ambients. Theoretical Computer Science 398(1-3), 203–216 (2008)

7. Datta, A., Blocki, J., Christin, N., DeYoung, H., Garg, D., Jia, L., Kaynar, D., Sinha, A.: Understanding and protecting privacy: Formal semantics and principled audit mechanisms. In: Jajodia, S., Mazumdar, C. (eds.) ICISS 2011. LNCS, vol. 7093, pp. 1–27. Springer, Heidelberg (2011)

8. de Jonge, W., Jacobs, B.: Privacy-friendly electronic traffic pricing via commits. In: Degano, P., Guttman, J., Martinelli, F. (eds.) FAST 2008. LNCS, vol. 5491, pp. 143–161. Springer, Heidelberg (2009)

9. Dezani-Ciancaglini, M., Ghilezan, S., Jaksic, S., Pantovic, J.: Types for role-based access control of dynamic web data. In: Marino, J. (ed.) WFLP 2010. LNCS, vol. 6559, pp. 1–29. Springer, Heidelberg (2011)

10. Fournet, C., Gordon, A., Maffeis, S.: A type discipline for authorization in distributed systems. In: Proceedings of CSF 2007, pp. 31–48 (2007)

11. Hennessy, M., Rathke, J., Yoshida, N.: safeDpi: A language for controlling mobile code. Acta Informatica 42(4-5), 227–290 (2005)

12. Hennessy, M., Riely, J.: Resource access control in systems of mobile agents. Information and Computation 173(1), 82–120 (2002)

13. Koleini, M., Ritter, E., Ryan, M.: Model checking agent knowledge in dynamic access control policies. In: Piterman, N., Smolka, S.A. (eds.) TACAS 2013. LNCS, vol. 7795, pp. 448–462. Springer, Heidelberg (2013)

14. Kouzapas, D., Philippou, A.: A typing system for privacy. In: Counsell, S., Nunez, M. (eds.) SEFM 2013. LNCS, vol. 8368, pp. 56–68. Springer, Heidelberg (2014)

15. Liu, Y., Muller, S., Xu, K.: A static compliance-checking framework for business process models. IBM Systems Journal 46(2), 335–362 (2007)

16. May, M.J., Gunter, C.A., Lee, I.: Privacy APIs: Access control techniques to analyze and verify legal privacy policies. In: Proceedings of CSFW 2006, pp. 85–97 (2006)

17. Ni, Q., Bertino, E., Lobo, J., Brodie, C., Karat, C., Karat, J., Trombetta, A.: Privacy-aware role-based access control. ACM Trans. on Information and System Security 13(3) (2010)

18. Solove, D.J.: A Taxonomy of Privacy. University of Pennsylvania Law Review 154(3), 477–560 (2006)

19. Tschantz, M.C., Wing, J.M.: Formal methods for privacy. In: Cavalcanti, A., Dams, D.R. (eds.) FM 2009. LNCS, vol. 5850, pp. 1–15. Springer, Heidelberg (2009)


Extending Testing Automata to All LTL

Ala Eddine Ben Salem (✉)

LRDE, EPITA, Le Kremlin-Bicêtre, [email protected]

Abstract. An alternative to the traditional Büchi Automata (BA), called Testing Automata (TA), was proposed by Hansen et al. [8, 6] to improve the automata-theoretic approach to LTL model checking. In previous work [2], we proposed an improvement of this alternative approach called TGTA (Transition-based Generalized Testing Automata). TGTA mix features from both TA and TGBA (Transition-based Generalized Büchi Automata), without the disadvantage of TA, namely the second pass of the emptiness check algorithm. We have shown that TGTA outperform TA, BA and TGBA for explicit and symbolic LTL model checking. However, TA and TGTA are less expressive than Büchi Automata, since they are able to represent only stutter-invariant LTL properties (LTL\X) [13]. In this paper, we show how to extend Generalized Testing Automata (TGTA) to represent any LTL property. This allows the model checking approach based on this new form of testing automata to be extended to check other kinds of properties and also other kinds of models (such as timed models). Implementation and experimentation of this extended TGTA approach show that it is statistically more efficient than the Büchi Automata approaches (BA and TGBA) for the explicit model checking of LTL properties.

1 Introduction

The model checking of a behavioral property on a finite-state system is an automatic procedure that requires many phases. The first step is to formally represent the system and the property to be checked. The formalization of the system produces a model M that formally describes all the possible executions of the system. The property to be checked is formally described using a specification language such as Linear-time Temporal Logic (LTL). The next step is to run a model checking algorithm that takes as inputs the model M and the LTL property ϕ. This algorithm exhaustively checks that all the executions of the model M satisfy ϕ. When the property is not satisfied, the model checker returns a counterexample, i.e., an execution of M invalidating ϕ; this counterexample is particularly useful to find subtle errors in complex systems.

The automata-theoretic approach [16] to LTL model checking represents the state-space of M and the property ϕ using variants of ω-automata, i.e., an extension of the classical finite automata to recognize words having infinite length (called ω-words).

The automata-theoretic approach splits the verification process into four operations:

1. Computation of the state-space of M. This state-space can be represented by a variant of ω-automaton, called a Kripke structure KM, whose language L(KM) represents all possible infinite executions of M.

2. Translation of the negation of the LTL property ϕ into an ω-automaton A¬ϕ whose language, L(A¬ϕ), is the set of all infinite executions that would invalidate ϕ.

© IFIP International Federation for Information Processing 2015. S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 196–210, 2015. DOI: 10.1007/978-3-319-19195-9_13



3. Synchronization of these automata. This constructs a synchronous product automaton KM ⊗ A¬ϕ whose language, L(KM ⊗ A¬ϕ) = L(KM) ∩ L(A¬ϕ), is the set of executions of M invalidating ϕ.

4. Emptiness check of this product. This operation tells whether KM ⊗ A¬ϕ accepts an infinite word, and can return such a word (a counterexample) if it does. The model M verifies ϕ iff L(KM ⊗ A¬ϕ) = ∅.

The main difficulty of LTL model checking is the state-space explosion problem. In particular, in the automata-theoretic approach, the product automaton KM ⊗ A¬ϕ is often too large to be checked for emptiness in a reasonable run time and memory. Indeed, the performance of the automata-theoretic approach mainly depends on the size of the part of KM ⊗ A¬ϕ explored during the emptiness check. This explored part itself depends on three parameters: the automaton A¬ϕ obtained from the LTL property ϕ, the Kripke structure KM representing the state-space of M, and the emptiness check algorithm: the fact that this algorithm is performed "on-the-fly" potentially avoids building the entire product automaton. The states of this product that are not visited by the emptiness check are not generated at all.

Different kinds of ω-automata have been used to represent A¬ϕ. In the most common case, the negation of ϕ is converted into a Büchi automaton (BA) with state-based acceptance. Transition-based Generalized Büchi Automata (TGBA) represent LTL properties using generalized (i.e., multiple) Büchi acceptance conditions on transitions rather than on states. TGBA allow a smaller property automaton than BA [7, 4].

Unfortunately, having a smaller property automaton A¬ϕ does not always imply a smaller product (AM ⊗ A¬ϕ). Thus, instead of targeting smaller property automata, some people have attempted to build automata that are more deterministic [14].

Hansen et al. [8, 6] introduced an alternative type of ω-automata called Testing Automata (TA) that only observe changes on the atomic propositions. TA are often larger than their equivalent BA, but according to Geldenhuys and Hansen [6], thanks to their high degree of determinism [8], TA allow a smaller product to be obtained and thus improve the performance of model checking. On the downside, TA have two different modes of acceptance (Büchi-accepting or livelock-accepting), and consequently their emptiness check requires two passes [6], mitigating the benefits of having a smaller product.

In previous work [2], we proposed an improvement of TA called Transition-based Generalized Testing Automata (TGTA) that combine the advantages of both TA and TGBA without the disadvantages of TA (without introducing a second mode of acceptance and without the second pass of the emptiness check).

Unfortunately, the two variants of testing automata, TA and TGTA, are less expressive than Büchi automata (BA and TGBA) since they are tailored to represent stutter-invariant properties.

The goal of this paper is to extend TGTA in order to obtain a new form of testing automata that represent any LTL property, and therefore to extend the model checking approach based on this alternative kind of automata to check other kinds of properties and also other kinds of models.

In order to remove the constraint that TGTA only represent stutter-invariant properties, one solution would be to change the construction of TGTA to take into account the "sub-parts" of the automata corresponding to the "sub-formulas" that are sensitive



to stuttering. Indeed, during the transformation of a TGBA into a TGTA, only the second step exploits the fact that the LTL property is stutter-invariant (this step removes the useless stuttering-transitions). The idea is to apply this second step only to the parts of the automata that are insensitive to stuttering, and to apply only the first step of the construction to the other parts (those that are sensitive to stuttering).

We have run benchmarks to compare the new form of TGTA against BA and TGBA. Experiments reported that, in most cases, TGTA produce the smallest products (AM ⊗ A¬ϕ), and TGTA outperform BA and TGBA when no counterexample is found (i.e., the property is satisfied), but they are comparable when the property is violated, because in this case the on-the-fly algorithm stops as soon as it finds a counterexample, without exploring the entire product.

2 Preliminaries

Let AP be a finite set of atomic propositions. A valuation ℓ over AP is represented by a function ℓ : AP → {⊥, ⊤}. We denote by Σ = 2^AP the set of all valuations over AP, where a valuation ℓ ∈ Σ is interpreted either as the set of atomic propositions that are true, or as a Boolean conjunction. For instance, if AP = {a, b}, then Σ = 2^AP = {{a,b}, {a}, {b}, ∅} or equivalently Σ = {ab, ab̄, āb, āb̄}.
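In code, a valuation can be represented as the set of propositions that hold, and Σ enumerated as the power set of AP. A small illustrative sketch (the names are ours, not the paper's):

```python
from itertools import combinations

AP = {"a", "b"}

def valuations(ap):
    """Sigma = 2^AP: every subset of AP, as a frozenset of true propositions."""
    items = sorted(ap)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

Sigma = valuations(AP)
print(len(Sigma))  # 4 valuations: āb̄, ab̄, āb, ab
```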

The state-space of a system can be represented by a directed graph, called a Kripke structure, where vertices represent the states of the system and edges are the transitions between these states. In addition, each vertex is labeled by a valuation that represents the set of atomic propositions that are true in the corresponding state.

Definition 1 (Kripke Structure). A Kripke structure over the set of atomic propositions AP is a tuple K = 〈S, S0, R, l〉, where:

– S is a finite set of states,
– S0 ⊆ S is the set of initial states,
– R ⊆ S × S is the transition relation,
– l : S → Σ is a labeling function that maps each state s to a valuation that represents the set of atomic propositions that are true in s.
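Definition 1 transcribes directly into a small data structure. The sketch below is illustrative (our own encoding, not from the paper) and builds a tiny two-state structure over AP = {p}:

```python
from dataclasses import dataclass

# Field names mirror the tuple ⟨S, S0, R, l⟩ of Definition 1.
@dataclass
class Kripke:
    S: set     # finite set of states
    S0: set    # initial states, S0 ⊆ S
    R: set     # transition relation, pairs (s, s')
    l: dict    # labeling: state -> frozenset of propositions true in it

# Two states over AP = {p}: p holds in state 0 and never afterwards.
K = Kripke(
    S={0, 1},
    S0={0},
    R={(0, 1), (1, 1)},
    l={0: frozenset({"p"}), 1: frozenset()},
)
```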

The automata-theoretic approach is based on the transformation of the negation of the LTL property to be checked into an ω-automaton that accepts the same executions. Büchi Automata (BA) are ω-automata with labels on transitions and acceptance conditions on states. Büchi Automata are commonly used for LTL model checking (we use the abbreviation BA for the standard variant of Büchi Automata). The following section presents TGBA [7]: a generalized variant of BA that allows a more compact representation of LTL properties [4].

2.1 Transition-Based Generalized Büchi Automata (TGBA)

A Transition-based Generalized Büchi Automaton (TGBA) [7] is a variant of a Büchi automaton that has multiple acceptance conditions on transitions.

Definition 2 (TGBA). A TGBA over the alphabet Σ = 2^AP is a tuple G = 〈Q, I, δ, F〉 where:

Page 206: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

Extending Testing Automata to All LTL 199

Fig. 1. (a) A TGBA recognizing the LTL property ϕ = GFa ∧ GFb, with two acceptance conditions F (represented by colored dots on transitions in the figure). (b) A TGBA recognizing the LTL property ϕ = FGa, with a single acceptance condition.

– Q is a finite set of states,
– I ⊆ Q is a set of initial states,
– F is a finite set of acceptance conditions,
– δ ⊆ Q × Σ × 2^F × Q is the transition relation, where each element (q, ℓ, F, q′) ∈ δ represents a transition from state q to state q′ labeled by a valuation ℓ ∈ 2^AP, and a set of acceptance conditions F ∈ 2^F.

An infinite word σ = ℓ0ℓ1ℓ2 . . . ∈ Σ^ω is accepted by G if there exists an infinite run r = (q0, ℓ0, F0, q1)(q1, ℓ1, F1, q2)(q2, ℓ2, F2, q3) . . . ∈ δ^ω where:

– q0 ∈ I (the infinite word is recognized by the run),
– ∀f ∈ F, ∀i ∈ N, ∃j ≥ i, f ∈ Fj (each acceptance condition is visited infinitely often).

The language of G is the set L(G) ⊆ Σ^ω of infinite words it accepts.
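For an ultimately periodic run (a finite prefix followed by a repeated cycle), the acceptance condition above reduces to checking that every f ∈ F occurs on some transition of the cycle. A minimal sketch of this check, with illustrative condition names of our own:

```python
def lasso_accepting(cycle, conditions):
    """cycle: list of TGBA transitions (q, valuation, conds, q') repeated
    forever; accepting iff every acceptance condition occurs in the cycle."""
    seen = set()
    for (_q, _label, conds, _q2) in cycle:
        seen |= set(conds)
    return conditions <= seen

# Cycle alternating an a-transition and a b-transition, as in a TGBA for
# GFa ∧ GFb, with hypothetical condition names "fa" and "fb":
cycle = [(0, frozenset({"a"}), {"fa"}, 0),
         (0, frozenset({"b"}), {"fb"}, 0)]
print(lasso_accepting(cycle, {"fa", "fb"}))  # True
```

Only the cycle matters because its transitions are the ones visited infinitely often; the prefix contributes nothing to acceptance.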

Any LTL formula ϕ can be converted into a TGBA whose language is the set of executions that satisfy ϕ [4].

Figure 1 shows two examples of LTL properties expressed as TGBA. The Boolean expression over AP = {a, b} that labels each transition represents the valuation of atomic propositions that holds on this transition. A run in these TGBA is accepted if it visits infinitely often all acceptance conditions (represented by colored dots on transitions). It is important to note that the LTL formulas labeling each state represent the property accepted starting from this state of the automaton. These labels are generated by our LTL-to-TGBA translator (Spot [12]); they are shown for the reader's convenience but not used for model checking.

Figure 1(a) is a TGBA recognizing the LTL formula GFa ∧ GFb, i.e., recognizing the runs where a is true infinitely often and b is true infinitely often. An accepting run in this TGBA has to visit infinitely often the two acceptance conditions. Therefore, it must explore infinitely often the transitions where a is true (i.e., transitions labeled by ab or ab̄) and infinitely often the transitions where b is true (i.e., transitions labeled by ab or āb).

Figure 1(b) shows a TGBA derived from the LTL formula FGa. Any infinite run in this example is accepted if it visits infinitely often the only acceptance condition, on transition (1, a, 1). Therefore, an accepting run in this TGBA must stay on state 1 by executing a infinitely.

The product of a TGBA with a Kripke structure is a TGBA whose language is the intersection of both languages.



Definition 3 (Product using TGBA). For a Kripke structure K = 〈S, S0, R, l〉 and a TGBA G = 〈Q, I, δ, F〉, the product K ⊗ G is the TGBA 〈S⊗, I⊗, δ⊗, F〉 where:

– S⊗ = S × Q,
– I⊗ = S0 × I,
– δ⊗ = {((s, q), ℓ, F, (s′, q′)) | (s, s′) ∈ R, (q, ℓ, F, q′) ∈ δ, l(s) = ℓ}

Property 1. We have L(K ⊗ G) = L(K) ∩ L(G) by construction.
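Definition 3 is essentially a filtered Cartesian product, which makes it easy to prototype. Below is an illustrative sketch with dictionary-encoded automata (our own encoding, not the paper's):

```python
def product(K, G):
    """Synchronous product of a Kripke structure K and a TGBA G:
    pair up transitions whose TGBA valuation matches the Kripke label l(s)."""
    return {
        "S": {(s, q) for s in K["S"] for q in G["Q"]},
        "I": {(s, q) for s in K["S0"] for q in G["I"]},
        "delta": {((s, q), lab, F, (s2, q2))
                  for (s, s2) in K["R"]
                  for (q, lab, F, q2) in G["delta"]
                  if K["l"][s] == lab},
        "F": G["F"],
    }

# One-state TGBA demanding that p hold at every step, synchronized with a
# two-state structure where p always holds:
K = {"S": {0, 1}, "S0": {0}, "R": {(0, 1), (1, 1)},
     "l": {0: frozenset({"p"}), 1: frozenset({"p"})}}
G = {"Q": {"A"}, "I": {"A"},
     "delta": {("A", frozenset({"p"}), frozenset({"acc"}), "A")},
     "F": {"acc"}}
P = product(K, G)
print(len(P["delta"]))  # 2: one synchronized transition per Kripke edge
```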

The goal of the emptiness check algorithm is to determine whether the product automaton accepts an execution or not, i.e., whether the language of the product automaton is empty. Testing the TGBA (representing the product automaton) for emptiness amounts to the search for an accepting cycle that contains at least one occurrence of each acceptance condition. This can be done in different ways: either with a variation of Tarjan's or Dijkstra's algorithm [3] or using several Nested Depth-First Searches (NDFS) [15]. The product automaton that has to be explored during the emptiness check is generally very large; its size can reach the product of the sizes of the model and formula automata that are synchronized to build it. Therefore, building the entire product must be avoided. "On-the-fly" emptiness check algorithms allow the product automaton to be constructed lazily during its exploration. These on-the-fly algorithms are more efficient because they stop as soon as they find a counterexample, and therefore possibly before building the entire product.
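As a non-on-the-fly illustration of this cycle search (a naive prototype of ours, not one of the cited algorithms), one can compute the SCCs reachable from the initial states and report nonemptiness when some SCC has an internal cycle carrying every acceptance condition:

```python
def nonempty(Q, I, delta, F):
    """Naive generalized emptiness check: the language is nonempty iff a
    cycle reachable from an initial state visits every acceptance condition.
    Uses Tarjan's SCC algorithm; fine for small automata, unlike the
    on-the-fly algorithms discussed above."""
    succ = {}
    for (q, _lab, conds, q2) in delta:
        succ.setdefault(q, []).append((q2, frozenset(conds)))
    index, low, on_stack, stack, sccs, n = {}, {}, set(), [], [], [0]

    def dfs(v):
        index[v] = low[v] = n[0]; n[0] += 1
        stack.append(v); on_stack.add(v)
        for (w, _c) in succ.get(v, []):
            if w not in index:
                dfs(w); low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = set()
            while True:
                w = stack.pop(); on_stack.discard(w); scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for q0 in I:
        if q0 not in index:
            dfs(q0)
    for scc in sccs:  # only states reachable from I were explored
        conds, has_cycle = set(), False
        for q in scc:
            for (w, c) in succ.get(q, []):
                if w in scc:
                    has_cycle = True
                    conds |= c
        if has_cycle and set(F) <= conds:
            return True
    return False

# An automaton shaped like Fig. 1(b): nonempty because state 1's accepting
# self-loop (condition "acc") can be taken forever.
delta = {(0, "any", frozenset(), 0),
         (0, "a", frozenset(), 1),
         (1, "a", frozenset({"acc"}), 1)}
print(nonempty({0, 1}, {0}, delta, {"acc"}))  # True
```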

3 Transition-Based Generalized Testing Automata (TGTA)

Another kind of ω-automaton, called Testing Automaton (TA), was introduced by Hansen et al. [8]. Instead of observing the valuations on states or transitions, the transitions of a TA only record the changes between these valuations. However, TA are less expressive than Büchi automata since they are able to represent only stutter-invariant LTL properties. Also, they are often a lot larger than their equivalent Büchi automaton, but their high degree of determinism [8] often leads to a smaller product size [6].

In previous work [1], we evaluated the efficiency of the LTL model checking approach using TA. We showed that TA are better than Büchi automata (BA and TGBA) when the formula to be verified is violated (i.e., a counterexample is found), but this is not the case when the property is verified, since the entire product has to be visited twice, once for each acceptance mode of a TA. Then, in order to improve the TA approach, we proposed in [2] a new kind of ω-automaton for stutter-invariant properties, called Transition-based Generalized Testing Automata (TGTA), that mixes features from both TA and TGBA.

The basic idea of TGTA is to build an improved form of testing automata with generalized acceptance conditions on transitions, which allows us to modify the automata construction in order to remove the second pass of the emptiness check of the product.

Another advantage of TGTA compared to TA is that the implementation of the TGTA approach does not require a dedicated emptiness check: it reuses the same algorithm used for Büchi automata, and the counterexample constructed by this algorithm is also reported as a counterexample for the TGTA approach.



Fig. 2. TGTA recognizing the LTL property ϕ = FGp, with acceptance conditions F = {•} (the automaton has states 0, 1 and 2; transitions are labeled by the changesets ∅ and {p}).

In [2], we compared the LTL model checking approach using TGTA with the "traditional" approaches using BA, TGBA and TA. The results of these experimental comparisons show that TGTA compete well: the TGTA approach was statistically more efficient than the other evaluated approaches, especially when no counterexample is found (i.e., the property is verified), because it does not require a second pass. Unfortunately, the TGTA constructed in [2] only represent stutter-invariant properties (LTL\X [13]).

This section shows how to adapt the TGTA construction to obtain a TGTA that can represent all LTL properties.

Before presenting this new construction of TGTA, we first recall the definition of this improved variant of testing automata.

While Büchi automata observe the values of the atomic propositions of AP, the basic idea of testing automata (TA and TGTA) is to only detect the changes in these values; if the values of the atomic propositions do not change between two consecutive valuations of an execution, the transition is called a stuttering-transition.

If A and B are two valuations, A ⊕ B denotes the symmetric set difference, i.e., the set of atomic propositions that differ (e.g., ab ⊕ ab̄ = {b}). Technically, this is implemented with an XOR operation (also denoted by the symbol ⊕).
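With valuations as frozensets, ⊕ is Python's built-in symmetric difference, and with bit-vector encodings it really is the machine XOR; a short illustration:

```python
ab     = frozenset({"a", "b"})   # valuation ab
a_notb = frozenset({"a"})        # valuation ab̄

print(ab ^ a_notb)               # frozenset({'b'}): ab ⊕ ab̄ = {b}

# Same computation with one bit per proposition (a = 0b01, b = 0b10):
print(0b11 ^ 0b01)               # 2, i.e. exactly the bit of b
```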

Definition 4 (TGTA). A TGTA over the alphabet Σ is a tuple T = 〈Q, I, U, δ, F〉 where:

– Q is a finite set of states,
– I ⊆ Q is a set of initial states,
– U : I → 2^Σ is a function mapping each initial state to a set of symbols of Σ,
– F is a finite set of acceptance conditions,
– δ ⊆ Q × Σ × 2^F × Q is the transition relation, where each element (q, k, F, q′) represents a transition from state q to state q′ labeled by a changeset k, interpreted as a (possibly empty) set of atomic propositions whose values change between q and q′, and a set of acceptance conditions F ∈ 2^F.

An infinite word σ = ℓ0ℓ1ℓ2 . . . ∈ Σ^ω is accepted by T if there exists an infinite run r = (q0, ℓ0 ⊕ ℓ1, F0, q1)(q1, ℓ1 ⊕ ℓ2, F1, q2)(q2, ℓ2 ⊕ ℓ3, F2, q3) . . . ∈ δ^ω where:

– q0 ∈ I with ℓ0 ∈ U(q0) (the infinite word is recognized by the run),
– ∀f ∈ F, ∀i ∈ N, ∃j ≥ i, f ∈ Fj (each acceptance condition is visited infinitely often).

The language accepted by T is the set L(T) ⊆ Σ^ω of infinite words it accepts.
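A useful way to read Definition 4: a TGTA run does not see the word's valuations, only their consecutive XORs. The helper below (illustrative, with names of our own) turns a finite prefix of a word into the changesets its run must follow:

```python
def changesets(word):
    """Consecutive symmetric differences ℓi ⊕ ℓi+1 over a prefix of a word,
    each valuation given as a frozenset of true propositions."""
    return [word[i] ^ word[i + 1] for i in range(len(word) - 1)]

# Prefix of the word p̄; p; p; p accepted by the TGTA of Fig. 2:
notp, p = frozenset(), frozenset({"p"})
print(changesets([notp, p, p, p]))  # changesets {p}, ∅, ∅ — run 1 → 2 → 2 → 2
```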

Figure 2 shows a TGTA recognizing the LTL formula FGp. Acceptance conditions are represented using dots as in TGBAs. Transitions are labeled by changesets: e.g., the

Page 209: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

202 A.E. Ben Salem

transition (0, {p}, 1) means that the value of p changes between states 0 and 1. Initial valuations are shown above initial arrows: U(0) = {p}, U(1) = {p̄} and U(2) = {p}. Any infinite run in this example is accepted if it visits infinitely often the accepting transition indicated by the black dot, i.e., the stuttering self-loop (2, ∅, {●}, 2). As an illustration, the infinite word p̄; p; p; p; … is accepted by the run:

1 −{p}→ 2 −∅→ 2 −∅→ 2 ⋯

because the value of p only changes between the first two steps. Indeed, a run recognizing such an infinite word must start in state 1 (because only U(1) = {p̄}), then it changes the value of p, so it has to take a transition labeled by {p}, i.e., (1, {p}, 0) or (1, {p}, 2). To be accepted, it must move to state 2 (rather than state 0), and finally stay in state 2 by executing infinitely the accepting stuttering self-loop (2, ∅, {●}, 2).

In the next section, we present in detail the formalization of the different steps used to build a TGTA that represents any LTL property.

4 TGTA Construction

Let us now describe how to build a TGTA starting from a TGBA. The TGTA construction is inspired by the one presented in [2], with some changes introduced in the second step. Indeed, a TGTA is built in two steps, as illustrated in Figure 3. The first step transforms a TGBA into an intermediate TGTA by labeling transitions with changesets. The second step then builds the final form of the TGTA by removing the useless stuttering transitions. In this work, this simplification of stuttering transitions does not require the hypothesis that the LTL property is stutter-invariant (a crucial difference compared to the TGTA construction presented in [2]). For example, Figure 4d shows a TGTA constructed for ϕ = Xp ∧ FGp, which is not stutter-invariant. In the following, we detail the successive steps to build this TGTA.

[Figure 3 diagram: TGBA → (labeling transitions with "changesets") → Intermediate TGTA → (elimination of useless stuttering transitions (∅)) → TGTA]

Fig. 3. The two steps of the construction of a TGTA from a TGBA

4.1 First Step: Construction of an Intermediate TGTA from a TGBA

Geldenhuys and Hansen [6] have shown how to convert a Büchi Automaton (BA) into a Testing Automaton (TA) by first converting the BA into an automaton with valuations on the states (called a State-Labeled Büchi Automaton (SLBA)), and then converting this SLBA into an intermediate form of TA by computing the difference between the labels of the source and destination states of each transition.

The first step of the TGTA construction is similar to the first step of the TA construction [6, 2]. We construct an intermediate TGTA from a TGBA by moving labels to states, and labeling each transition by the set difference between the labels of its source and destination states. While doing so, we keep the generalized acceptance conditions on the transitions. The next proposition implements this first step.


Extending Testing Automata to All LTL 203

[Figure 4 diagrams: (a) TGBA for ϕ = Xp ∧ FGp. (b) Intermediate TGTA obtained by Property 2. (c) TGTA after simplifications by Property 3. (d) TGTA after bisimulation.]

Fig. 4. TGTA obtained after various steps while translating the TGBA representing ϕ = Xp ∧ FGp into a TGTA with F = {●}

Property 2 (Converting a TGBA into an intermediate TGTA). Given any TGBA G = ⟨QG, IG, δG, F⟩ over the alphabet Σ, let us build the TGTA T = ⟨QT, IT, UT, δT, F⟩ with QT = QG × Σ, IT = IG × Σ and

(i) ∀(q, ℓ) ∈ IT, UT((q, ℓ)) = {ℓ}
(ii) ∀(q, ℓ) ∈ QT, ∀(q′, ℓ′) ∈ QT, ((q, ℓ), ℓ⊕ℓ′, F, (q′, ℓ′)) ∈ δT ⟺ ((q, ℓ, F, q′) ∈ δG)

Then L(G) = L(T). (The proof of Property 2 is given in [2].)
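Property 2 can be sketched operationally as follows, assuming valuations are encoded as bitmasks so that ℓ⊕ℓ′ is an XOR (the function name and tuple encoding are illustrative, not from the paper):

```python
def tgba_to_intermediate_tgta(states, init, delta, sigma):
    """Property 2 as code: every TGBA transition (q, l, F, q2) yields,
    for each valuation l2, a TGTA transition from (q, l) to (q2, l2)
    labeled by the changeset l ^ l2 (XOR of bitmask valuations)."""
    t_states = {(q, l) for q in states for l in sigma}
    t_init = {(q, l) for q in init for l in sigma}
    u = {(q, l): {l} for (q, l) in t_init}          # U((q, l)) = {l}
    t_delta = set()
    for (q, l, F, q2) in delta:
        for l2 in sigma:
            t_delta.add(((q, l), l ^ l2, F, (q2, l2)))
    return t_states, t_init, u, t_delta
```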



An example of an intermediate TGTA is shown in Figure 4b. It is the result of applying the construction of Property 2 to the TGBA for ϕ = Xp ∧ FGp (shown in Figure 4a). The next property shows how to remove the useless stuttering transitions (labeled by ∅) in a TGTA.

4.2 Second Step: Elimination of Useless Stuttering-Transitions (∅)

In the following, we say that a language L is stutter-invariant if the number of successive repetitions of any letter of a word σ ∈ L does not affect the membership of σ in L [5]. In other words, L is stutter-invariant iff for any finite sequence u ∈ Σ*, any letter ℓ ∈ Σ, and any infinite sequence v ∈ Σω, we have uℓv ∈ L ⟺ uℓℓv ∈ L.
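The definition can be illustrated by a small helper that collapses consecutive repetitions of a letter; a language is stutter-invariant precisely when membership is insensitive to such repetitions (an illustrative sketch, not from the paper):

```python
def destutter(word):
    """Collapse consecutive repetitions of a letter in a finite word."""
    out = []
    for letter in word:
        if not out or out[-1] != letter:
            out.append(letter)
    return out

# u l v and u l l v have the same destuttered form, as in the definition:
assert destutter(["u", "l", "v"]) == destutter(["u", "l", "l", "v"])
```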

We begin by defining the concept of the language recognized starting from a state in a TGTA. This definition will be very useful for the formalization of the second step of the TGTA construction presented below.

Definition 5 (L(T, q)). Given a TGTA T and a state q of T, we say that an infinite word σ = ℓ0ℓ1ℓ2… ∈ Σω is accepted by T starting from the state q if there exists an infinite run r = (q, ℓ0⊕ℓ1, F0, q1)(q1, ℓ1⊕ℓ2, F1, q2)(q2, ℓ2⊕ℓ3, F2, q3)… ∈ δω where ∀f ∈ F, ∀i ∈ ℕ, ∃j ≥ i, f ∈ Fj (each acceptance condition is visited infinitely often). The language L(T, q) ⊆ Σω is the set of infinite words accepted by T starting from the state q.

In the following, we will exploit the fact that the language recognized starting from certain states of a TGTA can be stutter-invariant, even though the overall TGTA language (recognized from the initial states) is not. For example, in the intermediate TGTA shown in Figure 4b, the language recognized starting from the two initial states (labeled by the formula ϕ = Xp ∧ FGp) is not stutter-invariant. However, the languages recognized starting from the other states (labeled by the formulas (p ∧ FGp), (FGp) and (Gp)) are stutter-invariant. Indeed, as in the original TGBA G, in an intermediate TGTA T obtained by Property 2, if the LTL formula labeling a state q is stutter-invariant, then the language recognized starting from this state q is also stutter-invariant (Figure 4b). This can easily be deduced from the proof [2] of Property 2.

The next property allows us to simplify the intermediate TGTA by removing the useless stuttering transitions, and thus to obtain the final TGTA. The intuition behind this simplification is illustrated in Figure 5a: in a TGTA T, suppose the language recognized starting from a state q0 is stutter-invariant and q0 can reach an accepting stuttering-cycle by following only stuttering transitions. In the context of TA, we would have to declare q0 a livelock-accepting state. For TGTA, we replace the accepting stuttering-cycle by adding a self-loop labeled by all acceptance conditions on qn; the predecessors of q0 are then connected to qn as in Figure 5b. In the last step of the following construction, for each state q such that L(T, q) is stutter-invariant, we add a stuttering self-loop to q and we remove all stuttering transitions from q to other states. Figure 4c shows how the automaton from Figure 4b is simplified.

Property 3 (Elimination of useless stuttering transitions to build a TGTA). Given a TGTA T = ⟨Q, I, U, δ, F⟩. By combining the first three of the following operations,



[Figure 5 diagrams: (a) TGTA T before reduction of stuttering transitions (∅); L(T, q0) is stutter-invariant and a chain of stuttering transitions leads from q0 to an accepting stuttering-cycle through qn. (b) T after reduction of stuttering transitions (∅): the predecessors of q0 are connected to qn, which carries an accepting stuttering self-loop.]

Fig. 5. Elimination of useless stuttering transitions in TGTA

we can remove the useless stuttering transitions in this TGTA (Figure 5). The fourth operation can be performed along the way for further (classical) simplifications.

1. If Q ⊆ Q is an SCC such that for any state q ∈ Q, L(T, q) is stutter-invariant, and any two states q, q′ ∈ Q can be connected using a sequence of stuttering transitions (q, ∅, F0, r1)(r1, ∅, F1, r2)⋯(rn, ∅, Fn, q′) ∈ δ* with F0 ∪ F1 ∪ ⋯ ∪ Fn = F, then we can add an accepting stuttering self-loop (q, ∅, F, q) on each state q ∈ Q. I.e., the TGTA T′ = ⟨Q, I, U, δ ∪ {(q, ∅, F, q) | q ∈ Q}, F⟩ is such that L(T′) = L(T). Let us call such a component Q an accepting Stuttering-SCC.

2. Let q0 be a state of T such that L(T, q0) is stutter-invariant. If there exists an accepting Stuttering-SCC Q and a sequence of stuttering transitions (q0, ∅, F1, q1)(q1, ∅, F2, q2)⋯(qn−1, ∅, Fn, qn) ∈ δ* such that qn ∈ Q and q0, q1, …, qn−1 ∉ Q (as shown in Figure 5a), then:

– For any transition (q, k, F, q0) ∈ δ going to q0 (with (q, k, F, qn) ∉ δ), the TGTA T″ = ⟨Q, I, U, δ ∪ {(q, k, F, qn)}, F⟩ is such that L(T″) = L(T) (Figure 5b).
– If q0 ∈ I, the TGTA T″ = ⟨Q, I ∪ {qn}, U″, δ, F⟩ with ∀q ≠ qn, U″(q) = U(q) and U″(qn) = U(qn) ∪ U(q0), is such that L(T″) = L(T).

3. Let T† = ⟨Q, I†, U†, δ†, F⟩ be the TGTA obtained after repeating the previous two operations as much as possible (i.e., T† contains all the transitions and initial states that can be added by the above two operations). Then, we add a non-accepting stuttering self-loop (q, ∅, ∅, q) to any state q that did not have an accepting stuttering self-loop and such that L(T, q) is stutter-invariant. We also remove all stuttering transitions from q that are not self-loops, since stuttering can be captured by self-loops after the previous two operations. After this last reduction of stuttering transitions, we obtain the final TGTA (Figure 4c).
More formally, the TGTA T‴ = ⟨Q, I†, U†, δ‴, F⟩ with δ‴ = {(q, k, F, q′) ∈ δ† | L(T, q) is not stutter-invariant} ∪ {(q, k, F, q′) ∈ δ† | k ≠ ∅ ∨ (q = q′ ∧ F is the full set of acceptance conditions)} ∪ {(q, ∅, ∅, q) | L(T, q) is stutter-invariant ∧ q has no accepting stuttering self-loop in δ†} is such that L(T‴) = L(T†) = L(T).

4. Any state from which one cannot reach a Büchi-accepting cycle can be removed from the automaton without changing its language.
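The δ‴ construction of item 3 can be sketched as a filter over δ†, under an assumed encoding where stuttering transitions carry the empty changeset (function and parameter names are illustrative, not from the paper):

```python
def third_operation(delta_dagger, stutter_inv, all_acc):
    """Filter of the third operation: keep transitions from states whose
    language is not stutter-invariant, keep non-stuttering transitions
    and accepting stuttering self-loops, and give every remaining
    stutter-invariant state a non-accepting stuttering self-loop."""
    keep = set()
    states = ({q for (q, _k, _F, _q2) in delta_dagger}
              | {q2 for (_q, _k, _F, q2) in delta_dagger})
    for (q, k, F, q2) in delta_dagger:
        if not stutter_inv(q):
            keep.add((q, k, F, q2))            # first set of the union
        elif k != frozenset() or (q == q2 and F == all_acc):
            keep.add((q, k, F, q2))            # non-stuttering, or accepting self-loop
    for q in states:
        if stutter_inv(q) and (q, frozenset(), all_acc, q) not in delta_dagger:
            keep.add((q, frozenset(), frozenset(), q))  # non-accepting self-loop
    return keep
```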

The proof of Property 3 is similar to the proof of the second step of the TGTA construction given in [2].

Figure 4c shows how the TGTA from Figure 4b is simplified by the above Property 3.



Similar to the TA construction [2], the resulting TGTA can be further simplified by merging bisimilar states (two states q and q′ are bisimilar if the automaton T accepts the same infinite words starting from either of these states, i.e., L(T, q) = L(T, q′)). This optimization can be achieved using any algorithm based on partition refinement, as for Büchi automata, taking {F ∩ G, F \ G, G \ F, Q \ (F ∪ G)} as the initial partition and taking into account the acceptance conditions of the outgoing transitions. The final TGTA obtained after all these steps is shown in Figure 4d.
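A naive version of such a partition refinement loop can be sketched as follows (an illustrative encoding, not from the paper: `succ_sig` abstracts the outgoing changesets, acceptance conditions, and successor blocks of a state):

```python
def partition_refine(states, succ_sig):
    """Iteratively split blocks of states until states in the same block
    have identical outgoing signatures (a naive partition refinement)."""
    blocks = {q: 0 for q in states}
    while True:
        sigs = {q: (blocks[q], succ_sig(q, blocks)) for q in states}
        renum, new = {}, {}
        for q in states:
            new[q] = renum.setdefault(sigs[q], len(renum))
        if new == blocks:
            return blocks
        blocks = new

# Example: states 0 and 1 have the same self-loop signature, state 2 differs.
delta = {(0, "k", frozenset(), 0), (1, "k", frozenset(), 1), (2, "x", frozenset(), 2)}
def sig(q, blocks):
    return frozenset((k, F, blocks[q2]) for (p, k, F, q2) in delta if p == q)
blocks = partition_refine([0, 1, 2], sig)
assert blocks[0] == blocks[1] != blocks[2]
```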

As for the other variants of ω-automata, the automata-theoretic approach using TGTA involves two important operations: the construction of a TGTA T recognizing the negation of the LTL property ϕ, and the emptiness check of the product (K ⊗ T) of the Kripke structure K with T.

Definition 6 (Product using TGTA). For a Kripke structure K = ⟨S, S0, R, l⟩ and a TGTA T = ⟨Q, I, U, δ, F⟩, the product K ⊗ T is a TGTA ⟨S⊗, I⊗, U⊗, δ⊗, F⊗⟩ where:

– S⊗ = S × Q,
– I⊗ = {(s, q) ∈ S0 × I | l(s) ∈ U(q)},
– ∀(s, q) ∈ I⊗, U⊗((s, q)) = {l(s)},
– δ⊗ = {((s, q), k, F, (s′, q′)) | (s, s′) ∈ R, (q, k, F, q′) ∈ δ, k = (l(s) ⊕ l(s′))},
– F⊗ = F.
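Definition 6 can be sketched directly, under an assumed encoding with valuations as bitmasks (names and tuple shapes are illustrative, not Spot's API; U⊗ is omitted for brevity):

```python
def tgta_product(kripke, tgta):
    """Sketch of the product: the Kripke structure is (S, S0, R, lab)
    with lab a dict mapping states to valuations (bitmasks); the TGTA is
    (Q, I, U, delta) with U a dict mapping initial states to sets of
    valuations.  Transitions synchronize on equal changesets."""
    S, S0, R, lab = kripke
    Q, I, U, delta = tgta
    prod_init = {(s, q) for s in S0 for q in I if lab[s] in U[q]}
    prod_delta = set()
    for (s, s2) in R:
        k = lab[s] ^ lab[s2]                 # changeset between Kripke states
        for (q, kk, F, q2) in delta:
            if kk == k:
                prod_delta.add(((s, q), k, F, (s2, q2)))
    return prod_init, prod_delta
```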

Property 4. We have L(K ⊗ T) = L(K) ∩ L(T) by construction.

Since the product of a TGTA with a Kripke structure is a TGTA, we only need an emptiness check algorithm for TGTA. A TGTA can be seen as a TGBA whose transitions are labeled by changesets instead of valuations of atomic propositions. When checking a TGBA for emptiness, we are looking for an accepting cycle that is reachable from an initial state. When checking a TGTA for emptiness, we are looking for exactly the same thing. Therefore, because emptiness check algorithms do not look at transition labels, the same emptiness check algorithm used for the product using TGBA can also be used for the product using TGTA.
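The paragraph above can be illustrated with a naive SCC-based emptiness check that never inspects transition labels (a sketch, not the algorithm used in Spot):

```python
def is_empty(init, delta, all_acc):
    """The language is non-empty iff some SCC reachable from an initial
    state carries, on its internal transitions, every acceptance
    condition.  Changesets/valuations are never inspected, which is why
    the same code serves both TGBA and TGTA products."""
    succ = {}
    for (q, _k, F, q2) in delta:
        succ.setdefault(q, []).append((F, q2))
    index, low, onstk, stack, sccs = {}, {}, set(), [], []
    counter = [0]

    def dfs(v):                      # Tarjan's SCC algorithm (recursive sketch)
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        onstk.add(v)
        for (_F, w) in succ.get(v, []):
            if w not in index:
                dfs(w)
                low[v] = min(low[v], low[w])
            elif w in onstk:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = set()
            while True:
                w = stack.pop()
                onstk.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for q in init:
        if q not in index:
            dfs(q)
    for scc in sccs:
        acc, has_cycle = set(), False
        for q in scc:
            for (F, q2) in succ.get(q, []):
                if q2 in scc:
                    has_cycle = True
                    acc |= set(F)
        if has_cycle and acc >= set(all_acc):
            return False             # reachable accepting cycle found
    return True
```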

5 Experimental Evaluation of TGTA

In order to evaluate the TGTA approach against the TGBA and BA approaches, an experiment was conducted under the same conditions as our previous work [1], i.e., within the same CheckPN tool on top of Spot [12], and using the same benchmark inputs (formulas and models) used in the experimental comparison [1] of BA, TGBA and TA. The models are from the Petri net literature [11]; we selected two instances of each of the following models: the Flexible Manufacturing System (4/5), the Kanban system (4/5), the Peterson algorithm (4/5), the slotted-ring system (6/5), the dining philosophers (9/10) and the Round-robin mutex (14/15). We also used two models from actual case studies: PolyORB [10] and MAPK [9]. For each selected model instance, we generated 200 verified formulas (no counterexample in the product) and 200 violated formulas (a counterexample exists): 100 random (length 15) and 100 weak-fairness [1] (length 30) formulas for each of the two cases. Since generated formulas are very often trivial to verify (the emptiness check needs to explore only a handful of states), we selected only those formulas requiring more than one second of CPU time for the emptiness check in all approaches.



5.1 Implementation

Figure 6 shows the building blocks we used to implement the three approaches. The automaton used to represent the property to check has to be synchronized with a Kripke structure representing the model. Depending on the kind of automaton, this synchronous product is implemented differently. The TGBA and BA approaches can share the same product implementation. The TGTA approach requires a dedicated product computation. The TGBA, BA, and TGTA approaches share the same emptiness check.

[Figure 6 diagram: the LTL formula is translated by LTL2TGBA, then optionally by TGBA2BA or TGBA2TGTA; the resulting automaton is synchronized with the Kripke structure (classic synchronous product for TGBA/BA, dedicated product for TGTA); a shared emptiness check returns TRUE or a counterexample.]

Fig. 6. The experiment's architecture in Spot. Three command-line switches control which one of the approaches is used to verify an LTL formula on a Kripke structure. The new components required by the TGTA approach are outlined in gray.

5.2 Results

Figure 7 compares the sizes of the product automata (in terms of number of states) and Figure 8 compares the number of visited transitions when running the emptiness check, plotting TGTA against BA and TGBA. This gives an idea of their relative performance. Indeed, in order to protect the results against the influence of various optimizations, implementation tricks, and the central processor and memory architecture, Geldenhuys and Hansen [6] found that the number of states gives a reliable indication of the memory required and, similarly, the number of transitions a reliable indication of the time consumption. Each point of the scatter plots corresponds to one of the 5600 evaluated formulas (2800 violated, with a counterexample, shown as black circles, and 2800 verified, with no counterexample, shown as green crosses). Each point below the diagonal is in favor of TGTA, while the others are in favor of the other approach. Axes are displayed using a logarithmic scale. All these experiments were run on a 64-bit Linux system with an Intel(R) 64-bit Xeon(R) @ 2.00GHz and 10GB of RAM.

5.3 Discussion

On verified properties (green crosses), the results are very straightforward to interpret. In the scatter plots of Figure 7, the cases where the TGTA approach is better than the BA and TGBA approaches appear as green crosses below the diagonal. In these cases,



[Figures 7 and 8: log-log scatter plots (1E+05 to 1E+08 on both axes) of TGTA against BA and against TGBA, with violated formulas as black circles and verified formulas as green crosses.]

Fig. 7. Size of products (number of states) using TGTA against BA and TGBA

Fig. 8. Performance (number of transitions explored by the emptiness check) using TGTA against BA and TGBA

the TGTA approach is a clear improvement because the product automata are smaller using TGTA. The same result is observed in the scatter plots of Figure 8: when looking at the number of transitions explored by the emptiness check, the TGTA approach also outperforms the TGBA and BA approaches for verified properties.

On violated properties (black circles), for the number of transitions explored by the emptiness check, it is difficult to interpret the scatter plots of Figure 8, because the emptiness check is an on-the-fly algorithm: it stops as soon as it finds a counterexample, without exploring the entire product. Thus, for violated properties, the exploration order of the non-deterministic transitions of TGBA, BA and TGTA changes the number of transitions explored in the product before a counterexample is found.

However, if we analyze the scatter plots of Figure 7, we observe that the TGTA approach produces the smallest products. This allows the TGTA approach to seek a counterexample in a smaller product, and therefore to have a better chance of finding it faster.



6 Conclusion

In previous work [2], we have shown that Transition-based Generalized Testing Automata (TGTA) are a way to improve the model checking approach when verifying stutter-invariant properties. In this work, we propose a construction of TGTA that allows checking any LTL property (stutter-invariant or not). This TGTA is constructed in two steps. The first one builds an intermediate TGTA from a TGBA (Transition-based Generalized Büchi Automaton). The second step transforms the intermediate TGTA into a TGTA by removing the useless stuttering transitions that are not self-loops (this reduction does not need the restriction to stutter-invariant properties, as in our previous work [2]).

The constructed TGTA combines advantages observed on both Testing Automata (TA) and TGBA:

– From TA, it reuses the labeling of transitions with changesets and the elimination of useless stuttering transitions, but without requiring a second pass in the emptiness check of the product.

– From TGBA, it inherits the use of generalized acceptance conditions on transitions.

TGTA have been implemented easily in Spot, because only two new algorithms are

required: the conversion of a TGBA into a TGTA, and a new definition of the product between a TGTA and a Kripke structure.

We have run benchmarks to compare TGTA against BA and TGBA. Experiments showed that TGTA produce the smallest product automata, and that TGTA therefore outperform BA and TGBA when no counterexample is found in these products (i.e., the property is satisfied); the approaches are comparable when the property is violated, because in this case the on-the-fly algorithm stops as soon as it finds a counterexample, without exploring the entire product.

We conclude that there is nothing to lose by using TGTA to verify any LTL property, since they are always at least as good as BA and TGBA, and we believe that TGTA are better thanks to the elimination of the useless stuttering transitions during the TGTA construction.

As future work, one idea would be to provide a direct translation from LTL to TGTA, without the intermediate TGBA step. We believe a tableau construction such as the one of Couvreur [3] could easily be adapted to produce TGTA. Another important optimization would be to build the TGTA on the fly during the construction of the synchronous product, especially when the number of atomic propositions (AP) is very large, because this may lead to building a TGTA with a large number of unnecessary initial states that are never synchronized with the initial state(s) of the Kripke structure.

References

1. Ben Salem, A.E., Duret-Lutz, A., Kordon, F.: Generalized Büchi automata versus testing automata for model checking. In: Proc. of SUMo 2011, vol. 726, pp. 65–79. CEUR (2011)

2. Ben Salem, A.-E., Duret-Lutz, A., Kordon, F.: Model checking using generalized testing automata. In: Jensen, K., van der Aalst, W.M., Ajmone Marsan, M., Franceschinis, G., Kleijn, J., Kristensen, L.M. (eds.) ToPNoC VI. LNCS, vol. 7400, pp. 94–122. Springer, Heidelberg (2012)



3. Couvreur, J.-M.: On-the-fly verification of linear temporal logic. In: Wing, J.M., Woodcock, J. (eds.) FM 1999. LNCS, vol. 1708, pp. 253–271. Springer, Heidelberg (1999)

4. Duret-Lutz, A., Poitrenaud, D.: SPOT: an extensible model checking library using transition-based generalized Büchi automata. In: Proc. of MASCOTS 2004, pp. 76–83. IEEE Computer Society Press (2004)

5. Etessami, K.: Stutter-invariant languages, ω-automata, and temporal logic. In: Halbwachs, N., Peled, D.A. (eds.) CAV 1999. LNCS, vol. 1633, pp. 236–248. Springer, Heidelberg (1999)

6. Geldenhuys, J., Hansen, H.: Larger automata and less work for LTL model checking. In: Valmari, A. (ed.) SPIN 2006. LNCS, vol. 3925, pp. 53–70. Springer, Heidelberg (2006)

7. Giannakopoulou, D., Lerda, F.: From states to transitions: Improving translation of LTL formulæ to Büchi automata. In: Peled, D.A., Vardi, M.Y. (eds.) FORTE 2002. LNCS, vol. 2529, pp. 308–326. Springer, Heidelberg (2002)

8. Hansen, H., Penczek, W., Valmari, A.: Stuttering-insensitive automata for on-the-fly detection of livelock properties. In: Proc. of FMICS 2002. ENTCS, vol. 66(2). Elsevier (2002)

9. Heiner, M., Gilbert, D., Donaldson, R.: Petri nets for systems and synthetic biology. In: Bernardo, M., Degano, P., Zavattaro, G. (eds.) SFM 2008. LNCS, vol. 5016, pp. 215–264. Springer, Heidelberg (2008)

10. Hugues, J., Thierry-Mieg, Y., Kordon, F., Pautet, L., Baarir, S., Vergnaud, T.: On the formal verification of middleware behavioral properties. In: Proc. of FMICS 2004. ENTCS, vol. 133, pp. 139–157. Elsevier (2004)

11. Kordon, F., et al.: Report on the model checking contest at Petri Nets 2011. In: Jensen, K., van der Aalst, W.M., Ajmone Marsan, M., Franceschinis, G., Kleijn, J., Kristensen, L.M. (eds.) ToPNoC VI. LNCS, vol. 7400, pp. 169–196. Springer, Heidelberg (2012)

12. MoVe/LRDE: The Spot home page, http://spot.lip6.fr (2014)

13. Peled, D., Wilke, T.: Stutter-invariant temporal properties are expressible without the next-time operator. Information Processing Letters 63(5), 243–246 (1997)

14. Sebastiani, R., Tonetta, S.: "More deterministic" vs. "smaller" Büchi automata for efficient LTL model checking. In: Geist, D., Tronci, E. (eds.) CHARME 2003. LNCS, vol. 2860, pp. 126–140. Springer, Heidelberg (2003)

15. Tauriainen, H.: Automata and Linear Temporal Logic: Translation with Transition-based Acceptance. PhD thesis, Helsinki University of Technology, Espoo, Finland (September 2006)

16. Vardi, M.Y.: Automata-theoretic model checking revisited. In: Cook, B., Podelski, A. (eds.) VMCAI 2007. LNCS, vol. 4349, pp. 137–150. Springer, Heidelberg (2007)


Efficient Verification Techniques


Simple Isolation for an Actor Abstract Machine

Benoit Claudel1, Quentin Sabah2, and Jean-Bernard Stefani3(✉)

1 The MathWorks, Grenoble, France
2 Mentor Graphics, Grenoble, France
3 INRIA Grenoble-Rhone-Alpes, Valbonne, France

Abstract. The actor model is an old but compelling concurrent programming model in this age of multicore architectures and distributed services. In this paper we study an as yet unexplored region of the actor design space in the context of concurrent object-oriented programming. Specifically, we show that a purely run-time, annotation-free approach to actor state isolation with reference passing of arbitrary object graphs is perfectly viable. In addition, we show, via a formal proof using the Coq proof assistant, that our approach indeed enforces actor isolation.

1 Introduction

Motivations. The actor model of concurrency [1], where isolated sequential threads of execution communicate via buffered asynchronous message passing, is an attractive alternative to the model of concurrency adopted e.g. for Java, based on threads communicating via shared memory. The actor model is both more congruent to the constraints of increasingly distributed hardware architectures – be they local, as in multicore chips, or global, as in the world-wide web – and more adapted to the construction of long-lived dynamic systems, including dealing with hardware and software faults, or supporting dynamic update and reconfiguration, as illustrated by the Erlang system [2]. Because of this, we have seen in recent years renewed interest in implementing the actor model, be that at the level of experimental operating systems, as in e.g. Singularity [9], or in language libraries, as in e.g. Java [24] and Scala [13].

When combining the actor model with an object-oriented programming model, two key questions to consider are the exact semantics of message passing and its efficient implementation, in particular on multiprocessor architectures with shared physical memory. To be efficient, an implementation of message passing on a shared-memory architecture ought to use data transfer by reference, where the only data exchanged is a pointer to the part of the memory that contains the message. However, with data transfer by reference, enforcing the share-nothing semantics of actors becomes problematic: once an arbitrary memory reference is exchanged between sender and receiver, how do you ensure the sender can no longer access the referenced data? Usual responses to this question typically involve restricting the shape of messages, and controlling references (usually through a reference uniqueness scheme [19]) by various means, including run-time support, type systems and other static analyses, as in Singularity [9], Kilim [24], Scala actors [14], and SOTER [21].

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 213–227, 2015. DOI: 10.1007/978-3-319-19195-9_14


214 B. Claudel et al.

Contributions. In this paper, we study a point in the actor model design space which, despite its simplicity, has never, to our knowledge, been explored before. It features a very simple programming model that places no restriction on the shape and type of messages, and does not require special types or annotations for references, yet still enforces the share-nothing semantics of the actor model. Specifically, we introduce an actor abstract machine, called Siaam. Siaam is layered on top of a sequential object-oriented abstract machine, has actors running concurrently using a shared heap, and enforces strict actor isolation by means of run-time barriers that prevent an actor from accessing objects that belong to a different actor. The contributions of this paper can be summarized as follows. We formally specify the Siaam model, building on the Jinja specification of a Java-like sequential language [18]. We formally prove, using the Coq proof assistant, the strong isolation property of the Siaam model. We describe our implementation of the Siaam model as a modified Jikes RVM [16]. We present a novel static analysis, based on a combination of points-to, alias and liveness analyses, which is used both for improving the run-time performance of Siaam programs and for providing useful debugging support for programmers. Finally, we evaluate the performance of our implementation and of our static analysis.

Outline. The paper is organized as follows. Section 2 presents the Siaam machine and its formal specification. Section 3 presents the formal proof of its isolation property. Section 4 describes the implementation of the Siaam machine. Section 5 presents the Siaam static analysis. Section 6 presents an evaluation of the Siaam implementation and of the Siaam analysis. Section 7 discusses related work and concludes the paper. Because of space limitations, we present only some highlights of the different developments. Interested readers can find all the details in the second author's PhD thesis [22], which is available online along with the Coq code [25].

2 Siaam: Model and Formal Specification

Informal presentation. Siaam combines actors and objects in a programming model with a single shared heap. Actors are instances of a special class. Each actor is equipped with at least one mailbox for queued communication with other actors, and has its own logical thread of execution that runs concurrently with other actor threads. Every object in Siaam belongs to an actor, which we call its owner. An object has a unique owner. Each actor is its own owner. At any point in time, the ownership relation forms a partition of the set of objects. A newly created object has its owner set to that of the actor of the creating thread.

Siaam places absolutely no restriction on the references between objects, including actors. In particular, objects with different owners may reference each other. Siaam also places no constraint on what can be exchanged via messages: the contents of a message can be an arbitrary object graph, defined as the graph of objects reachable (following object references in object fields) from a root object specified when sending a message. Message passing in Siaam has a zero-copy semantics, meaning that the object graph of a message is not copied from


Simple Isolation for an Actor Abstract Machine 215

[Figure 1 diagrams: (a) each actor (in gray) owns the objects inside the same dotted convex hull; directed edges are heap references. (b) the objects 1, 2, 3 after their transfer from actor a to actor b.]

Fig. 1. Ownership and ownership transfer in Siaam

the sender actor to the receiver actor; only the reference to the root object of a message is communicated. An actor is only allowed to send objects it owns¹, and it cannot send itself as part of a message content.

Figure 1 illustrates ownership and ownership transfer in Siaam. On the left side (a) is a configuration of the heap and the ownership relation, where each actor, shown in gray, owns the objects that are part of the same dotted convex hull. Directed edges are heap references. On the right side (b), the objects 1, 2, 3 have been transferred from a to b, and object 1 has been attached to the data structure maintained in b's local state. The reference from a to 1 has been preserved, but actor a is no longer allowed to access the fields of 1, 2, 3.

To ensure isolation, Siaam enforces the following invariant: an object o (in fact an executing thread) can only access fields of an object that has the same owner as o; any attempt to access the fields of an object with a different owner than the caller raises a run-time exception. To enforce this invariant, message exchange in Siaam involves changing the owner of all objects in a message contents graph twice: when a message is enqueued in a receiver mailbox, the owner of the objects in the message contents is changed atomically to a null owner ID that is never assigned to any actor; when the message is dequeued by the receiver actor, the owner of the objects in the message contents is changed atomically to the receiver actor. This scheme prevents pathological situations where an object passed in a message m could be sent in another message m′ by the receiver actor without the latter having dequeued (and hence actually received) message m. Since Siaam does not modify object references in any way, the sender actor can still have references to objects that have been sent, but any attempt by this sender actor to access them will raise an exception.
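The enqueue/dequeue ownership switch just described can be made concrete with a small executable model. The following Python sketch is ours, not Siaam's API (all names are invented for illustration): a heap maps object ids to fields, an ownership table maps object ids to owners, and send/receive flip the owner of a message's reachable graph through the in-transit null state.

```python
class OwnerMismatch(Exception):
    """Raised on any field access to an object the caller does not own."""

class World:
    """Toy global state: heap of objects, ownership table, mailboxes.
    Object ids are plain strings; a field value that is an object id
    counts as a reference."""
    def __init__(self):
        self.owner = {}       # object id -> owning actor, or None (in transit)
        self.fields = {}      # object id -> {field name: value}
        self.mailboxes = {}   # actor -> queue of message root ids

    def reachable(self, root):
        """Object graph of a message: everything reachable from root."""
        seen, stack = set(), [root]
        while stack:
            o = stack.pop()
            if o in seen:
                continue
            seen.add(o)
            stack.extend(v for v in self.fields.get(o, {}).values()
                         if v in self.owner)
        return seen

    def read_field(self, actor, obj, field):
        if self.owner[obj] != actor:       # the owner check
            raise OwnerMismatch(obj)
        return self.fields[obj][field]

    def send(self, sender, receiver, root):
        graph = self.reachable(root)
        if any(self.owner[o] != sender for o in graph):
            raise OwnerMismatch(root)      # sender must own the whole graph
        for o in graph:
            self.owner[o] = None           # enqueue: owned by nobody
        self.mailboxes.setdefault(receiver, []).append(root)

    def receive(self, receiver):
        root = self.mailboxes[receiver].pop(0)
        for o in self.reachable(root):
            self.owner[o] = receiver       # dequeue: receiver becomes owner
        return root
```

Calling read_field from the sender after send raises OwnerMismatch, mirroring the invariant above; after receive, the receiver owns the whole graph.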

1 Siaam enforces the constraint that all objects reachable from a message root object have the same owner: the sending actor. If the constraint is not met, sending the message fails. However, this constraint, which makes for a simple design, is just a design option. An alternative would be to consider that a message's contents consist of all the objects reachable from the root object which have the sending actor as their owner. This alternate semantics would not change the actual mechanics of the model or the strong isolation enforced by it.


216 B. Claudel et al.

Read
    hp s a = Some (C, fs)        fs(F, D) = Some v
    --------------------------------------------------------
    P, w ⊢ 〈a.F{D}, s〉 −OwnerCheck a True→ 〈Val v, s〉

ReadX
    P, w ⊢ 〈a.F{D}, s〉 −OwnerCheck a False→ 〈Throw OwnerMismatch, s〉

Global
    acs s w = Some x        P, w ⊢ 〈x, shp s〉 −wa→ 〈x′, h′〉        ok_act P s w wa
    upd_act P s w wa = (xs′, ws′, ms′, _)        s′ = (xs′[w ↦ x′], ws′, ms′, h′)
    -----------------------------------------------------------------------------
    P ⊢ s → s′

Fig. 2. Siaam operational semantics: sample rules

Siaam: model and formal specification. The formal specification of the Siaam model defines an operational semantics for the Siaam language, in the form of a reduction semantics. The Siaam language is a Java-like language, for its sequential part, extended with special classes with native methods corresponding to operations of the actor model, e.g. sending and receiving messages. The semantics is organized in two layers, the single-actor semantics and the global semantics. The single-actor semantics deals with evolutions of individual actors, and reduces actor-local state. The global semantics maintains a global state not directly accessible from the single-actor semantics. In particular, the effect of reading or updating object fields by actors belongs to the single-actor semantics, but whether it is allowed is controlled by the global semantics. Communications are handled by the global semantics.

The single-actor semantics extends the Jinja formal specification in HOL of the reduction semantics of a (purely sequential) Java-like language [18]². Jinja gives a reduction semantics for its Java-like language via judgments of the form P ⊢ 〈e, (lv, h)〉 → 〈e′, (lv′, h′)〉, which means that in presence of program P (a list of class declarations), expression e with a set of local variables lv and a heap h reduces to expression e′ with local variables lv′ and heap h′.

We extend Jinja judgments for our single-actor semantics to take the form P, w ⊢ 〈e, (lv, h)〉 −wa→ 〈e′, (lv′, h′)〉, where 〈e, lv〉 corresponds to the local actor state, h is the global heap, w is the identifier of the current actor (owner), and wa is the actor action requested by the reduction. Actor actions embody the Siaam model per se. They include creating new objects (with their initial owner), including actors and mailboxes, checking the owner of an object, and sending and receiving messages. For instance, successfully accessing an object field is governed by rule Read in Figure 2. Jinja objects are pairs (C, fs) of the object class name C and the field table fs. A field table is a map holding a value for each field of an object, where fields are identified by pairs (F, D) of the field name F and the name D of the declaring class. The premises of rule Read retrieve the object referenced by a from the heap (hp s a = Some (C, fs) – where hp

2 Jinja, as described in [18], only covers a subset of the Java language. It does not have class member qualifiers, interfaces, generics, or concurrency.


is the projection function that retrieves the heap component of a local actor state, and the heap itself is an association table modelled as a function that, given an object reference, returns an object), and the value v held in field F. In the conclusion of rule Read, reading the field F from a returns the value v, with the local state s (local variables and heap) unchanged. The actor action OwnerCheck a True indicates that object a has the current actor as its owner. Apart from the addition of the actor action label, rule Read is directly lifted from the small-step semantics of Jinja in [18]. In the case of field access, the rule Read is naturally complemented with rule ReadX, which raises an exception if the owner check fails, and which is specific to Siaam. Actor actions also include a special silent action, which corresponds to single-actor reductions (including exception handling) that require no access to the global state. Non-silent actor actions are triggered by object creation, object field access, and native calls, i.e. method calls on the special actor and mailbox classes.
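The division of labor between the two layers — the single-actor step proposes an action wa, the global layer gates it with ok_act and applies upd_act — can be mimicked schematically in Python. This is our own simplified encoding, not the HOL development; only owner-check and silent actions are handled.

```python
def ok_act(state, w, wa):
    """Precondition of actor action wa for actor w against global state."""
    if wa[0] == "silent":
        return True
    if wa[0] == "owner_check":
        _, obj, expected = wa
        # OwnerCheck obj expected holds iff (owner of obj == w) == expected
        return (state["ows"].get(obj) == w) == expected
    return False

def upd_act(state, w, wa):
    """Effect of wa on the global state; owner checks and silent actions
    leave the actor table, ownership relation and mailboxes unchanged."""
    return state

def global_step(state, w, actor_step):
    """One global reduction: run actor w's local step, gated by ok_act."""
    x = state["xs"][w]                        # current local state of w
    x2, h2, wa = actor_step(x, state["shp"])  # single-actor layer
    if not ok_act(state, w, wa):
        return None                           # the rule does not apply
    s2 = dict(upd_act(state, w, wa))
    s2["xs"] = dict(state["xs"], **{w: x2})   # xs'[w -> x']
    s2["shp"] = h2                            # heap from the local step
    return s2
```

A failed owner check makes global_step return None, i.e. the global reduction is simply not enabled for that action.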

The global semantics is defined by the rule Global in Figure 2. The judgment, written P ⊢ s → s′, means that in presence of program P, global state s reduces to global state s′. The global state (xs, ws, ms, h) of a Siaam program execution comprises four components: the actor table xs, an ownership relation ws, the mailbox table ms, and a shared heap h. The projection functions acs, ows, mbs, shp return respectively the actor table, the ownership relation, the mailbox table, and the shared heap component of the global state. The actor table associates an actor identifier with an actor local state consisting of a pair 〈e, lv〉 of expression and local variables. The rule Global reduces the global state by applying a single step of the single-actor semantics for actor w. In the premises of the rule, the shared heap shp s and the current local state x (expression and local variables) for w are retrieved from the global state. The actor can reduce to x′ with new shared heap h′ and perform the action wa. ok_act tests the actor action precondition against s. If it is satisfied, upd_act applies the effects of wa to the global state, yielding the new tuple of state components (xs′, ws′, ms′, _) where the heap is left unchanged. The new state s′ is assembled from the new mailbox table, the new ownership relation, the new heap from the single-actor reduction, and the new actor table where the state for actor w is updated with its new local state x′. We illustrate the effect of actor actions in the next section.

3 Siaam: Proof of Isolation

The key property we expect the Siaam model to uphold is the strong isolation (or share-nothing) property of the actor model, meaning actors can only exchange information via message passing. We have formalized this property and proved it using the Coq proof assistant (v8.4) [8]. We present in this section some key elements of the formalization and proof, using excerpts from the Coq code. The formalization uses an abstraction of the operational semantics presented in the previous section. Specifically, we abstract away from the single-actor semantics. The local state of an actor is abstracted as being just a table of local variables (no expression), which may change in obvious ways: adding or removing a local


variable, changing the value held by a local variable. The formalization (which we call Abstract Siaam) is thus a generalization of the Siaam operational semantics.

Abstract Siaam: Types. The key data structure in Abstract Siaam is the configuration, defined as an abstraction of the global state in the previous section. A configuration conf is a tuple comprising an actor table, an ownership relation, a mailbox table and a shared heap. In Coq:

Record conf : Type := mkcf {
  acs : actors;
  ows : owners;
  mbs : mailboxes;
  shp : heap
}.

Actor table, ownership relation, mailbox table and heap are all defined as simple association tables, i.e. lists of pairs 〈i, d〉 of identifiers i and data d:

Definition actors := table aid locals.
Definition actor := prod aid locals.
Definition owners := table addr (option aid).
Definition mailboxes := table mid mbox.
Definition heap := table addr object.

Identifiers aid, addr, and mid correspond to actor identifiers, object references, and mailbox identifiers, respectively. The data locals is a table of local variables (with identifier type vid), an actor is just a pair associating an actor identifier with a table of local variables, and a mailbox mbox is a list of messages associated with an actor identifier (the actor receiving messages via the mailbox):

Definition locals := table vid value.
Definition message := prod msgid addr.
Definition queue := list message.
Record mbox : Type := mkmb { own : aid; msgs : queue }.

A message is just a pair consisting of a message identifier and a reference to a root object. A value can be either the null value (vnull), the mark value (vmark), an integer (vnat), a boolean (vbool), an object reference, an actor id or a mailbox id. The special mark value is simply a distinct value used to formalize the isolation property.

Abstract Siaam: Transition rules. Evolutions of a Siaam system are modeled in Abstract Siaam as transitions between configurations, which are in turn governed by transition rules. Each transition rule in Abstract Siaam corresponds to an instance of the Global rule in the Siaam operational semantics, specialized for dealing with a given actor action. For instance, the rule governing field access, which abstracts the global semantics reduction picking the OwnerCheck a True action offered by a Read reduction of the single-actor semantics (cf. Figure 2) carrying the identifier of actor e, and accessing field f of object o referenced by a, is defined as follows:

Inductive redfr : conf → aid → conf → Prop :=
| redfr_step : ∀ (c1 c2 : conf) (e : aid) (l1 l2 : locals) (i j : vid)
               (v w : value) (a : addr) (o : object) (f : fid),
    set_In (e,l1) (acs c1) → set_In (i,w) l1 → set_In (j,vadd a) l1 →
    set_In (a,o) (shp c1) → set_In (f,v) o → set_In (a, Some e) (ows c1) →
    v_compat w v → l2 = up_locals i v l1 →
    c2 = mkcf (up_actors e l2 (acs c1)) (ows c1) (mbs c1) (shp c1) →
    c1 =fr e ⇒ c2
where " t '=fr' a '⇒' t' " := (redfr t a t').

The conclusion of the rule, c1 =fr e ⇒ c2, states that configuration c1 can evolve into configuration c2 by actor e doing a field access fr. The premises of the rule are the obvious ones: e must designate an actor of c1; the table l1 of local


variables of actor e must have two local variables i and j, one holding a reference a to the accessed object (set_In (j,vadd a) l1), the other some value w (set_In (i,w) l1) compatible with that read in the accessed object field (v_compat w v); a must point to an object o in the heap of c1 (set_In (a,o) (shp c1)), which must have a field f, holding some value v (set_In (f,v) o); and actor e must be the owner of object o for the field access to succeed (set_In (a, Some e) (ows c1)). The final configuration c2 has the same ownership relation, mailbox table and shared heap as the initial one c1, but its actor table is updated with the new local state of actor e (c2 = mkcf (up_actors e l2 (acs c1)) (ows c1) (mbs c1) (shp c1)), where variable i now holds the read value v (l2 = up_locals i v l1).
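Transcribed into executable form, the premises of redfr become a chain of lookups. The sketch below is our transcription, not the Coq development: Python dicts stand in for the association tables, and the v_compat premise is elided.

```python
def redfr_step(c1, e, i, j, f):
    """Apply rule redfr to configuration c1 = (acs, ows, mbs, shp) if its
    premises hold for actor e, variables i and j, field f; else None."""
    acs, ows, mbs, shp = c1
    l1 = acs.get(e)                      # set_In (e,l1) (acs c1)
    if l1 is None or i not in l1 or j not in l1:
        return None
    a = l1[j]                            # set_In (j, vadd a) l1
    o = shp.get(a)                       # set_In (a,o) (shp c1)
    if o is None or f not in o:
        return None
    if ows.get(a) != e:                  # set_In (a, Some e) (ows c1)
        return None                      # owner check fails: rule disabled
    l2 = dict(l1, **{i: o[f]})           # l2 = up_locals i v l1
    acs2 = dict(acs, **{e: l2})          # up_actors e l2 (acs c1)
    return (acs2, ows, mbs, shp)         # ownership, mailboxes, heap unchanged
```

As in the Coq rule, only the actor table changes; ownership, mailboxes and heap are copied through unchanged.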

Another key instance of the Abstract Siaam transition rules is the rule presiding over message send:

Inductive redsnd : conf → aid → conf → Prop :=
| redsnd_step : ∀ (c1 c2 : conf) (e : aid) (a : addr) (l : locals)
                (ms : msgid) (mi : mid) (mb mb' : mbox) (owns : owners),
    set_In (e,l) (acs c1) →
    set_In (vadd a) (values_from_locals l) →
    trans_owner_check (shp c1) (ows c1) (Some e) a = true →
    set_In (mi,mb) (mbs c1) →
    not (set_In ms (msgids_from_mbox mb)) →
    Some owns = trans_owner_update (shp c1) (ows c1) None a →
    mb' = mkmb (own mb) ((ms,a)::(msgs mb)) →
    c2 = mkcf (acs c1) owns (up_mboxes mi mb' (mbs c1)) (shp c1) →
    c1 =snd e ⇒ c2
where " t '=snd' a '⇒' t' " := (redsnd t a t').

The conclusion of the rule, c1 =snd e ⇒ c2, states that configuration c1 can evolve into configuration c2 by actor e doing a message send snd. The premises of the rule expect the owner of the objects reachable from the root object (referenced by a) of the message to be e; this is checked with function trans_owner_check: trans_owner_check (shp c1) (ows c1) (Some e) a = true. When placing the message in the mailbox mb of the receiver actor, the owner of all the reachable objects is set to None; this is done with function trans_owner_update: Some owns = trans_owner_update (shp c1) (ows c1) None a. Placing the message with id ms and root object referenced by a in the mailbox is just a matter of queuing it in the mailbox message queue: mb' = mkmb (own mb) ((ms,a)::(msgs mb)).
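Both helpers have a direct reading as graph traversals. A Python sketch, under the simplifying assumption that a field value is a reference exactly when it is a key of the heap (our encoding, not the Coq functions):

```python
def reachable(heap, root):
    """Addresses reachable from root by following object fields."""
    seen, stack = set(), [root]
    while stack:
        a = stack.pop()
        if a in seen or a not in heap:
            continue
        seen.add(a)
        stack.extend(v for v in heap[a].values() if v in heap)
    return seen

def trans_owner_check(heap, owners, who, root):
    """True iff every object reachable from root is owned by who."""
    return all(owners.get(a) == who for a in reachable(heap, root))

def trans_owner_update(heap, owners, new_owner, root):
    """New ownership table with the whole message graph re-owned."""
    graph = reachable(heap, root)
    return {a: (new_owner if a in graph else w) for a, w in owners.items()}
```

Sending runs trans_owner_check then trans_owner_update with new_owner = None; dequeuing runs trans_owner_update again with the receiver as the new owner.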

The transition rules of Abstract Siaam also include a rule governing silent transitions, i.e. transitions that abstract from local actor state reductions that elicit no change on other elements of a configuration (shared heap, mailboxes, ownership relation, other actors). The latter are just modelled as transitions arbitrarily modifying a given actor's local variables, with no acquisition of object references that were previously unknown to the actor.

Isolation proof. The Siaam model ensures that the only means of information transfer between actors is message exchange. We can formalize this isolation property using mark values. We call an actor a clean if its local variables do not hold a mark, and if all objects reachable from a and belonging to a hold no mark in their fields. An object o is reachable from an actor a if a has a local variable holding o's reference, or if, recursively, an object o' is reachable from a which holds o's reference in one of its fields. The isolation property can now be


characterized as follows: a clean actor in any configuration remains clean during an evolution of the configuration if it never receives any message. In Coq:

Theorem ac_isolation : ∀ (c1 c2 : conf) (a1 a2 : actor),
  wf_conf c1 → set_In a1 (acs c1) → ac_clean (shp c1) a1 (ows c1) →
  c1 =@ (fst a1) ⇒∗ c2 → Some a2 = lookup_actor (acs c2) (fst a1) →
  ac_clean (shp c2) a2 (ows c2).

The theorem states that, in any well-formed configuration c1, an actor a1 which is clean (ac_clean (shp c1) a1 (ows c1)) remains clean in any evolution of c1 that does not involve a reception by a1. This is expressed as c1 =@ (fst a1) ⇒∗ c2 and ac_clean (shp c2) a2 (ows c2), where fst a1 just extracts the identifier of actor a1, and a2 is the descendant of actor a1 in the evolution (it has the same actor identifier as a1: Some a2 = lookup_actor (acs c2) (fst a1)). The relation =@ a ⇒∗, which represents evolutions not involving a message receipt by actor a, is defined as the reflexive and transitive closure of the relation =@ a ⇒, a one-step evolution not involving a receipt by a. The isolation theorem is really about transfer of information between actors, the mark denoting a distinguished bit of information held by an actor. At first sight it appears to say nothing about ownership, but notice that a clean actor a is one such that all objects that belong to a are clean, i.e. hold no mark in their fields. Thus a corollary of the theorem is that, in the absence of message receipt, actor a cannot acquire an object from another actor (if that were the case, transferring the ownership of an unclean object would result in actor a becoming unclean).
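The cleanliness predicate at the heart of the theorem can be phrased over a toy encoding of configurations (our approximation of the Coq ac_clean, with Python dicts for tables and a distinguished MARK object for the mark value):

```python
MARK = object()   # the distinguished mark value (vmark)

def reachable_from_actor(heap, local_vars):
    """Objects reachable from an actor's local variables through fields."""
    seen = set()
    stack = [v for v in local_vars.values() if v in heap]
    while stack:
        a = stack.pop()
        if a in seen:
            continue
        seen.add(a)
        stack.extend(v for v in heap[a].values() if v in heap)
    return seen

def ac_clean(heap, aid, local_vars, owners):
    """Clean: actor holds no mark, and no reachable object it owns holds one."""
    if MARK in local_vars.values():
        return False
    return not any(owners.get(a) == aid and MARK in heap[a].values()
                   for a in reachable_from_actor(heap, local_vars))
```

Note that, as in the definition above, a marked object that the actor merely references but does not own leaves the actor clean.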

A well-formed configuration is a configuration where each object in the heap has a single owner, all identifiers are indeed unique, where mailboxes hold messages sent by actors in the actor table, and all objects referenced by actors (directly or indirectly, through references in object fields) belong to the heap. To prove theorem ac_isolation, we first prove that well-formedness is an invariant in any configuration evolution:

Theorem red_preserves_wf : ∀ (c1 c2 : conf), c1 ⇒ c2 → wf_conf c1 → wf_conf c2.

The theorem red_preserves_wf is proved by induction on the derivation of the assertion c1 ⇒ c2. To prove the different cases, we rely mostly on simple reasoning with sets, and on a few lemmas characterizing the correctness of the table manipulation functions, of the trans_owner_check function, which verifies that all objects reachable from the root object in a message have the same owner, and of the trans_owner_update function, which updates the ownership table during message transfers. Using the invariance of well-formedness, theorem ac_isolation is proved by induction on the derivation of the assertion c1 =@ (fst a1) ⇒∗ c2. To prove the different cases, we rely on several lemmas dealing with reachability and cleanliness.

The last theorem, live_mark, is a liveness property that shows that the isolation property is not vacuously true. It states that marks can flow between actors during execution. In Coq:

Theorem live_mark : ∃ (c1 c2 : conf) (ac1 ac2 : actor),
  c1 ⇒∗ c2 ∧ set_In ac1 (acs c1) ∧ ac_clean (shp c1) ac1 (ows c1)
  ∧ Some ac2 = lookup_actor (acs c2) (fst ac1) ∧ ac_mark (shp c2) ac2 (ows c2).


4 Siaam: Implementation

We have implemented the Siaam abstract machine as a modified Jikes RVM [16]. Specifically, we extended the Jikes RVM bytecode and added a set of core primitives supporting the ownership machinery, which are used to build trusted APIs implementing particular programming models. The Siaam programming model is available as a trusted API that implements the formal specification presented in Section 2. On top of the Siaam programming model, we implemented the ActorFoundry API as described in [17], which we used for some of our evaluation. Finally, we implemented a trusted event-based actor programming model on top of the core primitives, which can dispatch thousands of lightweight actors over pools of threads, and makes it possible to build high-level APIs similar to Kilim with Siaam's ownership-based isolation.

Bytecode. The standard Java VM instructions are extended to include: a modified object creation instruction New, which creates an object on the heap and sets its owner to that of the creating thread; modified field read and write access instructions getfield and putfield with owner check; and modified array load and store instructions aload and astore with owner check.

Virtual Machine Core. Each heap object and each thread of execution have an owner reference, which points to an object implementing the special Owner interface. A thread can only access objects belonging to the Owner instance referenced by its owner reference. Core primitives include operations to retrieve and set the owner of the current thread, to retrieve the owner of an object, and to withdraw and acquire ownership over objects reachable from a given root object. In the Jikes RVM, objects are represented in memory by a sequence of bytes organized into a leading header section and the trailing scalar object's fields or array's length and elements. We extended the object header with two reference-sized words, OWNER and LINK. The OWNER word stores a reference to the object owner, whereas the LINK word is introduced to optimize the performance of object graph traversal operations.

Contexts. Since the Jikes RVM is fully written in Java, threads seamlessly execute application bytecode and the virtual machine's internal bytecode. We have introduced a notion of execution context in the VM to avoid subjecting VM bytecode to the owner-checking mechanisms. A method in the application context is instrumented with all the isolation mechanisms, whereas methods in the VM context are not. If a method can be in both contexts, it is compiled in two versions, one for each context. When a method is invoked, the context of the caller is used to deduce which version of the method should be called. The decision is taken statically when the invoke instruction is compiled.

Ownership Transfer. Central to the performance of the Siaam virtual machine are the operations implementing ownership transfer, withdraw and acquire. In the formal specification, owner-checking an object graph and updating the owner of objects in the graph is done atomically (see e.g. the message send transition rule in Section 3). However, implementing the withdraw operation as an atomic operation would be costly. Furthermore, an implementation of ownership transfer


Fig. 3. Owner-check elimination decision diagram

must minimize graph traversals. We have implemented an iterative algorithm for withdraw that chains objects that are part of a message through their LINK word. The list thus obtained is maintained as long as the message exists, so that the acquire operation can efficiently traverse the objects of the message. The algorithm leverages specialized techniques, initially introduced in the Jikes RVM to optimize the reference scanning phase during garbage collection [10], to efficiently enumerate the reference offsets for a given base object.
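Our reconstruction of the idea (a sketch, not the Jikes RVM code): withdraw traverses the message graph once, owner-checking each object and threading visited objects onto a singly linked list through their LINK word, so that acquire can later re-own the whole message without re-traversing the object graph.

```python
class Obj:
    """A heap object with the two extra header words of the Siaam VM."""
    def __init__(self, owner, refs=()):
        self.owner = owner        # OWNER header word
        self.link = None          # LINK header word
        self.refs = list(refs)    # reference fields

def iter_chain(head):
    while head is not None:
        yield head
        head = head.link

def withdraw(root, sender):
    """Owner-check the message graph, chain it through LINK words, then
    null all owners. Raises before any owner is changed if the sender
    does not own the whole graph."""
    head, stack, seen = None, [root], set()
    while stack:
        o = stack.pop()
        if id(o) in seen:
            continue
        if o.owner is not sender:
            raise RuntimeError("owner mismatch: send fails")
        seen.add(id(o))
        o.link, head = head, o    # push o onto the message's LINK chain
        stack.extend(o.refs)
    for o in iter_chain(head):
        o.owner = None            # withdrawn: in transit
    return head                   # kept with the message until acquire

def acquire(head, receiver):
    """Re-own every chained object without re-traversing the graph."""
    for o in iter_chain(head):
        o.owner = receiver
```

Because owners are only nulled after the traversal completes, a failed owner check leaves all owners untouched, approximating the atomicity of the specification.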

5 Siaam: Static Analysis

We describe in this section some elements of the Siaam static analysis used to optimize away owner checks on field read and write instructions. The analysis is based on the observation that an instruction accessing an object's field does not need an owner check if the object accessed belongs to the executing actor. Any object that has been allocated or received by an actor, and has not been passed to another actor ever since, belongs to that actor. The algorithm returns an under-approximation of the owner-check removal opportunities in the analyzed program.

Considering a point in the program, we say an object (or a reference to an object) is safe when it always belongs to the actor executing that point, regardless of the execution history. Conversely, we say an object is unsafe when it sometimes does not belong to the current actor. We extend the denomination to instructions that would respectively access a safe object or an unsafe object. A safe instruction will never throw an OwnerException, whereas an unsafe instruction might.

Analysis. The Siaam analysis is structured in two phases. First, the safe dynamic references analysis employs a local must-alias analysis to propagate owner-checked references along the control-flow edges. It is optionally refined with an inter-procedural pass propagating safe references through method arguments and returned values. Then, the safe objects analysis tracks safe runtime objects along call-graph and method control-flow edges by combining an inter-procedural points-to analysis and an intra-procedural live variable analysis. Both phases depend on the transferred abstract objects analysis, which propagates unsafe abstract objects from the communication sites down the call-graph edges.

By combining results from the two phases, the algorithm computes conservative approximations of unsafe runtime objects and safe variables at any control-flow


point in the program. The owner-check elimination for a given instruction s accessing the reference in variable V proceeds as illustrated in Figure 3. First, the unsafe objects analysis is queried to know whether V may point to an unsafe runtime object at s. If not, the instruction can skip the owner check for V. Otherwise, the safe references analysis is consulted to know whether the reference in variable V is considered safe at s, thanks to dominating owner checks of the reference in the control-flow graph.
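The decision diagram of Figure 3 reduces to one boolean query per access site. In the sketch below, may_point_to_unsafe and is_safe_reference stand for the query interfaces of the two analyses (the names are ours, for illustration):

```python
def can_eliminate_owner_check(site, var, may_point_to_unsafe, is_safe_reference):
    """Figure 3: the check on var at site is removable iff the accessed
    object is proven safe, or the reference itself is proven safe."""
    if not may_point_to_unsafe(site, var):   # safe objects analysis
        return True
    return is_safe_reference(site, var)      # safe references analysis
```

Both analyses are conservative, so a False answer from both means the owner check must be kept, never that it is known to fail.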

The Siaam analysis makes use of several standard intra- and inter-procedural program analyses: a call-graph representation, an inter-procedural points-to analysis, an intra-procedural liveness analysis, and an intra-procedural must-alias analysis. Each of these analyses exists in many different variants offering various tradeoffs between accuracy of results and algorithmic complexity, but regardless of the implementation, they provide a rather standard querying interface. Our analysis is implemented as a framework that can make use of different instances of these analyses.

Implementations. The intra-procedural safe references analysis, which is part of the Siaam analysis, has been included in the Jikes RVM optimizing compiler. Despite its relative simplicity and its very conservative assumptions, it efficiently eliminates about half of the owner-check barriers introduced by application bytecode and the standard library for the benchmarks we have tested (see Section 6). The safe references analysis and the safe objects analysis from the Siaam analysis have been implemented in their inter-procedural versions as an offline tool written in Java. The tool interfaces with the Soot analysis framework [23], which provides the program representation, the call graph, the inter-procedural pointer analysis, the must-alias analysis and the liveness analysis we use.

Programming Assistant. The Siaam programming model is quite simple, requiring no programmer annotation, and placing no constraint on messages. However, it may generate hard-to-understand runtime exceptions due to failed owner checks. The Siaam analysis is therefore used as the basis of a programming assistant that helps application developers understand why a given program statement is potentially unsafe and may throw an ownership exception at runtime. The Siaam analysis guarantees that there will be no false negatives, but to limit the number of false positives it is necessary to use a combination of the most accurate standard (points-to, must-alias and liveness) analyses. The programming assistant tracks a program P backward, starting from an unsafe statement s with a non-empty set of unverified ownership preconditions (as given by the ok_act function in Section 2), trying to find every program point that may explain why a given precondition is not met at s. For each unsatisfied precondition, the assistant can exhibit the shortest execution paths that result in an exception being raised at s. An ownership precondition may comprise requirements that a variable or an object be safe. When a requirement is not satisfied before s, it raises one or several questions of the form "why is x unsafe before s?". The assistant traverses the control flow backward, looks for immediate answers at each statement reached, and propagates the questions further if necessary, until all questions have found an answer.


6 Evaluation

Siaam Implementation. We present first an evaluation of the overall performance of our Siaam implementation based on the DaCapo benchmark suite [3], representative of various real industrial workloads. These applications use regular Java. The bytecode is instrumented with Siaam's owner checks and all threads share the same owner. With this benchmark we measure the overhead of the dynamic ownership machinery, encompassing the object owner initialization and the owner-checking barriers, plus the allocation and collection costs linked to the object header modifications.

We benchmarked five configurations. no siaam is the reference Jikes RVM without modifications. opt designates the modified Jikes RVM with JIT owner-check elimination. noopt designates the modified Jikes RVM without JIT owner-check elimination. sopt is the same as opt, but the application bytecode has safety annotations issued by the offline Siaam static analysis tool. Finally, soptnc is the same as sopt but without owner-check barriers for the standard library bytecode. We executed the 2010-MR2 version of the DaCapo benchmarks, with two workloads, default and large. Table 1 shows the results for the DaCapo 2010-MR2 runs. The results were obtained using a machine equipped with an Intel Xeon W3520 2.67 GHz processor. The execution time results are normalized with respect to the no siaam configuration for each program of the suite: lower is better. The geometric mean summarizes the typical overhead for each configuration. The opt figures in Table 1 show that the modified virtual machine including JIT barrier elimination has an overhead of about 30% compared to the non-isolated reference. The JIT elimination improves the performance by about 20% compared to the noopt configuration. When the bytecode is annotated by the whole-program static analysis, the performance is 10% to 20% better than with the runtime-only optimization. However, the DaCapo benchmarks use the Java reflection API to load classes and invoke methods, meaning our static analysis was not able to process all the bytecode with the best precision. We can expect better results for other programs for which the call graph can be entirely built with precision. Moreover, we used for the benchmarks a context-insensitive, flow-insensitive pointer analysis, meaning the Siaam analysis could be even more accurate with sensitive standard analyses. Finally, the standard library bytecode is not annotated by our tool; it is only treated by the JIT elimination optimization. The soptnc configuration provides a good indication of what the full optimization would yield. The results show an overhead (w.r.t. the application) with a mean of 15%, which can be considered an acceptable price to pay for the simplicity of developing isolated programs with Siaam.

The Siaam virtual machine consumes more heap space than the unmodified Jikes RVM, due to the duplication of the standard library used by both the virtual machine and the application, and because of the two words we add to every object's header. The average object size in the DaCapo benchmarks is 62 bytes, so our implementation increases it by 13%. We have measured a 13% increase in the full garbage collection time, which accounts for the tracing of the two additional references and the memory compaction.
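The 13% size figure follows from the two added header words; a quick arithmetic check, under the assumption of 32-bit reference words (4 bytes each) in the Jikes RVM build:

```python
# Quick check of the reported object-size overhead; assumes 32-bit
# reference words in the Jikes RVM build (4 bytes per header word).
avg_object_size = 62                 # bytes, DaCapo average reported above
extra = 2 * 4                        # OWNER and LINK header words
overhead = extra / avg_object_size
print(f"header overhead: {overhead:.1%}")
```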

Simple Isolation for an Actor Abstract Machine 225

Table 1. DaCapo benchmarks

workload                    default                          large
benchmark        opt   noopt  sopt  soptnc      opt   noopt  sopt  soptnc
antlr            1.20  1.32   1.09  1.11        1.21  1.33   1.11  1.10
bloat            1.24  1.41   1.17  1.05        1.40  1.59   1.14  0.96
hsqldb           1.24  1.36   1.09  1.06        1.45  1.60   1.29  1.10
jython           1.52  1.73   1.41  1.24        1.45  1.70   1.45  1.15
luindex          1.25  1.46   1.09  1.05        1.25  1.43   1.09  1.03
lusearch         1.31  1.45   1.17  1.18        1.33  1.49   1.21  1.21
pmd              1.32  1.37   1.29  1.24        1.34  1.44   1.39  1.30
xalan            1.24  1.39   1.33  1.35        1.29  1.41   1.38  1.40
geometric mean   1.28  1.43   1.20  1.16        1.34  1.50   1.25  1.15

Table 2. ActorFoundry analyses

                       Owner-check               Message passing           Time    Ratio to ideal
                       Sites  Ideal  Siaam       Sites  Ideal  Siaam       (sec)   Siaam   SOTER
                              safe   safe               safe   safe
ActorFoundry
  threadring            24     24     24           8      8      8         0.1     100%    100%
  (1) concurrent        99     99     99          15     12     10         0.1      98%     58%
  (2) copymessages      89     89     84          22     20     15         0.1      91%     56%
  performance           54     54     54          14     14     14         0.2     100%     86%
  pingpong              28     28     28          13     13     13         0.1     100%     89%
  refmessages            4      4      4           6      6      6         0.1     100%     67%
Benchmarks
  chameneos             75     75     75          10     10     10         0.1     100%     33%
  fibonacci             46     46     46          13     13     13         0.2     100%     86%
  leader                50     50     50          10     10     10         0.1     100%     17%
  philosophers          35     35     35          10     10     10         0.2     100%    100%
  pi                    31     31     31           8      8      8         0.1     100%     67%
  shortestpath         147    147    147          34     34     34         1.2     100%     88%
Synthetic
  quicksortCopy         24     24     24           8      8      8         0.2     100%    100%
  (3) quicksortCopy2    56     56     51          10     10      5         0.1      85%     75%
Real world
  clownfish            245    245    245         102    102    102         2.2     100%     68%
  (4) rainbow fish     143    143    143          83     82     82         0.2      99%     99%
  swordfish            181    181    181         136    136    136         1.7     100%     97%

Siaam Analysis. We compare the efficiency of the Siaam whole-program analysis to the SOTER algorithm, which is closest to ours. Table 2 contains the results that we obtained for the benchmarks reported in [21], which use ActorFoundry programs. For each analyzed application we give the total number of owner-checking barriers and the total number of message passing sites in the bytecode. The columns “Ideal safe” show the expected number of safe sites for each criterion. The columns “Siaam safe” give the results obtained with the Siaam analysis. The analysis execution time is given in the third main column. The last column compares the ratio of results to the ideal for both SOTER and Siaam. Our analysis outperforms SOTER significantly. SOTER relies on an inter-procedural liveness analysis and a points-to analysis to infer message passing sites where a by-reference semantics can apply safely. Given an argument ai of a message passing site s in the program, SOTER computes the set of objects passed by ai and the set of objects transitively reachable from the variables live after s. If the intersection of these two sets is empty, SOTER marks ai as eligible for by-reference

226 B. Claudel et al.

argument passing; otherwise it must use the default by-value semantics. The weakness of this pessimistic approach is that, among the live objects, a significant part won’t actually be accessed in the control flow after s. On the other hand, Siaam does care about objects being actually accessed, which is a stronger evidence criterion to incriminate message passing sites. Although Siaam’s algorithm wasn’t designed to optimize out by-value message passing, it is perfectly adapted for that task. For each unsafe instruction detected by the algorithm, there are one or several guilty dominating message passing sites. Our diagnosis algorithm tracks back the application control flow from the unsafe instruction to the incriminated message passing sites. These sites represent a subset of the sites where SOTER cannot optimize out by-value argument passing.
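SOTER's eligibility test described above boils down to a disjointness check over two abstracted object sets. The following is a hedged illustration of that criterion only (the method name and the integer object ids are ours, not SOTER's API):

```java
import java.util.*;

// Sketch of SOTER's pessimistic criterion: an argument of a message-passing
// site may be passed by reference only if the objects it reaches share nothing
// with the objects reachable from variables that are live after the site.
public class SoterCheck {
    static boolean eligibleByReference(Set<Integer> passedObjects,
                                       Set<Integer> liveReachableObjects) {
        // Any overlap forces the default by-value (deep-copy) semantics.
        return Collections.disjoint(passedObjects, liveReachableObjects);
    }
    public static void main(String[] args) {
        System.out.println(eligibleByReference(Set.of(1, 2), Set.of(3)));    // true
        System.out.println(eligibleByReference(Set.of(1, 2), Set.of(2, 3))); // false
    }
}
```

Siaam's analysis is stronger because it asks whether a live object is actually accessed after the site, not merely reachable.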

7 Related Work and Conclusion

Enforcing isolation between different groups of objects, programs or threads in the presence of a shared memory has been much studied in the past two decades. Although we cannot give here a full survey of the state of the art (a more in-depth analysis is available in [22]), we can point out three different kinds of related works: those relying on type annotations to ensure isolation, those relying on run-time mechanisms, and those relying on static analyses.

Much work has been done on controlling aliasing and encapsulation in object-oriented languages and systems, in a concurrent context or not. Much of the work in these areas relies on some sort of reference uniqueness, which eliminates object sharing by making sure that there is only one reference to an object at any time, e.g. [5,14,15,19,20]. All these systems restrict the shape of object graphs or the use of references in some way. In contrast, Siaam makes no such restriction. A number of systems rely on run-time mechanisms for achieving isolation, most using either deep copy or special message heaps for communication, e.g. [7,9,11,12]. Of these, O-Kilim [12], which builds directly on the PhD work of the first author of this paper [6], is the closest to Siaam: it places no constraint on transferred object graphs, but at the expense of a complex programming model and no programmer support, in contrast to Siaam. Finally, several works develop static analyses for efficient concurrency or ownership transfer, e.g. [4,21,24]. Kilim [24] relies in addition on type annotations to ensure tree-shaped messages. The SOTER [21] analysis is closest to the Siaam analysis and has been discussed in the previous section.

With its annotation-free programming model, which places no restriction on object references and message shape, we believe Siaam to be unique compared to other approaches in the literature. In addition, we have not found an equivalent of the formal proof of isolation we have conducted for Siaam. Our evaluations demonstrate that the Siaam approach to isolation is perfectly viable: it suffers only from a limited overhead in performance and memory consumption, and our static analysis can significantly improve the situation. The one drawback of our programming model, raising possibly hard-to-understand runtime exceptions, is greatly alleviated by the use of the Siaam analysis in a programming assistant.

References

1. Agha, G.A.: Actors: A Model of Concurrent Computation in Distributed Systems. The MIT Press (1986)
2. Armstrong, J.: Erlang. Commun. ACM 53(9) (2010)
3. Blackburn, S.M., Garner, R., Hoffman, C., et al.: The DaCapo benchmarks: Java benchmarking development and analysis. In: OOPSLA 2006. ACM (2006)
4. Carlsson, R., Sagonas, K.F., Wilhelmsson, J.: Message analysis for concurrent programs using message passing. ACM Trans. Program. Lang. Syst. 28(4) (2006)
5. Clarke, D., Wrigstad, T.: External uniqueness is unique enough. In: Cardelli, L. (ed.) ECOOP 2003. LNCS, vol. 2743, pp. 176–200. Springer, Heidelberg (2003)
6. Claudel, B.: Mecanismes logiciels de protection memoire. PhD thesis, Universite de Grenoble (2009)
7. Czajkowski, G., Daynes, L.: Multitasking without compromise: A virtual machine evolution. In: OOPSLA 2001. ACM (2001)
8. Coq development team, http://coq.inria.fr
9. Fahndrich, M., Aiken, M., et al.: Language support for fast and reliable message-based communication in Singularity OS. In: 1st EuroSys Conference. ACM (2006)
10. Garner, R.J., Blackburn, S.M., Frampton, D.: A comprehensive evaluation of object scanning techniques. In: 10th ISMM. ACM (2011)
11. Geoffray, N., Thomas, G., Muller, G., et al.: I-JVM: a Java virtual machine for component isolation in OSGi. In: DSN 2009. IEEE (2009)
12. Gruber, O., Boyer, F.: Ownership-based isolation for concurrent actors on multi-core machines. In: Castagna, G. (ed.) ECOOP 2013. LNCS, vol. 7920, pp. 281–301. Springer, Heidelberg (2013)
13. Haller, P., Odersky, M.: Actors that unify threads and events. In: Murphy, A.L., Vitek, J. (eds.) COORDINATION 2007. LNCS, vol. 4467, pp. 171–190. Springer, Heidelberg (2007)
14. Haller, P., Odersky, M.: Capabilities for uniqueness and borrowing. In: D’Hondt, T. (ed.) ECOOP 2010. LNCS, vol. 6183, pp. 354–378. Springer, Heidelberg (2010)
15. Hogg, J.: Islands: aliasing protection in object-oriented languages. SIGPLAN Not. 26(11) (1991)
16. http://jikesrvm.org
17. Karmani, R.K., Shali, A., Agha, G.: Actor frameworks for the JVM platform: a comparative analysis. In: 7th PPPJ. ACM (2009)
18. Klein, G., Nipkow, T.: A machine-checked model for a Java-like language, virtual machine, and compiler. ACM Trans. Program. Lang. Syst. 28(4) (2006)
19. Minsky, N.H.: Towards alias-free pointers. In: Cointe, P. (ed.) ECOOP 1996. LNCS, vol. 1098, pp. 189–209. Springer, Heidelberg (1996)
20. Muller, P., Rudich, A.: Ownership transfer in universe types. SIGPLAN Not. 42(10) (2007)
21. Negara, S., Karmani, R.K., Agha, G.A.: Inferring ownership transfer for efficient message passing. In: 16th PPOPP. ACM (2011)
22. Sabah, Q.: SIAAM: Simple Isolation for an Abstract Actor Machine. PhD thesis, Universite de Grenoble (December 2013)
23. http://sable.github.io/soot/
24. Srinivasan, S., Mycroft, A.: Kilim: Isolation-typed actors for Java. In: Vitek, J. (ed.) ECOOP 2008. LNCS, vol. 5142, pp. 104–128. Springer, Heidelberg (2008)
25. https://team.inria.fr/spades/siaam/

Sliced Path Prefixes: An Effective Method to Enable Refinement Selection

Dirk Beyer, Stefan Löwe, and Philipp Wendler

University of Passau, Passau, [email protected]

Abstract. Automatic software verification relies on constructing, for a given program, an abstract model that is (1) abstract enough to avoid state-space explosion and (2) precise enough to reason about the specification. Counterexample-guided abstraction refinement is a standard technique that suggests to extract information from infeasible error paths, in order to refine the abstract model if it is too imprecise. Existing approaches (including our previous work) do not choose the refinement for a given path systematically. We present a method that generates alternative refinements and allows one to systematically choose a suitable refinement. The method takes as input one given infeasible error path and applies a slicing technique to obtain a set of new error paths that are more abstract than the original error path but still infeasible, each for a different reason. The (more abstract) constraints of the new paths can be passed to a standard refinement procedure, in order to obtain a set of possible refinements, one for each new path. Our technique is completely independent from the abstract domain that is used in the program analysis, and does not rely on a certain proof technique, such as SMT solving. We implemented the new algorithm in the verification framework CPAchecker and made our extension publicly available. The experimental evaluation of our technique indicates that there is a wide range of possibilities on how to refine the abstract model for a given error path, and we demonstrate that the choice of which refinement to apply to the abstract model has a significant impact on the verification effectiveness and efficiency.

1 Introduction

In the field of automatic software verification, abstraction is a well-understood and widely used technique, enabling the successful verification of real-world, industrial programs (cf. [4, 13, 14]). Abstraction makes it possible to omit certain aspects of the concrete semantics that are not necessary to prove or disprove the program’s correctness. This may lead to a massive reduction of a program’s state space, such that verification becomes feasible within reasonable time and resource limits. For example, Slam [5] uses predicate abstraction [18] for creating an abstract model of the software. One of the current research

A preliminary version of this article appeared as technical report [12].

© IFIP International Federation for Information Processing 2015
S. Graf and M. Viswanathan (Eds.): FORTE 2015, LNCS 9039, pp. 228–243, 2015.
DOI: 10.1007/978-3-319-19195-9_15

Sliced Path Prefixes: An Effective Method to Enable Refinement Selection 229

The input program of Fig. 1:

    extern int f(int x);
    int main() {
      int b = 0;
      int i = 0;
      while (1) {
        if (i > 9) break;
        f(i++);
      }
      if (b != 0) {
        if (i != 10) {
          assert(0);
        }
      }
    }

[Fig. 1 also shows two diagrams, the infeasible error path and the two interpolant sequences; only their node labels (true, false, b==0, i==0) are recoverable here.]

Fig. 1. From left to right, the input program, an infeasible error path, and a “good” and a “bad” interpolant sequence for the infeasible error path

directions is to invent techniques to automatically find suitable abstractions. An ideal model is abstract enough to avoid state-space explosion and still contains enough detail to verify the property. Counterexample-guided abstraction refinement (CEGAR) [15] is an automatic technique that starts with a coarse abstraction and iteratively refines an abstract model using infeasible error paths. If the analysis does not find an error path in the abstract model, the analysis terminates with the result true. If the analysis finds an error path, the path is checked for feasibility. If this error path is feasible according to the concrete program semantics, then it represents a bug, and the analysis terminates with the result false. However, if the error path is infeasible, then the abstract model was too coarse. In this case, the infeasible error path can be passed to an interpolation engine, which identifies information that is needed to refine the current abstraction, such that the same infeasible error path is excluded in the next CEGAR iterations. CEGAR is successfully used, for example, by the tools Slam [5], Blast [7], CPAchecker [10], and Ufo [1].

Craig interpolation [16] is a technique that yields, for two contradicting formulas, an interpolant formula that contains less information than the first formula, but is still expressive enough to contradict the second formula. In software verification, interpolation was first used for the domain of predicate abstraction [19], and later for value-analysis domains [11]. Independently of the analysis domain, interpolants for path constraints of infeasible error paths can be used to refine abstract models and to eliminate the infeasible error paths in subsequent CEGAR iterations. In this context, it is important to point out that the choice of interpolants is crucial for the performance of the analysis. Figure 1 gives an example: in this program, the analysis will typically find the shown error path, which is infeasible for two different reasons: both the value of i and the value of b can be used to find a contradiction. In general, it is beneficial for the verifier to track the value of the boolean variable b, and not to track the value of the loop-counter variable i, because the latter has many more possible values,

230 D. Beyer et al.

and tracking it would usually lead to an expensive unrolling of the loop. Instead, if only variable b is tracked, the verifier can conclude the safety of the program without unrolling the loop. Thus, we would like to use for refinement the interpolant sequence shown on the left (with only the boolean variable) and not the right one (with the loop-counter variable). However, interpolation engines typically do not allow to guide the interpolation process towards “good”, or away from “bad”, interpolant sequences. The interpolation engines inherently cannot do a better job here: they do not have access to information such as whether a specific variable is a loop counter and should be avoided in the interpolant. Instead, which interpolant is returned depends solely on the internal algorithms of the interpolation engine. This is especially true if the model checker uses an off-the-shelf interpolation engine, which normally cannot be controlled at such a fine-grained level. In this case, the model checker is stuck with what the interpolation engine returns, be it good or bad for the verification process.

Therefore, we present an approach that allows to guide the interpolation engine to produce different interpolants, without changing the interpolation engine. To achieve this, we extract from one infeasible error path a set of infeasible sliced paths, each infeasible for a different reason. Each of these sliced paths can be used for interpolation, yielding different interpolant sequences that are all expressive enough to eliminate the original infeasible error path. Our approach fits well into CEGAR (with or without lazy abstraction [20]), because only the refinement component needs customization, and the new approach remains compatible with off-the-shelf interpolation engines.

Contributions. We make the following key contributions: (1) we introduce a domain- and analysis-independent method to extract a set of infeasible sliced paths from infeasible error paths, (2) we prove that interpolants for such a sliced path are also interpolants for the original infeasible error path, (3) we explain that (and how) it is possible to obtain, given a set of infeasible sliced paths, different precisions (interpolants) for the same infeasible error path, and that the choice of the precision makes a significant difference for CEGAR, (4) we implement the presented concepts in the open-source framework for software verification CPAchecker, and (5) we show experimentally that the novel approach to obtain different precisions significantly impacts the effectiveness and efficiency.

While we use interpolation to compute the refined precisions, our method is not bound to interpolation: invariant-generation techniques for refinement, such as path invariants [8], can equally benefit from the new possibility of choice.

Related Work. The desire to control which interpolants an interpolation engine produces, and to make the verification process more efficient by finding good interpolants, is not new. Our goal is to contribute a technique that is independent from the abstract domain that a program analysis uses, and independent from specific properties of interpolation engines.

The first work in this direction suggested to control the interpolant strength [17] such that the user can choose between strong and weak interpolants. This approach is unfortunately not implemented in standard interpolation engines. The technique of interpolation abstractions [22], a generalization of term

abstraction [2], can be used to guide solvers to pick good interpolants. This is achieved by extending the concrete interpolation problem by so-called templates (e.g., terms, formulas, uninterpreted functions with free variables) to obtain a more abstract interpolation problem. An interpolant for the abstract interpolation problem is also a solution to the concrete interpolation problem. Suitable interpolants can be chosen using a cost function, because these interpolation abstractions form a lattice. In contrast to interpolation abstractions, our approach does not rely on SMT solving and is independent from the interpolation engine and abstract domain, so it is also applicable to, e.g., value and octagon domains.

Path slicing [21] is a technique that was introduced to reduce the burden of the interpolation engine: before the constraints of the path are given to the interpolation engine, the constraints are weakened by removing facts that are not important for the infeasibility of the error path, i.e., a more abstract error path is constructed. We also make the error path more abstract, but in different directions, to obtain different interpolant sequences, from which we can choose one that yields a suitable abstract model. While path slicing is interested in reducing the run time of the interpolation engine (by omitting some facts), we are interested in reducing the run time of the verification engine (by spending more time on interpolation and selection but creating a better abstract model).

2 Background

Our approach is based on several existing concepts, and in this section we remind the reader of some basic definitions and our previous work in this field [11].

Programs, Control-Flow Automata, States, Paths, Precisions. We restrict the presentation to a simple imperative programming language, where all operations are either assignments or assume operations, and all variables range over integers.1 A program is represented by a control-flow automaton (CFA). A CFA A = (L, l0, G) consists of a set L of program locations, which model the program counter, an initial program location l0 ∈ L, which models the program entry, and a set G ⊆ L × Ops × L of control-flow edges, which model the operations that are executed when control flows from one program location to the next. The set of program variables that occur in operations from Ops is denoted by X. A verification problem P = (A, le) consists of a CFA A, representing the program, and a target program location le ∈ L, which represents the specification, i.e., “the program must not reach location le”.
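The CFA definition A = (L, l0, G) above can be encoded directly. The following is a minimal sketch of ours (it is not CPAchecker's API): locations are integers, operations are strings, and G is a set of labeled edges.

```java
import java.util.*;

// Minimal encoding of a control-flow automaton A = (L, l0, G).
public class Cfa {
    record Edge(int from, String op, int to) {}
    final Set<Integer> locations = new HashSet<>();
    final int initialLocation;
    final Set<Edge> edges = new HashSet<>();
    Cfa(int l0) { initialLocation = l0; locations.add(l0); }
    void addEdge(int from, String op, int to) {
        locations.add(from); locations.add(to);
        edges.add(new Edge(from, op, to));
    }
    // Successor locations reachable in one step (a syntactic walk).
    Set<Integer> successors(int l) {
        Set<Integer> s = new HashSet<>();
        for (Edge e : edges) if (e.from() == l) s.add(e.to());
        return s;
    }
    public static void main(String[] args) {
        Cfa a = new Cfa(0);
        a.addEdge(0, "b := 0", 1);
        a.addEdge(1, "[i > 9]", 2);     // assume operation: then-branch
        a.addEdge(1, "[!(i > 9)]", 3);  // assume operation: else-branch
        System.out.println(a.successors(1)); // {2, 3} in some order
    }
}
```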

A concrete data state of a program is a variable assignment cd : X → Z, which assigns to each program variable an integer value; the set of integer values is denoted as Z. A concrete state of a program is a pair (l, cd), where l ∈ L is a program location and cd is a concrete data state. The set of all concrete states of a program is denoted by C; a subset r ⊆ C is called a region. Each edge g ∈ G defines a labeled transition relation g→ ⊆ C × {g} × C. The complete transition

1 Our implementation is based on CPAchecker, which operates on C programs;non-recursive function calls are supported.

relation → is the union over all control-flow edges: → = ⋃_{g∈G} g→. We write c g→ c′ if (c, g, c′) ∈ →, and c → c′ if there exists a g with c g→ c′.

An abstract data state represents a region of concrete data states, formally defined as an abstract variable assignment. An abstract variable assignment is either a partial function v : X ⇀ Z, mapping variables in its definition range to integer values, or ⊥, which represents no variable assignment (i.e., no value is possible, similar to the predicate false in logic). The special abstract variable assignment ⊤ = {} does not map any variable to a value and is used as the initial abstract variable assignment in a program analysis. Variables that do not occur in the definition range of an abstract variable assignment are either omitted on purpose for abstraction in the analysis, or the analysis is not able to determine a concrete value (e.g., resulting from an uninitialized variable declaration or from an external function call). For two partial functions f and f′, we write f(x) = y for the predicate (x, y) ∈ f, and f(x) = f′(x) for the predicate ∃c : (f(x) = c) ∧ (f′(x) = c). We denote the definition range of a partial function f as def(f) = {x | ∃y : f(x) = y}, and the restriction of a partial function f to a new definition range Y as f|Y = f ∩ (Y × Z). An abstract variable assignment v represents the set [[v]] of all concrete data states cd for which v is valid, formally: [[⊥]] = {} and, for all v ≠ ⊥, [[v]] = {cd | ∀x ∈ def(v) : v(x) = cd(x)}. The abstract variable assignment ⊥ is called contradicting. The implication for abstract variable assignments is defined as follows: v implies v′ (written v ⇒ v′) if v = ⊥, or for all variables x ∈ def(v′) we have v(x) = v′(x). The conjunction for abstract variable assignments v and v′ is defined as:

v ∧ v′ = ⊥ if v = ⊥ or v′ = ⊥ or (∃x ∈ def(v) ∩ def(v′) : ¬(v(x) = v′(x))), and v ∧ v′ = v ∪ v′ otherwise.
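The conjunction just defined can be sketched as an operation on partial maps, with `null` standing in for ⊥ (a toy encoding of ours, not taken from any tool):

```java
import java.util.*;

// Abstract variable assignments as partial maps X -> Z.
// null encodes the contradicting assignment ⊥; the empty map encodes ⊤.
public class AbstractAssignment {
    static Map<String, Integer> conjoin(Map<String, Integer> v, Map<String, Integer> w) {
        if (v == null || w == null) return null;             // ⊥ absorbs everything
        for (String x : v.keySet())
            if (w.containsKey(x) && !v.get(x).equals(w.get(x)))
                return null;                                 // conflicting value for x
        Map<String, Integer> r = new HashMap<>(v);
        r.putAll(w);                                         // otherwise: v ∪ w
        return r;
    }
    public static void main(String[] args) {
        System.out.println(conjoin(Map.of("b", 0), Map.of("i", 10))); // {b=0, i=10} in some order
        System.out.println(conjoin(Map.of("b", 0), Map.of("b", 1)));  // null, i.e., ⊥
    }
}
```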

The semantics of an operation op ∈ Ops is defined by the strongest-post operator SPop(·): given an abstract variable assignment v, SPop(v) represents the set of concrete data states that are reachable from the concrete data states in the set [[v]] by executing op. Formally, given an abstract variable assignment v and an assignment operation x := exp, we have SPx:=exp(v) = ⊥ if v = ⊥, or SPx:=exp(v) = v|X\{x} ∧ vx with

vx = {(x, c)}  if c ∈ Z is the result of the arithmetic evaluation of the expression exp/v, and
vx = {}        otherwise (if exp/v cannot be evaluated),

where exp/v denotes the interpretation of expression exp for the abstract variable assignment v. Given an abstract variable assignment v and an assume operation [p], we have SP[p](v) = ⊥ if v = ⊥ or the predicate p/v is unsatisfiable; otherwise we have SP[p](v) = v ∧ vp, with vp = {(x, c) ∈ (X \ def(v)) × Z | p/v ⇒ (x = c)} and p/v = p ∧ ⋀_{y ∈ def(v)} y = v(y).
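The two strongest-post cases can be sketched for a deliberately restricted operation set (assignments of constants or variable copies, and equality/disequality assumes on a single variable). This toy semantics is ours, not CPAchecker's; `null` encodes ⊥.

```java
import java.util.*;

// Sketch of the strongest-post operator SP_op for a tiny operation language.
public class StrongestPost {
    // SP for "x := exp": drop x, then track it only if exp/v evaluates.
    static Map<String, Integer> assign(Map<String, Integer> v, String x, String exp) {
        if (v == null) return null;                 // SP(⊥) = ⊥
        Map<String, Integer> r = new HashMap<>(v);
        r.remove(x);                                // v restricted to X \ {x}
        Integer c = value(v, exp);
        if (c != null) r.put(x, c);                 // v_x = {(x, c)}; else v_x = {}
        return r;
    }
    // SP for the assume "[x == c]" (or "[x != c]" when negated is true).
    static Map<String, Integer> assumeEq(Map<String, Integer> v, String x, int c, boolean negated) {
        if (v == null) return null;
        Integer known = v.get(x);
        if (known != null) {                        // predicate is decided by v
            boolean holds = negated ? known != c : known == c;
            return holds ? v : null;                // ⊥ if p/v is unsatisfiable
        }
        if (!negated) {                             // p/v implies x = c: record it
            Map<String, Integer> r = new HashMap<>(v);
            r.put(x, c);
            return r;
        }
        return v;                                   // "x != c" implies no single value
    }
    // exp is either an integer literal or a variable name.
    private static Integer value(Map<String, Integer> v, String exp) {
        try { return Integer.valueOf(exp); } catch (NumberFormatException e) { return v.get(exp); }
    }
    public static void main(String[] args) {
        Map<String, Integer> v = assign(new HashMap<>(), "b", "0"); // {b=0}
        System.out.println(assumeEq(v, "b", 0, true)); // null: [b != 0] contradicts b=0
    }
}
```

The `main` method replays the b-contradiction of Fig. 1's error path: after `b := 0`, the assume `[b != 0]` yields ⊥.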

A path σ is a sequence 〈(op1, l1), . . . , (opn, ln)〉 of pairs of an operation and a location. The path σ is called a program path if for every i with 1 ≤ i ≤ n there exists a CFA edge g = (li−1, opi, li) and l0 is the initial program location, i.e., the path σ represents a syntactic walk through the CFA. The result of appending the pair (opn, ln) to a path σ = 〈(op1, l1), . . . , (opm, lm)〉 is defined as σ ∧ (opn, ln) = 〈(op1, l1), . . . , (opm, lm), (opn, ln)〉.

Every path σ = 〈(op1, l1), . . . , (opn, ln)〉 defines a constraint sequence γσ = 〈op1, . . . , opn〉. The conjunction γ ∧ γ′ of two constraint sequences γ = 〈op1, . . . , opn〉 and γ′ = 〈op′1, . . . , op′m〉 is defined as their concatenation, i.e., γ ∧ γ′ = 〈op1, . . . , opn, op′1, . . . , op′m〉; the implication of γ and γ′ (denoted by γ ⇒ γ′) is defined as the implication of their strongest-post assignments SPγ(⊤) ⇒ SPγ′(⊤), and γ is contradicting if SPγ(⊤) = ⊥. The semantics of a path σ = 〈(op1, l1), . . . , (opn, ln)〉 is defined as the successive application of the strongest-post operator to each operation of the corresponding constraint sequence γσ: SPγσ(v) = SPopn(. . . SPop1(v) . . .). The set of concrete program states that result from running a program path σ is represented by the pair (ln, SPγσ(⊤)), where ⊤ is the initial abstract variable assignment. A path σ is feasible if SPγσ(⊤) is not contradicting, i.e., SPγσ(⊤) ≠ ⊥. A concrete state (ln, cdn) is reachable, denoted by (ln, cdn) ∈ Reach, if there exists a feasible program path σ = 〈(op1, l1), . . . , (opn, ln)〉 with cdn ∈ [[SPγσ(⊤)]]. A location l is reachable if there exists a concrete data state cd such that (l, cd) is reachable. A program is safe (the specification is satisfied) if le is not reachable. A program path σ = 〈(op1, l1), . . . , (opn, le)〉, which ends in le, is called an error path.

The precision is a function π : L → 2^Π, where Π depends on the abstract domain that is used by the analysis. It assigns to each program location some analysis-dependent information that defines the level of abstraction of the analysis. For example, if using predicate abstraction, the set Π is a set of predicates over program variables. If using a value domain, the set Π is the set X of program variables, and a precision defines which program variables should be tracked by the analysis at which program location.
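For the value domain, a precision π : L → 2^X and the pointwise extension π(l) := π(l) ∪ Refine(σ)(l) used in Algorithm 1's refinement step can be sketched as follows (all names are ours, not CPAchecker's API):

```java
import java.util.*;

// A value-domain precision: for each location, the set of tracked variables.
public class Precision {
    final Map<Integer, Set<String>> tracked = new HashMap<>();
    Set<String> at(int location) {
        return tracked.getOrDefault(location, Set.of());
    }
    // Pointwise union with the increment returned by a refinement step.
    void extendWith(Map<Integer, Set<String>> increment) {
        for (Map.Entry<Integer, Set<String>> e : increment.entrySet())
            tracked.computeIfAbsent(e.getKey(), k -> new HashSet<>())
                   .addAll(e.getValue());
    }
    public static void main(String[] args) {
        Precision pi = new Precision();
        pi.extendWith(Map.of(9, Set.of("b"))); // one refinement: track b at location 9
        pi.extendWith(Map.of(9, Set.of("i"))); // a later refinement adds i
        System.out.println(pi.at(9)); // [b, i] in some order
    }
}
```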

Counterexample-Guided Abstraction Refinement (CEGAR). CEGAR, a technique for automatic iterative refinement of an abstract model [15], is based on three concepts: (1) a precision, which determines the current level of abstraction, (2) a feasibility check, which decides if an error path (counterexample) is feasible, and (3) a refinement procedure, which takes as input an infeasible error path and extracts a precision to refine the abstract model such that the infeasible error path is eliminated from further exploration. Algorithm 1 shows an instantiation of the CEGAR algorithm. It uses the CPA algorithm [9, 11] for program analysis with dynamic precision adjustment and an abstract domain that is formalized as a configurable program analysis (CPA) with dynamic precision adjustment D. The CPA uses a set E of abstract states and a set L → 2^Π of precisions. The analysis algorithm computes the sets reached and waitlist, which represent the current reachable abstract states with precisions and the frontier, respectively. The analysis algorithm is run first with π0 as a coarse initial precision (usually π0(l) = {} for all l ∈ L). If all program states have been exhaustively checked, indicated by an empty waitlist, and no error was reached, then the CEGAR algorithm terminates and reports true (program is safe). If the CPA algorithm finds an error in the abstract state space, then it stops and returns the yet incomplete sets reached and waitlist. Now the corresponding abstract error path is extracted from the set reached, using the procedure ExtractErrorPath, and passed to the procedure IsFeasible for the feasibility check.

Algorithm 1 CEGAR(D, e0, π0), cf. [11]
Input: a CPA with dynamic precision adjustment D and
       an initial abstract state e0 ∈ E with precision π0 ∈ (L → 2^Π)
Output: verification result true (property holds) or false
Variables: a set reached of elements of E × (L → 2^Π),
           a set waitlist of elements of E × (L → 2^Π), and
           an error path σ = 〈(op1, l1), . . . , (opn, ln)〉
 1: reached := {(e0, π0)}; waitlist := {(e0, π0)}; π := π0
 2: while true do
 3:   (reached, waitlist) := CPA(D, reached, waitlist)
 4:   if waitlist = {} then
 5:     return true
 6:   else
 7:     σ := ExtractErrorPath(reached)
 8:     if IsFeasible(σ) then            // error path is feasible: report bug
 9:       return false
10:     else                             // error path is infeasible: refine and restart
11:       π(l) := π(l) ∪ Refine(σ)(l), for all program locations l
12:       reached := {(e0, π)}; waitlist := {(e0, π)}

Algorithm 2 Refine(σ)
Input: an infeasible error path σ = 〈(op1, l1), . . . , (opn, ln)〉
Output: a precision π ∈ L → 2^Π
Variables: a constraint sequence Γ
 1: Γ := 〈〉
 2: π(l) := {}, for all program locations l
 3: for i := 1 to n − 1 do
 4:   γ+ := 〈opi+1, . . . , opn〉
 5:   Γ := Interpolate(Γ ∧ 〈opi〉, γ+)   // inductive interpolation
 6:   π(li) := ExtractPrecision(Γ)      // create precision based on Γ
 7: return π

If the abstract error path is feasible, meaning there exists a corresponding concrete error path, then this error path represents a violation of the specification and the algorithm terminates, reporting false. If the error path is infeasible, i.e., does not correspond to a concrete program path, then the precision was too coarse and needs to be refined. The refinement step is performed by the procedure Refine (cf. Alg. 2), which returns a precision π that makes the analysis strong enough to exclude the infeasible error path from future state-space explorations. This returned precision is used to extend the current precision of the CPA algorithm, which is started in CEGAR’s next iteration and re-computes the sets reached and waitlist based on the new, refined precision. CEGAR is often used with lazy abstraction [20], so that after refining, instead of the whole state space, only some parts of reached and waitlist are removed and re-explored with the new precision.

Interpolation for Constraint Sequences. An interpolant for two constraint sequences γ− and γ+, such that γ− ∧ γ+ is contradicting, is a constraint

Page 241: Formal Techniques LNCS 9039 - Inria 9039 35th IFIP WG 6.1 International Conference, FORTE 2015 ... that ensures deadlock freedom in the π-calculus to a higher-order language.

Sliced Path Prefixes: An Effective Method to Enable Refinement Selection 235

sequence Γ for which 1) the implication γ− ⇒ Γ holds, 2) the conjunctionΓ ∧γ+ is contradicting, and 3) the interpolant Γ contains in its constraints onlyvariables that occur in both γ− and γ+ [11].
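As a tiny worked example of this definition (an example of ours, not taken from the paper):

```latex
% A two-operation infeasible path split into constraint sequences:
\gamma^- = \langle x := 0 \rangle \qquad \gamma^+ = \langle [x > 0] \rangle
% Their conjunction is contradicting, and
\Gamma = \langle [x = 0] \rangle
% is an interpolant: (1) \gamma^- \Rightarrow \Gamma holds,
% (2) \Gamma \wedge \gamma^+ is contradicting, and
% (3) \Gamma mentions only x, which occurs in both \gamma^- and \gamma^+.
```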

Next, we introduce our novel approach, which extracts from one infeasible error path a set of infeasible sliced path prefixes. Sect. 4 then uses this method to extend the procedure Refine to perform precision extraction on a set of infeasible sliced prefixes, making it possible to select the most suitable precision from several choices.

3 Sliced Prefixes

Infeasible Sliced Prefixes. A CEGAR-based analysis encounters an infeasible error path if the precision is too coarse. An infeasible error path contains at least one assume operation for which the reachability algorithm computes a non-contradicting abstract successor based on the current precision, but computes a contradicting successor if the concrete semantics of the program is used. Every infeasible error path contains at least one such contradicting assume operation, but often, several independently contradicting assume operations exist in an infeasible error path, which leads to the notion of infeasible sliced prefixes: A path φ = 〈(op1, l1), . . . , (opw, lw)〉 is a sliced prefix for a program path σ = 〈(op1, l1), . . . , (opn, ln)〉 if w ≤ n and for all 1 ≤ i ≤ w, we have φ.li = σ.li and (φ.opi = σ.opi or (φ.opi = [true] and σ.opi is an assume operation)), i.e., a sliced prefix results from a program path by omitting pairs of operations and locations from the end, and possibly replacing some assume operations by no-op operations. If a sliced prefix for σ is infeasible, then σ is infeasible.

Extracting Infeasible Sliced Prefixes from an Infeasible Error Path. Algorithm 3 extracts from an infeasible error path a set of infeasible sliced prefixes. The algorithm iterates through the given infeasible error path σ. It incrementally builds a feasible sliced prefix σf that contains all operations from σ seen so far, except contradicting assume operations, which are replaced by no-op operations. Thus, σf is always feasible. For every element (op, l) from the original path σ (iterating in order from the first to the last pair), it is checked

Algorithm 3 ExtractSlicedPrefixes(σ)
Input: an infeasible path σ = 〈(op1, l1), . . . , (opn, ln)〉
Output: a non-empty set Σ = {σ1, . . . , σn} of infeasible sliced prefixes of σ
Variables: a path σf that is always feasible
1: Σ := {}
2: σf := 〈〉
3: for each (op, l) ∈ σ do
4:   if SPσf∧(op,l)(⊤) = ⊥ then
5:     // add σf ∧ (op, l) to the set of infeasible sliced prefixes
6:     Σ := Σ ∪ {σf ∧ (op, l)}
7:     σf := σf ∧ ([true], l) // append no-op
8:   else
9:     σf := σf ∧ (op, l) // append original pair
10: return Σ


236 D. Beyer et al.

(a) Error path (b) Sliced-prefix cascade (c) Sliced prefixes

Fig. 2. From one infeasible error path to a set of infeasible sliced prefixes

whether it contradicts σf, which is the case if the result of the strongest-post operator for the path σf ∧ (op, l) is contradicting (denoted by ⊥). If so, the algorithm has found a new infeasible sliced prefix, which is collected in the set Σ of infeasible sliced prefixes. The feasible sliced prefix σf is extended either by a no-op operation (Line 7) or by the current operation (Line 9). When the algorithm terminates, which is guaranteed because σ is finite, the set Σ contains infeasible sliced prefixes of σ, one for each ‘reason’ of infeasibility. There is always at least one infeasible sliced prefix because σ is infeasible.
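Algorithm 3 can be sketched as follows. The toy path language (constant assignments and comparison assumes) and the concrete-execution stand-in for the strongest-post operator SP are assumptions for illustration, not the paper's symbolic operator:

```python
# Hedged sketch of ExtractSlicedPrefixes (Alg. 3) for a toy path language.

def is_infeasible(path):
    """Return True iff concretely executing `path` hits a failing assume."""
    env = {}
    for op, _loc in path:
        if op.startswith("["):                      # assume "[x cmp c]"
            var, cmp_, val = op[1:-1].split()
            lhs, rhs = env.get(var, 0), int(val)
            ok = {"==": lhs == rhs, "!=": lhs != rhs,
                  "<": lhs < rhs, ">": lhs > rhs}[cmp_]
            if not ok:
                return True                         # contradicting assume
        elif op != "no-op":                         # assignment "x := c"
            var, _, val = op.split()
            env[var] = int(val)
    return False

def extract_sliced_prefixes(sigma):
    prefixes, feasible = [], []
    for op, loc in sigma:
        candidate = feasible + [(op, loc)]
        if is_infeasible(candidate):
            prefixes.append(candidate)              # new infeasible sliced prefix
            feasible.append(("no-op", loc))         # replace assume by no-op
        else:
            feasible.append((op, loc))              # append original pair
    return prefixes

# Infeasible error path with two independently contradicting assumes:
sigma = [("a := 1", "l1"), ("[a > 5]", "l2"),
         ("b := 2", "l3"), ("[b == 7]", "l4")]
print(len(extract_sliced_prefixes(sigma)))          # -> 2
```

On this path, each contradicting assume yields one infeasible sliced prefix, and the second prefix carries a no-op where the first contradiction was sliced away.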

The sliced prefixes that Alg. 3 returns have some interesting characteristics: (1) Each sliced prefix φ starts with the initial operation op1 and ends with an assume operation that contradicts the previous operations of φ, i.e., SPφ(⊤) = ⊥. (2) The i-th sliced prefix, excluding its (final and only) contradicting assume operation and location, is a prefix of the (i + 1)-st sliced prefix. (3) All sliced prefixes differ from a prefix of the original infeasible error path σ only in their no-op operations.

The visualizations in Fig. 2 capture the details of this process. Figure 2a shows the original error path. Nodes represent program locations and edges represent operations between these locations (assignments to variables or assume operations over variables, the latter denoted with brackets). To allow easier distinction, program locations that are followed by assume operations are drawn as diamonds, while other program locations are drawn as squares. Program locations before contradicting assume operations are drawn with a filled background.


The sequence of operations ends in the error state, denoted by le. Figure 2b depicts the cascade-like sliced prefixes that the algorithm encounters during its progress. Figure 2c shows the three infeasible sliced prefixes that Alg. 3 returns for this example.

The refinement procedure can now use any of these infeasible sliced prefixes to create interpolation problems, and is not bound to a single, specific interpolant sequence for the original infeasible error path: a refinement selection from different precisions is now possible. The following proposition states that this is a valid refinement process.

Proposition. Let σ be an infeasible error path and let φ be the i-th infeasible sliced prefix for σ that is extracted by Alg. 3; then all interpolant sequences for φ are also interpolant sequences for σ.

Proof. Let σ = 〈(op1, l1), . . . , (opn, ln)〉 and φ = 〈(op1, l1), . . . , (opw, lw)〉. Let Γφj be the j-th interpolant of an interpolant sequence for φ, i.e., an interpolant for the two constraint sequences γ−φj = 〈op1, . . . , opj〉 and γ+φj = 〈opj+1, . . . , opw〉, with 1 ≤ j < w. Because φ is infeasible, the two constraint sequences γ−φj and γ+φj are contradicting, and therefore, Γφj exists [11]. The interpolant Γφj is also an interpolant for γ−σj = 〈op1, . . . , opj〉 and γ+σj = 〈opj+1, . . . , opn〉 if (1) the implication γ−σj ⇒ Γφj holds, (2) the conjunction Γφj ∧ γ+σj is contradicting, and (3) the interpolant Γφj contains only variables that occur in both γ−σj and γ+σj.

Consider that γ−φj was created from γ−σj by replacing some assume operations by no-op operations, and that γ+φj was created from γ+σj by replacing some assume operations by no-op operations and by removing the operations 〈opw+1, . . . , opn〉 at the end. Thus, neither γ−φj nor γ+φj contains any constraints (except for no-op operations) beyond those of γ−σj and γ+σj, respectively.

Because Γφj is an interpolant for γ−φj and γ+φj, we know that γ−φj ⇒ Γφj holds, and because γ−σj can only be stronger than γ−φj, Claim (1) follows. The conjunction Γφj ∧ γ+φj is contradicting, and γ+σj can only be stronger than γ+φj; thus, Claim (2) holds. Because Γφj references only variables that occur in both γ−φj and γ+φj, which do not contain more variables than γ−σj and γ+σj, respectively, Claim (3) holds.

4 Slice-Based Refinement Selection

Extracting good precisions from the infeasible error paths is key to the CEGAR technique, and the choice of interpolants influences the quality of the precision, and thus, the effectiveness of the analysis algorithm. By using the results introduced in the previous section, the refinement procedure can now be improved by selecting a precision that is derived via interpolation from a selected infeasible sliced prefix.

Slice-based refinement selection extracts from a given infeasible error path not only one single interpolation problem for obtaining a refined precision, but a set of (more abstract) infeasible sliced prefixes and thus a set of interpolation problems, from which a refined precision can be extracted. The interpolation


Algorithm 4 Refine+(σ)
Input: an infeasible error path σ = 〈(op1, l1), . . . , (opn, ln)〉
Output: a precision π ∈ L → 2Π
Variables: a constraint sequence Γ, a set Σ of infeasible sliced prefixes of σ, a mapping τ from infeasible sliced prefixes and program locations to precisions
1: Σ := ExtractSlicedPrefixes(σ)
2: // compute precisions for each infeasible sliced prefix
3: for each φj ∈ Σ do
4:   τ(φj) := Refine(φj) // Alg. 2
5: // select suitable sliced prefix (based on the sliced prefixes and their precisions)
6: φselected := SelectSlicedPrefix(τ)
7: // return precision for CEGAR based on selected sliced prefix
8: return τ(φselected)

problems for the extracted paths can be given, one by one, to the interpolation engine, in order to derive interpolants for each sliced prefix individually. Hence, the abstraction refinement of the analysis is no longer dependent on what the interpolation engine produces; instead, it is free to choose from a set of interpolant sequences the one that it finds most suitable. The move from solving a single interpolation problem to solving multiple interpolation problems, and understanding refinement selection as an optimization problem, is a key insight of our novel approach.

Algorithm 4 shows the algorithm for slice-based refinement selection, which is an extension of Alg. 2 in the CEGAR algorithm, allowing a suitable precision to be chosen during the refinement step. First, this algorithm calls ExtractSlicedPrefixes to extract a set of infeasible sliced prefixes. Second, it computes precisions for the sliced prefixes and stores them in the mapping τ. Third, one sliced prefix is chosen by a heuristic (in function SelectSlicedPrefix), and fourth, the precision of the chosen sliced prefix is selected for refinement of the abstract model. The heuristic can decide based on the information contained in the sliced prefixes as well as in the precisions, e.g., which variables are referenced.

Refinement-Selection Heuristics. We regard the problem of finding and selecting a preferable refinement as an independent direction for further research, and here, we restrict ourselves to presenting ideas for a few refinement-selection heuristics. There are two obvious options for refinement selection that are independent of the actual interpolants. Using the interpolant sequence derived from the very first, i.e., the shortest, infeasible prefix may rule out many similar infeasible error paths. The downside of this choice is that the analysis may have to track information rather early, possibly blowing up the state space and making the analysis less efficient. The other straight-forward option (similar to counterexample minimization [2]) is to use the longest infeasible sliced prefix (containing the last contradicting assume operation) for computing an interpolant sequence. This may lead to a precision that is local to the error location and does not require refining large parts of the state space at the beginning of the error path. However, it may also lead to a larger number of refinements if many
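The skeleton of Alg. 4 can be sketched as follows. The component functions are passed in as parameters, and the toy instantiation below stands in for Alg. 3, Alg. 2, and a selection heuristic; all of it is illustrative, not CPAchecker's actual interfaces:

```python
# Hedged skeleton of Refine+ (Alg. 4): refinement selection over the set of
# infeasible sliced prefixes.

def refine_plus(sigma, extract_sliced_prefixes, refine, select_sliced_prefix):
    prefixes = extract_sliced_prefixes(sigma)            # Alg. 3
    tau = {phi: refine(phi) for phi in prefixes}         # Alg. 2, per prefix
    chosen = select_sliced_prefix(tau)                   # heuristic choice
    return tau[chosen]                                   # precision for CEGAR

# Toy instantiation: the sliced prefixes are given directly, a "precision" is
# the set of variables a prefix mentions, and the heuristic prefers the
# precision that tracks the fewest variables.
precision = refine_plus(
    sigma=(("x",), ("x", "y")),
    extract_sliced_prefixes=lambda s: list(s),
    refine=lambda phi: set(phi),
    select_sliced_prefix=lambda tau: min(tau, key=lambda p: len(tau[p])),
)
print(precision)   # -> {'x'}
```

Keeping the three steps as parameters mirrors the paper's point that only SelectSlicedPrefix needs to change to experiment with different heuristics.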


error paths with a common prefix exist. A more advanced strategy is to analyze the domain types [3] of the variables that are referenced in the extracted precision. Each precision can be assigned a score that depends on the domain types of the variables in the precision, such that the score of an infeasible sliced prefix is better if its extracted precision references only ‘easy’ types of variables, e.g., boolean variables, and no integer variables or even loop counters. This makes it possible to focus on variables that are inexpensive to analyze, avoiding loop unrolling where possible, and keeping the size of the abstract state space as small as possible.
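A minimal sketch of such a domain-type heuristic; the numeric scores and the classification of variables are illustrative assumptions of ours, since the concrete scoring is left to the domain-types work [3]:

```python
# Hedged sketch of a domain-type-score heuristic for SelectSlicedPrefix.
# Lower score = "easier" variables in the extracted precision.

DOMAIN_SCORE = {"boolean": 1, "integer": 10, "loop-counter": 100}

def precision_score(precision, domain_types):
    """Sum the domain-type scores of all variables in a precision."""
    return sum(DOMAIN_SCORE[domain_types[v]]
               for variables in precision.values() for v in variables)

def select_sliced_prefix(tau, domain_types):
    """Choose the sliced prefix whose precision has the best (lowest) score."""
    return min(tau, key=lambda phi: precision_score(tau[phi], domain_types))

domain_types = {"flag": "boolean", "i": "loop-counter", "n": "integer"}
tau = {"phi1": {"l2": {"i"}},               # would track a loop counter
       "phi2": {"l4": {"flag", "n"}}}       # boolean + plain integer: cheaper
print(select_sliced_prefix(tau, domain_types))   # -> phi2
```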

As future work, we plan to systematically investigate many different refinement heuristics; such heuristics can be integrated without changing the overall algorithm, by replacing only the function SelectSlicedPrefix in Alg. 4 accordingly.

5 Experiments

We implemented our approach in the open-source verification framework CPAchecker [10], which is available online 2 under the Apache 2.0 license. CPAchecker already provides several abstract domains that can be used for program analysis with CEGAR. We only extended the refinement process to work according to Alg. 4 (Refine+), and changed neither the abstract domains nor the interpolation engines. Our implementation is available in the source-code repository of CPAchecker. The tool, the benchmark programs, the configuration files, and the complete results are available on the supplementary web page 3.

Setup. For benchmarking, we used machines with two Intel Xeon E5-2650 v2 eight-core CPUs with 2.6 GHz and 135 GB of memory. We limited each verification run to two CPU cores, 15 min of CPU time, and 15 GB of memory. We measured CPU time and report it rounded to two significant digits. BenchExec 4 was used as the benchmarking framework to ensure precise and reproducible results.

Configurations. Out of the several abstract domains that are supported by CPAchecker, we chose the value analysis with refinement [11] for our experiments. We use CPAchecker, tag cpachecker-1.4.2-slicedPathPrefixes.

In order to evaluate the potential of our approach, we compare four different heuristics for refinement selection (function SelectSlicedPrefix in Alg. 4): (1) shortest sliced prefix, (2) longest sliced prefix, (3) sliced prefix with the best domain-type score, and (4) sliced prefix with the worst domain-type score. The domain-type score of a sliced prefix is computed based on the domain types [3] of the variables that occur in the precisions, i.e., variables with a boolean character are preferred over loop counters and other integer variables.

Benchmarks. To present a thorough evaluation of our approach, we need a large number of verification tasks, and thus, we use the repository of SV-COMP [6] as a source of verification tasks. We select all verification tasks that fulfill the following characteristics, which are necessary for a valid evaluation of our approach: (1) the verification tasks relate to reachability properties, because the

2 http://cpachecker.sosy-lab.org/
3 http://www.sosy-lab.org/~dbeyer/cpa-ref-sel/
4 https://github.com/dbeyer/benchexec


Table 1. Number of solved verification tasks for different heuristics for slice-based refinement selection on different subsets of benchmarks

                           Sliced-Prefix Length       Score            Oracle
Heuristic        # Tasks   Shortest   Longest     Best   Worst     Best   Worst   Diff
DeviceDrivers64      619        326       395      399     319      403     315     88
ECA                1 140        489       512      570     478      611     410    201
ProductLines         597        456       361      402     360      463     353    110
Sequentialized       234         29        22       30      27       30      19     11
All Tasks          2 696      1 369     1 359    1 470   1 252    1 577   1 165    412

analysis that we use does not support other properties; (2) the reachability property of the verification tasks does not rely on concurrency, recursion, dynamic data structures, or pointers, because the analysis that we use does not support these features; and (3) there is at least one refinement during the analysis with more than one infeasible sliced prefix, i.e., in at least one refinement iteration, a refinement selection is possible. More restrictions are not necessary, because our goal is to show that there exists a significant difference in effectiveness and efficiency depending on the choice of which sliced prefix is used for precision refinement. The scope of our experiments is not to evaluate which refinement selection is the best. The set of all verification tasks in our experiments contains a total of 2 696 verification tasks.

Results. Table 1 shows the number of verification tasks that the analysis could solve using refinement selection with one of the four heuristics described above. We also show hypothetical results for a fictional heuristic “Oracle”, which, for a given program, always selects the best (or the worst) of the four basic heuristics. In other words, the column “Oracle Best” shows how many tasks could be solved by at least one of the heuristics, and the column “Oracle Worst” shows how many tasks could be solved by all of the heuristics. The difference between these numbers (column “Diff”) gives an approximation of the potential of our approach and provides evidence of how important refinement selection is. We list the results for the full set of 2 696 verification tasks as well as for several subsets (categories of SV-COMP’15). We consider these categories to be especially interesting because they contain larger programs than the remaining categories, and our approach focuses on improving refinements in large programs (with long and complex error paths, and many contradicting assume operations per error path).

The results show that selecting the right refinement can have a significant impact on the effectiveness of an analysis. In our benchmark set, there are more than 400 verification tasks for which the choice of the refinement-selection heuristic makes the difference between being able to solve the task and running into a timeout. Without our refinement-selection approach, the choice of the refinement depends solely on the internal algorithm of the interpolation engine, and this potential for improving the analysis would be lost. The results show that none of the presented heuristics is clearly the best. The heuristic that uses the


(a) Heuristic “Shortest” vs. “Longest”: CPU time (s) for heuristic “Shortest Sliced Prefix” (x-axis) against heuristic “Longest Sliced Prefix” (y-axis), axis ticks at 1, 10, 100, 1000

(b) Heuristic “Best Score” vs. “Worst Score”: CPU time (s) for heuristic “Best Score” (x-axis) against heuristic “Worst Score” (y-axis), axis ticks at 1, 10, 100, 1000

Fig. 3. Scatter plots comparing the CPU time of the analysis with different heuristics for slice-based refinement selection for all 2 696 verification tasks

refinement with the best score regarding the domain types of the variables that are contained in the precisions is the best overall (as expected, because it is the only one that systematically tries to select a refinement that hopefully makes it easier for the analysis). However, there are still verification tasks that cannot be solved with this heuristic but can be solved with one of the others (as witnessed by the difference between columns “Score Best” and “Oracle Best”). Thus, finding a better refinement-selection heuristic is promising future work.

Figure 3 shows scatter plots comparing the CPU times of the analysis with two of the four heuristics for slice-based refinement selection. The large number of data points at the top and right borders of the boxes shows those results that were solved using one of the heuristics but not the other. In addition, one can see that the choice of the refinement-selection heuristic can have a performance impact of more than a factor of 10 even for those programs that can be solved by both heuristics (witnessed by the data points in the upper-left and lower-right corners). This effect also results in a huge performance difference in total: the CPU time for those 1 165 verification tasks that could be solved with all heuristics varies between 110 h (heuristic “Score Worst”) and 57 h (heuristic “Score Best”), a potential improvement due to refinement selection of almost 50 %.
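The stated improvement can be checked with a one-line computation over the reported totals:

```python
# Sanity check: 110 CPU-hours ("Score Worst") versus 57 CPU-hours
# ("Score Best") over the 1 165 commonly solved tasks.
improvement = 1 - 57 / 110
print(f"{improvement:.0%}")   # close to the "almost 50 %" stated above
```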

6 Conclusion

This paper presents our novel approach of sliced prefixes of program paths, which extracts several infeasible sliced prefixes from one single infeasible error path. From any of these infeasible sliced prefixes, an independent interpolation problem can be derived that can be solved by a standard interpolation engine, and


the refinement procedure can choose from the resulting interpolant sequences the one that it considers best for the verification. Our novel approach is independent of the abstract domain (in particular, it does not depend on using an SMT solver) and can be combined with any analysis that is based on CEGAR, while previous work on guided interpolation [22] is applicable only to SMT-based approaches. Finally, we demonstrated in a large experimental evaluation on standard verification tasks that the choice of which sliced prefix to take for precision extraction has a significant impact on the effectiveness and efficiency of the program analysis. In future work, we plan to systematically explore more criteria for ranking sliced prefixes, and then investigate guided techniques for automatically selecting a preferable refinement. Furthermore, we plan to extend our experiments to other abstract domains, such as predicate abstraction and octagons; preliminary results already look promising.

References

1. Albarghouthi, A., Li, Y., Gurfinkel, A., Chechik, M.: Ufo: A framework for abstraction- and interpolation-based software verification. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 672–678. Springer, Heidelberg (2012)

2. Alberti, F., Bruttomesso, R., Ghilardi, S., Ranise, S., Sharygina, N.: An extension of lazy abstraction with interpolation for programs with arrays. Formal Methods in System Design 45(1), 63–109 (2014)

3. Apel, S., Beyer, D., Friedberger, K., Raimondi, F., von Rhein, A.: Domain types: Abstract-domain selection based on variable usage. In: Bertacco, V., Legay, A. (eds.) HVC 2013. LNCS, vol. 8244, pp. 262–278. Springer, Heidelberg (2013)

4. Ball, T., Cook, B., Levin, V., Rajamani, S.K.: SLAM and Static Driver Verifier: Technology transfer of formal methods inside Microsoft. In: Boiten, E.A., Derrick, J., Smith, G.P. (eds.) IFM 2004. LNCS, vol. 2999, pp. 1–20. Springer, Heidelberg (2004)

5. Ball, T., Rajamani, S.K.: The Slam project: Debugging system software via static analysis. In: Launchbury, J., Mitchell, J.C. (eds.) POPL, pp. 1–3. ACM, New York (2002)

6. Beyer, D.: Software verification and verifiable witnesses (Report on SV-COMP 2015). In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 401–416. Springer, Heidelberg (2015)

7. Beyer, D., Henzinger, T.A., Jhala, R., Majumdar, R.: The software model checker Blast. Int. J. Softw. Tools Technol. Transfer 9(5-6), 505–525 (2007)

8. Beyer, D., Henzinger, T.A., Majumdar, R., Rybalchenko, A.: Path invariants. In: Ferrante, J., McKinley, K.S. (eds.) PLDI, pp. 300–309. ACM, New York (2007)

9. Beyer, D., Henzinger, T.A., Théoduloz, G.: Program analysis with dynamic precision adjustment. In: ASE, pp. 29–38. IEEE, Washington, DC (2008)

10. Beyer, D., Keremoglu, M.E.: CPAchecker: A tool for configurable software verification. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV 2011. LNCS, vol. 6806, pp. 184–190. Springer, Heidelberg (2011)

11. Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and interpolation. In: Cortellessa, V., Varró, D. (eds.) FASE 2013. LNCS, vol. 7793, pp. 146–162. Springer, Heidelberg (2013)


12. Beyer, D., Löwe, S., Wendler, P.: Domain-type-guided refinement selection based on sliced path prefixes. Technical Report MIP-1501, University of Passau (January 2015), arXiv:1502.00045

13. Beyer, D., Petrenko, A.K.: Linux driver verification. In: Margaria, T., Steffen, B. (eds.) ISoLA 2012, Part II. LNCS, vol. 7610, pp. 1–6. Springer, Heidelberg (2012)

14. Blanchet, B., Cousot, P., Cousot, R., Feret, J., Mauborgne, L., Miné, A., Monniaux, D., Rival, X.: A static analyzer for large safety-critical software. In: Cytron, R., Gupta, R. (eds.) PLDI, pp. 196–207. ACM, New York (2003)

15. Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement for symbolic model checking. J. ACM 50(5), 752–794 (2003)

16. Craig, W.: Linear reasoning. A new form of the Herbrand-Gentzen theorem. J. Symb. Log. 22(3), 250–268 (1957)

17. D’Silva, V., Kröning, D., Purandare, M., Weissenbacher, G.: Interpolant strength. In: Barthe, G., Hermenegildo, M. (eds.) VMCAI 2010. LNCS, vol. 5944, pp. 129–145. Springer, Heidelberg (2010)

18. Graf, S., Saïdi, H.: Construction of abstract state graphs with PVS. In: Grumberg, O. (ed.) CAV 1997. LNCS, vol. 1254, pp. 72–83. Springer, Heidelberg (1997)

19. Henzinger, T.A., Jhala, R., Majumdar, R., McMillan, K.L.: Abstractions from proofs. In: Jones, N.D., Leroy, X. (eds.) POPL, pp. 232–244. ACM, New York (2004)

20. Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: Launchbury, J., Mitchell, J.C. (eds.) POPL, pp. 58–70. ACM, New York (2002)

21. Jhala, R., Majumdar, R.: Path slicing. In: Sarkar, V., Hall, M. (eds.) PLDI, pp. 38–47. ACM, New York (2005)

22. Rümmer, P., Subotic, P.: Exploring interpolants. In: Jobstmann, B., Ray, S. (eds.) FMCAD, pp. 69–76. IEEE, Washington, DC (2013)


Author Index

Abadi, Martín 131
Arbach, Youssef 83
Bartoletti, Massimo 161
Beal, Jacob 113
Ben Salem, Ala Eddine 196
Beyer, Dirk 228
Cimoli, Tiziana 161
Claudel, Benoit 213
Damiani, Ferruccio 113
Gautier, Thierry 66
Ghosh, Ritwika 35
Horn, Alex 19, 50
Isard, Michael 131
Jiao, Li 146
Karcher, David 83
Kouzapas, Dimitrios 181
Kroening, Daniel 19, 50
Löwe, Stefan 228
Mitra, Sayan 35
Murgia, Maurizio 161
Namjoshi, Kedar S. 98
Nestmann, Uwe 83
Ngo, Van Chan 66
Novara, Luca 3
Padovani, Luca 3
Peters, Kirstin 83
Philippou, Anna 181
Pianini, Danilo 113
Podda, Alessandro Sebastian 161
Pompianu, Livio 161
Sabah, Quentin 213
Stefani, Jean-Bernard 213
Talpin, Jean-Pierre 66
Trefler, Richard J. 98
Viroli, Mirko 113
Wang, Weifeng 146
Wendler, Philipp 228