Coverage Maximization with Autonomous Agents in...

Noname manuscript No.(will be inserted by the editor)

Coverage Maximization with Autonomous Agents in Fast FlowEnvironments

Andrew Kwok · Sonia Martınez

Communicated by Emilio Frazzoli

Abstract This work examines the cooperative motion of a group of autonomous ve-hicles in a fast flow environment. The magnitude of the flow velocity is assumed to begreater than the available actuation to each agent. Collectively, the agents wish to max-imize total coverage area defined as the set of points reachable by any agent withinTtime. The reachable set of an agent in a fast flow is characterized using optimal con-trol techniques. Specifically, this work addresses the complementary cases where thestatic flow field is smooth, and where the flow field is piecewise constant. The lattercase arises as a proposed approximation of a smooth flow that remains analyticallytractable. Furthermore, the techniques used in the piecewise constant flow case enabletreatment for obstacles in the environment. In both cases, a gradient ascent method isderived to maximize the total coverage area in a distributed fashion. Simulations showthat such a network is able to maximize the coverage area in a fast flow.

Keywords Optimal Sensor Coverage· Distributed Control of Multi-agent Systems

Mathematics Subject Classification (2000)MSC 93C99· MSC 49J15

1 Introduction

The development of novel schemes that allow for the re-adaptation of unmanned vehiclesto changing environmental conditions is a vigorously studied field [1–3]. Indeed, the per-vasiveness of miniaturized embedded devices has spurred a great interest in the productionof lower-cost autonomous vehicles. However, the low power capacity of these vehicles canseverely affect the system autonomy and performance. This may be the case when operatingin harsh environments, where the actuation of a vehicle may not be sufficient to deal with

Work supported by grants NSF CAREER Award CMMI-0643679 and NSF CNS-0930946.

A. KwokOpera Solutions, San Diego

S. MartınezMechanical and Aerospace Engineering, 9500 Gilman Drive, LaJolla, CA 92093E-mail: [email protected]

2 Andrew Kwok, Sonia Martınez

environmental disturbances. In this work, we consider the case where the environment pre-vents an agent from maintaining a constant position, or evenrevisiting previous locations.Such scenarios are motivated by applications, such as performing surveillance or monitor-ing operations with lightweight UAVs in a large storm system. Specifically, we will assumethat the agents move in ana priori known flow field, but the field is faster than the absolutemaximum speed of an agent. We wish to enable a group of cooperative vehicles to movesuch that the area of the reachable set of points in the flow environment is maximized.

Contributions.The main contribution of this work is an algorithmic solution to the au-tonomous deployment of vehicles to maximize coverage in a fast flow environment. Thiscan be used as a high-level routine for the deployment of UAVsin cooperative surveillanceor sensing tasks. We encode this objective through a performance metric that the agents wishto maximize as the total area of all points reachable within achosen time horizonT. Thismetric contains global information since it is the sum of allthe individual vehicles’ reach-able areas. However, using gradient ascent techniques, we show that local maximization ispossible with knowledge of only a subset of other agent locations.

We restrict our study to flows that are either affine or piecewise constant, which allows usto provide an analytical characterization of the gradient ascent direction that vehicles mustfollow. Additionally, we allow for the presence of obstacles in the piecewise constant flowcase. The notion of reachable sets motivates the study of time-optimal trajectories withinthese two respective flow fields. Hence, we characterize the behavior of time-optimal trajec-tories for each flow case. While there is a rich body of existing literature regarding motionplanning with and without obstacles in various environments, we provide results that arespecifically adapted to the problem that we are addressing. In particular, we study the differ-ent cases of time-optimal trajectories that may arise in a piecewise constant flows, and usethese cases as a basis of a gradient-based coverage motion algorithm.

Literature review.The problem of path planning in a flow field has its roots in the clas-sical optimal control problem known as Zermelo’s navigation problem [4], whose solutionrelies on use of the Pontryagin minimum principle, as in [5].Motion planning for single ve-hicles in a flow has been lately studied [6,7]. More recently,[8–10] characterizes the optimaltrajectories for a Dubins-like vehicle in the presence of constant flows, while [11] presentsa robust control strategy for a Dubins’ vehicle in non-constant flows. The paper [12] pro-vides a generalization of time-optimal course changes for aDubins vehicle moving across aheterogeneous terrain, and across boundaries where the vehicle dynamics change.

A rich body of literature exists for obstacle avoidance, such as [13–16]. The approachthat we take in this work most resembles the results found in [13–15], where different possi-ble cases of minimum-time trajectories are considered and characterized. In particular, [14,15] deal with the scenario where the environment exhibits piecewise constant regions withdiffering trasversal costs, which results in a type of Snell’s law for trajectories. However, theresulting behavior of time-optimal trajectories is different from the results we present herefor fast flows. Namely, with hilly terrain as in [14] one finds that having “switchback” typepaths may be time-optimal, while in the flow environment we consider, only straight pathsare feasible. In [15], the authors examine the case where refracted trajectories may movealong interfaces between regions; similar to the results described in this paper, except thatin [15], the piecewise constant regions are isotropic with respect to desired travel direction.

Regarding cooperative motion of multiple vehicles in a flow environment, the authorsin [17] have demonstrated stable formation control in the presence of an external time-invariant flow field. However, that work assumes that individual vehicles have full controlla-bility. Similarly, [18] addresses the case of maintaining astable formation for sampling in an

Coverage Maximization with Autonomous Agents in Fast Flow Environments 3

ui

V

θ

α

β

‖b‖TT

Fig. 1 On the left, illustration of theβ , andα quantities. The dashed circle represents the set of inputsuiwith magnitude 1. On the right, the reachable setRT(pi0) for pi0 under a constant horizontal flow.

ocean environment with exogenous currents, while [19] combines time-optimal trajectoriesin a constant flow field with formation stabilization.

With the exception of [9] and [20], previous work has considered the motion controlproblem of vehicles in a slow flow. That is, the vehicles had sufficient actuation to moveagainst the flow. We utilize a simple kinematic model, as in [17], to consider flows whichare both spatially varying and fast. Furthermore, much previous work, such as [9], considerconstant flow fields. In the interest of obtaining analytic solutions, we consider piecewiseconstant flow fields as an approximation of smooth spatially varying flows, such as in [20].Such a consideration introduces several interesting effects regarding time-optimal trajecto-ries over discontinuous flows. The work of [12] considers a similar situation, except that theeffect of the environment only imposes a change in maximum velocity of a vehicle. This isnot the case for this work due to the directionality of flows inthe environment.

2 Problem Statements and Definitions

We consider the following kinematic model:

pi = ui +V(pi) , (1)

whereui(t) andV(pi) are piecewise smooth,‖ui‖ ≤ 1, and‖V‖ > 1, and the driftV(pi)models an environmental flow field. Because of this, an agent cannot maintain a constantposition, and is always pushed along the flow.

Definition 2.1 (Reachable set)The reachable set, R(pi0), of an agent at position pi0 isthe set of points x∈ X that an agent can reach in finite time starting from pi0 and using apiecewise smooth control input ui(t) with ‖ui‖ ≤ 1. The T-limited reachable set, RT(pi0),of an agent at position pi0, is the set of points that an agent can reach within time T using apiecewise smooth control input ui(t) with ‖ui‖ ≤ 1.

The definition of theT-limited reachable set motivates the study of time-optimaltrajectoriesin Sections 3–5. It can be shown, see [5], page 77, that maximum actuation is a necessarycondition for time optimality. Thus, for time optimal trajectory calculations, we assume thatui = (cosθi ,sinθi)

T.To better relate flow field and agent motion, we utilize the following two parameters

to describe the flow field. Referencing Figure 1, firstα(x) = atan2(V2(x),V1(x)) , is theflow direction. Next,β : X →]0, π

2 [ denotes the maximum feasible angle away fromα(x)that the agent can move in. More precisely, givenV(x) and‖ui‖ = 1, the endpoint of anagent’s velocity vector (1) lies on the dashed circle of Figure 1. Using trigonometry, one can


show thatβ (x) = arcsin(

1‖V(x)‖

)

. Additionally, the choice of headingsθ that result in these

maximal travel directions areθ = α(x)± (β (x)+ π2 ). The feasible set of directions that an

agent atpi would be able to travel in is then the interval[α(pi)−β (pi),α(pi)+β (pi)].

Problem 1 Our main objective will be to define a distributed algorithm for the deploymentof agents in a flow environment. Specifically, we wish to maximize the area coverage metric:

H (p1, . . . , pn) :=∫

⋃ni=1RT (pi)

1dx. (2)

This must be done without the need for all-to-all inter-agent communication, and it musttake into account the flow environment’s influence on agent dynamics.

We will employ a gradient ascent approach to maximizeH , since this technique lendsitself easily to distributed implementation. One can imagine that, as an agenti moves in ageneral flow environment, the shape ofRT(pi) can change in complicated ways. However,knowledge of how this change works is crucial in a gradient approach, as we would like topick the direction of motion for each agent that results in the greatest gain of reachable area.Let ∂Y denote the boundary of a setY ⊆ X. A key observation for this problem is that, fora fixedT, points along∂RT(pi) correspond to endpoints of time optimal trajectories. Thechoice ofT is constrained by inter-vehicle communication restrictions (see Section 6).

Therefore, to address Problem 1, we will need to study the nature of time optimal tra-jectories in a flow environment. We will examine two distinctbut complementary problems.The first flow environment will be the affine flow field. For this case, we will use the follow-ing set of assumptions.

Assumption A1 The coverage domain X is simply connected, convex, bounded,and thevector field V describing the flow is:

1. time invariant and affine in x, V(x) = Ax+b, for any A∈ R2×2 and b∈ R

2,2. ‖V(x)‖ > 1 for all x ∈ X.

Under Assumption A1, we would like to solve the following problem.

Problem 2 Given Assumption A1, an agent with initial positionpi(0) = pi0, and subject todynamics (1), determine an expression for time optimal trajectories.

In Section 3, we will describe some properties of the reachable sets under Assumption A1as well as present the solution to Problem 2.

Examining affine flow fields is quite restrictive, however consideration of general smo-oth flow fields results in solutions of the time optimal control problem that are, in general,unsolvable in closed form. Therefore, in the interest of tractability, we examine piecewiseconstant fields:

Assumption A2 The flow environment X may have obstacles and:

1. the flow V is piecewise constant. That is, X=⋃m

k=1 Xk, such that V|Xkis constant and

satisfies‖V|Xk‖ > 1 for all k,

2. the regions Xk, k∈ {1, . . . ,m}, are separated by curvesψℓk : Xk →R, which are piecewise

differentiable. More precisely, the curve defined byψℓk(x) = 0 separates the flow regions

k andℓ ∈ {1, . . . ,m} and defines∂Xk∩∂Xℓ,3. the flow regions Xk are closed sets. Therefore, along the boundary between two neigh-

boring regions Xk, Xℓ, we enable the agent to flow according to either constant flowvelocity.


Under these assumptions, we will solve the problem:

Problem 3 Given Assumption A2, an agent with initial positionpi(0) = pi0, and subject todynamics (1), characterize the behavior of time optimal trajectories.

In Sections 4 and 5, we will heavily exploit the fact that time-optimal trajectories are simpleto compute in the interior of constant flow regions. However,the discontinuity betweenflow regions complicates the characterization of trajectories through multiple flow regions.Furthermore, certain intersections of time optimal trajectories with flow region boundarieslead to interesting behavior, which we detail in Section 5.

3 Affine Flows

In this section, we characterize the time-optimal trajectories and reachable set of an agent ina continuously differentiable flow field. This will solve Problem 2, which will be employedto solve Problem 1. The results build on the known analysis for the Zermelo problem thatwe briefly present here for the sake of completeness.

By choosingui such that an agent is always heading±β away fromα, we can computethe boundary of the reachable set. Specifically, these inputs take the form

ui(pi) =

[

cos(α(pi)±(

β (pi)+ π2 ))

sin(α(pi)±(

β (pi)+ π2 ))

]

. (3)

We state some properties of the reachable set and refer the reader to [21] for the proofs.

Proposition 3.1 Under Assumption A1, the extreme trajectories computed using (3), whichdefine the reachable set boundary, intersect only at the initial position pi0.

Theorem 3.1 Under Assumption A1, the reachable setR(pi0) has no holes.

In order to findRT(pi0), we need to obtain all the solutions to the optimal controlproblem:

minimize: J =∫ t f

01dt ,

subject to: ˙pi = ui +V(pi) ,‖ui‖ ≤ 1,given pi(0)andpi(t f ) ,

(4)

for a given timet f ≤ T, initial condition pi(0) = pi0 and free final conditionpi(t f ). For asmooth flow fieldV, this is known as Zermelo’s problem, and a solution can be found [5],page 78. The optimal solution is to consider a control input of the formui = (cosθi ,sinθi)

T

with

θi = sin2 θi∂V2

∂x1+sinθi cosθi

(

∂V1

∂x1− ∂V2

∂x2

)

−cos2 θi∂V1

∂x2. (5)

The minimum-time trajectories are obtained by using this input in combination with (1).Given a fixedpi(T) in the reachable set, one can find an specific optimal trajectory via ashooting method.

To define theT-limited boundary, one can integrate (1) using (5) to timeT, starting atpi0 for various initial headingsθi(0) ∈ [α −β − π

2 ,α + β + π2 ]. The solutions can be then

recorded and combined with the extreme travel curves fort ∈ [0,T] to give the boundary ofRT(pi0). Finally, we defineτ : X×X → R≥0∪{∞} to be the minimum travel time betweentwo points inX. The minimum travel timeτ(x,y) is computed from the solution to (4) foran agent starting atx and ending aty. Due to the flow, in general,τ(x,y) 6= τ(y,x).


Remark 3.1 Time-optimal solutions can in some situations cease to be optimal at a givenpoint. This is known in the optimal control literature as aconjugate point. Roughly speaking,a conjugate point along an optimal trajectory (or geodesic), occurs wherever that geodesicintersects another geodesic. However, due to the result in [22], optimal trajectories corre-sponding to linear flow fields do not have conjugate points. This can be readily extendedto the case of affine flows. In this way,T can be arbitrarily large without the risk of incor-rectly approximating theT-limited reachable set by joining the endpoints of solutions of thePontryagin minimum principle, (4) and (5).

Remark 3.2 The reachable setRT(pi0) can be approximated by a discrete set of pointson the boundary by choosing a subset of angles. As the number of angles to sample thereachable set increases, the approximation converges to the true reachable set. To find anaccurate approximation, one can employ{θk}N

k=1 ⊆ [α(pi)− β (pi)− π2 ,α(pi) + β (pi) +

π2 ]≡ [a,b], with θk+1 = a+ k

N−1(b−a), k= 0, . . . ,N−1, withN sufficiently large. However,depending on the shape of the reachable set, and how the optimal trajectories vary withθ , wemay require a largeN to obtain an acceptable representation. With knowledge of the curvesdefining theT-limited boundary ofRT(pi) (that is, the points inRT(pi) that are reachedat timeT), and the two extreme optimal trajectories, one can attemptto use an adaptivesampling procedure to find better{θk}N

k=1. Several papers are available in the literatureregarding the approximation of convex curves by a set of points [23,24]. Typically, curvesare assumed to be known, piecewise convex or concave, with known inflection points; thus,the approximation problem is restricted to the convex/concave sub-pieces. For example, [25]quantifies the error of the approximation of the boundary of aconvex setQ⊆ R

2 by a set ofn pointsPn defining an inner polygon,Pn ⊆ Q. In particular, it is proven that, asn → +∞,the approximation error is given as:

E ≈ 112n2

∫ 2π

0

(

ρ(r)2/3dr)3

=1

12n3

(

∫

ℓκ(ℓ)1/3dℓ

)3

,

whereρ(r) is the curvature radius of the boundary,κ its curvature, andℓ an arc-lengthparameterization of the curve. The authors [25] then suggest the method ofempirical dis-tributions to place the points inPn = {r1, . . . , rn} to provide an optimal approximation asn → +∞. The r i should satisfy that

∫ r i+1r i

ρ(r)2/3dr has equal value for alli. This meansthat, the higher the curvature radius, the closer the pointshave to be in order to representit. The representation and sampling ofRT(pi) can be based on the method of empiricaldistributions (which can be extended to non-convex boundaries). If theT boundary ofRT

was known, then one could choose pointsp1 f , . . . , pN f ∈ RT to satisfy the above criterion.Then, for eachpk f , one can solve the minimum-time optimal control problem to find thecorresponding angleθk and optimal trajectory joiningpi(0) with pk f . These would be moreadapted angles to representRT . Similarly, one would have to choose timestk ∈ [0,T], toapproximate each of the optimal extreme trajectories ofRT(pi). To deal with the fact thattheT-limited portion ofRT(pi) is unknown, one would have to estimate somehow the valueρ(r) (or κ(ℓ)) between two sampling pointspk f , pk+1 f , which could be done via a finersampling. This idea was used in the paper [26] to approximatenon-convex boundaries. Fi-nally, aspi moves, the boundaries will change with time. A good adaptivealgorithm wouldtake into account the variation of the optimal trajectoriesin a neighborhood ofpi in order toforecast if the boundaries become more curved or not. This issue requires further study forthe affine and piecewise constant case in the following sections. In case the flow is constantin one direction and linear in the other it can be seen that theboundary of reachable sets are


composed of straight and circular lines. For these cases, the approximations using constantinter-angle sampling works well.

For affine flows, the dynamics (1) take the form

pi = Api +b+ui =

(

a11 a12

a21 a22

)

pi +

(

b1 +cosθi

b2 +sinθi

)

. (6)

The time-optimal steering input given by (5) with the affine flow parameters is

θi = a21sin2 θi +(a11−a22)sinθi cosθi −a12cos2 θi .

Note that the above differential equation is separable, andone can obtain an explicit solutionfor θi(t) up to quadratures (possibly in implicit integral form). Thesystem described by (6)is linear inpi , and so the time-optimal trajectories in an affine flow are given by

pi(t) = eAt pi(0)+∫ t

0eA(t−s)(b+ui(s))ds. (7)

For the constant flow case,A = 0 and‖b‖ > 1, the setRT(pi) resembles Figure 1. Theextreme travel curves are straight line segments at angles±β away from the flow direction.They are joined by a circular arc with radiusT centered at a distance‖b‖T downstream. Theset for affine flows is a deformed version of this shape.

4 Piecewise Constant Flows: Simple Trajectories

With the aim of extending algorithms to a wider family of flows, we further develop theprevious results to piecewise constant flows addressing Problem 3. Thus, this section and thefollowing one detail the various types of optimal trajectories that may occur in a piecewiseconstant flow.

Under Assumption A2, optimal paths in the interior of each region Xk will be straightlines since the flow in each region is constant. The last item in Assumption A2 appliesmainly for the scenario where it is possible for an agent to move along the interface betweentwo flow regions. This specific case will be handled in Section5.

Suppose an agent begins atp0 and headingθ0. Then, there will be a sequence of optimalcourse changes corresponding to each time an agent travels into a new flow region. Weenumerate these with a subscriptℓ. Furthermore,tℓ andxℓ will be the time and location ofa trajectory’s crossing from the(ℓ−1)th region to theℓth region, with the convention thatt0 = 0 andx0 = p0. When necessary, we denote the first and second components ofxℓ asxℓ,1

andxℓ,2, respectively.

4.1 Catalog of Optimal Trajectories

Before launching into the detailed description of each typeof trajectory and how they occur,we summarize the various types of optimal trajectories we find in a piecewise constant flow.We categorize the time-optimal trajectories into two broadcategories:

Definition 4.1 A simple trajectoryis a time optimal trajectory where each point of intersec-tion with a flow region boundaryxℓ occurs transversely. Furthermore, this simple trajectorydoes not intersect any other time optimal trajectory. A trajectory that is not a simple trajec-tory is anon-simple trajectory.


We note that transverse intersections with the environmentboundaries, i.e. obstacles, arecollisions, and the simple trajectory ends at that final intersection point. Non-simple tra-jectories are primarily trajectories that have atangential intersection with a flow regionboundary. This results in a wide range of interesting phenomena that will be investigated inSection 5. We reserve the remainder of this section to describe simple trajectories. Unlessotherwise stated, the proofs of all statements can be found in the thesis [21].

4.2 Simple Trajectories

V1V2 V3

ψ21(x) = 0

ψ32(x) = 0

V− = (c1,c2) V+ = (d1,d2)

x0

xf

x

ψ(x) = 0

Fig. 2 On the left, a general piecewise flow case. On the right, a graphical example of the relation betweenincident and refracted angle of a vehicle at the interface between two regions of different constant flows.

Referencing Figure 2, consider that the vehicle begins atx0, the interface between tworegionsX1, X2, occurs at the smooth level set ofψ(x) = 0, and the two constant flows aregiven byV− = (c1,c2)

T andV+ = (d1,d2)T.

The following proposition determines the heading change that an optimal trajectory willundergo when transitioning fromX1 to X2:

Proposition 4.1 Let V− = (c1,c2)T and V+ = (d1,d2)

T be the flows in two neighboring re-gions, andα1, α2 be their respective flow orientations. Letξ be the orientation of the normalvector of the smooth curveψ(x) = 0 at the point where the optimal trajectory crosses intothe second flow region. Letθ− andθ+ be the control inputs of the optimal trajectories insideregions X1 and X2, respectively. A necessary condition for an optimal trajectory across theinterface of the two flow regions requires that:

1+‖V−‖cos(θ−−α1)

sin(θ−−ξ )=

1+‖V+‖cos(θ+ −α2)

sin(θ+ −ξ ). (8)

Given(8), and a fixed headingθ−, the final heading satisfies

sinθ+ =B±C

√B2 +C2−1

B2 +C2 , (9)

with B=1+‖V−‖cos(θ−−α1)

sin(θ−−ξ )cosξ −d2 , C =

1+‖V−‖cos(θ−−α1)

sin(θ−−ξ )sinξ +d1.

Remark 4.1 For certain choices of initial headingsθ−, the system of equations in the proofof Proposition 4.1 may have no solution. These special angles correspond to the choice ofextreme headingsθ− = α1±

(

β1 + π2

)

. However, the solution ofθ+ as a function ofθ− has


a well-defined limit asθ− → α1±(

β1 + π2

)

. Additionally, it may be that there is no solutionfor θ+ in (9). This may occur whenB2 +C2−1 < 0. However, since (9) is a necessary con-dition for optimality, the fact that there is no real-valuedθ+ satisfying (9) implies that thereis no optimal continuation of a particular trajectory that intersects the boundary between twoflows; see Figure 5 for an example. Trajectories following flow interfaces or encounteringobstacles will be studied in Section 5.

5 Piecewise Constant Flows: Non-simple Trajectories

We now study how consideration of flow boundaries, obstacles, or boundaries between twoneighboring flow regions affects the optimal trajectories.We refer to these optimal trajecto-ries asnon-simple trajectories.

5.1 Obstacles

By obstacles in the flow, we refer to holes in the flow environment and boundaries markingimpassable regions, such as in Figure 3. Such phenomena may impede a simple trajectory.

Before introducing the next lemma, we define∗, the path concatenation operator. Giventwo parameterized pathsρ1,ρ2 : [0,1] → R

2, such thatρ1(1) = ρ2(0), the concatenation ofthe two paths,ρ = ρ1∗ρ2 is such thatρ(0) = ρ1(0) andρ(1) = ρ2(1).

Lemma 5.1 Letρ be a feasible trajectory in a constant flow environment without obstaclesfrom ρ0 = ρ(0) to ρ f = ρ(1). Choose s1,s2 ∈ [0,1] with s1 < s2 and letρ1 be the sectionof ρ(s) for 0 ≤ s < s1 and ρ2 be the section ofρ(s) for s2 ≤ s≤ 1. Let the function Jtake a parameterized pathρ and return the time required to trasverse it in the constantflow. Supposeµ is the straight line path fromρ(s1) to ρ(s2). Then,ρ = ρ1 ∗ µ ∗ρ2 has theproperty that J(ρ) ≤ J(ρ).

In other words, by replacing a section of a trajectory with a straight line path, the time totrasverse that path is shortened. Next, we consider the situation when an optimal trajectoryintersects an obstacle (or flow environment boundary).

Definition 5.1 Let ζ be a parameterized curve in the flow environment. Then,ζ is locallyconvexat ζ (s) relative to the flow domain X if there exists aδ > 0 such that, for everys∈]s−δ , s+δ [, s 6= s, and for all t∈ [0,1], we haveζ (s)+ t(ζ (s)−ζ (s)) ∈ X. Similarly,ζis locally concaveat ζ (s) relative to X if there exists aδ > 0 such that, for s∈]s−δ , s+δ [,s 6= s, and for all t∈ [0,1], ζ (s)+ t(ζ (s)−ζ (s)) /∈ X.

The convex (resp. concave) curveζ is strictly convex(resp.strictly concave) if thereexistsδ > 0, such that, for all t∈ [0,1], ζ (s)+ t(ζ (s)−ζ (s)) 6= ζ (s) for all s∈]s−δ , s+δ [.

With these definitions we can state the following result regarding non-simple paths.

Theorem 5.1 Let ζ be a parameterization of the boundary of an obstacle. Ifζ is locallystrictly concave at the pointζ (s), then time-optimal trajectories that are tangent atζ (s)may result in non-simple paths. These paths are formed by following the boundaryζ , andthen tangentially leaving as a straight line path into the interior of X.


p0

10 11 12 13 14 15 16 17 18 19 20

8

9

10

11

12

13

14

15

Fig. 3 On the left, the light-gray area is the subset ofR(p0) that can be reached by following simple trajec-tories. Non-simple trajectories are obtained going fromp0 to the corner in the environment and continuiningwith a straight line in the dark-grey area. On the right, a scenario where simple trajectories form a gap

Remark 5.1 Cases where∂X is not smooth can be seen as a limiting case of Theorem 5.1.Similarly, non-smooth points along∂Xk−∂X∩∂Xc

k can be seen as a limiting case of Propo-sition 4.1. Both scenarios are illustrated in Figure 3. Here, the set of points reachable bysimple trajectories is shaded light gray, while the non-simply reachable points are shadedas dark gray. Time optimal trajectories to reach the occluded region must follow the blackdashed line, and then change course at the corner.

Definition 5.2 We define acourse changeto be either a discontinuous change in heading asdescribed in(8) or the sequence of navigating around an obstacle described in Theorem 5.1.

5.2 Intersecting Trajectories

Multiple simple or non-simple trajectories may intersect in two possible ways. The firstinstance involves trajectories that intersect at the same time, and the second instance in-volves trajectories that intersect at different times. In both cases, the individual trajectoriesare optimal up until the point where they intersect. Then, a decision must be made to deter-mine which trajectory (if any) to continue based on travel time. Such scenarios highlight the

p0

Fig. 4 On the left, a scenario where simple trajectories intersect each other. On the right, a path flows aroundan obstacle faster than the other. The dashed path terminatesat the intersection.

fact that the Pontryagin minimum principle along with the Bellman principle of optimalityare necessary conditions. Thus, while a particular trajectory ρ1(t), originating fromp0 andpassing throughx∈ X, may satisfy Proposition 4.1 and Theorem 5.1, it does not imply that


ρ1(t) is the only trajectory satisfying those results that passesthroughx. The following re-sult formalizes the decision-making process that occurs when two time optimal trajectoriesintersect.

Proposition 5.1 Let ρ1(t) and ρ2(t) be two distinct time optimal trajectories originatingfrom p0. Suppose that there exist t1 > 0 and t2 > 0 such thatρ1(t1) = ρ2(t2), and that thisis the first intersection betweenρ1 andρ2. In other words, t1 = mint{ρ1(t) = ρ2(s) | s> 0}.Then,

1. if t1 < t2, thenρ1(t) is time optimal for t∈ [0, t1],2. if t1 = t2, then the optimality of both trajectories is restricted to only points inρ1(t),

ρ2(t) for t ∈ [0, t1],3. if t1 > t2, thenρ1(t) is time optimal for t∈ [0, t1[.

The first and third cases of Proposition 5.1 involve simple trajectories that intersect at dif-ferent times. This can occur with flow around an obstacle, as in Figure 4. It may be that flowon one side of the obstacle is faster or shorter. Then it is possible for a trajectory that takesthe faster path to intersect a trajectory that travels the slower path. Intersecting trajectoriescan also occur when an agent travels along a flow interface as in Figure 5. Upon crossing aflow interface, the new course change may result in an agent that travels along the interface.Then, some trajectories are blocked by the trajectory flowing along the interface.

p0

p0

x

Fig. 5 On the left, a scenario where simple trajectories starting atp0 trasverse through the shaded regionterminate at the flow interface (dotted line). There is a non-simple trajectory (thick red line) that flows alongthe interface after switching into the lower region. The black line is the boundary of the reachable set. On theright, a scenario where it may be faster to follow the solid path to reach a final destination,x, rather than asimple trajectory (dotted line). This is due to the faster flowvelocity in the bottom region.

The second case of Proposition 5.1 introduces the case of simple trajectories intersectingat the same time. This may occur when there is a flow interface that is convex relative to theincoming flow as in Figure 4.

5.3 Trajectories Along Flow Interfaces

The flow case shown in Figure 5 motivates additional study of trajectories that flow alongthe interface of two constant regions. Application of (8) can result in a trajectory that travelsalong the boundary between two different flows. Use of the same result can also give a wayto compute a heading back into the first region. This leads to trajectories, as in Figure 5.

We now show that the existence of an initial headingθ− that results in an outgoingtrajectory that is tangent with the flow interface. For simplicity, suppose we have a two-region flow with parametersα1 = α2 = 0, 1< ‖V−‖< ‖V+‖, ξ = π

2 . This corresponds to the


cartoon shown in Figure 5. We seek a solution toθ− in (8) that results inθ+ = 0. This impliesthat the trajectory, after switching, moves along the flow interface. A few computations lead

to cosθ− =1

1+‖V+‖−‖V−‖. Note here that the assumption that‖V+‖> ‖V−‖ is important

in order to have a well-defined solution.Now we wish to solve the subsequent question. Once we are flowing along the boundary

in the faster flow region, what is the optimal heading change back into the slower region? To

find this out, considerα1 = α2 = 0, 1 < ‖V+‖ < ‖V−‖, ξ =π2

. Settingθ− = 0 and solving

for θ+ from (8) leads to cosθ+ =1

1+‖V−‖−‖V+‖. Again, since‖V−‖> ‖V+‖, the solution

for θ+ is well-defined.Although we have considered a simple example for how a trajectory can flow along a

boundary, this may happen for other choices ofα1,α2. When an agent is moving along aflow boundary, and it is possible to switch back into the first region, the agent may chooseto switch back at any time, making this process indeterminate. However, the result abovedictates that there is only one possible outgoing heading back into the first flow region.

The following result summarizes the above analysis for general flow parameters, whichwe omit in this manuscript; see [21] for a proof.

Proposition 5.2 Assume that two flow regions, defined by‖V−‖,α1 and‖V+‖,α2, respec-tively, are separated by an interface whose normal angle isξ . If it is possible for an agentto flow along the boundary under the second flow, thenθ+ satisfies

θ+ = ξ +π ±arccos[

‖V+‖sin(

ξ +π2−α2

)]

. (10)

Let D=1+‖V+‖cos(θ+ −α2)

sin(θ+ −ξ ). Then, the incoming heading resulting in flow along the

boundary, if it exists, satisfies

θ− = arctan

(

Dcosξ −‖V−‖sinα1

−Dsinξ −‖V−‖cosα1

)

±arccos

(

1√

(Dcosξ −V sinα1)2 +(Dsinξ +‖V−‖cosα1)2

)

. (11)

Another example of flow along interfaces can occur for the converse case where thetrajectory originates from a region of faster flow, and encounters a region of slower flowas in Figure 6. While there may exist a simple trajectory thatconnects two points in theflow, it may be faster if the region of slow flow can be avoided. In this case, the slowerregion may be treated as an obstacle in the flow, and a candidate trajectory may be formedusing previous techniques. Then, apply the result of Proposition 5.1 to resolve the conflict ofmultiple trajectories meeting at the same point. Note that,with respect to the fast flow only,the heavy red path in Figure 6 still satisfies the necessary condition of optimality describedin Theorem 5.1.

5.4 Nested Non-simple Trajectories

In the previous subsections, we described scenarios where either the reachable set is largerthan the set of all simple trajectories, or where simple trajectories intersect. In the case where


x

p0 x∗1

x∗2

Fig. 6 On the left, a scenario showing that it may be faster to follow the solid-type of path than the dotted-likeone. This is because the flow is much faster outside of the circular region. On the right, a scenario where thereis a nested non-simply reachable set of points (dark gray) that cannot be reached by a simple trajectory fromthe corner markedx∗1.

there is a setU (p0) that is unreachable by simple trajectories as in Figure 3, itis possible tohave nested non-simply reachable sets. This occurs, as in Figure 6. For such scenarios, self-similarity provides a solution for an optimal path fromp0 to a point within the non-simplyreachable set.

For example, in Figure 6, we can view the problem of reaching apoint in the regionU2(p0) in minimum time as a minimum time problem starting from the point x∗1. By re-peatedly applying the result from the previous subsection,an optimal path fromp0 to apoint inU2(p0) can be obtained. By the principle of optimality, this path iscorrect since allsub-paths are optimal trajectories.

Remark 5.2 This notion of self-similarity provides a systematic method of constructingan approximation of the reachable set of points for piecewise constant flow fields (in thepresence of obstacles). Because of the nature of fast flows, the time optimal set of headingsto choose from falls within the intervalθ ∈ [α − (β + π

2 ),α +(β + π2 )]. At each instance

where a optimal trajectory undergoes a course change due to an obstacle, one can re-samplea new set of trajectories with initial headings in the range of feasible headings. For example,suppose thatR0 is the set of trajectories frompi to an obstacle polygonP0. For every vertexof P0, v, consider the set of intersection-free trajectories that can be reached fromv, sayRv. Then, one can updateR1 = R0∪Rv. The complexity of computing a reachable set whenan obstacle is present in the flow should not be larger than thecomplexity of computing itwhen it goes across several arbitrary region configurations. In fact, an obstacle can be con-sidered as a flow of “0-magnitude” surrounded by an “∞-magnitude” flow. Additionally, onecould leverage the fact that optimal trajectories emanating from the obstacle in the knownenvironment are known and can be computed off-line in order to simplify the reachable-set calculations. The complexity of representing and computing reachable sets can be highwhenT is large compared to region sizes or the geometry of regions/flows is complicated.

6 Area Coverage

In this section, we make use of the analysis of the previous sections and present coveragealgorithms that aim to solve Problem 1 for affine and piecewise-constant flows.

Definition 6.1 Agents i and j arereachable set neighborsif RT(pi0)∩RT(p j0 6= /0. Thus,agent i is trivially a neighbor of itself. Additionally, theneighbor setof i, is

Ni,reach := { j ∈ {1, . . . ,n} | RT(pi0)∩RT(p j0) 6= /0} .

The reachable set graph, Greach = (V,E), describes the interconnection between agents inthe flow environment. For V= {1, . . . ,n}, and i, j ∈ V, (i, j) ∈ E if and only if j∈ Ni,reach

and vice-versa.


The algorithms that we will develop are distributed over thereachable set graphGreach.That is, in order to properly execute the deployment processand attempt to maximize reach-able area, an individual agenti only needs to communicate information with its neighborsj ∈ Ni,reach.

Information within distancer > 0 is enough for each vehicle to find the setNi,reach,whenRT(pi)∩RT(p j) 6= /0, only if d(pi , p j) ≤ r. Assuming that there exist a finite upperbound for the flow magnitudes,Vmax = maxk∈{1,...,m} ‖V|Xk

‖ < +∞, the reachable sets willbe definitely bounded, so there exists a finiter for which this is true. An upper bound of thisrange for the specific situation when the environment can be decomposed, as in Figure 2,can be found to ber = 2T(Vmax+1). In this case, the regionX consists of a finite sequenceof “consecutive” regionsXi , i ∈ {1, . . . ,m}, where only simple trajectories are possible. Tosee this, consider first the case of two consecutive regionsX1 followed by X2: If p0 ∈ X2,then, due to the specific shape of the reachable set, as in Figure 1, the length of an optimaltrajectory is always upper bounded byℓ ≤ T(V2 + 1) ≤ T(Vmax+ 1). If p0 ∈ X1−X1∩X2,then a trajectory can be decomposed into two parts with lengthsℓ1 andℓ2. It can be verifiedthatℓ1 ≤ T1(Vmax+1), whereT1 is the time that takes the trajectory to intersectX1∩X2, andthat ℓ2 ≤ (T −T1)(Vmax+ 1). In this way,ℓ1 + ℓ2 ≤ T(Vmax+ 1). TheT-reachable pointswill be then contained in a diskD(p0,T(Vmax+1)). An induction argument can be used toguarantee thatRT(p0)⊆ D(p0,T(Vmax+1)) when the reachable set crosses several regions.In general, the requiredr depends on the shape of the reachable set.

6.1 Gradient-based Coverage Control Algorithm

Since the goal is to maximizeH , we begin by taking the gradient ofH with respect topi

in order to obtain a set directions each agent must travel in.

Proposition 6.1 Given the area objective(2), let

Ai := ∂RT(pi0)∩

⋃

j∈Ni,reachj 6=i

interior(RT(p j0))

c

∩X , (12)

be the set of points in∂RT(pi0) not in the interior of neighboring reachable sets. Then thegradient with respect to pi is:

∂H

∂ pi:=∫

Ai

nT(ζi)∂ζi

∂ pidζi , (13)

whereζi : S → R2 is a parameterization of∂RT(pi0), andn : R

2 → R2 is the unit outward-

pointing normal vector atζi .

We refer the reader to [27] for the full proof of this fact. Thegeneral strategy now is tofollow the gradient direction in order to maximize coveragearea. We begin by examiningthe time-evolution of the objective functionH .

Proposition 6.2 The rate of area increase islocally maximized if agents implement

ui =

∂H

∂ pi∥

∥

∥

∂H

∂ pi

∥

∥

∥

, if

∥

∥

∥

∥

∂H

∂ pi

∥

∥

∥

∥

≥ 1,

∂H

∂ pi, if

∥

∥

∥

∂H

∂ pi

∥

∥

∥≤ 1.

(14)


Depending on the direction of the gradient∂H

∂ pi, an individual agent may not be able to

maximize its own area covered. This is because‖ui‖ = 1 < ‖V‖, so it cannot be guaranteedthat each quantity in the sum can be positive due to the term∂H

∂ piV. Nevertheless, we can still

chooseui such that the quantity∂H

∂ pi(ui +V) is maximized for alli. The next result considers

a limit situation that guarantees the local maximization ofthe area function in an infinite,rectangular domain where the flow tends to a constant ; that is, V(p) → v for ‖p‖→ +∞.

Proposition 6.3 Let X be an unbounded flow domain with parallel horizontal boundaries(an infinitely long strip). Suppose that V is a piecewise continuous flow for which we havethat [α(pi)−β (pi),α(pi)+β (pi)] ⊆ [−γ,γ], for some0 < γ < π

2 , and suppose further thatV(pi) = v(pi) + g(pi) with lim‖pi‖→+∞ g(pi) = 0 and v a horizontal constant vector with‖v‖ > 1. Then, the agents positions tend to the set of local maximizers of H under thecontrol law(14).

Remark 6.1 One can view the above result as a type of time-varying function optimizationresult, where the cost eventually converges to some static value. The validity of the resultfor generalV will occur when the area of the reachable set of a pointp under p = V(p)is conserved or increased. This is a type of non-contractible flow that includes constantflows.On the other hand, by relaxing the condition‖ui‖ < ‖V‖ to ‖ui‖ = ‖V‖, one canmodify the gradient algorithm to prevent minimization of the area function by allowingagents remain static when this happens.

6.2 Computation of the Gradient of the Area Function

In general, the gradient is difficult to analytically compute due to the∂ζi∂ pi

term inside theintegral. This term is a 2×2 matrix that represents the variation of a point along∂RT(pi0)as the agent positionpi varies. An analytical result would require an explicit solution to (1)using (5). However, there are certain cases where such a solution is possible.

Proposition 6.4 Under Assumption A1,

∂ζi

∂ pi= eAτ(pi ,ζi ) . (15)

In particular, ifA = 0 in (15), the gradient according to (13), becomes

∂H

∂ pi=∫

Ai

nT(ζi)dζi . (16)

In the piecewise constant case, the gradient expression forH with respect to agentpositionspi is identical to the result for continuously differentiableflows in (13); the maindifference lies in the term∂ζi

∂ pi. First, we present a result dealing with simple trajectories, and

then modify it to include non-simple trajectories.We seek a formula for the position of a trajectory at terminaltime T with an initial

headingθ0 at t = 0, and passing through a total ofq+ 1 flow regions. Thus, there will beq optimal course changes. Once the initial headingθ0 is chosen, the sequence of optimalheadings to follow,{θ1, . . . ,θq}, within each region, can be solved along with the sequenceof crossing locations,{x1, . . . , xq}. The times of interface crossings can be expressed via thefollowing recursive relationship:

tℓ = tℓ−1 +‖xℓ − xℓ−1‖‖pℓ−1‖

.


Similar to the affine flow case, points on the boundary∂Ai can be viewed as endpoints ofsimple trajectories starting atp0. Suppose that a simple trajectory passes through a totalof q+ 1 flow regions. We can express the final position at timeT for this trajectory in arecursive fashion. For example, the final position at timeT is given by

pT = xq +(T − tq)pq . (17)

From (17), given some arbitrary initial heading, we compute∂ζi∂ pi

with the following results.

Proposition 6.5 Under Assumption A2, letx0 = p0 and

Dℓ =

(

∂ xℓ

∂ xℓ−1

)(

∂ xℓ−1

∂ xℓ−2

)

· · ·(

∂ x1

∂ x0

)

.

The derivative of(17), the endpoint of a simple trajectory, with respect to an initial positionp0 is

∂ pT

∂ p0= Dq− pq

q

∑ℓ=1

∂ tℓ∂ xℓ−1

Dℓ−1 . (18)

We now detail how to compute each term in (18).

Proposition 6.6 Let an interface crossing position and time bex andt, suppose that thereis only one such crossing, and that the interface is differentiable atx. Letη = (cosξ ,sinξ )T

be the unit normal vector to the interface atx. Then, for a fixed choice of headingp, theJacobian of the crossing position with respect to change in initial position is given by

∂ x∂ p0

= I2−(

pηT)

pTη, (19)

where I2 is the identity matrix of dimension2. The Jacobian of the crossing time with respectto change in initial position is given by

∂ t∂ p0

= − ηT

pTη. (20)

For non-simple trajectories, the expression for the derivative of the endpoint with respectto the initial position still follows from (18) by treating non-simple course changes, as one ofthexℓ in (18). However, the expressions (19) and (20) do not apply for course changes aroundobstacles or through flow interfaces that are not differentiable. We present the followingresult to address these cases.

Proposition 6.7 Let pT be the endpoint of a non-simple trajectory that undergoes q coursechanges. Letℓ∗ ∈ {1, . . . ,q} denote thefirst non-simple course change described in Theo-rem 5.1 or the Remark 5.1. Then, the Jacobian of the endpoint position with respect to initialposition is

∂ pT

∂ p0= −pq

ℓ∗

∑ℓ=1

∂ tℓ∂ xℓ−1

Dℓ−1 , (21)

with∂ tℓ∗

∂ xℓ∗−1=

1‖pℓ∗−1‖

(xℓ∗−1− xℓ∗)T

‖xℓ∗ − xℓ∗−1‖. (22)


00 30

800

t

H

Fig. 7 Different snapshots of the area deployment algorithm in a flowfor 8 agents. Times for the threesnapshots aret = 0s (top right),t = 5s (top left), andt = 26s (bottom right). Afterward, the agents leave theflow environment. A plot of the objective function is shown in the bottom graph (solid line) and a plot of theobjective function forui = 0 for all i is shown in the dashed line for comparison.

The computation of∂ζi∂ pi

can be done approximately by using a linear interpolation ofRT(pi)

which uses a set of end-pointsp1 f , . . . , pN f in theT boundary ofRT(pi). The variation ofthe linear interpolation can be expressed in terms of the variation of the end points.

Remark 6.2 The only impediment of extending the previous results to affine flows comesfrom the necessity of analytically computing the term∂ t

∂ x . An expression fortℓ in terms oftℓ−1, xℓ, andxℓ−1, as in the proofs of Propositions 6.5 and 6.7, cannot be easily stated.

7 Simulations

The following simulation portrays a typical execution of the area coverage algorithm for theset of special flow cases described in Section 3. Here, 8 agents begin in a flow environmentof length 100mand width 50m. We demonstrate a case with flow parametersa11= 0.025s−1,b1 = 1m/s, a22 = −0.1s−1, andb2 = 0. The simulation results are shown in Figure 7.

Fig. 8 Deployment in a flow environment with an island obstacle by 8 agents. The flow is also faster in themiddle strip than in the regions close to the top and bottom boundaries. Simulation times from top to bottomare:t = 0s, t = 10s, t = 20s andt = 40s.


In the next simulation, we demonstrate a piecewise constantflow domain with an obsta-cle. The flow environment features three parallel strips with the middle strip having a fasterflow. Additionally, the obstacle is located in the middle flowregion. Figure 8 gives three foursnapshots of the coverage maximization as time progresses.The agents successfully navi-gate around the obstacle and move away from each other in order to maximize area covered.Figure 9 provides a plot of the total area covered as a function of time. One can imagine

H

t (sec)0 10 20 40

100

150

Fig. 9 Area function in the flow environment with an island.

a situation where various temperature, pressure, and othermeteorological data need to becollected. This next simulation depicts 8 UAVs deploying ina hurricane-like storm system,Figure 10. The storm is modeled as a 6-sector, radially symmetric flow field. The magnitudeof the central flow is 4 times an agent’s airspeed in the absence of flow, while the middleand outer layers are 3.25 and 2.5 times the agent’s airspeed, respectively.

H

t (sec)0 20 40 60

100

200

300

400

Fig. 10 The central “eye” of the storm is treated as an obstacle, or equivalently, a “no-fly zone.” The sim-ulation snapshots occur fort = 0 (top left),t = 20s (top right), andt = 60 (bottom left). A plot of the totalreachable area is shown in the bottom right.


8 Concluding Remarks

We addressed the problem of area coverage maximization for agents moving in a fast flowenvironment. The inability to move against the flow presented many interesting challenges,and we limited the set of flow environments to affine and piecewise-constant flow fieldsthat allow us to obtain some analytical results. Possible directions of future work includeconsideration of alternative objective functions other than the total reachable area and theextension to other types of flows. For example, a vector fieldV(x) = A(x)x+ b(x), withx∈ X, whereA(x) = A0+εA1(x)+ε2A2(x)+ . . . andb(x) = b0+εb1(x)+ε2b2(x) . . . , withAi(x),bi(x) ∈ O(ε), are small perturbations to constantsA0,b0, may allow the extension ofthe results.

The presentation of theT-limited reachable sets involved the study of time-optimaltra-jectories in the respective flow environments. This allowedus to compute optimal directionsof motion for the vehicles to locally maximize the group performance metric. Simulationsshow that, when agents are not transitioning between regions of different flow velocities, thetotal reachable area is maximized. This is in accord with theresults derived in Section 6.However, in general, we cannot guarantee maximization of reachable area during transitionfrom one flow region to the next. In fact, one can propose several flow environments that re-sult in net loss of reachable area as agents deploy over the environment. In some instances,it is possible to see that the reachable-set area value fluctuates slightly around some con-stant value. For example, when the attracting region is madeup of a finite number of flow-constant subregions, with a large size in comparison with reachable-set regions, this seemsalways to be the case. One can argue that, except for isolatedtime intervals correspondingto region changes, the vehicles reconfigure to provide relatively equal area. However, thespecific value around which the area value oscillates and theamplitude of this oscillationdepends very much on the parameterT and the region configuration. A relevant question toinvestigate is the effect of the time horizon parameterT on the algorithm performance.

The algorithm requires full knowledge of the underlying flowin order to come up withaccurate approximations by piecewise constant regions. Inthis case, the computational effortto determine the reachable sets is justified. Otherwise, thealgorithm can be used in the finalstages of a flow field estimation process, when the uncertainty about the field is low.

References

1. Cortes, J., Martınez, S., Bullo, F.: Spatially-distributed coverage optimization and control with limited-range interactions. ESAIM: Control Optim. Calc. Var. 11, 691–719 (2005)

2. Hussein, I.I., Stipanovic, D.M.: Effective coverage control for mobile sensor networks with guaranteedcollision avoidance. IEEE Trans. Control Syst. Technol. 15, 642–657 (2007)

3. Schwager, M., Rus, D., Slotine, J.: Decentralized, adaptive coverage control for networked robots. Int.J. Rob. Res. 28, 357 – 375 (2009)

4. Zermelo, E.:Uber das navigationproble bei ruhender oder veranderlicher windverteilung. Z. Angrew.Math. und. Mech 11 (1931)

5. Bryson, A.E., Ho, Y.: Applied Optimal Control. Hemisphere Publishing Corporation, New York (1969)6. Reif, J., Sun, Z.: Movement planning in the presence of flows. Algorithmica 39, 127–153 (2004)7. Ceccarelli, N., Enright, J., Frazzoli, E., Rasmussen, S.,Schumacher, C.: Micro UAV path planning for

reconnaissance in wind. In American Control Conference, pp.5310–5315, New York City, New York,USA (2007)

8. McGee, T.G., Spry, S., Hendrick, J.K.: Optimal path planning in a constant wind with a bounded turningrate. In AIAA Guidance, Navigation, and Control Conferenceand Exhibit, pp. 1–11, (2005)

9. Rysdyk, R.: Course and heading changes in significant wind. AIAA J. Gui. Control Dynam. 30, 1168–1171 (2007)


V− = (c1,c2) V+ = (d1,d2)

x0

xf

x

ψ(x) = 0

Fig. 11 Deriving the relation between incident and refracted angleof a vehicle at the interface between tworegions of different, but constant, flows.

.

10. Techy, L., Woolsey, C.A.: Minimum-time path planning for unmanned aerial vehicles in steady uniformwinds. AIAA J. Gui. Control Dynam. 32, 1736–1746 (2009)

11. McGee, T.G., Hendrick, J.K.: Path planning and control for multiple point surveillance by an unmannedaircraft in wind. In American Control Conference, pp. 4261–4266, Minneapolis, Minnesota, USA (2006)

12. Sanfelice, R.G., Frazzoli, E.: On the optimality of Dubins paths across heterogeneous terrain. In Hybridsystems: Computation and Control, pp. 457–470, St. Louis, Missouri, USA (2008)

13. Lozano-Perez, T., Wesley, M.: An algorithm for planningcollision-free paths among polyhedral obsta-cles. Commun. ACM 22, 560–570 (1979)

14. Rowe, N., Ross, R.: Optimal grid-free path planning across arbitrarily contoured terrain with anisotropicfriction and gravity effects. IEEE Trans. Rob. 6, 540–553 (1990)

15. Rowe, N., Alexander, R.: Finding optimal-path maps for path planning across weighted regions. Int. J.Rob. Res. 19, 83–95 (2000)

16. LaValle, S.M.: Planning Algorithms. Cambridge University Press, New York (2006)17. Paley, D.A., Peterson, C.: Stabilization of collectivemotion in a time-invariant flowfield. AIAA J. Gui.

Control Dynam. 32, 771–779 (2009)18. Leonard, N.E., Paley, D., Lekien, F., Sepulchre, R., Fratantoni, D.M., Davis, R.: Collective motion,

sensor networks and ocean sampling. Proc. IEEE 95, 48–74 (2007)19. Techy, L., Smale, D., Woolsey, C.: Coordinated aerobiological sampling of a plant pathogen in the lower

atmosphere using two autonomous Unmanned Aerial Vehicles. J. Field Rob. 27, 335–343 (2010)20. Bakolas, E., Tsiotras, P.: Minimum-time paths for a small aircraft in the presence of regionally-varying

strong winds. In AIAA Infotech@Aerospace, pp. 2010–3380, Atlanta, GA (2010)21. Kwok, A.: Deployment algorithms for mobile robots under dynamic constraints. Ph.D. Thesis, Univer-

sity of California, San Diego (2011)22. Serres, U.: On the curvature of two-dimensional optimal control systems and Zermelo’s navigation prob-

lem. J. Math. Sci. 135, 3224–3243 (2006)23. Gruber, P.M.: Approximation of convex bodies. In P.M. Gruber, J.M. Wills (eds.): Convexity and its

Applications, pp. 131–162. Birkhauser Verlag, Boston (1983)24. Chen, L.: Mesh smoothing schemes based on optimal Delaunay triangulations. In 13th International

Meshing Roundtable, pp. 109–120, Sandia National Laboratories (2004)25. McLure, D.E., Vitale, R.A.: Polygonal approximation of plane convex bodies. J. Math. Anal. Appl. 51,

326–358 (1975)26. Susca, S., Martınez, S., Bullo, F.: Monitoring environmental boundaries with a robotic sensor network.

IEEE Trans. Control Syst. Technol. 16, 288–296 (2008)27. Kwok, A., Martınez, S.: A coverage algorithm for drifters in a river environment. In 2010 American

Control Conference, pp. 6436–6441, Baltimore, MD, USA (2010)

A Proofs of Theorems and Propositions

Proof, Proposition 4.1.We wish to solve for the minimum-time trajectory fromx0 to a pointxf after crossingthe interface between the two flows at some point ¯x. For minimum time problems, the cost function will be:

J =∫ t f

01dt = t f .


The vehicle has dynamics ˙x= F−(x,θ) or x= F+(x,θ) depending on which side of the flow interface it is on,where:

F−(x1,x2,θ) =

(

cosθ +c1sinθ +c2

)

, if x∈ X1,

F+(x1,x2,θ) =

(

cosθ +d1sinθ +d2

)

, if x∈ X2 .

We also have the interior point boundary condition that for some time 0< t < t f , ψ(x(t)) = 0. One can nowdefine the Hamiltonians for each set of dynamics as:

H1(x,θ ,λ ) = 1+λ1(cosθ +c1)+λ2(sinθ +c2) ,

H2(x,θ ,λ ) = 1+λ1(cosθ +d1)+λ2(sinθ +d2) ,

whereλ = (λ1,λ2)T is the costate vector. The interior boundary condition can be viewed as an additional con-

straint occurring at timet. In this way, we adjoin the interior point constraint to the costate and Hamiltonianwith a Lagrange multiplierν ∈ R. Let t− andt+ denote the time immediately before and aftert, respectively.Then,

λT(t−) = λT(t+)+ν∂ψ∂x

,

H1(x(t−),θ(t−),λ (t−))=H2(x(t+),θ(t+),λ (t+))+ν∂ψ∂ t

.

We haveH1(x(t−),θ(t−),λ (t−)) = H2(x(t+),θ(t+),λ (t+)), because there is no time dependence inψ. Fur-thermore, for minimum-time optimal control problems with time-invariant dynamics, the Hamiltonian isconstant and equal to zero [5], page 78. The costate jump att is described by

λ1(t−) = λ1(t+)+ν∂ψ∂x1

, λ2(t−) = λ2(t+)+ν∂ψ∂x2

.

Using optimality with respect to the control inputθ , we derive an additional set of relations:

∂H1

∂θ= −λ1(t−)sinθ− +λ2(t−)cosθ− = 0 ⇒ tanθ− =

λ2(t−)

λ1(t−),

And, similarly, tanθ+ =λ2(t+)

λ1(t+). This leads to the following six equations:

1+λ1(t−)(cosθ− +c1)+λ2(t−)(sinθ− +c2) = 0,

1+λ1(t+)(cosθ+ +d1)+λ2(t+)(sinθ+ +d2) = 0,

λ1(t−) = λ1(t+)+ν∂ψ∂x1

,

λ2(t−) = λ2(t+)+ν∂ψ∂x2

,

tanθ− =λ2(t−)

λ1(t−),

tanθ+ =λ2(t+)

λ1(t+),

with the six unknowns:ν, θ+, λ1(t−), λ2(t−), λ1(t+), andλ2(t+). Eliminating all butθ+, we obtain:

(

∂ψ∂x1

tanθ+ − ∂ψ∂x2

)

(1+c1 cosθ− +c2 sinθ−)secθ−

= −(

∂ψ∂x2

− ∂ψ∂x1

tanθ−)

(1+d1 cosθ+ +d2 sinθ+)secθ+ .


Substitute(c1,c2) = (‖V−‖cosα1, ‖V−‖sinα1) and(d1,d2) = (‖V+‖cosα2, ‖V+‖sinα2). The relation be-comes

1+‖V−‖cos(θ−−α1)∂ψ∂x2

cosθ−− ∂ψ∂x1

sinθ−= − 1+‖V+‖cos(θ+ −α2)

∂ψ∂x1

sinθ+ − ∂ψ∂x2

cosθ+

.

Let ξ be the angle of the gradient vector∇ψ. That is,ξ = arctan

(

∂ψ∂x2∂ψ∂x1

)

. This direction is normal to the level

setψ(x1,x2) = 0. Then, the substitution(

∂ψ∂x1

, ∂ψ∂x2

)

= (‖∇ψ‖cosξ , ‖∇ψ‖sinξ ), leads to (8).

The cosine functions in the numerators of (8) can be expanded.Then, performing the substitution,cosθ+ =

√

1−sin2 θ+ yields the equation

(A2 +B2)sin2 θ+ +2Asinθ+ +1−B2 = 0.

This is quadratic in sinθ+, which results in (9).Proof, Lemma 5.1.Since there are no obstacles, the pointρ(s2) is reachable fromρ(s1) via a straight line.Finally, straight line paths in a constant flow environment are time-optimal, so by replacing the middle sectionof ρ with a straight line segment inρ , we have decreased the travel time alongρ compared toρ.Proof, Theorem 5.1.Let ρ be a time-optimal trajectory originating fromp0 that is tangent toζ at ζ (s). Wenote that, if a time-optimal trajectory intersectsζ transversely, then that trajectory must stop. First, we showthatζ must be locally concave about the intersection pointζ (s) in order to admit non-simple paths. Then, forthe case whenζ is locally concave atζ (s), we will construct non-simple (optimal) paths.

Suppose thatζ is locally strictly convex atζ (s) and thatρ is tangent toζ at ζ (s). The time-optimaltrajectoryρ must be feasible, thus it must remain inX at all times. Forρ to continue pastζ (s), it mustcontinue to follow the curveζ or return into the interior of the flow region. Following the boundary is notoptimal sinceζ is locally convex, so there exists a neighborhoodN aboutζ (s) such that there is a straightline path joiningζ (s) to another point alongζ through the interior ofX that also lies inN; see Figure 12. Ifζis locally convex but not strictly convex aboutζ (s), thenζ is a straight line throughζ (s). Thus, straight linetrajectories may continue, andρ remains a simple trajectory.

ζ (s)

Fig. 12 Convex boundary counterexample. The thick dashed path that follows the boundary of the obstacleis not time-optimal since there exists a straight line path from ζ (s) to any point along the convex boundary,resulting in a faster time-of-arrival.

For the case thatζ is locally strictly concave atζ (s), then the boundary of the obstacle occludes a portionof the flow domainX. Non-simple trajectories may result after this tangential intersection if it is feasible tosteer along the pathζ under the constant flow. LetN be a small neighborhood about the tangential intersectionpointζ (s). We first show that a trajectory that follows the boundaryζ is time-optimal. Then we combine thiswith the fact that, within the interior ofX, time-optimal paths must be straight lines in order to complete theconstruction of these optimal non-simple trajectories.

Consider a trajectoryρ that connectsζ (s1) to ζ (s2) on the boundary of a locally strictly concave curveζ . Supposeρ follows the boundaryζ , but is not time-optimal. Then there exists a trajectoryρ∗ connectingζ (s1) andζ (s2) with a shorter travel time. Furthermore,ρ∗ must leave the boundaryζ at some point and thenre-enter it in order to finish atζ (s2). However, sinceζ is locally strictly concave, such a maneuver requires acourse change within the interior of the flow region, which implies thatρ∗ is not time-optimal; see Figure 13.At any point along the boundary, the time-optimal trajectoryρ may depart tangentially. This continuation ofthe time-optimal path satisfies Bellman’s necessary principlefor optimality. Furthermore, it is the only paththat does so. Suppose that at some pointζ (s∗) ∈ N, s∗ > s, the departing trajectory is not tangential. Thenit either transversely crossesζ , which is not feasible, or it exits back into the flow domainX. Let µ be thisnon-tangential line. Sinceζ is locally concave inN, for some smallε > 0, there is a straight line path from anearlier point along the boundary,ζ (s∗− ε) that necessarily intersects this non-tangential departing line, seeFigure 14. By Lemma 5.1, the tangentially departing path results in a faster arrival time to the intersectionpoint.


ζ (s1)

ζ (s2)

Fig. 13 A diagram demonstrating optimality of a path the follows the boundary of an obstacle. Sinceζ islocally strictly concave, any other path connectingζ (s1) to ζ (s2) must undergo a path change within theinterior of a flow region (i.e. the dotted line). This violates the optimality condition that straight-line pathswithin the interior of a flow region are time-optimal.

ζ (s)ζ (s) ζ (s∗)

ζ (s∗− ε)

Fig. 14 At left, an example of a non-simple (time-optimal) path around anobstacle. At right, we provide acounterexample showing that a tangentially departing path is optimal.

Proof, Proposition 5.1.Under the assumptions of the Proposition,ρ1 andρ2 are time optimal up until theintersection point. By time optimality, the trajectory that reaches the intersection pointρ1(t1) = ρ2(t2) first isthe fastest trajectory to that point. This addresses the cases where the intersection times are not equal.

Suppose instead thatρ1(t1) = ρ2(t2) , x, with t1 = t2 andx inside the interior of a regionXk. Suppose thatρ1 continues to be the time-optimal path to go fromp0 to y = ρ1(t) for t > t1. Then the trajectory composedof ρ2(t) until x andρ1(t) until y would also be time-optimal (it takes the same time to reachy from p0). Thisis a contradiction with the fact that inside the regionXk, optimal trajectories must be straight lines.Proof, Proposition 5.2We begin by stating the useful identity

acosθ +bsinθ =√

a2 +b2 cos(θ −φ) ,

φ = arctan

(

ba

)

.

Given that the outgoing trajectory flows along the flow interface, we have the following condition

tan(

ξ ± π2

)

=sin(

ξ ± π2

)

cos(

ξ ± π2

) =sinθ+ +‖V+‖sinα2

cosθ+ +‖V+‖cosα2.

The left hand side denotes the tangent direction at the interface crossing while the right hand side denotes theresultant velocity of an agent. We now solve for the outgoingvehicle headingθ+. Throughout we note thatwe keep track of the plus/minus signs so that they remain consistent.

±cosξ∓sinξ

=sinθ+ +‖V+‖sinα2

cosθ+ +‖V+‖cosα2,

±cosξ cosθ+ ±‖V+‖cosα2 cosξ = ∓sinξ sinθ+ ∓‖V2‖sinα2 sinξ ,

cosξ cosθ+ ±sinξ sinθ+ = ∓‖V+‖sin(α2 +ξ ) .

At this point, we apply the trigonometric identity for addition of sin and cos.

cos

[

θ+ −arctan

(±sinξ±cosξ

)]

= ∓‖V+‖sin(α2 +ξ ) ,

cos(θ+ ∓ξ ) = ∓‖V2‖sin(α2 +ξ ) .

At this point, we take the inverse cosine of both sides. However, due to the multiplicity of solutions for arccos,we will again have two possible solutions. Before proceeding with arccos, we split the two existing solutions


we already have to obtain–in the end–four total possible solutions.

{

θ+ −ξ = ±arccos[−‖V+‖sin(α2 +ξ )] ,

θ+ +ξ = ±arccos[‖V+‖sin(α2 +ξ )] ,{

θ+ = ξ ±arccos[−‖V+‖sin(α2 +ξ )] ,

θ+ = xi±arccos[‖V+‖sin(α2 +ξ )] .

Thus, we have achieved the first result of this proposition. Now we assume thatθ+ is known and satisfiesthe first result. We wish to solve for the incoming headingθ− that achieves an outgoing heading ofθ+. Tosimplify some clutter, we let

D =1+‖V+‖cos(θ+ −α2)

sin(θ+ −ξ ).

Now we solve forθ−:

1+‖V−‖cos(θ−−α1)

sin(θ− +ξ )= D,

1+‖V−‖(cosθ− cosα1 +sinθ− sinα1) = D(sinθ− cosξ −cosθ− sinξ ),

(‖V− cosα1 +Dsinξ )cosθ− +(‖V−‖sinα1−Dcosξ )sinθ− = −1.

We apply the trigonometric identity for addition of sin and cos.

√

(‖V−‖cosα1 +Dsinξ )2 +(‖V−‖sinα1−Dcosξ )2

·cos

(

θ−−arctan

[‖V−‖sinα1−Dcosξ‖V−‖cosα1 +Dsinξ

])

= −1.

Applying arccos with multiplicity of solutions in mind and solving for θ− leads to the final result.Proof, Proposition 6.2.We consider the time evolution of the cost function by applying the chain rule:

H =n

∑i=1

∂H

∂ pipi =

n

∑i=1

∂H

∂ pi(ui +V) .

Each agent’s contribution to the rate of change ofH is then given by∂H

∂ pi(ui +V) = ∂H

∂ piui +

∂H

∂ piV. However,

there is the constraint that‖ui‖ ≤ 1. The choice ofui that maximizes each individual term of the sum is thevector that points in the same direction as∂H

∂ pi, but with magnitude 1.

Proof, Proposition 6.3.Sinceg(p) is piecewise continuous, for everyε > 0 there isr > 0 such that‖g(p)‖≤ εfor ‖p‖ > r. Defineρ := min{‖v‖− ε,min‖p‖≤r ‖V(p)‖}, thenρ > 1. Consider the vectorw with angleγand normρ −1 > 0. From our analysis of the reachable set, we have ˙xi = Vx(pi)+ux

i ≥ wx. Thus, we have‖pi(T + t)− pi(t)‖ ≥ wxT and‖pi(T + t)‖ ≥ −‖pi(t)‖+ wxT ≥ 0 for large enoughT. In other words, theflow makes‖pi(t)‖→ +∞, whenpi evolves under (14).

The vector fieldV(p) is bounded and piecewise continuous, thus the functionf (p) = area(RT(p)) iscontinuous and bounded, with area(RT(p)) ≤ A. Define 0≤W(p1, . . . , pn) := −H (p1, . . . , pn)+nA≤ nA,for everyp1, . . . , pn. We can rewrite:

W = −∑i

∂H

∂ pi(ui +V(pi)) = −∑

i

∂H

∂ pi(ui + ui +v)− ∂H

∂ pig(pi)+

∂h(p1, . . . , pn)

∂ pi,

whereH , (resp.A i , ui ), are the corresponding functions (resp. regions, controllaw) associated with thedynamics ˙pi = ui + v, ui = ui − ui , andh(p1, . . . , pn) =

∫

∪RT (pi )\∪RT (pi )dq. Because of the assumptions on

the flow, we have that limt→+∞ h(p1(t), . . . , pn(t)) = 0, limt→+∞∂h∂ pi

(t) = 0 and limt→+∞ ui(t) → 0. Further,

since ∂H

∂ pi(p1, . . . , pn) is bounded and limt→+∞ g(pi(t)) = 0, we have limt→+∞

∂H

∂ pi(pi(t))g(pi(t)) = 0.

Let us see now that∑i∂H

∂ pi(ui +v) → c≥ 0, with t → ∞. To show this, we will prove that

n

∑i=1

∫

A i

nT(ζi)dζi = 0.


From the definition ofA i (12)

A i = ∂RT(pi0)∩∂

(

n⋃

j=1

RT(p j0)

)

∩X .

Then, note that eachA i is disjoint except at the isolated points where two regionsRT(pi0) andRT(p j0)

meet. Furthermore,∂(⋃n

i=1RT(pi0))

∩X =⋃n

i=1A i .Since

⋃ni=1RT(pi0)∩X is compact, then the divergence theorem states that:

0 =∫

∂(⋃n

i=1 RT (pi0)∩X)nds=

∫

∂(⋃n

i=1 A i)nds=

n

∑i=1

∫

A i

nds.

In other words,∑ni=1

∂H

∂ pi= 0.

From here we have that

n

∑i=1

∂H

∂ pi(ui +v) =

n

∑i=1

∂H

∂ piui +

∂H

∂ piv =

n

∑i=1

∂H

∂ piui .

Sincev is a constant, the second term in the summation is zero. This implies that∑ni=1

∂H

∂ piui ≥ 0 by the

definition ofui .SinceH ≥ 0 and ∂H

∂ pi(ui +v) ≥ 0, then limt→+∞ W(p1(t), . . . , pn(t)) exists and must be zero. Suppose

the limit is not zero; then, it should be strictly negative, as−c≤ 0. But if it is strictly negative, we have that,givenε > 0, there isTε > 0, such that∀ t ≥ Tε , W(t)≤−c+ε < 0. But then,W(t ′)−W(t)≤ (−c+ε)(t ′− t)for all t ′, t ≥ Tε andW(t) would not be lower bounded, which is a contradiction.Proof, Proposition 6.4.For the affine flow, the time-optimal trajectories are given by (7). Taking the gradientof pi(t) with respect to initial positionpi(0) results in

∂ pi(t)∂ pi(0)

= eAt .

To connect the above gradient back to the problem of finding∂ζi∂ pi

, recall thatζi is a parameterization of

∂RT(pi0). Thus points alongζi are solutions of the optimal control problem at a particular (fixed) time. Theresult (15) follows immediately.Proof, Lemma 6.3.Let ui = 0 for all i ∈ {1, . . . ,n}. ThenH = ∑n

i=1∂H

∂ pic. Suppose an agenti has a region

that intersects the boundary ofX. Sinceui = 0, agenti drifts parallel to the shore. Furthermore, other agentsdrift at the same rate. Thus, all agents maintain their relative positions and so total area remains constant.

Thus∑ni=1

∂H

∂ pic = 0. Now, forui as in (14),H = ∑n

i=1 v∥

∥

∥

∂H

∂ pi

∥

∥

∥≥ 0.

Proof, Proposition 6.5.We can take the result (17) and take the derivative with respect to initial positionp0 by applying the chain rule. Before launching into the computation, we make a couple observations. Thesequence of interface crossing positions{x1, . . . , xq} depends only on the initial heading choiceθ0. In fact,each interface crossing position ¯xℓ depends only on the previous interface crossing ¯xℓ−1 and the (fixed) choiceof directionθℓ−1 determined by (8). On the other hand, we can express the interface crossing timetℓ as

tℓ = tℓ−1 +‖xℓ − xℓ−1‖‖pℓ−1‖

.

Thus,tℓ depends on both the previous interface crossing time and the previous crossing position.Applying the chain rule to (17), we have

∂ pT

∂ p0=

∂ xq

∂ p0− pq

∂ tq∂ p0

=∂ xq

∂ xq−1

∂ xq−1

∂ p0− pq

[

∂ tq∂ xq−1

∂ xq−1

∂ p0+

∂ tq∂ tq−1

∂ tq−1

∂ p0

]

.

Note that∂ xq

∂ x0=

(

∂ xq

∂ xq−1

)(

∂ xq−1

∂ xq−2

)

· · ·(

∂ x1

∂ x0

)

, Dq .

Then

∂ pT

∂ p0= Dq− pq

[

∂ tq∂ xq−1

Dq−1 +∂ tq−1

∂ p0

]

.


Repeated application of the chain rule on the last term givesthe result.Proof, Proposition 6.6.Suppose there is a parameterizationζ : R → R

2 of the curve described byψ(x) = 0.Let ζ (0) correspond to the interface crossing position ¯x. We wish to find the variation of the crossing positionx with respect to a change in initial positionp0. For example, in Figure 15, for a horizontal displacementδx,we wish to find

lim‖δx‖→0

ζ (t)−ζ (0)

‖δx‖ , (23)

whereζ (t) is the point along the curveζ that intersects the displaced trajectory. We also wish to find a similarquantity for vertical displacementsδy.

ζ (0)

ζ (t)

x

δx

p0 p0 +δx

Fig. 15 Diagram for the proof.

Sincex is constrained to lie on the curveζ (t), the derivatives of ¯x with respect to the horizontal andvertical variations in initial position must lie along the tangent directionζ ′(0). One can show that computing

lim‖δx‖→0

x−ζ (0)

‖δx‖ , (24)

wherex is the point along the tangent line fromζ (0) that intersects the displaced trajectory, is equivalent tocomputing (23) as‖δx‖→ 0. The limit (24) is much easier to compute since ˜x can be solved for explicitly.

In other words, we use the linear approximation ofζ (t) aboutζ (0) = x to compute the Jacobian∂ x∂ p0

at

the pointx. Let η be the unit normal vector toζ at ζ (0). Then, given a constant velocity ˙p, the intersectionpoint x of a trajectory starting fromp0 to the tangent line throughζ (0) is given by

x = p0 +(ζ (0)− p0)

TηpTη

p.

The quotient of inner products represents the travel time from p0 to the tangent line, since the inner productsproject the travel distance and travel velocity onto the normal direction. Expandingp0, p, andη into compo-nents and differentiating with respect to horizontal and vertical changes ofp0 yields the final result of (19).In performing the above differentiation, we also differentiate the crossing time with respect to initial positionto get (20).Proof, Proposition 6.7.We assume a non-simple trajectory such as that described in Theorem 5.1. Then,there is a specific pointx∗ = xℓ∗ that a trajectory originating fromp0 must pass through to reach points in

U (p0). This point isfixed in space, so the quantity∂ xℓ∗∂ p0

= 0. Thus, the expression for the Jacobian (18)becomes (21).

The remaining task is to compute the special term∂ tℓ∗∂ xℓ∗−1

corresponding to the special crossing point ¯xℓ∗ .

For ℓ < ℓ∗, the derivative is given by (20). We begin with

tℓ∗ = tℓ∗−1 +‖xℓ∗ − xℓ∗−1‖

‖pℓ∗−1‖.

Then, differentiating with respect to the previous crossing locationxℓ∗−1, we obtain the result (22).

Coverage Maximization with Autonomous Agents in...

Documents

Transcript of Coverage Maximization with Autonomous Agents in...