Optimal Experimental Design for Constrained Inverse Problems

Transition Workshop @ SAMSI

May 1, 2017

Lars Ruthotto¹, Julianne Chung², Matthias Chung²

¹ Department of Mathematics and Computer Science, Emory University
² Department of Mathematics, Virginia Tech




Agenda

- Motivation and Introduction
  - Example: 2D Tomography
  - Two models for sparsity-enforcing design
- Empirical Bayes Formulation
  - idea: improve design using training data
  - bi-level optimization problem
  - inequality constraints on both levels
  - parallel framework to accelerate computation
- Numerical Examples
- Summary / Open Problems




Motivation and Introduction




Application: 2D Tomography

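Discrete Radon transform: Given $f_{\rm true} \in \mathbb{R}^n$, obtain data for angle $\theta$ as

$$d(\theta) = T R(\theta) f_{\rm true},$$

- $R(\theta) \in \mathbb{R}^{n \times n}$: rotation matrix
- $T \in \mathbb{R}^{n_T \times n}$: tomography operator

OED Question: How to optimally choose $\theta_1, \theta_2, \ldots, \theta_\ell$?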
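A minimal sketch of the model $d(\theta) = T R(\theta) f_{\rm true}$ (in Python, for illustration only; this is not jInv code). It assumes a 2×2 image stored row by row, takes T to be column sums, and represents the 90° rotation as a permutation matrix. The helper `matvec`, the matrices, and the image values are all assumptions made for this example; general angles would require interpolation inside $R(\theta)$.

```python
def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector (list)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# 2x2 image [[1, 2], [3, 4]] flattened row by row.
f_true = [1.0, 2.0, 3.0, 4.0]

# Assumed tomography operator: T sums each image column (one ray per column).
T = [[1, 0, 1, 0],
     [0, 1, 0, 1]]

# R(0) is the identity; R(90 degrees, counterclockwise) is the permutation
# that maps [[1, 2], [3, 4]] to [[2, 4], [1, 3]].
R0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R90 = [[0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0]]

d0 = matvec(T, matvec(R0, f_true))    # column sums of the image
d90 = matvec(T, matvec(R90, f_true))  # row sums of the image
print(d0, d90)
```

The two angles give different projections of the same image (column sums versus row sums), which is why the choice of angles matters for the reconstruction.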




Inverse Problem

Consider the discrete inverse problem

$$d(p) = M(p) f_{\rm true} + \varepsilon(p)$$

with
- $d(p) \in \mathbb{R}^m$: noisy data
- $M(p) \in \mathbb{R}^{m \times n}$: linear forward operator
- $p \in \mathbb{R}^\ell$: design parameters (angles, offsets, ...)
- $f_{\rm true} \in \mathbb{R}^n$: true model
- $\varepsilon(p) \in \mathbb{R}^m$: measurement noise, $\varepsilon(p) \sim \mathcal{N}(0, \Gamma_\varepsilon(p))$

Assume that the true model(s) satisfy
- $C_e f - c_e = 0$ (e.g., sum of intensities known)
- $C_i f - c_i \ge 0$ (e.g., non-negativity, box constraints, ...)

Does the optimal design differ between constrained and unconstrained problems?




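Optimal Experimental Design: Problem A

Idea: Start with many angles $\theta_1, \theta_2, \ldots, \theta_\ell$ ($\ell \gg 0$) and consider

$$M(p) = \operatorname{diag}(p) A, \qquad A = \begin{bmatrix} T R(\theta_1) \\ \vdots \\ T R(\theta_\ell) \end{bmatrix}, \qquad d(p) = \operatorname{diag}(p)\, d.$$

($p \in \mathbb{R}^\ell$ with $p \ge 0$ encodes the importance of the measurements.)

Find the relevant components in $p$ by solving the OED problem (upper level)

$$\min_{p \ge 0} \; \mathbb{E}\, \bigl\| f(p) - f_{\rm true} \bigr\|^2 + \beta \|p\|_1 .$$

Here, $f(p)$ solves the inverse (lower-level) problem, i.e.,

$$f(p) = \arg\min_f \; \tfrac12 \|M(p) f - d(p)\|_2^2 + \tfrac{\alpha}{2} \|L(f - \mu)\|_2^2 \quad \text{subject to} \quad f_L \le f \le f_H, \;\; C_e f - c_e = 0.$$

Inequality constraints ⇒ no closed-form solution for $f(p)$.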
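For contrast, a short worked step under stated assumptions: the bound and equality constraints are dropped, the regularization term is weighted by $\alpha/2$, and $M(p)^\top M(p) + \alpha L^\top L$ is invertible. Setting the gradient of the lower-level objective to zero then gives a linear system, so $f(p)$ has a closed form:

```latex
\begin{aligned}
0 &= M(p)^\top \bigl( M(p) f - d(p) \bigr) + \alpha L^\top L \,(f - \mu) \\
\Longrightarrow \quad
f(p) &= \bigl( M(p)^\top M(p) + \alpha L^\top L \bigr)^{-1}
        \bigl( M(p)^\top d(p) + \alpha L^\top L \,\mu \bigr).
\end{aligned}
```

With bounds present, which components of $f$ sit at $f_L$ or $f_H$ depends on $p$ and on the data, so no single formula of this kind applies.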




Optimal Experimental Design: Problem B

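Let $p \in \mathbb{R}^\ell$ ($\ell$ small) be the projection angles and

$$M(p) = \begin{bmatrix} T R(p_1) \\ \vdots \\ T R(p_\ell) \end{bmatrix}, \qquad d(p) = M(p) f_{\rm true} + \varepsilon.$$

Find the optimal angles by solving the OED problem (upper level)

$$\min_{p_{\min} \le p \le p_{\max}} \; \mathbb{E}\, \bigl\| f(p) - f_{\rm true} \bigr\|_2^2 .$$

Here, $f(p)$ solves the inverse (lower-level) problem, i.e.,

$$f(p) = \arg\min_f \; \tfrac12 \|M(p) f - d(p)\|_2^2 + \tfrac{\alpha}{2} \|L(f - \mu)\|_2^2 \quad \text{subject to} \quad f_L \le f \le f_H, \;\; C_e f - c_e = 0.$$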




Related work

A. Alexanderian, N. Petra, G. Stadler, and O. Ghattas. A-Optimal Design of Experiments for Infinite-Dimensional Bayesian Linear Inverse Problems with Regularized ℓ0-Sparsification. SIAM SISC, 36(5):A2122–A2148, 2014.

M. Chung and E. Haber. Experimental design for biological systems. SIAM Journal on Control and Optimization, 50(1):471–489, 2012.

E. Haber, L. Horesh, and L. Tenorio. Numerical methods for experimental design of large-scale linear ill-posed inverse problems. Inverse Problems, 24(5):055012–17, Sept. 2008.

E. Haber, L. Horesh, and L. Tenorio. Numerical methods for the design of large-scale nonlinear discrete ill-posed inverse problems. Inverse Problems, 26(2):025002, Feb. 2010.

OED for Inverse Problems with Constraints

- Will the optimal design change when constraints are considered?
- Can we solve large-scale OED problems?
  - many design parameters
  - high-resolution inversions
  - lots of test data




Empirical Bayes Formulation




Empirical Bayes Formulation

Let $f^{(1)}_{\rm true}, \ldots, f^{(N)}_{\rm true} \in \mathbb{R}^n$ be true models (training data). Given $p$, simulate datasets $d^{(1)}(p), \ldots, d^{(N)}(p) \in \mathbb{R}^m$ and obtain reconstructions $f^{(1)}(p), \ldots, f^{(N)}(p)$ by solving the lower-level problem.

Goal: Find $p$ that minimizes the mean squared error (MSE)

$$\mathrm{MSE}(p) = \frac{1}{2N} \sum_{i=1}^{N} \bigl\| f^{(i)}(p) - f^{(i)}_{\rm true} \bigr\|^2 .$$

Solution strategies:
- Reduced approach: eliminate the lower-level problem and treat $f(p)$ as a (differentiable?) function
- Upper-level problem: use projected steepest descent, projected Gauss-Newton, ...

Note that

$$\nabla_p \mathrm{MSE}(p) = \frac{1}{N} \sum_{i=1}^{N} J_{f^{(i)}}(p)^\top \bigl( f^{(i)}(p) - f^{(i)}_{\rm true} \bigr).$$

Challenge: Compute the Jacobian (or sensitivity) $J_{f^{(i)}}(p)$, where $J_{f^{(i)}}(p)^\top = \nabla_p f^{(i)}(p)$.
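The MSE and its gradient can be sketched on a toy problem (Python, illustration only; not jInv code). The assumptions are loud ones: a scalar model ($n = 1$), the weighting design $M(p) = \operatorname{diag}(p)\,a$, $d(p) = \operatorname{diag}(p)\,d$, $L = 1$, and no constraints, so that the lower-level solution and its sensitivity are available in closed form. All names (`reconstruct`, `mse_and_gradient`) and all numbers are hypothetical.

```python
def reconstruct(p, a, d, alpha, mu):
    """Closed-form lower-level solution f(p) and sensitivity df/dp for a scalar
    model with M(p) = diag(p) a and d(p) = diag(p) d (unconstrained, L = 1)."""
    num = sum(pj**2 * aj * dj for pj, aj, dj in zip(p, a, d)) + alpha * mu
    den = sum(pj**2 * aj**2 for pj, aj in zip(p, a)) + alpha
    f = num / den
    # Quotient rule: df/dp_j = 2 p_j a_j (d_j - a_j f) / den
    jac = [2 * pj * aj * (dj - aj * f) / den for pj, aj, dj in zip(p, a, d)]
    return f, jac

def mse_and_gradient(p, a, data, f_true, alpha, mu):
    """MSE(p) = 1/(2N) sum_i (f_i(p) - f_true_i)^2 and its gradient
    (1/N) sum_i J_i^T (f_i(p) - f_true_i)."""
    N = len(f_true)
    mse, grad = 0.0, [0.0] * len(p)
    for d, ft in zip(data, f_true):
        f, jac = reconstruct(p, a, d, alpha, mu)
        r = f - ft
        mse += r * r / (2 * N)
        grad = [g + j * r / N for g, j in zip(grad, jac)]
    return mse, grad

# Two training models, three measurements each, with fixed "noise" values.
a = [1.0, 2.0, 0.5]
f_true = [1.0, 3.0]
noise = [[0.1, -0.2, 0.05], [-0.1, 0.1, 0.2]]
data = [[aj * ft + e for aj, e in zip(a, en)] for ft, en in zip(f_true, noise)]

p = [1.0, 0.5, 2.0]
mse, grad = mse_and_gradient(p, a, data, f_true, alpha=0.1, mu=2.0)
print(mse, grad)
```

In the constrained setting the closed-form `reconstruct` is not available; the point of the sketch is only the structure of the sum over training models.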




Lower Level Problem as Convex QP

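Recall the lower-level problem

$$f(p) = \arg\min_f \; \tfrac12 \|M(p) f - d(p)\|_2^2 + \tfrac{\alpha}{2} \|L(f - \mu)\|_2^2 \quad \text{subject to} \quad f_L \le f \le f_H, \;\; C_e f - c_e = 0.$$

Introducing a slack variable $s$, we rewrite this as

$$\min_{f, s} \; \tfrac12 f^\top Q(p) f + b(p)^\top f \quad \text{subject to} \quad C_e f - c_e = 0, \;\; C_i f - c_i - s = 0, \;\; s \ge 0,$$

where
- $Q(p) = M(p)^\top M(p) + \alpha L^\top L$
- $b(p) = -M(p)^\top d(p) - \alpha L^\top L \mu$
- $C_i = [I_n; -I_n]$, $c_i = [f_L; -f_H]$ (bound constraints)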
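A short derivation of the quadratic form, with the dependence on $p$ suppressed and the regularization weighted by $\alpha/2$. Expanding both squared norms and collecting terms in $f$ gives

```latex
\begin{aligned}
\tfrac12 \|M f - d\|_2^2 + \tfrac{\alpha}{2} \|L(f - \mu)\|_2^2
&= \tfrac12 f^\top \bigl( M^\top M + \alpha L^\top L \bigr) f
   - \bigl( M^\top d + \alpha L^\top L \,\mu \bigr)^\top f
   + \tfrac12 \|d\|_2^2 + \tfrac{\alpha}{2} \|L \mu\|_2^2 \\
&= \tfrac12 f^\top Q f + b^\top f + \text{const},
\end{aligned}
```

so the linear term is $b = -M^\top d - \alpha L^\top L \mu$, which reduces to $-M^\top d - \alpha \mu$ for $L = I_n$. The constant does not affect the minimizer. For the bounds, $C_i f - c_i = [f - f_L;\; f_H - f] \ge 0$ is exactly $f_L \le f \le f_H$.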




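Lower Level Problem: Optimality Conditions

The Lagrangian of the lower-level problem is

$$\mathcal{L}(f, \lambda_e, s, \lambda_i) = \tfrac12 f^\top Q(p) f + b(p)^\top f - \lambda_e^\top (C_e f - c_e) - \lambda_i^\top (C_i f - c_i - s).$$

Necessary and sufficient conditions: Find $(f, \lambda_e, \lambda_i, s)$ such that

$$F(f, \lambda_e, \lambda_i, s) = \begin{bmatrix} \nabla_f \mathcal{L} \\ \nabla_{\lambda_e} \mathcal{L} \\ \nabla_{\lambda_i} \mathcal{L} \end{bmatrix} = 0, \qquad \lambda_i \odot s = 0, \qquad \lambda_i, s \ge 0.$$

Interior point method: Relax the complementarity condition using $\sigma > 0$ and $\mu_k = (s_k^\top \lambda_i^k)/m$:

$$F(f, \lambda_e, \lambda_i, s; \sigma, \mu) = \begin{bmatrix} Q(p) f + b(p) - C_e^\top \lambda_e - C_i^\top \lambda_i \\ C_e f - c_e \\ C_i f - c_i - s \\ S \Lambda_i e - \sigma \mu_k e \end{bmatrix} = 0, \qquad \lambda_i, s > 0,$$

with $S = \operatorname{diag}(s)$ and $\Lambda_i = \operatorname{diag}(\lambda_i)$. Use Mehrotra's algorithm.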
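To make the (unrelaxed) optimality conditions concrete, the following sketch checks them for a scalar bound-constrained QP. This is a Python illustration under stated assumptions (one unknown, no equality constraints, $C_i = [1; -1]$, $c_i = [f_L; -f_H]$); it only verifies a candidate solution and is not an interior point solver or Mehrotra's algorithm. The function name `kkt_residuals` and the numbers are hypothetical.

```python
def kkt_residuals(q, b, f, lam, f_lo, f_hi):
    """Residuals of the KKT system for min 0.5*q*f^2 + b*f s.t. f_lo <= f <= f_hi,
    written with C_i = [1; -1], c_i = [f_lo; -f_hi] and slack s = C_i f - c_i."""
    s = [f - f_lo, f_hi - f]                      # slacks of the two bounds
    stationarity = q * f + b - (lam[0] - lam[1])  # Q f + b - C_i^T lambda_i
    complementarity = [li * si for li, si in zip(lam, s)]
    feasible = all(si >= 0 for si in s) and all(li >= 0 for li in lam)
    return stationarity, complementarity, feasible

# The unconstrained minimizer is -b/q = 3, so the upper bound f_hi = 1 is active.
stat, comp, feas = kkt_residuals(q=2.0, b=-6.0, f=1.0, lam=[0.0, 4.0],
                                 f_lo=0.0, f_hi=1.0)
print(stat, comp, feas)
```

The multiplier of the active upper bound is positive and its slack is zero, while the inactive lower bound has zero multiplier and positive slack, which is the complementarity pattern the interior point method relaxes.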



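Sensitivities of Interior Point Solution

Find the Jacobian $J_f(p) \in \mathbb{R}^{n \times \ell}$ such that for small $\Delta p \in \mathbb{R}^\ell$

$$f(p + \Delta p) = f(p) + J_f(p) \Delta p + O(\|\Delta p\|^2).$$

Differentiating $F$ around $(f(p), \lambda_e(p), \lambda_i(p), s(p))$ gives

$$0 = \nabla_p F(f, \lambda_e, \lambda_i, s; \sigma, \mu) + \begin{bmatrix} \nabla_f F & \nabla_{\lambda_e} F & \nabla_s F & \nabla_{\lambda_i} F \end{bmatrix} \begin{bmatrix} J_f(p) \\ J_{\lambda_e}(p) \\ J_s(p) \\ J_{\lambda_i}(p) \end{bmatrix}.$$

In our case this gives

$$\begin{bmatrix} J_f(p) \\ J_{\lambda_e}(p) \\ J_s(p) \\ J_{\lambda_i}(p) \end{bmatrix} = - \begin{bmatrix} Q(p) & -C_e^\top & 0 & -C_i^\top \\ C_e & 0 & 0 & 0 \\ C_i & 0 & -I & 0 \\ 0 & 0 & \Lambda_i & S \end{bmatrix}^{-1} \begin{bmatrix} \nabla_p \bigl( Q(p) f(p) + b(p) \bigr) \\ 0 \\ 0 \\ 0 \end{bmatrix}.$$

Building $J_f \in \mathbb{R}^{n \times \ell}$ is costly when $\ell$ is large → matrix-free mode / store factorization?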
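The sensitivity system can be tried out on the smallest possible case. The sketch below (Python, illustration only) assumes a scalar QP $\min \tfrac12 q f^2 + b f$ subject to $f \ge 0$, no equality constraints, unknowns ordered as $(f, s, \lambda)$, a fixed relaxation value `tau` standing in for $\sigma\mu$, and $b(p) = p$ so that $\nabla_p(Qf + b) = 1$. The helpers `solve_relaxed`, `solve_linear`, and `sensitivity` are hypothetical names.

```python
import math

def solve_relaxed(q, b, tau):
    """Solve q*f + b - lam = 0, f - s = 0, s*lam = tau with s, lam > 0."""
    f = (-b + math.sqrt(b * b + 4.0 * q * tau)) / (2.0 * q)
    s = f
    lam = tau / s
    return f, s, lam

def solve_linear(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def sensitivity(q, db_dp, s, lam):
    """Solve the linearized system for (J_f, J_s, J_lam) and return J_f."""
    K = [[q, 0.0, -1.0],    # stationarity:    q f + b - lam = 0
         [1.0, -1.0, 0.0],  # slack:           f - s = 0
         [0.0, lam, s]]     # complementarity: s * lam = tau
    return solve_linear(K, [-db_dp, 0.0, 0.0])[0]

q, tau = 1.0, 1e-6
results = {}
for label, b in (("inactive", -2.0), ("active", 2.0)):
    f, s, lam = solve_relaxed(q, b, tau)
    results[label] = sensitivity(q, 1.0, s, lam)  # b(p) = p, so db/dp = 1
    print(label, f, results[label])
```

When the bound is inactive the sensitivity is close to the unconstrained value $-1/q$; when the bound is active it is close to zero, since the solution stays pinned at the bound.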



Extensions to New Designs

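The approach to sensitivity computation is very flexible. We need to compute

$$\nabla_p \bigl( Q(p) f + b(p) \bigr) = \nabla_p \bigl( M(p)^\top M(p) f - M(p)^\top d(p) \bigr).$$

This requires the product rule and methods for computing

$$\nabla_p (M(p) f), \qquad \nabla_p (M(p)^\top d), \qquad \text{and} \qquad \nabla_p d(p).$$

Example: For OED problem A we have
- $\nabla_p (M(p) f) = \operatorname{diag}(A f)$
- $\nabla_p (M(p)^\top d) = (\operatorname{diag}(d) A)^\top$
- $\nabla_p (d(p)) = \operatorname{diag}(d)$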
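The first two derivative formulas for problem A can be checked numerically. The sketch below (Python, illustration only) assumes a small dense $A$ with one row per design weight, a fixed vector `v` in place of $d$, and central finite differences; all helper names and numbers are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def forward(p, A, f):
    """M(p) f with M(p) = diag(p) A."""
    return [pj * r for pj, r in zip(p, matvec(A, f))]

def adjoint(p, A, v):
    """M(p)^T v = A^T diag(p) v."""
    w = [pj * vj for pj, vj in zip(p, v)]
    return [sum(A[i][k] * w[i] for i in range(len(A))) for k in range(len(A[0]))]

def fd_jacobian(fun, p, h=1e-6):
    """Central finite-difference Jacobian; entry [i][j] = d fun_i / d p_j."""
    cols = []
    for j in range(len(p)):
        pp, pm = list(p), list(p)
        pp[j] += h
        pm[j] -= h
        cols.append([(x - y) / (2 * h) for x, y in zip(fun(pp), fun(pm))])
    return [[cols[j][i] for j in range(len(p))] for i in range(len(cols[0]))]

def max_abs_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
f = [0.5, -1.0]
v = [2.0, 1.0, -1.0]
p = [1.0, 2.0, 0.5]

Af = matvec(A, f)
J_forward = [[Af[i] if i == j else 0.0 for j in range(3)] for i in range(3)]  # diag(A f)
J_adjoint = [[A[j][k] * v[j] for j in range(3)] for k in range(2)]            # A^T diag(v)

err_forward = max_abs_diff(fd_jacobian(lambda x: forward(x, A, f), p), J_forward)
err_adjoint = max_abs_diff(fd_jacobian(lambda x: adjoint(x, A, v), p), J_adjoint)
print(err_forward, err_adjoint)
```

Both maps are linear in $p$, so the finite-difference Jacobians agree with the formulas up to rounding error.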




Solving OED Problem A

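Two-step procedure from Haber, Horesh, and Tenorio (2008), but with projected Gauss-Newton-CG.

Step 1: Consider the relaxed $\ell_0$-problem for some $\beta > 0$

$$p^* = \arg\min_{p \ge 0} \; \mathbb{E}\, \bigl\| f(p) - f_{\rm true} \bigr\|^2 + \beta \|p\|_1 .$$

Step 2: Let $p^* = Q p$ where $Q \in \mathbb{R}^{\ell \times k}$ for $k \ll \ell$. Fine-tune the nonzero weights by solving the small-scale problem

$$\min_{p \ge 0} \; \mathbb{E}\, \bigl\| f(Q p) - f_{\rm true} \bigr\|^2 .$$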
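One plausible reading of Step 2 is that the columns of $Q$ are the unit vectors of the nonzero entries of $p^*$, so that the small problem only optimizes the surviving weights. The sketch below (Python, illustration only) builds such a $Q$; the function name `support_matrix`, the threshold `tol`, and the example weights are assumptions, not taken from the slides.

```python
def support_matrix(p_star, tol=1e-8):
    """Return Q (ell x k, list of rows) whose columns are the unit vectors of
    the entries of p_star above tol, together with the reduced weights."""
    support = [j for j, pj in enumerate(p_star) if pj > tol]
    Q = [[1.0 if j == s else 0.0 for s in support] for j in range(len(p_star))]
    p_small = [p_star[j] for j in support]
    return Q, p_small

p_star = [0.0, 0.7, 0.0, 0.0, 1.3, 0.0]   # sparse weights from Step 1 (made up)
Q, p_small = support_matrix(p_star)

# Q maps the k reduced weights back to the full design vector.
p_back = [sum(qjs * ps for qjs, ps in zip(row, p_small)) for row in Q]
print(p_small, p_back)
```

Here $k = 2$ of the $\ell = 6$ weights survive, and the reduced vector `p_small` is the starting point for the fine-tuning problem.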




Computational Challenges

Recall the MSE and its gradient

$$\mathrm{MSE}(p) = \frac{1}{2N} \sum_{i=1}^{N} \bigl\| f^{(i)}(p) - f^{(i)}_{\rm true} \bigr\|^2, \qquad \nabla_p \mathrm{MSE}(p) = \frac{1}{N} \sum_{i=1}^{N} J_{f^{(i)}}(p)^\top \bigl( f^{(i)}(p) - f^{(i)}_{\rm true} \bigr).$$

Computationally challenging for large-scale OED (many parameters, high-resolution reconstructions, many training data).

The structure is similar to parameter estimation problems with many measurements (PDE-constrained optimization, machine learning, ...).

Some ideas:

1. Sampling: subsample training data in the OED phase.
2. Parallelization: distribute the terms in the MSE. IPQP problems are solved in parallel; factorizations or preconditioners are stored on the workers for sensitivity calculations.
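The parallelization idea amounts to evaluating the $N$ terms of the sum independently and adding them up. The sketch below shows only that pattern, in Python with a thread pool; it is not how jInv distributes work, threads in CPython do not speed up pure-Python arithmetic, and the toy reconstruction $f_i(p) = p_1 x_i + p_2$ is a stand-in for solving the lower-level problem. All names and numbers are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def misfit_term(args):
    """One term of the MSE sum and its gradient for the toy 'reconstruction'
    f_i(p) = p[0] * x_i + p[1]."""
    p, x_i, f_true_i = args
    r = p[0] * x_i + p[1] - f_true_i
    return 0.5 * r * r, [r * x_i, r]

def mse_parallel(p, xs, f_true, workers=2):
    """Evaluate MSE(p) and its gradient by distributing the N terms."""
    N = len(xs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        terms = list(pool.map(misfit_term, [(p, x, ft) for x, ft in zip(xs, f_true)]))
    mse = sum(t[0] for t in terms) / N
    grad = [sum(t[1][k] for t in terms) / N for k in range(len(p))]
    return mse, grad

xs = [0.0, 1.0, 2.0, 3.0]
f_true = [1.0, 3.0, 5.0, 7.0]
mse, grad = mse_parallel([1.0, 1.0], xs, f_true)
print(mse, grad)
```

Subsampling fits the same pattern: pass only a random subset of the training pairs to `mse_parallel` during the design phase.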




Who is julia?

- open source, started 2009, public since 2012
- started by Alan Edelman @ MIT → large developer community
- dynamic programming language (multiple dispatch)
- sophisticated type system (though optional)
- designed for speed and efficiency (just-in-time compiler)
- designed for parallel/cloud computing
- standard library mostly written in Julia
- easy to connect with C, C++, Fortran, Python, ...




Selected JuliaInv Packages on Github




Acknowledgements

The Julia community (particularly Jiahao Chen, MIT).

Main jInv collaborators: Eldad Haber, Eran Treister, Samy Wu, Dave Marchant, Christoph Schwarzbach, Patrick Belliveau

LR and SW are supported by the National Science Foundation grant DMS 1522599.




Step 1: Prepare forward problems on workers

jInv allows for automatic or user-defined scheduling. Here: Automatic.

[Diagram: the main process holds the problems (f1, A, L1), (f2, A, L2), (f3, A, L3) and collects Ref[1], Ref[2], Ref[3]. Worker 1 prepares pFor[1] and pFor[3]; worker 2 prepares pFor[2]. Each worker keeps its data and prepared problems and returns a RemoteRef.]

1. Send data 1 to worker 1 and data 2 to worker 2
2. Get remote reference from worker 1 and send problem 3
3. Get remote references from workers 1 and 2




Step 2: Compute Misfit

Note: Data locations are fixed.

[Diagram: the main process holds Ref[1], Ref[2], Ref[3] and sends p to both workers. Worker 1 solves pFor[1] and pFor[3]; worker 2 solves pFor[2]. The main process accumulates ϕ = 0, ∇ϕ = 0, then ϕ = ϕ1, ∇ϕ = ∇ϕ1, then ϕ = ϕ1 + ϕ2, ∇ϕ = ∇ϕ1 + ∇ϕ2.]

1. Send model p to both workers
2. Worker 1 returns remote reference to simulated data f1 and ∇f1
3. Workers 1 and 2 return remote references

Sensitivities, factorizations, or preconditioners are stored on the workers.




Simple Parallelization using Multiple Dispatch

Example: Linear least-squares problem with 10 training models.

Option 1: Use a single pFor for sequential computation:

    pFor = LeastSquaresParam(A,L,[],D)

Option 2: Split up sources and use Array{LeastSquaresParam} for parallelization:

    pFor1 = LeastSquaresParam(A,L,[],D[:,1:5])
    pFor2 = LeastSquaresParam(A,L,[],D[:,6:10])
    pForp = [pFor1; pFor2]

Option 3: Distribute least-squares problems a priori to reduce communication:

    pFord    = Array{RemoteRef{Channel{Any}}}(2)
    pFord[1] = @spawnat processor1 identity(pFor1)
    pFord[2] = @spawnat processor2 identity(pFor2)

Then: getData(m,pFor) = getData(m,pForp) = getData(m,pFord)

jInv uses multiple dispatch to find the correct method of getData.
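The dispatch idea (one function name, with the method chosen by the type of the forward-problem argument) can be imitated in Python with `functools.singledispatch`. This is only an analogy: single dispatch selects on the first argument alone, so the argument order is swapped relative to `getData(m,pFor)`, and the class below is a hypothetical stand-in that shares nothing with jInv's `LeastSquaresParam` except the name.

```python
from functools import singledispatch

class LeastSquaresParam:
    """Hypothetical stand-in: a scalar 'forward operator' a and data columns D."""
    def __init__(self, a, D):
        self.a, self.D = a, D

@singledispatch
def get_data(p_for, m):
    raise TypeError(f"unsupported forward-problem type: {type(p_for).__name__}")

@get_data.register
def _(p_for: LeastSquaresParam, m):
    # Sequential path: simulate a * m once per data column held by this problem.
    return [p_for.a * m for _ in p_for.D]

@get_data.register
def _(p_for: list, m):
    # Split path: evaluate each sub-problem and concatenate the results.
    out = []
    for sub in p_for:
        out.extend(get_data(sub, m))
    return out

single = LeastSquaresParam(2.0, list(range(10)))
split = [LeastSquaresParam(2.0, list(range(5))),
         LeastSquaresParam(2.0, list(range(5, 10)))]
print(get_data(single, 3.0) == get_data(split, 3.0))
```

The caller uses the same function for the single problem and the split list, and the result is identical in both cases.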




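Optimization: Projected Gauss-Newton

jInv provides a tailored implementation of projected Gauss-Newton for solving

$$\min_p \; f(p) \quad \text{subject to} \quad p_L \le p \le p_H,$$

for $f : \mathbb{R}^n \to \mathbb{R}$ smooth (in general not convex).

Projected steepest descent on the active set:

$$\delta p_L = \max(0, -\nabla f(p)) \quad \text{and} \quad \delta p_H = \min(0, -\nabla f(p))$$

Projected Newton step on the inactive set (with implicit $H(p)$):

$$P^\top H(p) P \, \delta p_{\rm in} = -P^\top \nabla f(p)$$

Use curvature information to scale the steepest descent step:

$$\delta p = \begin{cases} \min\bigl( \|\delta p_{\rm in}\|_\infty / \|\delta p_L\|_\infty, 1 \bigr)\, \delta p_L, & p = p_L \\ \min\bigl( \|\delta p_{\rm in}\|_\infty / \|\delta p_H\|_\infty, 1 \bigr)\, \delta p_H, & p = p_H \\ \delta p_{\rm in}, & \text{else.} \end{cases}$$

Line search: projected Armijo.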
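The projection and the projected Armijo line search can be sketched on a simpler method. The code below (Python, illustration only) implements plain projected steepest descent, not jInv's projected Gauss-Newton: it has no Hessian, no active/inactive split, and no step scaling. Function names, the constant `c1`, and the test problem are assumptions.

```python
def project(p, lo, hi):
    """Clip p componentwise onto the box lo <= p <= hi."""
    return [min(max(pj, l), h) for pj, l, h in zip(p, lo, hi)]

def projected_descent(fun, grad, p, lo, hi, max_iter=50, c1=1e-4, tol=1e-10):
    """Projected steepest descent with a projected Armijo backtracking search."""
    p = project(p, lo, hi)
    for _ in range(max_iter):
        g = grad(p)
        t = 1.0
        while True:
            p_new = project([pj - t * gj for pj, gj in zip(p, g)], lo, hi)
            step = [a - b for a, b in zip(p_new, p)]
            # Sufficient decrease measured along the projected step.
            if fun(p_new) <= fun(p) + c1 * sum(gj * sj for gj, sj in zip(g, step)):
                break
            t *= 0.5
            if t < 1e-12:   # give up on this direction
                break
        if max(abs(sj) for sj in step) < tol:
            break
        p = p_new
    return p

# Toy problem: min 0.5*||p - c||^2 on the box [0, 1]^2 with c outside the box.
c = [2.0, -1.0]
fun = lambda p: 0.5 * sum((pj - cj) ** 2 for pj, cj in zip(p, c))
grad = lambda p: [pj - cj for pj, cj in zip(p, c)]

p_opt = projected_descent(fun, grad, [0.5, 0.5], lo=[0.0, 0.0], hi=[1.0, 1.0])
print(p_opt)
```

The minimizer is the projection of `c` onto the box, with one component at the upper bound and one at the lower bound; at that point the projected step vanishes and the iteration stops.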




Numerical Experiments




2D Tomography Example: Binary Rectangles

- training data: 40 × 40 × 20
- randomly sized and spaced rectangles
- binary intensities
- 180 projection angles, 40 projections per angle
- L = I_n (identity)
- OED modes:
  1. unconstrained
  2. equality constrained*
  3. non-negativity constrained*
  4. box-constrained*

[Figure: test images]

*: true images used to formulate constraints




Binary Rectangles: OED B Results

[Figure: MSE vs. angles; panels: unconstrained, equality constraints, non-negativity, box constraints]

[Figure: MSE vs. regularization parameter β (logarithmic axis from 10^-2 to 10^1); curves: unconstrained, equality, non-negativity, box-constrained]




Binary Rectangle: OED A Results

[Figure: number of nonzero projections vs. β (left) and reconstruction error (MSE) vs. β (right), both with logarithmic β axes from 10^0 to 10^2]





Binary Rectangle: Reconstruction Results

[Figure: reconstructions; panels: no constraints, equality constraints, non-negativity, box constraints]




2D Tomography Example: Shepp Logan

- training data: 64 × 64 × 20
- intensities between 0 and 1
- 180 projection angles, 64 projections per angle
- L = I_n (identity)
- OED modes:
  1. unconstrained
  2. equality constrained*
  3. non-negativity constrained*
  4. box-constrained*

[Figure: test images]

*: true images used to formulate constraints




Shepp Logan: OED B Results

[Figure: MSE vs. angles; panels: unconstrained, equality constraints, non-negativity, box constraints]

[Figure: MSE vs. regularization parameter λ (logarithmic axis from 10^-2 to 10^1)]





Binary Rectangle: Reconstruction Results

[Figure: reconstructions; panels: no constraints (nnz=88), equality constraints (nnz=74), non-negativity (65)]




Summary and Outlook




Σ: OED for Constrained Inverse Problems

OED can drastically improve the effectiveness of constrained inversions.

Empirical Bayesian Optimal Design
- handles linear equality and inequality constraints
- requirements: training data, resources

Numerical Optimization
- inequalities ⇒ nonlinear optimality conditions
- differentiate solution of interior point method (predictor-corrector)
- modular approach: extensible to new designs
- computationally challenging ⇒ parallel processing

Future Work
- Bayesian formulation
- OED for nonlinear constrained inverse problems
- large-scale 3D imaging applications
- application in PDE parameter estimation

Special thanks to y'all at SAMSI and SAMSI sponsors.
