
Tutorial: Sparse Signal Recovery

Anna C. Gilbert

Department of Mathematics, University of Michigan

(Sparse) Signal recovery problem

[Figure: Φx = y. The signal or population x has length N with k important features; the measurement or test vector y has length m.]

Under-determined linear system: Φx = y

Given Φ and y, recover information about x

Solving LS: arg min ‖Φx − y‖_2 returns a non-sparse x (a small demonstration follows below)
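A small numpy demonstration of this point (dimensions are illustrative): the minimum-norm least-squares solution of an under-determined system spreads its energy over essentially all coordinates.

import numpy as np

rng = np.random.default_rng(0)
N, m, k = 256, 64, 5                 # signal length, measurements, sparsity

x_true = np.zeros(N)
x_true[rng.choice(N, k, replace=False)] = rng.standard_normal(k)

Phi = rng.standard_normal((m, N)) / np.sqrt(m)
y = Phi @ x_true

# Minimum-norm least-squares solution: dense, not sparse.
x_ls, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print("nonzeros in x_true:", np.count_nonzero(x_true))
print("entries of x_ls above 1e-6:", int(np.sum(np.abs(x_ls) > 1e-6)))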

Two main examples: group testing and compressed sensing

Φx = y

Group testing

• Φ binary = pooling design

• x binary: 1 =⇒ defective, 0 =⇒ acceptable

• OUTPUT: defective set

• success = find the defective set exactly

• Arithmetic: OR

Example: Combinatorial group testing

The classic puzzle: one bottle of wine in the cellar is poisoned, and a rat dies only 1 week after drinking poisoned wine

Being good (computer) scientists, they do the following:

Unique encoding of each bottle

If bottle 5 were poisoned...

...after 1 week
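A minimal sketch of the encoding and OR-arithmetic decoding, assuming 8 bottles, 3 rats, and a single poisoned bottle (the names and sizes are illustrative):

import numpy as np

n_bottles, n_rats = 8, 3                 # log2(8) rats suffice for 1 defective

# Pooling design: rat i drinks from bottle j iff bit i of j is 1.
Phi = np.array([[(j >> i) & 1 for j in range(n_bottles)]
                for i in range(n_rats)])

x = np.zeros(n_bottles, dtype=int)
x[5] = 1                                 # bottle 5 is poisoned

# Boolean (OR) measurements: a rat dies iff it drank any poisoned bottle.
deaths = (Phi @ x) > 0

# Decode: the death pattern is the binary encoding of the poisoned bottle.
poisoned = sum(int(d) << i for i, d in enumerate(deaths))
print("deaths:", deaths.astype(int), "-> bottle", poisoned)   # bottle 5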

Two main examples: group testing and compressed sensing

Φ x = y

Compressed sensing

• Φ = measurement matrix

• x signal, sparse/compressible

• OUTPUT: x̂, a good approximation to x

• success = ‖x − x̂‖_p versus ‖x − x_k‖_q

• Arithmetic: ℝ or ℂ

[Figure: a compressible signal x splits into a head (its k largest entries, x_k, with k ≪ N) and a small tail, so x ≈ x_k.]

Biological applications are a combination of these examples

• Sparse (binary?) matrices

• Quantitative (non-Boolean) measurements

• Noisy or missing measurements

• Various signal models

Theoretical design problems: jointly design matrices and algorithms

Design/characterize Φ with m < N rows and a recovery algorithm such that the output x̂ is close to a good approximation of x:

‖x − x̂‖_p ≤ C ‖x − x_k‖_q.

Algorithmic resources

• Computational time: return x̂ in time proportional to N or to m? (impacts the number of items returned)

• Computational memory: use working space proportional to N or to m?

• Small-space random constructions: can we generate rows/cols of Φ from collections of random variables with limited independence?

• Explicit constructions: how do we construct Φ? Randomly or deterministically? Time to calculate each entry?

• Matrix density: number of non-zero entries in Φ?

Signal models

• Adversarial or “for all”: recover all x with the guarantee

‖x − x̂‖_2 ≤ (C/√k) ‖x − x_k‖_1.

When x is compressible, i.e., ‖x − x_k‖_1 ≤ √k ‖x − x_k‖_2, we obtain the better result

‖x − x̂‖_2 ≤ C ‖x − x_k‖_2.

• Probabilistic or “for each”: recover each x that satisfies a statistical constraint

– fixed signal x, recovered with high probability over the construction of Φ

– uniform distribution over k-sparse signals, i.e., a random signal model

Empirical algorithmic performance

• Polynomial time algorithms: typically fewer measurements required for the desired accuracy, by a constant factor

1. Convex optimization: constrained minimization (e.g., Bregman iteration)

arg min ‖x‖_1 s.t. ‖y − Φx‖_2 ≤ δ.

Unconstrained minimization (e.g., ℓ_1-regularized LS; see the ISTA sketch after this list)

minimize L(x; γ, y) = (1/2) ‖y − Φx‖_2² + γ ‖x‖_1.

2. Greedy algorithms: iterative thresholding, subspace iteration (see the IHT sketch after this list)

z = Φ*y, then threshold z, keeping the largest T entries

• Sublinear time algorithms: for sparse, binary matrices only (currently)

Strive to work in space and time proportional to m, with specially designed matrices/distributions and algorithms
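To make the convex-optimization item above concrete, here is a minimal ISTA (iterative soft thresholding) sketch for the unconstrained ℓ_1-regularized objective. This is an illustrative numpy sketch, not the tutorial's algorithm; the step size and iteration count are assumptions.

import numpy as np

def soft_threshold(z, t):
    # Entrywise soft thresholding: the proximal operator of t * ||.||_1.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(Phi, y, gamma, n_iters=500):
    # Step size 1/L, where L bounds the Lipschitz constant ||Phi||_2^2.
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iters):
        grad = Phi.T @ (Phi @ x - y)   # gradient of (1/2)||y - Phi x||_2^2
        x = soft_threshold(x - grad / L, gamma / L)
    return x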
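And a matching iterative hard thresholding (IHT) sketch for the greedy item: start from z = Φ*y, keep the T largest entries, and iterate. Again, names, step size, and iteration count are illustrative; step = 1 presumes Φ is scaled so that ‖Φ‖_2 ≤ 1.

import numpy as np

def hard_threshold(z, T):
    # Keep the T largest-magnitude entries of z, zero out the rest.
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-T:]
    out[keep] = z[keep]
    return out

def iht(Phi, y, T, n_iters=100, step=1.0):
    x = hard_threshold(Phi.T @ y, T)   # first iterate: threshold Phi* y
    for _ in range(n_iters):
        x = hard_threshold(x + step * Phi.T @ (y - Phi @ x), T)
    return x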

Sparse matrices: Expander graphs

[Figure: bipartite graph with a left vertex set S and its neighbor set N(S).]

• Adjacency matrix Φ of a d-regular (k, ε) expander graph

• Graph G = (X, Y, E), |X| = n, |Y| = m

• For any S ⊂ X with |S| ≤ k, the neighbor set satisfies |N(S)| ≥ (1 − ε) d |S|

• Probabilistic construction (a sketch follows below): d = O(log(n/k)/ε), m = O(k log(n/k)/ε²)

• Deterministic construction: d = O(2^{O(log³(log(n)/ε))}), m = (k/ε) · 2^{O(log³(log(n)/ε))}

Adjacency matrix of the bipartite graph:

1 0 1 1 0 1 1 0
0 1 0 1 1 1 0 1
0 1 1 0 0 1 1 1
1 0 0 1 1 0 0 1
1 1 1 0 1 0 1 0

[Figure: measurement matrix (larger example), shown as a roughly 20 × 250 sparsity plot.]
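A minimal sketch of the probabilistic construction above: each left node (column) picks d random right neighbors, giving a binary matrix with exactly d ones per column. Such matrices are expanders with high probability for suitable constants; the constants and names here are illustrative assumptions.

import numpy as np

def random_expander_matrix(n, k, eps, rng=None):
    # Parameters follow the slide up to constants: d ~ log(n/k)/eps,
    # m ~ k log(n/k)/eps^2; column j gets exactly d ones at random rows.
    rng = rng or np.random.default_rng()
    d = max(1, int(np.ceil(np.log(n / k) / eps)))
    m = max(d, int(np.ceil(k * np.log(n / k) / eps ** 2)))
    Phi = np.zeros((m, n), dtype=np.uint8)
    for j in range(n):
        Phi[rng.choice(m, size=d, replace=False), j] = 1
    return Phi, d

Phi, d = random_expander_matrix(n=256, k=4, eps=0.5)
print(Phi.shape, "column weight:", d)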

RIP(p)

A measurement matrix Φ satisfies the RIP(p, k, δ) property if for any k-sparse vector x,

(1 − δ) ‖x‖_p ≤ ‖Φx‖_p ≤ (1 + δ) ‖x‖_p.

Examples:

• RIP(2): Gaussian random matrix, random rows of the discrete Fourier matrix

• RIP(1): adjacency matrix of expander
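A quick empirical sanity check of the RIP(2) inequality just defined, evaluated on random k-sparse vectors. Verifying RIP exactly is hard in general, so this only estimates δ from below; dimensions and names are illustrative.

import numpy as np

def empirical_rip_delta(Phi, k, p=2, trials=1000, rng=None):
    # Largest observed deviation of ||Phi x||_p / ||x||_p from 1
    # over random k-sparse x: an empirical lower bound on delta.
    rng = rng or np.random.default_rng()
    m, N = Phi.shape
    worst = 0.0
    for _ in range(trials):
        x = np.zeros(N)
        x[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
        ratio = np.linalg.norm(Phi @ x, p) / np.linalg.norm(x, p)
        worst = max(worst, abs(ratio - 1.0))
    return worst

rng = np.random.default_rng(0)
G = rng.standard_normal((64, 256)) / np.sqrt(64)   # scaled Gaussian matrix
print("empirical delta for RIP(2):", empirical_rip_delta(G, k=4, rng=rng))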

RIP(p) ⇐⇒ expander

Theorem. (k, ε) expansion implies

(1 − 2ε) d ‖x‖_1 ≤ ‖Φx‖_1 ≤ d ‖x‖_1

for any k-sparse x, where d is the degree of the expander.

Theorem. RIP(1) + binary sparse matrix implies a (k, ε) expander for

ε = (1 − 1/(1 + δ)) / (2 − √2).

Joint work with Berinde, Indyk, Karloff, Strauss

RIP(2) and LP decoding

Let Φ be an RIP(2) matrix and collect measurements Φx = y. Then LP recovers a good approximation to the top k entries of x:

arg min ‖x‖_1 s.t. Φx = y

LP returns x̂ with ‖x − x̂‖_2 ≤ (C/√k) ‖x − x_k‖_1, and a noise-resilient version holds too. [Candès, Romberg, Tao; Donoho]
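This LP can be solved with any linear-programming solver by splitting x into positive and negative parts (x = x⁺ − x⁻, both nonnegative). A minimal sketch using scipy's linprog; the dimensions and function name are illustrative assumptions.

import numpy as np
from scipy.optimize import linprog

def lp_decode(Phi, y):
    # min sum(xp + xn) s.t. Phi (xp - xn) = y, xp, xn >= 0,
    # which is exactly arg min ||x||_1 s.t. Phi x = y.
    m, N = Phi.shape
    c = np.ones(2 * N)
    A_eq = np.hstack([Phi, -Phi])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    xp, xn = res.x[:N], res.x[N:]
    return xp - xn

rng = np.random.default_rng(1)
N, m, k = 128, 40, 3
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((m, N)) / np.sqrt(m)
print("recovery error:", np.linalg.norm(lp_decode(Phi, Phi @ x0) - x0))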

Expansion =⇒ LP decoding

Theorem. Let Φ be the adjacency matrix of a (2k, ε) expander. Consider two vectors x, x* such that Φx = Φx* and ‖x*‖_1 ≤ ‖x‖_1. Then

‖x − x*‖_1 ≤ 2/(1 − 2α(ε)) · ‖x − x_k‖_1

where x_k is the optimal k-term representation of x and α(ε) = 2ε/(1 − 2ε).

• Guarantees that the linear program recovers a good sparse approximation

• Noise-resilient version too

RIP(1) ≠ RIP(2)

• Any binary sparse matrix which satisfies RIP(2) must have Ω(k²) rows [Chandar ’07]

• A Gaussian random matrix with m = O(k log(n/k)) rows (scaled) satisfies RIP(2) but not RIP(1) (a numerical check follows after this list):

xᵀ = (0 ··· 0 1 0 ··· 0), yᵀ = (1/k ··· 1/k 0 ··· 0)

‖x‖_1 = ‖y‖_1 but ‖Gx‖_1 ≈ √k ‖Gy‖_1

• For biological applications, it may be useful to use sparse, binary RIP(1)+RIP(2) matrices with many rows.
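A quick numerical check of the example above, with illustrative dimensions: a unit spike x and a spread-out y have the same ℓ_1 norm, yet a Gaussian G stretches them very differently in ℓ_1.

import numpy as np

rng = np.random.default_rng(0)
m, N, k = 200, 400, 100

x = np.zeros(N); x[0] = 1.0          # single spike:  ||x||_1 = 1
y = np.zeros(N); y[:k] = 1.0 / k     # spread over k: ||y||_1 = 1

G = rng.standard_normal((m, N))
ratio = np.linalg.norm(G @ x, 1) / np.linalg.norm(G @ y, 1)
print(f"||Gx||_1 / ||Gy||_1 = {ratio:.2f}, sqrt(k) = {np.sqrt(k):.2f}")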

RIP(1) =⇒ LP decoding

• ℓ_1 uncertainty principle

Lemma. Let y satisfy Φy = 0 and let S be the set of the k largest coordinates of y. Then

‖y_S‖_1 ≤ α(ε) ‖y‖_1.

• LP guarantee

Theorem. Consider any two vectors u, v such that for y = u − v we have Φy = 0 and ‖v‖_1 ≤ ‖u‖_1. Let S be the set of the k largest entries of u. Then

‖y‖_1 ≤ 2/(1 − 2α(ε)) · ‖u_{S^c}‖_1.

Greedy, iterative algorithms for sparse matrices

• Polynomial time algorithms: several variants, similar to IHT [Berinde, Indyk, Ružić]

• Sublinear time algorithms: several variants, depending on norms and signal models [Gilbert, Li, Porat, Strauss; Porat, Strauss; Ngo, Porat, Rudra; Price, Woodruff]

High Throughput Screening (HTS)

HTS is an essential step in drug discovery (and elsewhere in biology)

• Large chemical libraries screened on a biological target for activity

• Basic 0/1-type biological assays to find active compounds

• Usually a small number of compounds found

• One-at-a-time screening: power comes from automation and miniaturization

• Noisy assays with false positive and false negative errors


Current HTS Design

Current HTS uses a one-at-a-time testing scheme (with repeated trials).

Mathematical set-up (QUAPO)

• Binary measurement matrix: adjacency matrix of an unbalanced expander graph [Berinde et al. (2008), Combining geometry and combinatorics: a unified approach to sparse signal recovery]

• Appropriate linear biochemical model

• Decoding via linear programming


Pooled HTS Design: Potential Advantages

• Pooled testing of compounds

• Uses fewer tests

• Work is moved from testing (costly) to computational analysis (cheap)

• Handles errors in testing better due to built-in redundancy

• Provides additional quantitative information

HTS and signal recovery

Analogy to HTS

To use the compressed sensing approach, we need a linear model.

Experimental set-up

• Deterministic binary matrix = Shifted Transversal Design (STD)

• Time-resolved FRET screen to identify inhibitors of RGS4C

• Singleplex: 316 compounds screened twice

• Multiplex (4×): 316 compounds screened 4 per well using 316 pooled wells
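A minimal simulation of the shape of this screen (316 compounds, 4 per well, 316 wells). A random 4-per-well pooling design stands in for the STD, which is not constructed here; the activity values, noise level, and threshold are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)
n_compounds, n_wells, per_well = 316, 316, 4

# Each well pools `per_well` compounds (row weight 4), as a stand-in
# for the Shifted Transversal Design.
Phi = np.zeros((n_wells, n_compounds), dtype=np.uint8)
for w in range(n_wells):
    Phi[w, rng.choice(n_compounds, per_well, replace=False)] = 1

x = np.zeros(n_compounds)                       # quantitative activities
x[rng.choice(n_compounds, 3, replace=False)] = rng.uniform(0.5, 1.0, 3)

y = Phi @ x + 0.05 * rng.standard_normal(n_wells)   # noisy assay readout
print("wells reading above the noise floor:", int(np.sum(y > 0.2)))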


Experimental results (QUAPO screen)

• Multiplex results match Singleplex 1 very closely

• Decoding in the presence of large errors


Theoretical → practical challenges in HTS applications

• Accuracy of output, not speed of algorithm

• Joint matrix and algorithm design: maximize recovered information within budget

• Incorporate prior information

• Incorporate noise

• Incorporate measurement device configuration