PU Learning for Matrix Completion
Cho-Jui Hsieh
Dept of Computer Science, UT Austin
ICML 2015
Joint work with N. Natarajan and I. S. Dhillon
Cho-Jui Hsieh Dept of Computer Science UT Austin PU Learning for Matrix Completion
Matrix Completion
Example: movie recommendation
Given the observed set Ω and the values MΩ, how can we predict the other entries?
Matrix Completion
Assumption: the underlying matrix M is low rank.
Recover M by solving
    min_{‖X‖∗ ≤ t} ∑_{(i,j)∈Ω} (Xij − Mij)²,

where ‖X‖∗ is the nuclear norm (the best convex relaxation of rank(X)).
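The slide states the nuclear-norm-constrained problem; a standard way to solve the closely related penalized form is iterative SVD soft-thresholding (the soft-impute idea). A minimal numpy sketch, where the function name, λ, and all sizes are illustrative:

```python
import numpy as np

def soft_impute(M_obs, mask, lam=0.01, iters=200):
    """Nuclear-norm-penalized completion via iterative SVD soft-thresholding."""
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        # Keep the observed values, fill the rest with the current estimate.
        Z = mask * M_obs + (1 - mask) * X
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - lam, 0.0)) @ Vt  # shrink singular values
    return X

# Tiny demo on a rank-1 matrix with ~30% of the entries hidden.
rng = np.random.default_rng(0)
M = np.outer(rng.random(8), rng.random(8))
mask = (rng.random(M.shape) < 0.7).astype(float)
X = soft_impute(M * mask, mask)
```

The shrinkage parameter plays the role of the constraint level t: a larger λ corresponds to a smaller nuclear-norm ball.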
One Class Matrix Completion
All the observed entries are 1's. Examples:
- Link prediction in social networks (only friend relationships are observed).
- Product recommendation using purchase networks.
- "Follows" on Twitter, "likes" on Facebook, …
Can we apply matrix completion?
Minimizing the loss only on the observed 1's yields a trivial solution: the all-ones matrix fits every observed entry perfectly.
Can we apply matrix completion?
Treat all the missing entries as zeroes, and minimize the loss over all the entries.
But 99% of the elements are zero ⇒ the model tends to fit the zeroes instead of the ones.
Challenges
All the observed entries are 1's.
A '0' is an unlabeled entry: it can be either 0 or 1 in the underlying matrix.
PU (Positive and Unlabeled) Matrix Completion:
- How to formulate the problem?
- How to solve the problem?
- What is the sample complexity?
- What is the time complexity?
Outline
Non-deterministic setting – Shifted Matrix Completion
Deterministic setting – Biased Matrix Completion
Extension to PU matrix completion with features.
Experimental Results
Non-deterministic setting
Mij ∈ [0, 1], and M is low rank.
The generating process: M (underlying) → Y (0–1 matrix) → Ω1.
An underlying 0–1 matrix Y is generated by

    Yij = 1 with prob. Mij,  0 with prob. 1 − Mij.

Ω1 is sampled from {(i, j) | Yij = 1}; the sample rate is 1 − ρ = |Ω1| / ‖Y‖0.
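This generating process is easy to simulate directly; a small numpy sketch, where n, the rank k, and ρ are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 200, 5
# Low-rank underlying M with entries in [0, 1].
U = rng.random((n, k)) / np.sqrt(k)
V = rng.random((n, k)) / np.sqrt(k)
M = np.clip(U @ V.T, 0.0, 1.0)

# Y_ij = 1 with probability M_ij, 0 otherwise.
Y = (rng.random((n, n)) < M).astype(int)

# Each 1 of Y is revealed with probability 1 - rho.
rho = 0.6
revealed = (Y == 1) & (rng.random((n, n)) < 1 - rho)
Omega1 = np.argwhere(revealed)          # the observed positive entries
sample_rate = len(Omega1) / Y.sum()     # concentrates around 1 - rho
```

Only Omega1 is handed to the learner; Y and M are what the two settings below try to recover.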
Unbiased Estimator of Error
Find the best X to minimize the mean square error on M:

    min_X ∑_{i,j} (Xij − Mij)² = min_X ∑_{i,j} ℓ(Xij, Mij)
Unbiased Estimator of Error
Find the best X to minimize the mean square error on M. The unbiased estimator [Natarajan et al., 2013]:

    ℓ̃(Xij, Aij) = ((Xij − 1)² − ρ Xij²) / (1 − ρ)   if Aij = 1
    ℓ̃(Xij, Aij) = Xij²                               if Aij = 0.
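The unbiasedness can be checked numerically: averaging ℓ̃ over the observation process (A_ij = 1 iff Y_ij = 1 and the entry is revealed) recovers the expected squared loss against Y. A quick Monte Carlo sketch, with x, m, and ρ chosen arbitrarily:

```python
import numpy as np

def ell_tilde(x, a, rho):
    """Corrected loss: reweights observed 1s to compensate for the hidden ones."""
    return ((x - 1) ** 2 - rho * x ** 2) / (1 - rho) if a else x ** 2

rng = np.random.default_rng(0)
x, m, rho, N = 0.3, 0.7, 0.6, 200_000
# A = 1 iff Y = 1 (prob. m) and the entry is revealed (prob. 1 - rho).
a = (rng.random(N) < m) & (rng.random(N) < 1 - rho)
mc_mean = np.where(a, ell_tilde(x, 1, rho), ell_tilde(x, 0, rho)).mean()
target = m * (x - 1) ** 2 + (1 - m) * x ** 2   # E[(x - Y)^2]
```

Expanding the expectation by hand gives E[ℓ̃] = m(x − 1)² + (1 − m)x², exactly the expected squared loss against Y, which is what the Monte Carlo mean approaches.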
Shifted Matrix Completion
Shifted Matrix Completion: solve

    min_X ∑_{i,j} ℓ̃(Xij, Aij)   s.t. ‖X‖∗ ≤ t,  0 ≤ Xij ≤ 1,

where

    ℓ̃(Xij, Aij) = ((Xij − 1)² − ρ Xij²) / (1 − ρ)   if Aij = 1
    ℓ̃(Xij, Aij) = Xij²                               if Aij = 0.

This is equivalent to

    min_X ‖X − Â‖²_F + λ‖X‖∗   s.t. 0 ≤ X ≤ 1,

where Âij = 1/(1 − ρ) if Aij = 1, and Âij = 0 if Aij = 0.
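The equivalent form suggests a simple proximal-gradient sketch: a gradient step on ‖X − Â‖²_F, singular-value shrinkage for the nuclear-norm term, and clipping to [0, 1]. In this sketch λ, the step size, and the sizes are illustrative, and the box constraint is handled by interleaving a projection with the prox rather than by an exact joint projection:

```python
import numpy as np

def shifted_mc(A, rho, lam=0.5, step=0.25, iters=100):
    """Proximal-gradient sketch for  min ||X - A_hat||_F^2 + lam*||X||_*  over [0, 1]."""
    A_hat = A / (1.0 - rho)                 # observed 1s shifted to 1/(1 - rho)
    X = np.zeros_like(A_hat, dtype=float)
    for _ in range(iters):
        G = X - step * 2.0 * (X - A_hat)                 # gradient step
        U, s, Vt = np.linalg.svd(G, full_matrices=False)
        X = (U * np.maximum(s - step * lam, 0.0)) @ Vt   # nuclear-norm prox
        X = np.clip(X, 0.0, 1.0)                         # project onto the box
    return X

# Demo: a planted block of 1s, with only 40% of them observed.
rng = np.random.default_rng(1)
Y = np.zeros((30, 30))
Y[:15, :15] = 1
A = Y * (rng.random(Y.shape) < 0.4)
X = shifted_mc(A, rho=0.6)
```

The shift by 1/(1 − ρ) is what makes the fit target the underlying probabilities rather than the sparsely observed 0–1 pattern.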
Error Bound for Shifted MF
Measure the error by R(X) = (1/n²) ∑_{i,j} (Mij − Xij)².

Theorem (error bound for Shifted MF): Let X̂ be the solution of Shifted MF. Then with probability at least 1 − δ,

    R(X̂) ≤ 3 √( log(2/δ) / (n(1 − ρ)) ) + Ct (2√n + 4√s) / ((1 − ρ) n²) = O( 1 / (n(1 − ρ)) ),

where C is a constant.
Deterministic Setting
Mij ∈ [0, 1] (can be generalized to other bounded matrices). With some threshold q ∈ [0, 1],

    Yij = 1 if Mij > q,  0 if Mij ≤ q,

and Ω1 is sampled from {(i, j) | Yij = 1}.
Given Ω1 alone, it is impossible to recover M: for example, M = η eeᵀ generates Y = eeᵀ for every η > q. So our goal is to recover Y.
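The non-identifiability example is easy to see concretely: every η above the threshold produces the identical all-ones Y, so only Y, not M, can be recovered. A two-line check (q and the two η values are arbitrary):

```python
import numpy as np

q, e = 0.5, np.ones(4)
Y_a = (0.6 * np.outer(e, e) > q).astype(int)   # M = 0.6 * e e^T
Y_b = (0.9 * np.outer(e, e) > q).astype(int)   # M = 0.9 * e e^T
same = np.array_equal(Y_a, Y_b)                # both are the all-ones matrix
```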
Biased Matrix Factorization
Square loss: ℓ(x, a) = (x − a)².
Biased square loss:

    ℓα(x, a) = α · 1[a = 1] · ℓ(x, 1) + (1 − α) · 1[a = 0] · ℓ(x, 0).

Biased MF:

    X̂ = arg min_{X : ‖X‖∗ ≤ t} ∑_{i,j} ℓα(Xij, Aij).

Recover Y by thresholding:

    X̄ij = 1 if X̂ij > q,  0 otherwise.
Sample Complexity
Error: R̄(X) = (1/n²) ∑_{i,j} 1[Xij ≠ Yij].

Theorem (error bound for BiasMF): Let X̄ be the solution of BiasMF. If α = (1 + ρ)/2, then with probability at least 1 − δ,

    R̄(X̄) ≤ (2η / (1 + ρ)) · ( Ct (2√n + 4√s) / n² + 3 √( log(2/δ) / (n(1 − ρ)) ) ) = O( 1 / (n(1 − ρ)) ),

where η = max(1/q², 1/(1 − q)², 8) and C is a constant.
Time Complexity
The gradient can be computed efficiently in O(nnz · k) time:
- Non-convex formulation: Alternating Least Squares (ALS) or Cyclic Coordinate Descent (CCD++) (Yu et al., 2012).
- Convex formulation: proximal gradient or active subspace selection (Hsieh et al., 2014).
One-bit matrix completion needs O(n²) time.
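The key to the O(nnz · k) cost is that the full-matrix term in the gradient collapses algebraically: summing X = UVᵀ over all n² entries reduces to U(VᵀV), and only the observed entries need an explicit sparse correction. A numpy sketch of the gradient with respect to U for the biased objective (sizes, α, and the random sampling of Ω are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, alpha = 300, 250, 8, 0.8
U = rng.standard_normal((n, k))
V = rng.standard_normal((m, k))

# Random set of observed 1s, without duplicates.
idx = rng.choice(n * m, size=2000, replace=False)
rows, cols = idx // m, idx % m

# f = (1-alpha) * sum_all X_ij^2 + sum_obs [alpha*(X_ij - 1)^2 - (1-alpha)*X_ij^2]
grad = 2 * (1 - alpha) * U @ (V.T @ V)           # dense term collapses: O((n+m) k^2)
x_obs = np.einsum("ij,ij->i", U[rows], V[cols])  # X_ij only on the observed entries
coef = 2 * alpha * (x_obs - 1) - 2 * (1 - alpha) * x_obs
np.add.at(grad, rows, coef[:, None] * V[cols])   # sparse correction: O(nnz * k)
```

Forming U @ V.T explicitly would cost O(n·m·k); the decomposition above never materializes the dense matrix, which is what lets these methods scale to millions of rows and columns.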
Inductive Matrix Completion
Proposed for matrix completion with features [Jain and Dhillon, 2013; Xu et al., 2013].
Input: a partially observed matrix AΩ and features Fu, Fv ∈ R^{n×d} associated with the rows/columns.
Recover the underlying matrix by solving

    min_{D ∈ R^{d×d}, ‖D‖∗ ≤ t} ∑_{(i,j)∈Ω} (Aij − (Fu D Fvᵀ)ij)².
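Because the unknown is only the small d × d matrix D, each observed entry gives one linear equation in the entries of D. Ignoring the nuclear-norm constraint, a plain least-squares fit already recovers D when enough entries are observed; a numpy sketch with an illustrative noiseless setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 60, 5
Fu = rng.standard_normal((n, d))       # row features
Fv = rng.standard_normal((n, d))       # column features
D_true = rng.standard_normal((d, d))
A = Fu @ D_true @ Fv.T                 # matrix fully determined by D_true

mask = rng.random((n, n)) < 0.2        # the observed set Omega
rows, cols = np.nonzero(mask)

# Each observation A_ij = <outer(Fu_i, Fv_j), D> is linear in vec(D).
Phi = np.einsum("ip,iq->ipq", Fu[rows], Fv[cols]).reshape(-1, d * d)
d_vec, *_ = np.linalg.lstsq(Phi, A[rows, cols], rcond=None)
D_hat = d_vec.reshape(d, d)
```

With d² unknowns instead of n², far fewer observations suffice, which is the sample-complexity advantage of the inductive setting.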
PU Inductive Matrix Completion
Inductive Matrix Completion: recover the underlying matrix using
1. a subset of the 1s in the matrix;
2. row and/or column features.
Inductive shifted matrix factorization (non-deterministic setting): Average Error = O(1/(n(1 − ρ))).
Inductive biased matrix factorization (deterministic setting): Average Error = O(1/(n(1 − ρ))).
Experimental results – link prediction
(a) Accuracy on ca-HepTh (b) FPR-FNR on ca-HepTh
Figure: Comparison of algorithms on the link prediction problem (11,204 nodes, 235,368 edges).
Experimental results – link prediction
(a) Accuracy on Myspace dataset. (b) FPR-FNR on Myspace dataset.
Figure: Comparison of algorithms on the link prediction problem (2,137,264 nodes, 90,333,122 edges).
Application – semi-supervised clustering
Original problem:
- Given n samples with features {xi}, i = 1, …, n.
- Given partial positive and negative pairwise relationships A ∈ R^{n×n}.
- Recover the clusters (categories of the samples).
- (Yi et al., 2013) proposed to use inductive MF to solve this problem.
Semi-supervised clustering with one-class observations:
- Only the positive pairs Ω1 are observed.
- We propose a one-class inductive MF to solve this problem.
Experimental results – semi-supervised clustering
Mushroom dataset, 8142 samples, 2 clusters.
Conclusions
We studied the one-class matrix completion problem and proposed algorithms with strong theoretical guarantees:
- The error decays at a rate of O(1/n).
- The algorithms scale to large problems (millions of rows and columns).
Applications:
- Link prediction in social networks (only friend relationships are observed).
- Product recommendation using purchase networks.
- "Follows" on Twitter, "likes" on Facebook, …