The Nonparanormal SKEPTIC and Its Application


Page 1: The Nonparanormal SKEPTIC and Its Application · pi.math.cornell.edu/~raazesh/LifeNetworks2014Files/... · dimension >> # observations • log likelihood: log det Θ – tr(SΘ) – (terms involving the mean)

The Nonparanormal SKEPTIC and Its Application


Outline

• The Nonparanormal SKEPTIC
• inferring biochemical networks


the precision matrix

• inverse of the covariance matrix
• Θ
• if the data is multivariate normal:

node  1  2  3  4  5  6  7
1     0  ~  ~  ~  0  0  0
2     ~  0  0  0  0  0  0
3     ~  0  0  0  ~  0  0
4     ~  0  0  0  0  0  0
5     0  0  ~  0  0  ~  0
6     0  0  0  0  ~  0  ~
7     0  0  0  0  0  ~  0

[Figure: graph on nodes 1 to 7]
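For multivariate normal data, a zero off-diagonal entry of Θ means the two variables are conditionally independent given the rest, so the nonzero entries define the edges of a graph. The sketch below reads the edge list off the table above; the encoding of the table as strings and the helper name `edges_from_pattern` are illustrative assumptions, not part of the slides.

```python
# Sparsity pattern copied from the table: "~" marks a nonzero entry, "0" a zero entry.
pattern = [
    "0~~~000",
    "~000000",
    "~000~00",
    "~000000",
    "00~00~0",
    "0000~0~",
    "00000~0",
]

def edges_from_pattern(rows):
    """Return undirected edges (i, j) with i < j for every nonzero off-diagonal entry."""
    edges = []
    for i, row in enumerate(rows):
        for j, entry in enumerate(row):
            if i < j and entry == "~":
                edges.append((i + 1, j + 1))  # nodes are numbered from 1 in the table
    return edges

edges = edges_from_pattern(pattern)
print(edges)
```

Only the upper triangle is scanned because the pattern is symmetric, so each edge is reported once.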


2 problems

• dimension >> # observations
• data is not multivariate normal


dimension >> # observations

• log likelihood

maximize over Θ: $\log\det\Theta - \operatorname{tr}(S\Theta) - (\text{terms involving the mean})$
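A small numerical sketch of why this maximization breaks down when the dimension exceeds the number of observations: the sample covariance S is then singular, and the objective can be pushed up without bound along a direction that S cannot see. The sizes `n` and `p`, the random data, and the function name `gaussian_loglik_term` are hypothetical choices for illustration; the slides do not say which remedy (for example, a sparsity penalty) is used.

```python
import numpy as np

def gaussian_loglik_term(theta, S):
    """Return log det(theta) - tr(S @ theta), the part of the Gaussian
    log likelihood that depends on the precision matrix theta."""
    sign, logdet = np.linalg.slogdet(theta)
    if sign <= 0:
        raise ValueError("theta must be positive definite")
    return logdet - np.trace(S @ theta)

rng = np.random.default_rng(0)
n, p = 5, 20                       # hypothetical sizes: dimension >> # observations
X = rng.standard_normal((n, p))
S = np.cov(X, rowvar=False)        # p x p sample covariance

rank_S = np.linalg.matrix_rank(S)  # at most n - 1, so S is singular

# Take a direction v with S v (numerically) zero and move along I + t * v v^T.
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
v = eigvecs[:, 0]
values = [
    gaussian_loglik_term(np.eye(p) + t * np.outer(v, v), S)
    for t in (1.0, 10.0, 100.0)
]
print(rank_S, values)
```

Along this direction the trace term stays fixed while the log determinant grows like log(1 + t), so no finite maximizer exists without some extra constraint or penalty.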


data is not multivariate normal

• trick 1 (the nonparanormal):
• trick 2 (nonparametric correlation):

Liu, H., Lafferty, J., & Wasserman, L. (2009). The Nonparanormal: Semiparametric Estimation of High Dimensional Undirected Graphs. Journal of Machine Learning Research, 10, 2295–2328.
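The two tricks can be sketched together. Under the nonparanormal assumption the data are Gaussian only after unknown monotone transformations of each variable, and a rank correlation is unchanged by such transformations. The sketch assumes the Kendall's tau variant, with the correlation estimated as sin(π/2 · τ); the slide text does not show which rank correlation is used, and the function names and simulated data here are hypothetical.

```python
import math
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences (assumes no ties)."""
    n = len(x)
    s = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            s += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return 2.0 * s / (n * (n - 1))

def rank_based_correlation(X):
    """Correlation matrix with sin(pi/2 * tau) off the diagonal and 1 on it."""
    p = X.shape[1]
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau = kendall_tau(X[:, j], X[:, k])
            R[j, k] = R[k, j] = math.sin(math.pi / 2 * tau)
    return R

rng = np.random.default_rng(1)
# Latent Gaussian data with correlation 0.6 (hypothetical example).
Z = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=200)
# Observed data: strictly increasing transforms give non-Gaussian marginals.
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3])

R_latent = rank_based_correlation(Z)
R_observed = rank_based_correlation(X)
print(R_latent[0, 1], R_observed[0, 1])
```

Because only the ordering of the values enters the estimate, the transformed data give the same matrix as the latent Gaussian data. That matrix can then stand in for S in the log likelihood.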


Pristionchus pacificus

• satellite model organism of C. elegans
• necromenic association with scarab beetles
• global distribution
  – diverse habitats
  – diverse but structured genetic background

Image courtesy of Sommer Lab

Collaboration: Ralf J. Sommer, Director, MPI, Tuebingen, Germany


data set

• ~450 strains
• 2 replicates each
• positive and negative ionization high resolution lcms (metabolome)
• restriction site associated DNA marker SNP calls (genome)


rad seq

[Figure: RAD-seq workflow. Labels: genomic DNA; restriction enzyme (×2); adapter (×2); sequencing; SNP calling]

Poland, J. A., Brown, P. J., Sorrells, M. E., & Jannink, J.-L. (2012). Development of high-density genetic maps for barley and wheat using a novel two-enzyme genotyping-by-sequencing approach. PLoS ONE, 7(2), e32253. doi:10.1371/journal.pone.0032253


snp data set

           snp_locus_1  snp_locus_2  snp_locus_3  …
sample_1
sample_2
sample_3
…

1% genomic coverage

# alleles  count
1          194
2          2947
3          1


liquid chromatography coupled mass spectrometry (lcms)

[Figure: chromatography column, mass spectrometer, and total ion chromatogram]

Image sources:
column: http://www.waters.com/webassets/cms/category/media/snapshot/ACQUITY_Column.jpg
mass spectrometer: https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcSJGwVjgNgUcS9gVvxiupz6-wrL5jrVypj09BYwFnIfvHGSfFXXdg


lc-ms peak table (xcms)

           peak_1 (m,rt)  peak_2 (m,rt)  peak_3 (m,rt)  …
sample_1
sample_2
sample_3
…

~2,000 features


[Figure: principal component scatter plots; axis labels PC 1, PC 2, PC 3, PC 4]


ascaroside-centric metabolic network

[Figure: network diagram; labeled feature (m, rt) = (466.2, 5.78)]


ascaroside-centric metabolic network

Start Node  End Node  Shortest Path  Shortest Path To Random  Shortest Path To Random  Correlation
                                     Node From Start Node     Node From End Node
ascr#9      pasc#12   1              9.18                     10.74                    -0.128061447
pasc#9      pasc#12   2              13.28                    10.88                    -0.076858659
ascr#9      pasc#9    3              9.6                      12.52                    -0.626094706
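The comparison in this table, a direct shortest path between two compounds against the typical shortest path from either compound to a random node, can be sketched with a breadth-first search. The toy network, the node names, and the way the random baseline is drawn are all assumptions for illustration; the slides do not describe how the baseline was computed.

```python
import random
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_path_to_random_nodes(adj, source, n_draws, rng):
    """Average shortest-path length from source to randomly drawn reachable nodes."""
    dist = shortest_path_lengths(adj, source)
    others = sorted(v for v in dist if v != source)
    draws = [dist[rng.choice(others)] for _ in range(n_draws)]
    return sum(draws) / len(draws)

# Hypothetical toy network (path a-b-c-d-e plus a branch c-f); not the network from the slides.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("c", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

direct = shortest_path_lengths(adj, "a")["b"]
baseline = mean_path_to_random_nodes(adj, "a", 1000, random.Random(0))
print(direct, baseline)
```

A direct path that is much shorter than the random baseline, as in the three rows of the table, suggests the two nodes sit closer together in the network than chance placement would give.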


advantages of this method

• requires no prior knowledge
• unsupervised
• group-wise inference
• generalizable
• efficient
• functional