Multivariate Curve Resolution: theory and applications

Romà Tauler, April 2005 (53 pages)

Page 1: Title slide

Page 2: Multivariate Curve Resolution

[Diagram: mixed information → pure component information.]

- D: mixed information, a data matrix with retention times (tR) as rows and wavelengths (λ) as columns.
- C: pure concentration profiles c1 ... cn along the retention times (chemical model, process evolution, compound contribution).
- ST: pure signals s1 ... sn along the wavelengths (compound identity).

Page 3: Flowchart of MCR-ALS

Data matrix decomposition according to a bilinear model:

D = C ST + E (bilinear model)

[Flowchart: data matrix D → SVD or PCA (estimation of the number of components) → initial estimation → ALS optimization under CONSTRAINTS → resolved concentration profiles C and resolved spectra profiles ST, plus residuals E.]

ALS optimization:

$$\min_{\hat{C}} \lVert \hat{D}_{PCA} - \hat{C}\hat{S}^T \rVert \qquad \min_{\hat{S}^T} \lVert \hat{D}_{PCA} - \hat{C}\hat{S}^T \rVert$$

Results of the ALS optimization procedure: fit and diagnostics.

Journal of Chemometrics, 1995, 9, 31-58; Chemomet. Intel. Lab. Systems, 1995, 30, 133-146; Journal of Chemometrics, 2001, 15, 749-7; Analytica Chimica Acta, 2003, 500, 195-210
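The alternating steps of the flowchart can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB implementation: the function name `mcr_als`, the simulated Gaussian profiles and the use of simple clipping for non-negativity are all assumptions made here (published implementations typically use proper non-negative least squares), and the raw matrix D is used in place of its PCA-filtered version.

```python
import numpy as np

def mcr_als(D, S0, n_iter=200, tol=1e-10):
    """Toy ALS for D ~ C @ S.T with non-negativity imposed by clipping (assumption)."""
    S = np.clip(np.asarray(S0, dtype=float), 0.0, None)
    prev = None
    for _ in range(n_iter):
        # Step 1: given S, least-squares estimate of C, then constrain C >= 0
        C = np.clip(np.linalg.lstsq(S, D.T, rcond=None)[0].T, 0.0, None)
        # Step 2: given C, least-squares estimate of S, then constrain S >= 0
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0].T, 0.0, None)
        ssq = float(np.sum((D - C @ S.T) ** 2))
        # Stop when the residual sum of squares no longer changes
        if prev is not None and abs(prev - ssq) <= tol * prev:
            break
        prev = ssq
    lof = 100.0 * np.sqrt(ssq / np.sum(D ** 2))  # lack of fit, in percent
    return C, S, lof

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Simulated two-component system (hypothetical data, for illustration only)
t = np.arange(50.0)
w = np.arange(30.0)
C_true = np.column_stack([gauss(t, 20, 5), gauss(t, 30, 5)])
S_true = np.column_stack([gauss(w, 10, 4), gauss(w, 18, 4)])
D = C_true @ S_true.T

# Initial estimate: the true spectra with a small offset
C, S, lof = mcr_als(D, S_true + 0.02)
print(C.shape, S.shape, lof)
```

The lack of fit is one of the "fit and diagnostics" quantities; other constraints (unimodality, closure, selectivity) would be applied at the same two points where the clipping happens.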

Page 5: Hard + soft modelling constraints

MCR-ALS hybrid (grey) models

Page 6: Multiway data

Multiway data: multiple measurement orders/modes/directions/ways.

Example: three-way data

- Regular data cube D (I x J x K).
- Other three-way arrangements: a series of two-way data sets with common information in one or more modes.

Page 7: Data augmentations in MCR

- Row-wise: the same experiment monitored with different techniques. [D1 D2 D3] = C [S1T S2T S3T]
- Column-wise: several experiments monitored with the same technique. [D1; D2; D3] = [C1; C2; C3] ST
- Row and column-wise: several experiments monitored with several techniques. [D1 D2 D3; D4 D5 D6] = [C1; C2] [S1T S2T S3T]
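The three augmentation schemes amount to stacking matrices side by side or on top of each other. The sketch below uses random placeholder matrices and arbitrary sizes (all assumptions) only to show that each augmented matrix still follows a single bilinear model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_comp = 2

# Row-wise: one experiment (common C), three techniques (different spectra)
C = rng.random((20, n_comp))
S1, S2, S3 = (rng.random((n, n_comp)) for n in (30, 40, 15))
D_row = np.hstack([C @ S1.T, C @ S2.T, C @ S3.T])
ST_aug = np.vstack([S1, S2, S3]).T            # [S1T S2T S3T]

# Column-wise: three experiments (different C), one technique (common S)
S = rng.random((30, n_comp))
C1, C2, C3 = (rng.random((n, n_comp)) for n in (20, 25, 10))
D_col = np.vstack([C1 @ S.T, C2 @ S.T, C3 @ S.T])
C_aug = np.vstack([C1, C2, C3])               # [C1; C2; C3]

# Row and column-wise: two experiments, three techniques
D_both = np.block([[C1 @ S1.T, C1 @ S2.T, C1 @ S3.T],
                   [C2 @ S1.T, C2 @ S2.T, C2 @ S3.T]])

print(D_row.shape, D_col.shape, D_both.shape)
```

Row-wise augmentation requires the blocks to share their rows, and column-wise augmentation requires them to share their columns; the number of rows per experiment and of channels per technique can otherwise differ.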

Page 8: Trilinearity constraint (flexible to every species)

Extension of MCR-ALS to multilinear systems. In the column-wise augmented model [D1; D2; D3] = C ST, the constraint is applied species by species:

- Selection of the species profile in C.
- Folding of the species profile.
- PCA/SVD of the folded profile: the 1st score gives the common shape; the loadings give the relative amounts!
- Unfolding of the species profile.
- Substitution of the species profile in C.

Trilinearity constraint: unique solutions!

R. Tauler, I. Marqués and E. Casassas. Journal of Chemometrics, 1998, 12, 55-75
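The fold, decompose, unfold sequence of the trilinearity constraint can be sketched for a single species as follows. This is an illustration under stated assumptions, not the published code: the function name `trilinearity_constraint` is hypothetical, every run is assumed to have the same number of rows, and the rank-one approximation is taken from an SVD.

```python
import numpy as np

def trilinearity_constraint(c_aug, n_runs):
    """Replace one species profile of the augmented C by its rank-one approximation."""
    n_rows = c_aug.size // n_runs
    # Folding: one column per run
    folded = c_aug.reshape(n_runs, n_rows).T
    U, s, Vt = np.linalg.svd(folded, full_matrices=False)
    # First score = common shape, first loadings = relative amounts per run
    rank_one = s[0] * np.outer(U[:, 0], Vt[0])
    # Unfolding: back to a single stacked profile
    return rank_one.T.reshape(-1)

# Hypothetical profile: same Gaussian shape in 3 runs, different amounts
shape = np.exp(-0.5 * ((np.arange(40) - 20) / 4.0) ** 2)
amounts = [1.0, 2.0, 0.5]
c_trilinear = np.concatenate([a * shape for a in amounts])

# A profile that is already trilinear is left unchanged
c_same = trilinearity_constraint(c_trilinear, 3)

# A perturbed profile is forced back to one common shape
noisy = c_trilinear + 0.05 * np.random.default_rng(1).standard_normal(c_trilinear.size)
c_fixed = trilinearity_constraint(noisy, 3)
print(np.linalg.matrix_rank(c_fixed.reshape(3, 40)))
```

Because the function acts on one column of C at a time, it can be applied to some species and not to others, which is what "flexible to every species" refers to.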

Page 9: Quality assessment of MCR-ALS results. Evaluation of rotation ambiguities

MCR solutions are not unique:

D = C ST + E = D* + E

Cnew = C T^-1

STnew = T ST

D* = C ST = C T^-1 T ST = Cnew STnew

Rotation matrix T is not unique. It depends on the constraints. Tmax and Tmin may be found by a non-linear constrained optimization.

[Figure: resolved profiles with the band of feasible solutions delimited by Tmax and Tmin.]

R. Tauler. Journal of Chemometrics, 2001, 15, 627
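A small numerical check makes the rotation ambiguity concrete. The matrices below are arbitrary illustrative values (assumptions): any invertible T reproduces D* exactly, and constraints such as non-negativity are what discard some of the candidate T matrices.

```python
import numpy as np

# Hypothetical two-component system
C = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
ST = np.array([[1.0, 0.2, 0.0],
               [0.0, 0.3, 1.0]])
D_star = C @ ST

# An arbitrary invertible rotation matrix
T = np.array([[1.0, 0.2],
              [0.1, 1.0]])
C_new = C @ np.linalg.inv(T)
ST_new = T @ ST

# Same reproduced data, different profiles
same_fit = np.allclose(D_star, C_new @ ST_new)

# This particular T gives negative concentrations,
# so it is not a feasible solution under non-negativity
feasible = bool((C_new >= 0).all() and (ST_new >= 0).all())
print(same_fit, feasible)
```

The set of T matrices that survive all constraints defines the band of feasible solutions whose extremes the slide calls Tmax and Tmin.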

Page 10: Extension to 'multiway' data

4 chromatographic runs of 4 coeluting components. Trilinear data:

$$d_{ijk} = \sum_{n=1}^{N} c_{in}\, s_{jn}\, t_{kn} + e_{ijk}$$

[Figure: elution profiles for Run 1 to Run 4 and pure spectra, followed by the profiles resolved under three sets of constraints.]

a) Matrix augmentation, non-negativity and spectra normalization constraints

b) Matrix augmentation, non-negativity, spectra normalization and selectivity constraints

c) Matrix augmentation, non-negativity, spectra normalization and trilinearity constraints
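For reference, the element-wise trilinear model $d_{ijk} = \sum_{n=1}^{N} c_{in} s_{jn} t_{kn} + e_{ijk}$ can also be written run by run. The slice notation $D_k$, $T_k$, $E_k$ is introduced here for illustration and does not appear on the slide; it is the usual matrix form of a trilinear model.

$$D_k = C\,T_k\,S^T + E_k, \qquad T_k = \mathrm{diag}(t_{k1}, \dots, t_{kN}), \qquad k = 1, \dots, K$$

In this form every run shares the same profile shapes $C$ and the same spectra $S$, and the runs differ only in the amounts $t_{kn}$ of each component.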

Page 11: Quality assessment of MCR-ALS results. Error propagation and confidence intervals

Resampling methods: Monte Carlo simulation, noise addition, jackknife (theoretical data and experimental data).

[Figure, noise 1%: mean, bands and confidence range of concentration profiles (rel. conc. vs pH) and of spectra (absorbance /a.u. vs wavelength /nm).]

Noise added    pK1 value    pK1 std. dev    pK2 value    pK2 std. dev
0 %            3.6539       2e-14           4.9238       2e-14
0.1 %          3.6540       6e-4            4.9226       0.0022
1 %            3.6592       0.0061          4.9134       0.0264
5 %            4.0754       0.4873          5.3308       1.1217
2 %            3.6656       0.0101          4.9100       0.0409

J. Jaumot, R. Gargallo and R. Tauler, J. of Chemometrics, 2004, 18, 327–340

Page 12: Noise addition simulations

Spectra profiles: mean, max and min profiles; confidence range profiles.

[Figure: five panels titled "Mean, bands and confidence range of spectra" (absorbance /a.u. vs wavelength /nm, 240 to 320 nm) at 0%, 0.1%, 1%, 2% and 5% noise.]

Page 13: A new graphical user-friendly interface for MCR-ALS

Until now, MCR-ALS input had to be typed in the MATLAB command line. This is troublesome and difficult in complex cases where several data matrices are analyzed simultaneously and/or different constraints are applied to each of them for an optimal resolution.

Now: a new graphical user-friendly interface for MCR-ALS.

Joaquim Jaumot, Raimundo Gargallo, Anna de Juan and Romà Tauler, Chemometrics and Intelligent Laboratory Systems, 2005, 76(1), 101-110

Multivariate Curve Resolution Home Page: http://www.ub.es/gesq/mcr/mcr.htm

Page 14: Example: analysis of a single experiment

Melting experiment of an oligonucleotide (2 components) monitored by UV-VIS spectroscopy.

Page 16: Example 2. Analysis of multiple experiments

Analysis of 4 HPLC-DAD runs, each of them containing four compounds.

Page 18: Spectroscopic imaging data

Coupling microscopy and spectroscopy.

[Diagram: chemical measurement of a scanned surface (x, y) with one spectrum (λ) per pixel. The data set is arranged as a matrix D with the number of pixels (x × y) as rows and λ as columns.]

Page 19: Example: industrial spectroscopic images

Oil-in-water emulsion (60 × 60 pixels). Spectroscopic technique: RAMAN (229 wavenumbers). Compounds in the emulsion overlap. The interphase is complex.

Oil-in-water emulsion monitored at different depths (24 x 23 x 10 pixels).

[Figure: RAMAN intensity vs wavenumber.]

(*) Andrew, J.J., Hancewicz, T.M. Appl. Spec. 50 (1996) 263.

Page 21: Resolution of augmented image data sets

[Diagram: Layer 1 to Layer 5 (L1 ... L5) are stacked column-wise into D = C ST. ST holds the pure spectrum of each component; C holds the distribution maps of each layer and gives bulk quantitative information.]

A. de Juan, R. Dyson, C. Marcolli, M. Rault, R. Tauler and M. Maeder. Trends in Analytical Chemistry 2004, 23, 70-79

Page 23: Study of the cis-platination reaction

Study of the cis-platination reaction between:

- Oligopeptide methionine-guanine conjugate: Phac-Met-linker-p5dG (Phac = phenylacyl)
- Cisplatin: [15N]-cis-dichlorodiammineplatinum(II)

[Figure: chemical structures of cisplatin and of the conjugate, with the methionine and dG moieties labelled.]

J. Jaumot, V. Marchán, R. Gargallo, A. Grandas and R. Tauler, Analytical Chemistry, 2004, 76, 7094-7101

Page 24: 2D-NMR reaction monitoring

[1H,15N]-HSQC NMR correlated spectroscopy of 15N labelled cisplatin: 751 15N chemical shifts, 156 1H chemical shifts.

Experimental conditions of the reaction (cisplatin + Met-dG conjugate):

- reaction time: hours (slow reaction)
- at several times, one 2D-NMR spectrum
- 23 2D NMR correlated spectra were measured

Study of this reaction with time (48 h).

D (23,156,751)

Page 25: 2D NMR spectra of the reaction

[Figure: 2D NMR spectra with peaks labelled cisplatin, S adduct, N adduct (guanine), chelate and final product.]

Page 26: Data structure: multiple 2D data matrices at different reaction stages

[Diagram: the evolving reaction gives one 2D spectrum (156 δ1H × 751 δ15N) at each time t1, t2, ..., t22, t23. Each 2D spectrum is unfolded "tube-wise" into one row of 156 δ1H x 751 δ15N elements, and the rows D1, D2, ..., D23 are stacked along t to build D.]

Page 27: MCR-ALS applied to multiple 2D NMR data matrices

[Diagram: D "tube-wise" (rows D1, D2, ..., D23 along t; 156 δ1H x 751 δ15N columns) = C ST, where C holds the concentration profiles along t and ST has rows ST1, ST2, ST3, ST4. Refolding each row of ST gives the 2D spectra S1, S2, S3, S4 (156 δ1H × 751 δ15N).]

Kinetic information: distribution diagram and 2D "pure" NMR spectra.
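The unfolding and refolding steps are plain reshapes. The sketch below uses the dimensions given on the slides (23 spectra of 156 δ1H × 751 δ15N points, 4 species) with random placeholder numbers instead of real NMR data; the variable names are assumptions.

```python
import numpy as np

n_times, n_h, n_n, n_species = 23, 156, 751, 4
rng = np.random.default_rng(0)

# Placeholder for the measured series of 2D spectra, D (23,156,751)
cube = rng.random((n_times, n_h, n_n))

# Unfolding: each 2D spectrum becomes one row of the matrix D
D = cube.reshape(n_times, n_h * n_n)

# Placeholder for the resolved ST (one unfolded pure spectrum per row)
ST = rng.random((n_species, n_h * n_n))

# Refolding: each row of ST becomes a 2D "pure" spectrum again
pure_spectra = ST.reshape(n_species, n_h, n_n)

print(D.shape, pure_spectra.shape)
```

The same row ordering must be used for unfolding and refolding, otherwise the refolded pure spectra are scrambled.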

Page 28: Resolved species

[Figure labels: Final Product; NH3 Trans Met Monofunctional; NH3 Trans dG Monofunctional; NH3 Trans Metal Chelate.]

Page 29: Resolved species (continued)

[Figure labels: NH3 Trans Metal Chelate; Final Product; NH3 Trans Met Monofunctional; NH3 Trans dG Monofunctional.]

Page 30: Protein folding of α-lactalbumin

Is the presence of Ca(II) affecting the protein folding mechanism?

- α-lactalbumin (Ca2+ present)
- α-apolactalbumin (Ca2+ absent)

Spectral ranges: 200-250 nm and 250-300 nm.

S. Navea, A. de Juan and R. Tauler, Analytical Chemistry, 2002, 74, 6031-9

Page 31: The protein folding pathway (far- and near-UV CD)

Row-wise augmentation of the two CD data sets: [D1 D2] = C [S1T S2T].

[Figure, α-lactalbumin: concentration profiles vs temperature (ºC) and CD near UV and CD far UV spectra vs wavelength (nm) for species N and D.]

[Figure, α-apolactalbumin: concentration profiles vs temperature (ºC) and CD near UV and CD far UV spectra vs wavelength (nm) for species N, I and D.]

Page 32: T-induced transitions of β-lactoglobulin. NIR/MIR results

[Figure: NIR spectra (absorbance vs wavelength, nm) and MIR spectra (absorbance vs wavenumber, cm-1) resolved into concentration profiles vs temperature (ºC) and pure spectra STNIR and STMIR. The profiles are labelled D1, D2 (CD2O) and P1, P2, P3 (Cprotein), in arbitrary units.]

The protein contributions are successfully modelled in the presence of the evolving solvent background.

PhD thesis of Susana Navea

Page 33: T-induced transitions of β-lactoglobulin

• The proposed mechanism is confirmed.
• Only the combined use of MIR/NIR defines all the protein conformations.

Protein process description: Native protein (P1) → 46 ºC → R-type state (P2) → 63 ºC → Molten globule (P3)

[Figure: resolved NIR spectra (absorbance, a.u., vs wavelength, nm), resolved MIR spectra (absorbance, a.u., vs wavenumber, cm-1) and concentration profiles (a.u.) vs temperature (ºC) for P1, P2 and P3.]

S. Navea, A. de Juan and R. Tauler, Analytical Chemistry, 2003, 75, 5592-5601

Page 34: MCR-ALS applied to DNA microarray data

DNA microarray technology has made it possible to monitor gene expression levels for thousands of genes in a single experiment.

Information about the existence of patterns and relationships between samples (cell lines) and variables (genes) can be obtained.

Because of the huge amount of data generated in a single experiment, data compression and data analysis methods are needed to extract and understand the information contained in the data.

[Diagram: data matrix with cell lines (cancer samples) as rows and gene expression (variables) as columns.]

Page 35: MCR-ALS applied to DNA microarray data

[Flowchart: experimental data → data matrix D → initial estimation (K-means centroids) → ALS → C and ST.]

- C, sample profiles: information about the cancer samples (cell lines).
- ST, gene profiles: information about the gene expression (variables).

Page 36: Extracting biomedical information from gene expression microarray data by Multivariate Curve Resolution

J. Jaumot, R. Tauler and R. Gargallo (work in progress)

MCR-ALS results show that each group is characterized by:

MCR component    Type of cancer
1                CNS, OV, RE, LC
2                ME
3                LE
4                Not well defined
5                CO

LE leukemia, ME melanoma, CO colon, other carcinomas.

C (cell lines), ST (genes)

Page 37: Quantitation using MCR-ALS

(A) MS data (t × m/z): D = C ST + E by MCR-ALS. ST gives the qualitative information and C the quantitative information.

(B) Column-wise augmentation of the unknown with N standards: Daug = [Dunk; Dstd1; Dstd2; ...; DstdN] = Caug ST + Eaug, with Caug = [Cunk; Cstd1; Cstd2; ...; CstdN] and Eaug = [Eunk; Estd1; Estd2; ...; EstdN].

[Diagram: for elution profile 1, the area A or height h is taken in every resolved block (Aunk, hunk; Astd1, hstd1; Astd2, hstd2; ...; AstdN, hstdN). A calibration graph of Astd or hstd against Cstd gives Cunk from Aunk or hunk.]

Page 38: Reconstructed TIC MS chromatograms

[Figure: three MSD2 TIC chromatograms (signal vs time, min), API-ES, Pos, Scan, 90.]

(A) Standard mixture of 13 biocides at 20 ppm (MS File E:\20PPM.D)

(B) Aznalcollar sediment sample spiked with 13 biocides at 10 ppm (MS File E:\MINAB10.D)

(C) In-WWTP water sample from La Llagosta spiked with 13 biocides at 10 ppm (MS File E:\LLAGE10.D)

Page 39: Multivariate Curve Resolution

[Diagram: DAD and MS (SCAN mode) data are joined row-wise into D (tR × λ & m/z, + wavelets), and D = C ST by MCR-ALS. C holds the elution profiles (alachlor, chlorpyrifos-oxon, terbutryn, solvent gradient) along tR; ST holds the DAD and MS spectra (λ and m/z).]

Page 40: Resolved profiles for the spiked samples

[Figure, Aznalcollar sediment: panels (A), (B) and (C) with resolved elution profiles (tR) and resolved spectra (200 to 340 and m/z 50 to 400) for terbutryn, solvent gradient, alachlor and chlorpyrifos-oxon.]

[Figure, In-WWTP water sample: panels (A), (B) and (C) with resolved elution profiles (tR) and resolved spectra for an impurity, terbutryn, solvent gradient, alachlor and chlorpyrifos-oxon.]

Page 42: Determination of triphenyltin in sea water by fluorescence and multivariate resolution

Direct determination of analytes from spectrofluorimetric measurements in complex natural matrices, with no need for prior separation by chromatographic methods.

[Figure: excitation-emission spectra for a sea-water sample containing triphenyltin.]

Analytica Chimica Acta, 2000, 409, 237-245

Page 43: MCR-ALS resolution of [U;S;R;B] augmented matrix

[Figure: a) 3-D plots of the EEM fluorescence (fluorescence intensity vs excitation and emission wavelength, nm) of the unknown sample U, standard S, flavonol reagent R and sea-water background B; b) emission spectra for the unknown sea-water sample; c) emission species spectra for the standard; d) emission species spectra for flavonol reagent; e) emission species spectra for sea-water background; f) excitation spectra.]

Species: 1 TPhT flavonol complex; 2 flavonol reagent; 3 sea-water background.

Page 44: Resolution and quantification of EEM data by MCR-ALS

[Figure: areas of the emission spectra resolved by MCR-ALS versus TPhT concentrations (response vs concentration, pg/l) for Standards, Synthetic and Sea Water A to E.]

[Figure: comparison of true values and values calculated by MCR-ALS in sea-water samples (calculated concentration vs real concentration, ppt).]

cU = [Area(yU) / Area(yS)] cS

Prediction errors always below 13%!

Calibration strategies:
i. using pure standards
ii. using sea-water standards
iii. using the standard addition method

[Figure: standard addition calibration graph in a sea-water analyte determination (sea-water sample U4): relative area vs concentration TPhT (µg / L).]

Analytica Chimica Acta, 2000, 409, 237; 2001, 432, 245-255

The Analyst, 2000, 125, 2038-43
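The single-standard relation cU = [Area(yU) / Area(yS)] cS is easy to evaluate once the profiles have been resolved. The sketch below assumes that yU and yS are resolved profiles of the analyte in the unknown and in the standard, sampled on the same equally spaced axis; the function names and the numbers are hypothetical.

```python
def trapezoid_area(y, dx=1.0):
    """Area under equally spaced points y, by the trapezoid rule."""
    return dx * (sum(y) - 0.5 * (y[0] + y[-1]))

def concentration_from_areas(y_unknown, y_standard, c_standard):
    """cU = [Area(yU) / Area(yS)] * cS"""
    return trapezoid_area(y_unknown) / trapezoid_area(y_standard) * c_standard

# Hypothetical resolved profiles: the unknown has half the area of the standard
y_s = [0.0, 1.0, 2.0, 1.0, 0.0]
y_u = [0.0, 0.5, 1.0, 0.5, 0.0]
c_u = concentration_from_areas(y_u, y_s, c_standard=20.0)
print(c_u)  # 10.0
```

The result is in the units of the standard concentration. With several standards, or with standard additions, the ratio would be replaced by a calibration line fitted to the areas.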

Page 45: Environmental data tables (two-way data)

Data table or data matrix X(I,J): I samples (rows) × J variables (columns).

Variables: conc. of chemicals, physical properties, biological properties, other ...

Example entries: 12 13 45 67 89 42 35 0 0.3 0.005 111 33 5 67 90 0.06 44 33 1 2, with some entries marked 'm' or <LOD.

[Figures: plot of samples (rows) and plot of variables (columns).]

Page 46: Environmental source resolution and apportionment

$$x_{ij} = \sum_{n=1}^{N} g_{in} f_{nj} + e_{ij}$$

X = G FT + E, with X of size NR × NC. Bilinearity!

xij: concentration of chemical contaminant j in sample i; gin: contribution of source n in sample i; fnj: contribution of compound j in source n.

[Figure: concn. of 96 organic compounds in 22 samples, resolved into FT (identification of contamination sources: composition) and G (distribution of contamination sources).]

Page 47: ACA project

ACA project. E. Peré-Trepat, M. Terrado, R. Tauler

Page 48: General contamination

General contamination?

Page 49: AQUATERRA Sub-Project BASIN

Workpackage: R2 – EBRO river basin. Integration Monitoring-Chemometrics-GIS

M. Terrado, A. Navarro, S. Lacorte and R. Tauler

WP Leader: D. Barceló, IIQAB-CSIC, Barcelona (E)

Partners: ACA, Barcelona (E); AGBAR, Barcelona (E); AGH, Kracow (Pl)

[Map: Ebro river basin with sampling points R1, T2, T3, R4, T5, R6, T7, T8, T9, T10, T11, T12, T13, R14, T15, T16, R17, R18 and the cities Vitoria, Logroño, Pamplona, Tudela, Sabiñánigo, Huesca, Monzón, Zaragoza, Lleida and Tortosa.]

Page 50: Surface water: different sources

RED ICA → CHE (Red Integral de Control de Aguas): 210 sample points

AQUATERRA (Surveillance monitoring): 24 sample points

R0: Ebro in Reinosa (Cantabria)
R1: Ebro in Miranda de Ebro (Burgos)
T2: Zadorra in Audinaka (Álava)
T3: Zadorra in Villodas (Álava)
R4: Ebro in Haro (La Rioja)
T5: Najerilla in Najera (La Rioja)
R6: Ebro in Logroño (La Rioja)
T7: Ega in Estella (Navarra)
R7: Ebro in Tudela (Navarra)
T8: Araquil in Alsasua (Navarra)
T9: Arga in Puente la Reina (Navarra)
T10: Jalon in Grisen (Zaragoza)
T11: Huerva in Zaragoza (Zaragoza)
T12: Gállego in Caldearenas (Huesca)
T13: Gállego in San Mateo de Gállego (Zaragoza)
R14: Ebro in Presa de Pina (Zaragoza)
R15: Ebro in Sástago (Zaragoza)
T15: Cinca in Alcolea de Cinca (Huesca)
T16: Segre in Torres de Segre (Lleida)
R17: Ebro in Flix (Tarragona)
R18: Ebro in Tortosa (Tarragona)
R19: Ebro in Amposta (Tarragona)
R20: Ebro in Delta de l’Ebro (Tarragona)
GAR1: Gállego in Villanueva de Gállego (Zaragoza)

RCSP → CHE (Red de control de sustancias peligrosas): 18 sample points

SP-1: Gállego in Jabarrella
SP-2: Ebro in Presa Pina
SP-3: Ebro in Ascó
SP-4: Segre in Torres de Segre
SP-5: Cinca in Monzón (downstream)
SP-6: Arga in Puente La Reina
SP-7: Ebro in Miranda de Ebro
SP-8: Zadorra in Vitoria Trespuentes
SP-9: Ebro in Tortosa
SP-10: Araquil in Alsasua-Urdiaín
SP-11: Ebro in Conchas de Haro
SP-12: Ebro in Logroño (downstream)-Varea
SP-13: Ega in Arinzano
SP-14: Gállego in Villanueva
SP-15: Huerva in Fuente de la Junquera
SP-16: Jalón in Grisén
SP-17: Najerilla in Nájera (downstream)
SP-18: Zadorra in Salvatierra

Page 51: Surface water → historical data

DATA BASE

1. DANGEROUS SUBSTANCES CONTROL NETWORK (Red de Control de Sustancias Peligrosas)

18 sampling points; 3 compartments comparison: WATER (2002-04), SEDIMENTS (2002-03), BIOTA (2002-03)

2. ICA NET OF SURFACE WATER CONTROL (Red Integrada de Control de la Calidad de las Aguas Superficiales)

Selection of some points from this net common with the AQUATERRA’s surveillance monitoring; 1 compartment: WATER (2002-04)

Physical parameters

-Flow

-Water temperature

-Air temperature

-Conductivity at 20ºC

-Aspect

Chemical parameters

-pH

-Dissolved oxygen

-Suspended materials

-COD (chemical oxygen demand), BOD5...

-Chlorides, sulphates, phosphates

METALS: As, Cd, Pb, Zn, Cu

ORGANIC COMPOUNDS:

-pesticides of agricultural origin (HCB, ΣDDTs, ΣHCHs)

-ΣPCBs

-PAHs

Other parameters

-LAND USES → grouping points according to their land use, to obtain common features and differences

-QUALITY INDEXES :

ISQA (simplified index for water quality)

BIOLOGIC index

Page 52: Surface water and groundwater databases

SURFACE WATER AND GROUNDWATER DATABASES → CHEMOMETRICS: PCA (Principal Component Analysis), other methods (multivariate data analysis)

IDENTIFICATION of contamination sources: loading plots

DISTRIBUTION of contamination sources: geographical, temporal (GIS)

ENVIRONMENTAL DISCUSSION

• Key zone delimitation according to the pollution degree.
• Relation between land use (agricultural, industrial, urban and protected areas) and distribution of the contamination sources.
• Representation of quality indexes.
• Comparison of the contribution of the different contamination sources in each environmental compartment (water, sediments and biota).

Page 53: Acknowledgements

• Chemometrics Group UB
– Staff: Anna de Juan, Javier Saurina, Raimundo Gargallo (RyC)
– PhD: Susana Navea, Joaquim Jaumot, Emma Peré-Trepat
– Master and DEA: Gloria Muñoz, Silvia Mas

• Environmental Chemometrics Group IIQAB-CSIC
– Staff: Romà Tauler
– Post-doc: Montse Vives (JdC), Mónica Felipe (I3P)
– Master and DEA: Marta Terrado, Xavier Puig

• Elisabeth Teixido (ACA), Silvia Termes (LAG)