Analisis de Datos de Ondas Gravitacionales 2da parte€¦ · 11/10/2017 · Analisis de Datos de...

Analisis de Datos de Ondas Gravitacionales

2da parte

Basada en apuntes desarrollados por J. Romano

GaussianityA single random variable x is said to have a Gaussian probability distribution

if

p(x) =1

!

2πσ2x

exp

"

−

1

2

(x − µx)2

σ2x

#

The parameters µx and σ2x

are the mean and variance of x.

A set of random variables xj , where j = 0, 1, · · · , N − 1 (e.g., the discretetime samples xj of a random process x) is said to have a multivariate Gaussian

probability distribution if

p(x1, x2, . . . , xN ) = 1(2π)N/2

√

det Cxexp

!

−12

"N−1i,j=0 C−1

xij(xi − µxi)(xj − µxj)#

where Cxij is the covariance matrix Cxij := ⟨xixj⟩ − ⟨xi⟩⟨xj⟩

Cxij generalises the variance σ2x for a single Gaussian-distributed random

variable.

If the discrete random process xj has zero mean (i.e., ⟨xj⟩ = 0), then theelements of the covariance matrix are just the discretised version of the correla-tion function Cx(t) = ⟨x(t′)x(t + t′)⟩ that we defined for a continuous randomprocess x.

Statistical inference The ultimate goal of science is to infer nature's state from observations.Since observations are incomplete and imprecise, being often contaminated by noise, our conclusions will always have some level of uncertainty associated with them.Statistical inference (or probability theory) is a way of quantifying and manipulating uncertainty.

There are (at least) two ways of interpreting probability:

(i) degree of belief in a statement

(ii) long-run relative occurrence of an event in a set of identical experiments

Interpretation (ii) requires the experiment to be repeatable in principle.Hence, the question ``what's the probability that it's going to rain today" doesn't make sense with respect to this definition, unless you are willing to imagine an ensemble of identical worlds with the same initial weather conditions.

Interpretation (i) is more general than interpretation (ii) since it need not be associated with repeatable experiments.

But both interpretations lead to the same algebra of probability. Namely the probability of a statement X such as ``it will rain today" or ``gravitational waves from a particular supernovae were incident on our detector during the past hour" is given by a number p(X) between 0 and 1 which satisfies:

(i) p(X = true) = 1, p(X = false) = 0, 0 < p(X = not sure) < 1(ii) p(X) + p(X) = 1, where X means not X

(iii) p(X, Y ) = p(X|Y )p(Y ), where p(X|Y ) is conditional probability of X

given Y

Property (ii) is called the sum rule; property (iii) is the product rule.

Note that a joint probability distribution for x and y can be converted intoa marginalised distribution for x by integrating out the y-dependence:

p(X) =!

dY p(X, Y ) =!

dY p(X|Y )p(Y )

where the second equality follows from the product rule.

Frequentist vs. Bayesian statistics

(i) is a branch of statistical inference that takes Interpretation (ii) as its interpretation of probability.

(ii) probabilities can only be assigned to random variables -e.g., the outcomes of repeated identical experiments- and not to hypotheses or parameters describing the state of nature, which have fixed but unknown values.

(iii) one assumes that the measured data are drawn from some underlying probability distribution, which assumes the truth of a particular hypothesis or model.

Frequency Statistics

(iv) one constructs a statistic (i.e., a function of the data that estimates a signal parameter or indicates how well the data fits a particular hypothesis). The statistic is a random variable since the data from which it is constructed are random.(v) one calculates analytically or via Monte Carlo simulations the probability distribution of the statistic (the so-called sampling distribution).(vi) one needs to be clever in choosing a statistic as they are not unique, and have different properties (e.g.,unbiased but large variance, biased but small variance, etc.).

(vii) the sampling distribution depends on data values that were not actually observed, which depend in general on how the experiment was carried out or might have been carried out!. (This is related to the so-called stopping problem of frequentist statistics.)

(viii) from the statistic and sampling distribution one calculates either confidence intervals for parameter estimation or p-values for rejecting null hypotheses. (More on these later.)

(ix) claims to be more objective than Bayesian statistics since it does not require the introduction of a prior.

Bayesian statistics(i) is a branch of statistical inference that takes Interpretation (i) as its interpretation of probability.(ii) probabilities can be assigned to hypotheses or parameters describing the state of nature, which have fixed but unknown values.(iii) one uses Bayes' theorem to update the degree of belief in a particular hypothesis, in light of the data that was actually measured.

(iv) the likelihood function contains all that the data has to say about the problem; it's what converts prior probabilities to posterior probabilities based on the observed data.(v) from the posterior distribution, one can construct probability intervals for parameter estimation or posterior odds ratios for comparing various hypotheses.(vi) is not necessarily subjective, as priors can be assigned in a consistent objective fashion, based on only the information at hand (so-called least informative priors).

Bayes' theoremBayes’ theorem is a simple consequence of p(X, Y ) = p(Y, X)and the product rule.It relates the conditional probabilities p(X|Y ) and p(Y |X):

p(X|Y ) = p(Y |X)p(X)p(Y )

p(X) is called the prior probability of X; p(X|Y ) is the posterior probabilityof X given Y. p(Y |X) is called the likelihood of X; p(Y ) is called the evidence;and

In a typical situation, X will correspond to a model or some hypothesis aboutthe state of nature and Y will be the data that we collect from an experiment.Denoting the hypothesis by H and the data by D, we get

p(H|D) = p(D|H)p(H)p(D)

In words: ``The probability of a hypothesis given the observed data is proportional to the probability of the data assuming the hypothesis is true times the prior probability that the hypothesis is true, before taking into account the observed data."

Bayes' theorem tells us how probabilities evolve in light of new data.

Bayes' theorem: ExampleThe following example is borrowed from Graham Woan:

A gravitational wave detector may have detected a gravitational wave burst from a Type II supernova. But since burst-like signals in a detector can also be produced by instrumental glitches -in fact, only 1 out of 10,000 bursts are really due to a supernova-the data are checked for glitches using an auxiliary veto channel test. From Monte Carlo simulations, one finds that the veto channel test confirms that the burst is due to a supernova 95% of the time if there really was a GW burst in the data; but falsely claims the that the burst is due to a supernova 1% of the time, when there was no GW burst in the data. It turns out that the measured burst passes the veto channel test.

What is the probability that it's due to a supernova?

Notation: S = burst is due to a supernovaeG = burst is due to a glitch+ = veto test says the burst is due to a supernovae- = veto test says the burst is due to a glitchWe would like to calculate P(S | +).

Bayesian parameter estimation

Parameter estimation in Bayesian statistics is via the posterior distribution P(a|D).The posterior distribution tells you everything you need to know about a parameter, although you can reduce it to a few numbers if you like -e.g., mode, mean, standard deviation, etc.A Bayesian confidence interval is simply defined in terms of the area under the posterior distribution between one parameter value and another.

If the posterior distribution depends on two parameters a and b, but you only really care about a, then you can obtain the posterior distribution for a by marginalizing over b:

p(a|D) =

!db p(a, b|D) =

!db p(a|b, D)p(b)

Bayesian hypothesis testingIt doesn't make sense to talk about a single hypothesis in Bayesian statistics without making reference to alternative hypotheses. This is because we need to specify the alternative hypotheses to calculate the denominator P(D) in Bayes' theorem:

p(D) =!

i

p(D|Hi)p(Hi)

p(H1|D)

p(H2|D)=

p(D|H1)

p(D|H2)

p(H1)

p(H2)

Comparison of two hypotheses is natural in the Bayesian framework:

In words: ``Posterior odds equals the likelihood ratio times the prior odds."The likelihood ratio is also called the Bayes'factor.One often sets the prior odds to unity.Calculation of the likelihood of an hypothesis usually involves marginalization over the parameters associated with it:

p(D|H) =

!da p(D|a, H)p(a)

This can can be approximated as

p(D|H) ≈ L(amax)δa

∆a

where amax maximizes the likelihood function, δa is the range of parameter values over which the likelihood is peaked, and Δa is full range of the parameter values.

The factor penalizes a hypothesis that uses more parameter space volume than needed to fit the data.

δa/Δa is called Occam's factor in light of Occam's razor which says that ``it is vain to do with more what can be done with less."

Everything else being equal simpler hypotheses are automatically preferred in the Bayesian framework.

Frequentist parameter estimation Construct a statistic (or estimator) â of the parameter a you are interested in. Calculate its sampling distribution P(â |a).Statements like

Prob(a − ∆ < !a < a + ∆) = .95

make sense in the frequentist framework, since â is a random variable. Although the inequality can be rearranged to yield

Prob(!a − ∆ < a < !a + ∆) = .95

this should not be interpreted as a statement about the probability of a lying within a particular interval

since a is not a random variable.Rather, it should be interpreted as a probabilistic statement about the intervals

Namely, in a set of many repeated experiments, 0.95 is the fraction of intervals that will contain the true value of the parameter a.

[!a − ∆,!a + ∆]

[!a − ∆,!a + ∆]

Frequentist hypothesis testingSuppose you want to test an hypothesis H1 (e.g., the hypothesis that a GW signal is present in the data).

Since you cannot assign a probability to hypotheses in frequentist statistics, you introduce instead the single alternative hypothesis

You then argue for H1 by arguing against H0. (Like proof by contradiction in mathematics.) H0 is usually called the null hypothesis.

For the above example, it is the hypothesis that there is no GW signal in the data.

H0 := H1

Construct a test statistic t, and calculate analytically or via Monte Carlo simulations the sampling distribution P(t|H0).

If the observed value of t lies far out in the tails of the distribution, you reject H0 (and thus accept H1) at the p*100% level, where

p = Prob(t > tobs|H0)

This is the so-called p-value of the test. The p-value needed to reject the null hypothesis is the threshold for detection. There are two types of errors one can make:

Type I or false alarm errors: Rejecting the null hypothesis when in fact it is true.

Type II or false dismissal errors: Accepting the null hypothesis when in fact it is false.

Different test statistics are judged according to their false alarm and false dismissal probabilities.

For gravitational wave data analysis, people will be initially reluctant to falsely claim a detection. Hence the false alarm probability will be set to some very low value.

The best statistic is the one which minimizes the false dismissal probability (i.e., maximizes detection probability) for fixed false alarm. (This is the Neyman-Pearson criterion.)

For medical diagnosis, on the other hand, a doctor is very reluctant to falsely dismiss an illness. Hence the false dismissal probability will be set to some very low value. The best statistic is the one which minimizes the false alarm probability for fixed dismissal.

Matched filteringA matched filter (or Weiner optimal filter) is a frequentist detection statistic used to look for a signal with a deterministic waveform.

We assume the detector output x(t) has the form

x(t) = n(t) + s(t; θα)where θα are parameters describing the deterministic signal. One constructs the filtered output

y =

!∞

−∞

dt x(t)K(t) =

!∞

−∞

df "x(f) "K∗(f)

where the filter function is determined by maximizing the expected signal-to-noise ratio of y.

!K(f)

The result of the maximization is

y(θα) =

!∞

−∞

df"x(f)"s∗(f ; θα)

Pn(f)

where Pn(f) is the detector noise PSD.For weak signals (i.e., one can replace the PSD of the noise Pn(f) in the matched filter by the PSD of the detector output Px(f). Note that the filter function

Ps(f) ≪ Pn(f)

!K(f ; θα) =!s(f ; θα)

Pn(f)passes power in the band where the detector output resembles the signal and stops power in the band where the noise power is large. By being matched to the signal, the M F output grows with increased observation, like the square root of the number of cycles or observation time (for a continuous GW signal.)

The expected value of y(θα) is just the expected signal-to-noise ratio

⟨y(θα)⟩ =

!∞

−∞

df|"s∗(f ; θα)|2

Pn(f)

For white noise, the filter is just the signal itself. For example, for a pure sinusoid in white noise, we have

y(f0) =

!∞

−∞

dt x(t) sin(2πf0t) =1

2i("x(f0) − "x∗(f0))

Thus, Fourier transform is the matched filter for a pure sinusoid in white noise.

If s(t) corresponds to a burst signal with start time t0 (i.e., s(t; t0) = T (t −t0)) then

!s(f ; t0) = e−i2πft0 !T (f)so

y(t0) ="∞

−∞df !x(f)!T∗(f)

Pn(f) ei2πft0

which is just the inverse Fourier transform of !x(f) !T ∗(f)/Pn(f). Thus, theinverse Fourier transform allows us to easily search over signal start times.

• In general, one will have to construct a bank of filters for various points in parameter space. This is called a template bank.

• The separation of the filters in the template bank should not be too large, since one may fail to detect a signal whose parameter values different significantly from those in the bank. If the spacing is too fine, however, the computational costs will be prohibitive.

• There is a geometric method for determining the template placement, which uses a metric defined on the space of signal parameters.

Proof of optimality

y =Z 1

1dt x(t)K(t) =

Z 1

1df ex(f) e

K

(f)

The proof starts

The expected signal-to-noise ratio of y is S/N:

S := hyi and N2 :=(y hyi)2

↵

It can be shown:

S =Z 1

1df es(f) eK(f)

N2 =Z 1

1df | eK(f)|2 Pn(f)

We define an inner product

(A, B) :=Z 1

1df A(f)B(f) Pn(f)

which gives

S

N

2

=(A, eK)2

( eK, eK)

A(f) = es(f)/Pn(f)

where

In analogy with the dot product of vectors in 3-d space, it follows that theabove ratio of inner products is maximized by choosing

!K(f) ∝ A(f) = !s(f)Pn(f)

The signal-to-noise ratio for this optimal choice of !K(f) is

"SN

#2

opt=

$ ∞−∞ df |!s∗(f ;θα)|2

Pn(f)

which is the same as the expected value of the optimally-filtered outputy(θα).

Like with the dot product, it can maximized by choosing

In analogy with the dot product of vectors in 3-d space, it follows that theabove ratio of inner products is maximized by choosing

!K(f) ∝ A(f) = !s(f)Pn(f)

The signal-to-noise ratio for this optimal choice of !K(f) is

"SN

#2

opt=

$ ∞−∞ df |!s∗(f ;θα)|2

Pn(f)

which is the same as the expected value of the optimally-filtered outputy(θα).

The optimal choice is:

which is the same as the expected value of the optimally-filtered output y(↵)

Maximum likelihoodThe likelihood function plays an important role in Bayesian statistics as it updates prior probabilities to posterior probabilities.

It is also important in frequentist statistics since parameters can be estimated as the values that maximize the likelihood.

Suppose one wants to estimate the amplitude of a constant signal s in Gaussian-distributed random noise nj. If the noise is white with the same variance σ2 for each sample, then the likelihood function is given by

L(s) := p(x1, . . . , xN |s) =1

(2π)N/2σNexp

⎡

⎣−1

2

N−1∑

j=0

(xj − s)2

σ2

⎤

⎦

Maximising L(s) is the same as maximising its log:

lnL(s) = const − 12

!N−1j=0

(xj−s)2

σ2

where the const term doesn’t depend on s.

It follows that

d lnL(s)ds

!!!!!s="s

= 0 ⇐⇒ s = 1N

#N−1j=0 xj

which is the usual sample mean.

Note also that the second derivative of the log likelihood is simply related to the inverse variance of the estimator bsNote also that the second derivative of the log likelihood is simply related

to the inverse variance of the estimator !s:

d2 lnL(s)

ds2

"""""s=!s

= −

N

σ2

Generalising to white Gaussian-distributed noise but with different variancesσ2

i for each sample yields the weighted average

s =

!iσ−2

ixi!

jσ−2

j

Allowing for an arbitrary noise correlation matrix Cij yields another weightedaverage

s =

!iλixi!j

λj

where λi =!

j C−1

ij

To show the connection between the matched filter and the likelihood func-tion, generalize to the continuous functions of time

x(t) := n(t) + s(t; θα)

Then the discrete sum

lim∆t→0,T→∞

!N−1i,j=0 C−1

ij (xi − si)(xj − nj) ="∞

−∞df (#x(f)−#s(f))(#x∗(f)−#s∗(f))

Pn(f)

Again, it is convenient to define an inner product

(a|b) := 12

!∞

−∞df a∗(f)b(f)+a(f)b∗(f)

Pn(f)

in terms of which the exponent of the exponential in the likelihood functionbecomes

− 12 ("x − "s(θα)|"x − "s(θα))

Thus, up to constants that are independent of the signal parameters θα, wehave

lnL(θα) = const − 12 [(!s(θα)|!s(θα)) − 2(!x|!s(θα))]

Note that the last term

(!x|!s(θα)) = 12

"∞

−∞df !x(f)!s∗(f ;θα)+!x∗(f)!s(f ;θα)

Pn(f)

is just a symmetric version of the matched filter discussed previously.

To determine the maximum likelihood parameter estimators !θα, simply setthe partial derivatives of the log likelihood with respect to θα equal to zero. The

result is ∂ lnL(θ)∂θα

"""""θ=!θ

= 0 ⇐⇒

#$x − $s(θ)

""""∂$s(θ)∂θα

% """""θ=!θ

= 0

Fisher information matrix

The Fisher information matrix is defined in terms of the expectation valueof the second derivatives of the log likelihood with respect to θα and θβ :

Γαβ := −

!∂2 lnL(θ)∂θα∂θβ

"""""θ=#θ

$

In terms of the inner product ( | ), we can write

Γαβ :=

%∂s(θ)∂θα

""""∂s(θ)∂θβ

& """""θ=#θ

In particular, the precision with which can expect to measure parameter θα

is

σθα := σθαθα =!

Γ−1αα

As an explicit example, determine the correlations between the signal pa-rameters θα := (A, f0, φ0) for a sinusoid of unknown amplitude, frequency, andinitial phase:

s(t;A, f0, φ0) = A sin(2πf0t + φ0)

The importance of the Fisher information matrix is that its inverse Γ−1

αβ

specifies the expected correlations between the various signal parameters:

Γ−1

αβ = σ2

θαθβ

Stochastic GW searches

A stochastic gravitational wave background is a random gravitational wave signal produced by the superposition of many independent, weak, unresolvable gravitational wave signals. It can be either cosmological or astrophysical in nature:

1. A cosmological background would consist of the remnant gravitational waves left over from the very early universe.(Similar to, but much earlier than, the CMB.) Amplifications of quantum mechanical perturbations in the gravitational field by rapid inflation is the main expected source of such a background (although producing a very weak field). Other speculative exotic processes: cosmic strings, colliding vacuum bubbles, etc.

Sources

2. An astrophysical background is a confusion-limited background produced by astrophysical sources---e.g., supernovae throughout the universe, NS inspiral, close white dwarf binary inspiral (the dominant ``noise'' source for LISA below about 3mHz), GWs from accreting LMXB systems, etc.

Most cosmological backgrounds are expected to have gravitational wave power that falls-off like 1/f3. Most astrophysical backgrounds are expected to have power peaked at some characteristic frequency. A cosmological background is expected to be isotropic, whereas an astrophysical background may be anisotropic following the spatial distribution of the astrophysical sources.

Characterising a stochastic GW background

Being a random signal, the stochastic GW background must be characterized statistically.

It is most-likely a stationary random process, since it has been produced on time-scales much, much longer than the duration of an observation.

It is most-likely Gaussian-distributed being the superposition of very many independent GW signals (central limit theorem).

The gravitational power is expected to be many orders of magnitude weaker than the instrumental noise power, but we can build up SNR by observing for long times (proportional to )√

T obs

It is most-likely isotropic, although anisotropies from our motion relative to the CMB and the confusion background from astrophysical sources are possible.

It is most-likely unpolarized (no strong reason to suspect otherwise).

The strength of the stochastic GW signal is specified by how the GW energy density is distributed in frequency. It is normalized by the critical energy density needed to close the universe ; it tells us at each frequency f what the fractional energy density in GWs is in a bandwidth Δf equal to the frequency f:

ρc = 3H2

0/8πG

Ωgw(f) =∆ρgw(f)

ρc

The total fractional energy density in GWs is

Ωgw =!∞

0d(ln f) Ωgw(f)

which can be compared to Ωb, ΩΛ, etc.

Ωgw(f) is related to the (2-sided) power spectrum Pgw(f) of an isotropicbackground of GWs incident on a single interferometer via

Pgw(f) = 3H20

20π2

Ωgw(f)f3

Note the 1/f3 dependence for stochastic backgrounds with Ωgw = const.

Since ρc involves the Hubble constant, one often writes H0 = h0∗100 km/s/Mpcand absorbs a factor of h2

0 in Ωgw, working with the quantity h20Ωgw which is

independent of the value of the Hubble constant.

However, recent measurements (WMAP, etc.) have shown that h0 = 0.72to a high degree of precision, so we have assumed this value in these notes andquote limits directly on Ωgw.

Observational constraintsCMB isotropy: Fluctuations of 1 part in 105 in the CMB temperature limit

the amount of GW perturbations to the CMB. The constraint is very stringentbut at very low frequencies:

Ωgw(f) < 1.3 × 10−13 for f ∼ 10−16 Hz

Precision timing of msec pulsars: Attributing the small timing residuals inmsec pulsar measurements to GWs puts an upper bound on the strength of thestochastic GW background at frequencies of order one over the observation timeof the pulsars:

Ωgw(f) < 1.8 × 10−7 for f ∼ 10−8 Hz

Spacecraft Doppler tracking: Attributing discrepancies in timing of commu-nication signals sent between the Cassini spacecraft and Earth puts an upperbound on the strength of a stochastic GW background at frequencies of orderone over the transit time to the spacecraft:

Ωgw(f) < 0.027 for 10−6 < f < 10−3 Hz

Big Bang nucleosynthesis (BBNS): The agreement of the measured abun-dances of light elements (H, He, Li, Be) with the predictions of nucleosynthesiscalculations limits the expansion rate of the universe (and hence the energycontent in GWs) at the time of nucleosynthesis:

!f>10−10 Hz

d(ln f) Ωgw(f) < 1.5 × 10−5

This constraint only applies to GWs present at the time of nucleosynthesis(a few minutes after the Big Bang), not to GWs produced by more recentastrophysical sources for example.

Direct GW measurements: We have recently started to obtain direct GWobservational constraints using resonant bar and interferometric GW detectors.Although not as stringent as the indirect electromagnetic observations listedabove, the upper limits set by direct measurements are improving with detectorsensitivity and increased observation time, and may surpass some of the abovebounds in the near future. (More about this later.)

Cross-correlation methodThe metric perturbations hij(t) representing a GW can be expanded as a

superposition of plane polarized GWs propagating in different directions !n:

hij(t, x) ="

A

#∞

−∞df

#d2Ωn

$hA(f, !n)eAij(!n)ei2πf(t−!n·x)

where eAij(!n) are the polarisation tensors (A = +,×) in the plane transverse

to the propagation of the wave.

The gravitational wave strain h(t) incident on an interferometer is then

h(t) = hij(t, x0)dij

where

dij := 1

2( !Xi !Xj

−!Y i !Y j)

is the detector tensor defined by the unit vectors !Xi, !Y i pointing along thetwo arms.

Let the output of two detectors be denoted

x1(t) = n1(t) + h1(t)

x2(t) = n2(t) + h2(t)

where n1,2(t) denote the noise in the two detectors and h1,2(t) denote thegravitational wave strain produced by the stochastic GW background.

We define the cross-correlation statistic Y by

Y :=! T/2

−T/2dt1

! T/2

−T/2dt2 x1(t1)x2(t2)K(t1 − t2)

where K is a filter function which we will choose to maximise the signal-to-noise ratio of Y , and T is the total observation time.

Stationarity allows us to write K(t1−t2) instead of the more general K(t1, t2),which would be needed for non-stationary random processes.

One can re-express Y in the frequency domain

Y =!∞

−∞df

!∞

−∞df ′ δT (f − f ′)"x∗

1(f)"x2(f ′) "K(f ′)

where δT (f − f ′) is a finite time approximation to the Dirac delta function:

δT (f) :=! T/2

−T/2dt ei2πft

Finding !K that optimises the expected signal-to-noise ratio of Y is similar towhat we did in the previous lecture to find the matched filter for a deterministicsignal.

We first define the signal-to-noise ratio of Y as S/N where

S := ⟨Y ⟩ and N2 :=!

(Y − ⟨Y ⟩)2"

Calculating the expected values leads to

S = T!∞

−∞df γ(f)Pgw(f) "K(f) and N2 ≈

!∞

−∞df | "K(f)|2 Pn1

(f)Pn2(f)

where Pn1(f), Pn2

(f) are the noise power spectra, Pgw(f) is the power inthe stochastic GW background, and γ(f) is the overlap reduction function

The approximation used to get N2 is the assumption that the power in thestochastic GW signal is much, much less than that in the detector noise. Thisis a valid assumption for current detectors. Thus, the Pn(f)’s can be replacedby the actual measured Px(f)’s for the detector output.

As before, one introduces an inner product

(A, B) :=!∞

−∞df A∗(f)B(f) Pn1

(f)Pn2(f)

in terms of which

"SN

#2= (A,$K)2

($K,$K)

where A(f) = γ(f)Pgw(f)/(Pn1(f)Pn2

(f)).

This ratio of inner products is maximised by choosing

!K(f) ∝ A(f) = γ(f)Pgw(f)Pn1

(f)Pn2(f)

The overall normalisation is chosen so that ⟨Y ⟩ equals the amplitude of thestochastic GW background that we are looking for (e.g., Ωgw(f) evaluated atsome reference frequency). Then N2 is the variance of the estimator: σ2

Ωgw.

The shape of the stochastic GW spectrum is an input for the filter, but theoverall amplitude of the spectrum is an output of the analysis, estimated by thestatistic.

The expected signal-to-noise ratio for this optimal choice of !K(f) is

"SN

#opt

=√

T$%

∞

−∞df

γ2(f)Pgw(f)Pn1

(f)Pn2(f)

&1/2

Of special interest is the factor of√

T which means that by cross-correlatingthe detector outputs for a sufficiently long period of time, we can extract astochastic GW signal that sits well below the noise level.

Overlap reduction functionThe overlap reduction function γ(f) quantifies the loss in sensitivity to de-

tecting an isotropic stochastic GW background due to the separation and mis-alignment of a pair of detectors.

i.e. γ(f) = 1 for coincident and coaligned detectors (e.g., LHO 4km and2km).

Explicitly, γ(f) = 5

8π

!A

"d2Ωn ei2πfn·∆xFA

1 (n)FA2 (n)

whereFA

1,2(n) = eAij(!n)dij

1,2

are the antenna pattern functions for the two detectors.

γ(f) falls-off rapidly to zero when the frequency of a GW is large enoughfor its wavelength to fit in the separation between the two detectors. (Then thedifferent detectors can be driven out of phase by the passing GW.)

Data analysis pipelineSplit the available data into short segments (∼60-s) over which the detector

is stationary.

High pass filter the data above 40 Hz to avoid contamination from low fre-quency seismic noise.

Down-sample the data from 16384 Hz to 1024 Hz (fNyq = 512 Hz) afterapplying a low-pass anti-aliasing filter. (Don’t need higher frequencies due tooverlap reduction.)

Window the data (using a Hann window) to avoid large leakage of powerfrom strong spectral lines.

Calibrate the data to take into account the response of the interferometerto GWs. This converts measured voltages to GW strain h(t).

Calculate power spectra needed for construction of the optimal filter. (TakeΩgw(f) = const as the template for the type of stochastic background that we’relooking for.)

Apply the optimal filter to the data from the two detectors calculating thevalue of the cross-correlation statistic Yi (i.e., a point estimate of Ωgw) and thetheoretical error σi for each data segment i.

Combine the data using a weighted average to obtain a final point estimateand error bar:

!Ωgw =

"iσ−2i

Yi"j

σ−2j

and σ−2Ωgw

="

iσ−2i

Assuming a Gaussian distribution for !Ωgw, convert the point estimate anderror bar into a Bayesian posterior distribution for Ωgw, taking into accountthe prior probability distribution for Ωgw (from previous measurements), andmarginalise over any systematic errors like uncertainties in the calibration.

NOTE: Analysis code is checked via simulated signal injections in both soft-ware and hardware (actually move the mirrors in the same way that a stochasticGW background would).

Other searchesCross-correlating the output of the LLO interferometer and the ALLEGRO

resonant bar detector at around 900 Hz (the resonant frequency of the bar):(i) Good overlap due to proximity (40 km).(ii) Can rotate the bar to modulate the cross-correlation signal (capable of

producing a null measurement to get a handle on non GW correlations).(iii) not as sensitive as LLO-LHO searches.

Cross-correlating the two LHO detectors:(i) No reduction in sensitivity at higher frequencies due to the overlap re-

duction function since coincident and coaligned detectors. (Possible order-of-magnitude improvement in sensitivity to Ωgw for initial LIGO because of this.)

(ii) Problem to distinguish GW correlation from instrumental and environ-mental correlation. (Couldn’t use LHO-LHO data in previous searches due tolarge acoustic noise.)

Targeting different patches in the sky for excess GW flux by appropriatelytime-delaying the outputs of separated interferometers:

(i) Ideal for detecting anisotropies in the GW background.(ii) Capable of producing an all-sky map similar to CMB maps (although

currently dominated by noise!).(iii) Can also be used to look for excess GW flux from targeted sources like

Sco-X1 or known pulsars (but not optimal for these searches).

Extending Bayesian analyses to marginalise over to noise variance and todetermine spectral shape of the background: Being pursued by Emma Robinson,at B’ham.

Current results

All correlations so far are consistent with zero background. (Not surprisingsince the stochastic GW background is expected to be so weak.)

Upper limits for Ωgw(f) = const (with f roughly between 50 and 150 Hz):

S1: Ωgw < 44 (PRD 69, 122004, 2004)

S3: Ωgw < 8.4 × 10−4 (PRL 95, 221101, 2005)

S4: Ωgw < 6.5 × 10−5 (Astrophysical Journal, 659, 918, 2007)

We are currently analysing S5 data (design sensitivity, 1 year of coincidentdata). We hope to beat the BBNS bound using this data with LHO-LLO!

0 < 1.7 107

For O1 Advanced LIGO

Analisis de Datos de Ondas Gravitacionales 2da parte€¦ · 11/10/2017 · Analisis de Datos de...

Documents

Transcript of Analisis de Datos de Ondas Gravitacionales 2da parte€¦ · 11/10/2017 · Analisis de Datos de...