Random Variables - UTKweb.eecs.utk.edu/~mjr/ECE504/PresentationSlides/... · Functions of a Random...
Transcript of Random Variables - UTKweb.eecs.utk.edu/~mjr/ECE504/PresentationSlides/... · Functions of a Random...
Random Variables
Randomness
• The word random effectively means unpredictable
• In engineering practice we may treat some signals as random to simplify the analysis even though they may not actually be random
Random Variable Defined
A random variable X ζ( ) is the assignment of numerical
values to the outcomes ζ of experiments
Random Variables Examples of assignments of numbers to the outcomes of experiments.
Discrete-Value vs Continuous-Value Random Variables
• A discrete-value (DV) random variable has a set of distinct values separated by values that cannot occur
• A random variable associated with the outcomes of coin flips, card draws, dice tosses, etc... would be DV random variable
• A continuous-value (CV) random variable may take on any value in a continuum of values which may be finite or infinite in size
Distribution Functions
The distribution function of a random variable X is theprobability that it is less than or equal to some value,as a function of that value. FX x( ) = P X ≤ x⎡⎣ ⎤⎦Since the distribution function is a probability it must satisfythe requirements for a probability. 0 ≤ FX x( ) ≤1 , − ∞ < x < ∞
P x1 < X ≤ x2⎡⎣ ⎤⎦ = FX x2( )− FX x1( )FX x( ) is a monotonic function and its derivative is never negative.
Distribution Functions
The distribution function for tossing a single die
FX x( ) = 1/ 6( ) u x −1( ) + u x − 2( ) + u x − 3( )+u x − 4( ) + u x − 5( ) + u x − 6( )⎡
⎣⎢⎢
⎤
⎦⎥⎥
Distribution Functions A possible distribution function for a continuous random variable
Probability Mass and Density
The derivative of the distribution function is the probability density function (PDF)
fX x( ) ≡ ddx
FX x( )( )Probability density can also be defined by fX x( )dx = P x < X ≤ x + dx⎡⎣ ⎤⎦Properties
fX x( ) ≥ 0 , − ∞ < x < +∞ fX x( )dx−∞
∞
∫ = 1
FX x( ) = fX λ( )dλ−∞
x
∫ P x1 < X ≤ x2⎡⎣ ⎤⎦ = fX x( )dxx1
x2
∫
The PDF for tossing a die
Probability Mass and Density
A DV random variable X is a Bernoulli random variable if it takes on only two values 0 and 1 and
pX x( ) = P X = x⎡⎣ ⎤⎦ =1− p , x = 0p , x = 10 , otherwise
⎧
⎨⎪
⎩⎪
and its PDF is fX x( ) = 1− p( )δ x( ) + pδ x −1( )pX x( ) is called the probability mass function (PMF).
Probability Mass and Density
Probability Mass and Density
Bernoulli PMF Bernoulli PDF
If we perform n trials of an experiment whose outcome is Bernoulli distributed and if X represents the total number of 1’s that occur in those n trials, then X is said to be a Binomial random variable and its PMF is
pX x( ) =nx
⎛
⎝⎜⎞
⎠⎟px 1− p( )n−x
, x ∈ 0,1,2,,n{ }0 , otherwise
⎧
⎨⎪
⎩⎪
and its PDF is
fX x( ) = nk
⎛
⎝⎜⎞
⎠⎟pk 1− p( )n−k
δ x − k( )k=0
n
∑
Probability Mass and Density
Probability Mass and Density
Binomial PMF Binomial PDF
Probability Mass and Density
If we perform Bernoulli trials until a 1 (success) occurs and the probability of a 1 on any single trial is p, the probability that the
first success will occur on the kth trial is p 1− p( )k−1. A DV random
variable X is said to be a Geometric random variable if its PMF is
pX x( ) = p 1− p( )x−1 , x ∈ 1,2,3,...∞{ }
0 , otherwise
⎧⎨⎪
⎩⎪and its PDF is
fX x( ) = p 1− p( )k−1δ x − k( )
k=1
∞
∑
Probability Mass and Density
Geometric PMF Geometric PDF
Probability Mass and Density
If we perform Bernoulli trials until the rth 1 occurs and the probability of a 1 on any single trial is p, the probability that therth success will occur on the kth trial is
P rth success on kth trial⎡⎣ ⎤⎦ =k −1r −1
⎛
⎝⎜⎞
⎠⎟pr 1− p( )k−r
A DV random variable Y is said to be a Negative - Binomial or Pascal random variable with parameters r and p if its PMF is
pY y( ) =y −1r −1
⎛
⎝⎜
⎞
⎠⎟ pr 1− p( )y−r
, y ∈ r,r +1,,∞{ }0 , otherwise
⎧
⎨⎪
⎩⎪
and its PDF is fY y( ) = k −1r −1
⎛
⎝⎜⎞
⎠⎟pr 1− p( )k−r
δ y − k( )k=r
∞
∑
Probability Mass and Density
Negative Binomial (Pascal) PMF
Negative Binomial (Pascal) PDF
Probability Mass and Density
Suppose we randomly place n points in the time interval 0 ≤ t < T with each point being equally likely to fall anywhere in that range. The probability that k of them fall inside an interval of length Δt < T inside that range is
P k inside Δt⎡⎣ ⎤⎦ =nk
⎛⎝⎜
⎞⎠⎟
pk 1− p( )n−k= n!
k! n − k( )! pk 1− p( )n−k
where p = Δt / T is the probability that any single point falls withinΔt. Further, suppose that as n→∞, n / T = λ, a constant. If λ is constant and n→∞ that implies that T →∞ and p→ 0. Then λ is the average number of points per unit time, over all time.
Probability Mass and Density Events occurring at random times
Probability Mass and Density
It can be shown that
P k inside Δt⎡⎣ ⎤⎦ =α k
k!limn→∞
1− αn
⎛⎝⎜
⎞⎠⎟
n
=e−α
= α k
k!e−α
where α = λΔt. A DV random variable is a Poisson randomvariable with parameter α if its PMF is
pX x( ) =α x
x!e−α , x ∈ 0,1,2,,∞{ }
0 , otherwise
⎧⎨⎪
⎩⎪
and its PDF is
fX x( ) = α k
k!e−αδ x − k( )
k=0
∞
∑
The uniform PDF
The probabilities of X being in Range #1 and X being in Range #2 are the same as long as both ranges have the same width and lie entirely between a and b
Probability Mass and Density
The Exponential PDFThe arrival times of photons in a light beam at a surface arerandom and have a Poisson distribution. Let the time betweenphotons be T . The mean time between photons is T . The probability that a photon arrives in any very short length of time
Δt located randomly in time is P photon arrival during Δt⎡⎣ ⎤⎦ =ΔtT
.
Let a photon arrive at time t0. What is the probabilitythat the next photon will arrive within the next t seconds?
Probability Mass and Density
The Exponential PDFFrom one point of view, the probability that a photon arriveswithin the time range t0 + t < T ≤ t0 + t + Δt is
P t0 + t < T < t0 + t + Δt⎡⎣ ⎤⎦ = FT t + Δt( )− FT t( ).This probability is also the product of the probability ofa photon arriving in any length of time Δt, which is Δt / T ,and the probability that no photon arrives before that time,which is P no photon before t0 + t⎡⎣ ⎤⎦ = 1− FT t( ).
Probability Mass and Density
The Exponential PDF
FT t + Δt( )− FT t( ) = 1− FT t( )⎡⎣ ⎤⎦ΔtT
Dividing both sides by Δt and letting it approach zero,
limΔt→0
FT τ + Δτ( )− FT τ( )Δt
= ddt
FT t( )( ) = 1− FT t( )T
, t ≥ 0
Solving the differential equation,
FT t( ) = 1− e− t /T( )u t( )⇒ fT t( ) = e− t /T
Tu t( )
Probability Mass and Density
The Erlang PDFThe Erlang PDF is a generalization of the exponential PDF.It is the probability of the time between one event and thekth event later in time.
fT ,k t( ) = t k−1e− t /T
T k k −1( )!u t( ) , k = 1,2,3,…
Notice that, for k = 1,
fT ,1 t( ) = e− t /T
Tu t( )
which is the same as the exponential PDF.
Probability Mass and Density
Probability Mass and Density The Gamma PDF The Gamma PDF is based on the Gamma function
Γ x( ) = λ x−1e−λdλ0
∞
∫ , x > 0
Some important properties of the Gamma function are Γ x +1( ) = xΓ x( ) , x >1 Γ n +1( ) = n! , n ≥ 0 and n an integer
Γ 1 / 2( ) = π , Γ 3 / 2( ) = 1 / 2( ) π
nk
⎛⎝⎜
⎞⎠⎟=
Γ n +1( )Γ k +1( )Γ n − k +1( )
Probability Mass and Density The Gamma PDF
fX x;α,β( ) = xα−1ex /β
βαΓ α( ) u x( ) , α > 0 , β > 0
Probability Mass and Density
The Gamma PDF If β = 1, the Gamma PDF is called the standard Gamma PDF
fX x;α( ) = xα−1ex
Γ α( ) u x( ) , α > 0
FX x;α( ) = fX λ( )dλ0
x
∫ = λα−1e−λ
Γ α( ) dλ0
x
∫The distribution function FX x;α( ) is also called in mathematicsthe incomplete Gamma function.
Probability Mass and Density
The Gamma PDF
Two special cases of the Gamma function are important.If we let α = 1 and β = 1 / λ we get the exponential PDF fX x( ) = λe−λx u x( ).If we let α = N / 2 and β = 2 we get the chi - squared PDF
fX x( ) = xN /2−1e− x /2
2N /2Γ N / 2( ) u x( )
in which N is called the number of degrees of freedom of X.
Functions of a Random Variable
Consider first a transformation from a DV random variable X to another DV random variable Y through Y = g X( ). If the
function g is invertible, then X = g−1 Y( ) and the PMF for Y is then
pY y( ) = pX g−1 y( )( ) where pX x( ) is the PMF for X . For the DV
random variable X the PDF consists only of impulses
fX x( ) = aiδ x − xi( )i=1
N
∑ where N is the number of impulses.
The PDF of Y also consists only of impulses and each impulse in the PDF of Y corresponds to an impulse in the PDF of X
fY y( ) = aiδ y − yi( )i=1
N
∑ = aiδ y − g xi( )( )i=1
N
∑
Functions of a Random Variable
If the function g is not invertible the PMF and PDF of Y can be found by finding the probability of each value of Y . Each value of X with non-zero probability causes a non-zero probability for the corresponding value of Y . So, for the ith value of Y ,
P Y = yi⎡⎣ ⎤⎦ = P X = xi,1⎡⎣ ⎤⎦ + P X = xi,2
⎡⎣ ⎤⎦ +
+ P X = xi,n⎡⎣ ⎤⎦ = P X = xi,k
⎡⎣ ⎤⎦k=1
n
∑
Functions of a Random Variable
Let X and Y be CV random variables and let Y = g X( ). Also, let
the function g be invertible, meaning that an inverse functionX = g−1 Y( ) exists and is single-valued as in the illustrations below.
Functions of a Random Variable
Then it can be shown that the PDF’s of X and Y are related by
fY y( ) = fX g−1 y( )( )dy / dx
Functions of a Random Variable
Let the PDF of X be fX x( ) = 14
rect x −14
⎛⎝⎜
⎞⎠⎟
and let Y = 2X +1.
Then X = g−1 Y( ) = Y −12
, dYdX
= 2.
Functions of a Random Variable
fY y( ) =
fX
y −12
⎛⎝⎜
⎞⎠⎟
2=
14
recty−1( ) / 2 −1
4
⎛
⎝⎜
⎞
⎠⎟
2= 1
8rect y − 3
8⎛⎝⎜
⎞⎠⎟
Functions of a Random Variable
Now let Y = −2X + 5⇒ X = g−1 Y( ) = 5−Y2
, dYdX
= −2
fY y( ) =fX
5− y2
⎛⎝⎜
⎞⎠⎟
2= 1
8rect 3− y
8⎛⎝⎜
⎞⎠⎟= 1
8rect y − 3
8⎛⎝⎜
⎞⎠⎟
Functions of a Random Variable
Now let Y = g X( ) = X 2. This is more complicated because the event
1< Y ≤ 4{ } is caused by the event 1< X ≤ 2{ } but it is also caused by
the event −2 ≤ X < −1{ }. If we make the transformation from
the PDF of X to the PDF of Y in two steps the process is simplerto see. Let Z = X .
Functions of a Random Variable
fZ z( ) = 14
rect z −1/ 2( ) + 14
rect z − 3 / 23
⎛⎝⎜
⎞⎠⎟
Z = Y , Y ≥ 0 , dYdZ
= 2Z = 2 Y , Y ≥ 0
fY y( ) =fZ g−1 y( )( )
dy / dz, y ≥ 0
0 , y < 0
⎧
⎨⎪
⎩⎪
⎫
⎬⎪
⎭⎪=
14
recty − 3 / 2
3
⎛
⎝⎜
⎞
⎠⎟ +
14
rect y −1/ 2( )2 y
u y( )
fY y( ) = 18 y
rect2 y − 3
6
⎛
⎝⎜
⎞
⎠⎟ + rect y − 1
2⎛⎝⎜
⎞⎠⎟
⎡
⎣⎢⎢
⎤
⎦⎥⎥
u y( )
Functions of a Random Variable
fY y( )dy
−∞
∞
∫ = 1
Functions of a Random Variable
In general, if Y = g X( ) and the real solutions of this equation are
x1 ,x2 ,xN then, for those ranges of Y for which there is a
corresponding X through Y = g X( ) we can find the PDF of Y .
Notice that for some ranges of X and there are multiple real solutions and for other ranges there may be fewer. In this figure there are three real values of x which produce , y1. But there is only one value of x which produces y2. Only the real solutions are used. So in some transformations, the transformation used may depend on the range of values of X and Y being considered.
Functions of a Random Variable
For those ranges of Y for which there is a corresponding real X through Y = g X( )fY y( ) = fX x1( )
dYdX
⎛⎝⎜
⎞⎠⎟ X =x1
+fX x2( )dYdX
⎛⎝⎜
⎞⎠⎟ X =x2
++fX xN( )dYdX
⎛⎝⎜
⎞⎠⎟ X =xN
In the previous example Y = g X( ) = X 2 and x1,2 = ± y
fY y( ) =fX y( )
2 y+
fX − y( )2 y
, y ≥ 0
0 , y < 0
⎧
⎨⎪⎪
⎩⎪⎪
⎫
⎬⎪⎪
⎭⎪⎪
=rect
y −14
⎛
⎝⎜
⎞
⎠⎟ + rect
− y −14
⎛
⎝⎜
⎞
⎠⎟
8 y, y ≥ 0
0 , y < 0
⎧
⎨
⎪⎪
⎩
⎪⎪
Functions of a Random Variable
One problem that arises when transforming CV random variables occurs when the derivative is zero. This occurs in any type of transformation for which Y is constant for a non-zero range of X . Since division by zero is undefined, the formula
fY y( ) = fX x1( )dy / dx( )x=x1
+fX x2( )
dy / dx( )x=x2
++fX xN( )
dy / dx( )x=xN
is not usable in the range of X in which Y is a constant. In these cases it is better to utilize a more fundamental relation between X and Y .
Functions of a Random Variable
Let fX x( ) = 1 / 6( )rect x / 3( )
Let Y =3X − 2 , X ≥11 , X <1⎧⎨⎩
We can say P Y = 1[ ] = P X <1[ ] = 2 / 3. So Y has a probability of 2/3 of being exactly one. That means that there must be an impulse in fY y( ) at y = 1 and the strength of the impulse is 2/3. Inthe remaining range of Y we can use
fY y( ) = fX x1( )dy / dx( )x=x1
+fX x2( )
dy / dx( )x=x2
++fX xN( )
dy / dx( )x=xN
Functions of a Random Variable
For this example, the PDF of Y would be
fY y( ) = 2 / 3( )δ y −1( ) + 1/ 18( )rect y + 2( ) / 18( )u y −1( )
or, simplifying,
fY y( ) = 2 / 3( )δ y −1( ) + 1/ 18( )rect y − 4( ) / 6( )
The Inverse Problem
.
Given a random variable X with a known PDF fX x( ) it is desired to
find a functional transformation Y = g X( ) that produces a
random variable Y with a desired PDF fY y( ). First let the PDF
of Y be uniform in the interval 0 < y ≤1. The function that converts the PDF of X to this uniform PDF of Y is Y = FX X( ). This implies that y = P X ≤ x⎡⎣ ⎤⎦ and therefore, 0 < y ≤1. If X ≤ x
since FX x( ) is monotonic then Y ≤ y implying that P Y ≤ y⎡⎣ ⎤⎦ = P X ≤ x⎡⎣ ⎤⎦.
Combining, FY y( ) = P Y ≤ y⎡⎣ ⎤⎦ = P X ≤ x⎡⎣ ⎤⎦ = FX x( ) = y , 0 < y ≤1
and fY y( ) = 1 , 0 < y ≤1.
Now suppose the PDF of X is uniform and we want an arbitrary PDF of Y , fY y( ). This is simply the previous exercise in reverse. If choosing
Y = FX X( ) converts an arbitrary PDF of X into a uniform PDF for Y
between 0 and 1, then Y = FY−1 X( ) converts a uniform PDF for X
between 0 and 1 into an arbitrary PDF for Y . Now, to convert an arbitrary PDF of one random variable to an arbitrary PDF of another random variable we just combine these two operations into one.
The simplest solution is Y = FY-1 FX X( )( ).
The Inverse Problem
Another approach to the problem of changing one PDF into another is
to start with the general relationship fY y( ) = fX g−1 y( )( )dy / dx
. If fY y( ) is
to be the constant one in the range 0 < y ≤1, then dy / dx must be the
same as fX g−1 y( )( ) in that same range. So we want the magnitude of
the derivative of y = g x( ) to be the same as fX x( ) which is the derivative
of FX x( ). So if we choose g x( ) to be the same as FX x( ), when we
differentiate we will get the correct denominator to make the PDF of Y uniform. Because of the magnitude operation there is another solution, Y = 1− FX X( ). In this second solution the mapping between X and Y is
different but the distribution of values is the same.
The Inverse Problem
Example
A random variable X has a PDF
fX x( ) = 2x , 0 < x <10 , otherwise⎧⎨⎩
and we want to convert it into another random variable Y with a PDF
fY y( ) = 1− y , y <1
0 , otherwise
⎧⎨⎪
⎩⎪What is the desired function Y = g X( )?
The Inverse Problem
fX x( ) = 2x , 0 < x <10 , otherwise⎧⎨⎩
, fY y( ) = 1− y , y <1
0 , otherwise
⎧⎨⎪
⎩⎪
FX x( ) =0 , x < 0x2 , 0 < x <11 , x >1
⎧
⎨⎪
⎩⎪
, FY y( ) =0 , y < −1y2 /2 + y +1/ 2 , −1< y < 0y − y2 / 2 +1/ 2 , 0 < y <11 , y >1
⎧
⎨⎪⎪
⎩⎪⎪
FY−1 y( ) = −1+ 2y , 0 < y <1/ 2
1− 2 1− y( ) , 1 / 2 < y <1
⎧⎨⎪
⎩⎪
FY−1 FX x( )( ) = −1+ 2x2 , 0 < x2 <1/ 2
1− 2 1− x2( ) , 1 / 2 < x2 <1
⎧⎨⎪
⎩⎪
Integration Integration
Inversion
The Inverse Problem
The Inverse Problem
Expectation and Moments
Imagine an experiment with M possible distinct outcomes
performed N times. The average of those N outcomes is X = 1N
nixii=1
M
∑where xi is the ith distinct value of X and ni is the number of
times that value occurred. Then X = 1N
nixii=1
M
∑ =ni
Nxi
i=1
M
∑ = rixii=1
M
∑ .
The expected value of X is
E X( ) = limN→∞
ni
Nxi
i=1
M
∑ = limN→∞
rixii=1
M
∑ = P X = xi⎡⎣ ⎤⎦ xii=1
M
∑ .
Expectation and Moments
The probability that X lies within some small range can be
approximated by P xi −Δx2
< X ≤ xi +Δx2
⎡
⎣⎢
⎤
⎦⎥ ≅ fX xi( )Δx
and the expected value is then approximated by
E X( ) = P xi −Δx2
< X ≤ xi +Δx2
⎡
⎣⎢
⎤
⎦⎥ xi
i=1
M
∑ ≅ xi fX xi( )Δxi=1
M
∑where M is now the number of subdivisions of width Δx of the range of the random variable.
Expectation and Moments
In the limit as Δx approaches zero, E X( ) = x fX x( )dx−∞
∞
∫ .
Similarly E g X( )( ) = g x( )fX x( )dx−∞
∞
∫ .
The nth moment of a random variable is E X n( ) = xn fX x( )dx−∞
∞
∫ .
Expectation and Moments
The first moment of a random variable is its expected value
E X( ) = x fX x( )dx−∞
∞
∫ . The second moment of a random variable
is its mean-squared value (which is the mean of its square, not the square of its mean).
E X 2( ) = x2 fX x( )dx−∞
∞
∫
Expectation and Moments
A central moment of a random variable is the moment ofthat random variable after its expected value is subtracted.
E X − E X( )⎡⎣ ⎤⎦n⎛
⎝⎞⎠ = x − E X( )⎡⎣ ⎤⎦
nfX x( )dx
−∞
∞
∫The first central moment is always zero. The second centralmoment (for real-valued random variables) is the variance,
σ X2 = E X − E X( )⎡⎣ ⎤⎦
2⎛⎝
⎞⎠ = x − E X( )⎡⎣ ⎤⎦
2fX x( )dx
−∞
∞
∫The positive square root of the variance is the standarddeviation.
Expectation and Moments
Properties of Expectation
E a( ) = a , E aX( ) = a E X( ) , E Xnn∑⎛⎝⎜
⎞⎠⎟= E Xn( )
n∑
where a is a constant. These properties can be use to prove
the handy relationship σ X2 = E X 2( )− E2 X( ). The variance of
a random variable is the mean of its square minus the square of its mean.
Expectation and Moments
For complex-valued random variables absolute moments are useful. The nth absolute moment of a random variable is defined by
E Xn( ) = x
nfX x( )dx
−∞
∞
∫and the nth absolute central moment is defined by
E X − E X( ) n⎛⎝
⎞⎠ = x − E X( ) n
fX x( )dx−∞
∞
∫
Expectation and Moments
Let Z = X + jY .
E Z2( ) = E X + jY
2( ) = E X 2( ) + E Y 2( ) = 2 / 3
σ Z2 = E Z − E Z( ) 2⎛
⎝⎞⎠ = E X + jY − E X + jY( ) 2⎛
⎝⎞⎠
and it can be shown that
σ Z2 = σ X
2 +σY2 = E Z
2( )− E Z( ) 2
Notice that for a real-valued random variable X , if n is an even
number X − E X( ) n= X − E X( )⎡⎣ ⎤⎦
n.
Conditional Probability
Distribution Function
FX |A x( ) = P X ≤ x | A⎡⎣ ⎤⎦ =P X ≤ x( )∩ A⎡⎣ ⎤⎦
P A⎡⎣ ⎤⎦This is the distribution function for x given that the condition A exists. 0 ≤ FX |A x( ) ≤1 , − ∞ < x < ∞
FX |A −∞( ) = 0 and FX |A +∞( ) = 1
P x1 < X ≤ x2 | A⎡⎣ ⎤⎦ = FX |A x2( )− FX |A x1( ) FX |A x( ) is a monotonic function of x
Conditional Probability
Let A be A = X ≤ a{ } where a is a constant.
FX |A x( ) = P X ≤ x | X ≤ a⎡⎣ ⎤⎦ =P X ≤ x( )∩ X ≤ a( )⎡⎣ ⎤⎦
P X ≤ a⎡⎣ ⎤⎦If a ≤ x then P X ≤ x( )∩ X ≤ a( )⎡⎣ ⎤⎦ = P X ≤ a⎡⎣ ⎤⎦and
FX |A x( ) = P X ≤ x | X ≤ a⎡⎣ ⎤⎦ =P X ≤ a⎡⎣ ⎤⎦P X ≤ a⎡⎣ ⎤⎦
= 1
If a ≥ x then P X ≤ x( )∩ X ≤ a( )⎡⎣ ⎤⎦ = P X ≤ x⎡⎣ ⎤⎦ and
FX |A x( ) = P X ≤ x | X ≤ a⎡⎣ ⎤⎦ =P X ≤ x⎡⎣ ⎤⎦P X ≤ a⎡⎣ ⎤⎦
=FX x( )FX a( )
Conditional Probability
Conditional PDF fX |A x( ) = ddx
FX |A x( )( )Conditional expected value of a function
E g X( ) | A( ) = g x( )fX |A x( )dx−∞
∞
∫
Conditional mean E X | A( ) = x fX |A x( )dx−∞
∞
∫