Time Frequency Representation - nyu.edu · Discrete Fourier Transform (DFT) X ... •The Inverse...

Click here to load reader

  • date post

  • Category


  • view

  • download


Embed Size (px)

Transcript of Time Frequency Representation - nyu.edu · Discrete Fourier Transform (DFT) X ... •The Inverse...

  • Time Frequency Representation

    Juan Pablo BelloMPATE-GE 2623 Music Information RetrievalNew York University

  • Discrete Signal and Sampling



    x(tn) = Q(x(t) p(t))

  • X()x(t)



    P ()


  • Aliasing


    P ()


    P ()

    X(k) X(k)

  • Aliasing

    = 2f

    k = 2(fs f)

  • Discrete Fourier Transform (DFT)

    X(k) N1

    n=0 x(tn)ejktn

    x = input signaltn =

    nfs = discrete time (s), n 0 is an integer

    fs = sampling rate (Hz)X = spectrum of xk = k = discrete frequency (rad/s), k 0 is an integer = 2( fsN ) = frequency sampling interval (rad/s)N = number of time/frequency samples

    Simple form :

    X(k) N1

    n=0 x(n)ej2nk/N

  • Discrete Fourier Transform (DFT)


    X(N 1)


    s0(0) s0(1) s0(N 1)

    s1(0) s1(1) s1(N 1)


    . . ....

    sN1(0) sN1(1) sN1(N 1)


    x(N 1)

    This can also be written as:

    X(k) = x(n), sk(n)

    Which can be formulated as a matrix multiplication:

  • Discrete Fourier Transform (DFT)

    Real part Imaginary partwhere,

    sk(n) = ej2nk/N = cos(2nk/N) + jsin(2nk/N)

    is the set of the sampled complex sinusoids witha whole number of periods in N samples (Smith, 2007).

  • Discrete Fourier Transform (DFT)

    The N resulting X(k) are complex-valued vectorsXR(k) + jXI(k) such that, k = 0, 1, , N 1:

    |X(k)| =

    X2R(k) +X2I (k)

    X = (k) = tan1 XI(k)XR(k)

    Furthermore, if x(n) is real-valued, then:

    X(k) = X(N k)

    The DFT of an audio signal is half-redundant!

  • Discrete Fourier Transform (DFT)

    Conjugate PairsSpectrum

  • The IDFT and FFT

    The Inverse DFT (IDFT) is defined as:

    The DFT needs on the order of N2 operations for its computation.

    The Fast Fourier Transform (FFT) is an efficient implementation of the DFT, that only requires on the order of Nlog2N operations when N is a power of 2.

    The FFT is so fast that it can be used to efficiently perform time-domain operations such as convolution.

    x(n) 1N



    X(k)ej2nk/N , n = 0, 1, , N 1

  • Spectral Leakage


    Whole number of periods Fractional number of periods

    N N


  • Zero padding

    N-long signal Zero padded to 8xN length

    Spectrum Interpolated Spectrum

  • Windows

    We are effectively using a rectangular window: w(n)

    Spectrum = convolution of X(k) and W(k)

    Ideal window: narrow central lobe; strong attenuation in sidebands

    Figures show Magnitude in dB:

    dB(X) = 20 log10(X)

  • Windowing

  • Short-Time Fourier Transform (STFT)

    hop size




  • Spectrogram

  • Time vs Frequency Resolution

    frequency time

    frequency time

  • Instantaneous Frequency



    The instantaneous frequency for frequency bin k at timeinstant mh can be defined as (Arfib et al., 2003):

    fi, k(m) =12

    dk(m)dt =




    k(m) = kh+ princarg[k(m) k(m 1) kh]


    princarg(x) = + [(x+ )mod(2)]

    wraps the phase to the (,] range.

  • Instantaneous Frequency

  • Instantaneous Frequency

  • The signal is approximated as a sum of time-varying sinusoidal components plus a residual:

    Sinusoidal Modeling

    x(n) K

    k=0 ak(n)cos(k(n)) + e(n)


    ak = instantaneous amplitudek = instantaneous phasee(n) = residual (noise)

  • Peak picking + interpolation

    Sinusoidal components are peak-picked.Instantaneous magnitude and phase values are obtained by interpolation.Components are tracked over time

  • Sinusoidal tracking

  • References

    Smith, J.O. Mathematics of the Discrete Fourier Transform (DFT). 2nd Edition, W3K Publishing (2007)

    Zlzer, U. (Ed). DAFX: Digital Audio Effects. John Wiley and Sons (2002): chapter 8, Arfib, D., Keiler, F. and Zlzer, U., Time-frequency Processing; and chapter 10: Amatriain, X., Bonada, J., Loscos, A. and Serra, X. Spectral Processing.

    Pohlmann, K.C. Principles of Digital Audio. 6th Edition. McGraw Hill (2011): chapter 2.

    Loy, G. Musimathics, Vol.2. MIT Press (2011): chapter 3.