Signal Analysis Method
(see Dr. Kelly Fitz's site for more)
The roughness model calculates the roughness of sound signals using spectral
information (frequency and amplitude values of a signal’s spectral components).
Spectral analysis in SRA uses an improved Short Time Fourier Transform (STFT) algorithm,
which is based on reassigned bandwidth-enhanced modeling (Fitz & Fulop, 2005;
Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007), and
incorporates an automatic spectral peak-picking process to
determine which frequency analysis bands correspond to spectral components of
the analyzed signal.
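As a rough illustration of what spectral peak-picking involves, the sketch
below keeps only those frequency bins that are local maxima of the magnitude
spectrum and lie above a threshold. It is a generic textbook approach, not the
peak-picking procedure used in SRA; the function name pick_spectral_peaks, the
-40 dB threshold, and the two-tone test signal are all assumptions made for the
example.

```python
import numpy as np

def pick_spectral_peaks(magnitudes, freqs, threshold_db=-60.0):
    """Return (frequency, magnitude) pairs for local maxima above a threshold.

    threshold_db is measured relative to the largest magnitude in the frame.
    """
    mags = np.asarray(magnitudes, dtype=float)
    floor = mags.max() * 10.0 ** (threshold_db / 20.0)
    # A bin counts as a peak if it exceeds its left neighbour, is at least
    # as large as its right neighbour, and lies above the threshold.
    interior = mags[1:-1]
    is_peak = (interior > mags[:-2]) & (interior >= mags[2:]) & (interior > floor)
    idx = np.nonzero(is_peak)[0] + 1
    return [(float(freqs[i]), float(mags[i])) for i in idx]

# Hypothetical test signal: two sinusoids at 440 Hz and 1000 Hz.
fs = 8000
n = 800                                   # 0.1 s frame -> 10 Hz bin spacing
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)   # periodic Hann
spectrum = np.abs(np.fft.rfft(x * window))
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Two peaks are expected, one per sinusoid.
peaks = pick_spectral_peaks(spectrum, freqs, threshold_db=-40.0)
```

Note that the frequencies returned here are still bin centers; refining them
beyond the bin spacing is the job of the reassignment step.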
Frequency reassignment works differently from traditional Fast Fourier
Transform (FFT) and has more in common with phase vocoder methods.
For example, as in a traditional FFT, a frequency resolution of 10 Hz cannot
resolve frequency components lying less than 10 Hz apart. But,
unlike traditional FFT, the precision of the frequency values returned will not
be limited by this 10 Hz "bandwidth," since the frequency band boundaries are
floating rather than being fixed. This a) fine-tunes the frequencies reported and b)
practically eliminates spectral smearing, since the method ensures that the
assumption of all energy being located at the high-frequency end of an analysis
band can be fulfilled.
Similarly, as in traditional FFT, a given analysis window length determines the
length of the shortest signals that can be reliably analyzed. But, unlike
traditional FFT, the temporal resolution of a
signal's spectral (and therefore roughness) time-profiles will
not be limited by this "window length," since the frequency and amplitude
estimates are not time-window averages but instantaneous at the time-window's
center. This a) pinpoints time with much higher precision than implied by the
window length and b) practically eliminates temporal smearing, since the spectra
estimated through time-window overlaps do not involve averaging over the entire
analysis windows (Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007).
In practical terms, spectral analysis results are fine-tuned through the
incorporation of a dual STFT process. Frequency values reported correspond to
the time derivative of the argument (phase) of the complex analytic signal
representing a given frequency bin. Similarly, time values reported correspond
to the frequency derivative of the STFT phase, defining the local group delay
and applying a time correction that pinpoints the precise excitation time.
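The two corrections can be sketched for a single analysis frame as follows.
This is a minimal illustration, not the SRA implementation: it assumes the
common formulation in which the phase derivatives are obtained from auxiliary
STFTs computed with a time-derivative window and a time-weighted window
(following Auger & Flandrin, 1995), and it assumes a Hann window. The function
name reassign_frame, the sample rate, and the test signals are hypothetical.

```python
import numpy as np

def reassign_frame(x, center, half_len, fs):
    """Reassigned estimates for the strongest bin of one analysis frame.

    x        : 1-D real signal
    center   : sample index of the window center
    half_len : the window spans half_len samples either side of the center
    fs       : sample rate in Hz
    Returns (bin center frequency, reassigned frequency, reassigned time).
    """
    m = np.arange(-half_len, half_len + 1)
    tau = m / fs                                     # time re. center, s
    T = 2 * half_len / fs                            # window duration, s
    h = 0.5 + 0.5 * np.cos(2 * np.pi * tau / T)      # Hann window (assumed)
    dh = -(np.pi / T) * np.sin(2 * np.pi * tau / T)  # its time derivative
    th = tau * h                                     # time-weighted window

    frame = x[center - half_len : center + half_len + 1]
    X_h = np.fft.rfft(frame * h)
    X_dh = np.fft.rfft(frame * dh)
    X_th = np.fft.rfft(frame * th)
    freqs = np.fft.rfftfreq(len(frame), d=1 / fs)

    k = int(np.argmax(np.abs(X_h)))                  # strongest bin
    power = np.abs(X_h[k]) ** 2
    # Frequency correction: time derivative of the STFT phase.
    f_hat = freqs[k] - np.imag(X_dh[k] * np.conj(X_h[k])) / power / (2 * np.pi)
    # Time correction: local group delay relative to the window center.
    t_hat = center / fs + np.real(X_th[k] * np.conj(X_h[k])) / power
    return float(freqs[k]), float(f_hat), float(t_hat)

fs = 8000
t = np.arange(1600) / fs

# A 1003.7 Hz tone analyzed with a 0.1 s window (bins roughly 10 Hz apart).
tone = np.cos(2 * np.pi * 1003.7 * t + 0.3)
bin_freq, f_hat, _ = reassign_frame(tone, center=800, half_len=400, fs=fs)

# A single click 15 ms after the center of the window (window center: 0.1 s).
click = np.zeros(1600)
click[920] = 1.0
_, _, t_click = reassign_frame(click, center=800, half_len=400, fs=fs)
```

In this sketch f_hat should land much closer to 1003.7 Hz than the bin center
bin_freq does, and t_click should fall at the click's position (0.115 s) rather
than at the window center.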
The Reassigned Bandwidth-Enhanced Additive Model shares with
traditional sinusoidal methods the notion of temporally connected parameter
estimates of spectral components. Unlike those methods, however, its reassigned
estimates are non-uniformly distributed in both time and frequency, yielding
greater temporal and frequency precision than is possible via conventional
additive techniques.
Parameter envelopes of spectral components are obtained by following ridges on a
time-frequency surface, using the reassignment method (Auger & Flandrin, 1995) to
improve the time and frequency estimates for the envelope breakpoints.
Bandwidth enhancement expands the notion of a spectral component,
permitting the representation of both sinusoidal and noise energy with a single
component type. Reassigned bandwidth-enhanced components are
defined by a trio of synchronized breakpoint envelopes, specifying the
time-varying amplitude, center frequency, and noise content (or bandwidth) for
each component. The amount of noise energy represented by each reassigned
bandwidth-enhanced spectral component is determined through
bandwidth association, a process of constructing the components'
bandwidth envelopes.
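One way to picture the trio of synchronized envelopes is as a list of
breakpoints that each carry a time together with the three parameter values.
The sketch below is only a data-structure illustration under stated
assumptions: the Breakpoint class and envelopes_at helper are hypothetical
names, bandwidth is assumed to be a noise fraction between 0 and 1, and the
envelopes are assumed to be linearly interpolated between breakpoints.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Breakpoint:
    time: float        # seconds
    frequency: float   # Hz, center frequency of the component
    amplitude: float   # linear amplitude
    bandwidth: float   # assumed noise fraction in [0, 1]; 0 = pure sinusoid

def envelopes_at(breakpoints, t):
    """Linearly interpolate a component's three envelopes at time t.

    breakpoints must be sorted by increasing time.
    """
    times = [bp.time for bp in breakpoints]
    freq = np.interp(t, times, [bp.frequency for bp in breakpoints])
    amp = np.interp(t, times, [bp.amplitude for bp in breakpoints])
    bw = np.interp(t, times, [bp.bandwidth for bp in breakpoints])
    return float(freq), float(amp), float(bw)

# Hypothetical component: a slightly rising tone that grows noisier as it
# decays. Breakpoint times are deliberately not evenly spaced.
partial = [
    Breakpoint(0.000, 440.0, 0.0, 0.0),
    Breakpoint(0.013, 441.0, 0.8, 0.1),
    Breakpoint(0.250, 444.0, 0.2, 0.5),
]

# Midway between the second and third breakpoints.
freq, amp, bw = envelopes_at(partial, 0.1315)
```

Because all three envelopes share the same breakpoint times, a single lookup
yields a consistent amplitude, frequency, and noise content for any instant.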
REFERENCES
Auger, F. and Flandrin, P. (1995). "Improving the readability of time-frequency and time-scale representations by the reassignment method,"
IEEE Transactions on Signal Processing 43: 1068-1089.
Fitz, K. and Fulop, S.A. (2005). "A unified theory of time-frequency reassignment,"
Digital Signal Processing (preprint).
Fitz, K. and Haken, L. (2002). "On the use of time-frequency reassignment in additive sound modeling,"
Journal of the Audio Engineering Society 50(11): 879-893.
Fitz, K., Haken, L., Lefvert, S., Champion, C., and O'Donnell, M. (2003).
"Cell-utes and flutter-tongued cats: Sound morphing using Loris and the Reassigned Bandwidth-Enhanced Model,"
Computer Music Journal 27(4): 44-65.
Fulop, S.A. and Fitz, K. (2006a). "Algorithms for computing the time-corrected instantaneous
frequency (reassigned) spectrogram, with applications,"
J. Acoust. Soc. Am. 119(1): 360-371.
Fulop, S.A. and Fitz, K. (2006b). "A spectrogram for the
twenty-first century," Acoustics Today 2(3): 26-33.
Fulop, S.A. and Fitz, K. (2007). "Separation of components from
impulses in reassigned spectrograms," J. Acoust. Soc. Am. 121(3): 1510-1518.
