Signal Analysis Method  (see Dr. Kelly Fitz's site for more)


The roughness model calculates the roughness of sound signals using spectral information (frequency and amplitude values of a signal's spectral components).
Spectral analysis in SRA uses an improved Short Time Fourier Transform (STFT) algorithm, which is based on reassigned bandwidth-enhanced modeling (Fitz & Fulop, 2005; Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007), and incorporates an automatic spectral peak-picking process to determine which frequency analysis bands correspond to spectral components of the analyzed signal.
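
As a rough illustration of what a peak-picking step does, the sketch below marks the local maxima of a magnitude spectrum that rise above a level threshold. It is a generic example and not the procedure SRA uses; the function name `pick_peaks`, the `floor_db` threshold, and the sample values are assumptions made for illustration.

```python
import numpy as np

def pick_peaks(magnitudes, floor_db=-40.0):
    """Return indices of bins that are local maxima above a relative threshold.

    Hypothetical helper: floor_db is measured relative to the largest bin.
    """
    mag = np.asarray(magnitudes, dtype=float)
    threshold = mag.max() * 10.0 ** (floor_db / 20.0)
    interior = mag[1:-1]
    is_peak = (
        (interior > mag[:-2])      # higher than the bin below
        & (interior >= mag[2:])    # at least as high as the bin above
        & (interior >= threshold)  # and not buried near the noise floor
    )
    return np.flatnonzero(is_peak) + 1  # +1 restores indices into mag

# Made-up magnitude spectrum: two clear components and one very weak bump.
mags = [0.0, 1.0, 0.2, 0.1, 0.5, 0.1, 0.001, 0.002, 0.001]
peaks = pick_peaks(mags)
print(peaks)
```

With the default threshold, the weak bump falls below the floor and only the two stronger bins are kept as candidate spectral components.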

Frequency reassignment works differently from a traditional Fast Fourier Transform (FFT) analysis and has more in common with phase vocoder methods. As in a traditional FFT, an analysis with a frequency resolution of 10 Hz cannot resolve frequency components lying less than 10 Hz apart. Unlike a traditional FFT, however, the precision of the frequency values returned is not limited by this 10 Hz "bandwidth," because the frequency band boundaries are floating rather than fixed. This a) fine-tunes the frequencies reported and b) practically eliminates spectral smearing, since the method ensures that the assumption of all energy being located at the high-frequency end of an analysis band can be fulfilled.
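
The difference between bin spacing and reported frequency can be seen in a small numerical sketch. It assumes the derivative-window formulation of reassignment (Auger & Flandrin, 1995) applied to a single frame with a Hann window; the function name, sample rate, and test tone are illustrative choices, not part of SRA.

```python
import numpy as np

def reassigned_peak_frequency(x, fs):
    """Return (bin-centre frequency, reassigned frequency) in Hz for the
    strongest bin of one analysis frame. Illustrative sketch only."""
    n = len(x)
    m = np.arange(n)
    h = 0.5 - 0.5 * np.cos(2 * np.pi * m / (n - 1))           # Hann window
    dh = (np.pi / (n - 1)) * np.sin(2 * np.pi * m / (n - 1))  # its derivative per sample
    X_h = np.fft.rfft(x * h)    # ordinary windowed spectrum
    X_dh = np.fft.rfft(x * dh)  # spectrum taken with the derivative window
    k = int(np.argmax(np.abs(X_h)))
    f_bin = k * fs / n
    # The imaginary part of the ratio is the offset (rad/sample) between the
    # bin centre and the frequency where the energy actually sits.
    correction = -np.imag(X_dh[k] / X_h[k]) * fs / (2 * np.pi)
    return f_bin, f_bin + correction

fs = 8000.0                         # assumed sample rate in Hz
n = 800                             # 800 samples at 8 kHz gives 10 Hz bin spacing
t = np.arange(n) / fs
x = np.cos(2 * np.pi * 1003.7 * t)  # tone that falls between bin centres
f_bin, f_hat = reassigned_peak_frequency(x, fs)
print(f_bin, round(f_hat, 2))
```

The strongest bin sits on the 10 Hz grid, while the reassigned value should land close to the 1003.7 Hz of the test tone.
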
Similarly, as in a traditional FFT, a given analysis window length determines the length of the shortest signals that can be reliably analyzed. Unlike a traditional FFT, however, the temporal resolution of a signal's spectral (and therefore roughness) time-profiles is not limited by this "window length," because the frequency and amplitude estimates are not time-window averages but instantaneous values at the time-window's center. This a) pinpoints time with much higher precision than implied by the window length and b) practically eliminates temporal smearing, since the spectra estimated through time-window overlaps do not involve averaging over the entire analysis windows (Fitz & Haken, 2002; Fitz et al., 2003; Fulop & Fitz, 2006a,b, 2007).
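
The time correction can be sketched in the same spirit. The example assumes the time-weighted-window formulation of reassignment (Auger & Flandrin, 1995) and a single 100 ms frame containing one click; the function name and signal values are illustrative, not taken from SRA.

```python
import numpy as np

def reassigned_times(x, fs):
    """Return (window-centre time, per-bin reassigned times) in seconds,
    measured from the start of the frame. Illustrative sketch only."""
    n = len(x)
    m = np.arange(n)
    centre = (n - 1) / 2.0
    h = 0.5 - 0.5 * np.cos(2 * np.pi * m / (n - 1))  # Hann window
    th = (m - centre) * h                            # window weighted by time offset
    X_h = np.fft.rfft(x * h)
    X_th = np.fft.rfft(x * th)
    # The real part of the ratio is the offset (in samples) between the window
    # centre and the place where the energy in each bin is concentrated.
    offset = np.real(X_th / X_h)
    return centre / fs, (centre + offset) / fs

fs = 8000.0       # assumed sample rate in Hz
x = np.zeros(800)
x[530] = 1.0      # a single click, 66.25 ms into the frame
t_centre, t_hat = reassigned_times(x, fs)
print(t_centre, t_hat[:3])
```

A conventional analysis would attribute the frame's energy to the window centre, whereas every bin here reports a time at the click itself.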

In practical terms, spectral analysis results are fine-tuned through the incorporation of a dual STFT process. Frequency values reported correspond to the time derivative of the argument (phase) of the complex analytic signal representing a given frequency bin. Similarly, time values reported correspond to the frequency derivative of the STFT phase, defining the local group delay and applying a time correction that pinpoints the precise excitation time.
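
In symbols, and assuming an STFT whose phase is referenced to the center of a real, symmetric analysis window h (sign conventions differ between authors, so this is one consistent choice rather than necessarily the one used in SRA), the two corrections can be written as:

```latex
X(t,\omega) = \int x(\tau)\, h(\tau - t)\, e^{-j\omega(\tau - t)}\, d\tau
            = M(t,\omega)\, e^{j\phi(t,\omega)}

\hat{\omega}(t,\omega) = \frac{\partial \phi(t,\omega)}{\partial t},
\qquad
\hat{t}(t,\omega) = t - \frac{\partial \phi(t,\omega)}{\partial \omega}

% Check 1: a steady sinusoid x(\tau) = e^{j\omega_0 \tau} gives
% \phi(t,\omega) = \omega_0 t inside the main lobe, so \hat{\omega} = \omega_0.
% Check 2: an impulse x(\tau) = \delta(\tau - t_0) gives
% \phi(t,\omega) = -\omega (t_0 - t), so \hat{t} = t - (t - t_0) = t_0.
```

Under this convention a steady sinusoid is reported at its true frequency whichever bin it excites, and an impulse is reported at its true time wherever it falls inside the window.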

Therefore, the Reassigned Bandwidth-Enhanced Additive Model shares with traditional sinusoidal methods the notion of temporally connected parameter estimates of spectral components. By contrast, reassigned estimates are non-uniformly distributed in both time and frequency, yielding greater temporal and frequency precision than is possible via conventional additive techniques. Parameter envelopes of spectral components are obtained by following ridges on a time-frequency surface, using the reassignment method (Auger & Flandrin, 1995) to improve the time and frequency estimates for the envelope breakpoints.

Bandwidth enhancement expands the notion of a spectral component, permitting the representation of both sinusoidal and noise energy with a single component type. Reassigned bandwidth-enhanced components are defined by a trio of synchronized breakpoint envelopes, specifying the time-varying amplitude, center frequency, and noise content (or bandwidth) for each component. The amount of noise energy represented by each reassigned bandwidth-enhanced spectral component is determined through bandwidth association, a process of constructing the components' bandwidth envelopes.
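
One way to picture such a component is as a time-ordered list of breakpoints, each carrying the three envelope values at one instant. The sketch below is a hypothetical data structure, not the Loris or SRA API; the class names, the linear interpolation between breakpoints, and the treatment of bandwidth as a noise fraction between 0 and 1 are all assumptions.

```python
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class Breakpoint:
    time: float       # seconds
    frequency: float  # center frequency in Hz
    amplitude: float  # linear amplitude
    bandwidth: float  # assumed: share of the energy that is noise, 0..1

class Component:
    """A trio of synchronized envelopes stored as time-ordered breakpoints."""

    def __init__(self, breakpoints):
        self.breakpoints = sorted(breakpoints, key=lambda b: b.time)

    def at(self, t):
        """Return (frequency, amplitude, bandwidth) at time t, interpolating
        linearly between neighbouring breakpoints (an assumed choice)."""
        times = [b.time for b in self.breakpoints]

        def envelope(name):
            values = [getattr(b, name) for b in self.breakpoints]
            return float(np.interp(t, times, values))

        return envelope("frequency"), envelope("amplitude"), envelope("bandwidth")

# Two made-up breakpoints: a component fading in while drifting upward.
partial = Component([
    Breakpoint(0.0, 440.0, 0.0, 0.0),
    Breakpoint(0.1, 442.0, 1.0, 0.2),
])
frequency, amplitude, bandwidth = partial.at(0.05)
print(frequency, amplitude, bandwidth)
```

Because the three envelopes share the same breakpoint times, a single query returns a consistent amplitude, center frequency, and noise content for that instant.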


References

Auger, F. and Flandrin, P. (1995). "Improving the readability of time-frequency and time-scale representations by the reassignment method," IEEE Transactions on Signal Processing 43(5): 1068-1089.

Fitz, K. and Fulop, S.A. (2005). "A unified theory of time-frequency reassignment," Digital Signal Processing (preprint).

Fitz, K. and Haken, L. (2002). "On the use of time-frequency reassignment in additive sound modeling," Journal of the Audio Engineering Society 50(11): 879-893.

Fitz, K., Haken, L., Lefvert, S., Champion, C., and O'Donnell, M. (2003). "Cell-utes and flutter-tongued cats: Sound morphing using Loris and the Reassigned Bandwidth-Enhanced Model," Computer Music Journal 27(4): 44-65.

Fulop, S.A. and Fitz, K. (2006a). "Algorithms for computing the time-corrected instantaneous frequency (reassigned) spectrogram, with applications," J. Acoust. Soc. Am. 119(1): 360-371.

Fulop, S.A. and Fitz, K. (2006b). "A spectrogram for the twenty-first century," Acoustics Today 2(3): 26-33.

Fulop, S.A. and Fitz, K. (2007). "Separation of components from impulses in reassigned spectrograms," J. Acoust. Soc. Am. 121(3): 1510-1518.