Fundamentals of Sound |
Perceptual attributes of acoustic
waves:
Pitch
Definition - Ranges - JND
Pitch:
Sonic (i.e. perceptual) attribute of sound waves, related mainly to
frequency.
Discussions on pitch usually revolve around music. However, pitch and pitch contours are quite significant in speech and, in some languages, pitch inflections carry specific semantic meaning. The frequency range of hearing extends from ~20Hz to ~20,000Hz (or 20kHz). These values constitute the low and high absolute thresholds of frequency perception, respectively. Listen to a sine signal sweeping through this range. [This will be a test more of your listening equipment than of your hearing].
The frequency range that can give an accurately identifiable pitch
sensation extends from
~30Hz to ~5000Hz (or 5kHz).
Maximal pitch accuracy extends from ~60Hz to ~3800Hz (or 3.8kHz), i.e. ~six octaves above ~60Hz (~B1): 60 x 2^6 = 3840Hz, or ~3800Hz (~B7).
Frequencies above 10kHz give rise to pitch sensations that, although they may be distinguishable from one another, are hard to identify in terms of height and cannot accurately portray the direction of pitch change.
Complex-tone
spectral components with frequencies above 10,000Hz usually represent the 'noisy'
portions of musical sounds (bow scrapings, reed attacks, hammer hits, etc.)
and have timbral (tone color) rather than pitch significance. This 'noisy'
portion, corresponding to spectral energy above 10,000Hz, often
correlates with the degree of a complex
signal's perceived "naturalness." | |
JND for Pitch: ~0.3-1% of frequency, depending on register
(i.e. on frequency region - see the figure, below).
[Reminder: JND (just noticeable difference) or difference threshold refers to the smallest perceivable change in a physical variable] Click, below, to listen to pairs of successive tones ranging from 440-441Hz to 440-448Hz. What is the smallest frequency difference you can perceive as a change in pitch? |
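To make the listening test concrete, the short sketch below converts the ~0.3-1% JND range quoted above into Hz at 440Hz and expresses each tone pair as a percentage change. The function names are illustrative, and the percentages are only the figures quoted in this section, not a model of any individual listener.

```python
def jnd_range_hz(freq_hz, low_pct=0.3, high_pct=1.0):
    """Return the (smallest, largest) JND in Hz for the ~0.3-1% range quoted above."""
    return freq_hz * low_pct / 100.0, freq_hz * high_pct / 100.0


def percent_difference(f1_hz, f2_hz):
    """Frequency change from f1 to f2, as a percentage of f1."""
    return 100.0 * (f2_hz - f1_hz) / f1_hz


low, high = jnd_range_hz(440.0)
print(f"JND at 440 Hz: {low:.2f} Hz to {high:.2f} Hz")

# The tone pairs from the listening example: 440-441 Hz up to 440-448 Hz
for f2 in range(441, 449):
    print(f"440-{f2} Hz: {percent_difference(440.0, f2):.2f}%")
```

Under these figures, the 440-441Hz pair (about 0.23%) falls just below the quoted range, while the 440-445Hz pair (about 1.14%) already exceeds it.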
Pitch of Pure
Tones
Pitch & Frequency / Intensity / Duration - Music Theory / Tuning
Perception
Dependence on Frequency |
For simple/pure tones, pitch closely relates to frequency. Similarly to SIL [Sound Intensity Level] and loudness, frequency and pitch relate logarithmically: addition in perception (pitch) corresponds to multiplication in the physical variable (frequency). As the frequency rises, the same pitch interval corresponds to increasingly larger frequency differences. For example, pitch increase by an octave interval corresponds to frequency doubling. So, raising 200Hz by an octave means adding 200Hz, while raising 500Hz by an octave means adding 500Hz.
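The multiplicative relationship described above can be checked with a few lines of arithmetic. This is a minimal sketch with illustrative function names; it only restates the octave-as-doubling rule and says nothing about how pitch is actually perceived.

```python
import math


def raise_by_octaves(freq_hz, octaves=1):
    """Raising pitch by n octaves multiplies frequency by 2**n."""
    return freq_hz * 2 ** octaves


def octaves_between(f1_hz, f2_hz):
    """Pitch distance in octaves is the base-2 logarithm of the frequency ratio."""
    return math.log2(f2_hz / f1_hz)


# The same pitch interval (one octave) adds a different number of Hz each time
for f in (200.0, 500.0):
    up = raise_by_octaves(f)
    print(f"{f:.0f} Hz + 1 octave = {up:.0f} Hz (added {up - f:.0f} Hz)")

# Six octaves above 60 Hz, as in the range of maximal pitch accuracy
print(raise_by_octaves(60.0, 6))        # 3840.0
print(octaves_between(60.0, 3840.0))    # 6.0
```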
The figure, below, illustrates the frequency / pitch relationship. The same pitch interval (e.g. octave) corresponds to an increasingly larger frequency distance (after Campbell and Greated, 2001) |
|
Music Theory Primer - Use this virtual piano |
Interval: perceived
pitch distance between two tones (whether pure/simple or complex).
Each of the 13 possible intervals outlined by the notes in the chromatic scale has a unique sonic character or 'signature sound,' related to the different ways the frequency components of the interval notes interact within the ear. Combinations of these signature sounds can create cognitively recognizable patterns. Click, below, to listen to all thirteen intervals, starting at C4 (synthesized notes).
C4-C4 C4-C#4 C4-D4 C4-D#4 C4-E4 C4-F4 C4-F#4 C4-G4 C4-G#4 C4-A4 C4-A#4 C4-B4 C4-C5
For tuning and tuning-comparison purposes, each semitone is further divided into 100 log-equal parts called "cents."
Tuning - Music Performance & Perception
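As a rough illustration of cents, the sketch below lists the thirteen intervals above C4 with their frequencies and sizes in cents. It assumes 12-tone equal temperament with A4 = 440Hz, which this section does not specify; other tunings would give different frequencies for the same note names.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
A4_HZ = 440.0  # assumed reference tuning, not stated in the text


def equal_tempered_hz(semitones_from_a4):
    """Frequency of the note a given number of equal-tempered semitones from A4."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


def cents_between(f1_hz, f2_hz):
    """Interval size in cents: 1200 cents per octave, 100 per semitone."""
    return 1200 * math.log2(f2_hz / f1_hz)


c4 = equal_tempered_hz(-9)  # C4 lies 9 semitones below A4
for k in range(13):
    name = NOTE_NAMES[k % 12] + ("5" if k == 12 else "4")
    f = equal_tempered_hz(-9 + k)
    print(f"C4-{name}: {f:7.2f} Hz, {cents_between(c4, f):6.1f} cents")
```

Because cents are logarithmic, the same function can compare two tunings of the same note: `cents_between(440.0, 442.0)` gives the offset between two slightly different A4 references.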
|
Dependence on Intensity |
The pitch of pure tones also depends on
intensity (see the figures to the left). In general, increasing the intensity of pure tones:
In addition,
the introduction of a
high-intensity "interference" tone will change the perceived pitch of
existing low-intensity tones, even if the
frequency of the low-intensity tones remains unchanged.
In other words, the high intensity tone pushes the
pitch of low intensity tones away from it, assuming
frequency separations beyond one critical bandwidth.
The figure to the left offers another, simplified illustration of the average dependence
of the pitch of pure tones on intensity. |
Dependence on Duration |
|
Pitch also depends on duration. A tone must last more than a minimum amount of time (~10-60ms, depending on frequency and intensity) in order to sound more than a 'click' and convey a clear sense of pitch (see below). We will return to this during the module on Timbre.
|
Pitch of Complex Tones
Pitch & Spectrum - Pitch Theories
The dependence of the pitch of periodic
complex tones (i.e. of complex tones with harmonic
spectra) on frequency, intensity, duration, and the introduction of
intense
'interference' tones is qualitatively similar to that of pure tones but more
complex, due to the variety of frequencies involved. The pitch JND for complex tones may be larger or smaller than that for pure tones, depending on spectral context. |
Dependence on Spectrum |
Reminder: Periodic/harmonic complex tones are usually perceived as a single unit, with their multiple spectral components merging into a single tone perception rather than being perceived as a set of individual pure tones. The pitch of periodic complex tones with
harmonic spectra (i.e. spectral components
with frequencies that are integer multiples of the frequency of the lowest
component, called 'fundamental')
matches (in general) the frequency of the fundamental component.
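The definition of a harmonic spectrum can be sketched numerically. The example below builds a harmonic series and recovers the frequency of which all components are integer multiples, using a greatest common divisor on whole-number frequencies. This is only an arithmetic stand-in with illustrative names; it is not a model of how the ear assigns pitch.

```python
from functools import reduce
from math import gcd


def harmonic_series(fundamental_hz, count):
    """Integer multiples of the fundamental: f, 2f, 3f, ..."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def implied_fundamental(components_hz):
    """Largest frequency of which all (whole-number) components are multiples."""
    return reduce(gcd, components_hz)


components = harmonic_series(220, 5)
print(components)                       # [220, 440, 660, 880, 1100]
print(implied_fundamental(components))  # 220
```

The same function returns 200 for `[600, 800, 1000]`, a spectrum in which the 200Hz component itself is absent. Whether listeners hear a pitch at that frequency is the virtual pitch question this module raises, and the arithmetic alone does not settle it.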
Frequency region most significant to
pitch
Perceiving a complex signal's individual components
|
Analytic / Synthetic Listening
In this example, (adapted from Smoorenburg, 1970), two complex tones with 2
components each are presented in succession.
In Dannenbring's (1974) demonstration (masking noise bursts filling tone gaps in a steady or frequency modulated tone - from our Hearing Module):
The McGurk effect is an auditory illusion caused by ambiguous audio/visual cues. It offers another example of synthetic listening, illustrating that perceptions are the result of an experience-guided synthesis of information from our environment.
|
Pitch of Complex Tones with Inharmonic Spectra
Inharmonic/non-periodic complex tones, whose frequency components are not integer multiples of a 'fundamental' component, do not elicit a clear, unique pitch sensation. They may elicit more than one competing pitch sensation, may resemble chords, or may sound like noise, depending on their spectral distribution. As illustrated below, however, changing the spectral peaks of inharmonic complex tones without changing the frequencies of their sinusoidal components can result in a distinguishable change in pitch, matching the frequency change of the spectral peaks. This highlights the perceptual salience of changing vs. static stimuli and provides additional evidence of the dependence of pitch on spectral distribution.
This effect can also be produced through spectral shaping of noise bands. Pitch perception will track changes in the spectral peak of the noise.
Approximate value of the virtual pitch of slightly inharmonic spectra, whose fundamental or additional low-frequency components are missing:
|
Pitch Theories (for more details, see the Optional section at the end of this and the Hearing modules) |
Place (Tonotopic) Theory
|
Temporal (Periodicity/Frequency) Theory
|
Hybrid Theories
|
The Octave - Multidimensionality of Pitch: Pitch Height & Pitch Chroma
The Octave (use this virtual piano to help you with the concepts in this section) |
Multidimensionality of Pitch (pitch spiral)
Pitch is multidimensional with at least three dimensions, involving a) pitch height (one dimension: frequency) and b) pitch chroma.
The western chromatic scale breaks the octave down into 12 different pitch chromas.
The pitch chroma dimensions represent a circularity in pitch perception.
The perceptual circularity of pitch is explored in Shepard-tone scales/slides (after Roger Shepard) that present the paradox of a continuously ascending (or descending) pitch. Listen to two pitch spiral examples (Houtsma et al., 1987). Shepard scales/slides are the auditory analog of the continuously ascending/descending staircases, explored conceptually by Penrose and artistically by Escher (see the images, below) |
|
Relativity (Escher, 1953) Ascending and Descending (Escher, 1960)
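One way to see why a Shepard scale can keep "ascending" is to look at the components of each tone. In the usual construction, every tone contains components spaced one octave apart, weighted by a fixed bell-shaped envelope over log-frequency. The sketch below follows that idea; the base frequency, number of octaves, and the Gaussian envelope are all illustrative assumptions, not values taken from the Houtsma et al. examples.

```python
import math


def shepard_components(step, base_hz=27.5, octaves=8,
                       center_octave=4.0, width=1.5):
    """Return sorted (frequency, amplitude) pairs for one Shepard tone.

    step is the position on the chromatic scale, in semitones.
    All other parameters are hypothetical choices for illustration.
    """
    components = []
    for k in range(octaves):
        # Position in octaves above base_hz, wrapped back into the fixed range
        position = (k + step / 12) % octaves
        freq = base_hz * 2 ** position
        # Fixed bell-shaped (here Gaussian) envelope over log-frequency
        amp = math.exp(-0.5 * ((position - center_octave) / width) ** 2)
        components.append((freq, amp))
    return sorted(components)


start = shepard_components(0)
after_octave = shepard_components(12)
# After 12 ascending semitone steps the component set is back where it started
print(start == after_order if (after_order := after_octave) else None)
```

Each step raises every component by a semitone, while components leaving the top of the range re-enter quietly at the bottom, so chroma advances without any net change in the overall spectrum after twelve steps.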
Short video on pitch and a/v related audio illusions |
The Place Theory of Pitch was proposed by Ohm (1843), developed by Helmholtz (1862), and confirmed experimentally by von Békésy (1950s), who won the Nobel prize in medicine (1961) for his contributions to the understanding of hearing. The theory's drawbacks, below, were being explored in parallel. | |
The Pitch-Shift Effects
The artificial spectra created in explorations of the two pitch shift effects are slightly inharmonic but not inharmonic enough for the pitch sensation to deteriorate.
The first pitch shift effect: Shifting all components of a harmonic complex tone by a value |Δƒ| results in a shift of the perceived pitch by a value |ΔP|, despite the fact that the frequency spacing between the components (and therefore the difference frequency |Δƒ| among successive components of the complex tone) remains unchanged (|Δƒ|=ƒ0).
The second pitch shift effect: For the harmonic complex tone to the right/bottom (continuous spectral lines) the perceived pitch P matches that of the difference frequency ƒ0. Increasing the spacing of the components while keeping the frequency of the center component n the same results in a drop in pitch P' < P, although the difference frequency has increased from |Δƒ|=ƒ0 to |Δƒ|=ƒ0+dƒ (broken lines).
Vassilakis (1998) showed that these two effects are not distinct but alternative manifestations of a single phenomenon. In a follow-up work, he argued that the pitch shift effect reflects our perceptual system's handling of the interaction between the phase and group velocities of the inharmonic tone complexes used in pitch shift experiments (Vassilakis, 1998b).
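The first pitch shift effect can be illustrated numerically. The sketch below shifts every component of a three-component harmonic complex by the same amount and confirms that the spacing is unchanged. It then applies a commonly used first-order approximation, in which the pitch is estimated as the center component divided by its harmonic number. That rule is an assumption introduced here for illustration; it is not given in this module, and measured pitch shifts deviate from it.

```python
def shifted_complex(f0_hz, harmonic_numbers, shift_hz):
    """Components of a harmonic complex after adding the same shift to each."""
    return [n * f0_hz + shift_hz for n in harmonic_numbers]


def spacings(components_hz):
    """Frequency differences between successive components."""
    return [b - a for a, b in zip(components_hz, components_hz[1:])]


def first_order_pitch(components_hz, harmonic_numbers):
    """Assumed first-order estimate: center component / its harmonic number."""
    mid = len(components_hz) // 2
    return components_hz[mid] / harmonic_numbers[mid]


harmonics = [9, 10, 11]                        # hypothetical example values
original = shifted_complex(200, harmonics, 0)  # [1800, 2000, 2200]
shifted = shifted_complex(200, harmonics, 30)  # [1830, 2030, 2230]

print(spacings(original), spacings(shifted))   # spacing stays at 200 Hz
print(first_order_pitch(original, harmonics))  # 200.0
print(first_order_pitch(shifted, harmonics))   # 203.0
```

The spacing, and therefore the difference frequency, stays at 200Hz in both cases, yet the estimate moves from 200Hz to 203Hz, which is the kind of discrepancy a purely difference-frequency account of pitch cannot explain.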
Unresolved Upper Harmonics
The place (or tonotopic) theory of pitch requires energy in the lowest 5-8 components (depending on fundamental frequency) in order to produce a clear pitch sensation. These low components are the components that are usually resolved best by the basilar membrane (i.e. they lie within separate critical bands) and can therefore provide clear place-related pitch information and/or produce strong intermodulation distortion products. The problem is that clear virtual pitch sensations persist even when the remaining components in a spectrum are not resolvable (see the 'unresolved upper harmonics' in the figure, left).
The above observations led to the first attempt at a temporal theory of pitch, called the "Residue" theory of pitch, developed first by Schouten (1938) and later by Walliser (1969). It states that pitch is determined by the temporal interaction, at a neural level, of the unresolved, 'residual' upper harmonics in a spectrum. This theory is challenged by the experimentally determined frequency region most salient to pitch (~400-1500Hz).
Temporal Coding: Phase Locking and Rectification The first systematic temporal
(periodicity) theory of pitch was
proposed by Seebeck (1843), developed by Rutherford (1886), |
Neural response (neural firing) follows (or appears to be locked to) the positive peaks in the stimulus, firing only when the stereocilia are sheared in one direction. As a result, the neural signals of sinusoidal inputs resemble half-wave rectified versions of those inputs.
The process of phase locking is closely related to hearing's "temporal coding theory" of encoding frequency information: The inner hair cells release neurotransmitters only when the basilar membrane moves in one direction, towards the scala media. They therefore only respond to the positive portions of incoming signals, with their stereocilia opening their ion channels when bending against the tectorial membrane (TM), mostly at a signal’s positive peak. Since individual neurons cannot fire fast enough to encode high frequencies, more than one neuron must be involved in the process. Each neuron fires at some of the peak portions of an incoming signal and, after adding the outputs of all neurons, the signal is represented to the brain in a manner similar to that shown in the bottom graph (left).
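The idea that several neurons share the work of encoding a high frequency can be sketched as follows. Each hypothetical neuron fires on every n-th positive peak of a sine signal, and pooling the spike times of all neurons recovers a spike at every peak. The strict rotation between neurons is an idealized assumption made for clarity; real neural firing is stochastic.

```python
def peak_times(freq_hz, duration_s):
    """Times (s) of the positive peaks of sin(2*pi*f*t), at t = (k + 1/4) / f."""
    period = 1.0 / freq_hz
    times = []
    k = 0
    while (k + 0.25) * period < duration_s:
        times.append((k + 0.25) * period)
        k += 1
    return times


def volley(freq_hz, duration_s, n_neurons):
    """Assign successive peaks to neurons in rotation (idealized assumption)."""
    peaks = peak_times(freq_hz, duration_s)
    return [peaks[i::n_neurons] for i in range(n_neurons)]


# A 2000 Hz tone lasting 10 ms, shared among 4 hypothetical neurons
trains = volley(freq_hz=2000.0, duration_s=0.01, n_neurons=4)
pooled = sorted(t for train in trains for t in train)

for i, train in enumerate(trains):
    print(f"neuron {i}: {len(train)} spikes")
print(f"pooled: {len(pooled)} spikes, one per positive peak")
```

In this example each neuron fires at a quarter of the stimulus rate, while the pooled output still marks every positive peak of the 2000Hz signal.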
Perception of pitch relations - Unit of pitch
Mel: Pitch-height unit and scale devised by S. S. Stevens (1937, 1940). Reference: 1000 mels = pitch of 1000Hz presented 40dB above threshold. The Mel unit of pitch height is analogous to the Sone unit of loudness. The derivation of the Mel scale has been criticized for flawed methodology and is not in use.
|
Further Resources
Concise and systematic historical review of pitch theories (Alain de Cheveigné, IRCAM, Paris, France - source).
|
Key to the 3-tone
listening example:
DD (the tones have frequencies 12025Hz,
12000Hz, and 11975Hz, in this order)
Loyola Marymount University - School of Film & Television