Abstract
Some of the most important sounds humans hear, including speech and music, can be defined in part by their pitch. Acoustically, sounds said to have pitch have a regular rate of repetition in time (called the fundamental frequency, or f0) and contain overtones (‘harmonics’) that are integer multiples of the f0. Pitch is traditionally construed as the perceptual correlate of f0, and a longstanding goal of hearing research has been to determine how listeners estimate f0 from harmonic sounds. But the settings in which harmonic sounds have been studied are impoverished relative to the immense variety of situations in which we encounter such sounds, including natural auditory scenes containing music, speech, and noise. Consequently, our understanding of how the auditory system processes harmonic sounds remains limited. This dissertation examines the representations involved in hearing harmonic sounds, both when extracting pitch information and when segregating concurrent sounds. One method employed across studies is the comparison of task performance with harmonic sounds to that with inharmonic sounds, whose frequencies are inconsistent with any single f0. This comparison gives insight into the conditions where listeners rely on representations of the f0. In addition, we use a broad array of different tasks, individual differences approaches, and cross-cultural experiments. There are five main contributions of this dissertation:
1. Chapter 1: We first test humans on a large battery of tasks thought to depend on pitch, and find that performance on some tasks, such as discriminating musical intervals or recognizing voices, is impaired when sounds are inharmonic. But other tasks, such as judging the direction of a pitch change, are performed equally well with inharmonic sounds. Listeners appear to estimate pitch changes in these cases by tracking the frequency spectrum without estimating f0. This suggests that the classic view of pitch as f0 estimation is incomplete—at least two representations are involved, one of which does not involve the f0. Chapters 2–4 build from this initial finding, each testing a different hypothesis about when listeners might rely on f0-based pitch.
2. Chapter 2: We demonstrate that pitch discrimination is better for harmonic than for inharmonic stimuli when stimuli are separated in time, despite being comparably accurate for back-to-back sounds. Listeners appear to use the f0 as an efficient representation for memory, demonstrating a novel form of abstraction within hearing. We also substantiate that listeners have two distinct representations of pitch, comparing the frequency spectra for sounds nearby in time and the f0 for sounds separated in time.
3. Chapter 3: F0-based pitch is traditionally envisioned as being invariant to spectral shape (timbre). We demonstrate that while pitch judgments show some degree of invariance to spectral shape, the invariance observed for natural sounds like speech and music does not depend on representations of f0, being comparable for harmonic and inharmonic sounds.
4. Chapter 4: We demonstrate that harmonic frequency relations aid hearing in noise. We find that it is easier to detect and discriminate sounds in noise when they are harmonic rather than inharmonic. A noise-robust f0-based pitch signal from harmonic sounds like music and speech may help such sounds stand out in noisy backgrounds. This is a previously undocumented aspect of auditory scene analysis.
5. Chapters 5–6: In a parallel line of research, we examine representations of concurrent harmonic musical notes, focusing on ‘fusion’, whereby pairs of notes are misperceived as a single note. In Chapter 5 we survey different factors that might influence fusion in Western listeners. In Chapter 6, we compare fusion of note pairs and preferences for note pairs between listeners in the US and the Tsimane’, an indigenous population of hunter agriculturalists living in the Bolivian Amazon who have limited exposure to Western culture and music. We find cross-cultural similarity in the tendency to fuse canonically ‘consonant’ intervals, despite differences in preferences for ‘consonant’ intervals across cultures. This result suggests universal perceptual mechanisms that could contribute to cross-cultural regularities in musical systems, but shows that these regularities do not directly determine aesthetic associations, which appear to be culturally determined.
This work provides evidence for two distinct representations underlying pitch perception and reveals several constraints that determine when each is used. More broadly, this dissertation shows how representations of harmonic sounds are critical to auditory perception and cognition, and opens the door to a better understanding of auditory memory, the relationship between pitch in speech and music, and the role of experience in shaping sensory perception.
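The distinction between harmonic and inharmonic sounds used throughout can be illustrated with a minimal sketch. It is not the stimulus-generation procedure of the dissertation, which the abstract does not describe. The function names, the 200 Hz f0, the random jitter scheme, and the `max_jitter` parameter are all assumptions made for illustration. The deviation measure checks components only against the nominal f0, not against every possible f0.

```python
import math
import random


def harmonic_frequencies(f0, n_partials):
    """Component frequencies at integer multiples of f0 (harmonic)."""
    return [f0 * k for k in range(1, n_partials + 1)]


def inharmonic_frequencies(f0, n_partials, max_jitter=0.3, seed=0):
    """Illustrative inharmonic complex (assumed scheme, not the source's).

    Each component above the lowest is shifted by a random fraction of f0,
    so the set is no longer integer multiples of the nominal f0.
    """
    rng = random.Random(seed)
    freqs = [f0]  # lowest component is left in place
    for k in range(2, n_partials + 1):
        jitter = rng.uniform(-max_jitter, max_jitter)
        freqs.append(f0 * (k + jitter))
    return freqs


def max_deviation_from_harmonic(freqs, f0):
    """Largest distance (in units of f0) from the nearest multiple of f0."""
    return max(abs(f / f0 - round(f / f0)) for f in freqs)


def synthesize(freqs, duration=0.1, sample_rate=16000):
    """Sum equal-amplitude sinusoids at the given frequencies."""
    n_samples = int(duration * sample_rate)
    return [
        sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs) / len(freqs)
        for t in range(n_samples)
    ]


# Hypothetical example values: a 200 Hz f0 with five components.
harmonic = harmonic_frequencies(200.0, 5)
inharmonic = inharmonic_frequencies(200.0, 5)
harmonic_wave = synthesize(harmonic)
inharmonic_wave = synthesize(inharmonic)
```

In this sketch the harmonic set has zero deviation from multiples of the nominal f0, while the jittered set does not, which is the property that separates the two stimulus classes.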