Introduction
Speaking is a highly complex human capacity: it involves not only a motor act but also the perception and monitoring of one’s own voice. The distinction between self-produced speech and the speech of others is proposed to be accomplished by a “motor-to-sensory discharge” (Paus et al., 1996) or an internal forward model (Ventura et al., 2009; Tian and Poeppel, 2010; Hickok, 2012). The idea of an internal forward model suggests that an efference copy (von Holst and Mittelstädt, 1950) of a motor act is generated that predicts its sensory consequences (Wolpert et al., 1995). The prediction prepares the respective cortical area to perceive the predicted sensory input. Consequently, brain activity directed to incoming sensation is suppressed (Chen et al., 2011).
Interestingly, the suppression effect has been reported in many vocalization studies. Speech production has been shown to elicit smaller event-related potentials (ERPs) or fields (ERFs) than passively perceived speech (Numminen and Curio, 1999; Numminen et al., 1999; Curio et al., 2000; Gunji et al., 2001; Ford et al., 2007; Ventura et al., 2009; Ott and Jäncke, 2013). Based on a non-human primate study (Müller-Preuss and Ploog, 1981; replication and extension of non-human primate investigations by Eliades and Wang, 2003), Creutzfeldt et al. (1989) recorded intracranial neuronal activity from the right and left superior, middle and inferior temporal gyri in patients undergoing surgery for epilepsy. Results revealed suppressed activity in response to vocalization. Relatedly, Chen et al. (2011) conducted an electrocorticography (ECoG) study during human vocalization. They reported neural phase synchrony in the gamma band between Broca’s area and the auditory cortex. This phase synchrony, which preceded the speaker’s speech onset, was greater during vocalization than during passive listening to the speaker’s own pre-recorded speech, indicating that phase synchrony in the gamma band between the two brain regions may reflect the transmission of a motor efference copy.
In a PET study, Hirano et al. (1996, 1997) found strong cerebellar activation for distorted self-produced speech (i.e., delayed or changed in pitch), possibly indicating a role for the cerebellum in generating internal forward predictions based on an efference copy. Along similar lines, we have shown that the cerebellum is involved in generating motor-to-auditory predictions when processing self-initiated sounds (Knolle et al., 2012, 2013a). We utilized...