Abstract The number and type of connections involving different levels of orthographic and phonological representations differentiate between several models of spoken and visual word recognition. At the sublexical level of processing, Borowsky, Owen, and Fonos (1999) demonstrated evidence for direct processing connections from grapheme representations to phoneme representations (i.e., a sensitivity effect) over and above any bias effects, but not in the reverse direction. Neural network models of visual word recognition implement an orthography-to-phonology processing route that involves the same connections for processing sublexical and lexical information, and thus a similar pattern of cross-modal effects for lexical stimuli is expected by models that implement this single type of connection (i.e., orthographic lexical processing should directly affect phonological lexical processing, but not in the reverse direction). Furthermore, several models of spoken word perception predict that there should be no direct connections between orthographic representations and phonological representations, regardless of whether the connections are sublexical or lexical. The present experiments examined these predictions by measuring the influence of a cross-modal word context on word target discrimination. The results provide constraints on the types of connections that can exist between orthographic lexical representations and phonological lexical representations.
The identification of spoken and written words involves integration of the target stimulus and relevant contextual sources of information from the environment. It has been demonstrated that listeners integrate both auditory and visual sources of information during auditory perception. The classic "McGurk effect" (e.g., MacDonald & McGurk, 1978; McGurk & MacDonald, 1976) illustrates that when listeners are presented with an auditory stimulus (e.g., /ba-ba/) that does not match visually presented vocal gestures (e.g., mouth movements for /ga-ga/), the auditory and visual information are integrated during auditory perception (e.g., the listener hears "da-da"). People are often presented with concurrent spoken and printed stimuli that are not necessarily congruent. For example, students are often required to attend to text printed on overheads while also attending to a lecturer's spoken words, and parents often read stories to their children while their children follow the printed words. Thus, how concurrent visual and auditory stimuli are integrated has been an important issue for models of language processing (e.g., Borowsky, Owen, & Fonos, 1999; Fowler & Deckle, 1991; Frost & Katz, 1989; MacDonald & McGurk, 1978; Massaro, Cohen, & Thompson, 1988; McGurk & MacDonald, 1976).
Models of visual word recognition differ in the number and type of nonsemantic connections between orthographic and phonological representations. These connections serve as a processing route for these processing subsystems to communicate with one another. The types of communication from one subsystem to another may be unidirectional or bidirectional. For example, the dual-route cascaded model has unidirectional connections that map graphemes onto phonemes, and bidirectional connections that map orthographic lexical representations onto phonological lexical representations (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). In contrast, the neural network models of Seidenberg and colleagues (Harm & Seidenberg, 1999; Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg & McClelland, 1989) implement a single nonsemantic route with one set of fully recurrent connections between orthographic and phonological units to handle both sublexical and lexical levels of translation from print to sound, and thus they are often referred to as "single-route" models. Neural network models typically group together the orthographic levels of representation (e.g., orthographic features, graphemes, and orthographic lexical representations), and similarly group together the phonological levels of representation (e.g., phonetic features, phonemes, and phonological lexical representations).
Figure 1. A modified framework for comparing dual- and single-route models of visual word recognition (see Borowsky et al., 1999): A. Dual-route model, B. Single-route model. Connections that have been corroborated by experiments are shown in bold. PD = phonetic decoding (i.e., sublexical, assembled phonology) route, SV = sight vocabulary (i.e., lexical, addressed phonology) route.
Figure 1 illustrates these differences and provides a framework for comparing dual- and single-route models of visual word recognition, including the types of connections for communicating between processing subsystems that are corroborated in the present experiments.
As illustrated in Figure 1A, dual-route models process printed words by first analyzing the printed words into orthographic features (e.g., curves, lines, angles), which have bidirectional connections with the graphemic level of representation (e.g., b). Graphemic information can follow one of two processing routes, hence the name dual-route models. The sublexical (i.e., phonetic decoding) processing route maps graphemes onto phonemes. Once the phonemes have been assembled and synthesized, they can be used to produce speech output. Assembled phonology can also be checked against stored phonological lexical representations (e.g., Borowsky, Owen, & Masson, 2002, discuss several criteria for maximizing phonological lexical access when forced to rely on assembled phonology). Alternatively, the graphemes can be synthesized and mapped onto complete orthographic lexical representations. To produce spoken output via this route, orthographic lexical representations are then mapped directly onto phonological lexical representations (i.e., sight vocabulary). Coltheart et al. (2001) assumed that the set of connections from the orthographic lexical level to the phonological lexical level are bidirectional. It should be noted that both the orthographic lexical and phonological lexical representations may also be influenced by connections with the semantic system.
Within this framework, speech input could be analyzed into phonetic features (e.g., pitch, frequency) that are connected to a phonemic level of representation. Again, the phonemes can be assembled to produce speech output or to activate phonological lexical representations. The phonological lexical representations may be used to produce speech or to activate orthographic lexical representations via bidirectional connections to orthographic representations.
As illustrated in Figure 1B, single-route models process printed words by analyzing the printed words into orthographic representations (e.g., Wickelfeatures; Seidenberg & McClelland, 1989). The orthographic representations are mapped onto corresponding phonological representations via a single set of connections between the orthographic level and the phonological level of representation. Speech input is analyzed into phonological representations; however, they are not considered to be represented separately at the level of features, phonemes, and words as in the dual-route class of models.
With respect to models of spoken word recognition, there are two models that are notable in their claim that there are no direct connections between orthographic and phonological lexical representations: Fowler and Deckle's (1991) Direct Realist Theory and Massaro et al.'s (1988) Fuzzy Logical Model of Perception. In particular, the Direct Realist Theory (developed from Liberman & Mattingly's, 1985, Motor Theory) states that orthographic processing will not influence phonological processing because orthography does not emanate from the same common causal source as speech (i.e., vocal tract gestures). Accordingly, it predicts that there should be no influence of orthographic lexical processing on phonological lexical processing. Massaro et al.'s (1988; Massaro & Cohen, 1995) Fuzzy Logical Model of Perception consists of three operations involved in speech perception, those of feature evaluation, feature integration, and decision. The feature evaluation of the orthographic information is assumed to be independent of the phonological information. As such, Massaro's (1989, pp. 402, 404; see also Massaro et al., 1988) Fuzzy Logical Model of Perception clearly predicts that cross-modal orthographic and phonological processing effects would be limited to simple bias effects (described below). In contrast, models that describe an interactive activation framework (McClelland & Rumelhart, 1981) as the means of connecting orthographic and phonological representations clearly do implement direct connections between these subsystems (e.g., Jacobs, Rey, Ziegler, & Grainger, 1998; see also the lexical connections in Coltheart et al.'s, 2001, model).
The present research examines the nature of the connections between orthographic lexical and phonological lexical representations by utilizing a recent variant of the two-alternative, forced-choice (2AFC) paradigm (Borowsky et al., 1999; Ratcliff & McKoon, 1997). The experiments reported here involved presenting a "context" stimulus (e.g., saw: cap) simultaneously with a target stimulus in a different modality that was congruent (e.g., heard: /cap/), incongruent, or irrelevant to the context (i.e., a baseline). In the congruent condition, the visual context matched the auditory target stimulus, and was followed by a response probe that included the target and another alternative (see Table 1). In the incongruent and irrelevant conditions, the context and target items did not match. For these two conditions, the 2AFC probe presented to the participant determined the distinction between the incongruent and irrelevant conditions. For example, in the incongruent condition, the participant may have seen the visual context rap simultaneously with the auditory target /cap/, followed by the visual 2AFC probe heard /cap/ or heard /rap/ (i.e., the misleading context and the correct target). In the irrelevant (i.e., neutral) condition, the participant may have also seen the visual context map simultaneously with the auditory target /cap/; however, the visual 2AFC probe contained heard /rap/ or heard /cap/ (i.e., a nonpresented item and the correct target). Forced-choice accuracy is assumed to be related to the difference between the activation of the target and the foil. Thus, if the foil has little activation relative to the target, accuracy should be high, and if the foil and the target are equally active, accuracy should be at chance.
Figure 2. Hypothetical effects of the context stimulus upon target discrimination. Bias effects (A) produce equal benefits (congruent minus irrelevant accuracy scores) and costs (irrelevant minus incongruent accuracy scores). Sensitivity effects produce a significant difference between benefits and costs. Sensitivity effects may illustrate (B) greater benefits than costs, or (C) greater costs than benefits.
This 2AFC paradigm can be used to distinguish bias from sensitivity effects. Bias effects occur when the context benefits accurate target discriminations in the congruent condition to the same degree as the context costs target discrimination performance in the incongruent condition (see Figure 2A). Sensitivity (or encoding/activation) effects occur when there is a significant difference between the benefits and costs conveyed by the context in the congruent and incongruent conditions, respectively (see Figures 2B and 2C; see also Massaro, 1989; Masson & Borowsky, 1998; Paap, Johansen, Chun, & Vonnahme, 2000; Ratcliff & McKoon, 1997).
Interpreting Bias Effects
If the context stimulus simply serves to bias a participant's willingness to choose a response probe alternative, then the difference between the congruent and the irrelevant conditions would equal the difference between the irrelevant and the incongruent conditions (i.e., the context provides equal benefits and costs; see Borowsky et al., 1999). Ratcliff and McKoon (1997) had proposed that a symmetrical effect of the context upon target discriminations may be interpreted as simple bias (i.e., the participant's selection of a probe stimulus is influenced by the context stimulus if the context stimulus is included in the response probe). Thus, a symmetrical effect of the context on target discriminations in the present research will be interpreted as indiscriminable from a bias effect.
Sensitivity Effects
If the context modality differentially affects congruent and incongruent target discriminations, the effect of the context on target discriminations will deviate from a symmetrical effect of the context on 2AFC accuracy. Borowsky et al. (1999) had proposed that asymmetrical effects of the context upon target discriminations are more definitive than bias effects in informing word recognition modelers about the nature of the connection between the context and target modalities. Asymmetrical effects are evidence for a sensitivity effect over and above any bias (i.e., symmetrical) effects. In the current paradigm, a sensitivity effect indicates that the context modality differentially affects target modality discriminations. One way (but certainly not the only way) that such sensitivity effects could occur is if the context modality has direct connections to the target modality so that the context modality activation can influence the target modality discrimination. If there is a sensitivity effect for only one type of discrimination task (i.e., lexical orthographic or phonological in the present study), then this pattern of results would be consistent with the claim that there are directionally weighted connections from a context modality to a target modality. However, if sensitivity effects are found to occur in both directions (i.e., regardless of which modality serves as context or as target), then this pattern of results would be consistent with the claim that there are connections in both directions between the context modality and target modality processing subsystems.
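The benefits/costs logic above can be made concrete with a short sketch. The accuracy values and the symmetry tolerance below are hypothetical illustrations, not data from the present experiments; in practice, the benefits-minus-costs difference is evaluated with a statistical test rather than a fixed cutoff.

```python
# Classify a context effect as bias-like or sensitivity-like from mean
# 2AFC accuracy (percent correct) in the three congruency conditions.
# All numeric values here are hypothetical, for illustration only.

def context_effect(congruent, irrelevant, incongruent, tolerance=1.0):
    """Return (benefits, costs, label) for a context-effect pattern.

    benefits = congruent - irrelevant accuracy
    costs    = irrelevant - incongruent accuracy
    A (near-)symmetrical pattern (benefits ~ costs) is indistinguishable
    from simple bias; an asymmetry indicates a sensitivity effect.
    """
    benefits = congruent - irrelevant
    costs = irrelevant - incongruent
    if abs(benefits - costs) <= tolerance:
        label = "bias (symmetrical)"
    elif benefits > costs:
        label = "sensitivity (benefits > costs)"
    else:
        label = "sensitivity (costs > benefits)"
    return benefits, costs, label

# A pattern in which benefits (18.5) exceed costs (14.5):
print(context_effect(85.0, 66.5, 52.0))
```

The function simply encodes the decision rule illustrated in Figure 2; the direction of the asymmetry (2B vs. 2C) falls out of the comparison between benefits and costs.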
Borowsky et al. (1999) have previously used this logic to investigate the nature of the connections between sublexical orthographic (i.e., grapheme) and sublexical phonological (i.e., phoneme) processing systems. Extending Ratcliff and McKoon's (1997) 2AFC paradigm for assessing prime sensitivity effects, Borowsky et al. presented participants with three congruency conditions. A sublexical target stimulus (e.g., spoken /ta/) was presented simultaneously with a context stimulus from a different modality that was congruent (e.g., printed ta, with probes "heard ta" and "heard da" or "heard ta" and "heard na"), irrelevant (e.g., printed na, with probes "heard ta" and "heard da," or printed da, with probes "heard ta" and "heard na"), or incongruent (e.g., printed na, with probes "heard ta" and "heard da," or printed na, with probes "heard ta" and "heard na") to the target. For the phoneme discrimination experiments, a grapheme provided the context and the phoneme was considered the target, whereas in the grapheme discrimination experiments, a phoneme provided the context and the grapheme was considered the target.
For the phoneme discrimination experiments, Borowsky et al. (1999) showed that grapheme contexts had an asymmetrical effect on target phoneme discrimination. In particular, the benefits of the context grapheme exceeded the costs. For the grapheme discrimination experiments, they showed that the phoneme context had a symmetrical effect on congruent and incongruent condition performance compared to the irrelevant baseline condition. Borowsky et al. interpreted these findings to suggest that there is evidence for direct connections from the system that represents graphemes to the system that represents phonemes, and no evidence for direct connections in the opposite direction (i.e., a simple bias interpretation).
The models of Seidenberg and colleagues that implement a single-route with only one set of connections between orthographic and phonological representations must predict that the same pattern of results that were observed for sublexical stimuli (e.g., graphemes, phonemes) will be obtained with lexical stimuli (i.e., words). Because the Borowsky et al. (1999) study demonstrated evidence for direct connections from sublexical orthographic processing to sublexical phonological processing, single-route models must predict that orthographic lexical context processing will directly affect phonological lexical target discrimination accuracy. Furthermore, given the symmetrical (i.e., simple bias) effects of a sublexical phonological context on sublexical orthographic target processing, single-route models must predict a symmetrical effect of phonological lexical contexts on orthographic lexical target discrimination accuracy.
As dual-route models (e.g., Coltheart et al., 2001; Zorzi, Houghton, & Butterworth, 1998) have two sets of (nonsemantic) connections at the lexical and sublexical levels of representation, these models do not have to predict that the same pattern of results would be obtained for lexical and sublexical stimuli. In fact, Coltheart et al.'s (2001) version of the dual-route model utilizes unidirectional connections from orthographic sublexical representations to phonological sublexical representations (i.e., graphemes to phonemes), and bidirectional connections at the lexical representational level. Evidence for such bidirectional connections at the lexical level of representation would be obtained if both lexical contexts produce asymmetrical effects on cross-modal lexical target discrimination (i.e., sensitivity effects in both directions).
It is noteworthy that any evidence of a sensitivity effect of lexical context on cross-modal lexical target discrimination in the present experiments will serve to falsify the models that claim that there is no direct connection between orthographic and phonological lexical processing systems (i.e., the speech perception models of Fowler & Deckle, 1991, and Massaro, 1989). Although some models of word recognition do subscribe to the notion of direct connections between orthographic lexical and phonological lexical representations (e.g., Coltheart et al., 2001; Jacobs et al., 1998), the default assumption in such interactive activation frameworks is that the connections exist for both directions. If evidence is obtained in the present study for a sensitivity effect that is greater in one direction than the other, then these models will have to be modified.
Kay, Lesser, and Coltheart (1996) stated that little is known about the nature of the connections between processing subsystems. Given that the nature of connections between processing subsystems postulated by models of word recognition differ between several models, it is important to examine the nature of these connections. The current experiments sought to empirically determine the nature of the connections at the orthographic and phonological lexical level in order to inform models of visual and spoken word recognition.
Experiments 1 and 2
Experiments 1 and 2 investigated the influence of an orthographic lexical context upon spoken word discrimination. Experiment 1 was designed to be a relatively difficult spoken word discrimination task, whereas in Experiment 2 the spoken word discrimination was made easier by increasing the audibility of the spoken word targets. Experiment 2 served to evaluate whether the pattern of results would change as a function of location on the accuracy scale, which might implicate a scaling artefact.
Method
Participants. Thirty-two University of Saskatchewan students participated in Experiment 1 for partial credit in an introductory psychology class, and another 24 students were paid $5 for participating in Experiment 2. All reported English as their first language and normal (or corrected-to-normal) vision.
Apparatus. An IBM-compatible computer with Micro-Experimental Laboratories (MEL) software controlled the timing of events and recording of the data. Orthographic stimuli were presented in white on a black background using a NEC colour monitor (Model JC-15W1VMA). A pair of Altec Lansing ACS5 speakers, placed on either side of the monitor, was used to present the auditory stimuli via a Creative Labs Sound Blaster-compatible 16-bit audio card. The "1" and "2" keys on the numeric keypad were used to collect participants' responses.
Materials and design. Five three-letter word triplets were used for the set of experiments reported here. Within each triplet set, the items were matched for rhyme and whether the initial letter was an ascender, descender, or x-height. Creative WaveStudio (Version 2) was used to record the spoken words (spoken by a male). Each triplet was constructed such that each initial onset was added to the same rhyme. All spoken stimuli were recorded in 16-bit mono, at a sampling frequency of 22 kHz, and were 500 ms in duration. Each spoken stimulus was presented simultaneously with white-noise (the MEL white-noise level was set to 88% maximum output for Experiment 1, and reduced to 86% maximum output for Experiment 2). MEL code specification for the white-noise output was AUDIO_SET_VOLUME(4, 0, 88) and AUDIO_SET_VOLUME(4, 0, 86) for Experiments 1 and 2, respectively.
Three congruency conditions were created based on the match of the orthographic lexical context to the spoken word target and the response probe (see Table 1). The orthographic stimulus was presented simultaneously with the spoken word target and was congruent, incongruent, or irrelevant to the target. In the congruent condition, the orthographic context matched the spoken word target, and the visually presented response probe for this condition contained the target and one of the other two stimuli from the same triplet set (e.g., orthographic context cap and spoken word target /cap/, probed with heard cap or heard rap). In the irrelevant condition, the orthographic context did not match the spoken word target, and the visually presented response probe contained the target and the irrelevant remaining stimulus from the triplet set (e.g., orthographic context map and spoken word target /cap/, probed with heard cap or heard rap). In the incongruent condition, the orthographic context did not match the spoken word target, and the visually presented response probe contained both the context and target stimuli (e.g., orthographic context rap and spoken word target /cap/, probed with heard cap or heard rap). The three spoken words from each triplet and corresponding orthographic stimuli appeared in each of the congruent, incongruent, and irrelevant conditions equally often, and the correct alternative of the response probe appeared equally often on the right- or left-hand side, creating 36 trial conditions per triplet set. The experiment consisted of 15 practice trials, followed by two continuous blocks of 180 randomized trial conditions for a total of 360 experimental trials.
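As a check on the counterbalancing arithmetic, the 36 trial conditions per triplet set can be enumerated as follows. This is a reconstruction of the stated design for illustration only; the function and variable names are ours, not the authors' original MEL code.

```python
# Enumerate the trial conditions for one triplet set (e.g., cap/map/rap):
# 3 targets x (2 congruent + 2 irrelevant + 2 incongruent variants)
# x 2 probe sides = 36 conditions.
TRIPLET = ("cap", "map", "rap")

def trial_conditions(triplet):
    conditions = []
    for target in triplet:
        others = [w for w in triplet if w != target]
        # Congruent: context matches the target; the foil is either
        # of the other two triplet members.
        for foil in others:
            conditions.append(("congruent", target, target, foil))
        # Irrelevant: context is a non-target word; the foil is the
        # remaining, non-presented triplet member.
        for context in others:
            foil = next(w for w in others if w != context)
            conditions.append(("irrelevant", context, target, foil))
        # Incongruent: context is a non-target word and also serves
        # as the foil in the probe.
        for context in others:
            conditions.append(("incongruent", context, target, context))
    # The correct alternative appears equally often on left and right.
    return [(cond, ctx, tgt, foil, side)
            for (cond, ctx, tgt, foil) in conditions
            for side in ("left", "right")]

print(len(trial_conditions(TRIPLET)))  # 36
```

Each congruency condition contributes exactly one third of the conditions, consistent with the statement that the triplet members appeared in each condition equally often.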
Procedure. Participants were instructed, both verbally and in writing, that they would see a printed word (e.g., cap, map, or rap) in the middle of the computer screen and, at the same time, they would hear a spoken word presented in white-noise. They were told to pay attention to both what they saw and what they heard (and that sometimes the two would match, sometimes not), but to respond to what they heard, selecting from a two-alternative response, as quickly and accurately as possible, with an emphasis placed upon accuracy of responding. If the participant was unsure of what they heard, they were told to guess. The sequence of events was: 1) a fixation mark appeared in the centre of the screen, 2) the participant pressed the space-bar to initiate each trial, 3) after a 100-ms interstimulus interval (ISI), a clearly visible orthographic stimulus appeared in the centre of the screen simultaneously with the degraded spoken word target, both for a total of 500 ms, and 4) after a 100-ms ISI, a two-alternative response probe was presented visually, in bright text, a couple of lines below where the context orthographic stimulus was presented (e.g., heard cap [press 1], heard rap [press 2]). The procedure was approximately 25 minutes in duration, during which time the experimenter remained in the laboratory.
TABLE 2. Mean Difference Between Benefits and Costs (in Percent), and the 95% Confidence Intervals (Based Upon One-Sample t-Tests) as a Function of Discrimination Task
Results
Experiment 1. Overall mean response accuracy for the congruent, irrelevant, and incongruent conditions is presented in Figure 3A. A repeated measures analysis of variance (ANOVA) of condition (congruent, irrelevant, and incongruent) on accuracy was significant, F(2,62) = 68.85, MSE = 126.92, p < .001. Dependent t-tests showed that the mean accuracy for the congruent condition was significantly greater than that for the irrelevant condition, t(31) = 8.71, SE = 2.12, p < .001, and the irrelevant condition mean accuracy was significantly greater than the incongruent condition mean accuracy, t(31) = 6.77, SE = 2.14, p < .001. The test of the difference of the congruent condition mean accuracy minus the irrelevant condition mean accuracy (18.5%) and the irrelevant condition mean accuracy minus the incongruent mean accuracy (14.5%) was significant, t(31) = 2.17, SE = 1.85, p < .05.
Experiment 2. Overall mean response accuracy for the congruent, irrelevant, and incongruent conditions is presented in Figure 3B. A repeated measures ANOVA of condition on accuracy was significant, F(2,46) = 59.65, MSE = 76.59, p < .001. Dependent t-tests showed that the mean accuracy for the congruent condition was significantly greater than that for the irrelevant condition, t(23) = 8.80, SE = 1.82, p < .001, and the irrelevant condition mean accuracy was significantly greater than the incongruent condition mean accuracy, t(23) = 5.47, SE = 2.09, p < .001. The test of the difference of the congruent condition mean accuracy minus the irrelevant condition mean accuracy (16.0%) and the irrelevant condition mean accuracy minus the incongruent mean accuracy (11.5%) was significant, t(23) = 2.33, SE = 1.99, p < .05.
Experiment 2 was conducted to determine if increasing the response accuracy level would alter the sensitivity effect found in Experiment 1. A one-tailed independent samples t-test was conducted to determine if the baseline (i.e., irrelevant) condition in Experiment 2 was significantly greater than that observed in Experiment 1. The difference between the baseline conditions (3%) was significant, t(54) = 1.77, SE = 1.47, p < .05. To determine if the pattern of results differed between Experiments 1 and 2, an ANOVA of condition by experiment was conducted on the accuracy data. There was a main effect of experiment, F(1,54) = 5.73, MSE = 55.74, p < .05, and of condition, F(2,108) = 119.52, MSE = 105.48, p < .001. There was no interaction between experiment and condition in the repeated measures ANOVA (Fs < 1.00), and thus the Experiment 1 and 2 accuracy data were combined. A one-sample t-test comparing the mean accuracy difference between benefits (i.e., the difference between the congruent and irrelevant conditions) and costs (i.e., the difference between the irrelevant and incongruent conditions) to a mean of zero was conducted. This difference score was significantly greater than zero, t(55) = 3.17, SE = 1.35, p < .01, and the confidence intervals based upon this t-test did not include zero (see Table 2). This sensitivity effect was also supported by a significant quadratic trend among the condition means, F(1,55) = 10.07, MSE = 16.90, p < .05.
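The critical sensitivity test is a one-sample t-test of each participant's benefits-minus-costs difference score against zero. A minimal sketch of that computation follows; the difference scores below are hypothetical, for illustration only, and do not reproduce the reported statistics.

```python
import math

def one_sample_t(scores, mu=0.0):
    """One-sample t statistic for mean(scores) against mu.

    Here `scores` would be the per-participant difference scores:
    benefits (congruent - irrelevant) minus costs (irrelevant -
    incongruent), in percent correct. Returns (t, df).
    """
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance (n - 1 in the denominator) and standard error.
    var = sum((x - mean) ** 2 for x in scores) / (n - 1)
    se = math.sqrt(var / n)
    return (mean - mu) / se, n - 1

# Hypothetical benefits-minus-costs difference scores (percent):
diffs = [6, 2, 8, -1, 5, 4, 7, 3, 1, 6, 5, 2]
t, df = one_sample_t(diffs)
print(f"t({df}) = {t:.2f}")
```

A difference score reliably above zero indicates that benefits exceed costs, i.e., a sensitivity effect over and above any bias; the 95% confidence interval reported in Table 2 is constructed from the same mean and standard error.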
Discussion
Experiments 1 and 2 provided evidence for a direct connection (i.e., a sensitivity effect) from orthographic lexical representations to phonological lexical representations. As the same pattern held for both levels of phonological discriminability, this sensitivity effect was not compromised by a scaling artefact on overall accuracy, nor any form of additional bias due to the discriminability of the target. Scaling accounts include those that suggest the nonlinear function observed in Experiment 1 was due to some type of floor effect that limits poor performance in the incongruent condition. However, given that there were main effects of context condition and experiment, but no interaction, it is difficult to argue that the pattern of results in Experiments 1 and 2 is simply due to such scaling effects. Similarly, any account that claims that the amount of bias contributed by the context is variable, and that bias thus depends on the discriminability of the target, would be supported if the slope of the performance function changed when there was a significant change in discriminability (i.e., an interaction between condition and experiment; see also Borowsky et al., 1999). Nonetheless, the magnitude of the increase in discriminability from Experiment 1 to Experiment 2 was small (i.e., 3%) and the performance functions could conceivably change with a larger effect. Thus, the results of Experiments 1 and 2, which demonstrated that orthographic contexts benefit congruent condition accuracy more than they cost incongruent condition accuracy, are concordant with the idea that there is a set of direct connections from the orthographic lexical level of representation to the phonological lexical level of representation. We now turn to Experiments 3 and 4, which examine the reverse of the effect examined in Experiments 1 and 2: whether there is an influence of phonological lexical contexts on orthographic lexical discriminations.
Figure 3. Mean spoken word discrimination accuracy (in percent) as a function of orthographic and phonological lexical congruency for: (A) Experiment 1, and (B) Experiment 2. Confidence intervals were calculated using the formula for a within-subjects design as outlined in Loftus and Masson (1994).
Experiments 3 and 4
Experiments 3 and 4 investigated the influence of a spoken-word context upon orthographic word discrimination. Experiment 3 was designed to be a relatively difficult orthographic word discrimination task, whereas in Experiment 4 the orthographic word discrimination was made easier by increasing the visibility of the orthographic word targets. Experiment 4 served to evaluate whether the pattern of results would change as a function of location on the accuracy scale.
Method
Participants. Thirty-two University of Saskatchewan students participated in Experiment 3 for partial credit in an Introductory Psychology class, while 24 different students participated in Experiment 4. All reported English as their first language and normal (or corrected-to-normal) vision.
Apparatus. The same apparatus as in the previous experiments was used.
Materials and design. The same materials and design as in the previous experiments were used for Experiments 3 and 4. The only differences were that clearly audible words (i.e., without any white-noise) now provided the context, and the orthographic words were degraded by contrast reduction and presented as targets. The MEL code specifications for the contributions of red, green, and blue to the dark gray were SET_PALETTE_VGA(8,5,5,6) and SET_PALETTE_VGA(8,6,6,6) in Experiments 3 and 4, respectively. Although contrast reduction is arguably different from the addition of white-noise used in Experiments 1 and 2, Borowsky and Besner (1991, 1993) have shown that contrast reduction is suitable for demonstrating both facilitation and inhibition priming effects in the lexical decision task.
Procedure. The procedure was similar to that in Experiments 1 and 2 except that participants were to discriminate between target orthographic words. In order to obtain similar mean response accuracy for the baseline (i.e., irrelevant) conditions in the orthographic discrimination tasks as we had observed for the same condition in the spoken word discrimination tasks, the visually degraded orthographic presentation was reduced to 150 ms.
The procedure was similar to that of Experiments 1 and 2, except participants were instructed to respond to what they saw. The sequence of events was: 1) a fixation mark appeared in the centre of the screen, 2) the participant pressed the space-bar to initiate each trial, 3) after a 100-ms ISI, a degraded orthographic target appeared in the centre of the screen for 150 ms during the simultaneous presentation of a clearly audible spoken-word context for 500 ms, and 4) after a 100-ms ISI, a two-alternative response probe was presented visually, in bright text, a couple of lines below where the target orthographic stimulus was presented (e.g., saw cap [press 1], saw rap [press 2]). The procedure was approximately 35 minutes in duration, during which time the experimenter remained in the laboratory.
Results
Experiment 3. Overall mean response accuracy for the congruent, irrelevant, and incongruent conditions is presented in Figure 4A. A repeated measures ANOVA of condition (congruent, irrelevant, and incongruent) on accuracy was significant, F(2,62) = 40.72, MSE = 249.09, p < .001. Dependent t-tests showed that the mean accuracy for the congruent condition was significantly greater than that for the irrelevant condition, t(31) = 6.01, SE = 2.97, p < .01, and the irrelevant condition mean accuracy was significantly greater than the incongruent condition mean accuracy, t(31) = 6.29, SE = 2.82, p < .001. The test of the difference of benefits (i.e., the congruent condition mean accuracy minus the irrelevant condition mean accuracy, 17.5%) and costs (i.e., the irrelevant condition mean accuracy minus the incongruent condition mean accuracy, 17.5%) was not significant, t(31) = 0.04, SE = 1.91, p = .968.
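The benefit and cost measures used throughout these analyses are simple differences among the three condition means. A sketch of the arithmetic, using hypothetical condition means chosen so that the differences match the 17.5% values reported above (the article does not report these raw means):

```python
# Hypothetical condition means (percent correct); only the differences
# below are taken from the article, not these raw values.
congruent, irrelevant, incongruent = 80.0, 62.5, 45.0

benefit = congruent - irrelevant    # gain from a congruent context
cost = irrelevant - incongruent     # loss from an incongruent context

# A purely bias-driven (symmetrical) context effect predicts
# benefit == cost; an asymmetry (benefit != cost) is the signature
# of a sensitivity effect.
asymmetry = benefit - cost
```

In Experiment 3 the benefit (17.5%) and cost (17.5%) were equal, so the asymmetry score is zero, which is the symmetrical pattern expected from a simple response bias.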
Experiment 4. Overall mean response accuracy for the congruent, irrelevant, and incongruent conditions is presented in Figure 4B. A repeated measures ANOVA of condition on accuracy was significant, F(2,46) = 16.59, MSE = 124.29, p < .001. Dependent t-tests showed that the mean accuracy for the congruent condition was significantly greater than that for the irrelevant condition, t(23) = 3.39, SE = 2.46, p < .01, and the irrelevant condition mean accuracy was significantly greater than the incongruent condition mean accuracy, t(23) = 4.23, SE = 2.40, p < .01. Again the test of the difference of benefits (8.5%) versus costs (10.0%) was not significant, t(23) = -0.85, SE = 2.09, p = .405.
Since the purpose of Experiment 4 was to determine if an increase in response accuracy would alter the symmetrical effect found in Experiment 3, a one-tailed independent samples t-test was conducted to confirm that the response accuracy for the baseline (i.e., irrelevant) condition in Experiment 4 was significantly greater than that observed for Experiment 3. There was a significant difference between the baseline conditions for the two experiments, t(28.1) = 5.56, SE = 2.86, p < .001. To determine if the pattern of results differed between Experiments 3 and 4, the test of the quadratic trend was conducted. The quadratic trend, which is equivalent to comparing the benefit and cost effects, did not indicate any interaction between experiment and condition, F(1,54) = 0.42, MSE = 18.58, p = .519, and thus the Experiments 3 and 4 accuracy data were combined. A one-sample t-test comparing the difference of the benefit effect and the cost effect to a mean of zero was conducted. The difference-of-differences score was not significantly greater than zero, t(55) = -0.51, SE = 1.40, p = .613, and the confidence intervals based upon this t-test did include zero (see Table 2). The test of the quadratic trend supported the difference-of-differences analysis in that there was no significant deviation from a linear function, F(1,55) = 0.26, MSE = 18.38, p = .613. To examine if there was support for a difference in the quadratic trends amongst the condition means between the phonological discrimination tasks (i.e., Experiments 1 and 2) and the orthographic discrimination tasks (i.e., Experiments 3 and 4), a condition (congruent, irrelevant, incongruent) by discrimination task (phonological and orthographic) quadratic trend analysis was conducted. The interaction between condition and discrimination task was significant, F(1, 110) = 6.57, MSE = 17.64, p < .05.
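The equivalence the analysis relies on, that the quadratic trend across the three condition means is the same comparison as benefit minus cost, follows from the standard quadratic contrast weights (+1, -2, +1). A small sketch with hypothetical means (not the experiments' values) makes the algebra concrete:

```python
# Illustrative condition means (percent correct); hypothetical values.
means = {"congruent": 88.0, "irrelevant": 80.0, "incongruent": 68.0}

# Standard quadratic contrast weights for three ordered levels.
weights = {"congruent": 1.0, "irrelevant": -2.0, "incongruent": 1.0}
quadratic = sum(weights[c] * means[c] for c in means)

benefit = means["congruent"] - means["irrelevant"]
cost = means["irrelevant"] - means["incongruent"]

# quadratic = congruent - 2*irrelevant + incongruent
#           = (congruent - irrelevant) - (irrelevant - incongruent)
#           = benefit - cost
```

A null quadratic trend therefore means benefits and costs are equal (a linear, bias-like pattern), while a significant quadratic trend means they differ (the sensitivity signature seen in Experiments 1 and 2 but not in Experiments 3 and 4).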
It should be noted that there was greater variability across participants in linear trends, which are indicative of a bias mechanism, than in quadratic trends, which are indicative of sensitivity effects.
Discussion
Experiments 3 and 4 provided evidence of a symmetrical effect of phonological lexical contexts on orthographic lexical discriminations. As the same pattern held for both levels of orthographic discriminability, this symmetrical effect is not compromised by a scaling artefact on overall accuracy, nor any form of variable bias due to the level of target discriminability. In comparison to the phonological discrimination tasks (i.e., Experiments 1 and 2), the results for the orthographic discrimination task suggest that there is a real difference between the tasks. Specifically, the highest order trend for Experiments 1 and 2 combined was quadratic, whereas for Experiments 3 and 4 combined the highest order trend was linear. Furthermore, the quadratic trend analysis did produce a significant interaction between condition and discrimination task. Taken together, these results suggest that the pattern of results did change as a function of the discrimination task. The pattern of results for Experiments 3 and 4, which demonstrated that phonological contexts benefit congruent condition accuracy as much as they cost incongruent condition accuracy, could thus be accommodated by a simple bias account with no direct connections between the two lexical subsystems.
Semantic and/or sublexical involvement. A concern that deserves some consideration is whether target discriminations could have been made at the semantic level or at the sublexical level instead of at the lexical level. Given that the experiments all used five three-letter word triplets that were repeated several times in counterbalancing, it seems unlikely that the stimuli were being semantically processed. Alternatively, it could be argued that the high repetition of the word triplets may have prompted participants to eventually rely on a sublexical strategy whereby they would focus their attention on the onset of the target stimuli. An analysis of the first 90 trials (i.e., the first 25% of the experimental trials) for each experiment suggests that this is not the case, as the same symmetrical and asymmetrical effects are observed as reported for the full experiments (with the exception that there was only a trend for a 6.7% sensitivity effect in Experiment 2, t(23) = 1.540, SE = .043, p = .137, but note that the pattern was in the correct direction).
General Discussion
The present study extended the Borowsky et al. (1999) research to examine the type of connections involved at the lexical level of orthographic and phonological representations. Single-route models predict that the same type of connections must exist for both sublexical and lexical levels of representation, whereas dual-route models can allow for different types of connections along the two routes (e.g., Coltheart et al., 2001). Some models of spoken word recognition claim that there should be no connections whatsoever between orthographic and phonological representations, regardless of whether they are sublexical or lexical (Fowler & Dekle, 1991; Massaro et al., 1988), whereas others assume that connections would be equally available for processing in either direction (Jacobs et al., 1998).
As previously discussed, if the context manipulation produces a symmetrical effect on target discrimination, as indicated by the congruent context benefiting performance to the same degree as the incongruent context costs performance, then there is no unequivocal evidence for direct connections between the context and target modalities (i.e., a simple response bias effect has occurred). A more informative outcome, however, is when the context manipulation produces an asymmetrical effect (i.e., a sensitivity effect) on target discrimination, as indicated by costs not equaling benefits. If similar sensitivity effects occur for both types of discrimination task, one can conclude that there are bidirectional connections between the lexical processing subsystems. However, if there is a sensitivity effect for only one type of discrimination task, then one would conclude that there are unidirectional connections between the lexical processing subsystems. The present results were of the latter type, and thus suggest that unidirectional connections exist from the orthographic lexical processing subsystem to the phonological lexical processing subsystem. Borowsky et al. (1999) also obtained this pattern for the level of connections that map graphemes onto phonemes. Although both dual- and single-route models can account for Borowsky et al.'s results in conjunction with the present set of results, such findings are important for constraining the types of connections necessary for models of visual word recognition and speech perception (see Figure 1).
Figure 4. Mean written word discrimination accuracy (in percent) as a function of orthographic and phonological lexical congruency for: (A) Experiment 3, and (B) Experiment 4. Confidence intervals were calculated using the formula for a within-subjects design as outlined in Loftus and Masson (1994).
Many current speech perception models that describe both orthographic and phonological processing cannot account for the present set of results (see also Borowsky et al., 1999). For example, the Direct Realist Theory (Fowler & Dekle, 1991) states that orthographic processing will not influence phonological processing. Similarly, Massaro et al.'s (1988; Massaro & Cohen, 1993) Fuzzy Logical Model of Perception predicts that cross-modal orthographic and phonological processing effects would be restricted to bias effects. These models were not supported, given the sensitivity effect of orthographic lexical context upon phonological discrimination. Both Borowsky et al.'s results and the present results clearly indicated that orthography does have a direct influence on phonological discrimination sensitivity (i.e., at both phonemic and spoken-word levels).
Some models of word recognition are more flexible in being modified to handle the present set of results. Models that implement an interactive activation framework (McClelland & Rumelhart, 1981) of connectivity between orthographic and phonological lexical representations (Coltheart et al., 2001; Jacobs et al., 1998) assume bidirectional connections. If one simply assumes equally weighted bidirectional connections between the orthographic and phonological lexical representations, then asymmetrical effects (i.e., sensitivity effects) for both the phonological lexical discrimination task (i.e., Experiments 1 and 2) and orthographic lexical discrimination task (i.e., Experiments 3 and 4) should have been obtained. However, our results suggest that the nature of the lexical-level connections needs to reflect a greater influence of orthographic processing on phonological processing. As such, these models would require that the equally bidirectional connections at the lexical level be replaced with connections that place a greater weighting on the connections from orthography to phonology.
The current study provides an important constraint on the nature of the connections between lexical orthographic and phonological representations for models of speech and visual word recognition. In general, the present results are consistent with the fact that readers have a lot of experience mapping written letters and words onto phonological representations (Borowsky et al., 1999; Frost & Katz, 1989). Future studies could explore whether the opposite pattern of results (in particular, phonological lexical to orthographic lexical sensitivity) would be observed for individuals who are highly practiced in mapping spoken words onto orthographic representations (e.g., stenographers). Another important direction for this research is to explore semantic-mediated target discrimination, and the nature of the connections between the semantic system and the orthographic and phonological subsystems. For example, one could examine a semantic-mediated version of this paradigm whereby the imageability of the targets (e.g., Strain, Patterson, & Seidenberg, 1995), or the degree of polysemy (e.g., Borowsky & Masson, 1996) are manipulated, or whereby picture contexts are used (Masson & Borowsky, 1998).
Both authors made equal contributions to this work. This research was supported by the Natural Sciences and Engineering Research Council of Canada through a Post Graduate Scholarship to William J. Owen and a research grant to Ron Borowsky. We thank Peter Dixon, Ken Paap, and Jonathan Grainger for their insightful reviews of this manuscript. Correspondence concerning this article should be addressed to either William J. Owen, Department of Psychology, University of Northern British Columbia, 3333
University Way, Prince George, British Columbia V2N 4Z9 or Ron Borowsky, Psychology Department, University of Saskatchewan, 9 Campus Drive, Saskatoon, Saskatchewan S7N 5A5 (E-mail: [email protected] or [email protected]).
References
Borowsky, R., & Besner, D. (1991). Visual word recognition across orthographies: On the interaction between context and degradation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 272-276.
Borowsky, R., & Besner, D. (1993). Visual word recognition: A multistage activation model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 813-840.
Borowsky, R., & Masson, M. E. J. (1996). Semantic ambiguity effects in word identification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 63-85.
Borowsky, R., Owen, W. J., & Fonos, N. (1999). Reading speech and hearing print: Constraining models of visual word recognition by exploring connections with speech perception. Canadian Journal of Experimental Psychology, 53, 294-305.
Borowsky, R., Owen, W. J., & Masson, M. E. J. (2002). Evaluating the diagnostics for phonological lexical access: Pseudohomophone naming advantages, disadvantages, and base-word frequency effects. Memory & Cognition, 30, 969-987.
Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256.
Fowler, C. A., & Dekle, D. J. (1991). Listening with eye and hand: Cross-modal contributions to speech perception. Journal of Experimental Psychology: Human Perception and Performance, 17, 816-826.
Frost, R., & Katz, L. (1989). Orthographic depth and the interaction of visual and auditory processing in word recognition. Memory & Cognition, 17, 302-310.
Harm, M. W., & Seidenberg, M. S. (1999). Phonology, reading acquisition, and dyslexia: Insights from connectionist models. Psychological Review, 106, 491-528.
Jacobs, A. M., Rey, A., Ziegler, J. C., & Grainger, J. (1998). MROM-P: An interactive activation, multiple read-out model of orthographic and phonological processes in visual word recognition. In J. Grainger & A. M. Jacobs (Eds.), Localist connectionist approaches to human cognition (pp. 147-188). Mahwah, NJ: Erlbaum.
Kay, J., Lesser, R., & Coltheart, M. (1996). Psycholinguistic assessments of language processing in aphasia (PALPA): An introduction. Aphasiology, 10, 159-180.
Liberman, A., & Mattingly, I. (1985). The motor theory of speech perception revised. Cognition, 21, 1-36.
Loftus, G. R., & Masson, M. E. J. (1994). Using confidence intervals in within-subject designs. Psychonomic Bulletin & Review, 1, 476-490.
MacDonald, J., & McGurk, H. (1978). Visual influences on speech perception processes. Perception & Psychophysics, 24, 253-257.
Massaro, D. W. (1989). Testing between the TRACE model and the fuzzy logical model of speech perception. Cognitive Psychology, 21, 398-421.
Massaro, D. W., & Cohen, M. M. (1993). The paradigm and the fuzzy logical model of perception are alive and well. Journal of Experimental Psychology: General, 122, 115-124.
Massaro, D. W., Cohen, M. M., & Thompson, L. (1988). Visual language in speech perception: Lipreading and reading. Visual Language, 22, 8-31.
Masson, M. E. J., & Borowsky, R. (1998). More than meets the eye: Context effects in word identification. Memory & Cognition, 26, 1245-1269.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375-407.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746-748.
Paap, K. R., Johansen, L. S., Chun, E., & Vonnahme, P. (2000). Neighborhood frequency does affect performance in the Reicher task: Encoding or decision? Journal of Experimental Psychology: Human Perception and Performance, 26, 1691-1720.
Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115.
Ratcliff, R., & McKoon, G. (1997). A counter model for implicit priming in perceptual word identification. Psychological Review, 104, 319-343.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Strain, E., Patterson, K., & Seidenberg, M. S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1140-1154.
Zorzi, M., Houghton, G., & Butterworth, B. (1998). Two routes or one in reading aloud? A connectionist dual-process model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1131-1161.
William J. Owen, University of Northern British Columbia
Ron Borowsky, University of Saskatchewan
Copyright Canadian Psychological Association Dec 2003