While the relative effects of processing instruction (PI) and various types of output-based instruction have been widely examined, the types of L2 knowledge generated by such instruction are under-researched. Moreover, the relationships between learners’ individual differences and the comparative effects of PI and output-based instruction have not yet been investigated. This study examined the types of L2 knowledge generated by three instructional methods and their relationships with individual differences in working memory capacity. A total of 86 adult beginning-level L2 Chinese learners were assigned to three experimental groups, namely processing instruction (PI), meaning-based output instruction (MOI), and combined instruction (CI), and a control group. An Untimed Grammaticality Judgment Test (UGJT) and an oral Elicited Imitation Test (EIT) were used to measure learners’ development of explicit and implicit knowledge, respectively. A counting span task was employed to assess learners’ working memory capacity. The results demonstrated that the three types of instructional treatment brought about equal gains on the measure of explicit knowledge, and that the MOI group made greater gains on the measure of implicit knowledge. Working memory was negatively correlated with the UGJT for the MOI group. The results suggest that the three instructional treatments may aid the development of both explicit and implicit knowledge and that MOI may have an advantage in facilitating the development of L2 implicit knowledge. The findings also suggest that learners with lower working memory capacity may benefit more from MOI.
Introduction
There has been a proliferation of empirical research on instructed second language acquisition that examines the effectiveness of various instructional methods (Kang et al., 2019; Norris & Ortega, 2000; Shintani, 2015; Shintani et al., 2013; Spada & Tomita, 2010). One widely examined pedagogical intervention in grammar instruction is processing instruction (PI), which aims to push learners away from non-optimal processing strategies so that they can make correct form-meaning connections during comprehension. Despite the large body of studies that have examined the comparative effectiveness of PI and different types of output-based instruction, such as traditional instruction (TI) and meaning-based output instruction (MOI), on the comprehension and production of various linguistic features (e.g., word order, clitic pronouns), some issues have not been fully addressed. For example, there has been insufficient investigation of the types of L2 knowledge (i.e., explicit and/or implicit knowledge) promoted by PI and output-based instruction. Furthermore, previous studies have indicated that learners’ individual differences, such as language aptitude and working memory, may play a role in mediating the effectiveness of different instructional approaches (Li, 2015, 2017; Li et al., 2019; Robinson, 2002; Skehan, 2015). However, little attention has been given to the relationships between learners’ individual differences and the comparative effects of PI and output-based instruction. In this article, we present the results of a quasi-experimental study that examined the effects of PI, MOI, and a combined instruction (CI) on the development of L2 explicit and implicit knowledge and their relationships with learners’ individual differences in working memory.
Literature review
The relative effects of PI and output-based instruction
Processing Instruction is rooted in the Input Processing Theory proposed by VanPatten (2004), which seeks to elucidate the reasons behind the selective processing of certain linguistic elements in input by L2 learners. VanPatten (2004) defines processing as the establishment of a connection between form and meaning. Two major principles have been proposed to explain the challenges learners face in processing input. One is the Primacy of Meaning Principle which posits that learners prioritize the comprehension of meaning over other aspects when form and meaning potentially compete for processing resources (VanPatten, 2004). The other is the First Noun Principle which claims that learners have a tendency to interpret the first noun in a sentence as the subject or agent (VanPatten, 2004). Processing instruction is developed in order to help learners modify their default processing strategies to facilitate better form-meaning connections so that more input would be processed and retained in working memory and subsequently integrated into the developing system (i.e., restructuring and accommodation), leading to acquisition. PI typically consists of two components: explicit information (EI) and structured input (SI) activities (VanPatten & Cadierno, 1993). EI provides grammatical rules and information about processing strategies that may hinder learners’ interpretation of the target language structure (VanPatten & Cadierno, 1993). SI includes referential activities, which require learners to establish form-meaning connections and receive yes/no feedback on the correctness of their responses, and affective activities, which elicit affective responses based on personal experiences, opinions, or beliefs (Wong, 2004). No metalinguistic explanation is provided for learners’ responses, and they do not produce the target language structure (Lee & VanPatten, 2003).
The effects of PI have been compared to a variety of output-based instructional types. One commonly studied type is ‘traditional instruction (TI)’, which involves a series of activities progressing from mechanical to meaningful to communicative practices (VanPatten & Cadierno, 1993). Studies examining the comparative effectiveness of PI and TI have consistently shown that PI has better effects on interpretation tests, while producing equal effects to TI on production tests (e.g., Benati, 2001, 2005, 2009; VanPatten & Cadierno, 1993; VanPatten & Wong, 2004). Some researchers (e.g., Cadierno, 1995; Farley, 2001a) have questioned whether the overall superior effects of PI are due to the inclusion of the ‘meaningful’ component in SI. To address this issue, several studies (Benati, 2005; Farley & Aslan, 2012; Farley, 2001a, 2001b, 2004; Morgan-Short & Bowden, 2006) have employed ‘meaning-based output instruction (MOI)’, which consists solely of meaningful or communicative activities. MOI includes explicit information similar to that in PI, as well as ‘structured output (SO)’ activities that resemble SI but require the production of the target structure in oral or written form. Other types of output-based instruction compared to PI include meaning-based drills (Keating & Farley, 2008), communicative output tasks (Toth, 2006), and Dictogloss tasks (Qin, 2008; Uludag & VanPatten, 2012; VanPatten et al., 2009).
The results of studies comparing the effects of PI and output-based instruction have been inconsistent. In a meta-analysis of 33 empirical studies, Shintani (2015) found that PI outperformed output-based instruction on interpretation tests, while output-based instruction performed equally to PI on production tests. Output-based instruction outperformed PI when both groups were provided with identical explicit information. A few studies have examined the relative effects of PI/SI in combination with different types of output-based activities (Benati & Batziou, 2019a, 2019b; Kirk, 2013; Mystkowska-Wiertelak, 2011). These studies generally found that PI alone, or the combination of PI/SI and output-based instruction, was more effective than output-based instruction alone, thus confirming the essential role of PI/SI as effective grammar instruction. However, the design and operationalization of the instructional treatments differed across these studies, which may make it difficult to conclude whether the impact of PI was due to its nature or to the absence of output-based instruction (Benati & Batziou, 2019a, 2019b). Taken together, the evidence on the comparative effectiveness of combined input- and output-based instruction is inconclusive. Thus, this study aims to further examine this issue by incorporating a combined instruction of SI and SO to investigate whether it would yield beneficial effects.
As pointed out in a review by DeKeyser and Prieto Botana (2015), researchers have employed skill acquisition theory to explain their results, showing that the PI group performed better in comprehension tests because the skills required in the comprehension tests were similar to what they trained in PI. A similar pattern was also observed in MOI in the production test. According to skill acquisition theory (DeKeyser & Sokalski, 1996), learners in the instructional treatments acquire explicit knowledge about making connections between the form and meaning of the target structure, and they convert this type of knowledge to implicit knowledge in very specific ways based on the type of practice they receive (i.e., production or comprehension). There has been limited investigation regarding the linguistic knowledge types promoted by PI. In the following section, we will discuss the concept of linguistic knowledge types and the distinction between them, as well as examine PI studies that investigate the relative effects on L2 development of knowledge types.
Knowledge types generated by PI
While empirical studies have examined the relative effects of PI and different types of output-based instruction on L2 receptive and productive knowledge by using interpretation and production tests, as discussed in the previous section, very few studies have employed separate measures to examine learners’ acquisition, as pointed out by Shintani (2015). Acquisition can be defined in terms of explicit and implicit knowledge, as proposed by Ellis (2009). Ellis (2004) describes explicit knowledge as analyzed knowledge that learners are aware of and can articulate using metalanguage, such as verb complements and clauses. Implicit knowledge, on the other hand, is characterized as “sub-symbolic, procedural, and unconscious” (Ellis, 2004, p. 38).
There has been discussion of the types of L2 knowledge generated by PI/SI. For example, VanPatten (2004) noted that “at no time did our conclusions refer to comprehension vs. production. Our final conclusion was that instruction that was directed at intervening in learners’ processing strategies should have a significant impact on the learner’s developing system” (p.97). Marsden and Chen (2011) argued that the reported learning gains by PI have frequently been regarded as evidence of implicit knowledge rather than explicit knowledge (see Marsden & Chen, 2011 for a full discussion). Many scholars, however, have doubted the claims that PI impacts on learners’ underlying system. For example, some argued that participants either were given metalinguistic explanation of the language features or can induce the grammatical rules when they were constantly given yes/no feedback (DeKeyser et al., 2002) and the effects of PI studies are due to “practice with (given or induced) explicit knowledge” (De Jong, 2005, p. 211). Collentine (2004) disputed that findings of PI studies “do not reveal whether the learners’ developing system is responding differently to authentic input” (p.183) as the tasks solely measured how learners adopt processing strategies to comprehend SI as they were trained during instruction. As tasks employed in many PI studies (e.g., written production tasks without a time limit) were metalinguistic in nature, the tasks tapped learners’ explicit knowledge (Doughty, 2004). Furthermore, Harrington (2004) pointed out that “a pressing need in the Input Processing model is a more explicit account of the relationship between processing (in its broader and more restricted input sense) and L2 knowledge” (p. 92).
There were only a few studies investigating the development of L2 knowledge types promoted by PI and output-based instruction. Erlam et al. (2009) explored the relative effects of SI and output-based instruction on developing explicit and implicit knowledge of English indefinite articles. Findings showed that the output-based instruction group performed better than the control group in both explicit and implicit measures and there was no significant difference between the SI and output-based instruction groups in either measure. Marsden and Chen (2011) used different measures of explicit (i.e., an untimed written gap-fill test and a short semi-structured conversation) and implicit knowledge (i.e., a timed grammatical judgment test and an oral picture narration) to isolate the effects of referential and affective activities in SI. Results found that the timed grammatical judgment test and the untimed written gap-fill test tended to elicit explicit knowledge, which was gained from referential activities. Recent studies (Erlam & Wei, 2021; Spada et al., 2015; Suzuki, 2017; Suzuki & DeKeyser, 2015; Suzuki et al., 2023; Vafaee et al., 2017) have questioned the validity of tests of implicit knowledge (e.g., EIT) used in previous studies, suggesting that EIT might be a measure of speeded-up/automatized explicit knowledge (Suzuki & DeKeyser, 2015). Please see the next section for a discussion of automatized explicit knowledge. Therefore, instead of measuring implicit knowledge, Mostafa and Kim (2021) compared the effects of PI and output-based instruction on the development of explicit and automatized explicit knowledge of two linguistic structures. In their study, explicit knowledge was measured by an error correction test and automatized explicit knowledge was measured by an oral narrative production test. 
Mostafa and Kim’s (2021) findings suggested that both PI and output-based instruction were effective at developing explicit knowledge and that output-based instruction was more effective than PI at developing automatized explicit knowledge. To summarize, empirical evidence on the knowledge types promoted by PI and output-based instruction is scarce, and the aforementioned studies differed in their research objectives and methodology, making it challenging to draw meaningful comparisons and valid conclusions. Therefore, more research on this issue is warranted.
Another way of probing the types of L2 knowledge PI may elicit is to investigate “whether there is transfer of knowledge from exemplars in the treatment to novel test items” (Farley, 2004, p. 150). Only two studies have addressed this question. Farley (2004) examined whether PI and MOI brought about improved performance on sentence-level interpretation and production tasks of novel subjunctive forms. Results suggested that learners receiving either type of instruction improved significantly on the interpretation and production of forms that had not been taught during instruction. Erlam et al. (2009) compared the effects of SI and output-based instruction on measures of explicit and implicit knowledge and investigated whether there was any transfer of learning to new language forms after the treatments. Findings demonstrated that neither type of instruction led to significant improvement on measures of explicit knowledge. The output-based instruction group made significant gains for novel test items on measures of implicit knowledge on the delayed posttest, while the SI group did not. This provided evidence that output-based instruction may have an impact on the development of implicit knowledge. Considering the scant research, it would be helpful to use tests that elicit both types of knowledge to investigate whether PI leads to the development of explicit and/or implicit knowledge. The present study aims to address the comparative effects of PI, MOI, and a combined instruction (CI) on the types of L2 knowledge promoted. It uses tests of explicit and implicit knowledge and also includes new test items to examine whether there is transfer of knowledge after instruction.
Measures of explicit and implicit linguistic knowledge
A variety of measures have been applied to assess both explicit and implicit linguistic knowledge in second language learning and acquisition. Commonly used measures of explicit linguistic knowledge are the untimed grammaticality judgment test (UGJT) and the metalinguistic knowledge test (MKT) (Isbell & Rogers, 2021). Frequently employed tests of implicit linguistic knowledge are word monitoring, self-paced reading (SPR), oral narrative, the timed grammaticality judgment test (T-GJT), and the elicited imitation test (EIT) (Isbell & Rogers, 2021).
It is generally agreed that tests such as the UGJT and MKT primarily assess explicit linguistic knowledge. However, there has been ongoing debate about the constructs measured by certain tests of implicit knowledge, such as the EIT and T-GJT. Previous studies have validated the EIT as a measure of implicit linguistic knowledge (Erlam, 2006; Erlam & Wei, 2021; Erlam et al., 2009; Spada et al., 2015). On the other hand, some researchers have argued that the EIT may not be a convincing measure of implicit linguistic knowledge. For instance, Suzuki and DeKeyser (2015) found that the EIT tends to tap automatized explicit knowledge, which refers to “a body of conscious linguistic knowledge including different levels of automatization” (Suzuki, 2017, p. 2; DeKeyser, 2017; Vafaee et al., 2017). Erlam and Wei (2021) validated the EIT as a measure of implicit knowledge by comparing it with other measures of explicit and implicit knowledge and concluded that EITs may tap knowledge associated with overall language proficiency, which would involve both automatized explicit knowledge and implicit knowledge. By employing functional magnetic resonance imaging (fMRI), Suzuki et al. (2023) investigated advanced L2 speakers’ neural circuits associated with explicit and implicit knowledge during the listening and speaking phases of an EIT. Results showed that learners may access explicit knowledge when speaking and both explicit and implicit knowledge when listening. This suggests that both types of linguistic knowledge can be activated when performing an EIT. The mixed results indicate that further research is needed to theoretically distinguish between automatized explicit knowledge and implicit knowledge.
EIT was used in the present study as a measure to primarily gauge learners’ implicit knowledge, not automatized explicit knowledge, as discussed above, which was measured by an oral narrative production test in Mostafa and Kim (2021). There are three reasons. First, as argued by researchers (e.g., DeKeyser, 2012; Erlam & Wei, 2021; Suzuki, 2017), distinguishing between automatized explicit knowledge and implicit knowledge is critical from a theoretical perspective. However, automatized explicit knowledge functions equivalently to implicit knowledge, making it impossible to distinguish between the two types of knowledge that learners use in daily language use. In addition, the present study did not aim to validate the EIT but simply used it as a measure of L2 knowledge. Second, implicit knowledge was considered to have some potential for tapping into automatized explicit knowledge for advanced L2 learners. The participants targeted in the present study, however, were beginning-level learners. As discussed by Suzuki (2017), the automatization of explicit knowledge is a gradual process, which means that advanced L2 learners with a higher degree of automatization are more likely to quickly draw on explicit knowledge under time pressure. In contrast, beginning L2 learners with a lower degree of automatization are less likely to draw on explicit knowledge quickly. Therefore, the EIT is more likely to activate beginning level learners’ implicit knowledge. Third, the EIT in this study was designed to assess implicit knowledge by presenting participants first with comprehension questions which focus on meaning. Ungrammatical items were also included to provide evidence of learners’ unconscious correction because they were shown example answers only during the training phase but were not explicitly instructed to correct the ungrammatical stimuli. 
New items were also included, as discussed above, to investigate whether there was any transfer of knowledge to items that they were never taught. In other words, if participants automatically repeat correctly, it suggests that they have internalized the target structure and are likely tapping into implicit knowledge (Erlam et al., 2009). Please see the ‘Testing’ section for more information about the EIT.
The role of working memory in PI and output-based instruction
DeKeyser (2012, 2019) proposed an aptitude-treatment interaction (ATI) framework, under which learners’ individual cognitive abilities have been reported to interact differently with various learning conditions in the field of second language acquisition (SLA) (Li, 2015, 2017; Li et al., 2019; Robinson, 2002; Skehan, 2015). Within this body of empirical research, several studies have identified working memory as playing a role in mediating the effects of different instructional types (Li, 2017). Working memory is defined as “the temporary storage and manipulation of information that is assumed to be necessary for a wide range of complex cognitive activities” (Baddeley, 2003, p. 189). Baddeley and Hitch (1974) proposed a tripartite model of working memory that consists of a supervisory attentional system (the central executive) aided by two slave subsystems (the phonological loop and the visuo-spatial sketch pad). The central executive is responsible for coordinating the two slave systems and for focusing and switching attention. As the central executive lacks storage capacity, the episodic buffer was added as a new component by Baddeley (2000) to act as a temporary store, an interface between the two slave subsystems, and a connection between working memory and long-term memory. The phonological loop holds information temporarily and serves as an articulatory rehearsal process that can keep information from fading. The visuo-spatial sketch pad processes and retrieves visual and spatial information. Measures of working memory are categorized as simple and complex tasks. Simple tasks measure only the ability to store information, called phonological short-term memory. Non-word repetition tests and digit span tasks, which require learners to recall “a string of nonrelated letters, words, digits or visual objects” (Linck et al., 2014, p. 7), are examples of simple tasks. Complex tasks tap the dual functions of information storage and processing, called executive working memory. Such tasks include reading span, listening span, counting span, and operation span.
Working memory can be an important factor impacting L2 learning in instructed settings. To provide a comprehensive view of the influence of working memory across instructional types, Li (2017) integrated a meta-analysis and a narrative review of 24 studies and found an overall weak correlation between working memory and the effects of corrective feedback, a form-focused strategy that responds to L2 learners’ errors during interaction. Some studies found that working memory was associated with instructional types in which learners’ attention is never drawn to forms or in which learners learn grammar rules unintentionally (Robinson, 2002, 2005; Sanz et al., 2016). Other studies, by contrast, reported that working memory was correlated with instructional types that require learners to attend to forms or to induce grammar rules on their own (Li et al., 2019; Tagarelli et al., 2015).
Although many studies have examined the role of working memory in mediating the effectiveness of instructional types, only two studies have investigated the interaction between the effectiveness of PI and working memory capacity (Erlam, 2005; Santamaria & Sunderman, 2015). Erlam’s (2005) study reported that working memory was correlated with written production tests, suggesting that structured input was beneficial for learners with more working memory resources and that working memory was important for production. In Santamaria and Sunderman’s (2015) study, learners’ working memory capacity did not affect their performance on interpretation tests, while learners with more working memory resources outperformed those with fewer on production tests. Researchers have argued that the written production test was more challenging than the interpretation test and that learners may therefore face a heavier processing burden when performing the written production test (Li et al., 2019). Additionally, given that working memory is responsible for processing and storing L2 input simultaneously, learners with different working memory resources may benefit differently from processing instruction, which aims to alter learners’ ineffective processing strategies. Because so few studies are available, it is difficult to draw any solid conclusions from the findings. The present study attempts to fill this research gap by investigating the relationships between working memory and the relative effects of PI and output-based instruction.
The present study
The review of the literature reveals that despite the pedagogical effectiveness of PI and output-based instruction on L2 comprehension and production, there is a lack of research on the relative effectiveness on developing L2 explicit and implicit knowledge, and there is also scant research on whether the comparative effects are related to learners’ individual differences in working memory. The current study attempts to fill research gaps in instructed second language acquisition and contribute to the Input Processing Theory by investigating the relative effects of PI and output-based instruction on second language acquisition (measured by tests of explicit and implicit knowledge), and contribute to aptitude-treatment interaction research by examining the correlations between learning gains and working memory. The research questions for the present study are as follows:
RQ1. What are the relative effects of the three types of instruction (PI, MOI, and CI) on the development of explicit knowledge?
RQ2. What are the relative effects of the three types of instruction (PI, MOI, and CI) on the development of implicit knowledge?
RQ3. Do individual differences in working memory capacity moderate the effects of the three types of instruction on learners’ L2 explicit and implicit knowledge?
Methodology
Participants
A total of 86 learners enrolled in Chinese language courses at a Chinese university participated in the study. To be eligible, participants had to be beginning-level adult learners. The participants were aged between 18 and 59, and most of them (88%) were aged between 18 and 30. They had various language backgrounds, and the majority of them were Asian (60 out of 86). Participants’ language proficiency level was gauged by means of an entrance examination administered by the college when they enrolled. The participants had received formal Chinese instruction for an average of ten months before coming to China. They attended 13.3 h of Chinese lessons per week at the time of data collection. They had five 100 min lessons a week for a total of 20 weeks. The participants were assigned to three experimental groups, a processing instruction (PI) group (n = 23), a meaning-based output instruction (MOI) group (n = 20), and a combined instruction (CI) group (n = 22), and to one control group (n = 21). There were roughly equal numbers of participants across the groups in terms of gender, age, and region (see Table 1).
Table 1. Number of participants by group, gender, age, regions and first languages
Group | PI (n = 23) | MOI (n = 20) | CI (n = 22) | Control (n = 21) | Total |
|---|---|---|---|---|---|
Gender | |||||
Female | 15 | 14 | 13 | 13 | 55 |
Male | 8 | 6 | 9 | 8 | 31 |
Age | |||||
18–30 | 19 | 20 | 19 | 18 | 76 |
31–40 | 3 | 0 | 1 | 2 | 6 |
Over 50 | 1 | 0 | 2 | 1 | 4 |
Region | |||||
Asia | 18 | 14 | 15 | 13 | 60 |
Africa | 2 | 2 | 5 | 1 | 10 |
Europe & America | 3 | 4 | 2 | 7 | 16 |
L1 | |||||
Japanese | 9 | 7 | 9 | 5 | 30 |
Korean | 4 | 4 | 1 | 3 | 12 |
Thai | 2 | 2 | 2 | 2 | 8 |
Cambodian | 1 | 0 | 0 | 1 | 2 |
Indonesian | 2 | 0 | 0 | 1 | 3 |
Malay | 0 | 0 | 1 | 1 | 2 |
Arabic | 0 | 2 | 2 | 0 | 4 |
Urdu | 0 | 1 | 1 | 0 | 2 |
Swedish | 0 | 1 | 0 | 0 | 1 |
English | 3 | 1 | 2 | 3 | 9 |
French | 0 | 0 | 3 | 1 | 4 |
Italian | 0 | 2 | 0 | 1 | 3 |
Spanish | 1 | 0 | 1 | 2 | 4 |
Turkish | 1 | 0 | 0 | 1 | 2 |
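The group sizes can be cross-checked against the rows of Table 1. The following is a minimal sketch in Python, assuming only the gender and age counts listed in the table; the variable names are illustrative and are not part of the study materials.

```python
# Gender counts per group as listed in Table 1 (order: PI, MOI, CI, Control).
groups = ["PI", "MOI", "CI", "Control"]
female = [15, 14, 13, 13]
male = [8, 6, 9, 8]

# Group size = female + male count in each column.
group_sizes = {g: f + m for g, f, m in zip(groups, female, male)}
total = sum(group_sizes.values())

# Share of participants aged 18-30 (76 participants in the 18-30 row).
share_18_30 = round(100 * 76 / total)
```

Summing the gender rows column by column gives the group sizes reported in the Participants section (PI = 23, MOI = 20, CI = 22, control = 21), and the 18–30 row corresponds to the 88% mentioned there.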
Power analysis
According to Cohen (2013), the statistical power and effect size should both exceed 0.8 at a probability level of 0.05. Based on this criterion, the total sample size required for a two-factor repeated measures design was estimated to be 40. Therefore, our sample size of 86 participants was sufficient.
Target structures
The target structures were two pairs of Chinese classifiers. They were chosen because they are redundant, and learners may ignore them when processing the whole phrase owing to a default processing strategy (i.e., the Primacy of Meaning Principle). Moreover, they are elementary-level structures suitable for beginners, according to the New HSK (a standardized test of Chinese as a foreign language) (2012 Revision). There are twenty-four lexical items divided into four sets to match the four classifiers. Both the grammatical and the lexical forms are illustrated in Table 2. Although the participants in the study had not received any formal instruction on the target classifiers in class, they may have come across them previously.
Table 2. Illustration of examples of the target grammatical forms
Classifiers | Definitions | Examples | Nouns |
|---|---|---|---|
míng (名) | For describing people’s identities or occupations | yī míng xuéshēng (a + classifier + student) | Doctor, lawyer, teacher, student, policeman, driver |
zhī (只) | For describing animals | yī zhī gǒu (a + classifier + dog) | Dog, cat, turtle, rabbit, bird, hedgehog |
bǎ (把) | For describing something with a handle or something you can hold in your hands | yī bǎ sǎn (a + classifier + umbrella) | Chair (with a handle), scissors, knife, umbrella, comb, spoon |
zhāng (张) | For describing something flat or something with a platform | yī zhāng zhuōzi (a + classifier + table) | Bed, carpet, picture, sofa, table, painting |
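The pairing of classifiers and nouns in Table 2 can be viewed as a simple lookup structure. The sketch below is illustrative only: the dictionary and the helper `classifier_for` are hypothetical names, and the nouns are the English glosses from the table rather than the Chinese test items.

```python
# Classifier-to-noun sets from Table 2 (English glosses of the nouns).
CLASSIFIER_NOUNS = {
    "míng": ["doctor", "lawyer", "teacher", "student", "policeman", "driver"],
    "zhī": ["dog", "cat", "turtle", "rabbit", "bird", "hedgehog"],
    "bǎ": ["chair", "scissors", "knife", "umbrella", "comb", "spoon"],
    "zhāng": ["bed", "carpet", "picture", "sofa", "table", "painting"],
}

# Reverse lookup: which classifier is paired with a given noun.
NOUN_TO_CLASSIFIER = {
    noun: clf for clf, nouns in CLASSIFIER_NOUNS.items() for noun in nouns
}

def classifier_for(noun):
    """Return the classifier paired with `noun` in Table 2, or None if it is not listed."""
    return NOUN_TO_CLASSIFIER.get(noun.lower())
```

Four sets of six nouns give the twenty-four lexical items described in the text.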
Instructional treatments
There were three instructional packets, each consisting of EI and input- and/or output-based activities. The PI and CI groups received EI that included metalinguistic explanations of the classifiers and information about the processing strategies. They were told to pay attention to the classifiers and to rely on them to get the meaning. The MOI group was given only metalinguistic explanations of the classifiers, as the goal of MOI is to elicit production of the target structure, not to push learners away from inefficient processing strategies. The PI group then received eight SI activities designed to alter their default processing strategies and push them to make correct form-meaning connections. Participants were never required to produce Chinese during SI. They were given yes/no feedback, that is, they were told only whether their answers were correct, and no grammatical explanation was given if their answers were wrong (Lee & VanPatten, 2003). No extra metalinguistic information was repeated by the instructor. Although students may have produced some Chinese in affective activities, their production did not contain any target structures; for example, learners responded by answering only ‘yes’.
The MOI group received eight SO activities encouraging learners to produce the classifiers (Lee & VanPatten, 2003). After completing an activity, participants in the MOI group also received feedback during which they were asked to give their answers to each item. When a wrong answer was given, another participant was called on until a correct answer was given. Participants were provided only with the correct answers, with no accompanying explanation. The CI group were given four SI and four SO activities.
The number of tokens was equal for all the experimental groups. To minimize the chance of receiving incidental input from their peers (Farley, 2004), there were more written than oral activities in all instructional groups. The instructional sessions were recorded to make sure that the instructor only gave metalinguistic explanation about the target structure in the EI. Additionally, to further ensure the rigorous implementation of the instructional treatments, Table 3 reports the number of opportunities for form-meaning connections, incidental input, and output in the three instructional treatments. See Appendix A for SI and SO activities. The participants in the control group did not receive any instructional treatment. They only completed the language achievement tests that are discussed in the following section.
Table 3. Form-meaning connections, incidental input and output in PI, MOI, and CI groups
Group | Each lexical item | Lexical items (total) | Each grammatical item | Grammatical items (total) |
|---|---|---|---|---|
Form-meaning connections (Referential input-/output-based activities) | | | | |
PI (input) | 12 | 288 | 72 | 288 |
MOI (output) | 12 | 288 | 72 | 288 |
CI (input) | 6 | 144 | 36 | 144 |
CI (output) | 6 | 144 | 36 | 144 |
CI (in total) | 12 | 288 | 72 | 288 |
Incidental input (from Explicit information, Feedback, Affective aural input-/oral output-based activities, Referential oral output-based activities) | | | | |
PI | 15 | 360 | 90 | 360 |
MOI | 21 | 504 | 126 | 504 |
CI | 19 | 456 | 114 | 456 |
Output (Affective written activities) | | | | |
PI | 0 | 0 | 0 | 0 |
MOI | 2 | 48 | 12 | 48 |
CI | 1 | 24 | 6 | 24 |
CI (in total) = CI (input + output)
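The totals in Table 3 can be checked with simple arithmetic. The sketch below assumes that each total is the per-item count multiplied by the number of items, that is, 24 taught lexical items and four classifiers; the variable and function names are illustrative and do not come from the study materials.

```python
# Per-item opportunity counts copied from Table 3: (per lexical item, per grammatical item).
# Assumption: total = per-item count x number of items (24 lexical items, 4 classifiers).
N_LEXICAL_ITEMS = 24
N_CLASSIFIERS = 4

PER_ITEM = {
    "form-meaning connections": {"PI": (12, 72), "MOI": (12, 72), "CI": (12, 72)},
    "incidental input": {"PI": (15, 90), "MOI": (21, 126), "CI": (19, 114)},
    "output": {"PI": (0, 0), "MOI": (2, 12), "CI": (1, 6)},
}


def totals(per_lexical, per_grammatical):
    """Return (lexical total, grammatical total) for one group."""
    return per_lexical * N_LEXICAL_ITEMS, per_grammatical * N_CLASSIFIERS


table3_totals = {
    category: {group: totals(*counts) for group, counts in groups.items()}
    for category, groups in PER_ITEM.items()
}

for category, groups in table3_totals.items():
    for group, (lexical_total, grammatical_total) in groups.items():
        print(f"{category:26s} {group:4s} lexical={lexical_total:4d} grammatical={grammatical_total:4d}")
```

Under this assumption the lexical and grammatical totals coincide within every row, which matches the totals reported in the table.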
The first author was the instructor. Instruction was delivered in simple Chinese, and English was used only when participants had difficulty understanding Chinese. For learners who did not speak English, the instructor used body movements, gestures and pictures to help them understand. As students had received a worksheet with the EI of the target structure, they had the vocabulary needed to complete the activities.
Testing
Participants completed a written untimed grammatical judgment test (UGJT), designed as a measure of primarily explicit knowledge, an oral elicited imitation test (EIT), designed as a measure of primarily implicit knowledge, and a counting span task, designed as a measure of working memory. Both UGJT and EIT focused on the same four classifiers, which were applied to 24 previously introduced lexical items (i.e., old items). Another 20 new lexical items that had not been covered in the instructional treatments (i.e., new items) were also involved. Both old and new items were used in both grammatical and ungrammatical contexts. The EIT was always given prior to the UGJT in order to reduce any possible impact of the written test on the oral test.
There were 28 test items in the UGJT, comprising 15 old and 13 new items. The UGJT was given to participants as a paper-and-pencil test, and participants were allowed as much time as they needed to complete it. Learners were required to indicate whether a sentence was grammatically correct, incorrect, or unclear. Following Mackey and Gass (2015), learners were also asked to identify and correct the error in each sentence they judged incorrect, so that they could not mark a sentence as ungrammatical without knowing what the mistake was or how to rectify it. Vocabulary annotation was provided in the form of pictures, and learners were allowed to ask vocabulary-related questions. Pinyin was provided for each character in every item. As the study was not intended to measure learners’ ability to write Chinese characters, participants were allowed to correct the errors in pinyin. No points were given to participants who did not judge whether a sentence was grammatically correct or who did not supply the correct forms of the target structure. The UGJT was scored only for identifying and correcting the wrong use of the classifiers; therefore, the maximum score on the UGJT was 44 (i.e., 24 old and 20 new items for the four classifiers). The UGJT demonstrated high reliability, α = 0.971.
The EIT consisted of 32 statements, including 15 old items and 17 new items. It was administered one-on-one to participants in a laboratory room by the first author. The statements were presented to participants on a recording, and each statement was played once. For each statement, participants first chose and marked on their paper the picture that best corresponded to the meaning of the statement, and were then asked to repeat the statement in correct Chinese. Only statements that they had correctly comprehended, as shown by circling the right picture, were scored for correct repetition of the target grammatical items. Grammatical items were scored as correct (1 point) if the target structure was correctly repeated, and ungrammatical items were scored as correct if the target structure was corrected (1 point) or supplied in obligatory contexts (1 point). Participants could thus score up to a total of 44 points (i.e., 24 old and 20 new items for the four classifiers). The scoring focused only on the use of the target grammatical items, that is, the four Chinese classifiers; the rest of the repeated sentence was ignored. Unlike in the UGJT, no vocabulary annotation was provided and learners were not allowed to ask vocabulary-related questions, because some test items required test takers to demonstrate their understanding of the vocabulary. The reliability of the EIT was also high, α = 0.968. Learners’ recordings of the EIT were also rated by a second rater, a native speaker and postgraduate student in Teaching Chinese as a second/foreign language, who rated 20% of the test data (n = 22). The inter-rater reliability of the EIT was high, r = 0.851.
The counting span task measures learners’ working memory capacity to store and process information simultaneously (Conway et al., 2005). The task used in the present study was created by Engle et al. (1999), and its reliability and validity have been established. The task was delivered to participants one-on-one by the first author via Microsoft PowerPoint in a laboratory room. Each display consisted of randomly arranged dark blue circles, light blue circles, and dark blue squares. The dark blue circles were the targets; the dark blue squares and light blue circles were distractors. Participants were required to count the targets aloud in their first language and to repeat the final digit. For example, if there were four dark blue circles on the screen, a participant whose first language was English would say aloud ‘one-two-three-four-four’. When the final digit was repeated, the first author pressed a key to advance to the next display, and counting began immediately. The number of targets in a display varied from three to nine, with three trials of each. After two to eight displays, a recall cue (i.e., ‘?’ in this case) was presented, at which point participants wrote down the total number of targets in each of the preceding displays, in the serial order in which they occurred. There were 105 stimuli in total, with 1 point given for each correct recall; the maximum score on the counting span task was therefore 105. The reliability of the counting span task was estimated using the scores of the students in the experimental groups (n = 65). The reliability estimate was acceptable, α = 0.749.
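The maximum score of 105 is consistent with one plausible task layout, sketched below. The layout is an assumption rather than something the text states outright: three recall sets at each set size from two to eight displays, with one point for each count recalled in its correct serial position. The function names are hypothetical.

```python
# Assumption (not stated outright in the text): three recall sets at each set
# size from two to eight displays, one point per correctly recalled count.
SET_SIZES = range(2, 9)        # two to eight displays before the recall cue
TRIALS_PER_SET_SIZE = 3


def max_counting_span_score(set_sizes=SET_SIZES, trials=TRIALS_PER_SET_SIZE):
    """Total number of displays, i.e. the maximum attainable score."""
    return sum(size * trials for size in set_sizes)


def score_recall(presented, recalled):
    """One point for each count recalled in its correct serial position."""
    return sum(1 for p, r in zip(presented, recalled) if p == r)


print(max_counting_span_score())             # 105
print(score_recall([4, 7, 3], [4, 3, 3]))    # 2 (first and third positions match)
```

Strict serial-position scoring is only one of several scoring schemes used with span tasks; it is chosen here because the participants were asked to recall the counts in the order in which the displays occurred.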
Procedure
The present study comprised three sessions on three separate days. During Session 1, participants completed the UGJT and EIT pre-tests, and the counting span task. In Session 2, participants in the three experimental groups received instructional treatments that included two lessons, each of 50 min duration with a 10 min break between the two lessons (100 min in total). Learners were then given a 10 min break and proceeded to take the immediate UGJT and EIT posttests (posttest 1). In Session 3, which took place one week after Session 2, learners completed the delayed UGJT and EIT posttests (posttest 2). The participants in the control group completed only the UGJT and EIT.
Data analysis
The assumption of normality was checked using the Shapiro–Wilk test. As some test data violated the assumption of normality, data analyses were performed using Generalized Estimating Equations (GEE). This approach was employed because it accounts for correlation among repeated measures over time and can provide robust estimates without imposing normality assumptions on the data. Immediate and delayed posttest scores on the UGJT and EIT for both old and new items were included as outcome (dependent) variables. Group (PI, MOI, CI and control group) and time (immediate and delayed posttest) were included as independent variables. The working memory test scores were included as a covariate. The pretest score was also included as a covariate in the GEE model, rather than as part of the repeated measures analysis, because the measurement conditions of the pretest and of the subsequent repeated measures were different; treating it as a covariate controls for its potential influence on the outcome. Mean scores and standard deviations of each group were calculated. The significance level was set at 0.05 for all statistical tests. The Kruskal–Wallis H test was used to further explore differences among groups, and the Mann–Whitney U test was used to compare the scores of two groups at a specific time point. Bonferroni correction was applied to these pairwise comparisons, and the revised significance level was set at 0.0083. These non-parametric tests provided additional insights into the data and helped to confirm the findings from the GEE analysis.
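The revised significance level of 0.0083 is what a Bonferroni correction yields when the 0.05 level is divided across all pairwise comparisons among four groups. That the correction was computed over exactly these six pairs is an assumption; the text reports only the resulting threshold.

```python
from itertools import combinations

GROUPS = ["PI", "MOI", "CI", "Control"]
ALPHA = 0.05

# Every unordered pair of groups corresponds to one Mann-Whitney U comparison.
pairs = list(combinations(GROUPS, 2))
adjusted_alpha = ALPHA / len(pairs)

print(len(pairs))                  # 6
print(round(adjusted_alpha, 4))    # 0.0083
```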
One-way ANOVAs found no significant differences among the groups on the pretest UGJT scores for classifiers used with the old items, F(3, 82) = 0.031, p = 0.993, or with the new items, F(3, 82) = 0.219, p = 0.883; on the pretest EIT scores for classifiers used with the old items, F(3, 28) = 1.121, p = 0.345, or with the new items, F(3, 28) = 0.546, p = 0.652; or on the working memory test, F(2, 62) = 0.375, p = 0.689. These results indicate that any differences among groups on posttest scores cannot be attributed to prior knowledge of the target structures.
Results
Descriptive statistics for the UGJTs and EITs are presented in Table 4, which gives the number of participants in each group and the means and standard deviations (SD) for each test condition. For both tests, two scores are given: one for old items and one for new items. GEE was used to test the effects of the instructional groups on the UGJTs and EITs for both old and new items.
Table 4. Descriptive statistics for the UGJT and EIT
Items | Statistic | PI (n = 22) UGJT | PI (n = 22) EIT | MOI (n = 20) UGJT | MOI (n = 20) EIT | CI (n = 23) UGJT | CI (n = 23) EIT | Control (n = 21) UGJT | Control (n = 21) EIT |
|---|---|---|---|---|---|---|---|---|---|
Pretest | | | | | | | | | |
Old | Mean | 9.73 | 7.96 | 9.95 | 8.40 | 9.87 | 6.50 | 9.95 | 7.05 |
Old | SD | 2.35 | 4.79 | 3.55 | 3.65 | 2.77 | 3.13 | 2.38 | 3.06 |
New | Mean | 5.36 | 5.44 | 5.30 | 6.15 | 5.26 | 5.86 | 5.71 | 5.00 |
New | SD | 1.81 | 3.63 | 1.72 | 3.92 | 2.40 | 2.36 | 2.12 | 2.12 |
Posttest 1 | | | | | | | | | |
Old | Mean | 23.82 | 20.13 | 23.95 | 23.90 | 23.78 | 22.23 | 10.57 | 7.05 |
Old | SD | 0.50 | 4.95 | 0.22 | 0.45 | 0.52 | 3.49 | 2.73 | 2.80 |
New | Mean | 17.86 | 13.26 | 18.00 | 18.15 | 17.78 | 15.46 | 6.10 | 5.67 |
New | SD | 1.70 | 4.07 | 1.92 | 2.30 | 1.93 | 4.08 | 2.59 | 1.77 |
Posttest 2 | | | | | | | | | |
Old | Mean | 23.64 | 18.57 | 23.60 | 23.30 | 23.70 | 20.91 | 10.24 | 7.14 |
Old | SD | 0.49 | 5.34 | 0.60 | 0.98 | 0.47 | 3.22 | 2.49 | 2.43 |
New | Mean | 17.18 | 11.83 | 17.45 | 18.90 | 17.22 | 13.32 | 5.81 | 5.62 |
New | SD | 1.82 | 4.39 | 2.19 | 1.65 | 1.91 | 3.43 | 2.68 | 1.94 |
Posttest 1 = Immediate posttest; Posttest 2 = Delayed posttest
For both old and new items of the UGJTs, the GEE results showed significant effects of time (old items: Wald χ2 = 10.647, df = 1, p < 0.01; new items: Wald χ2 = 12.936, df = 1, p < 0.01) and group (old items: Wald χ2 = 593.808, df = 3, p < 0.01; new items: Wald χ2 = 382.365, df = 3, p < 0.01) but no interaction between group and time (old items: Wald χ2 = 2.944, df = 3, p > 0.05; new items: Wald χ2 = 1.933, df = 3, p > 0.05). The Kruskal–Wallis H test showed significant differences among the four groups on posttest 1 (χ2 = 66.880, df = 3, p < 0.01) and posttest 2 (χ2 = 54.826, df = 3, p < 0.01) for the old items, and on posttest 1 (χ2 = 47.955, df = 3, p < 0.01) and posttest 2 (χ2 = 47.834, df = 3, p < 0.01) for the new items. Mann–Whitney U tests showed that the three experimental groups outperformed the control group (all p-values < 0.0083), but no significant difference was found among the three instructional groups on either posttest for either old or new items.
For both old and new items of the EITs, the GEE results revealed significant effects of time (old items: Wald χ2 = 33.413, df = 1, p < 0.01; new items: Wald χ2 = 22.968, df = 1, p < 0.01) and group (old items: Wald χ2 = 882.481, df = 3, p < 0.01; new items: Wald χ2 = 550.098, df = 3, p < 0.01), as well as interaction effects between group and time (old items: Wald χ2 = 18.619, df = 3, p < 0.01; new items: Wald χ2 = 52.217, df = 3, p < 0.01). The Kruskal–Wallis H test found significant differences among the four groups on posttest 1 (χ2 = 57.444, df = 3, p < 0.01) and posttest 2 (χ2 = 56.506, df = 3, p < 0.01) for the old items, and on posttest 1 (χ2 = 52.589, df = 3, p < 0.01) and posttest 2 (χ2 = 59.529, df = 3, p < 0.01) for the new items. Mann–Whitney U tests revealed that the three experimental groups outperformed the control group (all p-values < 0.0083) on both posttests for both old and new items. For the old items, the MOI group performed significantly better than the PI and CI groups on posttest 1 (MOI vs. CI: U = 120.500, Z = −3.146, p = 0.002, r = 0.485; MOI vs. PI: U = 78.500, Z = −4.064, p < 0.001, r = 0.620) and on posttest 2 (MOI vs. CI: U = 85.500, Z = −3.487, p < 0.001, r = 0.538; MOI vs. PI: U = 60.000, Z = −4.225, p < 0.001, r = 0.644). For the new items, the MOI group outperformed the PI group on posttest 1 (U = 62.000, Z = −4.118, p < 0.001, r = 0.628) and posttest 2 (U = 23.000, Z = −5.086, p < 0.001, r = 0.776), and the MOI group outperformed the CI group on posttest 2 (U = 20.000, Z = −5.089, p < 0.001, r = 0.785). No significant difference was found between the CI and the PI groups on either posttest for either old or new items.
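The r values reported alongside the Mann–Whitney U tests are effect sizes. One common convention, which the text does not confirm was the one used here, derives r from the standardized test statistic as r = |Z| / √N, where N is the pooled size of the two groups being compared. The sketch below illustrates that convention; the pooled N of 43 is an illustrative input, not a figure taken from the paper, and the labels follow Cohen's conventional benchmarks for r.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r = |Z| / sqrt(N) for a two-group rank-based comparison."""
    return abs(z) / math.sqrt(n_total)


def cohen_label(r):
    """Cohen's conventional benchmarks for r (0.1 small, 0.3 medium, 0.5 large)."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"


# Illustrative inputs: a Z value of -4.064 and an assumed pooled N of 43.
r = effect_size_r(-4.064, 43)
print(round(r, 3), cohen_label(r))
```

By these benchmarks, all of the between-group effect sizes reported above for the MOI comparisons (0.485 to 0.785) would count as medium to large.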
Table 5 presents the descriptive statistics for the counting span task, including the means and standard deviations (SD) for the three experimental groups. In the GEE model, we included group as a categorical predictor; the GEE procedure handles categorical variables directly, without manual transformation into dummy variables. For clarity of interpretation, we specified the CI group as the reference group. The GEE analysis showed no interaction effect between working memory and instructional group. This suggests that working memory did not moderate the effects of the instructional treatments on learners’ explicit and implicit knowledge.
Table 5. Descriptive statistics for the counting span task
Statistic | PI (n = 22) | MOI (n = 20) | CI (n = 23) |
|---|---|---|---|
Mean | 80.78 | 81.30 | 83.77 |
SD | 13.39 | 12.32 | 10.94 |
Discussion
Development of explicit knowledge
Research question 1 concerned the relative effects of the three instructional treatments on measures of explicit knowledge. The results showed that the three instructional groups improved significantly and equally on the UGJTs for classifiers used with both old and new items. This finding suggests that the three instructional treatments may help learners develop explicit knowledge of the target structure, which leads to their improved and similar performance on the UGJT. The results for the new items also support the claim that the three instructional treatments may promote learners’ explicit knowledge, and they suggest that learners were able to generalize their knowledge to classifiers used with items that were not taught during instruction.
The results of the present study contradict those of Erlam et al. (2009), who found that neither the input-based nor the output-based instructional group improved on the UGJT for new items on either posttest, but they are consistent with Erlam et al.’s results for old items, on which both groups improved on both posttests. Similarly, Mostafa and Kim (2021) reported superior effects for both PI and output-based instruction on measures of explicit knowledge. One possible reason why the three instructional groups performed equally on the UGJT for both old and new items in the present study is that the three treatments may facilitate the development of explicit knowledge and the processes (i.e., semantic processing, noticing, and reflecting; Ellis, 2004) required for completion of the UGJT. For example, the EI gave learners information about each classifier, such as the meaning, pronunciation, and rules, which enabled them to understand the meaning of the target structures, and the SI/SO activities helped learners make form-meaning connections, thus involving the first process, semantic processing. The feedback enabled learners to notice the gap between what they knew about the target language structures and what they did not know, and may have helped them reflect on the target grammatical structure, thus encouraging the second and third processes, noticing and reflecting. Another possible explanation relates to the measures employed in the present study. As argued by Shintani (2015), the measures used in previous studies essentially required learners to use the processing strategy. Therefore, a PI group that received training on the input processing strategy would have an advantage in the interpretation tests, which also assessed their understanding of the processing strategy. However, the UGJT used in the present study required learners to demonstrate explicit knowledge, such as grammatical judgement of sentences containing the target structure. Thus, the PI group did not have an advantage over the other two groups, which also received information and training to facilitate the development of explicit knowledge and could therefore perform as well as the PI group.
Development of implicit knowledge
Research question 2 concerned the comparative effects of the three instructional treatments on the development of implicit knowledge. First, the results showed significant gains for the three instructional groups compared to the control group. These findings indicate that the three instructional treatments may affect learners’ development of implicit knowledge and their generalization of that knowledge to classifiers used with the new items. In addition, the MOI group made greater gains than the PI and CI groups for classifiers used with the old items on both posttests; for the new items, it outperformed the PI group on both posttests and the CI group on the delayed posttest. This result suggests that meaningful output may have an advantage in facilitating the development of implicit knowledge.
The finding that the MOI group outperformed the PI group on the EIT is compatible with the results of Mostafa and Kim (2021), who reported better gains for output-based instruction than for PI on measures of implicit/automatized explicit knowledge. It is also compatible with the findings of two meta-analyses showing that output-based instruction had a greater effect on production tests and that, when both instructional groups received the same explicit information, the output-based instruction group outperformed the input-based group on production tests (Shintani, 2015; Shintani et al., 2013). Moreover, the finding that the MOI group outperformed the PI and CI groups on the delayed posttest is consistent with the meta-analysis of Shintani et al. (2013), which suggests more durable effects for output-based instruction on measures of productive knowledge that might be implicit knowledge. However, the results for the PI group in the present study contradict those of Erlam et al. (2009), who found that only the output-based instruction group improved on both EIT posttests.
The greater gains of the MOI group relative to the PI group could be attributed to a few possible reasons. First, they can be explained by the skill-specificity claim of skill acquisition theory, namely that output practice is better for production skills (DeKeyser, 2007). Another possible reason relates to differences in the feedback provided. The feedback in the MOI group, unlike that in the PI group, asked learners to produce the target language structures. This could give learners in the MOI group more opportunities to (1) produce the target grammatical structures orally and receive more incidental input than those in the PI group; and (2) notice the gap between their errors and the correct form, reflect on their own language use, and test out their hypotheses (Swain, 2000). However, these reasons cannot explain why the CI group did not perform at the same level as the MOI group on the EIT, since the CI group experienced the same type of output-based activities. It is possible that the amount of output learners produced, rather than the nature of the language activities learners received, played a role in the results. A series of chi-square tests was conducted to examine whether the number of form-meaning connections, incidental input, and output opportunities each group received differed significantly. The results showed that learners in the MOI group produced significantly more grammatical items than those in the CI group (χ2 = 48.00, df = 1, p < 0.01; χ2 = 8.00, df = 1, p < 0.01). Skill acquisition theory does not account for the improved performance of the PI group on the EIT in the present study, given that the PI group never engaged in output-based language activities. The results for the PI group could instead be explained by Input Processing Theory: PI appeared to help learners obtain better intake and incorporate new language forms into their developing language systems, which can then be accessed for production (VanPatten, 2004). The findings for both old and new items also provided some evidence that PI aids L2 learners’ developing system.
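The chi-square comparison of output opportunities reported in this section can be illustrated with the output totals from Table 3 (PI = 0, MOI = 48, CI = 24). The sketch below assumes a goodness-of-fit test with equal expected frequencies for the two groups being compared; the text does not specify the form of the test or which pair of groups each statistic refers to, so both are assumptions.

```python
def chi_square_equal_expected(observed):
    """Goodness-of-fit chi-square against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


# Output totals copied from Table 3 (affective written activities).
OUTPUT_TOTALS = {"PI": 0, "MOI": 48, "CI": 24}

moi_vs_pi = chi_square_equal_expected([OUTPUT_TOTALS["MOI"], OUTPUT_TOTALS["PI"]])
moi_vs_ci = chi_square_equal_expected([OUTPUT_TOTALS["MOI"], OUTPUT_TOTALS["CI"]])

# Two categories give one degree of freedom in each comparison.
print(moi_vs_pi)   # 48.0
print(moi_vs_ci)   # 8.0
```

Under these assumptions, the two values coincide with the two statistics reported above, each with df = 1.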
Working memory
Research question 3 addressed whether working memory moderates the relationship between instructional groups and the development of L2 knowledge. The findings showed that working memory did not moderate the effects of the three instructional groups on learners’ development of L2 explicit and implicit knowledge. The results were not congruent with those from other studies (Erçetin & Alptekin, 2013; Li et al., 2019; Martin & Ellis, 2012; Pawlak & Biedroń, 2019, 2021; Révész, 2012; Serafini & Sanz, 2016; Tagarelli et al., 2015) and were surprising, since previous empirical evidence has shown that working memory is likely to be related to measures of explicit knowledge, as working memory functions involve consciousness and attentional control processes (Baddeley, 2003; Ellis, 2005). The results could possibly be attributed to the difference between the counting span task, which measured learners’ capacity to perform the dual task of information processing and storage (Conway et al., 2005; Engle et al., 1999), and the UGJTs. That is, whereas learners were under time pressure when performing the counting span task, they completed the UGJTs without time pressure. Therefore, learners with lower working memory capacity would have had enough time to process the sentence stimuli and complete the UGJTs.
An alternative explanation is that the UGJTs may require other cognitive abilities, such as language analytic ability. Previous studies (Li et al., 2019; Suzuki & DeKeyser, 2017) suggest that executive working memory is needed in tasks that impose heavy online processing demands. In the present study, the UGJTs asked learners to make grammatical judgments about the target linguistic structures and correct the ungrammatical sentences. The UGJT for classifiers used with the new items required learners to generalize their grammatical knowledge of the classifiers so that they could match them with lexical items that were not taught during instruction. Therefore, it seems that learners’ ability to analyze, deduce and generalize the grammatical rules of the classifiers may play a greater role than processing and storing information in working memory.
The lack of effect of working memory on the measures of implicit knowledge was not surprising. Working memory is considered an online conscious process that involves the active storage and processing of information, as well as the conscious control and selective attention to information. In contrast, implicit knowledge is acquired unconsciously and intuitively, and its utilization is typically automated. This disparity between controlled and automated processes suggests that working memory is unlikely to directly influence the automated use of implicit knowledge. Some studies, however, report contradictory results, showing that working memory is correlated with implicit knowledge (Kim et al., 2015; Li et al., 2013, 2019; Révész, 2012) or that working memory is weakly related to explicit productive knowledge, with the relationship mediated by overall language proficiency (Pawlak & Biedroń, 2021). Clearly, the relationship between working memory and language knowledge type awaits further investigation (Roehr-Brackin, 2024).
Conclusion
The present study examined the relative effects of PI, MOI, and CI on learners’ development of explicit and implicit knowledge of four Chinese classifiers and the correlations with individual differences in working memory. Overall, the findings suggested superior effects for all instructional groups in developing explicit knowledge and better performance for the MOI group compared to the PI and CI groups in developing implicit knowledge of the target structure. The results also indicate that PI, MOI, and CI were effective in helping learners generalize their knowledge to items that were not introduced during the treatments. In pedagogical contexts, teachers are encouraged to design input- and output-based activities that require form-meaning connections to develop explicit and implicit knowledge of the target linguistic feature. Moreover, the results also showed negative correlations between individual differences in working memory capacity and the UGJTs for the MOI group. These correlations suggested that learners who had low information processing and storage abilities might benefit more from the MOI.
The present study has a number of limitations. First, this study did not control for students’ first language background, which might have affected the results. Second, the participants in the present study were not randomly assigned to groups. This might affect the comparability between groups and the generalizability of the findings, as non-randomized participant grouping might introduce confounding variables (e.g., age, gender) that influence the relationship between instructional treatments and language achievement. Therefore, future research should incorporate random assignment to alleviate these concerns and enhance the generalizability of the study’s conclusions. Third, the first author was also the instructor. Researchers’ expectations might affect participants’ performance in different experimental groups and the findings of the study (Li, 2022). Future studies should recruit classroom teachers as much as possible to avoid researcher bias. Fourth, the interval between the two posttests was short. Insufficient time had elapsed to confirm whether the MOI had long-term effects, whereas in most PI studies the delayed post-testing took place after more than two weeks (e.g., Benati, 2005; Benati & Batziou, 2019a, 2019b; VanPatten & Fernández, 2004). Therefore, interpretations of the long-term effects should be tentative. Finally, the present study employed only a single test to measure each knowledge type, and the EIT has been argued to measure automatized explicit knowledge rather than implicit knowledge (Suzuki, 2017; Suzuki & DeKeyser, 2015, 2017). Therefore, future investigations should adopt different types of measures of each knowledge type.
Acknowledgements
The authors would like to thank all the participants and express their gratitude to the anonymous reviewers for their valuable feedback and suggestions.
Author contributions
YL conceived and designed the study, collected, and analyzed the data and drafted the manuscript. YL finalized it for submissions as the corresponding author. All authors read and approved the final manuscript.
Funding
This work is financially supported by the 2025 Self-Initiated Research Project of the College of International Education, Minzu University of China, titled “The Impact Mechanism and Enhancement Pathways of International Chinese Language Teachers’ Intercultural Conflict Management Styles on Bicultural Identity” (Project No. GJKY2025-06).
Availability of data and materials
The authors declare that the data supporting the results of the study are available within the manuscript.
Declarations
Ethical approval and consent to participate
All participants received the Participants Information Sheet and filled in the Consent Form and willingly participated in the study.
Human and animal rights
The studies involving human participants were reviewed and approved by the University of Auckland Human Participants Ethics Committee, with Y. Liang as the first researcher and L.J. Zhang as the PhD research thesis supervisor.
Competing interests
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
Analysis of variance
Combined instruction
Explicit information
Elicited Imitation Test
Second language
Meaning-based output instruction
Processing instruction
Standard deviations
Structured input
Structured output
Traditional instruction
Untimed Grammatical Judgment Test
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Baddeley, A. The episodic buffer: A new component of working memory?. Trends in Cognitive Sciences; 2000; 4,
Baddeley, A. Working memory: Looking back and looking forward. Nature Reviews Neuroscience; 2003; 4,
Baddeley, AD; Hitch, G. Working memory. Psychology of Learning and Motivation; 1974; 8, pp. 47-89. [DOI: https://dx.doi.org/10.1016/S0079-7421(08)60452-1]
Benati, A. A comparative study of the effects of processing instruction and output-based instruction on the acquisition of the Italian future tense. Language Teaching Research; 2001; 5,
Benati, A. The effects of processing instruction, traditional instruction and meaning-output instruction on the acquisition of the English past simple tense. Language Teaching Research; 2005; 9,
Benati, A. Japanese language teaching: A communicative approach; 2009; Bloomsbury Publishing:
Benati, A; Batziou, M. Discourse and long-term effects of isolated and combined structured input and structured output on the acquisition of the English causative form. Language Awareness; 2019; 28,
Benati, A; Batziou, M. The relative effects of isolated and combined structured input and structured output on the acquisition of the English causative forms. International Review of Applied Linguistics in Language Teaching; 2019; 57,
Cadierno, T. Formal instruction from a processing perspective: An investigation into the Spanish past tense. The Modern Language Journal; 1995; 79,
Cohen, J. Statistical power analysis for the behavioral sciences; 2013; Routledge: [DOI: https://dx.doi.org/10.4324/9780203771587]
Collentine, J. (2004). Commentary: Where PI research has been and where it should be going. In processing instruction: theory, research, and commentary, (pp. 185–198).
Conway, AR; Kane, MJ; Bunting, MF; Hambrick, DZ; Wilhelm, O; Engle, RW. Working memory span tasks: A methodological review and user’s guide. Psychonomic Bulletin & Review; 2005; 12,
De Jong, N. Can second language grammar be learned through listening? An experimental study. Studies in Second Language Acquisition; 2005; 27,
DeKeyser, R. DeKeyser, R; VanPatten, B; Williams, J. Skill acquisition theory. Theories in Second Language Acquisition: An Introduction; 2007; Routledge: pp. 94-112.
DeKeyser, R. Interactions between individual differences, treatments, and structures in SLA. Language Learning; 2012; 62,
DeKeyser, R. (2017). Knowledge and skill in ISLA. In The Routledge handbook of instructed second language acquisition (pp. 15–32). Routledge.
DeKeyser, RM. Introduction to the special issue of aptitude-treatment interaction in second language learning. Journal of Second Language Studies; 2019; 2, pp. 165-168. [DOI: https://dx.doi.org/10.1075/jsls.00007.int]
DeKeyser, RM; Prieto Botana, G. The effectiveness of processing instruction in L2 grammar acquisition: A narrative review. Applied Linguistics; 2015; 36, pp. 290-305. [DOI: https://dx.doi.org/10.1093/applin/amu071]
DeKeyser, RM; Sokalski, KJ. The differential role of comprehension and production practice. Language Learning; 1996; 46,
DeKeyser, R; Salaberry, R; Robinson, P; Harrington, M. What gets processed in processing instruction? A commentary on Bill VanPatten’s “processing instruction: An update”. Language Learning; 2002; 52,
Doughty, C. (2004). Commentary: When PI is focus on form it is very, very good, but when it is focus on forms. In processing instruction: theory, research, and commentary, (pp. 257–270).
Ellis, R. The definition and measurement of L2 explicit knowledge. Language Learning; 2004; 54,
Ellis, NC. At the interface: Dynamic interactions of explicit and implicit language knowledge. Studies in Second Language Acquisition; 2005; 27,
Ellis, R. (2009). Implicit and explicit knowledge in second language learning, testing and teaching (Vol. 42). Multilingual Matters.
Engle, RW; Laughlin, JE; Tuholski, SW; Conway, ARA. Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General; 1999; 128,
Erçetin, G; Alptekin, C. The explicit/implicit knowledge distinction and working memory: Implications for second-language reading comprehension. Applied Psycholinguistics; 2013; 34,
Erlam, R. Language aptitude and its relationship to instructional effectiveness in second language acquisition. Language Teaching Research; 2005; 9,
Erlam, R. (2006). Elicited imitation as a measure of L2 implicit knowledge: An empirical validation study. Applied Linguistics, 27(3), 464–491. https://doi.org/10.1093/applin/aml001
Erlam, R; Wei, L. The importance of increased processing demands in the design of elicited imitation tests. Language Teaching Research; 2021; 28,
Erlam, R., Loewen, S., Philp, J., Ellis, R., Elder, C., & Reinders, H. (2009). The roles of output-based and input-based instruction in the acquisition of L2 implicit and explicit knowledge. In implicit and explicit knowledge in second language learning, testing and teaching (pp.241–261) Multilingual Matters.
Farley, A. Authentic processing instruction and the Spanish subjunctive. Hispania; 2001; 84,
Farley, A. Processing instruction and meaning-based output instruction: A comparative study. Spanish Applied Linguistics; 2001; 5,
Farley, A; Aslan, E. The relative effects of processing instruction and meaning-based output instruction on L2 acquisition of the English subjunctive. ELT Research Journal; 2012; 1,
Farley, A. (2004). The relative effects of processing instruction and meaning-based output instruction. In processing instruction: theory, research, and commentary, (pp. 143–168).
Harrington, M. (2004). Commentary: Input processing as a theory of processing input.In: processing instruction: theory, research, and commentary, (pp. 81–94).
Isbell, DR; Rogers, J. Winke, P; Brunfaut, T. Measuring implicit and explicit learning and knowledge. The routledge handbook of second language acquisition and language testing; 2021; Routledge: pp. 304-313.
Kang, EY; Sok, S; Han, Z. Thirty-five years of ISLA on form-focused instruction: A meta-analysis. Language Teaching Research; 2019; 23,
Keating, GD; Farley, AP. Processing instruction, meaning-based output instruction, and meaning-based drills: Impacts on classroom L2 acquisition of Spanish object pronouns. Hispania; 2008; 91,
Kim, YouJin; Payant, C; Pearson, P. The intersection of task-based interaction, task complexity, and working memory: L2 question development through recasts in a laboratory setting. Studies in Second Language Acquisition; 2015; 37,
Kirk, RW. The effects of processing instruction with and without output: Acquisition of the Spanish subjunctive in three conjunctional phrases. Hispania; 2013; 96,
Lee, J. F., & VanPatten, B. (2003). Making communicative language teaching happen (2nd ed.). McGraw-Hill.
Li, S. The associations between language aptitude and second language grammar acquisition: A meta-analytic review of five decades of research. Applied Linguistics; 2015; 36,
Li, S; Sanz, C; Lado, B. Sanz, C; Lado, B; Bourns, SK. The differential roles of language analytic ability and working memory in mediating the effects of two types of feedback on the acquisition of an opaque linguistic structure. Individual Differences, L2 Development & Language Program Administration: From Theory to Application; 2013; Cengage Learning: pp. 32-52.
Li, S; Ellis, R; Zhu, Y. The associations between cognitive ability and L2 development under five different instructional conditions. Applied Psycholinguistics; 2019; 40,
Li, S. (2017). The effects of cognitive aptitudes on the process and product of L2 interaction: A synthetic review. In L. Gurzynski-Weiss (Ed.), AILA Applied Linguistics Series (Vol. 16, pp. 42–70). John Benjamins Publishing Company. https://doi.org/10.1075/aals.16.03li
Li, S. (2022). Quantitative research methods in ISLA. In L. Gurzynski-Weiss spsampsps Y. Kim (Eds.), Research Methods in Applied Linguistics (Vol. 3, pp. 31–54). John Benjamins Publishing Company. https://doi.org/10.1075/rmal.3.02li
Linck, JA; Osthus, P; Koeth, JT; Bunting, MF. Working memory and second language comprehension and production: A meta-analysis. Psychonomic Bulletin & Review; 2014; 21,
Mackey, A; Gass, S. Second language research: methodology and design; 2015; 2 nd ed. Erlbaum: [DOI: https://dx.doi.org/10.4324/9781315750606]
Marsden, E; Chen, H-Y. The roles of structured input activities in processing instruction and the kinds of knowledge they promote. Language Learning; 2011; 61,
Martin, KI; Ellis, NC. The roles of phonological short-term memory and working memory in L2 grammar and vocabulary learning. Studies in Second Language Acquisition; 2012; 34,
Morgan-Short, K; Bowden, HW. Processing instruction and meaningful output-based instruction: Effects on second language development. Studies in Second Language Acquisition; 2006; 28,
Mostafa, T; Kim, Y. The effects of input and output based instruction on the development of L2 explicit and automatised explicit knowledge: A classroom based study. Language Awareness; 2021; 30,
Mystkowska-Wiertelak, A. The effects of a combined output and input-oriented approach in teaching reported speech. Research in Language; 2011; 9,
Norris, JM; Ortega, L. Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning; 2000; 50,
Pawlak, M; Biedroń, A. Verbal working memory as a predictor of explicit and implicit knowledge of English passive voice. Journal of Second Language Studies; 2019; 2,
Pawlak, M; Biedroń, A. Working memory as a factor mediating explicit and implicit knowledge of English grammar. Annual Review of Applied Linguistics; 2021; 41, pp. 118-125. [DOI: https://dx.doi.org/10.1017/s0267190521000052]
Qin, J. The effect of processing instruction and dictogloss tasks on acquisition of the English passive voice. Language Teaching Research; 2008; 12,
Révész, A. Working memory and the observed effectiveness of recasts on different L2 outcome measures. Language Learning; 2012; 62,
Robinson, P. Cognitive complexity and task sequencing: Studies in a componential framework for second language task design. International Review of Applied Linguistics in Language Teaching; 2005; 43,
Robinson, P. (2002). Effects of individual differences in intelligence, aptitude and working memory on adult incidental SLA. In individual differences and instructed language learning, (pp. 211–266). John Benjamins Publishing Company.
Roehr-Brackin, K. Explicit and implicit knowledge and learning of an additional language: A research agenda. Language Teaching; 2024; 57, pp. 68-86. [DOI: https://dx.doi.org/10.1017/S026144482200026X]
Santamaria, K; Sunderman, G. Working memory in processing instruction: The acquisition of L2 French clitics. Working Memory in Second Language Acquisition and Processing; 2015; 87, 205.
Sanz, C; Lin, H-J; Lado, B; Stafford, CA; Bowden, HW. One size fits all? learning conditions and working memory capacity in ab initio language development. Applied Linguistics; 2016; 37,
Serafini, EJ; Sanz, C. Evidence for the decreasing impact of cognitive ability on second language development as proficiency increases. Studies in Second Language Acquisition; 2016; 38,
Shintani, N. The effectiveness of processing instruction and production-based instruction on l2 grammar acquisition: A meta-analysis. Applied Linguistics; 2015; 36,
Shintani, N; Li, S; Ellis, R. Comprehension-based versus production-based grammar instruction: A meta-analysis of comparative studies. Language Learning; 2013; 63,
Skehan, P. Foreign language aptitude and its relationship with grammar: A critical overview. Applied Linguistics; 2015; 36,
Spada, N; Tomita, Y. Interactions between type of instruction and type of language feature: A meta-analysis. Language Learning; 2010; [DOI: https://dx.doi.org/10.1111/j.1467-9922.2010.00562.x]
Spada, N; Shiu, JL-J; Tomita, Y. Validating an elicited imitation task as a measure of implicit knowledge: Comparisons with other validation studies. Language Learning; 2015; 65,
Suzuki, Y. Validity of new measures of implicit knowledge: Distinguishing implicit knowledge from automatized explicit knowledge. Applied Psycholinguistics; 2017; 38,
Suzuki, Y; DeKeyser, R. Comparing elicited imitation and word monitoring as measures of implicit knowledge: Elicited imitation and word monitoring. Language Learning; 2015; 65,
Suzuki, Y; DeKeyser, R. Exploratory research on second language practice distribution: An Aptitude×Treatment interaction. Applied Psycholinguistics; 2017; 38,
Suzuki, Y; Jeong, H; Cui, H; Okamoto, K; Kawashima, R; Sugiura, M. Fmri reveals the dynamic interface between explicit and implicit knowledge recruited during elicited imitation task. Research Methods in Applied Linguistics; 2023; 2, [DOI: https://dx.doi.org/10.1016/j.rmal.2023.100051] 100051.
Swain, M. (2000). The output hypothesis and beyond: Mediating acquisition through collaborative dialogue. In socio-cultural theory and second language learning (pp. 97–114). Oxford University Press.
Tagarelli, KM; Mota, MB; Rebuschat, P. Wen, Z; Mota, MB; McNeill, A. Working memory, learning conditions and the acquisition of l2 syntax. Working memory in second language acquisition and processing; 2015; Multilingual Matters: pp. 224-247.
Toth, PD. Processing instruction and a role for output in second language acquisition. Language Learning; 2006; 56,
Uludag, O; VanPatten, B. The comparative effects of processing instruction and dictogloss on the acquisition of the English passive by speakers of Turkish. International Review of Applied Linguistics in Language Teaching; 2012; 50,
Vafaee, P; Suzuki, Y; Kachisnke, I. Validating grammaticality judgment tests: Evidence from two new psycholinguistic measures. Studies in Second Language Acquisition; 2017; 39,
VanPatten, B; Cadierno, T. Explicit instruction and input processing. Studies in Second Language Acquisition; 1993; 15,
VanPatten, B; Wong, W. Processing instruction and the French causative: Another replication. Processing Instruction: Theory, Research, and Commentary; 2004; 97, 118.
VanPatten, B; Inclezan, D; Salazar, H; Farley, AP. Processing instruction and dictogloss: A study on object pronouns and word order in Spanish. Foreign Language Annals; 2009; 42,
VanPatten, B., & Fernández, C. (2004). The long-term effects of processing instruction. In processing instruction: theory, research, and commentary, (pp. 273–289).
VanPatten, B. (Ed.). (2004). In processing instruction: theory, research, and commentary. L. Erlbaum Associates.
Wong, W. (2004). The nature of processing instruction. In processing instruction: theory, research, and commentary, (pp. 33–63).
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/.