Predicting upcoming words in a sentence is important in sentence processing. Previous research has shown that children’s vocabulary size and language production skills influence prediction speed. This study investigates whether syntactic complexity affects predictive processing using eye-tracking in a picture-selection task. Three conditions were tested: baseline (object recognition), active sentences (syntactically simple) and passive sentences (syntactically complex). Data were collected from 29 four- and five-year-old Dutch children and 10 Dutch young adults. Results show that adults predict sentence endings quickly and accurately, regardless of complexity. Children predicted in both conditions, but less strongly in passive sentences. These findings suggest that while both adults and children engage in predictive processing, syntactic complexity weakens prediction in children.
1. Introduction
1.1 Predictive processing
The concept of predictive processing is not unique to language comprehension. The idea that our brain can detect patterns in incoming sensory information and make predictions about upcoming input based on previously learned patterns is an important part of human cognition (Harkness & Keshava 2017). Prediction is involved in many different processes at varying levels of cognition. For example, a simple prediction occurs when you are about to eat something that smells sweet. From previous encounters with sweet-smelling food, your brain knows that the food also tasted sweet. Therefore, it predicts that this particular food will again taste sweet. However, if it unexpectedly tastes bitter, there will be surprise, and next time you might be more apprehensive about eating sweet-smelling food.
This same principle of predicting and learning applies to higher cognitive processes such as language (Altmann & Kamide 1999). When hearing or reading the first words of a sentence such as “The boy eats…”, a lot of information at different linguistic levels is already available. At the level of semantics, you can predict from the meaning of the word ‘eat’ that the next word is probably something edible. From a syntactic perspective, you already know that this is an active sentence with a subject and a finite verb. It is then possible to deduce that the verb will most likely be followed by a direct object. Besides the semantic and syntactic information enclosed in this sentence, pragmatic information from a broader context can also influence prediction. For example, if this sentence is uttered in a story about a birthday party, one might predict that the next word is ‘cake’. But, if it is uttered when describing a dinner scene, ‘potatoes’ or ‘rice’ might be more likely.
1.2 Predictive processing in children
Predictive processing is not something that only happens in the adult brain. Research has shown that the concept of predictive processing can be related to different aspects of child development that were previously thought to be separate phenomena, such as statistical learning, motor and proprioceptive learning, and infants’ basic understanding of their physical and social environment (see Köster et al. 2020). In addition, several studies have shown that children are also capable of predicting during language processing. Mani and Huettig (2012) used a preferential looking paradigm to see whether 2-year-old children are able to make predictions based on linguistic input. They found that children already looked at the target picture before hearing the corresponding target word when the target could be predicted from the semantic properties of the verb (e.g. “The boy eats a big cake” with images of cake (semantically related) and bird (unrelated)). They also found differences between children related to their production skills. Whereas children with large productive vocabularies showed evidence of predicting upcoming linguistic input, children with small productive vocabularies did not. Interestingly, the size of their receptive vocabulary was unrelated to prediction skills. In an eye-tracking study with both adults and children (age 3–10 years), Borovsky et al. (2012) investigated prediction in a slightly more complex paradigm where information from both the agent and the finite verb had to be integrated to predict which of four pictures would correspond to the last word of the sentence (e.g. “The pirate hides the treasure” with images of treasure (target, agent- and action-related), ship (agent-related), bone (action-related) and cat (unrelated)). Their results showed that age was not related to prediction abilities, but receptive vocabulary size (corrected for age) was. Anticipatory looks towards the target picture were significantly faster for both adults and children with large vocabularies compared to adults and children with small vocabularies.
Taken together, these studies show that the ability to predict upcoming linguistic input during sentence processing seems to be related to linguistic competence rather than to age. Even children as young as two years old are capable of predicting, provided they are skilled producers, and this persists into later childhood and adulthood, with higher language abilities being related to better prediction skills.
1.3 Syntactic complexity and prediction
Most research on predictive processing has focused on semantic information in syntactically simple structures. However, investigating how predictive processing operates in more complex sentences could provide important insights into how complex structures are processed. This study focuses on prediction in active and passive sentences in typically developing children at age 4–5. Since there is very little research on the relation between syntactic complexity and predictive processing in this group, we describe studies that focus on these phenomena in other populations, namely children with Developmental Language Disorder (DLD) and second language learners (L2 learners), to shape our expectations.
1.3.1 Developmental Language Disorder
Children with DLD experience persisting difficulties in language that cannot be attributed to a simple delay in development. Although these children have a wide variety of problems in many different aspects of language, some degree of morphosyntactic deficit seems to be a common denominator. Jones and Westermann (2021) propose that these morphosyntactic deficits might be strongly related to predictive processing. Since children with DLD have lower language skills than typically developing (TD) children, they have more difficulty making predictions based on linguistic input. Jones and Westermann (2021) argue that, as a result, children with DLD may struggle to use prediction errors to improve their syntactic knowledge. This in turn keeps them from making better predictions in the future, which perpetuates their language difficulties. Based on this account, it is expected that syntactic complexity is related to predictive language processing. There is empirical evidence that supports this idea, at least for children with DLD.
Van Alphen et al. (2021) used eye-tracking to investigate word recognition and word prediction in children with and without (a suspicion of) DLD (age 2–4). For word recognition, they used simple sentences such as “Look, a hat!”, where the target image can only be identified after the onset of the target word. For word prediction, they used more complex sentences such as “Hey, he reads just a book”, where the last word of the sentence can be predicted based on the verb. Both groups showed a significant word recognition effect, but children with DLD shifted more slowly towards the target image and looked at it longer. In the word prediction condition, the DLD group again showed slower shifts than the TD group and a smaller prediction effect overall, with greater variation linked to lower receptive and expressive language scores. Van Alphen et al. (2021) concluded that children with (a suspicion of) DLD showed atypical verb-based prediction and that this might stem from limitations in their processing capacity or linguistic knowledge.
While prediction in the Van Alphen et al. (2021) study could be based both on the meaning of the verb and the structure of the sentence, Hestvik et al. (2022) specifically looked at syntax-based prediction in children with and without DLD (age 8–13 years) using ERPs. They used relative clauses that typically contain a gap in the direct object position (e.g. “The zebra that the hippo kissed [e] on the nose”) and compared those to relative clauses with an unexpected filled gap (e.g. “The zebra that the hippo kissed the camel on the nose”), that trigger a surprise response in the brain. Hestvik et al. (2022) found that where TD children showed a surprise brain response to the filled gap, meaning they predicted the gap to be empty, children with DLD did not, which means they were unable to predict the correct structure of the sentence.
Interestingly, Van Alphen et al. (2021) found that children with DLD do predict, but are significantly slower than TD children, whereas Hestvik et al. (2022) found no prediction effect at all for children with DLD. An important difference between these two studies is the level of syntactic complexity in the sentences used. In the study by Hestvik et al. (2022), TD children showed mature brain responses to the complex sentences, whereas children with DLD did not, but in the relatively simple sentences used by Van Alphen et al. (2021), children with DLD were capable of predicting, only slower. Also, the different methodologies used in these two studies might influence the results. Van Alphen et al. (2021) used eye-tracking to examine predictive looks at the target before the onset of the target noun, whereas Hestvik et al. (2022) looked at brain responses following the onset of the target noun. Both methods indicate prediction, but they might highlight different aspects of predictive processing.
So, for children with DLD, syntactic complexity seems to influence their predictive processing during sentence comprehension. It is not clear yet whether this is also the case for TD children or if this influence of syntactic complexity on prediction is specific to DLD.
1.3.2 Adult L2-learners
Syntactic complexity also seems to influence prediction in adult L2-learners as shown by Chun et al. (2021) and Chun and Kaan (2019). Both studies used eye-tracking and found that skilled L2-listeners were capable of making predictions in more complex sentences, but that they were slower than L1-listeners. They conclude that this difference between L1- and L2-listeners is not due to a lack of comprehension but due to increased cognitive load, since L2-learners did predict. L2 processing generally takes more cognitive resources and research has shown that syntactic complexity increases cognitive load (Chmiel et al. 2024) and that increased cognitive load delays prediction (Ito et al. 2018).
1.4 Present study
The present study investigates whether syntactic complexity influences predictive processing in typically developing children and adults. By comparing whether and how fast children and adults predict upcoming words in active and passive sentences we try to discern whether prediction is influenced by syntactic structure. We chose to test four- and five-year-old children, since at this age, they have not yet fully acquired the passive sentence structure (Armon-Lotem et al. 2016; Wijnen & Verrips 2011), meaning their syntactic knowledge of passives is still incomplete.
On the one hand, if syntactic complexity has no influence on prediction, we would expect no differences in predictive processing between active and passive sentences, both for children and adults. This would suggest that, despite not having fully acquired the syntactic structure of the sentence, children are still capable of making predictions. In that case, they would likely rely more heavily on the semantic relation between words, so that their incomplete syntactic knowledge does not interfere with their predictions. On the other hand, if syntactic complexity does influence prediction, we would expect children, whose syntactic knowledge is still developing, to have difficulty predicting in syntactic structures they have not yet acquired. In this case, we expect that they would predict more slowly or with less certainty in the passive sentences than in the active ones. Adults, in contrast, would likely show no differences between active and passive sentences, since they have fully developed language and cognitive systems.
2. Methods
2.1 Participants
The present study included two groups of participants. The first group consisted of 31 Dutch-speaking children, 29 of whom completed the experiment (14 girls, 15 boys). The mean age of these 29 children was 5;2 years (range: 4;0–6;4), and all attended the same primary school. The second group included 10 Dutch-speaking young adults (9 female, 1 male), with a mean age of 22;1 years (range: 20;7–24;10).
2.2 Materials and design
To test children’s predictive language processing, a picture selection task in a visual world paradigm was created. With eye-tracking, sentence processing can be investigated from the beginning of the sentence, and predictions can be detected as anticipatory looks towards the target image. Two pictures were shown side by side while a prerecorded sentence was presented auditorily. There were three conditions. The first condition consisted of active sentences with lexical verbs, where the object could be predicted based on the meaning of the verb (1). In this condition, both the target and the distractor image displayed inanimate objects. Since it takes approximately 200 ms to plan and execute an eye movement (Fischer 1992), the adverb “gewoon” (just) was added between the verb and the object, allowing sufficient time to capture any predictive eye movements between the verb and the onset of the noun. The second condition consisted of passive sentences with the same lexical verbs as in the active condition (2). In this case, the target image was always animate, since the last part of the sentence always referred to the agent. The distractor image still displayed an inanimate object. In these passive sentences, the preposition “door” (by) provided enough time to capture anticipatory eye movements, so it was not necessary to add an adverb. The third condition consisted of active sentences with copular verbs (3), where the object could not be predicted based on the meaning of the verb. Here, the distractor images were always inanimate objects, but the target image was animate in half of the sentences and inanimate in the other half. This baseline condition was used to inspect how long participants took to identify an object that they heard named and to see whether the animacy of the target had any effect on the gaze patterns.
(1) De  jongen lees-t       gewoon een boek
    The boy    read-3sg-prs just   a   book
    ‘The boy just reads a book’
(2) Het boek word-t         ge-lezen      door een jongen
    The book become-3sg-prs ptcp-pst-read by   a   boy
    ‘The book is read by a boy’
(3) Dit  is         een olifant
    This be-3sg-prs an  elephant
    ‘This is an elephant’
For each sentence, a recording was made of the same female native Dutch speaker pronouncing the sentence with a neutral intonation. While the audio was being presented, there were two images on the screen, one target image that matched the final word of the sentence and one unrelated image that served as a distractor (see Figure 1). The presentation of the target image on either the left or the right side of the screen was randomized and evenly distributed across trials. The images of objects and animals were retrieved from the BOSStimuli database (Brodeur et al. 2014) and the images of the characters were retrieved from Shutterstock (https://www.shutterstock.com).
[Image omitted: See PDF]
2.3 Procedure
To capture the participants’ eye movements, we used the EyeLink Portable Duo eye-tracker at a sampling rate of 500 Hz. The experiment was built in Experiment Builder software (SR Research Ltd., version 2.3.38). The adult participants were tested at the EyeLab of the University of Groningen and the children were tested in a quiet room at their school. The children were picked up individually from their classroom and taken to the room where the eye-tracker was set up. Each child received a thank-you sticker after the experiment, regardless of completion.
Before starting the experiment, the eye-tracker was calibrated and the participants received instructions on the task. They were told they would hear sentences and they had to indicate which image best fitted the sentence by pressing the corresponding button on a response pad. Then they completed two practice trials, before starting the actual experiment. Each participant completed 30 trials in total, 10 from the active condition, 10 from the passive and 10 from the baseline condition in randomized order. There were 20 verbs used to create the active and passive sentences. This study used counterbalanced sentence lists: Half of the participants saw list A, where verbs 1–10 appeared in the active condition and verbs 11–20 in the passive condition, while the other half saw list B, where verbs 11–20 appeared in the active condition and verbs 1–10 in the passive condition. Thus, each participant encountered each verb in only one condition, but across participants all verbs were presented both in the active and passive conditions. The baseline sentences remained identical across lists, so each participant encountered the same 10 baseline sentences.
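The counterbalancing scheme described above can be sketched in a few lines of Python. This is only an illustration and not the authors' Experiment Builder setup: the item names (`verb_01`, `baseline_01`, ...), the function name `build_trial_list` and the `seed` parameter are hypothetical placeholders, since the actual verbs and the randomization procedure are not listed here.

```python
import random

# Hypothetical placeholders: the 20 lexical verbs and 10 baseline sentences
# are not listed in the text, so generic labels are used instead.
VERBS = [f"verb_{i:02d}" for i in range(1, 21)]
BASELINE = [f"baseline_{i:02d}" for i in range(1, 11)]


def build_trial_list(list_name, seed=None):
    """Return 30 (item, condition) trials for list 'A' or 'B' in random order."""
    if list_name not in ("A", "B"):
        raise ValueError("list_name must be 'A' or 'B'")
    first_half, second_half = VERBS[:10], VERBS[10:]
    # List A: verbs 1-10 active, verbs 11-20 passive; list B is the mirror image.
    if list_name == "A":
        active, passive = first_half, second_half
    else:
        active, passive = second_half, first_half
    trials = (
        [(verb, "active") for verb in active]
        + [(verb, "passive") for verb in passive]
        + [(sentence, "baseline") for sentence in BASELINE]  # identical in both lists
    )
    random.Random(seed).shuffle(trials)  # randomized presentation order
    return trials
```

Calling `build_trial_list("A")` and `build_trial_list("B")` yields two lists in which every verb appears once per participant, with the active and passive verb sets swapped between lists.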
2.4 Statistical analysis
All analyses were performed using R Statistical Software (v4.3.2; R Core Team 2023) in RStudio (v2024.12.0.467; Posit team 2024). We used a Generalized Additive Mixed Model (GAMM) to analyze the nonlinear effect of time on the proportion of looks towards the target minus distractor images. Since the data showed nonlinear patterns that linear models cannot adequately capture, a more flexible method such as a GAMM was required. The data were aligned on the onset of the target noun to account for differences in sentence length. Looks at the target image between the onset of the finite verb (active sentences) or the past participle (passive sentences) and the onset of the target noun were interpreted as predictive eye movements. Overall, children had fewer looks at the interest areas (target and distractor image) than adults, which is common in eye-tracking studies: a child’s gaze is less steady, and children have more looks outside the interest areas or off the screen. To account for this in the analysis, we used proportions of looks for children as well as adults, calculated for each time bin as looks at the target minus looks at the distractor, divided by the sum of looks at the target and the distractor.
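As an illustration of the dependent measure described above, the following Python sketch aligns gaze samples on noun onset and computes, per time bin, looks at the target minus looks at the distractor divided by their sum. It is not the authors' R preprocessing: the sample format, the function names and the 50 ms bin width are assumptions made for the example, as the bin size is not reported here.

```python
from collections import defaultdict


def target_advantage(target_looks, distractor_looks):
    """Return (T - D) / (T + D), or None if neither interest area was looked at."""
    total = target_looks + distractor_looks
    if total == 0:
        return None
    return (target_looks - distractor_looks) / total


def looks_by_bin(samples, noun_onset_ms, bin_ms=50):
    """Compute the target-minus-distractor proportion per time bin.

    samples: iterable of (time_ms, area) pairs, where area is "target",
             "distractor" or anything else (off-area or off-screen looks).
    noun_onset_ms: onset of the target noun in the same time base as samples.
    bin_ms: bin width in ms (assumed value; not taken from the study).
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [target, distractor]
    for time_ms, area in samples:
        aligned = time_ms - noun_onset_ms        # 0 = onset of the target noun
        bin_start = (aligned // bin_ms) * bin_ms  # negative bins precede the noun
        entry = counts[bin_start]
        if area == "target":
            entry[0] += 1
        elif area == "distractor":
            entry[1] += 1
        # looks elsewhere are ignored, so they do not dilute the proportion
    return {b: target_advantage(t, d) for b, (t, d) in sorted(counts.items())}
```

The measure ranges from -1 (only distractor looks) to 1 (only target looks). Because off-area samples drop out of both numerator and denominator, groups that differ in how often they look away from the pictures remain comparable.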
The R packages ‘itsadug’ (Van Rij et al. 2022) and ‘mgcv’ (Wood 2017) were used to create the model and visualize the results. Post-hoc pairwise comparisons were conducted using the ‘emmeans’ package (Lenth 2025). The model included a smooth term for time (thin plate regression spline), modeled separately for each group (adults and children) in each condition (active, passive and baseline). Additionally, we included a random smooth (factor-smooth interactions) for participants to account for individual variability in time-course effects. The model was fitted using a Gaussian family with an identity link function and showed a significant nonlinear effect of time on the looks towards the target and distractor image. Parametric coefficients and False Discovery Rate (FDR) corrections were used to identify different gaze patterns across conditions and groups. Furthermore, difference plots were used to identify where in the sentence processing the gaze patterns across conditions and groups differed.
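For readers unfamiliar with FDR correction, the sketch below shows the Benjamini-Hochberg step-up adjustment in plain Python. Which FDR procedure was applied is not stated in the text. Benjamini-Hochberg is assumed here because it is the most common choice, and the analysis itself was run in R rather than with this code. The function name `benjamini_hochberg` is introduced for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the FDR procedure is Benjamini-Hochberg; the source only
    says that FDR corrections were used.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A comparison is then treated as significant when its adjusted p-value falls below the chosen threshold, for example 0.05.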
The participants did not receive feedback on the accuracy of their button press responses, which led to some children purposefully pressing the wrong button to see what would happen. Therefore, accuracy scores were not included in the final analysis. Note that accurate responses did not necessarily indicate good comprehension of passives, as participants could select the target picture just by hearing the final word.
3. Results
The previous analyses provided insights into how the gaze patterns differ between adults and children, and between active, passive and baseline conditions within each group. In the following, we will examine more precisely when during sentence processing these differences emerge and how they vary across groups and conditions.
3.1 Active vs. passive sentences
Figure 2 shows the proportion of looks towards the target image minus the distractor image for adults (left) and children (right) in active (blue) and passive sentences (red). For adults, looks toward the target image increase rapidly shortly after verb onset, with similar patterns in both conditions. No significant differences were observed in this group. For children, we also see that looks toward the target image begin to increase after verb onset in both active and passive sentences. However, the increase is steeper in active sentences compared to passive sentences. This difference was statistically significant, as determined by the model visualization, and is indicated by the pink rectangle in Figure 2.
[Image omitted: See PDF]
3.2 Animacy effects
A baseline condition without semantically restrictive verbs was included to examine how adults and children recognize words in the absence of predictive cues, serving as a control condition. Half of these sentences contained an animate noun, and therefore an animate target picture, the other half an inanimate noun and target picture. Initially, the baseline condition was analyzed as a single category. However, unexpected gaze patterns — particularly among children, who looked at the target image frequently from the beginning of the sentence — necessitated further subdivision. Animate figures generally attract more visual attention, particularly in children, compared to inanimate objects (Altman et al. 2016).
Figure 3 illustrates the proportion of looks towards the target image minus the distractor image for adults (left) and children (right) in the subdivided baseline conditions. From the beginning of the sentence, both adults and, more markedly, children look more towards the target image when it is animate (orange line) than when it is inanimate (green line). After the noun onset (dashed line), looks to the target image increase in both subdivisions of the condition.
[Image omitted: See PDF]
4. Discussion
The finding that adults direct their gaze toward the target image before hearing the word indicates that they are capable of using the linguistic information present at the verb to predict how the sentence will continue. They do so with the same speed and certainty in active sentences as well as in passive sentences. Their predictions are therefore not influenced by the increased syntactic complexity of the passive sentences. Similarly, the gaze data reveal that children are also capable of predicting in active and passive sentences. However, as opposed to adults, their predictions are significantly weaker in the passive sentences compared to the active sentences. This suggests that, for children, syntactic complexity might hinder prediction.
The results from the baseline condition further support this finding. When subdividing the baseline sentences into those with an animate versus an inanimate target, we found that children especially showed a strong looking preference towards images of animate figures compared to inanimate figures, regardless of the context. Based on this, one might expect a similar pattern in the passive sentences, because there the target noun — and therefore the target image — was animate, naturally attracting more visual attention, as evidenced by gaze patterns in the control conditions. However, children’s predictions were still more pronounced in active sentences than in passive ones. This implies that children make much stronger predictions in the active than in passive sentences, despite there being animate target images for the passive sentences.
The findings of this study confirm the results of previous research showing that young children are already capable of making predictions during sentence comprehension (Borovsky et al. 2012; Mani & Huettig 2012). Additionally, our study shows that prediction in sentence comprehension is hindered by syntactic complexity for children, but not for adults. It is not exactly clear what causes this difference. It might be that 4- and 5-year-old children make fewer or slower predictions because they have not yet fully acquired the passive sentence structure. On the other hand, since they do show some prediction in the passive condition, it is also possible that they do comprehend passive sentences, but that the increased cognitive load of processing complex sentences leaves fewer cognitive resources for making predictions (Chmiel et al. 2024; Ito et al. 2018).
As suspected by Jones & Westermann (2021) for children with DLD, our results suggest that there is a relation between syntactic complexity and predictive language processing for young TD children as well. However, it remains unclear whether the morphosyntactic difficulties experienced by children with DLD are caused by a specific prediction impairment or whether more general effects of syntactic complexity and increased cognitive load can explain differences in prediction for children with DLD, since they are also found in young TD children.
Since syntactic complexity and the subsequent increased use of cognitive resources are found to slow down prediction in L2-listeners (Chun et al. 2021; Chun & Kaan 2019; Ito et al. 2018), it seems likely that the difference in prediction between TD children and adults found in this study could also be explained by a combination of these two factors. Even when children comprehend passives, they do not yet have adult-like cognitive abilities. Therefore, the increased syntactic complexity of the passive sentences compared to the active ones places a greater demand on their cognitive resources which affects their prediction. Future research should include cognitive measures and more types of complex structures to disentangle the exact relationships between predictive processing, syntactic complexity and cognitive load.
4.1 Limitations
A limitation of this study is that the comprehension of passive sentences was not tested. Based on previous studies, we assumed that the participating children had not fully acquired passive sentences, but there may have been exceptions. Moreover, the accuracy scores of the picture-selection task were unusable. Children did not receive any feedback on their responses, which made them curious about what would happen if they purposefully pressed the wrong button. The gaze data contained all the information needed to answer our research question, so the accuracy scores were not required. Nevertheless, it would be beneficial to make the task more appealing (e.g. by including a game element) to ensure that children are motivated to give the right answers.
5. Conclusion
Funding
Acknowledgements
Note
Floria Tosca van Rooy
University of Groningen
Atty Schouwenaars
University of Groningen
© 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.