Doctoral defence

María Andrea Cruz Blandón: Artificial neural networks help us understand how infants learn to speak – a study emphasises the need for more realistic simulations

Tampere University
Location: Korkeakoulunkatu 8, Tampere
Hervanta campus, Festia, room FA032 and remote connection
Date: 18.9.2025 12.00–16.00
Language: English
Entrance fee: Free of charge
María Andrea Cruz Blandón.
Photo: Aria Nourbakhsh
In her doctoral dissertation, M.Sc. María Andrea Cruz Blandón used self-supervised artificial neural networks to explore how infants learn their native language from speech exposure alone. She investigated how to introduce aspects of real-life learning scenarios into the simulations, as well as how to systematically compare learning simulations with current scientific knowledge of infants’ learning process.

Imagine you wake up in a different country and people start talking to you in a language other than your own. Most likely you will not understand much, or anything at all. Similarly, but with even less understanding of the world, infants are exposed to their native language through interactions with their caregivers during the first months of life.

To discover what all that speech signal means, infants would likely need to identify the sounds of the language, the possible combinations that form words, the structure of sentences, the relation between words and things in the world, and the context in which they are said. It is certainly a complex process. Yet infants show a seemingly effortless proficiency in learning their native language, regardless of the language and of their simultaneous physical and cognitive development. This process is known as early language acquisition.

“Studying infant language acquisition is inherently challenging, as infants cannot explicitly communicate their learning process. This requires researchers to rely on indirect behavioural and neural measurements to assess their linguistic skills,” M.Sc. María Andrea Cruz Blandón says. 

In her doctoral thesis, Cruz Blandón used computational models as a complementary methodology. Computational models can help us study language acquisition by testing hypotheses about the mechanisms involved in the learning process, the type of language input necessary for successful learning, and other developmental factors that are hard to investigate in real-life experiments.

New evaluation framework supports creating and assessing increasingly holistic models

In her work, Cruz Blandón focused on two key challenges in computational modelling: how to compare models with infant behaviour, and how to increase the ecological validity of the learning simulations. For the first challenge, she and her collaborators improved and extended common practices for comparing models against empirical infant data with a new methodology they called “MetaEval”.

“This new evaluation framework incorporates robust empirical data on infant language, accounts for multiple linguistic capabilities in parallel, and aligns with the experimental practices used to study infants' language behaviours. This approach supports creating and assessing increasingly holistic models that better reflect infant learning trajectories and contributes to standardising model evaluation practices,” Cruz Blandón reports. 

For the second challenge, the thesis explored how to build computational simulations of infant language learning that include more realistic conditions than those in previous studies.

“The selected self-supervised neural network models rely solely on acoustic speech input without linguistic priors, simulating statistical learning as a core mechanism of language development,” Cruz Blandón explains. 

The research bridges the gap between computational models of infant language learning and real-life language acquisition experiences

Like infants, the models are exposed to speech without linguistic knowledge but with the capacity to discover and predict patterns. Cruz Blandón explains that they also integrated estimates of actual infants’ daily speech exposure and, for the first time in this type of study, introduced prenatal language experience into the simulations.

“Overall, the contributions of the thesis are bridging the gap between computational models of infant language learning and real-life language acquisition experiences by advancing more ecologically valid simulations of infant language learning and closer comparisons between models' and infants' linguistic behaviours that account for multi-capability learners and noisy behavioural data. The results of this thesis highlight the importance and challenges of developing more realistic simulations of language learning,” Cruz Blandón concludes.

Public defence on Thursday 18 September

The doctoral dissertation of M.Sc. María Andrea Cruz Blandón in the field of computer and cognitive science, titled “Computational Modelling of Early Language Acquisition: Towards naturalistic simulations and robust model assessment”, will be publicly examined at the Faculty of Information Technology and Communication Sciences at Tampere University at 12.00 on Thursday 18 September 2025 on the Hervanta campus, Festia building, room FA032 (Korkeakoulunkatu 8, Tampere).

The Opponent will be CNRS Research Director Thomas Hueber, Grenoble-Alpes University. The Custos will be Professor Okko Räsänen, Faculty of Information Technology and Communication Sciences at Tampere University. 


The doctoral dissertation is available online
The public defence can be followed via remote connection.