Studying children’s language acquisition can help to develop machine intelligence
“Once we understand the mechanisms and principles underlying child language acquisition, we can utilise the information in the development of machine intelligence,” says Academy Research Fellow, Assistant Professor Okko Räsänen.
Räsänen leads the Speech and Cognition Research Group at Tampere University’s Unit of Computing Sciences at the Faculty of Information Technology and Communication Sciences.
“One of the main themes of our research group is studying children’s early language learning with speech technology. We create computational models on children’s language learning and develop technological tools for language acquisition research,” Räsänen says.
The models and tools developed by the research group can be used to automatically analyse large volumes of audio data recorded in children’s home environments. The data are collected with microphones worn by the children.
“We have thousands of hours of audio data to analyse. They contain everything a child hears in his or her auditory environment,” Räsänen notes.
“The manual analysis of such materials is not possible for any research group in the field and the existing automatic speech recognition systems are not suitable for this purpose. The recordings are often of poor quality and the sound source may be far from the microphone. The tools must also work regardless of the language being spoken,” Räsänen says.
The methods used by the research group are based on signal processing and machine learning, and they are suitable for assessing both the quantity and the quality of speech. They can be used to analyse, for example, how much speech a child hears from each parent and whether the child is spoken to directly or only overhears conversations between the parents.
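As a rough illustration of the kind of summary described above, the sketch below totals speech time per speaker and splits it into child-directed and overheard speech. It assumes that an upstream tool has already produced labelled speech segments. The `Segment` fields, the speaker labels and the `summarise` function are hypothetical names chosen for this example and are not taken from the research group's tools.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """One detected stretch of speech (hypothetical format for this sketch)."""
    speaker: str          # assumed label, e.g. "adult_1"
    start: float          # seconds from the start of the recording
    end: float            # seconds from the start of the recording
    child_directed: bool  # True if the speech is addressed to the child


def summarise(segments):
    """Return (seconds per speaker, child-directed seconds, overheard seconds)."""
    per_speaker = defaultdict(float)
    directed = 0.0
    overheard = 0.0
    for seg in segments:
        duration = seg.end - seg.start
        per_speaker[seg.speaker] += duration
        if seg.child_directed:
            directed += duration
        else:
            overheard += duration
    return dict(per_speaker), directed, overheard


# Toy data invented for the example, not real recordings.
segments = [
    Segment("adult_1", 0.0, 2.5, True),
    Segment("adult_2", 3.0, 4.0, False),
    Segment("adult_1", 5.0, 6.5, False),
]
per_speaker, directed, overheard = summarise(segments)
print(per_speaker, directed, overheard)
```

In practice the difficult part is producing such segments from noisy, far-field recordings in any language. The aggregation step shown here is the simple final stage.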
What can machines learn from children?
Räsänen’s research group is especially interested in what kind of computational models would enable a machine to learn to understand spoken language in the absence of any prior language-related knowledge. Children do not receive direct guidance in learning their native tongue, either. They learn by listening to and interacting with people in their environment.
“Our goal is to gain a better understanding of the interaction between children’s language experiences and language learning mechanisms. This is a way to grasp, for example, individual differences in children’s language development, or which factors are critical for normal language development more generally,” Räsänen explains.
“We try to find out, for example, the roles of elementary speech sounds, syllables and syntax at different stages of learning. We are interested in how a child learns the structure of the utterances of their native tongue, how they segment and recognise words from a continuous flow of speech, or how learning the meanings of words is intertwined with recognising the words from speech. We also explore what other information and interaction language learning requires in addition to the speech one hears,” Räsänen says.
By means of computational modelling, Räsänen has demonstrated in his studies, for example, that a child or a machine learning a language does not necessarily have to find words in the stream of speech first in order to connect words with their meanings. Instead, distinguishing words and learning their meanings are synergistic tasks and can therefore occur simultaneously when a child interacts with his or her environment in a multisensory manner. The group has also explored the link between so-called statistical learning, attention, and the prosody of speech.
“Among other things, our studies have led to a theory as to why infant-directed speech draws the child’s attention more than normal speech. The same theoretical framework also explains how sentence stress affects the processing of speech and attentional orientation in adults,” Räsänen explains.
According to Räsänen, this theory is based on the observation that the predictability of sensory stimuli guides where attention is directed in our environment. Based on preliminary analyses, the rich intonation of infant-directed speech seems more difficult to predict than the intonation of ordinary speech.
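One common way to put a number on "difficult to predict" is surprisal: the negative log probability that a predictive model assigns to each new observation. The sketch below is a toy illustration of that idea only and is not the group's model. It assumes that intonation has already been reduced to a sequence of quantised pitch levels (0 to 3), fits a first-order Markov model with add-one smoothing to one contour, and compares the mean surprisal of two test contours. All sequences and names are invented for the example.

```python
import math
from collections import Counter


def mean_surprisal(train, test, n_symbols, alpha=1.0):
    """Mean surprisal (in bits) of `test` under a first-order Markov model
    estimated from `train`, with additive smoothing `alpha`."""
    if len(test) < 2:
        raise ValueError("test sequence needs at least two symbols")
    pair_counts = Counter(zip(train, train[1:]))   # counts of (previous, current)
    context_counts = Counter(train[:-1])           # counts of each previous symbol
    total = 0.0
    for prev, cur in zip(test, test[1:]):
        p = (pair_counts[(prev, cur)] + alpha) / (
            context_counts[prev] + alpha * n_symbols
        )
        total += -math.log2(p)
    return total / (len(test) - 1)


# Invented pitch-level sequences: a narrow, repetitive contour for training,
# then one similar test contour and one with wide, irregular pitch movement.
ordinary = [1, 1, 2, 2] * 10
narrow_test = [1, 1, 2, 2, 1, 1, 2, 2]
wide_test = [0, 3, 1, 3, 0, 2, 3, 0]

narrow_score = mean_surprisal(ordinary, narrow_test, n_symbols=4)
wide_score = mean_surprisal(ordinary, wide_test, n_symbols=4)
print(f"narrow contour: {narrow_score:.2f} bits, wide contour: {wide_score:.2f} bits")
```

With these made-up sequences, the wide contour receives the higher mean surprisal, because the model has rarely or never seen its pitch transitions. Real intonation analysis would work on continuous pitch tracks and far richer models.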
“The resulting knowledge can be applied to computer-based automatic sentence stress analysis of speech materials. This can be done without a human teaching the machine with examples of how sentence stress occurs in spoken language,” Räsänen explains.
The work of Räsänen’s research group combines aspects of engineering, psychology and linguistics. The development of computational models for processing speech data requires expertise in signal processing and machine learning. The phenomena being studied, on the other hand, concern human cognitive development as well as the structure of languages and language as a phenomenon.
Harnessing language learning mechanisms is only starting
Räsänen’s research group is currently undertaking two Academy of Finland-funded projects. The project “Computational basis of contextually grounded language acquisition in humans and machines” studies the mechanisms of children’s language learning and ways to utilise these mechanisms in the development of machine intelligence. The project “Analyzing Child Language Experiences Around the World” (ACLEW) investigates the variables that play a role in children’s language acquisition in different linguistic and cultural settings.
In 2018, Räsänen transferred to Tampere University from Aalto University. ACLEW is officially an Aalto University project, which began when Räsänen still worked at Aalto. He continues to work as the principal investigator of the project, which employs researchers at both Aalto and Tampere Universities.
The work Räsänen and his research group are doing to understand the language learning process of children is ground-breaking in many ways. According to Räsänen, harnessing language learning mechanisms for use in machine intelligence is only just beginning.
“In recent years, the rapid technological development in the field of machine learning has led to great strides in the applications of speech and language technology. However, the systems are still far from the human ability to learn and truly understand spoken language. In addition, the capability of the existing computational models to understand and address individual-level language development or its problems is still limited,” Räsänen points out.
“Understanding language learning or developing human-like linguistic and conceptual competence for artificial intelligence systems is an extremely long and challenging project. The work on this topic is sure to continue for decades to come,” he notes.
Text: Tiina Wesslin
Photo: Jonne Renvall