
Profiling areas in focus: Health data science studies how machine learning and AI could help to treat diseases

Published on 13.12.2021
Tampere University
Anssi Auvinen and Matti Nykter
“Major universities around the world have at least Chairs, if not entire departments, dedicated to health data science,” Professor Anssi Auvinen says. He is convinced that Tampere University will become a major researcher and practitioner in the field. “This is a sure way to further cancer research. In all fields, the major problems are so difficult that they cannot be solved alone or within a single field. They require multidisciplinary expertise and cooperation,” Professor Matti Nykter says.
Data science methods are opening opportunities for advancing the personalised treatment of diseases and improving the healthcare system. Mathematical tools can be used, for example, to process large patient datasets to identify both causes of diseases and appropriate treatments.

“The ultimate goal is to use novel methods to promote human health,” says Matti Nykter, professor of bioinformatics.

The Health Data Science project, which began this autumn, combines two strong areas at Tampere University: data science – such as machine learning and signal processing – and health research. The five-year project is funded by the Academy of Finland’s PROFI6 funding instrument.

Professor Nykter is responsible for the bioinformatics part of the project, which applies data science methods to biological laboratory data to study, for example, the structure of genes and proteins.

Building on extensive background data

Anssi Auvinen, professor of epidemiology and Director of the Prostate Cancer Research Center, is responsible for the healthcare perspective of the project.

“We aim to use extensive background data to study who will benefit most from specific treatments,” Auvinen says.

The research data comes from various registers and databases.

“Registers give us information on hospitalisations, outpatient visits and medications used. We can also analyse various samples to learn about the exposome, which is composed of lifelong exposures. Using as rich background data as possible, we try to predict who will develop a disease or have a favourable disease course,” Auvinen explains.

Machine learning, artificial intelligence and exploratory big data analysis methods will be used to mine huge datasets for signals that might prove useful for decision-making related to health and disease.

“Even though the research programme has a methodological focus, it has a very obvious link to healthcare practices, prediction and risk assessment,” Auvinen says.

Patients will be grouped according to treatment outcomes

In practice, the researchers will develop a model in which different disease outcomes and trajectories are explained by different underlying factors.

“We are asking how the original data – such as patient data – could be divided into subgroups based on, for example, how much or little chance patients have of benefiting from standard treatment,” Auvinen mentions.

The researchers construct as accurate a biological description of the patient’s condition as possible and then use it to identify factors that could help to target treatments.

For example, if a cancer is found to have a biological abnormality that supports the survival of tumour cells, the best way to treat the disease is, in the best-case scenario, to target that functional abnormality.

Finland has unique health registers

Interdisciplinary data research has been carried out for decades. Those studies have contributed to basic research and clarified, for example, the biological mechanisms that underlie the development of cancer.
Tampere University also has several ongoing studies where various biomedical and register data are analysed using data science methods.

“Finland has unique opportunities especially in the health sector because we have huge, comprehensive datasets in various health care registers,” says Auvinen.

Finnish healthcare covers the entire population, everyone uses the same healthcare system, and every person’s health data is collected. This is Finland's strength. Most of the world’s countries do not have such a centralised and systematic collection of usable data.

For example, the Cancer Register established in 1954 is a unique database of information on all cancer patients in Finland.

Secondary Use Act extended and complicated the use of patient data for research

While treating patients, hospitals accumulate plenty of data in patient registers. This so-called real-world data has traditionally been used in register-based studies in medicine, nursing, and health sciences.
The Act on Secondary Use of Health and Social Data, which entered into force in Finland in 2019, provides a basis for using patient data in research. However, researchers see both advantages and disadvantages in it.

“Even though the Act also imposes significant restrictions on research, it has opened opportunities for accessing the large amount of data collected in hospitals. The computational analysis of patient data allows us to learn about the ways patients have been treated and what the impact of the treatments has been,” Nykter says.

Auvinen, on the other hand, regrets that the Act has slowed down the process of obtaining research permits and hugely increased the price of data use. This makes research difficult, especially if it is not done as part of a large-scale, well-funded project.

University hospital’s data lake provides more data

Biobanks and data lakes created by hospitals provide extensive data for research purposes. A data lake is a collection of all digital patient data in a hospital. The data it contains is accessible to researchers in the hospital’s secure IT environment if they have obtained permission to study a particular dataset.

“The data lake allows us to use real-world data more efficiently in research. It will also provide us with a better interface to hospital activities, and data analytics can gradually begin to support the treating physician in patient care,” says Nykter.

New technologies, such as genome sequencing for cancer treatment, can also be introduced in patient care pathways.

Aiming for leadership in health data science

“Our goal is to bring our high-level cancer genomics research to the clinical context in a few years’ time,” Nykter says.

The ambitious goal of the PROFI6 project is for Tampere University to strengthen its position as an internationally prominent researcher and practitioner of health data science in close cooperation with Tampere University Hospital. In Finland, Aalto University and the University of Helsinki have their own artificial intelligence flagship projects, but they mainly focus on methodological development.

“The specific aim of our research programme is to bring the latest computational methods to medicine and patient care,” Nykter says.

“We bring together clinical research and machine learning and use high-level computational methods to study medical issues. Our very close collaboration with Tampere University Hospital is an asset,” he adds.


Read more on the project's web pages

The Academy of Finland PROFI6 funding

At the beginning of 2021, the Academy of Finland made decisions on the PROFI6 competitive funding to support universities in strengthening their research profile and improving the quality of research.
The Academy granted Tampere University €12.7 million for enhancing the profiling of strategic research.
Four profiling areas are funded in Tampere University’s spearhead fields of technology, health, and society:
    1) Health Data Science
    2) Games as a Platform to Tackle Grand Challenges
    3) TAU Imaging Research Platform
    4) Sustainable Transformation of Urban Environments.