Data science

Tampere University
Area of focus: Technology
Area of focus: Ecological innovations and social challenges


We conduct research in data science and statistics, focusing on diverse applications. We develop innovative machine learning models for image processing, medicine, and financial forecasting. In statistics, we concentrate on probabilistic machine learning, finite mixtures, survival analysis, and time-series analysis. We explore rule-based, machine learning and AI approaches for natural language processing (NLP). Our recent focus is on fair and transparent machine learning and recommender systems. We also study complex systems using network science in social, biological, and financial networks. Our methods apply across fields such as image processing and social network analysis.

Contact persons

Prof. Juho Kanniainen, juho.kanniainen (at) tuni.fi

Associate Prof. Konstantinos Stefanidis, konstantinos.stefanidis (at) tuni.fi

Applied Statistics

Our statistics research is strongly connected with data science, but it has both its own distinct aspects within data science and its own core research separate from data science. In particular, tasks are solved via statistical modeling, analysis of time-dependent data (time series, longitudinal data), planning of data gathering, treatment of distributional assumptions, representation and management of uncertainty, probabilistic estimation and inference, prediction, and hypothesis testing; the research and theory of these core methodologies are unique to statistics. The Centre for Applied Statistics and Data Analysis conducts applied statistical research, in which statistical methods are used and modified to solve research problems in different disciplines, for example health, medicine, social sciences and technology.

Statistical Machine Learning and Exploratory Data Analysis

Prof. Peltonen’s Statistical Machine Learning and Exploratory Data Analysis Group focuses on designing and developing Statistical Machine Learning solutions for modeling and exploring data. This includes novel methods for modeling text and matrix data with topic modeling approaches, vectorial embedding approaches generalizing word embeddings, and novel matrix factorization solutions. His group also works on methods for information retrieval from large databases, including modeling and elicitation of user intent by Bayesian regression, probabilistic retrieval, and visualization of user intent.  

Signal Analysis and Machine Intelligence

In the field of machine learning, Prof. Gabbouj’s Signal Analysis and Machine Intelligence Group introduced a paradigm shift in artificial neural networks (ANN) by extending the linear operation of the perceptron to an arbitrary nonlinear function. In this way, Multilayer Perceptrons (MLP) were upgraded to Generalized Operational Perceptrons, and Convolutional Neural Networks (CNN) were extended to Operational Neural Networks (ONN).
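As a rough illustration of the idea described above, the following sketch replaces the perceptron's fixed weighted sum with configurable operators. The function name `operational_neuron` and the parameter names `nodal` and `pool` are hypothetical, chosen for this example only, and the code is not the group's implementation. With multiplication and summation, the neuron behaves like an ordinary perceptron; swapping in other operators makes the pre-activation nonlinear.

```python
import math
from statistics import median


def operational_neuron(inputs, weights, bias,
                       nodal=lambda w, x: w * x,  # operator applied to each weight-input pair
                       pool=sum,                  # operator that aggregates the per-input terms
                       activation=math.tanh):
    """Return activation(pool(nodal(w_i, x_i) for all i) + bias).

    Hypothetical sketch: the default nodal/pool pair reproduces the
    classic perceptron; other choices give a nonlinear neuron.
    """
    terms = [nodal(w, x) for w, x in zip(weights, inputs)]
    return activation(pool(terms) + bias)


x = [0.5, -1.0, 2.0]
w = [0.1, 0.4, -0.3]
b = 0.2

# Classic perceptron: multiplication + summation.
linear_out = operational_neuron(x, w, b)

# Nonlinear variants: sinusoidal nodal operator, or median pooling.
sine_out = operational_neuron(x, w, b, nodal=lambda wi, xi: math.sin(wi * xi))
median_out = operational_neuron(x, w, b, pool=median)
```

Here the operators are passed as plain functions so that the linear case remains a special case rather than a separate code path.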

Financial Computing and Data Analytics

Prof. Kanniainen leads the Financial Computing and Data Analytics group. His group, in collaboration with Prof. Gabbouj, has developed an interpretable machine learning method for time-series modelling. The model, called the Temporal Attention-Augmented Bilinear Network, is highly interpretable because it highlights the importance and contribution of each temporal instance, which allows further analysis of the time instances of interest. Moreover, the group has developed network methods for modelling information cascades with partial observations of individuals’ states, which are applied to stock markets. These methods can be used to study how social relations drive investors in their decision making and to identify abuse of information in stock markets.
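To make the notion of highlighting temporal instances concrete, here is a deliberately simplified sketch of temporal attention over a short multivariate series. It is not the Temporal Attention-Augmented Bilinear Network itself: the function `temporal_attention`, the scoring vector `score_weights`, and the toy data are all assumptions made for illustration. Each time step receives a score, a softmax turns the scores into weights that sum to one, and those weights indicate how much each time step contributes to the summary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(series, score_weights):
    """Weight the time steps of a series by a softmax over per-step scores.

    series: list of T feature vectors (one per time step).
    score_weights: one weight per feature (stand-in for learned parameters).
    Returns (attention weights over time, attention-weighted feature summary).
    """
    scores = [sum(w * v for w, v in zip(score_weights, step)) for step in series]
    attention = softmax(scores)
    dim = len(series[0])
    summary = [sum(a * step[d] for a, step in zip(attention, series))
               for d in range(dim)]
    return attention, summary


# Toy series: 4 time steps, 2 features; the third step has a spike in feature 0.
series = [[0.1, 1.0], [0.2, 1.1], [2.5, 0.9], [0.3, 1.0]]
score_weights = [1.0, 0.0]  # hypothetical: score depends on feature 0 only

attention, summary = temporal_attention(series, score_weights)
most_important_step = max(range(len(attention)), key=attention.__getitem__)
```

Inspecting `attention` shows which time steps dominate the summary, which is the kind of per-instance importance the paragraph above refers to.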

Data Analytics and Optimization

Prof. Lipping’s Data Analytics and Optimization group, located at the Pori Campus, develops deep learning and AI solutions for agriculture, health, and industry.

Multimedia and Data Mining

Prof. Visa's Multimedia and Data Mining Group works on explainable machine learning and artificial intelligence. The main application areas for these techniques are time series of hyperspectral signals or images.

Natural Language Processing

Our Natural Language Processing group, led by Prof. Nummenmaa, has developed novel rule-based and machine learning approaches to question answering, which retrieve answers to natural language queries from big data and knowledge bases. The group has also developed methods for managing and analyzing grammatically parsed data, and has worked on different text mining tasks, such as frequent pattern mining and distinguishing pattern mining. This work includes sequence mining for textual representations, which is suitable for mining biological data represented as text.

Recommender Systems

Recommender systems aim to anticipate user needs by automatically suggesting the information that is most appropriate to the users and their current context. Prof. Stefanidis' Recommender System Group focuses on algorithmic approaches for traditional and more sophisticated scenarios, such as group and sequential recommendations, developing and applying machine learning solutions, and building on both numerical ratings and textual reviews. Moreover, the group studies big data integration and entity resolution problems for highly heterogeneous data, with a recent focus on progressive solutions for entity matching.

Responsible Data Management and Ethical Artificial Intelligence

Profs. Nummenmaa, Peltonen, Elomaa, Stefanidis and Juhola also focus on Responsible Data Management and Ethical Artificial Intelligence, where a rising concern is how to perform statistical data analysis and machine learning in an ethical, fair, transparent, and explainable manner. In this line of work, we also focus on enabling different stakeholders to query, understand and fix sources of bias in data science solutions in an accessible and transparent manner. We develop methods for providing explanations that aim to identify the causes of unfairness and that examine the capability to capture user intent, which typically changes across sessions.

Tampere Complexity Lab

The Tampere Complexity Lab (TaCoLAB), Prof. Iñiguez’s research group in network science and computational social science, develops computational tools and mathematical theories to understand collective human behaviour by analyzing data on, and building models of, digital social interactions available online. TaCoLAB uses an interdisciplinary, data- and mechanism-driven perspective to study group segregation in social networks, attitudinal polarization online, information diffusion, and the dynamics of ranked and hierarchical complex systems.

Predictive Society and Data Analytics

Prof. Emmert-Streib’s group, the Predictive Society and Data Analytics Lab, conducts innovative research in data science with a deep appreciation for statistical thinking. The group studies a wide range of data types, e.g., genomics data, text data and network data, by developing and applying methods from machine learning, AI and statistics. The current methodological focus is on learning paradigms, including transfer learning, multi-label classification and the digital twin, and on deep learning architectures. Furthermore, the inference and analysis of networks are studied using network science.