Detecting the Unexpected: Dr. Fahad Sohrab on Teaching Machines to Recognise What They’ve Never Seen

When You Can’t Collect What You’re Looking For
Most machine learning systems learn by example: show the model enough labelled instances and it learns to tell them apart. That works well when every relevant class can be represented in training data. But the situations where automated detection matters most (mechanical failures, rare diseases, novel security threats) are precisely those where examples are hardest to gather.
“In many complex systems, the most important events are precisely those that have not been observed before. Machine failures, unexpected system behaviour, or the early stages of disease are typically rare and unpredictable. Because of this, it is often difficult, and sometimes impossible, to collect enough examples of these abnormal situations to train conventional models effectively.”
Dr. Sohrab’s answer is to invert the framing. Rather than cataloguing abnormalities, his models learn the structure of normal behaviour and flag whatever deviates from it, including anomalies they have never encountered. This family of approaches, one-class classification and anomaly detection, sits at the centre of his research.
Gastrointestinal endoscopy offers a direct illustration: a model trained on images of healthy tissue can flag deviating regions for clinical review, without needing labelled examples of every possible pathology.
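To make the idea concrete, here is a minimal sketch of the "learn normal, flag deviation" principle in plain Python. It is a deliberately simple stand-in, not Dr. Sohrab's published method: it summarises normal data by per-feature mean and spread, scores a new sample by its standardised distance from that centre, and sets the threshold from the normal data alone. The function names, the two synthetic features, and the 0.95 quantile are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def anomaly_score(sample, means, stds):
    """Distance from the normal centre, measured in per-feature standard deviations."""
    return math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(sample, means, stds)))

def fit_normal_profile(normal_samples, quantile=0.95):
    """Learn a profile from normal-only data: means, spreads, and a score threshold."""
    columns = list(zip(*normal_samples))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 for a constant feature so the division is always defined.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    scores = sorted(anomaly_score(s, means, stds) for s in normal_samples)
    # Threshold: the score that roughly `quantile` of the normal samples stay below.
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return means, stds, scores[index]

def is_anomalous(sample, profile):
    """Flag a sample whose score exceeds the threshold learned from normal data."""
    means, stds, threshold = profile
    return anomaly_score(sample, means, stds) > threshold

# Synthetic "normal" measurements (illustrative only): two features per sample.
rng = random.Random(0)
normal = [(rng.gauss(10.0, 1.0), rng.gauss(50.0, 5.0)) for _ in range(500)]
profile = fit_normal_profile(normal)

typical_flag = is_anomalous((10.2, 49.0), profile)   # close to the normal centre
unusual_flag = is_anomalous((25.0, 90.0), profile)   # far from anything seen in training
print(typical_flag, unusual_flag)
```

No abnormal example appears anywhere in training; the second sample is flagged only because it lies far outside what the normal data covers. Real one-class methods learn much richer descriptions of normality, but the division of labour is the same.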
The Clinical Stakes: Heart Attacks and Stress Signals
Healthcare is where the asymmetry between normal and abnormal data is most consequential. Two of Sohrab’s applied projects sit here. The first involves early detection of myocardial infarction using multi-view echocardiography. The earliest warning signs can be vanishingly subtle: slight variations in heart wall movement across ultrasound views that experienced clinicians can miss.
“The goal is not simply to classify medical images, but to detect very early physiological changes that could indicate a developing myocardial infarction, potentially enabling clinicians to intervene at a stage when treatment can be most effective.”
A second project led to a multimodal stress detection dataset developed with the University of Louisiana at Lafayette, combining facial expression data with physiological signals such as heart rate variability. Both projects surface a consistent requirement: close engagement with clinicians and careful attention to ethics, privacy, and cross-border governance.
“The most meaningful advances occur when technical innovation, clinical expertise, and ethical responsibility evolve together.”
Fahad Sohrab
Blood Cells and Biological Data: A Bridge into Cancer Research
Among the more unexpected threads in Sohrab’s portfolio is his affiliation with the Heinaniemi lab at the University of Eastern Finland, a group focused on gene regulation and cell-cell interaction in hematologic malignancies. Modern laboratory technologies now generate vast, detailed datasets capturing how individual cells behave and change as blood cancers develop. Interpreting them is precisely the kind of problem that machine learning is positioned to address.
“Clinicians and biologists bring deep knowledge of how cells function and how diseases develop, while machine learning contributes powerful tools for analysing large and complex datasets. When these perspectives come together, researchers can uncover patterns and insights that might otherwise remain hidden.”
One Method, Many Domains: Grids, Malware, and Infrastructure
The same core logic that flags anomalous tissue also monitors power grids and identifies malicious software. Cybersecurity, energy systems, and financial fraud look different on the surface; at the level of method, the structure is consistent.
“In cybersecurity we analyse patterns of software or network behaviour to identify deviations that may indicate malicious activity. In energy systems such as smart power grids, anomaly detection can monitor streams of sensor data and identify unusual signals that may indicate faults or system instability. Although these domains appear very different, the fundamental task remains the same: learning what normal behaviour looks like and identifying meaningful deviations from it.”
Industry-connected projects in Finland, including IoT-based infrastructure monitoring, have tested these ideas operationally. What transfers between domains is the modelling principle; what changes is the data type and the contextual knowledge needed to interpret the results.
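The sensor-stream monitoring described in this section can be sketched in a few lines. The example below is an illustrative assumption rather than a description of any system used in these projects: it keeps a rolling window of recent readings as the picture of "normal" and flags a reading that sits many standard deviations away from that baseline. The function name, window size, threshold, and synthetic readings are all hypothetical.

```python
from collections import deque
import statistics

def rolling_anomalies(stream, window=20, z_threshold=4.0):
    """Yield (index, value) for readings far from the recent rolling baseline."""
    recent = deque(maxlen=window)
    for index, value in enumerate(stream):
        if len(recent) == window:
            mean = statistics.fmean(recent)
            std = statistics.pstdev(recent)
            if std > 0 and abs(value - mean) / std > z_threshold:
                yield index, value
                continue  # keep the flagged reading out of the baseline
        recent.append(value)

# Synthetic sensor readings (illustrative only): a small repeating ripple around 50.0,
# with one injected spike at position 45.
readings = [50.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
readings[45] = 53.0

flagged = list(rolling_anomalies(readings))
print(flagged)
```

Moving the same sketch to another domain would mean changing what a "reading" is and how the threshold is chosen, which is where the domain knowledge mentioned above comes in.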
Community, Industry, and the Role of IEEE
Research intended to function in real environments depends on access to real data. Business Finland–supported collaborations with Finnish companies have given Sohrab access to industrial datasets and the chance to evaluate methods against operational demands. At the international level, earlier work with NSF-funded centres at the University of Louisiana at Lafayette has evolved into a planned partnership through the newly established AHeAD centre (Accessible Healthcare for AI-Augmented Decisions), which focuses on trustworthy, human-centred AI for clinical decision support.
“Being connected to research environments in different countries helps bring together diverse expertise and perspectives, which is especially important in interdisciplinary areas like artificial intelligence and biomedical data science.”
He recently became Vice-Chair of IEEE Finland, an organisation he sees as a key connector between researchers, industry, and the broader technical community.

What Comes Next: Multimodal AI and the Question of Trust
Looking ahead, Sohrab identifies two interrelated priorities: building systems that integrate multiple data sources, and making their outputs interpretable enough to act on with confidence.
“Detecting an anomaly is only the first step. In most application domains, it is equally important to understand why a system identifies something as unusual. Developing models that can explain their reasoning and quantify uncertainty will be essential for building systems that experts can trust and use in decision-making processes.”
The through line is a vision of AI that extends human capacity rather than substituting for human judgement.
“If we can develop systems that are both technically robust and trustworthy, they have the potential to support major advances in areas such as healthcare, sustainable infrastructure, and environmental monitoring.”
Vice-Chair, IEEE Finland
Machine learning, subspace learning, anomaly detection, pattern recognition and related areas.
Signal Analysis and Machine Intelligence (SAMI) research group, Tampere University
Data Science Research Centre, Tampere University
Heinaniemi lab, University of Eastern Finland
Author: Sujatro Majumdar








