Bishwo Adhikari: Deep learning assists in reducing 2D image annotation workload for object detection

Tampere University
Location: Hervanta campus, Tietotalo, auditorium TB109 (Korkeakoulunkatu 1, Tampere)
10 June 2022, 9.00–13.00
Language: English
Admission: Free event
Deep learning-based object detectors are used in a wide range of computer vision applications. The fundamental challenge is collecting a large amount of high-quality labeled data to tune billions of parameters. In addition, powerful computing hardware is needed for training and deployment. In his doctoral dissertation, MSc (Tech) Bishwo Prakash Adhikari proposes a method that accelerates image labeling by using an object detector trained on a partially annotated dataset to assist the annotator. He also studies the design choices involved in deploying object detectors for different tasks on resource-limited devices.

Bishwo Adhikari investigates two well-known problems related to deep learning-based object detection. The first is how to improve the image annotation process, including how noisy labels affect the performance of object detectors. The second is how to use object detection models efficiently in different computer vision tasks on resource-limited embedded platforms.

Adhikari’s dissertation focuses on image labeling. He proposes a human-machine collaborative approach to the image annotation needed to train deep learning-based object detectors. Instead of doing all the work manually, human annotators only need to perform annotation and inspection tasks that are comparatively faster and less tedious. Adhikari proposes several practical solutions to the problem of annotating large datasets.

“The proposed solutions leverage networks trained with partially annotated datasets to help the human annotator process the remainder of the data. Despite its simplicity, this approach experimentally shows good results in reducing the image labeling workload,” he says.
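The collaborative workflow described above can be illustrated with a minimal Python sketch. It is not the implementation from the dissertation: the names `Proposal`, `assisted_labeling_round`, `detect`, and `review`, as well as the `min_score` confidence threshold, are hypothetical and chosen only for illustration. The sketch assumes a detector already trained on the manually annotated part of the data, which proposes boxes for the remaining images so that the human annotator only has to inspect and correct them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Proposal:
    box: Box
    label: str
    score: float  # detector confidence in [0, 1]


def assisted_labeling_round(
    unlabeled: List[str],
    detect: Callable[[str], List[Proposal]],
    review: Callable[[str, List[Proposal]], List[Proposal]],
    min_score: float = 0.5,  # assumed threshold, not taken from the source
) -> Dict[str, List[Proposal]]:
    """Label the remaining images with detector proposals plus human review."""
    labeled: Dict[str, List[Proposal]] = {}
    for image_id in unlabeled:
        # Keep only proposals the detector is reasonably confident about.
        proposals = [p for p in detect(image_id) if p.score >= min_score]
        # The human annotator inspects the proposals and returns the final labels.
        labeled[image_id] = review(image_id, proposals)
    return labeled


# Toy stand-ins for a trained detector and a human reviewer (hypothetical).
def toy_detect(image_id: str) -> List[Proposal]:
    return [
        Proposal((0.0, 0.0, 10.0, 10.0), "person", 0.9),
        Proposal((5.0, 5.0, 8.0, 8.0), "person", 0.2),
    ]


def toy_review(image_id: str, proposals: List[Proposal]) -> List[Proposal]:
    return proposals  # this reviewer accepts every proposal unchanged


result = assisted_labeling_round(["img_001", "img_002"], toy_detect, toy_review)
```

In a real setting the reviewed labels would be added to the training set and the detector retrained before the next batch, so that its proposals improve as more data is annotated.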

Adhikari also investigates the challenge of deploying object detectors in different imaging applications. “In my dissertation I present object detection in facial analysis, person detection and tracking in edge device, and path prediction of moving objects on edge device. By carefully selecting a detection network, adjusting network size, and utilizing the power of existing accelerator devices, a state-of-the-art detector can be deployed on low-resourced devices to solve various vision-based problems without compromising much accuracy,” he states.

While image data labeling is a tedious, expensive, and error-prone process, there has been significant interest in making large-scale labeled datasets openly available for diverse tasks. However, these datasets are still inadequate for developing well-performing real-world applications in practice.

“It is recommended to collect and label some custom data and apply transfer learning on the network trained on large-scale benchmark datasets. The method I propose speeds up data annotation, and makes it easier to train machine learning models for real-world applications,” Adhikari adds.

Modern deep learning-based object detection networks require large amounts of labeled and balanced training data, which are typically not available in industry. Adhikari’s dissertation provides a comprehensive guide on collecting labeled datasets to train state-of-the-art object detectors and studies the design spectrum for deploying object detectors for real-world applications.

The doctoral dissertation of MSc (Tech) Bishwo Prakash Adhikari in the field of Information Technology, titled Computer Assisted Image Labeling for Object Detection Using Deep Learning, will be publicly examined in the Faculty of Information Technology and Communication Sciences at Tampere University at 12 o'clock on 10 June 2022. The venue is auditorium TB109 in Tietotalo (Korkeakoulunkatu 1, Tampere). The Opponent will be Associate Professor Miguel Bordallo López from the University of Oulu. The Custos will be Associate Professor Esa Rahtu from Tampere University. The dissertation is co-supervised by Dr. Heikki Huttunen from Visy Oy.

The dissertation is available online at http://urn.fi/URN:ISBN:978-952-03-2420-9.

Photo: Rasmita