Pouria Babahajiani: Geometric Computer Vision: Omnidirectional visual and remotely sensed data analysis

In computer vision, geometry describes the topological structure of the environment. Specifically, it concerns measures such as shape, volume, depth, pose, disparity, motion, and optical flow, all of which are essential cues in scene acquisition and understanding. In his dissertation, Pouria Babahajiani designed and developed machine learning algorithms to improve practical systems that impact today’s technologies, including autonomous vehicles, virtual reality, augmented reality, robots, and smart-city infrastructures.

The contributions of Pouria Babahajiani’s dissertation fall into two parts. The first is a method for semantic segmentation of 3D Light Detection and Ranging (LiDAR) point clouds, together with its applications. The method is made efficient by combining fast rule-based processing for building and street surface segmentation with super-voxel-based feature extraction and classification for the remaining thin elements. In the experiments, the rule-based processing stage substantially improves not only computational time but also classification accuracy. Furthermore, two back ends are developed for the semantically labeled data, exemplifying two important applications: a 3D high-definition urban map that reconstructs a realistic 3D model from the input labeled point cloud, and semantic segmentation of 2D street view images.
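The two-stage idea described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the height rule, the regular voxel grid (a simple stand-in for super-voxels), and the names `segment_points`, `voxel_features`, `ground_z`, and `voxel_size` are all assumptions made for illustration.

```python
from collections import defaultdict


def segment_points(points, ground_z=0.2, voxel_size=1.0):
    """Label low points as 'street' by a height rule, then voxelize the rest.

    points: list of (x, y, z) tuples in metres.
    ground_z and voxel_size are hypothetical parameters, not thesis values.
    """
    labels = []
    voxels = defaultdict(list)  # voxel index -> indices of the points inside it
    for i, (x, y, z) in enumerate(points):
        if z <= ground_z:
            # Stage 1: cheap rule, no learning involved.
            labels.append("street")
        else:
            # Stage 2 input: bucket the remaining points into a coarse grid.
            labels.append("unlabeled")
            key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
            voxels[key].append(i)
    return labels, dict(voxels)


def voxel_features(points, voxels):
    """Compute a toy per-voxel feature: (point count, mean height)."""
    feats = {}
    for key, idxs in voxels.items():
        mean_z = sum(points[i][2] for i in idxs) / len(idxs)
        feats[key] = (len(idxs), mean_z)
    return feats
```

In a sketch like this, only the points left over after the rule-based stage reach the feature extraction step, which is one plausible reason such a stage can reduce computation; a real classifier would consume richer features than the two shown here.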

The second contribution of the thesis is the development of a practical, fast, and robust method to create high-resolution Depth-Augmented Stereo Panoramas (DASP) from a 360-degree VR camera that supports 6-Degrees-of-Freedom (DoF) head motion parallax using geometric computer vision and machine learning. A novel and complete optical flow-based pipeline is developed, which provides stereo 360-degree views of a real-world scene with DASP. The system consists of a texture and depth panorama for each eye. A bi-directional flow estimation network is explicitly designed for stitching and stereo depth estimation, which yields state-of-the-art results with a limited run-time budget. The proposed architecture explicitly leverages geometry by getting both optical flow ground-truths.
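As background on how image correspondences relate to depth, the sketch below uses the standard relation for a rectified pinhole stereo pair, Z = f · B / d. This is the simpler textbook case, not the panoramic geometry or the network used in the dissertation, and the function name and parameters are assumptions for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified pinhole stereo pair: Z = f * B / d.

    disparity_px: horizontal shift of a point between the two views, in pixels.
    focal_px:     focal length in pixels.
    baseline_m:   distance between the two camera centres, in metres.
    """
    if disparity_px <= 0:
        # No measurable parallax: treat the point as infinitely far away.
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

For example, with a focal length of 1000 pixels and a 6 cm baseline, a disparity of 20 pixels corresponds to a depth of about 3 m, and halving the disparity doubles the depth.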

“The supporter paradigm for semantic representations in computer vision is geometry. Using scene geometry indeed improves the representational power of computer vision models, allowing them to learn relationships in the 3D scenes and improving their performance by simplifying the learning task. Consequently, many complex relationships in the scene, such as objects’ shape, size, and depth, do not need to be learned from scratch with machine learning,” Pouria Babahajiani says.

The doctoral dissertation of M.Sc. Pouria Babahajiani in the field of machine learning and computer vision, titled Geometric Computer Vision: Omnidirectional Visual and Remotely Sensed Data Analysis, will be publicly examined in the Faculty of Information Technology and Communication Sciences of Tampere University at 12 o’clock on Friday 28 May 2021. The opponents will be Assoc. Prof. Sid Ahmed Fezza from the National Institute of Telecommunications and ICT (INTTIC), Algeria, and Prof. Azeddine Beghdadi from Université Sorbonne Paris Nord, France. The Custos will be Professor Moncef Gabbouj from the Faculty of Information Technology and Communication Sciences of Tampere University.

The event can be followed via remote connection on Zoom.

The dissertation is available online: http://urn.fi/URN:ISBN:978-952-03-1979-3

Photo: Sahar Husseini