Simultaneous Localisation and Mapping (SLAM)
The ability to map an unknown environment is important for field robots, drones and autonomous vehicles to navigate independently. Simultaneous Localisation and Mapping (or SLAM for short) is a relatively well-studied problem in robotics with a two-fold aim:
• building a representation of the environment (aka mapping)
• finding where the robot is with respect to the map (aka localisation).
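The chicken-and-egg coupling between these two aims can be made concrete with a toy one-dimensional filter: the robot pose and a single landmark are estimated jointly, so each relative observation sharpens both. The following is a minimal NumPy sketch under simplified assumptions (linear motion, one static landmark); every name and number here is illustrative, not taken from any AIML system.

```python
import numpy as np

# Toy 1-D SLAM as a linear Kalman filter (illustrative sketch only).
# The state holds both the robot position and a landmark position;
# noisy odometry and a relative range measurement jointly refine both,
# which is the essential coupling between localisation and mapping.

x = np.array([0.0, 5.0])            # estimate: robot at 0, landmark near 5
P = np.diag([0.01, 4.0])            # robot well known, landmark uncertain
Q = np.diag([0.01, 0.0])            # motion noise (the landmark is static)
R = 0.1                             # measurement noise variance
H = np.array([[-1.0, 1.0]])         # z = landmark_x - robot_x + noise

true_robot, true_landmark = 0.0, 6.0
rng = np.random.default_rng(0)

for _ in range(30):
    u = 0.2                                         # commanded forward step
    true_robot += u + rng.normal(0.0, np.sqrt(Q[0, 0]))
    # Predict: apply the odometry to the robot part of the state.
    x = x + np.array([u, 0.0])
    P = P + Q
    # Update: fuse the relative measurement of the landmark.
    z = (true_landmark - true_robot) + rng.normal(0.0, np.sqrt(R))
    y = z - (H @ x)[0]                              # innovation
    S = (H @ P @ H.T)[0, 0] + R                     # innovation variance
    K = (P @ H.T) / S                               # Kalman gain, shape (2, 1)
    x = x + K[:, 0] * y
    P = (np.eye(2) - K @ H) @ P

print(f"estimated landmark: {x[1]:.2f} (true value {true_landmark})")
```

Because the relative measurement ties the landmark to the robot, the filter recovers the landmark position even though neither quantity is observed directly; real SLAM systems generalise this idea to thousands of landmarks and full 3D poses.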
When is SLAM needed?
For mobile robotics, it is important to know where the agent is at any moment in time. Normally, GPS can provide a rough location for a robot. However, in GPS-denied environments such as indoors, underground, or underwater, the mobile agent has to rely solely on its on-board sensors to construct a representation of the environment, which would then allow it to locate itself. This is the scenario in which SLAM is needed. Even in situations where GPS can provide coarse localisation, SLAM can be used to provide a fine-grained estimate of the robot location.
SLAM and deep learning
Different variants of the SLAM problem can be formed using various combinations of sensors, such as monocular, stereo and RGB-D cameras, laser scanners, and Inertial Measurement Units. When cameras are used as the primary sensor, the problem is termed Visual SLAM, and it inherits many of the problems that come with cameras, such as errors caused by illumination changes.
AIML researchers are applying deep learning techniques to address many of the perceptual shortcomings of Visual SLAM, including single-view depth prediction, better features for matching images across large baselines, two-view pose and depth estimation, and object-based SLAM via object detection in images.
Real-time Monocular Sparse SLAM with semantically meaningful landmarks (video prepared by Mehdi Hosseinzadeh).
2016 "Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age" Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian Reid, John J Leonard
This paper presents a survey of the current status of SLAM and discusses the open problems and future research directions for SLAM.
2016 "Unsupervised CNN for single view depth estimation: Geometry to the rescue" Ravi Garg, Vijay Kumar BG, Gustavo Carneiro, Ian Reid
We present a single-view depth estimation system that can be trained end-to-end from scratch, in a fully unsupervised fashion, using data captured with a stereo rig, removing the need for vast amounts of annotated training data. Our network is trained on less than half of the KITTI dataset and gives performance comparable to that of state-of-the-art supervised methods for single-view depth estimation.
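The unsupervised signal in this line of work is a photometric reconstruction loss: the predicted disparity warps one stereo image to synthesise the other, and the synthesis error supervises the depth network. Below is a much-simplified NumPy sketch of that loss, using nearest-neighbour sampling; the actual method uses differentiable sampling and a deep network, and all names here are illustrative.

```python
import numpy as np

def photometric_loss(left, right, disparity):
    """Reconstruct the left image by sampling the right image at
    horizontally shifted columns, then compare against the real left
    image. Illustrative nearest-neighbour version of the warping loss."""
    h, w = left.shape
    cols = np.arange(w)
    recon = np.empty_like(left)
    for r in range(h):
        # For a rectified pair, left pixel (r, c) maps to right pixel
        # (r, c - disparity); clip to stay inside the image.
        src = np.clip(np.round(cols - disparity[r]).astype(int), 0, w - 1)
        recon[r] = right[r, src]
    return np.mean((left - recon) ** 2), recon

# Synthetic rectified pair: the right image is the left shifted by 3 px.
base = np.tile(np.arange(20.0), (5, 1))
left, right = base, base + 3.0
loss, recon = photometric_loss(left, right, np.full((5, 20), 3.0))
```

With the correct disparity the reconstruction matches the left image away from the border, so the loss is small; a wrong disparity raises it, which is what drives the depth network during training.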
2019 "Real-Time Monocular Object-Model Aware Sparse SLAM" Mehdi Hosseinzadeh, Kejie Li, Yasir Latif, Ian Reid
We introduce a monocular SLAM system that incorporates plane and object models, allowing more accurate camera tracking and a richer map representation without significant computational cost.
2018 "Unsupervised learning of monocular depth estimation and visual odometry with deep feature reconstruction" Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid
We present an unsupervised learning framework for single-view depth estimation and monocular visual odometry, using stereo data for training.
2019 "Scalable Place Recognition Under Appearance Change for Autonomous Driving" Anh-Dzung Doan, Yasir Latif, Tat-Jun Chin, Yu Liu, Thanh-Toan Do, Ian Reid
A major challenge in place recognition for autonomous driving is to be robust against appearance changes due to short-term (e.g., weather, lighting) and long-term (seasons, vegetation growth, etc.) environmental variations.
We propose a novel method for scalable place recognition that is lightweight in both training and testing, while continuously accumulating data so that the full range of appearance variation is retained for long-term place recognition. Our results show significant potential towards achieving long-term autonomy.
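At its core, place recognition of this kind reduces to retrieving the database image whose global descriptor best matches the query. The sketch below shows only that retrieval step, as cosine-similarity nearest neighbour over hypothetical descriptors; the paper's method additionally manages a continuously growing database, which is not shown.

```python
import numpy as np

def match_place(query_desc, db_descs):
    """Return the index and cosine similarity of the best-matching
    database descriptor for a query descriptor. Illustrative only."""
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    q = query_desc / np.linalg.norm(query_desc)
    sims = db @ q
    best = int(np.argmax(sims))
    return best, float(sims[best])

# Hypothetical 3-place database with unit-length descriptors.
db = np.eye(3)
idx, sim = match_place(np.array([0.1, 0.9, 0.0]), db)
```

Robustness to appearance change then hinges on the descriptors themselves: learned embeddings that map the same place under different weather, lighting or seasons to nearby vectors make this simple retrieval step effective.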
Visual sensing for localisation and mapping in mining
Current mine surveying involves scanning from a number of fixed points using laser range-finding equipment. The aim of this project is to develop computer vision algorithms to improve the speed and accuracy of this digital mapping of mines, allowing accurate mapping in GPS-denied locations and in locations where other sensors cannot be deployed.
Ian Reid, Tat-Jun Chin, Maptek
ARC Grant ID: LP140100946
Lifelong Computer Vision Systems
The aim of the project is to develop robust computer vision systems that can operate over a wide area and over long periods. This is a major challenge because the geometry and appearance of an environment can change over time, and long-term operation requires robustness to this change. The outcome will be a system that can capture an understanding of a wide area in real time, through building a geometric map endowed with semantic descriptions, and which uses machine learning to continuously improve performance. The significance will lie in turning an inexpensive camera into a high-level sensor of the world, ushering in cognitive robotics and autonomous systems.
ARC Grant ID: FL130100102
Recognising and reconstructing objects in real time from a moving camera
The aim of this project is to visually understand an environment as seen from a moving camera in real time. This entails the recovery of 3D shape and the recognition of individual objects in the environment, while also recognising overall scene types (indoor or outdoor, home or office). This is a significant advance over existing systems, which focus on sparse 3D shape estimation, and produces a model of the environment which is akin to that maintained by a human observer. Such a model has applications beyond the typical domain of robotics, including driver assistance, automated map annotation, environment capture and true scene understanding, which is the original and ongoing goal of computer vision.
Ian Reid, Anthony Dick
ARC Grant ID: DP130104413