Monocular Depth Estimation in Python using Monodepth2 and Manydepth

Aditya NG
Feb 7, 2022 · 2 min read


Estimating 3D depth from a single camera has long been an ill-posed problem. Solving it matters for robotics and self-driving vehicles, because traditional depth sensors such as LiDAR, radar, sonar and even stereo vision have drawbacks: high cost, a sparse data signal and, in some cases, the need for regular calibration. 3D object detection is harder on the sparse signal from LiDAR, radar or sonar than on the rich, dense vision signal from a camera. Stereo cameras and camera-LiDAR setups also need regular calibration to stay accurate.

Monodepth2 in action

Monocular depth estimation using neural networks offers a simple and elegant solution to the cost, sparse-signal and calibration problems of traditional approaches: a neural network predicts depth from a single image or a sequence of images.

Monodepth2[1] is a self-supervised monocular depth estimation technique that takes in one frame at a time and predicts its depth map. Monodepth2 runs at about 100 ms per frame on most modern NVIDIA GPUs. To install it and run the webcam demo, run the following:

pip install monodepth2
python -m monodepth2 # Run the webcam demo

Manydepth[2] is a follow-up to Monodepth2. It is a self-supervised monocular depth estimation technique that takes in two adjacent frames at a time and predicts the depth map. Manydepth runs at about 300 ms per frame on most modern NVIDIA GPUs. To install it and run the webcam demo, run the following:

pip install manydepth
python -m manydepth # Run the webcam demo
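Since Manydepth works on two adjacent frames, a video stream has to be turned into (previous, current) pairs before each prediction. The snippet below is a minimal sketch of that pairing step only. The helper name `adjacent_pairs` is hypothetical and not part of the manydepth package, and skipping the first frame (which has no predecessor) is an assumption about how one might handle the start of a stream.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def adjacent_pairs(frames: Iterable[Frame]) -> Iterator[Tuple[Frame, Frame]]:
    """Yield (previous, current) for every frame after the first.

    Hypothetical helper for illustration; not part of the manydepth package.
    """
    window = deque(maxlen=2)  # keeps only the two most recent frames
    for frame in frames:
        window.append(frame)
        if len(window) == 2:
            yield window[0], window[1]


if __name__ == "__main__":
    # Strings stand in for images so the sketch runs without a camera.
    for previous, current in adjacent_pairs(["f0", "f1", "f2", "f3"]):
        print(previous, current)
```

Because the window holds only two frames, memory use stays constant no matter how long the stream runs.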

To use either model in your robotics project, you can import it and pass it frames as follows:

Using Manydepth and Monodepth2 in your robotics projects
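As a rough illustration of how such a model might sit inside a robot's perception loop, here is a sketch that does not call the real packages. It treats the depth model as any callable that maps a frame to a depth map, and uses the latencies quoted above to decide how many camera frames to skip between predictions. Everything named here (`DepthEstimator`, `frame_stride`, `estimate_stream`, `placeholder_estimator`) is hypothetical, and the 30 FPS camera rate is an assumption. In a real project, the placeholder would be swapped for a wrapper around whichever model you installed.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical interface: any callable that maps a frame to a depth map.
# Nested lists stand in for image arrays so the sketch has no dependencies.
Frame = List[List[float]]
DepthMap = List[List[float]]
DepthEstimator = Callable[[Frame], DepthMap]


def frame_stride(latency_ms: int, camera_fps: int) -> int:
    """Number of camera frames that arrive during one inference call."""
    frames_per_call = -(-(latency_ms * camera_fps) // 1000)  # ceiling division
    return max(1, frames_per_call)


def estimate_stream(
    frames: Iterable[Frame], estimator: DepthEstimator, stride: int
) -> Iterator[Tuple[int, DepthMap]]:
    """Run the estimator on every stride-th frame, yielding (index, depth)."""
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield index, estimator(frame)


def placeholder_estimator(frame: Frame) -> DepthMap:
    """Stand-in for a real network: returns a depth of 1.0 for every pixel."""
    return [[1.0 for _ in row] for row in frame]


if __name__ == "__main__":
    # Assumed 30 FPS camera with the roughly 100 ms per-frame latency quoted above.
    stride = frame_stride(latency_ms=100, camera_fps=30)
    frames = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(7)]
    for index, depth in estimate_stream(frames, placeholder_estimator, stride):
        print(index, depth)
```

With these assumed numbers, a 100 ms model would process every 3rd frame and a 300 ms model every 9th, which gives a feel for how stale a depth map can be by the time a planner consumes it.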

