This document provides an overview of 3D SLAM (Simultaneous Localization and Mapping) techniques. It discusses what SLAM is, common sensor types used (such as laser, camera), visual odometry methods (feature-based and direct), graph optimization, loop closure, map reconstruction, and several popular open-source SLAM projects (ORB-SLAM, LSD-SLAM, ElasticFusion). It also briefly mentions applications of SLAM such as in robotics, navigation, and augmented/virtual reality.
9. Sensor
Laser
• Accurate
• Fast
• Long research history
• Heavy
• Expensive
• Examples: SICK, Velodyne, RPLIDAR
Camera
• Lightweight
• Cheap
• Rich information
• High computational complexity
• Strong assumptions about the environment
• Easily affected by noise
• Examples: monocular, stereo, RGB-D
2016/12/13 9
11. 3D imaging tech.
                Mono               Stereo                   Structured light  TOF
Software        complex            complex                  normal            simple
Hardware cost   low                low                      high              normal
Response time   normal             normal                   slow              fast
Depth accuracy  low                low                      high              normal
Low lighting    bad                bad                      good              good
Bright light    bad                good                     bad               good
Distance        not limited        not limited              medium            medium
Advantage       low cost           x                        high accuracy     robust to environmental light
Disadvantage    scale uncertainty  computationally complex  slow response     limited detection distance
13. Visual Odometry
• Estimate camera ego-motion from two sequential images
  • Input: points in space projected into the two camera views
  • Output: ego-motion between the two camera poses
• Monocular: only relative pixel positions, no depth information
• Stereo, RGB-D: depth information is obtained directly
• Dimensions
  • 2D-2D: estimate motion from two sets of pixels (epipolar geometry)
  • 3D-2D: estimate motion from known 3D positions and their projected positions (PnP)
  • 3D-3D: estimate motion from two sets of known 3D points (ICP)
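The 3D-3D case can be illustrated with the closed-form alignment step that ICP repeats at each iteration. This is a minimal sketch, assuming NumPy is available and that point correspondences are already known (a full ICP would re-estimate them every iteration); the function name `align_3d_3d` and the synthetic data are illustrative, not taken from the slides.

```python
import numpy as np

def align_3d_3d(src, dst):
    """Return R, t minimizing sum ||R @ src_i + t - dst_i||^2 (SVD / Kabsch method)."""
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

# Synthetic test data (assumed): 20 random points moved by a known motion
rng = np.random.default_rng(0)
src = rng.normal(size=(20, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
dst = src @ R_true.T + t_true

R_est, t_est = align_3d_3d(src, dst)
residual = np.abs(src @ R_est.T + t_est - dst).max()
```

With noise-free correspondences the estimate should reproduce the known rotation and translation; with real sensor data the residual is nonzero and outlier rejection becomes important.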
14. Visual Odometry
• Problems with a monocular camera
  • Scale is ambiguous; initialization parameters must be supplied
  • Motion cannot be estimated under pure rotation
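The scale problem can be checked numerically: if the whole scene and the camera translation are multiplied by the same factor, every pixel stays where it was, so images alone cannot reveal the factor. The sketch below assumes an ideal pinhole camera with no rotation; `project`, `scaled`, and the sample coordinates are hypothetical names and values chosen for illustration.

```python
def project(point, camera_position, focal=1.0):
    """Pinhole projection of a 3D point seen from a camera at camera_position (no rotation)."""
    x, y, z = (p - c for p, c in zip(point, camera_position))
    return (focal * x / z, focal * y / z)

def scaled(vector, s):
    """Multiply every coordinate by the scale factor s."""
    return tuple(s * v for v in vector)

points = [(0.5, 0.2, 4.0), (-1.0, 0.3, 6.0), (0.1, -0.4, 5.0)]
camera_1 = (0.0, 0.0, 0.0)
camera_2 = (0.2, 0.0, 0.1)   # second view, translated relative to the first
s = 3.0                      # arbitrary global scale factor

# Largest pixel difference between the original world and the scaled world
max_diff = 0.0
for p in points:
    for cam in (camera_1, camera_2):
        u, v = project(p, cam)
        us, vs = project(scaled(p, s), scaled(cam, s))
        max_diff = max(max_diff, abs(u - us), abs(v - vs))
```

`max_diff` stays at floating-point noise level for any nonzero `s`, which is why a monocular system needs an external cue for absolute scale.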
16. Visual Odometry - Feature (Mainstream)
• Features
  • SIFT, ORB
  • Keypoint, descriptor
• Algorithm procedure
  1. Extract feature points and descriptors from the image
  2. Match the current image against the previous image
  3. Minimize the reprojection error to estimate camera ego-motion (PnP, ICP)
• Disadvantages
  • Feature extraction is time-consuming
  • Feature extraction can fail
  • Incorrect matches
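Step 2 of the procedure (matching) can be sketched with binary descriptors compared by Hamming distance, as is done for ORB. This is a toy illustration under stated assumptions: 8-bit integers stand in for real 256-bit ORB descriptors, and `match_descriptors` with its `max_distance` threshold is a hypothetical helper, not a library API. The mutual-nearest-neighbor check is one common way to reduce the incorrect matches listed as a disadvantage.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def nearest(query, candidates):
    """Index and distance of the candidate closest to query."""
    distances = [hamming(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]

def match_descriptors(prev_desc, curr_desc, max_distance=2):
    """Keep only mutual nearest neighbors whose distance is small enough."""
    matches = []
    for i, d in enumerate(prev_desc):
        j, dist = nearest(d, curr_desc)
        back, _ = nearest(curr_desc[j], prev_desc)
        if back == i and dist <= max_distance:
            matches.append((i, j, dist))
    return matches

# Toy descriptors (assumed values) for the previous and the current image
prev_desc = [0b10110010, 0b01001101, 0b11110000]
curr_desc = [0b01001100, 0b10110011, 0b00001111]

matches = match_descriptors(prev_desc, curr_desc)
# matches == [(0, 1, 1), (1, 0, 1)]; the third descriptor is rejected
```

The matched pairs would then feed the motion estimation in step 3.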
17. Visual Odometry - Direct
• Photometric invariance assumption
• Sensitive to lighting changes
• No feature extraction
• Current systems
  • Sparse direct: SVO-SLAM
  • Semi-dense direct: LSD-SLAM
  • Dense direct: DVO-SLAM
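The idea behind direct methods can be shown on a one-dimensional "image": instead of matching features, search for the motion that minimizes the intensity difference between the two frames. This is a deliberately simplified sketch in which the motion is an integer pixel shift and the intensity rows are made-up values; real systems optimize a 6-DOF pose over 2D images. The last line illustrates why the photometric invariance assumption makes the approach sensitive to lighting.

```python
def photometric_error(ref, cur, shift):
    """Mean squared intensity difference between ref[i] and cur[i + shift] over the overlap."""
    total, count = 0.0, 0
    for i, value in enumerate(ref):
        j = i + shift
        if 0 <= j < len(cur):
            total += (value - cur[j]) ** 2
            count += 1
    return total / count

# Reference row of pixel intensities, and the same scene shifted 2 pixels to the right
ref = [10, 12, 40, 90, 60, 20, 11, 10, 10, 10]
cur = [10, 10] + ref[:-2]

# Exhaustive search over candidate shifts stands in for the pose optimization
best_shift = min(range(-3, 4), key=lambda s: photometric_error(ref, cur, s))

# A global brightness change breaks the invariance assumption:
# the error at the correct shift is no longer zero
brighter = [v + 30 for v in cur]
error_after_lighting_change = photometric_error(ref, brighter, best_shift)
```

Here `best_shift` recovers the 2-pixel motion with zero error, while a uniform +30 brightness change raises the error at the same shift to 900.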
18. Visual Odometry
                                   Feature                                             Direct
Tracking                           Feature descriptors (100-1000 corners or surfaces)  Pixels
Reconstruction                     Corners                                             Whole image
Computational complexity           Low                                                 Sparse: low; dense: high
Robustness to model inconsistency  Yes                                                 No
History                            20+ years                                           Since 2012
Outliers                           Robust                                              Hard to remove
19. Visual Odometry - Conclusion
• Visual odometry
  • Match the pixels of feature points
  • Compute camera ego-motion from the matching result
  • Estimate the positions of the feature points in the global map
• Imperfections
  • Results are noisy => global optimization
  • Error accumulates => loop closure
  • Tracking loss (camera moves too fast or is blocked) => re-localization
23. Loop closure
• Recognize previously visited locations
• Visual odometry error accumulates gradually
• Use the revisit as a cue to correct the accumulated error
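A small simulation can make the three points above concrete. The sketch assumes a robot driving a square loop whose odometry adds a constant offset at every step (the `bias` value is invented for illustration). Once the start location is recognized again, the gap between the last and first pose is known and is spread linearly along the trajectory. This linear spreading is a crude stand-in for pose-graph optimization; it recovers the path exactly here only because the simulated drift is constant per step.

```python
import math

def square_loop(side=5):
    """Ground-truth positions along a square path that returns to the start."""
    path = [(0.0, 0.0)]
    for dx, dy in [(1, 0), (0, 1), (-1, 0), (0, -1)]:
        for _ in range(side):
            x, y = path[-1]
            path.append((x + dx, y + dy))
    return path

def integrate_odometry(path, bias):
    """Accumulate step-by-step motion estimates, each corrupted by a constant bias."""
    estimate = [path[0]]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        ex, ey = estimate[-1]
        estimate.append((ex + (x1 - x0) + bias[0], ey + (y1 - y0) + bias[1]))
    return estimate

def close_loop(poses):
    """Spread the end-to-start gap linearly over the trajectory (last pose must equal the first)."""
    n = len(poses) - 1
    gap_x = poses[-1][0] - poses[0][0]
    gap_y = poses[-1][1] - poses[0][1]
    return [(x - gap_x * i / n, y - gap_y * i / n) for i, (x, y) in enumerate(poses)]

def mean_error(a, b):
    """Average Euclidean distance between corresponding poses."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

true_path = square_loop()
drifted = integrate_odometry(true_path, bias=(0.02, -0.01))
corrected = close_loop(drifted)

error_before = mean_error(drifted, true_path)
error_after = mean_error(corrected, true_path)
```

In this setup `error_before` grows with path length while `error_after` drops to floating-point noise. With realistic, non-constant drift a linear correction is not sufficient, which is where global graph optimization comes in.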
24. Loop closure
• Bag-of-Words
• Process
  • Separate the nose, eyes, and mouth
  • Build a dictionary
  • Face = eye*2 + nose*1 + mouth*1
  • Features => words
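The face analogy maps directly onto a word-count vector. In the sketch below each image is reduced to a histogram over a small dictionary and two images are compared by cosine similarity. The vocabulary, the "car" image, and the helper names are assumptions made for illustration; a real system builds its dictionary by clustering feature descriptors rather than using named parts.

```python
from collections import Counter
import math

def bow_vector(words, vocabulary):
    """Count how often each dictionary word occurs in an image."""
    counts = Counter(words)
    return [counts[w] for w in vocabulary]

def cosine_similarity(a, b):
    """Similarity in [0, 1] for non-negative count vectors; 0.0 if either vector is empty."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

vocabulary = ["eye", "nose", "mouth", "wheel", "window"]

# Face = eye*2 + nose*1 + mouth*1
face_a = bow_vector(["eye", "eye", "nose", "mouth"], vocabulary)
face_b = bow_vector(["mouth", "eye", "nose", "eye"], vocabulary)   # same words, different order
car = bow_vector(["wheel"] * 4 + ["window"] * 2, vocabulary)

same_place_score = cosine_similarity(face_a, face_b)
different_place_score = cosine_similarity(face_a, car)
```

Because word order and position are discarded, two different places with the same word counts would also score highly, so a loop candidate found this way is usually verified geometrically before it is accepted.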
25. Loop closure
• Advantage of visual SLAM
  • Compared with laser, a visual SLAM system has richer information, which can increase accuracy.
[Figure labels: False Positive, False Negative]
31. Kintinuous
• Kintinuous addresses three main problems of KinectFusion
  1. Restriction to a fixed, small area in space
  2. Reliance on geometric information alone for camera pose estimation
  3. No means of explicitly incorporating loop closures
• Disadvantages of Kintinuous
  1. Needs a GPU
  2. Strip loop closure; cannot handle a large number of loop closures
  3. Currently supports only the ASUS Xtion Pro Live camera; the Microsoft Kinect cannot be used
33. ElasticFusion
• ElasticFusion improves on the two extremes of former SLAM systems: extremely loopy motion (MonoSLAM [1], KinectFusion [2]) or a small number of loops (McDonald et al. [3] or Whelan et al. [4])
• Therefore ElasticFusion can handle a room-sized space filmed with a hand-held camera that revisits the same objects and forms multiple loops.
• Disadvantages of ElasticFusion
  1. The system is not yet optimized and works only in a restricted space
  2. Cannot reconstruct the map correctly when there is not enough connected surface information.
[1] A. J. Davison, N. D. Molton, I. Reid, and O. Stasse, "MonoSLAM: Real-Time Single Camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), pp. 1052-1067, 2007.
[2] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon, "KinectFusion: Real-Time Dense Surface Mapping and Tracking," in Proceedings of the International Symposium on Mixed and Augmented Reality (ISMAR), 2011.
[3] J. B. McDonald, M. Kaess, C. Cadena, J. Neira, and J. J., "Real-time 6-DOF multi-session visual SLAM," Robotics and Autonomous Systems, pp. 1144-1158, 2013.
[4] T. Whelan, M. Kaess, H. Johannsson, M. F. Fallon, J. J. Leonard, and J. B. McDonald, "Real-time large scale dense RGB-D SLAM with volumetric fusion," International Journal of Robotics Research (IJRR), 34(4-5):598-626, 2015.
35. ORB-SLAM2
• Based on sparse feature points
• Input: monocular, stereo, or RGB-D camera
• No GPU needed
• Still under maintenance; good for future development
• Disadvantages of ORB-SLAM2
  1. Stereo and RGB-D support is not good enough (no point cloud)
  2. Takes time to load the dictionary
  3. Frame rate <= 10 Hz with Microsoft Kinect2 at qHD (960x540) on a ThinkPad T450
  4. Map saving and loading are not yet supported
37. LSD-SLAM
• Monocular camera
• Semi-dense depth map
• High computational efficiency (can even run on a mobile phone)
• Can handle multiple loops and large scenes
• Disadvantages of LSD-SLAM
  1. Sensitive to lighting (direct method)
  2. Localization error is 5 to 10 times that of ORB-SLAM
  3. Assumes smooth camera movement => matching fails when the camera moves too fast
  4. Assumes no moving objects