SLAM: Difference between revisions - VR & AR Wiki - Virtual Reality & Augmented Reality Wiki

SLAM systems can be categorized based on the primary sensors used and the algorithmic approach:

* '''Visual SLAM (vSLAM):''' Relies mainly on [[cameras]]. Can be monocular (one camera), stereo (two cameras), or RGB-D (using a [[depth sensor]]). Often fused with [[IMU]] data ([[Visual Inertial Odometry|VIO-SLAM]]).

** '''[[ORB-SLAM2]]''': A widely cited open-source library using [[ORB feature detector|ORB features]]. It supports monocular, stereo, and RGB-D cameras but is purely vision-based (no IMU). Known for robust relocalization and creating sparse feature maps.

** '''[[ORB-SLAM3]]''': An evolution of ORB-SLAM2 (released c. 2020/21) that adds tightly coupled visual-inertial fusion (camera + IMU) for significantly improved accuracy and robustness, especially during fast motion. Supports [[fisheye lens|fisheye]] cameras and multi-map operation (handling different sessions or areas). It still produces a sparse map and is widely regarded in research as state-of-the-art for VIO-SLAM accuracy.

** '''[[RTAB-Map]]''' (Real-Time Appearance-Based Mapping): An open-source graph-based SLAM approach focused on long-term and large-scale mapping. Uses appearance-based loop closure. While it can use sparse features, it's often used with RGB-D or stereo cameras to build ''dense'' maps (point clouds, [[occupancy grid]]s, meshes) useful for navigation or scanning. Can also incorporate [[LiDAR]] data. Tends to be more computationally intensive than sparse methods.

* '''[[LiDAR]] SLAM:''' Uses Light Detection and Ranging sensors. Common in robotics and autonomous vehicles, and used in some high-end AR/MR devices (like [[Apple Vision Pro]]), often fused with cameras and IMUs for enhanced mapping and tracking robustness.

* '''Filter-based vs. Optimization-based:''' Historically, filter-based methods such as [[Extended Kalman Filter|EKF-SLAM]] were common. Modern systems often use graph-based optimization techniques (like [[bundle adjustment]]), which optimize the entire trajectory and map simultaneously, especially after loop closures, and generally achieve higher accuracy.
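
The optimization-based idea can be illustrated with a toy example. The following is a minimal sketch, not taken from any of the libraries above: it assumes a robot moving along a single axis (1D poses), odometry with a constant +0.1 bias, and one loop-closure constraint stating that the last pose coincides with the first. The function names (<code>solve_linear</code>, <code>optimize_pose_graph</code>) and all measurement values are hypothetical, chosen only for illustration. Real systems optimize 6-DoF poses and landmarks with nonlinear solvers; in this 1D case the problem is linear and can be solved in one step.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] so A and b are not modified.
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def optimize_pose_graph(num_poses, constraints):
    """Least-squares estimate of 1D poses from relative constraints.

    Each constraint is (i, j, z, w): "x[j] - x[i] should equal z",
    weighted by w. Pose 0 is held fixed at 0.0 to anchor the graph.
    """
    n = num_poses - 1  # unknowns are x[1] .. x[num_poses - 1]
    H = [[0.0] * n for _ in range(n)]  # normal-equation matrix J^T W J
    g = [0.0] * n                      # right-hand side J^T W z
    for i, j, z, w in constraints:
        # Residual x[j] - x[i] - z has derivative +1 w.r.t. x[j], -1 w.r.t. x[i].
        terms = [(j, 1.0), (i, -1.0)]
        for a, sa in terms:
            if a == 0:
                continue  # fixed pose, not an unknown
            g[a - 1] += w * sa * z
            for c, sc in terms:
                if c == 0:
                    continue
                H[a - 1][c - 1] += w * sa * sc
    return [0.0] + solve_linear(H, g)


# Hypothetical data: true motion is +1, +1, -1, -1 (the robot returns to
# its start), but every odometry reading carries a +0.1 bias.
odometry = [1.1, 1.1, -0.9, -0.9]

# Dead reckoning: integrate odometry only, so the drift accumulates.
dead_reckoning = [0.0]
for step in odometry:
    dead_reckoning.append(dead_reckoning[-1] + step)

# Pose graph: odometry edges plus one loop closure (pose 4 == pose 0).
constraints = [(k, k + 1, z, 1.0) for k, z in enumerate(odometry)]
constraints.append((0, 4, 0.0, 1.0))
optimized = optimize_pose_graph(5, constraints)

print("dead reckoning:", [round(x, 3) for x in dead_reckoning])
print("optimized:     ", [round(x, 3) for x in optimized])
```

In this sketch the loop closure spreads the accumulated error over all edges of the graph rather than correcting only the latest pose, which is the behaviour the optimization-based approach relies on. Raising the weight of the loop-closure constraint would pull the final pose closer to the start.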