{{stub}}
[[SLAM]] ('''S'''imultaneous '''L'''ocalization '''A'''nd '''M'''apping) is a computational problem and a set of [[algorithms]] used primarily in robotics and autonomous systems, including [[VR headset]]s and [[AR headset]]s. The goal of SLAM is for a device, using data from its onboard [[sensors]] (like [[cameras]], [[IMU]]s, and sometimes [[depth sensors]]), to construct a [[map]] of an unknown [[environment]] while simultaneously determining its own position and orientation ([[pose]]) within that newly created map. This enables [[inside-out tracking]], meaning the device tracks its position in [[3D space]] without needing external sensors or markers (like [[Lighthouse]] base stations).
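In its standard probabilistic formulation, SLAM is the problem of estimating the joint posterior over the device's trajectory <math>x_{1:t}</math> and the map <math>m</math>, given all sensor observations <math>z_{1:t}</math> and motion inputs <math>u_{1:t}</math>:

:<math>p(x_{1:t}, m \mid z_{1:t}, u_{1:t})</math>

The filter-based and optimization-based methods described below are different strategies for approximating this posterior.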


=== How SLAM Works ===
SLAM systems typically involve several key components working together:
* '''[[Feature Detection|Feature Detection/Tracking]]:''' Identifying salient points or features in the sensor data (e.g., corners in camera images). These features are tracked over time as the device moves (a minimal code sketch follows this list).
* '''[[Mapping]]:''' Using the tracked features and the device's estimated movement to build and update a representation (the map) of the environment. This map might consist of feature points, lines, planes, or denser representations like point clouds or meshes.
* '''[[Localization]] (or Pose Estimation):''' Estimating the device's current position and orientation (pose) relative to the map it has built.
* '''[[Loop Closure]]:''' Recognizing when the device has returned to a previously visited location. This is crucial for correcting accumulated drift in the map and pose estimate, leading to a globally consistent map.
* '''[[Sensor Fusion]]:''' Often combining data from multiple sensors (e.g., cameras and [[IMU]]s in [[Visual Inertial Odometry|VIO]]) to improve robustness and accuracy against challenges like fast motion or textureless surfaces.
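As a concrete illustration of the feature-detection/tracking step, here is a minimal Python sketch using ORB features (the binary feature the ORB-SLAM family is named after) with OpenCV. The frame filenames are hypothetical stand-ins for two consecutive images from a tracking camera:

<syntaxhighlight lang="python">
import cv2

# Hypothetical filenames standing in for two consecutive tracking-camera frames.
prev_frame = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr_frame = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute their binary descriptors in both frames.
orb = cv2.ORB_create(nfeatures=1000)
kp_prev, des_prev = orb.detectAndCompute(prev_frame, None)
kp_curr, des_curr = orb.detectAndCompute(curr_frame, None)

# Brute-force Hamming matching suits ORB's binary descriptors; cross-checking
# keeps only mutually-best matches as a cheap outlier filter.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_prev, des_curr), key=lambda m: m.distance)

# Pixel coordinates of the tracked features: the raw input a SLAM system
# feeds into pose estimation (e.g., essential-matrix recovery) and mapping.
pts_prev = [kp_prev[m.queryIdx].pt for m in matches[:200]]
pts_curr = [kp_curr[m.trainIdx].pt for m in matches[:200]]
</syntaxhighlight>

This covers only the front end; a full SLAM system would feed these matched points into localization, mapping, and loop-closure modules.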


=== SLAM vs. [[Visual Inertial Odometry]] (VIO) ===
While related and often used together, SLAM and [[Visual Inertial Odometry]] (VIO) have different primary goals:
* '''[[VIO]]''' primarily focuses on estimating the device's ego-motion (how it moves relative to its immediate surroundings) by fusing visual data from cameras and motion data from an [[IMU]]. It is excellent for short-term, low-latency tracking but accumulates [[Drift (tracking)|drift]] over time and does not necessarily build a persistent, globally consistent map optimized for re-localization or sharing.
* '''SLAM''' focuses on building a map of the environment and localizing the device within that map. It aims for global consistency, often incorporating techniques like loop closure (a toy drift-correction sketch follows this list). Many modern VR/AR tracking systems use VIO for the high-frequency motion-estimation component within a larger SLAM framework that handles mapping, persistence, and drift correction.
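To make the drift-versus-loop-closure distinction concrete, the toy sketch below (hypothetical numbers) dead-reckons a square walk the way VIO would, detects the accumulated drift when the loop closes, and spreads the correction back along the trajectory; this is a heavily simplified stand-in for pose-graph optimization:

<syntaxhighlight lang="python">
import numpy as np

# The device walks a 1 m square and returns to its start, but VIO mis-measures
# each leg slightly (illustrative numbers), so dead reckoning drifts.
leg_lengths = np.array([1.05, 1.00, 0.98, 1.01])   # true length: 1.00 m each
headings = np.deg2rad([0, 90, 180, 270])           # the four sides of the square
steps = leg_lengths[:, None] * np.stack([np.cos(headings), np.sin(headings)], axis=1)
poses = np.cumsum(steps, axis=0)                   # drifted dead-reckoned trajectory

# Loop closure: the device recognizes its start point, so the final pose
# should be (0, 0); the residual is the accumulated drift.
drift = poses[-1]                                  # about (0.07, -0.01) meters

# Crude stand-in for pose-graph optimization: spread the drift linearly back
# along the trajectory so the loop closes and the map becomes consistent.
weights = (np.arange(1, len(poses) + 1) / len(poses))[:, None]
corrected = poses - drift * weights
print(corrected[-1])                               # ~ [0, 0]: the loop now closes
</syntaxhighlight>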


=== Importance in VR/AR ===
SLAM (often in conjunction with VIO) is fundamental technology for modern standalone [[VR headset]]s and [[AR headset]]s/[[Smart Glasses|glasses]]:
* '''[[6DoF]] Tracking:''' Enables full six-degrees-of-freedom tracking (positional and rotational) without external base stations or markers, allowing users to move freely within their [[Playspace|playspace]].
* '''[[World Locking|World-Locking]]:''' Ensures virtual objects appear stable and fixed in the real world (for AR/[[Mixed Reality|MR]]) or that the virtual environment remains stable relative to the user's playspace (for VR).
* '''[[Roomscale VR|Roomscale]] Experiences:''' Defines boundaries and understands the physical playspace for safety and interaction.
* '''[[Passthrough AR|Passthrough]] and [[Mixed Reality]]:''' Helps align virtual content accurately with the real-world view captured by device cameras.
* '''Persistent Anchors & Shared Experiences:''' Allows digital content to be saved and anchored to specific locations in the real world (spatial anchors), enabling multi-user experiences where participants see the same virtual objects in the same real-world spots across different sessions or devices.
=== Types of SLAM ===
SLAM systems can be categorized based on the primary sensors used:
* '''Visual SLAM (vSLAM):''' Relies mainly on [[cameras]]. Can be monocular (one camera), stereo (two cameras), or RGB-D (using a [[depth sensor]]). Often fused with [[IMU]] data ([[Visual Inertial Odometry|VIO-SLAM]]). Popular research systems include [[ORB-SLAM2]], its successor [[ORB-SLAM3]], and [[RTAB-Map]].
* '''[[LiDAR]] SLAM:''' Uses Light Detection and Ranging sensors. Common in robotics and autonomous vehicles, and used in some high-end AR/MR devices (like [[Apple Vision Pro]]) often in conjunction with cameras for improved mapping and tracking robustness.
* '''Filter-based vs. Optimization-based:''' Historically, methods like [[Extended Kalman Filter|EKF-SLAM]] were common (filter-based). Modern systems often use graph-based optimization techniques (like [[bundle adjustment]]) for higher accuracy, especially after loop closures; a single-step EKF update is sketched below.
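As a minimal illustration of the filter-based approach, the following sketch runs a single EKF measurement update on a toy one-dimensional state holding the device position and one landmark position (illustrative numbers; a real EKF-SLAM state holds many landmarks and their cross-correlations):

<syntaxhighlight lang="python">
import numpy as np

# State stacks the device position and one landmark position (toy 1D world).
x = np.array([0.0, 5.0])            # estimate: device at 0 m, landmark at 5 m
P = np.diag([0.5, 1.0])             # covariance (uncertainty) of that estimate

z = 4.8                             # measured range to the landmark, in meters
H = np.array([[-1.0, 1.0]])         # Jacobian of the model h(x) = landmark - device
R = np.array([[0.1]])               # measurement-noise variance

y = np.array([z - (x[1] - x[0])])   # innovation: measured minus predicted range
S = H @ P @ H.T + R                 # innovation covariance
K = P @ H.T @ np.linalg.inv(S)      # Kalman gain (2x1)
x = x + K @ y                       # corrected state: device *and* landmark shift
P = (np.eye(2) - K @ H) @ P         # corrected, reduced covariance
print(x, np.diag(P))
</syntaxhighlight>

Note how a single range measurement adjusts both the device and the landmark estimate; it is this coupling between pose and map that distinguishes SLAM from pure odometry.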
=== Examples in VR/AR Devices ===
Many consumer VR/AR devices utilize SLAM or SLAM-like systems, often incorporating VIO:
* '''[[Meta Quest]] Headsets ([[Meta Quest 2]], [[Meta Quest 3]], [[Meta Quest Pro]]):''' Use [[Meta Quest Insight|Insight tracking]], a sophisticated inside-out system based heavily on VIO with SLAM components for mapping, boundary definition, and persistence.
* '''[[Microsoft HoloLens|HoloLens 1]] & [[Microsoft HoloLens 2|HoloLens 2]]:''' Employ advanced SLAM systems using cameras, depth sensors, and IMUs for robust spatial mapping and tracking.
* '''[[Magic Leap 1]] & [[Magic Leap 2]]:''' Utilize SLAM for environment mapping and head tracking.
* '''[[Apple Vision Pro]]:''' Features an advanced tracking system fusing data from numerous cameras, [[LiDAR]], and IMUs, implementing sophisticated VIO and SLAM techniques for detailed spatial understanding.
* Many [[Windows Mixed Reality]] headsets.
* [[Pico Neo 3 Link|Pico Neo 3]], [[Pico 4]].
[[Category:Tracking]]
[[Category:Computer Vision]]
[[Category:Core Concepts]]