SLAM
- See also: Terms and Technical Terms
SLAM (Simultaneous Localization And Mapping) is a computational problem and a set of algorithms used primarily in robotics and autonomous systems, and increasingly in VR headsets and AR headsets.[1] The core challenge SLAM addresses is often described as a "chicken-and-egg problem": to know where you are, you need a map, but to build a map, you need to know where you are.[2] SLAM solves this by enabling a device, using data from its onboard sensors (such as cameras, IMUs, and sometimes depth sensors like Time-of-Flight (ToF)), to construct a map of an unknown environment while simultaneously determining its own position and orientation (pose) within that newly created map.[3] This self-contained process enables inside-out tracking, meaning the device tracks its position in 3D space without needing external sensors or markers (like Lighthouse base stations).[4]
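In probabilistic terms, the problem is commonly written (notation varies across the literature; this follows the tutorial formulation cited above) as estimating the joint posterior over the device's trajectory and the map given all sensor observations and motion inputs, which is why localization and mapping cannot be solved as two independent steps:

```latex
% Full SLAM posterior: the trajectory x_{1:t} and the map m are estimated jointly
% from the observations z_{1:t} and the motion (odometry/control) inputs u_{1:t}.
P\left(x_{1:t},\, m \mid z_{1:t},\, u_{1:t}\right)
```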
How SLAM Works
SLAM systems typically involve several key components working together in a continuous feedback loop:
- Feature Detection/Tracking: Identifying salient points or features (often called landmarks) in the sensor data (for example corners in camera images using methods like the ORB feature detector; a short feature-matching sketch follows this list). These features are tracked frame-to-frame as the device moves.[5]
- Mapping: Using the tracked features and the device's estimated movement (odometry) to build and update a representation (the map) of the environment. This map might consist of sparse feature points (common for localization-focused SLAM) or denser representations like point clouds or meshes (useful for environmental understanding).[6]
- Localization (or Pose Estimation): Estimating the device's current position and orientation (pose) relative to the map it has built, often by observing how known landmarks appear from the current viewpoint.
- Loop Closure: Recognizing when the device has returned to a previously visited location by matching current sensor data to earlier map data (for example using appearance-based methods like bag-of-words). This is crucial for correcting accumulated drift (incremental errors) in the map and pose estimate, leading to a globally consistent map.[7]
- Sensor Fusion: Often combining data from multiple sensors. Visual‑Inertial Odometry (VIO) is extremely common in modern SLAM, fusing camera data with IMU data.[8] The IMU provides high-frequency motion updates, improving robustness against fast motion, motion blur, or visually indistinct (textureless) surfaces where camera tracking alone might struggle.
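As a concrete illustration of the front end described in the first bullet above, the following minimal sketch uses OpenCV's ORB detector to find and match features between two consecutive camera frames. The image filenames are placeholders, and a real SLAM front end would additionally reject outliers (for example with RANSAC) and feed the surviving matches into pose estimation.

```python
import cv2

# Detect and match ORB features between two consecutive frames - the raw material
# for frame-to-frame tracking in a visual SLAM front end.
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)  # placeholder filenames
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)              # ORB keypoint detector + binary descriptor
kp1, des1 = orb.detectAndCompute(img1, None)      # keypoints and descriptors, frame 1
kp2, des2 = orb.detectAndCompute(img2, None)      # keypoints and descriptors, frame 2

# Brute-force matching with Hamming distance (appropriate for binary ORB descriptors).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

print(f"{len(matches)} tentative feature correspondences between the two frames")
# A full pipeline would estimate relative pose from these correspondences
# (e.g. essential-matrix or PnP solving) and insert new landmarks into the map.
```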
SLAM vs. Visual Inertial Odometry (VIO)
While related and often used together, SLAM and Visual Inertial Odometry (VIO) have different primary goals:
- VIO primarily focuses on estimating the device's ego-motion (how it moves relative to its immediate surroundings) by fusing visual data from cameras and motion data from an IMU. It's excellent for short-term, low-latency tracking but can accumulate drift over time and doesn't necessarily build a persistent, globally consistent map optimized for re-localization or loop closure. Systems like Apple's ARKit[8] and Google's ARCore[9] rely heavily on VIO for tracking, adding surface detection and limited mapping but typically without the global map optimization and loop closure found in full SLAM systems.
- SLAM focuses on building a map of the environment and localizing the device within that map. It aims for global consistency, often incorporating techniques like loop closure to correct drift. Many modern VR/AR tracking systems use VIO for the high-frequency motion estimation component within a larger SLAM framework that handles mapping, persistence, and drift correction. Essentially, VIO provides the odometry, while SLAM builds and refines the map using that odometry and sensor data (a toy numerical illustration of this layering follows the list).
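The following toy simulation (illustrative only, not any vendor's tracking code) shows the layering idea numerically: a fast "VIO" stage integrates slightly noisy motion increments and drifts over time, while a slower "SLAM" stage periodically supplies a correction that re-aligns the odometry estimate with the map. For simplicity, the correction here is computed from the simulated ground truth; a real system would estimate it from loop-closure or relocalization constraints.

```python
import numpy as np

rng = np.random.default_rng(0)

true_pos = np.zeros(3)         # simulated ground-truth position (for illustration only)
vio_pos = np.zeros(3)          # position integrated from noisy per-frame VIO increments
map_correction = np.zeros(3)   # offset maintained by the (pretend) SLAM layer
world_pos = np.zeros(3)        # drift-corrected pose handed to the renderer

for step in range(1, 331):
    motion = np.array([0.01, 0.0, 0.0])                     # true per-frame motion (1 cm along x)
    true_pos = true_pos + motion
    vio_pos = vio_pos + motion + rng.normal(0.0, 0.001, 3)  # VIO drifts a little each frame
    if step % 100 == 0:
        # Pretend a loop closure / relocalization fires. In this toy the correction is taken
        # from ground truth; a real SLAM back end estimates it from map constraints.
        map_correction = true_pos - vio_pos
    world_pos = vio_pos + map_correction

print("drift of raw VIO estimate:  ", np.linalg.norm(true_pos - vio_pos))
print("drift after SLAM correction:", np.linalg.norm(true_pos - world_pos))
```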
Importance in VR/AR
SLAM (often incorporating VIO) is fundamental technology for modern standalone VR headsets and AR headsets/glasses:
- 6DoF Tracking: Enables full six-degrees-of-freedom tracking (positional and rotational) without external base stations, allowing users to move freely within their playspace.[4]
- World-Locking: Ensures virtual objects appear stable and fixed in the real world (for AR/MR) or that the virtual environment remains stable relative to the user's playspace (for VR).
- Roomscale Experiences & Environment Understanding: Defines boundaries (like Meta's Guardian) and understands the physical playspace (surfaces, obstacles) for safety, interaction, and realistic occlusion (virtual objects hidden by real ones).
- Passthrough and Mixed Reality: Helps align virtual content accurately with the real-world view captured by device cameras.
- Persistent Anchors & Shared Experiences: Allows digital content to be saved and anchored to specific locations in the real world (spatial anchors), enabling multi-user experiences where participants see the same virtual objects in the same real-world spots across different sessions or devices.
Types and Algorithms
SLAM systems can be categorized based on the primary sensors used and the algorithmic approach:
- Visual SLAM (vSLAM): Relies mainly on cameras. Can be monocular (one camera), stereo (two cameras), or RGB-D (using a depth sensor). Often fused with IMU data (VIO-SLAM).[2]
- ORB-SLAM2: A widely cited open-source library using ORB features. It supports monocular, stereo, and RGB-D cameras but is purely vision-based (no IMU). Known for robust relocalization and creating sparse feature maps.[5]
- ORB-SLAM3: An evolution of ORB-SLAM2 (released c. 2020/21) adding tight visual-inertial fusion (camera + IMU) for significantly improved accuracy and robustness, especially during fast motion.[7]
- RTAB-Map (Real-Time Appearance-Based Mapping): An open-source graph-based SLAM approach focused on long-term and large-scale mapping, often used with RGB-D or stereo cameras to build dense maps.[6]
- LiDAR SLAM: Uses Light Detection and Ranging sensors. Common in robotics and autonomous vehicles, and used in some high-end AR/MR devices (like Apple Vision Pro),[10][11] often fused with cameras and IMUs for enhanced mapping and tracking robustness.
- Filter-based vs. Optimization-based: Historically, methods like EKF‑SLAM were common (filter‑based).[12] Modern systems often use graph-based optimization techniques (like bundle adjustment) which optimize the entire trajectory and map simultaneously, especially after loop closures, generally leading to higher accuracy (a minimal pose-graph example follows this list).
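To make the graph-optimization idea concrete, here is a deliberately tiny example (translations along one axis only, made-up numbers): odometry between consecutive poses accumulates a small error, a single loop-closure constraint says the last pose coincides with the first, and a least-squares solve distributes the correction over the whole trajectory instead of leaving it in the final pose. Real systems solve the same kind of problem over full 6DoF poses and thousands of landmarks (bundle adjustment).

```python
import numpy as np

n = 5                                    # poses x0..x4, each a scalar position along one axis
odometry = [1.02, 1.01, -0.98, -1.03]    # odometry displacements between consecutive poses
loop = (0, 4, 0.0)                       # loop closure: pose 4 observed to coincide with pose 0

rows, rhs = [], []

# Anchor the first pose at the origin (gauge constraint).
r = np.zeros(n); r[0] = 1.0
rows.append(r); rhs.append(0.0)

# Odometry constraints: x[i+1] - x[i] = d_i
for i, d in enumerate(odometry):
    r = np.zeros(n); r[i + 1], r[i] = 1.0, -1.0
    rows.append(r); rhs.append(d)

# Loop-closure constraint: x[j] - x[i] = c
i, j, c = loop
r = np.zeros(n); r[j], r[i] = 1.0, -1.0
rows.append(r); rhs.append(c)

# Linear least squares: the 0.02 m of accumulated drift is spread across the trajectory.
x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
print(np.round(x, 3))
```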
Examples in VR/AR Devices
Many consumer VR/AR devices utilize SLAM or SLAM-like systems, often incorporating VIO:
- Meta Quest Headsets (Meta Quest 2, Meta Quest 3, Meta Quest Pro): Use Insight tracking, a sophisticated inside‑out system based heavily on VIO with SLAM components.[4]
- HoloLens 1 (2016) & HoloLens 2: Employ advanced SLAM systems using multiple cameras, a ToF depth sensor, and an IMU for robust spatial mapping.[13]
- Magic Leap 1 (2018) & Magic Leap 2: Utilize SLAM (“Visual Perception”) with an array of cameras and sensors for environment mapping and head tracking.[14]
- Apple Vision Pro: Features an advanced tracking system fusing data from numerous cameras, LiDAR, depth sensors, and IMUs.[10]
- Many Windows Mixed Reality headsets.
- Pico Neo 3, Pico 4.
References
- ↑ H. Durrant‑Whyte & T. Bailey, “Simultaneous Localization and Mapping: Part I,” IEEE Robotics & Automation Magazine, 13 (2), 99–110, 2006. https://www.doc.ic.ac.uk/~ajd/Robotics/RoboticsResources/SLAMTutorial1.pdf
- ↑ 2.0 2.1 C. Cadena et al., “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust‑Perception Age,” IEEE Transactions on Robotics, 32 (6), 1309–1332, 2016. https://rpg.ifi.uzh.ch/docs/TRO16_cadena.pdf
- ↑ A. Ranganathan, “The Oculus Insight positional tracking system,” AI Accelerator Institute, 27 Jun 2022. https://www.aiacceleratorinstitute.com/the-oculus-insight-positional-tracking-system-2/
- ↑ 4.0 4.1 4.2 Meta, “Introducing Oculus Quest, Our First 6DOF All‑in‑One VR System,” Developer Blog, 26 Sep 2018. https://developers.meta.com/horizon/blog/introducing-oculus-quest-our-first-6dof-all-in-one-vr-system/
- ↑ 5.0 5.1 R. Mur‑Artal & J. D. Tardós, “ORB‑SLAM2: an Open‑Source SLAM System for Monocular, Stereo and RGB‑D Cameras,” IEEE Transactions on Robotics, 33 (5), 2017. https://arxiv.org/abs/1610.06475
- ↑ 6.0 6.1 M. Labbé & F. Michaud, “RTAB‑Map as an Open‑Source Lidar and Visual SLAM Library for Large‑Scale and Long‑Term Online Operation,” Journal of Field Robotics, 36 (2), 416–446, 2019. https://arxiv.org/abs/2403.06341
- ↑ 7.0 7.1 C. Campos et al., “ORB‑SLAM3: An Accurate Open‑Source Library for Visual, Visual‑Inertial and Multi‑Map SLAM,” IEEE Transactions on Robotics, 2021. https://arxiv.org/abs/2007.11898
- ↑ 8.0 8.1 Apple Inc., “Understanding World Tracking,” Apple Developer Documentation, accessed 3 May 2025. https://developer.apple.com/documentation/arkit/understanding-world-tracking
- ↑ Google LLC, “ARCore Overview,” Google for Developers, accessed 3 May 2025. https://developers.google.com/ar
- ↑ 10.0 10.1 Apple Inc., “Introducing Apple Vision Pro,” Newsroom, 5 Jun 2023. https://www.apple.com/newsroom/2023/06/introducing-apple-vision-pro/
- ↑ L. Bonnington, “Apple’s Mixed‑Reality Headset, Vision Pro, Is Here,” Wired, 5 Jun 2023. https://www.wired.com/story/apple-vision-pro-specs-price-release-date
- ↑ J. Sun et al., “An Extended Kalman Filter for Magnetic Field SLAM Using Gaussian Process,” Sensors, 22 (8), 2833, 2022. https://www.mdpi.com/1424-8220/22/8/2833
- ↑ Microsoft, “HoloLens 2 hardware,” Microsoft Learn, accessed 3 May 2025. https://learn.microsoft.com/hololens/hololens2-hardware
- ↑ Magic Leap, “Spatial Mapping for Magic Leap 2,” 29 Mar 2025. https://www.magicleap.com/legal/spatial-mapping-ml2