Passthrough

From VR & AR Wiki
See also: Terms and Technical Terms

Passthrough, often referred to as video passthrough, is a feature found in Virtual Reality (VR) and Mixed Reality (MR) headsets that utilizes external cameras to capture a live video feed of the physical environment around the user and display it on the internal screens within the headset.[1] This capability effectively allows users to see the "real world" without removing the headset, bridging the gap between fully immersive virtual experiences and the user's actual surroundings.

Although primarily a feature of VR headsets, added to provide environmental awareness or MR capabilities, passthrough functions as a form of Augmented Reality (AR), often termed "Video See-Through AR" (VST AR) or sometimes "pseudo-AR," in contrast to "Optical See-Through AR" (OST AR) systems, which use transparent displays.[2] Passthrough is a key enabler of mixed reality and spatial computing experiences on modern headsets.

Core Technology and How It Works

The fundamental principle of passthrough involves a real-time processing pipeline:

  1. Capture: One or more outward-facing digital cameras mounted on the headset capture video of the external world. Early or basic systems might use a single camera (providing a monoscopic view), while more advanced systems use two or more cameras to capture stereoscopic video, enabling depth perception.[3] Modern systems often use a combination of RGB color cameras and monochrome (grayscale) sensors for different purposes (for example capturing color data vs. motion/detail).[4]
  2. Processing: The captured video footage is sent to the headset's processor (either an onboard SoC or a connected PC's GPU). This stage is computationally intensive and critical for a usable and comfortable experience. It typically involves several steps:
    • Rectification/Undistortion: Correcting lens distortion inherent in the wide-angle cameras typically used to maximize FOV.
    • Reprojection/Warping: Adjusting the captured image perspective to align with the user's eye position inside the headset, rather than the camera's physical position on the outside. This difference in viewpoint causes parallax, and correcting it ("perspective correction") is crucial for accurate spatial representation, correct scale perception, and minimizing motion sickness.[5][6] These corrections rely on Computer Vision algorithms, often supplemented by IMU sensor data. Some modern headsets, such as the Meta Quest Pro and Meta Quest 3, employ Machine Learning or neural networks to improve the realism and accuracy of this reconstruction.[7]
    • Sensor Fusion: Combining data from multiple cameras (for example fusing monochrome detail with RGB color[4]) and integrating tracking data (for example from inside-out tracking sensors or depth sensors) to ensure the passthrough view remains stable, depth-correct, and aligned with the user's head movements.
    • Color Correction & Enhancement: Adjusting colors, brightness, and contrast to appear more natural, especially under varying lighting conditions. This can also involve AI-based denoising or upscaling.[8]
  3. Display: The processed video feed is rendered on the headset's internal displays, either replacing the virtual content entirely or serving as the backdrop over which virtual content is composited. The primary goal is to complete this entire pipeline with minimal latency (ideally under 20 milliseconds[9]) to avoid discomfort and maintain realism.
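
The three stages above can be sketched as a toy frame loop. Everything here is illustrative: NumPy arrays stand in for camera frames, and the `undistort` and `reproject` helpers are drastically simplified placeholders for the calibrated rectification and depth-based perspective correction a real headset performs.

```python
import numpy as np

def undistort(frame, k1=-0.25):
    """Toy radial undistortion: remap pixels by r' = r * (1 + k1 * r^2).
    Real headsets use calibrated per-camera distortion profiles."""
    h, w = frame.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    x = (xx - w / 2) / (w / 2)            # normalized coordinates
    y = (yy - h / 2) / (h / 2)
    r2 = x * x + y * y
    xs = np.clip((x * (1 + k1 * r2) + 1) * w / 2, 0, w - 1).astype(int)
    ys = np.clip((y * (1 + k1 * r2) + 1) * h / 2, 0, h - 1).astype(int)
    return frame[ys, xs]

def reproject(frame, shift_px=4):
    """Toy parallax correction: a flat horizontal shift toward the eye.
    Real systems warp per-pixel using a depth map."""
    return np.roll(frame, shift_px, axis=1)

def passthrough_pipeline(raw_frame):
    frame = undistort(raw_frame)              # rectification/undistortion
    frame = reproject(frame)                  # perspective correction
    frame = np.clip(frame * 1.1, 0, 255)      # simple brightness adjustment
    return frame.astype(np.uint8)             # ready for display

# A fake 640x480 monochrome camera frame.
camera_frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
display_frame = passthrough_pipeline(camera_frame)
```

In a shipping headset each of these steps runs per-eye, per-frame, on dedicated silicon or the GPU to stay within the latency budget described below.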

History and Evolution

While the concept of video passthrough existed in research labs for decades,[10] its implementation in consumer VR headsets evolved significantly:

  • Early Stages (Mid-2010s): Passthrough began appearing primarily as a safety feature. In 2016, the HTC Vive prototype (Vive Pre) introduced a front-facing camera providing a basic, monochrome, 2D view for obstacle avoidance. Valve's software projected this onto a virtual sphere to approximate perspective.[11] It was low-resolution and intended for brief checks.
  • Integrated Monochrome (Late 2010s): Headsets using inside-out tracking leveraged their tracking cameras for improved passthrough. The Oculus Rift S (2019) offered "Passthrough+" using its multiple monochrome cameras for a stereoscopic view.[12] The original Oculus Quest (2019) and Meta Quest 2 (2020) provided similar basic monochrome passthrough, mainly for setting up the Guardian system and quick environment checks.[13]
  • Early Mixed Reality Steps (Early 2020s): In 2021, Meta released an experimental Passthrough API for Quest 2 developers, allowing apps to overlay virtual elements onto the monochrome feed, marking a step towards consumer MR.[14] Simultaneously, enterprise headsets like the Varjo XR-1 (2019) and XR-3 (2021) pushed high-fidelity color passthrough with dual high-resolution cameras, setting a benchmark for quality.[15]
  • Mainstream Color Passthrough (2022-Present):
    • The Meta Quest Pro (2022) was the first major consumer headset featuring high-quality, stereoscopic color passthrough, using a novel camera array (monochrome for depth/detail, RGB for color) and ML reconstruction.[4]
    • Competitors like the Pico 4 (late 2022) and HTC Vive XR Elite (2023) also introduced color passthrough, although early implementations like the Pico 4's were initially monoscopic and lacked depth correction.[16][17]
    • Sony's PlayStation VR2 (2023) included stereo passthrough, but kept it black-and-white, accessible via a dedicated button for quick checks.[18]
    • The Meta Quest 3 (late 2023) brought high-resolution stereo color passthrough with an active depth sensor (structured light projector) to the mainstream consumer market, offering significantly improved clarity and depth accuracy over Quest 2 and Quest Pro.[8][19]
    • The Apple Vision Pro (2023 announcement, 2024 release) emphasized passthrough-based MR ("spatial computing"), using dual high-resolution color cameras, advanced processing (Apple R1 chip), and a LiDAR scanner for precise depth mapping.[20][21]
    • Other high-end devices like the Pimax Crystal (2023) and Varjo XR-4 (late 2023) continued to push resolution and fidelity.[22][23]
    • Even mid-range devices began incorporating improved color passthrough and depth sensing, like the anticipated Pico 4 Ultra (2024).[24]

Passthrough has evolved from a basic safety utility to a core feature enabling sophisticated mixed reality experiences, blurring the lines between traditional VR and AR.

Types of Passthrough

Passthrough implementations vary significantly. Key characteristics include:

Monochrome Passthrough

Uses black-and-white camera feeds. Common in earlier VR headsets (Oculus Rift S, Quest 1 & 2) or as a design choice (PSVR2), often leveraging existing grayscale tracking cameras.[25][18] Provides basic environmental awareness but lacks color cues and realism. Advantages include potentially better low-light sensitivity and lower processing requirements.[14]

Color Passthrough

Uses RGB color cameras for a full-color view of the real world, greatly enhancing realism and enabling use cases like reading phone screens or interacting with colored objects. First widely available consumer example was Meta Quest Pro.[4] Quality varies significantly based on camera resolution, processing, and calibration (for example Quest 3 offers ~10x the passthrough pixels of Quest 2).[26] High-quality color passthrough (for example Varjo XR series, Vision Pro) aims for near-photorealism.[15][20] Requires more powerful hardware and sophisticated software.

Monoscopic vs. Stereoscopic

  • Monoscopic (2D): Uses a single camera view (or identical views) for both eyes (for example original HTC Vive, initial Pico 4 implementation[16]). Lacks binocular disparity, resulting in a "flat" image without true depth perception. Scale and distance can feel incorrect or uncomfortable.
  • Stereoscopic (3D): Uses two distinct camera viewpoints (one per eye, or reconstructed dual views) to create a 3D effect with depth perception. Requires cameras positioned roughly at the user's interpupillary distance (IPD) and careful calibration/reprojection. Essential for comfortable MR and accurate spatial interaction. Implemented in Rift S, PSVR2 (B&W stereo), Quest Pro, Quest 3, Vision Pro, Varjo XR series, etc.[8] Achieving correct scale and geometry is key to avoiding discomfort.[25]
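
The depth cue that stereoscopic passthrough preserves is binocular disparity. For a rectified camera pair, depth follows from the standard relation Z = f·B/d, where f is the focal length in pixels, B the camera baseline (roughly the IPD), and d the pixel disparity. The numbers below are assumptions for illustration only:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Z = f * B / d: depth from pixel disparity for a rectified pair."""
    return focal_px * baseline_m / disparity_px

# Assumed values: 600 px focal length, 63 mm baseline (a typical IPD).
z_near = depth_from_disparity(600, 0.063, 75.6)   # large disparity
z_far = depth_from_disparity(600, 0.063, 3.78)    # small disparity
# z_near is 0.5 m; z_far is 10.0 m -- disparity shrinks with distance.
```

This is also why monoscopic passthrough feels "flat": with identical views for both eyes the disparity is zero everywhere, so the brain receives no stereo depth signal at all.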

Depth-Aware Passthrough

Systems that actively measure or infer the distance to real-world objects and surfaces, integrating this depth map into the passthrough experience. This enables:

  • Accurate placement and scaling of virtual objects relative to the real world.
  • Occlusion: Allowing virtual objects to realistically appear behind real objects (and vice-versa).
  • Improved interaction: Understanding the geometry of the environment for physics and hand interactions.

Methods include:

  • Passive Stereo Vision: Calculating depth from the differences between two camera images (computationally intensive, can struggle with textureless surfaces).
  • Active Depth Sensing: Using dedicated hardware such as Infrared (IR) illumination with depth inference (Quest Pro[4]), Structured Light projectors (Quest 3[8]), Time-of-Flight (ToF) sensors, or LiDAR (Vision Pro[20]). These provide more robust and direct depth measurements.

Depth-aware passthrough significantly enhances MR realism and comfort, enabling features like automatic room scanning and persistent virtual object anchoring.[25]
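
Passive stereo vision, the first method above, can be illustrated with a brute-force block matcher over a single scanline. Real implementations use far more efficient and robust algorithms (for example semi-global matching), so this is only a toy sketch on synthetic data:

```python
import numpy as np

def disparity_1d(left_row, right_row, patch=3, max_disp=16):
    """Brute-force block matching along one scanline: for each pixel,
    find the horizontal shift that best matches a small patch."""
    n = len(left_row)
    disp = np.zeros(n, dtype=int)
    for x in range(patch, n - patch):
        ref = left_row[x - patch:x + patch + 1].astype(int)
        best_d, best_err = 0, np.inf
        for d in range(min(max_disp, x - patch) + 1):
            cand = right_row[x - d - patch:x - d + patch + 1].astype(int)
            err = np.sum((ref - cand) ** 2)   # sum of squared differences
            if err < best_err:
                best_d, best_err = d, err
        disp[x] = best_d
    return disp

# Synthetic stereo pair: the right scanline is the left shifted by 5 px,
# so the matcher should recover a disparity of 5 across the valid range.
np.random.seed(0)
left = np.random.randint(0, 256, 64, dtype=np.uint8)
right = np.roll(left, -5)
disp = disparity_1d(left, right)
```

The sketch also hints at why passive stereo struggles with textureless surfaces: on a uniform wall every candidate patch matches equally well, so the minimum-error disparity is ambiguous, which is exactly what active depth sensors avoid by projecting their own pattern.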

Mixed Reality Blending

Refers to how seamlessly the passthrough system integrates virtual content with the real-world camera feed. Advanced implementations aim to unify lighting, shadows, reflections, and occlusion across both realities. Examples include:

  • Virtual objects casting realistic shadows on real surfaces.[27]
  • Virtual elements being correctly hidden by real furniture or people.
  • Virtual lighting affecting the appearance of the real world within the passthrough view (and vice-versa).
  • Using ML for scene segmentation (identifying walls, floors, furniture, people) to enable complex interactions.[1]

Requires high-quality color, stereoscopic depth, active depth sensing, low latency, and sophisticated rendering techniques (for example real-time lighting estimation, environmental mapping). Devices like Quest 3 and Vision Pro heavily emphasize these capabilities.[25][21]
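
At its core, occlusion-aware blending reduces to a per-pixel depth test between the real-world depth map and the virtual scene's depth buffer. A minimal sketch with toy 4×4 frames (all values assumed):

```python
import numpy as np

def composite(passthrough, virtual, real_depth, virtual_depth):
    """Draw a virtual pixel only where it is closer than the real surface."""
    mask = virtual_depth < real_depth
    out = passthrough.copy()
    out[mask] = virtual[mask]
    return out

# Toy 4x4 frames: a real wall at 2 m, a virtual panel whose top half
# sits at 1 m (in front of the wall) and bottom half at 3 m (behind it).
passthrough = np.zeros((4, 4), dtype=np.uint8)   # real-world feed (dark)
virtual = np.full((4, 4), 255, dtype=np.uint8)   # virtual layer (bright)
real_depth = np.full((4, 4), 2.0)
virtual_depth = np.full((4, 4), 3.0)
virtual_depth[:2] = 1.0
frame = composite(passthrough, virtual, real_depth, virtual_depth)
# Top half shows the virtual panel; bottom half keeps the real wall.
```

Production systems add soft edges, lighting estimation, and shadow casting on top of this basic test, but errors in the real-world depth map still surface directly as flickering or incorrectly clipped virtual objects.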

Technical Challenges

Creating high-quality, comfortable passthrough involves overcoming significant hurdles:

  • Latency: The delay between real-world motion and the passthrough display update (photon-to-photon latency). High latency (>~20ms[9]) causes disorientation, motion sickness ("world swimming"), and breaks immersion. Fast processing pipelines are essential.[5] Residual latency can cause ghosting or trailing artifacts on moving objects.[28]
  • Resolution and Image Quality: Camera feeds are often lower resolution than human vision, leading to pixelation or blurriness, making fine details (like text) hard to see.[29] Limited dynamic range struggles with bright highlights and dark shadows compared to the human eye. Poor low-light performance results in noisy, grainy images.[29] Achieving high resolution and good image quality requires better sensors and significant processing power.
  • Camera Placement and Perspective Mismatch: Cameras are offset from the user's eyes, causing parallax errors if not corrected. Naive display leads to distorted views, incorrect scale, and depth perception issues, especially for close objects.[5] Sophisticated reprojection algorithms are needed to warp the camera view to match the eye's perspective, but perfect correction is difficult.[6] This geometric misalignment can cause eye strain or discomfort.[6] Close objects (<~0.5m) often appear warped even in good systems due to sensor/lens limitations and reprojection challenges.[25]
  • Depth Perception and Occlusion: Even with stereo cameras, accurately replicating human depth perception is hard. Incorrect IPD matching or calibration can lead to scale issues. Lack of accurate, real-time depth maps makes correct occlusion (virtual behind real) difficult, breaking immersion. Errors in depth sensing or fusion can cause virtual objects to flicker or appear incorrectly positioned.[3]
  • Color Accuracy and Calibration: Matching the colors and brightness of the passthrough feed to both the real world and virtual elements is challenging. Poor white balance or color calibration makes the view look unnatural or filtered.[29] Display limitations also affect color reproduction. Consistent calibration across cameras and over time (accounting for thermal drift) is crucial.[6]
  • Field of View (FOV): Passthrough FOV is often narrower than human vision or even the headset's display FOV, creating a "tunnel vision" effect or visible borders where the passthrough image ends. Wide-angle lenses used to increase FOV introduce distortion that needs correction.
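
Photon-to-photon latency is simply the sum of the per-stage delays, which is why every stage of the pipeline has to be budgeted. The stage timings below are assumptions for the sketch, not measurements from any shipping headset:

```python
# Illustrative photon-to-photon latency budget. Stage timings are
# assumed values, not measurements from a real device.
stages_ms = {
    "camera exposure + readout": 6.0,
    "ISP / undistortion": 3.0,
    "reprojection + sensor fusion": 4.0,
    "compositing + render": 3.0,
    "display scan-out": 3.5,
}
total_ms = sum(stages_ms.values())
assert total_ms < 20.0, "photon-to-photon budget exceeded"
```

Framed this way, the engineering trade-offs become concrete: a higher-resolution sensor that adds a few milliseconds of readout time must be paid for by shaving the same amount from processing or scan-out to stay under the ~20 ms comfort threshold.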

Modern Solutions and Advancements

Engineers employ various techniques to address passthrough challenges:

  • Multi-Camera Sensor Fusion: Using multiple cameras with different strengths (for example high-resolution RGB for color, fast monochrome for low-latency motion/detail) and fusing their data computationally.[4] Overlapping camera views help compute stereo depth and increase effective FOV.[1]
  • Active Depth Sensing: Incorporating dedicated depth sensors (IR ToF, Structured Light, LiDAR) provides robust, real-time 3D geometry information of the environment, improving reprojection accuracy, occlusion handling, and spatial anchoring.[8][20] This enables features like quick room meshing via APIs (for example Meta's Spatial Anchors, Apple's ARKit/RoomPlan).
  • Machine Learning Enhancements: Using AI/ML for various tasks:
    • Image upscaling and denoising to improve clarity, especially in low light.
    • Advanced reprojection algorithms for more accurate perspective correction.[7]
    • Scene segmentation to identify objects (hands, people, furniture) for better interaction and occlusion.[1]
    • Improving SLAM for more stable tracking and anchoring of virtual objects.
  • Reprojection and Virtual Cameras: Software techniques that warp the captured camera images based on depth data to synthesize a view from the user's actual eye positions ("virtual cameras"[6]). Time-warping techniques can further reduce perceived latency by adjusting the image based on last-moment head movements.
  • Improved Optics and Displays: Pancake lenses allow for thinner headsets where cameras can potentially be placed closer to the eyes, reducing offset. Higher resolution, higher dynamic range (for example Micro-OLED in Vision Pro), and faster refresh rate displays improve the fidelity of the displayed passthrough feed. Careful calibration of lens distortion profiles is also applied.[18]
  • User Experience (UX) Improvements: Features like a dedicated passthrough toggle button (PSVR2[18]), automatic passthrough activation when nearing boundaries (Quest Guardian[25]), and boundaryless MR modes enhance usability and seamlessly blend real/virtual interactions.
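
The "virtual camera" reprojection described above can be illustrated for a single pixel: back-project it into 3D using its measured depth, translate by the camera-to-eye offset, and project again. Intrinsics and offsets here are assumed values for illustration:

```python
import numpy as np

def reproject_pixel(u, v, depth, f, c, cam_to_eye):
    """Back-project pixel (u, v) at a measured depth into camera space,
    translate into the eye's frame, and project back to pixels."""
    p_cam = np.array([(u - c[0]) * depth / f,
                      (v - c[1]) * depth / f,
                      depth])
    p_eye = p_cam + cam_to_eye            # camera-to-eye translation
    u2 = f * p_eye[0] / p_eye[2] + c[0]
    v2 = f * p_eye[1] / p_eye[2] + c[1]
    return u2, v2

# Assumed intrinsics: 600 px focal length, principal point at the image
# centre; the camera sits 2 cm beside and 4 cm in front of the eye.
u2, v2 = reproject_pixel(320, 240, depth=1.0, f=600, c=(320, 240),
                         cam_to_eye=np.array([0.02, 0.0, 0.04]))
# The camera's centre pixel lands slightly off-centre for the eye.
```

The dependence on `depth` is the crux: without a per-pixel depth estimate there is no single correct warp, which is why depth sensing quality directly determines reprojection quality, especially for close objects where the parallax is largest.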

Applications and Use Cases

Passthrough enables diverse applications by allowing users to interact with the real world while immersed:

Consumer Uses

  • Safety and Convenience: Defining play boundaries (Guardian system, Chaperone (virtual reality)), avoiding obstacles, checking phones, finding controllers, or interacting briefly with people/pets without removing the headset.[14]
  • Mixed Reality Gaming and Entertainment: Games where virtual elements interact with the user's physical room (for example characters hiding behind real furniture, virtual objects placed on real tables).[25] Creative apps allowing virtual painting on real walls.
  • Productivity and Utility: Using virtual desktops or multiple virtual monitors while still seeing the real keyboard, mouse, and desk.[30]
  • Social Presence: Reducing isolation during VR use by allowing users to see others in the same physical space. Enabling co-located MR experiences where multiple users interact with shared virtual content in the same room.

Enterprise and Professional Uses

  • Collaboration: Design reviews where virtual prototypes are viewed in a real meeting room alongside physical mockups or colleagues.[31] Remote collaboration where experts guide on-site technicians using virtual annotations overlaid on the real equipment view.
  • Training and Simulation: Combining virtual scenarios with physical controls or environments (for example flight simulation using a real cockpit visible via passthrough, medical training on physical manikins with virtual overlays).[32]
  • Visualization: Architects visualizing 3D models on a real site, designers overlaying virtual concepts onto physical products.
  • Productivity: Creating expansive virtual workspaces integrated with the physical office environment, improving multitasking while maintaining awareness.[33]

Industrial and Field Uses

  • Maintenance and Repair: Displaying step-by-step instructions, diagrams, or real-time data directly overlaid onto the machinery being worked on.
  • Assembly and Manufacturing: Providing guidance and quality control checks by highlighting parts or showing virtual indicators on physical products.
  • Logistics: Warehouse workers seeing picking information or navigation paths overlaid onto the real warehouse environment.
  • Construction: On-site visualization of BIM models overlaid onto the actual construction progress for inspection and alignment checks.
  • Remote Operation: Controlling robots or drones using a passthrough view from the machine's perspective, augmented with virtual data displays.

Comparison with Optical See-Through AR

Passthrough (Video See-Through, VST) is distinct from Optical See-Through (OST) AR, used by devices like HoloLens and Magic Leap.

Optical See-Through (OST)

  • Uses semi-transparent displays (waveguides, birdbath optics) allowing direct view of the real world. Virtual images are projected onto these combiners.
  • Pros: The real world is seen directly and naturally (zero latency, full resolution, color, and dynamic range). Lower power consumption for viewing the real world. Real-world objects always appear solid.
  • Cons: Virtual elements often appear transparent or "ghostly," lacking solidity. Limited FOV for virtual content is common. Difficulty displaying black (virtual content is additive). Virtual content can be washed out by bright ambient light. Accurate alignment ("registration") of virtual content to the real world can be challenging. Cannot computationally modify the real-world view.

Video Passthrough (VST)

  • Uses cameras and opaque displays to show a reconstructed view of the real world.
  • Pros: Virtual elements can be fully opaque and seamlessly blended. Potential for wider FOV matching the VR display. Can computationally modify the real-world view (for example brightness enhancement, selective filtering). Better blocking of ambient light for virtual content.
  • Cons: Real-world view is mediated by technology, subject to limitations (latency, resolution, color, dynamic range, distortion). Higher power consumption. Potential for discomfort (motion sickness, eye strain) if not implemented well. Real-world objects might appear less "solid" due to latency or artifacts.[2]

VST AR is currently favored in the consumer MR space, leveraging existing VR display technology, while OST AR maintains advantages for applications where unobstructed real-world vision is paramount.

Notable Implementations

  • Meta Quest Series (Quest, Quest 2, Quest Pro, Quest 3): Evolved from basic monochrome safety features to sophisticated, depth-aware color passthrough using ML reconstruction, making MR central to the platform.[19][7]
  • Apple Vision Pro: High-resolution color passthrough as the default mode for "Spatial Computing", emphasizing low latency via a dedicated Apple R1 chip and LiDAR for depth.[20]
  • Varjo XR Series (XR-1, XR-3, XR-4): Industry benchmark for high-fidelity, low-latency color passthrough, aimed at professional/enterprise markets.[23][15]
  • HTC Vive XR Elite: Offers color passthrough with a depth sensor for MR capabilities.[17]
  • Pimax Crystal: High-resolution VR headset incorporating color passthrough features.[22]
  • Lynx R1: Standalone headset project focusing specifically on delivering quality color passthrough at a competitive price point.[34]
  • PlayStation VR2: Features stereo black-and-white passthrough primarily for setup and quick environment checks.[18]
  • Valve Index: Basic stereoscopic monochrome passthrough via front cameras.[35]

Future Developments

Ongoing research and development aim to improve passthrough by:

  • Achieving even lower latency and higher resolution/FOV, approaching the fidelity of human vision.
  • Improving camera dynamic range, color fidelity, and low-light performance.
  • Developing more sophisticated and efficient depth sensing and real-time 3D reconstruction (for example using LiDAR, advanced CV, NeRFs).
  • Integrating AI for enhanced scene understanding, object recognition, segmentation, and interaction modeling (realistic physics, occlusion).
  • Implementing selective passthrough (showing only specific real-world elements like hands or keyboards) and potentially "augmented reality" filters applied to the real-world view.
  • Utilizing eye tracking for foveated rendering of the passthrough feed or dynamic depth-of-field adjustments.
  • Exploring novel camera technologies like light field cameras (for example Meta's "Flamera" concept[36]) to better solve perspective issues.

As technology matures, VST passthrough aims to provide a near-seamless blend between the virtual and physical worlds, potentially unifying VR and AR capabilities into single, versatile devices.

References

  1. XR Today – What is VR Passthrough and How is it Shaping the Future of XR? (Dec 2024)
  2. Revisiting Milgram and Kishino's Reality-Virtuality Continuum – discusses the spectrum including Video See-Through.
  3. Example paper discussing stereoscopic passthrough challenges
  4. MIXED News – Project Cambria: Meta Explains New Passthrough Technology (May 16, 2022)
  5. UploadVR – Passthrough AR: The Technical Challenges of Blending Realities (Oct 23, 2023)
  6. KGOnTech – Perspective-Correct Passthrough (Sept 26, 2023)
  7. Meta Blog – Inside Meta Reality and Passthrough on Quest Pro
  8. UploadVR – Quest 3 Review: Excellent VR With Limited MR (Oct 9, 2023)
  9. Latency Requirements for Plausible Interaction in Augmented and Virtual Reality – research discussing latency impact.
  10. Milgram, P., & Kishino, F. (1994). A taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems, E77-D(12), 1321–1329.
  11. Road to VR – 8 Minutes of the HTC Vive's Front-facing Camera in Action (Mar 10, 2016)
  12. Oculus Rift S Product Documentation (2019)
  13. Meta Quest Blog – Quest 2 Passthrough improvements
  14. Meta Developer Blog – Oculus Experiments With Mixed Reality via New Passthrough API (July 25, 2021)
  15. The Ghost Howls – Varjo XR-3 hands-on review (June 8, 2022)
  16. Reddit – Meta Quest Pro vs PICO 4 Passthrough Comparison (2022 thread)
  17. HTC Vive XR Elite Product Page
  18. Road to VR – PSVR 2 Review (Feb 22, 2023)
  19. Road to VR – Quest 3 Review detailing passthrough improvements
  20. Apple Vision Pro Announcement
  21. The Verge – Apple Vision Pro review: magic, until it's not (Nilay Patel, June 2023)
  22. Pimax Crystal Product Page
  23. Varjo XR-4 Product Page
  24. Auganix – Pico Unveils 'Pico 4 Ultra' (Aug 21, 2024)
  25. UploadVR – Quest 3 Review: Excellent VR With Limited MR (Oct 9, 2023)
  26. UploadVR – Quest 3 Specs Compared to Quest 2 & Apple Vision Pro (Sept 27, 2023)
  27. Varjo Blog – Video Pass-Through XR: Merge Real and Virtual (Urho Konttori, 2020)
  28. Ghosting issue noted in UploadVR Quest 3 Review (Oct 9, 2023)
  29. Ars Technica – Quest 3 Review discussing passthrough quality
  30. TechTarget – What is augmented reality (AR)?
  31. XR Today – VR Passthrough in Enterprise (2024)
  32. VIVE Blog – What is VR Passthrough? Mixed Reality's Secret Sauce
  33. XR Today – What is VR Passthrough and How is it Shaping the Future of XR? (benefits section) (Dec 2024)
  34. Lynx R1 Official Website
  35. Valve Index Hardware Manual (PDF) (Mar 2021)
  36. KGOnTech – Meta Flamera Light-Field Passthrough (Sept 26, 2023)