{{Infobox technique
| name = Occlusion Culling
| image = 
| caption = 
| type = Rendering optimization
| used_in = Virtual Reality (VR) and Augmented Reality (AR)
| developer = 
| year = 
| website = 
}}
'''Occlusion culling''' is a rendering optimization technique that prevents graphics systems from processing and drawing geometry hidden behind other objects in a scene. Unlike view [[frustum culling]], which removes objects outside the camera's field of view, occlusion culling removes objects that lie within the view frustum but are hidden behind closer geometry.

Occlusion culling is a form of [[hidden surface determination]] (also known as hidden-surface removal) and is closely related to other culling methods like [[back-face culling]] and view-[[frustum culling]].<ref>BytePlus. "Real-time rendering in AR for VR: Techniques & insights". https://www.byteplus.com/en/topic/240382</ref> In [[3D computer graphics]], if one object completely hides another from the camera's perspective, the hidden object is "culled" (not rendered) to save processing time. By default, game engines already perform frustum culling and back-face culling, but they might still issue [[draw call]]s for objects that are within the view frustum even if those objects are entirely behind other geometry. This causes unnecessary overdraw and wasted [[GPU]] work as hidden pixels get drawn over by nearer pixels.<ref>Unity Technologies. "Optimizing your VR/AR Experiences – Unity Learn Tutorial". https://learn.unity.com/tutorial/optimizing-your-vr-ar-experiences</ref>


For [[VR]]/[[AR]] specifically, occlusion culling addresses the unique challenge of [[stereoscopic rendering]], which requires rendering each scene twice, once for each eye, effectively doubling the rendering workload. [[John Carmack]] at [[Oculus]] established that VR requires motion-to-photons [[latency]] below 20 milliseconds for the delay to be imperceptible to humans, making every optimization critical.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref> Modern VR-specific innovations like [[Umbra Software]]'s "Stereo Camera" feature perform a single occlusion query for a spherical volume encompassing both eyes, effectively halving the required processing time compared to traditional per-eye approaches.<ref>Road to VR. "Umbra Positioning Occlusion Culling Tech for 120 FPS VR Gaming". https://www.roadtovr.com/umbra-software-occlusion-culling-120-fpt-virtual-reality-gaming/</ref>


The technique operates by performing visibility tests before geometry enters the [[rendering pipeline]], rather than processing all objects and discarding occluded fragments during [[rasterization]]. The optimal occlusion culling algorithm would select only visible objects for rendering, but practical implementations balance accuracy, performance overhead, and implementation complexity.<ref>Game Developer. "Occlusion Culling Algorithms". https://www.gamedeveloper.com/programming/occlusion-culling-algorithms</ref> Modern approaches range from CPU-based precomputed visibility systems that trade memory for runtime performance, to fully GPU-driven pipelines using [[compute shaders]] and [[mesh shaders]] that eliminate CPU-GPU synchronization entirely.
=== Hierarchical Approaches (1990s) ===


The modern era of occlusion culling began with [[Ned Greene]], Michael Kass, and Gavin Miller's seminal 1993 [[SIGGRAPH]] paper "Hierarchical Z-buffer Visibility."<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref> This groundbreaking work introduced the hierarchical visibility algorithm using [[octrees]] for scene organization and the Z-pyramid concept, a hierarchical image pyramid of depth values enabling efficient culling of occluded regions. Their technique generated roughly one hundred times fewer depth comparisons than conventional Z-buffering, establishing hierarchical approaches as the path forward.


Parallel developments included Hansong Zhang's 1997 [[Hierarchical Occlusion Maps]] (HOM) algorithm, which extended depth-buffer methods for systems without hardware Z-pyramid support by using opacity thresholds for approximate visibility culling.<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref>
Recent developments since 2015 have focused on GPU-driven rendering pipelines. Jon Hasselgren, Magnus Andersson, and Tomas Akenine-Möller's 2016 "Masked Software Occlusion Culling" paper introduced [[SIMD]]-optimized software rasterization achieving high performance on CPUs.<ref>Intel. "Software Occlusion Culling". https://www.intel.com/content/www/us/en/developer/articles/technical/software-occlusion-culling.html</ref> The emergence of [[mesh shaders]] in NVIDIA's Turing architecture (2018) and AMD's RDNA2 enabled per-meshlet culling at unprecedented granularity, with implementations in [[Unreal Engine 5]]'s [[Nanite]] and [[Alan Wake II]] showing 40-48% performance improvements.<ref>NVIDIA. "Introduction to Turing Mesh Shaders". https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/</ref>


Modern engines now employ two-phase [[hierarchical depth buffer]] (HiZ) culling: rendering objects visible in the previous frame, building a depth pyramid, and then testing newly visible objects against it, which eliminates CPU-GPU synchronization while maintaining efficiency.<ref>Vulkan Guide. "Compute based Culling". https://vkguide.dev/docs/gpudriven/compute_culling/</ref> By the 2020s, mobile VR/AR devices (for example, the Meta Quest series) necessitated custom, lightweight implementations, blending traditional methods with AI-accelerated depth estimation for dynamic scenes.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref>


== Principles and Techniques ==


Occlusion culling operates on the principle that not all scene geometry contributes to the final image, as closer opaque objects can fully obscure distant ones. It is distinct from but complementary to other culling methods. Techniques are broadly classified as image-space (pixel-level, for example Z-buffering) or object-space (geometry-level, for example potentially visible sets (PVS)), with hybrids common in practice.<ref>Wikipedia. "Occlusion culling". https://en.wikipedia.org/wiki/Occlusion_culling</ref>


Algorithms for occlusion culling can be categorized based on when the visibility calculations are performed: during a pre-processing step or on-the-fly at runtime.<ref>Eurographics. "Occlusion Culling Methods". https://diglib.eg.org/bitstream/handle/10.2312/egst20011049/oc-star.pdf</ref>
Hardware occlusion queries leverage the GPU's depth testing capabilities to determine object visibility. The technique issues visibility checks directly to the GPU, rendering [[bounding volumes]] with color and depth writes disabled, then retrieving the count of fragments that passed the depth test.<ref>NVIDIA. "Chapter 6. Hardware Occlusion Queries Made Useful". GPU Gems 2. https://developer.nvidia.com/gpugems/gpugems2/part-i-geometric-complexity/chapter-6-hardware-occlusion-queries-made-useful</ref> If the query returns zero visible pixels, the enclosed object is occluded and can be skipped. This approach became standard after GPU support emerged in 2001 and remains the default in [[Unreal Engine]] for dynamic scenes.<ref>Epic Games. "Visibility and Occlusion Culling in Unreal Engine". https://dev.epicgames.com/documentation/en-us/unreal-engine/visibility-and-occlusion-culling-in-unreal-engine</ref>


The fundamental challenge is pipeline latency: queries travel from the CPU to the GPU queue to rasterization and back, typically requiring 1-3 frames for results. Naïve implementations that wait for query results cause CPU stalls that starve the GPU, negating any performance benefit. The solution is the coherent hierarchical culling algorithm developed by Bittner and Wimmer, which exploits [[temporal coherence]] by assuming objects visible in the previous frame remain visible, rendering them immediately while asynchronously querying previously occluded objects organized in a spatial hierarchy like an [[octree]] or [[k-d tree]].<ref>NVIDIA. "Chapter 6. Hardware Occlusion Queries Made Useful". GPU Gems 2. https://developer.nvidia.com/gpugems/gpugems2/part-i-geometric-complexity/chapter-6-hardware-occlusion-queries-made-useful</ref>
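The following is a minimal C++ sketch of this pattern against OpenGL 3.3+, without the spatial hierarchy: each object's query result is consumed only when it is already available (so a previous frame's visibility drives the current frame's draws), the object is rendered if it was recently visible, and a new query is then issued by rasterizing its bounding volume with color and depth writes disabled. The GLAD loader and the <code>drawObject()</code>/<code>drawBoundingBox()</code> helpers are assumptions for illustration, not part of any cited implementation.

<syntaxhighlight lang="cpp">
// Hedged sketch: per-object hardware occlusion queries with stall-free,
// one-or-more-frames-late readback. Assumes an active OpenGL 3.3+ context
// (for example created with GLFW and loaded with GLAD).
#include <glad/glad.h>
#include <vector>

struct QueriedObject {
    GLuint query = 0;                // occlusion query object name
    bool   queryIssued = false;      // at least one query has been submitted
    bool   visibleLastFrame = true;  // assume visible until a query says otherwise
};

// Hypothetical application hooks (not a real API): submit the full mesh,
// or just its bounding volume, to the GPU.
void drawObject(const QueriedObject& obj);
void drawBoundingBox(const QueriedObject& obj);

void initQueries(std::vector<QueriedObject>& objects) {
    for (auto& obj : objects)
        glGenQueries(1, &obj.query);
}

void drawWithOcclusionQueries(std::vector<QueriedObject>& objects) {
    for (auto& obj : objects) {
        // 1. Consume an earlier result only if it is ready; never wait for the GPU.
        if (obj.queryIssued) {
            GLuint available = 0;
            glGetQueryObjectuiv(obj.query, GL_QUERY_RESULT_AVAILABLE, &available);
            if (available) {
                GLuint anySamplesPassed = 0;
                glGetQueryObjectuiv(obj.query, GL_QUERY_RESULT, &anySamplesPassed);
                obj.visibleLastFrame = (anySamplesPassed != 0);
            }
        }

        // 2. Temporal coherence: draw the object if it was recently visible.
        if (obj.visibleLastFrame)
            drawObject(obj);

        // 3. Re-test visibility by rasterizing the bounding box invisibly.
        glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
        glDepthMask(GL_FALSE);
        glBeginQuery(GL_ANY_SAMPLES_PASSED, obj.query);
        drawBoundingBox(obj);
        glEndQuery(GL_ANY_SAMPLES_PASSED);
        glDepthMask(GL_TRUE);
        glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
        obj.queryIssued = true;
    }
}
</syntaxhighlight>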


To avoid testing every object individually (which could itself be expensive if there are thousands of objects), engines commonly organize the scene graph or space partitioning structure in a hierarchy. Occlusion culling can then operate on groups of objects: if an entire group (node) is found to be occluded, all of its children (sub-objects) can be skipped without further checks. This hierarchical approach makes occlusion tests much more scalable to large scenes by quickly discarding large unseen sections.<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref>
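As an illustration of this grouping, the sketch below walks a bounding-volume tree and skips an entire subtree as soon as its combined bounds test as occluded. The <code>isOccluded()</code> and <code>drawObject()</code> functions are assumed stand-ins for whatever visibility test (hardware query result, software depth test, PVS lookup) and draw-submission path a particular engine uses.

<syntaxhighlight lang="cpp">
// Hedged sketch of hierarchical occlusion culling over a scene hierarchy.
#include <vector>

struct AABB { float min[3]; float max[3]; };

struct SceneNode {
    AABB bounds;                       // encloses this node and all of its children
    std::vector<SceneNode> children;   // empty for leaf nodes
    int  objectId = -1;                // renderable payload on leaves
};

bool isOccluded(const AABB& bounds);   // assumed: engine-specific visibility test
void drawObject(int objectId);         // assumed: engine-specific draw submission

void cullAndDraw(const SceneNode& node) {
    // If the whole group is hidden, every descendant is hidden too: skip the subtree.
    if (isOccluded(node.bounds))
        return;

    if (node.children.empty()) {
        if (node.objectId >= 0)
            drawObject(node.objectId);
        return;
    }
    for (const SceneNode& child : node.children)
        cullAndDraw(child);            // descend only into potentially visible groups
}
</syntaxhighlight>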


These occlusion queries (sometimes called "Hierarchical Z-buffer" or "Z-culling" when done at a coarse level) allow dynamic, on-the-fly culling of arbitrary objects without precomputed data. To mitigate latency, engines often use techniques like '''temporal reprojection''' or '''multi-frame queries''' (for example, issuing queries for many objects and using the last frame's results to decide what to draw in the current frame, sometimes known as "round-robin" occlusion culling). This reduces stalls by giving the GPU more time to produce query results in parallel. Unreal Engine, for example, can use asynchronous occlusion queries and even has a "Round Robin Occlusion" mode optimized for VR to distribute query workload across frames.<ref>Epic Games. "Visibility and Occlusion Culling in Unreal Engine". https://dev.epicgames.com/documentation/en-us/unreal-engine/visibility-and-occlusion-culling-in-unreal-engine</ref>
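A hedged sketch of that round-robin idea follows: each frame only one slice of the objects is re-queried, and every object is drawn or skipped according to the most recent result it has. The group count and the <code>issueOcclusionQuery()</code>/<code>drawObject()</code> calls are illustrative assumptions rather than any engine's actual API.

<syntaxhighlight lang="cpp">
// Hedged sketch of round-robin occlusion query scheduling.
#include <cstddef>
#include <vector>

constexpr std::size_t kQueryGroups = 4;              // each object is re-tested every 4th frame

struct CulledObject { bool cachedVisible = true; };  // updated when its query result arrives

void issueOcclusionQuery(std::size_t objectIndex);   // assumed: asynchronous GPU query
void drawObject(std::size_t objectIndex);            // assumed: draw submission

void scheduleQueriesAndDraw(std::vector<CulledObject>& objects, std::size_t frameIndex) {
    for (std::size_t i = 0; i < objects.size(); ++i) {
        // Only this frame's slice is re-tested; results are consumed on a later frame.
        if (i % kQueryGroups == frameIndex % kQueryGroups)
            issueOcclusionQuery(i);
        // Every object is drawn (or skipped) using its latest cached visibility.
        if (objects[i].cachedVisible)
            drawObject(i);
    }
}
</syntaxhighlight>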


=== Hierarchical Z-Buffering ===
Some occlusion culling systems use a software renderer or depth buffer on the CPU to simulate the view and determine visibility. For example, a system might rasterize the scene's biggest occluders to a low-resolution depth buffer, then test each object's bounding box against this buffer to see if it is behind a filled depth pixel. This is essentially performing a custom visibility test in software without involving GPU queries (avoiding GPU pipeline stalls).<ref>Unity Technologies. "Optimizing your VR/AR Experiences – Unity Learn Tutorial". https://learn.unity.com/tutorial/optimizing-your-vr-ar-experiences</ref>


The primary advantage is zero-frame latency: visibility determination happens entirely before rendering begins, preventing the pop-in artifacts common with asynchronous GPU queries. This makes it particularly valuable for VR, where even single-frame delays create noticeable temporal artifacts.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref> The technique also frees GPU resources for rendering rather than visibility testing, which is crucial on bandwidth-constrained mobile VR platforms.
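A minimal sketch of the test itself, assuming the scene's largest occluders have already been rasterized into a small CPU-side depth buffer (conventional depth, where larger values are farther away): an object survives if any texel under its screen-space bounding rectangle is farther than the object's nearest depth. The resolution and conventions are illustrative assumptions.

<syntaxhighlight lang="cpp">
// Hedged sketch of a CPU-side depth-buffer visibility test.
#include <algorithm>
#include <cstddef>
#include <vector>

struct SoftwareDepthBuffer {
    int width = 256, height = 128;      // deliberately low resolution
    std::vector<float> depth;           // filled by rasterizing the biggest occluders

    SoftwareDepthBuffer() : depth(std::size_t(width) * height, 1.0f) {}  // 1.0 = far plane

    // Conservative test of an object's screen-space bounding rectangle (pixel
    // coordinates) against the occluder depths. nearestZ is the closest depth
    // of the object's bounding volume.
    bool isRectVisible(int minX, int minY, int maxX, int maxY, float nearestZ) const {
        minX = std::max(minX, 0);           minY = std::max(minY, 0);
        maxX = std::min(maxX, width - 1);   maxY = std::min(maxY, height - 1);
        for (int y = minY; y <= maxY; ++y)
            for (int x = minX; x <= maxX; ++x)
                if (nearestZ < depth[std::size_t(y) * width + x])
                    return true;            // this texel is not covered by a closer occluder
        return false;                       // every texel is occluded: cull the object
    }
};
</syntaxhighlight>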


==== Masked Software Occlusion Culling ====
=== Hierarchical Occlusion Maps ===


[[Hierarchical Occlusion Maps]] (HOM), introduced by Hansong Zhang in 1997, separate visibility testing into a 2D overlap test and a 1D depth test. The system renders occluders as white-on-black images without shading, builds an opacity pyramid through averaging, and tests objects by checking opacity values at appropriate hierarchy levels.<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref> An opacity threshold provides controllable approximate culling, accepting slight inaccuracies in exchange for performance.


The depth component uses a coarse grid (typically 64×64) storing the furthest Z-value per region, estimated from occluder bounding box vertices. This two-stage approach reduces memory compared to full-resolution depth buffers while enabling automatic occluder fusion through the opacity pyramid. The technique's advantage over pure HZB is its flexibility in accuracy: developers tune the threshold to balance culling aggressiveness against potential false positives. Techniques like Hierarchical Occlusion Maps use multi-resolution buffers to quickly cull objects by testing against progressively lower-detail occluder representations.<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref>
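A hedged sketch of the 2D half of that test, under the assumption of a square, power-of-two occlusion map: level 0 stores occluder opacity (1.0 means fully covered), coarser levels are built by averaging 2×2 blocks, and a screen rectangle passes the overlap test when every overlapped texel at a chosen level exceeds the threshold. The map size, the threshold value, and the omission of the companion depth-grid test are simplifications.

<syntaxhighlight lang="cpp">
// Hedged sketch of building and querying a hierarchical occlusion (opacity) map.
#include <cstddef>
#include <utility>
#include <vector>

struct OpacityMap {
    int size;                    // square map, size x size texels
    std::vector<float> opacity;  // 0.0 = empty, 1.0 = fully covered by occluders
};

// Build successively coarser maps by 2x2 averaging (size assumed to be a power of two).
std::vector<OpacityMap> buildOpacityPyramid(OpacityMap base) {
    std::vector<OpacityMap> pyramid;
    pyramid.push_back(std::move(base));
    while (pyramid.back().size > 1) {
        const OpacityMap& prev = pyramid.back();
        int half = prev.size / 2;
        OpacityMap next{ half, std::vector<float>(std::size_t(half) * half) };
        for (int y = 0; y < half; ++y)
            for (int x = 0; x < half; ++x)
                next.opacity[std::size_t(y) * half + x] = 0.25f *
                    (prev.opacity[std::size_t(2 * y)     * prev.size + 2 * x] +
                     prev.opacity[std::size_t(2 * y)     * prev.size + 2 * x + 1] +
                     prev.opacity[std::size_t(2 * y + 1) * prev.size + 2 * x] +
                     prev.opacity[std::size_t(2 * y + 1) * prev.size + 2 * x + 1]);
        pyramid.push_back(std::move(next));
    }
    return pyramid;
}

// 2D overlap test at one level: is the rectangle (texel coordinates, assumed in range)
// covered everywhere above the opacity threshold? The separate depth test against the
// coarse Z-grid still has to pass before the object is actually culled.
bool rectFullyOpaque(const OpacityMap& level, int minX, int minY, int maxX, int maxY,
                     float threshold = 0.95f) {
    for (int y = minY; y <= maxY; ++y)
        for (int x = minX; x <= maxX; ++x)
            if (level.opacity[std::size_t(y) * level.size + x] < threshold)
                return false;   // a gap in the occluders: the object may be visible
    return true;
}
</syntaxhighlight>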


=== Modern GPU-Driven Approaches ===


The cutting edge of occlusion culling as of 2024-2025 centers on fully GPU-driven rendering pipelines that eliminate CPU involvement entirely. [[Mesh shaders]], available since NVIDIA Turing (2018) and AMD RDNA2, enable programmable geometry processing with per-meshlet culling granularity.<ref>NVIDIA. "Introduction to Turing Mesh Shaders". https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/</ref> Meshlets, small clusters of 32-256 triangles, undergo individual frustum, backface cone, and occlusion tests on the GPU before rasterization.
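One of those per-meshlet tests can be illustrated with a backface cone: if every triangle in a cluster faces away from the camera, the whole cluster can be rejected before rasterization. The sketch below uses a commonly seen apex/axis/cutoff encoding for the cone; the exact encoding and where the test runs (task/amplification shader versus compute) vary by implementation, so treat this as an assumption-laden illustration rather than a specific engine's code.

<syntaxhighlight lang="cpp">
// Hedged sketch of a per-meshlet backface cone test.
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  normalize(Vec3 v) {
    float len = std::sqrt(dot(v, v));      // assumed non-zero for a valid cone apex
    return { v.x / len, v.y / len, v.z / len };
}

struct MeshletCone {
    Vec3  apex;     // representative point of the cluster
    Vec3  axis;     // average facing direction of the cluster's triangles (unit length)
    float cutoff;   // cosine-like cutoff; a value of 1.0 effectively disables the test
};

// Returns true when the whole meshlet is guaranteed back-facing from cameraPos and
// can therefore be culled before any of its triangles reach rasterization.
bool coneCullMeshlet(const MeshletCone& cone, Vec3 cameraPos) {
    Vec3 view = normalize(sub(cone.apex, cameraPos));
    return dot(view, cone.axis) >= cone.cutoff;
}
</syntaxhighlight>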


Two-phase HiZ culling represents the industry standard modern technique. Phase one renders objects visible in the previous frame, exploiting temporal coherence. Phase two builds a hierarchical depth buffer using compute shaders, tests newly visible objects against this HiZ, and updates visibility bits for the next frame.<ref>Vulkan Guide. "Compute based Culling". https://vkguide.dev/docs/gpudriven/compute_culling/</ref> This approach provides one-frame latency without CPU-GPU synchronization, maintaining efficiency while handling dynamic scenes. [[Unity 6]]'s GPU Resident Drawer and [[Unreal Engine 5]]'s Nanite exemplify this architecture, using current and previous frame depth textures to avoid missing newly visible objects.<ref>Unity. "Use GPU occlusion culling". https://docs.unity3d.com/Packages/[email protected]/manual/gpu-culling.html</ref>
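The depth comparison at the heart of phase two can be sketched as follows, written here on the CPU purely for clarity; in a real pipeline the same logic runs in a compute shader against the depth pyramid. The sketch assumes conventional depth (0 = near, 1 = far), so each coarser mip stores the maximum (farthest) depth of its source texels, and it simplifies mip selection and screen-space projection.

<syntaxhighlight lang="cpp">
// Hedged sketch of a hierarchical-Z (HiZ) occlusion test against a max-depth pyramid.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct HiZMip {
    int width, height;
    std::vector<float> maxDepth;   // farthest depth per texel at this mip level
};

// pyramid[0] is full resolution; each following mip halves the dimensions.
// minX/minY/maxX/maxY is the object's screen rectangle in mip-0 pixels and
// nearestDepth is the closest depth of its bounding volume.
bool hizOccluded(const std::vector<HiZMip>& pyramid,
                 float minX, float minY, float maxX, float maxY, float nearestDepth) {
    // Pick a mip coarse enough that the rectangle covers only a few texels.
    float extent = std::max(maxX - minX, maxY - minY);
    int mip = std::min(int(std::ceil(std::log2(std::max(extent, 1.0f)))),
                       int(pyramid.size()) - 1);
    const HiZMip& level = pyramid[std::size_t(mip)];
    float scale = 1.0f / float(1 << mip);          // mip-0 pixels -> texels at this mip

    int x0 = std::clamp(int(minX * scale), 0, level.width - 1);
    int y0 = std::clamp(int(minY * scale), 0, level.height - 1);
    int x1 = std::clamp(int(maxX * scale), 0, level.width - 1);
    int y1 = std::clamp(int(maxY * scale), 0, level.height - 1);

    float farthestOccluder = 0.0f;
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            farthestOccluder = std::max(farthestOccluder,
                                        level.maxDepth[std::size_t(y) * level.width + x]);

    // Occluded only if the object's nearest point is behind everything in its footprint.
    return nearestDepth >= farthestOccluder;
}
</syntaxhighlight>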
=== Latency Requirements ===


The stringent latency requirements compound these challenges. John Carmack established that VR needs motion-to-photons latency below 20 milliseconds to avoid perceptible lag.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref> With rendering limited to approximately 3 milliseconds on the CPU thread before vertical sync on console VR, visibility culling becomes a critical bottleneck. Hardware occlusion queries, with their 1-3 frame delay, cause unacceptable pop-in artifacts in which objects suddenly appear or disappear; this is dramatically more noticeable in VR than in flat-screen gaming and particularly jarring if it occurs in only one eye.


Another consideration is the cost of culling itself. Occlusion culling computations (whether running queries on GPU or doing software tests on CPU) take time, and in VR the time budget per frame is very low. If the scene is simple, the overhead of occlusion culling might outweigh its benefits. Therefore, developers must profile VR scenes to ensure that enabling occlusion culling is actually yielding a net gain. In many cases, VR titles with complex environments do see major performance gains from occlusion culling.
==== Extending the System for Moving Cameras ====


For parts of the game with limited camera movement (for example, along a fixed rail), the system was extended. Instead of baking a full PVS at many points along the path, which would be memory-intensive, they baked a PVS only at key points.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 2: Moving Cameras and Other Insights". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-part-2-moving-cameras-and-other-insights/</ref> For the space between two key points, they stored only the ''difference'': a small list of objects to enable or disable when transitioning from one PVS to the next. This "difference list" approach dramatically reduced the memory footprint and the computational cost of updating visibility as the camera moved.
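A hedged sketch of that data layout and update: full PVS snapshots exist only at key points, and crossing the boundary between two adjacent key points applies a small enable/disable delta rather than swapping whole sets. The structures and the <code>setObjectVisible()</code> hook are illustrative assumptions, not the shipped implementation.

<syntaxhighlight lang="cpp">
// Hedged sketch of PVS "difference lists" between key points on a camera rail.
#include <vector>

void setObjectVisible(int objectId, bool visible);   // assumed engine hook

struct PvsTransition {
    std::vector<int> enableWhenMovingForward;    // becomes visible past this key point
    std::vector<int> disableWhenMovingForward;   // stops being visible past this key point
};

// Apply the delta when the camera crosses from key point i toward key point i+1
// (movingForward == true) or back again; only the small difference lists are touched.
void applyTransition(const PvsTransition& t, bool movingForward) {
    const std::vector<int>& toEnable  = movingForward ? t.enableWhenMovingForward
                                                      : t.disableWhenMovingForward;
    const std::vector<int>& toDisable = movingForward ? t.disableWhenMovingForward
                                                      : t.enableWhenMovingForward;
    for (int id : toEnable)  setObjectVisible(id, true);
    for (int id : toDisable) setObjectVisible(id, false);
}
</syntaxhighlight>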


This solution combined portal-style room-based culling with the Dead Secret method: baking potentially visible sets by rendering 385 colorized cubemaps at 6×512×512 resolution, storing hand-authored visibility lists per camera position, and maintaining difference lists between adjacent cells for moving cameras. This achieved a 95% draw call reduction (1,400 to 60), enabling a AAA-quality experience on mobile hardware.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref>
[[Augmented reality]] presents an entirely different challenge: achieving realistic occlusion of virtual content behind real-world objects. This requires real-time three-dimensional sensing and reconstruction of the physical environment to generate occlusion geometry or depth maps. For visual realism, modern AR frameworks have introduced depth sensing and occlusion capabilities. For example, Apple's [[ARKit]] and Google's [[ARCore]] provide depth APIs that let the device understand the geometry of the real world to some extent.


[[Google]]'s [[ARCore]] Depth API, released from beta in June 2020, democratized this capability by using monocular depth estimation, capturing multiple images as the device moves and calculating depth through motion parallax triangulation.<ref>MDPI. "Occlusion Handling for Mobile AR Applications in Indoor and Outdoor Scenarios". https://www.mdpi.com/1424-8220/23/9/4245</ref> ARCore's Depth API produces a depth map of the environment using sensors and camera data (via techniques like [[structured light]], [[time-of-flight]], or stereo vision), which the AR application can use to allow real objects to occlude virtual objects.<ref>Google Developers. "Depth adds realism – ARCore Depth API documentation". https://developers.google.com/ar/develop/depth</ref> This works on standard mobile cameras across 200+ million devices without requiring specialized [[Time-of-Flight]] sensors. In practice, AR developers can enable "environment occlusion" so that, when the device knows the distance to real surfaces (walls, furniture, people, etc.), it will not draw virtual content that is behind those surfaces relative to the camera. Google emphasizes that "occlusion – accurately rendering a virtual object behind real-world objects – is paramount to an immersive AR experience".<ref>Google Developers. "Depth adds realism – ARCore Depth API documentation". https://developers.google.com/ar/develop/depth</ref>


Apple's ARKit provides a feature called '''people occlusion''', which uses real-time person segmentation and depth data to allow people in the camera view to properly occlude AR objects. For instance, if a virtual creature runs behind a person in the view, ARKit can use the person's depth silhouette to hide the creature when it passes behind them. ARKit's documentation notes that a person will occlude a virtual object only when the person is closer to the camera than that object, ensuring correct depth ordering.<ref>Apple Developer Documentation. "Occluding virtual content with people – ARKit". https://developer.apple.com/documentation/arkit/occluding-virtual-content-with-people</ref> These depth-based occlusion techniques greatly increase realism, but they rely on hardware capabilities (like LiDAR scanners or dual cameras) or advanced computer vision, and they can be computationally expensive.
From a performance standpoint, occlusion culling in AR has a slightly different focus. Many handheld AR applications involve relatively few virtual objects (compared to full VR or gaming scenes), so the classic performance gains of occlusion culling (reducing draw calls for hundreds of off-screen models) might be less dramatic. However, as AR use cases grow (for example, outdoor AR gaming or AR cloud experiences with many virtual elements), culling hidden objects is still important to keep frame rates high on mobile devices.


AR developers are encouraged to use the same optimizations as VR: level-of-detail, batching, and occlusion culling for virtual content that might be off-screen or behind other virtual objects. In fact, Unity's AR Foundation toolkit integrates occlusion culling just as in VR: for example, Unity will not render virtual objects that are outside the camera view or completely behind other virtual objects in the scene. This helps save processing power on phones or AR glasses.<ref>Unity Technologies. "Optimizing your VR/AR Experiences – Unity Learn Tutorial". https://learn.unity.com/tutorial/optimizing-your-vr-ar-experiences</ref>


=== Challenges and Limitations ===
Developers often have to balance quality and performance: for instance, Meta's recent '''Depth API''' for the [[Meta Quest 3]] mixed reality headset offers two modes – "hard occlusion" and "soft occlusion". Hard occlusion uses a coarse depth mask that is cheaper to compute but produces jagged edges in the composite, whereas soft occlusion smooths the mask for more realistic blending at the cost of extra GPU processing.<ref>Learn XR Blog. "Quest 3 Mixed Reality with Meta Depth API – New Occlusion Features!" https://blog.learnxr.io/xr-development/quest-3-mixed-reality-with-meta-depth-api-new-occlusion-features</ref> The Quest 3 can use the depth sensor data to occlude virtual objects with real world depth, bringing AR-like occlusion into a VR/MR headset experience. Initial reports indicate that even the softer occlusion had minimal performance impact on Quest 3, but developers are advised to profile their apps and enable or disable these features depending on the target device's capability.<ref>Learn XR Blog. "Quest 3 Mixed Reality with Meta Depth API – New Occlusion Features!" https://blog.learnxr.io/xr-development/quest-3-mixed-reality-with-meta-depth-api-new-occlusion-features</ref>


Techniques to gather environmental depth include [[structured light]], [[time-of-flight]] cameras, and stereo camera vision, each with limitations in range, lighting, and resolution. When high-quality depth data is available (for example, LiDAR on high-end devices), AR apps can pre-scan the environment and even use a generated mesh as an occluder for virtual content, achieving very convincing occlusion effects. Developers often combine depth-based occlusion with other tricks (like shader-based depth masks or manual placement of invisible occlusion geometry in known locations) to handle occlusion in specific scenarios.<ref>Medium. "Occlusion Culling in Augmented Reality". https://medium.com/@ishtian_rev/occlusion-culling-in-augmented-reality-c1ee433598</ref>


In summary, occlusion culling in AR serves both to avoid rendering unseen virtual content (improving performance) and to correctly hide virtual objects behind real objects (improving realism). As AR hardware advances, the fidelity of depth sensing and environmental understanding is improving, which makes occlusion more accurate. Nonetheless, it remains a challenging problem: as one AR developer noted, '''"the hardest challenge for creating an occlusion mask is reconstructing a good enough model of the real world"''' in real time. As AR continues to evolve toward mixed reality (MR) with devices like HoloLens and Meta Quest, the line between virtual and real occlusion blurs. A fully spatially aware device can occlusion-cull virtual objects against both virtual and real geometry seamlessly. Ultimately, solving occlusion in AR boosts both performance (by not rendering what isn't visible) and immersion (by making virtual content obey real-world physics of line-of-sight). Both aspects are essential for convincing and comfortable AR experiences.
=== GPU and CPU Overhead ===


GPU savings manifest through multiple mechanisms. [[Draw call]] reduction proves critical, as each eliminated draw call saves CPU-side validation, state changes, and GPU command processing; Republique VR's reduction from 1,400 to 60 draw calls freed enormous overhead.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref> [[Overdraw]] reduction directly saves fragment shading work; in indoor levels with high depth complexity, 50-90% of GPU fragment processing can be eliminated. [[Early-Z]] rejection operates at the hardware level but only helps when objects render front-to-back, whereas occlusion culling prevents occluded objects from entering the pipeline entirely.


The CPU overhead varies by technique:
=== Comparison with Other Culling Techniques ===


[[Frustum culling]] removes objects outside the camera's field of view through simple geometric tests at very low cost; it should always run first and always remain enabled. Occlusion culling then operates on the frustum-culled set, eliminating hidden objects within the view. The techniques multiply: approximately 3x from frustum culling times 3x from occlusion culling yields roughly 8-9x combined, as Intel demonstrated.<ref>Intel. "Software Occlusion Culling". https://www.intel.com/content/www/us/en/developer/articles/technical/software-occlusion-culling.html</ref>


[[Level of Detail]] (LOD) serves an entirely different purpose: reducing triangle counts for visible distant objects rather than eliminating objects entirely. LOD makes rendering cheaper; occlusion culling renders fewer things. The ideal pipeline runs frustum culling, occlusion culling, LOD selection based on distance or screen size, then [[backface culling]] during rasterization (hardware-accelerated, essentially free). Stacking these optimizations provides 10x or greater total gains.<ref>Unity Technologies. "Occlusion Culling - Unity Manual". https://docs.unity3d.com/Manual/OcclusionCulling.html</ref>
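The ordering can be summarized with a short sketch: cheap frustum rejection first, occlusion testing on the survivors, then LOD selection for what remains, with backface culling left to the hardware during rasterization. The predicates and helpers here are assumed stand-ins for engine code, not a specific API.

<syntaxhighlight lang="cpp">
// Hedged sketch of stacking the culling and LOD stages in the usual order.
#include <vector>

struct Renderable { int id; float distanceToCamera; };

bool insideFrustum(const Renderable& r);          // assumed: bounds vs. frustum planes
bool occluded(const Renderable& r);               // assumed: HiZ test, query result, or PVS lookup
int  selectLod(float distanceToCamera);           // assumed: distance or screen-size based
void submitDrawCall(int objectId, int lodLevel);  // assumed: engine draw submission

void buildDrawList(const std::vector<Renderable>& scene) {
    for (const Renderable& r : scene) {
        if (!insideFrustum(r)) continue;          // outside the view: cheapest rejection first
        if (occluded(r))       continue;          // inside the view but hidden behind occluders
        submitDrawCall(r.id, selectLod(r.distanceToCamera));
        // Backface culling of individual triangles happens later in fixed-function hardware.
    }
}
</syntaxhighlight>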


[[Backface culling]] operates differently still: a hardware-accelerated, fixed-function GPU stage eliminates triangles facing away from the camera during rasterization using simple dot product tests. This provides approximately 50% fragment shading reduction for closed opaque meshes at zero CPU cost. It runs automatically after all visibility culling and should remain enabled for solid geometry.


== Implementation in Game Engines ==
By default, every Unreal Engine project uses a combination of View Frustum Culling and dynamic Hardware Occlusion Queries (HOQ). The system issues visibility checks to the GPU each frame per-Actor, using the scene depth buffer for queries.<ref>Epic Games. "Visibility and Occlusion Culling in Unreal Engine". https://dev.epicgames.com/documentation/en-us/unreal-engine/visibility-and-occlusion-culling-in-unreal-engine</ref> This means that occlusion culling is always active and works for both static and dynamic objects without requiring any manual baking process. The engine automatically issues queries to the GPU for objects within the view frustum to determine if they are occluded by other objects closer to the camera.


This enables longer view distances compared to max draw distance settings and works for movable and non-movable Actors supporting opaque and masked blend modes. The inherent one-frame latency can cause "popping" with rapid camera movement: objects suddenly appear as visibility predictions lag behind actual view changes.


==== Dynamic Occlusion System ====
=== Godot Engine ===


[[Godot Engine]] implements occlusion culling in Godot 4.x using the [[Embree]] library, a software raytracing library from Intel, for CPU-based rasterization. The system bakes simplified representations of static geometry using OccluderInstance3D nodes and tests [[AABBs]] against occluder shapes at runtime. Setup involves enabling Occlusion Culling in the Rendering section of the Project Settings, creating occluders through automatic baking or manual authoring, and baking the occlusion data. The technique proves most effective for indoor scenes with many small rooms and particularly benefits the Mobile renderer, which lacks a depth prepass.


== Non-Rendering Applications ==
=== Implementation Guidelines ===


Profile first to confirm GPU-bound scenarios before implementing occlusion culling. Always combine with frustum culling for multiplicative gains. Use hierarchical testing by grouping objects and testing large nodes first to reduce query counts. Exploit temporal coherence by reusing previous-frame results: objects visible last frame likely remain visible.<ref>NVIDIA. "Chapter 6. Hardware Occlusion Queries Made Useful". GPU Gems 2. https://developer.nvidia.com/gpugems/gpugems2/part-i-geometric-complexity/chapter-6-hardware-occlusion-queries-made-useful</ref>


Pipelining the processing, working on frame N while rendering frame N+1, eliminates stalls. Conservative testing prevents missing visible objects: false negatives cause obvious artifacts, while false positives merely reduce efficiency.


For Unity, mark static objects appropriately and tune cell sizes to balance accuracy versus memory. For Unreal, monitor stat initviews and combine hardware queries with distance culling and precomputed visibility as appropriate.<ref>Epic Games. "Visibility and Occlusion Culling in Unreal Engine". https://dev.epicgames.com/documentation/en-us/unreal-engine/visibility-and-occlusion-culling-in-unreal-engine</ref>
=== Mesh Shader Integration ===


Mesh shader integration enables unprecedented culling granularity. Rather than culling entire meshes or objects, per-meshlet culling at 32-256 triangle granularity provides fine-grained efficiency.<ref>NVIDIA. "Introduction to Turing Mesh Shaders". https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/</ref> The performance benefits prove substantial, with 40-48% improvements documented across multiple shipping titles, but hardware requirements remain a constraint. As of 2025, mesh shaders require NVIDIA Turing/RTX 2000+ (2018), AMD RDNA2+ (2020), or Intel Arc, limiting deployment to relatively recent hardware.


=== AR Depth Sensing Democratization ===


For AR specifically, depth sensing democratization represents the major development. Google's ARCore Depth API release from beta in June 2020 brought real-world occlusion to 200+ million devices using monocular depth estimation, capturing multiple images as devices move and calculating depth through motion parallax without requiring Time-of-Flight sensors.<ref>MDPI. "Occlusion Handling for Mobile AR Applications in Indoor and Outdoor Scenarios". https://www.mdpi.com/1424-8220/23/9/4245</ref> This software-based approach works on standard mobile cameras, though it fuses ToF data when available on premium devices.


=== Neural Networks and Machine Learning ===


The future trajectory involves [[neural networks]] and [[machine learning]] for improved depth estimation and visibility prediction. Research demonstrates deep learning models predicting depth from single images with increasing accuracy, potentially replacing or augmenting physical depth sensors. However, real-time performance for complex dynamic scenes remains a research challenge as of 2025: practical deployment requires models that run in milliseconds on mobile GPUs, a constraint current approaches struggle to meet.


== See Also ==