Recent developments since 2015 have focused on GPU-driven rendering pipelines. Jon Hasselgren, Magnus Andersson, and Tomas Akenine-Möller's 2016 "Masked Software Occlusion Culling" paper introduced [[SIMD]]-optimized software rasterization achieving high performance on CPUs.<ref>Intel. "Software Occlusion Culling". https://www.intel.com/content/www/us/en/developer/articles/technical/software-occlusion-culling.html</ref> The emergence of [[mesh shaders]] in NVIDIA's Turing architecture (2018) and AMD's RDNA2 enabled per-meshlet culling at unprecedented granularity, with implementations in [[Unreal Engine 5]]'s [[Nanite]] and [[Alan Wake II]] showing 40-48% performance improvements.<ref>NVIDIA. "Introduction to Turing Mesh Shaders". https://developer.nvidia.com/blog/introduction-turing-mesh-shaders/</ref>


Modern engines now employ two-phase [[hierarchical depth buffer]] (HiZ) culling: rendering objects visible in the previous frame, building a depth pyramid, then testing newly visible objects against it, which eliminates CPU-GPU synchronization stalls while maintaining efficiency.<ref>Vulkan Guide. "Compute based Culling". https://vkguide.dev/docs/gpudriven/compute_culling/</ref> By the 2020s, mobile VR/AR devices (for example, the Meta Quest series) necessitated custom, lightweight implementations, blending traditional methods with AI-accelerated depth estimation for dynamic scenes.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref>
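
The visibility test at the heart of the second phase can be sketched as follows. This is a simplified illustration rather than any specific engine's implementation: the <code>DepthPyramid</code> and <code>ScreenRect</code> types, the max-reduction convention (each coarser mip stores the farthest depth of the texels beneath it), and the 0 = near / 1 = far depth range are all assumptions made for the example.

<syntaxhighlight lang="cpp">
#include <algorithm>
#include <cmath>
#include <vector>

// Screen-space bounds of an object's bounding volume plus its nearest depth.
struct ScreenRect {
    float minX, minY, maxX, maxY;  // in pixels
    float nearestDepth;            // 0 = near plane, 1 = far plane (assumed)
};

struct DepthPyramid {
    // mips[0] is full resolution; each coarser mip stores the FARTHEST depth
    // of the 2x2 texels beneath it, which keeps the test conservative.
    std::vector<std::vector<float>> mips;
    std::vector<int> widths, heights;

    float farthestDepth(int mip, int x, int y) const {
        x = std::clamp(x, 0, widths[mip] - 1);
        y = std::clamp(y, 0, heights[mip] - 1);
        return mips[mip][y * widths[mip] + x];
    }
};

// Returns true if the object is provably hidden behind the depth pyramid.
bool hizOccluded(const DepthPyramid& hiz, const ScreenRect& r) {
    // Choose the mip where the rect covers roughly one or two texels per axis.
    float sizePx = std::max(r.maxX - r.minX, r.maxY - r.minY);
    int mip = std::clamp(static_cast<int>(std::ceil(std::log2(std::max(sizePx, 1.0f)))),
                         0, static_cast<int>(hiz.mips.size()) - 1);

    int x0 = static_cast<int>(r.minX) >> mip, x1 = static_cast<int>(r.maxX) >> mip;
    int y0 = static_cast<int>(r.minY) >> mip, y1 = static_cast<int>(r.maxY) >> mip;

    // Farthest depth already rendered over the rect's footprint.
    float farthest = 0.0f;
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x)
            farthest = std::max(farthest, hiz.farthestDepth(mip, x, y));

    // If the object's nearest point is farther than everything drawn so far
    // in that region, nothing of it can be visible.
    return r.nearestDepth > farthest;
}
</syntaxhighlight>

In GPU-driven pipelines the equivalent test typically runs in a compute shader that samples the depth pyramid's mip levels, so thousands of objects can be classified per dispatch.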


== Principles and Techniques ==


Occlusion culling operates on the principle that not all scene geometry contributes to the final image, as closer opaque objects can fully obscure distant ones. It is distinct from but complementary to other culling methods. Techniques are broadly classified as image-space (pixel-level, for example Z-buffering) or object-space (geometry-level, for example potentially visible sets, or PVS), with hybrids common in practice.<ref>Wikipedia. "Occlusion culling". https://en.wikipedia.org/wiki/Occlusion_culling</ref>


Algorithms for occlusion culling can be categorized based on when the visibility calculations are performed: during a pre-processing step or on-the-fly at runtime.<ref>Eurographics. "Occlusion Culling Methods". https://diglib.eg.org/bitstream/handle/10.2312/egst20011049/oc-star.pdf</ref>
To avoid testing every object individually (which could itself be expensive if there are thousands of objects), engines commonly organize the scene graph or space partitioning structure in a hierarchy. Occlusion culling can then operate on groups of objects: if an entire group (node) is found to be occluded, all of its children (sub-objects) can be skipped without further checks. This hierarchical approach makes occlusion tests much more scalable to large scenes by quickly discarding large unseen sections.<ref>NVIDIA. "Chapter 29. Efficient Occlusion Culling". GPU Gems. https://developer.nvidia.com/gpugems/gpugems/part-v-performance-and-practicalities/chapter-29-efficient-occlusion-culling</ref>
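
The hierarchical idea can be illustrated with a short traversal sketch; the node layout and the <code>isOccluded</code> placeholder below are assumptions for the example, not a particular engine's data structures.

<syntaxhighlight lang="cpp">
#include <vector>

// Axis-aligned bounding box enclosing a node and all of its descendants.
struct Aabb { float min[3], max[3]; };

struct Node {
    Aabb bounds;
    std::vector<Node*> children;  // empty for leaf nodes
    int meshId = -1;              // valid only for leaves
};

// Placeholder for any conservative occlusion test (HiZ lookup, hardware
// query result, software rasterization of occluders, ...).
bool isOccluded(const Aabb&) { return false; }  // stub: culls nothing

// If a node's bounds are occluded, the entire subtree is skipped, so large
// hidden regions of the scene cost a single test.
void cullHierarchy(const Node& node, std::vector<int>& drawList) {
    if (isOccluded(node.bounds))
        return;                           // prune the whole subtree
    if (node.children.empty()) {
        drawList.push_back(node.meshId);  // visible leaf: queue for drawing
        return;
    }
    for (const Node* child : node.children)
        cullHierarchy(*child, drawList);
}
</syntaxhighlight>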


These occlusion queries (sometimes called "Hierarchical Z-buffer" or "Z-culling" when done at a coarse level) allow dynamic, on-the-fly culling of arbitrary objects without precomputed data. To mitigate latency, engines often use techniques like '''temporal reprojection''' or '''multi-frame queries''' (for example, issuing queries for many objects and using the last frame's results to decide what to draw in the current frame, sometimes known as "round-robin" occlusion culling). This reduces stalls by giving the GPU more time to produce query results in parallel. Unreal Engine, for example, can use asynchronous occlusion queries and even has a "Round Robin Occlusion" mode optimized for VR to distribute query workload across frames.<ref>Epic Games. "Visibility and Occlusion Culling in Unreal Engine". https://dev.epicgames.com/documentation/en-us/unreal-engine/visibility-and-occlusion-culling-in-unreal-engine</ref>
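
A minimal OpenGL-flavored sketch of the multi-frame pattern follows. It assumes an OpenGL 3.3+ context, and <code>ObjectRecord</code> plus the two draw helpers are illustrative stand-ins rather than any engine's API: each object is drawn (or skipped) based on the query issued for it in an earlier frame, and a new query on its bounding box is issued for use later.

<syntaxhighlight lang="cpp">
#include <GL/glew.h>  // assumes an OpenGL 3.3+ context and a loader such as GLEW

struct ObjectRecord {
    GLuint query = 0;
    bool   visibleLastFrame = true;  // draw everything until a result arrives
};

void drawBoundingBox(const ObjectRecord&) { /* submit proxy-box draw call */ }
void drawMesh(const ObjectRecord&)        { /* submit full mesh draw call  */ }

void renderObject(ObjectRecord& obj) {
    if (obj.query == 0) {
        glGenQueries(1, &obj.query);     // first frame: no result to read yet
    } else {
        // Consume an earlier frame's result only if it is already available,
        // so the CPU never stalls waiting on the GPU.
        GLuint available = 0;
        glGetQueryObjectuiv(obj.query, GL_QUERY_RESULT_AVAILABLE, &available);
        if (available) {
            GLuint anySamplesPassed = 0;
            glGetQueryObjectuiv(obj.query, GL_QUERY_RESULT, &anySamplesPassed);
            obj.visibleLastFrame = (anySamplesPassed != 0);
        }
    }

    // Draw the full object only if it was (conservatively) visible before.
    if (obj.visibleLastFrame)
        drawMesh(obj);

    // Issue a new query on cheap proxy geometry, without writing color or depth.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
    glDepthMask(GL_FALSE);
    glBeginQuery(GL_ANY_SAMPLES_PASSED, obj.query);
    drawBoundingBox(obj);
    glEndQuery(GL_ANY_SAMPLES_PASSED);
    glDepthMask(GL_TRUE);
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
}
</syntaxhighlight>

Because results are consumed one or more frames late, this scheme can briefly draw objects that have just become hidden or delay objects that have just become visible, which is why engines keep the bounds conservative and re-test every frame.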


=== Hierarchical Z-Buffering ===
==== Extending the System for Moving Cameras ====


For parts of the game with limited camera movement (for example, along a fixed rail), the system was extended. Instead of baking a full PVS at many points along the path, which would be memory-intensive, they baked a PVS only at key points.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 2: Moving Cameras and Other Insights". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-part-2-moving-cameras-and-other-insights/</ref> For the space between two key points, they stored only the ''difference'': a small list of objects to enable or disable when transitioning from one PVS to the next. This "difference list" approach dramatically reduced the memory footprint and the computational cost of updating visibility as the camera moved.
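
The data layout behind such a scheme can be sketched as follows; the structures and names are illustrative, not the shipped implementation. A full visibility set is stored once per track, and each segment between key points stores only the objects whose visibility flips.

<syntaxhighlight lang="cpp">
#include <cstdint>
#include <vector>

// Objects whose visibility changes when the camera crosses a key point.
struct PvsTransition {
    std::vector<uint32_t> toEnable;   // hidden before the key point, visible after
    std::vector<uint32_t> toDisable;  // visible before the key point, hidden after
};

// One camera path: a full PVS at the first key point, then only differences.
struct PvsTrack {
    std::vector<uint32_t> visibleAtStart;    // full set, stored once
    std::vector<PvsTransition> transitions;  // one entry per path segment
};

// visibility[i] == true means object i is rendered this frame.
void applyTransition(std::vector<bool>& visibility, const PvsTransition& t) {
    for (uint32_t id : t.toDisable) visibility[id] = false;
    for (uint32_t id : t.toEnable)  visibility[id] = true;
}
</syntaxhighlight>

Crossing a key point then costs only a handful of per-object flag writes instead of swapping in an entirely new visibility list.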


This solution combined portal-style room-based culling with the Dead Secret method: baking potentially visible sets by rendering 385 colorized cubemaps at 6×512×512 resolution, storing hand-authored visibility lists per camera position, and maintaining difference lists between adjacent cells for moving cameras. This achieved a 95% draw call reduction (1,400 to 60), enabling a AAA-quality experience on mobile hardware.<ref>Meta for Developers. "Occlusion Culling for Mobile VR - Part 1: Developing a Custom Solution". https://developers.meta.com/horizon/blog/occlusion-culling-for-mobile-vr-developing-a-custom-solution/</ref>
Developers often have to balance quality and performance: for instance, Meta's recent '''Depth API''' for the [[Meta Quest 3]] mixed reality headset offers two modes – "hard occlusion" and "soft occlusion". Hard occlusion uses a coarse depth mask that is cheaper to compute but produces jagged edges in the composite, whereas soft occlusion smooths the mask for more realistic blending at the cost of extra GPU processing.<ref>Learn XR Blog. "Quest 3 Mixed Reality with Meta Depth API – New Occlusion Features!" https://blog.learnxr.io/xr-development/quest-3-mixed-reality-with-meta-depth-api-new-occlusion-features</ref> The Quest 3 can use the depth sensor data to occlude virtual objects with real world depth, bringing AR-like occlusion into a VR/MR headset experience. Initial reports indicate that even the softer occlusion had minimal performance impact on Quest 3, but developers are advised to profile their apps and enable or disable these features depending on the target device's capability.<ref>Learn XR Blog. "Quest 3 Mixed Reality with Meta Depth API – New Occlusion Features!" https://blog.learnxr.io/xr-development/quest-3-mixed-reality-with-meta-depth-api-new-occlusion-features</ref>


Techniques to gather environmental depth include [[structured light]], [[time-of-flight]] cameras, and stereo camera vision, each with limitations in range, lighting, and resolution. When high-quality depth data is available (for example, LiDAR on high-end devices), AR apps can pre-scan the environment and even use a generated mesh as an occluder for virtual content, achieving very convincing occlusion effects. Developers often combine depth-based occlusion with other tricks (like shader-based depth masks or manual placement of invisible occlusion geometry in known locations) to handle occlusion in specific scenarios.<ref>Medium. "Occlusion Culling in Augmented Reality". https://medium.com/@ishtian_rev/occlusion-culling-in-augmented-reality-c1ee433598</ref>
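
The depth comparison at the core of such occlusion masks can be illustrated with a small sketch. This is not the Meta Depth API or any specific SDK: in practice the comparison runs in a fragment shader against a depth texture, and the function, buffer layout, and metric-depth assumption below are purely illustrative. A softness of zero gives a hard cut-off comparable to the "hard occlusion" mode described earlier, while a positive softness feathers the edge as in "soft occlusion".

<syntaxhighlight lang="cpp">
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns a per-pixel mask in [0, 1]: 0 = fully hidden behind the real world,
// 1 = fully visible. Depths are assumed to be linear distances in meters.
std::vector<float> occlusionMask(const std::vector<float>& realDepth,
                                 const std::vector<float>& virtualDepth,
                                 float softness) {
    std::vector<float> mask(realDepth.size(), 1.0f);
    for (std::size_t i = 0; i < realDepth.size(); ++i) {
        // Positive when the sensed real surface is farther than the virtual one.
        float d = realDepth[i] - virtualDepth[i];
        if (softness <= 0.0f)
            mask[i] = (d >= 0.0f) ? 1.0f : 0.0f;                    // hard edge
        else
            mask[i] = std::clamp(d / softness + 0.5f, 0.0f, 1.0f);  // feathered edge
    }
    return mask;
}
</syntaxhighlight>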


In summary, occlusion culling in AR serves both to avoid rendering unseen virtual content (improving performance) and to correctly hide virtual objects behind real objects (improving realism). As AR hardware advances, the fidelity of depth sensing and environmental understanding is improving, which makes occlusion more accurate. Nonetheless, it remains a challenging problem: as one AR developer noted, '''"the hardest challenge for creating an occlusion mask is reconstructing a good enough model of the real world"''' in real time. As AR continues to evolve toward mixed reality (MR) with devices like HoloLens and Meta Quest, the line between virtual and real occlusion blurs. A fully spatially aware device can occlusion-cull virtual objects against both virtual and real geometry seamlessly. Ultimately, solving occlusion in AR boosts both performance (by not rendering what isn't visible) and immersion (by making virtual content obey real-world line-of-sight). Both aspects are essential for convincing and comfortable AR experiences.