{{see also|Terms|Technical Terms}}
A '''depth cue''' is any of a variety of perceptual signals that allow the [[human visual system]] to infer the distance or depth of objects in a scene, enabling the brain to transform two-dimensional retinal images into a perception of three-dimensional space. <ref name="HowardRogers2012">Howard, I. P., & Rogers, B. J. (2012). *Perceiving in Depth, Volume 1: Basic Mechanisms*. Oxford University Press.</ref> These cues are crucial for navigating the three-dimensional world and are fundamental to creating convincing, immersive, and comfortable experiences in [[Virtual Reality]] (VR) and [[Augmented Reality]] (AR), where reproducing accurate depth perception presents significant technical challenges. <ref name="HowardRogers1995">Howard, Ian P., and Brian J. Rogers. (1995). *Binocular vision and stereopsis*. Oxford University Press.</ref> The brain automatically fuses multiple available depth cues to build a robust model of the spatial layout of the environment. <ref name="HITLCues1">(2014-06-20) Visual Depth Cues - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge_base/virtual-worlds/EVE/III.A.1.c.DepthCues.html</ref>


==Classification of Depth Cues==
Depth cues are typically classified based on whether they require input from one or both eyes:


*'''[[Binocular Cues]]''': These cues rely on the slightly different perspectives provided by the two eyes, or the state of the eyes themselves.
*'''[[Monocular Cues]]''': These cues can be perceived with only one eye and include:
**'''Physiological Cues''': Related to the physical state or action of the eye(s).
**'''Pictorial Cues''': Static cues that can be perceived in a single 2D image, like a photograph or painting.
**'''Dynamic Cues''': Cues that arise from motion, either of the observer or of objects in the scene.


==Binocular Cues==
These cues are fundamental to [[stereoscopic vision]] and heavily utilized in most VR systems.


===[[Binocular Disparity]] (Stereopsis)===
Because the two eyes are horizontally separated (by the [[interpupillary distance]], or IPD, typically around 6-7 cm), they receive slightly different images of the world. This difference in the image location of an object seen by the left and right eyes is called '''binocular disparity'''. The brain's visual cortex processes this disparity to generate the perception of depth, a phenomenon known as '''[[stereopsis]]'''. <ref name="BlakeWilson2011">Blake, R., & Wilson, H. R. (2011). Binocular vision. *Vision Research, 51*(7), 754-770. doi:10.1016/j.visres.2010.10.009</ref> <ref name="ParkerStereo2007">Parker, Andrew J. (2007). Binocular depth perception and the cerebral cortex. *Nature Reviews Neuroscience, 8*(5), 379-391.</ref> VR headsets exploit this by presenting a separate image with the correct perspective offset to each eye, simulating the natural disparity an observer would experience. It is an especially powerful depth cue for near to mid-range distances. <ref name="HITLCues1"/>
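
To make the geometry concrete, the following minimal sketch (assuming an illustrative 63 mm IPD) computes the vergence angle subtended by a point at a given distance and the difference between two such angles, which is roughly the relative disparity between two objects. The rapid fall-off with distance is why stereopsis is strongest near the viewer; the numbers are example values only.

<syntaxhighlight lang="python">
import math

def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when fixating a point
    straight ahead at the given distance."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))

# Relative disparity between two objects is roughly the difference of their
# vergence angles; it shrinks rapidly as both objects move farther away.
for near, far in [(0.5, 1.0), (2.0, 4.0), (10.0, 20.0)]:
    disparity = vergence_angle_deg(near) - vergence_angle_deg(far)
    print(f"{near:>5.1f} m vs {far:>5.1f} m -> relative disparity ~ {disparity:.3f} deg")
</syntaxhighlight>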


===[[Convergence]] (Vergence)===
This refers to the simultaneous movement of both eyes in opposite directions to maintain single binocular vision. The eyes rotate inward ('''convergence''') to focus on a nearby object, or rotate outward ('''divergence''') for a distant object. The [[extraocular muscles]] that control eye movement provide feedback to the brain about the degree of convergence, which acts as a cue to the object's distance. <ref name="HowardRogers2012"/> <ref name="WattFocusCues2005">Watt, Simon J., Auld, W. S., & Binnie, R. G. (2005). Focus cues affect perceived depth. *Journal of vision, 5*(10), 834-862.</ref> In VR/AR, the required convergence angle changes naturally as a user looks at virtual objects simulated at different distances. Convergence is most effective as a cue at close ranges (within a few meters) and diminishes significantly for distant objects (beyond ~10 meters, the lines of sight are nearly parallel). <ref name="HITLCues1"/> [[Eye tracking]] technology can measure the vergence angle directly.
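
Eye-tracked systems can invert this geometry to estimate the depth at which the user is fixating. A minimal sketch, assuming the tracker reports the total vergence angle between the two visual axes and using an illustrative IPD:

<syntaxhighlight lang="python">
import math

def fixation_distance_m(vergence_angle_deg, ipd_m=0.063):
    """Estimate fixation distance from the total vergence angle
    (angle between the two visual axes), assuming symmetric fixation."""
    half_angle = math.radians(vergence_angle_deg) / 2
    if half_angle <= 0:
        return float("inf")  # parallel lines of sight -> effectively at infinity
    return (ipd_m / 2) / math.tan(half_angle)

# Beyond roughly 10 m the vergence angle is a fraction of a degree,
# which is why the cue loses its usefulness at distance.
for angle in [7.2, 3.6, 0.36, 0.0]:
    print(f"vergence {angle:>5.2f} deg -> fixation ~ {fixation_distance_m(angle):.2f} m")
</syntaxhighlight>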


==Monocular Cues==
These cues provide depth information even when viewing a scene with one eye closed. They are essential for depth perception in everyday life and are heavily relied upon in traditional 2D media as well as being simulated in VR/AR rendering.


===Physiological Monocular Cues===
====[[Accommodation]]====
This refers to the automatic adjustment of the eye's [[lens (anatomy)|lens]] focus to maintain a clear image (retinal focus) of an object as its distance changes. The [[ciliary muscle]] controls the lens shape; the muscular tension or effort involved provides the brain with a cue to the object's distance. <ref name="CuttingVishton1995">Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), *Handbook of perception and cognition: Vol. 5. Perception of space and motion* (pp. 69-117). Academic Press.</ref> <ref name="FisherAccommodation1988">Fisher, Scott K., and Kenneth J. Ciuffreda. (1988). Accommodation and apparent distance. *Perception, 17*(5), 609-621.</ref> This cue is primarily effective for objects within approximately 2 meters and is relatively weak compared to other cues, often working in conjunction with them. <ref name="HITLCues2">(2014-06-20) Accommodation and Convergence - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge-base/virtual-worlds/EVE/III.A.1.a.AccommodationConvergence.html</ref> <ref name="HITLCues1"/>
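
Accommodation demand is commonly expressed in diopters, the reciprocal of the viewing distance in meters, which makes it easy to see why the cue carries little information beyond roughly 2 meters. A small sketch with assumed example distances:

<syntaxhighlight lang="python">
def accommodation_demand_diopters(distance_m):
    """Optical power the eye must add to focus at the given distance."""
    return 1.0 / distance_m

# Demand changes steeply at near distances but barely at all past ~2 m.
for d in [0.25, 0.5, 1.0, 2.0, 6.0, 100.0]:
    print(f"{d:>6.2f} m -> {accommodation_demand_diopters(d):.2f} D")
</syntaxhighlight>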


===Pictorial (Static) Monocular Cues===
These cues are often called "pictorial" because artists use them to create the illusion of depth on a flat canvas. They can be perceived in a static image.


====[[Occlusion]] (Interposition)====
When one object partially blocks the view of another object, the occluding (blocking) object is perceived as being closer. The brain uses the continuity of an object’s outline; an object that uninterruptedly covers another is assumed to be in front. <ref name="HITLCues1"/> This is a very powerful and unambiguous depth cue. <ref name="Palmer1999">Palmer, S. E. (1999). *Vision Science: Photons to Phenomenology*. MIT Press.</ref>


====[[Relative Size]]====
If two objects are known or assumed to be of similar physical size, the one that casts a smaller retinal image (appears smaller) is perceived as being farther away. <ref name="CuttingVishton1995"/> <ref name="HITLCues1"/>


====[[Familiar Size]]====
Prior knowledge of an object's typical physical size can influence perceived distance. For example, if we see an image of a car that appears very small, we perceive it as being far away because we know the standard size range of a car. <ref name="Palmer1999"/>
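
Relative size and familiar size both rest on the same geometry: the visual angle an object subtends shrinks with distance, so a known (or assumed-equal) physical size lets distance be inferred. A minimal sketch, with the car length used purely as an assumed example value:

<syntaxhighlight lang="python">
import math

def visual_angle_deg(size_m, distance_m):
    """Angle subtended at the eye by an object of the given physical size."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

def distance_from_familiar_size(size_m, visual_angle_d):
    """Invert the relation: infer distance from a known size and a measured angle."""
    return size_m / (2 * math.tan(math.radians(visual_angle_d) / 2))

car_length = 4.5  # assumed typical car length in meters
for d in [10, 50, 200]:
    angle = visual_angle_deg(car_length, d)
    print(f"car at {d:>3d} m subtends {angle:5.2f} deg, "
          f"recovered distance ~ {distance_from_familiar_size(car_length, angle):.1f} m")
</syntaxhighlight>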


====[[Relative Height]] (Elevation in the Visual Field)====
For objects resting on the same ground plane, those that are higher in the visual field (closer to the horizon line) are typically perceived as being farther away. For objects above the horizon line (for example clouds), those lower in the visual field are perceived as farther. <ref name="CuttingVishton1995"/> <ref name="OoiHeight2001">Ooi, Teng Leng, Bing Wu, and Zijiang J. He. (2001). Distance determined by the angular declination below the horizon. *Nature, 414*(6860), 197-200.</ref>


====[[Linear Perspective]]====
Parallel lines, such as railway tracks or the edges of a straight road, appear to converge towards a single [[vanishing point]] as they recede into the distance. The degree of convergence provides a strong cue to distance and spatial layout. <ref name="Palmer1999"/> <ref name="SchwartzPerspective2009">Schwartz, Steven H. (2009). *Visual perception: A clinical orientation*. McGraw Hill Professional.</ref>
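
Linear perspective falls out of the standard pinhole (perspective) projection used in computer graphics: projected coordinates scale with 1/depth, so parallel 3D lines map to converging 2D lines. A minimal sketch with an assumed focal length and track geometry:

<syntaxhighlight lang="python">
def project(x, y, z, focal=1.0):
    """Pinhole projection onto an image plane at the assumed focal length."""
    return (focal * x / z, focal * y / z)

# Two parallel rails at x = -0.75 m and x = +0.75 m on the ground (y = -1.6 m),
# sampled at increasing depth. Their projected horizontal separation shrinks
# toward a vanishing point as z grows.
for z in [2, 5, 20, 100]:
    (xl, _), (xr, _) = project(-0.75, -1.6, z), project(0.75, -1.6, z)
    print(f"z = {z:>3d} m: projected separation = {xr - xl:.4f}")
</syntaxhighlight>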


====[[Texture Gradient]]====
The texture of surfaces appears coarser (elements are larger and more spaced out) when close and finer (elements are smaller and denser) when farther away. <ref name="Gibson1950">Gibson, J. J. (1950). *The Perception of the Visual World*. Houghton Mifflin.</ref> This gradual change provides depth information. In VR/AR, techniques like texture mapping and [[Level of Detail]] (LOD) management simulate this cue. <ref name="HITLCues1"/>


====[[Atmospheric Perspective]] (Aerial Perspective)====
Objects at great distances appear less saturated, lower in contrast, hazier, and often shifted towards a bluish hue. This is due to light scattering by particles (dust, water vapor) in the atmosphere. The farther the object, the more pronounced the effect. <ref name="Palmer1999"/> <ref name="OSheaContrast1994">O'Shea, Robert P., Simon J. Blackburn, and Hiroshi Ono. (1994). Contrast as a depth cue. *Vision research, 34*(12), 1595-1604.</ref> <ref name="HITLCues1"/>
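
Real-time renderers commonly approximate aerial perspective with an exponential fog term that blends a surface's color toward a haze color with distance. A minimal sketch; the density and colors are illustrative, not calibrated atmospheric values:

<syntaxhighlight lang="python">
import math

def apply_fog(color, fog_color, distance_m, density=0.02):
    """Blend toward the fog color using an exponential attenuation factor,
    a common approximation of atmospheric scattering."""
    visibility = math.exp(-density * distance_m)   # 1 near the viewer, -> 0 far away
    return tuple(visibility * c + (1 - visibility) * f
                 for c, f in zip(color, fog_color))

red_barn = (0.7, 0.2, 0.1)
haze = (0.65, 0.75, 0.85)   # pale bluish sky haze
for d in [10, 100, 500]:
    print(d, [round(v, 3) for v in apply_fog(red_barn, haze, d)])
</syntaxhighlight>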


====[[Shading]] and [[Lighting]]====
The way light falls on objects creates patterns of light and shadow (shading) that provide crucial cues about their three-dimensional shape, surface curvature, and relative position to light sources and other objects. <ref name="Palmer1999"/> <ref name="RamachandranShading1988">Ramachandran, Vilayanur S. (1988). Perception of shape from shading. *Nature, 331*(6152), 163-166.</ref> Assumptions, such as light typically coming from above, help interpret these cues. Shadows cast by one object onto another also indicate relative position. <ref name="HITLCues1"/>
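
In rendering, this cue can be reproduced by lighting models as simple as Lambert's cosine law, where diffuse brightness depends on the angle between the surface normal and the light direction. A minimal sketch assuming a single directional light from above:

<syntaxhighlight lang="python">
import math

def lambert(normal, light_dir):
    """Diffuse brightness = max(0, N dot L) for unit-length vectors."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)

light_from_above = (0.0, 1.0, 0.0)
# Normals around a bump: the top is brightest, the sides fall off, and
# undersides go dark, a gradient the brain reads as 3D curvature.
for angle_deg in [0, 45, 90, 135]:
    a = math.radians(angle_deg)
    normal = (math.sin(a), math.cos(a), 0.0)
    print(f"surface tilted {angle_deg:>3d} deg from up -> "
          f"brightness {lambert(normal, light_from_above):.2f}")
</syntaxhighlight>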


====[[Relative Clarity]]====
Objects that appear clearer, sharper, and more detailed are often perceived as being closer than objects that appear hazier or less distinct (related to atmospheric perspective but can also apply at shorter distances due to factors like fog or focus). <ref name="FryFog1976">Fry, Glenn A., Kerr, K. E., Trezona, P. W., & Westerberg, C. F. (1976). The effect of fog on the perception of distance. *Human Factors, 18*(4), 342-346.</ref>


===Dynamic Monocular Cues===
These cues rely on motion.


====[[Motion Parallax]]====
As an observer moves their head or body, objects at different distances move at different apparent speeds across the visual field. Closer objects appear to move faster and in the opposite direction relative to the observer's movement compared to more distant objects, which appear to move slower and potentially in the same direction. <ref name="Gibson1950"/> <ref name="RogersMotionParallax1979">Rogers, Brian, and Maureen Graham. (1979). Motion parallax as an independent cue for depth perception. *Perception, 8*(2), 125-134.</ref> For example, when looking out the side window of a moving car, nearby posts zip by while distant trees move slowly. This is a powerful depth cue, effectively utilized in VR/AR systems through [[head tracking]]. <ref name="HITLCues1"/> <ref name="ScienceLearnParallax">Depth perception. Science Learning Hub - Pokapū Akoranga Pūtaiao. Retrieved April 25, 2025, from https://www.sciencelearn.org.nz/resources/107-depth-perception</ref>
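
The strength of motion parallax can be approximated by the retinal angular speed of a stationary point, which for sideways observer motion falls off roughly in proportion to 1/distance. A minimal sketch with an assumed walking speed:

<syntaxhighlight lang="python">
import math

def angular_speed_deg_per_s(lateral_speed_mps, distance_m):
    """Approximate angular speed of a stationary point at the given distance
    while the observer translates sideways past it."""
    return math.degrees(lateral_speed_mps / distance_m)

observer_speed = 1.4  # assumed walking speed, m/s
for d in [2, 10, 50, 200]:
    print(f"object at {d:>3d} m sweeps {angular_speed_deg_per_s(observer_speed, d):6.2f} deg/s")
</syntaxhighlight>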


====[[Kinetic Depth Effect]]====
When a rigid, unfamiliar object rotates, the resulting changes in its two-dimensional projection onto the retina provide information about its three-dimensional structure. <ref name="WallachOConnell1953">Wallach, H., & O'Connell, D. N. (1953). The kinetic depth effect. *Journal of Experimental Psychology, 45*(4), 205-217. doi:10.1037/h0058000</ref>


====[[Ocular Parallax]]====
A subtle cue resulting from the slight shift in perspective that occurs when the eye rotates around its center within the eye socket (distinct from head movement). Objects at different depths shift relative to the retina in slightly different ways during eye rotation, providing potential depth information. <ref name="KudoOcularParallax1988">Kudo, Hiromi, and Hirohiko Ono. (1988). Depth perception, ocular parallax, and stereopsis. *Perception, 17*(4), 473-480.</ref>


==Depth Cues in VR and AR==
Modern VR and AR headsets aim to simulate these depth cues to create immersive, believable, and comfortable virtual and augmented worlds. The successful implementation and integration of depth cues are crucial for the effectiveness and usability of these technologies.


===Current Simulation Approaches and Limitations===
Most consumer VR and AR headsets effectively simulate several key depth cues:
*'''Binocular Disparity''': Achieved by rendering separate images for each eye from slightly different viewpoints calculated based on the user's IPD and the virtual scene geometry (see the sketch after this list).
*'''Convergence''': Users' eyes naturally converge/diverge to fuse the stereoscopic images of virtual objects simulated at different distances.
*'''Motion Parallax''': Enabled by [[head tracking]] (and sometimes [[body tracking]]), which updates the rendered viewpoint based on the user's movements in real-time.
*'''Pictorial Cues''': Occlusion, linear perspective, relative size, texture gradients, shading, and lighting are routinely implemented through standard [[Computer Graphics]] rendering techniques. Atmospheric perspective can be simulated with effects like fog.


However, significant technical challenges remain, particularly in reproducing physiological cues naturally:


====The [[Vergence-Accommodation Conflict]] (VAC)====
A major limitation in most current VR/AR displays is the mismatch between vergence and accommodation cues. Most headsets use [[fixed-focus display]]s, meaning the optics present the virtual image at a fixed focal distance (often 1.5-2 meters or optical infinity), regardless of the simulated distance of the virtual object. <ref name="ARInsiderVAC">(2024-01-29) Understanding Vergence-Accommodation Conflict in AR/VR Headsets - AR Insider. Retrieved April 25, 2025, from https://arinsider.co/2022/06/22/5-ways-to-address-ars-vergence-accommodation-conflict/</ref> <ref name="WikiVAC">Vergence-accommodation conflict - Wikipedia. Retrieved April 25, 2025, from https://en.wikipedia.org/wiki/Vergence-accommodation_conflict</ref> <ref name="DeliverContactsFocus">(2024-07-18) Exploring the Focal Distance in VR Headsets - Deliver Contacts. Retrieved April 25, 2025, from https://delivercontacts.com/research/virtual-reality-the-vergence-accommodation-conflict/</ref> While the user's eyes converge appropriately for the virtual object's simulated distance (for example 0.5 meters), their eyes must maintain focus (accommodate) at the fixed optical distance of the display itself to keep the image sharp. This mismatch between the distance signaled by vergence and the distance signaled by accommodation is known as the '''[[vergence-accommodation conflict]]''' (VAC). <ref name="HoffmanVAC2008">Hoffman, D. M., Girshick, A. R., Akeley, K., & Banks, M. S. (2008). Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. *Journal of Vision, 8*(3), 33. doi:10.1167/8.3.33</ref> <ref name="FacebookVAC2019">Facebook Research. (2019, March 28). *Vergence-Accommodation Conflict: Facebook Research Explains Why Varifocal Matters For Future VR*. YouTube. [https://www.youtube.com/watch?v=YWA4gVibKJE]</ref> <ref name="KramidaVAC2016">Kramida, Gregory. (2016). Resolving the vergence-accommodation conflict in head-mounted displays. *IEEE transactions on visualization and computer graphics, 22*(7), 1912-1931.</ref>
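
The magnitude of the conflict is often expressed as the difference, in diopters, between the vergence distance (where the object is simulated) and the fixed focal distance of the display optics. A minimal sketch assuming a 1.5 m focal distance and an illustrative 0.5 D margin; actual comfort tolerances vary across studies and users:

<syntaxhighlight lang="python">
def vac_mismatch_diopters(object_distance_m, display_focal_m=1.5):
    """Difference between vergence demand (1/object distance) and
    accommodation demand (1/display focal distance), in diopters."""
    return abs(1.0 / object_distance_m - 1.0 / display_focal_m)

COMFORT_MARGIN_D = 0.5  # illustrative threshold, not a standard
for d in [0.3, 0.5, 1.0, 2.0, 10.0]:
    mismatch = vac_mismatch_diopters(d)
    flag = "likely uncomfortable" if mismatch > COMFORT_MARGIN_D else "within margin"
    print(f"virtual object at {d:>4.1f} m -> mismatch {mismatch:.2f} D ({flag})")
</syntaxhighlight>

Note how, with an assumed 1.5 m focal distance, the mismatch grows quickly for near-field objects, consistent with the comfort-zone guidance discussed later.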


The VAC forces the brain to deal with conflicting depth information, potentially leading to several issues:
*Visual fatigue, discomfort, and eye strain <ref name="HoffmanVAC2008"/> <ref name="ARInsiderVAC"/>
*Headaches or [[simulator sickness]] symptoms (nausea, disorientation) <ref name="ARInsiderVAC"/> <ref name="VosVAC2005">Vos, G. A., Barfield, W., & Yamamoto, T. (2005). The Virtual Vertical: Depth Perception and Discomfort in Stereoscopic Displays. ''Presence: Teleoperators & Virtual Environments, 14''(6), 649-664.</ref>
*Difficulty fusing stereoscopic images
*Inaccurate depth and size perception, particularly for near-field objects (within arm's reach) <ref name="JonesVAC2008">Jones, J. A., Swan II, J. E., Singh, G., & Ellis, S. R. (2008). The effects of virtual reality, augmented reality, and motion parallax on egocentric depth perception. *Proceedings of the 5th symposium on Applied perception in graphics and visualization*, 9-16.</ref>
*Reduced realism and immersion


The VAC is particularly problematic for interactions requiring sustained focus or high visual fidelity at close distances (for example virtual surgery simulation, detailed object inspection, reading text on near virtual objects). <ref name="HowardRogers2012"/>


====Other Limitations====
*'''Limited or Incorrect Focus Cues:''' Beyond the fixed focus of VAC, conventional displays lack natural [[Depth of Field]] (blur) cues associated with accommodation. Objects at different virtual depths often appear equally sharp unless blur is artificially simulated.
*'''Limited Ocular Parallax:''' Few systems accurately reproduce the subtle shifts related to eye rotation, though this is becoming more feasible with advanced [[eye tracking]].
*'''Imperfect Atmospheric Effects:''' Simulating realistic atmospheric scattering and haze dynamically remains challenging.


===Advanced Display Technologies Addressing Depth Cue Limitations===
To mitigate or eliminate the VAC and provide more accurate depth cues, researchers and companies are actively developing advanced display technologies:


*'''[[Varifocal Displays]]''': These displays dynamically adjust the focal distance of the display optics (for example using physically moving lenses/screens, [[liquid lens]] technology, or [[deformable mirror]] devices) to match the simulated distance of the object the user is currently looking at (a minimal control-loop sketch follows this list). <ref name="KonradVAC2016">Konrad, R., Cooper, E. A., & Banks, M. S. (2016). Towards the next generation of virtual and augmented reality displays. *Optics Express, 24*(15), 16800-16809. doi:10.1364/OE.24.016800 https://www.computationalimaging.org/publications/accommodation-invariant-near-eye-displays-siggraph-2017/</ref> <ref name="DunnVarifocal2017">Dunn, David, et al. (2017). Wide field of view varifocal near-eye display using see-through deformable membrane mirrors. *IEEE transactions on visualization and computer graphics, 23*(4), 1322-1331.</ref> This typically requires fast and accurate [[eye tracking]] to determine the user's point of gaze and intended focus depth. Varifocal systems often simulate [[Depth of Field]] effects computationally, blurring parts of the scene not at the current focal distance. <ref name="ARInsiderVAC"/> Prototypes like Meta Reality Labs' "Half Dome" series have demonstrated this approach. <ref name="ARInsiderVAC"/>


*'''[[Multifocal Displays]] (Multi-Plane Displays)''': Instead of a single, continuously adjusting focus, these displays present content on multiple discrete focal planes simultaneously or in rapid succession. <ref name="AkeleyMultifocal2004">Akeley, Kurt, Watt, S. J., Girshick, A. R., & Banks, M. S. (2004). A stereo display prototype with multiple focal distances. *ACM transactions on graphics (TOG), 23*(3), 804-813.</ref> The visual system can then accommodate to the plane closest to the target object's depth. Examples include stacked display panels or systems using switchable lenses. Magic Leap 1 used a two-plane system. <ref name="ARInsiderVAC"/> While reducing VAC, they can still exhibit quantization effects if an object lies between planes, and complexity increases with the number of planes.


*'''[[Light Field Displays]]''': These displays aim to reconstruct the [[light field]] of a scene (the distribution of light rays in space) more completely. By emitting rays with the correct origin and direction, they allow the viewer's eye to naturally focus at different depths within the virtual scene, as if viewing a real 3D environment. <ref name="WetzsteinLightField2011">Wetzstein, Gordon, et al. (2011). Computational plenoptic imaging. *Computer Graphics Forum, 30*(8), 2397-2426.</ref> <ref name="Lanman2013">Lanman, D., & Luebke, D. (2013). Near-eye light field displays. *ACM Transactions on Graphics (TOG), 32*(6), 1-10. doi:10.1145/2508363.2508366</ref> This can potentially solve the VAC without requiring eye tracking. However, generating the necessary dense light fields poses significant computational and hardware challenges, often involving trade-offs between resolution, field of view, and form factor. <ref name="ARInsiderVAC"/> Companies like CREAL are developing light field modules for AR/VR. <ref name="WikiVAC"/>


*'''[[Holographic Displays]]''': True [[holography|holographic]] displays aim to reconstruct the wavefront of light from the virtual scene using diffraction, which would inherently provide all depth cues, including accommodation, correctly and continuously. <ref name="MaimoneHolo2017">Maimone, A., Georgiou, A., & Kollin, J. S. (2017). Holographic near-eye displays for virtual and augmented reality. *ACM Transactions on Graphics (TOG), 36*(4), 1-16. doi:10.1145/3072959.3073610</ref> This is often considered an ultimate goal for visual displays. However, current implementations suitable for near-eye displays face major challenges in computational load, achievable [[field of view]], image quality (for example [[speckle noise]]), and component size. <ref name="MaimoneHolo2017"/> <ref name="ARInsiderVAC"/>


*'''[[Retinal Projection]] (Retinal Scan Displays)''': These systems bypass intermediate screens and project images directly onto the viewer's retina, often using low-power lasers or micro-LED arrays. <ref name="ARInsiderVAC"/> Because the image is formed on the retina, it can appear in focus regardless of the eye's accommodation state, potentially eliminating VAC. This approach could enable very compact form factors. Challenges include achieving a sufficiently large [[eye-box]] (the area where the eye can see the image), potential sensitivity to eye floaters or optical path debris, and safety considerations. <ref name="ARInsiderVAC"/> Examples include the discontinued North Focals smart glasses.
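
To make the varifocal approach referenced above concrete, the sketch below shows the kind of per-frame control loop such a system might run: estimate gaze depth from the tracked vergence angle, then drive a tunable lens toward the matching optical power with a rate limit. All functions, step sizes, and values are hypothetical placeholders, not the behavior or API of any real device.

<syntaxhighlight lang="python">
import math

def vergence_depth_m(vergence_angle_deg, ipd_m=0.063):
    """Hypothetical gaze-depth estimate from the eye tracker's vergence angle."""
    half = math.radians(vergence_angle_deg) / 2
    return float("inf") if half <= 0 else (ipd_m / 2) / math.tan(half)

def varifocal_step(current_power_d, vergence_angle_deg, max_step_d=0.3):
    """One control tick: move the tunable lens toward the power (in diopters)
    that matches the estimated gaze depth, limited to a maximum step per frame."""
    depth = vergence_depth_m(vergence_angle_deg)
    target_power = 0.0 if math.isinf(depth) else 1.0 / depth
    delta = max(-max_step_d, min(max_step_d, target_power - current_power_d))
    return current_power_d + delta

power = 0.5  # lens currently focused at 2 m
for frame_angle in [3.6, 3.6, 7.2, 7.2, 7.2]:  # user shifts gaze from ~1 m to ~0.5 m
    power = varifocal_step(power, frame_angle)
    print(f"lens power {power:.2f} D (focused ~ {1/power:.2f} m)")
</syntaxhighlight>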


===Specific Considerations for Augmented Reality===
In AR, correctly rendering depth cues is arguably even more critical and complex than in VR because virtual objects must appear convincingly integrated with the real-world environment, which already provides a rich and consistent set of depth cues. Key challenges include:


*'''Occlusion:''' Virtual objects must realistically occlude real objects behind them, and be occluded by real objects in front of them. This requires accurate real-time 3D reconstruction of the surrounding environment, often using depth sensors and [[Simultaneous Localization and Mapping]] (SLAM) techniques (a minimal per-pixel sketch follows this list). Without correct occlusion, virtual objects may appear as semi-transparent "ghosts" overlaid on reality. <ref name="HowardRogers2012Vol3">Howard, I. P., & Rogers, B. J. (2012). ''Perceiving in Depth, Volume 3: Other Mechanisms of Depth Perception''. Oxford University Press.</ref> <ref name="PubMedOcclusionAR">Kiyokawa, K., Billinghurst, M., Hayes, S. E., & Gupta, A. (2003). An occlusion-capable optical see-through head mount display for supporting co-located collaboration. ''Proceedings. ISMAR 2003. Second IEEE and ACM International Symposium on Mixed and Augmented Reality'', 133-141. doi:10.1109/ISMAR.2003.1240688</ref>
*'''Lighting and Shadows:''' Virtual objects should be lit consistently with real-world lighting conditions and cast plausible shadows onto real surfaces (and receive shadows from real objects) to appear grounded in the environment. <ref name="HowardRogers2012Vol3"/>
*'''Perspective and Scale:''' Virtual objects must be rendered with perspective and size that are consistent with their intended location within the real scene. <ref name="HowardRogers2012Vol3"/>
*'''Focus:''' In optical see-through AR, the fixed focus of virtual objects often conflicts with the user's ability to focus naturally on real objects at different distances, leading to [[focal rivalry]] in addition to VAC. <ref name="ARInsiderVAC"/>
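
A common way to achieve the occlusion behavior noted above is a per-pixel depth test of rendered virtual content against a depth map of the real scene obtained from a depth sensor or reconstruction. A minimal sketch over a single made-up scanline:

<syntaxhighlight lang="python">
def composite_with_occlusion(virtual_rgb, virtual_depth, real_depth, background_rgb):
    """Keep a virtual pixel only where it is closer than the real surface
    measured at the same pixel; otherwise show the (real) background."""
    out = []
    for v_rgb, v_d, r_d, bg in zip(virtual_rgb, virtual_depth, real_depth, background_rgb):
        out.append(v_rgb if v_d is not None and v_d < r_d else bg)
    return out

# One 4-pixel scanline: a virtual cube at 1.2 m, a real pillar at 1.0 m in the
# middle two pixels, open space (3.0 m) elsewhere. The pillar correctly hides the cube.
virtual_rgb   = ["cube", "cube",   "cube",   None]
virtual_depth = [1.2,    1.2,      1.2,      None]
real_depth    = [3.0,    1.0,      1.0,      3.0]
background    = ["wall", "pillar", "pillar", "wall"]
print(composite_with_occlusion(virtual_rgb, virtual_depth, real_depth, background))
</syntaxhighlight>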


===[[Ocular Parallax]], Eye-Tracking and Eye-Box Considerations===
In the context of VR/AR optics, the term 'ocular parallax' is sometimes used differently from the monocular depth cue described earlier. It can refer to the apparent shift in the virtual image relative to the user's eye pupil as the eye moves within the viewing zone (the '[[eye-box]]') of the headset's optics. If not well-managed, this can cause the virtual world to appear unstable or "swim," impacting depth perception and comfort, especially in AR where alignment with the real world is critical. Accurate [[eye tracking]] can help systems compensate for these effects by adjusting the rendering based on precise eye position ("gaze-contingent rendering"). <ref name="ChangEyeTrack2020">Chang, Jen-Hao Rick, et al. (2020). Toward a unified framework for hand-eye coordination in virtual reality. ''2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR)''. IEEE.</ref>


==Health and Comfort Implications==
The incomplete or inconsistent reproduction of depth cues in current VR and AR systems can lead to various negative effects for users:


*'''Visual Fatigue and Discomfort:''' The [[vergence-accommodation conflict]] is a primary contributor to eye strain, headaches, blurred vision, and general visual discomfort, especially during prolonged use. <ref name="HoffmanVAC2008"/> <ref name="ARInsiderVAC"/>
*'''Spatial Perception Errors:''' Inaccurate or conflicting depth cues can lead to misjudgments of distance, size, and the spatial relationships between objects, potentially affecting user performance in tasks requiring precise spatial awareness or interaction. <ref name="JonesVAC2008"/> <ref name="WillemsenHMD2009">Willemsen, Peter, Colton, M. B., Creem-Regehr, S. H., & Thompson, W. B. (2009). The effects of head-mounted display mechanical properties and field of view on distance judgments in virtual environments. ''ACM Transactions on Applied Perception (TAP), 6''(2), 1-14.</ref>
*'''[[Simulator Sickness]]:''' Inconsistencies between visual depth cues and other sensory information (for example vestibular signals from the inner ear) can contribute to symptoms like nausea, disorientation, and dizziness. <ref name="VosVAC2005"/> <ref name="WannAdaptation1995">Wann, John P., Simon Rushton, and Mark Mon-Williams. (1995). Natural problems for stereoscopic depth perception in virtual environments. *Vision research, 35*(19), 2731-2736.</ref>


==Design Considerations for VR/AR Developers==
When designing content and experiences for current VR and AR systems, developers should be mindful of depth cue limitations and best practices:


*'''Leverage Multiple Cues:''' Rely on a combination of available cues (stereo, motion parallax, strong pictorial cues) to create a robust sense of depth. Enhance monocular cues like shadows, perspective, and texture gradients to compensate for limitations in physiological cues. <ref name="CuttingVishton1995"/>
*'''Manage VAC Impact:'''
**'''Comfort Zones:''' Place critical interactive content primarily within the zone of comfortable viewing (often suggested as roughly 0.75-3.5 meters in VR) where VAC effects may be less severe for many users. <ref name="ShibataComfortZone2011">Shibata, Takashi, Kim, J., Hoffman, D. M., & Banks, M. S. (2011). The zone of comfort: Predicting visual discomfort with stereo displays. *Journal of vision, 11*(8), 11-11.</ref> Avoid sustained focus on very near objects (< 0.5m).
**'''Depth Budget:''' Limit the overall range of depths presented simultaneously or avoid rapid, large shifts in depth between near and far objects that force quick vergence changes against a fixed accommodation state.
*'''Guide Attention:''' Use composition, lighting, and visual design to guide the user's focal attention appropriately within the scene.
*'''Simulated Depth of Field:''' Strategically apply computationally rendered blur (simulated [[Depth of Field]]) based on estimated user focus or salient objects to help guide accommodation, mask focus limitations, or enhance realism (a minimal blur-size sketch follows this list). <ref name="DuchowskiDoF2014">Duchowski, Andrew T., et al. (2014). Reducing visual discomfort with HMDs using dynamic depth of field. *IEEE computer graphics and applications, 34*(5), 34-41.</ref>
*'''Consider Interaction Distance:''' Be aware that applications requiring precise manipulation or inspection of virtual objects at close range are most susceptible to VAC issues and benefit most from advanced display technologies that address it.
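
One simple way to drive the simulated depth-of-field blur mentioned above is the thin-lens circle-of-confusion formula, which scales blur with how far a point's depth lies from the assumed focus distance. The focal length and aperture below are illustrative virtual-camera parameters, not measurements of the human eye:

<syntaxhighlight lang="python">
def coc_diameter_mm(object_m, focus_m, focal_mm=20.0, aperture_mm=8.0):
    """Thin-lens circle of confusion: blur-spot diameter for a point at
    object_m when the (virtual) camera is focused at focus_m."""
    f_m = focal_mm / 1000.0
    coc_m = (aperture_mm / 1000.0) * f_m * abs(object_m - focus_m) / (object_m * (focus_m - f_m))
    return coc_m * 1000.0

focus = 1.0  # assumed gaze/focus distance in meters
for d in [0.5, 1.0, 2.0, 10.0]:
    print(f"point at {d:>4.1f} m -> blur spot ~ {coc_diameter_mm(d, focus):.3f} mm")
</syntaxhighlight>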


==Future Directions==
The field of depth perception in VR and AR continues to evolve rapidly. Emerging areas of research and development include:

*'''Perceptual Adaptation:''' Studying how users adapt to inconsistent or unnatural depth cues over time, potentially leading to training paradigms or design strategies that improve comfort on current hardware. <ref name="WannAdaptation1995"/>
*'''Personalized Depth Rendering:''' Calibrating depth cue presentation based on individual user characteristics (for example IPD, visual acuity, refractive error, sensitivity to VAC) for optimized comfort and performance. <ref name="WillemsenHMD2009"/>
*'''[[Cross-modal interaction|Cross-Modal Integration]]:''' Investigating how integrating depth information from other senses (for example [[spatial audio]], [[haptic feedback]]) can enhance or reinforce visual depth perception; a sketch of the underlying cue-combination rule follows this list. <ref name="ErnstCrossModal2002">Ernst, Marc O., and Martin S. Banks. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. *Nature, 415*(6870), 429-433.</ref>
*'''[[Neural rendering|Neural Rendering]] and AI:''' Utilizing machine learning techniques (for example [[Neural Radiance Fields]] (NeRF)) that may render complex scenes with perceptually accurate depth cues more efficiently by learning implicit scene representations; a volume-rendering sketch also appears after this list. <ref name="MildenhallNeRF2020">Mildenhall, Ben, et al. (2020). NeRF: Representing scenes as neural radiance fields for view synthesis. *European conference on computer vision*. Springer, Cham.</ref>
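
The "statistically optimal" integration reported by Ernst and Banks corresponds to weighting each cue's estimate by its reliability (the inverse of its variance). The Python sketch below illustrates that combination rule; the example estimates and variances are invented for illustration only.

<syntaxhighlight lang="python">
# Minimal sketch of inverse-variance cue combination for independent,
# Gaussian depth estimates (the rule described by Ernst & Banks, 2002).

def combine_cues(estimates, variances):
    """Return (combined_estimate, combined_variance) for per-cue estimates
    weighted by their reliabilities (1 / variance)."""
    reliabilities = [1.0 / v for v in variances]
    total_reliability = sum(reliabilities)
    weights = [r / total_reliability for r in reliabilities]
    combined = sum(w * s for w, s in zip(weights, estimates))
    return combined, 1.0 / total_reliability

if __name__ == "__main__":
    # Hypothetical visual and haptic estimates of the same object's distance (metres).
    depth, var = combine_cues(estimates=[1.10, 0.95], variances=[0.01, 0.04])
    print(f"combined depth: {depth:.3f} m (variance {var:.4f})")
</syntaxhighlight>

Note that the combined variance is always lower than that of the most reliable single cue, which is why reinforcing visual depth with audio or haptics can sharpen perceived layout.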
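
For context on how NeRF-style methods produce depth-consistent imagery, the sketch below shows the volume-rendering quadrature that composites per-sample densities and colours along a camera ray, following Mildenhall et al. (2020). The sample values here are made up; in a real system they would come from the trained network, and practical implementations use vectorized tensor code rather than plain Python loops.

<syntaxhighlight lang="python">
import math

# Minimal sketch of NeRF-style alpha compositing along a single camera ray.
# densities, colors, and deltas are illustrative placeholders, not network output.

def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along a ray.

    densities -- per-sample volume density sigma_i (1/metre)
    colors    -- per-sample RGB colour, each a 3-tuple in [0, 1]
    deltas    -- per-sample segment length delta_i (metres)
    """
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    for sigma, rgb, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)      # opacity of this segment
        weight = transmittance * alpha              # contribution of this sample
        pixel = [p + weight * c for p, c in zip(pixel, rgb)]
        transmittance *= (1.0 - alpha)
    return pixel

if __name__ == "__main__":
    rgb = composite_ray(densities=[0.1, 2.0, 5.0],
                        colors=[(1, 0, 0), (0, 1, 0), (0, 0, 1)],
                        deltas=[0.2, 0.2, 0.2])
    print([round(c, 3) for c in rgb])
</syntaxhighlight>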
 
== Conclusion ==
Depth cues are fundamental to human visual perception and represent both a cornerstone and a significant challenge for virtual and augmented reality systems. While current technology effectively simulates many cues, such as binocular disparity, motion parallax, and various pictorial cues, the inability of most displays to correctly reproduce the physiological cue of accommodation leads to the vergence-accommodation conflict, which degrades user comfort, performance, and the overall realism of immersive experiences. Ongoing research and the development of advanced display technologies, including varifocal, multifocal, light field, and holographic systems, promise to overcome these limitations, paving the way for VR and AR experiences with more natural and complete depth perception. A thorough understanding of the interplay and limitations of depth cues remains essential for researchers and developers pushing the boundaries of immersive technologies.


==References==
<ref name="HowardRogers1995">Howard, Ian P., and Brian J. Rogers. (1995). *Binocular vision and stereopsis*. Oxford University Press.</ref>
<ref name="HowardRogers1995">Howard, Ian P., and Brian J. Rogers. (1995). *Binocular vision and stereopsis*. Oxford University Press.</ref>
<ref name="HITLCues1">(2014-06-20) Visual Depth Cues - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge-base/virtual-worlds/EVE/III.A.1.b.VisualDepthCues.html</ref>
<ref name="HITLCues1">(2014-06-20) Visual Depth Cues - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge-base/virtual-worlds/EVE/III.A.1.b.VisualDepthCues.html</ref>
<ref name="BlakeWilson2011">Blake, R., & Wilson, H. R. (2011). Binocular vision. *Vision Research, 51*(7), 754–770. doi:10.1016/j.visres.2010.10.009</ref>
<ref name="BlakeWilson2011">Blake, R., & Wilson, H. R. (2011). Binocular vision. *Vision Research, 51*(7), 754-770. doi:10.1016/j.visres.2010.10.009</ref>
<ref name="ParkerStereo2007">Parker, Andrew J. (2007). Binocular depth perception and the cerebral cortex. *Nature Reviews Neuroscience, 8*(5), 379-391.</ref>
<ref name="ParkerStereo2007">Parker, Andrew J. (2007). Binocular depth perception and the cerebral cortex. *Nature Reviews Neuroscience, 8*(5), 379-391.</ref>
<ref name="WattFocusCues2005">Watt, Simon J., Auld, W. S., & Binnie, R. G. (2005). Focus cues affect perceived depth. *Journal of vision, 5*(10), 834-862.</ref>
<ref name="WattFocusCues2005">Watt, Simon J., Auld, W. S., & Binnie, R. G. (2005). Focus cues affect perceived depth. *Journal of vision, 5*(10), 834-862.</ref>
<ref name="CuttingVishton1995">Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), *Handbook of perception and cognition: Vol. 5. Perception of space and motion* (pp. 69–117). Academic Press.</ref>
<ref name="CuttingVishton1995">Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), *Handbook of perception and cognition: Vol. 5. Perception of space and motion* (pp. 69-117). Academic Press.</ref>
<ref name="FisherAccommodation1988">Fisher, Scott K., and Kenneth J. Ciuffreda. (1988). Accommodation and apparent distance. *Perception, 17*(5), 609-621.</ref>
<ref name="FisherAccommodation1988">Fisher, Scott K., and Kenneth J. Ciuffreda. (1988). Accommodation and apparent distance. *Perception, 17*(5), 609-621.</ref>
<ref name="HITLCues2">(2014-06-20) Accommodation and Convergence - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge-base/virtual-worlds/EVE/III.A.1.a.AccommodationConvergence.html</ref>
<ref name="HITLCues2">(2014-06-20) Accommodation and Convergence - Human Interface Technology Laboratory. Retrieved April 25, 2025, from https://www.hitl.washington.edu/projects/knowledge-base/virtual-worlds/EVE/III.A.1.a.AccommodationConvergence.html</ref>
Line 172: Line 167:
<ref name="FryFog1976">Fry, Glenn A., Kerr, K. E., Trezona, P. W., & Westerberg, C. F. (1976). The effect of fog on the perception of distance. *Human Factors, 18*(4), 342-346.</ref>
<ref name="FryFog1976">Fry, Glenn A., Kerr, K. E., Trezona, P. W., & Westerberg, C. F. (1976). The effect of fog on the perception of distance. *Human Factors, 18*(4), 342-346.</ref>
<ref name="RogersMotionParallax1979">Rogers, Brian, and Maureen Graham. (1979). Motion parallax as an independent cue for depth perception. *Perception, 8*(2), 125-134.</ref>
<ref name="RogersMotionParallax1979">Rogers, Brian, and Maureen Graham. (1979). Motion parallax as an independent cue for depth perception. *Perception, 8*(2), 125-134.</ref>
<ref name="ScienceLearnParallax">Depth perception. Science Learning Hub Pokapū Akoranga Pūtaiao. Retrieved April 25, 2025, from https://www.sciencelearn.org.nz/resources/107-depth-perception</ref>
<ref name="ScienceLearnParallax">Depth perception. Science Learning Hub - Pokapū Akoranga Pūtaiao. Retrieved April 25, 2025, from https://www.sciencelearn.org.nz/resources/107-depth-perception</ref>
<ref name="WallachOConnell1953">Wallach, H., & O'Connell, D. N. (1953). The kinetic depth effect. *Journal of Experimental Psychology, 45*(4), 205–217. doi:10.1037/h0058000</ref>
<ref name="WallachOConnell1953">Wallach, H., & O'Connell, D. N. (1953). The kinetic depth effect. *Journal of Experimental Psychology, 45*(4), 205-217. doi:10.1037/h0058000</ref>
<ref name="KudoOcularParallax1988">Kudo, Hiromi, and Hirohiko Ono. (1988). Depth perception, ocular parallax, and stereopsis. *Perception, 17*(4), 473-480.</ref>
<ref name="KudoOcularParallax1988">Kudo, Hiromi, and Hirohiko Ono. (1988). Depth perception, ocular parallax, and stereopsis. *Perception, 17*(4), 473-480.</ref>
<ref name="ARInsiderVAC">(2024-01-29) Understanding Vergence-Accommodation Conflict in AR/VR Headsets - AR Insider. Retrieved April 25, 2025, from https://arinsider.co/2024/01/29/understanding-vergence-accommodation-conflict-in-ar-vr-headsets/</ref>
<ref name="ARInsiderVAC">(2024-01-29) Understanding Vergence-Accommodation Conflict in AR/VR Headsets - AR Insider. Retrieved April 25, 2025, from https://arinsider.co/2024/01/29/understanding-vergence-accommodation-conflict-in-ar-vr-headsets/</ref>
<ref name="WikiVAC">Vergence-accommodation conflict - Wikipedia. Retrieved April 25, 2025, from https://en.wikipedia.org/wiki/Vergence-accommodation_conflict</ref>
<ref name="WikiVAC">Vergence-accommodation conflict - Wikipedia. Retrieved April 25, 2025, from https://en.wikipedia.org/wiki/Vergence-accommodation_conflict</ref>
<ref name="DeliverContactsFocus">(2024-07-18) Exploring the Focal Distance in VR Headsets - Deliver Contacts. Retrieved April 25, 2025, from https://delivercontacts.com/blog/exploring-the-focal-distance-in-vr-headsets</ref>
<ref name="DeliverContactsFocus">(2024-07-18) Exploring the Focal Distance in VR Headsets - Deliver Contacts. Retrieved April 25, 2025, from https://delivercontacts.com/blog/exploring-the-focal-distance-in-vr-headsets</ref>
<ref name="HoffmanVAC2008">Hoffman, D. M., Girshick, A. R., Akeley, K., & Banks, M. S. (2008). Vergence–accommodation conflicts hinder visual performance and cause visual fatigue. *Journal of Vision, 8*(3), 33. doi:10.1167/8.3.33</ref>
<ref name="HoffmanVAC2008">Hoffman, D. M., Girshick, A. R., Akeley, K., & Banks, M. S. (2008). Vergence-accommodation conflicts hinder visual performance and cause visual fatigue. *Journal of Vision, 8*(3), 33. doi:10.1167/8.3.33</ref>
<ref name="FacebookVAC2019">Facebook Research. (2019, March 28). *Vergence-Accommodation Conflict: Facebook Research Explains Why Varifocal Matters For Future VR*. YouTube. [https://www.youtube.com/watch?v=YWA4gVibKJE]</ref>
<ref name="FacebookVAC2019">Facebook Research. (2019, March 28). *Vergence-Accommodation Conflict: Facebook Research Explains Why Varifocal Matters For Future VR*. YouTube. [https://www.youtube.com/watch?v=YWA4gVibKJE]</ref>
<ref name="KramidaVAC2016">Kramida, Gregory. (2016). Resolving the vergence-accommodation conflict in head-mounted displays. *IEEE transactions on visualization and computer graphics, 22*(7), 1912-1931.</ref>
<ref name="KramidaVAC2016">Kramida, Gregory. (2016). Resolving the vergence-accommodation conflict in head-mounted displays. *IEEE transactions on visualization and computer graphics, 22*(7), 1912-1931.</ref>
<ref name="VosVAC2005">Vos, G. A., Barfield, W., & Yamamoto, T. (2005). The Virtual Vertical: Depth Perception and Discomfort in Stereoscopic Displays. *Presence: Teleoperators & Virtual Environments, 14*(6), 649-664.</ref>
<ref name="JonesVAC2008">Jones, J. A., Swan II, J. E., Singh, G., & Ellis, S. R. (2008). The effects of virtual reality, augmented reality, and motion parallax on egocentric depth perception. *Proceedings of the 5th symposium on Applied perception in graphics and visualization*, 9-16.</ref>
<ref name="JonesVAC2008">Jones, J. A., Swan II, J. E., Singh, G., & Ellis, S. R. (2008). The effects of virtual reality, augmented reality, and motion parallax on egocentric depth perception. *Proceedings of the 5th symposium on Applied perception in graphics and visualization*, 9-16.</ref>
<ref name="KonradVAC2016">Konrad, R., Cooper, E. A., & Banks, M. S. (2016). Towards the next generation of virtual and augmented reality displays. *Optics Express, 24*(15), 16800-16809. doi:10.1364/OE.24.016800</ref>
<ref name="KonradVAC2016">Konrad, R., Cooper, E. A., & Banks, M. S. (2016). Towards the next generation of virtual and augmented reality displays. *Optics Express, 24*(15), 16800-16809. doi:10.1364/OE.24.016800</ref>
Line 189: Line 183:
<ref name="Lanman2013">Lanman, D., & Luebke, D. (2013). Near-eye light field displays. *ACM Transactions on Graphics (TOG), 32*(6), 1-10. doi:10.1145/2508363.2508366</ref>
<ref name="Lanman2013">Lanman, D., & Luebke, D. (2013). Near-eye light field displays. *ACM Transactions on Graphics (TOG), 32*(6), 1-10. doi:10.1145/2508363.2508366</ref>
<ref name="MaimoneHolo2017">Maimone, A., Georgiou, A., & Kollin, J. S. (2017). Holographic near-eye displays for virtual and augmented reality. *ACM Transactions on Graphics (TOG), 36*(4), 1-16. doi:10.1145/3072959.3073610</ref>
<ref name="MaimoneHolo2017">Maimone, A., Georgiou, A., & Kollin, J. S. (2017). Holographic near-eye displays for virtual and augmented reality. *ACM Transactions on Graphics (TOG), 36*(4), 1-16. doi:10.1145/3072959.3073610</ref>
<ref name="HowardRogers2012Vol3">Howard, I. P., & Rogers, B. J. (2012). *Perceiving in Depth, Volume 3: Other Mechanisms of Depth Perception*. Oxford University Press.</ref>
<ref name="PubMedOcclusionAR">Kiyokawa, K., Billinghurst, M., Hayes, S. E., & Gupta, A. (2003). An occlusion-capable optical see-through head mount display for supporting co-located collaboration. *Proceedings. ISMAR 2003. Second IEEE and ACM International Symposium on Mixed and Augmented Reality*, 133-141. doi:10.1109/ISMAR.2003.1240688</ref>
<ref name="ChangEyeTrack2020">Chang, Jen-Hao Rick, et al. (2020). Toward a unified framework for hand-eye coordination in virtual reality. *2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR)*. IEEE.</ref>
<ref name="WillemsenHMD2009">Willemsen, Peter, Colton, M. B., Creem-Regehr, S. H., & Thompson, W. B. (2009). The effects of head-mounted display mechanical properties and field of view on distance judgments in virtual environments. *ACM Transactions on Applied Perception (TAP), 6*(2), 1-14.</ref>
<ref name="WannAdaptation1995">Wann, John P., Simon Rushton, and Mark Mon-Williams. (1995). Natural problems for stereoscopic depth perception in virtual environments. *Vision research, 35*(19), 2731-2736.</ref>
<ref name="WannAdaptation1995">Wann, John P., Simon Rushton, and Mark Mon-Williams. (1995). Natural problems for stereoscopic depth perception in virtual environments. *Vision research, 35*(19), 2731-2736.</ref>
<ref name="ShibataComfortZone2011">Shibata, Takashi, Kim, J., Hoffman, D. M., & Banks, M. S. (2011). The zone of comfort: Predicting visual discomfort with stereo displays. *Journal of vision, 11*(8), 11-11.</ref>
<ref name="ShibataComfortZone2011">Shibata, Takashi, Kim, J., Hoffman, D. M., & Banks, M. S. (2011). The zone of comfort: Predicting visual discomfort with stereo displays. *Journal of vision, 11*(8), 11-11.</ref>
<ref name="DuchowskiDoF2014">Duchowski, Andrew T., et al. (2014). Reducing visual discomfort with HMDs using dynamic depth of field. *IEEE computer graphics and applications, 34*(5), 34-41.</ref>
<ref name="DuchowskiDoF2014">Duchowski, Andrew T., et al. (2014). Reducing visual discomfort with HMDs using dynamic depth of field. *IEEE computer graphics and applications, 34*(5), 34-41.</ref>
<ref name="ErnstCrossModal2002">Ernst, Marc O., and Martin S. Banks. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. *Nature, 415*(6870), 429-433.</ref>
<ref name="MildenhallNeRF2020">Mildenhall, Ben, et al. (2020). Nerf: Representing scenes as neural radiance fields for view synthesis. *European conference on computer vision*. Springer, Cham.</ref>
<ref name="MildenhallNeRF2020">Mildenhall, Ben, et al. (2020). Nerf: Representing scenes as neural radiance fields for view synthesis. *European conference on computer vision*. Springer, Cham.</ref>
</references>
</references>