Light field

A light field (also spelled lightfield) is a fundamental concept in optics and computer graphics that describes the amount of light traveling in every direction through every point in space.[1][2] Formally, it is a function that gives the radiance of light rays at every position and direction within a given region of space. Light fields matter for virtual reality (VR) and augmented reality (AR) because they allow visual scenes to be captured and reproduced with a high degree of realism, including effects such as parallax and post-capture refocusing.[3]

History

The concept of measuring light rays has early roots. Michael Faraday first speculated in 1846 in his lecture "Thoughts on Ray Vibrations" that light should be understood as a field, similar to the magnetic field he had studied.[4] The term "light field" (svetovoe pole in Russian) was more formally defined by Andrey Gershun in a classic 1936 paper on the radiometric properties of light in three-dimensional space.[5]

In the context of computer vision and graphics, the concept was further developed with the introduction of the 7D plenoptic function by Adelson and Bergen in 1991.[6] This function describes all possible light rays, parameterized by 3D position (x, y, z), 2D direction (θ, φ), wavelength (λ), and time (t).

Practical computational approaches often reduce the dimensionality. Two seminal papers in 1996, "Light Field Rendering" by Levoy and Hanrahan[1] and "The Lumigraph" by Gortler et al.[2], independently proposed using a 4D subset of the plenoptic function for capturing and rendering complex scenes. They introduced the highly influential two-plane parameterization (2PP), simplifying the representation by defining rays based on their intersection points with two parallel planes. This 4D light field representation forms the basis for most modern light field capture and display technologies.

Theory and Representation

The core idea behind the light field is to capture not just the intensity of light arriving at a point (like a conventional camera), but also the direction from which that light is arriving.

The Plenoptic Function

The most complete representation is the 7D plenoptic function, P(x, y, z, θ, φ, λ, t), describing the radiance of light at any 3D point (x, y, z), in any direction (θ, φ), for any wavelength (λ), at any time (t).[6] For many applications this is overly complex and contains redundant information: for example, radiance is constant along a straight ray in free space, so much of the positional dependence is redundant.

4D Light Field

For static scenes under constant illumination, the time (t) and wavelength (λ, often simplified to RGB channels) dependencies can be dropped, leaving a 5D function of position and direction. Furthermore, because radiance is constant along a ray in free space, one of these dimensions is redundant and the function reduces to a value per ray. The resulting, and most common, simplification is the 4D light field.[1][2]

Two-Plane Parameterization (2PP)

This popular 4D parameterization defines a light ray by its intersection points with two reference planes (conventionally parallel), often denoted the (u, v) plane and the (s, t) plane. A ray is thus uniquely identified by the four coordinates (u, v, s, t), and the light field assigns it a radiance value L(u, v, s, t).[1] (A small coordinate-conversion sketch follows the list below.) This representation is convenient because:

  • It relates well to how light field cameras with microlens arrays capture data.
  • It simplifies rendering algorithms, which often involve resampling this 4D function.
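
As an illustration of the two-plane parameterization, the following minimal sketch (Python with NumPy; the plane positions z_uv and z_st are arbitrary choices for this example rather than part of any standard convention) converts a ray given by an origin and a direction into its (u, v, s, t) coordinates by intersecting it with two parallel planes perpendicular to the z axis:

    import numpy as np

    def ray_to_2pp(origin, direction, z_uv=0.0, z_st=1.0):
        """Parameterize a ray by its intersections with two parallel planes.

        The (u, v) plane sits at z = z_uv and the (s, t) plane at z = z_st
        (illustrative choices). origin and direction are 3-vectors; the ray
        must not be parallel to the planes.
        """
        origin = np.asarray(origin, dtype=float)
        direction = np.asarray(direction, dtype=float)
        if np.isclose(direction[2], 0.0):
            raise ValueError("ray is parallel to the parameterization planes")
        t_uv = (z_uv - origin[2]) / direction[2]   # ray parameter at the (u, v) plane
        t_st = (z_st - origin[2]) / direction[2]   # ray parameter at the (s, t) plane
        u, v = (origin + t_uv * direction)[:2]
        s, t = (origin + t_st * direction)[:2]
        return u, v, s, t

Rays parallel to the two planes cannot be represented by a single plane pair, which is why practical systems such as Levoy and Hanrahan's light slabs use several pairs to cover all directions.[1]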

Other 4D parameterizations exist, such as using one point and two angles, or using spherical coordinates.

Light Field Capture

Capturing a light field involves sampling the intensity and direction of light rays within a scene. Several methods exist:

  • Light Field Cameras (Plenoptic Cameras): These are the most common devices. They typically insert a microlens array between the main lens and the image sensor.[7][3] Each microlens separates the light arriving from different parts of the main lens aperture, directing those samples onto different pixels on the sensor below it. The sensor thus records not only the total light hitting each microlens (spatial information) but also how that light is distributed directionally (angular information). Commercial examples have included consumer cameras from Lytro (now defunct) and industrial cameras from Raytrix.
  • Camera Arrays: A synchronized array of conventional cameras can capture a light field.[8] Each camera samples the scene from a different viewpoint. By combining the images and knowing the cameras' precise positions and orientations, the 4D light field can be reconstructed. This approach often yields higher spatial resolution than single plenoptic cameras but requires careful calibration and synchronization.
  • Scanning/Gantry Setups: A single camera moved precisely to multiple positions can sequentially capture the views needed to sample the light field. This is suitable for static scenes.
  • Coded Aperture Photography: Placing a patterned mask (coded aperture) near the sensor or aperture can modulate incoming light in a way that allows directional information to be computationally recovered.[9]
  • Computational Photography Techniques: Various other methods combining optics and computation are continuously being developed.

The raw data captured by these methods needs significant processing to reconstruct the 4D light field representation, L(u, v, s, t).
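
As a simplified illustration of this reconstruction step for a plenoptic camera, the sketch below (Python with NumPy) regroups raw sensor pixels into the 4D indexing L[u, v, s, t], where (u, v) is the pixel position under each microlens (the angular sample) and (s, t) is the microlens position (the spatial sample). It assumes an idealized, perfectly aligned square microlens grid; real decoders must additionally handle hexagonal lens layouts, rotation, vignetting, and per-device calibration.

    import numpy as np

    def decode_plenoptic(raw, n_u, n_v):
        """Rearrange an idealized raw plenoptic image into a 4D light field.

        raw: 2D sensor image of shape (S * n_u, T * n_v), where each microlens
        covers an n_u x n_v block of pixels and there are S x T microlenses.
        Returns an array indexed as L[u, v, s, t].
        """
        S = raw.shape[0] // n_u
        T = raw.shape[1] // n_v
        blocks = raw.reshape(S, n_u, T, n_v)   # split rows/columns into per-microlens blocks
        return blocks.transpose(1, 3, 0, 2)    # reorder axes to (u, v, s, t)

Fixing (u, v) and varying (s, t) in the decoded array yields one sub-aperture image, i.e., a conventional photograph of the scene as seen through one small part of the main lens aperture.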

Light Field Rendering and Display

Once a light field is captured or synthetically generated, it can be used to render new views of the scene.

Rendering

Rendering novel views involves sampling the 4D light field data. For a desired virtual camera position and orientation, the rendering algorithm calculates which rays from the light field would reach the virtual camera's sensor plane and integrates their radiance values.[1] This allows for:

  • Refocusing: Shifting the virtual focal plane after capture by appropriately integrating rays (a shift-and-add sketch follows this list).
  • Changing Depth of Field (DoF): Adjusting the aperture size computationally.
  • Small Viewpoint Shifts: Generating views from slightly different positions than the original capture positions, enabling parallax effects.
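
A common way to implement post-capture refocusing is "shift-and-add": each sub-aperture view is translated in proportion to its angular offset and the results are averaged, so that one depth plane comes into alignment while the rest blur. The minimal sketch below (Python with NumPy, assuming the light field is stored as an array of sub-aperture images indexed [u, v, y, x]; integer pixel shifts are used only to keep the example short, where a real implementation would interpolate) illustrates the idea.

    import numpy as np

    def refocus(lf, shift_per_view):
        """Synthetic refocusing by shift-and-add.

        lf: 4D array of shape (U, V, H, W) holding sub-aperture images.
        shift_per_view: pixels of shift per unit of angular offset; changing
        this value moves the virtual focal plane.
        """
        U, V, H, W = lf.shape
        u0, v0 = (U - 1) / 2.0, (V - 1) / 2.0
        out = np.zeros((H, W), dtype=np.float64)
        for u in range(U):
            for v in range(V):
                dy = int(round((u - u0) * shift_per_view))
                dx = int(round((v - v0) * shift_per_view))
                # np.roll gives integer shifts without extra dependencies;
                # subpixel interpolation would be used in practice.
                out += np.roll(np.roll(lf[u, v], dy, axis=0), dx, axis=1)
        return out / (U * V)

Averaging over only a subset of the (u, v) views corresponds to a smaller synthetic aperture and therefore a deeper depth of field, which is how computational DoF control is achieved.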

Light Field Displays

Displaying light fields aims to reproduce the captured directional light rays, allowing the viewer to perceive depth and parallax naturally by moving their head. This is a key area of research for future VR and AR head-mounted displays (HMDs). Approaches include:

  • Multi-layer Displays: Using stacked LCD or OLED panels with attenuating layers to sculpt the light directionally.[10]
  • Microlens Array Displays: Essentially the reverse of a plenoptic camera: a display panel (e.g., OLED) emits light through a microlens array to project different images in different directions (see the interlacing sketch after this list).[11]
  • Projector Arrays: Using multiple micro-projectors to beam images onto directional screens (e.g., lenticular sheets or anisotropic diffusers).
  • Holographic Optical Elements (HOEs): Using diffractive optics to steer light rays appropriately.
  • Volumetric Displays: Creating a true 3D image in a volume of space, though often distinct from pure light field displays that reproduce rays intersecting a plane.
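
For the microlens-array approach, generating the panel image amounts to interlacing the view images so that each microlens covers one small block of display pixels, with each pixel in the block emitted toward a different direction. The sketch below (Python with NumPy, assuming an idealized display with exactly one panel pixel per lenslet per direction, and ignoring the optical inversion and calibration a real lenslet introduces) shows the basic rearrangement, essentially the inverse of the plenoptic-camera decoding step sketched earlier.

    import numpy as np

    def interlace_views(views):
        """Interlace a grid of view images into a single panel image for an
        idealized integral-imaging (microlens-array) display.

        views: array of shape (n_u, n_v, H, W); view (u, v) should be emitted
        toward direction (u, v). Each lenslet covers an n_u x n_v pixel block,
        so the panel resolution is (H * n_u, W * n_v).
        """
        n_u, n_v, H, W = views.shape
        # Pixel (y, x) of view (u, v) lands at panel position (y*n_u + u, x*n_v + v).
        return views.transpose(2, 0, 3, 1).reshape(H * n_u, W * n_v)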

These display technologies aim to provide more realistic visual cues compared to traditional stereoscopic displays.

Applications in VR and AR

Light field technology holds immense promise for VR and AR by enabling more immersive and visually comfortable experiences:

  • Correct Parallax: Light fields inherently capture parallax information. Viewers using light field displays can move their heads slightly and see the scene perspective shift correctly, significantly enhancing realism and immersion, particularly crucial for six degrees of freedom (6DoF) experiences.[12]
  • View-dependent Effects: Complex interactions of light, such as reflections and refractions, change based on viewpoint. Light fields capture these effects, allowing them to be reproduced accurately in VR/AR headsets.
  • Post-Capture Refocusing and DoF Control: Light field recordings allow users or applications to change focus or DoF after the fact, which could be used for cinematic effects, accessibility features, or gaze-tracked rendering.
  • Addressing the Vergence-Accommodation Conflict (VAC): Conventional stereoscopic displays present conflicting depth cues: the eyes converge at the depth of the virtual object, but accommodate (focus) at the fixed distance of the display screen. This mismatch can cause eye strain and nausea. Light field displays aim to present light rays that appear to originate from the correct depths, allowing the eye to focus naturally and potentially mitigating the VAC.[13]
  • Realistic Capture for VR/AR Content: Light field cameras can capture real-world scenes that can be explored more naturally in VR/AR than traditional 360° video or photogrammetry models, preserving subtle lighting effects.
  • Enhanced Depth Estimation: The angular information in light fields provides strong cues for calculating accurate depth maps of captured scenes (see the sketch after this list).
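
As a rough illustration of disparity-based depth estimation from the angular dimension, the sketch below (Python with NumPy, assuming a single horizontal row of grayscale sub-aperture views stacked into an array of shape (n_views, H, W); the candidate disparity range is an arbitrary choice for the example) tests a set of candidate disparities and keeps, per pixel, the one that makes the shifted views most consistent with the central view. Disparity between adjacent views is proportional to inverse depth.

    import numpy as np

    def estimate_disparity(views, candidates=np.linspace(-2.0, 2.0, 21)):
        """Per-pixel disparity (inverse depth) by a brute-force sweep over candidates."""
        n, h, w = views.shape
        center = n // 2
        ref = views[center].astype(np.float64)
        best_err = np.full((h, w), np.inf)
        best_disp = np.zeros((h, w))
        xs = np.arange(w)
        for d in candidates:
            err = np.zeros((h, w))
            for i in range(n):
                if i == center:
                    continue
                # Shift view i horizontally in proportion to its angular offset
                # and compare it with the central view (linear interpolation).
                sample_x = np.clip(xs + d * (i - center), 0, w - 1)
                x0 = np.floor(sample_x).astype(int)
                x1 = np.clip(x0 + 1, 0, w - 1)
                frac = sample_x - x0
                warped = (1 - frac) * views[i][:, x0] + frac * views[i][:, x1]
                err += (warped - ref) ** 2
            better = err < best_err
            best_err[better] = err[better]
            best_disp[better] = d
        return best_disp

Real light field depth estimators use both angular axes, regularize the result, and exploit epipolar-plane-image structure, but the underlying cue is the same view-consistency test shown here.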

Advantages

  • Provides highly realistic views with correct parallax and view-dependent effects.
  • Enables post-capture refocusing and depth of field adjustments.
  • Potential to significantly reduce or eliminate the vergence-accommodation conflict in HMDs.
  • Captures rich scene information useful for various computational photography tasks.

Challenges and Limitations

  • Data Size: 4D light fields represent significantly more data than conventional 2D images or even stereo pairs, posing challenges for capture, storage, transmission, and processing. For example, a single frame with 9 × 9 views at 1024 × 1024 resolution and 8-bit RGB is roughly 9 × 9 × 1024 × 1024 × 3 bytes ≈ 255 MB uncompressed, versus about 3 MB for one conventional image of the same resolution.
  • Capture Hardware Complexity: Building high-resolution light field cameras or camera arrays is complex and often expensive. Achieving wide field of view (FoV) and high angular resolution simultaneously is difficult.
  • Display Technology Immaturity: High-resolution, high-brightness, wide FoV light field displays suitable for consumer VR/AR are still largely in the research and development phase. Current prototypes often face trade-offs between resolution, brightness, FoV, and computational cost.
  • Computational Cost: Rendering light fields, especially in real-time for VR/AR, requires significant computational power. Efficient compression and rendering algorithms are crucial.
  • Limited Angular Resolution: Current practical systems often have limited angular resolution, which can constrain the range of viewpoint movement and the effectiveness in resolving VAC.

See Also

References

  1. Levoy, M., & Hanrahan, P. (1996). Light field rendering. Proceedings of the 23rd annual conference on Computer graphics and interactive techniques - SIGGRAPH '96, 31–42.
  2. Gortler, S. J., Grzeszczuk, R., Szeliski, R., & Cohen, M. F. (1996). The Lumigraph. Proceedings of the 23rd annual conference on Computer graphics and interactive techniques - SIGGRAPH '96, 43–54.
  3. Ng, R. (2005). Digital Light Field Photography. Ph.D. Thesis, Stanford University.
  4. Faraday, M. (1846). Thoughts on Ray Vibrations. Philosophical Magazine, S.3, Vol. 28, No. 188.
  5. Gershun, A. (1939). The Light Field. Journal of Mathematics and Physics, 18(1-4), 51–151. (English translation of 1936 Russian paper).
  6. Adelson, E. H., & Bergen, J. R. (1991). The plenoptic function and the elements of early vision. In Computational Models of Visual Processing (pp. 3-20). MIT Press.
  7. Adelson, E. H., & Wang, J. Y. A. (1992). Single lens stereo with a plenoptic camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 99-106.
  8. Wilburn, B., Joshi, N., Vaish, V., Talvala, E. V., Antunez, E., Barth, A., ... & Levoy, M. (2005). High performance imaging using large camera arrays. ACM Transactions on Graphics (TOG), 24(3), 765-776.
  9. Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., & Tumblin, J. (2007). Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing. ACM Transactions on Graphics (TOG), 26(3), Article 69.
  10. Wetzstein, G., Lanman, D., Hirsch, M., & Raskar, R. (2011). Layered 3D: Tomographic Image Synthesis for Attenuation-based Light Field and High Dynamic Range Displays. ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH 2011, 30(4), 95:1-95:12.
  11. Jones, A., McDowall, I., Yamada, H., Bolas, M., & Debevec, P. (2007). Rendering for an interactive 360° light field display. ACM Transactions on Graphics (TOG), 26(3), Article 40.
  12. Lanman, D., & Luebke, D. (2013). Near-eye light field displays. ACM SIGGRAPH 2013 Talks, 1-1.
  13. Konrad, R., Cooper, E. A., Wetzstein, G., & Banks, M. S. (2017). Accommodation and vergence responses to near-eye light field displays. Journal of Vision, 17(10), 987-987.