Diminished reality

From VR & AR Wiki

Diminished reality (DR) is the real-time removal, hiding, or de-emphasis of real objects from a user's perception of the surrounding world, so that those objects appear to be gone even though they are physically still present.[1] It is widely described as the conceptual inverse of augmented reality: where augmented reality adds virtual content to the real scene, diminished reality subtracts real content from it and fills the gap with a plausible substitute.[2][3] Because it modifies rather than purely overlays the view of reality, diminished reality is generally treated as part of mixed reality and of the broader notion of mediated reality.[4]

In virtual reality and augmented reality systems, diminished reality matters because a headset or phone that can already insert virtual objects into a live view of the room can, with the same underlying scene understanding, also take real objects away. Combined with mixed reality passthrough, this allows a user to edit their physical surroundings visually, for example to clear clutter, hide a piece of furniture before placing a virtual one, or conceal private items from a remote collaborator.[5][6]

Relationship to augmented and mediated reality

Diminished reality is most often defined by contrast with augmentation. Augmented reality superimposes synthetic information onto a perceived environment to add to it, whereas diminished reality conceals, eliminates, or sees through real objects to take something away from it.[1] A reference lexicon describes it plainly as being "like the reverse of Augmented Reality," removing or hiding unwanted objects instead of adding digital elements.[2]

The idea sits within the framework of mediated reality introduced by researcher Steve Mann. Mediated reality treats a wearable or camera-based system as a filter on perception that can not only add signal, as in augmentation, but also modify, attenuate, or block it. In this view, mediated reality is described as a proper superset of mixed reality, augmented reality, and virtual reality, because it also includes diminished reality.[4] Mann and James Fung set out the deliberate-diminishment case directly in their work on EyeTap devices for "augmented, deliberately diminished, or otherwise altered" visual perception of real-world scenes, published in the journal PRESENCE: Teleoperators and Virtual Environments.[7] An early demonstrated motivation was visual decluttering: using a wearable visual filter to detect real-world advertisements and billboards and to block or replace them in the wearer's view, for example substituting a calmer image in place of an advertisement.[8]

Mann's later taxonomy formalises this as a second axis on top of the reality-virtuality continuum. Where Paul Milgram's continuum runs along virtuality (an X axis from the real to the fully virtual), Mann's mediated reality adds a "mediality" axis (a Y axis) for the degree to which reality is modified, from slightly modified to extremely modified.[9] Removing or de-emphasising real objects is a movement along that modification axis in the subtractive direction, the opposite of adding virtual objects.[9]

How it works

A diminished reality pipeline generally has three stages: deciding which region of the view corresponds to the target object, removing that region, and generating a plausible replacement for the background that the object was hiding.[1] The replacement must be convincing enough that the viewer does not notice the edit, and in interactive virtual reality or augmented reality use it must be produced fast enough to keep up with the moving camera, which makes the background-filling step the central technical problem.[1]
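
The three stages can be sketched on a toy grayscale frame. This is only an illustration under stated assumptions, not a method from the survey: the names `detect_mask`, `remove_region`, `fill_background`, and `diminish` are hypothetical, the target is found by a simple brightness test, and the fill is a naive neighbour-averaging diffusion chosen for brevity.

```python
from typing import Callable, List, Optional

# A frame is a grid of brightness values; None marks a removed pixel.
Frame = List[List[Optional[float]]]


def detect_mask(frame: Frame, is_target: Callable[[float], bool]) -> List[List[bool]]:
    """Stage 1: decide which pixels belong to the target object."""
    return [[is_target(v) for v in row] for row in frame]


def remove_region(frame: Frame, mask: List[List[bool]]) -> Frame:
    """Stage 2: cut the target out, leaving holes (None)."""
    return [[None if m else v for v, m in zip(row, mrow)]
            for row, mrow in zip(frame, mask)]


def fill_background(holed: Frame) -> Frame:
    """Stage 3: fill holes from known 4-neighbours, working inwards.

    Assumption: a naive diffusion fill, used only to keep the sketch short.
    """
    out = [row[:] for row in holed]
    h, w = len(out), len(out[0])
    while any(v is None for row in out for v in row):
        updates = {}
        for y in range(h):
            for x in range(w):
                if out[y][x] is not None:
                    continue
                known = [out[ny][nx]
                         for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                         if 0 <= ny < h and 0 <= nx < w and out[ny][nx] is not None]
                if known:
                    updates[(y, x)] = sum(known) / len(known)
        if not updates:
            raise ValueError("no visible pixels to fill from")
        for (y, x), v in updates.items():
            out[y][x] = v
    return out


def diminish(frame: Frame, is_target: Callable[[float], bool]) -> Frame:
    """Run the three stages in order."""
    return fill_background(remove_region(frame, detect_mask(frame, is_target)))


# Usage: a bright two-pixel "object" on a uniform background.
frame = [[10.0] * 5 for _ in range(5)]
frame[2][2] = frame[2][3] = 200.0
result = diminish(frame, lambda v: v > 100)
```

On a uniform background the diffusion fill is enough to hide the object. On textured backgrounds it blurs the hole, which is one reason the filling stage receives most of the research attention.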

A 2017 survey by Shohei Mori, Sei Ikeda, and Hideo Saito, published in IPSJ Transactions on Computer Vision and Applications, classifies diminished reality methods and is a standard reference for the field. It groups techniques by how they recover the hidden background, broadly into inpainting-based approaches and observation-based approaches.[1]

Inpainting-based methods

Inpainting-based methods estimate the hidden background from the pixels that remain visible around the target, synthesising texture that plausibly continues the surroundings into the removed region.[1] A common strategy is patch-based inpainting, which searches the rest of the image for patches that match the area around the hole and copies them in. The search is approximate rather than a computation of a single optimal solution, which keeps it fast enough for video.[3] A representative example is the work of Norihiko Kawai, Tomokazu Sato, and Naokazu Yokoya, who removed real objects from video and filled the missing regions with plausible textures in real time. Rather than assuming the background is a single flat plane, their method approximates the background geometry by combining local planes, which corrects perspective distortion and improves the texture search, and it keeps the result stable over time using camera pose and geometry estimated by visual simultaneous localization and mapping (visual SLAM). The target region is found by projecting a 3D region rather than tracking the object in the 2D image.[10]
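
The patch-copying idea can be illustrated with a minimal sketch. It is not the method of Kawai et al.: it uses an exhaustive search over a tiny image instead of a fast approximate search, ignores geometry and temporal stability, and assumes every hole pixel lies at least `r` pixels from the image border. The helper names `window`, `best_source`, and `inpaint` are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[Optional[int]]]  # None marks a hole pixel


def window(img: Image, cy: int, cx: int, r: int):
    """Return (dy, dx, value) for the (2r+1) x (2r+1) patch centred on (cy, cx)."""
    return [(dy, dx, img[cy + dy][cx + dx])
            for dy in range(-r, r + 1) for dx in range(-r, r + 1)]


def best_source(img: Image, ty: int, tx: int, r: int = 1) -> Tuple[int, int]:
    """Find the fully known patch that best matches the known pixels around (ty, tx)."""
    h, w = len(img), len(img[0])
    target = [(dy, dx, v) for dy, dx, v in window(img, ty, tx, r) if v is not None]
    best, best_cost = None, None
    for cy in range(r, h - r):
        for cx in range(r, w - r):
            src = {(dy, dx): v for dy, dx, v in window(img, cy, cx, r)}
            if any(v is None for v in src.values()):
                continue  # candidate overlaps the hole
            # Sum of squared differences over the pixels the target patch knows.
            cost = sum((v - src[(dy, dx)]) ** 2 for dy, dx, v in target)
            if best_cost is None or cost < best_cost:
                best, best_cost = (cy, cx), cost
    if best is None:
        raise ValueError("no fully visible source patch available")
    return best


def inpaint(img: Image, r: int = 1) -> Image:
    """Fill each hole pixel with the centre of its best-matching source patch."""
    out = [row[:] for row in img]
    h, w = len(out), len(out[0])
    for y in range(r, h - r):
        for x in range(r, w - r):
            if out[y][x] is None:
                sy, sx = best_source(out, y, x, r)
                out[y][x] = out[sy][sx]
    return out


# Usage: vertical stripes (value = column parity) with one pixel removed.
stripes: Image = [[x % 2 for x in range(6)] for _ in range(6)]
stripes[3][3] = None
filled = inpaint(stripes)
```

In this striped example the copied patch continues the stripe through the hole, whereas averaging the four neighbours would produce a value that belongs to neither stripe.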

Observation-based methods

Observation-based methods do not guess the hidden background; they obtain it from actual observations, either from additional cameras that view the scene from other angles, or from a pre-captured model of the space.[1] Because these methods use pixels that were genuinely seen from another viewpoint, they can restore the occluded background accurately, but reconstructing 3D structure from multiple views in real time is computationally expensive.[1]
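
A heavily simplified sketch of the observation-based idea follows. It assumes the hard part, registering the second view to the main view, is already solved and reduces to a constant pixel offset (as for a flat, fronto-parallel background). Real systems need full multi-view geometry. The function name `fill_from_second_view` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[Optional[int]]]
Mask = List[List[bool]]


def fill_from_second_view(main: Image, main_mask: Mask,
                          other: Image, other_mask: Mask,
                          offset: Tuple[int, int]) -> Image:
    """Replace masked pixels of `main` with the matching pixels of `other`.

    `offset` = (dy, dx) maps a main-view pixel to the second view.
    Pixels the second view cannot see either are left as None.
    """
    dy, dx = offset
    h, w = len(other), len(other[0])
    out = [row[:] for row in main]
    for y, row in enumerate(main_mask):
        for x, hidden in enumerate(row):
            if not hidden:
                continue
            sy, sx = y + dy, x + dx
            visible = 0 <= sy < h and 0 <= sx < w and not other_mask[sy][sx]
            out[y][x] = other[sy][sx] if visible else None
    return out


# Usage: a one-row scene whose true background is 0, 10, 20, ...
# The object (999) hides columns 2-3 in the main view but, because of
# parallax, only column 0 in the second view.
main = [[0, 10, 999, 999, 40, 50]]
main_mask = [[False, False, True, True, False, False]]
other = [[999, 20, 30, 40, 50, 60]]
other_mask = [[True, False, False, False, False, False]]
restored = fill_from_second_view(main, main_mask, other, other_mask, offset=(0, -1))
```

Unlike an inpainted fill, the restored pixels here are measurements of the real background, which is the accuracy advantage described above. The price is the alignment step that this sketch assumes away.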

A recent system aimed at virtual reality and augmented reality capture is InpaintFusion, described in 2025 by researchers at Graz University of Technology, including Dieter Schmalstieg, Shohei Mori, and Denis Kalkofen, with Hideo Saito's team at Keio University. It performs real-time diminished reality on live 3D recordings: the user marks an object on a 2D screen, the system projects that selection into the 3D scene, and it then creates a plausible background by collecting and merging matching pixels from the object's surroundings. As its name suggests, this fill is inpainting applied to a 3D capture rather than a background recovered from additional viewpoints. The colour and depth of the scanned scene are optimised together so the edit stays consistent as the camera moves.[3] The team likened the result to "a kind of Photoshop for 3D scenes."[3]
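
Projecting a 2D selection into a 3D scene, and finding the same point again after the camera moves, can be sketched with a standard pinhole camera model. This illustrates the geometry only and is not InpaintFusion's implementation. The `Pinhole` class, the function names, and the intrinsics used below are assumptions, and the camera is assumed to translate without rotating.

```python
from dataclasses import dataclass
from typing import Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Pinhole:
    """Pinhole intrinsics: focal lengths and principal point, in pixels."""
    fx: float
    fy: float
    cx: float
    cy: float


def unproject(cam: Pinhole, u: float, v: float, depth: float) -> Point3:
    """Lift pixel (u, v) with measured depth to a 3D point in camera coordinates."""
    return ((u - cam.cx) * depth / cam.fx,
            (v - cam.cy) * depth / cam.fy,
            depth)


def project(cam: Pinhole, point: Point3,
            cam_position: Point3 = (0.0, 0.0, 0.0)) -> Tuple[float, float]:
    """Project a 3D point into a camera located at `cam_position`.

    Assumption: the camera only translates (no rotation), to keep the sketch short.
    """
    x = point[0] - cam_position[0]
    y = point[1] - cam_position[1]
    z = point[2] - cam_position[2]
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (cam.fx * x / z + cam.cx, cam.fy * y / z + cam.cy)


# Usage: a pixel the user marked, lifted to 3D with its depth reading,
# then re-located after the camera slides one unit to the right.
cam = Pinhole(fx=100.0, fy=100.0, cx=50.0, cy=50.0)
marked_3d = unproject(cam, u=150.0, v=50.0, depth=2.0)
same_view = project(cam, marked_3d)
moved_view = project(cam, marked_3d, cam_position=(1.0, 0.0, 0.0))
```

Anchoring the selection in 3D means the region to remove follows from the camera pose in each frame, so the object does not have to be tracked again in every 2D image.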

Comparison of approaches

| Approach | How the hidden background is recovered | Strengths | Limitations |
| --- | --- | --- | --- |
| Inpainting-based | Synthesised from the visible pixels around the target, often by copying matching image patches[1] | Needs only the device's own view, no extra cameras or prior scan[1] | The filled region is plausible rather than ground truth[1] |
| Observation-based | Taken from real observations: other camera viewpoints or a pre-captured model of the space[1] | Can restore the true background using pixels actually seen from another view[1] | Multi-view 3D reconstruction in real time is computationally costly[1] |

Use cases in VR and AR

Scene editing and interior design

A prominent application is letting a user reshape their real room before adding virtual content. In interior design and real-estate visualisation, diminished reality removes existing furniture so a virtual replacement can be placed where the old item stood, giving a cleaner preview of a proposed layout.[5] The VTT Technical Research Centre of Finland demonstrated such a system, in which the diminished reality step takes the indoor 3D structure into account and adapts to the lighting of the environment so that virtual furniture casts realistic shadows.[5] This pairs naturally with mixed reality headsets and passthrough, where add, rearrange, and remove operations on furniture can be combined in a single view of the room.

Decluttering and de-emphasis

Beyond replacing single objects, diminished reality can de-emphasise visual noise. Reference descriptions of the technology list decluttering as a basic example, such as viewing a messy desk through an app that makes the mess disappear,[2] and Mann's early mediated reality work specifically targeted blocking out distracting real-world billboards and advertisements.[8] In an augmented reality context this can help keep a user's attention on relevant content by suppressing irrelevant parts of the real scene.

Privacy in mixed reality collaboration

When two or more people share a live mixed reality space, one person's camera can expose private parts of a real room, such as a bedroom or office, to remote participants. Diminished reality offers object-level privacy control, removing chosen personal items from the shared view and filling the gap with background content so the removal is not obvious. A 2025 research prototype combined automatic object detection with a real-time inpainting model: detected objects are presented to the headset wearer as selectable boxes, and chosen items are masked and filled with synthesised background.[6]
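
The selection step of such a workflow can be sketched as follows. This is a hypothetical illustration, not the prototype's code: the `Detection` class, the `privacy_mask` function, and the hard-coded boxes stand in for a real object detector, and the resulting mask would then be handed to whatever inpainting model fills the gap.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object: a label and a box (x0, y0, x1, y1), x1/y1 exclusive."""
    label: str
    box: Tuple[int, int, int, int]


def privacy_mask(width: int, height: int,
                 detections: List[Detection],
                 selected: Set[int]) -> List[List[bool]]:
    """Build a removal mask covering only the detections the wearer selected.

    `selected` holds indices into `detections`; unselected objects stay visible.
    """
    mask = [[False] * width for _ in range(height)]
    for index, det in enumerate(detections):
        if index not in selected:
            continue
        x0, y0, x1, y1 = det.box
        # Clamp the box to the frame so partially visible objects are handled.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = True
    return mask


# Usage: two detected objects, of which the wearer hides only the first.
detections = [Detection("laptop", (1, 1, 3, 3)), Detection("mug", (4, 0, 6, 2))]
mask = privacy_mask(width=6, height=4, detections=detections, selected={0})
hidden_pixels = sum(cell for row in mask for cell in row)
```

Keeping the choice with the wearer, rather than removing every detected object, matches the object-level control described above: only the chosen items are masked before the shared view is filled and transmitted.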

Status as a research area

Diminished reality is an active research field rather than a single finished product. The core difficulty is generating a convincing hidden background quickly enough for an interactive virtual reality or augmented reality view, and methods continue to trade off accuracy, speed, and how much prior knowledge of the scene they require.[1][10] Much published work demonstrates the technique on specific systems and scenes, so a general consumer feature that simply "deletes" arbitrary real objects in a headset is not yet routinely shipped; published demonstrations remain bounded by lighting, scene complexity, and available computing power.[3][1]

References

  1. "A survey of diminished reality: Techniques for visually concealing, eliminating, and seeing through real objects". Springer. 2017. https://link.springer.com/article/10.1186/s41074-017-0028-1.
  2. "Diminished Reality". https://about.zaubar.com/en/xr-ai-lexicon/diminished-reality.
  3. "Diminished reality: Making objects disappear in real time in live recordings". 2025-04-10. https://techxplore.com/news/2025-04-diminished-reality-real.html.
  4. "Computer-mediated reality". https://en.wikipedia.org/wiki/Computer-mediated_reality.
  5. "A complete interior design solution with diminished reality". VTT Technical Research Centre of Finland. https://cris.vtt.fi/en/publications/a-complete-interior-design-solution-with-diminished-reality/.
  6. "A Real-Time Diminished Reality Approach to Privacy in MR Collaboration". https://arxiv.org/html/2509.10466v1/.
  7. "EyeTap Devices for Augmented, Deliberately Diminished, or Otherwise Altered Visual Perception of Rigid Planar Patches of Real-World Scenes". MIT Press. https://direct.mit.edu/pvar/article/11/2/158/18401/EyeTap-Devices-for-Augmented-Deliberately.
  8. "Mediated Reality". University of Toronto Humanistic Intelligence / Mediated Reality project. https://www.linuxjournal.com/article/3265.
  9. "All Reality: Virtual, Augmented, Mixed (X), Mediated (X,Y), and Multimediated Reality". 2018. https://ar5iv.labs.arxiv.org/html/1804.08386.
  10. "Diminished Reality Based on Image Inpainting Considering Background Geometry". IEEE. 2016-03. https://pubmed.ncbi.nlm.nih.gov/26829239/.