Predictive tracking
{{stub}}
==Introduction==
In [[augmented reality]] (AR) and [[virtual reality]] (VR), '''predictive tracking''' is the practice of estimating a user’s future [[pose]] (position and orientation) a small time offset ahead—typically the system’s '''[[motion-to-photon latency]]'''—so that the scene can be rendered for where the user will be rather than where they were when the last sensor sample arrived.<ref name="LaValle2013" /> First explored systematically by Ronald Azuma in the mid-1990s, predictive tracking was shown to reduce dynamic registration error in optical see-through AR by a factor of five to ten.<ref name="Azuma1995" /> Today every consumer headset—from [[Meta Quest]] to [[Apple Vision Pro]] and [[HoloLens]]—relies on some form of predictive tracking to deliver a stable, low-latency experience.<ref name="Lang2024" />
==Why Prediction Is Necessary==
Even with high-speed inertial measurement units (IMUs) and modern GPUs, the end-to-end pipeline of '''sensing → transmission → fusion → simulation → rendering → display''' incurs delays that add up to tens of milliseconds.<ref name="LaValle2014" /> If a frame were rendered using only the most recent measured head pose, the virtual scene would appear to “lag” behind real-world motion, causing discomfort and breaking immersion. Predictive tracking mitigates this by extrapolating the head (or hand) pose to the moment the next frame’s photons actually leave the display.
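As a concrete illustration, a minimal constant-velocity extrapolation of a sampled pose to the expected photon time might look like the following sketch. The function and variable names are illustrative, not taken from any headset SDK, and the gyro is assumed to report body-frame angular velocity:

<syntaxhighlight lang="python">
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def predict_pose(position, velocity, orientation, angular_velocity, horizon):
    """Extrapolate a pose 'horizon' seconds ahead under a
    constant-velocity / constant-rate assumption."""
    # Linear part: straight-line extrapolation of head position.
    predicted_position = position + velocity * horizon
    # Angular part: rotate by the angle swept during the horizon.
    angle = np.linalg.norm(angular_velocity) * horizon
    if angle < 1e-9:
        return predicted_position, orientation
    axis = angular_velocity / np.linalg.norm(angular_velocity)
    delta = np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))
    return predicted_position, quat_mul(orientation, delta)

# Example: head turning at ~100 deg/s, predicted 20 ms ahead.
p, q = predict_pose(
    position=np.zeros(3),
    velocity=np.array([0.1, 0.0, 0.0]),           # m/s
    orientation=np.array([1.0, 0.0, 0.0, 0.0]),   # identity quaternion
    angular_velocity=np.array([0.0, 1.75, 0.0]),  # rad/s (~100 deg/s)
    horizon=0.020,                                # seconds
)
</syntaxhighlight>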
==Sources of Latency==
Typical contributors include:<ref name="Boger2017" />
* '''[[Sensor fusion]] delay''' – time to combine IMU and camera data
* '''USB / wireless transmission'''
* '''Game-logic and physics simulation'''
* '''GPU rendering'''
* '''Display scan-out and pixel switching'''
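Summing these stages gives the horizon the predictor must cover. A toy budget with purely illustrative numbers (real figures vary by headset and have to be measured):

<syntaxhighlight lang="python">
# Illustrative per-stage delays in milliseconds; not measurements
# of any particular headset.
latency_budget_ms = {
    "sensor_fusion": 2.0,
    "transmission": 1.0,
    "simulation": 5.0,
    "gpu_rendering": 7.0,
    "display_scanout": 6.0,
}

# The prediction horizon should roughly match the end-to-end total.
horizon_ms = sum(latency_budget_ms.values())
print(f"predict ~{horizon_ms:.0f} ms ahead")  # -> predict ~21 ms ahead
</syntaxhighlight>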
==How Far Ahead to Predict==
Head-mounted systems usually predict on the order of 10–30 ms—roughly their measured pipeline delay. Prediction error grows quadratically with horizon length, so overshooting degrades accuracy.<ref name="LaValle2013" /> Modern headsets sample IMUs at up to 1 kHz, allowing reliable extrapolation over these short intervals without large drift.
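The quadratic growth follows directly from the motion model: under a constant-velocity predictor, any unmodeled angular acceleration <math>\alpha</math> accumulates as an orientation error of roughly

:<math>e(\Delta t) \approx \tfrac{1}{2}\,\alpha\,\Delta t^{2},</math>

so doubling the horizon quadruples the worst-case error. For an illustrative <math>\alpha = 50\ \mathrm{rad/s^2}</math>, a 20 ms horizon yields about 0.01 rad (≈0.6°) of error, while a 40 ms horizon yields 0.04 rad (≈2.3°).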
==Common Prediction Algorithms==
* '''[[Dead reckoning]]''' – constant-velocity extrapolation; low compute cost but assumes no acceleration.
* '''[[Kalman filter|Kalman filtering]]''' – statistically optimal state estimation that fuses noisy sensor data with a motion model. Widely used in inside-out tracking.
* '''[[Alpha–beta filter|Alpha-Beta-Gamma (ABG) filter]]''' – a fixed-gain filter whose gains α, β and γ weight the corrections to the position, velocity and acceleration estimates respectively (a one-axis sketch follows this list).
* '''Constant-acceleration models''' – often a special case of ABG; used in Oculus Rift DK-era prototypes.<ref name="LaValle2014" />
* '''Machine-learning predictors''' – recurrent neural networks (e.g., LSTM) have recently been shown to outperform classical filters for aggressive motion, though they are not yet common in shipping products.<ref name="Paul2021" />
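A minimal one-dimensional sketch of the ABG filter described above; one instance is run per tracked axis. The gain values are illustrative, since production trackers tune them against measured sensor noise:

<syntaxhighlight lang="python">
class AlphaBetaGammaFilter:
    """Fixed-gain filter tracking position, velocity and acceleration
    along one axis."""

    def __init__(self, alpha=0.5, beta=0.4, gamma=0.1):
        # Illustrative gains; real systems tune them to sensor noise.
        self.alpha, self.beta, self.gamma = alpha, beta, gamma
        self.x = self.v = self.a = 0.0

    def update(self, z, dt):
        """Fold in one noisy position measurement z taken dt seconds
        after the previous one."""
        # Predict the state forward to the measurement time.
        x_pred = self.x + self.v * dt + 0.5 * self.a * dt * dt
        v_pred = self.v + self.a * dt
        # Correct each state by a fixed fraction of the residual.
        r = z - x_pred
        self.x = x_pred + self.alpha * r
        self.v = v_pred + (self.beta / dt) * r
        self.a = self.a + (2.0 * self.gamma / (dt * dt)) * r

    def predict(self, horizon):
        """Extrapolate the filtered state 'horizon' seconds ahead."""
        return self.x + self.v * horizon + 0.5 * self.a * horizon ** 2

# Example: track a head swinging sinusoidally, sampled at 1 kHz,
# then look 20 ms ahead.
import math
f = AlphaBetaGammaFilter()
for k in range(1000):
    f.update(math.sin(k / 1000.0), dt=1e-3)
look_ahead = f.predict(0.020)
</syntaxhighlight>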
==Implementation in Current Devices==
* '''Meta Quest (3/Pro)''' combines high-rate IMUs with inside-out camera SLAM and uses asynchronous [[Time warp (virtual reality)|time-warp]] and SpaceWarp to correct frames just before display.<ref name="Dasch2019" />
* '''Apple Vision Pro''' fuses multiple high-speed cameras, depth sensors and IMUs on Apple-designed silicon; measured optical latency of ≈11 ms implies aggressive short-horizon prediction for head and eye pose.<ref name="Lang2024" />
* '''Microsoft HoloLens 2''' uses IMU + depth-camera fusion and hardware-assisted '''reprojection''' to keep holograms locked to real space; Microsoft stresses maintaining a ≤16.6 ms frame time and using prediction to cover any additional delay.<ref name="Microsoft2021" />
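All three devices share the same late-stage idea: just before scan-out, the compositor re-samples the already-rendered frame with the rotation between the pose the frame was rendered with and the newest predicted pose. A hedged sketch of that delta computation, using SciPy's rotation utilities (names and conventions are illustrative, not any vendor's API):

<syntaxhighlight lang="python">
# Rotation-only ("3-DoF") reprojection core: find the corrective
# rotation between the render-time and display-time head poses.
import numpy as np
from scipy.spatial.transform import Rotation as R

def timewarp_delta(q_rendered_xyzw, q_latest_xyzw):
    """Return the 3x3 rotation a compositor would fold into its
    texture-lookup transform. Sign conventions depend on whether the
    quaternions are head-to-world or world-to-head."""
    rendered = R.from_quat(q_rendered_xyzw)  # pose used to render
    latest = R.from_quat(q_latest_xyzw)      # newest predicted pose
    return (latest * rendered.inv()).as_matrix()

# If no motion occurred between render and scan-out, the correction
# is the identity matrix.
assert np.allclose(timewarp_delta([0, 0, 0, 1], [0, 0, 0, 1]), np.eye(3))
</syntaxhighlight>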
==Historical Perspective==
Azuma’s 1995 dissertation identified dynamic (motion-induced) error as the dominant source of mis-registration in optical see-through AR and demonstrated that a constant-velocity inertial predictor could dramatically improve stability.<ref name="Azuma1995" /> Subsequent VR research throughout the 2000s and early 2010s (e.g., LaValle et al. for the Oculus Rift) refined these concepts with higher sensor rates and deeper error analysis, leading to today’s robust inside-out predictive pipelines.<ref name="LaValle2014" />
==See Also==
* [[Time warp (virtual reality)]]
* [[Sensor fusion]]
* [[Motion-to-photon latency]]
==References==
<references>
<ref name="Azuma1995">Azuma, Ronald T. ''Predictive Tracking for Augmented Reality''. Ph.D. dissertation, University of North Carolina at Chapel Hill, 1995.</ref>
<ref name="LaValle2013">LaValle, Steven M. “The Latent Power of Prediction.” Oculus Developer Blog, July 12, 2013.</ref>
<ref name="LaValle2014">LaValle, Steven M., Yershova, A., Katsev, M., & Antonov, M. “Head Tracking for the Oculus Rift.” In ''IEEE Virtual Reality'', 2014.</ref>
<ref name="Boger2017">Boger, Yuval. “Understanding Predictive Tracking and Why It’s Important for AR/VR Headsets.” ''Road to VR'', April 24, 2017.</ref>
<ref name="Dasch2019">Dasch, Tom. “Understanding Gameplay Latency for Oculus Quest, Oculus Go and Gear VR.” Oculus Developer Blog, April 11, 2019.</ref>
<ref name="Microsoft2021">Microsoft. “Hologram Stability.” ''Mixed Reality Documentation'' (HoloLens 2), 2021.</ref>
<ref name="Lang2024">Lang, Ben. “Vision Pro and Quest 3 Hand-Tracking Latency Compared.” ''Road to VR'', March 28, 2024.</ref>
<ref name="Paul2021">Paul, S. et al. “A Study on Sensor System Latency in VR Motion Sickness.” ''Journal of Sensor & Actuator Networks'' 10, no. 3 (2021): 53.</ref>
</references>
[[Category:Terms]]
[[Category:Technical Terms]]