{{stub}}
==Introduction==
In [[augmented reality]] (AR) and [[virtual reality]] (VR), '''predictive tracking''' is the practice of estimating a user’s future [[pose]] (position and orientation) a short time ahead, typically by the system’s '''motion-to-photon latency''', so that the scene can be rendered for where the user will be rather than where they were when the last sensor sample arrived.<ref name="LaValle2013" /> First explored systematically by Ronald Azuma in the mid-1990s, predictive tracking was shown to reduce dynamic registration error in optical see-through AR by a factor of five to ten.<ref name="Azuma1995" /> Today every consumer headset, from [[Meta Quest]] to [[Apple Vision Pro]] and [[HoloLens]], relies on some form of predictive tracking to deliver a stable, low-latency experience.<ref name="Lang2024" />

Prediction matters in both media. In VR it keeps the rendered view in step with head motion: look to the left and the displayed view should move exactly as far as you did, without a perceptible delay between action and image. In AR it keeps graphical overlays registered to the real world, so that an annotation anchored to a physical object stays locked in place as the user moves around it. No predictor is perfectly accurate; each relies on a motion model built from measurable factors such as head angular velocity and viewing direction, and the better that model matches how people actually move, the better the prediction.
 
==Why Prediction Is Necessary==
Even with high-speed inertial measurement units (IMUs) and modern GPUs, the end-to-end pipeline of '''sensing → transmission → fusion → simulation → rendering → display''' incurs delays that add up to tens of milliseconds.<ref name="LaValle2014" /> If a frame were rendered using only the most recent measured head pose, the virtual scene would appear to “lag” behind real-world motion, causing discomfort and breaking immersion. Predictive tracking mitigates this by extrapolating the head (or hand) pose to the moment the next frame’s photons actually leave the display. This end-to-end delay, the motion-to-photon latency, can be measured directly with a latency tester, a device that times how long an actual movement takes to appear on the headset’s screen; the longer that gap, the higher the latency.
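The core operation is short enough to sketch. Below is a minimal constant-angular-velocity extrapolation in Python (an illustrative sketch, not any headset’s actual code; the function name and the (w, x, y, z) quaternion convention are assumptions made for this example):

<syntaxhighlight lang="python">
import numpy as np

def predict_orientation(q, omega, horizon_s):
    """Extrapolate the orientation quaternion q = (w, x, y, z) by
    integrating the body-frame angular velocity omega (rad/s) over
    the prediction horizon (constant-angular-velocity model)."""
    speed = np.linalg.norm(omega)
    if speed < 1e-9:
        return q  # effectively no rotation over the horizon
    half_angle = 0.5 * speed * horizon_s
    axis = omega / speed
    # Incremental rotation expressed as a quaternion (w, x, y, z).
    dq = np.concatenate(([np.cos(half_angle)], np.sin(half_angle) * axis))
    # Hamilton product q * dq applies the predicted body-frame rotation.
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = dq
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

# Example: gyro reads 2 rad/s about the vertical axis; predict 20 ms ahead.
identity = np.array([1.0, 0.0, 0.0, 0.0])
predicted = predict_orientation(identity, np.array([0.0, 2.0, 0.0]), 0.020)
</syntaxhighlight>

In practice the same extrapolation is applied to position as well, and the horizon passed in is the measured motion-to-photon latency of the pipeline described below.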


==Sources of Latency==
Typical contributors include:<ref name="Boger2017" />
* '''Sensing delay''' – camera exposure and sensor read-out take time before a tracked object’s motion is even registered
* '''[[Sensor fusion]] delay''' – time to combine IMU and camera data
* '''Data smoothing''' – low-level filtering applied to suppress sensor noise adds a small lag of its own
* '''USB / wireless transmission'''
* '''Game-logic and physics simulation'''
* '''GPU rendering''' – complex scenes push frame times up, and dropped frames add further delay
* '''Display scan-out and pixel switching'''
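Since the prediction horizon is normally set to cover the sum of these delays, a back-of-the-envelope budget is often the starting point. A sketch with purely illustrative numbers (not measurements of any real device):

<syntaxhighlight lang="python">
# Hypothetical motion-to-photon budget in milliseconds; the values
# below are illustrative, not measurements of any shipping headset.
latency_budget_ms = {
    "sensing (camera exposure, IMU read-out)": 2.0,
    "transmission (USB / wireless)": 1.0,
    "sensor fusion + smoothing": 2.0,
    "simulation (game logic, physics)": 3.0,
    "GPU rendering": 8.0,
    "display scan-out": 5.0,
}

# Predict the pose this far into the future so the frame matches where
# the head will be when its photons actually leave the display.
prediction_horizon_ms = sum(latency_budget_ms.values())
print(f"predict ~{prediction_horizon_ms:.0f} ms ahead")  # ~21 ms
</syntaxhighlight>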


==How Far Ahead to Predict==
Head-mounted systems usually predict on the order of 10–30 ms, roughly their measured pipeline delay. Prediction error grows quadratically with horizon length, so overshooting degrades accuracy.<ref name="LaValle2013" /> Modern headsets sample IMUs at up to 1 kHz, allowing reliable extrapolation over these short intervals without large drift.

A system often needs more than one prediction horizon at a time. Different tracked objects sit at different points in the pipeline: a system that originally tracked only the head but is extended to follow the hands as well needs a separate predictor for each, tuned to its own latency, to keep the whole experience looking natural. Similarly, on a single-screen device (such as a phone) where the two eyes’ images are presented slightly out of step, the pose for each eye should be predicted separately, or one image delayed by the appropriate fraction of a frame, so the stereo pair stays in sync.
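The quadratic growth follows from a Taylor expansion: a constant-velocity predictor ignores acceleration, so if the true motion carries an unmodelled acceleration <math>a</math>, the error after predicting a horizon of <math>t</math> seconds is

:<math>e(t) = \tfrac{1}{2}\,a\,t^{2},</math>

so doubling the horizon quadruples the worst-case error, while halving the pipeline delay cuts it by a factor of four.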


==Common Prediction Algorithms==
* '''[[Dead reckoning]]''' – constant-velocity extrapolation; low compute cost, but it assumes no acceleration, and real head and hand motion rarely keeps a constant velocity for long.
* '''[[Kalman filter|Kalman filtering]]''' – statistically optimal state estimation that fuses noisy sensor data with a mathematical model of the motion. Widely used in inside-out tracking.
* '''[[Alpha–beta filter|Alpha-Beta-Gamma (ABG) filter]]''' – a fixed-gain variant that continuously estimates position (α), velocity (β) and acceleration (γ); its gains trade noise reduction against responsiveness.
* '''Constant-acceleration models''' – often a special case of ABG; used in Oculus Rift DK-era prototypes.<ref name="LaValle2014" />
* '''Machine-learning predictors''' – recurrent neural networks (e.g., LSTM) have recently been shown to outperform classical filters for aggressive motion, though they are not yet common in shipping products.<ref name="Paul2021" />
Compared with other latency mitigations, the classical predictors above are simple to implement, which is a large part of why they are so widely used.
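To make the first and third entries concrete, here is a minimal one-dimensional sketch in Python (scalar state and hypothetical gain values chosen for illustration; a real tracker runs this per axis, or on quaternions for orientation):

<syntaxhighlight lang="python">
def dead_reckoning(position, velocity, horizon_s):
    """Constant-velocity extrapolation of a scalar coordinate."""
    return position + velocity * horizon_s

class AlphaBetaFilter:
    """Fixed-gain position/velocity estimator (an ABG filter without the
    acceleration/gamma term). Higher gains follow the measurements more
    responsively but pass through more sensor noise."""

    def __init__(self, alpha=0.85, beta=0.005, dt=0.001):  # dt: 1 kHz updates
        self.x = 0.0  # position estimate
        self.v = 0.0  # velocity estimate
        self.alpha, self.beta, self.dt = alpha, beta, dt

    def update(self, measurement):
        predicted = self.x + self.v * self.dt        # step the model forward
        residual = measurement - predicted           # innovation
        self.x = predicted + self.alpha * residual   # correct position
        self.v += (self.beta / self.dt) * residual   # correct velocity
        return self.x

    def predict(self, horizon_s):
        """Dead-reckon from the filtered state over the pipeline delay."""
        return self.x + self.v * horizon_s
</syntaxhighlight>

Feeding each new sensor sample to <code>update()</code> and calling <code>predict()</code> with the measured pipeline latency yields the pose to render with.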


==Implementation in Current Devices==
* '''Meta Quest (3/Pro)''' combines high-rate IMUs with inside-out camera SLAM and uses asynchronous [[Time warp (virtual reality)|time-warp]] and SpaceWarp to correct frames just before display.<ref name="Dasch2019" /> 
* '''Apple Vision Pro''' fuses multiple high-speed cameras, depth sensors and IMUs on Apple-designed silicon; measured optical latency of ≈11 ms implies aggressive short-horizon prediction for head and eye pose.<ref name="Lang2024" /> 
* '''Microsoft HoloLens 2''' uses IMU + depth-camera fusion and hardware-assisted '''reprojection''' to keep holograms locked to real space; Microsoft stresses maintaining ≤16.6 ms frame time and using prediction to cover any additional delay.<ref name="Microsoft2021" /> 
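These reprojection passes complement prediction rather than replace it: the frame is rendered with a predicted pose, and just before scan-out the image is re-rotated by whatever the predictor got wrong. The corrective rotation can be sketched as follows (an illustrative sketch assuming unit quaternions in (w, x, y, z) order, not any vendor’s implementation):

<syntaxhighlight lang="python">
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two (w, x, y, z) quaternions."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def timewarp_delta(q_rendered, q_latest):
    """Corrective rotation a time-warp pass applies just before scan-out:
    the difference between the predicted pose the frame was rendered with
    and the freshest head pose sampled at display time."""
    w, x, y, z = q_rendered
    conjugate = np.array([w, -x, -y, -z])  # inverse of a unit quaternion
    return quat_mul(q_latest, conjugate)
</syntaxhighlight>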


==Historical Perspective==
Azuma’s 1995 dissertation identified dynamic (motion-induced) error as the dominant source of mis-registration in optical see-through AR and demonstrated that a constant-velocity inertial predictor could dramatically improve stability.<ref name="Azuma1995" />  Subsequent VR research throughout the 2000s and early 2010s (e.g., LaValle et al. for the Oculus Rift) refined these concepts with higher sensor rates and deeper error analysis, leading to today’s robust inside-out predictive pipelines.<ref name="LaValle2014" />


==See Also==
* [[Time warp (virtual reality)]]
* [[Sensor fusion]]
* [[Motion-to-photon latency]]


==References==
<references>
<ref name="Azuma1995">Azuma, Ronald T. ''Predictive Tracking for Augmented Reality''. Ph.D. dissertation, University of North Carolina at Chapel Hill, 1995.</ref>
<ref name="LaValle2013">LaValle, Steven M. “The Latent Power of Prediction.” Oculus Developer Blog, July 12 2013.</ref>
<ref name="LaValle2014">LaValle, Steven M., Yershova, A., Katsev, M., &amp; Antonov, M. “Head Tracking for the Oculus Rift.” In: ''IEEE Virtual Reality'', 2014.</ref>
<ref name="Boger2017">Boger, Yuval. “Understanding Predictive Tracking and Why It’s Important for AR/VR Headsets.” ''Road to VR'', April 24 2017.</ref>
<ref name="Dasch2019">Dasch, Tom. “Understanding Gameplay Latency for Oculus Quest, Oculus Go and Gear VR.” Oculus Developer Blog, April 11 2019.</ref>
<ref name="Microsoft2021">Microsoft. “Hologram Stability.” ''Mixed Reality Documentation'' (HoloLens 2), 2021.</ref>
<ref name="Lang2024">Lang, Ben. “Vision Pro and Quest 3 Hand-Tracking Latency Compared.” ''Road to VR'', March 28 2024.</ref>
<ref name="Paul2021">Paul, S. et al. “A Study on Sensor System Latency in VR Motion Sickness.” ''Journal of Sensor &amp; Actuator Networks'' 10, no. 3 (2021): 53.</ref>
</references>


[[Category:Terms]]
[[Category:Technical Terms]]