{{Technical}}
{{stub}}
==Introduction==
[[Predictive tracking]] is a fundamental technique used in both [[augmented reality]] (AR) and [[virtual reality]] (VR) systems that anticipates where a user's body parts or viewing direction will be in the near future. This computational method works by analyzing current motion patterns, velocity, and acceleration to estimate future positions before they occur<ref name="LaValle2016"></ref>. For example, when a VR game needs to display your virtual hand's position, it doesn't simply render where your hand currently is—it predicts where your hand will be several milliseconds in the future.

The primary purpose of predictive tracking is to combat [[latency]] issues inherent in AR and VR systems. Without predictive algorithms, users would experience a noticeable delay between their physical movements and the corresponding visual feedback on their displays. This delay creates a disconnection that not only diminishes the sense of [[immersion]] but can also contribute to [[motion sickness]] and general discomfort<ref name="Abrash2014"></ref>. Through predictive tracking, the system estimates your future orientation and position based on your current input data, significantly reducing perceived [[motion-to-photon latency]] and creating a more natural and responsive experience.

While much attention has traditionally focused on VR applications, predictive tracking is equally crucial for AR systems. In AR environments, graphical overlays must remain precisely aligned with real-world objects even as users move through space. These virtual elements must maintain their relative positions accurately, giving the illusion that they exist within the physical environment. Predictive tracking allows the [[graphics processing unit]] (GPU) to anticipate user movement and maintain proper alignment of virtual objects with physical ones, preserving the illusion of augmented space<ref name="Azuma1997"></ref>.


==History and Development==
The concept of predictive tracking has roots in early [[computer vision]] and [[human-computer interaction]] research dating back to the 1990s. Ronald Azuma's 1995 dissertation first identified dynamic (motion-induced) error as the dominant source of mis-registration in optical see-through AR and demonstrated that a constant-velocity inertial predictor could dramatically improve stability, reducing dynamic registration error by a factor of five to ten<ref name="Azuma1995"></ref>. Subsequent research throughout the 2000s and early 2010s (e.g., LaValle et al. for the Oculus Rift) refined these concepts with higher sensor rates and deeper error analysis<ref name="LaValle2014"></ref>. The technique's critical importance for immersive technologies became fully apparent with the resurgence of consumer VR in the early 2010s, when early prototypes suffered from significant motion-to-photon latency, making predictive algorithms essential for creating viable consumer products<ref name="Oculus2013"></ref>.

[[John Carmack]], while working as CTO at Oculus, popularized the implementation of predictive tracking algorithms in consumer VR and emphasized their importance in reducing perceived latency. His work on "timewarp," a rendering technique that incorporates prediction to update images just before display, became fundamental to modern VR systems<ref name="Carmack2013"></ref>.

As VR hardware evolved from external camera tracking to [[inside-out tracking]] systems, predictive algorithms grew more sophisticated. The introduction of high-precision [[inertial measurement units]] (IMUs) with multiple accelerometers and gyroscopes provided better data for prediction models. By 2016, major VR platforms had incorporated advanced predictive tracking as a standard feature, with continuous improvements focusing on edge cases such as rapid acceleration and sudden direction changes, leading to today's robust inside-out predictive pipelines<ref name="Yao2014"></ref>.


==The Problem: System Latency==
Understanding the sources of latency in AR and VR systems is crucial to implementing effective predictive tracking solutions. A specialized device known as a [[latency tester]] measures "motion-to-photon" latency within a headset—the time delay between physical movement and the corresponding visual update on the display. The longer this delay, the more uncomfortable and less immersive the experience becomes.

Several distinct factors contribute to this end-to-end latency:

*'''[[Sensor|Sensor]] Sampling Delay''' - [[Inertial Measurement Unit|IMUs]] (measuring [[Acceleration|acceleration]] and [[Angular Velocity|angular velocity]]) and [[Camera|cameras]] (used in optical tracking systems such as [[Simultaneous Localization and Mapping|SLAM]] or marker-based tracking) operate at finite sampling rates, so there is an inherent delay between a physical event occurring and the sensor capturing it<ref name="SensorDelay"></ref>.

*'''Data Transmission Delay''' - The captured sensor data must be transmitted from the sensor hardware to the processing unit (e.g., PC, console, or mobile [[System on a Chip|SoC]]). This can involve delays over [[USB]], [[Wireless|wireless]] links (such as [[Bluetooth]] or proprietary protocols), or internal buses.

*'''Processing Delay''' - The time required to process sensor data through prediction algorithms can add significant latency if not optimized properly<ref name="Carmack2015"></ref>. Raw sensor data needs several processing stages:
** '''[[Sensor Fusion|Sensor fusion]]''' - combining data from multiple sensors (e.g., IMU and cameras) to obtain a robust pose estimate.
** '''[[Filtering]]''' - applying [[Algorithm|algorithms]] such as [[Kalman Filter|Kalman filters]] or complementary filters to reduce [[Noise (signal processing)|noise]] and drift from sensors like IMUs.
** '''Pose Estimation''' - calculating the current position and orientation from the fused and filtered data.
** '''Prediction Calculation''' - running the predictive tracking algorithm itself to estimate the future pose<ref name="ProcessingSteps"></ref>.

*'''[[Application Logic|Game/Application Logic]] Delay''' - The application must process the (predicted) user pose, determine its consequences within the virtual or augmented world (e.g., [[Collision Detection|collisions]], interactions), and prepare data for rendering.

*'''Rendering Delays''' - Complex scene rendering requires extensive computational resources as the processor works to position every pixel correctly, particularly in high-resolution VR displays. Modern VR headsets with 4K or higher resolution per eye place enormous demands on GPUs, potentially introducing render queue delays<ref name="Vlachos2015"></ref>.

*'''Data Smoothing''' - Sensor data inherently contains noise that must be filtered to prevent jittery visuals. Low-level smoothing algorithms reduce this noise but can introduce latency because they need to sample data over time to generate smoothed outputs<ref name="LaValle2014"></ref>.

*'''Framerate Delays''' - When framerates drop below the display's refresh rate (typically 90-120 Hz for modern VR systems), the system must wait for frame completion before updating the display. These delays are particularly noticeable during computationally intensive scenes<ref name="Abrash2015"></ref>.

*'''Sensing Delays''' - [[Camera sensors]] and optical tracking systems experience inherent delays due to exposure time, data transfer, and processing. For optical tracking systems that rely on infrared or visible light reflections from tracked objects, these delays can be particularly significant<ref name="McGill2015"></ref>.

*'''Display Scan-Out / Refresh Delay''' - Once rendered, the image frame is sent to the display panel. There is a delay for the image data to be transmitted and for the display pixels to physically change state and emit light ([[Pixel Response Time|pixel response time]]). For instance, a 90 Hz display updates every 11.1 ms, so a rendered frame might wait up to that long before starting to be displayed, and the full image takes time to scan out across the screen<ref name="DisplayDelay"></ref>.

*'''Display Persistence''' - Traditional LCD displays hold each pixel in its state until updated, creating a smearing effect during head movement. While modern VR displays use low-persistence OLED or LCD technology that reduces this effect, there is still a small but measurable delay between when pixels receive new information and when they fully change state<ref name="Abrash2013"></ref>.
While each of these delays contributes to the overall latency budget, predictive tracking specifically targets the combined effect by anticipating future positions and orientations. Effective predictive algorithms can significantly reduce perceived latency, though they cannot eliminate it entirely.
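
To make this latency budget concrete, here is a minimal sketch that sums assumed per-stage delays for a hypothetical tethered 90 Hz headset. Every figure below is an illustrative assumption, not a measurement of any particular device:

<syntaxhighlight lang="python">
# Illustrative latency budget for a hypothetical tethered 90 Hz headset.
# Every figure is an assumed example value, not a measured one.
budget_ms = {
    "sensor_sampling": 1.0,        # IMU/camera capture
    "transmission": 1.0,           # USB / wireless link
    "processing_and_fusion": 2.0,  # filtering, fusion, pose estimation
    "application_logic": 2.0,      # game/app update
    "rendering": 11.1,             # one full frame at 90 Hz
    "display_scanout": 5.5,        # ~half a frame of scan-out on average
}

motion_to_photon_ms = sum(budget_ms.values())
# The prediction horizon should roughly match this total (see below).
print(f"estimated motion-to-photon latency: {motion_to_photon_ms:.1f} ms")
</syntaxhighlight>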


==How Predictive Tracking Works==
Predictive tracking fundamentally relies on [[Motion Model|motion modeling]] and extrapolation. It uses a history of recent pose data (position, orientation) and their derivatives ([[Velocity|velocity]], acceleration, [[Jerk (physics)|jerk]], angular velocity, angular acceleration) to build a model of the user's current movement<ref name="MotionData"></ref>. This model is then used to project the pose forward in time by an amount equal to the estimated system latency.

The process typically involves:

1. '''Data Acquisition''': Obtaining the latest pose estimate from the underlying tracking system (which typically incorporates sensor fusion and filtering).
2. '''State Estimation''': Determining the current motion state, including velocity, acceleration, and angular velocity, based on the recent history of poses. This often involves filtering to smooth the data and obtain reliable derivative estimates<ref name="StateEstimation"></ref>.
3. '''Prediction''': Applying a motion model and prediction algorithm to the current state to estimate the pose at a specific point in the future (the "prediction horizon").
4. '''Pose Correction (Optional)''': Applying corrections based on biomechanical constraints or knowledge of typical human movement patterns<ref name="Biomechanical"></ref>.
5. '''Output''': Providing the predicted future pose to the application and rendering engine.
The effectiveness of predictive tracking depends heavily on the quality of the underlying tracking data, the accuracy of the motion model used, and the correctness of the latency estimate<ref name="EffectivenessFactors"></ref>.
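
As a concrete illustration of the five steps above, the following is a minimal sketch of the prediction step using a constant-acceleration model for translation and constant-angular-velocity quaternion integration for orientation. The function names and the simple motion model are illustrative assumptions; production trackers use considerably more elaborate state estimation:

<syntaxhighlight lang="python">
import numpy as np

def integrate_quaternion(q, omega, dt):
    """Rotate unit quaternion q = (w, x, y, z) by body-frame angular
    velocity omega (rad/s) applied for dt seconds."""
    speed = np.linalg.norm(omega)
    if speed * dt < 1e-9:
        return q
    axis = omega / speed
    half = 0.5 * speed * dt
    dq = np.concatenate(([np.cos(half)], np.sin(half) * axis))
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = dq
    out = np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,   # Hamilton product q * dq
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])
    return out / np.linalg.norm(out)     # re-normalize against drift

def predict_pose(position, velocity, acceleration,
                 orientation, angular_velocity, horizon):
    """Steps 1-5 collapsed into one call: constant-acceleration
    translation and constant-angular-velocity rotation."""
    predicted_position = (position
                          + velocity * horizon
                          + 0.5 * acceleration * horizon ** 2)
    predicted_orientation = integrate_quaternion(
        orientation, angular_velocity, horizon)
    return predicted_position, predicted_orientation

# Example: head moving 1 m/s along x while yawing 90 deg/s, 30 ms ahead.
pos, quat = predict_pose(np.zeros(3), np.array([1.0, 0.0, 0.0]),
                         np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0]),
                         np.array([0.0, np.radians(90.0), 0.0]), 0.030)
</syntaxhighlight>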
 
==Prediction Horizon==
The "prediction horizon" is the duration into the future for which the system predicts the pose. Ideally, this duration should exactly match the system's motion-to-photon latency<ref name="HorizonLatencyMatch"></ref>. The appropriate prediction time horizon varies based on several system-specific factors. The starting point for calibrating prediction time is typically to measure the end-to-end latency of the entire system and then optimize prediction parameters accordingly.


In practice, predictive tracking often needs to account for multiple future time points simultaneously for several reasons:


*'''Activity-Specific Tuning''' - Different applications may require different prediction parameters. A fast-paced VR game might benefit from more aggressive prediction to handle rapid movements, while a precision CAD application might use more conservative prediction to prioritize accuracy over responsiveness<ref name="Sutherland2018"></ref>.

Potential problems with prediction horizon selection include:
* '''Too Short a Horizon''' - If the prediction horizon is shorter than the actual latency, some lag remains perceptible.
* '''Too Long a Horizon''' - If the prediction horizon is longer than the actual latency, the system may "overshoot" the prediction. This can make the world seem to "lead" the user's movements, or introduce [[Jitter|visual jitter]] if the prediction needs frequent correction, which can be just as uncomfortable as lag<ref name="OvershootProblem"></ref>.


Typical prediction horizons in contemporary VR and AR systems range from 20 to 50 milliseconds, though this can vary based on all the factors mentioned above. Generally, the prediction horizon should roughly match the system's motion-to-photon latency, with some adjustments based on empirical testing and user feedback.
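
As a sketch of how a runtime might maintain several horizons at once, the snippet below derives one horizon for the renderer and a shorter one for a late-stage reprojection pass. All timing constants are illustrative assumptions:

<syntaxhighlight lang="python">
# Hypothetical frame timing for a 90 Hz headset; all constants are assumptions.
FRAME_S = 1.0 / 90.0            # ~11.1 ms per displayed frame

def prediction_horizons(sensor_to_render_s=0.004, frames_in_flight=2):
    """Return (render_horizon, reprojection_horizon) in seconds.

    The renderer needs a pose predicted all the way to the frame's
    scan-out, while a late-stage reprojection pass re-predicts with a
    much shorter horizon just before display."""
    render_horizon = sensor_to_render_s + frames_in_flight * FRAME_S
    reprojection_horizon = 0.5 * FRAME_S   # roughly mid-scan-out
    return render_horizon, reprojection_horizon

render_h, reproj_h = prediction_horizons()
print(f"render: {render_h*1000:.1f} ms, reprojection: {reproj_h*1000:.1f} ms")
</syntaxhighlight>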
==Prediction Algorithms==
Several predictive tracking algorithms have become standard in the AR and VR industry, each with its own strengths and limitations:


*'''[[Dead Reckoning]]''' - One of the simplest methods: it assumes constant velocity (or sometimes constant acceleration) and extrapolates the last known pose linearly:
`Predicted_Position = Current_Position + Current_Velocity * Prediction_Horizon`
`Predicted_Orientation = Current_Orientation + Current_Angular_Velocity * Prediction_Horizon` (using [[Quaternion|quaternion]] math for orientation)
** ''Pros'': Very simple, computationally cheap.
** ''Cons'': Accuracy degrades quickly when users change direction or speed (which they do constantly in typical head and hand movement), making it primarily useful as a fallback method or for very short prediction horizons<ref name="Welch2002"></ref><ref name="DeadReckoning"></ref>.

*'''Alpha-Beta-Gamma (ABG) Filter''' - This predictor continuously estimates velocity and acceleration to forecast future positions. Unlike more complex filters, ABG uses minimal historical data, making it computationally efficient but potentially less accurate for complex movements. It prioritizes responsiveness over noise reduction, making it suitable for scenarios where quick reaction time is critical<ref name="Faragher2012"></ref>.
** ''Pros'': Relatively simple and more responsive to changes in acceleration than basic dead reckoning; balances smoothing and responsiveness through tunable alpha, beta, and gamma parameters.
** ''Cons'': Less effective noise reduction than Kalman filters; sensitive to parameter tuning<ref name="ABGFilter"></ref>.


*'''[[Kalman Filter]]''' - A powerful and widely used technique in tracking and navigation. The Kalman filter is an optimal [[Estimator|estimator]] for [[Linear System|linear systems]] with [[Gaussian Noise|Gaussian noise]]. It maintains a probabilistic estimate of the system's state (e.g., position, velocity, acceleration) and updates this estimate in two steps:
** ''Predict step'': Uses a [[System Dynamics|system model]] (how the state evolves over time, e.g., based on [[Kinematics|kinematic equations]]) to predict the next state and its uncertainty.
** ''Update step'': Uses the latest sensor measurement to correct the predicted state, weighing the prediction and measurement by their respective uncertainties.
The predict step inherently provides the future pose estimate needed for predictive tracking (a minimal sketch follows this entry). The [[Extended Kalman Filter]] (EKF) and [[Unscented Kalman Filter]] (UKF) variants handle non-linear systems, such as orientation tracking with quaternions, and Kalman-based approaches have become industry standards for complex movements<ref name="Welch2006"></ref>.
** ''Pros'': Optimal state estimation under its assumptions, effective noise reduction, provides uncertainty estimates, and handles multiple sensor inputs naturally (sensor fusion).
** ''Cons'': More computationally expensive, requires an accurate system model, and assumes Gaussian noise, which may not always hold.
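
Below is a minimal sketch of the predict and update steps just described, assuming a per-axis constant-velocity state model; the class name and noise values are illustrative assumptions, not any vendor's implementation:

<syntaxhighlight lang="python">
import numpy as np

class ConstantVelocityKF:
    """Per-axis Kalman filter with state x = [position, velocity]."""

    def __init__(self, process_noise=1e-2, measurement_noise=1e-4):
        self.x = np.zeros(2)                 # state estimate
        self.P = np.eye(2)                   # state covariance
        self.q = process_noise               # white-noise acceleration intensity
        self.R = np.array([[measurement_noise]])
        self.H = np.array([[1.0, 0.0]])      # we observe position only

    def step(self, measured_position, dt):
        F = np.array([[1.0, dt], [0.0, 1.0]])
        Q = self.q * np.array([[dt**3 / 3, dt**2 / 2],
                               [dt**2 / 2, dt]])
        # Predict: propagate state and uncertainty through the motion model.
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q
        # Update: correct with the measurement, weighted by both uncertainties.
        innovation = measured_position - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innovation
        self.P = (np.eye(2) - K @ self.H) @ self.P

    def predict_ahead(self, horizon):
        # The same predict model, run forward by the prediction horizon.
        F = np.array([[1.0, horizon], [0.0, 1.0]])
        return (F @ self.x)[0]               # predicted position
</syntaxhighlight>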


*'''[[Linear Extrapolation|Linear]] / [[Polynomial Extrapolation]]''' - Fits a line or higher-order polynomial to the recent trajectory of pose data points and extrapolates along that curve. Higher-order polynomials can capture acceleration or even jerk.
** ''Pros'': Conceptually simple; can be more accurate than constant-velocity dead reckoning.
** ''Cons'': Sensitive to noise in recent data points; higher-order polynomials can oscillate wildly and produce unstable predictions<ref name="Extrapolation"></ref>.


*'''Double Exponential Smoothing''' - This statistical technique gives more weight to recent observations while still considering historical data. It is particularly effective for tracking movements with gradual acceleration or deceleration patterns, such as head rotations that naturally speed up and slow down<ref name="LaViola2003"></ref>. A minimal sketch follows this entry.
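
Here is a compact sketch of a double exponential smoothing predictor along the lines of LaViola's formulation; the single-axis scalar simplification and default parameters are illustrative assumptions:

<syntaxhighlight lang="python">
def desp_predict(history, alpha=0.5, tau=3.0):
    """Double exponential smoothing predictor (after LaViola, 2003).

    history: position samples, oldest first (one scalar axis for brevity);
    alpha:   smoothing factor in (0, 1), higher = more responsive;
    tau:     prediction horizon measured in sample periods."""
    sp = sp2 = history[0]
    for p in history[1:]:
        sp = alpha * p + (1.0 - alpha) * sp      # first smoothing pass
        sp2 = alpha * sp + (1.0 - alpha) * sp2   # second pass smooths sp
    k = alpha * tau / (1.0 - alpha)
    # The gap between the two smoothed series encodes the recent trend.
    return (2.0 + k) * sp - (1.0 + k) * sp2

# Example: a steadily advancing pose axis, predicted 3 samples ahead.
print(desp_predict([0.00, 0.01, 0.02, 0.03, 0.04]))
</syntaxhighlight>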


*'''[[Machine Learning|Machine Learning]] Approaches''' - More recent research explores using [[Neural Network|neural networks]], such as [[Recurrent Neural Network|RNNs]] and [[Long Short-Term Memory|LSTM]] networks, trained on large datasets of human motion to predict future poses. These models can capture complex, non-linear motion patterns and potentially adapt over time to individual users with consistent movement styles<ref name="Orozco2019"></ref>.
** ''Pros'': Can model complex dynamics without an explicit mathematical model; potentially higher accuracy for certain types of motion.
** ''Cons'': Requires significant training data; computationally intensive (especially for inference on resource-constrained devices); can be less predictable or interpretable ("black box")<ref name="MLPrediction"></ref>.

*'''Particle Filters''' - For highly unpredictable movements, particle filters (also known as Sequential Monte Carlo methods) maintain multiple possible future trajectories simultaneously, weighted by probability. These are particularly useful for hand tracking and gesture recognition, where movements can be erratic and multi-modal<ref name="Isard1998"></ref>.


*'''Hybrid Approaches''' - State-of-the-art predictive tracking often combines multiple algorithms, using fast methods for immediate response and more sophisticated algorithms to refine predictions. For example, a system might use dead reckoning for immediate feedback while a Kalman filter computes a more accurate prediction in parallel<ref name="Greer2020"></ref>.


Selection of the appropriate algorithm depends on hardware capabilities, movement characteristics, and application requirements. Modern commercial systems often implement proprietary variants that combine elements from multiple approaches, optimized for specific hardware platforms.
==Implementation in Current Devices==
Today, every consumer headset relies on some form of predictive tracking to deliver a stable, low-latency experience. The implementation details vary across manufacturers:
* '''Meta Quest (3/Pro)''' combines high-rate IMUs with inside-out camera SLAM and uses asynchronous [[Time warp (virtual reality)|time-warp]] and SpaceWarp to correct frames just before display<ref name="Dasch2019"></ref>. This allows Meta Quest devices to maintain responsive tracking even with the limited computational power of a mobile processor.
* '''Apple Vision Pro''' fuses multiple high-speed cameras, depth sensors and IMUs on Apple-designed silicon; measured optical latency of ≈11 ms implies aggressive short-horizon prediction for head and eye pose<ref name="Lang2024"></ref>. Apple's sophisticated sensor array and custom prediction algorithms help maintain the precise alignment needed for their mixed reality experiences.
* '''Microsoft HoloLens 2''' uses IMU + depth-camera fusion and hardware-assisted '''reprojection''' to keep holograms locked to real space; Microsoft stresses maintaining ≤16.6 ms frame time and using prediction to cover any additional delay<ref name="Microsoft2021"></ref>. This is particularly important for AR applications where virtual content must stay perfectly aligned with the physical world.
Other implementations include:
* '''Valve Index''' leverages external base stations for precise tracking while using sophisticated predictive algorithms to maintain its high refresh rate (up to 144Hz) with minimal perceived latency.
* '''PlayStation VR2''' combines inside-out tracking with predictive algorithms optimized for gaming applications, where rapid head movements are common during gameplay.


==Implementation Considerations==


*'''Platform-Specific Tuning''' - Different hardware platforms have unique latency characteristics that affect prediction requirements. Mobile VR systems typically have higher latency than tethered systems, requiring more aggressive prediction, while high-end PC-based systems may use more conservative approaches that prioritize stability<ref name="Google2019"></ref>.
*'''Handling Static Poses''' - When the user holds perfectly still, predictive tracking should ideally be dampened or disabled to prevent prediction-induced jitter around the stationary pose (see the sketch below).
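
As a sketch of such damping, the snippet below scales the prediction horizon toward zero as measured velocities fall below stillness thresholds; the thresholds and the linear scaling scheme are illustrative assumptions:

<syntaxhighlight lang="python">
import numpy as np

# Assumed stillness thresholds; real systems tune these per device and sensor.
STILL_LINEAR = 0.005    # m/s
STILL_ANGULAR = 0.05    # rad/s

def effective_horizon(horizon, linear_velocity, angular_velocity):
    """Scale the prediction horizon toward zero as the user comes to rest,
    so that sensor noise is not extrapolated into visible jitter."""
    ratio = max(np.linalg.norm(linear_velocity) / STILL_LINEAR,
                np.linalg.norm(angular_velocity) / STILL_ANGULAR)
    return horizon * min(1.0, ratio)

# A nearly still head: the horizon collapses, suppressing predicted motion.
print(effective_horizon(0.030, np.array([0.001, 0, 0]), np.array([0.01, 0, 0])))
</syntaxhighlight>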


Effective implementation requires balancing these considerations against the specific requirements of the target application and hardware platform.


*'''Platform Diversity''' - The wide range of AR and VR hardware platforms creates challenges for developers implementing prediction algorithms. Each platform has unique sensors, processing capabilities, and display technologies that affect optimal prediction parameters. Cross-platform applications must adapt to these differences or risk inconsistent experiences<ref name="Bowman2007"></ref>.
*'''Noise Amplification''' - Simple prediction methods can amplify noise present in the tracking data, leading to jittery predicted poses. More sophisticated filters (like Kalman) mitigate this but add complexity.
*'''Tuning''' - Many algorithms require careful tuning of parameters (e.g., filter gains, process noise estimates in Kalman filters, learning rates in ML models) to perform optimally for a specific hardware setup and expected motion dynamics<ref name="Tuning"></ref>.


Researchers and developers continue to address these challenges through more sophisticated algorithms, improved hardware, and adaptive approaches that dynamically adjust to changing conditions.


*'''Cross-Device Standardization''' - As the industry matures, standardized predictive tracking APIs and metrics may emerge, allowing developers to create consistent experiences across platforms while leveraging platform-specific optimizations behind standardized interfaces<ref name="Khronos2017"></ref>.
*'''Brain-Computer Interfaces''' - Although still in early stages, brain-computer interfaces could revolutionize how users interact with AR/VR systems, potentially allowing for direct neural control that inherently includes predictive elements<ref name="antycip"></ref>.
*'''Reduced Latency Hardware''' - Developments in hardware, such as faster processors and higher refresh rate displays, will naturally decrease system latency, potentially reducing the need for as much prediction but still requiring it for optimal performance.


These advancements promise to further reduce perceived latency, improve tracking accuracy, and enhance the overall quality of AR and VR experiences across all application domains.


Each of these techniques addresses different aspects of the overall tracking and rendering pipeline. A state-of-the-art AR or VR system typically combines multiple approaches, with predictive tracking serving as a central component that ties together many other optimizations.
==See Also==
* [[Motion-to-Photon Latency]]
* [[Time warp (virtual reality)]]
* [[Sensor fusion]]
* [[Kalman Filter]]
* [[Dead Reckoning]]
* [[Augmented Reality]]
* [[Virtual Reality]]
* [[Motion Sickness]]
* [[Immersion (virtual reality)|Immersion]]
* [[Head-Mounted Display]]
* [[Tracking System|Tracking (VR/AR)]]
* [[State Estimation]]


==References==
<references>
<ref name="Abrash2014">Abrash, M. (2014). "What VR Could, Should, and Almost Certainly Will Be Within Two Years." Steam Dev Days, Seattle.</ref>
<ref name="Azuma1997">Azuma, R. T. (1997). "A Survey of Augmented Reality." Presence: Teleoperators and Virtual Environments, 6(4), pp. 355-385.</ref>
<ref name="Azuma1995">Azuma, Ronald T. (1995). "Predictive Tracking for Augmented Reality." Ph.D. dissertation, University of North Carolina at Chapel Hill.</ref>
<ref name="Oculus2013">Oculus VR (2013). "Measuring Latency in Virtual Reality Systems." Oculus Developer Documentation.</ref>
<ref name="Carmack2013">Carmack, J. (2013). "Latency Mitigation Strategies." Oculus Connect Keynote.</ref>
<ref name="Yao2014">Yao, R., Heath, T., Davies, A., Forsyth, T., Mitchell, N., & Hoberman, P. (2014). "Oculus VR Best Practices Guide." Oculus VR.</ref>
<ref name="LaValle2014">LaValle, S. M., Yershova, A., Katsev, M., & Antonov, M. (2014). "Head tracking for the Oculus Rift." IEEE International Conference on Robotics and Automation (ICRA), pp. 187-194.</ref>
<ref name="SensorDelay">Example Source 6: Technical Specifications of common IMU sensors used in VR.</ref>
<ref name="Carmack2015">Carmack, J. (2015). "The Oculus Rift, Oculus Touch, and VR Games at E3." Oculus Blog.</ref>
<ref name="ProcessingSteps">Example Source 7: Analysis of VR System Pipeline Delays. Breaks down computational stages.</ref>
<ref name="Vlachos2015">Vlachos, A. (2015). "Advanced VR Rendering." Game Developers Conference.</ref>
<ref name="Abrash2015">Abrash, M. (2015). "Why Virtual Reality Isn't (Just) the Next Big Platform." Oculus Connect 2 Keynote.</ref>
<ref name="McGill2015">McGill, M., Boland, D., Murray-Smith, R., & Brewster, S. (2015). "A Dose of Reality: Overcoming Usability Challenges in VR Head-Mounted Displays." Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pp. 2143-2152.</ref>
<ref name="DisplayDelay">Example Source 9: Display Technology Review. Compares LCD/OLED refresh and response times.</ref>
<ref name="Abrash2013">Abrash, M. (2013). "Down the VR Rabbit Hole: Fixing Latency in Virtual Reality." Game Developers Conference.</ref>
<ref name="Xu2018">Xu, R., Chen, S., Han, Y., & Wu, D. (2018). "Achieving Low Latency Mobile Cloud Gaming Through Frame Dropping and Extrapolation." IEEE Transactions on Circuits and Systems for Video Technology, 28(8), pp. 1932-1946.</ref>
<ref name="MotionData">Example Source 10: Fundamentals of Robotic Motion and Control. Describes using pose derivatives.</ref>
<ref name="StateEstimation">Example Source 11: Probabilistic Robotics. Discusses state estimation and filtering.</ref>
<ref name="Biomechanical">Example Source 12: Research paper on biomechanically-informed predictive tracking.</ref>
<ref name="EffectivenessFactors">Example Source 5: VR Development Best Practices Guide.</ref>
<ref name="HorizonLatencyMatch">Example Source 2: Whitepaper on Low-Latency VR.</ref>
<ref name="Livingston2008">Livingston, M. A., & Ai, Z. (2008). "The Effect of Registration Error on Tracking Distant Augmented Objects." Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 77-86.</ref>
<ref name="Jerald2009">Jerald, J., & Whitton, M. (2009). "Relating Scene-Motion Thresholds to Latency Thresholds for Head-Mounted Displays." IEEE Virtual Reality Conference, pp. 211-218.</ref>
<ref name="Kennedy1993">Kennedy, R. S., Lane, N. E., Berbaum, K. S., & Lilienthal, M. G. (1993). "Simulator Sickness Questionnaire: An Enhanced Method for Quantifying Simulator Sickness." The International Journal of Aviation Psychology, 3(3), pp. 203-220.</ref>
<ref name="Sutherland2018">Sutherland, M., & Sutherland, J. (2018). "Adaptation in XR Experiences." SIGGRAPH Asia Technical Briefs, Article 29.</ref>
<ref name="OvershootProblem">Example Source 13: User study on the effects of prediction overshoot in VR.</ref>
<ref name="DeadReckoning">Example Source 16: Textbook on Networked Games and Virtual Environments.</ref>
<ref name="Faragher2012">Faragher, R. (2012). "Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation." IEEE Signal Processing Magazine, 29(5), pp. 128-132.</ref>
<ref name="ABGFilter">Example Source 17: Technical article comparing simple filters for tracking.</ref>
<ref name="Welch2002">Welch, G., & Foxlin, E. (2002). "Motion Tracking: No Silver Bullet, but a Respectable Arsenal." IEEE Computer Graphics and Applications, 22(6), pp. 24-38.</ref>
<ref name="Welch2006">Welch, G., & Bishop, G. (2006). "An Introduction to the Kalman Filter." University of North Carolina at Chapel Hill, Department of Computer Science, Technical Report 95-041.</ref>
<ref name="Extrapolation">Example Source 19: Numerical Analysis textbook covering extrapolation methods.</ref>
<ref name="Isard1998">Isard, M., & Blake, A. (1998). "CONDENSATION—Conditional Density Propagation for Visual Tracking." International Journal of Computer Vision, 29(1), pp. 5-28.</ref>
<ref name="LaViola2003">LaViola, J. J. (2003). "Double Exponential Smoothing: An Alternative to Kalman Filter-Based Predictive Tracking." Proceedings of the Workshop on Virtual Environments, pp. 199-206.</ref>
<ref name="MLPrediction">Example Source 20: Recent conference paper (e.g., SIGGRAPH, IEEE VR) on ML for pose prediction.</ref>
<ref name="Orozco2019">Orozco Gómez, D., & Malkani, A. (2019). "Deep Learning for Movement Prediction in Mixed Reality." Microsoft Research Technical Report.</ref>
<ref name="Greer2020">Greer, J., & Johnson, K. (2020). "Multi-modal Prediction for XR Tracking." IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 161-170.</ref>
<ref name="Dasch2019">Dasch, Tom. "Understanding Gameplay Latency for Oculus Quest, Oculus Go and Gear VR." Oculus Developer Blog, April 11 2019.</ref>
<ref name="Lang2024">Lang, Ben. "Vision Pro and Quest 3 Hand-Tracking Latency Compared." Road to VR, March 28 2024.</ref>
<ref name="Microsoft2021">Microsoft. "Hologram Stability." Mixed Reality Documentation (HoloLens 2), 2021.</ref>
<ref name="Olsson2011">Olsson, T., & Salo, M. (2011). "Narratives of Satisfying and Unsatisfying Experiences of Current Mobile Augmented Reality Applications." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2779-2788.</ref>
<ref name="Koulieris2017">Koulieris, G. A., Bui, B., Banks, M. S., & Drettakis, G. (2017). "Accommodation and Comfort in Head-Mounted Displays." ACM Transactions on Graphics, 36(4), Article 87.</ref>
<ref name="Stanney1997">Stanney, K. M., Kennedy, R. S., & Drexler, J. M. (1997). "Cybersickness is Not Simulator Sickness." Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 41(2), pp. 1138-1142.</ref>
<ref name="Google2019">Google (2019). "Designing for Google Cardboard." Google Developers Documentation.</ref>
<ref name="Tuning">Example Source 5: VR Development Best Practices Guide.</ref>
<ref name="Seymour2002">Seymour, N. E., Gallagher, A. G., Roman, S. A., O'Brien, M. K., Bansal, V. K., Andersen, D. K., & Satava, R. M. (2002). "Virtual Reality Training Improves Operating Room Performance: Results of a Randomized, Double-Blinded Study." Annals of Surgery, 236(4), pp. 458-464.</ref>
<ref name="Bae2013">Bae, H., Golparvar-Fard, M., & White, J. (2013). "High-Precision Vision-Based Mobile Augmented Reality System for Context-Aware Architectural, Engineering, Construction and Facility Management (AEC/FM) Applications." Visualization in Engineering, 1(1), pp. 1-13.</ref>
<ref name="Campbell2018">Campbell, J., McSorley, K., & Bergstrom, I. (2018). "Specialized Processing Units for Real-Time VR Tracking." GPU Technology Conference.</ref>
<ref name="Khronos2017">Khronos Group (2017). "OpenXR Specification." Khronos Group Technical Documentation.</ref>
<ref name="antycip">ST Engineering Antycip (2024). "A Brief Guide to VR Motion Tracking Technology."</ref>
<ref name="Beeler2016">Beeler, D., Hutchins, E., & Pedriana, P. (2016). "Asynchronous Spacewarp." Oculus Connect 3 Technical Presentation.</ref>
<ref name="Patney2016">Patney, A., Salvi, M., Kim, J., Kaplanyan, A., Wyman, C., Benty, N., ... & Lefohn, A. (2016). "Towards Foveated Rendering for Gaze-Tracked Virtual Reality." ACM Transactions on Graphics, 35(6), Article 179.</ref>
</references>


[[Category:Terms]]
[[Category:Technical Terms]]
[[Category:Tracking]]
[[Category:Latency Compensation]]
[[Category:Virtual Reality]]
[[Category:Augmented Reality]]
[[Category:Core Concepts]]