Markerless outside-in tracking
Revision as of 17:15, 30 April 2025

Introduction
Markerless outside-in tracking is a subtype of positional tracking used in both virtual reality (VR) and augmented reality (AR). It places external cameras or other depth sensing devices around the play area and estimates a user’s six-degree-of-freedom pose without any worn fiducial markers. Instead, the system runs computer vision algorithms—most famously the per-pixel body-part classifier introduced for Microsoft’s Kinect—to create a real-time motion capture skeleton.[1]
Underlying technology
A typical markerless outside-in pipeline includes:
- Sensing layer – One or more fixed RGB-D or infrared depth cameras (e.g., the first-generation Kinect) acquire point-cloud frames. Depth is measured with structured light or time-of-flight illumination.[2][3]
- Segmentation – Foreground extraction isolates user pixels from the static background.
- Body-part classification – A decision-forest classifier labels each depth pixel as head, hand, torso, and so on, following Shotton et al.[1]
- Skeletal fitting and filtering – Joint hypotheses are fitted to a kinematic model and temporally smoothed, generating continuous head- and hand-pose streams.
Open software stacks such as OpenNI/NITE expose these joint streams to developers.[4]
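The segmentation stage of the pipeline above can be illustrated with a minimal sketch. It assumes a depth image stored as nested lists of metres, a previously captured empty-room background frame, and a simple pinhole camera model. The function names, the 0.10 m threshold, and the toy intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical choices for illustration; they are not taken from Kinect, OpenNI, or any cited system, and the body-part classifier itself is not reproduced here.

```python
from typing import List, Optional, Tuple

DepthImage = List[List[float]]  # metres; 0.0 means "no depth reading"
Point3D = Tuple[float, float, float]


def segment_foreground(frame: DepthImage, background: DepthImage,
                       min_diff: float = 0.10) -> List[List[bool]]:
    """Flag pixels at least `min_diff` metres nearer than the empty-room depth."""
    return [
        [f > 0.0 and b > 0.0 and (b - f) >= min_diff
         for f, b in zip(row_f, row_b)]
        for row_f, row_b in zip(frame, background)
    ]


def back_project(u: int, v: int, z: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Pinhole back-projection of pixel (u, v) at depth z into camera space."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def foreground_centroid(frame: DepthImage, mask: List[List[bool]],
                        fx: float, fy: float,
                        cx: float, cy: float) -> Optional[Point3D]:
    """Mean 3-D position of all foreground pixels, or None if there are none."""
    points = [
        back_project(u, v, frame[v][u], fx, fy, cx, cy)
        for v, row in enumerate(mask)
        for u, is_fg in enumerate(row)
        if is_fg
    ]
    if not points:
        return None
    n = len(points)
    return (sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
            sum(p[2] for p in points) / n)


# Toy 3x4 scene: a wall 3 m away, with a user occupying two pixels at 2 m.
background = [[3.0] * 4 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][1] = frame[1][2] = 2.0

mask = segment_foreground(frame, background)
print(foreground_centroid(frame, mask, fx=2.0, fy=2.0, cx=1.5, cy=1.0))
# prints (0.0, 0.0, 2.0)
```

A real system would feed the masked pixels to the body-part classifier rather than reduce them to a single centroid. The centroid only stands in here for a crude position estimate.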
Markerless vs. marker-based tracking
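Marker-based systems instrument the headset or controllers themselves: PlayStation VR carries active LEDs that an external camera tracks, HTC Vive Lighthouse uses photodiodes that detect sweeps from external base stations, and optical motion-capture rigs use reflective spheres. Such systems can reach millimetre-level accuracy. Markerless systems remove that hardware layer but incur: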
- Susceptibility to occlusion and environmental lighting.
- Higher positional noise and latency (~20–30 ms end-to-end).[5]
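Noise and latency interact: the temporal smoothing used to suppress jitter also delays the response to real motion. The sketch below illustrates that trade-off with a first-order exponential filter applied to one joint coordinate. The filter choice, the `alpha` value, the synthetic noise level, and all function names are assumptions made for illustration; the cited systems may filter differently.

```python
import random
import statistics
from typing import List


def exponential_smooth(samples: List[float], alpha: float) -> List[float]:
    """First-order low-pass filter: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out: List[float] = []
    prev = None
    for x in samples:
        prev = x if prev is None else alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


def jitter(samples: List[float]) -> float:
    """Population standard deviation, used here as a simple noise measure."""
    return statistics.pstdev(samples)


def frames_to_settle(samples: List[float], target: float,
                     tolerance: float) -> int:
    """Index of the first frame within `tolerance` of `target` (a lag measure)."""
    for i, y in enumerate(samples):
        if abs(y - target) <= tolerance:
            return i
    return len(samples)


# 1) A stationary hand at x = 1.0 m with synthetic 1 cm Gaussian sensor noise.
rng = random.Random(0)
still = [1.0 + rng.gauss(0.0, 0.01) for _ in range(300)]
smoothed_still = exponential_smooth(still, alpha=0.2)
print("jitter raw/smoothed:", jitter(still), jitter(smoothed_still))

# 2) A sudden 1 m move at frame 5: the smoothed trace arrives later.
step = [0.0] * 5 + [1.0] * 50
smoothed_step = exponential_smooth(step, alpha=0.2)
print("frames to settle raw/smoothed:",
      frames_to_settle(step, 1.0, 0.05),
      frames_to_settle(smoothed_step, 1.0, 0.05))
```

A smaller `alpha` lowers the jitter further but makes the smoothed trace settle later, so a tracker has to balance steadiness against responsiveness.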
History and notable systems
Year | System | Technical note |
---|---|---|
2003 | EyeToy (PlayStation 2) | 2-D silhouette tracking with a single RGB webcam.[6] |
2010 | Kinect for Xbox 360 | Structured-light depth sensor that detects up to six people and tracks full-body skeletons for two active users at a time.[7] |
2011 | Kinect + FAAST middleware | Demonstrated low-cost VR interaction with markerless tracking.[8] |
2017 | Kinect production ends | Microsoft ceased manufacturing Kinect as industry moved to other tracking paradigms.[9] |
Applications
- Gaming and entertainment – Titles such as Kinect Sports map whole-body gestures to avatars; hobbyists still use Kinect for full-body VR chat avatars.
- Rehabilitation and exercise – Depth-based pose tracking supports remote physiotherapy and balance-training systems.[5]
- Interactive exhibits – Museums mount depth cameras to create “magic-mirror” AR overlays.
- Telepresence – Multi-camera arrays stream volumetric avatars into shared virtual spaces.
Advantages
- No wearable markers, enhancing comfort.
- Quick single-sensor setup and lower hardware cost.
- Ability to track multiple users at once.
Disadvantages
- Occlusion sensitivity and limited camera field-of-view.
- Lower accuracy than marker-based alternatives.[10]
- Performance degradation in bright sunlight or on reflective surfaces.
References
- ↑ 1.0 1.1 Shotton, J.; Fitzgibbon, A.; Cook, M.; Sharp, T.; Finocchio, M.; Moore, R.; Kipman, A.; Blake, A. “Real-Time Human Pose Recognition in Parts from a Single Depth Image.” Proceedings of CVPR 2011. IEEE, 2011.
- ↑ Zhang, Z. “Microsoft Kinect Sensor and Its Effect.” IEEE MultiMedia, 19 (2), 2012, pp. 4–10.
- ↑ “Structured-light 3D scanner.” Wikipedia. Accessed 1 May 2025.
- ↑ OpenNI Foundation. OpenNI 1.5.2 User Guide. 2013.
- ↑ 5.0 5.1 Pfister, A.; West, N.; et al. “Applications and limitations of current markerless motion capture methods for clinical gait biomechanics.” Journal of Biomechanics, 129 (2022) 110844.
- ↑ Pham, A. “EyeToy Springs From One Man’s Vision.” Los Angeles Times, 27 Nov 2003.
- ↑ Microsoft News Center. “The Future of Entertainment Starts Today as Kinect for Xbox 360 …”, 4 Nov 2010.
- ↑ Lange, B.; Rizzo, A.; Chang, C-Y.; Suma, E.; Bolas, M. “Markerless Full Body Tracking: Depth-Sensing Technology within Virtual Environments.” I/ITSEC 2011.
- ↑ Good, O. “Kinect is officially dead. Really. Officially. It’s dead.” Polygon, 25 Oct 2017.
- ↑ Remocapp. “Marker vs Markerless Motion Capture by Accuracy and Detail Level.” Blog post, 2024.