Markerless outside-in tracking

* '''Sensing layer''' – One or more fixed [[RGB-D]] or [[infrared]] depth cameras acquire per-frame point clouds. Commodity devices such as the Microsoft Kinect project a [[structured light]] pattern or use [[time-of-flight]] methods to compute depth maps.<ref name="Zhang2012" />
* '''Segmentation''' – Foreground extraction or person segmentation isolates user pixels from the static background.
* '''Per-pixel body-part classification''' – A machine-learning model labels each pixel as “head”, “hand”, “torso”, and so on (for example, the Randomised Decision Forest used in the original Kinect).<ref name="Shotton2011" />
* '''Skeletal reconstruction and filtering''' – The system fits a kinematic skeleton to the classified pixels and applies temporal filtering to reduce jitter, producing smooth head- and hand-pose data that can drive VR/AR applications; a minimal code sketch of two of these stages follows this list.
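
The sketch below illustrates two of these stages in isolation: depth-based foreground segmentation against a pre-captured empty-scene background, and exponential smoothing of one joint position. It is an illustrative sketch only, not the algorithm of any particular product; the array sizes, thresholds, joint values, and the smoothing constant <code>alpha</code> are assumptions chosen for the example.

<syntaxhighlight lang="python">
import numpy as np

def segment_foreground(depth, background, min_diff_mm=80, max_range_mm=4000):
    """Mark pixels as foreground where the live frame is significantly
    closer to the camera than a pre-captured empty-scene background.
    A depth of 0 means "no reading" on many sensors and is skipped."""
    depth = depth.astype(np.int32)          # avoid uint16 underflow
    background = background.astype(np.int32)
    valid = (depth > 0) & (depth < max_range_mm)
    return valid & ((background - depth) > min_diff_mm)

class JointSmoother:
    """Exponential smoothing of one 3-D joint position; a larger alpha
    tracks motion faster but passes through more sensor jitter."""
    def __init__(self, alpha=0.4):
        self.alpha = alpha
        self.state = None

    def update(self, xyz):
        xyz = np.asarray(xyz, dtype=float)
        if self.state is None:
            self.state = xyz
        else:
            self.state = self.alpha * xyz + (1.0 - self.alpha) * self.state
        return self.state

# Synthetic 640x480 depth frames in millimetres.
background = np.full((480, 640), 3000, dtype=np.uint16)
frame = background.copy()
frame[200:400, 250:390] = 1500              # a user standing 1.5 m away
mask = segment_foreground(frame, background)

# Smooth three noisy head-position estimates (metres, camera space).
head = JointSmoother(alpha=0.4)
for noisy in [(0.02, 1.62, 1.51), (0.05, 1.60, 1.49), (0.01, 1.63, 1.50)]:
    smoothed = head.update(noisy)
print(mask.sum(), "foreground pixels; smoothed head:", smoothed)
</syntaxhighlight>

Production systems typically use more elaborate filters than plain exponential smoothing, such as double-exponential smoothing, to trade latency against jitter.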

Although a single camera can suffice, multi-camera rigs extend coverage and mitigate occlusion problems. Open-source and proprietary middleware (for example, [[OpenNI]]/NITE or the [[Microsoft Kinect]] SDK) exposes joint-stream APIs for developers.<ref name="OpenNI2013" />
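
Joint-stream APIs differ in their exact bindings, but most follow the same poll-a-frame, read-joints pattern. The sketch below is written against a hypothetical <code>bodytrack</code> module rather than the real OpenNI/NITE or Kinect SDK APIs, whose names and signatures differ; it only illustrates the shape of the consuming loop.

<syntaxhighlight lang="python">
# Hypothetical sketch: `bodytrack` stands in for middleware such as
# OpenNI/NITE or the Kinect SDK; real binding names and signatures differ.
import bodytrack

tracker = bodytrack.UserTracker()       # hypothetical entry point
tracker.start()
try:
    while tracker.is_running():
        frame = tracker.read_frame()    # blocks until the next processed frame
        for user in frame.tracked_users:
            head = user.joints["head"]  # joints carry position + confidence
            if head.confidence > 0.5:   # ignore low-confidence estimates
                print("head at", head.position)  # feed the VR/AR app here
finally:
    tracker.stop()
</syntaxhighlight>

Most such APIs expose a per-joint confidence value, which applications use to discard estimates degraded by occlusion or ambiguous poses.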

==Markerless vs. marker-based tracking==

==References==
<ref name="Shotton2011">Shotton, Jamie; Fitzgibbon, Andrew; Cook, Mat; Sharp, Toby; Finocchio, Mark; Moore, Bob; Kipman, Alex; Blake, Andrew (2011). Real-Time Human Pose Recognition in Parts from a Single Depth Image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2011​
<ref name="Shotton2011">Shotton, J.; Fitzgibbon, A.; Cook, M.; Sharp, T.; Finocchio, M.; Moore, R.; Kipman, A.; Blake, A. “Real‑Time Human Pose Recognition in Parts from a Single Depth Image.” *Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*, 2011, pp. 1297–1304. DOI: 10.1109/CVPR.2011.5995316. Available at: https://ieeexplore.ieee.org/document/5995316 (accessed 3 May 2025).</ref>
microsoft.com
<ref name="Zhang2012">Zhang, Z. “Microsoft Kinect Sensor and Its Effect.” *IEEE MultiMedia*, vol. 19, no. 2, 2012, pp. 4–10. DOI: 10.1109/MMUL.2012.24. Available at: https://dl.acm.org/doi/10.1109/MMUL.2012.24 (accessed 3 May 2025).</ref>
.</ref>
<ref name="OpenNI2013">OpenNI Foundation. *OpenNI 1.5.2 User Guide*, 2010. PDF. Available at: https://www.cs.rochester.edu/courses/577/fall2011/kinect/openni-user-guide.pdf (accessed 3 May 2025).</ref>
<ref name="Zhang2012">Zeng, Wenjun; Zhang, Zhengyou (2012). Microsoft Kinect Sensor and Its Effect. IEEE MultiMedia, 19(2):4–10​
<ref name="Pfister2022">Pfister, A.; West, N.; et al. “Applications and Limitations of Current Markerless Motion Capture Methods for Clinical Gait Biomechanics.” *Journal of Biomechanics*, vol. 129, 2022, Article 110844. DOI: 10.1016/j.jbiomech.2021.110844. Available at: https://pubmed.ncbi.nlm.nih.gov/35237469/ (accessed 3 May 2025).</ref>
microsoft.com
<ref name="Pham2004">Pham, A. “EyeToy Springs From One Man’s Vision.” *Los Angeles Times*, 18 Jan 2004. Available at: https://www.latimes.com/archives/la-xpm-2004-jan-18-fi-eyetoy18-story.html (accessed 3 May 2025).</ref>
.</ref>
<ref name="Microsoft2010">Microsoft News Center. “The Future of Entertainment Starts Today as Kinect for Xbox 360 Leaps and Lands at Retailers Nationwide.” Press release, 4 Nov 2010. Available at: https://news.microsoft.com/2010/11/04/the-future-of-entertainment-starts-today-as-kinect-for-xbox-360-leaps-and-lands-at-retailers-nationwide/ (accessed 3 May 2025).</ref>
<ref name="OpenNI2013">OpenNI Foundation (2010). OpenNI 1.5.2 User Guide. “OpenNI is an open source API that is publicly available at www.OpenNI.org.”&#8203;:contentReference[oaicite:2]{index=2}.</ref>
<ref name="Lange2011">Lange, B.; Rizzo, A.; Chang, C.-Y.; Suma, E. A.; Bolas, M. “Markerless Full Body Tracking: Depth‑Sensing Technology within Virtual Environments.” *Interservice/Industry Training, Simulation and Education Conference (I/ITSEC)*, 2011. PDF. Available at: http://ict.usc.edu/pubs/Markerless%20Full%20Body%20Tracking-%20Depth-Sensing%20Technology%20within%20Virtual%20Environments.pdf (accessed 3 May 2025).</ref>
<ref name="Pfister2022">Pfister, Andreas; West, Niels; et al. (2022). Applications and limitations of current markerless motion capture methods for clinical gait biomechanics. Journal of Biomechanics, 129:110844. “While markerless temporospatial measures generally appear equivalent to marker-based systems, joint center locations and joint angles are not yet sufficiently accurate for clinical applications.”​
<ref name="Microsoft2017">Good, O. S. “Kinect Is Officially Dead. Really. Officially. It’s Dead.” *Polygon*, 25 Oct 2017. Available at: https://www.polygon.com/2017/10/25/16543192/kinect-discontinued-microsoft-announcement (accessed 3 May 2025).</ref>
pmc.ncbi.nlm.nih.gov
 
.</ref>
<ref name="Pham2004">Pham, Alex (2004-01-18). EyeToy Springs From One Man’s Vision. Los Angeles Times. “the $50 EyeToy, a tiny camera that enables video game players to control the action by jumping around and waving their arms…”​
latimes.com
.</ref>
<ref name="Microsoft2010">Microsoft News Center (2010-11-04). The Future of Entertainment Starts Today as Kinect for Xbox 360 Leaps and Lands at Retailers Nationwide. “Kinect for Xbox 360 lets you use your body and voice to play your favorite games... No buttons. No barriers. Just you.”​
news.microsoft.com
.</ref>
<ref name="Lange2011">Lange, Belinda; Rizzo, Skip; Chang, Chien-Yen; Suma, Evan A.; Bolas, Mark (2011). Markerless Full Body Tracking: Depth-Sensing Technology within Virtual Environments. Proc. I/ITSEC 2011. “FAAST is middleware to facilitate integration of full-body control with virtual reality applications... (e.g. Microsoft Kinect).”​
illusioneering.cs.umn.edu
.</ref>
<ref name="Microsoft2017">Good, Owen S. (2017-10-25). Kinect is officially dead. Really. Officially. It’s dead. Polygon. “Microsoft has confirmed it is no longer manufacturing Kinect and none will be sold once retailers run out.”​
polygon.com
.</ref>