Body tracking
Body tracking is the capture of the pose and movement of a user's body so that it can drive an avatar or interaction in virtual reality (VR) and augmented reality (AR). In consumer VR the head and the two hands are tracked directly by the headset and controllers, which gives three points of positional tracking; reproducing the rest of the body (torso, hips, knees and feet) requires either additional sensors worn on the body or software that estimates the missing joints from the available points.[1][2]
When tracking is extended beyond the head and hands to follow the whole skeleton, it is usually called full-body tracking (FBT). Body tracking draws on techniques from motion capture, a field developed for film, games and biomechanics, and adapts them to run in real time and at consumer cost for social VR, fitness and embodied interaction.[3]
How it works
The standard problem in consumer VR is that the headset reports head pose and the controllers report hand pose, leaving the rest of the body unknown. Two broad approaches add the missing information: measure more of the body with hardware, or infer it with software.[2][4]
Hardware-based body tracking places trackers at extra points (commonly the waist and the two feet) and reads each tracker's position and orientation. A skeleton fitted to those points, together with the user's measured body proportions, then poses the avatar.[5] Software-based body tracking takes the few points that are already tracked and applies inverse kinematics (IK), a method that computes the joint angles of a kinematic chain from the desired position of its end effectors (such as a hand or foot). IK lets a system place the elbow, shoulder, pelvis and legs so that the chain reaches the known head and hand positions, but the result is an estimate: with only three points the legs in particular are underdetermined, which can make feet slide or sitting and crouching look wrong.[2]
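As an illustration of the IK step described above, the following is a minimal sketch of an analytic two-bone solver in a 2D plane, using the law of cosines. It is a textbook construction, not the solver of any particular VR platform; the function names, the bone lengths and the target coordinates are assumptions chosen for the example.

```python
import math

def two_bone_ik(target_x, target_y, upper_len, lower_len, bend_sign=1.0):
    """Planar two-bone IK: return (root_angle, bend_angle) in radians.

    root_angle is the upper bone's angle from the +x axis; bend_angle is the
    lower bone's rotation relative to the upper bone (0 means a straight limb).
    bend_sign (+1 or -1) picks which way the middle joint bends.
    """
    dist_sq = target_x ** 2 + target_y ** 2
    # Law of cosines; clamping keeps acos defined when the target is out of reach
    # (too far gives a straight limb, too close gives a fully folded one).
    cos_bend = (dist_sq - upper_len ** 2 - lower_len ** 2) / (2 * upper_len * lower_len)
    cos_bend = max(-1.0, min(1.0, cos_bend))
    bend = bend_sign * math.acos(cos_bend)
    root = math.atan2(target_y, target_x) - math.atan2(
        lower_len * math.sin(bend), upper_len + lower_len * math.cos(bend))
    return root, bend

def chain_positions(root, bend, upper_len, lower_len):
    """Forward pass: positions of the middle joint and the end effector."""
    mid = (upper_len * math.cos(root), upper_len * math.sin(root))
    end = (mid[0] + lower_len * math.cos(root + bend),
           mid[1] + lower_len * math.sin(root + bend))
    return mid, end

# Hip at the origin, foot target 0.3 m forward and 0.6 m down (illustrative values).
target = (0.3, -0.6)
for sign in (1.0, -1.0):
    root, bend = two_bone_ik(*target, 0.45, 0.45, bend_sign=sign)
    knee, foot = chain_positions(root, bend, 0.45, 0.45)
    print(f"bend_sign={sign:+.0f} knee={knee} foot={foot}")
```

Both values of `bend_sign` reach the same foot target with the knee in a different place, which is a small-scale version of the ambiguity noted above: the end-effector position alone does not fix the middle joint. In 3D the knee or elbow can additionally swivel around the line from the root to the target, so a practical solver needs an extra hint or heuristic to choose a pose.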
A third approach replaces or supplements hand-written IK with learned models that generate plausible motion from sparse inputs. At CVPR 2023, Du and colleagues at Meta described AGRoL, a conditional diffusion model that synthesises full-body motion, including the lower body, from the head and two wrist signals available on standalone headsets. The model was trained on the AMASS motion-capture dataset, and the authors report more realistic motion than prior methods while running in real time.[4]
Methods
Body tracking systems used with VR and AR fall into several families that differ in the sensors they use and in whether they need external infrastructure.
| Method | How it senses the body | External infrastructure | Notes |
|---|---|---|---|
| Dedicated optical trackers (Lighthouse) | Pucks worn on the body are tracked by SteamVR Lighthouse base stations using infrared laser sweeps | Requires base stations | High precision; example is the Vive Tracker |
| Self-tracking (inside-out) pucks | Each tracker carries its own cameras and computes its position from the environment | None | Example is the VIVE Ultimate Tracker |
| Inertial (IMU) trackers and suits | Accelerometer, gyroscope and sometimes magnetometer in each unit measure orientation; positions are derived from body proportions | None | Examples are SlimeVR and Xsens MVN suits |
| Markerless camera estimation | A depth or RGB camera observes the user and a model labels body parts and fits a skeleton | A camera facing the user | Example is the Microsoft Kinect |
| Sparse-point inference | IK or a learned model extends the headset and controller points to the whole body | None | Example is Meta's Movement SDK |
Dedicated optical trackers
The most common consumer full-body setup attaches dedicated trackers to the waist and feet. The Vive Tracker 3.0 is a 75 g puck with up to 7.5 hours of battery life that is tracked by SteamVR Lighthouse base stations and pairs to the PC through a wireless dongle. HTC lists it at 149 USD; the number usable at once depends on available USB ports, application support and radio conditions.[6] Third-party Lighthouse trackers such as Tundra Trackers work the same way. Applications like VRChat accept up to eight extra trackers in addition to the headset and controllers, mapping them to feet, knees, hips, chest, elbows and shoulders. A calibration step, in which the user stands upright and looks forward, lets the system align the trackers to the avatar's proportions.[1]
To remove the base stations, HTC released the VIVE Ultimate Tracker, announced in November 2023, which carries two wide field-of-view cameras and an onboard chipset so each puck performs its own inside-out positional tracking. HTC recommends at least three trackers for body tracking, a USB-C dongle supports up to five at once, and the system works over OpenXR with SteamVR and with the Vive XR Elite.[7][8]
Inertial trackers and suits
Inertial systems put an IMU on each tracked body segment. The sensors measure orientation directly; absolute position is reconstructed from a biomechanical model of the body rather than measured optically, so the systems work outdoors and indoors with no cameras, emitters, markers, line-of-sight requirements or lighting constraints. Xsens MVN, a professional inertial motion-capture suit, is built on miniature inertial sensors, biomechanical models and sensor fusion algorithms.[9]
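To give a sense of what sensor fusion means for a single IMU, the sketch below uses a complementary filter, a common textbook technique that blends the integrated gyroscope rate with the tilt implied by the accelerometer's gravity reading. It is not the algorithm used by Xsens or any other product named here; the single-axis simplification, the constants and the function name are assumptions for illustration.

```python
import math

def complementary_filter(angle, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """Return an updated tilt angle (radians) about one axis.

    The gyro term is responsive but drifts; the accelerometer term is noisy
    but anchored to gravity. alpha sets how much the gyro path is trusted.
    """
    gyro_angle = angle + gyro_rate * dt
    accel_angle = math.atan2(accel_x, accel_z)
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle

# Simulated stationary sensor: tilted 0.2 rad, gyro reports a constant bias.
TRUE_TILT = 0.2    # rad (illustrative)
GYRO_BIAS = 0.01   # rad/s (illustrative)
DT = 0.01          # s, i.e. 100 Hz sampling
G = 9.81           # m/s^2
accel_x = G * math.sin(TRUE_TILT)
accel_z = G * math.cos(TRUE_TILT)

integrated = fused = TRUE_TILT
for _ in range(6000):  # 60 seconds of samples
    integrated += GYRO_BIAS * DT  # gyro-only integration accumulates the bias
    fused = complementary_filter(fused, GYRO_BIAS, accel_x, accel_z, DT)

print(f"gyro-only error: {abs(integrated - TRUE_TILT):.3f} rad")
print(f"fused error:     {abs(fused - TRUE_TILT):.4f} rad")
```

Gravity only constrains tilt, so this kind of correction does nothing for rotation about the vertical axis. That is one reason some units add a magnetometer, as noted in the table above.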
In the consumer space, SlimeVR is a set of open hardware and open-source software for full-body tracking that needs no base stations. Each tracker measures its own rotation with an IMU and sends the data to a server on the PC or phone, which combines the trackers with the user's proportions and the headset position to place the body parts; five trackers (one on each thigh, one on each ankle and one at the waist) are enough for a basic setup.[5][10] IMU error accumulates as drift, which inertial setups counter with periodic resets and calibration; SlimeVR's "Autobone" routine separately estimates bone lengths from recorded movement.[11]
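The idea of placing body parts from rotations plus proportions can be sketched as forward kinematics: each bone is a fixed-length vector rotated by its tracker's orientation and chained from the joint above it. This is a simplified illustration under stated assumptions, not the SlimeVR server's actual solver; the hip position is taken as given (a real system would chain down from the headset), and the axis convention, bone lengths and function names are hypothetical.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    s = math.sin(angle / 2) / norm
    return (math.cos(angle / 2), ax * s, ay * s, az * s)

def rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q.xyz, v)
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    # v' = v + w * t + cross(q.xyz, t)
    return (vx + w * tx + (y * tz - z * ty),
            vy + w * ty + (z * tx - x * tz),
            vz + w * tz + (x * ty - y * tx))

DOWN = (0.0, -1.0, 0.0)  # assumed rest direction of a leg bone (y is up)

def place_leg(hip_pos, thigh_q, shin_q, thigh_len, shin_len):
    """Chain two bones from the hip: return (knee_pos, ankle_pos)."""
    thigh = rotate(thigh_q, tuple(c * thigh_len for c in DOWN))
    knee = tuple(h + t for h, t in zip(hip_pos, thigh))
    shin = rotate(shin_q, tuple(c * shin_len for c in DOWN))
    ankle = tuple(k + s for k, s in zip(knee, shin))
    return knee, ankle

# Seated pose: thigh pitched 90 degrees about the x axis, shin hanging straight down.
thigh_q = quat_from_axis_angle((1.0, 0.0, 0.0), math.pi / 2)
shin_q = (1.0, 0.0, 0.0, 0.0)  # identity rotation
knee, ankle = place_leg((0.0, 0.5, 0.0), thigh_q, shin_q, 0.45, 0.45)
print("knee:", knee)
print("ankle:", ankle)
```

Because every position is derived from a rotation and a length, an error in either one moves all joints further down the chain, which is why accurate body proportions matter for rotation-only trackers.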
Markerless camera estimation
Markerless systems observe the user with a camera and recover the skeleton in software, with no worn hardware. Inexpensive depth cameras, in particular the Microsoft Kinect, made monocular full-body skeletal tracking practical for research and hobbyist VR. A typical pipeline segments the user from the background, classifies each pixel into a body part with a machine-learning model, fits a kinematic skeleton to the classified pixels and applies temporal filtering to reduce jitter. The main weaknesses are occlusion (joints hidden from a single camera) and non-frontal poses, which researchers address by fusing two or more cameras placed at different angles.[12] Markerless tracking improves comfort and cuts setup time compared with worn trackers, but generally at the cost of lower positional accuracy and higher latency.[12]
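The temporal filtering stage of such a pipeline can be as simple as an exponential moving average over each joint's position. The sketch below shows that idea only; it is not the filter used in the cited work, and the joint names, the `alpha` value and the sample data are assumptions.

```python
def smooth_joints(previous, measured, alpha=0.5):
    """Exponential moving average per joint.

    previous, measured: dicts mapping a joint name to an (x, y, z) tuple;
    previous may be None on the first frame. alpha lies in (0, 1]: higher
    values follow new measurements more closely, lower values smooth more.
    """
    smoothed = {}
    for name, pos in measured.items():
        old = previous.get(name) if previous else None
        if old is None:
            smoothed[name] = tuple(pos)  # first sighting: nothing to blend with
        else:
            smoothed[name] = tuple(
                alpha * m + (1.0 - alpha) * p for m, p in zip(pos, old))
    return smoothed

# Knee height jittering around 0.5 m across six frames (made-up values).
frames = [{"knee": (0.0, y, 0.0)} for y in (0.50, 0.54, 0.47, 0.53, 0.48, 0.52)]
state = None
for frame in frames:
    state = smooth_joints(state, frame, alpha=0.3)
    print(round(state["knee"][1], 3))
```

Lowering `alpha` suppresses more jitter but makes the skeleton lag further behind real movement, so the filter is one contributor to the latency trade-off of markerless tracking. In this sketch a joint missing from `measured`, for example one that is occluded, simply drops out of the state; a fuller system would need a policy for such gaps.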
Sparse-point inference on standalone headsets
On standalone headsets the trend has been to add body tracking without any extra hardware by inferring the body from the sensors the headset already has. Meta's Movement SDK exposes body, eye and face tracking through OpenXR; its body-tracking path reconstructs a body skeleton from three input points, the headset and the two hands or controllers, and is supported on Meta Quest 2, Quest 3, Quest 3S and Quest Pro.[13] At Meta Connect 2023 the company announced two features, which were released to developers on 20 December 2023 in the v60 update. Inside-Out Body Tracking (IOBT), exclusive to Meta Quest 3, uses the headset's side cameras to capture elbow, wrist and torso movement directly. Generative Legs, compatible with all Quest devices, uses an AI model to infer leg motion such as walking, jumping, ducking and squatting from the upper-body pose alone; Meta says this produces more natural avatars than traditional IK and, combined with IOBT, yields a plausible full body without external hardware.[14][15]
Origins in motion capture
Body tracking for VR descends from motion capture, the recording of human movement for animation and analysis. One of the earliest worn systems was the Data Suit from VPL Research in the late 1980s, which used sensors connected by fibre-optic cables to a computer that updated the figure 15 to 30 times a second.[16] Two families of professional mocap matured in parallel and still underpin body tracking today. Optical marker-based systems, made by companies such as OptiTrack and Vicon, surround a capture volume with high-speed infrared cameras that track reflective or light-emitting markers on a performer, giving high precision and low latency; this is an example of outside-in tracking adapted to the body. Inertial systems such as Xsens MVN instead put IMUs on the body and need no cameras at all.[3][9] Consumer VR body tracking takes these ideas and trims them to a handful of trackers, or to no trackers, so they fit a living room and a consumer budget.
Uses in VR and AR
The most visible use is embodiment in social VR. In applications like VRChat, standard three-point tracking (head and two hands) drives the upper body, and adding waist and foot trackers lets the legs and hips follow the real body, which the platform handles through its IK system and a calibration step that aligns the user to their avatar's proportions.[1][2] Better lower-body tracking improves the sense of presence and self-immersion, and supports actions such as dancing, kicking and sitting that three-point IK approximates poorly. Body tracking is also used in VR fitness and exercise titles, in dance and performance, in VTubing where a tracked performer animates a virtual character, and in research on embodiment and avatars. On AR and standalone hardware the same upper-body and leg-estimation methods let a headset animate a fuller avatar of the wearer for telepresence and social presence without any worn trackers.[14][11]
Current status
As of 2026 there is no single dominant method; the choice depends on the platform and the user's tolerance for extra hardware. PC VR users who want accurate legs still rely on worn trackers, either Lighthouse pucks such as the Vive Tracker 3.0 or inertial trackers such as SlimeVR. Self-tracking pucks like the VIVE Ultimate Tracker remove the base-station requirement for that audience.[6][7] Open hardware continues to develop: SlimeVR's Butterfly trackers, detailed during a Crowd Supply crowdfunding campaign in February 2026, are ultra-slim IMU trackers built on a Nordic nRF52833 microcontroller and a TDK ICM-45686 IMU, rated for over 48 hours of use and a latency under 15 ms, with a six-tracker set listed from 279 USD and shipping expected in August 2026.[11] On standalone headsets the direction is hardware-free body tracking driven by on-device cameras and learned models, such as Meta's IOBT and Generative Legs, which trade some accuracy for the convenience of needing nothing beyond the headset.[14][13]
References
- [1] "Full-Body Tracking". https://docs.vrchat.com/docs/full-body-tracking.
- [2] "Guides:Types of tracking". https://wiki.vrchat.com/wiki/Guides:Types_of_tracking.
- [3] "History of motion capture". https://www.xsens.com/motion-capture/history-of-motion-capture.
- [4] Du, Yuming; Kips, Robin; Pumarola, Albert; Starke, Sebastian; Thabet, Ali; Sanakoyeu, Artsiom (2023). "Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model". IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://arxiv.org/abs/2304.08577.
- [5] "SlimeVR Full-Body Trackers". https://slimevr.dev/.
- [6] "VIVE Tracker 3.0". https://shop-us.vive.com/products/vive-tracker-3-0-full-body-tracking.
- [7] "Vive Ultimate Tracker: Body Tracking Without Base Stations". 2023-11-29. https://www.uploadvr.com/vive-ultimate-tracker/.
- [8] Template:Cite news
- [9] "Motion Capture". https://www.xsens.com/products/motion-capture.
- [10] "Open Source SlimeVR Project Turns ESP8266 Microcontrollers Into Low-Cost Wireless Full-Body Trackers". https://www.hackster.io/news/open-source-slimevr-project-turns-esp8266-microcontrollers-into-low-cost-wireless-full-body-trackers-cb3b3d820e54.
- [11] "SlimeVR Butterfly Trackers - nRF52833-based, ultra-slim, full-body VR trackers offer up to 48h battery life". 2026-02-12. https://www.cnx-software.com/2026/02/12/slimevr-butterfly-trackers-nrf52833-based-ultra-slim-full-body-vr-trackers-offer-up-to-48h-battery-life/.
- [12] "Practical 3D human skeleton tracking based on multi-view and multi-Kinect fusion". 2021. https://dl.acm.org/doi/10.1007/s00530-021-00846-x.
- [13] "Movement SDK for OpenXR". https://developers.meta.com/horizon/documentation/native/android/move-overview/.
- [14] "Create More Natural Movements Using Inside-Out Body Tracking and Generative Legs". 2023-12-20. https://developers.meta.com/horizon/blog/inside-out-body-tracking-and-generative-legs/.
- [15] "Meta Quest 3 is the first headset with built-in upper body tracking". 2023-09-27. https://mixed-news.com/en/meta-quest-3-upper-body-tracking/.
- [16] "Motion capture suit". https://en.wikipedia.org/wiki/Motion_capture_suit.