{{Infobox technology
| name = Hand tracking
| image = <!-- omit if no free image available -->
| caption = A user interacting with a virtual interface using optical hand tracking
| type = Computer vision / Gesture recognition / Natural user interface technology
| invention_date = 1977 (Sayre Glove)
| inventor = Daniel Sandin, Thomas DeFanti, Richard Sayre
| fields = Virtual reality, Augmented reality, Mixed reality
| related = Gesture recognition
}}
'''Hand tracking''' is a [[computer vision]]-based technology used in [[virtual reality]] (VR), [[augmented reality]] (AR), and [[mixed reality]] (MR) systems to detect, track, and interpret the position, orientation, and movements of a user's hands and fingers in real time. Unlike traditional input methods such as [[motion controller|controllers]] or gloves, hand tracking enables controller-free, natural interactions by leveraging cameras, sensors, and artificial intelligence (AI) algorithms to map hand poses into virtual environments.<ref name="Frontiers2021" /> This technology enhances immersion, presence, and usability in [[extended reality]] (XR) applications by allowing users to perform gestures like pointing, grabbing, pinching, and swiping directly with their bare hands.
Hand tracking systems typically operate using optical methods, such as [[infrared]] (IR) illumination and monochrome cameras, or visible-light cameras integrated into [[head-mounted display]]s (HMDs). Modern implementations achieve low-latency tracking (e.g., 10–70 ms) with high accuracy, supporting up to 27 degrees of freedom (DoF) per hand to capture complex articulations.<ref name="UltraleapDocs" /> Because the human hand has approximately 27 degrees of freedom, accurate tracking remains a complex challenge.<ref name="HandDoF" /> The technology has evolved from early wired prototypes in the 1970s to sophisticated, software-driven solutions integrated into consumer devices like the [[Meta Quest]] series, [[Microsoft HoloLens 2]], and [[Apple Vision Pro]].
Hand tracking is a cornerstone of [[human-computer interaction]] in [[spatial computing]]. Modern systems commonly provide a per-hand skeletal pose (e.g., joints and bones), expose this data to applications through standard APIs (such as [[OpenXR]] and [[WebXR]]), and pair it with higher-level interaction components (e.g., poke, grab, raycast) for robust user experiences across devices.<ref name="OpenXR11" /><ref name="WebXRHand" />
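For illustration, the following minimal sketch shows how an application might read per-joint hand poses through the [[WebXR]] Hand Input API. It assumes a session requested with the optional "hand-tracking" feature and the availability of WebXR type definitions (e.g., the @types/webxr package); it is a sketch under those assumptions, not the implementation used by any particular headset.

<syntaxhighlight lang="typescript">
// Minimal sketch (assumptions noted above): read the index fingertip pose of
// each tracked hand during a WebXR frame callback.
function logIndexFingertips(frame: XRFrame, refSpace: XRReferenceSpace): void {
  for (const inputSource of frame.session.inputSources) {
    const hand = inputSource.hand;            // XRHand: map of joint names to XRJointSpace
    if (!hand) continue;                      // not a tracked hand (e.g., a controller)

    const tipSpace = hand.get("index-finger-tip");
    if (!tipSpace) continue;

    // getJointPose returns null while the joint is occluded or not yet tracked.
    const pose = frame.getJointPose(tipSpace, refSpace);
    if (pose) {
      const { x, y, z } = pose.transform.position;
      console.log(
        `${inputSource.handedness} index tip at ` +
        `(${x.toFixed(3)}, ${y.toFixed(3)}, ${z.toFixed(3)}) m, ` +
        `joint radius ${pose.radius?.toFixed(3)} m`
      );
    }
  }
}
</syntaxhighlight>

In practice such a function would run inside the session's requestAnimationFrame loop; OpenXR exposes an analogous per-joint pose set through the XR_EXT_hand_tracking extension.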
== History ==
=== Early Developments (1970s–1990s) ===
The foundational milestone occurred in 1977 with the invention of the '''Sayre Glove''', a wired data glove developed by electronic visualization pioneer Daniel Sandin and computer graphics researcher Thomas DeFanti at the University of Illinois at Chicago's Electronic Visualization Laboratory (EVL). Inspired by an idea from colleague Rich Sayre, the glove used optical flex sensors—light emitters paired with photocells embedded in the fingers—to measure joint angles and finger bends. Light intensity variations were converted into electrical signals, enabling basic gesture recognition and hand posture tracking for early VR simulations.<ref name="SayreGlove" /><ref name="SenseGlove" /> This device, considered the first data glove, established the principle of measuring finger flexion for computer input.
In 1983, Gary Grimes of Bell Labs developed the '''Digital Data Entry Glove''', a more sophisticated system patented as an alternative to keyboard input. This device integrated flex sensors, touch sensors, and tilt sensors to recognize unique hand positions corresponding to alphanumeric characters, specifically gestures from the American Sign Language manual alphabet.<ref name="BellGlove" />
The 1980s saw the emergence of the first commercially viable data gloves, largely driven by the work of Thomas Zimmerman and [[Jaron Lanier]]. Zimmerman patented an optical flex sensor in 1982 and later co-founded VPL Research with Lanier in 1985. VPL Research became the first company to sell VR hardware, including the '''DataGlove''', released commercially in 1987.<ref name="VPL" /><ref name="VirtualSpeech" /> The DataGlove used fiber optic cables to measure finger bends and was typically paired with a [[Polhemus]] magnetic tracking system for positional data, becoming an iconic symbol of early VR technology.
In 1989, Mattel released the '''[[Power Glove]]''' for the Nintendo Entertainment System, a low-cost consumer device based on technology licensed from VPL Research, which used resistive ink flex sensors and ultrasonic emitters for positional tracking. This was the first affordable, mass-market data glove for consumers, popularizing the concept of gestural control in gaming.<ref name="PowerGlove" />
Throughout the 1990s, hand tracking advanced with the integration of more sophisticated sensors into VR systems. Researchers at MIT's Media Lab built on the Sayre Glove, incorporating electromagnetic and inertial measurement unit (IMU) sensors for improved positional tracking. These systems, often tethered to workstations, supported rudimentary interactions but were limited by wiring and low resolution. The '''CyberGlove''' (early 1990s) by Virtual Technologies used thin foil strain gauges sewn into fabric to measure up to 22 joint angles, becoming a high-precision glove used in research and professional applications.<ref name="AvatarAcademy" />
=== 2010s: Optical Tracking and Controller-Free Era ===
A pivotal shift occurred in 2010 with the founding of '''Leap Motion''' (later [[Ultraleap]]) by Michael Buckwald, David Holz, and John Gibb, who aimed to create affordable, high-precision optical hand tracking. In 2010, Microsoft also released the '''[[Kinect]]''' sensor for Xbox 360, which popularized the use of depth cameras for full-body and hand skeleton tracking in gaming and research.<ref name="Kinect" />
The '''Leap Motion Controller''', released in 2013, was a small USB peripheral that used two IR cameras and three IR LEDs to track both hands at up to 200 Hz within a roughly 60 cm × 60 cm interactive zone above the device, with fine precision (down to approximately 0.01 mm according to specifications).<ref name="LeapWiki" /><ref name="LeapWikipedia" /> The device helped popularize controller-free, gesture-based input: developers and enthusiasts mounted Leap Motion sensors onto VR headsets to experiment with hand input in VR, spurring further interest in the technique.
In 2016, Leap Motion's '''Orion''' software update improved robustness against occlusion and lighting variations, boosting adoption in VR development.<ref name="Orion" /> The company was acquired by Ultrahaptics in 2019, rebranding as [[Ultraleap]] and expanding into mid-air haptics.<ref name="UltrahapticsAcq" />
On the AR side, '''Microsoft HoloLens''' (first version, 2016) included simple gesture input such as "air tap" and "bloom" using camera-based hand recognition, though with limited gestures rather than full tracking.
By the late 2010s, [[inside-out tracking]] cameras became standard in new VR and AR hardware, and companies began leveraging them for hand tracking. The '''[[Oculus Quest]]''', a standalone VR headset released in 2019, initially launched with traditional controller input. At Oculus Connect 6 in September 2019, hand tracking was announced, and in late 2019 an experimental update introduced controller-free hand tracking using its built-in cameras.<ref name="Meta2019" /><ref name="SpectreXR2022" /> This made the Quest one of the first mainstream VR devices to offer native hand tracking to consumers: the inside-out system used the headset's monochrome cameras and AI to deliver surprisingly robust controller-free interaction, albeit with some limitations during fast motion and at certain hand angles.
The '''Microsoft HoloLens 2''' (2019) greatly expanded hand tracking capabilities with fully articulated tracking, allowing users to touch and grasp virtual elements directly. The system tracked 25 points of articulation per hand, demonstrating the benefit of more natural interactions for enterprise AR use cases and moving beyond the limited "air tap" gestures of its predecessor.<ref name="Develop3D2019" /><ref name="DirectManipulation" />
=== 2020s: AI-Driven Refinements and Mainstream Integration ===
The 2020s brought AI enhancements and broader integration. Meta's '''MEgATrack''' (2020), deployed on Quest, used four fisheye monochrome cameras and neural networks, achieving 60 Hz tracking on PC and 30 Hz on mobile hardware with low jitter and large working volumes.<ref name="MEgATrack" />
Ultraleap's '''Gemini''' software update (2021) represented a major overhaul, delivering more robust two-hand interactions and faster initialization for stereo-IR modules and integrated OEM headsets.<ref name="Gemini" />
Meta continued improving the feature with successive software updates. The '''Hands 2.1''' update (2022) and '''Hands 2.2''' update (2023) reduced apparent latency and improved fast-motion handling and recovery after tracking loss.<ref name="MetaHands21" /><ref name="MetaHands22" /> Subsequent devices like the '''[[Meta Quest Pro]]''' (2022) and '''[[Meta Quest 3]]''' (2023) included more advanced camera systems and neural processing to further refine hand tracking.
In the 2020s, hand tracking became an expected feature in many XR devices. An analysis by SpectreXR noted that the percentage of new VR devices supporting hand tracking jumped from around 21% in 2021 to 46% in 2022, as more manufacturers integrated the technology.<ref name="SpectreXR2023" /> At the same time, the cost barrier dropped dramatically, with the average price of hand-tracking-capable VR headsets falling by approximately 93% from 2021 to 2022, making the technology far more accessible.<ref name="SpectreXR2023" />
Another milestone was Apple's introduction of the '''[[Apple Vision Pro]]''' (released 2024), which relies on hand tracking along with [[eye tracking]] as the primary input method for a spatial computer, completely doing away with handheld controllers. Apple's implementation allows users to make micro-gestures like pinching fingers at waist level, tracked by downward-facing cameras, which—combined with eye gaze—lets users control the interface with minimal physical effort.<ref name="AppleGestures" /><ref name="UploadVR2023" /> This high-profile adoption has been seen as a strong endorsement of hand tracking for mainstream XR interaction.
By 2025, hand tracking is standard in many XR devices, with latencies under 70 ms and applications ranging from gaming to medical simulation.
1. '''Detection''': Find hands in the camera frame (often with a palm detector)
2. '''Landmark regression''': Predict 2D/3D keypoints for wrist and finger joints (commonly 21 landmarks per hand in widely used models)<ref name="MediaPipeHands" />
3. '''Pose / mesh estimation''': Fit a kinematic skeleton or hand mesh consistent with human biomechanics for stable interaction and animation
4. '''Temporal smoothing & prediction''': Filter jitter and manage short occlusions for responsive feedback (see the sketch of this stage after the list)
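As an illustrative example of the temporal-smoothing stage (step 4), the sketch below implements a One Euro filter, a commonly used low-latency smoothing technique for interactive tracking. The parameter defaults are illustrative assumptions; production runtimes may use different or proprietary filters.

<syntaxhighlight lang="typescript">
// Minimal sketch of step 4: a One Euro filter, which adapts its smoothing
// strength to hand speed (more smoothing when still, less lag when moving fast).
class OneEuroFilter {
  private prevX: number | null = null;
  private prevDx = 0;

  constructor(
    private minCutoff = 1.0, // Hz: lower values smooth more when the joint is nearly still
    private beta = 0.05,     // speed coefficient: higher values reduce lag during fast motion
    private dCutoff = 1.0    // Hz: cutoff used when smoothing the derivative estimate
  ) {}

  // Smoothing factor of an exponential low-pass filter at the given cutoff.
  private static alpha(cutoff: number, dt: number): number {
    const tau = 1 / (2 * Math.PI * cutoff);
    return 1 / (1 + tau / dt);
  }

  // x: raw coordinate for one joint axis; dt: seconds since the previous sample.
  filter(x: number, dt: number): number {
    if (this.prevX === null) {
      this.prevX = x;
      return x;
    }
    // Estimate and smooth the speed, then adapt the cutoff to it.
    const dx = (x - this.prevX) / dt;
    const aD = OneEuroFilter.alpha(this.dCutoff, dt);
    const edx = aD * dx + (1 - aD) * this.prevDx;
    const cutoff = this.minCutoff + this.beta * Math.abs(edx);

    const a = OneEuroFilter.alpha(cutoff, dt);
    const smoothed = a * x + (1 - a) * this.prevX;

    this.prevX = smoothed;
    this.prevDx = edx;
    return smoothed;
  }
}

// Usage: one filter per coordinate per joint, fed once per camera frame (here 60 Hz).
const indexTipX = new OneEuroFilter();
console.log(indexTipX.filter(0.100, 1 / 60), indexTipX.filter(0.104, 1 / 60));
</syntaxhighlight>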
Hand tracking can be classified by the origin of the tracking sensors:
* '''[[Inside-out tracking]]''' (egocentric): Cameras or sensors are mounted on the user's own device (headset or glasses), watching the hands from the user's perspective. This is common in standalone VR headsets and AR glasses, as it allows free movement without external setup: the on-board cameras observe the hands, and changes in the camera images over time are analyzed to infer hand movements and gestures. The downside is that hands can only be tracked when in view of the device's sensors, and tracking from a single perspective can lose some accuracy (especially for motions where one hand occludes the other).<ref name="VRExpert2023" />
* '''[[Outside-in tracking]]''' (exocentric): External sensors (such as multiple camera towers or depth sensors placed in the environment) track the user's hands (and body) within a designated area or "tracking volume." With sensors capturing from multiple angles, outside-in tracking can achieve very precise and continuous tracking, since the hands are less likely to be occluded from all viewpoints at once. This method was more common in earlier VR setups and research settings (for example, using external motion capture cameras). Outside-in systems are less common in consumer AR/VR today due to setup complexity, but they remain in use for certain professional or room-scale systems requiring high fidelity.<ref name="VRExpert2023" />
Some systems augment or replace optical tracking with active depth sensing such as [[LiDAR]] or structured light infrared systems. These emit light (laser or IR LED) and measure its reflection to more precisely determine the distance and shape of hands, even in low-light conditions. LiDAR-based hand tracking can capture 3D positions with high precision and is less affected by ambient lighting or distance than pure camera-based methods.<ref name="VRExpert2023" />
Ultraleap's hand tracking module (e.g., the Stereo IR 170 sensor) projects IR light and uses two IR cameras to track hands in 3D, allowing for robust tracking under various lighting conditions. This module has been integrated into devices like the Varjo VR-3/XR-3 and certain [[Pico]] headsets to provide built-in hand tracking.<ref name="SoundxVision" /><ref name="VRExpert2023" /> Active depth systems (e.g., [[time-of-flight camera|Time-of-Flight]] or [[structured light]]) project or emit IR to recover per-pixel depth, improving robustness in low light and during complex hand poses. Several headsets integrate IR illumination to make hands stand out for monochrome sensors. Some [[mixed reality]] devices also include dedicated scene depth sensors that aid perception and interaction.
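As a brief illustration of how a per-pixel depth measurement becomes a 3D hand point, the sketch below back-projects a fingertip pixel through an idealized pinhole camera model. The intrinsic parameters are hypothetical placeholders, not values from any specific sensor.

<syntaxhighlight lang="typescript">
// Minimal sketch: back-project a pixel with measured depth into a 3D point
// using the pinhole camera model.
interface Intrinsics {
  fx: number; fy: number; // focal lengths in pixels
  cx: number; cy: number; // principal point in pixels
}

function deproject(u: number, v: number, depthMeters: number, k: Intrinsics) {
  // x = (u - cx) * Z / fx,  y = (v - cy) * Z / fy,  z = Z
  return {
    x: ((u - k.cx) * depthMeters) / k.fx,
    y: ((v - k.cy) * depthMeters) / k.fy,
    z: depthMeters,
  };
}

// Example: a fingertip detected at pixel (400, 260) with 0.45 m of measured depth.
const k: Intrinsics = { fx: 500, fy: 500, cx: 320, cy: 240 };
console.log(deproject(400, 260, 0.45, k)); // ≈ { x: 0.072, y: 0.018, z: 0.45 }
</syntaxhighlight>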
Optical hand tracking is generally affordable to implement since it can leverage the same camera hardware used for environment tracking or passthrough video. However, its performance can be affected by the cameras' field of view, lighting conditions, and frame rate. If the user's hands move outside the view of the cameras or lighting is poor, tracking quality will suffer. Improvements in computer vision and AI have steadily increased the accuracy and robustness of optical hand tracking, enabling features like two-hand interactions and fine finger gesture detection.<ref name="VRExpert2023" />
{| class="wikitable"
! Platform !! Sensing approach !! Interaction features !! Notes
|-
| [[Meta Quest]] series (Quest, Quest 2, Quest 3, Quest Pro) || Inside-out mono/IR cameras; AI hand pose; runtime prediction || Direct manipulation, raycast, standardized system pinches; continual "Hands 2.x" latency and robustness improvements || Hand tracking first shipped (experimental) in 2019 on Quest; later updates reduced latency and improved fast motion handling<ref name="Meta2019" /><ref name="MetaHands22" /><ref name="MetaHands21" />
|-
| [[Microsoft HoloLens 2]] || Depth sensing + RGB; fully articulated hand joints || "Direct manipulation" (touch, grab, press) with tactile affordances (fingertip cursor, proximity cues) || Millimeter-scale fingertip errors reported vs. Vicon in validation studies<ref name="DirectManipulation" /><ref name="HL2Accuracy" />
|-
| [[Apple Vision Pro]] || Multi-camera, IR illumination, [[LiDAR]] scene sensing; eye-hand fusion || "Look to target, pinch to select", flick to scroll; relaxed, low-effort micro-gestures || Hand + eye as primary input paradigm in visionOS<ref name="AppleGestures" />
|-
| [[Ultraleap]] modules (e.g., Controller 2, Stereo IR) || Stereo IR + LEDs; skeletal model || Robust two-hand support; integrations for Unity/Unreal/OpenXR || Widely embedded in enterprise headsets (e.g., Varjo XR-3/VR-3)<ref name="UltraleapDocs" /><ref name="VarjoUltraleap" />
|}
=== User Interface Navigation ===
Basic system UI controls can be operated by hand gestures. Users can point at menu items, pinch to click on a button, grab and drag virtual windows, or make a fist to select tools. Hand tracking allows for "touchless" interaction with virtual interfaces in a way that feels similar to interacting with physical objects or touchscreens, lowering the learning curve for new users.<ref name="VarjoSupport" />
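A minimal sketch of how a pinch "click" can be derived from tracked joint positions is shown below. The thumb-tip/index-tip distance thresholds are illustrative assumptions; shipping runtimes typically use more sophisticated, tuned gesture recognizers.

<syntaxhighlight lang="typescript">
// Minimal sketch: derive a pinch "click" from two tracked joint positions using
// hysteresis so the state does not flicker near the threshold.
type Vec3 = { x: number; y: number; z: number };

const distance = (a: Vec3, b: Vec3): number =>
  Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);

class PinchDetector {
  private pinching = false;

  constructor(
    private pressDistance = 0.015,  // meters: fingertips closer than this start a pinch
    private releaseDistance = 0.03  // meters: fingertips farther than this end the pinch
  ) {}

  // Call once per frame with the thumb-tip and index-tip positions from any runtime.
  update(thumbTip: Vec3, indexTip: Vec3): boolean {
    const d = distance(thumbTip, indexTip);
    if (!this.pinching && d < this.pressDistance) this.pinching = true;
    else if (this.pinching && d > this.releaseDistance) this.pinching = false;
    return this.pinching;
  }
}

// Example: fingertips 1 cm apart register as a pinch.
const pinch = new PinchDetector();
console.log(pinch.update({ x: 0, y: 0, z: 0 }, { x: 0.01, y: 0, z: 0 })); // true
</syntaxhighlight>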
== Applications ==
* '''System UI & Productivity''': Controller-free navigation, window management, and typing/pointing surrogates in spatial desktops. Natural file manipulation, multitasking across virtual screens, and interface control without handheld devices.<ref name="AppleGestures" />
* '''Gaming & Entertainment''': Titles such as ''Hand Physics Lab'' showcase free-hand puzzles and physics interactions using optical hand tracking on Quest.<ref name="HPL_RoadToVR" /> Games and creative applications use hand interactions—for example, a puzzle game might let the player literally reach out and grab puzzle pieces in VR, and users can play a virtual piano or sculpt virtual pottery.
* '''Training & Simulation''': Natural hand use improves ecological validity for assembly, maintenance, and surgical rehearsal in enterprise, medical, and industrial contexts.<ref name="Frontiers2021" /> Workers can practice complex procedures in safe virtual environments, developing muscle memory that transfers to real-world tasks.
Measured performance varies by device and algorithm, but peer-reviewed evaluations provide useful bounds; a sketch of the basic error metric used in such studies follows the list:
* On '''[[Meta Quest 2]]''', a methodological study reported an average fingertip positional error of approximately 1.1 cm and temporal delay of approximately 45 ms for its markerless optical hand tracking, with approximately 9.6° mean finger-joint error under test conditions.<ref name="Quest2Accuracy" />
* On '''[[Microsoft HoloLens 2]]''', a 2024 study comparing to a Vicon motion-capture reference found millimeter-scale fingertip errors (approximately 2–4 mm) in a tracing task, with good agreement for pinch span and many grasping joint angles.<ref name="HL2Accuracy" />
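For illustration, the sketch below computes the kind of mean fingertip positional error reported in such studies, given time-aligned tracked and reference (e.g., motion-capture) samples. Real evaluations additionally align coordinate frames and compensate for temporal delay, which is omitted here.

<syntaxhighlight lang="typescript">
// Minimal sketch: mean Euclidean fingertip error between tracked samples and
// time-aligned reference samples (e.g., from marker-based motion capture).
type Point3 = [number, number, number];

function meanFingertipError(tracked: Point3[], reference: Point3[]): number {
  if (tracked.length === 0 || tracked.length !== reference.length) {
    throw new Error("expected non-empty, equal-length, time-aligned sample arrays");
  }
  let sum = 0;
  for (let i = 0; i < tracked.length; i++) {
    const [tx, ty, tz] = tracked[i];
    const [rx, ry, rz] = reference[i];
    sum += Math.hypot(tx - rx, ty - ry, tz - rz); // per-sample positional error
  }
  return sum / tracked.length; // meters
}

// Example: two samples, each 1 cm from the reference, give a 0.01 m mean error.
console.log(meanFingertipError(
  [[0.01, 0, 0], [0.11, 0, 0]],
  [[0.00, 0, 0], [0.10, 0, 0]]
));
</syntaxhighlight>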
=== Neural Interfaces ===
Meta's [[electromyography]] (EMG) wristband program detects neural signals controlling hand muscles, enabling effectively zero or even negative latency by predicting movements before they occur. Mark Zuckerberg stated in 2024 that these wristbands will "ship in the next few years" as primary input for AR glasses.<ref name="neural" /> Companies like Ultraleap have raised significant investments to advance hand tracking accuracy and even combine it with mid-air haptics (using ultrasound to provide tactile feedback without touch).<ref name="SoundxVision" />
=== Advanced Haptics ===
=== Market Projections ===
The AR/VR market is projected to reach $214.82 billion by 2031 at a 31.70% compound annual growth rate (CAGR), with hand tracking as a key growth driver.<ref name="market" />
== See Also ==