Reconstructing Hands in 3D with Transformers

Georgios Pavlakos1     Dandan Shan2     Ilija Radosavovic1     Angjoo Kanazawa1     David Fouhey3     Jitendra Malik1
1University of California, Berkeley   2University of Michigan   3New York University
CVPR 2024

Paper
Data
Code
Hugging Face Demo
Colab Demo



More Results

For these videos, we detect hands using a whole-body keypoint detector. We have also experimented with a hand bounding-box detector. The whole-body detector is more robust for smaller hands when part of the rest of the body is visible; because it uses body context, it is also more accurate at estimating handedness (left vs. right). However, it requires running a person detector first. The hand bounding-box detector, on the other hand, is more accurate for larger hands and needs no additional preprocessing (i.e., no human detector), but it can miss smaller hands and is more prone to left/right classification errors. We will provide both alternatives with our demo code.
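In the whole-body pipeline described above, one intermediate step is turning detected hand keypoints into a crop box for the reconstruction model. As a minimal sketch (not the authors' actual code; the function name, confidence threshold, and padding factor are illustrative assumptions), one can fit a padded square box around the confidently detected hand keypoints:

```python
import numpy as np

def hand_bbox_from_keypoints(keypoints, conf, conf_thresh=0.5, scale=1.2):
    """Fit a square box around confident hand keypoints (illustrative sketch).

    keypoints: (N, 2) array of (x, y) pixel coordinates.
    conf:      (N,) array of per-keypoint confidence scores.
    Returns [x0, y0, x1, y1], or None if too few keypoints are confident
    (e.g., the hand is too small or occluded, so the detector misses it).
    """
    valid = keypoints[conf > conf_thresh]
    if len(valid) < 3:  # not enough evidence for a reliable box
        return None
    x0, y0 = valid.min(axis=0)
    x1, y1 = valid.max(axis=0)
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Square box, padded by `scale` so the full hand fits in the crop.
    size = max(x1 - x0, y1 - y0) * scale
    return np.array([cx - size / 2, cy - size / 2,
                     cx + size / 2, cy + size / 2])
```

A dedicated hand bounding-box detector would return such boxes directly, skipping this step, which is why it needs no person detector but also cannot use body context to resolve left/right.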



Citation



Acknowledgements

This research was supported by the DARPA Machine Common Sense program, ONR MURI, as well as BAIR/BDD sponsors. We thank members of the BAIR community for helpful discussions. We also thank StabilityAI for supporting us through a compute grant. DF and DS were supported by the National Science Foundation under Grant No. 2006619. This webpage template was borrowed from some colorful folks. Music credits: SLAHMR. Icons: Flaticon.