TransPose is a Transformer-based model for human pose estimation that aims to improve explainability. It applies a Transformer encoder to feature maps extracted from an image and predicts keypoint heatmaps. Because self-attention relates every position to every other position, its attention maps can be visualized to show which image locations influence each joint the most. The model achieves accuracy comparable to CNN models with 73% fewer parameters and runs faster.
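To make the idea of reading dependencies off self-attention more concrete, here is a minimal sketch in plain Python. It is not the TransPose implementation: it assumes a single attention head, identity query/key projections (no learned weights), and a toy 2x3 feature map with two channels. The function names `flatten_feature_map`, `attention_weights`, and `attention_map_for` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_feature_map(fmap):
    """Turn an H x W x C nested list into a list of H*W token vectors."""
    return [pixel for row in fmap for pixel in row]


def attention_weights(tokens):
    """Scaled dot-product attention weights.

    Assumption: queries and keys are the tokens themselves
    (identity projections), unlike a trained Transformer layer.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    return weights


def attention_map_for(weights, y, x, height, width):
    """Reshape the attention row of position (y, x) back into an H x W grid."""
    row = weights[y * width + x]
    return [row[r * width:(r + 1) * width] for r in range(height)]


# Toy feature map (H=2, W=3, C=2); the values are made up for illustration.
fmap = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]],
]
H, W = 2, 3

tokens = flatten_feature_map(fmap)
weights = attention_weights(tokens)

# Which locations does the top-left position attend to most?
amap = attention_map_for(weights, 0, 0, H, W)
for grid_row in amap:
    print(["%.3f" % w for w in grid_row])
```

Each row of `weights` sums to 1, so reshaping one row back to the spatial grid gives a map of how strongly that position depends on every other location. In a trained model the same reshaping would be applied to learned attention weights rather than to these identity-projection scores.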
TransPose: Towards Explainable Human Pose Estimation by Transformer
1. TransPose: Towards Explainable Human Pose Estimation by Transformer
6th All-Japan Computer Vision Study Group
Transformer Reading Session
2021/04/18
@yasutomo57jp
https://yasutomo57jp.github.io