Old slides used in a lab reading group. A summary of object detection from its early days to single-shot detectors.
Some text may render incorrectly because of missing fonts (the slides use Tsukushi A Round Gothic and Montserrat).
This document summarizes recent research on applying self-attention mechanisms from Transformers to domains other than language, such as computer vision. It discusses models that use self-attention for images, including ViT, DeiT, and T2T, which apply Transformers to divided image patches. It also covers more general attention modules like the Perceiver that aims to be domain-agnostic. Finally, it discusses work on transferring pretrained language Transformers to other modalities through frozen weights, showing they can function as universal computation engines.
This is the "ECCV 2020 Report", in which cvpaper.challenge summarizes the ECCV Oral papers.
ECCV 2020 Oral Papers: Complete Read-Through (2/2) [/cvpaperchallenge/eccv2020-22-238640597/1]
pp. 7-10 ECCV trends
pp. 12-81 3D geometry & reconstruction
pp. 82-137 Geometry, mapping and tracking
pp. 138-206 Image and Video synthesis
pp. 207-252 Learning methods
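cvpaper.challenge is an effort to reflect the current state of the computer vision field and to create new trends. We work on paper summaries, idea generation, discussion, implementation, and paper submission, and share all the knowledge we gain. The goal for 2020 is to "submit 30+ papers to top conferences".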
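26. What is data augmentation?
Data augmentation means increasing the number of training samples by applying appropriate image processing to the data.
Example: MNIST dataset [6]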
[6] Y. LeCun, et al., “Gradient-based learning applied to document recognition”, Proc. IEEE, 86(11), 1998.
[7] P. Y. Simard, et al., “Best practices for convolutional neural networks applied to visual document analysis”, ICDAR, 2003.
[Figure: an original image and its transformed versions under a random homography, elastic deformation [7], and thinning]
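The first transformation named on this slide, the random homography, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the slide author's implementation: the function names (`random_homography`, `warp`, `augment`), the `jitter` and `shift` parameters, and the nearest-neighbor sampling are all assumptions made for the example.

```python
import random


def random_homography(rng, jitter=0.05, shift=1.0):
    """Return a 3x3 matrix close to the identity.

    The parameterization (small random perturbations of the identity)
    is a hypothetical choice for this sketch.
    """
    j = lambda: rng.uniform(-jitter, jitter)
    s = lambda: rng.uniform(-shift, shift)
    return [
        [1 + j(), j(), s()],
        [j(), 1 + j(), s()],
        [j() / 10, j() / 10, 1.0],  # small perspective terms
    ]


def warp(image, h):
    """Backward-warp a grayscale image given as a list of rows.

    Each output pixel (x, y) takes the value of the input pixel at
    h @ (x, y, 1), rounded to the nearest pixel. Pixels that map
    outside the input are set to 0.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            w = h[2][0] * x + h[2][1] * y + h[2][2]
            if abs(w) < 1e-9:
                continue  # point at infinity, leave as background
            sx = round((h[0][0] * x + h[0][1] * y + h[0][2]) / w)
            sy = round((h[1][0] * x + h[1][1] * y + h[1][2]) / w)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


def augment(image, label, n_copies, seed=0):
    """Create n_copies warped variants that keep the original label."""
    rng = random.Random(seed)
    return [(warp(image, random_homography(rng)), label)
            for _ in range(n_copies)]
```

For a 28x28 MNIST digit stored as a list of rows, `augment(image, label, 10)` would yield ten warped copies that share the original label, which is the sense in which augmentation increases the number of training samples.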
26/33 Copyright © 2015 DENSO IT LABORATORY, INC. All Rights Reserved.
27. Effect of data augmentation: improved classification performance
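[Figure: bar chart of the CNN misclassification rate on MNIST, with and without augmentation; vertical axis: misclassification rate. One bar is labeled 0.69; the other bar's label is unreadable.]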
[4] I. Sato, et al., “APAC: Augmented PAttern Classification with Neural Networks”, arXiv:1505.03229.
Prior knowledge used to be built into the feature design; in deep learning (DL), it is built into the data.
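The metric on this slide, the misclassification rate, is simply the share of test samples that receive the wrong label. A minimal sketch follows, assuming the rate is reported as a percentage; the function name `misclassification_rate` and the labels in the usage lines are made up for illustration and are not results from the slide or from [4].

```python
def misclassification_rate(predicted, actual):
    """Return the percentage of samples whose predicted label is wrong."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one sample")
    errors = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * errors / len(actual)


# Illustrative labels only: one of ten predictions is wrong.
actual = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
predicted = [0, 1, 2, 3, 4, 5, 6, 7, 8, 3]
rate = misclassification_rate(predicted, actual)
```

If the 0.69 on the chart is a percentage, it would correspond to 69 errors on the 10,000-image MNIST test set.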