Presentation slides for the cvpaper.challenge Meta Study Group.
cvpaper.challenge is an initiative that captures the current state of the computer vision field and aims to create new trends. It works on paper summarization, idea generation, discussion, implementation, and paper submission, and shares all resulting knowledge. Goals for 2019: submit 30+ papers to top conferences, and conduct comprehensive surveys of two or more top conferences.
http://xpaperchallenge.org/cv/
This document summarizes several datasets for image captioning, video classification, action recognition, and temporal localization. It describes the purpose, collection process, annotation format, examples and references for datasets including MS COCO, Visual Genome, Flickr8K/30K, Kinetics, Charades, AVA, STAIR Captions and Actions. The datasets vary in scale from thousands to millions of images/videos and cover a wide range of tasks from image captioning to complex activity recognition.
This document contains a summary of 3 papers on deep residual networks and squeeze-and-excitation networks:
1. Kaiming He et al. "Deep Residual Learning for Image Recognition" which introduced residual networks for image recognition.
2. Andreas Veit et al. "Residual Networks Behave Like Ensembles of Relatively Shallow Networks" which analyzed how residual networks behave like ensembles.
3. Jie Hu et al. "Squeeze-and-Excitation Networks" which introduced squeeze-and-excitation blocks to help convolutional networks learn channel dependencies.
The document also references the PyTorch ResNet implementation and provides URLs to the first and third papers. It also contains some non-English (Japanese) text.
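The squeeze-and-excitation mechanism from the third paper is simple enough to sketch numerically. The following is a minimal NumPy illustration of the squeeze, excitation, and scale steps on a single (C, H, W) feature map; the weights here are random placeholders for illustration, not the paper's trained parameters or its PyTorch implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation: reweight channels of a (C, H, W) feature map.

    w1: (C//r, C) reduction weights; w2: (C, C//r) expansion weights,
    where r is the bottleneck reduction ratio.
    """
    # Squeeze: global average pooling -> one descriptor per channel
    z = feature_map.mean(axis=(1, 2))            # shape (C,)
    # Excitation: bottleneck MLP, ReLU then sigmoid gating in [0, 1]
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))    # shape (C,)
    # Scale: channel-wise reweighting of the original feature map
    return feature_map * s[:, None, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))   # C=8 channels, 4x4 spatial
r = 2                                # reduction ratio
w1 = rng.standard_normal((8 // r, 8)) * 0.1
w2 = rng.standard_normal((8, 8 // r)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)  # same shape as x: channels are rescaled, not reshaped
```

The key property is that the block leaves the tensor shape unchanged: each channel is multiplied by a single learned scalar, which is why SE blocks drop into existing residual architectures with negligible cost.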
The document summarizes a research paper that compares the performance of MLP-based models to Transformer-based models on various natural language processing and computer vision tasks. The key points are:
1. Gated MLP (gMLP) architectures can achieve performance comparable to Transformers on most tasks, demonstrating that attention mechanisms may not be strictly necessary.
2. However, attention still provides benefits for some NLP tasks, as models combining gMLP and attention outperformed pure gMLP models on certain benchmarks.
3. For computer vision, gMLP achieved results close to Vision Transformers and CNNs on image classification, indicating gMLP can match their data efficiency.
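The gating that gives gMLP its name can be sketched compactly. Below is a rough NumPy illustration of a Spatial Gating Unit in the style described by the paper: the channels are split in half, one half is projected linearly *across token positions*, and the result gates the other half elementwise. The weights are random placeholders, and the near-zero/near-one initialization mimics the paper's trick of starting the unit close to an identity map; this is a sketch of the idea, not the authors' implementation:

```python
import numpy as np

def spatial_gating_unit(z, w_spatial, b_spatial):
    """gMLP Spatial Gating Unit.

    z: (n, d) token representations; w_spatial: (n, n) mixes information
    across the n token positions; b_spatial: (n,) per-position bias.
    Returns (n, d/2): one channel half gated by a spatial projection
    of the other half.
    """
    u, v = np.split(z, 2, axis=-1)               # (n, d/2) each
    gate = w_spatial @ v + b_spatial[:, None]    # cross-token projection
    return u * gate                              # elementwise gating

n, d = 6, 8                                      # 6 tokens, 8 channels
rng = np.random.default_rng(1)
z = rng.standard_normal((n, d))
# Paper-style init: spatial weights near zero, bias near one, so the
# unit initially behaves like an identity on the u half.
w = rng.standard_normal((n, n)) * 1e-3
b = np.ones(n)
out = spatial_gating_unit(z, w, b)
print(out.shape)  # (6, 4)
```

Note that the only place tokens interact is the (n, n) spatial projection; there is no attention matrix computed from the input, which is exactly the design point the summarized comparison with Transformers turns on.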