The document discusses various generative adversarial networks (GANs), specifically focusing on deep convolutional GANs (DCGANs) and their generators and discriminators. It references important papers and resources, including works by Goodfellow et al. and Radford et al., while also mentioning various tools and documentation relevant to deep learning like PyTorch. Additionally, it includes links to lectures and resources for further learning in this area.
11. GANs that generate images from text
GAN-INT-CLS: Generative Adversarial Text to Image Synthesis, ICML 2016 (link)
The first use of a GAN for text-to-image generation. Succeeded in generating 64x64 images.
GAWWN: Learning What and Where to Draw, NIPS 2016 (link)
Made it possible to specify which object appears at which position by means of a bounding box.
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, ICCV 2017 (link), H. Zhang, et al.
Two-stage training yields 256x256 images, a far higher resolution than earlier methods. The blurry first-stage image is sharpened in the second stage.
TAC-GAN: Text Conditioned Auxiliary Classifier Generative Adversarial Network, arXiv 2017 (link)
Uses an auxiliary classifier to assist training. Can generate diverse types of images from the same text.
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks, arXiv 2017 (link), H. Zhang, et al.
Also called StackGAN-v2; uses a tree-like network.
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks, CVPR 2018 (link), Tao Xu, et al.
An attention-driven approach that makes it possible to generate fine details.
Since AttnGAN, papers on end-to-end training have been appearing.
FusedGAN: Semi-supervised FusedGAN for Conditional Image Generation, arXiv 2018 (link)
Fuses the two-stage training into a single stage so that it can be trained end-to-end.
HDGAN: Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network, arXiv 2018 (link)
Trains high-resolution image generation end-to-end with a hierarchically-nested network structure.
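12. LAPGAN (Laplacian Pyramid of Generative Adversarial Networks)
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, NIPS (2015)
Learns the difference between a low-resolution image and a high-resolution image, and generates the high-resolution image from the low-resolution one. A generator is trained for each resolution, and high-resolution images are generated step by step.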
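The "difference between a low-resolution image and a high-resolution image" described above is the residual stored at each level of a Laplacian pyramid. The following is a minimal sketch of that decomposition and of the coarse-to-fine reconstruction, not the LAPGAN implementation itself. It assumes 2x2 average pooling and nearest-neighbour upsampling in place of the smoothing filter used in practice, and all function names are hypothetical. In LAPGAN, the generator at each resolution would be trained to produce such a residual, conditioned on the upsampled lower-resolution image.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def upsample(img):
    """Double the resolution by nearest-neighbour repetition (a simplifying assumption)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramid(img, levels):
    """Return the coarsest image and the residuals, ordered from coarse to fine."""
    residuals = []
    current = img
    for _ in range(levels):
        low = downsample(current)
        up = upsample(low)
        # Residual = detail that is lost when going to the lower resolution.
        residuals.append([[c - u for c, u in zip(c_row, u_row)]
                          for c_row, u_row in zip(current, up)])
        current = low
    residuals.reverse()
    return current, residuals


def reconstruct(coarse, residuals):
    """Rebuild the image step by step: upsample, then add the residual for that level."""
    current = coarse
    for res in residuals:
        up = upsample(current)
        current = [[u + r for u, r in zip(u_row, r_row)]
                   for u_row, r_row in zip(up, res)]
    return current


# Toy 8x8 "image" used only for illustration.
image = [[float((x * y) % 7) for x in range(8)] for y in range(8)]
coarse, residuals = build_pyramid(image, levels=2)
restored = reconstruct(coarse, residuals)
```

Here the residuals are computed from a real image, so the reconstruction recovers the input. A generative model replaces them with residuals sampled from the per-resolution generators.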
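AlignDRAW (Align Deep Recurrent Attention Writer)
Generating Images from Captions with Attention, ICLR (2016)
In a VAE (Variational Auto Encoder), the encoder outputs the mean and covariance matrix of a normal distribution, and the model is trained with a stochastic variational inference algorithm that reconstructs the data from the latent variables. AlignDRAW extends the VAE by adding the Deep Recurrent Attention Writer (DRAW), which incorporates an attention mechanism recurrently. AlignDRAW generates an image patch for each word in the text and builds up the image iteratively.
However, the images generated by these models lack word-level information.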
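As an illustration of the VAE ingredients mentioned above (an encoder that outputs the parameters of a normal distribution, and training by stochastic variational inference), here is a minimal sketch of the two pieces such training usually relies on: sampling a latent vector with the reparameterization trick, and the KL term against a standard normal prior. It assumes a diagonal covariance expressed as a log-variance vector, which is a common simplification rather than something stated in the slides. The function names are hypothetical and this is not the AlignDRAW code.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for a 3-dimensional latent space.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
log_var = [0.0, -2.0, 0.3]

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

During training, the KL term is added to a reconstruction loss computed from the decoder output for `z`. That decoder is omitted here.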
Conventional methods for high-resolution image generation