This document discusses generative adversarial networks (GANs) and their relationship to reinforcement learning. It begins with an introduction to GANs, explaining how they can generate images without explicitly defining a probability distribution by using an adversarial training process. The second half discusses how GANs are related to actor-critic models and inverse reinforcement learning in reinforcement learning. It explains how GANs can be viewed as training a generator to fool a discriminator, similar to how policies are trained in reinforcement learning.
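For context (not shown in the summary above), the adversarial training it refers to is the standard two-player minimax game between a generator G and a discriminator D:

```latex
\min_G \max_D \; V(D,G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\!\big[\log\big(1 - D(G(z))\big)\big]
```

Because the generator's only learning signal is the discriminator's score, the setup resembles an actor updated from a learned critic, which is the connection to actor-critic methods the document draws.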
This document summarizes a research paper on scaling laws for neural language models. Some key findings of the paper include:
- Language model performance depends strongly on model scale and weakly on model shape. With enough compute and data, performance scales as a power law of parameters, compute, and data (the functional form is sketched after this list).
- Overfitting is universal, with penalties depending on the ratio of parameters to data.
- Large models are more sample-efficient, reaching the same performance with fewer optimization steps and fewer data points.
- The paper motivated subsequent work by OpenAI on applying scaling laws to other domains like computer vision and developing increasingly large language models like GPT-3.
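As a sketch of that power-law form (written in the usual notation; the constants N_c, D_c, C_c and the exponents are fitted in the paper and not reproduced exactly here), test loss falls off as

```latex
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```

where N is the number of non-embedding parameters, D the dataset size in tokens, and C the training compute. The fitted exponents are small (roughly 0.05 to 0.1), which is why each resource has to grow by orders of magnitude to produce a visible drop in loss.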
The document discusses FactorVAE, a method for disentangling latent representations in variational autoencoders (VAEs). It introduces Total Correlation (TC) as a penalty term that encourages independence between latent variables. TC is added to the standard VAE objective function to guide the model to learn disentangled representations. The document provides details on how TC is defined and computed based on the density-ratio trick from generative adversarial networks. It also discusses how FactorVAE uses TC to learn disentangled representations and can be evaluated using a disentanglement metric.
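As a sketch in standard notation (not taken verbatim from the slides), TC is the KL divergence between the aggregate posterior q(z) and the product of its marginals, and FactorVAE optimizes the usual ELBO minus a weighted TC term:

```latex
\mathrm{TC}(z) = \mathrm{KL}\!\left( q(z) \,\middle\|\, \textstyle\prod_j q(z_j) \right), \qquad
\mathcal{L}_{\mathrm{FactorVAE}} = \mathrm{ELBO} - \gamma \,\mathrm{TC}(z)
```

Since q(z) has no closed form, the density-ratio trick mentioned above estimates TC with a discriminator trained to distinguish samples of q(z) from samples whose latent dimensions have been independently shuffled; the discriminator's logit approximates \log q(z) - \log \prod_j q(z_j).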
[DL Reading Group] Recent Advances in Autoencoder-Based Representation Learning (Deep Learning JP)
1. Recent advances in autoencoder-based representation learning include incorporating meta-priors to encourage disentanglement and using rate-distortion and rate-distortion-usefulness tradeoffs to balance compression and reconstruction.
2. Variational autoencoders rely on priors to disentangle latent factors, but recent work instead regularizes the aggregated posterior to encourage disentanglement directly.
3. The rate-distortion framework balances the rate of information transmission against reconstruction distortion, while rate-distortion-usefulness additionally accounts for usefulness on downstream tasks (a minimal Lagrangian form is sketched below).
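As a minimal sketch of that rate-distortion trade-off in the VAE setting (this Lagrangian form is the standard β-VAE-style way to write it, not a formula quoted from the slides), the rate R is the KL term and the distortion D is the negative reconstruction log-likelihood:

```latex
\min_{\theta,\phi}\;
\underbrace{\mathbb{E}_{q_\phi(z\mid x)}\!\left[-\log p_\theta(x \mid z)\right]}_{\text{distortion } D}
\;+\;
\beta\,\underbrace{\mathrm{KL}\!\left(q_\phi(z \mid x)\,\middle\|\,p(z)\right)}_{\text{rate } R}
```

Sweeping β traces out a rate-distortion curve; "usefulness" adds a third axis that measures how well the representation supports a downstream task.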
1. The document discusses energy-based models (EBMs) and how they can be applied to classifiers. It introduces noise contrastive estimation and flow contrastive estimation as methods to train EBMs.
2. One of the papers presented trains energy-based models with flow contrastive estimation, in which the fixed noise distribution of NCE is replaced by a flow-based generator trained jointly with the EBM, giving an adaptive contrast distribution.
3. Another paper argues that classifiers can be viewed as joint energy-based models over inputs and labels, and should be trained as such; it introduces a method to train classifiers as EBMs using contrastive divergence (the joint-energy view is sketched after this list).
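As a sketch of that joint-energy view (the standard "classifier as an EBM" construction, not copied from the slides): the classifier's logits f_θ(x)[y] are reused as negative energies of a joint model, and marginalizing over labels gives an unnormalized density over inputs:

```latex
p_\theta(x, y) = \frac{\exp\!\big(f_\theta(x)[y]\big)}{Z(\theta)}, \qquad
p_\theta(x) = \frac{\sum_y \exp\!\big(f_\theta(x)[y]\big)}{Z(\theta)}, \qquad
E_\theta(x) = -\log \sum_y \exp\!\big(f_\theta(x)[y]\big)
```

The conditional p_θ(y | x) reduces to the ordinary softmax classifier, so only the extra p_θ(x) term requires EBM-style training such as the contrastive-divergence procedure mentioned above.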
Paper Introduction: RankCompete: Simultaneous ranking and clustering of information networks (Kotaro Yamazaki)
https://www.researchgate.net/publication/257352130_RankCompete_Simultaneous_ranking_and_clustering_of_information_networks
This document summarizes face image quality assessment (FIQA) and introduces several FIQA algorithms. It defines FIQA and outlines the common pipeline: input a face image, detect the face region, and apply a FIQA algorithm that outputs a quality score. It discusses levels of FIQA algorithms, from unlearned methods to those integrated with face recognition. Example algorithms described include FaceQnet, SER-FIQ, and MagFace. FaceQnet generates quality-score ground truths from a face recognition system and trains a model to predict them. SER-FIQ and MagFace instead leverage face embeddings from recognition models to assess quality without separate training.
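As an illustrative sketch of the SER-FIQ idea described above (a simplified reading, not the authors' reference implementation; `embed_with_dropout` is a hypothetical callable standing in for a dropout-enabled forward pass of the recognition network): quality is scored by how stable the stochastic embeddings of a single face remain across repeated passes.

```python
import numpy as np

def serfiq_style_quality(face_img, embed_with_dropout, n_passes: int = 32) -> float:
    """Score face image quality as the stability of stochastic embeddings.

    `embed_with_dropout` is a hypothetical callable: it runs the recognition
    network with dropout active and returns a 1-D embedding (np.ndarray).
    """
    # Collect several stochastic embeddings of the same aligned face crop.
    embs = np.stack([embed_with_dropout(face_img) for _ in range(n_passes)])
    embs /= np.linalg.norm(embs, axis=1, keepdims=True)  # L2-normalize each embedding

    # Mean pairwise Euclidean distance: small distance = stable = high quality.
    diffs = embs[:, None, :] - embs[None, :, :]
    mean_dist = np.sqrt((diffs ** 2).sum(axis=-1)).mean()

    # Map the distance to a (0, 1] quality score; smaller distance -> higher score.
    return float(2.0 / (1.0 + np.exp(mean_dist)))
```

MagFace, by contrast, ties quality to the magnitude of the learned embedding itself, so a single forward pass suffices.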
Paper introduction: Long-Tailed Classification by Keeping the Good and Removing the Bad Mom... (Plot Hong)
1) The paper proposes a method called De-confound-TDE for long-tailed classification: the harmful part of the causal effect that SGD momentum exerts in favor of head classes is removed, while its beneficial part is kept.
2) It decouples representation and classifier learning via a multi-head normalized classifier and removes the effect of feature drift toward head classes via counterfactual TDE inference (a simplified sketch follows this list).
3) Experimental results show it achieves state-of-the-art performance on long-tailed classification benchmarks like CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT, as well as object detection and segmentation benchmarks like LVIS.
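As a highly simplified illustration of that counterfactual step (this is a paraphrase of "subtract the prediction attributable to the accumulated head-biased feature direction", not the paper's exact TDE equation; `alpha`, the running-mean direction, and the function name are assumptions for the sketch):

```python
import numpy as np

def tde_style_logits(feat: np.ndarray, weights: np.ndarray,
                     mean_dir: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Subtract a counterfactual logit computed from the running-mean feature direction.

    feat:     (d,)   test-time feature of one sample
    weights:  (C, d) classifier weight vectors
    mean_dir: (d,)   moving average of training features (the head-biased direction)
    """
    d_hat = mean_dir / (np.linalg.norm(mean_dir) + 1e-12)
    proj = (feat @ d_hat) * d_hat          # component of the feature along the biased direction

    factual = weights @ feat               # logits from the actual feature
    counterfactual = weights @ proj        # logits explained by the biased direction alone
    return factual - alpha * counterfactual

# With alpha = 1 this classifies the feature with its head-biased component removed.
```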
This document discusses deepfakes, including their creation and detection. It begins with an introduction to face swapping, face reenactment, and face synthesis techniques used to generate deepfakes. It then describes several methods for creating deepfakes, such as faceswap algorithms, 3D modeling approaches, and GAN-based methods. The document also reviews several datasets used to detect deepfakes. Finally, it analyzes current research on detecting deepfakes using techniques like two-stream neural networks, analyzing inconsistencies in audio-video, and detecting warping artifacts.
24. 3.2.1 Class-Balanced Loss Based on Effective Number of Samples, CVPR 2019 [4]
- For a given class, as the number of data samples grows, each additional sample contributes less and less to the model.
- The paper proposes the concept of the effective number of samples.
- Whereas previous re-weighting methods set class weights from the raw per-class sample counts, this work designs the weights from the effective number of samples (a minimal weighting sketch follows below).
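As a minimal sketch of that weighting scheme (the effective-number formula E_n = (1 - β^n) / (1 - β) and the inverse weighting follow [4]; the normalization step and names here are illustrative):

```python
import numpy as np

def class_balanced_weights(samples_per_class: np.ndarray, beta: float = 0.9999) -> np.ndarray:
    """Per-class weights from the effective number of samples E_n = (1 - beta^n) / (1 - beta)."""
    effective_num = (1.0 - np.power(beta, samples_per_class)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Rescale so the weights sum to the number of classes (keeps the overall loss scale stable).
    return weights * len(samples_per_class) / weights.sum()

# Example: a head class with 10,000 samples vs. a tail class with 10 samples.
print(class_balanced_weights(np.array([10000, 10])))   # the tail class receives a much larger weight
```

Each sample's loss (cross-entropy, focal loss, etc.) is then multiplied by the weight of its class, which is the class-balanced re-weighting the slide describes.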
49. References
[1] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. Focal Loss for Dense Object Detection. In ICCV, 2017.
[2] Bingyi Kang, Saining Xie, Marcus Rohrbach, Zhicheng Yan, Albert Gordo, Jiashi Feng, Yannis Kalantidis. Decoupling Representation and Classifier for Long-Tailed Recognition. In ICLR, 2020.
[3] Boyan Zhou, Quan Cui, Xiu-Shen Wei, Zhao-Min Chen. Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition. In CVPR, 2020.
[4] Yin Cui, Menglin Jia, Tsung-Yi Lin, Yang Song, Serge Belongie. Class-Balanced Loss Based on Effective Number of Samples. In CVPR, 2019.
[5] Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma. Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss. In NeurIPS, 2019.
[6] Muhammad Abdullah Jamal, Matthew Brown, Ming-Hsuan Yang, Liqiang Wang, Boqing Gong. Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective. In CVPR, 2020.
[7] Hsin-Ping Chou, Shih-Chieh Chang, Jia-Yu Pan, Wei Wei, Da-Cheng Juan. Remix: Rebalanced Mixup. arXiv preprint, 2020.
[8] Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz. mixup: Beyond Empirical Risk Minimization. In ICLR, 2018.
[9] Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu. Large-Scale Long-Tailed Recognition in an Open World. In CVPR, 2019.
[10] Jialun Liu, Yifan Sun, Chuchu Han, Zhaopeng Dou, Wenhui Li. Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective. In CVPR, 2020.
[11] Liuyu Xiang, Guiguang Ding, Jungong Han. Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification. In ECCV, 2020.
[12] Agrim Gupta, Piotr Dollár, Ross Girshick. LVIS: A Dataset for Large Vocabulary Instance Segmentation. In ICCV, 2019.
[13] Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan. Equalization Loss for Long-Tailed Object Recognition. In CVPR, 2020.
[14] Yu Li, Tao Wang, Bingyi Kang, Sheng Tang, Chunfeng Wang, Jintao Li, Jiashi Feng. Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax. In CVPR, 2020.
[15] Junran Peng, Xingyuan Bu, Ming Sun, Zhaoxiang Zhang, Tieniu Tan, Junjie Yan. Large-Scale Object Detection in the Wild from Imbalanced Multi-Labels. In CVPR, 2020.