This document provides an overview of POMDPs (Partially Observable Markov Decision Processes) and their applications. It first defines the key concepts of a POMDP, such as states, actions, observations, and belief states, then uses the classic Tiger problem to illustrate them. The document discusses approaches to solving POMDPs, including model-based methods that learn an environment model from data and model-free reinforcement learning methods. Finally, it gives examples of applying POMDPs to games such as ViZDoom and to robot navigation problems.
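As a concrete illustration of belief states, here is a minimal sketch of the Bayesian belief update a Tiger-problem agent performs after each "listen" action; the 0.85 observation accuracy is the standard textbook value, assumed here rather than taken from the document:

```python
# Minimal belief-state update for the classic Tiger problem (sketch).
# States: tiger behind the left or right door. A "listen" action yields a
# noisy observation that is correct with probability 0.85 (textbook value).

def update_belief(b_left, obs, p_correct=0.85):
    """Bayes update of P(tiger=left) after a 'listen' action.

    obs: 'hear_left' or 'hear_right'
    """
    if obs == "hear_left":
        lik_left, lik_right = p_correct, 1 - p_correct
    else:
        lik_left, lik_right = 1 - p_correct, p_correct
    # Posterior is proportional to likelihood * prior; normalize over states.
    unnorm_left = lik_left * b_left
    unnorm_right = lik_right * (1 - b_left)
    return unnorm_left / (unnorm_left + unnorm_right)

b = 0.5  # uniform prior: tiger equally likely behind either door
for obs in ["hear_left", "hear_left", "hear_right"]:
    b = update_belief(b, obs)
    print(f"P(tiger=left) = {b:.3f}")
```

Repeated consistent observations drive the belief toward certainty, while a contradictory observation pulls it back, which is exactly the behavior the Tiger problem uses to motivate information-gathering actions.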
This document discusses the relationship between control as inference, reinforcement learning, and active inference. It provides an overview of key concepts such as Markov decision processes (MDPs), partially observable MDPs (POMDPs), optimality variables, the evidence lower bound (ELBO), variational inference, and the free energy principle as applied to active inference. Control as inference frames reinforcement learning as probabilistic inference by defining a generative process and performing variational inference to find an optimal policy. Active inference uses the free energy principle and minimizes expected free energy to select actions that resolve uncertainty.
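Concretely, in the standard control-as-inference formulation the binary optimality variable enters as a likelihood tied to reward (assuming rewards are scaled so that $r \le 0$ and the likelihood is normalized), and variational inference over trajectories yields an ELBO:

```latex
% Optimality variable: reward enters as a log-likelihood
p(\mathcal{O}_t = 1 \mid s_t, a_t) = \exp\big(r(s_t, a_t)\big)

% ELBO over trajectories \tau under the variational policy q:
\log p(\mathcal{O}_{1:T} = 1)
  \;\geq\; \mathbb{E}_{q(\tau)}\Big[\textstyle\sum_{t=1}^{T} r(s_t, a_t)\Big]
  \;-\; \mathrm{KL}\big(q(\tau) \,\|\, p(\tau)\big)
```

Maximizing this bound trades expected reward against divergence from the prior dynamics $p(\tau)$, which is the sense in which finding an optimal policy becomes probabilistic inference.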
[DL Reading Group] NeRF-VAE: A Geometry Aware 3D Scene Generative Model (Deep Learning JP)
NeRF-VAE is a 3D scene generative model that combines Neural Radiance Fields (NeRF) and Generative Query Networks (GQN) with a variational autoencoder (VAE). It uses a NeRF decoder to generate novel views conditioned on a latent code. An encoder extracts latent codes from input views. During training, it maximizes the evidence lower bound to learn the latent space of scenes and allow for novel view synthesis. NeRF-VAE aims to generate photorealistic novel views of scenes by leveraging NeRF's view synthesis abilities within a generative model framework.
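To make the training objective concrete, here is a minimal sketch of one ELBO training step in the NeRF-VAE style; the real decoder is a conditional NeRF renderer, replaced here by a toy MLP, and all dimensions and names are illustrative assumptions:

```python
# One ELBO training step for a NeRF-VAE-style model (sketch).
import torch
import torch.nn as nn

latent_dim, view_dim = 16, 64

encoder = nn.Sequential(nn.Linear(view_dim, 128), nn.ReLU(),
                        nn.Linear(128, 2 * latent_dim))  # outputs mu, logvar
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                        nn.Linear(128, view_dim))        # stand-in for a NeRF

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

views = torch.randn(8, view_dim)  # placeholder batch of input views
mu, logvar = encoder(views).chunk(2, dim=-1)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
recon = decoder(z)

# Negative ELBO = reconstruction term + KL(q(z|x) || N(0, I))
recon_loss = ((recon - views) ** 2).sum(dim=-1).mean()
kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(dim=-1).mean()
loss = recon_loss + kl
opt.zero_grad(); loss.backward(); opt.step()
```

In the actual model the decoder renders views by volume rendering conditioned on the scene latent, but the ELBO bookkeeping is the same as in this sketch.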
1. The document discusses probabilistic modeling and variational inference. It introduces concepts like Bayes' rule, marginalization, and conditioning.
2. An equation for the evidence lower bound (ELBO) is derived: the log likelihood of the data decomposes into the Kullback-Leibler divergence between the approximate and true posterior plus the ELBO itself, which combines an expected log likelihood term with a KL regularizer (see the derivation sketch after this list).
3. Variational autoencoders are discussed, where the approximate posterior is parameterized by a neural network and optimized to maximize the evidence lower bound. Latent variables are modeled as Gaussian distributions.
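For reference, the decomposition described in point 2 can be written in standard VAE notation, with $x$ the data, $z$ the Gaussian latent, and $q$ the approximate posterior:

```latex
\log p(x)
  = \mathrm{KL}\big(q(z \mid x) \,\|\, p(z \mid x)\big)
  + \underbrace{\;\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big]
  - \mathrm{KL}\big(q(z \mid x) \,\|\, p(z)\big)\;}_{\text{ELBO}}
```

Since the first KL term is non-negative, the ELBO lower-bounds $\log p(x)$; maximizing it simultaneously fits the data and pulls $q$ toward the true posterior.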
1. The document discusses energy-based models (EBMs) and how they can be applied to classifiers. It introduces noise contrastive estimation and flow contrastive estimation as methods to train EBMs.
2. One paper trains energy-based models using flow contrastive estimation, where a flow-based generator serves as an adaptive noise distribution for the contrastive objective and is trained jointly with the EBM. This allows the unnormalized EBM to be fit without computing its partition function.
3. Another paper argues that classifiers can be viewed as joint energy-based models over inputs and outputs, and should be treated as such. It introduces a method to train classifiers as EBMs using contrastive divergence (a minimal sketch of the joint-EBM view follows this list).
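The joint-EBM reading of a classifier reinterprets logits as negative energies; here is a minimal sketch under that convention (the network and dimensions are illustrative assumptions):

```python
# Classifier as a joint energy-based model (sketch of the JEM-style view).
# A classifier with logits f(x) defines E(x, y) = -f(x)[y]; marginalizing
# over y gives an energy for x alone via logsumexp.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

def energy_xy(x, y):
    """Joint energy E(x, y) = -logit of class y."""
    return -classifier(x).gather(1, y.unsqueeze(1)).squeeze(1)

def energy_x(x):
    """Marginal energy E(x) = -logsumexp_y f(x)[y]."""
    return -torch.logsumexp(classifier(x), dim=1)

x = torch.randn(4, 32)
y = torch.randint(0, 10, (4,))
print(energy_xy(x, y), energy_x(x))
# p(y | x) = exp(-E(x, y)) / exp(-E(x)) recovers the usual softmax, so the
# classifier is unchanged; the extra structure is an unnormalized density
# over x, which is what the sampling-based training procedure exploits.
```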
This document discusses self-supervised representation learning (SRL) for reinforcement learning tasks. SRL learns state representations by using prediction tasks as auxiliary objectives. The key ideas are: (1) SRL learns an encoder that maps observations to states using a prediction task, such as modeling future states or actions; (2) the learned state representations improve generalization and exploration in reinforcement learning algorithms; (3) several SRL methods are discussed, including world models, inverse models, and Causal InfoGAN.
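To make idea (1) concrete, here is a minimal sketch of an encoder trained with a forward-prediction auxiliary task; the architecture and dimensions are illustrative assumptions, not taken from the document:

```python
# Self-supervised state representation via a forward-model prediction task.
import torch
import torch.nn as nn

obs_dim, state_dim, action_dim = 64, 16, 4

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                        nn.Linear(64, state_dim))
forward_model = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                              nn.Linear(64, state_dim))
opt = torch.optim.Adam(list(encoder.parameters()) +
                       list(forward_model.parameters()))

obs, next_obs = torch.randn(32, obs_dim), torch.randn(32, obs_dim)
actions = torch.randn(32, action_dim)  # placeholder transition batch

s, s_next = encoder(obs), encoder(next_obs)
pred_next = forward_model(torch.cat([s, actions], dim=-1))

# Auxiliary objective: predict the next latent state. Gradients shape the
# encoder toward predictable, controllable structure; in practice an extra
# term (e.g. an inverse model or reconstruction) guards against collapse.
loss = ((pred_next - s_next.detach()) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
```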
This document discusses using R and RStudio to simulate reinforcement learning models. It demonstrates simulating a Rescorla-Wagner model to update action values Q_A and Q_B based on payoffs from actions A and B over time. The model is expanded to select actions stochastically using a softmax function of the difference between Q_A and Q_B. Plots show the evolution of Q_A and Q_B over time for different learning rate and temperature parameters. The document provides an example code implementation of this reinforcement learning model in R.
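The document's implementation is in R; a minimal Python transcription of the same model is sketched below, with the Rescorla-Wagner update and softmax choice rule. The learning rate, inverse temperature, and payoff probabilities are illustrative values, not the document's:

```python
# Rescorla-Wagner learning with softmax action selection (Python sketch;
# the original document implements the same model in R).
import math
import random

alpha, beta = 0.2, 3.0               # learning rate, inverse temperature
Q = {"A": 0.0, "B": 0.0}
payoff_prob = {"A": 0.8, "B": 0.3}   # illustrative reward probabilities

for t in range(200):
    # Softmax over the difference in action values (two-action case).
    p_A = 1.0 / (1.0 + math.exp(-beta * (Q["A"] - Q["B"])))
    action = "A" if random.random() < p_A else "B"
    reward = 1.0 if random.random() < payoff_prob[action] else 0.0
    # Rescorla-Wagner update: move Q toward the obtained payoff.
    Q[action] += alpha * (reward - Q[action])

print(Q)  # Q_A should approach 0.8; Q_B is updated only when B is chosen
```

Varying alpha changes how quickly the Q values track payoffs, and varying beta changes how deterministically the better action is chosen, which reproduces the parameter sweeps plotted in the document.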
This is an explanation of Karl Friston's "free-energy principle" (FEP), given for an audience at the Hokkaido University Faculty of Letters without assuming any background in physics or machine learning, together with a discussion of its significance. Looking toward applications of FEP to consciousness research, it emphasizes the affinity between FEP and enactivism.
This document summarizes two lectures about consciousness and neuroscience:
1) It discusses concepts in consciousness research such as qualia and awareness, and the distinction between the dorsal and ventral visual pathways and their respective roles in vision for action versus vision for perception. It also covers blindsight and the idea that a "feeling of something" without qualia may arise from saliency computation.
2) It discusses using bistable percepts like binocular rivalry to study neural correlates of awareness. It introduces the ideas of neurophenomenology and heterophenomenology to study first-person experience through intersubjective methods. It provides an example of neurophenomenology applied to the aura experience before epileptic seizures.
Komaba Undergraduate Lecture 2015, Special Lecture on Integrated Information Studies III: "The Neuroscience of Consciousness: 'Awareness' and 'Saliency' as Clues" (Masatoshi Yoshida)
1. The document summarizes a lecture about the neural basis of consciousness, focusing on awareness, attention, and the study of blindsight.
2. It discusses evidence that the dorsal visual pathway is involved in vision for action while the ventral pathway is involved in vision for perception.
3. In blindsight, there is a "feeling of something happening" in the blind field that can be explained by saliency computation and sensorimotor contingencies rather than conscious visual experience.