Deep Reinforcement Learning from Scratch (NLP2018 lecture slides) / Introduction of Deep Reinforcement Learning (Preferred Networks)
Introduction of Deep Reinforcement Learning, presented as a tutorial at the 24th Annual Meeting of the Association for Natural Language Processing (NLP2018), a Japanese domestic NLP conference.
http://www.anlp.jp/nlp2018/#tutorial
The document summarizes recent research related to "theory of mind" in multi-agent reinforcement learning. It discusses three papers that propose methods for agents to infer the intentions of other agents by applying concepts from theory of mind:
1. The papers propose that in multi-agent reinforcement learning, being able to understand the intentions of other agents could help with cooperation and increase success rates.
2. The methods aim to estimate the intentions of other agents by modeling their beliefs and private information, using ideas from theory of mind in cognitive science. This involves inferring information about other agents that is not directly observable.
3. Bayesian inference is often used to reason about the beliefs, goals, and private information of other agents based on their observed behavior (see the sketch below).
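As a toy illustration of this Bayesian reasoning, here is a minimal sketch (not taken from any of the three papers) in which an observer maintains a posterior over a discrete set of candidate goals and updates it after each observed action. The softmax-rational action model, the goal/action names, and the value table are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of Bayesian goal inference: an observer keeps a belief over
# which goal another agent is pursuing and updates it after each observed
# action. The softmax "rationality" model is a common assumption here, not a
# method from the papers summarized above.

GOALS = ["fetch_key", "open_door", "idle"]
ACTIONS = ["left", "right", "stay"]

# Hypothetical action values Q[goal][action]: how good each action is
# if the agent is pursuing that goal.
Q = np.array([
    [1.0, 0.1, 0.0],   # fetch_key
    [0.1, 1.0, 0.0],   # open_door
    [0.0, 0.0, 1.0],   # idle
])

def action_likelihood(goal_idx, action_idx, beta=3.0):
    """P(action | goal) under a softmax-rational agent with inverse temperature beta."""
    logits = beta * Q[goal_idx]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return probs[action_idx]

def update_belief(belief, action_idx):
    """One Bayesian update: posterior is proportional to likelihood times prior."""
    likelihoods = np.array([action_likelihood(g, action_idx) for g in range(len(GOALS))])
    posterior = likelihoods * belief
    return posterior / posterior.sum()

belief = np.ones(len(GOALS)) / len(GOALS)  # uniform prior over goals
for a in ["left", "left", "stay"]:         # observed action sequence
    belief = update_belief(belief, ACTIONS.index(a))
    print(dict(zip(GOALS, belief.round(3))))
```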
This document summarizes a presentation on offline reinforcement learning. It discusses how offline RL can learn from fixed datasets without further interaction with the environment, which allows for fully off-policy learning. However, offline RL faces challenges from distribution shift between the behavior policy that generated the data and the learned target policy. The document reviews several offline policy evaluation, policy gradient, and deep deterministic policy gradient methods, and also discusses using uncertainty and constraints to address distribution shift in offline deep reinforcement learning.
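To make the constraint-based countermeasure concrete, below is a minimal sketch of a behavior-constrained actor update: the policy is pushed toward high critic values while being penalized for straying from the actions in the fixed dataset. The network shapes and the `bc_weight` penalty are illustrative assumptions, not the exact algorithms the slides review.

```python
import torch
import torch.nn as nn

# Minimal sketch of a behavior-constrained actor update for offline RL:
# maximize Q(s, pi(s)) while penalizing deviation from the dataset actions,
# a simple stand-in for the uncertainty/constraint methods the slides survey.

state_dim, action_dim = 8, 2
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

def actor_update(states, dataset_actions, bc_weight=1.0):
    """One offline actor step: the -Q term pulls toward high value,
    the BC term keeps the policy close to the behavior data."""
    pi_actions = actor(states)
    q = critic(torch.cat([states, pi_actions], dim=-1))
    bc_loss = ((pi_actions - dataset_actions) ** 2).mean()
    loss = -q.mean() + bc_weight * bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# One step on a random stand-in batch from the fixed dataset:
states = torch.randn(32, state_dim)
dataset_actions = torch.randn(32, action_dim).clamp(-1, 1)
print(actor_update(states, dataset_actions))
```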
[DL Reading Group] Efficiently Modeling Long Sequences with Structured State Spaces (Deep Learning JP)
This document summarizes a research paper on modeling long-range dependencies in sequence data using structured state space models and deep learning. The proposed S4 model (1) derives recurrent and convolutional representations of state space models, (2) improves long-term memory using HiPPO matrices, and (3) efficiently computes state space model convolution kernels. Experiments show S4 outperforms existing methods on various long-range dependency tasks, achieves fast and memory-efficient computation comparable to efficient Transformers, and performs competitively as a general sequence model.
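As a rough illustration of point (1), a minimal sketch of the recurrent/convolutional equivalence for a discretized linear state space model follows. The toy matrices are random (not HiPPO-initialized), and the naive O(L*N^2) kernel computation is exactly what S4 replaces with a fast algorithm for structured matrices.

```python
import numpy as np

# A discretized linear SSM  x[k] = A x[k-1] + B u[k],  y[k] = C x[k]
# can be run as a recurrence or, equivalently, as a 1-D convolution with
# kernel K = (CB, CAB, CA^2 B, ...). This naive version illustrates the
# equivalence; S4's contribution is computing K fast for structured A.

N, L = 4, 16                       # state size, sequence length
rng = np.random.default_rng(0)
A = rng.normal(size=(N, N)) * 0.3  # toy matrices, not HiPPO-initialized
B = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))

def run_recurrent(u):
    x, ys = np.zeros((N, 1)), []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append((C @ x).item())
    return np.array(ys)

def run_convolutional(u):
    # Kernel entries K[k] = C A^k B, then a causal convolution with the input.
    K, M = [], np.eye(N)
    for _ in range(L):
        K.append((C @ M @ B).item())
        M = A @ M
    K = np.array(K)
    return np.array([sum(K[j] * u[k - j] for j in range(k + 1)) for k in range(L)])

u = rng.normal(size=L)
assert np.allclose(run_recurrent(u), run_convolutional(u))  # same outputs
```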
This document discusses self-supervised representation learning (SRL) for reinforcement learning tasks. SRL learns state representations by using prediction tasks as an auxiliary objective. The key ideas are: (1) SRL learns an encoder that maps observations to states using a prediction task like modeling future states or actions; (2) The learned state representations improve generalization and exploration in reinforcement learning algorithms; (3) Several SRL methods are discussed, including world models, inverse models, and Causal InfoGANs.
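A minimal sketch of the encoder-plus-prediction-task idea in (1), under assumed dimensions and architecture: the encoder maps observations to latent states, and a forward model predicting the next latent state provides the auxiliary objective.

```python
import torch
import torch.nn as nn

# Minimal sketch of self-supervised representation learning for RL:
# an encoder maps observations to latent states, and a forward model
# predicting the next latent state supplies the auxiliary training signal.
# Dimensions and architecture are illustrative assumptions.

obs_dim, state_dim, action_dim = 16, 8, 2
encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                        nn.Linear(64, state_dim))
forward_model = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                              nn.Linear(64, state_dim))
opt = torch.optim.Adam(list(encoder.parameters()) + list(forward_model.parameters()),
                       lr=1e-3)

def srl_step(obs, action, next_obs):
    """Auxiliary prediction task: predict the next latent state from (z, a)."""
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = forward_model(torch.cat([z, action], dim=-1))
    loss = ((z_pred - z_next.detach()) ** 2).mean()  # stop-gradient on the target
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# One step on a random stand-in transition batch:
print(srl_step(torch.randn(32, obs_dim), torch.randn(32, action_dim),
               torch.randn(32, obs_dim)))
```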
An overview of stochastic variational inference (SVI), which is convenient for inference on large-scale datasets.
SVI is a variational Bayesian method carried out in the framework of stochastic optimization; a minimal sketch follows below.
(Updated from time to time.)
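This sketch runs the SVI loop on a toy conjugate model (unknown Gaussian mean with known variance): each step forms a noisy natural-gradient estimate from a minibatch, as if the minibatch were replicated to the full dataset size, and blends it into the global variational parameters with a decreasing step size. The hyperparameters are illustrative.

```python
import numpy as np

# Minimal SVI sketch (in the style of Hoffman et al. [1]) for a toy conjugate
# model: mu ~ N(0, 1), observations x_i ~ N(mu, 1). The global variational
# posterior q(mu) = N(m, s^2) is tracked via its natural parameters.

rng = np.random.default_rng(0)
N = 10_000
data = rng.normal(loc=2.0, scale=1.0, size=N)

eta = np.array([0.0, -0.5])   # natural parameters of the prior N(0, 1)
lam = eta.copy()              # global variational parameters, start at the prior
batch_size, tau, kappa = 100, 1.0, 0.7

for t in range(1, 501):
    batch = rng.choice(data, size=batch_size)
    # Intermediate estimate: pretend the minibatch was replicated N/B times.
    eta_hat = eta + (N / batch_size) * np.array([batch.sum(), -0.5 * batch_size])
    rho = (t + tau) ** (-kappa)  # decreasing (Robbins-Monro) step size
    lam = (1 - rho) * lam + rho * eta_hat

m, s2 = -lam[0] / (2 * lam[1]), -1 / (2 * lam[1])
print(f"q(mu) ~= N({m:.3f}, {s2:.5f})")  # approaches the exact posterior
```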
References
[1] Matthew D. Hoffman, David M. Blei, Chong Wang, and John Paisley. Stochastic variational inference. Journal of Machine Learning Research, Vol. 14, No. 1, pp. 1303-1347, 2013.
[2] Issei Sato. Statistical Semantic Analysis with Topic Models (トピックモデルによる統計的意味解析). Corona Publishing, 2015.
On the Dynamics of Machine Learning Algorithms and Behavioral Game Theory (Rikiya Takahashi)
Presentation material used in a guest lecture at the University of Tsukuba on September 17, 2016.
The target audience is part-time PhD students working on machine learning, data mining, or agent-based simulation projects.
Sublabel-Accurate Convex Relaxation of Vectorial Multilabel Energies (Fujimoto Keisuke)
This document summarizes a presentation on the paper "Sublabel-Accurate Convex Relaxation of Vectorial Multilabel Energies". It discusses how the paper proposes a method to efficiently solve high-dimensional, nonlinear vectorial labeling problems by approximating them as convex problems. Specifically, it divides the problem domain into subregions and approximates each subregion with a convex function, yielding an overall approximation that is still non-convex but with higher accuracy. This lifting technique transforms the variables into a higher-dimensional space to formulate the data and regularization terms in a way that allows solving the problem as a convex optimization.
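As a toy illustration of the lifting idea only (not the paper's full convex program), the sketch below encodes a continuous 1-D label as a convex combination of neighboring discrete sublabels in a higher-dimensional coefficient space, so that values between labels remain exactly representable.

```python
import numpy as np

# Minimal sketch of the "lifting" representation behind sublabel-accurate
# relaxation: a continuous label u in [0, 1] is represented in a
# higher-dimensional space as a convex combination of a few discrete
# sublabels. This toy 1-D encode/decode only illustrates the representation.

labels = np.linspace(0.0, 1.0, 5)   # sublabels gamma_1 .. gamma_5

def lift(u):
    """Encode u as barycentric weights over its two neighboring sublabels."""
    i = np.clip(np.searchsorted(labels, u) - 1, 0, len(labels) - 2)
    alpha = (u - labels[i]) / (labels[i + 1] - labels[i])
    w = np.zeros(len(labels))
    w[i], w[i + 1] = 1 - alpha, alpha
    return w

def unlift(w):
    """Decode: the represented label is the weighted mean of the sublabels."""
    return float(w @ labels)

u = 0.37                  # a label strictly between two sublabels
w = lift(u)
print(w, unlift(w))       # unlift(lift(u)) recovers u exactly
```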
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Covering (Kenko Nakamura)
This document proposes a sketch-based box-covering algorithm to efficiently analyze the fractality of massive graphs. It summarizes that some real-world networks have been found to be fractal in nature, but existing algorithms for determining fractality are too slow for large networks. The proposed method uses min-hash to represent boxes implicitly and solves the box-covering problem efficiently in the sketch space using a binary search tree and heap, allowing fractality analysis of networks with millions of edges for the first time.
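To illustrate just the sketching ingredient, the toy code below min-hashes each node's radius-r ball (the implicit "box" around a center) and compares signatures to estimate Jaccard similarity without materializing the sets; the full method's binary search tree and heap machinery is omitted.

```python
import random
from collections import deque

# Minimal sketch of the min-hash idea used in sketch-based box-covering:
# a small min-hash signature stands in for the set of nodes within radius r
# of a center, and comparing signatures estimates Jaccard similarity.

def ball(graph, src, r):
    """Nodes within distance r of src (breadth-first search)."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        v, d = frontier.popleft()
        if d == r:
            continue
        for w in graph[v]:
            if w not in seen:
                seen.add(w)
                frontier.append((w, d + 1))
    return seen

def minhash(nodes, perms):
    """Signature: the minimum of each random hash over the node set."""
    return tuple(min(p[v] for v in nodes) for p in perms)

graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # toy path graph
k = 32
rng = random.Random(0)
perms = [{v: rng.random() for v in graph} for _ in range(k)]

sig_a = minhash(ball(graph, 0, 2), perms)
sig_b = minhash(ball(graph, 1, 2), perms)
jaccard_est = sum(a == b for a, b in zip(sig_a, sig_b)) / k
print(jaccard_est)  # estimates the Jaccard similarity of the two balls
```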
4. Sutton & Barto's new book
The draft is available to read. A sample of the table of contents:
Part I: Tabular Solution Methods
6 Temporal-Difference Learning
8 Planning and Learning with Tabular Methods
Part II: Approximate Solution Methods
12 Eligibility Traces
13 Policy Gradient Methods
Part III: Looking Deeper
16 Applications and Case Studies
16.6 Human-Level Video Game Play
16.7 Mastering the Game of Go
16.8 Personalized Web Services
https://webdocs.cs.ualberta.ca/~sutton/book/the-book-2nd.html