This document discusses the concepts of "simple" and "easy" as they relate to programming languages and Clojure in particular. It explores the differences between simple versus complex and easy versus hard, and gives examples of how Clojure aims to keep programming simple by avoiding unnecessary complexity through choices such as immutable data and the avoidance of side effects.
Imagination-Augmented Agents for Deep Reinforcement Learning
I will introduce a DeepMind paper on the I2A architecture: Imagination-Augmented Agents for Deep Reinforcement Learning.
These slides were presented at the Deep Learning Study group at DAVIAN LAB.
Paper link: https://arxiv.org/abs/1707.06203
Guided policy search (GPS) is a branch of reinforcement learning developed for real-world robotics, and its usefulness has been substantiated in many studies. This slide show covers the core concepts of GPS and the details of how to implement it, so it should be helpful for anyone who wants to study this field.
The document discusses HIRO, a data-efficient, off-policy method for hierarchical reinforcement learning. HIRO uses raw state observations directly as goals for the higher level rather than manually designed subgoals. It uses experience replay at both the low and high levels to stabilize training, and relabels high-level goals during replay to address the non-stationarity introduced by the changing low-level policy. Experiments demonstrate that this approach can efficiently solve challenging tasks with limited experience collection.
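A rough sketch of that two-level data-collection loop might look like the following. Everything here is illustrative rather than the paper's exact interface: `env`, `high_policy`, `low_policy`, and the two buffers are hypothetical placeholders, the goal is represented as a desired offset in raw state space, and the low level gets a simple negative-distance intrinsic reward.

```python
import numpy as np

def intrinsic_reward(state, goal, next_state):
    # Low-level reward: how close the achieved transition is to the goal offset.
    return -np.linalg.norm(state + goal - next_state)

def collect_episode(env, high_policy, low_policy, high_buffer, low_buffer,
                    goal_horizon=10, max_steps=500):
    state, step, done = env.reset(), 0, False
    while not done and step < max_steps:
        start_state = state
        original_goal = goal = high_policy(state)  # high level emits a state-space goal
        env_return = 0.0
        for _ in range(goal_horizon):               # low level acts toward that goal
            action = low_policy(state, goal)
            next_state, reward, done, _ = env.step(action)
            low_buffer.append((state, goal, action,
                               intrinsic_reward(state, goal, next_state),
                               next_state))
            env_return += reward
            goal = state + goal - next_state         # fixed goal transition: keep pointing
            state = next_state                       # at the same target state
            step += 1
            if done:
                break
        # High level is trained on environment reward over the whole sub-episode.
        high_buffer.append((start_state, original_goal, env_return, state))
    return step
```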
Hindsight Experience Replay paper review by Euijin Jeong
- Hindsight Experience Replay (HER) is an RL technique that lets agents learn from unachieved goals as if they had been achieved. This helps address the sparse-reward problem in RL.
- HER replays experiences with substituted goals, generating pseudo-transitions with recomputed rewards. This increases the amount of useful training signal for tasks with sparse rewards (see the sketch after this list).
- Experiments show HER improves performance on manipulation tasks such as pushing, sliding, and pick-and-place, allowing policies to be learned for tasks where vanilla RL fails.
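A minimal sketch of the goal-substitution step, assuming a goal-conditioned task with a sparse 0/-1 reward and the simple "final" relabeling strategy; `compute_reward`, the tolerance, and the tuple layout are illustrative assumptions, not the paper's exact interface.

```python
import numpy as np

def compute_reward(achieved, goal, tol=0.05):
    # Sparse reward: 0 if the achieved state is within tolerance of the goal, else -1.
    return 0.0 if np.linalg.norm(achieved - goal) < tol else -1.0

def her_relabel(episode, replay_buffer):
    """episode: list of (state, action, next_state, achieved, goal) tuples."""
    final_achieved = episode[-1][3]          # state actually reached at the end
    for state, action, next_state, achieved, goal in episode:
        # Original transition with the (likely unachieved) goal.
        replay_buffer.append((state, goal, action,
                              compute_reward(achieved, goal), next_state))
        # Pseudo-transition: pretend the final achieved state was the goal all along,
        # so at least some replayed transitions carry a non-negative reward signal.
        replay_buffer.append((state, final_achieved, action,
                              compute_reward(achieved, final_achieved), next_state))
```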
Deep SARSA and deep Q-learning use neural networks to estimate state-action values in reinforcement learning problems. DQN adds experience replay and a target network to improve stability over basic deep Q-learning: experience replay stores transitions in a buffer and samples them randomly to break correlations between consecutive updates, while the target network is updated only periodically so that the bootstrap targets stay stable. DeepMind's DQN algorithm combined deep Q-learning with these two techniques to achieve good performance on complex tasks.
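A compact sketch of those two stabilizers in PyTorch; the network sizes, hyperparameters, and the 4-dimensional state / 2-action setup are illustrative assumptions, not the values used in the DQN paper.

```python
import random
from collections import deque
import numpy as np
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())      # start with identical weights
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=50_000)                        # replay buffer of (s, a, r, s', done)
gamma, batch_size, target_update = 0.99, 32, 500

def train_step(step):
    if len(buffer) < batch_size:
        return
    batch = random.sample(buffer, batch_size)        # random sampling breaks correlations
    s, a, r, s2, done = (torch.tensor(np.array(x), dtype=torch.float32)
                         for x in zip(*batch))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                            # bootstrap from the frozen target net
        target = r + gamma * (1 - done) * target_net(s2).max(1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % target_update == 0:                    # periodic hard copy of the weights
        target_net.load_state_dict(q_net.state_dict())
```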
It describes MDPs, Monte Carlo methods, temporal-difference learning, SARSA, and Q-learning, and was used for a lecture in the Reinforcement Learning study group at the Korea Artificial Intelligence Laboratory.
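As a quick illustration of the difference between the two update rules covered in those lectures, here is a tabular sketch; the hyperparameters and the assumption of small discrete state and action spaces are illustrative.

```python
import random
from collections import defaultdict

Q = defaultdict(float)                     # Q[(state, action)] -> estimated value
alpha, gamma, epsilon, n_actions = 0.1, 0.99, 0.1, 4

def epsilon_greedy(state):
    # Behavior policy: mostly greedy, occasionally random for exploration.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(state, a)])

def sarsa_update(s, a, r, s2, a2):
    # On-policy: bootstrap from the action actually taken in the next state.
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])

def q_learning_update(s, a, r, s2):
    # Off-policy: bootstrap from the greedy action in the next state.
    best_next = max(Q[(s2, a)] for a in range(n_actions))
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```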