Guided policy search (GPS) is a branch of reinforcement learning developed for real-world robotics, and its utility has been demonstrated in many studies. This slide deck covers the concepts behind GPS and how to implement it in detail, so it should be helpful for anyone who wants to study this field.
Deep learning study 1. This slide deck covers the basic mathematics for deep learning, such as Bayes' theorem, Bayesian inference, and information theory.
100% Serverless big data scale production Deep Learning System
hoondong kim
- BigData Scale Deep Learning Training System (with GPU Docker PaaS on Azure Batch AI)
- Deep Learning Serving Layer (with Auto Scale Out Mode on Web App for Linux Docker)
- BigDL, Keras, TensorFlow, Horovod, TensorflowOnAzure
1. Ask the right question
Active Question Reformulation with Reinforcement Learning
2018.06.11
2. Table of Contents
1. Reinforcement Learning
2. Active Question Answering
3. BiDirectional Attention Flow
4. Experiment
5. Analysis of the Agent's Language
11. Policy Gradient
Assume a parameterized policy (linear function approximation or neural network)
Policy input: state features or raw pixels / output: probability of each action
http://karpathy.github.io/2016/05/31/rl/
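As a rough illustration of the parameterized policy described on this slide, the sketch below uses a linear softmax policy and a single REINFORCE-style update. The function names (`policy`, `reinforce_step`), the learning rate, and the toy feature vector are assumptions made for illustration; they are not taken from the slides or from any particular library.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over actions."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(theta, features):
    """Linear policy: one weight row per action, softmax over the scores.

    theta    -- list of weight rows, one per action (hypothetical layout)
    features -- state feature vector
    """
    logits = [sum(w * f for w, f in zip(row, features)) for row in theta]
    return softmax(logits)


def reinforce_step(theta, features, action, reward, lr=0.1):
    """One policy-gradient update: theta += lr * reward * grad log pi(action | state)."""
    probs = policy(theta, features)
    new_theta = []
    for a, row in enumerate(theta):
        # For a softmax policy: d log pi(action) / d logit_a = 1[a == action] - pi(a)
        coeff = (1.0 if a == action else 0.0) - probs[a]
        new_theta.append([w + lr * reward * coeff * f for w, f in zip(row, features)])
    return new_theta


if __name__ == "__main__":
    theta = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # 2 actions, 3 features
    state = [1.0, 0.5, -0.5]
    print("before:", policy(theta, state))
    theta = reinforce_step(theta, state, action=0, reward=1.0)
    print("after: ", policy(theta, state))
```

A positive reward raises the probability of the action that was taken, and a negative reward lowers it. A neural network policy follows the same pattern, with the gradient of the log-probability computed by backpropagation.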
15. Application of DeepRL
Game play
AlphaGo, Atari, ViZDoom
Robotics
robot arm manipulation, locomotion
Natural language processing
Question Answering, Chatting
Autonomous driving
Mobileye
https://www.youtube.com/watch?v=vppFvq2quQ0
20. SearchQA Dataset
SearchQA
Matthew Dunn, Levent Sagun, Mike Higgins, Ugur Guney, Volkan Cirik, and Kyunghyun Cho.
SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine. https://arxiv.org/abs/1704.05179, 2017.
Github repo: https://github.com/nyu-dl/SearchQA
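Built from Jeopardy! question-answer pairs, with context crawled from the web
140k question-answer pairs, with an average of 49.6 snippets per pair
Each question was issued as a Google query
A dataset that better matches a real information retrieval system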
36. Statistics of Questions
Length: number of words in the question. TF (term frequency): how often words repeat within the question
DF (document frequency): median document frequency of the question tokens over the contexts
QC (Query Clarity): relative entropy between the question and its reformulation
Question
Clue: gandhi deeply influenced count wrote war peace.
Base-NMT: Who influenced count wrote war?
AQA-QR: What is name gandhi gandhi influence wrote peace peace?
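The statistics defined on this slide can be approximated with a few lines of Python. This is only a sketch under stated assumptions: the tokenizer, the reading of TF as the mean number of occurrences per distinct word, the use of raw context counts for DF, and the add-one smoothing in the relative entropy are all choices made here for illustration, not the procedure used in the original experiments.

```python
import math
import re
from collections import Counter
from statistics import median


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def length(question):
    """Length: number of words in the question."""
    return len(tokenize(question))


def term_frequency(question):
    """TF: mean occurrences per distinct word (1.0 means no repetition)."""
    tokens = tokenize(question)
    return len(tokens) / len(set(tokens)) if tokens else 0.0


def document_frequency(question, contexts):
    """DF: median, over question tokens, of how many contexts contain the token."""
    context_vocab = [set(tokenize(c)) for c in contexts]
    counts = [sum(tok in vocab for vocab in context_vocab)
              for tok in tokenize(question)]
    return median(counts) if counts else 0.0


def relative_entropy(text_p, text_q):
    """KL(P || Q) between add-one-smoothed unigram distributions of two texts."""
    p_counts, q_counts = Counter(tokenize(text_p)), Counter(tokenize(text_q))
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + len(vocab)
    q_total = sum(q_counts.values()) + len(vocab)
    kl = 0.0
    for word in vocab:
        p = (p_counts[word] + 1) / p_total
        q = (q_counts[word] + 1) / q_total
        kl += p * math.log(p / q)
    return kl


if __name__ == "__main__":
    clue = "gandhi deeply influenced count wrote war peace."
    aqa_qr = "What is name gandhi gandhi influence wrote peace peace?"
    for name, q in [("Clue", clue), ("AQA-QR", aqa_qr)]:
        print(name, length(q), round(term_frequency(q), 3))
    print("relative entropy:", relative_entropy(clue, aqa_qr))
```

On the two example questions from the slide, the AQA-QR rewrite is longer than the clue and has a TF above 1.0 because "gandhi" and "peace" each appear twice.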
37. Statistics of Questions
Base-NMT
The most syntactically well-formed questions
Lower DF: because the NMT training corpus differs substantially from the SearchQA data
AQA-QR: TopHyp
99.8% start with "what is name": presumably learned because most answers are related to names
Less fluent
Repeated tokens occur about twice as often as in SearchQA