1. Policy gradient methods optimize the expected return directly by gradient ascent on the policy parameters. They learn a parameterized (typically stochastic) policy directly, rather than deriving one from an estimated value function.
2. REINFORCE estimates the policy gradient with Monte Carlo returns and updates the policy parameters in the direction of that gradient estimate to increase expected return.
3. PPO improves on REINFORCE by clipping the objective to restrict how far the new policy can move from the old one, which helps stabilize training. It uses a clipped surrogate objective with importance sampling so the policy can be trained on data collected by previous policies; both objectives are sketched below.
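To make the contrast concrete, here is a minimal sketch of both objectives in PyTorch. The tensor names (log_probs, old_log_probs, returns, advantages) and the 0.2 clip range are illustrative assumptions, not details from the summaries above.

```python
import torch

def reinforce_loss(log_probs, returns):
    # REINFORCE: ascend on E[log pi(a|s) * G], estimated from Monte
    # Carlo returns, so the loss is the negated empirical mean.
    return -(log_probs * returns).mean()

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    # Importance-sampling ratio between the new and old policies.
    ratio = torch.exp(log_probs - old_log_probs)
    # Clipping keeps the new policy close to the one that collected
    # the data, which stabilizes training.
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```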
The document summarizes research on MobileNets, efficient convolutional neural network architectures suited to mobile and embedded vision applications. The key ideas are factorizing standard convolutions into depthwise separable convolutions (a per-channel depthwise convolution followed by a 1x1 pointwise convolution) and using a width multiplier and a resolution multiplier to trade accuracy for model size and latency. Experiments show MobileNets achieve higher accuracy and speed than prior mobile networks on image classification and object detection tasks while having a smaller memory footprint.
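As an illustration of the factorization, here is a minimal PyTorch sketch of a depthwise separable block with the width multiplier thinning the channel counts; it is a sketch under those assumptions, not the paper's reference code.

```python
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, width_mult=1.0, stride=1):
    # The width multiplier thins every layer uniformly.
    in_ch = int(in_ch * width_mult)
    out_ch = int(out_ch * width_mult)
    return nn.Sequential(
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                  groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        # Pointwise: a 1x1 convolution that mixes channels.
        nn.Conv2d(in_ch, out_ch, 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```

The saving comes from splitting filtering and channel mixing: a standard conv costs roughly the product of kernel area, input channels, and output channels per position, while the two factored steps cost their sum.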
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021) by Sergey Karayev
This document covers a lecture on transfer learning and transformers. It begins with an outline of topics: transfer learning in computer vision, embeddings and language models, ELMo/ULMFiT as "NLP's ImageNet moment", transformers, attention in detail, and BERT, GPT-2, DistilBERT, and T5. It then provides slides and explanations on these topics, discussing how transfer learning works, word embeddings such as Word2Vec, language models such as ELMo and ULMFiT, the transformer architecture, attention mechanisms, and prominent transformer models.
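Since the lecture treats attention in detail, here is a minimal sketch of scaled dot-product attention, the core transformer operation; the shapes and the absence of masking are simplifying assumptions, not the lecture's exact formulation.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k) tensors of queries, keys, values.
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k)
    # to keep the softmax in a well-conditioned range.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)
    # Each output is a weighted mixture of the values.
    return weights @ v
```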
The document discusses the perceptron algorithm, a simple neural network used for binary classification. Invented in 1957, it computes a weighted sum of its inputs and applies a threshold activation function. The perceptron learns by adjusting its weights during training. It is computationally efficient but can only learn linearly separable problems, not more complex nonlinear relationships.
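A minimal NumPy sketch of the perceptron learning rule described above; the AND example and learning rate are illustrative assumptions.

```python
import numpy as np

def train_perceptron(X, y, epochs=10, lr=1.0):
    # X: (n_samples, n_features), y: labels in {0, 1}.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # Threshold activation on the weighted sum of inputs.
            pred = 1 if xi @ w + b > 0 else 0
            # Adjust the weights only when the prediction is wrong.
            error = target - pred
            w += lr * error * xi
            b += lr * error
    return w, b

# Example: AND is linearly separable, so the perceptron can learn it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
```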
This document provides an outline for a course on neural networks and fuzzy systems. The course is divided into two parts, with the first 11 weeks covering neural network topics such as multi-layer feedforward networks, backpropagation, and gradient descent. The document explains that multi-layer networks are needed to solve nonlinear problems by dividing the problem space into smaller linear regions. It also provides notation for multi-layer networks and shows how backpropagation calculates the weight updates for each layer.
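To illustrate the backpropagation step the outline mentions, here is a minimal NumPy sketch of one gradient-descent update for a two-layer sigmoid network; the architecture and squared-error loss are illustrative assumptions rather than the course's notation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(X, y, W1, W2, lr=0.1):
    # Forward pass through a 2-layer network with sigmoid units.
    h = sigmoid(X @ W1)          # hidden activations
    y_hat = sigmoid(h @ W2)      # output activations
    # Backward pass: propagate the output error layer by layer
    # (squared-error loss; sigmoid derivative is s * (1 - s)).
    delta2 = (y_hat - y) * y_hat * (1 - y_hat)
    delta1 = (delta2 @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates for each layer.
    W2 -= lr * h.T @ delta2
    W1 -= lr * X.T @ delta1
    return W1, W2
```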
[paper review] Eye in the Sky & 3D human pose estimation in video with ... by Gyubin Son
1. Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network
https://arxiv.org/abs/1806.00746
2. 3D human pose estimation in video with temporal convolutions and semi-supervised training
https://arxiv.org/abs/1811.11742
27. CNN application examples: Speech Recognition
Convolutional Neural Networks for Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, Abdel-Hamid et al. (2014)
28. CNN application examples: Text Classification
Convolutional Neural Networks for Sentence Classification, Conference on Empirical Methods in Natural Language Processing, Kim (2014)
29. CNN application examples: Time series prediction
Time Series Classification Using Multi-Channels Deep Convolutional Neural Networks, International Conference on Web-Age Information Management (2014)
30. CNN application examples: Time series prediction
Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks, Association for the Advancement of Artificial Intelligence (2015)
31. CNN application examples: Time series prediction
Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed Prediction, Sensors (2017)
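All of these slides apply 1D convolution to sequence-shaped inputs; as a shared illustration, here is a minimal PyTorch sketch of a 1D convolutional classifier (channel counts, kernel size, and class count are illustrative assumptions, not drawn from any of the cited papers).

```python
import torch
import torch.nn as nn

class Conv1DClassifier(nn.Module):
    """Convolve along the time/token axis, then pool and classify."""
    def __init__(self, in_channels=1, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, 32, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool1d(1)   # max over the sequence axis
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):
        # x: (batch, channels, sequence_length), e.g. an audio feature
        # track, an embedded sentence, or a multichannel time series.
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)          # (batch, 32)
        return self.fc(h)

# Example: classify a batch of 8 single-channel series of length 100.
logits = Conv1DClassifier()(torch.randn(8, 1, 100))
```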