The document discusses challenges with the traditional view of psychological architecture for behavior, which depicts perception, cognition, and action as distinct sequential processes. It notes that this view was designed to explain human problem-solving and assumes a disembodied mind. The document questions where the "central executive" of cognition is located in the brain, as neural correlates of decision-making are found in many regions. It suggests this traditional view may not adequately explain neural data and that brains could be considered control systems rather than strictly input/output devices.
This document summarizes research on reducing the computational complexity of self-attention in Transformer models from O(L2) to O(L log L) or O(L). It describes the Reformer model which uses locality-sensitive hashing to achieve O(L log L) complexity, the Linformer model which uses low-rank approximations and random projections to achieve O(L) complexity, and the Synthesizer model which replaces self-attention with dense or random attention. It also briefly discusses the expressive power of sparse Transformer models.
This document summarizes recent research on applying self-attention mechanisms from Transformers to domains other than language, such as computer vision. It discusses models that use self-attention for images, including ViT, DeiT, and T2T, which apply Transformers to divided image patches. It also covers more general attention modules like the Perceiver that aims to be domain-agnostic. Finally, it discusses work on transferring pretrained language Transformers to other modalities through frozen weights, showing they can function as universal computation engines.
The document contains mathematical equations and notation related to machine learning and probability distributions. It involves defining terms like P(y|x), which represents the probability of outcome y given x, and exploring ways to calculate the expected value of an objective function Rn under different probability distributions p and q over the variables x and y. The goal appears to be to select parameters θ to optimize some objective while accounting for the distributions of the training data.
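If the aim is to evaluate the objective under a test distribution q using data drawn from a training distribution p (an assumption on my part; the document's exact setup is not fully legible here), the standard identity involved is the importance-weighting relation

```latex
\mathbb{E}_{(x,y)\sim q}\bigl[R_n\bigr] \;=\; \mathbb{E}_{(x,y)\sim p}\!\left[\frac{q(x,y)}{p(x,y)}\, R_n\right],
```

so an expectation under q can be estimated from p-samples by reweighting each sample with the ratio q/p.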
This document provides an overview of POMDP (Partially Observable Markov Decision Process) and its applications. It first defines the key concepts of POMDP such as states, actions, observations, and belief states. It then uses the classic Tiger problem as an example to illustrate these concepts. The document discusses different approaches to solve POMDP problems, including model-based methods that learn the environment model from data and model-free reinforcement learning methods. Finally, it provides examples of applying POMDP to games like ViZDoom and robot navigation problems.
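For reference, the belief-state update that ties these concepts together (written in standard POMDP notation; the document's own symbols may differ) is

```latex
b'(s') \;=\; \eta\, O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s),
```

where b is the current belief over hidden states, a the action taken, o the received observation, T and O the transition and observation models, and η a normalizing constant.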
The document discusses various techniques for artificial intelligence and online experimentation. It covers topics like Python, Scala, web development, AI, ROI analysis, A/B testing, cost per acquisition, click-through rate, epsilon-greedy algorithms, softmax, UCB, multi-armed bandits, minimax, alpha-beta pruning, reinforcement learning, and more. Many sections provide references and links for further reading.
An introduction to particle filters.
References
- http://www.jstatsoft.org/v30/i06/paper
I use this library.
- Sequential Monte Carlo Methods in Practice (Springer)
Chapter 1 of this book is very well organized and recommended; it also describes a range of applications, so it strikes me as well suited to practical use.
- Standard machine learning models do not guarantee that motion predictions satisfy physical conservation laws.
- The paper proposes learning the "equations of motion" in the form of a Hamiltonian function, parameterized by a neural network, so that predicted trajectories obey conservation laws.
- The learned Hamiltonian is integrated on the fly to generate predictions, ensuring that the predictions conserve energy (a minimal sketch follows below).
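As a concrete illustration of this idea, here is a minimal sketch of a Hamiltonian-style predictor: a small network H(q, p) whose autograd gradients define the dynamics dq/dt = ∂H/∂p, dp/dt = -∂H/∂q, integrated with a simple Euler step. The names (HNet, hamiltonian_step) and the integrator are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of Hamiltonian-based prediction (illustrative names, not the paper's code).
import torch
import torch.nn as nn

class HNet(nn.Module):
    """MLP mapping the state (q, p) to a scalar Hamiltonian value."""
    def __init__(self, dim: int = 1, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, q: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([q, p], dim=-1))

def hamiltonian_step(model: HNet, q: torch.Tensor, p: torch.Tensor, dt: float = 0.01):
    """One Euler step of Hamilton's equations using autograd gradients of H."""
    q = q.detach().requires_grad_(True)
    p = p.detach().requires_grad_(True)
    H = model(q, p).sum()
    dHdq, dHdp = torch.autograd.grad(H, (q, p))
    return q + dt * dHdp, p - dt * dHdq   # dq/dt = dH/dp, dp/dt = -dH/dq

# Usage: roll out a short trajectory from an initial state (untrained model, shapes only).
model = HNet(dim=1)
q, p = torch.tensor([[1.0]]), torch.tensor([[0.0]])
for _ in range(100):
    q, p = hamiltonian_step(model, q, p)
```

A symplectic integrator would preserve energy better than plain Euler; the point here is only that the learned H, not the trajectory itself, is what the network represents.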
The document discusses spherical CNNs, which are CNNs built for data on spherical domains like the 2-sphere (S2). It notes that the basic building block of CNNs is the cross-correlation (convolution) operation, and that on S2 this can be implemented via Fourier correlation, i.e. computing the correlation in the spherical Fourier domain; this is how the filter is "translated" (rotated) over S2.
Nonlinear Filtering and Path Integral Method (Paper Review), Kohta Ishikawa
The document reviews nonlinear filtering from a path-integral perspective drawn from quantum physics. It presents stochastic differential equations (SDEs) to model the system and observation processes, and describes how to represent the probability density of the state using a path measure and a functional-differentiation approach, leading to a Fokker-Planck equation and a path-integral representation of the filtering distribution.
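For reference, a generic form of such a system/observation pair of SDEs (my notation; the paper's exact model may differ) is

```latex
dx_t = f(x_t)\,dt + \sigma(x_t)\,dW_t, \qquad dy_t = h(x_t)\,dt + dV_t,
```

with W_t and V_t independent Wiener processes; the filtering problem is then to characterize the conditional density of the state x_t given the observation path up to time t.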
This document provides an introduction to pair trading based on cointegration. Pair trading selects two highly correlated stocks and trades the spread between their prices; cointegration refers to a long-run equilibrium relationship between the two price series, which pair trading exploits. The document outlines the basic idea of trading when the prices diverge, and simulates pair trading in R: estimating the spread, testing for cointegration, generating signals, and backtesting performance. In summary, pair trading is a quantitative strategy that aims to profit from mean reversion of cointegrated price spreads.
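The document's workflow is in R; the following is a rough Python analogue of the same steps (hedge-ratio estimation, cointegration test, z-score signals) using statsmodels. The thresholds and function name are placeholders, not the original code.

```python
# Rough Python analogue of the pair-trading workflow described above.
# Assumes `a` and `b` are aligned pandas Series of two stocks' prices.
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

def pair_trade_signals(a: pd.Series, b: pd.Series, entry: float = 2.0, exit: float = 0.5):
    # 1. Estimate the hedge ratio by OLS: a ~ const + beta * b, and form the spread.
    beta = sm.OLS(a, sm.add_constant(b)).fit().params.iloc[1]
    spread = a - beta * b

    # 2. Engle-Granger cointegration test on the two price series.
    _, pvalue, _ = coint(a, b)
    if pvalue > 0.05:
        print(f"warning: pair may not be cointegrated (p={pvalue:.3f})")

    # 3. Z-score of the spread drives entry/exit signals (mean-reversion bet).
    z = (spread - spread.mean()) / spread.std()
    signal = pd.Series(0, index=spread.index)
    signal[z > entry] = -1    # spread too wide: short a, long b
    signal[z < -entry] = 1    # spread too narrow: long a, short b
    signal[z.abs() < exit] = 0
    return spread, z, signal
```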
The document discusses particle filter tracking in Python. Particle filters use a distribution of samples, or "particles", to approximate the posterior distribution of the state. The particle filter algorithm involves predicting the movement of particles, updating weights based on observation and likelihood, and resampling particles. Example Python code is provided to implement a particle filter for tracking an object in video frames using OpenCV.
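The OpenCV tracking code itself is not reproduced here; below is a minimal generic sketch of the predict / weight / resample loop described above. The model functions (`transition`, `likelihood`) are placeholders to be supplied by the application.

```python
# Minimal generic particle filter step: predict -> weight -> resample.
import numpy as np

def particle_filter_step(particles, weights, observation,
                         transition, likelihood, rng=np.random.default_rng()):
    """particles: (N, d) array; weights: (N,) array summing to 1.
    transition(particles) -> propagated particles (prediction step).
    likelihood(observation, particles) -> per-particle observation likelihood."""
    # 1. Predict: propagate each particle through the motion model.
    particles = transition(particles)

    # 2. Update: reweight by the observation likelihood and renormalize.
    weights = weights * likelihood(observation, particles)
    weights /= weights.sum()

    # 3. Resample: draw N particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

The state estimate at each step is then, for example, the weighted mean of the particles.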
8. States, Pure States, and Mixed States
- Mixed states
  - A pure state, written out in a concrete representation, takes the form
    |\psi\rangle = \sum_x |x\rangle\langle x|\psi\rangle = \sum_x \psi(x)\,|x\rangle
    i.e. a superposition of eigenstates (states located at various positions coexisting at the same time).
  - On the other hand, there are also cases where we want a "classical superposition" of several pure states:
    - In statistical mechanics, we take a probabilistic average over the possible configurations of a large number of particles.
    - The individual configurations do not physically interfere with one another, so this is a classical superposition.
    - Such a state is called a mixed state.
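A standard one-qubit example (mine, not from the slides) makes the distinction concrete: compare the pure superposition and the classical mixture

```latex
|\psi\rangle = \tfrac{1}{\sqrt{2}}\bigl(|0\rangle + |1\rangle\bigr)
\qquad\text{vs.}\qquad
\rho = \tfrac{1}{2}\,|0\rangle\langle 0| + \tfrac{1}{2}\,|1\rangle\langle 1|.
```

Both give 50/50 outcomes when measured in the {|0⟩, |1⟩} basis, but a measurement in the {|+⟩, |−⟩} basis distinguishes them: the pure state yields |+⟩ with certainty, while the mixture still gives 50/50.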
9. States, Pure States, and Mixed States
- Mixed states
  - By definition, for a mixed state obtained by mixing the pure states |\psi_1\rangle, \dots, |\psi_k\rangle with probabilistic weights p_1, \dots, p_k, the probability distribution of an observable A is
    P(a) = \sum_i p_i \, |\langle a|\psi_i\rangle|^2
    where \langle a| is the bra vector of the state in which the observable A takes the eigenvalue a.
10. States, Pure States, and Mixed States
- Mixed states
  - To represent such a mixed state, it is convenient to introduce the density operator
    \rho = \sum_i p_i \, |\psi_i\rangle\langle\psi_i|, \qquad \rho = \sum_{x,x'} |x\rangle\langle x|\rho|x'\rangle\langle x'|
    (the matrix representation of the density operator is the density matrix).
  - Given the density operator, the probability distribution of an observable A can be written as
    P(a) = \langle a|\rho|a\rangle
    and the expectation value as
    \sum_a a\,P(a) = \sum_a a\,\langle a|\rho|a\rangle = \sum_a \langle a|\rho A|a\rangle \equiv \mathrm{Tr}(\rho A).
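As a quick numerical check of these formulas (my example, not from the slides), the following NumPy snippet builds a qubit density matrix from a mixture and verifies that P(a) = ⟨a|ρ|a⟩ matches the mixture formula and that the expectation value equals Tr(ρA).

```python
# Verify P(a) = <a|rho|a> and <A> = Tr(rho A) for a simple qubit mixture.
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

# Mixed state: |0> with weight 0.3 and |+> with weight 0.7.
weights = [0.3, 0.7]
states = [ket0, plus]
rho = sum(p * np.outer(s, s.conj()) for p, s in zip(weights, states))

# Observable A = Pauli Z, with eigenvectors |0> (a = +1) and |1> (a = -1).
A = np.diag([1.0, -1.0])

for a_ket, a_val in [(ket0, +1), (ket1, -1)]:
    p_mix = sum(p * abs(a_ket.conj() @ s) ** 2 for p, s in zip(weights, states))
    p_rho = (a_ket.conj() @ rho @ a_ket).real
    print(a_val, p_mix, p_rho)        # the two probabilities agree

print(np.trace(rho @ A).real)         # expectation value Tr(rho A)
```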
20. Handling the Quantum Hamiltonian
- A Hamiltonian that includes quantum effects cannot be evaluated directly:
  \log P(x) = \log \mathrm{Tr}\{ e^{-(H_c + H_q)} \}
  - the exponential of a non-diagonal matrix?
- Suzuki-Trotter expansion:
  \exp\Bigl( \sum_l A_l \Bigr) = \Bigl( \prod_l \exp\frac{A_l}{m} \Bigr)^{m} + O\Bigl( \frac{1}{m} \Bigr)
  - The off-diagonal part of the Hamiltonian is approximated in a computable form, and MCMC sampling (or similar) is carried out.
  - The result takes a form in which each of the m slices corresponds to an independent diagonal Hamiltonian, so in terms of implementation this means running m simulated annealings (a small numerical check of the expansion follows below).
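As a quick numerical illustration of the Suzuki-Trotter expansion (generic symmetric matrices, not the actual Hamiltonian), one can check that the error of the split product shrinks roughly like 1/m:

```python
# Numerical check of exp(A + B) ~ (exp(A/m) exp(B/m))^m + O(1/m).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2   # stand-in for the "classical" part
B = rng.standard_normal((4, 4)); B = (B + B.T) / 2   # stand-in for the "quantum" (off-diagonal) part

exact = expm(A + B)
for m in (1, 4, 16, 64):
    approx = np.linalg.matrix_power(expm(A / m) @ expm(B / m), m)
    print(m, np.linalg.norm(exact - approx))          # error decreases roughly as 1/m
```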
21. Annealing
- Introducing a temperature term
  \log P(x) = \log \mathrm{Tr}\{ e^{-\beta (H_c + H_q)} \}
  where \beta is the inverse temperature (physically 1/k_B T).
- Annealing
  - Simulated annealing: sample while gradually increasing \beta (lowering the temperature).
  - Quantum annealing: sample while gradually increasing \beta and gradually driving the coefficient \Gamma of the quantum Hamiltonian toward zero.
(Figure: schematic annealing schedules comparing SA and QA, with the temperature T on one axis and the quantum coefficient Γ on the other.)
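A minimal schedule sketch of the two procedures contrasted in the figure (illustrative only; the actual sampler and Hamiltonian are not shown):

```python
# Illustrative annealing schedules: SA raises beta; QA also drives Gamma -> 0.
def sa_schedule(steps, beta_max=10.0):
    """Simulated annealing: inverse temperature beta grows; no quantum term (Gamma = 0)."""
    for t in range(steps):
        yield beta_max * (t + 1) / steps, 0.0

def qa_schedule(steps, beta_max=10.0, gamma0=1.0):
    """Quantum annealing: beta grows while the quantum coefficient Gamma decays to zero."""
    for t in range(steps):
        frac = (t + 1) / steps
        yield beta_max * frac, gamma0 * (1.0 - frac)

# At each (beta, gamma) one would run an MCMC sweep over the m Trotter replicas.
for beta, gamma in qa_schedule(5):
    print(f"beta={beta:.2f}, gamma={gamma:.2f}")
```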