2. 2
Lecture 2: supervised learning
Machine learning : supervised learning, unsupervised learning, etc.
Supervised learning: linear regression, logistic regression (classification), etc.
1-page Review
Supervised learning
Given a set of labeled examples, $D = \{(x_i, y_i)\}_{i=1}^{N}$, learn a mapping $f: X \to Y$
which minimizes $L(\hat{y} = f(x), y)$
Unsupervised learning
Given a set of unlabeled examples, $D = \{x_i\}_{i=1}^{N}$, learn a meaningful
representation of the data
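Linear regression
$h_\theta(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$
$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Logistic regression
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
$h_\theta(x) \ge 0.5 \Rightarrow y = 1$
$h_\theta(x) < 0.5 \Rightarrow y = 0$
$\ell(\theta) = \sum_{i=1}^{m} y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right)$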
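As a rough illustration of the two models in this review, the sketch below evaluates a least-squares cost for a linear hypothesis and thresholds a logistic hypothesis at 0.5. The toy data, the parameter values, and all function names are made up for the example.

```python
import math

def linear_hypothesis(theta, x):
    """h_theta(x) = theta^T x, with x[0] = 1 acting as the bias feature."""
    return sum(t * xi for t, xi in zip(theta, x))

def squared_error_cost(theta, xs, ys):
    """J(theta) = 1/2 * sum_i (h_theta(x_i) - y_i)^2."""
    return 0.5 * sum((linear_hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys))

def logistic_hypothesis(theta, x):
    """h_theta(x) = 1 / (1 + exp(-theta^T x))."""
    return 1.0 / (1.0 + math.exp(-linear_hypothesis(theta, x)))

def predict_label(theta, x):
    """Predict 1 when h_theta(x) >= 0.5, otherwise 0."""
    return 1 if logistic_hypothesis(theta, x) >= 0.5 else 0

# Hypothetical toy data generated by y = 1 + 2x
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [1.0, 3.0, 5.0]
perfect_cost = squared_error_cost([1.0, 2.0], xs, ys)  # parameters that fit exactly
worse_cost = squared_error_cost([0.0, 2.0], xs, ys)    # bias is off by one
```

The cost is zero only for the parameters that reproduce every label, and the logistic prediction flips exactly where $\theta^T x$ changes sign.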
3. Overfitting?
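The problem of fitting the training data too closely (over-fitting), so that the model fails to capture the general trend
Typically caused by an overly complex model (an unnecessarily wiggly curve) that is fitted too closely to the training data
Ways to reduce overfitting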
Reduce the number of parameters
(construct the feature vector that represents the data effectively, or reduce the data dimension)
Regularization (cost function)
The problem of overfitting
[Figure: hypothesis function]
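One common way to regularize through the cost function is to add an L2 penalty on the parameters. The slide does not say which penalty is meant, so the L2 form, the weight `lam`, and the choice to leave the bias term unpenalized are all assumptions in this sketch.

```python
def regularized_cost(theta, xs, ys, lam):
    """Least-squares cost plus an assumed L2 penalty; theta[0] (bias) is not penalized."""
    data_term = 0.5 * sum(
        (sum(t * xi for t, xi in zip(theta, x)) - y) ** 2 for x, y in zip(xs, ys)
    )
    penalty = 0.5 * lam * sum(t ** 2 for t in theta[1:])
    return data_term + penalty

# Hypothetical toy data generated by y = 1 + 2x
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
ys = [1.0, 3.0, 5.0]
plain = regularized_cost([1.0, 2.0], xs, ys, lam=0.0)
penalized = regularized_cost([1.0, 2.0], xs, ys, lam=0.1)
```

With `lam > 0` even a perfect fit pays a price for large parameters, which is what pushes the optimizer toward simpler curves.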
8. 8
Clustering
Algorithms that group together data with similar characteristics
K-means algorithm
Spectral clustering
Anomaly detection
Density estimation
Unsupervised learning
EECE695J (2017)
Sang Jun Lee (POSTECH)
K-means
Clustering based on the distance from each cluster center
Spectral clustering
$G = (V, E)$: vertices and edges
How to partition the graph $G$ at low cost
9. 9
K-means algorithm
Given N data points $\{x_i\}_{i=1}^{N}$, find K cluster centers $\mu_k$ ($k = 1, \dots, K$)
Partition the data into K groups
Initialize $\mu_k$ ($k = 1, \dots, K$)
Repeat until convergence!
Assignment step
Assign all data points to the cluster for which $\lVert x_i - \mu_k \rVert^2$ is smallest
Update step
Compute new means for every cluster
$\mu_k = \frac{1}{N_k} \sum_{i \in C_k} x_i$
What we want is the optimal clusters and the optimal means together..
The algorithm fixes the means and finds the optimal clusters first,
then computes the optimal means
for the clusters it found
Not optimal solution!
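The alternation between the assignment step and the update step can be sketched in plain Python as follows. This is a minimal illustration rather than the lecture's implementation: the function names, the seed, and the toy points are assumptions, and the centers are initialized by sampling k data points.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize with k distinct data points
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point goes to the nearest center
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_assignment == assignment:  # converged: no point changed cluster
            break
        assignment = new_assignment
        # Update step: each center becomes the mean of its members
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment

# Hypothetical toy data: two well-separated groups of three points
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = kmeans(points, k=2)
```

Each step can only lower the total squared distance, so the loop terminates, but it may stop at a local optimum that depends on the initial centers.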
16. 16
K-means algorithm in TensorFlow
IPython command for drawing plots inline
Build the sample data from normal distributions
vectors_set : 2000x2
18. 18
K-means algorithm in TensorFlow
Randomly choose k of the 2000 points
as the initial means
expanded_vectors: (tensor) 1x2000x2
expanded_centroids: (tensor) 4x1x2
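The two shapes listed here, 1x2000x2 and 4x1x2, are presumably chosen so that subtracting the tensors broadcasts to a 4x2000x2 tensor of differences; summing the squares over the last axis then gives a 4x2000 distance table, and an argmin over the first axis gives one cluster index per point. The plain-Python sketch below mimics that computation with nested loops on a tiny hypothetical data set (N = 5, K = 2); the variable names are illustrative.

```python
vectors = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0), (1.0, 0.0)]  # N x 2
centroids = [(0.0, 0.0), (5.0, 5.0)]                                     # K x 2

# distances[k][n] plays the role of the K x N table obtained by broadcasting
# a 1 x N x 2 tensor against a K x 1 x 2 tensor and summing squares over the last axis
distances = [
    [sum((v_d - c_d) ** 2 for v_d, c_d in zip(v, c)) for v in vectors]
    for c in centroids
]

# argmin over the centroid axis gives one cluster index per data point
assignments = [
    min(range(len(centroids)), key=lambda k: distances[k][n])
    for n in range(len(vectors))
]
```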
19. 19
K-means algorithm in TensorFlow
Assignment step as implemented in the code
Graph describing the sequence of operations
20. 20
K-means algorithm in TensorFlow
Execution results (100 iterations)
K=4 K=2
21. 21
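Some issues with the K-means algorithm
How should the means be initialized?
Random initialization:
Randomly pick k of the N data points as the initial values
Depending on the initial values, the clustering result can be poor
Repeat the random initialization several times
Choosing the number of clusters k
Usually the number of clusters is decided according to the distribution of the data..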
K-means algorithm
Figure source: https://wikidocs.net/4693
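Repeating the random initialization and keeping the run with the lowest cost can be sketched as below. The 1-D toy data, the function names, and the use of the summed squared distance to the nearest center as the cost are assumptions made for the example.

```python
import random

def run_kmeans(points, k, rng, iterations=50):
    """One K-means run on 1-D data from a random initialization; returns (cost, centers)."""
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center
        centers = [sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)]
    cost = sum(min((p - c) ** 2 for c in centers) for p in points)
    return cost, centers

def best_of_restarts(points, k, restarts=10, seed=0):
    """Run K-means several times and keep the result with the lowest cost."""
    rng = random.Random(seed)
    return min((run_kmeans(points, k, rng) for _ in range(restarts)), key=lambda r: r[0])

points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4, 10.0, 10.2, 10.4]  # hypothetical 1-D data
cost, centers = best_of_restarts(points, k=3)
```

A single run can get stuck, for instance when two initial centers fall in the same group; taking the minimum over several runs makes that outcome less likely without guaranteeing the global optimum.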
22. 22
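Some issues with the K-means algorithm
Choosing the number of clusters k
Elbow method:
Plot the cost function against k and choose as k the elbow point, beyond which the cost changes little as k increases
When the cost function changes smoothly with k, the elbow point is hard to find.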
K-means algorithm
Figure source: https://wikidocs.net/4693
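The slide does not spell out the cost function that is plotted against k. A common choice, assumed here, is the sum of squared distances from each point to its nearest center:

```latex
J(k) = \sum_{i=1}^{N} \min_{1 \le j \le k} \lVert x_i - \mu_j \rVert^2
```

Under this definition the optimal cost never increases as k grows (it reaches zero once every point has its own center), which is why one looks for a bend in the curve rather than for its minimum.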
23. 23
Clustering criteria:
The affinities of data within the same cluster should be high
The affinities of data between different clusters should be low
Spectral clustering
Figure reference: POSTECH CSED441 lecture13
Relationships between data points
24. 24
Optimization objective:
The goal is to find the cluster that maximizes the total affinity (the vector encodes the cluster membership of each data point)
Convert the discrete optimization problem to continuous domain (+ normalize the length of the vector)
Spectral clustering
Discrete optimization problem
Maximum eigenvalue problem
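The step from the discrete problem to the eigenvalue problem can be written out as follows. The symbols are assumptions, since the slide's own notation did not survive: $W$ is a symmetric affinity matrix and $u$ is the membership vector, relaxed from binary entries to real entries with unit length.

```latex
\max_{u \in \{0,1\}^N} u^\top W u
\quad\longrightarrow\quad
\max_{u \in \mathbb{R}^N,\ \lVert u \rVert = 1} u^\top W u

\mathcal{L}(u, \lambda) = u^\top W u - \lambda \left( u^\top u - 1 \right),
\qquad
\nabla_u \mathcal{L} = 0 \;\Rightarrow\; W u = \lambda u,
\qquad
u^\top W u = \lambda
```

Every stationary point is an eigenvector of $W$ and its objective value is the corresponding eigenvalue, so the relaxed problem is solved by the eigenvector of the largest eigenvalue.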
29. 29
Minimum-cut algorithm:
Find the second minimum eigenvector of $L = D - W$
Partition the second minimum eigenvector
Note: second minimum eigenvector of the two-moon data
Note: the minimum-cut algorithm does not find the optimal solution; it finds an approximation of the optimal solution
Spectral clustering
Second smallest eigenvector
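A small sketch of partitioning by the second-smallest eigenvector, assuming the graph Laplacian $L = D - W$ with affinity matrix $W$ and diagonal degree matrix $D$. To stay dependency-free it uses shifted power iteration instead of a library eigensolver, which is a substitution made for this example; the function name and the toy graph (two triangles joined by one weak edge) are hypothetical.

```python
def fiedler_partition(weights, iterations=500):
    """Split a connected graph in two by the sign of the second-smallest eigenvector of L = D - W."""
    n = len(weights)
    degrees = [sum(row) for row in weights]
    laplacian = [
        [(degrees[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]
    # 2 * max degree bounds the largest eigenvalue of L (Gershgorin), so
    # shift * I - L is positive semidefinite with the eigenvalue order reversed
    shift = 2.0 * max(degrees)
    v = [float(i) for i in range(n)]  # arbitrary non-constant starting vector
    for _ in range(iterations):
        # Power iteration on (shift * I - L)
        v = [shift * v[i] - sum(laplacian[i][j] * v[j] for j in range(n)) for i in range(n)]
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant (smallest) eigenvector
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return [0 if x < 0 else 1 for x in v]

# Two triangles (nodes 0-2 and 3-5) joined by a weak edge between nodes 2 and 3
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
labels = fiedler_partition(W)
```

Thresholding the eigenvector at zero is one simple way to partition it; it cuts the weak edge here, but in general it only approximates the best cut.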
32. 32
When normal data follow a distribution like the one below,
how can anomalous data points (such as the marked ones) be detected?
Density estimation using multivariate Gaussian distribution
Anomaly detection
33. 33
Density estimation using multivariate Gaussian distribution
Parameter fitting:
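Given training set $\{x^{(i)} : i = 1, \dots, m\}$
$\mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}$
$\Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)} - \mu)(x^{(i)} - \mu)^T$
Anomaly if $p(x; \mu, \Sigma) < \epsilon$ for given $\epsilon$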
Anomaly detection
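The fitting and thresholding procedure can be sketched for the two-dimensional case in plain Python. The training points, the threshold value, and the function names are made up for the example; the covariance uses the 1/m normalization, and the 2x2 inverse is written out by hand.

```python
import math

def fit_gaussian(samples):
    """Return the mean vector and 2x2 covariance (1/m normalization) of 2-D samples."""
    m = len(samples)
    mu = [sum(s[d] for s in samples) / m for d in range(2)]
    cov = [
        [sum((s[a] - mu[a]) * (s[b] - mu[b]) for s in samples) / m for b in range(2)]
        for a in range(2)
    ]
    return mu, cov

def density(x, mu, cov):
    """Bivariate Gaussian density p(x; mu, Sigma)."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det], [-cov[1][0] / det, cov[0][0] / det]]
    d = [x[0] - mu[0], x[1] - mu[1]]
    mahalanobis = sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))
    return math.exp(-0.5 * mahalanobis) / (2.0 * math.pi * math.sqrt(det))

def is_anomaly(x, mu, cov, epsilon):
    """Flag x when its density under the fitted Gaussian falls below epsilon."""
    return density(x, mu, cov) < epsilon

# Hypothetical training set of normal points clustered around (1, 1)
train = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)]
mu, cov = fit_gaussian(train)
epsilon = 0.01  # assumed threshold; in practice it would be tuned on held-out data
```

A point near the training mean has high density and passes, while a point far from it, such as (3, 3), falls below the threshold and is flagged.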