16. Supplement: clusDCA algorithm 1
1) Relax the constraint on the normalization term
2) Optimization (least squares)
3) Diffusion state of the GO label graph
\[ \log \hat{s}_{ij} = w_i^{T} x_j - \log \sum_{j'} \exp\{ w_i^{T} x_{j'} \} \]
\[ \Rightarrow \quad \log \hat{s}_{ij} = w_i^{T} x_j \]
diffusion state matrix L
Q: a small positive constant
α: ‘back propagation’ parameter
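
The relaxation and least-squares step above can be sketched as follows. This is a minimal illustration, not the clusDCA implementation: it assumes the least-squares target is the matrix L = log(S + Q) - log(Q), which is not spelled out on the slide, and it solves the fit with a truncated SVD. The function name `low_rank_embed` and the default value Q = 1/n are hypothetical choices made for the example.

```python
import numpy as np

def low_rank_embed(S, d, q=None):
    """Fit w_i^T x_j ~ L_ij in the least-squares sense with rank d.

    S : (n, n) diffusion state matrix, each row a probability distribution.
    d : target dimension.
    q : small positive constant Q (assumed default: 1/n).
    """
    n = S.shape[0]
    if q is None:
        q = 1.0 / n
    # Assumed target of the relaxed model log s_ij = w_i^T x_j.
    L = np.log(S + q) - np.log(q)
    # The truncated SVD gives the best rank-d approximation in Frobenius norm.
    U, sigma, Vt = np.linalg.svd(L, full_matrices=False)
    root = np.sqrt(sigma[:d])
    W = U[:, :d] * root      # row i is w_i
    X = Vt[:d, :].T * root   # row j is x_j
    return W, X, L
```

With d equal to n the product `W @ X.T` reproduces L exactly; for smaller d the residual is the part of L carried by the discarded singular values.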
17. Supplement: clusDCA algorithm 2
4) Projection of the co-expression network graph
y′i: projection of the gene vector xi
zij: pairwise affinity score
W: transformation matrix
Optimization formula for W
fj: set of genes that are positively (negatively) annotated with function j.
X: gene vector, Y: functional vector
18. Supplement: GeneMANIA
1. A linear regression-based algorithm that calculates a single
composite functional association network from multiple data
sources.
2. A label propagation algorithm for predicting gene function given
the composite functional association network.
http://morrislab.med.utoronto.ca/projects.html
#16: B = transition probability matrix, p = restart probability, t = number of steps; ei is an n-dimensional distribution vector with ei(i) = 1 and ei(j) = 0, ∀ j ≠ i
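
A random walk with restart (RWR) using these symbols can be sketched as below. This is an illustrative sketch that assumes the standard update s_i(t+1) = (1 - p) s_i(t) B + p e_i and an adjacency matrix in which every node has at least one edge; the function name `diffusion_states` and the convergence tolerance are hypothetical.

```python
import numpy as np

def diffusion_states(A, p=0.5, tol=1e-10, max_steps=1000):
    """Return the matrix whose row i is the RWR distribution started at node i.

    A : (n, n) non-negative adjacency matrix with no all-zero row (assumption).
    p : restart probability.
    """
    A = np.asarray(A, dtype=float)
    B = A / A.sum(axis=1, keepdims=True)  # B[i, j] = probability of moving i -> j
    n = B.shape[0]
    E = np.eye(n)                         # row i is the indicator vector e_i
    S = E.copy()                          # step t = 0: each walk sits at its start
    for _ in range(max_steps):
        S_next = (1 - p) * S @ B + p * E  # one step for all start nodes at once
        if np.abs(S_next - S).max() < tol:
            return S_next
        S = S_next
    return S
```

Each row of the result stays a probability distribution, and the iteration converges to p (I - (1 - p) B)^{-1}, which gives a direct way to check the output on small graphs.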
2) Approximate the diffusion state with a multinomial logistic model. Minimizing the KL divergence yields the dimension-reduced vectors wi, xi.
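
The objective in this step can be sketched as follows. The code only evaluates the KL divergence between the observed diffusion states and the multinomial logistic (softmax) model; it does not perform the minimization. The function names and the averaging over nodes are assumptions made for the illustration.

```python
import numpy as np

def model_distributions(W, X):
    """Softmax model: s_hat[i, j] = exp(w_i^T x_j) / sum_j' exp(w_i^T x_j')."""
    logits = W @ X.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    expd = np.exp(logits)
    return expd / expd.sum(axis=1, keepdims=True)

def kl_objective(S, W, X, eps=1e-12):
    """Mean over nodes i of KL(s_i || s_hat_i); eps guards against log(0)."""
    S_hat = model_distributions(W, X)
    per_node = np.sum(S * (np.log(S + eps) - np.log(S_hat + eps)), axis=1)
    return per_node.mean()
```

The objective is zero when the model reproduces the diffusion states exactly and positive otherwise, so a gradient-based optimizer over W and X would drive it toward zero.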
#17: Properties of the GO graph
a directed acyclic graph (DAG) over functional labels where the edges represent various semantic relationships.
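A DAG is a directed acyclic graph: a directed graph with no cycles (it consists of vertices and directed edges, and following the edges never leads back to the starting vertex).
the ‘is a’ and ‘part of’ edges, which result in a hierarchy of labels with edges going from the more specific to the more generic terms.
Problems caused by this hierarchical structure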
generally not present in molecular networks
a naive application of RWR on the ontology graph, in which the edges are treated as undirected, unfairly favors high-level nodes, which tend to have higher centrality.
allowing a random walk to only move from high- to low-level nodes would greatly restrict the portion of the graph a random walk can explore.
Addressing these issues
allow both edge directions but with different weights, whose ratio is controlled by the ‘back propagation’ parameter α
α generally shrinks the diffusion scores of high-level nodes
B denoting the transition matrix of the original graph with unidirectional edges
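
One way to read the two-direction weighting is sketched below. It is a hypothetical construction, not necessarily the formula used by clusDCA: each original edge keeps weight 1, its reverse gets weight α, and rows are then normalized into transition probabilities. The function name `backprop_transition` and the convention that A[i, j] = 1 encodes an original edge from i to j are assumptions.

```python
import numpy as np

def backprop_transition(A, alpha):
    """Transition matrix that allows both edge directions with ratio alpha.

    A     : (n, n) adjacency matrix of the original unidirectional graph,
            A[i, j] = 1 for an edge i -> j (assumed convention).
    alpha : weight of the reverse direction relative to the original one.
    """
    A = np.asarray(A, dtype=float)
    weights = A + alpha * A.T  # original edges: 1, reversed edges: alpha
    row_sums = weights.sum(axis=1, keepdims=True)
    # Rows with no outgoing weight (possible when alpha = 0) are left as zeros.
    return np.divide(weights, row_sums,
                     out=np.zeros_like(weights), where=row_sums > 0)
```

Under this reading, α = 1 reduces to the undirected walk and α = 0 to the strictly one-way walk, so intermediate values trade off between the two problems listed above.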