$$h_v = f(x_v, x_{co[v]}, h_{ne[v]}, x_{ne[v]}) $$
$$o_v = g(h_v, x_v)$$
- Functions
- $f$: local transition function which is shared among all nodes
- $g$ : local output function
- Symbols
- $x$: the input feature
- $h$: the hidden state
- $co[v]$: the set of edges connected to node $v$
- $ne[v]$: the set of neighbors of node $v$
- $x_v$: the features of $v$
- $x_{co[v]}$: the features of its edges
- $h_{ne[v]}$: the states of the nodes in the neighborhood of $v$
- $x_{ne[v]}$: the features of the nodes in the neighborhood of $v$
Example for node $l_1$
- $x_{l_1}$: the input feature
- $co[l_1]$: $l_{(1, 4)}, l_{(1, 6)}, l_{(3, 1)}, l_{(1, 2)}$
- $ne[l_1]$: $l_2, l_3, l_4, l_6$
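As a concrete numerical sketch of the local functions $f$ and $g$ (a simple sum-then-nonlinearity form is assumed here; the dimensions and the weight matrices `W_x`, `W_e`, `W_hn`, `W_xn`, `W_o` are illustrative, not part of the original formulation):

```python
import numpy as np

# Illustrative sizes: node feature, edge feature, state, output (assumptions)
d_x, d_e, d_h, d_o = 4, 3, 8, 2
rng = np.random.default_rng(0)
W_x  = rng.normal(size=(d_h, d_x))
W_e  = rng.normal(size=(d_h, d_e))
W_hn = rng.normal(size=(d_h, d_h))
W_xn = rng.normal(size=(d_h, d_x))
W_o  = rng.normal(size=(d_o, d_h + d_x))

def f(x_v, x_co, h_ne, x_ne):
    """Local transition: combine the node's feature, its edge features,
    and its neighbors' states and features into a new state h_v."""
    s = W_x @ x_v
    s = s + sum(W_e  @ e for e in x_co)   # edges connected to v
    s = s + sum(W_hn @ h for h in h_ne)   # states of the neighbors
    s = s + sum(W_xn @ x for x in x_ne)   # features of the neighbors
    return np.tanh(s)

def g(h_v, x_v):
    """Local output: map the state and the feature of v to an output o_v."""
    return W_o @ np.concatenate([h_v, x_v])

# Node l_1 from the example: 4 incident edges, 4 neighbors (l_2, l_3, l_4, l_6)
x_v  = rng.normal(size=d_x)
x_co = [rng.normal(size=d_e) for _ in range(4)]
h_ne = [np.zeros(d_h) for _ in range(4)]
x_ne = [rng.normal(size=d_x) for _ in range(4)]
h_l1 = f(x_v, x_co, h_ne, x_ne)
o_l1 = g(h_l1, x_v)
```

Because $f$ is shared among all nodes, the same weight matrices are reused for every node in the graph.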
For the whole set of nodes, the update and the output can be written compactly as
$$H=F(H, X)$$
$$O=G(H, X_N)$$
- $H$: the matrix constructed by stacking all the states (all intermediate results)
- $O$: the matrix constructed by stacking all the outputs (the final outputs)
- $X$: the matrix constructed by stacking all the features
- $X_N$: the matrix constructed by stacking all the node features
- $F$: the global transition function
- $G$: the global output function
When the update is iterated over multiple time steps (layers), $$H^{t+1} = F(H^t, X)$$ which, provided $F$ is a contraction map, converges to the fixed point of $H = F(H, X)$.
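A minimal sketch of this global iteration on a toy 6-node graph (so that $l_1$ has neighbors $l_2, l_3, l_4, l_6$ as in the example above); the adjacency matrix, the weight scaling, $T$, and the tolerance are illustrative assumptions, and edge features are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_x, d_h = 6, 4, 8                     # 6 nodes; feature/state sizes are illustrative
A = np.zeros((n, n))
for u, v in [(0, 1), (0, 2), (0, 3), (0, 5)]:   # l_1 adjacent to l_2, l_3, l_4, l_6
    A[u, v] = A[v, u] = 1
X = rng.normal(size=(n, d_x))

# Small weights keep F (roughly) a contraction, which the fixed-point view requires
W_x  = rng.normal(size=(d_x, d_h)) * 0.1
W_h  = rng.normal(size=(d_h, d_h)) * 0.1
W_xn = rng.normal(size=(d_x, d_h)) * 0.1

def F(H, X):
    """Global transition: row v is f(x_v, h_ne[v], x_ne[v]) (edge features omitted)."""
    return np.tanh(X @ W_x + A @ H @ W_h + A @ X @ W_xn)

# Iterate H^{t+1} = F(H^t, X) until a time step T or until the states stop changing
H = np.zeros((n, d_h))
for t in range(50):                       # T = 50
    H_next = F(H, X)
    converged = np.max(np.abs(H_next - H)) < 1e-6
    H = H_next                            # H(T) ≈ H, the approximate fixed point
    if converged:
        break
```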
Loss Function
$$loss = \sum_{i=1}^p (t_i - o_i)$$
where $p$ is the number of supervised nodes.
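Read literally, the loss is just a sum of differences over the supervised nodes; a tiny sketch with made-up targets and outputs for $p = 2$:

```python
import numpy as np

t = np.array([[1.0, 0.0], [0.0, 1.0]])   # targets t_i of the p = 2 supervised nodes
o = np.array([[0.8, 0.1], [0.2, 0.7]])   # outputs o_i produced by g
loss = np.sum(t - o)                      # loss = sum_{i=1}^{p} (t_i - o_i)
```

In practice a squared error such as $\sum_{i=1}^p (t_i - o_i)^2$ is commonly minimized instead, since the plain difference has no lower bound.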
Learning algorithm
The learning algorithm is based on a gradient-descent strategy and is composed of the following steps (a sketch of the full loop follows the list):
- The states $h_v^t$ are iteratively updated by $h_v^{t+1} = f(x_v, x_{co[v]}, h_{ne[v]}^t, x_{ne[v]})$ until a time step $T$, which yields an approximate fixed-point solution of $H = F(H, X)$: $H(T) \approx H$.
- The gradient of the weights $W$ is computed from the loss.
- The weights $W$ are updated according to the gradient computed in the last step.
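A compact sketch of these three steps as a training loop; PyTorch autograd stands in for the original gradient computation, the toy graph and all shapes/hyperparameters are illustrative assumptions, edge features are omitted, and a squared error is used so that gradient descent has a proper minimum:

```python
import torch

# Toy graph: 3 nodes with an illustrative adjacency list
neighbors = {0: [1, 2], 1: [0], 2: [0]}
X = torch.randn(3, 4)                          # node features x_v
targets = torch.tensor([[1.0], [0.0], [1.0]])  # t_i (all 3 nodes supervised here)
d_h = 8

# Shared weights W of the local transition f and the output g
W_f = torch.nn.Linear(4 + d_h + 4, d_h)        # takes x_v, sum of h_ne[v], sum of x_ne[v]
W_g = torch.nn.Linear(d_h + 4, 1)
opt = torch.optim.SGD(list(W_f.parameters()) + list(W_g.parameters()), lr=0.01)

def F(H):
    """One application of the global transition: the local f at every node."""
    rows = []
    for v in range(X.shape[0]):
        h_ne = H[neighbors[v]].sum(dim=0)
        x_ne = X[neighbors[v]].sum(dim=0)
        rows.append(torch.tanh(W_f(torch.cat([X[v], h_ne, x_ne]))))
    return torch.stack(rows)

for epoch in range(100):
    # 1) iterate the states until a time step T: H(T) ≈ H, the approximate fixed point
    H = torch.zeros(3, d_h)
    for _ in range(10):                        # T = 10
        H = F(H)
    # 2) compute the loss over the supervised nodes and the gradient of the weights W
    O = W_g(torch.cat([H, X], dim=1))          # o_v = g(h_v, x_v)
    loss = torch.sum((targets - O) ** 2)       # squared-error variant of the loss above
    opt.zero_grad()
    loss.backward()
    # 3) update the weights W according to the gradient
    opt.step()
```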