$$h_v = f(x_v, x_{co[v]}, h_{ne[v]}, x_{ne[v]}) $$
$$o_v = g(h_v, x_v)$$
- Functions
- $f$: local transition function which is shared among all nodes
- $g$ : local output function
- Symbols
- $x$: the input feature
- $h$: the hidden state
- $co[v]$: the set of edges connected to node $v$
- $ne[v]$: the set of neighbors of node $v$
- $x_v$: the features of $v$
- $x_{co[v]}$: the features of its edges
- $h_{ne[v]}$: the states of the nodes in the neighborhood of $v$
- $x_{ne[v]}$: the features of the nodes in the neighborhood of $v$
Example for node $l_1$
- $x_{l_1}$: the input feature
- $co[l_1]$: $l_{(1, 4)}, l_{(1, 6)}, l_{(3, 1)}, l_{(1, 2)}$
- $ne[l_1]$: $l_2, l_3, l_4, l_6$
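As a concrete numerical sketch of the local functions $f$ and $g$ (a simple sum-then-nonlinearity form is assumed here; the dimensions and the weight matrices `W_x`, `W_e`, `W_hn`, `W_xn`, `W_o` are illustrative, not part of the original formulation):

```python
import numpy as np

# Illustrative sizes: node feature, edge feature, state, output (assumptions)
d_x, d_e, d_h, d_o = 4, 3, 8, 2
rng = np.random.default_rng(0)
W_x  = rng.normal(size=(d_h, d_x))
W_e  = rng.normal(size=(d_h, d_e))
W_hn = rng.normal(size=(d_h, d_h))
W_xn = rng.normal(size=(d_h, d_x))
W_o  = rng.normal(size=(d_o, d_h + d_x))

def f(x_v, x_co, h_ne, x_ne):
    """Local transition: combine the node's feature, its edge features,
    and its neighbors' states and features into a new state h_v."""
    s = W_x @ x_v
    s = s + sum(W_e  @ e for e in x_co)   # edges connected to v
    s = s + sum(W_hn @ h for h in h_ne)   # states of the neighbors
    s = s + sum(W_xn @ x for x in x_ne)   # features of the neighbors
    return np.tanh(s)

def g(h_v, x_v):
    """Local output: map the state and the feature of v to an output o_v."""
    return W_o @ np.concatenate([h_v, x_v])

# Node l_1 from the example: 4 incident edges, 4 neighbors (l_2, l_3, l_4, l_6)
x_v  = rng.normal(size=d_x)
x_co = [rng.normal(size=d_e) for _ in range(4)]
h_ne = [np.zeros(d_h) for _ in range(4)]
x_ne = [rng.normal(size=d_x) for _ in range(4)]
h_l1 = f(x_v, x_co, h_ne, x_ne)
o_l1 = g(h_l1, x_v)
```

Because $f$ is shared among all nodes, the same weight matrices are reused for every node in the graph.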
For the whole set of nodes, the update and the output can be written compactly as
$$H=F(H, X)$$
$$O=G(H, X_N)$$
- $H$: the matrix constructed by stacking all the states (all intermediate results)
- $O$: the matrix constructed by stacking all the outputs (the final outputs)
- $X$: the matrix constructed by stacking all the features
- $X_N$: the matrix constructed by stacking all the node features
- $F$: the global transition function
- $G$: the global output function
When the update is iterated over multiple time steps (layers), $$H^{t+1} = F(H^t, X)$$ which, provided $F$ is a contraction map, converges to the fixed point of $H = F(H, X)$.
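A minimal sketch of this global iteration on a toy 6-node graph (so that $l_1$ has neighbors $l_2, l_3, l_4, l_6$ as in the example above); the adjacency matrix, the weight scaling, $T$, and the tolerance are illustrative assumptions, and edge features are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_x, d_h = 6, 4, 8                     # 6 nodes; feature/state sizes are illustrative
A = np.zeros((n, n))
for u, v in [(0, 1), (0, 2), (0, 3), (0, 5)]:   # l_1 adjacent to l_2, l_3, l_4, l_6
    A[u, v] = A[v, u] = 1
X = rng.normal(size=(n, d_x))

# Small weights keep F (roughly) a contraction, which the fixed-point view requires
W_x  = rng.normal(size=(d_x, d_h)) * 0.1
W_h  = rng.normal(size=(d_h, d_h)) * 0.1
W_xn = rng.normal(size=(d_x, d_h)) * 0.1

def F(H, X):
    """Global transition: row v is f(x_v, h_ne[v], x_ne[v]) (edge features omitted)."""
    return np.tanh(X @ W_x + A @ H @ W_h + A @ X @ W_xn)

# Iterate H^{t+1} = F(H^t, X) until a time step T or until the states stop changing
H = np.zeros((n, d_h))
for t in range(50):                       # T = 50
    H_next = F(H, X)
    converged = np.max(np.abs(H_next - H)) < 1e-6
    H = H_next                            # H(T) ≈ H, the approximate fixed point
    if converged:
        break
```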
Loss Function
$$loss = \sum_{i=1}^p (t_i - o_i)$$
where $p$ is the number of supervised nodes.
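Read literally, the loss is just a sum of differences over the supervised nodes; a tiny sketch with made-up targets and outputs for $p = 2$:

```python
import numpy as np

t = np.array([[1.0, 0.0], [0.0, 1.0]])   # targets t_i of the p = 2 supervised nodes
o = np.array([[0.8, 0.1], [0.2, 0.7]])   # outputs o_i produced by g
loss = np.sum(t - o)                      # loss = sum_{i=1}^{p} (t_i - o_i)
```

In practice a squared error such as $\sum_{i=1}^p (t_i - o_i)^2$ is commonly minimized instead, since the plain difference has no lower bound.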
Learning algorithm
The learning algorithm is based on a gradient-descent strategy and is composed of the following steps (a sketch of the full loop follows the list):
- The states $h_v^t$ are iteratively updated by $h_v^{t+1} = f(x_v, x_{co[v]}, h_{ne[v]}^t, x_{ne[v]})$ until a time step $T$, which yields an approximate fixed-point solution of $H = F(H, X)$: $H(T) \approx H$.
- The gradient of the weights $W$ is computed from the loss.
- The weights $W$ are updated according to the gradient computed in the last step.
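A compact sketch of these three steps as a training loop; PyTorch autograd stands in for the original gradient computation, the toy graph and all shapes/hyperparameters are illustrative assumptions, edge features are omitted, and a squared error is used so that gradient descent has a proper minimum:

```python
import torch

# Toy graph: 3 nodes with an illustrative adjacency list
neighbors = {0: [1, 2], 1: [0], 2: [0]}
X = torch.randn(3, 4)                          # node features x_v
targets = torch.tensor([[1.0], [0.0], [1.0]])  # t_i (all 3 nodes supervised here)
d_h = 8

# Shared weights W of the local transition f and the output g
W_f = torch.nn.Linear(4 + d_h + 4, d_h)        # takes x_v, sum of h_ne[v], sum of x_ne[v]
W_g = torch.nn.Linear(d_h + 4, 1)
opt = torch.optim.SGD(list(W_f.parameters()) + list(W_g.parameters()), lr=0.01)

def F(H):
    """One application of the global transition: the local f at every node."""
    rows = []
    for v in range(X.shape[0]):
        h_ne = H[neighbors[v]].sum(dim=0)
        x_ne = X[neighbors[v]].sum(dim=0)
        rows.append(torch.tanh(W_f(torch.cat([X[v], h_ne, x_ne]))))
    return torch.stack(rows)

for epoch in range(100):
    # 1) iterate the states until a time step T: H(T) ≈ H, the approximate fixed point
    H = torch.zeros(3, d_h)
    for _ in range(10):                        # T = 10
        H = F(H)
    # 2) compute the loss over the supervised nodes and the gradient of the weights W
    O = W_g(torch.cat([H, X], dim=1))          # o_v = g(h_v, x_v)
    loss = torch.sum((targets - O) ** 2)       # squared-error variant of the loss above
    opt.zero_grad()
    loss.backward()
    # 3) update the weights W according to the gradient
    opt.step()
```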