Recurrent Neural Networks

yuuuun 2021. 9. 10. 12:14

Language Model

Types of RNN architectures
- one-to-many: image captioning
- many-to-one: sentiment classification of a text sequence (positive / negative)
- sequence-to-sequence: machine translation (output is delayed until the input has been read), frame-level video classification (no delay)
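A simple example
- Given height, weight, and blood type as input for predicting diabetes, the network first extracts features such as the degree of overweight and the blood type, then makes the final diabetes prediction.
- If the weight suddenly increases, the prediction should use both past and current information.
$h_t = f_W(h_{t-1}, x_t)$ -> $h_t = \tanh(W_{hh}h_{t-1} + W_{xh}x_t), \quad y_t = W_{hy}h_t$
- The current input $x_t$ is combined with $h_{t-1}$, which carries information from past time steps, so past and current information are used together.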
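The recurrence above can be sketched in a few lines of plain Python. This is only a minimal illustration: the helper names `matvec` and `rnn_step` and the toy weights are assumptions, not part of the original notes, and bias terms are left out to match the formula.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy):
    """One vanilla RNN step: h_t = tanh(W_hh h_{t-1} + W_xh x_t), y_t = W_hy h_t."""
    pre = [a + b for a, b in zip(matvec(W_hh, h_prev), matvec(W_xh, x_t))]
    h_t = [math.tanh(p) for p in pre]
    y_t = matvec(W_hy, h_t)
    return h_t, y_t


# Toy weights (hypothetical): 1-dimensional input, hidden state, and output.
W_xh, W_hh, W_hy = [[1.0]], [[0.5]], [[2.0]]

h = [0.0]  # initial hidden state
for x in ([0.5], [-1.0], [0.25]):
    h, y = rnn_step(x, h, W_xh, W_hh, W_hy)
    print(h, y)
```

The same weights are reused at every time step; only the hidden state `h` changes as the sequence is consumed.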
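When predicting "hello", e must come after h, l must come after e, and so on.
- h = [1 0 0 0], e = [0 1 0 0], l = [0 0 1 0], o = [0 0 0 1]
- The output layer must be trained to predict the next character.
- The predicted character is then fed back in as the next input.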
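As a small illustration of this setup, the sketch below builds the one-hot inputs and next-character targets for "hello", using the same character order as above. The function names `one_hot` and `next_char_pairs` are assumptions made for this example.

```python
vocab = ["h", "e", "l", "o"]  # same order as the one-hot vectors above


def one_hot(ch):
    """Return the one-hot vector for a character in vocab."""
    return [1 if ch == v else 0 for v in vocab]


def next_char_pairs(word):
    """Pair each character (as a one-hot input) with the index of the next character."""
    return [(one_hot(cur), vocab.index(nxt)) for cur, nxt in zip(word, word[1:])]


for x, target in next_char_pairs("hello"):
    print(x, "->", vocab[target])
```

Each pair is one training step: the network receives the one-hot input and is trained so that its output layer assigns the highest score to the target character.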

Backpropagation Through Time(BPTT)

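Backpropagating through the entire sequence makes each update slow, so the sequence should be cut into chunks of a suitable length and trained chunk by chunk (truncated BPTT).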
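A minimal sketch of truncated BPTT follows, assuming a scalar RNN $h_t = \tanh(w h_{t-1} + u x_t)$ with a squared-error loss at every step. The model, the loss, and the names `chunk_grads` and `train_truncated` are assumptions chosen to keep the example small. The hidden state is carried from one chunk to the next, but gradients never flow back past a chunk boundary.

```python
import math


def chunk_grads(xs, ys, h_prev, w, u):
    """Forward and backward pass over ONE chunk.

    h_prev is treated as a constant, so no gradient flows into earlier chunks.
    Returns (dL/dw, dL/du, last hidden state) for L = sum_t 0.5 * (h_t - y_t)^2.
    """
    hs = [h_prev]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))

    dw = du = 0.0
    dh_next = 0.0  # gradient arriving from later steps inside this chunk
    for t in reversed(range(len(xs))):
        h, h_before = hs[t + 1], hs[t]
        dh = (h - ys[t]) + dh_next   # dL/dh_t
        dz = dh * (1.0 - h * h)      # back through tanh
        dw += dz * h_before
        du += dz * xs[t]
        dh_next = dz * w             # passed on to h_{t-1}
    return dw, du, hs[-1]


def train_truncated(xs, ys, k, w=0.5, u=0.5, lr=0.1):
    """Update (w, u) once per chunk of k steps instead of once per full sequence."""
    h = 0.0
    for start in range(0, len(xs), k):
        dw, du, h = chunk_grads(xs[start:start + k], ys[start:start + k], h, w, u)
        w -= lr * dw
        u -= lr * du
    return w, u


xs = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4]
ys = [0.1, 0.2, -0.1, 0.0, 0.3, -0.2]
print(train_truncated(xs, ys, k=3))
```

With `k` equal to the sequence length this reduces to ordinary BPTT; a smaller `k` gives cheaper, more frequent updates at the cost of ignoring dependencies longer than `k` steps.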

Vanishing Gradient

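The farther away the relevant word is, the smaller its influence becomes.
- Introduce the concept of memory!
- The update has the form $h_t = ax_t + bh_{t-1} + c$
- The opposite, amplification, can also occur -> exploding gradient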
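For the linear form above, each step back in time multiplies the gradient by $\partial h_t / \partial h_{t-1} = b$, so after $T$ steps the gradient is scaled by $b^T$. The short sketch below shows this numerically; the function name `gradient_through_time` is an assumption for this example.

```python
def gradient_through_time(b, steps):
    """d h_T / d h_0 for h_t = a*x_t + b*h_{t-1} + c, accumulated step by step."""
    grad = 1.0
    for _ in range(steps):
        grad *= b  # each step contributes a factor d h_t / d h_{t-1} = b
    return grad


for b in (0.5, 1.0, 1.5):
    print(b, gradient_through_time(b, 20))
```

With $|b| < 1$ the factor shrinks toward zero (vanishing gradient), and with $|b| > 1$ it grows without bound (exploding gradient).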

Example

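LSTM
- Long Short-Term Memory, published in 1997
- Gates
  - Forget gate: whether to erase cell
  - Input gate: whether to write to cell
  - Gate gate: how much to write to cell
  - Output gate: how much to reveal cell
- How it works
  - Generate the candidate information to add to the cell state, and filter out part of it through the input gate
  - Take the cell state passed on from the past, with the unneeded parts discarded by the forget gate, and add the current information to it to produce the current cell state
  - Pass the current cell state through tanh, then through the output gate, to produce the current hidden state
  - The hidden state is then handed to the next time step and, where needed, to the output or to the next layer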
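Written as equations, the steps above look roughly as follows. This is the commonly used LSTM formulation with bias terms omitted; the notation is an assumption rather than something taken from the original notes: $\sigma$ is the sigmoid, $\odot$ is element-wise multiplication, and $[x_t; h_{t-1}]$ is the concatenation of the input and the previous hidden state.

$$
\begin{aligned}
i_t &= \sigma(W_i [x_t; h_{t-1}]) \\
f_t &= \sigma(W_f [x_t; h_{t-1}]) \\
g_t &= \tanh(W_g [x_t; h_{t-1}]) \\
o_t &= \sigma(W_o [x_t; h_{t-1}]) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$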

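import torch
import torch.nn as nn


class LSTMCell(nn.Module):
    # NOTE: the class name and __init__ are assumed; the original shows only forward().
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Each gate is a linear map over the concatenated [x, h_0] vector.
        self.Wi = nn.Linear(input_size + hidden_size, hidden_size)
        self.Wf = nn.Linear(input_size + hidden_size, hidden_size)
        self.Wg = nn.Linear(input_size + hidden_size, hidden_size)
        self.Wo = nn.Linear(input_size + hidden_size, hidden_size)
        self.sigmoid = nn.Sigmoid()
        self.tanh = nn.Tanh()

    def forward(self, x, h_0, c_0):
        """
        Inputs
            input (x): [batch_size, input_size]
            hidden_state (h_0): [batch_size, hidden_size]
            cell_state (c_0): [batch_size, hidden_size]
        Outputs
            next_hidden_state (h_1): [batch_size, hidden_size]
            next_cell_state (c_1): [batch_size, hidden_size]
        """
        # Concatenate input and previous hidden state along the feature axis.
        xh = torch.cat((x, h_0), dim=1)
        i = self.sigmoid(self.Wi(xh))  # input gate: whether to write to cell
        f = self.sigmoid(self.Wf(xh))  # forget gate: whether to erase cell
        g = self.tanh(self.Wg(xh))     # gate gate: how much to write to cell
        o = self.sigmoid(self.Wo(xh))  # output gate: how much to reveal cell
        c_1 = f * c_0 + i * g          # new cell state
        h_1 = o * self.tanh(c_1)       # new hidden state
        return h_1, c_1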

GRU (Gated Recurrent Unit)
- Used as an alternative to LSTM
- Uses two gates, a reset gate and an update gate; the cell state and hidden state are merged into a single hidden state.
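A minimal sketch of one GRU step in plain Python is shown below, assuming one common formulation without bias terms. The names `gru_step`, `W_z`, `W_r`, and `W_h` are assumptions for this example, and some libraries swap the roles of $z$ and $1 - z$ in the final interpolation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def gru_step(x, h, W_z, W_r, W_h):
    """One GRU step: there is a single hidden state h and no separate cell state."""
    xh = x + h  # list concatenation: [x; h]
    z = [sigmoid(a) for a in matvec(W_z, xh)]  # update gate
    r = [sigmoid(a) for a in matvec(W_r, xh)]  # reset gate
    # Candidate state sees the input and the reset-scaled previous state.
    x_rh = x + [r_i * h_i for r_i, h_i in zip(r, h)]
    h_tilde = [math.tanh(a) for a in matvec(W_h, x_rh)]
    # Interpolate between the old state and the candidate.
    return [(1.0 - z_i) * h_i + z_i * c_i for z_i, h_i, c_i in zip(z, h, h_tilde)]


# Toy sizes (hypothetical): input_size = 1, hidden_size = 2, so each W is 2 x 3.
W_z = [[0.1, 0.2, -0.1], [0.0, 0.3, 0.2]]
W_r = [[-0.2, 0.1, 0.4], [0.3, -0.1, 0.0]]
W_h = [[0.5, -0.3, 0.2], [0.1, 0.4, -0.2]]
print(gru_step([1.0], [0.0, 0.0], W_z, W_r, W_h))
```

Compared with the LSTM, the update gate plays the combined role of the forget and input gates, which is why a GRU needs fewer parameters for the same hidden size.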
