# lstm_output : [batch_size, n_step, n_hidden * num_directions(=2)], F matrix
def attention_net(self, lstm_output, final_state):
    batch_size = len(lstm_output)
    hidden_forward = final_state[0]   # forward final hidden state:  [batch_size, n_hidden]
    hidden_backward = final_state[1]  # backward final hidden state: [batch_size, n_hidden]
    hidden_f_b = torch.cat((hidden_forward, hidden_backward), 1)  # [batch_size, n_hidden * 2]
    hidden = hidden_f_b.view(batch_size, -1, 1)  # [batch_size, n_hidden * 2, 1]
The line in the source code

    hidden = final_state.view(batch_size, -1, 1)

is wrong: for a bi-LSTM, final_state has shape [2, batch_size, n_hidden] (forward and backward directions), so the forward and backward hidden states need to be concatenated explicitly. If we just call final_state.view(batch_size, -1, 1), the reshape runs across the direction axis first, so the result is not the concatenation of final_state[0][i] and final_state[1][i] for each sample i; it mixes hidden states from different samples instead.
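A minimal sketch (not from the original post) showing that the two reshapes are not equivalent; the toy shapes batch_size=2 and n_hidden=3 are assumptions for illustration only:

import torch

n_hidden, batch_size = 3, 2
# toy final_state with shape [num_directions(=2), batch_size, n_hidden]
final_state = torch.arange(2 * batch_size * n_hidden, dtype=torch.float32).view(2, batch_size, n_hidden)

# original version: reshapes across the direction axis first, so sample 0
# ends up paired with the forward state of sample 1 instead of its own backward state
wrong = final_state.view(batch_size, -1, 1)

# fixed version: concatenate forward and backward states per sample,
# then add the trailing dimension of size 1 used later in attention_net
right = torch.cat((final_state[0], final_state[1]), dim=1).view(batch_size, -1, 1)

print(torch.equal(wrong, right))  # False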