Source: Deep Learning on Medium
2.4. Output Gate
The hidden state h_t of each memory cell is determined by the updated cell state C_t and the output vector o_t. As in the forget and input gate layers, the number of neurons is fixed at 70.
The output gate consists of a single neural network layer with a sigmoid as its non-linear activation, as shown in Figure 4. The output vector o_t has size 70 and is computed as:
o_t = sigmoid(W_o * X + b_o)
Here, W_o and b_o denote the weight matrix (70 x 102) and the bias vector (70 x 1), respectively; the subscript o indicates the output gate.
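The shapes in the output gate equation can be checked with a short NumPy sketch. This is only an illustration: the random weights, the zero bias, and the random input X are placeholder values, and the names `hidden_units` and `concat_size` are introduced here for readability. The size 102 is taken from the number of columns of W_o.

```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


hidden_units = 70   # number of neurons in the output gate (from the text)
concat_size = 102   # length of X, implied by the 70 x 102 shape of W_o

# Placeholder parameters and input; a trained network would supply real values.
rng = np.random.default_rng(0)
W_o = rng.standard_normal((hidden_units, concat_size)) * 0.1  # weights, 70 x 102
b_o = np.zeros((hidden_units, 1))                             # bias, 70 x 1
X = rng.standard_normal((concat_size, 1))                     # input, 102 x 1

# o_t = sigmoid(W_o * X + b_o), where "*" is a matrix-vector product
o_t = sigmoid(W_o @ X + b_o)

print(o_t.shape)  # (70, 1)
```

Every entry of o_t lies strictly between 0 and 1, which is what lets it act as a per-unit gate on the cell state.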
Finally, the hidden state vector h_t is obtained by element-wise multiplication of the output vector o_t and the cell state C_t. (In the standard LSTM formulation, C_t is first passed through tanh, giving h_t = o_t * tanh(C_t).)
The hidden state h_t is a vector of size 70.
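The final step can be sketched in NumPy as follows. The helper `hidden_state` and its `apply_tanh` flag are hypothetical names introduced for this illustration, and the values of o_t and C_t are random placeholders. With `apply_tanh=False` the function computes the plain element-wise product described above; the default follows the standard LSTM formulation, which squashes C_t with tanh first.

```python
import numpy as np


def hidden_state(o_t, c_t, apply_tanh=True):
    """Combine the output gate vector and the cell state element-wise.

    apply_tanh=True  -> h_t = o_t * tanh(C_t)  (standard LSTM formulation)
    apply_tanh=False -> h_t = o_t * C_t        (plain product, as in the text)
    """
    cell = np.tanh(c_t) if apply_tanh else c_t
    return o_t * cell


# Placeholder vectors of size 70; a real cell would compute these from its gates.
rng = np.random.default_rng(0)
o_t = rng.uniform(0.0, 1.0, size=(70, 1))  # gate activations in [0, 1)
c_t = rng.standard_normal((70, 1))         # updated cell state

h_t = hidden_state(o_t, c_t)

print(h_t.shape)  # (70, 1)
```

Because the product is element-wise, h_t keeps the same size as o_t and C_t, which is why the hidden state has 70 entries.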