GRU
Computes a one-layer GRU. This operator is usually supported via some custom implementation such as CuDNN.

Notations:

* X - input tensor
* z - update gate
* r - reset gate
* h - hidden gate
* t - time step (t-1 means previous time step)
* W[zrh] - W parameter weight matrix for update, r…
Abstract Signature:
GRU(X: Tensor, W: Tensor, R: Tensor, B: Tensor, sequence_lens: Tensor, initial_h: Tensor, activation_alpha: List[float], activation_beta: List[float], activations: List[str], clip: float, direction: str, hidden_size: int, layout: int, linear_before_reset: int)
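To make the gate semantics concrete, here is a minimal NumPy sketch of a single GRU time step for one direction, using the standard ONNX equations with the default activations (Sigmoid for the gates, Tanh for the hidden gate). The function name `gru_step` and the unstacked weight layout are illustrative assumptions, not part of the operator signature above; a real kernel would also handle `sequence_lens`, bidirectional layouts, and `clip`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, W, R, Wb, Rb, linear_before_reset=0):
    """One GRU time step, single direction (illustrative sketch).

    Shapes (hidden_size = H, input_size = I, batch = B):
      x:      (B, I)   input at this time step
      h_prev: (B, H)   previous hidden state
      W:      (3H, I)  stacked input weights  [W_z; W_r; W_h]
      R:      (3H, H)  stacked recurrence weights [R_z; R_r; R_h]
      Wb, Rb: (3H,)    stacked input / recurrence biases
    """
    H = h_prev.shape[1]
    Wz, Wr, Wh = W[:H], W[H:2 * H], W[2 * H:]
    Rz, Rr, Rh = R[:H], R[H:2 * H], R[2 * H:]
    Wbz, Wbr, Wbh = Wb[:H], Wb[H:2 * H], Wb[2 * H:]
    Rbz, Rbr, Rbh = Rb[:H], Rb[H:2 * H], Rb[2 * H:]

    # z_t: update gate, r_t: reset gate (default activation: Sigmoid)
    z = sigmoid(x @ Wz.T + h_prev @ Rz.T + Wbz + Rbz)
    r = sigmoid(x @ Wr.T + h_prev @ Rr.T + Wbr + Rbr)

    if linear_before_reset:
        # Reset gate applied after the recurrence matmul (CuDNN-style)
        h_tilde = np.tanh(x @ Wh.T + r * (h_prev @ Rh.T + Rbh) + Wbh)
    else:
        # Reset gate applied to the hidden state before the matmul
        h_tilde = np.tanh(x @ Wh.T + (r * h_prev) @ Rh.T + Wbh + Rbh)

    # Interpolate between candidate state and previous state
    return (1.0 - z) * h_tilde + z * h_prev
```

Note how `linear_before_reset` only changes where the reset gate `r` is applied relative to the recurrence bias `Rb_h`; with all-zero weights the two variants coincide.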