TransformerFeedForwardMoe
A sharded MoE Layer.
Abstract Signature:
TransformerFeedForwardMoe(input_dims: int, hidden_dims: int, num_experts: int, num_groups: int)
PyTorch
API:
Strategy: Custom / Partial
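Because the PyTorch strategy is listed as Custom / Partial, a hand-written module is one way to approximate the layer. The block below is a minimal, unsharded sketch and not a library API: the class name `MoeFeedForwardSketch`, the top-1 routing, the ReLU experts, and the reading of `num_groups` as "split the tokens into routing groups" are all assumptions made for illustration. Only the four constructor arguments come from the abstract signature.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoeFeedForwardSketch(nn.Module):
    """Hypothetical unsharded top-1 MoE feed-forward layer (illustration only)."""

    def __init__(self, input_dims: int, hidden_dims: int, num_experts: int, num_groups: int):
        super().__init__()
        self.num_groups = num_groups
        # Router: one logit per expert for every token.
        self.gate = nn.Linear(input_dims, num_experts, bias=False)
        # Each expert is an ordinary two-layer feed-forward block (assumed ReLU).
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(input_dims, hidden_dims),
                nn.ReLU(),
                nn.Linear(hidden_dims, input_dims),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, dims = x.shape
        num_tokens = batch * seq_len
        if num_tokens % self.num_groups != 0:
            raise ValueError("batch * seq_len must be divisible by num_groups")
        # Assumption: num_groups only partitions tokens into routing groups.
        tokens = x.reshape(self.num_groups, num_tokens // self.num_groups, dims)
        probs = F.softmax(self.gate(tokens), dim=-1)   # [groups, tokens, experts]
        top_prob, top_idx = probs.max(dim=-1)          # [groups, tokens]
        out = torch.zeros_like(tokens)
        for expert_id, expert in enumerate(self.experts):
            mask = top_idx == expert_id                # tokens routed to this expert
            if mask.any():
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(batch, seq_len, dims)


# Usage: the output keeps the input shape [batch, seq_len, input_dims].
layer = MoeFeedForwardSketch(input_dims=8, hidden_dims=16, num_experts=4, num_groups=2)
y = layer(torch.randn(2, 6, 8))
```

This sketch has no per-group expert capacity limit, so the grouping does not change the result here. It also does no sharding: placing experts on different devices would need to be added separately.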