GShardSharedEmbeddingSoftmax

Softmax layer that doubles as an embedding lookup table and uses Gaussian initialization, as used in GShard.

Abstract Signature:

GShardSharedEmbeddingSoftmax(in_features: int, num_classes: int)
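For the frameworks listed as Custom / Partial, the behavior implied by this signature can be approximated by hand. The following is a minimal, framework-neutral NumPy sketch, not the Praxis implementation. The class name `SharedEmbeddingSoftmax`, the `seed` argument, the initialization scale of `1/sqrt(in_features)`, and the absence of a bias are assumptions made for illustration. It shows one weight table of shape `[num_classes, in_features]` used both for embedding lookup and for computing logits.

```python
import numpy as np


class SharedEmbeddingSoftmax:
    """Hypothetical sketch: one weight table serves lookup and logits."""

    def __init__(self, in_features: int, num_classes: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Assumption: zero-mean Gaussian init with std = 1/sqrt(in_features).
        std = in_features ** -0.5
        # Shape [num_classes, in_features]: one row per class (token).
        self.weight = rng.normal(0.0, std, size=(num_classes, in_features))

    def embed(self, ids: np.ndarray) -> np.ndarray:
        """Embedding lookup: select rows of the shared table."""
        return self.weight[ids]

    def logits(self, x: np.ndarray) -> np.ndarray:
        """Map [..., in_features] to [..., num_classes]; no bias (assumption)."""
        return x @ self.weight.T

    def probs(self, x: np.ndarray) -> np.ndarray:
        """Numerically stable softmax over the class axis."""
        z = self.logits(x)
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)


layer = SharedEmbeddingSoftmax(in_features=8, num_classes=5)
ids = np.array([[0, 3], [4, 1]])
emb = layer.embed(ids)   # shape (2, 2, 8)
p = layer.probs(emb)     # shape (2, 2, 5)
```

Because both paths read the same array, a port to PyTorch, Keras, or Flax NNX would need to keep a single parameter and reuse it in both the lookup and the projection, rather than pairing an independent embedding layer with an independent dense layer.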

PyTorch

API: none
Strategy: Custom / Partial

Keras

API: none
Strategy: Custom / Partial

Flax NNX

API: none
Strategy: Custom / Partial

PaxML / Praxis

API: paxml.layers.GShardSharedEmbeddingSoftmax
Strategy: Direct Mapping