QuantizedMatmul
Matrix multiplication with a quantized weight matrix.
Abstract Signature:
QuantizedMatmul(x: Tensor, w: Tensor, scales: Tensor, biases: Tensor | None, transpose: bool = True, group_size: int | None, bits: int | None, mode: str = "affine")
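The signature above does not spell out the arithmetic, so the following NumPy sketch shows one common reading of mode = "affine". It assumes each group of group_size consecutive weights along the input dimension shares one scale and one bias, and that a weight is recovered as code * scale + bias. The helper names (quantize_affine, dequantize_affine, quantized_matmul_ref) are hypothetical, the integer codes are kept unpacked (one uint8 per weight) rather than bit-packed, and treating biases = None as a zero offset is an assumption, not something this page states.

```python
import numpy as np


def quantize_affine(w, group_size=4, bits=4):
    """Quantize a float matrix per group along its last axis (hypothetical helper)."""
    out_dim, in_dim = w.shape
    assert in_dim % group_size == 0, "in_dim must be divisible by group_size"
    groups = w.reshape(out_dim, in_dim // group_size, group_size)
    w_min = groups.min(axis=-1, keepdims=True)
    w_max = groups.max(axis=-1, keepdims=True)
    levels = 2**bits - 1
    scales = (w_max - w_min) / levels
    # A constant group has zero range; use scale 1 so the division is safe.
    scales = np.where(scales == 0, 1.0, scales).astype(np.float32)
    codes = np.clip(np.rint((groups - w_min) / scales), 0, levels).astype(np.uint8)
    return codes.reshape(out_dim, in_dim), scales[..., 0], w_min[..., 0]


def dequantize_affine(codes, scales, biases, group_size):
    """Recover approximate float weights as code * scale + bias."""
    out_dim, in_dim = codes.shape
    groups = codes.reshape(out_dim, in_dim // group_size, group_size)
    groups = groups.astype(np.float32)
    if biases is None:
        # Assumption: a missing bias tensor means a zero offset.
        biases = np.zeros_like(scales)
    w_hat = groups * scales[..., None] + biases[..., None]
    return w_hat.reshape(out_dim, in_dim)


def quantized_matmul_ref(x, codes, scales, biases=None, transpose=True,
                         group_size=4, bits=4):
    """Reference result: dequantize the weights, then do an ordinary matmul."""
    assert int(codes.max()) < 2**bits, "codes exceed the range implied by bits"
    w_hat = dequantize_affine(codes, scales, biases, group_size)
    # transpose=True computes x @ w.T, the usual layout for linear layers.
    return x @ w_hat.T if transpose else x @ w_hat


rng = np.random.default_rng(0)
w = rng.standard_normal((3, 8)).astype(np.float32)   # (out_dim, in_dim)
x = rng.standard_normal((2, 8)).astype(np.float32)   # (batch, in_dim)
codes, scales, biases = quantize_affine(w, group_size=4, bits=4)
y = quantized_matmul_ref(x, codes, scales, biases, group_size=4, bits=4)
print(y.shape)
```

A real plugin would normally fuse the dequantization into the matmul kernel and read bit-packed weights instead of materializing w_hat. The sketch is only useful as a correctness reference under the assumptions stated above.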
PyTorch
API:
Strategy: Plugin (quantized_matmul)
JAX (Core)
API:
Strategy: Plugin (quantized_matmul)
Keras
API:
Strategy: Plugin (quantized_matmul)
TensorFlow
API:
Strategy: Plugin (quantized_matmul)
Flax NNX
API:
Strategy: Plugin (quantized_matmul)
PaxML / Praxis
API:
Strategy: Plugin (quantized_matmul)