Qqmm
Quantized Matrix Multiplication.
Abstract Signature:
Qqmm(x: Tensor, w: Tensor, scales: Tensor | None, group_size: int | None, bits: int | None, mode: str = "nvfp4")
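To make the roles of w, scales, group_size, and bits concrete, the following pure-Python sketch shows one plausible reading of a group-quantized matrix multiplication. Everything in it is an assumption for illustration: the helper names (quantize_groups, dequantize_groups, qqmm_reference) are hypothetical, the weight is assumed to be laid out as (out_features, in_features) so the result is x @ w.T, and simple symmetric integer quantization stands in for the nvfp4 format, which this sketch does not implement.

```python
def quantize_groups(row, group_size, bits):
    """Symmetric per-group integer quantization of one weight row.

    Illustrative stand-in only; this is NOT the nvfp4 format.
    Returns (codes, scales) with one scale per group of `group_size` values.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for bits=4
    codes, scales = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        # Fall back to a scale of 1.0 when the whole group is zero.
        scale = max(abs(v) for v in group) / qmax or 1.0
        scales.append(scale)
        codes.extend(round(v / scale) for v in group)
    return codes, scales


def dequantize_groups(codes, scales, group_size):
    """Map integer codes back to floats using the scale of their group."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


def qqmm_reference(x, w_codes, scales, group_size):
    """Hypothetical reference: dequantize each weight row, then compute x @ w.T."""
    w_deq = [dequantize_groups(r, s, group_size) for r, s in zip(w_codes, scales)]
    return [[sum(a * b for a, b in zip(x_row, w_row)) for w_row in w_deq]
            for x_row in x]


# Usage: quantize a 2x4 weight in groups of 2 with 4-bit codes, then multiply.
w_full = [[7.0, -3.0, 14.0, 2.0],
          [0.0, 0.0, -7.0, 7.0]]
quantized = [quantize_groups(row, group_size=2, bits=4) for row in w_full]
w_codes = [codes for codes, _ in quantized]
w_scales = [s for _, s in quantized]
x = [[1.0, 2.0, 3.0, 4.0]]
out = qqmm_reference(x, w_codes, w_scales, group_size=2)
print(out)
```

A real plugin would keep the weight packed and fuse the scaling into the matmul kernel rather than materializing the dequantized matrix; the sketch trades that efficiency for readability.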
PyTorch
API:
Strategy: Plugin (quantized_matmul)
JAX (Core)
API:
Strategy: Plugin (quantized_matmul)
Keras
API:
Strategy: Plugin (quantized_matmul)
TensorFlow
API:
Strategy: Plugin (quantized_matmul)
Flax NNX
API:
Strategy: Plugin (quantized_matmul)
PaxML / Praxis
API:
Strategy: Plugin (quantized_matmul)