Qqmm

Quantized Matrix Multiplication.

Abstract Signature:

Qqmm(x: Tensor, w: Tensor, scales: Tensor | None, group_size: int | None, bits: int | None, mode: str = "nvfp4")
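
The signature does not spell out the numerics, so the following is only a reference sketch of what a group-wise quantized matmul typically computes. It assumes symmetric signed-integer quantization with one scale per group of `group_size` weights along the input dimension, and weights stored as (out_features, in_features). It does not implement the nvfp4 format named by the default `mode`, and the helper names `quantize_groups` and `qmm_reference` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_groups(w: Matrix, group_size: int, bits: int) -> Tuple[List[List[int]], Matrix]:
    """Symmetric per-group quantization along each row of w (illustrative, not nvfp4)."""
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit signed codes
    codes, scales = [], []
    for row in w:
        row_codes, row_scales = [], []
        for start in range(0, len(row), group_size):
            group = row[start:start + group_size]
            # One scale per group; fall back to 1.0 for an all-zero group.
            scale = (max(abs(v) for v in group) / qmax) or 1.0
            row_scales.append(scale)
            # Round to the nearest code and clamp to the representable range.
            row_codes.extend(max(-qmax, min(qmax, round(v / scale))) for v in group)
        codes.append(row_codes)
        scales.append(row_scales)
    return codes, scales


def qmm_reference(x: Matrix, codes: List[List[int]], scales: Matrix, group_size: int) -> Matrix:
    """Compute x @ dequantize(codes).T, where w is stored as (out_features, in_features)."""
    out = []
    for x_row in x:
        out_row = []
        for w_codes, w_scales in zip(codes, scales):
            acc = 0.0
            for k, xv in enumerate(x_row):
                # Dequantize on the fly: code times the scale of its group.
                acc += xv * w_codes[k] * w_scales[k // group_size]
            out_row.append(acc)
        out.append(out_row)
    return out
```

A production kernel would keep the codes packed and fuse the dequantization into the matmul; the loops here only make the role of `scales`, `group_size`, and `bits` explicit.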

PyTorch

API: none
Strategy: Plugin (quantized_matmul)

JAX (Core)

API: none
Strategy: Plugin (quantized_matmul)

Keras

API: none
Strategy: Plugin (quantized_matmul)

TensorFlow

API: none
Strategy: Plugin (quantized_matmul)

Apple MLX

API: mlx.core.qqmm
Strategy: Direct Mapping

Flax NNX

API: none
Strategy: Plugin (quantized_matmul)

PaxML / Praxis

API: none
Strategy: Plugin (quantized_matmul)
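
As a rough illustration of how the per-framework entries above could be consumed, the sketch below transcribes them into a lookup table and resolves a target name for each framework. The names `Mapping`, `QQMM_MAPPINGS`, and `resolve` are hypothetical and not part of any tool described here; only the API path and strategy strings come from the entries themselves.

```python
from typing import Dict, NamedTuple, Optional


class Mapping(NamedTuple):
    api: Optional[str]  # native API path, or None when the entry lists no API
    strategy: str       # strategy label exactly as written in the entry


# Transcribed from the per-framework entries; the dict itself is illustrative.
QQMM_MAPPINGS: Dict[str, Mapping] = {
    "PyTorch": Mapping(None, "Plugin (quantized_matmul)"),
    "JAX (Core)": Mapping(None, "Plugin (quantized_matmul)"),
    "Keras": Mapping(None, "Plugin (quantized_matmul)"),
    "TensorFlow": Mapping(None, "Plugin (quantized_matmul)"),
    "Apple MLX": Mapping("mlx.core.qqmm", "Direct Mapping"),
    "Flax NNX": Mapping(None, "Plugin (quantized_matmul)"),
    "PaxML / Praxis": Mapping(None, "Plugin (quantized_matmul)"),
}


def resolve(framework: str) -> str:
    """Return the native API if one is listed, else the plugin name in parentheses."""
    m = QQMM_MAPPINGS[framework]
    if m.api is not None:
        return m.api
    # "Plugin (quantized_matmul)" -> "quantized_matmul"
    return m.strategy[m.strategy.index("(") + 1:m.strategy.rindex(")")]
```

With this table, `resolve("Apple MLX")` yields the direct mapping, while every other framework falls back to the `quantized_matmul` plugin.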