Add a kernel usable as a GEBP inner loop for an LLVM IR GEMM
This is not used in any real code path, but I've added an escape hatch that runs regular matrix multiplies through this kernel for testing purposes. As far as I can tell this is functionally correct, but I don't yet have a proper apples-to-apples performance comparison -- that'll have to wait till the implementation is complete.

PiperOrigin-RevId: 197422075
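For readers unfamiliar with the term, here is a rough sketch of what a GEBP (general block-times-panel) inner loop computes, and how a regular matrix multiply can be routed through it for testing. This is plain Python for illustration only, not the LLVM IR kernel from this change; the names `gebp` and `matmul_via_gebp` and the block sizes are hypothetical.

```python
def gebp(a_block, b_panel, c_panel):
    """Accumulate c_panel += a_block @ b_panel in place.

    a_block is m x k (small, meant to stay cache resident),
    b_panel is k x n, and c_panel is m x n.
    """
    m = len(a_block)
    k = len(b_panel)
    n = len(b_panel[0]) if k else 0
    for i in range(m):
        a_row = a_block[i]
        c_row = c_panel[i]
        for p in range(k):
            a_ip = a_row[p]
            b_row = b_panel[p]
            # Innermost loop walks one row of the panel contiguously.
            for j in range(n):
                c_row[j] += a_ip * b_row[j]


def matmul_via_gebp(a, b, block_m=2, block_k=2):
    """Compute a @ b by tiling a into blocks and calling gebp on each tile.

    Hypothetical stand-in for a test-only path that sends a regular
    matrix multiply through the kernel. Assumes non-empty, rectangular
    inputs with len(a[0]) == len(b).
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for p0 in range(0, k, block_k):
        b_panel = b[p0:p0 + block_k]
        for i0 in range(0, m, block_m):
            a_block = [row[p0:p0 + block_k] for row in a[i0:i0 + block_m]]
            # Slicing copies only the outer list; the row lists are shared,
            # so the kernel's in-place updates land in c.
            c_panel = c[i0:i0 + block_m]
            gebp(a_block, b_panel, c_panel)
    return c
```

Comparing `matmul_via_gebp` against a naive triple loop on small inputs is one way to check functional correctness before any performance work.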