Commit bffa3e10 authored by Nathan Luehr, committed by drpngx

Add support for CUBLAS_TENSOR_OP_MATH in fp16 GEMM (#13451)

- Applies to matrix multiplications with fp16 input/output.
  Computations will fall back to pseudo-fp16 if tensor op math is
  disabled or not supported.
- Enabled by default. Tensor ops (both in cuBLAS GEMMs and cuDNN
  convolutions) can be disabled globally by setting the
  environment variable TF_DISABLE_TENSOR_OP_MATH=1. To disable
  tensor ops specifically for GEMMs or convolutions, use
  TF_DISABLE_CUBLAS_TENSOR_OP_MATH=1 or
  TF_DISABLE_CUDNN_TENSOR_OP_MATH=1, respectively.
- Added cuBLAS 9.0 algorithms to GetBlasGemmAlgorithms().
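
The way the three environment variables combine can be sketched as follows. This is an illustrative model only, not the code from this commit: the helper `tensor_op_math_enabled` and the constant names are hypothetical, and the sketch assumes that only the literal value "1" disables tensor ops, as in the examples above. The actual parsing in TensorFlow may accept other values.

```python
import os

# Environment variable names as given in the commit message.
GLOBAL_FLAG = "TF_DISABLE_TENSOR_OP_MATH"
LIBRARY_FLAGS = {
    "cublas": "TF_DISABLE_CUBLAS_TENSOR_OP_MATH",  # GEMMs
    "cudnn": "TF_DISABLE_CUDNN_TENSOR_OP_MATH",    # convolutions
}


def tensor_op_math_enabled(library, env=None):
    """Hypothetical helper: report whether tensor ops would be used.

    `library` is "cublas" or "cudnn". `env` defaults to os.environ and
    can be replaced by a plain dict for experimentation.
    Assumption: a flag counts as set only when its value is exactly "1".
    """
    env = os.environ if env is None else env

    def is_set(name):
        return env.get(name) == "1"

    # Tensor ops are on by default; either the global flag or the
    # library-specific flag turns them off.
    return not (is_set(GLOBAL_FLAG) or is_set(LIBRARY_FLAGS[library]))


if __name__ == "__main__":
    # Disable tensor ops for GEMMs only; convolutions are unaffected.
    demo_env = {"TF_DISABLE_CUBLAS_TENSOR_OP_MATH": "1"}
    print("cublas:", tensor_op_math_enabled("cublas", demo_env))
    print("cudnn:", tensor_op_math_enabled("cudnn", demo_env))
```

In practice the variables would be set in the shell or via `os.environ` before TensorFlow is loaded, on the assumption that the values are read when the GPU libraries are initialized.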
parent 2f165f38