Commit bffa3e10, authored Dec 26, 2017 by Nathan Luehr; committed by drpngx Dec 26, 2017

Add support for CUBLAS_TENSOR_OP_MATH in fp16 GEMM (#13451)

- Applies to matrix multiplications with fp16 input/output.
  Computations will fall back to pseudo-fp16 if tensor op math is
  disabled or not supported.
- Enabled by default. Tensor ops (in both cuBLAS GEMMs and cuDNN
  convolutions) can be disabled globally by setting the
  environment variable TF_DISABLE_TENSOR_OP_MATH=1. To disable
  tensor ops only for GEMMs or only for convolutions, set
  TF_DISABLE_CUBLAS_TENSOR_OP_MATH=1 or
  TF_DISABLE_CUDNN_TENSOR_OP_MATH=1, respectively.
- Added cuBLAS 9.0 algorithms to GetBlasGemmAlgorithms().
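
The precedence of the three environment variables described above can be sketched in Python. This is only an illustration of the documented behavior, not the code in this commit (the actual check lives in TensorFlow's C++ sources). The helper names `_flag_set` and `tensor_op_math_enabled` are hypothetical, and the accepted "true" spellings are an assumption, since the message only documents the value `1`.

```python
import os


def _flag_set(env, name):
    # Assumption: "1" and "true" (case-insensitive) count as set.
    # The commit message only documents the value "1".
    return env.get(name, "0").strip().lower() in ("1", "true")


def tensor_op_math_enabled(kind, env=None):
    """Return True if tensor op math would be allowed for `kind`.

    `kind` is "cublas" (GEMMs) or "cudnn" (convolutions).
    Hypothetical helper mirroring the rules in the commit message.
    """
    if kind not in ("cublas", "cudnn"):
        raise ValueError("kind must be 'cublas' or 'cudnn'")
    env = os.environ if env is None else env
    # The global switch disables tensor ops for both libraries.
    if _flag_set(env, "TF_DISABLE_TENSOR_OP_MATH"):
        return False
    # Otherwise only the library-specific switch applies.
    return not _flag_set(env, "TF_DISABLE_%s_TENSOR_OP_MATH" % kind.upper())


if __name__ == "__main__":
    # Example: disable tensor ops for GEMMs only.
    example_env = {"TF_DISABLE_CUBLAS_TENSOR_OP_MATH": "1"}
    for kind in ("cublas", "cudnn"):
        print(kind, tensor_op_math_enabled(kind, example_env))
```

In practice the variable would be set in the environment of the process that runs TensorFlow, presumably before the GPU libraries are initialized.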

parent 2f165f38
