Commit 8fcf6473 authored by A. Unique TensorFlower, committed by TensorFlower Gardener

Better wrapping of stream executor's cuDNN API calls.

Replace the sequence of locking a mutex, setting the cuDNN stream, and then calling wrap::cudnn... with an RAII CudnnHandle object that performs the first two steps.
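As a rough illustration of the RAII idea, the sketch below locks a mutex and sets the stream in a constructor and releases the lock in the destructor. Only the name CudnnHandle comes from this commit message; FakeCudnnHandle, FakeStream, FakeSetStream and StreamSeenByCall are hypothetical stand-ins so that the example compiles without CUDA. The sketch does not model acquiring the CUDA context.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for cudaStream_t and cudnnHandle_t.
using FakeStream = int;
struct FakeCudnnHandle {
  FakeStream stream = 0;
};

// Hypothetical stand-in for the call that binds a stream to a handle.
inline void FakeSetStream(FakeCudnnHandle* handle, FakeStream stream) {
  handle->stream = stream;
}

// RAII wrapper (sketch): takes the lock and sets the stream on construction,
// and releases the lock when it goes out of scope.
class CudnnHandle {
 public:
  CudnnHandle(std::mutex& mu, FakeCudnnHandle* handle, FakeStream stream)
      : lock_(mu), handle_(handle) {
    assert(handle_ != nullptr);
    FakeSetStream(handle_, stream);
  }
  FakeCudnnHandle* handle() const { return handle_; }

 private:
  std::unique_lock<std::mutex> lock_;  // unlocked by its own destructor
  FakeCudnnHandle* handle_;
};

// Hypothetical call site: the caller no longer locks or sets the stream itself.
FakeStream StreamSeenByCall(std::mutex& mu, FakeCudnnHandle* raw,
                            FakeStream stream) {
  CudnnHandle cudnn(mu, raw, stream);
  // A real call would pass cudnn.handle() to the cuDNN API here.
  return cudnn.handle()->stream;
}
```

Because the lock lives inside the object, a call site cannot set the stream without also holding the lock, and early returns cannot leave the mutex locked.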

Distinguish three different API types:

A) APIs that don't take a cudnnHandle_t: These are thread-safe APIs that don't enqueue any CUDA work on a stream. They can be called directly without any extra precautions.

B) APIs that take a cudnnHandle_t and perform CUDA work: The CUDA context must be acquired and the stream set beforehand, and calls must be serialized. A CudnnHandle instance guarantees that this has been done before cuDNN is called.

C) APIs that do take a cudnnHandle_t but (presumably; the API makes no guarantees) still don't perform any CUDA work: This is limited to the API to set up RNN descriptors. Calls need to be serialized, but most likely we wouldn't need to acquire the CUDA context or set the stream. Because there are no guarantees, we still do both, using the legacy default stream.
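The three categories above could look roughly like this at call sites. This is a sketch under assumptions, not the code in this commit: Access, GetHandle, MakeTensorDescriptor, Convolve, SetUpRnnDescriptor and kLegacyDefaultStream are hypothetical names, and the fake types record calls in a log instead of talking to cuDNN.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are real cuDNN symbols.
using FakeStream = int;
constexpr FakeStream kLegacyDefaultStream = 0;  // stand-in for the default stream

struct FakeCudnn {
  FakeStream stream = -1;
  std::vector<std::string> log;  // calls made through the handle
};

// Holds the lock for its lifetime and sets the stream on construction.
class CudnnHandle {
 public:
  CudnnHandle(std::mutex& mu, FakeCudnn* cudnn, FakeStream stream)
      : lock_(mu), cudnn_(cudnn) {
    assert(cudnn_ != nullptr);
    cudnn_->stream = stream;
  }
  FakeCudnn* get() const { return cudnn_; }

 private:
  std::unique_lock<std::mutex> lock_;
  FakeCudnn* cudnn_;
};

// Owns the mutex and the handle, so a locked CudnnHandle is the only way in.
class Access {
 public:
  CudnnHandle GetHandle(FakeStream stream) {
    return CudnnHandle(mu_, &cudnn_, stream);
  }
  const std::vector<std::string>& log() const { return cudnn_.log; }

 private:
  std::mutex mu_;
  FakeCudnn cudnn_;
};

// A) No handle involved: called directly, no lock and no stream.
struct FakeTensorDesc { int n; };
FakeTensorDesc MakeTensorDescriptor(int n) { return FakeTensorDesc{n}; }

// B) Takes a handle and enqueues work: needs the caller's stream.
void Convolve(Access& access, FakeStream stream) {
  CudnnHandle cudnn = access.GetHandle(stream);
  cudnn.get()->log.push_back("conv on stream " +
                             std::to_string(cudnn.get()->stream));
}

// C) Takes a handle but presumably enqueues nothing: still serialized, and
// the stream is set to the default one to be safe.
void SetUpRnnDescriptor(Access& access) {
  CudnnHandle cudnn = access.GetHandle(kLegacyDefaultStream);
  cudnn.get()->log.push_back("rnn descriptor on stream " +
                             std::to_string(cudnn.get()->stream));
}
```

Treating category C like category B costs a lock and a stream switch that are probably unnecessary, in exchange for not relying on undocumented behavior.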

PiperOrigin-RevId: 195856300
parent b15500be