Better wrapping of stream executor's cuDNN API calls. Replacing mutex locking...
Better wrapping of stream executor's cuDNN API calls. Replacing mutex locking and setting the cuDNN stream followed by calling wrap::cudnn... with an RAII CudnnHandle object that handles the former two operations. Distinguish three different API types: A) APIs that don't take a cudnnHandle_t: These are thread-safe APIs that don't enqueue any CUDA work on a stream. They can be called directly without any extra precautions. B) APIs that take a cudnnHandle_t and perform CUDA work. The CUDA context needs to be acquired and the stream needs to be set beforehand, calls need to be serialized. A CudnnHandle instance guarantees that this work has been performed before calling cuDNN. C) APIs that do take a cudnnHandle_t, but (presumably, the API makes no guarantees) still don't perform any CUDA work. This is limited to the API to setup RNN descriptors. Calls need to be serialized, but most likely we wouldn't need to acquire the CUDA context or set the stream. We still do though using the legacy default stream, because there are no guarantees. PiperOrigin-RevId: 195856300
Loading
Please sign in to comment