* Add mechanism to CudaSolver for capturing references to temporary tensors. This way users of the class don't have to remember to capture each one manually to avoid premature deallocation and memory races for asynchronous op kernels.
* Add simple tests that run multiple ops concurrently for linalg ops that use CudaSolver.
* Put a lock around the calls to cusolverDn*getrs and cusolverDn*gesvd, which appear not to be thread-safe.
* Misc. cleanup in linalg GPU kernels.

I ran all the related tests 1000 times without failure. Before this change, tests for matrix_solve and svd would fail or hang occasionally.

PiperOrigin-RevId: 170557380
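
The two ideas in this change (holding references to temporaries until asynchronous work finishes, and serializing calls into routines that appear not to be thread-safe) can be illustrated with a minimal host-only sketch. This is not the actual CudaSolver code: `SolverSketch`, `Buffer`, `AllocateScratch`, `CallSerialized`, and `MakeDoneCallback` are hypothetical names invented for illustration, and a `std::vector<float>` stands in for a temporary device tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a temporary device tensor.
using Buffer = std::vector<float>;

// Illustrative sketch only; not the real CudaSolver interface.
class SolverSketch {
 public:
  // Allocates a scratch buffer and records a reference to it, so the caller
  // does not have to remember to keep it alive manually.
  std::shared_ptr<Buffer> AllocateScratch(std::size_t n) {
    auto buf = std::make_shared<Buffer>(n, 0.0f);
    held_.push_back(buf);
    return buf;
  }

  // Runs fn while holding one process-wide mutex, serializing calls into a
  // routine that is assumed not to be thread-safe.
  static void CallSerialized(const std::function<void()>& fn) {
    std::lock_guard<std::mutex> lock(Mutex());
    fn();
  }

  // Transfers all held references into a callback. The buffers stay alive
  // until the callback runs, which should happen only after the
  // asynchronous work that uses them has completed.
  std::function<void()> MakeDoneCallback() {
    auto held = std::make_shared<std::vector<std::shared_ptr<Buffer>>>(
        std::move(held_));
    held_.clear();  // Leave the moved-from vector in a known empty state.
    assert(held_.empty());
    return [held]() { held->clear(); };
  }

  std::size_t num_held() const { return held_.size(); }

 private:
  // A single mutex shared by every call to CallSerialized.
  static std::mutex& Mutex() {
    static std::mutex mu;
    return mu;
  }

  std::vector<std::shared_ptr<Buffer>> held_;
};
```

In this sketch a kernel would call `AllocateScratch` for each temporary, enqueue its work, and arrange for the result of `MakeDoneCallback` to run once that work has finished. The mutex is deliberately process-wide rather than per-instance, on the assumption that the unsafe state lives inside the library being called rather than in any one solver object.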