Commit ad65038c authored by Rohan Jain, committed by TensorFlower Gardener

Fixing MultiDeviceIterator memory leak issue in eager mode.

In Graph mode, we rely on destruction of the MultiDeviceIteratorHandleOp kernel to decrement the ref count for the resource. Since kernels are not destroyed in Eager mode, we had explicitly added a destroy_resource_op to compensate.

The problem is that this isn't enough. The ResourceMgr.LookupOrCreate method ends up increasing the ref count of the resource by 2, and in Graph mode the kernel destructor was effectively doing two Unrefs to balance that. The destroy_resource_op accounts for only one of them, so in Eager mode the refcount stayed at 1 and never went down to zero.

The fix here is to handle the Eager mode case separately, similar to what we've done with the AnonymousIteratorHandleOp. Instead of creating a whole new kernel, we re-use the existing kernel and use a special shared_name argument to identify when to switch the behavior. Now in Eager mode, the refcount of the resource is 1 after running the HandleOp kernel, so the destroy_resource_op brings it down to zero.
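
The ref-count arithmetic described above can be followed in a small toy model. This is only a sketch and not TensorFlow code: ToyResource, ToyResourceManager and the three lifetime functions are hypothetical names. It assumes the two counts from LookupOrCreate are one reference held by the manager and one handed to the caller. How the real kernel ends up with a count of 1 in Eager mode is not shown in this message, so the sketch simply drops the caller's reference.

```python
class ToyResource:
    """Hypothetical ref-counted resource (not a TensorFlow class)."""

    def __init__(self):
        self.refcount = 1   # reference held by whoever created it
        self.freed = False

    def ref(self):
        self.refcount += 1

    def unref(self):
        assert self.refcount > 0, "unref on an already freed resource"
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


class ToyResourceManager:
    """Hypothetical stand-in for a resource manager."""

    def __init__(self):
        self._resources = {}

    def lookup_or_create(self, name):
        res = self._resources.get(name)
        if res is None:
            res = ToyResource()           # count 1: the manager's reference
            self._resources[name] = res
        res.ref()                         # +1: the reference returned to the caller
        return res

    def delete(self, name):
        # Models a destroy-resource op: drop the manager's reference only.
        self._resources.pop(name).unref()


def graph_mode_lifetime():
    mgr = ToyResourceManager()
    res = mgr.lookup_or_create("iterator")   # refcount == 2
    # Kernel destructor: two unrefs in total.
    res.unref()
    mgr.delete("iterator")
    return res


def eager_before_fix():
    mgr = ToyResourceManager()
    res = mgr.lookup_or_create("iterator")   # refcount == 2
    mgr.delete("iterator")                   # destroy op only: refcount == 1
    return res


def eager_after_fix():
    mgr = ToyResourceManager()
    res = mgr.lookup_or_create("iterator")   # refcount == 2
    res.unref()                              # assumed: handle op leaves the count at 1
    mgr.delete("iterator")                   # destroy op: refcount == 0
    return res


if __name__ == "__main__":
    for fn in (graph_mode_lifetime, eager_before_fix, eager_after_fix):
        res = fn()
        print(fn.__name__, "refcount:", res.refcount, "freed:", res.freed)
```

In this model only eager_before_fix leaves the resource alive, which corresponds to the leak described above.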

PiperOrigin-RevId: 231333966
parent 60192192