By default, the _Retval op returns INT32 tensors in host memory. This is a problem for RemoteCall ops, which, when run on a GPU, expect their results to always be in device memory (except for strings and resources). As a result, when we run a remote call function via a GeneratorDataset (placed on GPU) in the MultiDeviceIterator, _Retval copies the INT32 tensors to host memory, and downstream consumers of the data misbehave (they read the values as garbage or NaNs).

We fix this by conditionally (based on an experimental attr) replacing the _Retval op with a new op that omits the HostMemory annotation for INT32 types, so that INT32 tensors are placed in device memory. We also make the MultiDeviceIterator implementation use this experimental attr.

PiperOrigin-RevId: 216961804
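The mismatch described above can be illustrated with a small toy model. This is only a sketch of the placement rule as stated in this message, not TensorFlow code: the names `retval_memory`, `remote_call_expected_memory`, `placement_mismatches`, and the `use_device_retval` flag are hypothetical stand-ins (the flag plays the role of the experimental attr), and the model assumes strings and resources stay in host memory in both cases.

```python
from enum import Enum


class DType(Enum):
    INT32 = "int32"
    FLOAT32 = "float32"
    STRING = "string"
    RESOURCE = "resource"


HOST = "host"
DEVICE = "device"

# Assumption of this toy model: these types live in host memory on a GPU
# no matter which return op is used.
ALWAYS_HOST = {DType.STRING, DType.RESOURCE}


def retval_memory(dtype, use_device_retval):
    """Where a function's return value of `dtype` ends up on a GPU.

    `use_device_retval` is a hypothetical flag standing in for the
    experimental attr that swaps _Retval for the new op.
    """
    if dtype in ALWAYS_HOST:
        return HOST
    if dtype is DType.INT32 and not use_device_retval:
        # Default _Retval behaviour: INT32 is annotated as host memory.
        return HOST
    return DEVICE


def remote_call_expected_memory(dtype):
    """Where a RemoteCall op on a GPU expects to find its result."""
    return HOST if dtype in ALWAYS_HOST else DEVICE


def placement_mismatches(dtypes, use_device_retval):
    """Return the dtypes whose actual and expected placement disagree."""
    return [
        d
        for d in dtypes
        if retval_memory(d, use_device_retval) != remote_call_expected_memory(d)
    ]


if __name__ == "__main__":
    print(placement_mismatches(list(DType), use_device_retval=False))
    print(placement_mismatches(list(DType), use_device_retval=True))
```

In this model, only INT32 is mismatched with the default op, and enabling the flag removes the mismatch, which mirrors the fix described in this change.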