Commit 5f69248a authored Oct 09, 2018 by Igor Ganichev Committed by TensorFlower Gardener Oct 09, 2018

Make defun work under distributed strategies.

The core of the change is to have the gradient tape capture
distributed variables instead of plain ResourceVariables.
In other words, we move the distribution awareness from defun
down to the tape and rely on distributed variable magic to provide us
with the right variable at runtime.

In a tower context, we always watch the container (e.g. MirroredVariable).
In a cross-tower context, we always watch all the components.
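
The watching rule above can be illustrated with a toy model. This is only a
sketch, not the TensorFlow implementation: ToyVariable, ToyMirroredVariable,
ToyTape, and watch_variable are hypothetical names invented for illustration,
and the in_tower_context flag stands in for whatever mechanism the real code
uses to detect the current context.

```python
# Toy model only: every class and function here is a hypothetical stand-in,
# not a TensorFlow API.
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class ToyVariable:
    """Stand-in for a plain per-device variable."""
    name: str


@dataclass(eq=False)
class ToyMirroredVariable:
    """Stand-in for a container holding one component per device."""
    name: str
    components: List[ToyVariable]


@dataclass
class ToyTape:
    """Records the objects it was asked to watch, in order."""
    watched: list = field(default_factory=list)

    def watch(self, obj) -> None:
        self.watched.append(obj)


def watch_variable(tape: ToyTape, var: ToyMirroredVariable,
                   in_tower_context: bool) -> None:
    """Apply the rule from the commit message to a single variable."""
    if in_tower_context:
        # Tower context: watch the container itself.
        tape.watch(var)
    else:
        # Cross-tower context: watch every component.
        for component in var.components:
            tape.watch(component)


if __name__ == "__main__":
    w = ToyMirroredVariable("w", [ToyVariable("w/dev0"), ToyVariable("w/dev1")])
    tower_tape, cross_tape = ToyTape(), ToyTape()
    watch_variable(tower_tape, w, in_tower_context=True)
    watch_variable(cross_tape, w, in_tower_context=False)
    print([v.name for v in tower_tape.watched])
    print([v.name for v in cross_tape.watched])
```

In this toy, the tape in a tower context holds a single entry (the container),
while the tape in a cross-tower context holds one entry per component.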

PiperOrigin-RevId: 216430530

parent c1093a37
