Fix loss reductions to correctly divide by global batch size
when using a tf.distribute.Strategy, instead of less-reliable division by the number of replicas in optimizers.

RELNOTES: Bug fix: loss and gradients should now more reliably be correctly scaled w.r.t. the global batch size when using a tf.distribute.Strategy.

PiperOrigin-RevId: 232400597
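
Illustrative note (not part of the change itself): the arithmetic behind the two scaling approaches can be sketched in plain Python, without TensorFlow. The function names and the sample losses below are hypothetical. The uneven per-replica batch sizes are an assumption, chosen as one case where dividing by the number of replicas could differ from dividing by the global batch size.

```python
# Minimal sketch in plain Python; no TensorFlow APIs are used here.
# All names are hypothetical and exist only to illustrate the arithmetic.

def scale_by_global_batch(per_replica_losses, global_batch_size):
    """Each replica sums its per-example losses and divides by the GLOBAL
    batch size; summing across replicas yields the global-batch mean."""
    return sum(sum(losses) / global_batch_size for losses in per_replica_losses)


def scale_by_num_replicas(per_replica_losses):
    """Each replica averages over its LOCAL batch; the summed result is then
    divided by the number of replicas."""
    num_replicas = len(per_replica_losses)
    local_means = [sum(losses) / len(losses) for losses in per_replica_losses]
    return sum(local_means) / num_replicas


# Assumed example: two replicas with uneven local batches (3 and 1 examples).
per_replica_losses = [[1.0, 2.0, 3.0], [6.0]]
global_batch_size = sum(len(losses) for losses in per_replica_losses)

# Reference value: the mean over every example in the global batch.
all_losses = [x for losses in per_replica_losses for x in losses]
true_mean = sum(all_losses) / len(all_losses)

by_global_batch = scale_by_global_batch(per_replica_losses, global_batch_size)
by_num_replicas = scale_by_num_replicas(per_replica_losses)

print(true_mean, by_global_batch, by_num_replicas)
```

In this assumed case, dividing by the global batch size reproduces the mean over all examples, while dividing by the number of replicas gives extra weight to the example on the smaller replica.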