Commit d4bb9925 authored by Ayush Dubey's avatar Ayush Dubey Committed by TensorFlower Gardener
Browse files

Hierarchical broadcast collective op.

Before this CL, we had a binary tree based implementation.  We would
essentially form an arbitrary, logical binary tree over all devices and then
send the tensor from each node to its children in the tree.

This change introduces a topology-aware hierarchical implementation.  First, we
perform a binary tree broadcast with only 1 device per machine.  Then for each
host, the device that participated in the first inter-machine broadcast is the
source for an intra-machine binary tree broadcast.

PiperOrigin-RevId: 206771821
parent a7a9ea97
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment