Commit 1736e0bb authored by Peter Hawkins's avatar Peter Hawkins Committed by TensorFlower Gardener
Browse files

[TF] Add new internal ops _VarHandlesOp and _ReadVariablesOp.

The purpose of these ops is to fix a latency problem observed for an inference benchmark. Often a inference step starts by reading the value of many (hundreds) of weights. For a resource variable, this requires a VarHandleOp and a ReadVariableOp per variable. Running hundreds of trivial ops can add hundreds of microseconds of latency to the critical path of an inference step. The inter-op latency of the executor can be hundreds of nanoseconds, which rapidly adds up.

This change introduces two fused ops _VarHandlesOp and _ReadVariablesOp that allow us to read many variables in a pair of larger ops, rather than many tiny ops.

PiperOrigin-RevId: 214662338
parent 652ce1aa
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment