Commit 723f285b
Authored Nov 28, 2017 by Justin Lebar
Committed Nov 28, 2017 by TensorFlower Gardener

[XLA] Improvements to replay_computation tool.

 * Reduce the threshold at which we run fake-data generation on the
   device from 1 GB to 1 MB.  At the old threshold, I observed cases
   where we'd spend many seconds, and >50% of our runtime, in logf(),
   which is used for computing random numbers.

 * Don't retrieve or print the result when running with fake data.
   Presumably this is uninteresting, because garbage in, garbage out.
   Retrieving this data can take as long as running the whole
   computation, and printing it can take many times longer.

 * Add a LOG(INFO) indicating how long execution took.

 * Add a --num_runs flag.  This is particularly important on GPUs, where
   the first run does autotuning, and so isn't interesting from a
   performance perspective.
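
The timing and --num_runs points above can be illustrated with a minimal
sketch. This is not the code from this commit: TimeRuns and
MeanExcludingFirst are hypothetical helper names, and the computation is
modeled as a plain callable rather than an XLA computation. The idea is
only to show why repeating the run matters: the first run (which may
include autotuning on GPUs) is timed like the others but left out of the
average.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Hypothetical helper (not part of replay_computation): runs `computation`
// `num_runs` times, logs how long each run took to stderr, and returns the
// per-run wall-clock durations in seconds.
std::vector<double> TimeRuns(const std::function<void()>& computation,
                             int num_runs) {
  std::vector<double> seconds;
  for (int i = 0; i < num_runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    computation();
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = end - start;
    seconds.push_back(elapsed.count());
    std::fprintf(stderr, "run %d: execution took %.6fs\n", i, elapsed.count());
  }
  return seconds;
}

// Hypothetical helper: mean duration of all runs except the first, which is
// treated as a warm-up run. Returns 0 when there are fewer than two runs.
double MeanExcludingFirst(const std::vector<double>& seconds) {
  if (seconds.size() < 2) return 0.0;
  double total = 0.0;
  for (std::size_t i = 1; i < seconds.size(); ++i) total += seconds[i];
  return total / static_cast<double>(seconds.size() - 1);
}
```

For example, MeanExcludingFirst(TimeRuns(run_once, 5)) would report the
average of runs 1 through 4, where run_once is whatever callable executes
the computation.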

PiperOrigin-RevId: 177185636

parent c81a8ae5
