Commit a21dcdfc authored by Brennan Saeta, committed by TensorFlower Gardener

Improve the performance of the resize_bicubic_op.

This change results in a >2x speedup when scaling an image up, and a ~1.6x speedup when scaling an image down. (Based on the benchmarks defined in image_ops_test.py.)

Additionally, we preserve the old behavior in a unit test, and ensure the new results do not deviate from it by more than 1e-5. (The computations are the same, but we've reordered them, and so floating-point inaccuracies crop up.)
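
The tolerance check described above could look roughly like the following. This is only a sketch: the helper name AllWithinTolerance is hypothetical, and the actual unit test in this change is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the TensorFlow sources): returns true
// when both buffers have the same length and every pair of elements differs
// by at most `tolerance`.
inline bool AllWithinTolerance(const std::vector<float>& expected,
                               const std::vector<float>& actual,
                               float tolerance) {
  if (expected.size() != actual.size()) return false;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    const float diff = std::fabs(expected[i] - actual[i]);
    // Written as !(diff <= tolerance) so that a NaN difference is rejected.
    if (!(diff <= tolerance)) return false;
  }
  return true;
}
```

Calling AllWithinTolerance(old_output, new_output, 1e-5f) on the outputs of the old and new code paths would then express the 1e-5 bound stated above; old_output and new_output are placeholder names.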

The two biggest performance wins come from:
  1. Instead of using array<float, 4>, manage the values as individual variables. (This allows the compiler to keep them in registers instead of pushing them onto the stack.)
  2. Cache previously computed intermediate values to avoid having to fetch and re-compute.
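
To illustrate the two points above, here is a minimal, self-contained sketch. It is not the code from this change. The names (CubicWeights, ComputeWeights, InterpolateArray, Interpolate, ResizeBicubic), the kernel coefficient a = -0.5, the clamped borders, and the single-channel row-major layout are all assumptions made for illustration. It contrasts an array-based interpolation with one that takes named scalars, and caches the vertically interpolated value of each source column once per output row so the horizontal pass can reuse it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Four cubic-convolution weights held as named scalars rather than an array.
struct CubicWeights {
  float w0, w1, w2, w3;
};

// Keys cubic kernel weights for a fractional offset t in [0, 1).
// The coefficient a = -0.5 is an assumption made for this sketch.
inline CubicWeights ComputeWeights(float t) {
  const float a = -0.5f;
  auto near_tap = [a](float x) {  // taps at distance |x| <= 1
    return ((a + 2.0f) * x - (a + 3.0f)) * x * x + 1.0f;
  };
  auto far_tap = [a](float x) {  // taps at distance 1 <= |x| <= 2
    return ((a * x - 5.0f * a) * x + 8.0f * a) * x - 4.0f * a;
  };
  return {far_tap(1.0f + t), near_tap(t), near_tap(1.0f - t),
          far_tap(2.0f - t)};
}

// Array-based style: weights and values live in std::array<float, 4>.
inline float InterpolateArray(const std::array<float, 4>& w,
                              const std::array<float, 4>& v) {
  float sum = 0.0f;
  for (int i = 0; i < 4; ++i) sum += w[i] * v[i];
  return sum;
}

// Scalar style: the four values are passed as individual variables.
inline float Interpolate(const CubicWeights& w, float v0, float v1, float v2,
                         float v3) {
  return w.w0 * v0 + w.w1 * v1 + w.w2 * v2 + w.w3 * v3;
}

// Resizes a single-channel, row-major image with clamped borders.
// For each output row, the vertical interpolation of every source column is
// computed once and stored in column_cache; the horizontal pass then reads
// the cache instead of fetching four source rows for every output pixel.
std::vector<float> ResizeBicubic(const std::vector<float>& in, int in_h,
                                 int in_w, int out_h, int out_w) {
  std::vector<float> out(static_cast<size_t>(out_h) * out_w);
  std::vector<float> column_cache(in_w);
  const float y_scale = static_cast<float>(in_h) / out_h;
  const float x_scale = static_cast<float>(in_w) / out_w;
  auto clamp_index = [](int i, int n) { return std::min(std::max(i, 0), n - 1); };

  for (int oy = 0; oy < out_h; ++oy) {
    const float sy = oy * y_scale;
    const int y1 = static_cast<int>(std::floor(sy));
    const CubicWeights wy = ComputeWeights(sy - y1);
    const float* r0 = &in[static_cast<size_t>(clamp_index(y1 - 1, in_h)) * in_w];
    const float* r1 = &in[static_cast<size_t>(clamp_index(y1, in_h)) * in_w];
    const float* r2 = &in[static_cast<size_t>(clamp_index(y1 + 1, in_h)) * in_w];
    const float* r3 = &in[static_cast<size_t>(clamp_index(y1 + 2, in_h)) * in_w];
    for (int x = 0; x < in_w; ++x) {
      column_cache[x] = Interpolate(wy, r0[x], r1[x], r2[x], r3[x]);
    }
    for (int ox = 0; ox < out_w; ++ox) {
      const float sx = ox * x_scale;
      const int x1 = static_cast<int>(std::floor(sx));
      const CubicWeights wx = ComputeWeights(sx - x1);
      out[static_cast<size_t>(oy) * out_w + ox] =
          Interpolate(wx, column_cache[clamp_index(x1 - 1, in_w)],
                      column_cache[clamp_index(x1, in_w)],
                      column_cache[clamp_index(x1 + 1, in_w)],
                      column_cache[clamp_index(x1 + 2, in_w)]);
    }
  }
  return out;
}
```

Whether the scalar form actually stays in registers depends on the compiler and optimization level, so any gain should be confirmed with a benchmark. The cache here is the simplest possible one; when scaling down it may compute columns that are never read, which a more careful implementation would avoid.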
Change: 145577417
parent 2d2ca484