Commit 09a7d525 authored by Bixia Zheng, committed by TensorFlower Gardener

[XLA:GPU] Convert the reduction implementation to use tiling scheme.

Convert the implementation of scalar reduction, row reduction and column
reduction to use EmitTiledKernel, which is a more general kernel tiling
implementation that is based on the information defined by an object of
TilingScheme. For scalar reduction and row reduction, the new implementation
should generate the same optimized code as the old implementation.

For column reduction, the old implementation in routine
IrEmitterUnnested::EmitColumnReduction uses kTileWidth=2 so that one thread
computes the partial results for two elements in the output of each kReduce
instruction. The new implementation is equivalent to the old implementation
with kTileWidth=1 in this regard.
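
To illustrate the difference in tile width, here is a minimal host-side sketch. It is not the XLA implementation: ColumnReduceSum and its tile_width parameter are hypothetical names, and each "thread" is simulated by a plain loop iteration. It assumes a row-major rows x cols input and a sum reduction. Each simulated thread keeps one partial result per output column it owns, so tile_width=2 mirrors the old kTileWidth=2 behavior and tile_width=1 mirrors the new one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of XLA.
// Sums each column of a row-major rows x cols matrix. Each simulated
// "thread" owns tile_width adjacent output columns and accumulates one
// partial result per owned column.
std::vector<float> ColumnReduceSum(const std::vector<float>& input,
                                   std::size_t rows, std::size_t cols,
                                   std::size_t tile_width) {
  assert(input.size() == rows * cols);
  assert(tile_width > 0);

  std::vector<float> output(cols, 0.0f);
  // Number of simulated threads needed to cover all output columns.
  const std::size_t num_threads = (cols + tile_width - 1) / tile_width;

  for (std::size_t thread = 0; thread < num_threads; ++thread) {
    const std::size_t first_col = thread * tile_width;
    std::vector<float> partial(tile_width, 0.0f);

    // Walk down the rows, updating every partial result this thread owns.
    for (std::size_t r = 0; r < rows; ++r) {
      for (std::size_t t = 0; t < tile_width; ++t) {
        const std::size_t c = first_col + t;
        if (c < cols) partial[t] += input[r * cols + c];
      }
    }

    // Write the finished partial results to the output columns.
    for (std::size_t t = 0; t < tile_width; ++t) {
      const std::size_t c = first_col + t;
      if (c < cols) output[c] = partial[t];
    }
  }
  return output;
}
```

In this sketch both tile widths produce the same output. Only the number of output elements handled per simulated thread changes, which is the aspect in which the new implementation differs from the old one.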

PiperOrigin-RevId: 222752674
parent 8a89ca38