Commit 428d0342 authored by A. Unique TensorFlower's avatar A. Unique TensorFlower Committed by TensorFlower Gardener
Browse files

Fix pontential issue with number of blocks launched for depthwise kernels: the...

Fix pontential issue with number of blocks launched for depthwise kernels: the number of work_elements was too small, which could return a block_count that is too small to cover all elements.

We also have been ignoring the suggested thread_per_block, so were potentially launching more blocks than necessary to fill the GPU (which is inefficient, but functionally correct).

Changing 'assert(false && ...' to LOG(FATAL) because it shouldn't be debug only.

PiperOrigin-RevId: 186037306
parent 96c2a846
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment