Commit 33035bb7 authored by Adrian Kuegel, committed by TensorFlower Gardener

Parallelize BitonicSort on GPU.

We now emit O(log^2 n) kernel thunks. Each thunk is responsible for looping over
the other dimensions, and then doing a comparison loop through the dimension
that should be sorted.
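
For illustration only, here is a minimal Python sketch of the textbook bitonic
sorting network, not the code from this change. It assumes a 1-D input whose
length is a power of two. The names bitonic_stages, run_stage and bitonic_sort
are hypothetical. Each (k, j) stage plays the role of one kernel thunk: all
compare-exchanges inside a stage are independent, so they can run in parallel,
and the number of stages is log2(n) * (log2(n) + 1) / 2.

```python
def bitonic_stages(n):
    """Yield one (k, j) pair per compare-exchange stage for power-of-two n."""
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between compared elements
            yield k, j
            j //= 2
        k *= 2


def run_stage(data, k, j):
    """Apply a single stage in place.

    Each index i touches only data[i] and data[i ^ j], so every iteration
    is independent; on a GPU each i could be handled by its own thread.
    """
    for i in range(len(data)):
        partner = i ^ j
        if partner > i:
            ascending = (i & k) == 0
            # Swap when the pair is out of order for this block's direction.
            if (data[i] > data[partner]) == ascending:
                data[i], data[partner] = data[partner], data[i]


def bitonic_sort(data):
    """Sort data in place, one stage ("kernel launch") at a time."""
    for k, j in bitonic_stages(len(data)):
        run_stage(data, k, j)
    return data
```

In this sketch the outer loop in bitonic_sort corresponds to the sequence of
thunks. Sorting along one dimension of a multi-dimensional array would wrap
run_stage in a loop over the other dimensions.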

PiperOrigin-RevId: 205791397
parent fca1561b