[XLA:CPU/GPU] Implement the parallel Philox random number generation algorithm.
Implement the RNG elemental IR generator using the Philox algorithm. To ensure that multiple executions of the same RNG HLO instruction rarely produce the same result, we increment a global variable by the number of random numbers the instruction generates each time it is executed, and use the value of that global variable to construct the seed for the RNG algorithm.

Modify the GPU backend to generate a parallel loop to execute the Philox algorithm. The CPU backend still uses a sequential loop for Philox random number generation; changing this will require enhancing the ParallelTaskAssignment pass.

Remove the old PCG RNG algorithm from the CPU and GPU backends.

PiperOrigin-RevId: 206069733
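
For readers unfamiliar with counter-based generators, the following is a minimal host-side sketch of the scheme described in this message, not the IR that XLA emits. It uses the standard Philox4x32 constants with 10 rounds. The names `g_rng_state`, `Philox4x32`, and `GenerateRandomBits`, and the way the global value is packed into the Philox counter, are assumptions made for illustration only.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

using Counter = std::array<uint32_t, 4>;
using Key = std::array<uint32_t, 2>;

// Hypothetical global state: advanced by the number of values each execution
// produces, so repeated executions draw from disjoint counter ranges.
std::atomic<uint64_t> g_rng_state{0};

// One Philox4x32 round: two 32x32->64 multiplies, then XOR and permute.
inline Counter PhiloxRound(const Counter& c, const Key& k) {
  const uint64_t p0 = static_cast<uint64_t>(0xD2511F53u) * c[0];
  const uint64_t p1 = static_cast<uint64_t>(0xCD9E8D57u) * c[2];
  return {static_cast<uint32_t>(p1 >> 32) ^ c[1] ^ k[0],
          static_cast<uint32_t>(p1),
          static_cast<uint32_t>(p0 >> 32) ^ c[3] ^ k[1],
          static_cast<uint32_t>(p0)};
}

// Ten rounds; the key is bumped by the Weyl constants between rounds.
inline Counter Philox4x32(Counter c, Key k) {
  for (int round = 0; round < 10; ++round) {
    if (round > 0) {
      k[0] += 0x9E3779B9u;
      k[1] += 0xBB67AE85u;
    }
    c = PhiloxRound(c, k);
  }
  return c;
}

// Produces n 32-bit values. Each loop iteration depends only on its own
// index, so the iterations could run in parallel (as on the GPU backend);
// they run sequentially here.
std::vector<uint32_t> GenerateRandomBits(uint64_t n, const Key& key) {
  const uint64_t base = g_rng_state.fetch_add(n);  // reserve [base, base + n)
  std::vector<uint32_t> out(n);
  for (uint64_t i = 0; i < n; i += 4) {
    const uint64_t s = base + i;
    // Assumed packing: the 64-bit state fills the low two counter words.
    const Counter ctr = {static_cast<uint32_t>(s),
                         static_cast<uint32_t>(s >> 32), 0u, 0u};
    const Counter block = Philox4x32(ctr, key);
    for (uint64_t j = 0; j < 4 && i + j < n; ++j) out[i + j] = block[j];
  }
  return out;
}
```

Because each block of four outputs is a pure function of its counter and key, no state is carried between iterations, which is what makes the loop straightforward to parallelize. Calling `GenerateRandomBits(8, key)` twice with the same key yields different values, since the second call starts from a later counter.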