[XLA:GPU] Rewrite elemental emission of bitcasts
My first attempt only handled bitcasts that implement a reshape operation; transposes and mixed bitcasts are now handled as well. A follow-up could probably reduce the amount of address arithmetic emitted to IR. The existing test suite already covers this fairly well: without this change, tests fail with layout_assignment before fusion.

PiperOrigin-RevId: 188155082
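
For illustration only, here is a minimal Python sketch of the kind of index arithmetic that an elemental bitcast has to perform. It is not the code from this change. It assumes that a bitcast leaves the linear memory order of elements untouched, so an output index can be mapped to an operand index by linearizing it under the output shape and layout, then delinearizing the result under the operand shape and layout. The function names are hypothetical; layouts are given as minor-to-major dimension lists.

```python
def linearize(index, shape, minor_to_major):
    """Return the linear element offset of `index` for the given layout."""
    linear, stride = 0, 1
    # Walk dimensions from fastest-varying (most minor) to slowest-varying.
    for dim in minor_to_major:
        linear += index[dim] * stride
        stride *= shape[dim]
    return linear


def delinearize(linear, shape, minor_to_major):
    """Return the multidimensional index that sits at offset `linear`."""
    index = [0] * len(shape)
    for dim in minor_to_major:
        index[dim] = linear % shape[dim]
        linear //= shape[dim]
    return tuple(index)


def bitcast_source_index(out_index, out_shape, out_layout, in_shape, in_layout):
    """Map an output index of a bitcast to the operand index it reads from.

    Hypothetical helper: both sides are assumed to address the same
    buffer, so they agree on the linear offset of every element.
    """
    linear = linearize(out_index, out_shape, out_layout)
    return delinearize(linear, in_shape, in_layout)


# Transpose-like bitcast: a row-major [2, 3] operand viewed as a
# column-major [3, 2] output. Output index (i, j) reads operand index (j, i).
transposed = bitcast_source_index((2, 1), (3, 2), [0, 1], (2, 3), [1, 0])

# Reshape-like bitcast: the same operand viewed as a flat [6] output.
reshaped = bitcast_source_index((4,), (6,), [0], (2, 3), [1, 0])
```

Every element pays for one multiply-add chain plus one modulo/divide chain in this naive form, which is the sort of address arithmetic a follow-up could try to simplify for pure reshapes or pure transposes.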