Improve performance of HloComputation::MakeInstructionPostOrder
Previously it used the same infrastructure as HloInstruction::Accept, which caused high overhead for large models because of the extra work needed to support modifying the graph during iteration and because of the lack of caching on graphs with multiple sinks. The new code is a very simple implementation of an iterative DFS-based topological sort.

PiperOrigin-RevId: 199606688
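
For illustration only, an iterative DFS-based topological sort can be sketched as follows. This is not the code from this change: the `Graph` alias, the `MakePostOrder` function, and the integer node ids are hypothetical stand-ins for HLO instructions and their operands, and the sketch assumes the graph is acyclic.

```cpp
#include <vector>

// Hypothetical stand-in for an HLO graph: operands[i] lists the nodes that
// node i depends on. Not the real HloComputation data structure.
using Graph = std::vector<std::vector<int>>;

// Returns all nodes in post order, so that every node appears after all of
// its operands. Assumes the graph is acyclic.
std::vector<int> MakePostOrder(const Graph& operands) {
  enum State { kUnvisited, kVisiting, kDone };
  std::vector<State> state(operands.size(), kUnvisited);
  std::vector<int> post_order;
  post_order.reserve(operands.size());
  std::vector<int> stack;  // Explicit DFS stack instead of recursion.

  // Start a DFS from every node so graphs with multiple sinks are covered;
  // the shared state vector ensures each node is emitted only once.
  for (int root = 0; root < static_cast<int>(operands.size()); ++root) {
    if (state[root] != kUnvisited) continue;
    stack.push_back(root);
    while (!stack.empty()) {
      const int node = stack.back();
      if (state[node] == kDone) {
        // Duplicate stack entry for a node that was already emitted.
        stack.pop_back();
      } else if (state[node] == kVisiting) {
        // Second time on top of the stack: all operands have been emitted.
        state[node] = kDone;
        post_order.push_back(node);
        stack.pop_back();
      } else {
        // First visit: keep the node on the stack and schedule its operands.
        state[node] = kVisiting;
        for (int operand : operands[node]) {
          if (state[operand] == kUnvisited) stack.push_back(operand);
        }
      }
    }
  }
  return post_order;
}
```

Each node is marked once and each edge is examined once, so the sketch runs in time linear in the size of the graph, and the explicit stack avoids deep recursion on large models.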