Split up HloExecutionProfile into a set of re-usable components
The end goal is to have HLO profiling support in XlaJitCompiledCpuFunction and eventually in the AOT compiled XlaCompiledCpuFunction.

This change leaves the HloExecutionProfile interface mostly intact; internally it now uses the newly split-out components to do what it did before. However, in future CLs:

- I'll extract a HloExecutionProfilePrototype that contains the HloProfilePrinter, the OwningHloProfilePrinterStaticData and the HloToProfileIndex. This will then live in the Executable (if profiling is enabled).
- The HloExecutionProfile for a specific execution will have a pointer to the parent HloExecutionProfilePrototype, which it will use to paginate profile_counters_.
- The CPU backend will use the HloToProfileIndex in the HloExecutionProfilePrototype to map HLO instructions to profile counter offsets. This will make the indices in the generated code "line up" with the indices that the HloProfilePrinter expects.

These changes will allow the XlaJitCompiledCpuFunction (and later AOT) clients to pass an appropriately sized, zeroed buffer to the generated function and then pass that same buffer to the appropriate HloProfilePrinter to get a textual HLO profile.

PiperOrigin-RevId: 175613737
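
For illustration only, here is a minimal sketch of the intended flow: one index map assigns each instruction a counter slot, the generated code writes into a zeroed buffer at those slots, and a printer reads the same buffer back using the same indices. The names `Instruction`, `ProfileIndexMap`, and `PrintProfile` are hypothetical stand-ins and are not the XLA classes named above; the output format is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an HLO instruction; only the name matters here.
struct Instruction {
  std::string name;
};

// Hypothetical analogue of an instruction-to-counter-index map: each distinct
// instruction name gets a stable slot in a flat counter buffer.
class ProfileIndexMap {
 public:
  explicit ProfileIndexMap(const std::vector<Instruction>& instructions) {
    for (const Instruction& instr : instructions) {
      // emplace() leaves the map unchanged if the name was already present.
      if (index_.emplace(instr.name, names_.size()).second) {
        names_.push_back(instr.name);
      }
    }
  }

  // Slot of the named instruction; the name must be known to the map.
  std::size_t IndexOf(const std::string& name) const {
    auto it = index_.find(name);
    assert(it != index_.end() && "unknown instruction");
    return it->second;
  }

  std::size_t size() const { return names_.size(); }
  const std::string& NameAt(std::size_t i) const { return names_[i]; }

 private:
  std::unordered_map<std::string, std::size_t> index_;
  std::vector<std::string> names_;
};

// Hypothetical printer: renders a counter buffer using the same map that
// produced the indices, so writer and reader agree on the layout.
std::string PrintProfile(const ProfileIndexMap& map,
                         const std::vector<std::uint64_t>& counters) {
  assert(counters.size() == map.size());
  std::ostringstream out;
  for (std::size_t i = 0; i < map.size(); ++i) {
    out << map.NameAt(i) << ": " << counters[i] << " cycles\n";
  }
  return out.str();
}
```

A client would size the buffer from the map (`std::vector<std::uint64_t> counters(map.size(), 0)`), let the compiled function increment `counters[map.IndexOf(...)]`, and then hand the same buffer to `PrintProfile`. Because both sides share one map, the layout cannot drift between the generated code and the printer.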