Commit db2bb829 authored by Justin Lebar, committed by TensorFlower Gardener

[XLA:GPU] Cleanups to fused 021 transpose implementation.

- Fix typos.
- Clarify comments.
- Reduce nesting in a few places.
- Add asserts that this code is dealing with specifically a loop fusion.
- Rename some functions.  In particular, it's confusing to have a
  function with a generic name like EmitCodeWithBoundCheck that actually
  is specialized to a tiled implementation.
- Remove statement expression (GCC language extension), replacing it
  with an IIFE.
- Don't refer to shared-memory tile space as "buffer" without other
  qualifying words, since that's ambiguous with what XLA refers to as a
  "buffer".
- Use llvm::cast instead of static_cast.
- Comply with style guide naming rules for compile-time constants
  (kFoo).
- Use c_accumulate instead of std::accumulate.
- Put std::function parameter at the end of the param list.  This lets
  us cleanly embed the lambda into the call because of how clang-format
  formats such calls.  (I think this one is possibly the most helpful
  change in this patch, as it suddenly makes clear to me the way that we
  use two calls to emit_tiled_elemental_code_with_bounds_check to emit
  the code.)
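
As a rough illustration of two of the items above (replacing a GCC
statement expression with an IIFE, and putting the std::function
parameter last), here is a minimal standalone sketch. The names
ForEachInTile, SumTile and ClampedTileSize are hypothetical and do not
appear in the XLA sources; the patch's real helper emits LLVM IR and is
not reproduced here.

```cpp
#include <functional>

// Hypothetical helper (not the XLA function). The callback is the last
// parameter, so a lambda can be written inline at the call site and
// clang-format lays the call out cleanly.
void ForEachInTile(int tile_size, int limit,
                   const std::function<void(int)>& emit_element) {
  for (int i = 0; i < tile_size; ++i) {
    // Bounds check: skip indices past the limit, as for a partial tile.
    if (i < limit) emit_element(i);
  }
}

// The lambda is embedded directly in the call because it is the
// trailing argument.
int SumTile(int tile_size, int limit) {
  int sum = 0;
  ForEachInTile(tile_size, limit, [&](int i) {
    sum += i;
  });
  return sum;
}

int ClampedTileSize(int requested) {
  // IIFE (immediately invoked lambda): standard C++ that initializes a
  // const from multi-statement logic, where a GCC statement expression
  // `({ ... })` might otherwise have been used.
  const int tile_size = [&] {
    if (requested < 1) return 1;
    if (requested > 32) return 32;
    return requested;
  }();
  return tile_size;
}
```

With the callback last, a second call that passes a different limit
reads almost identically to the first, which is the readability gain
described in the final item of the list.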

PiperOrigin-RevId: 204134102
parent 50d121e2