Commit d8af56e3 authored by A. Unique TensorFlower's avatar A. Unique TensorFlower Committed by TensorFlower Gardener
Browse files

Fix data race in tensorflow/core/kernels/map_stage_op.cc, and use tf...

Fix data race in tensorflow/core/kernels/map_stage_op.cc, and use tf locking/formatting  conventions.

- Fix data race in which a routine mysteriously releases a lock which was held when
  it was called, and which the caller expects will still be held afterwards.(!)
  See the changes to the routines notify_inserters_if_bounded() and notify_removers().

  The data race was in put() on current_bytes_ near the end, after the call to
  put_complete().  put_complete() calls notify_removers(), which released the
  lock, thus causing the race.  Any reader of the code would have expected that
  the lock would have been held at that point, but it wasn't.

  The same pattern in notify_inserters_if_bounded() didn't cause a race in the
  current code, but is a pitfall for future maintainers, who would not mornally
  expect something called notify_inserters_if_bounded() to mean "and in some
  circumstances, unlock".

  Even in the absense of this bug, and the weird, uncommented spec for those
  routines, it is almost always a bad idea to release the lock before calling a
  condition-variable wakeup.  First, it complicates the code, and makes
  it hard to reason about object deletion if the deletion is predicated by one
  of the signalled conditions.  Second, it's an _unnecessary_ complication:  a
  well-written condition variable implementation will requeue the woken
  thread(s) on the mutex queue, so the expensive part of the wakeup will be
  deferred until the waking thread unlocks anyway.

- Avoid the C++11 unique_lock mechanism that allowed the bug to happen,
  and instead use tensorflow::mutex, tensorflow::mutex_lock, and
  tensorflow::condition_variable.

- Use while-loops instead of lambdas for condition-variable waits, because
  while-loops are much easier to read.

- Fix the routines would_exceed_memory_limit() and is_capacity_full() to
  include their own validity check, so it's not needed at each of their call
  sites.  This improves the readability at the call sites.

- Use annotalysis locking annotations so that other violations can be found
  statically by the compiler.

- Reformat with clang-format.

PiperOrigin-RevId: 175913085
parent 2800d6e9
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment