NaN propagation for GPU pooling ops (#12504)
- Enable custom fwd maxpooling kernel to propagate NaNs. This makes it match the behaviour of CUDNN, and ensures that CUDNN's bwd maxpooling kernel behaves as expected (propagating NaNs).
- Previous behavior remains default. To enable NaN propagation, set TF_ENABLE_MAXPOOL_NANPROP=1.
- GPU bwd maxpool op tests cover both propagated and not-propagated NaNs.

Performance is unaffected by change. On P100 GPU:
- //tensorflow/core/ops_nn_ops_test is 12ms before and after
- //tensorflow/python/kernel_tests:pooling_ops_test takes 84.0 sec before vs. 83.8 sec after change (delta is in the noise).
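
For illustration only, the difference between the two modes can be sketched in plain Python. This is not the kernel code from this change: the function name max_pool_window and its propagate_nans parameter are hypothetical, and the sketch assumes the non-propagating path comes from a plain `>` comparison, which is always false when either operand is NaN.

```python
import math


def max_pool_window(values, propagate_nans=False):
    """Return the max of one pooling window (hypothetical helper).

    With propagate_nans=False, a NaN never wins the `>` comparison and is
    silently skipped. With propagate_nans=True, a NaN replaces the running
    max and then sticks, because no later value compares greater than NaN.
    """
    best = -math.inf
    for v in values:
        if v > best or (propagate_nans and math.isnan(v)):
            best = v
    return best


window = [1.0, float("nan"), 3.0]
print(max_pool_window(window))                       # 3.0
print(max_pool_window(window, propagate_nans=True))  # nan
```

Under these assumptions, the default mode hides the NaN from the forward output, while the propagating mode surfaces it so a backward pass that propagates NaNs sees a consistent result.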