Commit fe589d9e authored by Luke Iwanski's avatar Luke Iwanski Committed by Benoit Steiner
Browse files

[OpenCL] Implementation improvements (#9117)

* OpenCL Improvements

* Registers Scatter and ScatterNd Ops for SYCL

* Registers Stack op for SYCL

* Fixes No sycl buffer found error for debug ops

* Registers MatMul and Transpose Ops to SYCL device for double

* Extends analyzer_cli_test.py test to cover SYCL

* Fixes Transpose Op for double when on SYCL

* Bumps Eigen version to fix double precision issue on SYCL

* Extends SessionDebugTestBase to cover SYCL

* Register SYCL implementations for random ops

* Avoid functions that might not be defined on SYCL device (#51)

* Avoid functions that might not be defined on SYCL device

* Simplify by using Eigen math functions

* OpenCL improvements

 - Bumps Eigen Version
 - Refactors Ops registration
 - Introduces workaround for Const Op related to the difference between
   CUDA which uses pointers and OpenCL that uses buffers/accessors
 - Extends memory types to cover DEVICE_SYCL as well
 - Introduces  GetSYCLDevice() method that returns list of supported devices
   with GPU device having the highest priority ( doesn't include blacklisted devices )
 - ::internal::Transpose -> tensorflow::internal::Transpose in order to
   avoid compilation reported error
 - re-introduces fix for bugged string replacement causing a lot of compilation
   warnings -c -> --include
 - Adds sycl_runtime to bazels ARRAY_DEPS
 - Replicates TF_CALL_GPU_PROXY_TYPES for SYCL

* [OpenCL] Fixes an issue caused by switch to aligned allocator for sycl buffer (#53)

* [Build] Use gcc/g++ as a host compiler to avoid https://github.com/tensorflow/tensorflow/issues/8394 (#54)

* [OpenCL] Fixes Scatter Op

* Fix testSimple and testConst in stack_op_test (#3)

* Fix testSimple and testConst in stack_op_test

* Create a specialisation of DoParallelConcatUpdate for SyclDevice and
register it

* Guard all code in TENSORFLOW_USE_SYCL

* Do not use sycl device for int32

* Registration of the Sycl version is now looking like the one for the GPU

* Remove added empty line

* Register batch normalization kernels for OpenCL (#61)

* [OpenCL] RandomGamma has no GPU friendly implementation (#57)

* [OpenCL] Compatibility fixes for TensorFlow 1.1.0-rc1

* [OpenCL] Implements BatchMatmul Op for SYCL

* Lowercase the device name when GPU or SYCL returned

* [OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device

* [Eigen] Version bump

* GPU device name string manipulation is not needed anymore

* [OpenCL] Adds SYCL to device backwards compatibility

* [OpenCL] Extends core_rnn_test.py to run for SYCL device

* [OpenCL] Minor optimizations for build script

* [OpenCL] Enables skip folder list in build script

* [OpenCL] Fixes ApplyAdamOp for Sycl device

* [OpenCL] SYCL device improvements

* [OpenCL] Fixes debug_ops's SEGFAULT for SYCL device

* [Build] Adds hexagon to skipped folders list

* [OpenCL] Removes EnterLameDuckMode from SYCL device and allocator

* [OpenCL] Registers Unique Op for SYCL device

* [OpenCL][Temporary] Disables tests for SYCL target due to features not being implemented yet

  Tests affected:
    - tensorflow/contrib/memory_stats/python/kernel_tests/memory_stats_ops_test.py
    - tensorflow/contrib/rnn/python/kernel_tests/core_rnn_test.py
    - tensorflow/python/kernel_tests/conv_ops_test.py
    - tensorflow/python/kernel_tests/depthwise_conv_op_test.py
    - tensorflow/python/kernel_tests/pooling_ops_3d_test.py
    - tensorflow/python/kernel_tests/pooling_ops_test.py
    - tensorflow/python/kernel_tests/scatter_nd_ops_test.py
    - tensorflow/python/training/adam_test.py
    - tensorflow/python/training/localhost_cluster_performance_test.py
    - tensorflow/python/training/training_ops_test.py

* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline

  Tests affected:
    - tensorflow/python/debug/cli/analyzer_cli_test.py
    - tensorflow/python/debug/lib/session_debug_testlib.py
    - tensorflow/python/debug/lib/stepper_test.py
    - tensorflow/python/kernel_tests/unstack_op_test.py
    - tensorflow/python/ops/image_ops_test.py

* [OpenCL] Take options.config.device_count() into consideration

* [OpenCL] Fixes compilation warning

* [OpenCL] device:SYCL:0 -> sycl:0

* [OpenCL] Removes unwanted flags in building script

Removes flags given to computecpp that enable SIMD instructions
Removes duplicate flags

* bool -> const bool

* [OpenCL] sycl in test_util.gpu_device_name() -> is_sycl_enabled()

* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline

  Test affected:
    - tensorflow/contrib/stateless/python/kernel_tests/stateless_random_ops_test.py

* Imports test_util from tensorflow.python.framework

* [OpenCL] Fixes formatting in Python code

* [OpenCL] Extends session_test.py to cover SYCL device

* [OpenCL] Cleans singleton class

* [OpenCL] Keeping CUDA happy

* [OpenCL][Temporary] Disables failing tests for SYCL in order to establish regression baseline

  Test affected:
   - tensorflow/contrib/rnn/python/kernel_tests/core_rnn_cell_test.py
   - tensorflow/contrib/seq2seq/python/kernel_tests/beam_search_ops_test.py

* Added support for building with SYCL on ARM.

* Acts on the review feedback from:
 - https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113608975
 - https://github.com/tensorflow/tensorflow/pull/9117#discussion_r113609173

* [OpenCL] Fixes scatter_nd_op_test

* Fixes auto-merge mistake

* [OpenCL] struct SyclDevice -> class SyclDevice

* Revert "[OpenCL] struct SyclDevice -> class SyclDevice"

This reverts commit addd43348c374a5379f67bb1e5ad084715722fc2.

* [OpenCL] Reverting refactoring commit.

  As requested in the review https://github.com/tensorflow/tensorflow/pull/9117#issuecomment-298454466
  This change set will be re-introduced in smaller chunks.

* Revert "[OpenCL] device:SYCL:0 -> sycl:0"

This reverts commit cf16e60340b62d16c3764d71b716fe03d35f87a9.

* Revert "[OpenCL] Adds SYCL to device backwards compatibility"

This reverts commit b8401b5164199b7a169be1c1d8dea5001195c390.

* Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115036905

* control_flow_ops_py_test.py expects device name to be lower cased

* Acts on the feedback from https://github.com/tensorflow/tensorflow/pull/9117#discussion_r115037222

* Removes debug print

* Removes not needed partial specialisation

* [OpenCL] Registers ScatterNdFunctor for SYCL device

* [OpenCL] Make it compile

* [OpenCL] Follow gpu_device changes

* [OpenCL] Adds cxx_builtin_include_directory for python lib

  Fixes bazels missing undeclared inclusions that appeared after
  merge with TensorFlow upstream

* [OpenCL] Fixes Constant Op

* [OpenCL] gXX-4.8 -> gXX

* [OpenCL] Removes -D_GLIBCXX_USE_CXX11_ABI=0 as it breaks default compiler setup for Ubuntu 16.04

* Revert "[OpenCL] kernel_estimator_test.py assertEqual-> assertAlmostEqual due to floating point representation on the device"

This reverts commit 06c50c0a485f40c30a436f02c3fa7794e370c49d.

* [OpenCL] CPU allocator is a singleton we should not delete it
parent a365082c
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment