Use a singleton threadpool in SingleThreadedCpuDevice instead of creating one for each graph.
We currently create a new ThreadPool with a single thread each time we create a new TF_Graph, perform constant folding, perform shape inference in C++ via a ShapeRefiner, import a GraphDef, or restore an Iterator from a checkpoint. The pool exists only in case we try to run an Eigen kernel that uses intra-op parallelism. Since we never attempt to parallelize when the pool has only one thread, that thread sits idle for its entire short existence.

This change turns the ThreadPool used in SingleThreadedCpuDevice into a global singleton. The cost is that we keep one idle thread around for the lifetime of the process, but we no longer pay for thread creation and destruction each time. Since we previously created a SingleThreadedCpuDevice (and hence a thread) at least once per graph function, this seems like a reasonable tradeoff.

PiperOrigin-RevId: 236739404
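
For illustration only, and not the code in this change: a minimal sketch of the process-wide singleton pattern described above, written against the C++ standard library rather than TensorFlow's ThreadPool. The names OneThreadPool and GlobalPool are hypothetical. The pool is created on first use and intentionally never destroyed, which matches the stated cost of one idle thread for the lifetime of the process.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a one-thread pool; not TensorFlow's ThreadPool.
class OneThreadPool {
 public:
  OneThreadPool() : worker_([this] { Run(); }) {}

  ~OneThreadPool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Queues fn to run on the single worker thread.
  void Schedule(std::function<void()> fn) {
    assert(fn != nullptr);  // reject empty callables
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push(std::move(fn));
    }
    cv_.notify_one();
  }

 private:
  // Worker loop: sleeps until a task arrives or shutdown is requested.
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // only reached when done_ is set
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool done_ = false;
  std::thread worker_;  // declared last so the members above exist before it starts
};

// Returns the shared pool. The function-local static is initialized exactly
// once (thread-safe since C++11) and is deliberately leaked, so no thread is
// created or joined per caller.
OneThreadPool* GlobalPool() {
  static OneThreadPool* pool = new OneThreadPool;
  return pool;
}
```

In this sketch every caller that previously would have constructed its own pool calls GlobalPool() instead, so thread creation happens once per process rather than once per graph.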