Support for CUDA 9.0
Add explicit __syncwarp to bias_op
 - Makes warp-synchronous code safe on Volta
Add sync mask to __shfl intrinsics
Add libdevice bytecode paths for CUDA 9
 - In CUDA 9, all supported architectures are merged into a single file
Update code gating for CUDA 9
Add sm_70 to the lookup table used by XLA
Change the default sm arch from 20 to 30
Fix for NVPTX not yet supporting sm_70
Remove unnecessary cuda decorators from defaulted constructors
Use updated NCCL for CUDA 9 fp16 support
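
For readers unfamiliar with the first two items, here is a minimal sketch of the pattern they describe. It is not the code from this change. The names `ShuffleDown`, `SyncWarp`, `kFullWarpMask`, `WarpSumShuffle`, `WarpSumShared`, and `RunWarpSums` are hypothetical and chosen only for illustration. It assumes a single block of exactly 32 threads, and the pre-CUDA 9 fallback for `SyncWarp` (a block-level memory fence that relies on lockstep execution) is also an assumption.

```cuda
#include <cassert>
#include <cuda.h>          // defines CUDA_VERSION
#include <cuda_runtime.h>

constexpr int kWarpSize = 32;
constexpr unsigned kFullWarpMask = 0xffffffffu;  // all 32 lanes participate

// Gate on the toolkit version: CUDA 9 adds *_sync shuffles that take a lane mask.
__device__ inline float ShuffleDown(float v, int delta) {
#if CUDA_VERSION >= 9000
  return __shfl_down_sync(kFullWarpMask, v, delta);
#else
  return __shfl_down(v, delta);
#endif
}

// Gate on the toolkit version: __syncwarp exists from CUDA 9 onward.
__device__ inline void SyncWarp() {
#if CUDA_VERSION >= 9000
  __syncwarp(kFullWarpMask);
#else
  __threadfence_block();  // assumption: lanes run in lockstep before Volta
#endif
}

// Warp sum through registers, using the masked shuffle.
__global__ void WarpSumShuffle(const float* in, float* out) {
  float v = in[threadIdx.x];
  for (int delta = kWarpSize / 2; delta > 0; delta /= 2) {
    v += ShuffleDown(v, delta);
  }
  if (threadIdx.x == 0) *out = v;
}

// Warp sum through shared memory, with an explicit warp barrier between
// every read phase and write phase instead of relying on implicit lockstep.
__global__ void WarpSumShared(const float* in, float* out) {
  __shared__ float s[kWarpSize];
  const int lane = threadIdx.x;
  s[lane] = in[lane];
  SyncWarp();
  for (int delta = kWarpSize / 2; delta > 0; delta /= 2) {
    const float other = (lane < delta) ? s[lane + delta] : 0.0f;
    SyncWarp();  // all reads finish before any lane overwrites its slot
    if (lane < delta) s[lane] += other;
    SyncWarp();  // all writes are visible before the next round of reads
  }
  if (lane == 0) *out = s[0];
}

// Host helper: sums the values 1..32 with both kernels.
// Returns false if any CUDA runtime call fails.
bool RunWarpSums(float* sum_shfl, float* sum_shared) {
  assert(sum_shfl != nullptr && sum_shared != nullptr);
  float host_in[kWarpSize];
  for (int i = 0; i < kWarpSize; ++i) host_in[i] = static_cast<float>(i + 1);

  float* dev_in = nullptr;
  float* dev_out = nullptr;
  if (cudaMalloc(&dev_in, sizeof(host_in)) != cudaSuccess) return false;
  if (cudaMalloc(&dev_out, 2 * sizeof(float)) != cudaSuccess) {
    cudaFree(dev_in);
    return false;
  }
  cudaMemcpy(dev_in, host_in, sizeof(host_in), cudaMemcpyHostToDevice);

  WarpSumShuffle<<<1, kWarpSize>>>(dev_in, dev_out);
  WarpSumShared<<<1, kWarpSize>>>(dev_in, dev_out + 1);

  // The blocking copy also surfaces errors from the two launches above.
  float host_out[2] = {0.0f, 0.0f};
  const cudaError_t err =
      cudaMemcpy(host_out, dev_out, sizeof(host_out), cudaMemcpyDeviceToHost);
  cudaFree(dev_in);
  cudaFree(dev_out);
  if (err != cudaSuccess) return false;

  *sum_shfl = host_out[0];
  *sum_shared = host_out[1];
  return true;
}
```

Calling `RunWarpSums(&a, &b)` on a machine with a CUDA device should leave the sum of 1 through 32 in both outputs. The explicit barrier matters on Volta because threads in a warp are scheduled independently there, so code that silently assumed lockstep execution can read shared memory before another lane has written it.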