Use safe variance epsilon for float16 layer_norm
- The original epsilon value (1e-12) is too small for float16 and can cause NaNs in the output when the variance is small.
- This commit also adds float16 and float32 cases to LayerNormTest and improves the numerical robustness of the test logic.
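
The failure mode can be reproduced without any framework. The following is a minimal sketch, not the code touched by this commit: `to_float16` and `layer_norm_f16` are hypothetical helpers that emulate half precision with the standard `struct` module, and the larger epsilon of 1e-3 is an assumed example value, since the message does not state the replacement. In float16, 1e-12 rounds to exactly 0, so a zero-variance input yields `0 * inf = NaN`.

```python
import math
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def layer_norm_f16(values, epsilon):
    """Normalize `values`, rounding each intermediate result to float16.

    Hypothetical helper for illustration only; it omits the learned
    scale and offset of a real layer normalization.
    """
    n = len(values)
    mean = to_float16(sum(values) / n)
    variance = to_float16(sum((v - mean) ** 2 for v in values) / n)
    denom = to_float16(variance + to_float16(epsilon))
    # Emulate IEEE 754 rsqrt: 1/sqrt(0) is +inf, not a Python exception.
    inv_std = math.inf if denom == 0.0 else to_float16(1.0 / math.sqrt(denom))
    return [to_float16((v - mean) * inv_std) for v in values]


constant_input = [0.5, 0.5, 0.5]  # variance is exactly zero

print(to_float16(1e-12))                      # epsilon after rounding to float16
print(layer_norm_f16(constant_input, 1e-12))  # original epsilon
print(layer_norm_f16(constant_input, 1e-3))   # assumed larger epsilon
```

With the original epsilon the denominator collapses to zero and every output is NaN; with an epsilon that is representable in float16 the same input normalizes to zeros.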