net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead
[ Upstream commit b66b76a8 ] CONFIG_INIT_STACK_ALL_ZERO introduces a performance cost by zero-initializing all stack variables on function entry. The mlx5 XDP RX path previously allocated a struct mlx5e_xdp_buff on the stack per received CQE, resulting in measurable performance degradation under this config. This patch reuses a mlx5e_xdp_buff stored in the mlx5e_rq struct, avoiding per-CQE stack allocations and repeated zeroing. With this change, XDP_DROP and XDP_TX performance matches that of kernels built without CONFIG_INIT_STACK_ALL_ZERO. Performance was measured on a ConnectX-6Dx using a single RX channel (1 CPU at 100% usage) at ~50 Mpps. The baseline results were taken from net-next-6.15. Stack zeroing disabled: - XDP_DROP: * baseline: 31.47 Mpps * baseline + per-RQ allocation: 32.31 Mpps (+2.68%) - XDP_TX: * baseline: 12.41 Mpps * baseline + per-RQ allocation: 12.95 Mpps (+4.30%) Stack zeroing enabled: - XDP_DROP: * baseline: 24.32 Mpps * baseline + per-RQ allocation: 32.27 Mpps (+32.7%) - XDP_TX: * baseline: 11.80 Mpps * baseline + per-RQ allocation: 12.24 Mpps (+3.72%) Reported-by:Sebastiano Miano <mianosebastiano@gmail.com> Reported-by:
Samuel Dobron <sdobron@redhat.com> Link: https://lore.kernel.org/all/CAMENy5pb8ea+piKLg5q5yRTMZacQqYWAoVLE1FE9WhQPq92E0g@mail.gmail.com/ Signed-off-by:
Carolina Jubran <cjubran@nvidia.com> Reviewed-by:
Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by:
Tariq Toukan <tariqt@nvidia.com> Acked-by:
Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/1747253032-663457-1-git-send-email-tariqt@nvidia.com Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Stable-dep-of: afd5ba57 ("net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ") Signed-off-by:
Sasha Levin <sashal@kernel.org>
Loading
Please sign in to comment