Improve aarch64 MonitorEntry/Exit assembly code

We make two kinds of changes:

1) We remove some redundant moves, which appear to have been copied
from some architecture with a two-address instruction format.

2) We avoid the use of dmb barrier instructions, and instead use
acquire/release instructions for the actual lock loads/updates.

(2) is a clear win on A53/A57, where there seems to be very little
additional cost associated with acquire/release when used with
"exclusive" memory operations, as they are here. On the cores used in
2016 Pixel phones, the story is more mixed. But adding acquire/release
to a pair of exclusive load/store operations still seems to cost
sufficiently less than two dmb instructions that, even if 10% of lock
acquisitions are nested and unnecessarily enforce ordering, we come out
slightly ahead. ARM's advice for the future is also to move in this
direction.

Test: AOSP boots. AOSP art test failures seem attributable to other issues.
Change-Id: I2399baeab3df93196471e65612c00d95ad4e2b62
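
For readers who want the idea of change (2) without the assembly, the difference can be sketched in portable C++. This is only an illustration under stated assumptions: ThinLock, its one-word layout (0 means unlocked, otherwise an owner id) and all method names are hypothetical and are not ART's actual lock word or entrypoints. With common compilers targeting aarch64 without LSE atomics, the fence variants typically become ldxr/stxr plus dmb ish, while the acquire/release variants typically become ldaxr/stxr and stlr.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical one-word lock: 0 = unlocked, otherwise the owner's id.
// Illustration only; this is not ART's lock word layout.
struct ThinLock {
  std::atomic<uint32_t> word{0};

  // Barrier style: relaxed exclusive access, then a separate full fence.
  bool TryLockWithFence(uint32_t tid) {
    uint32_t expected = 0;
    if (!word.compare_exchange_strong(expected, tid,
                                      std::memory_order_relaxed)) {
      return false;  // Already held; no ordering needed on failure.
    }
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return true;
  }

  void UnlockWithFence() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    word.store(0, std::memory_order_relaxed);
  }

  // Acquire/release style: the ordering is attached to the lock access itself.
  bool TryLockAcquire(uint32_t tid) {
    uint32_t expected = 0;
    return word.compare_exchange_strong(expected, tid,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }

  void UnlockRelease() {
    word.store(0, std::memory_order_release);
  }
};

// Single-threaded check that both styles agree on lock state.
inline void ExerciseLock() {
  ThinLock lock;
  assert(lock.TryLockAcquire(1));
  assert(!lock.TryLockWithFence(2));  // Held by 1, so this must fail.
  lock.UnlockRelease();
  assert(lock.TryLockWithFence(2));
  lock.UnlockWithFence();
}
```

Both styles give the critical section the ordering it needs; the acquire/release form simply avoids paying for two standalone barriers per lock/unlock pair, which is the trade-off measured above.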