FROMLIST: BACKPORT: THP zones: the use cases of policy zones

There are three types of zones:
1. The first four zones partition the physical address space of CPU
   memory.
2. The device zone provides interoperability between CPU and device
   memory.
3. The movable zone commonly represents a memory allocation policy.

Though originally designed for memory hot removal, the movable zone is
instead widely used for other purposes, e.g., CMA and kdump kernel, on
platforms that do not support hot removal, e.g., Android and ChromeOS.
Nowadays, it is legitimately a zone independent of any physical
characteristics. In spite of being somewhat regarded as a hack,
largely due to the lack of a generic design concept for its true major
use cases (on billions of client devices), the movable zone naturally
resembles a policy (virtual) zone overlaid on the first four
(physical) zones.

This proposal formally generalizes this concept as policy zones so
that additional policies can be implemented and enforced by subsequent
zones after the movable zone. An inherited requirement of policy zones
(and the first four zones) is that subsequent zones must be able to
fall back to previous zones and therefore must add new properties to
the previous zones rather than remove existing ones from them. Also,
all properties must be known at allocation time, rather than at
runtime, e.g., memory object size and mobility are valid properties
but hotness and lifetime are not.

ZONE_MOVABLE becomes the first policy zone, followed by two new policy
zones:
1. ZONE_NOSPLIT, which contains pages that are movable (inherited from
   ZONE_MOVABLE) and restricted to a minimum order to be
   anti-fragmentation. The latter means that they cannot be split down
   below that order, while they are free or in use.
2. ZONE_NOMERGE, which contains pages that are movable and restricted
   to an exact order. The latter means that not only is split
   prohibited (inherited from ZONE_NOSPLIT) but also merge (see the
   reason in Chapter Three), while they are free or in use.
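
The inheritance and fallback rules above can be illustrated with a
minimal userspace sketch. It is not the code in this series: every
sketch_* name and both order limits are hypothetical, and
SKETCH_ZONE_NORMAL stands in for all of the physical zones. The sketch
also assumes the exact order of the last zone is not below the minimum
order of the one before it, so that each request a zone accepts is
also accepted by every zone before it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified zone list, lowest first (hypothetical names). */
enum sketch_zone {
	SKETCH_ZONE_NORMAL,	/* stand-in for the physical zones */
	SKETCH_ZONE_MOVABLE,	/* first policy zone */
	SKETCH_ZONE_NOSPLIT,	/* movable + minimum order */
	SKETCH_ZONE_NOMERGE,	/* movable + exact order */
	SKETCH_NR_ZONES,
};

/* Only properties that are known at allocation time. */
struct sketch_request {
	unsigned int order;
	bool movable;		/* models __GFP_MOVABLE */
	bool compound;		/* models __GFP_COMP */
};

/* Placeholder limits, chosen only for illustration. */
#define SKETCH_NOSPLIT_MIN_ORDER	4u
#define SKETCH_NOMERGE_ORDER		9u

/* Each zone adds a property on top of the zone before it. */
static bool sketch_zone_accepts(enum sketch_zone zone,
				const struct sketch_request *req)
{
	bool thp = req->movable && req->compound;

	assert(zone < SKETCH_NR_ZONES);

	switch (zone) {
	case SKETCH_ZONE_NORMAL:
		return true;
	case SKETCH_ZONE_MOVABLE:
		return req->movable;
	case SKETCH_ZONE_NOSPLIT:
		return thp && req->order >= SKETCH_NOSPLIT_MIN_ORDER;
	case SKETCH_ZONE_NOMERGE:
		return thp && req->order == SKETCH_NOMERGE_ORDER;
	default:
		return false;
	}
}

/*
 * Highest zone that accepts the request. Every lower zone remains a
 * valid fallback target because it requires fewer properties.
 */
static enum sketch_zone sketch_highest_zone(const struct sketch_request *req)
{
	int zone;

	for (zone = SKETCH_NR_ZONES - 1; zone > 0; zone--)
		if (sketch_zone_accepts((enum sketch_zone)zone, req))
			break;

	return (enum sketch_zone)zone;
}
```

With these placeholder limits, an order-9 THP request starts at
SKETCH_ZONE_NOMERGE, an order-5 THP request starts at
SKETCH_ZONE_NOSPLIT, and an order-0 request never starts above
SKETCH_ZONE_MOVABLE.
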

Since these two zones can only serve THP allocations (__GFP_MOVABLE |
__GFP_COMP), they are called THP zones. Reclaim works seamlessly and
compaction is not needed for these two zones.

Compared with the hugeTLB pool approach, THP zones tap into core MM
features including:
1. THP allocations can fall back to the lower zones, which can have
   higher latency but still succeed.
2. THPs can be either shattered (see Chapter Two) if partially
   unmapped or reclaimed if becoming cold.
3. THP orders can be much smaller than the PMD/PUD orders, e.g., 64KB
   contiguous PTEs on arm64 [1], which are more suitable for client
   workloads.

Policy zones can be dynamically resized by offlining pages in one of
them and onlining those pages in another of them. Note that this is
only done among policy zones, not between a policy zone and a physical
zone, since resizing is a (software) policy, not a physical
characteristic.

Implementing the same idea in the pageblock granularity has also been
explored but rejected at Google. Pageblocks have a finer granularity
and therefore can be more flexible than zones. The tradeoff is that
this alternative implementation was more complex and failed to bring a
better ROI. However, the rejection was mainly due to its inability to
be smoothly extended to 1GB THPs [2], which is a planned use case of
TAO.

[1] https://lore.kernel.org/20240215103205.2607016-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/20200928175428.4110504-1-zi.yan@sent.com/

Change-Id: I7eb555541d04b16b93dea5aa0e2b329c49694a10
Signed-off-by: Yu Zhao <yuzhao@google.com>
Link: https://lore.kernel.org/r/20240229183436.4110845-2-yuzhao@google.com/
Bug: 313807618
[ Don't allocate order 0 from nomerge/nosplit zone - causes increase
  in reclaim activity ]
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>