A.8 Memory Management

  • Add a memory.vmscan_stat file to the memory control group that reports the numbers of scanned, rotated, and freed pages, and the elapsed times for direct reclaim and soft reclaim. (3.1)

  • Extend the memory hotplug API to allow memory hotplug in virtual machines. Also required for the Xen balloon driver. (3.1)

  • Fix significant stalls in the page allocator when copying large amounts of data on NUMA machines. (3.1)

  • Add a slub_debug option to the slub slab allocator to detect memory that is never freed and to help diagnose memory usage. (3.1)

  • Reduce CPU overhead of slub_debug. (3.1)
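
    The two entries above concern the slub_debug boot parameter. As a hedged illustration, the flag letters below follow the kernel's SLUB documentation (F sanity checks, Z red zoning, P poisoning, U user tracking); the cache name is an example:

    ```
    # Kernel command line: enable full SLUB debugging for all caches
    slub_debug=FZPU

    # Or restrict user tracking to one cache to limit CPU overhead
    slub_debug=U,kmalloc-512
    ```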

  • The cross memory attach feature adds the system calls process_vm_readv() and process_vm_writev(), which allow data to be transferred directly between the address spaces of two processes without passing through kernel space. (3.2)
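
    A minimal sketch of the new interface; for demonstration it copies within a single process (both iovecs refer to our own pid), since reading another process would require ptrace permissions. glibc exposes the wrappers in <sys/uio.h> when _GNU_SOURCE is defined:

    ```c
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void) {
        char src[] = "hello from the source buffer";
        char dst[sizeof src];

        /* local iovec: where the data lands in *our* address space */
        struct iovec local  = { .iov_base = dst, .iov_len = sizeof src };
        /* remote iovec: where the data lives in the target process */
        struct iovec remote = { .iov_base = src, .iov_len = sizeof src };

        /* Copy sizeof src bytes from the "remote" process (here: ourselves)
           without bouncing the data through a kernel buffer. */
        ssize_t n = process_vm_readv(getpid(), &local, 1, &remote, 1, 0);
        if (n < 0) {
            perror("process_vm_readv");
            return 1;
        }
        printf("%zd bytes: %s\n", n, dst);
        return 0;
    }
    ```

    process_vm_writev() takes the same arguments and copies in the opposite direction, from the local iovecs into the remote process.
    
    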

  • Add a block plug for page reclaim to vmscan that reduces CPU overhead by reducing lock contention and merging requests. (3.2)

  • Implement per-CPU cache in slub for partial pages. (3.2)

  • Restrict access to slab files under procfs and sysfs, hiding slabinfo and /sys/kernel/slab/*. (3.2)

  • Add the slab_max_order kernel parameter, which determines the maximum allowed order for slabs. High settings can cause OOMs due to memory fragmentation. The default value is 1 on systems with more than 32 MB of RAM and 0 otherwise. (3.3)
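
    On GRUB-based systems the parameter can be pinned on the kernel command line; a configuration sketch (the file path is the usual Debian/Fedora location, and any other options on the line are elided):

    ```
    # /etc/default/grub — restrict slabs to order-0 (single) pages to
    # avoid high-order allocation failures on fragmented systems
    GRUB_CMDLINE_LINUX="slab_max_order=0"
    ```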

  • To increase the probability of detecting memory corruption, change the buddy allocator to retain more free, protected pages and to interleave free, protected pages with allocated pages. (3.3)

  • Charge the pages dirtied by an exited process to random dirtying tasks. (3.3)

  • Allow the poll time and call intervals for dirty-page balancing to be controlled by the max_pause parameter. (3.3)

  • Fix dirtied pages accounting on sub-page writes. (3.3)

  • Introduce the dirty rate limit to compensate for a task's think time when computing the final pause time. (3.3)

  • Reduce dirty throttling polls and CPU overhead. (3.3)

  • Avoid tiny dirty poll intervals. (3.3)

  • Make swap-in read-ahead skip over holes, allowing the system to swap back in at several MB/s, instead of a few hundred kB/s. (3.4)

  • Introduce bit-optimized iterator and radix tree cleanup in the core page cache. (3.4)

  • Improve allocation of contiguous memory chunks by adding DMA mapping helper functions. (3.5)

  • Remove swap token code and lumpy reclaim. (3.5)

  • Improve throughput and reduce CPU overhead by allowing swap read-ahead to be merged. (3.6)

  • Add a cgroup controller that allows HugeTLB usage to be limited per control group and enforces the limit during page faults. (3.6)
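
    A configuration sketch of how the controller is typically used; the mount point, the group name, and the 2 MB huge-page size (the x86 default) are assumptions for illustration:

    ```
    # Mount the hugetlb controller, create a group, and cap its
    # 2 MB huge-page usage; faults beyond the limit will fail
    mount -t cgroup -o hugetlb none /sys/fs/cgroup/hugetlb
    mkdir /sys/fs/cgroup/hugetlb/webapp
    echo 512M > /sys/fs/cgroup/hugetlb/webapp/hugetlb.2MB.limit_in_bytes
    ```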