The page table is where Linux stores virtual-to-physical page address translations, and its size can get huge when memory usage is high. One way to reduce the size of page tables, and to reduce the number of page faults, is to use huge pages. I have been digging into hugepages out of my own curiosity, and it looks like Linux has pretty good support for them. This blog post serves as a quick note on my readings.
Hugepages
The sysfs directory /sys/kernel/mm/hugepages/hugepages-{pagesize}kB/ contains the control files and information on hugepages, where pagesize can be 2048 or 1048576, corresponding to a hugepage size of 2MB or 1GB. To get information on hugepages on your Linux system, look inside that directory for the controlling files:
- nr_hugepages
- nr_hugepages_mempolicy
- nr_overcommit_hugepages
- free_hugepages
- resv_hugepages
- surplus_hugepages
You can also get hugepage-related information from /proc/meminfo:

```
HugePages_Total:    2048
```
The kernel documentation on hugetlbpage contains detailed information on the purpose and usage of the hugepage control files, as well as the meminfo fields.
Allocating Hugepages
The most convenient way to reserve hugepages on x86_64 Linux is to echo into the sysfs file /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages, e.g.:

```
# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```
And for userspace to access and use the huge pages, Linux provides a quite convenient interface: the hugetlbfs file system, e.g.:

```
mount -t hugetlbfs nodev /mnt/huge
```

This mounts a pseudo filesystem of type hugetlbfs on /mnt/huge, using the default huge page size specified by the system; all files created inside the directory are backed by huge pages.
After that, you can use hugepage-backed memory by creating files inside the /mnt/huge directory. See the example in the Linux source tree: hugepage-mmap.c. The author takes the following steps:

- Open a file inside /mnt with read-write permission.
- Map memory to the file using mmap, with proper protection and flags set (PROT_READ | PROT_WRITE and MAP_SHARED in this case).
- Use the hugepage-backed memory as usual.
- Clean up memory and file.
Enable Hugepage On Start
According to the Linux kernel documentation:

> System administrators may want to put this command in one of the local rc init files. This will enable the kernel to allocate huge pages early in the boot process when the possibility of getting physically contiguous pages is still very high.
And to quote the LWN article:

> If the huge page pool is statically allocated at boot-time, then this section […]
So to avoid external fragmentation and make sure that hugepage allocation always succeeds, we may want to reserve hugepage memory regions at boot time. This Debian Wiki page provides a way to do that. To reserve a number of hugepages, add the following line to /etc/sysctl.conf:

```
vm.nr_hugepages = 1024
```
And to mount the hugetlbfs automatically on system start, simply add to /etc/fstab:

```
hugetlbfs /hugepages hugetlbfs mode=1770,gid=2021 0 0
```
Advanced Topics
There are other advanced topics, which will probably not be covered in the scope of this quick note:

- Transparent Huge Pages: the system automatically decides if memory should be backed by hugepages, which makes using hugepage-backed memory much easier.
- libhugetlbfs APIs: libhugetlbfs provides programmer APIs to manage and access hugepage memory as well. The hugepage library utility hugectl can overload the standard shmget() function to allow huge pages to be used when allocating shared memory.
- Text and data: hugectl has options to run an application with its text and data sections mapped by hugepages, which gives potential performance benefits.

These are ideas worth digging into in the future, for applications where hugepages can potentially give a good performance boost.