Efficient way to get files off the page cache
There's this great feature in modern operating systems called the page cache. Simply put, it keeps in memory what is normally stored on disk, and it helps both read and write performance. While that's all nice for day-to-day use, it often gets in the way when you want to track down performance issues with a "cold cache" (when the files you need to access are not in the page cache yet).
A commonly used command to flush the Linux page cache is the following:
# echo 3 > /proc/sys/vm/drop_caches
Unfortunately, its effect is broad, and it flushes the whole page cache. When working on cold startup performance of an application like Firefox, what you really want is to have your page cache in a state close to what it was before you started the application.
One way to get in a better position than flushing the entire page cache is to reboot: the page cache will be filled with system and desktop environment libraries during the boot process, making the application startup closer to what you want. But it takes time. A whole lot of it.
In one of my "what if" moments, I wondered what happens to the page cache when using posix_fadvise with the POSIX_FADV_DONTNEED hint. Guess what? It actually reliably flushes the page cache for the given range in the given file. At least it does so with Debian Squeeze's Linux kernel. Provided you have a list of files your application loads that aren't already in the page cache, you can flush these files and only these from the page cache.
The following source code compiles to a tool to which you give a list of files as arguments, and that flushes these files:
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int i, fd;
    for (i = 1; i < argc; i++) {
        if ((fd = open(argv[i], O_RDONLY)) != -1) {
            struct stat st;
            /* Hint that the whole file, [0, st_size), is no longer needed.
               Skip the hint if fstat fails, since st would be uninitialized. */
            if (fstat(fd, &st) == 0)
                posix_fadvise(fd, 0, st.st_size, POSIX_FADV_DONTNEED);
            close(fd);
        }
    }
    return 0;
}
It's actually a pretty scary feature, especially on multi-user environments, because any user can flush any file she can open, repeatedly, possibly hitting system performance. By the way, on systems using lxc (and maybe other containers, I don't know), running the echo 3 > /proc/sys/vm/drop_caches command from a root shell in a container does flush the host page cache, which could be dangerous for VPS hosting services using these solutions.
Update: I have to revise my judgment: it appears posix_fadvise(,,,POSIX_FADV_DONTNEED) doesn't flush (parts of) files that are still in use by other processes, which still makes it very useful for my use case, but also makes it less dangerous than I thought. The drop_caches problem is still real with lxc, though.
2010-12-29 19:34:37+0900