Efficient way to get files off the page cache
There's this great feature in modern operating systems called the page cache. Simply put, it keeps in memory what is normally stored on disk, and it helps both read and write performance. While that's all nice for day-to-day use, it often gets in the way when one wants to track down performance issues with a "cold cache" (when the files you need to access are not in the page cache yet).
A commonly used command to flush the Linux page cache is the following:
# echo 3 > /proc/sys/vm/drop_caches
Unfortunately, its effect is broad, and it flushes the whole page cache. When working on cold startup performance of an application like Firefox, what you really want is to have your page cache in a state close to what it was before you started the application.
One way to get in a better position than flushing the entire page cache is to reboot: the page cache will be filled with system and desktop environment libraries during the boot process, making the application startup closer to what you want. But it takes time. A whole lot of it.
In one of my "what if" moments, I wondered what happens to the page cache when using posix_fadvise with the POSIX_FADV_DONTNEED hint. Guess what? It actually reliably flushes the page cache for the given range in the given file. At least it does so with Debian Squeeze's Linux kernel. Provided you have a list of files your application loads that aren't already in the page cache, you can flush these files and only these from the page cache.
The following source code compiles to a tool to which you give a list of files as arguments, and that flushes these files:
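#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

int main(int argc, char *argv[])
{
    int i, fd;
    for (i = 1; i < argc; i++) {
        /* Files that cannot be opened are silently skipped. */
        if ((fd = open(argv[i], O_RDONLY)) != -1) {
            struct stat st;
            /* Only issue the hint if the file size is known. */
            if (fstat(fd, &st) == 0) {
                /* Ask the kernel to drop the cached pages of the whole file. */
                posix_fadvise(fd, 0, st.st_size, POSIX_FADV_DONTNEED);
            }
            close(fd);
        }
    }
    return 0;
}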
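A slightly more defensive variant of the same idea can be sketched as a helper function. This is an illustration, not the tool above: the name drop_file_cache is made up here, it calls fsync first on the assumption that the file may have dirty pages (which POSIX_FADV_DONTNEED leaves alone on Linux), and it passes a length of 0, which posix_fadvise treats as "up to the end of the file".

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_fadvise on strict compilers */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to drop the cached pages of one file.
 * Returns 0 on success, or an errno-style error number on failure. */
int drop_file_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;

    /* Dirty pages are not discarded by the hint, so try to write them back
     * first. Assumption: fsync on a read-only descriptor works, as it does
     * on Linux. A failure here is ignored; the hint is still worth issuing. */
    (void)fsync(fd);

    /* Offset 0 with length 0 covers the whole file. posix_fadvise returns
     * the error number directly instead of setting errno. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return err;
}
```

Calling drop_file_cache(argv[i]) for each argument gives the same command line behaviour as the tool above, with the return value available for error reporting.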
It's actually a pretty scary feature, especially in multi-user environments, because any user can flush any file she can open, repeatedly, possibly hurting system performance. By the way, on systems using lxc (and maybe other containers, I don't know), running the echo 3 > /proc/sys/vm/drop_caches command from a root shell in a container does flush the host page cache, which could be dangerous for VPS hosting services using these solutions.
Update: I have to revise my judgment. It appears posix_fadvise(,,,POSIX_FADV_DONTNEED) doesn't flush (parts of) files that are still in use by other processes, which still makes it very useful for my use case, but also makes it less dangerous than I thought. The drop_caches problem is still real with lxc, though.
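One way to check whether a flush had any effect is to count how many pages of a file are resident before and after. The sketch below uses mmap and mincore for that. The helper name resident_pages is made up for illustration, mincore is a Linux/BSD call rather than POSIX, and recent Linux kernels may report every page as resident for files the caller could not open for writing, so treat the numbers as a hint rather than ground truth.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* mincore() is not part of POSIX */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count the pages of a file that are resident in
 * memory. Returns the count, or -1 on error. If total is non-NULL, it
 * receives the number of pages the file spans. */
long resident_pages(const char *path, long *total)
{
    long result = -1;
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + page - 1) / page;
    if (total)
        *total = (long)npages;
    if (npages == 0) {          /* empty file: mmap would fail with EINVAL */
        close(fd);
        return 0;
    }

    /* Mapping the file does not read it; pages are only faulted in on access. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map != MAP_FAILED) {
        unsigned char *vec = malloc(npages);
        if (vec && mincore(map, len, vec) == 0) {
            result = 0;
            for (size_t i = 0; i < npages; i++)
                result += vec[i] & 1;   /* low bit set means resident */
        }
        free(vec);
        munmap(map, len);
    }
    close(fd);
    return result;
}
```

Comparing the value returned before and after running the flushing tool on the same file shows whether its pages actually left the cache.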
2010-12-29 19:34:37+0900
2010-12-29 21:10:16+0900
The cache drop thing does not work with OpenVZ.
2010-12-30 00:58:28+0900
That sounds like a security issue (DoS), please get a CVE.
2010-12-30 05:05:47+0900
I’ve always wondered why you can’t just copy the application files and start from the freshly copied files to get something like a cold start. Is there a problem with that kind of approach?
(NB: I have no idea how the file caches work — maybe it’s hash-based or something? Changing the hash wouldn’t be hard though. And I assume file dependencies outside the app would still be cached, but I don’t think that’s a big deal in Firefox’s case, is it?)
2010-12-30 07:39:38+0900
Interesting article. How do you tell what is or isn’t in the page cache?
2010-12-30 08:24:40+0900
voracity: because when you copy the files, they get in the page cache.
Andrew Pollock: that, I’d love to know. AFAIK, there is no easy way to do so.
2010-12-30 10:23:47+0900
@glandium: I naively thought only reads caused blocks to get copied into the page cache (not writes), but I’m now the wiser. I’m surprised there’s no low-level disk write commands that avoid going through the cache.
Does the page cache survive unmounting/remounting? (Possibly with different mount params or mountpoints?)
2010-12-30 10:38:55+0900
> I’m surprised there’s no low-level disk write commands that avoid going through the cache.
There are. Using the O_DIRECT flag when opening a file, for example. Copying the files has a side effect, though: it changes the layout on disk, possibly making comparisons harder.
> Does the page cache survive unmounting/remounting?
Usually not, but how do you unmount/remount / or /usr without stopping your desktop environment?
2010-12-30 11:54:50+0900
[Typing source of HTML source sucks.]
Your article contains the following invalid HTML:
</pre>
<p></code></p></blockquote>
The </code> never makes it to Planet making the rest of the page monospace :-(
2010-12-30 12:21:49+0900
Neil: fixed
2010-12-31 03:59:14+0900
Why would you have to unmount/remount / or /usr? I was imagining creating a special partition just for Firefox in a portable format, and unmounting/remounting that. It sounds like you’ve got a pretty good method for cold starts on Linux anyway. I guess I was thinking something like this might work for Windows, where, if I remember correctly, cold starts were more difficult.
2010-12-31 09:43:31+0900
voracity: because there also are system files involved in Firefox startup.