{"id":219,"date":"2008-12-13T17:01:12","date_gmt":"2008-12-13T15:01:12","guid":{"rendered":"http:\/\/glandium.org\/blog\/?p=219"},"modified":"2010-01-27T08:52:26","modified_gmt":"2010-01-27T07:52:26","slug":"shared-subtrees-and-per-process-namespaces","status":"publish","type":"post","link":"https:\/\/glandium.org\/blog\/?p=219","title":{"rendered":"Shared subtrees and per-process namespaces"},"content":{"rendered":"<p>Now that we have seen what <a href=\"\/blog\/?p=217\">per-process namespaces<\/a> and <a href=\"\/blog\/?p=218\">shared subtrees<\/a> are and how to operate them, we can try to combine these two features.<\/p>\n<p>We'll be using our <code>newns<\/code> tool from this <a href=\"\/blog\/?p=217\">earlier post<\/a> to create new namespaces. And for practical reasons, let's say you have a terminal with a <code>1$<\/code> prompt and a second terminal with a <code>2$<\/code> prompt (that will allow me to skip \"go to the second terminal\" phrases).<\/p>\n<p>In the kernel, per-process namespaces are a bit like bind mounts, so shared subtrees work with namespaces as they do with bind mounts.<\/p>\n<p>As with standard bind mounts, the default shared subtree mode is <code>private<\/code>, which means mounts made in either namespace are private to that namespace. 
Only mount points active at the time of the new namespace creation will be in both namespaces:<\/p>\n<blockquote><p><code>1$ .\/newns<br \/>\n1$ mount \/dev\/sda1 \/mnt<br \/>\n1$ ls \/mnt<br \/>\nconfig-2.6.26-1-amd64  grub  initrd.img-2.6.26-1-amd64\tSystem.map-2.6.26-1-amd64  vmlinuz-2.6.26-1-amd64<br \/>\n2$ ls \/mnt<br \/>\n2$ mount \/dev\/sda1 \/cdrom<br \/>\n2$ ls \/cdrom<br \/>\nconfig-2.6.26-1-amd64  grub  initrd.img-2.6.26-1-amd64\tSystem.map-2.6.26-1-amd64  vmlinuz-2.6.26-1-amd64<br \/>\n1$ ls \/cdrom<br \/>\n1$ umount \/mnt<br \/>\n1$ exit<\/code> # Exit from newns<br \/>\n<code>2$ umount \/cdrom<br \/>\n<\/code><\/p><\/blockquote>\n<p><code>shared<\/code> mode allows both namespaces to share subsequent mounts. As with bind mounts, and for obvious reasons, you need to change the subtree mode before creating the new namespace:<\/p>\n<blockquote><p><code>1$ mount --make-rshared \/<br \/>\n1$ .\/newns<br \/>\n1$ mount --bind \/usr \/mnt<br \/>\n1$ ls \/mnt<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n2$ ls \/mnt<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n<\/code><\/p><\/blockquote>\n<p>Like with shared bind mounts, the new mount point can be unmounted from either namespace:<\/p>\n<blockquote><p><code>2$ umount \/mnt<br \/>\n1$ ls \/mnt<br \/>\n1$ exit<\/code> # Exit from newns\n<\/p><\/blockquote>\n<p>A <code>slave<\/code> subtree will have mount points under its master (shared) subtrees propagated, while propagation won't happen in the other direction. 
Again, very much like bind mounts:<\/p>\n<blockquote><p><code>1$ mount --make-rshared \/<\/code> # For completeness, we already did that earlier<br \/>\n<code>1$ .\/newns<br \/>\n1$ mount --make-rslave \/<br \/>\n1$ mount --bind \/usr \/mnt<br \/>\n1$ ls \/mnt<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n2$ ls \/mnt<br \/>\n2$ mount --bind \/usr \/cdrom<br \/>\n2$ ls \/cdrom<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n1$ ls \/cdrom<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n1$ exit<\/code> # Exit from newns<br \/>\n<code>2$ umount \/cdrom<br \/>\n<\/code><\/p><\/blockquote>\n<p>Unlike <code>shared<\/code> and <code>slave<\/code>, <code>unbindable<\/code> doesn't add value when used across two different namespaces. It only affects the current namespace:<\/p>\n<blockquote><p><code>1$ mount --make-rshared \/<\/code> # For completeness, we already did that earlier<br \/>\n<code>1$ .\/newns<br \/>\n2$ mount --make-runbindable \/<br \/>\n2$ mount --bind \/usr \/mnt<br \/>\nmount: wrong fs type, bad option, bad superblock on \/usr,<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;missing codepage or helper program, or other error<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;In some cases useful info is found in syslog - try<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;dmesg | tail  or so<br \/>\n1$ mount --bind \/usr \/mnt<br \/>\n1$ ls \/mnt<br \/>\nbin  games  include  lib  lib32  lib64\tlocal  sbin  share\tsrc  X11R6<br \/>\n1$ exit<\/code> # Exit from newns<br \/>\n<code>1$ mount --make-rprivate \/<\/code> # Set back to default\n<\/p><\/blockquote>\n<p>Now that we've seen what can be done with namespaces and shared subtrees, let's see what nice features can be implemented with both.<\/p>\n<p>As <a href=\"http:\/\/etbe.coker.com.au\/2008\/12\/13\/per-process-namespaces-pam-namespace\/\">Russell revealed<\/a>, pam-namespace is 
able to polyinstantiate some directories following rules set in <code>\/etc\/security\/namespace.conf<\/code> (see <code>namespace.conf(5)<\/code>). Sadly, it apparently can't create a new namespace without polyinstantiating a directory, which would be useful if you only want separate namespaces and no polyinstantiated directory.<\/p>\n<p>Russell's recipe goes as follows:<\/p>\n<ul>\n<li>At boot time, <code>\/<\/code> is put in <code>shared<\/code> mode and <code>\/tmp<\/code> in <code>private<\/code> mode.<\/li>\n<li>When opening a session, PAM will create a new namespace, and bind mount <code>\/tmp-inst\/$USER<\/code> on <code>\/tmp<\/code>.<\/li>\n<\/ul>\n<p>What this means is that in the newly created namespace, a user reading from <code>\/tmp<\/code> will actually be reading from <code>\/tmp-inst\/$USER<\/code>, without knowing it. Also, if root mounts something in the parent namespace, it will be propagated (<code>\/<\/code> being <code>shared<\/code>) to the user namespace. This means that mounting USB storage, for example, will be propagated to the user namespace. This also means that something mounted from the user namespace will be propagated to the parent namespace. In both cases, this only applies to mounts occurring outside of <code>\/tmp<\/code>; mounts under <code>\/tmp<\/code> don't get propagated.<\/p>\n<p>Note that if <code>\/tmp<\/code> were not set as <code>private<\/code>, PAM's bind mount of <code>\/tmp-inst\/$USER<\/code> would propagate as well, making <code>\/tmp<\/code> point to <code>\/tmp-inst\/$USER<\/code> for everyone. So setting <code>\/tmp<\/code> as <code>private<\/code> is mandatory.<\/p>\n<p>There is, however, a flaw in Russell's recipe: if anyone mounts something under a submount of <code>\/<\/code> (under <code>\/var<\/code>, for example, when <code>\/var<\/code> is a separate mount), it won't be propagated to a user who already has a session open. 
For that to be possible, you have to use <code>--make-rshared<\/code> instead of <code>--make-shared<\/code> in Russell's recipe.<\/p>\n<p>A nice setup that can be built from all of this is the following:<\/p>\n<p>Add the following to <code>\/etc\/security\/namespace.conf<\/code>:<\/p>\n<blockquote><p><code>\/tmp tmpfs tmpfs root<\/code><\/p><\/blockquote>\n<p>Add the following to <code>\/etc\/pam.d\/common-session<\/code>:<\/p>\n<blockquote><p><code>session required pam_namespace.so<\/code><\/p><\/blockquote>\n<p>So far, it is pretty much the same as Russell's, except <code>\/tmp<\/code> is a <em>tmpfs<\/em> instead of a real directory in <code>\/tmp-inst\/<\/code>, which can have some advantages. It doesn't seem to be possible to give pam-namespace a size for the <em>tmpfs<\/em>, though.<\/p>\n<p>Add the following to <code>\/etc\/rc.local<\/code>:<\/p>\n<blockquote><p><code>mount --make-rshared \/<br \/>\nmount --bind \/tmp \/tmp<br \/>\nmount --make-private \/tmp<\/code><\/p><\/blockquote>\n<p>Here again, this is the same as Russell's except for the correction for <code>--make-rshared<\/code> as seen above. If <code>\/tmp<\/code> is already a mount point (on my systems, it is a <em>tmpfs<\/em>), you can remove <code>mount --bind \/tmp \/tmp<\/code> from above.<\/p>\n<p>Add the following to <code>\/etc\/security\/namespace.init<\/code>:<\/p>\n<blockquote><p><code>mount --make-rslave \/<\/code><\/p><\/blockquote>\n<p>Now, this is where it gets interesting: we're setting the whole tree as <code>slave<\/code> in the user namespace, which means that if a user mounts a file system anywhere, it will only be seen in his session. This means the user can more safely mount encrypted volumes: they won't be available to other users (root can still go wandering in <code>\/dev\/kmem<\/code>, though). 
And you don't even need SELinux for that.<\/p>\n<p>The caveat is that if you open a root session with su from the user session, you are still inside the user's boundaries, and don't have access to the virtual filesystem as it is for init. And if you then mount something as root, it will appear in the user's namespace, but neither in other users' namespaces nor in init's, which can be a good thing in some cases, but a burden in others. It means you may need to set up a special user that won't get a new namespace in <code>\/etc\/security\/namespace.conf<\/code>.<\/p>\n<p>An alternative model would be to only set the user's home as <code>slave<\/code>, which means anything mounted by the user in his home directory would stay private in his namespace, while anything mounted outside would be shared with other namespaces.<\/p>\n<p>For that, you may want to replace the lines we added to <code>\/etc\/security\/namespace.init<\/code> with the following:<\/p>\n<blockquote><p><code>HOME=$(getent passwd $3 | cut -d: -f6)<br \/>\nmount --bind \"$HOME\" \"$HOME\"<br \/>\nmount --make-rslave \"$HOME\"<br \/>\n<\/code><\/p><\/blockquote>\n<p>Either way, the remaining problem is that a root session opened with su from the user session won't get access to the original <code>\/tmp<\/code>.<\/p>\n<p>As we saw, there are various use cases for namespaces and shared subtrees. I'll follow up on these features soon, as I'll be using them in yet another way to achieve a very different purpose.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Now we have seen what per-process namespaces and shared subtrees are and how to operate them, we can try to combine these two features. We&#8217;ll be using our newns tool from this earlier post to create new namespaces. 
And for practical reasons, let&#8217;s say you have a terminal with a 1$ prompt and a second [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5],"tags":[23],"class_list":["post-219","post","type-post","status-publish","format-standard","hentry","category-pdo","tag-en"],"_links":{"self":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/219","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=219"}],"version-history":[{"count":3,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/219\/revisions"}],"predecessor-version":[{"id":660,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/219\/revisions\/660"}],"wp:attachment":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=219"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=219"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=219"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}