{"id":141,"date":"2007-06-24T12:27:21","date_gmt":"2007-06-24T10:27:21","guid":{"rendered":"http:\/\/web.glandium.org\/blog\/?p=141"},"modified":"2010-01-27T08:52:32","modified_gmt":"2010-01-27T07:52:32","slug":"playing-more-with-lvm-luks-and-the-device-mapper","status":"publish","type":"post","link":"https:\/\/glandium.org\/blog\/?p=141","title":{"rendered":"Playing more with LVM, LUKS and the device mapper"},"content":{"rendered":"<p>Following my previous entry about <a href=\"\/blog\/?p=139\">playing around with LVM, LUKS and the device mapper<\/a>, I looked into the internals involved in pvmove, which was more of a challenge: I could find no useful documentation for either the device mapper or LVM. A bit of good old <acronym title=\"Use the source, Luke\">UTSL<\/acronym> later, I had worked out how to do what I wanted, and realized it could even be done in a shell script.<\/p>\n<p>So here it is: a shell script to convert an LVM physical volume into an LVM over LUKS physical volume. I'll explain how it works between the code blocks. As with the previous script, use it at your own risk; it comes with no warranty.<br \/>\nNote that, in theory, you can still access the filesystems underneath without problems: it's an in-place, live conversion. Also note this is more a proof of concept than a proper, risk-free and well-written solution.<\/p>\n<blockquote><p><code>set -e<br \/>\ndev=$1<br \/>\nluks=$(mktemp)<br \/>\ncryptdev=$(basename \"$dev\")_crypt<br \/>\npvsize=$(pvs -o pe_start,pv_size --units s --noheadings --nosuffix \"$dev\" | awk '{print $1 + $2}')<br \/>\ndevsize=$(blockdev --getsz \"$dev\")<br \/>\nmchunk=8<br \/>\n<\/code><\/p><\/blockquote>\n<p>The script takes the physical volume device to convert as an argument. 
Note there is no check for the validity of its value.<\/p>\n<ul>\n<li>luks is the temporary file in which we will create a temporary LUKS volume.<\/li>\n<li>pvsize is the full size of the LVM physical volume in 512-byte sectors, i.e. the size of all the physical extents (pv_size) + the LVM headers and metadata (pe_start).<\/li>\n<li>devsize is the raw device size, also in 512-byte sectors.<\/li>\n<li>mchunk is the size, in blocks, of the chunks used by the mirror target (see below).<\/li>\n<\/ul>\n<blockquote><p><code>dd of=\"$luks\" seek=$devsize count=0 bs=512 2> \/dev\/null<br \/>\nluksdev=$(losetup -f)<br \/>\nlosetup \"$luksdev\" \"$luks\"<br \/>\ntrap \"losetup -d \\\"$luksdev\\\"; rm -f \\\"$luks\\\"\" EXIT<br \/>\n<\/code><\/p><\/blockquote>\n<p>Next, we create a sparse luks file the same size as the device (in case luksFormat uses the size somehow, though I believe it doesn't), and a loopback device on this file. The trap is there to avoid leaving the loopback device and the file behind when an error occurs later (though during the conversion itself, it is pointless).<\/p>\n<blockquote><p><code>cryptsetup luksFormat -q \"$luksdev\"<br \/>\ncryptsetup luksOpen \"$luksdev\" ${cryptdev}_real<br \/>\nread start length crypt format key IVoff cdev offset &lt;&lt;EOF<br \/>\n$(dmsetup table ${cryptdev}_real)<br \/>\nEOF<br \/>\n<\/code><\/p><\/blockquote>\n<p>We create a LUKS device so that we can get the encryption key ($key) and the size of the LUKS header ($offset). Note that on sid you need to add <code>--showkeys<\/code> to the <code>dmsetup table<\/code> command.<\/p>\n<blockquote><p><code>if [ $(expr $devsize - $pvsize) -lt $offset ]; then<br \/>\n&nbsp;&nbsp;echo Not enough free space after LVM physical volume<br \/>\n&nbsp;&nbsp;cryptsetup luksClose ${cryptdev}_real<br \/>\n&nbsp;&nbsp;exit 1<br \/>\nfi<br \/>\n<\/code><\/p><\/blockquote>\n<p>Check that we have enough space after the LVM physical volume to offset everything by the size of the LUKS header. 
If not, you can still try again after reducing the size of the LVM physical volume by one extent.<\/p>\n<blockquote><p><code>if [ $(expr $devsize % $offset % $mchunk) -gt 0 ]; then<br \/>\n&nbsp;&nbsp;echo Last chunk is not a multiple of $mchunk blocks<br \/>\n&nbsp;&nbsp;cryptsetup luksClose ${cryptdev}_real<br \/>\n&nbsp;&nbsp;exit 1<br \/>\nfi<br \/>\n<\/code><\/p><\/blockquote>\n<p>This is another check to avoid surprises at the end, when dealing with the last chunk. As the script is currently written, it doesn't support cases where the size of this last chunk is not a multiple of $mchunk, so we abort in those cases.<\/p>\n<blockquote><p><code># stat reports major and minor in hexadecimal; dmsetup uses decimal<br \/>\nmajor=$(printf '%d' 0x$(stat -c %t \"$dev\"))<br \/>\nminor=$(printf '%d' 0x$(stat -c %T \"$dev\"))<br \/>\nmaps=$(dmsetup deps | awk -F: \"\/\\($major, $minor\\)\/{print \\$1}\")<br \/>\ndmsetup create $cryptdev &lt;&lt;EOF<br \/>\n0 $length linear $dev 0<br \/>\nEOF<br \/>\ndmsetup reload ${cryptdev}_real &lt;&lt;EOF<br \/>\n$start $length crypt $format $key $IVoff $dev $offset<br \/>\nEOF<br \/>\ndmsetup resume ${cryptdev}_real<br \/>\nfor map in ${maps}; do<br \/>\n&nbsp;&nbsp;dmsetup table \"$map\" | sed s,$major:$minor,\/dev\/mapper\/$cryptdev, | dmsetup reload \"$map\"<br \/>\n&nbsp;&nbsp;dmsetup resume \"$map\"<br \/>\ndone<br \/>\n<\/code><\/p><\/blockquote>\n<p>Here, we create the dev_crypt device mapper that is our fake LUKS device, which starts as a simple linear mapper and will end up as a complete crypt mapper. This fake LUKS device is inserted as an intermediate mapper between the LVM device mapper and the real device. 
So, the LVM device mapper will have this fake LUKS device as its backend and, at the beginning, the fake LUKS device maps linearly to the real device.<\/p>\n<p>Note that we look for all device mappers using the real device as backend before creating the fake LUKS device, to avoid finding the fake LUKS device in the list.<\/p>\n<p>Also note that <code>dmsetup reload<\/code> only loads a new table into the INACTIVE slot, and <code>dmsetup resume<\/code> makes this inactive table LIVE.<\/p>\n<blockquote><p><code>cursor=$length<br \/>\nchunk=$offset<br \/>\nwhile [ $cursor -gt 0 ]; do<br \/>\n&nbsp;&nbsp;# $(( )) instead of expr: expr exits non-zero on a 0 result, which would trip set -e<br \/>\n&nbsp;&nbsp;cursor=$(($cursor - $chunk))<br \/>\n&nbsp;&nbsp;if [ $cursor -lt 0 ]; then<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;chunk=$(expr $chunk + $cursor)<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;cursor=0<br \/>\n&nbsp;&nbsp;fi<br \/>\n&nbsp;&nbsp;(<br \/>\n&nbsp;&nbsp;[ $cursor -ne 0 ] && echo 0 $cursor linear $dev 0<br \/>\n&nbsp;&nbsp;echo $cursor $chunk mirror core 1 $mchunk 2 $dev $cursor \/dev\/mapper\/${cryptdev}_real $cursor<br \/>\n&nbsp;&nbsp;[ $cursor -lt $(expr $length - $chunk) ] && echo $(expr $cursor + $chunk) $(expr $length - $cursor - $chunk) crypt $format $key $(expr $IVoff + $cursor + $chunk) $dev $(expr $offset + $cursor + $chunk)<br \/>\n&nbsp;&nbsp;) | dmsetup reload \"$cryptdev\"<br \/>\n&nbsp;&nbsp;dmsetup resume \"$cryptdev\"<br \/>\n&nbsp;&nbsp;chunks=$(expr $chunk \/ $mchunk)<br \/>\n&nbsp;&nbsp;while ! dmsetup status \"$cryptdev\" | grep \"$chunks\/$chunks\"; do<br \/>\n&nbsp;&nbsp;&nbsp;&nbsp;true<br \/>\n&nbsp;&nbsp;done<br \/>\ndone<br \/>\n<\/code><\/p><\/blockquote>\n<p>This is where the main work is done: moving the data around. We simply let the device mapper copy the data, $offset blocks at a time ($offset being the LUKS header size), working backwards from the end of the device and using a mirror target for the chunk being moved. 
So our disk looks like the following:<br \/>\n<img decoding=\"async\" src=\"\/blog\/wp-content\/uploads\/2007\/06\/lvm2luks.png\"\/><br \/>\nWe use the extra dev_crypt_real device (the previously remapped LUKS device) as the encryption backend for the mirror.<br \/>\nI haven't figured out a better way to wait for the end of the mirroring than looping on <code>dmsetup status<\/code>; <code>dmsetup wait<\/code> doesn't seem to be very helpful here.<br \/>\nAnyway, this is the part of the script where you don't want a crash to occur: if it does, all you can do is boot a rescue system and try to find where the encrypted part of the disk starts, in order to set up a device mapper by hand.<br \/>\nAnd you'd better keep the luks temporary file in a directory that is neither in RAM (think tmpfs) nor in the LVM you are converting (\/tmp in a default etch install is, for instance; note the script nevertheless works fine in this case). Also note the trap will remove the luks temporary file if the script exits...<\/p>\n<blockquote><p><code>dmsetup reload \"$cryptdev\" &lt;&lt;EOF<br \/>\n$start $length crypt $format $key $IVoff $dev $offset<br \/>\nEOF<br \/>\ndmsetup resume \"$cryptdev\"<br \/>\ndmsetup remove \"${cryptdev}_real\"<br \/>\ndd if=\"$luks\" of=\"$dev\" count=$offset bs=512 2> \/dev\/null<br \/>\n<\/code><\/p><\/blockquote>\n<p>Final steps of the conversion: our dev_crypt device becomes a full LUKS volume, so we can remove dev_crypt_real and write the LUKS header at the start of the device we converted.<\/p>\n<p>At this point, the LUKS volume is set up just as if it had been set up by cryptsetup. For LVM to recognize the change properly, you need to run <code>pvscan<\/code>. Once you have run it, you can do whatever you want with LVM.<\/p>\n<p>Now, you may want to add the following to your <code>\/etc\/crypttab<\/code> file:<\/p>\n<blockquote><p><code>$cryptdev $dev none luks<br \/>\n<\/code><\/p><\/blockquote>\n<p>i.e. 
<code>hda5_crypt \/dev\/hda5 none luks<\/code> if the device was <code>\/dev\/hda5<\/code>.<\/p>\n<p>And if the LVM volume you converted contains your root filesystem, you should run (on Debian systems):<\/p>\n<blockquote><p><code>update-initramfs -u<br \/>\n<\/code><\/p><\/blockquote>\n<p>I tested this successfully under qemu and will give it a shot on my laptop some time soon.<\/p>\n<p>Now, because I had a hard time finding anything about the mirror target of the device mapper, here is what I could gather about it. The target syntax is as follows:<\/p>\n<blockquote><p><code>&lt;logical_start_sector&gt; &lt;num_sectors&gt; mirror [ core | disk ] &lt;num_params&gt; &lt;param&gt; ... &lt;num_mirrors&gt; [ &lt;destination&gt; &lt;start_sector&gt; ] ...<br \/>\n<\/code><\/p><\/blockquote>\n<p><code>logical_start_sector<\/code>, <code>num_sectors<\/code>, <code>destination<\/code> and <code>start_sector<\/code> have the same meaning as in other targets.<br \/>\n<code>core<\/code> and <code>disk<\/code> are two different log types (used to track differences between mirrors), kept in memory and on disk respectively.<br \/>\n<code>num_params<\/code> and the <code>param<\/code> values depend on the log type:<\/p>\n<ul>\n<li>for <code>core<\/code>, <code>num_params<\/code> can be either 1 or 2.<br \/>\nThe first parameter is <code>region_size<\/code>, which is the size (in blocks) of the chunks the mirror log tracks for synchronization. It can't be less than the page size divided by 512, i.e. 8 on x86, must be a power of 2, and must not exceed <code>num_sectors<\/code>. Apparently, it is not a problem if <code>num_sectors<\/code> is not a multiple of <code>region_size<\/code>.<br \/>\nThe optional second parameter is either <code>sync<\/code> or <code>nosync<\/code>, whose meaning I'm not sure of. 
I think it determines whether the mirror should do the initial synchronization (<code>sync<\/code>) or not (<code>nosync<\/code>).<\/li>\n<li>for <code>disk<\/code>, <code>num_params<\/code> can be either 2 or 3, with params being <code>log_device<\/code> (device where the logs are kept), <code>region_size<\/code>, as above, and an optional <code>sync<\/code> or <code>nosync<\/code>, as above too.<\/li>\n<\/ul>\n<p><code>num_mirrors<\/code> is the number of mirrors and for each mirror, we have a pair <code>destination<\/code> and <code>start_sector<\/code>.<\/p>\n<p>[ <b>Update:<\/b> after a quick look at the device mapper source code in Linus's git tree, updated the mirror target description ]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Following my previous entry about playing around with LVM, LUKS and the device mapper, I documented myself about internals involved in pvmove, which is more of a challenge, considering there is no such documentation. I could find no useful documentation for either the device mapper or LVM. 
A bit of good old UTSL later, I [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1,5],"tags":[23],"class_list":["post-141","post","type-post","status-publish","format-standard","hentry","category-misc","category-pdo","tag-en"],"_links":{"self":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/141","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=141"}],"version-history":[{"count":1,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/141\/revisions"}],"predecessor-version":[{"id":721,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=\/wp\/v2\/posts\/141\/revisions\/721"}],"wp:attachment":[{"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=141"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=141"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/glandium.org\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=141"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}