Thursday, November 06, 2014

Monitor Trap Flag (MTF) Usage in EPT-based Guest Physical Memory Monitoring

Monitor Trap Flag (MTF) is a flag specifically designed for single-stepping in x86/Intel hardware virtualization VT-x technology. When MTF is set, the guest will trigger a VM Exit after executing each instruction (need to consider NMI or other interrupt delivery boundary). This paper presents an idea to use MTF for memory write allowing when monitoring modification to guest virtual-to-physical mapping (page table entries) tables. 

In that paper (SPIDER: Stealthy Binary Program Instrumentation and Debugging via Hardware Virtualization), it details a solution to trap guest virtual-to-physical mapping address changes by monitoring the corresponding guest page tables. Based upon my previous experience, monitoring page table entries (with read-only permission in EPT PTE settings) will cause significant performance cost. In this post, I am not challenging that solution since it is not a product after all.

As we all know that EPT can be configured to monitor guest physical memory access with appropriate RWX permission settings. For example, for a guest data page, we can configure the corresponding EPT page table entry with !W permission, then whenever the processor fetches the instructions in that guest physical page for execution (e.g. code injection for shellcode execution), an EPT violation vmexit (or #VE interrupt) will occur. 

However, the contents of some guest physical page might be swapped out to disk by OS under a low memory pressure condition, and then that physical page might be remapped to another guest virtual address used for by other process. In this case, we must restore the EPT permission to default (e.g. RWX), otherwise there are many unwanted EPT violations occur. 

One of solutions is to monitor guest virtual-to-physical mapping page table entries just as what the paper does. For example, we can monitor guest PTE page (guest physical address) with EPT Read-Only permission. Whenever a page remapping is required, the guest OS kernel will update the corresponding guest PTE entry. 

Since the PTE entry in EPT permission is read-only, any change to that entry will trigger EPT violation vmexit. After hypervisor captures this event, it will record the current values of PTE entry, then temporarily set the PTE page to writable and let the guest single-step (through enabling MTF) through the instruction that performs the write access. After the single-stepping, hypervisor will read the new values of PTE entry and see which ones of them have been modified, and take appropriate actions based up mapping updates. After that, hypervisor will disable MTF flag and set the PTE page back to read-only to capture future remapping event

In the real case, the guest page table may have multiple levels, also changes to page table entries
may be very frequent, and minimal EPT page granularity is 4KB (too large), therefore this can only be an experimental solution due to huge performance penalty.

However, using MTF flag to grant a data write access and/or inspect the write content on a data page that is wrote less frequently is acceptable.


  1. For ring-3 level process in guest OS can also be performed in this method?
    we only need to identify the CR3 register ? Am i right? please correct me :)

    1. I think yes, maybe we also need to trap CR3 switching event for guest page walking.