Tuesday, November 18, 2014

Does Anybody Know How to Legitimately Register a PMI (PMU Performance Monitor Interrupt) Callback Handler on Windows?

According to the Intel 64 and IA-32 Software Developer's Manual, when a PMU (Performance Monitor Unit) counter overflows, or the LBR (Last Branch Record)/BTS (Branch Trace Store) is nearly full, the processor delivers a PMI (Performance Monitor Interrupt). In the Linux kernel, the PMU support behind the perf tool delivers the PMI as an NMI, and we can change the kernel source directly to add our own PMI handler for a particular event.

But on Windows, how can a driver register a PMI handler callback without hooking the kernel IDT? Does anybody know?

I searched almost all of the Driver Support Routines documented on MSDN for kernel-mode drivers, but did not find a documented kernel API for this. However, examining a Windows 7 32-bit system with WinDbg turned up something interesting.


According to the IA-32 manual, the local APIC must be set up to deliver the PMI, and a software handler for the interrupt must be installed at the corresponding vector entry of the IDT (Interrupt Descriptor Table).

To be more specific, the Local APIC LVT (Local Vector Table) Performance Counter Register must be set up for this purpose. In xAPIC mode, this register is memory-mapped at the APIC base physical address plus offset 0x340; when x2APIC mode is enabled, it is accessed instead through the IA32_X2APIC_LVT_PMI MSR (index 0x834), which is called the x2APIC LVT Performance Monitor register.

On a Windows 7 32-bit system, I used WinDbg to read the IA32_APIC_BASE MSR (0x1B) with the rdmsr command:

kd> rdmsr 0x1b
msr[1b] = 00000000`fee00900

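From the layout of the IA32_APIC_BASE MSR (bit 8 is the BSP flag, bit 10 enables x2APIC mode, bit 11 is the APIC global enable, and the base address starts at bit 12), we can see that the APIC base physical address is 0xfee00000 and that the APIC is enabled in xAPIC mode on my system. This means the LVT Performance Counter Register is memory-mapped at 0xfee00340.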
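The same decoding can be done mechanically. Here is a minimal user-mode C sketch (not driver code) that applies the IA32_APIC_BASE bit layout to the value read above. The struct and function names are made up for illustration, and the upper bound of the base-address mask (bit 51) is an assumption; the real width depends on the processor's physical address size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of IA32_APIC_BASE (MSR 0x1B). Names are illustrative only. */
typedef struct {
    uint64_t base;     /* physical base address of the local APIC registers */
    bool     bsp;      /* bit 8: this processor is the bootstrap processor */
    bool     x2apic;   /* bit 10: x2APIC mode enabled */
    bool     enabled;  /* bit 11: APIC globally enabled */
} apic_base_info;

apic_base_info decode_apic_base(uint64_t msr)
{
    apic_base_info info;
    /* Base address starts at bit 12; masking up to bit 51 is an assumption. */
    info.base    = msr & 0x000FFFFFFFFFF000ULL;
    info.bsp     = ((msr >> 8) & 1) != 0;
    info.x2apic  = ((msr >> 10) & 1) != 0;
    info.enabled = ((msr >> 11) & 1) != 0;
    return info;
}

/* MMIO address of the LVT Performance Counter Register in xAPIC mode. */
uint64_t lvt_pmi_mmio_address(uint64_t msr)
{
    return decode_apic_base(msr).base + 0x340;
}
```

For the value 0xfee00900, decode_apic_base() reports base 0xfee00000 with the BSP and global-enable bits set and the x2APIC bit clear, and lvt_pmi_mmio_address() returns 0xfee00340.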



Then I used the !dd command to read this physical address; as shown below, the value is 0x000000fe.

kd> !dd [uc] fee00340
#fee00340 000000fe 00000000 000000fe 00000000
#fee00350 0001001f 00000000 0001001f 00000000
#fee00360 000004ff 00000000 000004ff 00000000
#fee00370 000000e3 00000000 000000e3 00000000
#fee00380 00000000 00000000 00000000 00000000
#fee00390 00000000 00000000 00000000 00000000
#fee003a0 00000000 00000000 00000000 00000000
#fee003b0 00000000 00000000 00000000 00000000

In the layout of this register, bits 0-7 hold the vector and bits 8-10 the Delivery Mode, so this value means that by default the Windows kernel uses Fixed (000b) Delivery Mode and IDT vector 0xfe to deliver the PMI.
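To make the field extraction explicit, here is a small user-mode C sketch that splits an LVT register value into its fields. The type and function names are hypothetical; the bit positions (vector in bits 0-7, delivery mode in bits 8-10, delivery status in bit 12, mask in bit 16) follow the local APIC LVT layout in the Intel manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a local APIC LVT entry. Names are illustrative only. */
typedef struct {
    uint8_t vector;         /* bits 0-7: IDT vector */
    uint8_t delivery_mode;  /* bits 8-10: 000b = Fixed, 100b = NMI */
    bool    pending;        /* bit 12: delivery status (send pending) */
    bool    masked;         /* bit 16: interrupt masked */
} lvt_entry;

lvt_entry decode_lvt(uint32_t reg)
{
    lvt_entry e;
    e.vector        = (uint8_t)(reg & 0xFF);
    e.delivery_mode = (uint8_t)((reg >> 8) & 0x7);
    e.pending       = ((reg >> 12) & 1) != 0;
    e.masked        = ((reg >> 16) & 1) != 0;
    return e;
}
```

Applied to 0x000000fe, this gives vector 0xfe, delivery mode 0 (Fixed), and an unmasked entry. Applied to the 0x000004ff value that the same dump shows at offset 0x360 (the LVT LINT1 register), it gives delivery mode 4 (NMI).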




Now let's check vector 0xfe in the IDT with WinDbg's !idt command: the PMI ISR (Interrupt Service Routine) installed by the OS kernel is hal!HalpPerfInterrupt.

kd> !idt 0xfe
Dumping IDT: 80b95400
fe: 82a221a8 hal!HalpPerfInterrupt

Disassembling this function with the uf command gives the listing below; an ellipsis (...) marks instructions that were omitted. We can see that it loads a handler (a callback?) from the global variable hal!HalpPerfInterruptHandler and, if the pointer is non-null, calls it. So my question is: how can my own driver register this performance interrupt handler (PMI handler), so that my callback routine is called whenever a PMI occurs?

kd> uf hal!HalpPerfInterrupt
...
hal!HalpPerfInterrupt:
82a221a8 54              push    esp
82a221a9 55              push    ebp
82a221aa 53              push    ebx
82a221ab 56              push    esi
82a221ac 57              push    edi
82a221ad 83ec54          sub     esp,54h
82a221b0 8bec            mov     ebp,esp
82a221b2 894544          mov     dword ptr [ebp+44h],eax
82a221b5 894d40          mov     dword ptr [ebp+40h],ecx
82a221b8 89553c          mov     dword ptr [ebp+3Ch],edx
82a221bb f7457000000200  test    dword ptr [ebp+70h],20000h
82a221c2 75bc            jne     hal!V86_Hpf_a (82a22180)  Branch

hal!HalpPerfInterrupt+0x1c:
82a221c4 66837d6c08      cmp     word ptr [ebp+6Ch],8
82a221c9 741f            je      hal!HalpPerfInterrupt+0x42 (82a221ea)  Branch
...
hal!HalpPerfInterrupt+0x146:
82a222ee 8bcd            mov     ecx,ebp
82a222f0 a1e43ca282      mov     eax,dword ptr [hal!HalpPerfInterruptHandler (82a23ce4)]
82a222f5 0bc0            or      eax,eax
82a222f7 745b            je      hal!HalpPerfInterrupt+0x1ac (82a22354)  Branch

hal!HalpPerfInterrupt+0x151:
82a222f9 ffd0            call    eax
...

A possible solution? (It requires changing the OS default settings):
As we know, the PMI vector is shared, so a PMI handler should check the IA32_PERF_GLOBAL_STATUS MSR (0x38E) to determine which event(s) triggered the PMI. However, at any given time each PMU should be controlled and used by only one PMU driver, even if several PMU drivers are installed. Hence, Windows (Win7+) provides the two APIs below for PMU drivers.

      HalAllocateHardwareCounters()
      HalFreeHardwareCounters()

Their usage is described below; for details, please see the MSDN documentation.
If more than one such tool is installed on a computer, the associated drivers must avoid trying to use the same hardware counters simultaneously. To avoid such resource conflicts, all drivers that use counter resources should use the HalAllocateHardwareCounters and HalFreeHardwareCounters routines to coordinate their sharing of these resources. 
A counter resource is a single hardware counter, a block of contiguous counters, or a counter overflow interrupt in a PMU.
Before configuring the counters, a driver can call the HalAllocateHardwareCounters routine to acquire exclusive access to a set of counter resources. After the driver no longer needs these resources, it must free the resources by calling the HalFreeHardwareCounters routine.
Does this mean that once we successfully call HalAllocateHardwareCounters() to acquire exclusive access to the PMI (that is, the counter overflow interrupt in a PMU), we can even re-program the default Local APIC LVT Performance Counter Register?

If we can do that without triggering PatchGuard (on Windows x64) or causing other compatibility issues, then we could proceed as follows:
  1. Call HalAllocateHardwareCounters() to acquire exclusive access to the PMI.
  2. Re-program the APIC LVT Performance Counter Register, setting Delivery Mode to NMI (100b); see the register layout above. Then, whenever a PMI is triggered, an NMI (nonmaskable interrupt) handler gets called.
    In other words, this setting converts the PMI event into an NMI event.
  3. Fortunately, the Windows kernel provides the two APIs below:
    KeRegisterNmiCallback() - Registers a routine to be called whenever an NMI occurs
    KeDeregisterNmiCallback()
    See the MSDN documentation for details. This means the kernel allows our driver to register an NMI callback routine to handle any NMI event.
Once this is done, I think we can control and use a particular PMU and handle the PMI appropriately. When the work is finished, we must of course restore the APIC (xAPIC or x2APIC) LVT Performance Counter Register to its default setting, deregister the NMI callback, and free the hardware counter resources.
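The register arithmetic behind this idea can be sketched in plain user-mode C. This is only an untested illustration of the bit manipulation, not a working driver: the function and macro names are hypothetical, and a real driver would have to write the computed value to the mapped LVT register (or the x2APIC MSR) and read IA32_PERF_GLOBAL_STATUS with a privileged MSR read. Because an NMI callback is invoked for every NMI, the second function shows one way the callback might decide whether a PMI source is pending; the counter counts passed to it are assumptions that would normally come from CPUID leaf 0AH.

```c
#include <stdint.h>

#define LVT_DELIVERY_MODE_MASK 0x00000700u  /* bits 8-10 */
#define LVT_DELIVERY_MODE_NMI  0x00000400u  /* 100b in bits 8-10 */
#define LVT_MASK_BIT           0x00010000u  /* bit 16 */

/* Step 2: compute the LVT value that delivers the PMI as an NMI.
 * The vector field is left untouched so the caller can still save the
 * original value and restore it later. */
uint32_t lvt_with_nmi_delivery(uint32_t lvt)
{
    uint32_t value = lvt & ~LVT_DELIVERY_MODE_MASK;
    value |= LVT_DELIVERY_MODE_NMI;
    value &= ~LVT_MASK_BIT;  /* make sure the entry is unmasked */
    return value;
}

/* NMI-side check: return the overflow bits of IA32_PERF_GLOBAL_STATUS
 * (MSR 0x38E) that belong to this PMU. A zero result suggests the NMI
 * did not come from the PMU.
 * num_gp / num_fixed: number of general-purpose and fixed counters
 * (assumed inputs; query CPUID leaf 0AH on real hardware). */
uint64_t pmi_overflow_bits(uint64_t status, unsigned num_gp, unsigned num_fixed)
{
    if (num_gp > 32)    num_gp = 32;
    if (num_fixed > 16) num_fixed = 16;

    uint64_t gp_mask     = (1ULL << num_gp) - 1;            /* bits 0..num_gp-1 */
    uint64_t fixed_mask  = ((1ULL << num_fixed) - 1) << 32; /* bits 32 and up */
    uint64_t buffer_mask = 1ULL << 62;                      /* DS buffer overflow */

    return status & (gp_mask | fixed_mask | buffer_mask);
}
```

With the default value 0x000000fe, lvt_with_nmi_delivery() returns 0x000004fe. Restoring the saved original value on cleanup corresponds to the "restore to its default setting" step.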


Notes:
  1. Due to the bug described in my previous post, an NMI causes a system crash on Windows 8.1 32-bit. I am not sure whether Microsoft has fixed this issue in the latest version.
  2. The Intel VTune driver on Windows might be using the PMU PMI, but I have no idea how it does so :-(
  3. If anybody knows a good way to register a PMI handler, please let me know :)  I would really appreciate it!
<The End>


4 comments:

  1. Were you successful in doing that? I'm facing the same issue.

    1. I don't have a better idea than the one I mentioned above. How about you?

    2. Hi Anababa, thank you for your excellent readings.

      As far as I can see in the 8.1 x64 kernel, you could probably call HalpSetSystemInformation, stored in HalDispatchTable[2], to change HalpPerfInterruptHandler. I saw no reference to HalpPerfInterruptHandler other than from HalpSetSystemInformation. I have not tested it, but thought it might be of use.
      ----
      0: kd> dps @@masm(nt!HalDispatchTable) l3
      fffff801`fff30690 00000000`00000004
      fffff801`fff30698 fffff801`ffc70320 hal!HaliQuerySystemInformation
      fffff801`fff306a0 fffff801`ffc70228 hal!HalpSetSystemInformation

      NTSTATUS HalpSetSystemInformation(ULONG_PTR InformationClass =1, SIZE_T BufferSize =8, void *Buffer =Handler)

    3. Satoshi, thank you for this information. It is great to know, though I no longer have a chance to try it out because my project has changed. I really appreciate it!
