tag:blogger.com,1999:blog-3850946929189928282024-03-05T17:38:59.323-08:00SIMPLE IS BETTERAnababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.comBlogger47125tag:blogger.com,1999:blog-385094692918992828.post-70062831295164607442015-06-09T23:47:00.002-07:002015-06-09T23:52:38.312-07:00How to enable CoreOS to boot on top of iKGT (Intel Kernel Guard Technology) ?<span style="font-family: Trebuchet MS, sans-serif;">Linked to here: </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://01.org/intel-kgt/blogs">https://01.org/intel-kgt/blogs</a> or <a href="https://01.org/intel-kgt/blogs/bzhu5/2015/coreos-ikgt">https://01.org/intel-kgt/blogs/bzhu5/2015/coreos-ikgt</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><END></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-74943631978139323522015-06-09T23:38:00.001-07:002015-06-09T23:49:44.655-07:00Intel Kernel Guard Technology is released as opensource software<span style="font-family: Trebuchet MS, sans-serif;">See the official site for details: <a href="https://01.org/intel-kgt">https://01.org/intel-kgt</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><END></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-61794485248369077672015-04-06T20:03:00.002-07:002015-04-13T04:38:09.230-07:00Common security design issues in privileged hypervisor or in any privileged emulators<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I recently reviewed nearly 100 Xen Security Advisories (<a href="http://xenbits.xen.org/xsa/">http://xenbits.xen.org/xsa/</a>). Apart from bad security coding practices that apply to any ordinary software, I found some specific security issues that we need to take into consideration when designing privileged hypervisors or privileged emulators.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><Work In Progress></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-38362694657146358002015-04-06T19:54:00.000-07:002015-04-06T19:55:49.005-07:00"What, How, and Why" on Interrupt Window (or NMI Window) Exiting in Virtualization Technology<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Recently, one of my colleagues asked me <b><i>why </i></b>there is a feature called "<i><b>Interrupt Window exiting</b></i>" in virtualization technology, and <i><b>how </b></i>a VMM can use it. This post briefly describes its "what, how and why".</span><br />
<br />
<br />
<a name='more'></a><br />
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><u>WHAT, and HOW</u></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">"Interrupt Window Exiting" is one of the VM exit reasons (basic exit reason 7 in Intel virtualization technology). If the “interrupt-window exiting” VM-execution control is set, this VM exit happens at the beginning of any instruction (including the first one after a VM entry):</span><br />
<br />
<ul>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">at which RFLAGS.IF = 1 (external interrupts are unmasked) and,</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">on which the interruptibility state of the guest would allow delivery of an interrupt (for example, not being blocked by STI or by MOV SS).</span></li>
</ul>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;"><u>WHY</u></span><br />In a typical case, the VMM wants to inject/deliver a (virtual) interrupt to one of its guest VMs at some point, but the interruptibility state of that guest does NOT allow delivery of an interrupt at that moment (for example, because the guest's RFLAGS.IF = 0). </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">So, in order to deliver this interrupt, the VMM would have to poll the interruptibility state of the guest and deliver the interrupt as soon as that state allows it (that is, as soon as a window opens). This is an inefficient way to do it.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /><span style="color: blue;"><i>So the problem is: how does a VMM get to know when its guest becomes interruptible? </i></span></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">With this feature, a VMM can queue a virtual interrupt for its guest while the guest is not in an interruptible state. The VMM only needs to set the “<i>interrupt-window exiting</i>” VM-execution control for that guest and rely on a VM exit to learn when the guest becomes interruptible (and, therefore, when it can inject the virtual interrupt). The VMM detects such VM exits by checking for the basic exit reason “interrupt-window”: if the exit reason is 7, the VMM knows it is the right time to deliver the virtual interrupt to that guest.<br /><br /><br /><b><i>Similarly, the same applies to the "NMI window exiting" feature in Virtualization Technology.</i></b></span><br />
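<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The flow above can be sketched in C as follows. This is only a minimal software model, not code from a real VMM: <i>struct vcpu</i> and the two functions are hypothetical names that stand in for VMCS accesses, and the bit positions are taken from my reading of the Intel SDM (RFLAGS.IF is bit 9, "interrupt-window exiting" is bit 2 of the primary processor-based VM-execution controls).</span><br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_IF                (1u << 9)  /* interrupt enable flag */
#define BLOCKING_BY_STI          (1u << 0)  /* interruptibility state, bit 0 */
#define BLOCKING_BY_MOV_SS       (1u << 1)  /* interruptibility state, bit 1 */
#define CTRL_INTR_WINDOW_EXITING (1u << 2)  /* primary processor-based control */
#define EXIT_REASON_INTR_WINDOW  7u         /* basic exit reason */

/* Hypothetical model of the few guest/VMCS fields this sketch needs. */
struct vcpu {
    uint64_t guest_rflags;
    uint32_t interruptibility;
    uint32_t proc_based_ctls;
    int      pending_vector;   /* -1 when nothing is queued */
    int      injected_vector;  /* -1 until something is injected */
};

static bool guest_interruptible(const struct vcpu *v)
{
    return (v->guest_rflags & RFLAGS_IF) &&
           !(v->interruptibility & (BLOCKING_BY_STI | BLOCKING_BY_MOV_SS));
}

/* The VMM wants to deliver `vector` to the guest. */
void vmm_request_interrupt(struct vcpu *v, uint8_t vector)
{
    assert(v->pending_vector < 0);  /* this sketch queues one interrupt only */
    if (guest_interruptible(v)) {
        v->injected_vector = vector;  /* window already open: inject now */
        return;
    }
    /* Window closed: queue the interrupt and ask for a VM exit when it opens. */
    v->pending_vector = vector;
    v->proc_based_ctls |= CTRL_INTR_WINDOW_EXITING;
}

/* VM-exit handler: only the interrupt-window case is modeled. */
void vmm_handle_exit(struct vcpu *v, uint32_t basic_exit_reason)
{
    if (basic_exit_reason != EXIT_REASON_INTR_WINDOW)
        return;
    v->proc_based_ctls &= ~CTRL_INTR_WINDOW_EXITING;
    if (v->pending_vector >= 0) {
        v->injected_vector = v->pending_vector;
        v->pending_vector = -1;
    }
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In a real VMM the injection step would write the VM-entry interruption-information field instead of a plain struct member, but the control flow is the same: no polling, just one extra VM exit when the window opens.</span><br />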
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<br />Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-41885727901844102502015-01-26T05:45:00.000-08:002015-01-26T07:24:19.480-08:00Control-flow processor exceptions (single-stepping on branches) on control-flow branch instructions (jmp/call/ret)<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">"Single-stepping on branches" is a processor hardware feature of the x86/Intel architecture. When it is enabled, the processor generates a single-step debug exception <span style="color: blue;">only after instructions that cause a branch</span>. This mechanism</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">allows a debugger to single-step on <span style="color: blue;">control transfers caused by branches</span>. What does this imply for defending against control-flow hijacking attacks (e.g. ROP or JOP)? </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<a name='more'></a><span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><u>Control-flow Transfer Instructions</u></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Control-flow hijacking attacks allow an attacker to overwrite a value that is loaded into the program counter (EIP) of a running program, typically redirecting execution to his own injected code or to existing ROP/JOP gadget chains in order to execute arbitrary malicious code. In general, the subverted value could be a jump target address, a function pointer, or a return address on a user-controlled stack. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Call/Jmp/Ret instructions, known as control-transfer branch instructions, are what control-flow hijacking attacks use to redirect CPU execution. Many software tools can perform binary analysis on those instructions, for example, by </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">dynamically </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">instrumenting the control-flow graph (CFG) for control-flow integrity (CFI) enforcement. So it would be good if the processor hardware could generate an exception on (or after) those control-transfer instructions. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><u><span style="color: blue;">Single-stepping on Branches</span></u></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In the x86/Intel processor architecture, there is a bit (Trap Flag, TF) in the EFLAGS register, shown below. It is set to enable single-step mode for debugging, and cleared to disable single-step mode. </span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoOCoR3NBg3c7TD-V0YI4f7KTz4mWFHYY-Tl-ClyGnjU8Htc2sw0KjsNjPoSkryOhy1Tr8eBLwy3qfFFr3nO554eiPR-D0_u5YdyZMV1lJj7GwzopC4K11OulrlUa3UGhpnnjAwlPt-JM/s1600/btf-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoOCoR3NBg3c7TD-V0YI4f7KTz4mWFHYY-Tl-ClyGnjU8Htc2sw0KjsNjPoSkryOhy1Tr8eBLwy3qfFFr3nO554eiPR-D0_u5YdyZMV1lJj7GwzopC4K11OulrlUa3UGhpnnjAwlPt-JM/s1600/btf-1.png" height="352" width="640" /></a></div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In single-step mode, the processor generates a debug exception (#DB) after each instruction. This allows the execution </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">state of a program to be inspected after each instruction.</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">However, things change under the special condition indicated below. When the </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><u><span style="color: purple;">BTF </span></u>(single-step on branches) flag in the IA32_DEBUGCTL MSR is set, the processor treats the TF flag in the EFLAGS register as a “single-step on branches” flag rather than a “single-step on instructions” flag. This mechanism allows single-stepping the processor on taken branches. Note that </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">the exception is a trap-class exception, which means it is generated after the branch instruction (call/ret/jmp) has executed.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8njpZ87BYhRxfHvOtR3eEtL-5p_HU8xy_cwg8En1ZkRkLlq84Y3wuRCb2XU8vHUZkBOBcwOcZkRJtzIYF7EpJP6Wr3DwOR5KbMo4MKdXqXr0prwBLSfMX4n99swesnvEyVgg0i4MtebE/s1600/btf-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8njpZ87BYhRxfHvOtR3eEtL-5p_HU8xy_cwg8En1ZkRkLlq84Y3wuRCb2XU8vHUZkBOBcwOcZkRJtzIYF7EpJP6Wr3DwOR5KbMo4MKdXqXr0prwBLSfMX4n99swesnvEyVgg0i4MtebE/s1600/btf-2.png" height="422" width="640" /></a></div>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">So now we can make the processor generate an exception (#DB) on (actually, right after) every taken branch, which includes every call/jmp/ret instruction. </span><br />
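<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">To make the TF/BTF interaction concrete, here is a small C model of the decision the processor makes after each instruction. It is a sketch under my own assumptions: the function names are hypothetical, the constants come from my reading of the Intel SDM (TF is EFLAGS bit 8, BTF is bit 1 of IA32_DEBUGCTL, MSR address 0x1D9), and the model assumes the #DB handler re-arms TF and BTF before resuming every time.</span><br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_TF         (1u << 8)    /* trap flag */
#define MSR_IA32_DEBUGCTL 0x1D9u       /* MSR that holds the BTF flag */
#define DEBUGCTL_BTF      (1ull << 1)  /* single-step on branches */

/* Would a single-step #DB be raised after an instruction retires?
 * `taken_branch` tells whether that instruction transferred control. */
bool raises_single_step_db(uint32_t eflags, uint64_t debugctl,
                           bool taken_branch)
{
    if (!(eflags & EFLAGS_TF))
        return false;              /* single-stepping is off */
    if (debugctl & DEBUGCTL_BTF)
        return taken_branch;       /* TF now means "step on branches" */
    return true;                   /* classic step on every instruction */
}

/* Count the #DB traps for a trace of `n` instructions, assuming the
 * handler restores TF and BTF before resuming each time. */
unsigned count_db(uint32_t eflags, uint64_t debugctl,
                  const bool *taken, unsigned n)
{
    assert(taken != 0 || n == 0);
    unsigned traps = 0;
    for (unsigned i = 0; i < n; i++)
        if (raises_single_step_db(eflags, debugctl, taken[i]))
            traps++;
    return traps;
}
```

<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">For a four-instruction trace in which only the second and fourth instructions are taken branches, the model reports four traps with TF alone and two traps with TF plus BTF.</span><br />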
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="color: blue; font-family: 'Trebuchet MS', sans-serif; font-size: large;"><u>Potential Usages</u></span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">This capability has some potential uses, for example:</span><br />
<br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Build a dynamic CFG (<i>Control Flow Graph</i>) without changes to the software binary or source code.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Detect unknown control-flow hijacking vulnerabilities by using dynamic taint analysis, e.g. by checking whether a value loaded into the program counter (EIP) is tainted, that is, has been influenced by data from untrusted inputs.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Perform software-invisible hooks on function calls (the target of a "call" instruction).</span></li>
</ol>
<br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">However, there are some </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><span style="color: blue;">limitations</span>:</span><br />
<br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Performance overhead (unless we use it in an environment where performance is not a big concern).</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">It cannot be enabled for jmp/ret/call individually. For example, it cannot trigger exceptions only on CALL instructions, only on RET instructions, or only on "indirect" jmp/call instructions (normally, code that uses direct jmp/call is trusted because of W^X on the code section). </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">It also has no CPL (user or kernel) filtering, but we can approximate this by toggling the EFLAGS.TF bit on system call entry and return. </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Because this #DB is controlled by an EFLAGS bit, it can easily be disabled with a "popf" instruction if the stack is controlled by an attacker. </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">It obviously requires OS kernel changes (does the OS provide a legitimate way</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> to register a #DB handler?). </span></li>
</ol>
<br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Please let me know if you have any comments.</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="background-color: white; color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: medium;"><span style="color: blue; letter-spacing: 0.400000005960465px;"><b>References:</b></span></span><br />
<span style="background-color: white; color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: medium;"><span style="letter-spacing: 0.400000005960465px;">Intel IA32 architecture software development manual:</span></span><br />
<a href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html" target="_blank"><span style="background-color: white; color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: medium;"><span style="letter-spacing: 0.400000005960465px;">http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html</span></span><span style="background-color: white; color: #4e4e4e; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 15px; line-height: 20.7900009155273px;"> </span></a><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-47494436291418471402015-01-16T01:46:00.001-08:002015-01-27T23:55:33.766-08:00How to defend against Stack Pivoting attacks on existing 32-bit x86 processor architecture?<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Stack Pivoting is a common technique widely used by vulnerability exploits to bypass hardware protections like NX/SMEP, or to chain ROP (<i>Return-Oriented Programming</i>, see the </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Wikipedia</span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"> <a href="http://en.wikipedia.org/wiki/Return-oriented_programming" target="_blank">link</a>) gadgets. However, there is NO hardware protection against it (at least for now). This post describes a software solution that detects Stack Pivoting at run time, and also points out some limitations imposed by current processor architecture implementations. (Please let me know if this is NOT a new idea, or NOT doable.)</span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The basic idea for detecting stack pivoting is: <span style="color: blue;">configure an appropriate stack base/limit in the stack segment register for each specific thread (normally, a modern OS sets the base/limit to cover 0~4G in 32-bit mode). Then, if a stack pivot moves the stack address (ESP) outside the defined range, the processor will generate a #SS fault (limit violation exception).</span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Before introducing my solution, let me briefly describe an existing solution that detects stack pivoting in Windows 8. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Microsoft implements a simple protection mechanism: every function associated with manipulating virtual memory, including the often-abused VirtualProtect and VirtualAlloc, now includes a check that the stack pointer, as contained in the trap frame, falls within the range defined by the Thread Environment Block (TEB; see StackBase/StackLimit in the picture below).</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXcPZdFWzj7NKJ735KSaZ_71qyLgEhLfhv3vE28Xdbv9i6A8ZnrniOf5w4QSjd2B5rB6lPzG2tQZlYFj9xnE_QwI1aV5FbFtjLXa0j2z7TewIairJjq_8KP1Ike7Ax6CUntQgMqjzBjaM/s1600/%23SS-4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXcPZdFWzj7NKJ735KSaZ_71qyLgEhLfhv3vE28Xdbv9i6A8ZnrniOf5w4QSjd2B5rB6lPzG2tQZlYFj9xnE_QwI1aV5FbFtjLXa0j2z7TewIairJjq_8KP1Ike7Ax6CUntQgMqjzBjaM/s1600/%23SS-4.png" height="196" width="640" /></a></div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">You can take a look at this <a href="http://vulnfactory.org/blog/2011/09/21/defeating-windows-8-rop-mitigation/" target="_blank">blog</a> for detailed descriptions. However, the blog author (Dan Rosenberg</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">) also describes an approach to bypassing it.</span><br />
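<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The kind of range check described above can be sketched in a few lines of C. This is an illustration only, not Microsoft's code: <i>struct stack_bounds</i> and <i>stack_pointer_in_bounds</i> are hypothetical names, and I assume the usual convention that StackBase is the highest address of the stack and StackLimit the lowest.</span><br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two TEB fields used by the check.
 * Assumption: the stack grows down from stack_base to stack_limit. */
struct stack_bounds {
    uintptr_t stack_base;   /* highest address (exclusive in this sketch) */
    uintptr_t stack_limit;  /* lowest address (inclusive in this sketch) */
};

/* True when the saved stack pointer still lies inside the thread's stack. */
bool stack_pointer_in_bounds(const struct stack_bounds *b, uintptr_t sp)
{
    assert(b->stack_limit < b->stack_base);
    return sp >= b->stack_limit && sp < b->stack_base;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A pivot into a heap buffer puts the stack pointer outside this range, so the check fails. Note that the check runs only inside the guarded functions, which is one reason it can be bypassed.</span><br />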
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Now I'm going to describe the solution and its limitations in greater detail.</span><br />
<br />
<span style="color: blue; font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.3pt;"><b><u>What's stack pivoting?</u></b></span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.3pt;">Please skip this section if you already know what stack pivoting is.</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.3pt;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.3pt;">With stack pivoting, attackers pivot from the real stack to a fake
stack, which can be any attacker-controlled buffer such as one on the heap, and thereby control program execution. This is achieved by controlling
the data pointed to by the stack pointer register (ESP, or RSP in 64-bit mode), such that each </span><b style="font-family: 'Trebuchet MS', sans-serif; font-size: x-large; letter-spacing: 0.3pt;"><i>ret</i></b><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.3pt;">
instruction increments the stack pointer and transfers execution to the next
address chosen by the attackers.</span><br />
<div class="MsoNormal" style="margin-bottom: 3pt;">
<span style="letter-spacing: 0.3pt;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></span></div>
<div class="MsoNormal" style="margin-bottom: 3pt;">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">Here are some good blog posts that briefly explain what stack pivoting is, how to pivot a stack, and how it is used in attacks (e.g. ROP).</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><a href="http://neilscomputerblog.blogspot.com/2012/06/stack-pivoting.html" target="_blank">http://neilscomputerblog.blogspot.com/2012/06/stack-pivoting.html</a></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><a href="http://blogs.mcafee.com/mcafee-labs/emerging-stack-pivoting-exploits-bypass-common-security" target="_blank">http://blogs.mcafee.com/mcafee-labs/emerging-stack-pivoting-exploits-bypass-common-security</a></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><a href="http://neilscomputerblog.blogspot.com/2013/04/rop-return-oriented-programming.html" target="_blank">http://neilscomputerblog.blogspot.com/2013/04/rop-return-oriented-programming.html</a></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span><span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><b><u><span style="letter-spacing: 0.400000005960465px;">#SS (</span><span style="letter-spacing: 0.400000005960465px;">Stack Fault Exception)</span></u></b></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">In the x86/Intel processor architecture, exception vector 12 is assigned to the #SS fault. Several conditions can result in a #SS fault. One of them, according to the IA32 architecture manual, is a <b><i>limit violation</i></b>, described as follows:</span></span><br />
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large; letter-spacing: 0.400000005960465px;"><i>A limit violation is detected during an operation that refers to the SS register. Operations that can cause a limit violation include stack-oriented instructions such as POP, PUSH, CALL, RET, IRET, ENTER, and LEAVE, as well as other memory references which implicitly or explicitly use the SS register (for example, MOV AX, [BP+6] or MOV AX, SS:[EAX+6]). The ENTER instruction generates this exception when there is not enough stack space for allocating local variables.</i></span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">So, basically, the processor checks the stack segment's <i>base and limit</i> values when executing any stack-oriented instruction. If the referenced stack address is out of range (as indicated by the base/limit values in the SS register, see picture below), a #SS fault is generated. </span></span><br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEig0utiZuqJyq0J71KnMZR65VvN0l4wjUZO68aOK2G5-XJ_XuQQDq1tkHsTgCkr_VDW5V4y23b-wtFJcKvzZWa3RKcH-kVu0moj0yTgLpoWtQZVg7kuG6NbdUdctRTIhscB0H1HTPagU34/s1600/%23SS-1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEig0utiZuqJyq0J71KnMZR65VvN0l4wjUZO68aOK2G5-XJ_XuQQDq1tkHsTgCkr_VDW5V4y23b-wtFJcKvzZWa3RKcH-kVu0moj0yTgLpoWtQZVg7kuG6NbdUdctRTIhscB0H1HTPagU34/s1600/%23SS-1.png" height="291" width="640" /></a></span></span></div>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">However, please note that this limit check only applies in 32-bit processor mode. I will come back to this later. </span></span><br />
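<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">As a rough model of the limit check, the C sketch below decides whether a stack access would violate the segment limit. It is my own simplification, not the processor's exact algorithm: <i>stack_access_faults</i> is a hypothetical name, <i>limit</i> is the effective limit in bytes, and, as far as I understand the SDM, the comparison is made on the offset within the segment while the base only relocates the segment in the linear address space. Expand-down segments (assumed here to have B = 1) accept offsets above the limit instead of below it.</span></span><br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Would an access of `size` bytes at segment offset `offset` violate the
 * limit of a 32-bit stack segment?
 *   expand-up:   valid offsets are 0 .. limit
 *   expand-down: valid offsets are limit+1 .. 0xFFFFFFFF (assuming B = 1)
 * An access that runs past 4 GB is treated as a fault in this sketch. */
bool stack_access_faults(uint32_t limit, bool expand_down,
                         uint32_t offset, uint32_t size)
{
    assert(size > 0);
    uint64_t last = (uint64_t)offset + size - 1;  /* last byte touched */

    if (expand_down)
        return offset <= limit || last > 0xFFFFFFFFull;
    return last > limit;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">With a limit of 0xFFFF on an expand-up segment, a 4-byte push at offset 0xFFFC is fine while one at 0xFFFD faults, which is the behavior the detection idea relies on.</span></span><br />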
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue; letter-spacing: 0.400000005960465px;"><b><u>Segment Register (SS)</u></b></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">Every segment register, including SS, has </span><span style="letter-spacing: 0.400000005960465px;">a “visible” part and a “hidden” part (see below). </span><span style="letter-spacing: 0.400000005960465px;">The hidden part is sometimes referred to as a </span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">“descriptor cache” or a “shadow register”.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGwBmH7AN6Nzga6IgQr5g-VW3Bk-TYBG20eh89yMNRWViFqVjqDwm540APdp4Fb4n08E6MVwHUJnsdb_Sb_mdbAQ2r2Z9kxiIok1d6-HrXsbsfjvcJYWLpkgCmN0S8EjPM_PZZFGS89kY/s1600/%23SS-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGwBmH7AN6Nzga6IgQr5g-VW3Bk-TYBG20eh89yMNRWViFqVjqDwm540APdp4Fb4n08E6MVwHUJnsdb_Sb_mdbAQ2r2Z9kxiIok1d6-HrXsbsfjvcJYWLpkgCmN0S8EjPM_PZZFGS89kY/s1600/%23SS-2.png" height="228" width="640" /></a></div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">According to the IA32 architecture, </span><span style="letter-spacing: 0.400000005960465px;">when a segment selector is loaded into the visible part of a segment </span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">register, the processor also loads the hidden part of the segment register with the base address, segment limit, and </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">access control information from the <i><b>segment descriptor</b> (see next section)</i> pointed to by the <i>segment selector</i>. The information cached </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">in the segment register (visible and hidden) allows the processor to translate addresses without taking extra bus </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">cycles to read the base address and limit from the segment descriptor.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<b><span style="color: blue;"><u><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"></span></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">Segment Descriptor </span></span></u></span></b><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">A segment descriptor (see picture below) is a data structure in a GDT or LDT that provides the processor with the size and location (e.g. base/limit) of </span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">a segment, as well as access control and status information.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-6mZo9072I8KjAyMOhusXhXcstdB8_x6dbootwahO9qXhwY8Qv1ZX-oFlXMVO4tT-irFLDO5DCPSB-SQdUGcUrUlyX0vO8VLyf3IX-FVEVk05VLcaJ9Ur6PnTL9tYlxulZirn2vCiw3k/s1600/%23SS-3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-6mZo9072I8KjAyMOhusXhXcstdB8_x6dbootwahO9qXhwY8Qv1ZX-oFlXMVO4tT-irFLDO5DCPSB-SQdUGcUrUlyX0vO8VLyf3IX-FVEVk05VLcaJ9Ur6PnTL9tYlxulZirn2vCiw3k/s1600/%23SS-3.png" height="393" width="640" /></a></div>
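<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">The base and limit are scattered across the 8-byte descriptor, so a small C sketch may help when reading the picture. The helper names are hypothetical; the bit layout follows the standard 32-bit descriptor format as I know it (limit 15:0 in bits 0-15, base 23:0 in bits 16-39, access byte in bits 40-47, limit 19:16 in bits 48-51, the AVL/L/D-B/G flags in bits 52-55, base 31:24 in bits 56-63).</span></span><br />

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack a 32-bit base, a 20-bit raw limit, the access byte and the 4-bit
 * flags nibble (bit 3 = G, bit 2 = D/B, bit 1 = L, bit 0 = AVL). */
uint64_t make_descriptor(uint32_t base, uint32_t limit20,
                         uint8_t access, uint8_t flags)
{
    assert(limit20 <= 0xFFFFFu);
    assert(flags <= 0xFu);

    uint64_t d = 0;
    d |= (uint64_t)(limit20 & 0xFFFFu);               /* limit 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;          /* base 23:0   */
    d |= (uint64_t)access << 40;                      /* type, DPL, P */
    d |= (uint64_t)((limit20 >> 16) & 0xFu) << 48;    /* limit 19:16 */
    d |= (uint64_t)flags << 52;                       /* AVL, L, D/B, G */
    d |= (uint64_t)(base >> 24) << 56;                /* base 31:24  */
    return d;
}

/* Recover the 32-bit base address from a descriptor. */
uint32_t descriptor_base(uint64_t d)
{
    uint32_t low  = (uint32_t)((d >> 16) & 0xFFFFFFu);
    uint32_t high = (uint32_t)((d >> 56) & 0xFFu);
    return low | (high << 24);
}

/* Recover the effective limit in bytes, honoring the granularity bit. */
uint32_t descriptor_limit_bytes(uint64_t d)
{
    uint32_t raw = (uint32_t)(d & 0xFFFFu) |
                   ((uint32_t)((d >> 48) & 0xFu) << 16);
    bool page_granular = ((d >> 55) & 1u) != 0;
    return page_granular ? ((raw << 12) | 0xFFFu) : raw;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">For example, a flat 0~4G writable data segment (base 0, raw limit 0xFFFFF, access byte 0x92, flags 0xC) packs to 0x00CF92000000FFFF, and its effective limit decodes to 0xFFFFFFFF.</span></span><br />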
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">A segment descriptor is pointed to by the corresponding segment selector. For example, a stack segment descriptor is referenced by the SS selector, and normally the OS uses different SS selectors for the kernel and for applications. </span></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">As indicated in the last section, the "hidden" part of a segment register is loaded from the corresponding segment descriptor (in the GDT, which resides in RAM). However, it is </span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">software's responsibility to reload the segment registers when </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">the segment descriptor tables are modified (e.g. when the base and/or limit values are changed). If this is not done, an old segment descriptor cached in a segment register might </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">be used after its memory-resident version (the segment descriptor in the GDT) has been modified.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">So, <b>when OS system software modifies the stack base/limit in the SS segment descriptor for a particular thread, it must reload the corresponding SS segment register</b>. According to the x86/Intel architecture, two kinds of load instructions are provided for loading the segment registers:</span></span><br />
<ol>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">Direct load instructions such as the MOV, POP, LSS instructions. These instructions </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">explicitly reference the segment registers.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">Implicit load instructions such as the far pointer versions of the CALL, JMP, and RET instructions, the SYSENTER and SYSEXIT instructions, and the IRET, INTn, INTO and INT3 instructions. These instructions change the contents of the SS register (and sometimes other segment registers) as an incidental part of their operation.</span></span></li>
</ol>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><b><span style="color: blue;"><u>OS Implementation</u></span></b></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">To simplify the discussion, I'll take a user mode application as the example for stack pivoting detection. </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue; letter-spacing: 0.400000005960465px;">Normally, OS software allocates a unique stack space for each user mode thread. We can change the thread scheduler so that, as part of thread context switching, it modifies the stack base/limit values in the SS segment descriptor (in the GDT) pointed to by the user mode SS selector</span><span style="color: #0b5394; letter-spacing: 0.400000005960465px;">. </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">When that user mode thread resumes execution in user mode after the switch from the kernel stack to the user stack (e.g. via IRET, which implicitly loads SS), the base/limit values in RAM are automatically loaded into the "hidden" part of the SS segment register. </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue; letter-spacing: 0.400000005960465px;">Then, </span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><span style="color: blue;">if an attack uses stack pivoting to move the user mode stack pointer (ESP) outside the defined range (base/limit in the "hidden" part of the SS segment register), the processor will generate a #SS fault (limit violation exception) on the next stack access, and the anti-malware software can then detect the attack.</span> </span><br />
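<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The check described above can be modelled in a few lines. This is only a sketch under simplifying assumptions: it treats SS as an expand-up segment whose base is the lowest stack address and whose limit is the stack size minus one, so ESP is an offset relative to that base. The names <i>SegmentCache</i>, <i>load_ss</i> and <i>check_stack_access</i> are made up for illustration and are not part of any real OS or CPU interface.</span><br />

```python
from dataclasses import dataclass


class StackFault(Exception):
    """Stands in for the #SS exception raised on a stack limit violation."""


@dataclass
class SegmentCache:
    """Model of the "hidden" part of SS: the cached base and limit."""
    base: int
    limit: int  # highest valid offset (expand-up segment assumed)


def load_ss(stack_low: int, stack_size: int) -> SegmentCache:
    """Model the reload of SS from a per-thread descriptor (hypothetical helper)."""
    return SegmentCache(base=stack_low, limit=stack_size - 1)


def check_stack_access(ss: SegmentCache, esp: int, size: int = 4) -> int:
    """Return the linear address of the access, or raise StackFault."""
    last_offset = esp + size - 1
    if esp < 0 or last_offset > ss.limit:
        raise StackFault(f"offset {esp:#x} is outside limit {ss.limit:#x}")
    return ss.base + esp


# A thread whose 64 KB stack starts at linear address 0x00120000.
ss = load_ss(0x00120000, 0x10000)
print(hex(check_stack_access(ss, 0xFF00)))   # a normal push/pop near the stack top
try:
    check_stack_access(ss, 0x0A000000)        # ESP pivoted to attacker-chosen memory
except StackFault as fault:
    print("stack pivot detected:", fault)
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In this model the pivoted ESP is far beyond the per-thread limit, so the access is rejected, which corresponds to the #SS fault that the anti-malware software would observe.</span><br />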
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /><b><span style="color: blue;"><u>Limitations</u></span></b></span></span><br />
<br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">One big problem is that this solution cannot be applied in x86/Intel 64-bit mode. This is b</span></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">ecause the SS (and DS/ES) segment registers are not used in 64-bit mode: their fields (base, limit, and attribute) in the segment descriptors of the GDT are ignored. Address calculations that reference the ES, DS, or SS segments are treated as if the segment base </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">is zero. So a #SS exception due to a "limit violation" cannot be generated.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">Because the SS segment descriptor is located in kernel memory, an application cannot modify it directly from user mode. Hence, this solution cannot be applied to threads that are scheduled in user mode. One example is Microsoft UMS (User-Mode Scheduling), a lightweight mechanism that applications can use to schedule their own threads. An application can switch between UMS threads in user mode without involving the system scheduler. For details, please see the link</span><br /><a href="http://msdn.microsoft.com/en-us/library/windows/desktop/dd627187(v=vs.85).aspx" style="letter-spacing: 0.400000005960465px;" target="_blank">http://msdn.microsoft.com/en-us/library/windows/desktop/dd627187(v=vs.85).aspx</a> Note that this feature is not available on 32-bit versions of Windows :) <span style="letter-spacing: 0.400000005960465px;"> </span></span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large; letter-spacing: 0.400000005960465px;">It requires extra changes to the thread scheduler (as part of context switching) in a 32-bit OS, but the change is minimal; please see above.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">One assumption is that the thread stack is virtually contiguous in the address space, so that the base/limit checks can apply. </span></span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">It cannot detect stack pivoting to another location that is still part of the stack (i.e. still within the base/limit range).</span></span></li>
</ol>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue; letter-spacing: 0.400000005960465px;"><b>References:</b></span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;">Intel IA32 architecture software development manual:</span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><a href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html" target="_blank">http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html</a></span></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="letter-spacing: 0.400000005960465px;"><br /></span></span></div>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Transparent ROP Detection using CPU Performance Counters: <a href="https://www.trailofbits.com/threads/2014/transparent_rop_detection_using_cpu_perfcounters.pdf" target="_blank">https://www.trailofbits.com/threads/2014/transparent_rop_detection_using_cpu_perfcounters.pdf</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Defeating Windows 8 ROP Mitigation:</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://vulnfactory.org/blog/2011/09/21/defeating-windows-8-rop-mitigation/" target="_blank">http://vulnfactory.org/blog/2011/09/21/defeating-windows-8-rop-mitigation/</a></span><br />
<br />
<br />Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com5tag:blogger.com,1999:blog-385094692918992828.post-11898697744221031222014-12-15T06:49:00.002-08:002015-01-13T04:42:08.188-08:00Using LBR (Last Branch Record) feature to detect ret2usr (return-to-user) attack w/ MMU paging structure corruption<span style="font-size: large;"><span style="background-color: white; color: #4e4e4e; line-height: 25.2000007629395px;"><span style="font-family: Trebuchet MS, sans-serif;">SMEP (Supervisor Mode Execution Prevention) is a mitigation that aims to prevent the CPU from running code from user-mode while in kernel-mode, however </span></span><span style="background-color: white; color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; line-height: 25.2000007629395px;">this post (</span><a href="https://labs.mwrinfosecurity.com/blog/2014/08/15/windows-8-kernel-memory-protections-bypass/" style="background-color: white; color: #0200a4; font-family: 'Trebuchet MS', sans-serif; line-height: 25.2000007629395px; text-decoration: none;" target="_blank">Windows 8 Kernel Memory Protections Bypass</a><span style="background-color: white; color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; line-height: 25.2000007629395px;">) presents a generic technique for exploiting kernel vulnerabilities with bypassing SMEP. Unlike my previous post (</span><span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif;"><span style="line-height: 25.2000007629395px;"><a href="http://hypervsir.blogspot.com/2014/11/page-structure-table-corruption-attacks.html" target="_blank">Page Table Structure Corruption Attacks - How to Mitigate it?</a>) that presented a mitigation to that attack, this post will present a solution to detect such a ret2usr attack due to MMU paging structure corruption.</span></span></span><br />
<span style="font-size: large;"><span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif;"><span style="line-height: 25.2000007629395px;"><br /></span></span></span>
<br />
<a name='more'></a><span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;">In recent Intel x86 processors, the LBR (Last Branch Record) feature has some filtering capabilities, such as CPL (current privilege level) filtering and indirect jmp/call filtering. </span><br />
<span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;">For instance, for a specific suspicious process or application, we can configure LBR to record branch addresses (like <i><b>LastBranchToIP</b></i></span><span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;">) only for indirect jmp/call and ret instructions executed in kernel mode (CPL=0). </span><br />
<span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;">Therefore, by analyzing the </span><span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;"><i><b>LastBranchToIP</b></i></span><span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;"> </span><span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;">addresses in the BTS (branch trace store) buffer resident in system RAM, we can tell whether or not a "ret2usr" attack occurred. </span><br />
<span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;">The rule is pretty simple: </span><br />
<span style="color: blue;"><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">check all the </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><i><b>LastBranchToIP</b></i></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">addresses; if one or more of them fall in the range 0~2GB, then a "ret2usr" attack occurred in the "monitored" process or application.</span></span><br />
<span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;">This works because the user mode virtual address range is 0~2GB by default on a 32-bit Windows operating system, and a branch target in that range is still a user mode address even if the U/S bit of the paging-structure entry (e.g. PTE) has been corrupted through a write-what-where vulnerability so that the user mode memory is interpreted as kernel memory. </span><span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;"> </span><br />
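<span style="color: #4e4e4e; font-family: Trebuchet MS, sans-serif; font-size: large;">The rule can be sketched as a simple filter over the recorded branch targets. This is a minimal illustration, assuming the records have already been read out of the buffer into a list of integers and that the default 2G/2G split applies; <i>USER_SPACE_TOP</i> and <i>find_ret2usr_targets</i> are names invented here.</span><br />

```python
# Top of the user-mode address range on 32-bit Windows with the default
# 2G/2G split (assumption: the /3GB boot option is not in use).
USER_SPACE_TOP = 0x80000000


def find_ret2usr_targets(branch_to_ips):
    """Return the kernel-mode branch targets that land in user address space.

    branch_to_ips: LastBranchToIP values recorded with CPL=0 filtering, so
    every entry is the destination of a branch taken in kernel mode.
    """
    return [ip for ip in branch_to_ips if 0 <= ip < USER_SPACE_TOP]


# Hypothetical sample: two kernel targets and one target in user space.
records = [0x8054A1B0, 0x00401000, 0xF7A3C000]
suspicious = find_ret2usr_targets(records)
if suspicious:
    print("possible ret2usr, targets:", [hex(ip) for ip in suspicious])
```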
<span style="color: #4e4e4e; font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-90910245474186509232014-12-11T23:10:00.002-08:002015-01-27T23:56:50.198-08:00New security feature - Control Flow Guard (CFG) - available in Visual Studio 2015 Preview<span style="font-weight: normal;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This <a href="http://blogs.msdn.com/b/vcblog/archive/2014/12/08/visual-studio-2015-preview-work-in-progress-security-feature.aspx" target="_blank">blog </a>announced that </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">the Preview for Visual Studio 2015 includes a new, work-in-progress feature, called Control Flow Guard (CFG). </span></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<span style="font-weight: normal;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></span>
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large; font-weight: normal;">It says </span><br />
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">"Whilst compiling and linking code, <span style="color: blue;">it analyzes and discovers every location that any indirect-call instruction can reach. It builds that knowledge into the binaries (in extra data structures). It also injects a check, before every indirect-call in your code, that ensures the target is one of those expected, safe, locations</span>. If that check fails at runtime, the Operating System closes the program"</span></blockquote>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I will evaluate this, e.g. performance impact and effectiveness against JOP/ROP attacks,</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> when I'm free, and update this post then :-)</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><u><i><b>Update:</b></i></u></span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">MJ0011, "Windows 10 Control Flow Guard Internals"</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><a href="http://webhard.milkgun.kr/%EC%9E%90%EB%A3%8C/POC%202014/MJ0011%20-%20Windows%2010%20Control%20Flow%20Guard%20Internals.pdf" target="_blank">http://webhard.milkgun.kr/%EC%9E%90%EB%A3%8C/POC%202014/MJ0011%20-%20Windows%2010%20Control%20Flow%20Guard%20Internals.pdf</a></span><br />
<br />
<br />Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-41097766779404776962014-11-21T07:53:00.001-08:002014-11-21T08:08:26.583-08:00Defending Against ret2dir Attacks (partially) with Virtualization Technology?<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I was so excited when recently reading the paper (<a href="http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf" target="_blank">ret2dir: Rethinking Kernel Isolation</a>) from <a href="http://www.cs.columbia.edu/~vpk/" target="_blank">Vasileios P. Kemerlis</a>. This post is basically going to introduce the idea of ret2dir attack, and how to prevent such an attack with hardware virtualization technology, actually partially. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The ret2dir (Return-to-Direct-Mapped-Memory) attack abuses the <i><b>physmap </b></i>design in the kernel virtual memory management of many Linux/Unix OSs, and it can bypass SMEP/SMAP, PXN, and KERNEXEC/UDEREF. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
So, what is <b><i>physmap</i></b>? </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
It is the kernel's direct mapping of physical memory, an address aliasing technique designed to improve performance. According to the author, "Given the existence of physmap, whenever the kernel (buddy allocator) maps a page frame to user space, it effectively creates an alias ( synonym) of user content in kernel space!"</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
To be more specific, the key point is that<span style="color: blue;"> for a physical page covered by the </span><i><span style="color: blue;">physmap </span></i><span style="color: blue;">region, there may be two virtual addresses that are translated (mapped) to the same physical memory, i.e. an N:1 mapping where N is 2</span>. One virtual address is in the kernel address space (page table U/S bit = 0), and the other is in the user address space (U/S = 1). See the picture below from the ret2dir paper.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTl_LktQj8ZKJS3G4lM9y7XTHRHMYaW4lTkyNvTtVglkmFc8aSJWP3yFP-RayhBR3JoiMe8YSrPFp52DxF29DU6EbSaCJj2xyBjn7x6t__FQtlVsyIwZ4BLzxQFobSP5XNBy4EmcKcvIg/s1600/ret2dir.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTl_LktQj8ZKJS3G4lM9y7XTHRHMYaW4lTkyNvTtVglkmFc8aSJWP3yFP-RayhBR3JoiMe8YSrPFp52DxF29DU6EbSaCJj2xyBjn7x6t__FQtlVsyIwZ4BLzxQFobSP5XNBy4EmcKcvIg/s1600/ret2dir.png" height="588" width="640" /></span></a></div>
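<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make the aliasing concrete, here is a small sketch of how the kernel-side alias of a user page follows from its page frame number (PFN). The physmap base used below is an assumption (the value commonly quoted for x86-64 Linux kernels of that period); other architectures and kernel versions use different values, and <i>physmap_alias</i> is a name made up for this example.</span><br />

```python
PAGE_SHIFT = 12                     # 4 KB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_OFFSET = 0xFFFF880000000000    # assumed start of physmap (x86-64 Linux)


def physmap_alias(pfn: int, offset_in_page: int = 0) -> int:
    """Kernel virtual address that aliases the given physical page frame."""
    if pfn < 0:
        raise ValueError("PFN must be non-negative")
    if not 0 <= offset_in_page < PAGE_SIZE:
        raise ValueError("offset must fall inside one page")
    return PAGE_OFFSET + (pfn << PAGE_SHIFT) + offset_in_page


# If a user page is backed by PFN 0x1234, the same bytes are also reachable
# in kernel space at this address (U/S = 0 in the kernel's page tables).
print(hex(physmap_alias(0x1234, 0x10)))
```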
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In order to make this kind of attack work, the paper presents several innovative techniques. For example, how to force the physical pages allocated for user memory to appear in the physmap area, how to obtain PFN information through an information leak, and how to do physmap spraying (<a href="http://hypervsir.blogspot.com/2014/11/what-does-transactional-synchronization.html" target="_blank">Can TSX help x-spaying exploit writing?</a>), etc. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
Besides, the author also proposes a solution in the Linux kernel to mitigate the ret2dir attack: eXclusive Page Frame Ownership (XPFO). </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The basic idea is very straightforward: it enforces exclusive ownership (of page frames) by either the kernel or userland, unless sharing is explicitly requested by a kernel component (e.g., to implement zero-copy shared buffers). This means that whenever a page frame is allocated to userland, it is unmapped from <b><i>physmap</i></b>; when such a page frame is reclaimed from userland, it is put back into <b><i>physmap</i></b>. </span><br />
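<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The ownership rule can be illustrated with a toy model. This is not the XPFO implementation from the paper, only a sketch of the bookkeeping described above; the class <i>PhysmapModel</i> and its methods are hypothetical.</span><br />

```python
class PhysmapModel:
    """Toy model of exclusive page frame ownership (not real kernel code)."""

    def __init__(self, num_frames: int):
        # Initially every frame is mapped in physmap and owned by the kernel.
        self.mapped_in_physmap = set(range(num_frames))
        self.user_owned = set()

    def alloc_to_user(self, pfn: int) -> None:
        """Give a frame to userland and remove its kernel alias."""
        if pfn not in self.mapped_in_physmap:
            raise ValueError(f"frame {pfn} is not available")
        self.mapped_in_physmap.remove(pfn)
        self.user_owned.add(pfn)

    def reclaim_from_user(self, pfn: int) -> None:
        """Take a frame back from userland and restore its kernel alias."""
        if pfn not in self.user_owned:
            raise ValueError(f"frame {pfn} is not owned by userland")
        self.user_owned.remove(pfn)
        self.mapped_in_physmap.add(pfn)

    def kernel_alias_exists(self, pfn: int) -> bool:
        """True if the kernel could reach this frame through physmap."""
        return pfn in self.mapped_in_physmap


model = PhysmapModel(num_frames=8)
model.alloc_to_user(3)
print(model.kernel_alias_exists(3))   # the alias a ret2dir attack relies on is gone
model.reclaim_from_user(3)
print(model.kernel_alias_exists(3))   # alias restored once the kernel owns the frame
```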
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="color: blue;"><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">In a virtualization environment, however, the solution from my previous post (</span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://hypervsir.blogspot.com/2014/11/how-to-implement-software-based.html" target="_blank">How to Implement a software-based SMEP with Virtualization/Hypervisor Technology</a></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">) may be able to stop or defend against ret2dir attacks. But it also has a limitation: the virtualization-based software SMEP can only stop execution of shellcode/payload in the <b><i>physmap</i></b> area.</span></span><br />
<span style="color: blue;"><br /></span>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;">In order to stop read/write access to the payload from kernel space in ret2dir attacks (which could be used for ROP attacks, e.g. by using the payload as the kernel stack after stack pivoting), t</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><span style="color: blue;">echnically, we can extend that solution to implement a virtualization-based software SMAP to defend against ret2dir attacks.</span> </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">
<br />
</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><u><i><b>References:</b></i></u></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">ret2dir: Rethinking Kernel Isolation -- paper:</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf" target="_blank">http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
and its slides:</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="https://www.usenix.org/sites/default/files/conference/protected-files/sec14_slides_kemerlis.pdf" target="_blank">https://www.usenix.org/sites/default/files/conference/protected-files/sec14_slides_kemerlis.pdf</a></span><br />
<h2 style="color: #262626; margin: 0px; padding: 0.2em;">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></h2>
<h2 style="margin: 0px; padding: 0.2em;">
<span style="font-family: Trebuchet MS, sans-serif; font-size: large; font-weight: normal;"><span style="color: black;">OpenBSD</span><span style="color: #262626;"> fix to remove executable permission in direct-map pages (recently): </span></span></h2>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="https://secure.freshbsd.org/commit/openbsd/52e8e9f52ef21a21a315187623fafe4800efd868" target="_blank">https://secure.freshbsd.org/commit/openbsd/52e8e9f52ef21a21a315187623fafe4800efd868</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-71206655587276162792014-11-21T00:02:00.000-08:002014-11-21T00:47:04.284-08:00Improve Performance for Separating Kernel and User Address Space with Process-Context Identifiers (PCIDs) <span style="font-family: Trebuchet MS, sans-serif;">This post is not talking about any new idea, just about what I'm thinking..</span><br />
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif;">Back in 2003, <a href="http://lwn.net/Articles/39283/" target="_blank"><span style="background-color: white;">Ingo Molnar proposed </span>4G/4G split on x86 with 64 GB RAM support</a> to separate the user and kernel mode virtual address spaces. Another post, <a href="http://lwn.net/Articles/39925/" target="_blank">64GB on 32-bit systems</a>, also talks about this. The original motivation, as stated in that post, was as follows. </span><br />
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;"><i>The 4G/4G split feature is primarily intended for large-RAM x86 systems, which want to (or have to) get more kernel/user VM, at the expense of per-syscall TLB-flush overhead</i>.</span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;">This clearly matters for a 32-bit OS. For example, by default 32-bit Linux uses the upper 1GB of the virtual address space for the kernel and the lower 3GB for user space, while on Windows a 2G/2G <span style="background-color: white;">split is used by default.</span></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">At that time, security was not a big concern. However, in 2010 (maybe earlier), the PaX team revisited this idea in <a href="http://grsecurity.net/pipermail/grsecurity/2010-April/001024.html" target="_blank">[grsec] Announcing UDEREF/amd64</a>. This time the motivation was not kernel/user VM size, but separating the kernel and user address spaces to mitigate many ret2usr attacks. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Before the hardware PCID (Process-Context Identifiers) feature was introduced, the TLB flush caused by switching between user and kernel page tables on every </span><span style="font-family: 'Trebuchet MS', sans-serif;">syscall or interrupt/exception</span><span style="font-family: 'Trebuchet MS', sans-serif;"> had a significant performance cost. In the latest UDEREF/amd64 implementation (see this link: </span><span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;"><a href="http://grsecurity.net/stable/grsecurity-3.0-3.14.24-201411150026.patch" target="_blank">http://grsecurity.net/stable/grsecurity-3.0-3.14.24-201411150026.patch</a>, <a href="http://hypervsir.blogspot.com/2014/11/implement-software-based-smep-with-non.html" target="_blank">PaX team told me that they will blog this</a>), the PCID hardware feature is used to reduce that cost. </span><br />
<span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;"><br /></span>
<span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;"><br /></span>
<span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;">So what is PCID? See the text below from the Intel SDM.</span><br />
<blockquote class="tr_bq" style="text-align: justify;">
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;">Process-context identifiers (PCIDs) are a facility by which a logical processor may cache information for multiple linear-address spaces. The processor may retain cached information when software switches to a different linear address space with a different PCID.</span></span></blockquote>
<blockquote class="tr_bq" style="text-align: justify;">
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;">When a logical processor creates entries in the TLBs and paging-structure caches, it associates those entries with the current PCID. When using entries in the TLBs and paging-structure caches to translate a linear address, a logical processor uses only those entries associated with the current PCID.</span></span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="background-color: white; line-height: 18.8999996185303px; text-align: justify;"></span></span><br />
<span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;"><br /></span>
<span style="background-color: white; font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px; text-align: justify;">However, the PCID feature is only available in x64 mode (Intel </span><span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;">IA-32e mode), which means only a 64-bit OS can use it. The PCID is a 12-bit value stored in the CR3 register for each address space; see below from the SDM. </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;"><br /></span></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_sNyeCtsnN0bAezF87WpfBrTkUwJJGmN3nr2sWc34EFZ8MeuKDqSAjrruT8UtsJalR5L8gmjghG7Qe9RaoKl7TWu68hpKuNkTkZCCoSLIvoG9Mx0gP9esSzD-pXgKqwIEBcMhhb3uru8/s1600/PCID.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_sNyeCtsnN0bAezF87WpfBrTkUwJJGmN3nr2sWc34EFZ8MeuKDqSAjrruT8UtsJalR5L8gmjghG7Qe9RaoKl7TWu68hpKuNkTkZCCoSLIvoG9Mx0gP9esSzD-pXgKqwIEBcMhhb3uru8/s1600/PCID.png" height="136" width="640" /></a></div>
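<span style="font-family: Trebuchet MS, sans-serif;">The CR3 layout can be sketched as plain bit arithmetic. This assumes CR4.PCIDE = 1, with the PCID in bits 11:0 and the physical base of the top-level paging structure in bits 51:12; bit 63 of the value written by MOV to CR3 asks the processor not to invalidate the cached entries for that PCID. The helper names below are invented for illustration.</span><br />

```python
PCID_MASK = 0xFFF                   # bits 11:0 hold the PCID
ADDR_MASK = 0x000FFFFFFFFFF000      # bits 51:12 hold the PML4 physical base
CR3_NOFLUSH = 1 << 63               # bit 63 of the MOV-to-CR3 source operand


def make_cr3(pml4_phys: int, pcid: int, no_flush: bool = False) -> int:
    """Compose the value to load into CR3 for one address space."""
    if pml4_phys & ~ADDR_MASK:
        raise ValueError("PML4 base must be 4 KB aligned and fit in bits 51:12")
    if not 0 <= pcid <= PCID_MASK:
        raise ValueError("PCID is a 12-bit value")
    value = pml4_phys | pcid
    if no_flush:
        value |= CR3_NOFLUSH
    return value


def cr3_pcid(value: int) -> int:
    """Extract the PCID field from a CR3 value."""
    return value & PCID_MASK


# Hypothetical use: separate kernel and user page tables with their own PCIDs,
# switched without flushing the TLB entries tagged with either PCID.
kernel_cr3 = make_cr3(0x1000, pcid=0, no_flush=True)
user_cr3 = make_cr3(0x2000, pcid=1, no_flush=True)
print(hex(kernel_cr3), hex(user_cr3))
```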
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;">To assist OS software, a new instruction, INVPCID (Invalidate Process-Context Identifier), was also introduced to invalidate mappings in the translation lookaside buffers (TLBs) and paging-structure caches based on the process-context identifier (PCID). It is similar to INVEPT and INVVPID in Intel Virtualization Technology: the former invalidates information cached from the EPT paging structures, and the latter invalidates mappings in the TLBs and paging-structure caches based on the virtual-processor identifier (VPID). </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;"><br /></span></span>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="line-height: 18.8999996185303px;">There are four INVPCID types (granularities</span></span><span style="font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px;">) currently defined (copied from IA32/Intel SDM):</span><br />
<br />
<ol>
<li><span style="font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px;"><b><i>Individual-address invalidation</i></b>: If the INVPCID type is 0, the logical processor invalidates mappings—except global translations—for the linear address and PCID specified in the INVPCID descriptor. In some cases, the instruction may invalidate global translations or mappings for other linear addresses (or other PCIDs) as well.</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px;"><i><b>Single-context invalidation</b></i>: If the INVPCID type is 1, the logical processor invalidates all mappings—except global translations—associated with the PCID specified in the INVPCID descriptor. In some cases, the instruction may invalidate global translations or mappings for other PCIDs as well.</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px;"><b><i>All-context invalidation</i></b>, including global translations: If the INVPCID type is 2, the logical processor invalidates all mappings—including global translations—associated with any PCID.</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; line-height: 18.8999996185303px;"><i><b>All-context invalidation</b></i>: If the INVPCID type is 3, the logical processor invalidates all mappings—except global translations—associated with any PCID. In some case, the instruction may invalidate global translations as well.</span></li>
</ol>
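<span style="font-family: Trebuchet MS, sans-serif;">As a rough illustration of how these four types are used, the sketch below builds the 128-bit INVPCID descriptor (PCID in bits 11:0, linear address in bits 127:64, as laid out in the SDM). The names <i>invpcid_type</i>, <i>invpcid_desc</i>, <i>make_invpcid_desc</i> and the macro <i>EXAMPLE_RING0_BUILD</i> are made up for this example. The instruction itself is only a guarded inline-assembly stub (GCC/Clang syntax assumed), because INVPCID can only be executed in ring 0.</span><br />

```c
#include <stdint.h>

/* The four INVPCID types listed above (enum names are hypothetical). */
enum invpcid_type {
    INVPCID_INDIVIDUAL_ADDRESS = 0, /* one linear address in one PCID        */
    INVPCID_SINGLE_CONTEXT     = 1, /* all non-global mappings of one PCID   */
    INVPCID_ALL_INCL_GLOBAL    = 2, /* all PCIDs, including global mappings  */
    INVPCID_ALL_NON_GLOBAL     = 3  /* all PCIDs, global mappings may remain */
};

/* 128-bit in-memory descriptor: bits 11:0 = PCID, bits 63:12 reserved
 * (must be zero), bits 127:64 = linear address. */
struct invpcid_desc {
    uint64_t pcid;
    uint64_t linear_address;
};

/* Build a descriptor; PCID bits above bit 11 are dropped so that the
 * reserved field stays zero. */
static struct invpcid_desc make_invpcid_desc(uint16_t pcid, uint64_t linear_address)
{
    struct invpcid_desc d;
    d.pcid = (uint64_t)(pcid & 0x0FFFu);
    d.linear_address = linear_address;
    return d;
}

#ifdef EXAMPLE_RING0_BUILD /* hypothetical macro: only define it for kernel-mode builds */
static inline void invpcid(enum invpcid_type type, const struct invpcid_desc *desc)
{
    /* AT&T operand order: memory descriptor first, then the type register. */
    __asm__ volatile("invpcid %0, %1"
                     :
                     : "m"(*desc), "r"((uint64_t)type)
                     : "memory");
}
#endif
```

<span style="font-family: Trebuchet MS, sans-serif;">For types 2 and 3 the descriptor contents are not used to select mappings, so a zeroed descriptor would be a reasonable choice there.</span><br />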
<br />
<br />
<i><b><span style="font-size: large;"><br /></span></b></i>
<i><b><span style="font-size: large;">References:</span></b></i><br />
Intel 64 and IA-32 Architectures Software Developer's Manual (just search for it on Google)<br />
<br />
4G/4G split on x86, 64 GB RAM (and more) support<br />
<a href="http://lwn.net/Articles/39283/" target="_blank">http://lwn.net/Articles/39283/</a><br />
<br />
64GB on 32-bit systems<br />
<a href="http://lwn.net/Articles/39925/" target="_blank">http://lwn.net/Articles/39925/</a><br />
<br />
[grsec] Announcing UDEREF/amd64<br />
<a href="http://grsecurity.net/pipermail/grsecurity/2010-April/001024.html" target="_blank">http://grsecurity.net/pipermail/grsecurity/2010-April/001024.html</a><br />
<br />
Grsecurity patch download<br />
<a href="http://grsecurity.net/download.php" target="_blank">http://grsecurity.net/download.php</a><br />
<br />
<br />Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-33997519595859959672014-11-18T17:22:00.001-08:002014-11-19T05:48:47.237-08:00Anybody knows How to Legitimately Register a PMI (PMU Performance Monitor Interrupt) Callback Handler on Windows OS? <span style="font-family: Trebuchet MS, sans-serif;">According to the Intel Software Developer's Manual (SDM), when a PMU (Performance Monitor Unit) counter overflows, or when the LBR (Last Branch Record)/BTS (Branch Trace Store) storage is nearly full, the processor delivers a PMI (Performance Monitor Interrupt). In the Linux kernel, the PMU code (used by the perf tool) delivers the PMI as an NMI, and we can directly change the kernel source to add our own PMI handler for a particular event. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">But on Windows, how can a driver register a PMI handler callback without hooking the kernel IDT? Does anybody know? </span><br />
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif;">I've searched almost all of the<a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff544200(v=vs.85).aspx" target="_blank"> Driver Support Routines </a>that MSDN documents for kernel-mode drivers, but I couldn't find a documented kernel API for this. </span><span style="font-family: 'Trebuchet MS', sans-serif;">However, inspecting a 32-bit Windows 7 system with WinDbg turned up something interesting. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<br />
<span style="font-family: Trebuchet MS, sans-serif;">According to the SDM, the local APIC must be set up to deliver the PMI, and a software handler for that interrupt must be installed at the corresponding vector in the IDT (Interrupt Descriptor Table).</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">More specifically, the local APIC LVT (Local Vector Table) Performance Counter Register must be set up for this purpose. In xAPIC mode, this register is memory-mapped at the APIC base physical address + offset 0x340. When x2APIC mode is enabled, it is accessed instead through the IA32_X2APIC_LVT_PMI MSR (index 0x834), which is called the x2APIC LVT Performance Monitor register.</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">On a 32-bit Windows 7 system, I used WinDbg to read the IA32_APIC_BASE MSR (0x1B) with the <span style="color: blue;"><i>rdmsr </i></span>command:</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif;"><br /></span>
<span style="color: blue;">
<span style="font-family: Courier New, Courier, monospace;">kd> rdmsr 0x1b</span></span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">msr[1b] = 00000000`fee00900</span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif;">From the layout of the IA32_APIC_BASE MSR shown below, we can see that the APIC base physical address is 0xfee00000 and that xAPIC mode is enabled on my system (bit 11 is set and bit 10 is clear). This means the LVT Performance Counter Register MMIO address is 0xfee00340. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhunkp79TrIiG8QOYVBgCdYGUgyAxmPsfOyGmdii2LNa1GJMHwqYtiGt1RnOfjRsnKiJaZonF8_oAQnbMCb49NkGXyMR8G7ix-rNvfiJoKtkkLDtfcX5v7M7_pMG7WNKuXixeu471v1wdY/s1600/pmi-2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhunkp79TrIiG8QOYVBgCdYGUgyAxmPsfOyGmdii2LNa1GJMHwqYtiGt1RnOfjRsnKiJaZonF8_oAQnbMCb49NkGXyMR8G7ix-rNvfiJoKtkkLDtfcX5v7M7_pMG7WNKuXixeu471v1wdY/s1600/pmi-2.png" height="222" width="640" /></a></div>
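<span style="font-family: Trebuchet MS, sans-serif;">The decoding step above can be sketched in a few lines of C. This is only an illustration of the bit layout in the picture: the macro and function names are my own, and the address mask assumes that the reserved upper bits of the MSR read as zero.</span><br />

```c
#include <stdint.h>

/* IA32_APIC_BASE (MSR 0x1B) fields; names are hypothetical. */
#define APIC_BASE_BSP         (1ull << 8)   /* this CPU is the bootstrap processor */
#define APIC_BASE_X2APIC_EN   (1ull << 10)  /* EXTD: x2APIC mode enabled           */
#define APIC_BASE_GLOBAL_EN   (1ull << 11)  /* EN: APIC globally enabled           */
#define APIC_BASE_ADDR_MASK   0xFFFFFFFFFFFFF000ull /* bits 12 and up (assumes reserved bits are 0) */

#define APIC_LVT_PERF_OFFSET  0x340u        /* xAPIC MMIO offset of the LVT perf register */
#define MSR_X2APIC_LVT_PMI    0x834u        /* the same register in x2APIC mode           */

static uint64_t apic_base_phys(uint64_t msr)  { return msr & APIC_BASE_ADDR_MASK; }
static int apic_is_enabled(uint64_t msr)      { return (msr & APIC_BASE_GLOBAL_EN) != 0; }
static int apic_is_x2apic(uint64_t msr)       { return (msr & APIC_BASE_X2APIC_EN) != 0; }
static int apic_is_bsp(uint64_t msr)          { return (msr & APIC_BASE_BSP) != 0; }

/* Physical MMIO address of the LVT Performance Counter Register.
 * Only meaningful in xAPIC mode; in x2APIC mode use MSR_X2APIC_LVT_PMI. */
static uint64_t lvt_perf_mmio_addr(uint64_t msr)
{
    return apic_base_phys(msr) + APIC_LVT_PERF_OFFSET;
}
```

<span style="font-family: Trebuchet MS, sans-serif;">Feeding it the value 0xfee00900 read above gives base 0xfee00000, APIC enabled, xAPIC mode, bootstrap processor, and an LVT register address of 0xfee00340.</span><br />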
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Then I used the <span style="color: blue;"><i>!dd</i></span> command to read the contents of this register. As shown below, the value is 0x000000fe. </span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="color: blue; font-family: Courier New, Courier, monospace;">kd> !dd [uc] fee00340</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00340 </span><span style="color: red; font-family: Courier New, Courier, monospace;">000000fe </span><span style="color: blue; font-family: Courier New, Courier, monospace;">00000000 000000fe 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00350 0001001f 00000000 0001001f 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00360 000004ff 00000000 000004ff 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00370 000000e3 00000000 000000e3 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00380 00000000 00000000 00000000 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee00390 00000000 00000000 00000000 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee003a0 00000000 00000000 00000000 00000000</span><br />
<span style="color: blue; font-family: Courier New, Courier, monospace;">#fee003b0 00000000 00000000 00000000 00000000</span><br />
<div>
<br />
<span style="font-family: Trebuchet MS, sans-serif;">Now compare this value with the register layout in the picture below: by default, the Windows kernel uses <i><b>Fixed (000b) Delivery Mode</b></i> and<i><b> IDT vector 0xfe</b></i> to deliver the PMI. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsQQ65pf0dveL1F5rOse9kPC9uW9gjuDk-XHlKGhr53yRGWGO0SIvAjhxYlvBvSaMxlI_FyxY8iirReMDvmWtWugM6YAf-Q0-zsx0tE7nuOq-Xo4bkgUfvHaSIqQob7aLMF5ylvMQOhWc/s1600/PMI-0.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsQQ65pf0dveL1F5rOse9kPC9uW9gjuDk-XHlKGhr53yRGWGO0SIvAjhxYlvBvSaMxlI_FyxY8iirReMDvmWtWugM6YAf-Q0-zsx0tE7nuOq-Xo4bkgUfvHaSIqQob7aLMF5ylvMQOhWc/s1600/PMI-0.png" height="640" width="513" /></a></div>
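<span style="font-family: Trebuchet MS, sans-serif;">To make the reading of this layout concrete, here is a small sketch that pulls the vector, delivery mode and mask bit out of an LVT register value. The names are hypothetical; the field positions (vector in bits 7:0, delivery mode in bits 10:8, mask in bit 16) are the ones shown in the picture.</span><br />

```c
#include <stdint.h>

#define LVT_VECTOR_MASK          0x000000FFu  /* bits 7:0  */
#define LVT_DELIVERY_MODE_SHIFT  8
#define LVT_DELIVERY_MODE_MASK   0x00000700u  /* bits 10:8 */
#define LVT_MASKED               0x00010000u  /* bit 16    */

/* Delivery modes relevant to the performance counter LVT entry. */
enum lvt_delivery_mode {
    LVT_DM_FIXED = 0, /* 000b: deliver the vector in bits 7:0 */
    LVT_DM_SMI   = 2, /* 010b */
    LVT_DM_NMI   = 4  /* 100b: vector field is ignored */
};

static unsigned lvt_vector(uint32_t lvt)
{
    return lvt & LVT_VECTOR_MASK;
}

static unsigned lvt_delivery_mode(uint32_t lvt)
{
    return (lvt & LVT_DELIVERY_MODE_MASK) >> LVT_DELIVERY_MODE_SHIFT;
}

static int lvt_is_masked(uint32_t lvt)
{
    return (lvt & LVT_MASKED) != 0;
}
```

<span style="font-family: Trebuchet MS, sans-serif;">For the value 0x000000fe this yields vector 0xfe, Fixed delivery mode, and an unmasked entry, which matches the reading above.</span><br />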
<br />
<br />
<span style="font-family: Trebuchet MS, sans-serif;">Now, let's check vector 0xfe in the IDT with the <span style="color: blue;"><i>!idt</i></span> command in WinDbg. The PMI ISR (Interrupt Service Routine) installed by the OS kernel is </span><span style="color: blue; font-family: Trebuchet MS, sans-serif;">hal!HalpPerfInterrupt</span><span style="font-family: Trebuchet MS, sans-serif;">. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">kd> !idt <span style="color: blue;">0xfe</span></span><br />
<span style="font-family: Courier New, Courier, monospace;">Dumping IDT: 80b95400</span><br />
<span style="font-family: Courier New, Courier, monospace;">fe:<span class="Apple-tab-span" style="white-space: pre;"> </span>82a221a8 <span style="color: blue;">hal!HalpPerfInterrupt</span></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif;">The disassembly of this function, produced with the </span><span style="color: blue;"><i>uf</i></span><span style="font-family: 'Trebuchet MS', sans-serif;"> command, is shown below; an ellipsis (...) marks instructions that have been omitted. </span><span style="font-family: 'Trebuchet MS', sans-serif;">We can see that it retrieves the handler (callback?) from the global variable <span style="color: blue;">hal!HalpPerfInterruptHandler</span> and then calls it. <span style="color: red;">So my question is: how can I register this performance interrupt handler (PMI handler) from my own driver, so that my callback routine gets called whenever a PMI occurs? </span></span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">kd> uf hal!HalpPerfInterrupt</span><br />
<span style="font-family: Courier New, Courier, monospace;">...</span><br />
<span style="font-family: Courier New, Courier, monospace;">hal!HalpPerfInterrupt:</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221a8 54 push esp</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221a9 55 push ebp</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221aa 53 push ebx</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221ab 56 push esi</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221ac 57 push edi</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221ad 83ec54 sub esp,54h</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221b0 8bec mov ebp,esp</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221b2 894544 mov dword ptr [ebp+44h],eax</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221b5 894d40 mov dword ptr [ebp+40h],ecx</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221b8 89553c mov dword ptr [ebp+3Ch],edx</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221bb f7457000000200 test dword ptr [ebp+70h],20000h</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221c2 75bc jne hal!V86_Hpf_a (82a22180) Branch</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">hal!HalpPerfInterrupt+0x1c:</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221c4 66837d6c08 cmp word ptr [ebp+6Ch],8</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a221c9 741f je hal!HalpPerfInterrupt+0x42 (82a221ea) Branch</span><br />
<span style="font-family: Courier New, Courier, monospace;">...</span><br />
<span style="font-family: Courier New, Courier, monospace;">hal!HalpPerfInterrupt+0x146:</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a222ee 8bcd mov ecx,ebp</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a222f0 a1e43ca282 mov eax,dword ptr [<span style="color: blue;"><b>hal!HalpPerfInterruptHandler</b></span> (82a23ce4)]</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a222f5 0bc0 or eax,eax</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a222f7 745b je hal!HalpPerfInterrupt+0x1ac (82a22354) Branch</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">hal!HalpPerfInterrupt+0x151:</span><br />
<span style="font-family: Courier New, Courier, monospace;">82a222f9 ffd0 call eax</span><br />
<span style="font-family: Courier New, Courier, monospace;">...</span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><u>A possible solution? (It requires changing OS default settings):</u></span><br />
<span style="font-family: Trebuchet MS, sans-serif;">As we know, the PMI vector is shared, so a PMI handler should check the IA32_PERF_GLOBAL_STATUS MSR (0x38E) to determine which event(s) triggered the PMI. However, if multiple PMU drivers are installed, only one of them should control and use a given PMU at any one time. Hence, Windows (Win7+) provides the two APIs below for PMU drivers. </span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span><span style="font-family: Trebuchet MS, sans-serif;"><b> HalAllocateHardwareCounters()</b></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><b> HalFreeHardwareCounters()</b></span><br />
<div>
<br /></div>
<span style="font-family: Trebuchet MS, sans-serif;">Their usage is summarized below; for details, please take a look at the <a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff546577(v=vs.85).aspx" target="_blank">MSDN link</a>.</span><br />
<blockquote class="tr_bq" style="color: #454545; line-height: 1.429em !important; padding-bottom: 15px;">
<span style="font-family: inherit;">If more than one such tool is installed on a computer, the associated drivers must avoid trying to use the same hardware counters simultaneously. To avoid such resource conflicts, all drivers that use counter resources should use the <strong>HalAllocateHardwareCounters</strong> and <strong>HalFreeHardwareCounters</strong> routines to coordinate their sharing of these resources. </span></blockquote>
<blockquote class="tr_bq" style="color: #454545; line-height: 1.429em !important; padding-bottom: 15px;">
<span style="line-height: 1.429em;"><span style="font-family: inherit;">A counter resource is a single hardware counter, a block of contiguous counters, or a <b><i>counter overflow interrupt</i></b> in a PMU.</span></span></blockquote>
<blockquote class="tr_bq" style="color: #454545; line-height: 1.429em !important; padding-bottom: 15px;">
<span style="font-family: inherit;"><span style="line-height: 1.429em;">Before configuring the counters, a driver can call the</span><span style="line-height: 1.429em;"> </span><strong style="line-height: 1.429em;">HalAllocateHardwareCounters</strong><span style="line-height: 1.429em;"> </span><span style="line-height: 1.429em;">routine to acquire exclusive access to a set of counter resources. After the driver no longer needs these resources, it must free the resources by calling the</span><span style="line-height: 1.429em;"> </span><strong style="line-height: 1.429em;">HalFreeHardwareCounters</strong><span style="line-height: 1.429em;"> </span><span style="line-height: 1.429em;">routine.</span></span></blockquote>
<span style="color: blue;"><span style="font-family: Trebuchet MS, sans-serif;">Does this mean that once we successfully call HalAllocateHardwareCounters() to acquire exclusive access to the PMI (i.e. the counter overflow interrupt in a PMU</span><span style="font-family: 'Trebuchet MS', sans-serif;">), we can even re-program the default Local APIC LVT Performance Counter Register? </span></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">If we can do that without triggering PatchGuard (on Windows x64) or causing any other compatibility issues, then we could proceed as follows:</span><br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif;">Call <i><b>HalAllocateHardwareCounters</b></i>() to acquire exclusive access to the PMI.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif;">Re-program the APIC LVT Performance Counter Register by setting <i><b>Delivery Mode</b></i> to <b><i>NMI </i></b>(100b); see its layout in the picture above. Then, whenever a PMI is triggered, an NMI (non-maskable interrupt) handler will get called.<br />In other words, this setting converts the PMI event into an NMI event. </span></li>
<li><span style="font-family: Trebuchet MS, sans-serif;">Fortunately, the Windows kernel provides the two APIs below:<br /><b><i>KeRegisterNmiCallback</i></b>() - Registers a routine to be called whenever an NMI occurs</span><br /><span style="font-family: 'Trebuchet MS', sans-serif;"><b><i>KeDeregisterNmiCallback</i></b>()<br />See <a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff553116(v=vs.85).aspx" target="_blank">this MSDN link </a>for details. This means the OS kernel allows our driver to register an NMI callback routine to handle any NMI event. </span></li>
</ol>
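<span style="font-family: Trebuchet MS, sans-serif;">Step 2 boils down to a small bit manipulation on the LVT register value. The sketch below only computes the new value; it is untested on real hardware, the names are hypothetical, and how the value is actually written (an MMIO write at offset 0x340 in xAPIC mode, or a write to MSR 0x834 in x2APIC mode) is left to the driver.</span><br />

```c
#include <stdint.h>

#define LVT_DELIVERY_MODE_MASK  0x00000700u  /* bits 10:8          */
#define LVT_DELIVERY_MODE_NMI   0x00000400u  /* 100b in bits 10:8  */
#define LVT_MASKED              0x00010000u  /* bit 16             */

/* Value to program into the LVT Performance Counter Register so that the
 * PMI arrives as an NMI. The vector field is ignored for NMI delivery, so
 * it is left untouched; the mask bit is cleared so the interrupt is
 * actually delivered. The caller should keep the original value around
 * in order to restore it later. */
static uint32_t lvt_perf_to_nmi(uint32_t current)
{
    uint32_t value = current & ~(LVT_DELIVERY_MODE_MASK | LVT_MASKED);
    return value | LVT_DELIVERY_MODE_NMI;
}
```

<span style="font-family: Trebuchet MS, sans-serif;">With the default value 0x000000fe seen earlier, the register would be re-programmed to 0x000004fe. Note that the register exists per logical processor, so a real driver would presumably have to do this on every CPU it wants to monitor.</span><br />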
<span style="font-family: Trebuchet MS, sans-serif;">Once we have done this, I think we can control and use a particular PMU and handle the PMI appropriately. When the work is done, </span><span style="font-family: 'Trebuchet MS', sans-serif;">we must of course restore the APIC (xAPIC or x2APIC) LVT Performance Counter Register to its default settings, de-register the NMI callback, and free the hardware counter resources. </span><br />
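<span style="font-family: Trebuchet MS, sans-serif;">Since an NMI callback is invoked for every NMI, not just for PMIs, the callback would first have to decide whether the PMU was the source. A minimal sketch of that check, assuming the architectural layout of IA32_PERF_GLOBAL_STATUS (general-purpose counter overflow bits starting at bit 0, fixed counter overflow bits starting at bit 32, DS buffer overflow in bit 62), could look like this. The function names are hypothetical, and the counter counts are assumed to come from CPUID leaf 0AH.</span><br />

```c
#include <stdint.h>

#define MSR_IA32_PERF_GLOBAL_STATUS  0x38Eu
#define GLOBAL_STATUS_FIXED_SHIFT    32
#define GLOBAL_STATUS_OVF_DS_BUFFER  (1ull << 62)  /* BTS/PEBS buffer threshold reached */

/* Low-order mask with `bits` bits set (at most 32). */
static uint64_t low_mask(unsigned bits)
{
    return (bits >= 32) ? 0xFFFFFFFFull : ((1ull << bits) - 1);
}

/* Bitmask of general-purpose counters (IA32_PMCx) that overflowed. */
static uint32_t pmi_overflowed_gp(uint64_t status, unsigned num_gp_counters)
{
    return (uint32_t)(status & low_mask(num_gp_counters));
}

/* Bitmask of fixed-function counters that overflowed. */
static uint32_t pmi_overflowed_fixed(uint64_t status, unsigned num_fixed_counters)
{
    return (uint32_t)((status >> GLOBAL_STATUS_FIXED_SHIFT) & low_mask(num_fixed_counters));
}

/* Nonzero if the PMU reports any pending overflow condition, i.e. the
 * NMI we are looking at is plausibly a PMI. */
static int pmi_is_pending(uint64_t status, unsigned num_gp, unsigned num_fixed)
{
    return pmi_overflowed_gp(status, num_gp) != 0 ||
           pmi_overflowed_fixed(status, num_fixed) != 0 ||
           (status & GLOBAL_STATUS_OVF_DS_BUFFER) != 0;
}
```

<span style="font-family: Trebuchet MS, sans-serif;">The callback would read the MSR, return "not handled" when nothing is pending, and otherwise process and clear the overflow bits before returning.</span><br />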
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;"><b><u>Notes:</u></b></span><br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif;">Due to<a href="http://hypervsir.blogspot.com/2014/10/an-os-kernel-bug-in-windows-81-32-bit-os.html" target="_blank"> this bug in my previous post</a>, an NMI will crash the system on 32-bit Windows 8.1. I am not sure whether Microsoft has fixed this issue in the latest version. </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif;">The Intel VTune driver on Windows might be using the PMU PMI, but I have no idea how it does so :-(</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif;">If anybody knows a good way to register a PMI handler, please let me know :) I would really appreciate it! </span></li>
</ol>
</div>
<div>
&lt;The End&gt;<br />
<br />
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com4tag:blogger.com,1999:blog-385094692918992828.post-83924585305517842212014-11-18T05:44:00.000-08:002014-11-20T18:29:04.999-08:00Page Table Structure Corruption Attacks - How to Mitigate it?<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">On x86 and many other processor architectures with an MMU, page tables are the critical data structures for address translation. Many of the <a href="http://hypervsir.blogspot.com/2014/10/introduction-on-hardware-security.html" target="_blank">hardware-based page-level protection technologies described in my previous post</a>, such as SMEP and XD/DEP, depend heavily on correct page table settings. <span style="color: blue;">So what if the page tables are controlled by an attacker? At the end of this post, I will propose an additional solution to mitigate page table structure attacks.</span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Recently, this post (<a href="https://labs.mwrinfosecurity.com/blog/2014/08/15/windows-8-kernel-memory-protections-bypass/" target="_blank">Windows 8 Kernel Memory Protections Bypass</a>) presented a generic technique for exploiting kernel vulnerabilities while bypassing SMEP and DEP. It requires only a single vulnerability that gives the attacker a write-what-where primitive, which is then <span style="color: blue;">used to modify the page table flags (U/S and XD bits) in order to bypass the SMEP and DEP protections</span>. </span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As we know, SMEP (Supervisor Mode Execution Prevention) is a mitigation that prevents the CPU from executing code in user-mode pages while it is running in kernel mode. Internally, the processor checks the U/S flag in the corresponding paging-structure entries when it fetches instructions for execution in kernel mode. <span style="color: blue;">Hence, if we can corrupt the paging structures to modify the U/S flag, we can cause user memory to be treated as kernel memory without any other changes</span>.</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Similarly, DEP (Data Execution Prevention) relies on the NX flag being set to prevent a data page from being executed. <span style="color: blue;">If we can clear this flag by corrupting the paging structures, we can cause a data page to be marked as executable</span>. </span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
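<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make the two bypasses concrete, here is a simplified model of what the attacker's write does to a single 64-bit PTE. The names are hypothetical, and the model deliberately looks at one entry only, whereas the processor takes the U/S flag at every level of the paging hierarchy into account.</span><br />

```c
#include <stdint.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_RW       (1ull << 1)   /* 1 = writable                 */
#define PTE_US       (1ull << 2)   /* 1 = user, 0 = supervisor     */
#define PTE_NX       (1ull << 63)  /* 1 = no-execute (XD)          */

/* SMEP refuses a kernel-mode instruction fetch from a user page. */
static int smep_blocks_kernel_fetch(uint64_t pte)
{
    return (pte & PTE_US) != 0;
}

/* DEP refuses any instruction fetch from a no-execute page. */
static int nx_blocks_fetch(uint64_t pte)
{
    return (pte & PTE_NX) != 0;
}

/* The effect of the write-what-where primitive: the page now looks like
 * an executable supervisor page, so neither check fires any more. */
static uint64_t pte_after_attack(uint64_t pte)
{
    return pte & ~(PTE_US | PTE_NX);
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For example, a user data page with PTE value 0x8000000012345067 is blocked by both checks, while the corrupted value 0x0000000012345063 passes both.</span><br />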
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">On Windows 8, both SMEP and DEP are enabled by default, and KASLR (Kernel <i><b>Address Space Layout Randomization</b></i>) is enabled as well. <span style="color: #cc0000;">Unfortunately, the virtual address of the PTE that maps a particular virtual address (for example, a user-mode address) is fixed and easy to calculate</span>. So how do we find the page table addresses? </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;">For example, on a</span><span style="font-family: 'Trebuchet MS', sans-serif;"> 32-bit PAE Windows system, the code below returns the <i>virtual address of the PTE (not the PTE contents)</i> for a given virtual address. </span></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PT_VIRTUAL_BASE_ADDRESS 0xC0000000</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PAGE_TABLE_SHIFT 12</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PAGE_DIR_SHIFT 21</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PAGE_DIR_POINTER_SHIFT 30</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">__inline</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">UINT32 PAEGetPteAtVirtualAddress(UINT32 Vaddr)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">{</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> return (UINT32) </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> ( PT_VIRTUAL_BASE_ADDRESS + </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> ((Vaddr & 0xC0000000) >> PAGE_DIR_POINTER_SHIFT) * 0x200000 </span><span style="font-family: 'Courier New', Courier, monospace;">+ </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> ((Vaddr & 0x3FE00000) >> PAGE_DIR_SHIFT) * 0x1000 +</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> ((Vaddr & 0x001FF000) >> PAGE_TABLE_SHIFT) * 8</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace;"> );</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">}</span></div>
</div>
<div style="font-family: 'Trebuchet MS', sans-serif;">
<span style="font-size: large;"><br /></span></div>
<div style="font-family: 'Trebuchet MS', sans-serif;">
<span style="font-size: large;">On a 64-bit Windows system, the calculation is similar. </span></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">/* you can see this definition in Win DDK/SDK ntddk.h file */</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PTE_BASE 0xFFFFF68000000000UI64</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PTE_SHIFT 3</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PTI_SHIFT 12</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PDI_SHIFT 21</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PPI_SHIFT 30</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define PXI_SHIFT 39</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define VIRTUAL_ADDRESS_BITS 48</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define VIRTUAL_ADDRESS_MASK ((((UINT64)1) << VIRTUAL_ADDRESS_BITS) - 1)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;">#define X64GetPteAddress(va) \</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace;"> (((((UINT64)(va) & VIRTUAL_ADDRESS_MASK) >> PTI_SHIFT) << PTE_SHIFT) + PTE_BASE)</span></div>
<div style="font-family: 'Trebuchet MS', sans-serif; font-size: x-large;">
<br /></div>
</div>
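<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Both calculations reduce to "page number times 8, plus the self-map base". The sketch below restates them as plain functions with standard integer types so that a couple of values can be worked through by hand; the function names are mine, and the base addresses are the fixed ones used above.</span><br />

```c
#include <stdint.h>

#define PAE_PTE_BASE  0xC0000000u
#define X64_PTE_BASE  0xFFFFF68000000000ull
#define X64_VA_MASK   0x0000FFFFFFFFFFFFull  /* low 48 bits of the address */

/* 32-bit PAE: each 4 KB page has one 8-byte PTE in the self-map region. */
static uint32_t pae_pte_va(uint32_t va)
{
    return PAE_PTE_BASE + ((va >> 12) << 3);
}

/* x64: same idea, using the low 48 bits of the virtual address. */
static uint64_t x64_pte_va(uint64_t va)
{
    return (((va & X64_VA_MASK) >> 12) << 3) + X64_PTE_BASE;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For instance, on PAE the PTE for 0x00401000 lives at 0xC0002008, and on x64 the PTE for the user address 0x7FFE0000 lives at 0xFFFFF680003FFF00. No information leak is needed to compute either value.</span><br />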
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Then, given a write-what-where kernel vulnerability, an attacker can use the calculations above to corrupt the PTE that maps attacker-controlled user-mode code. </span></div>
</div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;">So how can we mitigate this kind of SMEP/DEP bypass? </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As the author of that post said, randomizing the page table addresses themselves is not possible, because many of the kernel's core memory management functions may rely on this mapping to locate and update paging structures.</span></div>
<div>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">The author also proposed two solutions to mitigate it:</span><br />
<br />
<ol>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">One is to use a separate data segment to hold the paging structures.<br />This requires an extra dedicated segment register. If GS is unused in 32-bit Windows and FS is unused in 64-bit Windows, this solution may be feasible. <br /> </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">The other is to set hardware debug breakpoints on accesses to the paging structures (or to key fields of the structures).<br />Hardware breakpoints are a very limited resource (at most 4 are supported), and this approach may also cause other compatibility issues.</span></li>
</ol>
</div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="color: #cc0000; font-family: Trebuchet MS, sans-serif; font-size: large;">Here I propose another solution: write-protecting the page table structures using the CR0.WP capability. </span><br />
<span style="color: #cc0000; font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="color: #cc0000; font-family: Trebuchet MS, sans-serif; font-size: large;">The basic idea is to map the pages that hold the paging structures themselves as read-only. Because the CR0.WP bit is set by default, any write access to the page table structures, even from kernel mode, will then cause the processor to raise a #PF exception. For legitimate modifications to the page table structures, use the code sequence below:</span><br />
<span style="color: #cc0000; font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="background-color: white; color: blue; font-family: 'Trebuchet MS', sans-serif; font-size: 15px; line-height: 20.7900009155273px;"> disable_wp(); // clear CR0.WP bit.</span><br />
<div style="background-color: white; color: #4e4e4e; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 15px; line-height: 20.7900009155273px;">
<span style="color: blue; font-family: 'Trebuchet MS', sans-serif;"> write access to RO page structures.</span></div>
<div style="background-color: white; color: #4e4e4e; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 15px; line-height: 20.7900009155273px;">
<span style="color: blue; font-family: 'Trebuchet MS', sans-serif; line-height: 20.7900009155273px;"> enable_wp(); // set CR0.WP bit again.</span><span style="background-color: transparent; font-family: Trebuchet MS, sans-serif; font-size: large;"> </span></div>
</div>
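<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A slightly more detailed sketch of this sequence is given below. It is an illustration under assumptions rather than a tested implementation: the helper names and the macro <i>EXAMPLE_RING0_BUILD</i> are hypothetical, the inline assembly uses GCC/Clang syntax for x86-64, and the privileged part is guarded so that only the bit manipulation is compiled in a normal build.</span><br />

```c
#include <stdint.h>

#define CR0_WP  (1ull << 16)  /* Write Protect: supervisor writes honor read-only pages */
#define PTE_RW  (1ull << 1)   /* 1 = page is writable */

static uint64_t cr0_without_wp(uint64_t cr0) { return cr0 & ~CR0_WP; }
static uint64_t cr0_with_wp(uint64_t cr0)    { return cr0 | CR0_WP; }

/* Mark the page that holds a paging structure as read-only. */
static uint64_t pte_make_read_only(uint64_t pte) { return pte & ~PTE_RW; }

#ifdef EXAMPLE_RING0_BUILD /* hypothetical macro: kernel-mode builds only */
static inline uint64_t read_cr0(void)
{
    uint64_t value;
    __asm__ volatile("mov %%cr0, %0" : "=r"(value));
    return value;
}

static inline void write_cr0(uint64_t value)
{
    __asm__ volatile("mov %0, %%cr0" : : "r"(value) : "memory");
}

/* Legitimate update of a write-protected paging-structure entry.
 * Assumes the caller has disabled interrupts and preemption, so that
 * nothing else runs on this CPU while CR0.WP is clear. */
static void update_protected_pte(volatile uint64_t *pte, uint64_t new_value)
{
    uint64_t cr0 = read_cr0();
    write_cr0(cr0_without_wp(cr0));  /* disable_wp() */
    *pte = new_value;                /* write access to the RO page structure */
    write_cr0(cr0);                  /* enable_wp(): restore the saved CR0 */
}
#endif
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">CR0 is per logical processor, so clearing WP opens the window only on the CPU doing the update. The window should be kept as short as possible, since an attacker who can run code inside it is no longer stopped by the read-only mapping.</span><br />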
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I have discussed this idea in my previous posts; please see them for details:</span><br />
<br />
<ul>
<li><span style="background-color: white; color: #0000ee; font-size: 20px;"><span style="font-family: Trebuchet MS, sans-serif;"><a href="http://hypervsir.blogspot.com/2014/10/security-os-design-idea-to-prevent.html" target="_blank">Security OS Kernel Design: an idea to prevent malicious software overwriting the critical system kernel data structures</a></span></span></li>
</ul>
<ul>
<li><a href="http://hypervsir.blogspot.com/2014/11/security-os-design-cont-write.html" style="font-size: 20px;" target="_blank"><span style="font-family: Trebuchet MS, sans-serif;">Security OS Design (cont.): Write Protection for Linux Kernel critical data structures (GDT, IDT, syscall table, task_strcture, mm_struct,...)</span></a></li>
</ul>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><u><b><i>Update:</i></b></u></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Some reference slides from Microsoft about Windows self-mapping page tables:</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2QszezOtWNtp4sExcLsagW-BhY0H7wQL_UmPOMdvHNZagBQwHPx3boiM49tX3CI6bCXFYjURplFQirx1755YLS0yfQd5WMhSK3hUB8z6z5E5_Yub7LpMC4EUMigyVjHUf9StmVpIcwL0/s1600/self-mapping1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2QszezOtWNtp4sExcLsagW-BhY0H7wQL_UmPOMdvHNZagBQwHPx3boiM49tX3CI6bCXFYjURplFQirx1755YLS0yfQd5WMhSK3hUB8z6z5E5_Yub7LpMC4EUMigyVjHUf9StmVpIcwL0/s1600/self-mapping1.png" height="478" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWIyWtFKMGpF71tD5DOFR9Y9cOi1AvlgIl9Hx-jWu_75Ym2nuf3Tsp8rCwGlJs7JO0wpn8WzwjywBwB9cU_le1cZQLD4-YYh370YMpADt3_aWzrzzsOTY6Tq-ZLHyn7MstFLEUaiMC5A4/s1600/self-mapping2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWIyWtFKMGpF71tD5DOFR9Y9cOi1AvlgIl9Hx-jWu_75Ym2nuf3Tsp8rCwGlJs7JO0wpn8WzwjywBwB9cU_le1cZQLD4-YYh370YMpADt3_aWzrzzsOTY6Tq-ZLHyn7MstFLEUaiMC5A4/s1600/self-mapping2.png" height="476" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlUsEqDNABx8_ohnN4tQdEx5sRUtdeV1G6yryIT8CZu0oE1LG9ACSy5YvcE-AR1BblnkZjX4Y-EAq4GlZYhRPfuzHdWBkVVJUrmhhfVm6Rhq4bxEGhAAksToN4iKely-_IfUmd_iiCGbI/s1600/self-mapping3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlUsEqDNABx8_ohnN4tQdEx5sRUtdeV1G6yryIT8CZu0oE1LG9ACSy5YvcE-AR1BblnkZjX4Y-EAq4GlZYhRPfuzHdWBkVVJUrmhhfVm6Rhq4bxEGhAAksToN4iKely-_IfUmd_iiCGbI/s1600/self-mapping3.png" height="480" width="640" /></a></div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<br /></div>
</div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-81339421703130354482014-11-17T04:10:00.002-08:002014-11-17T17:30:59.537-08:00Implement software-based SMEP with Non-Execute (NX) bit in page tables to secure kernel/user virtual memory address space.<span style="font-family: Trebuchet MS, sans-serif;">In my <a href="http://hypervsir.blogspot.com/2014/11/how-to-implement-software-based.html" target="_blank">previous post</a>, I talked about how to implement a software-based SMEP (</span><span style="font-family: 'Trebuchet MS', sans-serif;">S</span><span style="font-family: 'Trebuchet MS', sans-serif;">upervisor Mode Execution Protection</span><span style="font-family: Trebuchet MS, sans-serif;">) with virtualization/hypervisor for fun. In this post, I'm going to detail yet another solution to implement software-based SMEP without virtualization technology. </span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif;"></span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;">In modern operating systems such as Linux and Windows, all processes share the same kernel virtual address space but have separate user virtual address spaces; see below for <a href="http://blogs.technet.com/b/askperf/archive/2007/09/28/memory-management-x86-virtual-address-space.aspx" target="_blank">Windows 32bit OS</a>. The system achieves this by giving each process its own paging structures, pointed to by a translation table base register (e.g. the CR3 register on x86/Intel MMU architecture), and switching among them.</span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
</span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-QyYXZNs43ZyyZxO-E008J0TpPtL1agMYP9paWt14US4XfUEqfWj0DdY2AbjLkqi2zEeXZPhfB4hM85hxr3ahyWIiZl1Kcdm4878gT6S5-bZ0-x2IW8GtdPI0w1wByTUKIiaX8oJimcs/s1600/SMEP1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-QyYXZNs43ZyyZxO-E008J0TpPtL1agMYP9paWt14US4XfUEqfWj0DdY2AbjLkqi2zEeXZPhfB4hM85hxr3ahyWIiZl1Kcdm4878gT6S5-bZ0-x2IW8GtdPI0w1wByTUKIiaX8oJimcs/s1600/SMEP1.png" height="510" width="640" /></a></span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
</span>
<br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
To simplify the discussion, I'm assuming that we are working with a Linux 64bit OS system on x86_64/Intel architecture. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">From the kernel documentation (<a href="https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt" target="_blank">https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt</a></span><span style="font-family: 'Trebuchet MS', sans-serif;">)</span><span style="font-family: 'Trebuchet MS', sans-serif;">, we can see that the virtual address range below belongs to user space. </span></div>
<div>
<blockquote class="tr_bq" style="white-space: pre-wrap; word-wrap: break-word;">
<span style="font-family: Trebuchet MS, sans-serif;">0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm</span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;"></span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="font-family: Trebuchet MS, sans-serif;">We can also see that 64-bit Linux on x86_64 uses Intel IA-32e paging as shown below (with a 4KB page size as an example), in which the CR3 register points to the physical base address of a PML4 table. Each process/task has its own PML4 table. </span></span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
</span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbrXfj8ooKlZ0Gn7ckrPYYz2Fh50ST3KK2mk-Sm5Yqkq8uJpJKL12O9sHM2UWV9hbl3U4m8swhpzhrzaKtkyszIcqjSr0DXdZ2ZVdAcVmiHFl_u9tajZGtioYN36shgdUn89rZ56_YgIA/s1600/SMEP2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbrXfj8ooKlZ0Gn7ckrPYYz2Fh50ST3KK2mk-Sm5Yqkq8uJpJKL12O9sHM2UWV9hbl3U4m8swhpzhrzaKtkyszIcqjSr0DXdZ2ZVdAcVmiHFl_u9tajZGtioYN36shgdUn89rZ56_YgIA/s1600/SMEP2.png" height="468" width="640" /></a></span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
</span>
<br />
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<span style="font-family: Trebuchet MS, sans-serif;">
When a task is scheduled, the <b><i>physical base address</i></b> of its PML4 table is written to the CR3 register by a mov-to-cr3 instruction, so that the virtual address space is switched to that of the task/process.<br /><br />Since the Linux user address space range is </span><span style="font-family: 'Trebuchet MS', sans-serif; white-space: pre-wrap;">0000000000000000 - 00007fffffffffff, we can infer that the first 256 PML4 entries (index 0~255) map the user virtual address space of each process/task: bits 47:39 of a virtual address select the PML4 entry, and across this range those bits take the values 0 through 255. See the picture below. </span></div>
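<div>
<span style="font-family: Trebuchet MS, sans-serif;">The index arithmetic behind that inference can be checked with a few lines of Python. This is only an illustrative model of IA-32e address decoding, not kernel code; the function and constant names are my own.</span></div>

```python
# Illustrative model of IA-32e (4-level) address decoding; not kernel code.
PML4_SHIFT = 39          # bits 47:39 of a linear address select the PML4 entry
PML4_INDEX_MASK = 0x1FF  # 9 index bits, so 512 entries per PML4 table


def pml4_index(vaddr):
    """Return the index of the PML4 entry selected by a 64-bit linear address."""
    return (vaddr >> PML4_SHIFT) & PML4_INDEX_MASK


USER_START = 0x0000000000000000
USER_END = 0x00007FFFFFFFFFFF
KERNEL_START = 0xFFFF800000000000  # first canonical address of the upper half

print(pml4_index(USER_START))    # 0
print(pml4_index(USER_END))      # 255
print(pml4_index(KERNEL_START))  # 256
```

<div>
<span style="font-family: Trebuchet MS, sans-serif;">So every user-space address falls under PML4 entries 0 through 255, and the upper canonical half starts at entry 256.</span></div>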
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipjZ6KoQF4a_yxMXHwrEHFpjWavJazOOfPAErjhr6HfxdySlEbl9B-u-aBMCnLJdh3vFQS3q6mdCA18auoZahpwn7AoUXuAzOUDUTYOmNniPI6Jipdqf9taBaReKwnu4uJMvbS1N942dU/s1600/SMEP4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipjZ6KoQF4a_yxMXHwrEHFpjWavJazOOfPAErjhr6HfxdySlEbl9B-u-aBMCnLJdh3vFQS3q6mdCA18auoZahpwn7AoUXuAzOUDUTYOmNniPI6Jipdqf9taBaReKwnu4uJMvbS1N942dU/s1600/SMEP4.png" height="368" width="640" /></a></div>
<div>
<span style="font-family: 'Trebuchet MS', sans-serif; white-space: pre-wrap;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="white-space: pre-wrap;">In each PML4 entry above, there are some processor-"Ignored" bits and an XD (eXecute-Disable) bit, as the picture below shows. <span style="color: #cc0000;">The "XD" bit controls whether instructions can be fetched from the physical pages that the entry references. If it is set, an instruction fetch triggers a #PF exception (assuming MSR IA32_EFER.NXE = 1). This is the key to implementing a software-based SMEP solution</span>. </span></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8W2YC0Afs9M_J3S_YiaCUb3SX9aXkqgWe7pQ9-QAYg-LIkotbgqfNNdnpTp64uK5GB02c2Fn-rd2mxw-W-nvT3iwxz0UG6qdm1NpTDdBFA_MSFWotGm4AVgRE5dTusVBWoGzj9Y-fFtc/s1600/SMEP3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8W2YC0Afs9M_J3S_YiaCUb3SX9aXkqgWe7pQ9-QAYg-LIkotbgqfNNdnpTp64uK5GB02c2Fn-rd2mxw-W-nvT3iwxz0UG6qdm1NpTDdBFA_MSFWotGm4AVgRE5dTusVBWoGzj9Y-fFtc/s1600/SMEP3.png" height="532" width="640" /></a></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><span style="white-space: pre-wrap;"><br /></span><br />So, the solution now is:</span></div>
<div>
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif;"><span style="color: #cc0000;">Whenever a process enters kernel mode (CPL=0, for example, through a syscall or sysenter instruction), the OS kernel sets the PML4E.XD bit in all the user-space PML4 table entries (index 0 through 255; this can be optimized)</span>, and then flushes the TLB (a performance cost).<br /><span style="color: #cc0000;">In this way, any attempt to fetch instructions from user virtual address memory in kernel mode causes a #PF exception,</span> while read/write access to user virtual address memory is still allowed (for example, by the copy_to/from_user() functions).</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif;">The OS kernel can use some "Ignored" bits to record this intended behavior for easy virtual address management.<br /> </span></li>
<li><span style="font-family: Trebuchet MS, sans-serif;">Before leaving kernel mode, the OS kernel changes the PML4E.XD bit (and the "Ignored" bits it used) back to the original state. </span></li>
</ol>
</div>
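<div>
<span style="font-family: Trebuchet MS, sans-serif;">The three steps above can be sketched as a small simulation. It models a PML4 table as a Python list of 512 integers; the names enter_kernel and leave_kernel, and the choice of bit 52 as the "Ignored" bit that remembers the original XD state, are assumptions made for illustration only. A real kernel would do this on the live page table and flush the TLB after each change.</span></div>

```python
# Simulation only: a PML4 table is modelled as a list of 512 integers.
XD_BIT = 1 << 63           # execute-disable bit of a paging-structure entry
SAVED_XD_BIT = 1 << 52     # hypothetical pick among the processor-ignored bits
USER_ENTRIES = range(256)  # PML4 indexes 0..255 cover the user half


def enter_kernel(pml4):
    """Step 1 and 2: mark the user half non-executable, remembering old XD."""
    for i in USER_ENTRIES:
        entry = pml4[i]
        if entry & XD_BIT:
            entry |= SAVED_XD_BIT  # XD was already set before kernel entry
        pml4[i] = entry | XD_BIT
    # A real kernel would flush the TLB here.


def leave_kernel(pml4):
    """Step 3: restore the XD bit each entry had before kernel entry."""
    for i in USER_ENTRIES:
        entry = pml4[i]
        if not (entry & SAVED_XD_BIT):
            entry &= ~XD_BIT       # XD was clear originally, so clear it again
        pml4[i] = entry & ~SAVED_XD_BIT
    # A real kernel would flush the TLB here as well.


table = [0x7] * 512          # present, writable, user; XD clear
table[3] |= XD_BIT           # one entry that was non-executable to begin with
original = list(table)

enter_kernel(table)
print(all(table[i] & XD_BIT for i in USER_ENTRIES))  # True
leave_kernel(table)
print(table == original)                             # True
```

<div>
<span style="font-family: Trebuchet MS, sans-serif;">Entries 256 through 511 (the kernel half) are never touched, so kernel code stays executable the whole time.</span></div>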
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: 'Trebuchet MS', sans-serif;">Similarly, if we ignore the performance cost, we can even implement a software-based SMAP (Supervisor Mode Access Prevention) by clearing the "Present" bit, but I will not go into the details in this post.</span></div>
<div>
</div>
<div>
<br /></div>
<div>
<The End><br />
<br />
<br />
<b><span style="color: red; font-family: Trebuchet MS, sans-serif; font-size: large;">Update:</span></b><br />
I didn't do enough homework before. Previously, UDEREF from PaX used 32-bit segmentation (and its limit) to emulate SMEP/SMAP behaviors. Thanks to a comment below from someone on the PaX team, I found the description of UDEREF for 64-bit here:<br />
https://github.com/opntr/pax-docs-mirror/blob/master/uderef-amd64.txt<br />
<br />
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com10tag:blogger.com,1999:blog-385094692918992828.post-15811744768574727772014-11-16T04:15:00.001-08:002014-11-16T04:17:45.067-08:00DMA Attacks Against McAfee DeepSafe<div>
<span style="font-family: Trebuchet MS, sans-serif;">Rafal Wojtczuk (from <a href="http://www.bromium.com/" target="_blank">Bromium</a>, previously <a href="http://invisiblethingslab.com/itl/About.html" target="_blank">Invisible Things Lab</a>) presented DMA attacks against DeepSafe. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">About DeepSafe:</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.mcafee.com/us/solutions/mcafee-deepsafe.aspx" target="_blank">http://www.mcafee.com/us/solutions/mcafee-deepsafe.aspx</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">The snapshots below are from: <a href="https://www.youtube.com/watch?v=RM1oBlFX5UQ" target="_blank">https://www.youtube.com/watch?v=RM1oBlFX5UQ</a></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3-FNUJGOVkqevQs7T5QUOxu-9Wx9kxsu_WarJyE-R5vc5j8dQFMxGV8p8NlLpuk_5kUAtH_R0p0osn_i3sQEeAh-BSuvWQP9RC7_YHfE1Dr__MpkMLWfPSvUIMAcPlAMhzGpkvbvRZWM/s1600/DS1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Trebuchet MS, sans-serif;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3-FNUJGOVkqevQs7T5QUOxu-9Wx9kxsu_WarJyE-R5vc5j8dQFMxGV8p8NlLpuk_5kUAtH_R0p0osn_i3sQEeAh-BSuvWQP9RC7_YHfE1Dr__MpkMLWfPSvUIMAcPlAMhzGpkvbvRZWM/s1600/DS1.png" height="355" width="640" /></span></a></div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmhXiPL6leZCYn_AnMJodUtZDK_eZGhdMRMELm-Zo04nC_v-QV6B4rheMbEkgoawHOtf22rsI0mj1S-F6KTGiWtJPm0xMa-xBjDbm53efEkAJXCy7A88d9zZUnQ0hFNuMKahC31ZP0WlE/s1600/DS2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Trebuchet MS, sans-serif;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmhXiPL6leZCYn_AnMJodUtZDK_eZGhdMRMELm-Zo04nC_v-QV6B4rheMbEkgoawHOtf22rsI0mj1S-F6KTGiWtJPm0xMa-xBjDbm53efEkAJXCy7A88d9zZUnQ0hFNuMKahC31ZP0WlE/s1600/DS2.png" height="358" width="640" /></span></a></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">How can malware find out where in the physical address space the DeepSafe hypervisor is located? (from the whitepaper)</span></div>
<div>
<i><b><span style="font-family: Trebuchet MS, sans-serif;"></span></b></i><br />
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;"><i><b>There are a few interesting technical details regarding the above hypervisor overwrite. <span style="color: blue;">First, malware running in OS needs to know where in physical address space Deepsafe hypervisor is located</span>. Dumping all the physical address space via DMA and doing pattern search in it is possible, but troublesome. A more elegant approach was found – it turns out that when EPT fault occurs because OS tried to read from a physical address belonging to the hypervisor, then Deepsafe does not bother to emulate the instruction, it just skips it. Thus, the following function</b></i></span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;"><i><b>
</b></i></span>
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;"><i><b>mov rax, MAGICVALUE</b></i></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><i><b>mov rax, [rcx]</b></i></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><i><b>ret </b></i></span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;">
<blockquote class="tr_bq">
<i><b>Will return MAGICVALUE if memory at rcx belongs to Deepsafe, and something else (real memory content) if not. Deepsafe allocates a contiguous physical memory region of size 0x300000, so it is easy and fast to find it via scanning all the memory.</b></i></blockquote>
</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><u><b><i>References:</i></b></u></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">BlackHat 2014 @US:</span></div>
<div>
<span style="font-family: 'Trebuchet MS', sans-serif;"><a href="https://www.blackhat.com/us-14/archives.html#poacher-turned-gamekeeper-lessons-learned-from-eight-years-of-breaking-hypervisors" target="_blank">https://www.blackhat.com/us-14/archives.html#poacher-turned-gamekeeper-lessons-learned-from-eight-years-of-breaking-hypervisors</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">The Presentation:</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://www.blackhat.com/docs/us-14/materials/us-14-Wojtczuk-Poacher-Turned-Gamekeeper-Lessons_Learned-From-Eight-Years-Of-Breaking-Hypervisors.pdf" target="_blank">https://www.blackhat.com/docs/us-14/materials/us-14-Wojtczuk-Poacher-Turned-Gamekeeper-Lessons_Learned-From-Eight-Years-Of-Breaking-Hypervisors.pdf</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">Whitepaper:</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://www.blackhat.com/docs/us-14/materials/us-14-Wojtczuk-Poacher-Turned-Gamekeeper-Lessons_Learned-From-Eight-Years-Of-Breaking-Hypervisors-wp.pdf" target="_blank">https://www.blackhat.com/docs/us-14/materials/us-14-Wojtczuk-Poacher-Turned-Gamekeeper-Lessons_Learned-From-Eight-Years-Of-Breaking-Hypervisors-wp.pdf</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;">or</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.bromium.com/sites/default/files/wp-bromium-breaking-hypervisors-wojtczuk.pdf" target="_blank">http://www.bromium.com/sites/default/files/wp-bromium-breaking-hypervisors-wojtczuk.pdf</a> </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-52826183209873258372014-11-16T02:45:00.003-08:002014-11-19T22:00:31.131-08:00Latest researching status of ROP/JOP attacks and defenses<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Control flow hijacking, such as ROP, has become a hot topic in recent years, ever since DEP (W^X enforcement) and SMEP were introduced in hardware. Based on the papers I have read recently, this post gives a brief (and incomplete) introduction to the current state of research on control flow attacks and defenses. </span><br />
<span style="font-size: large;"><br /></span>
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As code injection attacks become more and more difficult, attackers look for other ways to execute arbitrary code by entirely reusing existing executable code in the application image and/or shared libraries.</span><br />
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Typical examples of such techniques without code injection are return-to-libc, ROP (Return Oriented Programming), JOP (Jump Oriented Programming), and even SROP (Sigreturn Oriented Programming, see <a href="https://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf" target="_blank">Framing Signals—A Return to Portable Shellcode</a>). </span></span><br />
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">As for techniques that defend against control flow hijacking, here is a list (also incomplete).</span></span><br />
<ul>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Change existing compilers to regenerate the binaries. Typical solutions generate return-less binary code, generate control-flow-friendly binaries (with extra IDs/labels for CFG hardening), or redirect all control flow instructions (ret, jmp, call) through a well-known central redirection table.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Without recompilation, apply static binary instrumentation and dynamic control flow tracing. For instance, build the control flow graph (CFG) and enforce that execution follows only the known CFG paths.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Hardware-assisted CFI. CFI (Control-Flow Integrity) is an effective way to defend against ROP/JOP attacks. However, because of its performance cost, complete CFI enforcement is impractical. So there are lightweight CFI checks that use the processor's LBR (last branch record) facility to examine control flow behavior based on heuristics (rather than full CFG analysis).</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">There are some specific solutions, such as stack shadowing (to check whether the target of a "ret" is a call-preceded instruction) and code section shadowing. However, most of them are not generic and come with many assumptions and limitations. </span></li>
</ul>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Regarding CFI-based defenses, many papers focus on the policy for deciding whether the recorded history of control flow instructions looks like malicious behavior. Typical examples:</span></div>
<div>
<ul>
<li><span style="font-size: large;"><span style="font-family: 'Trebuchet MS', sans-serif;">Some solutions <b><i>check the length of each ROP gadget</i></b>. If a gadget is very short (e.g. fewer than 5~6 instructions</span><span style="font-family: 'Trebuchet MS', sans-serif;">), it is suspicious. If a control flow contains a </span><span style="font-family: Trebuchet MS, sans-serif;">chain of consecutive gadgets with "short" instruction sequences, it might be a ROP attack.</span></span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Some of them <b><i>check whether the target of a "ret" is a call-preceded instruction</i></b> (the instruction immediately preceding it is a CALL instruction). This is because normally every "ret" instruction returns to the instruction that immediately follows the corresponding "call" instruction. </span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Others check whether<i><b> the targets of all indirect call/jmp instructions are function entry points</b></i>. This also normally holds for a legitimate application, because an indirect "jmp" or "call" generally does not land in the middle of a function. </span></li>
</ul>
</div>
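<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make the first two policies concrete, here is a minimal sketch of how such a check might look over a recorded branch history. The record format (kind, target, gadget length), the function names and both thresholds are hypothetical; this is not how kBouncer or ROPecker are actually implemented.</span></div>

```python
def is_call_preceded(target, call_return_sites):
    """True if the instruction just before `target` is a CALL.

    `call_return_sites` is an assumed, precomputed set of addresses that
    immediately follow a CALL instruction in the protected binary.
    """
    return target in call_return_sites


def looks_like_rop(branches, call_return_sites,
                   max_gadget_len=6, chain_threshold=3):
    """Apply two heuristic policies to a branch history.

    `branches` is a list of (kind, target, length) tuples, where `length`
    is the number of instructions executed before the next branch.
    """
    short_run = 0
    for kind, target, length in branches:
        # Policy: every "ret" must land on a call-preceded instruction.
        if kind == "ret" and not is_call_preceded(target, call_return_sites):
            return True
        # Policy: a chain of consecutive short sequences is suspicious.
        if length < max_gadget_len:
            short_run += 1
            if short_run >= chain_threshold:
                return True
        else:
            short_run = 0
    return False


sites = {0x1005, 0x2005}
benign = [("ret", 0x1005, 20), ("ret", 0x2005, 3), ("ret", 0x1005, 15)]
chain = [("ret", 0x1005, 2), ("ret", 0x2005, 2), ("ret", 0x1005, 2)]
print(looks_like_rop(benign, sites))               # False
print(looks_like_rop([("ret", 0x3000, 2)], sites)) # True
print(looks_like_rop(chain, sites))                # True
```

<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Note that one long sequence resets the short-gadget counter in this sketch, so a policy of this shape depends entirely on how the thresholds are chosen.</span></div>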
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">However, as those papers (see the links in References below) show, all the current CFI solutions based on the above assumptions can be bypassed with advanced ROP gadgets. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For example, in the paper by Nicholas Carlini and David Wagner, call-preceded ROP gadgets and long termination gadgets (which flush the attack history) are used to bypass the checks of the well-known kBouncer and ROPecker CFI solutions.</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">But this attack relies on an assumption about kBouncer/ROPecker: the last branch records (LBR) are held only in at most 16 pairs of LBR MSRs. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As a matter of fact, if appropriately configured, branch records can also be stored in a variable-sized, memory-resident branch trace store (BTS) buffer, specified by the DS (Debug Store) save area that the IA32_DS_AREA MSR points to. The processor does not restrict how many branch records can be stored in the BTS buffer, and it can be configured to generate an interrupt when the number of records reaches a configured threshold (that is, when the BTS buffer is nearly full). This means that we need never lose branch history. If we ignore the performance cost of enforcing CFI checks at run time, this could be a good way to trace all the control flow information. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;">However, even if we can get all the control flow traces (to defend against the "history flushing" technique), does that mean we can completely defend against control flow attacks? </span>Unless we can perform full CFI checks against the CFG, another problem we will always face is how to design a policy that reduces false positives while still catching all ROP attacks at run time, with a performance cost that is acceptable in practice. </span></div>
<div>
</div>
<div>
<span style="font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><b>References:</b></span></div>
<div>
<ul style="background-color: white; color: #4e4e4e; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 15px; line-height: 20.7900009155273px; list-style-image: initial; list-style-position: initial; margin: 0.5em 0px; padding: 0px 2.5em;">
<li style="border: none; margin: 0px 0px 0.25em; padding: 0px;"><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Nicholas Carlini, David Wagner, ROP is Still Dangerous: Breaking Modern Defenses<br /><a href="http://www.cs.berkeley.edu/~daw/papers/rop-usenix14.pdf" style="color: #0200a4; text-decoration: none;" target="_blank">http://www.cs.berkeley.edu/~daw/papers/rop-usenix14.pdf</a></span></li>
<li style="border: none; margin: 0px 0px 0.25em; padding: 0px;"><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Enes Göktaş, Size Does Matter - Why Using Gadget-Chain Length to Prevent Code-Reuse Attacks is Hard<br /><a href="http://www.cs.columbia.edu/~mikepo/papers/chainlength.sec14.pdf" style="color: #0200a4; text-decoration: none;" target="_blank">http://www.cs.columbia.edu/~mikepo/papers/chainlength.sec14.pdf</a></span></li>
<li style="border: none; margin: 0px 0px 0.25em; padding: 0px;"><span style="font-size: large;"><span style="font-family: 'Trebuchet MS', sans-serif;">Out Of Control: Overcoming Control-Flow Integrity: </span><span style="font-family: 'Trebuchet MS', sans-serif;"><a href="http://www.ieee-security.org/TC/SP2014/papers/OutOfControl_c_OvercomingControl-FlowIntegrity.pdf" style="color: #0200a4; text-decoration: none;" target="_blank">http://www.ieee-security.org/TC/SP2014/papers/OutOfControl_c_OvercomingControl-FlowIntegrity.pdf</a></span></span></li>
<li style="border: none; margin: 0px 0px 0.25em; padding: 0px;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Technical Report TR-HGI-2014-001:<br /><a href="http://www.hgi.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2014/05/09/TR-HGI-2014-001_1_1.pdf" target="_blank">http://www.hgi.ruhr-uni-bochum.de/media/emma/veroeffentlichungen/2014/05/09/TR-HGI-2014-001_1_1.pdf</a></span></li>
</ul>
<div>
<span style="font-size: large;"><br /></span></div>
</div>
<div>
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-65062459601496232872014-11-12T18:35:00.001-08:002014-11-21T07:57:18.608-08:00How to Implement a software-based SMEP(Supervisor Mode Execution Protection) with Virtualization/Hypervisor Technology<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As my <a href="http://hypervsir.blogspot.com/2014/10/introduction-on-hardware-security.html" target="_blank">previous post</a> indicated, SMEP is a powerful security feature that is easy to deploy in a modern commodity OS. However, the feature requires hardware support in the processor. For processors that are not SMEP-capable, this post presents a <i>software-based solution that emulates SMEP functionality with the help of Virtualization/Hypervisor technology</i>. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
</span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">When the x86 processor's CR4.SMEP bit is set, system software executing in supervisor mode (CPL<3) cannot fetch instructions from any linear address with a translation for which the U/S flag is 1 (User) in every paging-structure entry controlling the translation. In other words, if SMEP is enabled, software operating in supervisor mode cannot fetch instructions from linear addresses that are accessible in user mode. When such an instruction fetch occurs, a SMEP-capable processor generates a #PF exception.</span><br />
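<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The rule in the paragraph above can be written down as a small predicate. This is just a model of the condition as stated there; the function name and parameters are made up for illustration.</span><br />

```python
def smep_faults(cpl, us_flags, smep_enabled=True):
    """Return True if an instruction fetch would raise #PF because of SMEP.

    cpl          -- current privilege level (0..3)
    us_flags     -- U/S flag (1 = User) of each paging-structure entry that
                    controls the translation, e.g. [PML4E, PDPTE, PDE, PTE]
    smep_enabled -- value of CR4.SMEP
    """
    supervisor_mode = cpl < 3
    # The page is user-accessible only if U/S is 1 at every paging level.
    user_accessible = all(flag == 1 for flag in us_flags)
    return smep_enabled and supervisor_mode and user_accessible


print(smep_faults(0, [1, 1, 1, 1]))                      # True
print(smep_faults(0, [1, 1, 1, 0]))                      # False
print(smep_faults(3, [1, 1, 1, 1]))                      # False
print(smep_faults(0, [1, 1, 1, 1], smep_enabled=False))  # False
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Any software emulation of SMEP has to reproduce this behavior: a fault for kernel-mode fetches from user pages, and nothing else changed.</span><br />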
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">So, how can we implement a software-based SMEP feature? </span></span><br />
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">This paper (<a href="https://www.cs.cmu.edu/~arvinds/pubs/secvisor.pdf" target="_blank">SecVisor: A Tiny Hypervisor to Provide Lifetime Kernel Code Integrity for Commodity OSes</a> from CyLab/CMU) presents a great idea: <span style="color: blue;">Create two separate EPT protection memory views for guest kernel (ring 0) and user (ring 3) mode respectively, with different EPT permissions for the corresponding GPA->HPA translations, and then switch between these two EPT page table views by intercepting kernel<->user mode switches</span>. On an x86/Intel processor, the hypervisor can configure different VMCS EPTP pointers (which point to different Extended </span><span style="font-family: 'Trebuchet MS', sans-serif;">Page </span><span style="font-family: 'Trebuchet MS', sans-serif;">Tables</span><span style="font-family: Trebuchet MS, sans-serif;">) and switch among them at the appropriate time.</span></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
To make the discussion easier, we can call these two guest memory translation tables (pointed to by two different EPTP pointers) protected memory views: <span style="color: blue;">the one used in guest Kernel mode is named <i><b>Kernel View;</b></i> the other, used in User mode, is named <b><i>User View</i></b></span>. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
Besides, as that paper indicates, an identity map (GPA=HPA) is created by default in the EPT page tables of both views, but the EPT page table entry permissions may differ between the views for the same GPA. These differing permissions are the key to emulating SMEP behavior; I will talk about them later.</span><br />
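<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As a minimal sketch of the two views (not SecVisor's actual code), the C snippet below models each view as a flat permission table indexed by guest page frame number. The names ept_view, build_views and page_kind, the page classification, and the 8-page toy guest are assumptions made for illustration. Only the R/W/X bit positions (bits 0 to 2 of an EPT entry) come from the Intel SDM. A real EPT is a multi-level tree; the identity map is implied here by using the page frame number directly as the index.</span><br />

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EPT_R 0x1u  /* EPT entry bit 0: read access */
#define EPT_W 0x2u  /* EPT entry bit 1: write access */
#define EPT_X 0x4u  /* EPT entry bit 2: execute access */

#define NPAGES 8    /* toy guest with 8 physical pages (assumption) */

/* One view: leaf permissions only, indexed by guest page frame number. */
struct ept_view {
    uint8_t perm[NPAGES];
};

/* Hypothetical classification of each guest physical page. */
enum page_kind { KERNEL_CODE, KERNEL_DATA, USER_PAGE };

/* Same GPA->HPA mapping in both views, different permissions per view. */
static void build_views(const enum page_kind kind[NPAGES],
                        struct ept_view *kernel_view,
                        struct ept_view *user_view)
{
    for (size_t pfn = 0; pfn < NPAGES; pfn++) {
        switch (kind[pfn]) {
        case KERNEL_CODE:
            kernel_view->perm[pfn] = EPT_R | EPT_X;   /* executable only in Kernel View */
            user_view->perm[pfn]   = EPT_R;           /* no X: kernel entry traps */
            break;
        case KERNEL_DATA:
            kernel_view->perm[pfn] = EPT_R | EPT_W;   /* never executable */
            user_view->perm[pfn]   = EPT_R | EPT_W;
            break;
        case USER_PAGE:
            kernel_view->perm[pfn] = EPT_R | EPT_W;   /* no X: SMEP-like behavior */
            user_view->perm[pfn]   = EPT_R | EPT_W | EPT_X;
            break;
        }
    }
}

/* True if an instruction fetch from page `pfn` is allowed in view `v`. */
static bool can_execute(const struct ept_view *v, size_t pfn)
{
    return pfn < NPAGES && (v->perm[pfn] & EPT_X) != 0;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The point of the sketch is that a user page is executable only in User View and a kernel code page only in Kernel View, so any instruction fetch that crosses the boundary faults in the currently active view.</span><br />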
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span><span style="font-family: Trebuchet MS, sans-serif;">By intercepting guest kernel/user mode switches, the hypervisor can do the following:</span></span><br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Switch to <i><b>Kernel View</b></i> when the guest logical processor enters kernel mode;<br />On x86 processors there are several ways for a logical processor to enter kernel mode. On Windows, for example, these are interrupts/faults/traps (through the IDT) and syscall instructions. Based on my previous project experience, other mechanisms such as task gates (used only for NMI on 32-bit OS) and call gates are otherwise not used by Windows. </span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Switch to <b><i>User View</i></b> when the guest logical processor leaves kernel mode (or enters user mode);</span></li>
</ol>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">So the question now is: how does the hypervisor get notified whenever a kernel/user mode switch happens?</span><br />
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">SecVisor does it as shown in the pictures below (snapshots from this <a href="http://quning.org/self/cmu_secvisor_ppt.pdf" target="_blank">link</a>): In <i><b>User View</b></i>, the <u>Execution</u> permission is removed in the EPT page tables for kernel code pages. Whenever the processor enters kernel mode and fetches the entry-point instruction from a kernel code page, an EPT violation vmexit occurs and control is transferred to the hypervisor (SecVisor), so SecVisor can switch to <b><i>Kernel View</i></b> by updating the EPTP pointer in the VMCS. Similarly, because user code pages are not executable in <b><i>Kernel View</i></b>, the first instruction fetch after leaving kernel mode also traps, and we can switch back to <i style="font-weight: bold;">User View</i>. </span></span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgACZD98ZttgjZ39N6pUMlFUs-2Rz5E7ET68fKKNC_12Rm66JOnAOVoKc9ak59-Ky-VCrzSp4rc_T-fGjVqGmpYz0Rq1rxbE7uzuc9N1njjMjHNIGaNK853qOhxWxWE-9kxt2wcOoepjE0/s1600/secvisor1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-size: large;"><img border="0" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgACZD98ZttgjZ39N6pUMlFUs-2Rz5E7ET68fKKNC_12Rm66JOnAOVoKc9ak59-Ky-VCrzSp4rc_T-fGjVqGmpYz0Rq1rxbE7uzuc9N1njjMjHNIGaNK853qOhxWxWE-9kxt2wcOoepjE0/s640/secvisor1.png" width="640" /></span></a></div>
<span style="font-size: large;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-size: large;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgH4pRpji1xJc9Sgij9ec2YM1AVA-R-0nIP4RWl0cfXoFyFkZEeUMX0bD_UPjHTiFuebHNhCEbfJtNThVZM25VSfdlnZSsZ_LcFquhjV4LBARiVI8wctq3MlKhckYnAS-0lrlCBGEJ4KWA/s1600/secvisor2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-size: large;"><img border="0" height="484" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgH4pRpji1xJc9Sgij9ec2YM1AVA-R-0nIP4RWl0cfXoFyFkZEeUMX0bD_UPjHTiFuebHNhCEbfJtNThVZM25VSfdlnZSsZ_LcFquhjV4LBARiVI8wctq3MlKhckYnAS-0lrlCBGEJ4KWA/s640/secvisor2.png" width="640" /></span></a></div>
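<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The view switching shown in the pictures can be sketched roughly as follows. This is a toy model under stated assumptions, not SecVisor's handler: the vcpu struct, the exec_ok table and the 4-page layout are hypothetical, and the view field merely stands in for the EPTP field of the VMCS. The handler is meant to be called on an EPT violation caused by an instruction fetch, with the guest CPL at the time of the fetch.</span><br />

```c
#include <stdbool.h>

#define NPAGES 4

enum view_id { USER_VIEW = 0, KERNEL_VIEW = 1 };
enum action  { RESUME_GUEST, REPORT_VIOLATION };

/* Toy per-vCPU state; `view` stands in for the EPTP field of the VMCS. */
struct vcpu {
    enum view_id view;
    unsigned switches;   /* number of view switches so far */
};

/* exec_ok[view][pfn]: is guest page `pfn` executable in that view?
 * Pages 0-1 are kernel code, pages 2-3 are user code (assumption). */
static const bool exec_ok[2][NPAGES] = {
    [USER_VIEW]   = { false, false, true,  true  },
    [KERNEL_VIEW] = { true,  true,  false, false },
};

/* Handle an EPT violation caused by an instruction fetch from page `pfn`. */
static enum action on_exec_violation(struct vcpu *v, unsigned cpl, unsigned pfn)
{
    if (pfn >= NPAGES)
        return REPORT_VIOLATION;

    /* Entering the kernel: the fetch targets approved kernel code. */
    if (v->view == USER_VIEW && exec_ok[KERNEL_VIEW][pfn]) {
        v->view = KERNEL_VIEW;
        v->switches++;
        return RESUME_GUEST;
    }

    /* Leaving the kernel: the fetch happens at CPL 3 from a user code page. */
    if (v->view == KERNEL_VIEW && cpl == 3 && exec_ok[USER_VIEW][pfn]) {
        v->view = USER_VIEW;
        v->switches++;
        return RESUME_GUEST;
    }

    /* Anything else, e.g. CPL 0 fetching from a user page, is not a
     * legitimate mode switch and is left to the policy check. */
    return REPORT_VIOLATION;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Note that the CPL test on the way out matters: without it, a kernel-mode jump into a user page would simply flip the view and succeed, which is exactly the case SMEP is supposed to stop.</span><br />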
<span style="font-size: large;"><span style="font-family: 'Trebuchet MS', sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Now it becomes clear how to emulate SMEP behavior. </span></span><br />
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Assume that the guest logical processor is running in kernel mode, so the EPTP points to the mapping tables of <i><b>Kernel View</b></i>, and assume that <i><b>only</b></i> approved code (e.g. the kernel and trusted LKM modules) has EPT execute permission in <b style="font-style: italic;">Kernel View</b> (see the picture below). Suppose further that a kernel vulnerability can be exploited by malware to execute arbitrary user mode code. When the logical processor starts to execute user-accessible code in kernel mode, an EPT violation is generated because user mode code is not executable in EPT <i><b>Kernel View</b></i>. </span></span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiauyasSxZp5PDAEllPtM2VKiU_9ZQu-MnF5laYc5Quo6x6QeVT_ED153FSbWfSzJOsjw8tTE0Yv1xXYlBWooP2K6RPIanljxKh6OtnXnNZQ8gcCtg_ku8-xNa9afLeIGHjNw49J3iTtWM/s1600/secvisor3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-size: large;"><img border="0" height="484" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiauyasSxZp5PDAEllPtM2VKiU_9ZQu-MnF5laYc5Quo6x6QeVT_ED153FSbWfSzJOsjw8tTE0Yv1xXYlBWooP2K6RPIanljxKh6OtnXnNZQ8gcCtg_ku8-xNa9afLeIGHjNw49J3iTtWM/s640/secvisor3.png" width="640" /></span></a></div>
<span style="font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="color: blue; font-family: Trebuchet MS, sans-serif;">When the hypervisor gets control, the following policy can be applied to check whether the execution (instruction fetch) violates SMEP:</span></span><br />
<ol>
<li><span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;">Read the current CPL from the guest-state area of the VMCS to see whether it is ZERO (kernel mode);</span></li>
<li><span style="color: blue; font-size: large;"><span style="font-family: Trebuchet MS, sans-serif;">Get the current guest CR3 value and the faulting guest linear address from the VMCS (for an EPT violation caused by an instruction fetch, that address is the guest RIP), then traverse the guest page tables to see whether the U/S flag (accessible in user mode) is ONE in every paging-structure entry. </span></span></li>
</ol>
<span style="font-size: large;"><br /></span>
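<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The two checks in the list above can be sketched as a small page-table walk. This is an illustration under assumptions, not production hypervisor code: read_gpa_fn is a hypothetical accessor for guest physical memory, and toy_mem/toy_map only exist so the sketch can run outside a hypervisor (4-level paging, all tables sharing four toy pages). The bit positions (P = bit 0, U/S = bit 2, PS = bit 7) follow the Intel SDM.</span><br />

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_P     (1ull << 0)            /* present */
#define PTE_US    (1ull << 2)            /* 1 = accessible in user mode */
#define PTE_PS    (1ull << 7)            /* large page (PDPTE/PDE only) */
#define ADDR_MASK 0x000FFFFFFFFFF000ull  /* bits 51:12 */

/* Hypothetical accessor: read one 64-bit entry from guest physical memory. */
typedef uint64_t (*read_gpa_fn)(uint64_t gpa);

/* True if every paging-structure entry translating `la` has U/S = 1. */
static bool is_user_accessible(read_gpa_fn read_gpa, uint64_t cr3, uint64_t la)
{
    uint64_t table = cr3 & ADDR_MASK;
    for (int level = 3; level >= 0; level--) {       /* PML4, PDPT, PD, PT */
        unsigned idx = (unsigned)((la >> (12 + 9 * level)) & 0x1FF);
        uint64_t e = read_gpa(table + 8ull * idx);
        if (!(e & PTE_P))
            return false;                            /* not mapped at all */
        if (!(e & PTE_US))
            return false;                            /* supervisor-only here */
        if ((level == 2 || level == 1) && (e & PTE_PS))
            break;                                   /* 1 GiB or 2 MiB page */
        table = e & ADDR_MASK;
    }
    return true;
}

/* Policy from the post: CPL is zero and the fetched address is user-accessible. */
static bool is_smep_like_violation(read_gpa_fn read_gpa, unsigned cpl,
                                   uint64_t cr3, uint64_t rip)
{
    return cpl == 0 && is_user_accessible(read_gpa, cr3, rip);
}

/* Toy guest memory for the demo: page N lives at GPA N * 4096. */
static uint64_t toy_mem[4][512];

static uint64_t toy_read(uint64_t gpa)
{
    return toy_mem[gpa >> 12][(gpa & 0xFFF) >> 3];
}

/* Map `la` through toy pages 0..3 (CR3 = 0) with the given U/S flags. */
static void toy_map(uint64_t la, uint64_t upper_flags, uint64_t leaf_flags)
{
    toy_mem[0][(la >> 39) & 0x1FF] = 0x1000 | PTE_P | upper_flags;
    toy_mem[1][(la >> 30) & 0x1FF] = 0x2000 | PTE_P | upper_flags;
    toy_mem[2][(la >> 21) & 0x1FF] = 0x3000 | PTE_P | upper_flags;
    toy_mem[3][(la >> 12) & 0x1FF] = 0x4000 | PTE_P | leaf_flags;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For example, after toy_map(0x401000, PTE_US, PTE_US), a fetch from 0x401000 counts as a violation at CPL 0 but not at CPL 3, and an address whose leaf entry lacks U/S is never reported.</span><br />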
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;">If both conditions above are true, then we catch a SMEP-like violation in guest kernel mode. </span>
<br /><br /><i><span style="color: blue;">Challenges:</span></i></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">However, there are several challenges in implementing this software-based SMEP feature with virtualization technology.</span><br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Performance impact.<br />Because this design creates two EPT memory protection views (Kernel View and User View) and switches between them at run time, the hypervisor has to trap every kernel entry and exit. This introduces a significant performance cost because kernel/user mode switches are normally very frequent. <br /><br /><span style="color: blue;">I think one way to switch EPTP pointers (views) without a VMExit is to leverage newer virtualization features, such as <a href="http://hypervsir.blogspot.com/2014/10/thoughts-on-virtualization-exception.html" target="_blank">Virtualization Exception (#VE) and the EPTP switching function (VMFUNC) described in my previous post</a>, together with the<a href="http://hypervsir.blogspot.com/2014/11/using-lbr-last-branch-record-feature-to.html" target="_blank"> IDT Shadow/Virtualization technique</a> from another post, to trap every kernel/user mode switch caused by interrupt/trap/fault events</span>. However, on machines that support #VE/VMFUNC, SMEP is also available:-)<br /><br />For mode switches caused by syscall/sysret, you can brainstorm how to handle them without a vmexit!</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In <b><i>Kernel View</i></b>, we configure all kernel code as executable in the EPT tables. When an LKM module is loaded or unloaded, we must immediately update the module memory to be executable in <b><i>Kernel View</i></b> and non-executable in <b><i>User View</i></b>.<br /><br />The authors of the paper solve this by adding code to the load_module() and free_module() functions.<br /><br />However, without guest kernel code changes, I think module loading can be handled with a <span style="color: blue;">lazy solution</span>: when a newly loaded LKM module runs for the first time in kernel mode, an EPT violation occurs; the hypervisor can then check whether it is a trusted LKM module and, if so, make that LKM code page executable in <b><i>Kernel View</i></b> and remove the execute permission in <i><b>User View</b></i>. But how do we update the LKM code page EPT permissions in <b><i>Kernel View</i></b> when such an LKM module is unloaded from the kernel?</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">When memory is low, will Linux page out or swap out LKM code pages to disk? <br />I know this is true on Windows, but I have no idea whether Linux does the same thing. (Can anybody tell me?)<br />If it does, then without guest kernel hooks it is also a challenge to keep the LKM code page permissions in the EPT <b><i>Kernel View</i></b> and <b><i>User View</i></b> up to date.</span></li>
</ol>
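<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The lazy approach for module loading from challenge 2 might look like the sketch below. Everything here is hypothetical: the views struct, the is_trusted_fn callback and demo_policy are placeholders, and how a real hypervisor would decide that a page belongs to a trusted LKM (for example by measuring its contents) is left open. The unload case remains unsolved in this sketch, just as in the text.</span><br />

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16

/* Execute permission of each guest page in the two views. */
struct views {
    bool kernel_x[NPAGES];
    bool user_x[NPAGES];
};

enum verdict { ALLOW_AND_RESUME, BLOCK };

/* Hypothetical trust policy supplied by the hypervisor. */
typedef bool (*is_trusted_fn)(size_t pfn);

/* Placeholder policy for the demo: only page 5 holds trusted module code. */
static bool demo_policy(size_t pfn)
{
    return pfn == 5;
}

/* Called on an execute EPT violation while the guest runs in Kernel View. */
static enum verdict on_kernel_exec_violation(struct views *v, size_t pfn,
                                             is_trusted_fn is_trusted)
{
    if (pfn >= NPAGES || !is_trusted(pfn))
        return BLOCK;              /* unknown code stays non-executable */

    v->kernel_x[pfn] = true;       /* first run of a trusted module page */
    v->user_x[pfn]   = false;      /* never executable from User View */
    return ALLOW_AND_RESUME;
}
```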
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">
<span style="color: blue;">Note that what I'm talking about in this post is for fun. I don't think it is worth doing all of this only to emulate a SMEP-like feature with virtualization technology:(. As a matter of fact, I have yet another solution to implement a software-based SMEP feature without Virtualization/Hypervisor. Please stay tuned...in my next post.</span><br /><br /><i><b><u>Update:</u></b></i><br />Question: how would you implement a software SMAP (Supervisor Mode Access Prevention) with virtualization technology?</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /><b><i>References:</i></b></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="https://www.cs.cmu.edu/~arvinds/pubs/secvisor.pdf" target="_blank">SecVisor: A Tiny Hypervisor to Provide Lifetime Kernel Code Integrity for Commodity OSes</a>, and its presentation <a href="http://quning.org/self/cmu_secvisor_ppt.pdf" target="_blank">link</a>.</span><br />
<span style="font-size: large;"><br /></span>
<span style="font-size: large;"> </span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-71664578237472358432014-11-08T06:36:00.000-08:002014-11-20T21:53:27.833-08:00What does Transactional Synchronization Extensions (TSX) processor technology mean to vulnerability exploits (e.g. Brute Forcing)?<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://en.wikipedia.org/wiki/Transactional_Synchronization_Extensions" target="_blank">Intel Transactional Synchronization Extensions</a> (TSX) was introduced with the Haswell processor, adding hardware<a href="http://en.wikipedia.org/wiki/Transactional_memory" target="_blank"> transactional memory</a> support. It was originally designed to</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> speed up execution of multi-threaded software through lock elision. Every new technology has both a good side and an evil side, so how about the TSX extension? What can it be used for in vulnerability exploits and in their defenses?</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"></span><br />
<a name='more'></a><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">According to Intel SDM, TSX provides two software interfaces for programmers:</span><br />
<ul>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><i><b>Hardware Lock Elision</b></i> (HLE) is a legacy compatible instruction set extension (comprising the XACQUIRE and </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">XRELEASE prefixes).</span></li>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><i><b>Restricted Transactional Memory</b></i> (RTM) is a new instruction set interface (comprising the XBEGIN, XABORT, and XEND </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">instructions).</span></li>
</ul>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This post focuses on the RTM interface, which the specification describes as follows.</span></div>
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;">Software uses the XBEGIN instruction to specify the <span style="color: blue;"><b>start of the transactional region</b></span> and the XEND instruction to specify the <span style="color: blue;"><b>end of the transactional region</b></span>.</span></blockquote>
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;">The XBEGIN instruction takes an operand that provides a relative offset<br />to the <b><span style="color: blue;">fallback instruction address </span></b>if the transactional region could not be successfully executed transactionally.</span></blockquote>
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;"><span style="color: blue;"><b>A processor may abort transactional execution for many reasons</b></span>. The hardware automatically detects transactional abort conditions and restarts execution from the <span style="color: blue;"><b>fallback instruction address</b></span> with the architectural state corresponding to that at the start of the XBEGIN instruction and the EAX register updated to describe the abort status</span><span style="font-family: 'Trebuchet MS', sans-serif;">.</span></blockquote>
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
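<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make the "abort status in EAX" part concrete, here is a small decoder for that value. The bit layout follows the RTM abort status definition in the Intel SDM; the enum and function names are my own and purely illustrative. The sketch only interprets a status word and does not execute any TSX instruction, so it runs on any machine.</span><br />

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of EAX after an RTM abort, per the Intel SDM. */
#define ABORT_EXPLICIT (1u << 0)  /* caused by an XABORT instruction */
#define ABORT_RETRY    (1u << 1)  /* the transaction may succeed on retry */
#define ABORT_CONFLICT (1u << 2)  /* memory conflict with another logical processor */
#define ABORT_CAPACITY (1u << 3)  /* an internal buffer overflowed */
#define ABORT_DEBUG    (1u << 4)  /* a debug breakpoint was hit */
#define ABORT_NESTED   (1u << 5)  /* abort happened inside a nested transaction */

enum abort_cause {
    CAUSE_XABORT,
    CAUSE_CONFLICT,
    CAUSE_CAPACITY,
    CAUSE_DEBUG,
    CAUSE_UNSPECIFIED   /* no cause bit set; the SDM notes EAX can be 0 */
};

/* XABORT's 8-bit immediate (bits 31:24), valid only if ABORT_EXPLICIT is set. */
static unsigned abort_code(uint32_t eax)
{
    return (eax >> 24) & 0xFFu;
}

/* Map the status word in EAX to a single coarse cause. */
static enum abort_cause classify_abort(uint32_t eax)
{
    if (eax & ABORT_EXPLICIT) return CAUSE_XABORT;
    if (eax & ABORT_CONFLICT) return CAUSE_CONFLICT;
    if (eax & ABORT_CAPACITY) return CAUSE_CAPACITY;
    if (eax & ABORT_DEBUG)    return CAUSE_DEBUG;
    return CAUSE_UNSPECIFIED;
}

/* Hardware hint that simply re-running the transaction may work. */
static bool worth_retrying(uint32_t eax)
{
    return (eax & ABORT_RETRY) != 0;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For instance, a status of 0x2A000001 decodes as an explicit XABORT with code 42, while a status of 0 tells the fallback path only that the transaction aborted, not why.</span><br />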
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The interesting things here are as follows:</span><br />
<br />
<ol>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">What reasons can cause a transactional execution abort?</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">What does it look like when a transactional execution abort happens?</span></li>
</ol>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For the 2nd question above, the TSX specification says: </span><br />
<blockquote class="tr_bq">
<span style="font-family: Trebuchet MS, sans-serif;">The architecture ensures that updates </span><span style="font-family: Trebuchet MS, sans-serif;">performed within a transactional region that subsequently <span style="color: blue;"><b>aborts execution will never become visible</b></span>. Only a </span><span style="font-family: Trebuchet MS, sans-serif;">committed transactional execution updates architectural state. Transactional aborts never cause functional failures </span><span style="font-family: Trebuchet MS, sans-serif;">and only affect performance.</span></blockquote>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">which means </span><span style="color: blue; font-family: 'Trebuchet MS', sans-serif; font-size: large;">after a TSX abort occurs, all memory and processor register updates made after the XBEGIN instruction are discarded</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">, and from the user's perspective, the memory and register states are "restored" to the states at the start of the XBEGIN instruction. This looks similar to the behavior of try/catch or setjmp/longjmp. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">However, there are some differences. For instance, </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">any fault or trap in a transactional region that must be exposed to software will be suppressed, </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">as if the fault or trap had never </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">occurred. If any exception is not masked, that will result in a transactional abort and it will be as if the exception </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">had never occurred</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><span style="color: blue;"><br /></span></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><span style="color: blue;">As a matter of fact, all synchronous exception events (like #GP, #PF) that occur during transactional execution are suppressed as if they had never occurred, and those events are not delivered to software (e.g. the OS kernel) for handling. </span></span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Then, regarding the 1st question above, a number of events can cause a TSX abort, such as </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">XABORT, CPUID, software INT, VMX instructions, I/O instructions, ring transitions (e.g. syscall), VMExits (e.g. EPT violation), etc.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">So, now let's think about what this means for security!</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Suppose that <span style="color: #cc0000;">malicious software attempts to access (e.g. write or execute) protected memory. Normally this triggers a #PF fault, so the attempt can be detected and the software subsequently terminated by the OS kernel. But if the malicious software attempts the same thing in a transactional region after an XBEGIN instruction, then, as pointed out above, the #PF exception is suppressed, which means the OS kernel cannot even detect this access violation</span>. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A similar case applies to memory protected by EPT (extended page tables) in a VMM/hypervisor. Normally, when guest software attempts to access physical memory that the hypervisor protects with EPT, an EPT violation vmexit is triggered. Inside a transactional region, however, the access aborts the transaction and no such vmexit is triggered, so the hypervisor won't detect it. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">So basically, this means that <span style="color: blue;">before a successful attack, malicious software can make more attempts (e.g. brute-forcing) to do bad things without being caught by the OS kernel or even the hypervisor</span>. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">But right now I don't have an idea of how to use it to do something "really bad". Maybe someone else has ideas.... </span><br />
<br />
<br />
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><u><i><b>UPDATE:</b></i></u></span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Interesting!!! got this.... </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">TSX improves timing attacks against KASLR</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://labs.bromium.com/2014/10/27/tsx-improves-timing-attacks-against-kaslr/" target="_blank">http://labs.bromium.com/2014/10/27/tsx-improves-timing-attacks-against-kaslr/</a> </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<br />
<br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-66287818607212065162014-11-08T04:53:00.004-08:002014-11-13T01:30:12.218-08:00Using LBR (Last Branch Record) Feature to Detect IDT-Shadowing-Based Malicious IDT Hooking <span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Thanks to <a href="https://twitter.com/ysmoo" target="_blank">Yushi</a>, who shared a presentation (<a href="http://www.iolanes.eu/_docs/eli_asplos12_slides.pdf" target="_blank">ELI: Bare-Metal Performance for I/O Virtualization</a>) with me. That hypervisor (ELI) introduces the idea of a guest IDT shadow (or IDT virtualization) design for some specific usage models. I'm going to talk a little bit about this idea.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This post first introduces the IDT shadow from that paper, then discusses guest IDT hooking with this technique, and finally explains how to detect such hooking with the processor's LBR feature. </span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;"><b><u>IDT Shadow/Virtualization</u></b></span></span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The picture below (captured from that presentation) presents the idea of <i>Exitless Interrupt Delivery</i>. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-9bgbAwf1ytksOfHrEZVGKqRjXDSZ03gNj8Vc3_wMudHvCYGpJ1W0YOmYZBYiywAUkViKYwmXkhlM1IiFlWhYbNG4yhT1ZAsl8Qo2uLv5lD6DuxE-Y-5bXWl23z5Ph58HDlXt7JQy2XQ/s1600/VIDT.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-9bgbAwf1ytksOfHrEZVGKqRjXDSZ03gNj8Vc3_wMudHvCYGpJ1W0YOmYZBYiywAUkViKYwmXkhlM1IiFlWhYbNG4yhT1ZAsl8Qo2uLv5lD6DuxE-Y-5bXWl23z5Ph58HDlXt7JQy2XQ/s1600/VIDT.png" height="433" width="640" /></a></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The basic ideas are as follows:</span></div>
<div>
<ul>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: #990000;">Set up a guest shadow IDT table. With the help of hardware virtualization, we can easily deceive guest OS software by monitoring and trapping guest execution of the LIDT and SIDT instructions</span>.<br />We can maintain a shadow IDT table (and keep it in sync with the original one), then let the real guest logical processor's IDTR.base point to our shadow IDT table.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In the shadow IDT table, clear the Present bit or reduce IDTR.limit to trap the wanted guest interrupts/exceptions.<br />The former idea is exactly the same as the idea in my <a href="http://hypervsir.blogspot.com/2014/10/monitortrap-software-interrupt-int-08h.html" target="_blank">previous post of generating a #NP fault</a>. The latter idea (triggering #GP) is not recommended in my opinion, because decreasing IDTR.limit will cause many false positives (please correct me if I am wrong). </span></li>
</ul>
</div>
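<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The "clear the Present bit" variant can be sketched as below. This is an illustration, not ELI's code: build_shadow_idt and gate_handler are hypothetical helper names, and the sketch only manipulates in-memory copies of the table. The gate layout is the 16-byte 64-bit mode IDT descriptor from the Intel SDM. Trapping LIDT/SIDT and pointing IDTR.base at the shadow copy are outside its scope.</span><br />

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit mode IDT gate descriptor (16 bytes), layout per the Intel SDM. */
struct idt_gate {
    uint16_t offset_low;   /* handler address bits 15:0 */
    uint16_t selector;     /* code segment selector */
    uint8_t  ist;          /* bits 2:0: interrupt stack table index */
    uint8_t  type_attr;    /* bit 7: P, bits 6:5: DPL, bits 3:0: gate type */
    uint16_t offset_mid;   /* handler address bits 31:16 */
    uint32_t offset_high;  /* handler address bits 63:32 */
    uint32_t reserved;
};
_Static_assert(sizeof(struct idt_gate) == 16, "IDT gate must be 16 bytes");

#define GATE_PRESENT 0x80u
#define IDT_VECTORS  256

/* Copy the guest IDT into the shadow table, then clear P for the vectors
 * we want to trap. Delivery through a non-present gate raises #NP. */
static void build_shadow_idt(const struct idt_gate *guest_idt,
                             struct idt_gate *shadow_idt,
                             const uint8_t *trap_vectors, size_t n)
{
    memcpy(shadow_idt, guest_idt, IDT_VECTORS * sizeof(struct idt_gate));
    for (size_t i = 0; i < n; i++)
        shadow_idt[trap_vectors[i]].type_attr &= (uint8_t)~GATE_PRESENT;
}

/* Reassemble the 64-bit handler address stored in a gate. */
static uint64_t gate_handler(const struct idt_gate *g)
{
    return (uint64_t)g->offset_low
         | ((uint64_t)g->offset_mid << 16)
         | ((uint64_t)g->offset_high << 32);
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Because only the shadow copy is modified, the guest's own IDT memory stays byte-for-byte intact, and the original handler address remains available through the shadow gate when the trapped event has to be forwarded.</span><br />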
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In that paper (see <i>References </i>at the end of this post for the link of this full paper), ELI utilizes the shadow IDT to implement a high-performance guest/host interrupt delivery solution without changes to guest OS kernel.</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Another benefit is that a guest OS kernel integrity check (like Windows x64 OS <a href="http://en.wikipedia.org/wiki/Kernel_Patch_Protection" target="_blank">PatchGuard</a>) cannot even detect it.</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><b><u>Guest IDT vector entry hooking</u></b></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Another usage of shadow IDT is to <span style="color: #990000;">hook any one of guest IDT ISRs by changing the corresponding ISR entry in the shadow IDT table</span>. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">For example, to implement an interrupt filter for a particular interrupt, we can change the IDT entry in the shadow IDT table and let it point to our own hooking ISR entry. After that, whenever that particular interrupt is triggered, the processor first passes control to our hooking ISR routine, where we can do something (e.g. filtering) and then jump to the original OS kernel ISR for further handling. Under some circumstances, we can even have control return to our routine after the original ISR has finished (see the patent in the <i>References</i>).</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: #990000;">However, what if this kind of guest IDT hooking is done by a malicious hypervisor? How can the guest OS detect it?</span> </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><u><b>Detect IDT hooking with LBR (Last Branch Record)</b></u></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Nowadays, x86/Intel processors provide a feature called Last Branch Record (LBR). </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">When it is enabled, the processor records a running trace of the most recent branches, <span style="color: #cc0000;">interrupts, and/or exceptions </span>taken by the processor in the last branch record (LBR) stack (could be in MSRs and/or Branch Trace Store (BTS) Buffer in DS save memory area). </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To be specific, when this feature is configured by the guest OS kernel, the processor captures a trace record (e.g. the <i>LastBranchToIP </i>address) in the LBR stack whenever an interrupt is delivered. So <span style="color: #990000;">when the OS's original ISR gets executed, it can check the content of <i>LastBranchToIP</i> in the LBR stack to see whether it matches the original ISR entry. A mismatch indicates that the corresponding ISR entry has been hooked by someone else, e.g. by a malicious hypervisor</span>. </span></div>
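<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The check can be illustrated with a simulated LBR stack. This is only a model: on real hardware the from/to pairs and the top-of-stack index live in model-specific MSRs that must be read with RDMSR in ring 0, and the struct and function names below are made up for this sketch. It also assumes that LBR filtering is configured so that only far branches such as interrupt and exception delivery are recorded; otherwise the hook's own jump to the original ISR would become the most recent record and hide the hook.</span><br />

```c
#include <stdbool.h>
#include <stdint.h>

#define LBR_DEPTH 16

/* Simulated LBR stack: a ring of from/to pairs plus a top-of-stack index. */
struct lbr_stack {
    uint64_t from_ip[LBR_DEPTH];
    uint64_t to_ip[LBR_DEPTH];
    unsigned tos;                 /* index of the most recent record */
};

/* What the processor would do on a recorded (far) branch. */
static void lbr_record(struct lbr_stack *s, uint64_t from, uint64_t to)
{
    s->tos = (s->tos + 1) % LBR_DEPTH;
    s->from_ip[s->tos] = from;
    s->to_ip[s->tos]   = to;
}

/* Meant to run at the very start of the OS's original ISR: the most recent
 * recorded branch target should be the ISR entry itself. If the interrupt
 * was first delivered somewhere else, the target does not match. */
static bool isr_entry_was_hooked(const struct lbr_stack *s,
                                 uint64_t original_isr_entry)
{
    return s->to_ip[s->tos] != original_isr_entry;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In the unhooked case the last record points straight at the original ISR entry; in the hooked case it points at the hook's entry, which is what the ISR notices.</span><br />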
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">But in a hypervisor environment, the hypervisor has several ways to prevent the guest OS kernel from detecting IDT hooking with the LBR feature, e.g.,</span></div>
<div>
<ul>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Hide LBR and disable this feature by trapping the corresponding CPUID instruction (LBR capability check) and MSR read/write accesses to the LBR control MSRs.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Because the LBR stack can be stored in LBR MSRs, the hypervisor must trap reads of those MSRs and return fake values.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The hypervisor must also trap read accesses to the Branch Trace Store (BTS) buffer in the DS save area if the guest OS configures branch records to be stored in the BTS buffer as well. </span></li>
</ul>
</div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><i><u><b>References:</b></u></i></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Ravi, et al., Patent: Secure handling of interrupted events utilizing a virtual interrupt definition table (VIDT):</span></div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">
</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN/8578080" target="_blank">http://patft.uspto.gov/netacgi/nph-Parser?Sect2=PTO1&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&d=PALL&RefSrch=yes&Query=PN/8578080</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The full paper for the ELI presentation from the IBM Israel </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">research lab: Abel Gordon, et al. ELI: Bare-Metal Performance for I/O Virtualization</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://www.mulix.org/pubs/eli/eli.pdf" target="_blank">http://www.mulix.org/pubs/eli/eli.pdf</a></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
</div>
</div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-2507157266663000982014-11-06T07:02:00.002-08:002016-05-05T23:29:25.542-07:00Monitor Trap Flag (MTF) Usage in EPT-based Guest Physical Memory Monitoring<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">Monitor Trap Flag (MTF) is a flag designed for single-stepping a guest in Intel VT-x hardware virtualization on x86. When MTF is set, the guest triggers a VM exit after executing each instruction (subject to NMI and other interrupt delivery boundaries). This <a href="https://www.cerias.purdue.edu/assets/pdf/bibtex_archive/2013-5.pdf" target="_blank">paper </a>presents an idea of using MTF to let a single memory write through while monitoring modifications to the guest virtual-to-physical mapping tables (page table entries). </span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<br />
<a name='more'></a><span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">That paper (<i>SPIDER: Stealthy Binary Program Instrumentation and Debugging via Hardware Virtualization</i>) details a solution that traps changes to guest virtual-to-physical address mappings by monitoring the corresponding guest page tables. <span style="color: blue;">In my previous experience, monitoring page table entries (with read-only permission in the EPT PTE settings) causes a significant performance cost</span>. In this post, I am not challenging that solution, since it is not a product after all.</span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">As we know, EPT can be configured to monitor guest physical memory accesses with appropriate RWX permission settings. For example, for a guest data page, we can configure the corresponding EPT page table entry with !X (no-execute) permission. Then, whenever the processor fetches instructions from that guest physical page for execution (e.g. code injection for shellcode execution), an EPT violation vmexit (or <a href="http://hypervsir.blogspot.com/2014/10/thoughts-on-virtualization-exception.html" target="_blank">#VE exception</a>) will occur. </span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">However, the contents of a guest physical page might be swapped out to disk by the OS under memory pressure, and that physical page might then be remapped to another guest virtual address used by another process. In this case, we must restore the EPT permission to the default (e.g. RWX), otherwise many unwanted EPT violations will occur. </span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">One solution is to monitor the guest virtual-to-physical mapping page table entries, just as the paper does. For example, we can monitor the guest PTE page (guest physical address) with EPT read-only permission. Whenever a page remapping is required, the guest OS kernel will update the corresponding guest PTE. </span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">Since the PTE page is read-only in the EPT permissions, any change to an entry in it will trigger an EPT violation vmexit. <span style="color: blue;">After the hypervisor captures this event, it records the current values of the PTEs, then temporarily sets the PTE page to writable and lets the guest single-step (by enabling MTF) through the instruction that performs the write access. After the single step, the hypervisor reads the new values of the PTEs, sees which of them have been modified, and takes appropriate actions based on the mapping updates. After that, the hypervisor disables MTF and sets the PTE page back to read-only to capture future remapping events</span>. </span><br />
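<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">The sequence above can be sketched as a toy simulation. It does not touch real VT-x structures: the class <i>Hypervisor</i>, the function <i>guest_write</i> and the permission constants are hypothetical names, and the "guest PTE page" is just a Python list. The point is only the ordering of the steps: EPT violation, snapshot, make writable, single-step with MTF, compare, and re-protect.</span><br />

```python
# Toy simulation of the EPT read-only + MTF single-step flow (no real VMX involved).
READ, WRITE = 1, 2


class Hypervisor:
    def __init__(self, guest_pte_page):
        self.pte_page = guest_pte_page   # contents of the monitored guest PTE page
        self.ept_perm = READ             # the page is read-only in EPT
        self.mtf = False                 # Monitor Trap Flag currently disabled
        self.snapshot = None

    def on_ept_violation(self):
        """Write hit a read-only page: record old values, allow one write."""
        self.snapshot = list(self.pte_page)
        self.ept_perm = READ | WRITE
        self.mtf = True                  # next instruction boundary causes a VM exit

    def on_mtf_exit(self):
        """The guest executed one instruction: diff, then re-protect the page."""
        changed = [(i, old, new)
                   for i, (old, new) in enumerate(zip(self.snapshot, self.pte_page))
                   if old != new]
        self.mtf = False
        self.ept_perm = READ
        return changed


def guest_write(hv, index, value):
    """Model one guest store instruction targeting the monitored page."""
    if not hv.ept_perm & WRITE:
        hv.on_ept_violation()            # VM exit, then the instruction is re-executed
    hv.pte_page[index] = value
    return hv.on_mtf_exit() if hv.mtf else []


hv = Hypervisor([0] * 512)
changes = guest_write(hv, 3, 0x1000 | 1)  # hypothetical new PTE value
print(changes)
```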
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">In a real system, the guest page table may have multiple levels, changes to page table entries may be very frequent, and the minimal EPT page granularity is 4KB (too coarse for monitoring individual entries). Therefore this can only be an experimental solution, because of the huge performance penalty.</span><br />
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;"><br /></span>
<span style="font-family: "trebuchet ms" , sans-serif; font-size: large;">However, using MTF to grant a data write access and/or inspect the written content is acceptable for a data page that is written less frequently.</span><br />
<br />
<br />Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com2tag:blogger.com,1999:blog-385094692918992828.post-79378032464454149982014-11-05T04:27:00.003-08:002014-11-09T19:13:39.534-08:00BitVisor - A Thin Hypervisor Built for Enforcing I/O Device Security - Storage (USB/DISK) Encryption or File Access Monitoring <span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This post shares an idea from a paper I read recently (<a href="http://dl.acm.org/citation.cfm?id=1508311" target="_blank">BitVisor: A Thin Hypervisor for Enforcing I/O Device Security</a>). It</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> proposes a <span style="color: blue;">hypervisor-based solution for enforcing storage/disk encryption of ATA devices</span>.</span><br />
<a name='more'></a><div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As we know, direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system memory directly, independently of the CPU/processor. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As shown in the picture below (from <a href="http://en.wikipedia.org/wiki/IOMMU" target="_blank">wikipedia</a>), the processor MMU (if present) cannot directly intercept data movement between external DMA-capable devices and main memory. Even in a virtualization environment, the extended page tables (EPT or NPT) configured by the hypervisor cannot protect main memory from DMA. This is one of the reasons why IOMMU technology such as VT-d (<a href="https://software.intel.com/en-us/articles/intel-virtualization-technology-for-directed-io-vt-d-enhancing-intel-platforms-for-efficient-virtualization-of-io-devices" target="_blank">Intel Virtualization Technology for Directed I/O</a>) was introduced: to prevent malicious device drivers from attacking main memory, including hypervisor-owned memory. </span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg90GgOVrPTlh7uJ5ZchFNvEMUUcZelnD4ivs5w1yqcbYB3RVb4ArFETKj7hkaYPXij8cJbecIHuCuo6N1aj4jGJC3T2zG3jJOaHBftSZ_yEkGrEhBLErxd0f163-GNDKRHhpC880XcWFk/s1600/MMU_and_IOMMU.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg90GgOVrPTlh7uJ5ZchFNvEMUUcZelnD4ivs5w1yqcbYB3RVb4ArFETKj7hkaYPXij8cJbecIHuCuo6N1aj4jGJC3T2zG3jJOaHBftSZ_yEkGrEhBLErxd0f163-GNDKRHhpC880XcWFk/s1600/MMU_and_IOMMU.png" height="400" width="400" /></a></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;">Although the processor has no chance to intercept the data transferred between a device and main memory, it can still intercept accesses to DMA control data regions such as DMA descriptors, which store transfer information</span><br />
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;">such as the buffer address and the size of the data. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This is possible because basically all DMA host controllers use specific command and control registers to configure and trigger DMA data transfers. Those registers are commonly exposed as I/O port based and/or MMIO based registers. The former are accessed through I/O ports (e.g. with the IN/OUT or INS/OUTS instructions on x86), and the latter are accessed through <a href="http://en.wikipedia.org/wiki/Memory-mapped_I/O" target="_blank">memory-mapped I/O</a> with generic memory movement instructions (like the MOV instruction).</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In x86 virtualization, both I/O port and MMIO accesses can be monitored by the hypervisor (e.g. through the I/O bitmap configuration in the VMCS, and through appropriate EPT page permission settings). </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In BitVisor, <span style="color: blue;">a thin hypervisor intercepts any read/write access to the ATA DMA host controller's command-block registers and control-block registers. Therefore, it is easy to obtain the information </span></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;">necessary to enforce encryption</span>. For example, the hypervisor can </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">obtain the LBA and sector count by intercepting writes to these </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">registers. Similarly, the corresponding ATA disk DMA descriptors can also be monitored and controlled by the BitVisor hypervisor. </span><br />
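<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As an illustration of the I/O port side, here is a minimal sketch of how a hypervisor could mark ports for interception in the two 4KB VMX I/O bitmaps (bitmap A covers ports 0x0000-0x7FFF, bitmap B covers 0x8000-0xFFFF; a set bit means the access causes a VM exit). The helper names are hypothetical, and the port numbers assume the legacy primary ATA channel (0x1F0-0x1F7 and 0x3F6); a real controller may use other addresses. This is not BitVisor's actual code.</span><br />

```python
# Sketch: build VMX-style I/O bitmaps that intercept the legacy primary ATA ports.
PAGE_SIZE = 4096  # each I/O bitmap is one 4KB page, one bit per port


def make_io_bitmaps():
    """Return (bitmap_a, bitmap_b), all bits clear: no port is intercepted."""
    return bytearray(PAGE_SIZE), bytearray(PAGE_SIZE)


def _locate(bitmap_a, bitmap_b, port):
    # Bitmap A: ports 0x0000-0x7FFF, bitmap B: ports 0x8000-0xFFFF.
    if port < 0x8000:
        return bitmap_a, port
    return bitmap_b, port - 0x8000


def intercept_port(bitmap_a, bitmap_b, port):
    bitmap, offset = _locate(bitmap_a, bitmap_b, port)
    bitmap[offset // 8] |= 1 << (offset % 8)


def is_intercepted(bitmap_a, bitmap_b, port):
    bitmap, offset = _locate(bitmap_a, bitmap_b, port)
    return bool(bitmap[offset // 8] & (1 << (offset % 8)))


# Assumed legacy port assignment for the primary ATA channel.
ATA_COMMAND_BLOCK = range(0x1F0, 0x1F8)
ATA_CONTROL_BLOCK = (0x3F6,)

bitmap_a, bitmap_b = make_io_bitmaps()
for p in list(ATA_COMMAND_BLOCK) + list(ATA_CONTROL_BLOCK):
    intercept_port(bitmap_a, bitmap_b, p)

print(is_intercepted(bitmap_a, bitmap_b, 0x1F7))
```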
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="color: blue;">By intercepting all of the information mentioned above, the hypervisor knows where the DMA buffer is (its physical address), when data transfers start and stop, and how many bytes will be transferred to or from the external device</span>.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This paper presents a novel idea of <i><u>shadow DMA descriptors</u></i> (see Figure below, click it to enlarge) for safely intercepting the content of data </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">transferred by DMA. A shadow DMA descriptor is a shadow of the DMA descriptor of the guest OS (guest DMA descriptor). The hypervisor hands the shadow DMA descriptors to the host controller (rather than the real guest DMA descriptors). The shadow DMA descriptor specifies a memory region controlled by the hypervisor as a temporary buffer, called the shadow buffer. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">When data movement starts, the DMA host controller transfers data between the shadow buffer (rather than the real guest buffer) and the device, based on the shadow DMA descriptors. After data movement is completed, the hypervisor emulates the host controller behaviors by copying data between the shadow buffer and the guest buffer that is specified in the real guest DMA descriptor.</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggFqUEZOgAOva4tVtFr-M3ysm6P-hNmXyyqObzGfwBD3h9ERWEuEwO1CUJ0oWZFlKndacvvG3DNVs9gVFri7IPz7AgT_i-PMu84CddWXGRmFszuwolLejJ_ORX4wtPR9n0CCdVSbLvzL8/s1600/BitVisor.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggFqUEZOgAOva4tVtFr-M3ysm6P-hNmXyyqObzGfwBD3h9ERWEuEwO1CUJ0oWZFlKndacvvG3DNVs9gVFri7IPz7AgT_i-PMu84CddWXGRmFszuwolLejJ_ORX4wtPR9n0CCdVSbLvzL8/s1600/BitVisor.png" height="233" width="640" /></a></div>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Now that the hypervisor can fully control and intercept the DMA data content, encryption and decryption become easy. When guest software writes data to the ATA disk, the hypervisor encrypts it on the way from the guest buffer to the shadow buffer (which eventually goes to disk). In the reverse direction, when guest software reads data from the ATA disk, the hypervisor decrypts it on the way from the shadow buffer (coming from disk) to the guest buffer. </span><br />
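<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The data path through the shadow buffer can be sketched as follows. This is a toy model, not BitVisor's implementation: the class <i>ShadowDma</i> and its methods are hypothetical names, the "disk" is a Python dictionary, and the XOR "cipher" is only a placeholder for a real cipher and must not be used for actual encryption.</span><br />

```python
# Toy model of the shadow-buffer data path with transparent encryption.
def toy_cipher(data: bytes, key: int) -> bytes:
    """Placeholder XOR transform (its own inverse); NOT a secure cipher."""
    return bytes(b ^ key for b in data)


class ShadowDma:
    def __init__(self, key: int):
        self.key = key
        self.disk = {}  # LBA -> bytes as actually stored on the device

    def guest_write(self, lba: int, guest_buffer: bytes) -> None:
        # Hypervisor encrypts guest buffer -> shadow buffer before the transfer.
        shadow_buffer = toy_cipher(guest_buffer, self.key)
        # The host controller only ever sees the shadow buffer.
        self.disk[lba] = shadow_buffer

    def guest_read(self, lba: int) -> bytes:
        # The host controller transfers device data into the shadow buffer.
        shadow_buffer = self.disk[lba]
        # Hypervisor decrypts shadow buffer -> guest buffer after the transfer.
        return toy_cipher(shadow_buffer, self.key)


dma = ShadowDma(key=0x5A)
dma.guest_write(7, b"secret")
print(dma.disk[7] != b"secret", dma.guest_read(7))
```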
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">On the other hand, as a pretty good side effect, <span style="color: #cc0000;">we can use this solution to enforce protection against device-specific DMA attacks on a platform that lacks IOMMU (e.g. VT-d) capability</span>. For instance, the hypervisor can verify the address of </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">the guest buffers specified in the guest DMA descriptors, so that the </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">address range (address plus data size) does not overlap the hypervisor memory regions or any other protected guest memory regions. </span><br />
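<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Such a verification is essentially a range-overlap test. A minimal sketch, assuming the protected regions are known as (start, size) pairs; the function names and the example addresses are hypothetical:</span><br />

```python
# Sketch: reject guest DMA buffers that overlap any protected physical region.
def overlaps(start, size, region_start, region_size):
    """True if [start, start+size) intersects [region_start, region_start+region_size)."""
    return start < region_start + region_size and region_start < start + size


def dma_buffer_allowed(addr, size, protected_regions):
    """Allow the transfer only if the buffer touches no protected region."""
    if size <= 0:
        return False  # a descriptor with a non-positive size is treated as invalid
    return not any(overlaps(addr, size, s, n) for s, n in protected_regions)


# Hypothetical layout: the hypervisor owns 64KB starting at 1MB.
PROTECTED = [(0x100000, 0x10000)]

print(dma_buffer_allowed(0x200000, 0x1000, PROTECTED))
print(dma_buffer_allowed(0x0FF000, 0x2000, PROTECTED))
```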
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">As we can see, the hypervisor can capture the event when DMA starts. The </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">end of a DMA transfer, however, is usually signalled by a hardware interrupt, and </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">BitVisor cannot identify the ATA device that </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">issues hardware interrupts. Instead, BitVisor captures I/O accesses to the </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">status registers, because device drivers usually read status registers </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">to check whether a DMA transfer has finished successfully, </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">and write registers to acknowledge interrupts. <span style="color: #cc0000;">As an alternative, the solution in <a href="http://hypervsir.blogspot.com/2014/10/monitortrap-software-interrupt-int-08h.html" target="_blank">my previous post</a> can solve this issue by monitoring the ATA disk external interrupt</span>.</span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">To download the source code of the latest BitVisor, please go to the official site </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><a href="http://www.bitvisor.org/" target="_blank">http://www.bitvisor.org/</a></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">As the <a href="https://www.bitvisor.org/summit/slides/BitVisor-Summit-04-matsubara.pdf" target="_blank">snapshot </a>below shows (see also the slides from <a href="http://www.slideshare.net/shinagawa/20121204-bitvisor-summit" target="_blank">BitVisor Summit @2012</a>), BitVisor uses the Ring 3 layer in VMX-root mode to host various services. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN9beTY-_8g83lj7E6Sz0WESPuiLYdYYieVNsc0JN-FQJxlsQAnaXnN9QD7Trj4P1X-GpcwuvgaFHjEaojDa7CTRIJ9hrWL-CZcm_FIbfibZiSD1OWYdSwDXPuvuixXe-DaIJA0r3ku20/s1600/BitVisor-layer.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhN9beTY-_8g83lj7E6Sz0WESPuiLYdYYieVNsc0JN-FQJxlsQAnaXnN9QD7Trj4P1X-GpcwuvgaFHjEaojDa7CTRIJ9hrWL-CZcm_FIbfibZiSD1OWYdSwDXPuvuixXe-DaIJA0r3ku20/s1600/BitVisor-layer.png" height="484" width="640" /></a></div>
<br />
<br />
<br />
<br />
<span style="color: blue; font-size: large;"><u>References:</u></span><br />
<span style="font-family: Trebuchet MS, sans-serif;">TreVisor: </span><br />
<span style="font-family: Trebuchet MS, sans-serif;">OS-Independent Software-Based Full Disk Encryption & Secure Against Main Memory Attacks</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www1.cs.fau.de/filepool/projects/trevisor/trevisor.pdf" target="_blank">http://www1.cs.fau.de/filepool/projects/trevisor/trevisor.pdf</a></span><br />
<br />
<span style="font-family: Trebuchet MS, sans-serif;">OSb: OSv on BitVisor</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/yushiomote/osb-osv-on-bitvisor" target="_blank">http://www.slideshare.net/yushiomote/osb-osv-on-bitvisor</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Takahiro Shinagawa: Introduction to the BitVisor and Comparison with Xen</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/xen_com_mgr/xs-japan-2008-bitvisor-english" target="_blank">http://www.slideshare.net/xen_com_mgr/xs-japan-2008-bitvisor-english</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Dependable Cloud Computing</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/kazuhikokato/121127-37898979" target="_blank">http://www.slideshare.net/kazuhikokato/121127-37898979</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">Kernel Memory Protection by an Insertable Hypervisor which has VM Introspection and Stealth Breakpoints (IWSEC2014)</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/suzaki/international-workshop-on-security-iwsec2014" target="_blank">http://www.slideshare.net/suzaki/international-workshop-on-security-iwsec2014</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif;">A Hypervisor IPS based on Hardware Assisted Virtualization Technology</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/ffri/bh-usa08murakami" target="_blank">http://www.slideshare.net/ffri/bh-usa08murakami</a></span><br />
<br />
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com5tag:blogger.com,1999:blog-385094692918992828.post-17633451096769456042014-11-04T21:39:00.001-08:002014-11-04T21:39:23.695-08:00XEN PVH Virtualization Mode - "What Color Is Your Xen?"<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In my previous post <i><a href="http://hypervsir.blogspot.com/2014/10/why-smaller-code-size-with-xen-on-arm.html" target="_blank">Why smaller code size with XEN on ARM?</a></i>, one of the reasons I gave is that XEN on x86 must support different guest working modes for backward compatibility, because of historical limitations of x86 virtualization technology (e.g. the first version of VT-x had no hardware-assisted paging support). This post shares some useful information and links on a newer XEN virtualization mode (PVH) that I read about recently. </span><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Before the PVH virtualization mode was introduced (by Mukesh Rathor @Oracle in 2012), Xen (on x86) supported several virtualization modes, such as PV, HVM, HVM with PV drivers, and PVHVM, depending on the guest domain/OS type and the hardware capabilities of the machine. This made the XEN design pretty complicated. I think we wouldn't do it like that if the XEN on x86 project were launched today (instead of 10 years ago). This is why XEN on ARM can do better in this area. </span></div>
</div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Here are some great posts that explain why PVH mode is much better than any of the previous virtualization modes on the latest x86 processors and platforms. </span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif;"><div>
What Color Is Your Xen?</div>
<div>
<a href="http://www.brendangregg.com/blog/2014-05-07/what-color-is-your-xen.html" target="_blank">http://www.brendangregg.com/blog/2014-05-07/what-color-is-your-xen.html</a></div>
<div>
<br /></div>
<div>
The Paravirtualization Spectrum, part 1: The Ends of the Spectrum</div>
<div>
<a href="https://blog.xenproject.org/2012/10/23/the-paravirtualization-spectrum-part-1-the-ends-of-the-spectrum/" target="_blank">https://blog.xenproject.org/2012/10/23/the-paravirtualization-spectrum-part-1-the-ends-of-the-spectrum/</a></div>
<div>
<br /></div>
<div>
The Paravirtualization Spectrum, Part 2: From poles to a spectrum</div>
<div>
<a href="https://blog.xenproject.org/2012/10/31/the-paravirtualization-spectrum-part-2-from-poles-to-a-spectrum/" target="_blank">https://blog.xenproject.org/2012/10/31/the-paravirtualization-spectrum-part-2-from-poles-to-a-spectrum/</a></div>
<div>
<br /></div>
<div>
<span style="font-size: large;">At a glance, this picture below (from <a href="http://www.brendangregg.com/blog/2014-05-07/what-color-is-your-xen.html" target="_blank">What Color Is Your Xen?</a>) has a straightforward illustration of the differences among all those virtualization working modes.</span></div>
<div>
<span style="font-size: large;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7OWvqztIwt-BPI69Q7dA2rjyabxj8D3H17hiwwOoALdwnlPkyBjvZEKSwuAdSw5Imq-SZUmRCnSRwrIfxsa1R0aNR-FQ6MUZODoAlzGZMS9L8Z3Snf9OS29tGthfj2kwJUs2mK_sXLwM/s1600/xen-colors.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7OWvqztIwt-BPI69Q7dA2rjyabxj8D3H17hiwwOoALdwnlPkyBjvZEKSwuAdSw5Imq-SZUmRCnSRwrIfxsa1R0aNR-FQ6MUZODoAlzGZMS9L8Z3Snf9OS29tGthfj2kwJUs2mK_sXLwM/s1600/xen-colors.png" height="307" width="640" /></a></div>
<div>
<br /></div>
</span></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-75878601416924035932014-11-03T22:29:00.003-08:002014-11-05T07:16:36.414-08:00Unikernels: Library Operating Systems for the Cloud (OSv)<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Unlike a general-purpose operating system (like Windows or Ubuntu), OSv (<a href="http://osv.io/" target="_blank">http://osv.io/</a> from cloudius-systems) is a single-purpose operating system. It is also a kind of library operating system </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><i><span style="color: blue;">designed for the cloud </span></i>that runs on top of different hypervisors, e.g. XEN, KVM, VMware. </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">So what does OSv look like?</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> </span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A quick look at these slides (<a href="http://www.slideshare.net/dmarti1111/o-sv-usenix-atc-2014" target="_blank">http://www.slideshare.net/dmarti1111/o-sv-usenix-atc-2014</a>) reveals the two features below. </span><br />
<ul>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">A general-purpose OS has a kernel mode (ring 0) and a user mode (ring 3), but </span><span style="color: blue; font-family: 'Trebuchet MS', sans-serif; font-size: large;">OSv only has code running in ring 0; it doesn't have any code running in user mode (ring 3)</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">. This is one of the most significant differences.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The other main difference is that the <span style="color: blue;">OSv only has one single address space</span> (multiple threads allowed, </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">though</span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">) running on a hypervisor as being a single virtual appliance, which serves a single-purpose cloud service. </span></li>
</ul>
<br />
<div>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">So, each OSv holds one specific application (with multiple threads) on top of it, and the OSv itself runs as a guest OS on top of hypervisor. Application and resource isolation is guaranteed by the hypervisor.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Since an OSv instance has a single address space, it needs </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">only</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> one</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> CR3 value. Switching between processes and address spaces is no longer required, and hence no TLB flush overhead is introduced by it. This can also reduce the memory footprint of the "kernel" component and leave more memory for the application.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Regarding address translation overhead, e.g. GVA->GPA->HPA, using larger pages (2MB, or even 1GB) in both the guest virtual memory page tables and the EPT tables shortens each page walk, so the translation overhead can be reduced considerably. </span><br />
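<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make this concrete, here is a small worked calculation of the worst-case number of memory references for one two-dimensional (nested) page walk, assuming that neither the TLB nor any paging-structure cache helps. With g guest levels and h EPT levels, every guest table pointer is itself a guest physical address that needs h EPT references plus one read of the guest entry, and the final guest physical address needs h more, giving g*h + g + h references. The function name is hypothetical.</span><br />

```python
# Worst-case memory references for one nested (GVA -> GPA -> HPA) page walk,
# assuming no TLB hit and no paging-structure cache hit.
def nested_walk_refs(guest_levels: int, ept_levels: int) -> int:
    # Each guest level: ept_levels refs to translate the table pointer + 1 read.
    per_guest_level = ept_levels + 1
    # The final guest physical address needs one more EPT walk.
    return guest_levels * per_guest_level + ept_levels


# 4KB pages use 4 levels, 2MB pages 3 levels, 1GB pages 2 levels (x86-64, 4-level paging).
for name, levels in (("4KB", 4), ("2MB", 3), ("1GB", 2)):
    print(name, "pages in guest and EPT:", nested_walk_refs(levels, levels))
```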
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Note that, as the slides point out, <span style="color: blue;">syscalls (user/kernel switches) are no longer required</span>. Traditional syscalls are converted to plain function calls that stay in kernel mode. Although this can significantly reduce the performance cost, it also causes a new problem: application ABI compatibility. This means that in order to deploy an application on top of OSv, its source code may have to be modified and recompiled. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">There are many challenges; you can check </span><a href="http://osv.io/" style="font-family: 'Trebuchet MS', sans-serif; font-size: x-large;" target="_blank">http://osv.io/</a><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"> for more details. </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">But anyway, if OSv turns out to be the right direction for a cloud OS (and succeeds in supporting a rich set of applications), it will definitely compete directly with other solutions like <a href="https://www.docker.com/" target="_blank">Docker</a>. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Personally, I like this product: the simpler it is, the more I love it :-).</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Also, according to these slides (<a href="http://www.slideshare.net/yushiomote/osb-osv-on-bitvisor" target="_blank">OSb: OSv on BitVisor</a></span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">), someone is working on enabling OSv on top of </span><a href="http://hypervsir.blogspot.com/2014/11/paper-read-bitvisor-thin-hypervisor-for.html" style="font-family: 'Trebuchet MS', sans-serif; font-size: x-large;" target="_blank">BitVisor</a><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">, cool! </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span></div>
<div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><i>Some other references about Library OS:</i></span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">----------------------------------------------------</span><br />
<div>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Unikernels: Rise of the Virtual Library Operating System</span><br />
<a href="http://queue.acm.org/detail.cfm?id=2566628" target="_blank"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">http://queue.acm.org/detail.cfm?id=2566628</span></a><br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br />
XPDS14: OSv - A Modern Semi-POSIX LibraryOS - Glauber Costa</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://www.xenproject.org/help/presentations-and-videos/video/xpds14v-osv.html" target="_blank">http://www.xenproject.org/help/presentations-and-videos/video/xpds14v-osv.html</a><br /><br />Rethinking the Library OS from the Top Down:</span></div>
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><a href="http://research.microsoft.com/pubs/141071/asplos2011-drawbridge.pdf" target="_blank">http://research.microsoft.com/pubs/141071/asplos2011-drawbridge.pdf</a></span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span><br />
<br />
<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">OSv on bhyve:</span></div>
<div>
<span style="color: #0000ee; font-family: Trebuchet MS, sans-serif; font-size: large;"><u><a href="http://bhyvecon.org/osv_on_bhyve.pdf" target="_blank">http://bhyvecon.org/osv_on_bhyve.pdf</a></u></span></div>
</div>
</div>
</div>
</div>
<div>
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-86950879825429632462014-11-03T18:46:00.001-08:002014-11-12T00:37:13.961-08:00Problems arise when supporting EFI + GRUB2 + Xen with the Multiboot2 boot specification<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I previously wrote a post to <a href="http://hypervsir.blogspot.com/2014/09/limitations-of-multiboot-specification.html" target="_blank">discuss the limitations of the Multiboot boot specification</a>. Today I saw that the Xen hypervisor has similar problems. </span><br />
<br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">See the links below: </span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Daniel Kiper (from Oracle) has some proposals to solve the Xen/Multiboot2 issue on EFI/GRUB2 platforms.</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://lists.xen.org/archives/html/xen-devel/2014-05/msg02928.html" target="_blank">http://lists.xen.org/archives/html/xen-devel/2014-05/msg02928.html</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="https://lists.gnu.org/archive/html/grub-devel/2014-06/msg00016.html" target="_blank">https://lists.gnu.org/archive/html/grub-devel/2014-06/msg00016.html</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><a href="http://www.slideshare.net/xen_com_mgr/xen-in-efiworld20140801finaldk" target="_blank">http://www.slideshare.net/xen_com_mgr/xen-in-efiworld20140801finaldk</a></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To summarize, the problems are:</span><br />
<ul>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">GRUB2 calls </span><span style="background-color: white; line-height: 14.5600004196167px;"><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><i>ExitBootServices()</i> before jumping to the Xen entry point, which means all EFI Boot Services have already been terminated by the time Xen starts running.</span></span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="background-color: white; line-height: 14.5600004196167px;">The Multiboot2 specification doesn't define a 64-bit entry point or its initial machine state. So even when both Xen and GRUB2/EFI run in a 64-bit environment, GRUB2 has to switch the processor to 32-bit mode for the handover, and Xen needs a stub that switches it back to 64-bit mode. </span></span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><span style="background-color: white; line-height: 14.5600004196167px;">Xen requires some information from GRUB2, e.g. EFI tables/functions, ACPI, the memory map, VGA, EDD data, etc. Hence, extra Multiboot2 tags must be introduced to pass this information on, which in turn requires changes to be upstreamed to GRUB2.</span></span></li>
</ul>
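<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">To make the third point more concrete, here is a minimal sketch in C of how a loaded kernel could walk the Multiboot2 boot information to find the tag that carries the 64-bit EFI system table pointer. The layout (an 8-byte header followed by 8-byte-aligned tags, each starting with a type and a size) and the tag numbers follow my reading of the Multiboot2 specification. The helper names mb2_find_tag() and mb2_build_demo() are made up for illustration, and this is not the actual Xen or GRUB2 code.</span><br />

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tag type numbers as I read them in the Multiboot2 specification. */
#define MB2_TAG_END    0u
#define MB2_TAG_MMAP   6u
#define MB2_TAG_EFI64  12u  /* 64-bit EFI system table pointer */

/* Every tag starts with this header; `size` excludes trailing padding. */
struct mb2_tag {
    uint32_t type;
    uint32_t size;
};

/* Tags are padded so that each one starts on an 8-byte boundary. */
static uint32_t align8(uint32_t n) { return (n + 7u) & ~7u; }

/* Hypothetical helper: byte offset of the first tag of `type`, 0 if absent. */
static uint32_t mb2_find_tag(const uint8_t *info, uint32_t type)
{
    uint32_t total_size;
    memcpy(&total_size, info, sizeof total_size);

    uint32_t off = 8;  /* skip the total_size and reserved fields */
    while (off + sizeof(struct mb2_tag) <= total_size) {
        struct mb2_tag tag;
        memcpy(&tag, info + off, sizeof tag);
        if (tag.type == type)
            return off;
        if (tag.type == MB2_TAG_END || tag.size < sizeof tag)
            break;     /* end of list, or a malformed tag */
        off += align8(tag.size);
    }
    return 0;
}

/* Hypothetical helper: build a tiny boot info block with one EFI64 tag,
 * playing the role of the boot loader. Returns the total size in bytes. */
static uint32_t mb2_build_demo(uint8_t *buf, uint64_t efi_systab)
{
    uint32_t off = 8;

    struct mb2_tag efi = { MB2_TAG_EFI64, 16 };  /* header + 64-bit pointer */
    memcpy(buf + off, &efi, sizeof efi);
    memcpy(buf + off + sizeof efi, &efi_systab, sizeof efi_systab);
    off += align8(efi.size);

    assert((off & 7u) == 0);                     /* next tag must be aligned */
    struct mb2_tag end = { MB2_TAG_END, 8 };
    memcpy(buf + off, &end, sizeof end);
    off += 8;

    uint32_t reserved = 0;
    memcpy(buf, &off, sizeof off);               /* total_size */
    memcpy(buf + 4, &reserved, sizeof reserved);
    return off;
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A kernel that finds no tag for a piece of information it needs (here, mb2_find_tag() returning 0) has no other way to get it after ExitBootServices(), which is why new tags require matching support in the boot loader.</span><br />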
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Previously, in our own proprietary VMM, we worked around this issue by booting the guest Linux OS with the "<i><b>noefi</b></i>" flag in the vmlinuz command-line options (see this <a href="https://www.kernel.org/doc/Documentation/kernel-parameters.txt" target="_blank">link </a>for Linux kernel parameters). We did it the same way as the <a href="http://sourceforge.net/projects/tboot/" target="_blank">tboot</a> project: we only get the EFI System Descriptor Table from GRUB2 and then boot the Linux guest as usual, as if on a legacy platform (e.g. with the legacy e820 memory map format). However, Xen should NOT do it this way.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-46798983338848174062014-11-03T00:16:00.002-08:002014-11-03T01:02:01.466-08:00Debugging Bug Check (BSOD) 0x101 CLOCK_WATCHDOG_TIMEOUT in a Hypervisor/VMM Environment<div>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">I was planning to write a post on debugging Bug Check 0x101 (CLOCK_WATCHDOG_TIMEOUT) on Windows, but I happened to find this blog post, </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><a href="http://blogs.msdn.com/b/ntdebugging/archive/2011/10/26/debugging-a-clock-watchdog-timeout-bugcheck.aspx" target="_blank">Debugging a CLOCK_WATCHDOG_TIMEOUT Bugcheck</a>, from the MSFT debugging team, which explains it in great detail. However, the issue we hit is slightly different from what the MSFT team was debugging: we work in a virtualization/hypervisor environment, and Windows (7+) runs as the primary guest OS. </span></div>
<div>
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Basically, according to MSFT, a bugcheck 0x101 occurs when the clock interrupt (IRQL 28) has not been processed by every processor within a timeout. The clock interrupt is quite high in the x86 IRQL table, but the Inter-Processor Interrupt (IPI, IRQL 29) is higher still. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In our case, however, things are a little different, though I won't go into the details here.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">In uniprocessor mode, the system hangs when the issue occurs. In SMP mode, the system shows the 0x101 BSOD right after a very short stall, but sometimes it also hangs. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The root cause I eventually found was an infinite loop in the hypervisor. When this loop occurs on the BSP (the CPU runs endlessly in host VMX root mode), the symptom is a guest Windows hang without the 0x101 CLOCK_WATCHDOG_TIMEOUT BSOD. When the loop occurs on any of the APs (Application Processors), the clock watchdog timeout Blue Screen of Death appears very soon. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This is because when one of the processors runs in VMX root mode endlessly, the </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">clock interrupt (IRQL 28) is not processed by that processor within the timeout. The BSP then initiates a 0x101 BSOD and sends IPIs to the other processors, which dump the system state and put themselves into the shutdown state. When the BSP itself is the stuck processor, it cannot initiate the bugcheck, so the guest simply hangs. </span><br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;"><br /></span></div>
<div>
<br /></div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0tag:blogger.com,1999:blog-385094692918992828.post-84764896400850532592014-11-02T22:07:00.003-08:002014-11-05T17:51:29.980-08:00Security OS Design (cont.): Write Protection for Linux Kernel critical data structures (GDT, IDT, syscall table, task_struct, mm_struct,...)<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Continuing from the <a href="http://hypervsir.blogspot.com/2014/10/security-os-design-idea-to-prevent.html" target="_blank">previous post</a>, let me review what must change in the Linux kernel to prevent buffer overrun/overflow attacks from modifying critical kernel data structures such as the GDT, IDT, task_</span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">struct</span><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">, mm_struct, etc.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"></span><br />
<a name='more'></a><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Some kernel data structures never change at runtime once the operating system has finished initializing them: for example, the GDT and IDT, the system call table, or the SSDT (pointed to by nt!KeServiceDescriptorTable; see this <a href="http://resources.infosecinstitute.com/hooking-system-service-dispatch-table-ssdt/" target="_blank">link</a> for SSDT hooking on Windows). Note that on Linux, the kernel still updates some GDT entries at runtime. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Those data structures can simply be mapped with read-only permission in the page table entries. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">However, many kernel data structures, such as task_struct, mm_struct and (on Linux) the GDT, have to be mapped read-write because they change very frequently at runtime. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The good news is that this data is only changed by the kernel itself. System drivers and other LKMs must not change it, and we can even treat any change made by an LKM as illegitimate and undesired behavior. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">With this assumption, we can now look at what would have to be done in the existing Linux kernel, or in a new operating system written from scratch.</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><i><b>Kernel virtual memory management subsystem</b></i>:</span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">First of all, add a new type of memory allocation to support read-only memory allocation (with kmalloc() or even vmalloc()), for example by adding a new flag, GFP_ROMEM. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">This means that the kernel's internal memory management subsystem (e.g. the Linux slab allocator) must be extended to group RO memory chunks together in one or more RO pages (4KB or 2MB in size), and traditional RW memory chunks in separate RW pages of 4KB or 2MB. This might greatly increase the complexity of the memory management design. </span><br />
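<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The grouping idea can be sketched in user space. The following C example is only an analogy, not kernel code: it assumes a POSIX system, uses mmap()/mprotect() in place of page table updates, and uses a trivial bump allocator in place of the slab allocator. POOL_RO stands in for the proposed GFP_ROMEM flag, and pools_init(), pool_alloc() and page_of() are hypothetical names. Note that the allocator metadata lives outside the RO page, so handing out an object never requires writing to protected memory.</span><br />

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* POOL_RO stands in for the proposed GFP_ROMEM allocation type. */
enum pool_kind { POOL_RW = 0, POOL_RO = 1 };

struct pool {
    uint8_t *base;  /* start of the backing page */
    size_t   used;  /* bump pointer within the page */
    size_t   size;  /* page size in bytes */
    int      prot;  /* protection the page rests in */
};

/* Allocator metadata is kept outside the protected pages. */
static struct pool pools[2];

/* Map one page per kind and apply its resting protection. */
static int pools_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    for (int k = 0; k < 2; k++) {
        void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return -1;
        pools[k].base = p;
        pools[k].used = 0;
        pools[k].size = (size_t)page;
        pools[k].prot = (k == POOL_RO) ? PROT_READ : (PROT_READ | PROT_WRITE);
        if (mprotect(p, (size_t)page, pools[k].prot) != 0)
            return -1;
    }
    return 0;
}

/* Objects of the same kind always come from the same page, so RO and RW
 * objects never share a page. Returns NULL when the pool is exhausted. */
static void *pool_alloc(enum pool_kind kind, size_t n)
{
    struct pool *pl = &pools[kind];
    size_t rounded = (n + 15u) & ~(size_t)15u;  /* keep 16-byte alignment */

    if (rounded < n || rounded > pl->size - pl->used)
        return NULL;
    void *obj = pl->base + pl->used;
    pl->used += rounded;
    return obj;
}

/* Address of the page that contains p. */
static uintptr_t page_of(const void *p)
{
    return (uintptr_t)p & ~(uintptr_t)(pools[POOL_RW].size - 1);
}
```

<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">A real allocator would also need to grow each pool, free objects, and handle 2MB pages, which is where the extra complexity mentioned above comes from.</span><br />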
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><b><i>Memory allocation for kernel itself and drivers (or any LKMs)</i></b></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Once we add the new GFP_ROMEM type, we must define rules for using it. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Rule #1: data structures that are only modified by the core kernel itself (e.g. the scheduler) must be allocated with this new type. Drivers (and other LKMs) are not allowed to use it; a static code analysis tool can enforce this. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Rule #2: since these data structures are now read-only and the CR0.WP bit is set by default, the kernel must clear CR0.WP before writing to them. The code logic is as follows:</span><br />
<span style="font-family: Trebuchet MS, sans-serif;"><br />
</span><br />
<span style="color: blue; font-family: Trebuchet MS, sans-serif;"> disable_wp(); // clear CR0.WP bit.</span><br />
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif;"> /* write to the RO data fields here */</span></div>
<div>
<span style="color: blue; font-family: Trebuchet MS, sans-serif;"> enable_wp(); // set CR0.WP bit again.</span><br />
<div>
<br />
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">At the same time, legitimate drivers (LKMs) are not allowed to change the CR0.WP bit (enforced by code scanning).</span><br />
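<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">The write window can be demonstrated with a user-space analogy in C. This is only a sketch under clear assumptions: it runs on a POSIX system, and disable_wp()/enable_wp() here flip the protection of a single page with mprotect() rather than clearing the CR0.WP bit, which in a real kernel is per-CPU and affects every read-only page at once. ro_init(), kernel_update() and stray_write_faults() are hypothetical names, and the forked child plays the role of a buggy driver that writes outside the window.</span><br />

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int   *ro_data;  /* one page standing in for the RO kernel data */
static size_t ro_len;

/* Map the page, initialize it, then leave it read-only. */
static int ro_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    ro_len = (size_t)page;

    void *p = mmap(NULL, ro_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    ro_data = p;
    ro_data[0] = 0;
    return mprotect(ro_data, ro_len, PROT_READ);
}

static void set_prot(int prot)
{
    int rc = mprotect(ro_data, ro_len, prot);
    assert(rc == 0);
    (void)rc;
}

/* Analogy only: the real versions would clear and set CR0.WP. */
static void disable_wp(void) { set_prot(PROT_READ | PROT_WRITE); }
static void enable_wp(void)  { set_prot(PROT_READ); }

/* A legitimate "core kernel" update brackets its write with the window. */
static void kernel_update(int value)
{
    disable_wp();
    ro_data[0] = value;
    enable_wp();
}

/* A write outside the window, as a buggy driver might do. It runs in a
 * child process so the fault can be observed. Returns 1 if the child was
 * killed by a memory fault, 0 if the write went through, -1 on error. */
static int stray_write_faults(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *(volatile int *)ro_data = 0x41414141;
        _exit(0);  /* reached only if the protection did not hold */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">Calling kernel_update(42) succeeds and leaves the page read-only again, while the stray write in stray_write_faults() is stopped by the memory protection instead of silently corrupting the data.</span><br />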
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">With this solution, we can prevent many buffer overflow attacks, such as a driver bug that allows arbitrary kernel memory to be overwritten. </span><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">ROP (JOP) attacks might bypass it, but this security design is not intended to address such specific attacks. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Problems: </span><br />
<ol>
<li><span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">This requires a tremendous number of changes to the existing Linux kernel. It would be a better fit if we planned to write our own operating system from scratch. </span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Performance impact: clearing and setting the CR0.WP bit costs many extra cycles. But this can be optimized, and the real impact might not be that big.</span></li>
<li><span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Interrupts or NMIs that arrive between disable_wp() and enable_wp() need to be considered. This is just an implementation detail and can be solved very easily. </span></li>
</ol>
<br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">Any other big issues? </span><br />
<br />
<br />
<span style="color: blue; font-family: Trebuchet MS, sans-serif; font-size: large;"><u><i>[Update]:</i></u></span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;">The memory <i><b>Protection Keys</b></i> feature can provide a similar kind of protection for key data structures. With this feature enabled, each process has a protection key value associated with it. On a memory access, the hardware checks that the current process's protection key matches the value associated with the memory block being accessed; if not, an exception occurs. </span><br />
<span style="font-family: Trebuchet MS, sans-serif; font-size: large;"><br /></span>
<span style="font-family: 'Trebuchet MS', sans-serif; font-size: large;">See the Wikipedia page for details: <a href="http://en.wikipedia.org/wiki/Memory_protection#Protection_keys" target="_blank">http://en.wikipedia.org/wiki/Memory_protection#Protection_keys</a></span><br />
<br /></div>
</div>
Anababahttp://www.blogger.com/profile/12583828764274874899noreply@blogger.com0