SIMPLE IS BETTER: Introduction to Processor Hardware Security Features in x86 & ARM Architectures

x86 and ARM processors both provide many hardware-enforced security features, such as NX (No-eXecute) for executable space protection, that help system software engineers build a secure computing environment.

This article summarizes those security features for both the x86/Intel and ARM architectures and explains how operating systems use them.

x86/Intel processor Architecture:

Protection rings, or privilege levels. In computer science these are known as hierarchical protection domains. The x86 architecture defines four rings, ring 0 through ring 3, arranged in a hierarchy from most privileged to least privileged.

On most modern operating systems, ring 0 is referred to as kernel mode and ring 3 as user mode. Rings 1 and 2 are rarely used, although some hypervisors use them for software virtualization, a technique called ring compression. The privilege level of the currently executing program or task is given by the CPL (Current Privilege Level), which is held in the low two bits of the CS and SS segment registers.
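As a rough illustration of where the privilege level lives, the sketch below decodes a segment selector value in Python. The function name is hypothetical, and the two sample selector values are the ones commonly used by 64-bit Linux; they are assumptions for illustration, not something this article defines.

```python
# Segment selector layout on x86:
#   bits 0-1  : privilege level (equals the CPL when the selector is in CS)
#   bit  2    : table indicator (0 = GDT, 1 = LDT)
#   bits 3-15 : index of the descriptor in that table

def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "privilege_level": selector & 0x3,
        "table": "LDT" if selector & 0x4 else "GDT",
        "index": selector >> 3,
    }

# Sample values (assumed: the code segment selectors of 64-bit Linux).
KERNEL_CS = 0x10
USER_CS = 0x33

print(decode_selector(KERNEL_CS))  # privilege level 0, i.e. ring 0
print(decode_selector(USER_CS))    # privilege level 3, i.e. ring 3
```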

Some rules are defined and enforced by protection rings:

1) Some instructions can be executed only at the most privileged level (ring 0), which prevents less privileged levels (rings 1 to 3) from accessing critical processor resources. Examples are RDMSR, WRMSR, LIDT and LGDT.

2) Privilege switching is also restricted and can happen only through special instructions or events, e.g. syscall/sysenter or an interrupt/exception.

3) Access to memory is also controlled, based on ring level and page level protection (see later).

Page level protections. Modern systems generally use a hierarchy of page structures to translate virtual addresses to physical addresses. System software is responsible for configuring those page structures (such as page table entries), and the processor enforces the protections with two kinds of checks: one restricts access by privilege (ring) level; the other restricts access by page type (e.g. read-only, read/write, non-executable).
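To make the page structure bits concrete, here is a minimal Python sketch that decodes the protection-related flags of a single page table entry. The bit positions are the architectural ones for PAE and 64-bit paging; the function name and the sample entry value are made up for illustration.

```python
# Protection-related bits of an x86 page table entry (PAE / 64-bit paging).
PTE_PRESENT = 1 << 0      # P:   the mapping is valid
PTE_WRITABLE = 1 << 1     # R/W: 1 = writes allowed, 0 = read-only
PTE_USER = 1 << 2         # U/S: 1 = user page, 0 = supervisor page
PTE_NO_EXECUTE = 1 << 63  # XD/NX: 1 = instruction fetches are not allowed

def describe_pte(entry: int) -> dict:
    """Return the protection flags encoded in one page table entry."""
    return {
        "present": bool(entry & PTE_PRESENT),
        "writable": bool(entry & PTE_WRITABLE),
        "user": bool(entry & PTE_USER),
        "no_execute": bool(entry & PTE_NO_EXECUTE),
    }

# Hypothetical entry: a present, writable, non-executable user page.
sample_entry = 0x8000000012345067
print(describe_pte(sample_entry))
```

Real hardware combines these flags across every level of the paging hierarchy; the sketch looks at one entry only.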

The remainder of this section details each of those checks:

1) Privilege level protection, enforced by checking the "U/S" (User/Supervisor) bit at each level of the page structures.

The basic rule is that if the current privilege level (CPL) is user mode, the processor cannot access memory whose page structures have the U/S bit clear. In other words, a user-mode task or program cannot read, write, or fetch instructions from memory that belongs to supervisor (privileged) mode.

This protection is very useful: for example, malicious user-mode software cannot read or modify kernel or system resources. Many of the additional protection mechanisms described below build on this privilege level state.

2) Executable space protection, sometimes called XD (eXecute Disable) or NX (No-eXecute).

Operating systems use this feature to mark regions of memory as non-executable. For example, stack or heap memory may be marked as NX. This helps to prevent certain buffer-overflow exploits from succeeding, particularly those that inject code into an attacker-controlled stack or heap region and then execute it.

On Windows, this feature underpins "Hardware DEP" (Data Execution Prevention). Some other systems use it to implement "W^X", meaning that writable pages are marked non-executable by default.

Note that this feature is available only when 32-bit PAE (Physical Address Extension) mode or 64-bit mode is enabled, which is the case on modern operating systems.

3) Supervisor Mode Execution Prevention (SMEP, introduced with Ivy Bridge processors).

I think SMEP is a very powerful security feature that is also easy for system software to deploy. In my experience, it blocks most (90% or more) of the public kernel privilege escalation exploits in The Exploit Database (http://www.exploit-db.com/).

This feature is enabled by setting the SMEP bit (bit 20) in the CR4 control register. The CPU then generates a page fault whenever ring 0 (kernel mode) code attempts to execute code from a page marked as a user page (U/S = 1).

It means that with SMEP enabled, it is no longer useful for an attacker to map an arbitrary exploit payload in user mode and redirect kernel execution to it, since the CPU triggers a fault if it attempts to execute those attacker-controlled user pages in kernel mode.

4) Supervisor Mode Access Prevention (SMAP, introduced with Broadwell processors).

It defines a new SMAP bit (bit 21) in the CR4 control register. When that bit is set, any attempt to access user-space memory while running in a privileged mode leads to a page fault, unless the kernel has explicitly permitted the access.

In other words, SMAP prevents unintended data accesses to userland memory. Care must be taken, however, because the protection has to be lifted and restored around legitimate access functions in the kernel, for example copy_to_user() and copy_from_user().

Intel added two new instructions for this purpose, STAC and CLAC, which set and clear the EFLAGS.AC bit. STAC temporarily permits these legitimate accesses, and CLAC restores the protection. If the SMAP bit is set in the CR4 register, explicit supervisor-mode data accesses to user-mode pages are allowed if and only if the EFLAGS.AC bit is 1. The same AC bit is also used for alignment checking of user-mode data accesses.

What does SMAP mean for security? Kernel-mode accesses to user-mode memory in unintended ways are prohibited, so attacker-controlled pointers can no longer target user-mode memory directly. Even simple kernel bugs such as NULL pointer dereferences just trigger a SMAP access violation (page fault, #PF) instead of letting the attacker take over the kernel data flow: the memory reached through a NULL pointer lies at a user-mode address, so kernel code cannot read or write attacker-crafted data there while SMAP is active.

5) WP (Write Protect). This is a very old feature, controlled by the CR0.WP bit.

When this bit is set, supervisor-mode code cannot write to read-only pages. When it is clear, supervisor-mode code can write to read-only pages, regardless of the U/S bit setting.

This flag is often used to protect kernel code sections. Because those sections are configured as read-only pages, a CPU exception (page fault, #PF) is triggered whenever malicious kernel software (e.g. a kernel rootkit) attempts to modify kernel code pages, for example to install inline hooks for detours. WP can also protect static kernel data sections that must not change at runtime.

In addition, this flag facilitates the copy-on-write (COW) method of creating a new process (forking) used by operating systems such as Unix.
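The five page level checks above can be summarized in a small Python model. This is only a sketch under simplifying assumptions: it looks at a single page with already-combined flags, ignores implicit supervisor accesses, and uses hypothetical names (Page, Cpu, access_allowed). A return value of False stands for a page fault (#PF).

```python
from dataclasses import dataclass

@dataclass
class Page:
    user: bool        # U/S = 1: user page
    writable: bool    # R/W = 1: writes allowed
    no_execute: bool  # NX/XD = 1: instruction fetches forbidden

@dataclass
class Cpu:
    cpl: int = 3       # current privilege level (0 = kernel, 3 = user)
    wp: bool = True    # CR0.WP
    smep: bool = True  # CR4.SMEP
    smap: bool = True  # CR4.SMAP
    ac: bool = False   # EFLAGS.AC (set by STAC, cleared by CLAC)

def access_allowed(cpu: Cpu, page: Page, kind: str) -> bool:
    """kind is 'read', 'write' or 'fetch'; False models a page fault."""
    supervisor = cpu.cpl < 3
    if kind == "fetch":
        if page.no_execute:                   # NX applies at every ring
            return False
        if supervisor:
            return not (cpu.smep and page.user)  # SMEP
        return page.user                      # U/S check for user mode
    if not supervisor:
        if not page.user:                     # U/S check for user mode
            return False
        return page.writable or kind == "read"
    if page.user and cpu.smap and not cpu.ac:  # SMAP
        return False
    if kind == "write" and not page.writable and cpu.wp:  # WP
        return False
    return True

payload = Page(user=True, writable=True, no_execute=False)
kernel_text = Page(user=False, writable=False, no_execute=False)

print(access_allowed(Cpu(cpl=0), payload, "fetch"))          # blocked by SMEP
print(access_allowed(Cpu(cpl=0), payload, "read"))           # blocked by SMAP
print(access_allowed(Cpu(cpl=0, ac=True), payload, "read"))  # allowed after STAC
print(access_allowed(Cpu(cpl=0), kernel_text, "write"))      # blocked by WP
```

Flipping one flag at a time in Cpu shows which feature is responsible for each fault, which is roughly how an operating system reasons about enabling them.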

ARM Architecture:

Like the x86/Intel architecture, ARM provides equivalent hardware-enforced security features.

The ARM architecture defines several levels of execution privilege: PL0 (unprivileged, used by user applications), PL1 (privileged, covering all modes other than User mode and Hyp mode; operating system software normally executes at this level), and PL2 (Hyp mode, normally used by a hypervisor for hardware virtualization; only the Non-secure state has this privilege level).

Normally, a processor running at a higher privilege level can access the resources (memory, registers) available at the same and lower privilege levels. A Data Abort exception is generated if the processor attempts a data access that the access rights do not permit. For example, a Data Abort exception is generated if the processor is at PL0 and attempts to access a memory region marked as accessible only to privileged (PL1) accesses.

However, on an ARM processor that includes the Security Extensions, the fact that Non-secure Hyp mode executes at PL2 does not make it more privileged than the Secure PL1 modes. Secure PL1 modes can change the configuration and control settings for Non-secure operation in all modes, but Non-secure modes (even PL2) can never change the configuration and control settings for Secure operation.

ARM has a counterpart to the NX or XD attribute of x86/Intel processors, called XN (eXecute-Never).

When this bit is 1 in the corresponding long-descriptor (or, in some cases, short-descriptor) translation table entry, a Permission fault is generated if the processor attempts to execute an instruction fetched from the corresponding memory region.

In addition, the Virtualization Extensions provide controls that enforce the XN restrictions, regardless of the settings in the translation tables:

Restriction on Secure instruction fetch (SCR.SIF in the Secure Configuration Register).

When this bit is set to 1, any attempt in Secure state to execute an instruction fetched from Non-secure physical memory causes a Permission fault.

Preventing execution from writable locations. When the corresponding stage 1 MMU is enabled, writable memory can be forced to be treated as XN, regardless of the setting of the XN bit.

For example, with SCTLR.UWXN set, memory regions with unprivileged write permission are treated as XN for instruction fetches by software executing at PL1.

For details, see these control bits in the ARMv7-A (with Virtualization Extensions) reference manual: SCTLR.WXN (for Secure and Non-secure PL1&0 stage 1 translations) and HSCTLR.WXN (for Non-secure PL2 stage 1 translations).

ARM also provides a feature similar to SMEP on x86/Intel, called PXN (Privileged eXecute-Never).

When the PXN bit is 1, a Permission fault is generated if the processor is executing at PL1 and attempts to execute an instruction fetched from the corresponding memory region. If the Virtualization Extensions are implemented, then for Secure and Non-secure PL1&0 stage 1 translations, setting SCTLR.UWXN to 1 forces any region that software executing at PL0 can write to be treated as PXN for instruction fetches.
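The interaction of XN, PXN, WXN and UWXN can be sketched as a small Python model of the PL1&0 stage 1 execute check. The names (Region, fetch_allowed) are hypothetical, and the model deliberately ignores the read-permission, domain and Secure/Non-secure checks that real hardware also applies. A return value of False stands for a Permission fault.

```python
from dataclasses import dataclass

@dataclass
class Region:
    xn: bool            # XN bit in the translation table descriptor
    pxn: bool           # PXN bit in the translation table descriptor
    pl0_writable: bool  # software at PL0 may write to the region
    pl1_writable: bool  # software at PL1 may write to the region

def fetch_allowed(pl: int, region: Region,
                  wxn: bool = False, uwxn: bool = False) -> bool:
    """Model an instruction fetch at PL0 or PL1 (wxn/uwxn mirror SCTLR bits)."""
    writable = region.pl0_writable or region.pl1_writable
    # SCTLR.WXN: any writable region is forced to behave as XN.
    effective_xn = region.xn or (wxn and writable)
    # SCTLR.UWXN: regions writable from PL0 are forced to behave as PXN.
    effective_pxn = region.pxn or (uwxn and region.pl0_writable)
    if effective_xn:
        return False
    if pl == 1 and effective_pxn:
        return False
    return True

user_code = Region(xn=False, pxn=True, pl0_writable=False, pl1_writable=False)
user_heap = Region(xn=False, pxn=False, pl0_writable=True, pl1_writable=True)

print(fetch_allowed(0, user_code))             # PL0 may run its own code
print(fetch_allowed(1, user_code))             # PXN stops the kernel, like SMEP
print(fetch_allowed(1, user_heap, uwxn=True))  # UWXN stops the kernel
print(fetch_allowed(0, user_heap, wxn=True))   # WXN stops everyone
```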

However, it seems that the ARM architecture does not currently provide a SMAP-like security feature. Please correct me if I'm wrong.

For example, how can read and/or write access to PL0 memory be restricted while system software executes at PL1?

The AP (Access Permission) bits of the translation table descriptors in ARMv7 VMSA (Virtual Memory System Architecture), and the AP bits of DRACR or IRACR (Data/Instruction Region Access Control Register) in ARMv7 PMSA (Protected Memory System Architecture), provide some read and write protection, but every defined encoding gives PL1 at least as much access as PL0.
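To illustrate that last point, the sketch below tabulates the VMSAv7 AP[2:1] encodings, assuming the simplified access permission model (SCTLR.AFE = 1), and checks that no encoding grants PL0 an access that PL1 lacks. The table and function names are hypothetical; the full AP[2:0] model has more encodings but follows the same pattern.

```python
# AP[2:1] -> (PL1 access, PL0 access), simplified VMSAv7 model (assumed).
# "rw" = read/write, "r" = read-only, "" = no access.
AP_SIMPLIFIED = {
    0b00: ("rw", ""),    # read/write, PL1 only
    0b01: ("rw", "rw"),  # read/write at any privilege level
    0b10: ("r", ""),     # read-only, PL1 only
    0b11: ("r", "r"),    # read-only at any privilege level
}

def pl1_covers_pl0(table: dict) -> bool:
    """True if, for every encoding, PL1 can do everything PL0 can."""
    return all(set(pl0) <= set(pl1) for pl1, pl0 in table.values())

def pl0_only_encodings(table: dict) -> list:
    """Encodings where PL0 has an access right that PL1 lacks."""
    return [ap for ap, (pl1, pl0) in table.items() if not set(pl0) <= set(pl1)]

print(pl1_covers_pl0(AP_SIMPLIFIED))
print(pl0_only_encodings(AP_SIMPLIFIED))
```

Because no encoding denies PL1 an access that PL0 has, the AP bits alone cannot express a SMAP-style restriction.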

[Update]:
See the Memory Protection Keys mechanism:
http://en.wikipedia.org/wiki/Memory_protection#Protection_keys

Intel MPX:

Intel MPX (Memory Protection Extensions, http://en.wikipedia.org/wiki/Intel_MPX) is a set of extensions to the x86 instruction set architecture. With compiler, runtime library and operating system support, Intel MPX increases software security by checking pointer references whose normal compile-time intentions are maliciously exploited at runtime through buffer overflows. Intel MPX introduces new registers and new instructions that operate on these registers.

References:

http://en.wikipedia.org/wiki/Protection_ring
SMEP, What is It, and How to Beat It on Linux: http://vulnfactory.org/blog/2011/06/05/smep-what-is-it-and-how-to-beat-it-on-linux/
Supervisor mode access prevention: http://lwn.net/Articles/517475/
Supervisor Mode Access Prevention - by PaX: https://forums.grsecurity.net/viewtopic.php?f=7&t=3046
Recent ARM Security Improvements: https://forums.grsecurity.net/viewtopic.php?f=7&t=3292

Tuesday, May 06, 2014