Tuesday, February 25, 2014

Approaches to retrieving the physical memory map on different systems (SFI, Legacy, UEFI)

According to Wikipedia, in computer science a memory map is a data structure (which usually resides in memory itself) that indicates how the memory space is laid out. During the boot process, a memory map is passed on from the firmware to tell the operating system kernel about the memory layout. It contains information about the total memory size, the ranges of specific memory regions, and any reserved regions. It may also provide other details specific to the architecture and platform.

There is no unified standard or specification that defines the format/structure of the memory layout, the types of memory regions, and how to pass the memory map from firmware to the bootloader or operating system. The rest of this article explains the details for each type of x86 platform/system.

In a legacy PC system, the BIOS provides a service routine that the operating system's bootloader can use to get the system memory map. This routine is the INT 15h function E820h.
It is the BIOS's responsibility to set up and install the appropriate routine on vector 15h in physical memory so that it correctly reports the memory layout to the bootloader and operating system. To get the memory map, the OS bootloader must be running in real mode (or switch to real mode). IDTR.base is zero on almost all legacy PC-compatible systems (in real mode, the IDT, Interrupt Descriptor Table, is also called the IVT, Interrupt Vector Table). The bootloader then calls this BIOS routine through the software interrupt INT 15h with the EAX register set to E820h and EDX set to the signature 'SMAP'. Each call returns one entry into the buffer at ES:DI, and the bootloader repeats the call, passing back the continuation value in EBX, until EBX comes back as zero. We can refer to this link and this link for more details on how to program in real mode to detect the system memory map. Each memory region/entry carries the base address of the region, the length of the region, and the type of the region.

The definition of "region type" is compatible with the ACPI Specification (now maintained by the UEFI Forum). It defines:
  • Type 1: Usable (normal) RAM, free for the operating system to use.
  • Type 2: Reserved (unusable). It is reserved by the BIOS firmware for special purposes, and operating system software is not allowed to access it. The behavior of such accesses is undefined. For example, some chipsets ignore writes and return FFs for reads, but that is not always the case.
  • Type 3: ACPI reclaimable memory. This region stores some of the ACPI tables. After the operating system completes OSPM initialization, it can reclaim this memory.
  • Type 4: ACPI NVS memory. This area is preserved by BIOS/firmware for storing other ACPI information, such as S3 or S4 (Standby or Hibernate) code and/or data in the FACS (Firmware ACPI Control Structure). The operating system must not use this memory region.
  • Type 5: Area containing bad memory.
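To make the entry layout and the type list above concrete, here is a minimal C sketch of how a bootloader might represent the entries once they have been collected. The names e820_entry, e820_type_name and e820_usable_bytes are my own for illustration, and the packed attribute assumes GCC or Clang. It does not issue INT 15h itself; it only works on an array that has already been filled in.

```c
#include <stddef.h>
#include <stdint.h>

/* Region types from the list above (value of the "type" field). */
enum e820_type {
    E820_USABLE       = 1,
    E820_RESERVED     = 2,
    E820_ACPI_RECLAIM = 3,
    E820_ACPI_NVS     = 4,
    E820_BAD_MEMORY   = 5
};

/* One 20-byte entry: base address, length, type.
 * Struct and field names are illustrative, not taken from any header. */
struct e820_entry {
    uint64_t base;    /* physical start address of the region */
    uint64_t length;  /* size of the region in bytes */
    uint32_t type;    /* one of enum e820_type */
} __attribute__((packed));

/* Human-readable name for a type; unknown values are reported as such. */
static const char *e820_type_name(uint32_t type)
{
    switch (type) {
    case E820_USABLE:       return "Usable";
    case E820_RESERVED:     return "Reserved";
    case E820_ACPI_RECLAIM: return "ACPI reclaimable";
    case E820_ACPI_NVS:     return "ACPI NVS";
    case E820_BAD_MEMORY:   return "Bad memory";
    default:                return "Unknown";
    }
}

/* Add up the length of every usable (type 1) entry. */
static uint64_t e820_usable_bytes(const struct e820_entry *map, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        if (map[i].type == E820_USABLE)
            total += map[i].length;
    }
    return total;
}
```

A cautious consumer would probably treat any type it does not recognize the same way as Reserved rather than as free RAM.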

In a UEFI system, the approach to detecting the system memory map is completely different: the method of getting the memory map, the memory information structure, and even the memory region types all differ.

The UEFI specification defines firmware service interfaces. Any UEFI-compliant operating system, preboot software, or OS bootloader can call those service interfaces to query or control system resources under certain conditions. Two types of services are available in a compliant system: UEFI Boot Services and UEFI Runtime Services. The main difference is that Boot Services functions are only available before a successful call to ExitBootServices(), which is itself a Boot Services function, while Runtime Services functions remain available both before and after that call.

The interface function for getting the system memory map (sometimes still called the e820 table) is GetMemoryMap(), which is also one of the Boot Services. To call this function during the boot process, the preboot software or OS bootloader must meet some requirements. It must retrieve the EFI System Table pointer and the Boot Services Table pointer, then get the function pointer (physical address) of GetMemoryMap from the Boot Services Table structure. It must run in a processor mode consistent with the EFI environment (e.g. both 32-bit or both 64-bit; if not, a processor mode switch must precede the call), which means protected mode on 32-bit firmware and long mode on 64-bit firmware. If paging is enabled, memory must be identity mapped (virtual address equals physical address), because Boot Services work with physical addresses. SetVirtualAddressMap() only relocates the Runtime Services and is called after ExitBootServices(), so it cannot be used to "fix up" a GetMemoryMap() call. There are also other calling conventions to follow, e.g. interrupt status, EFLAGS, stack space and alignment. For details, please read the UEFI specification.

After calling this function, the OS bootloader gets the system memory map as an array of EFI_MEMORY_DESCRIPTOR structures, one per region. The structure has fields such as memory type, memory start address, number of pages (the length), and memory attributes (cacheability attributes, e.g. WT/WB/UC/WC, and protection attributes, e.g. write/execute/read protection).
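As a rough illustration of walking that array, the sketch below mirrors the descriptor fields using plain C types and sums the bytes of one memory type. It assumes the buffer, its size and the descriptor size have already been obtained from GetMemoryMap(); the helper name efi_bytes_of_type is hypothetical. One detail worth showing is that the walk steps by the descriptor size reported by the firmware, which may be larger than the size of the C structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFI_PAGE_SIZE 4096u  /* UEFI pages are 4 KiB */

/* Field layout of EFI_MEMORY_DESCRIPTOR, written with stdint types
 * instead of the UEFI typedefs (UINT32, EFI_PHYSICAL_ADDRESS, ...). */
typedef struct {
    uint32_t Type;           /* EFI memory type */
    uint64_t PhysicalStart;  /* start address, 4 KiB aligned */
    uint64_t VirtualStart;   /* only meaningful for runtime mappings */
    uint64_t NumberOfPages;  /* length in 4 KiB pages */
    uint64_t Attribute;      /* cacheability and protection bits */
} EFI_MEMORY_DESCRIPTOR;

/* Hypothetical helper: total bytes covered by descriptors of one type.
 * map/map_size/desc_size are the values GetMemoryMap() would report. */
static uint64_t efi_bytes_of_type(const void *map, size_t map_size,
                                  size_t desc_size, uint32_t type)
{
    const uint8_t *p = map;
    uint64_t total = 0;

    /* A stride smaller than the structure would be a malformed map. */
    if (desc_size < sizeof(EFI_MEMORY_DESCRIPTOR))
        return 0;

    for (size_t off = 0;
         off + sizeof(EFI_MEMORY_DESCRIPTOR) <= map_size;
         off += desc_size) {
        EFI_MEMORY_DESCRIPTOR d;
        memcpy(&d, p + off, sizeof d);  /* avoids alignment assumptions */
        if (d.Type == type)
            total += d.NumberOfPages * EFI_PAGE_SIZE;
    }
    return total;
}
```

Indexing the buffer as a normal C array would silently read garbage whenever the firmware uses a larger stride, which is why the loop advances by desc_size.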

Unlike the "region type" definition in the legacy system, the UEFI system defines a more fine-grained set of memory types. After ExitBootServices() is called, the types are defined as follows:

  • EfiReservedMemoryType, Not used.
  • EfiLoaderCode, The code of previously loaded applications. It is free for the operating system to use.
  • EfiLoaderData, The data of previously loaded applications, and the default data allocation type used by an application to allocate pool memory. It is free for the operating system to use.
  • EfiBootServicesCode, The code previously used by a loaded Boot Services Driver. It is free for the operating system to use.
  • EfiBootServicesData, The data previously used by a loaded Boot Services Driver, and also the default data allocation type used by a Boot Services Driver to allocate pool memory. It is free for the operating system to use.
  • EfiRuntimeServicesCode, The code used by loaded Runtime Services Drivers. The memory in this range is to be preserved by the loader and OS in the working and ACPI S1–S3 states.
  • EfiRuntimeServicesData, The data used by a loaded Runtime Services Driver and the default data allocation type used by a Runtime Services Driver to allocate pool memory. The memory in this range is to be preserved by the loader and OS in the working and ACPI S1–S3 states.
  • EfiConventionalMemory, Free (unallocated) memory not in use by UEFI. It is free for the operating system to use.
  • EfiUnusableMemory, Memory in which errors have been detected by firmware, and marked as BAD.
  • EfiACPIReclaimMemory, Memory that holds the ACPI tables. This memory is to be preserved by the loader and OS until ACPI is enabled. Once ACPI is enabled, the operating system can reclaim it and the memory in this range is available for general use.
  • EfiACPIMemoryNVS, Address space reserved for use by the firmware (e.g. the FACS table for standby/sleep). The memory in this range is to be preserved by the loader and OS in the working and ACPI S1–S3 states.
  • EfiMemoryMappedIO, Used by system firmware to request that a memory-mapped IO region be mapped by the OS to a virtual address so it can be accessed by EFI runtime services. This memory is not used by the OS. All system memory-mapped IO information used by the operating system should come from ACPI tables.
  • EfiMemoryMappedIOPortSpace, System memory-mapped IO region that is used to translate memory cycles to IO cycles by the processor. Note: there is only one region of this type. This memory is not used by the OS. All system memory-mapped IO information should come from ACPI tables.
  • EfiPalCode, Address space reserved by the firmware for code that is part of the processor. This memory is to be preserved by the loader and OS in the working and ACPI S1–S3 states. This memory may also have other attributes that are defined by the processor implementation.

Note that some of the types above have a different usage before ExitBootServices() is called than they do afterwards. For details, please take a look at the UEFI specification.

The table below maps the UEFI memory types to the legacy memory region types:
| UEFI memory type             | Legacy type  |
| EfiReservedMemoryType        | Reserved     |
| EfiRuntimeServicesCode       | Reserved     |
| EfiRuntimeServicesData       | Reserved     |
| EfiMemoryMappedIO            | Reserved     |
| EfiMemoryMappedIOPortSpace   | Reserved     |
| EfiPalCode                   | Reserved     |
| EfiUnusableMemory            | Bad memory   |
| EfiACPIReclaimMemory         | ACPI reclaim |
| EfiLoaderCode                | Usable       |
| EfiLoaderData                | Usable       |
| EfiBootServicesCode          | Usable       |
| EfiBootServicesData          | Usable       |
| EfiConventionalMemory        | Usable       |
| EfiACPIMemoryNVS             | ACPI NVS     |
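The mapping table translates directly into a switch statement. The following is a small sketch, not code from any particular bootloader: the enum values follow the order of the type list in the UEFI specification, while the LEGACY_* names and the function efi_to_legacy_type are made up for this example. Treating unknown types as Reserved is my own conservative assumption.

```c
#include <stdint.h>

/* UEFI memory types, numbered in specification order starting at 0. */
enum efi_memory_type {
    EfiReservedMemoryType = 0,
    EfiLoaderCode,
    EfiLoaderData,
    EfiBootServicesCode,
    EfiBootServicesData,
    EfiRuntimeServicesCode,
    EfiRuntimeServicesData,
    EfiConventionalMemory,
    EfiUnusableMemory,
    EfiACPIReclaimMemory,
    EfiACPIMemoryNVS,
    EfiMemoryMappedIO,
    EfiMemoryMappedIOPortSpace,
    EfiPalCode
};

/* Legacy (E820-style) region types; the names are illustrative. */
enum legacy_type {
    LEGACY_USABLE       = 1,
    LEGACY_RESERVED     = 2,
    LEGACY_ACPI_RECLAIM = 3,
    LEGACY_ACPI_NVS     = 4,
    LEGACY_BAD_MEMORY   = 5
};

/* Map a UEFI memory type (as seen after ExitBootServices) to a
 * legacy region type, following the table above. */
static uint32_t efi_to_legacy_type(uint32_t efi_type)
{
    switch (efi_type) {
    case EfiLoaderCode:
    case EfiLoaderData:
    case EfiBootServicesCode:
    case EfiBootServicesData:
    case EfiConventionalMemory:
        return LEGACY_USABLE;
    case EfiACPIReclaimMemory:
        return LEGACY_ACPI_RECLAIM;
    case EfiACPIMemoryNVS:
        return LEGACY_ACPI_NVS;
    case EfiUnusableMemory:
        return LEGACY_BAD_MEMORY;
    default:
        /* Reserved, runtime services, MMIO, PAL code, and anything
         * unknown (assumption): keep the OS away from it. */
        return LEGACY_RESERVED;
    }
}
```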

In addition to legacy and UEFI systems, Atom/SoC platforms (starting from the Intel Moorestown platform) use a new interface called the Simple Firmware Interface (SFI). SFI was developed by Intel Corporation as a lightweight method for firmware to export static tables to the operating system. It defines a few tables, e.g. CPU, APIC and memory map, that contain various data structures in memory, and all SFI tables share a common table header format. The operating system finds the system table by searching on 16-byte boundaries between physical addresses 0x000E0000 and 0x000FFFFF. The specification can be downloaded from the official SFI site.
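The search described above can be sketched in C as a scan over a byte window. This is only an illustration under a few assumptions: the header layout (4-byte signature, 32-bit length, revision, checksum, OEM fields) and the rule that all bytes of a valid table sum to zero are my reading of the SFI common header, the signature "SYST" is the system table's, and the function name sfi_find_syst is hypothetical. The window stands in for the physical range 0x000E0000 to 0x000FFFFF, which real code would have to map first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the common SFI table header (24 bytes). */
struct sfi_table_header {
    char     sig[4];           /* table signature, e.g. "SYST" */
    uint32_t len;              /* length of the whole table in bytes */
    uint8_t  rev;              /* revision */
    uint8_t  csum;             /* makes the byte sum of the table zero */
    char     oem_id[6];
    char     oem_table_id[8];
};

/* 8-bit sum over a byte range; a valid table is assumed to sum to 0. */
static uint8_t byte_sum(const uint8_t *p, size_t n)
{
    uint8_t sum = 0;
    while (n--)
        sum = (uint8_t)(sum + *p++);
    return sum;
}

/* Scan the window on 16-byte boundaries for a valid "SYST" table.
 * Returns the offset inside the window, or -1 if none is found. */
static long sfi_find_syst(const uint8_t *win, size_t win_size)
{
    for (size_t off = 0;
         off + sizeof(struct sfi_table_header) <= win_size;
         off += 16) {
        struct sfi_table_header h;
        memcpy(&h, win + off, sizeof h);

        if (memcmp(h.sig, "SYST", 4) != 0)
            continue;
        if (h.len < sizeof h || h.len > win_size - off)
            continue;  /* length must cover the header and fit the window */
        if (byte_sum(win + off, h.len) == 0)
            return (long)off;
    }
    return -1;
}
```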

The SFI specification defines an SFI Memory Map Table. The OS loader can locate this table to get the details of the system memory map. Each entry uses the same structure as in the UEFI specification, i.e. EFI_MEMORY_DESCRIPTOR. However, this table is optional: an SFI-compliant OS/loader gets the physical memory layout from this table only if the firmware is not UEFI.

Note that SFI is independent of UEFI, just like ACPI. Platform firmware that supports SFI may or may not also support UEFI. For a hardware platform, the firmware designer can choose either SFI or ACPI as the interface. An operating system can be both SFI-compliant and ACPI-compliant in a single OS executable image.

So far we've talked about the firmware interfaces for getting the memory map on different x86 platforms. Those interfaces are defined between the firmware and the OS bootloader. The interface between the OS bootloader and the operating system is another story.
For example, during the Linux boot process, the bootloader (e.g. Grub, Lilo, Gummiboot, rEFInd, Coreboot, etc.) uses a different means to pass the memory map to the Linux kernel. This interface is well defined in the Linux Boot Protocol. Hence, in this case the OS kernel doesn't need to search for a memory map table or call a firmware service routine to get the physical memory layout. Instead, when constructing the Linux boot_params, the bootloader fills in the e820_map[] field on legacy firmware and the efi_info field on UEFI firmware, so that the Linux kernel can read the physical memory layout directly from those fields in the handoff boot parameters.

One more thing: in the preboot environment, the bootloader can modify (legitimately or not) the memory map before handing off to the operating system. In this way, the pre-boot software can preserve or hide some memory regions for special purposes. Here are some examples. In a virtualization environment, the Xen hypervisor can hide itself from the guest operating system by modifying the memory map, which prevents guest software from tampering with the hypervisor's privileged data/code memory. In other cases, bootkits (malware) can do a similar thing to hide their own code and data before booting the operating system, so that traditional anti-malware software does not detect them.
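To show what "modifying the memory map" can look like in practice, here is a small C sketch that carves a range out of a usable region and marks it reserved, splitting the original entry into up to three pieces. It is only an illustration of the idea, not how Xen or any specific bootkit does it. The struct mem_region, the REGION_* constants and the function hide_range are hypothetical names, and the sketch assumes the range lies entirely inside one usable entry and that addresses do not overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REGION_USABLE   1u
#define REGION_RESERVED 2u

/* Simplified map entry for this example (not a firmware structure). */
struct mem_region {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* Mark [hide_base, hide_base + hide_len) as reserved. The range must
 * sit inside a single usable entry. Returns the new entry count, or 0
 * if no entry contains the range or the array has no room left. */
static size_t hide_range(struct mem_region *map, size_t count,
                         size_t capacity,
                         uint64_t hide_base, uint64_t hide_len)
{
    if (hide_len == 0)
        return 0;
    uint64_t hide_end = hide_base + hide_len;

    for (size_t i = 0; i < count; i++) {
        uint64_t end = map[i].base + map[i].length;
        if (map[i].type != REGION_USABLE ||
            hide_base < map[i].base || hide_end > end)
            continue;

        /* Build the replacement: optional head, hidden part, optional tail. */
        struct mem_region pieces[3];
        size_t n = 0;
        if (hide_base > map[i].base)
            pieces[n++] = (struct mem_region){ map[i].base,
                              hide_base - map[i].base, REGION_USABLE };
        pieces[n++] = (struct mem_region){ hide_base, hide_len,
                                           REGION_RESERVED };
        if (end > hide_end)
            pieces[n++] = (struct mem_region){ hide_end,
                              end - hide_end, REGION_USABLE };

        if (count + n - 1 > capacity)
            return 0;

        /* Make room after entry i, then drop the pieces in place. */
        memmove(&map[i + n], &map[i + 1],
                (count - i - 1) * sizeof map[0]);
        memcpy(&map[i], pieces, n * sizeof map[0]);
        return count + n - 1;
    }
    return 0;
}
```

An operating system that trusts the handed-off map will then never allocate from the hidden range, which is exactly why this trick works for both hypervisors and malware.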

<The End>
