Memory Hierarchy

The x86 processor chips have gone through many changes in the last 20 years. The transitions between the 16-, 32-, and 64-bit versions immediately come to mind, and they are very notable indeed, but x86 assembly programmers have had to deal with many other changes. Another big change was the introduction of a new operating mode in the 286 and 386 CPU models: protected mode. Protected mode replaced the older operating mode called real mode, and it allowed software to use features such as virtual memory and paging. These features are now used in virtually all modern operating systems that run on the x86 architecture, such as Microsoft Windows, Linux, and many others.

In this article we give an overview of the available registers and operating modes of the various x86 architectures, and we describe the basic register semantics in different operating modes.

Processors have small amounts of high-speed storage, called registers, located in the heart of the CPU. All data must be represented in a register before it can be processed, so registers play a central role: the first step of learning a new platform is usually learning the register set. The advantage of registers is that they can be accessed much more quickly than main memory, which improves the processing speed. If the CPU had to execute all calculations in main memory, performance would be painfully slow.

In the next sections we look at the registers in the 16-, 32-, and 64-bit versions of the x86 architecture. We omit special system features such as debug registers and control registers; these are not of interest to the application programmer.

Let’s take a look at the different registers available in the first x86 CPU, the 16-bit 8086:

The CPU registers of an 8086 or 8088 processor. All registers are 16 bits wide.

A total of 14 CPU registers is available in the 8086 and 8088 CPU models. The short names of the registers (AX, CS, SI) are used to refer to the registers in x86 assembly language. For example, to copy the 16-bit value from CX to SI, one would write:

MOV SI, CX

The register names reflect the intended purpose of each register, as envisioned by the 8086 designers. One should not feel overly restricted by them, however; many registers are general-purpose, and can be used freely.

Register Name | Description
General Registers
AX, BX, CX, DX | These four 16-bit registers are truly general-purpose. Programmers can use them freely. The AX register is historically one of the most commonly used registers, because it is faster than the other registers in some cases. Note that the individual 8-bit sections of the 16-bit general-purpose registers can be addressed separately. The AX register’s high byte is called AH, and the low byte is called AL.
Segment Registers
CS, DS, SS, ES | The four segment registers play an important role in the CPU’s memory management, and they are not used by application programmers to load, store or manipulate values. The 8086 architecture operates in real mode, where segment registers work together with general-purpose registers to access any memory value. A description of real mode and other operating modes can be found in the Operating Modes section. It is important to understand that every CPU instruction that accesses memory uses one of the segment registers. Every instruction has an implicit default segment register, but this can often be overridden.
Pointer & Index Registers
SI, DI | The Source Index and Destination Index registers were intended to be used in string operations, as the source and destination indices, respectively. In practice, these registers are also general-purpose.
BP | The Base Pointer is used to point to the base of the stack frame, in the case of stack-based functions. The SS register is implicitly used as the address base.
SP | The SP register contains a pointer to the top of the stack. The SS register is implicitly used as the address base.
IP | The Instruction Pointer register always works together with the CS segment register, and it points to the memory location of the instruction that will be executed next if no branch is taken.
Status Register
FLAGS | The Flags register reflects the current state of the processor; it contains flags such as the carry flag, overflow flag and zero flag. The CPU modifies them automatically after arithmetic operations, which makes it possible to determine the type of the result and to decide whether to transfer control to another part of the program. Generally you cannot access this register directly; instead, it is used by CPU instructions.
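
To make the role of the status flags more concrete, the following Python sketch models how a 16-bit addition could set the carry, zero and overflow flags. The function name `add16` and the dictionary it returns are assumptions made for illustration; only the flag definitions follow the architecture.

```python
def add16(a, b):
    """Model a 16-bit ADD and the flags it would set (illustrative helper)."""
    a &= 0xFFFF
    b &= 0xFFFF
    total = a + b
    result = total & 0xFFFF            # the register keeps only the low 16 bits
    flags = {
        "CF": total > 0xFFFF,          # carry out of bit 15 (unsigned overflow)
        "ZF": result == 0,             # the result is zero
        # signed overflow: both operands differ in sign from the result
        "OF": ((a ^ result) & (b ^ result) & 0x8000) != 0,
    }
    return result, flags
```

For example, `add16(0xFFFF, 1)` wraps around to 0 and sets both CF and ZF, while `add16(0x7FFF, 1)` yields 8000h and sets only OF. A conditional jump instruction would then inspect one of these flags to decide whether to branch.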

With the advent of Intel’s 32-bit 80386 processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and FLAGS register, but not the segment registers, were expanded to 32 bits. This is represented by prefixing an E (for Extended) to the register names in x86 assembly language. Thus, the AX register corresponds to the lowest 16 bits of the new 32-bit EAX register, SI corresponds to the lowest 16 bits of ESI, and so on. Furthermore, two segment registers were added: FS and GS.

The CPU registers of the 386 architecture. All registers are extended to 32 bits, except for the segment registers. Note that the upper 16 bits of the extended registers are not separately addressable.
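
The way the narrower register names overlay the extended register can be modeled with a small Python sketch. The class `Reg32` and its attribute names are hypothetical; they mirror the naming pattern of EAX, AX, AH and AL, which applies to the four general registers that have separately addressable high and low bytes.

```python
class Reg32:
    """Toy model of one extended register such as EAX (hypothetical class)."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF     # full 32-bit view, e.g. EAX

    @property
    def x(self):                        # low 16 bits, e.g. AX
        return self.e & 0xFFFF

    @x.setter
    def x(self, value):
        # writing AX leaves the upper 16 bits of EAX untouched
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def h(self):                        # bits 8..15, e.g. AH
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, value):
        self.e = (self.e & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def l(self):                        # bits 0..7, e.g. AL
        return self.e & 0xFF

    @l.setter
    def l(self, value):
        self.e = (self.e & 0xFFFFFF00) | (value & 0xFF)
```

With `eax = Reg32(0x12345678)`, the views `eax.x`, `eax.h` and `eax.l` read 5678h, 56h and 78h. Assigning `eax.x = 0xABCD` changes the full register to 1234ABCDh. There is deliberately no accessor for the upper 16 bits, because the architecture offers no name for them.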

The introduction of 64-bit processors by both AMD and Intel brought further changes to the CPU registers. There are different implementations, but the common denominator is usually referred to as the x86-64 architecture. It extends the 32-bit registers into 64-bit registers in a way similar to how the 16-bit to 32-bit extension was done. The extended registers carry the R prefix. In addition, eight new 64-bit general-purpose registers (R8-R15) were introduced. The segment registers remained the same as their 386 counterparts.

The CPU registers of the x86-64 architecture. Which registers can be used depends on the operating mode the CPU is running in.

We have identified the x86 CPU registers. Many of them are general-purpose and have no special semantics.

The operating mode controls how the processor sees and manages the system memory and the tasks that use it. There are five operating modes: real mode, protected mode, virtual 8086 mode, 64-bit mode and compatibility mode.

Operating Mode | Available On | Default Address Size (bits) | Addressable Memory
Real Mode | All x86 models. | 16 | 1 MB
Protected Mode | All x86 models since 286, with significant improvements in 386. | 32 | 4 GB
Virtual-8086 Mode (Protected Mode sub-mode) | All x86 models since 386. | 16 | 1 MB
Long Mode | All 64-bit x86 models. | 64 | The theoretical limit is 2^64 bytes. The limit of current processors is around 1 TB.
Compatibility Mode (Long Mode sub-mode) | All 64-bit x86 models. | 32 | 4 GB

One of the big differences between the operating modes is in the way memory addressing works. Both the amount of memory that can be addressed and the translation process from logical addresses to physical addresses may vary depending on the operating mode. Before we get into the details of each operating mode, we look at how memory is referenced in x86 assembly language.

Because the registers in the 8086 were all 16 bits wide, Intel decided to divide the address space into 64 KB segments and coordinate memory access through the use of two 16-bit values: a segment and an offset. As the names suggest, the segment part denotes the current 64 KB segment, and the offset identifies the position within that segment. The associated notation is segment:offset. The operating mode determines how this is translated to a physical memory address. In 32-bit or 64-bit processors the segments are much bigger, but a segmentation mechanism is still used; this is why all x86 processors have segment registers.

The x86 instruction set provides a number of distinct ways to specify memory locations. In the 8086 instruction set there are 17 unique ways to address memory, and newer x86 models have even more. Luckily, most ways are very similar. We provide a few examples here in Intel assembly notation. For more details you can check out these pages, but keep in mind that syntax varies depending on your assembler.

An assembly reference to memory is enclosed in square brackets: [ ]. Note that every memory reference is relative to a segment register, implicitly or explicitly.

mov ax, [102h] ; Actual address is DS:102h

Note that most data operations, such as MOV, implicitly use the DS segment register. This default segment register can usually be overridden.

mov ax, cs:[102h] ; Actual address is CS:102h

The above instruction copies 16 bits of data from the CS segment (at offset 102h) into the AX register.

This code copies a 32-bit value from the DS segment to the EAX register. The value in the EBX register is taken as the offset in the DS segment.

mov eax, [ebx] ; Actual address is DS:[ebx]

Again, the default segment is DS since it is a data operation.

In Indexed addressing mode, a register is combined with a displacement.

These instructions move 4 bytes of data from offset EBX+8 in the DS segment to the EAX register.

mov eax, 8[ebx] ; Actual address is DS:[ebx] + 8
mov eax, [ebx][8] ; Same operation
mov eax, [ebx+8] ; Same operation

The ESP and EBP registers use the SS segment by default:

mov eax, [esp-8] ; Actual address is SS:[esp] - 8

In Base/Indexed addressing mode, a base register and an index register are used in conjunction:

mov eax, [ebx][ecx] ; Actual address is DS:[ebx] + [ecx]

In Scaled Indexed addressing mode, the index register can additionally be multiplied by one, two, four, or eight.

mov eax,1000h[ebx][ebx*2] ; Actual address is DS:1000h + [ebx] + [ebx]*2

If EBX were equal to 200h in the above operation, the actual address would be DS:1600h.
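
The addressing modes above all reduce to the same arithmetic: base plus scaled index plus displacement. The Python sketch below models that calculation for the offset part only; the function name `effective_address` and its keyword parameters are assumptions made for illustration, and the segment part is left out.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Compute the offset part of a memory operand (illustrative helper).

    base, index: register values; scale: 1, 2, 4 or 8; disp: constant offset.
    The result wraps around at the address size, as register arithmetic does.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)
```

With EBX equal to 200h, `effective_address(base=0x200, index=0x200, scale=2, disp=0x1000)` gives 1600h, matching the example above. The same helper covers the simpler forms: `[ebx+8]` is a base plus a displacement of 8, and `[esp-8]` is a base plus a displacement of -8.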

On the earliest CPU models, real mode was the only available operating mode. DOS itself ran in real mode, so assembly programmers who were active in those days will remember it all too well. Real mode is characterized by a 20-bit segmented memory address space, and unlimited direct software access to all memory, I/O addresses and peripheral hardware. Real mode is provided in modern x86 CPUs for backwards compatibility with pre-80386 processors and applications, but has very limited use nowadays.

The processor enters real mode immediately after power on, so an operating system kernel, or other program, must explicitly switch to another mode if it wishes to run in anything but real mode. Switching modes is accomplished by modifying certain bits of the processor’s control registers, although some preparation is required beforehand in many cases, and some post-switch cleanup may be required. Nowadays, operating system code will quickly perform such a switch to a different mode after booting, so most people never have to deal with real mode.

A processor in real mode can address a maximum of 1 MB of memory, which requires an address size of 20 bits to access the physical memory. The registers are 16 bits wide, so a segment and an offset are combined to form the logical address. In real mode each logical address points directly to a physical memory location, because the address spaces are equivalent: both are 1 MB.

The physical memory address is calculated from the segment and offset in this way:

PhysicalAddress = Segment * 16 + Offset

In practice, the segment address is shifted 4 bits to the left (multiplication by 16 decimal), and then the offset is added.

The translation from a logical address (segment:offset) to a physical address in real mode.
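
The real mode translation can be sketched in a few lines of Python. The function name `real_mode_physical` is an assumption made for illustration. The final mask models the 20 address lines of the 8086, where a sum above 1 MB wraps around to the bottom of memory.

```python
def real_mode_physical(segment, offset):
    """Translate a real mode segment:offset pair to a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # shift the segment 4 bits to the left (multiply by 16), add the offset,
    # and keep 20 bits, as on the 8086 with its 20 address lines
    return ((segment << 4) + offset) & 0xFFFFF
```

For example, the logical address 1234h:0005h translates to the physical address 12345h, because 1234h shifted left by 4 bits is 12340h.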

While the segment address increases in steps of 16 bytes, each segment is 65,536 bytes in size, so consecutive segments overlap. This means there are up to 4,096 segment:offset pairs that refer to the same location in physical memory. For those who are still wondering: yes, real mode programming is extremely tedious.
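
The figure of 4,096 can be checked by brute force. The Python sketch below lists every segment:offset pair that maps to a given physical address; the function name `aliases` is an assumption made for illustration, and address wraparound above 1 MB is deliberately not modeled.

```python
def aliases(physical):
    """List every segment:offset pair that maps to `physical` (no wraparound)."""
    pairs = []
    for segment in range(0x10000):
        # the offset that this segment would need to reach the address
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:       # it must fit in 16 bits
            pairs.append((segment, offset))
    return pairs
```

Calling `len(aliases(0x12345))` yields 4096, with the pairs ranging from 0235h:FFF5h to 1234h:0005h. Addresses near the very bottom of memory have fewer aliases, which is why the text says "up to": physical address 0 can only be reached as 0000h:0000h in this model.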

In contrast with other operating modes, the 1MB memory space in real mode is global and unprotected. This means that all processes see the same memory space and have unrestricted read and write access to this space. A change of the active process does not change the memory space representation: it is the same from each perspective. For this reason it was very hard to implement multitasking in real mode. Luckily, the introduction of protected mode made life much easier.

Protected mode is an operating mode that allows system software to utilize features such as virtual memory, paging, safe multi-tasking, and other features designed to increase an operating system’s control over application software. It dates back quite a while: protected mode as we know it today was introduced in the 386 processor model. If you have a 32-bit operating system, you can be sure you are running in protected mode.

In addition to the segmentation mechanism (which changed considerably), a paging unit has been added as a second layer of address translation between the segmentation unit and the physical bus. The memory translation mechanism under protected mode looks like this:

An abstract representation of the x86 protected mode memory translation scheme.

In protected mode, a segment register no longer contains the address of the beginning of a segment, but rather a “selector” that points to a system-level structure called a segment descriptor. A segment descriptor contains the base address of the segment (a linear address, which the paging unit translates to a physical address when paging is enabled), the length of the segment, and the access permissions for that segment. These segment descriptors are stored in two tables: the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT). Each CPU (or core) in a computer contains a register called GDTR which stores the linear memory address of the first byte in the GDT.

A selector can be loaded into a segment register directly with instructions like MOV. The sole exception is CS, which can only be changed by instructions that affect the flow of execution, like CALL or JMP. Just like in real mode, offsets referring to locations inside the segment are combined with the base address of the segment. The result is the linear address corresponding to that offset, which equals the physical address only when paging is disabled. The offset is checked against the length of the segment, with offsets referring to locations outside the segment causing an exception. In addition, the access permissions are checked to make sure the process that is trying to access the segment has permission to do so.
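
The segmentation step described above can be sketched in Python as follows. This is a heavily simplified model under stated assumptions: the names `SegmentDescriptor`, `SegmentFault` and `translate` are hypothetical, the descriptor is an ordinary object rather than the packed 8-byte structure the CPU reads, the access permissions are reduced to a single writable flag, and privilege level checks are omitted. The selector layout (table index in bits 3 to 15, table indicator in bit 2) follows the architecture.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Stands in for the exception the CPU would raise (hypothetical name)."""


@dataclass
class SegmentDescriptor:
    base: int        # linear address where the segment starts
    limit: int       # highest valid offset inside the segment
    writable: bool   # simplified stand-in for the access permissions


def translate(selector, offset, gdt, ldt, write=False):
    """Turn selector:offset into a linear address, with the checks described."""
    index = selector >> 3              # bits 3..15: index into the table
    use_ldt = bool(selector & 0x4)     # bit 2: 0 selects the GDT, 1 the LDT
    table = ldt if use_ldt else gdt
    if index >= len(table) or table[index] is None:
        raise SegmentFault("selector does not name a valid descriptor")
    desc = table[index]
    if offset > desc.limit:
        raise SegmentFault("offset lies outside the segment")
    if write and not desc.writable:
        raise SegmentFault("segment is not writable")
    return desc.base + offset          # linear address, input to the paging unit
```

As a usage example, take a GDT whose entry 2 describes a segment with base 400000h and limit FFFFh. The selector 10h has index 2 and selects the GDT, so `translate(0x10, 0x1234, gdt, [])` returns 401234h, while an offset of 10000h raises `SegmentFault`. Entry 0 is left as `None` here because the first GDT entry is reserved as the null descriptor.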

Virtual-8086 mode is a sub-mode of protected mode that was created to provide backward compatibility with real mode applications. Such applications cannot run directly under a 32-bit protected mode operating system, so virtual-8086 mode provides them with an execution environment that, from the application's point of view, behaves like real mode.

Long mode is the primary operating mode of x86-64 processors. Under a 64-bit operating system, 64-bit programs run under 64-bit mode, and 32-bit and 16-bit protected mode applications run under compatibility mode. An x86-64 processor does not have to use long mode: in legacy mode it will function exactly like an x86-32 processor, allowing the use of a 32-bit operating system.

64-bit mode is the sub-mode of long mode in which a 64-bit application (or operating system) can access 64-bit instructions and registers.

Compatibility mode is the sub-mode of long mode provided for backward compatibility. It allows 32-bit and 16-bit protected mode applications to run under a 64-bit operating system. From the application’s viewpoint, compatibility mode looks like 32-bit protected mode, but from the viewpoint of the OS (address translation, handling of interrupts and exceptions) 64-bit mechanisms are used.

x86 Registers and Operating Modes
