Mode of Operations
The first step is to identify the processor mode of operation. x64 supports various modes and memory models, so let's identify the current one. This information is stored in the CR0 control register ([1]), in the PE flag at bit position 0 (position 0 is the least significant bit (LSB), that is, the right-most bit). If this bit is set, we are running in protected mode; otherwise we are running in real-address mode. Let's use the kernel debugger to perform this check, as shown in Figure 1.

kd> .formats cr0
Evaluate expression:
  Hex:     00000000`80050031
  Decimal: 2147811377
  Decimal (unsigned) : 2147811377
  Octal:   0000000000020001200061
  Binary:  00000000 00000000 00000000 00000000 10000000 00000101 00000000 00110001
  Chars:   .......1
  Time:    ***** Invalid
  Float:   low -4.59246e-040 high 0
  Double:  1.06116e-314
Figure 1. Operation Mode Identification

The CR0.PE bit is set to 1, so we are running in protected mode using a segmented memory model (you might also notice that the CR0.PG bit, at position 31, is set, indicating that we are also using paging). We can also check the operation sub-mode by inspecting the IA32_EFER Model-Specific Register (MSR) (0xC0000080) ([2]) and checking the LME (bit position 8) and LMA (bit position 10) flags. You can see the result in Figure 2.

kd> rdmsr 0xC0000080
msr[c0000080] = 00000000`00000d01
kd> .formats 00000000`00000d01
Evaluate expression:
  Hex:     00000000`00000d01
  Decimal: 3329
  Decimal (unsigned) : 3329
  Octal:   0000000000000000006401
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00001101 00000001
  Chars:   ........
  Time:    Thu Jan 1 01:55:29 1970
  Float:   low 4.66492e-042 high 0
  Double:  1.64474e-320
Figure 2. Operation Sub-Mode Identification

The IA32_EFER.LMA and IA32_EFER.LME bits are set, so we are running in IA-32e sub-mode (64-bit). This information will be used later in the text.
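The two checks above can also be reproduced outside the debugger. The following is a minimal Python sketch, assuming only the bit positions quoted in the text (CR0.PE at bit 0, CR0.PG at bit 31, IA32_EFER.LME at bit 8, IA32_EFER.LMA at bit 10); the function name describe_mode is hypothetical and the input values are copied from Figures 1 and 2.

```python
# Bit positions taken from the text: CR0.PE = 0, CR0.PG = 31,
# IA32_EFER.LME = 8, IA32_EFER.LMA = 10.
CR0_PE = 1 << 0
CR0_PG = 1 << 31
EFER_LME = 1 << 8
EFER_LMA = 1 << 10


def describe_mode(cr0: int, efer: int) -> dict:
    """Return the mode flags encoded in raw CR0 and IA32_EFER values."""
    return {
        "protected_mode": bool(cr0 & CR0_PE),
        "paging": bool(cr0 & CR0_PG),
        "long_mode_enabled": bool(efer & EFER_LME),
        "long_mode_active": bool(efer & EFER_LMA),
    }


if __name__ == "__main__":
    # Values copied from the debugger output in Figures 1 and 2.
    for name, value in describe_mode(0x80050031, 0xD01).items():
        print(f"{name}: {value}")
```

With the values from the figures, all four flags come out as set, which matches the conclusion drawn from the binary dumps.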
Segmented Memory Model
The segmented memory model accesses memory through segments. A segment provides information on how to translate a given address. Depending on the instruction being executed, a different segment is involved (e.g., the call instruction uses the code segment, while the push and pop instructions use the stack segment). The Intel architecture defines a total of six segment registers: CS, DS, ES, SS, FS, and GS. Let's see how this works with a practical example by considering the instruction in Figure 3.

00007FFD42C7D5C1 | E8 1A000000 | call kernelbase.7FFD42C7D5E0
Figure 3. How Segmentation Works

The call instruction uses the value 1A000000 (0x1A stored in little-endian byte order) to specify the address of the function to execute. This form of call takes a displacement relative to the address of the next instruction, which explains why the function address in the disassembly is 0x7FFD42C7D5E0 (0x7FFD42C7D5C1 (RIP) + 0x1A (offset) + 0x05 (instruction size)). In addition to this value, the value of the CS segment is also used. The combination of CS with the function address is called the logical address. The segment value is then used to translate the logical address into what is known as the virtual address (the Intel manuals call it the linear address); this process is described in the next section. Since our system is using paging, an additional translation step is performed to translate the virtual address into the physical address (this topic is not covered in this post). All the translation steps are represented in Figure 4.
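The address arithmetic for the call in Figure 3 can be checked with a few lines of Python. This is only a sketch for the 5-byte E8 encoding shown in the figure; the helper name call_rel32_target is hypothetical, and the result is masked to 64 bits on the assumption that the code runs in 64-bit mode.

```python
import struct


def call_rel32_target(address: int, encoded: bytes) -> int:
    """Compute the destination of an E8 (call rel32) instruction.

    address is where the instruction starts; encoded holds its 5 bytes.
    """
    if len(encoded) != 5 or encoded[0] != 0xE8:
        raise ValueError("expected a 5-byte E8 call instruction")
    # The displacement is a signed 32-bit little-endian integer.
    (displacement,) = struct.unpack("<i", encoded[1:])
    # It is relative to the address of the *next* instruction.
    next_instruction = address + len(encoded)
    return (next_instruction + displacement) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    target = call_rel32_target(0x7FFD42C7D5C1, bytes.fromhex("E81A000000"))
    print(hex(target))  # 0x7ffd42c7d5e0
```

The printed value matches the destination shown by the disassembler in Figure 3.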
How Segmentation Works
The segment registers are 16-bit registers whose structure is reported in Figure 5. The Index field (bits 3-15) is used as an index into a table that contains information on all the available segments. The TI flag (bit 2) indicates which table must be used, and the Requested Privilege Level (RPL) field (bits 0-1) specifies the protection level of the code requesting access to a specific segment. The possible protection level values are 0, 1, 2, and 3, and they are often represented as protection rings, where ring 0 is the most privileged (where kernel-mode code is executed) and ring 3 is the least privileged (where user-mode code is executed).
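To make the three fields concrete, here is a short Python sketch that splits a 16-bit selector value. It assumes the standard layout (RPL in bits 0-1, TI in bit 2, Index in bits 3-15); the Selector type and decode_selector function are hypothetical names used only for illustration, and the sample values are selector values that appear in the debugger output later in this post.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # bits 3-15: index into the descriptor table
    ti: int     # bit 2: 0 = GDT, 1 = LDT
    rpl: int    # bits 0-1: requested privilege level


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its Index, TI and RPL fields."""
    return Selector(index=(value >> 3) & 0x1FFF, ti=(value >> 2) & 1, rpl=value & 3)


if __name__ == "__main__":
    for raw in (0x10, 0x23, 0x2B, 0x33):
        print(hex(raw), decode_selector(raw))
```

For example, 0x33 decodes to index 6 with TI 0 and RPL 3, while 0x10 decodes to index 2 with TI 0 and RPL 0.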
The two tables that contain information on the segments are the Global Descriptor Table (GDT) and the Local Descriptor Table (LDT). The GDTR and LDTR registers contain the base address of the respective table. In the latest Windows versions the LDT is no longer used, so the TI flag will always be 0. The GDT is an array of segment descriptors, where each segment descriptor is typically represented by the 64-bit structure reported in Figure 6.
Given the segment descriptor definition, we can now explain how the logical address to virtual address translation is performed. The Base field is added to the offset part of the logical address to obtain the virtual address. This process is described in Figure 7.
A very important field is DPL. It indicates the privilege level of the segment; for example, code running in a segment with a DPL of 0 can execute privileged instructions such as CLI. Another relevant field is L. It indicates whether code in the segment runs in long mode (if it is set to 1) or in compatibility mode (if it is set to 0). Figure 8 shows how to inspect the GDT and all the defined segments.
kd> r gdtr
gdtr=fffff804382f3fb0
kd> db fffff804382f3fb0
fffff804`382f3fb0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffff804`382f3fc0  00 00 00 00 00 9b 20 00-00 00 00 00 00 93 40 00  ...... .......@.
fffff804`382f3fd0  ff ff 00 00 00 fb cf 00-ff ff 00 00 00 f3 cf 00  ................
fffff804`382f3fe0  00 00 00 00 00 fb 20 00-00 00 00 00 00 00 00 00  ...... .........
fffff804`382f3ff0  67 00 00 20 2f 8b 00 38-04 f8 ff ff 00 00 00 00  g.. /..8........
fffff804`382f4000  00 3c 00 00 00 f3 40 00-00 00 00 00 00 00 00 00  .<....@.........
fffff804`382f4010  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
fffff804`382f4020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
kd> dg 10 50
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0010 00000000`00000000 00000000`00000000 Code RE Ac 0 Nb By P  Lo 0000029b
0018 00000000`00000000 00000000`00000000 Data RW Ac 0 Bg By P  Nl 00000493
0020 00000000`00000000 00000000`ffffffff Code RE Ac 3 Bg Pg P  Nl 00000cfb
0028 00000000`00000000 00000000`ffffffff Data RW Ac 3 Bg Pg P  Nl 00000cf3
0030 00000000`00000000 00000000`00000000 Code RE Ac 3 Nb By P  Lo 000002fb
0038 00000000`00000000 00000000`00000000 <Reserved> 0 Nb By Np Nl 00000000
0040 00000000`382f2000 00000000`00000067 TSS32 Busy 0 Nb By P  Nl 0000008b
0048 00000000`0000ffff 00000000`0000f804 <Reserved> 0 Nb By Np Nl 00000000
0050 00000000`00000000 00000000`00003c00 Data RW Ac 3 Bg By P  Nl 000004f3
Figure 8. Dumping All Segments

The first two commands read the GDTR register and dump the memory it points to. The first non-null entry is at offset 0x10 from the GDT base address (the first entry in the GDT is always null). For a more readable view, we can use the dg command, which dumps all the segments and shows the relevant information. There are various Code and Data segments, with privilege level 0 (kernel mode) or 3 (user mode).
In particular, there is a user-mode segment that runs in 32-bit compatibility mode (Long=0); its segment selector is 0x20. Similarly, there is a user-mode segment that runs in long mode (Long=1); its segment selector is 0x30.
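The fields discussed above (Base, DPL, and L) can be extracted from the raw 8-byte entries in the memory dump of Figure 8. The sketch below assumes the standard bit layout of a code/data segment descriptor from the Intel manuals (limit in bits 0-15 and 48-51, base in bits 16-39 and 56-63, DPL in bits 45-46, P in bit 47, L in bit 53); the names Descriptor, decode_descriptor, and linear_address are hypothetical. The limit is returned as the raw 20-bit value, without applying the granularity flag.

```python
from typing import NamedTuple


class Descriptor(NamedTuple):
    base: int
    limit: int       # raw 20-bit limit, granularity not applied
    dpl: int
    present: bool
    long_mode: bool


def decode_descriptor(raw: int) -> Descriptor:
    """Decode an 8-byte code/data segment descriptor given as a 64-bit integer."""
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    return Descriptor(
        base=base,
        limit=limit,
        dpl=(raw >> 45) & 3,
        present=bool((raw >> 47) & 1),
        long_mode=bool((raw >> 53) & 1),
    )


def linear_address(descriptor: Descriptor, offset: int) -> int:
    """Logical-to-virtual translation as described in the text: Base + offset."""
    return descriptor.base + offset


if __name__ == "__main__":
    # Raw entries read (little-endian) from the db output in Figure 8.
    entries = {0x10: 0x00209B0000000000, 0x20: 0x00CFFB000000FFFF, 0x30: 0x0020FB0000000000}
    for selector, raw in entries.items():
        print(hex(selector), decode_descriptor(raw))
```

For these three entries the decoded DPL and L values agree with the Pl and Long columns printed by dg. Because the Base of each entry is 0, linear_address returns the offset unchanged.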
Windows and the Flat Memory Model
You might have heard that Windows uses a flat memory model, but we stated above that we are running in a segmented memory model. What does this mean? By now, you know how a segment descriptor is used to compute the virtual address, and we have also dumped all the segment descriptors defined in the system. You might have noticed that all the Code and Data segments have the Base field set to 0. This implies that Windows is not taking advantage of the segment concept: with a Base that is always 0, the logical address is equal to the virtual address. In other words, we are using a segmented memory model without using the segment concept. This mode is known as the flat memory model. It is also described in the official Intel documentation:

"In 64-bit mode, segmentation is generally (but not completely) disabled, creating a flat 64-bit linear-address space. The processor treats the segment base of CS, DS, ES, SS as zero, creating a linear address that is equal to the effective address. The FS and GS segments are exceptions. These segment registers (which hold the segment base) can be used as additional base registers in linear address calculations. They facilitate addressing local data and certain operating system data structures. Note that the processor does not perform segment limit checks at runtime in 64-bit mode."
Decoding a Segment Register
Let's try decoding the value stored in a segment register. Consider the CS register, with value 0x33. In binary, this value is 00110011b. As described in Figure 5, bits 3-15 represent the index in the GDT, which in this case has decimal value 6 (110b). To obtain the offset of the segment descriptor within the GDT, we have to multiply the index by the size of a segment descriptor, which is 8 bytes. Figure 9 shows this operation in the kernel debugger.

kd> .formats 0x33
Evaluate expression:
  Hex:     00000000`00000033
  Decimal: 51
  Decimal (unsigned) : 51
  Octal:   0000000000000000000063
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00110011
  Chars:   .......3
  Time:    Thu Jan 1 01:00:51 1970
  Float:   low 7.14662e-044 high 0
  Double:  2.51973e-322
kd> dq gdtr + (6 * 8) L1
fffff804`382f3fe0  0020fb00`00000000
Figure 9. Obtain the Segment Selector

The segment descriptor value is 0020fb00`00000000. Now, let's use the dg and dt commands to display the segment descriptor associated with index 6, using the offset 6 * 8 = 48 (0x30). The result is reported in Figure 10.

kd> dg 30
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0030 00000000`00000000 00000000`00000000 Code RE Ac 3 Nb By P  Lo 000002fb
kd> dt nt!_KGDTENTRY64 fffff804`382f3fe0 -b
   +0x000 LimitLow         : 0
   +0x002 BaseLow          : 0
   +0x004 Bytes            :
      +0x000 BaseMiddle       : 0 ''
      +0x001 Flags1           : 0xfb ''
      +0x002 Flags2           : 0x20 ' '
      +0x003 BaseHigh         : 0 ''
   +0x004 Bits             :
      +0x000 BaseMiddle       : 0y00000000 (0)
      +0x000 Type             : 0y11011 (0x1b)
      +0x000 Dpl              : 0y11
      +0x000 Present          : 0y1
      +0x000 LimitHigh        : 0y0000
      +0x000 System           : 0y0
      +0x000 LongMode         : 0y1
      +0x000 DefaultBig       : 0y0
      +0x000 Granularity      : 0y0
      +0x000 BaseHigh         : 0y00000000 (0)
   +0x008 BaseUpper        : 0
   +0x00c MustBeZero       : 0
   +0x000 DataLow          : 0n9283176673312768
   +0x008 DataHigh         : 0n0
Figure 10. Dump of a Segment Descriptor

As you can see, the result is the same in both cases.
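The lookup performed with dq in Figure 9 boils down to one multiplication and one little-endian read. The following Python sketch reproduces it with the GDT base from Figure 8 and the bytes shown by db at the computed address; descriptor_address is a hypothetical helper name, and the byte string is copied from the dump rather than read from live memory.

```python
GDT_BASE = 0xFFFFF804382F3FB0  # value of gdtr in Figure 8
DESCRIPTOR_SIZE = 8            # size in bytes of a segment descriptor


def descriptor_address(gdt_base: int, selector: int) -> int:
    """Return the address of the GDT entry referenced by a selector."""
    index = selector >> 3  # drop the RPL (bits 0-1) and TI (bit 2) fields
    return gdt_base + index * DESCRIPTOR_SIZE


if __name__ == "__main__":
    address = descriptor_address(GDT_BASE, 0x33)
    print(hex(address))  # 0xfffff804382f3fe0

    # First 8 bytes that db shows at that address in Figure 8.
    entry_bytes = bytes.fromhex("0000000000fb2000")
    raw = int.from_bytes(entry_bytes, "little")
    print(f"{raw:016x}")  # 0020fb0000000000
    print(raw)            # 9283176673312768
```

The address matches the one printed by dq, the hexadecimal value matches the descriptor 0020fb00`00000000, and the decimal value matches the DataLow field in Figure 10.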
Experimenting With Kernel Mode and User Mode Code
Let's use WinDbg to inspect the segments of a piece of code running in kernel mode (Figure 11).

kd> r
rax=0000000000000003 rbx=fffff804382fde60 rcx=fffff804382fde60
rdx=fffff804382fde10 rsi=fffff80433b731a0 rdi=fffff80433b73190
rip=fffff80435414be5 rsp=fffff804382fdde8 rbp=0000000000000000
 r8=0000000000000003  r9=fffff804382fddf8 r10=0000000000000000
r11=fffff804382fddd0 r12=fffff80433b73100 r13=0000000000000000
r14=0000000000000100 r15=00000000ffffffff
iopl=0         nv up di ng nz na po nc
cs=0010  ss=0000  ds=002b  es=002b  fs=0053  gs=002b             efl=00040086
nt!DebugService2+0x5:
fffff804`35414be5 cc              int     3
Figure 11. 64-bit Kernel Mode Process Registers

As you can see, RIP points to a kernel address, and the CS segment value is 0x10, which, according to the result from Figure 8, corresponds to a segment of type Code, with privilege 0 (the most privileged) and Long mode enabled. Now let's try the same experiment by analyzing a 64-bit user-mode process (Figure 12).
The image shows a CS segment value of 0x33, which corresponds to a segment of type Code, with privilege 3 (the lowest privilege) and Long mode enabled. Finally, let's see an example of a 32-bit user-mode process running on a 64-bit OS (Figure 13).
The image shows a CS value of 0x23, which corresponds to a segment of type Code, with privilege 3 and Long mode disabled. Since Long mode is disabled, the process is running in compatibility mode (32-bit).
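The three experiments can be summarized as a small lookup. The sketch below hard-codes the DPL and Long values of the code segments listed by dg in Figure 8, so it only applies to the system inspected in this post; the table CODE_SEGMENTS and the function describe_cs are hypothetical names.

```python
# (DPL, Long flag) of the code segments listed in Figure 8, keyed by GDT offset.
CODE_SEGMENTS = {
    0x10: (0, True),
    0x20: (3, False),
    0x30: (3, True),
}


def describe_cs(cs: int) -> str:
    """Describe the execution context implied by a CS value on this system."""
    offset = cs & ~0x7  # clear the RPL and TI bits to get the GDT offset
    dpl, long_mode = CODE_SEGMENTS[offset]
    ring = "kernel mode" if dpl == 0 else "user mode"
    width = "64-bit" if long_mode else "32-bit compatibility"
    return f"{ring}, {width}"


if __name__ == "__main__":
    for cs in (0x10, 0x33, 0x23):
        print(hex(cs), "->", describe_cs(cs))
```

The output lines up with the three cases above: 0x10 is kernel-mode 64-bit code, 0x33 is user-mode 64-bit code, and 0x23 is user-mode code in compatibility mode.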
Segment Transition and Syscall
We mentioned that code running in kernel mode has a different CS value, with a DPL of 0. How is the segment transition performed? There are various ways to change the segment descriptor. One way is to use specific instructions that change the CS register, such as retf, which reads the new CS value from the stack. However, user-mode code runs at a less privileged level, so it cannot use such a mechanism to switch to a segment with a DPL of 0.

An alternative method is to use a call gate segment descriptor ([3]). However, this mechanism is not used in modern Windows versions, which prefer the syscall instruction. Among the various actions performed by this instruction is the change of the segment selector. But how is the correct segment chosen? This information is obtained from the IA32_STAR (0xC0000081) MSR. Bits 32-47 are extracted and used as the value of the new segment selector (which is 0x10 in the case of a transition to kernel mode). Let's use WinDbg to verify this (Figure 14).
kd> rdmsr 0xC0000081
msr[c0000081] = 00230010`00000000
kd> .formats 00230010`00000000
Evaluate expression:
  Hex:     00230010`00000000
  Decimal: 9851692904349696
  Decimal (unsigned) : 9851692904349696
  Octal:   0000430001000000000000
  Binary:  00000000 00100011 00000000 00010000 00000000 00000000 00000000 00000000
  Chars:   .#......
  Time:    Sun Mar 21 11:08:10.434 1632 (UTC + 1:00)
  Float:   low 0 high 3.21426e-039
  Double:  5.28462e-308
kd> .formats 0y0000000000010000
Evaluate expression:
  Hex:     00000000`00000010
  Decimal: 16
  Decimal (unsigned) : 16
  Octal:   0000000000000000000020
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00010000
  Chars:   ........
  Time:    Thu Jan 1 01:00:16 1970
  Float:   low 2.24208e-044 high 0
  Double:  7.90505e-323
kd> dg 10
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0010 00000000`00000000 00000000`00000000 Code RE Ac 0 Nb By P  Lo 0000029b
Figure 14. Transition to DPL 0 Via Syscall Instruction

We first read the IA32_STAR MSR and extract the bits related to the new CS, whose value is 00000000 00010000. Converting this value to hex results in 0x10, which is exactly the same value that we obtained when we inspected the CS register in kernel mode in the previous section.
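The extraction of bits 32-47 can be written as a short Python sketch. The kernel CS selection comes from the text above; the remaining fields (the SS loaded by syscall and the selectors loaded by sysret from bits 48-63) go beyond what this post covers and follow the SYSCALL and SYSRET descriptions in the Intel manuals as I understand them, so treat them as an assumption to verify against the manual. The function name decode_star is hypothetical.

```python
def decode_star(star: int) -> dict:
    """Derive the selectors that syscall and sysret load from IA32_STAR."""
    syscall_base = (star >> 32) & 0xFFFF  # bits 32-47, discussed in the text
    sysret_base = (star >> 48) & 0xFFFF   # bits 48-63 (assumption: Intel SDM)
    return {
        "syscall_cs": syscall_base & 0xFFFC,    # kernel code segment
        "syscall_ss": syscall_base + 8,         # kernel stack segment
        "sysret_cs_32": sysret_base | 3,        # return to compatibility mode
        "sysret_cs_64": (sysret_base + 16) | 3, # return to 64-bit mode
        "sysret_ss": (sysret_base + 8) | 3,     # user stack segment
    }


if __name__ == "__main__":
    # Value read with rdmsr 0xC0000081 in Figure 14.
    for name, value in decode_star(0x0023001000000000).items():
        print(f"{name}: {value:#06x}")
```

With the value from Figure 14, syscall_cs is 0x10, matching the kernel-mode CS. Under the stated assumption, the sysret selectors come out as 0x23 and 0x33, the same CS values observed for the 32-bit and 64-bit user-mode processes.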
Heaven's Gates Consideration
If you reached this point, you now have all the information needed to understand the concept behind the Heaven's Gate mechanism, which is used to transition from x64 to x86 code in order to run 32-bit binaries. Microsoft created a specific segment descriptor for this purpose, assigning to it the value 0x20. The privilege level of the two segment descriptors is the same, so it is possible to perform the transition by using one of the many instructions that take the CS register into consideration, such as retf or a far call. A lot of documentation has been written on this aspect, and Microsoft refers to it by the name Windows-on-Windows (WoW64).
Conclusion
Modern operating systems run in protected mode under a flat segmented memory model. In this post we analyzed how this model works and how it can be used to change privilege levels. If you want to know more, I invite you to read the references.
References
[1] - Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A): System Programming Guide - Chapter 2.5 CONTROL REGISTERS
[2] - Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 4: Model-Specific Registers - IA32_EFER
[3] - Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A): System Programming Guide - Chapter 5.8.3 Call Gates
[4] - Call Gates' Ring Transitioning in IA-32 Mode
[5] - Bringing Call Gates Back
[6] - Windows Internals, Part 2, 7th Edition
[7] - Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1
[8] - Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 1: Basic Architecture