Understanding the PEB

Jacobo


This is the first in a series of articles covering the basic theory needed to get started with malware in Windows environments.

It is hard to pick a starting point for this field of cybersecurity: one that exposes complex, uncommon topics while staying accessible, and that starts from the basics without rehashing what every IT professional should already know.

So we will start with the PEB. What is it, and why does it matter in malware development?

A bit of theory:

It is worth understanding what lies behind Windows executable files, whether a file with the familiar EXE extension or one with the perhaps less familiar DLL or SYS extensions.

To begin with, they are all the same type of file, in the format Microsoft calls Portable Executable (PE). The structure of these files is the subject of later articles. For now, it is enough to know that when one of these files is executed, Windows creates a process with at least one thread, which is responsible for running the file's code.

This process can be thought of as providing everything the executable needs in order to run correctly.

An example: when you write C code that prints “Hello world” to the screen using the typical printf, things happen underneath that ordinary users, even developers, might not be aware of. 

printf is implemented in a library called ucrtbase.dll. However, it is a very high-level function, and the Windows kernel knows nothing about it. So how does the request reach the kernel in a form it can handle? Through the Windows API (WinAPI) and, below it, the Native API exported by ntdll.dll.

In short, when we build and run a program that calls printf, a chain of calls is made from the very high level (printf) down to the very low level (ZwWriteFile). Understanding WinAPI function calls in depth will be the subject of another article.

So the question remains: how does a process give the executable what it needs to run correctly? By keeping track of all the information the executable needs. As we have seen, our program needs at least the DLL called ucrtbase.dll. How does the process find where that DLL is loaded in memory so it can use functions like printf? Through the PEB.

The Process Environment Block (PEB) is a Windows data structure that holds per-process information, such as the base address of the image, whether the process is being debugged and, above all, pointers to other structures like PEB_LDR_DATA, which ends up describing the DLLs loaded for the executable, including their addresses.

This structure is only partially documented by Microsoft. The official declaration names just 7 of its 19 members; the rest are marked as reserved.

 


We know that every BYTE member occupies one byte of memory and that every PVOID member is a pointer, which occupies 8 bytes in a 64-bit process. Knowing at least their sizes is necessary to compute memory addresses and offsets correctly when debugging code or accessing these members from our own code.
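The size rule above can be checked with a small sketch. It mirrors only the first members of the documented PEB declaration and assumes a 64-bit process: every PVOID is modeled as a uint64_t and the compiler padding is written out explicitly, so the result does not depend on the machine that compiles it. The names MockPeb64 and mock_peb_offset_* are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative mirror of the first documented PEB members (x64 layout).
   Not the real structure: pointers are modeled as 8-byte integers. */
typedef struct MockPeb64 {
    uint8_t  Reserved1[2];       /* 2 bytes                                   */
    uint8_t  BeingDebugged;      /* 1 byte                                    */
    uint8_t  Reserved2[1];       /* 1 byte                                    */
    uint8_t  Padding[4];         /* padding the compiler would insert on x64  */
    uint64_t Reserved3[2];       /* two PVOIDs, 8 bytes each                  */
    uint64_t Ldr;                /* pointer to PEB_LDR_DATA                   */
    uint64_t ProcessParameters;  /* pointer to the process parameters         */
} MockPeb64;

/* Offset of the BeingDebugged flag from the start of the PEB. */
size_t mock_peb_offset_being_debugged(void) {
    return offsetof(MockPeb64, BeingDebugged);
}

/* Offset of Ldr: 2 + 1 + 1 bytes, 4 bytes of padding, two 8-byte pointers. */
size_t mock_peb_offset_ldr(void) {
    assert(offsetof(MockPeb64, Ldr) == 2 + 1 + 1 + 4 + 2 * 8);
    return offsetof(MockPeb64, Ldr);
}
```

Under these assumptions, BeingDebugged sits at offset 0x02 and Ldr at offset 0x18 from the start of the structure.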

Luckily, the community researches these topics and publishes unofficial data that is more complete than Microsoft's own documentation. One example is the website https://www.vergiliusproject.com/, where the vast majority of Windows structures can be consulted. We recommend looking at the PEB structure on this site: https://www.vergiliusproject.com/kernels/x64/windows-11/24h2/_PEB. The two definitions are broadly similar, and the fuller one gives a realistic idea of the actual members.

If we look at the official documentation, what matters most (for now) about the PEB is the member named Ldr. This member is a pointer to another Windows structure, _PEB_LDR_DATA, which, like the PEB, is only partially documented by Microsoft.

According to the official documentation, this is its form.

 


Our goal is to understand the PEB, how this process-level structure is laid out, and how it allows executables (PE) to use all the DLLs they need, among other things.

PEB_LDR_DATA has a member of type LIST_ENTRY called InMemoryOrderModuleList, which consists of two pointers. These pointers lead to the entries that describe the DLLs loaded for the executable: one points to the next entry and the other to the previous one, forming a doubly linked list of modules.

 


These pointers, according to Microsoft's documentation, are called Flink and Blink. 
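To make the Flink and Blink mechanics concrete, here is a minimal, portable sketch of a circular doubly linked list with the same shape as LIST_ENTRY. The type ListEntry and the helper functions are illustrative names, not the Windows definitions; the only assumption carried over from Windows is that the list is circular and anchored at a head entry (the role played by InMemoryOrderModuleList).

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as LIST_ENTRY: two pointers and nothing else. */
typedef struct ListEntry {
    struct ListEntry *Flink;  /* forward link: next entry      */
    struct ListEntry *Blink;  /* backward link: previous entry */
} ListEntry;

/* An empty circular list is a head whose links point back at itself. */
void list_init(ListEntry *head) {
    head->Flink = head;
    head->Blink = head;
}

/* Link a new entry just before the head, which makes it the tail. */
void list_insert_tail(ListEntry *head, ListEntry *entry) {
    assert(head != NULL && entry != NULL);
    ListEntry *tail = head->Blink;
    entry->Flink = head;
    entry->Blink = tail;
    tail->Flink = entry;
    head->Blink = entry;
}

/* Follow Flink from the head until the walk comes back to the head. */
size_t list_length(const ListEntry *head) {
    size_t count = 0;
    for (const ListEntry *cur = head->Flink; cur != head; cur = cur->Flink) {
        count++;
    }
    return count;
}
```

The loop in list_length is the pattern used to enumerate modules: start at the head's Flink and stop when the walk returns to the head.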

At this point, it is worth reflecting on how this doubly linked list can tell an executable everything about each DLL. At a minimum, the DLL's name and the absolute path where it is stored on disk are needed in order to load it into virtual memory.

We have a doubly linked list made of _LIST_ENTRY structures, where Flink points to the next _LIST_ENTRY, whose own Flink points to the one after it, and so on. So where is the information about each required DLL?

The answer is simple. Each _LIST_ENTRY is embedded in a larger structure called _LDR_DATA_TABLE_ENTRY. Once again, it is a Windows structure that Microsoft documents only partially.

 


As can be seen, the second line contains the LIST_ENTRY structure (note that it is not a pointer to the structure: the 16 bytes of the two pointers are embedded right there). The members DllBase, EntryPoint and FullDllName are also visible, and together they hold the information needed to locate each DLL the executable uses.

In summary, the PEB is a Windows structure that holds essential information about the process. Thanks to it, a process can locate every DLL it has loaded, and with them the functions the program calls and those that ultimately reach the kernel. As we have seen, all of this is done through a series of structures that keep the information organized.
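Because Flink points at the embedded LIST_ENTRY rather than at the start of the enclosing structure, the enclosing structure has to be recovered by subtracting the member's offset. Windows provides the CONTAINING_RECORD macro for this; the sketch below reimplements the same idea portably. MockModule is a simplified, hypothetical stand-in for _LDR_DATA_TABLE_ENTRY (for instance, the name is a plain C string here instead of a UNICODE_STRING).

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

/* Hypothetical, simplified stand-in for _LDR_DATA_TABLE_ENTRY. */
typedef struct MockModule {
    void       *Reserved1[2];
    ListEntry   InMemoryOrderLinks;  /* embedded structure, not a pointer   */
    void       *DllBase;             /* where the module is mapped          */
    const char *Name;                /* the real member is a UNICODE_STRING */
} MockModule;

/* Same idea as CONTAINING_RECORD: the address of the member minus the
   member's offset gives the address of the enclosing structure. */
#define MOCK_CONTAINING_RECORD(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given the link that a Flink pointer refers to, return its module. */
MockModule *module_from_links(ListEntry *links) {
    assert(links != NULL);
    return MOCK_CONTAINING_RECORD(links, MockModule, InMemoryOrderLinks);
}
```

With this helper, each step of a list walk yields a full module record, from which members such as DllBase can be read.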

 



Let's get to work:

To inspect the PEB and the structures it leads to, we will use WinDBG, the official Windows debugger. In addition, a small C program will be written with Visual Studio to verify the data, reinforce what we have learned, and show that these structures can be accessed from an executable.

We will start with WinDBG. This debugger lets us either launch an executable from disk and analyze its process, or attach to a process that is already running.

 


In this first case, we will launch our program with the “Hello world” printf from scratch.

 


Once it is launched, the first command we will use is !peb (WinDBG is case sensitive). With it, the debugger shows all the information about this structure and its substructures.

 


Thanks to the !peb command, all these structures and memory addresses can now be observed. The first arrow points to the memory address of the PEB. It has not been said yet, but this is a virtual memory address, and each process has its own PEB.

A process can have multiple threads. Given that each thread has its own TEB (Thread Environment Block) structure, does each TEB of the threads in the same process have its own PEB?

The next arrow shows the address held in the Ldr member, which leads to the first LIST_ENTRY structure.

Finally, we can see that all the loaded DLLs have been obtained from the PEB, including ucrtbased.dll, the debug build of ucrtbase.dll, which provides printf.

WinDBG also allows us to dump memory and see its contents, so we will look at the memory and how it stores these structures. 

First, we will start by accessing the PEB structure (0x00000042393fc000).

 




At offset 0x18 from the start of the PEB is the Ldr member, a pointer to the PEB_LDR_DATA structure. We will follow this pointer, and the ones after it, until we reach the first loaded module.

 


This is the result. The only hint I will give is that with the du command we have dumped the memory of the FullDllName variable of the first loaded module. The rest of the boxes, separated by colors, you should try to identify 😉.
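As an aid for that exercise, the offsets involved in such a walk can be derived from the documented declarations. The sketch below assumes a 64-bit process and the layouts Microsoft publishes for PEB_LDR_DATA and LDR_DATA_TABLE_ENTRY, with pointers modeled as uint64_t and padding written out; the Mock* types and function names are illustrative. It also assumes the walk follows InMemoryOrderModuleList, the only list the official documentation names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ListEntry64 { uint64_t Flink, Blink; } ListEntry64;

typedef struct MockUnicodeString64 {
    uint16_t Length;         /* in bytes, not characters */
    uint16_t MaximumLength;
    uint8_t  Padding[4];     /* alignment padding on x64 */
    uint64_t Buffer;         /* pointer to UTF-16 text   */
} MockUnicodeString64;

typedef struct MockPebLdrData64 {
    uint8_t     Reserved1[8];
    uint64_t    Reserved2[3];
    ListEntry64 InMemoryOrderModuleList;  /* list head */
} MockPebLdrData64;

typedef struct MockLdrEntry64 {
    uint64_t            Reserved1[2];
    ListEntry64         InMemoryOrderLinks;
    uint64_t            Reserved2[2];
    uint64_t            DllBase;
    uint64_t            EntryPoint;
    uint64_t            Reserved3;
    MockUnicodeString64 FullDllName;
} MockLdrEntry64;

/* Offset of the list head inside PEB_LDR_DATA. */
size_t ldr_list_head_offset(void) {
    return offsetof(MockPebLdrData64, InMemoryOrderModuleList);
}

/* Flink points at InMemoryOrderLinks, not at the start of the entry,
   so member offsets must be taken relative to that member. */
ptrdiff_t offset_from_links(size_t member_offset) {
    return (ptrdiff_t)member_offset -
           (ptrdiff_t)offsetof(MockLdrEntry64, InMemoryOrderLinks);
}

ptrdiff_t dll_base_from_links(void) {
    return offset_from_links(offsetof(MockLdrEntry64, DllBase));
}

ptrdiff_t full_name_buffer_from_links(void) {
    assert(sizeof(MockUnicodeString64) == 16);
    return offset_from_links(offsetof(MockLdrEntry64, FullDllName) +
                             offsetof(MockUnicodeString64, Buffer));
}
```

Under these assumptions, the list head is at Ldr + 0x20, and from the address a Flink points to, DllBase is at + 0x20 and the FullDllName buffer pointer at + 0x40.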



Next steps:

Why is this entire structure necessary and why is it important to know how to handle the debugger as malware developers?

Security products such as EDRs or Windows Defender keep evolving to detect activity at ever lower levels, for example recurring opcode sequences. This means we must understand as much as possible about the low-level behavior of what we develop.

In particular, knowing that the PEB exists, what data it holds, and how to access it opens the way to anti-hooking measures, anti-debugging measures, or more advanced techniques such as Reflective DLL Injection, where all of the executable's dependencies must be loaded manually.

An exe has been created that obtains some of the data seen previously.

 


It is left to the reader's discretion to take the .c file and modify it to try to obtain information about the rest of the variables in the PEB structure.
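As a starting point for that exercise, here is a minimal sketch of a program along these lines; it is not necessarily the one used for this article. It assumes a Windows build with the SDK header winternl.h, reaches the PEB through NtCurrentTeb(), and touches only members that the official declarations name. The function name dump_peb_modules is made up, and the non-Windows branch exists only so the file compiles elsewhere.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <winternl.h>

/* Prints the PEB address, the BeingDebugged flag and every module in
   InMemoryOrderModuleList. Returns the number of modules found. */
int dump_peb_modules(void) {
    /* The TEB of the current thread holds a pointer to the PEB. */
    PPEB peb = NtCurrentTeb()->ProcessEnvironmentBlock;
    assert(peb != NULL && peb->Ldr != NULL);

    PLIST_ENTRY head = &peb->Ldr->InMemoryOrderModuleList;
    int count = 0;

    printf("PEB: %p  BeingDebugged: %u  Ldr: %p\n",
           (void *)peb, (unsigned)peb->BeingDebugged, (void *)peb->Ldr);

    for (PLIST_ENTRY cur = head->Flink; cur != head; cur = cur->Flink) {
        /* Recover the enclosing entry from the embedded LIST_ENTRY. */
        PLDR_DATA_TABLE_ENTRY mod =
            CONTAINING_RECORD(cur, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);

        /* Length is in bytes and the UTF-16 buffer may lack a terminator,
           so the character count is passed explicitly. */
        printf("%p  %.*ls\n", mod->DllBase,
               (int)(mod->FullDllName.Length / sizeof(WCHAR)),
               mod->FullDllName.Buffer);
        count++;
    }
    return count;
}
#else
/* The PEB only exists on Windows; elsewhere the sketch just says so. */
int dump_peb_modules(void) {
    puts("PEB walking requires Windows");
    return -1;
}
#endif
```

Calling dump_peb_modules() from main should list the same modules that !peb shows in WinDBG, which makes it easy to compare the addresses from both sources.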






