- Learning Linux Binary Analysis
- Ryan “elfmaster” O'Neill
- 2021-07-16 12:56:53
ELF program headers
ELF program headers describe the segments within a binary and are necessary for program loading. The kernel reads the segments at load time; they describe the layout of an executable on disk and how it should translate to memory. The program header table can be accessed by referencing the offset found in the initial ELF header member called e_phoff (program header table offset), as shown in the ElfN_Ehdr structure in display 1.7.
There are five common program header types that we will discuss here. Program headers describe the segments of an executable file (shared libraries included) and the type of each segment (that is, what type of data or code it is reserved for). First, let's take a look at the Elf32_Phdr structure that makes up a program header entry in the program header table of a 32-bit ELF executable.
Note
We sometimes refer to program headers as Phdrs throughout the rest of this book.
Here's the Elf32_Phdr struct:
typedef struct {
    uint32_t   p_type;   /* segment type */
    Elf32_Off  p_offset; /* segment offset */
    Elf32_Addr p_vaddr;  /* segment virtual address */
    Elf32_Addr p_paddr;  /* segment physical address */
    uint32_t   p_filesz; /* size of segment in the file */
    uint32_t   p_memsz;  /* size of segment in memory */
    uint32_t   p_flags;  /* segment flags, i.e. execute|read|write */
    uint32_t   p_align;  /* segment alignment in memory */
} Elf32_Phdr;
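To make the relationship between e_phoff and the Phdr table concrete, here is a minimal sketch that searches a 32-bit ELF image already held in memory for the first program header of a given type. The function name find_phdr32 and its parameters are our own for illustration and are not part of any library; the types and constants come from <elf.h> on Linux.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Ehdr, Elf32_Phdr, ELFMAG, PT_* (Linux) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the first program header of the requested type
 * into *out. Returns 1 on success, 0 if the image is not a well-formed
 * 32-bit ELF or contains no such entry. */
int find_phdr32(const unsigned char *image, size_t size,
                uint32_t type, Elf32_Phdr *out)
{
    Elf32_Ehdr ehdr;

    assert(image != NULL && out != NULL);
    if (size < sizeof(ehdr))
        return 0;
    memcpy(&ehdr, image, sizeof(ehdr));   /* avoids unaligned access */

    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS32 ||
        ehdr.e_phentsize != sizeof(Elf32_Phdr))
        return 0;

    /* The whole table must lie inside the image. */
    if (ehdr.e_phoff > size ||
        (size - ehdr.e_phoff) / sizeof(Elf32_Phdr) < ehdr.e_phnum)
        return 0;

    for (size_t i = 0; i < ehdr.e_phnum; i++) {
        Elf32_Phdr phdr;
        /* Entry i starts at e_phoff + i * e_phentsize. */
        memcpy(&phdr, image + ehdr.e_phoff + i * sizeof(phdr), sizeof(phdr));
        if (phdr.p_type == type) {
            *out = phdr;
            return 1;
        }
    }
    return 0;
}
```

A caller would typically mmap the file, pass the mapping and its length, and ask for PT_LOAD, PT_DYNAMIC, and so on. The sketch assumes the file's byte order matches the host's.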
PT_LOAD
An executable will always have at least one PT_LOAD type segment. This type of program header describes a loadable segment, which means that the segment is going to be loaded or mapped into memory.
For instance, an ELF executable with dynamic linking will generally contain the following two loadable segments (of type PT_LOAD):
- The text segment for program code
- And the data segment for global variables and dynamic linking information
The preceding two segments are going to be mapped into memory and aligned in memory by the value stored in p_align. I recommend reading the ELF man pages in Linux to understand all of the members in a Phdr structure, as they describe the layout of the segments both in the file and in memory.
Program headers are primarily there to describe the layout of a program for when it is executing and in memory. We will be utilizing Phdrs later in this chapter to demonstrate what they are and how to use them in reverse engineering software.
Note
The text segment (also known as the code segment) will generally have segment permissions set as PF_X | PF_R (READ+EXECUTE).
The data segment will generally have segment permissions set to PF_W | PF_R (READ+WRITE).
A file infected with a polymorphic virus might have changed these permissions in some way, such as modifying the text segment to be writable by adding the PF_W flag into the program header's segment flags (p_flags).
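The permission check described in this note can be sketched in a few lines. The helper below, whose name count_wx_load_segments is ours and purely illustrative, counts loadable segments that are both writable and executable. Treat a hit as a reason to look closer rather than as proof of infection.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Phdr, PT_LOAD, PF_X, PF_W (Linux) */
#include <stddef.h>

/* Hypothetical helper: count PT_LOAD segments whose p_flags allow both
 * writing and executing, which is unusual for a normally linked binary. */
size_t count_wx_load_segments(const Elf32_Phdr *phdr, size_t phnum)
{
    size_t hits = 0;

    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;                      /* only loadable segments */
        if ((phdr[i].p_flags & PF_X) && (phdr[i].p_flags & PF_W))
            hits++;
    }
    return hits;
}
```

For a typical text segment (PF_R | PF_X) and data segment (PF_R | PF_W) the function returns 0; adding PF_W to the text segment's flags makes it return 1.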
PT_DYNAMIC – Phdr for the dynamic segment
The dynamic segment is specific to executables that are dynamically linked and contains information necessary for the dynamic linker. This segment contains tagged values and pointers, including but not limited to the following:
- List of shared libraries that are to be linked at runtime
- The address/location of the Global offset table (GOT) discussed in the ELF Dynamic Linking section
- Information about relocation entries
Following is a complete list of the tag names:
The dynamic segment contains a series of structures that hold relevant dynamic linking information. The d_tag member controls the interpretation of d_un.
The 32-bit ELF dynamic struct:
typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;
extern Elf32_Dyn _DYNAMIC[];
We will explore more about dynamic linking later in this chapter.
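As a small illustration of how d_tag selects the meaning of d_un, the following sketch scans an array of Elf32_Dyn entries for a given tag, stopping at the terminating DT_NULL entry. The name find_dyn32 and the max_entries bound are assumptions made for this example; the DT_* constants come from <elf.h>.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Dyn, DT_NULL, DT_NEEDED, DT_PLTGOT (Linux) */
#include <stddef.h>

/* Hypothetical helper: return the first entry whose d_tag equals tag,
 * or NULL if DT_NULL (or max_entries) is reached first. */
const Elf32_Dyn *find_dyn32(const Elf32_Dyn *dyn, size_t max_entries,
                            Elf32_Sword tag)
{
    assert(dyn != NULL);
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == tag)
            return &dyn[i];
    }
    return NULL;
}
```

For a DT_PLTGOT entry the caller would read d_un.d_ptr, an address associated with the GOT; for a DT_NEEDED entry it would read d_un.d_val, an offset into the dynamic string table that names a shared library.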
PT_NOTE
A segment of type PT_NOTE may contain auxiliary information that is pertinent to a specific vendor or system. Following is a definition of PT_NOTE from the formal ELF specification:
Sometimes a vendor or system builder needs to mark an object file with special information that other programs will check for conformance, compatibility, and so on. Sections of type SHT_NOTE and program header elements of type PT_NOTE can be used for this purpose. The note information in sections and program header elements holds any number of entries, each of which is an array of 4-byte words in the format of the target processor. Labels appear below to help explain note information organization, but they are not part of the specification.
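Because each note entry is an array of 4-byte words, stepping from one entry to the next is a matter of rounding the name and descriptor lengths up to a multiple of four. The sketch below assumes the 32-bit layout described by Elf32_Nhdr in <elf.h> (n_namesz, n_descsz, n_type, followed by the name and descriptor bytes); the function name note_entry_size32 is hypothetical.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Nhdr (Linux) */
#include <stddef.h>

/* Round n up to the next multiple of 4. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Hypothetical helper: total bytes occupied by one note entry, that is,
 * the fixed header plus the padded name and the padded descriptor. */
size_t note_entry_size32(const Elf32_Nhdr *nhdr)
{
    assert(nhdr != NULL);
    return sizeof(*nhdr) + align4(nhdr->n_namesz) + align4(nhdr->n_descsz);
}
```

Adding this size to the current entry's offset gives the offset of the next entry, as long as the result stays within the segment's p_filesz.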
A point of interest: because this segment is only used for OS specification information and is not actually necessary for an executable to run (the system will just assume the executable is native either way), it becomes an interesting place for virus infection, although size constraints make it not necessarily the most practical one. Some information on NOTE segment infections can be found at http://vxheavens.com/lib/vhe06.html.
PT_INTERP
This small segment contains only the location and size of a null-terminated string describing where the program interpreter is; for instance, /lib/ld-linux.so.2 is generally the location of the dynamic linker, which is also the program interpreter.
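Reading the interpreter path out of a file image follows directly from p_offset and p_filesz. Here is a minimal sketch under the assumption that the whole file is already in memory; interp_path32 is an illustrative name, not an existing API.

```c
#include <assert.h>
#include <elf.h>     /* Elf32_Phdr, PT_INTERP (Linux) */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the interpreter path inside the
 * image, or NULL if the header is not PT_INTERP, the string falls outside
 * the image, or it is not null terminated within p_filesz bytes. */
const char *interp_path32(const unsigned char *image, size_t size,
                          const Elf32_Phdr *ph)
{
    assert(image != NULL && ph != NULL);
    if (ph->p_type != PT_INTERP || ph->p_filesz == 0)
        return NULL;
    if (ph->p_offset > size || ph->p_filesz > size - ph->p_offset)
        return NULL;

    const char *path = (const char *)(image + ph->p_offset);
    if (path[ph->p_filesz - 1] != '\0')
        return NULL;                       /* p_filesz includes the NUL */
    return path;
}
```

The bounds and terminator checks matter when the input may be malformed or deliberately tampered with, which is the usual situation in reverse engineering.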
PT_PHDR
This segment contains the location and size of the program header table itself. The Phdr table contains all of the Phdrs describing the segments of the file (and of the memory image).
Consult the ELF(5) man pages or the ELF specification paper to see all possible Phdr types. We have covered the most commonly seen ones that are vital to program execution or that we will be seeing most commonly in our reverse engineering endeavors.
We can use the readelf -l <filename> command to view a file's Phdr table:
Elf file type is EXEC (Executable file)
Entry point 0x8049a30
There are 9 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
  INTERP         0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x1622c 0x1622c R E 0x1000
  LOAD           0x016ef8 0x0805fef8 0x0805fef8 0x003c8 0x00fe8 RW  0x1000
  DYNAMIC        0x016f0c 0x0805ff0c 0x0805ff0c 0x000e0 0x000e0 RW  0x4
  NOTE           0x000168 0x08048168 0x08048168 0x00044 0x00044 R   0x4
  GNU_EH_FRAME   0x016104 0x0805e104 0x0805e104 0x0002c 0x0002c R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  GNU_RELRO      0x016ef8 0x0805fef8 0x0805fef8 0x00108 0x00108 R   0x1
We can see the entry point of the executable as well as some of the different segment types we just finished discussing. Notice the permission flags and alignment values in the rightmost columns for the first two PT_LOAD segments.
The text segment is READ+EXECUTE and the data segment is READ+WRITE, and both segments have an alignment of 0x1000, or 4,096 bytes, which is the page size for a 32-bit x86 executable. This value is used for alignment during program loading.