ELF program headers

ELF program headers describe the segments within a binary and are necessary for program loading. The kernel reads them at load time to learn how the executable is laid out on disk and how that layout should be mapped into memory. The program header table is located through the offset stored in the initial ELF header member called e_phoff (program header table offset), as shown in the ElfN_Ehdr structure in display 1.7.
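
As a minimal sketch of that lookup, the following function finds the program header table inside a 32-bit ELF image that has already been read or mapped into memory. It assumes the Linux <elf.h> header; the function name phdr_table and its parameters are hypothetical, chosen only for illustration.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the first program header inside a 32-bit ELF image
 * held in memory, or NULL if the image is not a plausible ELF32 file.
 * `image` and `size` are supplied by the caller (hypothetical interface). */
const Elf32_Phdr *phdr_table(const unsigned char *image, size_t size)
{
    if (size < sizeof(Elf32_Ehdr))
        return NULL;

    const Elf32_Ehdr *ehdr = (const Elf32_Ehdr *)image;

    /* Check the magic bytes and the 32-bit class before trusting anything. */
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr->e_ident[EI_CLASS] != ELFCLASS32)
        return NULL;

    /* The whole table must lie inside the image. */
    size_t table_size = (size_t)ehdr->e_phnum * ehdr->e_phentsize;
    if (ehdr->e_phoff > size || table_size > size - ehdr->e_phoff)
        return NULL;

    /* e_phoff is a file offset, so it is simply added to the image base. */
    return (const Elf32_Phdr *)(image + ehdr->e_phoff);
}
```

The returned pointer can be indexed from 0 to e_phnum - 1. The sketch assumes the image is suitably aligned, as a buffer from malloc or mmap would be.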

There are five common program header types that we will discuss here. Each program header describes one segment of an executable file (shared libraries included) and what type of segment it is (that is, what type of data or code it is reserved for). First, let's take a look at the Elf32_Phdr structure that makes up a program header entry in the program header table of a 32-bit ELF executable.

Note

We sometimes refer to program headers as Phdrs throughout the rest of this book.

Here's the Elf32_Phdr struct:

typedef struct {
    uint32_t   p_type;   /* segment type */
    Elf32_Off  p_offset; /* segment offset in the file */
    Elf32_Addr p_vaddr;  /* segment virtual address */
    Elf32_Addr p_paddr;  /* segment physical address */
    uint32_t   p_filesz; /* size of segment in the file */
    uint32_t   p_memsz;  /* size of segment in memory */
    uint32_t   p_flags;  /* segment flags, i.e. execute|write|read */
    uint32_t   p_align;  /* segment alignment in memory */
} Elf32_Phdr;

PT_LOAD

An executable will always have at least one PT_LOAD type segment. This type of program header describes a loadable segment, which means that the segment is going to be loaded or mapped into memory.

For instance, an ELF executable with dynamic linking will generally contain the following two loadable segments (of type PT_LOAD):

  • The text segment for program code
  • The data segment for global variables and dynamic linking information

The preceding two segments are mapped into memory and aligned according to the value stored in p_align. I recommend reading the ELF man pages in Linux to understand all of the members of a Phdr structure, as they describe the layout of the segments both in the file and in memory.
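
The relationship between p_align, p_vaddr, and p_offset can be checked mechanically. The ELF specification says that a p_align of 0 or 1 means no alignment is required, and that otherwise p_align should be a power of two with p_vaddr equal to p_offset modulo p_align. The sketch below assumes the Linux <elf.h> header; the function name phdr_alignment_ok is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>

/* Check the alignment rule for one program header:
 * p_align of 0 or 1 means "no constraint"; otherwise p_align must be a
 * power of two and p_vaddr must equal p_offset modulo p_align. */
bool phdr_alignment_ok(const Elf32_Phdr *ph)
{
    Elf32_Word align = ph->p_align;

    if (align <= 1)
        return true;
    if ((align & (align - 1)) != 0)   /* more than one bit set: not a power of two */
        return false;
    return (ph->p_vaddr % align) == (ph->p_offset % align);
}
```

A header that fails this check is not necessarily malicious, but it is unusual enough to deserve a closer look when examining a suspect binary.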

Program headers exist primarily to describe the layout of a program as it executes in memory. We will be utilizing Phdrs later in this chapter to demonstrate what they are and how to use them in reverse engineering software.

Note

The text segment (also known as the code segment) will generally have segment permissions set as PF_X | PF_R (READ+EXECUTE).

The data segment will generally have segment permissions set to PF_W | PF_R (READ+WRITE).

A file infected with a polymorphic virus might have had these permissions changed, for example by adding the PF_W flag to the text segment's flags (p_flags) so that the text segment becomes writable.
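
The permission check described in this note can be sketched in a few lines of C. The example assumes the Linux <elf.h> header and a program header table that is already in memory; the function names flags_to_string and count_wx_segments are hypothetical.

```c
#include <elf.h>
#include <stddef.h>

/* Render p_flags in the three-column style readelf uses: "R E", "RW ",
 * and so on. `out` must have room for 4 bytes. */
void flags_to_string(Elf32_Word flags, char out[4])
{
    out[0] = (flags & PF_R) ? 'R' : ' ';
    out[1] = (flags & PF_W) ? 'W' : ' ';
    out[2] = (flags & PF_X) ? 'E' : ' ';
    out[3] = '\0';
}

/* Count PT_LOAD segments that are both writable and executable.
 * A normally linked text segment is PF_R | PF_X, so a nonzero result
 * is a hint (not proof) that the headers have been modified. */
size_t count_wx_segments(const Elf32_Phdr *phdr, size_t phnum)
{
    size_t hits = 0;

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;   /* only loadable segments are of interest here */
        if ((phdr[i].p_flags & (PF_W | PF_X)) == (PF_W | PF_X))
            hits++;
    }
    return hits;
}
```

Some legitimate binaries are built with writable and executable segments, so the count is best treated as a prompt for further inspection.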

PT_DYNAMIC – Phdr for the dynamic segment

The dynamic segment is specific to executables that are dynamically linked and contains information necessary for the dynamic linker. This segment contains tagged values and pointers, including but not limited to the following:

  • List of shared libraries that are to be linked at runtime
  • The address/location of the Global Offset Table (GOT), discussed in the ELF Dynamic Linking section
  • Information about relocation entries

Following is a complete list of the tag names:

The dynamic segment contains a series of structures that hold relevant dynamic linking information. The d_tag member controls the interpretation of d_un.

The 32-bit ELF dynamic struct:

typedef struct {
    Elf32_Sword    d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;
extern Elf32_Dyn _DYNAMIC[];

We will explore more about dynamic linking later in this chapter.
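
In the meantime, here is a minimal sketch of how the array of Elf32_Dyn structures is walked. The array ends with an entry whose d_tag is DT_NULL, and the tag decides whether d_un is read as d_val or d_ptr. The sketch assumes the Linux <elf.h> header; the function names count_needed and dyn_ptr, and the max bound, are hypothetical.

```c
#include <elf.h>
#include <stddef.h>

/* Count the DT_NEEDED entries (one per required shared library) in a
 * dynamic array. The array is terminated by an entry tagged DT_NULL;
 * `max` is a caller-supplied bound so that a corrupt segment cannot
 * run the loop off the end of the buffer. */
size_t count_needed(const Elf32_Dyn *dyn, size_t max)
{
    size_t needed = 0;

    for (size_t i = 0; i < max && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == DT_NEEDED)
            needed++;   /* d_un.d_val is an offset into the dynamic string table */
    }
    return needed;
}

/* Return the address stored under `tag` (for example DT_PLTGOT),
 * or 0 if the tag is absent. */
Elf32_Addr dyn_ptr(const Elf32_Dyn *dyn, size_t max, Elf32_Sword tag)
{
    for (size_t i = 0; i < max && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == tag)
            return dyn[i].d_un.d_ptr;
    }
    return 0;
}
```

A reasonable value for max is the PT_DYNAMIC segment's p_filesz divided by sizeof(Elf32_Dyn).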

PT_NOTE

A segment of type PT_NOTE may contain auxiliary information that is pertinent to a specific vendor or system. Following is a definition of PT_NOTE from the formal ELF specification:

Sometimes a vendor or system builder needs to mark an object file with special information that other programs will check for conformance, compatibility, and so on. Sections of type SHT_NOTE and program header elements of type PT_NOTE can be used for this purpose. The note information in sections and program header elements holds any number of entries, each of which is an array of 4-byte words in the format of the target processor. Labels appear below to help explain note information organization, but they are not part of the specification.

A point of interest: because this segment is only used for OS-specific information and is not actually necessary for an executable to run (the system will simply assume the executable is native either way), it becomes an interesting place for virus infection, although size constraints make it not necessarily the most practical one. Some information on NOTE segment infections can be found at http://vxheavens.com/lib/vhe06.html.
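
To make the entry layout concrete, the following sketch steps through the contents of a PT_NOTE segment. Each entry starts with an Elf32_Nhdr (name size, descriptor size, type), followed by the name and the descriptor, each padded to a 4-byte boundary. It assumes the Linux <elf.h> header and a 4-byte-aligned buffer; the function names align4 and count_notes are hypothetical.

```c
#include <elf.h>
#include <stddef.h>

/* Round n up to the next multiple of 4, the padding unit for notes. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Count the note entries in the contents of a PT_NOTE segment.
 * Returns (size_t)-1 if an entry claims more bytes than the buffer holds.
 * Trailing bytes too short to form a header are ignored. */
size_t count_notes(const unsigned char *seg, size_t size)
{
    size_t off = 0, count = 0;

    while (size - off >= sizeof(Elf32_Nhdr)) {
        const Elf32_Nhdr *nh = (const Elf32_Nhdr *)(seg + off);
        size_t name = align4(nh->n_namesz);
        size_t desc = align4(nh->n_descsz);

        off += sizeof(Elf32_Nhdr);
        if (name > size - off || desc > size - off - name)
            return (size_t)-1;      /* entry overruns the segment */
        off += name + desc;         /* step over the padded name and descriptor */
        count++;
    }
    return count;
}
```

The small size of a typical note segment is easy to confirm this way, which is the size constraint mentioned above.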

PT_INTERP

This small segment's program header gives only the location and size of a null-terminated string naming the program interpreter; for instance, /lib/ld-linux.so.2 is generally the location of the dynamic linker, which is also the program interpreter.
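
A minimal sketch of reading that string from a file image follows. It uses p_offset and p_filesz to find the bytes and refuses to return a string that is not terminated inside the segment. It assumes the Linux <elf.h> header; the function name interp_path and its parameters are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return the interpreter path named by a PT_INTERP header, or NULL if the
 * header is the wrong type, points outside the image, or the string is
 * not NUL-terminated within p_filesz bytes. */
const char *interp_path(const unsigned char *image, size_t size,
                        const Elf32_Phdr *ph)
{
    if (ph->p_type != PT_INTERP || ph->p_filesz == 0)
        return NULL;
    if (ph->p_offset > size || ph->p_filesz > size - ph->p_offset)
        return NULL;

    const char *path = (const char *)(image + ph->p_offset);

    /* Do not trust the file: insist on a terminator inside the segment. */
    if (memchr(path, '\0', ph->p_filesz) == NULL)
        return NULL;
    return path;
}
```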

PT_PHDR

This program header gives the location and size of the program header table itself. The Phdr table contains all of the Phdrs describing the segments of the file (and of the memory image).
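
Because PT_PHDR describes the table that the ELF header also points to, the two should agree. The sketch below compares them; it assumes the Linux <elf.h> header, and the function name pt_phdr_consistent is hypothetical.

```c
#include <elf.h>
#include <stdbool.h>

/* Return true if a PT_PHDR entry matches the ELF header's own view of
 * the program header table: same file offset, same total size. */
bool pt_phdr_consistent(const Elf32_Ehdr *ehdr, const Elf32_Phdr *ph)
{
    if (ph->p_type != PT_PHDR)
        return false;

    Elf32_Word table_size = (Elf32_Word)ehdr->e_phnum * ehdr->e_phentsize;

    return ph->p_offset == ehdr->e_phoff && ph->p_filesz == table_size;
}
```

A mismatch does not prevent every loader from accepting the file, so treat a false result as something to investigate.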

Consult the ELF(5) man pages or the ELF specification paper to see all possible Phdr types. We have covered the most commonly seen ones that are vital to program execution or that we will be seeing most commonly in our reverse engineering endeavors.

We can use the readelf -l <filename> command to view a file's Phdr table:

Elf file type is EXEC (Executable file)
Entry point 0x8049a30
There are 9 program headers, starting at offset 52
Program Headers:
  Type          Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR          0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
  INTERP        0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD          0x000000 0x08048000 0x08048000 0x1622c 0x1622c R E 0x1000
  LOAD          0x016ef8 0x0805fef8 0x0805fef8 0x003c8 0x00fe8 RW  0x1000
  DYNAMIC       0x016f0c 0x0805ff0c 0x0805ff0c 0x000e0 0x000e0 RW  0x4
  NOTE          0x000168 0x08048168 0x08048168 0x00044 0x00044 R   0x4
  GNU_EH_FRAME  0x016104 0x0805e104 0x0805e104 0x0002c 0x0002c R   0x4
  GNU_STACK     0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
  GNU_RELRO     0x016ef8 0x0805fef8 0x0805fef8 0x00108 0x00108 R   0x1

We can see the entry point of the executable as well as some of the different segment types we just finished discussing. Notice the permission flags and alignment values to the right of the offsets and sizes of the first two PT_LOAD segments.

The text segment is READ+EXECUTE and the data segment is READ+WRITE. Both segments have an alignment of 0x1000 (4,096 bytes), which is the usual page size for a 32-bit x86 executable; the loader uses this value to align the segments in memory.
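
The page arithmetic can be worked through with the data segment from the listing (p_vaddr 0x0805fef8, p_filesz 0x003c8, p_memsz 0x00fe8, p_align 0x1000). The sketch below assumes the Linux <elf.h> header and a power-of-two p_align greater than 1; the function names are hypothetical. It also relies on the ELF rule that when p_memsz exceeds p_filesz, the extra bytes hold zero and follow the initialized part of the segment.

```c
#include <elf.h>

/* Page-aligned address at or below the segment's p_vaddr.
 * Assumes p_align is a power of two greater than 1. */
Elf32_Addr segment_page_start(const Elf32_Phdr *ph)
{
    return ph->p_vaddr & ~(ph->p_align - 1);
}

/* Page-aligned address at or above the end of the segment in memory. */
Elf32_Addr segment_page_end(const Elf32_Phdr *ph)
{
    Elf32_Addr end = ph->p_vaddr + ph->p_memsz;

    return (end + ph->p_align - 1) & ~(ph->p_align - 1);
}

/* Bytes that exist in memory but not in the file; these are zero-filled. */
Elf32_Word segment_zero_fill(const Elf32_Phdr *ph)
{
    return ph->p_memsz - ph->p_filesz;
}
```

For the data segment above, these give a page start of 0x0805f000, a page end of 0x08061000, and 0xc20 zero-filled bytes, so the segment spans two pages in memory even though only 0x3c8 bytes come from the file.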
