
ELF dynamic linking

In the old days, everything was statically linked. If a program used external library functions, the library code was compiled directly into the executable. ELF supports dynamic linking, which is a much more efficient way to handle shared libraries.

When a program is loaded into memory, the dynamic linker also loads the shared libraries it needs into that process address space and binds them. Dynamic linking is rarely understood in any depth, as it is a relatively complex procedure that seems to work like magic under the hood. In this section, we will demystify some of its complexities, revealing how it works and how it can be abused by attackers.

Shared libraries are compiled as position-independent and can therefore be easily relocated into a process address space. A shared library is a dynamic ELF object. If you look at readelf -h lib.so, you will see that the e_type (ELF file type) is called ET_DYN. Dynamic objects are very similar to executables. They do not typically have a PT_INTERP segment since they are loaded by the program interpreter, and therefore will not be invoking a program interpreter.
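To make the e_type check concrete, here is a minimal sketch that reads the field straight out of an ELF header, much as readelf -h does. The helper name elf_file_type() is hypothetical, and the sketch assumes that the file uses the same byte order as the host.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns e_type from the ELF header of `path`,
   or -1 if the file cannot be read or is not an ELF file.
   Assumes the file has the same byte order as the host. */
int elf_file_type(const char *path)
{
    unsigned char buf[sizeof(Elf64_Ehdr)];
    Elf32_Half type;
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf), fp);
    fclose(fp);

    /* Require at least a 32-bit header and the ELF magic bytes. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    /* e_type immediately follows e_ident in both the 32-bit and 64-bit layouts. */
    memcpy(&type, buf + EI_NIDENT, sizeof(type));
    return type;
}
```

For a shared library, the result should compare equal to ET_DYN. Note that position-independent executables (PIE) also report ET_DYN, so the value alone does not distinguish a library from a PIE program.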

When a shared library is loaded into a process address space, it must have any relocations satisfied that reference other shared libraries. The dynamic linker must modify the GOT (Global offset table) of the executable (located in the section .got.plt), which is a table of addresses located in the data segment. It is in the data segment because it must be writeable (at least initially; see read-only relocations as a security feature). The dynamic linker patches the GOT with resolved shared library addresses. We will explain the process of lazy linking shortly.

The auxiliary vector

When a program gets loaded into memory by the sys_execve() syscall, the executable is mapped in and given a stack (among other things). The stack for that process address space is set up in a very specific way to pass information to the dynamic linker. This particular setup and arrangement of information is known as the auxiliary vector or auxv. The bottom of the stack (which is its highest memory address since the stack grows down on x86 architecture) is loaded with the following information:


[argc][argv][envp][auxiliary][.ascii data for argv/envp]

The auxiliary vector (or auxv) is a series of ElfN_auxv_t structs.

typedef struct
{
  uint64_t a_type;              /* Entry type */
  union
    {
      uint64_t a_val;           /* Integer value */
    } a_un;
} Elf64_auxv_t;

The a_type describes the auxv entry type, and the a_val provides its value. The following are some of the most important entry types that are needed by the dynamic linker:

#define AT_EXECFD       2       /* File descriptor of program */
#define AT_PHDR         3       /* Program headers for program */
#define AT_PHENT        4       /* Size of program header entry */
#define AT_PHNUM        5       /* Number of program headers */
#define AT_PAGESZ       6       /* System page size */
#define AT_ENTRY        9       /* Entry point of program */
#define AT_UID          11      /* Real uid */

The dynamic linker retrieves information from the stack about the executing program. The linker must know where the program headers are, the entry point of the program, and so on. I listed only a few of the auxv entry types previously, taken from /usr/include/elf.h.
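A process can inspect its own auxiliary vector. The following sketch, which assumes a Linux system with /proc mounted, parses /proc/self/auxv entry by entry; the helper name auxv_lookup() is hypothetical. glibc 2.16 and later also provide getauxval(), which is used here for comparison.

```c
#include <elf.h>
#include <link.h>      /* ElfW() selects the 32-bit or 64-bit type */
#include <stdio.h>
#include <sys/auxv.h>  /* getauxval() */

/* Hypothetical helper: scan /proc/self/auxv for one entry type.
   Returns the entry's a_val, or 0 if the type is not present. */
unsigned long auxv_lookup(unsigned long type)
{
    ElfW(auxv_t) ent;
    unsigned long val = 0;
    FILE *fp = fopen("/proc/self/auxv", "rb");

    if (fp == NULL)
        return 0;
    /* The vector is terminated by an AT_NULL entry. */
    while (fread(&ent, sizeof(ent), 1, fp) == 1 && ent.a_type != AT_NULL) {
        if (ent.a_type == type) {
            val = ent.a_un.a_val;
            break;
        }
    }
    fclose(fp);
    return val;
}

/* Print a few of the entries that the dynamic linker relies on. */
void print_some_auxv(void)
{
    printf("AT_PHDR   = %#lx\n", getauxval(AT_PHDR));
    printf("AT_PHNUM  = %lu\n", getauxval(AT_PHNUM));
    printf("AT_PAGESZ = %lu\n", auxv_lookup(AT_PAGESZ));
    printf("AT_ENTRY  = %#lx\n", auxv_lookup(AT_ENTRY));
}
```

Both lookup paths read the same data that the kernel placed on the stack, so they should agree for any given entry type.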

The auxiliary vector gets set up by a kernel function called create_elf_tables() that resides in the Linux source code /usr/src/linux/fs/binfmt_elf.c.

In fact, the execution process from the kernel looks something like the following:

  1. sys_execve() →
  2. Calls do_execve_common() →
  3. Calls search_binary_handler() →
  4. Calls load_elf_binary() →
  5. Calls create_elf_tables().

The following is some of the code from create_elf_tables() in /usr/src/linux/fs/binfmt_elf.c that adds auxv entries:

NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
NEW_AUX_ENT(AT_PHDR, load_addr + exec->e_phoff);
NEW_AUX_ENT(AT_PHENT, sizeof(struct elf_phdr));
NEW_AUX_ENT(AT_PHNUM, exec->e_phnum);
NEW_AUX_ENT(AT_BASE, interp_load_addr);
NEW_AUX_ENT(AT_ENTRY, exec->e_entry);

As you can see, the ELF entry point and the address of the program headers, among other values, are placed onto the stack with the NEW_AUX_ENT() macro in the kernel.
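The AT_PHDR, AT_PHENT, and AT_PHNUM values are enough to walk the program headers of the running executable in memory, which is roughly what the dynamic linker does with them. The following is only a sketch of that idea; count_phdrs() is a hypothetical helper, and getauxval() requires glibc 2.16 or later.

```c
#include <elf.h>
#include <link.h>      /* ElfW() */
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>  /* getauxval() */

/* Hypothetical helper: count the in-memory program headers of a given
   p_type, locating the table through the auxiliary vector. */
size_t count_phdrs(ElfW(Word) p_type)
{
    const unsigned char *base =
        (const unsigned char *)(uintptr_t)getauxval(AT_PHDR);
    unsigned long phent = getauxval(AT_PHENT);  /* size of one entry */
    unsigned long phnum = getauxval(AT_PHNUM);  /* number of entries */
    size_t count = 0;
    unsigned long i;

    if (base == NULL)
        return 0;
    for (i = 0; i < phnum; i++) {
        const ElfW(Phdr) *ph = (const ElfW(Phdr) *)(base + i * phent);
        if (ph->p_type == p_type)
            count++;
    }
    return count;
}
```

Calling count_phdrs(PT_INTERP) on a dynamically linked executable would be expected to return 1, and count_phdrs(PT_LOAD) at least 1.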

Once a program is loaded into memory and the auxiliary vector has been filled in, control is passed to the dynamic linker. The dynamic linker resolves symbols and relocations for shared libraries that are linked into the process address space. By default, an executable is dynamically linked with the GNU C library libc.so. The ldd command will show you the shared library dependencies of a given executable.

Learning about the PLT/GOT

The PLT (procedure linkage table) and GOT (Global offset table) can be found in executable files and shared libraries. We will be focusing specifically on the PLT/GOT of an executable program. When a program calls a shared library function such as strcpy() or printf(), which are not resolved until runtime, there must exist a mechanism to dynamically link the shared libraries and resolve the addresses to the shared functions. When a dynamically linked program is compiled, it handles shared library function calls in a specific way, far differently from a simple call instruction to a local function.

Let's take a look at a call to the libc.so function fgets() in a 32-bit compiled ELF executable. We will use a 32-bit executable in our examples because the relationship with the GOT is easier to visualize since IP relative addressing is not used, as it is in 64-bit executables:

objdump -d test
 ...
 8048481:       e8 da fe ff ff          call   8048360 <fgets@plt>
 ...

The address 0x8048360 corresponds to the PLT entry for fgets(). Let's take a look at that address in our executable:

objdump -d test (grep for 8048360)
...
08048360 <fgets@plt>:                   /* A jmp into the GOT */
 8048360:       ff 25 00 a0 04 08       jmp    *0x804a000
 8048366:       68 00 00 00 00          push   $0x0
 804836b:       e9 e0 ff ff ff          jmp    8048350 <_init+0x34>
...

So the call to fgets() leads to 8048360, which is the PLT jump table entry for fgets(). As we can see in the preceding disassembly, there is an indirect jump to the address stored at 0x804a000. This address is a GOT (global offset table) entry that, once resolved, holds the address of the actual fgets() function in the libc shared library.
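The targets that objdump prints for the call and the final jmp are not stored in the instructions; they are computed from a signed 32-bit displacement relative to the next instruction. The following sketch reproduces that arithmetic for the two listings; rel32_target() is a hypothetical helper name.

```c
#include <stdint.h>

/* Target of an x86 relative call or jmp: the address of the next
   instruction plus the signed 32-bit displacement in the encoding. */
uint32_t rel32_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}
```

The call at 0x8048481 is encoded as e8 da fe ff ff, so its displacement is 0xfffffeda, or -0x126, and 0x8048486 - 0x126 gives 0x8048360. Likewise, the jmp at 0x804836b (e9 e0 ff ff ff) has a displacement of -0x20, which lands on 0x8048350.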

However, when the default behavior, lazy linking, is in use, the address of a function has not yet been resolved by the dynamic linker the first time that function is called. Lazy linking means that the dynamic linker does not resolve every function at program loading time. Instead, it resolves functions as they are called, which is made possible through the .plt and .got.plt sections (which correspond to the procedure linkage table and the global offset table, respectively). This behavior can be changed to what is called strict linking with the LD_BIND_NOW environment variable so that all dynamic linking happens right at program loading time. Lazy linking improves performance at load time, which is why it is the default behavior, but it can also be unpredictable, since a linking error may not occur until after the program has been running for some time. I have only experienced this myself one time over the course of years. It is also worth noting that some security features, namely read-only relocations, cannot be fully applied unless strict linking is enabled: the .got.plt section (among others) can only be marked read-only after the dynamic linker has finished patching it, and thus strict linking must be used.

Let's take a look at the relocation entry for fgets():

$ readelf -r test
Offset   Info      Type           SymValue    SymName
...
0804a000  00000107 R_386_JUMP_SLOT   00000000   fgets
...
Note

R_386_JUMP_SLOT is a relocation type for PLT/GOT entries. On x86_64, it is called R_X86_64_JUMP_SLOT.

Notice that the relocation offset is the address 0x804a000, the same address that the fgets() PLT jumps into. Assuming that fgets() is being called for the first time, the dynamic linker has to resolve the address of fgets() and place its value into the GOT entry for fgets().
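The Info column of the readelf output packs two values into one word: the symbol table index in the upper 24 bits and the relocation type in the low 8 bits. A small sketch using the macros from elf.h shows how the entry above decodes. The helper names are hypothetical, and note that elf.h spells the constant R_386_JMP_SLOT, whereas readelf prints R_386_JUMP_SLOT.

```c
#include <elf.h>
#include <stdio.h>

/* Returns nonzero if the relocation is a PLT/GOT jump slot. */
int is_jump_slot(const Elf32_Rel *rel)
{
    return ELF32_R_TYPE(rel->r_info) == R_386_JMP_SLOT;
}

/* Print the fields of one 32-bit relocation entry. */
void print_rel(const Elf32_Rel *rel)
{
    printf("GOT slot %08x  symbol index %u  type %u\n",
           (unsigned)rel->r_offset,
           (unsigned)ELF32_R_SYM(rel->r_info),
           (unsigned)ELF32_R_TYPE(rel->r_info));
}
```

For the fgets() entry, an Elf32_Rel with r_offset 0x0804a000 and r_info 0x00000107 decodes to symbol index 1 and type 7, which is the jump slot type.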

Let's take a look at the GOT in our test program:

08049ff4 <_GLOBAL_OFFSET_TABLE_>:
 8049ff4:       28 9f 04 08 00 00       sub    %bl,0x804(%edi)
 8049ffa:       00 00                   add    %al,(%eax)
 8049ffc:       00 00                   add    %al,(%eax)
 8049ffe:       00 00                   add    %al,(%eax)
 804a000:       66 83 04 08 76          addw   $0x76,(%eax,%ecx,1)
 804a005:       83 04 08 86             addl   $0xffffff86,(%eax,%ecx,1)
 804a009:       83 04 08 96             addl   $0xffffff96,(%eax,%ecx,1)
 804a00d:       83                      .byte 0x83
 804a00e:       04 08                   add    $0x8,%al

Notice the address 0x08048366, which is found at 0x804a000 in the GOT. Remember that little endian reverses the byte order, so it appears as 66 83 04 08. (The instruction mnemonics in this listing are meaningless, because objdump is disassembling data as though it were code; only the bytes matter.) This address is not the address of the fgets() function, since it has not yet been resolved by the linker; instead, it points back into the PLT entry for fgets(). Let's look at the PLT entry for fgets() again:

08048360 <fgets@plt>:
 8048360:       ff 25 00 a0 04 08       jmp    *0x804a000
 8048366:       68 00 00 00 00          push   $0x0
 804836b:       e9 e0 ff ff ff          jmp    8048350 <_init+0x34>

So, jmp *0x804a000 jumps to the address stored there, 0x8048366, which is the push $0x0 instruction. That push instruction has a purpose, which is to tell the dynamic linker which GOT entry belongs to fgets(). The pushed value, 0x0, is an offset that selects the first PLT relocation entry (strictly speaking, it is an offset into the PLT relocation table rather than into the GOT itself). That entry corresponds to the first GOT entry reserved for a shared library symbol value, which is actually the fourth GOT entry, GOT[3]. In other words, the shared library addresses don't get plugged in starting at GOT[0]; they begin at GOT[3] (the fourth entry) because the first three are reserved for other purposes.

Note

Take note of the following GOT offsets:

  • GOT[0] contains an address that points to the dynamic segment of the executable, which is used by the dynamic linker for extracting dynamic linking-related information
  • GOT[1] contains the address of the link_map structure that is used by the dynamic linker to resolve symbols
  • GOT[2] contains the address of the dynamic linker's _dl_runtime_resolve() function, which resolves the actual symbol address for the shared library function
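The GOT layout can be checked by decoding the raw bytes from the objdump listing of _GLOBAL_OFFSET_TABLE_ into 32-bit little-endian words. The following sketch does that and also maps a pushed PLT operand to a GOT index. The helper names are hypothetical, and the mapping assumes the i386 convention in which the operand is a byte offset into the PLT relocation table, whose entries are of type Elf32_Rel.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define GOT_BASE 0x8049ff4u  /* address of _GLOBAL_OFFSET_TABLE_ in the listing */

/* The 28 bytes of the GOT, copied from the objdump output. */
const unsigned char got_bytes[28] = {
    0x28, 0x9f, 0x04, 0x08,   /* GOT[0] */
    0x00, 0x00, 0x00, 0x00,   /* GOT[1]: filled in at runtime */
    0x00, 0x00, 0x00, 0x00,   /* GOT[2]: filled in at runtime */
    0x66, 0x83, 0x04, 0x08,   /* GOT[3]: fgets() slot */
    0x76, 0x83, 0x04, 0x08,   /* GOT[4] */
    0x86, 0x83, 0x04, 0x08,   /* GOT[5] */
    0x96, 0x83, 0x04, 0x08    /* GOT[6] */
};

/* Read GOT[index] as a little-endian 32-bit word. */
uint32_t got_word(size_t index)
{
    const unsigned char *p = got_bytes + 4 * index;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Virtual address of GOT[index]. */
uint32_t got_slot_addr(size_t index)
{
    return GOT_BASE + 4u * (uint32_t)index;
}

/* GOT index selected by the operand that a PLT stub pushes
   (assumed: byte offset into the PLT relocation table, 3 reserved slots). */
size_t got_index_for_push(uint32_t reloc_offset)
{
    return 3 + reloc_offset / sizeof(Elf32_Rel);
}
```

Under these assumptions, a pushed value of 0x0 selects GOT[3] at 0x804a000, and the next stub's operand of 0x8 would select GOT[4]. GOT[1] and GOT[2] are zero here because the listing comes from the file on disk; the dynamic linker fills them in when the program is loaded.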

The last instruction in the fgets() PLT stub is a jmp 8048350. This address points to the very first PLT entry in every executable, known as PLT-0.

PLT-0 from our executable contains the following code:

 8048350:       ff 35 f8 9f 04 08       pushl  0x8049ff8
 8048356:       ff 25 fc 9f 04 08       jmp    *0x8049ffc
 804835c:       00 00                   add    %al,(%eax)

The first pushl instruction pushes the contents of the second GOT entry, GOT[1] (located at 0x8049ff8), onto the stack; as noted earlier, this entry contains the address of the link_map structure.

The jmp *0x8049ffc performs an indirect jmp through the third GOT entry, GOT[2], which contains the address of the dynamic linker's _dl_runtime_resolve() function, therefore transferring control to the dynamic linker, which resolves the address of fgets(). Once fgets() has been resolved, all future calls to the PLT entry for fgets() will result in a jump to the fgets() code itself, rather than pointing back into the PLT and going through the lazy linking process again.

The following is a summary of what we have just covered:

  1. Call fgets@PLT (to call the fgets function).
  2. PLT code does an indirect jmp to the address in the GOT.
  3. The GOT entry contains the address that points back into PLT at the push instruction.
  4. The push $0x0 instruction pushes the offset that identifies the fgets() GOT entry onto the stack.
  5. The final fgets() PLT instruction is a jmp to the PLT-0 code.
  6. The first instruction of PLT-0 pushes the contents of GOT[1], the address of the link_map struct, onto the stack.
  7. The second instruction of PLT-0 is a jmp to the address in GOT[2] that points to the dynamic linker's _dl_runtime_resolve(), which then handles the R_386_JUMP_SLOT relocation by writing the symbol value (memory address) of fgets() into its corresponding GOT entry in the .got.plt section.

The next time fgets() is called, the PLT entry will jump directly to the function itself rather than having to perform the relocation procedure again.
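The whole sequence can be modeled in a few lines of ordinary C using a function pointer in place of the GOT entry. This is only a toy model of the control flow, not how the dynamic linker is implemented, and every name in it is hypothetical.

```c
typedef int (*func_t)(int);

int resolver_runs;             /* counts how often "resolution" happened */

static int lazy_stub(int x);   /* plays the role of push + PLT-0 + resolver */

/* The "GOT slot": it initially points back at the stub, not at the function. */
static func_t got_slot = lazy_stub;

/* Stands in for the real shared library function. */
static int real_function(int x)
{
    return x * 2;
}

static int lazy_stub(int x)
{
    resolver_runs++;
    got_slot = real_function;  /* patch the slot, as the resolver would */
    return real_function(x);   /* then transfer control to the target */
}

/* The "PLT entry": always an indirect jump through the slot. */
int call_through_plt(int x)
{
    return got_slot(x);
}
```

The first call to call_through_plt() goes through lazy_stub() and patches the slot; every later call reaches real_function() directly, so resolver_runs stays at 1.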

The dynamic segment revisited

I earlier referenced the dynamic segment as a section named .dynamic. The dynamic segment has a section header referencing it, but it also has a program header referencing it because it must be found during runtime by the dynamic linker; since section headers don't get loaded into memory, there has to be an associated program header for it.

The dynamic segment contains an array of structs of type ElfN_Dyn:

typedef struct {
    Elf32_Sword    d_tag;
    union {
      Elf32_Word d_val;
      Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;

The d_tag field contains a tag that matches one of the numerous definitions that can be found in the ELF(5) man pages. I have listed some of the most important ones used by the dynamic linker.

DT_NEEDED

This holds the string table offset to the name of a needed shared library.

DT_SYMTAB

This contains the address of the dynamic symbol table also known by its section name .dynsym.

DT_HASH

This holds the address of the symbol hash table, also known by its section name .hash. (The GNU-style hash table in the .gnu.hash section is located through a separate tag, DT_GNU_HASH.)

DT_STRTAB

This holds the address of the symbol string table, also known by its section name .dynstr.

DT_PLTGOT

This holds the address of the global offset table.

Note

The preceding dynamic tags demonstrate how the location of certain sections can be found through the dynamic segment, which can aid in the forensic reconstruction task of rebuilding a section header table. If the section header table has been stripped, a clever individual can rebuild parts of it by getting information from the dynamic segment (that is, the .dynstr, .dynsym, and .hash, among others).

Other segments such as text and data can yield information that you need as well (such as for the .text and .data sections).

The d_val member of ElfN_Dyn holds an integer value that has various interpretations; for instance, it can be the size of a relocation entry.

The d_ptr member holds a virtual memory address that can point to various locations needed by the linker; a good example would be the address to the symbol table for the d_tag DT_SYMTAB.

The dynamic linker uses the d_tag of each ElfN_Dyn entry to locate the parts of the executable that it needs; for example, the entry tagged DT_SYMTAB carries a d_ptr that gives the virtual address of the symbol table.

When the dynamic linker is mapped into memory, it first handles any of its own relocations if necessary; remember that the linker is a shared library itself. It then looks at the executable program's dynamic segment and searches for the DT_NEEDED tags, which hold string table offsets to the names or pathnames of the necessary shared libraries. When it maps a needed shared library into memory, it accesses the library's dynamic segment (yes, they too have dynamic segments) and adds the library's symbol table to a chain of symbol tables that exists to hold the symbol tables for each mapped library.
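A running program can search its own dynamic segment in the same way. The sketch below assumes a dynamically linked executable built against glibc, whose link.h declares the linker-provided _DYNAMIC array; find_dyn() and count_needed() are hypothetical helpers.

```c
#include <elf.h>
#include <link.h>    /* ElfW() and the _DYNAMIC declaration */
#include <stddef.h>

/* Linear search of this program's dynamic segment for a tag.
   Returns NULL if the tag is not present. */
const ElfW(Dyn) *find_dyn(long tag)
{
    const ElfW(Dyn) *d;

    /* The array is terminated by a DT_NULL entry. */
    for (d = _DYNAMIC; d->d_tag != DT_NULL; d++)
        if (d->d_tag == tag)
            return d;
    return NULL;
}

/* Count the DT_NEEDED entries, one per required shared library. */
size_t count_needed(void)
{
    const ElfW(Dyn) *d;
    size_t n = 0;

    for (d = _DYNAMIC; d->d_tag != DT_NULL; d++)
        if (d->d_tag == DT_NEEDED)
            n++;
    return n;
}
```

For an entry returned by find_dyn(DT_SYMTAB), the d_un.d_ptr member is the one to read, while for a DT_NEEDED entry it is d_un.d_val, which holds the string table offset.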

The linker creates a struct link_map entry for each shared library and stores it in a linked list:

struct link_map
  {
    ElfW(Addr) l_addr; /* Base address shared object is loaded at.  */
    char *l_name;      /* Absolute file name object was found in.  */
    ElfW(Dyn) *l_ld;   /* Dynamic section of the shared object.  */
    struct link_map *l_next, *l_prev; /* Chain of loaded objects.  */
  };

Once the linker has finished building its list of dependencies, it handles the relocations on each library, similar to the relocations we discussed earlier in this chapter, as well as fixing up the GOT of each shared library. Lazy linking still applies to the PLT/GOT of shared libraries as well, so GOT relocations (of type R_386_JMP_SLOT) won't happen until the point when a function has actually been called.
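The linked list of link_map entries can be observed from inside a process. One way to reach it, sketched below under the assumption of a dynamically linked glibc program on x86, is the DT_DEBUG entry of the dynamic segment, which the dynamic linker fills in at load time with the address of a struct r_debug whose r_map member is the head of the list. The helper names are hypothetical.

```c
#include <elf.h>
#include <link.h>    /* struct link_map, struct r_debug, _DYNAMIC */
#include <stddef.h>
#include <stdio.h>

/* Find the head of the link_map list through DT_DEBUG, or return NULL. */
const struct link_map *link_map_head(void)
{
    const ElfW(Dyn) *d;

    for (d = _DYNAMIC; d->d_tag != DT_NULL; d++) {
        if (d->d_tag == DT_DEBUG && d->d_un.d_ptr != 0) {
            const struct r_debug *dbg =
                (const struct r_debug *)d->d_un.d_ptr;
            return dbg->r_map;
        }
    }
    return NULL;
}

/* Print every loaded object and return how many there are. */
size_t print_link_maps(void)
{
    const struct link_map *m;
    size_t n = 0;

    for (m = link_map_head(); m != NULL; m = m->l_next) {
        const char *name = (m->l_name != NULL && m->l_name[0] != '\0')
                               ? m->l_name : "(main program)";
        printf("%#lx  %s\n", (unsigned long)m->l_addr, name);
        n++;
    }
    return n;
}
```

The first entry is normally the executable itself, with an empty l_name, followed by the shared libraries in the order they were loaded.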

For more detailed information on ELF and dynamic linking, read the ELF specification online or take a look at some of the interesting glibc source code available. Hopefully, dynamic linking has become less of a mystery and more of an intrigue at this point. In Chapter 7, Process Memory Forensics, we will be covering PLT/GOT poisoning techniques to redirect shared library function calls. Subverting dynamic linking is a very fun technique.
