Hell Oh Entropy!

Life, Code and everything in between

Changing the default loader for a program in its ELF

I was following an email thread in the libc-help mailing list, where the OP asked how one could force a program to load with a different loader, i.e. a loader from a different build of glibc. The answer, which is available if you follow the thread, is that it is necessary to rebuild the program to do this, with the --dynamic-linker switch to the linker. The best answer for the thread was to do this without a rebuild by making a wrapper script that invokes the program with an alternate loader, like:

/path/to/other/ld-linux.so program_name

So the story should have ended there. But it got me thinking; how hard could it be to edit the program to change the hard-coded loader value in the program? So I decided to find out.

The first try was as simple as replacing the string with my new loader string in a test program, which is basically just a Hell Oh World program. To do this, I opened the binary in emacs and changed to hex mode using M-x and then hexl-mode. Next I just wrote over the string with the absolute path of the new loader. I saved my changes and ran the program:

[siddhesh@localhost ~]$ ./a.out
bash: ./a.out: cannot execute binary file

Obviously I had done something wrong. So I had to do a bit of reading to understand what all was to change. One very obvious thing about this was that the length of the loader string was being recorded somewhere -- I was prety sure I had not overwritten anything and I had not changed the offset of the string at all. So I started reading the System V ABI documentation, which includes information on the ELF header. The ELF header documents where the program header table is, and the size of each program header table entry. one could read this very easily using the elfutils readelf command. So here's what I got:

[siddhesh@localhost ~]$ readelf -h ./a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4003e0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          2472 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         8
  Size of section headers:           64 (bytes)
  Number of section headers:         30
  Section header string table index: 27

So the important information here is the start of program headers and the size of each program header. In the case above, the program header starts 64 bytes into the file and then every 56 bytes after that was a program header. You can also read the program headers using elfutils:

[siddhesh@localhost ~]$ readelf -l ./a.out

Elf file type is EXEC (Executable file)
Entry point 0x4003e0
There are 8 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R E    8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000068c 0x000000000000068c  R E    200000
  LOAD           0x0000000000000690 0x0000000000600690 0x0000000000600690
                 0x00000000000001ec 0x0000000000000200  RW     200000
  DYNAMIC        0x00000000000006b8 0x00000000006006b8 0x00000000006006b8
                 0x0000000000000190 0x0000000000000190  RW     8
  NOTE           0x000000000000021c 0x000000000040021c 0x000000000040021c
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x00000000000005e8 0x00000000004005e8 0x00000000004005e8
                 0x0000000000000024 0x0000000000000024  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     

The relevant section here is the INTERP section. Each program header is basically this struct:

typedef struct {
        Elf32_Word p_type;
        Elf32_Off p_offset;
        Elf32_Addr p_vaddr;
        Elf32_Addr p_paddr;
        Elf32_Word p_filesz;
        Elf32_Word p_memsz;
        Elf32_Word p_flags;
        Elf32_Word p_align;
} Elf32_Phdr;

Here, the relevant values of the string size are p_filesz and p_memsz. If you look into the values from the readelf, you'll see that the size is 0x1c, which is the length of the string, including the trailing null character. So that is what I had to change. Now I had to find this header in the hex editor. The header is the one beginning with PT_INTERP, which has value 0x03:

87654321  0011 2233 4455 6677 8899 aabb ccdd eeff  0123456789abcdef                                                                                                               
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 e003 4000 0000 0000  ..>.......@.....
00000020: 4000 0000 0000 0000 a809 0000 0000 0000  @...............
00000030: 0000 0000 4000 3800 0800 4000 1e00 1b00  ....@.8...@.....
00000040: 0600 0000 0500 0000 4000 0000 0000 0000  ........@.......
00000050: 4000 4000 0000 0000 4000 4000 0000 0000  @.@.....@.@.....
00000060: c001 0000 0000 0000 c001 0000 0000 0000  ................
00000070: 0800 0000 0000 0000 0300 0000 0400 0000  ................
00000080: 0002 0000 0000 0000 0002 4000 0000 0000  ..........@.....
00000090: 0002 4000 0000 0000 1c00 0000 0000 0000  ..@.....(.......
000000a0: 1c00 0000 0000 0000 0100 0000 0000 0000  (...............

I just changed the highlighted 1c to 28, which is the hex value of the string length of my interpreter (/home/siddhesh/sandbox/lib/ld-2.8.90.so). This and the change in the string was all that was needed to change the interpreter for the program. This has one really basic flaw, which is that if the interpreter string is longer than the current available space in the header, I will end up overwriting other data, thus corrupting the binary. So if you're being adventurous, be sure that your interpreter path is small enough to fit into the available space.

comments powered by Disqus