Jonathan Brossard edited this page May 27, 2024 · 8 revisions

wcc : The Witchcraft Core Compiler

The wcc compiler takes binaries (ELF, PE, ...) as an input and creates valid ELF binaries as an output. It can be used to create relocatable object files from executables or shared libraries.

wcc command line options

jonathan@blackbox:~$ wcc
Witchcraft Compiler Collection (WCC) version:0.0.6    (18:10:50 May 10 2024)

Usage: wcc [options] file

options:

    -o, --output           <output file>
    -m, --march            <architecture>
    -e, --entrypoint       <0xaddress>
    -i, --interpreter      <interpreter>
    -p, --poison           <poison>
    -s, --shared
    -c, --compile
    -S, --static
    -x, --strip
    -X, --sstrip
    -E, --exec
    -C, --core
    -O, --original
    -D, --disasm
    -d, --debug
    -h, --help
    -v, --verbose
    -V, --version

jonathan@blackbox:~$

Options description

    -o, --output           <output file>

Specify the desired output file name. Default: a.out

    -m, --march            <architecture>

Specify the desired output architecture. This option is currently ignored: run the 64-bit or the 32-bit version of wcc to produce 64-bit or 32-bit binaries respectively.

    -e, --entrypoint       <0xaddress>

Specify the address of the entry point as found in the ELF header manually.

    -i, --interpreter      <interpreter>

Specify a new program interpreter to be written to the interpreter segment of the output program.

    -p, --poison           <poison>

Specify a poison byte to be written in the unused bytes of the output file.

    -s, --shared

Produce a shared library.

    -c, --compile

Produce relocatable object files.

    -S, --static

Produce a static binary.

    -x, --strip

Do not use the dynamic symbol table to unstrip the binary. Default: off.

    -X, --sstrip

Strip more.

    -E, --exec

Set binary type to ET_EXEC in the ELF header.

    -C, --core

Set binary type to ET_CORE (core file) in the ELF header.

    -O, --original

Copy original section headers from input file (which must be an ELF) instead of guessing them from bfd sections. Default: off.

    -D, --disasm

Display application disassembly.

    -d, --debug

Enable debug mode (very verbose).

    -h, --help

Display help.

    -v, --verbose

Be verbose.

    -V, --version

Display version number.

Example usage of wcc

The primary use of wcc is to "unlink" (undo the work of a linker) ELF binaries, either executables or shared libraries, back into relocatable object files. The following command line attempts to unlink the binary /bin/ls (from GNU coreutils) into a relocatable file named /tmp/ls.o:

jonathan@blackbox:~$ wcc -c /bin/ls -o /tmp/ls.o
jonathan@blackbox:~$ 

This relocatable file can then be used as if it had been produced directly by a compiler. The following command uses the gcc compiler to link /tmp/ls.o into a shared library /tmp/ls.so:

jonathan@blackbox:~$ gcc /tmp/ls.o -o /tmp/ls.so -shared
jonathan@blackbox:~$ 
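
The two steps above can also be scripted. The following is a minimal Python sketch, assuming wcc and gcc are both available on the PATH. The helper names (unlink_cmd, relink_cmd, unlink_and_relink) are hypothetical and not part of wcc; only the flags already shown in the commands above are used.

```python
import subprocess

# Assumption: these helper names are illustrative only, not part of wcc.


def unlink_cmd(binary, obj):
    """Command line asking wcc to unlink `binary` into the relocatable file `obj`."""
    return ["wcc", "-c", binary, "-o", obj]


def relink_cmd(obj, lib):
    """Command line asking gcc to link `obj` into the shared library `lib`."""
    return ["gcc", obj, "-o", lib, "-shared"]


def unlink_and_relink(binary, obj, lib):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    for cmd in (unlink_cmd(binary, obj), relink_cmd(obj, lib)):
        subprocess.run(cmd, check=True)
```

Calling unlink_and_relink("/bin/ls", "/tmp/ls.o", "/tmp/ls.so") would reproduce the two commands shown above. Using check=True stops the script at the first failing step instead of relinking a missing or partial object file.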

Limits of wcc

wcc will process any file supported by libbfd and produce ELF files that contain the same mapping when relinked and executed. This includes PE or OSX COFF files in 32 or 64 bits. However, rebuilding relocations is currently supported only for Intel x86_64 ELF binaries. For instance, transforming a PE into an ELF and invoking pure functions from it is supported.

How does it work?

wcc uses libbfd to parse the sections of the input binary, and generates an ELF file with the corresponding sections and segments. wcc also handles symbols and symbol tables, and attempts to unstrip stripped binaries by parsing their dynamic symbol tables. Relocations are recreated as needed for Intel x86_64 ELF input files. Help on extending wcc to other CPUs and relocation types is very welcome :)

What does the resulting /tmp/ls.o look like in detail?

To observe the output of wcc more closely, let's take a look at /tmp/ls.o as parsed by readelf (from the GNU binutils package), edited for brevity:

jonathan@blackbox:~$ readelf -a /tmp/ls.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          2348624 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         9
  Section header string table index: 8

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  0001ae00
       00000000002191ec  0000000000000000 WAX       0     0     16
  [ 2] .rodata           PROGBITS         0000000000000000  00011f20
       00000000000050fc  0000000000000000   A       0     0     32
  [ 3] .data             PROGBITS         0000000000000000  0001a3a0
       0000000000000254  0000000000000000  WA       0     0     32
  [ 4] .bss              NOBITS           0000000000000000  0001a5f4
       0000000000000d60  0000000000000000  WA       0     0     32
  [ 5] .rela.all         RELA             0000000000000000  00233fe0
       0000000000007158  0000000000000018   A       7     1     8
  [ 6] .strtab           STRTAB           0000000000000000  0023b138
       0000000000000dee  0000000000000000           0     0     1
  [ 7] .symtab           SYMTAB           0000000000000000  0023bf26
       00000000000016f8  0000000000000018           6     5     8
  [ 8] .shstrtab         STRTAB           0000000000000000  0023d890
       000000000000003e  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

Relocation section '.rela.all' at offset 0x233fe0 contains 1209 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000217eb0  000600000001 R_X86_64_64       0000000000000000 __ctype_toupper_loc + 0
000000217eb8  000700000001 R_X86_64_64       0000000000000000 __uflow + 0
000000217ec0  000800000001 R_X86_64_64       0000000000000000 getenv + 0
000000217ec8  000900000001 R_X86_64_64       0000000000000000 sigprocmask + 0
000000217ed0  000a00000001 R_X86_64_64       0000000000000000 raise + 0
000000217ed8  007b00000001 R_X86_64_64       00000000004021f0 free + 0
000000217ee0  000b00000001 R_X86_64_64       0000000000000000 localtime + 0
000000217ee8  000c00000001 R_X86_64_64       0000000000000000 __mempcpy_chk + 0
000000217ef0  000d00000001 R_X86_64_64       0000000000000000 abort + 0
000000217ef8  000e00000001 R_X86_64_64       0000000000000000 __errno_location + 0
000000217f00  000f00000001 R_X86_64_64       0000000000000000 strncmp + 0
...
00000000091f  000400000002 R_X86_64_PC32     0000000000000000 .bss + abd
000000000971  000400000002 R_X86_64_PC32     0000000000000000 .bss + ac1
000000000976  00020000000a R_X86_64_32       0000000000000000 .rodata + 1924
000000000988  000400000002 R_X86_64_PC32     0000000000000000 .bss + acd
0000000009b6  000400000002 R_X86_64_PC32     0000000000000000 .bss + ad1
0000000009ce  00020000000a R_X86_64_32       0000000000000000 .rodata + 1160
0000000009d3  00020000000a R_X86_64_32       0000000000000000 .rodata + 3ca8
000000000a0b  000400000002 R_X86_64_PC32     0000000000000000 .bss + b3e
000000000a12  000400000002 R_X86_64_PC32     0000000000000000 .bss + b46
000000000a26  000400000002 R_X86_64_PC32     0000000000000000 .bss + b0d
000000000a2f  000400000002 R_X86_64_PC32     0000000000000000 .bss + b36
000000000a39  000400000002 R_X86_64_PC32     0000000000000000 .bss + b2a
...
000000000b25  008500000002 R_X86_64_PC32     0000000000000000 optarg - 4
000000000b45  000400000002 R_X86_64_PC32     0000000000000000 .bss + ad1
000000000b50  000400000002 R_X86_64_PC32     0000000000000000 .bss + b3e
00000000240f  008200000002 R_X86_64_PC32     0000000000000000 stderr - 4
...

The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.

Symbol table '.symtab' contains 245 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 .text
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    2 .rodata
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 .data
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 .bss
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 .unknown
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __ctype_toupper_loc
     7: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow
     8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv
     9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND sigprocmask
    10: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND raise
    11: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND localtime
    12: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __mempcpy_chk
	...
   132: 0000000000411efc     0 NOTYPE  WEAK   DEFAULT  UND old__fini
   133: 0000000000000000     8 OBJECT  GLOBAL DEFAULT  UND optarg
   134: 0000000000000000   100 FUNC    GLOBAL DEFAULT    1 old_plt
   135: 0000000000000738   100 FUNC    GLOBAL DEFAULT    1 old_text
   136: 00000000000104d5   100 FUNC    GLOBAL DEFAULT    1 old_text_end
   137: 000000000000b538   100 FUNC    GLOBAL DEFAULT    1 internal_0040d6a0
   138: 000000000000fd78   100 FUNC    GLOBAL DEFAULT    1 internal_00411ee0
   139: 000000000000c4d8   100 FUNC    GLOBAL DEFAULT    1 internal_0040e640
   140: 0000000000007ce8   100 FUNC    GLOBAL DEFAULT    1 internal_00409e50
   141: 000000000000ed28   100 FUNC    GLOBAL DEFAULT    1 internal_00410e90
   142: 000000000000ead8   100 FUNC    GLOBAL DEFAULT    1 internal_00410c40
   143: 00000000000075e8   100 FUNC    GLOBAL DEFAULT    1 internal_00409750
   144: 000000000000e9c8   100 FUNC    GLOBAL DEFAULT    1 internal_00410b30
   145: 0000000000007fb8   100 FUNC    GLOBAL DEFAULT    1 internal_0040a120
   146: 000000000000a6a8   100 FUNC    GLOBAL DEFAULT    1 internal_0040c810
   147: 000000000000c7c8   100 FUNC    GLOBAL DEFAULT    1 internal_0040e930
   148: 000000000000c498   100 FUNC    GLOBAL DEFAULT    1 internal_0040e600
   149: 000000000000c4c8   100 FUNC    GLOBAL DEFAULT    1 internal_0040e630
   150: 000000000000c4e8   100 FUNC    GLOBAL DEFAULT    1 internal_0040e650
   151: 0000000000002c68   100 FUNC    GLOBAL DEFAULT    1 internal_00404dd0
	...
   241: 000000000000e958   100 FUNC    GLOBAL DEFAULT    1 internal_00410ac0
   242: 000000000000fbc8   100 FUNC    GLOBAL DEFAULT    1 internal_00411d30
   243: 000000000000fc48   100 FUNC    GLOBAL DEFAULT    1 internal_00411db0
   244: 000000000000fc88   100 FUNC    GLOBAL DEFAULT    1 internal_00411df0

No version information found in this file.
jonathan@blackbox:~$
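
The key header fields in the listing above can also be checked programmatically. The following is a minimal Python sketch that decodes a little-endian ELF64 header with the standard struct module. The function names are hypothetical, and the sample header is rebuilt by hand from the values readelf printed above rather than read from a real /tmp/ls.o.

```python
import struct

# Little-endian ELF64 file header: e_ident[16] followed by 13 fixed-size fields.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
ET_REL = 1       # e_type of a relocatable file
EM_X86_64 = 62   # e_machine for AMD x86-64

FIELD_NAMES = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
               "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
               "e_shentsize", "e_shnum", "e_shstrndx")


def parse_elf64_header(data):
    """Decode the first 64 bytes of a little-endian ELF64 file into a dict."""
    fields = ELF64_HEADER.unpack_from(data)
    ident = fields[0]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    return dict(zip(FIELD_NAMES, fields[1:]))


def looks_like_relocatable_x86_64(header):
    """True for an x86_64 relocatable object without program headers."""
    return (header["e_type"] == ET_REL
            and header["e_machine"] == EM_X86_64
            and header["e_phnum"] == 0)


# Sample header assembled from the readelf output above (not a real file).
sample = ELF64_HEADER.pack(
    bytes.fromhex("7f454c46020101000000000000000000"),  # Magic
    1, 62, 1,        # Type REL, Machine X86-64, Version 0x1
    0, 0, 2348624,   # Entry point, program header offset, section header offset
    0, 64, 0, 0,     # Flags, header size, program header size and count
    64, 9, 8)        # Section header size, count, string table index

header = parse_elf64_header(sample)
print(header["e_shoff"], header["e_shnum"], looks_like_relocatable_x86_64(header))
# prints: 2348624 9 True
```

To inspect an actual output file instead of the hand-built sample, the first 64 bytes can be read with open("/tmp/ls.o", "rb").read(64) and passed to parse_elf64_header.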

It is worth noticing in particular that wcc rebuilt different types of relocations under the new .rela.all section. It also stripped the sections of the input binary that are not essential to a relocatable object file, and rebuilt a symbol table. On this last topic, note that wcc created new symbols named internal_00XXXXXX, where 0xXXXXXX is the address of a static function within the binary, not normally exported. Finally, wcc also makes use of additional symbol tables, if any are available, to find the addresses of additional functions (parsing both symbol tables and dynamic symbol tables).
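
To make the relocation and symbol entries in the readelf listing easier to read, here is a minimal Python sketch. It splits the Info column of an ELF64 relocation into its symbol index and relocation type, and recovers the original address encoded in an internal_ symbol name. The function names are hypothetical, and all values are copied from the listing. The constant difference it computes between original addresses and symbol values is only an observation on this listing, not a documented guarantee of wcc.

```python
# Relocation type numbers for the three x86_64 types that appear in the listing.
R_X86_64_NAMES = {1: "R_X86_64_64", 2: "R_X86_64_PC32", 10: "R_X86_64_32"}


def decode_r_info(info):
    """Split an ELF64 r_info value into (symbol index, relocation type)."""
    return info >> 32, info & 0xFFFFFFFF


def internal_symbol_address(name):
    """Recover the original address from a wcc-generated internal_ symbol name."""
    prefix = "internal_"
    if not name.startswith(prefix):
        raise ValueError("not an internal_ symbol: " + name)
    return int(name[len(prefix):], 16)


# Info values copied from the .rela.all entries in the listing.
for info in (0x000600000001, 0x000400000002, 0x00020000000A, 0x008500000002):
    sym, rtype = decode_r_info(info)
    print(sym, R_X86_64_NAMES[rtype])
# prints:
# 6 R_X86_64_64
# 4 R_X86_64_PC32
# 2 R_X86_64_32
# 133 R_X86_64_PC32

# (symbol value in /tmp/ls.o, symbol name) pairs copied from the .symtab listing.
symbols = [(0xB538, "internal_0040d6a0"),
           (0xFD78, "internal_00411ee0"),
           (0xC4D8, "internal_0040e640")]
deltas = {internal_symbol_address(name) - value for value, name in symbols}
print([hex(d) for d in deltas])
# prints: ['0x402168']
```

The symbol indices match the symbol table: index 6 is __ctype_toupper_loc, index 4 is the .bss section symbol, index 2 is .rodata and index 133 is optarg.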

Contribute

The Witchcraft Compiler Collection is licensed under the MIT License.

Feel free to contribute :)
