Overview
Hooking changes the behavior of an existing function by intercepting a function call or system call and redirecting it to a replacement, which can then modify or extend the original functionality.
Hooking with LD_PRELOAD
Linux provides an environment variable named LD_PRELOAD, which holds one or more shared library paths. When a program starts, the dynamic loader loads the libraries listed in LD_PRELOAD before any other shared library, including the standard C runtime library. This makes it possible to inject a custom shared library that overrides function definitions. Because the preloaded library comes first in the symbol search order, a call to an overridden function resolves to the definition in the preloaded library instead of the original one, which is what enables function hooking.
Example target program that waits for user input (keeps the process blocked):
#include <stdio.h>
int main()
{
printf("please input a number:\n");
int val = 0;
scanf("%d", &val);
printf("already recv your number!\n");
return 0;
}
Example hook library that overrides scanf to print a message, allowing the target program to continue without waiting for input:
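#include <stdio.h>
// Same name as the libc function, so the preloaded definition is found first.
// Note: on recent glibc the compiler may emit the call as __isoc99_scanf,
// in which case that symbol has to be overridden instead.
int scanf(const char *format, ...)
{
    printf("hook scanf!\n");
    return 0;
}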
Compile the target program and the hook shared object:
gcc ./target.c -o target
gcc -shared -fPIC hook.c -o hook.so
Run the target with LD_PRELOAD to apply the hook. The scanf call will be replaced by the hooked behavior:
LD_PRELOAD=./hook.so ./target
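The hook in this example replaces scanf entirely. When a hook should add behavior and still run the original function, a common pattern is to look up the next definition in the search order with dlsym(RTLD_NEXT, ...). The following is a minimal sketch of that pattern, not part of the original example: it wraps puts instead of scanf to stay short, and the hooked_calls counter is a hypothetical name added only to make the effect observable.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* required for RTLD_NEXT in <dlfcn.h> */
#endif
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical counter, present only to show that the hook ran. */
int hooked_calls = 0;

/* Same name and signature as libc's puts, so this definition is found first. */
int puts(const char *s)
{
    /* Resolve the next definition of puts in the search order (normally libc's). */
    static int (*real_puts)(const char *) = NULL;
    if (real_puts == NULL)
        real_puts = (int (*)(const char *))dlsym(RTLD_NEXT, "puts");
    if (real_puts == NULL)
        return EOF;            /* original could not be resolved */

    hooked_calls++;            /* extra behavior added by the hook */
    return real_puts(s);       /* forward to the original function */
}
```

Built as a shared object and preloaded in the same way as hook.so, this would count every puts call in the target while leaving its output unchanged. On older glibc versions the link step may also need -ldl.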
Hooking with ptrace
The LD_PRELOAD method only works for programs that are started after the preload is set. To hook an already running process, ptrace can be used. ptrace allows one process to observe and control another, and is used by debuggers such as GDB. The general steps for ptrace-based hooking are:
- Attach to the running target process with ptrace and save the original register state.
- Locate the head of the target process's link_map chain and traverse it to find the real address of the function to call. The link_map pointer is stored in the .got.plt section (conventionally in its second entry), and the base address of that section is given by the DT_PLTGOT entry of the DYNAMIC segment. Typically dlopen is located first.
- Modify the target process registers and stack to call dlopen so the hook shared object is loaded into the target process.
- Replace the original function address in the target process (for example in the GOT) with the new function address from the loaded hook.so. Since hook.so is now loaded into the target address space, the new function address can be resolved with the same link_map traversal used to find dlopen.
- Restore the original registers and detach with PTRACE_DETACH.
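The DT_PLTGOT lookup mentioned in the steps above can be sketched as a scan over a local copy of the dynamic section. This is only an illustration under stated assumptions: find_dyn_ptr and link_map_slot are hypothetical helper names, the layout is 64-bit ELF, and the second GOT entry is assumed to hold the link_map pointer, which follows the x86-64 convention and may not be populated in binaries linked with immediate binding.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Scan a local copy of a dynamic section for `tag`.
 * Returns the entry's d_ptr value, or 0 if the tag is absent. */
uint64_t find_dyn_ptr(const Elf64_Dyn *dyn, size_t max_entries, int64_t tag)
{
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == tag)
            return dyn[i].d_un.d_ptr;
    }
    return 0;
}

/* Address of the second GOT entry (GOT[1]), where the link_map pointer
 * is assumed to be stored. `pltgot` is the value found for DT_PLTGOT. */
uint64_t link_map_slot(uint64_t pltgot)
{
    return pltgot + sizeof(Elf64_Addr);
}
```

In the ptrace setting, the dynamic section would first be copied out of the target process, and the value read from the address returned by link_map_slot would be the starting point for the link_map traversal.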
The following sections show key implementation details.
Attach and save registers
Attach to the target process and save its registers into a user_regs_struct.
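void ptrace_attach(pid_t pid, struct user_regs_struct *regs)
{
    // Attach to the target; the kernel stops it with SIGSTOP
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) < 0)
    {
        printf("ptrace_attach error\n");
        return;
    }
    // Wait until the target has actually stopped
    waitpid(pid, NULL, WUNTRACED);
    // Save the register state so it can be restored before detaching
    if (ptrace(PTRACE_GETREGS, pid, NULL, regs) < 0)
    {
        printf("ptrace_getregs error!\n");
    }
}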
Traverse link_map to find a symbol
The link_map chain can be obtained by parsing the ELF structure; here is how to traverse the chain to find a symbol address by name.
Elf_Addr find_symbol(int pid, Elf_Addr lm_addr, char *sym_name)
{
    struct link_map lmap; // local copy of the current link_map node
    while (lm_addr)
    {
        // Read the link_map structure from the target process
        ptrace_getdata(pid, lm_addr, &lmap, sizeof(struct link_map));
        // Advance to the next node before any early `continue`
        lm_addr = (Elf_Addr)(lmap.l_next);
        // Skip entries without a library name
        if (0 == lmap.l_name)
        {
            continue;
        }
        // Search this object's symbol table for the requested name
        Elf_Addr sym_addr = find_symbol_in_linkmap(pid, &lmap, sym_name);
        if (sym_addr)
        {
            return sym_addr;
        }
    }
    return 0; // symbol not found in any loaded object
}
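find_symbol relies on a helper, ptrace_getdata, that is not shown in this article. The following is a rough sketch of how such a helper might be written with PTRACE_PEEKDATA, which reads one machine word per call. The signature and return value are assumptions, and read_from_child is a hypothetical demo harness that forks a stopped child so the helper can be exercised without a separate target program.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy `len` bytes from address `addr` in the stopped tracee `pid` into `buf`.
 * Returns 0 on success, -1 on failure. Whole words are read, so the last read
 * may touch a few bytes past `addr + len`. */
int ptrace_getdata(pid_t pid, unsigned long addr, void *buf, size_t len)
{
    unsigned char *out = buf;
    size_t done = 0;

    while (done < len) {
        errno = 0; /* PEEKDATA returns the word itself, so errno reports errors */
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + done), NULL);
        if (word == -1 && errno != 0)
            return -1;

        size_t chunk = len - done < sizeof(word) ? len - done : sizeof(word);
        memcpy(out + done, &word, chunk);
        done += chunk;
    }
    return 0;
}

/* Hypothetical demo: fork a child that stops itself, then read `len` bytes at
 * `src` from it. The child is a copy of the parent, so `src` is valid there. */
int read_from_child(const void *src, void *dst, size_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);               /* stay stopped until the parent is done */
        _exit(0);
    }

    int status = 0;
    int rc = -1;
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status))
        rc = ptrace_getdata(pid, (unsigned long)src, dst, len);

    kill(pid, SIGKILL);               /* the demo child is no longer needed */
    waitpid(pid, NULL, 0);
    return rc;
}
```

A matching ptrace_setdata could follow the same word-by-word loop with PTRACE_POKEDATA, reading the final partial word first so that the bytes beyond the requested length are preserved.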
Inject code to call dlopen in the target
After obtaining dlopen's address, simulate a call to dlopen by writing the library path onto the target stack and setting registers appropriately.
int inject_code(pid_t pid, unsigned long dlopen_addr, char *libc_path)
{
    char sbuf1[STRLEN], sbuf2[STRLEN];
    struct user_regs_struct regs, saved_regs;
    int status;
    ptrace_getregs(pid, &regs); // get all register values
    // Back up the two stack regions that will be overwritten
    ptrace_getdata(pid, regs.rsp + STRLEN, sbuf1, sizeof(sbuf1));
    ptrace_getdata(pid, regs.rsp, sbuf2, sizeof(sbuf2));
    /* return address used to trigger SIGSEGV */
    unsigned long ret_addr = 0x666;
    // Fake return address at the top of the stack, library path above it
    ptrace_setdata(pid, regs.rsp, (char *)&ret_addr, sizeof(ret_addr));
    ptrace_setdata(pid, regs.rsp + STRLEN, libc_path, strlen(libc_path) + 1);
    memcpy(&saved_regs, &regs, sizeof(regs));
    // x86-64 calling convention: rdi = first argument, rsi = second argument
    regs.rdi = regs.rsp + STRLEN;
    regs.rsi = RTLD_NOW|RTLD_GLOBAL|RTLD_NODELETE;
    // +2 presumably compensates for the kernel rewinding rip by the 2-byte
    // syscall instruction when it restarts an interrupted system call
    regs.rip = dlopen_addr+2;
    if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) < 0)
    {
        printf("inject_code:PTRACE_SETREGS 1 failed!\n");
    }
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0)
    {
        printf("inject_code:PTRACE_CONT failed!\n");
    }
    // dlopen returns to 0x666, the target faults, and the tracer regains control
    waitpid(pid, &status, 0);
    // Restore the original registers and stack contents
    if (ptrace(PTRACE_SETREGS, pid, 0, &saved_regs) < 0)
    {
        printf("inject_code:PTRACE_SETREGS 2 failed!\n");
    }
    ptrace_setdata(pid, saved_regs.rsp + STRLEN, sbuf1, sizeof(sbuf1));
    ptrace_setdata(pid, saved_regs.rsp, sbuf2, sizeof(sbuf2));
    return 0;
}
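inject_code stores the status returned by waitpid but does not inspect it. Since the fake return address is meant to trigger SIGSEGV, the tracer could confirm that the stop was caused by that signal before restoring state. A minimal sketch follows; stopped_by is a hypothetical helper name that does not appear in the original code.

```c
#include <signal.h>
#include <sys/wait.h>

/* Returns 1 if `status` (as filled in by waitpid) reports that the tracee
 * was stopped by signal `sig`, and 0 otherwise. */
int stopped_by(int status, int sig)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == sig;
}
```

In inject_code, a check such as stopped_by(status, SIGSEGV) right after waitpid would distinguish the expected fault at the fake return address from an unrelated signal or an exit of the target.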
Find and modify GOT entries
Find the target symbol in relocation entries (PLT/GOT) and obtain the GOT offset to replace.
Elf_Addr find_sym_in_rel(int pid, char *sym_name)
{
    // Local buffers sized for what is actually read (the PLT relocations
    // are read as Elf_Rela entries), so no heap allocation is needed
    Elf_Rela rel;
    Elf_Sym sym;
    int i;
    char str[STRLEN] = {0};
    struct lmap_result *lmret = get_dyn_info(pid);
    for (i = 0; i < lmret->nrelplts; i++)
    {
        // Read the i-th PLT relocation entry, then the symbol it refers to
        ptrace_getdata(pid, lmret->jmprel + i * sizeof(Elf_Rela), &rel, sizeof(Elf_Rela));
        ptrace_getdata(pid, lmret->symtab + ELF64_R_SYM(rel.r_info) * sizeof(Elf_Sym), &sym, sizeof(Elf_Sym));
        // Fetch the symbol name from the string table and compare
        ptrace_getstr(pid, lmret->strtab + sym.st_name, str, STRLEN);
        if (strcmp(str, sym_name) == 0)
            return rel.r_offset; // offset of the GOT entry to patch
    }
    return 0; // symbol not found in the PLT relocations
}
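The function above returns the relocation's r_offset, but the write that actually replaces the GOT entry is not shown. The sketch below illustrates the remaining pieces under stated assumptions: 64-bit x86 ELF, Elf64_Rela relocations of type R_X86_64_JUMP_SLOT, and a known load base for position-independent executables. All function names here are hypothetical and not taken from the original code.

```c
#include <elf.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Number of PLT relocation entries, given the DT_PLTRELSZ value in bytes. */
uint64_t plt_reloc_count(uint64_t pltrelsz)
{
    return pltrelsz / sizeof(Elf64_Rela);
}

/* Symbol-table index referenced by a PLT relocation,
 * or -1 if the entry is not a jump-slot relocation. */
int64_t jump_slot_symbol(const Elf64_Rela *rel)
{
    if (ELF64_R_TYPE(rel->r_info) != R_X86_64_JUMP_SLOT)
        return -1;
    return (int64_t)ELF64_R_SYM(rel->r_info);
}

/* Run-time address of the GOT slot. For a non-PIE executable r_offset is
 * already absolute, so load_base would be 0. */
uint64_t got_slot_address(uint64_t load_base, const Elf64_Rela *rel)
{
    return load_base + rel->r_offset;
}

/* Overwrite one GOT slot in the stopped tracee with `new_func`.
 * On x86-64 a GOT entry is one machine word, so a single POKEDATA suffices.
 * Returns 0 on success, -1 on failure. */
int patch_got_entry(pid_t pid, uint64_t slot, uint64_t new_func)
{
    if (ptrace(PTRACE_POKEDATA, pid, (void *)slot, (void *)new_func) == -1)
        return -1;
    return 0;
}
```

With these pieces, the tracer would pass the slot address for scanf and the address of the replacement function in hook.so to patch_got_entry, after which calls through the PLT would reach the hook.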
Example: auto-assigning scanf input
Modify the target program to loop 10 times, each time reading from stdin and printing the value:
#include <stdio.h>
#include <unistd.h>
int main()
{
    int count = 10; // loop counter, named separately from the input value
    while (count--)
    {
        sleep(2);
        printf("please input a number:\n");
        int val = 0;
        scanf("%d", &val);
        printf("your val is %d\n", val);
    }
    return 0;
}
The hook library automatically assigns a value to the scanned variable:
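#include <stdio.h>
#include <stdarg.h>
int num = 1;
int hookscanf(const char *format, ...)
{
    va_list ap;
    va_start(ap, format);
    // The target calls scanf("%d", &val), so the first variadic argument is an int*
    int *pval = va_arg(ap, int *);
    va_end(ap);
    printf("automatic input: %d\n", num);
    *pval = num++;
    return 1; // scanf returns the number of items assigned
}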
After compiling and running, the target receives values without manual console input.
Summary
This article introduced two common Linux hooking methods. The LD_PRELOAD method is straightforward but only works for processes started with the preload set. The ptrace-based method can attach to and modify already running processes but is more complex and requires a deeper understanding of the ELF file format and related structures.