2018-12-01

Bochspwn Vulnerability Hunting in Depth (1): Detecting Double Fetches

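Few people read technical articles these days, and everyone prefers security gossip, but writing them is a very good way to learn. More importantly, specialist articles are written for specialists, not to please everyone.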

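For instrumenting application code there are ready-made frameworks, Pin and DynamoRIO. In fuzzing they can supply the code-coverage feedback that drives the fuzzer; winafl already does this and it works well. Beyond bug hunting, they are also widely used in reverse engineering.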

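All of the above targets user-mode applications. At the kernel level, Pin and DynamoRIO are of no use. Instruction-level instrumentation of a system kernel is sometimes implemented with virtualization instead, for example through the QEMU or Bochs virtual machines.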

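j00ru of Project Zero used the Bochs instrumentation API to monitor the kernel for double fetches, in a project called bochspwn. He later used taint tracking to detect kernel information leaks caused by uninitialized memory; that project is called bochspwn-reloaded.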

The Bochs instrumentation API is documented at http://bochs.sourceforge.net/cgi-bin/lxr/source/instrument/instrumentation.txt . The directory holding the instrumentation code is specified when building Bochs:

./configure [...] --enable-instrumentation="instrument/myinstrument"

These are the API callbacks used by bochspwn:

// Callback invoked when Bochs initializes a CPU object
void bx_instr_initialize(unsigned cpu);
// Callback invoked when Bochs destroys a CPU object
void bx_instr_exit(unsigned cpu);
// Callback invoked when Bochs accesses linear memory
void bx_instr_lin_access(unsigned cpu, bx_address lin, bx_address phy, unsigned len, unsigned memtype, unsigned rw);
// Callback invoked before Bochs executes an instruction
void bx_instr_before_execution(unsigned cpu, bxInstruction_c *i);

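bx_instr_initialize loads the configuration. The config file sets data-structure offsets for each target system environment, and these offsets are used to retrieve key information such as the current process and thread: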

[general]
trace_log_path      = memlog.bin
modules_list_path   = modules.bin

os                  = windows
bitness             = 32
version             = win10_32

min_read_size       = 1
max_read_size       = 16
min_write_size      = 1
max_write_size      = 16

callstack_length    = 48
write_as_text       = 0

symbolize           = 0
symbol_path         = <symbols path>

[win7_32]
kprcb               = 0x120
current_thread      = 0x04
tcb                 = 0x0
process             = 0x150
client_id           = 0x22c
process_id          = 0
thread_id           = 4
create_time         = 0x200
image_filename      = 0x16c
kdversionblock      = 0x34
psloadedmodulelist  = 0x18
loadorder_flink     = 0x0
basedllname         = 0x2c
baseaddress         = 0x18
sizeofimage         = 0x20
us_len              = 0x0
us_buffer           = 0x4
teb_cid             = 0x20
irql                = 0x24
previous_mode       = 0x13a
exception_list      = 0x0
next_exception      = 0x0
try_level           = 0xc
......
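
How such a file might be consumed can be sketched as follows. This is a minimal illustration, not bochspwn's actual loader: parse_config, get_offset, and the Config/Section types are hypothetical names, and the only assumption about the format is what the sample above shows ([section] headers and key = value lines, with [general].version naming the section that holds the offsets).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper types; bochspwn's real loader is not shown in this post.
using Section = std::map<std::string, std::string>;
using Config = std::map<std::string, Section>;

// Strip leading and trailing spaces, tabs, and line-ending characters.
static std::string trim(const std::string &s) {
  const char *ws = " \t\r\n";
  const std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  const std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Parse "[section]" headers and "key = value" lines into nested maps.
Config parse_config(const std::string &text) {
  Config cfg;
  std::string section, line;
  std::istringstream in(text);
  while (std::getline(in, line)) {
    line = trim(line);
    if (line.empty()) continue;
    if (line.front() == '[' && line.back() == ']') {
      section = line.substr(1, line.size() - 2);
      continue;
    }
    const std::size_t eq = line.find('=');
    if (eq == std::string::npos) continue;  // Not a key/value line.
    cfg[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
  }
  return cfg;
}

// Look up a numeric offset in the section named by [general].version.
// Base 0 lets std::stoull accept both "0x120" and plain "4".
uint64_t get_offset(const Config &cfg, const std::string &key) {
  const std::string &version = cfg.at("general").at("version");
  return std::stoull(cfg.at(version).at(key), nullptr, 0);
}
```

With a config whose version is win7_32, get_offset(cfg, "kprcb") would return 0x120. Because the lookups use at(), a version that names a section missing from the file throws std::out_of_range instead of silently yielding a zero offset.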

The core of Bochspwn lies in two functions, bx_instr_lin_access and bx_instr_before_execution. First, the logic of bx_instr_before_execution:

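Ignore instructions executed in real mode
Ignore instructions that do not invoke a system call: only syscall, sysenter, and int are considered, and for int only int 0x2e and int 0x80 are allowed
Obtain the current process/thread IDs, which makes a finding easier to reproduce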

void bx_instr_before_execution(unsigned cpu, bxInstruction_c *i) {
  static client_id thread;
  BX_CPU_C *pcpu = BX_CPU(cpu);
  unsigned opcode;

  // We're not interested in instructions executed in real mode.
  if (!pcpu->protected_mode() && !pcpu->long64_mode()) {
    return;
  }

  // If the system needs an additional invokement from here, call it now.
  if (globals::has_instr_before_execution_handler) {
    invoke_system_handler(BX_OS_EVENT_INSTR_BEFORE_EXECUTION, pcpu, i);
  }

  // Any system-call invoking instruction is interesting - this
  // is mostly due to 64-bit Linux which allows various ways
  // to be used for system-call invocation.
  // Note: We're not checking for int1, int3 nor into instructions.
  opcode = i->getIaOpcode();
  if (opcode != BX_IA_SYSCALL && opcode != BX_IA_SYSENTER && opcode != BX_IA_INT_Ib) {
    return;
  }

  // The only two allowed interrupts are int 0x2e and int 0x80, which are legacy
  // ways to invoke system calls on Windows and linux, respectively.
  if (opcode == BX_IA_INT_Ib && i->Ib() != 0x2e && i->Ib() != 0x80) {
    return;
  }

  // Obtain information about the current process/thread IDs.
  if (!invoke_system_handler(BX_OS_EVENT_FILL_CID, pcpu, &thread)) {
    return;
  }

  // Process information about a new syscall depending on the current mode.
  if (!events::event_new_syscall(pcpu, &thread)) {
    return;
  }
}
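
The body of events::event_new_syscall is not shown here. One plausible job for it is to mark a syscall boundary for the calling thread, so that later memory accesses can be grouped per syscall. The sketch below illustrates only that idea; ClientId and SyscallTracker are hypothetical names, and nothing here is taken from bochspwn's source.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for bochspwn's client_id (process and thread IDs).
struct ClientId {
  uint32_t process_id;
  uint32_t thread_id;
};

// Keeps a per-thread count of the system calls seen so far.
class SyscallTracker {
 public:
  // Called when a syscall-invoking instruction is about to execute.
  // Returns the new syscall sequence number for this thread (1, 2, 3, ...).
  uint64_t on_new_syscall(const ClientId &cid) {
    return ++counts_[{cid.process_id, cid.thread_id}];
  }

  // Sequence number of the syscall the thread is currently inside,
  // or 0 if no syscall has been seen for it yet.
  uint64_t current(const ClientId &cid) const {
    auto it = counts_.find({cid.process_id, cid.thread_id});
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<std::pair<uint32_t, uint32_t>, uint64_t> counts_;
};
```

Tagging every logged access with the value of current() for its thread gives each access a (pid, tid, syscall) key, which is what a later analysis pass needs in order to compare reads made within one syscall.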

Next, the logic of bx_instr_lin_access:

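Ignore read-write (BX_RW) accesses
Determine the CPU mode (32-bit or 64-bit)
Check that the current instruction address pc is a kernel address and that the accessed linear address is a user-mode address
Check that the access length falls within the configured range (1 to 16 bytes in the config above; the limits are set in config.txt); only accesses within this range are processed
An access that passes all of these checks is kernel code touching user memory, which makes it a candidate for a kernel vulnerability; the instruction is then disassembled and the log record is filled in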

void bx_instr_lin_access(unsigned cpu, bx_address lin, bx_address phy,
                         unsigned len, unsigned memtype, unsigned rw) {

  BX_CPU_C *pcpu = BX_CPU(cpu);
  // Not going to use physical memory address.
  (void)phy;

  // Read-write instructions are currently not interesting.
  if (rw == BX_RW)
    return;

  // Is the CPU in protected or long mode?
  unsigned mode = 0;

  // Note: DO NOT change order of these ifs. long64_mode must be called
  // before protected_mode, since it will also return "true" on protected_mode
  // query (well, long mode is technically protected mode).

  if (pcpu->long64_mode()) {
#if BX_SUPPORT_X86_64
    mode = 64;
#else
    return;
#endif  // BX_SUPPORT_X86_64
  } else if (pcpu->protected_mode()) {
    // This is either protected 32-bit mode or 32-bit compat. long mode.
    mode = 32;
  } else {
    // Nothing interesting.
    // TODO(gynvael): Well actually there is the smm_mode(), which
    // might be a little interesting, even if it's just the bochs BIOS
    // SMM code.
    return;
  }

  // Is pc in kernel memory area?
  // Is lin in user memory area?
  bx_address pc = pcpu->prev_rip;
  if (!invoke_system_handler(BX_OS_EVENT_CHECK_KERNEL_ADDR, &pc, NULL) ||
      !invoke_system_handler(BX_OS_EVENT_CHECK_USER_ADDR, &lin, NULL)) {
    return; /* pc not in ring-0 or lin not in ring-3 */
  }

  // Check if the access meets specified operand length criteria.
  if (rw == BX_READ) {
    if (len < globals::config.min_read_size || len > globals::config.max_read_size) {
      return;
    }
  } else {
    if (len < globals::config.min_write_size || len > globals::config.max_write_size) {
      return;
    }
  }

  // Save basic information about the access.
  log_data_st::mem_access_type access_type;
  switch (rw) {
    case BX_READ:
      access_type = log_data_st::MEM_READ;
      break;
    case BX_WRITE:
      access_type = log_data_st::MEM_WRITE;
      break;
    case BX_EXECUTE:
      access_type = log_data_st::MEM_EXEC;
      break;
    case BX_RW:
      access_type = log_data_st::MEM_RW;
      break;
    default: abort();
  }

  // Disassemble current instruction.
  static Bit8u ibuf[32] = {0};
  static char pc_disasm[64];
  if (read_lin_mem(pcpu, pc, sizeof(ibuf), ibuf)) {
    disassembler bx_disassemble;
    bx_disassemble.disasm(mode == 32, mode == 64, 0, pc, ibuf, pc_disasm);
  }

  // With basic information filled in, process the access further.
  process_mem_access(pcpu, lin, len, pc, access_type, pc_disasm);
}
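
The callback above only records kernel accesses to user memory; deciding which of them form a double fetch is a separate step. A double fetch occurs when kernel code reads the same user-mode address more than once within one system call, so a racing user thread can change the value between the reads. The sketch below shows that check over a simplified log. The Access and DoubleFetch structs and find_double_fetches are hypothetical and far simpler than log_data_st; bochspwn's real post-processing is not shown in this post and may differ.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical, simplified log record for one kernel read of user memory.
struct Access {
  uint32_t pid, tid;
  uint64_t syscall_seq;  // Which syscall of this thread the read belongs to.
  uint64_t lin;          // User-mode linear address that was read.
  uint32_t len;          // Number of bytes read.
  uint64_t pc;           // Kernel instruction that performed the read.
};

struct DoubleFetch {
  uint64_t lin;        // First user byte that was read twice.
  uint64_t first_pc;   // Instruction that read it first.
  uint64_t second_pc;  // Instruction that read it again.
};

// Report user bytes that are read more than once within the same syscall
// of the same thread. At most one report is produced per repeated access.
std::vector<DoubleFetch> find_double_fetches(const std::vector<Access> &log) {
  using Key = std::tuple<uint32_t, uint32_t, uint64_t>;
  // For each (pid, tid, syscall): byte address -> pc of the first read.
  std::map<Key, std::map<uint64_t, uint64_t>> seen;
  std::vector<DoubleFetch> out;
  for (const Access &a : log) {
    auto &bytes = seen[Key{a.pid, a.tid, a.syscall_seq}];
    bool reported = false;
    for (uint32_t i = 0; i < a.len; ++i) {
      auto ins = bytes.emplace(a.lin + i, a.pc);
      if (!ins.second && !reported) {  // Byte was already read in this syscall.
        out.push_back({a.lin + i, ins.first->second, a.pc});
        reported = true;
      }
    }
  }
  return out;
}
```

Tracking individual bytes rather than start addresses catches overlapping reads of different sizes, for example a 4-byte read at 0x1000 followed by a 2-byte read at 0x1002. A real analysis would still need to filter the output, since not every repeated read is exploitable.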

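Information is gathered by having invoke_system_handler dispatch custom per-system events. Four operating systems are currently supported (Windows, Linux, FreeBSD, OpenBSD). macOS has not been done yet; the original author has said he would like to add it, and it would be worth trying to implement: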

const struct tag_kSystemEventHandlers {
  const char *system;
  s_event_handler_func handlers[BX_OS_EVENT_MAX];
} kSystemEventHandlers[] = {
  {"windows",
   {(s_event_handler_func)windows::init,
    (s_event_handler_func)windows::check_kernel_addr,
    (s_event_handler_func)windows::check_user_addr,
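    (s_event_handler_func)windows::fill_cid,	// get the thread environment block (TEB) and read the process/thread IDs
    (s_event_handler_func)windows::fill_info,	// read process/thread info using the structure offsets from config.txt, including the image file name, creation time, and stack trace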
    (s_event_handler_func)NULL}
  },
  {"linux",
   {(s_event_handler_func)linux::init,
    (s_event_handler_func)linux::check_kernel_addr,
    (s_event_handler_func)linux::check_user_addr,
    (s_event_handler_func)linux::fill_cid,
    (s_event_handler_func)linux::fill_info,
    (s_event_handler_func)NULL}
  },
  {"freebsd",
   {(s_event_handler_func)freebsd::init,
    (s_event_handler_func)freebsd::check_kernel_addr,
    (s_event_handler_func)freebsd::check_user_addr,
    (s_event_handler_func)freebsd::fill_cid,
    (s_event_handler_func)freebsd::fill_info,
    (s_event_handler_func)freebsd::instr_before_execution}
  },
  {"openbsd",
   {(s_event_handler_func)openbsd::init,
    (s_event_handler_func)openbsd::check_kernel_addr,
    (s_event_handler_func)openbsd::check_user_addr,
    (s_event_handler_func)openbsd::fill_cid,
    (s_event_handler_func)openbsd::fill_info,
    (s_event_handler_func)openbsd::instr_before_execution}
  },
  {NULL, {NULL, NULL, NULL, NULL, NULL}}
};
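
How a table like this can be driven is sketched below in a reduced form. Everything in it is illustrative: the Event enum, SystemHandlers, select_system, invoke, and the win32_sketch namespace are hypothetical names. The address check assumes the default 2 GB / 2 GB split of 32-bit Windows, with kernel space starting at 0x80000000; the real windows::check_kernel_addr is not shown in this post and may be implemented differently.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical event indices; their order must match the handler arrays.
enum Event { EV_INIT, EV_CHECK_KERNEL_ADDR, EV_CHECK_USER_ADDR, EV_MAX };

using Handler = bool (*)(void *arg1, void *arg2);

namespace win32_sketch {
// Assumption: default 32-bit Windows layout, kernel space at 0x80000000 and up.
const uint64_t kKernelBase = 0x80000000ULL;

bool init(void *, void *) { return true; }
bool check_kernel_addr(void *addr, void *) {
  return *static_cast<uint64_t *>(addr) >= kKernelBase;
}
bool check_user_addr(void *addr, void *) {
  return *static_cast<uint64_t *>(addr) < kKernelBase;
}
}  // namespace win32_sketch

struct SystemHandlers {
  const char *system;
  Handler handlers[EV_MAX];
};

// One row per supported system, terminated by a row with a null name.
const SystemHandlers kHandlers[] = {
    {"windows",
     {win32_sketch::init, win32_sketch::check_kernel_addr,
      win32_sketch::check_user_addr}},
    {nullptr, {nullptr, nullptr, nullptr}},
};

// Find the table row whose name matches the `os` key of the config.
const SystemHandlers *select_system(const char *os) {
  for (const SystemHandlers *p = kHandlers; p->system != nullptr; ++p) {
    if (std::strcmp(p->system, os) == 0) return p;
  }
  return nullptr;
}

// Call the handler for an event; a missing handler counts as failure.
bool invoke(const SystemHandlers *sys, Event ev, void *arg1, void *arg2) {
  if (sys == nullptr || sys->handlers[ev] == nullptr) return false;
  return sys->handlers[ev](arg1, arg2);
}
```

Keeping the per-system logic behind one table means that supporting another system such as macOS comes down to adding one row and implementing its handlers, with no change to the instrumentation callbacks.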

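The last step is to output the recorded information. For example, here is the report for CVE-2018-0894, found by the tool's author (its header marks it as an uninit-copy finding):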

------------------------------ found uninit-copy of address fffff8a000a63010

[pid/tid: 000001a0/000001a4] {     wininit.exe}
       COPY of fffff8a000a63010 ---> 1afab8 (64 bytes), pc = fffff80002698600
       [                             mov r11, rcx ]
Allocation origin: 0xfffff80002a11101
                   (ntoskrnl.exe!IopQueryNameInternal+00000071)
--- Shadow memory:
00000000: 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
--- Actual memory:
00000000: 2e 00 30 00 aa aa aa aa 20 30 a6 00 a0 f8 ff ff ..0..... 0......
00000010: 5c 00 44 00 65 00 76 00 69 00 63 00 65 00 5c 00 \.D.e.v.i.c.e.\.
00000020: 48 00 61 00 72 00 64 00 64 00 69 00 73 00 6b 00 H.a.r.d.d.i.s.k.
00000030: 56 00 6f 00 6c 00 75 00 6d 00 65 00 32 00 00 00 V.o.l.u.m.e.2...
--- Stack trace:
 #0  0xfffff80002698600 (ntoskrnl.exe!memmove+00000000)
 #1  0xfffff80002a11319 (ntoskrnl.exe!IopQueryNameInternal+00000289)
 #2  0xfffff800028d4426 (ntoskrnl.exe!IopQueryName+00000026)
 #3  0xfffff800028e8fa8 (ntoskrnl.exe!ObpQueryNameString+000000b0)
 #4  0xfffff8000291313b (ntoskrnl.exe!NtQueryVirtualMemory+000005fb)
 #5  0xfffff800026b9283 (ntoskrnl.exe!KiSystemServiceCopyEnd+00000013)

riusksk's blog
