Hiding Code in the Stack Unwinding Process with DWARF Expressions
This is a featured article from the Kanxue forum
Kanxue forum author ID: 天水姜伯约
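I
Introduction
II
Background
clang++ compile commands
Compile .cpp files to .s files (assembly): clang++ -S -c x1.cpp x2.cpp ...
Assemble and link .s files into an executable: clang++ x1.s x2.s ... -o a.elf
Compile a .bc file to a .s file: llc x1.bc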
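The stack unwinding process
CFA: CFA stands for Canonical Frame Address. Its value is the value of RSP at the moment the caller of the current function (the callee) starts executing its call instruction, that is, before the call has pushed the return address, not after the call has completed. The following snippet illustrates this: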
caller:
    push arg1      --> RSP = 0xFFF8
    push arg2      --> RSP = 0xFFF0 (the RSP value when the call instruction is executed)
    call callee    --> RSP = 0xFFE8
callee:
    push rbp       --> CFA = 0xFFF0
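The arithmetic in the snippet above can be replayed with a short sketch. The function name `simulate_call`, the starting RSP of 0x10000 and the 8-byte word size are assumptions chosen only to reproduce the numbers in the example.

```python
def simulate_call(rsp_before_args, num_args, word=8):
    """Return (cfa, rsp_on_callee_entry) for a caller that pushes num_args words."""
    rsp = rsp_before_args
    for _ in range(num_args):
        rsp -= word   # each push moves RSP down by one word
    cfa = rsp         # RSP when the call instruction starts executing
    rsp -= word       # the call itself pushes the return address
    return cfa, rsp


cfa, rsp_entry = simulate_call(0x10000, 2)
print(hex(cfa), hex(rsp_entry))  # 0xfff0 0xffe8
```

Under these assumptions the return address sits at CFA - 8, which is why an unwinder can locate it once the CFA is known.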
Restoring other registers
DWARF Expression
Encoding (pushing immediates)
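DW_OP_const1(2, 4, 8)u OP1(u_int8, u_int16, u_int32, u_int64)
These four instructions each take one operand, OP1, a fixed-width unsigned integer, and push OP1 onto the stack. If OP1 is narrower than a stack element, the high bits are filled with zeros.
DW_OP_const1(2, 4, 8)s OP1(int8, int16, int32, int64)
Essentially the same as the previous instructions, except that OP1 is a signed (two's-complement) value, so a signed number is pushed instead of an unsigned one.
DW_OP_constu OP1(ULEB128)
This instruction takes one operand, OP1, encoded as ULEB128, and pushes OP1 onto the stack.
DW_OP_consts OP1(SLEB128)
Essentially the same as the previous instruction, except that OP1 is encoded as SLEB128 and a signed number is pushed.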
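As a rough illustration of the fixed-width forms, the sketch below emits the opcode followed by the operand bytes. The opcode values 0x08 to 0x0f are those assigned by the DWARF specification; the helper name `encode_const` and the little-endian byte order (as on AMD64) are assumptions of this sketch.

```python
import struct

# (operand size in bytes, signed?) -> DWARF opcode
DW_OP_CONST = {
    (1, False): 0x08, (1, True): 0x09,   # DW_OP_const1u / DW_OP_const1s
    (2, False): 0x0A, (2, True): 0x0B,   # DW_OP_const2u / DW_OP_const2s
    (4, False): 0x0C, (4, True): 0x0D,   # DW_OP_const4u / DW_OP_const4s
    (8, False): 0x0E, (8, True): 0x0F,   # DW_OP_const8u / DW_OP_const8s
}
_FORMAT = {1: "b", 2: "h", 4: "i", 8: "q"}


def encode_const(value, size, signed):
    """Return the bytecode (list of ints) that pushes a fixed-width constant."""
    fmt = _FORMAT[size] if signed else _FORMAT[size].upper()
    operand = struct.pack("<" + fmt, value)   # little-endian, assumed target
    return [DW_OP_CONST[(size, signed)]] + list(operand)


print(encode_const(255, 1, False))  # [8, 255]
print(encode_const(-2, 2, True))    # [11, 254, 255]
```

`struct.pack` raises `struct.error` when the value does not fit the chosen width, which is a convenient guard when generating bytecode automatically.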
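For the variable-length forms, a minimal sketch of the two LEB128 encoders might look as follows. The function names are hypothetical; the opcodes 0x10 (DW_OP_constu) and 0x11 (DW_OP_consts) come from the DWARF specification, and the test values are the standard examples from the LEB128 article cited in the references.

```python
def uleb128(value):
    """Encode a non-negative integer as ULEB128 (list of byte values)."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = []
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return out


def sleb128(value):
    """Encode a signed integer as SLEB128 (list of byte values)."""
    out = []
    while True:
        byte = value & 0x7F
        value >>= 7                   # arithmetic shift keeps the sign
        sign_bit_set = bool(byte & 0x40)
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return out


def DW_OP_constu(value):
    return [0x10] + uleb128(value)


def DW_OP_consts(value):
    return [0x11] + sleb128(value)


print(DW_OP_constu(624485))   # [16, 229, 142, 38]
print(DW_OP_consts(-123456))  # [17, 192, 187, 120]
```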
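Register addressing
DW_OP_bregn OP1(SLEB128)
n ranges from 0 to 31 and is the register number. On AMD64 the numbers follow the DWARF register mapping of the System V ABI (shown as a figure in the original post). This instruction pushes REG + OP1 onto the stack, where REG is the register selected by n. Note that only the address is pushed; nothing is dereferenced. Dereferencing requires the DW_OP_deref instruction.
REG is the register number, LENGTH_OF_EXPRESSION is the length in bytes of the expression's bytecode, and the expression's bytecode follows immediately after.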
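A small sketch of how such an instruction could be assembled is given below. The register map is the System V AMD64 DWARF numbering for the general-purpose registers and 0x70 is the opcode of DW_OP_breg0; the helper name `breg` is an assumption of this sketch.

```python
# System V AMD64 ABI DWARF numbers for the general-purpose registers
AMD64_DWARF_REG = {
    "rax": 0, "rdx": 1, "rcx": 2, "rbx": 3, "rsi": 4, "rdi": 5, "rbp": 6, "rsp": 7,
    "r8": 8, "r9": 9, "r10": 10, "r11": 11, "r12": 12, "r13": 13, "r14": 14, "r15": 15,
}

DW_OP_breg0 = 0x70  # DW_OP_breg0 .. DW_OP_breg31 occupy 0x70 .. 0x8f


def sleb128(value):
    """Encode a signed integer as SLEB128 (list of byte values)."""
    out = []
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_bit_set = bool(byte & 0x40)
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return out


def breg(reg_name, offset=0):
    """Bytecode that pushes (value of reg_name) + offset, without dereferencing."""
    return [DW_OP_breg0 + AMD64_DWARF_REG[reg_name]] + sleb128(offset)


print(breg("rbx"))       # [115, 0]   -> pushes the value of rbx
print(breg("r12", -8))   # [124, 120] -> pushes r12 - 8
```

With an offset of 0 the instruction simply pushes the register's value, which is a convenient way to read a register inside an expression.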
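Assuming the layout described here is that of DW_CFA_val_expression (the helper of that name is used later in the article), the record could be assembled as sketched below. The opcode 0x16 and the ULEB128 encoding of both the register number and the length follow the DWARF specification; the function `cfa_val_expression` is a hypothetical name and, unlike the article's helper, takes the expression itself rather than its length.

```python
DW_CFA_val_expression = 0x16


def uleb128(value):
    """Encode a non-negative integer as ULEB128 (list of byte values)."""
    out = []
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return out


def cfa_val_expression(reg, expr):
    """Opcode, REG, LENGTH_OF_EXPRESSION, then the expression bytecode."""
    return [DW_CFA_val_expression] + uleb128(reg) + uleb128(len(expr)) + list(expr)


# Illustrative expression: DW_OP_breg3 0 ; DW_OP_plus_uconst 1  (rbx + 1)
expr = [0x73, 0x00, 0x23, 0x01]
print(cfa_val_expression(3, expr))  # [22, 3, 4, 115, 0, 35, 1]
```

Read this way, the record tells the unwinder that the caller's value of register 3 (rbx) is the result of evaluating the expression.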
III
Method
Converting assembly code into a DWARF Expression
handle1()
code fragment to be obfuscated
handle2()
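1. Convert the assembly code into a binary expression tree.
2. Traverse the tree in post-order to obtain a postfix expression.
3. Translate the postfix expression token by token to obtain the DWARF Expression.
Converting to a binary expression tree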
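Postfix order is convenient because it is exactly the order in which a stack machine, such as the DWARF expression evaluator, consumes its input. The following sketch evaluates a postfix string with an explicit stack to make that correspondence visible; `eval_postfix` and the 64-bit wrap-around are assumptions of this sketch, not part of the original tool.

```python
MASK64 = (1 << 64) - 1

BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
    "<<": lambda a, b: a << b,
    ">>": lambda a, b: a >> b,
}


def eval_postfix(expression, regs):
    """Evaluate a space-separated postfix expression over 64-bit values."""
    stack = []
    for token in expression.split():
        if token in BINARY_OPS:
            right = stack.pop()          # top of stack is the right operand
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right) & MASK64)
        elif token in regs:
            stack.append(regs[token] & MASK64)
        else:
            stack.append(int(token) & MASK64)
    return stack.pop()


# (rbx * 3) + 5 with rbx = 7
print(eval_postfix("rbx 3 * 5 +", {"rbx": 7}))  # 26
```

Each token maps to one push or one operator, so replacing the lambdas with the corresponding DWARF opcodes yields the bytecode directly.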
class ASTNode:
    def __init__(self, value, valtype, left, right):
        self.Value = value    # the value
        self.Type = valtype   # kind of value: register, constant, or operator
        self.Left = left
        self.Right = right
① Initialize the register environment, the regs dictionary, setting the initial value of every register to ASTNode(reg, VAL_REG, None, None).
② Parse each instruction
For example, addq %reg1, %reg2 is parsed as regs[op[1]] = ASTNode("+", VAL_OP, regs[op[0]], regs[op[1]]), and addq $const, %reg1 is parsed as regs[op[1]] = ASTNode("+", VAL_OP, regs[op[1]], ASTNode(int(op[0]), VAL_CONST, None, None)).
def Block2AST(block):
    # Memory accesses are deliberately not handled: modelling them is very hard
    if not isHandleable(block):
        return None
    # Calls are inserted before and after the block, so the block must not read
    # any register other than RBX and R12-R15 before writing it
    regs = {}
    for reg in REGISTER_NUMBER.keys():
        regs[reg] = ASTNode(REGISTER_NUMBER[reg], VAL_REG, None, None)
    for line in block:
        # Each line is "<mnemonic>\t<operands>", optionally followed by a "#" comment
        mm = line.split("\t")[0]
        op = line.split("\t")[1]
        if op.find("#") != -1:
            op = op[:op.find("#")]
        op = op.split(",")
        for i in range(len(op)):
            op[i] = op[i].strip()
            op[i] = op[i].replace("%", "")
            op[i] = op[i].replace("$", "")
        if mm == 'imulq':
            if len(op) == 3:
                if isREG(op[0]):
                    regs[op[2]] = ASTNode("*", VAL_OP, regs[op[0]], regs[op[1]])
                else:
                    regs[op[2]] = ASTNode("*", VAL_OP, ASTNode(int(op[0]), VAL_CONST, None, None), regs[op[1]])
            elif len(op) == 2:
                if isREG(op[0]):
                    regs[op[1]] = ASTNode("*", VAL_OP, regs[op[0]], regs[op[1]])
                else:
                    regs[op[1]] = ASTNode("*", VAL_OP, ASTNode(int(op[0]), VAL_CONST, None, None), regs[op[1]])
            elif len(op) == 1:
                print("ERROR: IMUL WITH ONLY 1 OP")
                exit(1)
        elif mm == 'movabsq':
            regs[op[1]] = ASTNode(int(op[0]), VAL_CONST, None, None)
        elif mm == 'addq':
            if isREG(op[0]):
                regs[op[1]] = ASTNode("+", VAL_OP, regs[op[0]], regs[op[1]])
            # ...... omitted; modelling instruction semantics is tedious, repetitive work
            else:
                regs[op[1]] = ASTNode("+", VAL_OP, ASTNode(int(op[0]), VAL_CONST, None, None), regs[op[1]])
        elif mm == 'movq':
            regs[op[1]] = regs[op[0]]
        elif mm == 'shrq':
            if isREG(op[0]):
                print("ERROR: SHRQ WITH REG OP2")
                exit(1)
            else:
                regs[op[1]] = ASTNode(">>", VAL_OP, regs[op[1]], ASTNode(int(op[0]), VAL_CONST, None, None))
        else:
            print("ERROR: " + mm + " UNSUPPORTED")
    return regs
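To see what the register environment looks like after a few instructions, the parsing rules can be applied by hand. In the sketch below the tag values for `VAL_REG`, `VAL_CONST` and `VAL_OP`, the `show` helper, and the use of register names (rather than register numbers) as leaf values are all assumptions made for readability.

```python
VAL_REG, VAL_CONST, VAL_OP = 0, 1, 2   # hypothetical tag values


class ASTNode:
    def __init__(self, value, valtype, left, right):
        self.Value = value
        self.Type = valtype
        self.Left = left
        self.Right = right


def show(node):
    """Render a tree as a fully parenthesised infix string."""
    if node.Type != VAL_OP:
        return str(node.Value)
    return "(" + show(node.Left) + " " + node.Value + " " + show(node.Right) + ")"


# Step 1: every register starts as a leaf that names itself
regs = {name: ASTNode(name, VAL_REG, None, None) for name in ("rbx", "r12")}

# Step 2: apply the instructions one at a time
# movabsq $5, %r12
regs["r12"] = ASTNode(5, VAL_CONST, None, None)
# addq %r12, %rbx
regs["rbx"] = ASTNode("+", VAL_OP, regs["r12"], regs["rbx"])
# imulq $3, %rbx, %rbx
regs["rbx"] = ASTNode("*", VAL_OP, ASTNode(3, VAL_CONST, None, None), regs["rbx"])

print(show(regs["rbx"]))  # (3 * (5 + rbx))
```

Because each assignment reuses the trees already stored in `regs`, the final tree for a register expresses its value purely in terms of constants and the registers' values on entry to the block.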
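Obtaining the postfix expression
def getPostFix(root):
    # A leaf (register or constant) is emitted as-is
    if root.Left is None and root.Right is None:
        return val2Str(root)
    # Post-order traversal: left subtree, right subtree, then the operator
    s1 = getPostFix(root.Left)
    s2 = getPostFix(root.Right)
    return s1 + " " + s2 + " " + val2Str(root)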
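A short usage sketch of the traversal follows. The article does not show `val2Str`, so the version here is a guess: it assumes register leaves store the DWARF register number and maps it back to the register name, and that every other node prints its value directly. The tag values and the reduced register table are likewise assumptions.

```python
VAL_REG, VAL_CONST, VAL_OP = 0, 1, 2                    # hypothetical tag values
REGISTER_NUMBER = {"rbx": 3, "r12": 12, "r13": 13, "r14": 14, "r15": 15}
REGISTER_NAME = {number: name for name, number in REGISTER_NUMBER.items()}


class ASTNode:
    def __init__(self, value, valtype, left, right):
        self.Value = value
        self.Type = valtype
        self.Left = left
        self.Right = right


def val2Str(node):
    """Hypothetical helper: textual form of a single node."""
    if node.Type == VAL_REG:
        return REGISTER_NAME[node.Value]
    return str(node.Value)


def getPostFix(root):
    if root.Left is None and root.Right is None:
        return val2Str(root)
    return getPostFix(root.Left) + " " + getPostFix(root.Right) + " " + val2Str(root)


# Tree for (rbx + 5) * 3
rbx = ASTNode(REGISTER_NUMBER["rbx"], VAL_REG, None, None)
five = ASTNode(5, VAL_CONST, None, None)
three = ASTNode(3, VAL_CONST, None, None)
tree = ASTNode("*", VAL_OP, ASTNode("+", VAL_OP, rbx, five), three)

print(getPostFix(tree))  # rbx 5 + 3 *
```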
Converting the postfix expression to a DWARF Expression
def postFix2DWARF(s):
    code = s.split(" ")
    dwarf = []
    for c in code:
        c = c.strip()
        if is64REG(c):
            # Register operand: push its value
            dwarf += DW_push_reg(REGISTER_NUMBER[c])
        elif c == '<<':
            dwarf += DW_shl()
        elif c == '>>':
            dwarf += DW_shr()
        elif c == '+':
            dwarf += DW_add()
        elif c == '-':
            dwarf += DW_sub()
        elif c == '&':
            dwarf += DW_and()
        elif c == '|':
            dwarf += DW_or()
        elif c == '^':
            dwarf += DW_xor()
        elif c == '*':
            dwarf += DW_mul()
        elif c == '%':
            dwarf += DW_mod()
        else:
            # Anything else is treated as an immediate
            dwarf += DW_push_imm(int(c))
    return dwarf
block_code = postFix2DWARF(post_fix)
block_code = DW_CFA_val_expression(REGISTER_NUMBER[reg], len(block_code)) + block_code
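The `DW_*` helpers are not shown in the article. One plausible way to define them, sketched here as an assumption, is to have each return a list of byte values: registers are pushed with DW_OP_breg<n> and offset 0, immediates with DW_OP_constu or DW_OP_consts, and each operator becomes a single opcode. The opcode values are those from the DWARF specification; `postfix_to_dwarf` is a table-driven stand-in for `postFix2DWARF`, not the author's code.

```python
REGISTER_NUMBER = {"rbx": 3, "r12": 12, "r13": 13, "r14": 14, "r15": 15}

# Operator token -> DWARF opcode
BINARY_OPS = {
    "&": 0x1A,   # DW_OP_and
    "-": 0x1C,   # DW_OP_minus
    "%": 0x1D,   # DW_OP_mod
    "*": 0x1E,   # DW_OP_mul
    "|": 0x21,   # DW_OP_or
    "+": 0x22,   # DW_OP_plus
    "<<": 0x24,  # DW_OP_shl
    ">>": 0x25,  # DW_OP_shr (logical shift right)
    "^": 0x27,   # DW_OP_xor
}


def leb128(value, signed):
    """Encode value as SLEB128 if signed is true, otherwise as ULEB128."""
    out = []
    while True:
        byte = value & 0x7F
        value >>= 7
        if signed:
            sign_bit_set = bool(byte & 0x40)
            done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        else:
            done = value == 0
        out.append(byte if done else byte | 0x80)
        if done:
            return out


def DW_push_reg(number):
    return [0x70 + number, 0x00]            # DW_OP_breg<number> with offset 0


def DW_push_imm(value):
    if value >= 0:
        return [0x10] + leb128(value, False)  # DW_OP_constu
    return [0x11] + leb128(value, True)       # DW_OP_consts


def postfix_to_dwarf(expression):
    dwarf = []
    for token in expression.split():
        if token in REGISTER_NUMBER:
            dwarf += DW_push_reg(REGISTER_NUMBER[token])
        elif token in BINARY_OPS:
            dwarf.append(BINARY_OPS[token])
        else:
            dwarf += DW_push_imm(int(token))
    return dwarf


print(postfix_to_dwarf("rbx 5 + 3 *"))  # [115, 0, 16, 5, 34, 16, 3, 30]
```

Returning plain lists of integers keeps the helpers compatible with the `+` concatenation and `len(block_code)` calls used in the article's snippets.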
Emitting the .cfi_escape directive
if reg in ['rbx', 'r12', 'r13', 'r14', 'r15']:
    # Skip registers whose tree is still the untouched initial leaf
    if not (regs[reg].Left == None and regs[reg].Right == None and
            regs[reg].Value == REGISTER_NUMBER[reg] and regs[reg].Type == VAL_REG):
        block_code = postFix2DWARF(post_fix)
        block_code = DW_CFA_val_expression(REGISTER_NUMBER[reg], len(block_code)) + block_code
        print(reg)
        # Strip the surrounding brackets of the list's string form
        print(".cfi_escape " + str(block_code)[1:len(str(block_code)) - 1])
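The final formatting step can also be written without slicing the string form of a list. The sketch below is one possible variant: `cfi_escape` is a hypothetical helper that checks every value is a single byte and prints the bytes in hexadecimal, which the assembler accepts just as it accepts decimal.

```python
def cfi_escape(code):
    """Format a list of byte values as a .cfi_escape assembler directive."""
    for value in code:
        if not 0 <= value <= 0xFF:
            raise ValueError("not a byte: %r" % (value,))
    return ".cfi_escape " + ", ".join("0x%02x" % value for value in code)


# DW_CFA_val_expression, register 3 (rbx), length 4, then the expression bytes
print(cfi_escape([0x16, 0x03, 0x04, 0x73, 0x00, 0x23, 0x01]))
# .cfi_escape 0x16, 0x03, 0x04, 0x73, 0x00, 0x23, 0x01
```

The range check catches a common generator bug, namely emitting a multi-byte constant as one list element instead of splitting it into bytes.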
① Insert the generated DWARF Expression into wrapper1.
② Delete the code fragment to be obfuscated between handle1 and handle2, then delete the call to handle2.
IV
Experiments
V
Attachments
References
[2]CFI Directives: https://sourceware.org/binutils/docs/as/CFI-directives.html
[3]LEB128: https://en.wikipedia.org/wiki/LEB128
Kanxue ID: 天水姜伯约
https://bbs.pediy.com/user-home-783210.htm