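My Take on Code Obfuscation (Part 3)
Preface
This time we take VMProtect V2.12.3's obfuscation of its Handle and Dispatch code as an example and perform a very simple deobfuscation.
Note that it really is very simple, so experts may want to skip this one...
Link:
My Take on Code Obfuscation (Part 1)
Preparation
Run VMProtect on a very simple piece of assembly code, shown below.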
        .386
        .model flat, stdcall
        option casemap:none

include     windows.inc
include     kernel32.inc
include     user32.inc
includelib  kernel32.lib
includelib  user32.lib

        .data
szText  db 'VMProtect V2.12.3',0
szTitle db '三十二变',0          ; strings passed to MessageBox must be NUL-terminated

        .code
@Main   proc
        mov     eax,2019h
        invoke  MessageBox,NULL,offset szText,offset szTitle,MB_OK
        ret
@Main   endp

_Start:
        invoke  @Main
        invoke  ExitProcess,0
        end     _Start
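In VMProtect, configure the @Main procedure to be virtualized.
The disassembly shows that the @Main procedure has been replaced by a single JMP instruction. Following the rule of stepping into every control-transfer instruction we meet, this is as far as we can trace.
We do not know what state the program is in after the retn 0x10 instruction executes.
By an incomplete count, this code uses the following kinds of obfuscation.
1. Stack obfuscation
2. Dead-code insertion
3. Breaking the spatial locality of the code
"Dead-code insertion" here means something slightly different from what was described in Part 1: it is code that can be skipped without affecting the program's subsequent execution state.
Take a concrete piece of code as an example.
int x; //1
x = 5; //2
x = 10; //3
printf("%d", x); //4
Statement 2 is dead code: it only assigns to x, and x is not live after statement 2, because statement 3 redefines x before it is ever used.
Breaking spatial locality means disrupting the analyst's perception of a piece of code. If a logically contiguous sequence is split into countless fragments that are then joined in random order by control-transfer instructions, the fragments end up physically unrelated, which seriously hampers analysis.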
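As a rough illustration of how such scattering can be undone, the sketch below walks a toy memory map and follows every unconditional jmp, emitting instructions in their logical order. The addresses, sizes and instructions are invented for the example, and the tuple layout is an assumption of this sketch, not the format used by the script later in the article.

```python
# Toy model: memory maps address -> (mnemonic, operand, size in bytes).
# All addresses and instructions here are made up for illustration.

def linearize(memory, start):
    """Follow unconditional jmp links and return instructions in logical order."""
    order = []
    addr = start
    while True:
        mnemonic, operand, size = memory[addr]
        if mnemonic == 'jmp':
            addr = operand              # drop the jump and continue at its target
            continue
        order.append((mnemonic, operand))
        if mnemonic == 'ret':
            break                       # ret ends the fragment chain
        addr += size                    # fall through to the next instruction
    return order


memory = {
    0x1000: ('mov', 'eax, 1', 5),
    0x1005: ('jmp', 0x3000, 5),
    0x2000: ('ret', '', 1),
    0x3000: ('add', 'eax, 2', 3),
    0x3003: ('jmp', 0x2000, 5),
}

for mnemonic, operand in linearize(memory, 0x1000):
    print(mnemonic, operand)
```

The three fragments at 0x1000, 0x3000 and 0x2000 come out as one straight run (mov, add, ret), which is the shape the rest of the analysis works on.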
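Next, we write a deobfuscation script to help analyze the virtual machine.
Note that the required libraries are Capstone and PythonWin.
Also, I have only spent half an hour learning Python syntax, so I ended up writing a lot of code that even I find rather mystifying. If you spot problems, please reply in the thread...
Parts of what follows are only my own limited view; if there are mistakes, please reply in the thread.
Dispatch
First, we need to read this piece of code in full, stopping when we reach the ret instruction.
To make the stack contents easier to analyze, when the code is stitched back together a CALL instruction can be converted directly into a push imm32/jmp addr pair, while jump instructions are simply dropped.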
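# Imports required by the scripts in this article
from capstone import *
from capstone.x86 import *

# Fetch a run of assembly code, using RET X as the termination signal
# CALL is converted into push ret / jmp addr
# Conditional jumps are assumed to be taken
# (ReadMemory and IsBranch are helper functions that are not listed in this article)
def GetAsm(StartAddr):
    LstAsm = []
    ObDis = Cs(CS_ARCH_X86, CS_MODE_32)
    ObDis.detail = True
    # Disassemble "push 0" once; it stands in for the return address pushed by CALL
    k = b'\x6A\x00'
    asmm = ObDis.disasm(k, 0x401000)
    for insn in asmm:
        vcall = insn
    state = True
    while state:
        code = ReadMemory(StartAddr, 30)
        asmm = ObDis.disasm(code, StartAddr)
        x = 0
        for insn in asmm:
            if insn.mnemonic == 'ret':
                # Terminator: record it exactly once and stop
                LstAsm.append(insn)
                state = False
                break
            if x == 0:
                if IsBranch(insn.mnemonic):
                    # Jump: drop it and continue at its target
                    for i in insn.operands:
                        StartAddr = i.imm
                    break
                elif insn.mnemonic == 'call':
                    # Call: emit the placeholder push, then continue at the target
                    for i in insn.operands:
                        LstAsm.append(vcall)
                        StartAddr = i.imm
                    break
                else:
                    LstAsm.append(insn)
            elif x == 1:
                # The second decoded instruction only tells us where to resume
                StartAddr = insn.address
                break
            x += 1
    return LstAsm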
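Because we want to find out where the ret 0x10 instruction finally goes, we have to simulate the program's run-time stack.
Build a dictionary to serve as the run-time stack space. By analyzing the code, we handle the effect of sub, sbb, add, adc, lea and similar instructions on esp and update the stack pointer, while also tracking accesses to the stack space. When we meet instructions such as sub esp,reg32 or mov reg32,[esp + reg32 - imm8], however, we still treat their effect on the stack as unpredictable. Stack memory is numbered in units of one DWORD.
My simulated stack pointer, however, grows from low addresses to high addresses, so every adjustment the program makes to esp has to be applied in the opposite direction to the simulated pointer. Each instruction is also given a dictionary with read and write lists that record the stack variables it accesses.
Note that I ignore the case where esp is copied to another register and the program accesses the stack through that register.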
def AnalyzeStackObfs(LstAsm):
    # Simulated stack pointer. It grows from low to high addresses and counts in DWORDs,
    # i.e. one "push dword" increments it by 1
    ptr = -1
    warn = 'Stack cleanup may be wrong: unknown stack-pointer adjustment at %X'
    LstStackInfo = []

    # Record an access through [esp + disp] (no index register) in the read/write lists
    def RecordEspAccess(insn, x, stackele, ptr):
        if x.type == X86_OP_MEM and insn.reg_name(x.mem.base) == 'esp' and x.mem.index == 0:
            if x.mem.disp % 4 != 0:
                print(warn % (insn.address))
            slot = ptr - x.mem.disp // 4
            if x.access == CS_AC_READ:
                stackele['read'].append(slot)
            elif x.access == CS_AC_WRITE:
                stackele['write'].append(slot)
            elif x.access == CS_AC_READ | CS_AC_WRITE:
                stackele['read'].append(slot)
                stackele['write'].append(slot)

    for insn in LstAsm:
        stackele = {}
        stackele['read'] = []
        stackele['write'] = []
        if insn.mnemonic == 'push':  # push
            for x in insn.operands:
                RecordEspAccess(insn, x, stackele, ptr)
            ptr += 1
            stackele['write'].append(ptr)
        elif insn.mnemonic in ['pushad', 'pushal']:  # push
            ptr += 8
            for x in range(ptr, ptr - 8, -1):
                stackele['write'].append(x)
        elif insn.mnemonic == 'pushfd':  # push
            ptr += 1
            stackele['write'].append(ptr)
        elif insn.mnemonic == 'pop':  # pop
            for x in insn.operands:
                RecordEspAccess(insn, x, stackele, ptr)
            stackele['read'].append(ptr)
            ptr -= 1
        elif insn.mnemonic == 'popfd':  # pop
            stackele['read'].append(ptr)
            ptr -= 1
        elif insn.mnemonic in ['popad', 'popal']:  # pop
            for x in range(ptr, ptr - 8, -1):
                stackele['read'].append(x)
            ptr -= 8
        elif insn.mnemonic in ['sub', 'sbb'] and ModifyReg('esp', insn):  # adjust the stack pointer
            for x, k in zip(insn.operands, range(1, 3)):
                if k == 2:
                    if x.type == X86_OP_IMM:
                        if x.imm % 4 == 0:
                            ptr += (x.imm // 4)
                        else:
                            print(warn % (insn.address))
                    else:
                        print(warn % (insn.address))
        elif insn.mnemonic in ['add', 'adc'] and ModifyReg('esp', insn):  # adjust the stack pointer
            for x, k in zip(insn.operands, range(1, 3)):
                if k == 2:
                    if x.type == X86_OP_IMM:
                        if x.imm % 4 == 0:
                            ptr -= (x.imm // 4)
                        else:
                            print(warn % (insn.address))
                    else:
                        print(warn % (insn.address))
        elif insn.mnemonic == 'lea':  # adjust the stack pointer
            if ModifyReg('esp', insn):
                for x, k in zip(insn.operands, range(1, 3)):
                    if k == 2:
                        if x.type == X86_OP_MEM:
                            if x.mem.disp % 4 == 0:
                                ptr -= (x.mem.disp // 4)
                            else:
                                print(warn % (insn.address))
                        else:
                            print(warn % (insn.address))
        elif insn.mnemonic == 'ret':
            stackele['read'].append(ptr)
            for x in insn.operands:
                ptr -= (x.imm // 4 + 1)
        elif ModifyReg('esp', insn):  # any other instruction that adjusts esp is reported in detail
            print(warn % (insn.address))
        else:  # handle instructions of the form xxx [esp + xx],xx
            for x in insn.operands:
                RecordEspAccess(insn, x, stackele, ptr)
        LstStackInfo.append(stackele)
    for x, k in zip(LstAsm, LstStackInfo):
        print(x.mnemonic + ' ' + x.op_str, k)
    return LstStackInfo
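To make the slot-numbering convention easier to check by hand, here is a small self-contained sketch that applies the same rules to a symbolic trace instead of real Capstone instructions. The operation names (`push`, `sub_esp`, `read_esp` and so on) are invented for this sketch only.

```python
def track_slots(trace):
    """trace: list of (op, value). Returns what each step touches.

    ptr is the simulated stack pointer counted in DWORDs; it grows upward
    while the real esp grows downward.
    """
    ptr = -1
    touched = []
    for op, value in trace:
        if op == 'push':                 # esp -= 4  ->  ptr += 1, writes the new top
            ptr += 1
            touched.append(('write', ptr))
        elif op == 'pop':                # reads the top, then esp += 4
            touched.append(('read', ptr))
            ptr -= 1
        elif op == 'sub_esp':            # sub esp, imm  ->  ptr += imm / 4
            ptr += value // 4
            touched.append(None)
        elif op == 'add_esp':            # add esp, imm  ->  ptr -= imm / 4
            ptr -= value // 4
            touched.append(None)
        elif op == 'read_esp':           # e.g. mov reg, [esp + disp]
            touched.append(('read', ptr - value // 4))
        elif op == 'write_esp':          # e.g. mov [esp + disp], reg
            touched.append(('write', ptr - value // 4))
    return touched


trace = [
    ('push', None),        # slot 0
    ('push', None),        # slot 1
    ('sub_esp', 8),        # reserves slots 2 and 3, ptr = 3
    ('write_esp', 0xc),    # [esp + 0xc] -> slot 3 - 3 = 0
    ('read_esp', 4),       # [esp + 4]   -> slot 3 - 1 = 2
    ('add_esp', 8),        # ptr back to 1
    ('pop', None),         # reads slot 1
]
print(track_slots(trace))
```

Because the simulated pointer moves in the opposite direction to esp, a positive displacement from esp always maps to a lower slot number.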
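In the end we get very detailed output.
The stack variables accessed by the ret 0x10 instruction are as follows.
We can see that ret 0x10 reads stack variable 65, which is defined by push dword ptr [esp + 0xc]. That value in turn comes from stack variable 61, whose most recent definition is in mov dword ptr [esp + 8],ecx.
We can go one step further and look at how ecx is defined.
ecx is loaded from [eax * 4 + 0x405821] and then rotated right, which is the decryption step.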
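The table lookup followed by a rotate can be sketched as below. The rotate count 0x10 is taken from the ror ecx, 0x10 instruction in the simplified listing; the table contents are hypothetical and chosen only so that the example decodes to the handler address 0x00405629 mentioned later in the article.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def decode_handler(table, index, rotate=0x10):
    """Model of: mov ecx, [eax*4 + table] ; ror ecx, rotate."""
    return ror32(table[index], rotate)


# Hypothetical encrypted table with a single entry.
table = [0x56290040]
print(hex(decode_handler(table, 0)))
```

The decoded value is what ends up in the stack slot consumed by ret 0x10, so the ret effectively acts as an indirect jump to the selected handler.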
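Now we can look at which code can be classified as dead code.
First, I have to say that the pushad instruction circled in the figure is very tempting, but it defines stack variables 67 and 66, and with the information we have we cannot tell whether they are live. To guarantee that the semantics of the deobfuscated code do not change, we should analyze as conservatively as possible.
Liveness analysis of stack variables follows these principles.
1. Liveness analysis proceeds from the bottom to the top;
2. All stack variables are live at the exit of the code;
3. The liveness of a stack variable propagates upward unchanged until a read or write of that variable is encountered;
4. When the upward propagation meets a write, the variable becomes dead; when it meets a read, the variable becomes live;
5. When one instruction both reads and writes the same stack variable, we assume by default that the read happens before the write.
Principle 2 follows from the premise of conservative analysis, while principle 5 is a rule of thumb; if you find a counterexample, please reply in the thread. All of these principles describe how stack-variable liveness flows within a single basic block.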
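The five principles can be tried out on a hand-written read/write table. The sketch below is only an illustration under the assumption that each instruction is described by a dictionary with `read` and `write` lists, as in the analysis above; the four sample instructions are invented.

```python
def slot_liveness(infos, slot):
    """infos: list of {'read': [...], 'write': [...]} in program order.

    Returns, for each instruction, whether `slot` is live right after it.
    """
    live = True                                   # principle 2: live at the exit
    live_after = [None] * len(infos)
    for i in range(len(infos) - 1, -1, -1):       # principle 1: bottom to top
        live_after[i] = live                      # principle 3: propagate upward
        if slot in infos[i]['write']:             # principle 4: a write kills
            live = False
        if slot in infos[i]['read']:              # principle 4: a read revives;
            live = True                           # tested last, so read wins (principle 5)
    return live_after


infos = [
    {'read': [],  'write': [0]},   # mov [slot0], a    <- dead store
    {'read': [],  'write': [0]},   # mov [slot0], b
    {'read': [0], 'write': [0]},   # add [slot0], c    (reads, then writes)
    {'read': [0], 'write': []},    # mov reg, [slot0]
]
print(slot_liveness(infos, 0))
```

Only the first store writes a slot that is dead immediately afterwards, so it is the only candidate for removal in this trace.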
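The liveness analysis code for stack variables is as follows.
# StackVar is the stack variable to analyze
# After the call, info[StackVar] holds the liveness of that variable right after each instruction
def AnalyzeStackVar(LstStackInfo, StackVar):
    k = True  # every stack variable is assumed live at the exit
    for info in reversed(LstStackInfo):
        info[StackVar] = k
        if StackVar in info['write']:
            k = False
        if StackVar in info['read']:  # checked last, so a read takes precedence over a write
            k = True
    return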
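Once we know the liveness of each stack variable at every instruction, we can delete statements that assign to dead stack variables. The code is as follows.
# (MaxPtr and IsListIn come from the rest of the script and are not listed in this article)
# Return the stack variables that are dead right after the instruction described by info
def GetUnVar(info):
    z = []
    for x in range(0, MaxPtr + 1):
        if not info[x]:
            z.append(x)
    return z

# By default, all stack variables are live at the end
for i in range(0, MaxPtr + 1):
    AnalyzeStackVar(LstStackInfo, i)
LstNewAsm = LstAsm[:]
LstNewStackInfo = LstStackInfo[:]
LstAsmInfo = []
for x, k in zip(LstNewAsm, LstNewStackInfo):
    y = {}
    y['Asm'] = x
    y['Stack'] = k
    LstAsmInfo.append(y)
LstNewAsmInfo = LstAsmInfo[:]
for asmi in LstAsmInfo:
    # Instructions that do not adjust the stack pointer can be deleted outright
    if (len(asmi['Stack']['write']) != 0) and (IsListIn(asmi['Stack']['write'], GetUnVar(asmi['Stack']))) and (asmi['Asm'].mnemonic not in ['push', 'pushal', 'pushfd', 'pushad', 'pop', 'popfd', 'popal', 'popad']):
        print('Instruction at %X was removed' % asmi['Asm'].address)
        LstNewAsmInfo.remove(asmi)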
The result is not very satisfying: only 3 statements were deleted.
The 3 instructions are as follows.
mov word ptr ss:[esp],dx
mov byte ptr ss:[esp+0x8],al
mov byte ptr ss:[esp],bl
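Of course, we can simplify further.
When the stack variable written by a push or pop is dead, we can turn the instruction into one that merely adjusts the stack pointer, as shown in the figure below.
The code at 0x406194 can be reduced to lea esp,[esp - 4], because stack variable 0, which it writes, is dead. We cannot simply delete this line, though, or the stack would become unbalanced.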
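The rewrite rule can be sketched on plain strings before looking at the Capstone-based version. In this sketch the per-instruction `write` list and `dead` set are supplied by hand, and the sample instructions are invented; a real run would take them from the liveness analysis.

```python
# Stack instructions whose written slots are all dead keep their esp effect only.
ADJUST = {
    'push':   'lea esp, [esp - 4]',
    'pushfd': 'lea esp, [esp - 4]',
    'pushal': 'lea esp, [esp - 0x20]',
    'pop':    'lea esp, [esp + 4]',
    'popfd':  'lea esp, [esp + 4]',
    'popal':  'lea esp, [esp + 0x20]',
}


def neutralize_dead_writes(instructions):
    """instructions: list of dicts with
         'text'  - the instruction as a string,
         'write' - stack slots it writes,
         'dead'  - set of slots that are dead right after it.
    """
    result = []
    for ins in instructions:
        mnemonic = ins['text'].split()[0]
        all_dead = bool(ins['write']) and set(ins['write']) <= ins['dead']
        if not all_dead:
            result.append(ins['text'])          # still needed: keep unchanged
        elif mnemonic in ADJUST:
            result.append(ADJUST[mnemonic])     # keep only the esp adjustment
        # any other instruction is a plain store to a dead slot and is dropped
    return result


trace = [
    {'text': 'push 0x39fa6022',          'write': [0], 'dead': {0}},
    {'text': 'mov byte ptr [esp], 0x34', 'write': [0], 'dead': {0}},
    {'text': 'mov dword ptr [esp], eax', 'write': [0], 'dead': set()},
    {'text': 'pop ecx',                  'write': [],  'dead': set()},
]
for line in neutralize_dead_writes(trace):
    print(line)
```

The push survives only as a stack adjustment, the byte store disappears, and the stack stays balanced for the final pop.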
The full code is as follows.
# Clean up stack obfuscation
# (MaxPtr, IsListIn and ListFind come from the rest of the script and are not listed in this article)
def CleanStackObfs(LstStackInfo, LstAsm):
    def AnalyzeStackVar(LstStackInfo, StackVar):
        k = True
        for info in reversed(LstStackInfo):
            info[StackVar] = k
            if StackVar in info['write']:
                k = False
            if StackVar in info['read']:
                k = True
        return

    def GetUnVar(info):
        z = []
        for x in range(0, MaxPtr + 1):
            if not info[x]:
                z.append(x)
        return z

    # By default, all stack variables are live at the end
    for i in range(0, MaxPtr + 1):
        AnalyzeStackVar(LstStackInfo, i)
    LstNewAsm = LstAsm[:]
    LstNewStackInfo = LstStackInfo[:]
    LstAsmInfo = []
    for x, k in zip(LstNewAsm, LstNewStackInfo):
        y = {}
        y['Asm'] = x
        y['Stack'] = k
        LstAsmInfo.append(y)
    LstNewAsmInfo = LstAsmInfo[:]
    ObDis = Cs(CS_ARCH_X86, CS_MODE_32)
    ObDis.detail = True
    # Pre-built stack adjustments that replace dead push/pop instructions.
    # 401000 is decimal here (0x61E68), which is the address these synthetic
    # instructions carry in the listings below.
    for insn in ObDis.disasm(b'\x8D\x64\x24\xFC', 401000):  # lea esp, [esp - 4]
        vPush = insn
    for insn in ObDis.disasm(b'\x8D\x64\x24\xE0', 401000):  # lea esp, [esp - 0x20]
        vPushal = insn
    for insn in ObDis.disasm(b'\x8D\x64\x24\x04', 401000):  # lea esp, [esp + 4]
        vPop = insn
    for insn in ObDis.disasm(b'\x8D\x64\x24\x20', 401000):  # lea esp, [esp + 0x20]
        vPopal = insn
    for asmi in LstAsmInfo:
        if (len(asmi['Stack']['write']) != 0) and (IsListIn(asmi['Stack']['write'], GetUnVar(asmi['Stack']))):
            mnem = asmi['Asm'].mnemonic
            # Report the original address before the instruction is removed or replaced
            print('Instruction at %X was removed' % asmi['Asm'].address)
            if mnem not in ['push', 'pushal', 'pushfd', 'pushad', 'pop', 'popfd', 'popal', 'popad']:
                # Instructions that do not adjust the stack pointer can be deleted outright
                LstNewAsmInfo.remove(asmi)
                continue
            # push/pop family: keep the effect on esp, drop the memory access
            idx = ListFind(LstNewAsmInfo, asmi)
            if mnem in ['push', 'pushfd']:
                LstNewAsmInfo[idx]['Asm'] = vPush
            elif mnem in ['pushal', 'pushad']:
                LstNewAsmInfo[idx]['Asm'] = vPushal
            elif mnem in ['pop', 'popfd']:
                LstNewAsmInfo[idx]['Asm'] = vPop
            elif mnem in ['popad', 'popal']:
                LstNewAsmInfo[idx]['Asm'] = vPopal
            LstNewAsmInfo[idx]['Stack']['write'] = []
            LstNewAsmInfo[idx]['Stack']['read'] = []
    LstAsmInfo = LstNewAsmInfo[:]
    return LstNewAsmInfo
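The output is as follows.
The two rounds of simplification do not look very effective: there were 123 instructions before and there are still 120 afterwards. Most of them, however, are now meaningless stack-pointer adjustments.
We keep using the idea of dead-code elimination, but this time the target is registers.
The liveness principles for registers are the same as those for stack variables, so I will not repeat them here.
Note that I treat any operation on the low 16 bits, high 8 bits or low 8 bits of a 32-bit register as an operation on the whole 32-bit register.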
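One simple way to apply this widening rule is a lookup table from every sub-register name to its 32-bit parent. The sketch below is an assumption about how such a mapping could look; it is not the implementation behind the ModifyRegExact and ReadRegExact helpers used in the article's code.

```python
# 32-bit register -> (low 16 bits, high 8 bits, low 8 bits); None where no such name
# exists in 32-bit mode.
FAMILIES = {
    'eax': ('ax', 'ah', 'al'),
    'ebx': ('bx', 'bh', 'bl'),
    'ecx': ('cx', 'ch', 'cl'),
    'edx': ('dx', 'dh', 'dl'),
    'esi': ('si', None, None),
    'edi': ('di', None, None),
    'ebp': ('bp', None, None),
    'esp': ('sp', None, None),
}

PARENT = {}
for full, parts in FAMILIES.items():
    PARENT[full] = full
    for name in parts:
        if name is not None:
            PARENT[name] = full


def parent_register(name):
    """Return the 32-bit register that contains the given register."""
    return PARENT[name.lower()]


# adc ch, bl is then treated as writing ecx and reading ebx.
print(parent_register('ch'), parent_register('bl'))
```

Widening is conservative: a write to al is treated as a definition of all of eax, which is simpler but can misjudge code that relies on the untouched upper bits.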
The code is as follows.
# (ModifyReg, ModifyRegExact and ReadRegExact are helper functions not listed in this article)
def AnalyeRegObfs(LstAsm):
    # An instruction is kept if it writes no register, adjusts esp,
    # or modifies at least one register that is live after it
    def IsUseful(AsmInfo):
        insn = AsmInfo['Asm']
        (regs_read, regs_write) = insn.regs_access()
        if len(regs_write) == 0:
            return True
        if ModifyReg('esp', insn):
            return True
        for r in AsmInfo.keys():
            if r != 'Asm' and AsmInfo[r]:  # affecting any live register counts as useful
                if ModifyRegExact(r, insn):
                    return True
        return False

    # Backward liveness of one register; entry i is its liveness right after instruction i
    def AnalyzeReg(LstAsm, Reg):
        k = True
        LstRegInfo = []
        for insn in reversed(LstAsm):
            LstRegInfo.insert(0, k)
            if ModifyRegExact(Reg, insn):
                k = False
            if ReadRegExact(Reg, insn):
                k = True
        return LstRegInfo

    LstAsmInfo = []
    LstReg = ['eax', 'ebx', 'ecx', 'edx', 'edi', 'esi', 'ebp']
    LstRegInfo = {}
    for r in LstReg:
        LstRegInfo[r] = AnalyzeReg(LstAsm, r)
    for i in range(0, len(LstAsm)):
        insn = LstAsm[i]
        asmi = {}
        asmi['Asm'] = insn
        for k in LstRegInfo.keys():
            asmi[k] = LstRegInfo[k][i]
        LstAsmInfo.append(asmi)
    # Report dead instructions and keep only the useful ones
    LstNewAsmInfo = []
    for i in LstAsmInfo:
        if IsUseful(i):
            LstNewAsmInfo.append(i)
        else:
            print('%X %s %s' % (i['Asm'].address, i['Asm'].mnemonic, i['Asm'].op_str))
    return LstNewAsmInfo
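20 instructions were deleted.
Note that I did not analyze the liveness of the flags. Flag liveness is messy to analyze, because some instructions both read and write flags, such as cmc, sbb and adc, and the disassembly engine does not report this unambiguously.
Also note that after one round of dead-code removal, new dead code may appear, as in the following.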
int x; //1
x = 1; //2
x += 5; //3
x += 6; //4
x = 10; //5
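Here statement 4 is dead because statement 5 overwrites x before it is used; once statement 4 has been removed, statement 3 becomes dead, and then statement 2. So we need to repeat the process until no more instructions can be deleted. Of course, if you are as lazy as I am, you can simply call the cleanup routine a few more times...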
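The repeat-until-nothing-changes loop can be sketched on the five-line example itself. Each statement is modeled as a tuple saying whether it reads and whether it writes x; this representation is an assumption of the sketch, and the liveness rules are the same ones used for stack variables.

```python
def dead_store_pass(stmts):
    """One round: compute liveness of x bottom-up, then drop writes to a dead x.

    stmts: list of (text, reads_x, writes_x) in program order.
    """
    live = True                          # conservative: x is live at the exit
    live_after = []
    for text, reads, writes in reversed(stmts):
        live_after.append(live)
        if writes:
            live = False
        if reads:                        # read is assumed to happen before the write
            live = True
    live_after.reverse()
    return [s for s, alive in zip(stmts, live_after) if not (s[2] and not alive)]


def eliminate(stmts):
    """Repeat dead_store_pass until a round removes nothing."""
    rounds = 0
    while True:
        reduced = dead_store_pass(stmts)
        if len(reduced) == len(stmts):
            return stmts, rounds
        stmts, rounds = reduced, rounds + 1


program = [
    ('x = 1',  False, True),   # 2
    ('x += 5', True,  True),   # 3
    ('x += 6', True,  True),   # 4
    ('x = 10', False, True),   # 5
]
remaining, rounds = eliminate(program)
print([text for text, _, _ in remaining], rounds)
```

Each round removes exactly one statement here, so three productive rounds are needed before only x = 10 is left; a fixed small number of repeated calls works only when the dependency chains are short.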
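In the end the code is reduced to 85 instructions, 29 of which only adjust the stack pointer, so the final code can be regarded as roughly 56 instructions.
The code is as follows.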
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
40613F mov dword ptr [esp + 8], 0x64e25d43 read = [] write = [0] //1
61E68 lea esp, [esp - 0x20] read = [] write = []
406148 mov dword ptr [esp + 0x24], 0xc54985f2 read = [] write = [1]
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
406152 lea esp, [esp + 0x2c] read = [] write = []
61E68 lea esp, [esp - 0x20] read = [] write = []
4050C8 mov dword ptr [esp + 0x1c], eax read = [] write = [2] //3
4050CC mov byte ptr [esp], 0x34 read = [] write = [9]
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
4050DA mov dword ptr [esp + 0x24], edx read = [] write = [3] //4
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
404C14 mov dword ptr [esp + 0x2c], edi read = [] write = [4] //5
61E68 lea esp, [esp - 4] read = [] write = []
40477F mov dword ptr [esp + 0x2c], edx read = [] write = [5]
4048DC push 0x39fa6022 read = [] write = [17]
405365 mov dword ptr [esp + 0x2c], esi read = [] write = [6] //6
401000 push 0 read = [] write = [18]
401000 push 0 read = [] write = [19]
404961 lea esp, [esp + 0x34] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
4057CE mov dword ptr [esp + 4], ecx read = [] write = [7] //7
4057D5 pushal read = [] write = [16, 15, 14, 13, 12, 11, 10, 9]
4057D6 lea esp, [esp + 0x24] read = [] write = []
4048A7 setno ch read = [] write = []
401000 push 0 read = [] write = [8]
61E68 lea esp, [esp + 4] read = [] write = []
405753 pushfd read = [] write = [8] //11
405763 push ebx read = [] write = [9]
61E68 lea esp, [esp - 4] read = [] write = []
40576A mov dword ptr [esp], ebp read = [] write = [10] //8
405770 push dword ptr [0x404620] read = [] write = [11]
61E68 lea esp, [esp - 4] read = [] write = []
405563 mov dword ptr [esp], 0 read = [] write = [12] //9 (relocation)
40556D mov esi, dword ptr [esp + 0x30] read = [0] write = [] //2
405573 not esi read = [] write = []
405579 sub esi, 0xcb1ca134 read = [] write = []
40557F push ebx read = [] write = [13] //10
61E68 lea esp, [esp + 4] read = [] write = []
405586 ror esi, 0x1a read = [] write = []
405594 mov ebp, esp read = [] write = []
4055A4 sub esp, 0xc0 read = [] write = [] //20
61E68 lea esp, [esp - 0x20] read = [] write = [] //21
4055AB lea edi, [esp + 0x20] read = [] write = [] //22
4055AF lea esp, [esp + 0x20] read = [] write = [] //23
4055B3 add al, cl read = [] write = []
4055B5 shld cx, cx, 0xb read = [] write = []
4055BA mov ebx, esi read = [] write = []
4055BC xadd cl, al read = [] write = []
4055C1 add esi, dword ptr [ebp] read = [] write = []
4055C6 mov al, byte ptr [esi] read = [] write = []
4055D8 sub al, bl read = [] write = []
4055E0 inc al read = [] write = []
4055E4 mov cl, 0x18 read = [] write = []
4055E6 xor al, 0x1e read = [] write = []
4055E8 adc ch, bl read = [] write = []
4055EA add al, 0xc read = [] write = []
4055F0 sbb cl, 0x3e read = [] write = []
4055F3 sub bl, al read = [] write = []
4055F5 pushal read = [] write = [68, 67, 66, 65, 64, 63, 62, 61]
61E68 lea esp, [esp + 4] read = [] write = []
4055FA movzx eax, al read = [] write = []
405601 inc esi read = [] write = []
405604 mov ecx, dword ptr [eax*4 + 0x405821] read = [] write = []
40560B pushfd read = [] write = [68]
61E68 lea esp, [esp - 4] read = [] write = []
405611 mov byte ptr [esp], 0x2a read = [] write = [69]
405615 ror ecx, 0x10 read = [] write = []
405618 push esp read = [] write = [70]
405619 lea esp, [esp + 0x28] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
401000 push 0 read = [] write = [62]
401000 push 0 read = [] write = [63]
4051E4 mov dword ptr [esp + 8], ecx read = [] write = [61]
4051E8 push dword ptr [esp + 4] read = [62] write = [64]
4051EC push dword ptr [esp + 0xc] read = [61] write = [65]
4051F0 ret 0x10 read = [65] write = []
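Statement 1 places 0x64e25d43 on the stack; statement 2 loads it into esi, and a series of decryption operations then yields the starting address of the P-Code.
Statements 3 to 11 save the environment and build the information the virtual machine needs in order to run.
Statements 20 to 23 set up the virtual machine stack and adjust EDI at the same time.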
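The decryption of the P-Code start address can be replayed in a few lines. The constants come from the not esi, sub esi, 0xcb1ca134 and ror esi, 0x1a instructions in the listing; treating the value added by add esi, dword ptr [ebp] as a relocation delta of 0 is my reading of statement 9 and should be taken as an assumption.

```python
MASK = 0xFFFFFFFF


def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    value &= MASK
    return ((value >> count) | (value << (32 - count))) & MASK


def decode_pcode_start(key, reloc=0):
    esi = ~key & MASK                    # not esi
    esi = (esi - 0xCB1CA134) & MASK      # sub esi, 0xcb1ca134
    esi = ror32(esi, 0x1A)               # ror esi, 0x1a
    return (esi + reloc) & MASK          # add esi, dword ptr [ebp]


print(hex(decode_pcode_start(0x64E25D43)))
```

With the pushed constant 0x64e25d43 this evaluates to 0x406234, an address inside the same 0x40xxxx range as the rest of the listing, which is consistent with it pointing at the P-Code stream.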
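Handle
Analyzing a Handle is much the same as analyzing Dispatch.
Pick any Handle, for example the one at 00405629: it has 90 instructions before simplification and 43 afterwards.
The code is as follows.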
404B25 movzx eax, byte ptr [esi] read = [] write = []
404B33 sub al, bl read = [] write = []
404F87 neg al read = [] write = []
404F94 ror al, 1 read = [] write = []
404F9B sub al, 0x9d read = [] write = []
404F9F sub bl, al read = [] write = []
404FA4 mov edx, dword ptr [ebp] read = [] write = []
404FB1 add ebp, 4 read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
61E68 lea esp, [esp - 0x20] read = [] write = []
40536F inc esi read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
404A0B mov dword ptr [eax + edi], edx read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
404A13 mov byte ptr [esp], 0xe3 read = [] write = [10]
404A17 push 0xc46df181 read = [] write = [11]
404A1C lea esp, [esp + 0x30] read = [] write = []
4055C6 mov al, byte ptr [esi] read = [] write = []
4055D8 sub al, bl read = [] write = []
4055E0 inc al read = [] write = []
4055E4 mov cl, 0x18 read = [] write = []
4055E6 xor al, 0x1e read = [] write = []
4055E8 adc ch, bl read = [] write = []
4055EA add al, 0xc read = [] write = []
4055F0 sbb cl, 0x3e read = [] write = []
4055F3 sub bl, al read = [] write = []
4055F5 pushal read = [] write = [7, 6, 5, 4, 3, 2, 1, 0]
61E68 lea esp, [esp + 4] read = [] write = []
4055FA movzx eax, al read = [] write = []
405601 inc esi read = [] write = []
405604 mov ecx, dword ptr [eax*4 + 0x405821] read = [] write = []
40560B pushfd read = [] write = [7]
61E68 lea esp, [esp - 4] read = [] write = []
405611 mov byte ptr [esp], 0x2a read = [] write = [8]
405615 ror ecx, 0x10 read = [] write = []
405618 push esp read = [] write = [9]
405619 lea esp, [esp + 0x28] read = [] write = []
61E68 lea esp, [esp - 4] read = [] write = []
401000 push 0 read = [] write = [1]
401000 push 0 read = [] write = [2]
4051E4 mov dword ptr [esp + 8], ecx read = [] write = [0]
4051E8 push dword ptr [esp + 4] read = [1] write = [3]
4051EC push dword ptr [esp + 0xc] read = [0] write = [4]
4051F0 ret 0x10 read = [4] write = []
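As we can see, this Handle first fetches an operand from the P-Code instruction stream, decodes it, and updates the running key at the same time. It then pops a DWORD from the virtual machine stack and stores it into the CONTEXT structure.
I will not go into further detail.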
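For readers who want to step through the handler, the first part of the listing can be modeled as below. The decode steps mirror the sub al, bl / neg al / ror al, 1 / sub al, 0x9d / sub bl, al sequence; the P-Code byte, the key and the stack value are hypothetical, and modeling the VM stack as a Python list and CONTEXT as a dictionary is a simplification of this sketch.

```python
def ror8(value, count):
    """Rotate an 8-bit value right by count bits."""
    count %= 8
    value &= 0xFF
    return ((value >> count) | (value << (8 - count))) & 0xFF


def decode_operand(encoded, key):
    """Decode one P-Code byte and return (operand, updated key)."""
    al = (encoded - key) & 0xFF      # sub al, bl
    al = (-al) & 0xFF                # neg al
    al = ror8(al, 1)                 # ror al, 1
    al = (al - 0x9D) & 0xFF          # sub al, 0x9d
    key = (key - al) & 0xFF          # sub bl, al
    return al, key


def vm_pop_to_context(pcode, pos, key, vm_stack, context):
    """Pop a DWORD from the VM stack into the CONTEXT slot named by the operand."""
    offset, key = decode_operand(pcode[pos], key)   # movzx eax, byte ptr [esi] + decode
    value = vm_stack.pop()                          # mov edx, [ebp] ; add ebp, 4
    context[offset] = value                         # mov [eax + edi], edx
    return pos + 1, key                             # inc esi


# Hypothetical inputs: one encoded byte, key 0x20, one value on the VM stack.
context = {}
pos, key = vm_pop_to_context(bytes([0xC5]), 0, 0x20, [0xDEADBEEF], context)
print(pos, hex(key), {hex(k): hex(v) for k, v in context.items()})
```

Because the key is updated from the decoded operand, each P-Code byte can only be decoded correctly after all the bytes before it.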
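Afterword:
Code obfuscation is only an auxiliary measure; security that relies entirely on obfuscation is fake security. There is actually a lot of theory about obfuscation out there, but I have not yet found one that can evaluate both the strength and the time overhead of an obfuscation algorithm. Of course, it may be that the knowledge such a theory requires is simply beyond my understanding...
This article was written on and off over about 3 days, so the downloadable script and the script pasted in the article may differ somewhat...
VMProtect V2.12.3 can be downloaded from the Kanxue tools page:
https://tools.pediy.com/win/packers.htm
- End -
Kanxue ID: 三十二变
https://bbs.pediy.com/user-783210.htm
Original article by 三十二变 of the Kanxue forum
Please credit the Kanxue community when reposting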