Many Python obfuscators make the same mistake: they use exec or eval to execute the code. They use compressors like zlib and lzma, and then just pass the original code to the exec call.
Example:
import zlib
exec(zlib.decompress(b'x\x9c+(\xca\xcc+\xd1P\xf7H\xcd\xc9\xc9W\x08\xcf/\xcaIQT\xd7\x04\x00S_\x07\n'))
These tools, often referred to as “obfuscators,” are actually packers. They keep the original code stored and create a wrapper around it to prevent the user from seeing the code directly. The issue with packers is that they always need to unpack themselves in order for Python to execute the code, allowing us to retrieve the original code.
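The point about packers can be sketched with a toy round trip. The payload string and the variable names below are hypothetical stand-ins, not taken from any real tool. The sketch only shows that the step a packer stub performs before exec can be repeated without exec, which yields the source as plain text.

```python
import zlib

# Hypothetical payload standing in for the "protected" program.
original_source = 'print("Hello from the packed script")'

# What a packer ships: the compressed source, embedded in a small stub.
packed = zlib.compress(original_source.encode())

# At run time the stub would call exec(zlib.decompress(packed)).
# Doing the same decompression without exec gives back the source as text.
recovered = zlib.decompress(packed).decode()
print(recovered)
```

Nothing is executed here; the payload is only decompressed and printed.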
The exec function accepts code as an argument, which can be either source code or a code object. By hooking exec, we can dump the code before it is executed. In a future post, I will provide more information about code objects and how they can be used with exec.
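As a small illustration of the two input forms, the sketch below runs one statement from a source string and one from a code object built with compile. The namespace dictionary and the "<demo>" filename are arbitrary choices made for this example.

```python
# A shared namespace so both calls see the same variables.
namespace = {}

# 1. exec with source code given as a string.
exec("x = 1 + 1", namespace)

# 2. exec with a code object produced by compile.
code_obj = compile("y = x * 10", "<demo>", "exec")
exec(code_obj, namespace)

print(namespace["x"], namespace["y"])  # prints: 2 20
```

Either way the payload passes through exec as its first argument, which is what makes a hook on exec useful.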
To intercept and manipulate the behavior of the exec function, we will use a wrapper. We create a hook function that takes our custom function wrap and the original function, and returns a new function, hooked_func. When hooked_func is called, it first calls wrap, where we can interact with the arguments. wrap returns two values: a flag and a fake return value. If the flag is True, hooked_func calls the original exec function and returns its value. If the flag is False, hooked_func skips the original and returns the fake return value passed by wrap. This allows us to decide whether to call the original exec function or not.
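def hook(wrap, old):
    # wrap: our inspection function; old: the original function being replaced.
    def hooked_func(*args, **kwargs):
        # wrap decides whether the original runs and supplies a fallback value.
        call_original, real_return = wrap(*args, **kwargs)
        if call_original:
            return old(*args, **kwargs)
        else:
            return real_return
    return hooked_func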
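exec will still behave almost the same way; there will just be an extra function call in between. The one visible difference is that an exec call without explicit globals or locals now runs in the namespace of hooked_func rather than that of the original caller.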
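One way to install such a hook is through the builtins module, so that every later exec call goes through it. The sketch below is a minimal version under some assumptions: the names captured and dump_exec are hypothetical, the packed payload is built inline so the example is self-contained, and the hook only records the payload instead of running it.

```python
import builtins
import zlib

def hook(wrap, old):
    def hooked_func(*args, **kwargs):
        call_original, real_return = wrap(*args, **kwargs)
        if call_original:
            return old(*args, **kwargs)
        return real_return
    return hooked_func

captured = []  # hypothetical list collecting every payload passed to exec

def dump_exec(*args, **kwargs):
    # The first positional argument of exec is the source or code object.
    captured.append(args[0])
    # False: skip the real exec. None mirrors what exec normally returns.
    return False, None

# A stand-in packed script, built here so the sketch is self-contained.
packed = zlib.compress(b'print("secret logic")')

original_exec = builtins.exec
builtins.exec = hook(dump_exec, original_exec)
try:
    exec(zlib.decompress(packed))  # what the packer stub would run
finally:
    builtins.exec = original_exec  # always restore the real builtin

print(captured[0].decode())
```

Because dump_exec returns False, the payload is never executed; only its source ends up in captured. Restoring the builtin in a finally block keeps the rest of the session unaffected.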
A commonly used obfuscator is created by developer tools.
import base64, codecs
magic = 'cHJp'
love = 'oaDb'
god = 'IjEr'
destiny = 'ZFVc'
joy = '\x72\x6f\x74\x31\x33'
trust = eval('\x6d\x61\x67\x69\x63') + eval('\x63\x6f\x64\x65\x63\x73\x2e\x64\x65\x63\x6f\x64\x65\x28\x6c\x6f\x76\x65\x2c\x20\x6a\x6f\x79\x29') + eval('\x67\x6f\x64') + eval('\x63\x6f\x64\x65\x63\x73\x2e\x64\x65\x63\x6f\x64\x65\x28\x64\x65\x73\x74\x69\x6e\x79\x2c\x20\x6a\x6f\x79\x29')
eval(compile(base64.b64decode(eval('\x74\x72\x75\x73\x74')),'<string>','exec'))
The code above is an example of this obfuscator's output. Instead of calling the exec function directly, the code calls the compile function and passes the resulting code object to the eval function. This indirection is often used to mislead inexperienced reversers, but the same method still applies. To intercept the code, we can reuse the hook from before, but we hook the compile function instead of exec.
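def hook(wrap, old):
    def hooked_func(*args, **kwargs):
        call_original, real_return = wrap(*args, **kwargs)
        if call_original:
            return old(*args, **kwargs)
        else:
            return real_return
    return hooked_func

def my_hook(*args, **kwargs):
    # Dump everything passed to compile; the first argument is the payload.
    print(*args)
    if kwargs:
        print(kwargs)
    # Skip the real compile. The outer eval receives the string "None",
    # which evaluates to None, so the payload itself never runs.
    return False, "None"

# Shadow the builtin with the hooked version for the code below.
compile = hook(my_hook, compile)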
import base64, codecs
magic = 'cHJp'
love = 'oaDb'
god = 'IjEr'
destiny = 'ZFVc'
joy = '\x72\x6f\x74\x31\x33'
trust = eval('\x6d\x61\x67\x69\x63') + eval('\x63\x6f\x64\x65\x63\x73\x2e\x64\x65\x63\x6f\x64\x65\x28\x6c\x6f\x76\x65\x2c\x20\x6a\x6f\x79\x29') + eval('\x67\x6f\x64') + eval('\x63\x6f\x64\x65\x63\x73\x2e\x64\x65\x63\x6f\x64\x65\x28\x64\x65\x73\x74\x69\x6e\x79\x2c\x20\x6a\x6f\x79\x29')
eval(compile(base64.b64decode(eval('\x74\x72\x75\x73\x74')),'<string>','exec'))
b'print("1+1")' <string> exec
In this example, we can see all of the arguments that were passed to the compile function. Many obfuscators use variations of this technique. After discussing code objects, I will provide examples that use marshal for obfuscation.
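A variation on the compile hook is to log the arguments and still let the original run, which helps when a sample has several layers that each need to execute. The following is a rough sketch under stated assumptions: seen and log_compile are hypothetical names, the base64 payload is a harmless stand-in, and the hook is installed through the builtins module rather than by shadowing the name.

```python
import base64
import builtins

def hook(wrap, old):
    def hooked_func(*args, **kwargs):
        call_original, real_return = wrap(*args, **kwargs)
        if call_original:
            return old(*args, **kwargs)
        return real_return
    return hooked_func

seen = []  # hypothetical log of (source, filename, mode) tuples

def log_compile(*args, **kwargs):
    # Record the first three positional arguments of compile.
    seen.append(args[:3])
    # True: let the original compile run, so the sample keeps working.
    return True, None

original_compile = builtins.compile
builtins.compile = hook(log_compile, original_compile)
try:
    # Stand-in for an obfuscated sample: base64 payload, compile, then eval.
    payload = base64.b64encode(b"result = 6 * 7")
    namespace = {}
    eval(compile(base64.b64decode(payload), "<string>", "exec"), namespace)
finally:
    builtins.compile = original_compile  # restore the real builtin

print(seen[0])
print(namespace["result"])
```

Only run a pass-through hook like this on samples you are willing to execute, since the payload does run; for untrusted code, the dump-and-skip variant is the safer choice.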