Building on my previous post titled “Common Mistakes”, I wanted to share a practical example of how a stereotypical obfuscator can be problematic. I received this particular obfuscator as a crackme, and it perfectly illustrates how many obfuscator developers tend to approach security.
The example involves an obfuscator that compiles its “runtime” into a .pyd file, which makes it difficult to overwrite built-in functions, as I discussed in my previous post. However, I will show a more general workaround that works for both source-code obfuscators and compiled ones.
Analyzing
When we first look at the crackme we see the following files:
crackme
│ test-obf.py
│ orion.cp310-win_amd64.pyd
In test-obf.py we find something that looks similar to PyArmor:
__author__ = "censored#0000"
__obfuscator__ = "Orion V1.2"
__github__ = ""
from orion import decrypt_and_exec as Samsung
Samsung(__file__, "58100a5b566c535c5b171d404d505f56680f56495c14580a583d540b564b1d124c5a0e5f695d004f58165b095a3d505956114e484d515e5e3e0f01400d4b0c5c", b'ORION\x04\x05\x00\x01\x1f\x8b\x08\x00[\x07Ld\x02\xff\x01\x9c\x07c\xf8BZh91AY&SY\x08\xbfS9\x00\x02\xfe\x7f\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xd0\x03\xfe\xf5\xcf/^\xee\xde\xf7\xafn\xee\xefm]!4\xc8`L=Bd\xf4M\x19\x03\x00\x86\xd3BmF\xca1\x1a\r\x88\x86\x98\x98\x9e\xa6\x9a4\xc4\xfdS\xd2a3S\xd4\x0c\x04hb2zA\x80jz\x9a\x06M4df\x84`M1\xb5L&\xd3H\xf5\x1b\xd4G\xe9\x1a\x98h\xd0e<\xa1#\xd3MOS\'\x82OSA\xea\x1a3P\x1e\xa1\xa3\xd4...")
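A quick look at the first bytes of the payload already hints at how it is layered. The sketch below is not needed for the approach in this post; it only scans the visible prefix of the literal for well-known container signatures. The names `PAYLOAD_PREFIX`, `MAGICS` and `find_magics` are made up for illustration, and a signature match is a hint rather than proof of the actual format.

```python
# First bytes of the payload literal, copied from test-obf.py above.
PAYLOAD_PREFIX = (
    b"ORION\x04\x05\x00\x01\x1f\x8b\x08\x00[\x07Ld\x02\xff"
    b"\x01\x9c\x07c\xf8BZh91AY&SY"
)

# Well-known file signatures; none of these are specific to this obfuscator.
MAGICS = {
    "gzip": b"\x1f\x8b\x08",
    "bzip2": b"BZh",
    "xz": b"\xfd7zXZ\x00",
}


def find_magics(data):
    """Return sorted (offset, name) pairs for each signature found in data."""
    hits = []
    for name, magic in MAGICS.items():
        offset = data.find(magic)
        if offset != -1:
            hits.append((offset, name))
    return sorted(hits)


print(find_magics(PAYLOAD_PREFIX))
```

On this prefix the scan reports a gzip signature right after the `ORION` header bytes and a bzip2 signature a little further in, which suggests compression layers wrapped inside each other.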
It’s quite evident that the decrypt_and_exec function decrypts and runs the code. However, since the runtime is compiled into a .pyd, we cannot hook the built-ins like we did before, as they operate in a separate environment. Instead, we can opt for a more general solution: modifying CPython to dump all executed code. Although it may seem complex, the process is relatively straightforward.
Additionally, it’s worth noting that the .pyd file’s name reveals that it’s intended for use with Python 3.10, as indicated by the cp310 tag in the filename.
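Reading the version tag can also be automated. This is a minimal sketch, assuming the tag always has the form `cp` followed by a one-digit major version and the minor version digits; the helper name `cpython_version_from_filename` is hypothetical.

```python
import re


def cpython_version_from_filename(filename):
    """Return (major, minor) parsed from a 'cpXY' tag, or None if absent."""
    # Assumption: the tag looks like ".cp310-", i.e. one digit for the major
    # version followed by the minor version digits.
    match = re.search(r"\.cp(\d)(\d+)-", filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(cpython_version_from_filename("orion.cp310-win_amd64.pyd"))  # (3, 10)
```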
Deobfuscating
To get started with editing the desired version (3.10 in this case), you can follow these simple steps:
- Clone the specific version of Python that you want to modify.
- Navigate to the ceval.c file in the Python directory of the cloned repository and open it in your preferred text editor.
    cpython-3.10
    ├───...
    ├───Python
    │   │   ...
    │   │   ceval.c
    │   │   ...
    │
    └───...
- Look for the part where the code objects are executed. In versions 3.8 to 3.10, I can say with certainty that this happens in the _PyEval_EvalFrameDefault function.
- Find the lines where the frame gets pushed.
    /* push frame */
    tstate->frame = f;
    co = f->f_code;
  NOTE: It might not be this code exactly, so search carefully.
Once the code object is defined, we can easily access and dump it. However, saving each code object to an individual file can create complications, especially since countless code objects are executed in every script. Therefore, we save everything to a single file named dumped.txt, using a unique separator that’s unlikely to appear in the code objects themselves. In this case, the separator is SVENSKITHESOURCE.
    /* Append every executed code object to dumped.txt. */
    /* If PyMarshal_WriteObjectToFile is reported as undeclared, add #include "marshal.h" at the top of ceval.c. */
    FILE *file;
    char filename[32] = "dumped.txt";

    file = fopen(filename, "ab");
    if (file != NULL) {  /* skip the dump if the file cannot be opened */
        PyMarshal_WriteObjectToFile((PyObject *)co, file, 4);
        fprintf(file, "SVENSKITHESOURCE");  /* separator between objects */
        fclose(file);
    }
The PyMarshal_WriteObjectToFile call is what actually serializes the code object and writes it to the file. The argument 4 indicates the marshal version, where 4 means “Current version”.
- Compile it to the target of your choice.
Once you’ve run test-obf.py using your custom Python build, you’ll find a dumped.txt file in the directory you ran it from. At this point, you’ll need to identify the code objects that are relevant to your task. One straightforward approach is to search for strings that are printed to the console or that you know are present in the program. To do this, you can use a hex editor such as HxD.
Start by scrolling up until you come across the Python bootstrap code objects or any that don’t originate directly from the original program. Alternatively, you can search for the filename, as this information is also stored in the code objects. Once you’ve located a unique identifier, you can create a quick and simple script to further deobfuscate the code.
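The filename search can also be done programmatically instead of in a hex editor. The following is a minimal sketch, assuming the dump uses the SVENSKITHESOURCE separator described above and that the script runs on the same Python version that produced the dump (the marshal format is version-specific). The helpers `iter_code_objects` and `find_by_filename` are hypothetical names, and the demo builds a small synthetic dump rather than reading a real dumped.txt.

```python
import marshal
import types

SEPARATOR = b"SVENSKITHESOURCE"


def iter_code_objects(dump_bytes):
    """Yield (index, code) for every chunk that unmarshals to a code object."""
    for index, chunk in enumerate(dump_bytes.split(SEPARATOR)):
        if not chunk:
            continue  # empty chunk after the final separator
        try:
            obj = marshal.loads(chunk)
        except (EOFError, ValueError, TypeError):
            continue  # truncated or non-marshal data, e.g. a separator collision
        if isinstance(obj, types.CodeType):
            yield index, obj


def find_by_filename(dump_bytes, needle):
    """Return (index, co_filename, co_name) for code objects whose filename contains needle."""
    return [
        (index, code.co_filename, code.co_name)
        for index, code in iter_code_objects(dump_bytes)
        if needle in code.co_filename
    ]


if __name__ == "__main__":
    # Synthetic stand-in for dumped.txt: one "bootstrap" object, one from the target script.
    bootstrap = compile("x = 1", "<frozen importlib._bootstrap>", "exec")
    target = compile("print('hi')", "test-obf.py", "exec")
    dump = b"".join(marshal.dumps(c) + SEPARATOR for c in (bootstrap, target))
    for index, filename, name in find_by_filename(dump, "test-obf"):
        print(index, filename, name)
```

For a real dump, replace the synthetic bytes with the contents of dumped.txt opened in binary mode.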
In the case of our example “insert the key” is printed to the console.
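import dis
import marshal

# Read the whole dump; run this with the same Python version that produced it.
with open("dumped.txt", "rb") as f:
    data = f.read()

# The file is read as bytes, so the separator must be bytes as well.
for chunk in data.split(b"SVENSKITHESOURCE"):
    if b"insert the key" in chunk:
        code = marshal.loads(chunk)
        dis.dis(code)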
This way we can view the disassembly of all the related code objects. The output in our example is:
4 0 LOAD_CONST 0 (0)
2 LOAD_CONST 1 (None)
4 IMPORT_NAME 0 (time)
6 STORE_NAME 0 (time)
6 8 LOAD_NAME 1 (input)
10 LOAD_CONST 2 ('insert the key:')
12 CALL_FUNCTION 1
14 STORE_NAME 2 (key)
8 16 LOAD_NAME 2 (key)
18 LOAD_CONST 3 ('orion_non_sliddato_invece_uuid_skidda')
20 COMPARE_OP 2 (==)
22 POP_JUMP_IF_FALSE 23 (to 46)
9 24 LOAD_NAME 3 (print)
26 LOAD_CONST 4 ('bravo mi hai deobfuscato')
28 CALL_FUNCTION 1
30 POP_TOP
10 32 LOAD_NAME 0 (time)
34 LOAD_METHOD 4 (sleep)
36 LOAD_CONST 5 (2)
38 CALL_METHOD 1
40 POP_TOP
42 LOAD_CONST 1 (None)
44 RETURN_VALUE
12 >> 46 LOAD_NAME 3 (print)
48 LOAD_CONST 6 ('coglione non sai deobfuscare')
50 CALL_FUNCTION 1
52 POP_TOP
13 54 LOAD_NAME 3 (print)
56 LOAD_CONST 7 ('essendo che hai sbagliato adesso eliminerò system 32')
58 CALL_FUNCTION 1
60 POP_TOP
14 62 LOAD_NAME 3 (print)
64 LOAD_CONST 8 ('eliminando system 32...')
66 CALL_FUNCTION 1
68 POP_TOP
15 70 LOAD_NAME 0 (time)
72 LOAD_METHOD 4 (sleep)
74 LOAD_CONST 9 (15)
76 CALL_METHOD 1
78 POP_TOP
16 80 LOAD_NAME 3 (print)
82 LOAD_CONST 10 ('ci sei cascato UwU')
84 CALL_FUNCTION 1
86 POP_TOP
88 LOAD_CONST 1 (None)
90 RETURN_VALUE
Which decompiles back to:
import time

key = input('insert the key:')

if key == 'orion_non_sliddato_invece_uuid_skidda':
    print('bravo mi hai deobfuscato')
    time.sleep(2)
else:
    print('coglione non sai deobfuscare')
    print('essendo che hai sbagliato adesso eliminerò system 32')
    print('eliminando system 32...')
    time.sleep(15)
    print('ci sei cascato UwU')
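Decompilers generally take a .pyc file rather than a raw marshalled code object, so a dumped chunk usually needs a header first. The sketch below is one way to do that, assuming the CPython 3.7+ timestamp-style header (magic number, flags, mtime, source size, 16 bytes in total) and that it runs on the same Python version as the dump so the magic number matches. The helper `code_to_pyc_bytes` is a hypothetical name, and the demo uses a freshly compiled code object rather than a real dump.

```python
import importlib.util
import marshal
import struct


def code_to_pyc_bytes(code):
    """Serialize a code object with a timestamp-style .pyc header."""
    header = importlib.util.MAGIC_NUMBER  # 4 bytes, specific to the running interpreter
    header += struct.pack("<I", 0)        # flags: 0 means timestamp-based pyc
    header += struct.pack("<I", 0)        # source mtime (unknown here, so zero)
    header += struct.pack("<I", 0)        # source size (unknown here, so zero)
    return header + marshal.dumps(code)


if __name__ == "__main__":
    demo = compile("answer = 6 * 7", "demo.py", "exec")
    pyc = code_to_pyc_bytes(demo)
    # Round-trip check: skip the 16-byte header and execute the loaded code.
    namespace = {}
    exec(marshal.loads(pyc[16:]), namespace)
    print(namespace["answer"])  # prints 42
```

Writing the returned bytes to a file with a .pyc extension gives a decompiler something it can open.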
We’ve successfully obtained the original source code without needing to examine the obfuscator’s runtime, demonstrating that using multiple layers of “encryption” is pointless unless it’s done intelligently.