Last week at Hacker School I did a quick presentation on Python bytecode and the dis module. The disassembler is a very powerful tool with a gentle learning curve; that is, you can get a fair amount out of it without really knowing much about what’s going on. This post is a quick introduction to how and why you should use it.
Bytecode is the internal representation of a Python program in the interpreter. Here, we’ll be looking at bytecode from CPython, the default interpreter. If you don’t know what interpreter you’re using, it’s probably CPython.
How do I get bytecode?
You already have it! Bytecode is what’s contained in those .pyc files you see when you import a module. It’s also created on the fly by running any Python code.
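As a small illustration of bytecode being created on the fly, the built-in compile function turns a source string into a code object whose co_code attribute holds the raw bytecode. The source string and the "<example>" filename below are made up for the demonstration.

```python
# Compile a snippet of source into a code object without running it.
source = "x = 1 + 2"
code = compile(source, "<example>", "exec")

# co_code holds the raw bytecode as a bytes object (Python 3).
print(type(code.co_code))
print(len(code.co_code), "bytes of bytecode")

# The code object can then be executed like any other compiled code.
namespace = {}
exec(code, namespace)
print(namespace["x"])  # 3
```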
Ok, so you have some bytecode, and you want to understand it. Let’s look at it without using the dis module first.
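A minimal sketch of looking at raw bytecode, assuming a small function foo (a hypothetical name) that stores 2 and 3 in a and b and returns their sum. It uses the Python 3 attribute __code__, and the exact bytes differ between CPython versions, so no output is shown.

```python
def foo():
    a = 2
    b = 3
    return a + b

# The function's code object carries the compiled bytecode.
raw = foo.__code__.co_code

# In Python 3 this is a bytes object: some bytes print as characters,
# others as \x escapes.
print(raw)

# The same data as a list of integers, one per byte.
print(list(raw))
```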
Hmm, that was … not very enlightening. We can see that we have a bunch of bytes (some printable, others not), but we have no idea what they mean.
Let’s run it through dis.
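A sketch of a small function run through dis, assuming a hypothetical foo that stores 2 and 3 in a and b and returns their sum. The opcode names and offsets printed depend on the CPython version, so treat the output as version-specific.

```python
import dis

def foo():
    a = 2
    b = 3
    return a + b

# Print a human-readable disassembly: source line numbers on the left,
# opcode names in the middle, arguments on the right.
dis.dis(foo)

# The same information is available programmatically.
opnames = [instr.opname for instr in dis.get_instructions(foo)]
print(opnames)
```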
Now this starts to make some sense.
dis takes each byte, finds the opcode that corresponds to it in opcode.py, and prints it as a nice, readable constant. If we look at opcode.py we see that LOAD_CONST is 100, STORE_FAST is 125, etc.
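The mapping between names and numbers can also be checked from Python. This sketch uses dis.opmap and dis.opname from the standard library. The numeric values are specific to the CPython version and may differ on newer releases, so the code prints them rather than assuming them.

```python
import dis

# opmap maps opcode names to their numeric values.
for name in ("LOAD_CONST", "STORE_FAST"):
    print(name, "=", dis.opmap[name])

# opname is the reverse lookup: a list indexed by opcode number.
number = dis.opmap["LOAD_CONST"]
print(number, "->", dis.opname[number])
```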
dis also shows the line numbers on the left and the values or names on the right. So without ever having seen anything like this before, we have an idea what’s going on: we first load a constant, 2, then somehow store it as a. Then we repeat this with 3 and b. We load a and b back up, do BINARY_ADD, which presumably adds the numbers, and then do RETURN_VALUE.
Examining the bytecode can sometimes increase your understanding of Python code. Here is one example.
elif is identical in bytecode to else ... if. Take a look:
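A sketch of two such functions, with made-up names and bodies: one written with elif, the other with an if nested inside else.

```python
def with_elif(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"

def with_nested_if(x):
    if x < 0:
        return "negative"
    else:
        if x == 0:
            return "zero"
        else:
            return "positive"

# Both functions behave the same on every input.
for value in (-1, 0, 1):
    print(value, with_elif(value), with_nested_if(value))
```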
We’ve read PEP 20 (the Zen of Python), so we know that flat is better than nested for style and readability. But is there a performance difference? Not at all: in fact, these two functions have identical bytecode.
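One way to check this kind of claim is to compare the raw bytecode of two functions directly. The pair below is a hypothetical stand-in with made-up names, one using elif and one using a nested if. The comparison relies on co_code not containing line-number information, which CPython stores separately.

```python
import dis

def with_elif(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    else:
        return "positive"

def with_nested_if(x):
    if x < 0:
        return "negative"
    else:
        if x == 0:
            return "zero"
        else:
            return "positive"

# Compare the raw bytecode of the two code objects byte for byte.
same = with_elif.__code__.co_code == with_nested_if.__code__.co_code
print("identical bytecode:", same)

# Print both disassemblies to compare them by eye; only the source
# line numbers in the left-hand column should differ.
dis.dis(with_elif)
dis.dis(with_nested_if)
```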
That makes sense: else just means “start executing here if the if was false”, so there’s no more computation to do. elif is just syntactic sugar.
This just scratches the surface of what’s interesting about Python bytecode.