I’ve managed to encounter three different bugs with the same obscure source in the last week. I think Hacker School might be cursed. Here’s a blog post attempting to rid us of the curse.
Bug 1: Flask app on Heroku can’t find images and stylesheets
The first bug was a Flask app deployed to Heroku. It worked fine locally, but when deployed, none of the images or stylesheets rendered.
Bug 2: A project fails to build
A Hacker Schooler cloned into a project and tried to build it with python setup.py install. The build failed with the error Supposed package directory '[project]' exists but is not a directory.
Bug 3: Heroku-deployed app crashes
I deployed a new feature to the Hacker School site (which is a Rails app), and crashed the application. Again, everything worked fine locally on my machine and my colleague’s machines.
The solution and explanations are below the fold. If you’d like to try to guess, you can ask me debugging questions on twitter (@akaptur), and I’ll respond within 24 hours until Friday, October 18th, 2013. If you don’t like guessing, or your own bugs are plenty for you, you can click through now.
Last week at Hacker School I did a quick presentation on python bytecode and the dis module. The disassembler is a very powerful tool with a gentle learning curve – that is, you can get a fair amount out of it without really knowing much about what’s going on. This post is a quick introduction to how and why you should use it.
What’s bytecode?
Bytecode is the internal representation of a python program in the compiler. Here, we’ll be looking at bytecode from cpython, the default compiler. If you don’t know what compiler you’re using, it’s probably cpython.
How do I get bytecode?
You already have it! Bytecode is what’s contained in those .pyc files you see when you import a module. It’s also created on the fly by running any python code.
Disassembling
Ok, so you have some bytecode, and you want to understand it. Let’s look at it without using the dis module first.
Now this starts to make some sense. dis takes each byte, finds the opcode that corresponds to it in opcodes.py, and prints it as a nice, readable constant. If we look at opcodes.py we see that LOAD_CONST is 100, STORE_FAST is 125, etc. dis also shows the line numbers on the left and the values or names on the right. So without ever seeing something like before, we have an idea what’s going on: we first load a constant, 2, then somehow store it as a. Then we repeat this with 3 and b. We load a and b back up, do BINARY_ADD, which presumably adds the numbers, and then do RETURN_VALUE.
Examining the bytecode can sometimes increase your understanding of python code. Here is one example.
elif
elif is identical in bytecode to else ... if. Take a look:
We’ve read PEP 8 so we know that flat is better than nested for style and readability. But is there a performance difference? Not at all – in fact, these two functions have identical bytecode.
That makes sense – else just means “start executing here if the if was false” – there’s no more computation to do. elif is just syntactic sugar.
Further reading:
This just scratches the surface of what’s interesting about python bytecode.
If you enjoyed this, you might enjoy diving into Yaniv Aknin’s series on python internals. If you’re excited about bytecode, you should contribute to Ned Batchelder’s byterun.
We’ve asked many of our residents at Hacker School what qualities all great programmers share. There’s very little agreement – clearly, there are a multitude of ways to be a great programmer, and you can think of a counter-example for almost every quality you can name. One of the rare non-controversial statements came first from Jessica McKellar, who identified systematic debugging as a key skill.
What does systematic debugging look like? At this point, I’m focused on two aspects: asking a clear question, and keeping track of my mental “stack”. I had a particularly fun and interesting bug yesterday that I think illustrates this nicely.
The problem at hand was a brainteaser from Jessica:
Using the official SOWPODS Scrabble dictionary, what letters, if any, never appear doubled? (By that I mean — “AA” does appear doubled because it is in “AARDVARK”, “BB” does appear doubled because it is in “BUBBLE” — are there any letters that never appear doubled in a word?)
I came up with a solution, but it was kind of slow, so I was trying to find a faster one. Also, my original solution used a regular expression to match a doubled letter in a word, but it would only find the first pair (e.g. in BOOKKEEPER only the ‘OO’ would be caught).
I decided to try taking the dictionary as a single string, rather than a list of words, and consuming it one letter-pair at a time, as follows.
If you’re thinking, “Hmm, that looks like it might be slow,” congratulations! It is indeed quite slow. I had two guesses for why it was slow. (It wasn’t the print statement.) First, maybe I was copying the string over and over, and that was slowing down the function. Second, maybe I misunderstodd the regular expression – for example, maybe it was taking a longer time to operate on a longer string. I decided to investigate the second possibility first, since regular expressions were more of a mystery to me.
At this point in the problem solving, my mental stack looks something like this:
---------------------------
| regex taking longer on a | (newest)
| longer-length string? |
---------------------------
| reg exp? or string copy? |
---------------------------
| function is so slow, why? |
---------------------------
| find a faster solution |
---------------------------
| solve wordplay problem | (oldest)
---------------------------
The next step was to test my theory that my regular expression should not take longer on a longer string. Importantly, the regex had no “greedy” elements – it should match and return the first time it encounters two identical characters, regardless of how long the string is after those matching characters. Here was the experiment I wrote:
In both cases, I thought, the regex should check “he”, fail, check “ee”, succeed, and return. To my great surprise, the long string test took 70-80x as long to run as the short string test!
test_with_timeit output
123
short: 0.0243809223175
long: 1.77788496017
time ratios: 72.9211527366 x
Clearly, I was missing something about regular expressions. I googled around and learned some details about backreferences, the (\1) in my code (which makes my regular expression not actually “regular”), but that wasn’t very illuminating. (You can picture “something about Finite State Automata” being pushed onto and quickly popped back off the stack.)
My next step was to run timing experiments with a variety of regular expressions to see if I could find an element that made the difference. At this point yesterday, I was still trying to pass strings to timeit, which was getting unwieldy. (Stay tuned to learn two better ways to do timing.) I switched to a simpler timing script, just taking time.time() at the start and end of the algorithm.
Uh-oh, wait a minute, I thought this was identical code. Where did the 70x time difference go?
I couldn’t see any obvious errors, so I went back to the timeit code. Could I reproduce these results while still using the timeit module?
Let’s look back at the problem stack.
---------------------------
| what's up with timeit? |
---------------------------
| is this code different? |
---------------------------
| timing programs give |
| different results, uh-oh |
---------------------------
| regex taking longer on a |
| longer-length string |
---------------------------
| reg exp? or string copy? |
---------------------------
| function is so slow, why? |
---------------------------
| find a faster solution |
---------------------------
| solve wordplay problem | (oldest)
---------------------------
At this point Brandon Rhodes joined me in pair-debugging, and we made a couple of discoveries. The most interesting of these was that two different ways of calling timeit generated wildly different results.
So now we know something is up with passing strings to timeit.
---------------------------
| timeit string version - ? |
---------------------------
| what's up with timeit? |
---------------------------
| is this code different? |
---------------------------
....
Ok, we’ve narrowed it down this far – it’s time to take a look at the source. timeit is a module written in python (yay!), and the relevant parts look pretty straightforward:
template="""def inner(_it, _timer):%(setup)s _t0 = _timer() for _i in _it:%(stmt)s _t1 = _timer() return _t1 - _t0"""classTimer:"""Class for timing execution speed of small code snippets. The constructor takes a statement to be timed, an additional statement used for setup, and a timer function. Both statements default to 'pass'; the timer function is platform-dependent (see module doc string). To measure the execution time of the first statement, use the timeit() method. The repeat() method is a convenience to call timeit() multiple times and return a list of results. The statements may contain newlines, as long as they don't contain multi-line string literals. """def__init__(self,stmt="pass",setup="pass",timer=default_timer):"""Constructor. See class doc string."""self.timer=timerns={}ifisinstance(stmt,basestring):stmt=reindent(stmt,8)ifisinstance(setup,basestring):setup=reindent(setup,4)src=template%{'stmt':stmt,'setup':setup}elifhasattr(setup,'__call__'):src=template%{'stmt':stmt,'setup':'_setup()'}ns['_setup']=setupelse:raiseValueError("setup is neither a string nor callable")self.src=src# Save for traceback displaycode=compile(src,dummy_src_name,"exec")execcodeinglobals(),nsself.inner=ns["inner"]elifhasattr(stmt,'__call__'):self.src=Noneifisinstance(setup,basestring):_setup=setupdefsetup():exec_setupinglobals(),nselifnothasattr(setup,'__call__'):raiseValueError("setup is neither a string nor callable")self.inner=_template_func(setup,stmt)else:raiseValueError("stmt is neither a string nor callable")deftimeit(self,number=default_number):"""Time 'number' executions of the main statement. To be precise, this executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, as a float measured in seconds. The argument is the number of times through the loop, defaulting to one million. The main statement, the setup statement and the timer function to be used are passed to the constructor. """ifitertools:it=itertools.repeat(None,number)else:it=[None]*numbergcold=gc.isenabled()gc.disable()timing=self.inner(it,self.timer)ifgcold:gc.enable()returntimingdeftimeit(stmt="pass",setup="pass",timer=default_timer,number=default_number):"""Convenience function to create Timer object and call timeit method."""returnTimer(stmt,setup,timer).timeit(number)
Let’s trace the part that’s giving us the long execution string. We call the module-level function timeit.timeit, which creates a Timer object and returns the result of calling its timeit method. No surprise that we got identical results between our method #1 and method #3 above.
The Timer object checks to see if you’ve passed it strings or callables as its stmt and setup parameters. In our case both are strings. The init method splices the strings into the source code here:
timeit.py
1
src=template%{'stmt':stmt,'setup':setup}
We can isolate this in the REPL if we want to be extra-sure of how it works:
timeit.py
1234567891011121314151617181920
>>>template="""... def inner(_it, _timer):... %(setup)s... _t0 = _timer()... for _i in _it:... %(stmt)s... _t1 = _timer()... return _t1 - _t0... """>>>stmt="pass">>>setup="print 'here we go'">>>printtemplate%{'stmt':stmt,'setup':setup}definner(_it,_timer):print'here we go'_t0=_timer()for_iin_it:pass_t1=_timer()return_t1-_t0
Got it – simple, legal python code is generated. So where is our regular expression getting screwed up? Let’s take a look at the code generated in our actual test by inserting a print statement into the timeit module. (First we’ll copy the module over to our current working directory, to avoid modifying our actual standard library.)
timeit.py
12
src=template%{'stmt':stmt,'setup':setup}printsrc
And now calling test_with_timeit.py again, it prints:
This looks mostly reasonable … but hey, wait, what happened to the regular expression?
1
double=re.compile(r"([a-z])()");
That’s not right! Changing print src to print repr(src) gives us a better idea what’s up – the relevant line is now:
1
double=re.compile(r"([a-z])(\x01)");
And looking back at the SETUP string … uh-oh, this isn’t a raw string, and it contains an escape character ((\1)). We can tell we’re in trouble immediately in the REPL:
1234
>>>print"\1">>>repr("\1")"'\\x01'"
Why does this happen? We can consult the docs on string literals to find out. Unless you’re working with a raw string (one prefixed with r or R), escape characters are interpreted the same way they are in C. Our text, a backslash followed by a number, matches the pattern for an octal digit: a backslash followed by 1-3 integers. \ooo indicates a character with the octal value ooo. And that’s exactly what we’ve done:
1234
>>>ord("\1")1>>>chr(1)'\x01'
\x01 is a legal (though unprintable) character, and it doesn’t appear in our long string to test the regular expression. So of course the longer one took a longer time – it had to traverse the entire string looking for \x01, which it never found.
The fix is quite easy – one character, in fact. Make the SETUP string raw:
… and the problem is solved. So we can pop a bunch of frame off our problem stack and go investigate the string copying, which is in fact the issue.
---------------------------
| reg exp? or string copy? |
---------------------------
| function is so slow, why? |
---------------------------
| find a faster solution |
---------------------------
| solve wordplay problem | (oldest)
---------------------------
Lessons learned
Interesting bugs are a ton of fun! Now for a bit of introspection: what could have gone better?
Most obviously, I could have spent less time on the regular-expression branch of why the function was slow, especially as the stack grew. Whether I should have done this depends on your perspective: at Hacker School, I’m almost 100% learning motivated and about 0% get-it-done motivated, so my time to solve the bug is virtually unbounded.
I was slightly overconfident about understanding how the timeit code splicing happened. The code looked so simple that I didn’t bother to check my understanding at first.
Passing timeit.timeit strings is an unwieldy way of doing timing. I should have been using either the callable option, or just timing with time python test_file.py from the command line. Both of these would have been easier to work with and would have sidestepped the bug.
My mental model of regular expressions was largely correct. That’s satisfying.
Strings! Ouch. This may not seem like much of a lesson, but it’s useful to add to my mental list of Places That Bugs Lurk.
>>>classMadness(dict):...def__setitem__(self,attr,value):...ifisinstance(value,type):...value.__name__=attr...dict.__setitem__(self,attr,value)...>>>code="""\... class Example(object): pass... NewName = Example... print(Example.__name__)... """>>>execcodeinMadness()NewName
This is really fun stuff, and nicely illustrates a core feature of python – that namespaces are basically just dictionaries.
So what’s going on here? In this case, I think it’s easiest to work backwards from the end.
Last we have exec code in Madness(). The exec keyword in python 21 executes a string as code.
123
>>>a="hello">>>exec"print a"hello
exec optionally takes a context in which to execute the code. A dictionary is a perfectly legal context to use:
12
>>>exec"print b"in{'b':"hi there"}hithere
All the code is shown here – b is not defined elsewhere in the REPL session.
If a dictionary is provided, it’s presumed to contain both the global and the local variables. This dictionary is the only context in which the code will be executed (it won’t check the scope from which the exec statement was made).
If the value of the key:value pair that we’re setting is an instance of the type type – that is, if the object in question is a class – then grab that class’s name and set it to the key. (This step will blow up if the key in question isn’t a legal name in the first place.) Then use the parent class (dict)’s __setitem__ to actually write to the Madness object.
We can execute the code string one line at a time to better see what’s happening.
(I’m using m.keys() to examine the state of the namespace instead of just printing m because the reference to __builtins__ pukes a bunch of tiresome definitions and copyright statements into the REPL. exec adds a reference to __builtins__ on the execution of any statement.)
We’re now quite close to the original Ruby behavior – with a deeper understanding of python namespaces!
All of this works in python 3, too, with slightly different syntax because exec is a function, not a keyword. Nick’s gist includes the python 3 version too.↩
My colleague Dave told me about an interesting bit of Ruby the other day. If you make a new Class object, it initially has no name. If you then assign a name to it, with your new class on the right-hand side of the assignment, the name is attached to the Class object.
Luckily for our intuition, this is a pretty special case, and not a way that Ruby generally behaves. But it got me wondering if there’s something similar in python.
What about python?
The simplest answer is “no”. In python, assignment is a simple statement, not an operator, so you can’t do things like operator overloading. This also means we can’t somehow add a hook to assignment.
Ok, fine. But wait – we’re programmers! We don’t give up that easily.
What about hooking on __getattribute__? (Defining your own __getattribute__ is “the bad way” to do it. You almost always want to define your own __getattr__, unless you’re doing something silly and/or malicious, like we are. Note that we have to use the parent class (object)’s __getattribute__ to extract and save the attribute we want. If we didn’t do that, we’d trigger an infinitely recursive lookup.)
12345678
classFoo(object):def__init__(self):self.some_attr="sup foo"def__getattribute__(self,attr):attribute=object.__getattribute__(self,attr)object.__setattr__(self,attr,"you're too late!")returnattribute
This looks like it works!
12345678
>>>f=Foo()>>>f<obj.Fooobjectat0x10253dd10>>>>word=f.some_attr>>>printwordsupfoo>>>printf.some_attryou're too late!
Unfortunately, this behavior has nothing to do with assignment. The object was mutated when __getattribute__ was called, which happened to be in an assignment statement in this code above. We’d get exactly the same behavior without the assignment:
1234567
>>>g=Foo()>>>g<obj.Fooobjectat0x10253dd10>>>>g.some_attrsupfoo>>>g.some_attryou're too late!
Darn.
But we’re programmers, right? We don’t give up that easily.
Trace functions to the rescue
There’s a settrace function in python’s sys module that we can use to examine stack frames, events, and lots of other code data while our program is running. sys.settrace(fn) takes a trace function as an argument, and that trace function must take frame, event, args as arguments. It will then get called every time an event happens. A line of code, a function call, a function return, and an exception are all “events”. We can use this to inspect each line of code before it runs and check manually if it’s an assignment statement.
(The trace function returns a reference to itself to indicate that tracing should continue.)
Now let’s write our code to mess up assignment.
12345678910111213141516171819202122232425
classTracer(object):[...]defmess_up_on_assignment(self,frame,linetext):rhs=linetext[linetext.index("=")+1:]names=rhs.split()local_objs=[namefornameinnamesifnameinframe.f_locals.keys()+frame.f_globals.keys()]fornameinlocal_objs:ifnamenotinJANKY_NAMESPACE_MANAGER:box=Box(frame.f_locals[name],name)frame.f_locals[name]=boxelse:JANKY_NAMESPACE_MANAGER[name].assignments+=1JANKY_NAMESPACE_MANAGER={}classBox(object):def__init__(self,obj,name):self.obj=objself.name=nameself.assignments=1JANKY_NAMESPACE_MANAGER[name]=selfdef__repr__(self):raiseNameError("this message has self-destructed.")
Here our “box” object is pretty silly. It takes the original object and wraps it in a box. (You could implement __getattr__ and __setattr__ so that the box behaves basically like the original object, which I haven’t done.) The box then explodes when you try to print it. We’re also tracking which names we’ve seen with the aptly-named JANKY_NAMESPACE_MANAGER (which has all kinds of scope issues, but whatever).
Meanwhile, in the mess_up_on_assignment code, we check to see if any of the whitespace-separated words on the right-hand side of the equation are names we recognize from frame.f_locals or frame.f_globals. If so, and if we haven’t seen them before, throw them in a self-destructing box!
importsabotagesabotage.assignment(__file__)defgreet():name=raw_input("Enter your name: ")print"hello,",nameprint"goodbye, ",nameif__name__=='__main__':greet()
And this one blows up:
12345678910111213141516
importsabotagesabotage.assignment(__file__)print"done sabotaging"raw_input("Hit enter to continue")defgreet():name=raw_input("Enter your name: ")print"hello,",namename_copy=nameprint"goodbye, ",nameif__name__=='__main__':greet()
And there you have it – a (incredibly goofy) side-effecting assignment that mutates the object on the right-hand side of the assignment statement.
You can alias this with git config --global alias.plog "log --graph --oneline --all". (Looking at this graph may motivate you to start using git merge --no-ff.) You can substitute whatever you want for plog above.
If you get really screwed up, it’s reflog time! (This is pronounced ref-log, not re-flog as I’d been reading it. On the other hand, as my colleague Alan notes, “You damn well ought to feel penitent if you’re wedged enough to need the reflog.”) Here, we’re using git reset to undo the last commit. Even though the commit doesn’t appear in git log anymore, there’s still a references to it in the reflog.
It turns out that there is a better way to get a commit sha than running git log and copying and pasting the relevant shas. You can use rev-parse instead.
I encountered some surprising behavior in python’s list.sort() method. I was calling list.sort() with a function, then list.reverse(). This was silly – I forgot I could call list.sort(reverse=True). As it turns out, these are not the always the same, or even usually the same.
What’s happening here is that .sort() and sorted() in python are “stable sorts”, meaning that it’ll preserve the existing order when faced with a tie. This leads to a nice side effect, as described in the python wiki: you can sort by multiple criteria. So if we wanted to sort primarily by number (ascending), and secondarily by letter, we could do that quite easily. Just sort by the secondary key first, then by the primary key. Since the sort is stable, you know that a tie in the second (primary) sort will preserve the order you had going in.
I’m currently digging into the source of virtualenv, a python utility for managing environments. It’s incredibly handy, and I’d been using it for a while without really understanding how it worked. I’ll have more later on the details, but in the meantime, I hit a particularly puzzling bit of code:
123456789101112131415
if [ -z "$VIRTUAL_ENV_DISABLE_PROMPT" ] ; then
_OLD_VIRTUAL_PS1="$PS1"
if [ "x" != x ] ; then
PS1="$PS1"
else
if [ "`basename \"$VIRTUAL_ENV\"`" = "__" ] ; then
# special case for Aspen magic directories
# see http://www.zetadev.com/software/aspen/
PS1="[`basename \`dirname \"$VIRTUAL_ENV\"\``] $PS1"
else
PS1="(`basename \"$VIRTUAL_ENV\"`)$PS1"
fi
fi
export PS1
fi
This is largely intelligible – clearly we’re modifying the $PS1 variable, which sets the bash prompt. But what on earth is this line?
12
if [ "x" != x ] ; then
PS1="$PS1"
x is not a variable defined elsewhere in the script, and bash will interpret both "x" and x as string literals anyway. Is there some bizarre shell in which this test can possibly return true?
As it turns out, there isn’t. The reason this code looks so bizarre is that it’s automatically generated by virtualenv. The original line of code is this:
12
if [ "x__VIRTUAL_PROMPT__" != x ] ; then
PS1="__VIRTUAL_PROMPT__$PS1"
I recently started using git’s patch mode via git add -p. Patch mode makes my commits more granular, which means my commit messages are more accurate. Better still, patch mode allows you to abstract away from adding “files” – you’re adding changes to be committed. This is a closer mental model to what git is actually doing. I’m finding it particularly useful for code review: make several changes in a friend’s program, group the changes by concept, and commit one concept at a time.
That’s the why – here’s the how.
git add -p launches what git calls the “interactive hunk selector.” This leads me to my other favorite reason to use -p: besides being useful, all the docs sound like an episode of The Bachelorette. “Once you have decided the fate of all the hunks, if there is any hunk that is chosen, the index is updated with the selected hunks.” And if not, our bachelorette takes the million dollars!
In practice, I only end up using three of the interactive hunk selector options: [y]es, [n]o, and [s]plit. Split takes the current hunk and divides it into smaller hunks, then presents them for selection one at a time. (If you’re attempting to continue the Bachelorette metaphor, now is a good time to stop.)
Let’s look at an example. Suppose I’m working on a function that recursively calculates the sum of element of a list. My first draft looks like this:
123456
def sum(list_in):
""" Returns the sum of elements of a list."""
if list_in == []:
return 0
else:
return list_in.pop() + sum(list_in)
Not bad, but it’s a little verbose. Let’s take advantage of the fact that an empty list is falsey. And, oops, we’re overwriting the built-in sum function in python – we probably don’t want to do that.
123456
def summer(list_in):
""" Returns the sum of elements of a list."""
if not list_in:
return 0
else:
return list_in.pop() + summer(list_in)
Ok, these are two different thoughts, but we didn’t commit in between. Interactive mode to the rescue – let’s stage those hunks.
12345678910111213141516
sum [master]\ ⚲git add -p
diff --git a/sum.py b/sum.py
index fdccd3f..6d000f7 100644
--- a/sum.py
+++ b/sum.py
@@ -1,6 +1,6 @@
-def sum(list_in):
+def summer(list_in):
""" Returns the sum of elements of a list."""
- if list_in == []:
+ if not list_in:
return 0
else:
- return list_in.pop() + sum(list_in)
+ return list_in.pop() + summer(list_in)
Stage this hunk [y,n,q,a,d,/,s,e,?]?
Our changes aren’t in the right hunks, so we’ll pick s to split them.
1234567
Stage this hunk [y,n,q,a,d,/,s,e,?]? s
Split into 3 hunks.
@@ -1,2 +1,2 @@
-def sum(list_in):
+def summer(list_in):
""" Returns the sum of elements of a list."""
Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]?
Let’s deal with the naming issue first. y. Next up:
123456
@@ -2,4 +2,4 @@
""" Returns the sum of elements of a list."""
- if list_in == []:
+ if not list_in:
return 0
else:
sum [master *+]\ ⚲git commit -m "fix naming to not conflict with builtin"
[master caaa300] fix naming to not conflict with builtin
1 file changed, 2 insertions(+), 2 deletions(-)
Now we can repeat this process for our other change.
1234567891011121314
sum [master *]\ ⚲git add -p
diff --git a/sum.py b/sum.py
index 77676b1..6d000f7 100644
--- a/sum.py
+++ b/sum.py
@@ -1,6 +1,6 @@
def summer(list_in):
""" Returns the sum of elements of a list."""
- if list_in == []:
+ if not list_in:
return 0
else:
return list_in.pop() + summer(list_in)
Stage this hunk [y,n,q,a,d,/,e,?]?
This is our only hunk now, so y.
123
sum [master +]\ ⚲git commit -m "use falseyness of empty list"
[master 5209728] use falseyness of empty list
1 file changed, 1 insertion(+), 1 deletion(-)
Of course, you may not want your commits to be quite this granular – but hopefully the example has demonstrated the strength of git add -p in creating single-concept commits.