Archive

Posts Tagged ‘graph’

Python Gotcha #1: Default arguments and mutable data structures

August 23, 2010 2 comments

I ran into an issue in Python the other day that I thought others might find instructive. The problem involves default arguments and mutable data structures.

Named arguments

In Python, you can enumerate the arguments to a method explicitly, rather than simply by putting the arguments in the order expected by the method. For instance

>>> def printit(a,b,c):
... print a,b,c
...
# Calling the method with the arguments in the expected order; 1 is assigned to a, 2 to b, 3 to c
>>> printit(1,2,3)
1 2 3
# Explicitly assigning the values to the different argument variables. Now you can call the method however you want.
>>> printint(c='c',b='b',a='a')
a b c

Default Arguments

Another feature of Python not present in Java is that of default arguments

>>> def defaulted(a,b='b',c='c'):
... print a,b,c
...
>>> defaulted(1,2,3)
1 2 3
>>> defaulted(1,2)
1 2 c
>>> defaulted(1)
1 b c
# This will fail because I do not explicitly define what a is
>>> defaulted(b=6)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: defaulted() takes at least 1 non-keyword argument (0 given)

These default arguments are very succinct and a nice feature. In Java, the best way to emulate this feature is through method overloading. From the linked article:

public void printFavorites(Color favoriteColor, String favoritePhrase, int favoriteNumber) {
     System.out.println("Favorite color: " + favoriteColor);
     System.out.println("Favorite phrase: " + favoritePhrase);
     System.out.println("Favorite number: " + favoriteNumber);
}

public void printFavorites(Color favoriteColor, String favoritePhrase) {
    printFavorites(favoriteColor, favoritePhrase, 0);
}

public void printFavorites(Color favoriteColor) {
    printFavorites(favoriteColor, "The best phrase evar", 0);
}

This increases the amount of boilerplate code, and is not quite as flexible as the python version – in Python you could define a default argument for each variable and then the user could choose which if any to override.

>>> def printFavorites(faveColor="Blue", favePhrase="This is awesome",faveNumber=5):
... print "Favorite color: ", faveColor
... print "Favorite phrase: ", favePhrase
... print "Favorite number: ", faveNumber
>>> printFavorites()
Favorite color: Blue
Favorite phrase: This is awesome
Favorite number: 5
>>> printFavorites(favePhrase="Just the phrase")
Favorite color: Blue
Favorite phrase: Just the phrase
Favorite number: 5
>>> printFavorites(faveColor="orange")
Favorite color: orange
Favorite phrase: This is awesome
Favorite number: 5

Regardless, it’s usually not too much of a hurdle in Java. But given that it’s such a nice feature in Python, I wanted to make use of it.

Real world example: Graph traversal

Let’s examine how to represent directed (potentially cyclic) graphs in Python, and how to do a depth first search of all paths between two
First let’s create a graph, and then show the code to find all the paths between two nodes.

Graph representation

In Python we can represent a graph as a nested dictionary:

g = {'A': ['B', 'C'],
'B': ['C', 'D'],
'C': ['A']}

Better visualized as

(Side note: this graph was created using the DOT language, something I plan to post more about in the future.)

The nested dictionary structure defined above maps nodes to their children (nodes in this case are simply represented as single letters, but they could be arbitrarily complex).

Search

I mentioned that the graphs might be cyclic; if we do not detect a cycle, the code would similarly loop until running out of memory. Thus we need to keep track of the path we’ve already explored to avoid cycling.

Here is code to find all the paths between two nodes (slightly modified from http://www.python.org/doc/essays/graphs.html; I will highlight exactly what I changed later in the article).

def find_all_paths(graph, start, end, path=[]):
  path.append([start])
  if start == end:
    return [path]
  if not graph.has_key(start):
    return []
  paths = []
  for node in graph[start]:
    if node not in path:
      # recursive call
      newpaths = find_all_paths(graph, node, end, path)
      for newpath in newpaths:
        paths.append(newpath)
  return paths

Let’s test:

>>> graph.find_all_paths(g, 'A','B')
[['A', 'B']]
>>> graph.find_all_paths(g, 'A','C')
[['A', 'B', 'C'], ['A', 'C']]

Both of these are correct; the path between A and B is just A and B but between A and C we have two potential paths – either the two hops from A to B to C, or directly from A to C. The code I presented contains a bug. What happens when I run the queries again?

>>> graph.find_all_paths(g, 'A','B')
[]
>>> graph.find_all_paths(g, 'A','C')
[]

Huh? What’s going on. It worked fine the first time, but subsequent method calls return the wrong result!

Solution

The problem, I discovered with a little bit of searching, is spelled out in the Python documents.

Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:

def f(a, L=[]):
  L.append(a)
  return L

print f(1)
print f(2)
print f(3)

This will print

[1]
[1, 2]
[1, 2, 3]

If you don’t want the default to be shared between subsequent calls, you can write the function like this instead:

def f(a, L=None):
  if L is None:
    L = []
  L.append(a)
  return L

I do not like the workaround presented in the docs, since it obfuscates what the argument is supposed to be. (compare visitedNodes=[] vs visitedNodes=None). There is another workaround for this, and it’s the approach taken originally on the site where the graph traversal code came from:

def find_all_paths(graph, start, end, path=[]):
path = path + [start]
# my buggy version had path.append(start)
...

Fortunately the Python docs are excellent and I was able to find the solution without too much trouble. Python programmers need to be aware that their default arguments are only evaluated once, especially when working with potentially mutable data structures like dictionaries and lists. Programmers must take care to realize that the following two calls are NOT equivalent, especially with respect to objects contained in default arguments.

exploredNodes.append("x")
exploredNodes = exploredNodes + ["x"]
Categories: Java, Python Tags: , , , , ,