Archive for August, 2010

Java: Creating correct equals and hashCode methods

August 24, 2010 11 comments

Equals and hashcode

Generating correct equals and hashCode methods is hard. There’s an entire chapter or two devoted to it in Joshua Bloch’s Effective Java, a definitive tome for Java developers. Getting these two methods correct is important if you’re going to be using your domain objects within Java collections, particularly hash maps.

Because writing equals and hashCode methods by hand is difficult and error-prone, there are a few different techniques for helping the developer with the process.

We will be using the following simple class to explore this issue:

public class Point {
    private int x;
    private int y;

    public Point(int x, int y) {
      this.x = x;
      this.y = y;
    }

    public int getX() {
      return x;
    }

    public int getY() {
      return y;
    }
}

Library based solution

There is at least one library that supports the generation of equals and hashCode methods, and that is the excellent Apache Commons Lang. There is a good explanation of how to use it here, but here’s a condensed version, from the javadocs with my annotation added in the comments:

// Nick: An example using reflection to determine all of the fields of the object;
// easiest to use but as it uses reflection it will be slower than manually
// including the fields
public boolean equals(Object obj) {
   return EqualsBuilder.reflectionEquals(this, obj);
}

// Nick: An example explicitly choosing which fields to include within the EqualsBuilder
// Note that it still requires some knowledge of creating correct equals methods, so it's not as idiot proof as the previous method
public boolean equals(Object obj) {
  if (obj instanceof MyClass == false) {
    return false;
  }
  if (this == obj) {
    return true;
  }
  MyClass rhs = (MyClass) obj;
  return new EqualsBuilder()
                .appendSuper(super.equals(obj))
                .append(field1, rhs.field1)
                .append(field2, rhs.field2)
                .append(field3, rhs.field3)
                .isEquals();
 }

The HashCodeBuilder works similarly:

public class Person {
   String name;
   int age;
   boolean smoker;
   ...



   public int hashCode() {
     // you pick a hard-coded, randomly chosen, non-zero, odd number
     // ideally different for each class
     return new HashCodeBuilder(17, 37).
       append(name).
       append(age).
       append(smoker).
       toHashCode();
   }

   // Nick: Alternatively, for the lazy:
   public int hashCode() {
      return HashCodeBuilder.reflectionHashCode(this);
   }

}

Using the library is a good approach, but it also introduces a dependency that may not be otherwise necessary. There are a lot of good classes in Apache Commons Lang, but if all you are using it for is the EqualsBuilder and HashCodeBuilder, you’re probably better off avoiding the dependency. In this case, you can make your IDE do the heavy lifting for you.

IDE based code generation

Given that IDEs like NetBeans and Eclipse do such a good job of automatically creating things like getters/setters, constructors, etc., it’s no surprise that they can be used to generate equals/hashCode methods as well. Unfortunately, they are not perfect, which prompted me to write this post in the first place.

I will be focusing on NetBeans’ implementation of the equals/hashCode code generation as of version 6.9 (the most recent version).

When you are in NetBeans and press Ctrl+I, the IDE provides a popup menu with options for methods that it can automatically generate for you.

Generate options in NetBeans 6.9

When you choose the equals() and hashCode() option, you are presented with the following screen (where the variables will differ depending on your class, obviously).

equals() and hashCode() generation dialog

After checking all of the checkboxes and pressing generate, the IDE inserts the following two snippets of code:

@Override
public boolean equals(Object obj) {
    if (obj == null) {
        return false;
    }
    if (getClass() != obj.getClass()) {
        return false;
    }
    final Point other = (Point) obj;
    if (this.x != other.x) {
        return false;
    }
    if (this.y != other.y) {
        return false;
    }
    return true;
}

@Override
public int hashCode() {
    int hash = 3;
    hash = 97 * hash + this.x;
    hash = 97 * hash + this.y;
    return hash;
}

Perfect. Great. The IDE has done all the work for you. It’s definitely more verbose than the Apache Commons solution, but at least there are no dependencies introduced into your code. If you change your class so as to introduce more variables you wish to consider for equality and hashCode, you should delete the generated methods and regenerate them.

While this is functional, there are two main problems I have with this dialog:
* Multiple checkboxes
* No linkage between equals/hashCode

I will address each in turn.

Multiple checkboxes

There is no means for enabling or disabling all of the fields at once. Any time there are (potentially) a lot of checkboxes, you should give the user the option to toggle them all at once. You can see that the NetBeans designers did just this in the Generate Getters and Setters dialog in NetBeans 6.9.

Generate getter / setters

Here you can see a checkbox next to the Point class name which toggles the checkboxes of all of its child nodes (all of the variables). This is pretty standard UI stuff; here is this pattern at work in GMail and Google Docs.

GMail's select/deselect options

Google Docs' select/deselect

This is not the end of the world, as the dialog does support keyboard navigation and toggling of the checkboxes via the space bar. It is a bizarre UI feature though, as there is absolutely no indication as to which of the two panes has focus, and thus which checkbox you’re about to toggle. Because I’m familiar with focus traversal, I guessed that Tab would shift the focus between the panes, but a novice has no way of knowing that and the dialog gives no hint of it. In the following screenshot, note that it’s impossible to tell whether I’m about to toggle the x or the y variable.

What will happen when I press space?

Lack of coupling between the equals/hashCode methods

Usually coupling is considered a bad thing in programming. However, when creating equals and hashCode methods, it’s vital that the same fields be used in both. For instance, if you use the variables x and y in the equals method, you should use exactly the variables x and y in the hashCode method.

Why?

This post from bytes.com does a good job of explaining this:

Overriding the hashCode method.

The contract for the equals method should really have another line saying you must proceed to override the hashCode method after overriding the equals method. The hashCode method is supported for the benefit of hash based collections.

The contract

Again from the specs:

  • Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

So equal objects must have equal hashCodes. An easy way to ensure that this condition is always satisfied is to use the same attributes used in determining equality in determining the hashCode. You should now see why it is important to override hashCode every time you override equals.

That sentence from the last paragraph sums it up: “An easy way to ensure that this condition is always satisfied is to use the same attributes used in determining equality in determining the hashCode”. Thus it’s clear that the dialog should provide a linkage between the equals/hashCode columns such that toggling the row of one column toggles the corresponding row. Otherwise you can create situations that are guaranteed to violate the contract of equals/hashCode, nullifying the entire point of having the IDE generate these methods for you.

For instance, see the following screen shot:
Violation of contract, allowed by the GUI

The dialog will let you continue, blithely generating the erroneous methods with no warning, and the mistake will only manifest itself later as subtle bugs. Either the dialog should force you to choose the variables in tandem, or at the very least it should warn you that choosing mismatched variables for the equals and hashCode methods can introduce bugs into the program.

Conclusion

I’ve investigated two ways of freeing the developer from the burden of implementing a correct version of equals and hashCode, through the use of Apache Commons Lang and NetBeans IDE. I’ve also detailed problems in the UI design of the dialogs presented for the generation of these two methods from NetBeans.

EDIT:
Thanks to Daniel for bringing Eclipse’s dialog to my attention.

Eclipse's dialog

As you can see, they do not separate out the equals/hashCode, which makes a lot more sense to me.

Python Gotcha #1: Default arguments and mutable data structures

August 23, 2010 4 comments

I ran into an issue in Python the other day that I thought others might find instructive. The problem involves default arguments and mutable data structures.

Named arguments

In Python, you can pass arguments to a function by name, rather than relying on the order in which the function declares them. For instance:

>>> def printit(a,b,c):
...     print a,b,c
...
# Calling the function with the arguments in the expected order; 1 is assigned to a, 2 to b, 3 to c
>>> printit(1,2,3)
1 2 3
# Explicitly assigning the values to the named arguments. Now you can pass them in any order.
>>> printit(c='c',b='b',a='a')
a b c

Default Arguments

Another feature of Python not present in Java is that of default arguments:

>>> def defaulted(a,b='b',c='c'):
...     print a,b,c
...
>>> defaulted(1,2,3)
1 2 3
>>> defaulted(1,2)
1 2 c
>>> defaulted(1)
1 b c
# This will fail because I do not explicitly define what a is
>>> defaulted(b=6)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: defaulted() takes at least 1 non-keyword argument (0 given)

These default arguments are very succinct and a nice feature. In Java, the best way to emulate this feature is through method overloading. From the linked article:

public void printFavorites(Color favoriteColor, String favoritePhrase, int favoriteNumber) {
     System.out.println("Favorite color: " + favoriteColor);
     System.out.println("Favorite phrase: " + favoritePhrase);
     System.out.println("Favorite number: " + favoriteNumber);
}

public void printFavorites(Color favoriteColor, String favoritePhrase) {
    printFavorites(favoriteColor, favoritePhrase, 0);
}

public void printFavorites(Color favoriteColor) {
    printFavorites(favoriteColor, "The best phrase evar", 0);
}

This increases the amount of boilerplate code, and is not quite as flexible as the Python version: in Python you can define a default for every argument and then let the caller choose which, if any, to override.

>>> def printFavorites(faveColor="Blue", favePhrase="This is awesome", faveNumber=5):
...     print "Favorite color:", faveColor
...     print "Favorite phrase:", favePhrase
...     print "Favorite number:", faveNumber
...
>>> printFavorites()
Favorite color: Blue
Favorite phrase: This is awesome
Favorite number: 5
>>> printFavorites(favePhrase="Just the phrase")
Favorite color: Blue
Favorite phrase: Just the phrase
Favorite number: 5
>>> printFavorites(faveColor="orange")
Favorite color: orange
Favorite phrase: This is awesome
Favorite number: 5

Regardless, it’s usually not too much of a hurdle in Java. But given that it’s such a nice feature in Python, I wanted to make use of it.

Real world example: Graph traversal

Let’s examine how to represent directed (potentially cyclic) graphs in Python, and how to do a depth first search of all paths between two nodes.
First let’s create a graph, and then show the code to find all the paths between two nodes.

Graph representation

In Python we can represent a graph as a dictionary that maps each node to a list of its children:

g = {'A': ['B', 'C'],
     'B': ['C', 'D'],
     'C': ['A']}

Better visualized as

(Side note: this graph was created using the DOT language, something I plan to post more about in the future.)

The dictionary defined above maps nodes to lists of their children (nodes in this case are simply represented as single letters, but they could be arbitrarily complex).

Search

I mentioned that the graphs might be cyclic; if we do not detect a cycle, the code will keep following it until it runs out of memory. Thus we need to keep track of the path we’ve already explored to avoid cycling.

Here is code to find all the paths between two nodes (slightly modified from http://www.python.org/doc/essays/graphs.html; I will highlight exactly what I changed later in the article).

def find_all_paths(graph, start, end, path=[]):
  path.append(start)
  if start == end:
    return [path]
  if start not in graph:
    return []
  paths = []
  for node in graph[start]:
    if node not in path:
      # recursive call
      newpaths = find_all_paths(graph, node, end, path)
      for newpath in newpaths:
        paths.append(newpath)
  return paths

Let’s test:

>>> graph.find_all_paths(g, 'A','B')
[['A', 'B']]
>>> graph.find_all_paths(g, 'A','C')
[['A', 'B', 'C'], ['A', 'C']]

Both of these are correct: the only path from A to B is the direct edge, but between A and C there are two potential paths, either the two hops from A to B to C, or directly from A to C. However, the code I presented contains a bug. What happens when I run the queries again?

>>> graph.find_all_paths(g, 'A','B')
[]
>>> graph.find_all_paths(g, 'A','C')
[]

Huh? What’s going on? It worked fine the first time, but subsequent calls return the wrong result!

Solution

The problem, I discovered with a little bit of searching, is spelled out in the Python documentation.

Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:

def f(a, L=[]):
  L.append(a)
  return L

print f(1)
print f(2)
print f(3)

This will print

[1]
[1, 2]
[1, 2, 3]

If you don’t want the default to be shared between subsequent calls, you can write the function like this instead:

def f(a, L=None):
  if L is None:
    L = []
  L.append(a)
  return L
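To see where the shared state actually lives, you can inspect the function object itself. What follows is only a minimal sketch, written in Python 3 syntax (unlike the Python 2 snippets in this post); the function names accumulate and accumulate_fresh are made up for illustration and mirror the two versions of f quoted above.

```python
def accumulate(item, bucket=[]):
    # The default list is created once, when the def statement runs, and is
    # stored on the function object. Every call that omits `bucket` reuses it.
    bucket.append(item)
    return bucket


def accumulate_fresh(item, bucket=None):
    # None is immutable, so a new list is built on every call that omits `bucket`.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = accumulate(1)
second = accumulate(2)
print(first is second)                # True: both names refer to the shared default
print(accumulate.__defaults__)        # ([1, 2],)

print(accumulate_fresh(1), accumulate_fresh(2))   # [1] [2]
print(accumulate_fresh.__defaults__)  # (None,)
```

The `__defaults__` attribute is where Python keeps the evaluated default values, which is why a mutation made during one call is still visible in the next.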

I do not like the workaround presented in the docs, since it obscures what the argument is supposed to be (compare visitedNodes=[] vs visitedNodes=None). There is another workaround, and it’s the approach originally taken on the site where the graph traversal code came from:

def find_all_paths(graph, start, end, path=[]):
  path = path + [start]
  # my buggy version had path.append(start)
  ...
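For completeness, here is a sketch of how the whole function might look with that fix applied, together with the graph g from earlier. It is written in Python 3 syntax, so it uses `start not in graph` and print() rather than the Python 2 forms used elsewhere in this post, and it should be read as one plausible reconstruction rather than the exact code from the original essay.

```python
def find_all_paths(graph, start, end, path=[]):
    """Return every cycle-free path from start to end as a list of node lists."""
    # `path + [start]` builds a new list, so the shared default is never mutated.
    path = path + [start]
    if start == end:
        return [path]
    if start not in graph:
        return []
    paths = []
    for node in graph[start]:
        if node not in path:  # skip nodes already on this path to avoid cycles
            paths.extend(find_all_paths(graph, node, end, path))
    return paths


g = {'A': ['B', 'C'],
     'B': ['C', 'D'],
     'C': ['A']}

for _ in range(2):  # repeated calls now give the same answer each time
    print(find_all_paths(g, 'A', 'B'))  # [['A', 'B']]
    print(find_all_paths(g, 'A', 'C'))  # [['A', 'B', 'C'], ['A', 'C']]
```

Keeping path=[] in the signature is safe here only because the body never modifies the list it receives; the default stays empty no matter how many times the function runs.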

Fortunately the Python docs are excellent and I was able to find the solution without too much trouble. Python programmers need to be aware that default arguments are evaluated only once, especially when working with mutable data structures like dictionaries and lists. Programmers must take care to realize that the following two statements are NOT equivalent, especially with respect to objects contained in default arguments: the first mutates the existing list in place, while the second builds a new list and rebinds the name to it.

exploredNodes.append("x")
exploredNodes = exploredNodes + ["x"]
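
The difference is easiest to see when two names refer to the same list. A minimal sketch in Python 3 syntax; the names explored and alias are made up for illustration:

```python
explored = ["a"]
alias = explored            # a second name for the same list object

alias.append("x")           # mutates the shared object in place
print(explored)             # ['a', 'x']

alias = alias + ["y"]       # builds a new list and rebinds only `alias`
print(explored)             # ['a', 'x']
print(alias)                # ['a', 'x', 'y']
print(alias is explored)    # False
```

A default argument plays the role of explored here: append changes it for every later call, whereas the + form leaves it untouched.
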
Categories: Java, Python

Stanza e-book app – how to fix the screen brightness

August 10, 2010 10 comments

Stanza is a great free e-book reader for the iPhone or iPad.  One thing I noticed while using it was that the screen would be very dim from time to time.  Thinking there was a problem with the light sensor, I’d try all sorts of things to try to get the screen brightness to fix itself.  Other apps on the phone didn’t have this problem, so I figured I must have changed some setting.  I looked in the settings of the app but there was nothing indicating how to change the brightness.


Highest brightness

Lowest brightness

Finally I discovered by accident that dragging your finger up and down the screen increases and decreases the brightness.  So if your Stanza app reading experience is hampered by a dim screen, try dragging your finger from the bottom to the top.

Hopefully this helps someone similarly confused.

Categories: iPad, iPhone, iPod