Code bloat and its inevitability in Java

April 27, 2010

[I]f you begin with the assumption that you need to shrink your code base, you will eventually be forced to conclude that you cannot continue to use Java. Conversely, if you begin with the assumption that you must use Java, then you will eventually be forced to conclude that you will have millions of lines of code.

Steve Yegge has an excellent rant on code bloat and its inevitability when using Java.  He makes some very cogent points about refactoring and design patterns, using the metaphor of spring cleaning.  Refactoring and design patterns can be seen as neatly organizing your stuff within compartments in a closet, but if you aren’t able to throw things out, you will never really have a clean code base.  Furthermore, he argues that Java necessitates code bloat and duplication, because it lacks features, such as lambda functions and closures, that are needed to really shrink a code base.

He concludes his article by considering some alternative languages that run on the JVM; it’s an older article (from 2007), and he chose Rhino as the language in which to rewrite his game (currently 500 thousand lines of Java code).  If he were to revisit the article, I’d be interested to see whether he would choose Scala instead. It seems to be a natural fit for what he’s trying to do; it runs on the JVM but adds functional programming features that would surely reduce his source code line count. In the comments many people suggested Scala, but he dismissed them by saying

“Scala folks and Groovy folks: you’re not big enough yet. For something as big as my game, I want a proven mainstream language. I picked Rhino as a complicated multidimensional compromise; the actual reasons are a full blog post. But the short answer is ‘you’re not big enough.’ Sorry.”

Given Scala’s somewhat more mainstream position these days, I think he’d be forced to reconsider.

This article hits home for me, because the verbosity of Java is a thing I struggle with on a daily basis; it is definitely one of my biggest complaints about the language, and it makes it a breath of fresh air to use something more expressive and concise, like Python.

NetBeans Platform Tip #2: Persisting state in TopComponents

April 27, 2010

Given that NetBeans remembers the size, location, and layout of all your TopComponents, it isn’t too surprising that there is an easy mechanism for persisting state information.  Unfortunately it’s not immediately obvious; if you don’t stumble onto the right resources, you might be led to believe you need to deal with the Preferences API, or write to a database or flat file and read it back later.  Fortunately it’s easier than that.

Assume we have a list of Strings we want to remember between invocations of our top component; somehow we get them from the user (the details are not important here).  The secret is in the writeExternal and readExternal methods, which are called when the TopComponent is serialized (saved) and deserialized (loaded).  Also make sure the persistence type you return is TopComponent.PERSISTENCE_ALWAYS; otherwise NetBeans will not attempt to save your TopComponent between invocations of your app.

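import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.openide.util.io.NbMarshalledObject;
import org.openide.windows.TopComponent;

public class TestTopComponent extends TopComponent implements Serializable {
    private List<String> strings = new ArrayList<String>();

    public TestTopComponent(Collection<String> toSave) {
        this();
        strings.addAll(toSave);
    }

    // A public no-argument constructor is needed so the component can be recreated on load
    public TestTopComponent() {
        initComponents();
    }

    public void initComponents() {
         // Create GUI
    }

    public int getPersistenceType() {
        return TopComponent.PERSISTENCE_ALWAYS;
    }

    /**
     * Save the state of the top component, including which properties are displayed
     * @param oo
     * @throws IOException
     */
    public void writeExternal(ObjectOutput oo) throws IOException {
        super.writeExternal(oo);
        Object toWrite = new NbMarshalledObject(strings);
        oo.writeObject(toWrite);
    }

    /**
     * Restore the state of the top component, including which properties are displayed
     * @param oi
     * @throws IOException
     * @throws ClassNotFoundException
     */
    public void readExternal(ObjectInput oi) throws IOException, ClassNotFoundException {
        super.readExternal(oi);
        NbMarshalledObject obj = (NbMarshalledObject) oi.readObject();
        strings = (List<String>) obj.get();
    }
}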

NetBeans (IDE) Tip #1: Project Groups

April 26, 2010

My coworker showed me a great feature of NetBeans that any developer would do well to learn – it’s called Project Groups.  You can see a small screencast illustrating the concept here.

The general idea is that NetBeans remembers a set of projects and the associated open files as a Project Group; when you switch between project groups, your settings from the last session are restored.  This is illustrated in the above linked video, but it doesn’t show the main reason I like it so much, namely the ability to create Project Groups automatically from specific folders on the hard drive.

The feature is accessible via File -> Project Group -> New Group.

If you have multiple projects you are juggling, you can create one project group for each project root folder, and then switch between them easily.

I don’t often go digging through menus, so I am indebted to my coworker for pointing this feature out to me; I use it on a daily basis and it helps ease the burden of context switching by hiding all the irrelevant projects.

TextMate: Column editing, filter selection through command

April 21, 2010

I’ve mentioned TextMate before, as it is the best general purpose text editor I have found on the Mac.  I’d like to show you some of the neat features that I’ve discovered or been told about, along with examples of how they are useful.

The first thing I’d like to show you is column selection.

Standard selection

Column selection

You enter column editing mode by holding down the alt/option key; you’re pressing the correct button if your cursor changes into a crosshair.  At that point, you can select rectangular regions of the text.  You can copy and paste them like normal, but that’s not why this mode is useful. Notice that if you begin typing, what you type is inserted at the same point on each of the lines you have selected.

This came in very handy today when I had to write a copy constructor for a class that had a huge (> 40) number of variables.  (Yes, in most cases a class should be a lot more than a bag of variables.)  For those unsure of what a copy constructor is, it basically allows you to clone objects, making a new object but using the state information from an existing instance of that class. NetBeans has a lot of support for refactoring, but nothing I found did this automatically; I copied and pasted the variables into TextMate to use the features I’m about to illustrate.

public class TextMateExample {
    private int varA;
    private int varB;
    private double varC;
    private int varD;
    private float varE;
    private int varF;
    private String varG;
    private int varH;
}

Use the column selection to get all of the variable names and potentially some of the variable type declarations; don’t worry about the excess.

Paste it underneath.

Clean it up, first by chopping off excess on the left, and then manually editing the extra stuff out.

Now it’s easy from here.  Paste the column of variable names to the right.

We’ll call our copied object ‘other’, and the current object is of course ‘this’.

Make a 0 width selection to the left of the first var column, then type ‘this.’

Select the semicolon column and the whitespace in between, and replace it with ‘= other.’.

Place this block inside a new constructor, and you’re all set.
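The same transformation can also be sketched as a single filter, in the spirit of the ‘filter through command’ feature covered below. This is an alternative to the column-editing steps, not the method used above; it assumes one private field declaration per line with the variable name as the last token, and the sample fields are illustrative.

```shell
# Sample field declarations, one per line (illustrative names).
fields='private int varA;
private double varC;
private String varG;'

# For each declaration, take the last token (the name plus ";"),
# drop the semicolon, and emit a copy-constructor assignment.
assignments=$(printf '%s\n' "$fields" | awk '/private/ {
    name = $NF
    sub(/;$/, "", name)
    print "this." name " = other." name ";"
}')

printf '%s\n' "$assignments"
```

Each input line produces one line of the form this.varX = other.varX; ready to paste into the constructor body.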

The last thing I want to illustrate is how to use the ‘filter through command’ feature in TextMate.  Any arbitrary selection in TextMate can be used as the input to any command line script you can think of.  This is extraordinarily powerful, especially if you are familiar with the Unix command line.

Let’s say that you want to replace all the raw variable access with the corresponding getter methods, for whatever reason. Use the column selection to insert ‘get’ between the other. and the variable name.

The JavaBeans getter/setter convention would be to name those methods getVarX, not getvarX.  Because of how these variables are named, that would be extraordinarily easy to fix; select the column of lowercase v’s, hit V, and all the v’s are instantly replaced. Let’s assume instead that our variables have more varied names, though the getters still follow the JavaBeans convention.  In other words, we want to capitalize the first letter following the word ‘get’ in that column.

I’m going to show you three ways of accomplishing the translation.

1) Use the built in TextMate case conversion

But that’s no fun, and doesn’t illustrate the use of filtering through the terminal.  (Again, this example is a bit trivial and it’s overkill to use the terminal for this, but I’m still going to illustrate how, because I can.)

2) Use filter through command with ‘tr’ command

Select the offending letters (again, all lowercase v’s here, but use your imagination), right click on the selection and choose ‘filter through command’.

The ‘tr’ command performs a mapping from one set of characters to another.

echo "cbadefg" | tr "abc" "def"

“a” maps to “d”, “b” maps to “e”, and so on and so forth.  All we need to do is map “abcdefghijklmnopqrstuvwxyz” to “ABCDEFGHIJKLMNOPQRSTUVWXYZ”.

Fortunately there is a shortcut for that in tr, since they are common sets of letters: [:lower:] and [:upper:]

echo "cbadefg" | tr "[:lower:]" "[:upper:]"

So we can use this command in the dialog box:

tr "[:lower:]" "[:upper:]"

Not surprisingly this works.

3) Use ‘sed’

Sed stands for ‘stream editor’, and it is an extremely versatile Unix command line tool.  Let’s do a few examples of what’s possible with sed before presenting the command to switch the case of all the letters.

Perhaps the most common use of sed is to replace one string with another.  The syntax to do that is

sed 's/string1/string2/g'

The ‘g’ flag is optional; if you include it, all instances of string1 on each line will be replaced; otherwise just the first one.

Let’s try it out:

echo "Hello, World" | sed 's/Hello/Goodbye/g'
Goodbye, World
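The example above has only one match per line, so it does not show what the ‘g’ flag changes. A small sketch with two matches on the same line makes the difference visible; the sample sentence is made up for illustration.

```shell
# Without g: only the first match on each line is replaced.
first_only=$(echo "one fish, two fish" | sed 's/fish/cat/')

# With g: every match on the line is replaced.
all_matches=$(echo "one fish, two fish" | sed 's/fish/cat/g')

echo "$first_only"    # one cat, two fish
echo "$all_matches"   # one cat, two cat
```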

There’s a lot more to sed; I recommend reading Getting Started with Sed if this piques your interest.  I’ll have more blog posts later to illustrate some more, nontrivial uses.

sed can do the same transliteration that the tr command can do, with the syntax ‘y/set1/set2/’

The command to use for filtering to convert lowercase to uppercase is then

sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
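To check that the tr and sed approaches agree, here is a minimal sketch that feeds the same sample column through both. The three letters stand in for whatever column you selected in TextMate; they are not taken from the example class.

```shell
# The selected column: one lowercase letter per line (sample data).
selection=$(printf 'v\nn\nc\n')

via_tr=$(printf '%s\n' "$selection" | tr '[:lower:]' '[:upper:]')
via_sed=$(printf '%s\n' "$selection" | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

printf '%s\n' "$via_tr"
[ "$via_tr" = "$via_sed" ] && echo "tr and sed agree"
```

Both filters uppercase everything they are given, which is why the selection has to contain only the letters you want changed.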


The column editing feature of TextMate is fairly unique (I haven’t found another text editor that supports it), and it comes in handy any time you need to prepend a string to a column of text, as could be the case with a set of variables.  For instance, you could prepend ‘private final’ to a set of default access level variables.

I also illustrated the use of ‘filter selection through command’; any command you can execute on your terminal is accessible to you here.  The power of Unix is completely at your disposal via this dialog.

Show only unread messages in Gmail

April 21, 2010

Gmail is incredibly powerful, but it hides some of the more common features of e-mail under the surface. For instance, you can see at a glance that you have n unread messages, but how to actually see just those messages isn’t immediately obvious.

In the search field, type

in:inbox label:unread

and you will see only the unread messages that are in your inbox. Note that there is no space after the colons. If you want to search in all folders, omit the in:inbox term.

Unix tip #2: explicit for loops, command substitution

April 17, 2010

A lot of unix commands are designed to operate on a large number of files at once.  For instance, the move file command, mv, has as its (simplified) arguments

mv file1 [file2 ... fileN] destination

It’s for that reason that you can do something like

mv *.jpg ~/Desktop/Images

and have all the jpegs in the current working directory moved to the Images folder on the Desktop.  So many unix commands are set up that way that you might not need to know how to explicitly iterate (loop).  I was forced to learn this syntax when I had a large number of .tar.gz files (roughly equivalent to zip files, for those more familiar with Windows land) to decompress.

The command I usually use to extract the contents of a zipped file is

tar -xzvf /path/to/tar/file

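This expects a single archive path. With the -f flag, tar treats the first path as the archive to read, and any further arguments as names of files to extract from that archive, so doing something like tar -xzvf *.tar.gz will not work.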

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
foo.tar.gz  foo2.tar.gz
Nick@Macintosh-2 ~/Desktop/TarExample$ tar -zxvf *.tar.gz
tar: foo2.tar.gz: Not found in archive
tar: Error exit delayed from previous errors

As you can see, this isn’t going to work.  Instead we need an explicit loop.  The general syntax is

for i in [iterable]; do [command with variable $i]; done

For instance,

Nick@Macintosh-2 ~/Desktop/TarExample$ for i in 1 2 3 4 5; do echo $i; done

Recalling our last unix tip, we could replace this with

 Nick@Macintosh-2 ~/Desktop/TarExample$ for i in {1..5}; do echo $i; done

The iterable list is whitespace separated.  This is very important for what I’m about to show to you next.
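A minimal sketch of that whitespace splitting, assuming the list is held in a variable; the variable names and the echo text are illustrative only.

```shell
# A whitespace-separated list held in a variable (sample names).
archives="foo.tar.gz foo2.tar.gz"

# Leaving $archives unquoted lets the shell split it into two words.
count=0
for i in $archives; do
    echo "would extract: $i"
    count=$((count + 1))
done
echo "$count items"
```

The loop body runs once per word, so anything that prints a whitespace-separated list can drive a loop.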

If you’re familiar with basic Unix functionality, you know that you list the contents of a directory with the ls command.  Let’s do that here.

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
foo.tar.gz  foo2.tar.gz

If you’ll notice, these are exactly the filenames we need to pass into the tar command.  Let’s try with echo first.

for i in ls; do echo $i; done
ls

Well, that didn’t work.  What’s going on?  Turns out you need to add backticks (the key to the left of the 1 key) around the ls command; otherwise bash treats it as text.

for i in `ls`; do echo $i; done
foo.tar.gz
foo2.tar.gz

We can go ahead and replace the echo command with our tar command:

 Nick@Macintosh-2 ~/Desktop/TarExample$ ^echo^tar -xzvf

 for i in `ls`; do tar -xzvf $i; done

The stdout here shows the contents that were extracted from the .tar.gz files.

But what is that syntax?

 ^echo^tar -xzvf

?  This is another neat feature of bash; ^old^new repeats the last command, replacing the first occurrence of the first string with the second.  I could have just as easily hit the up key, moved my cursor, deleted echo, and typed tar -xzvf in its place, but this is faster for me to type.

Just for another example,

 echo "Hello"
 Nick@Macintosh-2 ~/Desktop/TarExample$ ^Hello^World
 echo "World"

In actuality, I would not use `ls`; what happens if there were things other than .tar.gz files in the directory?  We’d be calling the tar command with the incorrect arguments.  Instead we only want it to affect the files ending in .tar.gz; this is a place where the * wildcard comes in handy.

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
a.txt       b.txt       foo.tar.gz  foo2.tar.gz

 Nick@Macintosh-2 ~/Desktop/TarExample$ ls *.tar.gz
 foo.tar.gz  foo2.tar.gz

So I can use this in my earlier command,

 for i in `ls *.tar.gz`; do tar -xzvf $i; done
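Backticks also have an equivalent spelling, $(...), which POSIX shells accept and which is easier to read when nested. A small sketch comparing the two forms; printf stands in for ls here so that the example runs in any directory, and the variable names are illustrative.

```shell
# Both forms run the command and substitute its output into the line.
with_backticks=`printf 'foo.tar.gz foo2.tar.gz'`
with_dollar=$(printf 'foo.tar.gz foo2.tar.gz')

for i in $with_dollar; do
    echo "$i"
done

[ "$with_backticks" = "$with_dollar" ] && echo "same result"
```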

Note that you can avoid the use of backticks if you use plain wildcard expansion:

 for i in *.tar.gz; do tar -xzvf $i; done

This works without backticks because ls is a command that has to be run to produce the list of names, whereas the * pattern is expanded by the shell itself into the matching file names before the loop starts.  Read more about globbing.
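One practical benefit of the wildcard form is that each file name arrives as a single word, even if it contains a space. The sketch below builds two small archives in a scratch directory and extracts them in a loop; the file names are made up, and quoting "$i" is an addition to the commands shown above.

```shell
# Build two small archives in a scratch directory (names are illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "one" > a.txt
echo "two" > b.txt
tar -czf foo.tar.gz a.txt
tar -czf "foo 2.tar.gz" b.txt
rm a.txt b.txt

# Loop over the archives; quoting "$i" keeps names with spaces intact.
for i in *.tar.gz; do
    tar -xzf "$i"
done

ls
```

With the `ls` version, the name containing a space would be split into two words and tar would be asked to open archives that do not exist.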

Nested for loops

I haven’t had a need to nest for-loops yet, but you can if you wish.

for i in {1,2,3}; do for j in {3,4,5}; do echo $i $j; done; done
1 3
1 4
1 5
2 3
2 4
2 5
3 3
3 4
3 5


I have shown you how to explicitly iterate over lists in bash, how to use wildcard matching to restrict the set of objects returned by a command, and how to replace one piece of the last command with another.  In most cases you will not need to explicitly iterate over lists, due to the way many unix commands are written, but it’s a useful skill to have nonetheless.

Unix tip #1: advanced mkdir and brace expansion fun

April 11, 2010

You are in a folder and want to create one folder which has 3 sub folders.  Let’s call the main folder Programming and its 3 sub folders Java, Python, and Scala.  Rendered via tree, this looks like

Programming
|-- Java
|-- Python
`-- Scala

A first pass at accomplishing this would be to create the Programming folder, and then the three individual folders underneath

$ mkdir Programming
$ mkdir Programming/Java
$ mkdir Programming/Python
$ mkdir Programming/Scala

This certainly works, but it takes four commands.

Let’s see if we can’t do better.  Delete those folders with the command

rm -rf Programming/

This will delete the Programming folder and everything underneath it (the r flag is for recursive; the f flag forces the removal without prompting for confirmation).

Like most unix commands, the mkdir command can take multiple arguments, separated by spaces.  So the three separate commands to create Java, Python, and Scala can be put onto one line.

mkdir Programming; mkdir Programming/Java Programming/Python Programming/Scala

Note the ; separator between the two commands.  We need to create the Programming folder before we can create the subfolders.

This is better but still too verbose.  It would be nice to remove the mkdir Programming call; we’d like to be able to create an arbitrarily nested folder and have mkdir create all the parent folders automatically.  Fortunately there is a way to do this: the -p flag of mkdir does exactly this.

 -p      Create intermediate directories as required.  If this option is not specified, the full path prefix of
             each operand must already exist.  On the other hand, with this option specified, no error will be
             reported if a directory given as an operand already exists.  Intermediate directories are created with
             permission bits of rwxrwxrwx (0777) as modified by the current umask, plus write and search permission
             for the owner.

Thus we can change our command to

mkdir -p Programming/Java Programming/Python Programming/Scala
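A short sketch of the two behaviours the man page excerpt describes, run in a scratch directory so it does not touch your real folders. The Missing/Child path is a made-up name used only to show the failure without -p.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Creates Programming and Programming/Java in one step.
mkdir -p Programming/Java

# Running it again is not an error, because -p tolerates existing directories.
mkdir -p Programming/Java && echo "second call succeeded"

# Without -p, a missing parent makes mkdir fail.
mkdir Missing/Child 2>/dev/null || echo "plain mkdir failed: parent does not exist"
```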

This is better but still not perfect; we’re repeating the Programming prefix 3 times.  Enter an absurdly useful Bash shell construct known as brace expansion.

echo {5,6,7}
5 6 7

The comma-separated items within the braces are expanded into separate, space-separated words.  That wouldn’t be terribly useful on its own, except that any text immediately before the brace is repeated for each item as well:

echo hello{5,6,7}
hello5 hello6 hello7

This brace expansion can be used anywhere, since the textual substitution happens before the arguments are passed into other processes.  So, combining this with what we saw earlier, we can put Java, Python and Scala into a list and prepend it with Programming:

 echo Programming/{Java,Python,Scala}
Programming/Java Programming/Python Programming/Scala

That should look very familiar.  Putting it in place of the earlier mkdir command we get the elegant one liner

mkdir -p Programming/{Java,Python,Scala}
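Brace expansion is a bash feature that POSIX sh does not require, so a script run by a minimal shell such as dash may pass the braces through literally. A sketch of a portable equivalent using a for loop; the scratch directory is only there to keep the example self-contained.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# One mkdir -p call per sub folder; -p creates Programming on the first pass.
for lang in Java Python Scala; do
    mkdir -p "Programming/$lang"
done

ls Programming
```

At an interactive bash prompt the one-liner with braces is shorter; the loop is mainly useful in scripts that have to run under other shells.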

Recent versions of bash also support numerical ranges within the braces:

echo {1..10}
1 2 3 4 5 6 7 8 9 10


I have shown you how to create all the parent directories using the mkdir command, and introduced you to the brace expansion feature of Bash.  The latter is extremely powerful, and can be used to great effect within scripts.

Note: The arguments within the braces must have NO spaces before or after the commas in order for the brace expansion to work.

[572][nicholasdunn: Desktop]$ echo {5, 6, 7}
{5, 6, 7}
[573][nicholasdunn: Desktop]$ echo {5,6,7}
5 6 7
