Archive

Posts Tagged ‘workaround’

Go gotcha #1: variable shadowing within inner scope due to use of := operator

March 3, 2014 3 comments

Disclaimer: Go is open source and developed by many Google employees. I work for Google, but the opinions expressed here are my own and do not necessarily represent that of Google.

Last week I described how the range keyword in conjunction with taking the address of the iterator variable will lead to the wrong result. This week I’ll discuss how it’s possible to accidentally shadow one variable with another, leading to hard to find bugs.

Let’s take the same basic setup as last week; we have a Solution struct, and we’re searching for the best (lowest cost) one in a slice of potential candidates.

package main

import "fmt"

type Solution struct {
    Name     string
    Cost     int
    Complete bool
}

func FindBestSolution(solutions []*Solution) *Solution {
    var best *Solution
    for _, solution := range solutions {
        if solution.Complete {
            if best == nil || solution.Cost < best.Cost {
                best := solution
                fmt.Printf("new best: %v\n", *best)
            }
        }
    }
    return best
}

func main() {
    solutions := []*Solution{
        &Solution{
            Name:     "foo",
            Cost:     10,
            Complete: true,
        },
    }
    fmt.Printf("Best solution is: %v", FindBestSolution(solutions))
}

Output:

new best: {foo 10 true}
Best solution is: <nil>
Program exited.

Go playground

What’s going on? We see that we have a good candidate solution from the debugging information. Why does the function return the wrong value?

The bug is in this line:

best := solution

The problem is that we’re declaring and initializing a new variable (with the := operator) rather than assigning to the existing best variable in the outer scope. The corrected line is

best = solution

Use = to change the value of an existing variable, use := to create a new variable.

If I had not referenced the new variable with the debug print statement, this code would not have compiled:

if best == nil || solution.Cost < best.Cost {
    best := solution
}
prog.go:16: best declared and not used
 [process exited with non-zero status]

Go playground

Why is this shadowing of variables in other scopes allowed at all?

There is a long thread on the subject on Go-nuts, debating this subject.

Arguments For

Nate Finch:

type M struct{}

func (m M) Max() int {
    return 5
}

func foo() {
    math := M{}
    fmt.Println(math.Max())
}

If shadowing didn’t work, importing math would suddenly break this program.


My point was about adding an import after writing a lot of code (when
adding features or whatever), and that without shadowing, merely importing
a package now has the potential to break existing code….

The current shadowing rules insulate code inside functions from what
happens at the top level of the file, so that adding imports to the file
will never break existing code (now waiting for someone to prove me wrong
on this 😉

Rui Maciel:

There is a simpler and better solution: use a short variable declaration
when you actually want to declare a variable, and use an assignment
operator when all you want to do is assign a value to a variable which
you’ve previously declared. This doesn’t require any change to either
the language or the compiler, particularly one which is that cryptic.

Arguments Against

Johann Höchtl:

See it this way. I can carry a gun in my hand aiming towards a target. I
pull the trigger and hit the target. Everything happens exactly the whay
it is expected to happen.

Suddenly an inner block jumps in … the instructor. Me, a gun in my
hand, the instructor in between and on the other side the target. I pull
the trigger.

Still … everything happens exactly the way it is told to behave. Which
still makes the end results not a desirable result. Adding an “inner
block”, which by itself is behaving in a fully specified way,
influences the whole.

Somewhat odd I admit, but you may get what I mean?

Conclusion

I don’t think that the shadowing should be an error but I do think there should be a warning. The go vet tool already helps find common mistakes, such as forgetting to include arguments to printf. For instance:

example.go:

package main

import "fmt"

func main() {
    fmt.Printf("%v %v", 5)
}

Run:

go vet example.go

example.go:6: missing argument for Printf verb %v: need 2, have 1

If the vet tool were modified to add this warning, it would occasionally yield false positives for those cases where the shadowing is done intentionally. I think this is a worthwhile tradeoff. Virtually all Go programmers I’ve talked with have made this mistake, and I would be willing to bet that these cases are far more frequent than intentional shadowing.

Go gotcha #0: Why taking the address of an iterated variable is wrong

February 25, 2014 6 comments

Golang mascot
Go mascot – by Renée French under Creative Commons Attribution-Share Alike 3.0 Unported from http://en.wikipedia.org/wiki/File:Golang.png

Disclaimer: Go is open source and developed by many Google employees. I work for Google, but the opinions expressed here are my own and do not necessarily represent that of Google.

Go is my new favorite programming language. It’s compact, garbage collected, terse, and very easy to read. There are some things that trip me up even now after I’ve been using it for awhile. Today I’m going to discuss the range construct and how it has a surprising feature that might violate your assumptions.

Range

First, the range keyword is a way to iterate through the various builtin data structures in Go. For instance,

a := map[string]int {
    "hello": 1,
    "world": 2,
}
// 2 element range gets key and value
for key, value := range a {
    fmt.Printf("key %s value %d\n", key, value)
}
// 1 element is just the key
for key := range a {
    fmt.Printf("key %s\n", key)
}

// Works for slices (think of them as vectors/lists) too
b := []string {"hello", "world"}
// 2 element range gets the index as well as the entry
for i, s := range b {
    fmt.Printf("entry %d: %s\n", i, s)
}
// 1 element gets just the index (notice the pattern?)
for i := range b {
    fmt.Printf("entry %d\n", i)
}

This outputs

key hello value 1
key world value 2
key hello
key world
entry 0: hello
entry 1: world
entry 0
entry 1

Try this code in the Go Playground

Solution search – pointers

Imagine the case where we have a struct as follows

type Solution struct {
    Name string
    Cost int
    Complete bool
}

Say that we’re doing some sort of optimization where we’re looking for the minimum cost solution that meets some criteria; for simplicity’s sake, I’ve put that as the ‘complete’ bool. It’s possible that no such Solution matches, in which case we return a nil solution.

A reasonable implementation would be as follows

func FindBestSolution(solutions []Solution) *Solution {
    var best *Solution
    for _, solution := range solutions {
        if solution.Complete {
            if best == nil || solution.Cost < best.Cost {
                best = &solution
            }
        }
    }
    return best
}

Do you see the bug? Don’t worry if you don’t – I’ve made this mistake a few times now.

Let’s add some tests to find the problem. This is an example of a table driven test, where the test cases are given as a slice of struct literals. This makes it very easy to add new test cases.

func TestFindBestSolution(t *testing.T) {
    tests := []struct {
        name      string
        solutions []Solution
        want      *Solution
    }{
        {
            name:      "Nil list",
            solutions: nil,
            want:      nil,
        },
        {
            name: "No complete solution",
            solutions: []Solution{
                {
                    Name:     "Foo",
                    Cost:     25,
                    Complete: false,
                },
            },
            want: nil,
        },
        {
            name: "Sole solution",
            solutions: []Solution{
                {
                    Name:     "Bar",
                    Cost:     12,
                    Complete: true,
                },
            },
            want: &Solution{
                Name:     "Bar",
                Cost:     12,
                Complete: true,
            },
        },
        {
            name: "Multiple complete solution",
            solutions: []Solution{
                {
                    Name:     "Foo",
                    Cost:     25,
                    Complete: false,
                },
                {
                    Name:     "Bar",
                    Cost:     12,
                    Complete: true,
                },
                {
                    Name:     "Baz",
                    Cost:     25,
                    Complete: true,
                },
            },
            want: &Solution{
                Name:     "Bar",
                Cost:     12,
                Complete: true,
            },
        },
    }
    for _, test := range tests {
        got := FindBestSolution(test.solutions)
        if got == nil && test.want != nil {
            t.Errorf("FindBestSolution(%q): got nil wanted %v", test.name, *test.want)
        } else if got != nil && test.want == nil {
            t.Errorf("FindBestSolution(%q): got %v wanted nil", test.name, *got)
        } else if got == nil && test.want == nil {
            // This is OK
        } else if *got != *test.want {
            t.Errorf("FindBestSolution(%q): got %v wanted %v", test.name, *got, *test.want)
        }
    }
}

If you run the tests you’ll find that the last test fails:

--- FAIL: TestFindBestSolution (0.00 seconds)
    prog.go:82: FindBestSolution("One complete solution"): got {Baz 25 true} wanted {Bar 12 true}
FAIL
 [process exited with non-zero status]

This is strange – it works fine in the single element case, but not with multiple values. Let’s try adding a case where the correct value is last in the list.

    {
        name: "Multiple - correct solution is last",
        solutions: []Solution{
            {
                Name:     "Baz",
                Cost:     25,
                Complete: true,
            },
            {
                Name:     "Bar",
                Cost:     12,
                Complete: true,
            },
        },
        want: &Solution{
            Name:     "Bar",
            Cost:     12,
            Complete: true,
        },
    },

Sure enough, this test passes. So somehow if the element is last the algorithm works. What’s going on?

From the go-wiki entry on Range:

When iterating over a slice or map of values, one might try this:

items := make([]map[int]int, 10)
for _, item := range items {
        item = make(map[int]int, 1) // Oops! item is only a copy of the slice element.
        item[1] = 2                 // This 'item' will be lost on the next iteration.
}

The make and assignment look like they might work, but the value property of range (stored here as item) is a copy of the value from items, not a pointer to the value in items.

This is exactly what’s happening in this case. The solution variable is getting a copy of each entry, not the entry itself. Thus when you take the address of the entry, you end up with a pointer pointing at the LAST element in the slice (since the iteration stops at that point). To illustrate:

package main

import "fmt"

func main() {
    strings := []string{"some","value"}
    for i, s := range strings {
        fmt.Printf("Element %d: %s Pointer %v\n", i, s, &s)
    }
}

Element 0: some Pointer 0x10500168
Element 1: value Pointer 0x10500168

Note that the same pointer is used in both cases. This explains why the Solution pointer ended up pointing at the last element of the slice.
Playground

So how do we work around this problem? The key is to introduce a new variable whose address it’s safe to take; its contents won’t change out from underneath you.

Broken:

if solution.Complete {
    if best == nil || solution.Cost < best.Cost {
        best = &solution
    }
}

Fixed:

if solution.Complete {
    if best == nil || solution.Cost < best.Cost {
        tmp := solution
        best = &tmp
    }
}

With this patch the tests pass:

PASS

Program exited.

Alternative design

A great feature of Go is that you can return multiple values from a single function. Here’s an alternative implementation that doesn’t suffer from the previous problem.

func FindBestSolution(solutions []Solution) (Solution, bool) {
    var best Solution
    found := false
    for _, solution := range solutions {
        if solution.Complete {
            if !found || solution.Cost < best.Cost {
                best = solution
                found = true
            }
        }
    }
    return best, found
}

Since best is copying the VALUE of the solution variable, this works correctly. You can play with this example and see how the tests change in the Playground.

This illustrates one other nice feature of Go – all types have a ‘zero’ value that is legal to use. For strings this is the empty string, for pointers it’s nil, for ints it’s 0, for structs all of types are set to zero values. The line var best Solution implicitly sets best to be the zero solution. If I wanted to I could get rid of the found bool altogether and just compare the returned solution with another zero valued Solution.

Conclusion

I introduced some basic features of Go, including maps, slices, range, structs, and functions. I provided links to the amazingly useful Go playground which lets you easily test out code, format it, and share it with others.

I showed two implementations of a function that searches through a slice of struct values, searching for a solution that meets some criteria.

The first example using pointers led to a subtle bug that’s hard to find and solve unless you know how range works. I showed how to write unit tests that exercise the function and helped flush out the bug. I also explained what the bug was and how to work around it.

Finally I showed a version of the same function that uses Go’s multiple return types to return a found boolean rather than using a nil pointer to signify that the value wasn’t found.

Python Gotcha: Word boundaries in regular expressions

September 22, 2011 4 comments

TL;DR

Be careful trying to match word boundaries in Python using regular expressions.  You have to be sure to either escape the match sequence or use raw strings.

Word boundaries

Word boundaries are a great way of performing regular expression searches for whole words while avoiding partial matches.  For instance, a search for the regular expression “the” would match both the word “the” and the start of the word “thesaurus”.

>>> import re
>>> re.match("the", "the")
# matches
>>> re.match("the", "thesaurus")
# matches 
In some cases, you might want to match just the word “the” by itself, but not when it’s embedded within another word.

The way to match a word boundary is with ‘\b’, as described in the Python documentation.  I wasted a few minutes wrestling with trying to get this to work.

>>> re.match("\bthe\b", "the")
# no match

It turns out that \b is also used as the backspace control sequence.  Thus in order for the regular expression engine to interpret the word boundary correctly, you need to escape the sequence:

>>> re.match("\\bthe\\b", "the")
# match

You can also use raw string literals and avoid the double backslashes:

>>> re.match(r"\bthe\b", "the")
# match

In case you haven’t seen the raw string prefix before, here is the relevant documentation:

String literals may optionally be prefixed with a letter ‘r’ or ‘R’; such strings are called raw strings and use different rules for interpreting backslash escape sequences.

Conclusion

Make sure you are familiar with the escape sequences for strings in Python, especially if you are dealing with regular expressions whose special characters might conflict.  The Java documentation for regular expressions makes this warning a bit more explicit than Python’s:

The string literal “\b”, for example, matches a single backspace character when interpreted as a regular expression, while “\\b” matches a word boundary.

Hopefully this blog post will help others running into this issue.

Mule 3 Deployment Gotchas / Workarounds

June 10, 2011 1 comment

Mule is an open source enterprise service bus written in Java. I’ve worked with Mule 2.2 quite a bit but only recently have started to work with Mule 3. This post details some of the pains involved with the transition, none of which are well documented or hinted at in the Migration guide.

Gotchas/Workarounds

Mule IDE specific

The Mule IDE is really a misnomer – it’s not a standalone product, but instead an Eclipse plugin. See the installation guide for more information.

XML validation warnings

By default, Eclipse 3.5 will flag all sorts of spurious errors in your XML configuration file. See the blog post for more details, but here’s the short version on how to solve it:

General

These issues exist whether you use the IDE to deploy the app or deploy the app via the command line.

Failure to launch / Timeouts

Mule is configured via XML. You must declare the namespaces and schema locations in order to make use of the built-in Mule constructs. For instance, here’s a snippet of one of my Mule configurations:

<mule xmlns="http://www.mulesoft.org/schema/mule/core"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:spring="http://www.springframework.org/schema/beans"
      xmlns:vm="http://www.mulesoft.org/schema/mule/vm"
      xmlns:script="http://www.mulesoft.org/schema/mule/scripting"
      xmlns:http="http://www.mulesoft.org/schema/mule/http"
      xmlns:cxf="http://www.mulesoft.org/schema/mule/cxf"
      xmlns:xm="http://www.mulesoft.org/schema/mule/xml"
      xmlns:pattern="http://www.mulesoft.org/schema/mule/pattern"
      xmlns:servlet="http://www.mulesoft.org/schema/mule/servlet"
      xmlns:jetty="http://www.mulesoft.org/schema/mule/jetty"
      xmlns:test="http://www.mulesoft.org/schema/mule/test"
      xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
        http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/3.1/mule.xsd
        http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/3.1/mule-http.xsd
        http://www.mulesoft.org/schema/mule/cxf http://www.mulesoft.org/schema/mule/cxf/3.1/mule-cxf.xsd
        http://www.mulesoft.org/schema/mule/scripting http://www.mulesoft.org/schema/mule/scripting/3.1/mule-scripting.xsd
        http://www.mulesoft.org/schema/mule/pattern http://www.mulesoft.org/schema/mule/pattern/3.1/mule-pattern.xsd
        http://www.mulesoft.org/schema/mule/xml http://www.mulesoft.org/schema/mule/xml/3.1/mule-xml.xsd
        http://www.mulesoft.org/schema/mule/vm http://www.mulesoft.org/schema/mule/vm/3.1/mule-vm.xsd
        http://www.mulesoft.org/schema/mule/servlet http://www.mulesoft.org/schema/mule/servlet/3.1/mule-servlet.xsd
        http://www.mulesoft.org/schema/mule/test http://www.mulesoft.org/schema/mule/test/3.1/mule-test.xsd
        http://www.mulesoft.org/schema/mule/jetty http://www.mulesoft.org/schema/mule/jetty/3.1/mule-jetty.xsd"
        >

Make absolutely sure that the version of the xsd that you include matches the major version of mule that you’re using! If you accidentally place a 3.0 instead of a 3.1 in any of those entries, your app will mysteriously fail to launch and you’ll get a stack trace like the following:

INFO  2011-06-09 17:21:20,015 [main] org.mule.MuleServer: Mule Server initializing...
INFO  2011-06-09 17:21:20,298 [main] org.mule.lifecycle.AbstractLifecycleManager: Initialising RegistryBroker
INFO  2011-06-09 17:21:20,355 [main] org.mule.config.spring.MuleApplicationContext: Refreshing org.mule.config.spring.MuleApplicationContext@19bb5c09: startup date [Thu Jun 09 17:21:20 EDT 2011]; root of context hierarchy
WARN  2011-06-09 17:22:36,265 [main] org.springframework.beans.factory.xml.XmlBeanDefinitionReader: Ignored XML validation warning
java.net.ConnectException: Operation timed out
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.warning(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.xs.traversers.XSDHandler.reportSchemaWarning(Unknown Source)
    at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument1(Unknown Source)
    at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument(Unknown Source)
    at org.apache.xerces.impl.xs.traversers.XSDHandler.parseSchema(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaLoader.loadSchema(Unknown Source)

Deploying via command line

While it’s nice to be able to use an IDE to develop Mule applications, I prefer to deploy from the command line. This allows me to script the launch of the applications. Furthermore, this approach works in a headless (screenless) remote server, whereas the IDE approach will not. The basic way to deploy an app has changed from Mule 2.2 to Mule 3. It used to be that you would call mule -config /path/to/your/config.xml. Now you move your application to the $MULE_HOME/apps folder and run mule, which in turn will deploy all the apps in the apps folder. This can be very handy, especially when coupled with the Hot Deployment features of Mule; you no longer need to have one terminal per mule app you’re running. From the article, “Mule 3: A New Deployment Model”, here are the ostensible steps you must take to deploy your application in this manner:

  • Create a directory under: $MULE_HOME/apps/foo
  • Jar custom classes (if any), and put them under: $MULE_HOME/apps/foo/lib
  • Put the master Mule config file at: $MULE_HOME/apps/foo/mule-config.xml (note that it has to be named: mule-config.xml
  • Start your app with: mule -app foo

While these instructions are correct, there are a lot of gotchas involved. Let me detail them below.

Relative paths

There is often a need to make reference to resources within your configuration file. For instance, you might need to configure an embedded Jetty webserver and tell Jetty where its configuration file is located. When you do this, it’s crucial that you prepend relative paths in the XML configuration file with ${app.home}.

The reason for this is that the current working directory in which you launch the mule process becomes the current working directory for all of your application configuration files. So if you have mule-config.xml in the root of your folder, and conf/jetty.xml in that same folder, then your reference to the jetty.xml should be ${app.home}/conf/jetty.xml. Otherwise, if you just use conf/jetty.xml and launch mule from a folder that’s not the same as the root folder of your application, all of your paths will break.

Property files / Resources

As the step #2 above says, you must jar up all of your compiled classes and include them in the lib folder of your project. If you don’t do this, you’ll get an exception when your component / custom classes are attempted to be instantiated.

What should be emphasized is that all resources that you reference from within your code must end up in the jar as well. By default, that won’t happen. You can use something like the solution presented in Ant Build: copy properties file to jar file to get this to happen.

Unintentional Application Deletion

When you deploy an app by copying a zip or folder into the apps directory and then running mule, Mule will launch it and then create a text file called ‘$APP_NAME-anchor.text’. If you delete this file, Mule will “undeploy this app in a clean way”. What isn’t noted by this is that it will delete the corresponding zip/folder. So be careful not to accidentally delete your whole project. (Not that I did that or anything).

JDBC drivers problems

One nice feature of the hot deploy process is that Mule will automatically load all of the jars in the lib folder and ensure that they’re on the classpath. Unfortunately there is an extremely annoying problem with JDBC drivers, in which they corresponding jar will be loaded correctly, but then will fail to be found at runtime.

At startup:

Loading the following jars:
=============================
file:/opt/local/Mule/mule-standalone-3.1.1/apps/XMLPlayer/lib/mysql-connector-java-5.1.13-bin.jar
=============================
<!-- snip -->
WARN 2011-06-09 15:56:12,130 [http://XMLPlayer].connector.http.mule.default.receiver.2 org.hibernate.cfg.SettingsFactory: Could not obtain connection to query metadata
java.sql.SQLException: No suitable driver found for jdbc:mysql://localhost:3306/db

The exact same project works perfectly in the Mule IDE. The only solution I’ve found is to copy the mysql-connector-java-5.1.13-bin.jar into $MULE_HOME/lib/endorsed. There is a similar bug report but it was closed for some reason. It most certainly does not work the way you would intuitively expect.

Conclusion

Mule 3 has many improvements over Mule 2, particular with the introduction of Flows. Unfortunately, deployment is a much tricker problem than it was in Mule 2, and the resources online are woefully inadequate for the task at hand. I hope this blog post helps some poor soul going through the same frustration I went through to get a Mule 3 application deployed.

Embed a Jetty file server within Mule 3.1.1

June 7, 2011 1 comment

This post details how to embed a Jetty webserver within Mule, such that static files hosted within your application are accessible to the outside world. The resources describing how to do this are few and far between; I also found them erroneous. For some reason, any time I include a test:component element in my Mule configuration files, I get a timeout. By eliminating that piece, I got things to work.

These config files assume that both jetty.xml and mule-config.xml are located in the same folder, namely conf.

mule-config.xml

<mule xmlns="http://www.mulesoft.org/schema/mule/core"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:spring="http://www.springframework.org/schema/beans"
      xmlns:http="http://www.mulesoft.org/schema/mule/http"
      xmlns:xm="http://www.mulesoft.org/schema/mule/xml"
      xmlns:jetty="http://www.mulesoft.org/schema/mule/jetty"
      xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
        http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/3.1/mule.xsd
        http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/3.1/mule-http.xsd
scripting.xsd
        http://www.mulesoft.org/schema/mule/xml http://www.mulesoft.org/schema/mule/xml/3.1/mule-xml.xsd
        http://www.mulesoft.org/schema/mule/jetty http://www.mulesoft.org/schema/mule/jetty/3.1/mule-jetty.xsd"
        >
        
  <description>
  This configuration uses an embedded Jetty instance to serve static content.
 </description>


  <jetty:connector configFile="${app.home}/conf/jetty.xml" name="jetty_connector" ></jetty:connector>
  <!-- do not use localhost here or you will not be able to access the server except locally.-->
  <jetty:endpoint address="http://0.0.0.0:8080" 
              name="jettyEndpoint" 
              connector-ref="jetty_connector"
              path="/">
            
  </jetty:endpoint> 

  <model name="Jetty">
    <service name="jettyUMO">
      <inbound>
        <jetty:inbound-endpoint ref="jettyEndpoint" /> 
      </inbound>
    </service>
  </model>
</mule>

jetty.xml

Modified from Newbie Guide to Jetty, namely changing class names (the classes in question are bundled with Mule 3.1.1, in the Jar file found in $MULE_HOME/lib/opt/jetty-6.1.26.jar).

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">

<Configure id="FileServer" class="org.mortbay.jetty.Server">
  <Set name="handler">
    <New class="org.mortbay.jetty.handler.HandlerList">
      <Set name="handlers">
        <Array type="org.mortbay.jetty.Handler">
          <Item>
            <New class="org.mortbay.jetty.handler.ResourceHandler">
              <!--  Jetty 6.1.26, which comes with Mule 3.1, does not have this method --> 
              <!--<Set name="directoriesListed">true</Set>-->
              <Set name="welcomeFiles">
                <Array type="String">
                  <Item>index.html</Item>
                </Array>
              </Set>
              <!-- This folder maps to the root URL configured for this Jetty endpoint.  If I wanted to start serving content from the a folder named "static", I would replace the . with "static".-->
              <Set name="resourceBase">.</Set>
            </New>
          </Item>
          <Item>
            <New class="org.mortbay.jetty.handler.DefaultHandler" />
          </Item>
        </Array>
      </Set>
    </New>
  </Set>
</Configure>

A gist with both of these code snippets can be found here.

Conclusion

With these two configuration files, you can launch an embedded instance of Jetty within your application, and use it to serve static content. Due to a limitation in the version of Jetty 6.1.26 which Mule 3.1.1 comes with, you cannot use the Jetty instance to list the contents of folders; instead the client must know the absolute path to the file. For my purposes this was not a problem.

Hibernate + MySQL + Mac = Foreign Key Nightmares. A painless solution to a painful problem

May 23, 2011 5 comments

tl;dr summary: Avoid using mixed case table names when using MySQL on a Mac.  Use lowercase underscore separated table names instead.

I was using Hibernate to map my Java classes to MySQL tables and columns.  For most classes, inserts worked perfectly.  For other classes, I’d consistently get errors like

- SQL Error: 1452, SQLState: 23000
- Cannot add or update a child row: a foreign key constraint fails

By running the command

show engine innodb status

in my mysql window, I found following clue:

110520 14:26:09 Transaction:
TRANSACTION 85B76, ACTIVE 0 sec, OS thread id 4530606080 inserting
mysql tables in use 1, locked 1
1 lock struct(s), heap size 376, 0 row lock(s)
MySQL thread id 3, query id 2175 localhost root update
insert into TableName (pk_Pdu) values (10)
Foreign key constraint fails for table `myproj`.`tablename`:
,
  CONSTRAINT `FKEC7DE11817B41BEB` FOREIGN KEY (`pk_Pdu`) REFERENCES `ParentClass` (`pk_Pdu`)
Trying to add to index `PRIMARY` tuple:
DATA TUPLE: 3 fields;
 0: len 8; hex 800000000000000a; asc         ;;
 1: len 6; hex 000000085b76; asc     [v;;
 2: len 7; hex 00000000000000; asc        ;;

But the parent table `myproj`.`ParentClass`
or its .ibd file does not currently exist!

I knew for a fact the table existed; I was able to query it and it showed up fine. Something else must be going on.

I finally stumbled onto the answer by way of a StackOverflow post:

However, I did rename the tables all to lowercase and that did make a difference. A quick search indicates I should maybe setting lower_case_table_names = 1 since I am using InnoDB. On Mac OS/X it is 2 by default (and I failed to mention I’m using a new box which may be why it isn’t working locally).

Sure enough, as soon as I renamed the table names to be all lowercase underscore separated, things worked perfectly. The default naming strategy in Hibernate names the tables in exactly the same way as the class names (e.g. in CamelCase as opposed to lower_case_underscore_separated). Fortunately the designers saw fit to make this naming convention overridable. All I had to do was add one line of code to fix my entire problem:


Configuration config = new Configuration();
// Name tables with lowercase_underscore_separated
config.setNamingStrategy(new ImprovedNamingStrategy());

Thanks to this blog post on ImprovedNamingStrategy for pointing the way. This post also helped me find the problem.

Conclusion

If you’re using Hibernate and a MySQL database running on MacOSX, make sure that your table names are all in lowercase.  This can be accomplished by using the ImprovedNamingStrategy class when configuring Hibernate.

This experience taught me a valuable lesson.  The first is, sometimes a problem can be caused by something that’s not directly your fault per se (i.e. I hadn’t incorrectly structured my Hibernate annotations, as I initially suspected), but rather due some quirk in the operating system or external tools you’re using.  The second is it’s crucial for cross platform libraries like Hibernate to provide the hooks for you to be able to swap out default behavior, precisely to be able to work around problems like these.  Thankfully Hibernate had built in just the hooks I needed to solve the problem.

NetBeans Platform – Duplicate pair in treePair error

March 5, 2011 Leave a comment

I have previously written about the NetBeans Platform
I’ve been using a lot recently at work and so have found a lot of pain points when using the APIs. I’ll try to document some of the more vexing problems here for other programmers who might be facing the same problem.

One problem you will probably face while using the NetBeans Platform is an inscrutable error message when you start up your application, something to the effect of:

java.lang.IllegalStateException: Duplicate pair in treePair1: java.lang.Object pair2: java.lang.Object index1: 55 index2: 55 item1: null item2: null id1: 16309239 id2: 4ecfe790

    at org.openide.util.lookup.ALPairComparator.compare(ALPairComparator.java:83)
    at org.openide.util.lookup.ALPairComparator.compare(ALPairComparator.java:54)
    at java.util.TreeMap.put(TreeMap.java:530)
    at java.util.TreeSet.add(TreeSet.java:238)
    at org.openide.util.lookup.AbstractLookup.getPairsAsLHS(AbstractLookup.java:322)
    at org.openide.util.lookup.MetaInfServicesLookup.beforeLookup(Met…

While it’s impossible to know from this error message, this really indicates that you had some sort of exception in the initializer of one or more of your TopComponents, leading to duplicate null entries in some internal set that NetBeans Platform maintains, leading to this error message. Fortunately, there is usually a “Previous” button on the exception window so you can see what caused the real problem.

EventBus – how to switch EventService implementations for unit testing

February 23, 2011 2 comments

I’ve written previously about EventBus, a great open source Java library for pub-sub (publish subscribe). It’s a truly excellent way to write loosely coupled systems, and much preferable to having to make your domain models extends Observable and your listeners implement Observer. I’m writing today to describe some difficulties in incorporating EventBus into unit tests, and how to overcome that problem.

Test setup

I was attempting to test that certain messages were being published by a domain model object when they were supposed to. In order to test this, I wrote a simple class that did nothing more than listen to the topics I knew that my model object was supposed to publish to, and then increment a counter when these methods were called. It looked something like this:

class EventBusListener {
    private int numTimesTopicOneCalled = 0;
    private int numTimesTopicTwoCalled = 0;

    public EventBusListener() {
        AnnotationProcessor.process(this);
    }

    @EventTopicSubscriber(topic="topic_one")
    public void topicOneCalled(String topic, Object arg) {
        this.numTimesTopicOneCalled++;
    }

    @EventTopicSubscriber(topic="topic_two")
    public void topicTwoCalled(String topic, Object arg) {
        this.numTimesTopicTwoCalled++;
    }

    public int getNumTimesTopicOneCalled() {
        return this.numTimesTopicOneCalled;
    }

    public int getNumTimesTopicOneCalled() {
        return this.numTimesTopicTwoCalled;
    }
}

The basic test routine looked something like this:

@Test
public void testTopicsFired() {


    // Uses EventBus internally
    DomainObject obj = new DomainObject();

    int count = 10;
    EventBusListener listener = new EventBusListener();
    for (int i = 0; i < count; i++) {
        obj.doSomethingThatShouldFireEventBusPublishing();
    }

    assertEquals(count, listener.getNumTimesTopicOneCalled());
    assertEquals(count, listener.getNumTimesTopicTwoCalled());
}

This code kept failing, but in nondeterministic ways – sometimes the listener would report having its topic one called 4 times instead of 10, sometimes 7, but never the same issue twice. Stepping through the code in debug mode I saw that the calls to EventBus.publish were in place, and sometimes they worked. Nondeterminism like this made me think of a threading issue, so I began to investigate.

Problem

After reading through the EventBus javadoc, I came upon the root of the problem:

The EventBus is really just a convenience class that provides a static wrapper around a global EventService instance. This class exists solely for simplicity. Calling EventBus.subscribeXXX/publishXXX is equivalent to EventServiceLocator.getEventBusService().subscribeXXX/publishXXX, it is just shorter to type. See EventServiceLocator for details on how to customize the global EventService in place of the default SwingEventService.

And from the SwingEventService javadoc (emphasis mine):

This class is Swing thread-safe. All publish() calls NOT on the Swing EventDispatchThread thread are queued onto the EDT. If the calling thread is the EDT, then this is a simple pass-through (i.e the subscribers are notified on the same stack frame, just like they would be had they added themselves via Swing addXXListener methods).

Here’s the crux of the issue: the EventBus.publish calls are not occurring on the EventDispatchThread, since the Unit testing environment is headless and this domain object is similarly not graphical. Thus these calls are being queued up using SwingUtilities.invokeLater, and they have no executed by the time the unit test has completed. This leads to the non-deterministic behavior, as a certain number of the queued up messages are able to be processed before the end of execution of the unit test, but not all of them.

Solutions

Sleep Hack

One solution, albeit a terrible one, would be to put a hack in:

@Test
public void testTopicsFired() {
    // same as before

    // Let the messages get dequeued
    try {
        Thread.sleep(3000);
    }
    catch (InterruptedException e) {}

    assertEquals(count, listener.getNumTimesTopicOneCalled());
    assertEquals(count, listener.getNumTimesTopicTwoCalled());
}

This is an awful solution because it involves an absolute hack. Furthermore, it makes that unit test always take at least 3 seconds, which is going to slow the whole test suite down.

ThreadSafeEventService

The real key is to ensure that whatever we call for EventBus within our unit testing code is using a ThreadSafeEventService. This EventService implementation does not use the invokeLater method, so you can be assured that the messages will be delivered in a deterministic manner. As I previously described, the EventBus static methods are convenience wrappers around a certain implementation of the EventService interface. We are able to modify what the default implementations will be by the EventServiceLocator class. From the docs:

By default will lazily hold a SwingEventService, which is mapped to SERVICE_NAME_SWING_EVENT_SERVICE and returned by getSwingEventService(). Also by default this same instance is returned by getEventBusService(), is mapped to SERVICE_NAME_EVENT_BUS and wrapped by the EventBus.

To change the default implementation class for the EventBus’ EventService, use the API:

EventServiceLocator.setEventService(EventServiceLocator.SERVICE_NAME_EVENT_BUS, new SomeEventServiceImpl());

Or use system properties by:

System.setProperty(EventServiceLocator.SERVICE_NAME_EVENT_BUS,
 YourEventServiceImpl.class.getName());

In other words, you can replace the SwingEventService implementation with the ThreadSafeEventService by calling

EventServiceLocator.setEventService(EventServiceLocator.SERVICE_NAME_EVENT_BUS, 
new ThreadSafeEventService());

An alternative solution is use an EventService instance to publish to rather than the EventBus singleton, and expose getters/setters to that EventService. It can start initialized to the same value that the EventBus would be wrapping, and then the ThreadSafeEventService can be injected for testing. For instance:


public class ClassToTest{
    // Use the default EventBus implementation
    private EventService eventService = EventServiceLocator.getEventBusService();

    public void setEventService(EventService service) {
        this.eventService = service;
    }
    public EventService getEventService() {
        return this.eventService;
    }

    public void doSomethingThatNotifiesOthers() {
        // as opposed to EventBus.publish, use an instance of EventService explicitly
        eventService.publish(...);
    }
}

Conclusion

I have explained how EventBus static method calls map directly to a singleton implementation of the EventService interface. The default interface works well for Swing applications, due to its queuing of messages via the SwingUtilities.invokeLater method. Unfortunately, it does not work for unit tests that listen for these EventBus publish events, since the behavior is nondeterministic and the listener might not be notified by the end of the unit test. I presented a solution for replacing the default SwingEventService implementation with a ThreadSafeEventService, which will work perfectly for unit tests.

Java annoyances

January 28, 2011 Leave a comment
Having had Java as the programming language of the vast majority of my undergraduate courses, as well as the language I program in every day, I am most comfortable and fluent in it.  When I return to Java after using different languages such as AWKPython, or Ruby, I’m always left with a bitter taste in my mouth.  There are some things Java just makes way too hard, verbose, and painful to accomplish.  It’s for that reason that I’m learning Scala, what could be (simplistically) described as a cleaned up, more succint version of Java.

Asymmetry in standard libraries

Symmetry is an important feature of a library; it basically means that methods come in pairs.  For instance, you’d expect that a class with a read method has a write method, or one with a set method has a get method.  (That’s not always the case, certainly, but API writers often strive for symmetry.  See Practical API Design: Confessions of a Java Framework Architect) As another example, there’s both a method to convert from an array to a list (Arrays.asList) and there is a method to go the other direction (List.toArray()).  Unfortunately, not all of the Java library APIs adhere to this convention.  The one that bothers me the most is in the String library.  There is a String split method that breaks a String up around a given regular expression, but no corresponding method to reconstitute a String from a collection of other String objects, with a specified separator between them.  This leads to code like the following to comma separate a collection of strings:
String[] strings = ...;
StringBuilder b = new StringBuilder();
for (int i = 0; i < strings.length; i++) {
 b.append(strings[i]);
 if (i != strings.length -1) {
 b.append(",");
 }
}
System.out.println(b.toString());
This whole mess could be replaced with one line of Python code
print(",".join(strings))
or in Scala:
println(strings.mkString(","))
It’s pretty sad that you have to either write that ugly mess, or turn to something like Apache StringUtils.

Different treatment of primitives and objects

It is a lot harder to deal with variable length collections of primitive types than it should be.  This is because you cannot create collections out of things that are not objects.  You can create them out of the boxed primitive type wrappers, but then you have to iterate through and convert back into the primitive types.

In other words, you can create a List<Double> but you cannot create a List<double>.  This leads to code like the following:
// Need a double[] but don't know how long it's going to be
List<Double> doubles = new LinkedList<Double>();
for (...) {
 doubles.append(theComputedValue);
}
// Option 1: Use for loop and iterate over Double list, converting to primitive values through
// auto unboxing
// Bad: leads to O(n^2) running time with LinkedList
double[] doubleArray = new double[doubles.size()];
for (int i = 0; i < doubleArray.length; i++) {
 doubleArray[i] = doubles.get(i);
}
// Option 2: Use enhanced for loop syntax (described below), along with
// additional index variable.
// Better performance but extraneous index value hanging around
int index = 0;
// Automatic unboxing
for (double d : doubles) {
 doubleArray[index++] = d;
}
...
b[index] // Oops
As I blogged about previously, there is a library called Apache Commons Primitives that can be used for variable sized lists for primitive types, but it is a shame one has to turn to third party libraries for such a common task.

Patchwork iteration support

Java 5 introduced the “Enhanced for loop” syntax which allows you to replace
Collection<String> strings = new ArrayList<String>();
Iterator<String> it = strings.iterator();
while (it.hasNext()) {
 String theString = it.next();
}
with the much simpler
for (String s : strings) {
 // deal with the String
}
Here’s the rub: this syntax is supported for arrays and Iterable objects.  But guess what?  Iterators are not Iterable.  Why is this a problem?  Well, you might want to return read-only iterators to your data.  If you do this, the client code cannot use the enhanced for loop syntax, and is stuck with the earlier hasNext() code.  If you want to use the enhanced for loop syntax to work for Iterators, you need to introduce a wrapper around the Iterator which implements the Iterable interface.  From the previously linked blog post:
class IterableIterator<T> implements  Iterable<T> {
    private Iterator<T> iter;
    public IterableIterator(Iterator<T> iter) {
        this.iter = iter;
    }
    // Fulfill the Iterable interface
    public Iterator<T> iterator() {
        return iter;
    }
}
I hope this strikes you as inelegant as well.
Furthermore, arrays are not iterable either, despite the fact that you can use the enhanced for loop syntax with them.
What this all boils down to is that there’s no great way to accept an Iterable collection of objects.  If you accept an Iterable<E>, you close yourself off to arrays and iterators.  You’d have to convert the arrays to a suitable collection type by using the Arrays.asList method.  It would be great if we could treat arrays, collections, etc., agnostically when all we want to do is iterate over their elements.

Lack of type inference for constructors with generics

Yes, we all know we should program to an interface rather than to a specific implementation; doing so will allow our code to be much more flexible and easily changed later.  Furthermore, we also know we should use generics for our collections rather than raw collections of objects; this allows us to catch typing errors before they occur.  So in other words

// BAD: Raw hashmap and programming to the implementation!
HashMap b = new HashMap();
// Good
Map<String, Integer> wordCounts = new HashMap<String, Integer>();
In fact, this lack of type inference is one reason why Joshua Bloch suggests that static factory methods can be better that constructors – it is possible to have a static factory method that can infer the correct types and instantiate the object, without making you explicitly repeat the type parameters.  For instance, Google Guava provides many static methods to instantiate maps:
Map<String, Integer> wordCounts = Maps.newHashMap();
Fortunately, the problem of having to repeat type parameters twice for constructors is being fixed in JDK 7 with something called the Diamond Operator.  It will allow you to replace
Map<String, Integer> wordCounts = new HashMap<String, Integer>();
with
Map<String, Integer> wordCounts = new HashMap<>();
This improvement to the language can’t come fast enough.

Conclusion

I use Java on a daily basis for about 90% of the work I need to do.  I’m comfortable with it, I understand its syntax, it’s fast, it’s powerful.  After being exposed to languages like python and scala, certain issues in Java stand out in stark contrast, and thus I’ve enumerated a few of the reasons that Java annoys me on a daily basis.  Fortunately excellent libraries exist to correct many of the annoyances, but it’s painful to have to use them to do such basic things as joining a list of Strings with a given separator character, or creating a variable sized list of primitive types. Fortunately Java continues to evolve, and at least some of my irritations will be fixed in JDK 7.
Post in the comments if you have better workarounds than those that I’ve suggested, you have other languages that make these tasks easy and would like to highlight them, or any other reason you can think of.

NetBeans Platform – ModuleInstall ClassNotFoundException

January 3, 2011 Leave a comment

In the NetBeans Platform, you can create an ModuleInstall class which handles the lifecycle of your module, and provides methods you can override to handle when your module is loaded and unloaded.  The standard way of creating this class is to use the wizard that NetBeans provides for this purpose, accessible by right clicking on the project and choosing New -> Other -> Module Development -> Module Installer

If you decide that you do not want to use the ModuleInstall mechanism any longer (the NetBeans Platform folks suggest you do NOT use it, as it will increase the startup time of the application), you might think you can just delete the Installer.java file that the wizard created.  Unfortunately, that’s not the case.  When you run the project you’ll get an exception like the following

org.netbeans.InvalidException:  
StandardModule:net.developmentality.moduleexample jarFile: ...
  java.lang.ClassNotFoundException:  net.developmentality.moduleexample.Installer starting from  ModuleCL@482...

The problem is that the wizard modified the manifest.mf file as well, and you need to manually clean up the file before your project will work again.

Open the manifest.mf file and you’ll see a line in the file like the following:

OpenIDE-Module-Install: net/developmentality/moduleexample/Installer.class

delete that line, and rebuild.  You should be ready to go.
In general, you need to be very careful about using any of the wizards that NetBeans provides.  They are extremely useful, but they end up changing XML files that you will need to manually edit later if you decide to refactor.  For instance, if you create an Action instance using the New Action Wizard, the action will be registered in the layer.xml file.  If you rename the action using the refactor command, the entries in the xml file are NOT modified, and you will get an exception at runtime unless you remember to edit that file.
Fortunately, the wizards are smart enough to at least tell you which files they are modifying.  It’s just a matter of remembering this further down the road when you need to refactor, move, or delete files.  Look for the modified files section in the wizard dialog: