Python Gotcha: Word boundaries in regular expressions
TL;DR
Be careful trying to match word boundaries in Python using regular expressions. You have to be sure to either escape the match sequence or use raw strings.
Word boundaries
Word boundaries are a great way of performing regular expression searches for whole words while avoiding partial matches. For instance, a search for the regular expression “the” would match both the word “the” and the start of the word “thesaurus”.
>>> import re
>>> re.match("the", "the")
# matches
>>> re.match("the", "thesaurus")
# matches
The way to match a word boundary is with ‘\b’, as described in the Python documentation. I wasted a few minutes trying to get this to work.
>>> re.match("\bthe\b", "the")
# no match
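It turns out that \b is also the backspace escape sequence in Python string literals, so the pattern above contains two backspace characters by the time the regular expression engine sees it. In order for the engine to interpret the word boundary correctly, you need to escape the backslash: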
>>> re.match("\\bthe\\b", "the")
# match
You can also use raw string literals and avoid the double backslashes:
>>> re.match(r"\bthe\b", "the") # match
In case you haven’t seen the raw string prefix before, here is the relevant documentation:
String literals may optionally be prefixed with a letter ‘r’ or ‘R’; such strings are called raw strings and use different rules for interpreting backslash escape sequences.
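A quick way to see the difference is to inspect the strings themselves, before they ever reach the regex engine. This is a small sketch using only the standard re module; it uses re.search rather than re.match so that the pattern may match anywhere in the text, and the sample sentences are made up for illustration.

```python
import re

# In a normal string literal, "\b" is a single backspace character (0x08).
plain = "\bthe\b"
# In a raw string literal, the backslash and the "b" stay as two characters.
raw = r"\bthe\b"

print(len(plain))               # 5
print(len(raw))                 # 7
print(plain == "\x08the\x08")   # True
print(raw == "\\bthe\\b")       # True

# Only the raw (or double-escaped) pattern performs a word-boundary search.
whole_word = re.search(raw, "see the thesaurus")
partial = re.search(raw, "thesaurus")
backspace = re.search(plain, "the")
print(whole_word.span())        # (4, 7)
print(partial)                  # None
print(backspace)                # None
```

The raw pattern finds the standalone word but skips "thesaurus", while the unescaped pattern is looking for literal backspace characters and so finds nothing.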
Conclusion
Make sure you are familiar with the escape sequences for strings in Python, especially if you are dealing with regular expressions whose special characters might conflict. The Java documentation for regular expressions makes this warning a bit more explicit than Python’s:
The string literal “\b”, for example, matches a single backspace character when interpreted as a regular expression, while “\\b” matches a word boundary.
Hopefully this blog post will help others running into this issue.
Car Talk Puzzler #5: The Perfect Square Dance
PUZZLER: The Perfect Square Dance!
Sally invited 17 guests to a dance party. She assigned each guest a number from 2 to 18, keeping 1 for herself. The sum of each couple’s numbers was a perfect square. What was the number of Sally’s partner?
The fifth in my ongoing series of solving Car Talk Puzzlers with my programming language of choice. I’m using Python again, just like last time.
There are a few pieces to this. The first is, how do we generate a list of all possible pairs that match the perfect square constraint? With list comprehensions and a helper function this is easy.
import math

def perfect_square(n):
    return math.sqrt(n) == int(math.sqrt(n))

num_guests = 18
# range excludes its last number. Thus to go from 1 to n,
# use range(1, n+1)
guests = range(1, num_guests + 1)
# By enforcing the x<y constraint, we avoid
# a whole bunch of equivalent pairs (e.g. (3,6) and (6,3)).
pairs = [(x,y) for x in guests for y in guests if x<y and perfect_square(x+y)]
>>> list(guests)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
>>> pairs
[(1, 3), (1, 8), (1, 15), (2, 7), (2, 14), (3, 6), (3, 13), (4, 5), (4, 12), (5, 11), (6, 10), (7, 9), (7, 18), (8, 17), (9, 16), (10, 15), (11, 14), (12, 13)]
>>> import itertools
>>> itertools.combinations([1,2,3,4,5], 2)
# This is a sort of generator object which lazily returns the values as needed.
# To force them to all be evaluated at once, to see how this works,
# wrap the call in a list function.
>>> list(itertools.combinations([1,2,3,4,5],2))
[(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
This performs exactly as we would expect. Note that I want combinations rather than permutations because the order does not matter.
Yeah, I think we can handle checking 48620 combinations.
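The 48620 figure appears to be the number of ways of choosing 9 couples out of the 18 candidate pairs listed earlier. Here is a quick check, assuming Python 3.8 or later for math.comb (the variable names are mine, for illustration only):

```python
import math

candidate_pairs = 18      # length of the pairs list for 18 guests
couples_needed = 18 // 2  # every guest dances, so 9 couples

num_arrangements = math.comb(candidate_pairs, couples_needed)
print(num_arrangements)   # 48620
```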
def flatten(nested):
    """
    Flatten one level of nesting. Returns a generator object
    For instance:
    list(flatten([(1,3),(5,6)])) --> [1,3,5,6]
    """
    return itertools.chain.from_iterable(nested)

def all_guests_present_once(combination):
    """
    Returns whether each guest is present once
    Combination is a list of tuples, e.g. [(1,5),(7,8)]
    """
    flattened = list(flatten(combination))
    return len(set(flattened)) == len(flattened)

>>> all_guests_present_once([(1,3),(4,5)])
True
>>> all_guests_present_once([(1,3),(3,6)])
False
OK we’re ready to throw it all together.
def dance_arrangement(num_guests):
    """
    Returns a list of valid pairings for all guests if possible, else an empty list
    """
    # Clearly you need an even number of guests to have everyone paired
    if num_guests % 2 == 1:
        return []
    else:
        # range excludes its last number. Thus to go from 1 to n,
        # use range(1, n+1)
        guests = range(1, num_guests + 1)
        # By enforcing the x<y constraint, we avoid
        # a whole bunch of equivalent pairs (e.g. (3,6) and (6,3)).
        pairs = [(x,y) for x in guests for y in guests if x<y and perfect_square(x+y)]
        # brute force search; // keeps the count an integer on Python 3
        all_arrangements = itertools.combinations(pairs, num_guests // 2)
        return list(filter(all_guests_present_once, all_arrangements))
Running the program with num_guests = 18, we get
[((1, 15), (2, 14), (3, 13), (4, 12), (5, 11), (6, 10), (7, 18), (8, 17), (9, 16))]
8 [((1, 8), (2, 7), (3, 6), (4, 5))]
14 [((1, 8), (2, 14), (3, 13), (4, 12), (5, 11), (6, 10), (7, 9))]
16 [((1, 8), (2, 7), (3, 6), (4, 5), (9, 16), (10, 15), (11, 14), (12, 13))]
18 [((1, 15), (2, 14), (3, 13), (4, 12), (5, 11), (6, 10), (7, 18), (8, 17), (9, 16))]
As you can see, 8, 14, and 16 guests can also be paired up in this way. Something to keep in mind the next time you are going to have a party.
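The post stops short of reading the puzzle's answer off the result. As a self-contained sketch for Python 3.8 or later, the search can be rerun and Sally's partner looked up directly. The helper partner_of is hypothetical and not part of the original source, and math.isqrt replaces the floating point square root test.

```python
import itertools
import math


def is_perfect_square(n):
    # math.isqrt (Python 3.8+) avoids floating point rounding
    root = math.isqrt(n)
    return root * root == n


def dance_arrangements(num_guests):
    """Return every pairing in which each couple sums to a perfect square."""
    if num_guests % 2 == 1:
        return []
    guests = range(1, num_guests + 1)
    pairs = [(x, y) for x in guests for y in guests
             if x < y and is_perfect_square(x + y)]
    valid = []
    # Brute force: try every way of picking num_guests // 2 candidate pairs
    for combo in itertools.combinations(pairs, num_guests // 2):
        flat = list(itertools.chain.from_iterable(combo))
        if len(set(flat)) == len(flat):  # nobody appears twice
            valid.append(combo)
    return valid


def partner_of(guest, arrangement):
    """Hypothetical helper: find who `guest` dances with."""
    for a, b in arrangement:
        if a == guest:
            return b
        if b == guest:
            return a
    return None


arrangements = dance_arrangements(18)
sally_partner = partner_of(1, arrangements[0])
print(len(arrangements), sally_partner)  # 1 15
```

There is a single valid arrangement for 18 guests, and in it Sally (guest 1) dances with guest 15.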
Full sourcecode can be found on Github.
Arduino Cookbook review

Some of you may have seen me post this on Twitter, but if not I’m linking here to my DZone book review of the Arduino Cookbook. Tl;dr version: I like it a lot. I’m looking forward to building some things with Arduino in the coming months.
Mule 3 Deployment Gotchas / Workarounds
Mule is an open source enterprise service bus written in Java. I’ve worked with Mule 2.2 quite a bit but only recently have started to work with Mule 3. This post details some of the pains involved with the transition, none of which are well documented or hinted at in the Migration guide.
Gotchas/Workarounds
Mule IDE specific
The Mule IDE is really a misnomer – it’s not a standalone product, but instead an Eclipse plugin. See the installation guide for more information.
XML validation warnings
By default, Eclipse 3.5 will flag all sorts of spurious errors in your XML configuration file. See the blog post for more details, but here’s the short version on how to solve it:
General
These issues exist whether you use the IDE to deploy the app or deploy the app via the command line.
Failure to launch / Timeouts
Mule is configured via XML. You must declare the namespaces and schema locations in order to make use of the built-in Mule constructs. For instance, here’s a snippet of one of my Mule configurations:
<mule xmlns="http://www.mulesoft.org/schema/mule/core"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spring="http://www.springframework.org/schema/beans"
xmlns:vm="http://www.mulesoft.org/schema/mule/vm"
xmlns:script="http://www.mulesoft.org/schema/mule/scripting"
xmlns:http="http://www.mulesoft.org/schema/mule/http"
xmlns:cxf="http://www.mulesoft.org/schema/mule/cxf"
xmlns:xm="http://www.mulesoft.org/schema/mule/xml"
xmlns:pattern="http://www.mulesoft.org/schema/mule/pattern"
xmlns:servlet="http://www.mulesoft.org/schema/mule/servlet"
xmlns:jetty="http://www.mulesoft.org/schema/mule/jetty"
xmlns:test="http://www.mulesoft.org/schema/mule/test"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/3.1/mule.xsd
http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/3.1/mule-http.xsd
http://www.mulesoft.org/schema/mule/cxf http://www.mulesoft.org/schema/mule/cxf/3.1/mule-cxf.xsd
http://www.mulesoft.org/schema/mule/scripting http://www.mulesoft.org/schema/mule/scripting/3.1/mule-scripting.xsd
http://www.mulesoft.org/schema/mule/pattern http://www.mulesoft.org/schema/mule/pattern/3.1/mule-pattern.xsd
http://www.mulesoft.org/schema/mule/xml http://www.mulesoft.org/schema/mule/xml/3.1/mule-xml.xsd
http://www.mulesoft.org/schema/mule/vm http://www.mulesoft.org/schema/mule/vm/3.1/mule-vm.xsd
http://www.mulesoft.org/schema/mule/servlet http://www.mulesoft.org/schema/mule/servlet/3.1/mule-servlet.xsd
http://www.mulesoft.org/schema/mule/test http://www.mulesoft.org/schema/mule/test/3.1/mule-test.xsd
http://www.mulesoft.org/schema/mule/jetty http://www.mulesoft.org/schema/mule/jetty/3.1/mule-jetty.xsd"
>
Make absolutely sure that the version of each xsd that you include matches the version of Mule that you’re using, down to the minor version! If you accidentally place a 3.0 instead of a 3.1 in any of those entries, your app will mysteriously fail to launch and you’ll get a stack trace like the following:
INFO 2011-06-09 17:21:20,015 [main] org.mule.MuleServer: Mule Server initializing...
INFO 2011-06-09 17:21:20,298 [main] org.mule.lifecycle.AbstractLifecycleManager: Initialising RegistryBroker
INFO 2011-06-09 17:21:20,355 [main] org.mule.config.spring.MuleApplicationContext: Refreshing org.mule.config.spring.MuleApplicationContext@19bb5c09: startup date [Thu Jun 09 17:21:20 EDT 2011]; root of context hierarchy
WARN 2011-06-09 17:22:36,265 [main] org.springframework.beans.factory.xml.XmlBeanDefinitionReader: Ignored XML validation warning
java.net.ConnectException: Operation timed out
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.warning(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.reportSchemaWarning(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument1(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.getSchemaDocument(Unknown Source)
at org.apache.xerces.impl.xs.traversers.XSDHandler.parseSchema(Unknown Source)
at org.apache.xerces.impl.xs.XMLSchemaLoader.loadSchema(Unknown Source)
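One way to catch this before launching is to scan the schemaLocation entries for version numbers that differ from the one you expect. The following is a rough sketch using only the Python standard library, not a Mule tool. The function name, the sample config, and the assumption that every Mule schema URL looks like .../schema/mule/<module>/<version>/<file>.xsd are mine.

```python
import re
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Assumed URL layout: http://www.mulesoft.org/schema/mule/<module>/<version>/<file>.xsd
MULE_XSD = re.compile(
    r"^http://www\.mulesoft\.org/schema/mule/[^/]+/(\d+\.\d+)/[^/]+\.xsd$")


def mismatched_schema_versions(config_xml, expected_version):
    """Return the Mule xsd locations whose version differs from expected_version."""
    root = ET.fromstring(config_xml)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    locations = tokens[1::2]  # tokens alternate: namespace, location
    bad = []
    for location in locations:
        match = MULE_XSD.match(location)
        if match and match.group(1) != expected_version:
            bad.append(location)
    return bad


# Made-up two-entry config with one deliberately wrong version
SAMPLE = """<mule xmlns="http://www.mulesoft.org/schema/mule/core"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:http="http://www.mulesoft.org/schema/mule/http"
      xsi:schemaLocation="
        http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/3.1/mule.xsd
        http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/3.0/mule-http.xsd">
</mule>"""

bad_locations = mismatched_schema_versions(SAMPLE, "3.1")
print(bad_locations)
```

Non-Mule entries such as the Spring beans schema are ignored, since their version numbers are unrelated to the Mule release.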
Deploying via command line
While it’s nice to be able to use an IDE to develop Mule applications, I prefer to deploy from the command line. This allows me to script the launch of the applications. Furthermore, this approach works in a headless (screenless) remote server, whereas the IDE approach will not.
The basic way to deploy an app has changed from Mule 2.2 to Mule 3. It used to be that you would call mule -config /path/to/your/config.xml. Now you move your application to the $MULE_HOME/apps folder and run mule, which in turn will deploy all the apps in the apps folder. This can be very handy, especially when coupled with the Hot Deployment features of Mule; you no longer need to have one terminal per mule app you’re running.
From the article, “Mule 3: A New Deployment Model”, here are the ostensible steps you must take to deploy your application in this manner:
- Create a directory under: $MULE_HOME/apps/foo
- Jar custom classes (if any), and put them under: $MULE_HOME/apps/foo/lib
- Put the master Mule config file at: $MULE_HOME/apps/foo/mule-config.xml (note that it has to be named: mule-config.xml)
- Start your app with: mule -app foo
While these instructions are correct, there are a lot of gotchas involved. Let me detail them below.
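For reference, the file layout from the first three steps listed above can be scripted. This is only a sketch of the copying, assuming you already have a config file and any jars built. The function name stage_mule_app and its parameters are hypothetical, and it does not start Mule itself.

```python
import shutil
from pathlib import Path


def stage_mule_app(mule_home, app_name, config_file, jar_files=()):
    """Lay out an app under <mule_home>/apps/<app_name> as the steps describe."""
    app_dir = Path(mule_home) / "apps" / app_name
    lib_dir = app_dir / "lib"
    # Step 1: create the app directory (and its lib folder)
    lib_dir.mkdir(parents=True, exist_ok=True)
    # Step 2: custom classes, already jarred, go under lib
    for jar in jar_files:
        shutil.copy(jar, lib_dir)
    # Step 3: the master config must be named mule-config.xml
    shutil.copy(config_file, app_dir / "mule-config.xml")
    return app_dir


# Example call (paths are placeholders):
# stage_mule_app("/opt/mule", "foo", "conf/my-config.xml", ["build/foo.jar"])
```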
Relative paths
There is often a need to make reference to resources within your configuration file. For instance, you might need to configure an embedded Jetty webserver and tell Jetty where its configuration file is located. When you do this, it’s crucial that you prepend relative paths in the XML configuration file with ${app.home}.
The reason for this is that the current working directory in which you launch the mule process becomes the current working directory for all of your application configuration files. So if you have mule-config.xml in the root of your folder, and conf/jetty.xml in that same folder, then your reference to the jetty.xml should be ${app.home}/conf/jetty.xml. Otherwise, if you just use conf/jetty.xml and launch mule from a folder that’s not the same as the root folder of your application, all of your paths will break.
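The effect is easy to reproduce outside Mule. The sketch below is a Python analogy rather than Mule code: it resolves a path the way a process would, against whatever the working directory happens to be. The directory names are made up, and it assumes ${app.home} expands to the application's own folder.

```python
from pathlib import PurePosixPath


def resolve(path, working_dir):
    """Relative paths hang off working_dir; absolute paths are left alone."""
    p = PurePosixPath(path)
    return p if p.is_absolute() else PurePosixPath(working_dir) / p


app_home = "/opt/mule/apps/foo"          # hypothetical app folder
relative = "conf/jetty.xml"              # breaks when launched elsewhere
anchored = app_home + "/conf/jetty.xml"  # what ${app.home}/conf/jetty.xml becomes

for cwd in ("/opt/mule/apps/foo", "/home/me"):
    print(cwd, "->", resolve(relative, cwd), "|", resolve(anchored, cwd))
```

Only the anchored path points at the same file regardless of where the process was started.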
Property files / Resources
As step 2 above says, you must jar up all of your compiled classes and include them in the lib folder of your project. If you don’t do this, you’ll get an exception when Mule tries to instantiate your components or custom classes.
What should be emphasized is that all resources that you reference from within your code must end up in the jar as well. By default, that won’t happen. You can use something like the solution presented in Ant Build: copy properties file to jar file to get this to happen.
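Since a jar is just a zip archive, it is easy to check whether the resources made it in before deploying. This is a small sketch using Python's zipfile module; the function name and the entry names in the usage comment are placeholders, not anything from my project.

```python
import zipfile


def missing_from_jar(jar_path, required_entries):
    """Return the entries from required_entries that are absent from the jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
    return [entry for entry in required_entries if entry not in names]


# Example call (placeholder names):
# missing_from_jar("lib/myapp.jar", ["hibernate.cfg.xml", "app.properties"])
```

An empty list means every resource you asked about is present in the jar.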
Unintentional Application Deletion
When you deploy an app by copying a zip or folder into the apps directory and then running mule, Mule will launch it and then create a text file called ‘$APP_NAME-anchor.txt’. If you delete this file, Mule will “undeploy this app in a clean way”. What the documentation doesn’t note is that Mule will also delete the corresponding zip/folder. So be careful not to accidentally delete your whole project. (Not that I did that or anything).
JDBC drivers problems
One nice feature of the hot deploy process is that Mule will automatically load all of the jars in the lib folder and ensure that they’re on the classpath. Unfortunately there is an extremely annoying problem with JDBC drivers, in which the corresponding jar will be loaded correctly, but the driver will then fail to be found at runtime.
At startup:
Loading the following jars:
=============================
file:/opt/local/Mule/mule-standalone-3.1.1/apps/XMLPlayer/lib/mysql-connector-java-5.1.13-bin.jar
=============================
<!-- snip -->
WARN 2011-06-09 15:56:12,130 [http://XMLPlayer].connector.http.mule.default.receiver.2 org.hibernate.cfg.SettingsFactory: Could not obtain connection to query metadata
java.sql.SQLException: No suitable driver found for jdbc:mysql://localhost:3306/db
The exact same project works perfectly in the Mule IDE. The only solution I’ve found is to copy the mysql-connector-java-5.1.13-bin.jar into $MULE_HOME/lib/endorsed. There is a similar bug report but it was closed for some reason. It most certainly does not work the way you would intuitively expect.
Conclusion
Mule 3 has many improvements over Mule 2, particularly with the introduction of Flows. Unfortunately, deployment is a much trickier problem than it was in Mule 2, and the resources online are woefully inadequate for the task at hand. I hope this blog post helps some poor soul going through the same frustration I went through to get a Mule 3 application deployed.
Embed a Jetty file server within Mule 3.1.1
This post details how to embed a Jetty webserver within Mule, such that static files hosted within your application are accessible to the outside world. The resources describing how to do this are few and far between; I also found them erroneous. For some reason, any time I include a test:component element in my Mule configuration files, I get a timeout. By eliminating that piece, I got things to work.
These config files assume that both jetty.xml and mule-config.xml are located in the same folder, namely conf.
mule-config.xml
<mule xmlns="http://www.mulesoft.org/schema/mule/core"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:spring="http://www.springframework.org/schema/beans"
xmlns:http="http://www.mulesoft.org/schema/mule/http"
xmlns:xm="http://www.mulesoft.org/schema/mule/xml"
xmlns:jetty="http://www.mulesoft.org/schema/mule/jetty"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://www.mulesoft.org/schema/mule/core http://www.mulesoft.org/schema/mule/core/3.1/mule.xsd
http://www.mulesoft.org/schema/mule/http http://www.mulesoft.org/schema/mule/http/3.1/mule-http.xsd
http://www.mulesoft.org/schema/mule/xml http://www.mulesoft.org/schema/mule/xml/3.1/mule-xml.xsd
http://www.mulesoft.org/schema/mule/jetty http://www.mulesoft.org/schema/mule/jetty/3.1/mule-jetty.xsd"
>
<description>
This configuration uses an embedded Jetty instance to serve static content.
</description>
<jetty:connector configFile="${app.home}/conf/jetty.xml" name="jetty_connector" ></jetty:connector>
<!-- do not use localhost here or you will not be able to access the server except locally.-->
<jetty:endpoint address="http://0.0.0.0:8080"
name="jettyEndpoint"
connector-ref="jetty_connector"
path="/">
</jetty:endpoint>
<model name="Jetty">
<service name="jettyUMO">
<inbound>
<jetty:inbound-endpoint ref="jettyEndpoint" />
</inbound>
</service>
</model>
</mule>
jetty.xml
Modified from Newbie Guide to Jetty, namely changing class names (the classes in question are bundled with Mule 3.1.1, in the Jar file found in $MULE_HOME/lib/opt/jetty-6.1.26.jar).
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
<Configure id="FileServer" class="org.mortbay.jetty.Server">
<Set name="handler">
<New class="org.mortbay.jetty.handler.HandlerList">
<Set name="handlers">
<Array type="org.mortbay.jetty.Handler">
<Item>
<New class="org.mortbay.jetty.handler.ResourceHandler">
<!-- Jetty 6.1.26, which comes with Mule 3.1, does not have this method -->
<!--<Set name="directoriesListed">true</Set>-->
<Set name="welcomeFiles">
<Array type="String">
<Item>index.html</Item>
</Array>
</Set>
<!-- This folder maps to the root URL configured for this Jetty endpoint. If I wanted to start serving content from a folder named "static", I would replace the . with "static".-->
<Set name="resourceBase">.</Set>
</New>
</Item>
<Item>
<New class="org.mortbay.jetty.handler.DefaultHandler" />
</Item>
</Array>
</Set>
</New>
</Set>
</Configure>
A gist with both of these code snippets can be found here.
Conclusion
With these two configuration files, you can launch an embedded instance of Jetty within your application, and use it to serve static content. Due to a limitation in Jetty 6.1.26, the version bundled with Mule 3.1.1, you cannot use the Jetty instance to list the contents of folders; instead the client must know the full path to the file. For my purposes this was not a problem.
Why Code4Cheap is destined for failure
I was intrigued by the premise, but I’ve come to the conclusion that it is destined for failure. The first reason is that the title contains the word ‘Cheap’. Cheap has very negative connotations, including “of shoddy quality”. Even the literal definition, “purchasable below the going price or the real value” , presents real problems for the site. Why?
The blog post Pay Enough or Don’t Pay at All by Panos Ipeirotis sums it up perfectly:
There are the social norms and the market norms. When no money is involved, the exchanges operate using social norms. Once you put a price on a task, it becomes part of a market norm. It can be measured and compared. … Instead of offering their priceless help, they were being valued as unskilled workers, like every other worker in the market. Money and altruism do not mix.
A central tenet of the seminal book about the open source movement, “The Cathedral and the Bazaar“, is that the hacker culture thrives as a “gift culture” as opposed to an “exchange culture”. (This chapter of the book is available online if you’re interested in more). Thus we see every day thousands of highly skilled people give away their time and programming effort, both in the open source community and in Q&A sites like StackOverflow. In these instances, the currency consists of reputation and goodwill rather than money.
One must pay a reasonable rate for programming expertise if one is to pay at all, and the current questions on the site are laughably complex for the amount of money that the posters are offering. On top of that, the site takes a 30% cut out of any bounty that a buyer offers for a solution, further disincentivizing prospective programmers (e.g. a $50 bounty actually becomes $35).
I applaud the creator for launching a product, but I’m afraid this one will not last, without some sweeping changes to the business model.
What makes Google maps easier to read than its competitors?
This isn’t a new link but one I’ve been meaning to bring to my readers’ attention for a while now. Justin O’Beirne has posted an excellent analysis of how Google’s use of white outlines, label sizes, and label font weight enhances a user’s ability to find information on a Google map.
Interestingly enough, Bing changed its mapping visual style to respond to some of the complaints against it. See O’Beirne’s post on the updates they made.
The entire 41latitude website is excellent, but these articles in particular piqued my interest. Hopefully you find them similarly enlightening.
Hibernate + MySQL + Mac = Foreign Key Nightmares. A painless solution to a painful problem
tl;dr summary: Avoid using mixed case table names when using MySQL on a Mac. Use lowercase underscore separated table names instead.
I was using Hibernate to map my Java classes to MySQL tables and columns. For most classes, inserts worked perfectly. For other classes, I’d consistently get errors like
- SQL Error: 1452, SQLState: 23000 - Cannot add or update a child row: a foreign key constraint fails
By running the command
show engine innodb status
in my mysql window, I found the following clue:
110520 14:26:09 Transaction:
TRANSACTION 85B76, ACTIVE 0 sec, OS thread id 4530606080 inserting
mysql tables in use 1, locked 1
1 lock struct(s), heap size 376, 0 row lock(s)
MySQL thread id 3, query id 2175 localhost root update
insert into TableName (pk_Pdu) values (10)
Foreign key constraint fails for table `myproj`.`tablename`:
,
CONSTRAINT `FKEC7DE11817B41BEB` FOREIGN KEY (`pk_Pdu`) REFERENCES `ParentClass` (`pk_Pdu`)
Trying to add to index `PRIMARY` tuple:
DATA TUPLE: 3 fields;
0: len 8; hex 800000000000000a; asc ;;
1: len 6; hex 000000085b76; asc [v;;
2: len 7; hex 00000000000000; asc ;;
But the parent table `myproj`.`ParentClass`
or its .ibd file does not currently exist!
I knew for a fact the table existed; I was able to query it and it showed up fine. Something else must be going on.
I finally stumbled onto the answer by way of a StackOverflow post:
However, I did rename the tables all to lowercase and that did make a difference. A quick search indicates I should maybe setting lower_case_table_names = 1 since I am using InnoDB. On Mac OS/X it is 2 by default (and I failed to mention I’m using a new box which may be why it isn’t working locally).
Sure enough, as soon as I renamed the table names to be all lowercase underscore separated, things worked perfectly. The default naming strategy in Hibernate names the tables in exactly the same way as the class names (e.g. in CamelCase as opposed to lower_case_underscore_separated). Fortunately the designers saw fit to make this naming convention overridable. All I had to do was add one line of code to fix my entire problem:
Configuration config = new Configuration();
// Name tables with lowercase_underscore_separated
config.setNamingStrategy(new ImprovedNamingStrategy());
Thanks to this blog post on ImprovedNamingStrategy for pointing the way. This post also helped me find the problem.
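To make the renaming concrete, here is a rough Python approximation of the CamelCase to lower_case_underscore_separated conversion. It is not Hibernate's actual implementation, and ImprovedNamingStrategy may treat edge cases such as runs of capitals differently; the function name is mine.

```python
import re


def to_underscore_name(class_name):
    """Approximate CamelCase -> lower_case_underscore_separated conversion."""
    # Insert "_" wherever a lowercase letter is directly followed by an uppercase one
    with_underscores = re.sub(r"([a-z])([A-Z])", r"\1_\2", class_name)
    return with_underscores.lower()


for name in ("TableName", "ParentClass"):
    print(name, "->", to_underscore_name(name))
# TableName -> table_name
# ParentClass -> parent_class
```

With names like these there is no mixed case left for MySQL on the Mac to trip over.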
Conclusion
If you’re using Hibernate and a MySQL database running on MacOSX, make sure that your table names are all in lowercase. This can be accomplished by using the ImprovedNamingStrategy class when configuring Hibernate.
This experience taught me two valuable lessons. The first is, sometimes a problem can be caused by something that’s not directly your fault per se (i.e. I hadn’t incorrectly structured my Hibernate annotations, as I initially suspected), but rather due to some quirk in the operating system or external tools you’re using. The second is it’s crucial for cross platform libraries like Hibernate to provide the hooks for you to be able to swap out default behavior, precisely to be able to work around problems like these. Thankfully Hibernate had built in just the hooks I needed to solve the problem.
Unzip KMZ Files on a Mac using Springy
I’m learning about KML/KMZ files, where KMZ is basically a .zip file renamed as .kmz. The problem is that these .kmz files cannot be opened using the default Mac unzip utility. When you try to open the .zip file, it creates a new file called <originalfile>.zip.cpgz. Opening the .cpgz file yields a copy of the original zip.
The solution is to use Springy, a zip utility for Mac (free trial, ~$20 to buy). It handles the file perfectly.
Edit: Found an alternative approach here. Basically, rename the file .rar instead of .zip and the Unix unzip utility can handle it.
I’ve written a script to incorporate this; find it as a gist here.
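Another option, separate from the gist mentioned above, is Python's standard zipfile module, which reads the archive by its contents and does not care about the file extension. This is a minimal sketch, assuming the .kmz really is a well-formed zip; the function name and default destination are my own choices.

```python
import zipfile
from pathlib import Path


def extract_kmz(kmz_path, dest_dir=None):
    """Extract a .kmz archive and return the sorted names of its members."""
    kmz_path = Path(kmz_path)
    # Default destination: a folder next to the file, named without the extension
    dest = Path(dest_dir) if dest_dir else kmz_path.with_suffix("")
    with zipfile.ZipFile(kmz_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Example call (placeholder file name):
# extract_kmz("places.kmz")  # extracts into ./places/
```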
Visor – Mac OSX shortcut to launch Terminal
My friend Paul showed me a very nice application to quickly launch Terminal. It’s called Visor, and it’s very useful if you’re a programmer.
After installing it, your terminal hides until being summoned via a keyboard hotkey. At that point, it pops into view from the top of the screen (though this can be customized if you desire). I find it really declutters my desktop, as I no longer need to devote screen real estate to the terminal. Instead, it’s hidden until I need it.
Due to the way it’s packaged (as a SIMBL plugin that modifies the Terminal app itself), it is unobtrusive, incorporating its settings into the Terminal app itself as opposed to requiring a separate app in your dock or quick launch bar. It’s simple and works flawlessly. Can’t ask for much more in a piece of free software.








