Archive
Java annoyances
Asymmetry in standard libraries
String[] strings = ...; StringBuilder b = new StringBuilder(); for (int i = 0; i < strings.length; i++) { b.append(strings[i]); if (i != strings.length -1) { b.append(","); } } System.out.println(b.toString());
print(",".join(strings))
println(strings.mkString(","))
Different treatment of primitives and objects
It is a lot harder to deal with variable length collections of primitive types than it should be. This is because you cannot create collections out of things that are not objects. You can create them out of the boxed primitive type wrappers, but then you have to iterate through and convert back into the primitive types.
// Need a double[] but don't know how long it's going to be List<Double> doubles = new LinkedList<Double>(); for (...) { doubles.append(theComputedValue); } // Option 1: Use for loop and iterate over Double list, converting to primitive values through // auto unboxing // Bad: leads to O(n^2) running time with LinkedList double[] doubleArray = new double[doubles.size()]; for (int i = 0; i < doubleArray.length; i++) { doubleArray[i] = doubles.get(i); } // Option 2: Use enhanced for loop syntax (described below), along with // additional index variable. // Better performance but extraneous index value hanging around int index = 0; // Automatic unboxing for (double d : doubles) { doubleArray[index++] = d; } ... b[index] // Oops
Patchwork iteration support
Collection<String> strings = new ArrayList<String>(); Iterator<String> it = strings.iterator(); while (it.hasNext()) { String theString = it.next(); }
for (String s : strings) { // deal with the String }
class IterableIterator<T> implements Iterable<T> { private Iterator<T> iter; public IterableIterator(Iterator<T> iter) { this.iter = iter; } // Fulfill the Iterable interface public Iterator<T> iterator() { return iter; } }
Lack of type inference for constructors with generics
Yes, we all know we should program to an interface rather than to a specific implementation; doing so will allow our code to be much more flexible and easily changed later. Furthermore, we also know we should use generics for our collections rather than raw collections of objects; this allows us to catch typing errors before they occur. So in other words
// BAD: Raw hashmap and programming to the implementation! HashMap b = new HashMap(); // Good Map<String, Integer> wordCounts = new HashMap<String, Integer>();
Map<String, Integer> wordCounts = Maps.newHashMap();
Map<String, Integer> wordCounts = new HashMap<String, Integer>();
Map<String, Integer> wordCounts = new HashMap<>();
Conclusion
Scala – type inferencing gotchas
int x = 100;
as we would in Java, you can instead write
val x = 100
The setup
def calculateAverageColor(image:BufferedImage):Color = { var redSum = 0 var greenSum = 0 var blueSum = 0 // calculate the sum of each channel here val red = (redSum / numPixels) val green = (greenSum / numPixels) val blue = (blueSum / numPixels) new Color(red, green, blue) }
Problem #1
scala> java.lang.Integer.MAX_VALUE res0: Int = 2147483647 scala> java.lang.Integer.MAX_VALUE+1 res2: Int = -2147483648
If this happens, then we end up trying to construct a color with negative red, green, or blue values; this will result in an IllegalArgumentException.
def calculateAverageColor(image:BufferedImage):Color = { // Declare the sums as longs so we don't have to worry about overflow var redSum:Long = 0 var greenSum:Long = 0 var blueSum:Long = 0 // calculate the sum of each channel val red = (redSum / numPixels) val green = (greenSum / numPixels) val blue = (blueSum / numPixels) new Color(red, green, blue) }
java.lang.IllegalArgumentException: Color parameter outside of expected range: Red, Green, Blue
Problem #2
At this point, you might start debugging the process by printing out the values of red, green, and blue. Sure enough they’ll be in the range [0, 255], just as you need for the Color constructor. What is going on?
There are two related problems. The first is that the type of red, green, and blue are not integers, due to the Long value in the computation. The compiler sees the Long and (correctly) infers that the type of red, green, and blue must be Long.
”
The following 19 specific conversions on primitive types are called the widening primitive conversions:
- byte to short, int, long, float, or double
- short to int, long, float, or double
- char to int, long, float, or double
- int to long, float, or double
- long to float or double
- float to double
”
long redSum = ...; int averageRed = (int) (redSum/numPixels);
val redSum:Long = ... val averageRed:Int = (redSum/numPixels).asInstanceOf[Int]
var redSum:Long = 0 var greenSum:Long = 0 var blueSum:Long = 0 // calculate the sum of each channel val red:Int = (redSum / numPixels).asInstanceOf[Int] val green:Int = (greenSum / numPixels).asInstanceOf[Int] val blue:Int = (blueSum / numPixels).asInstanceOf[Int] new Color(red, green, blue)
Conclusion
One of the nice things about Scala is that you do not need to explicitly declare the types of your variables. In one sequence of unfortunate events, the variables that looked like ints were in fact longs, leading to an implicit conversion to the float primitive type, which in turn caused the incorrect constructor to be invoked, and an IllegalArgumentException. Hopefully you can avoid doing something so foolish as a result of reading this post.
Ternary operator in bash
Here’s a really quick tip for bash programmers.
In languages like C++, Java, Python, and the like, there’s the concept of a ternary operator. Basically it allows you to assign one value if a condition is true, else another.
In C/Java:
int x = valid ? 0 : 1;
In Python:
x = 0 if valid else 1
In Scala:
val x = if (valid) 0 else 1
Well, there’s no ternary operator in Bash, but there is a way to fake it.
valid=1 [ $valid ] && x=1 || x=0
Where whatever conditional you want is within the brackets.
If it’s valid, then the branch after the AND is followed, otherwise that after the OR is followed.
This is equivalent though perhaps a bit less readable then
if [ $valid ]; then x=1; else x=0; fi
So you should be aware of the construct in case you ever run into it, but it’s arguably less readable than just listing out explicitly what you’re doing.
Thanks to experts-exchange for making me aware of this little tip.
Java gotcha: Splitting a string
CSV, or comma separated value, files are the workhorse of flat file processing. They’re human readable, you can edit them in any spreadsheet editor worth its salt, they’re portable, and they’re easy to parse and read into your programs. Unfortunately, text you might want to store in the columns might also have commas in it, which complicates the parsing significantly – instead of splitting on all of the commas, you have to know whether the comma is in a quoted string, in which case it doesn’t signify the end of a field … it gets messy, and you’re probably going to get it wrong if you roll your own. For a quick and dirty script, sometimes it’s better to delimit the columns with a different character, one that comes up much less often in text. One good choice is the pipe character, |.
Foolishly I processed lines of text like the following in Scala:
val FIELD_DELIMITER = "|"; val FILE_NAME_INDEX = 0 val AVG_COLOR_INDEX = 1 val THUMBNAIL_IMAGE_INDEX = 2 def parseLine(indexRow:String) = { Console.println("Row: " + indexRow) val entries = indexRow.split(FIELD_DELIMITER) // extract the file name, average color, thumbnail }
Unfortunately, this code is subtly broken. The problem is that the String split method does not accept a plain string on which to split – rather, it treats the string as a regular expression. Since the pipe character has special meaning in regular expressions (namely “OR”), this has the effect of splitting the string on every character, rather than just at the pipe delimiters.
To get around this, you need to tell the regular expression engine that you want to treat the pipe as a literal, which you do by backslash escaping the pipe. Unfortunately “\|” is not a valid string in Java, because it will attempt to interpret it as a control character (like the newline \n). Instead, you need to backslash escape the backslash, leaving “\\|”
Conclusion:
Be careful when you’re using String.split in Java/Scala. You have to be aware that it’s treating your string as a regular expression. Read this StackOverflow discussion for a better understanding of the pros and cons of Java using regular expressions in many of its core String libraries.
An introduction to Scala
An Introduction to Scala
Overview
- Introduction
- Why is it gaining popularity, and why should you care?
- History
- Functional programming
- Syntax
- Data structures
- Benefits
- Drawbacks
- Conclusion
Introduction
- Scala = scalable language.
- From scripts all the way up to enterprise
- Runs on JVM
- Combines functional and object-oriented paradigms
Scala was designed to be both object-oriented and functional. It is a pure object-oriented language in the sense that every value is an object. Objects are defined by classes, which can be composed using mixin composition. Scala is also a functional language in the sense that every function is a value. Functions can be nested, and they can operate on data using pattern matching. Language creator Martin Odersky
Why?
- Productive like a scripting language, but with type safety and speed
- You can express yourself in fewer lines of code than equivalent Java.
- “On average, Odersky predicts a 2x reduction in lines of code between a Java program and a Scala port.” Scala lift-off: Martin Odersky Keynote
- Less boilerplate
- Better type inferencing
- Well suited to multithreaded environments
- Actors and message passing
Less boilerplate
// Scala
case class Person(firstName: String, lastName: String)
// Java
public class Person implements Serializable {
private final String firstName;
private final String lastName;
public Person(String firstName, String lastName) {
this.firstName = firstName;
this.lastName = lastName;
}
public String getFirstName() {
return firstName;
}
public String getLastName() {
return lastName;
}
public Person withFirstName(String firstName) {
return new Person(firstName, lastName);
}
public Person withLastName(String lastName) {
return new Person(firstName, lastName);
}
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Person person = (Person) o;
if (firstName != null ? !firstName.equals(person.firstName) : person.firstName != null) {
return false;
}
if (lastName != null ? !lastName.equals(person.lastName) : person.lastName != null) {
return false;
}
return true;
}
public int hashCode() {
int result = firstName != null ? firstName.hashCode() : 0;
result = 31 * result + (lastName != null ? lastName.hashCode() : 0);
return result;
}
public String toString() {
return "Person(" + firstName + "," + lastName + ")";
}
}
Samples where Scala code is significantly shorter than Java
History
- Martin Odersky created it
- Functional programming and compiler background (wrote current javac reference compiler)
- Worked on adding generics to Java
- Design began in 2001, released in 2003
- Version 2.80 Final released July 14, 2008
Functional programming
- Avoid state and mutable data
- Every variable is a constant
- Lends itself to parallelization; avoid locking and deadlocks
- First-class functions – functions can serve as arguments and results of functions
- Recursion is primary tool for iteration
- Heavy use of pattern matching
- Lazy evaluation – infinite sequences
Who’s using Scala
- Twitter uses for performance critical code; replaces Ruby in that regard
- FourSquare uses Scala and Lift web framework
- Xerox uses Scala and Lift
- Etherpad uses Scala in its real time document collaboration software (acquired by Google for Google Docs; Scala code is now opensourced)
Syntax:
Hello world in Scala:
object HelloWorld extends Application {
Console.println("Hello World")
}
No semicolons needed
Variables declared with val and var
val x = 0 // immutable
var y = 0 // mutable
x = 1 // compiler error
y = 1 // fine
Type declarations occur after the variable
int x = 0; // Java
val x: Int = 0 // Scala
Generic type arguments occur within square brackets rather than angle brackets
List<Integer> intList = ...; // java
val intList: List[Int] = ..// scala
Array indexing is done through use of parentheses rather than square brackets
// Java
int[] x = {1,2,3,4};
x[0] // 1
// Scala
val x: Array[Int] = Array(1,2,3,4)
x(0) // 1
Method definitions
scala> def square(x:Int) = { x * x }
square: (Int)Int
scala> square(5)
res38: Int = 25
No such thing as a ‘void’ method, every method returns a value. void = “Unit” in Scala
scala> def printIt(x:Any) = { println(x) }
printIt: (Any)Unit
Type inferencing – note that I haven’t been declaring what types my methods take
// Compiler infers that the return type must be int
scala> def square(x:Int) = { x * x }
square: (Int)Int
// I can explicitly indicate what it returns if I prefer
scala> def squareExplicit(x:Int):Int = { x * x }
squareExplicit: (Int)Int
No need to explicitly return values
scala> def square(x:Int) = { x * x }
square: (Int)Int
scala> def squareExplicit(x:Int):Int = { return x * x }
squareExplicit: (Int)Int
// The result of last expression is returned.
Conditionals as values
// BAD. Java style, mutable value
var x = 0
if (someCondition) {
x = something
}
else {
x = somethingElse
}
// GOOD: Scala style, immutable
val x =
if (someCondition) {
something
}
else {
somethingElse
}
// or if short enough
val x = if(something) something else (somethingElse)
All operations are method calls
Scala doesn’t technically have operator overloading, because it doesn’t actually have operators in the traditional sense. Instead, characters such as +, -, *, and / can be used in method names. Thus, when you typed 1 + 2 into the Scala interpreter in Step 1 you were actually invoking a method named + on the Int object 1, passing in 2 as a parameter. As illustrated in Figure 3.1, you could alternatively have written 1 + 2 using traditional method invocation syntax (1).+(2).
“Programming in Scala, p. 39”
What good does that do me?
If your domain objects make sense to add, multiply, divide, etc., you can treat them in a much more natural way.
// BigInt is a wrapper around Java's BigInteger, providing arithmetic methods
scala> BigInt("103058235287305927350") + BigInt("288888203959230529352")
res16: BigInt = 391946439246536456702
// In Java you're stuck calling .add, .multiply, etc
scala> new java.math.BigInteger("103058235287305927350").add(new java.math.BigInteger("288888203959230529352"))
res21: java.math.BigInteger = 391946439246536456702
Data structures
Lists
More powerful than Java lists. Expose a whole raft of Functional Programming paradigms, such as map, reduce, filter, folds, etc.
// Note the syntax; you do not call "new". This is because there is
// an apply ("()") method defined in the List object.
scala> val a = List("hello","how","are","you")
a: List[java.lang.String] = List(hello, how, are, you)
// Explicit
val nums:List[Int] = List(1,2,3,4)
scala> a.size
res40: Int = 4
scala> a.mkString(" ")
res41: String = hello how are you
scala> a.map(x => x.toUpperCase())
res42: List[java.lang.String] = List(HELLO, HOW, ARE, YOU)
scala> a.map(x => x.length())
res43: List[Int] = List(5, 3, 3, 3)
scala> a.filter(x => x.startsWith("h"))
res44: List[java.lang.String] = List(hello, how)
scala> a.forall(x => x.length() > 0)
res46: Boolean = true
// Singly linked list
scala> a.head
res67: java.lang.String = hello
scala> a.tail
res68: List[java.lang.String] = List(how, are, you)
// add to head
scala> 1 :: List(2)
res70: List[Int] = List(1, 2)
// reverse
scala> a.reverse
res72: List[java.lang.String] = List(you, are, how, hello)
Tuples
Tuples are a good data structure to pack up multiple variables to return from a function, without the need to define a helper class to do so. Slightly syntax to create them, as well as to access the elements contained therein.
// Explicitly create the tuple; never really a need to do this
scala> val a:Tuple2[Int,Int] = (5,6)
a: (Int, Int) = (5,6)
// The parentheses indicate it's a tuple
scala> val b = (5,6)
b: (Int, Int) = (5,6)
// Yes, I know it's a strange syntax. And it's 1-based rather than 0-based. This is because Haskell has 1-based tuple ordering and Odersky wanted to stay true to that.
scala> a._1
res48: Int = 5
scala> a._2
res49: Int = 6
// You can have an arbitrary number of elements, of any type you want
val listTuple = (List(1,2,3), "hello", new java.awt.Color(255,0,0))
scala> val listTuple = (List(1,2,3), "hello", new java.awt.Color(255,0,0))
listTuple: (List[Int], java.lang.String, java.awt.Color) = (List(1, 2, 3),hello,java.awt.Color[r=255,g=0,b=0])
Maps
Scala provides immutable maps as well as a syntax for creating map literals.
scala> val theMap = Map("hello"->5,"world"->5)
theMap: scala.collection.immutable.Map[java.lang.String,Int] = Map(hello -> 5, world -> 5)
scala> theMap("hello")
res50: Int = 5
scala> theMap.getOrElse("world",2)
res51: Int = 5
scala> theMap.getOrElse("worldly",2)
res52: Int = 2
scala> theMap.keySet
res54: scala.collection.Set[java.lang.String] = Set(hello, world)
scala> theMap.values
res55: Iterator[Int] = non-empty iterator
scala> theMap.values.toList
res56: List[Int] = List(5, 5)
// Mutable map:
var mutableMap = Map[Int, String]()
mutableMap += (1 -> "1")
mutableMap += (2 -> "2")
Arrays
Internally map to Java arrays. Idiomatic Scala code tends to prefer Lists to Arrays, but they are necessary for interacting with a lot of Java code. Create in same way as lists.
scala> val array = Array(1,2,3)
array: Array[Int] = Array(1, 2, 3)
Benefits
-
Static type checking but with improved type inference
// Java
private Map wordCount = new HashMap();
// Scala:
var wordCount:Map[String, Int] = Map() -
Much faster than Python/Ruby and other interpreted languages, but just as expressive
- Duck typing support
- Conciseness
- Functions as first class objects; no need for anonymous classes and extraneous interfaces Unified types
- Traits are a cross between interfaces and abstract classes, allowing you to provide default implementations. Allows for ‘mix-in’ composition, as well. Solves problems with multiple inheritance, while giving flexibility. Avoid repeating yourself.
- Console REPL environment allows you to quickly iterate and test
- Of theoretical interest
- Combines functional and object-oriented in statically-typed environment
Drawbacks/Complaints
- Immaturity of tool chain
- Binary/source incompatibility from rapid change of language
- Type system is very complex
- Functional programming can be hard to wrap head around for people familiar with OOP
-
Syntax can be so concise as to be cryptic
For instance, the following example comes from StackOverflow post.
def scanLeft[a,b](xs:Iterable[a])(s:b)(f : (b,a) => b) = xs.foldLeft(List(s))( (acc,x) => f(acc(0), x) :: acc).reverse
And then use it like this:
scala> scanLeft(List(1,2,3))(0)(_+_) res1: List[Int] = List(0, 1, 3, 6)
Conclusion
Scala is a very intriguing language due to its combination of object orientation with functional programming concepts. The programmer is not forced to choose between the expressiveness of objects and immutability; he can switch between the two paradigms as the situation warrants.
While binary/source incompatibility is a very real problem for enterprise adoption, it is an issue that the creators are working on.
Scala provides a great framework in which to explore functional programming methods, especially through the use of the interactive console environment.
Resources
- Scala in the Interprise
- Martin Odersky’s Introduction to Scala video
- Scala for Java refugees
- Top five scripting languages on the JVM
- Scala history
- Scala language homepage
- Scala cheatsheet
- Scala FAQ
- Java, Grovvy & Scala: side to side1 2
- Lift
- Scala is unfit for serious development
- What is stopping you from switching to Scala?
Images:
Car Talk Puzzler #3: The Palindromic Odometer
The Palindromic Odometer
Driving along, Terry notices that the last four digits on the odometer are palindromic. A mile later, the last five digits are palindromic. [A mile later, the middle four digits are palindromic.] [One mile after that], all six are palindromic. What was the odometer reading when Terry first looked at it?
This is the third in my series of Car Talk puzzler solutions using whatever programming language strikes my fancy. Just like the last one I wrote, this one uses Scala. This is a much shorter post, as I am assuming some familiarity with Scala; go back and read the last post if you haven’t already.
object PalindromicPuzzlerSolver { // Pad out numbers to 6 places // Note that I can cal Java library functions seamlessly val df = new java.text.DecimalFormat("000000") // A string is a palindrome if it reversed equals itself def isPalindrome(word: String): Boolean = { // Reverse is a method of a RichString, which a String can be implicitly converted // into. Necessary to convert the RichString back into String after the reversal word == word.reverse.toString() } def fitsDescription(miles: Int): Boolean = { // Last 4 are palindome (dropping first 2) isPalindrome(df.format(miles).drop(2)) && // Add one, last 5 are palindrome isPalindrome(df.format(miles + 1).drop(1)) && // Add two more, all six are a palindrome isPalindrome(df.format(miles + 3)) } def main(args: Array[String]) { val max = 1000000 println( 0 until max filter fitsDescription ) // Prints out // RangeF(198888, 199999) // This is the same as the java style: // println( (0.until(max)).filter(fitsDescription) ) // This syntax only works when you are dealing with // methods that take one argument. } }
Thus the solution is 198888, just as the answer proclaims. Another win for brute force.
There are three noteworthy things in this example, which didn’t come up in the last post (or weren’t explained fully)
1. Interoperability with Java
2. Use of drop
3. Convenience methods for specifying a range
Interoperability with Java
Any Java class can be imported and used with Scala; that is one of its strongest selling points. Scala runs in the same JVM as Java, so it makes sense that it should have access to Java classes. In fact many Scala classes are thin wrappers around Java objects, with syntactic sugar built in. For instance, Scala has a BigInt class which wraps a Java BigInteger, while providing overloaded methods to allow them to be used more like standard numbers. For instance,
scala> BigInt("103058235287305927350") + BigInt("288888203959230529352") res16: BigInt = 391946439246536456702 scala> new java.math.BigInteger("103058235287305927350").add(new java.math.BigInteger("288888203959230529352")) res21: java.math.BigInteger = 391946439246536456702
Note that the + is actually a method call; this is exactly the same thing as calling
scala> BigInt("103058235287305927350").+(BigInt("288888203959230529352")) res22: BigInt = 391946439246536456702
In Scala, method names don’t have to be alphanumeric, which is how it gets away with what looks like operator overloading. This makes it a lot nicer to work with classes like BigDecimal or BigInteger, as you can use the more commonly used addition and subtraction symbols rather than explicitly calling “add” and “subtract” methods. Just realize that you are really calling methods like any other under the hood. When you understand this, you’ll also understand why Scala uses an underscore instead of * for wildcard – * can be used in method names, as it is used for multiplication!
Drop
Scala has full support for slicing iterable objects; see my previous discussion on slices if this term is unfamiliar.
scala> List(1,2,3,4).slice(1,3) res27: List[Int] = List(2, 3) scala> List(1,2,3,4).slice(1) warning: there were deprecation warnings; re-run with -deprecation for details res28: Seq[Int] = List(2, 3, 4) scala> List(1,2,3,4).drop(1) res29: List[Int] = List(2, 3, 4)
So drop(n) is exactly the same as slice(n) – it provides a new list with the first n elements removed. I use drop instead of slice due to the deprecation of the single argument form of slice.
Specifying a range
If you need to iterate over a range of numbers in Java, you’d do something like
for (int i = 0; i < max; i++) {
}
whereas in Python you can do something like
for x in range(max):
Scala similarly provides methods to create a range of numbers to iterate over.
scala> 0 until 10 res9: Range = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9) scala> 0 to 10 res10: Range.Inclusive = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Again, I want to stress that until is a method like any other; you are just free to leave off the . and the parens when it is unambiguous what you mean. Note that these commands are actually creating a sequence in memory. If you only need to access each element one at a time, and you are dealing with a large range, you might instead create an Iterator rather than a sequence. The Iterator will create each subsequent number as necessary, rather than building up the whole list ahead of time. This is much more space efficient.
Note that Python makes this distinction as well:
>>> range(10) [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> xrange(10) xrange(10)
xrange returns a generator object, as opposed to range which returns the actual list.
Returning to Scala, if I wanted an iterator instead I would do
scala> Iterator.range(0,10) res11: Iterator[Int] = non-empty iterator scala> Iterator.range(0,10).foreach(println) 0 1 2 3 4 5 6 7 8 9
Conclusion
I have shown another solution using Scala, while illustrating a few more features. In particular one needs to understand that Scala’s syntactic sugar is actually masking standard function calls under the hood; x + y is the same as x.+(y), where “+” is a valid function name.
Car Talk Puzzler #2: What do these words have in common? An imperative and functional approach using Java and Scala
Here’s the second in my series of Car Talk puzzler solutions. Today I’m going to be illustrating a solution first in Java, and then in Scala in order to show how functional programming features can drastically reduce the number of lines of code, while still expressing the developer’s intent.
RAY: Here’s the answer. If you notice, the first letter in each of the words is from farther down in the alphabet than all the subsequent letters. And if we assigned a number to every letter in the alphabet, so that A is one, B is two, C is three and so on…
TOM: Yeah, yeah, yeah.
RAY: Then the numerical place in the alphabet of the first letter is equal to the sum of the numerical places of all the other letters. For example, Table: T is 20, A is 1, B is 2, L is 12, E is 5, so 1 plus 2 plus 12 plus 5 –
TOM: Is 20.
So, we have the answer. But as the guys mention on the show, there are probably other words as well. Let’s do a brute force search through the dictionary to find all such words. The dictionary I”ll be using is available here, as a new line separated list of words, one per line. This makes it very easy to deal with.
Here’s the rather straightforward solution in Java:
import java.io.FileReader; import java.io.BufferedReader; import java.io.FileNotFoundException; import java.io.IOException; /** * Determines all the words such that the first letter's ordinal position * equals the sum of the remaining letters' ordinal positions. For instance, * "table"'s ordinal numbers are [20,1,2,12,5], and 20 = 1+2+12+5. Thus * "table" has this property. */ public class OrdinalPositionPuzzlerSolver { /** * Assumes the string consists just of A-Z, a-z. */ public static boolean hasProperty(String s) { if (s.length() < 2) { return false; } char[] letters = s.toCharArray(); int sum = 0; for (int i = 1; i < letters.length; i++) { int ordinal = ordinal(letters[i]) ; sum += ordinal; } return sum == ordinal(letters[0]); } /** * @return the 1 based ordinal position of the letter. For instance, * a is 1, b is 2, etc. */ public static int ordinal(char letter) { return Character.toLowerCase(letter) - 'a' + 1; } public static void main(String[] args) { String dictionaryPath = "dict.txt"; /** http://www.javapractices.com/topic/TopicAction.do?Id=42 */ try { BufferedReader input = new BufferedReader(new FileReader(dictionaryPath)); try { String line = null; while (( line = input.readLine()) != null){ if (hasProperty(line)) { System.out.println(line); } } } catch (FileNotFoundException e) { System.err.println("Error, file " + dictionaryPath + " didn't exist."); } finally { input.close(); } } catch (IOException e) { System.err.println("IO Error:" + e); } } }
The only place that bears some explanation might be the ordinal function:
/** * @return the 1 based ordinal position of the letter. For instance, * a is 1, b is 2, etc. */ public static int ordinal(char letter) { return Character.toLowerCase(letter) - 'a' + 1; }
Basically, letters (characters) in Java are really just numbers under the hood which get mapped back into letters when printed to the screen. So, we can do arithmetic using letters. ‘a’ + 1 = ‘b’, ‘b’ + 2 = ‘d’, and so on. (This only works because the letters a-z and A-Z are mapped to a continuous range of integers). So if we want to convert a->1, first we figure out a’s position from the start of the alphabet (‘a’ -‘a’ == 0, it’s the first), and add 1 to get the 1-based number.
The other thing to notice is how much junk code there is just to read the lines from our file:
try { BufferedReader input = new BufferedReader(new FileReader(dictionaryPath)); try { String line = null; while (( line = input.readLine()) != null){ if (hasProperty(line)) { System.out.println(line); } } } catch (FileNotFoundException e) { System.err.println("Error, file " + dictionaryPath + " didn't exist."); } finally { input.close(); } } catch (IOException e) { System.err.println("IO Error:" + e); }
Believe it or not, this is relatively idiomatic Java code to accomplish this task. First we wrap a file reader in a buffered reader so that we do not continuously access the file; rather we read chunks into the file and buffer them for in-memory access. That’s 20 lines of code just to iterate over the lines of a file, determine if they match a pattern, and print it to the terminal.
It really should not be this hard.
As I’ve said before, Java’s verbosity is one of my biggest complaints about it. The program works, and it’s relatively straightforward, but it can’t really be described as elegant. In that same post I mentioned my interest in Scala as an alternative to Java. It’s a compiled language that runs in the JVM, but through its type inferencing and functional language aspects, it is much more concise; the creator Martin Odersky claims a typical Source Line of Code (SLOC) reduction of 2x between equivalent Scala and Java programs. Let’s reimplement the program in Scala, and see if this claim rings true. (Note: I’m not an expert in scala so there’s probably even more concise, idiomatic ways to accomplish what I’m trying to do. But let’s start somewhere).
The syntax of scala is very similar to that of Java. One of the main differences is that type declarations occur after the variable rather than before.
int x = 0; // Java val x: Int = 0 // Scala
Another syntactic difference is that generic type arguments occur within square brackets rather than angle brackets
List<Integer> intList = ...; // java val intList: List[Int] = ..// scala
Because of this, array indexing is done through use of parentheses rather than square brackets.
// Java int[] x = {1,2,3,4}; x[0] // 1
// Scala val x: List[Int] = List(1,2,3,4) x(0) // 1
Due to type inferencing, you often don’t need to declare the type of your variables; the compiler can figure it out. Thus the last declaration could be rewritten as
val x = List(1,2,3,4)
I highly recommend you watch Martin Odersky’s introduction to Scala video; it’s about an hour long and goes over the design considerations and differences from Java.
There are no static functions in Scala; rather you create a singleton object through the keyword ‘object’ and access the method in the same way.
Finally, there is a read/eval/execute interpreter built-in, accessible by calling ‘scala‘ in the command line. This allows you to experiment without having to explicitly compile any code; it allowed me to create the pieces iteratively before plunking them into a file for later compilation.
scala Welcome to Scala version 2.7.3.final (Java HotSpot(TM) Client VM, Java 1.5.0_22). Type in expressions to have them evaluated. Type :help for more information. scala> 1+1 res0: Int = 2 scala> print("hello world") hello world scala>
OK, enough beating around the bush. Let’s solve this puzzler!
We’ll tackle this in 3 parts.
1) Summing a list of numbers
2) Completing the ordinal number function
3) Iterating over all the lines in the dictionary file
Summing a list of numbers
A first pass at a sum function might be as follows:
def sum(nums: Iterable[Int]): Int = { var sum = 0 for (num <- nums) { sum += num } sum }
Note the declaration of sum as a var as opposed to a val; a val is equivalent to a final value while a var can be reassigned. Why did I declare the nums argument to be of type Iterable rather than List? Well, a List is iterable. So is an array. So are any number of things. We want to keep this function as generic as possible. Pasting this into our interpreter, we can test this out.
scala> def sum(nums: Iterable[Int]): Int = { | var sum = 0 | for (num <- nums) { | sum += num | } | sum | } sum: (Iterable[Int])Int scala> sum(List()) res4: Int = 0 scala> sum(List(1,2,3,4,5)) res5: Int = 15 scala> sum(Array(1,2,3,4)) res6: Int = 10
This works, but let’s look at another way to solve this problem – a functional rather than imperative solution. Java is inherently imperative, meaning you’ll see a lot of explicit while and for loops. Here’s just a quick excerpt of each the wikipedia articles to define each programming paradigm:
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state.
In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state. In much the same way that imperative mood in natural languages expresses commands to take action, imperative programs define sequences of commands for the computer to perform.
So whereas the imperative manner is completely explicit in how the sum is created, using a mutable state variable (sum), a more functional approach will eliminate the use of mutable state altogether.
The reduce function is absolutely crucial to understand in order to program in a functional manner; let’s examine Python’s definition of it before applying it to the problem at hand.
Here is the Python definition of reduce function:
reduce(function, iterable[, initializer])
Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y:x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.
And the description of the built-in sum function is as follows:
sum(iterable[, start])¶
Sums start and the items of an iterable from left to right and returns the total. start defaults to 0. The iterable‘s items are normally numbers, and are not allowed to be strings. The fast, correct way to concatenate a sequence of strings is by calling ”.join(sequence). Note that sum(range(n), m) is equivalent to reduce(operator.add,range(n), m) To add floating point values with extended precision, see math.fsum().
While Scala doesn’t have a built-in sum function, it does have support for reducing a list through use of a function. It has both a reduce left (left to right reduction) and reduce right (right to left reduction). Here is Scala’s definition of reduceLeft:
def reduceLeft[B >: A](op : (B, A) => B) : B
Combines the elements of this iterable object together using the binary operator op, from left to right
Notes Will not terminate for infinite-sized collections.
Parameters op – The operator to apply
Returns op(… op(a0,a1), …, an) if the iterable object has elements a0, a1, …, an.
Throws Predef.UnsupportedOperationException – if the iterable object is empty.
This might be a little confusing but bear with me.
First, we need to write a binary function for our sum operator – after all, we want to pairwise reduce the list by using an add operator. A first pass is as follows:
scala> def add(x: Int, y: Int): Int = { x + y } add: (Int,Int)Int scala> List(1,2,3,4).reduceLeft(add) res12: Int = 10 scala> List().reduceLeft(add) java.lang.UnsupportedOperationException: Nil.reduceLeft at scala.List.reduceLeft(List.scala:1093) at .(:6) at .() at RequestResult$.(:3) at RequestResult$.() at RequestResult$result() at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethod... scala> List(-5,5).reduceLeft(add) res14: Int = 0
Note that you cannot call sum on an empty list any more; doing so results in an UnsupportedOperationException. This is arguably more correct than returning 0, but we will have to ensure that we don’t attempt to sum an empty iterable object.
So a working sum definition would be
def sum(nums: Iterable[Int]): Int = { def add(x: Int, y: Int): Int = { x + y } nums.reduceLeft(add) }
Note that you can nest function definitions, as functions are objects in Scala, and thus can be defined wherever a variable could. Note too that I never have to explicitly return a value; the last evaluated value is implicitly returned by the function.
Finally, if you want to be even more terse, you can replace the explicitly defined add function with an anonymous one:
def sum(nums: Iterable[Int]): Int = { nums.reduceLeft(_+_) }
Read the Daniel Spiewak’s excellent blog post for a better explanation for what’s going on here. Here’s a quick explanation:
Completing the ordinal number functional
Recall the function that we had in Java for computing the ordinal position of each letter:
public static boolean hasProperty(String s) { if (s.length() < 2) { return false; } char[] letters = s.toCharArray(); int sum = 0; for (int i = 1; i < letters.length; i++) { int ordinal = ordinal(letters[i]) ; sum += ordinal; } return sum == ordinal(letters[0]); } /** * @return the 1 based ordinal position of the letter. For instance, * a is 1, b is 2, etc. */ public static int ordinal(char letter) { return Character.toLowerCase(letter) - 'a' + 1; }
Again, we could directly translate this code into Scala but doing so would not be making full use of Scala’s functional underpinnings. We are converting from characters of the string to their ordinal positions in the alphabet. While the Java code does this one letter at a time, keeping a sum, it’s just as correct to do the entire mapping process and then sum over the new list. Pictorially:
Just as there is a reduce function to convert an iterable object into a single entry, so too is there a function to convert an Iterable of one type into an Interable of another; it is the map function.
// The underscore in an anonymous function like this is a placeholder for the entry in the iterable collection currently being processed scala> List(1,3,4).map(_+5) res19: List[Int] = List(6, 8, 9) scala> List("hi","there").map(_.toUpperCase()) res22: List[java.lang.String] = List(HI, THERE)
The above examples map from Int-> Int and String->String but there’s nothing stopping us from converting types.
scala> List(100,300,400).map(_.toString()) res23: List[java.lang.String] = List(100, 300, 400)
Finally, let’s apply the ordinal function to a string:
scala> "cat".map(Character.toLowerCase(_) - 'a' + 1) res24: Seq[Int] = ArrayBufferRO(3, 1, 20)
Thus the entire function for determining if the word is a match is
[sourecode language=”scala”]
/**
* Determines if the word has the desired property of the ordinal position
* of first letter equalling the sum of the ordinal positions of the other
* letters
* Expects letters in the range [a-zA-Z];
*/
def matches(word: String): Boolean = {
if (word.length() < 2) {
false
}
else {
val ordinals = word.map(Character.toLowerCase(_) – 'a' + 1)
val ordinalSum = sum(ordinals.drop(1))
ordinals(0) == ordinalSum
}
}
[/sourecode]
Iterating over the lines in a file
Remember how it took 20 lines to iterate over all the lines in a file? Fortunately Scala makes it incredibly easy:
val lines = scala.io.Source.fromFile("dict.txt").getLines
The variable is actually not an array or a list; it is an iterator over the lines in the file. But iterators are still iterable, so all of our previously mentioned functions for dealing with iterable objects still work. Again, rather than looping over each element, explicitly testing, and printing whether it matched, we will use a functional approach, namely the ‘filter’ function.
scala> List(1,2,3,4).filter(_%2==0) res25: List[Int] = List(2, 4)
The function passed into filter must return Boolean; since our matches() function does that, we can insert it right in:
val lines = scala.io.Source.fromFile("dict.txt").getLines val matching = lines.filter(matches) matching.foreach(print(_))
When we run the complete code, we get the following list of matches:
- acknowledgements
approximation
bumptiously
capitalizations
circuitously
conceptualizing
consciousness
consistently
credulousness
crystallizing
disconcertingly
discriminatory
electrocutions
excommunication
existentialism
experimentally
exploitations
expressively
extraneously
governmentally
impersonations
incomprehensibly
indispensability
initializations
intercommunicated
interpolations
notwithstanding
organizationally
oversimplifying
phenomenologically
sportswriting
Wait a minute… what’s going on here? These obviously don’t follow the pattern. Let’s open a terminal and do some testing.
First we paste our two helper method definitions in
Welcome to Scala version 2.7.3.final (Java HotSpot(TM) Client VM, Java 1.5.0_22). Type in expressions to have them evaluated. Type :help for more information. scala> def sum(nums: Iterable[Int]): Int = { | nums.reduceLeft[Int](_+_) | } sum: (Iterable[Int])Int scala> scala> /** | * Determines if the word has the desired property of the ordinal position | * of first letter equalling the sum of the ordinal positions of the other | * letters | * Expects letters in the range [a-zA-Z]; | */ scala> def matches(word: String): Boolean = { | val ordinals = word.map(Character.toLowerCase(_) - 'a' + 1) | val ordinalSum = sum(ordinals.drop(1)) | ordinals(0) == ordinalSum | } matches: (String)Boolean scala> matches("table") res0: Boolean = true scala> matches("chair") res1: Boolean = false scala> matches("consistently") res2: Boolean = false
This is very strange. “consistently” is one of the words on the list, but it doesn’t pass the test on its own. In fact, none of them do.
What’s the problem? After googling and debugging for a bit, I realized that the precondition I was assuming in my matches method was being violated; the letters were not in the range [A-Za-z]. How could that be? Well, it turns out that the getLines() method returns the lines in the file – with new lines included!
scala> matches("consistently\r\n") res5: Boolean = true
(The file had been created on Windows, and thus has a carriage feed as well as a new line marker).
Fortunately there is a method to eliminate whitespace (including control characters like these); it’s known as trim.
We can put the trim in one of two places, either in the matches() method itself,
def matches(word: String): Boolean = { val ordinals = word.trim().map(Character.toLowerCase(_) - 'a' + 1) val ordinalSum = sum(ordinals.drop(1)) ordinals(0) == ordinalSum } scala> matches("consistently\r\n") res6: Boolean = false scala> matches("consistently") res7: Boolean = false
Or we can introduce the trim while iterating over the lines in the collection. I’ve opted for this second approach; I’m making it a precondition of calling the method matches() that the input is already sanitized. We introduce an implicit mapping from the string to the trimmed version of the string:
// An iterator of lines in the file; note that they have newline control characters val lines = scala.io.Source.fromFile("dict.txt").getLines // Filter the lines, eliminating the new lines through trim mapping val matching = lines.filter(i => matches(i.trim())) matching.foreach(print(_))
After making the change, the result is
- cab
- hag
- jade
- leaf
- leg
- maced
- mica
- mid
- need
- pig
- real
- ride
- same
- sand
- seam
- sod
- table
- toad
- toe
- vial
- ward
- whack
- who
- wick
- win
- yeas
- yet
- zebra
- zinc
Tada!
The final version of the program is
object OrdinalPositionSolver { /** * Determines if the word has the desired property of the ordinal position * of first letter equalling the sum of the ordinal positions of the other * letters * Expects letters in the range [a-zA-Z]; */ def matches(word: String): Boolean = { if (word.length() < 2) { false } else { val ordinals = word.map(Character.toLowerCase(_) - 'a' + 1) val ordinalSum = sum(ordinals.drop(1)) ordinals(0) == ordinalSum } } def sum(nums: Iterable[Int]): Int = { nums.reduceLeft[Int](_+_) } def main(args: Array[String]) { // An iterator of lines in the file; note that they have newline control characters val lines = scala.io.Source.fromFile("dict.txt").getLines // Filter the lines, eliminating the new lines through trim mapping val matching = lines.filter(i => matches(i.trim())) matching.foreach(print(_)) } }
Conclusion
I have solved the puzzler in two different ways, once in an imperative Java style and once in a more functional Scala style. The functional way is extremely concise and expressive, doing in 30 lines of code what it takes the Java code to do in over 60, confirming the claim that a typical Scala program is at least twice as compact as the corresponding Java program. I have introduced some of the salient syntactic differences between Java and Scala. I have also introduced some of the mainstays of the functional programming paradigm, namely map, reduce, and filter. While there are efforts to make Java more functional, the fact that you cannot pass functions as objects severely inhibits the use of functional programming paradigms.