### Archive

Archive for the ‘scala’ Category

## Scala – type inferencing gotchas

I’ve written previously about Scala; today I’m going to write about a potential ‘gotcha’ of Scala’s type inferencing.  Type inferencing means that instead of writing
`int x = 100;`

as we would in Java, you can instead write

```val x = 100
```
and the compiler is smart enough to figure out that you mean an integer.  Let’s look at one case in which the use of implicit typing came back to bite me.

## The setup

I was writing a routine that calculates the ‘average’ color of an image (where we define the average color as the color formed by the average red, blue, and green components of all the pixels in the image).  A first pass might look like this:
```def calculateAverageColor(image:BufferedImage):Color = {
var redSum = 0
var greenSum = 0
var blueSum = 0

// calculate the sum of each channel here

val red = (redSum / numPixels)
val green = (greenSum / numPixels)
val blue = (blueSum / numPixels)

new Color(red, green, blue)
}
```
(I’m leaving off the actual summation algorithm, as it’s not important to illustrate how the problem manifests itself.)
This code is almost right, but it has a problem.  Can you guess what it is?

## Problem #1

In a lot of cases this code will work fine.  But if you have an extremely high resolution image, or you have an overexposed image (where many of the pixels are near white, implying a high red, green, and blue value), any of the individual sums can overflow, if the sum exceeds the max value that an integer can hold.  Due to the way binary numbers are implemented (“Twos complement”), this results in a negative number.
```scala> java.lang.Integer.MAX_VALUE
res0: Int = 2147483647

scala> java.lang.Integer.MAX_VALUE+1
res2: Int = -2147483648
```

If this happens, then we end up trying to construct a color with negative red, green, or blue values; this will result in an IllegalArgumentException.

The easiest way to fix this is to use a bigger datatype to store the running sum; let’s use a Long instead.  This allows us a maximum value of 9223372036854775807; that should be plenty big to store whatever running total we have.
```def calculateAverageColor(image:BufferedImage):Color = {
// Declare the sums as longs so we don't have to worry about overflow
var redSum:Long = 0
var greenSum:Long = 0
var blueSum:Long = 0

// calculate the sum of each channel

val red = (redSum / numPixels)
val green = (greenSum / numPixels)
val blue = (blueSum / numPixels)

new Color(red, green, blue)
}
```
This code looks OK at a cursory glance; if you actual use it, you will quickly get an exception:
```java.lang.IllegalArgumentException: Color parameter outside of expected range: Red, Green, Blue
```

## Problem #2

At this point, you might start debugging the process by printing out the values of red, green, and blue.  Sure enough they’ll be in the range [0, 255], just as you need for the Color constructor.  What is going on?

There are two related problems.  The first is that the type of red, green, and blue are not integers, due to the Long value in the computation.  The compiler sees the Long and (correctly) infers that the type of red, green, and blue must be Long.

This wouldn’t be a problem, except that there is no Color constructor that takes in Long arguments.  Yet the code compiles fine.  Why?
Well, it’s because of the implicit widening conversions present in the Java language.  These (usually) make our lives as programmers much easier.  If a method is defined to take in an integer and we pass it a short instead, the short is implicitly converted to an integer with no extra effort on our part.  This is done automatically because there is no danger of truncation; because the type that is implicitly converted to is larger (wider), it can always hold the value of the smaller type.
According to the Sun specification,

The following 19 specific conversions on primitive types are called the widening primitive conversions:

• byte to shortintlongfloat, or double
• short to intlongfloat, or double
• char to intlongfloat, or double
• int to longfloat, or double
• long to float or double
• float to double

The implicit conversion that tripped me up was that longs are implicitly promoted to float.  The reason the method compiled, even though there is no constructor that takes long r,g,b arguments, is because there is a constructor that takes float arguments, and the longs were automatically converted to floats!  See the Color documentation for more. Unlike the integer constructor, which takes RGB values in the range [0,255], the float constructor takes RGB values in the range [0,1].
The solution is to convert the longs back into ints, via a cast.  In Java this would be accomplished with
```long redSum = ...;
int averageRed = (int) (redSum/numPixels);
```
In Scala you instead must do
```val redSum:Long = ...
val averageRed:Int = (redSum/numPixels).asInstanceOf[Int]
```
The fixed code as a whole thus looks like
```	var redSum:Long = 0
var greenSum:Long = 0
var blueSum:Long = 0

// calculate the sum of each channel

val red:Int = (redSum / numPixels).asInstanceOf[Int]
val green:Int = (greenSum / numPixels).asInstanceOf[Int]
val blue:Int = (blueSum / numPixels).asInstanceOf[Int]

new Color(red, green, blue)
```

## Conclusion

One of the nice things about Scala is that you do not need to explicitly declare the types of your variables.  In one sequence of unfortunate events, the variables that looked like ints were in fact longs, leading to an implicit conversion to the float primitive type, which in turn caused the incorrect constructor to be invoked, and an IllegalArgumentException.  Hopefully you can avoid doing something so foolish as a result of reading this post.

## Java gotcha: Splitting a string

CSV, or comma separated value, files are the workhorse of flat file processing. They’re human readable, you can edit them in any spreadsheet editor worth its salt, they’re portable, and they’re easy to parse and read into your programs.  Unfortunately, text you might want to store in the columns might also have commas in it, which complicates the parsing significantly – instead of splitting on all of the commas, you have to know whether the comma is in a quoted string, in which case it doesn’t signify the end of a field … it gets messy, and you’re probably going to get it wrong if you roll your own.  For a quick and dirty script, sometimes it’s better to delimit the columns with a different character, one that comes up much less often in text.  One good choice is the pipe character, |.

Foolishly I processed lines of text like the following in Scala:

```
val FIELD_DELIMITER = "|";
val FILE_NAME_INDEX = 0
val AVG_COLOR_INDEX = 1
val THUMBNAIL_IMAGE_INDEX = 2

def parseLine(indexRow:String) = {
Console.println("Row: " + indexRow)

val entries = indexRow.split(FIELD_DELIMITER)
// extract the file name, average color, thumbnail

}
```

Unfortunately, this code is subtly broken.  The problem is that the String split method does not accept a plain string on which to split – rather, it treats the string as a regular expression.  Since the pipe character has special meaning in regular expressions (namely “OR”), this has the effect of splitting the string on every character, rather than just at the pipe delimiters.

To get around this, you need to tell the regular expression engine that you want to treat the pipe as a literal, which you do by backslash escaping the pipe.  Unfortunately “\|” is not a valid string in Java, because it will attempt to interpret it as a control character (like the newline \n).  Instead, you need to backslash escape the backslash, leaving “\\|”

Conclusion:
Be careful when you’re using String.split in Java/Scala.  You have to be aware that it’s treating your string as a regular expression.  Read this StackOverflow discussion for a better understanding of the pros and cons of Java using regular expressions in many of its core String libraries.

## An introduction to Scala

July 15, 2010 1 comment # Overview

• Introduction
• Why is it gaining popularity, and why should you care?
• History
• Functional programming
• Syntax
• Data structures
• Benefits
• Drawbacks
• Conclusion

# Introduction

• Scala = scalable language.
• From scripts all the way up to enterprise
• Runs on JVM
• Combines functional and object-oriented paradigms

Scala was designed to be both object-oriented and functional. It is a pure object-oriented language in the sense that every value is an object. Objects are defined by classes, which can be composed using mixin composition. Scala is also a functional language in the sense that every function is a value. Functions can be nested, and they can operate on data using pattern matching. Language creator Martin Odersky

# Why?

• Productive like a scripting language, but with type safety and speed
• You can express yourself in fewer lines of code than equivalent Java.
• Well suited to multithreaded environments
• Actors and message passing

## Less boilerplate

``````// Scala
case class Person(firstName: String, lastName: String)

// Java
public class Person implements Serializable {
private final String firstName;
private final String lastName;

public Person(String firstName, String lastName) {
this.firstName = firstName;
this.lastName = lastName;
}

public String getFirstName() {
return firstName;
}

public String getLastName() {
return lastName;
}

public Person withFirstName(String firstName) {
return new Person(firstName, lastName);
}

public Person withLastName(String lastName) {
return new Person(firstName, lastName);
}

public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Person person = (Person) o;
if (firstName != null ? !firstName.equals(person.firstName) : person.firstName != null) {
return false;
}
if (lastName != null ? !lastName.equals(person.lastName) : person.lastName != null) {
return false;
}
return true;
}

public int hashCode() {
int result = firstName != null ? firstName.hashCode() : 0;
result = 31 * result + (lastName != null ? lastName.hashCode() : 0);
return result;
}

public String toString() {
return "Person(" + firstName + "," + lastName + ")";
}
}
``````

Samples where Scala code is significantly shorter than Java

# History

• Martin Odersky created it
• Functional programming and compiler background (wrote current javac reference compiler)
• Worked on adding generics to Java
• Design began in 2001, released in 2003
• Version 2.80 Final released July 14, 2008

# Functional programming

• Avoid state and mutable data
• Every variable is a constant
• Lends itself to parallelization; avoid locking and deadlocks
• First-class functions – functions can serve as arguments and results of functions
• Recursion is primary tool for iteration
• Heavy use of pattern matching
• Lazy evaluation – infinite sequences

# Who’s using Scala

• Twitter uses for performance critical code; replaces Ruby in that regard
• FourSquare uses Scala and Lift web framework
• Xerox uses Scala and Lift
• Etherpad uses Scala in its real time document collaboration software (acquired by Google for Google Docs; Scala code is now opensourced)

# Syntax:

## Hello world in Scala:

``````object HelloWorld extends Application {
Console.println("Hello World")
}
``````

## Variables declared with val and var

``````val x = 0 // immutable
var y = 0 // mutable
x = 1 // compiler error
y = 1 // fine
``````

## Type declarations occur after the variable

``````int x = 0; // Java
val x: Int = 0 // Scala
``````

## Generic type arguments occur within square brackets rather than angle brackets

``````List<Integer> intList = ...; // java
val intList: List[Int] =  ..// scala
``````

## Array indexing is done through use of parentheses rather than square brackets

``````// Java
int[] x = {1,2,3,4};
x // 1

// Scala
val x: Array[Int] = Array(1,2,3,4)
x(0) // 1
``````

## Method definitions

``````scala> def square(x:Int) = { x * x }
square: (Int)Int

scala> square(5)
res38: Int = 25
``````

## No such thing as a ‘void’ method, every method returns a value. void = “Unit” in Scala

``````scala> def printIt(x:Any) = { println(x) }
printIt: (Any)Unit
``````

## Type inferencing – note that I haven’t been declaring what types my methods take

``````// Compiler infers that the return type must be int
scala> def square(x:Int) = { x * x }
square: (Int)Int

// I can explicitly indicate what it returns if I prefer
scala> def squareExplicit(x:Int):Int = { x * x }
squareExplicit: (Int)Int
``````

## No need to explicitly return values

``````scala> def square(x:Int) = { x * x }
square: (Int)Int

scala> def squareExplicit(x:Int):Int = { return x * x }
squareExplicit: (Int)Int

// The result of last expression is returned.
``````

## Conditionals as values

``````// BAD.  Java style, mutable value
var x = 0
if (someCondition) {
x = something
}
else {
x = somethingElse
}

// GOOD: Scala style, immutable
val x =
if (someCondition) {
something
}
else {
somethingElse
}

// or if short enough
val x = if(something) something else (somethingElse)
``````

## All operations are method calls

Scala doesn’t technically have operator overloading, because it doesn’t actually have operators in the traditional sense. Instead, characters such as +, -, *, and / can be used in method names. Thus, when you typed 1 + 2 into the Scala interpreter in Step 1 you were actually invoking a method named + on the Int object 1, passing in 2 as a parameter. As illustrated in Figure 3.1, you could alternatively have written 1 + 2 using traditional method invocation syntax (1).+(2).
“Programming in Scala, p. 39”

### What good does that do me?

If your domain objects make sense to add, multiply, divide, etc., you can treat them in a much more natural way.

``````// BigInt is a wrapper around Java's BigInteger, providing arithmetic methods
scala> BigInt("103058235287305927350") + BigInt("288888203959230529352")
res16: BigInt = 391946439246536456702

// In Java you're stuck calling .add, .multiply, etc
res21: java.math.BigInteger = 391946439246536456702
``````

## Data structures

### Lists

More powerful than Java lists. Expose a whole raft of Functional Programming paradigms, such as map, reduce, filter, folds, etc.

``````// Note the syntax; you do not call "new".  This is because there is
// an apply ("()") method defined in the List object.
scala> val a = List("hello","how","are","you")
a: List[java.lang.String] = List(hello, how, are, you)

// Explicit
val nums:List[Int] = List(1,2,3,4)

scala> a.size
res40: Int = 4

scala> a.mkString(" ")
res41: String = hello how are you

scala> a.map(x => x.toUpperCase())
res42: List[java.lang.String] = List(HELLO, HOW, ARE, YOU)

scala> a.map(x => x.length())
res43: List[Int] = List(5, 3, 3, 3)

scala> a.filter(x => x.startsWith("h"))
res44: List[java.lang.String] = List(hello, how)

scala> a.forall(x => x.length() > 0)
res46: Boolean = true

res67: java.lang.String = hello

scala> a.tail
res68: List[java.lang.String] = List(how, are, you)

scala> 1 :: List(2)
res70: List[Int] = List(1, 2)

// reverse
scala> a.reverse
res72: List[java.lang.String] = List(you, are, how, hello)
``````

### Tuples

Tuples are a good data structure to pack up multiple variables to return from a function, without the need to define a helper class to do so. Slightly syntax to create them, as well as to access the elements contained therein.

``````// Explicitly create the tuple; never really a need to do this
scala> val a:Tuple2[Int,Int] = (5,6)
a: (Int, Int) = (5,6)

// The parentheses indicate it's a tuple
scala> val b = (5,6)
b: (Int, Int) = (5,6)

// Yes, I know it's a strange syntax.  And it's 1-based rather than 0-based.  This is because Haskell has 1-based tuple ordering and Odersky wanted to stay true to that.
scala> a._1
res48: Int = 5

scala> a._2
res49: Int = 6

// You can have an arbitrary number of elements, of any type you want
val listTuple = (List(1,2,3), "hello", new java.awt.Color(255,0,0))
scala> val listTuple = (List(1,2,3), "hello", new java.awt.Color(255,0,0))
listTuple: (List[Int], java.lang.String, java.awt.Color) = (List(1, 2, 3),hello,java.awt.Color[r=255,g=0,b=0])
``````

### Maps

Scala provides immutable maps as well as a syntax for creating map literals.

``````scala> val theMap = Map("hello"->5,"world"->5)
theMap: scala.collection.immutable.Map[java.lang.String,Int] = Map(hello -> 5, world -> 5)

scala> theMap("hello")
res50: Int = 5

scala> theMap.getOrElse("world",2)
res51: Int = 5

scala> theMap.getOrElse("worldly",2)
res52: Int = 2

scala> theMap.keySet
res54: scala.collection.Set[java.lang.String] = Set(hello, world)

scala> theMap.values
res55: Iterator[Int] = non-empty iterator

scala> theMap.values.toList
res56: List[Int] = List(5, 5)

// Mutable map:
var mutableMap = Map[Int, String]()
mutableMap += (1 -> "1")
mutableMap += (2 -> "2")
``````

### Arrays

Internally map to Java arrays. Idiomatic Scala code tends to prefer Lists to Arrays, but they are necessary for interacting with a lot of Java code. Create in same way as lists.

``````scala> val array = Array(1,2,3)
array: Array[Int] = Array(1, 2, 3)
``````

# Benefits

• Static type checking but with improved type inference

// Java
private Map wordCount = new HashMap();
// Scala:
var wordCount:Map[String, Int] = Map()

• Much faster than Python/Ruby and other interpreted languages, but just as expressive

• Duck typing support
• Conciseness
• Functions as first class objects; no need for anonymous classes and extraneous interfaces Unified types
• Traits are a cross between interfaces and abstract classes, allowing you to provide default implementations. Allows for ‘mix-in’ composition, as well. Solves problems with multiple inheritance, while giving flexibility. Avoid repeating yourself.
• Console REPL environment allows you to quickly iterate and test
• Of theoretical interest
• Combines functional and object-oriented in statically-typed environment

# Drawbacks/Complaints

• Immaturity of tool chain
• Binary/source incompatibility from rapid change of language
• Type system is very complex
• Functional programming can be hard to wrap head around for people familiar with OOP
• Syntax can be so concise as to be cryptic

For instance, the following example comes from StackOverflow post.

``````def scanLeft[a,b](xs:Iterable[a])(s:b)(f : (b,a) => b) =
xs.foldLeft(List(s))( (acc,x) => f(acc(0), x) :: acc).reverse
``````

And then use it like this:

``````scala> scanLeft(List(1,2,3))(0)(_+_)
res1: List[Int] = List(0, 1, 3, 6)
``````

# Conclusion

Scala is a very intriguing language due to its combination of object orientation with functional programming concepts. The programmer is not forced to choose between the expressiveness of objects and immutability; he can switch between the two paradigms as the situation warrants.

While binary/source incompatibility is a very real problem for enterprise adoption, it is an issue that the creators are working on.

Scala provides a great framework in which to explore functional programming methods, especially through the use of the interactive console environment.

# Images:

## Interview with Twitter dev explaining use of Scala

So what were our criteria for choosing Scala? Well first we asked, was it fast, and fun, and good for long-running process? Does it have advanced features? Can you be productive quickly? Developers of the language itself had to be accessible to us as we’d been burned by Ruby in that respect. Ruby’s developers had been clear about focusing it on fun, even sometimes at the expense of performance. They understood our concerns about enterprise-class support and sometimes had other priorities.

We wanted to be able to talk to the guys building the language, not to steer the language, but at least to have a conversation with them.

There’s a new, good post interviewing one of the lead back-end engineers of Twitter.  I knew that they had replaced some Ruby code with Scala due to performance concerns, but there’s new information in here to me, such as all the systems it’s used in.  It also gives a great summary of Scala’s benefits as a language. He makes a good point that you can easily hire Java programmers and train them to use Scala; it’s a lot harder to teach/hire a Haskell programmer, for instance.

## Car Talk Puzzler #3: The Palindromic Odometer

The Palindromic Odometer

Driving along, Terry notices that the last four digits on the odometer are palindromic. A mile later, the last five digits are palindromic. [A mile later, the middle four digits are palindromic.] [One mile after that], all six are palindromic. What was the odometer reading when Terry first looked at it?

This is the third in my series of Car Talk puzzler solutions using whatever programming language strikes my fancy. Just like the last one I wrote, this one uses Scala. This is a much shorter post, as I am assuming some familiarity with Scala; go back and read the last post if you haven’t already.

```object PalindromicPuzzlerSolver {
// Pad out numbers to 6 places
// Note that I can cal Java library functions seamlessly
val df = new java.text.DecimalFormat("000000")

// A string is a palindrome if it reversed equals itself
def isPalindrome(word: String): Boolean = {
// Reverse is a method of a RichString, which a String can be implicitly converted
// into. Necessary to convert the RichString back into String after the reversal
word == word.reverse.toString()
}

def fitsDescription(miles: Int): Boolean = {
// Last 4 are palindome (dropping first 2)
isPalindrome(df.format(miles).drop(2)) &&
// Add one, last 5 are palindrome
isPalindrome(df.format(miles + 1).drop(1)) &&
// Add two more, all six are a palindrome
isPalindrome(df.format(miles + 3))
}

def main(args: Array[String]) {
val max = 1000000
println( 0 until max filter fitsDescription )
// Prints out
// RangeF(198888, 199999)

// This is the same as the java style:
// println( (0.until(max)).filter(fitsDescription) )
// This syntax only works when you are dealing with
// methods that take one argument.
}
}
```

Thus the solution is 198888, just as the answer proclaims. Another win for brute force.

There are three noteworthy things in this example, which didn’t come up in the last post (or weren’t explained fully)

1. Interoperability with Java
2. Use of drop
3. Convenience methods for specifying a range

## Interoperability with Java

Any Java class can be imported and used with Scala; that is one of its strongest selling points. Scala runs in the same JVM as Java, so it makes sense that it should have access to Java classes. In fact many Scala classes are thin wrappers around Java objects, with syntactic sugar built in. For instance, Scala has a BigInt class which wraps a Java BigInteger, while providing overloaded methods to allow them to be used more like standard numbers. For instance,

```scala> BigInt("103058235287305927350") + BigInt("288888203959230529352")
res16: BigInt = 391946439246536456702

res21: java.math.BigInteger = 391946439246536456702
```

Note that the + is actually a method call; this is exactly the same thing as calling

```scala> BigInt("103058235287305927350").+(BigInt("288888203959230529352"))
res22: BigInt = 391946439246536456702
```

In Scala, method names don’t have to be alphanumeric, which is how it gets away with what looks like operator overloading. This makes it a lot nicer to work with classes like BigDecimal or BigInteger, as you can use the more commonly used addition and subtraction symbols rather than explicitly calling “add” and “subtract” methods. Just realize that you are really calling methods like any other under the hood. When you understand this, you’ll also understand why Scala uses an underscore instead of * for wildcard – * can be used in method names, as it is used for multiplication!

## Drop

Scala has full support for slicing iterable objects; see my previous discussion on slices if this term is unfamiliar.

```scala> List(1,2,3,4).slice(1,3)
res27: List[Int] = List(2, 3)

scala> List(1,2,3,4).slice(1)
warning: there were deprecation warnings; re-run with -deprecation for details
res28: Seq[Int] = List(2, 3, 4)

scala> List(1,2,3,4).drop(1)
res29: List[Int] = List(2, 3, 4)
```

So drop(n) is exactly the same as slice(n) – it provides a new list with the first n elements removed. I use drop instead of slice due to the deprecation of the single argument form of slice.

Specifying a range
If you need to iterate over a range of numbers in Java, you’d do something like
for (int i = 0; i < max; i++) {
}

whereas in Python you can do something like
for x in range(max):

Scala similarly provides methods to create a range of numbers to iterate over.

```scala> 0 until 10
res9: Range = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> 0 to 10
res10: Range.Inclusive = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
```

Again, I want to stress that until is a method like any other; you are just free to leave off the . and the parens when it is unambiguous what you mean. Note that these commands are actually creating a sequence in memory. If you only need to access each element one at a time, and you are dealing with a large range, you might instead create an Iterator rather than a sequence. The Iterator will create each subsequent number as necessary, rather than building up the whole list ahead of time. This is much more space efficient.

Note that Python makes this distinction as well:

```>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> xrange(10)
xrange(10)
```

xrange returns a generator object, as opposed to range which returns the actual list.

Returning to Scala, if I wanted an iterator instead I would do

```scala> Iterator.range(0,10)
res11: Iterator[Int] = non-empty iterator

scala> Iterator.range(0,10).foreach(println)
0
1
2
3
4
5
6
7
8
9
```

Conclusion
I have shown another solution using Scala, while illustrating a few more features. In particular one needs to understand that Scala’s syntactic sugar is actually masking standard function calls under the hood; x + y is the same as x.+(y), where “+” is a valid function name.

## Car Talk Puzzler #2: What do these words have in common? An imperative and functional approach using Java and Scala

May 15, 2010 1 comment

Here’s the second in my series of Car Talk puzzler solutions. Today I’m going to be illustrating a solution first in Java, and then in Scala in order to show how functional programming features  can drastically reduce the number of lines of code, while still expressing the developer’s intent.

Puzzler:
What do these words have in common: pig, table, cab, real, yet, and ride?

RAY: Here’s the answer. If you notice, the first letter in each of the words is from farther down in the alphabet than all the subsequent letters. And if we assigned a number to every letter in the alphabet, so that A is one, B is two, C is three and so on…

TOM: Yeah, yeah, yeah.

RAY: Then the numerical place in the alphabet of the first letter is equal to the sum of the numerical places of all the other letters. For example, Table: T is 20, A is 1, B is 2, L is 12, E is 5, so 1 plus 2 plus 12 plus 5 –

TOM: Is 20.

So, we have the answer.  But as the guys mention on the show, there are probably other words as well.  Let’s do a brute force search through the dictionary to find all such words.  The dictionary I”ll be using is available here, as a new line separated list of words, one per line.  This makes it very easy to deal with.

Here’s the rather straightforward solution in Java:

``` import java.io.FileReader;
import java.io.FileNotFoundException;
import java.io.IOException;

/**
* Determines all the words such that the first letter's ordinal position
* equals the sum of the remaining letters' ordinal positions.  For instance,
* "table"'s ordinal numbers are [20,1,2,12,5], and 20 = 1+2+12+5.  Thus
* "table" has this property.
*/
public class OrdinalPositionPuzzlerSolver {

/**
* Assumes the string consists just of A-Z, a-z.
*/
public static boolean hasProperty(String s) {
if (s.length() < 2) {
return false;
}
char[] letters = s.toCharArray();
int sum = 0;
for (int i = 1; i < letters.length; i++) {
int ordinal = ordinal(letters[i]) ;
sum += ordinal;
}
return sum == ordinal(letters);
}

/**
* @return the 1 based ordinal position of the letter.  For instance,
* a is 1, b is 2, etc.
*/
public static int ordinal(char letter) {
return Character.toLowerCase(letter) - 'a' + 1;
}

public static void main(String[] args) {
String dictionaryPath = "dict.txt";
/**
http://www.javapractices.com/topic/TopicAction.do?Id=42
*/
try {
try {
String line = null;
while (( line = input.readLine()) != null){
if (hasProperty(line)) {
System.out.println(line);
}
}
}
catch (FileNotFoundException e) {
System.err.println("Error, file " + dictionaryPath + " didn't exist.");
}
finally {
input.close();
}
}
catch (IOException e) {
System.err.println("IO Error:" + e);
}

}
}
```

The only place that bears some explanation might be the ordinal function:

```/**
* @return the 1 based ordinal position of the letter.  For instance,
* a is 1, b is 2, etc.
*/
public static int ordinal(char letter) {
return Character.toLowerCase(letter) - 'a' + 1;
}
```

Basically, letters (characters) in Java are really just numbers under the hood which get mapped back into letters when printed to the screen.  So, we can do arithmetic using letters.  ‘a’ + 1 = ‘b’, ‘b’ + 2 = ‘d’, and so on.  (This only works because the letters a-z and A-Z are mapped to a continuous range of integers).  So if we want to convert a->1, first we figure out a’s position from the start of the alphabet (‘a’ -‘a’ == 0, it’s the first), and add 1 to get the 1-based number.

The other thing to notice is how much junk code there is just to read the lines from our file:

```try {
try {
String line = null;
while (( line = input.readLine()) != null){
if (hasProperty(line)) {
System.out.println(line);
}
}
}
catch (FileNotFoundException e) {
System.err.println("Error, file " + dictionaryPath + " didn't exist.");
}
finally {
input.close();
}
}
catch (IOException e) {
System.err.println("IO Error:" + e);
}
```

Believe it or not, this is relatively idiomatic Java code to accomplish this task.  First we wrap a file reader in a buffered reader so that we do not continuously access the file; rather we read chunks into the file and buffer them for in-memory access.  That’s 20 lines of code just to iterate over the lines of a file, determine if they match a pattern, and print it to the terminal.

It really should not be this hard.

As I’ve said before, Java’s verbosity is one of my biggest complaints about it.  The program works, and it’s relatively straightforward, but it can’t really be described as elegant.  In that same post I mentioned my interest in Scala as an alternative to Java.  It’s a compiled language that runs in the JVM, but through its type inferencing and functional language aspects, it is much more concise; the creator Martin Odersky claims a typical Source Line of Code (SLOC) reduction of 2x between equivalent Scala and Java programs. Let’s reimplement the program in Scala, and see if this claim rings true.  (Note: I’m not an expert in scala so there’s probably even more concise, idiomatic ways to accomplish what I’m trying to do.  But let’s start somewhere).

The syntax of scala is very similar to that of Java.  One of the main differences is that type declarations occur after the variable rather than before.

``` int x = 0; // Java
val x: Int = 0 // Scala
```

Another syntactic difference is that generic type arguments occur within square brackets rather than angle brackets

``` List<Integer> intList = ...; // java
val intList: List[Int] =  ..// scala
```

Because of this, array indexing is done through use of parentheses rather than square brackets.

``` // Java
int[] x = {1,2,3,4};
x // 1
```
``` // Scala
val x: List[Int] = List(1,2,3,4)
x(0) // 1
```

Due to type inferencing, you often don’t need to declare the type of your variables; the compiler can figure it out.  Thus the last declaration could be rewritten as

``` val x = List(1,2,3,4)
```

I highly recommend you watch Martin Odersky’s introduction to Scala video; it’s about an hour long and goes over the design considerations and differences from Java.

There are no static functions in Scala; rather you create a singleton object through the keyword ‘object’ and access the method in the same way. Finally, there is a read/eval/execute interpreter built-in, accessible by calling ‘scala‘ in the command line.  This allows you to experiment without having to explicitly compile any code; it allowed me to create the pieces iteratively before plunking them into a file for later compilation.

``` scala
Welcome to Scala version 2.7.3.final (Java HotSpot(TM) Client VM, Java  1.5.0_22).
Type in expressions to have them evaluated.

scala> 1+1
res0: Int = 2

scala> print("hello world")
hello world
scala>
```

OK, enough beating around the bush.  Let’s solve this puzzler!

We’ll tackle this in 3 parts.
1) Summing a list of numbers
2) Completing the ordinal number function
3) Iterating over all the lines in the dictionary file

## Summing a list of numbers

A first pass at a sum function might be as follows:

```def sum(nums: Iterable[Int]): Int = {
var sum = 0
for (num <- nums) {
sum += num
}
sum
}
```

Note the declaration of sum as a var as opposed to a val; a val is equivalent to a final value while a var can be reassigned.  Why did I declare the nums argument to be of type Iterable rather than List?  Well, a List is iterable.  So is an array.  So are any number of things.  We want to keep this function as generic as possible. Pasting this into our interpreter, we can test this out.

```scala> def sum(nums: Iterable[Int]): Int = {
|     var sum = 0
|     for (num <- nums) {
|         sum += num
|     }
|     sum
| }
sum: (Iterable[Int])Int

scala> sum(List())
res4: Int = 0

scala> sum(List(1,2,3,4,5))
res5: Int = 15

scala> sum(Array(1,2,3,4))
res6: Int = 10
```

This works, but let’s look at another way to solve this problem – a functional rather than imperative solution.  Java is inherently imperative, meaning you’ll see a lot of explicit while and for loops.  Here’s just a quick excerpt of each the wikipedia articles to define each programming paradigm:

In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state.

In computer science, imperative programming is a programming paradigm that describes computation in terms of statements that change a program state. In much the same way that imperative mood in natural languages expresses commands to take action, imperative programs define sequences of commands for the computer to perform.

So whereas the imperative manner is completely explicit in how the sum is created, using a mutable state variable (sum), a more functional approach will eliminate the use of mutable state altogether.

The reduce function is absolutely crucial to understand in order to program in a functional manner; let’s examine Python’s definition of it before applying it to the problem at hand.

Here is the Python definition of reduce function:

reduce(function, iterable[, initializer])

Apply function of two arguments cumulatively to the items of iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y:x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). The left argument, x, is the accumulated value and the right argument, y, is the update value from the iterable. If the optional initializer is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty. If initializer is not given and iterable contains only one item, the first item is returned.

And the description of the built-in sum function is as follows:

sum(iterable[, start])

Sums start and the items of an iterable from left to right and returns the total. start defaults to 0. The iterable‘s items are normally numbers, and are not allowed to be strings. The fast, correct way to concatenate a sequence of strings is by calling ”.join(sequence). Note that sum(range(n), m) is equivalent to reduce(operator.add,range(n), m) To add floating point values with extended precision, see math.fsum().

While Scala doesn’t have a built-in sum function, it does have support for reducing a list through use of a function.  It has both a reduce left (left to right reduction) and reduce right (right to left reduction).  Here is Scala’s definition of reduceLeft:

def reduceLeft[B >: A](op : (B, A) => B) : B
Combines the elements of this iterable object together using the binary operator op, from left to right
Notes Will not terminate for infinite-sized collections.
Parameters
op – The operator to apply
Returns op(… op(a0,a1), …, an) if the iterable object has elements a0, a1, …, an.
Throws
Predef.UnsupportedOperationException – if the iterable object is empty.

This might be a little confusing but bear with me.

First, we need to write a binary function for our sum operator – after all, we want to pairwise reduce the list by using an add operator.  A first pass is as follows:

```scala> def add(x: Int, y: Int): Int = { x + y }

res12: Int = 10
java.lang.UnsupportedOperationException: Nil.reduceLeft
at scala.List.reduceLeft(List.scala:1093)
at .(:6)
at .()
at RequestResult\$.(:3)
at RequestResult\$.()
at RequestResult\$result()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethod...
res14: Int = 0
```

Note that you cannot call sum on an empty list any more; doing so results in an UnsupportedOperationException.  This is arguably more correct than returning 0, but we will have to ensure that we don’t attempt to sum an empty iterable object.

So a working sum definition would be

```def sum(nums: Iterable[Int]): Int = {
def add(x: Int, y: Int): Int = { x + y }
}
```

Note that you can nest function definitions, as functions are objects in Scala, and thus can be defined wherever a variable could.  Note too that I never have to explicitly return a value; the last evaluated value is implicitly returned by the function.

Finally, if you want to be even more terse, you can replace the explicitly defined add function with an anonymous one:

```def sum(nums: Iterable[Int]): Int = {
nums.reduceLeft(_+_)
}
```

Read the Daniel Spiewak’s excellent blog post for a better explanation for what’s going on here.  Here’s a quick explanation:

(_+_) is actually a Scala shorthand for an anonymous function taking two parameters and returning the result obtained by invoking the + operator on the first passing the second.  Less precisely: it’s addition.  Likewise, (_*_) would define multiplication.

## Completing the ordinal number functional

Recall the function that we had in Java for computing the ordinal position of each letter:

```    public static boolean hasProperty(String s) {
if (s.length() < 2) {
return false;
}
char[] letters = s.toCharArray();
int sum = 0;
for (int i = 1; i < letters.length; i++) {
int ordinal = ordinal(letters[i]) ;
sum += ordinal;
}
return sum == ordinal(letters);
}

/**
* @return the 1 based ordinal position of the letter.  For instance,
* a is 1, b is 2, etc.
*/
public static int ordinal(char letter) {
return Character.toLowerCase(letter) - 'a' + 1;
}
```

Again, we could directly translate this code into Scala but doing so would not be making full use of Scala’s functional underpinnings.  We are converting from characters of the string to their ordinal positions in the alphabet.  While the Java code does this one letter at a time, keeping a sum, it’s just as correct to do the entire mapping process and then sum over the new list.  Pictorially: Just as there is a reduce function to convert an iterable object into a single entry, so too is there a function to convert an Iterable of one type into an Interable of another; it is the map function.

```// The underscore in an anonymous function like  this is a placeholder for the entry in the iterable collection currently  being processed
scala> List(1,3,4).map(_+5)
res19: List[Int] =  List(6, 8, 9)
scala> List("hi","there").map(_.toUpperCase())
res22:  List[java.lang.String] = List(HI, THERE)
```

The above examples map from Int-> Int and String->String but there’s nothing stopping us from converting types.

```scala> List(100,300,400).map(_.toString())
res23:  List[java.lang.String] = List(100, 300, 400)
```

Finally, let’s apply the ordinal function to a string:

```scala> "cat".map(Character.toLowerCase(_) - 'a' +  1)
res24: Seq[Int] = ArrayBufferRO(3, 1, 20)
```

Thus the entire function for determining if the word is a match is

[sourecode language=”scala”]
/**
* Determines if the word has the desired property of the ordinal position
* of first letter equalling the sum of the ordinal positions of the other
* letters
* Expects letters in the range [a-zA-Z];
*/
def matches(word: String): Boolean = {
if (word.length() < 2) {
false
}
else {
val ordinals = word.map(Character.toLowerCase(_) – 'a' + 1)
val ordinalSum = sum(ordinals.drop(1))
ordinals(0) == ordinalSum
}
}
[/sourecode]

## Iterating over the lines in a file

Remember how it took 20 lines to iterate over all the lines in a file?  Fortunately Scala makes it incredibly easy:

```val lines =  scala.io.Source.fromFile("dict.txt").getLines
```

The variable is actually not an array or a list; it is an iterator over the lines in the file.  But iterators are still iterable, so all of our previously mentioned functions for dealing with iterable objects still work.  Again, rather than looping over each element, explicitly testing, and printing whether it matched, we will use a functional approach, namely the ‘filter’ function.

```scala>  List(1,2,3,4).filter(_%2==0)
res25: List[Int] = List(2, 4)
```

The function passed into filter must return Boolean; since our matches() function does that, we can insert it right in:

```val lines =  scala.io.Source.fromFile("dict.txt").getLines
val matching =  lines.filter(matches)
matching.foreach(print(_))
```

When we run the complete code, we get the following list of matches:

• acknowledgements
approximation
bumptiously
capitalizations
circuitously
conceptualizing
consciousness
consistently
credulousness
crystallizing
disconcertingly
discriminatory
electrocutions
excommunication
existentialism
experimentally
exploitations
expressively
extraneously
governmentally
impersonations
incomprehensibly
indispensability
initializations
intercommunicated
interpolations
notwithstanding
organizationally
oversimplifying
phenomenologically
sportswriting

Wait a minute… what’s going on here?  These obviously don’t follow the pattern.  Let’s open a terminal and do some testing.

First we paste our two helper method definitions in

``` Welcome to Scala version 2.7.3.final (Java  HotSpot(TM) Client VM, Java 1.5.0_22).
Type in expressions to have  them evaluated.

scala>  def sum(nums: Iterable[Int]): Int = {
|          nums.reduceLeft[Int](_+_)
|     }
sum: (Iterable[Int])Int

scala>

scala>     /**
|      * Determines  if the word has the desired property of the ordinal position
|      * of first letter equalling the sum of the ordinal positions of  the other
|      * letters
|      * Expects letters in  the range [a-zA-Z];
|      */

scala>     def  matches(word: String): Boolean = {
|         val ordinals =  word.map(Character.toLowerCase(_) - 'a' + 1)
|         val  ordinalSum = sum(ordinals.drop(1))
|         ordinals(0) ==  ordinalSum
|     }
matches: (String)Boolean

scala> matches("table")
res0: Boolean = true

scala>  matches("chair")
res1: Boolean = false

scala>  matches("consistently")
res2: Boolean = false
```

This is very strange.  “consistently” is one of the words on the list, but it doesn’t pass the test on its own.  In fact, none of them do.

What’s the problem?  After googling and debugging for a bit, I realized that the precondition I was assuming in my matches method was being violated; the letters were not in the range [A-Za-z].  How could that be?  Well, it turns out that the getLines() method returns the lines in the file – with new lines included!

```scala> matches("consistently\r\n")
res5: Boolean = true
```

(The file had been created on Windows, and thus has a carriage feed as well as a new line marker).

Fortunately there is a method to eliminate whitespace (including control characters like these); it’s known as trim.

We can put the trim in one of two places, either in the matches() method itself,

```def matches(word: String): Boolean = {
val ordinals = word.trim().map(Character.toLowerCase(_) - 'a' + 1)
val ordinalSum = sum(ordinals.drop(1))
ordinals(0) == ordinalSum
}

scala> matches("consistently\r\n")
res6: Boolean = false

scala> matches("consistently")
res7: Boolean = false

```

Or we can introduce the trim while iterating over the lines in the collection.  I’ve opted for this second approach; I’m making it a precondition of calling the method matches() that the input is already sanitized.  We introduce an implicit mapping from the string to the trimmed version of the string:

```// An iterator of lines in the file; note that they have newline control characters
val lines = scala.io.Source.fromFile("dict.txt").getLines
// Filter the lines, eliminating the new lines through trim mapping
val matching = lines.filter(i => matches(i.trim()))
matching.foreach(print(_))
```

After making the change, the result is

• cab
• hag
• leaf
• leg
• maced
• mica
• mid
• need
• pig
• real
• ride
• same
• sand
• seam
• sod
• table
• toe
• vial
• ward
• whack
• who
• wick
• win
• yeas
• yet
• zebra
• zinc

The final version of the program is

```object OrdinalPositionSolver {
/**
* Determines if the word has the desired property of the ordinal position
* of first letter equalling the sum of the ordinal positions of the other
* letters
* Expects letters in the range [a-zA-Z];
*/
def matches(word: String): Boolean = {
if (word.length() < 2) {
false
}
else {
val ordinals = word.map(Character.toLowerCase(_) - 'a' + 1)
val ordinalSum = sum(ordinals.drop(1))
ordinals(0) == ordinalSum
}
}

def sum(nums: Iterable[Int]): Int = {
nums.reduceLeft[Int](_+_)
}

def main(args: Array[String]) {
// An iterator of lines in the file; note that they have newline control characters
val lines = scala.io.Source.fromFile("dict.txt").getLines
// Filter the lines, eliminating the new lines through trim mapping
val matching = lines.filter(i => matches(i.trim()))
matching.foreach(print(_))
}
}
```

## Conclusion

I have solved the puzzler in two different ways, once in an imperative Java style and once in a more functional Scala style.  The functional way is extremely concise and expressive, doing in 30 lines of code what it takes the Java code to do in over 60, confirming the claim that a typical Scala program is at least twice as compact as the corresponding Java program.  I have introduced some of the salient syntactic differences between Java and Scala. I have also introduced some of the mainstays of the functional programming paradigm, namely map, reduce, and filter.  While there are efforts to make Java more functional, the fact that you cannot pass functions as objects severely inhibits the use of functional programming paradigms.

## Code bloat and its inevitability in Java

[I]f you begin with the assumption that you need to shrink your code base, you will eventually be forced to conclude that you cannot continue to use Java. Conversely, if you begin with the assumption that you must use Java, then you will eventually be forced to conclude that you will have millions of lines of code.

Steve Yegge has an excellent rant on code bloat and its inevitability when using Java.  He makes some very cogent points about refactoring and design patterns, using the metaphor of spring cleaning.  Refactoring and design patterns can be seen as neatly organizing your stuff within compartments in a closet, but if you aren’t able to throw things out, you never will really have a clean code base.  Furthermore, he argues that Java necessitates code bloat and duplication, as it lacks certain features that are necessary to really shrink code bases (lambda functions, closures).

He concludes his article by considering some alternative languages that run on the JVM; it’s an older article (from 2007) and he choose Rhino as his language of choice to rewrite his game in (currently 500 thousand lines of Java code).  I’d be interested to see if he were to revisit this article if he would choose Scala instead. It seems to be a natural fit for what he’s trying to do; it runs in the JVM but adds functional programming features that would surely reduce his source code line count. In the comments many people suggested Scala, but he dismisses them by saying

“Scala folks and Groovy folks: you’re not big enough yet. For something as big as my game, I want a proven mainstream language. I picked Rhino as a complicated multidimensional compromise; the actual reasons are a full blog post. But the short answer is “you’re not big enough.” Sorry.

Given Scala’s somewhat more mainstream position these days, I think he’d be forced to reconsider.

This article hits home for me, because the verbosity of Java is a thing I struggle with on a daily basis; it is definitely one of my biggest complaints about the language, and it makes it a breath of fresh air to use something more expressive and concise, like Python.

Categories: Java, scala Tags: , ,