Posts Tagged ‘color’

Data Visualization – Size of NFL Football Players Over Time

June 5, 2014
1920 height/weight of NFL players

Screenshot from – 1920

2014 height/weight of NFL players

Screenshot from – 2014

I love Noah Veltman’s visualization of the changing height and weight distribution of professional football players. It uses animation to convey the incredible increase in size of the typical football player, and it does so with a minimal amount of chart junk. Let’s look at two aspects that make this effective.

It uses the appropriate visualization

There are 4 variables plotted on the graph – height, weight, density, and time. Two of the variables are encoded in the axes of the chart. The time dimension is controlled by the slider (or by hitting the play button). The density is represented by the color on the chart.

You could present this data as a table, but it would be much harder to see the pattern that the animation conveys so simply: not only are players getting bigger in terms of both height and weight, but the variance is increasing as well.

It makes good use of color

It uses color appropriately, by varying the saturation rather than the hue. I’ve blogged about this topic before when discussing the Wind Map. To repeat my favorite quote about this, Stephen Few states in his PDF “Practical Rules for Using Color in Charts”:

When using color to encode a sequential range of quantitative values, stick with a single hue (or a small set of closely related hues) and vary intensity from pale colors for low values to increasingly darker and brighter colors for high values
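To make Few's rule concrete, here is a minimal Java sketch of a single-hue sequential scale. The class name, the hue of 0.6 (a blue on java.awt.Color's hue wheel), and the saturation and brightness endpoints are all illustrative assumptions; none of them are taken from Veltman's chart.

```java
import java.awt.Color;

class SequentialScale {
    // Hypothetical hue: 0.6f is a blue on java.awt.Color's 0..1 hue wheel.
    static final float HUE = 0.6f;

    /** Maps a normalized density t in [0, 1] to a color with a fixed hue. */
    static Color colorFor(double t) {
        float clamped = (float) Math.max(0.0, Math.min(1.0, t));
        // Low values are pale (low saturation, full brightness);
        // high values are more saturated and darker.
        float saturation = 0.1f + 0.9f * clamped;
        float brightness = 1.0f - 0.4f * clamped;
        return Color.getHSBColor(HUE, saturation, brightness);
    }

    public static void main(String[] args) {
        // Print five evenly spaced steps of the ramp as hex codes.
        for (int i = 0; i <= 4; i++) {
            double t = i / 4.0;
            Color c = colorFor(t);
            System.out.printf("t=%.2f -> #%06x%n", t, c.getRGB() & 0xFFFFFF);
        }
    }
}
```

Only saturation and brightness change with the value, so a viewer never has to decide whether one hue is "bigger" than another.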


I could imagine extending this visualization in a few ways:

  • Allow users to view the players that match a given height/weight combination (who exactly are the outliers?)
  • Allow restricting the data to a given position (see how quarterbacks’ height/weight are distributed vs those of the offensive line)
  • Compare against some other normalized metrics, such as rate of injury. Is there a correlation?

This is a great data visualization because it tells a story and it spurs the imagination towards additional areas of analysis and research.

Wind Map – a visualization to make Tufte proud

June 9, 2012

Wind map picture

Edward Tufte is a noted proponent of designing data-rich visualizations. His books, including the seminal The Visual Display of Quantitative Information, have influenced countless designers and engineers. When I first saw Fernanda Viégas and Martin Wattenberg’s Wind map project via Michael Kleber’s Google+ post, I immediately became entranced with it. After studying it for some time, I feel that the designers must have been intimately familiar with Tufte’s work. Let us examine how this triumph of data visualization succeeds.

Minimalist and data dense

Tufte describes the data density of charts based on the amount of information conveyed per measure of area. There are two ways of increasing data density – increasing the amount of information conveyed, and decreasing the amount of non-essential pixels in the image.

No chart junk

You’ll immediately notice what’s not in the image: there’s no compass rose, no latitude or longitude lines, and no other grid lines separating the map from the rest of the page. There aren’t even dividing lines between the states. It isn’t a map about political boundaries at all, so this extra information would only detract from the data being conveyed.

More info

This map conveys two variables, wind speed and wind direction, for thousands of points across the United States. A chart conveying the same information would take far more space and the viewer would have no way of seeing the patterns that exist.

Does not abuse color

In the hands of less restrained designers, this map would be awash in color. You see this often in weather maps and elevation maps, as illustrated below:

Snowfall example
Egregious elevation map
Egregious elevation map 2

The problem is that it is difficult to place colors in a meaningful order quickly. Yes, there is the standard ROYGBIV ordering of the rainbow, but it’s difficult to apply at a glance. Quick – what’s ‘bigger’ – orange or mauve? How about pink or green? Yellow or purple? It is much easier to compare colors based on their saturation or intensity than on their hue. Color is great for categorical differences, but not so great for conveying quantitative information. Stephen Few sums it up nicely in his great PDF “Practical Rules for Using Color in Charts”:

When using color to encode a sequential range of quantitative values, stick with a single hue (or a small set of closely related hues) and vary intensity from pale colors for low values to increasingly darker and brighter colors for high values

The designers use five shades of gray, each of which is distinguishable from the others, rather than a rainbow of colors. Five options is a nice tradeoff between granularity and ease of telling the shades apart.
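As a rough illustration of that tradeoff, the following Java sketch quantizes a wind speed into five evenly spaced grays. It is not the Wind Map's actual code: the class name, the 30 mph upper bound, and the choice that faster wind is lighter are all assumptions made for the example.

```java
import java.awt.Color;

class GrayBins {
    static final int BINS = 5;
    // Hypothetical top of the scale, in mph; not taken from the Wind Map.
    static final double MAX_SPEED = 30.0;

    /** Returns the bin index (0 to 4) for a wind speed. */
    static int binFor(double speed) {
        double clamped = Math.max(0.0, Math.min(MAX_SPEED, speed));
        int bin = (int) (clamped / MAX_SPEED * BINS);
        return Math.min(bin, BINS - 1); // MAX_SPEED itself belongs to the top bin
    }

    /** Evenly spaced grays; here faster wind is lighter (an assumption). */
    static Color grayFor(double speed) {
        int level = 255 * binFor(speed) / (BINS - 1);
        return new Color(level, level, level);
    }

    public static void main(String[] args) {
        for (double speed : new double[] {0, 7, 15, 22, 30}) {
            System.out.println(speed + " mph -> bin " + binFor(speed)
                    + ", " + grayFor(speed));
        }
    }
}
```

With only five levels, adjacent shades differ by roughly a quarter of the full gray range, which is what keeps them easy to tell apart.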

Excellent use of the medium

In a print medium, the shades of gray would have had to suffice to illustrate how fast the wind was moving. In this medium, the designers used animation to illustrate the speed and direction of the wind in a truly mesmerizing way.


This visualization does a lot of things right. In particular, it uses a great deal of restraint in conveying the information. Unlike some of the other examples I showed, it does not have extra chart junk wasting space, it does not abuse color to try to convey quantitative information, and it is absolutely aesthetically pleasing.

Scala – type inferencing gotchas

January 24, 2011
I’ve written previously about Scala; today I’m going to write about a potential ‘gotcha’ of Scala’s type inferencing.  Type inferencing means that instead of writing
int x = 100;

as we would in Java, you can instead write

val x = 100
and the compiler is smart enough to figure out that you mean an integer.  Let’s look at one case in which the use of implicit typing came back to bite me.

The setup

I was writing a routine that calculates the ‘average’ color of an image (where we define the average color as the color formed by the average red, blue, and green components of all the pixels in the image).  A first pass might look like this:
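import java.awt.Color
import java.awt.image.BufferedImage

def calculateAverageColor(image:BufferedImage):Color = {
    var redSum = 0
    var greenSum = 0
    var blueSum = 0

    // calculate the sum of each channel here

    // total pixel count, used as the divisor for each channel average
    val numPixels = image.getWidth * image.getHeight
    val red = (redSum / numPixels)
    val green = (greenSum / numPixels)
    val blue = (blueSum / numPixels)

    new Color(red, green, blue)
}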
(I’m leaving off the actual summation algorithm, as it’s not important to illustrate how the problem manifests itself.)
This code is almost right, but it has a problem.  Can you guess what it is?

Problem #1

In a lot of cases this code will work fine.  But if you have an extremely high resolution image, or an overexposed image (where many of the pixels are near white, implying high red, green, and blue values), any of the individual sums can overflow once it exceeds the maximum value that an integer can hold.  Because of the way signed integers are represented (“two’s complement”), the sum wraps around to a negative number.
scala> java.lang.Integer.MAX_VALUE
res0: Int = 2147483647

scala> java.lang.Integer.MAX_VALUE+1
res2: Int = -2147483648

If this happens, then we end up trying to construct a color with negative red, green, or blue values; this will result in an IllegalArgumentException.
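To get a feel for how large an image has to be before this bites, here is a small Java sketch of the arithmetic. The class and method names are made up for illustration; the only facts used are the int maximum shown above and the 0 to 255 range of a color channel.

```java
class OverflowDemo {
    /** Largest number of pure-white pixels whose channel sum still fits in an int. */
    static int maxWhitePixels() {
        return Integer.MAX_VALUE / 255;
    }

    /** Adds channelValue once per pixel using int arithmetic, as the first version does. */
    static int intSum(int pixels, int channelValue) {
        int sum = 0;
        for (int i = 0; i < pixels; i++) {
            sum += channelValue; // silently wraps around on overflow
        }
        return sum;
    }

    public static void main(String[] args) {
        int safe = maxWhitePixels();
        System.out.println("safe pixel count: " + safe);
        System.out.println("sum at the limit: " + intSum(safe, 255));
        System.out.println("one pixel more:   " + intSum(safe + 1, 255));
    }
}
```

The limit works out to 8,421,504 pure-white pixels, so a fully white 3600 x 2400 image (8,640,000 pixels) is already past it. Real photos are rarely pure white, which is why the bug can stay hidden for a long time.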

The easiest way to fix this is to use a bigger datatype to store the running sum; let’s use a Long instead.  This allows us a maximum value of 9223372036854775807; that should be plenty big to store whatever running total we have.
def calculateAverageColor(image:BufferedImage):Color = {
     // Declare the sums as longs so we don't have to worry about overflow
     var redSum:Long = 0
     var greenSum:Long = 0
     var blueSum:Long = 0

     // calculate the sum of each channel

     // total pixel count, used as the divisor for each channel average
     val numPixels = image.getWidth * image.getHeight
     val red = (redSum / numPixels)
     val green = (greenSum / numPixels)
     val blue = (blueSum / numPixels)

     new Color(red, green, blue)
}
This code looks OK at a cursory glance; if you actually use it, you will quickly get an exception:
java.lang.IllegalArgumentException: Color parameter outside of expected range: Red, Green, Blue

Problem #2

At this point, you might start debugging the process by printing out the values of red, green, and blue.  Sure enough they’ll be in the range [0, 255], just as you need for the Color constructor.  What is going on?

There are two related problems.  The first is that red, green, and blue are no longer Ints, because of the Long values in the computation.  The compiler sees the Long and (correctly) infers that the type of red, green, and blue must be Long.

This wouldn’t be a problem, except that there is no Color constructor that takes in Long arguments.  Yet the code compiles fine.  Why?
Well, it’s because of the implicit widening conversions present in the Java language.  These (usually) make our lives as programmers much easier.  If a method is defined to take in an integer and we pass it a short instead, the short is implicitly converted to an integer with no extra effort on our part.  This is done automatically because there is no danger of truncation; because the type that is implicitly converted to is larger (wider), it can always hold the value of the smaller type.
According to the Sun specification,

The following 19 specific conversions on primitive types are called the widening primitive conversions:


  • byte to short, int, long, float, or double
  • short to int, long, float, or double
  • char to int, long, float, or double
  • int to long, float, or double
  • long to float or double
  • float to double

The implicit conversion that tripped me up was that longs are implicitly promoted to float.  The reason the method compiled, even though there is no constructor that takes long r,g,b arguments, is because there is a constructor that takes float arguments, and the longs were automatically converted to floats!  See the Color documentation for more. Unlike the integer constructor, which takes RGB values in the range [0,255], the float constructor takes RGB values in the range [0,1].
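The same overload selection can be reproduced in plain Java, which makes the mechanism easy to check in isolation. This is a minimal sketch; the class and helper names are hypothetical, and only the java.awt.Color constructors are real API.

```java
import java.awt.Color;

class WideningDemo {
    /** Returns true if constructing the color throws IllegalArgumentException. */
    static boolean rejects(long r, long g, long b) {
        try {
            // There is no Color(long, long, long), so the compiler widens each
            // long to float and selects Color(float, float, float).
            new Color(r, g, b);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(200L, 100L, 50L)); // far above the float range [0, 1]
        System.out.println(rejects(1L, 0L, 1L));      // 1.0f and 0.0f are within range
        System.out.println(new Color(200, 100, 50));  // int overload accepts 0..255
    }
}
```

Values that are perfectly valid for the int constructor are rejected by the float one, which is why printing red, green, and blue during debugging shows nothing suspicious.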
The solution is to convert the longs back into ints, via a cast.  In Java this would be accomplished with
long redSum = ...;
int averageRed = (int) (redSum/numPixels);
In Scala you can instead write
val redSum:Long = ...
val averageRed:Int = (redSum/numPixels).asInstanceOf[Int]
(Calling .toInt on the quotient is the more idiomatic way to do the same narrowing.)
The fixed code as a whole thus looks like
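def calculateAverageColor(image:BufferedImage):Color = {
    var redSum:Long = 0
    var greenSum:Long = 0
    var blueSum:Long = 0

    // calculate the sum of each channel

    // total pixel count, used as the divisor for each channel average
    val numPixels = image.getWidth * image.getHeight
    // narrow each average back to Int so the Color(int, int, int) constructor is chosen
    val red:Int = (redSum / numPixels).asInstanceOf[Int]
    val green:Int = (greenSum / numPixels).asInstanceOf[Int]
    val blue:Int = (blueSum / numPixels).asInstanceOf[Int]

    new Color(red, green, blue)
}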
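For comparison, a complete Java version of the routine might look like the sketch below. The post deliberately leaves the summation out, so the pixel loop here is an assumption about one straightforward way to do it (reading packed RGB values with getRGB); the class name is also hypothetical.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

class AverageColor {
    static Color calculateAverageColor(BufferedImage image) {
        // long sums cannot realistically overflow for any image size
        long redSum = 0, greenSum = 0, blueSum = 0;
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y); // packed as 0xAARRGGBB
                redSum += (rgb >> 16) & 0xFF;
                greenSum += (rgb >> 8) & 0xFF;
                blueSum += rgb & 0xFF;
            }
        }
        long numPixels = (long) image.getWidth() * image.getHeight();
        // Narrow back to int so that Color(int, int, int) is selected.
        return new Color((int) (redSum / numPixels),
                         (int) (greenSum / numPixels),
                         (int) (blueSum / numPixels));
    }

    public static void main(String[] args) {
        // Tiny two-pixel test image: one red pixel, one blue pixel.
        BufferedImage image = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        image.setRGB(0, 0, 0xFF0000);
        image.setRGB(1, 0, 0x0000FF);
        System.out.println(calculateAverageColor(image));
    }
}
```

The explicit (int) casts play the same role as asInstanceOf[Int] in the Scala version: they make the intended constructor the only applicable one.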


One of the nice things about Scala is that you do not need to explicitly declare the types of your variables.  In one sequence of unfortunate events, the variables that looked like ints were in fact longs, leading to an implicit conversion to the float primitive type, which in turn caused the incorrect constructor to be invoked, and an IllegalArgumentException.  Hopefully you can avoid doing something so foolish as a result of reading this post.

 – find lighter/darker shades of colors

December 15, 2010

Color choosers are a dime a dozen online, but is a very nice one.  Its stated purpose is to allow you to specify a color and then find shades that are darker and lighter than that color.  It’s very well designed, aesthetically pleasing, and has the good sense to allow you to copy the hex value of the color with a single click.

I use it on a semi-regular basis to design Java Swing UIs; just a quick tip for the Java folks out there – when you have the hex code copied, you need to replace the leading # with 0x for the Color constructor to work correctly.  In other words, if you have the hex string #facade, you would create a Java color object with the expression new Color(0xfacade).  The 0x tells the Java compiler to treat the digits that follow as hexadecimal.
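A short sketch of both routes from a copied hex code to a java.awt.Color follows. The class and method names are hypothetical; Color.decode is a standard java.awt.Color method that accepts the leading # as-is, which is handy when the value arrives as a string at runtime rather than being typed into source code.

```java
import java.awt.Color;

class HexColor {
    /** Parses a copied hex string such as "#facade" without editing it. */
    static Color fromHex(String hex) {
        return Color.decode(hex);
    }

    public static void main(String[] args) {
        Color literal = new Color(0xfacade); // hex int literal in source code
        Color parsed = fromHex("#facade");   // string parsed at runtime
        System.out.println(literal.equals(parsed));
        System.out.printf("r=%d g=%d b=%d%n",
                literal.getRed(), literal.getGreen(), literal.getBlue());
    }
}
```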

Categories: Java, UI


April 11, 2010

Just a quick post about a great online tool I was shown (thanks Eric) called ColorBrewer.

There are numerous books and articles online about color palette design, usually from a web-design / aesthetic standpoint.  But there is more to the use of color than mere aesthetics; color can be used as an effective tool in scientific data visualization.  One of the few books on the topic describes its contents as a guide to “how scientists and engineers can use color to help gain insight into their data sets through true color, false color, and pseudocolor imaging.”  While the content of the book is a bit beyond the scope of this post, it’s clear that color gets a lot of use in charts and graphs, and being able to better pick colors is beneficial.
ColorBrewer is designed to help users pick a set of colors that is best suited to showing data on a map.  Unlike most color scheme choosers, where you pick whether you want muted colors, bright colors, pastel colors, etc., ColorBrewer starts by asking whether the data you are visualizing is sequential, diverging, or qualitative.  Sequential and diverging both have to do with quantitative data, e.g. average salaries.  Sequential is the more familiar scheme; darker colors usually indicate a higher value on whatever metric is measured.  Diverging, on the other hand, treats the average as a neutral color and then uses colors with sharply contrasting hues for the high and low ends.  Qualitative could also be labeled as ‘categorical’; it means about the same thing.

Among its other features, ColorBrewer can exclude colors that would not come out well when photocopied, as well as those that would be confused by people with color blindness.  It also has mechanisms for exporting the RGB/Hex color codes of the generated color palettes for use in other applications.

Categories: UI