Posts Tagged ‘tr’

Excel 2008 for Mac’s CSV export bug

December 6, 2010
I ran into this at work a few weeks ago and thought I’d share.

Excel 2008’s CSV export feature is broken.  For instance, enter the following fake data into Excel:

Row Name Age
0 Nick 23
1 Bill 48
Save as -> CSV file

Full list of choices

When you use standard Unix commands to view the file, the output looks garbled.

[Documents]$ cat Workbook1.csv
1,Bill,48[Documents]$
$ wc -l Workbook1.csv
0 Workbook1.csv
What is the issue?  The file command reveals the problem:
$ file Workbook1.csv
Workbook1.csv: ASCII text, with CR line terminators
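CR stands for carriage return, the ‘\r’ control character.  Windows ends each line with a CR followed by a newline character (‘\n’), and classic Mac OS used a lone CR, which is what Excel writes here.  Unix-like systems, including Mac OS X, expect a single ‘\n’ to terminate each line, so tools like cat and wc treat the whole file as one unterminated line.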
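If you want to confirm which terminators a file uses without relying on the file command, counting the two characters separately is one option.  This is just a sketch: the sample file is a made-up stand-in for Workbook1.csv, and count_cr and count_lf are helper names I invented for the example.

```shell
# Build a small sample that uses CR-only line endings, like the Excel export.
# (The temp file is a hypothetical stand-in for Workbook1.csv.)
sample=$(mktemp)
printf 'Row,Name,Age\r0,Nick,23\r1,Bill,48' > "$sample"

# Delete everything except the character of interest, then count what is left.
# The trailing tr strips the padding that some wc implementations add.
count_cr() { tr -cd '\r' < "$1" | wc -c | tr -d ' '; }
count_lf() { tr -cd '\n' < "$1" | wc -c | tr -d ' '; }

echo "CR count: $(count_cr "$sample")"   # prints "CR count: 2"
echo "LF count: $(count_lf "$sample")"   # prints "LF count: 0", which is why wc -l reports 0
```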
How can we fix this?

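dos2unix, or more precisely its Mac mode, mac2unix.  Plain dos2unix targets Windows-style CR+LF pairs; the lone CRs in this file need the Mac mode.

# convert the CR-terminated Workbook1.csv into a new file with Unix line endings
# (-n writes the result to a second file instead of converting in place)
mac2unix -n Workbook1.csv WithUnixLineEndings.csv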
If you don’t have dos2unix on your Mac, and you don’t want to install it, you can fake it with the tr command:
tr '\15' '\n' < Workbook1.csv # \15 is the octal code for CR; replace each carriage return with a newline
Row,Name,Age
0,Nick,23
1,Bill,48
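The tr command above only prints to the terminal.  A minimal sketch of writing the converted data to a new file and checking the result might look like this; to_unix is a name I made up, and the sample data is invented (with a trailing CR so that every row ends up terminated).

```shell
# to_unix is a hypothetical helper, not a standard command.
# It reads a CR-terminated file ($1) and writes an LF-terminated copy to $2.
to_unix() {
    tr '\r' '\n' < "$1" > "$2"
}

src=$(mktemp)
dst=$(mktemp)
printf 'Row,Name,Age\r0,Nick,23\r1,Bill,48\r' > "$src"

to_unix "$src" "$dst"
wc -l < "$dst"    # now counts 3 lines
cat "$dst"        # each row appears on its own line
```

Note that this is only appropriate for CR-only files.  On a Windows-style file with CR+LF pairs it would turn every line break into two newlines; there you would delete the carriage returns instead with tr -d '\r'.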
It is very annoying that Excel for Mac doesn’t write Unix line terminators.  Interestingly, I found a post that talks about ensuring that you choose a CSV file encoded for Mac, but that option seems to be missing from the Mac version itself.
If I’m missing something obvious, please correct me.

TextMate: Column editing, filter selection through command

April 21, 2010

I’ve mentioned TextMate before, as it is the best general-purpose text editor I have found on the Mac.  I’d like to show you some of the neat features that I’ve discovered or been told about, along with examples of how they are useful.

The first thing I’d like to show you is column selection.


Standard selection

Column selection

You enter column editing mode by holding down the alt/option key; you’re pressing the correct button if your cursor changes into a crosshair.  At that point, you can select rectangular regions of the text.  You can copy and paste them like normal, but that’s not why this mode is useful.  If you begin to type, what you type is inserted at the same point on each of the lines you have selected.

This came in very handy today when I had to write a copy constructor for a class that had a huge number (> 40) of variables.  (Yes, in most cases a class should be a lot more than a bag of variables.)  For those unsure of what a copy constructor is, it basically allows you to clone objects, making a new object from the state of an existing instance of that class.  NetBeans has a lot of support for refactoring, but nothing I found did this automatically, so I copied and pasted the variables into TextMate to use the features I’m about to illustrate.

public class TextMateExample {
    private int varA;
    private int varB;
    private double varC;
    private int varD;
    private float varE;
    private int varF;
    private String varG;
    private int varH;
}
 

Use the column selection to get all of the variable names and potentially some of the variable type declarations; don’t worry about the excess.

Paste it underneath.

Clean it up, first by chopping off excess on the left, and then manually editing the extra stuff out.

Now it’s easy from here.  Paste the column of variable names to the right.

We’ll call the object being copied ‘other’; the current object is, of course, ‘this’.

Make a 0 width selection to the left of the first var column, then type ‘this.’

Select the semicolon column and the whitespace in between, and replace it with ‘= other.’.

Place this block inside a new constructor, and you’re all set.

The last thing I want to illustrate is how to use the ‘filter through command’ feature in TextMate.  Any arbitrary selection in TextMate can be used as the input to any command line script you can think of.  This is extraordinarily powerful, especially if you are familiar with the Unix command line.
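As one illustration of what such a filter could look like, the assignments I built by hand above could also be generated from the selected field declarations.  This is a rough sketch, not what I actually did: gen_copy_assignments is a made-up name, and it assumes each declaration sits on its own line, ends in a semicolon, and has no initializer.

```shell
# gen_copy_assignments is a hypothetical filter: it reads Java field
# declarations on stdin and prints one copy-constructor assignment per field.
gen_copy_assignments() {
    awk '/;[ \t]*$/ {
        name = $NF            # last whitespace-separated field, e.g. "varA;"
        sub(/;$/, "", name)   # drop the trailing semicolon
        printf "this.%s = other.%s;\n", name, name
    }'
}

printf ' private int varA;\n private  String varG;\n' | gen_copy_assignments
# prints:
# this.varA = other.varA;
# this.varG = other.varG;
```

Lines without a trailing semicolon, such as the class declaration and the closing brace, are skipped, so the whole class body could be selected and filtered in one go.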

Let’s say that you want to replace all the raw variable access with the corresponding getter methods, for whatever reason.  Use the column selection to insert ‘get’ between ‘other.’ and the variable name.

The JavaBeans getter/setter convention would be to name those methods getVarX, not getvarX.  With these particular names, that is extraordinarily easy to fix: select the column of lowercase v’s, hit V, and all the v’s are instantly replaced.  Let’s assume instead that our variables have very different names, though the getters still follow the JavaBeans convention.  In other words, we want to capitalize the first letter following the word ‘get’ in that column.

I’m going to show you three ways of accomplishing the translation.

1) Use the built in TextMate case conversion

But that’s no fun, and doesn’t illustrate the use of filtering through the terminal.  (Again, this example is a bit trivial and it’s overkill to use the terminal for this, but I’m still going to illustrate how.  Because I can.)

2) Use filter through command with ‘tr’ command

Select the offending letters (again, all lowercase v’s here, but use your imagination), right click on the selection and choose ‘filter through command’.

The ‘tr’ command performs a mapping from one set of characters to another.

echo "cbadefg" | tr "abc" "def"
feddefg

“a” maps to “d”, “b” maps to “e”, and so on and so forth.  All we need to do is map “abcdefghijklmnopqrstuvwxyz” to “ABCDEFGHIJKLMNOPQRSTUVWXYZ”.

Fortunately there is a shortcut for that in tr, since they are common sets of letters: [:lower:] and [:upper:]

echo  "cbadefg" | tr "[:lower:]" "[:upper:]"
CBADEFG

So we can use this command in the dialog box:

Not surprisingly this works.
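The tr filter only works if you select exactly the letters you want changed.  As a rough sketch of an alternative, a filter could take whole lines and capitalize just the first letter after ‘other.get’.  The function name capitalize_getters and the sample line are made up for illustration.

```shell
# capitalize_getters is a hypothetical filter: on each input line it uppercases
# the single character that follows the first occurrence of "other.get".
capitalize_getters() {
    awk '{
        marker = "other.get"
        pos = index($0, marker)
        if (pos > 0) {
            i = pos + length(marker)   # position of the first letter after "get"
            $0 = substr($0, 1, i - 1) toupper(substr($0, i, 1)) substr($0, i + 1)
        }
        print
    }'
}

echo 'this.count = other.getcount();' | capitalize_getters
# prints: this.count = other.getCount();
```

Lines that don’t contain the marker pass through unchanged, so the whole block of assignments can be selected at once.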

3) Use ‘sed’

Sed stands for ‘stream editor’, and it is an extremely versatile Unix command line tool.  Let’s look at an example of what’s possible with sed before presenting the command to switch the case of all the letters.

Perhaps the most common use of sed is to replace one string with another.  The syntax to do that is:

's/string1/replacementstring/[g]'

The ‘g’ flag is optional; if you include it, every instance of string1 on each line is replaced; otherwise just the first one on each line is.

Let’s try it out:

echo "Hello, World" | sed  's/Hello/Goodbye/g'
Goodbye, World
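The example above has only one match, so it doesn’t show what the ‘g’ flag changes.  Here is a small sketch with two matches on the same line; the sample text and the variable names are just for illustration.

```shell
line="one fish, two fish"

# Without g: only the first match on the line is replaced.
first_only=$(echo "$line" | sed 's/fish/cat/')
echo "$first_only"     # prints: one cat, two fish

# With g: every match on the line is replaced.
every_match=$(echo "$line" | sed 's/fish/cat/g')
echo "$every_match"    # prints: one cat, two cat
```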

There’s a lot more to sed; I recommend reading Getting Started with Sed if this piques your interest.  I’ll have more blog posts later to illustrate some more, nontrivial uses.

sed can do the same transliteration that the tr command can do, with the syntax ‘y/set1/set2/’.

The command to use for filtering to convert lowercase to uppercase is then:

sed  'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
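To try this outside of TextMate, the sed version can be run next to the earlier tr version.  This is only a quick sketch; upper_sed and upper_tr are wrapper names I made up so the two can be compared.

```shell
# Two hypothetical wrappers around the commands shown in this post.
upper_sed() { sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'; }
upper_tr()  { tr '[:lower:]' '[:upper:]'; }

echo "getvar" | upper_sed   # prints: GETVAR
echo "getvar" | upper_tr    # prints: GETVAR
```

Unlike tr, the y command has no shorthand for character classes or ranges, so both sets have to be spelled out in full and must be the same length.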

Conclusion

The column editing feature of TextMate is fairly unusual (I haven’t found another text editor that supports it), and it comes in useful any time you need to prepend a string to each line in a column of text, as could be the case with a set of variables.  For instance, you could prepend ‘private final’ to a set of default access level variables.

I also illustrated the use of ‘filter selection through command’; any command you can execute on your terminal is accessible to you here.  The power of Unix is completely at your disposal via this dialog.

Categories: Java, programming, unix