Search Results

Keyword: ‘textmate’

How to make git use TextMate as the default commit editor

July 21, 2010 2 comments
git config --global core.editor "mate -w"

Now when you do a git commit without specifying a commit message, TextMate will pop up and allow you to enter a commit message in it. When you save the file and close the window, the commit will go through as normal. (If you prefer another text editor, just change “mate -w” to the command for that editor.)

For those curious what the -w argument is about, it tells mate to wait until the file has been saved and closed before returning, so git does not carry on with the commit until you are done. Read this for more information about how to associate TextMate with various other shell scripts and programs.
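To see what the setting looks like once stored, or to try it without touching your real configuration, something like the following sketch may help. It assumes git is installed; the scratch file is purely illustrative and stands in for ~/.gitconfig, which is where --global normally writes.

```shell
# Sketch: write the editor setting to a scratch config file instead of
# ~/.gitconfig, then read it back. (The scratch file is an assumption made
# for illustration; with --global, git edits ~/.gitconfig.)
scratch=$(mktemp "${TMPDIR:-/tmp}/gitconfig.XXXXXX")
git config --file "$scratch" core.editor "mate -w"
editor=$(git config --file "$scratch" --get core.editor)
echo "core.editor is: $editor"
rm -f "$scratch"
```

Note that git consults the GIT_EDITOR environment variable before core.editor, so if that variable is set it takes precedence over this setting.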

Categories: textmate, Uncategorized, unix

TextMate usability flaws

May 24, 2010 20 comments

I’ve posted previously about TextMate, a great text editor for Mac OS X.  While it is one of the best text editors I’ve used due to its simplicity, elegance, and power, there are a few UI features that drive me up the wall, and make it fall short of being perfect.

Tab key behavior

Here I have a bunch of text selected.  What will happen when I hit tab?


If you answered, delete your selection and insert a literal tab, you win the prize.  What happens in Microsoft Word, NetBeans Platform, and any number of other IDEs and text editors?  The text is indented.  This is clearly the better alternative in my mind because it is non-destructive, and you can get the destructive tab with a simple addition of any delete/backspace key before hitting tab.

Furthermore, it’s inconsistent – if I had selected that text and hit quote, I’d have the text surrounded by quotes, not a single quote replacing all my text.  I would argue that it does more harm than good to

To indent the region, you have to do ⌘-]; to unindent you do ⌘-[.  I’m sure there’s a way to remap tab and shift tab to indent the region, but it’s not immediately clear to me how to do that.  If you know how, please comment so people can see.  Every time I switch between NetBeans and TextMate I invariably delete a whole region of text I meant to indent by forgetting to context switch and use ⌘-].  Fortunately TextMate has extremely good undo support, so it’s not the end of the world, but it’s that little bit of friction that slows down work and adds up to frustration over multiple instances.

EDIT: The following sneaks up on me all the time and also is very irksome.  Here’s a rather typical situation:

Before backwards indenting

Here I want to move the block of text backward; I resist the urge to hit shift tab, and instead press ⌘-[.  What happens?  Not what you’d think.

After hitting un-indent

I really cannot think of a valid reason for this behavior; I have the first line selected, but it does not respond to the indentation command.  Very frustrating.

Syntax highlighting with unsaved files

If you are working on an unsaved file, there is no way to tell TextMate to treat it as a file of a certain programming language, and thus which syntax highlighting rules apply.

Why is this irksome?  When I’m writing a blog post, for instance, I have a lot of snippets I compose in TextMate.  In order to get the syntax highlighting to work correctly, I need to save each individual file.

Now, I’m not suggesting TextMate is magical and can read my mind as to what ruleset to apply to an unsaved file.  What I’m suggesting is the addition of a menu that allows you to apply the syntax rules of a given language to the current file, regardless of whether it’s saved or not.  Notepad++, a Windows text editor, has precisely this feature.

EDIT: Thanks to the comment by Marshall, I realize that this complaint is way off base – I didn’t realize where the menu was but it’s there.  Right where it says “Plain text” in the above screen shot, click that and you get a full list of languages to treat the file as.

Tab support

Tab support is abysmal.  There is no support for closing tabs with the middle mouse button, a convention followed by most major web browsers, as well as NetBeans and Eclipse.  Instead, you are forced to click a small close icon on each tab.  Fitts’s Law states that the smaller a button is, the longer it will take a user to navigate to; this is part of the reason that Apple has adopted menu bars at the top of the screen rather than floating with each window.  Doing so gives the menu effectively infinite height, as the user can slam the mouse up and not go past it.  Long story short, having the only means of closing tabs be a tiny button is bad UI design.

I could forgive the lack of middle button close IF the tabs supported sensible context menus to close other tabs.  Compare for instance the options of some other products that support tabs:

Firefox’s popup menu on tab

When you right click a tab in TextMate, no context menu appears – there is absolutely no menu option I can find to close all tabs.  Any system providing tabs should allow the user to make a blank slate for himself and focus on one (or zero) files at a time.  As it stands, you must click each tiny x button individually on all the tabs.

Finally, you cannot undock tabs.  Again, maybe I’ve just been spoiled by other products like Adium, Firefox, and NetBeans, but I consider it a very important feature to be able to undock tabs to show two windows side by side.  While tabs are certainly more space efficient than tiling windows side by side, sometimes you really need to compare two files side by side.  TextMate makes it very difficult to do that, especially when you have opened a project rather than individual files.

Here is the workflow in Firefox:


Before drag is initiated on the tab

While tab is being dragged; note that a translucent version of the page follows the cursor to indicate what is happening

After the drop is completed, the tab is split off from the original window and becomes a separate frame.

Conclusion

TextMate is an excellent text editor but not without some usability flaws.  I’ve detailed some features that irritate me about TextMate, due to their violation of the principle of least astonishment; I’ve used enough other similar systems to expect certain functionality, and this expectation is violated in a few ways.  These ways include the fact that hitting tab while text is selected replaces the contents with a literal tab rather than indenting the region, lack of a feature to syntax highlight unsaved files, an inability to close multiple tabs at once, and finally an inability to drag tabs out of the frame to become separate frames, so as to be able to compare documents side by side.  Software designers take note – if you are going to have tabs, you must build in these features or your users will feel seriously limited.

Categories: UI

TextMate: Column editing, filter selection through command

April 21, 2010 1 comment

I’ve mentioned TextMate before, as it is the best general purpose text editor I have found on the Mac.  I’d like to show you some of the neat features that I’ve discovered/been told about, as well as examples as to how they are useful.

The first thing I’d like to show you is column selection.


Standard selection

Column selection

You enter column editing mode by holding down the alt/option key; you’re pressing the correct button if your cursor changes into a crosshair.  At that point, you can select rectangular regions of the text.  You can copy and paste them like normal, but that’s not why this mode is useful.  Notice that if you begin to type, what you type is inserted at the same point on each of the lines you have selected.

This came in very handy today when I had to write a copy constructor for a class that had a huge (> 40) number of variables.  (Yes, in most cases a class should be a lot more than a bag of variables.)  For those unsure of what a copy constructor is, it basically allows you to clone objects, making a new object but using the state information from an existing instance of that class. NetBeans has a lot of support for refactoring, but nothing I found did this automatically; I copied and pasted the variables into TextMate due to the features I’m about to illustrate.

public class TextMateExample {
    private int varA;
    private int varB;
    private double varC;
    private int varD;
    private float varE;
    private int varF;
    private String varG;
    private int varH;
}
 

Use the column selection to get all of the variable names and potentially some of the variable type declarations; don’t worry about the excess.

Paste it underneath.

Clean it up, first by chopping off excess on the left, and then manually editing the extra stuff out.

Now it’s easy from here.  Paste the column of variable names to the right.

We’ll call our copied object ‘other’, and obviously the current object as ‘this’.

Make a 0 width selection to the left of the first var column, then type ‘this.’

Select the semicolon column and the whitespace in between, and replace it with ‘= other.’.

Place this block inside a new constructor, and you’re all set.

The last thing I want to illustrate is how to use the ‘filter through command’ feature in TextMate.  Any arbitrary selection in TextMate can be used as the input to any command line script you can think of.  This is extraordinarily powerful, especially if you are familiar with the Unix command line.

Let’s say that you want to replace all the raw variable access with the corresponding getter methods, for whatever reason. Use the column selection to insert ‘get’ between the ‘other.’ and the variable name.

The JavaBeans getter/setter convention would be to name those methods getVarX, not getvarX.  Since the variables are named this way, that would be extraordinarily easy to fix; select the column of lower case v’s, hit V, and all the v’s are instantly replaced. Let’s assume that our variables are named much differently, though following JavaBeans convention.  In other words, we want to capitalize the first letter following the word ‘get’ in that column.

I’m going to show you three ways of accomplishing the translation.

1) Use the built in TextMate case conversion

But that’s no fun, and doesn’t illustrate the use of filtering through the terminal.  (Again, this example is a bit trivial and it’s overkill to use the terminal for this, but I’m still going to illustrate how.  Because I can.)

2) Use filter through command with ‘tr’ command

Select the offending letters (again, all lowercase v’s here, but use your imagination), right click on the selection and choose ‘filter through command’.

The ‘tr’ command performs a mapping from one set of characters to another.

echo "cbadefg" | tr "abc" "def"
feddefg

“a” maps to “d”, “b” maps to “e”, and so on and so forth.  All we need to do is map “abcdefghijklmnopqrstuvwxyz” to “ABCDEFGHIJKLMNOPQRSTUVWXYZ”.

Fortunately there is a shortcut for that in tr, since they are common sets of letters: [:lower:] and [:upper:]

echo "cbadefg" | tr "[:lower:]" "[:upper:]"
CBADEFG

So we can use this command in the dialog box:

Not surprisingly this works.

3) Use ‘sed’

Sed stands for ‘stream editor’, and it is an extremely versatile Unix command line tool.  Let’s do a few examples of what’s possible with sed before presenting the command to switch the case of all the letters.

Perhaps the most common use of sed is to replace one string with another.  The syntax to do that is

's/string1/replacementstring/[g]'

The ‘g’ flag is optional; if you include it, all instances of string1 on each line will be replaced; otherwise just the first one on each line.

Let’s try it out:

echo "Hello, World" | sed 's/Hello/Goodbye/g'
Goodbye, World
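The example above has only one match, so the g makes no visible difference. A small sketch with a made-up input line that contains the search string twice shows what the flag changes:

```shell
# Without g only the first match on each line is replaced; with g, all of them.
first_only=$(echo "one fish, two fish" | sed 's/fish/cat/')
every_match=$(echo "one fish, two fish" | sed 's/fish/cat/g')
echo "$first_only"    # one cat, two fish
echo "$every_match"   # one cat, two cat
```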

There’s a lot more to sed; I recommend reading Getting Started with Sed if this piques your interest.  I’ll have more blog posts later to illustrate some more, nontrivial uses.

sed can do the same transliteration that the tr command can do, with the syntax ‘y/set1/set2/’

The command to use for filtering to convert lowercase to uppercase is then

sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
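Both filters uppercase every character they are given, so they rely on the selection containing only the letters to change. If whole lines were selected instead, a filter along the following lines could capitalize just the letter after ‘.get’. This is only a sketch: it uses awk, which this post does not otherwise cover, the function name is made up, and the sample line is hypothetical.

```shell
# Capitalize only the character that follows the first ".get" on each line.
capitalize_after_get() {
  awk '{
    i = index($0, ".get")           # position of the first ".get", 0 if absent
    if (i > 0) {
      $0 = substr($0, 1, i + 3) toupper(substr($0, i + 4, 1)) substr($0, i + 5)
    }
    print
  }'
}

result=$(echo "this.varA = other.getvarA;" | capitalize_after_get)
echo "$result"   # this.varA = other.getVarA;
```

Lines without ‘.get’ pass through unchanged, so the whole block can be filtered in one go.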

Conclusion

The column editing feature of TextMate is fairly unique (I haven’t found another text editor that supports it), and it could come in useful any time you need to prepend a string to a column of text, as could be the case with a set of variables.  For instance, you could prepend ‘private final’ to a set of default access level variables.

I also illustrated the use of ‘filter selection through command’; any command you can execute on your terminal is accessible to you here.  The power of Unix is completely at your disposal via this dialog.

Categories: Java, programming, unix

Two shortcuts for NetBeans navigation

December 10, 2010 Leave a comment

NetBeans is my Java IDE of choice, and I have written about it a few times in the past.  There are a slew of features to make you more productive and today I will highlight two shortcuts that are essential to efficient navigation.  I will show you how to jump quickly to any file in any open project, and how to search for a string in any file.

Jump to any file in any open project
⌃⇧ O

Ctrl Shift O = Go to File
By typing this shortcut at any time, you launch the Go to File dialog.  This dialog allows you to type pieces of any file name, and you will see results returned in near real time.  This is excellent for when you know the approximate name of the file you’re looking for (you can insert a ‘*’ character as a wildcard, so if you know the word ‘foo’ appears somewhere in the name, you would search for *foo).

Search results are returned instantly

This dialog is a great example of incremental search, a UI technique I blogged about previously.  The incremental search is beneficial because it gives you instant feedback on your search query, allowing you to modify it as necessary to find what you’re looking for.
Those of you who use TextMate might recognize this feature; in TextMate this dialog is accessible via ⌘T (Command T).
The difference is that TextMate’s search is faster, and it is more forgiving (you can search for just a few letters in the file you’re searching for and still find it; you don’t need to know the precise name).
In general it’s much faster to search for files than to navigate through the file hierarchies manually.

Search for string

The second shortcut is to search for text in all the open projects (or narrowing it down to a select few files if you prefer).  This command is easily accessible when right clicking on a project node, but it’s accessible any time via keyboard shortcut as well:

⇧ ⌘ F – Find in projects
Shift Command F

This one’s pretty self explanatory, but it took me awhile to stumble onto the shortcut, and it’s a pretty helpful one to know.

I hope these two shortcuts make your time in NetBeans a little easier.

Unix tip #3: Introduction to Find, Grep, Sed

September 7, 2010 6 comments
I’ve written a few times before about Unix command line tools and how learning them can make you a more efficient programmer.  Today I’m going to introduce a few essential tools in the Unix toolkit.  While programming, one often notes future improvements or tasks with the use of a TODO comment.  For instance, if you have a dummy implementation of a method, you might comment that you need to fill in the actual implementation later:

 public int randomValue() {
 // TODO: hook up the actual random number generator
 return 0;
 }
 

The problem is that these TODOs more often than not get ignored, especially if you have to search through the code yourself to try to find all of the remaining tasks.  Fortunately, certain programs (NetBeans and TextMate for two examples) can find instances of keywords indicating a task, extract the comments, and present them to you in a nice table view.

I’m going to step through the use of a few Unix tools that can be tied together to extract the data and create a similar view.  In particular I will illustrate the use of find, grep, sed,  and pipes.

The general steps I’ll be presenting are:

Step                                          Tools used
1. Find all Java files                        find
2. Find each TODO item                        grep
3. Extract filename, line number, task        sed
4. Format results of step 3 as an HTML table  find/grep/sed/shell script

Finding instances of text with grep

In order to extract all of the TODO items from within our java files, we need a way of searching for matching text. grep is the tool to do that. Grep takes as input a list of files to search and a pattern to try to match against; it will then emit a set of lines matching the pattern.

For instance, to search for TODO or any version of that string (todo, ToDO), in all the .java files in the current directory, you would execute the following:

grep -i TODO *.java
Telephone.java:    // TODO: Document
Telephone.java:     // TODO: throw exception if precondition is violated

Note that the line numbers are omitted. If we want them, we use the -n flag:

grep -i -n TODO *.java
Telephone.java:20:    // TODO: Document
Telephone.java:29:     // TODO: throw exception if precondition is violated

If all we want to do is get a rough estimate as to how many documented TODOs we have, we can pipe the output of this command into the wc utility, which counts bytes, characters, or lines. We want the number of lines.

grep -i -n TODO *.java | wc -l
       2
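wc -l gives one grand total. If a per-file breakdown is more useful, grep can count matching lines itself with the -c flag. The sketch below builds a throwaway directory so it can be run anywhere; the file contents are invented for the example.

```shell
# Build a throwaway directory with two hypothetical Java files.
demo=$(mktemp -d "${TMPDIR:-/tmp}/todo-demo.XXXXXX")
printf '// TODO: Document\nint x;\n// todo: test\n' > "$demo/Telephone.java"
printf 'int y;\n' > "$demo/BalancedTernary.java"

# -c prints "file:count" for each file instead of the matching lines.
counts=$(cd "$demo" && grep -i -c TODO *.java)
echo "$counts"
rm -rf "$demo"
```

Files without any match are still listed, with a count of 0.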

This works fine with a single directory of files, but it will not handle nested directories. For instance, if my directory structure looks like the following:

tree
.
|-- BalancedTernary.java
`-- Telephone.java

0 directories, 2 files

All of these files will be searched when grep is run. But if I introduce new files in subdirectories:

mkdir Subdir
echo "//TODO: Create this file" > Subdir/Test.java

tree
.
|-- BalancedTernary.java
|-- Subdir
|   `-- Test.java
`-- Telephone.java

1 directory, 3 files

The new Test.java will not be searched. In order to make grep search through all of the subdirectories (i.e., recursively), you can combine grep with another extremely useful Unix utility, find. Before moving on to find, I want to stress that grep is extremely useful and vital to anyone using a Unix based machine. See grep tutorials for many good examples of how to use grep.
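As an aside, many grep implementations (GNU grep and BSD grep among them) can recurse on their own with -r and can restrict the search to certain file names with --include. Neither option is required by POSIX, so treat the following as a sketch rather than something guaranteed to work everywhere. The directory layout mirrors the example above, plus one invented non-Java file.

```shell
demo=$(mktemp -d "${TMPDIR:-/tmp}/grep-demo.XXXXXX")
mkdir "$demo/Subdir"
printf '// TODO: Document\n' > "$demo/Telephone.java"
printf '//TODO: Create this file\n' > "$demo/Subdir/Test.java"
printf 'TODO: not a Java file\n' > "$demo/notes.txt"

# -r descends into subdirectories; --include restricts which files are read.
matches=$(cd "$demo" && grep -r -i -n --include="*.java" TODO . | sort)
echo "$matches"
rm -rf "$demo"
```

The find-based approach described next works with any grep and gives finer control over which files are visited.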

Finding files with find

The find command is extremely useful. The man page describes find as

find – search for files in a directory hierarchy

There are a lot of arguments you can use, but to get started, the basic syntax is

find [<starting location>] -name <name pattern>

If the starting location is not provided, GNU find assumes the current directory (. in Unix terms); other versions of find may require it. In all the examples that follow I will explicitly list the starting directory.

For instance, if we want to find all the files that end with the extension “.java” in the current working directory, we could run the following:

find . -name "*.java"
./BalancedTernary.java
./Subdir/Test.java
./Telephone.java

Note that we must enclose the pattern in quotes in this example in order to prevent the shell from trying to expand the * wildcard. If we don’t, the shell will convert the asterisk into a space delimited set of all the matching files/directories in the current folder, which will lead to an error:

 find . -name *.java # expands to find . -name BalancedTernary.java Telephone.java
find: Telephone.java: unknown option
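A quick way to see what the shell does to a pattern before any command receives it is to hand the pattern to echo. This sketch uses a scratch directory with two empty, hypothetical files:

```shell
demo=$(mktemp -d "${TMPDIR:-/tmp}/glob-demo.XXXXXX")
touch "$demo/BalancedTernary.java" "$demo/Telephone.java"
cd "$demo" || exit 1

# Unquoted: the shell replaces the pattern with the matching file names
# before the command ever sees it.
unquoted=$(echo *.java)
# Quoted: the pattern reaches the command untouched.
quoted=$(echo "*.java")

echo "unquoted: $unquoted"   # unquoted: BalancedTernary.java Telephone.java
echo "quoted:   $quoted"     # quoted:   *.java
cd - >/dev/null
rm -rf "$demo"
```

find wants the second form, because it does its own matching against every file name it visits.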

Just as we can use the wc command to count the number of times a phrase appears in a file, we can use it to count the number of files matching a given pattern. That is because find outputs each matching file path to a separate line. Thus if we wanted to count the number of java files in all folders rooted in the current folder, we could do

find . -name "*.java" | wc -l
       3

While I have only presented the -name flag, there are numerous other flags as well, such as whether the candidate file is a file or directory (-type f or -type d respectively), whether the match is smaller, the same, or bigger than a given size (-size +100M == bigger than 100 megabytes), or when the file was last modified (find -newer ordinary_file would only accept files that have a modification time newer than that of ordinary_file). A great article for gaining more expertise is Mommy I found it! – 15 practical unix find commands.
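A minimal sketch of two of these flags, -type and -newer, run against a scratch directory. The file names and timestamps are made up so that the result is predictable:

```shell
demo=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir "$demo/Subdir"
touch -t 202001010000 "$demo/old.txt"    # modified 1 Jan 2020
touch -t 202101010000 "$demo/new.txt"    # modified 1 Jan 2021

# Only directories (the starting directory itself counts as one).
dirs=$(cd "$demo" && find . -type d | sort)
# Only regular files modified more recently than old.txt.
newer=$(cd "$demo" && find . -type f -newer old.txt)

echo "$dirs"
echo "$newer"    # ./new.txt
rm -rf "$demo"
```

Tests can be combined freely; listing several of them one after another means all of them must hold.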

Combining find with other commands

find becomes even more powerful when combined with the -exec option, which allows you to execute arbitrary commands for each file that matches the pattern. The syntax for doing that looks like

find [<starting location>] -name <name pattern> -exec <commands> {} \;

where the file path will be substituted for the {} characters. For instance, if we want to count the number of lines in each Java file, we could run

find . -name "*.java" -exec wc -l {} \;
      23 ./BalancedTernary.java
       1 ./Subdir/Test.java
      88 ./Telephone.java

This has precisely the same effect as if we explicitly executed the wc -l command ourselves:

wc -l ./BalancedTernary.java
wc -l ./Subdir/Test.java
wc -l ./Telephone.java

As another example, we could backup all of the Java files in the directory by copying them and appending the suffix .bk to each

find . -name "*.java" -exec cp {} {}.bk \;
Nick@Macintosh-3 ~/Desktop/Programming/Java/example$ ls
BalancedTernary.java    Subdir                  Telephone.java.bk
BalancedTernary.java.bk Telephone.java

To undo this, we could remove all of the files ending in .bk:

find . -name "*.bk" -exec rm {} \;

Combining find and grep

Since I started the article talking about grep, it’s only natural that you can combine grep with find, and it often pays to do so.

For instance, by combining the earlier grep command to find all TODO items with the find command to find all java files, we suddenly have a command which will traverse an arbitrarily nested directory structure and search all the files we are interested in.

find . -name "*.java" -exec grep -i -n TODO {}  \;
1://todo: Create this file
20:    // todo: Document
29:     // todo: throw exception if precondition is violated

Note that we no longer have the filename prepended to the output; if we want it back we can add the -H flag.

find . -name "*.java" -exec grep -Hin TODO {} \;
./Subdir/Test.java:1://todo: Create this file
./Telephone.java:20:    // todo: Document
./Telephone.java:29:     // todo: throw exception if precondition is violated

In this last snippet I have combined the individual -H, -i and -n flags together into the shorter -Hin; this works identically to listing them separately. (Not all Unix commands work this way; check the man page if you’re unsure.)
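One caveat with a case-insensitive search for TODO is that identifiers such as toDouble also match, because they contain the letters ‘todo’. Many grep versions (GNU and BSD among them) offer a -w flag that only accepts whole-word matches; it is not required by POSIX, so this is a sketch under that assumption, with two invented source lines:

```shell
# Two hypothetical source lines: a real TODO and an identifier that merely
# contains the letters "todo".
sample=$(printf '// TODO: Document\ndouble d = toDouble(s);\n')

loose=$(printf '%s\n' "$sample" | grep -i -c TODO)      # substring match
strict=$(printf '%s\n' "$sample" | grep -i -w -c TODO)  # whole-word match
echo "without -w: $loose, with -w: $strict"   # without -w: 2, with -w: 1
```

In the find command the flag can simply join the others, as in -Hinw.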

An alternate exec terminator: Performance considerations

I said earlier that the basic syntax for combining find with other commands is

find [<starting location>] -name <name pattern> -exec <commands> {} \;

The ; terminates the exec clause, but because the shell would otherwise treat it as a command separator, it has to be backslash escaped. While researching this article I found a Unix/Linux “find” Command Tutorial that introduced me to an alternative syntax for terminating the -exec clause of the find command. By replacing the semicolon with a + sign, files are grouped together in batches and sent to the given command rather than executed one at a time. Let me illustrate:

# Executes the 'echo' command on each file individually
find . -exec echo {} \;
.
./BalancedTernary.java
./Subdir
./Subdir/Test.java
./table.html
./Telephone.java
./test.a

# Executes the 'echo' command on bundled groups of files
find . -exec echo {} +
. ./BalancedTernary.java ./Subdir ./Subdir/Test.java ./table.html ./Telephone.java ./test.a

This technique of grouping the files together can give a large performance boost when used with commands that accept a list of files as arguments. For instance:

time find /Applications/ -name "*.java" -exec grep -i TODO {} \;
real    1m36.458s
user    0m3.912s
sys 0m10.933s

time find /Applications/ -name "*.java" -exec grep -i TODO {} +
real    0m39.060s
user    0m3.660s
sys 0m6.571s

# An alternate way of executing grep on batches of files at once #
time find /Applications/ -name "*.java" -print0 | xargs -0 grep -i "TODO"
real    0m50.486s
user    0m4.230s
sys 0m7.924s

By replacing the semicolon with the plus sign, I gained almost a 2.5x speed increase. Again, this will only work with commands that accept a whole list of files as arguments; the previous example with copy would fail miserably, because cp expects a single src/destination pair:

# Will not work!
find . -name "*.java" -exec cp {} {}.bk +
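The speed difference comes from how many times the command is started. One way to make that visible is to have each run print a single line and then count the lines. This is a sketch using three empty, hypothetical files; the inner sh exists only to print one line per run and ignores the file names it is given.

```shell
demo=$(mktemp -d "${TMPDIR:-/tmp}/exec-demo.XXXXXX")
touch "$demo/A.java" "$demo/B.java" "$demo/C.java"

# Each run of the inner sh prints one line, so counting lines counts runs.
runs_semicolon=$(find "$demo" -name "*.java" -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
runs_plus=$(find "$demo" -name "*.java" -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'one at a time: %s runs, batched: %s run\n' "$runs_semicolon" "$runs_plus"
rm -rf "$demo"
```

With thousands of files the batched form may still start the command more than once, since each batch has to fit within the system's limit on argument length.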

Converting results of find/grep into table form – Intro to sed, cut, and basename

In the last section, I showed how to combine find and grep. The output of the command will look something like this:

find . -name "*.java" -exec grep -Hin TODO {} +
./Subdir/Test.java:1://todo: Create this file
./Telephone.java:20:    // todo: Document
./Telephone.java:29:     // todo: throw exception if precondition is violated

The output has the path to the file, followed by a colon, the line number, another colon, and then the matching line in the input file that had the TODO in it. Let’s mimic the output of the TODO list in TextMate, which simply displayed a two column table with File name and line number followed by the extracted comment. While we could use any programming language to do this text manipulation (Python springs to mind), I’m going to use a combination of sed and shell scripts to illustrate a few more powerful command line tools.

Recall that the output of our script so far looks like the following:

./Telephone.java:20: // todo: Document

In other words each line is in the form

relative/path/to/File:lineNumber:todo text

The colons delimiting the text allow us to split the constituent parts very easily. The command to do that is cut. With cut you specify the delimiter on which to split the text, and then which numbered fields you want (where fields are numbered 1 .. n)

As an example, here is code to extract the path (the first column of text):

find . -name "*.java" -exec grep -Hin TODO {} + | cut -d ":" -f 1
./Subdir/Test.java
./Telephone.java
./Telephone.java

This gives us the path, one per line. If we want to convert the relative path into just the name of the file, like the TextMate example does, we want to strip out all of the leading directories, leaving just the file name. While we could code up a regular expression to perform the substitution, I prefer to avoid doing more work than I need to. Instead I’ll use the basename command, which does that for us.

find . -name "*.java" -exec grep -Hin TODO {} + | basename `cut -d ":" -f 1`
Test.java
Telephone.java
Telephone.java
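The command above hands every path to a single basename call. Not every basename accepts that; GNU basename, for one, complains about extra operands unless it is given -a. A sketch that sidesteps the difference by running basename once per path with xargs (the sample lines stand in for real grep output, and paths containing spaces would need more care):

```shell
# Sample grep output (hypothetical, matching the format shown above).
grep_output=$(printf '%s\n' \
  './Subdir/Test.java:1://todo: Create this file' \
  './Telephone.java:20:    // todo: Document')

# cut keeps the path; xargs -n 1 runs basename once for each path.
names=$(printf '%s\n' "$grep_output" | cut -d ":" -f 1 | xargs -n 1 basename)
echo "$names"
```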

The line number, the second column of text, is just as easy to extract.

find . -name "*.java" -exec grep -Hin TODO {} + | cut -d ":" -f 2
1
20
29

The fact that the line of text extracted by grep could contain the colon character (and often will; I always write my TODOs as TODO: do x) means we have to be a bit smarter about how we use cut. If we assume that the text is just in the third column, we will lose the text if there are colons.

# Only taking the third column
echo "./Telephone.java:20:    // todo: Document" | cut -d ":" -f 3
    // todo
# Taking all columns after and including the third column
echo "./Telephone.java:20:    // todo: Document" | cut -d ":" -f 3-
    // todo: Document
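When all three parts are needed at once, the shell's own read builtin can do the splitting in one step. This is a sketch, not what the final script necessarily uses; the variable names are invented. Setting IFS to a colon for the duration of the read makes it split on colons, and the last variable receives whatever is left, colons included.

```shell
# One line of grep -Hn output (sample data from above).
line='./Telephone.java:20:    // todo: Document'

# read splits on the colon; the last variable receives everything left over,
# including any further colons.
IFS=: read -r file number text <<EOF
$line
EOF

echo "file=$file number=$number text=$text"
```

The leading white space in the comment survives, because only the colon is treated as a separator.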

While this works, it’s not the neatest output. In particular we want to get rid of the leading white space; otherwise it will mess up the formatting in the HTML table. Performing text substitution is the job of the sed tool. sed stands for stream editor and it is capable of doing extremely heavy duty find and replace tasks. I don’t pretend to be an expert with sed and this article won’t make you one either, but hopefully I can at least illustrate its usefulness. For a more in depth tutorial, see Sed – An Introduction and Tutorial.

A common use case for sed, as I mentioned, is to replace text. The general pattern is

sed 's/regexpToReplace/textToReplaceItWith/[g]'

The s can be read as “substitute”, and the optional g stands for global. If you omit it, sed will only replace the first match of the regular expression on each line. The g makes it replace every match on the line.

Thus to remove leading white space, we can use the expression sed 's/^[ <tab>]*//g'

where the ^ character indicates that it must match the start of the line, and the text within brackets are the characters that will be matched by the regular expression. The * means to match zero or more instances. In other words, this line says “match the start of the string and all spaces and tabs you can until reaching other text, and replace it with nothing”.

The above command is not strictly correct. We need to indicate to sed that we want to replace the tab character. Unlike many Unix utilities, sed does not allow you to use the character sequence \t to indicate the tab character. Instead you need a literal tab at that place in the command. The problem with doing this is that your shell might swallow the tab before it gets to the sed command. In bash, the default shell environment on the Mac, the tab key is interpreted as a command to auto complete what is being typed. If you press the tab key twice, the shell will print out all the possible autocompletions.

For instance,

$lp<tab><tab>
lp           lpc          lpmove       lppasswd     lpr          lprsetup.sh
lpadmin      lpinfo       lpoptions    lpq          lprm         lpstat

Here I started typing lp, hit tab twice, and the shell produced a list of all the commands it knew about (technically, that are on the PATH environment variable). So we need a way to smuggle the tab key into the sed command, without triggering the shell’s autocompletion. The way to do this is with the “verbatim” command sequence, which instructs the shell not to interpret certain keystrokes and instead to treat them verbatim, as text.

To enter this temporary verbatim mode, you press Ctrl V (sometimes indicated as ^V online) followed by the key combination you want treated as text. Thus the real sed command to remove leading white space is sed 's/^[ ]*//'

$ sed 's/^[    ]*//'
     spaces
spaces
        tabs
tabs
           tabs and spaces
tabs and spaces

The above snippet illustrates that sed reads from standard input by default and thus can be used interactively to test the replacements you have specified. Again, in the above text it looks like I have a string of spaces, but it’s really <space><ctrl v><tab> within the brackets. From here on out I will put a \t to indicate a tab but you should realize that you need to do the ctrl v tab sequence I just described instead.

(Aside: I have read online that some versions of sed actually do support the \t character sequence to indicate tabs, but the default sed shipping with Mac OS X does not.)
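Another way to avoid typing a literal tab is the POSIX character class [[:space:]], which matches spaces and tabs (along with other white space characters). It should be accepted inside a bracket expression by both BSD and GNU sed, though that has not been checked against every sed here, so treat this as a sketch:

```shell
# printf turns \t into a real tab, so the input line starts with a space,
# a tab, and two more spaces.
trimmed=$(printf ' \t  tabs and spaces\n' | sed 's/^[[:space:]]*//')
echo "$trimmed"   # tabs and spaces
```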

sed – combine multiple commands into one

If you have a series of text replacements you want to do using sed, you can either pipe the chain of transformations you want to do from one sed invocation to another, or you can use the -e flag to chain them together.

echo "hello world" | sed 's/hello/goodbye/' | sed 's/world/frank/'
goodbye frank
echo "hello world" | sed -e 's/hello/goodbye/' -e 's/world/frank/'
goodbye frank

Note that you need the -e before the first sed pattern as well; I naively tried to do

echo "hello world" | sed 's/hello/goodbye/' -e 's/world/frank/'
sed: -e: No such file or directory
sed: s/world/frank/: No such file or directory
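A third option, for simple substitutions like these, is to put both commands in a single script separated by a semicolon. A minimal sketch:

```shell
# Two substitutions in one script, separated by a semicolon.
result=$(echo "hello world" | sed 's/hello/goodbye/; s/world/frank/')
echo "$result"   # goodbye frank
```

The -e form arguably stays easier to read once the individual patterns get long.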

Integrating sed with find and grep

Combining all of the above sed goodness with the previous code we have

find . -name "*.java" -exec grep -Hin TODO {} + | cut -d ":" -f 3- | sed 's/^[ \t]*//'
//todo: Create this file
// todo: Document
// todo: throw exception if precondition is violated

I don’t want the todo text in the comments, as it would be redundant. As such I will remove the double slashes followed by any white space followed by todo, followed by an optional colon, followed by any space.

find . -name "*.java" -exec grep -Hin TODO {} + | cut -d ":" -f 3- | sed -e 's/^[ \t]*//' -e 's/[\/*]*[ \t]*//' -e 's/TODO/todo/' -e 's/todo[:]*[ \t]*//'
 Create this file
 Document
 throw exception if precondition is violated

This can be read as

s/^[ \t]*//         remove leading whitespace
s/[\/*]*            remove any number of forward slashes (/) or stars (*), which indicate the start of a comment
[ \t]*              remove whitespace
s/TODO/todo         convert uppercase TODO string into lower case
todo                remove the literal string 'todo'
[:]*                remove any colons that exist
[ \t]*              remove whitespace
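To watch the chain work on a single line, here is a small sketch that feeds one grep-style result through cut and the four sed expressions. The function name clean_todo and the tab variable are assumptions for illustration; the tab is generated with printf so the \t placeholder does not have to be typed by hand.

```sh
#!/bin/sh
# A real tab character, standing in for the \t placeholder used above.
tab=$(printf '\t')

# Drop the file name and line number, then strip the comment marker
# and the todo label from what is left.
clean_todo() {
    cut -d ":" -f 3- |
    sed -e "s/^[ $tab]*//" \
        -e "s/[\/*]*[ $tab]*//" \
        -e 's/TODO/todo/' \
        -e "s/todo[:]*[ $tab]*//"
}

# One line in the file:line:text format that grep -Hn produces.
printf '%s\n' './Telephone.java:20:    // todo: Document' | clean_todo
```

For the sample line, each expression removes one layer: first the indentation, then the slashes and the space after them, and finally the todo label with its colon, leaving just the word Document.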

We now have all the pieces we need to create our script.

Putting it all together

I’m going to show the script in its entirety without a huge amount of explanation. This post is more about the use of find/grep/sed than it is about shell scripting. I don’t claim to be an expert at writing shell scripts, so I wouldn’t be surprised if there’s a better way to do some of the following. It is not perfect; as the comments indicate, it wouldn’t handle text like ToDo correctly in the sed command. More importantly, there are some false positives in the lines it returns: things like toDouble match, because it contains the string ‘todo’. I’ll leave such improvements to the reader; if you do have any suggestions for the script, please add them to the comments below.

#!/bin/sh

# From http://www.linuxweblog.com/bash-argument-numbers-check
EXPECTED_ARGS=1
E_BADARGS=65
if [ $# -gt $EXPECTED_ARGS ]
then
  echo "Usage: ./extract [starting_directory]" >&2
  exit $E_BADARGS
fi

# By default, start in the current working directory, but if they provide
# an argument, use that instead.
if [ $# -eq $EXPECTED_ARGS ]
then
    startingDir=$1
else
    startingDir="."
fi

# Start creating the HTML document
echo "<html><head></head><body>"
echo "<table border=1>"
echo "<tr><td>Location</td><td>Comment</td></tr>"

# The output of the find command will look like
# ./Telephone.java:20:    // todo: Document

find "$startingDir" -name "*.java" -exec grep -Hin todo {} + |
# Allows the script to read in piped in arguments
while IFS= read -r data; do

    # The location of the file is the first argument
    fileLoc=`echo "$data" | cut -d ":" -f 1`
    fileName=`basename "$fileLoc"`

    # the line number is the second
    lineNumber=`echo "$data" | cut -d ":" -f 2`

    # all arguments after the second colon are the comment.  Eliminate the TODO
    # text with a simple find and replace.
    # Note: only handles todo and TODO, would need some more logic to handle other cases
    # Note: each [ ] below holds a space followed by a literal tab (Ctrl-V Tab)
    comment=`echo "$data" | cut -d ":" -f 3- | sed -e 's/^[     ]*//' -e 's/[\/*]*[     ]*//' -e 's/TODO/todo/' -e 's/todo[:]*[     ]*//'`
    echo "<tr>"
    echo "  <td><a href=\"$fileLoc\">$fileName ($lineNumber)</a></td>"
    echo "  <td>$comment</td>"
    echo "</tr>"
done

# Finish off the HTML document
echo "</table>"
echo "</body></html>"

exit 0
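One possible way to cut down on the toDouble false positives, offered as a sketch rather than a tested change to the script above, is grep's -w flag, which only matches todo when it stands alone as a whole word. For the ToDo case, spelling out both cases of each letter in the sed expression avoids relying on a case-insensitive flag that not every sed has. The function names find_todos and strip_label are made up for illustration, and mktemp -d is assumed to be available.

```sh
#!/bin/sh
# Hypothetical variant of the find/grep step: -w restricts matches to
# whole words, so "toDouble" is skipped while "todo:" and "TODO" still match.
find_todos() {
    find "$1" -name "*.java" -exec grep -Hinw todo {} +
}

# Hypothetical variant of the last sed step: matches todo in any mix of
# upper and lower case, then an optional colon and any spaces (not tabs).
strip_label() {
    sed 's/[Tt][Oo][Dd][Oo]:* *//'
}

# Try both on a throwaway file.
dir=$(mktemp -d)
printf '%s\n' '// todo: Document' 'double d = a.toDouble();' > "$dir/Sample.java"
find_todos "$dir"
echo "ToDo: describe the class" | strip_label
rm -r "$dir"
```

Note that -w also skips words such as todos, so whether it is the right trade-off depends on how the comments in your code base are written.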

If you save this script as a .sh file, you will need to make it executable before you can run it. From the terminal:

chmod +x extract.sh
# Extract all the TODO comments in the Applications folder, and save it as an html table
# Redirect the printed HTML to an HTML document
./extract.sh /Applications > table.html

The source code for the script is available on GitHub. Running the script in my /Applications directory leads to the following HTML table:

Location                     Comment
Aquamacs (629)               return ((ObjectReference)val).toString(); //
Aquamacs (633)               return val.toString(); // not correct in all cases
Cycling (11)                 support joint operations on more than one channel.
Cycling (27)                 what about objects with more than one input?
Cycling (36)                 improve feedback math -- fixed point, like jit.wake?
Cycling (277)                theta shift?
Cycling (349)                double closest[] = new double[] {a[0].toDouble(), a[1].toDouble(), a[2].toDouble()};
Cycling (351)                double farthest[] = new double[] {a[0].toDouble(), a[1].toDouble(), a[2].toDouble()};
Cycling (5)                  describe the class
Cycling (22)                 implement with a Vector to improve performance
Cycling (8)                  abort a thread if an incoming message arrives before completion
Cycling (8)                  have the search happen in a separate thread
Cycling (9)                  possible to separate the errors that results from not
Cycling (191)                implement automatic replacement of shader name in prototype file
PGraphicsOpenGL.java (738)   make this more efficient and just update a sub-part
PGraphicsOpenGL.java (1165)  P3D overrides box to turn on triangle culling, but that’s a waste
PGraphicsOpenGL.java (1180)  P3D overrides sphere to turn on triangle culling, but that’s a waste
PGraphicsOpenGL.java (1508)  Should instead override textPlacedImpl() because createGlyphVector
PGraphicsOpenGL.java (2207)  this expects a fourth arg that will be set to 1
PGraphicsOpenGL.java (2847)  not optimized properly, creates multiple temporary buffers
PGraphicsOpenGL.java (2858)  is this possible without intbuffer?
PGraphicsOpenGL.java (2870)  remove the implementation above and use setImpl instead,
PGraphicsOpenGL.java (2978)  - extremely slow and not optimized.
PGraphicsOpenGL.java (738)   make this more efficient and just update a sub-part
PGraphicsOpenGL.java (1165)  P3D overrides box to turn on triangle culling, but that’s a waste
PGraphicsOpenGL.java (1180)  P3D overrides sphere to turn on triangle culling, but that’s a waste
PGraphicsOpenGL.java (1508)  Should instead override textPlacedImpl() because createGlyphVector
PGraphicsOpenGL.java (2207)  this expects a fourth arg that will be set to 1
PGraphicsOpenGL.java (2847)  not optimized properly, creates multiple temporary buffers
PGraphicsOpenGL.java (2858)  is this possible without intbuffer?
PGraphicsOpenGL.java (2870)  remove the implementation above and use setImpl instead,
PGraphicsOpenGL.java (2978)  - extremely slow and not optimized.

The complete result can be found as another GitHub gist.

Quick note: You have to be careful about what you echo in the shell. In an early version, I forgot to surround the text ($data) with quotes. This led to a problem when there were asterisks in the text, since the shell expanded the star into a list of all the files in the directory (aka file globbing). This is a relatively harmless problem; had the line had something like rm * instead, it would have been devastating. So make sure you surround your output text in quotes!

$ echo *
ApplicationTODO.html BlogPost.mkdown Find text.mkdown PGraphicsOpenGL.java TabTodo.java Test.html TodoTest.java appTable.html extract.sh tab tab.txt table body.html table.awk table.html table1.html test.java
$ echo "*"
*
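The same thing happens when the star is hidden inside a variable, which is exactly the situation in the script. The sketch below reproduces it in a scratch directory so the glob has known files to expand to; the function name show_globbing, the file names, and the sample text are all made up for illustration, and mktemp -d is assumed to be available.

```sh
#!/bin/sh
show_globbing() {
    dir=$(mktemp -d) || return 1
    (
        # Run in a subshell so the cd does not leak out of the function.
        cd "$dir" || exit 1
        touch a.txt b.txt

        # A line such as one might find inside a block comment.
        data='* todo: describe the class'

        echo $data      # unquoted: the star is replaced by a.txt b.txt
        echo "$data"    # quoted: the text is printed as-is
    )
    rm -r "$dir"
}

show_globbing
```

The first echo prints the two file names in place of the star, while the second prints the line unchanged.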

Conclusion

I have introduced the find command and how it can be used to locate files or directories on disk with certain properties (name, last modified date, etc). I then showed how grep can be used to search the contents of a file or stream for lines matching a regular expression. Next I showed you how to combine find with arbitrary Unix commands, including grep, by way of the -exec option. Finally I tied all these concepts together by creating a simple script which searches through all of the Java files in a directory for those lines that have TODO in them, and creates an HTML table summarizing the location of each of these tasks, alongside the TODO item text.

Categories: Uncategorized, unix

Divvy

June 10, 2010 1 comment

If you’re using a Mac, you owe it to yourself to try a great program called Divvy.

In a nutshell, Divvy lets you divide up your workstation as you see fit, without having to manually resize all the windows yourself.  For instance, at work I have two workspaces; one for Mail and Calendar split 50/50, and one for Terminal, NetBeans, TextMate, and Firefox, with Terminal and Firefox taking the largest space.  Manually positioning four windows without leaving gaps is nigh impossible, and time consuming to boot.  I bind Divvy to Ctrl + Shift + Spacebar, and can position all four windows just how I like in just seconds.

Protip: Hold down the command key while dragging to get a finer grid.

Protip #2: You can set keyboard shortcuts for different divisions of the space.  Press the little gear icon in the top right of the program and go to the Shortcuts tab.

Categories: Apple, UI

My ten essential Mac programs

March 24, 2010 3 comments
I recently had the good fortune of getting a new Macbook Pro at work and kept track of the first programs I downloaded and installed.  Here are the first ten applications I installed, as well as descriptions as to why they are so essential to my everyday work.

Firefox

Firefox is my browser of choice.  No big surprise there.  Not much to say except tabbed browsing is great.

Quicksilver

Quicksilver, if you are unfamiliar, is an application launcher for Mac OSX.  If you’re a fan of analogies:
Spotlight : Documents :: Quicksilver : Applications

That’s a bit simplistic, as Quicksilver can do more than just launch applications, but that’s 99% of what I use it for, so the analogy stands.

TinyGrab

TinyGrab is an amazingly simple screen shot app for both Windows and Mac OS X (I use it on both platforms and it works better on Mac).  After registering for an account, you keep the app running in the background.  Any time you take a screenshot via Command Shift 3 (full screen capture) or Command Shift 4 (area of screen or window capture), the picture is automatically uploaded to the service, and a small url to the picture is copied to your clipboard.  All of the icons you see here are hosted on TinyGrab’s servers and were uploaded near instantaneously.  I say it works better on Mac than Windows because the Mac one merely hooks onto the act of capturing a screenshot using the already excellent Mac tools; when you press the hotkey to take a picture on Windows, it has to use its own “clip this area of the screen” feature, and it doesn’t work quite as seamlessly as on the Mac.

R

You may have already seen my previous R posts; R is a programming language intended for statistics.  It has dozens of high-quality open source code modules from mathematicians and scientists from around the world.  It is a great tool for doing exploratory data analysis.

R can be used both interactively through the R Console program, as well as through scripts.

Omnigraffle

Omnigraffle is a great program for creating vector graphics on the Mac.  It’s intended for use as a diagramming and charting tool, but it’s very versatile and I could imagine uses far outside those domains.  The interface is extremely slick, and the quality of the output is second to none; you can instantly tell when something has been created by Omnigraffle by its extensive use of drop shadows. I used it frequently in college to create diagrams for embedding within computer science and math problem sets.  For instance, I illustrated linked lists and other data structures, saved the output as PNGs, and then included the PNGs within my documents.

Netbeans

If you’re programming in Java and you’re not using an IDE, you are wasting your time.  Netbeans and Eclipse are the two biggies in the Java world; I prefer Netbeans due to its great built-in keyboard macros.  By memorizing a few keyboard shortcuts, you can save dozens of keystrokes from commonly typed phrases.  For instance, declaring constants is usually quite verbose in Java:

public static final int BUFFER_SIZE = 1024;

With Netbeans you can shorten the 24 characters before the variable name to five keystrokes: Psfi -> TAB.  There is a whole raft of such shortcuts, and they are indispensable for easing the pain of Java’s verbosity.

Other great and essential features include the ability to automatically determine which modules need to be imported; this feature alone makes an IDE superior to a dumb text editor.  The other feature that immediately springs to mind is the ability to easily refactor code; you can change the name of a variable in one file and have it propagate to all files that reference it, rather than having to find and replace the string in all the files.

MacPorts

Unlike the other tools in this post, MacPorts is a command line utility.  I use it when I need to install some open source library or project and there is no installer available for my platform.  If there’s a port version of the software available, it handles all the dependency management, installs the libraries where they need to go, and updates all the necessary environment variables.

Textmate

Textmate is my text editor of choice for all things non-Java.  It makes it very easy to open a directory as a project and then jump around between files within it (with a very smart, intuitive search feature).  Just as Netbeans has tab code completions, so does Textmate.  Common shortcuts (“snippets”) are bundled up and distributed with the software; it is also easy to add your own.  It seems to be the de facto standard for web development (every Ruby on Rails developer I’ve ever met uses it).

Two main complaints:

  • Some strange default behavior: If I select a bunch of text and hit tab, I would expect that to indent the text rather than delete the contents of it.  Similarly for shift tab.  Instead, you must hit option tab and option shift tab (that’s a bit of a finger stretcher)
  • You cannot split a window and look at two sections of it at the same time.

WriteRoom

WriteRoom is the antithesis of Microsoft Word, or any modern text editor.  Whereas most programs throw feature after feature at you, WriteRoom strips it down to the barest of feature sets.  The minimalist nature extends to the presentation as well; when you boot it up you begin by staring at a full screen blank picture.  Text is monochrome green by default, though both the background and foreground colors can be changed.  By stripping all user interface elements out of the view, you are free to focus on the task of writing without any distractions.

Obviously this is not well suited to all tasks; if you are doing any sort of work in which you need to simultaneously reference other materials (e.g. look at a website or Excel spreadsheet at the same time), this is not for you.  But if you need to brainstorm something and get some thoughts down onto paper, this is a great choice.

There is a free Windows clone called Dark Room, and there is a similar product for the Mac in beta called OmmWriter.

MacTeX LaTeX distribution

LaTeX (unfortunately named for Google searching) is a typesetting language/program.  It’s used extensively by college professors and others looking for beautifully typeset text and equations.  Unlike Microsoft Word, composing a document using LaTeX is most certainly not WYSIWYG, but its creators see that as a feature and not a bug.  They claim that people waste an inordinate amount of time fiddling with fonts and presentation rather than content.  By formatting your work as a LaTeX document, you can render it in multiple different ways just by changing a template.

The MacTeX package includes LaTeXiT, TeXShop, and BibDesk, as well as a few other programs I never touch.

LaTeXiT is a small program for creating equations and other snippets to embed in other sources.

TeXShop is a full-fledged editor for LaTeX documents; if you’re doing any sort of serious document creation, you’re probably going to do it in TeXShop.  There’s nothing stopping you from composing your documents in any plaintext editor, but you will have to manually run the scripts that convert your text into PDF; TeXShop automates some of that hassle.

BibDesk is a program for managing bibliographic entries.

Adium

An excellent chat/IM client for Mac that supports all the big formats.  Recognize your favorite protocol from the icons it supports?

Why install a chat program on a work computer?  IM and chat are a big part of collaborative software development.

Conclusion

Some of these programs are fairly well known (Firefox, Adium, Netbeans), but I hope I have exposed you to some new programs.

Categories: Apple, Uncategorized