ack – Better than grep?
I stumbled onto a really nice command line tool named ack while reading a StackOverflow question yesterday. Living at the domain betterthangrep.com/, it purports to be, well, better than grep. Or, as they put it:
ack is a tool like grep, designed for programmers with large trees of heterogeneous source code
I’ve written previously about how to combine find and grep, and really, ack exists to obviate the use of find and grep. It ignores commonly ignored directories by default (e.g. all those .svn metadata folders that SVN insists on creating), and with a simple command line flag you can tell ack what sort of files you want searched. Furthermore, because it recurses by default, you don’t need to use the find command to traverse the tree.
Using the TODO example, a basic way of searching for the TODOs in all of our Java files is to use the command:
find . -name "*.java" -exec grep -i -n TODO {} \;
In ack, this is accomplished much more easily:
ack -i --java TODO
Furthermore, the matching results are highlighted right away, making it extremely apparent where the matches occur.
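For comparison, the find/grep version can be tightened a little on machines without ack. This is only a sketch: the search_todos function and the demo files are made up for illustration, and -H is an option of GNU and BSD grep rather than something POSIX guarantees. Ending -exec with + instead of \; hands many files to a single grep process, and -H prefixes each match with its file name.

```shell
#!/usr/bin/env bash
# Demo only: the function name, directory layout and file contents are invented.
search_todos() {
    # $1 is the directory to search; prints file:line:match for every TODO
    # (case-insensitive) found in a .java file underneath it.
    find "$1" -name "*.java" -exec grep -i -n -H TODO {} +
}

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src"
printf '// todo: handle nulls\nclass A {}\n' > "$demo_dir/src/A.java"
printf 'TODO: this is not a java file\n' > "$demo_dir/notes.txt"

# Only the match in A.java is reported; notes.txt is never handed to grep.
search_todos "$demo_dir"
rm -rf "$demo_dir"
```

It is still more typing than the ack one-liner, which is rather the point.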
I’m going to start using this at work and see if it can replace my grep/find hackery. Will let you know. Very impressed so far.
If you want to give it a try, the easiest way to install it is with MacPorts:
port install p5-app-ack
Quotes, quotes, quotes: A primer for the command line
In Bash programming, there are a lot of ways to get input into programs. In particular, there are a slew of different quoting methods you should understand. This article provides a quick reference to the differences between using no quotes, double quotes, single quotes, and back ticks.
No quotes
Standard shell scripts assume arguments are space delimited. You can iterate over elements in this way:
for i in Hi how are you; do echo $i; done
Hi
how
are
you
This is why it is a problem to have spaces in your file names. For instance,
$ ls
with spaces.txt
$ cat with spaces.txt
cat: with: No such file or directory
cat: spaces.txt: No such file or directory
Here I naively typed with spaces.txt thinking the cat program could handle it. Instead, cat saw two arguments: with, and spaces.txt. In order to handle this, you can either escape the space,
$ cat with\ spaces.txt
or use the double quotes method. (Note that if you use tab autocompletion, the backslash escape will be added automatically)
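One way to see what the shell is doing is to count the arguments a command actually receives. The sketch below uses a hypothetical helper, count_args, which is not a standard command; it simply prints the number of arguments it was given.

```shell
#!/usr/bin/env bash
# count_args is a made-up helper for this demo: it reports how many
# arguments the shell passed to it after word splitting.
count_args() {
    echo "$#"
}

count_args with spaces.txt      # unquoted: prints 2
count_args with\ spaces.txt     # escaped space: prints 1
count_args "with spaces.txt"    # double quoted: prints 1
```

The file does not even need to exist here, because the splitting happens before any command gets to look at its arguments.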
Double quotes
Double quotes can be used when you want to group multiple space delimited words together as a single argument. For instance
for i in "Hi how" "are you"; do echo $i; done
Hi how
are you
In the previous example, I could do
$ cat "with spaces.txt"
and the filename would be passed as a single unit to cat.
An important thing to note is that shell variables are expanded within double quotes.
name=Frank; echo "Hello $name"
Hello Frank
This is crucial to understand. It also allows you to solve problems caused by having spaces in file names, especially when combined with the * globbing behavior of the shell. For instance, let’s say we wanted to iterate over all the text files in a directory and do something to them.
$ ls
with spaces.txt
withoutspaces.txt
$ for i in *.txt; do cat $i; done
cat: with: No such file or directory
cat: spaces.txt: No such file or directory
# Surround the $i with quotes and our space problem is solved.
$ for i in *.txt; do cat "$i"; done
(Yes I know iterating over and calling cat on each argument is silly, as cat can accept a list of files (e.g. *.txt). But it illustrates the point that commands will be confused by spaces in the name and should use double quotes to handle the problem).
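If you want to try this without cluttering a real directory, here is a small self-contained sketch. The cat_all_txt function, the file names and the file contents are all invented for the demo; the only thing that matters is that the loop variable is wrapped in double quotes.

```shell
#!/usr/bin/env bash
# Demo only: function name, file names and contents are invented.
cat_all_txt() {
    # Print every .txt file in directory $1. Quoting "$f" keeps a name
    # like "with spaces.txt" together as a single argument to cat.
    local f
    for f in "$1"/*.txt; do
        cat "$f"
    done
}

demo_dir=$(mktemp -d)
echo "first"  > "$demo_dir/with spaces.txt"
echo "second" > "$demo_dir/withoutspaces.txt"

# Both files are printed, with no "No such file or directory" errors.
cat_all_txt "$demo_dir"
rm -rf "$demo_dir"
```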
Double quotes are also good when you need to embed single quotes in a string (you do not need to escape them). Literal double quotes, on the other hand, have to be escaped with a backslash:
$ echo "'Single quotes'"
'Single quotes'
$ echo "\"Escaped quotes\""
"Escaped quotes"
Double quotes are my default while I’m working in the terminal.
Single quotes
Single quotes act just like double quotes except that the text inside of them is interpreted literally; in other words, the shell does not attempt to do any more expansion or substitution. For instance,
$ name=Frank; echo 'Hello $name'
Hello $name
This can save you some backslash escaping you normally would have to do.
Use it when:
- You need double quotes embedded in your string
$ echo '"How are you doing?", she said'
"How are you doing?", she said
- You do not need any literal single quotes in your string (a single quote cannot appear inside a single-quoted string, not even with a backslash in front of it, so apostrophes are awkward to include)
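For the rare case where an apostrophe really has to go into an otherwise single-quoted string, the usual workaround is to close the quotes, add the apostrophe by other means, and reopen them. A minimal sketch of the two common spellings (the variable names are just for the demo):

```shell
#!/usr/bin/env bash
# Close the single-quoted string, add a backslash-escaped quote, reopen it.
escaped='It'\''s here'

# Or wrap just the apostrophe in double quotes. Adjacent quoted pieces are
# joined into one word by the shell.
mixed='It'"'"'s here'

echo "$escaped"    # prints: It's here
echo "$mixed"      # prints: It's here
```

Both work, but neither is pleasant to read, which is why I reach for double quotes whenever the string contains an apostrophe.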
Back ticks
Back ticks (`, the key to the left of the 1 and above the Tab key on a standard US keyboard) allow you to substitute in the output of another command. For instance:
$ current_dir=`pwd`
$ echo $current_dir
/Users/nicholasdunn/Desktop/Scripts
This can be combined with the double quotes, but will be treated as literal characters in the single quotes:
$ echo "`pwd`"
/Users/nicholasdunn/Desktop/Scripts
$ echo '`pwd`'
`pwd`
Use it when:
- You want to capture the results of another command, usually for the purpose of assigning them to a variable.
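Worth knowing, although it goes slightly beyond the back tick itself: the shell also accepts $(command) as another spelling of the same substitution, and that form is easier to nest. A short sketch, where example_path is an arbitrary path chosen for illustration (it does not need to exist on your machine):

```shell
#!/usr/bin/env bash
# Both spellings capture the output of pwd.
old_style=`pwd`
new_style=$(pwd)
[ "$old_style" = "$new_style" ] && echo "both forms captured the same directory"

# $(...) nests without any extra escaping. With back ticks, the inner pair
# would have to be written as \` to achieve the same thing.
example_path="/usr/local/bin"
parent_name=$(basename "$(dirname "$example_path")")
echo "$parent_name"    # prints: local
```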
Hopefully this brief tour through the different types of quotes in bash has been useful.
Maybe this is why people are afraid of the command line?
I think one of my favorite quotes sums this up nicely:
A wealth of information creates a poverty of attention – Herbert Simon
Args4j library for parsing Java command line arguments
args4j is a great tool for parsing command line arguments in Java.
Here’s the description from its homepage:
- It makes the command line parsing very easy by using annotations.
- You can generate the usage screen very easily.
- You can generate HTML/XML that lists all options for your documentation.
- Fully supports localization.
- It is designed to parse javac like options (as opposed to GNU-style where ls -lR is considered to have two options l and R.)
- It is licensed under the MIT license.
It is fairly simple to use:
1. Create a class holding all the options you wish to parse.
2. Annotate the fields of the class, telling args4j about the command line arguments (which flags are required, and what the short and long ways of specifying each argument are).
3. In your main method, create an instance of your options holder class.
4. Pass the instance from step 3 into an instance of the CmdLineParser class, and then call parseArgument on the array of strings passed into the main method.
5. If no parse exceptions are thrown, your options object has all of the required options fields filled in, which can then be queried via normal getters.
The sample main class provided by the args4j folks is especially useful; in this example they do not create a separate class to hold the options, but that is certainly an option.
I’m really not sure how this stacks up to other command line parsing tools for Java. Suffice it to say it’s small and fast, and the annotations make it a breeze to specify a set of names for different named arguments. It just works. Please use this instead of trying to roll your own options parsing code.
Unix tip #1: advanced mkdir and brace expansion fun
If you don’t know all the ins and outs of the mkdir command, you are probably expending more effort than necessary. Imagine this fairly common use case:
You are in a folder and want to create one folder which has 3 sub folders. Let’s call the main folder Programming and its 3 sub folders Java, Python, and Scala. Rendered via tree, this looks like:
Programming/
|-- Java
|-- Python
`-- Scala
A first pass at accomplishing this would be to create the Programming folder, and then the three individual folders underneath
$ mkdir Programming
$ mkdir Programming/Java
$ mkdir Programming/Python
$ mkdir Programming/Scala
This certainly works, but it takes four commands.
Let’s see if we can’t do better. Delete those folders with the command
rm -rf Programming/
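This will delete the Programming folder and all subfolders underneath it (the r flag is for recursive, which is what allows nonempty directories to be removed; the f flag forces the removal without prompting and without complaining about files that do not exist).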
Like most unix commands, the mkdir command can take multiple arguments, separated by spaces. So the three separate commands to create Java, Python, and Scala can be put onto one line.
mkdir Programming; mkdir Programming/Java Programming/Python Programming/Scala
Note the ; separator between the two commands. We need to create the Programming folder before we can create the subfolders.
This is better but still too verbose. It would be nice to remove the mkdir Programming call; we’d like to be able to create an arbitrarily nested folder and have mkdir create all the parent folders automatically. Fortunately there is a way to do this: the -p flag of mkdir does exactly this.
-p Create intermediate directories as required. If this option is not specified, the full path prefix of each operand must already exist. On the other hand, with this option specified, no error will be reported if a directory given as an operand already exists. Intermediate directories are created with permission bits of rwxrwxrwx (0777) as modified by the current umask, plus write and search permission for the owner.
Thus we can change our command to
mkdir -p Programming/Java Programming/Python Programming/Scala
This is better but still not perfect; we’re repeating ourselves 3 times with the Programming call. Enter an absurdly useful Bash shell construct known as brace expansion.
echo {5,6,7}
5 6 7
The arguments within braces are expanded as if they had been written out separated by spaces. That wouldn’t be terribly useful except that any text immediately before (or after) the braces is attached to each element as well:
echo hello{5,6,7}
hello5 hello6 hello7
This brace expansion can be used anywhere, since the textual substitution happens before the arguments are passed into other processes. So, combining this with what we saw earlier, we can put Java, Python and Scala into a list and prepend it with Programming:
echo Programming/{Java,Python,Scala}
Programming/Java Programming/Python Programming/Scala
That should look very familiar. Putting it in place of the earlier mkdir command we get the elegant one liner
mkdir -p Programming/{Java,Python,Scala}
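To play with the one-liner safely, here is a sketch that runs it inside a temporary directory. The make_tree function and the src and test folder names are my own inventions for the demo; the mkdir -p and brace expansion behaviour is exactly what was described above.

```shell
#!/usr/bin/env bash
# Demo only: everything is created inside a temporary directory.
make_tree() {
    # $1 is the base directory. Brace expansion turns the single word into
    # one path per language, and -p creates Programming/ along the way.
    mkdir -p "$1"/Programming/{Java,Python,Scala}
}

demo_dir=$(mktemp -d)
make_tree "$demo_dir"
make_tree "$demo_dir"    # running it again is not an error, thanks to -p

# Brace lists can be chained; src and test are invented names.
mkdir -p "$demo_dir"/Programming/{Java,Scala}/{src,test}

# List every directory that now exists under Programming.
find "$demo_dir/Programming" -type d | sort
rm -rf "$demo_dir"
```

Note that the braces are left outside the double quotes. Quoting them would turn them back into literal characters.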
Newer versions of bash (3.0 and later) also support numerical ranges within the braces:
echo {1..10}
1 2 3 4 5 6 7 8 9 10
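A few more variations, as a quick sketch. These assume a bash recent enough to support ranges, and the file names are made up for the example.

```shell
#!/usr/bin/env bash
echo chapter{1..3}.txt    # prints: chapter1.txt chapter2.txt chapter3.txt
echo {a,b}{1,2}           # prints: a1 a2 b1 b2
echo {a..e}               # prints: a b c d e
echo backup.tar{,.old}    # prints: backup.tar backup.tar.old
```

The last form, with an empty first element, is handy for commands that take an old name and a new name, since the common part only has to be typed once.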
Conclusion:
I have shown you how to create all the parent directories using the mkdir command, and introduced you to the brace expansion feature of Bash. The latter is extremely powerful, and can be used to great effect within scripts.
Note: The arguments within the braces must have NO spaces before or after the commas in order for the brace expansion to work.
$ echo {5, 6, 7}
{5, 6, 7}
$ echo {5,6,7}
5 6 7