
Unix tip #2: explicit for loops, command substitution

A lot of unix commands are designed to operate on a large number of files at once.  For instance, the move file command, mv, has as its (simplified) arguments

mv file1 [file2 ... fileN] destination

It’s for that reason that you can do something like

mv *.jpg ~/Desktop/Images

and have all the jpegs in the current working directory moved to the Images folder on the Desktop.  So many unix commands are set up that way that you might never need to know how to iterate (loop) explicitly.  I was forced to learn this syntax when I had a large number of .tar.gz files to decompress (these are roughly equivalent to zip files, for those more familiar with Windows land).

The command I usually use to extract the contents of a zipped file is

tar -xzvf /path/to/tar/file

This expects a single file path; doing something like tar -xzvf *.tar.gz will not work.

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
foo.tar.gz  foo2.tar.gz
Nick@Macintosh-2 ~/Desktop/TarExample$ tar -zxvf *.tar.gz
tar: foo2.tar.gz: Not found in archive
tar: Error exit delayed from previous errors
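
One way to see what goes wrong is to print the argument list the shell builds before tar ever runs.  The sketch below is only an illustration: the scratch directory from mktemp and the empty placeholder files are assumptions standing in for the real archives, and printf is used simply to show one argument per line.

```shell
# Illustration only: empty placeholder files stand in for the real archives.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch foo.tar.gz foo2.tar.gz

# The shell expands *.tar.gz before the command runs; printf then prints
# each resulting argument on its own line.
args=$(printf '%s\n' tar -xzvf *.tar.gz)
echo "$args"

# Count how many file names the wildcard turned into.
set -- *.tar.gz
nfiles=$#
echo "number of file arguments: $nfiles"
```

tar receives two file names here.  Because of the -f flag it takes the first one as the archive to open, and it treats the second as the name of a member to extract from that archive, which is why it complains that foo2.tar.gz is "Not found in archive".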

As you can see, this isn’t going to work.  Instead we need an explicit loop.  The general syntax is

for i in [iterable]; do [command with variable $i]; done

For instance,

Nick@Macintosh-2 ~/Desktop/TarExample$ for i in 1 2 3 4 5; do echo $i; done

Recalling our last unix tip, we could replace this with

 Nick@Macintosh-2 ~/Desktop/TarExample$ for i in {1..5}; do echo $i; done

The iterable list is whitespace separated.  This is very important for what I’m about to show to you next.
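
A small sketch of what "whitespace separated" means in practice.  The variable names and values are made up for illustration, and the default field separator (IFS) is assumed.

```shell
# Made-up values; note the run of three spaces between beta and gamma.
list="alpha beta   gamma"

count=0
for i in $list; do             # unquoted on purpose, so the shell splits it
  count=$((count + 1))
  echo "item $count: $i"
done

quoted=0
for i in "$list"; do           # quoted: the whole string is a single item
  quoted=$((quoted + 1))
done
echo "unquoted: $count items, quoted: $quoted item"
```

Any run of spaces, tabs or newlines counts as one separator.  The same splitting applies to anything substituted into the list, so a file name containing a space would be seen as two items.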

If you’re familiar with basic Unix functionality, you know that you list the contents of a directory with the ls command.  Let’s do that here.

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
foo.tar.gz  foo2.tar.gz

If you’ll notice, these are exactly the filenames we need to pass into the tar command.  Let’s try with echo first.

for i in ls; do echo $i; done
ls

Well, that didn’t work.  What’s going on?  It turns out you need to put backticks (the key to the left of the 1 key) around the ls command; otherwise bash treats it as the literal text "ls" rather than as a command to run.

for i in `ls`; do echo $i; done
foo.tar.gz
foo2.tar.gz

We can go ahead and replace the echo command with our tar command:

 Nick@Macintosh-2 ~/Desktop/TarExample$ ^echo^tar -xzvf

 for i in `ls`; do tar -xzvf $i; done

Because of the -v (verbose) flag, tar lists each file as it is extracted from the .tar.gz files.

But what is that

 ^echo^tar -xzvf

syntax?  This is another neat feature of bash: ^old^new repeats the last command with the first occurrence of the text old replaced by new.  I could just as easily have hit the up key, moved my cursor, deleted echo, and typed tar -xzvf in its place, but this is faster for me to type.

Just for another example,

 Nick@Macintosh-2 ~/Desktop/TarExample$ echo "Hello"
 Hello
 Nick@Macintosh-2 ~/Desktop/TarExample$ ^Hello^World
 echo "World"
 World

In practice, I would not use `ls` on its own: what happens if there are things other than .tar.gz files in the directory?  We’d be calling the tar command with the wrong arguments.  Instead we want the loop to touch only the files ending in .tar.gz; this is a place where the * wildcard comes in handy.

Nick@Macintosh-2 ~/Desktop/TarExample$ ls
a.txt       b.txt       foo.tar.gz  foo2.tar.gz

 Nick@Macintosh-2 ~/Desktop/TarExample$ ls *.tar.gz
 foo.tar.gz  foo2.tar.gz

So I can use this in my earlier command,

 for i in `ls *.tar.gz`; do tar -xzvf $i; done
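
As an aside, backticks are the older spelling of command substitution.  POSIX shells, bash included, also accept $(...), which does the same thing and is easier to nest.  A minimal sketch, assuming a scratch directory and empty placeholder files (both made up for illustration):

```shell
# Placeholder files in a scratch directory; none of them are real archives.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt foo.tar.gz foo2.tar.gz

old_style=`ls *.tar.gz`       # backticks
new_style=$(ls *.tar.gz)      # $(...) form, same result

count=0
for i in $(ls *.tar.gz); do   # drop-in replacement for the backtick version
  count=$((count + 1))
  echo "$i"
done
```

Both forms run ls and paste its output into the command line, so which one to use is mostly a matter of readability.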

Note that you can avoid the use of backticks if you use plain wildcard expansion:

 for i in *.tar.gz; do tar -xzvf $i; done

The reason this works without backticks is that ls is a command, which the shell has to run in order to capture its output, whereas * is a pattern that the shell itself expands into the matching filenames before the loop starts.  This expansion is known as globbing.
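
The wildcard form has a second advantage: each matching file name arrives as a single item, even if it contains spaces, as long as $i is quoted when it is used.  The sketch below is one way to try this safely rather than a definitive recipe.  It assumes a scratch directory from mktemp, and the archive and file names are invented for the demonstration; -v is left out to keep the output short.

```shell
# Sketch only: build two throwaway archives, then extract them in a loop.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "first"  > a.txt
echo "second" > "b file.txt"           # name with a space, on purpose
tar -czf foo.tar.gz a.txt
tar -czf "foo 2.tar.gz" "b file.txt"
rm a.txt "b file.txt"

extracted=0
for i in *.tar.gz; do
  [ -e "$i" ] || continue              # no match: the pattern is left as literal text
  tar -xzf "$i"                        # quotes keep "foo 2.tar.gz" as one argument
  extracted=$((extracted + 1))
done
echo "extracted $extracted archives"
```

The [ -e "$i" ] check covers the case where nothing matches: by default the shell then leaves *.tar.gz in the list as literal text, and the loop would otherwise hand that text to tar.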

Nested for loops

I haven’t had a need to nest for-loops yet, but you can if you wish.

for i in {1,2,3}; do for j in {3,4,5}; do echo $i $j; done; done
1 3
1 4
1 5
2 3
2 4
2 5
3 3
3 4
3 5


I have shown you how to explicitly iterate over lists in bash, how to use wildcard matching to restrict the set of files a loop operates on, and how to replace one piece of the last command with another.  In most cases you will not need to explicitly iterate over lists, due to the way many unix commands are written, but it’s a useful skill to have nonetheless.
