How to remove “smart” quotes from a text file
If you’ve copied and pasted text from Microsoft Word, chances are it contains so-called smart quotes. Some programs don’t handle these characters very well. You can turn them off in Word, but if you’re trying to remedy the problem after the fact, sed is your old friend. I’ll show you how to replace the curly double quotes with the traditional straight quote.
Recall that you can do global find/replace by using sed.
sed 's/[”“]/"/g' File.txt
This won’t actually change the contents of the file; it only prints the result. You can save that result to a new file:
sed 's/[”“]/"/g' File.txt > WithoutSmartQuotes.txt
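If you want to try this on something harmless first, here is a minimal sketch. The temporary directory and the sample sentence are made up for illustration, and it assumes the file is UTF-8. It also uses one expression per character instead of a bracket expression, which sidesteps differences in how sed handles multibyte characters inside brackets under different locales.

```shell
# Hypothetical demo: build a small sample file in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'She said “hello” to me.' > "$tmpdir/File.txt"

# One -e expression per curly quote; each matches the literal character.
sed -e 's/“/"/g' -e 's/”/"/g' "$tmpdir/File.txt" > "$tmpdir/WithoutSmartQuotes.txt"

# Show the converted text; the original file is left untouched.
cat "$tmpdir/WithoutSmartQuotes.txt"
```

The original File.txt still contains the curly quotes, so you can compare the two files side by side.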
If you wish to edit the file in place, overwriting the original contents, you would do
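sed -i.bk 's/[”“]/"/g' File.txt
This tells the sed command to make the change “in place”, while backing up the original file to File.txt.bk in case anything goes wrong. Write the suffix directly after -i, with no space: that form is accepted by both GNU sed and the BSD sed that ships with macOS, whereas GNU sed would treat a separate ".bk" argument as the script.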
To fix the smart quotes in all the text files in a directory, do the following:
for i in *.txt; do sed -i.bk 's/[”“]/"/g' "$i"; done
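A variant you might prefer is to touch only the files that actually contain curly quotes, so untouched files don’t get a backup. The following is a rough sketch rather than a definitive script: fix_quotes_in_dir is a name I made up, it assumes UTF-8 text, and it looks only at *.txt files directly inside the given directory.

```shell
# fix_quotes_in_dir DIR (hypothetical helper):
# straighten curly double quotes in DIR/*.txt, keeping a .bk backup
# only for files that are actually modified.
fix_quotes_in_dir() {
    for f in "$1"/*.txt; do
        # If the glob matched nothing, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        # Skip files without curly double quotes: no edit, no backup.
        grep -q -e '“' -e '”' "$f" || continue
        sed -i.bk -e 's/“/"/g' -e 's/”/"/g' "$f"
    done
}
```

You would call it as fix_quotes_in_dir . to process the current directory.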
When the command finishes, you will have double the number of text files in the directory, because each one now has a backup. Once you’ve confirmed that the changes are correct (do a diff File.txt File.txt.bk to see the difference), you can delete all the backup files with rm *.bk.
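Running diff by hand gets tedious with many files. As a sketch, a small helper can list which files differ from their backups, so you know where to look before deleting anything. The function name list_changed is my own invention, and it assumes the backups follow the File.txt.bk naming used above.

```shell
# list_changed DIR (hypothetical helper):
# print each .txt file in DIR whose contents differ from its .bk backup.
list_changed() {
    for backup in "$1"/*.txt.bk; do
        # If there are no backups, the glob stays unexpanded; skip it.
        [ -e "$backup" ] || continue
        original="${backup%.bk}"          # strip the .bk suffix
        # cmp -s is silent and returns non-zero when the files differ.
        if ! cmp -s "$backup" "$original"; then
            printf '%s\n' "$original"
        fi
    done
}
```

Running list_changed . prints one file name per line; you can then diff just those files against their backups.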