Multiple File Renaming With sed
Of course, there are many tools that specialize in renaming files, but none of them is quite as powerful as sed. In fact, sed does much more than just rename files. After all, sed is a stream editor, whose job is to edit incoming text any way you like.
We will cover a very simple example, which doesn’t do sed much justice, but I think a more complex example is not very realistic for everyday use. Anyway, let’s get started.
$ pwd
/tmp/test
We have a test directory in /tmp/test. There we’ve compiled a collection of excellent reading material.
$ ls
[test] 01 Interesting.txt
[test] 02 Insightful.txt
[test] 03 Fabulous.txt
[test] 04 Terrific.txt
[test] 05 Masterful.txt
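If you want to follow along, something like the sketch below should recreate the listing. It is only an assumption about the setup: it uses a scratch directory from mktemp rather than /tmp/test itself, and the files are empty stand-ins for the reading material.

```shell
# Create a scratch directory (mktemp picks the path; /tmp/test is not touched).
dir=$(mktemp -d)
cd "$dir" || exit 1

# Recreate the five sample files. The names contain brackets and spaces,
# so every name is quoted.
n=1
for title in Interesting Insightful Fabulous Terrific Masterful; do
    touch "[test] 0$n $title.txt"
    n=$((n + 1))
done

ls
```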
Each file is called [test] NN Something.txt where NN is the file number. We want to remove the [test] part, swap the number and the textual part of the file name, and change the extension to .text. All in one go, right? :)
Of course, we won’t do everything right away, because we want to demonstrate the method here. First, let’s strip the [test] part:
$ ls | sed 's/\[test\] //'
01 Interesting.txt
02 Insightful.txt
03 Fabulous.txt
04 Terrific.txt
05 Masterful.txt
Ok, that was simple enough. When matching the square brackets, you need to escape them. So [ becomes \[, and ] becomes \]. Left unescaped, [test] would be read as a character set that matches a single t, e or s. Note the space after \]. It matches a real (literal) space. It’s part of the prefix, so we will also match that. Now, we also want to keep the prefix. I’ll show you why later, but bear with me for now.
$ ls | sed 's/\(\[test\] \)//'
To capture a part of the regex match, you have to enclose the pattern that will match that part in escaped round brackets. You will notice that we haven’t changed anything with this, but the captured part is saved for us so we can use it later. Let’s change the positions of the number part, and the text part.
$ ls | sed 's/\(\[test\] \)\(0[1-5]\)\( \)\([a-zA-Z]\+\)/\4 \2/'
Interesting 01.txt
Insightful 02.txt
Fabulous 03.txt
Terrific 04.txt
Masterful 05.txt
Woah! That’s a long-ass regex! Ok, let’s break it apart:
01: s/
02: \(\[test\] \)
03: \(0[1-5]\)
04: \( \)
05: \([a-zA-Z]\+\)
06: /
07: \4 \2
08: /
The first line is the start of the substitution, standard sed stuff. L2 is the prefix match, as we said before. L3 is the second group, and it matches the number: a single 0 (zero), followed by a digit between 1 and 5 inclusive. It’s also captured for later use. L4 captures a single space character (the one between the number and the text). L5 matches a sequence of one or more letters. Note that the \+ sequence means ‘one or more’; the plus sign has to be escaped, and this form is a GNU sed extension, so other sed implementations may not accept it. L6 is the standard sed separator, like the one in L1. L7 is the replacement. We used the 4th captured group, \4, which corresponds to the textual title of the file, and the second group, \2, corresponding to the two digits. Note that the space between the two items does not have to be escaped. Finally, L8 is the separator that ends the expression.
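If all those backslashes hurt your eyes, the same breakdown can be sketched with extended regular expressions. This assumes your sed accepts the -E option (GNU and BSD sed do); the name variable is just a stand-in for one line of ls output.

```shell
# One sample name, fed to sed the same way ls would feed it.
name='[test] 01 Interesting.txt'

# -E switches sed to extended regular expressions (assumed available):
# groups are plain ( ) and "one or more" is a plain +.
# The literal square brackets still need escaping.
swapped=$(printf '%s\n' "$name" |
    sed -E 's/(\[test\] )(0[1-5])( )([a-zA-Z]+)/\4 \2/')

printf '%s\n' "$swapped"
```

The group numbers stay the same, so \4 and \2 in the replacement mean exactly what they meant before.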
Ok, let’s change the extension.
$ ls | sed 's/\(\[test\] \)\(0[1-5]\)\( \)\([a-zA-Z]\+\)\(.*\)/\4 \2.text/'
Interesting 01.text
Insightful 02.text
Fabulous 03.text
Terrific 04.text
Masterful 05.text
We have added one more captured group (the 5th group) that contains ‘everything else’. The * (star or asterisk) character means zero or more of the previous character or group. A single unescaped period, ., means ‘any character’. We add the extension manually.
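Here is a small sketch of the difference between an unescaped and an escaped period. The sample strings are made up for illustration; the last line shows one possible way to touch only the extension.

```shell
# An unescaped dot matches any single character, so "a.c" matches "abc" too.
any=$(printf '%s\n' 'abc' | sed 's/a.c/X/')

# An escaped dot matches only a literal period, so "abc" is left alone.
literal=$(printf '%s\n' 'abc' | sed 's/a\.c/X/')

# Anchoring an escaped ".txt" to the end of the line swaps just the extension.
ext=$(printf '%s\n' 'Interesting 01.txt' | sed 's/\.txt$/.text/')

printf '%s\n' "$any" "$literal" "$ext"
```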
So how come the prefix and extension disappeared? Well, actually, sed only replaces the part of the incoming string that is matched. In other words, if it’s not matched, it remains intact. If you want to mess with any part of the incoming string, you first have to match it. In the example before the last one, we still had the .txt extension, because it was not matched. Now that it is, it disappears, so we can replace it with our own extension. It may seem a bit counter-intuitive at first. I hope I’m making sense, though.
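A tiny experiment makes this visible. It is only a sketch on a single sample name, with NN as an arbitrary replacement string.

```shell
name='[test] 01 Interesting.txt'

# Only the two digits are matched, so only they are replaced;
# the prefix, the title and the extension pass through untouched.
partial=$(printf '%s\n' "$name" | sed 's/0[1-5]/NN/')

# Adding .* on both sides makes the match cover the whole line,
# so the replacement now stands in for everything.
whole=$(printf '%s\n' "$name" | sed 's/.*0[1-5].*/NN/')

printf '%s\n' "$partial" "$whole"
```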
You might be asking “But all of this just builds a new list of files, and the original files are not renamed?” Yes, that is the case. So let’s rename the files, shall we?
$ ls | sed 's/\(\[test\] \)\(0[1-5]\)\( \)\([a-zA-Z]\+\)\(.*\)/mv ...
... "\1\2\3\4\5" "\4 \2.text"/'
mv "[test] 01 Interesting.txt" "Interesting 01.text"
mv "[test] 02 Insightful.txt" "Insightful 02.text"
mv "[test] 03 Fabulous.txt" "Fabulous 03.text"
mv "[test] 04 Terrific.txt" "Terrific 04.text"
mv "[test] 05 Masterful.txt" "Masterful 05.text"
Note that I’ve broken the command in half because of the space constraints on this blog. Keep in mind that the whole command has to be on a single line, without the ... parts.
What we have here is, we’ve transformed the original strings into shell commands. It’s still just text, but with a bit of a twist, it can be executed. Note that we’re using all 5 captured matches to build the original file name, while keeping them separate so we can build the new one. We have also enclosed the file names in double quotes "...", so we don’t have to escape the spaces.
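To stress that nothing has been executed yet, here is a sketch that keeps the generated commands in a variable so you can look at them first. It assumes GNU sed (because of \+), and the names variable is a stand-in for the ls output. One caveat: double quotes take care of spaces, but the shell still interprets $, backticks, backslashes and embedded double quotes inside them, so file names containing those would need more care.

```shell
# Stand-in for the ls output: two of the sample names, one per line.
names='[test] 01 Interesting.txt
[test] 02 Insightful.txt'

# Same substitution as above, written on one line. Nothing is executed here;
# the result is plain text that merely looks like shell commands.
cmds=$(printf '%s\n' "$names" |
    sed 's/\(\[test\] \)\(0[1-5]\)\( \)\([a-zA-Z]\+\)\(.*\)/mv "\1\2\3\4\5" "\4 \2.text"/')

printf '%s\n' "$cmds"
```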
And now the final ingredient: execution.
$ ls | sed 's/\(\[test\] \)\(0[1-5]\)\( \)\([a-zA-Z]\+\)\(.*\)/mv ...
... "\1\2\3\4\5" "\4 \2.text"/' | sh
We pipe everything through sh so it gets executed. The result is just as we expected:
$ ls
Fabulous 03.text
Insightful 02.text
Interesting 01.text
Masterful 05.text
Terrific 04.text
So, not only have we renamed the files, but we’ve also screwed up the ordering. Good going! Now, the regex to reverse this is an exercise for you. :) Let me give you a hint: you only need 3 captures.
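If piping generated text into sh makes you nervous, one alternative (not the method used above) is a loop in which sed only computes the new name and mv is called directly. This is a sketch under a few assumptions: it runs in a scratch directory from mktemp, it uses two sample files, and the shortened pattern with two captures is my own simplification.

```shell
# Work in a scratch directory so nothing real gets renamed.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch '[test] 01 Interesting.txt' '[test] 02 Insightful.txt'

# The glob replaces ls; sed only computes the new name, and mv is
# called directly, so no generated text is ever handed to sh.
for old in '[test] '*.txt; do
    new=$(printf '%s\n' "$old" |
        sed 's/\[test\] \(0[1-5]\) \([a-zA-Z]*\)\.txt$/\2 \1.text/')
    # Skip names the pattern did not change.
    [ "$new" = "$old" ] || mv -- "$old" "$new"
done

ls
```

Because the file names never pass through a second shell, quoting problems in the generated text cannot bite you here.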
I hope this has been fun for you, and I hope you have noticed one big advantage of piping stuff through sed: it can transform an innocent little list of files into a powerful weapon of mass execution. OK, that’s a bit overdramatized. You can turn text into shell commands, that’s the point. Have fun and happy hacking.




