Sed stands for “Stream EDitor”: it is a free and open source utility installed by default on virtually all Linux and Unix-based operating systems. It performs text manipulation on files, can be used as part of a pipeline, and supports regular expressions. In this tutorial, we learn the basics of the sed substitute command.
In this tutorial you will learn:
- The sed “substitute” command basic syntax
- How to use backreferences
- Some of the most used “substitute” command flags
- How to modify a file in place and optionally create a backup of it
| Category | Requirements, Conventions or Software Version Used |
|---|---|
| System | Distribution agnostic |
| Software | sed |
| Other | None |
| Conventions | # – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command; $ – requires given Linux commands to be executed as a regular non-privileged user |
The “substitute” command
Substitute is probably the best known and most used sed command: it lets us replace text patterns in a file non-interactively. To learn how to use it, the first thing to do is take a look at its syntax. Let’s see an example: suppose we have a file called lotr.txt containing the famous ring poem written by John Ronald Reuel Tolkien:
Three Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them all, One Ring to find them,
One Ring to bring them all, and in the darkness bind them,
In the Land of Mordor where the Shadows lie.
Now, imagine we want to replace all occurrences of the word “Ring” with the word “foo” without using a full-fledged text editor, perhaps from a shell script. To do this with sed, we would run:
$ sed 's/Ring/foo/g' lotr.txt
As soon as we launch the command, the processed content of the file will appear on the standard output:
Three foos for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One foo to rule them all, One foo to find them,
One foo to bring them all, and in the darkness bind them,
In the Land of Mordor where the Shadows lie.
As expected, each occurrence of the word “Ring” was replaced with “foo”; the original file, however, remained unchanged: this is because, by default, sed simply writes to standard output. Let’s analyze what we did in the example above. The sed utility accepts a series of commands, each represented by a single letter and, in some cases, followed by arguments. In this case, we used the s command (substitute). The syntax of the command is the following:
s/regexp/replacement/flags
The utility tries to match the specified regular expression (“regexp”) against each line of the specified file (or stream); the matched text is substituted with “replacement”. By default only the first match in each line is replaced, unless a flag says otherwise.
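Because sed reads standard input when no file name is given, the same syntax also works in a pipeline. The following is a minimal sketch; the sample string is a fragment of the poem and the variable name `result` is only illustrative:

```shell
# With no file argument, sed processes whatever arrives on standard input,
# so the s command can sit in the middle of a pipeline.
result=$(echo 'One Ring to rule them all' | sed 's/Ring/foo/')
echo "$result"   # One foo to rule them all
```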
Using backreferences
As part of both the regexp and the replacement, we can use backreferences, which let us reference sub-parts of a matched regular expression. These sub-parts are enclosed in escaped parentheses, which create capturing groups. Let’s see an example:
$ sed 's/\([a-z]\)\1\([ ,.]\)/\1\2/g' lotr.txt
In the example above, we specified the following regular expression:
\([a-z]\)\1\([ ,.]\)
The first [a-z] pattern matches any lowercase letter from “a” to “z”; as you can see, it is written between escaped parentheses: this creates a capturing group, which allows us to reference the matched text later using \1, where “1” means the first capturing group.
This is what we did immediately after, to match “any double character”. Finally, we defined another capturing group, which contains the [ ,.] expression: this (sub)expression matches a space, a comma, or a dot. As you can imagine, we can reference the second capturing group by using \2. The whole expression therefore means: “any doubled lowercase letter followed by a space, a comma, or a dot”.
In the “substitution” part of the command, we used backreferences again, to replace the whole matched pattern with a single occurrence of the character matched by the expression contained in the first capturing group, followed by the match of the expression contained in the second one (again: a space, a comma or a dot).
Notice also that after the “substitution” part of the command we used the letter g: this flag causes all matches in a line to be substituted instead of just the first one (the latter is the default sed behavior). We will talk specifically about flags in the next section. As a result of the command, any doubled lowercase letter followed by a space, a comma, or a dot is replaced by a single occurrence. This is the result:
Thre Rings for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One Ring to rule them al, One Ring to find them,
One Ring to bring them al, and in the darknes bind them,
In the Land of Mordor where the Shadows lie.
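To try the backreference expression without the full poem, it can be run against a short string. This is only a sketch: the sample sentence and the variable names are made up for illustration.

```shell
# Collapse a doubled lowercase letter that is followed by a space, comma, or dot.
# \1 is the letter captured by the first group, \2 the character that follows it.
sample='Three rings, all in darkness.'
result=$(printf '%s\n' "$sample" | sed 's/\([a-z]\)\1\([ ,.]\)/\1\2/g')
echo "$result"   # Thre rings, al in darknes.
```

Note that “rings,” is left alone: it contains no doubled letter before the comma.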
In the “substitution” part of the command, we can also use the unescaped & character, which references the whole regular expression match.
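As a small illustration of the & character, the sketch below wraps the matched word in square brackets, then shows that escaping it as \& yields a literal ampersand. The sample string and variable names are assumptions chosen for the example.

```shell
# An unescaped & in the replacement stands for the whole matched text.
wrapped=$(echo 'One Ring to rule them all' | sed 's/Ring/[&]/')
echo "$wrapped"   # One [Ring] to rule them all

# Escaped as \&, it is inserted as a literal ampersand instead.
literal=$(echo 'One Ring to rule them all' | sed 's/Ring/\&/')
echo "$literal"   # One & to rule them all
```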
Sed “substitute” command flags
When using the s command, we can specify a series of flags which can modify its behavior. Let’s see some of them, and their effect.
The “g” flag
The g flag modifies the behavior of the sed “substitute” command so that all matches of a regexp in a line are substituted, instead of just the first one. As an example, take a look at the 6th line of the ring poem: the word “Ring” appears two times. Let’s run the “substitute” command as we did in the first example, but without the g flag:
$ sed 's/Ring/foo/' lotr.txt
We obtain:
One foo to rule them all, One Ring to find them,
As you can see, only the first occurrence of the word “Ring” was substituted with “foo”. If we use the g flag, instead, both occurrences are affected by the substitution.
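The difference can be checked side by side on the 6th line of the poem alone. A minimal sketch, with illustrative variable names:

```shell
line='One Ring to rule them all, One Ring to find them,'

# Without g: only the first match in the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/Ring/foo/')
# With g: every match in the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/Ring/foo/g')

echo "$first_only"    # One foo to rule them all, One Ring to find them,
echo "$all_matches"   # One foo to rule them all, One foo to find them,
```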
The “i” flag
The i flag causes the regular expression used in the s command to be matched case-insensitively (in GNU sed, where it can also be written as I). For example, the regex in the following command will match “Ring” even though “ring” is specified:
$ sed 's/ring/foo/i' lotr.txt
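A quick check on a single line, assuming GNU sed (the case-insensitive flag is a GNU extension, so other sed implementations may reject it):

```shell
# Assumes GNU sed: the i flag makes "ring" match "Ring" as well.
result=$(echo 'One Ring to find them' | sed 's/ring/foo/i')
echo "$result"   # One foo to find them
```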
Using a number as a flag
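When a number n is used as a flag, the behavior of the sed s command changes so that only the nth match of the regex in a line is substituted by “replacement” (sed works on a line-by-line basis). With 1, only the first match in each line is replaced, which is the same as the default behavior. If we run: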
$ sed 's/Ring/foo/1' lotr.txt
We obtain the following output:
Three foos for the Elven-kings under the sky,
Seven for the Dwarf-lords in their halls of stone,
Nine for Mortal Men doomed to die,
One for the Dark Lord on his dark throne
In the Land of Mordor where the Shadows lie.
One foo to rule them all, One Ring to find them,
One foo to bring them all, and in the darkness bind them,
In the Land of Mordor where the Shadows lie.
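A numeric flag other than 1 makes the effect easier to see. In this sketch, run on the 6th line of the poem, the flag 2 replaces the second match and leaves the first one untouched; the variable names are illustrative.

```shell
line='One Ring to rule them all, One Ring to find them,'

# The numeric flag 2 targets only the second match in the line.
second_only=$(printf '%s\n' "$line" | sed 's/Ring/foo/2')
echo "$second_only"   # One Ring to rule them all, One foo to find them,
```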
The “p” flag
The p flag changes the behavior of sed, so that it outputs the lines in which a substitution was performed. This is particularly useful when sed is used with the -n option, which makes it silent. When the two things are combined, only changed lines are printed:
$ sed -n 's/Ring/foo/gp' lotr.txt
The command returns:
Three foos for the Elven-kings under the sky,
One foo to rule them all, One foo to find them,
One foo to bring them all, and in the darkness bind them,
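The same combination can be tried on a few lines fed through a pipeline. In this sketch the three input lines are shortened fragments of the poem, chosen so that only the first and the third contain a match:

```shell
# -n suppresses the automatic printing of every line;
# the p flag prints only the lines where a substitution took place.
changed=$(printf '%s\n' \
  'Three Rings for the Elven-kings' \
  'Seven for the Dwarf-lords' \
  'One Ring to rule them all' | sed -n 's/Ring/foo/gp')
echo "$changed"
# Three foos for the Elven-kings
# One foo to rule them all
```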
The ones above are only some of the flags which can be used with the sed “substitute” command; the complete and detailed list can be found by reading the official documentation of the “s” command.
Substituting text in place
As we already said, the output of the sed “substitute” command is printed on the standard output, so the original file is not altered. If we want to change the content of a file in place, all we have to do is invoke sed with the -i option (short for --in-place), e.g.:
$ sed -i 's/Ring/foo/g' lotr.txt
It is also possible to provide a suffix as an argument to the option. When we do so, a backup file is created with the suffix we specified. For example, consider the following command:
$ sed --in-place='.bk' 's/Ring/foo/g' lotr.txt
When we launch the command above, the target file is modified in place; its original content, however, is saved in a backup file called lotr.txt.bk.
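The backup behavior can be verified safely in a throwaway directory. The sketch below uses the short form of the option, where the suffix is attached directly to -i with no space in between; the directory and variable names are assumptions made for the example.

```shell
# Work in a temporary directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'One Ring to rule them all' > "$workdir/lotr.txt"

# -i.bk edits lotr.txt in place and keeps the original as lotr.txt.bk.
sed -i.bk 's/Ring/foo/g' "$workdir/lotr.txt"

modified=$(cat "$workdir/lotr.txt")
backup=$(cat "$workdir/lotr.txt.bk")
echo "$modified"   # One foo to rule them all
echo "$backup"     # One Ring to rule them all

# Clean up the temporary directory.
rm -r "$workdir"
```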
Conclusions
In this article, we learned the basics of the sed “substitute” command. We saw the syntax of the command and how to use regular expressions and backreferences. We also saw some of the flags that can be used with the “s” command, like i to make the regexp case-insensitive, and g to substitute all regexp matches in a line. Finally, we saw how to modify a file in place and optionally create a backup of it before the changes are written.
