Few Unix commands are as famous as sed, grep, and awk. They are often grouped together, perhaps because of their strange names and powerful text-parsing capabilities, and they share some syntax and logic. All three can parse text, but each has its own specialty. This article focuses on the sed command, which is a stream editor.
Installing sed
If you are using Linux, BSD, or macOS, you already have GNU or BSD sed installed. These are independent reimplementations of the original sed command. They are similar, but there are some subtle differences. This article has been tested on the Linux and NetBSD versions, so you can use whichever sed you find on your computer, but with BSD sed you must use short options only (for example, -n instead of --quiet).
GNU sed is generally considered the most feature-rich sed, so you may want to try it even if you are not running Linux. If you cannot find GNU sed (often called gsed on non-Linux systems) in your ports tree, you can download the source code from the GNU website. The advantage of installing GNU sed is that you can use its additional features, but if portability is required, you can also restrict it to the POSIX specification for sed with its --posix option.
macOS users can find GNU sed on MacPorts or Homebrew.
On Windows, you can install GNU sed using Chocolatey.
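If you are not sure which sed you have, the following is a rough sketch of one way to check. It relies on the assumption that GNU sed accepts --version while BSD sed rejects it as an unknown option. Other implementations may behave differently, so treat the result as a hint rather than proof. The variable name sed_flavor is just an illustration.

```shell
# Heuristic only: GNU sed understands --version, BSD sed usually does not.
if sed --version >/dev/null 2>&1; then
    sed_flavor="GNU"
else
    sed_flavor="non-GNU (probably BSD)"
fi
echo "Detected sed flavor: $sed_flavor"
```

If the result is not GNU, stick to short options such as -n for the rest of this article.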
Understanding the pattern space and hold space
sed works on exactly one line at a time. Because it has no visual display, it creates a pattern space, which is a space in memory that contains the current line from the input stream (with its trailing newline character removed). Once the pattern space is filled, sed executes your instructions on it. When it reaches the end of the script, sed prints the contents of the pattern space to the output stream, unless the --quiet (-n) option is in effect. The output stream is usually standard output, but it can be redirected to a file, or even written back to the same file using the --in-place=.bak option.
Then, the loop starts again with the next input line.
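The cycle described above can be observed with a few short commands. This is a minimal sketch that assumes a throwaway file named cycle-demo.txt (a hypothetical name) and uses the short -i.bak spelling of the in-place option, which GNU sed accepts and BSD sed generally accepts as well. The p (print) and d (delete) commands appear again later in this article.

```shell
# Sample input: the same three lines used in the rest of this article.
printf 'Line one\nLine three\nLine two\n' > cycle-demo.txt

# p prints the pattern space, and then sed prints it again automatically
# at the end of the cycle, so every line appears twice.
sed -e 'p' cycle-demo.txt

# With -n the automatic print is switched off, so every line appears once.
sed -n -e 'p' cycle-demo.txt

# In-place editing: the file is rewritten without the "three" line, and
# the original content is kept in cycle-demo.txt.bak.
sed -i.bak -e '/three/d' cycle-demo.txt
```

Afterward, cycle-demo.txt holds two lines and cycle-demo.txt.bak holds the original three.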
To provide some flexibility when traversing files, sed also provides a hold space (sometimes called a hold buffer), which is a space reserved in sed's memory for temporary data storage. You can think of the hold space as a clipboard, and that's exactly what this article demonstrates: how to use sed to copy/cut and paste.
First, create an example text file with the following content:
Line one
Line three
Line two
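One way to create this file from the terminal is with a here-document. The file name example.txt matches the commands used in the rest of this article.

```shell
# Write the three sample lines to example.txt (overwrites any existing file).
cat > example.txt << 'EOF'
Line one
Line three
Line two
EOF
```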
Copying data to the hold space
To place content in sed's hold space, use the h or H command. A lowercase h tells sed to overwrite the current contents of the hold space, while an uppercase H tells sed to append a newline and the data to whatever is already in the hold space.
When used alone, nothing is visible:
$ sed --quiet -e '/three/ h' example.txt
$
The --quiet option (abbreviated as -n) suppresses sed's automatic printing of the pattern space, but sed still carries out my instructions. In this case, sed selects any line that contains the string "three" and copies it to the hold space. I didn't tell sed to print anything, so there is no output.
Copying data from the hold space
To see what's in the hold space, you can copy its contents back into the pattern space with the g command, which overwrites the pattern space, and observe what happens:
$ sed -n -e '/three/h' -e 'g;p' example.txt

Line three
Line three
The first line of output is blank because the hold space is still empty the first time sed copies it into the pattern space.
The next two lines both read "Line three" because that is what the hold space contains from the second input line onward.
The command uses two separate scripts (-e) purely for readability and organization. It can be useful to split the steps into individual scripts, but technically, the following command is equally effective as a single script statement:
$ sed -n -e '/three/h ; g ; p' example.txt

Line three
Line three
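With g available, the difference between h and H described earlier becomes visible. The sketch below runs h or H on every line and prints the hold space only once, on the last line. The $ address (last line) and the { } grouping are standard sed features that this article has not covered, so take them here as scaffolding for the demonstration.

```shell
printf 'Line one\nLine three\nLine two\n' > example.txt

# h overwrites the hold space on every line, so by the last line ($)
# it contains only that last line. Prints: Line two
sed -n -e 'h' -e '$ { g; p; }' example.txt

# H appends a newline plus the current line each time, so the lines
# accumulate. The output starts with an empty line because the hold
# space was empty before the first append.
sed -n -e 'H' -e '$ { g; p; }' example.txt
```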
Appending data to the pattern space
The G command appends a newline character and the content of the hold space to the pattern space.
$ sed -n -e '/three/h' -e 'G;p' example.txt
Line one

Line three
Line three
Line two
Line three
The first two lines of this output contain the pattern space ("Line one") followed by the still-empty hold space, which shows up as a blank line. The next two lines come from the input line that matches the search text ("three"): it is copied to the hold space and then appended to itself, so it appears twice. The hold space does not change for the third input line, so the pattern space ("Line two") is followed by the hold space (still "Line three").
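To confirm that each pair of output lines really is a single pattern space with a newline embedded in it, you can swap p for the l command. This command is part of standard sed but is not otherwise used in this article. It prints the pattern space in an unambiguous form, showing an embedded newline as \n and marking the end of the pattern space with $.

```shell
printf 'Line one\nLine three\nLine two\n' > example.txt

# Same script as the G example, but l replaces p so that embedded
# newlines are shown as \n and the end of the pattern space as $.
sed -n -e '/three/h' -e 'G;l' example.txt
# Output:
#   Line one\n$
#   Line three\nLine three$
#   Line two\nLine three$
```

Three input lines produce three pattern spaces, each carrying whatever the hold space contained at that moment.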
Cutting and pasting with sed
Now that you know how to move a string from the pattern space to the hold space and back, you can design a sed script to copy, delete, and then paste a line in a document. For example, this command moves "Line three" to the third line of the example file:
$ sed -n -e '/three/ h' -e '/three/ d' \
-e '/two/ G;p' example.txt
Line one
Line two
Line three
- The first script finds the lines that contain the string "three" and copies them from the pattern space to the hold space, replacing any existing content in the hold space.
- The second script deletes any lines that contain the string "three". Deleting the pattern space also ends the current cycle, so nothing further runs for that line. This accomplishes the equivalent of the "cut" action in a word processor or text editor.
- The last script finds the line that contains the string "two" and appends the content of the hold space to the pattern space. The p command then prints the pattern space for every line that was not deleted.
Task completed.
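It can help to look at the "cut" half on its own before adding the "paste". The sketch below first runs only the copy-and-delete steps, then the full operation. The second command groups h and d under a single address with { }, which is an alternative spelling I am assuming you may find easier to read. It is not how the command above is written, but it should behave the same way.

```shell
printf 'Line one\nLine three\nLine two\n' > example.txt

# Cut only: the matching line is saved to the hold space and deleted.
# d ends the cycle at once, so p never runs for that line.
# Prints: Line one, Line two
sed -n -e '/three/ h' -e '/three/ d' -e 'p' example.txt

# Cut and paste: same steps, plus G on the line containing "two".
# Prints: Line one, Line two, Line three
sed -n -e '/three/ { h; d; }' -e '/two/ G' -e 'p' example.txt
```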
Writing sed scripts
Again, using separate script statements is purely for visual and mental simplicity. The cut and paste commands are equally effective as a single script:
$ sed -n -e '/three/ h ; /three/ d ; /two/ G ; p' example.txt
Line one
Line two
Line three
It can even be written as a dedicated script file (called myscript.sed here):
#!/usr/bin/sed -nf
/three/h
/three/d
/two/ G
p
To run the script, give it executable permissions, and then try it with the example file:
$ chmod +x myscript.sed
$ ./myscript.sed example.txt
Line one
Line two
Line three
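The shebang line assumes that sed lives at /usr/bin/sed, which is not guaranteed on every system. As a hedged alternative, you can pass the script file to sed explicitly with -f, which needs neither the shebang nor executable permissions. The sketch below recreates both files so that it is self-contained. The comment lines inside the script are my additions.

```shell
printf 'Line one\nLine three\nLine two\n' > example.txt

# Write the script file; lines starting with # are sed comments.
cat > myscript.sed << 'EOF'
# Save the out-of-place line, then remove it from the output.
/three/h
/three/d
# Paste it back after the line containing "two".
/two/G
p
EOF

# Run it by naming the script with -f; -n suppresses automatic printing.
sed -n -f myscript.sed example.txt
```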
Of course, the more predictable the text you need to parse, the easier it is to solve problems with sed. Inventing "recipes" for sed operations like copying and pasting is often impractical because the conditions that trigger the operations may vary from file to file. However, the more proficient you become with sed commands, the easier it is to design complex actions based on the input you need to parse.
The important thing is to recognize the different operations, understand when sed moves to the next line, and predict what the pattern and hold spaces contain.
sed is complicated. Although it has only about two dozen commands, its flexible syntax and raw power give it enormous potential.