Multi Line Sed

Sed is a powerful command-line tool for manipulating text using regular expressions (regex). By default, however, sed operates on a single line of input at a time and then starts over on the next line. Sometimes we instead want to manipulate text based on its context, i.e. based on content found on the surrounding lines.

For example, in the JSON below we might want to change the label of the object with id "ZoomIn", regardless of its previous value:

{
  "menu": {
    "header": "SVG Viewer",
    "items": [
      {
        "id": "Open"
      },
      {
        "id": "OpenNew",
        "label": "Open New"
      },
      null,
      {
        "id": "ZoomIn",
        "label": "Zoom In"
      },
      {
        "id": "ZoomOut",
        "label": "Zoom Out"
      }
    ]
  }
}

Assuming the JSON file is named menu.json, this command prints the file content to stdout, with "Something Else" in place of "Zoom In" as the label of the object with id "ZoomIn":

sed 'H;1h;$!d;x; s/\("id":[[:space:]]*"ZoomIn"[^}]*"label":[[:space:]]*\)"[^"]*"/\1"Something Else"/g' menu.json

In order to accomplish our goal, a few extra sed commands were needed before the replacement command s///g, so that it could run on the entire input file at once, instead of line-by-line.
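
To see why those extra commands matter, here is a minimal sketch that runs the same substitution both ways on a reduced copy of the JSON. The scratch directory and the variable names (tmp, expr, plain, multi) are assumptions made for this illustration, and a POSIX-style sed such as GNU sed is assumed.

```shell
# Recreate a reduced menu.json in a scratch directory (hypothetical location).
tmp=$(mktemp -d)
cat > "$tmp/menu.json" <<'EOF'
{
  "items": [
    {
      "id": "ZoomIn",
      "label": "Zoom In"
    },
    {
      "id": "ZoomOut",
      "label": "Zoom Out"
    }
  ]
}
EOF

# The same replacement command as in the article.
expr='s/\("id":[[:space:]]*"ZoomIn"[^}]*"label":[[:space:]]*\)"[^"]*"/\1"Something Else"/g'

# Line by line: "id" and "label" never share a line, so nothing matches.
plain=$(sed -e "$expr" "$tmp/menu.json" | grep -c 'Something Else' || true)

# Whole file in the pattern buffer: the match can span the line break.
multi=$(sed -e 'H;1h;$!d;x' -e "$expr" "$tmp/menu.json" | grep -c 'Something Else' || true)

echo "line-by-line: $plain, whole file: $multi"
rm -r "$tmp"
```

The counts are the number of output lines containing "Something Else": zero for the plain run, one when the whole file is processed at once.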

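The commands are separated by ;, and some are prefixed by a line specifier, which the sed documentation calls an address (two in our example: 1 and $!). Here 1 selects the first line, $ selects the last line, and a trailing ! inverts the selection, so $! means every line except the last. A command with a line specifier is skipped on all lines that the specifier does not select.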
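
As a small illustration of line specifiers on their own, the following sketch uses made-up three-line input; the variable name addressed is just for this example.

```shell
# 1 restricts a command to the first line; $! restricts it to every
# line except the last.
addressed=$(printf 'a\nb\nc\n' | sed '1s/^/first: /; $!s/$/,/')
echo "$addressed"
# first: a,
# b,
# c
```

Only the first line gets the prefix, and only the last line is left without a trailing comma.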

Sed runs one cycle for each line of input: it replaces the pattern buffer with the current input line, runs through the sequence of commands, and finally prints the pattern buffer to stdout.
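
The cycle can be made visible with a short sketch. It uses sed's p command, which is not part of our example and prints the pattern buffer immediately; the input and the variable name cycle are made up for illustration.

```shell
# Each cycle: p prints the line as read, s modifies the pattern buffer,
# and sed prints the buffer once more when the cycle ends.
cycle=$(printf 'a\nb\n' | sed 'p; s/$/ (end of cycle)/')
echo "$cycle"
# a
# a (end of cycle)
# b
# b (end of cycle)
```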

But the d command is special: it starts a new cycle, which means that sed skips the rest of the commands for that input line and instead starts over with the next line. Because of that, the x and s commands in our example will only be run once: when the last input line has been read.
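
A minimal sketch of that effect, using made-up input and a hypothetical variable name last: the s command after $!d only ever sees the final line.

```shell
# $!d discards every line but the last and skips the rest of the
# commands, so s runs exactly once.
last=$(printf 'one\ntwo\nthree\n' | sed '$!d; s/^/last: /')
echo "$last"    # last: three
```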

A description of each command in our example, in order of appearance:

H
Append the current pattern buffer (prefixed by a newline char) to the hold buffer.
1h
Only for the first line: replace the hold buffer with the current pattern buffer, without any newline char as prefix. (So on the first line, the result of the H command is overwritten by this command.)
$!d
For all lines except the last: clear the pattern buffer (nothing gets printed) and start a new cycle (skip the rest of the commands).
x
Swap the hold and pattern buffers.
s
In the pattern buffer (which by now contains the entire input file in our example), do a search-and-replace. After that, sed prints the pattern buffer to stdout, since there are no more commands.
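
The same sequence can be tried on a much smaller input. This is only a sketch with made-up data and hypothetical variable names (joined, no_1h): it collects three lines and then replaces the line breaks between them with commas.

```shell
# Collect all lines in the hold buffer, swap them into the pattern
# buffer on the last line, then replace every embedded newline.
joined=$(printf 'a\nb\nc\n' | sed 'H;1h;$!d;x; s/\n/,/g')
echo "$joined"    # a,b,c

# Without 1h, the newline that H prepends on the first line survives.
no_1h=$(printf 'a\nb\nc\n' | sed 'H;$!d;x; s/\n/,/g')
echo "$no_1h"     # ,a,b,c
```

The second run shows why 1h is part of the sequence: it keeps the collected text from starting with an empty line.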
