I have this one-liner from the sed website (http://sed.sourceforge.net/sed1line.txt):
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'
Its purpose is to print a paragraph only if it contains AAA, BBB and CCC (in any order).
My understanding of the script:
What is not clear to me:
Thank you very much for every comment!
My test data (match every paragraph with XX in it):
YYaaaa
aaa1
aaa2
aXX3
aaa4

YYbbbb
bbb1
bbb2

YYcccc
ccc1
ccc2
ccc3
cXX4
ccc5

YYdddd
ddd1
dXX2
The following command is used:
sed -ne '/./{H;$!d};x;/XX/p' test2
Versions:
$ sed --version
GNU sed-Version 4.2.1
$ bash --version
GNU bash, Version 4.2.10(1)-release (x86_64-pc-linux-gnu)
It collects a paragraph as individual lines into the hold space (H); then, when you hit an empty line, /./ fails and it falls through to the x, which swaps the collected paragraph into the pattern space and, as a side effect, resets the hold space for the next paragraph.
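To make the hold-space mechanics visible, here is a small sketch that replaces the final test with sed's l command, which prints the pattern space unambiguously (newlines shown as \n, end of buffer marked with $). The two-paragraph input and the variable name dump are made up for illustration.

```shell
# Hypothetical input: two paragraphs, the second one without a trailing blank line.
dump=$(printf 'a1\naXX\n\nb1\n' | sed -n '/./{H;$!d;};x;l')
printf '%s\n' "$dump"
# Prints:
# \na1\naXX$
# \nb1$
```

Each dump starts with \n because H always appends a newline before the line it adds, and the second dump contains only the second paragraph, which shows that x left the hold space empty after the first one.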
In order to correctly handle the final paragraph, it needs to cope with a paragraph which is not followed by an empty line; therefore, on the last line of input, it falls through as if that line were followed by an empty line. This is a common idiom for scripts which collect something up through a particular pattern (or, to put it differently, it's a common error for such scripts to fail to handle the last collected data at end of file).
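A quick way to see why the end-of-file case matters is to compare the script with and without the $ guard on input whose last paragraph is not followed by an empty line. This is only a sketch: the sample data, the helper sample, and the variable names are assumptions, and grep -c is used merely to count the lines that were printed.

```shell
# Hypothetical input: two paragraphs that both contain XX; no blank line at the end.
sample() { printf 'a1\naXX\n\nb1\nbXX\n'; }

# With $!d the last line falls through to x, so the final paragraph is examined.
with_guard=$(sample | sed -n '/./{H;$!d;};x;/XX/p' | grep -c XX)

# With a plain d the last line is deleted too, and the final paragraph is lost.
without_guard=$(sample | sed -n '/./{H;d;};x;/XX/p' | grep -c XX)

printf 'with $!d: %s, with plain d: %s\n' "$with_guard" "$without_guard"
# Prints: with $!d: 2, with plain d: 1
```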
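So, in other words: if we are looking at a non-empty line, add it to the hold space and, unless it's the last line in the file, delete it and start over from the beginning of the script with the next input line. (Perhaps your understanding of d was not complete? This is what $!d means: on every line except the last, d discards the pattern space and ends the current cycle immediately, without printing and without running the rest of the script.)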
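The effect of $!d can be checked in isolation. In this minimal sketch (the three-line input and the variable name last_only are made up), the substitution after $!d only ever runs on the last line, because d ends the cycle for every other line.

```shell
# $!d deletes every line except the last and skips the rest of the script for them.
last_only=$(printf 'a\nb\nc\n' | sed '$!d;s/^/last: /')
printf '%s\n' "$last_only"
# Prints: last: c
```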
Otherwise, we have an empty line, or end of file, and the hold space contains zero or more lines of text (one paragraph, possibly empty). Exchange them into the pattern space (the current, empty, line conveniently moves to the hold space) and examine the pattern space. If it fails to match any one of our expressions, delete it. Otherwise, the default action at the end of the script prints the entire pattern space.
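Putting the pieces together, the one-liner from the question can be spelled out as a commented multi-line script. The commands are the same as in the original; the wrapper function name paragraphs_with_all_three and the sample input are hypothetical, chosen only to show one paragraph being rejected.

```shell
# Hypothetical wrapper; the sed commands are those of the one-liner in the question.
paragraphs_with_all_three() {
  sed '
    # Non-empty line: append it to the hold space, then (unless it is the
    # last line of input) delete it and start the next cycle.
    /./{
      H
      $!d
    }
    # Reached only on an empty line or on the last line of input:
    # swap the collected paragraph into the pattern space.
    x
    # Delete the paragraph unless it contains all three strings.
    /AAA/!d
    /BBB/!d
    /CCC/!d
    # Anything still here is printed by the default action at end of cycle.
  '
}

# Three paragraphs; the middle one lacks CCC and should be dropped.
result=$(printf 'AAA one\nBBB two\nCCC three\n\nAAA only\nBBB here\n\nCCC AAA\nBBB\n' |
  paragraphs_with_all_three)
printf '%s\n' "$result"
```

With this input the first and third paragraphs are printed, each preceded by the empty line that H puts at the front of the collected text, and the paragraph containing "AAA only" is deleted.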