Sed Regular Expressions

From NovaOrdis Knowledge Base
Revision as of 18:33, 1 January 2024 by Ovidiu

Internal

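Meta-Characters - Special Characters (need to be escaped in regular expressions to match literally)

/ # the default delimiter of the s command; escape it or pick another delimiter
" # special to the shell when the sed script is double-quoted, not to sed itself
$ # unescaped signifies the end of a line
^ # unescaped signifies the beginning of a line
[ # opens a bracket expression
] # closes a bracket expression
: # special only inside bracket expressions, as in [[:alpha:]]
* # zero or more occurrences of the preceding item
. # dot, matches any single character
\ # the escape character itself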
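
As an illustration of the list above, here is a minimal sketch of escaping meta-characters so they match literally. The sample strings and the out_* variable names are made up for this example.

```shell
# An escaped dollar sign and an escaped dot match literally.
out_literal=$(printf '%s\n' 'cost: $5.00' | sed 's/\$5\.00/$6.50/')

# An unescaped dot matches any single character, so "5x00" matches too.
out_dot=$(printf '%s\n' '5x00' | sed 's/5.00/MATCH/')

# A slash must be escaped while it is the delimiter of the s command ...
out_escaped=$(printf '%s\n' '/usr/local/bin' | sed 's/\/usr\/local/\/opt/')

# ... or a different delimiter can be chosen, which avoids the escaping.
out_delim=$(printf '%s\n' '/usr/local/bin' | sed 's|/usr/local|/opt|')

printf '%s\n' "$out_literal" "$out_dot" "$out_escaped" "$out_delim"
```

Switching the delimiter usually reads better than escaping every slash in a path.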

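The single quote is a special case: inside a single-quoted sed script it ends the shell quoting, so it cannot simply be escaped. To match it, use its ASCII hexadecimal value prefixed by \x instead (the \x escape is a GNU sed extension):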

\x27
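
A short sketch of matching a single quote follows. The first command relies on the \x27 escape and assumes GNU sed. The other two are portable alternatives that rely on shell quoting only. The input string and variable names are illustrative.

```shell
# GNU sed only: \x27 is the hexadecimal escape for the single quote.
out_hex=$(printf '%s\n' "it's" | sed 's/\x27/"/')

# Portable: close the single-quoted script, add an escaped quote, reopen it.
out_quote=$(printf '%s\n' "it's" | sed 's/'\''/"/')

# Portable: double-quote the script and escape the inner double quote.
out_dquote=$(printf '%s\n' "it's" | sed "s/'/\"/")

printf '%s\n' "$out_quote" "$out_dquote"
```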

To use () for grouping, they need to be escaped:

\(...\)

More details in Grouping below.

Non-Special Characters (do not need to be escaped in regular expressions)

<
>
(
)
!
-
{
}
+ # literal in basic regular expressions, which sed uses by default; it acts as a "one or more" quantifier only with sed -E, or written as \+ in GNU sed
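
The behavior of + can be checked with a small experiment, sketched below. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do). The sample inputs are made up.

```shell
# Default (basic) syntax: + is an ordinary character and matches itself.
out_bre_literal=$(printf '%s\n' 'a+b' | sed 's/a+b/SUM/')

# Default syntax: "aaab" contains no literal +, so nothing is replaced.
out_bre_nomatch=$(printf '%s\n' 'aaab' | sed 's/a+b/X/')

# Extended syntax (-E): + means "one or more of the preceding item".
out_ere=$(printf '%s\n' 'aaab' | sed -E 's/a+b/X/')

printf '%s\n' "$out_bre_literal" "$out_bre_nomatch" "$out_ere"
```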

Grouping

Use \( and \) for grouping. Parentheses must be escaped to be interpreted as grouping delimiters; unescaped, they match literal parentheses.
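
A minimal sketch of grouping: two escaped groups capture two words, and the back-references \1 and \2 swap them in the replacement. The input strings are illustrative.

```shell
# \( ... \) captures; \1 and \2 refer to the captured text in the replacement.
out_swap=$(printf '%s\n' 'john smith' | sed 's/\([a-z]*\) \([a-z]*\)/\2, \1/')

# Unescaped parentheses are ordinary characters in the default syntax.
out_plain=$(printf '%s\n' 'f(x)' | sed 's/(x)/(y)/')

printf '%s\n' "$out_swap" "$out_plain"
```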

Negation

Match everything except the specified characters. Because of the trailing *, the expression matches a run of zero or more such characters, not just one (this behavior differs from negation in bash patterns):

[^abc]*
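
As a sketch of how a negated bracket expression is typically used, the commands below strip everything up to a separator and replace a leading run of excluded characters. The inputs are made up for the example.

```shell
# [^=]* matches the run of characters before the first '=', which is dropped.
out_value=$(printf '%s\n' 'user=admin' | sed 's/[^=]*=//')

# [^abc]* matches the leading run of characters other than a, b and c.
out_run=$(printf '%s\n' 'xyzabc' | sed 's/[^abc]*/_/')

printf '%s\n' "$out_value" "$out_run"
```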

Match Zero or One Character

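Normally, this would be achieved with ? placed after the character or the group of characters in question, but this does not work with standard sed, where an unescaped ? is a literal character. Use the interval \{0,1\} instead, \? in GNU sed, or plain ? with sed -E.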
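
A sketch of the "zero or one" alternatives follows. The interval form \{0,1\} is part of POSIX basic regular expressions; the -E form assumes a sed that supports extended syntax. The sample text is illustrative.

```shell
# Default syntax: ? is literal, so neither word matches and nothing changes.
out_literal=$(printf '%s\n' 'color colour' | sed 's/colou?r/C/g')

# Default syntax: the interval \{0,1\} means "zero or one".
out_interval=$(printf '%s\n' 'color colour' | sed 's/colou\{0,1\}r/C/g')

# Extended syntax (-E): ? means "zero or one".
out_ere=$(printf '%s\n' 'color colour' | sed -E 's/colou?r/C/g')

printf '%s\n' "$out_literal" "$out_interval" "$out_ere"
```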

Examples

Match everything except space:

     [^ ]*

Note that .* is not equivalent: it matches any characters, spaces included, so it consumes the rest of the line.
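
The two patterns can be compared directly. In this sketch, with a made-up input line, [^ ]* stops at the first space while .* takes the whole line.

```shell
# [^ ]* matches only up to the first space.
out_nonspace=$(printf '%s\n' 'first second' | sed 's/[^ ]*/X/')

# .* matches the entire line, space included.
out_any=$(printf '%s\n' 'first second' | sed 's/.*/X/')

printf '%s\n' "$out_nonspace" "$out_any"
```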

Words (digits, alpha, _):

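sed -e 's/[0-9a-zA-Z_][0-9a-zA-Z_]*/THIS_WAS_A_WORD/g'

The bracket expression is written twice to require at least one word character; [0-9a-zA-Z_]* on its own also matches the empty string.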
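
A small sketch of word matching on a made-up input. It uses a "one or more" form of the word pattern (the bracket expression followed by the same expression with *), so only non-empty runs of word characters are replaced.

```shell
# Each non-empty run of letters, digits and underscores becomes W;
# the '+' and the spaces are left untouched.
out_words=$(printf '%s\n' 'foo_1 + bar2' | sed 's/[0-9a-zA-Z_][0-9a-zA-Z_]*/W/g')

printf '%s\n' "$out_words"
```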

Blank spaces (spaces, tabs, newlines): \s is a GNU sed extension and is not portable. The POSIX character class [[:space:]] works in any POSIX sed.
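
A sketch using the POSIX character class [[:space:]] to collapse runs of whitespace. The input is made up. Since sed reads input line by line, the trailing newline is not part of what the pattern sees.

```shell
# The input holds "a", a space, a tab, then "b"; the whitespace run becomes "_".
out_space=$(printf 'a \tb\n' | sed 's/[[:space:]][[:space:]]*/_/g')

printf '%s\n' "$out_space"
```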

Regular Expression Syntax

TO NORMALIZE across java Regular Expression Syntax, grep Regular Expression Syntax, sed Regular Expression Syntax.