Sed Regular Expressions

From NovaOrdis Knowledge Base
Revision as of 04:36, 5 January 2021 by Ovidiu (talk | contribs) (→‎Negation)

Meta-Characters - Special Characters (need to be escaped in regular expressions)

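/ # the default delimiter of the s command rather than a regex meta-character; escape it or use another delimiter
" # special to the shell when the sed script is double-quoted, not to sed itself
$ # unescaped signifies end of line
^ # unescaped signifies the beginning of a line; as the first character inside [...] it negates the set
! # special to bash history expansion when the sed script is double-quoted, not to the regular expression
[ # opens a bracket expression
] # closes a bracket expression
: # not special on its own; it is part of character classes such as [[:space:]]
* # zero or more occurrences of the preceding element
. # dot, matches any single character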
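
A minimal sketch of escaping some of the characters listed above. The sample strings and variable names are made up for illustration, and the scripts are single-quoted so the shell does not interpret anything:

```shell
# Escaped, '.' matches only a literal dot.
literal_dot=$(printf '%s\n' 'a.b axb' | sed -e 's/a\.b/MATCH/g')
echo "$literal_dot"      # MATCH axb

# Unescaped, '.' matches any single character.
any_char=$(printf '%s\n' 'a.b axb' | sed -e 's/a.b/MATCH/g')
echo "$any_char"         # MATCH MATCH

# '/' must be escaped when it is also the delimiter of the s command ...
escaped_slash=$(printf '%s\n' '/usr/bin' | sed -e 's/\/usr/\/opt/')
# ... or a different delimiter can be chosen, so no escaping is needed.
other_delim=$(printf '%s\n' '/usr/bin' | sed -e 's|/usr|/opt|')
echo "$escaped_slash $other_delim"   # /opt/bin /opt/bin

# Escaped, '*' and '$' match themselves.
literal_star=$(printf '%s\n' '2*3' | sed -e 's/\*/ times /')
literal_dollar=$(printf '%s\n' 'cost: $5' | sed -e 's/\$5/five/')
echo "$literal_star"     # 2 times 3
echo "$literal_dollar"   # cost: five
```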

The single quote is a special case: it cannot be escaped inside a single-quoted shell argument, so match it by its ASCII hexadecimal value prefixed by \x (supported by GNU sed) instead:

\x27
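
A small sketch of matching a single quote, assuming GNU sed for the \x27 form. The second command is a shell-quoting alternative (close the quote, add an escaped quote, reopen) that does not depend on \x support. The input string is hypothetical:

```shell
# \x27 is the hexadecimal escape for the single quote (assumes GNU sed).
hex_form=$(printf '%s\n' "it's" | sed -e 's/\x27/_/g')
echo "$hex_form"     # it_s

# Shell-level alternative: '\'' ends the quoted string, adds a literal
# single quote, and starts a new quoted string.
shell_form=$(printf '%s\n' "it's" | sed -e 's/'\''/_/g')
echo "$shell_form"   # it_s
```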

To use parentheses for grouping, escape them:

\(...\)

More details in Grouping below.

Non-Special Characters (do not need to be escaped in regular expressions)

<
>
(
)
!
-
{
}
+ # literal in basic regular expressions, which sed uses by default; it means "one or more" only as \+ in GNU sed, or as + when sed runs with -E
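
A quick experiment on how '+' behaves, as a sketch with a made-up input string. It assumes a sed that accepts -E for extended regular expressions, which both GNU and BSD sed do:

```shell
# In a basic regular expression, '+' is an ordinary character.
bre_plus=$(printf '%s\n' 'a+b aaa' | sed -e 's/a+/X/')
echo "$bre_plus"      # Xb aaa

# Portable "one or more": repeat the element once, then add '*'.
one_or_more=$(printf '%s\n' 'a+b aaa' | sed -e 's/aa*/X/g')
echo "$one_or_more"   # X+b X

# With -E, '+' becomes the "one or more" quantifier.
ere_plus=$(printf '%s\n' 'a+b aaa' | sed -E -e 's/a+/X/g')
echo "$ere_plus"      # X+b X
```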

Grouping

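Use \( and \) for grouping. Parentheses must be escaped to be interpreted as grouping delimiters; unescaped, they match literal parentheses. The text matched by each group can be referenced as \1, \2, and so on.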
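
A minimal sketch of grouping with back-references, using hypothetical input strings:

```shell
# Two escaped groups capture the text around '='; \2 and \1 swap them.
swapped=$(printf '%s\n' 'key=value' | sed -e 's/\(.*\)=\(.*\)/\2=\1/')
echo "$swapped"    # value=key

# Unescaped parentheses are ordinary characters in a basic regular expression.
literal=$(printf '%s\n' 'f(x)' | sed -e 's/(x)/[x]/')
echo "$literal"    # f[x]
```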

Negation

Match any character except the specified ones. With the trailing *, the expression matches a run of zero or more such characters (this differs from bash patterns, where * is not a quantifier and [^abc] always matches exactly one character):

[^abc]*
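
A short sketch of the negated set with and without the * quantifier. The input string is made up:

```shell
# [^abc]* consumes the whole leading run of characters that are not a, b or c.
run=$(printf '%s\n' 'xyzabc' | sed -e 's/[^abc]*/N/')
echo "$run"       # Nabc

# Without '*', [^abc] matches exactly one such character.
single=$(printf '%s\n' 'xyzabc' | sed -e 's/[^abc]/N/')
echo "$single"    # Nyzabc
```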

Examples

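Match a run of characters other than space:

     [^ ]*

.* may seem to work too, but it is not equivalent: . matches any character, including a space, and the match is greedy, so it extends past spaces.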
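
A small comparison of [^ ]* and .* on the same input, as a sketch with a hypothetical three-word line. The two expressions capture different text once the line contains more than one space:

```shell
# [^ ]* stops at the first space, so the group holds only the first word.
first_word=$(printf '%s\n' 'one two three' | sed -e 's/^\([^ ]*\) .*/\1/')
echo "$first_word"   # one

# .* is greedy and also matches spaces, so the group extends to the last space.
greedy=$(printf '%s\n' 'one two three' | sed -e 's/^\(.*\) .*/\1/')
echo "$greedy"       # one two
```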

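Words (one or more digits, letters, or _). The character class is repeated before * so that the pattern cannot match the empty string:

sed -e 's/[0-9a-zA-Z_][0-9a-zA-Z_]*/THIS_WAS_A_WORD/g'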
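
A minimal run of the word substitution on a made-up line. This sketch requires at least one word character per match, because a pattern that consists only of a starred element can also match the empty string:

```shell
# Each run of one or more word characters is replaced; the space is kept.
words=$(printf '%s\n' 'foo bar_1' | sed -e 's/[0-9a-zA-Z_][0-9a-zA-Z_]*/W/g')
echo "$words"   # W W
```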

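Whitespace (spaces, tabs): \s is a GNU sed extension and may not work in other implementations; the POSIX character class [[:space:]] is the portable form. sed reads its input line by line, so the pattern space normally contains no newline to match.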
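
A small sketch using the POSIX character class [[:space:]], which does not depend on \s support. The input, a space, a tab and another space between two letters, is made up:

```shell
# Collapse each run of spaces and tabs into a single underscore.
collapsed=$(printf 'a \t b\n' | sed -e 's/[[:space:]][[:space:]]*/_/g')
echo "$collapsed"   # a_b
```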

Regular Expression Syntax

TODO: normalize across Java Regular Expression Syntax, grep Regular Expression Syntax, and sed Regular Expression Syntax.