Latest revision as of 21:46, 1 April 2024

External

https://www.gnu.org/software/sed/manual/html_node/Regular-Expressions.html

Internal

Meta-Characters - Special Characters (need to be escaped in regular expressions)

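/ # special only as the delimiter of s/// and of addresses; escape it or choose another delimiter
" # special to the shell inside a double-quoted script, not to sed itself
$ # unescaped signifies the end of a line
^ # unescaped signifies the beginning of a line
! # special to interactive bash (history expansion) inside double quotes, not to sed regular expressions
[
]
: # only meaningful inside bracket expressions such as [[:alpha:]]
* # zero or more of the preceding item
. # dot, matches any single character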
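The effect of escaping can be checked directly. The following is a minimal sketch; the sample inputs and variable names are assumptions chosen for illustration, and only POSIX basic regular expression features are used:

```shell
# Sample input (hypothetical) containing two meta-characters.
line='a.b*c'

# Unescaped . matches any character, so every character is replaced.
any=$(printf '%s\n' "$line" | sed 's/./X/g')        # XXXXX

# Escaped \. matches only a literal dot.
dot=$(printf '%s\n' "$line" | sed 's/\./X/g')       # aXb*c

# Escaped \* matches only a literal asterisk.
star=$(printf '%s\n' "$line" | sed 's/\*/X/g')      # a.bXc

# ^ and $ anchor the match to the beginning and the end of the line.
first=$(printf '%s\n' 'abc abc' | sed 's/^abc/X/')  # X abc
last=$(printf '%s\n' 'abc abc' | sed 's/abc$/X/')   # abc X

# / needs escaping only because it is the s/// delimiter here;
# choosing another delimiter avoids the escaping.
slash=$(printf '%s\n' '/usr/bin' | sed 's/\//:/g')  # :usr:bin
pipe=$(printf '%s\n' '/usr/bin' | sed 's|/|:|g')    # :usr:bin

printf '%s\n' "$any" "$dot" "$star" "$first" "$last" "$slash" "$pipe"
```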

A single quote is a special case: it cannot be escaped inside a single-quoted script, so match it with its ASCII hexadecimal value prefixed by \x (a GNU sed extension):

\x27
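A short sketch of both ways to match a single quote, assuming GNU sed for the \x27 form. The second form is a shell quoting idiom rather than a sed feature, and should work with any sed; the sample input and variable names are illustrative:

```shell
# GNU sed extension: \x27 is the hexadecimal escape for a single quote.
gnu=$(printf '%s\n' "it's" | sed 's/\x27/"/')

# Shell idiom: close the quoted script, add an escaped quote, reopen the quotes.
# sed receives the script s/'/"/
portable=$(printf '%s\n' "it's" | sed 's/'\''/"/')

printf '%s\n' "$gnu" "$portable"
```

With GNU sed both commands print it"s.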

To use () for grouping, they need to be escaped:

\(...\)

More details in Grouping below.

Non-Special Characters (do not need to be escaped in regular expressions)

<
>
(
)
!
-
{
}

`+`

+ by itself is not a meta-character, it matches "+".

In GNU sed, \+ matches one or more occurrences of the preceding item. With -E (extended regular expressions), an unescaped + has that meaning instead.
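The variants can be compared side by side. This is a sketch under the assumption that the local sed accepts -E (GNU and BSD sed both do); the aa* form is a portable way to say "one or more" without +, and the sample input is illustrative:

```shell
line='a+b aaa'

# Unescaped + in a basic regular expression is a literal plus sign.
literal=$(printf '%s\n' "$line" | sed 's/+/ plus /')   # a plus b aaa

# GNU extension: \+ means one or more of the preceding item.
gnu=$(printf '%s\n' "$line" | sed 's/a\+/X/g')

# Portable equivalent: one a followed by zero or more a.
portable=$(printf '%s\n' "$line" | sed 's/aa*/X/g')    # X+b X

# Extended syntax: unescaped + means one or more.
extended=$(printf '%s\n' "$line" | sed -E 's/a+/X/g')  # X+b X

printf '%s\n' "$literal" "$gnu" "$portable" "$extended"
```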

Grouping

Use \( and \) for grouping. Parentheses must be escaped to be interpreted as grouping delimiters; unescaped, they match literal parentheses.
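A common use of groups is to refer back to them in the replacement with \1, \2 and so on. The following sketch uses made-up input and variable names to show both the escaped and the unescaped form:

```shell
# Escaped parentheses capture two groups; \2 \1 swaps them.
swapped=$(printf '%s\n' 'john smith' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')  # smith john

# Unescaped parentheses are ordinary characters in a basic regular expression.
literal=$(printf '%s\n' 'f(x)' | sed 's/(x)/[x]/')                            # f[x]

printf '%s\n' "$swapped" "$literal"
```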

Brackets

Brackets mean "any one of"

[ab]

will match "a" or "b".

echo "blah" | sed -e 's/[ab]/x/g'

prints "xlxh".

Brackets and Negation

A leading ^ inside brackets negates the set: [^abc] matches any single character other than "a", "b" or "c". Followed by *, it matches a run of zero or more such characters. The behavior is different from the behavior of bash patterns on negation:

[^abc]*
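A few illustrative cases of negated brackets, with and without the trailing *. The inputs and variable names are assumptions made for this sketch:

```shell
# Without *, a negated bracket matches exactly one character.
one=$(printf '%s\n' 'abcxyz' | sed 's/[^abc]/_/')       # abc_yz

# With *, it matches a whole run; here, the run that ends the line.
kept=$(printf '%s\n' 'abcxyz' | sed 's/[^abc]*$//')     # abc

# Typical use: drop everything up to and including the first "=".
value=$(printf '%s\n' 'key=value' | sed 's/[^=]*=//')   # value

printf '%s\n' "$one" "$kept" "$value"
```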

Match Zero or One Character

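Normally, this would be achieved with ? placed after the character or the group of characters in question, but this does not work with standard sed, where an unescaped ? matches a literal "?". GNU sed accepts \? for this purpose, and with -E an unescaped ? works.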
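Two ways to express "zero or one" can be sketched as follows: the POSIX interval \{0,1\}, which does not rely on ?, and the extended syntax enabled by -E (assumed to be available, as it is in GNU and BSD sed). The sample input is illustrative:

```shell
line='color colour'

# POSIX interval: the preceding "u" may occur zero or one time.
interval=$(printf '%s\n' "$line" | sed 's/colou\{0,1\}r/X/g')   # X X

# Extended syntax: unescaped ? means zero or one.
extended=$(printf '%s\n' "$line" | sed -E 's/colou?r/X/g')      # X X

printf '%s\n' "$interval" "$extended"
```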

Examples

Match everything except space:

[^ ]*

.*

also matches, but it matches spaces as well, so it gives the same result only when the input contains no spaces.
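The two patterns can be compared on an input that contains a space. This is a minimal sketch with made-up input:

```shell
line='first second'

# [^ ]* stops at the first space.
nonspace=$(printf '%s\n' "$line" | sed 's/[^ ]*/X/')   # X second

# .* runs to the end of the line, spaces included.
anything=$(printf '%s\n' "$line" | sed 's/.*/X/')      # X

printf '%s\n' "$nonspace" "$anything"
```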

Words (digits, alpha, _):

sed -e 's/[0-9a-zA-Z_]*/THIS_WAS_A_WORD/g'
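Because * also matches the empty string, the pattern above can match at positions where there is no word at all. A variant that requires at least one word character is sketched below. The doubled bracket and the \{1,\} interval are two portable ways to say "one or more"; the input and the replacement W are illustrative:

```shell
line='foo bar_1, baz'

# One word character followed by zero or more word characters.
doubled=$(printf '%s\n' "$line" | sed -e 's/[0-9a-zA-Z_][0-9a-zA-Z_]*/W/g')   # W W, W

# Same idea with the POSIX class [:alnum:] and an interval.
classed=$(printf '%s\n' "$line" | sed -e 's/[[:alnum:]_]\{1,\}/W/g')          # W W, W

printf '%s\n' "$doubled" "$classed"
```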

Blank spaces (spaces, tabs, newlines): \s is a GNU extension and does not work in every sed. The POSIX character class [[:space:]] is the portable form.
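A sketch of collapsing runs of whitespace, assuming a sed that supports POSIX character classes. The \s form is shown for comparison and assumes GNU sed. Since sed reads input line by line, the trailing newline is not part of the text being matched:

```shell
# printf expands \t to a tab, so the input is: a, space, tab, space, b.
# POSIX class: one whitespace character followed by zero or more.
posix=$(printf 'a \t b\n' | sed 's/[[:space:]][[:space:]]*/_/g')   # a_b

# GNU extensions: \s for whitespace and \+ for one or more.
gnu=$(printf 'a \t b\n' | sed 's/\s\+/_/g')

printf '%s\n' "$posix" "$gnu"
```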

Regular Expression Syntax

TO NORMALIZE across java Regular Expression Syntax, grep Regular Expression Syntax, sed Regular Expression Syntax.


Sed Regular Expressions: Difference between revisions
