Bash Patterns

From NovaOrdis Knowledge Base
Revision as of 00:10, 19 September 2019

External

Internal

Metacharacters

The following characters have a special meaning when a bash pattern is evaluated, and they need to be escaped to be matched literally:

*

* (star) matches any string, including the null string. When the globstar shell option is enabled, and ‘*’ is used in a filename expansion context, two adjacent ‘*’s used as a single pattern will match all files and zero or more directories and subdirectories. If followed by a ‘/’, two adjacent ‘*’s will match only directories and subdirectories.
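As a rough illustration of both behaviors described above, the sketch below first uses '*' in a case statement and then enables globstar in a throwaway directory. The helper name is_log and the file names are hypothetical, chosen only for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a name matches the pattern *.log
is_log() {
  case "$1" in
    *.log) echo yes ;;   # '*' matches any string, including the null string
    *)     echo no ;;
  esac
}

is_log app.log    # yes
is_log .log       # yes: '*' matched the null string
is_log app.txt    # no

# globstar: '**' descends into subdirectories during filename expansion.
shopt -s globstar
demo=$(mktemp -d)                      # scratch directory for the demo
mkdir -p "$demo/a/b"
touch "$demo/top.txt" "$demo/a/mid.txt" "$demo/a/b/deep.txt"
cd "$demo" || exit 1
files=( **/*.txt )    # the three .txt files, at any depth
dirs=( **/ )          # followed by '/', only directories: a/ and a/b/
printf '%s\n' "${files[@]}" "${dirs[@]}"
cd - >/dev/null && rm -rf "$demo"      # clean up the scratch directory
```

Without globstar, '**' behaves like a single '*' and does not cross directory boundaries.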

?

? (question mark) matches any single character. The dot ('.') is not a metacharacter.
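A minimal sketch of how '?' behaves, using a case statement so that no files are needed. The helper name match_a_c is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does the argument match the pattern a?c
match_a_c() {
  case "$1" in
    a?c) echo "match" ;;
    *)   echo "no match" ;;
  esac
}

match_a_c abc     # match
match_a_c a.c     # match: '?' accepts a dot like any other character
match_a_c ac      # no match: '?' requires exactly one character
match_a_c abbc    # no match: two characters between 'a' and 'c'
```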

[...]

Matches any one of the enclosed characters.

A pair of characters separated by a hyphen denotes a range expression; any character that falls between those two characters, inclusive, using the current locale’s collating sequence and character set, is matched. A digit is matched as follows:

[0-9]

If the first character following the ‘[’ is a ‘!’ or a ‘^’ then any character not enclosed is matched.

A ‘-’ may be matched by including it as the first or last character in the set.

A ‘]’ may be matched by including it as the first character in the set.
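The bracket-expression rules above (ranges, negation, literal '-' and literal ']') can be sketched with a small case statement. The helper name classify and its labels are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a single character using bracket expressions
classify() {
  case "$1" in
    [0-9])  echo "digit" ;;
    []-])   echo "bracket-or-hyphen" ;;  # ']' first and '-' last are literal
    [!a-z]) echo "not-lowercase" ;;      # '!' right after '[' negates the set
    *)      echo "other" ;;
  esac
}

classify 7      # digit
classify ']'    # bracket-or-hyphen
classify '-'    # bracket-or-hyphen
classify '#'    # not-lowercase
classify q      # other
```

The order of the case branches matters here: ']' and '-' would also satisfy the negated set, so they are tested first.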

The sorting order of characters in range expressions is determined by the current locale and the values of the LC_COLLATE and LC_ALL shell variables, if set. For example, in the default C locale, ‘[a-dx-z]’ is equivalent to ‘[abcdxyz]’. Many locales sort characters in dictionary order, and in these locales ‘[a-dx-z]’ is typically not equivalent to ‘[abcdxyz]’; it might be equivalent to ‘[aBbCcDdxXyYz]’, for example. To obtain the traditional interpretation of ranges in bracket expressions, you can force the use of the C locale by setting the LC_COLLATE or LC_ALL environment variable to the value ‘C’, or enable the globasciiranges shell option.
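A short sketch of forcing the traditional range interpretation, assuming bash 4.3 or later for the globasciiranges option. The helper name in_range is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: test whether a character falls in the range [a-d]
in_range() {
  case "$1" in
    [a-d]) echo yes ;;
    *)     echo no ;;
  esac
}

# Option 1: force the C locale, so ranges follow ASCII code order.
LC_ALL=C

# Option 2: keep the current locale, but use C-locale ordering for
# range expressions only (bash 4.3 or later).
shopt -s globasciiranges

in_range b    # yes
in_range B    # no: in the C locale 'B' sorts before 'a', outside a..d
```

Either option alone is enough; both are shown only for comparison.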

Classes

Within ‘[’ and ‘]’, character classes can be specified using the syntax [:class:], where class is one of the following classes defined in the POSIX standard:

  • alnum
  • alpha
  • ascii
  • blank
  • cntrl
  • digit
  • graph
  • lower
  • print
  • punct
  • space
  • upper
  • word: matches letters, digits, and the character ‘_’.
  • xdigit

A character class matches any character belonging to that class.
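Note that a class such as [:digit:] is only recognized inside a bracket expression, so the complete pattern is written [[:digit:]]. The sketch below uses the digit and word classes to check identifier-like strings; the helper name is_identifier and its rules are hypothetical, not part of bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the argument is non-empty, does not start
# with a digit, and contains only word characters (letters, digits, '_').
is_identifier() {
  case "$1" in
    ""|[[:digit:]]*) return 1 ;;   # empty, or starts with a digit
    *[![:word:]]*)   return 1 ;;   # contains a character outside [:word:]
    *)               return 0 ;;
  esac
}

is_identifier foo_1   && echo "foo_1 is valid"
is_identifier 1foo    || echo "1foo is rejected"
is_identifier foo-bar || echo "foo-bar is rejected"
```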

/

/ (slash)

~

~ (tilde)

Non-Special Characters

These characters do not need to be escaped in bash patterns to match:

/ # forward slash
. # dot - does NOT match any character, it just matches a dot
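A minimal sketch showing that the dot is literal in a bash pattern, unlike in a regular expression. The helper name is_a_dot_c is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does the argument match the pattern a.c
is_a_dot_c() {
  case "$1" in
    a.c) echo "match" ;;
    *)   echo "no match" ;;
  esac
}

is_a_dot_c a.c    # match
is_a_dot_c abc    # no match: '.' matches only a literal dot
```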