- A regexp range can by accident match more than was intended.
- For example, the regular expression /[a-zA-z]/ will
- match every lowercase and uppercase letters, but the same regular
- expression will also match the chars: [\]^_`.
+ It's easy to write a regular expression range that matches a wider range of characters than you intended.
+ For example, /[a-zA-z]/ matches all lowercase and all uppercase letters,
+ as you would expect, but it also matches the characters: [ \ ] ^ _ `.
- On other occasions it can happen that the dash in a regular
- expression is not escaped, which will cause it to be interpreted
- as part of a range. For example in the character class [a-zA-Z0-9%=.,-_]
+ Another common problem is failing to escape the dash character in a regular
+ expression. An unescaped dash is interpreted
+ as part of a range. For example, in the character class [a-zA-Z0-9%=.,-_]
the last character range matches the 52 characters between
, and _ (both included), which overlaps with the
- range [0-9] and is thus clearly not intended.
+ range [0-9] and is clearly not intended by the writer.
-
- Don't write character ranges were there might be confusion as to
- which characters are included in the range.
-
+ Avoid any confusion about which characters are included in the range by
+ writing unambiguous regular expressions.
+ Always check that character ranges match only the expected characters.
- The following example code checks whether a string is a valid 6 digit hex color.
+ The following example code is intended to check whether a string is a valid 6 digit hex color.
- However, the A-f range matches every uppercase character, and
- thus a "color" like #XYZ is considered valid.
+ However, the A-f range is overly large and matches every uppercase character.
+ It would parse a "color" like #XXYYZZ as valid.
@@ -65,10 +63,9 @@ public class Tester {
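The claim about /[a-zA-z]/ in the help text above can be checked directly. The following is a minimal sketch in Python, not part of the change itself; the names `wide`, `narrow` and `extras` are made up for illustration. It lists every ASCII character that the wide class accepts and the intended class rejects.

```python
import re

# Hypothetical names, for illustration only.
wide = re.compile(r"[a-zA-z]")    # range A-z runs from 'A' (65) to 'z' (122)
narrow = re.compile(r"[a-zA-Z]")  # the class the author most likely meant

# Collect the ASCII characters accepted only by the wide class.
extras = [
    chr(code)
    for code in range(128)
    if wide.fullmatch(chr(code)) and not narrow.fullmatch(chr(code))
]

print(extras)
```

The six characters that sit between `Z` and `a` in ASCII are the ones that slip through.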
- A regexp range can by accident match more than was intended.
- For example, the regular expression /[a-zA-z]/ will
- match every lowercase and uppercase letters, but the same regular
- expression will also match the chars: [\]^_`.
+ It's easy to write a regular expression range that matches a wider range of characters than you intended.
+ For example, /[a-zA-z]/ matches all lowercase and all uppercase letters,
+ as you would expect, but it also matches the characters: [ \ ] ^ _ `.
- On other occasions it can happen that the dash in a regular
- expression is not escaped, which will cause it to be interpreted
- as part of a range. For example in the character class [a-zA-Z0-9%=.,-_]
+ Another common problem is failing to escape the dash character in a regular
+ expression. An unescaped dash is interpreted
+ as part of a range. For example, in the character class [a-zA-Z0-9%=.,-_]
the last character range matches the 52 characters between
, and _ (both included), which overlaps with the
- range [0-9] and is thus clearly not intended.
+ range [0-9] and is clearly not intended by the writer.
-
- Don't write character ranges were there might be confusion as to
- which characters are included in the range.
-
+ Avoid any confusion about which characters are included in the range by
+ writing unambiguous regular expressions.
+ Always check that character ranges match only the expected characters.
- The following example code checks whether a string is a valid 6 digit hex color.
+ The following example code is intended to check whether a string is a valid 6 digit hex color.
- However, the A-f range matches every uppercase character, and
- thus a "color" like #XYZ is considered valid.
+ However, the A-f range is overly large and matches every uppercase character.
+ It would parse a "color" like #XXYYZZ as valid.
@@ -59,10 +57,9 @@ function isValidHexColor(color) {
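The unescaped-dash problem described above can be illustrated the same way. This is a sketch under assumptions: the probe string and the helper `matched` are hypothetical, and escaping the dash is only one possible fix (placing the dash last in the class would also work).

```python
import re

# The class from the help text: ",-_" is read as a range from ',' to '_'.
unescaped = re.compile(r"[a-zA-Z0-9%=.,-_]")
# One possible fix (an assumption, not taken from the change): escape the dash.
escaped = re.compile(r"[a-zA-Z0-9%=.,\-_]")

def matched(pattern, chars):
    """Return the characters in `chars` that the pattern accepts."""
    return [ch for ch in chars if pattern.fullmatch(ch)]

# Hypothetical probe: punctuation the author probably did not mean to allow,
# plus the two characters that were meant to be literal.
probe = "<>;@/-_"

print(matched(unescaped, probe))
print(matched(escaped, probe))
```

With the dash escaped, only the literal `-` and `_` from the probe are accepted.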
- A regexp range can by accident match more than was intended.
- For example, the regular expression /[a-zA-z]/ will
- match every lowercase and uppercase letters, but the same regular
- expression will also match the chars: [\]^_`.
+ It's easy to write a regular expression range that matches a wider range of characters than you intended.
+ For example, /[a-zA-z]/ matches all lowercase and all uppercase letters,
+ as you would expect, but it also matches the characters: [ \ ] ^ _ `.
- On other occasions it can happen that the dash in a regular
- expression is not escaped, which will cause it to be interpreted
- as part of a range. For example in the character class [a-zA-Z0-9%=.,-_]
+ Another common problem is failing to escape the dash character in a regular
+ expression. An unescaped dash is interpreted
+ as part of a range. For example, in the character class [a-zA-Z0-9%=.,-_]
the last character range matches the 52 characters between
, and _ (both included), which overlaps with the
- range [0-9] and is thus clearly not intended.
+ range [0-9] and is clearly not intended by the writer.
-
- Don't write character ranges were there might be confusion as to
- which characters are included in the range.
-
+ Avoid any confusion about which characters are included in the range by
+ writing unambiguous regular expressions.
+ Always check that character ranges match only the expected characters.
- The following example code checks whether a string is a valid 6 digit hex color.
+ The following example code is intended to check whether a string is a valid 6 digit hex color.
- However, the A-f range matches every uppercase character, and
- thus a "color" like #XYZ is considered valid.
+ However, the A-f range is overly large and matches every uppercase character.
+ It would parse a "color" like #XXYYZZ as valid.
@@ -59,10 +57,9 @@ def is_valid_hex_color(color):
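The sample code that this hunk touches is not shown in the diff, so the following is only a sketch of what the hex color check might look like. The pattern `#[0-9a-fA-f]{6}` and the function names `is_valid_hex_color_loose` and `is_valid_hex_color_strict` are assumptions chosen to match the A-f range that the help text describes.

```python
import re

def is_valid_hex_color_loose(color):
    # Assumed form of the flawed check: A-f spans 'A' (65) to 'f' (102),
    # so it covers every uppercase letter, not just A-F.
    return re.fullmatch(r"#[0-9a-fA-f]{6}", color) is not None

def is_valid_hex_color_strict(color):
    # Assumed fix: restrict the uppercase range to A-F.
    return re.fullmatch(r"#[0-9a-fA-F]{6}", color) is not None

for candidate in ["#1a2B3c", "#XXYYZZ"]:
    print(candidate,
          is_valid_hex_color_loose(candidate),
          is_valid_hex_color_strict(candidate))
```

Both versions accept `#1a2B3c`, but only the loose one accepts `#XXYYZZ`.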
- A regexp range can by accident match more than was intended.
- For example, the regular expression /[a-zA-z]/ will
- match every lowercase and uppercase letters, but the same regular
- expression will also match the chars: [\]^_`.
+ It's easy to write a regular expression range that matches a wider range of characters than you intended.
+ For example, /[a-zA-z]/ matches all lowercase and all uppercase letters,
+ as you would expect, but it also matches the characters: [ \ ] ^ _ `.
- On other occasions it can happen that the dash in a regular
- expression is not escaped, which will cause it to be interpreted
- as part of a range. For example in the character class [a-zA-Z0-9%=.,-_]
+ Another common problem is failing to escape the dash character in a regular
+ expression. An unescaped dash is interpreted
+ as part of a range. For example, in the character class [a-zA-Z0-9%=.,-_]
the last character range matches the 52 characters between
, and _ (both included), which overlaps with the
- range [0-9] and is thus clearly not intended.
+ range [0-9] and is clearly not intended by the writer.
-
- Don't write character ranges were there might be confusion as to
- which characters are included in the range.
-
+ Avoid any confusion about which characters are included in the range by
+ writing unambiguous regular expressions.
+ Always check that character ranges match only the expected characters.
- The following example code checks whether a string is a valid 6 digit hex color.
+ The following example code is intended to check whether a string is a valid 6 digit hex color.
- However, the A-f range matches every uppercase character, and
- thus a "color" like #XYZ is considered valid.
+ However, the A-f range is overly large and matches every uppercase character.
+ It would parse a "color" like #XXYYZZ as valid.
@@ -59,10 +57,9 @@ end