
Regular Expression Matches An Extra Empty Group

I'm new to regular expressions. Everything I post below is a simplified example from my code. I have a string, let's say test_1,some_2,foo,bar_4, that I want to replace b

Solution 1:

You are getting that last false-positive match because your regular expression is matching empty strings:

"".replace(/(.*?)(?:_(\d))?(?:,|$)/g, "title: '$1' ('$2') ");

title: '' ('') 

So, in your case after all characters have been consumed, it will match an empty string.
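To see where that empty match comes from, one can list every match together with its position. This is a minimal sketch, assuming the sample string from the question and a runtime that supports String.prototype.matchAll (ES2020); the variable names are illustrative only.

```javascript
// Sample input taken from the question.
const input = "test_1,some_2,foo,bar_4";
const pattern = /(.*?)(?:_(\d))?(?:,|$)/g;

// Collect each match together with the index at which it starts.
const matches = Array.from(input.matchAll(pattern), (m) => ({
  text: m[0],
  index: m.index,
}));

console.log(matches);
// The last entry is an empty match whose index equals input.length,
// i.e. it sits at the very end of the string.
```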

You can prevent this by making the first group require at least one character, since it is not really meant to be optional even though the pattern below allows it to match nothing:

/(.*?)(?:_(\d))?(?:,|$)/g
 --^^--

For example,

var str = "test_1,some_2,foo,bar_4";
str.replace(/([a-z]+)(?:_(\d))?(?:,|$)/gi, "title: '$1' ('$2') ");

title: 'test' ('1') title: 'some' ('2') title: 'foo' ('') title: 'bar' ('4') 

That is,

  • ([a-z]+): matches at least one alphabetical character, and
  • gi: the g flag replaces every match rather than only the first, and the i flag makes the match case-insensitive.

Solution 2:

As the simplest solution, you can just add a trailing comma to the original string before matching it against the regular expression.
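A minimal sketch of this approach, assuming the sample string from the question. Note one assumption that goes beyond the answer: the pattern here requires the comma (the |$ alternative is dropped), because a pattern that still accepts the end of the string would keep producing the empty match there.

```javascript
const input = "test_1,some_2,foo,bar_4";

// Every item now ends with a comma, so the pattern can insist on one.
const result = (input + ",").replace(
  /(.*?)(?:_(\d))?,/g,
  "title: '$1' ('$2') "
);

console.log(result);
// "title: 'test' ('1') title: 'some' ('2') title: 'foo' ('') title: 'bar' ('4') "
```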

Solution 3:

Your problem is that your pattern matches not only what you want but also empty strings:

(.*?)       # matches any string (including an empty one) not containing \n
(?:_(\d))?  # it is an optional group
(?:,|$)     # it matches a comma or the end of the string

So when your regex engine evaluates the end of your string against your pattern, it sees that:

  • the first group matches because an empty string is being processed
  • the second group matches because it is optional
  • the third group matches because the end of the string is being processed

so the whole pattern matches and you get an extra match. You can see this clearly in the console using the string's match method:

> s.match(/(.*?)(?:_(\d))?(?:,|$)/g)
  ["test_1,", "some_2,", "foo,", "bar_4", ""]

You have at least two options for dealing with the problem:

  • change the first group of your pattern in a way that doesn't match the empty string but still fits your needs (it depends on the strings you have to process)
  • leave your regex untouched and process the string returned by replace removing the unwanted part

The first option is the elegant one. The second can be easily achieved with an extra line of code:

> var result = s.replace(/(.*?)(?:_(\d))?(?:,|$)/g, "title: $1 ($2) ");
> result = result.slice(0, result.lastIndexOf("title"));
  "title: test (1) title: some (2) title: foo () title: bar (4) "
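For comparison, the first option could look something like the following. This is only a sketch, assuming item names are never empty, so the first group can use .+? (one or more characters) in place of .*?; that particular change is an illustration and is not taken from the answer itself.

```javascript
const s = "test_1,some_2,foo,bar_4";

// ".+?" requires at least one character, so the pattern can no longer
// match the empty string left at the end of the input.
const result = s.replace(/(.+?)(?:_(\d))?(?:,|$)/g, "title: $1 ($2) ");

console.log(result);
// "title: test (1) title: some (2) title: foo () title: bar (4) "
```

Whether .+? is strict enough depends on the strings you actually have to process; a narrower character class may be a better fit.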
