RegEx to match numbers in whole words only

1

I would like to match positive and negative numbers (no decimal or thousand separators) inside a string using .NET, but I want to match whole words only. So if a string looks like

redeem: -1234
paid: 234432

then I'd like to match -1234 and 234432

But if text is

LS022-1234-5678
FA123245

then I want no match returned. I tried

\b\-?\d+\b

but it will only match 1234 in the first scenario, not returning the "-" sign.

Any help is appreciated. Thank you.

2012-04-05 17:13
by Daniel


0

Well, I'm sure this is far from perfect, but it works for your examples:

(?<=\W)-?(?<!\w-)\d+

If you want to allow underscores just before the number, then I'd use this modification:

(?i)(?<=[^a-z0-9])-?(?<![a-z0-9]-)\d+
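
To see how the two patterns above behave on the strings from the question, here is a minimal sketch. It uses Python's re module rather than .NET (an assumption made only so the check is easy to run; both patterns use fixed-width lookbehinds, which Python accepts). The helper name find_numbers and the extra sample "item_42" are made up for illustration.

```python
import re

# Both patterns are copied from the answer text; only the names are new.
STRICT = re.compile(r"(?<=\W)-?(?<!\w-)\d+")
ALLOW_UNDERSCORE = re.compile(r"(?i)(?<=[^a-z0-9])-?(?<![a-z0-9]-)\d+")

def find_numbers(pattern, text):
    """Return every substring of text matched by the compiled pattern."""
    return pattern.findall(text)

# The first four samples come from the question; "item_42" is hypothetical.
samples = ["redeem: -1234", "paid: 234432", "LS022-1234-5678", "FA123245", "item_42"]

for text in samples:
    print(repr(text), find_numbers(STRICT, text), find_numbers(ALLOW_UNDERSCORE, text))
```

Both patterns need some character in front of the number, so a number at the very start of the string is not matched. Neither pattern checks what follows the digits.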

Let me know of any issues and I'll try and help. If you'd like me to explain either of them, let me know that too.

EDIT

To only match if there is a space or tab just before the number / negative sign (as noted in the comment below), this could be used:

(?<=[ \t])-?\d+

Note that it will match e.g. on the first number series of a telephone number, time or date value, and will not match if the number is at the beginning of the line (after a newline) - make sure this is what you intend :D
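
A quick way to check those caveats is a sketch like the following. It is written in Python rather than .NET (an assumption for convenience; the pattern itself is unchanged), and the helper name and the phone, time and start-of-line samples are invented for illustration.

```python
import re

# Pattern copied from the edit above: the number (with optional minus sign)
# must be directly preceded by a space or a tab, which is not part of the match.
AFTER_SPACE_OR_TAB = re.compile(r"(?<=[ \t])-?\d+")

def find_numbers(text):
    """Return every number that directly follows a space or tab."""
    return AFTER_SPACE_OR_TAB.findall(text)

samples = [
    "redeem: -1234",     # from the question
    "paid:\t234432",     # tab instead of space
    "LS022-1234-5678",   # from the question, should not match
    "call 555-1234",     # hypothetical phone number
    "at 10:30",          # hypothetical time value
    "42 first",          # number at the start of the string
    "total:\n42",        # number right after a newline
]

for text in samples:
    print(repr(text), find_numbers(text))
```

If numbers at the start of a line should count too, a negative lookbehind such as (?<!\S) is one alternative, at the cost of also accepting a preceding newline.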

2012-04-05 17:31
by Code Jockey
You're most helpful, thank you. This is far better. However, I failed to properly define the task: I want a match only if the positive or negative number is preceded by space or TAB character (I do not want the space or tab included in the match) - Daniel 2012-04-05 17:59
@Daniel - Please see the revision in the edit above - Code Jockey 2012-04-09 14:03


1

There is no word boundary between a space and -, thus you can't use \b there.

You could use:

(?<!\S)-?\d+\b

or

(?<![\w-])-?\d+\b

depending on your requirements (which aren't fully specified). Both will work for your examples, though.
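
As a rough check of the above, this sketch runs the question's original attempt and both suggested patterns over the sample strings. It uses Python's re module instead of .NET (an assumption for easy testing; the pattern strings are the same apart from the unnecessary escape on the hyphen), and the dictionary keys, the function name and the "(42)" sample are made up for illustration.

```python
import re

PATTERNS = {
    # The question's attempt: \b cannot match between a space and "-".
    "original": re.compile(r"\b-?\d+\b"),
    # Number must be preceded by whitespace or the start of the string.
    "no_nonspace_before": re.compile(r"(?<!\S)-?\d+\b"),
    # Number must not be preceded by a word character or a hyphen.
    "no_word_or_hyphen_before": re.compile(r"(?<![\w-])-?\d+\b"),
}

def find_numbers(name, text):
    """Return all matches of the named pattern in text."""
    return PATTERNS[name].findall(text)

# The first four samples come from the question; "(42)" is hypothetical.
samples = ["redeem: -1234", "paid: 234432", "LS022-1234-5678", "FA123245", "(42)"]

for name in PATTERNS:
    print(name, [find_numbers(name, text) for text in samples])
```

The "(42)" sample is where the two suggestions differ: the first rejects it because "(" is a non-space character, while the second accepts it because "(" is neither a word character nor a hyphen.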

2012-04-05 17:14
by Qtax
This still seems to match the cases that he doesn't want matched. It doesn't match the numbers as negative numbers, but it does match the digits, because the hyphen is not a word character - Code Jockey 2012-04-05 17:34
Code Jockey: you are correct. I need to match the "-" as well. Also the provided RegEx matched a lot of "garbage" (meaning non-whole words) as well - Daniel 2012-04-05 17:43
@Daniel the updated version should work better - Qtax 2012-04-05 18:34