Looking for a cleaner RegEx solution to matching different formats for an SSN


1

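To see what I want, check out the regex I'm using; I'll also try to explain it in English. I want to match 4444 or 444444444 or 444-44-4444: four digits alone, nine digits with no dashes, or nine digits in 3-2-4 form with both dashes. Here's what I have, and it does what I want: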

^[0-9]{9}$|^[0-9]{4}$|^[0-9]{3}-[0-9]{2}-[0-9]{4}$

Is there a way to do it without the ORs? I thought about doing this:

([0-9]{3}-?[0-9]{2}-?)?[0-9]{4}

but that allows 222-222222 (only one of the two dashes present), which I want to exclude.
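
To make the problem concrete, here is a minimal Python sketch comparing the two patterns from the question. The names EXPLICIT, LOOSE, SAMPLES and accepted are made up for illustration, and the ^ and $ anchors on the second pattern are an assumption added so the comparison is fair; the pattern as posted has none.

```python
import re

# The three-way alternation from the question.
EXPLICIT = re.compile(r"^[0-9]{9}$|^[0-9]{4}$|^[0-9]{3}-[0-9]{2}-[0-9]{4}$")

# The single-branch attempt. The anchors are an addition for this sketch.
LOOSE = re.compile(r"^([0-9]{3}-?[0-9]{2}-?)?[0-9]{4}$")

# Three wanted formats followed by two half-dashed strings that should fail.
SAMPLES = ["4444", "444444444", "444-44-4444", "222-222222", "22222-2222"]


def accepted(pattern, samples=SAMPLES):
    """Return the samples that the compiled pattern matches."""
    return [s for s in samples if pattern.match(s)]


if __name__ == "__main__":
    print(accepted(EXPLICIT))  # only the three wanted formats
    print(accepted(LOOSE))     # also lets the half-dashed strings through
```

Because each -? is optional independently of the other, the second pattern cannot tie the two dash positions together, which is why both half-dashed samples get through.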

2012-04-05 17:58
by Dale
Is 4444 a valid SSN? - alan 2012-04-05 18:02
What regex implementation are you using? grep? javascript? python? perl? - alan 2012-04-05 18:22
Have you looked at these? There seem to be some extra constraints on an SSN, e.g. disallowing all zeros in any subsection, SSNs starting with 666 etc, which you might want to consider - ire_and_curses 2012-04-05 18:30


1

You should be able to do this with backreferences:

^(?:\d{3}(-?)\d{2}\1)?\d{4}$

If a - is present, it is captured and can be referenced with \1. If it's not present, \1 will just be empty. So it essentially means: If - is at that position, it must be at the other position too.
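
As a quick check of that reasoning, here is a small Python sketch of the same pattern. The names BACKREF and is_ssn_like are hypothetical, and re.ASCII is an added assumption so that \d only means 0-9.

```python
import re

# Same pattern as above. Group 1 holds "-" or the empty string,
# and \1 demands the same text at the second separator position.
BACKREF = re.compile(r"^(?:\d{3}(-?)\d{2}\1)?\d{4}$", re.ASCII)


def is_ssn_like(text):
    """True if text is 4 digits, 9 digits, or 3-2-4 digits with both dashes."""
    return BACKREF.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["4444", "444444444", "444-44-4444", "222-222222"]:
        print(sample, is_ssn_like(sample))
```

The strings with only one dash are rejected because the captured separator and the backreference cannot disagree.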

DEMO

2012-04-05 18:02
by Felix Kling
Felix-Kling (the @ modifier isn't working): I think it should be ^(?:\d{3}(-?)\d{2}\1)?\d{4}$, i.e., (-?) instead of (-)?. For some reason it would not work for me otherwise - alan 2012-04-05 18:46
Interesting... which language did you check? It worked for me in JavaScript. But thinking about it, it makes sense as well.. - Felix Kling 2012-04-05 20:50
It didn't work in Python. And I have a text editor that (I think) uses Perl's regex implementation. It didn't work in that either. But it worked in both when I moved the ? inside the parentheses. (On another topic: when I type the @ in a comment, your username does not pop up so that I can insert @Felix Kling. Maybe because of the space? Do you know how to get around that?) - alan 2012-04-05 20:54
AFAIK, the username completion does not appear if the only other user is the thread owner... I get notified for any comment. And you also got notified, although I did not use @alan ; - Felix Kling 2012-04-05 21:26
It worked in JavaScript, but not in Java, which is what I needed and forgot to specify in my question. I ended up going with the less elegant ^([0-9]{3}[0-9]{2})?[0-9]{4}$|^[0-9]{3}-[0-9]{2}-[0-9]{4}$ in the interest of time - Dale 2012-04-06 12:48
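
The (-)? versus (-?) difference discussed in these comments can be reproduced in Python, as sketched below. OUTSIDE and INSIDE are made-up names for the two variants. In Python's re module, a backreference to a group that did not take part in the match fails, whereas JavaScript treats it as matching the empty string, which would explain why the first form only worked there.

```python
import re

# Quantifier outside the group: with no dash, group 1 never participates,
# so \1 has nothing to refer to and fails in Python.
OUTSIDE = r"^(?:\d{3}(-)?\d{2}\1)?\d{4}$"

# Quantifier inside the group: group 1 always participates and may
# capture the empty string, so \1 can match the empty string too.
INSIDE = r"^(?:\d{3}(-?)\d{2}\1)?\d{4}$"


def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for text in ["4444", "444-44-4444", "444444444"]:
        print(text, matches(OUTSIDE, text), matches(INSIDE, text))
```

Only the undashed nine-digit form separates the two variants; the four-digit and fully dashed forms match either way.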


0

The pattern marked as the answer actually fails because it doesn't follow the US specification for valid SSNs!

Using the match invalidator (?!), this pattern works: it throws out numbers whose first three digits are 000 or 666, and numbers which start with 9xx, as per the government specification Social Security Number Randomization.

# To use this regex pattern specify RegexOptions.IgnorePatternWhitespace due to these comments.
^                           # Beginning of line anchor
(?!9)                       # Can't be 900-999
(?!000)                     # If it starts with 000 it's bad (STOP MATCH!)
(?!666)                     # If it starts with 666 it's bad (STOP MATCH!)
(?<FIRST>\d{3})             # Match the first three digits and place into FIRST named capture group
(?:[\s\-]?)                 # Match but don't capture a possible space or dash
(?<SECOND>\d\d)             # Match next two digits
(?:[\s-]?)                  # Match but don't capture a possible space or dash
(?<THIRD>\d{4})             # Match the final four digits
$                           # End of line anchor

I describe the use of the match invalidator on my blog article Regular Expression (Regex) Match Invalidator (?!) in .Net.
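
For readers not on .NET, a rough Python translation of the pattern above might look like the sketch below. It is an adaptation, not the original: Python writes named groups as (?P<name>...), re.VERBOSE is the flag that permits the inline comments, and the names SSN_STRICT and parse_ssn are hypothetical.

```python
import re

# Python port of the commented .NET pattern. The lookaheads reject a
# first group of 000 or 666 and anything starting with 9.
SSN_STRICT = re.compile(
    r"""
    ^
    (?!9)               # cannot start with 9 (900-999)
    (?!000)             # cannot start with 000
    (?!666)             # cannot start with 666
    (?P<first>\d{3})    # first three digits
    [\s-]?              # optional whitespace character or dash
    (?P<second>\d\d)    # next two digits
    [\s-]?              # optional whitespace character or dash
    (?P<third>\d{4})    # final four digits
    $
    """,
    re.VERBOSE | re.ASCII,
)


def parse_ssn(text):
    """Return (first, second, third) for an accepted SSN, else None."""
    m = SSN_STRICT.fullmatch(text)
    return m.group("first", "second", "third") if m else None


if __name__ == "__main__":
    for sample in ["123-45-6789", "123456789", "666-45-6789", "4444"]:
        print(sample, parse_ssn(sample))
```

Two differences from the question's requirements are worth noting: this pattern needs all nine digits, so 4444 is rejected, and the two separators are independent, so a string such as 123-45 6789 is accepted.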

2012-04-05 19:17
by ΩmegaMan