Regex: split string at a particular word pattern, with a different value in the match group

2

This question is very similar to an earlier question I asked (This Question); however, I need to change the requirements slightly.

So in the earlier question this string

Berkshire Hathaway Inc (Ticker: BRK; NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)

creates 2 matches with these values:

Berkshire Hathaway Inc
Walt Disney Co

Now I want each match to contain either Ticker: XXX or the company name, with preference given to Ticker: XXX.

So for the example above it would match:

Ticker: BRK
Ticker: DIS

And for this example:

Berkshire Hathaway Inc (NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)

The result would be:

Berkshire Hathaway Inc
Ticker: DIS

I guess I just don't understand the regex solution in the previous question well enough to understand how to modify it to fit this pattern.

The regex is used from C#.

By the way, the previous regex solution was:

(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS):[^)]*)+\)|$)

which I guess should be changed to this now:

(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)

but how do I extract the Ticker: and choose that value over the other value if Ticker exists?

2012-04-04 17:07
by M.B.


4

I'm still learning regex, so I'm not sure if you can use conditional logic on groups. As an alternative though, you could modify your regex as follows so that it also captures a group for the ticker if it exists:

(?!\s*$)(.*?)(?:\((Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)

Then you could do the logic in your C# code. I guess something like this would work:

// Group 1 captures the company name; group 2 captures "Ticker: XXX" when it is present.
Regex regex = new Regex(@"(?!\s*$)(.*?)(?:\((Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)");
Match match = regex.Match("Berkshire Hathaway Inc (NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)");
while (match.Success) {
    // Prefer the ticker if group 2 took part in the match.
    if (match.Groups[2].Success)
    {
        Console.WriteLine(match.Groups[2].Value);
    }
    else
    {
        // No ticker in this entry, so fall back to the company name.
        Console.WriteLine(match.Groups[1].Value);
    }
    match = match.NextMatch();
}

Output:

Berkshire Hathaway Inc 
Ticker: DIS
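
For reference, the same idea can be wrapped in a small reusable method. This is only a sketch: the class name TickerOrName and the method name Extract are made up for illustration, and the call to Trim() is an addition that removes the whitespace the company-name group picks up around the name. The pattern itself is unchanged from the one above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the name is not from the original post.
static class TickerOrName
{
    // Same pattern as above: group 1 is the company name,
    // group 2 is the optional "Ticker: XXX" part.
    private static readonly Regex Pattern = new Regex(
        @"(?!\s*$)(.*?)(?:\((Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)");

    // Returns one value per company entry: the ticker if present, else the name.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            // Group.Success is false when the optional ticker group did not participate.
            string value = m.Groups[2].Success ? m.Groups[2].Value : m.Groups[1].Value;
            // Trim() is an assumption about the desired output (no surrounding spaces).
            results.Add(value.Trim());
        }
        return results;
    }

    static void Main()
    {
        string input = "Berkshire Hathaway Inc (NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) "
                     + "Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)";
        foreach (string value in Extract(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

With the second example string from the question this should print "Berkshire Hathaway Inc" followed by "Ticker: DIS", this time without the trailing space.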
2012-04-04 19:06
by Robbie
ha. I like that solution because it's easy - M.B. 2012-04-04 19:49
You cannot use conditional logic on groups, btw. Regular expressions are meant simply to parse the text as i - Justin Pihony 2012-04-05 00:39
@JustinPihony Thanks for the tip! Always gladly received : - Robbie 2012-04-05 07:33


2

I would suggest using a tool like Expresso to work out your regular expressions. It is designed for C# Regex, and will even copy the code you need to use into your clipboard. You can paste your example into the tool and then tweak your regular expression until it works. I find a tool like this to be a must for writing regular expressions.

2012-04-04 17:14
by Ryan Shyffer
I'm using Rad software and it helps with testing, but I'm still new to regex, so I'm just not sure I understand the syntax yet - M.B. 2012-04-04 17:18
regex buddy is whilst not free also awesom - krystan honour 2012-04-05 18:19