This question is very similar to an earlier question I asked (This Question); however, I need to change the requirements slightly.
In the earlier question, this string
Berkshire Hathaway Inc (Ticker: BRK; NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)
creates 2 matches with these values:
Berkshire Hathaway Inc
Walt Disney Co
Now I want each match to contain either Ticker: XXX or the company name, with preference going to Ticker: XXX when it is present.
So for the example above it would match:
Ticker: BRK
Ticker: DIS
And for this example:
Berkshire Hathaway Inc (NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)
The result would be:
Berkshire Hathaway Inc
Ticker: DIS
I guess I just don't understand the regex solution in the previous question well enough to understand how to modify it to fit this pattern.
The regex is used from C#.
By the way, the previous regex solution was:
(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS):[^)]*)+\)|$)
which I guess should be changed to this now:
(?!\s*$)(.*?)(?:\([^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)
but how do I extract the Ticker: and choose that value over the other value if Ticker exists?
I'm still learning regex, so I'm not sure whether you can use conditional logic on groups. As an alternative, though, you could modify your regex as follows so that it also captures the ticker in a second group when it exists:
(?!\s*$)(.*?)(?:\((Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)
Then you could do the selection logic in your C# code. I guess something like this would work:
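using System;
using System.Text.RegularExpressions;

// Group 1 captures the company name; group 2 captures "Ticker: XXX" when it is present.
Regex regex = new Regex(@"(?!\s*$)(.*?)(?:\((Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)");
Match match = regex.Match("Berkshire Hathaway Inc (NAICS: 524126, 511130, 335212, 445292, 511110, 442210; Duns: 00-102-4314) Walt Disney Co (Ticker: DIS; NAICS: 713110, 512110, 711211, 515120; Duns: 00-690-4700)");
while (match.Success)
{
    if (match.Groups[2].Success)
    {
        // A ticker was captured, so prefer it over the name.
        Console.WriteLine(match.Groups[2].Value);
    }
    else
    {
        // No ticker: fall back to the company name, trimming the surrounding spaces.
        Console.WriteLine(match.Groups[1].Value.Trim());
    }
    match = match.NextMatch();
}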
Output:
Berkshire Hathaway Inc
Ticker: DIS
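If you need the values in a list rather than printed, the same idea can be wrapped in a small helper. This is only a sketch: the class name TickerOrName, the method name Extract, and the group names name and ticker are my own choices for illustration, and the pattern is the one shown above with the two capturing groups given names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TickerOrName
{
    // Same pattern as above; the two capturing groups are named (names are illustrative).
    private static readonly Regex Pattern = new Regex(
        @"(?!\s*$)(?<name>.*?)(?:\((?<ticker>Ticker:[^;]+)?[^)]*(?:(?:SIC|NAICS|Duns):[^)]*)+\)|$)");

    // Hypothetical helper: for each match, returns the ticker if one was captured,
    // otherwise the company name with surrounding whitespace trimmed.
    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            Group ticker = m.Groups["ticker"];
            results.Add(ticker.Success ? ticker.Value : m.Groups["name"].Value.Trim());
        }
        return results;
    }
}
```

Calling TickerOrName.Extract on the second example string from the question should give a list containing Berkshire Hathaway Inc and Ticker: DIS, and on the first example string Ticker: BRK and Ticker: DIS.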
I would suggest using a tool like Expresso to work out your regular expressions. It is designed for C# Regex, and will even copy the code you need to use into your clipboard. You can paste your example into the tool and then tweak your regular expression until it works. I find a tool like this to be a must for writing regular expressions.