So as a preface, I need this regex to move past a bug I'm waiting on server people to fix.
Basically I get JSON back with unescaped " characters in HTML. I need a regex that looks inbetween <> characters and replaces " with a \".
<div style="padding: 0%; width: 100%;"><span style="font-family:verdana;"><span style="font-size: 72px;">Demo!</span></span></div>
UPDATED INFO: The text though is inside some json sent back as a string that I eventually need to parse into regular JSON, and the parsing fails.
The string looks something like this:
"{
"overlay": "overlay1",
"type": "text",
"text": "<div style="padding: 0%; width: 100%;"><span style="font-family:verdana;"><span style="font-size: 72px;">Demo!</span></span></div>"
}"
This is the regex so far that I have found (I know some regex stuff, just not a lot with look ahead or behind
/(?<=\<)(.*?)(?=\>)/g
But using that only gets me to retrieving this:
<div style="padding: 0%; width: 100%;"><span style="font-family:verdana;"><span style="font-size: 72px;">Demo!</span></span></div>
(basically just everything inside the <> characters. When I only really want to target the " inside the <>.)
Can anyone recommend a quick temporary fix? Thanks!
The following should work (as a temporary fix):
/"(?=[^<]*>)/g
This will match all double quotes where there are no < characters before the next >.
Try replacing the (.*?) in the center part with a ([^<>]*?)
You have to be careful with the dot operator.
Try this:
string = string.replace(/\"/g, "\\\"");
//--- EDIT---
var someString = "{
"overlay": "overlay1",
"type": "text",
"text": "<div style="padding: 0%; width: 100%;"><span style="font-family:verdana;"><span style="font-size: 72px;">Demo!</span></span></div>"
}";
someString = someString.replace(/"(?=[^<]*>)/g, "\\\""); //Props @F.J for this RegEx
Obj = $.parseJSON(someString);
console.log(Obj.text);
If the JSON is otherwise well-formed and you don't have any attribute-type syntax outside of tags, the following should work for a one-line fix:
var str = '<div style="padding: 0%; width: 100%;"><span style="font-family:verdana;"><span style="font-size: 72px;">Demo!</span></span></div>'
str.replace(/([\w-]+)=\"(.*?)\"/g, '$1=\\\"$2\\\"')
>> "<div style=\"padding: 0%; width: 100%;\"><span style=\"font-family:verdana;\"><span style=\"font-size: 72px;\">Demo!</span></span></div>"
That adds slashes before all HTML attributes wherever they appear (not necessarily inside of tags). If you need to target it better, do your first search to isolate tags then loop through each tag doing this regex replacement.
By the way, why do you say "I know it's not supposed to be done"? This is a perfect use for regular expressions!
If this is perl, you could do something like:
$string =~ s/(?!\\)"/\\"/g;