Need help turning a Perl regex into a JavaScript regex


2

I'm looking to mine out some goodies from a multi-line string of text. I'm comfortable with regexes in Perl (though I'm sure there is a better way than my code below), but I don't see how to use a captured group from the regexp as part of the newSubStr argument of replace() in JavaScript. Is there a way, or am I stuck running multiple replaces to ditch the audio and source lines?

$_ = <<END;
<audio controls="controls" preload="metadata">
   <source src="01.mp3" type="audio/mpeg">
   <source src="01.ogg" type="audio/ogg">
   Stuff
   Default: <a href="01.mp3">&gt;&gt;download</a>
</audio>
END

s#.*<source.*?>.*?\n(.*)\n</audio>.*#$1#s;

print "[$_]\n";

Multiple regexes in (my limited) JavaScript might look like this:

// We're really dependent on the HTML layout for line feeds
// so watch out.
var line = aElems[i].innerHTML.replace(/.*?audio.*?\n/gm, '');
var line2 = line.replace(/.*<source.*?\n/mg, '');
console.log(line2);
2012-04-04 17:37
by SidMuchRock
What are you trying to do? Do you have a bunch of HTML files sitting around, or are you trying to do this live in the browser? Why are you using JavaScript here? What have you tried so far - brian d foy 2012-04-04 17:47
Parsing html with regexes isn't a good idea - kirilloid 2012-04-04 17:47
@briandfoy I'm just trying to (essentially) destroy an audio tag, replacing it with part of its internal text. I do this when I discover the browser supports the audio tag but none of the available sources. See my related question at http://stackoverflow.com/q/10016079/1311457 - SidMuchRock 2012-04-04 18:02
The pony he comes... - NoName 2012-04-04 20:53


2

From reading both your questions it sounds like what you really want is to make the parent tag of your audio tag contain the innerHTML of your audio tag with the source elements removed.

A regexp would be error-prone, especially when you can use the DOM to get the same result with less effort.

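var audio_tag = ...;
// getElementsByTagName returns a live list, so walk it backwards while removing.
var elements_to_delete = audio_tag.getElementsByTagName('source');
for (var idx = elements_to_delete.length - 1; idx >= 0; --idx) {
    // Remove via the element's own parent in case a source is not a direct child.
    elements_to_delete[idx].parentNode.removeChild( elements_to_delete[idx] );
}
// Replace the parent's content (the audio element included) with what is left inside it.
audio_tag.parentNode.innerHTML = audio_tag.innerHTML;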
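
For reference, the capture-group substitution asked about in the question does exist in JavaScript: replace() accepts $1 in the replacement string. The following is only a rough sketch of the Perl one-liner, not a recommendation over the DOM approach. It assumes the markup arrives as a string laid out exactly like the sample in the question (the html, pattern, and inner names are made up for illustration), and it uses [\s\S] where the Perl version relies on the /s flag, since older JavaScript engines have no equivalent flag.

```javascript
// Sample markup from the question, held as a plain string (no DOM needed).
const html = [
  '<audio controls="controls" preload="metadata">',
  '   <source src="01.mp3" type="audio/mpeg">',
  '   <source src="01.ogg" type="audio/ogg">',
  '   Stuff',
  '   Default: <a href="01.mp3">&gt;&gt;download</a>',
  '</audio>',
  ''
].join('\n');

// Perl original: s#.*<source.*?>.*?\n(.*)\n</audio>.*#$1#s
// [\s\S]* plays the role of ".*" under /s (it also matches newlines).
// The greedy leading [\s\S]* backtracks to the LAST <source ...> line,
// and the parenthesised group captures everything up to "\n</audio>".
const pattern = /[\s\S]*<source.*?>.*?\n([\s\S]*)\n<\/audio>[\s\S]*/;

// $1 in the replacement string refers to the first captured group.
const inner = html.replace(pattern, '$1');

console.log('[' + inner + ']');
```

With the sample input this leaves only the "Stuff" and "Default: ..." lines. It is just as dependent on line feeds and tag order as the Perl version, which is why the DOM route above is the safer one. Newer engines also accept an s flag on the regex, which would let the pattern stay closer to the Perl original.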
2012-04-04 19:46
by Ven'Tatsu
Hmm yes, why didn't I think of that. As you note, that's much safer. I'll give that a shot and be back - SidMuchRock 2012-04-04 21:07
Works perfectly. It doesn't answer my question up above, but it answers the need I had to ask this question in the first place. We'll call it answered anyway. :) Thank you - SidMuchRock 2012-04-04 22:38


3

Although you say you want to use JavaScript, I thought I'd show you the non-regex Perl approach. The HTML::TokeParser::Simple module makes it pretty easy:

use HTML::TokeParser::Simple;
my $p = HTML::TokeParser::Simple->new( *DATA );

TOKEN: while( my $token = $p->get_token ) {
    if( $token->is_start_tag( 'audio' ) ){
        AUDIO: while( my $t = $p->get_token ) {
            next AUDIO if $t->is_tag( 'source' );
            last AUDIO if $t->is_end_tag( 'audio' );
            print $t->as_is;
            }
        next TOKEN;
        }

    print $token->as_is;
    }

__DATA__
<html>
<head><title>Test</title></head>
<body>
<p>Keep this</p>
<audio controls="controls" preload="metadata">
   <source src="01.mp3" type="audio/mpeg">
   <source src="01.ogg" type="audio/ogg">
   Stuff
   Default: <a href="01.mp3">&gt;&gt;download</a>
</audio>
<p>Keep this</p>
</body>
</html>

This gives:

<html>
<head><title>Test</title></head>
<body>
<p>Keep this</p>



   Stuff
   Default: <a href="01.mp3">&gt;&gt;download</a>

<p>Keep this</p>
</body>
</html>

There are other Perl modules that will correctly parse HTML and play with the structure, too.

For the JavaScript side, why don't you just replace the HTML? I know you asked a related question about this. It seems to me that something else should be generating the content inside the audio and should be able to give you something you like in this case. I'd back up a step and work on that. Or, you can explain much more about your problem.

2012-04-04 18:03
by brian d foy
I don't want to build the HTML from scratch (.shtml or some other solution) for each user. I have the page set; I just want to make modifications based on user preferences or abilities. I can do it easily with a few lines of regexp. I can do it easily with two for loops and some variables to let me know if we're in the section I want to store. But I hoped for a one-liner like I'm able to do in PERL. No worries, I've got workarounds - SidMuchRock 2012-04-04 18:47
@SidMuchRock - FYI, the name of the language is Perl and the name of the interpreter is perl - NoName 2012-04-04 20:55
Why would you build the HTML for every user? Based on your description of the problem, it sounds like some files link to non-existent resources. Just don't generate links to those missing files - brian d foy 2012-04-04 23:15
Nope it's not non-existent sources, but rather sources that aren't compatible with the user. Firefox supports