I'm developing a web app that has a quick-search feature for articles.
In short, the structure is:
So, when the user types a query into the popup quick-search field, the app pushes the matches to a temporary search results array (with cache). As you can see, the original array is not modified.
Currently I'm using a plain String.indexOf, but it cannot match text that is split up by HTML formatting tags (example below):
The question is about RegEx patterns. I clearly understand that using RegEx to manipulate the DOM is not recommended, and that the expected results below aren't semantically correct, but it fits my needs.
For example, say we have something like this:
<ul><li>Item <i><span style="color:red">Y</span></i></li></ul>
and we need to highlight the query e, expecting the result ... It<em>e</em>m ..., but a trivial replace(/e/ig, '<em>$&</em>') will also replace the e in style="color:red".
I.e., what RegEx pattern will leave the text inside tags untouched?
Second example: we need to highlight Item Y, so the expected result is <ul><li><em>Item <i><span style="color:red">Y</em></span></i></li></ul>
If I understood correctly, you need to search within the text contents of a fragment of a DOM tree. One way of achieving this is to use the XML/HTML text contents. This example makes use of jQuery, but the idea is easily portable to other libs:
HTML:
<div id="article_contents">
Blah blah blah, Item 1, Item 2 blah blah <b>Ite</b>m <span>1</span> blah blah
</div>
JavaScript:
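// Text content of the article, with all markup stripped
var source = jQuery('#article_contents').text();
// Global flag so that every occurrence is matched, not only the first
var queryRegexp = new RegExp('Item 1', 'g');
var results = source.match(queryRegexp);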
Now results will hold all occurrences of your search string. Of course, to achieve your goal of highlighting the results you must go a few steps further (for example, using RegExp.exec to get the offsets of the matches).
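To illustrate the RegExp.exec step mentioned above, here is a minimal sketch that collects the start and end offset of every match. The findOffsets helper and the hard-coded source string are assumptions for illustration only; in the page, the string would come from jQuery('#article_contents').text().

```javascript
// Stand-in for the text that .text() would return for the HTML above
// (assumption: markup stripped, surrounding whitespace ignored).
var source = 'Blah blah blah, Item 1, Item 2 blah blah Item 1 blah blah';
var queryRegexp = new RegExp('Item 1', 'g');

// Hypothetical helper: returns the offsets of all matches of a global regexp.
function findOffsets(text, regexp) {
  var offsets = [];
  var match;
  regexp.lastIndex = 0; // start from the beginning of the text
  while ((match = regexp.exec(text)) !== null) {
    offsets.push({ start: match.index, end: match.index + match[0].length });
    // Guard against an endless loop if the pattern can match an empty string.
    if (match[0].length === 0) {
      regexp.lastIndex++;
    }
  }
  return offsets;
}

var offsets = findOffsets(source, queryRegexp);
```

Note that these offsets refer to positions in the plain text, not in the HTML source, so they still have to be mapped back onto the DOM text nodes before anything can be highlighted.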
A short, hackish solution is to allow for markup between every single letter of the search string. If your keyword is "search", it would look like this:
(s)(<[^>]*>)*(e)(<[^>]*>)*(a)(<[^>]*>)*(r)(<[^>]*>)*(c)(<[^>]*>)*(h)
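Writing such a pattern by hand gets tedious, so it could be generated from the keyword. The following is only a sketch of that idea: escapeRegExp and buildTagTolerantRegExp are hypothetical helper names, and the tag groups are made non-capturing here so that only the letters are captured.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: allow any number of tags between consecutive characters.
function buildTagTolerantRegExp(keyword) {
  var letters = keyword.split('').map(function (ch) {
    return '(' + escapeRegExp(ch) + ')';
  });
  return new RegExp(letters.join('(?:<[^>]*>)*'), 'ig');
}

var pattern = buildTagTolerantRegExp('Item 1');
var html = 'blah <b>Ite</b>m <span>1</span> blah';
var found = html.match(pattern);
```

With this input, found should contain the single match Ite</b>m <span>1, i.e. the keyword together with the tags that interrupt it. The pattern still assumes that no > character appears inside an attribute value.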
But in reality you need to do more than that, because:
some of the text may be hidden with display:none, visibility:hidden, etc.
the markup may look like this: these <tag> are </tag> my <i><b>s</b>earch keywords</i>
and if you're supposed to wrap my search in <span> tags in that markup (without actually wrapping every single character), you'll end up with a closing tag in the middle of some other tag, on a different DOM tree level.
- Silviu-Marian 2012-04-04 23:09
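The nesting problem described in this comment can be reproduced with a few lines. This is only an illustrative sketch: the pattern is built inline from the phrase "my search" (which contains no RegEx metacharacters, so no escaping is done), and the variable names are made up for the example.

```javascript
var markup = 'these <tag> are </tag> my <i><b>s</b>earch keywords</i>';

// Allow any number of tags between consecutive characters of the phrase.
var pattern = new RegExp('my search'.split('').join('(?:<[^>]*>)*'), 'i');

// Wrap the whole match, tags included, in a single <span>.
var wrapped = markup.replace(pattern, '<span>$&</span>');
// wrapped is now:
// these <tag> are </tag> <span>my <i><b>s</b>earch</span> keywords</i>
```

The <span> opens outside the <i> element but closes inside it, so the result is not well-nested markup.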