I search for the following lines coming from a messy HTML file:
<span id="fooPack1_xpl01_name11">150.00 FTL</span>
<span id="fooPack1_xpl02_name11">350.00 FTL</span>
<span id="fooPack1_xpl03_name11">250.00 FTL</span>
<span id="fooPack1_xpl04_name11">230.00 FTL</span>
I use BeautifulSoup and re to search and find the strings:
tags = soup.find_all('span', id=re.compile(r'[fooPack1_xpl04_name11]\d+'))
But obviously the common parts of that string are at the beginning and at the end, and the part that changes is always in the middle. How can I restructure my re search pattern so that it searches for "fooPack1_xpl" + (different string) + "_name11"?
Thanks.
// EDIT //
When I query the following:
<span id="FullView1_spl02_Stack_4">03/04/12</span>
<span id="FullView1_spl03_Stack_4">01/03/11</span>
<span id="FullView1_spl04_Stack_4">02/25/02</span>
<span id="FullView1_spl05_Stack_4">07/16/04</span>
<span id="FullView1_spl01_Stack32">999.00 SPL</span>
<span id="FullView1_spl02_Stack82">150.00 XPP</span>
<span id="FullView1_spl03_Stack82">350.00 XPP</span>
<span id="FullView1_spl04_Stack82">450.00 XPP</span>
<span id="FullView1_spl05_Stack82">550.00 XPP</span>
<span id="FullView1_spl06_Stack82">650.00 XPP</span>
<span id="FullView1_spl07_Stack22">888.00 SPL</span>
<span id="FullView1_spl202_stckFriendName">Red Car</span>
<span id="FullView1_spl203_stckFriendName">Green Car</span>
<span id="FullView1_spl204_stckFriendName">Blue Car</span>
with:
foo = soup.findAll('span', id=re.compile(r'FullView1_spl\d+_Stack82'))
I get the following result:
<span id="FullView1_spl204_stckFriendName">Blue Car</span>
<span id="FullView1_spl02_Stack82">150.00 XPP</span>
<span id="FullView1_spl03_Stack82">350.00 XPP</span>
<span id="FullView1_spl04_Stack82">450.00 XPP</span>
<span id="FullView1_spl05_Stack82">550.00 XPP</span>
<span id="FullView1_spl06_Stack82">650.00 XPP</span>
Obviously, I do not need the top element to be detected. So this is the only problem.
You're almost there. You want to search for fooPack1_xpl followed by digits followed by _name11, so how about:
re.compile(r'fooPack1_xpl\d+_name11')
Note that I just put a \d+ where you expect the digits, and kept the literal string you were searching for everywhere else.
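To see why the original pattern is too loose, here is a small sketch that runs both patterns against plain id strings using only the re module. The first four ids come from the question; the last one (fooPack1_xpl04_other7) is a made-up id added purely for illustration. Square brackets create a character class, so the original pattern matches any single listed character followed by digits rather than the literal text.

```python
import re

# The first four ids are from the question; the last one is a hypothetical
# id that should NOT be selected.
ids = [
    "fooPack1_xpl01_name11",
    "fooPack1_xpl02_name11",
    "fooPack1_xpl03_name11",
    "fooPack1_xpl04_name11",
    "fooPack1_xpl04_other7",  # hypothetical non-matching id
]

# Square brackets define a character class: this matches any ONE of the
# listed characters followed by one or more digits (e.g. "k1" or "l04").
loose = re.compile(r'[fooPack1_xpl04_name11]\d+')

# Literal prefix, one or more digits, literal suffix.
strict = re.compile(r'fooPack1_xpl\d+_name11')

loose_hits = [i for i in ids if loose.search(i)]
strict_hits = [i for i in ids if strict.search(i)]

print(loose_hits)   # every id, including the unwanted one
print(strict_hits)  # only the four ids from the question
```

The strict pattern can then be passed to BeautifulSoup exactly as in the question, i.e. soup.find_all('span', id=strict). If ids with extra text before or after the pattern must also be excluded, anchoring it as r'^fooPack1_xpl\d+_name11$' is one option.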