I've had some help from the community replacing the gifs and URLs in an HTML table with useful data before chucking it into a 2D array, but I think what I actually need is to store each row of the table as a hash in an ActiveRecord entry.
Here's the first row of sample data with headers:
html2 = <<TABLE2
<table class="status">
<caption class="status">Drive status</caption>
<tr class="status">
<th class="status"></th>
<th class="status">Drive</th>
<th class="status">State</th>
<th class="status">Health</th>
<th class="status">Make/Model</th>
<th class="status">Speed</th>
<th class="status">Serial</th>
<th class="status">Firmware</th>
<th class="status"><a href="/cgi-bin/status_dylan?cont=0&dylan=0&display=1">Sectors</a></th>
<th class="status">Temp</th>
<th class="status"> </th>
</tr>
<tr class="status">
<td class="status"><img border="0" src="/tick_green.gif"></td>
<td class="status">0</td>
<td class="status">Ready</td>
<td class="status"><a href="/cgi-bin/status_drive?cont=0&dylan=0&drive=0"><img border="0" src="/bar10.gif"></a></td>
<td class="status">SEAGATE ST3146807FC</td>
<td class="status">10000 RPM</td>
<td class="status">3HY61E1B</td>
<td class="status">XR12</td>
<td class="status">286749488</td>
<td class="status"> 29.0°C</td>
<td class="status" style="background-color: #fefe00"> 
</td>
</tr>
</table>
TABLE2
require 'nokogiri'

# Parse the sample markup so the rows can be walked with CSS selectors
table2 = Nokogiri::HTML(html2)

clean_table2 = []
table2.css('tr').each do |tr|
  clean_row = []
  tr.css('td').each do |td|
    # For each cell, look for img tags, replace the images with text as appropriate, then strip the HTML
    img = td.at('img')
    clean_row.push case
      when img && img[:src][/bar(\d+)\.gif/] then 'Health: ' + $1
      when img && img[:src][/tick_green/] then 'Healthy'
      when img && img[:src][/cross_red/] then 'Failed'
      when img && img[:src][/caution/] then 'Caution'
      else td.text.strip
    end
  end
  clean_table2.push clean_row
  #puts clean_row[5]
end
puts "\n"
#puts clean_table.join("\n")
clean_table2.each { |x| puts "#{x}" }
The code above strips everything unimportant and replaces the unhelpful gifs with meaningful text, but the 2D array I'm creating isn't as useful as I'd hoped, so I would rather create a hash using the table headers as keys. Then I can feed this, along with server serial numbers and array addresses, into an ActiveRecord entry so that I can compare and display deltas between instances of the records (for example, if the drive health drops from 10 to 5).
What do you all think? I can compare the arrays, but since record retrieval is fast, I think I can store only the distinct changes rather than a whole 2D array every time something changes (which I think would rapidly get out of control).
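As a rough sketch of the header-keyed hash and the delta comparison, something like this might work. It is plain Ruby with no Nokogiri or ActiveRecord involved; `HEADERS`, `row_to_hash` and `delta` are made-up names for illustration, and the cell values are a shortened version of the sample row, with the second snapshot invented to show a health drop.

```ruby
# Hypothetical header list, shortened from the sample table.
HEADERS = ['Drive Health', 'Drive', 'State', 'Health', 'Make/Model'].freeze

# Pair each header with the cell in the same column position.
def row_to_hash(headers, cells)
  headers.zip(cells).to_h
end

# Return only the keys whose values differ, as { key => [old, new] }.
def delta(old_row, new_row)
  (old_row.keys | new_row.keys).each_with_object({}) do |key, changes|
    changes[key] = [old_row[key], new_row[key]] if old_row[key] != new_row[key]
  end
end

# Two snapshots of the same drive; only the health value differs.
before = row_to_hash(HEADERS, ['Healthy', '0', 'Ready', 'Health: 10', 'SEAGATE ST3146807FC'])
after  = row_to_hash(HEADERS, ['Healthy', '0', 'Ready', 'Health: 5',  'SEAGATE ST3146807FC'])

puts delta(before, after).inspect
```

An empty result from `delta` would mean nothing changed, so no new record needs to be written for that drive.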
As you can probably guess, I'm still trying to get this straight in my own head ;) Many thanks, Scott
Rewritten slightly, and made a bit more logical:
table = html_page.parser.xpath('//table/caption[contains(.,"Drive")]/..')
# Loop through each row individually (or do I want to chuck the whole thing into a nice juicy hash?)
# Am I using this? #REMOVE
clean_table = Array.new
clean_head = []
table.css('tr').each do |tr|
  # Stash WWN number, fake interface and fake address [can get, but not needed at this stage]
  # NOTE: cells is not defined anywhere in this snippet; it has to be set earlier for this line to run
  clean_row = { :wwn => cells[0], :dyl_if => '1', :dyl_addr => '0' }
  # Grab headers, naming the two blank ones (first and last columns)
  tr.css('th').each_with_index do |th, i|
    clean_head.push case i
      when 0  then 'Drive Health'
      when 10 then 'BG Temp'
      else th.text.strip
    end
  end
  # Each td in each tr - add index so I can use the table headers as keys in the hash
  tr.css('td').each_with_index do |td, i|
    # For each cell, look for img tags, replace the images with text as appropriate, then strip the HTML
    img = td.at('img')
    clean_row[clean_head[i]] = case
      when img && img[:src][/bar(\d+)\.gif/] then 'Health: ' + $1
      when img && img[:src][/tick_green/] then 'Healthy'
      when img && img[:src][/cross_red/] then 'Failed'
      when img && img[:src][/caution/] then 'Caution'
      else td.text.strip
    end
  end
  # Debug output - confirm nothing cocked up
  puts clean_row
  # The header row has no td cells, so it never gets a "Health" key and is skipped here
  if clean_row.has_key?('Health')
    Drive_Record.create(clean_row)
    puts 'Add Drive Record'
  end
end
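One thing that may still trip this up: the hash keys are raw header strings such as "Make/Model" and "BG Temp", which will only work with `create` if the model's attributes are named exactly that. A possible sketch, assuming the columns use snake_case names, is to normalise the keys first and only store a row when it differs from the last one seen for that serial. `column_key`, `normalise_row` and `store_if_changed` are hypothetical helpers, and the plain hash stands in for the database.

```ruby
# Turn a header such as "Make/Model" into a column-style symbol (:make_model).
def column_key(header)
  header.to_s.strip.downcase.gsub(/[^a-z0-9]+/, '_').gsub(/\A_+|_+\z/, '').to_sym
end

# Rebuild a row hash with normalised keys, leaving the values untouched.
def normalise_row(row)
  row.each_with_object({}) { |(key, value), out| out[column_key(key)] = value }
end

# Hypothetical in-memory stand-in for the table: serial => last stored row.
# Returns true when the row was stored, false when nothing had changed.
def store_if_changed(store, row)
  serial = row[:serial]
  return false if store[serial] == row
  store[serial] = row
  true
end

store = {}
row = normalise_row(:wwn => 'unknown', 'Serial' => '3HY61E1B', 'Health' => 'Health: 10', 'Make/Model' => 'SEAGATE ST3146807FC')
puts store_if_changed(store, row)                               # first sighting is stored
puts store_if_changed(store, row)                               # identical row is skipped
puts store_if_changed(store, row.merge(health: 'Health: 5'))    # changed row is stored
```

With a real model, the `store` lookup would presumably become a query for the most recent record with that serial.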