I'm trying to build an expression from an XML document. Reading from the top node, I want to push the nodes one by one onto a stack, and once I hit a closing tag I want to pop all the elements off the stack. How do I detect the end of a tag?
TIA,
John
Answer:
OK, I think I have the solution, using a recursive function like this:
def findTextNodes(nodeList):
    for subnode in nodeList:
        if subnode.nodeType == subnode.ELEMENT_NODE:
            print("element node: ", subnode.tagName)
            # call function again to get children
            findTextNodes(subnode.childNodes)
            print('subnode return: ', subnode.tagName)
        elif subnode.nodeType == subnode.TEXT_NODE:
            print("text node: ", subnode.data)
When the 'subnode return' line is printed, the recursion has come back from that element's children, which corresponds to hitting its closing tag.
Thanks everybody!
minidom builds the whole DOM in memory, so it will not notify you when an end tag is encountered.
1) You can consider switching to pyexpat (http://docs.python.org/library/pyexpat.html) and using xmlparser.EndElementHandler to watch for end tags. You will also need StartElementHandler to build your stack.
2) Take advantage of the DOM tree that minidom produces: just select the nodes you need from it, without using a stack at all.
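Option 1 could look roughly like the sketch below. It assumes the standard xml.parsers.expat module; the function name trace_stack and the events list are hypothetical, added only to make the stack changes visible.

```python
import xml.parsers.expat


def trace_stack(xml_text):
    """Parse xml_text and record the stack contents after every start/end tag."""
    stack = []   # names of the currently open elements
    events = []  # (kind, tag name, snapshot of the stack) tuples

    def start_element(name, attrs):
        # Opening tag: push the element name.
        stack.append(name)
        events.append(("start", name, list(stack)))

    def end_element(name):
        # Closing tag: this is where the end of a tag is detected.
        popped = stack.pop()
        events.append(("end", popped, list(stack)))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return events


events = trace_stack("<a><b>x</b><c/></a>")
for kind, name, snapshot in events:
    print(kind, name, snapshot)
```

Here each closing tag pops a single element; if you need to pop everything at once, the end_element handler is the place to do it.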
minidom builds a DOM. There aren't tags in a DOM, as the XML has been fully parsed into nodes. A node in the DOM represents the entire XML element.
What it sounds like you want are simply the node's children (or children of type ELEMENT_NODE perhaps).
Since you're talking about pushing them onto a stack and popping them off again, it sounds like you want them in the reverse of the order in which they appear in the document. In that case you probably want something like reversed([child for child in node.childNodes if child.nodeType == child.ELEMENT_NODE]).
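As a small illustration of that expression, here is a sketch using xml.dom.minidom. The helper name child_elements_reversed and the sample document are made up for the example.

```python
from xml.dom import minidom


def child_elements_reversed(node):
    """Return the direct element children of node, last one first."""
    elements = [child for child in node.childNodes
                if child.nodeType == child.ELEMENT_NODE]
    return list(reversed(elements))


# Sample document: three element children plus one text node, which is skipped.
doc = minidom.parseString("<root><a/>text<b/><c/></root>")
names = [el.tagName for el in child_elements_reversed(doc.documentElement)]
print(names)  # ['c', 'b', 'a']
```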
If you want all children (including the node's children's children and so on) then a recursive solution is simplest.
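One possible shape for such a recursive solution is a generator that yields every descendant element in document order. This is only a sketch; iter_descendant_elements is a hypothetical name, not part of minidom.

```python
from xml.dom import minidom


def iter_descendant_elements(node):
    """Yield every element below node, depth first, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            yield child
            # Recurse so the child's own children (and theirs) are included.
            yield from iter_descendant_elements(child)


doc = minidom.parseString("<root><a><b/></a><c/></root>")
descendant_names = [el.tagName
                    for el in iter_descendant_elements(doc.documentElement)]
print(descendant_names)  # ['a', 'b', 'c']
```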