Can I return a parsed XML path with XML::Simple?

Go To StackoverFlow.com

2

I have an XML file that has the following format:

<testsuite name="Conformance">
 <testsuite name="Manageability">
  <testsuite name="MIBs">
   <testcase internalid="1" name="name1">...</testcase>
   <testcase internalid="2" name="name2">...</testcase>
  </testsuite>
 </testsuite>
</testsuite>

With Perl's XML::Simple I'm trying to get a list of testcases and their path so in this case the result would be:

Conformance/Manageability/MIBs
    name1
    name2

Can I do this with XML::Simple and if so what would the call look like?

My current script:

use strict;
use warnings;
use Data::Dumper;
#use XML::Twig;
use XML::Simple;

my $file = 'test.xml';

my $ref = XMLin($file);

print Dumper($ref);

I've tried several things but can't seem to get what I need. Is it easier to parse the data structure returned to get what I need?

2012-04-04 20:00
by user1216398
Why advice is that you should stop using XML::Simple when you run into the first problem using it. It's time to move up to something that doesn't make so many decisions for you - brian d foy 2012-04-07 20:51


2

Recursion is a perfect fit here.

use strict;
use warnings;
use XML::LibXML qw( );

sub visit_testsuite {
   my ($testsuite_node, $parent_path) = @_;

   my $name = $testsuite_node->getAttribute('name');
   my $path = defined($parent_path) ? "$parent_path/$name" : $name;

   my @testcase_nodes = $testsuite_node->findnodes('testcase');
   if (@testcase_nodes) {
      print("$path\n");
      for my $testcase_node (@testcase_nodes) {
         printf("   %s\n", $testcase_node->getAttribute('name'));
      }
      print("\n");
   }

   for my $testsuite_child ($testsuite_node->findnodes('testsuite')) {
      visit_testsuite($testsuite_child, $path);
   }
}


my $doc  = XML::LibXML->load_xml( IO => \*DATA );
my $root = $doc->documentElement();

visit_testsuite($root);

__DATA__

<testsuite name="Conformance">
 <testsuite name="Manageability">
  <testsuite name="MIBs">
   <testcase internalid="1" name="name1">...</testcase>
   <testcase internalid="2" name="name2">...</testcase>
  </testsuite>
 </testsuite>
</testsuite>

The root node really shouldn't be a testsuite node, but it's what you said you had.

2012-04-04 22:16
by ikegami
That does the trick...thanks - user1216398 2012-04-05 12:55


2

Using XML::Simple? Listen to what the author of that module has to say:

However I'd recommend not using XML::Simple (and I should know - I wrote it). I personally use XML::LibXML.

Source: RE: Help with accessing an unknown set of data generated by XML::Simple

Do yourself a favour and learn the proper way, which most of the time means XML::LibXML. That's the C library which is also used in PHP, Python and Ruby. Compiles on very UNIX and WINDOWS. Portable. Fast. Standard APIs. The way to go.

2012-04-04 21:37
by Lumi


2

Since you tried with XML::Twig, here is a solution for it. When it finds a testcase it checks whether it's the first one in the testsuite, if it is it prints the path, by using the ancestors of the element. Then it prints the name of the test case.

2 notes: a testcase is the first one if it does not have a previous testcase sibling, and ancestors returns the ancestors of the elements from the inner one (the element parent) to the outer one (the root), so in this case we need to reverse the list to get them in the desired order.

Voilà:

#!/usr/bin/perl

use strict;
use warnings;

use XML::Twig;

XML::Twig->new( twig_handlers => { testcase => \&test_case })
         ->parse( \*DATA);

sub test_case
  { my( $t, $test_case)= @_;
    if( ! $test_case->prev_sibling( 'testcase'))
      { # first test case, output the "path"
        print join( '/', map { $_->att( 'name') } reverse $test_case->ancestors( 'testsuite')), "\n";
      }
    print "    ", $test_case->att( 'name'),"\n";
  }

__DATA__
<testsuite name="Conformance">
 <testsuite name="Manageability">
  <testsuite name="MIBs">
   <testcase internalid="1" name="name1">...</testcase>
   <testcase internalid="2" name="name2">...</testcase>
  </testsuite>
 </testsuite>
</testsuite>
2012-04-05 07:55
by mirod


0

XML::Simple violates the "make everything as simple as possible, not simpler" in almost all but the simplest cases.

It looks like I misunderstood your requirements the first time, so here is another way -- however, I expect it to do much worse than @ikegami's solution because it first finds all testcase nodes, and then traces back to their parents.

#!/usr/bin/env perl

use strict; use warnings;
use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new(ioref => \*DATA);

my $nodeset = $xp->find('//testcase');

my %cases;

foreach my $node ($nodeset->get_nodelist) {
    my $current = $node;
    my @parents;

    while (defined(my $parent = $current->getParentNode)) {
        my $name = $parent->getAttribute('name');
        last unless defined $name;
        push @parents, $name;
        $current = $parent;
    }

    my $path = join('/', '', reverse @parents);

    push @{ $cases{ $path } }, $node->getAttribute('name');
}

for my $path (sort keys %cases) {
    print "$path\n";
    for my $case (sort @{ $cases{$path} }) {
        print "\t$case\n";
    }
}


__DATA__
<testsuite name="Conformance">
 <testsuite name="Manageability">
  <testsuite name="MIBs">
   <testcase internalid="1" name="name1">...</testcase>
   <testcase internalid="2" name="name2">...</testcase>
  </testsuite>
 </testsuite>
 <testsuite name="Yabadabadoo">
  <testsuite name="Da da da">
   <testcase internalid="1" name="name1">...</testcase>
   <testcase internalid="2" name="name2">...</testcase>
  </testsuite>
 </testsuite>
</testsuite>

Output:

/Conformance/Manageability/MIBs
        name1
        name2
/Conformance/Yabadabadoo/Da da da
        name1
        name2
2012-04-04 21:27
by Sinan Ünür
The question was about outputting "Conformance/Manageability/MIBs" to - ikegami 2012-04-04 22:04
Ads