HTML::TreeBuilder::LibXML(3pm)				User Contributed Perl Documentation			    HTML::TreeBuilder::LibXML(3pm)

NAME
HTML::TreeBuilder::LibXML - HTML::TreeBuilder and XPath compatible interface with libxml

SYNOPSIS
    use HTML::TreeBuilder::LibXML;

    my $tree = HTML::TreeBuilder::LibXML->new;
    $tree->parse($html);
    $tree->eof;

    # $tree and $node are compatible with HTML::Element
    # findnodes returns node objects (findvalue would return a string)
    my @nodes = $tree->findnodes($xpath);
    for my $node (@nodes) {
        print $node->tag;
        my %attr = $node->all_external_attr;
    }

    HTML::TreeBuilder::LibXML->replace_original(); # replace HTML::TreeBuilder::XPath->new

DESCRIPTION
HTML::TreeBuilder::LibXML is a libxml-based interface compatible with HTML::TreeBuilder::XPath, which can be slow for large documents.

HTML::TreeBuilder::LibXML is a drop-in replacement for HTML::TreeBuilder::XPath.

This module does not implement all of the HTML::TreeBuilder and HTML::Element APIs, but it defines enough methods for modules such as Web::Scraper to work.

BENCHMARK
This is a benchmark result by tools/benchmark.pl:

    Web::Scraper: 0.26
    HTML::TreeBuilder::XPath: 0.09
    HTML::TreeBuilder::LibXML: 0.01_01

                 Rate  no_libxml use_libxml
    no_libxml  5.45/s         --       -94%
    use_libxml 94.3/s      1632%         --

AUTHOR
Tokuhiro Matsuno <tokuhirom slkjfd gmail.com>

Tatsuhiko Miyagawa <miyagawa@cpan.org>

Masahiro Chiba

THANKS TO
woremacx++ http://d.hatena.ne.jp/woremacx/20080202/1201927162

id:dailyflower

SEE ALSO
HTML::TreeBuilder, HTML::TreeBuilder::XPath

LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.14.2				2012-04-02				    HTML::TreeBuilder::LibXML(3pm)
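
As a rough illustration of the interface described in the SYNOPSIS and DESCRIPTION above, the following sketch parses a small document and walks the matching nodes. The sample markup and the helper name paragraph_summaries are hypothetical, and the sketch assumes that findnodes, tag, attr and as_text behave as they do in HTML::TreeBuilder::XPath and HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical helper: one summary line per <p> element in the document.
sub paragraph_summaries {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder::LibXML->new;
    $tree->parse($html);
    $tree->eof;

    my @lines;
    # findnodes yields node objects; findvalue would yield the matched text.
    for my $node ($tree->findnodes('//p')) {
        my $class = $node->attr('class');
        push @lines, sprintf '%s: %s (class=%s)',
            $node->tag, $node->as_text, defined $class ? $class : 'none';
    }
    return @lines;
}

# Hypothetical sample markup, not taken from the module's documentation.
my $html = '<html><body><p class="intro">Hello</p><p>World</p></body></html>';
print "$_\n" for paragraph_summaries($html);
```

Calling HTML::TreeBuilder::LibXML->replace_original(), as in the SYNOPSIS, should let code written against HTML::TreeBuilder::XPath->new use the same libxml-backed trees without further changes.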

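The comparison in the BENCHMARK section resembles the chart printed by cmpthese from Perl's core Benchmark module, where each cell gives the row's rate relative to the column's rate as a percentage. Whether tools/benchmark.pl uses cmpthese is an assumption. The sketch below recomputes those percentages from the rounded rates in the table; the helper name percent_vs is hypothetical.

```perl
use strict;
use warnings;

# Relative difference of one rate against another, as a percentage.
sub percent_vs {
    my ($rate, $other) = @_;
    return 100 * ($rate / $other - 1);
}

# Rounded rates copied from the BENCHMARK table (iterations per second).
my %rate = (no_libxml => 5.45, use_libxml => 94.3);

printf "use_libxml vs no_libxml: %+.0f%%\n",
    percent_vs($rate{use_libxml}, $rate{no_libxml});
printf "no_libxml vs use_libxml: %+.0f%%\n",
    percent_vs($rate{no_libxml}, $rate{use_libxml});
printf "speedup: %.1fx\n", $rate{use_libxml} / $rate{no_libxml};
```

With the rounded rates this gives about +1630% and -94%, a speedup of roughly 17x. The small gap from the published 1632% is presumably because the original figure was computed from unrounded rates.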

WWW::Mechanize::TreeBuilder(3pm)			User Contributed Perl Documentation			  WWW::Mechanize::TreeBuilder(3pm)

NAME
WWW::Mechanize::TreeBuilder - Module to optimize WWW::Mechanize and HTML::TreeBuilder use

SYNOPSIS
    use Test::More tests => 2;
    use Test::WWW::Mechanize;

    use WWW::Mechanize::TreeBuilder;
    # or
    # use WWW::Mechanize;
    # or
    # use Test::WWW::Mechanize::Catalyst 'MyApp';

    my $mech = Test::WWW::Mechanize->new;
    # or
    #my $mech = Test::WWW::Mechanize::Catalyst->new;
    # etc. etc.
    WWW::Mechanize::TreeBuilder->meta->apply($mech);

    $mech->get_ok('/');
    is( $mech->look_down(_tag => 'p')->as_trimmed_text, 'Some text', 'It worked' );

DESCRIPTION
This module combines WWW::Mechanize and HTML::TreeBuilder. Why? Because I've seen too much code like the following:

    like($mech->content, qr{<p>some text</p>}, "Found the right tag");

Which is just all flavours of wrong - it's akin to processing XML with regexps. Instead, do it like the following:

    ok($mech->look_down(_tag => 'p', sub { $_[0]->as_trimmed_text eq 'some text' }));

The anon-sub there is a bit icky, but this means that if anyone should happen to add attributes to the "<p>" tag (such as an id or a class), it will still work and find the right tag.

All of the methods available on HTML::Element (that aren't 'private' - i.e. that don't begin with an underscore) such as "look_down" or "find" are automatically delegated to "$mech->tree" through the magic of Moose.

METHODS
Everything in WWW::Mechanize (or whichever subclass you apply it to) and all public methods from HTML::Element, except those where WWW::Mechanize and HTML::Element overlap.

In the case where the two classes both define a method, the one from WWW::Mechanize will be used (so that the existing behaviour of Mechanize doesn't break).

USING XPATH OR OTHER SUBCLASSES
HTML::TreeBuilder::XPath allows you to use XPath selectors to select elements in the tree. You can use that module by providing parameters to the Moose role:

    with 'WWW::Mechanize::TreeBuilder' => {
        tree_class => 'HTML::TreeBuilder::XPath'
    };

    # or

    # NOTE: No hashref using this method
    WWW::Mechanize::TreeBuilder->meta->apply($mech,
        tree_class => 'HTML::TreeBuilder::XPath'
    );

and the class will be automatically loaded for you. This class will be used to construct the tree in the following manner:

    $tree = $tree_class->new_from_content($req->decoded_content)->elementify;

You can also specify an "element_class" parameter, which is the (HTML::Element sub)class that methods are proxied from. This module provides defaults for element_class when "tree_class" is "HTML::TreeBuilder" or "HTML::TreeBuilder::XPath" - it will warn otherwise.

AUTHOR
Ash Berlin <ash@cpan.org>

LICENSE
Same as Perl 5.8, or at your option any later version of Perl.

perl v5.10.1				2010-12-16				  WWW::Mechanize::TreeBuilder(3pm)
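
The DESCRIPTION's argument against matching markup with regexps can be seen in isolation with plain HTML::TreeBuilder, whose look_down method is what WWW::Mechanize::TreeBuilder delegates to. This is a minimal sketch: the sample markup and the two helper names are hypothetical, and no Mechanize object or network access is involved.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical page content: the <p> tag has gained a class attribute.
my $html = '<html><body><p class="lead">some text</p></body></html>';

# True if a regexp written against the bare tag still matches.
sub found_by_regexp {
    my ($content) = @_;
    return $content =~ m{<p>some text</p>} ? 1 : 0;
}

# True if the parsed tree contains a <p> whose trimmed text matches.
sub found_by_look_down {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my $hit  = $tree->look_down(
        _tag => 'p',
        sub { $_[0]->as_trimmed_text eq 'some text' },
    );
    my $ok = defined $hit ? 1 : 0;
    $tree->delete;    # release the tree once we are done with it
    return $ok;
}

print "regexp:    ", found_by_regexp($html), "\n";
print "look_down: ", found_by_look_down($html), "\n";
```

The regexp check fails as soon as the tag carries an attribute, while the tree-based check still finds the paragraph.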