MKDoc::XML(3pm) 					User Contributed Perl Documentation					   MKDoc::XML(3pm)

NAME
MKDoc::XML - The MKDoc XML Toolkit

SYNOPSIS
This is an article, not a module.

SUMMARY
MKDoc is a web content management system written in Perl which focuses on standards compliance, accessibility and usability issues, and multi-lingual websites.

At MKDoc Ltd we have decided to gradually break up our existing commercial software into a collection of completely independent, well-documented, well-tested open-source CPAN modules.

Ultimately we want MKDoc code to be a coherent collection of module distributions, yet each distribution should be usable and useful in itself.

MKDoc::XML is part of this effort.

You could help us by turning some of MKDoc's code into a CPAN module. You can take a look at the existing code at http://download.mkdoc.org/.

If you are interested in some functionality which you would like to see as a standalone CPAN module, send an email to <mkdoc-modules@lists.webarch.co.uk>.

DISCLAIMER
MKDoc::XML is a low-level XML library.

MKDoc::XML::* modules do not make sure your XML is well-formed.

MKDoc::XML::* modules can be used to work with somewhat broken XML.

MKDoc::XML::* modules should not be used as high-level parsers for general-purpose XML unless you know what you're doing.

WHAT'S IN THE BOX
  XML tokenizer
    MKDoc::XML::Tokenizer splits your XML / XHTML files into a list of MKDoc::XML::Token objects using a single regex.

  XML tree builder
    MKDoc::XML::TreeBuilder sits on top of MKDoc::XML::Tokenizer and builds parsed trees out of your XML / XHTML data.

  XML stripper
    MKDoc::XML::Stripper objects remove unwanted markup from your XML / HTML data. Useful for removing all those nasty presentational tags or 'style' attributes from your XHTML data, for example.

  XML tagger
    The MKDoc::XML::Tagger module matches expressions in XML / XHTML documents and tags them appropriately. For example, you could automatically hyperlink certain glossary words or add <abbr> tags based on a dictionary of abbreviations and acronyms.

  XML entity decoder
    MKDoc::XML::Decode is a pluggable, configurable entity expander module which currently supports HTML entities, numeric entities and basic XML entities.

  XML entity encoder
    MKDoc::XML::Encode performs the exact reverse operation of MKDoc::XML::Decode.

  XML Dumper
    MKDoc::XML::Dumper serializes arbitrarily complex Perl structures into XML strings. It is also capable of the reverse operation, i.e. deserializing an XML string into a Perl structure.

AUTHOR
Copyright 2003 - MKDoc Holdings Ltd.

Author: Jean-Michel Hiver

This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.

SEE ALSO
Petal: http://search.cpan.org/dist/Petal/
MKDoc: http://www.mkdoc.com/

Help us open-source MKDoc. Join the mkdoc-modules mailing list:

mkdoc-modules@lists.webarch.co.uk

perl v5.10.1 2005-03-10 MKDoc::XML(3pm)

MKDoc::XML::Token(3pm)					User Contributed Perl Documentation				    MKDoc::XML::Token(3pm)

NAME
MKDoc::XML::Token - XML Token Object

SYNOPSIS
    my $tokens = MKDoc::XML::Tokenizer->process_data ($some_xml);
    foreach my $token (@{$tokens})
    {
        print "'" . $token->as_string() . "' is text\n" if (defined $token->text());
        print "'" . $token->as_string() . "' is a self closing tag\n" if (defined $token->tag_self_close());
        print "'" . $token->as_string() . "' is an opening tag\n" if (defined $token->tag_open());
        print "'" . $token->as_string() . "' is a closing tag\n" if (defined $token->tag_close());
        print "'" . $token->as_string() . "' is a processing instruction\n" if (defined $token->pi());
        print "'" . $token->as_string() . "' is a declaration\n" if (defined $token->declaration());
        print "'" . $token->as_string() . "' is a comment\n" if (defined $token->comment());
        print "'" . $token->as_string() . "' is a tag\n" if (defined $token->tag());
        print "'" . $token->as_string() . "' is a pseudo-tag (NOT text and NOT tag)\n" if (defined $token->pseudotag());
        print "'" . $token->as_string() . "' is a leaf token (NOT opening tag)\n" if (defined $token->leaf());
    }

SUMMARY
MKDoc::XML::Token is an object representing an XML token produced by MKDoc::XML::Tokenizer.

It has a set of methods to identify the type of token it is, as well as to help build a parsed tree as in MKDoc::XML::TreeBuilder.

API
  my $token = new MKDoc::XML::Token ($string_token);
    Constructs a new MKDoc::XML::Token object.

  my $string_token = $token->as_string();
    Returns the string representation of this token so that:

        MKDoc::XML::Token->new ($token)->as_string eq $token

    is a tautology.

  my $node = $token->leaf();
    If this token is not an opening tag, this method will return its corresponding node structure as returned by $token->text(), $token->tag_self_close(), etc.

    Returns undef otherwise.

  my $node = $token->pseudotag();
    If this token is a comment, declaration or processing instruction, this method will return $token->comment(), $token->declaration() or $token->pi() respectively.

    Returns undef otherwise.

  my $node = $token->tag();
    If this token is an opening, closing, or self-closing tag, this method will return $token->tag_open(), $token->tag_close() or $token->tag_self_close() respectively.

    Returns undef otherwise.

  my $node = $token->comment();
    If this token object represents a comment, the following structure is returned:

        # this is <!-- I like Pie. Pie is good -->
        {
            _tag => '~comment',
            text => ' I like Pie. Pie is good ',
        }

    Returns undef otherwise.

  my $node = $token->declaration();
    If this token object represents a declaration, the following structure is returned:

        # this is <!DOCTYPE foo>
        {
            _tag => '~declaration',
            text => 'DOCTYPE foo',
        }

    Returns undef otherwise.

  my $node = $token->pi();
    If this token object represents a processing instruction, the following structure is returned:

        # this is <?xml version="1.0" charset="UTF-8"?>
        {
            _tag => '~pi',
            text => 'xml version="1.0" charset="UTF-8"',
        }

    Returns undef otherwise.

  my $node = $token->tag_open();
    If this token object represents an opening tag, the following structure is returned:

        # this is <aTag foo="bar" baz="buz">
        {
            _tag   => 'aTag',
            _open  => 1,
            _close => 0,
            foo    => 'bar',
            baz    => 'buz',
        }

    Returns undef otherwise.
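The node returned by tag_open() keeps the tag's attributes in the same hash as the underscore-prefixed keys _tag, _open and _close. As a minimal sketch, assuming the node is the plain hash reference shown in the tag_open() example, the attributes can be separated out with ordinary hash operations. The helper name node_attributes is hypothetical and not part of MKDoc::XML; the node is written out by hand here so that the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Node copied by hand from the tag_open() example: <aTag foo="bar" baz="buz">
my $node = {
    _tag   => 'aTag',
    _open  => 1,
    _close => 0,
    foo    => 'bar',
    baz    => 'buz',
};

# Hypothetical helper (NOT part of MKDoc::XML): returns a hash reference
# holding only the attribute pairs, i.e. every key of the node that does
# not start with an underscore.
sub node_attributes {
    my ($node) = @_;
    my %attributes = map { ($_ => $node->{$_}) }
                     grep { !/^_/ }
                     keys %{$node};
    return \%attributes;
}

my $attributes = node_attributes($node);

# Print the tag name followed by its attributes in sorted order.
print "$node->{_tag}: ",
      join(', ', map { "$_=$attributes->{$_}" } sort keys %{$attributes}),
      "\n";
# prints "aTag: baz=buz, foo=bar"
```

The same filter should apply to the structure documented for tag_self_close(), which uses the same kind of keys and differs only in its _open and _close values.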
  my $node = $token->tag_close();
    If this token object represents a closing tag, the following structure is returned:

        # this is </aTag>
        {
            _tag   => 'aTag',
            _open  => 0,
            _close => 1,
        }

    Returns undef otherwise.

  my $node = $token->tag_self_close();
    If this token object represents a self-closing tag, the following structure is returned:

        # this is <aTag foo="bar" baz="buz" />
        {
            _tag   => 'aTag',
            _open  => 1,
            _close => 1,
            foo    => 'bar',
            baz    => 'buz',
        }

    Returns undef otherwise.

  my $node = $token->text();
    If this token object represents a piece of text, then this text is returned.

    Returns undef otherwise.

    TRAP! $token->text() returns a false value if this text happens to be '0' or ''. So really you should use:

        if (defined $token->text()) {
            ... do stuff...
        }

NOTES
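The text() trap described in the API section can be reproduced with plain Perl values. The sketch below does not call MKDoc::XML at all: the list stands in for values that $token->text() might return, and the helper name classify is hypothetical. It only illustrates why a definedness test and a truthiness test disagree for '0' and ''.

```perl
use strict;
use warnings;

# Stand-ins for what $token->text() may return: ordinary text, the
# string '0', the empty string, and undef (the token is not text).
my @returned = ('hello', '0', '', undef);

# Hypothetical helper for illustration only: reports how a plain
# boolean test and a defined() test each see the same value.
sub classify {
    my ($value) = @_;
    my %seen = (
        truthy  => ($value ? 1 : 0),
        defined => (defined $value ? 1 : 0),
    );
    return \%seen;
}

for my $value (@returned) {
    my $seen  = classify($value);
    my $label = defined $value ? "'$value'" : 'undef';
    printf "%-8s truthy=%d defined=%d\n",
        $label, $seen->{truthy}, $seen->{defined};
}
```

The '0' and '' rows come out defined but not truthy, so a bare if ($token->text()) would skip those text tokens, while if (defined $token->text()) keeps them.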
MKDoc::XML::Token works with MKDoc::XML::Tokenizer, which can be used when building a full tree is not necessary. If you need to build a tree, look at MKDoc::XML::TreeBuilder.

AUTHOR
Copyright 2003 - MKDoc Holdings Ltd.

Author: Jean-Michel Hiver

This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.

SEE ALSO
MKDoc::XML::Tokenizer
MKDoc::XML::TreeBuilder

perl v5.10.1 2004-10-06 MKDoc::XML::Token(3pm)