mkdoc::xml::tagger(3pm) [debian man page]

MKDoc::XML::Tagger(3pm) 				User Contributed Perl Documentation				   MKDoc::XML::Tagger(3pm)

NAME
MKDoc::XML::Tagger - Adds XML markup to XML / XHTML content.

SYNOPSIS
    use MKDoc::XML::Tagger;
    print MKDoc::XML::Tagger->process_data (
        "<p>Hello, World!</p>",
        { _expr => 'World', _tag => 'strong', class => 'superFort' }
    );

Should print:

    <p>Hello, <strong class="superFort">World</strong>!</p>

SUMMARY
MKDoc::XML::Tagger is a class which lets you specify a set of tags and attributes associated with expressions which you want to mark up. The module will then wrap every occurrence of those expressions in the XML you give it with the corresponding markup.

For example, let's say that you have a document which contains the term 'Microsoft Windows' several times. You might want to surround every instance of the term with a <trademark> tag. MKDoc::XML::Tagger lets you do exactly that.

In MKDoc, this is used so that editors can enter hyperlinks separately from the content. It allows them to enter content without having to worry about the annoying <a href="..."> syntax. It also has the added benefit of preventing bad information architecture such as the 'click here' syndrome.

We also have plans to use it for automatically linking glossary words, abbreviation tags, etc.

MKDoc::XML::Tagger is also probably a very good tool if you are building some kind of Wiki system in which you want expressions to be automagically hyperlinked.

DISCLAIMER
This module does low level XML manipulation. It will somehow parse even broken XML and try to do something with it. Do not use it unless you know what you're doing.

API
The API is very simple.

my $result = MKDoc::XML::Tagger->process_data ($xml, @expressions);

Tags $xml with the @expressions list.

Each element of @expressions is a hash reference looking like this:

    {
        _expr      => 'Some Expression',
        _tag       => 'foo',
        attribute1 => 'bar',
        attribute2 => 'baz'
    }

Which will try to turn anything which looks like:

    Some Expression
    sOmE ExPrEssIoN
    (etcetera)

Into:

    <foo attribute1="bar" attribute2="baz">Some Expression</foo>
    <foo attribute1="bar" attribute2="baz">sOmE ExPrEssIoN</foo>
    (etcetera)

You can have multiple expressions, in which case the longest expressions are processed first.

my $result = MKDoc::XML::Tagger->process_file ('some/file.xml', @expressions);

Same as process_data(), except it takes its data from 'some/file.xml'.

NOTES
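As a rough illustration of the matching rules described under API (case-insensitive matching, longest expressions first, markup left alone), the following plain-Perl sketch may help. It is not the module's implementation: the helper names wrap_match_sketch and tag_text_sketch are hypothetical, the tag/text split is a crude stand-in for MKDoc::XML::Tokenizer, and attribute values are assumed to need no escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: builds <tag attr="value">match</tag> from one
# expression hash. Attribute values are NOT escaped in this sketch.
sub wrap_match_sketch {
    my ($spec, $matched) = @_;
    my $attrs = join '',
        map  { qq( $_="$spec->{$_}") }
        grep { !/^_/ }               # keys starting with "_" are settings
        sort keys %$spec;
    return "<$spec->{_tag}$attrs>$matched</$spec->{_tag}>";
}

# Hypothetical helper: tags every expression found in the text parts of $xml.
sub tag_text_sketch {
    my ($xml, @expressions) = @_;
    return $xml unless @expressions;

    # Longest expressions first, so 'Microsoft Windows' wins over 'Windows'.
    my @sorted  = sort { length($b->{_expr}) <=> length($a->{_expr}) } @expressions;
    my %by_expr = map { ( lc($_->{_expr}), $_ ) } @sorted;

    # One case-insensitive alternation; Perl tries alternatives left to right.
    my $alternation = join '|', map { quotemeta $_->{_expr} } @sorted;
    my $pattern     = qr/($alternation)/i;

    # Crude split into tags and text (assumption: no '>' inside attributes).
    my @tokens = $xml =~ /(<[^>]*>|[^<]+)/g;

    for my $token (@tokens) {
        next if $token =~ /^</;      # leave markup untouched
        $token =~ s/$pattern/wrap_match_sketch($by_expr{ lc $1 }, $1)/ge;
    }
    return join '', @tokens;
}

print tag_text_sketch(
    '<p>Microsoft Windows and windows</p>',
    { _expr => 'Windows',           _tag => 'em' },
    { _expr => 'Microsoft Windows', _tag => 'trademark' },
), "\n";
# prints: <p><trademark>Microsoft Windows</trademark> and <em>windows</em></p>
```

Sorting by length before building the alternation is what keeps the shorter expression from splitting the longer one; without it, 'Windows' could be tagged inside 'Microsoft Windows'.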
MKDoc::XML::Tagger does not really parse the XML file you're giving to it nor does it care if the XML is well-formed or not. It uses MKDoc::XML::Tokenizer to turn the XML / XHTML file into a series of MKDoc::XML::Token objects and strictly operates on a list of tokens.

For this same reason MKDoc::XML::Tagger does not support namespaces.

AUTHOR
Copyright 2003 - MKDoc Holdings Ltd.

Author: Jean-Michel Hiver

This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.

SEE ALSO
MKDoc::XML::Tokenizer
MKDoc::XML::Token

perl v5.10.1 2005-03-10 MKDoc::XML::Tagger(3pm)

MKDoc::XML::Tokenizer(3pm)				User Contributed Perl Documentation				MKDoc::XML::Tokenizer(3pm)

NAME
MKDoc::XML::Tokenizer - Tokenize XML the REX way

SYNOPSIS
    my $tokens = MKDoc::XML::Tokenizer->process_data ($some_xml);
    foreach my $token (@{$tokens})
    {
        print "'" . $token->as_string() . "' is text\n" if (defined $token->text());
        print "'" . $token->as_string() . "' is a self closing tag\n" if (defined $token->tag_self_close());
        print "'" . $token->as_string() . "' is an opening tag\n" if (defined $token->tag_open());
        print "'" . $token->as_string() . "' is a closing tag\n" if (defined $token->tag_close());
        print "'" . $token->as_string() . "' is a processing instruction\n" if (defined $token->pi());
        print "'" . $token->as_string() . "' is a declaration\n" if (defined $token->declaration());
        print "'" . $token->as_string() . "' is a comment\n" if (defined $token->comment());
        print "'" . $token->as_string() . "' is a tag\n" if (defined $token->tag());
        print "'" . $token->as_string() . "' is a pseudo-tag (NOT text and NOT tag)\n" if (defined $token->pseudotag());
        print "'" . $token->as_string() . "' is a leaf token (NOT opening tag)\n" if (defined $token->leaf());
    }

SUMMARY
MKDoc::XML::Tokenizer is a module which uses Robert D. Cameron's REX technique to parse XML (ignore the carriage returns):

    [^<]+|<(?:!(?:--(?:[^-]*-(?:[^-][^-]*-)*->?)?|\[CDATA\[(?:[^\]]*](?:[^\]]+])
    *]+(?:[^\]>][^\]]*](?:[^\]]+])*]+)*>)?|DOCTYPE(?:[ \n\t\r]+(?:[A-Za-z_:]|
    [^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*(?:[ \n\t\r]+(?:(?:[A-Za-z_:]|
    [^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*|"[^"]*"|'[^']*'))*
    (?:[ \n\t\r]+)?(?:\[(?:<(?:!(?:--[^-]*-(?:[^-][^-]*-)*->|[^-](?:[^\]"'><]+|
    "[^"]*"|'[^']*')*>)|\?(?:[A-Za-z_:]|[^\x00-\x7F])(?:[A-Za-z0-9_:.-]|
    [^\x00-\x7F])*(?:\?>|[\n\r\t ][^?]*\?+(?:[^>?][^?]*\?+)*>))|%(?:[A-Za-z_:]|
    [^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*;|[ \n\t\r]+)*](?:[ \n\t\r]+)?)?
    >?)?)?|\?(?:(?:[A-Za-z_:]|[^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*
    (?:\?>|[\n\r\t ][^?]*\?+(?:[^>?][^?]*\?+)*>)?)?|/(?:(?:[A-Za-z_:]|
    [^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*(?:[ \n\t\r]+)?>?)?|
    (?:(?:[A-Za-z_:]|[^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*
    (?:[ \n\t\r]+(?:[A-Za-z_:]|[^\x00-\x7F])(?:[A-Za-z0-9_:.-]|[^\x00-\x7F])*
    (?:[ \n\t\r]+)?=(?:[ \n\t\r]+)?(?:"[^<"]*"|'[^<']*'))*(?:[ \n\t\r]+)?/?>?)?)

That's right. One big regex, and it works rather well.

DISCLAIMER
This module does low level XML manipulation. It will somehow parse even broken XML and try to do something with it. Do not use it unless you know what you're doing.

API
my $tokens = MKDoc::XML::Tokenizer->process_data ($some_xml);

Splits $some_xml into a list of MKDoc::XML::Token objects and returns an array reference to the list of tokens.

my $tokens = MKDoc::XML::Tokenizer->process_file ('/some/file.xml');

Same as MKDoc::XML::Tokenizer->process_data ($some_xml), except that it reads $some_xml from '/some/file.xml'.

NOTES
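To make the token list returned by process_data() more concrete, here is a minimal sketch of shallow tokenizing in plain Perl. It assumes a deliberately simplified pattern rather than the REX expression from the SUMMARY, so it will mis-split input such as a DOCTYPE with an internal subset or a '>' inside an attribute value. The helper name shallow_tokens_sketch and the type labels are hypothetical; the real module returns MKDoc::XML::Token objects, not hashes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tokenizer: splits $xml into comments,
# tags and text with a small pattern, and returns an array reference.
sub shallow_tokens_sketch {
    my ($xml) = @_;
    my @tokens;

    # Comment | any other <...> markup | run of text | stray '<'.
    # The last alternative keeps every input character in some token.
    while ($xml =~ /\G(<!--.*?-->|<[^>]*>|[^<]+|<)/gcs) {
        my $string = $1;
        my $type =
              $string =~ /^<!--/     ? 'comment'
            : $string =~ /^<\?/      ? 'pi'
            : $string =~ /^<!/       ? 'declaration'
            : $string =~ m{^</}      ? 'tag_close'
            : $string =~ m{^<.*/>$}s ? 'tag_self_close'
            : $string =~ m{^<.*>$}s  ? 'tag_open'
            :                          'text';
        push @tokens, { type => $type, string => $string };
    }
    return \@tokens;
}

my $tokens = shallow_tokens_sketch('<p>Hi <br/>there<!-- note --></p>');
foreach my $token (@{$tokens}) {
    print "'$token->{string}' is $token->{type}\n";
}
```

Because every character ends up in exactly one token, joining the token strings reproduces the input unchanged, which is the property that lets a tool rewrite some tokens and leave the rest of the document as it was.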
MKDoc::XML::Tokenizer works with MKDoc::XML::Token, which can be used when building a full tree is not necessary. If you need to build a tree, look at MKDoc::XML::TreeBuilder.

AUTHOR
Copyright 2003 - MKDoc Holdings Ltd.

Author: Jean-Michel Hiver

This module is free software and is distributed under the same license as Perl itself. Use it at your own risk.

SEE ALSO
MKDoc::XML::Token
MKDoc::XML::TreeBuilder

perl v5.10.1 2004-10-06 MKDoc::XML::Tokenizer(3pm)