TRANG(1) http://www.thaiopensource. TRANG(1)
NAME
trang - Schema Converter Based on RELAX NG
SYNOPSIS
trang [-I rng | rnc | dtd | xml ] [-O rng | rnc | dtd | xsd ] [-i input-param] [-o output-param] {inputFileOrURI...} {outputFile}
INTRODUCTION
Trang takes as input a schema written in any of the following formats:
o RELAX NG (XML syntax)
o RELAX NG (compact syntax)
o XML 1.0 DTD
and produces as output a schema written in any of the following formats:
o RELAX NG (XML syntax)
o RELAX NG (compact syntax)
o XML 1.0 DTD
o W3C XML Schema
Trang can also infer a schema from one or more example XML documents.
Trang uses an internal representation based on RELAX NG. For each supported input format, there is an input module that converts a schema
in that input format into this internal representation. For each supported output format, there is an output module that converts the
internal representation into a schema in that output format. Thus, any supported input format can be translated to any supported output
format.
COMMAND-LINE ARGUMENTS
Trang requires at least two command-line arguments: the first is the URI or filename of the schema to be translated; the last is the
output filename.
Trang infers the input and output modules to be used from the extension of input and output filenames as follows:
.rng
RELAX NG (XML syntax)
.rnc
RELAX NG (compact syntax)
.dtd
XML 1.0 DTD
.xsd
W3C XML Schema
.xml
XML documents (used as examples from which to infer a schema)
This inference can be overridden using the -I and -O options.
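The mapping from filename extension to module can be pictured with a short Python sketch. This is only an illustration of the table above, not Trang's own code: the function name infer_module and the ValueError raised for an unknown extension are assumptions made for the example.

```python
import os

# Extension-to-module table, as listed in the manual text above.
EXTENSION_TO_MODULE = {
    ".rng": "rng",  # RELAX NG (XML syntax)
    ".rnc": "rnc",  # RELAX NG (compact syntax)
    ".dtd": "dtd",  # XML 1.0 DTD
    ".xsd": "xsd",  # W3C XML Schema
    ".xml": "xml",  # example XML documents (schema inference)
}


def infer_module(filename, override=None):
    """Return the module name for a file or URI.

    `override` plays the role of an explicit -I or -O option.
    Raising ValueError for an unknown extension is an assumption
    of this sketch; the manual does not say how Trang reports it.
    """
    if override is not None:
        return override
    extension = os.path.splitext(filename)[1]
    if extension not in EXTENSION_TO_MODULE:
        raise ValueError("cannot infer module from %r" % filename)
    return EXTENSION_TO_MODULE[extension]


print(infer_module("book.dtd"))
print(infer_module("book.txt", override="rnc"))
```

The second call shows why the -I and -O options exist: a file whose extension says nothing about its format can still be processed when the module is named explicitly.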
When the input consists of XML documents used as examples to infer a schema, more than one input file may be specified. All the input
files are specified before the output file.
The arguments specifying the input and output files can be preceded by arguments specifying options. Trang accepts the following options:
-I
Specifies the input module (rng, rnc, dtd, or xml)
-O
Specifies the output module (rng, rnc, dtd, or xsd)
-i, -o
Specifies an additional parameter for an input (-i) or output (-o) module. The -i and -o options may be used multiple times in order to
specify multiple parameters. There are two kinds of parameter: boolean parameters and string-valued parameters. A string-valued
parameter is specified using the form name=value. A boolean parameter is specified using the form name or no-name. The applicable
parameters depend on the particular input and output module and are described in the documentation for the input or output modules.
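The three parameter forms (name=value, name, and no-name) can be told apart mechanically. The following Python sketch shows one way to do so; the function name parse_param is hypothetical, and the sketch assumes that no boolean parameter has a name that itself begins with "no-".

```python
def parse_param(param):
    """Split one -i or -o parameter into (name, value).

    String-valued parameters yield a str value; boolean
    parameters yield True (name) or False (no-name).
    """
    if "=" in param:
        # Split at the first "=" only, so values such as URIs
        # may themselves contain "=" characters.
        name, value = param.split("=", 1)
        return name, value
    if param.startswith("no-"):
        return param[len("no-"):], False
    return param, True


for example in ("encoding=UTF-8", "inline-attlist", "no-generate-start"):
    print(example, "->", parse_param(example))
```

Note that a parameter such as xmlns:prefix=uri is string-valued: the colon is part of the parameter name, and only the first equals sign separates the name from the value.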
INPUT MODULES
The following subsections describe the parameters accepted by each input module.
RELAX NG (XML Syntax) Input Module
This input module accepts RELAX NG schemas in XML syntax as defined by the RELAX NG 1.0 Committee Specification[1]. It accepts the following
parameters:
-i encoding=name
Use an encoding of name rather than the encoding specified in the encoding declaration of the XML document.
RELAX NG Compact Syntax Input Module
This input module accepts RELAX NG schemas using the compact syntax as defined in the RELAX NG Compact Syntax Committee Specification[2].
It accepts the following parameters:
-i encoding=name
Use an encoding of name. By default, Trang will autodetect an encoding of UTF-8 or UTF-16.
DTD Input Module
This input module accepts DTDs as defined by the XML 1.0 Recommendation[3]. It accepts the following parameters:
-i xmlns=uri
Specifies the default namespace, that is the namespace used for unqualified element names.
-i xmlns:prefix=uri
Specifies the namespace for the element and attribute names using prefix.
-i colon-replacement=chars
Replaces colons in element names by chars when constructing the names of definitions used to represent the element declarations and
attribute list declarations in the DTD. Trang generates a definition for each element declaration and attlist declaration in the DTD.
The name of the definition is based on the name of the element. In RELAX NG, the names of definitions cannot contain colons. However,
in the DTD, the element name may contain a colon. By default, Trang will first try to use the element names without prefixes. If this
causes a conflict, it will instead replace the colon by a legal name character (it first tries to use a period).
-i element-define=name-pattern
Specifies how to construct the name of the definition representing an element declaration from the name of the element. The
name-pattern must contain exactly one percent character. This percent character is replaced by the name of the element (after colon
replacement) and the result is used as the name of the definition.
-i inline-attlist
Specifies not to generate definitions for attribute list declarations and instead move attributes declared in attribute list
declarations into the definitions generated for element declarations. This is the default behavior when the output module is xsd.
Otherwise, the default behavior is that of -i no-inline-attlist: a separate definition is generated for each attribute list declaration.
-i attlist-define=name-pattern
Specifies how to construct the name of the definition representing an attribute list declaration from the name of the element. The
name-pattern must contain exactly one percent character. This percent character is replaced by the name of the element (after colon
replacement) and the result is used as the name of the definition.
-i any-name=name
Specifies the name of the definition generated for the content of elements declared in the DTD as having a content model of ANY.
-i strict-any
Preserves the exact semantics of ANY content models by using an explicit choice of references to all declared elements. By default,
Trang uses a wildcard that allows any element.
-i annotation-prefix=prefix
Specifies the prefix of the annotation attribute used to represent default values. Default values are represented using an annotation
attribute prefix:defaultValue where prefix is bound to http://relaxng.org/ns/compatibility/annotations/1.0 as defined by the RELAX NG
DTD Compatibility Committee Specification[4]. By default, Trang uses the prefix "a" unless that conflicts with a prefix used in the DTD.
-i generate-start, -i no-generate-start
Specifies whether Trang should generate a start element. DTDs do not indicate what elements are allowed as document elements. Trang
assumes that all elements that are defined but never referenced are allowed as document elements.
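How colon-replacement, element-define and attlist-define combine to form a definition name can be sketched in Python as follows. This is a simplified illustration, not Trang's implementation: the function name definition_name and the example patterns are hypothetical, and the sketch always replaces the colon, whereas by default Trang first tries the element name without its prefix and only falls back to replacement on a conflict.

```python
def definition_name(element_name, name_pattern, colon_replacement="."):
    """Build a definition name from an element name.

    The colon in a prefixed element name is replaced first, then
    the result is substituted for the single percent character in
    the pattern. The "." default here is an assumption of this
    sketch, chosen because a period is the first replacement
    character Trang tries.
    """
    if name_pattern.count("%") != 1:
        raise ValueError("name-pattern must contain exactly one percent character")
    replaced = element_name.replace(":", colon_replacement)
    return name_pattern.replace("%", replaced)


# Hypothetical patterns, as they might be passed with
# -i element-define=... and -i attlist-define=...
print(definition_name("xhtml:p", "%.element", colon_replacement="_"))
print(definition_name("title", "attlist.%"))
```

Rejecting a pattern without exactly one percent character mirrors the stated requirement; how Trang itself reports such a pattern is not described here.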
XML Input Module
This input module accepts one or more XML documents and infers a schema. All the XML documents will be valid with respect to the inferred
schema.
It accepts the following parameters:
-i encoding=name
Use an encoding of name rather than the encoding specified in the encoding declaration of the XML document.
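When Trang is driven from a script, the argument order described under COMMAND-LINE ARGUMENTS (options first, then all inputs, then the output file) can be assembled as in the Python sketch below. The helper name trang_command and the file names are hypothetical; only the -I, -O, -i and -o options documented in this manual are used, and the sketch builds the argument list without running anything.

```python
import shlex


def trang_command(inputs, output, input_module=None, output_module=None,
                  input_params=(), output_params=()):
    """Return an argv list for invoking trang.

    Options come first, then every input file or URI, and the
    output file last, as the manual requires.
    """
    if not inputs:
        raise ValueError("at least one input file or URI is required")
    argv = ["trang"]
    if input_module:
        argv += ["-I", input_module]
    if output_module:
        argv += ["-O", output_module]
    for param in input_params:
        argv += ["-i", param]
    for param in output_params:
        argv += ["-o", param]
    return argv + list(inputs) + [output]


# Infer a compact-syntax schema from two example documents.
argv = trang_command(["chapter1.xml", "chapter2.xml"], "book.rnc",
                     input_params=["encoding=UTF-8"])
print(shlex.join(argv))
```

The resulting list could be handed to subprocess.run on a system where the trang executable is installed and on the search path.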
OUTPUT MODULES
All output modules accept the following parameters:
-o encoding=name
Use an encoding of name for the output files.
-o indent=n
Indent by n spaces for each indentation level.
RELAX NG (XML Syntax) Output Module
This output module outputs RELAX NG schemas in XML syntax as defined by the RELAX NG 1.0 Committee Specification[1].
RELAX NG Compact Syntax Output Module
This output module outputs RELAX NG schemas in compact syntax as defined by the RELAX NG Compact Syntax Committee Specification[2].
DTD Output Module
This output module outputs DTDs as defined by the XML 1.0 Recommendation[3].
It has many limitations. There are many RELAX NG features that it cannot handle, including:
o Wildcards
o Multiple element patterns with the same name
o externalRef
o overriding definitions (in an include)
o combining definitions with combine="choice"
However, it can handle many RELAX NG features, including some that go beyond the capabilities of DTDs. When some part of a RELAX NG schema
cannot be represented exactly in DTD, Trang will try to approximate it. The approximation will always be more general, that is, the DTD
will allow everything that is allowed by the RELAX NG schema, but there may be some things that are allowed by the DTD that are not allowed
by the RELAX NG schema. For example, if the RELAX NG schema specifies that the content of an element is a string conforming to some
datatype, then Trang will make the content of the element be (#PCDATA); or if the RELAX NG schema specifies a choice between two attributes
x and y, then the DTD will allow both x and y optionally. Whenever Trang approximates, it will give a warning message.
If you want to be able to generate a DTD but need to use some feature of RELAX NG that Trang is unable to convert into a DTD, then you
might try one of the following approaches:
o Create a RELAX NG schema including the features you need, and then use XSLT (or some other XML transformation language) to transform
the schema into something that Trang can handle, perhaps making use of annotations in the schema to guide the transformation.
o Create a RELAX NG schema S1 which uses only features that Trang can handle but which, consequently, does not capture all the desired
constraints; then create a second RELAX NG schema S2 that includes S1, and overrides definitions in S1 replacing them with definitions
that make unrestricted use of the features of RELAX NG.
W3C XML Schema Output Module
This output module outputs a W3C XML Schema as defined by the XML Schema Recommendation[5].
It supports the following parameters:
-o disable-abstract-elements
Disables the use of abstract elements and substitution groups in the generated XML Schema. This can also be controlled using an
annotation attribute.
-o any-process-contents=strict|lax|skip
Specifies the value for the processContents attribute of any elements. The default is skip (corresponding to RELAX NG semantics) unless
the input format is dtd, in which case the default is strict (corresponding to DTD semantics).
-o any-attribute-process-contents=strict|lax|skip
Specifies the value for the processContents attribute of anyAttribute elements. The default is skip (corresponding to RELAX NG
semantics).
It has the following limitations:
o it may generate schemas that violate W3C XML Schema's restrictions on ambiguous content models;
o it may generate schemas that violate W3C XML Schema's restrictions on consistent element types;
o when the RELAX NG schema cannot be represented by W3C XML Schema, a generalization is generated; it should give a warning in this case,
but does not always do so.
Annotations can be added to the RELAX NG schema to guide the translation. These annotations have the namespace URI
http://www.thaiopensource.com/ns/relaxng/xsd. This document will use the convention that the prefix tx refers to this namespace URI; in
other words, it will assume a namespace declaration of
xmlns:tx="http://www.thaiopensource.com/ns/relaxng/xsd"
Currently, only one annotation is supported, an attribute tx:enableAbstractElements. The value of this must be true or false. It applies to
RELAX NG define elements. Trang has the ability to translate a define that contains a choice of element patterns into an abstract element
declaration, which will be used as the head of a substitution group whose members are the elements in the choice. Whether it does this is
determined by the value of the tx:enableAbstractElements annotation attribute. If the value is true, it will attempt to use an abstract
element. If the value is false, it will not, which means the define will typically be translated into a group definition.
The tx:enableAbstractElements attribute is inherited in a similar way to the ns attribute: it can be specified on a grammar, div or include
element to enable or disable the use of abstract elements for all descendant define elements. In the absence of any inherited
tx:enableAbstractElements attribute, the use of abstract elements is enabled unless the -o disable-abstract-elements option was specified.
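The inheritance rule can be made concrete with a small Python sketch that walks a RELAX NG schema and reports the effective setting for each define. The schema, the function name effective_settings and its enabled parameter are all invented for this illustration; the sketch models only the inheritance described above and does not reproduce how Trang decides whether a given define can actually become an abstract element.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
TX_NS = "http://www.thaiopensource.com/ns/relaxng/xsd"
TX_ATTR = "{%s}enableAbstractElements" % TX_NS

# Hypothetical schema: abstract elements are disabled on the
# grammar and re-enabled for the defines inside one div.
SCHEMA = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:tx="http://www.thaiopensource.com/ns/relaxng/xsd"
         tx:enableAbstractElements="false">
  <start><ref name="inline"/></start>
  <define name="inline">
    <choice>
      <element name="em"><text/></element>
      <element name="code"><text/></element>
    </choice>
  </define>
  <div tx:enableAbstractElements="true">
    <define name="block">
      <choice>
        <element name="p"><text/></element>
        <element name="pre"><text/></element>
      </choice>
    </define>
  </div>
</grammar>
"""


def effective_settings(element, enabled=True, result=None):
    """Map each define name to its effective setting.

    `enabled` is the value inherited from the ancestors; the
    initial True stands for a run without the
    -o disable-abstract-elements option.
    """
    if result is None:
        result = {}
    value = element.get(TX_ATTR)
    if value is not None:
        enabled = (value == "true")
    if element.tag == "{%s}define" % RNG_NS:
        result[element.get("name")] = enabled
    for child in element:
        effective_settings(child, enabled, result)
    return result


settings = effective_settings(ET.fromstring(SCHEMA))
print(settings)
```

Calling the function with enabled=False on a schema that carries no annotation corresponds to specifying -o disable-abstract-elements.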
It can happen that the same element name occurs in a choice in more than one define element; at most one of these define elements can be
translated to an abstract element. In this case, Trang will not translate any of them to an abstract element, unless the use of abstract
elements has been disabled by tx:enableAbstractElements for all except one of the define elements.
In fact, the use of abstract elements is not restricted to the case where the define consists of a choice that contains only element
patterns; the choice may also contain ref patterns referring to definitions that are to be translated into element declarations, whether
abstract or not. The tx:enableAbstractElements attribute applies equally to these definitions.
SEE ALSO
http://code.google.com/p/jing-trang/
Project homepage and source code
http://www.thaiopensource.com/relaxng/trang-manual.html
This manual in HTML format
AUTHORS
James Clark <jj@thaiopensource.com>
Thai Open Source Software Center Ltd
Developer
Thomas Schraitle <toms@suse.de>
Creating manpage
COPYRIGHT
Copyright (C) 2002, 2003, 2008 Thai Open Source Software Center Ltd
See the file copying.txt for copying permission.
This document was compiled from http://www.thaiopensource.com/relaxng/trang-manual.html.
NOTES
1. RELAX NG 1.0 Committee Specification
http://www.oasis-open.org/committees/relax-ng/spec.html
2. RELAX NG Compact Syntax Committee Specification
http://www.oasis-open.org/committees/relax-ng/compact-20021121.html
3. XML 1.0 Recommendation
http://www.w3.org/TR/REC-xml
4. RELAX NG DTD Compatibility Committee Specification
http://www.oasis-open.org/committees/relax-ng/compatibility.html
5. XML Schema Recommendation
http://www.w3.org/TR/xmlschema-1/
http://code.google.com/p/j 03/16/2009 TRANG(1)