apertium(1) apertium(1)
NAME
apertium - This application is part of ( apertium )
This tool is part of the apertium machine translation architecture: http://apertium.sf.net.
SYNOPSIS
apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]
DESCRIPTION
apertium is the application that most people will be using as it simplifies the use of apertium/lt-toolbox tools for machine translation
purposes.
This tool tries to ease the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains
the rest of the engine) by providing a single front-end to the end-user.
The different modules behind the apertium machine translation architecture are in order:
o de-formatter: Separates the text to be translated from the format information.
o morphological-analyser: Tokenizes the text into surface forms.
o part-of-speech tagger: Chooses one surface form among homographs.
o lexical transfer module: Reads each source-language lexical form and delivers a corresponding target-language lexical form.
o structural transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due to
grammatical divergences between the two languages and performs the corresponding transformations.
o morphological generator: Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.
o post-generator: Performs orthographical operations such as contractions and apostrophations.
o re-formatter: Restores the format information encapsulated by the de-formatter into the translated text and removes the
encapsulation sequences used to protect certain characters in the source text.
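The ordering of the modules listed above can be sketched as a simple chain in which each stage consumes the output of the previous one. This is only an illustration of the data flow: the stage bodies are placeholders that pass the text through unchanged, and the names make_tracing_stage and run_pipeline are hypothetical helpers, not part of apertium or lt-toolbox.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[str], str]

# Stage names in the order given in the list above.
STAGE_NAMES: List[str] = [
    "de-formatter",
    "morphological-analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]


def make_tracing_stage(name: str, trace: List[str]) -> Stage:
    """Return a placeholder stage (hypothetical) that records its name
    and passes the text through unchanged. A real module would
    transform the text instead."""
    def stage(text: str) -> str:
        trace.append(name)
        return text
    return stage


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, text)


trace: List[str] = []
stages = [make_tracing_stage(name, trace) for name in STAGE_NAMES]
result = run_pipeline("some source text", stages)
```

After the run, trace holds the stage names in the order they were applied, which matches the order of the list above.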
OPTIONS
-d datadir The directory holding the linguistic data. By default it will use the expected installation path.
language-pair The language pair: LANG1-LANG2 (for instance es-ca or ca-es).
-f format Specifies the format of the input and output files which can have these values:
o txt (default value) Input and output files are in text format.
o html Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers.
o rtf Input and output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft
Office (C) up to and including Office-97.
-u Disable marking of unknown words with the '*' character.
-a Enable marking of disambiguated words with the '=' character.
FILES
These are the two files that can be used with this command:
infile Input file (stdin by default).
outfile Output file (stdout by default).
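As a sketch of how the options and operands fit together, the following builds an argument list that follows the order given in the SYNOPSIS. The function build_apertium_argv and its parameter names are hypothetical and not part of apertium; only the flags, the format values and the operand order are taken from this page.

```python
from typing import List, Optional

# Format values listed under the -f option.
KNOWN_FORMATS = ("txt", "html", "rtf")


def build_apertium_argv(
    language_pair: str,
    datadir: Optional[str] = None,
    fmt: Optional[str] = None,
    no_unknown_marks: bool = False,
    mark_disambiguated: bool = False,
    infile: Optional[str] = None,
    outfile: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list in the SYNOPSIS order (hypothetical helper)."""
    if fmt is not None and fmt not in KNOWN_FORMATS:
        raise ValueError("unknown format: " + fmt)
    if outfile is not None and infile is None:
        # The SYNOPSIS nests outfile inside infile: [infile [outfile]].
        raise ValueError("outfile can only be given together with infile")

    argv = ["apertium"]
    if datadir is not None:
        argv += ["-d", datadir]
    if fmt is not None:
        argv += ["-f", fmt]
    if no_unknown_marks:
        argv.append("-u")  # disable '*' marks on unknown words
    if mark_disambiguated:
        argv.append("-a")  # enable '=' marks on disambiguated words
    argv.append(language_pair)
    if infile is not None:
        argv.append(infile)
    if outfile is not None:
        argv.append(outfile)
    return argv
```

On a system where apertium is installed, the resulting list could be handed to subprocess.run; with no infile or outfile the command reads stdin and writes stdout, as described under FILES.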
SEE ALSO
lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.
2006-03-08 apertium(1)
mpexpand(n) Documentation toolbox mpexpand(n)
NAME
mpexpand - Markup processor
SYNOPSIS
mpexpand ?-module module? format infile|- outfile|-
mpexpand.all ?-verbose? ?module?
DESCRIPTION
This manpage describes a processor / converter for manpages in the doctools format as specified in doctools_fmt. The processor is based
upon the package doctools.
mpexpand ?-module module? format infile|- outfile|-
The processor takes three arguments, namely the code describing which formatting to generate as the output, the file to read the
markup from, and the file to write the generated output into. If the infile is "-" the processor will read from stdin. If outfile is
"-" the processor will write to stdout.
If the option -module is present its value overrides the internal definition of the module name.
The currently known output formats are
nroff The processor generates *roff output, the standard format for unix manpages.
html The processor generates HTML output, for usage in and display by web browsers.
tmml The processor generates TMML output, the Tcl Manpage Markup Language, a derivative of XML.
latex The processor generates LaTeX output.
wiki The processor generates Wiki markup as understood by wikit.
list The processor extracts the information provided by manpage_begin.
null The processor does not generate any output.
mpexpand.all ?-verbose? ?module?
This command uses mpexpand to generate all possible output formats for all manpages in the current directory. The manpages are
recognized through the extension ".man". If -verbose is specified the command will list its actions before executing them.
The module information is passed to mpexpand.
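The behaviour described for mpexpand.all can be sketched as a loop over the ".man" files and the known formats, producing one mpexpand invocation per pair. This is not the actual implementation of mpexpand.all: the function plan_mpexpand_all is hypothetical, and the output file naming (input name with the format as extension) is an assumption made only for illustration, since this page does not say how output files are named.

```python
from pathlib import Path
from typing import List, Optional

# Output formats listed in the DESCRIPTION above.
FORMATS = ["nroff", "html", "tmml", "latex", "wiki", "list", "null"]


def plan_mpexpand_all(directory: Path, module: Optional[str] = None) -> List[List[str]]:
    """Return one mpexpand argument list per (manpage, format) pair.

    Hypothetical helper: it only plans the commands, it does not run them.
    """
    commands: List[List[str]] = []
    for manpage in sorted(directory.glob("*.man")):
        for fmt in FORMATS:
            argv = ["mpexpand"]
            if module is not None:
                # The module information is passed on to mpexpand.
                argv += ["-module", module]
            # ASSUMPTION: output is named after the input, with the
            # format as its extension. The real naming may differ.
            outfile = manpage.with_suffix("." + fmt)
            argv += [fmt, str(manpage), str(outfile)]
            commands.append(argv)
    return commands
```

Printing each planned argument list before running it would correspond to what the -verbose option is described as doing.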
NOTES
Possible future formats are plain text, pdf and postscript.
SEE ALSO
expander(n), format(n), formatter(n)
KEYWORDS
HTML, TMML, conversion, manpage, markup, nroff
CATEGORY
Documentation tools
COPYRIGHT
Copyright (c) 2002 Andreas Kupries <andreas_kupries@users.sourceforge.net>
Copyright (c) 2003 Andreas Kupries <andreas_kupries@users.sourceforge.net>
doctools 1.0 mpexpand(n)