apertium(1) [debian man page]

apertium(1)															       apertium(1)

NAME
apertium - This tool is part of the apertium machine translation architecture: http://apertium.sf.net.

SYNOPSIS
apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]

DESCRIPTION
apertium is the application most people will use, as it simplifies the use of the apertium/lt-toolbox tools for machine translation. It eases the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains the rest of the engine) by providing a single front end to the end user.

The modules behind the apertium machine translation architecture are, in order:

o de-formatter: Separates the text to be translated from the format information.

o morphological-analyser: Tokenizes the text into surface forms.

o part-of-speech tagger: Chooses one surface form among homographs.

o lexical transfer module: Reads each source-language lexical form and delivers a corresponding target-language lexical form.

o structural transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due to grammatical divergences between the two languages and performs the corresponding transformations.

o morphological generator: Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.

o post-generator: Performs orthographical operations such as contractions and apostrophations.

o re-formatter: Restores the format information encapsulated by the de-formatter into the translated text and removes the encapsulation sequences used to protect certain characters in the source text.

OPTIONS
-d datadir
    The directory holding the linguistic data. By default the expected installation path is used.

language-pair
    The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

-f format
    Specifies the format of the input and output files, which can have these values:

    o txt (default value): Input and output files are in text format.

    o html: Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers.

    o rtf: Input and output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft Office (C) up to and including Office-97.

-u  Disable marking of unknown words with the '*' character.

-a  Enable marking of disambiguated words with the '=' character.

FILES
These are the two files that can be used with this command:

infile
    Input file (stdin by default).

outfile
    Output file (stdout by default).

SEE ALSO
lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS
Lots of...lurking in the dark and waiting for you!

AUTHOR
(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

2006-03-08 apertium(1)
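
As an illustration of the SYNOPSIS and OPTIONS above, the following Python sketch assembles an apertium command line and, optionally, runs it over stdin/stdout. It is not part of apertium: the helper names build_apertium_argv and translate are hypothetical, and the sketch assumes the option set documented on this page (later releases may accept different options) and that the apertium executable and the data for the chosen language pair are installed.

```python
import shutil
import subprocess

# Format values listed under -f in OPTIONS.
FORMATS = ("txt", "html", "rtf")


def build_apertium_argv(pair, datadir=None, fmt="txt", mark_unknown=True,
                        mark_disambiguated=False, infile=None, outfile=None):
    """Assemble an argument list that follows the SYNOPSIS on this page."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if outfile is not None and infile is None:
        # Both are positional, so outfile cannot be given without infile.
        raise ValueError("outfile requires infile")
    argv = ["apertium"]
    if datadir is not None:
        argv += ["-d", datadir]
    argv += ["-f", fmt]
    if not mark_unknown:
        argv.append("-u")   # -u disables the '*' mark on unknown words
    if mark_disambiguated:
        argv.append("-a")   # -a enables the '=' mark on disambiguated words
    argv.append(pair)
    if infile is not None:
        argv.append(infile)
        if outfile is not None:
            argv.append(outfile)
    return argv


def translate(text, pair, **options):
    """Translate text through stdin/stdout (the defaults for infile/outfile)."""
    argv = build_apertium_argv(pair, **options)
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError("apertium executable not found on PATH")
    result = subprocess.run(argv, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For instance, build_apertium_argv("es-ca", fmt="html", mark_unknown=False) yields the list ['apertium', '-f', 'html', '-u', 'es-ca']. Passing the arguments as a list rather than a shell string avoids quoting problems with file names.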

mpexpand(n)						       Documentation toolbox						       mpexpand(n)

__________________________________________________________________________________________________________________________________________________

NAME
mpexpand - Markup processor

SYNOPSIS
mpexpand ?-module module? format infile|- outfile|-

mpexpand.all ?-verbose? ?module?

_________________________________________________________________

DESCRIPTION
This manpage describes a processor / converter for manpages in the doctools format as specified in doctools_fmt. The processor is based upon the package doctools.

mpexpand ?-module module? format infile|- outfile|-
    The processor takes three arguments, namely the code describing which formatting to generate as the output, the file to read the markup from, and the file to write the generated output into. If the infile is "-" the processor will read from stdin. If outfile is "-" the processor will write to stdout.

    If the option -module is present its value overrides the internal definition of the module name.

    The currently known output formats are

    nroff  The processor generates *roff output, the standard format for unix manpages.

    html   The processor generates HTML output, for usage in and display by web browsers.

    tmml   The processor generates TMML output, the Tcl Manpage Markup Language, a derivative of XML.

    latex  The processor generates LaTeX output.

    wiki   The processor generates Wiki markup as understood by wikit.

    list   The processor extracts the information provided by manpage_begin.

    null   The processor does not generate any output.

mpexpand.all ?-verbose? ?module?
    This command uses mpexpand to generate all possible output formats for all manpages in the current directory. The manpages are recognized through the extension ".man". If -verbose is specified the command will list its actions before executing them. The module information is passed to mpexpand.

NOTES
Possible future formats are plain text, pdf and postscript.

SEE ALSO
expander(n), format(n), formatter(n)

KEYWORDS
HTML, TMML, conversion, manpage, markup, nroff

CATEGORY
Documentation tools

COPYRIGHT
Copyright (c) 2002 Andreas Kupries <andreas_kupries@users.sourceforge.net>
Copyright (c) 2003 Andreas Kupries <andreas_kupries@users.sourceforge.net>

doctools 1.0 mpexpand(n)
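
The mpexpand calling convention described above (optional -module, then format, infile and outfile, with "-" standing for stdin or stdout) can be sketched in Python as follows. The helper name build_mpexpand_argv is hypothetical and not part of doctools; the sketch only assembles the argument list and assumes nothing beyond the synopsis and the format list on this page.

```python
# Output formats listed as "currently known" in DESCRIPTION.
KNOWN_FORMATS = ("nroff", "html", "tmml", "latex", "wiki", "list", "null")


def build_mpexpand_argv(fmt, infile="-", outfile="-", module=None):
    """Assemble: mpexpand ?-module module? format infile|- outfile|-"""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown output format: {fmt!r}")
    argv = ["mpexpand"]
    if module is not None:
        # -module overrides the internal definition of the module name.
        argv += ["-module", module]
    # "-" means stdin for infile and stdout for outfile.
    argv += [fmt, infile, outfile]
    return argv
```

For instance, build_mpexpand_argv("html", "doc.man") yields ['mpexpand', 'html', 'doc.man', '-'], which would write the HTML to stdout if passed to a process runner such as subprocess.run.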