mftraining(1) [debian man page]

MFTRAINING(1)															     MFTRAINING(1)

NAME
mftraining - feature training for Tesseract

SYNOPSIS
mftraining -U unicharset -O lang.unicharset FILE...

DESCRIPTION
mftraining takes a list of .tr files, from which it generates the files inttemp (the shape prototypes), shapetable, and pffmtable (the number of expected features for each character). (A fourth file called Microfeat is also written by this program, but it is not used.)

OPTIONS
-U FILE
    (Input) The unicharset generated by unicharset_extractor(1).

-F font_properties_file
    (Input) Font properties file. Each line is of the following form, where each field other than the font name is 0 or 1:

        font_name italic bold fixed_pitch serif fraktur

-X xheights_file
    (Input) x heights file. Each line is of the following form, where xheight is calculated as the pixel x height of a character drawn at 32pt on 300 dpi. [ That is, if base x height + ascenders + descenders = 133, how much is x height? ]

        font_name xheight

-D dir
    Directory to write output files to.

-O FILE
    (Output) The output unicharset that will be given to combine_tessdata(1).

SEE ALSO
tesseract(1), cntraining(1), unicharset_extractor(1), combine_tessdata(1), shapeclustering(1), unicharset(5)

http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3

COPYING
Copyright (C) Hewlett-Packard Company, 1988. Licensed under the Apache License, Version 2.0.

AUTHOR
The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).

02/09/2012 MFTRAINING(1)
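
The line formats described under -F and -X above can be sketched in Python. This is only an illustration, not part of Tesseract: the helper names (em_height_px, font_properties_line, xheights_line), the font name "timesitalic", and the x-height value 60 are hypothetical. The sketch also assumes that the figure 133 in the -X description is the full body height of 32pt type at 300 dpi, taking 72 points per inch.

```python
from fractions import Fraction

POINTS_PER_INCH = 72  # assumption: standard typographic point


def em_height_px(point_size, dpi):
    """Full body height in pixels for type of point_size rendered at dpi."""
    return Fraction(point_size * dpi, POINTS_PER_INCH)


def font_properties_line(font_name, italic=0, bold=0, fixed_pitch=0,
                         serif=0, fraktur=0):
    """Build one font properties line: the font name followed by five 0/1 fields."""
    flags = (italic, bold, fixed_pitch, serif, fraktur)
    if any(flag not in (0, 1) for flag in flags):
        raise ValueError("each field other than the font name must be 0 or 1")
    return " ".join([font_name] + [str(int(flag)) for flag in flags])


def xheights_line(font_name, xheight):
    """Build one x heights line: the font name followed by the pixel x height."""
    return "{} {}".format(font_name, int(xheight))


if __name__ == "__main__":
    # 32 * 300 / 72 = 133.33..., which rounds to the 133 quoted in the -X text.
    print(round(em_height_px(32, 300)))
    # Hypothetical font: italic and serif, not bold, fixed pitch, or fraktur.
    print(font_properties_line("timesitalic", italic=1, serif=1))
    # Hypothetical x height; a real value would be measured from the rendered font.
    print(xheights_line("timesitalic", 60))
```

Writing one such line per font to two text files yields inputs in the shape that -F and -X are documented to expect. The x height itself has to be measured per font, since the page gives no formula for it.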

Check Out this Related Man Page

ADDFTINFO(1)						      General Commands Manual						      ADDFTINFO(1)

NAME
addftinfo - add information to troff font files for use with groff

SYNOPSIS
addftinfo [ -v ] [ -param value... ] res unitwidth font

DESCRIPTION
addftinfo reads a troff font file and adds some additional font-metric information that is used by the groff system. The font file with the information added is written on the standard output. The information added is guessed using some parametric information about the font and assumptions about the traditional troff names for characters. The main information added is the heights and depths of characters.

The res and unitwidth arguments should be the same as the corresponding parameters in the DESC file; font is the name of the file describing the font; if font ends with I the font will be assumed to be italic.

OPTIONS
-v prints the version number.

All other options change one of the parameters that are used to derive the heights and depths. Like the existing quantities in the font file, each value is in inches/res for a font whose point size is unitwidth. param must be one of:

x-height
    The height of lowercase letters without ascenders, such as x.

fig-height
    The height of figures (digits).

asc-height
    The height of characters with ascenders, such as b, d or l.

body-height
    The height of characters such as parentheses.

cap-height
    The height of uppercase letters such as A.

comma-depth
    The depth of a comma.

desc-depth
    The depth of characters with descenders, such as p, q, or y.

body-depth
    The depth of characters such as parentheses.

addftinfo makes no attempt to use the specified parameters to guess the unspecified parameters. If a parameter is not specified, the default will be used. The defaults are chosen to have reasonable values for a Times font.

SEE ALSO
groff_font(5), groff(1), groff_char(7)

Groff Version 1.21 31 December 2010 ADDFTINFO(1)
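
As a rough illustration of the addftinfo synopsis and parameter list above, the following Python sketch assembles an argument vector and rejects parameter names that are not in the documented list. The helper name addftinfo_argv and every value in the usage lines (72000, 1000, "TR", 450) are hypothetical placeholders. Real values for res and unitwidth should come from the DESC file, as the description says.

```python
# Parameter names listed under OPTIONS for addftinfo.
VALID_PARAMS = (
    "x-height", "fig-height", "asc-height", "body-height",
    "cap-height", "comma-depth", "desc-depth", "body-depth",
)


def addftinfo_argv(res, unitwidth, font, params=None):
    """Build an argv list following: addftinfo [ -param value... ] res unitwidth font."""
    argv = ["addftinfo"]
    for name, value in (params or {}).items():
        if name not in VALID_PARAMS:
            raise ValueError("unknown parameter: {}".format(name))
        # Each parameter is passed as -name followed by its value.
        argv += ["-" + name, str(value)]
    argv += [str(res), str(unitwidth), font]
    return argv


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(addftinfo_argv(72000, 1000, "TR", {"x-height": 450}))
```

The resulting list could be handed to subprocess.run on a system where groff is installed. Because addftinfo writes the augmented font file to standard output, the caller would capture or redirect that output.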