inline-detox(1) [debian man page]

DETOX(1)						    BSD General Commands Manual 						  DETOX(1)

NAME
inline-detox -- clean up filenames (stream-based)

SYNOPSIS
inline-detox [-hnLrv] [-s sequence] [-f configfile] file ...

DESCRIPTION
The inline-detox utility can remove spaces and other such annoyances from streams. It can also translate or clean up Latin-1 (ISO 8859-1) characters encoded in 8-bit ASCII, Unicode characters encoded in UTF-8, and CGI escaped characters. Basically, it's detox, but it does not operate on files.

Sequences
inline-detox is driven by a configurable series of filters, called a sequence. Sequences are covered in more detail in detoxrc(5) and are discoverable with the -L option. Some examples of default sequences are iso8859_1 and utf_8.

Options
The main options:

-f configfile
     Use configfile instead of the default configuration files for loading translation sequences. No other config file will be parsed.

-h, --help
     Display helpful information.

-L   List the currently available sequences. When paired with -v, this option shows what filters are used in each sequence and any properties applied to the filters.

-r   Recurse into subdirectories.

-s sequence
     Use sequence instead of default.

-v   Be verbose about which files are being renamed.

-V   Show the current version of inline-detox.

Deprecated Options
Deprecated options are options that were available in earlier versions of inline-detox but have lost their meaning and are being phased out.

--remove-trailing
     Removes _ and - after periods (.) in filenames. This was first provided in the 0.9 series of inline-detox. After the introduction of sequences, it lost its meaning, as you could now determine the properties of wipeup through a particular sequence's configuration. It presently forces all instances of the wipeup filter to use remove trailing, regardless of what's actually in the config files.

FILES
detoxrc
     The system-wide detoxrc file.

~/.detoxrc
     A user's personal detoxrc. Normally it extends the system-wide detoxrc, unless -f has been specified, in which case it is ignored.

iso8859_1.tbl
     The default ISO 8859-1 translation table.

unicode.tbl
     The default Unicode (UTF-8) translation table.

EXAMPLES
echo Foo Bar | inline-detox -s iso8859_1 -v

Will run the sequence iso8859_1, listing any changes and returning the result to STDOUT.

SEE ALSO
detox(1), detoxrc(5), detox.tbl(5).

HISTORY
detox was originally designed to clean up files that I had received from friends which had been created using other operating systems. It's trivial to create a filename with spaces, parentheses, brackets, and ampersands under some operating systems. These have special meaning within FreeBSD and Linux, and cause problems when you go to access them. I created inline-detox to clean up these files.

AUTHORS
inline-detox was written by Doug Harple.

BUGS
Long options don't work under Solaris or Darwin.

An error in the config file will cause a segfault as it's going to print the offending word within the config file.

BSD				August 3, 2004				BSD
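
The idea of a sequence described under DESCRIPTION (an ordered series of filters, each applied to the output of the previous one) can be pictured with a small Python sketch. This is not inline-detox's implementation: the function names are illustrative stand-ins, the character set replaced by safe() is taken only from the characters listed under HISTORY, and the behaviour of each filter is a simplifying assumption. The real filters and their properties are defined in detoxrc(5).

```python
import re
from urllib.parse import unquote


def uncgi(text):
    """Assumed stand-in: decode CGI (percent) escapes such as %20."""
    return unquote(text)


def safe(text):
    """Assumed stand-in: replace whitespace, parentheses, brackets and
    ampersands with underscores."""
    return re.sub(r"[\s()\[\]{}&]", "_", text)


def wipeup(text):
    """Assumed stand-in: collapse runs of '_' or '-' into one character."""
    text = re.sub(r"_{2,}", "_", text)
    return re.sub(r"-{2,}", "-", text)


def run_sequence(filters, text):
    """Apply each filter in order, feeding each result to the next."""
    for apply_filter in filters:
        text = apply_filter(text)
    return text


# A hypothetical sequence; the order of the filters matters.
SEQUENCE = [uncgi, safe, wipeup]

if __name__ == "__main__":
    print(run_sequence(SEQUENCE, "Foo%20Bar (final).txt"))
    # prints: Foo_Bar_final_.txt
```

Keeping each filter a plain function of one string makes a sequence nothing more than a list, which mirrors how a configuration file can reorder or swap filters without changing the filters themselves.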

UTF(6)								   Games Manual 							    UTF(6)

NAME
UTF, Unicode, ASCII, rune - character set and format

DESCRIPTION
The Plan 9 character set and representation are based on the Unicode Standard and on the ISO multibyte UTF-8 encoding (Universal Character Set Transformation Format, 8 bits wide). The Unicode Standard represents its characters in 16 bits; UTF-8 represents such values in an 8-bit byte stream. Throughout this manual, UTF-8 is shortened to UTF.

In Plan 9, a rune is a 16-bit quantity representing a Unicode character. Internally, programs may store characters as runes. However, any external manifestation of textual information, in files or at the interface between programs, uses a machine-independent, byte-stream encoding called UTF.

UTF is designed so that the 7-bit ASCII set (values hexadecimal 00 to 7F) appears only as itself in the encoding. Runes with values above 7F appear as sequences of two or more bytes with values only from 80 to FF.

The UTF encoding of the Unicode Standard is backward compatible with ASCII: programs presented only with ASCII work on Plan 9 even if not written to deal with UTF, as do programs that deal with uninterpreted byte streams. However, programs that perform semantic processing on ASCII graphic characters must convert from UTF to runes in order to work properly with non-ASCII input. See rune(2).

Letting numbers be binary, a rune x is converted to a multibyte UTF sequence as follows:

     01.  x in [00000000.0bbbbbbb] -> 0bbbbbbb
     10.  x in [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
     11.  x in [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb

Conversion 01 provides a one-byte sequence that spans the ASCII character set in a compatible way. Conversions 10 and 11 represent higher-valued characters as sequences of two or three bytes with the high bit set. Plan 9 does not support the 4, 5, and 6 byte sequences proposed by X-Open. When there are multiple ways to encode a value, for example rune 0, the shortest encoding is used.

In the inverse mapping, any sequence except those described above is incorrect and is converted to rune hexadecimal 0080.
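
Conversions 01, 10 and 11 and the inverse mapping can be sketched in Python as follows. This is an illustration, not the Plan 9 library code in rune(2): the names rune_to_utf, utf_to_rune and RUNE_ERROR are hypothetical, and the choice to consume exactly one byte when a sequence is incorrect is an assumption that the text above does not state.

```python
RUNE_ERROR = 0x0080  # rune substituted for any incorrect sequence


def rune_to_utf(rune):
    """Encode one 16-bit rune with conversion 01, 10 or 11."""
    if not 0 <= rune <= 0xFFFF:
        raise ValueError("a rune is a 16-bit quantity")
    if rune <= 0x7F:
        # 01: 0bbbbbbb
        return bytes([rune])
    if rune <= 0x7FF:
        # 10: 110bbbbb, 10bbbbbb
        return bytes([0xC0 | (rune >> 6), 0x80 | (rune & 0x3F)])
    # 11: 1110bbbb, 10bbbbbb, 10bbbbbb
    return bytes([
        0xE0 | (rune >> 12),
        0x80 | ((rune >> 6) & 0x3F),
        0x80 | (rune & 0x3F),
    ])


def utf_to_rune(data, pos=0):
    """Decode one rune starting at data[pos].

    Returns (rune, bytes_consumed). Incorrect sequences, including
    encodings longer than the shortest form, yield RUNE_ERROR and
    consume one byte (the byte count is an assumption).
    """
    def is_continuation(i):
        return i < len(data) and (data[i] & 0xC0) == 0x80

    first = data[pos]
    if first < 0x80:
        return first, 1
    if (first & 0xE0) == 0xC0 and is_continuation(pos + 1):
        rune = ((first & 0x1F) << 6) | (data[pos + 1] & 0x3F)
        if rune > 0x7F:  # reject a two-byte form of a one-byte value
            return rune, 2
    elif ((first & 0xF0) == 0xE0 and is_continuation(pos + 1)
          and is_continuation(pos + 2)):
        rune = (((first & 0x0F) << 12)
                | ((data[pos + 1] & 0x3F) << 6)
                | (data[pos + 2] & 0x3F))
        if rune > 0x7FF:  # reject a three-byte form of a shorter value
            return rune, 3
    return RUNE_ERROR, 1
```

For example, rune_to_utf(0x20AC) gives the three bytes E2 82 AC, and utf_to_rune(b"\xc0\x80"), a two-byte spelling of rune 0, is rejected in favour of the shortest encoding and decodes to hexadecimal 0080.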
FILES
/lib/unicode
     table of characters and descriptions, suitable for look(1).

SEE ALSO
ascii(1), tcs(1), rune(2), keyboard(6), The Unicode Standard.

UTF(6)