ustr-import(1) [debian man page]

ustr(1) 						    Ustr String Library, tools							   ustr(1)

NAME

       ustr-import - ustr string library import tool

SYNOPSIS

       ustr-import [--32|--64] [-d][d] [-c] [-b x] [-e 1|0] [-s 1|0] section

DESCRIPTION

       This tool lets you use the Ustr string library without incurring a dependency on the library itself, so API/ABI compatibility is 100%
       (nothing changes unless you change it) and installing your application doesn't require the library to be pre-installed.

OPTIONS

       --32   If you installed with multilib, this runs the 32 bit variant (and installs the variable multilib build code as ustr-conf.h).

       --64   If you installed with multilib, this runs the 64 bit variant (and installs the variable multilib build code as ustr-conf.h).

       -d     Turn debugging on; USTR_ASSERT() now runs code, etc.

       -d     Turn extra debugging on, including End of String (EOS) markers that take up space. Note that you can use -dd to turn both on at once.

       -c     Use C files. This requires that you alter the build system to compile the C files and link them into your application. The default
	      is to provide only headers that you can include directly.

       -b     Specify the default reference count byte size: 0, 1, 2 or 4 (or 8 on 64 bit platforms). Note that 2 bytes is the minimum if you have
	      explicit size storage.

       -e     Specify the default exact sized allocations flag. Without this flag, allocations are rounded up to the nearest half power of two.

       -s     Specify the default explicit size storage flag. Without this flag, allocations have an implicit size based on their length; with
	      it, a size value is stored with the string (thus taking significantly more space for small strings, but the string doesn't need
	      to be reallocated when it grows or shrinks). Note that turning this on also increases the minimum sizes for length and
	      reference count storage.
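
       The rounding that applies when the -e flag is off can be illustrated with a small sketch. It assumes that "half power of two" means
       the series 4, 6, 8, 12, 16, 24, ... (each power of two and the value halfway to the next one) and that the smallest size is 4. Both
       are assumptions made for illustration, and the function name is hypothetical; this is not the library's code.

```python
def round_up_half_power_of_two(num: int, min_size: int = 4) -> int:
    """Return the smallest value in 4, 6, 8, 12, 16, 24, ... that is >= num.

    Assumption: the series starts at min_size and alternates between a power
    of two and 1.5 times that power of two.
    """
    size = min_size
    while size < num:
        half = size // 2
        size += half      # halfway step, e.g. 4 -> 6 or 8 -> 12
        if size >= num:
            break
        size += half      # next power of two, e.g. 6 -> 8 or 12 -> 16
    return size


if __name__ == "__main__":
    for requested in (1, 5, 7, 9, 13, 17, 100):
        print(requested, round_up_half_power_of_two(requested))
```

       Under these assumptions a request is never padded by more than about half of its size, whereas with exact sized allocations
       enabled no padding would be added at all.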

SECTIONS

       all    All of the following sections are included.

       b      Working with binary numbers in NBO format.

       cmp    Comparing, strcmp() for Ustr's, although the Ustr versions are safer and much faster.

       cntl   Control options dynamically.

       fmt    Formatted output, sprintf() for Ustr's.

       gdb    Copy just the .gdbinit file to the local dir.

       io     Input Output.

       ins    Inserting data.

       main   The core functions, including strcat(), strdup() and delete for Ustr's. Always safer and often much faster.

       parse  Parsing integers, i.e. nice versions of strtol().

       pool   A bundled memory pool API, to use with the ustrp functions.

       replace
	      Replacing all occurrences of data.

       sc     Shortcut functions for Ustr's.

       set    Setting data, strcpy() for Ustr's.

       split  Split the data, strtok() / strsep() for Ustr's.

       spn    Spanning, strspn() / strcspn() for Ustr's.

       srch   Searching, strchr() / strrchr() / strstr() for Ustr's, although the Ustr versions are safer and much faster.

       sub    Substituting data.

       utf8   Working with UTF8.

FILES

       /ustr/include/ustr-conf.h /ustr/include/ustr-conf-debug.h
	      In multilib, this is the header that chooses the correct conf.h header based on the byte size.

       /ustr/include/ustr*.h
	      The default "extern" header files.

       /usr/share/ustr-*/ustr-*-internal.h
	      Internal functions, used to implement the public interfaces.

       /usr/share/ustr-*/ustr-*-code.h
	      The code behind the public interfaces.

       /usr/share/ustr-*/ustr-*-code.c
	      The C files, which use the code header files to create objects.

       /usr/share/ustr-*/.gdbinit
	      The GDB init file containing macros to help inspect Ustr's in the debugger.

SEE ALSO

       ustr(3), ustr_const(3)

ustr-import 1.0.4						    03-Aug-2007 							   ustr(1)


Map8(3pm)						User Contributed Perl Documentation						 Map8(3pm)

NAME

       Unicode::Map8 - Mapping table between 8-bit chars and Unicode

SYNOPSIS

	require Unicode::Map8;
	my $no_map = Unicode::Map8->new("ISO646-NO") || die;
	my $l1_map = Unicode::Map8->new("latin1")    || die;

	my $ustr = $no_map->to16("V}re norske tegn b|r {res\n");
	my $lstr = $l1_map->to8($ustr);
	print $lstr;

	print $no_map->tou("V}re norske tegn b|r {res\n")->utf8

DESCRIPTION

       The Unicode::Map8 class implements efficient mapping tables between 8-bit character sets and 16-bit character sets like Unicode.  The
       tables are efficient both in terms of space allocated and translation speed.  The 16-bit strings are assumed to use network byte order.

       The following methods are available:

       $m = Unicode::Map8->new( [$charset] )
	   The object constructor creates new instances of the Unicode::Map8 class.  It takes an optional argument that specifies the name of an
	   8-bit character set to initialize mappings from.  The argument can also be the name of a mapping file.  If the charset/file cannot
	   be located, then the constructor returns undef.

	   If you omit the argument, then an empty mapping table is constructed.  You must then add mapping pairs to it using the addpair() method
	   described below.

       $m->addpair( $u8, $u16 );
	   Adds a new mapping pair to the mapping object.  It takes two arguments.  The first is the code value in the 8-bit character set and the
	   second is the corresponding code value in the 16-bit character set.	The same codes can be used multiple times (but using the same pair
	   has no effect).  The first definition for a code is the one that is used.

	   Consider the following example:

	     $m->addpair(0x20, 0x0020);
	     $m->addpair(0x20, 0x00A0);
	     $m->addpair(0xA0, 0x00A0);

	   It means that the characters 0x20 and 0xA0 in the 8-bit charset map to themselves in the 16-bit set, but 0x00A0 in the 16-bit
	   character set maps to 0x20.

       $m->default_to8( $u8 )
	   Set the code of the default character to use when mapping from 16-bit to 8-bit strings.  If there is no mapping pair defined for a
	   character then this default is substituted by to8() and recode8().

       $m->default_to16( $u16 )
	   Set the code of the default character to use when mapping from 8-bit to 16-bit strings. If there is no mapping pair defined for a
	   character then this default is used by to16(), tou() and recode8().

       $m->nostrict;
	   All undefined mappings are replaced with the identity mapping.  Undefined characters are normally just removed (or replaced with the
	   default if defined) when converting between character sets.

       $m->to8( $ustr );
	   Converts a 16-bit character string to the corresponding string in the 8-bit character set.

       $m->to16( $str );
	   Converts an 8-bit character string to the corresponding string in the 16-bit character set.

       $m->tou( $str );
	   Same as to16() but returns a Unicode::String object instead of a plain UCS2 string.

       $m->recode8($m2, $str);
	   Map the string $str from one 8-bit character set ($m) to another one ($m2).	Since we assume we know the mappings towards the common
	   16-bit encoding we can use this to convert between any of the 8-bit character sets.

       $m->to_char16( $u8 )
	   Maps a single 8-bit character code to a 16-bit code.  If the 8-bit character is unmapped then the constant NOCHAR is returned.  The
	   default is not used and the callback method is not invoked.

       $m->to_char8( $u16 )
	   Maps a single 16-bit character code to an 8-bit code. If the 16-bit character is unmapped then the constant NOCHAR is returned.  The
	   default is not used and the callback method is not invoked.
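
       The lookup behaviour described for addpair(), to_char16(), to_char8() and recode8() can be modelled with two dictionaries.  The
       sketch below is a toy stand-in written in Python, not the Unicode::Map8 implementation: the class name Map8Model is hypothetical,
       the value chosen for NOCHAR is an assumption, and recode8() here simply drops unmapped characters instead of applying defaults.

```python
NOCHAR = 0xFFFF  # assumed sentinel value meaning "no mapping defined"


class Map8Model:
    """Toy model of a two-way mapping table (illustration only)."""

    def __init__(self):
        self.to16 = {}  # 8-bit code  -> 16-bit code
        self.to8 = {}   # 16-bit code -> 8-bit code

    def addpair(self, u8, u16):
        # setdefault() keeps an existing entry, so the first definition
        # for a code wins, independently in each direction.
        self.to16.setdefault(u8, u16)
        self.to8.setdefault(u16, u8)

    def to_char16(self, u8):
        return self.to16.get(u8, NOCHAR)

    def to_char8(self, u16):
        return self.to8.get(u16, NOCHAR)

    def recode8(self, other, data):
        """Recode bytes from this charset to `other` via the 16-bit codes.

        Simplification: unmapped characters are dropped; defaults and
        callbacks are not modelled.
        """
        out = bytearray()
        for byte in data:
            u16 = self.to_char16(byte)
            if u16 == NOCHAR:
                continue
            u8 = other.to_char8(u16)
            if u8 != NOCHAR:
                out.append(u8)
        return bytes(out)


if __name__ == "__main__":
    m = Map8Model()
    m.addpair(0x20, 0x0020)
    m.addpair(0x20, 0x00A0)  # 0x20 already maps to 0x0020; only 0x00A0 -> 0x20 is new
    m.addpair(0xA0, 0x00A0)  # 0x00A0 already maps to 0x20; only 0xA0 -> 0x00A0 is new
    print(hex(m.to_char16(0x20)), hex(m.to_char16(0xA0)), hex(m.to_char8(0x00A0)))
```

       Running the three addpair() calls from the example through this model reproduces the result stated there: 0x20 and 0xA0 map to
       themselves going to 16 bits, while 0x00A0 maps back to 0x20.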

       The following callback methods are available.  You can override these methods by creating a subclass of Unicode::Map8.

       $m->unmapped_to8
	   When mapping to an 8-bit character string and there is no mapping defined (and no default either), then this method is called as the last
	   resort.  It is called with a single integer argument which is the code of the unmapped 16-bit character.  It is expected to return a
	   string that will be incorporated in the 8-bit string.  The default version of this method always returns an empty string.

	   Example:

	    package MyMapper;
	    @ISA=qw(Unicode::Map8);

	    sub unmapped_to8
	    {
	       my($self, $code) = @_;
	       require Unicode::CharName;
	       "<" . Unicode::CharName::uname($code) . ">";
	    }

       $m->unmapped_to16
	   Likewise, when mapping to a 16-bit character string and no mapping is defined, this method is called.  It should return a 16-bit
	   string with the bytes in network byte order.  The default version of this method always returns an empty string.
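
       The order of fallbacks described above (mapping pair first, then the default, then the callback as the last resort) can be sketched
       as follows.  This is a Python model for illustration only; the class names To8Model and TaggingMapper are hypothetical, and the
       subclass emits a "<U+XXXX>" tag rather than a character name.

```python
class To8Model:
    """Toy model of 16-bit to 8-bit conversion with fallbacks."""

    def __init__(self, pairs, default_to8=None):
        self.to8_table = {}
        for u8, u16 in pairs:
            self.to8_table.setdefault(u16, u8)  # first definition wins
        self.default_to8 = default_to8

    def unmapped_to8(self, code):
        # Default callback: contribute nothing for an unmapped character.
        return b""

    def to8(self, codes):
        out = bytearray()
        for code in codes:
            if code in self.to8_table:
                out.append(self.to8_table[code])   # 1. explicit mapping pair
            elif self.default_to8 is not None:
                out.append(self.default_to8)       # 2. default character
            else:
                out += self.unmapped_to8(code)     # 3. callback, last resort
        return bytes(out)


class TaggingMapper(To8Model):
    """Subclass that overrides the callback, like MyMapper in the example."""

    def unmapped_to8(self, code):
        return b"<U+%04X>" % code


if __name__ == "__main__":
    latin1 = [(c, c) for c in range(256)]
    text = [0x48, 0x263A, 0x69]  # "H", a character outside Latin-1, "i"
    print(To8Model(latin1).to8(text))
    print(To8Model(latin1, default_to8=0x3F).to8(text))
    print(TaggingMapper(latin1).to8(text))
```

       Because the callback is only reached when no default is set, a subclass that overrides it has no effect on a mapper that also
       defines a default character.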

FILES

       The Unicode::Map8 constructor can parse two different file formats; a binary format and a textual format.

       The binary format is simple.  It consists of a sequence of 16-bit integer pairs in network byte order.  The first pair should contain the
       magic value 0xFFFE, 0x0001.  Of each pair, the first value is the code of an 8-bit character and the second is the code of the 16-bit
       character.  It follows from this that the first value should be less than 256.
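
       A reader for this layout might look like the following Python sketch.  The function name parse_binary_map is hypothetical, and
       the error handling is an assumption about what a strict reader could do, not a description of what the module does.

```python
import struct

MAGIC = (0xFFFE, 0x0001)  # first pair of a binary mapping file


def parse_binary_map(data):
    """Return the (u8, u16) pairs that follow the magic pair."""
    if len(data) % 4 != 0:
        raise ValueError("length is not a whole number of 16-bit pairs")
    # ">HH" = two unsigned 16-bit integers in big-endian (network) byte order
    pairs = list(struct.iter_unpack(">HH", data))
    if not pairs or pairs[0] != MAGIC:
        raise ValueError("missing magic value 0xFFFE, 0x0001")
    for u8, u16 in pairs[1:]:
        if u8 > 0xFF:
            raise ValueError("8-bit code out of range: 0x%04X" % u8)
    return pairs[1:]


if __name__ == "__main__":
    sample = struct.pack(">6H", 0xFFFE, 0x0001, 0x41, 0x0041, 0xA0, 0x00A0)
    print(parse_binary_map(sample))
```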

       The textual format consists of lines, each of which is either a comment (first non-blank character is '#'), a completely blank line or a line with two
       hexadecimal numbers.  The hexadecimal numbers must be preceded by "0x" as in C and Perl.  This is the same format used by the Unicode
       mapping files available from <URL:ftp://ftp.unicode.org/Public>.
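
       The textual format is simple enough to parse in a few lines.  The Python sketch below uses a hypothetical function name,
       parse_text_map, and assumes that anything after the two numbers on a line (such as a trailing comment) can be ignored.

```python
def parse_text_map(text):
    """Return (u8, u16) pairs from a textual mapping file."""
    pairs = []
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        fields = stripped.split()
        # Both numbers must carry the "0x" prefix, as in C and Perl.
        if len(fields) < 2 or not all(f.lower().startswith("0x") for f in fields[:2]):
            raise ValueError("line %d: expected two 0x-prefixed numbers" % lineno)
        # Assumption: text after the second number is ignored.
        pairs.append((int(fields[0], 16), int(fields[1], 16)))
    return pairs


if __name__ == "__main__":
    sample = """\
# hypothetical sample mapping
0x41 0x0041

   # indented comment
0xA0 0x00A0   # trailing text
"""
    print(parse_text_map(sample))
```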

       The mapping table files are installed in the Unicode/Map8/maps directory somewhere in the Perl @INC path.  The variable
       $Unicode::Map8::MAPS_DIR is the complete path name to this directory.  Binary mapping files are stored within this directory with the
       suffix .bin.  Textual mapping files are stored with the suffix .txt.

       The scripts map8_bin2txt and map8_txt2bin can translate between these mapping file formats.

       A special file called aliases within $MAPS_DIR specifies all the alias names that can be used to denote the various character sets.  The
       first name on each line is the real file name and the rest are alias names separated by spaces.
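
       Resolving an alias then amounts to a dictionary lookup.  In the Python sketch below the function name parse_aliases and the
       sample file contents are hypothetical, and names are matched exactly as written, which is an assumption.

```python
def parse_aliases(text):
    """Map every name on a line (real name and aliases) to the real file name."""
    name_to_file = {}
    for line in text.splitlines():
        names = line.split()
        if not names:
            continue  # skip blank lines
        real_name = names[0]
        for name in names:
            # Keep the first definition if a name appears more than once.
            name_to_file.setdefault(name, real_name)
    return name_to_file


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    sample = "charset_a alias_a1 alias_a2\n\ncharset_b alias_b1\n"
    table = parse_aliases(sample)
    print(table["alias_a2"], table["charset_b"])
```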

       The "umap --list" command can be used to list the character sets supported.

BUGS

       Does not handle Unicode surrogate pairs as a single character.

SEE ALSO

       umap(1), Unicode::String

COPYRIGHT

       Copyright 1998 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.14.2							    2010-01-18								 Map8(3pm)
