idnconv(1) User Commands idnconv(1)
NAME
idnconv - Internationalized Domain Name (IDN) encoding conversion utility
SYNOPSIS
idnconv [-i in-code | --in in-code | -f in-code | --from in-code] [-o out-code | --out out-code | -t out-code | --to out-code]
[-a | --asciicheck | --ascii-check] [-A | --noasciicheck | --no-ascii-check] [-b | --bidicheck | --bidi-check]
[-B | --nobidicheck | --no-bidi-check] [-l | --lengthcheck | --length-check] [-L | --nolengthcheck | --no-length-check]
[-n | --nameprep] [-N | --nonameprep | --no-nameprep] [-u | --unassigncheck | --unassign-check]
[-U | --nounassigncheck | --no-unassign-check] [-h | --help] [-v | --version] [file...]
DESCRIPTION
idnconv converts the codeset or encoding of given text, if applicable. You can change the conversion with different options. idnconv reads
from file or standard input and writes the results to standard output.
When more than one IDN name or label is supplied as input, the names or labels can be delimited by white-space characters of the
POSIX locale or by the label separators defined in RFC 3490.
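The delimiting rule above can be sketched in Python. This is only an illustration, not the utility's actual parser: the function
names split_names and split_labels are hypothetical, the white-space set is the usual POSIX locale set, and the separator list is
the one given in RFC 3490, section 3.1.

```python
import re

# White-space characters of the POSIX locale: space, tab, newline,
# vertical tab, form feed, carriage return.
_WS_RE = re.compile(r"[ \t\n\v\f\r]+")

# Label separators from RFC 3490, section 3.1: U+002E (full stop),
# U+3002 (ideographic full stop), U+FF0E (fullwidth full stop),
# U+FF61 (halfwidth ideographic full stop).
_SEP_RE = re.compile("[\u002e\u3002\uff0e\uff61]")


def split_names(text):
    """Split input text into names on POSIX white space (hypothetical helper)."""
    return [name for name in _WS_RE.split(text) if name]


def split_labels(name):
    """Split one name into labels on any RFC 3490 label separator (hypothetical helper)."""
    return _SEP_RE.split(name)
```

For instance, split_labels("www\u3002example\uff0ecom") yields the three labels www, example, and com.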
The main use for idnconv is to convert Internationalized Domain Names in one codeset or encoding to another codeset or encoding. For
instance, you can use the utility to convert IDN names in UTF-8 codeset to ASCII Compatible Encoding (ACE) encoded IDN names in 7-bit
ASCII. For any other codeset conversion purposes, use iconv(1) instead.
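As a rough illustration of the UTF-8 to ACE conversion described above, the following Python sketch uses the idna codec from the
Python standard library, which implements RFC 3490. It is not how idnconv is implemented, and the function name to_ace is
hypothetical.

```python
def to_ace(name_utf8: bytes) -> bytes:
    """Convert a UTF-8 encoded IDN name to its ACE form (hypothetical helper)."""
    # Decode the input codeset first; the idna codec then applies
    # Nameprep and Punycode to each label of the name.
    text = name_utf8.decode("utf-8")
    return text.encode("idna")
```

Labels that are already plain ASCII, such as "example", pass through unchanged; only labels with non-ASCII characters gain the
xn-- prefix.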
OPTIONS
The following options are supported:
-a | --asciicheck | --ascii-check
During the IDN conversion process, enforce ASCII character range checks.
This is identical to setting the UseSTD3ASCIIRules flag described in RFC 3490. For more details on the ASCII character range checks,
refer to idn_encodename(3EXT) and RFC 3490. This is the default.
-A | --noasciicheck | --no-ascii-check
During the IDN conversion process, do not perform ASCII character range checks.
This is identical to unsetting the UseSTD3ASCIIRules flag described in RFC 3490. For more details on the ASCII character range checks,
refer to idn_encodename(3EXT) and RFC 3490.
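The ASCII character range check that -a enables and -A disables can be approximated as follows. This Python sketch follows the
UseSTD3ASCIIRules step of RFC 3490 (only letters, digits, and hyphen among the ASCII code points, and no leading or trailing
hyphen); the function name is hypothetical and the code is not taken from idnconv.

```python
import string

# Letters, digits, and hyphen: the ASCII characters allowed in a host name label.
LDH = set(string.ascii_letters + string.digits + "-")


def passes_std3_ascii_rules(label: str) -> bool:
    """Return True if a single label satisfies the STD3 ASCII rules (hypothetical helper)."""
    for ch in label:
        # Only ASCII code points are restricted by this check;
        # non-ASCII code points are left to the other conversion steps.
        if ord(ch) < 0x80 and ch not in LDH:
            return False
    # A label must not begin or end with a hyphen.
    return not (label.startswith("-") or label.endswith("-"))
```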
-b | --bidicheck | --bidi-check
During the IDN conversion process, enforce the checks on bidirectional strings that are specified in RFC 3491 and RFC 3454.
This is the default.
-B | --nobidicheck | --no-bidi-check
During the IDN conversion process, do not perform the checks on bidirectional strings that are specified in RFC 3491 and RFC 3454.
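A minimal Python sketch of the bidirectional check follows, assuming the two structural requirements of RFC 3454, section 6: a
string that contains a right-to-left character (table D.1) must not contain a left-to-right character (table D.2), and must begin
and end with a right-to-left character. It uses the standard stringprep module, omits the prohibited-character step, and the
function name is hypothetical.

```python
import stringprep


def passes_bidi_check(label: str) -> bool:
    """Return True if the label meets the RFC 3454 bidi requirements sketched here."""
    # Strings without any RandALCat character (table D.1) are not restricted.
    if not any(stringprep.in_table_d1(ch) for ch in label):
        return True
    # A string with a RandALCat character must not contain any LCat character (table D.2).
    if any(stringprep.in_table_d2(ch) for ch in label):
        return False
    # The first and the last character must both be RandALCat characters.
    return stringprep.in_table_d1(label[0]) and stringprep.in_table_d1(label[-1])
```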
-h | --help
Print information about the utility and the options it supports.
All other options and operands, if any, are ignored.
-i in-code | --in in-code | -f in-code | --from in-code
Identify the input codeset with the in-code argument. All iconv code conversion names that can be converted to UTF-8 can be used as the
value of the in-code. If in-code is not supplied, the current locale's codeset is assumed as the codeset of the input. The utility also
checks each individual name in the actual input and, if a name is in ACE, assumes ACE as the in-code for that name.
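How the utility recognizes an ACE name is not described here. One plausible approach, sketched in Python under that assumption,
is to look for the ACE prefix xn-- defined in RFC 3490 on any label of an all-ASCII name. The function name is hypothetical.

```python
ACE_PREFIX = "xn--"  # ACE prefix defined in RFC 3490


def looks_like_ace(name: str) -> bool:
    """Guess whether a name is ACE encoded (assumed heuristic, not idnconv's own logic)."""
    # An ACE name consists of ASCII characters only.
    if not name.isascii():
        return False
    # The prefix is matched without regard to case.
    return any(label.lower().startswith(ACE_PREFIX) for label in name.split("."))
```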
-l | --lengthcheck | --length-check
During the IDN conversion process, enforce the label length check.
This ensures that the length of each label is in the range of 1 to 63. See idn_encodename(3EXT) and RFC 3490. This is the default.
-L | --nolengthcheck | --no-length-check
During the IDN conversion process, do not perform the label length check.
See idn_encodename(3EXT) and RFC 3490.
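The label length rule can be expressed in a few lines of Python. This sketch assumes the check is applied to each dot-separated
label of the ACE form of a name and does not special-case a trailing dot; the function name is hypothetical.

```python
def passes_length_check(ace_name: str) -> bool:
    """Return True if every dot-separated label is 1 to 63 characters long."""
    return all(1 <= len(label) <= 63 for label in ace_name.split("."))
```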
-n | --nameprep
During the IDN conversion process, enforce the Nameprep step as specified in RFC 3490, RFC 3491, and RFC 3454. This is the default.
-N | --nonameprep | --no-nameprep
During the IDN conversion process, do not perform the Nameprep step. For more details on Nameprep, refer to idn_encodename(3EXT),
RFC 3490, RFC 3491, and RFC 3454.
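To give a feel for what the Nameprep step does to a label, the following Python sketch calls the nameprep function from the
standard encodings.idna module, which maps, normalizes, and checks a label per RFC 3491. It illustrates the step only; the wrapper
name prepare_label is hypothetical and idnconv does not use Python.

```python
from encodings.idna import nameprep


def prepare_label(label: str) -> str:
    """Apply Nameprep (RFC 3491) to one label; raises UnicodeError on prohibited input."""
    return nameprep(label)
```

Case folding turns "Example" into "example", characters that are mapped to nothing (such as the soft hyphen U+00AD) are dropped,
and the German sharp s is folded to "ss".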
-o out-code | --out out-code | -t out-code | --to out-code
Identify the output codeset with the out-code argument.
All iconv code conversion names that can be converted from UTF-8 can be used as the value of the out-code. If out-code is not supplied,
the current locale's codeset is assumed as the codeset of the output. If the in-code is ACE, the utility tries to convert the names in
the actual input to non-ACE IDN names in the output codeset.
-u | --unassigncheck | --unassign-check
During the IDN conversion process, enforce unassigned character checking.
This is identical to unsetting the AllowUnassigned flag described in RFC 3490. This option is useful when the IDN names are
converted for storage or are to be given to server machines. For more details on unassigned character checking, refer to RFC
3490, RFC 3491, and RFC 3454. This is the default.
-U | --nounassigncheck | --no-unassign-check
During the IDN conversion process, do not perform unassigned character checking.
This is identical to setting the AllowUnassigned flag described in RFC 3490. This option is useful when the IDN names are converted
for query purposes. For more details on unassigned character checking, refer to RFC 3490, RFC 3491, and RFC 3454.
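The unassigned character check can be illustrated with table A.1 of RFC 3454, which lists the code points that are unassigned in
Unicode 3.2. The Python sketch below uses stringprep.in_table_a1 from the standard library; the function name find_unassigned is
hypothetical and the code is not taken from idnconv.

```python
import stringprep


def find_unassigned(label: str) -> list:
    """Return the characters of label that are unassigned in Unicode 3.2 (RFC 3454, table A.1)."""
    return [ch for ch in label if stringprep.in_table_a1(ch)]
```

With checking enforced (-u), a non-empty result would correspond to a failure; with -U, such characters would be allowed through.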
-v | --version
Print information about the utility's name, version, and legal status. All other options and operands, if any, are ignored.
OPERANDS
The following operands are supported:
file A path name of the input file to be converted. If file is omitted, the standard input is used.
EXAMPLES
Example 1: Converting IDN Names
The following example converts IDN names.
It reads names in the current locale's codeset from standard input, converts them, and writes the results to the file results.txt. If
the names given to the utility are in ACE, the results are non-ACE IDN names in the current locale's codeset. If the names given to the
utility are non-ACE IDN names, the results are IDN names in ACE.
example% idnconv > results.txt
Example 2: Converting an ACE Encoded IDN Name
The following example converts an ACE encoded IDN name into an IDN name in UTF-8.
It reads xn--1lq90i, which is in ACE encoding, from standard input. It writes the converted results to the file Beijing-UTF-8.txt. The
file contains the name Beijing written as two Chinese characters in the UTF-8 codeset.
example% idnconv -t UTF-8 > Beijing-UTF-8.txt
xn--1lq90i
<CTRL>d
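The same ACE to UTF-8 conversion can be cross-checked with the idna codec of the Python standard library. This is an independent
sketch, not part of idnconv, and the function name is hypothetical.

```python
def ace_to_utf8(ace_name: bytes) -> bytes:
    """Decode an ACE encoded name and return it as UTF-8 bytes (hypothetical helper)."""
    return ace_name.decode("idna").encode("utf-8")
```

For xn--1lq90i the result is the six bytes E5 8C 97 E4 BA AC, the UTF-8 encoding of the two characters U+5317 and U+4EAC.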
Example 3: Converting Names in KOI8-R Cyrillic Single Byte Codeset
The following example converts names in KOI8-R Cyrillic single byte codeset to ACE encoded names.
It reads from file inputfile.txt which is in KOI8-R. It writes the converted results to standard output. The results are in ACE encoding.
example% idnconv --in KOI8-R --out ACE inputfile.txt
xn--80adxhks
xn--90aqflb3d1a
xn--80aesccdb4a2a8c
example%
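For comparison, a KOI8-R to ACE conversion can be sketched with the koi8_r and idna codecs of the Python standard library. The
contents of inputfile.txt are not shown above, so the sketch only assumes that the input is a KOI8-R encoded name; the function
name is hypothetical.

```python
def koi8r_to_ace(raw: bytes) -> bytes:
    """Convert one KOI8-R encoded name to ACE (hypothetical helper)."""
    # Decode from the single-byte Cyrillic codeset, then encode per RFC 3490.
    return raw.decode("koi8_r").encode("idna")
```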
Example 4: Converting Names for Storing Purpose
The following example converts names for storing purposes.
It reads from the file inputfile.txt, which is in ISO8859-1. It converts and writes the results to outputfile.txt in ACE. The resulting
ACE names are suitable for use as server names.
example% idnconv --from ISO8859-1 --to ACE --unassign-check inputfile.txt > outputfile.txt
Example 5: Converting Names for Query Purposes
The following example converts names for query purposes.
It reads from standard input in the current locale's codeset. It converts and writes the results to outputfile.txt in ACE:
example% idnconv -U -t ACE > outputfile.txt
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of idnconv: LANG, LC_ALL, LC_CTYPE,
LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 Successful completion.
1 Unsupported in-code or out-code value.
2 ASCII character range checking has failed.
3 Checks on bidirectional strings have failed.
4 Label length checking has failed.
5 Nameprep step reported an error.
6 Unassigned character has been found.
7 Illegal or unknown option has been supplied.
8 Input file cannot be found.
9 Not enough memory.
10 A conversion error occurred during internal iconv code conversions.
11 A non-identical code conversion occurred during internal iconv code conversions.
>11 Unspecified error occurred.
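A script that calls idnconv can map these exit values back to messages. The following Python sketch assumes idnconv is on the
PATH; the names EXIT_STATUS, describe_exit_status, and run_idnconv are hypothetical.

```python
import subprocess

# Exit values as listed in the EXIT STATUS section.
EXIT_STATUS = {
    0: "Successful completion.",
    1: "Unsupported in-code or out-code value.",
    2: "ASCII character range checking has failed.",
    3: "Checks on bidirectional strings have failed.",
    4: "Label length checking has failed.",
    5: "Nameprep step reported an error.",
    6: "Unassigned character has been found.",
    7: "Illegal or unknown option has been supplied.",
    8: "Input file cannot be found.",
    9: "Not enough memory.",
    10: "A conversion error occurred during internal iconv code conversions.",
    11: "A non-identical code conversion occurred during internal iconv code conversions.",
}


def describe_exit_status(code: int) -> str:
    """Translate an idnconv exit value into the message documented for it."""
    if code > 11:
        return "Unspecified error occurred."
    return EXIT_STATUS.get(code, "Unknown exit status.")


def run_idnconv(args, data: bytes):
    """Run idnconv with the given options, feeding data on standard input."""
    proc = subprocess.run(["idnconv", *args], input=data, capture_output=True)
    return proc.stdout, describe_exit_status(proc.returncode)
```

A call such as run_idnconv(["-t", "ACE"], data) would return the converted bytes together with the status message.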
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWidnu |
+-----------------------------+-----------------------------+
|Interface Stability |Evolving |
+-----------------------------+-----------------------------+
SEE ALSO
iconv(1), iconv(3C), iconv_close(3C), iconv_open(3C), idn_decodename(3EXT), idn_decodename2(3EXT), idn_encodename(3EXT), attributes(5),
environ(5), iconv(5)
RFC 3490 Internationalizing Domain Names in Applications (IDNA)
RFC 3491 Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)
RFC 3492 Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)
RFC 3454 Preparation of Internationalized Strings ("stringprep")
RFC 952 DoD Internet Host Table Specification
RFC 921 Domain Name System Implementation Schedule - Revised
STD 3, RFC 1122 Requirements for Internet Hosts -- Communication Layers
STD 3, RFC 1123 Requirements for Internet Hosts -- Application and Support
Unicode Standard Annex #15: Unicode Normalization Forms, Version 3.2.0. http://www.unicode.org/unicode/reports/tr15/tr15-22.html
International Language Environments Guide
NOTES
For general information on IDN in applications, refer to RFC 3490 and the International Language Environments Guide.
Names converted for storage and names converted for queries are handled differently, which matters when you decide on the names of
systems. For more details on these terms and distinctions, refer to RFC 3454.
SunOS 5.10 21 Jun 2004 idnconv(1)