isutf8(1) [debian man page]

isutf8(1)																 isutf8(1)

NAME
isutf8 - check whether files are valid UTF-8

SYNOPSIS
isutf8 [-hq] [--help] [--quiet] [file...]

DESCRIPTION
isutf8 checks whether files are syntactically valid UTF-8. Input is either files named on the command line, or the standard input. Notices about files with invalid UTF-8 are printed to standard output.

OPTIONS
-h, --help
    Print out a help summary.

-q, --quiet
    Don't print messages telling which files are invalid UTF-8, merely indicate it with the exit status.

EXIT STATUS
If the file is valid UTF-8, the exit status is zero. If the file is not valid UTF-8, or there is some error, the exit status is non-zero.

AUTHOR
Lars Wirzenius

SEE ALSO
utf8(7)

2006-02-19								 isutf8(1)
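
The check described under DESCRIPTION and EXIT STATUS above can be approximated in a few lines of Python. This is only a sketch, not the isutf8 implementation: the function names `is_valid_utf8` and `check`, the wording of the notice, and the choice to report read errors on standard error are assumptions made for illustration. It also assumes that "syntactically valid" corresponds to Python's strict UTF-8 decoder, which rejects overlong forms and surrogates; isutf8 may draw the line differently.

```python
import sys


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly with Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def check(paths, quiet=False):
    """Hypothetical analogue of `isutf8 [-q] [file...]`.

    Returns 0 if every file is valid UTF-8, and 1 if any file is
    invalid or cannot be read.
    """
    status = 0
    for path in paths:
        try:
            with open(path, "rb") as fh:
                ok = is_valid_utf8(fh.read())
        except OSError as exc:
            # Assumption: any read error counts as a failure.
            print(f"{path}: {exc}", file=sys.stderr)
            status = 1
            continue
        if not ok:
            status = 1
            if not quiet:
                # Notices go to standard output; the wording here is made up.
                print(f"{path}: invalid UTF-8")
    return status
```

A caller could pass the result to `sys.exit()` so that, as with the real command, the exit status alone tells a shell script whether every file passed.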

UTF8_ENCODE(3)								 1							    UTF8_ENCODE(3)

utf8_encode - Encodes an ISO-8859-1 string to UTF-8

SYNOPSIS
string utf8_encode (string $data)

DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version.

UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for sorting and such.

PHP encodes UTF-8 characters in up to four bytes, like this:

UTF-8 encoding
+-------+------+-------------------------------------+
| bytes | bits | representation                      |
+-------+------+-------------------------------------+
|   1   |   7  | 0bbbbbbb                            |
|   2   |  11  | 110bbbbb 10bbbbbb                   |
|   3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|   4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+-------+------+-------------------------------------+

Each b represents a bit that can be used to store character data.

PARAMETERS
o $data - An ISO-8859-1 string.

RETURN VALUES
Returns the UTF-8 translation of $data.

SEE ALSO
utf8_decode(3).

PHP Documentation Group							    UTF8_ENCODE(3)
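
The bit layout in the UTF-8 encoding table above can be followed by hand. The Python sketch below is an illustration rather than PHP's implementation, and the names `encode_code_point` and `latin1_to_utf8` are hypothetical. It relies on the fact that each ISO-8859-1 byte value equals its Unicode code point, so converting such a string only ever uses the 1-byte and 2-byte rows; the 3-byte and 4-byte rows are included to cover the whole table.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point using the bit layouts from the table."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp!r}")
    if cp < 0x80:        # 7 bits:  0bbbbbbb
        return bytes([cp])
    if cp < 0x800:       # 11 bits: 110bbbbb 10bbbbbb
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def latin1_to_utf8(data: bytes) -> bytes:
    """Treat each input byte as an ISO-8859-1 character and emit UTF-8."""
    return b"".join(encode_code_point(byte) for byte in data)
```

For example, the ISO-8859-1 byte 0xE9 (é) has the bit pattern 11101001, which does not fit in 7 bits, so it is split across the 2-byte form as 11000011 10101001, that is 0xC3 0xA9.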