Novell Doc: NDK: Libraries for C (LibC), Volume 1 - Unicode Character Classifications

12.2 Unicode Character Classifications

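The unitype function describes a specified Unicode character by returning one or more of the following flags, which are defined in the unilib.h file. Every flag except UNI_UNDEF (which is zero) occupies a single bit, so the flags can be ORed together, and a returned value can be tested for an individual classification with a bitwise AND.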

Constant	Value	Description
UNI_UNDEF	0x00000000	No classification
UNI_CNTRL	0x00000001	Control character
UNI_SPACE	0x00000002	Non-printing space
UNI_PRINT	0x00000004	Visible print character
UNI_SPECIAL	0x00000008	Dingbats, special symbols, etc.
UNI_PUNCT	0x00000010	General punctuation
UNI_DIGIT	0x00000020	Decimal digit
UNI_XDIGIT	0x00000040	Hexadecimal digit
UNI_RESERVED1	0x00000080	Reserved for future use
UNI_LOWER	0x00000100	Lowercase, if applicable
UNI_UPPER	0x00000200	Uppercase, if applicable
UNI_RESERVED2	0x00000400	Reserved for future use
UNI_ALPHA	0x00000800	Alphabetic character (neither a number nor punctuation)
UNI_LATIN	0x00001000	Latin-based
UNI_GREEK	0x00002000	Greek
UNI_CYRILLIC	0x00004000	Cyrillic
UNI_HEBREW	0x00008000	Hebrew
UNI_ARABIC	0x00010000	Arabic
UNI_CJK	0x00020000	Chinese, Japanese, or Korean characters
UNI_INDIAN	0x00040000	Indian: Devanagari, Bengali, Tamil, etc.
UNI_SEASIA	0x00080000	Southeast Asia: Thai, Lao
UNI_CENASIA	0x00100000	Central Asia: Armenian, Tibetan, Georgian
UNI_OTHER	0x80000000	None of the above
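
Because each classification occupies its own bit, a returned value can be tested or decoded with ordinary bitwise operations. The following is a minimal sketch of that idea, not part of LibC: the helper names (ex_has_all, ex_has_any, ex_describe) and the ex_flags table are hypothetical, and the flag values are copied from the table above only so that the sketch builds without unilib.h. It does not call unitype itself, because this section does not give that function's prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Values copied from the table above. On a LibC target they come from
 * unilib.h; they are repeated here only so the sketch builds on its own. */
#ifndef UNI_PRINT
#define UNI_PRINT   0x00000004UL
#define UNI_DIGIT   0x00000020UL
#define UNI_XDIGIT  0x00000040UL
#define UNI_ALPHA   0x00000800UL
#endif

/* Hypothetical lookup table: one entry per non-zero flag, in table order. */
static const struct { unsigned long bit; const char *name; } ex_flags[] = {
    { 0x00000001UL, "UNI_CNTRL" },     { 0x00000002UL, "UNI_SPACE" },
    { 0x00000004UL, "UNI_PRINT" },     { 0x00000008UL, "UNI_SPECIAL" },
    { 0x00000010UL, "UNI_PUNCT" },     { 0x00000020UL, "UNI_DIGIT" },
    { 0x00000040UL, "UNI_XDIGIT" },    { 0x00000080UL, "UNI_RESERVED1" },
    { 0x00000100UL, "UNI_LOWER" },     { 0x00000200UL, "UNI_UPPER" },
    { 0x00000400UL, "UNI_RESERVED2" }, { 0x00000800UL, "UNI_ALPHA" },
    { 0x00001000UL, "UNI_LATIN" },     { 0x00002000UL, "UNI_GREEK" },
    { 0x00004000UL, "UNI_CYRILLIC" },  { 0x00008000UL, "UNI_HEBREW" },
    { 0x00010000UL, "UNI_ARABIC" },    { 0x00020000UL, "UNI_CJK" },
    { 0x00040000UL, "UNI_INDIAN" },    { 0x00080000UL, "UNI_SEASIA" },
    { 0x00100000UL, "UNI_CENASIA" },   { 0x80000000UL, "UNI_OTHER" }
};

/* Nonzero if every bit in flags is set in mask. */
int ex_has_all(unsigned long mask, unsigned long flags)
{
    return (mask & flags) == flags;
}

/* Nonzero if at least one bit in flags is set in mask. */
int ex_has_any(unsigned long mask, unsigned long flags)
{
    return (mask & flags) != 0;
}

/* Writes the names of the flags set in mask into buf, separated by '|',
 * and returns how many names were written (the last one may be truncated
 * if buf is too small). A mask of zero yields "UNI_UNDEF" and returns 0. */
int ex_describe(unsigned long mask, char *buf, size_t size)
{
    size_t i, used = 0;
    int count = 0;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    if (mask == 0) {
        snprintf(buf, size, "UNI_UNDEF");
        return 0;
    }
    for (i = 0; i < sizeof ex_flags / sizeof ex_flags[0]; i++) {
        if ((mask & ex_flags[i].bit) == 0)
            continue;
        snprintf(buf + used, size - used, "%s%s",
                 count ? "|" : "", ex_flags[i].name);
        used = strlen(buf);
        count++;
        if (used + 1 >= size)
            break;              /* buffer full; stop appending */
    }
    return count;
}
```

On a LibC target, the mask passed to these helpers would be the value returned by unitype. For example, ex_has_all(mask, UNI_DIGIT | UNI_XDIGIT) is true only when both bits are set, whereas ex_has_any(mask, UNI_DIGIT | UNI_ALPHA) is true when either one is.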