libidnkit(3)
NAME
libidnkit, libidnkitlite - Internationalized Domain Name Handling
Libraries
SYNOPSIS
#include <idn/api.h>
idn_result_t
idn_nameinit(int load_file)
idn_result_t
idn_encodename(int actions, const char *from, char *to, size_t tolen)
idn_result_t
idn_decodename(int actions, const char *from, char *to, size_t tolen)
idn_result_t
idn_decodename2(int actions, const char *from, char *to, size_t tolen,
const char *auxencoding)
idn_result_t
idn_enable(int on_off)
#include <idn/result.h>
char *
idn_result_tostring(idn_result_t result)
OVERVIEW
The libidnkit and libidnkitlite libraries support various manipulations
of internationalized domain names, including:
- encoding conversion
- name preparation
They are designed according to the IDNA framework, in which each
application must perform the necessary preparation of internationalized
domain names before passing them to the resolver.
To help applications do this preparation, the libraries provide an
easy-to-use, high-level interface.
Both libraries provide almost the same API. The difference between
them is that libidnkit internally uses the iconv function to convert
from UTF-8 to the local encoding (such as ISO-8859-1, usually
determined by the current locale), and vice versa.
libidnkitlite is a lightweight version of libidnkit. It assumes that the
local encoding is UTF-8, so it never uses iconv.
This manual describes only a small subset of the API that the libraries
provide: the functions most important to application programmers. For
the rest of the API, refer to the idnkit specification document (which
is not yet available) or the header files, typically found under
`/usr/local/include/idn/' on your system.
DESCRIPTION
The idn_nameinit function initializes the library. If load_file is 0,
it also sets the default configuration; otherwise it tries to read a
configuration file. If idn_nameinit is called more than once, the
library is initialized only on the first call, but the configuration
procedure runs on every call.
If there are no errors, idn_nameinit returns idn_success. Otherwise,
the returned value indicates the cause of the error. See the section
``RETURN VALUES'' below for the error codes.
Usually you don't have to call this function explicitly, because it is
called implicitly the first time idn_encodename or idn_decodename is
called if idn_nameinit has not been called before. In that case,
initialization takes place without the configuration file.
The idn_encodename function performs name preparation and encoding
conversion on the internationalized domain name specified by from, and
stores the result in to, a buffer whose length is specified by tolen.
actions is a bitwise OR of the following macros, which specify the
subprocesses of the encoding process to apply.
IDN_LOCALCONV   Local encoding to UTF-8 conversion
IDN_DELIMMAP    Delimiter mapping
IDN_LOCALMAP    Local mapping
IDN_NAMEPREP    NAMEPREP mapping, normalization, prohibited
                character check and bidirectional string check
IDN_UNASCHECK   NAMEPREP unassigned codepoint check
IDN_ASCCHECK    ASCII range character check
IDN_IDNCONV     UTF-8 to IDN encoding conversion
IDN_LENCHECK    Label length check
Details of this encoding process can be found in the section ``NAME
ENCODING''.
For convenience, the macros IDN_ENCODE_QUERY, IDN_ENCODE_APP and
IDN_ENCODE_STORED are also provided. IDN_ENCODE_QUERY is used to
encode a ``query string'' (see the IDNA specification). It is equal to
(IDN_LOCALCONV | IDN_DELIMMAP | IDN_LOCALMAP | IDN_NAMEPREP
| IDN_IDNCONV | IDN_LENCHECK)
if you are using libidnkit, and equal to
(IDN_DELIMMAP | IDN_LOCALMAP | IDN_NAMEPREP | IDN_IDNCONV
| IDN_LENCHECK)
if you are using libidnkitlite.
IDN_ENCODE_APP is used by ordinary applications to encode a domain
name. It performs IDN_ASCCHECK in addition to IDN_ENCODE_QUERY.
IDN_ENCODE_STORED is used to encode a ``stored string'' (see the IDNA
specification). It performs IDN_ENCODE_APP plus IDN_UNASCHECK.
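As an illustration of how these convenience macros are composed, the
following sketch rebuilds the libidnkit combinations from individual
flags. The EX_ names, their bit values and the helper ex_has_action are
hypothetical and exist only for this example; the real macros and their
values are defined in <idn/api.h> and may differ.

```c
#include <assert.h>

/* Hypothetical bit values, for illustration only. The real values are
 * defined in <idn/api.h> and may differ. */
enum {
    EX_LOCALCONV = 1 << 0,
    EX_DELIMMAP  = 1 << 1,
    EX_LOCALMAP  = 1 << 2,
    EX_NAMEPREP  = 1 << 3,
    EX_UNASCHECK = 1 << 4,
    EX_ASCCHECK  = 1 << 5,
    EX_IDNCONV   = 1 << 6,
    EX_LENCHECK  = 1 << 7
};

/* Composition as described for libidnkit; the libidnkitlite variant
 * would omit EX_LOCALCONV. */
#define EX_ENCODE_QUERY  (EX_LOCALCONV | EX_DELIMMAP | EX_LOCALMAP | \
                          EX_NAMEPREP | EX_IDNCONV | EX_LENCHECK)
#define EX_ENCODE_APP    (EX_ENCODE_QUERY | EX_ASCCHECK)
#define EX_ENCODE_STORED (EX_ENCODE_APP | EX_UNASCHECK)

/* Return 1 if every bit of flag is set in actions, else 0. */
static int ex_has_action(int actions, int flag)
{
    assert(flag != 0);
    return (actions & flag) == flag;
}
```

With these definitions, ex_has_action(EX_ENCODE_APP, EX_ASCCHECK) is
true while ex_has_action(EX_ENCODE_QUERY, EX_ASCCHECK) is false, which
mirrors the relationship between the three macros described above.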
The idn_decodename function performs the reverse of idn_encodename. It
converts the internationalized domain name given by from, which is
represented in a special encoding called ACE, to the application's local
codeset and stores the result in to, a buffer whose length is specified
by tolen. As in idn_encodename, actions is a bitwise OR of the following
macros.
IDN_DELIMMAP    Delimiter mapping
IDN_NAMEPREP    NAMEPREP mapping, normalization, prohibited
                character check and bidirectional string check
IDN_UNASCHECK   NAMEPREP unassigned codepoint check
IDN_IDNCONV     IDN encoding to UTF-8 conversion
IDN_RTCHECK     Round trip check
IDN_ASCCHECK    ASCII range character check
IDN_LOCALCONV   UTF-8 to local encoding conversion
Details of this decoding process can be found in the section ``NAME
DECODING''.
For convenience, the macros IDN_DECODE_QUERY, IDN_DECODE_APP and
IDN_DECODE_STORED are also provided. IDN_DECODE_QUERY is used to
decode a ``query string'' (see the IDNA specification). It is equal to
(IDN_DELIMMAP | IDN_NAMEPREP | IDN_IDNCONV | IDN_RTCHECK
| IDN_LOCALCONV)
if you are using libidnkit, and equal to
(IDN_DELIMMAP | IDN_NAMEPREP | IDN_IDNCONV | IDN_RTCHECK)
if you are using libidnkitlite.
IDN_DECODE_APP is used by ordinary applications to decode a domain
name. It performs IDN_ASCCHECK in addition to IDN_DECODE_QUERY.
IDN_DECODE_STORED is used to decode a ``stored string'' (see the IDNA
specification). It performs IDN_DECODE_APP plus IDN_UNASCHECK.
The idn_decodename2 function provides the same functionality as
idn_decodename, except that the character encoding of from is taken to
be auxencoding. For example, if the IDN encoding is Punycode and
auxencoding is ISO 8859-2, the Punycode string stored in from is assumed
to be written in ISO 8859-2.
In the IDN decoding procedure, IDN_NAMEPREP is performed before
IDN_IDNCONV, and IDN_NAMEPREP converts some non-ASCII characters to
ASCII characters. The ACE string given by from may therefore contain
such non-ASCII characters. That is the reason idn_decodename2 exists.
All of the functions above return an error code of type idn_result_t.
Any code other than idn_success indicates some kind of failure.
The idn_result_tostring function takes an error code result and returns
a pointer to the corresponding message string.
NAME ENCODING
Name encoding is a process that transforms the specified
internationalized domain name into a string suitable for name
resolution. For each label in a given domain name, the encoding
processor performs:
(1) Convert to UTF-8 (IDN_LOCALCONV)
Convert the encoding of the given domain name from the application's
local encoding (e.g. ISO-8859-1) to UTF-8. Note that
libidnkitlite doesn't support this step.
(2) Delimiter mapping (IDN_DELIMMAP)
Map domain name delimiters to `.' (U+002E). The recognized
delimiters are: U+3002 (ideographic full stop), U+FF0E (fullwidth
full stop), U+FF61 (halfwidth ideographic full stop).
(3) Local mapping (IDN_LOCALMAP)
Apply character mapping whose rule is determined by the TLD of
the name.
(4) NAMEPREP (IDN_NAMEPREP, IDN_UNASCHECK)
Perform name preparation (NAMEPREP), which is a standard process
for name canonicalization of internationalized domain names.
NAMEPREP consists of 5 steps: mapping, normalization, prohibited
character check, bidirectional text check and unassigned codepoint
check. The first four steps are done by IDN_NAMEPREP, and
the last step is done by IDN_UNASCHECK.
(5) ASCII range character check (IDN_ASCCHECK)
Check whether the domain name contains a non-LDH ASCII character (an
ASCII character that is not alphanumeric or a hyphen), or whether it
begins or ends with a hyphen.
(6) Convert to ACE (IDN_IDNCONV)
Convert the NAMEPREPed name to a special encoding designed for
representing internationalized domain names.
The encoding is also known as ACE (ASCII Compatible Encoding)
since a string in the encoding is just like a traditional ASCII
domain name consisting of only letters, numbers and hyphens.
(7) Label length check (IDN_LENCHECK)
For each label, check the number of characters in it. It must
be in the range 1 to 63.
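Three of the steps above are simple enough to sketch in portable C. The
following is a minimal illustration, not idnkit's implementation. The
ex_ function names are hypothetical. The sketch assumes that the input
to delimiter mapping is valid UTF-8, that the ASCII range check looks
only at ASCII bytes and leaves non-ASCII bytes alone, and that the
length check runs on the ACE form, where one byte is one character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Step (2): rewrite U+3002, U+FF0E and U+FF61 (three bytes each in
 * UTF-8) as '.' in place. Returns the new length of the string. */
static size_t ex_map_delimiters(char *s)
{
    static const char *const delims[] = {
        "\xE3\x80\x82",   /* U+3002 ideographic full stop */
        "\xEF\xBC\x8E",   /* U+FF0E fullwidth full stop */
        "\xEF\xBD\xA1"    /* U+FF61 halfwidth ideographic full stop */
    };
    char *src = s, *dst = s;

    assert(s != NULL);
    while (*src != '\0') {
        size_t i;
        int mapped = 0;

        for (i = 0; i < sizeof(delims) / sizeof(delims[0]); i++) {
            if (strncmp(src, delims[i], 3) == 0) {
                *dst++ = '.';
                src += 3;
                mapped = 1;
                break;
            }
        }
        if (!mapped)
            *dst++ = *src++;
    }
    *dst = '\0';
    return (size_t)(dst - s);
}

/* Step (5): return 0 if the label begins or ends with a hyphen or
 * contains an ASCII byte that is not a letter, digit or hyphen;
 * otherwise return 1. Empty labels are left to the length check. */
static int ex_is_ldh_label(const char *label, size_t len)
{
    size_t i;

    if (len > 0 && (label[0] == '-' || label[len - 1] == '-'))
        return 0;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)label[i];

        if (c >= 0x80)
            continue;   /* non-ASCII byte: not examined by this check */
        if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
              (c >= '0' && c <= '9') || c == '-'))
            return 0;
    }
    return 1;
}

/* Step (7): return 1 if every dot-separated label of name is 1 to 63
 * bytes long, else 0. */
static int ex_labels_in_range(const char *name)
{
    size_t len = 0;

    for (;; name++) {
        if (*name == '.' || *name == '\0') {
            if (len < 1 || len > 63)
                return 0;
            if (*name == '\0')
                return 1;
            len = 0;
        } else {
            len++;
        }
    }
}
```

Note that ex_labels_in_range rejects any empty label, so a name written
with a trailing dot would fail in this sketch; whether the library
treats such names specially is not covered by this manual.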
There are many configuration parameters for this process, such as the
ACE or the local mapping rules. These parameters are read from idnkit's
default configuration file, idn.conf. See idn.conf(5) for
details.
NAME DECODING
Name decoding is the reverse of name encoding. It transforms the
specified internationalized domain name from a special encoding
suitable for name resolution to the normal name string in the
application's current codeset. However, name encoding and name decoding
are not symmetric.
For each label in a given domain name, the decoding processor performs:
(1) Delimiter mapping (IDN_DELIMMAP)
Map domain name delimiters to `.' (U+002E). The recognized
delimiters are: U+3002 (ideographic full stop), U+FF0E (fullwidth
full stop), U+FF61 (halfwidth ideographic full stop).
(2) NAMEPREP (IDN_NAMEPREP, IDN_UNASCHECK)
Perform name preparation (NAMEPREP), which is a standard process
for name canonicalization of internationalized domain names.
(3) Convert to UTF-8 (IDN_IDNCONV)
Convert the encoding of the given domain name from ACE to UTF-8.
(4) Round trip check (IDN_RTCHECK)
Encode the result of (3) using the ``NAME ENCODING'' scheme, and
then compare it with the result of step (2). If they differ, the
check fails. If IDN_UNASCHECK, IDN_ASCCHECK or both are specified,
they are also performed as part of this encoding.
(5) Convert to local encoding (IDN_LOCALCONV)
Convert the result of (3) from UTF-8 to the application's local
encoding (e.g. ISO-8859-1). Note that libidnkitlite doesn't
support this step.
If the prohibited character check, unassigned codepoint check or
bidirectional text check at step (2) fails, or the round trip check at
step (4) fails, the original input label is returned.
The configuration parameters for this process are also read from the
configuration file idn.conf.
IDN_DISABLE
If the IDN_DISABLE environment variable is defined at run time, the
libraries disable internationalized domain name support by default.
In this case, idn_encodename and idn_decodename do not encode or decode
the input name; they simply output a copy of the input name as
the result of encoding or decoding.
If your application should always enable multilingual domain name
support regardless of whether IDN_DISABLE is defined, call
idn_enable(1)
before performing encoding/decoding.
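The interaction between the environment variable and an explicit call
can be pictured with the following sketch. It is a model of the
behaviour described above, not the library's code. The function
ex_idn_is_enabled and its explicit_setting parameter are hypothetical,
and the sketch assumes that an explicit idn_enable call always takes
precedence over IDN_DISABLE and that idn_enable(0) turns support off.

```c
#include <assert.h>
#include <stdlib.h>

/* explicit_setting is -1 if idn_enable was never called; otherwise it
 * holds the on_off value passed to the most recent call.
 * Returns 1 if IDN processing would be active in this model, else 0. */
static int ex_idn_is_enabled(int explicit_setting)
{
    assert(explicit_setting >= -1);
    if (explicit_setting >= 0)
        return explicit_setting != 0;       /* explicit call wins */
    /* No explicit call: enabled unless IDN_DISABLE is defined, with
     * any value, including an empty one. */
    return getenv("IDN_DISABLE") == NULL;
}
```

In this model ex_idn_is_enabled(1) is always 1, which corresponds to
calling idn_enable(1) before any encoding or decoding.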
RETURN VALUES
Most of the API functions return values of type idn_result_t in order
to indicate the status of the call.
The following is a complete list of the status codes. Note that some
of them are never returned by the functions described in this manual.
idn_success             Not an error. The call succeeded.
idn_notfound            Specified information does not exist.
idn_invalid_encoding    The encoding of the specified string is invalid.
idn_invalid_syntax      There is a syntax error in the configuration file.
idn_invalid_name        The specified name is not valid.
idn_invalid_message     The specified DNS message is not valid.
idn_invalid_action      The specified action contains invalid flags.
idn_invalid_codepoint   The specified Unicode code point value is not valid.
idn_invalid_length      The number of characters in an ACE label is not in
                        the range 1 to 63.
idn_buffer_overflow     The specified buffer is too small to hold the result.
idn_noentry             The specified key does not exist in the hash table.
idn_nomemory            Memory allocation using malloc failed.
idn_nofile              The specified file could not be opened.
idn_nomapping           Some characters do not have the mapping to the
                        target character set.
idn_context_required    Context information is required.
idn_prohibited          The specified string contains some prohibited
                        characters.
idn_failure             Generic error which is not covered by the above codes.
EXAMPLES
To get the address of an internationalized domain name given in the
application's local codeset, use idn_encodename to convert the name to
the format suitable for passing to resolver functions.
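    #include <stdio.h>
    #include <stdlib.h>
    #include <netdb.h>
    #include <idn/api.h>
    #include <idn/result.h>
    ...
    idn_result_t r;
    char ace_name[256];
    struct hostent *hp;
    ...
    /* Encode the local-codeset name into ACE form for the resolver. */
    r = idn_encodename(IDN_ENCODE_APP, name, ace_name,
                       sizeof(ace_name));
    if (r != idn_success) {
        fprintf(stderr, "idn_encodename failed: %s\n",
                idn_result_tostring(r));
        exit(1);
    }
    hp = gethostbyname(ace_name);
    ...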
To decode the internationalized domain name returned from a resolver
function, use idn_decodename.
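    idn_result_t r;
    char local_name[256];
    struct hostent *hp;
    ...
    hp = gethostbyname(name);
    if (hp == NULL) {
        /* Lookup failed, so there is no name to decode. */
        fprintf(stderr, "gethostbyname failed\n");
        exit(1);
    }
    /* Decode the name returned by the resolver into the local codeset. */
    r = idn_decodename(IDN_DECODE_APP, hp->h_name, local_name,
                       sizeof(local_name));
    if (r != idn_success) {
        fprintf(stderr, "idn_decodename failed: %s\n",
                idn_result_tostring(r));
        exit(1);
    }
    printf("name: %s\n", local_name);
    ...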
SEE ALSO
idn.conf(5)
Mar 11, 2002 libidnkit(3)