utf man page on DragonFly

UTF(3)									UTF(3)

NAME
       runetochar,  chartorune,	 runelen, fullrune, utflen, utfrune, utfrrune,
       utfutf - Unicode Text Format functionality

SYNOPSIS
       #include <utf.h>

       int runetochar(char *cp, Rune *rp);

       int chartorune(Rune *rp, char *cp);

       int runelen(long r);

       int fullrune(char *cp, int n);

       int utflen(char *s);

       int utfbytes(char *s);

       char *utfrune(char *cp, long r);

       char *utfrrune(char *cp, long r);

       char *utfutf(char *big, char *little);

       int utf_snprintf(char *buf, size_t size, char *format, ...);

       int utfcmp(char *s1, char *s2);

       int utfncmp(char *s1, char *s2, int rc);

       char *utfcpy(char *dst, char *src);

       char *utfncpy(char *dst, char *src, int nbytes);

       char *utfcat(char *src, char *append);

       char *utfncat(char *src, char *append, int nbytes);

DESCRIPTION
       The UTF routines pack Unicode text into a standard character stream.
       To keep that stream compatible with existing text, the ASCII
       characters form the lowest 128 code points of UTF-8 (values 0 to 127)
       and are interchangeable between the two character sets.  A Rune is a
       Unicode character; its type is defined in the header file utf.h.

       runetochar translates a single Rune to a UTF sequence and  returns  the
       number  of  bytes produced. chartorune is the inverse of this function,
       returning the number of bytes consumed.	runelen returns the number  of
       bytes  in  the  encoding	 of  a Rune.  fullrune checks that the first n
       bytes of the UTF string cp contain a complete UTF encoding.
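
       The following is a minimal sketch of the byte layout that the four
       routines above deal with, not the library's own source.  The names
       SketchRune and sketch_* are hypothetical, the sketch assumes runes no
       larger than U+FFFF (at most three bytes), and the choice to decode a
       malformed sequence as U+FFFD while consuming one byte is an
       assumption.

```c
#include <assert.h>

typedef unsigned long SketchRune;   /* hypothetical stand-in for Rune */

/* Number of bytes needed to encode r (assumes r <= 0xFFFF). */
int sketch_runelen(SketchRune r)
{
    if (r < 0x80)
        return 1;
    if (r < 0x800)
        return 2;
    return 3;
}

/* Encode *rp at cp and return the number of bytes produced. */
int sketch_runetochar(char *cp, const SketchRune *rp)
{
    unsigned char *out = (unsigned char *)cp;
    SketchRune r = *rp;

    assert(r <= 0xFFFF);            /* assumption of this sketch */
    if (r < 0x80) {                 /* 0xxxxxxx */
        out[0] = (unsigned char)r;
        return 1;
    }
    if (r < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (r >> 6));
        out[1] = (unsigned char)(0x80 | (r & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (r >> 12));
    out[1] = (unsigned char)(0x80 | ((r >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (r & 0x3F));
    return 3;
}

/* Decode one rune from the NUL-terminated string cp; return bytes consumed. */
int sketch_chartorune(SketchRune *rp, const char *cp)
{
    const unsigned char *p = (const unsigned char *)cp;

    if (p[0] < 0x80) {
        *rp = p[0];
        return 1;
    }
    if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
        *rp = ((SketchRune)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
        return 2;
    }
    if ((p[0] & 0xF0) == 0xE0 && (p[1] & 0xC0) == 0x80 &&
        (p[2] & 0xC0) == 0x80) {
        *rp = ((SketchRune)(p[0] & 0x0F) << 12) |
              ((SketchRune)(p[1] & 0x3F) << 6) | (p[2] & 0x3F);
        return 3;
    }
    *rp = 0xFFFD;                   /* assumed error rune */
    return 1;
}

/* Nonzero if the first n bytes at cp hold one complete encoding. */
int sketch_fullrune(const char *cp, int n)
{
    unsigned char c;

    if (n <= 0)
        return 0;
    c = (unsigned char)cp[0];
    if ((c & 0xE0) == 0xC0)
        return n >= 2;
    if ((c & 0xF0) == 0xE0)
        return n >= 3;
    return 1;                       /* ASCII, or a byte decoded on its own */
}
```

       In this sketch the euro sign U+20AC occupies three bytes, so a buffer
       holding only its first two bytes is not yet a full rune.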

       utflen returns the number of runes in a UTF string.  utfbytes returns
       the  number of bytes in a UTF string.  utfrune returns a pointer to the
       first occurrence of a rune in a UTF string.  utfrrune returns a pointer
       to  the last.  utfutf searches for the first occurrence of a UTF string
       in another UTF string.
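
       The difference between the rune count and the byte count can be
       sketched as follows.  The sketch_* names are hypothetical and the
       input is assumed to be well-formed UTF-8; this illustrates the
       idea and is not the library's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count runes: every byte that is not a 10xxxxxx continuation byte
 * starts a new rune (assumes well-formed input). */
int sketch_utflen(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Count bytes: for a NUL-terminated string this is just strlen. */
int sketch_utfbytes(const char *s)
{
    assert(s != NULL);
    return (int)strlen(s);
}
```

       For a string with one two-byte rune among four ASCII characters, the
       first function yields 5 and the second yields 6.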

       utf_snprintf is a particularly dumb implementation of snprintf for UTF
       strings: it only interprets the %%, %s and %d sequences in the format
       string, and does no field width calculation on those.

       utfcmp compares two strings lexicographically, Rune by Rune, and
       returns a value greater than zero, equal to zero, or less than zero
       depending on whether the first UTF string is greater than, the same as,
       or less than the second string.  utfncmp does the same comparison as
       utfcmp, but compares at most rc Runes.
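
       A bounded comparison of this kind can be sketched without decoding,
       because unsigned byte order and code point order agree for
       well-formed UTF-8.  The name sketch_utfncmp is hypothetical, and the
       byte-wise approach is a substitute technique chosen for brevity, not
       necessarily how the library does it.

```c
#include <assert.h>

/* Compare at most rc runes of two well-formed UTF-8 strings.
 * Returns <0, 0 or >0 in the manner of strncmp. */
int sketch_utfncmp(const char *s1, const char *s2, int rc)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;
    int runes = 0;

    assert(rc >= 0);
    for (;; a++, b++) {
        /* A non-continuation byte begins a new rune; stop once the
         * rune budget is used up. */
        if ((*a & 0xC0) != 0x80 && ++runes > rc)
            return 0;
        if (*a != *b)
            return *a < *b ? -1 : 1;
        if (*a == '\0')
            return 0;
    }
}
```

       With a limit of two runes, "abc" and "abd" compare equal here; with a
       limit of three, the first string sorts before the second.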

       utfcpy copies from source to destination, Rune by Rune, and returns
       the dst argument.  No bounds checking is done on the number of Runes
       copied, or on their individual sizes.  utfncpy copies at most nbytes
       bytes from source to destination, stopping when a null Rune is found
       in the source.  If fewer than nbytes bytes are copied, the destination
       string is padded with null (0) bytes.  If the source holds nbytes
       bytes or more, no null bytes are added.  The dst argument is returned.
       utfcat appends the UTF string append onto the UTF string src.  utfncat
       appends the UTF string append onto the UTF string src, bearing in mind
       that the buffer src is only nbytes long.
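
       The padding rule for the bounded copy can be sketched as below.  The
       name sketch_utfncpy is hypothetical, and the sketch works byte by
       byte; whether the real routine avoids splitting a multi-byte Rune at
       the nbytes limit is not stated here, so no such handling is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Copy at most nbytes bytes from src to dst, padding with null bytes
 * when src is shorter.  If src holds nbytes bytes or more, dst is NOT
 * null-terminated, so the caller must terminate it. */
char *sketch_utfncpy(char *dst, const char *src, int nbytes)
{
    int i = 0;

    assert(dst != NULL && src != NULL && nbytes >= 0);
    for (; i < nbytes && src[i] != '\0'; i++)
        dst[i] = src[i];
    for (; i < nbytes; i++)
        dst[i] = '\0';
    return dst;
}
```

       A caller that needs a terminated result can copy nbytes - 1 bytes and
       store a null byte in the last position itself.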

IMPLEMENTATION
       This implementation of UTF, nominally UTF-8, can encode a null Unicode
       character using either a one-byte or a two-byte encoding.  Typically,
       Plan 9 uses a one-byte encoding, whilst Java uses a two-byte encoding.
       The Plan 9 style of encoding makes backwards compatibility much
       easier, and loses nothing: all the Java functionality is there; there
       are no embedded null bytes in a UTF string, due to the encoding of the
       second and third bytes of a multi-byte sequence; and ordinary C
       strings are recognised as well, which is not the case in Java.  By
       default, the one-byte encoding of the null character is used.
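
       The two encodings of the null character can be sketched as follows.
       The enum and function names are hypothetical and do not describe how
       this library selects the encoding; the two-byte form shown is the
       0xC0 0x80 pair used by Java's modified UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selector for the two null encodings. */
enum sketch_null_mode { SKETCH_NULL_ONE_BYTE, SKETCH_NULL_TWO_BYTE };

/* Write the encoding of U+0000 at out; return the bytes produced. */
int sketch_encode_null(unsigned char *out, enum sketch_null_mode mode)
{
    assert(out != NULL);
    if (mode == SKETCH_NULL_TWO_BYTE) {
        out[0] = 0xC0;              /* 110 00000 */
        out[1] = 0x80;              /* 10 000000 */
        return 2;
    }
    out[0] = 0x00;                  /* plain C string terminator */
    return 1;
}

/* Nonzero if p points at either encoding of U+0000. */
int sketch_is_null_rune(const unsigned char *p)
{
    assert(p != NULL);
    return p[0] == 0x00 || (p[0] == 0xC0 && p[1] == 0x80);
}
```

       Only the one-byte form doubles as an ordinary C string terminator,
       which is the backwards compatibility referred to above.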

       UTF-8  is  defined in X/Open Company Ltd., "File System Safe UCS Trans‐
       formation Format (FSS_UTF)", X/Open Preliminary Specification, Document
       Number: P316, which also appears in ISO/IEC 10646, Annex P.

BUGS
       Undoubtedly, these are many, and legion.

AUTHOR
       Written by Alistair Crooks (agc@amdahl.com, or
       agc@westley.demon.co.uk), from a draft document written by Rob Pike
       and Ken Thompson, detailing the implementation of UTF in the Plan 9
       operating system.

									UTF(3)