fwscanf(3S)


fwscanf, swscanf, wscanf, vfwscanf, vswscanf, vwscanf -- convert formatted wide/multibyte character input

Synopsis

   #include <wchar.h>
   

int fwscanf(FILE *strm, const wchar_t *format, .../* args */);

int swscanf(const wchar_t *s, const wchar_t *format, .../* args */);

int wscanf(const wchar_t *format, .../* args */);

#include <stdarg.h>

int vfwscanf(FILE *strm, const wchar_t *format, va_list args);

int vswscanf(const wchar_t *s, const wchar_t *format, va_list args);

int vwscanf(const wchar_t *format, va_list args);

Description

Each function reads wide/multibyte characters, interprets them, and stores the results through the arg pointers, under control of the wide character string format. Each function returns the number of successfully matched and assigned input items, or EOF if the input ended prior to any successful matches.

fwscanf and vfwscanf read multibyte characters from the stream strm.

wscanf and vwscanf read multibyte characters from the standard input stream, stdin.

swscanf and vswscanf read from the wide character string s.

The ``v'' functions take their pointer arguments through the single va_list object passed. See stdarg(5).
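
As an illustration of the ``v'' form, the following minimal sketch wraps vswscanf in a variadic function. The name scan_after_hash and the leading # marker are hypothetical, chosen only for this example:

```c
#include <stdarg.h>
#include <wchar.h>

/* Hypothetical wrapper: scan from a wide string, stepping over an
 * optional leading L'#' marker first. The variable arguments are
 * forwarded to vswscanf through a va_list. */
int scan_after_hash(const wchar_t *s, const wchar_t *format, ...)
{
    va_list ap;
    int ret;

    if (*s == L'#')
        s++;                      /* step over the marker */

    va_start(ap, format);
    ret = vswscanf(s, format, ap);
    va_end(ap);
    return ret;
}
```

A call such as scan_after_hash(L"#3 4", L"%d %d", &a, &b) is expected to behave like swscanf applied to the text after the marker.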

The format consists of zero or more portable white-space wide characters (blanks, horizontal and vertical tabs, newlines, carriage returns, and form-feeds), which cause white-space input wide/multibyte characters [as defined by iswspace, see wctype(3C)] to be skipped; zero or more ordinary wide characters (not %), which must match the next input wide/multibyte characters; and zero or more conversion specifications, each of which is introduced by a % and can result in the matching of a sequence of input wide/multibyte characters and possibly the assignment of a converted value.

Each conversion specification takes the following general form and sequence:

   %[pos$][*][width][size]fmt


pos$
An optional entry, consisting of one or more decimal digits followed by a $ character, that specifies the number of the next pointer arg to access. The first arg (just after format) is numbered 1. If this entry is not present, the arg following the most recently used arg will be accessed.

When numbered argument specifications are used, specifying the Nth argument requires that all the preceding arguments, from the first to the (N-1)th, be specified at least once, in a consistent way, in the format string.
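
A minimal sketch of numbered arguments follows, assuming an implementation that supports the pos$ entry as described above. The helper name read_swapped is hypothetical:

```c
#include <wchar.h>

/* Hypothetical helper: the input holds "<height> <width>", but the
 * caller passes the pointers in the order (width, height). */
int read_swapped(const wchar_t *s, int *width, int *height)
{
    /* %2$d stores the first number through argument 2 (height);
     * %1$d stores the second number through argument 1 (width). */
    return swscanf(s, L"%2$d %1$d", width, height);
}
```

Both arguments are named in the format, as the rule on preceding arguments requires.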


*
An optional flag that suppresses the usual assignment of the converted value after a successful match. (No corresponding arg pointer should be present.)
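
The following sketch shows assignment suppression; third_int is a hypothetical helper name:

```c
#include <wchar.h>

/* Hypothetical helper: take the third white-space-separated integer
 * and discard the first two. Suppressed conversions consume input but
 * take no argument and do not count toward the return value. */
int third_int(const wchar_t *s, int *out)
{
    return swscanf(s, L"%*d %*d %d", out);
}
```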

width
An optional entry that consists of one or more decimal digits that specifies the maximum field width. No width limitation will occur by default, except for c and C.
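
Field widths can split fixed-width input and bound the number of characters stored. The sketch below uses two hypothetical helpers, split_date and first_word, to show both uses:

```c
#include <wchar.h>

/* Hypothetical helper: split an 8-digit date such as 20040425 into
 * year, month, and day using field widths. */
int split_date(const wchar_t *s, int *y, int *m, int *d)
{
    return swscanf(s, L"%4d%2d%2d", y, m, d);
}

/* Hypothetical helper: copy at most 7 wide characters of the first
 * word into an 8-element buffer (7 characters plus the terminator). */
int first_word(const wchar_t *s, wchar_t word[8])
{
    return swscanf(s, L"%7ls", word);
}
```

Giving a width one less than the array length leaves room for the terminating null wide character that the conversion adds.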

size
An optional hh, h, l (ell), ll, L, j, t or z that specifies other than the default argument pointer type, depending on the fmt specifier:

a, e, f, g
The default argument type is pointer to float; an l changes it to be a pointer to double, and L or ll to pointer to long double.

b, o, u, x
The default argument type is pointer to unsigned int; an h changes it to be a pointer to unsigned short int, l to pointer to unsigned long int, and L or ll to pointer to unsigned long long int.

c, s, [...]
The default argument type is pointer to character; an l changes it to a pointer to wchar_t. lc (ls) is a synonym for C (S).

d, i, n
The default argument type is pointer to int; an h changes it to be a pointer to short int, l to pointer to long int, and L or ll to pointer to long long int.

If a size appears other than in these combinations, the behavior is undefined.
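
As a sketch of how the size entry must agree with the type each argument points to, consider the hypothetical helper read_sizes:

```c
#include <wchar.h>

/* Hypothetical helper showing one size modifier per argument type. */
int read_sizes(const wchar_t *s, short *sh, long *lg,
               unsigned long *ul, double *db)
{
    /* %hd -> short *, %ld -> long *, %lx -> unsigned long *,
     * %lf -> double * */
    return swscanf(s, L"%hd %ld %lx %lf", sh, lg, ul, db);
}
```

A mismatch, such as passing a double * for a plain %f, is undefined behavior and is generally not diagnosed at run time.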


fmt
A conversion wide character or sequence (described below) that shows the type of conversion to be applied.

A conversion specification directs the matching and conversion of the next input item; the result is placed in the object pointed to by the corresponding arg unless assignment suppression was indicated by the * flag. The suppression of assignment provides a way of describing an input item that is to be skipped. For all conversion specifiers except c, C, n and [...], leading white-space wide/multibyte characters are skipped. An input item is usually defined as a sequence of non-white-space wide/multibyte characters that extends to the next inappropriate wide/multibyte character or until the maximum field width (if one is specified) is exhausted.

The conversion specifiers and their meanings are:


a, e, f, g
Matches an optionally signed floating number, whose format is the same as expected for the subject string of the wcstod(3C) function.

b, o, u, x
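Matches an optionally signed integer, whose format is the same as expected for the subject string of the wcstoul function with a base of 2, 8, 10, or 16, respectively. The default argument type is pointer to unsigned int; an hh changes it to be a pointer to unsigned char, h a pointer to unsigned short, l a pointer to unsigned long int, ll or L a pointer to unsigned long long int, j a pointer to uintmax_t, t a pointer to the unsigned type corresponding to ptrdiff_t, and z a pointer to size_t.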

c
Matches a sequence of wide/multibyte characters of the number specified by the field width (1 if no field width is present in the directive). The corresponding argument should be a pointer to the initial element of a character array large enough to accept the generated multibyte sequence. No null character is added. The normal skip over white space is suppressed.

C, lc
Matches a sequence of wide/multibyte characters of the number specified by the field width (1 if no field width is present in the directive). The corresponding argument should be a pointer to the initial element of a wchar_t array large enough to accept the sequence of wide characters. No null wide character is added. The normal skip over white space is suppressed.
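
Because no null wide character is added, a caller that wants a string must terminate the array itself. A minimal sketch, using the hypothetical helper read_tag and assuming the input holds at least three wide characters:

```c
#include <wchar.h>

/* Hypothetical helper: read three wide characters, including any
 * white space, into tag[0..2] and terminate the array by hand,
 * because the conversion itself stores no null wide character.
 * Input shorter than three characters is not handled here. */
int read_tag(const wchar_t *s, wchar_t tag[4])
{
    int n = swscanf(s, L"%3lc", tag);

    if (n == 1)
        tag[3] = L'\0';
    return n;
}
```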

d, i, n
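d matches an optionally signed decimal integer, and i matches an optionally signed integer whose base is determined as for wcstol(3C) with a base of 0. n consumes no input; the number of wide/multibyte characters read so far by this call is stored through the corresponding argument, and the count of assigned items is not incremented. The default argument type is pointer to int; an hh changes it to be a pointer to signed char, h a pointer to short int, l a pointer to long int, ll or L a pointer to long long int, j a pointer to intmax_t, t a pointer to ptrdiff_t, and z a pointer to ssize_t.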

p
Matches a sequence of printable wide/multibyte characters as is produced by the fwprintf(3S) functions' %p conversion. The corresponding argument should be a pointer to a pointer to void. If the input matched is a value converted earlier (during the same program execution), the pointer that results will compare equal to that value; otherwise, the behavior is undefined.

s
Matches a sequence of wide/multibyte characters, optionally delimited by white-space wide/multibyte characters. The corresponding argument should be a pointer to the initial element of a character array large enough to accept the generated multibyte sequence and a terminating null character, which will be added automatically.

S, ls
Matches a sequence of wide/multibyte characters, optionally delimited by white-space wide/multibyte characters. The corresponding argument should be a pointer to the initial element of a wchar_t array large enough to accept the sequence of wide characters and a terminating null wide character, which will be added automatically.

[...]
Matches a nonempty sequence of wide/multibyte characters from a set of expected wide characters (the ``scanset'') as designated by the wide characters between the brackets (the ``scanlist''), see below. The corresponding argument should be a pointer to the initial element of a character array large enough to accept the generated multibyte sequence and a terminating null character, which will be added automatically.

l[...]
Matches a nonempty sequence of wide/multibyte characters from a set of expected wide characters (the ``scanset'') as designated by the wide characters between the brackets (the ``scanlist''), see below. The corresponding argument should be a pointer to the initial element of a wchar_t array large enough to accept the sequence of wide characters and a terminating null wide character, which will be added automatically.

%
Matches a single %; no assignment is done.

For [...] and l[...], the scanlist consists of all wide characters up to, but not including, the matching right bracket (]). The first right bracket matches unless the specifier begins with [] or [^], in which case the scanlist includes a ] and the matching one is the second right bracket. The scanset is those wide characters described by the scanlist unless it begins with a circumflex (^), in which case the scanset is those wide characters not described by the scanlist that follows the circumflex. The scanlist can describe an inclusive range of wide characters by low-high where low is not lexically greater than high (and where these endpoints are in the same codeset for locales whose wide/multibyte characters have such); otherwise, a dash (-) will stand for itself, as it will when it occurs last in the scanlist, or the first, or the second when a circumflex is first.
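
The scanlist rules can be sketched with a range and a negated scanset. The helper split_pair and the key=value input shape are hypothetical:

```c
#include <wchar.h>

/* Hypothetical helper: split "key=value", where the key is made of
 * lowercase letters and digits and the value runs to the end of line. */
int split_pair(const wchar_t *s, wchar_t key[32], wchar_t value[64])
{
    /* %31l[a-z0-9] : ranges in the scanlist, at most 31 wide characters
     * =            : ordinary wide character that must match
     * %63l[^\n]    : negated scanset, everything except newline */
    return swscanf(s, L"%31l[a-z0-9]=%63l[^\n]", key, value);
}
```

Because a scanset conversion must match a nonempty sequence, an input that begins with = yields a matching failure on the first directive.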

If the form of the conversion specification does not match any of the above, the results of the conversion are undefined. Similarly, the results are undefined if there are insufficient pointer args for the format. If the format is exhausted while args remain, the excess args are ignored.

When matching floating numbers, the locale's decimal point wide character is taken to introduce a fractional portion, the sequences inf and infinity (case ignored) are taken to represent infinities, and the sequence nan[(m)] (case ignored), where the optional parenthesized ``m'' consists of zero or more alphanumeric or underscore (_) wide characters, is taken to represent a NaN (not-a-number). Note, however, that the locale's thousands' separator wide character will not be recognized as such.
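
A small sketch of these rules follows. The helper classify_number is hypothetical, and the C locale is assumed:

```c
#include <math.h>
#include <wchar.h>

/* Hypothetical helper: returns 'i' for an infinity, 'n' for a NaN,
 * 'f' for a finite value, or '?' if no floating number matched. */
char classify_number(const wchar_t *s)
{
    double d;

    if (swscanf(s, L"%lf", &d) != 1)
        return '?';
    if (isinf(d))
        return 'i';
    if (isnan(d))
        return 'n';
    return 'f';
}
```

For an input such as 1,234 the conversion stops at the comma, since the thousands' separator is not recognized; the value stored is 1.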

If conversion terminates on a conflicting input wide/multibyte character, the offending input wide/multibyte character is left unread in the input stream. Trailing white space (including newline wide characters) is left unread unless matched by a directive.

If end-of-file is encountered during input, conversion is terminated. If end-of-file occurs before any wide/multibyte characters matching the current directive have been read (other than leading white space where permitted), execution of the current directive terminates with an input failure; otherwise, unless execution of the current directive is terminated with a matching failure, execution of the following directive (other than %n, if any) is terminated with an input failure.

If a truncated sequence (due to reaching end-of-file or a conflicting input wide/multibyte character, or because a field width is exhausted) does not form a valid match for the current directive, the directive is terminated with a matching failure.

The success of literal matches and suppressed assignments is not directly determinable other than via the %n directive.
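
One way to check a trailing literal is to place %n after it and test whether the directive was reached. A minimal sketch, with the hypothetical helper is_item_count:

```c
#include <wchar.h>

/* Hypothetical helper: return 1 only if s begins with an integer
 * followed by the literal text " items". */
int is_item_count(const wchar_t *s, int *count)
{
    int end = -1;   /* stays -1 unless the %n directive is reached */

    /* The return value is 1 as soon as the integer is assigned,
     * whether or not the literal that follows it matched. */
    if (swscanf(s, L"%d items%n", count, &end) != 1)
        return 0;

    /* %n is executed only if everything before it matched. */
    return end >= 0;
}
```

With the input 3 boxes, the integer is still assigned and the scan returns 1, but end keeps its initial value, which exposes the failed literal match.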

Multibyte characters from streams (stdin or strm) are read as if the getc function had been called repeatedly.

Errors

These routines return the number of successfully matched and assigned input items; this number can be zero in the event of an early matching failure. If the input ends before the first matching failure or conversion, EOF is returned.
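
The three possible outcomes for a single conversion can be told apart as in this sketch. The enumeration and the helper scan_outcome are hypothetical:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical result codes for this example only. */
enum outcome { INPUT_FAILURE, MATCHING_FAILURE, CONVERTED };

/* Hypothetical helper: classify the result of scanning one integer. */
enum outcome scan_outcome(const wchar_t *s)
{
    int value;
    int n = swscanf(s, L"%d", &value);

    if (n == EOF)
        return INPUT_FAILURE;     /* input ended before any conversion */
    if (n == 0)
        return MATCHING_FAILURE;  /* input present but not an integer */
    return CONVERTED;
}
```

Testing for EOF before testing for zero keeps the two failure cases distinct.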

Usage

The call to the function wscanf:

   int i, n; float x; wchar_t name[50];
   n = wscanf(L"%d%f%ls", &i, &x, name);

with the input line:

   25 54.32E-1 thompson

will assign to n the value 3, to i the value 25, to x the value 5.432, and name will contain the wide characters thompson\0.

The call to the function wscanf:

   int i; float x; char name[50];
   (void) wscanf(L"%2d%f%*d %[0-9]", &i, &x, name);

with the input line:

   56789 0123 56a72

will assign 56 to i, 789.0 to x, skip 0123, and place the characters 56\0 in name. The next character read from stdin will be a.

The following shows a simple use of vfwscanf, a function that reads formatted input from its own connection to /dev/tty.

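   #include <stdarg.h>
   #include <stdio.h>
   #include <wchar.h>

   static FILE *instream;   /* opened on first use */

   int wscan(const wchar_t *fmt, ...)
   {
       va_list ap;
       int ret;

       /* Open the terminal once; return EOF if it is unavailable. */
       if (instream == NULL) {
           if ((instream = fopen("/dev/tty", "r")) == NULL)
               return EOF;
       }
       /* Start the argument list only once the stream is known to be open. */
       va_start(ap, fmt);
       ret = vfwscanf(instream, fmt, ap);
       va_end(ap);
       return ret;
   }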

References

fprintf(3S), fscanf(3S), fwprintf(3S), getc(3S), Intro(3S), stdarg(5), strtol(3C), wcstol(3C)


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 25 April 2004