runawk_modules(3)
NAME
runawk - wrapper for AWK interpreter
MODULES
runawk provides dozens of modules. Below is the documentation for
them.
CR_in.awk
As the name of this module says (_in suffix), this module reads and
optionally changes input lines.
The carriage-return character at the end of input lines is removed.
This character usually appears in Windows text files. If you want to
adapt your script to accept Windows files as input, just put
#use "CR_in.awk"
at the very beginning of your script.
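The effect can be approximated in plain awk without the module. The
one-liner below is only an illustrative sketch, not the source of
CR_in.awk, and the sample input is made up.

```sh
# Plain-awk sketch: remove one trailing carriage return from every input line.
out=$(printf 'alpha\r\nbeta\r\nplain\n' | awk '{ sub(/\r$/, ""); print }')
printf '%s\n' "$out"
```

Lines without a trailing carriage return pass through unchanged.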
abort.awk
abort (MSG, [EXIT_STATUS])
prints MSG to stderr and exits the program with EXIT_STATUS.
EXIT_STATUS defaults to 1.
abs.awk
abs (V)
return absolute value of V.
alt_assert.awk
assert (CONDITION, MSG, STATUS)
prints the error message MSG to standard error and terminates the
program with exit code STATUS if CONDITION is false.
alt_getopt.awk
getopt(SHORT_OPTS)
This function processes the ARGV array and returns TRUE if an option
is received. The received option is saved in the 'optopt' variable and
its argument (if any) in the 'optarg' variable. Long options (like
--help or --long-option), as found in GNU libc and BSD systems, are
also supported.
NOTE: alt_getopt.awk module follows rules from SUS/POSIX "Utility
Syntax Guidelines"
alt_join.awk
join_keys (HASH, SEP)
return string consisting of all keys from HASH separated by SEP.
join_values (HASH, SEP)
return string consisting of all values from HASH separated by SEP.
join_by_numkeys (ARRAY, SEP [, START [, END]])
return string consisting of all values from ARRAY separated by SEP.
Indices from START (default: 1) to END (default: +inf) are analysed.
Collecting values is stopped on index absent in ARRAY.
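The stopping rule of join_by_numkeys can be illustrated in plain awk.
The function name join_numkeys_sketch and the sample array are
hypothetical; this is a sketch of the documented behaviour, not the
code of alt_join.awk.

```sh
out=$(awk '
# Hypothetical stand-in for join_by_numkeys: walk indices start..end
# and stop at the first index that is missing from arr.
function join_numkeys_sketch(arr, sep, start, end,    i, s) {
    if (start == "") start = 1
    for (i = start; (end == "" || i <= end) && (i in arr); i++)
        s = s (i > start ? sep : "") arr[i]
    return s
}
BEGIN {
    a[1] = "x"; a[2] = "y"; a[3] = "z"; a[5] = "not reached"
    print join_numkeys_sketch(a, ",")        # stops at the gap (index 4)
    print join_numkeys_sketch(a, "-", 2, 3)  # explicit START and END
}')
printf '%s\n' "$out"
```

Index 5 is never visited in the first call because index 4 is absent.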
backslash_in.awk
As the name of this module (_in suffix) says this module reads and
optionally changes input lines.
Backslash character at the end of line is treated as a sign that
current line is continued on the next one. Example is below.
Input:
a b c\
d e f g
a
b
e\
f
What your program using backslash_in.awk will obtain:
a b cd e f g
a
b
e f
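A minimal plain-awk sketch of this kind of line continuation follows.
It assumes that the backslash is dropped and the next line is appended
as is; the real module may differ in details.

```sh
out=$(printf 'a b c\\\nd e f g\na\nb\n' | awk '
# Join a line ending in a backslash with the line that follows it.
{
    line = $0
    while (line ~ /\\$/) {
        sub(/\\$/, "", line)
        if ((getline nxt) <= 0) break    # no more input to append
        line = line nxt
    }
    print line
}')
printf '%s\n' "$out"
```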
basename.awk
basename (PATH)
return filename portion of the PATH (the same as basename(3))
See example/demo_basename for the sample of usage
braceexpand.awk
braceexpand(STRING)
shell-like brace expansion.
For example: print braceexpand("ab{,22{,7,8}}z{8,9}")
-| abz8 abz9 ab22z8 ab22z9 ab227z8 ab227z9 ab228z8 ab228z9
dirname.awk
dirname (PATH)
return dirname portion of the PATH (the same as dirname(3))
See example/demo_dirname for the sample of usage
embed_str.awk
This module reads the program's file, finds .begin-str/.end-str pairs
and reads the lines between them.
EMBED_STR - Associative array with string index
Example:
Input:
.begin-str mymsg
Line1
Line2
.end-str
Output (result)
EMBED_STR ["mymsg"]="Line1\nLine2"
See example/demo_embed_str for the sample of usage
exitnow.awk
exitnow (STATUS)
similar to the 'exit' statement, but does not run END sections.
fieldwidths.awk
By default the AWK interpreter splits input lines into tokens according
to a regular expression that defines "spaces" between them, using the
special variable FS. Sometimes it is useful to define fixed-size fields
for tokens. This is what this module is for. The functionality of
fieldwidths.awk is very close to GNU awk's FIELDWIDTHS variable.
fieldwidths(STRING, FW)
extracts substrings from STRING according to FW from the left to the
right and assigns $1, $2 etc. and NF variable. FW is a space
separated list of numbers that specify fields widths.
fieldwidths0(FW)
Does the same as the `fieldwidths' function but splits $0 instead.
FW
global variable. If it is set to non-empty string, all input lines
are split automatically and the value of variable FS is ignored in
this case.
See example/demo_fieldwidths for the sample of usage
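Fixed-width splitting can be sketched in plain awk with substr(). The
widths and the input line below are made-up examples, and the code is
not taken from the module.

```sh
out=$(printf 'ABCDEFGHIJ\n' | awk -v FW='3 2 5' '
# Cut $0 into the fixed-width pieces listed in FW, then rebuild the fields.
{
    n = split(FW, w, " ")
    line = $0; pos = 1
    for (i = 1; i <= n; i++) {
        piece[i] = substr(line, pos, w[i])
        pos += w[i]
    }
    $0 = ""                              # reset the record, NF becomes 0
    for (i = 1; i <= n; i++) $i = piece[i]
    print NF ": " $1 "|" $2 "|" $3
}')
printf '%s\n' "$out"
```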
ftrans_in.awk
beginfile() function provided by user is called before file reading
endfile() function provided by user is called after file reading
glob.awk
glob2ere (PATTERN)
convert glob PATTERN
(http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13)
to equivalent extended regular expression
(http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04)
has_prefix.awk
has_prefix (STRING, PREFIX)
return TRUE if STRING begins with PREFIX
See example/demo_has_prefix for the sample of usage
has_suffix.awk
has_suffix(STRING, SUFFIX)
return TRUE if STRING ends with SUFFIX
See example/demo_has_suffix for the sample of usage
heapsort.awk
heapsort (src_array, dest_remap, start, end)
The content of `src_array' is sorted using awk's rules for comparing
values. Values with indices in range [start, end] are sorted.
`src_array' array is not changed. Instead dest_remap array is
generated such that
Result:
src_array [dest_remap [start]] <=
<= src_array [dest_remap [start+1]] <=
<= src_array [dest_remap [start+2]] <= ... <=
<= src_array [dest_remap [end]]
`heapsort' algorithm is used.
Examples: see demo_heapsort and demo_heapsort2 executables.
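The dest_remap contract can be demonstrated with a short plain-awk
sketch. For brevity it uses insertion sort, not heapsort, and the name
remap_sort_sketch is hypothetical.

```sh
out=$(awk '
# Fill remap[start..end] so that src[remap[i]] is non-decreasing.
# Uses insertion sort for brevity; src itself is left untouched.
function remap_sort_sketch(src, remap, start, end,    i, j, t) {
    for (i = start; i <= end; i++) remap[i] = i
    for (i = start + 1; i <= end; i++) {
        t = remap[i]
        for (j = i - 1; j >= start && src[remap[j]] > src[t]; j--)
            remap[j + 1] = remap[j]
        remap[j + 1] = t
    }
}
BEGIN {
    src[1] = 30; src[2] = 10; src[3] = 20
    remap_sort_sketch(src, remap, 1, 3)
    print remap[1], remap[2], remap[3]                  # permutation of indices
    print src[remap[1]], src[remap[2]], src[remap[3]]   # values in sorted order
}')
printf '%s\n' "$out"
```

Reading src through remap yields the sorted sequence while src keeps
its original order.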
heapsort_values (src_hash, dest_remap)
The same as `heapsort' described above, but hash values are sorted.
Result:
src_hash [dest_remap [1]] <=
<= src_hash [dest_remap [2]] <=
<= src_hash [dest_remap [3]] <= ... <=
<= src_hash [dest_remap [count]]
`count', a number of elements in `src_hash', is a return value.
Examples: see demo_heapsort3 executable.
heapsort_indices (src_hash, dest_remap)
The same as `heapsort' described above, but hash indices are sorted.
Result:
dest_remap [1] <=
<= dest_remap [2] <=
<= dest_remap [3] <= ... <=
<= dest_remap [count]
`count', a number of elements in `src_hash', is a return value.
Examples: demo_ini
heapsort_fields (dest_remap, [start [, end [, strnum]]])
The same as function "heapsort" but the fields $1, $2... are sorted.
Note that $1, $2... are not changed; the dest_remap array is filled in
instead. The variable "start" defaults to 1, "end" -- to NF. If
"strnum" is set to 1, values are forcibly compared as strings. If
"strnum" is set to 2, values are forcibly compared as numbers.
heapsort0 ([start [, end [, strnum]]])
The same as "heapsort_fields" but $1, $2... are changed.
ini.awk
This module provides functions for manipulating .ini files. See
example/demo_ini for the sample of use.
read_inifile(FILENAME, RESULT [, SEPARATOR])
Reads the .ini file FILENAME and fills the array RESULT, e.g. RESULT
[<section5><SEPARATOR><name6>] = <value5.6> etc. If SEPARATOR is not
specified, the `.' symbol is used by default.
Features:
- spaces are allowed everywhere, i.e. at the beginning and end of
line, around `=' separator. THEY ARE STRIPPED!
- comment lines start with `;' or `#' sign. Comment lines are ignored.
- values can be surrounded by single or double quotes. In this case
spaces are preserved, otherwise they are removed from the
beginning and the end of the line and replaced with a single space
in the middle of the line.
- Escape characters are not supported (yet?).
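The RESULT layout can be illustrated with a simplified plain-awk
parser. This is a sketch under stated assumptions: it handles
comments, section headers and name = value lines only, ignores the
quoting rules above, and does not reproduce the code of ini.awk. The
sample file content is made up.

```sh
out=$(printf '; comment\n[ net ]\n host = example.org \nport=8080\n' | awk -v SEP=. '
/^[ \t]*([;#]|$)/ { next }               # comments and blank lines
/^[ \t]*\[.*\][ \t]*$/ {                 # [section] header
    section = $0
    sub(/^[ \t]*\[[ \t]*/, "", section)
    sub(/[ \t]*\][ \t]*$/, "", section)
    next
}
{
    eq = index($0, "=")
    if (!eq) next
    name = substr($0, 1, eq - 1); value = substr($0, eq + 1)
    gsub(/^[ \t]+|[ \t]+$/, "", name)    # strip spaces around the name
    gsub(/^[ \t]+|[ \t]+$/, "", value)   # strip spaces around the value
    result[section SEP name] = value
}
END { print result["net.host"] "|" result["net.port"] }')
printf '%s\n' "$out"
```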
init_getopt.awk
Initialization step for power_getopt.awk module. In some cases it
makes sense to process options in a while() loop. This module allows
doing this. See the documentation about how options are initialized in
power_getopt.awk module.
print_help ()
display help message.
io.awk
This module provides a number of IO functions.
is_file(FILENAME)
returns 1 if the specified FILENAME is a regular file or 0 otherwise.
is_socket(FILENAME)
returns 1 if the specified FILENAME is a socket or 0 otherwise.
is_dir(FILENAME)
returns 1 if the specified FILENAME is a dir or 0 otherwise.
is_exec(FILENAME)
returns 1 if the specified FILENAME is executable or 0 otherwise.
is_fifo(FILENAME)
returns 1 if the specified FILENAME is a FIFO or 0 otherwise.
is_blockdev(FILENAME)
returns 1 if the specified FILENAME is a block special file or 0
otherwise.
is_chardev(FILENAME)
returns 1 if the specified FILENAME is a character special file or 0
otherwise.
is_symlink(FILENAME)
returns 1 if the specified FILENAME is a symlink or 0 otherwise.
file_size(FILENAME, USE_STAT_NOT_LSTAT)
returns the size of the specified FILENAME. If USE_STAT_NOT_LSTAT is
True, stat(2) is used instead of lstat(2).
Return value:
-2 if file doesn't exist
-1 if file is not a regular file
filesize otherwise
file_type(FILENAME, USE_STAT_NOT_LSTAT)
returns a single letter that corresponds to the file type. If
USE_STAT_NOT_LSTAT is True, stat(2) is used instead of lstat(2).
Return value:
- -- regular file
d -- directory
c -- character device
b -- block device
p -- FIFO
l -- symlink
s -- socket
See example/demo_io for the sample of usage
isnum.awk
isnum (NUM)
returns 1 if an argument is a number
match_br.awk
match_br(STRING, BR_OPEN, BR_CLOSE)
returns the start position (or zero on failure) of the substring
surrounded by balanced (), [], {} or similar characters. Also sets
the RSTART and RLENGTH variables just like the standard 'match'
function does.
For example:
print match_br("A (B (), C(D,C,F (), 123))", "(", ")")
print RSTART, RLENGTH
-| 3
-| 3 24
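The balanced-bracket scan can be sketched in plain awk with a depth
counter. The function name match_br_sketch is hypothetical and the
sketch assumes single-character brackets; it is not the code of
match_br.awk.

```sh
out=$(awk '
# Find the first balanced br_open ... br_close span in s.
function match_br_sketch(s, br_open, br_close,    i, c, depth, n) {
    RSTART = 0; RLENGTH = -1
    n = length(s)
    for (i = 1; i <= n; i++) {
        c = substr(s, i, 1)
        if (c == br_open) {
            if (depth++ == 0) RSTART = i          # outermost opening bracket
        } else if (c == br_close && depth > 0) {
            if (--depth == 0) {                   # matching closing bracket
                RLENGTH = i - RSTART + 1
                return RSTART
            }
        }
    }
    RSTART = 0                                    # unbalanced or not found
    return 0
}
BEGIN {
    print match_br_sketch("A (B (), C(D,C,F (), 123))", "(", ")")
    print RSTART, RLENGTH
}')
printf '%s\n' "$out"
```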
max.awk
max, max3, max4, max5
maximum functions
max_key(HASH, DFLT)
returns a maximum key in HASH or DFLT if it is empty
max_value(HASH, DFLT)
returns a maximum value in HASH or DFLT if it is empty
key_of_max_value(HASH, DFLT)
returns A KEY OF maximum value in HASH or DFLT if it is empty
min.awk
min, min3, min4, min5
minimum functions
min_key(HASH, DFLT)
returns a minimum key in HASH or DFLT if it is empty
min_value(HASH, DFLT)
returns a minimum value in HASH or DFLT if it is empty
key_of_min_value(HASH, DFLT)
returns A KEY OF minimum value in HASH or DFLT if it is empty
modinfo.awk
This module provides the following variables
MODC
A number of modules (-f <filename>) passed to an awk interpreter
MODV
Array with [0..MODC) indexes of those modules
MODMAIN
Path to the main module, i.e. program filename
See example/demo_modinfo for the sample of usage
multisub.awk
multisub(STRING, SUBST_REPLS[, KEEP])
is a substitution function. It searches for a list of substrings,
specified in SUBST_REPLS, in leftmost-longest order and (if found)
replaces the found fragments with the appropriate replacement.
SUBST_REPLS format:
"SUBSTRING1:REPLACEMENT1   SUBSTRING2:REPLACEMENT2...".
Three spaces separate substring:replacement pairs from each other.
If KEEP is specified and some REPLACEMENT(N) is equal to it, then
the appropriate SUBSTRING(N) is treated as a regular expression and
the matched text is kept as is, i.e. not changed.
For example:
print multisub("ABBABBBBBBAAB", "ABB:c   BBA:d   AB:e")
-| ccBBde
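Leftmost-longest replacement can be sketched in plain awk as follows.
The name multisub_sketch is hypothetical; the sketch handles plain
substrings only and leaves out the KEEP/regular-expression feature.

```sh
out=$(awk '
# At each position pick the longest SUBSTRING that matches there.
function multisub_sketch(s, pairs,    n, i, k, p, from, to, best, out) {
    n = split(pairs, p, "   ")           # three spaces between pairs
    for (i = 1; i <= n; i++) {
        k = index(p[i], ":")
        from[i] = substr(p[i], 1, k - 1)
        to[i] = substr(p[i], k + 1)
    }
    while (s != "") {
        best = 0
        for (i = 1; i <= n; i++)
            if (index(s, from[i]) == 1 &&
                (!best || length(from[i]) > length(from[best])))
                best = i
        if (best) { out = out to[best]; s = substr(s, length(from[best]) + 1) }
        else      { out = out substr(s, 1, 1); s = substr(s, 2) }
    }
    return out
}
BEGIN { print multisub_sketch("ABBABBBBBBAAB", "ABB:c   BBA:d   AB:e") }')
printf '%s\n' "$out"
```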
pow.awk
pow (X, Y)
returns the value of X to the exponent Y
power_getopt.awk
power_getopt.awk module provides a very easy way to add options to AWK
application and follows rules from SUS/POSIX "Utility Syntax
Guidelines"
power_getopt.awk analyses '.begin-str help/.end-str' section in AWK
program (main module), and processes options specified there. The
following strings mean options:
-X          single letter option
--XXX       long option
-X|--XXX    single letter option with long synonym
=X          single letter option with argument
=-XXX       long option with argument
=X|--XXX    single letter option and long synonym with argument
If --help option was applied, usage information is printed (lines
between ".begin-str help" and ".end-str") replacing leading `='
character with `-'.
getarg(OPT, DEFAULT)
returns either 1 (option OPT was applied) or 0 (OPT was not applied)
for options not accepting the argument, and either specified value or
DEFAULT for options accepting the argument.
See example/demo_power_getopt for the sample of usage
quicksort.awk
quicksort (src_array, dest_remap, start, end)
The content of `src_array' is sorted using awk's rules for comparing
values. Values with indices in range [start, end] are sorted.
`src_array' array is not changed. Instead dest_remap array is
generated such that
Result:
src_array [dest_remap [start]] <=
<= src_array [dest_remap [start+1]] <=
<= src_array [dest_remap [start+2]] <= ... <=
<= src_array [dest_remap [end]]
`quicksort' algorithm is used. Examples: see demo_quicksort and
demo_quicksort2 executables
quicksort_values (src_hash, dest_remap)
The same as `quicksort' described above, but hash values are sorted.
Result:
src_hash [dest_remap [1]] <=
<= src_hash [dest_remap [2]] <=
<= src_hash [dest_remap [3]] <= ... <=
<= src_hash [dest_remap [count]]
`count', a number of elements in `src_hash', is a return value.
Examples: see demo_quicksort* executables.
quicksort_indices (src_hash, dest_remap)
The same as `quicksort' described above, but hash indices are sorted.
Result:
dest_remap [1] <=
<= dest_remap [2] <=
<= dest_remap [3] <= ... <=
<= dest_remap [count]
`count', a number of elements in `src_hash', is a return value.
readfile.awk
readfile(FILENAME)
read entire file and return its content as a string
See example/demo_readfile for the sample of usage
runcmd.awk
runcmd1 (CMD, OPTS, FILE)
wrapper for the system() function that runs a command CMD with options
OPTS and one filename FILE. Unlike system(CMD " " OPTS " " FILE) the
function runcmd1 correctly handles a FILE containing spaces, single
quotes, double quotes, tildes etc.
xruncmd1 (CMD, OPTS, FILE)
safe wrapper for 'runcmd1'. awk exits with error if runcmd1()
function failed.
shquote.awk
shquote(str)
transforms the string `str' by adding shell escape and quoting
characters so that it can be passed to the system() and popen()
functions as an argument and still have the correct value after
being evaluated by the shell.
For example:
print shquote("file name.txt")
-| 'file name.txt'
print shquote("'")
-| \'
print shquote("Peter's")
-| 'Peter'\''s'
print shquote("*&;<>#~")
-| '*&;<>#~'
This module was inspired by NetBSD shquote(3)
http://netbsd.gw.com/cgi-bin/man-cgi?shquote+3+NetBSD-current and
shquote(1) by Alan Barrett
http://ftp.sunet.se/pub/os/NetBSD/misc/apb/shquote.20080906/
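The quoting technique can be sketched in plain awk. This simplified
version always wraps the string in single quotes, so for a lone quote
it would produce a longer (but equivalent) result than the \' shown
above. It is an illustration, not the code of shquote.awk.

```sh
quoted=$(awk 'BEGIN {
    s = "Peter\047s file"                # \047 is a single quote
    n = split(s, part, "\047")           # cut the string at every quote
    q = part[1]
    for (i = 2; i <= n; i++)
        q = q "\047\\\047\047" part[i]   # close quote, escaped quote, reopen
    print "\047" q "\047"
}')
printf '%s\n' "$quoted"
```

Evaluating the result with the shell, for instance through eval,
yields the original string again.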
sort.awk
sort (src, dest_remap, start, end)
Call either heapsort function from heapsort.awk (if RUNAWK_SORTTYPE
environment variable is "heapsort") or quicksort from quicksort.awk
(if RUNAWK_SORTTYPE is "quicksort"). Sorttype defaults to
"heapsort".
sort_values (src, dest_remap)
Call either heapsort_values function from heapsort.awk (if
RUNAWK_SORTTYPE environment variable is "heapsort") or
quicksort_values from quicksort.awk (if RUNAWK_SORTTYPE is
"quicksort"). Sorttype defaults to "heapsort".
sort_indices (src, dest_remap)
Call either heapsort_indices function from heapsort.awk (if
RUNAWK_SORTTYPE environment variable is "heapsort") or
quicksort_indices from quicksort.awk (if RUNAWK_SORTTYPE is
"quicksort"). Sorttype defaults to "heapsort".
str2regexp.awk
str2regexp(STRING)
returns a regular expression that matches given STRING
For example:
print str2regexp("all special symbols: ^$(){}[].*+?|\\")
-| all special symbols: [^][$][(][)][{][}][[]\][.][*][+][?][|]\\
tmpfile.awk
This module provides a function `tmpfile' for generating temporary
filenames. All these filenames are under temporary directory created
(if necessary) by runawk(1) which is removed automatically during
normal exit or when runawk(1) receives SIGINT, SIGQUIT, SIGTERM, SIGHUP
or SIGPIPE.
tmpfile()
returns a temporary file name.
runawk_tmpdir
global variable that keeps tempdir created by runawk -t
See example/demo_tmpfile for the sample of usage
tokenre.awk
By default AWK splits input lines into tokens according to regular
expression that defines "spaces" between tokens using special variable
FS. In many situations it is more useful to define regular expressions
for tokens themselves. This is what this module does.
tokenre(STRING, REGEXP)
extracts substrings from STRING according to REGEXP from the left to
the right and assigns $1, $2 etc. and NF variable.
tokenre0(REGEXP)
Does the same as `tokenre' but splits $0 instead.
splitre(STRING, ARR, REGEXP)
The same as `tokenre' but ARR[1], ARR[2]... are assigned. A number
of extracted tokens is a return value.
TRE
global variable. If it is set to non-empty string, all input lines
are split automatically.
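Token extraction by regular expression can be sketched in plain awk
with repeated match() calls. The name splitre_sketch and the sample
data are hypothetical, and the code is not taken from tokenre.awk.

```sh
out=$(awk '
# Collect successive matches of re in s into arr; returns the token count.
function splitre_sketch(s, arr, re,    n) {
    while (s != "" && match(s, re) && RLENGTH > 0) {
        arr[++n] = substr(s, RSTART, RLENGTH)
        s = substr(s, RSTART + RLENGTH)  # continue after the token
    }
    return n + 0
}
BEGIN {
    n = splitre_sketch("id=42;name=awk", tok, "[a-z0-9]+")
    print n ": " tok[1] " " tok[2] " " tok[3] " " tok[4]
}')
printf '%s\n' "$out"
```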
trim.awk
trim_l(STRING)
Removes leading Tab and Space characters from STRING and returns the
result.
trim_r(STRING)
Removes Tab and Space characters at the end of STRING and returns the
result.
trim_c(STRING, REPL)
Replaces sequences of Tab and Space characters in STRING with REPL
and returns the result. If REPL is not specified, it defaults to
single Space character.
trim_lr(STRING)
Equal to trim_l(trim_r(STRING))
trim_lrc(STRING, REPL)
Equal to trim_l(trim_r(trim_c(STRING, REPL)))
See example/demo_trim for the sample of usage
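The trimming rules can be expressed in plain awk with sub() and
gsub(). The *_sketch names below are hypothetical stand-ins, not the
functions provided by trim.awk.

```sh
out=$(awk '
function trim_l_sketch(s) { sub(/^[ \t]+/, "", s); return s }
function trim_r_sketch(s) { sub(/[ \t]+$/, "", s); return s }
function trim_c_sketch(s, repl) {
    if (repl == "") repl = " "           # default replacement: one space
    gsub(/[ \t]+/, repl, s)
    return s
}
BEGIN {
    s = "   foo    bar  "
    print "[" trim_l_sketch(trim_r_sketch(s)) "]"
    print "[" trim_l_sketch(trim_r_sketch(trim_c_sketch(s))) "]"
}')
printf '%s\n' "$out"
```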
trim_in.awk
As the name of this module says (_in suffix) this module reads and
potentially changes input lines.
Leading, ending spaces and/or spaces in the middle of input lines are
removed depending on TRIM variable. TRIM values:
"l" - remove leading space characters
"r" - remove ending space characters
"c" - remove extra space characters in the middle of input lines
"lr" - See l and r
"lrc" - See l, r and c
"lc" - See l and c
"cr" - See c and r
By default TRIM variable is set to "lr". TRIM set to a single space
character means no trimming.
xclose.awk
xclose(FILE)
safe wrapper for 'close'. awk exits with error if close() function
failed.
xgetline.awk
xgetline0([FILE])
Safe analog to 'getline < FILE' or 'getline' (if no FILE is
specified). 0 at the end means that input line is assigned to $0.
xgetline([FILE])
Safe analog to 'getline __input < FILE' and 'getline __input' (if no
FILE is specified)
In both cases "safe" means that the returned value is analysed and, if
it is less than zero (a file reading error happened), the program is
terminated immediately with an appropriate error message sent to
stderr. Both functions return zero if end of file is reached or
non-zero otherwise.
Example:
while (xgetline("/etc/passwd")){
print "user: " __input
}
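The check that makes getline "safe" can be sketched in plain awk. The
name xgetline_sketch, the temporary file and its content are
assumptions made for this illustration; the code is not the source of
xgetline.awk.

```sh
tmp=$(mktemp)
printf 'root\ndaemon\n' > "$tmp"
out=$(awk -v path="$tmp" '
# Read one line from file into __input; stop the program on a read error.
function xgetline_sketch(file,    ret) {
    ret = (getline __input < file)
    if (ret < 0) {
        print "error reading " file > "/dev/stderr"
        exit 1
    }
    return ret                           # 1 for a line, 0 at end of file
}
BEGIN {
    while (xgetline_sketch(path))
        print "user: " __input
}')
rm -f "$tmp"
printf '%s\n' "$out"
```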
xsystem.awk
xsystem(CMD)
safe wrapper for 'system'. awk exits with error if system() function
failed.
ord.awk
ord (CHAR)
returns the numeric code of CHAR
chr (CODE)
returns the character with the code CODE
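Both directions can be sketched in plain awk: sprintf("%c") maps a
code to a character, and a lookup table built from it maps characters
back to codes. The table name code is hypothetical and only the ASCII
range 1..127 is covered in this sketch.

```sh
out=$(awk '
BEGIN {
    # Build a character-to-code table for the ASCII range.
    for (i = 1; i < 128; i++) code[sprintf("%c", i)] = i
    print code["A"], sprintf("%c", 97)
}')
printf '%s\n' "$out"
```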
AUTHOR
Copyright (c) 2007-2014 Aleksey Cheusov <vle@gmx.net>
BUGS/FEEDBACK
Please send any comments, questions, bug reports etc. to me by e-mail
or register them at sourceforge project home. Feature requests are
also welcomed.
HOME
<http://sourceforge.net/projects/runawk/>
SEE ALSO
awk(1)
2014-12-26 runawk_modules(3)