dbh.h(0) DBHashTables Programmers' Manual dbh.h(0)
NAME
dbh.h - Dbh header file
SYNOPSIS
#include <dbh.h>
cc -ldbh
DESCRIPTION
Disk Based Hashtables (DBH) 64 bit
Library to create and manage hash tables residing on disk. Associations
are made between keys and values so that for a given key the value
can be found and loaded into memory quickly. Being disk based allows
for large and persistent hashes. 64 bit support allows for hashtables
with sizes over 4 Gigabytes on 32 bit systems. Cuantified key
generation allows for minimum access time on balanced multidimensional
trees.
Functionality
A DBHashTable provides associations between keys and values, optimized
so that given a key, the associated value can be found very quickly.
Note that only one hash record is loaded from disk to memory at any
given moment for a DBHashTable. Both keys and values should be copied
into the DBHashTable record, so they need not exist for the lifetime of
the DBHashTable. This means that static strings and temporary strings
(i.e. those created in buffers and those returned by GTK+ widgets)
should be copied with dbh_set_key() and dbh_set_data() into the
DBHashTable record before being inserted.
You must be careful to ensure that the copied key length matches the
defined key length of the DBHashTable, and also that the copied data
does not exceed the maximum length of the DBHashTable record (1024
bytes by default, and expandable with dbh_set_size()). If the
DBHashTable record length is to be variable, be sure to set the
appropriate length with dbh_set_recordsize() before each dbh_update();
otherwise the record length need only be set before the first
dbh_update().
To create a DBHashTable, use dbh_create(). To insert a key and value
into a DBHashTable, use dbh_update(). The DBHashTable will not be
modified until this command is given. All changes to the current
DBHashTable record only reside in memory: dbh_update() is necessary to
commit the changes to the DBHashTable. To look up a value corresponding
to a given key, use dbh_load(). To erase and unerase a key and value,
use dbh_erase() and dbh_unerase().
To call a function for each key and value pair (using a sweep route)
use dbh_foreach_sweep() and dbh_sweep(). To call a function for each
key and value pair (using a fanout route) use dbh_foreach_fanout() and
dbh_fanout(). To destroy a DBHashTable use dbh_destroy().
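The create, set, update and load cycle described above can be sketched
as follows. This is a minimal illustration, not a reference program: it
uses only the prototypes listed under Functions and Macros in this
page, and it assumes (this page does not state the return conventions)
that dbh_create() returns NULL on failure and that dbh_update() and
dbh_load() return 0 on failure or when the key is not found. The file
name example.dbh, the key length of 8 and the key and value contents
are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <dbh.h>

#define KEY_LENGTH 8            /* hypothetical fixed key length */

int main(void)
{
    unsigned char key[KEY_LENGTH];
    char value[] = "hello, disk";       /* hypothetical record data */
    size_t value_size = sizeof value;   /* includes the trailing NUL */

    /* Create a new table whose keys are all KEY_LENGTH bytes long. */
    DBHashTable *dbh = dbh_create("example.dbh", KEY_LENGTH);
    if (dbh == NULL) {                  /* assumption: NULL on failure */
        fprintf(stderr, "dbh_create failed\n");
        return 1;
    }

    /* The key buffer must match the defined key length exactly. */
    memcpy(key, "item0001", KEY_LENGTH);

    /* Copy key and data into the in-memory record, set its length,
     * then commit it to disk with dbh_update(). */
    dbh_set_key(dbh, key);
    dbh_set_data(dbh, value, (FILE_POINTER)value_size);
    dbh_set_recordsize(dbh, (int)value_size);
    if (dbh_update(dbh) == 0)           /* assumption: 0 on failure */
        fprintf(stderr, "dbh_update failed\n");

    /* Look the record up again by key. */
    dbh_set_key(dbh, key);
    if (dbh_load(dbh) != 0)             /* assumption: 0 if not found */
        printf("%s\n", (char *)DBH_DATA(dbh));

    dbh_close(dbh);
    return 0;
}
```

Build it as shown in the SYNOPSIS, linking against the library with
-ldbh. The explicit dbh_set_recordsize() call follows the advice above
to set the record length before dbh_update().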
This is dbh version 2, incompatible with dbh version 1 files. The main
difference between the two versions is the handling of file pointers. In
version 1, file pointers were 32 bits in length, while in version 2,
file pointers are 64 bits in length. This allows for DBHashTables with
sizes greater than 2 GBytes.
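The size limits implied by the file pointer width can be checked with a
little arithmetic. The sketch below is illustrative only and is not
part of the library; the function names are hypothetical. This page
does not say whether version 1 file pointers were signed, so both
readings are shown: a signed 32 bit offset tops out just under 2
GBytes, an unsigned one just under 4 GBytes.

```c
#include <stdint.h>

/* Largest offset representable in a signed integer of the given
 * width (1 to 64 bits): 2^(bits-1) - 1. */
uint64_t max_signed_offset(unsigned bits)
{
    return (UINT64_C(1) << (bits - 1)) - 1;
}

/* Largest offset representable in an unsigned integer of the given
 * width (1 to 64 bits): 2^bits - 1. The 64 bit case is handled
 * separately because shifting a 64 bit value by 64 is undefined. */
uint64_t max_unsigned_offset(unsigned bits)
{
    if (bits >= 64)
        return UINT64_MAX;
    return (UINT64_C(1) << bits) - 1;
}
```

For example, max_signed_offset(32) is 2147483647 bytes, while
max_signed_offset(64) is 9223372036854775807 bytes, far beyond any file
size met in practice.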
Cuantified numbers
Cuantified numbers are an alternate way to view the set of natural
numbers {1, 2, 3, ...} where order is defined in two levels. In natural
numbers there is only one level of order (defined by the > boolean
operator). In cuantified numbers the first level of order is defined by
the cuanta or quantity. The cuanta is obtained by adding all the digits
of the cuantified number.
Thus, for example, 10022, 5, 32, and 11111 are all equal at the first
level of order since they all add up to 5. The second level of order
may be obtained in different manners. In functions dbh_genkey() and
dbh_genkey2() the corresponding order of the natural numbers from which
they are associated is not conserved.
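The first level of order can be made concrete with a short sketch. The
code below is not taken from the library; the function names cuanta()
and cuanta_compare() are hypothetical, and decimal digits are assumed,
as in the example above.

```c
#include <stdint.h>

/* Cuanta of n: the sum of its decimal digits. */
unsigned cuanta(uint64_t n)
{
    unsigned sum = 0;

    while (n > 0) {
        sum += (unsigned)(n % 10);  /* add the lowest digit */
        n /= 10;                    /* and drop it */
    }
    return sum;
}

/* First level comparison of two cuantified numbers: negative, zero
 * or positive when the cuanta of a is less than, equal to or greater
 * than the cuanta of b. */
int cuanta_compare(uint64_t a, uint64_t b)
{
    unsigned ca = cuanta(a);
    unsigned cb = cuanta(b);

    return (ca > cb) - (ca < cb);
}
```

With these definitions cuanta(10022), cuanta(5), cuanta(32) and
cuanta(11111) all evaluate to 5, so cuanta_compare() reports any two of
them as equal at the first level.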
In dbh_orderkey() the corresponding order of the natural numbers from
which they are associated is conserved, but at a price. The base, or
maximum value each digit may reach, must be defined. This effectively
puts a limit on the number of keys which may be generated for a given
number of digits.
When a DBHashTable is constructed with cuantified keys, the maximum
number of disk access instructions generated to access any given record
is equal to the cuanta of the cuantified number represented by the key.
This allows a DBHashTable to be constructed with minimum access time
across all records.
FILE_POINTER
FILE_POINTER is an architecture independent 64 bit integer type.
Structures
typedef struct dbh_header_t;
dbh_header_t is the structural information written at the first 256
bytes of a DBHashTable file.
typedef struct DBHashTable;
DBHashTable is a data structure containing the record information for
an open DBHashTable file.
Macros
unsigned char DBH_KEYLENGTH (DBHashTable * dbh);
FILE_POINTER DBH_RECORD_SIZE (DBHashTable * dbh);
void *DBH_KEY (DBHashTable * dbh);
void *DBH_DATA (DBHashTable * dbh);
FILE_POINTER DBH_ERASED_SPACE (DBHashTable * dbh);
FILE_POINTER DBH_DATA_SPACE (DBHashTable * dbh);
FILE_POINTER DBH_TOTAL_SPACE (DBHashTable * dbh);
FILE_POINTER DBH_FORMAT_SPACE (DBHashTable * dbh);
FILE_POINTER DBH_RECORDS (DBHashTable * dbh);
FILE_POINTER DBH_MAXIMUM_RECORD_SIZE (DBHashTable * dbh);
char *DBH_PATH (DBHashTable * dbh);
Functions
int dbh_close (DBHashTable *dbh);
int dbh_destroy (DBHashTable *dbh);
DBHashTable *dbh_open (const char *path);
DBHashTable *dbh_openR (const char *path);
DBHashTable *dbh_create (const char *path, unsigned char key_length);
int dbh_erase (DBHashTable *dbh);
int dbh_unerase (DBHashTable *dbh);
int dbh_prune (DBHashTable *dbh, unsigned char *key, unsigned char
subtree_length);
int dbh_unprune (DBHashTable *dbh, unsigned char *key, unsigned char
subtree_length);
FILE_POINTER dbh_find (DBHashTable *dbh, int n);
void dbh_genkey (unsigned char *key, unsigned char length, unsigned int
n);
void dbh_genkey2 (unsigned char *key, unsigned char length, unsigned
int n);
void dbh_orderkey (unsigned char *key, unsigned char length, unsigned
int n, unsigned char base);
FILE_POINTER dbh_load (DBHashTable *dbh);
unsigned char dbh_load_address (DBHashTable *dbh, FILE_POINTER
currentseek);
FILE_POINTER dbh_load_parent (DBHashTable *dbh);
FILE_POINTER dbh_load_child (DBHashTable *dbh, unsigned char
key_index);
DBHashTable *dbh_regen_sweep (DBHashTable *dbh);
DBHashTable *dbh_regen_fanout (DBHashTable *dbh);
int dbh_settempdir (DBHashTable *dbh, char *temp_dir);
void dbh_set_data (DBHashTable *dbh, void *data, FILE_POINTER size);
void dbh_set_key (DBHashTable *dbh, unsigned char *key);
int dbh_set_size (DBHashTable *dbh, FILE_POINTER size);
void dbh_set_recordsize (DBHashTable *dbh, int record_size);
int dbh_sweep (DBHashTable *dbh, DBHashFunc operate, unsigned char
*key1, unsigned char *key2, unsigned char ignore_portion);
int dbh_fanout (DBHashTable *dbh, DBHashFunc operate, unsigned char
*key1, unsigned char *key2, unsigned char ignore_portion);
int dbh_foreach_sweep (DBHashTable *dbh, DBHashFunc operate);
int dbh_foreach_fanout (DBHashTable *dbh, DBHashFunc operate);
void dbh_exit_sweep (DBHashTable *dbh);
void dbh_exit_fanout (DBHashTable *dbh);
FILE_POINTER dbh_update (DBHashTable *dbh);
int dbh_writeheader (DBHashTable *dbh);
SEE ALSO
dbh_macros (3), dbh_close (3), dbh_create (3), dbh_erase (3), dbh_find
(3), dbh_genkey (3), dbh_load (3), dbh_regen_sweep (3), dbh_set_data
(3), dbh_set_size (3), dbh_sweep (3), dbh_update (3)
Author
Edscott Wilson Garcia <edscott@xfce.org>
DBH DBHashTables dbh.h(0)