wwwoffle.conf(5)
NAME
wwwoffle.conf - The configuration file for the proxy server for the
World Wide Web Offline Explorer.
Introduction
The configuration file ( wwwoffle.conf ) specifies all of the parame‐
ters that control the operation of the proxy server. The file is split
into sections each containing a series of parameters as described
below. The file CHANGES.CONF explains the changes in the configuration
file between this version of the program and previous ones.
The file is split into sections, each of which can be empty or contain
one or more lines of configuration information. The sections are named
and the order that they appear in the file is not important.
The general format of each of the sections is the same. The name of
the section is on a line by itself to mark the start. The contents of
the section are enclosed between a pair of lines containing only the
´{´ and ´}´ characters or the ´[´ and ´]´ characters. When the ´{´ and
´}´ characters are used the lines between contain configuration infor‐
mation. When the ´[´ and ´]´ characters are used then there must only
be a single non-empty line between them that contains the name of a
file (in the same directory) containing the configuration information
for the section.
Comments are marked by a ´#´ character at the start of the line and
they are ignored. Blank lines are also allowed and ignored.
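As a sketch of the layout described above, a file might hold one section
with its contents inline and another whose contents are read from a
separate file. The option values and the file name purge.conf are
illustrative assumptions, not defaults.

```
# A section with its configuration given inline.
StartUp
{
http-port = 8080
}

# A section whose configuration is read from a file in the same directory.
Purge
[
purge.conf
]
```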
The phrases URL-SPECIFICATION (or URL-SPEC for short) and WILDCARD have
specific meanings in the configuration file and are described at the
end. Any item enclosed in ´(´ and ´)´ in the descriptions means that
it is a parameter supplied by the user, anything enclosed in ´[´ and
´]´ is optional, the ´|´ symbol is used to denote alternate choices.
Some options apply to specific URLs only; this is indicated by a
URL-SPECIFICATION enclosed between ´<´ and ´>´ in the option. The first
URL-SPECIFICATION to match is used. If no URL-SPECIFICATION is given
then the option matches all URLs.
StartUp
This contains the parameters that are used when the program starts,
changes to these are ignored if the configuration file is re-read while
the program is running.
bind-ipv4 = (hostname) | (ip-address) | none
Specify the hostname or IP address to bind the HTTP proxy and
WWWOFFLE control port sockets to using IPv4 (default=´0.0.0.0´).
If ´none´ is specified then no IPv4 socket is bound. If this is
changed from the default value then the first entry in the
LocalHost section may need to be changed to match.
bind-ipv6 = (hostname) | (ip-address) | none
Specify the hostname or IP address to bind the HTTP proxy and
WWWOFFLE control port sockets to using IPv6 (default=´::´). If
´none´ is specified then no IPv6 socket is bound. This requires
the IPv6 compilation option. If this is changed from the
default value then the first entry in the LocalHost section may
need to be changed to match.
http-port = (port)
An integer specifying the port number for connections to access
the internal WWWOFFLE pages and for HTTP/HTTPS/FTP proxying
(default=8080). This is the port number that must be specified
in the client to connect to the WWWOFFLE proxy for
HTTP/HTTPS/FTP proxying.
https-port = (port)
An integer specifying the port number for encrypted connections
to access the internal WWWOFFLE pages and for HTTP/FTP proxying
(default=8443). Requires gnutls compilation option.
wwwoffle-port = (port)
An integer specifying the port number for the WWWOFFLE control
connections to use (default=8081).
spool-dir = (dir)
The full pathname of the top level cache directory (spool direc‐
tory) (default=/var/spool/wwwoffle or whatever was used when the
program was compiled).
run-uid = (user) | (uid)
The username or numeric uid to change to when the WWWOFFLE
server is started (default=none). This option only works if the
server is started by the root user on UNIX-like systems.
run-gid = (group) | (gid)
The group name or numeric gid to change to when the WWWOFFLE
server is started (default=none). This option only works if the
server is started by the root user on UNIX-like systems.
use-syslog = yes | no
Whether to use the syslog facility for messages or not
(default=yes).
password = (word)
The password used for authentication of the control pages, for
deleting cached pages etc (default=none). For the password to
be secure the configuration file must be set so that only autho‐
rised users can read it.
max-servers = (integer)
The maximum number of server processes that are started for
online and automatic fetching (default=8).
max-fetch-servers = (integer)
The maximum number of server processes that are started to fetch
pages that were marked in offline mode (default=4). This value
must be less than max-servers or you will not be able to use
WWWOFFLE interactively online while fetching.
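A minimal sketch of a StartUp section that binds the proxy to the IPv4
loopback address only. The user and group name wwwoffle is a
hypothetical account; the port numbers and spool directory are the
documented defaults. Because bind-ipv4 differs from its default here,
the first entry in the LocalHost section may need to be changed to
match.

```
StartUp
{
bind-ipv4 = 127.0.0.1
bind-ipv6 = none
http-port = 8080
wwwoffle-port = 8081
spool-dir = /var/spool/wwwoffle
run-uid = wwwoffle
run-gid = wwwoffle
use-syslog = yes
max-servers = 8
max-fetch-servers = 4
}
```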
Options
Options that control how the program works.
log-level = debug | info | important | warning | fatal
The minimum log level for messages in syslog or stderr
(default=important).
socket-timeout = (time)
The time in seconds that WWWOFFLE will wait for data on a socket
connection before giving up (default=120).
dns-timeout = (time)
The time in seconds that WWWOFFLE will wait for a DNS (Domain
Name Service) lookup before giving up (default=60).
connect-timeout = (time)
The time in seconds that WWWOFFLE will wait for the socket con‐
nection to be made before giving up (default=30).
connect-retry = yes | no
If a connection cannot be made to a remote server then WWWOFFLE
should try again after a short delay (default=no).
dir-perm = (octal int)
The directory permissions to use when creating spool directories
(default=0755). This option overrides the umask of the user and
must be in octal starting with a ´0´.
file-perm = (octal int)
The file permissions to use when creating spool files
(default=0644). This option overrides the umask of the user and
must be in octal starting with a ´0´.
run-online = (filename)
The full pathname of a program to run when WWWOFFLE is switched
to online mode (default=none). The program is started in the
background with a single parameter set to the current mode name
"online".
run-offline = (filename)
The full pathname of a program to run when WWWOFFLE is switched
to offline mode (default=none). The program is started in the
background with a single parameter set to the current mode name
"offline".
run-autodial = (filename)
The full pathname of a program to run when WWWOFFLE is switched
to autodial (default=none). The program is started in the
background with a single parameter set to the current mode name
"autodial".
run-fetch = (filename)
The full pathname of a program to run when a WWWOFFLE fetch
starts or stops (default=none). The program is started in the
background with two parameters, the first is the word "fetch"
and the second is "start" or "stop".
lock-files = yes | no
Enable the use of lock files to stop more than one WWWOFFLE
process from downloading the same URL at the same time
(default=no). Disabling the lock-files may result in incomplete
pages being displayed or many copies being downloaded if multi‐
ple requests are made for the same URL at the same time.
reply-compressed-data = yes | no
If the replies that are made to the client are to contain com‐
pressed data when requested (default=no). Requires zlib compi‐
lation option.
reply-chunked-data = yes | no
If the replies that are made to the client are to use chunked
encoding when possible (default=yes).
exec-cgi = (pathname)
Enable the use of CGI scripts for the local pages on the WWWOF‐
FLE server that match the wildcard pathname (default=none).
OnlineOptions
Options that control how WWWOFFLE behaves when it is online.
[<URL-SPEC>] pragma-no-cache = yes | no
Whether to request a new copy of a page if the request from the
client has ´Pragma: no-cache´ (default=yes). This option takes
precedence over the request-changed and request-changed-once
options.
[<URL-SPEC>] cache-control-no-cache = yes | no
Whether to request a new copy of a page if the request from the
client has ´Cache-Control: no-cache´ (default=yes). This option
takes precedence over the request-changed and
request-changed-once options.
[<URL-SPEC>] cache-control-max-age-0 = yes | no
Whether to request a new copy of a page if the request from the
client has ´Cache-Control: max-age=0´ (default=yes). This
option takes precedence over the request-changed and
request-changed-once options.
[<URL-SPEC>] cookies-force-refresh = yes | no
Whether to force the refresh of a page if the request from the
client contains a cookie (default=no). This option takes prece‐
dence over the request-changed and request-changed-once options.
[<URL-SPEC>] request-changed = (time)
While online pages will only be fetched if the cached version is
older than this specified time in seconds (default=600). Set‐
ting this value negative will indicate that cached pages are
always used while online. Longer times can be specified with a
´m´, ´h´, ´d´ or ´w´ suffix for minutes, hours, days or weeks
(e.g. 10m=600).
[<URL-SPEC>] request-changed-once = yes | no
While online pages will only be fetched if the cached version
has not already been fetched once this session online
(default=yes). This option takes precedence over the
request-changed option.
[<URL-SPEC>] request-expired = yes | no
While online pages that have expired will always be requested
again (default=no). This option takes precedence over the
request-changed and request-changed-once options.
[<URL-SPEC>] request-no-cache = yes | no
While online pages that ask not to be cached will always be
requested again (default=no). This option takes precedence over
the request-changed and request-changed-once options.
[<URL-SPEC>] request-redirection = yes | no
While online pages that redirect the client to another URL
temporarily will be requested again (default=no). This option
takes precedence over the request-changed and
request-changed-once options.
[<URL-SPEC>] request-conditional = yes | no
While online pages that are requested from the server will be
conditional requests so that server only sends data if the page
has changed (default=yes).
[<URL-SPEC>] validate-with-etag = yes | no
When making a conditional request to a server enable the use of
the HTTP/1.1 cache validator ´Etag´ as well as modification time
(default=yes).
[<URL-SPEC>] try-without-password = yes | no
If a request is made for a URL that contains a username and
password then a request is made for the same URL without a user‐
name and password specified (default=yes). This allows for
requests for the URL without a password to re-direct the client
to the passworded version.
[<URL-SPEC>] intr-download-keep = yes | no
If the client closes the connection while online the currently
downloaded incomplete page should be kept (default=no).
[<URL-SPEC>] intr-download-size = (integer)
If the client closes the connection while online the page should
continue to download if it is smaller than this size in kB
(default=1).
[<URL-SPEC>] intr-download-percent = (integer)
If the client closes the connection while online the page should
continue to download if it is more than this percentage complete
(default=80).
[<URL-SPEC>] timeout-download-keep = yes | no
If the server connection times out while reading then the cur‐
rently downloaded incomplete page should be kept (default=no).
[<URL-SPEC>] keep-cache-if-not-found = yes | no
If the remote server replies with an error message or a redi‐
rection while there is a cached version with status 200 the pre‐
viously cached version should be kept (default=no).
[<URL-SPEC>] request-compressed-data = yes | no
If the requests that are made to the server are to request com‐
pressed data (default=yes). Requires zlib compilation option.
[<URL-SPEC>] request-chunked-data = yes | no
If the requests that are made to the server are to request chun‐
ked encoding (default=yes).
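The following sketch shows how a URL-SPECIFICATION prefix might be
combined with an unprefixed option in this section. The host
news.example.com is hypothetical. The prefixed line comes first because
the first URL-SPECIFICATION to match is used.

```
OnlineOptions
{
# Hypothetical host: use the cached copy for up to one hour.
<http://news.example.com/*> request-changed = 1h
# All other URLs: use the cached copy for up to one day.
request-changed = 1d
request-conditional = yes
intr-download-keep = no
}
```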
OfflineOptions
Options that control how WWWOFFLE behaves when it is offline.
[<URL-SPEC>] pragma-no-cache = yes | no
Whether to request a new copy of a page if the request from the
client has ´Pragma: no-cache´ (default=yes). This option should
be set to ´no´ if when browsing offline all pages are
re-requested by a ´broken´ browser.
[<URL-SPEC>] cache-control-no-cache = yes | no
Whether to request a new copy of a page if the request from the
client has ´Cache-Control: no-cache´ (default=yes). This option
should be set to ´no´ if when browsing offline all pages are
re-requested by a ´broken´ browser.
[<URL-SPEC>] cache-control-max-age-0 = yes | no
Whether to request a new copy of a page if the request from the
client has ´Cache-Control: max-age=0´ (default=yes). This
option should be set to ´no´ if when browsing offline all pages
are re-requested by a ´broken´ browser.
[<URL-SPEC>] confirm-requests = yes | no
Whether to return a page requiring user confirmation instead of
automatically recording requests made while offline
(default=no).
[<URL-SPEC>] dont-request = yes | no
Do not request any URLs that match this when offline
(default=no).
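As an illustration, the options above could be combined as follows for
a browser that re-requests every page while offline. The host
ads.example.com is a hypothetical name.

```
OfflineOptions
{
pragma-no-cache = no
cache-control-no-cache = no
cache-control-max-age-0 = no
confirm-requests = yes
# Hypothetical host: never record requests for it while offline.
<*://ads.example.com/*> dont-request = yes
}
```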
SSLOptions
Options that control how WWWOFFLE behaves when a connection is made to
it for a Secure Sockets Layer (SSL) server. Normally only tunnelling
(with no decryption or caching of the data) is possible. When WWWOFFLE
is compiled with the gnutls library it is possible to configure WWWOFFLE
to decrypt, cache and re-encrypt the connections.
enable-caching = yes | no
If caching (involving decryption and re-encryption) of Secure
Sockets Layer (SSL) server connections is allowed (default =
no).
allow-tunnel = (host[:port])
A hostname and port number (a WILDCARD match) for an SSL server
that can be connected to using WWWOFFLE as a tunnelling proxy
(no caching or decryption of the data) (default is no hosts or
ports allowed). This option should be set to *:443 to allow
https to the default port number. There can be more than one
option for other ports or hosts as required. This option takes
precedence over the allow-cache option. The host value is
matched against the URL as presented, no hostname to IP or IP to
hostname lookups are performed to find alternative equivalent
names.
disallow-tunnel = (host[:port])
A hostname and port number (a WILDCARD match) for an SSL server
that can not be connected to using WWWOFFLE as a tunnelling
proxy. There can be more than one option for other ports or
hosts as required. This option takes precedence over the
allow-tunnel option. The host value is matched against the URL
as presented, no hostname to IP or IP to hostname lookups are
performed to find alternative equivalent names.
allow-cache = (host[:port])
A hostname and port number (a WILDCARD match) for an SSL server
that can be connected to using WWWOFFLE as a caching proxy
(decryption of the data) (default is no hosts or ports allowed).
This option should be set to *:443 to allow https to the default
port number. There can be more than one option for other ports
or hosts as required. The host value is matched against the URL
as presented, no hostname to IP or IP to hostname lookups are
performed to find alternative equivalent names.
disallow-cache = (host[:port])
A hostname and port number (a WILDCARD match) for an SSL server
that can not be connected to using WWWOFFLE as a caching proxy.
This option takes precedence over the allow-cache option. The
host value is matched against the URL as presented, no hostname
to IP or IP to hostname lookups are performed to find alterna‐
tive equivalent names.
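A minimal sketch of a tunnelling-only setup, assuming caching of SSL
connections is not wanted. The host ads.example.com is hypothetical.
The disallow-tunnel line takes precedence over allow-tunnel, so that
host is refused even though *:443 would otherwise match it.

```
SSLOptions
{
enable-caching = no
allow-tunnel = *:443
disallow-tunnel = ads.example.com:443
}
```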
FetchOptions
Options that control what linked elements are downloaded when fetching
pages that were requested while offline.
[<URL-SPEC>] stylesheets = yes | no
If style sheets are to be fetched (default=no).
[<URL-SPEC>] images = yes | no
If images are to be fetched (default=no).
[<URL-SPEC>] webbug-images = yes | no
If images that are 1 pixel square are also to be fetched
(default=yes). Requires the images option to also be selected.
If these images are not fetched then the replace-webbug-images
option in the ModifyHTML section can be used to stop browsers
requesting them.
[<URL-SPEC>] icon-images = yes | no
If icons (also called favourite icons or shortcut icons) as used
by browsers for bookmarks are to be fetched (default=no).
[<URL-SPEC>] only-same-host-images = yes | no
If the only images that are fetched are the ones that are on the
same host as the page that references them, requires the images
option to also be selected (default=no).
[<URL-SPEC>] frames = yes | no
If frames are to be fetched (default=no).
[<URL-SPEC>] iframes = yes | no
If inline frames (iframes) are to be fetched (default=no).
[<URL-SPEC>] scripts = yes | no
If scripts (e.g. Javascript) are to be fetched (default=no).
[<URL-SPEC>] objects = yes | no
If objects (e.g. Java class files) are to be fetched
(default=no).
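One possible combination of the options above, shown only as a sketch:
it fetches style sheets, frames and images from the same host as the
page, and skips 1 pixel square images and scripts.

```
FetchOptions
{
stylesheets = yes
images = yes
webbug-images = no
only-same-host-images = yes
frames = yes
scripts = no
}
```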
IndexOptions
Options that control what is displayed in the indexes.
create-history-indexes = yes | no
Enables creation of the lasttime/prevtime and lastout/prevout
indexes (default=yes). The cycling of the indexes is always
performed and they will flush even if this option is disabled.
cycle-indexes-daily = yes | no
Cycles the lasttime/prevtime and lastout/prevout indexes daily
instead of each time online or fetching (default = no).
<URL-SPEC> list-outgoing = yes | no
Choose if the URL is to be listed in the outgoing index
(default=yes).
<URL-SPEC> list-latest = yes | no
Choose if the URL is to be listed in the lasttime/prevtime and
lastout/prevout indexes (default=yes).
<URL-SPEC> list-monitor = yes | no
Choose if the URL is to be listed in the monitor index
(default=yes).
<URL-SPEC> list-host = yes | no
Choose if the URL is to be listed in the host indexes
(default=yes).
<URL-SPEC> list-any = yes | no
Choose if the URL is to be listed in any of the indexes
(default=yes).
ModifyHTML
Options that control how the HTML that is provided from the cache is
modified.
[<URL-SPEC>] enable-modify-html = yes | no
Enable the HTML modifications in this section (default=no).
With this option disabled the following HTML options will not
have any effect. With this option enabled there is a small
speed penalty.
[<URL-SPEC>] add-cache-info = yes | no
At the bottom of all of the spooled pages the date that the page
was cached and some navigation buttons are to be added
(default=no).
[<URL-SPEC>] anchor-cached-begin = (HTML code) |
Anchors (links) in the spooled page that are in the cache are to
have the specified HTML inserted at the beginning (default="").
[<URL-SPEC>] anchor-cached-end = (HTML code) |
Anchors (links) in the spooled page that are in the cache are to
have the specified HTML inserted at the end (default="").
[<URL-SPEC>] anchor-requested-begin = (HTML code) |
Anchors (links) in the spooled page that are not in the cache
but have been requested for download are to have the specified
HTML inserted at the beginning (default="").
[<URL-SPEC>] anchor-requested-end = (HTML code) |
Anchors (links) in the spooled page that are not in the cache
but have been requested for download are to have the specified
HTML inserted at the end (default="").
[<URL-SPEC>] anchor-not-cached-begin = (HTML code) |
Anchors (links) in the spooled page that are not in the cache or
requested are to have the specified HTML inserted at the begin‐
ning (default="").
[<URL-SPEC>] anchor-not-cached-end = (HTML code) |
Anchors (links) in the spooled page that are not in the cache or
requested are to have the specified HTML inserted at the end
(default="").
[<URL-SPEC>] disable-script = yes | no
Removes all scripts and scripted events (default=no).
[<URL-SPEC>] disable-applet = yes | no
Removes all Java applets (default=no).
[<URL-SPEC>] disable-style = yes | no
Removes all stylesheets and style references (default=no).
[<URL-SPEC>] disable-blink = yes | no
Removes the <blink> tag from HTML but does not disable blink in
stylesheets (default=no).
[<URL-SPEC>] disable-marquee = yes | no
Removes the <marquee> tag from HTML to stop scrolling text
(default=no).
[<URL-SPEC>] disable-flash = yes | no
Removes any Shockwave Flash animations (default=no).
[<URL-SPEC>] disable-iframe = yes | no
Removes any inline frames (the <iframe> tag) from HTML
(default=no).
[<URL-SPEC>] disable-meta-refresh = yes | no
Removes any meta tags in the HTML header that re-direct the
client to change to another page after an optional delay
(default=no).
[<URL-SPEC>] disable-meta-refresh-self = yes | no
Removes any meta tags in the HTML header that re-direct the
client to reload the same page after a delay (default=no).
[<URL-SPEC>] disable-meta-set-cookie = yes | no
Removes any meta tags in the HTML header that cause cookies to
be set (default=no).
[<URL-SPEC>] disable-dontget-links = yes | no
Disables any links to URLs that are in the DontGet section of
the configuration file (default=no).
[<URL-SPEC>] disable-dontget-iframes = yes | no
Disables inline frame (iframe) URLs that are in the DontGet sec‐
tion of the configuration file (default=no).
[<URL-SPEC>] replace-dontget-images = yes | no
Replaces image URLs that are in the DontGet section of the con‐
figuration file with a static URL (default=no).
[<URL-SPEC>] replacement-dontget-image = (URL)
The replacement image to use for URLs that are in the DontGet
section of the configuration file (default=/local/dont‐
get/replacement.gif).
[<URL-SPEC>] replace-webbug-images = yes | no
Replaces image URLs that are 1 pixel square with a static URL
(default=no). The webbug-images option in the FetchOptions sec‐
tion can be used to stop these images from being automatically
downloaded.
[<URL-SPEC>] replacement-webbug-image = (URL)
The replacement image to use for images that are 1 pixel square
(default=/local/dontget/replacement.gif).
[<URL-SPEC>] demoronise-ms-chars = yes | no
Replaces strange characters that some Microsoft applications put
into HTML with character equivalents that most browsers can dis‐
play (default=no). The idea for this comes from the public
domain Demoroniser perl script.
[<URL-SPEC>] fix-mixed-cyrillic = yes | no
Replaces punctuation characters in cp-1251 encoding that are
combined with text in koi-8 encoding that appears in some cyril‐
lic web pages.
[<URL-SPEC>] disable-animated-gif = yes | no
Disables the animation in animated GIF files (default=no).
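A sketch of a ModifyHTML section that turns on a few of the
modifications above. The selection is an illustrative assumption. The
replacement image path is the documented default.

```
ModifyHTML
{
# Nothing below has any effect unless this is enabled.
enable-modify-html = yes
add-cache-info = yes
disable-blink = yes
disable-meta-refresh-self = yes
replace-webbug-images = yes
replacement-webbug-image = /local/dontget/replacement.gif
}
```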
LocalHost
A list of hostnames that the host running the WWWOFFLE server may be
known by. This is so that the proxy does not need to contact itself if
the request has a different name for the same server.
(host) A hostname or IP address that in connection with the port number
(in the StartUp section) specifies the WWWOFFLE proxy HTTP
server. The hostnames must match exactly, it is not a WILDCARD
match. The first named host is used as the server name for sev‐
eral features so must be a name that will work from any client
host on the network. The entries can be hostnames, IPv4
addresses or IPv6 addresses enclosed within ´[...]´. None of
the hosts named here are cached or fetched via a proxy.
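For illustration, a LocalHost section for a server reached only through
the loopback interface might look like this. The entries are
assumptions about the local machine. The first name is the one used as
the server name.

```
LocalHost
{
localhost
127.0.0.1
[::1]
}
```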
LocalNet
A list of hostnames whose web servers are always accessible even when
offline and are not to be cached by WWWOFFLE because they are on a
local network.
(host) A hostname or IP address that is always available and is not to
be cached by WWWOFFLE. The host name matching uses WILDCARDs.
A host can be excluded by prefixing the name with a ´!´.
The host value is matched against the URL as presented,
no hostname to IP or IP to hostname lookups are performed to
find alternative equivalent names. The entries can be host‐
names, IPv4 addresses or IPv6 addresses enclosed within ´[...]´.
All entries here are assumed to be reachable even when offline.
None of the hosts named here are cached or fetched via a proxy.
AllowedConnectHosts
A list of client hostnames that are allowed to connect to the server.
(host) A hostname or IP address that is allowed to connect to the
server. The host name matching uses WILDCARDs. A host can be
excluded by prefixing the name with a ´!´. If the IP
address or hostname (if available) of the machine connecting
matches then it is allowed. The entries can be hostnames, IPv4
addresses or IPv6 addresses enclosed within ´[...]´. All of the
hosts named in LocalHost are also allowed to connect.
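A sketch of an AllowedConnectHosts section using hypothetical names and
addresses. The excluded host is listed first on the assumption that
entries are tried in order, as is stated for URL-SPECIFICATIONs. This
section's description does not say so itself.

```
AllowedConnectHosts
{
# Hypothetical host that is refused.
!guest.lan.example
# Every other host in the hypothetical domain, and one address range.
*.lan.example
192.168.1.*
}
```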
AllowedConnectUsers
A list of the users that are allowed to connect to the server and their
passwords.
(username):(password)
The username and password of the users that are allowed to con‐
nect to the server. If this section is left empty then no user
authentication is done. The username and password are both
stored in plaintext format. This requires the use of clients
that handle the HTTP/1.1 proxy authentication standard.
DontCache
A list of URLs that are not to be cached by WWWOFFLE.
[!]URL-SPECIFICATION
Do not cache any URLs that match this. The URL-SPECIFICATION
can be negated to allow matches to be cached. The URLs that are
not cached will not be requested if offline.
DontGet
A list of URLs that are not to be got by WWWOFFLE when it is fetching
and not to be served from the WWWOFFLE cache even if they exist.
[!]URL-SPECIFICATION
Do not get any URLs that match this. The URL-SPECIFICATION can
be negated to allow matches to be got.
[<URL-SPEC>] replacement = (URL)
The URL to use to replace any URLs that match the
URL-SPECIFICATIONs instead of using the standard error message
(default=none). The URLs in /local/dontget/ are suggested
replacements (e.g. replacement.gif or replacement.png which are
1x1 pixel transparent images or replacement.js which is an empty
javascript file).
<URL-SPEC> get-recursive = yes | no
Choose whether to get URLs that match this when doing a recur‐
sive fetch (default=yes).
<URL-SPEC> location-error = yes | no
When a URL reply contains a ´Location´ header that redirects to
a URL that is not got (specified in this section) then the reply
is modified to be an error message instead (default=no). This
will stop ISP proxies from redirecting users to adverts if the
advert URLs are in this section.
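The options above might be combined as in the following sketch. The
host ads.example.com is hypothetical. The replacement URL is one of the
suggested files in /local/dontget/.

```
DontGet
{
# Hypothetical advertising host: never fetch or serve its URLs.
*://ads.example.com/*
# Serve a transparent image instead of the standard error message.
replacement = /local/dontget/replacement.gif
# Turn redirections to a blocked URL into an error message.
<*://*/*> location-error = yes
}
```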
DontCompress
A list of MIME types and file extensions that are not to be compressed
by WWWOFFLE (because they are already compressed or not worth
compressing). Requires zlib compilation option.
mime-type = (mime-type)/(subtype)
The MIME type of a URL that is not to be compressed in the cache
or when providing compressed pages to clients.
file-ext = .(file-ext)
The file extension of a URL that is not to be requested com‐
pressed from a server.
CensorHeader
A list of HTTP header lines that are to be removed from the requests
sent to web servers and the replies that come back from them.
[<URL-SPEC>] (header) = yes | no | (string)
A header field name (e.g. From, Cookie, Set-Cookie, User-Agent)
and the string to replace the header value with (default=no).
The header is case sensitive, and does not have a ´:´ at the
end. The value of "no" means that the header is unmodified,
"yes" or no string can be used to remove the header or a string
can be used to replace the header. This only replaces headers
it finds, it does not add any new ones. An option for Referer
here will take precedence over the referer-self and ref‐
erer-self-dir options.
[<URL-SPEC>] referer-self = yes | no
Sets the Referer header to the same as the URL being requested
(default=no). This will add the Referer header if none is con‐
tained in the original request.
[<URL-SPEC>] referer-self-dir = yes | no
Sets the Referer header to the directory name of the URL being
requested (default=no). This will add the Referer header if
none is contained in the original request. This option takes
precedence over referer-self.
[<URL-SPEC>] referer-from = yes | no
Removes the Referer header based on a match of the referring URL
(default=no).
[<URL-SPEC>] force-user-agent = yes | no
Forces a User-Agent header to be inserted into all requests that
are made by WWWOFFLE (default=no). This User-Agent is added
only if there is not an existing User-Agent header and is set to
the value WWWOFFLE/<version-number>. This header is inserted
before censoring and may be changed by the normal header censor‐
ing method.
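A sketch of a CensorHeader section showing the three kinds of value
described above. The host tracker.example.com and the string
ExampleBrowser/1.0 are hypothetical.

```
CensorHeader
{
# Remove the From header wherever it is found.
From = yes
# Hypothetical host: remove cookies in requests and replies.
<*://tracker.example.com/*> Cookie = yes
<*://tracker.example.com/*> Set-Cookie = yes
# Replace the value of an existing User-Agent header.
User-Agent = ExampleBrowser/1.0
# Set the Referer header to the URL being requested.
referer-self = yes
}
```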
FTPOptions
Options to use when fetching files using the ftp protocol.
anon-username = (string)
The username to use for anonymous ftp (default=anonymous).
anon-password = (string)
The password to use for anonymous ftp (default determined at run
time). If using a firewall then this may contain a value that
is not valid to the FTP server and may need to be set to a dif‐
ferent value.
<URL-SPEC> auth-username = (string)
The username to use on a host instead of the default anonymous
username.
<URL-SPEC> auth-password = (string)
The password to use on a host instead of the default anonymous
password.
MIMETypes
MIME Types to use when serving files that were not fetched using HTTP
or for files on the built-in web-server.
default = (mime-type)/(subtype)
The default MIME type (default=text/plain).
.(file-ext) = (mime-type)/(subtype)
The MIME type to associate with a file extension. The ´.´ must
be included in the file extension. If more than one extension
matches then the longest one is used.
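A short sketch of a MIMETypes section. The MIME types chosen for each
extension are illustrative. A file named archive.tar.gz matches both
´.gz´ and ´.tar.gz´, and the longer ´.tar.gz´ entry is the one used.

```
MIMETypes
{
default = text/plain
.html = text/html
.gif = image/gif
.gz = application/gzip
.tar.gz = application/x-gtar
}
```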
Proxy
This contains the names of the HTTP (or other) proxies to use external
to the WWWOFFLE server machine.
[<URL-SPEC>] proxy = (host[:port])
The hostname and port on it to use as the proxy.
<URL-SPEC> auth-username = (string)
The username to use on a proxy host to authenticate WWWOFFLE to
it. The URL-SPEC in this case refers to the proxy and not the
URL being retrieved.
<URL-SPEC> auth-password = (string)
The password to use on a proxy host to authenticate WWWOFFLE to
it. The URL-SPEC in this case refers to the proxy and not the
URL being retrieved.
[<URL-SPEC>] ssl = (host[:port])
A proxy server that should be used for Secure Socket Layer (SSL)
connections e.g. https. Note that for the <URL-SPEC> only
the host is checked and the other parts must be ´*´
WILDCARDs.
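A sketch of a Proxy section, assuming two hypothetical upstream proxies
named proxy.example.net and sslproxy.example.net. In the ssl line only
the host part of the URL-SPEC is given; the other parts are left as
´*´.

```
Proxy
{
# Send HTTP requests through a hypothetical upstream proxy.
<http://*/*> proxy = proxy.example.net:3128
# Send SSL connections for one domain through another hypothetical proxy.
<*://*.example.com/*> ssl = sslproxy.example.net:3128
}
```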
Alias
A list of aliases that are used to replace the server name and path
with another server name and path.
URL-SPECIFICATION = URL-SPECIFICATION
Any requests that match the first URL-SPECIFICATION are replaced
by the second URL-SPECIFICATION. The first URL-SPECIFICATION is
a wildcard match for the protocol and host/port, the path must
match the start of the requested URL exactly and includes all
subdirectories.
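For illustration, an alias that redirects one directory on a
hypothetical server to a hypothetical mirror could be written as
follows. A request for http://www.example.com/old/a/b.html would match,
since the path matches the start of the requested URL and includes all
subdirectories.

```
Alias
{
http://www.example.com/old/ = http://mirror.example.net/new/
}
```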
Purge
The method to determine which pages to purge, the default age, the
host-specific maximum age of the pages in days, and the maximum cache
size.
use-mtime = yes | no
The method to use to decide which files to purge, last access
time (atime) or last modification time (mtime) (default=no).
max-size = (size)
The maximum size for the cache in MB after purging (default=-1).
A maximum cache size of -1 (or 0 for backwards compatibility)
means there is no limit to the size. If this and the min-free
options are both used the smaller cache size is chosen. This
option takes into account the URLs that are never purged when
measuring the cache size but will not purge them.
min-free = (size)
The minimum amount of free disk space in MB after purging
(default=-1). A minimum disk free of -1 (or 0) means there is
no limit to the free space. If this and the max-size options
are both used the smaller cache size is chosen. This option
takes into account the URLs that are never purged when measuring
the cache size but will not purge them.
use-url = yes | no
If true then use the URL to decide on the purge age, otherwise
use the protocol and host only (default=no).
del-dontget = yes | no
If true then delete the URLs that match the entries in the Dont‐
Get section (default=no).
del-dontcache = yes | no
If true then delete the URLs that match the entries in the Dont‐
Cache section (default=no).
[<URL-SPEC>] age = (age)
The maximum age in the cache for URLs that match this
(default=14). An age of zero means always to delete, negative
means not to delete. The URL-SPECIFICATION matches only the
protocol and host unless use-url is set to true. Longer times
can be specified with a ´w´, ´m´ or ´y´ suffix for weeks, months
or years (e.g. 2w=14).
[<URL-SPEC>] compress-age = (age)
The maximum age in days in the cache for URLs that match this to
be stored uncompressed (default=-1). Requires the zlib compilation
option. An age of zero means always compress; a negative age means
never compress. The URL-SPECIFICATION matches only the protocol
and host unless use-url is set to true. Longer times can be
specified with a ´w´, ´m´ or ´y´ suffix for weeks, months or
years (e.g. 2w=14).
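As an illustration of how these options combine, the following is a
minimal sketch of a Purge section, not a recommended configuration. The
host names (www.foo.com, *.bar.com) and all of the numbers are
assumptions chosen for the example. Because the first URL-SPECIFICATION
to match is used, the host-specific entries are placed before the entry
that has no URL-SPECIFICATION.

```
Purge
{
# Decide what to purge from the last access time (the default).
use-mtime = no
# Keep the cache to 500 MB and leave at least 100 MB of disk free;
# whichever gives the smaller cache applies.
max-size = 500
min-free = 100
# Never purge pages from this host (negative age).
<*://www.foo.com> age = -1
# Purge pages from these hosts after two weeks (2w = 14 days).
<*://*.bar.com> age = 2w
# Everything else is purged after 28 days.
age = 28
# Store pages uncompressed for one week only (needs the zlib option).
compress-age = 1w
}
```

With use-url left at its default of no, only the protocol and host of
each URL-SPECIFICATION are compared, so no path is given here.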
WILDCARD
A WILDCARD match is one that uses the ´*´ character to represent any
group of characters.
This is basically the same as the command line file matching expres‐
sions in DOS or the UNIX shell, except that the ´*´ can match the ´/´
character.
For example
*.gif      matches foo.gif and bar.gif
*.foo.com  matches www.foo.com and ftp.foo.com
/foo/*     matches /foo/bar.html and /foo/bar/foobar.html
URL-SPECIFICATION
In many of the sections a URL-SPECIFICATION can be used to specify a
protocol, host and pathname; it is a pattern for recognising a URL.
For the purposes of this explanation a URL is considered to be made up
of five parts.
proto The protocol that is used (e.g. ´http´, ´ftp´).
host The server hostname (e.g. ´www.gedanken.demon.co.uk´).
port The port number on the host (e.g. default of 80 for HTTP).
path The pathname on the host (e.g. ´/bar.html´) or a directory name
(e.g. ´/foo/´).
args Optional arguments with the URL used for CGI scripts etc. (e.g.
´search=foo´).
For example the WWWOFFLE homepage:
http://www.gedanken.demon.co.uk/wwwoffle/
The protocol is ´http´, the host is ´www.gedanken.demon.co.uk´, the port
is the default (in this case 80), and the pathname is ´/wwwoffle/´.
In general this is written as
(proto)://(host)[:(port)]/[(path)][?(args)]
Where [] indicates an optional feature, and () indicates a user-supplied
name or number.
Some example URL-SPECIFICATION options are the following:
*://*/*
Any protocol, Any host, Any port, Any path, Any args (This is
the default for options that can have a <URL-SPEC> prefix when
none is specified).
*://*/(path)
Any protocol, Any host, Any port, Named path, Any args
*://*/*?
Any protocol, Any host, Any port, Any path, No args
*://*/(path)?*
Any protocol, Any host, Any port, Named path, Any args
*://(host)
Any protocol, Named host, Any port, Any path, Any args
(proto)://*/*
Named proto, Any host, Any port, Any path, Any args
(proto)://(host)/*
Named proto, Named host, Any port, Any path, Any args
(proto)://(host):/*
Named proto, Named host, Default port, Any path, Any args
*://(host):(port)/*
Any protocol, Named host, Named port, Any path, Any args
The matching of the host, the path and the args use the WILDCARD match‐
ing that is described above. The matching of the path has the special
condition that a WILDCARD of ´/*/foo´ will match ´/foo´ and
´/any/path/foo´, in other words it matches any path prefix.
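The forms listed above can be tried out in any option that takes a
<URL-SPEC> prefix. The sketch below uses the age option of the Purge
section for this; the host name, port, file name and ages are all
hypothetical, and use-url is set so that the path part of each
URL-SPECIFICATION is compared as well as the protocol and host.

```
Purge
{
# Compare the whole URL, not only the protocol and host.
use-url = yes
# Named proto, named host, named port, any path.
<http://www.foo.com:8080/*> age = 1
# Any protocol, any host, named path. By the special condition for
# paths this matches /tmp.html as well as /any/path/tmp.html.
<*://*/*/tmp.html> age = 0
# No URL-SPECIFICATION: matches all remaining URLs.
age = 4w
}
```

The entries are ordered from most to least specific because the first
URL-SPECIFICATION to match is the one that is used.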
In some sections that accept URL-SPECIFICATIONs each one can be negated
by inserting the ´!´ character before it. This will mean that the
comparison of a URL with the URL-SPECIFICATION will return the logically
opposite value to what would be returned without the ´!´. If all of
the URL-SPECIFICATIONs in a section are negated and ´*://*/*´ is added
to the end then the sense of the whole section is negated.
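A small sketch of a negated section follows. It assumes that DontGet
is one of the sections that accepts negated entries and that each line
of that section is a single URL-SPECIFICATION; check the description of
the section before relying on this. The host names are hypothetical.

```
DontGet
{
# Every entry is negated ...
!*://*.foo.com/*
!*://www.bar.com/*
# ... and the catch-all at the end negates the sense of the section.
*://*/*
}
```

Under these assumptions the section would act as a list of the only
hosts to get, rather than a list of hosts not to get.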
In all sections that accept URL-SPECIFICATIONs the comparison can be
made case insensitive for the path and arguments part by inserting the
´~´ character before the URL-SPECIFICATION. (The host and the protocol
comparisons are always case insensitive).
EXAMPLE
StartUp
{
http-port = 8080
wwwoffle-port = 8081
spool-dir = /var/spool/wwwoffle
use-syslog = yes
password =
}
Options
{
add-info-refresh = no
request-changed = 3600
}
SSLOptions
{
enable-caching = no
allow-tunnel = *:443
}
FetchOptions
{
images = yes
frames = yes
iframes = yes
}
LocalHost
{
wwwoffle.foo.com
localhost
127.0.0.1
}
DontGet
[
wwwoffle.DontGet.conf
]
LocalNet
{
*.foo.com
}
AllowedConnectHosts
{
*.foo.com
}
Proxy
{
<http://foo.com/*> proxy = www-cache.foo.com:8080
}
Purge
{
max-size = 10
age = 28
<http://*.bar.com/*> age = 7
}
FILES
/etc/wwwoffle/wwwoffle.conf The wwwoffled(8) configuration file.
/var/spool/wwwoffle The WWWOFFLE spool directory.
SEE ALSO
wwwoffle(1), wwwoffled(8).
AUTHOR
Andrew M. Bishop 1996-2007 (amb@gedanken.demon.co.uk)
September 29, 2007 wwwoffle.conf(5)