rwcount(1)			SiLK Tool Suite			    rwcount(1)

NAME
       rwcount - Print traffic summary across time

SYNOPSIS
	 rwcount [--bin-size=SIZE] [--load-scheme=LOADSCHEME]
	       [--start-time=START_TIME] [--end-time=END_TIME]
	       [--skip-zeroes] [--bin-slots] [--epoch-slots]
	       [--timestamp-format=FORMAT] [--no-titles]
	       [--no-columns] [--column-separator=CHAR]
	       [--no-final-delimiter] [{--delimited | --delimited=CHAR}]
	       [--print-filenames] [--copy-input=PATH] [--output-path=PATH]
	       [--pager=PAGER_PROG] [--site-config-file=FILENAME]
	       [{--legacy-timestamps | --legacy-timestamps={1,0}}]
	       {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

	 rwcount --help

	 rwcount --version

DESCRIPTION
       rwcount summarizes SiLK flow records across time.  It counts the
       records in the input stream, and groups their byte and packet totals
       into time bins.	rwcount produces textual output with one row for each
       bin.

       rwcount reads SiLK Flow records from the files named on the command
       line or from the standard input when no file names are specified and
       --xargs is not present.	To read the standard input in addition to the
       named files, use "-" or "stdin" as a file name.	If an input file name
       ends in ".gz", the file will be uncompressed as it is read.  When the
       --xargs switch is provided, rwcount will read the names of the files to
       process from the named text file, or from the standard input if no file
       name argument is provided to the switch.	 The input to --xargs must
       contain one file name per line.

       rwcount splits each flow record into bins whose size is determined by
       the argument to the --bin-size switch.  When that switch is not
       provided, rwcount uses 30-second bins by default.

       By default, the first row of data rwcount prints is the bin containing
       the starting time of the earliest record that appears in the input.
       rwcount then prints a row for every bin until it reaches the bin
       containing the most recent ending time.  Rows whose counts are zero are
       printed unless the --skip-zeroes switch is specified.

       The --start-time and --end-time switches tell rwcount to use a specific
       time for the first row and the final row.  The --start-time switch
       always sets the time stamp on the first bin to the specified time.
       With the --end-time switch, rwcount computes a maximum end-time by
       setting any unspecified hour, minute, second, and millisecond field to
       its maximum value, and the final bin is that which contains the maximum
       end-time.
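
       As a rough illustration of this rule, the following Python sketch
       fills in the unspecified fields of a time string.  The name fill_time
       and the regular expression are hypothetical helpers, not part of SiLK.
       The sketch assumes fixed-width fields, accepts only the
       "yyyy/mm/dd[:HH[:MM[:SS[.sss]]]]" form described under --start-time,
       and ignores timezone handling and the epoch-seconds form.

```python
import re
from datetime import datetime

# Hypothetical pattern for "yyyy/mm/dd[:HH[:MM[:SS[.sss]]]]"; "T" may
# replace the ":" between the day and the hour.
_TIME_RE = re.compile(
    r"^(\d{4})/(\d{2})/(\d{2})"
    r"(?:[:T](\d{2})(?::(\d{2})(?::(\d{2})(?:\.(\d{3}))?)?)?)?$"
)


def fill_time(text, is_end):
    """Parse text; unspecified fields become 0 (start) or the maximum (end)."""
    m = _TIME_RE.match(text)
    if m is None:
        raise ValueError("time needs at least day precision: %r" % text)
    year, month, day = (int(g) for g in m.groups()[:3])
    maxima = (23, 59, 59, 999)  # hour, minute, second, millisecond
    hour, minute, second, milli = (
        int(g) if g is not None else (mx if is_end else 0)
        for g, mx in zip(m.groups()[3:], maxima)
    )
    # datetime stores microseconds, so scale the millisecond field.
    return datetime(year, month, day, hour, minute, second, milli * 1000)
```

       Under these assumptions, fill_time("2009/02/12", is_end=True) gives
       23:59:59.999 on that day, while the same string with is_end=False
       gives midnight.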

       When --start-time and --end-time are both specified, rwcount reserves
       the memory for the bins before it begins processing the records.	 If
       the memory cannot be allocated, rwcount exits.  If this happens, try
       reducing the time span or increasing the bin-size.
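
       To judge how many bins a given time span and bin size imply, a small
       Python sketch such as the following may help.  The name bin_count is
       hypothetical, and the sketch assumes only what is stated above: the
       first bin starts at the start time and the final bin is the one that
       contains the maximum end-time.  It says nothing about how much memory
       rwcount uses per bin.

```python
from datetime import datetime, timedelta


def bin_count(start, end_max, bin_size=30.0):
    """Count bins from start through the bin that contains end_max.

    start and end_max are datetime objects; bin_size is in seconds and
    may be fractional, down to 0.001.
    """
    # Work in integer milliseconds to avoid floating-point drift.
    bin_ms = round(bin_size * 1000)
    if bin_ms < 1:
        raise ValueError("bin size must be at least 0.001 seconds")
    span_ms = (end_max - start) // timedelta(milliseconds=1)
    if span_ms < 0:
        raise ValueError("end time precedes start time")
    return span_ms // bin_ms + 1


day_start = datetime(2009, 2, 12)
day_end = datetime(2009, 2, 12, 23, 59, 59, 999000)
print(bin_count(day_start, day_end, 3600))  # 24 hourly bins
```

       For the same day, the default 30-second bin size yields 2,880 bins
       and a bin size of 0.001 yields 86,400,000 bins, which shows how
       quickly a small bin size inflates the number of bins to reserve.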

   Load Scheme
       A router or other flow generator summarizes the traffic it sees into
       records.	 In addition to the five-tuple (source port and address,
       destination port and address, and protocol), the record has its start
       time, end time, total byte count, and total packet count.  There is no
       way to know how the bytes and packets were distributed during the
       duration of the record: their distribution could be front-loaded, back-
       loaded, uniform, et cetera.

       When the start and end times of an individual flow record put that
       record into a single bin, rwcount can simply add that record's volume
       (byte and packet counts) to the bin.

       When the duration of a flow record causes it to span multiple bins,
       rwcount must be told how to allocate the volume among the bins.  The
       --load-scheme switch determines this, and it supports the following
       allocation schemes:

       time-proportional
	   Divides the total volume of the flow by the duration of the flow,
	   and multiplies the quotient by the time spent in the bin.  Thus,
	   the volume the flow contributes to a bin is proportional to the
	   time the flow spent in the bin.  This models a flow where the
	   volume/second ratio is uniform.

       bin-uniform
	   Divides the volume of the flow by the number of bins the flow
	   spans, and adds the quotient to each of the bins.  In this scheme,
	   the volume/bin ratio is uniform.

       start-spike
	   Adds the total volume for the flow into the bin containing the
	   start time of the flow.  This models a flow that is front-loaded to
	   the point where the entire volume is a single spike occurring in
	   the initial millisecond of flow.

       middle-spike
	   Determines the time at the midpoint of the flow, and adds the
	   entire volume for the flow into the bin containing that time.

       end-spike
	   Adds the total volume for the flow into the bin containing the end
	   time of the flow.  This models a flow that is back-loaded to the
           point where the entire volume is a single spike occurring in the
           final millisecond of the flow.

       maximum-volume
	   Adds the entire volume for the flow into every bin that contains
	   any part of the flow.  In theory, the distribution of the bytes in
	   the record could be a spike that occurs at any point during the
	   flow's duration.  This scheme allows one to determine, in
	   aggregate, the maximum possible volume that could have occurred
	   during this bin.  In this scheme, the "Records" column gives the
	   number of records that were active during the bin.

       minimum-volume
	   Acts as though the volume for the flow occurred in some other bin.
	   It is possible that a record that spans multiple bins did not
	   contribute any volume to the current bin.  This scheme allows one
	   to determine, in aggregate, the minimum possible volume that may
	   have occurred during this bin.  The "Records" column in this
	   scheme, as in the "maximum-volume" scheme, gives the number of flow
	   records that were active during the bin.

       Be aware that the "spike" load-schemes allocate the entire flow to a
       single bin. This can create the impression that there is more traffic
       occurring during a particular time window than the physical network
       supports.

       The "maximum-volume" and "minimum-volume" schemes are used to compute
       the maximum and minimum volumes that could have been transferred during
       any one bin.  "maximum-volume" intentionally over-counts the flow
       volume and "minimum-volume" intentionally under-counts.

       To see the effect of the various load-schemes, suppose rwcount is using
       60-second bins and the input contains two records.  The first record
       begins at 12:03:50, ends at 12:06:20, and contains 9,000 bytes (60
       bytes/second for 150 seconds).  This record may contribute to bins at
       12:03, 12:04, 12:05, and 12:06.  The second record begins at 12:04:05
       and lasts 15 seconds; this record always contributes its 200 bytes to
       the 12:04 bin.  The --load-scheme option splits the byte counts of the
       records as follows:

        BIN                  12:03:00    12:04:00    12:05:00    12:06:00

        time-proportional         600        3800        3600        1200
        bin-uniform              2250        2450        2250        2250
        start-spike              9000         200           0           0
        middle-spike                0         200        9000           0
        end-spike                   0         200           0        9000
        maximum-volume           9000        9200        9000        9000
        minimum-volume              0         200           0           0

       For the record that spans multiple bins: the "time-proportional" scheme
       assumes 60 bytes/second, the "bin-uniform" scheme divides the volume
       evenly by the four bins, the "middle-spike" scheme assumes all the
       volume occurs at 12:05:05, the "maximum-volume" scheme adds the volume
       to every bin, and the "minimum-volume" scheme ignores the record.
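
       The allocation rules can also be sketched in Python.  This is an
       illustrative model only, not rwcount's implementation: the names
       allocate and totals are hypothetical, times are plain seconds after
       12:00:00, bin b is assumed to cover the half-open interval from
       b*bin_size to (b+1)*bin_size, and only byte counts are modeled.  The
       first record is taken at the stated 60 bytes/second over the 150
       seconds between 12:03:50 and 12:06:20.

```python
def allocate(start, end, volume, bin_size, scheme):
    """Split one record's volume among bins; returns {bin_index: volume}."""
    first = int(start // bin_size)
    last = int(end // bin_size)
    if first == last:
        # A record inside a single bin adds its whole volume to that bin.
        return {first: volume}
    bins = range(first, last + 1)
    if scheme == "time-proportional":
        duration = end - start
        shares = {}
        for b in bins:
            # Seconds of the flow that fall inside bin b.
            overlap = min(end, (b + 1) * bin_size) - max(start, b * bin_size)
            shares[b] = volume * overlap / duration
        return shares
    if scheme == "bin-uniform":
        return {b: volume / len(bins) for b in bins}
    if scheme == "start-spike":
        return {first: volume}
    if scheme == "middle-spike":
        return {int(((start + end) / 2) // bin_size): volume}
    if scheme == "end-spike":
        return {last: volume}
    if scheme == "maximum-volume":
        return {b: volume for b in bins}
    if scheme == "minimum-volume":
        return {}  # a multi-bin record is ignored
    raise ValueError("unknown load scheme: %s" % scheme)


def totals(records, bin_size, scheme):
    """Sum the per-bin shares of every (start, end, volume) record."""
    out = {}
    for start, end, volume in records:
        shares = allocate(start, end, volume, bin_size, scheme)
        for b, share in shares.items():
            out[b] = out.get(b, 0) + share
    return out


# The two example records, in seconds after 12:00:00.
RECORDS = [
    (230, 380, 60 * 150),  # 12:03:50 to 12:06:20 at 60 bytes/second
    (245, 260, 200),       # 12:04:05 for 15 seconds
]
print(totals(RECORDS, 60, "time-proportional"))
```

       With 60-second bins, indexes 3 through 6 correspond to the 12:03
       through 12:06 bins, and the "time-proportional" call yields 600, 3800,
       3600, and 1200 bytes for them.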

OPTIONS
       Option names may be abbreviated if the abbreviation is unique or is an
       exact match for an option.  A parameter to an option may be specified
       as --arg=param or --arg param, though the first form is required for
       options that take optional parameters.

       --bin-size=SIZE
	   Denote the size of each time bin, in seconds; defaults to 30
           seconds.  rwcount supports millisecond size bins; SIZE may be a
           floating point value equal to or greater than 0.001.

       --load-scheme=LOADSCHEME
	   Specify how a flow record that spans multiple bins allocates its
	   bytes and packets among the bins.  The default scheme is
	   "time-proportional", which assumes the volume/second ratio of the
	   flow record is constant.  See the "Load Scheme" section for
	   additional information on the load-scheme choices.  The LOADSCHEME
	   may be one of the following names or numbers; names may be
	   abbreviated to the shortest prefix that is unique.

	   time-proportional,4
	       Allocate the volume in proportion to the amount of time the
	       flow spent in the bin.

	   bin-uniform,0
	       Allocate the volume evenly across the bins that contain any
	       part of the flow's duration.

	   start-spike,1
	       Allocate the entire volume to the bin containing the start time
	       of the flow.

	   middle-spike,3
	       Allocate the entire volume to the bin containing the time at
	       the midpoint of the flow.

	   end-spike,2
	       Allocate the entire volume to the bin containing the end time
	       of the flow.

	   maximum-volume,5
	       Allocate the entire volume to all of the bins containing any
	       part of the flow.

	   minimum-volume,6
	       Allocate the flow's volume to a bin only if the flow is
	       completely contained within the bin; otherwise ignore the flow.

       --start-time=START_TIME
	   Set the time of the first bin to START_TIME.	 When this switch is
	   not given, the first bin is one that holds the starting time of the
	   earliest record.  The START_TIME may be specified in a format of
	   "yyyy/mm/dd[:HH[:MM[:SS[.sss]]]]" (or "T" may be used in place of
	   ":" to separate the day and hour).  The time must be specified to
	   at least day precision, and unspecified hour, minute, second, and
           millisecond values are set to zero.  Whether the date strings
           represent times in UTC or the local timezone depends on how SiLK was
	   compiled, which can be determined from the "Timezone support"
	   setting in the output from rwcount --version.  Alternatively, the
	   time may be specified as seconds since the UNIX epoch, and an
	   unspecified milliseconds value is set to 0.

       --end-time=END_TIME
	   Set the time of the final bin to END_TIME.  When this switch is not
	   given, the final bin is one that holds the ending time of the
	   latest record.  The format of END_TIME is the same as that for
	   START_TIME.	Unspecified hour, minute, second, and millisecond
	   values are set to 23, 59, 59, and 999 respectively.	When END_TIME
	   is specified as seconds since the UNIX epoch, an unspecified
	   milliseconds value is set to 999.  When both --start-time and
	   --end-time are used, the END_TIME is adjusted so that the final bin
	   represents a complete interval.

       --skip-zeroes
	   Disable printing of bins with no traffic.  By default, all bins are
	   printed.

       --bin-slots
	   Use the internal bin index as the label for each bin in the output;
	   the default is to label each bin with the time in a human-readable
	   format.

       --epoch-slots
	   Use the UNIX epoch time (number of seconds since midnight UTC on
	   1970-01-01) as the label for each bin in the output; the default is
	   to label each bin with the time in a human-readable format.	This
	   switch is equivalent to --timestamp-format=epoch.  This switch is
	   deprecated as of SiLK 3.11.0, and it will be removed in the SiLK
	   4.0 release.

       --timestamp-format=FORMAT
	   Specify the format and/or timezone to use when printing timestamps.
	   When this switch is not specified, the SILK_TIMESTAMP_FORMAT
	   environment variable is checked for a default format and/or
	   timezone.  If it is empty or contains invalid values, timestamps
	   are printed in the default format, and the timezone is UTC unless
	   SiLK was compiled with local timezone support.  FORMAT is a comma-
	   separated list of a format and/or a timezone.  The format is one
	   of:

	   default
	       Print the timestamps as "YYYY/MM/DDThh:mm:ss".

	   iso Print the timestamps as "YYYY-MM-DD hh:mm:ss".

	   m/d/y
	       Print the timestamps as "MM/DD/YYYY hh:mm:ss".

	   epoch
	       Print the timestamps as the number of seconds since 00:00:00
	       UTC on 1970-01-01.

	   When a timezone is specified, it is used regardless of the default
	   timezone support compiled into SiLK.	 The timezone is one of:

	   utc Use Coordinated Universal Time to print timestamps.

	   local
	       Use the TZ environment variable or the local timezone.
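
           For readers who post-process rwcount output, the named formats
           correspond roughly to the strftime patterns in the Python sketch
           below.  The names PATTERNS and format_timestamp are hypothetical,
           the sketch always renders in UTC, and it expects a
           timezone-aware datetime; it does not reproduce rwcount's
           handling of the "local" timezone.

```python
from datetime import datetime, timezone

# strftime patterns for the format names listed above; "epoch" is
# handled separately because it is a count of seconds, not a pattern.
PATTERNS = {
    "default": "%Y/%m/%dT%H:%M:%S",
    "iso": "%Y-%m-%d %H:%M:%S",
    "m/d/y": "%m/%d/%Y %H:%M:%S",
}


def format_timestamp(when, name="default"):
    """Render a timezone-aware datetime in UTC using a named format."""
    when = when.astimezone(timezone.utc)
    if name == "epoch":
        return str(int(when.timestamp()))
    return when.strftime(PATTERNS[name])


stamp = datetime(2009, 2, 12, 0, 0, 0, tzinfo=timezone.utc)
for name in ("default", "iso", "m/d/y", "epoch"):
    print(name, format_timestamp(stamp, name))
```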

       --no-titles
	   Turn off column titles.  By default, titles are printed.

       --no-columns
	   Disable fixed-width columnar output.

       --column-separator=C
	   Use specified character between columns and after the final column.
	   When this switch is not specified, the default of '|' is used.

       --no-final-delimiter
	   Do not print the column separator after the final column.  Normally
	   a delimiter is printed.

       --delimited
       --delimited=C
	   Run as if --no-columns --no-final-delimiter --column-sep=C had been
	   specified.  That is, disable fixed-width columnar output; if
	   character C is provided, it is used as the delimiter between
	   columns instead of the default '|'.

       --print-filenames
	   Print to the standard error the names of input files as they are
	   opened.

       --copy-input=PATH
	   Copy all binary input to the specified file or named pipe.  PATH
	   can be "stdout" to print flows to the standard output as long as
	   the --output-path switch has been used to redirect rwcount's ASCII
	   output.

       --output-path=PATH
	   Determine where the output of rwcount (ASCII text) is written.  If
	   this option is not given, output is written to the standard output.

       --pager=PAGER_PROG
	   When output is to a terminal, invoke the program PAGER_PROG to view
	   the output one screen full at a time.  This switch overrides the
	   SILK_PAGER environment variable, which in turn overrides the PAGER
	   variable.  If the value of the pager is determined to be the empty
	   string, no paging will be performed and all output will be printed
	   to the terminal.

       --site-config-file=FILENAME
	   Read the SiLK site configuration from the named file FILENAME.
	   When this switch is not provided, rwcount searches for the site
	   configuration file in the locations specified in the "FILES"
	   section.

       --legacy-timestamps
       --legacy-timestamps=NUM
	   When NUM is not specified or is 1, this switch is equivalent to
	   --timestamp-format=m/d/y.  Otherwise, the switch has no effect.
	   This switch is deprecated as of SiLK 3.0.0, and it will be removed
	   in the SiLK 4.0 release.

       --xargs
       --xargs=FILENAME
	   Cause rwcount to read file names from FILENAME or from the standard
	   input if FILENAME is not provided.  The input should have one file
	   name per line.  rwcount will open each file in turn and read
	   records from it, as if the files had been listed on the command
	   line.

       --help
	   Print the available options and exit.

       --version
	   Print the version number and information about how SiLK was
	   configured, then exit the application.

       --start-epoch=START_TIME
           Alias for the --start-time switch.  This switch is deprecated as
           of SiLK 3.8.0.

       --end-epoch=END_TIME
           Alias for the --end-time switch.  This switch is deprecated as of
           SiLK 3.8.0.

EXAMPLES
       In the following examples, the dollar sign ("$") represents the shell
       prompt.	The text after the dollar sign represents the command line.
       Lines have been wrapped for improved readability, and the back slash
       ("\") is used to indicate a wrapped line.

       To count all web traffic on Feb 12, 2009, into 1 hour bins:

	$ rwfilter --pass=stdout --start-date=2009/02/12:00	   \
	       --end-date=2009/02/12:23 --proto=6 --aport=80	   \
	  | rwcount --bin-size=3600
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1490.49|   578270918.16|    463951.55|
	2009/02/12T01:00:00|	  1459.33|   596455716.52|    457487.80|
	2009/02/12T02:00:00|	  1529.06|   562602842.44|    451456.41|
	2009/02/12T03:00:00|	  1503.89|   562683116.38|    455554.81|
	2009/02/12T04:00:00|	  1561.89|   590554569.78|    489273.81|
	....
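
       Output in this layout is straightforward to consume from a script.
       The Python sketch below is one possible approach, not a SiLK facility:
       the name parse_rwcount is hypothetical, and the sketch assumes that
       titles are printed, that the separator is the default '|', and that
       every column after the first holds a number.

```python
def parse_rwcount(text, separator="|"):
    """Parse rwcount text output into a list of dicts keyed by title."""
    rows = []
    titles = None
    for line in text.splitlines():
        if not line.strip():
            continue
        cells = [cell.strip() for cell in line.split(separator)]
        if cells[-1] == "":
            cells.pop()  # drop the empty cell after the final delimiter
        if titles is None:
            titles = cells  # the first non-blank line holds the titles
            continue
        row = dict(zip(titles, cells))
        for key in titles[1:]:
            row[key] = float(row[key])
        rows.append(row)
    return rows


SAMPLE = """\
               Date|    Records|        Bytes|      Packets|
2009/02/12T00:00:00|    1490.49| 578270918.16|    463951.55|
2009/02/12T01:00:00|    1459.33| 596455716.52|    457487.80|
"""
rows = parse_rwcount(SAMPLE)
print(rows[0]["Date"], rows[0]["Bytes"])
```

       The same function handles output produced with --delimited or
       --no-final-delimiter, since the trailing empty cell is dropped only
       when it is present.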

       To bin the records according to their start times, use the
       --load-scheme switch:

	$ rwfilter ... --pass=stdout	   \
	  | rwcount --bin-size=3600 --load-scheme=1
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1494.00|   580350969.00|    464952.00|
	2009/02/12T01:00:00|	  1462.00|   596145212.00|    457871.00|
	2009/02/12T02:00:00|	  1526.00|   561629416.00|    451088.00|
	2009/02/12T03:00:00|	  1502.00|   563500618.00|    455262.00|
	2009/02/12T04:00:00|	  1562.00|   589265818.00|    489279.00|
	...

       To bin the records by their end times:

        $ rwfilter ... --pass=stdout       \
	  | rwcount --bin-size=3600 --load-scheme=2
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:00:00|	  1488.00|   577132372.00|    463393.00|
	2009/02/12T01:00:00|	  1458.00|   596956697.00|    457376.00|
	2009/02/12T02:00:00|	  1530.00|   562806395.00|    451551.00|
	2009/02/12T03:00:00|	  1506.00|   562101791.00|    455671.00|
	2009/02/12T04:00:00|	  1562.00|   591408602.00|    489371.00|
	...

       To force the hourly bins to run from 30 minutes past the hour, use the
       --start-time switch:

	$ rwfilter ... --pass=stdout	   \
          | rwcount --bin-size=3600 --start-time=2009/02/12:00:30
		       Date|	  Records|	    Bytes|	Packets|
	2009/02/12T00:30:00|	  1483.26|   581251364.04|    456554.40|
	2009/02/12T01:30:00|	  1494.00|   575037453.00|    449280.00|
	2009/02/12T02:30:00|	  1486.36|   559700466.61|    447700.15|
	2009/02/12T03:30:00|	  1555.23|   588882400.58|    480724.48|
	2009/02/12T04:30:00|	  1537.79|   564756248.52|    472003.45|
	...

ENVIRONMENT
       SILK_TIMESTAMP_FORMAT
	   This environment variable is used as the value for
	   --timestamp-format when that switch is not provided.	 Since SiLK
	   3.11.0.

       SILK_PAGER
	   When set to a non-empty string, rwcount automatically invokes this
	   program to display its output a screen at a time.  If set to an
	   empty string, rwcount does not automatically page its output.

       PAGER
	   When set and SILK_PAGER is not set, rwcount automatically invokes
	   this program to display its output a screen at a time.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing files.
	   Setting SILK_CLOBBER to a non-empty value removes this restriction.

       SILK_CONFIG_FILE
	   This environment variable is used as the value for the
	   --site-config-file when that switch is not provided.

       SILK_DATA_ROOTDIR
           This environment variable specifies the root directory of the data
           repository.  As described in the "FILES" section, rwcount may use
	   this environment variable when searching for the SiLK site
	   configuration file.

       SILK_PATH
	   This environment variable gives the root of the install tree.  When
	   searching for configuration files, rwcount may use this environment
	   variable.  See the "FILES" section for details.

       TZ  When the argument to the --timestamp-format switch includes "local"
	   or when a SiLK installation is built to use the local timezone, the
	   value of the TZ environment variable determines the timezone in
	   which rwcount displays timestamps.  (If both of those are false,
	   the TZ environment variable is ignored.)  If the TZ environment
	   variable is not set, the machine's default timezone is used.
	   Setting TZ to the empty string or 0 causes timestamps to be
	   displayed in UTC.  For system information on the TZ variable, see
	   tzset(3) or environ(7).  (To determine if SiLK was built with
	   support for the local timezone, check the "Timezone support" value
	   in the output of rwcount --version.)	 The TZ environment variable
	   is also used when rwcount parses the timestamp specified in the
	   --start-time or --end-time switches if SiLK is built with local
	   timezone support.

FILES
       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site configuration file which are
	   checked when the --site-config-file switch is not provided.
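
       The search can be pictured with the Python sketch below.  It is an
       illustration under assumptions, not SiLK's code: the names
       candidate_config_paths and find_config are hypothetical, the
       locations are assumed to be tried in the order listed above, and
       entries that depend on an unset or empty environment variable are
       assumed to be skipped.

```python
import os


def candidate_config_paths(env):
    """List the silk.conf locations from this section, in listing order."""
    paths = []
    if env.get("SILK_CONFIG_FILE"):
        paths.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        paths.append(env["SILK_DATA_ROOTDIR"] + "/silk.conf")
    paths.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        paths.append(env["SILK_PATH"] + "/share/silk/silk.conf")
        paths.append(env["SILK_PATH"] + "/share/silk.conf")
    paths.append("/usr/local/share/silk/silk.conf")
    paths.append("/usr/local/share/silk.conf")
    return paths


def find_config(env, exists=os.path.exists):
    """Return the first candidate for which exists() is true, or None."""
    for path in candidate_config_paths(env):
        if exists(path):
            return path
    return None


print(candidate_config_paths({"SILK_PATH": "/opt/silk"}))
```

       Passing os.environ as env inspects the current environment; the
       exists parameter is there only so the search can be exercised
       without touching the file system.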

SEE ALSO
       rwfilter(1), rwuniq(1), silk(7), tzset(3), environ(7)

BUGS
       Unlike rwuniq(1), rwcount does not support counting the number of
       distinct IPs in a bin.  However, using the --bin-time switch on rwuniq
       can provide time-based binning similar to what rwcount supports.  Note
       that rwuniq always bins by each record's start time (similar to
       rwcount --load-scheme=1), and there is no support in rwuniq for
       dividing a SiLK record among multiple time bins.

SiLK 3.11.0.1			  2016-02-19			    rwcount(1)