rwcombine(1)			SiLK Tool Suite			  rwcombine(1)

NAME
       rwcombine - Combine flows denoting a long-lived session into a single
       flow

SYNOPSIS
	 rwcombine [--actions=ACTIONS] [--ignore-fields=FIELDS]
	       [--max-idle-time=NUM]
	       [{--print-statistics | --print-statistics=FILENAME}]
	       [--temp-directory=DIR_PATH] [--buffer-size=SIZE]
	       [--note-add=TEXT] [--note-file-add=FILENAME]
	       [--compression-method=COMP_METHOD] [--print-filenames]
	       [--output-path=PATH] [--site-config-file=FILENAME]
	       {[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}

	 rwcombine --help

	 rwcombine --help-fields

	 rwcombine --version

DESCRIPTION
       rwcombine reads SiLK Flow records from one or more input sources,
       searches for flow records where the attributes field denotes records
       that were prematurely created or were continuations of prematurely
       created flows, and attempts to combine those records into a single
       record.	All the unmodified SiLK records and the combined records are
       written to the file specified by the --output-path switch or to the
       standard output when the --output-path switch is not provided and the
       standard output is not connected to a terminal.

       Some flow exporters, such as yaf(1), provide fields that describe
       characteristics about the flow record, and these characteristics are
       stored in the attributes field of SiLK Flow records.  The two flags
       that rwcombine considers are:

       "T" The flow generator prematurely created a record for a long-lived
	   session due to the connection's lifetime reaching the active
	   timeout of the flow generator.  (Also, when yaf is run with the
	   --silk switch, it prematurely creates a flow and marks it with "T"
	   if the byte count of the flow cannot be stored in a 32-bit value.)

       "C" The flow generator created this flow as a continuation of a
	   long-running connection, where the previous flow for this
	   connection met a timeout.  (yaf only sets this flag when it is
	   invoked with the --silk switch.)

       A very long-running session may be represented by multiple flow
       records, where the first record is marked with the "T" flag, the final
       record is marked with the "C" flag, and intermediate records are marked
       with both "C" (this record continues an earlier flow) and "T" (this
       record also met the active time-out).  rwcombine attempts to combine
       these multiple flow records into a single record.
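
       As an illustration only, the roles implied by the two flags can be
       written as a small Python function.  The function name, the role
       labels, and the use of a plain string for the attributes field are
       assumptions made for this sketch; they are not part of SiLK.

```python
def flow_role(attributes: str) -> str:
    """Classify a record by the "T" and "C" characters in its attributes."""
    timed_out = "T" in attributes
    continuation = "C" in attributes
    if timed_out and continuation:
        return "middle"    # continues an earlier flow and also timed out
    if timed_out:
        return "first"     # first record of a long-lived session
    if continuation:
        return "last"      # final record of a long-lived session
    return "complete"      # neither flag set; nothing to combine


# The attribute patterns that appear in the EXAMPLES section, plus an
# empty value for a record that needs no combining.
roles = [flow_role(a) for a in ("T", "TC", " C", "")]
```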

       The input to rwcombine does not need to be sorted.  As part of its
       processing, rwcombine may re-order the records before writing them.

       rwcombine reads SiLK Flow records from the files named on the command
       line or from the standard input when no file names are specified and
       --xargs is not present.	To read the standard input in addition to the
       named files, use "-" or "stdin" as a file name.	If an input file name
       ends in ".gz", the file will be uncompressed as it is read.  When the
       --xargs switch is provided, rwcombine will read the names of the files
       to process from the named text file, or from the standard input if no
       file name argument is provided to the switch.  The input to --xargs
       must contain one file name per line.

   Algorithm
       The algorithm rwcombine uses to combine records is

       1.  rwcombine reads SiLK flow records, examines the attributes field on
	   each record, and immediately writes to the destination stream all
	   records where neither the time-out flag ("T") nor the continuation
	   flag ("C") is set.  Records where one or both of those flags are
	   set are stored until all input records have been read.

       2.  rwcombine groups the stored records into bins where the following
	   fields for each record in each bin are identical: sIP, dIP, sPort,
	   dPort, protocol, sensor, in, out, nhIP, application, class, and
	   type.

       3.  For each bin, the records are sorted by time (sTime and elapsed).

       4.  Within a bin, rwcombine combines two records into a single record
	   when the attributes field of the first record has the "T"
	   (time-out) flag set and the second record has the "C"
	   (continuation) flag set.  When combining records, the bytes and
	   packets fields are summed, the initialFlags field of the first
	   record is used, the sessionFlags field becomes the bit-wise OR of
	   both sessionFlags fields and the second record's initialFlags
	   field, and the eTime is set to that of the second record.

       5.  If the second record's "T" flag was set, rwcombine checks to see if
	   the third record's "C" flag is set.	If it is, the third record
	   becomes part of the new record.

       6.  The previous step repeats for the records in the bin until the bin
	   contains a single record, the most recently added record did not
	   have the "T" flag set, or the next record in the bin does not have
	   the "C" flag set.

       7.  After examining a bin, rwcombine writes the record(s) the bin
	   contains to the destination stream.

       8.  Steps 3 through 7 are repeated for each bin.
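
       The steps above can be sketched in Python as follows.  This is a
       rough model under stated assumptions, not rwcombine's implementation:
       the Flow class, its field names, and the merge and combine functions
       are hypothetical, the key tuple stands in for the full set of Step 2
       fields, and the sketch does not model how rwcombine rewrites the
       attributes field of a combined record.

```python
from dataclasses import dataclass, replace
from itertools import groupby


@dataclass(frozen=True)
class Flow:
    key: tuple          # stand-in for the Step 2 fields (sIP, dIP, sPort, ...)
    stime: float        # start time in seconds
    etime: float        # end time in seconds
    nbytes: int
    packets: int
    init_flags: int     # initialFlags
    session_flags: int  # sessionFlags
    attrs: str          # attributes; may contain "T" and/or "C"


def merge(first: Flow, second: Flow) -> Flow:
    """Step 4: fold a continuation record into the record before it."""
    return replace(
        first,
        etime=second.etime,
        nbytes=first.nbytes + second.nbytes,
        packets=first.packets + second.packets,
        session_flags=(first.session_flags | second.session_flags
                       | second.init_flags),
    )


def combine(flows):
    # Step 1: records with neither flag pass straight through.
    out = [f for f in flows if "T" not in f.attrs and "C" not in f.attrs]
    held = [f for f in flows if "T" in f.attrs or "C" in f.attrs]
    # Steps 2 and 3: bin by key, then order by sTime and elapsed.
    held.sort(key=lambda f: (f.key, f.stime, f.etime - f.stime))
    for _, members in groupby(held, key=lambda f: f.key):
        current, awaiting_more = None, False
        for flow in members:
            # Steps 4 to 6: extend only while the last piece had "T"
            # and the next piece has "C".
            if current is not None and awaiting_more and "C" in flow.attrs:
                current = merge(current, flow)
            else:
                if current is not None:
                    out.append(current)
                current = flow
            awaiting_more = "T" in flow.attrs
        out.append(current)  # Step 7: emit what the bin still holds.
    return out


# Three pieces of one session (given out of order) plus one complete flow.
key = ("192.168.126.252", "10.11.156.107", 28975, 22, 6)
pieces = [
    Flow(key, 20.0, 25.0, 50, 3, 0x18, 0x19, "C"),
    Flow(key, 0.0, 10.0, 100, 1, 0x02, 0x18, "T"),
    Flow(key, 10.0, 20.0, 200, 2, 0x18, 0x18, "TC"),
]
single = Flow(("192.168.126.1", "10.0.0.9", 5000, 80, 6),
              5.0, 6.0, 7, 1, 0x02, 0x18, "")
result = combine(pieces + [single])
```

       In this sketch the complete flow is emitted first and the three
       pieces collapse into one record that spans the whole session.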

       The --ignore-fields switch allows the user to remove fields from the
       set that rwcombine uses when grouping records in Step 2.

       When combining two records into one (Step 4), rwcombine by default
       disregards the difference between the first record's end-time and the
       second record's start-time (the idle time).  To tell rwcombine not to
       combine records when that difference exceeds a limit, specify the
       limit, in seconds, as the argument to the --max-idle-time switch.
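
       A minimal Python sketch of the idle-time test, using two timestamps
       taken from the EXAMPLES section.  The function names are
       hypothetical, and treating an idle time exactly equal to the limit
       as acceptable is an assumption of this sketch.

```python
from datetime import datetime

# Timestamp layout as printed by rwcut in the EXAMPLES section.
FMT = "%Y/%m/%dT%H:%M:%S.%f"


def idle_seconds(first_etime: str, second_stime: str) -> float:
    """Gap between the end of one record and the start of the next."""
    gap = datetime.strptime(second_stime, FMT) - datetime.strptime(first_etime, FMT)
    return gap.total_seconds()


def may_combine(first_etime: str, second_stime: str, max_idle=None) -> bool:
    """Without a limit, every gap is accepted (assumed boundary: <=)."""
    if max_idle is None:
        return True
    return idle_seconds(first_etime, second_stime) <= max_idle


gap = idle_seconds("2009/02/13T00:59:39.667", "2009/02/13T00:59:39.670")
```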

       To see information on the number of flows combined and the minimum and
       maximum idle times, specify the --print-statistics switch.

       During its processing, rwcombine will try to allocate a large (near
       2GB) in-memory array to hold the records.  (You may use the
       --buffer-size switch to change this maximum buffer size.)  If more
       records are read than will fit into memory, the in-core records are
       temporarily stored on disk as described by the --temp-directory switch.
       When all records have been read, the on-disk files are merged to
       produce the output.

       By default, the temporary files are stored in the /tmp directory.
       Because the sizes of the temporary files may be large, it is strongly
       recommended that /tmp not be used as the temporary directory, and
       rwcombine will print a warning when /tmp is used.  To modify the
       temporary directory used by rwcombine, provide the --temp-directory
       switch, set the SILK_TMPDIR environment variable, or set the TMPDIR
       environment variable.
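
       The precedence among the switch and the two environment variables
       can be sketched in Python.  The function name is hypothetical, and
       treating an empty value as unset is an assumption of this sketch.

```python
import os


def resolve_temp_dir(switch_value=None, environ=None):
    """Return the first directory set among the --temp-directory value,
    SILK_TMPDIR, and TMPDIR, falling back to /tmp."""
    env = os.environ if environ is None else environ
    for candidate in (switch_value, env.get("SILK_TMPDIR"), env.get("TMPDIR")):
        if candidate:  # assumption: an empty string counts as unset
            return candidate
    return "/tmp"


both_set = {"SILK_TMPDIR": "/var/silk/tmp", "TMPDIR": "/var/tmp"}
chosen = resolve_temp_dir(environ=both_set)
```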

OPTIONS
       Option names may be abbreviated if the abbreviation is unique or is an
       exact match for an option.  A parameter to an option may be specified
       as --arg=param or --arg param, though the first form is required for
       options that take optional parameters.

       --actions=ACTIONS
	   Select the type of action(s) that rwcombine should take to combine
	   the input records.  The default action is "all", and the following
	   actions are supported:

	   all Perform all the actions described below.

	   timeout
	       Combine into a single flow record those records where the
	       timeout flags in the attributes field indicate that the flow
	       exporter has divided a long-lived session into multiple flow
	       records.

	   This switch is provided for future expansion of rwcombine, since at
	   present rwcombine supports a single action.  When writing a script
	   that uses rwcombine, specify --actions=timeout for compatibility
	   with future versions of rwcombine.

       --ignore-fields=FIELDS
	   Ignore the fields listed in FIELDS when determining if two flow
	   records should be grouped into the same bin; that is, treat FIELDS
	   as being identical across all flows.	 By default, rwcombine puts
	   records into a bin when the records have identical values for the
	   following fields: sIP, dIP, sPort, dPort, protocol, sensor, in,
	   out, nhIP, application, class, and type.

	   FIELDS is a comma separated list of field-names, field-integers,
	   and ranges of field-integers; a range is specified by separating
	   the start and end of the range with a hyphen (-).  Field-names are
	   case-insensitive.  Example:

	    --ignore-fields=sensor,12-15

	   The supported fields are:

	   sIP,1
	       source IP address

	   dIP,2
	       destination IP address

	   sPort,3
	       source port for TCP and UDP, or equivalent

	   dPort,4
	       destination port for TCP and UDP, or equivalent

	   protocol,5
	       IP protocol

	   sensor,12
	       name or ID of sensor at the collection point

	   in,13
	       router SNMP input interface or vlanId if packing tools were
	       configured to capture it (see sensor.conf(5))

	   out,14
	       router SNMP output interface or postVlanId

	   nhIP,15
	       router next hop IP

	   class,20,type,21
	       class and type of sensor at the collection point (represented
	       internally by a single value)

	   application,29
	       guess as to the content of the flow.  Some software that
	       generates flow records from packet data, such as yaf(1), will
	       inspect the contents of the packets that make up a flow and use
	       traffic signatures to label the content of the flow.  SiLK
	       calls this label the application; yaf refers to it as the
	       appLabel.  The application is the port number that is
	       traditionally used for that type of traffic (see the
	       /etc/services file on most UNIX systems).  For example, traffic
	       that the flow generator recognizes as FTP will have a value of
	       21, even if that traffic is being routed through the standard
	       HTTP/web port (80).
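
           As a rough illustration, a FIELDS argument can be expanded into
           the field integers listed above with a few lines of Python.  The
           table and function names are hypothetical, and rejecting a range
           that covers an unlisted integer is an assumption of this sketch;
           rwcombine may handle that case differently.

```python
# Field names and integers as listed in this section (names lowercased).
FIELD_IDS = {
    "sip": 1, "dip": 2, "sport": 3, "dport": 4, "protocol": 5,
    "sensor": 12, "in": 13, "out": 14, "nhip": 15,
    "class": 20, "type": 21, "application": 29,
}


def parse_fields(spec: str) -> set:
    """Expand a comma-separated list of names, integers, and ranges."""
    valid = set(FIELD_IDS.values())
    chosen = set()
    for token in spec.split(","):
        token = token.strip()
        if token.lower() in FIELD_IDS:       # names are case-insensitive
            chosen.add(FIELD_IDS[token.lower()])
            continue
        start, _, end = token.partition("-")  # "12-15" or a single "12"
        low = int(start)
        high = int(end) if end else low
        ids = set(range(low, high + 1))
        if not ids or not ids <= valid:       # assumption: reject unknown ids
            raise ValueError("unsupported field: " + token)
        chosen |= ids
    return chosen


ignored = parse_fields("sensor,12-15")
```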

       --max-idle-time=NUM
	   Do not combine flow records when the second flow record starts
	   more than NUM seconds after the end time of the first flow record.
	   NUM may be fractional.  If not specified, the maximum idle time is
	   effectively infinite.

       --print-statistics
       --print-statistics=FILENAME
	   Print to the standard error or to the specified FILENAME the number
	   of flow records read and written, the number of flows that did not
	   require combining, the number of flows combined, the number that
	   could not be combined, and the minimum and maximum idle time
	   between combined flow records.

       --temp-directory=DIR_PATH
	   Specify the name of the directory in which to store data files
	   temporarily when more records have been read than will fit into
	   RAM.  This switch overrides the directory specified in the
	   SILK_TMPDIR environment variable, which overrides the directory
	   specified in the TMPDIR variable, which overrides the default,
	   /tmp.

       --buffer-size=SIZE
	   Set the maximum size of the buffer to use for holding the records,
	   in bytes.  A larger buffer means fewer temporary files need to be
	   created, reducing the I/O wait times.  The default maximum for this
	   buffer is near 2GB.	The SIZE may be given as an ordinary integer,
	   or as a real number followed by a suffix "K", "M" or "G", which
	   represents the numerical value multiplied by 1,024 (kilo),
	   1,048,576 (mega), and 1,073,741,824 (giga), respectively.  For
	   example, 1.5K represents 1,536 bytes, or one and one-half
	   kilobytes.  (This value does not represent the absolute maximum
	   amount of RAM that rwcombine will allocate, since additional
	   buffers will be allocated for reading the input and writing the
	   output.)
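
           The suffix arithmetic can be checked with a short Python sketch.
           The function name is hypothetical, and both the truncation of a
           fractional byte count and the acceptance of upper-case suffixes
           only are assumptions of this sketch.

```python
# Multipliers as given in the description of SIZE.
MULTIPLIERS = {"K": 1024, "M": 1048576, "G": 1073741824}


def parse_size(text: str) -> int:
    """Convert a SIZE argument such as "1.5K" or "4096" to bytes."""
    suffix = text[-1:]
    if suffix in MULTIPLIERS:
        # A real number followed by a suffix; truncate to whole bytes.
        return int(float(text[:-1]) * MULTIPLIERS[suffix])
    return int(text)  # an ordinary integer


example = parse_size("1.5K")
```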

       --output-path=PATH
	   Write the SiLK Flow records to the specified file or named pipe.
	   When the standard output is not a terminal and this switch is not
	   provided or its argument is "-" or "stdout", the records are
	   written to the standard output.

       --note-add=TEXT
	   Add the specified TEXT to the header of the output file as an
	   annotation.	This switch may be repeated to add multiple
	   annotations to a file.  To view the annotations, use the
	   rwfileinfo(1) tool.

       --note-file-add=FILENAME
	   Open FILENAME and add the contents of that file to the header of
	   the output file as an annotation.	This switch may be repeated to
	   add multiple annotations.  Currently the application makes no
	   effort to ensure that FILENAME contains text; be careful that you
	   do not attempt to add a SiLK data file as an annotation.

       --compression-method=COMP_METHOD
	   Specify how to compress the output.	When this switch is not given,
	   output to the standard output or to named pipes is not compressed,
	   and output to files is compressed using the default chosen when
	   SiLK was compiled.  The valid values for COMP_METHOD are determined
	   by which external libraries were found when SiLK was compiled.  To
	   see the available compression methods and the default method, use
	   the --help or --version switch.  SiLK can support the following
	   COMP_METHOD values when the required libraries are available.

	   none
	       Do not compress the output using an external library.

	   zlib
	       Use the zlib(3) library for compressing the output, and always
	       compress the output regardless of the destination.  Using zlib
	       produces the smallest output files at the cost of speed.

	   lzo1x
	       Use the lzo1x algorithm from the LZO real time compression
	       library for compression, and always compress the output
	       regardless of the destination.  This method provides good
	       compression with less memory and CPU overhead.

	   best
	       Use lzo1x if available, otherwise use zlib.  Only compress the
	       output when writing to a file.

       --print-filenames
	   Print to the standard error the names of input files as they are
	   opened.

       --site-config-file=FILENAME
	   Read the SiLK site configuration from the named file FILENAME.
	   When this switch is not provided, rwcombine searches for the site
	   configuration file in the locations specified in the "FILES"
	   section.

       --xargs
       --xargs=FILENAME
	   Cause rwcombine to read file names from FILENAME or from the
	   standard input if FILENAME is not provided.	The input should have
	   one file name per line.  rwcombine will open each file in turn and
	   read records from it, as if the files had been listed on the
	   command line.

       --help
	   Print the available options and exit.

       --help-fields
	   Print the description and alias(es) of each field and exit.

       --version
	   Print the version number and information about how SiLK was
	   configured, then exit the application.

EXAMPLES
       In the following examples, the dollar sign ("$") represents the shell
       prompt.  The text after the dollar sign represents the command line.
       Lines have been wrapped for improved readability, and the backslash
       ("\") is used to indicate a wrapped line.

       The output from rwcut(1) shows that the flow exporter split this
       long-lived ssh session into multiple flow records:

	$ rwfilter --saddr=192.168.126.252 --dport=22 --pass=- data.rw \
	  | rwcut --fields=flags,attributes,stime,etime
	   flags|attribut|		    sTime|		    eTime|
	 S PA	|T	 |2009/02/13T00:29:59.563|2009/02/13T00:59:39.668|
	   PA	|TC	 |2009/02/13T00:59:39.668|2009/02/13T01:29:19.478|
	   PA	|TC	 |2009/02/13T01:29:19.478|2009/02/13T01:58:48.890|
	   PA	|TC	 |2009/02/13T01:58:48.891|2009/02/13T02:28:43.599|
	F  PA	| C	 |2009/02/13T02:28:43.600|2009/02/13T02:32:58.272|

       Here is the other half of that conversation:

	$ rwfilter --daddr=192.168.126.252 --sport=22 --pass=- data.rw \
	  | rwcut --fields=flags,attributes,stime,etime
	   flags|attribut|		    sTime|		    eTime|
	 S PA	|T	 |2009/02/13T00:30:00.060|2009/02/13T00:59:39.667|
	   PA	|TC	 |2009/02/13T00:59:39.670|2009/02/13T01:29:19.478|
	   PA	|TC	 |2009/02/13T01:29:19.481|2009/02/13T01:58:48.890|
	   PA	|TC	 |2009/02/13T01:58:48.893|2009/02/13T02:28:43.599|
	F  PA	| C	 |2009/02/13T02:28:43.600|2009/02/13T02:32:58.271|

       Use rwuniq(1) to compute the byte and packet counts for that ssh
       session:

	$ rwfilter --any-addr=192.168.126.252 --aport=22 --pass=- data.rw \
	  | rwuniq --fields=sip,dip,sport,dport --values=records,byte,packets
		    sIP|	    dIP|sPort|dPort|Records|  Bytes|Packets|
	  10.11.156.107|192.168.126.252|   22|28975|	  5|4677240|   3881|
	192.168.126.252|  10.11.156.107|28975|	 22|	  5| 281939|   3891|

       Invoke rwcombine on these records and store the result in the file
       combined.rw:

	$ rwfilter --any-addr=192.168.126.252 --aport=22 --pass=- data.rw \
	  | rwcombine --print-statistics --output-path=combined.rw
	FLOW RECORD COUNTS:
	Read:					 10
	Initially Complete:	      -		  0 *
	Sorted & Examined:	      =		 10
	Missing end:		      -		  0 *
	Missing start & end:	      -		  0 *
	Missing start:		      -		  0 *
	Prior to combining:	      =		 10
	Eliminated:		      -		  8
	Made complete:		      =		  2 *
	Written:				  2 (sum of *)

	IDLE TIMES:
	Minimum:	0:00:00:00.000
	Penultimate:	0:00:00:00.000
	Maximum:	0:00:00:00.003

       View the resulting records:

	$ rwcut --fields=sip,dip,sport,dport,bytes,packets,flags combined.rw
		    sIP|	    dIP|sPort|dPort|  bytes|packets|   flags|
	  10.11.156.107|192.168.126.252|   22|28975|4677240|   3881|FS PA   |
	192.168.126.252|  10.11.156.107|28975|	 22| 281939|   3891|FS PA   |

	$ rwcut --fields=sip,attributes,stime,etime combined.rw
		    sIP|attribut|		   sTime|		   eTime|
	  10.11.156.107|	|2009/02/13T00:30:00.060|2009/02/13T02:32:58.271|
	192.168.126.252|	|2009/02/13T00:29:59.563|2009/02/13T02:32:58.272|

ENVIRONMENT
       SILK_TMPDIR
	   When set and --temp-directory is not specified, rwcombine writes
	   the temporary files it creates to this directory.  SILK_TMPDIR
	   overrides the value of TMPDIR.

       TMPDIR
	   When set and SILK_TMPDIR is not set, rwcombine writes the temporary
	   files it creates to this directory.

       SILK_CLOBBER
	   The SiLK tools normally refuse to overwrite existing files.
	   Setting SILK_CLOBBER to a non-empty value removes this restriction.

       SILK_CONFIG_FILE
	   This environment variable is used as the value for the
	   --site-config-file switch when that switch is not provided.

       SILK_DATA_ROOTDIR
	   This environment variable specifies the root directory of the data
	   repository.  As described in the "FILES" section, rwcombine may use
	   this environment variable when searching for the SiLK site
	   configuration file.

       SILK_PATH
	   This environment variable gives the root of the install tree.  When
	   searching for configuration files, rwcombine may use this
	   environment variable.  See the "FILES" section for details.

       SILK_TEMPFILE_DEBUG
	   When set to 1, rwcombine prints debugging messages to the standard
	   error as it creates, re-opens, and removes temporary files.

FILES
       ${SILK_CONFIG_FILE}
       ${SILK_DATA_ROOTDIR}/silk.conf
       /data/silk.conf
       ${SILK_PATH}/share/silk/silk.conf
       ${SILK_PATH}/share/silk.conf
       /usr/local/share/silk/silk.conf
       /usr/local/share/silk.conf
	   Possible locations for the SiLK site configuration file which are
	   checked when the --site-config-file switch is not provided.
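
           A Python sketch of how the candidate list above might be built.
           The function name is hypothetical.  Checking the locations in
           the order listed, and skipping entries whose environment
           variable is unset, are assumptions of this sketch rather than
           documented behavior.

```python
import os


def config_candidates(switch_value=None, environ=None):
    """List possible silk.conf locations, in the order shown in FILES."""
    env = os.environ if environ is None else environ
    if switch_value:  # --site-config-file was given; nothing else is searched
        return [switch_value]
    paths = []
    if env.get("SILK_CONFIG_FILE"):
        paths.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        paths.append(env["SILK_DATA_ROOTDIR"] + "/silk.conf")
    paths.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        paths.append(env["SILK_PATH"] + "/share/silk/silk.conf")
        paths.append(env["SILK_PATH"] + "/share/silk.conf")
    paths.append("/usr/local/share/silk/silk.conf")
    paths.append("/usr/local/share/silk.conf")
    return paths


candidates = config_candidates(environ={"SILK_PATH": "/opt/silk"})
```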

       ${SILK_TMPDIR}/
       ${TMPDIR}/
       /tmp/
	   Directory in which to create temporary files.

SEE ALSO
       rwfilter(1), rwcut(1), rwuniq(1), rwfileinfo(1), sensor.conf(5),
       silk(7), yaf(1), zlib(3)

NOTES
       The first release of rwcombine occurred in SiLK 3.9.0.

SiLK 3.11.0.1			  2016-02-19			  rwcombine(1)