mls man page on DragonFly

Man page or keyword search:  
man Server   44335 pages
apropos Keyword Search (all sections)
Output format
DragonFly logo
[printable version]

mls(1)				 User Manuals				mls(1)

NAME
       mls - Display useful statistics on email messages

SYNOPSIS
       mls  [-hvq] [-l lang] [-i file] [-o file] [-r|w|u file] [-t|T text] [-m
       mode] [-n XX] [-g xxxx]

DESCRIPTION
       MailListStat is program that prints some "useful" statistical  info  on
       email  messages.	  It's	main  usage  is in email conferences - mailing
       lists. Currently it displays both tables and graphs.   You  can	select
       either TEXT or HTML output.

OPTIONS
       -h     print help text and exit

       -q     be quiet (print only errors to stderr)

       -v     turn  on	verbose mode - in this mode it will print more info to
	      stderr - indication of progress (will print  every  10th,	 20th,
	      ...,  90th, 100th, 200th, ..., 900th, 1000th, 2000th ... message
	      being processed) and warnings about malformed headers found

       -l lang
	      select output language; please note that this  applies  only  to
	      generated	 statistics  -	program messages printed to stderr ale
	      always in English. These languages are currently	supported:  EN
	      (English),   SK	(Slovak),  IT  (Italian),  FR  (Francais),  DE
	      (Deutsch), ES (Spanish), SR (Serbian), BR (Portugues Brasil).

       -i file
	      name of input file (if not specified,  use  stdin).   This  file
	      should be in MBOX format. It should exist and be readable.

       -o file
	      name  of output file (if not specified, use stdout).  If exists,
	      it will be overwritten.

       -r file
	      read input from cache file instead  of  mailbox.	You  can  read
	      input either from mailbox or cache file, not both!

       -w file
	      write  cache  file  (no  stats produced). You can either produce
	      text output or write cache file, not both!  When	writing	 cache
	      file, output-related options are ignored.

       -u file
	      update cache file = read cache, read input, write cache. For use
	      with .procmailrc/.forward

       -t text
	      name of mailing list this statistics is computed for. If	speci‐
	      fied, it is just appended to the title of statistics, so it will
	      be like "Statistics from 16.8.2001 to 7.9.2001 for text",	 where
	      text  is whatever you put as this parameter (it could be name of
	      the mailing list or just its email, e.g.	mobil@mobil.sk).

       -T text
	      title text (only this will be printed as	title);	 this  can  be
	      used  to	supress	 normal title text (date of oldest/newest msg)
	      and completely replace it with your text.

       -m mode
	      select mode of output (text, html, html2).

       -n XX  show TOP XX tables (default TOP 10). By  default,	 mls  displays
	      tables  of  TOP  10 people, subjects, quoting or whatever. Using
	      this parameter, you can define how many lines shall these tables
	      have.

       -g xxxx
	      graphs  to  show (Day, Week, Month, Year, Xnone) - specify first
	      letter (e.g. -g dmy).

EXIT STATUS
       0      Everything went OK and no error occurred.

       1      Error in sscanf() while reading & parsing cache file.  It	 means
	      that  the	 format	 of  cache  file is invalid. Try to create the
	      cache file again.

       2      Invalid command-line  option/language.  You  have	 specified  an
	      invalid command-line parameter.

       3      Cannot  open input/output file. Please check that you have typed
	      correct filename and that you have read  permissions  for	 input
	      file  and	 write	permissions  to destination directory (because
	      output file must be created).  If output file exists, it's over‐
	      written.

       4      Not  enough  memory is available for dynamically allocated vari‐
	      ables. This could be caused by user-limits, because mls requires
	      only  few	 MBs  of memory (it depends on number of messages pro‐
	      cessed and number of different subjects and authors).

       5      Error compiling regex. This error should	not  occur  in	world-
	      available versions.

USAGE
   Input
       On  input, there should be mailbox file in standard MBOX format. If the
       file is in different  format,  the  results  are	 unpredictable.	 There
       should  be  at  least one email message, otherwise no stats can be com‐
       puted.

       Warning: Be sure that no special messages are in input files  (such  as
       that with "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA" subject),
       because they will be also analysed. Many programs  (POP3/IMAP  daemons,
       email  readers) put their special messages to the mailbox. This message
       is only ignored when reporting oldest message found.

   Output
       Statistics is put into output file (or stdout if unspecified) in speci‐
       fied language. All diagnostic messages are written to stderr and are in
       English.	 Output consists of several statistical data - tables,	graphs
       and summaries.  The title has two formats depending on -t parameter. If
       it's not	 specified,  it	 looks	like  "Statistics  from	 16.8.2001  to
       7.9.2001",  where  first	 date  is  date of the oldest message found in
       input and second is date of the newest one. If there is for example  -t
       mobil@mobil.sk  parameter, it will look like "Statistics from 16.8.2001
       to 7.9.2001 for mobil@mobil.sk". The problem is that date of  oldest  &
       newest  msg  is	often wrong (thanks to bad date/time settings on PC of
       msg author), so you can specify entire title using command-line	option
       -T.   When used, only your text will be printed as title, nothing more.
       There  you  can	put  for  example  something  like   "Statistics   for
       mobil@mobil.sk".

       Now  you	 have  option ( -g) to specify which graphs you want to show -
       hours of Day, days of Week, days of Month, months of Year. Use 1st let‐
       ters  as	 argument  to -g option (so -g dw will print just hours of Day
       and days of Week). Use -g x to disable printing of any graph. For exam‐
       ple you don't want to show graph for months of Year if you are present‐
       ing stats for one month, but for full-year stats you probably want it.

   HTML output
       You can choose between 2 modes of output - TEXT and HTML. When in  HTML
       mode,  mls will produce the output as HTML page. When you specify HTML2
       mode, only the body of HTML document is produced (no  header/footer)  -
       it can be used to have different HTML header/footer when calling mls as
       CGI or when using PHP wrapper. The output consists of HTML  tables  and
       bar  graphs.  Almost  every aspect of how it looks can be configured by
       modifying CSS style-sheet. Please note  that  files  style_mls.css  and
       bar.gif	must  be  present in the same directory as produced HTML file.
       You can, however, modify both  to  best	suit  your  needs.  Everything
       should  be  clear after reading comments in CSS file and looking at the
       produced HTML source.

       I was unsure what type of graphs to produce. I have tried also horizon‐
       tal bar graphs and if you want to try them, just uncomment part of code
       in PrintGraphHtml() in mls_text.c.

   Cache file support
       Instead of producing statistics in text format, you can	save  all  the
       generated values/results into "cache" file. Retrieving information from
       this file is very fast, so it is useful for integration with web pages.
       Now  you	 can  update  the cache file just after new mail was received.
       Users can view actual stats using mls
	as CGI script. It has an advantage over static	stats  that  user  can
       choose  language	 and  others  options  and  it	will be generated in a
       moment!

       To update cache file, use the -u option. It works like this: first, the
       stats  are  loaded from cache file (doesn't have to exist) and then new
       message(s) to be added are read from stdin (or from -i file) and	 added
       to  the stats.  Finally the updated stats are written back to the cache
       file. The process is really quick, because usually only one message  is
       added  at  a  time. This is useful mainly for updating cache files upon
       receiving new message. In the "examples/" subdir, you can find examples
       of integration with your .forward and .procmailrc files. By running MLS
       more than once, you can generate cache files for individual months  and
       also  for  whole	 years	(see  examples).  Then	use some PHP script to
       present list of these cache files to user.

       Format of cache files was changed in version 1.3, because of new	 stats
       added.	Now  it	 contains version info, so mls can inform you that you
       have to re-create that cache file with new version. Unfortunately,  you
       have  to re-create them also when you want new email clients to be rec‐
       ognized also in old  (already  processed)  messages.  Note  that	 email
       clients detection was buggy in 1.2.2 (a lot of clients not recognized).

   PHP wrapper
       I  have written also PHP wrapper for mls to make it more "interactive".
       It has two major advantages over plain HTML output from mls:  User  can
       choose  output  language	 and  number of TOP items to show. It works by
       running mls with appopriate command-line options.  It's	safe,  because
       only two items from user are language and topXX which are checked using
       regexp, so running arbitrary code is not possible. You can  also	 alter
       mls output - for example change @ in email addresses to (at) to prevent
       spamming.

       You can have normal MBOX file as input, but  I  recommend  using	 cache
       file.   When  using cache file, the stats are produced in a moment. You
       can see how long it took to generate the page, see  the	last  line  of
       HTML  source.  However,	there  is minor speed problem. It takes longer
       when you specify to show many topXX (like 999). The problem  is	regexp
       that  searches  for  @.	It  has	 to  search for it in whole mls output
       together and when it is large, it takes a  while	 (1.1  seconds	on  my
       2.1GHz pentium4). I have added an option which should use Perl-compati‐
       ble regex function (preg_replace) instead of POSIX  (ereg_replace),  if
       available.  This	 will result in MUCH faster execution (50ms instead of
       1.1sec).

NOTES
   How it is all computed?
       OK, so let's start from beginning - the format of MBOX file. It's plain
       text file containing some email messages delimited with one empty line.
       Each message starts with line like  this	 From  abc@a.sk	  Thu  Aug  16
       15:48:58	 2001.	After this line, there are few headers, one empty line
       and message text.  Storing emails in this format is quite common - your
       incoming	 mail is usually saved in MBOX format and also your folders in
       mail-readers like elm(1), pine(1), mutt(1)...

       Who is author of an email message? It's taken from From:	 header	 field
       and everything except the actual email address (like your full name) is
       stripped off using quite simple regular expression (regexp).

       Subject is taken from Subject: header field. If it contains  some  Re:,
       those  will be stripped off. There can be up to 5 of them. Also counted
       format ( Re[3]:) is supported. For example The Bat! email  client  uses
       it. MIME-decoding is applied to subject lines (see below).

       Date  is	 just everything in the Date: header. This header is generated
       by the email client, so it's date of message creation  and  it  doesn't
       have to be present in each message. If it isn't, you are warned by mes‐
       sage like "Warning: 1 message(s) not counted." in output. Some  clients
       don't  put  full	 date there and usually the day of week is missing and
       you are warned.	No timezones are considered, the date is taken as-is.

       Message size is everything between end of message header and  beginning
       of  new	email  (or  end	 of file). So only actual size of message text
       (body) is counted, not headers.

       Email clients are taken from X-Mailer: or User-Agent: or	 X-Newsreader:
       headers	and  some  grouping is done to avoid different versions of the
       same mailer to take the whole TOP 10. There  is	also  work-around  for
       Pine mailer (MLS will search also Message-ID: header).

   What is quoting? Why I have it 95%?
       What is quoting? When you reply to some message, you can insert part of
       the original message there, you quote the author of  original  message.
       Every  line  of original text is usually prepended with > or MP>, where
       MP are initials of the original sender's name  (for  example  The  Bat!
       uses this second format).

       And  what  is  "quote ratio"? It's size of quoted text divided by total
       size of message, specified in percent. It's included in stats,  because
       many  people  reply  to message, add one line of text and leaving there
       for example 10 pages of original text, which makes the quote ratio even
       higher  than  90%!  In  times of FIDONET, there were conferences, where
       quote ratio higher than 50% was forbidden. Try to think about  it  when
       replying	 to  message  in  mailing list where more than 300 people will
       download and read it.

   And now all the stats
       At first, there are TOP 10 tables (or TOP XX when using -n  XX  parame‐
       ter). First table shows people who have written most messages, how much
       and how many percent of total message count it is. Last row  shows  the
       "other"	-  number of messages written by everyone not listed above and
       how many percent it is. Second and third tables are similar to this one
       -  they also show best authors, but not by the number of messages writ‐
       ten.  Authors are sorted by total (or average) size of all  their  mes‐
       sages,  but  without quoting (size of message minus how much was quoted
       in that msg).  Next table shows most successful subjects and  how  many
       messages with this subject have been posted. The other table shows most
       used email clients.  The last table  show  people  with	maximal	 quote
       ratio.  It's  computed  as  sum	of quoted text in all his/her messages
       divided by total size of those messages.	 Last row shows an  average  -
       sum  of	quoted	text in all messages divided by total size of all mes‐
       sages.

       Next part of stats are some graphs. They show how  much	messages  have
       been  written  during different hours of day, days of month and days of
       week. From these you can see for example when  (and  how	 much)	people
       sleep :) or if they work during the working-hours or just write tons of
       messages...

       Next part contains info about messages which are BEST  in  something  -
       message	with  max. quote ratio, longest message and some details about
       most successful subject.

       At the end, there is final summary - total number  of  messages,	 their
       total and average size and number of different authors and subjects.

   MIME (Multipurpose Internet Mail Extensions)
       What  is	 it?  Original	implementation email permitted only 7bit ASCII
       messages.  But during the time, there was need to send international or
       even binary files. MIME defines how can these be encoded into 7bit form
       suitable for emailing and how to decode it back to human readable form.

       In email message, you can have MIME-encoded text (body of message), but
       also  some  headers - for example subject and From field.  MLS tries to
       find out if subject lines are MIME-encoded  and	if  so,	 it  tries  to
       decode  it,  to	present it to you in human-readable form. You can read
       more about MIME in RFC 1521 and 1522.

   Inspiration
       I was inspired by similar DOS program used before few years in  FIDONET
       and Slovak ULTRANET. It was created by Ivan Friedlander.

BUGS/TODO
       ·      doesn't  support	header	fields splitted to more lines (you can
	      use formail(1) to put them to one line before using MLS)

       ·      charset conversion in MIME-decoding

       ·      more stats

VERSION
       This man page is written for mls version 1.3.

AUTHOR
       mls   (MailListStat)   is   written   by	   Marek    -Marki-    Podmaka
       <marki@nexin.sk>.

SEE ALSO
       Visit				     http://freshmeat.net/projects/mls
       ⟨http://freshmeat.net/projects/mls⟩ for	more  information  and	latest
       version of mls.

COPYING
       MailListStat  - print useful statistics on email messages Copyright (C)
       2001-2003  Marek Podmaka <marki@nexin.sk>

       This program is free software; you can redistribute it and/or modify it
       under  the  terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at  your
       option) any later version.

       This  program  is  distributed  in the hope that it will be useful, but
       WITHOUT ANY  WARRANTY;  without	even  the  implied  warranty  of  MER‐
       CHANTABILITY  or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
       Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, write to the Free Software Foundation, Inc.,
       59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

Utils				   June 2003				mls(1)
[top]

List of man pages available for DragonFly

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net