ssearch man page on DragonFly

SSEARCH(1)							    SSEARCH(1)

NAME
       ssearch - scan a protein or DNA sequence library for similar sequences

SYNOPSIS
       ssearch	[-a -b # -d # -E # -f # -g # -h -i -l FASTLIBS	-L -r STATFILE
       -m # -O filename -Q -s SMATRIX -w # -z ]	 query-sequence-file  library-
       file

       ssearch [-QabdEfghilmOrswz] query-file @library-name-file

       ssearch [-QabdEfghilmOrswz] query-file "%PRMVI"

       ssearch [-aEfghilmrsw] - interactive mode

DESCRIPTION
       ssearch	compares  a protein or DNA sequence to all of the entries in a
       sequence library using the rigorous Smith-Waterman algorithm (Smith and
       Waterman, J. Mol. Biol. (1981) 147:195-197).  For example, ssearch can
       compare a protein sequence to all of the sequences in the NBRF PIR pro‐
       tein  sequence database.	 ssearch will automatically decide whether the
       query sequence is DNA or protein by reading the query sequence as  pro‐
       tein  and determining whether the `amino-acid composition' is more than
       85% A+C+G+T.  The program can be invoked either with command line
       arguments or in interactive mode.  ssearch compares a query sequence
       to a sequence library, which consists of sequence data interspersed
       with comments (see below).  The fasta programs, including ssearch,
       use a standard text format sequence file.  Lines beginning with '>'
       or ';' are treated as comments; sequences may be in upper or lower
       case, and blanks, tabs and unrecognizable characters are ignored.
       ssearch expects sequences to use the single letter amino acid codes;
       see protcodes(5).  Library files for ssearch should have the form
       shown below.
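
       The DNA/protein decision described above can be sketched as follows.
       This is only an illustration of the 85% A+C+G+T rule; the function
       name and the choice to count letters only are assumptions, not
       ssearch's actual code.

```python
def looks_like_dna(sequence: str, threshold: float = 0.85) -> bool:
    """Guess whether a sequence is DNA from its A+C+G+T fraction.

    Hypothetical helper: ssearch's real implementation may count
    residues differently (for example, how it treats ambiguity codes).
    """
    # Keep letters only, so blanks, tabs and digits do not affect the ratio.
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        return False
    acgt = sum(1 for c in residues if c in "ACGT")
    # "More than 85%" is read here as a strict inequality.
    return acgt / len(residues) > threshold


if __name__ == "__main__":
    print(looks_like_dna("ACGTACGTAC"))   # nucleotide-only input
    print(looks_like_dna("MKTAYIAKQR"))   # typical protein letters
```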

OPTIONS
       ssearch can be directed to change the scoring matrix, search
       parameters, output format, and default search directories by entering
       options on the command line (preceded by a `-').  All of the options
       should precede the file name and ktup arguments.  Alternatively, these
       options can be changed by setting environment variables.  The options
       and environment variables are:

       -a     (SHOWALL) Modifies the display of the two sequences in
	      alignments.  Normally, both sequences are shown only where they
	      overlap (SHOWALL=0); if -a is given or the environment variable
	      SHOWALL=1, both sequences are shown in their entirety.

       -b #   The  number  of similarity scores to be shown when the -Q option
	      is used.	This value is usually calculated based on  the	actual
	      scores.

       -d #   The  number  of alignments to be shown.  Normally, ssearch shows
	      the same number of alignments as similarity  scores.   By	 using
	      ssearch  -Q  -b  200  -d	50,  one would see the top scoring 200
	      sequences and alignments for the 50 best scores.

       -E #   The expectation value threshold for displaying similarity scores
	      and sequence alignments.  ssearch -Q -E 2.0 would show all library
	      sequences with scores expected to occur no more than 2 times  by
	      chance in a search of the library.

       -f #   Penalty for the first residue in a gap (-12 by default).

       -g #   Penalty for additional residues in a gap (-2 by default).

       -h     Do not display histogram of similarity scores.

       -l file
	      (FASTLIBS)  The  name  of	 the library menu file.	 Normally this
	      will be determined by the environment variable  FASTLIBS.	  How‐
	      ever, a library menu file can also be specified with -l.

       -L     Display more information about the library sequence in the
	      alignment.

       -m #   (MARKX) =0,1,2,3. Alternate display of matches and mismatches in
	      alignments.  MARKX=0 uses ":", ".", " " for identities,
	      conservative replacements, and non-conservative replacements,
	      respectively.  MARKX=1 uses " ", "x", and "X".  MARKX=2 does not show
	      the second sequence, but uses the second alignment line to  dis‐
	      play  matches  with  a "."  for identity, or with the mismatched
	      residue for mismatches.  MARKX=2 is useful  for  aligning	 large
	      numbers  of  similar  sequences.	 MARKX=3  writes out a file of
	      library sequences in FASTA format.   MARKX=3  should  always  be
	      used  with  the  "SHOWALL"  (-a)	option, but this does not com‐
	      pletely ensure that all of the sequences output will be aligned.

       -O filename
	      Sends copy of results to "filename".

       -Q     Quiet option.  This allows ssearch to search a database and
	      report the results without asking any questions.  ssearch -Q
	      file library > output can be put in the background or run at a
	      later time with the unix `at' command.  The number of similarity
	      scores and alignments displayed with the -Q option can be
	      modified with the -b (scores) and -d (alignments) options.

       -r STATFILE
	      Causes ssearch to write out the sequence identifier, superfamily
	      number (if available), and similarity scores to STATFILE for
	      every sequence in the library.  These results are not sorted.

       -s str (SMATRIX) The filename of an alternative scoring matrix file.
	      For protein sequences, BLOSUM50 is used by default;  PAM250  can
	      be used with the command line option -s 250.

       -w #   (LINLEN) Output line length for sequence alignments (normally
	      60; can be set up to 200).

       -z     Do not do statistical significance calculation.
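
       To show how the -f and -g penalties enter the Smith-Waterman score,
       here is a minimal sketch of the local alignment recurrence with
       affine gaps.  It is not ssearch's implementation: the toy scoring
       function (+5 match, -4 mismatch) stands in for a real matrix such as
       BLOSUM50, and all names are assumptions.  A gap of n residues costs
       gap_first + (n - 1) * gap_extend.

```python
def toy_score(x: str, y: str) -> int:
    """Hypothetical stand-in for a scoring matrix such as BLOSUM50."""
    return 5 if x == y else -4


def smith_waterman_score(a, b, score=toy_score, gap_first=-12, gap_extend=-2):
    """Return the best local alignment score of a and b (score only).

    gap_first and gap_extend mirror the defaults of -f and -g.
    """
    neg_inf = float("-inf")
    prev_h = [0] * (len(b) + 1)        # best score ending at (i-1, j)
    prev_f = [neg_inf] * (len(b) + 1)  # best score ending in a vertical gap
    best = 0
    for i in range(1, len(a) + 1):
        cur_h = [0] * (len(b) + 1)
        cur_f = [neg_inf] * (len(b) + 1)
        e = neg_inf                    # best score ending in a horizontal gap
        for j in range(1, len(b) + 1):
            # Open a new gap or extend the existing one, in each direction.
            e = max(cur_h[j - 1] + gap_first, e + gap_extend)
            cur_f[j] = max(prev_h[j] + gap_first, prev_f[j] + gap_extend)
            diag = prev_h[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: a score never drops below zero.
            cur_h[j] = max(0, diag, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


if __name__ == "__main__":
    print(smith_waterman_score("AAAAAAAACCCCCCCC", "AAAAAAAAGGCCCCCCCC"))
```

       Every query residue is compared against every library residue, which
       is why a Smith-Waterman search of a large library is slow.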

EXAMPLES
       (1)    ssearch musplfm.aa $AABANK

       Compare the amino acid sequence in the file musplfm.aa  with  the  com‐
       plete  PIR protein sequence library.  This is extremely slow and should
       almost never be	done.	ssearch	 is  designed  to  search  very	 small
       libraries of sequences.

       A library file has the form:

	    >LCBO bovine preprolactin
	    WILLLSQ ...
	    >LCHU human ...
	    ...

       (2)    ssearch -a -w 80 musplfm.aa lcbo.aa

       Compare	the  amino  acid  sequence  in	the  file  musplfm.aa with the
       sequences in the file lcbo.aa using ktup = 1.  Show both	 sequences  in
       their entirety, with 80 residues on each output line.
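
       The effect of the line length (-w) and of the default MARKX=0 match
       symbols on the alignment display can be sketched as follows.  The
       function names, the "-" gap character and the is_conservative
       predicate are assumptions made for illustration; ssearch derives
       conservative replacements from its scoring matrix and its real
       output also carries sequence names and coordinates.

```python
def marker_line(top, bottom, is_conservative):
    """Build the MARKX=0 style marker line for two aligned strings."""
    marks = []
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            marks.append(" ")      # gap position: no symbol
        elif x == y:
            marks.append(":")      # identity
        elif is_conservative(x, y):
            marks.append(".")      # conservative replacement
        else:
            marks.append(" ")      # non-conservative replacement
    return "".join(marks)


def format_alignment(top, bottom, is_conservative, width=60):
    """Wrap an alignment into blocks of at most `width` columns."""
    marks = marker_line(top, bottom, is_conservative)
    blocks = []
    for start in range(0, len(top), width):
        end = start + width
        blocks.append("\n".join([top[start:end], marks[start:end], bottom[start:end]]))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical rule: treat D/E exchanges as conservative.
    def acidic_pair(x, y):
        return {x, y} == {"D", "E"}

    print(format_alignment("MKDLLW-A", "MKELLWQA", acidic_pair, width=80))
```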

       (3)    ssearch

       Run  the	 ssearch program in interactive mode.  The program will prompt
       for the file name for the query sequence, list alternative libraries to
       be searched (if FASTLIBS is set), and prompt for the ktup.

       You can use your own sequence files for ssearch; just be certain to put
       a '>' and comment as the first line before the sequence.
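
       As a rough check that a file follows this layout, the sketch below
       reads '>'-delimited entries from a list of text lines.  It is not
       part of the fasta package; the function name is hypothetical, and
       dropping every non-letter character is an assumption about what
       "ignored" means for blanks, tabs and unrecognizable characters.

```python
def read_fasta(lines):
    """Yield (comment, sequence) pairs from FASTA-format text lines."""
    comment, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new '>' line closes the previous entry, if any.
            if comment is not None:
                yield comment, "".join(chunks)
            comment, chunks = line[1:].strip(), []
        elif comment is not None:
            # Keep letters only and normalize case (an assumption).
            chunks.append("".join(c for c in line if c.isalpha()).upper())
    if comment is not None:
        yield comment, "".join(chunks)


if __name__ == "__main__":
    # Made-up entries, used only to exercise the reader.
    sample = [">seq1 first test entry", "MKT AYI", "akqr", ">seq2 second test entry", "ACGT"]
    for comment, sequence in read_fasta(sample):
        print(comment, len(sequence))
```

       Lines that appear before the first '>' are skipped, which matches the
       advice above to start each sequence with a '>' comment line.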

SEE ALSO
       rss(1), align(1), fasta(1), rdf2(1), protcodes(5), dnacodes(5)

AUTHOR
       Bill Pearson
       wrp@virginia.EDU

				     local			    SSEARCH(1)