hmmsearch man page on DragonFly

hmmsearch(1)			 HMMER Manual			  hmmsearch(1)

NAME
       hmmsearch - search profile(s) against a sequence database

SYNOPSIS
       hmmsearch [options] <hmmfile> <seqdb>

DESCRIPTION
       hmmsearch is used to search one or more profiles against a sequence
       database.  For each profile in <hmmfile>, use that query profile to
       search the target database of sequences in <seqdb>, and output ranked
       lists of the sequences with the most significant matches to the
       profile.

       The <hmmfile> may contain more than one profile. To build profiles from
       multiple alignments, see hmmbuild.

       The output format is designed to be human-readable,  but	 is  often  so
       voluminous  that	 reading  it is impractical, and parsing it is a pain.
       The --tblout and --domtblout options save output in simple tabular for‐
       mats  that are concise and easier to parse.  The -o option allows redi‐
       recting the main output, including throwing it away in /dev/null.
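
       As an illustration of the options just described, the following
       Python sketch assembles an hmmsearch command line that discards the
       main output and keeps only the tabular summary. The helper name
       build_hmmsearch_cmd and the file names (query.hmm, targets.fasta,
       hits.tbl) are hypothetical; only the flags come from this page.
       Nothing is executed.

```python
import shlex


def build_hmmsearch_cmd(hmmfile, seqdb, tblout=None, domtblout=None, main_out=None):
    """Return an argv list for hmmsearch; this helper does not run anything."""
    cmd = ["hmmsearch"]
    if main_out is not None:
        cmd += ["-o", main_out]            # redirect the human-readable output
    if tblout is not None:
        cmd += ["--tblout", tblout]        # per-target table
    if domtblout is not None:
        cmd += ["--domtblout", domtblout]  # per-domain table
    cmd += [hmmfile, seqdb]                # positional arguments go last
    return cmd


# Hypothetical file names, chosen only for the example.
cmd = build_hmmsearch_cmd("query.hmm", "targets.fasta",
                          tblout="hits.tbl", main_out="/dev/null")
print(shlex.join(cmd))
# hmmsearch -o /dev/null --tblout hits.tbl query.hmm targets.fasta
```

       The resulting list could be handed to subprocess.run(cmd, check=True)
       on a machine where hmmsearch is installed.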

OPTIONS
       -h     Help; print a brief reminder  of	command	 line  usage  and  all
	      available options.

OPTIONS FOR CONTROLLING OUTPUT
       -o <f> Direct  the  main human-readable output to a file <f> instead of
	      the default stdout.

       -A <f> Save a multiple alignment of all significant hits (those	satis‐
	      fying inclusion thresholds) to the file <f>.

       --tblout <f>
	      Save  a  simple  tabular	(space-delimited) file summarizing the
	      per-target output, with one  data	 line  per  homologous	target
	      sequence found.

       --domtblout <f>
              Save a simple tabular (space-delimited) file summarizing the
              per-domain output, with one data line per homologous domain
              detected in a target sequence for each homologous model.

       --acc  Use accessions instead of names in the main output, where avail‐
	      able for profiles and/or sequences.

       --noali
	      Omit the alignment  section  from	 the  main  output.  This  can
	      greatly reduce the output volume.

       --notextw
              Remove the limit on the length of each line in the main output.
              The default is a limit of 120 characters per line, which helps
              in displaying the output cleanly on terminals and in editors,
              but can truncate target sequence description lines.

       --textw <n>
	      Set the main output's line length limit to  <n>  characters  per
	      line. The default is 120.
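
       The space-delimited file written by --tblout can be read with a few
       lines of Python. The sketch below is a minimal reader, not an
       official parser. It assumes the column order given in the HMMER3
       User Guide (18 whitespace-separated fields followed by a free-text
       description, with comment lines starting with "#"); that layout is
       not stated on this page, and the sample data line is invented.

```python
def parse_tblout(lines):
    """Yield one dict per data line of a --tblout file (assumed column layout)."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                       # skip blank and comment/header lines
        fields = line.split(None, 18)      # field 19 (description) may contain spaces
        yield {
            "target": fields[0],
            "query": fields[2],
            "evalue": float(fields[4]),    # full-sequence E-value
            "score": float(fields[5]),     # full-sequence bit score
            "description": fields[18].strip() if len(fields) > 18 else "",
        }


# Invented sample content, for illustration only.
sample = [
    "# target name  accession  query name  accession  E-value  score ...",
    "seq1 - globin PF00042.1 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 "
    "1.0 1 0 0 1 1 1 1 hypothetical protein A",
]
hits = list(parse_tblout(sample))
```

       Splitting with a maximum of 18 splits keeps a description that
       contains spaces in one piece.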

OPTIONS CONTROLLING REPORTING THRESHOLDS
       Reporting  thresholds  control  which hits are reported in output files
       (the main output, --tblout, and --domtblout).  Sequence hits and domain
       hits  are  ranked  by  statistical significance (E-value) and output is
       generated in two sections called per-target and per-domain  output.  In
       per-target  output, by default, all sequence hits with an E-value <= 10
       are reported. In the per-domain output, for each target that has passed
       per-target  reporting  thresholds,  all	domains	 satisfying per-domain
       reporting thresholds are reported. By default, these are	 domains  with
       conditional  E-values  of  <=  10.  The	following options allow you to
       change the default E-value reporting thresholds, or to  use  bit	 score
       thresholds instead.

       -E <x> In  the  per-target  output,  report target sequences with an E-
	      value of <= <x>.	The default is 10.0, meaning that on  average,
	      about  10 false positives will be reported per query, so you can
	      see the top of the noise and decide for yourself if it's	really
	      noise.

       -T <x> Instead of thresholding per-target output on E-value, report
              target sequences with a bit score of >= <x>.

       --domE <x>
              In the per-domain output, for target sequences that have already
              satisfied the per-target reporting threshold, report individual
              domains with a conditional E-value of <= <x>.  The default is
              10.0.  A conditional E-value means the expected number of
              additional false positive domains in the smaller search space of
              those comparisons that already satisfied the per-target
              reporting threshold (and thus must have at least one homologous
              domain already).

       --domT <x>
              Instead of thresholding per-domain output on E-value, report
              domains with a bit score of >= <x>.
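
       The reporting rules above can be summarized as a pair of predicates.
       This Python sketch only restates the logic described in this
       section; it is not HMMER's implementation, and the function names
       are hypothetical. A bit score threshold, when given, takes the place
       of the corresponding E-value threshold.

```python
def target_is_reported(evalue, score, E=10.0, T=None):
    """Per-target test: -T, if set, replaces the -E threshold."""
    if T is not None:
        return score >= T
    return evalue <= E


def domain_is_reported(target_reported, c_evalue, score, domE=10.0, domT=None):
    """Per-domain test, applied only to targets that were already reported."""
    if not target_reported:
        return False
    if domT is not None:
        return score >= domT               # --domT replaces --domE
    return c_evalue <= domE                # conditional E-value threshold


# Made-up numbers: a weak hit that the default -E 10 still reports.
weak = target_is_reported(evalue=3.5, score=12.0)
# The same hit is dropped once a bit score threshold of 20 is requested.
weak_with_T = target_is_reported(evalue=3.5, score=12.0, T=20.0)
```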

OPTIONS FOR INCLUSION THRESHOLDS
       Inclusion thresholds are stricter than reporting thresholds.  Inclusion
       thresholds  control  which hits are considered to be reliable enough to
       be included in an output alignment or a	subsequent  search  round,  or
       marked as significant ("!") as opposed to questionable ("?")  in domain
       output.

       --incE <x>
	      Use an E-value of <= <x> as the per-target inclusion  threshold.
	      The default is 0.01, meaning that on average, about 1 false pos‐
	      itive would be expected in every	100  searches  with  different
	      query sequences.

       --incT <x>
              Instead of using E-values for setting the inclusion threshold,
              use a bit score of >= <x> as the per-target inclusion
              threshold.  By default this option is unset.

       --incdomE <x>
	      Use  a conditional E-value of <= <x> as the per-domain inclusion
	      threshold, in targets that have already  satisfied  the  overall
	      per-target inclusion threshold.  The default is 0.01.

       --incdomT <x>
	      Instead of using E-values, use a bit score of >= <x> as the per-
	      domain inclusion threshold.
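
       To make the distinction between reported and included hits concrete,
       the sketch below marks an already-reported domain as "!" or "?"
       using the default E-value inclusion thresholds. It assumes that a
       domain is marked "!" only when both the per-target and the
       per-domain inclusion thresholds are met; the function name is
       hypothetical and the code is illustrative, not HMMER's own.

```python
def domain_mark(target_evalue, dom_c_evalue, incE=0.01, incdomE=0.01):
    """Return '!' for an included domain and '?' for a merely reported one."""
    included = target_evalue <= incE and dom_c_evalue <= incdomE
    return "!" if included else "?"


# Expected false positives at the default --incE over 100 independent searches.
expected_false_positives = 0.01 * 100

strong = domain_mark(target_evalue=1e-20, dom_c_evalue=1e-18)
borderline = domain_mark(target_evalue=0.5, dom_c_evalue=0.3)
```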

OPTIONS FOR MODEL-SPECIFIC SCORE THRESHOLDING
       Curated profile databases may define specific bit score thresholds  for
       each profile, superseding any thresholding based on statistical signif‐
       icance alone.

       To use these options, the profile must contain the appropriate (GA, TC,
       and/or NC) optional score threshold annotation; this is picked up by
       hmmbuild from Stockholm format alignment files. Each thresholding
       option has two scores: the per-sequence threshold <x1> and the per-
       domain threshold <x2>. These act as if -T <x1> --incT <x1> --domT <x2>
       --incdomT <x2> had been applied specifically using each model's curated
       thresholds.

       --cut_ga
	      Use the GA (gathering) bit scores	 in  the  model	 to  set  per-
	      sequence	(GA1)  and  per-domain	(GA2)  reporting and inclusion
	      thresholds. GA thresholds are generally  considered  to  be  the
	      reliable	curated	 thresholds  defining  family  membership; for
	      example, in Pfam, these thresholds define what gets included  in
	      Pfam Full alignments based on searches with Pfam Seed models.

       --cut_nc
	      Use  the	NC (noise cutoff) bit score thresholds in the model to
	      set per-sequence (NC1) and per-domain (NC2) reporting and inclu‐
	      sion  thresholds.	 NC  thresholds are generally considered to be
	      the score of the highest-scoring known false positive.

       --cut_tc
              Use the TC (trusted cutoff) bit score thresholds in the model to
	      set per-sequence (TC1) and per-domain (TC2) reporting and inclu‐
	      sion thresholds. TC thresholds are generally  considered	to  be
	      the  score  of  the  lowest-scoring  known true positive that is
	      above all known false positives.
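
       The equivalence stated at the top of this section can be written
       out explicitly. The Python sketch below expands a pair of curated
       cutoffs into the four bit score options they stand for. The helper
       name and the cutoff values are hypothetical; real values come from
       the GA, TC, or NC annotation of each model.

```python
def expand_cutoff(seq_cutoff, dom_cutoff):
    """Return the explicit options that a --cut_* flag acts like for one model."""
    return [
        "-T", str(seq_cutoff),         # per-sequence reporting threshold
        "--incT", str(seq_cutoff),     # per-sequence inclusion threshold
        "--domT", str(dom_cutoff),     # per-domain reporting threshold
        "--incdomT", str(dom_cutoff),  # per-domain inclusion threshold
    ]


# Hypothetical gathering cutoffs (GA1, GA2) for a single model.
ga1, ga2 = 25.0, 22.5
equivalent = expand_cutoff(ga1, ga2)
```

       Because the cutoffs differ from model to model, the --cut_* options
       are the practical way to apply them when <hmmfile> holds many
       profiles.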

OPTIONS CONTROLLING THE ACCELERATION PIPELINE
       HMMER3 searches are accelerated in a three-step	filter	pipeline:  the
       MSV  filter, the Viterbi filter, and the Forward filter. The first fil‐
       ter is the fastest and most approximate; the last is the	 full  Forward
       scoring	algorithm.  There  is  also a bias filter step between MSV and
       Viterbi. Targets that pass all the steps in the	acceleration  pipeline
       are then subjected to postprocessing -- domain identification and scor‐
       ing using the Forward/Backward algorithm.

       Changing filter thresholds only removes or includes targets  from  con‐
       sideration;  changing  filter  thresholds does not alter bit scores, E-
       values, or alignments, all of which are determined solely  in  postpro‐
       cessing.

       --max  Turn  off	 all  filters, including the bias filter, and run full
	      Forward/Backward postprocessing on every target. This  increases
	      sensitivity somewhat, at a large cost in speed.

       --F1 <x>
	      Set  the P-value threshold for the MSV filter step.  The default
	      is 0.02, meaning that roughly 2% of the highest  scoring	nonho‐
	      mologous targets are expected to pass the filter.

       --F2 <x>
	      Set  the	P-value	 threshold  for	 the Viterbi filter step.  The
	      default is 0.001.

       --F3 <x>
	      Set the P-value threshold for  the  Forward  filter  step.   The
	      default is 1e-5.

       --nobias
	      Turn  off	 the bias filter. This increases sensitivity somewhat,
	      but can come at a high cost in speed, especially	if  the	 query
	      has  biased  residue  composition (such as a repetitive sequence
	      region, or if it is a membrane protein  with  large  regions  of
	      hydrophobicity). Without the bias filter, too many sequences may
	      pass the filter with biased  queries,  leading  to  slower  than
	      expected	performance  as	 the  computationally  intensive  For‐
	      ward/Backward algorithms shoulder an abnormally heavy load.
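
       The filter cascade can be pictured as three successive P-value
       tests. The Python sketch below is a simplified model of that idea
       using the default thresholds listed above. It leaves out the bias
       filter, assumes a target passes when its P-value is at or below the
       threshold, and uses hypothetical function names; it is not how
       HMMER is implemented internally.

```python
def survives_filters(p_msv, p_vit, p_fwd, F1=0.02, F2=1e-3, F3=1e-5, use_max=False):
    """Return True if a target would reach postprocessing in this simplified model."""
    if use_max:
        return True                     # --max: all filters are turned off
    return p_msv <= F1 and p_vit <= F2 and p_fwd <= F3


def expected_nonhomolog_survivors(n_targets, threshold):
    """Expected number of nonhomologous targets passing one filter by chance."""
    return n_targets * threshold


# For a hypothetical database of one million unrelated sequences:
after_msv = expected_nonhomolog_survivors(1_000_000, 0.02)      # about 20000
after_forward = expected_nonhomolog_survivors(1_000_000, 1e-5)  # about 10
```

       The numbers show why the early filters matter for speed: only a
       small fraction of unrelated targets ever reach the expensive
       Forward/Backward step.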

OTHER OPTIONS
       --nonull2
	      Turn off the null2 score corrections for biased composition.

       -Z <x> Assert that the total number of targets in your searches is <x>,
	      for  the	purposes  of per-sequence E-value calculations, rather
	      than the actual number of targets seen.

       --domZ <x>
	      Assert that the total number of targets in your searches is <x>,
	      for the purposes of per-domain conditional E-value calculations,
	      rather than the number of	 targets  that	passed	the  reporting
	      thresholds.

       --seed <n>
	      Set the random number seed to <n>.  Some steps in postprocessing
	      require Monte Carlo simulation.  The default is to use  a	 fixed
	      seed  (42),  so that results are exactly reproducible. Any other
	      positive integer will give  different  (but  also	 reproducible)
	      results. A choice of 0 uses a randomly chosen seed.

       --tformat <s>
              Assert that the target sequence database file is in format <s>.
              Accepted formats include fasta, embl, genbank, ddbj, uniprot,
              stockholm, pfam, a2m, and afa.  The default is to autodetect the
              format of the file.

       --cpu <n>
	      Set the number of parallel worker threads to <n>.	  By  default,
	      HMMER  sets  this	 to the number of CPU cores it detects in your
	      machine - that is, it tries to maximize the use of  your	avail‐
	      able  processor  cores.  Setting	<n>  higher than the number of
	      available cores is of little if any value, but you may  want  to
	      set  it  to  something less. You can also control this number by
	      setting an environment variable, HMMER_NCPU.

	      This option is only available if HMMER was compiled  with	 POSIX
	      threads  support.	 This  is  the	default,  but it may have been
	      turned off at compile-time for your site	or  machine  for  some
	      reason.

       --stall
              For debugging the MPI master/worker version: pause after start,
              to enable the developer to attach debuggers to the running
              master and worker processes. Send a SIGCONT signal to release
              the pause.  (Under gdb: (gdb) signal SIGCONT)  (Only available
              if optional MPI support was enabled at compile-time.)

       --mpi  Run in MPI master/worker mode, using mpirun.  (Only available if
	      optional MPI support was enabled at compile-time.)
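
       The effect of the -Z option described above can be estimated by
       hand. The Python sketch below assumes that an E-value is a P-value
       multiplied by the number of targets, so that an E-value scales
       linearly with the asserted database size. That relationship is an
       assumption taken from general HMMER usage rather than from this
       page, and the function name and numbers are hypothetical.

```python
def rescale_evalue(evalue, z_seen, z_asserted):
    """Rescale an E-value from the observed target count to an asserted one."""
    if z_seen <= 0 or z_asserted <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_asserted / z_seen


# A hit with E = 0.001 in a test database of 1,000 sequences would be
# expected to have E = 1 if the full search covered 1,000,000 sequences.
rescaled = rescale_evalue(1e-3, z_seen=1_000, z_asserted=1_000_000)
```

       Passing the same -Z value to every run keeps E-values comparable
       when a large database is searched in separate pieces.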

SEE ALSO
       See hmmer(1) for a master man page with a list of  all  the  individual
       man pages for programs in the HMMER package.

       For  complete  documentation,  see  the	user guide that came with your
       HMMER  distribution  (Userguide.pdf);  or  see  the  HMMER   web	  page
       (@HMMER_URL@).

COPYRIGHT
       @HMMER_COPYRIGHT@
       @HMMER_LICENSE@

       For  additional	information  on	 copyright and licensing, see the file
       called COPYRIGHT in your HMMER source distribution, or  see  the	 HMMER
       web page (@HMMER_URL@).

AUTHOR
       Eddy/Rivas Laboratory
       Janelia Farm Research Campus
       19700 Helix Drive
       Ashburn VA 20147 USA
       http://eddylab.org

HMMER @HMMER_VERSION@		 @HMMER_DATE@			  hmmsearch(1)