sgmls man page on DragonFly

SGMLS(1)							      SGMLS(1)

NAME
       sgmls - a validating SGML parser

       An SGML System Conforming to
       International Standard ISO 8879 —
       Standard Generalized Markup Language

SYNOPSIS
       sgmls [ -deglprsuv ] [ -cfile ] [ -iname ] [ -mfile ] [ filename...  ]

DESCRIPTION
       Sgmls  parses  and  validates  the  document entity in filename...  and
       prints on the standard output a simple ASCII representation of its Ele‐
       ment  Structure	Information Set.  (This is the information set which a
       structure-controlled conforming	application should  act	 upon.)	  Note
       that the document entity may be spread amongst several files; for exam‐
       ple, the SGML  declaration,  document  type  declaration	 and  document
       instance	 set  could  each  be in a separate file.  If no filenames are
       specified, then sgmls will read the document entity from	 the  standard
       input.	A  filename  of	 -  can	 also be used to refer to the standard
       input.

       The following options are available:

       -cfile Report any capacity limits that are exceeded and write a	report
	      of  capacity  usage  to  file.  The report is in the format of a
	      RACT result.  RACT is the	 Reference  Application	 for  Capacity
              Testing defined in the Proposed American National Standard
              Conformance Testing for Standard Generalized Markup Language
              (SGML) Systems (X3.190-199X), Draft July 1991.

       -d     Warn about duplicate entity declarations.

       -e     Describe open entities in error messages.	 Error messages always
	      include the  position  of	 the  most  recently  opened  external
	      entity.

       -g     Show the GIs of open elements in error messages.

       -iname Pretend that

		     <!ENTITY % name "INCLUDE">

	      occurs  at  the start of the document type declaration subset in
	      the  document entity.  Since repeated definitions of  an	entity
	      are ignored, this definition will take precedence over any other
	      definitions of this entity in  the  document  type  declaration.
              Multiple -i options are allowed.  If the SGML declaration replaces
	      the reserved name INCLUDE then the new reserved name will be the
	      replacement  text	 of  the  entity.  Typically the document type
	      declaration will contain

		     <!ENTITY % name "IGNORE">

	      and will use %name; in the status	 keyword  specification	 of  a
	      marked  section  declaration.   In  this	case the effect of the
	      option will be to cause the marked section not to be ignored.

       -l     Output L commands giving the current line number and filename.

       -mfile Map public identifiers and entity names  to  system  identifiers
	      using  the  catalog  entry  file	file.  Multiple -m options are
	      allowed.	Catalog entry files specified with the -m option  will
	      be searched before the defaults.

       -p     Parse  only the prolog.  Sgmls will exit after parsing the docu‐
	      ment type declaration.  Implies -s.

       -r     Warn about defaulted references.

       -s     Suppress output.	Error messages will still be printed.

       -u     Warn about undefined elements: elements used in the DTD but  not
	      defined.

       -v     Print the version number.
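
       As a rough illustration of how the options above combine on a command
       line, the following Python sketch assembles an argument vector for
       sgmls.  The helper name build_sgmls_argv and any file names passed to
       it are hypothetical; only the option letters come from this page.

```python
def build_sgmls_argv(filenames, include=(), catalogs=(), flags=""):
    """Assemble an sgmls command line (hypothetical helper, not part of sgmls)."""
    allowed = set("deglprsuv")          # single-letter options from the SYNOPSIS
    unknown = set(flags) - allowed
    if unknown:
        raise ValueError("unknown option letters: " + "".join(sorted(unknown)))
    argv = ["sgmls"]
    if flags:
        argv.append("-" + flags)
    # -iname and -mfile are written with the argument attached to the letter.
    argv += ["-i" + name for name in include]
    argv += ["-m" + path for path in catalogs]
    argv += list(filenames)
    return argv
```

       The resulting list could be handed to a process-spawning routine such
       as Python's subprocess.run; whether that is appropriate depends on the
       local installation.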

   Entity Manager
       An  external  entity  resides in one or more files.  The entity manager
       component of sgmls maps a sequence of files into	 an  entity  in	 three
       sequential stages:

       1.     each carriage return character is turned into a non-SGML charac‐
	      ter;

       2.     each newline character is turned into a  record  end  character,
	      and at the same time a record start character is inserted at the
	      beginning of each line;

       3.     the files are concatenated.
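
       The three stages above can be sketched in Python as follows.  This is
       an illustration, not the sgmls implementation.  The record start value
       follows the \012 shown under Output format; the record end value (13)
       is assumed from the reference concrete syntax; NON_SGML is a
       hypothetical stand-in, since this page does not say which character is
       used.  The treatment of a final line that lacks a newline is also an
       assumption.

```python
RS = "\x0a"        # record start (shown as \012 in the output format)
RE = "\x0d"        # record end; 13 is assumed from the reference concrete syntax
NON_SGML = "\x7f"  # hypothetical stand-in for the non-SGML character


def file_to_records(text):
    """Apply stages 1 and 2 to the contents of one file."""
    # Stage 1: carriage returns become a non-SGML character.
    text = text.replace("\r", NON_SGML)
    # Stage 2: every newline becomes RE and every line gets a leading RS.
    lines = text.split("\n")
    if lines[-1] == "":
        lines.pop()  # the text was empty or ended with a newline
        return "".join(RS + line + RE for line in lines)
    # Assumption: a final line without a newline gets RS but no RE.
    return "".join(RS + line + RE for line in lines[:-1]) + RS + lines[-1]


def build_entity(file_contents):
    """Stage 3: concatenate the converted files."""
    return "".join(file_to_records(text) for text in file_contents)
```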

       A system identifier is interpreted as a list of filenames separated  by
       colons.	A filename of - can be used to refer to the standard input.

       If  a  system  identifier is not specified, then the entity manager can
       generate one using catalog entry files in the  format  defined  in  the
       SGML  Open  Draft Technical Resolution on Entity Management.  A catalog
       entry file contains a sequence of entries in one of the following  four
       forms:

       PUBLIC pubid sysid
              This specifies that sysid should be used as the system identifier
              if the public identifier is pubid.  Sysid is a system identifier
              as defined in ISO 8879 and pubid is a public identifier as
              defined in ISO 8879.

       ENTITY name sysid
	      This specifies that sysid should be used as the  system  identi‐
	      fier if the entity is a general entity whose name is name.

       ENTITY %name sysid
	      This  specifies  that sysid should be used as the system identi‐
	      fier if the entity is a parameter entity	whose  name  is	 name.
	      Note that there is no space between the % and the name.

       DOCTYPE name sysid
	      This  specifies  that sysid should be used as the system identi‐
	      fier if the entity is an entity declared in a document type dec‐
	      laration whose document type name is name.

       The  last two forms are extensions to the SGML Open format.  The delim‐
       iters can be omitted from the sysid provided it does  not  contain  any
       white  space.   Comments are allowed between parameters delimited by --
       as in SGML.  The environment  variable  SGML_CATALOG_FILES  contains  a
       colon-separated	list  of  catalog entry files.	These will be searched
       after any catalog entry files specified using the -m option.   If  this
       environment  variable is not set, then a system dependent list of cata‐
       log entry files will be used.  A match in a catalog entry  file	for  a
       PUBLIC  entry will take precedence over a match in the same file for an
       ENTITY or DOCTYPE entry.	 A filename in a system identifier in a	 cata‐
       log  entry file is interpreted relative to the directory containing the
       catalog entry file.
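
       A minimal Python sketch of reading one catalog entry file and applying
       the precedence rule above (PUBLIC before ENTITY or DOCTYPE within the
       same file) might look like this.  The function names are hypothetical,
       keywords are matched exactly as written on this page, and public
       identifiers are compared as plain strings, which is a simplification.

```python
import re

# A comment (-- ... --), a quoted literal, or a bare token without white space.
TOKEN = re.compile(r'--.*?--|"([^"]*)"' + r"|'([^']*)'|(\S+)", re.DOTALL)
KEYWORDS = ("PUBLIC", "ENTITY", "DOCTYPE")


def tokenize(text):
    tokens = []
    for match in TOKEN.finditer(text):
        groups = [g for g in match.groups() if g is not None]
        if groups:               # no group matched means it was a comment
            tokens.append(groups[0])
    return tokens


def parse_catalog(text):
    """Return a list of (keyword, name-or-pubid, sysid) entries."""
    tokens = tokenize(text)
    entries = []
    i = 0
    while i + 3 <= len(tokens):
        if tokens[i] in KEYWORDS:
            entries.append(tuple(tokens[i:i + 3]))
            i += 3
        else:
            i += 1               # skip anything this sketch does not recognise
    return entries


def lookup(entries, pubid=None, entity_name=None, doctype_name=None):
    """Find a system identifier; a PUBLIC match wins over ENTITY or DOCTYPE."""
    if pubid is not None:
        for kind, key, sysid in entries:
            if kind == "PUBLIC" and key == pubid:
                return sysid
    for kind, key, sysid in entries:
        if kind == "ENTITY" and entity_name is not None and key == entity_name:
            return sysid
        if kind == "DOCTYPE" and doctype_name is not None and key == doctype_name:
            return sysid
    return None
```

       A parameter entity is looked up by passing its name with the leading %
       attached, matching the ENTITY %name form.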

       If no match can be found in a catalog entry file, then the entity  man‐
       ager  will  attempt  to generate a filename using the public identifier
       (if there is one) and other  information	 available  to	it.   Notation
       identifiers  are	 not  subject to this treatment.  This process is con‐
       trolled by the environment variable SGML_PATH; this contains  a	colon-
       separated  list	of filename templates.	A filename template is a file‐
       name that may contain substitution fields; a substitution field is a  %
       character  followed  by a single letter that indicates the value of the
       substitution.  The value of a substitution can either be a string or it
       can  be	null.  The entity manager transforms the list of filename tem‐
       plates into a list of filenames by substituting for  each  substitution
       field  and  discarding any template that contained a substitution field
       whose value was null.  It then uses the first resulting	filename  that
       exists  and  is	readable.   Substitution values are transformed before
       being used for substitution: firstly, any names that  were  subject  to
       upper case substitution are folded to lower case; secondly, space char‐
       acters are mapped to underscores and slashes are	 mapped	 to  percents.
       The  value of the %S field is not transformed.  The values of substitu‐
       tion fields are as follows:

       %%     A single %.

       %D     The entity's data content notation.  This substitution will suc‐
	      ceed only for external data entities.

       %N     The entity, notation or document type name.

       %P     The  public  identifier if there was a public identifier, other‐
	      wise null.

       %S     The system identifier if there was a system identifier otherwise
	      null.

       %X     (This  is	 provided  mainly  for compatibility with ARCSGML.)  A
	      three-letter string chosen as follows:
                                         │            │ With public identifier
                                         │            ├─────────────┬───────────
                                         │ No public  │   Device    │  Device
                                         │ identifier │ independent │ dependent
              ───────────────────────────┼────────────┼─────────────┼───────────
              Data or subdocument entity │ nsd        │ pns         │ vns
              General SGML text entity   │ gml        │ pge         │ vge
              Parameter entity           │ spe        │ ppe         │ vpe
              Document type definition   │ dtd        │ pdt         │ vdt
              Link process definition    │ lpd        │ plp         │ vlp

	      The device dependent version is  selected	 if  the  public  text
	      class  allows  a	public text display version but no public text
	      display version was specified.

       %Y     The type of thing for which the filename is being generated:

	      SGML subdocument entity	 sgml
	      Data entity		 data
	      General text entity	 text
	      Parameter entity		 parm
	      Document type definition	 dtd
	      Link process definition	 lpd

       The value of the following substitution fields will be  null  unless  a
       valid formal public identifier was supplied.

       %A     Null if the text identifier in the formal public identifier con‐
	      tains an unavailable text indicator, otherwise the empty string.

       %C     The public text class, mapped to lower case.

       %E     The public text designating sequence (escape  sequence)  if  the
	      public text class is CHARSET, otherwise null.

       %I     The  empty  string  if the owner identifier in the formal public
	      identifier is an ISO owner identifier, otherwise null.

       %L     The public text language, mapped to lower case, unless the  pub‐
	      lic text class is CHARSET, in which case null.

       %O     The owner identifier (with the +// or -// prefix stripped.)

       %R     The  empty  string  if the owner identifier in the formal public
	      identifier is a registered owner identifier, otherwise null.

       %T     The public text description.

       %U     The empty string if the owner identifier in  the	formal	public
	      identifier is an unregistered owner identifier, otherwise null.

       %V     The public text display version.	This substitution will be null
	      if the public text class does not allow a display version or  if
	      no  version was specified.  If an empty version was specified, a
	      value of default will be used.
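
       The template expansion described above can be sketched in Python as
       follows.  The function names are hypothetical.  The caller supplies a
       dictionary mapping each field letter to a string, or to None for a
       null value.  Folding of names to lower case and the check that a file
       exists and is readable are left out of this sketch.

```python
def transform(value):
    """Map spaces to underscores and slashes to percents (not used for %S)."""
    return value.replace(" ", "_").replace("/", "%")


def expand_template(template, fields):
    """Return the filename for one template, or None if a field is null."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(template):
            return None          # assumption: a trailing lone % is unusable
        letter = template[i + 1]
        i += 2
        if letter == "%":
            out.append("%")      # %% stands for a single %
            continue
        value = fields.get(letter)
        if value is None:
            return None          # null value: the whole template is discarded
        out.append(value if letter == "S" else transform(value))
    return "".join(out)


def candidate_filenames(sgml_path, fields):
    """Expand a colon-separated template list, dropping discarded templates."""
    names = (expand_template(t, fields) for t in sgml_path.split(":"))
    return [name for name in names if name is not None]
```

       The entity manager would then use the first candidate that exists and
       is readable.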

       Normally if the external identifier for an  entity  includes  a	system
       identifier, the entity manager will use the specified system identifier
       and not attempt to generate one.	 If, however, SGML_PATH	 uses  the  %S
       field,  then  the entity manager will first search for a matching entry
       in the catalog entry files.  If a match is found,  then	this  will  be
       used  instead  of  the  specified system identifier.  Otherwise, if the
       specified system identifier does not contain  any  colons,  the	entity
       manager	will  use  SGML_PATH  to  generate  a filename.	 Otherwise the
       entity manager will use the specified system identifier.
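
       The decision procedure in the preceding paragraph, together with the
       earlier rules for entities that have no system identifier, can be
       summarised by the following Python sketch.  The function names and the
       returned labels are hypothetical; catalog_sysid stands for the result
       of a catalog lookup, or None when no entry matches.

```python
def uses_field(sgml_path, letter):
    """True if sgml_path contains the substitution field %<letter>."""
    i = 0
    while i < len(sgml_path) - 1:
        if sgml_path[i] == "%":
            if sgml_path[i + 1] == letter:
                return True
            i += 2               # skip the field, including a literal %%
        else:
            i += 1
    return False


def choose_source(sysid, sgml_path, catalog_sysid):
    """Say which mechanism supplies the system identifier or filename."""
    if sysid is None:
        # No system identifier: try the catalog, then SGML_PATH.
        return "catalog" if catalog_sysid is not None else "sgml_path"
    if not uses_field(sgml_path, "S"):
        return "specified"
    if catalog_sysid is not None:
        return "catalog"
    if ":" not in sysid:
        return "sgml_path"
    return "specified"
```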

   System declaration
       The system declaration for sgmls is as follows:

			  SYSTEM "ISO 8879:1986"
				  CHARSET
       BASESET	"ISO 646-1983//CHARSET
		 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET	0 128 0
       CAPACITY PUBLIC	"ISO 8879:1986//CAPACITY Reference//EN"
				 FEATURES
       MINIMIZE DATATAG NO  OMITTAG  YES   RANK	    NO	SHORTTAG YES
       LINK	SIMPLE	NO  IMPLICIT NO	   EXPLICIT NO
       OTHER	CONCUR	NO  SUBDOC   YES 1 FORMAL   YES
       SCOPE	DOCUMENT
       SYNTAX	PUBLIC	"ISO 8879:1986//SYNTAX Reference//EN"
       SYNTAX	PUBLIC	"ISO 8879:1986//SYNTAX Core//EN"
				 VALIDATE
		GENERAL YES MODEL    YES   EXCLUDE  YES CAPACITY YES
		NONSGML YES SGML     YES   FORMAL   YES
				   SDIF
		PACK	NO  UNPACK   NO

       Exceeding a capacity limit will be ignored  unless  the	-c  option  is
       given.

       The memory usage of sgmls is not a function of the capacity points used
       by a document;  however,	 sgmls	can  handle  capacities	 significantly
       greater than the reference capacity set.

       In  some	 environments,	higher	values may be supported for the SUBDOC
       parameter.

       Documents that do not use optional features are also supported.  For
       example, if FORMAL NO is specified in the SGML declaration, public
       identifiers will not be required to be valid formal public identifiers.

       Certain parts of the concrete syntax may be changed:

	      The shunned character numbers can be changed.

	      Eight bit characters can be assigned to LCNMSTRT, UCNMSTRT, LCN‐
	      MCHAR and UCNMCHAR.

	      Uppercase	 substitution  can  be performed or not performed both
	      for entity names and for other names.

	      Either short reference  delimiters  assigned  by	the  reference
	      delimiter set or no short reference delimiters are supported.

	      The reserved names can be changed.

	      The  quantity set can be increased within certain limits subject
	      to there being sufficient memory available.  The upper limit  on
	      NAMELEN  is 239.	The upper limits on ATTCNT, ATTSPLEN, BSEQLEN,
	      ENTLVL, LITLEN, PILEN, TAGLEN, and TAGLVL are more  than	thirty
	      times  greater  than  the	 reference limits.  The upper limit on
	      GRPCNT, GRPGTCNT, and GRPLVL is 253.  NORMSEP cannot be changed.
              DTAGLEN and DTEMPLEN are irrelevant since sgmls does not support
              the DATATAG feature.

   SGML declaration
       The SGML declaration may be omitted, in which case the following
       declaration will be implied:

			     <!SGML "ISO 8879:1986"
				     CHARSET
       BASESET	"ISO 646-1983//CHARSET
		 International Reference Version (IRV)//ESC 2/5 4/0"
       DESCSET	  0  9 UNUSED
		  9  2	9
		 11  2 UNUSED
		 13  1 13
		 14 18 UNUSED
		 32 95 32
		127  1 UNUSED
       CAPACITY PUBLIC	"ISO 8879:1986//CAPACITY Reference//EN"
       SCOPE	DOCUMENT
       SYNTAX	PUBLIC	"ISO 8879:1986//SYNTAX Reference//EN"
				    FEATURES
       MINIMIZE DATATAG NO OMITTAG  YES		 RANK	  NO  SHORTTAG YES
       LINK	SIMPLE	NO IMPLICIT NO		 EXPLICIT NO
       OTHER	CONCUR	NO SUBDOC   YES 99999999 FORMAL	  YES
				  APPINFO NONE>
       with  the exception that characters 128 through 254 will be assigned to
       DATACHAR.
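
       To make the DESCSET rows of the implied declaration concrete, the
       following Python sketch expands them into a per-character mapping.  It
       assumes the usual ISO 8879 reading of each row: first character number
       in the document character set, number of characters, and then either
       the corresponding base set character number or UNUSED (represented
       here by None).  The names are hypothetical, and the extra assignment
       of characters 128 through 254 to DATACHAR is not modelled.

```python
IMPLIED_DESCSET = [
    (0, 9, None),      # None stands for UNUSED
    (9, 2, 9),
    (11, 2, None),
    (13, 1, 13),
    (14, 18, None),
    (32, 95, 32),
    (127, 1, None),
]


def expand_descset(descset):
    """Map each described character number to its base number or None."""
    mapping = {}
    for start, count, base in descset:
        for offset in range(count):
            mapping[start + offset] = None if base is None else base + offset
    return mapping
```

       Under this reading, characters 9, 10, 13 and 32 through 126 are
       available, and every other character below 128 is unused.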

       Sgmls identifies base character sets using the designating sequence  in
       the  public identifier.	The following designating sequences are recog‐
       nized:

       Designating       ISO           Minimum     Number
       Escape            Registration  Character   of          Description
       Sequence          Number        Number      Characters
       ──────────────────────────────────────────────────────────────────────────────
       ESC 2/5 4/0       -             0           128         full set of ISO 646 IRV
       ESC 2/8 4/0       2             33          94          G0 set of ISO 646 IRV
       ESC 2/8 4/2       6             33          94          G0 set of ASCII
       ESC 2/13 4/1      100           32          96          G1 set of ISO 8859-1
       ESC 2/1 4/0       1             0           32          C0 set of ISO 646
       ESC 2/2 4/3       77            0           32          C1 set of ISO 6429
       ESC 2/5 2/15 3/0  -             0           256         the system character set

       When one of the G0 sets is used as a base set, the characters SPACE and
       DELETE  are  treated as occurring at positions 32 and 127 respectively;
       although these characters are not part of the character sets designated
       by  the	escape	sequences,  this mimics the behaviour of ISO 2022 with
       respect to these code positions.

   Output format
       The output is a series of lines.	 Lines can be arbitrarily long.	  Each
       line  consists  of  an  initial command character and one or more argu‐
       ments.  Arguments are separated by a single space, but when  a  command
       takes a fixed number of arguments the last argument can contain spaces.
       There is no space between the command character and the first argument.
       Arguments can contain the following escape sequences.

       \\     A \.

       \n     A record end character.

       \|     Internal SDATA entities are bracketed by these.

       \nnn   The character whose code is nnn octal.

       A  record  start	 character will be represented by \012.	 Most applica‐
       tions will need to ignore \012 and translate \n into newline.
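
       A consuming application might decode these escape sequences along the
       lines of the following Python sketch.  The function name and the
       sdata_marker parameter are hypothetical; the sketch drops \012 and
       turns \n into a newline, as suggested above.

```python
def decode_argument(arg, sdata_marker="\x01"):
    """Decode the escape sequences in one sgmls output argument."""
    out = []
    i = 0
    while i < len(arg):
        ch = arg[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        nxt = arg[i + 1:i + 2]
        if nxt == "\\":
            out.append("\\")
            i += 2
        elif nxt == "n":
            out.append("\n")            # record end becomes a newline
            i += 2
        elif nxt == "|":
            out.append(sdata_marker)    # bracket around internal SDATA entities
            i += 2
        else:
            digits = arg[i + 1:i + 4]
            if len(digits) != 3 or any(d not in "01234567" for d in digits):
                raise ValueError("bad escape at offset %d" % i)
            code = int(digits, 8)
            if code != 0o12:            # ignore the record start character
                out.append(chr(code))
            i += 4
    return "".join(out)
```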

       The possible command characters and arguments are as follows:

       (gi    The start of an element whose generic  identifier	 is  gi.   Any
	      attributes for this element will have been specified with A com‐
	      mands.

       )gi    The end of an element whose generic identifier is gi.

       -data  Data.

       &name  A reference to an external data entity name; name will have been
	      defined using an E command.

       ?pi    A processing instruction with data pi.

       Aname val
	      The  next	 element to start has an attribute name with value val
	      which takes one of the following forms:

	      IMPLIED
		     The value of the attribute is implied.

	      CDATA data
		     The attribute  is	character  data.   This	 is  used  for
		     attributes whose declared value is CDATA.

	      NOTATION nname
		     The  attribute  is	 a notation name; nname will have been
		     defined using a N command.	 This is used  for  attributes
		     whose declared value is NOTATION.

	      ENTITY name...
		     The  attribute  is	 a list of general entity names.  Each
		     entity name will have been defined using an  I,  E	 or  S
		     command.	This  is  used	for  attributes whose declared
		     value is ENTITY or ENTITIES.

	      TOKEN token...
		     The attribute is a list of	 tokens.   This	 is  used  for
		     attributes whose declared value is anything else.

       Dename name val
	      This  is	the  same as the A command, except that it specifies a
	      data attribute for an external entity named ename.  Any  D  com‐
	      mands  will  come after the E command that defines the entity to
	      which they apply, but before any & or A commands that  reference
	      the entity.

       Nnname Define a notation nname.  This command will be preceded by a p
	      command if the notation was declared with a  public  identifier,
	      and  by  a  s command if the notation was declared with a system
	      identifier.  A notation will only be defined if it is to be ref‐
	      erenced in an E command or in an A command for an attribute with
	      a declared value of NOTATION.

       Eename typ nname
              Define an external data entity named ename with type typ (CDATA,
              NDATA or SDATA) and notation nname.  This command will be preceded
              by one or more f commands giving the filenames generated by the
              entity manager from the system and public identifiers, by a p
              command if a public identifier was declared for the entity, and
              by a s command if a system identifier was declared for the
              entity.  nname will have been defined using a N command.  Data
              attributes may be specified for the entity using D commands.  An
              external data entity will only be defined if it is to be
              referenced in a & command or in an A command for an attribute
              whose declared value is ENTITY or ENTITIES.

       Iename typ text
	      Define an internal data entity named ename with type typ	(CDATA
	      or  SDATA)  and  entity text text.  An internal data entity will
	      only be defined if it is referenced  in  an  A  command  for  an
	      attribute whose declared value is ENTITY or ENTITIES.

       Sename Define  a	 subdocument entity named ename.  This command will be
	      preceded by one or more f commands giving the  filenames	gener‐
	      ated  by	the  entity manager from the system and public identi‐
	      fiers, by a p command if a public identifier  was	 declared  for
	      the  entity,  and	 by  a	s  command  if a system identifier was
	      declared for the entity.	A  subdocument	entity	will  only  be
	      defined  if  it  is referenced in a { command or in an A command
	      for an attribute whose declared value is ENTITY or ENTITIES.

       ssysid This command applies to the next E, S or N command and specifies
	      the associated system identifier.

       ppubid This command applies to the next E, S or N command and specifies
	      the associated public identifier.

       ffilename
	      This command applies to the next E or S command and specifies an
	      associated  filename.  There will be more than one f command for
	      a single E or S command if the system identifier used a colon.

       {ename The start of the	subdocument entity ename; ename will have been
	      defined using a S command.

       }ename The end of the  subdocument entity ename.

       Llineno file
       Llineno
	      Set the current line number and filename.	 The filename argument
	      will be omitted if only the line number has changed.  This  will
	      be output only if the -l option has been given.

       #text  An APPINFO parameter of text was specified in the SGML declaration.
	      This is not strictly part of  the	 ESIS,	but  a	structure-con‐
	      trolled  application  is	permitted  to act on it.  No # command
	      will be output if APPINFO NONE was specified.  A # command  will
	      occur  at most once, and may be preceded only by a single L com‐
	      mand.

       C      This command indicates that the document was a conforming	 docu‐
	      ment.   If  this command is output, it will be the last command.
              An SGML document is not conforming if it references a subdocument
	      entity that is not conforming.
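
       As an illustration of consuming this output, the following Python
       sketch groups A commands with the element start that follows them and
       reports a few other commands as events.  The function name and the
       event tuples are hypothetical, only a subset of the commands is
       interpreted, and arguments are left undecoded.

```python
def parse_esis(lines):
    """Yield (kind, name_or_text, extra) events from sgmls output lines."""
    pending = {}                         # attributes announced by A commands
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        cmd, rest = line[0], line[1:]    # no space after the command character
        if cmd == "A":
            name, _, val = rest.partition(" ")
            kind, _, data = val.partition(" ")
            pending[name] = (kind, data)
        elif cmd == "(":
            yield ("start", rest, pending)
            pending = {}                 # attributes apply to one element only
        elif cmd == ")":
            yield ("end", rest, None)
        elif cmd == "-":
            yield ("data", rest, None)
        elif cmd == "C":
            yield ("conforming", None, None)
        else:
            yield ("other", cmd, rest)   # remaining commands passed through
```

       A fuller consumer would also track the E, I, S, N, s, p and f commands
       so that entity and notation references can be resolved.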

BUGS
       Some  non-SGML characters in literals are counted as two characters for
       the purposes of quantity and capacity calculations.

SEE ALSO
       The SGML Handbook, Charles F. Goldfarb
       ISO 8879 (Standard Generalized Markup Language), International  Organi‐
       zation for Standardization

ORIGIN
       ARCSGML was written by Charles F. Goldfarb.

       Sgmls was derived from ARCSGML by James Clark (jjc@jclark.com), to whom
       bugs should be reported.

								      SGMLS(1)