MORPHY(7WN)			   WordNet™			   MORPHY(7WN)

NAME
       morphy - discussion of WordNet's morphological processing

DESCRIPTION
       Although	 only  base  forms  of	words  are  usually stored in WordNet,
       searches may be done on inflected forms.	 A  set	 of  morphology	 func‐
       tions,  Morphy, is applied to the search string to generate a form that
       is present in WordNet.

       Morphology in WordNet uses two types of processes to try to convert the
       string  passed  into  one  that	can  be found in the WordNet database.
       There are lists of inflectional endings, based on  syntactic  category,
       that can be detached from individual words in an attempt to find a form
       of the word that is in WordNet.	There are also exception  list	files,
       one  for	 each  syntactic  category, in which a search for an inflected
       form is done.  Morphy tries to use these two processes in  an  intelli‐
       gent  manner  to	 translate the string passed to the base form found in
       WordNet.	 Morphy first checks for exceptions, then uses	the  rules  of
       detachment.   The  Morphy  functions  are not independent from WordNet.
       After each transformation, WordNet is searched for the resulting string
       in the syntactic category specified.

       The  Morphy  functions are passed a string and a syntactic category.  A
       string is either a single word or a collocation.  Since some words,
       such as axes, can have more than one base form (axe and axis), Morphy
       works in the following manner.  The first time that  Morphy  is	called
       with  a	specific  string, it returns a base form.  For each subsequent
       call to Morphy made with a NULL string argument, Morphy returns another
       base form.  Whenever Morphy cannot perform a transformation, whether on
       the first call for a word or subsequent calls,  NULL  is	 returned.   A
       transformation  to  a valid English string will return NULL if the base
       form of the string is not in WordNet.
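
       The calling convention above can be illustrated with a small
       stand-in.  This is only a sketch of the protocol, not the WordNet
       library: the class MorphyIterator and its table of base forms are
       hypothetical, and Python's None plays the role of NULL.

```python
class MorphyIterator:
    """Toy model of the stateful calling convention (not the real library)."""

    def __init__(self, base_forms):
        # base_forms: hypothetical mapping from a string to its base forms.
        self._base_forms = base_forms
        self._pending = []

    def morphy(self, word):
        # A non-None argument starts a new lookup; None continues the last one.
        if word is not None:
            self._pending = list(self._base_forms.get(word, []))
        if not self._pending:
            return None  # nothing found, or no base forms left
        return self._pending.pop(0)


m = MorphyIterator({"axes": ["axe", "axis"]})
first = m.morphy("axes")   # first base form
second = m.morphy(None)    # next base form
third = m.morphy(None)     # exhausted, so None
```

       See morph(3WN) for the actual function names and signatures.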

       The morphological functions are found  in  the  WordNet	library.   See
       morph(3WN) for information on using these functions.

   Rules of Detachment
       The following table shows the rules of detachment used by Morphy.  If a
       word ends with one of the suffixes, it is stripped from	the  word  and
       the  corresponding  ending  is added.  Then WordNet is searched for the
       resulting string.  No rules are applicable to adverbs.

                               POS  │ Suffix │ Ending
                               ─────┼────────┼────────
                               NOUN │ "s"    │ ""
                               NOUN │ "ses"  │ "s"
                               NOUN │ "xes"  │ "x"
                               NOUN │ "zes"  │ "z"
                               NOUN │ "ches" │ "ch"
                               NOUN │ "shes" │ "sh"
                               NOUN │ "men"  │ "man"
                               NOUN │ "ies"  │ "y"
                               VERB │ "s"    │ ""
                               VERB │ "ies"  │ "y"
                               VERB │ "es"   │ "e"
                               VERB │ "es"   │ ""
                               VERB │ "ed"   │ "e"
                               VERB │ "ed"   │ ""
                               VERB │ "ing"  │ "e"
                               VERB │ "ing"  │ ""
                               ADJ  │ "er"   │ ""
                               ADJ  │ "est"  │ ""
                               ADJ  │ "er"   │ "e"
                               ADJ  │ "est"  │ "e"
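
       The table can be written down as data.  The following sketch, which
       assumes nothing beyond the table itself, generates the candidate
       strings for a word; the names DETACHMENT_RULES and candidates are
       illustrative, and the WordNet search that follows each candidate is
       left out.

```python
# Rules of detachment, copied from the table above: (suffix, ending) pairs.
DETACHMENT_RULES = {
    "noun": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
             ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "verb": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
             ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
    "adj": [("er", ""), ("est", ""), ("er", "e"), ("est", "e")],
    "adv": [],  # no rules are applicable to adverbs
}


def candidates(word, pos):
    """Apply every matching rule, in table order, and collect the results."""
    result = []
    for suffix, ending in DETACHMENT_RULES[pos]:
        if word.endswith(suffix):
            candidate = word[:-len(suffix)] + ending
            if candidate not in result:
                result.append(candidate)
    return result


print(candidates("boxes", "noun"))   # ['boxe', 'box']
print(candidates("hoping", "verb"))  # ['hope', 'hop']
```

       Only candidates that are actually present in WordNet would be
       returned by Morphy; here "boxe" would be discarded at that step.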

   Exception Lists
       There is one exception list file	 for  each  syntactic  category.   The
       exception  lists	 contain the morphological transformations for strings
       that are not regular and therefore cannot be processed in an  algorith‐
       mic  manner.  Each line of an exception list contains an inflected form
       of a word or collocation, followed by one or more base forms.  The list
       is kept in alphabetical order and a binary search is used to find words
       in these lists.	See wndb(5WN) for information on  the  format  of  the
       exception list files.
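
       A minimal sketch of an exception list lookup follows.  The sample
       lines are invented for illustration and only mimic the layout
       described above (inflected form followed by base forms, sorted
       alphabetically); the helper names are hypothetical.

```python
import bisect

# Invented sample data in "inflected base [base ...]" layout, kept sorted.
SAMPLE_LINES = ["axes axe axis", "children child", "geese goose", "men man"]


def parse(lines):
    """Split each line into (inflected form, list of base forms)."""
    entries = []
    for line in lines:
        inflected, *bases = line.split()
        entries.append((inflected, bases))
    return entries


def lookup(entries, word):
    """Binary search over the sorted inflected forms."""
    keys = [inflected for inflected, _ in entries]
    i = bisect.bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return entries[i][1]
    return []


entries = parse(SAMPLE_LINES)
print(lookup(entries, "axes"))  # ['axe', 'axis']
```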

   Single Words
       In  general, single words are relatively easy to process.  Morphy first
       looks for the word in the exception list.  If it is found, the first
       base  form  is  returned.  Subsequent calls with a NULL argument return
       additional base forms, if present.  A NULL is returned when  there  are
       no more base forms of the word.

       If  the	word  is  not found in the exception list corresponding to the
       syntactic category, an algorithmic process using the rules  of  detach‐
       ment  looks  for	 a  matching suffix.  If a matching suffix is found, a
       corresponding ending is applied (sometimes this ending is an empty
       string, so in effect the suffix is removed from the word), and WordNet
       is consulted to see if the resulting word is found in the desired  part
       of speech.
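
       The two steps for single words can be sketched for nouns as
       follows.  The lexicon and exception table are toy stand-ins for the
       WordNet database and the noun exception list, and base_forms is a
       hypothetical name, so this only shows the order of operations.

```python
NOUN_RULES = [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
              ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")]

# Toy stand-ins (assumptions, not real WordNet data).
TOY_LEXICON = {"box", "church", "fly", "goose"}
TOY_EXCEPTIONS = {"geese": ["goose"]}


def base_forms(word):
    """Exception list first, then the rules of detachment."""
    if word in TOY_EXCEPTIONS:
        return list(TOY_EXCEPTIONS[word])
    found = []
    for suffix, ending in NOUN_RULES:
        if word.endswith(suffix):
            candidate = word[:-len(suffix)] + ending
            # Keep a candidate only if the lexicon contains it.
            if candidate in TOY_LEXICON and candidate not in found:
                found.append(candidate)
    return found


print(base_forms("geese"))     # ['goose']
print(base_forms("churches"))  # ['church']
```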

   Collocations
       As  opposed  to	single	words,	collocations can be quite difficult to
       transform into a base form that is present  in  WordNet.	  In  general,
       only  base  forms  of  words,  even  those comprising collocations, are
       stored in WordNet, such as attorney general.  Transforming the colloca‐
       tion  attorneys general	is  then  simply  a matter of finding the base
       forms of the individual words comprising the collocation.  This usually
       works for nouns; non-conforming nouns, such as customs duty, are
       therefore entered in the noun exception list.
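
       The word-by-word approach for noun collocations might look like the
       sketch below.  The tables are toy assumptions, not WordNet data.  The
       second call shows why a collocation such as customs duty cannot be
       handled this way and needs an exception list entry instead.

```python
# Toy stand-ins (assumptions): per-word base forms and stored collocations.
TOY_NOUN_BASES = {"attorneys": "attorney", "customs": "custom"}
TOY_NOUN_LEXICON = {"attorney general", "customs duty"}


def noun_collocation_base(collocation):
    """Replace each word by its base form, then check the lexicon."""
    words = collocation.split()
    candidate = " ".join(TOY_NOUN_BASES.get(w, w) for w in words)
    return candidate if candidate in TOY_NOUN_LEXICON else None


print(noun_collocation_base("attorneys general"))  # attorney general
print(noun_collocation_base("customs duty"))       # None ("custom duty")
```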

       Verb collocations that contain prepositions, such  as  ask for it,  are
       more  difficult.	  As with single words, the exception list is searched
       first.  If the collocation is not found, special code in Morphy	deter‐
       mines whether a verb collocation includes a preposition.	 If it does, a
       function is called to try to find the base form in the  following  man‐
       ner.   It  is  assumed that the first word in the collocation is a verb
       and that the last word is a noun.  The algorithm then builds  a	search
       string  with the base forms of the verb and noun, leaving the remainder
       of the collocation (usually just the preposition, but more words may be
       involved) in the middle.  For example, if Morphy is passed
       asking for it, the database search is performed with ask for it, which
       is found in WordNet and is therefore returned.  If a verb collocation
       does not contain a preposition, then the base form of each word in the
       collocation is found and WordNet is searched for the resulting string.
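
       The preposition handling can be sketched as below.  The preposition
       list, the base-form tables and the function name are all
       assumptions made for illustration; the real code in the WordNet
       library may differ in detail.

```python
# Assumed data for illustration only.
PREPOSITIONS = {"at", "for", "in", "of", "on", "to", "with"}
TOY_VERB_BASES = {"asking": "ask", "asked": "ask"}
TOY_NOUN_BASES = {}
TOY_VERB_COLLOCATIONS = {"ask for it"}


def verb_collocation_base(collocation):
    """First word treated as a verb, last as a noun, middle left untouched."""
    words = collocation.split()
    middle = words[1:-1]
    if len(words) < 3 or not any(w in PREPOSITIONS for w in middle):
        return None  # no preposition in the middle: not handled here
    verb = TOY_VERB_BASES.get(words[0], words[0])
    noun = TOY_NOUN_BASES.get(words[-1], words[-1])
    candidate = " ".join([verb, *middle, noun])
    return candidate if candidate in TOY_VERB_COLLOCATIONS else None


print(verb_collocation_base("asking for it"))  # ask for it
```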

   Hyphenation
       Hyphenation  also presents special difficulties when searching WordNet.
       It is often a subjective decision as to whether a word  is  hyphenated,
       joined  as one word, or is a collocation of several words, and which of
       the various forms are entered  into  WordNet.   When  Morphy  breaks  a
       string  into  "words",  it  looks for both spaces and hyphens as delim‐
       iters.  It also looks for periods in strings and	 removes  them	if  an
       exact match is not found.  For example, a search for an abbreviation
       like oct. returns the synset for { October, Oct }.  Not every pattern
       of hyphenated and collocated strings is searched for properly, so it may be
       advantageous to specify several search strings  if  the	results	 of  a
       search attempt seem incomplete.
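
       The delimiter and period handling can be approximated as follows.
       Both helpers and the one-entry lexicon are hypothetical; the sketch
       only shows splitting on spaces and hyphens, and retrying without
       periods when an exact match fails.

```python
import re


def split_words(string):
    """Break a string into words at spaces and hyphens."""
    return [w for w in re.split(r"[ -]+", string) if w]


def lookup_with_period_fallback(string, lexicon):
    """Try an exact match first, then retry with periods removed."""
    if string in lexicon:
        return string
    stripped = string.replace(".", "")
    return stripped if stripped in lexicon else None


print(split_words("mother-in-law"))                  # ['mother', 'in', 'law']
print(lookup_with_period_fallback("oct.", {"oct"}))  # oct
```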

   Special Processing for nouns ending with 'ful'
       Morphy contains code that searches for nouns ending with ful and
       performs a transformation on the substring preceding it.  It then
       appends 'ful' back onto the resulting string and returns it.  For
       example, if passed the noun boxesful, it will return boxful.
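
       A minimal sketch of this special case follows.  The function name
       and the toy table of noun base forms are assumptions; in Morphy the
       inner transformation is the ordinary noun processing.

```python
# Toy table standing in for ordinary noun processing (assumption).
TOY_NOUN_BASES = {"boxes": "box", "spoons": "spoon"}


def ful_base(word):
    """Transform the part before 'ful', then put 'ful' back."""
    if not word.endswith("ful") or len(word) <= 3:
        return None
    stem = word[:-3]
    base = TOY_NOUN_BASES.get(stem)
    return base + "ful" if base is not None else None


print(ful_base("boxesful"))  # boxful
```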

BUGS
       Since many noun collocations contain prepositions, such as
       line of products, an algorithm similar to that used for verbs should be
       written for nouns.  In the present scheme, if Morphy is passed
       lines of products, the search string becomes line of product, which is
       not in WordNet.

       Morphy will allow non-words to be converted to words,  if  they	follow
       one of the rules described above.  For example, it will happily convert
       plantes to plant.

ENVIRONMENT VARIABLES (UNIX)
       WNHOME		   Base	  directory   for   WordNet.	 Default    is
			   /usr/local/WordNet-3.0.

       WNSEARCHDIR	   Directory  in  which	 the WordNet database has been
			   installed.  Default is WNHOME/dict.
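
       The way the two variables and their defaults combine can be
       sketched as below.  The function wordnet_search_dir is hypothetical
       and only mirrors the defaults stated above; it is not part of the
       WordNet library.

```python
import os
import posixpath


def wordnet_search_dir(env=None):
    """Resolve the database directory from WNSEARCHDIR and WNHOME."""
    if env is None:
        env = os.environ
    home = env.get("WNHOME", "/usr/local/WordNet-3.0")
    # WNSEARCHDIR wins if set; otherwise fall back to WNHOME/dict.
    return env.get("WNSEARCHDIR", posixpath.join(home, "dict"))


print(wordnet_search_dir({}))  # /usr/local/WordNet-3.0/dict
```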

REGISTRY (WINDOWS)
       HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome
			   Base directory for  WordNet.	  Default  is  C:\Pro‐
			   gram Files\WordNet\3.0.

FILES
       pos.exc             morphology exception lists, one per syntactic
                           category (pos is noun, verb, adj, or adv)

SEE ALSO
       wn(1WN), wnb(1WN), binsrch(3WN), morph(3WN), wndb(5WN), wninput(7WN).

WordNet 3.0			   Dec 2006			   MORPHY(7WN)