



VIEWING WORD EXPERT PARSING AS LINGUISTIC THEORY*
Department of Computer Science, Mathematical Sciences Building, University of Rochester, Rochester, New York 14627
The Word Expert Parser is a computer program that analyzes fragments of natural language text in order to extract their meaning in context. The construction of the program has led to the development of a linguistic theory based on notions orthogonal to those traditionally found at the heart of such theories. Word Expert Parsing explains the understanding of textual fragments containing highly idiosyncratic elements, such as idioms, collocations, cliches, and colligations, as well as lexical sequences that contain interesting structural phenomena. The theory perceives the individual word of language as the organizing unit for linguistic knowledge, and views understanding as consisting of lexical interactions among procedural word experts. This paper describes four classes of lexical interaction required to explain the understanding of sentences in context.
The paper purposely avoids programming details in order to focus on Word Expert Parsing as linguistic theory.
The Word Expert Parser (WEP) is a computer program that analyzes fragments of natural language text in order to extract their meaning in context. The system has been developed with particular attention paid to the wide variety of different meaning roles of words when appearing in combination with other words. The theoretical position advanced by WEP about the nature of individual words is that words have no meaning per se, but rather, that fragments of lexical items mean something through their interrelationships. Furthermore, the character of lexical relations runs the gamut from the simple direct knowledge that some word sequence represents some remembered concept, to the more analytical knowledge that particular kinds of lexical sequences often represent certain classes of conceptual notions.
* The support of NASA, the Fulbright-Hays program, and IRCAM is gratefully acknowledged.
Small
Groupe d'Intelligence Artificielle, Département d'Informatique, Université Paris VIII - Vincennes, 93200 Saint Denis, FRANCE
The evolution of this perspective started with the observation that the understanding of a particular fragment of text depends fundamentally on the disambiguation of the individual words composing it. Knowing the contextual meanings of the words is tantamount to understanding the meaning of the overall fragment. Another way of saying the same thing [Rieger, 1977] is that language interpretation can be ultimately viewed as a process of word sense discrimination. Unfortunately, this perspective does not eliminate the classic problems of deciding the nature of a distinct word sense, the difference between different usages, word senses, and idioms, and so forth. The solution to these problems comes in realizing that the process of understanding the meaning of words in context does not require reference to those notions at all. The design of a parsing procedure based on determining the meaning roles of individual words in context has led to the orthogonal linguistic notions that are the subject of this paper.
The organization of WEP is founded on the belief that the grouping together of words to form meaningful sequences is an active process which succeeds only because of highly idiosyncratic application of lexical knowledge. That is, we fragment text and understand the meaning of the pieces because we know how the particular words involved interact with each other.
Sometimes sequences of two or more words interact together to such an extent that they seem to behave as a single lexical item. Linguists have labelled such sequences idioms. The notion to which this definition gives rise, however, causes several problems for linguistic theory. First of all, rarely does such a sequence hold together so tightly that it can be truly treated theoretically as a single lexical item. Secondly, rarely does such a sequence have a unique meaning. More often than not, the meaning of an idiomatic expression must be determined by disambiguation. The sequence must be analyzed in context and be treated by comprehension processes as being either (a) a cohesive whole with idiosyncratic meaning, or (b) a sequence having meaning through less specific language knowledge. There is no a priori way of knowing the meaning of the sequence to be the one or the other.
The notion of idiom falls at one end of a spectrum, an idealized end that I claim does not exist. Lexical sequences can be more or less idiomatic, in the sense that the process interactions constituting the understanding of them include more or fewer idiosyncratic interactions. The WEP way of looking at the most idiomatic sequences is that the special interactions among the participating words take priority over any other potential interactions involving those words. The disambiguation of idiomatic expressions, i.e., the understanding of the sequences as either idioms or non-idioms (to use the popular distinction), generally requires other process interactions besides the strictly word-specific ones. The understanding of an idiom thus differs insignificantly, from the perspective of WEP theory, from comprehension of any other kind (according to whatever classification scheme) of lexical sequence.
The notion that all fragments of language are more or less idiomatic, while radical in some linguistic quarters, has been previously suggested. In his introductory textbook, Aspects of Language, Dwight Bolinger asks "whether everything we say may be in some degree idiomatic -- that is, whether there are affinities among words that continue to reflect the attachments the words had when we learned them, within larger groups" [Bolinger, 1975]. After working within what he calls "the prevailing reductionism", Bolinger began to suggest a positive answer to his pedagogical question, choosing to take "an idiomatic rather than an analytical view" [Bolinger, 1979] of language. The contribution of artificial intelligence in general, and of Word Expert Parsing in particular, is to develop theory from this informal view. The notion of process, and of process interaction, allows us to begin to do just that.
The WEP computer system maintains linguistic knowledge across a community of word-based structures called word experts, which represent the process of determining the contextual meaning and role of the individual words. A word expert must not be thought of as a representation for the various meanings, roles, and contributions of a word in context, but rather as a declarative representation (a network) of the process (which we shall call disambiguation) of determining these things. Certainly, it is the meaning contributions of individual lexical items that we wish to determine. Word experts are both data and process; they can be augmented, examined, and manipulated as data, yet parsing takes place through their interpretation as program by an expert evaluator, similar to the EVAL of Lisp.
The distributed parsing scheme of WEP works as follows. The WEP reader examines a word of text and retrieves its word expert from memory. The word expert starts executing, trying to determine the meaning role of its word in context, i.e., interacting with other word experts and with higher-order system processes to acquire the appropriate contextual knowledge to make the correct inferences. Finally, all the word experts for a particular fragment of text come to mutual agreement on the meaning of the fragment, and the local distributed process terminates. The process is local in the sense that, as long as there remains input text, the overall parsing process continues, while the disambiguation of individual lexical sequences making up the larger text completes.
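The reader loop just described can be sketched in a few lines. This is a deliberately simplified, sequential illustration under stated assumptions: the shared `state` dictionary stands in for the message exchange among experts, and the lexicon entries, the `DEFINITE` and `UNMARKED` markers, and the function names are all hypothetical.

```python
def the_expert(state):
    # Leaves a signal for the expert that follows: a definite description is starting.
    state["pending"].append("DEFINITE")


def noun_expert(word):
    def run(state):
        # Pick up a waiting signal, if any, and report a concept for this fragment.
        marker = state["pending"].pop() if state["pending"] else "UNMARKED"
        state["concepts"].append((marker, word.upper()))
    return run


# Hypothetical lexicon mapping each word to its expert.
LEXICON = {"the": the_expert, "man": noun_expert("man"), "game": noun_expert("game")}


def read(text):
    """Examine each word, retrieve its expert from memory, and let it execute."""
    state = {"pending": [], "concepts": []}
    for word in text.split():
        expert = LEXICON[word]
        expert(state)
    return state["concepts"]
```

Each local fragment ("the man", then "the game") completes on its own while `read` keeps consuming input, which mirrors the local termination described above. Real word experts would suspend and resume rather than run to completion in one call.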
Interaction, between individuals in the world, or between distributed processes in a computer program, requires both (a) giving information and (b) receiving information. In WEP, the experts exchange two kinds of information, called concept structures and control signals. Concept structures represent human concepts, such as "a book", "going fishing", "the box of candy I gave Joanie for Valentine's day in 1981", "some blue physical object", and the like. Control signals represent processing clues, such as "expect a word that can begin a lexical sequence that can describe concept structure X", "send me the concept structure representing the agent of concept structure Y or a signal saying you cannot", "wait a second and you will be sent a concept structure that will help you", and similar things. The representation and use of concepts and signals are described fully in [Small, 1980].
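As a rough illustration of the two kinds of exchanged information, the sketch below models them as two record types and implements one of the quoted signals ("send me the concept structure ... or a signal saying you cannot"). The field names, the signal kinds, and the `answer_request` helper are assumptions for this example; the actual representation is the one described in [Small, 1980].

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptStructure:
    name: str                                   # unique name, e.g. "C7"
    kind: str                                   # e.g. "physical-object"
    properties: dict = field(default_factory=dict)


@dataclass
class ControlSignal:
    sender: str                                 # the word expert sending the signal
    kind: str                                   # e.g. "request-agent", "expect", "cannot"
    about: Optional[str] = None                 # name of the concept structure concerned


def answer_request(signal, known_agents):
    """Reply with the requested concept structure, or with a 'cannot' signal."""
    concept = known_agents.get(signal.about)
    if concept is not None:
        return concept
    return ControlSignal(sender="responder", kind="cannot", about=signal.about)


blue_thing = ConceptStructure("C1", "physical-object", {"color": "blue"})
request = ControlSignal(sender="throw", kind="request-agent", about="C7")
```

With these definitions, `answer_request(request, {"C7": blue_thing})` returns the concept structure itself, and `answer_request(request, {})` returns a `ControlSignal` whose kind is `"cannot"`.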
I use the term lexical interaction to denote the sending and receiving of control signals and concept structures by word experts in WEP. This includes interactions between individual experts, as well as those between a word expert and another kind of model process (e.g., a mechanism inferring the goals of a dialogue participant). This paper discusses lexical interactions by presenting four classes of required interaction, and then arguing for the necessity and giving examples of each. The categorization is by the kind of knowledge exchanged in the communication, and includes the following.
The least general class of lexical interactions is considered idiosyncratic, since these interactions are word-specific and arise through simple recall memory. This type of interaction permits the understanding of idiomatic fragments. General knowledge about the syntax and semantics of some natural language gives rise to linguistic interactions, which are of course crucial to the understanding of lexical sequences not previously seen. Sometimes words interact with processes that monitor the development of an entire text (or parts thereof), or the goals of participants in discussion. These discourse interactions are often necessary for the meaningful cohesion of lexical fragments. Lastly, but certainly not least important, are the logical interactions between words and the most general cognitive processes. Perceptions about the world, beliefs, inference-making skills, rote memory, and so forth, are basic to language understanding.
The classification of word fragments into categories such as "idiom", "collocation", "colligation", "noun phrase", "complement", and the like, does not make sense in WEP theory. Rather, individual words are viewed as having certain kinds and sequences of interactions with their neighbors to form meaningful pieces of text. Fragments often described as "idioms" are those that are understood principally through idiosyncratic lexical interactions. A non-idiomatic structure, diagnosed as a "noun phrase", is one that involves mostly linguistic interactions to understand. A so-called "noun-noun pair" can be thought of as a lexical sequence comprehended with the help of logical interactions, with recourse to common sense memory and skills.
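The idea that traditional labels merely summarize which kind of interaction dominates can be made concrete with a toy sketch. The four class names come from the text; the notion of a recorded "trace" of interactions and the `dominant_interaction` helper are assumptions introduced only for this illustration.

```python
from collections import Counter
from enum import Enum


class Interaction(Enum):
    IDIOSYNCRATIC = "idiosyncratic"   # word-specific, recalled from memory
    LINGUISTIC = "linguistic"         # general syntax and semantics
    DISCOURSE = "discourse"           # text development, participant goals
    LOGICAL = "logical"               # beliefs, inference, world knowledge


def dominant_interaction(trace):
    """Return the interaction class that occurs most often in a non-empty trace."""
    return Counter(trace).most_common(1)[0][0]


# Hypothetical traces for two fragments, invented for the example.
idiom_like = [Interaction.IDIOSYNCRATIC, Interaction.IDIOSYNCRATIC, Interaction.DISCOURSE]
noun_phrase_like = [Interaction.LINGUISTIC, Interaction.LINGUISTIC, Interaction.LOGICAL]
```

On this view a fragment is "more or less idiomatic" according to the mix in its trace, rather than belonging to a fixed category.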
3.1 Idiosyncratic interaction
Since the emphasis of the WEP research effort is to construct a computer program to understand natural language, we are not qualified to make
Noun Phrases
How can the purely lexical WEP system require no notion of high order structural phenomena, yet still be able to account for them? The following example (provided by Yorick Wilks) illustrates the lexical interactions required to analyze an interesting fragment of text.
(3) "Joanie washes the colorful dishes up."
The difficulty with this fragment is in determining that the word dishes contributes to the meaning of the fragment through interactions with the two words to its left, but that the word up contributes by association with the word washes, which precedes up by many intervening words. The reason that I am avoiding the use of traditional linguistic jargon for describing this phenomenon is the following belief: An understanding of WEP requires the viewing of language interpretation from the vantage point of the individual word and its interactions. An important way to achieve this is to describe the analysis process with reference to the very notions (not the traditional ones) around which it is organized.
In the analysis of the example fragment, WEP would find the referent of Joanie and then proceed as follows. The wash expert would begin executing, trying to determine its own meaning role in some lexical fragment, and at the same time, trying to provide information to other lexical agents to permit them to do the same. The meaning of wash in context depends on a number of factors, including the nature of the words succeeding it, and their own actions in determining their meaning and role contributions in context. The wash expert must thus prepare for a number of contingencies, or different things that could happen in the text, and then wait to see if any of them actually occur. If the word up appears to the right of wash, for example, the words could choose to pair up into a meaningful fragment (as in throw in above). Under certain conditions, the word up could appear later on in the text, and still pair up with wash (as must occur for correct interpretation of the example sentence).
What are the contextual conditions that would permit this? One of the contingencies that wash anticipates is the grouping of the words to its right into a meaningful fragment of their own (i.e., a concept structure). The wash expert knows that (a) the nature of this concept structure may be important for its own sense disambiguation, and (b) that the word immediately following the meaningful fragment could pair up with it. In the jargon of WEP, one of the experts in such a meaningful fragment reports a concept structure. Since the up expert does not reply to dishes with an acceptable message to continue the ongoing concept building activity, the dishes expert reports the structure. It is this report that triggers some new processing by the wash expert, namely the examination of the next word (i.e., up).
The rest of the analysis takes place predictably. The wash expert interacts with up as if up occurred to its immediate right in the text. The pairing up of the two words results from mutual accord, and the wash expert creates a concept structure to represent the meaning of the washing up of dishes. Next wash organizes the conceptual object, the colorful dishes, into the overall meaning of the sentence, and again waits for things to happen. This time, the word expert for the period at the end of the sentence executes, and transmits an appropriate message. The wash expert again executes, cleans up its business, and reports the concept structure representing the meaning, in context, of the entire fragment (sentence).
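The waiting behaviour of the wash expert in example (3) can be approximated with a Python generator that suspends at each wait and resumes when an event arrives. This is only a sketch under stated assumptions: the event tuples, the sense labels `WASH` and `WASH-UP`, and the `run` driver are hypothetical, and the interaction with the period expert is omitted.

```python
def wash_expert():
    """Suspends while waiting for a contingency; resumes when an event is sent in."""
    event = yield                      # wait for whatever follows "washes"
    obj = None
    if event[0] == "concept":          # the words to the right reported a concept
        obj = event[1]
        event = yield                  # the report triggers a look at the next word
    # Pair with "up" whether it is adjacent or follows the reported concept.
    sense = "WASH-UP" if event == ("word", "up") else "WASH"
    yield {"action": sense, "object": obj}


def run(events):
    """Drive the expert with a sequence of events and return its last report."""
    expert = wash_expert()
    next(expert)                       # start executing until the first wait
    result = None
    for event in events:
        result = expert.send(event)
    return result
```

Driving it with `[("concept", "COLORFUL-DISHES"), ("word", "up")]` corresponds to the example sentence: the report from dishes wakes the expert, which then treats up as if it stood immediately to the right of wash.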
Passives and Relative Clauses
Sentences in the passive voice and those containing relative clauses are similar in being complex structural phenomena in natural language, and often suggestive of sentence-level rules as linguistic explanation. Furthermore, the understanding of such constructions by the distributed word-based approach of WEP may be far from evident, especially considering my claim that no explicit notions of structure are referenced by the computer system or used in the theory. Interpretation of textual fragments containing complex syntactic structures takes place through complex patterns of lexical interactions among the appropriate word experts. The words that normally cue a reader about the presence of such structural relations in a fragment are the ones in WEP that coordinate the process of understanding them.
The analysis of a passive sentence involves linguistic interactions among the word experts for the suffix en, the word by, and the other words composing it. The following sentence has been parsed by the existing WEP system, and discussed at length in [Small, 1980].
(4) "The case was thrown out by federal court."
The en expert begins executing before throw, and the normal attempts by the throw expert to coordinate the analysis of the fragment in which it participates are intercepted by en. The actions of en allow throw to pair up with out, as outlined above for throw in and wash up, but its lexical interactions to determine the nature of the object being "thrown out", and the agent doing the "throwing", are all intercepted by the en expert, which provides throw with the correct replies to its queries. Please refer to [Small, 1980] and [Small, 1981] for a fuller discussion.
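The interception described for example (4) can be sketched as follows. The point of the illustration is that the throw expert is the same in the active and passive cases; only the party answering its queries changes. The function names, the query names `agent` and `object`, and the concept labels are assumptions made for this sketch and are not drawn from [Small, 1980].

```python
def active_answers(left, right):
    # Default replies: the concept on the left is the agent, the one on the right the object.
    return {"agent": left, "object": right}


def en_intercept(left, by_phrase):
    # Hypothetical en expert: it answers throw's queries itself, giving the concept
    # on the left as the object and the concept introduced by "by" as the agent.
    return {"agent": by_phrase, "object": left}


def throw_expert(particle, ask):
    """Pair with a particle if present, then query for agent and object via `ask`."""
    sense = "REJECT" if particle == "out" else "PROPEL"
    return {"action": sense, "agent": ask("agent"), "object": ask("object")}


passive = throw_expert("out", en_intercept("THE-CASE", "FEDERAL-COURT").get)
active = throw_expert("out", active_answers("FEDERAL-COURT", "THE-CASE").get)
```

Both calls produce the same result, with `FEDERAL-COURT` as agent and `THE-CASE` as object, even though `throw_expert` contains no notion of voice.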
Relative clauses beginning with the word who are analyzed by WEP through the interactions among the who word expert and the experts for the other words in the clause and the larger fragment containing it. The following sentence is an example of such a fragment.
(5) "The man who throws the game likes to lose."
The who expert in this sentence has the responsibility for interacting with the word expert for likes to inform likes about the man doing the "liking". Ordinarily, the likes expert would expect to find a meaningful lexical sequence to its left representing the needed concept. However, the particular structure of the fragment means that who must be at the other end of the relevant linguistic interactions of likes, rather than the expert for the word to its immediate left, which would normally perform the needed service.
The WEP interpretation of the example fragment proceeds as follows. The word experts for the and man agree to form a meaningful sequence and construct a concept structure to represent its meaning. The who expert begins executing, gets hold of this concept, and waits for the throw expert to start exploring the nature of the lexical sequence on its left. In addition, the who expert anticipates that another word expert further down the line (in the example, the expert for likes) will also seek out information about the sequence to its left, in exactly the way that throw does. The who expert, like every word expert in WEP, plans a strategy to interact with the experts involved in both its prior context and its subsequent context, cooperatively to interpret fragments of text.
The throw expert begins executing and investigates the nature of the lexical sequence to its left. The who expert provides the appropriate information, i.e., the concept structure representing the man, and throw begins to disambiguate its meaning in context. The experts for the and game mutually agree on their local meaning, and through linguistic and idiosyncratic interactions with throw help it determine its meaning as the "throwing of a contest". The likes expert starts executing, and its messages in search of the person doing the "liking" are intercepted by the who expert, which has been on the lookout for such interactions since the beginning. Since the who expert knows the unique name of the concept structure representing the man, it sends this concept to likes, which proceeds normally, knowing nothing of the structural complexities preceding it.
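A minimal sketch of the who expert's role in example (5), under stated assumptions: the `WhoExpert` class, the `subject` query name, the `ask_left` routing helper, and the concept labels are all hypothetical. The sketch shows only the interception, not the disambiguation of throw.

```python
class WhoExpert:
    """Holds the concept built to its left and answers leftward queries with it."""

    def __init__(self, head_concept):
        self.head = head_concept       # e.g. the concept reported for "the man"

    def intercept(self, query):
        # Answer queries about who performs an action; ignore everything else.
        return self.head if query == "subject" else None


def ask_left(query, interceptors, default):
    """Route a query to any waiting interceptor before the immediate left neighbor."""
    for expert in interceptors:
        reply = expert.intercept(query)
        if reply is not None:
            return reply
    return default


who = WhoExpert("THE-MAN")
thrower = ask_left("subject", [who], default=None)        # query from throw
liker = ask_left("subject", [who], default="THE-GAME")    # query from likes
```

Without the interceptor, the query from likes would be answered by the sequence immediately to its left (`THE-GAME` here); with it, both verbs receive `THE-MAN`, and likes proceeds as if nothing unusual preceded it.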
The word experts for both throw and for likes can be expected to explore the underlying meaning of the lexical sequences preceding them. Note the way that WEP applies this linguistic knowledge to the interpretation of fragments of natural language text containing these words. Rather than saying that throw and likes act as finite verbs in certain contexts (which are described in some relational representational scheme, such as grammar rules or logic), we say instead that these words carry on linguistic interactions with the active processes modelling the other words making up the (local linguistic) context to arrive at a mutually acceptable characterization of their individual contributions to textual meaning. The advantage of this perspective comes from the fact that linguistic interactions constitute but a portion of all possible lexical interactions that represent in WEP the process of understanding.
While it is clear that certain lexical sequences cannot be understood solely through recourse to syntax and semantics, namely those fragments for which idiosyncratic interactions are required (i.e., specific remembered contexts), why do we need other kinds of general knowledge? We have already seen examples suggesting the answer to this question. In trying to understand the meaning of throw in the towel, the relevant word experts must find out some things about the person performing the described action, before knowing what action he is in effect performing.
If the discourse describes some sort of competition between two people (or teams), for example, throw in the towel could indicate a concession of defeat by one of them. The following fragment illustrates such a contextual situation.
(6) "Rick and Joanie play chess. Rick throws in the towel."
On the other hand, if the discourse has recently made reference to a place where one might dispose of a towel, throw in the towel might be signifying the putting of some towel in that place. The following example illustrates this case.
(7) "Joanie drops a penny in the pit. Rick throws in the towel."
I am not claiming that knowledge of the discourse context is sufficient to disambiguate the meanings of the example sentence, but rather, that such knowledge is required to understand it.
The discourse interactions required to interpret the above example take place (a) between the throw expert and a higher order process modelling the activity context, and (b) between the in expert and a process modelling the discourse focus of attention.* There are two aspects to the processing of the activity mechanism: the unsolicited sending of control signals to indicate the anticipation of certain actions in the text and concept structures to represent them, and the more data-directed interactions with word experts (and other understanding processes) to determine the nature of the actions that actually do occur. The throw expert must carry on activity context interactions to determine if the discourse could be seen as discussing some competitive activity. If so, the "concession of defeat" interpretation of the example sentence is plausible. The in expert carries on focus of attention interactions to find out if some location has recently been described in the text in which something might be thrown.
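The two discourse interactions for throw in the towel can be summarized in a small sketch. It assumes, purely for illustration, that the activity context can be queried as a boolean and the focus of attention as a list of recently mentioned locations; the function name and reading labels are hypothetical. It returns every reading the discourse makes plausible rather than a single answer, in keeping with the claim that discourse knowledge is required but not necessarily sufficient.

```python
def readings_of_throw_in_the_towel(activity_is_competitive, focus_locations):
    """Collect the readings made plausible by the two discourse processes."""
    readings = []
    # (a) throw expert <-> activity context: is some competition under discussion?
    if activity_is_competitive:
        readings.append("CONCEDE-DEFEAT")
    # (b) in expert <-> focus of attention: was a suitable location recently described?
    if focus_locations:
        readings.append(("PUT-TOWEL-IN", focus_locations[-1]))
    return readings


chess_context = readings_of_throw_in_the_towel(True, [])        # as in example (6)
pit_context = readings_of_throw_in_the_towel(False, ["PIT"])    # as in example (7)
```

When neither process can help, the list is empty and other kinds of interaction would have to decide the meaning.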
While the WEP system has been directed toward the understanding of fragments of text occurring in textual discourse, the issues arising in the interpretation of dialogue are very similar. The difference between the two tasks involves the nature of discourse interactions. In interpreting fragments of dialogue from the vantage point of one of the participants, word experts must interact with model processes monitoring the goals of the other participant. The following example (provided by James Allen) illustrates the question.
(8) "When is the Windsor train?"
In trying to understand this question from the perspective of the person at the information desk of a train station, the question could be directed at eliciting either of two pieces of information [Allen, 1978], i.e., the time of the next arrival from Windsor, or the time of the next departure to Windsor.
By saying that the Windsor train is a "noun-noun pair", we get nowhere in trying to understand it. In WEP, the word experts for Windsor and train would interact locally and determine the range of possible interpretations for the fragment. In the case of textual discourse, the train expert would carry on discourse interactions with the activity process to find out if discussion of some particular train were anticipated in the text. In the case of dialogue, these interactions would occur between train and an intention mechanism, which might determine that the speaker in the dialogue is concerned with the trains coming from Windsor, and not with the trains leaving for Windsor. If the processes modelling the activity context or the speaker intentions cannot provide help to the train expert, the word experts for the sequence would construct a concept structure to represent the disjunct of the two possibilities, but continue to await the information that would decide between them.
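The fallback to a disjunctive concept structure can be sketched as below. The `intention` argument stands in for whatever the intention or activity mechanism reports; its values, the concept labels, and the tuple encoding of the disjunct are assumptions made for this example.

```python
def windsor_train(intention=None):
    """Resolve "the Windsor train", or keep both readings when no help is available."""
    candidates = {
        "arrival": "TRAIN-FROM-WINDSOR",     # speaker is meeting a train
        "departure": "TRAIN-TO-WINDSOR",     # speaker is catching a train
    }
    if intention in candidates:
        return candidates[intention]
    # No help from the discourse processes: represent the disjunct and keep waiting.
    return ("OR",) + tuple(candidates.values())
```

A later report from the intention mechanism would then replace the `("OR", ...)` structure with one of its members.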
The understanding of these fragments is coordinated in WEP by the word expert for the affix -ing. The -ing expert interacts linguistically with the experts for the words around it, helping them form meaningful sequences, and carries on logical interactions with the belief modelling process to determine the relative plausibility of the two propositions possibly signified by the larger sequence. In the first case above, the -ing expert begins executing after the experts for "the" and "man" have already started constructing a concept structure to represent the meaning of "the man". It awaits the report of this concept structure, as well as the one to be reported by the tiger word expert. Furthermore, -ing carries on linguistic interactions with eat to arrive cooperatively at a concept structure representing its meaning. The -ing expert then has a plausibility interaction with the belief modeller, and coordinates the remainder of the understanding process based on this important knowledge.
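The plausibility interaction can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the `Proposition` and `BeliefModeller` classes, the `ing_expert` function, the reduction of each concept structure to a plain string, and the numeric plausibility scores are all hypothetical and are not taken from the WEP program.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Proposition:
    """A candidate meaning: some agent performs an action on an object."""
    action: str
    agent: str
    obj: str


class BeliefModeller:
    """Toy belief model mapping propositions to plausibility scores."""

    def __init__(self, scores: Dict[Proposition, float]) -> None:
        self._scores = dict(scores)

    def plausibility(self, proposition: Proposition) -> float:
        # Propositions the model knows nothing about get a neutral score.
        return self._scores.get(proposition, 0.5)


def candidate_propositions(left: str, action: str, right: str) -> List[Proposition]:
    """The two readings: either noun may be the agent of the action."""
    return [
        Proposition(action, agent=left, obj=right),
        Proposition(action, agent=right, obj=left),
    ]


def ing_expert(left: str, action: str, right: str,
               beliefs: BeliefModeller) -> Proposition:
    """Pick the reading that the belief modeller finds more plausible.

    `left` and `right` stand for the concept structures reported by the
    neighbouring experts; `action` stands for the result of the linguistic
    interaction with the verb expert.
    """
    candidates = candidate_propositions(left, action, right)
    return max(candidates, key=beliefs.plausibility)
```

For example, a belief modeller that scores `Proposition("eat", agent="tiger", obj="man")` higher than its converse leads `ing_expert("man", "eat", "tiger", beliefs)` to select the reading in which the tiger does the eating.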
Word Expert Parsing is a linguistic theory based on a lexical organization of linguistic knowledge represented procedurally in word experts. The comprehension of fragments of natural language text is viewed as a process of word interactions, where active lexical agents cooperate to form meaningful sequences of interrelated lexical items. Lexical interactions are of four types: idiosyncratic, linguistic, discourse, and logical. Idiosyncratic interactions allow WEP to explain the understanding of idiomatic (more or less idiomatic) lexical sequences, by comparing new sequences with explicitly remembered ones (called prefabs by Bolinger [1979]). Linguistic interactions enable the use of syntactic and semantic generalizations to interpret fragments, and discourse interactions provide word experts with knowledge of discourse activities and foci of attention. Logical interactions allow word experts to use knowledge about the real world, especially about the multiple perspectives of individual conceptual objects within it and the relative plausibility of propositions about it.
The theory presented in this paper was originally conceived in wonderful cooperation with Chuck Rieger over the past several years. Some of the sensible perspectives in the paper have benefitted from much appreciated written and spoken suggestions and criticisms of Yorick Wilks, Dick Hudson, Dwight Bolinger, Pat Hayes, and James Allen. The Groupe d'Intelligence Artificielle of the University Paris VIII - Vincennes has provided an excellent environment for research. Thanks to Patrick Greussay, Harald Wertz, Daniel Goosens, Annette Cattenat, and Gerard Paul. Many extra thanks to Patrick for all the personal and professional help he has given in making my year at Vincennes both valuable and enjoyable.
Allen, James (1978), Recognizing Intention in Dialogue, Technical Report, Department of Computer Science, University of Toronto.
Bolinger, Dwight (1975), Aspects of Language, Harcourt Brace Jovanovich.
Bolinger, Dwight (1979), Meaning and Memory, in Experience Forms, Haydn (ed.), Mouton.
Charniak, Eugene (1977), Ms. Malaprop, A Language Comprehension Program, Fifth International Joint Conference on Artificial Intelligence.
Chomsky, Noam (1965), Aspects of the Theory of Syntax, MIT Press.
Dik, Simon C. (1978), Functional Grammar, North Holland.
Fillmore, Charles J. (1968), The Case for Case, in Universals in Linguistic Theory, Bach and Harms (eds.), Holt.
Grosz, Barbara (1977), The Representation and Use of Focus in Dialogue Understanding, Technical Note #151, Stanford Research Institute.
Hudson, Richard A. (1979), Pan-Lexicalism, Working Paper, Department of Phonetics and Linguistics, University College London.
Jackendoff, Ray (1972), Semantic Interpretation in Generative Grammar, MIT Press.
Jackendoff, Ray (1976), Toward an Explanatory Semantic Representation, Linguistic Inquiry, v. 7, n. 1.
Kaplan, Ronald M., and Joan W. Bresnan (1980), Lexical-Functional Grammar: A Formal System for Grammatical Representation, Occasional Paper #13, MIT Center for Cognitive Science.
Marcus, Mitchell P. (1979), An Overview of a Theory of Syntactic Recognition for Natural Language, AI Memo #531, MIT Artificial Intelligence Laboratory.
Miller, George A. (1978), Semantic Relations among Words, in Linguistic Theory and Psychological Reality, Halle, Bresnan, and Miller (eds.), MIT Press.
Mitchell, T. F. (1971), Linguistic "goings on"; collocations and other lexical matters arising on the syntactic record, Archivum Linguisticum, v. 2.
Rieger, Chuck (1977), Viewing Parsing as Word Sense Discrimination, in A Survey of Linguistic Science, Dingwall (ed.), Greylock Publishers.
Riesbeck, Christopher K., and Roger C. Schank (1976), Comprehension by Computer: Expectation-Based Analysis of Sentences in Context, Research Report #78, Department of Computer Science, Yale University.
Schank, Roger C., and Robert P. Abelson (1975), Scripts, Plans, and Knowledge, Fourth International Joint Conference on Artificial Intelligence.
Small, Steven (1980), Word Expert Parsing: A Theory of Distributed Word-Based Natural Language Understanding, Technical Report #954, Department of Computer Science, University of Maryland.
Small, Steven (1981), Toward a Cognitive Mechanics for Distributed Modelling, Technical Report, Department of Computer Science, University of Rochester (to appear).
Smith, Edward E., Lance J. Rips, and Edward J. Shoben (1974), Semantic Memory and Psychological Semantics, in The Psychology of Learning and Motivation, v. 8, Bower (ed.), Academic Press.
Wilks, Yorick (1980), Some Thoughts on Procedural Semantics, Cognitive Studies Centre Report #1, University of Essex.