VIEWING WORD EXPERT PARSING AS LINGUISTIC THEORY*
Department of Computer Science
Mathematical Sciences Building
University of Rochester
Rochester, New York 14627
ABSTRACT
The Word Expert Parser is a computer program
that analyzes fragments of natural language text in
order to extract their meaning in context. The
construction of the program has led to the
development of a linguistic theory based on notions
orthogonal to those traditionally found at the
heart of such theories. Word Expert Parsing
explains the understanding of textual fragments
containing highly idiosyncratic elements, such as
idioms, collocations, cliches, and colligations, as
well as lexical sequences that contain interesting
structural phenomena. The theory perceives the
individual word of language as the organizing unit
for linguistic knowledge, and views understanding
as consisting of lexical interactions among
procedural word experts. This paper describes four
classes of lexical interaction required to explain
the understanding of sentences in context:
idiosyncratic interaction, linguistic interaction,
discourse interaction, and logical interaction.
The paper purposely avoids programming details in
order to focus on Word Expert Parsing as linguistic
theory.
1. INTRODUCTION
The Word Expert Parser (WEP) is a computer
program that analyzes fragments of natural language
text in order to extract their meaning in context.
The system has been developed with particular
attention paid to the wide variety of different
meaning roles of words when appearing in
combination with other words. The theoretical
position advanced by WEP about the nature of
individual words is that words have no meaning per
se, but rather, that fragments of lexical items
mean something through their interrelationships.
Furthermore, the character of lexical relations
runs the gamut from the simple direct knowledge
that some word sequence represents some remembered
concept, to the more analytical knowledge that
particular kinds of lexical sequences often
represent certain classes of conceptual notions.
* The initial research motivating the ideas
described in this report was supported by grant
NSG-7253 from the National Aeronautics and Space
Administration to the University of Maryland.
During the writing of this paper, the author was
supported by a Fulbright Lectureship in Artificial
Intelligence at the Université de Paris, and
computer facilities were provided by the Institut
de Recherche et Coordination Acoustique/Musique.
The support of NASA, the Fulbright-Hays program,
and IRCAM is gratefully acknowledged.
Small
Groupe d'Intelligence Artificielle
Département d'Informatique
Université Paris VIII - Vincennes
93200 Saint Denis FRANCE
The evolution of this perspective started with
the observation that the understanding of a
particular fragment of text depends fundamentally
on the disambiguation of the individual words
composing it. Knowing the contextual meanings of
the words is tantamount to understanding the
meaning of the overall fragment. Another way of
saying the same thing [Rieger, 1977] is that
language interpretation can be ultimately viewed as
a process of word sense discrimination.
Unfortunately, this perspective does not eliminate
the classic problems of deciding the nature of a
distinct word sense, the difference between
different usages, word senses, and idioms, and so
forth. The solution to these problems comes in
realizing that the process of understanding the
meaning of words in context does not require
reference to those notions at all. The design of a
parsing procedure based on determining the meaning
roles of individual words in context has led to the
orthogonal linguistic notions that are the subject
of this paper.
The organization of WEP is founded on the
belief that the grouping together of words to form
meaningful sequences is an active process which
succeeds only because of highly idiosyncratic
application of lexical knowledge. That is, we
fragment text and understand the meaning of the
pieces because we know how the particular words
involved interact with each other.
Sometimes sequences of two or more words
interact together to such an extent that they seem
to behave as a single lexical item. Linguists have
labelled such sequences idioms. The notion to
which this definition gives rise, however, causes
several problems for linguistic theory. First of
all, rarely does such a sequence hold together so
tightly that it can be truly treated theoretically
as a single lexical item. Secondly, rarely does
such a sequence have a unique meaning. More often
than not, the meaning of an idiomatic expression
must be determined by disambiguation. The sequence
must be analyzed in context and be treated by
comprehension processes as being either (a) a
cohesive whole with idiosyncratic meaning, or (b) a
sequence having meaning through less specific
language knowledge. There is no a priori way of
knowing the meaning of the sequence to be the one
or the other.
The notion of idiom falls at one end of a
spectrum, an idealized end that I claim does not
exist. Lexical sequences can be more or less
idiomatic, in the sense that the process
interactions constituting the understanding of them
include greater or fewer idiosyncratic
interactions. The WEP way of looking at the most
idiomatic sequences is that the special
interactions among the participating words take priority over any other potential interactions involving those words. The disambiguation of idiomatic expressions, i.e., the understanding of the sequences as either idioms or non-idioms (to use the popular distinction), generally requires other process interactions besides the strictly word-specific ones. The understanding of an idiom thus differs insignificantly, from the perspective of WEP theory, from comprehension of any other kind (according to whatever classification scheme) of lexical sequence.

The notion that all fragments of language are more or less idiomatic, while radical in some linguistic quarters, has been previously suggested. In his introductory textbook, Aspects of Language, Dwight Bolinger asks "whether everything we say may be in some degree idiomatic -- that is, whether there are affinities among words that continue to reflect the attachments the words had when we learned them, within larger groups" [Bolinger, 1975]. After working within what he calls "the prevailing reductionism", Bolinger began to suggest a positive answer to his pedagogical question, choosing to take "an idiomatic rather than an analytical view" [Bolinger, 1979] of language. The contribution of artificial intelligence in general, and of Word Expert Parsing in particular, is to develop theory from this informal view. The notion of process, and of process interaction, allows us to begin to do just that.

2. THE WORD EXPERT PARSER

The WEP computer system maintains linguistic knowledge across a community of word-based structures called word experts, which represent the process of determining the contextual meaning and role of the individual words. A word expert must not be thought of as a representation for the various meanings, roles, and contributions of a word in context, but rather as a declarative representation (a network) of the process (which we shall call disambiguation) of determining these things. Certainly, it is the meaning contributions of individual lexical items that we wish to determine. Word experts are both data and process; they can be augmented, examined, and manipulated as data, yet parsing takes place through their interpretation as program by an expert evaluator, similar to the EVAL of Lisp.
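The dual nature of a word expert, as data and as process, can be illustrated with a minimal sketch. It assumes an expert is stored as a small decision network and run by a generic evaluator; the node names, questions, and sense labels (GIVE-UP, REJECT, and so on) are invented for illustration and are not taken from the WEP system.

```python
# Hypothetical sketch: a word expert stored as plain data (a small decision
# network) and run by a generic evaluator. All node names, questions, and
# sense labels are invented for illustration.

# Each node is ("ask", question, {answer: next_node}) or ("report", sense).
THROW_EXPERT = {
    "start": ("ask", "next_word", {"in": "check_towel", "out": "n_reject"}),
    "check_towel": ("ask", "object", {"towel": "n_give_up"}),
    "n_give_up": ("report", "GIVE-UP"),
    "n_reject": ("report", "REJECT"),
    "default": ("report", "PROPEL-OBJECT"),
}


def evaluate(expert, context):
    """Interpret an expert network against a context of already-known answers."""
    node = "start"
    while True:
        kind, *rest = expert[node]
        if kind == "report":
            return rest[0]
        question, branches = rest
        answer = context.get(question)
        # Unknown or unexpected answers fall through to the default sense.
        node = branches.get(answer, "default")


# The expert is data: it can be examined and augmented like any dictionary.
THROW_EXPERT["start"][2]["away"] = "n_discard"
THROW_EXPERT["n_discard"] = ("report", "DISCARD")

print(evaluate(THROW_EXPERT, {"next_word": "in", "object": "towel"}))
print(evaluate(THROW_EXPERT, {"next_word": "away"}))
```

The same dictionary is both inspected (the augmentation step) and executed (the call to evaluate), which is the sense in which the sketch mirrors the data-and-process claim.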

The distributed parsing scheme of WEP works as follows. The WEP reader examines a word of text and retrieves its word expert from memory. The word expert starts executing, trying to determine the meaning role of its word in context, i.e., interacting with other word experts and with higher-order system processes to acquire the appropriate contextual knowledge to make the correct inferences. Finally, all the word experts for a particular fragment of text come to mutual agreement on the meaning of the fragment, and the local distributed process terminates. The process is local in the sense that, as long as there remains input text, the overall parsing process continues while the disambiguation of the individual lexical sequences making up the larger text completes.
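One way to picture the "mutual agreement" step is as a loop that reruns every expert until no expert changes its mind. The following is a minimal sketch under that assumption; the function names, the sense labels, and the fixed-point formulation are all hypothetical and are not a description of the actual WEP control structure.

```python
# Hypothetical sketch of the local agreement loop. Each expert is a function
# from the current guesses of all experts to its own guess; the sense labels
# are invented for illustration.

def throw_expert(senses):
    # Idiosyncratic knowledge: "throw in the towel" as a remembered sequence.
    if "in" in senses and senses.get("towel") == "TOWEL":
        return "GIVE-UP"
    return "PROPEL"


def in_expert(senses):
    return "PARTICLE" if senses.get("throw") == "GIVE-UP" else "LOCATION"


LEXICON = {
    "throw": throw_expert,
    "in": in_expert,
    "the": lambda senses: "DEFINITE",
    "towel": lambda senses: "TOWEL",
}


def parse_fragment(words, lexicon, max_rounds=10):
    """Run the experts until no expert changes its guess (mutual agreement)."""
    senses = {word: None for word in words}  # assumes no repeated words
    for _ in range(max_rounds):
        changed = False
        for word in words:
            guess = lexicon[word](senses)  # retrieve the expert and run it
            if guess != senses[word]:
                senses[word] = guess
                changed = True
        if not changed:
            return senses
    raise RuntimeError("experts did not reach agreement")


print(parse_fragment(["throw", "in", "the", "towel"], LEXICON))
```

In this toy run the throw expert first guesses PROPEL, then revises its guess once the towel expert has reported, which is a crude stand-in for the waiting and message exchange described above.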

Interaction, between individuals in the world, or between distributed processes in a computer program, requires both (a) giving information and (b) receiving information. In WEP, the experts exchange two kinds of information, called concept structures and control signals. Concept structures represent human concepts, such as "a book", "going fishing", "the box of candy I gave Joanie for Valentine's day in 1981", "some blue physical object", and the like. Control signals represent processing clues, such as "expect a word that can begin a lexical sequence that can describe concept structure X", "send me the concept structure representing the agent of concept structure Y or a signal saying you cannot", "wait a second and you will be sent a concept structure that will help you", and similar things. The representation and use of concepts and signals are described fully in [Small, 1980].
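The two kinds of message can be sketched as two small record types. This is only an illustration: the field names, the signal kinds, and the answer_request helper are assumptions made here, and the actual representations are those given in [Small, 1980].

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptStructure:
    """A concept such as "a book" or "going fishing" (fields are hypothetical)."""
    name: str
    description: str
    roles: dict = field(default_factory=dict)  # role name -> ConceptStructure


@dataclass
class ControlSignal:
    """A processing clue such as "expect", "request", "wait", or "cannot"."""
    kind: str
    sender: str
    about: Optional[str] = None  # name of the concept structure concerned
    role: Optional[str] = None   # role requested, e.g. "agent"


def answer_request(signal, known):
    """Reply with the requested concept, or with a signal saying we cannot."""
    target = known.get(signal.about)
    if target is not None and signal.role in target.roles:
        return target.roles[signal.role]
    return ControlSignal(kind="cannot", sender="responder",
                         about=signal.about, role=signal.role)


joanie = ConceptStructure("X", "Joanie")
fishing = ConceptStructure("Y", "going fishing", roles={"agent": joanie})
request = ControlSignal("request", sender="expert-a", about="Y", role="agent")
print(answer_request(request, {"Y": fishing}).description)
```

The request mirrors the quoted signal "send me the concept structure representing the agent of concept structure Y or a signal saying you cannot": the reply is either a concept structure or a control signal of kind "cannot".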

3. LEXICAL INTERACTIONS

I use the term lexical interaction to denote the sending and receiving of control signals and concept structures by word experts in WEP. This includes interactions between individual experts, as well as those between a word expert and another kind of model process (e.g., a mechanism inferring the goals of a dialogue participant). This paper discusses lexical interactions by presenting four classes of required interaction, and then arguing for the necessity and giving examples of each. The categorization is by the kind of knowledge exchanged in the communication, and includes the following.

Idiosyncratic Interaction

Linguistic Interaction

Discourse Interaction

Logical Interaction

The least general class of lexical interactions is considered idiosyncratic, since its members are word-specific and arise through simple recall memory. This type of interaction permits the understanding of idiomatic fragments. General knowledge about the syntax and semantics of some natural language gives rise to linguistic interactions, which are of course crucial to the understanding of lexical sequences not previously seen. Sometimes words interact with processes that monitor the development of an entire text (or parts thereof), or the goals of participants in discussion. These discourse interactions are often necessary for the meaningful cohesion of lexical fragments. Lastly, but certainly not least important, are the logical interactions between words and the most general cognitive processes. Perceptions about the world, beliefs, inference-making skills, rote memory, and so forth, are basic to language understanding.

The classification of word fragments into categories such as "idiom", "collocation", "colligation", "noun phrase", "complement", and the like, does not make sense in WEP theory. Rather, individual words are viewed as having certain kinds and sequences of interactions with their neighbors to form meaningful pieces of text. Fragments often described as "idioms" are those that are understood principally through idiosyncratic lexical interactions. A non-idiomatic structure, diagnosed as a "noun phrase", is one that involves mostly linguistic interactions to understand. A so-called "noun-noun pair" can be thought of as a lexical sequence comprehended with the help of logical interactions, with recourse to common sense memory and skills.

3.1 Idiosyncratic Interaction

Since the emphasis of the WEP research effort is to construct a computer program to understand natural language, we are not qualified to make

Noun Phrases

How can the purely lexical WEP system require no notion of high order structural phenomena, yet still be able to account for them? The following example (provided by Yorick Wilks) illustrates the lexical interactions required to analyze an interesting fragment of text.

(3) "Joanie washes the colorful dishes up."

The difficulty with this fragment is in determining that the word dishes contributes to the meaning of the fragment through interactions with the two words to its left, but that the word up contributes by association with the word washes, which precedes up by many intervening words. The reason that I am avoiding the use of traditional linguistic jargon for describing this phenomenon is the following belief: An understanding of WEP requires the viewing of language interpretation from the vantage point of the individual word and its interactions. An important way to achieve this is to describe the analysis process with reference to the very notions (not the traditional ones) around which it is organized.

In the analysis of the example fragment, WEP would find the referent of Joanie and then proceed as follows. The wash expert would begin executing, trying to determine its own meaning role in some lexical fragment, and at the same time, trying to provide information to other lexical agents to permit them to do the same. The meaning of wash in context depends on a number of factors, including the nature of the words succeeding it, and their own actions in determining their meaning and role contributions in context. The wash expert must thus prepare for a number of contingencies, or different things that could happen in the text, and then wait to see if any of them actually occur. If the word up appears to the right of wash, for example, the words could choose to pair up into a meaningful fragment (as in throw in above). Under certain conditions, the word up could appear later on in the text, and still pair up with wash (as must occur for correct interpretation of the example sentence).

What are the contextual conditions that would permit this? One of the contingencies that wash anticipates is the grouping of the words to its right into a meaningful fragment of their own (i.e., a concept structure). The wash expert knows that (a) the nature of this concept structure may be important for its own sense disambiguation, and (b) that the word immediately following the meaningful fragment could pair up with it. In the jargon of WEP, one of the experts in such a meaningful fragment reports a concept structure. Since the up expert does not reply to dishes with an acceptable message to continue the ongoing concept building activity, the dishes expert reports the structure. It is this report that triggers some new processing by the wash expert, namely the examination of the next word (i.e., up).

The rest of the analysis takes place predictably. The wash expert interacts with up as if up occurred to its immediate right in the text. The pairing up of the two words results from mutual accord, and the wash expert creates a concept structure to represent the meaning of the washing up of dishes. Next wash organizes the conceptual object, the colorful dishes, into the overall meaning of the sentence, and again waits for things to happen. This time, the word expert for the period at the end of the sentence executes, and transmits an appropriate message. The wash expert again executes, cleans up its business, and reports the concept structure representing the meaning, in context, of the entire fragment (sentence).
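The walk-through of sentence (3) can be condensed into a toy procedure. The sketch below makes strong simplifying assumptions that are not part of WEP: the input is already reduced to root forms with the subject removed, the particle table and the set of words that may extend the object fragment are hard-coded, and the message passing between experts is collapsed into ordinary control flow.

```python
# Toy walk-through of sentence (3). The tables and the rule for closing a
# fragment are simplifications assumed for this sketch.
PARTICLES = {"wash": {"up"}}                     # idiosyncratic knowledge of "wash"
FRAGMENT_WORDS = {"the", "colorful", "dishes"}   # words that can extend the object


def analyze(words):
    verb, rest = words[0], words[1:]
    trace = []
    # Contingency 1: the particle sits immediately to the right of the verb.
    if rest and rest[0] in PARTICLES.get(verb, set()):
        trace.append(f"{verb} pairs with adjacent {rest[0]}")
        return {"action": f"{verb} {rest[0]}", "object": rest[1:], "trace": trace}
    # Contingency 2: the words to the right group into a fragment of their own.
    fragment = []
    i = 0
    while i < len(rest) and rest[i] in FRAGMENT_WORDS:
        fragment.append(rest[i])
        i += 1
    trace.append(f"fragment reported: {' '.join(fragment)}")
    # The report triggers the verb expert to examine the next word.
    if i < len(rest) and rest[i] in PARTICLES.get(verb, set()):
        trace.append(f"{verb} pairs with delayed {rest[i]}")
        return {"action": f"{verb} {rest[i]}", "object": fragment, "trace": trace}
    return {"action": verb, "object": fragment, "trace": trace}


result = analyze(["wash", "the", "colorful", "dishes", "up"])
print(result["action"], "|", " ".join(result["object"]))
```

The point of the sketch is only the ordering of events: the delayed particle is examined after the intervening fragment has been reported, as if it had occurred immediately to the right of the verb.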

Passives and Relative Clauses

Sentences in the passive voice and those containing relative clauses are similar in being complex structural phenomena in natural language, and often suggestive of sentence-level rules as linguistic explanation. Furthermore, the understanding of such constructions by the distributed word-based approach of WEP may be far from evident, especially considering my claim that no explicit notions of structure are referenced by the computer system or used in the theory. Interpretation of textual fragments containing complex syntactic structures takes place through complex patterns of lexical interactions among the appropriate word experts. The words that normally cue a reader about the presence of such structural relations in a fragment are the ones in WEP that coordinate the process of understanding them.

The analysis of a passive sentence involves linguistic interactions among the word experts for the suffix en, the word by, and the other words composing it. The following sentence has been parsed by the existing WEP system, and discussed at length in [Small, 1980].

(4) "The case was thrown out by federal court."

The en expert begins executing before throw, and the normal attempts by the throw expert to coordinate the analysis of the fragment in which it participates are intercepted by en. The actions of en allow throw to pair up with out, as outlined above for throw in and wash up, but its lexical interactions to determine the nature of the object being "thrown out", and the agent doing the "throwing", are all intercepted by the en expert, which provides throw with the correct replies to its queries. Please refer to [Small, 1980] and [Small, 1981] for a fuller discussion.
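The interception performed by en can be sketched as a change in who answers the queries sent by throw. In the following illustration the function names and the string-valued concepts are hypothetical; the only claim carried over from the text is that throw asks the same questions in both cases and never learns that a different expert replied.

```python
# Hypothetical sketch of sentence (4): queries from the throw expert are
# answered either by the default left/right lookup or by the en expert.

def active_lookup(role, left, right):
    """Default replies: the agent sits to the left, the object to the right."""
    return left if role == "agent" else right


def en_intercept(role, left, by_phrase):
    """Replies from en: the left concept is the object; the agent follows by."""
    return by_phrase if role == "agent" else left


def throw_expert(ask):
    """throw only sends queries; it does not know who answers them."""
    return {"action": "throw out", "agent": ask("agent"), "object": ask("object")}


# "The case was thrown out by federal court."
passive = throw_expert(lambda role: en_intercept(role, "the case", "federal court"))
# "Federal court threw out the case." (an active counterpart assumed for contrast)
active = throw_expert(lambda role: active_lookup(role, "federal court", "the case"))
print(passive == active)
```

Both calls produce the same role assignment, which is one way to read the statement that en provides throw with the correct replies to its queries.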

Relative clauses beginning with the word who are analyzed by WEP through the interactions among the who word expert and the experts for the other words in the clause and the larger fragment containing it. The following sentence is an example of such a fragment.

(5) "The man who throws the game likes to lose."

The who expert in this sentence has the responsibility for interacting with the word expert for likes to inform likes about the man doing the "liking". Ordinarily, this expert would expect to find a meaningful lexical sequence to its left representing the needed concept. However, the particular structure of the fragment means that who must be at the other end of the relevant linguistic interactions of likes, rather than the expert for the word to its immediate left, which would normally perform the needed service.

The WEP interpretation of the example fragment proceeds as follows. The word experts for the and man agree to form a meaningful sequence and construct a concept structure to represent its meaning. The who expert begins executing, gets hold of this concept, and waits for the throw expert to start exploring the nature of the lexical sequence on its left. In addition, the who expert anticipates that another word expert further down the line (in the example, the expert for likes) will also seek out information about the sequence to its left, in exactly the way that throw does. The who expert, like every word expert in WEP, plans a strategy to interact with the experts involved in both its prior context and its subsequent context, cooperatively to interpret fragments of text.

The throw expert begins executing and investigates the nature of the lexical sequence to its left. The who expert provides the appropriate information, i.e., the concept structure representing the man, and throw begins to disambiguate its meaning in context. The experts for a and game mutually agree on their local meaning, and through linguistic and idiosyncratic interactions with throw help it determine its meaning as the "throwing of a contest". The likes expert starts executing, and its messages in search of the person doing the "liking" are intercepted by the who expert, which has been on the lookout for such interactions since the beginning. Since the who expert knows the unique name of the concept structure representing the man, it sends this concept to likes, which proceeds normally, knowing nothing of the structural complexities preceding it.
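The role of who in sentence (5) can be pictured as an expert that holds on to one concept and hands it to every later expert that asks about its left context. The class, the left_context helper, and the concept labels below are assumptions made for this sketch, not structures from the WEP system.

```python
# Hypothetical sketch of sentence (5): the who expert keeps the concept built
# to its left and answers later queries about left context with it.

class WhoExpert:
    def __init__(self, held_concept):
        self.held = held_concept   # e.g. the concept reported for "the man"
        self.answered = []         # which experts have been served so far

    def intercept(self, asker):
        self.answered.append(asker)
        return self.held


def left_context(asker, immediate_left, interceptor=None):
    """Resolve an expert's query about the lexical sequence on its left."""
    if interceptor is not None:
        return interceptor.intercept(asker)
    return immediate_left


man = "C1: the man"
game = "C2: the game"
who = WhoExpert(man)

throw_subject = left_context("throw", immediate_left=man, interceptor=who)
likes_subject = left_context("likes", immediate_left=game, interceptor=who)
print(throw_subject, "/", likes_subject, "/", who.answered)
```

Without the interceptor, the query from likes would be answered by the sequence to its immediate left (here the game concept); with it, likes receives the man concept and proceeds without knowing that the reply came from who.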

The word experts for both throw and likes can be expected to explore the underlying meaning of the lexical sequences preceding them. Note the way that WEP applies this linguistic knowledge to the interpretation of fragments of natural language text containing these words. Rather than saying that throw and likes act as finite verbs in certain contexts (which are described in some relational representational scheme, such as grammar rules or logic), we say instead that these words carry on linguistic interactions with the active processes modelling the other words making up the (local linguistic) context to arrive at a mutually acceptable characterization of their individual contributions to textual meaning. The advantage of this perspective comes from the fact that linguistic interactions constitute but a portion of all possible lexical interactions that represent in WEP the process of understanding.

  1.3 Discourse Interactions

While it is clear that certain lexical sequences cannot be understood solely through recourse to syntax and semantics, namely those fragments for which idiosyncratic interactions are required (i.e., specific remembered contexts), why do we need other kinds of general knowledge? We have already seen examples suggesting the answer to this question. In trying to understand the meaning of throw in the towel, the relevant word experts must find out some things about the person performing the described action, before knowing what action he is in effect performing.

If the discourse describes some sort of competition between two people (or teams), for example, throw in the towel could indicate a concession of defeat by one of them. The following fragment illustrates such a contextual situation.

(6) "Rick and Joanie play chess. Rick throws in the towel."

On the other hand, if the discourse has recently made reference to a place where one might dispose of a towel, throw in the towel might be signifying the putting of some towel in that place. The following example illustrates this case.

(7) "Joanie drops a penny in the pit. Rick throws in the towel."

I am not claiming that knowledge of the discourse context is sufficient to disambiguate the meanings of the example sentence, but rather, that such knowledge is required to understand it.

The discourse interactions required to interpret the above example take place (a) between the throw expert and a higher order process modelling the activity context, and (b) between the in expert and a process modelling the discourse focus of attention.* There are two aspects to the processing of the activity mechanism: the unsolicited sending of control signals to indicate the anticipation of certain actions in the text and concept structures to represent them, and the more data-directed interactions with word experts (and other understanding processes) to determine the nature of the actions that actually do occur. The throw expert must carry on activity context interactions to determine if the discourse could be seen as discussing some competitive activity. If so, the "concession of defeat" interpretation of the example sentence is plausible. The in expert carries on focus of attention interactions to find out if some location has recently been described in the text in which something might be thrown.
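The two discourse interactions just described can be illustrated with a small Python sketch. It is a toy under stated assumptions rather than the WEP mechanism: the `Discourse` record, its field names, and the function `readings_of_throw_in_the_towel` are hypothetical, and the function returns the readings that discourse knowledge leaves plausible rather than a single final answer, since such knowledge is required but not claimed to be sufficient.

```python
from dataclasses import dataclass, field


@dataclass
class Discourse:
    """Toy discourse state (all field names are illustrative assumptions)."""
    competitive_activity: bool = False                     # activity context
    focus_locations: list = field(default_factory=list)    # focus of attention


def readings_of_throw_in_the_towel(d: Discourse) -> list:
    """Return the readings that discourse knowledge leaves plausible."""
    readings = []
    # The throw expert asks the activity-context process whether the
    # discourse could be about some competitive activity.
    if d.competitive_activity:
        readings.append("concede defeat")
    # The in expert asks the focus-of-attention process whether a place
    # into which something might be thrown was recently described.
    if d.focus_locations:
        readings.append(f"put the towel in the {d.focus_locations[-1]}")
    return readings


chess = Discourse(competitive_activity=True)   # context of example (6)
pit = Discourse(focus_locations=["pit"])       # context of example (7)

chess_readings = readings_of_throw_in_the_towel(chess)
pit_readings = readings_of_throw_in_the_towel(pit)
print(chess_readings)
print(pit_readings)
```

A discourse that mentions both a competition and a suitable location would yield two candidate readings in this sketch, leaving the choice to other interactions.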

While the WEP system has been directed toward the understanding of fragments of text occurring in textual discourse, the issues arising in the interpretation of dialogue are very similar. The difference between the two tasks involves the nature of discourse interactions. In interpreting fragments of dialogue from the vantage point of one of the participants, word experts must interact with model processes monitoring the goals of the other participant. The following example (provided by James Allen) illustrates the question.

(8) "When is the Windsor train?"

In trying to understand this question from the perspective of the person at the information desk of a train station, the question could be directed at eliciting either of two pieces of information [Allen, 1978], i.e., the time of the next arrival from Windsor, or the time of the next departure to Windsor.

By saying that the Windsor train is a "noun-noun pair", we get nowhere in trying to understand it. In WEP, the word experts for Windsor and train would interact locally and determine the range of possible interpretations for the fragment. In the case of textual discourse, the train expert would carry on discourse interactions with the activity process to find out if discussion of some particular train were anticipated in the text. In the case of dialogue, these interactions would occur between train and an intention mechanism, which might determine that the speaker in the dialogue is concerned with the trains coming from Windsor, and not with the trains leaving for Windsor. If the processes modelling the activity context or the speaker intentions cannot provide help to the train expert, the word experts for the sequence would construct a concept structure to represent the disjunct of the two possibilities, but continue to await the information that would decide between them.
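The idea of a concept structure that represents a disjunct and keeps waiting for deciding information can be sketched as follows. This is a minimal illustration, assuming a hypothetical `DisjunctConcept` class and a `resolve` method that stand in for whatever the activity or intention processes would actually report; none of these names come from WEP itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DisjunctConcept:
    """Toy concept structure holding several readings until one is chosen."""
    alternatives: tuple
    chosen: Optional[str] = None

    def resolve(self, hint: Optional[str]) -> None:
        # A hint from the activity or intention process settles the
        # reading only if it names one of the known alternatives.
        if hint in self.alternatives:
            self.chosen = hint

    @property
    def pending(self) -> bool:
        # True while the experts are still awaiting deciding information.
        return self.chosen is None


ARRIVAL = "next arrival from Windsor"
DEPARTURE = "next departure to Windsor"

windsor_train = DisjunctConcept(alternatives=(ARRIVAL, DEPARTURE))

windsor_train.resolve(None)        # neither process can help yet
still_waiting = windsor_train.pending

windsor_train.resolve(ARRIVAL)     # the intention mechanism later reports
print(still_waiting, windsor_train.chosen)
```

Keeping the unresolved reading as an explicit value, rather than guessing, corresponds to the experts continuing to await the information that would decide between the two possibilities.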

  * The term activity context describes a notion similar to the scripts of Schank and Abelson [1975] and to the frames of Charniak [1977]. The notion of focus of attention has been taken directly from the work of Grosz [1977].

The understanding of these fragments is coordinated in WEP by the word expert for the affix ing. The ing expert interacts linguistically with the experts for the words around it, helping them form meaningful sequences, and carries on logical interactions with the belief modelling process to determine the relative plausibility of the two propositions possibly signified by the larger sequence. In the first case above, the ing expert begins executing after the and man have already started constructing a concept structure to represent the meaning of the man. It awaits the report of this concept structure, as well as the one to be reported by the tiger word expert. Furthermore, ing carries on linguistic interactions with eat to arrive cooperatively at a concept structure representing its meaning. The ing expert then has a plausibility interaction with the belief modeller, and coordinates the remainder of the understanding process based on this important knowledge.

4. SUMMARY

Word Expert Parsing is a linguistic theory based on a lexical organization of linguistic knowledge represented procedurally in word experts. The comprehension of fragments of natural language text is viewed as a process of word interactions, where active lexical agents cooperate to form meaningful sequences of interrelated lexical items. Lexical interactions are of four types: idiosyncratic, linguistic, discourse, and logical. Idiosyncratic interactions allow WEP to explain the understanding of idiomatic (more or less idiomatic) lexical sequences, by comparing new sequences with explicitly remembered ones (called prefabs by Bolinger [1979]). Linguistic interactions enable the use of syntactic and semantic generalizations to interpret fragments, and discourse interactions provide word experts with knowledge of discourse activities and foci of attention. Logical interactions allow word experts to use knowledge about the real world, especially about the multiple perspectives of individual conceptual objects within it and the relative plausibility of propositions about it.

5. ACKNOWLEDGEMENTS

The theory presented in this paper was originally conceived in wonderful cooperation with Chuck Rieger over the past several years. Some of the sensible perspectives in the paper have benefitted from much appreciated written and spoken suggestions and criticisms of Yorick Wilks, Dick Hudson, Dwight Bolinger, Pat Hayes, and James Allen. The Groupe d'Intelligence Artificielle of the University Paris VIII - Vincennes has provided an excellent environment for research. Thanks to Patrick Greussay, Harald Wertz, Daniel Goosens, Annette Cattenat, and Gerard Paul. Many extra thanks to Patrick for all the personal and professional help he has given in making my year at Vincennes both valuable and enjoyable.

6. REFERENCES

Allen, James (1978), Recognizing Intention in Dialogue, Technical Report, Department of Computer Science, University of Toronto.

Bolinger, Dwight (1975), Aspects of Language, Harcourt Brace Jovanovich.

Bolinger, Dwight (1979), Meaning and Memory, in Experience Forms, Haydn (ed.), Mouton.

Charniak, Eugene (1977), Ms. Malaprop, A Language Comprehension Program, Fifth International Joint Conference on Artificial Intelligence.

Chomsky, Noam (1965), Aspects of the Theory of Syntax, MIT Press.

Dik, Simon C. (1978), Functional Grammar, North Holland.

Fillmore, Charles J. (1968), The Case for Case, in Universals in Linguistic Theory, Bach and Harms (eds.), Holt.

Grosz, Barbara (1977), The Representation and Use of Focus in Dialogue Understanding, Technical Note #151, Stanford Research Institute.

Hudson, Richard A. (1979), Pan-Lexicalism, Working Paper, Department of Phonetics and Linguistics, University College London.

Jackendoff, Ray (1972), Semantic Interpretation in Generative Grammar, MIT Press.

Jackendoff, Ray (1976), Toward an Explanatory Semantic Representation, Linguistic Inquiry, v. 7, n. 1.

Kaplan, Ronald M., and Joan W. Bresnan (1980), Lexical-Functional Grammar: A Formal System for Grammatical Representation, Occasional Paper #13, MIT Center for Cognitive Science.

Marcus, Mitchell P. (1979), An Overview of a Theory of Syntactic Recognition for Natural Language, AI Memo #531, MIT Artificial Intelligence Laboratory.

Miller, George A. (1978), Semantic Relations among Words, in Linguistic Theory and Psychological Reality, Halle, Bresnan, and Miller (eds.), MIT Press.

Mitchell, T. F. (1971), Linguistic "goings on"; collocations and other lexical matters arising on the syntactic record, Archivum Linguisticum, v. 2.

Rieger, Chuck (1977), Viewing Parsing as Word Sense Discrimination, in A Survey of Linguistic Science, Dingwall (ed.), Greylock Publishers.

Riesbeck, Christopher K., and Roger C. Schank (1976), Comprehension by Computer: Expectation-Based Analysis of Sentences in Context, Research Report #78, Department of Computer Science, Yale University.

Schank, Roger C., and Robert P. Abelson (1975), Scripts, Plans, and Knowledge, Fourth International Joint Conference on Artificial Intelligence.

Small, Steven (1980), Word Expert Parsing: A Theory of Distributed Word-Based Natural Language Understanding, Technical Report #954, Department of Computer Science, University of Maryland.

Small, Steven (1981), Toward a Cognitive Mechanics for Distributed Modelling, Technical Report, Department of Computer Science, University of Rochester (to appear).

Smith, Edward E., Lance J. Rips, and Edward J. Shoben (1974), Semantic Memory and Psychological Semantics, in The Psychology of Learning and Motivation, v. 8, Bower (ed.), Academic Press.

Wilks, Yorick (1980), Some Thoughts on Procedural Semantics, Cognitive Studies Centre Report #1, University of Essex.