Contexts, hierarchies, and filters : a study of transformational systems as disambiguated languages

Linguistische Arbeiten 128. Edited by Hans Altmann, Herbert E. Brekle, Hans Jürgen Heringer, Christian Rohrer, Heinz Vater, and Otmar Werner

Uwe K. H. Reichenbach

Contexts, Hierarchies, and Filters: A Study of Transformational Systems as Disambiguated Languages

Max Niemeyer Verlag Tübingen 1983

CIP-Kurztitelaufnahme der Deutschen Bibliothek

Reichenbach, Uwe K. H.: Contexts, hierarchies, and filters : a study of transformational systems as disambiguated languages / Uwe K. H. Reichenbach. - Tübingen : Niemeyer, 1983.

(Linguistische Arbeiten ; 128) NE: GT

ISBN 3-484-30128-7 ISSN 0344-6727

© Max Niemeyer Verlag Tübingen 1983. All rights reserved. Without the express permission of the publisher, this book or parts of it may not be reproduced by photomechanical means. Printed in Germany. Printing: Weihert-Druck GmbH, Darmstadt.

TABLE OF CONTENTS

0. Introduction

1. Disambiguated Languages

2. Montague's Definition of a Disambiguated Language

3. Phrase Structure and Context

4. Syntactic Operations and Transformations

5. Basic Expressions and Hierarchies

6. References

0. Introduction

The present work grew out of a comparison of Montague Grammar with recent models of the transformational school and was strongly influenced by Bowers' Theory of Grammatical Relations. Its central part is a proof, given in chapter 4, that transformational systems consisting of a Phrase Structure Grammar, a lexicon, and a set of transformations constitute disambiguated languages in Montague's sense. The proof is based on a reinterpretation of Phrase Structure Trees as representations of Fregean properties of expressions called 'hierarchies over expressions'. Hierarchies comprehend hierarchies over combinations of morphological operations as a special class. The latter allow for a redefinition of basic expressions as morphologically fully developed expressional constants of a language and at the same time provide the mechanism to derive basic expressions from unanalyzed root forms. Closely related to the concept of a hierarchy is the concept of a context which, in the form introduced here, allows expressions to be analyzed into equivalence classes with the help of filters. Filters, however, are not Chomskyan filters, but logical structures defined for Boolean Algebras, and have been known in the mathematical literature for quite some time. The three concepts of a filter, a context, and a hierarchy, in the form introduced here, finally allow for an extension of the transformational theory in a way briefly illustrated at the end of chapter 5 with some fragmentary rules for German. There again, Bowers' Theory of Grammatical Relations served as a model. Any shortcomings, however, are my own responsibility and must be blamed on the theory presented here. A description of the difference between Bowers' and the Standard Theory in formal terms is included in chapter 4.

1. Disambiguated languages

One of the most outstanding characteristics of a natural language as compared to a formal language is the asymmetric relationship between form, function, and meaning of its expressions, which is generally perceived as ambiguity. In natural languages ambiguities can be produced and eliminated systematically; they are therefore an inevitable problem for every linguistic analysis. Naturally a linguist will be more concerned with the elimination of ambiguities than with their production. But while in communication they can be eliminated by direct and immediate reference to some real or suggested context of use, analysis can take advantage of such aids only indirectly by description. To describe natural languages then means to a large extent to eliminate ambiguities.

In theory, one may try to eliminate ambiguities by introducing for every ambiguous expression of a language as many artificially designed and unambiguous substitutes or replacements as there are readings it allows for. If we knew for any such replacement exactly which of the possible readings of an expression it is to represent, then a list consisting of an expression followed by all of its newly introduced substitutes could be regarded as a partial description of that expression. If we demanded furthermore that every unambiguous expression - provided a language has unambiguous expressions - be a substitute for itself, then a list consisting of all the substitutes of all the expressions of a language together with a relation specifying for every substitute which expression it can possibly replace could be regarded as a partial, although highly redundant, synchronic description of that language. In order to qualify as a description of a language, however, such a system must satisfy a number of additional requirements. Let me use the attribute 'grammatical' to refer to both syntactic and semantic phenomena. Then above all substitute expressions must be constructed in such a way that all the grammatical relations holding between expressions are preserved, including of course the relation of being composed of. This means that if expression x in one of its readings bears the grammatical relation R to expression y in one of the readings of the latter, then there must be exactly one substitute for x bearing R to exactly one substitute for y. Of course we know that the number of expressions of a language is potentially infinite, hence a list of the proposed kind could hardly be completed. We know, however, also that grammatical relations holding between expressions can be described in terms of grammatical rules by which we construct complex expressions from less complex ones. Thus if we were to design a set of rules which would in principle enable us to construct any of the above substitute expressions from a stock of basic expressions of a language L, then these rules together with all the expressions they operate on would constitute an artificial but autonomous grammatical system, a disambiguated counterpart of L or, as we might simply say, a disambiguated language. The idea underlying an attempt to design disambiguated languages is that any language can be completely described by relating it to a disambiguated counterpart. Unlike formal languages in the traditional sense of the word, however, disambiguated languages are usually constructed in close analogy to human languages, hence are mostly the product of an empirical investigation.

Naturally the question as to how one should go about constructing a disambiguated language allows for a variety of answers. Since we know, however, that in some wider sense of language every language has means to eliminate ambiguities all by itself, any answer seems to depend crucially on a prior determination of the various factors by which ambiguities are caused. One may be tempted to postulate two major sources of ambiguities, one of them - and probably the most frequent one - being insufficient information about the context in which an expression occurs. Among the ambiguities arising from this source one would very likely include those that have been called 'lexical ambiguities' as special cases; obviously lexical ambiguities exist as long as an expression is mentioned completely out of context or, to put it differently, when no information about a context is given. The other source would then be uncertainty about the function of an expression within a certain context. Ambiguities arising from the latter source have been called syntactic ambiguities at times. Of course if context does influence meaning, as suggested in the characterization of what I tentatively called 'first source of ambiguities', then both sources can no longer be distinguished clearly. It can be verified that, like ambiguities of the former kind, ambiguities of the latter kind can be resolved by providing more information about the context in question. If this is so, then it should be generally possible to eliminate ambiguities by relating an n-way ambiguous expression to n unambiguous counterparts consisting of an arbitrary but fixed representation of the expression itself followed by an exact description of a particular context in which it may occur. This presupposes, of course, that we first agree on a suitable interpretation of the word 'context'.
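The pairing of an n-way ambiguous expression with n unambiguous counterparts can be pictured with a small sketch. It is an illustration under stated assumptions only: the names `Counterpart`, `READINGS`, and `counterparts`, the sample expressions, and the plain-string context descriptions are all hypothetical, and a short string is of course far from the 'exact description' of a context demanded above.

```python
from typing import List, NamedTuple


class Counterpart(NamedTuple):
    """One unambiguous counterpart of an expression."""
    expression: str  # arbitrary but fixed representation of the expression
    context: str     # stand-in for a description of one context of use


# Hypothetical sample data: each expression is listed with one context
# description per reading it allows for.
READINGS = {
    "bank": ["financial institution", "edge of a river"],
    "walks": ["finite verb, third person singular", "plural noun"],
    "Tübingen": ["name of a town"],
}


def counterparts(expression: str) -> List[Counterpart]:
    """Relate an n-way ambiguous expression to its n counterparts."""
    return [Counterpart(expression, context) for context in READINGS[expression]]


for item in counterparts("bank"):
    print(item.expression, "/", item.context)
```

An unambiguous expression such as the town name receives exactly one counterpart, which mirrors the earlier requirement that unambiguous expressions serve as substitutes for themselves.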

In general contexts of expressions are determined by a variety of factors, among them facial expressions, hand and body movement, all sorts of psychological and sociological conditions, presuppositions of speakers and hearers and so on. To try to account for all these factors in the way described would not only appear to be extremely uneconomical, but it would also make it necessary that we already have a descriptive language at our disposal which is completely free of ambiguities. We have none of the kind, let alone one that could account for all the psychological factors involved. This may be one reason why linguists have generally restricted their attention to what I shall call 'grammatical contexts'. These are contexts consisting exclusively of strings of expressions and some of a limited number of well defined symbols together with a specification of the syntactic or semantic relations holding between them. Such contexts, or at least a subclass of them, were usually represented by structural descriptions of expressions of a kind similar or identical to the tree structures introduced in the transformational literature. Where the non-pragmatic information provided by these structural descriptions was considered insufficient, additional pragmatic information sometimes entered the description schematically in the form of reference points. These are sequences of symbols representing respectively those pragmatic parameters that were considered relevant for a unique determination of an extralinguistic context of the kind mentioned. Reference points could then be attached to structural descriptions as indices.

It might seem odd to think of structural descriptions in terms of grammatical contexts. We are used to thinking of them as analyses of expressions, i.e. systematic representations of the parallel processes of an expression's being decomposed into or constructed from its elementary components. Sometimes rule markers, attached as labels to the nodes of a tree, would provide additional information about the derivational history of the expression analyzed which, depending on the approach we are working in, would occur in some suitable representation either on top or at the bottom of a tree. From this point of view information about a grammatical context seems to be missing completely. That such a view is rather narrow becomes obvious if we shift our attention away from the expression analyzed and to those expressions that occur as components in its structural description. Then the whole configuration surrounding each individual component expression can be regarded as its respective grammatical context. In transformational grammar configurations of this kind or grammatical contexts, as I called them, have been used systematically to trigger both lexical insertion and transformations.

Now if we think of contexts in a more general way as sets of configurations of some kind, then we can certainly speak of empty contexts where in the usual sense of the word no context at all is being given. Technically speaking empty contexts are also contexts. Thus we might be tempted to speak of empty contexts when we actually refer to contexts surrounding those expressions that we have called 'expressions analyzed' with respect to certain structural descriptions. But notice that if in fact we do so, then we exclude all those configurations from the context of an expression that contribute to its own analysis. Translated into a transformational approach this would mean that nothing dominated by a particular node in a tree would belong to the grammatical context of the complete expression generated under that node. Thus a determiner or a noun would not be part of the context of the nounphrase of which they are constituents. Rather the grammatical context of a nounphrase would be the complete tree configuration that was left after the NP under which the nounphrase in question had been generated had been cut out. Although such an interpretation of 'context' seems to confirm our intuitive understanding of the word, it may nevertheless be inadequate for our purposes. Information such as from which constituents an expression has been constructed, by what rules, and in which order the rules were applied, may be vital if we want to eliminate ambiguities, and in the description of a grammatical context must complement the information about the expression's own role as a constituent expression. Whether all of this information can always be supplied by structural descriptions of the transformational kind is open. Still it is essential for a complete appreciation of every grammatical context.
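The narrower reading of 'context', on which the context of a nounphrase is the tree that remains once its NP node has been cut out, can be sketched as follows. This is a minimal illustration, not a formalism proposed in the text: the nested-tuple encoding of trees, the marker `HOLE`, the function name `cut_out`, and the sample sentence are all assumptions made for the example.

```python
HOLE = "_"  # marks the position from which a constituent was cut out


def cut_out(tree, path):
    """Split a tree into (context, constituent).

    A tree is a tuple (label, child, ...); a path is a tuple of child
    indices leading from the root to the node to be removed.
    """
    if not path:
        # The node itself is the constituent; only a hole remains.
        return HOLE, tree
    label, *children = tree
    index, rest = path[0], path[1:]
    context, removed = cut_out(children[index], rest)
    children[index] = context
    return (label, *children), removed


# Hypothetical structural description of 'der Mann schläft'.
TREE = ("S",
        ("NP", ("Det", "der"), ("N", "Mann")),
        ("VP", ("V", "schläft")))

context, noun_phrase = cut_out(TREE, (0,))
print(context)      # the configuration surrounding the NP
print(noun_phrase)  # the NP's own analysis, excluded from that context
```

The second return value is exactly the information that the narrow interpretation discards: the constituents from which the expression was built. A fuller notion of grammatical context would have to keep both parts.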

Here, by the way, a well known defect of some traditional descriptive theories, including those that adhere to the popular semantic principle that the meaning of an expression is a function of the meanings of its parts, becomes apparent. Certainly the validity of the latter principle is restricted by the other principle according to which the meaning of an expression is to some extent also determined by the function of that expression within a certain context. But while theories in which the analysis ends at the level of a sentence might be able to adequately account for both principles as far as sentence constituents are concerned, they will never be able to do so for sentences themselves. Thus sentences, as the output of structural descriptions, will never be analyzed in the context of other sentences. Therefore, unless we are dealing with dependent clauses, the semantic analysis of sentences will necessarily remain incomplete. I shall assume here, however, that reference points, even though they were introduced as purely pragmatic components, can compensate for this defect up to a certain point. Notice that a text analysis, for example, is faced with the same dilemma at a higher level, which is one of the reasons why so called 'work immanent' text interpretations have been severely criticized in recent decades.

An interpretation of structural descriptions as specifications of grammatical contexts makes it easy to see that the grammatical relations an expression bears to other expressions of a language are completely determined by the set of all structural descriptions in which it can occur, provided our structural descriptions are complete. Here 'complete' means that they unambiguously represent the respective grammatical contexts in which possible ambiguities of an expression can be resolved. Thus, in a manner of speaking, we disambiguate grammatical contexts rather than expressions by designing adequate structural descriptions. The expressions themselves retain their principal ambiguity and they gradually regain it as bits of the configuration surrounding them in a structural description are step by step removed. A list consisting of an expression followed by all the structural descriptions in which it can occur could therefore be regarded not only as a partial description of the kind mentioned earlier, but as a complete grammatical description of the expression in question. In practice one would of course try to simplify such descriptions by analyzing them into equivalence classes or types. Such a type could possibly be defined recursively and could be thought of as representing schematically all those contexts in which a particular expression or a class of expressions retain one and the same function or one and the same interpretation. Subcategorization conditions in Transformational Grammar are examples of context types. Their construction has reached a preliminary peak in sophistication in Bowers 1975, where in addition to specifying a static contextual configuration of some kind, a subcategorization condition also includes encoded information about the rules that can create that particular configuration and the order in which they would apply. The order of the rules in question can then be imposed as a condition on the acceptability of a derivation. For example if a certain contextual configuration cannot be constructed from a given configuration by applying certain rules in the order predicted by the subcategorization condition of a lexical item, say a verb, then the derivation blocks, meaning that the item in question cannot be inserted or does not fit into the given context. Of course there are other ways to design context types; the concept of a structural description as introduced here is as yet not committed to a transformationalist interpretation. Since moreover the attribute 'grammatical' in 'grammatical context' was supposed to refer to syntactic and semantic phenomena alike, it may well turn out that the structural descriptions we need here cannot be found in at least that part of the transformational literature that is still committed to the Chomskyan concept of deep structure, the more so since, to my knowledge, an adequate theory of semantic representation within the transformational framework does not exist.

If an expression can indeed be disambiguated by assigning it an unambiguous grammatical context in the form of a structural description, and if moreover every expression is indeed completely determined once all the grammatical contexts in which it can occur are known, then every attempt to design adequate structural descriptions for the expressions of a language, however incomplete they may turn out to be initially, must ultimately result - or is at least intended to result - in the construction of a disambiguated language. What else should encourage such an attempt if not the belief that grammatical differences between expressions can be explained and ambiguities resolved by means of description? As linguists we may be aware of the fact that the actual construction of a disambiguated counterpart for any language could never be completed. Yet trying to approximate such a hypothetical construct will almost certainly help us gain some insights into the mechanisms underlying every communicative process. From this point of view it is truly hard to see why many linguists still regard Montague's claim that there is no important theoretical difference between natural languages and formal languages as sheer blasphemy, while at the same time they have been discussing purely formal languages all along. Certainly languages whose expressions are structural descriptions of natural language expressions are formal languages in the sense that the relationship between form, function, and meaning of structural descriptions is at least intended to be symmetric. And have such languages not been constructed in order to be able to interpret natural language expressions indirectly via an interpretation of their structural descriptions?

To say that there are no important theoretical differences between natural languages and formal languages, however, is not the same as saying that there are no differences at all. It has already been pointed out initially that one difference between formal languages, as used, for example, in mathematics, and those languages that will serve as disambiguated counterparts of natural languages lies precisely in the fact that the latter are mainly the product of empirical investigation. From a theoretical point of view it may nonetheless be of advantage to start a linguistic investigation by considering and formulating in exact terms certain minimal requirements a language as a purely theoretical construct must meet in order to qualify as a disambiguated language. In a second step we can then start to construct a disambiguated language not only on the basis of the evidence provided by a thorough inspection of the natural language under investigation, but also in accordance with the theoretical requirements formulated earlier. This is the way Montague proceeded, and in this respect his approach is clearly different from those of the structuralist and transformational schools, for which disambiguated languages, although - or maybe because - the concept was not yet known, seemed to be exclusively the theoretical fallout of an empirical investigation of human languages. The definition of a disambiguated language therefore plays a central role in Montague's theory. That it is moreover the central part of his theory of syntax should not surprise us, for if a language is free of ambiguities, then semantic phenomena should be completely reflected in the syntactic structure. Thus the mere assumption of the existence of a disambiguated language for every language seems to lead automatically to postulating a one to one correspondence between syntax and semantics, i.e. in particular to the requirement that the syntactic and the semantic derivation of any particular expression run entirely parallel.


That the definition of a disambiguated language must not be mistaken for a disambiguated language itself goes without saying. Rather it is part of a linguistic theory of a kind which, in Chomsky's terms, should meet the level of explanatory adequacy. Of course, explanatory adequacy is in itself a rather vague concept, largely because its definition rests, like the definition of other concepts introduced by Chomsky, on the prior definition of the rather individualistic concept: competence of a speaker hearer. Assuming that Montague's definition of a disambiguated language is explanatorily adequate would entail that every grammar meeting the requirements set up in that definition is also descriptively adequate, simply because we find nothing among these requirements that could help us discriminate between good and bad disambiguated languages. This would be a surprising result and, unless we assume that there is one and only one disambiguated language for every language, highly unlikely. Is this then an argument against Montague? I don't think so as long as there is no satisfactory answer to one crucial question: How can a general agreement on descriptive adequacy be reached when our understanding of it too seems to rest to a large extent on our understanding of such concepts as: psychologically correct or: correctly describing the intrinsic competence of an idealized speaker hearer? Since current psychological theories seem unable to offer anything less subjective than our private understanding of the phrase 'psychologically correct', any attempt to answer that question leads inevitably into the realm of speculation.

A word must be said about the usage of the term 'language' in this context. It may have become obvious already that the concept of a disambiguated language comprehends the traditional concept of a grammar. Thus descriptive adequacy was by no means attributed to languages in the conventional sense, which would of course be utterly meaningless. Montague himself has repeatedly tried to construct a grammar (a disambiguated language that is) for a fragment of English. From a linguist's point of view these attempts must be called failures. That this is so can, however, not be blamed on the quality of his theory, but must rather be blamed on the fact that he chose to ignore the empirical work that had already been carried out by linguists of various schools and the insights this work can offer to a theoretician. For the following discussion Montague's practical work will be of marginal interest only; instead of losing any more words on it, let us therefore go right back to the core of his syntactic theory and take a closer look at his definition of a disambiguated language as introduced in Montague 1970,1.

2. Montague's definition of a disambiguated language

As can be expected, the basic stock of a disambiguated language in Montague's definition is a set of basic expressions of a language and a set of syntactic operations to construct complex expressions from less complex ones. Right from the start Montague appeals to a more intuitive understanding of the phrase 'basic expressions' as referring to a minimal syntactic unit which is in some conventional sense meaningful. Although the obvious lack of concern with the exemplification of one of his basic concepts may be intolerable from a linguist's point of view, it is still justifiable linguistically. As the input of the syntactic component of a grammar, so one could argue, basic expressions are of course not the result of the application of any of the syntactic operations (see p. 30, Def 1: 4). Rather they are the output of a theory of morphology or even a theory of phonology, if it should indeed turn out that morphological processes can be adequately accounted for within a phonological framework. Thus, one would conclude, the definition of the concept of a basic expression cannot be part of a theory of syntax. In Transformational Grammar the equivalent of Montague's set of basic expressions is the lexicon.

Following a logical tradition, Montague generally represents expressions by strings of letters of the English alphabet, enclosed in single quotation marks. If such a string includes unidentified parts, i.e. names of variables of the meta language for expressions of the object language, then the single quotes are replaced by corners or quasi quotes. For example if 'walks' is an expression of the object language and $x$ is a variable for any such expression, then the result of writing 'walks' followed by $x$ will be represented by ⌜walks x⌝ rather than 'walks x'. Montague's theoretical work, however, is not at all committed to representations of this kind and in practice we are free to represent both expressions and basic expressions any way we like, provided we do so in a consistent and unambiguous way. The only thing relevant at this point of the discussion is that an analysis of the set of basic expressions into mutually exclusive equivalence classes, called 'basic categories', is assumed to have been carried out somehow. In Transformational Grammar such an analysis has been accomplished with the help of categorial features [+N], [+V], etc., which have been attached to lexical items as labels, thus marking them for class membership. Montague, in order to avoid a commitment to the needs of any particular language on the fairly abstract level of the definition of a disambiguated language, refers to basic categories in a rather vague and unspecific way by using symbols '$X_i$', where $i$ represents an index chosen from a set of natural numbers. The use of natural numbers to distinguish different basic categories is of course by no means necessary. As soon as we turn to the description of particular languages, we may well choose categorial symbols similar to the non terminal or pre terminal symbols of a Phrase Structure Grammar to represent the elements of the index set. In this case it might even turn out that basic categories can be unambiguously referred to by using indices alone. Here, on the other hand, they cannot, because the same set will be used to index both basic categories $X_i$ and derived categories, called '$C_i$', of those expressions that are to be constructed from basic expressions by a systematic application of the syntactic operations. In this way we will be able to express that, whatever value $i$ might take, $X_i$ will always be a subset of $C_i$. Translated into a transformational terminology this means that we will be in a position to say that while all nouns are nounphrases and all verbs are verbphrases, the opposite is not necessarily true. Here sentences will get a special treatment of course. Since all sentences are output of at least one syntactic operation, there are no basic ones. Hence there is no basic category of sentences or, to put it differently, the basic category of sentences is the empty set. The category of sentences will therefore get the special index $i_0$. Naturally $X_{i_0}$ is also a subset of $C_{i_0}$.
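The relation between basic and derived categories can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the index labels "N", "V", and "S" (the last standing in for the special sentence index), the sample entries, and the single concatenating operation, which is not one of Montague's.

```python
SENTENCE = "S"  # plays the role of the special sentence index

# Basic categories, one set per index (hypothetical sample entries).
# There are no basic sentences, so that category is the empty set.
X = {
    "N": {"Hans", "Maria"},
    "V": {"schläft"},
    SENTENCE: set(),
}


def derived_categories(basic):
    """Build the derived categories from the basic ones.

    Each derived category starts out as a copy of the basic category
    with the same index; one hypothetical two-place operation then adds
    sentences formed from a noun followed by a verb.
    """
    derived = {index: set(members) for index, members in basic.items()}
    for noun in basic["N"]:
        for verb in basic["V"]:
            derived[SENTENCE].add(noun + " " + verb)
    return derived


C = derived_categories(X)

# Whatever the index, the basic category is a subset of the derived one.
print(all(X[index] <= C[index] for index in X))
```

Because the same labels index both dictionaries, a label alone does not say whether the basic or the derived category is meant, which is the point made above about shared indices.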


Before we turn to the role of syntactic operations within the context of a disambiguated language, some explanatory remarks on syntactic operations in general might be helpful. An $n$-place syntactic operation assigns expressions to sequences of $n$ constituent expressions. If $f$ is any operation and $\langle x_1, \ldots, x_n \rangle$ is a sequence of expressions, then by '$f(x_1, \ldots, x_n) = y$' we express that $y$ is the result of applying $f$ to $\langle x_1, \ldots, x_n \rangle$. We call $y$ 'the output expression' or 'the output of $f$ for the arguments $x_1, \ldots, x_n$'. The complete string $\langle x_1, \ldots, x_n \rangle$ is called 'input string' or 'input of $f$'. Input strings are ordered sequences, which means that a different arrangement of the arguments leads to different results or sometimes to no result at all. An operation can assign only one result to every input string. It can, however, assign the same result to different input strings. If, on the other hand, we were to get two or more different results by applying $f$ to $\langle x_1, \ldots, x_n \rangle$ at different times, then $f$ could no longer be called an operation. Rather it would be called a relation in this case. The set of all input strings of an operation $f$ or the set of all strings for which $f$ yields a result is called 'the domain of $f$', abbreviated '$D[f]$'. $R[f]$ or the range of $f$ is the set of all of its output expressions. In general the domain of an $n$-place operation is given as an $n$-place cartesian product $\prod_{k<n} C_{i_k} = C_{i_0} \times C_{i_1} \times \ldots \times C_{i_{n-1}}$ (read 'cross' for '$\times$') of $n$ not necessarily different categories $C_i$. Here '$<$' means the same as 'smaller than' and '$k<n$' says that $k$ takes successive numerical values starting with 0 (the smallest number smaller than $n$) and ending with $n-1$ (the largest number smaller than $n$). In words $\prod_{k<n} C_{i_k}$ is the set of all $n$-place sequences with first constituent from $C_{i_0}$, second constituent from $C_{i_1}$, and $n$-th constituent from $C_{i_{n-1}}$. For $n = 2$, then, $\prod_{k<2} C_{i_k}$ is the cartesian product $C_{i_0} \times C_{i_1}$ which is the set of all ordered pairs with first constituent from $C_{i_0}$ and second constituent from $C_{i_1}$. For completeness we require that, for $n = 0$, $\prod_{k<n} C_{i_k}$ be the set $\{\emptyset\}$ which includes the empty sequence $\emptyset$ or the blank as its only element.
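The notions of domain, input string, and operation versus relation can be checked on a small scale. The following is a sketch under stated assumptions: the function names `domain` and `is_operation` and the sample categories are hypothetical, tuples stand in for input strings, and Python's `itertools.product` is used as a ready-made cartesian product.

```python
from itertools import product


def domain(*categories):
    """Cartesian product of the given categories, as a set of tuples.

    With two categories this is the set of all ordered pairs; with no
    category at all it is the set whose only element is the empty tuple.
    """
    return set(product(*categories))


def is_operation(assignments):
    """Check that no input string is assigned two different results.

    `assignments` is an iterable of (input_string, output) pairs. The
    same output may occur for different inputs; what is ruled out is
    one input with several outputs, which would make it a relation.
    """
    seen = {}
    for input_string, output in assignments:
        if seen.setdefault(input_string, output) != output:
            return False
    return True


# Hypothetical sample categories.
C0 = {"Hans", "Maria"}
C1 = {"schläft", "lacht"}

pairs = domain(C0, C1)
print(("Hans", "lacht") in pairs)   # first constituent from C0, second from C1
print(("lacht", "Hans") in pairs)   # order matters
print(domain())                     # the zero-place case
```

The zero-place case comes out as required in the text: the product over no categories contains the empty sequence as its only element.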

One might get the impression that a syntactic operation is completely defined once its domain and its range have been specified. This is not so, simply because the results of the operation cannot be chosen from a set of already existing expressions. Rather they must be constructed from the constituent expressions given in the input strings. Exactly how they can be constructed has to be specified in the definition of the operation in question. Such a definition can be given informally. An agreement operation for simple German declarative sentences, for example, could be introduced in the form of an instruction such as: Attach an ending to the verb depending on the number and person of the immediately preceding noun. (For the majority of German sentences, even simple ones, this instruction would of course produce wrong results. Since it serves only illustrative purposes, however, it may be accepted for the moment.) The operation is completely defined if the grammar generating the declarative sentences in question has means to distinguish both nouns and verbal endings relative to the given parameters. However, a definition can also be introduced formally. How this would be done for the above agreement operation shall be illustrated by the following definitions. Although these definitions will differ from the definitions in Montague's fragments of English in that they refer to additional theoretical concepts not available to him, I shall use only the formal methods introduced so far.

Let Ν be a set of proper names and pronouns. Since plurals

in German are formed irregularly, we assume that Ν includes

singulars and plurals alike. Let furthermore V be a set

of verbal stems of German intransitive one-place verbs. We

want to define an operation f with domain $D[f] = N \times V$ and range R[f] a set S of declarative sentences. In short we say that f is a function from N cross V into S and write '$f: N \times V \rightarrow S$'. Before we can describe the working of f, we

will have to analyze Ν into subclasses according to number

and person.

-24-

Let NUMBER and PERSON be two different feature sets. We represent the elements of the first one by the expressions 'singular' and 'plural' and the elements of the latter by the Latin numerals 'I', 'II', and 'III'. The particular nature of these elements shall not concern us here. Let us merely assume that it is in some less absolute sense meaningful to speak of properties of expressions. We now define a mapping g from N onto NUMBER $\times$ PERSON. 'Onto' means that every ordered pair $\langle a,b\rangle$ in NUMBER $\times$ PERSON, where a is the referent of 'singular' or 'plural' and b is the referent of one of the three Latin numerals, has to be the value of at least one element of N. Since on the other hand no element of N can be assigned more than one value by g, every ordered pair in NUMBER $\times$ PERSON can now be taken to represent exactly one subclass of N. Obviously no two subclasses represented in this way have an element in common. Moreover the union of all the subclasses so identified is just the set N.

Another mapping h has to be formulated in order to interpret the elements of a set M of verbal endings. Like g, h will be onto, but this time NUMBER $\times$ PERSON will be the domain. It is $h: \text{NUMBER} \times \text{PERSON} \rightarrow M$ (onto). Actually h can be represented in form of a two-dimensional matrix with the rows labeled by elements of NUMBER, the columns labeled by elements of PERSON, and the values m of h written down where rows and columns intersect. Matrices of this kind can be found

-25-

in every school grammar.

With the help of g and h we can finally define f as follows. Let n, v, m be elements of N, V, and M respectively. It is, for all n, v, and m,

$f(n,v) = \ulcorner n\ vm \urcorner$ iff $m = h(g(n))$.
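The interplay of g, h, and f can be made concrete in a few lines. This is a minimal sketch under stated assumptions: the particular pronouns, the name 'Hans', the two regular verb stems, and the endings are illustrative choices of mine, not data from the text, and the deliberately oversimplified agreement instruction is kept as it stands.

```python
from itertools import product

NUMBER = ("singular", "plural")
PERSON = ("I", "II", "III")

# g: N -> NUMBER x PERSON. Hypothetical toy set N of names and pronouns.
g = {
    "ich": ("singular", "I"),
    "du": ("singular", "II"),
    "er": ("singular", "III"),
    "Hans": ("singular", "III"),
    "wir": ("plural", "I"),
    "ihr": ("plural", "II"),
    "sie": ("plural", "III"),
}

# h: NUMBER x PERSON -> M, written as the matrix of a school grammar.
h = {
    ("singular", "I"): "e", ("singular", "II"): "st", ("singular", "III"): "t",
    ("plural", "I"): "en", ("plural", "II"): "t", ("plural", "III"): "en",
}

V = {"lach", "sing"}          # assumed regular intransitive verb stems
M = set(h.values())           # h is onto M by construction

def f(n, v):
    """f(n, v) = 'n vm' where m = h(g(n))."""
    m = h[g[n]]
    return f"{n} {v}{m}"

# g is onto: every pair in NUMBER x PERSON is the value of some element of N.
g_is_onto = set(g.values()) == set(product(NUMBER, PERSON))
```

Since g is a function, the inverse images of the six pairs partition N into disjoint subclasses, which is the classification the text relies on.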

It may have been observed that, throughout the description of f, we have not once made any assumptions as to how expressions or morphemes should be represented. A mixture of levels seems to have occurred, however, in the formation of constituent expressions 'vm'. After all, by h every m in M had assumed the status of a class of sequences of syntactic properties. Of course an interpretation of grammatical endings as class markers should come naturally if we consider the fact that they indeed prepare expressions morphologically to be combined with those and only those expressions that share the respective properties a grammatical ending represents. These are precisely the properties by which we identify classes of expressions. In representations such as 'vm', however, the difference in theoretical nature between v (an expression) and m (a class) seems to disappear. A precise characterization of basic expressions and expressions in a later chapter will eliminate the problem, as we shall see.

- 2 6 -

With these preliminary and mainly illustrative remarks on the nature of syntactic operations in mind we should be able to go back to Montague's definition of a disambiguated language and take a closer look at the characterization of syntactic operations given there.

As he did with basic categories, Montague represents syntactic operations only vaguely by using symbols '$f_j$'. Again j is an index chosen from a set J of natural numbers and, as before, the choice of natural numbers to serve as indices is arbitrary and of no consequence for the theory in general. Syntactic operations, according to Montague, must meet essentially two requirements: They must be uniquely defined and they must be defined in accordance with the needs of a particular language. The first one, or the uniqueness condition, supplements what has been said about operations before, namely that they assign exactly one value to every argument sequence in their domain. It specifies that every expression can be the value of at most one syntactic operation. This translates into the terminology of a logician as follows: For any two operations $f_j$ and $f_{j'}$, and any two input strings a and a' in their respective domains, if $f_j(a)$ - i.e. the value of $f_j$ for a - is the same as $f_{j'}(a')$, then $f_j$ must be the same as $f_{j'}$ and a must be identical with a' (see p. 31, Def 1: 5). This is in accordance with our intuitions,

- 27 -

for clearly if an expression can be derived in two different

ways, then it is at least two way ambiguous. Expressions of

a disambiguated language must of course be free of ambiguities.

Freedom from ambiguity must therefore also enter as a requirement

into the formulation of syntactic operations. The second

condition, namely that syntactic operations be language

particular, follows from the fact that they are defined over

a set A of possible expressions. Here 'possible' can only

mean possible in a certain language; hence the limits for

the construction of syntactic operations are set by the

structural peculiarities of the language to be analyzed.

This confirms the initial claim that disambiguated languages

are to a large extent the product of thorough empirical

investigations. Recall that a disambiguated language is to

provide the background against which a human language is

being described. The set A must therefore include only ex-

pressions that can be meaningfully and unambiguously related

to the expressions of the latter. Above all it must of course include all the basic expressions of the natural language in question in some suitable representation (see p. 30, Def 1: 2). To secure that it includes all the expressions that can be derived from basic expressions by a successive application of the syntactic operations, we must demand that both range and domain of every operation $f_j$ be subsets of A, in other words that A is closed under all operations $f_j$ (see

-28-

p. 30, Def 1: 3). To make sure that it includes only expressions that can be derived by a successive application of syntactic operations, we must require that A be the smallest set including just the expressions mentioned. Any system $\langle A, f_j\rangle_{j\in J}$ consisting of an arbitrary set A and a number of operations $f_j$, whose range and domain are included in A, is called 'an algebra' (see p. 30, Def 1: 1).

Range and domain of every syntactic operation $f_j$ are specified independently of the description of the working of $f_j$ in what Montague calls 'a syntactic rule'. Such a rule is an ordered three-place sequence $\langle f_j, \langle i_k\rangle_{k<n}, i'\rangle$ consisting of the operation $f_j$ itself, followed by an n-place sequence $\langle i_0,\ldots,i_{n-1}\rangle$ of category indices representing the respective categories from which the arguments of $f_j$ are taken, and the index $i'$ of the category comprehending the range of $f_j$ (see p. 31, Def 1: 6). By the middle constituent $\langle i_k\rangle_{k<n}$ of $\langle f_j, \langle i_k\rangle_{k<n}, i'\rangle$, then, $f_j$ is actually being characterized as an n-place operation. Syntactic rules must be read as instructions to actually construct expressions of category $C_{i'}$ by applying $f_j$ to strings of expressions as specified. By a successive application of syntactic rules, categories of expressions can therefore be built up systematically. It has already been mentioned that the same index set is used to supply both basic categories $X_i$ and categories $C_i$ with indices. While this

-29-

policy has the advantage that by its index alone every $X_i$ can be identified as a subset of the corresponding $C_i$ (see p. 31, Def 2: 1 and 2), it also creates a definite ambiguity in the inner constituents of syntactic rules. Yet the ambiguity is intended. Notice that, although initially syntactic operations can take only basic expressions as arguments, they must in principle be allowed to refer to the complete stock of expressions every category can supply at each of the successive stages of its being built up. Thus it is the ambiguity of the inner constituent that adds the necessary recursive quality to a rule. At the same time it explains why Montague chose to represent the domains of the $f_j$ by sequences of category indices rather than by expressions '$\prod_{k<n} C_{i_k}$', as might have been expected. The latter, as names

of cartesian products, would create the impression that at all times there is a fixed stock of argument expressions available for every operation $f_j$. Actually this is not so. It is true that, if the $C_i$ are classes of possible expressions, the domain of $f_j$ is always a proper subset of $\prod_{k<n} C_{i_k}$ for every n-place operation $f_j$. However it may be a different subset at any of a number of successive applications of $f_j$.

This brings us to the cardinality of categories, i.e. the number of expressions they include. It follows from

-30-

what has been said that the cardinality of $C_i$ is infinite for every i in I and so is consequently the cardinality of A. Index sets, on the other hand, are finite. This conforms to a quality generally desired for all generative grammars, namely that they be able to generate a potentially infinite number of sentences by applying only a finite number of rules. That there are in fact no more categories of expressions than those indexed by elements of I is ensured in number 4 of Def 2. In order to make a formal definition complete we must, however, also secure that no rule applies vacuously, i.e. that every operation produces results. This is guaranteed by 3 of Def 2. Let me at this point repeat the definition of a disambiguated language as stated in Montague 1970,1, hoping that enough has been said to make it comprehensible.

Def 1: (Montague): A disambiguated language is a system $\langle A, f_j, X_i, S, i^0\rangle_{i\in I,\, j\in J}$ such that

1. $\langle A, f_j\rangle_{j\in J}$ is an algebra;

2. for all $i\in I$, $X_i$ is a subset of A;

3. A is the smallest set including as subsets all the $X_i$ (for $i\in I$) and closed under all the operations $f_j$ (for $j\in J$);

4. $X_i$ and the range of $f_j$ are disjoint whenever $i\in I$ and $j\in J$;

-31-

5. for all $j,j'\in J$, all sequences a in the domain of $f_j$, and all sequences a' in the domain of $f_{j'}$, if $f_j(a) = f_{j'}(a')$, then $j = j'$ and $a = a'$;

6. S is a set of sequences of the form $\langle f_j, \langle i_k\rangle_{k<n}, i'\rangle$, where $j\in J$, n is the number of places of the operation $f_j$, $i_k\in I$ for all $k<n$, and $i'\in I$;

7. $i^0\in I$.

Def 2: (Montague): If D is a disambiguated language, then D generates the family C of syntactic categories, iff

1. C is a family, indexed by I, of subsets of A;

2. $X_i \subseteq C_i$ for all $i\in I$;

3. whenever $\langle f_j, \langle i_k\rangle_{k<n}, i'\rangle\in S$ and $a_k\in C_{i_k}$ for all $k<n$, then $f_j(\langle a_k\rangle_{k<n})\in C_{i'}$;

4. whenever C' satisfies 1-3, $C_i \subseteq C'_i$ for all $i\in I$.
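How the categories of Def 2 grow out of the basic categories by repeated rule application can be sketched as a bounded fixed-point computation. Everything concrete here is an assumption for illustration: the indices 'N', 'V', 'S', the operations `f0` and `f1`, and the cut-off `rounds`, which is needed only because the real categories are infinite. Expressions are encoded as nested tuples that record the operation and its arguments, so that condition 5 of Def 1 (unique derivation) holds trivially.

```python
from itertools import product

def generate_categories(basic, rules, rounds):
    """Approximate the family C of Def 2 by `rounds` rounds of rule application.

    basic: dict mapping each index i to the set X_i of basic expressions
    rules: list of triples (operation, argument indices, result index)
    """
    categories = {i: set(xs) for i, xs in basic.items()}
    for _ in range(rounds):
        derived = {i: set() for i in categories}
        for operation, arg_indices, result_index in rules:
            # Arguments come from the stock each category has reached so far.
            for args in product(*(categories[i] for i in arg_indices)):
                derived[result_index].add(operation(*args))
        for i in categories:
            categories[i] |= derived[i]
    return categories

def f0(n, v):
    return ("f0", n, v)      # hypothetical two-place operation: N x V -> S

def f1(s):
    return ("f1", s)         # hypothetical one-place recursive operation: S -> S

basic = {"N": {"Hans"}, "V": {"lacht"}, "S": set()}
rules = [(f0, ("N", "V"), "S"), (f1, ("S",), "S")]
C = generate_categories(basic, rules, rounds=3)
```

The rule for `f1` takes its arguments from the very category it feeds, which is the recursive use of category indices discussed above; each further round adds one more embedding.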

Now suppose that, as was suggested earlier, the expressions of a disambiguated language are indeed structural descriptions, of some kind, of natural language expressions. As has been pointed out, they must have been constructed in this case by constant comparison with, and hence by constantly relating them to, their intended natural language counterparts. This whole process can be represented formally in a somewhat sweeping manner by setting up an all inclusive relation R, relating our formal (i.e. disambiguated) language with the

- 3 2 -

language under investigation. Such a language can then be

described as a system L = <D,R>, where D is a disambiguated

language and the domain of R is included in A. An expression of L is now called 'meaningful' if there is at least one expression in D to which it is related by R; if

there is more than one such expression, then it is called

'ambiguous'. Clearly R induces a classification for the

expressions of L corresponding to the classification imposed

on the expressions of D by the syntactic rules of D. In

this way the grammatical relations holding between expres-

sions of D are being transferred to expressions of L; hence

the rules of L are also rules for D. Technically speaking

R is just a device to delete all the subscripts, superscripts

and parentheses, which are part of the structural descriptions

of expressions, together with the complete configuration

surrounding them in a tree. Thus by R, in effect, we separate

an expression from the description of the grammatical con-

text in which it occurs.

3. Phrase structure and context.

It was Barbara Hall Partee who, to my knowledge, first

pointed out that some of the syntactic operations designed

by Montague for a description of a fragment of English show

a close similarity to local transformations. This observa-

tion must have then brought about the idea that by trans-

lating the formal representations of certain transformations

into the formal language used by Montague and adding them to

his stock of operations one would be able to develop and

extend the syntax of his fragment to a point where it was

no longer inferior to a syntactic description of English

of the transformational kind. Underlying this idea seemed

to be the belief that a system consisting of a Phrase Struc-

ture Grammar, a lexicon and a set of transformations satis-

fying certain well known ordering restrictions constitutes

a model for a disambiguated language à la Montague. There

are indeed arguments to substantiate such a belief, as we

shall see, and it should have been substantiated prior to

an attempt to incorporate the practical results of the

transformational theory into a Montague framework. By choosing

not to do so, however, one tacitly, and maybe unwillingly,

endorsed an assumption about the nature of expressions which,

although it is by no means forced upon us by the definition

of a disambiguated language, is characteristic of Montague's

-34 -

practical work. This is the assumption that the expressions

of a disambiguated language are linear arrangements of certain

primitive units, represented in Montague's work by letters

of the English alphabet, supplemented by a limited number

of technical signs such as indexed parentheses, brackets,

and the like, referring to their derivational history up to

the point where they themselves can take part in a deriva-

tion as relatively independent syntactic units. The trans-

formational theory, although it remains rather vague as far

as the characterization of the concept of an expression is

concerned, is specific enough so as to leave no room for

an interpretation of the above kind. Here I am not referring

to the use of complex symbols to represent constituents, but

to the fact that the transformational theory is geared towards,

and almost obsessed with, the generation of sentences. Behind

this preoccupation with the language unit sentence the role

of constituents of sentences as autonomous units disappears

almost completely. If we can speak of constituent expressions

at all, then the complete tree configuration dominating a

terminal string, which is to include them as parts, and which

is to be generated under the node S of a Phrase Structure

Tree, is as much part of them as the linear order and the

mode of representation of the primitive units from which they

are composed. It follows naturally that expressions can not

represent the individual nodes of a particular tree, as they

-35-

can in Montague's fragments. A combination of the two theories

that takes Montague's fragmentary syntax of English as a basis

therefore not only obscures a major theoretical difference

between both approaches, but also deprives itself of the

means to accomplish just what it was supposed to accomplish,

namely add to the generative power of Montague's syntactic

apparatus by adding transformations. Transformations manipulate

phrasemarkers, i.e. expressions of a kind that has been de-

scribed earlier as consisting of strings of complex symbols

together with a specification of the complete grammatical

configuration in which they (i.e. the complex symbols) are to

play a role as substrings. This is true for local transformations too, as footnote 18 for chapter 2 of Chomsky's Aspects of the Theory of Syntax may reveal. There a transformation is called 'local' with respect to a node A if it affects only the string immediately dominated by A. Obviously this

does not exclude the possibility that a local transformation

is triggered by or, in its structural description, refers to

a configuration that includes the tree dominated by A only

as a subtree. Thus what may have been considered a weakness

of the transformational theory by some, namely its primary

concern with the well-formedness of sentences, turns out to

be its strength, because it virtually forces us to a systematic

use of what here has been called 'grammatical context'. In

Montague's fragments contexts come into play in form of trees

-36-

dominated by an expression only. In generating an expression

we can therefore at no stage refer to any larger context

in which the expression itself is to play only a subordinate

role; hence its description remains necessarily incomplete.

Since there is moreover nothing, aside from these only partial

specifications of contexts of expressions, a transformation

could possibly relate to, the latter could only lose in

generative power, if we were to incorporate it into the stock

of rules proposed by Montague for English. By trying to cure

the patient's impotence, we would end up castrating the doctor.

As pointed out earlier, the above criticism applies to

Montague's practical work only. It is precisely his assumption

about the nature of expressions which makes even his de-

scription of rather simple syntactic processes appear quite

clumsy. For this reason, and for others expressed in Bowers

and Reichenbach 1979, Montague's practical work is hardly

of any significance for linguists and as a starting point

for a comparison with other linguistic theories quite un-

suitable. Let me, however, further investigate the claim

that there are indeed reasons to believe that a system con-

sisting of a Phrase Structure Grammar, a lexicon, and a set

of transformations, constitutes a disambiguated language in

Montague's sense. A claim to this effect presupposes, of

course, that a Phrase Structure Grammar constitutes the

-37-

essential part of a disambiguated language, if it is not a disambiguated language all by itself. We must therefore examine Phrase Structure Grammars more closely. The following definition introduces a formal characterization of the Chomskyan concept of a Phrase Structure Grammar.

Def 3: A Phrase Structure Grammar is a system $\langle V_L, \Sigma, P, S\rangle$ such that

1. $V_L$ is a finite set;

2. $\Sigma \subset V_L$ (read: $\Sigma$ is a subset of $V_L$);

3. P is a set of productions (rewriting rules), i.e. a set of ordered pairs $\langle u,v\rangle$ with $u\in V_L - \Sigma$ (read: u is an element of $V_L$ minus $\Sigma$) and $v\in V_L^*$;

4. S is the initial symbol or the start variable.

The definition includes some new notation that needs to be explained. If X, Y are sets, then by '$X \subset Y$' we express in particular that X is a proper subset of Y; '$X \subseteq Y$', on the other hand, means that X is a proper subset of or equal to Y. $X - Y$ is the set of all elements that are in X but not in Y. Intuitively this is just what is left of X after all the elements that are also elements of Y have been taken out. By '$X^*$', finally, we denote the set of all finite sequences of elements from X.

-38-

As is well known, $V_L$ includes the symbols 'S', 'NP', 'VP', 'PP', 'P', 'Det', 'A', 'N', 'V', etc., the last five of which are also elements of $\Sigma$. $\Sigma$ then is the set of all preterminal symbols of a Phrase Structure Grammar. The set P of rewriting

rules is usually given in form of a list. The actual selection

of these rules depends of course on the empirical evidence

provided by a previous constituent analysis of the sentences

of the language under investigation. Rewriting rules expand nonterminal symbols u from $V_L - \Sigma$ into strings v from $V_L^*$, which explains their characteristic form $u \rightarrow v$. If some

of the constituents of ν are enclosed in parentheses, then

ν no longer represents a single string, but a number of sub-

strings of the string obtained from ν by deleting all parenthe-

ses. In particular it represents all those substrings of the

latter that share the constituents of ν which are not enclosed

in parentheses. The string '(Det)(A)N', for example, represents the four strings 'N', 'Det N', 'A N', and 'Det A N', all of which include 'N' as a constituent and all of which are substrings of 'Det A N'. Accordingly an expression such as 'NP $\rightarrow$ (Det)(A)N' represents four different rewriting rules which, among other things, are related by the fact that all expand the same symbol 'NP'. In a more recent version of a Phrase Structure Grammar, $V_L$ includes in addition a dummy symbol $\Delta$ which itself can never occur at the left side of an arrow. Only preterminal symbols, however, are allowed

-39-

to expand into $\Delta$. In this version condition 3. of the above definition must be replaced by

3'. P is a set of ordered pairs $\langle u,v\rangle$ with $u\in V_L$ and $v\in V_L^*$, and $u\in\Sigma \leftrightarrow v = \Delta$.
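The parenthesis convention described above, by which '(Det)(A)N' abbreviates four strings, can be spelled out mechanically. The following is a small sketch; the function name `expand_optional` and the encoding of a right-hand side as (symbol, optional) pairs are my own assumptions.

```python
from itertools import product

def expand_optional(symbols):
    """Expand a right-hand side with optional constituents.

    symbols: list of (symbol, optional) pairs, e.g. '(Det)(A)N' is
             [("Det", True), ("A", True), ("N", False)].
    Returns the list of all strings (as tuples) the abbreviation stands for.
    """
    # An optional symbol may be omitted or present; an obligatory one is present.
    choices = [((), (sym,)) if optional else ((sym,),) for sym, optional in symbols]
    return [sum(parts, ()) for parts in product(*choices)]

np_expansions = expand_optional([("Det", True), ("A", True), ("N", False)])
np_rules = [("NP", rhs) for rhs in np_expansions]   # four rewriting rules for NP
```

Every expansion keeps the obligatory constituent N, and each is obtained from 'Det A N' by leaving out constituents, which is the property the later definition of a filter builds on.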

Let me at this point offer a somewhat different characterization of expressions such as 'NP $\rightarrow$ (Det)(A)N'. To this end I will first have to define the relation of being a substring of a string. I shall employ variables u, v, w to range over strings and characterize strings in the usual way as sequences $v = \langle v_k\rangle_{k<n}$ of n constituents $v_k$. With the help of an operation of concatenation, denoted by '$^\frown$', n-place sequences can be introduced inductively by requiring that $\langle u\rangle = u$ for one-place sequences u and that, for all natural numbers n, $\langle v_k\rangle_{k<n} = \langle v_k\rangle_{k<n-1}{}^\frown\langle v_{n-1}\rangle$. The zero-place sequence, also called 'empty sequence', 'blank', or 'the sequence that has no constituents', will be referred to by the symbol '$\emptyset$'. We require finally that $v^\frown\emptyset = \emptyset^\frown v = v$ for all n-place sequences v. The relation of being a substring of an n-place string v can now be characterized for arbitrary v by the following three statements:

1. The empty string $\emptyset$ is a substring of v;

2. Every constituent $v_k$ of v is a substring of v;

- 4 0 -

3. Whenever, for natural numbers $k<l<n$ and strings u and w, both $u^\frown v_k$ and $v_l{}^\frown w$ are substrings of v, then $u^\frown v_k{}^\frown v_l{}^\frown w$ is a substring of v.

In 3. we assume that $v_k$ and $v_l$ are constituents of v and that $v_k$ precedes $v_l$ (not necessarily immediately). Then we demand that the result of concatenating a substring of v beginning with $v_l$ to the right of a substring of v ending with $v_k$ be also a substring of v. Notice that $\emptyset$, although a substring of v by 1., is not a constituent of v; $v_k$ and $v_l$, on the other hand, are constituents of v and hence, by 2., also substrings. This means that in 3. the empty string can replace u and w but not $v_k$ and $v_l$. It follows from 1., 2., and 3. that in particular v is a substring of itself.
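On my reading, the three statements above generate exactly the order-preserving selections of constituents of v, contiguous or not (so 'Det N' counts as a substring of 'Det A N'). The sketch below enumerates them under that assumption; pairing every constituent with its position is my own encoding choice, and the function name `substrings` is hypothetical.

```python
from itertools import combinations

def substrings(v):
    """All substrings of v in the sense of statements 1.-3.

    Each substring is a tuple of (position, constituent) pairs in their
    original order; the empty tuple stands for the empty string.
    """
    indexed = list(enumerate(v))
    return {combo
            for size in range(len(v) + 1)
            for combo in combinations(indexed, size)}

S_v = substrings(("Det", "A", "N"))
```

An n-place string thus has 2^n substrings, one for every set of positions, which already hints at the algebraic structure of the set of all substrings of a given string.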

For what follows it is essential that the constituents of

an η-place sequence be considered pairwise different. Thus two

occurrences of the same constituent will be considered different

by virtue of the different places they occupy in relation to

other constituents as indicated by different indices. The relation of being a substring then defines a partial ordering on the set $S_v$ of all substrings of a given string v. It is reflexive, meaning that every string is a substring of itself, antisymmetric, meaning that if u is a substring of w and u and w are different, then w cannot be a substring of u, and transitive,

- 4 1 -

meaning that if u is a substring of v' and v' is a substring of w, then u is a substring of w. $S_v$ includes a maximal element, namely v itself, of which all other strings in $S_v$ are substrings, and it includes a minimal element, namely $\emptyset$, which is a substring of every string in $S_v$. If u and w are any two strings in $S_v$, then the greatest lower bound of u and w, abbreviated 'g.l.b.(u,w)', is the unique string in $S_v$ which is a substring of both u and w and of which all other common substrings of u and w are substrings. The least upper bound of u and w, abbreviated 'l.u.b.(u,w)', is the unique string in $S_v$ of which both u and w are substrings and which is itself a substring of all other strings in $S_v$ sharing this particular property. A little consideration will reveal that if u is a substring of w, then g.l.b.(u,w) = u and l.u.b.(u,w) = w. Notice that without the requirement that the constituents of an n-place string be considered pairwise different both g.l.b. and l.u.b. are no longer uniquely defined. For a string $u^\frown w^\frown u$, for example, we would no longer be able to determine l.u.b.(u,w); neither would we be able to determine g.l.b. for two different substrings $u^\frown w$ and $w^\frown u$ of a string v.

Two operations, denoted by '+' and '·', for elements of $S_v$ can now be defined in terms of the relation of being a substring of a string. We require that, for any two elements u and w from $S_v$, u+w = l.u.b.(u,w) and u·w = g.l.b.(u,w). In addition

-42-

we demand that, for every u in $S_v$, -u be the unique string such that u+(-u) = v and u·(-u) = $\emptyset$. Anyone who is interested in formal details can easily verify that the set $S_v$ together with the two-place operations + and · and the one-place operation - (read: the complement of) constitutes a Boolean Algebra and in the mathematical terminology would be called 'a system $\langle S_v, sub\rangle$', if 'sub' denotes the relation of being a substring of a string.
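If every substring is identified with the set of positions it occupies in v (an encoding I assume here for illustration), the three operations reduce to union, intersection, and set difference. The names `join`, `meet`, and `complement` are hypothetical labels for '+', '·', and '-'.

```python
# Positions 0, 1, 2 of the string Det A N; v itself occupies all of them.
v = frozenset({0, 1, 2})

def join(u, w):
    """u + w = l.u.b.(u, w): the smallest substring containing both."""
    return u | w

def meet(u, w):
    """u · w = g.l.b.(u, w): the largest common substring."""
    return u & w

def complement(u, top=v):
    """-u: the unique substring with u + (-u) = v and u · (-u) = empty."""
    return top - u

u = frozenset({0, 2})   # Det N
w = frozenset({1, 2})   # A N
```

The requirement that constituents be pairwise different is what makes this encoding possible: two occurrences of the same symbol are distinct positions, so bounds are always unique.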

Naturally we are not always interested in all the substrings of a given string v. Recall that '(Det)(A)N', for example, refers to any one of four different substrings of $Det^\frown A^\frown N$, excluding others such as Det, A, or $Det^\frown A$. The question then is: How can a subset of $S_v$ be defined which includes only those substrings of v that share certain constituents? Incidentally sets of this kind have been known for a long time and were introduced in the mathematical literature under the name 'filter'. Filters are subsets of Boolean Algebras satisfying certain requirements. For our purposes these requirements can be stated as follows. Let sub be the relation of being a substring and let $\langle S_v, sub\rangle$ be a Boolean Algebra as described. Then a subset F of $S_v$ is called 'a filter on v', if

1. F includes with every string u all the strings from $S_v$ of which u is a substring;

-43-

2. F includes with any two strings their greatest lower bound, and

3. F includes v, but not $\emptyset$.

If F includes a minimal element w, then we say that F is generated from w.

Now let $v = Det^\frown A^\frown N$. We reinterpret '(Det)(A)N' to denote the set of all those substrings of v that share the same constituent N. Obviously (Det)(A)N is completely included in $S_v$. Since it satisfies moreover the conditions 1., 2., and 3. in the above definition, it is a filter, specifically a filter on $Det^\frown A^\frown N$ generated from N.
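The claim that (Det)(A)N is a filter generated from N can be checked by brute force. This sketch again assumes that substrings are encoded as sets of positions (Det = 0, A = 1, N = 2); the function names are hypothetical.

```python
from itertools import combinations

def all_substrings(n):
    """S_v for an n-place string, each substring given as a set of positions."""
    return {frozenset(c) for size in range(n + 1)
            for c in combinations(range(n), size)}

def is_filter(F, n):
    """Check conditions 1.-3. of the definition of a filter on v."""
    S = all_substrings(n)
    top, bottom = frozenset(range(n)), frozenset()
    upward_closed = all(w in F for u in F for w in S if u <= w)   # condition 1
    closed_under_glb = all((u & w) in F for u in F for w in F)    # condition 2
    proper = top in F and bottom not in F                         # condition 3
    return upward_closed and closed_under_glb and proper

def generated_filter(w, n):
    """The filter generated from w: all substrings of v of which w is a substring."""
    return {u for u in all_substrings(n) if w <= u}

det_a_n_filter = generated_filter(frozenset({2}), 3)   # generated from N
not_a_filter = {frozenset({0}), frozenset({0, 1, 2})}  # lacks Det A and Det N
```

The generated set has four members, corresponding to 'N', 'Det N', 'A N', and 'Det A N'.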

There are two reasons for introducing the concept of a filter. First of all it will help us gain some theoretical insights concerning the nature of transformational systems as models for disambiguated languages, which secondly will help us extend the transformational theory and its possibilities to some degree. The latter will be accomplished via a generalization of the system of a Phrase Structure Grammar with the help of a particular kind of filters which I have called 'rigid filters'. These are filters defined over trees. Needless to say, the concept of a filter as introduced here differs from the Chomskyan concept in that Chomsky's filters are closer to sets of transformations

-44-

satisfying certain ordering conditions than to Boolean Algebras. I shall now proceed by first using the relation of a substring to define an equivalent relation for trees. To this end I will have to discuss Phrase Structure Trees in more detail as well as the use of subcategorization conditions to analyse Phrase Structure Trees into equivalence classes. The discussion will be based on a model of a Phrase Structure Grammar satisfying condition 3'. (see p. ) rather than 3.; hence preterminal symbols will expand into Δ.

Let P be a set of productions as before and let $\{p_u\}$ be the family of all those rewriting rules in P that expand the same symbol 'u'. I shall call $p_u$, for all $u\in V_L$, 'a rewriting rule of type u'. Now let $p_u = \langle u,v\rangle$ and $p_{u'} = \langle u',v'\rangle$ be two productions of type u and u' respectively. We define that for all $u,u'\in V_L$ such that $p_u$ and $p_{u'}$ are productions as specified:

1. $p_{u'}$ is immediately under $p_u$ if and only if u' is a substring of v;

2. $p_u$ is under $p_{u'}$ if and only if there are $k<n$ and $p_{u_k}$ such that $p_{u_0} = p_u$, $p_{u_{n-1}} = p_{u'}$, and $p_{u_k}$ is immediately under $p_{u_{k+1}}$ for all $k<n-1$.

-45-

It follows that if $p_{u'}$ is immediately under $p_u$, then it is under $p_u$, but not vice versa. Productions of the same type u can now be further classified by requiring that for all $u,u'\in V_L$ such that $p_u$ and $p_{u'}$ are productions as specified:

$p_u$ and $p_{u'}$ be neighbors of type u, if and only if $u = u'$ and v' is a substring of v.

Thus every production $p_u$ is in particular a neighbor of itself. Before we can use the above preliminary definitions for a classification of trees, however, we must determine what a tree is and how it can be characterized.

It follows from condition 4. of the definition of a Phrase Structure Grammar that Phrase Structure Grammars generate only trees dominated by S. Since every tree will be constructed by applying a fixed number of productions, one may be tempted to characterize trees in terms of sets T of rewriting rules by requiring that 1. $p_S$, or the rule expanding S, is in T and 2. whenever $p_u$ is in T and $p_{u'}$ is immediately under $p_u$, then $p_{u'}$ is also in T. Notice, however, that by 2. T would include with every $p_{u'}$ also every neighbor of $p_{u'}$. We may thus specify in a third condition that 3. T includes with every production $p_u$ at most as many neighbors of type u' as there are occurrences of u' in the second constituent v of $p_u$. But there is still

- 4 6 -

no way of telling which $p_{u'}$ is to expand which occurrence of u' in v. Moreover, if we allow for the existence of recursive rules, then none of the preceding conditions will even specify which production is to start a derivation. It seems therefore that unordered sets of productions may determine classes of trees. In order to uniquely characterize single

trees, however, we must somehow specify the order in which

the productions are supposed to apply. The following definition

will characterize trees as ordered sequences or embeddings

of productions. There the set Τ is no longer a set of pro-

ductions but will instead be introduced as a set of trees

for a language L. Let '$t_u$' for all $u\in V_L$ denote a tree dominated by u; I shall call such a tree 'a tree of type u'. Then T can be characterized recursively by requiring that

1. $t_\Delta = \Delta$ is in T;

2. whenever $\langle t_{v_k}\rangle_{k<n}$ is a sequence of trees of type $v_k$ and $\langle u, \langle v_k\rangle_{k<n}\rangle$ is a production in P, then $\langle u, \langle t_{v_k}\rangle_{k<n}\rangle$ is in T.
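The recursive characterization of T translates directly into a bounded enumeration. The toy grammar below, the depth limit, and the function name `trees_of_type` are assumptions for illustration; the grammar follows condition 3', so preterminal symbols expand into the dummy symbol Δ.

```python
from itertools import product

DELTA = "Δ"

# Hypothetical toy productions <u, v>, with preterminals expanding into Δ.
P = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("NP", ("N",)),
    ("VP", ("V",)),
    ("Det", (DELTA,)),
    ("N", (DELTA,)),
    ("V", (DELTA,)),
]

def trees_of_type(u, productions, depth):
    """All trees of type u built with at most `depth` nested productions.

    Clause 1: the tree of type Δ is Δ itself.
    Clause 2: <u, <t_0, ..., t_{n-1}>> is a tree whenever <u, <v_0, ..., v_{n-1}>>
              is a production and each t_k is a tree of type v_k.
    """
    if u == DELTA:
        return {DELTA}
    if depth == 0:
        return set()
    result = set()
    for lhs, rhs in productions:
        if lhs != u:
            continue
        subtree_sets = [trees_of_type(v_k, productions, depth - 1) for v_k in rhs]
        for subtrees in product(*subtree_sets):
            result.add((u, subtrees))
    return result

T_S = trees_of_type("S", P, depth=3)
T_NP = trees_of_type("NP", P, depth=2)
```

The enumeration yields trees of every type, not only those dominated by S, and treats Δ as a tree, which are the two respects in which the definition goes beyond a Phrase Structure Grammar.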

The definition exceeds the limits set by the definition of

a Phrase Structure Grammar in two different ways. Firstly the

trees dominated by S constitute only a subclass of T, and

secondly Δ is a tree. Moreover it appears to suggest thirdly

that trees be generated in a bottom to top fashion. The latter

- 4 7 -

is actually not the case as long as we don't reinterpret P to be a function from $V_L^*$ onto $V_L$. In this interpretation the elements of P would be ordered pairs $\langle v,u\rangle$ with $v\in V_L^*$ and $u\in V_L$. Intuitively this corresponds to a reversal of the direction of the arrows in rewriting rules, such that $u \rightarrow v$ becomes $v \rightarrow u$ for all ordered pairs p. There is of course no a priori condition specifying that trees be generated from top to bottom. Notice, however, that, no matter which interpretation we prefer, the above definition has as yet no implications for the transformational theory as a whole. It precisely characterizes all the trees generated by a Phrase Structure Grammar. Let $T_u$, for all $u\in V_L$, be the set of all trees in T that are of type u; then $T_S \subset T$ includes moreover only the trees generated by a Phrase Structure Grammar. Let us examine the $T_u$ and their elements a little closer.

The characterization of $T$ indicates that, $t_\Delta$ excepted, there is for every tree of type $u$ at least one corresponding production $p_u$ of the same type. This will actually help us transfer the neighbor relation, which was introduced for productions, to trees. Let, for all $u \in V_L$ and $v \in V_L^*$, $p_u = \langle u,v \rangle$ be a production of type $u$ and let $t_{u'} = \langle u', \langle t_{v_k} \rangle_{k<n} \rangle$ be a tree of type $u'$. If we then call $p_u$ 'associated with $t_{u'}$', if and only if $u = u'$ and $v = \langle v_k \rangle_{k<n}$, we can subsequently define that

two trees are neighbors, iff their associated operations are neighbors.

Stated as it is, however, the neighbor relation remains rather general and it should turn out beneficial if we could reduce it to a partial ordering. Thus let $p_u$ and $p'_u$ be neighbors. We say that

$p_u$ is lower than $p'_u$ and, conversely, $p'_u$ is higher than $p_u$, if and only if the second constituent $v$ of $p_u$ is a substring of the second constituent $v'$ of $p'_u$.

The relation is lower than (is higher than) can be immediately transferred to trees by demanding for any two tree neighbors $t_u$ and $t'_u$:

$t_u$ is lower than $t'_u$ and, conversely, $t'_u$ is higher than $t_u$, iff the production associated with $t_u$ is lower than the production associated with $t'_u$.

It follows that every tree (and every production) is in particular lower than itself.

While the preceding definitions help to somewhat reduce the generality of the neighbor concept by allowing us to distinguish lower from higher neighbors, they still leave us with too little to work on. Yet greater precision can only be reached with the help of the auxiliary concept of a subtree, which shall be defined for every tree $t_u$ of type $u$ by the following two statements:

1. Every $t_u$ is a subtree of itself;

2. whenever $t_{u'} = \langle u', \langle t_{v_k} \rangle_{k<n} \rangle$ is a subtree of $t_u$, then so is $t_{v_k}$ for all $k<n$.

Each subtree of a tree $t_u$ will be assigned a natural number $n$ as a degree relative to $t_u$. We define that, for all $u \in V_L$ and for all $t_u$,

1. $t_u$ is of degree 0 relative to itself;

2. whenever $t_{u'} = \langle u', \langle t_{v_k} \rangle_k \rangle$ is a subtree of $t_u$ of degree $n$ relative to $t_u$, then $t_{v_k}$ is of degree $n+1$ relative to $t_u$.

Let me use the symbol '$t^n_u$' to refer to any subtree of $t_u$ of degree $n$. Notice that $t^n_u$ actually has the status of a variable whose range is determined by the subscript 'u' and the superscript 'n' together. Now if $t^n_u = \langle u', \langle t_{v_k} \rangle_k \rangle$, then it is a subtree of $t_u$ of type $u'$, as the first constituent reveals, and I shall call $t_{v_k}$, for all $t^n_u = \langle u', \langle t_{v_k} \rangle_k \rangle$ and for all $k$, 'the k-th immediate subtree of $t_{u'}$'. Relative to $t_{u'}$, then, each of its immediate subtrees is of degree one, and this is true for every subtree $t_{u'}$ of $t_u$. Notice furthermore that the degree of a tree is not the same as what has otherwise been called its rank. The rank of a tree $t_u$ would rather be the largest number $n$ such that $t^n_u$ is a subtree of $t_u$, provided there is such a subtree.
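The two recursive definitions of subtree and degree, together with the remark on rank, can be sketched as follows. This is only an illustration: the example tree and the encoding of trees as (symbol, tuple of immediate subtrees) are assumptions, and rank is read here as the largest degree that any subtree reaches.

```python
DELTA = "Δ"  # the dummy terminal tree

def subtrees(tree, degree=0):
    """Yield (degree, subtree) for every subtree occurrence, the tree itself first."""
    yield degree, tree                    # a tree is of degree 0 relative to itself
    if tree != DELTA:
        for child in tree[1]:             # immediate subtrees are one degree deeper
            yield from subtrees(child, degree + 1)

def rank(tree):
    """Largest n such that the tree has a subtree of degree n (assumed reading)."""
    return max(degree for degree, _ in subtrees(tree))

# Hypothetical example tree: [S [NP [N Δ]] [VP [V Δ] [NP [N Δ]]]]
NP = ("NP", (("N", (DELTA,)),))
S = ("S", (NP, ("VP", (("V", (DELTA,)), NP))))

degree_one = [t[0] for d, t in subtrees(S) if d == 1]
print(degree_one, rank(S))
```

Note that the same tree NP is counted twice, once at degree 1 and once at degree 2, which is the situation that later motivates treating different occurrences of a tree as different trees.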

We can now further narrow down the neighbor relation to what I shall call 'a strict neighbor' or, more exactly, 'a strict lower (or higher) neighbor of a tree'. Thus let $t_u$ and $t_{u'}$ again be trees and neighbors. Then

$t_u$ is a strict lower neighbor of $t_{u'}$, if and only if

1. there is a higher neighbor $t^m_{u'}$ for every $t^n_u \neq \Delta$ and

2. whenever $t^m_{u'}$ is a higher neighbor of $t^n_u$ and $t^{n+1}_u \neq \Delta$ is a subtree of $t^n_u$, then the higher neighbor of $t^{n+1}_u$ is a subtree of $t^m_{u'}$ of degree $m+1$.

Here it suffices to require that $t^n_u$ and $t^{n+1}_u$ be $\neq \Delta$, since for $t^m_{u'}$ and at least one of its immediate subtrees this follows from the fact that they are neighbors of the former.

Recall that the neighbor relation has not yet been defined for Δ. We will do this now, however, by requiring for two occurrences $\Delta_i$ and $\Delta_j$ of Δ in respectively two trees $t_u$ and $t_{u'}$ that

1. $\Delta_i$ and $\Delta_j$ are neighbors, if and only if they are immediate subtrees of respectively two neighbors;

2. $\Delta_i$ is a lower (higher) neighbor of $\Delta_j$, if and only if $\Delta_i$ is a neighbor of $\Delta_j$.

The definition of a strict lower neighbor of a tree can then be restated as follows. Let $t_u$ and $t_{u'}$ be trees and neighbors:

$t_u$ is a strict lower neighbor of $t_{u'}$, if and only if 1. every $t^n_u$ is a lower neighbor of a $t^m_{u'}$ and 2. whenever $t^n_u$ is a lower neighbor of $t^m_{u'}$ and $t^{n+1}_u$ is a subtree of $t^n_u$, then $t^{n+1}_u$ is a lower neighbor of a subtree of $t^m_{u'}$ of degree $m+1$.

For the definition of the reverse relation of a strict higher neighbor let $t_u$ and $t_{u'}$ be as above, then

$t_u$ is a strict higher neighbor or a context of $t_{u'}$, if and only if $t_{u'}$ is a strict lower neighbor of $t_u$.

It follows that every tree is in particular a context of itself. Furthermore if $t_{u'}$ is a context of $t_u$, then every context of $t_{u'}$ is also a context of $t_u$. Notice that, while every tree is a context of all of its subtrees, it does not generally follow from the fact that $t_{u'}$ is a context of $t_u$ that $t_u$ is a subtree of $t_{u'}$. In fact the set of all possible contexts of a tree $t_u$ includes the set of all trees, of which it is also a subtree, only as a proper subset. Thus the definition provides, for the first time, an exact characterization of what has been called 'grammatical context' initially. Let me illustrate the difference between subtrees and strict lower neighbors by the following nine examples:

In the above configurations the relevant trees are encircled and enumerated from 1a, 2a, and 3a, to 1c, 2c, and 3c respectively. Now 1a is a possible context of 3a and 2b is a possible context of 1b. In both cases, however, the latter doesn't occur as a subtree in the former. On the other hand, 2c is a possible context of both 1c and 3c. But while 1c occurs also as subtree of 2c, 3c does not. Notice that 2a is not a possible context of 1a because condition 2. of the definition of contexts is violated. Thus, while every subtree of 1a has a higher neighbor in 2a, the higher neighbor of Δ is not an immediate subtree of the higher neighbor of 1a itself.

It seems that, for reasons of completeness, we need something that fits into every context, a minimal element for contexts or an empty tree. Certainly Δ will not qualify for such a tree since later on we will have to be able to replace Δ with objects of various properties depending in part on the symbol dominating it immediately. I shall introduce the empty tree axiomatically by the following three statements, and eventually use the blank to represent it:

1. The empty tree is a tree;

2. The empty tree is a subtree of every tree;

3. Every tree is a context of the empty tree.

The use of the term 'context' suggests finally that, instead of calling a tree $t_u$ 'a strict lower neighbor of another tree $t_{u'}$', we simply say that $t_u$ is in $t_{u'}$.

In what follows we will concern ourselves exclusively

with trees of finite rank or finite contexts in the termi-

nology preferred here. Then every tree can of course be a

context of different trees. Again we shall employ a technique

that was introduced in the previous discussion of substrings,

and treat different occurrences of a tree in the same context

like different trees. Let me illustrate the motivation for

such a policy by using an analogy. In a sentence of the form 'x loves y', 'x' and 'y' may be different names for the same individual. Yet in the context set up by the proposition that x loves y it assumes different functions. Thus both syntactically and semantically we must distinguish the subject from the direct object. While relative to each individual level both may be identical in the conventional meaning of the word, we cannot treat them alike. This means that, relative to a

given context, identity entails identity of function as ex-

pressed here by identical sub- and superscripts. Where we

have stated identity as a condition in previous definitions,

it was mostly identity outside the particular contexts

provided by trees, i.e. identity of what remains after all

subscripts and superscripts have been deleted. The distinc-

tion appears to be pointing towards the Fregean sense-

reference distinction, rather than to a distinction between

extension and intension; however, the problem shall be

discussed at a different place. On the understanding, then,

that identity is indeed a relative concept, the relation of

being a context of a tree is a partial ordering for trees.

It is reflexive (every tree is a context for itself) , anti-

symmetric (there is only one context among all the possible

contexts of a tree, namely the tree itself, of which that

tree can be a context), and transitive (whenever something is

a context of a context of a tree, then it is a context of that

tree as well).

Now let $C_u$ be a set of contexts of $t_u$, for all $u \in V_L$, and let $c_u \in C_u$. Let in particular $c^n_u$ be a context of $t_u$ of degree $n$, i.e. a context in which $t_u$ is of the n-th degree. Then $c^0_u = t_u$ is the minimal element of $C_u$ and the superscript '0' correctly indicates that the function of $t_u$ in $c^0_u$ is not determined by any other context but $t_u$ itself. In general, however, the function of $t_u$ in $c^n_u$ is not uniquely determined by its degree. This is the reason why we have to introduce traces as additional parameters. That the concept of a trace needed here is different from the concept after which trace theory was named will be clear from the following definition:

A trace to $t_u$ in $c^n_u$ is a sequence $s = \langle u_k \rangle_{k \le n}$ of symbols $u_k \in V_L$ for which there are trees $t_{u_k}$ in $c^n_u$ satisfying the conditions that for all $k<n$:

1. $t_{u_k}$ is in $c^n_u$ and

2. $t_{u_{k+1}}$ is an immediate subtree of $t_{u_k}$ and

3. $u_n = u$ and $t_{u_n} = t_u$.

A sequence $s$ is a cut through $c^n_u$, if and only if $s$ is a trace to $t_u$ in $c^n_u$ and $t_u = \Delta$.

It is obvious that in a tree dominated by $u$ there is exactly one way for each subordinated node to be reached starting from $u$. Yet in order to secure that there be exactly one trace to each subtree, so it seems, we must commit ourselves among other things to generate indirect- and direct object NPs under respectively two different nodes VP of which the former would dominate the latter. Of course there is also the possibility of taking into account the linear arrangement of those trees in a context that are immediate subtrees of one and the same tree. Thus function would be determined by the three parameters degree, trace, and linear order. Since the degree of $t_u$ in a context $c_u$ is always one less than the length $|s|$ of the trace $s$ leading to $t_u$ in $c_u$, we could replace the expressions '$c^n_u$' by expressions '$c^{\langle s,k \rangle}_u$' which would unambiguously characterize a context of $t_u$ in which $t_u$ is of degree $|s| - 1$ and occurs as the k-th immediate subtree of a tree $t_{u'}$, where $u'$ is the next to last constituent in $s$. Moreover, since $u$ always occurs as the last constituent of a trace $s$ leading to $t_u$, the index $u$ in '$c^{\langle s,k \rangle}_u$' is actually redundant and we can replace '$c^{\langle s,k \rangle}_u$' by '$c_{\langle s,k \rangle}$'. In this notation, then, $t_u$ would be the same as $c^{\langle u \rangle}_u = c_{\langle u \rangle} = c_u$. Here $s = u$ is a one-place sequence, hence $|u| = 1$, hence $|u| - 1 = 0$. Thus the name '$c_u$' correctly informs us that $t_u$ is of degree 0 in $c_u$ and, consequently, as also indicated by the missing second constituent in the index of $c$, subtree to no other tree except $t_u$ itself.
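Traces and cuts can be computed directly once trees are given a concrete encoding. The sketch below assumes the encoding (symbol, tuple of immediate subtrees) and a hypothetical example context; a trace is read as the sequence of symbols on the path from the dominating symbol down to the subtree in question, so that its length is the degree plus one.

```python
DELTA = "Δ"  # the dummy terminal tree

def label(tree):
    """Symbol dominating a tree; Δ is its own label."""
    return DELTA if tree == DELTA else tree[0]

def traces(tree, prefix=()):
    """Yield (trace, subtree) for each subtree occurrence of `tree`."""
    trace = prefix + (label(tree),)
    yield trace, tree
    if tree != DELTA:
        for child in tree[1]:
            yield from traces(child, trace)

def traces_to(context, symbol):
    """All traces in `context` that end in `symbol`."""
    return [s for s, _ in traces(context) if s[-1] == symbol]

def cuts(context):
    """A cut is a trace leading to an occurrence of Δ."""
    return traces_to(context, DELTA)

# Hypothetical context: [S [NP [N Δ]] [VP [V Δ] [NP [N Δ]]]]
NP = ("NP", (("N", (DELTA,)),))
context = ("S", (NP, ("VP", (("V", (DELTA,)), NP))))

np_traces = traces_to(context, "NP")
print(np_traces)                         # the degree of each NP is len(trace) - 1
print(len(cuts(context)))
```

In this context the two NPs are told apart by their traces alone. With two NPs under the same VP the traces would coincide, which is the case the text addresses by bringing in linear order.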

The choice of natural numbers as second constituents of indices of contexts, however, is problematic for two reasons. Firstly $k$ describes the place of $t_u$ in a segment of subtrees which may be relatively small in comparison with a complete context. Secondly it is not always possible to determine function in terms of a fixed number $k$. In German, for example, a direct object may precede or follow an indirect object and, depending on which is the case, the value of $k$ in the index assigned to a context of the $t_{NP}$ dominating either one of them may vary. It is therefore desirable to determine the position of a subtree in a context relative to the position of other trees in that context to which it does not itself bear the relation of being a subtree. In order to be able to do that, however, we need to introduce yet another concept, namely the concept of a segment of a tree:

A segment of a tree $t_u$ is a sequence $\langle t_{u_k} \rangle_{k<n}$ of trees such that for all $k<n$

1. $t_{u_k}$ is an occurrence of a tree in $t_u$ and

2. neither of any two of these occurrences is a subtree of the other one in $t_u$.

If $s = \langle t_{u_k} \rangle_{k<n}$ is a segment of a tree $t_u$, then the sequence $\langle u_k \rangle_{k<n}$ will be called 'the upper rim of $s$'.

Clearly the upper rim of a segment of $t_u$ is a sequence of end points of a number of traces, none of which is a substring of any of the other ones. Now if $c^n_u$ is again a context of $t_u$ of degree $n$, then the position of $t_u$ in $c^n_u$ can be uniquely determined by intersecting a trace $s$ to $t_u$ with an upper rim $r$ of a segment of $c^n_u$ which includes $t_u$ as a constituent. Moreover the pair $\langle s,r \rangle$, where $s$ and $r$ are as described, unambiguously characterizes the class of all contexts of $t_u$ that share both $s$ and a segment with upper rim $r$. I shall call $\langle s,r \rangle$ 'a subcategorization condition for $t_u$'. In accordance with what has been said for traces, $\langle s,r \rangle$ determines all those contexts of $t_u$ in which $t_u$ is of degree $|s| - 1$. Henceforth we shall use subcategorization conditions to analyze the class $C_u$ of all possible contexts of a tree $t_u$ into mutually exclusive equivalence classes. Then $\langle u,u \rangle$ marks the class including $t_u$ as its only element. Since $u$ is a one-place sequence, the condition $\langle u,u \rangle$ correctly predicts that $t_u$ is of degree $|u| - 1 = 0$ with respect to itself.

Let G be a Phrase Structure Grammar and let $s$ and $r$ be sequences of elements from $V_L$. Then $\langle s,r \rangle$ is a subcategorization condition for $t_u$, if and only if there is a context $c$ of $t_u$, generated by G, such that 1. $s$ is a trace leading to $t_u$ in $c$ and 2. $r$ is the upper rim of a segment of $c$ including $t_u$ as a constituent.
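The way a trace and an upper rim combine into a subcategorization condition can be made concrete in a short sketch. Everything about the encoding is an assumption made for illustration: occurrences of subtrees are identified by paths of child positions, a sequence of occurrences counts as a segment when no path is a prefix of another, and the example context is hypothetical.

```python
DELTA = "Δ"  # the dummy terminal tree

def label(tree):
    return DELTA if tree == DELTA else tree[0]

def subtree_at(tree, path):
    """Follow a path of child positions down from the root."""
    for k in path:
        tree = tree[1][k]
    return tree

def trace_to(tree, path):
    """Symbols met on the way from the root to the occurrence at `path`."""
    symbols = [label(tree)]
    for k in path:
        tree = tree[1][k]
        symbols.append(label(tree))
    return tuple(symbols)

def is_segment(paths):
    """No occurrence may be a subtree of another one in the sequence."""
    return not any(i != j and p == q[:len(p)]
                   for i, p in enumerate(paths) for j, q in enumerate(paths))

def upper_rim(tree, paths):
    return tuple(label(subtree_at(tree, p)) for p in paths)

def subcat_condition(tree, target, paths):
    """Pair <s, r>: trace to `target`, upper rim of a segment containing it."""
    if target not in paths or not is_segment(paths):
        raise ValueError("paths must form a segment that includes the target")
    return trace_to(tree, target), upper_rim(tree, paths)

# Hypothetical context: [S [NP [N Δ]] [VP [V Δ] [NP [N Δ]]]]
NP = ("NP", (("N", (DELTA,)),))
context = ("S", (NP, ("VP", (("V", (DELTA,)), NP))))

condition = subcat_condition(context, (1, 0), [(0,), (1, 0), (1, 1)])
print(condition)
```

For the verb position the sketch yields the pair of the trace S, VP, V and the rim NP, V, NP, which has the same shape as a condition for a transitive verb.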

The name 'subcategorization condition' is of course not chosen arbitrarily. In fact the subcategorization conditions used in the transformational literature constitute a proper subclass of the ones introduced here. There subcategorization conditions were introduced for objects replacing Δ only, hence for a class of subtrees of the trees generated by a Phrase Structure Grammar, if we take Δ to be a variable ranging over complex symbols. In order to demonstrate the correspondence, let me briefly discuss the subcategorization condition <<S,VP,V>,<NP,V,NP>> for an arbitrary transitive verb, which represents the intersection of a trace (dotted) with the upper rim (dashed) of a segment as illustrated below

[Figure: a tree dominated by S in which the trace <S,VP,V> intersects the upper rim <NP,V,NP> of a segment.]

Translated into the transformational notation <<S,VP,V>,<NP,V,NP>> would be represented roughly in form of a pair <[+V], $[_S$ NP $[_{VP}$ __ NP]]>. If we omit the redundant information, i.e. the information predictable by the way we set up our Phrase Structure Grammar, then the two versions can be reduced to <V,<V,NP>> and <[+V],[__NP]> respectively. It shows that in the redundant form of the latter version the labels of the brackets, if read from left to right, represent a trace to V, while the symbols enclosed in brackets, also read from left to right, represent the upper rim of a segment of $t_S$, minus, in both cases, the node at which they intersect, which is mentioned separately in the first constituent. The subcategorization conditions in Bowers 197 differ in that - in the terminology proposed here - some constituents of an upper rim of a segment can be replaced by the complete segment they are to represent. An example is the sequence <V, $[_S$ NP to VP]> which would actually represent a segment in the subcategorization condition of seem: __ $[_S$ NP to VP]; the trace belonging to it would be <S,VP,V>. Here such a segment would be represented as <V,<<S,<NP to VP>>,<NP to VP>>>, where the second constituent is a sub-subcategorization condition all by itself, namely the one of <NP to VP> in $t_S$. Thus the complete subcategorization condition for seem would have the form <s,<r',<s'',r''>>>.

At this point a few remarks concerning the general direction of this discussion seem to be in order. Having called Δ a variable ranging over complex symbols, and having required by definition that every Δ represent a tree, implies that complex symbols actually represent terminal trees in a transformational model. However, the previous observations about subcategorization conditions lead to yet another drastic change in our view concerning the theoretical nature of terminal trees. Since every complex symbol carries a subcategorization condition as part of its name, complex symbols can actually be taken to represent predicates ranging over just those trees into which they can be inserted. Thus lexical insertion can be described quite simply as a process involving the λ-operator, in which a class is being applied to an object, in this case a tree. Insertion takes place, if and only if the tree, into which we want to insert a complex symbol cs, is a member of the class determined by the subcategorization condition occurring in cs. An expression consisting of a tree dominating cs, then, asserts that that tree is a member of the extension of cs. That the same variable Δ can be used to represent different trees in a terminal segment follows from the fact that the range of each particular occurrence of Δ is uniquely determined by the context in which it stands. This context differs for each Δ in at least one quality from the context of every other Δ: Even though we seem to be talking about the same Phrase Structure Tree $t$, every Δ is immediately dominated by a different subtree of $t$ or, to be exact, by a different occurrence of a subtree of $t$. Again we find ourselves arguing at a theoretical level which appears to be very close to the Fregean level of sense and which is not covered by the extension-intension distinction.


That it should be possible to treat complex symbols as

predicates ranging over trees may come as a surprise and,

indeed, seems somewhat unnatural. We will see in a later

chapter that such an interpretation, even though plausible

for the moment, is by no means forced upon us. There the

concept of a complex symbol will be replaced by the concept

of an expression, which will subsequently enable us to

choose a more natural approach and have contexts behave like

predicates ranging over expressions. The characterization

of expressions offered, and in particular the characteriza-

tion of basic expressions, will, however, depart consider-

ably from current linguistic views. For now let me talk about

a subclassification of trees some more and the relations

different types (i.e. classes) of trees bear to each other.

The following description of tree types will make use of

the previously introduced concept of a filter. I shall start

with a close examination of contexts of type <u,u>.

4. Syntactic operations and transformations

The relation of being a context has been found to be a partial ordering for (occurrences of) trees. For Phrase Structure Grammars this raises the question whether a system consisting of 1. a set $T_S$ of finite trees dominated by S - where finite means that their rank does not exceed an upper bound m -, and 2. the relation of being a context, constitutes a Boolean Algebra. If it does, $T_S$ must include a minimal and a maximal element. Since the empty tree has all the characteristics of a minimal element and can be added to $T_S$ quite easily, the question boils down to another question, namely what conditions the productions of a Phrase Structure Grammar must satisfy in order for every finite set of contexts of type <u,u> to include a maximal element, i.e. a tree which is context of every tree in the same set. Apparently all productions of the same type must have a maximal element in this case, i.e. one that is higher neighbor to every production of the same type. In a Phrase Structure Grammar including no other rules expanding VP, except those represented by the scheme VP → V (NP) (NP) (VP), for example, such a maximal element - in this case of type VP - would be the rule VP → V NP NP VP. Now it is certainly possible to design a Phrase Structure Grammar in such a way as to give us the option of not expanding $u \in V_L$ at all. This cannot be accomplished, however, by merely enclosing all the constituents of a string at the right side of an arrow in parentheses. If we did, verb phrases and noun phrases could no longer be distinguished, as they are now, by the requirement that they include respectively at least a verb and a noun as a constituent. Thus in order to secure that, if we do expand VP at all, we must at least expand it into a V, we must include the complete second constituent of the above scheme in parentheses, thereby changing it to VP → (V (NP) (NP) (VP)). Moreover, if $v$ is any string of symbols from $V_L$ including parentheses, then rule schemes would generally have the form $u \rightarrow (v)$, thus stating correctly that any $u \in V_L$ can dominate the empty tree. This is consistent with the characterization of the empty tree, given earlier, as a tree of which every tree is context. We must then add to our grammar the general principle that, for all $u,u' \in V_L$ and trees $t_{u'} \neq \Delta$,

$\langle u, t_{u'} \rangle \rightarrow \langle t_{u'} \rangle$

which allows us to replace any tree having only a single immediate subtree by that immediate subtree, and which I will call 'the first reduction principle'. By this principle, then, a tree of the form 1. will be reduced to 2.:

1. [X Y [S [NP ∅] [VP [V Δ]]]]

2. [X Y [VP [V Δ]]]

If we supplement the first reduction principle by the first strong reduction principle

$\langle u, t_{u'} \rangle \leftrightarrow \langle t_{u'} \rangle$,

then we can reconstruct 1. from 2. whenever the need should arise.
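The effect of the first reduction principle on the configuration just discussed can be sketched in code. The sketch rests on assumptions that are not in the text: trees are encoded as (symbol, tuple of immediate subtrees), a node without immediate subtrees stands for a symbol dominating the empty tree, and such a node is itself treated as empty when the immediate subtrees of its mother are counted. The principle is applied at one chosen node only, since the text presents it as something that may be done, not as an obligatory normal form.

```python
DELTA = "Δ"  # the dummy terminal tree

def is_empty(tree):
    """Assumed reading: a node is empty if it dominates nothing but empty trees."""
    if tree == DELTA:
        return False
    return all(is_empty(child) for child in tree[1])

def reduce_node(tree):
    """First reduction principle at the root: <u, t_u'> -> t_u' if t_u' is not Δ."""
    if tree == DELTA or is_empty(tree):
        return tree
    live = [child for child in tree[1] if not is_empty(child)]
    if len(live) == 1 and live[0] != DELTA:
        return live[0]
    return tree

def apply_at(tree, path, f):
    """Apply f to the subtree reached by a path of child positions."""
    if not path:
        return f(tree)
    symbol, children = tree
    k = path[0]
    return (symbol, children[:k] + (apply_at(children[k], path[1:], f),) + children[k + 1:])

V = ("V", (DELTA,))
VP = ("VP", (V,))
Y = ("Y", (DELTA,))                        # Y is given a Δ only to keep it non-empty
tree_1 = ("X", (Y, ("S", (("NP", ()), VP))))
tree_2 = ("X", (Y, VP))

reduced = apply_at(tree_1, (1,), reduce_node)   # reduce at the S node
print(reduced == tree_2)
```

Applying `reduce_node` to the VP node instead would replace it by its only immediate subtree, the tree dominated by V; applying it to the V node changes nothing, because the single immediate subtree there is Δ.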

Interestingly enough the definition of a Phrase Structure Grammar does not require major changes in order to allow for rewriting rules of the type described. Recall that the set $P$ of rewriting rules was defined as a set of ordered pairs $\langle u,v \rangle$ with first constituent from $V_L$ and second constituent from $V_L^*$. As the set of all finite sequences of elements from $V_L$, $V_L^*$ must clearly include the empty sequence ∅; hence the pair $\langle u, \emptyset \rangle = u \rightarrow \emptyset$ may well be among the elements of $P$. In order to allow for the existence of a maximal element, however, we must add two conditions to the first modification 3'. of condition 3. of the definition of a Phrase Structure Grammar, thus getting

3''. $P$ is a set of productions $p$ such that

a. $p$ is as in 3';

b. all productions of the same type are neighbors;

c. $u \in E \leftrightarrow v = \Delta$;

By c. only preterminal symbols expand into Δ. But recall that Δ is a variable whose range can be made to include the blank as well.
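Rule schemes with parenthesized optional constituents, such as VP → V (NP) (NP) (VP) and VP → (V (NP) (NP) (VP)), abbreviate sets of productions. The sketch below spells out those sets. The encoding of schemes as lists with an `opt` marker is an assumption introduced only for this illustration.

```python
from itertools import product

def opt(*items):
    """Mark a sequence of scheme items as optional (hypothetical helper)."""
    return ("opt", items)

def expand(items):
    """Return the set of second constituents abbreviated by a scheme."""
    alternatives = []
    for item in items:
        if isinstance(item, str):
            alternatives.append([(item,)])          # obligatory symbol
        else:
            _, inner = item                         # optional part: absent or expanded
            alternatives.append([()] + sorted(expand(inner)))
    # Concatenate one choice per position into a single right-hand side.
    return {sum(choice, ()) for choice in product(*alternatives)}

scheme = ["V", opt("NP"), opt("NP"), opt("VP")]
wrapped = [opt("V", opt("NP"), opt("NP"), opt("VP"))]

rhs = expand(scheme)
rhs_wrapped = expand(wrapped)
print(sorted(rhs, key=len))
print(max(rhs, key=len))
```

The longest expansion of the first scheme is the right-hand side of the rule the text names as the maximal element of type VP. The wrapped scheme yields the same right-hand sides plus the empty sequence, so that VP need not be expanded at all, while every non-empty expansion still begins with V.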

Now let $P$ and $T$ be a set of productions and trees of the same type respectively. Let furthermore $P^+$ be like $P$ except that it includes in addition the production $u \rightarrow \emptyset$, and let $T^+$ be the result of adding the empty tree ∅ to $T$. Let finally $S_{P^+}$ be the set of all second constituents $v$ of elements from $P^+$. Then the system <$P^+$, is a higher neighbor of> is a Boolean Algebra if and only if the system <$S_{P^+}$, is a substring of> is a Boolean Algebra. Moreover if, for a given Phrase Structure Grammar, every system <$P^+$, is a higher neighbor of> is a Boolean Algebra, then <$T^+$, is a higher neighbor of> is a Boolean Algebra, and so is the system <$T^+$, is a context of>. As in the case of substrings, three operations +, ·, and - can be defined in terms of the concepts of the greatest lower and the least upper bound. Thus let $A$ be a set and $x,y \in A$; let $\le$ be a partial ordering for $A$ and let $y_{max}$ and $y_{min}$ be the maximal and the minimal element relative to $\le$ respectively. Then $x+y$ = l.u.b.$(x,y)$, $x \cdot y$ = g.l.b.$(x,y)$, and $-x$ is the unique element in $A$ such that $-x+x = y_{max}$ and $-x \cdot x = y_{min}$. Here, as always, I am presupposing that $T^+$ include only trees of finite rank.
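The three operations can be computed for any finite partially ordered set in which the required bounds exist. The following sketch uses a stand-in example that is not taken from the text, namely the divisors of 30 ordered by divisibility, merely because it is a small Boolean Algebra that is easy to check by hand; the function names are likewise hypothetical.

```python
def bounds(A, leq):
    """Return (maximal element, minimal element) of A relative to leq."""
    y_max = next(x for x in A if all(leq(y, x) for y in A))
    y_min = next(x for x in A if all(leq(x, y) for y in A))
    return y_max, y_min

def lub(A, leq, x, y):
    """x + y: the least upper bound of x and y."""
    uppers = [z for z in A if leq(x, z) and leq(y, z)]
    return next(z for z in uppers if all(leq(z, w) for w in uppers))

def glb(A, leq, x, y):
    """x · y: the greatest lower bound of x and y."""
    lowers = [z for z in A if leq(z, x) and leq(z, y)]
    return next(z for z in lowers if all(leq(w, z) for w in lowers))

def complement(A, leq, x):
    """-x: the unique z with z + x = y_max and z · x = y_min."""
    y_max, y_min = bounds(A, leq)
    (z,) = [z for z in A
            if lub(A, leq, x, z) == y_max and glb(A, leq, x, z) == y_min]
    return z

# Stand-in ordering (an assumption): divisors of 30 under "divides".
A = [1, 2, 3, 5, 6, 10, 15, 30]
divides = lambda x, y: y % x == 0

print(lub(A, divides, 6, 10), glb(A, divides, 6, 10), complement(A, divides, 6))
```

The same functions could be applied to a set of right-hand sides ordered by the substring relation, or to trees ordered by the context relation, once those relations are supplied as `leq`; whether the result is a Boolean Algebra then depends on the grammar, as the text points out.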

In practice we will of course have to deal with subsets $P'$ of $P^+$ only. Such a subset is a filter, if and only if the set $S_{P'}$ of all second constituents of elements of $P'$ is a filter with respect to the relation of being a substring of. If moreover, for a given Phrase Structure Grammar, every set $P_u$ of productions of the same type is a filter, then there are subsets $T'$ and $T''$ of $T^+$ such that $T'$ and $T''$ are filters with respect to the relations of being a higher neighbor of and being a context of respectively. Since contexts are strict higher neighbors, I shall call $T''$ a strict or rigid filter.

Now let $B$ be a subset of $C_{\langle u,u \rangle}$ satisfying the following requirements:

1. $B$ includes with any two contexts their greatest lower bound and

2. $B$ does not include the empty tree.

Then $B$ is called 'a filter basis'. Starting from $B$ we can form a rigid filter $F$ consisting of all those trees of a Phrase Structure Grammar which are context of at least one of the trees in $B$. We say that $F$ is generated from $B$ in this case and list the requirements $F$ must satisfy as follows:

1. $F$ includes with any context $c$ of $t_u \in B$ all contexts of $c$, and

2. $F$ includes with any two contexts of trees $t_u \in B$ their greatest lower bound, and

3. $F$ does not include the empty tree.
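The step from a filter basis to the filter generated from it can be sketched for an arbitrary finite ordering. As a stand-in for trees ordered by the context relation, the example below uses the subsets of a three-element set ordered by inclusion, with the empty set playing the part of the empty tree; this choice and all function names are assumptions made for illustration.

```python
from itertools import combinations

def glb(A, leq, x, y):
    """Greatest lower bound of x and y in A."""
    lowers = [z for z in A if leq(z, x) and leq(z, y)]
    return next(z for z in lowers if all(leq(w, z) for w in lowers))

def is_filter_basis(A, leq, B, bottom):
    """B is closed under greatest lower bounds and excludes the minimal element."""
    return bottom not in B and all(glb(A, leq, x, y) in B for x in B for y in B)

def generate_filter(A, leq, B):
    """All elements of A lying above at least one element of B."""
    return {a for a in A if any(leq(b, a) for b in B)}

# Stand-in ordering (an assumption): subsets of {a, b, c} under inclusion.
items = "abc"
A = [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]
included = lambda x, y: x <= y
bottom = frozenset()

B = {frozenset("a")}
F = generate_filter(A, included, B)
print(is_filter_basis(A, included, B, bottom), len(F))
print(is_filter_basis(A, included, {frozenset("a"), frozenset("b")}, bottom))
```

The second candidate fails as a basis because the greatest lower bound of its two members is the minimal element, which a filter basis must not contain.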

Now let G be a Phrase Structure Grammar including the schemes NP → (Det) (A) N, VP → V (NP), and S → NP VP; let G moreover include no rules except the ones represented by these schemes. Then $C_{\langle NP,NP \rangle}$ and $C_{\langle VP,VP \rangle}$ are rigid filters generated from $C_{\langle N,N \rangle}$ and $C_{\langle V,V \rangle}$ respectively. Moreover, every rigid filter generated from $C_{\langle NP,NP \rangle}$ ($C_{\langle VP,VP \rangle}$) is a rigid filter generated from $C_{\langle N,N \rangle}$ ($C_{\langle V,V \rangle}$). Let $M_{\langle VP,VP \rangle} \subset C_{\langle VP,VP \rangle}$ be a filter generated from $C_{\langle N,N \rangle}$. We call $M_{\langle VP,VP \rangle}$ 'a filter in $C_{\langle VP,VP \rangle}$'. Then all the trees in $M_{\langle VP,VP \rangle}$ are also contexts of $t_V$. In general, if $F$ and $F'$ are filters generated from $A$ and $B$ respectively, and have common elements, then their intersection is a filter generated from $A$ and $B$. Thus $M_{\langle VP,VP \rangle}$ is a filter in $C_{\langle VP,VP \rangle}$ generated from $C_{\langle N,N \rangle}$ and $C_{\langle V,V \rangle}$. Moreover, $C_{\langle VP,VP \rangle} - M_{\langle VP,VP \rangle}$ = {<VP,<V,<Δ>>>} which, according to the first reduction principle, is the same as {<V,Δ>}. But notice that there are different filters $M_{\langle S,S \rangle}$, generated from $C_{\langle N,N \rangle}$ in $C_{\langle S,S \rangle}$, and all of them are also filters generated from $C_{\langle V,V \rangle}$. In order to identify each of them correctly, it is apparently necessary to know which they are generated from. Here the advantage of subcategorization conditions shows drastically. For G the filters in question can be unambiguously identified by the two indices <<S,NP>,<NP>> and <<S,VP,NP>,<NP>>. By the first constituent S of every trace they are marked for being subsets of $C_{\langle S,S \rangle}$. The second constituent <NP> informs us that all contexts in question are contexts of a segment with upper rim NP. Since a trace represents always the shortest path from one node of a tree to another one, the middle constituent VP in the second trace indicates that the segment in question must be a segment of every subtree $t_{VP}$ of the trees generated by G.

Here the question arises as to whether subcategorization conditions need a second constituent at all. It seems that the two filters $C_{\langle\langle S,NP \rangle,\langle NP \rangle\rangle}$ and $C_{\langle\langle S,VP,NP \rangle,\langle NP \rangle\rangle}$ of G are uniquely determined by the first constituents <S,NP> and <S,VP,NP> of their indices alone. While this may be true for G, however, it is no longer true for a Phrase Structure Grammar G' including the scheme VP → V (NP) (NP). Here the indices of two different filters in $C_{\langle S,S \rangle}$ share the same first constituent <S,VP,NP>, and hence can only be distinguished by their second constituents <V,NP> and <<V,NP>,NP> which represent different segments. The inner brackets in the latter sequence indicate that the trace <S,VP,NP> is to intersect with the second occurrence of NP in <V,NP,NP>. A bracketing of this kind is unproblematic on the understanding that a sequence of segments is also a segment. Such a requirement was avoided in the definition of a segment, basically because a Phrase Structure Grammar does not have to include schemes of the form VP → V (NP) (NP) in order to gain the generative power of those that do. Notice incidentally that by distinguishing different functions of an NP under a VP in the way described we have not made any claims as to which NP is to represent the direct- and which the indirect object. In fact a claim to this effect would make a grammar overly restrictive; as long as we secure that, if either NP dominates a direct object, the other one cannot, we have done enough.

Let me at this point introduce a class of particular segments, whose constituents are subtrees of degree $n$ relative to a given tree:

1. A segment is of degree $n$ relative to a context $c_{\langle s,r \rangle}$, if and only if all its constituents are subtrees of $c_{\langle s,r \rangle}$ of degree $n$.

2. A segment of degree $n$ is complete, if and only if it includes all the subtrees of $c_{\langle s,r \rangle}$ of degree $n$.

Apparently the upper rim of a complete segment of degree $n$ consists of the end points of all traces of length $n+1$ of a context $c_{\langle s,r \rangle}$. If $C_{\langle s,r \rangle}$ is a rigid filter generated from $C_{\langle u,u \rangle}$, and $c \in C_{\langle s,r \rangle}$, then there is for every complete segment sm (of degree $n$) of $c$ exactly one set of n+1-place traces which all the contexts of $c$ in $C_{\langle s,r \rangle}$ have in common, and whose end points define the upper rim $r$ of sm. If sm is of degree $m$ in $C_{\langle s,r \rangle}$, then the upper rims of all complete segments of degree $m$ of contexts of $c$ in $C_{\langle s,r \rangle}$ form a filter of substrings generated from $r$. This leads to the conclusion that, if $c$ is a context of $c'$, then the function of $c'$ in any of its contexts in $C_{\langle s,r \rangle}$ remains the same. Moreover, all the traces in $C_{\langle s,r \rangle}$ leading to any of the $c'$ in $c$ have the same initial segment ending in $u$. Thus the subcategorization condition for $c'$ in $C_{\langle s,r \rangle}$ unambiguously characterizes the function of $c'$ in $C_{\langle s,r \rangle}$. If moreover $c'$ is dominated by $u'$, and all the traces leading to $c'$ in $C_{\langle s,r \rangle}$ are identical, then the first constituent of the subcategorization condition for $c'$ can be unambiguously represented by $u'$. This explains why in one of our previous examples <<S,VP,V>,<NP,V,NP>> could be reduced to <V,<V,NP>>. There we were discussing trees of a filter in $C_{\langle S,S \rangle}$ generated from $C_{\langle VP,VP \rangle}$, and presupposing that all contexts of <V,Δ> in $C_{\langle S,S \rangle}$ include <S,VP,V> as the only trace leading to V.

-74-

That relative to the same class C different oc-currences of a tree in one of the contexts from C be subcategorized differently is implied in the preceding discussion. In general it is sufficient, however, to choose the smallest segment in which a tree differs from all the others to be the first constituent of a subcate-gorization condition.

It follows from condition 4. of the definition of a Phrase Structure Grammar that Phrase Structure Grammars generate only contexts of type <S,S>. That the latter play indeed a special role among the trees discussed so far shall be reflected here by what I shall call 'the second reduction principle1 which involves mainly trees dominated by S. Let t , tu, be trees dominated by u and irrespectively. Then, in accordance with the earlier charaterization of contexts, tu-tu, = tu+(-tu,) = l.u.b.(tw,-tu,). Thus if tu

is a context of t ,, then t "t , is the complement of tu, in t , i.e. what is left of t after t . has been deleted, u u u' Furthermore, if a, b, c are objects of any kind and b occurs in a, then a^ is the result of replacing b in a by £. The second reduction principle can now be formulated as follows:


1. t_S is in t_u;
2. t_u' occurs in both t_S and t_u − t_S.

Here both 'is in t_u' and 'occurs in t_u' stand for 't_u is a context of'; henceforth all three formulations shall be used interchangeably. The word 'occurs' in 2. expresses furthermore that the subcategorization condition of t_u' in t_S is different from the subcategorization condition of t_u' in t_u − t_S, in other words that we are dealing with two different occurrences of t_u' in t_u. Again the second reduction principle can be supplemented by the second strong reduction principle, according to which

t u W t * S , iff S u '

1. and 2. as above.

The assumption that the second reduction principle is a universal principle is of course as unfounded as the assumption that the concept of a Phrase Structure Grammar is useful for the description of any language. Together with the first reduction principle, it will, however, take care of such operations as Equi and Raising which can be found typically in transformational descriptions of English and German. But it will also help explain 1. the equivalence of constructions such as 'Mary can read and Mary can write' with 'Mary can read and write' and, possibly, 2. the blank representing the head in a relative clause such as 'the man, who ∅ shot Billy the Kid' or 'the man I met ∅ yesterday', provided we accept the somewhat problematic view that relative pronouns combine with complete sentences before deletion takes place.

That processes such as Equi and Raising are in fact covered by a single principle has also been argued in Bowers' Theory of Grammatical Relations. There, as well as in the transformational literature in general, operations manipulating trees were introduced as transformations. This raises the question as to whether transformations in general, and the two reduction principles introduced above in particular, are syntactic operations in the Montaguean sense. Recall that Montague's operations of a disambiguated language manipulate expressions. Trees, however, qualify for expressions under one condition only, namely the condition that all the symbols Δ are replaced by complex symbols or any suitable equivalent. For a standard transformational model, then, in which transformations apply only after lexical insertion has been completed, the question can be answered in the affirmative. But then lexical insertion assumes a rather peculiar status. As an operation manipulating trees, it is clearly a transformation. As an operation manipulating trees dominating unreplaced terminal symbols Δ, however, it is not a syntactic operation in the same sense a transformation is. Should we then conclude that there are two different kinds of transformations? Surely not, as Bowers' Theory of Grammatical Relations proves. Bowers' argumentation against the Equi-Raising distinction leads eventually to a model in which lexical insertion and transformations apply in random order. Incidentally it is just this property of the theory he proposes which, in his opinion, renders the concept of Deep Structure, in its common interpretation as a fixed level of grammatical representation, meaningless. In Bowers' theory, then, lexical insertion and transformations are no longer distinct; but neither are they syntactic operations in the Montaguean sense. Notice that the two reduction principles introduced here are transformations in Bowers' sense. For this reason, and because I consider Bowers' approach more organic - or less ad hoc - than the standard theory, I shall assume that his interpretation of transformations is the correct one. But what are syntactic operations then, and what is their role within the system proposed here? The answer to this question will finally enable us to prove the initial assumption that transformational systems are indeed disambiguated languages.

It has been pointed out that terminal symbols Δ are variables marking the place in a tree where lexical items of a certain category have to be inserted. This suggests that we take trees to be expressions including free variables, and hence representations of the syntactic functions we are looking for. Clearly their domain is a class of basic expressions of various categories, and - if the interpretation of expressions as trees with no free variables Δ is correct - their range is in every case a class of expressions. Before going into detail, let me repeat both the definition of a Phrase Structure Grammar with all the changes made and the characterization of a tree.

Def 3': A Phrase Structure Grammar is a system <V_L,Σ,P,S> such that

1. V_L is a finite set and u∈V_L;
2. Σ ⊆ V_L is a set of basic categories;
3. P is a set of productions <u,v> such that
   a. u∈V_L and v∈V_L*;
   b. all productions of the same type are neighbors;
   c. u∈Σ ⇒ v = Δ;
4. S∈V_L is the category of sentences.

The definition suggests that the elements of V_L are actually categories of expressions. If so, their members can not be expressions of a disambiguated language, however, but must rather be expressions - in some suitable representation - of the language under investigation. Thus 3.c. could be rewritten as follows:

3. c. u∈Σ ⇒ v∈u;

If G is a Phrase Structure Grammar, then t_u, or a tree of type u (i.e. dominated by u), is defined as follows:

1. u∈V_L ⇒ t_u = <u,Δ>∈G;
2. u∈V_L−Σ, <u,<v_k>k<n>∈P, and t_v_k∈G for all k<n ⇒ t_u = <u,<t_v_k>k<n>∈G;

The definition differs from the one given earlier in that we no longer explicitly require that Δ be a tree, a requirement which turns out to be unnecessary. In order to show how every tree can be represented as a function, I will now introduce for every t_u∈G a corresponding family of operations F_j. Let t_u∈G and let <Δ_j>j<m be a complete segment of t_u of degree n. Clearly u_j∈Σ for all j<m and <<u_j,Δ_j>>j<m is the sequence of all subtrees of t_u which are dominated by a preterminal symbol. We may call such a sequence a sequence of terminal trees. Let furthermore a_j be an expressional constant for all j<m. I shall first list the F_j individually, and then give a general definition for all j<m:

F_0 = λΔ_0 λΔ_1 ... λΔ_(m-1) t_u
F_1 = λΔ_1 ... λΔ_(m-1) t_u^1
F_2 = λΔ_2 ... λΔ_(m-1) t_u^2
...
F_(m-1) = λΔ_(m-1) t_u^(m-1)
F_m = t_u^m

F_j(a_j) = F_(j+1) iff a_j∈u_j, where t_u^(j+1) is the result of replacing Δ_j in t_u^j by a_j and t_u^0 = t_u.

F_m is the result of replacing all the free variables Δ_j in F_0 by expressions a_j of a language L; F_m is thus an expression of what has yet to be proven to be a disambiguated language. Apparently each successive application of a family of operations F_j belonging to t_u∈G describes the process of lexical insertion as a strictly ordered process.

The difference between a standard transformational model and Bowers' Theory of Grammatical Relations can now be illustrated formally. Let t_u be of type u and <F_j>j<m a family of operations associated with t_u as before. Let furthermore T_k, for every natural number k, be a combination of transformations. What follows are schematic representations of two different derivations of an expression of type u, the first one according to a standard model, the second one according to Bowers.

Standard: F_0 = λΔ_0 ... λΔ_(m-1) t_u;
          F_j = F_(j-1)(a_(j-1)) for 1≤j≤m;
          F_m = deep structure;
          T_k(F_m) = surface structure;

Bowers:   F_0 = λΔ_0 ... λΔ_(m-1) t_u;
          F_j = T_k_(j-1)(F_(j-1))(a_(j-1)) for 1≤j≤m;
          T_k_m(F_m) = surface structure;

The second scheme illustrates nicely why in Bowers' model insertion of individual items can be made contingent on the prior application of certain transformations. Here the T_k_j are of course also combinations of transformations. But notice that the term 'combination of transformations' is used in the mathematical sense of 'n-place combination', which, for n ≠ 0, means one or more and implies strict order. For the special case of n = 0 it means moreover the same as 'none at all'. A more precise characterization of n-place combinations of operations will be given at a later point.
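The contrast between the two derivational schemes can be made concrete with a small sketch. It is only an illustration under strong simplifying assumptions: a tree is flattened to a tuple of named slots, the variable names `Δ0`, `Δ1`, `Δ2`, the toy transformation `swap_outer`, and the lexical items are all hypothetical and not taken from the text. Insertion corresponds to one step F_j(a_j); the identity function plays the role of the 0-place combination ('none at all').

```python
from typing import Callable, List, Optional, Sequence, Tuple

Slot = Tuple[str, Optional[str]]      # (variable name, filler or None if still free)
Tree = Tuple[Slot, ...]               # a flattened stand-in for a tree t_u
Transformation = Callable[[Tree], Tree]


def insert(tree: Tree, var: str, item: str) -> Tree:
    """One step F_j(a_j): replace the free variable `var` by the item."""
    if (var, None) not in tree:
        raise ValueError(f"{var} is not a free variable of this tree")
    return tuple((v, item) if v == var and f is None else (v, f) for v, f in tree)


def identity(tree: Tree) -> Tree:
    """The 0-place combination of transformations, i.e. 'none at all'."""
    return tree


def swap_outer(tree: Tree) -> Tree:
    """Hypothetical toy transformation: exchange the first and the last slot."""
    return (tree[-1],) + tree[1:-1] + (tree[0],)


def derive_standard(tree: Tree, items: Sequence[str], t_k: Transformation):
    """Standard scheme: insert every item first, then transform the result."""
    for j, a in enumerate(items):
        tree = insert(tree, f"Δ{j}", a)
    return tree, t_k(tree)            # (deep structure, surface structure)


def derive_bowers(tree: Tree, items: Sequence[str], ts: Sequence[Transformation]) -> Tree:
    """Bowers' scheme: F_j = T_k_(j-1)(F_(j-1))(a_(j-1)), then T_k_m(F_m)."""
    for j, a in enumerate(items):
        tree = insert(ts[j](tree), f"Δ{j}", a)
    return ts[len(items)](tree)


def words(tree: Tree) -> List[Optional[str]]:
    return [filler for _, filler in tree]


t_u: Tree = (("Δ0", None), ("Δ1", None), ("Δ2", None))
items = ["Mary", "reads", "books"]

deep, surface = derive_standard(t_u, items, swap_outer)
interleaved = derive_bowers(t_u, items, [identity, swap_outer, identity, identity])
```

In the second call the transformation applies between the first and the second insertion, so no intermediate stage has the status of a fixed deep structure; in this toy case both derivations nevertheless arrive at the same surface string.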

Now let <F_j,u>j<m,u∈V_L be the family of operations associated with t_u for all u∈V_L and t_u∈G. Then each of the F_j,u for j>0 is uniquely determined by F_0,u. The extension of F_0,u, according to the definition of F_j, is moreover the class of all expressions F_m,u without free variables that can be constructed by a successive application of the F_j,u to the a_j in the domain of F_0,u. Let E_u, for all u∈V_L, be an expression in the extension of F_0,u, i.e. E_u = F_m,u. We can generalize the definition of F_j given earlier by expanding their domains. Thus the domain of F_j,u will no longer merely include expressions a_j from basic categories in Σ, but expressions E_u as well:

Let t_u∈G and F_0,u as described. Let furthermore <t_v_j>j<m be a complete segment of t_u. Then E_v_j is an expression in the extension of F_0,v_j for all j<m and v_j∈V_L. We define:

F_0,u =: λt_v_0 λt_v_1 ... λt_v_(m-1) t_u
F_1,u =: λt_v_1 ... λt_v_(m-1) t_u^1
F_2,u =: λt_v_2 ... λt_v_(m-1) t_u^2
...
F_(m-1),u =: λt_v_(m-1) t_u^(m-1)

F_j,u(E_v_j) = F_(j+1),u iff E_v_j is in the extension of F_0,v_j, where t_u^(j+1) is the result of replacing t_v_j in t_u^j by E_v_j, and t_u^0 = t_u.

The two derivational schemes illustrating the difference between Bowers' and a standard transformational theory can now be generalized accordingly. I shall do this for the second scheme only, since the theory presented here is basically an extension of the proposals of Bowers and Montague.

F_0,u = λt_v_0 ... λt_v_(m-1) t_u;
F_j,u = T_k_(j-1)(F_(j-1),u)(E_v_(j-1)) for 1≤j≤m;
T_k_m(F_m,u) = surface structure.


The discussion leads thus to an interesting conclusion. Although transformations have often been thought of as syntactic operations in the Montaguean sense, they are in fact not. As operations manipulating tree-shaped expressions including free variables, their status is moreover quite unique and unprecedented in any formal theory of the Montague type. Notice that the objects in the domain of transformations do qualify for Montaguean syntactic operations, which means that transformations work on a quite different level. Syntactic operations, even though distinct from transformations, can nevertheless be defined with the help of transformations. In the present framework every m-place syntactic operation can be - and implicitly has been already - defined as a family F of one-place operations such that

F = <T_j(F_j)>j<m,

where T_j is an n-place combination of transformations for all j<m, and F_j is like the F_j,u described above. Clearly each F is an operation from expressions into expressions and thus, like Montague's operations, builds up complex expressions from less complex ones. The question as to whether transformations are meaning preserving is open. If they are, then they merely state equivalences between operations. If they are not, they nevertheless determine every syntactic operation uniquely, provided they are strictly ordered, as we have assumed they are.

Now if, for all j∈J, F_j is a syntactic operation <T_j_k(F_j_k)>k<n as defined by a Phrase Structure Grammar G, then initially F_j has only basic expressions from u∈Σ to work on. A step by step application, however, leads to the generation of more and more complex expressions which are in turn added to the domains of F_j. Clearly if A is the set of all expressions occurring in the domain or the range of F_j for all j∈J, then 1. <A,F_j> is an algebra. Moreover 2. u ⊆ A, for all u∈Σ, and 3. A is the smallest set including all u∈Σ as subsets and closed under F_j. That 4. the range of F_j has no common elements with any of the basic categories in Σ is being guaranteed for all j∈J by the definition of F_j. The trees of G, as the previous discussion has shown, are free of ambiguities. From this and the definition of F_j it follows that the F_j are uniquely defined, i.e. that 5. whenever j,j'∈J, a∈D[F_j], and a'∈D[F_j'], then F_j(a) = F_j'(a') implies that j = j', and a = a'. A set of rules can and will be added to our system which 6. may have the form of ordered triples <F_j,<u_k>k<n,v>, where j∈J, n is the number of places of F_j, u_k∈V_L for all k<n, and v∈V_L. Here both u_k and v are of course indices of categories, since we have seen that, with the only exception of basic expressions, none of the expressions generated by G can occur as an element of any of the u in V_L. That finally 7. G generates a category of sentences is also being guaranteed by definition.
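Conditions 3. to 6. can be checked mechanically on a finite toy system. The following sketch is not the construction of the text: the basic categories, the two tree-building operations `f_vp` and `f_s`, and the rule triples are hypothetical stand-ins. It closes the basic categories under the operations and records, for every generated expression, which operation and which arguments produced it.

```python
from itertools import product

# Hypothetical basic categories (the members of Σ) with their basic expressions.
BASIC = {"NP": {"Mary", "books"}, "V": {"reads"}}


def f_vp(v, np):
    """Toy operation building a tree-shaped VP expression."""
    return ("VP", v, np)


def f_s(np, vp):
    """Toy operation building a tree-shaped S expression."""
    return ("S", np, vp)


# Rules as ordered triples <F_j, <u_k>k<n, v>, as in condition 6.
RULES = [(f_vp, ("V", "NP"), "VP"), (f_s, ("NP", "VP"), "S")]


def generate(basic, rules, rounds=2):
    """Close the basic categories under the rules for a fixed number of rounds."""
    cats = {u: set(members) for u, members in basic.items()}
    origin = {}   # generated expression -> set of (operation name, arguments)
    for _ in range(rounds):
        for f, inputs, output in rules:
            pools = [sorted(cats.get(u, ()), key=repr) for u in inputs]
            for args in product(*pools):
                expression = f(*args)
                cats.setdefault(output, set()).add(expression)
                origin.setdefault(expression, set()).add((f.__name__, args))
    return cats, origin


cats, origin = generate(BASIC, RULES)
basic_members = set().union(*BASIC.values())

# Condition 4: no generated expression is a member of a basic category.
no_overlap = not (set(origin) & basic_members)
# Condition 5: every generated expression has exactly one derivation.
uniquely_readable = all(len(sources) == 1 for sources in origin.values())
```

Because the operations return trees rather than flat strings, each value records its own derivation, which is why the uniqueness condition holds here by construction.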

It may not have remained unnoticed that 1. - 7. are precisely the conditions a disambiguated language according to Montague must satisfy. Since transformational systems including Bowers' Theory of Grammatical Relations are special cases of the system set up here, they are of course disambiguated languages as well. It remains to be asked now what kinds of objects basic expressions are. That they are elements of the categories in Σ has been hinted at on several occasions. Their theoretical nature as well as the question as to how they should be represented shall be discussed in some detail in the following chapter.

5. Basic expressions and hierarchies

There seems to be general agreement over the fact that, independently from the nature of any particular language, certain processes involving both the combination of expressions and morphological adjustment take place relative to certain syntactic and semantic properties of the expressions concerned. In Montague's description of a fragment of English, which makes use of his general theory of variable binding, we find therefore rules that replace an unmarked morpheme 'he_n' by 'him', 'her', or 'it', if it occupies the direct object place of a verb phrase and depending on whether the noun phrase by which it is being bound refers to a masculine, feminine, or neuter individual or object. In German, and to some extent in Romance languages, even properties such as masculine, feminine and neuter have become properties of nouns with complete disregard in a large number of cases for the actual nature of their referents. This means that, within a description of German, an equivalent rule of variable binding would apply relative to exclusively syntactic properties of expressions, and considering certain peculiarities in the usage of English pronouns it seems not altogether inappropriate to postulate the existence of syntactic gender even for English. It is precisely the partial dependence of some grammatical processes on syntactic properties that has caused transformationalists to describe them within the syntax of a language, even when they appear to be morphological in nature. The legitimacy of such an approach, however, which has always been tacitly assumed, can not be confirmed until the extension of a syntactic property can be determined beyond doubt.

The following consideration may show that the problem is not entirely trivial. On the one hand an expression can assume - and be marked for - mutually exclusive syntactic properties, i.e. properties whose extensions have apparently no expression in common. Thus while a noun can be either nominative or accusative, no expression can be nominative and accusative at the same time. On the other hand the presence of some syntactic properties seems to depend to some extent on the presence of others. Notice that a question such as 'What is the accusative of the German root morpheme 'dies'?' can not be answered with any certainty before we know whether 'dies' is to be masculine, feminine, or neuter, singular, or plural, first, second, or third person. This seems to suggest two things. First, that associated with every syntactic property P there is an operation assigning every expression in its domain a morphological variant and thus marking it for having P. Secondly, that the operations associated with different syntactic properties are hierarchically ordered in such a way that the output of one operation is input for another one. In the case of verbs, for example, we attach personal endings to a present, past, or perfect stem. This means that a verb stem would have to be provided first, which can not be done unless the morpheme it is to be derived from is marked for indicative or subjunctive. But the result of marking a morpheme for mood might turn out differently depending on whether it is marked for the property strong or weak, and marking for potency may presuppose a marking for transitivity. Transitivity, finally, is a property whose associated operation, if there were one, may feed on the results of an operation associated with the property of being the root of a verb. The latter would then be the lowest operation in a hierarchy and take arguments in a set of completely unmarked morphemes. Now although not all the processes by which morphological variants can be derived from root morphemes in a bottom to top fashion may yield different intermediate results, a hierarchy is clearly detectable in almost every case. Thus if we were to define operations of the kind described, we would have to start from the assumption that indeed every step in a morphological derivation brings forth different morphological results.

Suppose there is an associated operation for every syntactic property, and let f and g be associated with respectively different syntactic properties. Let furthermore the range of f have common elements with the domain of g. Then the combination of f and g is the unique function h such that h(x) = g(f(x)) for all expressions x in the domain of h. Notice that the domain of h is a subset of the domain of f. An n-place combination is a combination of n operations f_i, and well defined only if the domain of f_(i+1) and the range of f_i have common elements for all i<n−1. Every n-place combination shall be represented by an n-place sequence <f_i>i<n, in which f_0 is the lowest or innermost operation of the combination, followed by the one next to the innermost, and so on. Thus h = <f,g>, and h(x) = <f,g>(x) = g(f(x)). Now the following two statements are equivalent for every expression x:

1. x has n syntactic properties.
2. x is the value of an n-place combination of operations for a root morpheme m.
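The notion of an n-place combination can be sketched in a few lines. The sketch assumes that an associated operation is a partial function written out as a finite mapping; the intermediate labels such as 'Mann+masc' are hypothetical placeholders for morphological variants and are not taken from the text.

```python
from typing import Dict, Sequence

Operation = Dict[str, str]   # a partial operation, written out as a finite mapping


def combine(ops: Sequence[Operation]) -> Operation:
    """Return <f_0,...,f_(n-1)> as a mapping: f_0 applies first, f_(n-1) last."""
    if not ops:
        raise ValueError("this sketch only covers n >= 1")
    combined: Operation = {}
    for x in ops[0]:
        value = x
        for f in ops:
            if value not in f:      # some step is undefined for x
                break
            value = f[value]
        else:
            combined[x] = value     # every step was defined
    return combined


# Hypothetical operations: f marks gender, g marks singular on masculine forms.
f = {"Mann": "Mann+masc", "Frau": "Frau+fem"}
g = {"Mann+masc": "Mann+masc+sing"}

h = combine([f, g])   # h = <f,g>, so h(x) = g(f(x)) wherever both steps are defined
```

Here 'Frau' drops out of the domain of h because g is undefined for its variant, which illustrates why the domain of h is only a subset of the domain of f.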


Since the equivalence holds relative to a language L and a reference point i - where i may be thought of as an ordered pair with first constituent a possible world and second constituent a moment of time - an interpretation of syntactic properties as functions, whose value for every pair <L,i> is an associated operation, seems plausible. A closer look at such an interpretation, however, reveals serious disadvantages. Thus the ranges of operations associated with different properties are necessarily disjoint. Consequently an expression such as 'masculine noun' could only refer to the values of the operation associated with masculine. But these may not even have morphological realizations in L so that in effect we wouldn't even know what we are talking about. Clearly, if we speak of masculine nouns, we refer to nouns of any property besides masculine that determines their actual form, i.e. to masculine nouns of any person, number, and case.

Now it can be observed that the form of all expressions of a given type is usually determined by a fixed number of syntactic properties. The form of nouns, for example, is determined by gender, number, person, and case, the form of verbs by potency, mood, voice, aspect, tense, number, and person. If we therefore let the values of syntactic properties P for any pair <L,i> be a function onto the set of truth values, whose extension, for any P, is the class of all combinations in which the operation associated with P occurs as a constituent, then the problem of uncertain reference no longer arises. This is the policy we shall adopt here. Since henceforth syntactic properties will be referred to indirectly only by mentioning their values at a fixed reference point, names such as 'masculine', 'feminine', 'transitive', 'nominative' etc. - i.e. names that have been used otherwise to identify syntactic features - shall be used to denote associated operations. Names such as 'potency', 'mood', 'voice', 'aspect', 'tense', 'gender', 'number', 'person', 'case', on the other hand, denote variables ranging over operations of a certain type. Mood, for example, can take the values indicative or subjunctive; all variables can moreover be replaced by the identity operation, which shall be the only value they have in common. The value of the property of being a masculine noun at <L,i>, for example, can now be represented by an expression such as 'λnumber λperson λcase[<masculine,number,person,case>(root)]'.

It has been well established by now that expressions of a language L refer to functions operating over the referents of other expressions. Since there is thus no essential theoretical difference between the values of root, in the above representation of (the value of) the property of being a masculine noun, and the values of number, person, and case, the asymmetry of the expression in square brackets is quite unwarranted. Notice that the asymmetry vanishes furthermore in translation: <masculine,number,person,case>(root) = case(person(number(masculine(root)))). This suggests that we put root into the angular brackets along with the other variables and features. Since it is the lowest constituent of a combination, however, it must be mentioned first, thus getting <root,masculine,number,person,case>. The representation of being a masculine noun at <L,i> changes now to 'λroot λnumber λperson λcase[<root,masculine,number,person,case>]'. Since finally the λ-notation is not only bothersome, but also imposes an order on the replacement of operational variables which is unnecessary - the sequence in square brackets is strictly ordered and free of ambiguities -, I shall do away with it and represent the value of being a masculine noun at <L,i> by the sequence '<root,masculine,number,person,case>' alone.


Now if f = <f_i>i<n is a sequence of operational variables such that any complete replacement of the f_i by constants leads to an n-place combination of associated operations, then I shall call f 'a hierarchy over f_i' for all i<n. If f' is like f, except that m operational variables have been replaced by constants a_i_k (k<m<n), then I shall call f' 'a hierarchy over a_i_0 and a_i_1 and ... and a_i_(m-1)' or simply 'a hierarchy over <a_i_k>k<m' and represent it by 'f_<a_i_k>k<m'. A hierarchy is a function from associated operations onto the set of truth values. The extension of a hierarchy is a class of combinations of associated operations which will be called 'basic expressions'. The following scheme represents a step by step transition from a hierarchy to an n-place combination or basic expression. Let a_i for all i<n be a constant in the range of f_i:

<f_0,f_1,...,f_(n-1)>(a_0) = <a_0,f_1,...,f_(n-1)>
<a_0,f_1,...,f_(n-1)>(a_1) = <a_0,a_1,...,f_(n-1)>
...
<a_0,...,a_(n-2),f_(n-1)>(a_(n-1)) = <a_0,...,a_(n-1)>

Notice the structural similarity of hierarchies and syntactic operations. <a_0,...,a_(n-1)> = <a_1,...,a_(n-1)>(a_0) is of course not the final representation of a basic expression. As the value of <a_1,...,a_(n-1)> for the root morpheme a_0, the latter will be determined with the help of tables in which a_1,...,a_(n-1) occur as labels of columns and rows, and in which grammatical endings for a_0 are written down where rows and columns intersect. The genitive singular of the German root morpheme 'Mann', for example, occurs as the value of <'Mann',masculine,singular,3,genitive> = <masculine,singular,3,genitive>('Mann') = 'Mannes'.
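The step by step transition from a hierarchy to a basic expression, followed by the table lookup, might be sketched as follows. The representation of a hierarchy as a pair (constants chosen so far, variables still open) and the two-cell ending table are assumptions made for illustration only; a full table would have one cell per admissible feature combination.

```python
from typing import Dict, Tuple

# A hierarchy in progress: the constants already chosen and the variables still open.
Hierarchy = Tuple[Tuple[str, ...], Tuple[str, ...]]

# The hierarchy <root,gender,number,person,case> with nothing replaced yet.
N: Hierarchy = ((), ("root", "gender", "number", "person", "case"))

# Hypothetical fragment of an ending table for the root morpheme 'Mann'.
ENDINGS: Dict[Tuple[str, ...], str] = {
    ("masculine", "singular", "3", "nominative"): "",
    ("masculine", "singular", "3", "genitive"): "es",
}


def step(h: Hierarchy, constant: str) -> Hierarchy:
    """Replace the lowest open variable of the hierarchy by a constant."""
    chosen, open_vars = h
    if not open_vars:
        raise ValueError("no variable left to replace")
    return chosen + (constant,), open_vars[1:]


def value(h: Hierarchy) -> str:
    """Look up the basic expression determined by a fully replaced hierarchy."""
    chosen, open_vars = h
    if open_vars:
        raise ValueError("still a hierarchy, not yet a basic expression")
    root, features = chosen[0], chosen[1:]
    return root + ENDINGS[features]


h = N
for constant in ("Mann", "masculine", "singular", "3", "genitive"):
    h = step(h, constant)          # <..., f_i, ...>(a_i) = <..., a_i, ...>

genitive = value(h)
```

Each call of `step` corresponds to one line of the scheme above; only the final, variable-free sequence is handed to the table.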

It is obvious that basic expressions in the present framework differ radically from the basic expressions of other linguistic schools. There basic expressions are mostly unanalyzed root forms which are being developed into morphologically marked surface forms in the course of a syntactic derivation with the help of agreement transformations or similar devices. Here, on the other hand, basic expressions are morphologically fully developed expressional constants. While a syntactic operation that combines a transitive verb with a noun phrase will therefore in principle be able to choose between nominatives and accusatives, the choice will automatically be restricted to accusatives if the subject place has already been filled. This is as it should be, for in a way it is the role of syntactic properties to change the way of referring. If it is correct that definite noun phrases refer to properties of an individual or object, then the nominative and accusative forms highlight respectively different properties of the same individual. These are precisely the differences Frege had in mind when he talked about differences of sense. Since they are not only being reflected in morphological markings, but also in our way of combining expressions in an act of communication, a syntactic operation that can not refer to them in one or the other way operates essentially in a void.

Having restricted the input of our syntactic theory to morphologically developed expressional constants does not imply, however, that all of these constants have to be listed in the lexicon. We wouldn't need morphological operations - here associated operations - if this were so. But before we go into a description of the lexicon let me illustrate the theoretical findings presented so far with some practical examples. What follows is a list of 17 classes of associated operations - henceforth features - which are relevant for a description of German. These classes will play a role in the formation of seven hierarchies of German, which will be introduced in a second step. How a hierarchy over a root morpheme a uniquely determines the class of all morphological derivatives of a of a certain type will then be illustrated for the root morpheme 'alt'.

K_0 = the set of German root morphemes;
K_1 = {det,adj,adv,noun,prep,conj,cat^0}
K_2 = {pro,common,proper,status_2^0}
K_3 = {modal,aux,regular,status_3^0}
K_4 = {interrog,demonstr,pers,possve,status_4^0}
K_5 = {declarative,final,causal,consecutive,relative,status_5^0}
K_6 = {coord,subord,status_6^0}
K_7 = {def,indef,spec^0}
K_8 = {trans,intrans,place^0}
K_9 = {strong,weak,pot^0}
K_10 = {indic,subj,imp,mood^0}
K_11 = {active,passive,voice^0}
K_12 = {perf,imperf,inf,aspect^0}
K_13 = {past,present,future,tense^0}
K_14 = {masc,fem,neut,gender^0}
K_15 = {sing,plur,number^0}
K_16 = {1,2,3,person^0}
K_17 = {nom,gen,dat,acc,case^0}


Before adding some explanations concerning the classes K_i let me introduce the variables I will be using. Let 'x:K_i' assert that x is a variable of type K_i for all i<18, i.e. that x takes values in K_i.

root:K_0;      status_6:K_6;  aspect:K_12;
cat:K_1;       spec:K_7;      tense:K_13;
status_2:K_2;  place:K_8;     gender:K_14;
status_3:K_3;  pot:K_9;       number:K_15;
status_4:K_4;  mood:K_10;     person:K_16;
status_5:K_5;  voice:K_11;    case:K_17;

Now if var_i is a variable of type K_i, then var_i^0 is to be the identity operation which is the only operation all K_i have in common. The features in K_1 determine class membership. K_2 - K_17 include subcategorization features, K_2 for nouns, K_3 for verbs, K_4 for pronouns - possve = possessive -, K_5 and K_6 for conjunctions, K_7 for determiners, K_8, K_10 - K_13, K_15, and K_16 for verbs, K_9 for verbs, determiners, and adjectives, K_14 - K_17 for determiners, adjectives, and nouns, and K_17 possibly also for prepositions, which determine case in German, but need not necessarily be marked for it. 'Spec' in K_7 stands for specificity, 'pot' in K_9 for potency, and 'imp' in K_10 for imperative. The rest of the abbreviations is self explanatory. The following seven hierarchies are hierarchies of German:

Det = <det,spec,pot,gender,number,person,case>;
Adj = <adj,pot,gender,number,person,case>;
Adv = <adv>;
V = <verb,status_3,place,mood,voice,aspect,tense,number,person>;
N = <noun,status_2,gender,number,person,case>;
P = <prep>;
Conj = <conj,status_5,status_6>;

Recall now that any successive replacement of the variables in the above sequences leads eventually to the representation of a basic expression. For example the value of Adj(alt)(weak)(fem)(sing)(3)(dat) is the sequence <alt,weak,fem,sing,3,dat> which uniquely determines the German basic expression 'alten' (=old). Adj(von), however, fails, i.e. yields falsity, since 'von' is a preposition and hence not of the same type as adj. Hierarchies have been introduced as functions from features into truth values. The extension of Adj is thus the union of all the morphological variants of any root that can replace adj. The extension of Adj_alt = <alt,pot,gender,number,person,case> is moreover the class of all morphological variants of 'alt'; Adj_alt was called 'a hierarchy over 'alt''. But we can subclassify further: Adj_<alt,strong>, for example, determines the class of all morphological variants of 'alt' that carry strong endings, Adj_<alt,nom> the class of all derivatives of 'alt' with nominative endings, and so on. The seven hierarchies introduced above determine thus every conceivable subclass of German basic expressions. Only some of them will play independent roles, however, within the syntax of German, as for example the extensions of

N_pro: the hierarchy over pronouns,
N_common: the hierarchy over common nouns,
N_proper: the hierarchy over proper nouns,
etc.
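How a hierarchy over 'alt' determines a class of morphological variants can be illustrated with a short sketch. Several things here are assumptions rather than part of the text: the set `ADJ_ROOTS` stands in for the adjective roots, person is held fixed at 3, and the ending function encodes the standard strong and weak adjective endings of German directly instead of reading them from a table.

```python
from itertools import product
from typing import Set, Tuple

ADJ_ROOTS = {"alt"}                       # toy stand-in for the roots that can replace adj
VALUES = {                                # variables of Adj, lowest first
    "pot": ("strong", "weak"),
    "gender": ("masc", "fem", "neut"),
    "number": ("sing", "plur"),
    "person": ("3",),                     # simplification: person is held fixed
    "case": ("nom", "gen", "dat", "acc"),
}
STRONG_SING = {
    "masc": {"nom": "er", "gen": "en", "dat": "em", "acc": "en"},
    "fem": {"nom": "e", "gen": "er", "dat": "er", "acc": "e"},
    "neut": {"nom": "es", "gen": "en", "dat": "em", "acc": "es"},
}
STRONG_PLUR = {"nom": "e", "gen": "er", "dat": "en", "acc": "e"}


def ending(pot: str, gender: str, number: str, case: str) -> str:
    """Adjective ending for one cell of the strong or weak paradigm."""
    if pot == "weak":
        if number == "sing" and (case == "nom" or (case == "acc" and gender != "masc")):
            return "e"
        return "en"
    return STRONG_PLUR[case] if number == "plur" else STRONG_SING[gender][case]


def adj(seq: Tuple[str, ...]) -> bool:
    """Adj as a function into truth values: is seq in the extension of Adj?"""
    if len(seq) != 6 or seq[0] not in ADJ_ROOTS:
        return False
    return all(value in VALUES[name] for name, value in zip(VALUES, seq[1:]))


def form(seq: Tuple[str, ...]) -> str:
    """The basic expression uniquely determined by a complete sequence."""
    if not adj(seq):
        raise ValueError(f"{seq} is not in the extension of Adj")
    root, pot, gender, number, _person, case = seq
    return root + ending(pot, gender, number, case)


def extension(root: str, **fixed: str) -> Set[str]:
    """Forms in the extension of the hierarchy over root and the fixed features."""
    names = tuple(VALUES)
    forms = set()
    for combo in product(*(VALUES[name] for name in names)):
        if all(combo[names.index(key)] == val for key, val in fixed.items()):
            forms.add(form((root,) + combo))
    return forms


strong_alt = extension("alt", pot="strong")   # Adj_<alt,strong>
weak_alt = extension("alt", pot="weak")       # Adj_<alt,weak>
```

Fixing further features, as in `extension("alt", case="nom")`, corresponds to the further subclassification described above.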

This brings us back to the problem of representation in the lexicon. If all the morphological derivatives of a root morpheme a are uniquely determined by a hierarchy H_a over a, then, from a formal point of view, any expression in the extension of H_a can be chosen to represent the complete class, as long as it is marked for belonging to the class determined by H_a. Thus we might as well hold on to the traditional way of having infinitives represent verbs, and nominatives represent nouns etc., but index individual lexicon entries with the names of the hierarchy in whose extension they belong. If some of the morphological variants in the class represented by an entry are formed irregularly, however, we must list them as supplementary entries and index them with the names of the associated operations that cause the irregularities. Typical lexicon entries of German and English would therefore be

Mann_N_common, Männer_plural;
gehen_V, ging_<3,sing,past>, gegangen_perf;
woman_N_common, women_plural;
sleep_V, slept_<3,sing,past>, slept_perf;
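A lexicon organized along these lines could be sketched as a mapping from representative forms to their hierarchy index and their supplementary irregular entries. The data layout, the regular entry 'book', and the toy plural rule are hypothetical and serve only to show how the lookup falls back on a regular operation when no supplementary entry is listed.

```python
from typing import Tuple

Features = Tuple[str, ...]

# Each entry is indexed with its hierarchy; irregular variants are listed as
# supplementary entries, indexed with the features that cause the irregularity.
LEXICON = {
    "woman": {"hierarchy": "N_common", "irregular": {("plural",): "women"}},
    "sleep": {
        "hierarchy": "V",
        "irregular": {("3", "sing", "past"): "slept", ("perf",): "slept"},
    },
    "book": {"hierarchy": "N_common", "irregular": {}},   # hypothetical regular entry
}

# Toy stand-in for a regular associated operation of English.
REGULAR = {("N_common", ("plural",)): lambda root: root + "s"}


def variant(root: str, features: Features) -> str:
    """Return the variant of root for the features, preferring listed irregulars."""
    entry = LEXICON[root]
    if features in entry["irregular"]:
        return entry["irregular"][features]
    rule = REGULAR.get((entry["hierarchy"], features))
    if rule is None:
        raise KeyError(f"no rule for {features} in {entry['hierarchy']}")
    return rule(root)
```

For example, `variant("woman", ("plural",))` is answered from the supplementary entry, while `variant("book", ("plural",))` is computed by the regular rule.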

Now expressions have been described in the model-theoretic literature as referring to functions whose values, for some reference point, are again functions (mostly properties in the Fregean sense) or relations. If this is so, then it should be possible to combine basic expressions to form expressions in the same way syntactic properties were combined to form basic expressions. Hierarchies would then occur as constituents of other hierarchies, which would allow for an immediate generalization of the relation of being over. By demanding that a. every hierarchy is a hierarchy over itself and b. whenever F is a hierarchy over G, and G a hierarchy over H, then F is a hierarchy over H, being over turns into a partial ordering: it is reflexive, transitive, and antisymmetric. Moreover, if H is a hierarchy over I, then H is a context of I and vice versa. For a proof, consider that every hierarchy can be represented as an ordered pair <u,v>, where u is a name such as Det, N, etc. and v is its sequence representation. Then the equivalence follows from the definition of being over and the transitivity. The trees of a Phrase Structure Grammar are thus hierarchies, which finally closes the gap between expressions and basic expressions. Recall that expressions were defined as trees in which all free variables had been replaced by basic expressions. Both expressions and basic expressions thus occur as the values of a successive application of hierarchies.
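The two demands a. and b. can be made concrete with a small Python sketch. It assumes, purely for illustration, that hierarchies are represented by their names and that the immediate "over" pairs are given as a finite set; the function names are hypothetical. Conditions a. and b. by themselves yield reflexivity and transitivity, so the sketch checks antisymmetry separately.

```python
from itertools import product


def closure(elements, pairs):
    """Close `pairs` under a. (reflexivity) and b. (transitivity)."""
    rel = set(pairs) | {(x, x) for x in elements}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(rel), repeat=2):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


def is_antisymmetric(rel):
    """No two distinct elements may be over each other."""
    return all(a == b or (b, a) not in rel for (a, b) in rel)


over = closure({"S", "VP", "V"}, {("S", "VP"), ("VP", "V")})
print(("S", "V") in over)        # transitivity adds this pair
print(is_antisymmetric(over))    # holds for this tree-like example
```

For tree-shaped constituent structure no two distinct hierarchies are over each other, so the closure is a partial ordering; a cyclic input would fail the antisymmetry check.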

Recall now that the properties (in the Fregean sense) whose extensions are classes of expressions did not occur as referents of tree configurations, but as referents of predicates constructed from trees with the help of a λ-operator. As in the case of basic expressions, the λ-notation should be avoided, however, as it imposes an unnecessary order on the replacement of variables. As the discussion of subcategorization conditions has shown, trees are free of ambiguities and we should be free to decide from case to case if the necessity of ordering indeed arises. From now on trees shall be predicates all by themselves, referring to Fregean properties of expressions or hierarchies as we have seen.

During the discussion of predicates of this kind we focused on those we had called F_{0,u}, which included only free variables and were the first in a family <F_{j,u}>_{j<m, u∈V_L} of predicates and expressions.

Little attention was given to the F_{j,u} that occurred as intermediate results of a derivation. The latter, however, allow for yet another generalization. Thus let H_I be a hierarchy over I and let I be an expression. Then the domain of H_I can be expanded to include not only expressions, but all hierarchies H' such that l.u.b.(H_I,H') is an expression of type H which includes I as a constituent. Such an H' is called a complement of I in H, and includes free variables where H_I doesn't and constants where H_I doesn't. Moreover it should include all hierarchies in H that have free variables where H_I doesn't. Now if <H_{I,j}>_j is a family of hierarchies over I such that H_{I,0} = H_I, and H_{I,m} is an expression including I as a constituent, then the H_{I,j} (j ≤ m) form a rigid filter in H generated from I. Let 𝔽 be such a filter. We define for all H_{I,j}:

The extension of H_{I,j} is the class of all hierarchies H' for which there is a filter 𝔽 such that l.u.b.(H_{I,j},H') ∈ 𝔽;

Just as the <F_{j,u}>_{j<m, u∈V_L} did, every family <H_{I,j}>_{j<m} defines an n-place syntactic operation from expressions into expressions.
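The notions of complement and least upper bound can be illustrated with a deliberately simplified Python sketch. It rests on assumptions that are not in the text: flat tuples stand in for trees, `None` stands for a free variable, a string for a constant, and the German words are hypothetical sample data. Filters are not modelled.

```python
from typing import Optional, Tuple

Slot = Optional[str]       # None = free variable, str = constant
Hier = Tuple[Slot, ...]    # flat stand-in for a hierarchy


def lub(h1: Hier, h2: Hier) -> Optional[Hier]:
    """Slotwise join; None if the shapes differ or two constants disagree."""
    if len(h1) != len(h2):
        return None
    joined = []
    for a, b in zip(h1, h2):
        if a is not None and b is not None and a != b:
            return None
        joined.append(a if a is not None else b)
    return tuple(joined)


def is_expression(h: Optional[Hier]) -> bool:
    """An expression has no free variables left."""
    return h is not None and all(slot is not None for slot in h)


def is_complement(h_i: Hier, h_prime: Hier) -> bool:
    """h_prime has constants exactly where h_i has free variables, and vice versa."""
    return len(h_i) == len(h_prime) and all(
        (a is None) != (b is None) for a, b in zip(h_i, h_prime)
    )


h_i = (None, "Mann")       # hierarchy over I = "Mann", determiner slot still free
h_prime = ("der", None)    # candidate complement
print(lub(h_i, h_prime), is_expression(lub(h_i, h_prime)))
```

In this reading, joining a hierarchy with one of its complements always yields an expression, which matches the requirement that l.u.b.(H_I,H') be an expression including I as a constituent.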

Let me now get back to the hierarchies over basic expressions and introduce a number of subordinated hierarchies that play independent roles in a grammar of German and therefore deserve to be given names.

V_mod = MOD; N_pro = PRO;
V_aux = AUX; N_common = CN;
V_trans = TRANS; N_proper = PN;
V_intr = INTR; Det_indef = EIN;
V_place⁰ = PLACE⁰; Det_def = DEF;
V_inf = INF;
Conj_sub = SUB; Conj_rel = REL;
Conj_coord = CO;

Let furthermore CASE be a variable taking values in {NOM,GEN,DAT,ACC}. Then the following hierarchies are also hierarchies of German.

S = <NOM, VP, (CO, S')>;
VP = <V, ARG, COMP>;
CASE = <(Det_case), (A_case), N_case, (PP), (CS)>;
ARG = <(DAT), (ACC), (A), (PP)>;
COMP = <(INF), (CS)>;
PP = <P_case, CASE>;
CS = <Conj, S>;
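Reading parenthesized constituents as optional, each of the sequences above abbreviates a finite set of fully specified sequences. The following Python sketch enumerates them. The string encoding with "(X)" and the function name `expansions` are assumptions made for illustration, and optional groups such as (CO, S') are not handled.

```python
from itertools import product


def expansions(sequence):
    """All realizations of a sequence whose optional items are written "(X)"."""
    choices = []
    for item in sequence:
        if item.startswith("(") and item.endswith(")"):
            choices.append([None, item[1:-1]])  # omit the item or include it
        else:
            choices.append([item])              # obligatory item
    return [
        tuple(x for x in combo if x is not None)
        for combo in product(*choices)
    ]


ARG = ["(DAT)", "(ACC)", "(A)", "(PP)"]
COMP = ["(INF)", "(CS)"]
VP = ["V", "ARG", "COMP"]

print(expansions(COMP))
print(len(expansions(ARG)))
```

With four optional constituents ARG stands for sixteen sequences, including the empty one, while VP, having no optional constituent of its own, stands for exactly one.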

The parentheses enclosing constituents of the sequences on the right side express optionality as usual. The following rules are rules of formation. Recall that the above sequences represent hierarchies, i.e. functions from hierarchies into truth values.

1. CASE(PRO_case) = <CASE-CASE, PRO_case>;
2. CASE(Det_i) = <Det_i, Adj_j, N_k, ...>; where
   a. i = <pot, gender, number, person, case>; j = <pot', gender, number, person, case>; k = <gender, number, person, case>;
   b. pot' = weak if pot = strong, and pot' = strong if pot = weak;
3. VP(TRANS) = <V_trans, ARG-DAT, ...>;
4. VP(INTR) = <V_intr, ARG-ACC, ...>;
5. VP(PLACE⁰) = <V_place⁰, ARG-DAT-ACC, ...>;
6. VP(AUX) = <AUX_tense, VP-V_tense, V_perf, ...>;
7. VP(MOD) = <MOD_tense, VP-V_tense, V_inf, ...>;
8. CS(SUB) = <SUB, S-V_tense, V_tense>;
9. CS(REL) = <CASE_i, REL_i, S-CASE_i-V_tense, V_tense>;
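Two of the mechanisms used in these rules lend themselves to a small Python sketch: the index bookkeeping of rule 2 and the argument frames of rules 3 to 5. The sketch assumes that "X-Y" means the sequence X without the constituent Y, which is an interpretation and not something the rules state explicitly; the function names and the sample feature values are hypothetical.

```python
ARG = ("DAT", "ACC", "A", "PP")  # constituents of ARG, optionality marks dropped


def minus(sequence, *removed):
    """Assumed reading of X-Y: the sequence X without the constituents Y."""
    return tuple(c for c in sequence if c not in removed)


def vp(verb_class):
    """Rules 3 to 5: pick the ARG frame that goes with the verb class."""
    removed = {
        "TRANS": ("DAT",),
        "INTR": ("ACC",),
        "PLACE0": ("DAT", "ACC"),
    }[verb_class]
    return ("V_" + verb_class.lower(), minus(ARG, *removed))


def agreement(i):
    """Rule 2: derive the adjective index j and the noun index k from i."""
    pot, gender, number, person, case = i
    flipped = "weak" if pot == "strong" else "strong"  # rule 2b
    j = (flipped, gender, number, person, case)
    k = (gender, number, person, case)
    return j, k


print(vp("TRANS"))
print(agreement(("strong", "masc", "sg", 3, "NOM")))
```

The determiner index thus fixes both other indices: the adjective copies everything except the strong or weak value, which it flips, and the noun copies everything except that value, which it lacks.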

1. and 2. are rules of noun phrase formation, taking care of the facts that 1. only pronouns cannot combine with adjectives and determiners in German, and 2. adjectives, determiners, and nouns agree in number, gender, person, and case. 2. moreover makes clear that the ending of an adjective is chosen in dependence on a preceding strong or weak ending. Agreement can, of course, also be taken care of in the definition of S, which has not been done here. The selection of rules is fragmentary and serves mainly illustrative purposes. Let me add three reduction principles, two of which were introduced before in the discussion of trees. Let, for all u ∈ V_L, [_u x] be a tree of type u with a complete segment x of subtrees of degree 1. Let furthermore u[v] = [_u ...[_v x]...]. We say that u[v] is complete, iff v is a complete segment of [_u x] of degree […]. Then

10. x[y] → y, iff x[y] is complete;
11. S[x, S'[x]] → S[x, S'-x];
12. S[HÖHfVt(m 1 - ΚδΜ = ,];

The result of 12. reduces to VP[V_inf] by 10. Principle 11. is another version of the weak second reduction principle. In 3. to 8., now, ARG has been singled out to allow for a separate formulation of the rules concerning word order in the inner field of German sentences. 6. and 7. form complements for modals and auxiliaries and put them in end position. 8. and 9. take care of verb-end position in subordinate clauses; 9. moreover extraposes a particular noun phrase that occurs as a constituent of S into antecedent position. By the definition of the extension of hierarchies, which takes care of variable binding, the result of 9. is input for CASE. An application of CASE to the result of 9., then, results in the formation of a noun phrase with a relative clause complement.

REFERENCES

1. Bowers, J.S. The Theory of Grammatical Relations. Ithaca, 1980.

2. Bowers, J.S., and Reichenbach, U. Montague Grammar and Transformational Grammar. Linguistic Analysis 5, 2: 195-245 (1979).

3. Carnap, R. Meaning and Necessity. Chicago, 1956.

4. Carnap, R. Grundlagen der Logik und Mathematik. München, 1973.

5. Carnap, R. The Logical Syntax of Language. London, 1971.

6. Chihara, C.S. Ontology and the Vicious Circle Principle. Ithaca, 1973.

7. Chomsky, N. Aspects of the Theory of Syntax. Cambridge, Mass., 1965.

8. Danto, A., and Morgenbesser, S. (eds.). Philosophy of Science. New York, 1970.

9. Felgner, U. Modell-Theory. Mimeographed handwritten script. Heidelberg, 1973.

10. Frege, G. Sinn und Bedeutung. In Patzig, G. (ed.). Funktion, Begriff, Bedeutung. Fünf logische Studien. Göttingen, 1962.

11. Frege, G. Funktion und Begriff. Ibid.

12. Frege, G. Was ist eine Funktion? Ibid.

13. Frege, G. Grundlagen der Arithmetik. Austin, J.L. (ed.). Evanston, 1974.

14. Fodor, J.D. Semantics. Theories of Meaning in Generative Grammar. New York, 1974.

15. Gloede, E. Axiomatische Mengenlehre. Mimeographed script. Heidelberg, 1970.

16. Gross, M., and Lentin, A. Mathematische Linguistik. Heidelberg, 1971.

17. Harris, J., and Severens, R. (eds.). Analyticity. Chicago, 1970.

18. Jech, T. Lectures in Set Theory with Particular Emphasis on the Method of Forcing. Berlin, New York, 1971.

19. Kalish, D., and Montague, R. Logic: Techniques of Formal Reasoning. New York, 1964.

20. Klaus, G. Moderne Logik. Berlin, 1972.

21. Kripke, S. Semantical Considerations on Modal Logic. Acta Philosophica Fennica 16: 83-94 (1963).

22. Lackey, D. (ed.). Essays in Analysis by Bertrand Russell. New York, 1973.

23. Lewis, D. Counterfactuals. Cambridge, Mass., 1973.

24. Linsky, L. (ed.). Reference and Modality. Oxford University Press, 1971.

25. Mackie, J. Truth, Probability and Paradox. Oxford University Press, 1973.

26. Montague, R. English as a Formal Language. In Thomason, R. (ed.). Formal Philosophy. London, 1974.

27. Montague, R. Universal Grammar. Ibid.

28. Montague, R. The Proper Treatment of Quantification. Ibid.

29. Montague, R. Pragmatics and Intensional Logic. Ibid.

30. Montague, R. On the Nature of Certain Philosophical Entities. Ibid.

31. Partee, B. Comments on Richard Montague's "Quantification in Ordinary English." In Hintikka, J., Moravcsik, J., and Suppes, P. (eds.). Approaches to Natural Language. Dordrecht, 1973.

32. Quine, W. Word and Object. Cambridge, Mass., 1960.

33. Quine, W. Philosophy of Logic. New York, 1970.

34. Quine, W. From a Logical Point of View. New York, 1973.

35. Rodman, R. Papers in Montague Grammar. Los Angeles, 1972.

36. Reichenbach, U. On the Compatibility of Montague Grammar and Transformational Grammar. Unpublished Paper. Ithaca, 1976.

37. Russell, B. On Denoting. Mind 14: 479-93 (1905).

38. Schwabhäuser, W. Modelltheory. Mannheim, 1971.

39. Stalnaker, R. Pragmatics. Synthese 22: 272-89 (1970).

40. Tarski, A. Der Wahrheitsbegriff in den formalisierten Sprachen. Studia Philosophica 1: 261-405 (1936).

41. Thomason, R. Formal Philosophy. London, 1974.

42. Thomason, R., and Stalnaker, R. A Semantic Theory of Adverbs. Linguistic Inquiry 4: 195-220 (1973).

43. van Fraassen, B. Formal Semantics and Logic. New York, 1971.

44. Wittgenstein, L. Philosophische Untersuchungen. Berlin, 1971.

45. Wittgenstein, L. Philosophische Grammatik. Berlin, 1973.

46. Wittgenstein, L. Tractatus Logico-Philosophicus. Berlin, 1973.
