1. INTRODUCTION. There seem to be two main concerns and goals behind the work on natural language using computers, converging at many points. One goal, that of more natural communication with computers, has its roots in the concern for application of the fruits of computational research, amplified by the recent race to build a new generation of computers, in order to make them even more user-oriented and more useful for our purposes. The other goal is theoretical investigation of human language, and some researchers hope that the computer might be a very useful tool in this task; this hope is expressed by Winograd (1983: 13) thus: 'The computer shares with the human mind the ability to manipulate symbols and carry out complex processes that include making decisions on the basis of stored knowledge. Unlike the human mind, the computer's workings are completely open to inspection and study, and we can experiment by building programs and knowledge bases to our specifications. Theoretical concepts of program and data can form the basis for building precise computational models of mental processing. We can try to explain the regularities among linguistic structures as a consequence of the computations underlying them.' After over thirty years of computational experience with natural language, there are good reasons for looking at work on automatic NLP in its own right, as an intellectual tradition. At the same time, however, it has become plain that the approach whereby computational parsing is an application of (preferably well-founded!) linguistic theory can be much more rewarding, both in its resulting descriptive power and in its theoretical significance (cf. Wilks and Sparck Jones, 1985). The 'revolution' in linguistics, of which Chomsky (1957, 1965) is generally considered to have been the initiator, has brought a different approach to language - the generative approach. Some linguists (cf.
Newmeyer, 1980) feel that we have learned more about language within the last thirty years than in all the ages taken together, which might indeed well be the case. At the same time, the 'Standard Theory' had many flaws in it and had to be first 'extended' and then 'revised' in order to account for the linguistic facts that it was found not to account for adequately. In addition, many researchers abandoned the typically 'Chomskyan' model altogether and started looking for a fresh one - Lexical Functional Grammar (Bresnan, 1978, 1982), Generalized Phrase Structure Grammar (Gazdar, 1982, 1983 and Gazdar et al., 1985) or Categorial Grammar (Steedman, 1985) may be seen as examples of this trend. Yet there is an even more serious flaw in Chomsky's paradigm which may call for a major revision of our approach to natural language: in the vast majority of cases, its findings are based mainly on the analyses of just one language, namely English. In many respects English, even though it may be an interesting language from the linguistic point of view, has many facets of its grammatical organisation that are strictly parochial in nature (idiosyncrasy in language is not such a strange phenomenon after all), and it obliterates many of the problems that are found in many other languages. One such feature of English is its relatively tight formal/structural organization, which can be seen as a reason for the fact that quite a large subset of its syntax can be described using the Immediate Constituent approach. Another weakness of 'standard models' of language description was their unbalanced distribution of the descriptive burden, most of which was carried by syntax.
Generative Semantics, the Lexicalist Hypothesis, and then LFG and the most recent approaches have tried to remedy this situation: the first by 'turning' the model around so as to equate semantics with Deep Structure, the second by relegating more responsibility to the lexicon (especially in respect of word formation), and the others by giving both the lexicon and a strong (interpretive) semantics a fair share of the responsibility (cf. Cann, 1985; Dowty, 1982). I shall discuss some notions that have to be taken into account in the parsing of natural languages, especially with respect to 'inflectional' languages, although it seems that, in the approach that will be taken, a unified treatment can be applied to both inflectional and configurational languages. For the sake of clarification: I shall use the term 'inflectional' for languages like Polish, and 'configurational' for languages like English, as these terms seem to reflect the characteristic ways in which the respective languages organize their grammars, as we shall see in later sections. 2. INFLECTION. Polish has a 'fusional' type of inflection, i.e., the word-to-morpheme ratio is one-to-many and the morphemes are fused into a single unanalysable morph; therefore Hockett's Word-and-Paradigm model characterises its morphology most adequately. The role of inflection is to cumulate grammatical functions (Jakobczyk: 385); thus in Polish czyta-l ('[he] read'), 'l' indicates tense, person, number, and gender. Inflectional categories are regular and systematic sets, given a priori. In English, a verbal inflection has to cover one function, namely tense. Thus we can say that an English verb consists of two constituents: STEM + TENSE affix, the only inflectional ending in English expressing tense, person and number simultaneously being -s (Jakobczyk, op. cit.).
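The fusional analysis just described can be sketched in code. The following is a minimal illustration, not part of the source analysis: the feature tables and their entries are my own assumptions, meant only to show how a single Polish morph cumulates several grammatical functions while the English ending -s covers fewer.

```python
# Illustrative sketch: fusional inflection as a mapping from a word form
# to a bundle of grammatical functions. Table contents are assumptions.

# czyta-l: the single morph '-l' cumulates tense, person, number and gender
POLISH_FORMS = {
    "czytal":  {"lexeme": "CZYTAC", "tense": "past", "person": 3,
                "number": "sing", "gender": "masc"},
    "czytala": {"lexeme": "CZYTAC", "tense": "past", "person": 3,
                "number": "sing", "gender": "fem"},
}

# English STEM + TENSE: '-s' expresses tense, person and number at once
ENGLISH_FORMS = {
    "reads": {"lexeme": "READ", "tense": "pres", "person": 3,
              "number": "sing"},
}

def analyse(form, table):
    """Return the fused feature bundle for a listed word form, or None."""
    return table.get(form)

print(analyse("czytala", POLISH_FORMS))
print(analyse("reads", ENGLISH_FORMS))
```

Note that the Polish bundle carries one more category (gender) than the English one, mirroring the constituent lists given in the text.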
A Polish one, however, consists of more constituents: STEM + TENSE + ASPECT + PERSON + NUMBER + GENDER. These constituents are, to be more precise, functions to be expressed either inflectionally or by other means, eg. a prefix denoting aspect. Although the details of inflection are not essential here, the objective being rather the finding of more general principles, some surface differences between English and Polish in terms of inflection might be in order to give the flavor of the task of practical implementation of the notion of inflection:
- gender distinction in the Polish past tense is marked inflectionally, eg. czytal [masc], czytala [fem] ('he/she read');
- the inflectional future tense occurs in Polish but not in English, eg. bedzie czytal ('he will read');
- the conjugations of the verb BE are the most similar, although in the case of the English conjugation some functions are performed by pronouns, not only inflectionally;
- the category of aspect is obligatory in Polish, but it is not marked inflectionally, eg. chodzil ('[he] went'), chadzal ('[he] used to go') (cf. App. 1, pp. 1-19 to 1-21);
- the categories of person, number and gender are expressed inflectionally in Polish (cf. App. 1, files PLEX4a, PLEX4b).
3. WORD, LEXEME, MORPHOSYNTACTIC CATEGORY. Words may be considered to be, besides sentences, the most important units in language. But it seems that, just as the sentence cannot be given a satisfactory one-sentence definition, but must rather be defined in terms of a device (grammar) generating sentences in a particular language, so the notion of the word cannot be defined without constructing a dictionary of that language. Thus sentences are objects generated by the grammar of a language, while words are objects listed in the dictionary to the left of each lexical entry. Krzeszowski (1981) gives a definition in which words are put in contrast with other units of linguistic analysis: '...
a word is the smallest significant unit of a given language, which is internally stable (in terms of the order of component morphemes) but potentially mobile (permutable with other words in the same sentence)' (p. 134). This definition makes it possible to distinguish between the word and the phrase (not the smallest significant unit), the word and the morpheme (not positionally mobile within a word), as well as the word and the phoneme (not significant). In this way the definition isolates lexicology from syntax, morphology and phonology. However, all these areas are mutually interrelated, and in actual analytic practice it is often difficult to make categorial decisions, especially between lexicology and syntax. In languages in which words cannot be easily isolated in texts, the problem of separating lexicology from grammar is especially acute. Moreover, we can consider as words not only compounds (blackboard, typewriter), but also phrases of various degrees of fixedness, such as red tape (= bureaucracy), kick the bucket (= die), etc. If one accepts the view that the lexicon has to deal with compounds and fixed expressions, one faces the formidable task of delimiting the upper bound of lexicology, separating it from syntax (cf. Becker's interesting exposition of the frequency with which these expressions are used in ordinary speech). The basic problem is to what extent constraints on collocations of particular items in syntactic constructions are subject to listing in a dictionary and to what extent they are statable in terms of rules (cf. the treatment of idioms in GPSG in Gazdar et al., 1985). This in turn is connected with a more general problem of what might be called the 'precision' of grammars. Early transformational models were crude in that they imposed no constraints on the co-occurrence of various content words (nouns, verbs, adjectives, adverbs) in syntactically well-formed preterminal strings.
Selection restrictions were introduced by Chomsky (1965); this, however, is problematic since it requires the introduction of an appallingly large number of theoretical concepts called semantic markers. 3.1. Lexeme, Word-Form, Grammatical Word. Generally, on one hand, a word is an element of the dictionary of a language, as a unit composed of elements/'units' of 'meaning'; on the other hand, a word is an element of the grammatical system of a language, as a basic unit of syntactic operations and as a point where all sorts of grammatical relations converge. In the first sense, the term lexeme is used, to denote 'word' as an abstract unit of the dictionary. Lexemes, when they enter into syntagmatic relations with other parts of the sentence, may be realized by more than one word form. For instance, the lexeme PALIC ('to smoke') may be realized as bedzie palil ('[he] will smoke'). Bedzie palil is a unit which has a syntactic function as a whole rather than as two individual units bedzie and palil. Also, in the dimension of paradigmatic relations, bedzie palil stands as a unit, in functional opposition to pali ('[he/she/it] smokes'), palil ('[he] smoked'), pal ('smoke!'), palic ('to smoke'), palac ('smoking'). It must be underlined that the element palil₁ in bedzie palil is not composed of the unit bedzie (i.e., < BYC + future tense >) and palil₂ (i.e., < PALIC + past tense >); palil₁ and palil₂ are two different functional units of the language, despite their identical phonological form; thus palil₂ < past tense > and palil₁ < future tense > are two different grammatical units of equal functional status. Being such, they enter certain syntactic operations, and are called grammatical words/forms. The term 'grammatical word', then, is used to denote a functional unit of a language. (There is a further distinction between the terms phonological word and orthographic word, but this is not essential for present purposes.)
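The distinction between the two homophonous units palil₁ and palil₂ can be made concrete by representing a grammatical word as a pair of lexeme and feature bundle. This is a sketch under my own encoding assumptions, not a formalism taken from the source:

```python
# Sketch: a grammatical word as (lexeme, features). Two units may share
# a phonological form yet remain distinct functional units.
from dataclasses import dataclass

@dataclass(frozen=True)
class GrammaticalWord:
    lexeme: str
    features: frozenset  # set of (attribute, value) pairs
    form: str            # phonological/orthographic shape

# palil(1) occurs inside the analytic future 'bedzie palil';
# palil(2) is the synthetic past form. Same shape, different features.
palil_1 = GrammaticalWord("PALIC", frozenset({("tense", "future")}), "palil")
palil_2 = GrammaticalWord("PALIC", frozenset({("tense", "past")}), "palil")

print(palil_1.form == palil_2.form)  # True: identical phonological form
print(palil_1 == palil_2)            # False: distinct grammatical units
```

The equality test captures the point in the text: functional identity depends on the whole (lexeme, features) pair, not on the surface form alone.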
Word-form is the smallest segment of text which fulfils at least one of the following conditions (after Grzegorczykowa et al., 1984):
1) it may occur as an autonomous utterance, eg. tak ('yes'), czyzby ('really'), jemu ('to him'), lampe ('the lamp' < ACCUSATIVE >);
2) it is in syntagmatic relation with other elements of the predicate, even though the order can be changed, eg.:
Komu mam to dac? Jemu. / Komu to mam dac? ('Who shall I give this to?' 'To him.')
Jaki on jest? Nudny. / On jaki jest? ('What is he like?' 'Boring.')
chcialbym / moze bym chcial ('I would like')
3) it is a continuous linear segment, i.e., it cannot be split into subsegments (even despite orthography), e.g. po polsku ('in Polish'), mniej wiecej ('more or less'), dzien dobry ('good day'), etc.
But it is not word forms but grammatical words, i.e., functional units defined by a complex of syntactic functions and grammatical categories, which are the object of grammatical description. Word-form is interesting for its relations to the grammatical word and as an elementary unit of the syntactic rules of linear order. Often, a word form is identified with a grammatical word - this happens where a grammatical word is represented by a single word form, eg.
PIES < dat, sing > => psu ('dog')
PISAC < ind, pres, 2nd person, sing > => piszesz ('[you] write'), etc.
A grammatical word represented by a single word form can be called a synthetic grammatical word (synthetic grammatical form). But, as seen above, a grammatical word/form can also be represented by more than one word form, as in
PISAC < fut, 3rd person > => bedzie pisal ('[he] will write')
Grammatical words/forms with this structure can be called analytic (or periphrastic) grammatical words/forms (after Grzegorczykowa et al., op. cit.). 3.2. Variants of a grammatical word. The same grammatical word may be represented by different word forms (or sequences of word forms) which are formal variants of a given grammatical word.
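The synthetic/analytic distinction amounts to asking how many word forms realize a single grammatical word. A minimal sketch, using the examples above (the feature labels and the table itself are my simplification):

```python
# Sketch: realization of grammatical words as sequences of word forms.
# Synthetic = one word form; analytic (periphrastic) = more than one.
REALIZE = {
    ("PIES",  ("dat", "sing")):          ["psu"],              # synthetic
    ("PISAC", ("ind", "pres", "2sg")):   ["piszesz"],          # synthetic
    ("PISAC", ("fut", "3sg", "masc")):   ["bedzie", "pisal"],  # analytic
}

def is_analytic(lexeme, feats):
    """True if the grammatical word is realized by several word forms."""
    return len(REALIZE[(lexeme, feats)]) > 1

print(is_analytic("PISAC", ("fut", "3sg", "masc")))  # True
print(is_analytic("PIES", ("dat", "sing")))          # False
```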
The distribution of the variants of a given grammatical word may be
1) identical - then we talk about facultative variants of the word, eg. INZYNIER => inzynierowie or inzynierzy ('engineers')
2) in complementary distribution - positional variants of a given grammatical word; here, a lexeme may occur in several variants, conditioned
a) morphologically: BEZ => bez or beze ('without')
b) phonologically, as in the enclitic and stressed forms of the pronouns JA, TY, ON ('I', 'you', 'he', respectively), eg. tobie < +stress > vs. ci < -stress > ('you')
c) syntactically, eg. ROK ('year') => latami or laty, where laty is used only in connection with the prepositions przy or przed + numeral: przed trzema laty ('three years ago')
Of particular interest for my purposes is the status of syntactically determined variants of a lexeme. Thus, for instance, the word forms of the lexeme DOBRY ('good'): dobry, dobra, dobre occur in mutually exclusive syntactic contexts, i.e., are in complementary distribution:
- dobry enters into syntactic relations only with masculine nouns (in fact, the picture is rather more complicated: the problem of agreement and government will be treated in later sections)
- dobra - only with feminine nouns
- dobre - with neuter singular and nonvirile plural nouns.
These expressions could then be treated as syntactically determined positional variants of one lexeme (DOBRY in this case). However, the form of each of these expressions also fulfils a specific semiotic function: it is an indicator of the gender of the noun with which the adjective DOBRY occurs in an expression. The linguistic function of these expressions is then complex. On one hand, they are an expression of the meaning which constitutes the content of the adjective DOBRY as a lexical unit; on the other hand, they fulfil an intratextual function, signalling the existence of a syntactic relationship (connection) between that lexical unit and the unit with which it co-occurs in a given expression.
Dobry, dobra, dobre are then three different functional units:
DOBRY + < masc > => dobry
DOBRY + < fem > => dobra
DOBRY + < neut > => dobre
The ability of a given adjectival form to co-occur with nouns of a specific gender is called the (syntactically dependent) grammatical gender of an adjective, recognizing grammatical gender as a morphological (inflectional) category of the adjective. To conclude, the functional and formal differences between dobry, dobra, dobre have a categorial character - they concern many dictionary units which belong to the class of adjectives. This fact is reflected in the syntax, since the rule is general. In contrast, phonological variants of a word, eg. jego, niego, which are non-categorial features of a lexical unit, are better expressed in the dictionary information about a given lexeme. At the grammatical level, then, a word is referred to by the lexeme to which it belongs and the place which it occupies in the paradigm. Thus czytal ('he was reading') would be represented as belonging to the lexeme CZYTAC ('read'), and as occupying the place in the paradigm defined by the terms Imperfect, Indicative, 3rd sing., Active, etc. - the terms being unordered properties of the word considered as a whole, and morphosyntactic, since they have a role both in morphology and in syntax: thus, for instance, it is a syntactic statement that certain prepositions govern certain cases (see App. 1, p. 25 for an illustration), and case also figures in morphological rules, eg. rules for case inflections. A feature of this representation is that the lexeme, eg. CZYTAC, is a different sort of primitive or elementary term from the morphosyntactic property, eg. Past Participle. This is reflected in the different roles which they are said to play in the derivation of word-forms: the lexeme as the source of the root, and the morphosyntactic properties (plus, in some cases, the inflectional class) as the features which select the operations which are applicable to it.
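The categorial character of this agreement lends itself to a small sketch: selecting the variant of DOBRY from the gender (and number) of the accompanying noun. The paradigm table below is a deliberate simplification of the fuller picture the text promises for later sections (case and number agreement are ignored except where the text mentions them):

```python
# Sketch of syntactically conditioned variants of the adjective DOBRY.
# The table encodes only the distribution stated in the text.
DOBRY = {
    ("masc", "sing"):      "dobry",
    ("fem", "sing"):       "dobra",
    ("neut", "sing"):      "dobre",
    ("nonvirile", "plur"): "dobre",
}

def agree(adj_paradigm, noun_gender, noun_number):
    """Pick the adjective variant signalling the noun's gender."""
    return adj_paradigm[(noun_gender, noun_number)]

print(agree(DOBRY, "fem", "sing"))   # dobra
print(agree(DOBRY, "neut", "sing"))  # dobre
```

The same lookup, read in reverse, illustrates the 'intratextual' function the text describes: the adjective form is an indicator of the gender of the noun it co-occurs with.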
However, the formal terms which are exemplified (lexeme, word, morphosyntactic category, and so on) can be applied to any language for which these techniques are felt appropriate. Only the substantive or descriptive terms will vary (Matthews, 1974: 67). 4. SENTENCE. In linguistic analysis, the sentence is considered one of the fundamental units of grammatical description. However, what one has direct access to in the analysis of language is actual utterances. Therefore the distinction introduced by de Saussure between langue and parole, or, later, by Chomsky, between 'competence' and 'performance', which reflects the fundamental difference between a system and its use, can be applied to establish both the difference and the relation between sentences and utterances. But each of the terms can be used to refer to different phenomena; I shall therefore try to disambiguate them in the first place. Thus the term 'utterance', in one sense, '...may refer to a piece of behaviour' (Lyons, 1977: 26), that is, to the process of uttering, the product of this behaviour being a speech act. In the second sense, 'utterance' may denote the product of that behaviour, or process, in the form of inscriptions, i.e., sequences of symbols in some physical medium. As for the term 'sentence', it is defined in Hurford & Heasley as '...a string of words put together by the grammatical rules of a language [...,] the ideal string of words behind various realizations in utterances and inscriptions' (1983: 16). Lyons draws a distinction between a more abstract and a more concrete sense of the term. Thus, on one hand, 'sentence' may denote the product of a bit of language behaviour, a subclass of utterance-inscriptions, the segments of speech (1977: 29; 1981: 196), and, being such, they can be uttered. He calls them 'text-sentences'. On the other hand, he uses the term 'sentence' to refer to 'abstract theoretical constructs, correlates of which are generated by the linguist's model of the language...'
(1977: 622). They are used by the linguist in order to account for the '...acknowledged grammaticality of certain potential utterances and the ungrammaticality of others' (Lyons, 1981: 196); Lyons calls them system-sentences. As such, they are the maximum units of grammatical description. Chomsky's grammar, then, generates system-sentences. In this work, I shall use the term 'sentence' in both the sense of system-sentences and that of text-sentences, but this type-token distinction should nevertheless be borne in mind throughout. I shall now proceed to discuss a few aspects of analysis at the sentence level, considering especially word order. Some different approaches will be looked at, in search of an optimal and, hopefully, the most adequate way of treating both inflectional languages and, to some extent, configurational languages. Undoubtedly, an adequate grammatical description is a prerequisite for work on the computational analysis of natural language. 4.1. Word Order and Grammaticality. From the point of view of the formal, structural well-formedness of complex expressions, the following requirements must be fulfilled: 1. formal (syntactic/relational) ability of co-occurrence; 2. linear order of the expressions (Topolinska, 1984). In Polish, grammatical incorrectness follows from failure to observe the rules of co-occurrence (distribution) of grammatical forms; when the rules of linear order have not been observed, the sentence is not ungrammatical; this is not the case in English. It seems that in fixed word order languages, the position of an expression in the linear sequence depends solely on its formal (syntactic) properties. In free word order languages, the position usually does not violate the rules of linear order, and the final ordering depends on the theme-rheme formation rules. Nevertheless, a certain number of linear order rules is subordinated to the rules of structural formation and cannot be formulated independently.
Thus two of the following sentences have the wrong linear order:
(a) * Sie Jana spodziewam.
(a') Spodziewam sie Jana. ('I'm expecting Jan')
(b) * Piotr na patrzy mnie. ('Peter' 'at' 'is looking' 'me')
(b') Piotr patrzy na mnie. ('Peter' 'is looking' 'at' 'me'; 'Peter is looking at me')
but they are interpretable semantically. Now, the first of the following sentences is grammatically incorrect:
(c) * Nasze idziemy do kina. ('our' 'go' [pl, 1st pers] 'to' 'cinema')
(c') My idziemy do kina. ('We are going to the cinema')
The second of the following, in turn, is ill-formed semantically:
(e) Kot pije mleko. ('The cat is drinking the milk')
(e') * Radosc pije odwage. ('the joy' 'is drinking' 'the courage')
These three sets of rules, i.e., semantic structure formation rules, formal (syntactic) structure formation rules and linear word order rules, are to some extent autonomous, since one can form sentences that violate just one set of rules. All three are, however, required to produce fully grammatical sentences. 4.2. Word Order and Configurationality. As has just been observed, Polish and English differ from each other in terms of the freedom with which constituents may be rearranged within phrases. This general difference has formed the basis of a traditional typological distinction between 'free-word-order' languages such as Polish or Latin and 'fixed-word-order' languages such as English. The theory of grammar should account for this variation in terms of specific parameters embedded within the deductive structure of the theory, so that the contrast between 'free word order' and 'fixed word order' can be traced to the options exercised by particular grammars, possibly at the points specified by Universal Grammar (Stowell: 1). Within the scientific tradition of generative grammar, it has generally been assumed that fixed constituent order is determined primarily by the formulae of the context-free rewrite rules of the categorial component of the base.
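The partial autonomy of the three rule sets can be sketched as a checker that reports which layer a sentence violates. The toy 'rules' below encode only the judgments given for examples (a), (c) and (e') above; everything else about the encoding is my own assumption:

```python
# Illustrative sketch: three partly autonomous rule layers. A sentence
# may violate exactly one of them and remain interpretable.
def check(words):
    """Return the rule layers violated by a (lowercased) word list."""
    violations = []
    if words and words[0] == "sie":           # the clitic 'sie' cannot be initial
        violations.append("linear order")
    if words[:2] == ["nasze", "idziemy"]:     # form co-occurrence mismatch
        violations.append("syntactic co-occurrence")
    if words[:2] == ["radosc", "pije"]:       # selectional (semantic) clash
        violations.append("semantic structure")
    return violations or ["grammatical"]

print(check(["sie", "jana", "spodziewam"]))   # ['linear order']
print(check(["spodziewam", "sie", "jana"]))   # ['grammatical']
```

Each example sentence trips exactly one layer, which is the sense in which the three rule sets are autonomous; full grammaticality requires passing all three.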
Each PS rule defines the internal structure of a particular type of constituent, specifying the set of constituents which it immediately dominates and the linear order in which these constituents appear. For instance, here is a simplified version of the PS rule for VP:
VP --> V (NP) (PRT) (NP) (PP) (PP) (S')
The categorial rule system is supposed to be responsible for constituent order within phrases - including sentences (S/S') and phrases projected from the major lexical categories (NP, VP, PP, etc.). The notation of CF rewrite rules assumed in most versions of Generative Grammar (both in the Aspects model and in the GB framework) requires that a fixed constituent order be established for each phrase type, since the system of grammatical relations is defined structurally. Consequently, free-word-order languages are treated within these theories as being identical to fixed-word-order languages at the level of syntax. Using GB terminology, the right (or LF) side of the grammar is also identical. The difference is in the left (or PF) side of the grammar. There, local optional transformations, called 'scrambling' rules, operate to derive the observed surface word orders. Thus in Aspects, Chomsky writes: 'In general, the rules of stylistic reordering are very different from the grammatical transformations, which are so much more deeply embedded in the grammatical system. It might, in fact, be argued that the former are not so much rules of grammar as rules of performance...' (1965: 127). The scrambling theory seems to lose some of its theoretical interest in the implementation, when one notices that few real predictions are made by it. Worse still, it practically treats as a matter of stylistic choice a phenomenon which seems, in many relevant respects, strictly grammatical (cf. Gorna, 1976: 202; Newmeyer, 1980). Moreover, it addresses the problem of word order without providing any explanation of its underlying mechanisms.
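The way such a PS rule fixes both membership and linear order can be sketched as a recognizer for the simplified VP rule quoted above. The greedy left-to-right matching strategy is my own assumption, chosen only to keep the sketch short:

```python
# Sketch: the simplified VP rewrite rule, with parenthesized constituents
# treated as optional slots. A '?' suffix marks an optional slot.
VP_RULE = ["V", "NP?", "PRT?", "NP?", "PP?", "PP?", "S'?"]

def matches_vp(cats):
    """True if the category sequence fits the rule in the stated order."""
    i = 0
    for slot in VP_RULE:
        optional = slot.endswith("?")
        cat = slot.rstrip("?")
        if i < len(cats) and cats[i] == cat:
            i += 1            # slot filled by the next category
        elif not optional:
            return False      # an obligatory slot went unfilled
    return i == len(cats)     # no categories left over

print(matches_vp(["V", "NP", "PP"]))   # True
print(matches_vp(["NP", "V"]))         # False: order is fixed by the rule
```

The second call shows the point made in the text: under CF rewrite rules, a permuted order is simply not admitted, which is why scrambling has to be relegated to a separate (PF) component.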
There is also a related problem, termed by linguists configurationality. It has been suggested that the explanation of configurationality and word-order-related phenomena may be found in the X-bar theory of the categorial component which, among other things, offers us two dimensions: category, and 'type' (or hierarchical depth), along which rules, and therefore languages, may vary (Hale: 3). The dimension of hierarchical depth is probably the central one in relation to the question of configurationality, since it permits phrase markers to be relatively 'flat' or relatively 'hierarchical'. Thus if we take the following set of rules:
(1) X" --> ...X'...
(2) X' --> ...X...,
two core linguistic types can be defined along this dimension:
(a) two-bar languages, i.e., those whose grammars utilize the endocentric PS rule schemata (1) and (2); such languages may be termed configurational;
(b) one-bar languages, whose sole endocentric rule schema is (2); these may be termed nonconfigurational.
The respective structures of the two language types would be as follows:
(a') for two-bar languages
        X"
       /  \
     A"    X'
          /  \
         X    B"
(b') for one-bar languages
        X'
      / | \
    A'  X  B'
(also cf. Cann, 1986: 28, 30 for a similar approach). Indeed, it seems to be the case for Polish that we only need to distinguish between two levels in the structure of its PS rules: the 'lexical' (X) level (i.e., the one where the 'head' of the phrase is), and the 'phrasal' (XP) level. The distribution of the dependent categories in the two types of languages will, as a consequence, also differ:
(a'') for two-bar languages it might be like this
            XP
           /  \
    SPEC(X')   X'
              /  \
        MOD(X')   X'
                 /  \
           ARG(X)    X
(b'') and like this for one-bar languages
               XP
        /    /   |    \
  SPEC(X) MOD(X) X  ARG(X)
(The linear order is arbitrary.)
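The contrast between the two structural types above reduces to tree depth, which a small sketch can make explicit. The tuple encoding of phrase markers is my own assumption:

```python
# Sketch: two-bar (hierarchical) vs one-bar (flat) phrase markers,
# following schemata (1) and (2). A node is (label, children).
two_bar = ("X''", [("A''", []),
                   ("X'", [("X", []), ("B''", [])])])
one_bar = ("X'", [("A'", []), ("X", []), ("B'", [])])

def depth(node):
    """Number of levels in a phrase marker."""
    label, children = node
    return 1 + max((depth(c) for c in children), default=0)

print(depth(two_bar))  # 3: configurational, room for sub-phrasal domains
print(depth(one_bar))  # 2: nonconfigurational, a single 'flat' domain
```

As the text goes on to argue, it is exactly this extra level that lets government (a head-to-sister relation) partition a two-bar structure into distinct sub-phrasal domains, and its absence that makes government inert in the flat case.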
A common characteristic of configurational languages - not a defining criterion, but a fairly consistent property nonetheless - is the relative 'tightness' of grammatical organization; in particular, a relatively straightforward and consistent relationship between theta-role assignment and structural position. In short, in configurational languages, grammatical principles are typically articulated in structural terms; thus theta-roles are assigned to structurally defined positions. In this regard, non-configurational languages are characterized by much greater 'looseness' of grammatical organization. Hale (op. cit.) suggests that the principle of government, which is behind theta-role assignment, being entirely derivative of sisterhood (since it is a relation that holds between the head of a category and its immediate sisters), can operate only in a structure like (a) above - thus it 'clicks on' in two-bar languages; in a 'flat' structure like (b), government as defined above cannot serve to partition a structure into distinct sub-phrasal domains - therefore this principle 'shuts down' in one-bar languages (op. cit., p. 3). In configurational languages, principles such as abstract case assignment and theta-role assignment are dependent upon government. In contrast, in non-configurational languages, configuration alone cannot account for differences in case and theta-role assignment. Therefore, if an NC language uses case, it is inherent case, not an assigned one. Thus in Polish, for example, inherent case is case associated with nominal expressions by virtue of the word-formation component alone. And with respect to the notion of theta-positions (i.e., positions to which a theta-role is assigned), all positions are theta-positions in NC languages. It has even been suggested that free constituent order may be due to the fact that the inclusion of a component of PS rules in a particular grammar is subject to parametric variation.
More specifically, one could suggest that free constituent order in languages like Latin follows from the fact that the grammars of these languages lack PS rules entirely. Although these languages exhibit certain limited restrictions on constituent order, Hale proposes that these follow from requirements imposed by interpretive rules - the languages in fact lack PS rules. It seems, however, that the configurational/nonconfigurational distinction should rather be attributed to the presence or absence of 'linear precedence' rules in particular grammars (see for example Gazdar et al., 1985 for an elaboration of the operation of LP rules in English). The role of these rules was also illustrated in the previous section. All of the theories of phrase structure cited above share one assumption: that restrictions on constituent order in configurational languages such as English are imposed by CF rewrite rules. However, a different approach may be taken: we could assume that the component of categorial rules simply does not exist in any language; all languages could then be treated as being essentially non-configurational from the perspective of the theory of phrase structure (Stowell, p. 2). In this approach, the base component is category-neutral, and certain apparent instances of category-specific properties of phrase structure are really due to the operation of a class of word-formation rules (op. cit., p. 3). But if there are no PS rules, then it is not clear how to account for the fact that constituent scrambling is normally constrained so as not to apply across clause boundaries, suggesting that even languages with entirely free word order maintain some constituent structure. In addition, certain restrictions on discontinuous NPs and the placement of the Aux receive a natural account in terms of constituent structure. A solution to these problems is offered by the theory of Japanese phrase structure developed by Farmer (1980, quoted by Stowell, op. cit.).
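The suggestion that the configurational/nonconfigurational distinction reduces to the presence or absence of LP rules can be sketched in the GPSG style: one immediate dominance (ID) rule, with ordering supplied (or not) by separate LP statements. The particular ID and LP rules below are my own toy examples, not taken from Gazdar et al.:

```python
# Sketch: ID/LP separation. The same dominance rule yields free or fixed
# order depending on whether linear precedence rules are in force.
from itertools import permutations

ID_RULE = {"VP": {"V", "NP", "PP"}}      # immediate dominance only, unordered
LP_RULES = [("V", "NP"), ("NP", "PP")]   # English-like precedence constraints

def admissible_orders(cat, use_lp):
    """All linearizations of the ID rule, filtered by LP rules if active."""
    orders = []
    for perm in permutations(sorted(ID_RULE[cat])):
        if use_lp and any(perm.index(a) > perm.index(b) for a, b in LP_RULES):
            continue
        orders.append(perm)
    return orders

print(len(admissible_orders("VP", use_lp=False)))  # 6: 'free word order'
print(len(admissible_orders("VP", use_lp=True)))   # 1: fixed V NP PP
```

Dropping the LP component leaves the dominance facts intact while freeing the order, which is exactly the behaviour wanted for a language like Polish on this view.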
He proposes that the base component of Japanese is category-neutral; in other words, the PS rules of the language may refer only to the primitives of X-bar theory, and may not make use of categorial features. First, lexical insertion is context-free, abstracting away from subcategorization requirements, since no structural position is reserved for any specific category by the base rules; this derives 'scrambling' as an automatic consequence. There is no redundancy with strict subcategorization, since it is impossible for PS rules to specify which categories may occur as complements in NP or VP. But phrases and sentences must still conform to the hierarchical structure defined by the category-neutral base, and the complements of each verb must appear within V' in order to satisfy strict subcategorization. Thus the integrity of clausal structure is maintained, and 'scrambling' across clause boundaries is ruled out. Finally, phrasal nodes have no intrinsic categorial features, and they acquire a categorial identity only after lexical insertion has placed a particular lexical category in the head position of a phrase. Hence all phrases are of necessity endocentric, and hierarchical structure is constant across categories (Stowell, op. cit.). But it is not clear whether this account, as its consequence, postulates that all phrases have a parallel structure - it seems that they do not (cf. Cann, 1986): each major phrase type has numerous idiosyncratic properties, suggesting that category-specific PS rules are required for each category, even given X-bar theory (it suffices to compare noun phrases with adjectival or adverbial phrases). It seems that certain aspects of constituent structure must be accounted for directly by rules which determine hierarchical PS configurations. One such required PS rule is one that orders the head term X of a phrase XP at the right/left boundary of a level. This holds constant across all the major categories: NP, AP, VP, PP.
However, category-neutral PS rules can define positions for the head and complements, i.e., the hierarchical structure of phrases; theta-role theory and case would then combine to account for the distribution and ordering of various types of arguments at each X-bar level; the rest could be dealt with in an extended component of word-formation rules - this is exactly the position that Stowell takes (op. cit.); a similar conclusion in many respects seems to follow from Cann (op. cit.), and this seems indeed a very attractive approach to me. Thus the properties of PS rules relating to hierarchical and linear ordering turn out to follow from the interaction of a number of distinct components of grammar. To illustrate the effects of the interaction of the hierarchical structure rules and other components of the grammar, let us have a look at a noun phrase in Polish, e.g. ta moja stara skrzynka z drewna ('this my old box of wood') (Appendix 2: fig. 2.21, p. 2-11). Instead of a tree diagram, we could construct something like a 'hierarchical' diagram or 'slot' diagram, i.e., one in which each category fills a certain slot in the structure of the phrase. Thus for our noun phrase, the diagram would look as follows:

   NP
     Nspec:  ta
     Nmod:   moja stara
     Nhead:  skrzynka
     Ncomp:  z drewna

If each constituent is assigned some value which would indicate the place where it belongs in the construction, and the semantic interpretation is compositional in the sense of Montague (e.g. 
as in Dowty, 1982 or Cann, 1986), it follows that the correct interpretation of the phrase would be made irrespective of the order of the constituents in the phrase (up to the level of ambiguity, that is); so the order might also well be like this:

   ta      stara   moja    z drewna   skrzynka
   Nspec   Nmod    Nmod    Ncomp      Nhead

This kind of interpretation of each constituent could be made possible irrespective of the position of the constituent even within a larger construction, up to the sentence level. A corresponding traditional tree-structure diagram would then have 'crossing' branches - and I do not see any reason why it should not have. There are several consequences of this phenomenon, and they do seem to be present in languages. Thus, for instance, in a non-configurational language, a verb must subcategorize for lexical case (i.e., it governs an NP in a particular case; e.g. for the verb czytac ('to read') in Polish, the subject noun must be in the Nominative and the object noun in the Accusative - see App. 2, diagrams 2.22 and 2.23). In contrast, in English, the adjacency condition would be responsible for imposing a fixed order of components, since almost no lexical case marking exists. Obviously, in a free-word-order language, it would be necessary for NPs to bear case markings when there are several NP complements. But this does not entail indiscriminate free constituent order in these languages. Thus prepositions and postpositions normally occur with just a single NP. Also the adverb bardzo ('very') in the phrase bardzo dobra skrzynka ('a very good box') would usually occur next to the adjective - possibly because adverbs have no case marking in Polish! To summarise, a non-configurational language differs from a configurational language in that morphological form (e.g. 
affixes), rather than position (i.e., configuration), indicates which words are syntactically connected to each other. In English, the grammatical, and hence semantic, relationships are indicated in surface form by the position of words, and changing these positions changes the relationships, and hence the meaning. In contrast, in Polish, the relationships between words are indicated by the affixes on the various nouns, and to change the relationships, one would have to change the affixes. It seems, then, that in these languages morphological form plays the same role that word order does in a configurational language (English, for example). This is why, I think, it is more appropriate to term Polish an inflectional language, in opposition to a configurational language like English. Obviously this is a tendency rather than an absolute, and certain elements of both types would be found in both languages (i.e., some information about grammatical relations can be obtained either through word order or morphology), but it is a tendency strong enough to make a clear distinction, especially when two given languages are at different ends along a given parameter. 4.3. Theme-Rheme Structure (Functional Sentence Perspective). I shall now discuss a few aspects of the Functional Sentence Perspective theory, developed by Prague School linguists such as Mathesius and Firbas, in order to see how it might account for certain word order phenomena in Polish. The semantic (communicative) structure of a sentence requires that the object or set of objects talked about, and the information given about it, be indicated. A sentence is then said to be composed of theme and rheme (or topic and comment); in other words, it has a theme-rheme structure, also called functional sentence perspective (Topolinska, 1984: 31). 
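Before examining theme-rheme structure proper, the inflectional/configurational contrast drawn in the previous section can be made concrete: in an inflectional language, grammatical relations can in principle be read off the case morphology of each word, regardless of position. The Python fragment below is only an illustrative sketch with a hypothetical mini-lexicon of three fully inflected forms; it is in no way a model of Polish morphology.

```python
# Illustrative sketch only: a hypothetical mini-lexicon mapping fully
# inflected Polish word forms to a crude tag (nominative noun,
# accusative noun, or verb).
LEXICON = {
    "Janek": "nom",
    "Piotra": "acc",
    "uderzyl": "verb",
}

def relations(words):
    """Recover grammatical relations from case morphology alone,
    ignoring word order entirely."""
    rels = {}
    for w in words:
        tag = LEXICON.get(w)
        if tag == "nom":
            rels["subject"] = w
        elif tag == "acc":
            rels["direct_object"] = w
        elif tag == "verb":
            rels["predicate"] = w
    return rels

# All permutations of 'Janek uderzyl Piotra' yield the same relations:
assert relations(["Janek", "uderzyl", "Piotra"]) == \
       relations(["Piotra", "uderzyl", "Janek"]) == \
       {"subject": "Janek", "direct_object": "Piotra",
        "predicate": "uderzyl"}
```

In a configurational language, the analogous function would have to consult positions rather than the lexicon, which is precisely the contrast at issue.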
There is a noticeable connection between the definition of theme and the semantico-syntactic definition of argument, and also between the definition of rheme and the definition of predicate. Namely, in elementary sentences composed of a one-argument predicate, and in derived sentences with multi-argument predicates used with one argument as a result of leaving the other argument positions unfilled, there is a one-to-one correspondence between the argument and theme, and the predicate and rheme, as in:

theme: argument / rheme: predicate
Jan / czyta. ('is reading')
Piotr / jest zonaty. ('is married')

A more complicated situation is found in elementary sentences with multi-argument predicates - any argument may be the theme. Most often, however, there is just one theme even where several arguments are present. Rarely, all arguments may jointly constitute the (complex) theme. Special constructions are used in Polish, like jezeli chodzi o ..., to ... and co do ..., to ... ('as concerns ..., ...'); more rarely the change of the order (permutation) of arguments is involved:

Jezeli chodzi o Piotra i Anne, to on sie z nia ozenil.
Co do Piotra i Anny, to on sie z nia ozenil.
('Peter and Ann got married')

The complex theme in the above sentences is represented by the expression Piotr i Anna. In sentences with multi-argument predicates which have one simple theme, any of the arguments, regardless of its form, may be chosen to serve as theme, depending on the communicative intention of the speaker. On the surface, then, theme does not always have to be represented by a noun in the nominative case, although in utterances which are not closely interrelated in a broader context the noun in the Nominative would usually fulfil the function of theme. Here are a few examples:

Piotr / ozenil sie z Anna. ('Peter got married to Ann')
Z Anna / ozenil sie Piotr.
Piotr / kocha Anne. ('Peter loves Ann')
Anne / kocha Piotr. 
(English translations for the second of each pair of sentences might be the so-called 'topicalized' construction: It was Ann that Peter married. It is Ann that Peter loves.) In all of the above sentences, theme is always represented by the first noun. Sentences with theme expressed by nouns in the nominative, occupying the initial position, are the most natural constructions, functioning independently of the context. They are regarded as unmarked, or neutral, with respect to the theme-rheme ordering of their elements; in other words, they have a neutral functional perspective. In turn, expressions with the same predicates but in which theme is represented by a non-nominative form of the noun are regarded as marked, and they usually function in a more general context. The choice of one argument as theme invariably means the shift of the rest of the arguments to the rheme position in the sentence. Rheme is then a complex structure resulting from the absorption by the predicate of some of the arguments; in other words, the predicate has the ability to order (or hierarchize) its arguments differently - predicates have the ability to create different theme-rheme structures, with unmarked (neutral) or marked orders. It follows, then, that predicate-argument constructions are not ordered in Polish; they are just differentiated according to their predicates. Thus the first argument (x) in ma(x,y) ('has(x,y)') will have a noun in the nominative, or non-nominative forms in its converses (cf. Topolinska, 1984), i.e., nalezy_do(x,y) ('belongs_to(x,y)'), for example. Here are some examples of sentence (i.e., theme-rheme) structures for the sentence 'Peter is getting married to Ann' (the original Prague School terminology has been preserved):

1. Piotr z Anna / sie zeni ('Peter' 'to Ann' / 'is getting married')

   S
     TLD
       M(od):   0
       T(im):   [pres]
       L(oc):   0
       D(ictum)
         Tm:  x = Piotr, y = Anna
         Rm:  g = sie_zeni

2a. 
Piotr / zeni sie z Anna ('Peter' / 'is getting married' 'to Ann') (the tree will be abbreviated now):

   D(ictum)
     Tm:  x = Piotr
     Rm:  g = zeni_sie, y = Anna

2b. Z Anna / zeni sie Piotr. ('to Ann' / 'Peter is getting married')

   D(ictum)
     Tm:  y = Anna
     Rm:  g = zeni_sie, x = Piotr

(slightly adapted from Topolinska, 1984: 36-37). The theme-rheme structure seems, then, to account neatly for certain phenomena of word order in Polish. It also appears that including the theme-rheme distinction can be beneficial in English too, especially within the context of the theory of communicative dynamism, also part of Prague School linguistics (also see Kay, 1975). 4.4. Grammatical Relations. Most descriptions of syntax have assumed that, perhaps in addition to semantic and pragmatic roles, there are also purely syntactic relations contracted between an NP and its predicate, which, however closely they may correlate with semantic or pragmatic relations, cannot be identified with them. These might be called grammatical relations (but the term grammatical here does not have the narrow sense of syntactic), and they can be called subject (Sub), direct object (DO) and indirect object (IO). It will be assumed, after Comrie (1981), that grammatical relations do exist, but unlike much recent work on grammatical relations (in particular, relational grammar), it will be argued that much of syntax can be understood only in relation to semantics and pragmatics (cf. also Bach 1982, for a similar view), or more specifically, that grammatical relations cannot be understood in their entirety unless they are related to semantic and pragmatic roles: '... at least many aspects of the nature of grammatical relations can be understood in terms of the interaction of semantic and pragmatic roles: for instance, many facets of subjecthood can be understood by regarding the prototype of subject as the intersection of agent and topic...' (Comrie, 1981: 60). 
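Comrie's remark can be caricatured in a few lines of Python; the representation of NPs and roles below is entirely my own invention, intended only to show the 'intersection of agent and topic' idea, not to model any actual theory of grammatical relations.

```python
def prototypical_subject(nps):
    """Naive rendering of Comrie's suggestion: the prototypical
    subject is the NP that is simultaneously agent and topic.
    Each NP is a dict with a surface 'form' and a set of 'roles'."""
    for np in nps:
        if {"agent", "topic"} <= np["roles"]:
            return np["form"]
    return None

# Hypothetical analysis of 'Janek uderzyl Piotra':
nps = [
    {"form": "Janek", "roles": {"agent", "topic"}},
    {"form": "Piotra", "roles": {"patient"}},
]
assert prototypical_subject(nps) == "Janek"
```

Real languages, of course, routinely dissociate the two roles (topical patients, focal agents), which is exactly why grammatical relations cannot simply be reduced to either.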
In much recent work on grammatical relations, it is taken for granted that certain grammatical roles exist as given by the general theory, and that the linguist looking at an individual language has to work out which NPs in this language evince these particular roles - of course oversimplifications are inevitable if certain pitfalls (e.g. problems with indirect objects in English) are to be avoided (Comrie, op. cit.). But there is a discrepancy between syntax and morphology, i.e., the morphology is arbitrary relative to the syntax (or vice versa), e.g. in the use of a certain case to exhibit a grammatical relation. Rather, the interaction of several parameters - semantic roles, pragmatic roles, grammatical relations and morphological cases - should be considered. If Polish and English are compared, it turns out that they may be regarded as being radically different types along a given parameter. Thus in English, there is a high correlation between grammatical relations and word order; indeed, word order is the basic carrier of grammatical relations, especially of subject and direct object, as can be seen in John loves Mary. Mary loves John. The position immediately before the verb is reserved for the subject, while the position immediately after the verb is reserved for the direct object, even in corresponding questions; changing the word order therefore changes the grammatical relations, and ultimately the meaning of the sentence. Also, the relations and valencies of verbs can be considered. Thus English has many verbs that can be used either transitively or intransitively. When used transitively, the subject will be an agent; when used intransitively, the verb will have a patient as its subject, as in John opened the door. The door opened. Next, in English, morphological marking of NPs plays a marginal role. To be sure, most pronouns have a nominative vs. accusative distinction, as in I saw him. He saw me. 
However, the existence of this case distinction does not provide for any greater freedom of word order: sentences like *Him saw I. *Me saw he. are simply ungrammatical in the modern language. Moreover, except in very straightforward examples like the above, the correlation between case and grammatical relation is rather weak; for instance, many speakers of English have the pattern illustrated below:

a) John and I saw Mary.
b) Me and John saw Mary.

Here, the difference between I and me is in turn conditioned by register ((b) is considered more colloquial than (a)), although there is no difference in grammatical relations (op. cit.: 70). Generally, in English, pragmatic roles play a very small role in the syntactic structure of sentences; hence the choice between alternative syntactic means of encoding the same semantic structure is often determined by pragmatic considerations, one of the principles being a preference to make the topic subject wherever possible, thus leading to a correlation between subject and topic. In contrast, in Polish, the basic marker of grammatical relations is not word order, but rather the morphology; thus the noun in the nominative is the subject and the noun in the accusative is the direct object in a majority of cases. Changing the word order does not affect the distribution of grammatical relations or semantic roles, and any of the logically possible permutations of major categories might be accepted - up to the level of ambiguity. But the permutations are by no means equivalent; in particular, they differ in terms of the pragmatic roles expressed. The basic principle in Polish, especially in non-affective use, is that the topic comes at the beginning of the sentence, and the focus at the end. Let us then look at some specific examples taken from Polish (I hope that the 'violence' of the examples will be excused):

(1) - Kto uderzyl Piotra? - Piotra uderzyl Janek.
    ('Who hit Peter?' - 'Jan hit Peter')
(2) - Kogo Janek uderzyl? 
- Janek uderzyl Piotra.
    ('Who did Jan hit?' - 'Jan hit Peter')
(3) - Pawel uderzyl Roberta. - A Janek? - Janek uderzyl Piotra.
    ('Pawel hit Robert' - 'What about Jan?' - 'Jan hit Peter')
(4) - Pawel uderzyl Roberta. - A Piotra? - Piotra uderzyl Janek.
    ('Pawel hit Robert' - 'What about Peter?' - 'Jan hit Peter')

Note that in examples (3) and (4), the distinction between the nominative (Janek) and the accusative (Piotra) is crucial to understanding whether the question is about the bully or the victim; this is not brought out in the English translations, which would, to carry the same amount of information, have to be more explicit, e.g. (3) and who did Jan hit?, (4) and who hit Peter? It seems, then, that Polish and English differ in that in English word order is determined by grammatical relations and is independent of pragmatic roles; in Polish, morphology is determined by and carries grammatical relations, while word order is determined by pragmatic roles (Comrie, op. cit.: 72; also cf. op. cit. for an account of differences between Russian and English in terms of the interaction of semantic roles with grammatical relations). Moreover, Polish has no syntactic equivalent of object-to-subject raising; thus a translation of It is easy to solve this problem. might be Latwo (jest) rozwiazac ten problem. Given the free word order of Polish, it is also possible to move the NP to the beginning of the sentence: Ten problem jest latwo rozwiazac. ('This problem is easy to solve') We observe, then, syntactically different means of encoding the same set of semantic roles. Thus the equivalent of the English passive would be, in Polish, the active with the word order DO-V-Sub, as in Piotra uderzyl Janek. ('Jan hit Peter / it was Peter who Jan hit') Piotr byl/zostal uderzony przez Janka. ('Peter was/has been hit by Jan') but in fact a subjectless construction with the verb in the 'impersonal form' is more usual: Piotra uderzono. 
('Peter has been hit') It seems, then, that the basic function of the Polish passive is not so much pragmatic as stylistic (as in Russian; cf. Comrie, op. cit.: 76). To summarise, in English, grammatical relations play a much greater role than in Polish. Firstly, the grammatical roles in English are more independent than in Polish, with a low correlation in English between grammatical roles and either semantic or pragmatic roles (or morphology, which is virtually non-existent). Secondly, there is a wider range of syntactic processes in English than in Polish where grammatical roles and changes in grammatical relations are relevant. In Polish, semantic and pragmatic roles (and even morphology) play a greater role than in English. There is, obviously, a difference of degree. Some grammatical relations may be crucial in Polish, for instance verb agreement with the subject. There is no way in which agreement can be reformulated as agreement with agent or topic:

(1) Anna uderzyla [fem] Pawla. ('Anna hit Paul')
(2) Pawel zostal uderzony [masc] przez Anne. ('Paul was hit by Anna')
(3) Pawla uderzyla [fem] Anna. ('Paul, Anna hit')

Also, only direct objects can become the subject of the passive construction. However, some NPs do not make a nominative-accusative distinction, or the morphology is ambivalent and both interpretations make sense - then preference is given to a SUB-V-DO interpretation, although this is a preference rather than an absolute. It seems, then, that the interaction of semantic, syntactic and morphological relations can characterize a part of the syntactic differences between languages with a different mode of organization better than word order does on its own, especially when the languages are similar in terms of their basic word order. This has obvious consequences for the building of parsers for the respective languages if their organization is to be adequately accounted for. 5. Processing Free-Word-Order Languages. 
There have been some attempts to account for freer word order, both in practical implementations, i.e., actual parsers, and in strictly theoretical terms. I shall examine briefly WEDNESDAY, a system which is lexically based; Johnson's unconstrained syntactic parser; and LFG, which, through its order-free composition, claims to have the potential to account for free word order. 5.1. WEDNESDAY. WEDNESDAY (Castelfranchi et al, 1982) is a lexically based system designed to cope with the freer word order of Italian. In the system, syntax is not a separate component, but is distributed throughout the lexicon. According to its authors, the parser can deal with, inter alia, flexible idioms that can vary in morphology, word order, a variety of syntactic constructions, semantic additions, and synonyms, whose recognition is governed by the individual lexical entries and takes place at the assembling level. Unfortunately, I have not been able to obtain more details about the system. 5.2. Johnson's Parser. Most attempts to describe this word order freedom take mechanisms designed to handle fairly rigid word order systems and modify them in order to account for the greater freedom of NC languages. Johnson (1985) produces an unconstrained system instead. In order to describe discontinuous constituents in non-configurational languages, he generalizes the notion of the location of a constituent to allow discontinuous locations; these discontinuous constituents can be described by a variant of DCGs. In this approach, discontinuous constituents are represented directly, in terms of a syntactic category and a discontinuous location in the utterance; e.g., a given constituent could then have the following location: {[0,1],[3,4]} - a manner reminiscent of vertices in a chart. 'Head-driven' aspects of Head Grammars (e.g. 
Pollard's), whose conception is that heads contain as lexical information a list of the items they subcategorize for, are adopted, but he claims that the use of this method is not limited to DCGs but can be incorporated in GPSG, LFG or GB. If this approach turns out to be practical, it has a very interesting consequence in that it would offer a unified account of parsing both configurational and non-configurational languages. Its major shortcoming, however, is the fact that it is too powerful and unconstrained. No language exhibits total scrambling, and therefore, in order for the parser to be interesting linguistically, it would have to have constraining mechanisms built on top. But it is generally known from the experience with ATNs that constraining a Turing machine is very difficult, if not altogether impossible. It would be more desirable if, instead, the parser's constraints were the result of the principles of its internal organization. One way of ensuring this could be to provide a sufficiently constrained grammar. 5.3. Order-Free Composition in LFG. In LFG, the nature of the mapping between c-structure and f-structure enables it to achieve many of the effects of discontinuous constituents, even though the PS component (the c-structure) does not allow discontinuous constituents as such. In particular, the information represented in one component of the f-structure may come from several constituents located throughout the sentence. So LFG is capable of describing the 'discontinuity' without using discontinuous constituents. There is, however, a subtle difference between the amount of 'discontinuity' allowed by LFG and the extent to which 'discontinuous' constituents actually occur in non-configurational languages. 
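Johnson's generalized locations, mentioned above, can be modelled as sets of intervals over string positions, much as vertices are used in a chart. The sketch below is my own rendering of the idea, not Johnson's implementation: two discontinuous locations combine only if their intervals do not overlap.

```python
def combine(loc1, loc2):
    """Combine two discontinuous locations, each a set of
    (start, end) intervals over inter-word string positions.
    Returns the merged location, or None if the two overlap."""
    for (a, b) in loc1:
        for (c, d) in loc2:
            if a < d and c < b:  # the intervals overlap
                return None
    return sorted(loc1 | loc2)

# A discontinuous constituent spanning words 0-1 and 3-4,
# i.e. the location {[0,1],[3,4]} of the text, combined with
# a head occupying the gap:
np_loc = {(0, 1), (3, 4)}
v_loc = {(1, 3)}
assert combine(np_loc, v_loc) == [(0, 1), (1, 3), (3, 4)]
```

Nothing in this representation itself forbids total scrambling, which is precisely the over-generation problem noted above; any constraints have to be imposed from outside.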
Invariably, problems with many discontinuous constituents would occur since, in LFG, the position of any item in the sentence's f-structure is determined solely by the f-equation annotations attached to it, and no new components in the f-structure can be created (Johnson, 1985: 15). 6. CONCLUDING REMARKS. Certain obvious conclusions as to the way sophisticated, powerful parsers should be built can be drawn from what has been said so far. As concerns the grammar, organising the lexicon in a certain way (e.g., along the lines suggested in previous sections) has the advantage of making PS rules to a large extent 'category-neutral' in the sense of Farmer and Stowell (op. cit.), and, in effect, of allowing an efficient implementation of this type of rule for free-word-order languages. Moreover, both free- and fixed-word-order languages could be treated in a unified way, with the rules of linear precedence operating more strictly in languages like English. This also seems to be a solution for parsing partly ungrammatical or elliptical sentences. Obviously, in this approach, the descriptive burden on the lexicon is considerable. So is the role of semantics: a strong model-theoretic semantic component in the Montague tradition is envisaged. Another consideration in building a sophisticated parser seems very important, as we have seen in the theoretical discussion of linguistic description, and it is, in the words of Lyons, that '...utterances are produced in particular contexts and cannot be understood (even within the limits imposed [...] on the interpretation of the term 'understanding' [...]) without a knowledge of the relevant contextual features [...]'; '...we must not lose sight of the relationship between utterances and particular contexts...' (1968: 419). This is largely reflected in the way humans process language: inferential processes are brought into play as soon as we start to read any phrase or text. 
There are some very powerful tools already available for use in the computational approach: ATN-based systems, Marcus-type parsers and charts. In particular, the chart, both in its role as an indexing scheme for constituents and as an active parsing agent, can be seen as a very desirable tool, its main virtues being, besides its 'bookkeeping' role, the fact that it allows independent pending hypotheses to be treated independently, and also its enormous flexibility and perspicuousness. Chart parsing is often seen as a rival approach to Marcus-parsing. Marcus parsing is an appealing method, since the parser can look for local clues which enable it to select properly what to do next. This idea seems advantageous for free word order languages, since rich inflection makes the local clues more explicit and the parser's expectations more precise. But there are problems with writing grammar rules for a deterministic parser; on the one hand, there is a heavy responsibility on the grammar-writer to find a way of writing his grammar so that the parser will be efficient and linguistically interesting (Ritchie, 1983: 79); on the other hand, it does not predict certain properties universal in languages (Briscoe, 1983: 67). From the point of view of the actual implementation of natural language processing systems, applications have developed in at least two distinct directions in the past few years. On one hand, there has been a marked trend towards the construction of large, efficient parsers for use as front ends to databases or intelligent systems. On the other hand, as a result of a re-evaluation of the fundamental notions of parsing and syntactic structure, viewed from the perspective of programs that understand natural language, systems are being designed which attempt to capture in their design the same kinds of generalizations that linguists and psycholinguists posit as theories of language structure and language use. 
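The 'bookkeeping' role of the chart mentioned above can be illustrated minimally; the toy sketch below is my own representation, not that of any particular system. Complete edges record a category over a span and are indexed by their start vertex, so that competing hypotheses over the same span simply coexist and remain independently available.

```python
from collections import defaultdict

class Chart:
    """Minimal passive chart: complete edges indexed by start vertex."""

    def __init__(self):
        # start vertex -> list of (category, end vertex)
        self.edges = defaultdict(list)

    def add(self, start, end, category):
        """Record an edge, ignoring exact duplicates."""
        if (category, end) not in self.edges[start]:
            self.edges[start].append((category, end))

    def at(self, start):
        """All complete edges beginning at this vertex."""
        return self.edges[start]

chart = Chart()
chart.add(0, 1, "Det")
chart.add(1, 2, "N")
chart.add(0, 2, "NP")  # a competing hypothesis over 0-2 coexists
assert ("NP", 2) in chart.at(0)
```

An active chart parser would add incomplete (active) edges and a fundamental rule for combining them, but even this passive skeleton shows why no hypothesis need ever be recomputed or discarded prematurely.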
Thus the Marcus parser, for instance, can be seen as an embodiment of an approach, however fragile as yet, in which attention is directed towards the interaction between the structural facts about syntax and the control structures for implementing the parsing process. Also, the current trend is away from simple methods of applying grammars (as with PS grammars), toward more integrated approaches. This could be spurred on by the fact that parsers are being called upon to handle more 'natural' text, including discourse, conversation, and sentence fragments. These involve aspects of language that cannot easily be described in the conventional, grammar-based models.