{ MODULE 9 Syntax Analysis and Syntax Error Recovery 9.1 Syntax analysis. Syntax analysis is implemented as a set of recursive descent pro- cedures based on syntax rules given in the ISO definition. The order, names, and nesting of the procedures is as follows: Programme ProgramHeading Block LabelDeclarationPart UnsignedConstant ConstExpression SimpleConstExpression ConstTerm ConstFactor PragmaList ConstDefinitionPart NewSubType TypeDenoter OrdinalType EnumeratedType SubrangeType StructuredType ArrayType RecordType FieldList SetType FileType PointerType TypeDefinitionPart DeclarationPart FixVarId DesignateVars Initialize ScalarInitialisation PointerIntialisation SetInitialisation RecordInitialisation ArrayInitialisation PfDeclaration PfHeading FormalParameterList Selector Variable Call Expression SimpleExperssion Term Factor BooleanExpression Statement Assignment GotoStatement CompoundStatement IfStatement CaseStatement WhileStatement RepeatStatement ForStatement WithStatement StatementSequence StatementPart The procedure Selector implements the analysis of the syntax category variable (as does the procedure Variable which follows it) but assumes that the initial variable identifier has been checked and accepted before Selector is called. The procedure Call is introduced to implement the analysis of the construct [ actual-parameter-list ] as it appears in function- designator and procedure-statement. It does so both for program- defined procedures and functions, and for the irregular parameter lists allowed for standard procedures and functions. The syntax analysers are written on the assumption that the next syntactic goal can always be selected by inspection of (at most) the next incoming symbol ( i.e. that the underlying grammar is LL(1) ). This is not so at the following points in the syntax of Pascal:- 1. A statement beginning with an identifier may be either an assignment or a procedure call. 2. A factor beginning with an identifier may be either a vari- able, a constant or a function call. 3. A type beginning with an identifier may be either a named type already declared or a subrange whose lower bound is a constant identifier. In all cases to resolve the choice on a purely syntactic basis would require a look ahead of a further symbol. However if paral- lel semantic analysis is used these choices can be resolved without further lookahead, by inspection of the current semantic attributes of the identifier involved. For this reason syntactic resolution of these choices is not attempted. A similar problem arises with the actual parameter lists of the standard procedures and functions of Pascal which are analysed by special purpose syntax routines. These are again selected by a semantic rather than syntactic test. The syntax of Pascal is also not LL(1) in variant records, where the syntax is effectively: 'case' [ identifier ':' ] identifier 'of' .... In this case semantic resolution is not possible, and an ad-hoc lookahead technique is used. 9.2 Syntactic error recovery. Recovery in the syntax analysis process following the discovery of a syntax error is incorporated into the syntax procedures on the following basis: 1. Each procedure when called is passed an actual parameter which is a set of symbols forming the (right) context of the string which it should scan. This context normally includes all symbols which may legitimately follow the string to be scanned, and such additional symbols as a superior (calling) procedure may wish to handle in the event of error recovery. 2. When entered the procedure may ensure that the current symbol is an acceptable starter for the string to be scanned, and if not scan forward until such a symbol is found (subject to 4. below). 3. When calling a subsidiary syntax procedure, the procedure passes on as context its own context plus those symbols if any which it may determine as right context for the substring to be scanned. 4. To recover from a syntax error the procedure may scan over (skip) any symbol provided it is not contained in the context passed to it. 5. On exit, the syntax procedure ensures that the current symbol is contained in the context passed to it, flagging a terminal error and skipping if this is not initially the case. In practice this general recovery strategy may be omitted from those syntax procedures which are only used as an alternative or partial subpath of a superior procedure which inevitably enforces recovery. 9.3 Syntax utilities. The following procedures enable the analysis and recovery stra- tegies outlined above to be expressed neatly in the the analyser itself. They also improve the quality of syntax error reporting by suppressing vague "unexpected symbol" messages whenever a more precise subsequent diagnosis is made available, using the global variables Recovering, SymbolsSkipped, and ErrorReported to do so. Most syntax errors are reported using a code derived from the expected symbol, by the formula ord(expected symbol) + 10. Thus the codes used are interpreted as follows: 10. An identifer symbol was expected. 11. An integer-constant was expected. . . . 69. Other symbols were expected. In practice only a subset of these can be generated by the Pascal syntax analyser as some symbols are never mandatory. The syntax analyser makes use of several 'constant' sets of sym- bols during analysis. The variables used to hold these sets are initialised by the procedure InitSyntax, which must be called before syntax analysis is invoked. } program SyntaxAnalyser; #include "globals.x" #include "source.pf" #include "lexical.pf" function Missing(Symbol: SymbolType): integer; visible; begin Missing := ord(Symbol) + 10 end; procedure AcceptSymbol; visible; begin if Recovering then begin if not ErrorReported then Error(Missing(OtherSy) + ord(SymbolsSkipped), StartOfSymbol); Recovering := false end; NextSymbol end { acceptsymbol }; procedure Accept(SymbolExpected: SymbolType); visible; begin if Symbol = SymbolExpected then AcceptSymbol else begin Error(Missing(SymbolExpected), StartOfSymbol); ErrorReported := true end end { accept }; procedure AcceptEquals; visible; begin if (Symbol = RelOp) and (Operator = EqOp) then AcceptSymbol else Error(71, StartOfSymbol) end { acceptequals }; procedure RecoverAt(RelevantSymbols: SetOfSymbols); begin if not Recovering then begin StartError(StartOfSymbol); Recovering := true; ErrorReported := false; SymbolsSkipped := not (Symbol in RelevantSymbols) end; while not (Symbol in RelevantSymbols) do NextSymbol end { recoverat }; procedure CheckContext(ContextExpected: SetOfSymbols); visible; begin if not (Symbol in ContextExpected) then RecoverAt(ContextExpected) end { checkcontext }; procedure CheckNextOrContext(SymbolsExpected, DefaultContext: SetOfSymbols); visible; begin if not (Symbol in SymbolsExpected) then RecoverAt(SymbolsExpected + DefaultContext) end { checknextorcontext }; procedure InitSyntax; visible; begin BlockBegSys := [LabelSy, ConstSy, TypeSy, VarSy, ProcSy, FuncSy, BeginSy]; if ICLPascal then BlockBegSys := BlockBegSys + [PresetSy, ReadOnlySy]; ConstBegSys := [AddOp, IntConst, RealConst, CharConst, StringConst, Ident]; if ICLPascal then ConstBegSys := ConstBegSys + [BasedConst]; SimpTypeBegSys := ConstBegSys + [LeftParent]; TypeDels := [ArraySy, RecordSy, SetSy, FileSy]; TypeBegSys := SimpTypeBegSys + TypeDels + [Arrow, PackedSy]; StatBegSys := [BeginSy, GoToSy, IfSy, CaseSy, WhileSy, RepeatSy, ForSy, WithSy]; FacBegSys := [IntConst, RealConst, CharConst, StringConst, Ident, NilSy, LeftParent, LeftBracket, NotSy]; if ICLPascal then FacBegSys := FacBegSys + [BasedConst]; ConstFacBegSys := FacBegSys - [NilSy, LeftBracket]; SelectSymbols := [Arrow, Period, LeftBracket]; if InHousePascal then SelectSymbols := SelectSymbols + [Shriek]; ParamBegSys := [ProcSy, FuncSy, VarSy, Ident]; if ICLPascal then ParamBegSys := ParamBegSys + [ReadOnlySy]; case OptionValue[Level] of Level0 : ConformantParamBegSys := []; Level1, Level2, Level3 : ConformantParamBegSys := [PackedSy, ArraySy] end; IndexSpecBegSys := [Ident]; if ICLPascal then IndexSpecBegSys := IndexSpecBegSys + ConstBegSys; Recovering := false end { initsyntax }; begin { end of module } end.