%   File:    ECMI01.AI2_SENTENCE
%   Author:  Peter Ross and others
%   Purpose: a simple-minded way of reading in a sentence
%   Needs:   UTIL for memberchk/2.

% Top-level goal is sentence/1. When it has succeeded the variable
% will be instantiated to a list of atoms representing the words
% typed in - e.g. if you type "who are you?" then sentence/1 will
% instantiate its argument to the list [who, are, you, ?].
%
% Procedures which read things in character by character are slightly
% more tricky than many others to design, because it is important not
% to backtrack over a get0/1 goal. If that happens, the character it had
% previously read will be lost.
%
% In the program below, words/2 has an ASCII code as first argument. The
% second argument will be instantiated to the list of atoms corresponding
% to the typed sentence, starting with the given character. The first two
% words/2 clauses therefore define what marks the end of the sentence.

sentence(F) :-
    get0(C),
    words(C,F), !.

words(10,[]).                       % 10 = line feed (LF)
words(13,[]).                       % 13 = carriage return (CR)
words(C, [SingleCharWord|Ws]) :-    % This clause deals with characters
    single_char_word_list(List),    % which are treated as complete words.
    memberchk(C, List),
    name(SingleCharWord, [C]),
    get0(C1),
    words(C1,Ws).
words(C,[W|Ws]) :-      % In this clause, the characters making up a single
    letter(C),          % word are gathered up in the list L, and ...
    single_word(C,C1,L),
    name(W,L),          % ... name/2 turns that list into an atom or integer.
    words(C1,Ws).       % words/2 is used to gather the rest of the sentence.
words(_,W) :-           % This clause is the dustbin. If it fires, the first
    get0(C),            % argument represents a character that is not the end
    words(C,W).         % of the sentence and is not part of a word: it is lost.

% single_word/3: the first argument is the first character, and the second
% argument eventually gets instantiated to the character that followed the word.
% The third argument gets instantiated to the list of characters in the word.
single_word(C,C1,[C|Cs]) :-
    get0(C2),
    (   letter(C2), single_word(C2,C1,Cs)
    ;   C1=C2, Cs=[]
    ).

% If letter(C) succeeds then the character (whose ASCII code is the argument)
% is part of a word, not a word by itself and not the last character of a
% sentence. If letter(C) fails, then that character is one that comes
% between words (e.g. space, tab) and can be forgotten, or one that is a
% word by itself (e.g. ? or ,), or one that marks the end of a sentence.

letter(32) :- !, fail.              % 32 = space
letter(9)  :- !, fail.              % 9 = tab
letter(C) :-
    single_char_word_list(List),
    memberchk(C, List), !, fail.
letter(10) :- !, fail.              % 10 = line feed (LF)
letter(13) :- !, fail.              % 13 = carriage return (CR)
letter(_).

single_char_word_list("!,.?").
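
% Usage sketch (an illustration only, not part of the original program).
% demo/0 is a hypothetical driver: it prints a prompt, reads one line with
% sentence/1 and writes out the resulting list of words. It assumes a Prolog
% in which double-quoted text such as "!,.?" is a list of character codes,
% as in DEC-10 Prolog; systems that treat "..." as a separate string type
% would need single_char_word_list/1 rewritten as a list of codes. Some
% systems may also need the prompt flushed before get0/1 is called.

demo :-
    write('Type a sentence: '),
    sentence(Words),                % reads up to the end of the line
    write(Words), nl.

% Tracing the clauses above by hand, typing
%     who are you?
% should give Words = [who, are, you, ?]. A word made only of digits
% comes back as an integer, because name/2 converts it: typing
%     I am 42
% should give a list whose last element is the integer 42, not an atom.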