/* Read in a sentence.

   Clearly, when reading in text character by character (using get0),
   it is essential NEVER to backtrack over a get0 goal, otherwise you'd
   have lost the character it read in. Thus, when you get a character,
   you must make sure it gets used. If it can't get used in one context
   (e.g. when constructing a word, and you reach a space) it must be
   passed back to predicates that can use it.

   Thus, for example, readword has three arguments below:
     - the first is a character, potentially the first of a word but
       perhaps a bit of interword filling (space or tab);
     - the second is uninstantiated at first but will be a word
       eventually;
     - the third is uninstantiated but will finally be instantiated to
       the character that followed the word - i.e. the character that
       told readword it had reached the end of a word.

   Similarly, restsent (which is "higher-level" than readword) has
   three arguments:
     - the first is the word just read;
     - the second is the character following that word (which has to
       be handed forward for the reasons explained above);
     - the third is a variable that will be instantiated by restsent
       to a list consisting of the rest of the sentence.

   The first argument is necessary in order that restsent can tell
   whether there IS any more to be read: it might be that the word last
   read was the last. This is decided by lastword - the current
   candidates for being the last in a sentence are the dot, the
   exclamation mark and the question mark (all valid atoms, by the way,
   though this is not a vital characteristic). */

read_in([W|Ws]) :-
    get0(C),
    readword(C,W,C1),
    restsent(W,C1,Ws).

/* Given a word and the character after it, read in the rest of the
   sentence */

restsent(W,_,[]) :-
    lastword(W), !.
restsent(W,C,[W1|Ws]) :-
    readword(C,W1,C1),
    restsent(W1,C1,Ws).

/* Read in a single word, given an initial character (C), and
   remembering what character came after the word */

readword(C,W,C1) :-
    single_char(C), !,
    name(W,[C]),
    get0(C1).
readword(C,W,C2) :-
    in_word(C,NewC), !,
    get0(C1),
    restword(C1,Cs,C2),
    name(W,[NewC|Cs]).
readword(C,W,C2) :-
    get0(C1),
    readword(C1,W,C2).

/* When in a word, get the rest of it */

restword(C,[NewC|Cs],C2) :-
    in_word(C,NewC), !,
    get0(C1),
    restword(C1,Cs,C2).
restword(C,[],C).

/* These are single character words */

single_char(44).    /* , */
single_char(59).    /* ; */
single_char(58).    /* : */
single_char(63).    /* ? */
single_char(33).    /* ! */
single_char(46).    /* . */

/* These characters can be inside a word. The second clause lowers
   upper case letters */

in_word(C,C) :- C>96, C<123.             /* a-z */
in_word(C,L) :- C>64, C<91, L is C+32.   /* A-Z, mapped to a-z */
in_word(C,C) :- C>47, C<58.              /* digits */
in_word(39,39).                          /* apostrophe */
in_word(45,45).                          /* hyphen */

/* These terminate a sentence */

lastword('.').
lastword('!').
lastword('?').
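
/* Example use (a sketch, not part of the original program).

   Typing the line

       The cat sat.

   in answer to the query  ?- read_in(S).  should instantiate S to
   [the,cat,sat,'.']: upper case letters are lowered by in_word, the
   spaces are skipped by the last readword clause, and the dot is a
   single_char word that lastword accepts as the end of the sentence.

   The helper below is hypothetical (the name read_in_from_string does
   not appear in the original). It assumes SWI-Prolog, where
   open_string/2 makes a string readable as a stream and get0/1 reads
   from the current input. The text must end in '.', '!' or '?',
   because read_in only stops after reading one of those words.

       ?- read_in_from_string("The cat sat.", S).                     */

read_in_from_string(String, Words) :-
    open_string(String, In),
    current_input(Old),
    setup_call_cleanup(
        set_input(In),                  /* make get0 read the string  */
        once(read_in(Words)),           /* never retry a get0 goal    */
        ( set_input(Old), close(In) )   /* restore the previous input */
    ).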