After a false start on Friday - I tried to compile imptoc with all checks on, and wasted hours going through the source initialising to 0 every variable that raised an unassigned-variable error - I finally realised I should compile it with checks off, and it leapt into life. (At which point I deleted everything I'd done and started from a clean copy again.)

It is somewhat temperamental and crashes on various valid inputs, in such a way that you don't get any diags. (GDB isn't helpful in these instances either.) I suspect it's more a limitation of the Intel compiler as currently compiled, and the lack of full signal trapping. I'm hoping that when imp80 is fully bootstrapped, the remaining problems with imptoc will go away. (Or it may just be due to "Set Sigs" being missing from imptoc.)

I have found and fixed several actual bugs, cleaned up a few infelicities, and started work on integrating the IMP I/O more cleanly into C (using FILE * I/O but keeping Imp stream numbers).

I only had to do one thing which was not "clean". The Imp80 compiler does not correctly handle line numbers with respect to include files. When it reports an error, the line number reported is the cumulative line number formed by adding the number of lines in the include files to the real line number. This made tracking down error reports almost impossible, so I have physically inserted the include files in the sources at the points where the %include occurred.

There was one big compiler limitation: even with MAXDICT and MAXWORK turned on (the only flags I could find to increase compiler capacity), the Imp80 compiler failed to compile some huge single statements of the form

      %if <...> %then ... %else %if <...> %then ... %else %if <...> %then ... %else ...

Since these were already broken over dozens of lines with "%c", it was relatively easy to rewrite them as "%finish %else %if <...> %start", which then allowed them to be compiled.

Now, here are the parts I changed:

* SKIP SC and SKIP COND etc had some logic to use the REVERSED flag, but it was never implemented in the imptoc version. Presumably in the actual compiler it just reversed the sense of the jumps, but in imptoc there are no jumps, so you need to change "&&" <-> "||" and invert the sense of all the comparisons (eg "<" <-> ">="). This allows for much more human-readable output for %unless and %repeat %until. For example:

      %cycle
        k = byte(p1); print symbol(k); p1 = p1+1
      %repeat %until k = nl %and p1-p2 >= 0

  now becomes

      do {
        k = byte[p1];
        fprintf (out_file, "%c", k);
        p1++;
      } while (k != nl || (p1 - p2) < 0);

  (neat work on the auto-increment and x += y optimisations by the way!)

* I changed "do { ... } while (TRUE);" to the more idiomatic C "for (;;) { ... }", which most C programmers should find more intuitive on first reading.

* I implemented the missing "readsymbol"/"readch" and am working on filling out more gaps in the perms. I also made selectinput and selectoutput work correctly, with suitable defaults, and have added openinput and openoutput as externals.

* I changed the target of pass1 from mips to pentium. This avoided some errors where it said the target machine was incapable of some conversions. (Ideally there should be a new target, "C".)

* I removed some of the comment padding because I'm leaving indentation and comment padding to the Gnu Indent program. Pre-padding in imptoc caused most commented lines to extend past col 78, because Gnu Indent never removes spaces in comments; it only adds extra ones.

* I extended the array of reserved keywords to include some procedures which will always be present in the C code because we need things from the standard C library to be linked in. So far I've only added "exit", but I may be more systematic later and add everything in the standard include files. Note: relying on a case difference is not portable on systems where external linkage supports only one case.
  I should rewrite that function to use "_" chars, ie exit -> Exit may not work, so use exit -> _exit. (Actually _exit is unfortunately implemented in some C's, so something different may be needed, but you get the general idea - maybe i_exit() or exit_()?)

* As part of a process to move towards using standard C constructs where possible rather than the imp_* procedures, I generate exit(0) directly instead of imp_stop, since all imp_stop does is call exit...

* %on %event handling needs to be changed, even if only in a trivial way. If you have this in your source:

      %on %event 1,2,9 %start
        ->eof
      %finish

  the translated version retains only the

      goto eof;

  This usually means that the entire body of any program is skipped over! My first thought was to rewrite it like so:

      #ifdef NEVER
      /* onevent 1,2,9 */
      goto eof;
      #endif

  but I realised that this is less useful than:

      if (imp_onevent((1<<8) | (1<<1) | (1<<0))) {
        goto eof;
      }

  because you can write it at first with a null implementation that always returns FALSE, but later perhaps replace it with something sensible involving say setjmp or C++ exceptions. I don't understand the offsets in the phrase structure well enough to implement the block above myself, so I've left it for later. I couldn't work out where the %finish was being handled - it wasn't in the obvious place.

* Currently ->sw(i) jumps to a dispatch table. Good idea, but the table catches missing switch labels with the default: statement and reports the error using __LINE__ and __FILE__. This is a good idea too, but not as helpful as it could be, because those always refer to the same place in the dispatch table. I have modified the generation of the jump to the dispatch table so that it passes the values of __LINE__ and __FILE__ that were present at the jump statement to the dispatch table via variables. This way you can find out where in the source a jump to sw(missing) actually occurred.
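
The dispatch-table change described above can be sketched roughly as follows. This is only an illustration of the idea, not the code imptoc actually emits: the names GOTO_SW, sw_jump_file, sw_jump_line and run_switch are made up for the example, and the missing-label case writes a message into a buffer instead of calling BADSWITCH so that the effect is easy to see.

```c
#include <stdio.h>

/* Hypothetical names throughout; imptoc's real identifiers may differ. */
static const char *sw_jump_file = "";
static int sw_jump_line = 0;

/* Remember where the jump was written, then enter the dispatch table.
   __LINE__ expands at the point where the macro is used, not here. */
#define GOTO_SW(index)               \
    do {                             \
        sw_jump_file = __FILE__;     \
        sw_jump_line = __LINE__;     \
        sw_index = (index);          \
        goto sw_dispatch;            \
    } while (0)

/* Returns the value set by the chosen label, or -1 if the label is
   missing, in which case err describes the jump that caused it. */
static int run_switch(int i, char *err, size_t errlen)
{
    int sw_index;
    int result;

    GOTO_SW(i);                      /* stands for  ->sw(i)  in the Imp source */

sw_1:
    result = 10;
    goto done;
sw_2:
    result = 20;
    goto done;

sw_dispatch:
    switch (sw_index) {
    case 1: goto sw_1;
    case 2: goto sw_2;
    default:
        /* Report the position of the jump, not of this table. */
        snprintf(err, errlen, "sw(%d) not set, jump was at %s:%d",
                 sw_index, sw_jump_file, sw_jump_line);
        return -1;
    }

done:
    return result;
}
```

Calling run_switch(7, buf, sizeof buf) leaves a message in buf that names the line holding the GOTO_SW, which is the information the default: branch could not supply on its own.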

* I couldn't find the definition of BADSWITCH, so I made it a macro that calls assert(), in the absence of anything similar to invoking Imp diags.

* I made monitor into a call on assert(FALSE). It would be neat if we could intercept this a level further up, i.e. "%if <cond> %then %monitor" would become "assert(!<cond>)", which with the new condition-reversal code would read very nicely too, eg

      %if (pp > fend) %then %monitor

  becomes

      assert(pp <= fend);

  (Don't do the optimisation for any other form, such as "%if <cond> %then do something %and %monitor".)

* I fixed a small infelicity where the code generates "{;"

* I fixed a nasty bug in the output of character constants - a quote was output as ''' and a backslash as '\' - these should of course have been '\'' and '\\'. I believe a similar bug lurks in double-quoted strings, but I haven't worked on that yet. Percents were not treated properly in printf statements, but I have hacked around that with the cleaner (although less efficient) C coding as follows:

      printf(litstring)    becomes    printf("%s", litstring)

* I also created a concept of 'current input/output stream/file', and all printf's are now fprintf(out_file, ...). out_file is a macro expanding to "outmap[outstream]" - outstream is the currently selected imp stream number, and outmap is an array which maps that to a FILE * such as stdout, or the result of an fopen(). Thus out_file is the currently selected output stream, as a C FILE *.

That, I believe, is the full total of what I've worked on. There is a lot more I'd like to do, some of which I know how to do and some of which I don't:

* The parm_arr flag is a little confusing. There are two issues I think it may (or should) be controlling:

    1) 100% compatible imp strings (with length byte) vs C strings (not always 100% compatible but often so)
    2) Imp-style I/O vs C-style I/O.

  I want to move everything to C-compatible I/O, so that (2) is no longer a factor.
  I want to clean up (1) so that it actually works - currently I don't think you have a fully imp-compatible mode where strings have a length byte. (I'm not sure what compiler options to set in order to set parm_arr...)

* I believe there is a bug in array addressing:

      %integerarray c(low:high)
      i = c(<expr>)

  becomes

      int c [high - (low) + 1];
      i = c[<expr> - (low)];

  I think it is a mistake not to have the <expr> in brackets too, if it is a complex expression. This construct only works if <expr> does not contain any C operators with a priority lower than '-'. There are some, such as "|".

      i = c[p1 | p2 - (low)]

  I think evaluates to p1 | (p2 - (low)) rather than (p1 | p2) - (low). I plan to check this with some examples, then add the extra brackets. A better implementation would be to generate an actual subtract triple internally, hooked to the two parameters, which would have the effect when code generating of removing any extraneous brackets that were not actually needed. I don't know how to do this as yet. It would be nice to fold out constants too if possible.

* I have an inkling that it might be better to do the maths just once at the point of declaration, eg something like:

      int _c[(high)-(low)+1], *c = _c-low;

  I don't know, however, how well this will work for multi-dimensional arrays. If at all.

* I think we can add array bounds checking as follows, at the expense of less readable C:

      int _c_low = low, _c_high = high;
      int __c[_c_high-_c_low+1], *_c = __c-_c_low;
      #define c(idx) _c[_boundscheck(idx,_c_low,_c_high,__LINE__,__FILE__)]

  (taking care never to evaluate a parameter in the macro twice, nor to use parameters whose values may have changed since declaration). Do a #undef c at the end of the procedure. _boundscheck must be a function, which does the obvious thing. Having done all this you must access the array with ()s instead of []s.

      i = c(j);

  Note: you can remove the overhead entirely, at compile time, by doing

      #define _boundscheck(idx,l,h,ll,ff) (idx)

* There is a bug in the translation of external specs to C.
  It puts "void" for each of its parameters. I have marked where in the source the "void" is printed, but don't know the underlying cause or how to fix it.

* Merge string concatenations within printfs into a single printf with multiple parameters.

* Rewrite string resolution to use C's "strstr()" if possible, and concatenation to use strcat *OR* sprintf where appropriate.

      %if s -> ("xyz") %then

          if (strstr(s, "xyz") != NULL) {

      %if s -> ("xyz").pqr %then

          _s1 = "xyz";
          if ((_ptr = strstr(s, _s1)) != NULL) {
            pqr = _ptr + strlen(_s1);

      %if s -> ("xyz").pqr.("lmn") %then

          _s1 = "xyz"; _s2 = "lmn";
          if ( ((_ptr = strstr(s, _s1)) != NULL)
            && ((_ptr3 = strstr(_ptr2=(_ptr+strlen(_s1)), _s2)) != NULL) ) {
            /* copy the text between the two matches into pqr */
            strncpy(pqr, _ptr2, _ptr3-_ptr2);
            pqr[_ptr3-_ptr2] = '\0';

      %if s -> ("xyz").pqr.("lmn").uvw %then
      %if s -> abc.("xyz") %then
      %if s -> abc.("xyz").pqr %then
      %if s -> abc.("xyz").pqr.("lmn") %then
      %if s -> abc.("xyz").pqr.("lmn").uvw %then

* Check that string jam transfers are done as well as possible - I think I spotted a problem when skimming through the code but didn't have a marker at the time to note it down (OK, I was in bed at the time). I *think* we can use strncpy rather than imp_strjam (which in turn calls strcat - no length checking?)

* Clean up read and write etc to use scanf and printf if possible. Ditto itos and htos. There's a bug in htos (in impsup.c) - the temp buffer for the result was not static. (The whole static buffer thing is dubious anyway, if the function is called twice recursively - fortunately unlikely, but not impossible.) (btw there's a better way to handle (-MAXINT-1) in itos than the way here)

* In impsup.c, where we have

      case 3: sprintf(sp,"%3x",val); break;
      case 4: sprintf(sp,"%4x",val); break;

  I'm sure there is a printf option to pass the '3' or '4' in as a parameter rather than inline. Need to check "man printf"...

* Add "cliparam".

* %integerarray a(0:j)
* %integerarray a(i:j)
  dynamic bounds are not implemented.
  Using variables for both bounds rather than just the upper bound causes an imptoc crash. Fix this as in the prototype with a call to malloc(((j)-(i)+1) * sizeof(<type>)), with optional tweaking of the lower bound and rangechecking, as above.
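
A minimal sketch of how dynamic bounds with range checking might look in the generated C, assuming a small descriptor per array. Everything named here (imp_intarray, imp_array_new, imp_array_free, imp_in_bounds, imp_elem, ELEM) is hypothetical and not part of imptoc or impsup.c. It also differs from the pointer-offset idea in one respect: the lower bound is subtracted inside the accessor rather than by pre-adjusting the base pointer, because forming a pointer that lies before the start of an allocation is undefined behaviour in C.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical descriptor for an Imp array declared with bounds (low:high). */
typedef struct {
    int *base;      /* storage; base[0] holds element number low */
    int low, high;  /* bounds captured once, at declaration time */
} imp_intarray;

/* Allocate zeroed storage for low..high inclusive. Returns 0 on success. */
static int imp_array_new(imp_intarray *a, int low, int high)
{
    if (high < low)
        return -1;
    size_t count = (size_t)((long long)high - (long long)low + 1);
    a->base = calloc(count, sizeof *a->base);
    if (a->base == NULL)
        return -1;
    a->low = low;
    a->high = high;
    return 0;
}

static void imp_array_free(imp_intarray *a)
{
    free(a->base);
    a->base = NULL;
}

static int imp_in_bounds(const imp_intarray *a, int idx)
{
    return idx >= a->low && idx <= a->high;
}

/* Checked access: reports the offending source position and aborts. */
static int *imp_elem(imp_intarray *a, int idx, int line, const char *file)
{
    if (!imp_in_bounds(a, idx)) {
        fprintf(stderr, "%s:%d: index %d outside bounds %d:%d\n",
                file, line, idx, a->low, a->high);
        abort();
    }
    return &a->base[(size_t)((long long)idx - a->low)];
}

/* Usable on either side of an assignment; idx is evaluated exactly once. */
#define ELEM(arr, idx) (*imp_elem(&(arr), (idx), __LINE__, __FILE__))
```

With this, "%integerarray a(i:j)" would translate to a call of imp_array_new(&a, i, j) at the declaration, "a(k) = v" to "ELEM(a, k) = v;", and a call of imp_array_free(&a) at the end of the procedure. Redefining ELEM to index a.base directly would remove the checking overhead at compile time, in the same spirit as the _boundscheck macro above.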